date:20160128

Re: [Qemu-devel] [PATCH v4 8/8] raspi: add raspberry pi 2 machine

2016-01-28 Thread Peter Crosthwaite

On Fri, Jan 15, 2016 at 3:58 PM, Andrew Baumann wrote:
> Signed-off-by: Andrew Baumann 
> ---
>
> Notes:
> Pi1 requires more peripherals, and will be added in a later patch
> series.
>
> v4:
> * drop header comment from versatile
> * made smpboot and board setup blobs relocatable (within limits:
>   we can't use ARMv7 MOVW for Pi1, so it's messier than highbank)
> * move board setup blob to common code
> * modify SCR using read-or-write
> * s/RaspiMachineState/RaspiState/
> * style tweaks
>
> v3:
>  * fix board setup to remain Pi1 compatible
>  * pass ram property
>
>  hw/arm/Makefile.objs |   2 +-
>  hw/arm/raspi.c   | 156 
> +++
>  2 files changed, 157 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/raspi.c
>
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index f55f8d2..a711e4d 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -11,7 +11,7 @@ obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o 
> pxa2xx_pic.o
>  obj-$(CONFIG_DIGIC) += digic.o
>  obj-y += omap1.o omap2.o strongarm.o
>  obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
> -obj-$(CONFIG_RASPI) += bcm2835_peripherals.o bcm2836.o
> +obj-$(CONFIG_RASPI) += bcm2835_peripherals.o bcm2836.o raspi.o
>  obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
>  obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp.o xlnx-ep108.o
>  obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
> new file mode 100644
> index 0000000..2110725
> --- /dev/null
> +++ b/hw/arm/raspi.c
> @@ -0,0 +1,156 @@
> +/*
> + * Raspberry Pi emulation (c) 2012 Gregory Estrade
> + * Upstreaming code cleanup [including bcm2835_*] (c) 2013 Jan Petrous
> + *
> + * Rasperry Pi 2 emulation Copyright (c) 2015, Microsoft
> + * Written by Andrew Baumann
> + *
> + * This code is licensed under the GNU GPLv2 and later.
> + */
> +
> +#include "hw/arm/bcm2836.h"
> +#include "qemu/error-report.h"
> +#include "hw/boards.h"
> +#include "hw/loader.h"
> +#include "hw/arm/arm.h"
> +#include "sysemu/sysemu.h"
> +
> +#define SMPBOOT_ADDR    0x300 /* this should leave enough space for ATAGS */
> +#define MVBAR_ADDR  0x400 /* secure vectors */
> +#define BOARDSETUP_ADDR (MVBAR_ADDR + 0x20) /* board setup code */
> +#define FIRMWARE_ADDR   0x8000 /* Pi loads kernel.img here by default */
> +
> +/* Table of Linux board IDs for different Pi versions */
> +static const int raspi_boardid[] = {[1] = 0xc42, [2] = 0xc43};
> +
> +typedef struct RaspiState {

From a quick Google search, the camel-case form for rpi is usually
"RasPi". Should we follow that?

> +union {

union not needed.

> +BCM2836State pi2;
> +} soc;
> +MemoryRegion ram;
> +} RaspiState;
> +
> +static void write_smpboot(ARMCPU *cpu, const struct arm_boot_info *info)
> +{
> +static const uint32_t smpboot[] = {
> +0xE1A0E00F, /*    mov     lr, pc */
> +0xE3A0FE00 + (BOARDSETUP_ADDR >> 4), /* mov pc, BOARDSETUP_ADDR */
> +0xEE100FB0, /*    mrc     p15, 0, r0, c0, c0, 5;get core ID */
> +0xE7E10050, /*    ubfx    r0, r0, #0, #2       ;extract LSB */
> +0xE59F5014, /*    ldr     r5, =0x400000CC      ;load mbox base */
> +0xE320F001, /* 1: yield */
> +0xE7953200, /*    ldr     r3, [r5, r0, lsl #4] ;read mbox for our core*/
> +0xE3530000, /*    cmp     r3, #0               ;spin while zero */
> +0x0AFFFFFB, /*    beq     1b */
> +0xE7853200, /*    str     r3, [r5, r0, lsl #4] ;clear mbox */
> +0xE12FFF13, /*    bx      r3                   ;jump to target */
> +0x400000CC, /* (constant: mailbox 3 read/clear base) */

lower case hex for consistency with other blobbing boards (exynos,
zynq, arm_boot).

> +};
> +

> +/* check that we don't overrun board setup vectors */
> +assert(SMPBOOT_ADDR + sizeof(smpboot) <= MVBAR_ADDR);
> +/* check that board setup address is correctly relocated */
> +assert((BOARDSETUP_ADDR & 0xf) == 0 && (BOARDSETUP_ADDR >> 4) < 0x100);

QEMU_BUILD_BUG_ON (both asserts should be convertible)
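
A minimal sketch of the conversion suggested here, assuming both
conditions are compile-time constants (the addresses are macros and the
blob is a fixed-size array). BUILD_BUG_ON below is a local stand-in
built on C11 static_assert, not QEMU's own QEMU_BUILD_BUG_ON
definition, and smpboot_fits() is a hypothetical helper added only so
the first condition can also be exercised at run time:

```c
#include <assert.h>
#include <stdint.h>

#define SMPBOOT_ADDR    0x300
#define MVBAR_ADDR      0x400
#define BOARDSETUP_ADDR (MVBAR_ADDR + 0x20)

/* Local stand-in for QEMU_BUILD_BUG_ON: fail the build if cond is true. */
#define BUILD_BUG_ON(cond) static_assert(!(cond), "build bug: " #cond)

/* Twelve words, the same length as the smpboot blob in the patch. */
static const uint32_t smpboot[12];

/* check that we don't overrun board setup vectors */
BUILD_BUG_ON(SMPBOOT_ADDR + sizeof(smpboot) > MVBAR_ADDR);
/* check that board setup address is correctly relocated */
BUILD_BUG_ON((BOARDSETUP_ADDR & 0xf) != 0 || (BOARDSETUP_ADDR >> 4) >= 0x100);

/* Hypothetical run-time mirror of the first check, for illustration. */
static int smpboot_fits(void)
{
    return SMPBOOT_ADDR + sizeof(smpboot) <= MVBAR_ADDR;
}
```

Note that the conditions are inverted relative to assert(): the macro
names the situation that must not happen.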

> +
> +rom_add_blob_fixed("raspi_smpboot", smpboot, sizeof(smpboot),
> +   info->smp_loader_start);
> +}
> +
> +static void write_board_setup(ARMCPU *cpu, const struct arm_boot_info *info)
> +{
> +arm_write_secure_board_setup_dummy_smc(cpu, info, MVBAR_ADDR);
> +}
> +
> +static void reset_secondary(ARMCPU *cpu, const struct arm_boot_info *info)
> +{
> +CPUState *cs = CPU(cpu);
> +cpu_set_pc(cs, info->smp_loader_start);
> +}
> +
> +static void setup_boot(MachineState *machine, int version, size_t ram_size)
> +{
> +static struct arm_boot_info binfo;
> +int r;
> +
> +binfo.board_id = raspi_boardid[version];
> +binfo.ram_size = ram_size;
> +binfo.nb_cpus = smp_cpus;
> +binfo.board_setup_addr = BOARDSETUP_ADDR;
> +binfo.write_board_setup = write_board_setup;
> +

Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-01-28 Thread Jike Song

This discussion becomes a little difficult for a newbie like me :(

On 01/28/2016 11:23 PM, Alex Williamson wrote:
> On Thu, 2016-01-28 at 14:00 +0800, Jike Song wrote:
>> On 01/28/2016 12:19 AM, Alex Williamson wrote:
>>> On Wed, 2016-01-27 at 13:43 +0800, Jike Song wrote:
>> {snip}
>>  
>>>> Had a look at eventfd, I would say yes, technically we are able to
>>>> achieve the goal: introduce a fd, with fop->{read|write} defined in KVM,
>>>> call into vgpu device-model, also an iodev registered for a MMIO GPA
>>>> range to invoke the fop->{read|write}.  I just didn't understand why
>>>> userspace can't register an iodev via API directly.
>>>  
>>> Please elaborate on how it would work via iodev.
>>>  
>>  
>> QEMU forwards BAR0 write to the bus driver, in the bus driver, if
>> found that MEM bit is enabled, register an iodev to KVM: with an
>> ops:
>>  
>>  const struct kvm_io_device_ops trap_mmio_ops = {
>>  .read   = kvmgt_guest_mmio_read,
>>  .write  = kvmgt_guest_mmio_write,
>>  };
>>  
>> I may not be able to illustrate it clearly with descriptions but this
>> should not be a problem, thanks to your explanation, I can understand
>> and adopt it for KVMGT.
> 
> You're still crossing modules with direct callbacks, right?  What's the
> advantage versus using the file descriptor + offset approach which could
> offer the same performance and improve KVM overall by creating a new
> option for generically handling MMIO?
> 

Yes, the method I gave above is the current way: calling kvm_io_device_ops
from KVM hypervisor, and then going to vgpu device-model directly.

From KVMGT's side this is almost the same as what you suggested, I don't
think now we have a problem here. I will adopt your suggestion.

>>>> Besides, this doesn't necessarily require another thread, right?
>>>> I guess it can be within the VCPU thread?
>>>  
>>> I would think so too, the vcpu is blocked on the MMIO access, we should
>>> be able to service it in that context.  I hope.
>>>  
>>  
>> Thanks for confirmation.
>>  
>>>> And this brought another question: except the vfio bus driver and
>>>> iommu backend (and the page_track utility used for guest memory
>>>> write-protection), is KVMGT allowed to call into kvm.ko (or modify it)?
>>>> Though we are becoming less and less willing to do that with VFIO, it's
>>>> still better to know that before going wrong.
>>>  
>>> kvm and vfio are separate modules, for the most part, they know nothing
>>> about each other and have no hard dependencies between them.  We do have
>>> various accelerations we can use to avoid paths through userspace, but
>>> these are all via APIs that are agnostic of the party on the other end.
>>> For example, vfio signals interrupts through eventfds and has no concept
>>> of whether that eventfd terminates in userspace or into an irqfd in KVM.
>>> vfio supports direct access to device MMIO regions via mmaps, but vfio
>>> has no idea if that mmap gets directly mapped into a VM address space.
>>> Even with posted interrupts, we've introduced an irq bypass manager
>>> allowing interrupt producers and consumers to register independently to
>>> form a connection without directly knowing anything about the other
>>> module.  That sort of proper software layering needs to continue.  It
>>> would be wrong for a vfio bus driver to assume KVM is the user and
>>> directly call into KVM interfaces.  Thanks,
>>>  
>>  
>> I understand and agree with your point, it's bad if the bus driver
>> assumes KVM is the user and/or calls into KVM interfaces.
>>  
>> However, the vgpu device-model, in intel case also a part of i915 driver,
>> will always need to call some hypervisor-specific interfaces.
> 
> No, think differently.
> 
>> For example, when a guest gfx driver submits GPU commands, the device-model
>> may want to scan it for security or whatever-else purpose:
>>  
>>  - get a GPA (from GPU page tables)
>>  - want to read 16 bytes from that GPA
>>  - call hypervisor-specific read_gpa() method
>>  - for Xen, the GPA belongs to a foreign domain, it must find
>>a way to map & read it - beyond our scope here;
>>  - for KVM, the GPA can be converted to HVA, copy_from_user (if
>>called from vcpu thread) or access_remote_vm (if called from
>>other threads);
>>  
>> Please note that this is not from the vfio bus driver, but from the vgpu
>> device-model; also this is not DMA addr from GPU tables, but real GPA.
> 
> This is exactly why we're proposing that the vfio IOMMU interface be
> used as a database of guest translations. 
> The type1 IOMMU model in QEMU
> maps all of guest memory through the IOMMU, in the vGPU model type1 is
> simply collecting these and they map GPA to process virtual memory.
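
The "database of guest translations" idea quoted above can be sketched
as a plain lookup table. This is only an illustration under stated
assumptions: struct gpa_map and gpa_to_hva() are hypothetical names,
not part of the vfio type1 interface or of KVM, and a real
implementation would also have to handle locking and pinning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one guest mapping: GPA range -> process address. */
struct gpa_map {
    uint64_t gpa;   /* guest physical base address */
    uint64_t size;  /* length of the range in bytes */
    uint8_t *hva;   /* host (process) virtual address backing the range */
};

/*
 * Translate a guest physical range to a host pointer.
 * Returns NULL if [gpa, gpa + len) is not fully covered by one entry.
 */
static void *gpa_to_hva(const struct gpa_map *maps, size_t n,
                        uint64_t gpa, uint64_t len)
{
    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct gpa_map *m = &maps[i];
        /* Written to avoid overflow in gpa + len. */
        if (gpa >= m->gpa && len <= m->size && gpa - m->gpa <= m->size - len) {
            return m->hva + (gpa - m->gpa);
        }
    }
    return NULL;
}
```

With such a table, the "read 16 bytes from that GPA" step in the list
above becomes a lookup followed by an ordinary copy from the returned
pointer, with no hypervisor-specific call in the device-model.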

GPA to HVA mappings are maintained in KVM/QEMU, via memslots.
Do you mean making type1 to duplicate the GPA <-> HVA/HPA translations from
KVM? Even technically this could be done,

Re: [Qemu-devel] [PATCH v4 7/8] arm/boot: move highbank secure board setup code to common routine

2016-01-28 Thread Peter Crosthwaite

On Fri, Jan 15, 2016 at 3:58 PM, Andrew Baumann wrote:
> The new version is slightly different, to support Raspberry Pi (in
> particular, Pi1's arm11 core which doesn't support v7 instructions
> such as MOVW).
>
> Signed-off-by: Andrew Baumann 
> ---
>
> Notes:
> This has not yet been tested on Highbank! Peter C -- please help :)
>

qemu-system-arm -kernel
/home/pcrost/poky/build/tmp/deploy/images/qemuarm/zImage -dtb
/home/pcrost/poky/build/tmp/deploy/images/qemuarm/zImage-highbank.dtb
-device ide-drive,drive=sata,bus=ide.0 -M highbank --no-reboot -drive
file=/home/pcrost/poky/build/tmp/deploy/images/qemuarm/core-image-minimal-qemuarm-20160114031411.rootfs.ext4,if=none,id=sata,format=raw
-no-reboot -nographic -m 128 -serial mon:stdio -serial null --append
"console=tty console=ttyAMA0,115200 ip=dhcp mem=128M highres=off
root=/dev/sda rw rootfstype=ext4 console=ttyS0"
[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.2.1 (pcrost@pcrost-box) (gcc version
5.2.0 (GCC) ) #1 SMP Wed Jan 13 19:43:13 PST 2016
[0.00] CPU: ARMv7 Processor [410fc090] revision 0 (ARMv7), cr=10c5387d
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT
nonaliasing instruction cache
[0.00] Machine model: Calxeda Highbank
[0.00] cma: Failed to reserve 64 MiB
[0.00] Memory policy: Data cache writeback
[0.00] DT missing boot CPU MPIDR[23:0], fall back to default
cpu_logical_map
[0.00] psci: probing for conduit method from DT.
[0.00] psci: Using PSCI v0.1 Function IDs from DT
[0.00] CPU: All CPU(s) started in SVC mode.
...
[4.263064] random: dd urandom read with 84 bits of entropy available
[7.103537] random: nonblocking pool is initialized
[9.286935] uart-pl011 fff36000.serial: no DMA platform data

Poky (Yocto Project Reference Distro) 2.0 qemuarm /dev/ttyAMA0

qemuarm login: root
root@qemuarm:~# uname -a
Linux qemuarm 4.2.1 #1 SMP Wed Jan 13 19:43:13 PST 2016 armv7l GNU/Linux
root@qemuarm:~#

Tested-by: Peter Crosthwaite 

> Honestly, I fear that the overhead of maintaining support for two very
> different platforms (including Pi1) may outweigh the value of unifying
> these blobs.
>

Having looked at the new code, it is more robust than the original in
its own right, thanks to the separation of the blobs.

>  hw/arm/boot.c| 53 
> 
>  hw/arm/highbank.c| 37 ++--
>  include/hw/arm/arm.h |  5 +
>  3 files changed, 60 insertions(+), 35 deletions(-)
>
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 75f69bf..bc1ea4d 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -178,6 +178,59 @@ static void default_write_secondary(ARMCPU *cpu,
>   smpboot, fixupcontext);
>  }
>
> +void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
> +const struct arm_boot_info *info,
> +hwaddr mvbar_addr)
> +{
> +int n;
> +uint32_t mvbar_blob[] = {
> +/* mvbar_addr: secure monitor vectors
> + * Default unimplemented and unused vectors to spin. Makes it
> + * easier to debug (as opposed to the CPU running away).
> + */
> +0xEAFFFFFE, /* (spin) */
> +0xEAFFFFFE, /* (spin) */
> +0xE1B0F00E, /* movs pc, lr ;SMC exception return */
> +0xEAFFFFFE, /* (spin) */
> +0xEAFFFFFE, /* (spin) */
> +0xEAFFFFFE, /* (spin) */
> +0xEAFFFFFE, /* (spin) */
> +0xEAFFFFFE, /* (spin) */

The code currently in arm_boot uses lower case for hex constants so we
should preserve convention.

> +};
> +uint32_t board_setup_blob[] = {
> +/* board setup addr */
> +0xE3A00E00 + (mvbar_addr >> 4), /* mov r0, #mvbar_addr */
> +0xEE0C0F30, /* mcr p15, 0, r0, c12, c0, 1 ;set MVBAR */
> +0xEE110F11, /* mrc p15, 0, r0, c1 , c1, 0 ;read SCR */
> +0xE3800031, /* orr r0, #0x31  ;enable AW, FW, NS */
> +0xEE010F11, /* mcr p15, 0, r0, c1, c1, 0  ;write SCR */
> +0xE1A0100E, /* mov r1, lr ;save LR across SMC */
> +0xE1600070, /* smc #0 ;call monitor to flush 
> SCR */
> +0xE1A0F001, /* mov pc, r1 ;return */
> +};
> +
> +/* check that mvbar_addr is correctly aligned and relocatable (using 
> MOV) */
> +assert((mvbar_addr & 0x1f) == 0 && (mvbar_addr >> 4) < 0x100);
> +
> +/* check that these blobs don't overlap */
> +assert((mvbar_addr + sizeof(mvbar_blob) <= info->board_setup_addr)
> +  || (info->board_setup_addr + sizeof(board_setup_blob) <= 
> mvbar_addr));
> +
> +for (n = 0; n < ARRAY_SIZE(mvbar_blob); n++) {
> +mvbar_blob[n] = tswap32(mvbar_blob[n]);
> +}
> +rom_add_blob_fixed("board-setup-mvbar", mvbar_blob, sizeof(mvbar_blob),
> +   mv

Re: [Qemu-devel] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-28 Thread Gerd Hoffmann

  Hi,

> 1) The OpRegion MemoryRegion is mapped into system_memory through
> programming of the 0xFC config space register.
>  a) vfio-pci could pick an address to do this as it is realized.
>  b) SeaBIOS/OVMF could program this.
> 
> Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to
> pick an address and mark it as e820 reserved.  I'm not sure how to pick
> that address.

Because of that I'd let the firmware pick the address and program 0xfc
accordingly, i.e. (b).  seabios can simply malloc two pages and be done
with it (any ram allocated by seabios will be tagged as e820 reserved).

> 2) Read-only mappings version of 1)
> 
> Discussion: Really nothing changes from the issues above, just prevents
> any possibility of the guest modifying anything in the host.  Xen
> apparently allows write access to the host page already.

I think read-only is out.  Probably xen allows write access because
guest drivers expect they have write access to the opregion, so the
question is ...

> 3) Copy OpRegion contents into buffer and do either 1) or 2) above.

whether we give the guest a copy of the host opregion or direct access.

> 4) Copy contents into a guest RAM location, mark it reserved, point to
> it via 0xFC config as scratch register.
>  a) Done by QEMU (vfio-pci)
>  b) Done by SeaBIOS/OVMF
> 
> Discussion: This is the most like real hardware.  4.a) has the usual
> issue of how to pick an address, but the benefit of not requiring BIOS
> changes (simply mark the RAM reserved via existing methods).  4.b) would
> require passing a buffer containing the contents of the OpRegion via
> fw_cfg and letting the BIOS do the setup.  The latter of course requires
> modifying each BIOS for this support.

Maybe we should define the interface as "guest writes 0xfc to pick
address, qemu takes care to place opregion there".  That gives us the
freedom to change the qemu implementation (either copy host opregion or
map the host opregion) without breaking things.
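
A rough sketch of that interface, using the "copy" variant. Everything
here is a toy model under stated assumptions: struct igd_state, its
fields and igd_write_fc() are hypothetical names, guest RAM is a small
array, and nothing below is taken from the vfio-pci or SeaBIOS code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OPREGION_SIZE  (2 * 4096)   /* two pages, as suggested above */
#define GUEST_RAM_SIZE (64 * 1024)  /* toy guest RAM for the sketch */

/* Hypothetical device state; none of these names come from QEMU. */
struct igd_state {
    uint8_t guest_ram[GUEST_RAM_SIZE];
    uint8_t host_opregion[OPREGION_SIZE]; /* copy of the host OpRegion */
    uint32_t reg_fc;                      /* value of config register 0xfc */
};

/*
 * Guest (firmware) writes the address it picked to register 0xfc;
 * the device model then places the OpRegion contents at that address.
 * Returns 0 on success, -1 if the address is unaligned or out of range.
 */
static int igd_write_fc(struct igd_state *s, uint32_t addr)
{
    assert(s != NULL);
    if ((addr & 0xfff) || addr > GUEST_RAM_SIZE - OPREGION_SIZE) {
        return -1;
    }
    s->reg_fc = addr;
    memcpy(&s->guest_ram[addr], s->host_opregion, OPREGION_SIZE);
    return 0;
}
```

Switching to the "map the host opregion" variant would only change the
body of igd_write_fc(); the guest-visible contract (write 0xfc, find
the OpRegion there) stays the same.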

> Of course none of these support hotplug nor really can they since
> reserved memory regions are not dynamic in the architecture.

igd is chipset graphics and therefore not hotpluggable anyway (on
physical hardware), I'd be very surprised if the guest drivers are
prepared to handle hotplug.

> Another thing I notice in this series is the access to PCI config space
> of both the host bridge and the LPC bridge.  This prevents unprivileged
> use cases

lpc bridge is no problem, only pci id fields are copied over and
unprivileged access is allowed for them.

Copying the gfx registers of the host bridge is a problem indeed.

> Should vfio add
> additional device specific regions to expose the config space of these
> other devices?

That is an option.  It is not clear yet which route we have to take
though.  Testing shows that newer linux drivers work fine even without
igd-passthru=on tweaks, whereas older linux kernels and windows drivers
don't work even with this series applied and igd-passthru=on.  I'll go
look at this as soon as I have test hardware (getting some is wip atm).

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v4 0/5] ARM: Add NUMA support for machine virt

2016-01-28 Thread Shannon Zhao



On 2016/1/29 14:32, Ashok Kumar wrote:
> Hi, 
> 
> On Sat, Jan 23, 2016 at 07:36:41PM +0800, Shannon Zhao wrote:
>> > From: Shannon Zhao 
>> > 
>> > Add NUMA support for machine virt. Tested successfully running a guest
>> > Linux kernel with the following patch applied:
>> > 
>> > - [PATCH v9 0/6] arm64, numa: Add numa support for arm64 platforms
>> > https://lwn.net/Articles/672329/
>> > - [PATCH v2 0/4] ACPI based NUMA support for ARM64
>> > http://www.spinics.net/lists/linux-acpi/msg61795.html
>> > 
>> > Changes since v3:
>> > * based on new kernel driver and device bindings
>> > * add ACPI part
>> > 
>> > Changes since v2:
>> > * update to use NUMA node property arm,associativity.
>> > 
>> > Changes since v1:
>> > Take into account Peter's comments:
>> > * rename virt_memory_init to arm_generate_memory_dtb
>> > * move arm_generate_memory_dtb to boot.c and make it a common func
>> > * use a struct numa_map to generate numa dtb
>> > 
>> > Example qemu command line:
>> > qemu-system-aarch64 \
>> > -enable-kvm -smp 4\
>> > -kernel Image \
>> > -m 512 -machine virt,kernel_irqchip=on \
>> > -initrd guestfs.cpio.gz \
>> > -cpu host -nographic \
>> > -numa node,mem=256M,cpus=0-1,nodeid=0 \
>> > -numa node,mem=256M,cpus=2-3,nodeid=1 \
>> > -append "console=ttyAMA0 root=/dev/ram"
>> > 
>> > Shannon Zhao (5):
>> >   ARM: Virt: Add /distance-map node for NUMA
>> >   ARM: Virt: Set numa-node-id for CPUs
>> >   ARM: Add numa-node-id for /memory node
>> >   include/hw/acpi/acpi-defs: Add GICC Affinity Structure
>> >   hw/arm/virt-acpi-build: Generate SRAT table
>> > 
>> >  hw/arm/boot.c   | 29 ++-
>> >  hw/arm/virt-acpi-build.c| 58 
>> > +
>> >  hw/arm/virt.c   | 37 +
>> >  hw/i386/acpi-build.c|  2 +-
>> >  include/hw/acpi/acpi-defs.h | 15 +++-
>> >  5 files changed, 138 insertions(+), 3 deletions(-)
>> > 
>> > -- 
>> > 2.0.4
>> > 
> Don't we need to populate the NUMA node in the Affinity byte of MPIDR?
> Linux uses the Affinity information in MPIDR to build topology which
> might go wrong for the guest in this case. 
> Maybe a non Linux OS might be impacted more?
> 
Ah, yes. The MPIDR needs to be updated. But currently QEMU uses the value
from KVM when KVM is in use. We would need to call kvm_set_one_reg to set
the MPIDR, and I'm not sure whether this will affect KVM, judging by the
following comment:
/*
 * When KVM is in use, PSCI is emulated in-kernel and not by qemu.
 * Currently KVM has its own idea about MPIDR assignment, so we
 * override our defaults with what we get from KVM.
 */

Peter, do you have any suggestion?
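
For illustration only, one way the affinity fields could carry the node
is sketched below. The Aff0/Aff1 bit positions are the architectural
ones; the policy of putting the NUMA node in Aff1 and the helper names
are assumptions made for this sketch, and whether KVM accepts such an
override is exactly the open question above.

```c
#include <assert.h>
#include <stdint.h>

#define MPIDR_AFF0_SHIFT 0   /* bits [7:0]  */
#define MPIDR_AFF1_SHIFT 8   /* bits [15:8] */

/* Hypothetical layout: Aff1 = NUMA node, Aff0 = CPU index within node. */
static uint64_t mpidr_for_cpu(unsigned node, unsigned cpu_in_node)
{
    assert(node <= 0xff && cpu_in_node <= 0xff);
    return ((uint64_t)node << MPIDR_AFF1_SHIFT) |
           ((uint64_t)cpu_in_node << MPIDR_AFF0_SHIFT);
}

/* Extract the Aff1 field again, e.g. to recover the node. */
static unsigned mpidr_aff1(uint64_t mpidr)
{
    return (mpidr >> MPIDR_AFF1_SHIFT) & 0xff;
}
```

With the example command line quoted above (cpus 0-1 on node 0, cpus
2-3 on node 1), the third CPU would be node 1, index 0.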

> distance-map compatible string has been changed from
> "numa,distance-map-v1" to "numa-distance-map-v1"
Will update this.

Thanks,
-- 
Shannon

Re: [Qemu-devel] [PATCH v13 00/10] Block replication for continuous checkpoints

2016-01-28 Thread Wen Congyang

On 01/27/2016 07:03 PM, Dr. David Alan Gilbert wrote:
> Hi,
>   I've got a block error if I kill the secondary.
> 
> Start both primary & secondary
> kill -9 secondary qemu
> x_colo_lost_heartbeat on primary
> 
> The guest sees a block error and the ext4 root switches to read-only.
> 
> I gdb'd the primary with a breakpoint on quorum_report_bad; see
> backtrace below.
> (This is based on colo-v2.4-periodic-mode of the framework
> code with the block and network proxy merged in; so it could be my
> merging but I don't think so ?)
> 
> 
> (gdb) where
> #0  quorum_report_bad (node_name=0x7f2946a0892c "node0", ret=-5, 
> acb=0x7f2946cb3910, acb=0x7f2946cb3910)
> at /root/colo/jan-2016/qemu/block/quorum.c:222
> #1  0x7f2943b23058 in quorum_aio_cb (opaque=, 
> ret=)
> at /root/colo/jan-2016/qemu/block/quorum.c:315
> #2  0x7f2943b311be in bdrv_co_complete (acb=0x7f2946cb3f60) at 
> /root/colo/jan-2016/qemu/block/io.c:2122
> #3  0x7f2943ae777d in aio_bh_call (bh=) at 
> /root/colo/jan-2016/qemu/async.c:64
> #4  aio_bh_poll (ctx=ctx@entry=0x7f2945b771d0) at 
> /root/colo/jan-2016/qemu/async.c:92
> #5  0x7f2943af5090 in aio_dispatch (ctx=0x7f2945b771d0) at 
> /root/colo/jan-2016/qemu/aio-posix.c:305
> #6  0x7f2943ae756e in aio_ctx_dispatch (source=, 
> callback=, 
> user_data=) at /root/colo/jan-2016/qemu/async.c:231
> #7  0x7f293b84a79a in g_main_context_dispatch () from 
> /lib64/libglib-2.0.so.0
> #8  0x7f2943af3a00 in glib_pollfds_poll () at 
> /root/colo/jan-2016/qemu/main-loop.c:211
> #9  os_host_main_loop_wait (timeout=) at 
> /root/colo/jan-2016/qemu/main-loop.c:256
> #10 main_loop_wait (nonblocking=) at 
> /root/colo/jan-2016/qemu/main-loop.c:504
> #11 0x7f29438529ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
> #12 main (argc=, argv=, envp=) 
> at /root/colo/jan-2016/qemu/vl.c:4707
> 
> (gdb) p s->num_children
> $1 = 2
> (gdb) p acb->success_count
> $2 = 0
> (gdb) p acb->is_read
> $5 = false

Sorry for the late reply.
What is the value of acb->count?

If the secondary host is down, you should remove quorum's children.1.
Otherwise, you will get an I/O error event.

Thanks
Wen Congyang

> 
> (qemu) info block
> colo-disk0 (#block080): json:{"children": [{"driver": "raw", "file": 
> {"driver": "file", "filename": "/root/colo/bugzilla.raw"}}, {"driver": 
> "replication", "mode": "primary", "file": {"port": "8889", "host": "ibpair", 
> "driver": "nbd", "export": "colo-disk0"}}], "driver": "quorum", "blkverify": 
> false, "rewrite-corrupted": false, "vote-threshold": 1} (quorum)
> Cache mode:   writeback, direct
> 
> Dave
> 
> * Changlong Xie (xiecl.f...@cn.fujitsu.com) wrote:
>> Block replication is a very important feature which is used for
>> continuous checkpoints(for example: COLO).
>>
>> You can get the detailed information about block replication from here:
>> http://wiki.qemu.org/Features/BlockReplication
>>
>> Usage:
>> Please refer to docs/block-replication.txt
>>
>> This patch series is based on the following patch series:
>> 1. http://lists.nongnu.org/archive/html/qemu-devel/2015-12/msg04570.html
>>
>> You can get the patch here:
>> https://github.com/Pating/qemu/tree/changlox/block-replication-v13
>>
>> You can get the patch with framework here:
>> https://github.com/Pating/qemu/tree/changlox/colo_framework_v12
>>
>> TODO:
>> 1. Continuous block replication. It will be started after basic functions
>>are accepted.
>>
>> Change Log:
>> V13:
>> 1. Rebase to the newest codes
>> 2. Remove redundant macros and semicolon in replication.c
>> 3. Fix typos in block-replication.txt
>> V12:
>> 1. Rebase to the newest codes
>> 2. Use backing reference to replace 'allow-write-backing-file'
>> V11:
>> 1. Reopen the backing file when starting block replication if it is not
>>opened in R/W mode
>> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>>when opening backing file
>> 3. Block the top BDS so there is only one block job for the top BDS and
>>its backing chain.
>> V10:
>> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>>reference.
>> 2. Address the comments from Eric Blake
>> V9:
>> 1. Update the error messages
>> 2. Rebase to the newest qemu
>> 3. Split child add/delete support. These patches are sent in another 
>> patchset.
>> V8:
>> 1. Address Alberto Garcia's comments
>> V7:
>> 1. Implement adding/removing quorum child. Remove the option non-connect.
>> 2. Simplify the backing reference option according to Stefan Hajnoczi's
>> suggestion
>> V6:
>> 1. Rebase to the newest qemu.
>> V5:
>> 1. Address the comments from Gong Lei
>> 2. Speed the failover up. The secondary vm can take over very quickly even
>>if there are too many I/O requests.
>> V4:
>> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
>> V3:
>> 1: use error_setg() instead of error_set()
>> 2. Add a new block job API
>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>> 4. Ad

Re: [Qemu-devel] [PATCH v7 13/13] hmp: Add "info ppc-cpu-cores" command

2016-01-28 Thread Bharata B Rao

On Thu, Jan 28, 2016 at 02:56:41PM -0700, Eric Blake wrote:
> On 01/27/2016 10:49 PM, Bharata B Rao wrote:
> > This is the hmp equivalent of "query ppc-cpu-cores"
> 
> The QMP command is spelled "query-ppc-cpu-cores".
> 
> Most HMP commands prefer '_' over '-'; so this should be 'info
> ppc_cpu_cores'.

I see that a few commands have '-' but, as you note, the majority of them
use '_'. Though I personally prefer '-', if the HMP convention is to go
with '_', I will change it in the next iteration.

> 
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hmp-commands-info.hx | 16 
> >  hmp.c| 31 +++
> >  hmp.h|  1 +
> >  3 files changed, 48 insertions(+)
> > 
> 
> > +++ b/hmp.c
> > @@ -2375,3 +2375,34 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const 
> > QDict *qdict)
> >  
> >  qapi_free_RockerOfDpaGroupList(list);
> >  }
> > +
> > +void hmp_info_ppc_cpu_cores(Monitor *mon, const QDict *qdict)
> > +{
> > +Error *err = NULL;
> > +PPCCPUCoreList *ppc_cpu_core_list = qmp_query_ppc_cpu_cores(&err);
> > +PPCCPUCoreList *s = ppc_cpu_core_list;
> > +CpuInfoList *thread;
> > +
> > +while (s) {
> > +monitor_printf(mon, "PowerPC CPU device: \"%s\"\n",
> > +   s->value->id ? s->value->id : "");
> 
> This should probably be checking s->value->has_id rather than assuming
> that s->value->id will be NULL when not present (well, I'd like to clean
> up qapi to avoid the need for has_FOO when FOO  is a pointer, but we're
> not there yet).

Ok, will switch to s->value->has_id ? s->value->id : "")
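
A small sketch of the agreed change, assuming a simplified stand-in for
the generated QAPI type. struct core_info and core_id_or_empty() are
hypothetical names; only the optional id member and its has_id flag
are modelled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated QAPI type. */
struct core_info {
    bool has_id;   /* true when the optional 'id' member is present */
    char *id;      /* only meaningful when has_id is true */
};

/* Return the id to print, or "" when the optional member is absent. */
static const char *core_id_or_empty(const struct core_info *c)
{
    assert(c != NULL);
    return c->has_id ? c->id : "";
}
```

The point of testing has_id is that the pointer is never inspected when
the member is absent, so it does not matter whether it happens to be
NULL.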

> 
> > +monitor_printf(mon, "  hotplugged: %s\n",
> > +   s->value->hotplugged ? "true" : "false");
> > +monitor_printf(mon, "  hotpluggable: %s\n",
> > +   s->value->hotpluggable ? "true" : "false");
> > +monitor_printf(mon, "  Threads:\n");
> > +for (thread = s->value->threads; thread; thread = thread->next) {
> > +monitor_printf(mon, "CPU #%" PRId64 ":", 
> > thread->value->CPU);
> > +monitor_printf(mon, " nip=0x%016" PRIx64,
> > +   thread->value->u.ppc->nip);
> 
> This uses value->u.ppc without first checking that the discriminator
> value->arch is set to CPU_INFO_ARCH_PPC; could that be a problem down
> the road?

Can't think of any potential problems as this command is PowerPC
specific.
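
For reference, the kind of discriminator check asked about above could
look roughly like this. The types are simplified, hypothetical
stand-ins (enum cpu_arch, struct cpu_info), not the generated QAPI
definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum cpu_arch { ARCH_X86, ARCH_PPC };   /* hypothetical discriminator */

struct cpu_info {
    enum cpu_arch arch;
    union {
        struct { uint64_t pc; } x86;
        struct { uint64_t nip; } ppc;
    } u;
};

/* Read nip only when the discriminator says the ppc member is active. */
static bool cpu_info_get_nip(const struct cpu_info *ci, uint64_t *nip)
{
    assert(ci != NULL && nip != NULL);
    if (ci->arch != ARCH_PPC) {
        return false;
    }
    *nip = ci->u.ppc.nip;
    return true;
}
```

In a PowerPC-only command the check should never fail, so it mostly
serves as a guard against the code being reused elsewhere later.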

BTW, can you please let me know what else is needed from the QEMU end to
drive this PowerPC CPU core device hotplug from libvirt?

Regards,
Bharata.

qemu-devel@nongnu.org

2016-01-28 Thread David Gibson

On Thu, Jan 28, 2016 at 10:53:43PM +0100, Lluís Vilanova wrote:
> Replaces all direct uses of 'error_setg(&error_fatal/abort)' with
> 'error_report_fatal/abort'. Also reimplements the former on top of the
> latter.
> 
> Signed-off-by: Lluís Vilanova 

I think the spapr parts of this will be obsoleted by the cleanups to
error handling included in the pull request I sent today.
-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [RFCv2 1/6] pseries: Simplify handling of the hash page table fd

2016-01-28 Thread David Gibson

When migrating the 'pseries' machine type with KVM, we use a special fd
to access the hash page table stored within KVM.  Usually, this fd is
opened at the beginning of migration, and kept open until the migration
is complete.

However, if there is a guest reset during the migration, the fd can become
stale and we need to re-open it.  At the moment we use an 'htab_fd_stale'
flag in sPAPRMachineState to signal this, which is checked in the migration
iterators.

But that's rather ugly.  It's simpler to just close and invalidate the
fd on reset, and lazily re-open it in migration if necessary.  This patch
implements that change.

This requires a small addition to the machine state's instance_init,
so that htab_fd is initialized to -1 (telling the migration code it
needs to open it) instead of 0, which could be a valid fd.
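
The pattern described above, reduced to a standalone sketch. It is not
the patch code: struct htab_state and the htab_* helpers are
hypothetical names, and /dev/null stands in for the KVM hash page
table fd so the example can run anywhere POSIX is available.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical state; /dev/null stands in for the KVM htab fd. */
struct htab_state {
    int fd;   /* -1 means "not open"; 0 could be a valid descriptor */
};

/* instance_init equivalent: start out with no fd open. */
static void htab_state_init(struct htab_state *s)
{
    assert(s != NULL);
    s->fd = -1;
}

/* Return the cached fd, opening it on first use or after a reset. */
static int htab_get_fd(struct htab_state *s)
{
    if (s->fd < 0) {
        s->fd = open("/dev/null", O_RDONLY);
    }
    return s->fd;
}

/* Called on reset: close and invalidate so the next user re-opens. */
static void htab_close_fd(struct htab_state *s)
{
    if (s->fd >= 0) {
        close(s->fd);
    }
    s->fd = -1;
}
```

Because -1 doubles as the "needs opening" marker, no separate stale
flag is needed, which is the simplification the patch is after.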

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 86 --
 include/hw/ppc/spapr.h |  1 -
 2 files changed, 41 insertions(+), 46 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a9c9a95..52b0e49 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1023,6 +1023,32 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
 #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= 
tswap64(~HPTE64_V_HPTE_DIRTY))
 #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= 
tswap64(HPTE64_V_HPTE_DIRTY))
 
+/*
+ * Get the fd to access the kernel htab, re-opening it if necessary
+ */
+static int get_htab_fd(sPAPRMachineState *spapr)
+{
+if (spapr->htab_fd >= 0) {
+return spapr->htab_fd;
+}
+
+spapr->htab_fd = kvmppc_get_htab_fd(false);
+if (spapr->htab_fd < 0) {
+error_report("Unable to open fd for reading hash table from KVM: %s",
+ strerror(errno));
+}
+
+return spapr->htab_fd;
+}
+
+static void close_htab_fd(sPAPRMachineState *spapr)
+{
+if (spapr->htab_fd >= 0) {
+close(spapr->htab_fd);
+}
+spapr->htab_fd = -1;
+}
+
 static void spapr_alloc_htab(sPAPRMachineState *spapr)
 {
 long shift;
@@ -1084,10 +1110,7 @@ static void spapr_reset_htab(sPAPRMachineState *spapr)
 error_setg(&error_abort, "Requested HTAB allocation failed during 
reset");
 }
 
-/* Tell readers to update their file descriptor */
-if (spapr->htab_fd >= 0) {
-spapr->htab_fd_stale = true;
-}
+close_htab_fd(spapr);
 } else {
 memset(spapr->htab, 0, HTAB_SIZE(spapr));
 
@@ -1120,28 +1143,6 @@ static int find_unknown_sysbus_device(SysBusDevice 
*sbdev, void *opaque)
 return 0;
 }
 
-/*
- * A guest reset will cause spapr->htab_fd to become stale if being used.
- * Reopen the file descriptor to make sure the whole HTAB is properly read.
- */
-static int spapr_check_htab_fd(sPAPRMachineState *spapr)
-{
-int rc = 0;
-
-if (spapr->htab_fd_stale) {
-close(spapr->htab_fd);
-spapr->htab_fd = kvmppc_get_htab_fd(false);
-if (spapr->htab_fd < 0) {
-error_report("Unable to open fd for reading hash table from KVM: "
- "%s", strerror(errno));
-rc = -1;
-}
-spapr->htab_fd_stale = false;
-}
-
-return rc;
-}
-
 static void ppc_spapr_reset(void)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
@@ -1312,14 +1313,6 @@ static int htab_save_setup(QEMUFile *f, void *opaque)
 spapr->htab_first_pass = true;
 } else {
 assert(kvm_enabled());
-
-spapr->htab_fd = kvmppc_get_htab_fd(false);
-spapr->htab_fd_stale = false;
-if (spapr->htab_fd < 0) {
-fprintf(stderr, "Unable to open fd for reading hash table from 
KVM: %s\n",
-strerror(errno));
-return -1;
-}
 }
 
 
@@ -1459,6 +1452,7 @@ static int htab_save_later_pass(QEMUFile *f, 
sPAPRMachineState *spapr,
 static int htab_save_iterate(QEMUFile *f, void *opaque)
 {
 sPAPRMachineState *spapr = opaque;
+int fd;
 int rc = 0;
 
 /* Iteration header */
@@ -1467,13 +1461,12 @@ static int htab_save_iterate(QEMUFile *f, void *opaque)
 if (!spapr->htab) {
 assert(kvm_enabled());
 
-rc = spapr_check_htab_fd(spapr);
-if (rc < 0) {
-return rc;
+fd = get_htab_fd(spapr);
+if (fd < 0) {
+return fd;
 }
 
-rc = kvmppc_save_htab(f, spapr->htab_fd,
-  MAX_KVM_BUF_SIZE, MAX_ITERATION_NS);
+rc = kvmppc_save_htab(f, fd, MAX_KVM_BUF_SIZE, MAX_ITERATION_NS);
 if (rc < 0) {
 return rc;
 }
@@ -1494,6 +1487,7 @@ static int htab_save_iterate(QEMUFile *f, void *opaque)
 static int htab_save_complete(QEMUFile *f, void *opaque)
 {
 sPAPRMachineState *spapr = opaque;
+int fd;
 
 /* Iteration header */
 qemu_put_be32(f, 0);
@@ -1503,17 +1497,16 @@ static int htab_save_complete(QEMUFile *f, void *opaque)
 
 assert(kvm_enab

[Qemu-devel] [RFCv2 3/6] pseries: Stubs for HPT resizing

2016-01-28 Thread David Gibson

This introduces stub implementations of the H_RESIZE_HPT_PREPARE and
H_RESIZE_HPT_COMMIT hypercalls which we hope to add in a PAPR
extension to allow run time resizing of a guest's hash page table.  It
also adds a new machine property for controlling whether this new
facility is available.

Finally, it adds a new string to the hypertas property in the device
tree, advertising to the guest the availability of the HPT resizing
hypercalls.  This is a tentative suggested value, and would need to be
standardized by PAPR before being merged.
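
As a rough illustration of the three-valued "resize-hpt" property, here is a
standalone sketch of the string <-> mode mapping.  It is not the QEMU code:
the names resize_hpt_mode, resize_hpt_parse and resize_hpt_name are
hypothetical, and the real patch uses the QOM string property helpers rather
than a lookup table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SPAPR_RESIZE_HPT_* values in the patch. */
enum resize_hpt_mode {
    RESIZE_HPT_DISABLED,
    RESIZE_HPT_ENABLED,
    RESIZE_HPT_REQUIRED
};

static const char *const resize_hpt_names[] = {
    [RESIZE_HPT_DISABLED] = "disabled",
    [RESIZE_HPT_ENABLED]  = "enabled",
    [RESIZE_HPT_REQUIRED] = "required",
};

/* Returns the property string for a mode. */
const char *resize_hpt_name(enum resize_hpt_mode mode)
{
    return resize_hpt_names[mode];
}

/* Returns 0 and stores the mode on success; returns -1 and leaves
 * *mode untouched if the string is not one of the three values. */
int resize_hpt_parse(const char *value, enum resize_hpt_mode *mode)
{
    size_t i;

    assert(value != NULL && mode != NULL);
    for (i = 0; i < sizeof(resize_hpt_names) / sizeof(resize_hpt_names[0]); i++) {
        if (strcmp(value, resize_hpt_names[i]) == 0) {
            *mode = (enum resize_hpt_mode)i;
            return 0;
        }
    }
    return -1;
}
```

A table keeps the getter and setter in sync by construction, at the cost of
relying on the enum values being small consecutive integers.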

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 40 
 hw/ppc/spapr_hcall.c   | 37 +
 include/hw/ppc/spapr.h | 11 ++-
 trace-events   |  2 ++
 4 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bb701e3..b7bd1c1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -317,6 +317,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
const char *kernel_cmdline,
uint32_t epow_irq)
 {
+sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
 void *fdt;
 uint32_t start_prop = cpu_to_be32(initrd_base);
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
@@ -336,6 +337,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 add_str(hypertas, "hcall-splpar");
 add_str(hypertas, "hcall-bulk");
 add_str(hypertas, "hcall-set-mode");
+if (spapr->resize_hpt != SPAPR_RESIZE_HPT_DISABLED) {
+add_str(hypertas, "hcall-hpt-resize");
+}
 add_str(qemu_hypertas, "hcall-memop1");
 
 fdt = g_malloc0(FDT_MAX_SIZE);
@@ -2100,6 +2104,36 @@ static void spapr_set_kvm_type(Object *obj, const char *value, Error **errp)
 spapr->kvm_type = g_strdup(value);
 }
 
+static char *spapr_get_resize_hpt(Object *obj, Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+switch (spapr->resize_hpt) {
+case SPAPR_RESIZE_HPT_DISABLED:
+return g_strdup("disabled");
+case SPAPR_RESIZE_HPT_ENABLED:
+return g_strdup("enabled");
+case SPAPR_RESIZE_HPT_REQUIRED:
+return g_strdup("required");
+}
+assert(0);
+}
+
+static void spapr_set_resize_hpt(Object *obj, const char *value, Error **errp)
+{
+sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
+
+if (strcmp(value, "disabled") == 0) {
+spapr->resize_hpt = SPAPR_RESIZE_HPT_DISABLED;
+} else if (strcmp(value, "enabled") == 0) {
+spapr->resize_hpt = SPAPR_RESIZE_HPT_ENABLED;
+} else if (strcmp(value, "required") == 0) {
+spapr->resize_hpt = SPAPR_RESIZE_HPT_REQUIRED;
+} else {
+error_setg(errp, "Bad value for \"resize-hpt\" property");
+}
+}
+
 static void spapr_machine_initfn(Object *obj)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
@@ -2110,6 +2144,12 @@ static void spapr_machine_initfn(Object *obj)
 object_property_set_description(obj, "kvm-type",
 "Specifies the KVM virtualization mode (HV, PR)",
 NULL);
+
+object_property_add_str(obj, "resize-hpt",
+spapr_get_resize_hpt, spapr_set_resize_hpt, NULL);
+object_property_set_description(obj, "resize-hpt",
+"Resizing of the Hash Page Table (enabled, disabled, required)",
+NULL);
 }
 
 static void spapr_machine_finalizefn(Object *obj)
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index a535c73..f285d34 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -331,6 +331,38 @@ static target_ulong h_read(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return H_SUCCESS;
 }
 
+static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
+ sPAPRMachineState *spapr,
+ target_ulong opcode,
+ target_ulong *args)
+{
+target_ulong flags = args[0];
+target_ulong shift = args[1];
+
+if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
+return H_AUTHORITY;
+}
+
+trace_spapr_h_resize_hpt_prepare(flags, shift);
+return H_HARDWARE;
+}
+
+static target_ulong h_resize_hpt_commit(PowerPCCPU *cpu,
+sPAPRMachineState *spapr,
+target_ulong opcode,
+target_ulong *args)
+{
+target_ulong flags = args[0];
+target_ulong shift = args[1];
+
+if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
+return H_AUTHORITY;
+}
+
+trace_spapr_h_resize_hpt_commit(flags, shift);
+return H_HARDWARE;
+}
+
 static target_ulong h_set_dabr(PowerPCCPU *cpu, sPAPRMachineState *spapr,
target_ulong opcode, target_ulong *args)
 {

[Qemu-devel] [RFCv2 6/6] pseries: Use smaller default hash page tables when guest can resize

2016-01-28 Thread David Gibson

We've now implemented a PAPR extension allowing PAPR guests to resize
their hash page table (HPT) during runtime.

This patch makes use of that facility to allocate smaller HPTs by default.
Specifically, when a guest is aware of the HPT resize facility, qemu sizes
the HPT to the initial memory size rather than the maximum memory size, on
the assumption that the guest will resize its HPT if necessary for hot
plugged memory.

When the initial memory size is much smaller than the maximum memory size
(a common configuration with e.g. oVirt / RHEV) then this can save
significant memory on the HPT.

If the guest does *not* advertise HPT resize awareness when it makes the
ibm,client-architecture-support call, qemu resizes the HPT for maximum
memory size (unless it's been configured not to allow such guests at all).

For now we make that reallocation assuming the guest has not yet used the
HPT at all.  That's true in practice, but not, strictly, an architectural
or PAPR requirement.  If we need to in future we can fix this by having
the client-architecture-support call reboot the guest with the revised
HPT size (the client-architecture-support call is explicitly permitted to
trigger a reboot in this way).
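
The sizing policy described above can be sketched as two small decisions:
which RAM size to use at reset, and what to do at
client-architecture-support time.  This is a standalone model under my own
naming (reset_sizing_ram, cas_hpt_action and the cas_action enum are
hypothetical); only the OV5_HPT_RESIZE bit value and the three modes are
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

enum resize_hpt_mode {
    RESIZE_HPT_DISABLED,
    RESIZE_HPT_ENABLED,
    RESIZE_HPT_REQUIRED
};

/* Bit tested in option vector 5, byte 8, as defined in the patch. */
#define OV5_HPT_RESIZE 0x80

enum cas_action {
    CAS_KEEP_HPT,      /* leave the current HPT alone */
    CAS_GROW_HPT,      /* reallocate for maximum RAM */
    CAS_REJECT_GUEST   /* guest lacks a required feature */
};

/* At reset: size for maximum RAM only if resizing is disabled,
 * otherwise start from the initial RAM size. */
uint64_t reset_sizing_ram(enum resize_hpt_mode mode,
                          uint64_t ram_size, uint64_t maxram_size)
{
    return mode == RESIZE_HPT_DISABLED ? maxram_size : ram_size;
}

/* At CAS: a resize-aware guest keeps the small HPT; an unaware guest
 * is rejected in "required" mode, or gets a table sized for maximum
 * RAM if the current one is smaller. */
enum cas_action cas_hpt_action(enum resize_hpt_mode mode, uint8_t ov5_byte8,
                               int cur_shift, int max_shift)
{
    assert(cur_shift > 0 && max_shift > 0);

    if (ov5_byte8 & OV5_HPT_RESIZE) {
        return CAS_KEEP_HPT;
    }
    if (mode == RESIZE_HPT_REQUIRED) {
        return CAS_REJECT_GUEST;
    }
    return cur_shift < max_shift ? CAS_GROW_HPT : CAS_KEEP_HPT;
}
```

In "disabled" mode the reset-time table is already sized for maximum RAM, so
the CAS path falls through to CAS_KEEP_HPT without any special casing.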

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c   | 10 +++---
 hw/ppc/spapr_hcall.c | 28 +++-
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0d2759a..c7b3814 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1137,14 +1137,18 @@ static void ppc_spapr_reset(void)
 sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
 PowerPCCPU *first_ppc_cpu;
 uint32_t rtas_limit;
+int hpt_shift;
 
 /* Check for unknown sysbus devices */
 foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
 
 /* Allocate and/or reset the hash page table */
-spapr_reallocate_hpt(spapr,
- spapr_hpt_shift_for_ramsize(machine->maxram_size),
- &error_fatal);
+if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
+hpt_shift = spapr_hpt_shift_for_ramsize(machine->maxram_size);
+} else {
+hpt_shift = spapr_hpt_shift_for_ramsize(machine->ram_size);
+}
+spapr_reallocate_hpt(spapr, hpt_shift, &error_fatal);
 
 /* Update the RMA size if necessary */
 if (spapr->vrma_adjust) {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 2345196..b33c83d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1160,12 +1160,14 @@ static void do_set_compat(void *arg)
 ((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
 
 #define OV5_DRCONF_MEMORY 0x20
+#define OV5_HPT_RESIZE0x80
 
 static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
   sPAPRMachineState *spapr,
   target_ulong opcode,
   target_ulong *args)
 {
+MachineState *machine = MACHINE(spapr);
 target_ulong list = ppc64_phys_to_real(args[0]);
 target_ulong ov_table, ov5;
 PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
@@ -1175,7 +1177,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 unsigned compat_lvl = 0, cpu_version = 0;
 unsigned max_lvl = get_compat_level(cpu_->max_compat);
 int counter;
-char ov5_byte2;
+char ov5_byte2, ov5_byte8;
 
 /* Parse PVR list */
 for (counter = 0; counter < 512; ++counter) {
@@ -1265,6 +1267,30 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 memory_update = true;
 }
 
+ov5_byte8 = ldub_phys(&address_space_memory, ov5 + 8);
+if (!(ov5_byte8 & OV5_HPT_RESIZE)) {
+int maxshift = spapr_hpt_shift_for_ramsize(machine->maxram_size);
+
+if (spapr->resize_hpt == SPAPR_RESIZE_HPT_REQUIRED) {
+error_report(
+"h_client_architecture_support: Guest doesn't support HPT resizing with resize-hpt=required");
+exit(1);
+}
+
+if (spapr->htab_shift < maxshift) {
+CPUState *cs;
+/* Guest doesn't know about HPT resizing, so we
+ * pre-emptively resize for the maximum permitted RAM.  At
+ * the point this is called, nothing should have been
+ * entered into the existing HPT */
+spapr_reallocate_hpt(spapr, maxshift, &error_fatal);
+CPU_FOREACH(cs) {
+run_on_cpu(cs, pivot_hpt, cs);
+}
+cpu_update = true;
+}
+}
+
 if (spapr_h_cas_compose_response(spapr, args[1], args[2],
  cpu_update, memory_update)) {
 qemu_system_reset_request();
-- 
2.5.0

[Qemu-devel] [RFCv2 0/6] PAPR hash page table resizing

2016-01-28 Thread David Gibson

This series implements the host / qemu side to allow hash page table
(HPT) resizing for PAPR (pseries machine type) guests.  This is a
proposed extension to the PAPR spec.  It also requires awareness on
the guest side, which I've posted a series for today.

This applies on top of my ppc-for-2.6 branch.

Changes from RFCv1:
  * Limit the guest's HPT to one order larger than we'd usually give them
  * Size initial HPT for initial rather than maximum memory for guests
that are HPT resize aware

David Gibson (6):
  pseries: Simplify handling of the hash page table fd
  pseries: Move hash page table allocation to reset time
  pseries: Stubs for HPT resizing
  pseries: Implement HPT resizing
  pseries: Enable HPT resizing for 2.6
  pseries: Use smaller default hash page tables when guest can resize

 hw/ppc/spapr.c  | 278 +++
 hw/ppc/spapr_hcall.c| 375 +++-
 include/hw/ppc/spapr.h  |  20 ++-
 target-ppc/mmu-hash64.h |   4 +
 trace-events|   2 +
 5 files changed, 553 insertions(+), 126 deletions(-)

-- 
2.5.0

[Qemu-devel] [RFCv2 2/6] pseries: Move hash page table allocation to reset time

2016-01-28 Thread David Gibson

At the moment the size of the hash page table (HPT) is fixed based on the
maximum memory allowed to the guest.  As such, we allocate the table during
machine construction, and just clear it at reset.

However, we're planning to implement a PAPR extension allowing the hash
page table to be resized at runtime.  This will mean that on reset we want
to revert it to the default size.  It also means that when migrating, we
need to make sure the destination allocates an HPT of size matching the
source, since the guest could have changed it before the migration.

This patch replaces the spapr_alloc_htab() and spapr_reset_htab() functions
with a new spapr_reallocate_hpt() function.  This is called at reset and
inbound migration only, not during machine init any more.

In addition, we add a new helper to compute the recommended hash table size
for a given RAM size.  We export this as well as spapr_reallocate_hpt(),
since we'll be needing them elsewhere in future.
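
For reference, the sizing rule used by the new helper (1/128 of RAM rounded
up to a power of two, clamped to the architected range of 2^18 to 2^46
bytes) can be tried out with the standalone sketch below.  It mirrors the
arithmetic in the patch, but round_up_pow2() and log2_pow2() are my own
stand-ins for QEMU's pow2ceil() and ctz64(), and hpt_shift_for_ramsize() is
a hypothetical name for the standalone version.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= v; v must be in 1 .. 2^63. */
static uint64_t round_up_pow2(uint64_t v)
{
    uint64_t p = 1;

    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Base-2 logarithm of an exact power of two. */
static int log2_pow2(uint64_t p)
{
    int n = 0;

    while (p > 1) {
        p >>= 1;
        n++;
    }
    return n;
}

/* HPT size is 2^shift bytes: RAM / 128, rounded up, then clamped. */
int hpt_shift_for_ramsize(uint64_t ramsize)
{
    int shift;

    assert(ramsize > 0 && ramsize <= (1ULL << 63));

    shift = log2_pow2(round_up_pow2(ramsize)) - 7;
    if (shift < 18) {
        shift = 18;     /* minimum architected size */
    }
    if (shift > 46) {
        shift = 46;     /* maximum architected size */
    }
    return shift;
}
```

With this rule a 1GiB guest gets shift 23 (an 8MiB table) and a 256GiB guest
gets shift 31 (a 2GiB table); a 3GiB guest is rounded up to 4GiB first and
so gets shift 25.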

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 151 ++---
 include/hw/ppc/spapr.h |   3 +
 2 files changed, 71 insertions(+), 83 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 52b0e49..bb701e3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1049,81 +1049,67 @@ static void close_htab_fd(sPAPRMachineState *spapr)
 spapr->htab_fd = -1;
 }
 
-static void spapr_alloc_htab(sPAPRMachineState *spapr)
-{
-long shift;
-int index;
-
-/* allocate hash page table.  For now we always make this 16mb,
- * later we should probably make it scale to the size of guest
- * RAM */
-
-shift = kvmppc_reset_htab(spapr->htab_shift);
-if (shift < 0) {
-/*
- * For HV KVM, host kernel will return -ENOMEM when requested
- * HTAB size can't be allocated.
- */
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
-} else if (shift > 0) {
-/*
- * Kernel handles htab, we don't need to allocate one
- *
- * Older kernels can fall back to lower HTAB shift values,
- * but we don't allow booting of such guests.
- */
-if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+int spapr_hpt_shift_for_ramsize(uint64_t ramsize)
+{
+int shift;
+
+/* We aim for a hash table of size 1/128 the size of RAM (rounded
+ * up).  The PAPR recommendation is actually 1/64 of RAM size, but
+ * that's much more than is needed for Linux guests */
+shift = ctz64(pow2ceil(ramsize)) - 7;
+shift = MAX(shift, 18); /* Minimum architected size */
+shift = MIN(shift, 46); /* Maximum architected size */
+return shift;
+}
+
+void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift, Error **errp)
+{
+long rc;
+
+/* Clean up any HPT info from a previous boot */
+g_free(spapr->htab);
+spapr->htab = NULL;
+spapr->htab_shift = 0;
+close_htab_fd(spapr);
+
+rc = kvmppc_reset_htab(shift);
+if (rc < 0) {
+/* kernel-side HPT needed, but couldn't allocate one */
+error_setg_errno(errp, errno,
+ "Failed to allocate KVM HPT of order %d (try smaller maxmem?)",
+ shift);
+/* This is almost certainly fatal, but if the caller really
+ * wants to carry on with shift == 0, it's welcome to try */
+} else if (rc > 0) {
+/* kernel-side HPT allocated */
+if (rc != shift) {
+error_setg(errp,
+   "Requested order %d HPT, but kernel allocated order %ld (try smaller maxmem?)",
+   shift, rc);
 }
 
 spapr->htab_shift = shift;
 kvmppc_kern_htab = true;
 } else {
-/* Allocate htab */
-spapr->htab = qemu_memalign(HTAB_SIZE(spapr), HTAB_SIZE(spapr));
-
-/* And clear it */
-memset(spapr->htab, 0, HTAB_SIZE(spapr));
-
-for (index = 0; index < HTAB_SIZE(spapr) / HASH_PTE_SIZE_64; index++) {
-DIRTY_HPTE(HPTE(spapr->htab, index));
-}
-}
-}
-
-/*
- * Clear HTAB entries during reset.
- *
- * If host kernel has allocated HTAB, KVM_PPC_ALLOCATE_HTAB ioctl is
- * used to clear HTAB. Otherwise QEMU-allocated HTAB is cleared manually.
- */
-static void spapr_reset_htab(sPAPRMachineState *spapr)
-{
-long shift;
-int index;
+/* kernel-side HPT not needed, allocate in userspace instead */
+size_t size = 1ULL << shift;
+int i;
 
-shift = kvmppc_reset_htab(spapr->htab_shift);
-if (shift < 0) {
-error_setg(&error_abort, "Failed to reset HTAB");
-} else if (shift > 0) {
-if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Requested HTAB allocation failed during reset");
+spapr->htab = qemu_memalign(size, size);
+if (!spapr->htab) {
+

[Qemu-devel] [RFCv2 5/6] pseries: Enable HPT resizing for 2.6

2016-01-28 Thread David Gibson

We've now implemented a PAPR extension which allows PAPR guests (i.e.
"pseries" machine type) to resize their hash page table during runtime.

However, that extension is only enabled if explicitly chosen on the
command line.  This patch enables it by default for qemu-2.6, but leaves it
disabled (by default) for older machine types.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ddd8b99..0d2759a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2137,12 +2137,17 @@ static void spapr_machine_initfn(Object *obj)
 sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
 
 spapr->htab_fd = -1;
+
 object_property_add_str(obj, "kvm-type",
 spapr_get_kvm_type, spapr_set_kvm_type, NULL);
 object_property_set_description(obj, "kvm-type",
 "Specifies the KVM virtualization mode (HV, PR)",
 NULL);
 
+if (!kvm_enabled()) {
+/* No KVM implementation of HPT resizing yet */
+spapr->resize_hpt = SPAPR_RESIZE_HPT_ENABLED;
+}
 object_property_add_str(obj, "resize-hpt",
 spapr_get_resize_hpt, spapr_set_resize_hpt, NULL);
 object_property_set_description(obj, "resize-hpt",
@@ -2412,6 +2417,10 @@ DEFINE_SPAPR_MACHINE(2_6, "2.6", true);
 
 static void spapr_machine_2_5_instance_options(MachineState *machine)
 {
+sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
+
+spapr_machine_2_6_instance_options(machine);
+spapr->resize_hpt = SPAPR_RESIZE_HPT_DISABLED;
 }
 
 static void spapr_machine_2_5_class_options(MachineClass *mc)
-- 
2.5.0

[Qemu-devel] [RFCv2 4/6] pseries: Implement HPT resizing

2016-01-28 Thread David Gibson

This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table.  This will eventually allow for more flexible memory
hotplug.

The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function.  The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.

The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion.  If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.

The H_RESIZE_HPT_COMMIT hypercall completes the transition to a resized HPT,
and can only be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT.  The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).

For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT.  This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).

In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs.  That's a project for another day, but should be possible
without any changes to the guest interface.
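
The cancellation protocol between the hypercall path and the preparation
thread can be modelled roughly as below.  This is a simplified standalone
sketch, not the patch code: a pthread mutex stands in for the BQL, the
struct and function names are hypothetical, and thread creation is left to
the caller so that the ownership rule is easier to follow.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct pending_hpt {
    int shift;      /* read-only after creation */
    bool complete;  /* protected by machine.lock */
    void *hpt;      /* private to the worker until complete is set */
};

static struct {
    pthread_mutex_t lock;         /* stand-in for the BQL */
    struct pending_hpt *pending;  /* current request, or NULL */
} machine = { PTHREAD_MUTEX_INITIALIZER, NULL };

/* Caller holds machine.lock.  A finished request is freed here; an
 * unfinished one is only unlinked, and its worker frees it later. */
static void cancel_locked(void)
{
    struct pending_hpt *p = machine.pending;

    machine.pending = NULL;
    if (p && p->complete) {
        free(p->hpt);
        free(p);
    }
}

void cancel_prepare(void)
{
    pthread_mutex_lock(&machine.lock);
    cancel_locked();
    pthread_mutex_unlock(&machine.lock);
}

/* Replace any existing request with a new one.  A real caller would
 * then pass the result to pthread_create(..., prepare_worker, p). */
struct pending_hpt *begin_prepare(int shift)
{
    struct pending_hpt *p = calloc(1, sizeof(*p));

    assert(p != NULL);
    p->shift = shift;
    pthread_mutex_lock(&machine.lock);
    cancel_locked();
    machine.pending = p;
    pthread_mutex_unlock(&machine.lock);
    return p;
}

/* Worker body: allocate outside the lock, then check under the lock
 * whether this request is still the current one. */
void *prepare_worker(void *opaque)
{
    struct pending_hpt *p = opaque;

    p->hpt = calloc(1, (size_t)1 << p->shift);

    pthread_mutex_lock(&machine.lock);
    if (machine.pending != p) {
        /* Cancelled while working: clean up after ourselves. */
        free(p->hpt);
        free(p);
    } else {
        p->complete = true;
    }
    pthread_mutex_unlock(&machine.lock);
    return NULL;
}
```

The key point is that exactly one side frees each request: the canceller if
the worker has already finished, the worker otherwise, with the pointer
comparison under the lock deciding which.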

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  |   2 -
 hw/ppc/spapr_hcall.c| 316 +++-
 include/hw/ppc/spapr.h  |   5 +
 target-ppc/mmu-hash64.h |   4 +
 4 files changed, 322 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7bd1c1..ddd8b99 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -90,8 +90,6 @@
 
 #define PHANDLE_XICP0x
 
-#define HTAB_SIZE(spapr)(1ULL << ((spapr)->htab_shift))
-
 static XICSState *try_create_xics(const char *type, int nr_servers,
   int nr_irqs, Error **errp)
 {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f285d34..2345196 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1,4 +1,5 @@
 #include "sysemu/sysemu.h"
+#include "qemu/error-report.h"
 #include "cpu.h"
 #include "helper_regs.h"
 #include "hw/ppc/spapr.h"
@@ -331,20 +332,290 @@ static target_ulong h_read(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return H_SUCCESS;
 }
 
+struct sPAPRPendingHPT {
+/* These fields are read-only after initialization */
+int shift;
+QemuThread thread;
+
+/* These fields are protected by the BQL */
+bool complete;
+
+/* These fields are private to the preparation thread if
+ * !complete, otherwise protected by the BQL */
+int ret;
+void *hpt;
+};
+
+static void free_pending_hpt(sPAPRPendingHPT *pending)
+{
+if (pending->hpt) {
+qemu_vfree(pending->hpt);
+}
+
+g_free(pending);
+}
+
+static void *hpt_prepare_thread(void *opaque)
+{
+sPAPRPendingHPT *pending = opaque;
+size_t size = 1ULL << pending->shift;
+
+pending->hpt = qemu_memalign(size, size);
+if (pending->hpt) {
+memset(pending->hpt, 0, size);
+pending->ret = H_SUCCESS;
+} else {
+pending->ret = H_NO_MEM;
+}
+
+qemu_mutex_lock_iothread();
+
+if (SPAPR_MACHINE(qdev_get_machine())->pending_hpt != pending) {
+/* We've been cancelled, clean ourselves up */
+free_pending_hpt(pending);
+goto out;
+}
+
+pending->complete = true;
+
+out:
+qemu_mutex_unlock_iothread();
+return NULL;
+}
+
+/* Must be called with BQL held */
+static void cancel_hpt_prepare(sPAPRMachineState *spapr)
+{
+sPAPRPendingHPT *pending = spapr->pending_hpt;
+
+/* Let the thread know it's cancelled */
+spapr->pending_hpt = NULL;
+
+if (!pending) {
+/* Nothing to do */
+return;
+}
+
+if (!pending->complete) {
+/* thread will clean itself up */
+return;
+}
+
+free_pending_hpt(pending);
+}
+
 static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
  sPAPRMachineState *spapr,
  target_ulong opcode,
  target_ulong *args)
 {
 target_ulong flags = args[0];
-target_ulong shift = args[1];
+int shift = args[1];
+sPAPRPendingHPT *pending = spapr->pending_hpt;
 
 if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
 return H_AUTHORITY;
 }
 
 trace_spapr_h_resize_hpt_prepare(flags, shift);
-return H_HARDWARE;
+
+if (flags != 0) {
+return H_PARAMETER;
+}
+
+if (shift && ((shift < 18) || (shift > 46))) {
+return H_PARAMETER;
+}
+
+if (pending) {
+/* something already in progress */
+i

Re: [Qemu-devel] [PATCH v7 12/13] qmp: Add query-ppc-cpu-cores command

2016-01-28 Thread Bharata B Rao

On Thu, Jan 28, 2016 at 01:52:26PM -0700, Eric Blake wrote:
> On 01/27/2016 10:49 PM, Bharata B Rao wrote:
> > Show the details of PPC CPU cores via a new QMP command.
> > 
> > TODO: update qmp-commands.hx with example
> 
> Is this a stale comment? [1]

Yes, I missed removing it after I put the example in qmp-commands.hx.

> 
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> 
> >  #include 
> > +#include 
> >  #include "qemu/error-report.h"
> > +#include "qmp-commands.h"
> > +
> > +/*
> > + * QMP: info ppc-cpu-cores
> > + */
> > +static int qmp_ppc_cpu_list(Object *obj, void *opaque)
> 
> Comment is a bit off - 'info ...' is an HMP command; this callback is
> helping implement the QMP function query-ppc-cpu-cores.

Ok, will make it "QMP: query-ppc-cpu-cores"

> 
> > +++ b/qapi-schema.json
> > @@ -4083,3 +4083,34 @@
> >  ##
> >  { 'enum': 'ReplayMode',
> >'data': [ 'none', 'record', 'play' ] }
> > +
> > +##
> > +# @PPCCPUCore:
> > +#
> > +# Information about PPC CPU core devices
> > +#
> > +# @hotplugged: true if device was hotplugged
> > +#
> > +# @hotpluggable: true if device could be added/removed while machine is running
> > +#
> > +# Since: 2.6
> 
> Missing docs on 'id' and 'threads'.

Will add.

> 
> > +##
> > +
> > +{ 'struct': 'PPCCPUCore',
> > +  'data': { '*id': 'str',
> > +'hotplugged': 'bool',
> > +'hotpluggable': 'bool',
> > +'threads' : ['CpuInfo']
> > +  }
> > +}
> > +
> > +##
> > +# @query-ppc-cpu-cores:
> > +#
> > +# Returns information for all PPC CPU core devices
> > +#
> > +# Returns: a list of @PPCCPUCore.
> > +#
> > +# Since: 2.6
> > +##
> > +{ 'command': 'query-ppc-cpu-cores', 'returns': ['PPCCPUCore'] }
> 
> Interface seems okay.
> 
> > +++ b/qmp-commands.hx
> > @@ -4795,3 +4795,54 @@ Example:
> >   {"type": 0, "out-pport": 0, "pport": 0, "vlan-id": 3840,
> >"pop-vlan": 1, "id": 251658240}
> > ]}
> > +
> > +EQMP
> > +
> > +#if defined TARGET_PPC64
> > +{
> > +.name   = "query-ppc-cpu-cores",
> > +.args_type  = "",
> > +.mhandler.cmd_new = qmp_marshal_query_ppc_cpu_cores,
> > +},
> > +#endif
> 
> Hmm. Conditional compilation. Does the command show up in
> 'query-commands' and introspection output, even when the target is not
> ppc64?  We may need to fix qapi introspection to support conditionals
> better; maybe some of Marc-Andre's patches towards eliminating
> qmp-commands.hx will come into play here.

For targets other than ppc64, query-commands doesn't list
query-ppc-cpu-cores.

> 
> > +
> > +SQMP
> > +@query-ppc-cpu-cores
> > +
> > +
> > +Show PowerPC CPU core devices information.
> > +
> > +Example:
> > +-> { "execute": "query-ppc-cpu-cores" }
> > +<- {"return": [{"threads": [
> > + {"arch": "ppc",
> 
> [1] looks like you provided an example after all.  Is it worth
> documenting that this command is only conditionally available?

Will add

# Note: This command is available only for PowerPC targets

> 
> 
> > +++ b/stubs/qmp_query_ppc_cpu_cores.c
> > @@ -0,0 +1,10 @@
> > +#include "qom/object.h"
> > +#include "qapi/qmp/qerror.h"
> > +#include "qemu/typedefs.h"
> > +#include "qmp-commands.h"
> > +
> > +PPCCPUCoreList *qmp_query_ppc_cpu_cores(Error **errp)
> > +{
> > +error_setg(errp, QERR_UNSUPPORTED);
> > +return 0;
> > +}
> 
> Hmm - will the stub even be used, since you used an #ifdef in the .hx file?

Though I have put

.mhandler.cmd_new = qmp_marshal_query_ppc_cpu_cores,

under ifdef in .hx, qmp_marshal_query_ppc_cpu_cores() is getting defined in
qmp-marshal.c and hence we need this stub file so that qmp_query_ppc_cpu_cores()
gets resolved from qmp-marshal.c:qmp_marshal_query_ppc_cpu_cores().

Regards,
Bharata.

Re: [Qemu-devel] [PATCH v4 0/5] ARM: Add NUMA support for machine virt

2016-01-28 Thread Ashok Kumar

Hi, 

On Sat, Jan 23, 2016 at 07:36:41PM +0800, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Add NUMA support for machine virt. Tested successfully running a guest
> Linux kernel with the following patch applied:
> 
> - [PATCH v9 0/6] arm64, numa: Add numa support for arm64 platforms
> https://lwn.net/Articles/672329/
> - [PATCH v2 0/4] ACPI based NUMA support for ARM64
> http://www.spinics.net/lists/linux-acpi/msg61795.html
> 
> Changes since v3:
> * based on new kernel driver and device bindings
> * add ACPI part
> 
> Changes since v2:
> * update to use NUMA node property arm,associativity.
> 
> Changes since v1:
> Take into account Peter's comments:
> * rename virt_memory_init to arm_generate_memory_dtb
> * move arm_generate_memory_dtb to boot.c and make it a common func
> * use a struct numa_map to generate numa dtb
> 
> Example qemu command line:
> qemu-system-aarch64 \
> -enable-kvm -smp 4\
> -kernel Image \
> -m 512 -machine virt,kernel_irqchip=on \
> -initrd guestfs.cpio.gz \
> -cpu host -nographic \
> -numa node,mem=256M,cpus=0-1,nodeid=0 \
> -numa node,mem=256M,cpus=2-3,nodeid=1 \
> -append "console=ttyAMA0 root=/dev/ram"
> 
> Shannon Zhao (5):
>   ARM: Virt: Add /distance-map node for NUMA
>   ARM: Virt: Set numa-node-id for CPUs
>   ARM: Add numa-node-id for /memory node
>   include/hw/acpi/acpi-defs: Add GICC Affinity Structure
>   hw/arm/virt-acpi-build: Generate SRAT table
> 
>  hw/arm/boot.c   | 29 ++-
>  hw/arm/virt-acpi-build.c| 58 
> +
>  hw/arm/virt.c   | 37 +
>  hw/i386/acpi-build.c|  2 +-
>  include/hw/acpi/acpi-defs.h | 15 +++-
>  5 files changed, 138 insertions(+), 3 deletions(-)
> 
> -- 
> 2.0.4
> 

Don't we need to populate the NUMA node in the Affinity byte of MPIDR?
Linux uses the Affinity information in MPIDR to build topology which
might go wrong for the guest in this case. 
Maybe a non-Linux OS might be impacted more?

Also, the distance-map compatible string has been changed from
"numa,distance-map-v1" to "numa-distance-map-v1"

Thanks,
Ashok

Re: [Qemu-devel] [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-28 Thread Jike Song

On 01/29/2016 10:54 AM, Alex Williamson wrote:
> On Fri, 2016-01-29 at 02:22 +, Kay, Allen M wrote:
>>  
>>> -Original Message-
>>> From: iGVT-g [mailto:igvt-g-boun...@lists.01.org] On Behalf Of Alex
>>> Williamson
>>> Sent: Thursday, January 28, 2016 11:36 AM
>>> To: Gerd Hoffmann; qemu-devel@nongnu.org
>>> Cc: igv...@ml01.01.org; xen-de...@lists.xensource.com; Eduardo Habkost;
>>> Stefano Stabellini; Cao jin; vfio-us...@redhat.com
>>> Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset
>>> tweaks
>>>  
>>>  
>>> 1) The OpRegion MemoryRegion is mapped into system_memory through
>>> programming of the 0xFC config space register.
>>>  a) vfio-pci could pick an address to do this as it is realized.
>>>  b) SeaBIOS/OVMF could program this.
>>>  
>>> Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to pick
>>> an address and mark it as e820 reserved.  I'm not sure how to pick that
>>> address.  We'd probably want to make the 0xFC config register read-
>>> only.  1.b) has the issue you mentioned where in most cases the OpRegion
>>> will be 8k, but the BIOS won't know how much address space it's mapping
>>> into system memory when it writes the 0xFC register.  I don't know how
>>> much of a problem this is since the BIOS can easily determine the size once
>>> mapped and re-map it somewhere there's sufficient space.
>>> Practically, it seems like it's always going to be 8K.  This of course
>>> requires modification to every BIOS.  It also leaves the 0xFC register as
>>> a mapping control rather than a pointer to the OpRegion in RAM, which
>>> doesn't really match real hardware.  The BIOS would need to pick an
>>> address in this case.
>>>  
>>> 2) Read-only mappings version of 1)
>>>  
>>> Discussion: Really nothing changes from the issues above, just prevents any
>>> possibility of the guest modifying anything in the host.  Xen apparently
>>> allows write access to the host page already.
>>>  
>>> 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
>>>  
>>> Discussion: No benefit that I can see over above other than maybe allowing
>>> write access that doesn't affect the host.
>>>  
>>> 4) Copy contents into a guest RAM location, mark it reserved, point to it
>>> via 0xFC config as scratch register.
>>>  a) Done by QEMU (vfio-pci)
>>>  b) Done by SeaBIOS/OVMF
>>>  
>>> Discussion: This is the most like real hardware.  4.a) has the usual issue
>>> of how to pick an address, but the benefit of not requiring BIOS changes
>>> (simply mark the RAM reserved via existing methods).  4.b) would require
>>> passing a buffer containing the contents of the OpRegion via fw_cfg and
>>> letting the BIOS do the setup.  The latter of course requires modifying
>>> each BIOS for this support.
>>>  
>>> Of course none of these support hotplug nor really can they since reserved
>>> memory regions are not dynamic in the architecture.
>>>  
>>> In all cases, some piece of software needs to know where it can place the
>>> OpRegion in guest memory.  It seems like there are advantages or
>>> disadvantages whether that's done by QEMU or the BIOS, but we only need
>>> to do it once if it's QEMU.  Suggestions, comments, preferences?
>>>  
>>  
>> Hi Alex, another thing to consider is how to communicate to the guest driver
>> that the address at 0xFC contains a valid GPA that can be accessed by the
>> driver without causing an EPT fault, since the same driver will be used on
>> other hypervisors and they may not EPT map OpRegion memory.  One idea
>> proposed by the display driver team is to set bit0 of the address to 1 to
>> indicate that OpRegion memory can be safely accessed by the guest driver.
> 
> Hi Allen,
> 
> Why is that any different than a guest accessing any other memory area
> that it shouldn't?  The OpRegion starts with a 16-byte ID string, so if
> the guest finds that it should feel fairly confident the OpRegion data
> is valid.  The published spec also seems to define all bits of 0xfc as
> valid, not implying any sort of alignment requirements, and the i915
> driver does a memremap directly on the value read from 0xfc.  So I'm not
> sure whether there's really a need to or ability to define any of those
> bits in an adhoc way to indicate mapping.  If we do things right,
> shouldn't the guest driver not even know it's running in a VM, at least
> for the KVMGT-d case, so we need to be compatible with physical
> hardware.  Thanks,
> 

I agree. An EPT page fault is acceptable when the guest accesses the
OpRegion, as long as KVM finds a proper PFN for that GPA while handling the
fault.  That is exactly what is expected for 'normal' memory.
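
For illustration, the 16-byte ID check Alex mentions could look roughly like
the sketch below.  The signature string "IntelGraphicsMem" is my assumption
from memory of the i915 driver, not something stated in this thread, and
opregion_looks_valid() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed OpRegion ID string; exactly 16 bytes, no terminator stored. */
#define OPREGION_SIGNATURE     "IntelGraphicsMem"
#define OPREGION_SIGNATURE_LEN 16

/* Returns true if the mapped region starts with the expected ID. */
bool opregion_looks_valid(const void *base)
{
    assert(base != NULL);
    return memcmp(base, OPREGION_SIGNATURE, OPREGION_SIGNATURE_LEN) == 0;
}
```

A driver that performs a check like this after mapping the address from 0xFC
needs no extra flag bit to tell whether the data is usable.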

> Alex
> 

--
Thanks,
Jike

Re: [Qemu-devel] [RFC 0/3] Draft implementation of HPT resizing (qemu side)

2016-01-28 Thread Alexander Graf



> On 29.01.2016 at 04:47, David Gibson wrote:
> 
>> On Thu, Jan 28, 2016 at 10:04:58PM +0100, Alexander Graf wrote:
>> 
>> 
>>> On 01/19/2016 12:02 PM, David Gibson wrote:
>>>> On Tue, Jan 19, 2016 at 01:18:17PM +0530, Bharata B Rao wrote:
>>>>> On Mon, Jan 18, 2016 at 04:44:38PM +1100, David Gibson wrote:
>>>>> Here is a draft qemu implementation of my proposed PAPR extension for
>>>>> allowing runtime resizing of a KVM/ppc64 guest's hash page table.
>>>>> That in turn will allow for more flexible memory hotplug.
>>>>>
>>>>> This should work with the guest kernel side patches I also posted
>>>>> recently [1].
>>>>>
>>>>> Still required to make this into a full implementation:
>>>>>  * Guest needs to auto-resize HPT on memory hotplug events
>>>>>
>>>>>  * qemu needs to allocate HPT size based on current rather than
>>>>>    maximum memory if the guest is HPT resize aware
>>>>>
>>>>>  * KVM host side implementation
>>>>>
>>>>>  * PAPR standardization
>>>> So with the current patchset (QEMU and guest kernel changes), I should
>>>> be able to change the HTAB size of a PR guest right ? I see the below
>>>> failure though:
>>> Uh.. to be honest I haven't really considered the KVM case at all.
>>> I'm kind of surprised it didn't just refuse to do anything.
>>> 
 [root@localhost ~]# cat /sys/kernel/debug/powerpc/pft-size
 24
 [root@localhost ~]# echo 26 > /sys/kernel/debug/powerpc/pft-size
 [   65.996845] lpar: Attempting to resize HPT to shift 26
 [   65.996845] lpar: Attempting to resize HPT to shift 26
 [   66.113596] lpar: HPT resize to shift 26 complete (109 ms / 6 ms)
 [   66.113596] lpar: HPT resize to shift 26 complete (109 ms / 6 ms)
 
 PR guest just hangs here while I see tons of below messages in
 the 1st level guest:
 
 KVM can't copy data from 0x3fff99e91400!
 ...
 Couldn't emulate instruction 0x (op 0 xop 0)
 kvmppc_handle_exit_pr: emulation at 700 failed ()
>>> Hm, not sure why that's happening.  At first I thought it was because
>>> we weren't updating SDR1 with the address of the new htab, but that's
>>> actually in there.  Maybe the KVM PR code isn't rereading it after
>>> initial VM startup.
>> 
>> The KVM PR code doesn't care - it just rereads SDR1 on every pteg lookup ;).
>> There's no caching at all.
> 
> Ok, no idea why it's not working then.  I'll investigate when I get a chance.
> 
>> Of course, the guest needs to invalidate all pending tlb entries if they're
>> now invalid.
>> 
>> Does this work on real hardware? Say, a G5?
> 
> As Paulus says it would be possible to do HPT resizing on real
> hardware, but the implementation I've done is specific to PAPR.  And
> obviously qemu wouldn't be relevant to that case.

So why make it specific to papr? Wouldn't it make sense to have it as a (ppc) 
generic interface in Linux?

For the PR PAPR case, QEMU allocates the HTAB, so it needs to make sure it 
pushes the changed address as new fake SDR1 value into kvm when it changes.


Alex
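
A rough sketch of the arithmetic behind the "fake SDR1" mentioned above. It
assumes the 64-bit hash MMU convention that the HTABSIZE field holds
log2(table size) - 18, and that the table base is aligned to the table size;
the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a hash page table with the given shift (log2 size). */
static uint64_t toy_hpt_size_bytes(unsigned shift)
{
    assert(shift >= 18 && shift < 64);
    return 1ULL << shift;
}

/*
 * Assumed encoding: table base ORed with (shift - 18).  After a resize
 * the value has to be recomputed and pushed to KVM again, because both
 * the base and the size field change.
 */
static uint64_t toy_fake_sdr1(uint64_t htab_base, unsigned shift)
{
    assert(shift >= 18 && shift < 64);
    assert((htab_base & (toy_hpt_size_bytes(shift) - 1)) == 0);
    return htab_base | (uint64_t)(shift - 18);
}
```

With the values from the log above, shift 24 is a 16 MiB table and shift 26 is
a 64 MiB table.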

Re: [Qemu-devel] [PATCH v7 02/13] exec: Remove cpu from cpus list during cpu_exec_exit()

2016-01-28 Thread Bharata B Rao

On Thu, Jan 28, 2016 at 05:19:33PM -0200, Eduardo Habkost wrote:
> On Thu, Jan 28, 2016 at 11:19:44AM +0530, Bharata B Rao wrote:
> > CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
> > should be removed from cpu_exec_exit().
> > 
> > cpu_exec_exit() is called from generic CPU::instance_finalize and some
> > archs like PowerPC call it from CPU unrealizefn. So ensure that we
> > dequeue the cpu only once.
> > 
> > Now -1 value for cpu->cpu_index indicates that we have already dequeued
> > the cpu for CONFIG_USER_ONLY case also.
> > 
> > Signed-off-by: Bharata B Rao 
> > Reviewed-by: David Gibson 
> > ---
> >  exec.c | 10 ++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/exec.c b/exec.c
> > index 7115403..c8da9d4 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -596,6 +596,7 @@ void cpu_exec_exit(CPUState *cpu)
> >  return;
> >  }
> >  
> > +QTAILQ_REMOVE(&cpus, cpu, node);
> >  bitmap_clear(cpu_index_map, cpu->cpu_index, 1);
> >  cpu->cpu_index = -1;
> >  }
> > @@ -614,6 +615,15 @@ static int cpu_get_free_index(Error **errp)
> >  
> >  void cpu_exec_exit(CPUState *cpu)
> >  {
> > +cpu_list_lock();
> > +if (cpu->cpu_index == -1) {
> > +cpu_list_unlock();
> > +return;
> > +}
> > +
> > +QTAILQ_REMOVE(&cpus, cpu, node);
> > +cpu->cpu_index = -1;
> > +cpu_list_unlock();
> 
> With this, the only differences between the two cpu_exec_exit()
> implementations are:
> 
> * cpu_list_lock()/cpu_list_unlock() functions.
>   * We can add !CONFIG_USER_ONLY stubs for them.
> * The bitmap_clear() call.
>   * It can be abstracted away in a cpu_release_index() function,
> just like we already have a CONFIG_USER_ONLY version of
> cpu_get_free_index().

Ok, made those changes so that cpu_exec_exit() will be a common routine
with some CONFIG_USER_ONLY ifdefs in between.

Regards,
Bharata.
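
A toy model of the shape Eduardo suggests, with one common exit routine and
the index bookkeeping hidden behind a release helper. All names here are
hypothetical stand-ins, the list is a plain singly linked list rather than a
QTAILQ, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CPUS 8

typedef struct ToyCPU {
    int cpu_index;              /* -1 means "not on the list" */
    struct ToyCPU *next;
} ToyCPU;

static ToyCPU *toy_cpus;                    /* head of the cpu list */
static bool toy_index_map[TOY_MAX_CPUS];    /* stands in for cpu_index_map */

static int toy_cpu_get_free_index(void)
{
    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        if (!toy_index_map[i]) {
            toy_index_map[i] = true;
            return i;
        }
    }
    return -1;
}

/* The part that would differ per configuration lives here. */
static void toy_cpu_release_index(ToyCPU *cpu)
{
    assert(cpu->cpu_index >= 0 && cpu->cpu_index < TOY_MAX_CPUS);
    toy_index_map[cpu->cpu_index] = false;
}

static bool toy_cpu_exec_init(ToyCPU *cpu)
{
    int idx = toy_cpu_get_free_index();

    cpu->cpu_index = idx;
    if (idx < 0) {
        return false;
    }
    cpu->next = toy_cpus;
    toy_cpus = cpu;
    return true;
}

/* Safe to call twice: the second call sees cpu_index == -1 and returns. */
static void toy_cpu_exec_exit(ToyCPU *cpu)
{
    if (cpu->cpu_index == -1) {
        return;
    }
    for (ToyCPU **pp = &toy_cpus; *pp; pp = &(*pp)->next) {
        if (*pp == cpu) {
            *pp = cpu->next;
            break;
        }
    }
    toy_cpu_release_index(cpu);
    cpu->cpu_index = -1;
    cpu->next = NULL;
}

static int toy_cpu_count(void)
{
    int n = 0;

    for (ToyCPU *c = toy_cpus; c; c = c->next) {
        n++;
    }
    return n;
}
```

The -1 sentinel is what lets both the finalize path and an unrealize path call
the exit routine without removing the CPU from the list twice.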

[Qemu-devel] [Question] a physical usb mouse becomes invalid when redirected and attached to a xHCI controller

2016-01-28 Thread Qixiong Su

A physical USB mouse stops working when it is redirected from the PC and
attached to an xHCI controller.


The QEMU command line is as follows.
/root/sqx/qemu-root/bin/qemu-system-x86_64 \
-name win7_sqx_qemu \
-machine pc-i440fx-2.1,accel=kvm,usb=off \
-m 1024 \
-realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 2792b55d-f9b0-4e81-bf71-466ca7338628 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7_sqx.monitor,server,nowait \
-mon chardev=charmonitor,id=monitor \
-rtc base=localtime \
-no-shutdown \
-global PIIX4_PM.disable_s3=1 \
-global PIIX4_PM.disable_s4=0 \
-boot strict=on \
-device nec-usb-xhci,id=xhci,bus=pci.0,p2=6,p3=6,addr=0x1.0x2 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
-drive file=/opt/sqx/win7_sqx.append,if=none,id=drive-ide0-0-0,format=qcow2,cache=writeback \
-device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-chardev pty,id=charserial1 \
-device isa-serial,chardev=charserial1,id=serial1 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
-device usb-tablet,id=input0 \
-spice port=5950,addr=0.0.0.0,disable-ticketing,seamless-migration=on \
-vnc 0.0.0.0:51 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x3 \
-device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
-device hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 \
-device hda-duplex,id=sound0-codec1,bus=sound0.0,cad=1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
-chardev spicevmc,name=usbredir,id=usbredirchardev1 \
-device usb-redir,chardev=usbredirchardev1,id=usbredirdev1,bus=xhci.0,debug=5 \
-chardev spicevmc,name=usbredir,id=usbredirchardev2 \
-device usb-redir,chardev=usbredirchardev2,id=usbredirdev2,bus=xhci.0,debug=5 \
-chardev spicevmc,name=usbredir,id=usbredirchardev3 \
-device usb-redir,chardev=usbredirchardev3,id=usbredirdev3,bus=xhci.0,debug=5 \
-cpu SandyBridge,+vmx,hv-relaxed=on


Has anyone encountered this problem, or does anyone know how to solve it?

[Qemu-devel] [PULL 39/39] target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro

2016-01-28 Thread David Gibson

From: James Clarke 

Signed-off-by: James Clarke 
Signed-off-by: David Gibson 
---
 target-ppc/cpu.h | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 0820390..f300c86 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -687,24 +687,37 @@ enum {
 
 #define FP_FX  (1ull << FPSCR_FX)
 #define FP_FEX (1ull << FPSCR_FEX)
+#define FP_VX  (1ull << FPSCR_VX)
 #define FP_OX  (1ull << FPSCR_OX)
-#define FP_OE  (1ull << FPSCR_OE)
 #define FP_UX  (1ull << FPSCR_UX)
-#define FP_UE  (1ull << FPSCR_UE)
-#define FP_XX  (1ull << FPSCR_XX)
-#define FP_XE  (1ull << FPSCR_XE)
 #define FP_ZX  (1ull << FPSCR_ZX)
-#define FP_ZE  (1ull << FPSCR_ZE)
-#define FP_VX  (1ull << FPSCR_VX)
+#define FP_XX  (1ull << FPSCR_XX)
 #define FP_VXSNAN  (1ull << FPSCR_VXSNAN)
 #define FP_VXISI   (1ull << FPSCR_VXISI)
-#define FP_VXIMZ   (1ull << FPSCR_VXIMZ)
-#define FP_VXZDZ   (1ull << FPSCR_VXZDZ)
 #define FP_VXIDI   (1ull << FPSCR_VXIDI)
+#define FP_VXZDZ   (1ull << FPSCR_VXZDZ)
+#define FP_VXIMZ   (1ull << FPSCR_VXIMZ)
 #define FP_VXVC    (1ull << FPSCR_VXVC)
+#define FP_FR  (1ull << FPSCR_FR)
+#define FP_FI  (1ull << FPSCR_FI)
+#define FP_C   (1ull << FPSCR_C)
+#define FP_FL  (1ull << FPSCR_FL)
+#define FP_FG  (1ull << FPSCR_FG)
+#define FP_FE  (1ull << FPSCR_FE)
+#define FP_FU  (1ull << FPSCR_FU)
+#define FP_FPCC    (FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_FPRF    (FP_C  | FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_VXSOFT  (1ull << FPSCR_VXSOFT)
+#define FP_VXSQRT  (1ull << FPSCR_VXSQRT)
 #define FP_VXCVI   (1ull << FPSCR_VXCVI)
 #define FP_VE  (1ull << FPSCR_VE)
-#define FP_FI  (1ull << FPSCR_FI)
+#define FP_OE  (1ull << FPSCR_OE)
+#define FP_UE  (1ull << FPSCR_UE)
+#define FP_ZE  (1ull << FPSCR_ZE)
+#define FP_XE  (1ull << FPSCR_XE)
+#define FP_NI  (1ull << FPSCR_NI)
+#define FP_RN1 (1ull << FPSCR_RN1)
+#define FP_RN  (1ull << FPSCR_RN)
 
 /*/
 /* Vector status and control register */
-- 
2.5.0
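
A small sketch of how the FP_ masks relate to the FPSCR_ bit positions, and
how the composite FP_FPCC and FP_FPRF masks fall out of the single-bit ones.
The bit numbers below are assumed for illustration and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions, assumed here for illustration (FPRF spans bits 12..16). */
enum {
    FPSCR_FU = 12,
    FPSCR_FE = 13,
    FPSCR_FG = 14,
    FPSCR_FL = 15,
    FPSCR_C  = 16,
};

/* One mask per bit position, same pattern as the patch. */
#define FP_FU   (1ull << FPSCR_FU)
#define FP_FE   (1ull << FPSCR_FE)
#define FP_FG   (1ull << FPSCR_FG)
#define FP_FL   (1ull << FPSCR_FL)
#define FP_C    (1ull << FPSCR_C)

/* Composite masks built from the single-bit masks. */
#define FP_FPCC (FP_FL | FP_FG | FP_FE | FP_FU)
#define FP_FPRF (FP_C  | FP_FL | FP_FG | FP_FE | FP_FU)

static_assert(FP_FPRF == (FP_C | FP_FPCC), "FPRF is C plus the FPCC bits");

/* True if any bit selected by mask is set in fpscr. */
static bool toy_fpscr_any_set(uint64_t fpscr, uint64_t mask)
{
    return (fpscr & mask) != 0;
}

/* Clear every bit selected by mask. */
static uint64_t toy_fpscr_clear(uint64_t fpscr, uint64_t mask)
{
    return fpscr & ~mask;
}
```

Having a mask for every bit position means callers can write
`fpscr & FP_FPRF` instead of repeating the shifts by hand.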

[Qemu-devel] [PULL 03/39] macio: use the existing IDEDMA aiocb to hold the active DMA aiocb

2016-01-28 Thread David Gibson

From: Mark Cave-Ayland 

Currently the aiocb is held within MACIOIDEState, however the IDE core code
assumes that the current active DMA aiocb is held in aiocb in a few places,
e.g. ide_bus_reset() and ide_reset().

Switch over to using IDEDMA aiocb to store the aiocb for the current active
DMA request so that bus resets and restarts are handled correctly. As a
consequence we can now use ide_set_inactive() rather than handling its
functionality ourselves.
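
A toy model of why the location of the aiocb matters. The types below are
hypothetical stand-ins, not the real IDE structures: the point is only that a
generic reset path can cancel a request it can see, and cannot cancel one
held privately by the device front end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyAIOCB {
    bool cancelled;
} ToyAIOCB;

/* Stands in for the shared DMA state that the core code inspects. */
typedef struct ToyDMA {
    ToyAIOCB *aiocb;
} ToyDMA;

typedef struct ToyBus {
    ToyDMA dma;
} ToyBus;

/* Front end: record the in-flight request where the core can find it. */
static void toy_start_dma(ToyBus *bus, ToyAIOCB *cb)
{
    assert(cb != NULL);
    cb->cancelled = false;
    bus->dma.aiocb = cb;
}

/* Completion path: nothing is in flight any more. */
static void toy_set_inactive(ToyBus *bus)
{
    bus->dma.aiocb = NULL;
}

/* Core reset: cancels whatever request is recorded in the shared state. */
static void toy_bus_reset(ToyBus *bus)
{
    if (bus->dma.aiocb) {
        bus->dma.aiocb->cancelled = true;
        bus->dma.aiocb = NULL;
    }
}
```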

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: John Snow 
Signed-off-by: David Gibson 
---
 hw/ide/macio.c  |  20 +-
 hw/ide/macio.c.orig | 634 
 hw/ppc/mac.h|   1 -
 3 files changed, 646 insertions(+), 9 deletions(-)
 create mode 100644 hw/ide/macio.c.orig

diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index d4031b6..110af46 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -119,8 +119,8 @@ static void pmac_dma_read(BlockBackend *blk,
 MACIO_DPRINTF("--- Block read transfer - sector_num: %" PRIx64 "  "
   "nsector: %x\n", (offset >> 9), (bytes >> 9));
 
-m->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov, (bytes >> 9),
- cb, io);
+s->bus->dma->aiocb = blk_aio_readv(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_dma_write(BlockBackend *blk,
@@ -204,8 +204,8 @@ static void pmac_dma_write(BlockBackend *blk,
 MACIO_DPRINTF("--- Block write transfer - sector_num: %" PRIx64 "  "
   "nsector: %x\n", (offset >> 9), (bytes >> 9));
 
-m->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov, (bytes >> 9),
-  cb, io);
+s->bus->dma->aiocb = blk_aio_writev(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_dma_trim(BlockBackend *blk,
@@ -231,8 +231,8 @@ static void pmac_dma_trim(BlockBackend *blk,
 s->io_buffer_index += io->len;
 io->len = 0;
 
-m->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov, (bytes >> 9),
-  cb, io);
+s->bus->dma->aiocb = ide_issue_trim(blk, (offset >> 9), &io->iov,
+ (bytes >> 9), cb, io);
 }
 
 static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
@@ -291,6 +291,8 @@ done:
 } else {
 block_acct_done(blk_get_stats(s->blk), &s->acct);
 }
+
+ide_set_inactive(s, false);
 io->dma_end(opaque);
 }
 
@@ -305,7 +307,6 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
 
 if (ret < 0) {
 MACIO_DPRINTF("DMA error: %d\n", ret);
-m->aiocb = NULL;
 ide_dma_error(s);
 goto done;
 }
@@ -356,6 +357,8 @@ done:
 block_acct_done(blk_get_stats(s->blk), &s->acct);
 }
 }
+
+ide_set_inactive(s, false);
 io->dma_end(opaque);
 }
 
@@ -393,8 +396,9 @@ static void pmac_ide_transfer(DBDMA_io *io)
 static void pmac_ide_flush(DBDMA_io *io)
 {
 MACIOIDEState *m = io->opaque;
+IDEState *s = idebus_active_if(&m->bus);
 
-if (m->aiocb) {
+if (s->bus->dma->aiocb) {
 blk_drain_all();
 }
 }
diff --git a/hw/ide/macio.c.orig b/hw/ide/macio.c.orig
new file mode 100644
index 000..d4031b6
--- /dev/null
+++ b/hw/ide/macio.c.orig
@@ -0,0 +1,634 @@
+/*
+ * QEMU IDE Emulation: MacIO support.
+ *
+ * Copyright (c) 2003 Fabrice Bellard
+ * Copyright (c) 2006 Openedhand Ltd.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "hw/hw.h"
+#include "hw/ppc/mac.h"
+#include "hw/ppc/mac_dbdma.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/dma.h"
+
+#include 
+
+/* debug MACIO */
+// #define DEBUG_MACIO
+
+#ifdef DEBUG_MACIO
+static const int debug_macio = 1;
+#else
+static const int debug_macio = 0;
+#endif
+
+#define MACIO_DPRINTF(fmt, ...) do { \
+if (debug_macio) { \
+printf(fmt , ## __VA_ARGS__); \
+} \
+} while (

[Qemu-devel] [PULL 31/39] target-ppc: Rework ppc_store_slb

2016-01-28 Thread David Gibson

ppc_store_slb updates the SLB for PPC cpus with 64-bit hash MMUs.
Currently it takes two parameters, which contain values encoded as the
register arguments to the slbmte instruction: one register contains the
ESID portion of the SLBE and also the slot number, and the other contains
the VSID portion of the SLBE.

We're shortly going to want to do some SLB updates from other code where
it is more convenient to supply the slot number and ESID separately, so
rework this function and its callers to work this way.

As a bonus, this slightly simplifies the emulation of segment registers for
when running a 32-bit OS on a 64-bit CPU.
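
A minimal sketch of the split that callers now perform on the slbmte RB
operand before calling the reworked function: the low 12 bits are the slot
number and the rest is the ESID word. The two SLB_ESID_* values are assumed
for illustration and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLB_SLOT_MASK   0xfffULL
#define SLB_ESID_V      0x0000000008000000ULL   /* assumed: valid bit */
#define SLB_ESID_ESID   0xFFFFFFFFF0000000ULL   /* assumed: ESID field */

static_assert((SLB_ESID_ESID & SLB_SLOT_MASK) == 0,
              "ESID field must not overlap the slot bits");

typedef struct ToySlbmteRb {
    uint64_t slot;  /* SLB entry index */
    uint64_t esid;  /* ESID word, slot bits removed */
} ToySlbmteRb;

/* Split an slbmte-style RB value into slot number and ESID word. */
static ToySlbmteRb toy_split_slbmte_rb(uint64_t rb)
{
    ToySlbmteRb out = {
        .slot = rb & SLB_SLOT_MASK,
        .esid = rb & ~SLB_SLOT_MASK,
    };
    return out;
}

/* Mirrors the new reserved-bits check on the ESID word. */
static bool toy_esid_has_reserved_bits(uint64_t esid)
{
    return (esid & ~(SLB_ESID_ESID | SLB_ESID_V)) != 0;
}
```

Code that already knows the slot and ESID separately can skip the split and
pass them straight through, which is the convenience the rework is after.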

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/kvm.c|  2 +-
 target-ppc/mmu-hash64.c | 24 +---
 target-ppc/mmu-hash64.h |  3 ++-
 target-ppc/mmu_helper.c | 14 +-
 4 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 98d7ba6..0f45380 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1205,7 +1205,7 @@ int kvm_arch_get_registers(CPUState *cs)
  * Only restore valid entries
  */
 if (rb & SLB_ESID_V) {
-ppc_store_slb(cpu, rb, rs);
+ppc_store_slb(cpu, rb & 0xfff, rb & ~0xfffULL, rs);
 }
 }
 #endif
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 03e25fd..6e05643 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -135,28 +135,30 @@ void helper_slbie(CPUPPCState *env, target_ulong addr)
 }
 }
 
-int ppc_store_slb(PowerPCCPU *cpu, target_ulong rb, target_ulong rs)
+int ppc_store_slb(PowerPCCPU *cpu, target_ulong slot,
+  target_ulong esid, target_ulong vsid)
 {
 CPUPPCState *env = &cpu->env;
-int slot = rb & 0xfff;
 ppc_slb_t *slb = &env->slb[slot];
 
-if (rb & (0x1000 - env->slb_nr)) {
-return -1; /* Reserved bits set or slot too high */
+if (slot >= env->slb_nr) {
+return -1; /* Bad slot number */
+}
+if (esid & ~(SLB_ESID_ESID | SLB_ESID_V)) {
+return -1; /* Reserved bits set */
 }
-if (rs & (SLB_VSID_B & ~SLB_VSID_B_1T)) {
+if (vsid & (SLB_VSID_B & ~SLB_VSID_B_1T)) {
 return -1; /* Bad segment size */
 }
-if ((rs & SLB_VSID_B) && !(env->mmu_model & POWERPC_MMU_1TSEG)) {
+if ((vsid & SLB_VSID_B) && !(env->mmu_model & POWERPC_MMU_1TSEG)) {
 return -1; /* 1T segment on MMU that doesn't support it */
 }
 
-/* Mask out the slot number as we store the entry */
-slb->esid = rb & (SLB_ESID_ESID | SLB_ESID_V);
-slb->vsid = rs;
+slb->esid = esid;
+slb->vsid = vsid;
 
 LOG_SLB("%s: %d " TARGET_FMT_lx " - " TARGET_FMT_lx " => %016" PRIx64
-" %016" PRIx64 "\n", __func__, slot, rb, rs,
+" %016" PRIx64 "\n", __func__, slot, esid, vsid,
 slb->esid, slb->vsid);
 
 return 0;
@@ -196,7 +198,7 @@ void helper_store_slb(CPUPPCState *env, target_ulong rb, 
target_ulong rs)
 {
 PowerPCCPU *cpu = ppc_env_get_cpu(env);
 
-if (ppc_store_slb(cpu, rb, rs) < 0) {
+if (ppc_store_slb(cpu, rb & 0xfff, rb & ~0xfffULL, rs) < 0) {
 helper_raise_exception_err(env, POWERPC_EXCP_PROGRAM,
POWERPC_EXCP_INVAL);
 }
diff --git a/target-ppc/mmu-hash64.h b/target-ppc/mmu-hash64.h
index 6e3de7e..24fd2c4 100644
--- a/target-ppc/mmu-hash64.h
+++ b/target-ppc/mmu-hash64.h
@@ -6,7 +6,8 @@
 #ifdef TARGET_PPC64
 void ppc_hash64_check_page_sizes(PowerPCCPU *cpu, Error **errp);
 void dump_slb(FILE *f, fprintf_function cpu_fprintf, PowerPCCPU *cpu);
-int ppc_store_slb(PowerPCCPU *cpu, target_ulong rb, target_ulong rs);
+int ppc_store_slb(PowerPCCPU *cpu, target_ulong slot,
+  target_ulong esid, target_ulong vsid);
 hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, target_ulong addr);
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, target_ulong address, int rw,
 int mmu_idx);
diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index 0ab73bc..c040b17 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -2088,21 +2088,17 @@ void helper_store_sr(CPUPPCState *env, target_ulong 
srnum, target_ulong value)
 (int)srnum, value, env->sr[srnum]);
 #if defined(TARGET_PPC64)
 if (env->mmu_model & POWERPC_MMU_64) {
-uint64_t rb = 0, rs = 0;
+uint64_t esid, vsid;
 
 /* ESID = srnum */
-rb |= ((uint32_t)srnum & 0xf) << 28;
-/* Set the valid bit */
-rb |= SLB_ESID_V;
-/* Index = ESID */
-rb |= (uint32_t)srnum;
+esid = ((uint64_t)(srnum & 0xf) << 28) | SLB_ESID_V;
 
 /* VSID = VSID */
-rs |= (value & 0xfff) << 12;
+vsid = (value & 0xfff) << 12;
 /* flags = flags */
-rs |= ((value >> 27) & 0xf) << 8;
+

[Qemu-devel] [PULL 36/39] target-ppc: Add new TLB invalidate by HPTE call for hash64 MMUs

2016-01-28 Thread David Gibson

When HPTEs are removed or modified by hypercalls on spapr, we need to
invalidate the relevant pages in the qemu TLB.

Currently we do that by doing some complicated calculations to work out the
right encoding for the tlbie instruction, then passing that to
ppc_tlb_invalidate_one()... which totally ignores the argument and flushes
the whole tlb.

Avoid that by adding a new flush-by-hpte helper in mmu-hash64.c.

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 hw/ppc/spapr_hcall.c| 46 --
 target-ppc/mmu-hash64.c | 12 
 target-ppc/mmu-hash64.h |  3 +++
 3 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 4707196..dedc7e0 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -37,42 +37,6 @@ static void set_spr(CPUState *cs, int spr, target_ulong 
value,
 run_on_cpu(cs, do_spr_sync, &s);
 }
 
-static target_ulong compute_tlbie_rb(target_ulong v, target_ulong r,
- target_ulong pte_index)
-{
-target_ulong rb, va_low;
-
-rb = (v & ~0x7fULL) << 16; /* AVA field */
-va_low = pte_index >> 3;
-if (v & HPTE64_V_SECONDARY) {
-va_low = ~va_low;
-}
-/* xor vsid from AVA */
-if (!(v & HPTE64_V_1TB_SEG)) {
-va_low ^= v >> 12;
-} else {
-va_low ^= v >> 24;
-}
-va_low &= 0x7ff;
-if (v & HPTE64_V_LARGE) {
-rb |= 1; /* L field */
-#if 0 /* Disable that P7 specific bit for now */
-if (r & 0xff000) {
-/* non-16MB large page, must be 64k */
-/* (masks depend on page size) */
-rb |= 0x1000;/* page encoding in LP field */
-rb |= (va_low & 0x7f) << 16; /* 7b of VA in AVA/LP field */
-rb |= (va_low & 0xfe);   /* AVAL field */
-}
-#endif
-} else {
-/* 4kB page */
-rb |= (va_low & 0x7ff) << 12;   /* remaining 11b of AVA */
-}
-rb |= (v >> 54) & 0x300;/* B field */
-return rb;
-}
-
 static inline bool valid_pte_index(CPUPPCState *env, target_ulong pte_index)
 {
 /*
@@ -198,7 +162,7 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 {
 CPUPPCState *env = &cpu->env;
 uint64_t token;
-target_ulong v, r, rb;
+target_ulong v, r;
 
 if (!valid_pte_index(env, ptex)) {
 return REMOVE_PARM;
@@ -217,8 +181,7 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, 
target_ulong ptex,
 *vp = v;
 *rp = r;
 ppc_hash64_store_hpte(cpu, ptex, HPTE64_V_HPTE_DIRTY, 0);
-rb = compute_tlbie_rb(v, r, ptex);
-ppc_tlb_invalidate_one(env, rb);
+ppc_hash64_tlb_flush_hpte(cpu, ptex, v, r);
 return REMOVE_SUCCESS;
 }
 
@@ -322,7 +285,7 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
 uint64_t token;
-target_ulong v, r, rb;
+target_ulong v, r;
 
 if (!valid_pte_index(env, pte_index)) {
 return H_PARAMETER;
@@ -343,10 +306,9 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 r |= (flags << 55) & HPTE64_R_PP0;
 r |= (flags << 48) & HPTE64_R_KEY_HI;
 r |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
-rb = compute_tlbie_rb(v, r, pte_index);
 ppc_hash64_store_hpte(cpu, pte_index,
   (v & ~HPTE64_V_VALID) | HPTE64_V_HPTE_DIRTY, 0);
-ppc_tlb_invalidate_one(env, rb);
+ppc_hash64_tlb_flush_hpte(cpu, pte_index, v, r);
 /* Don't need a memory barrier, due to qemu's global lock */
 ppc_hash64_store_hpte(cpu, pte_index, v | HPTE64_V_HPTE_DIRTY, r);
 return H_SUCCESS;
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index ee1e8bf..3284776 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -707,3 +707,15 @@ void ppc_hash64_store_hpte(PowerPCCPU *cpu,
  env->htab_base + pte_index + HASH_PTE_SIZE_64 / 2, pte1);
 }
 }
+
+void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu,
+   target_ulong pte_index,
+   target_ulong pte0, target_ulong pte1)
+{
+/*
+ * XXX: given the fact that there are too many segments to
+ * invalidate, and we still don't have a tlb_flush_mask(env, n,
+ * mask) in QEMU, we just invalidate all TLBs
+ */
+tlb_flush(CPU(cpu), 1);
+}
diff --git a/target-ppc/mmu-hash64.h b/target-ppc/mmu-hash64.h
index 24fd2c4..293a951 100644
--- a/target-ppc/mmu-hash64.h
+++ b/target-ppc/mmu-hash64.h
@@ -13,6 +13,9 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, target_ulong 
address, int rw,
 int mmu_idx);
 void ppc_hash64_store_hpte(PowerPCCPU *cpu, target_ulong index,
target_ulong pte0, target_ulong pte1);
+void ppc_hash64_tlb_flush_hpte(Power

[Qemu-devel] [PULL 37/39] target-ppc: Helper to determine page size information from hpte alone

2016-01-28 Thread David Gibson

h_enter() in the spapr code needs to know the page size of the HPTE it's
about to insert.  Unlike other paths that do this, it doesn't have access
to the SLB, so at the moment it determines this with some open-coded
tests which assume POWER7 or POWER8 page size encodings.

To make this more flexible add ppc_hash64_hpte_page_shift_noslb() to
determine both the "base" page size per segment, and the individual
effective page size from an HPTE alone.

This means that the spapr code should now be able to handle any page size
listed in the env->sps table.
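
A simplified sketch of the table-driven lookup described above. The table
mirrors the POWER7/POWER8 encodings added later in this series; the HPTE64_*
constant values, the type names and the toy_ helpers are assumptions for
illustration, and the per-entry match is a guess at what hpte_page_shift()
does rather than a copy of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HPTE64_V_LARGE      0x0000000000000004ULL  /* assumed */
#define HPTE64_R_RPN        0x0ffffffffffff000ULL  /* assumed */
#define HPTE64_R_RPN_SHIFT  12
#define TOY_MAX_ENC         4

struct toy_page_size { unsigned page_shift; uint64_t pte_enc; };
struct toy_seg_page_size {
    unsigned page_shift;                    /* base page size of the segment */
    struct toy_page_size enc[TOY_MAX_ENC];  /* zero page_shift ends the list */
};

static const struct toy_seg_page_size toy_sps[] = {
    { 12, { { 12, 0 }, { 16, 0x7 }, { 24, 0x38 } } },   /* 4K segments  */
    { 16, { { 16, 0x1 }, { 24, 0x8 } } },               /* 64K segments */
    { 24, { { 24, 0 } } },                              /* 16M segments */
    { 34, { { 34, 0x3 } } },                            /* 16G segments */
};

/* Match the low RPN bits of a large-page HPTE against one segment's list. */
static unsigned toy_hpte_page_shift(const struct toy_seg_page_size *sps,
                                    uint64_t pte1)
{
    for (size_t i = 0; i < TOY_MAX_ENC && sps->enc[i].page_shift; i++) {
        const struct toy_page_size *ps = &sps->enc[i];
        uint64_t mask;

        if (ps->page_shift == 12) {
            continue;   /* L bit is set, so this cannot be a 4K page */
        }
        mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN;
        if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
            return ps->page_shift;
        }
    }
    return 0;
}

/* Returns the actual page shift, or 0 for a bad encoding. */
static unsigned toy_page_shift_noslb(uint64_t pte0, uint64_t pte1,
                                     unsigned *seg_page_shift)
{
    assert(seg_page_shift != NULL);
    if (!(pte0 & HPTE64_V_LARGE)) {
        *seg_page_shift = 12;
        return 12;
    }
    for (size_t i = 0; i < sizeof(toy_sps) / sizeof(toy_sps[0]); i++) {
        unsigned shift = toy_hpte_page_shift(&toy_sps[i], pte1);

        if (shift) {
            *seg_page_shift = toy_sps[i].page_shift;
            return shift;
        }
    }
    *seg_page_shift = 0;
    return 0;
}
```

The first matching entry wins, which is why the encodings in the table have to
be chosen so that no two segment sizes accept the same bit pattern.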

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 hw/ppc/spapr_hcall.c| 25 ++---
 target-ppc/mmu-hash64.c | 35 +++
 target-ppc/mmu-hash64.h |  3 +++
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index dedc7e0..a535c73 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -72,31 +72,18 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 target_ulong pte_index = args[1];
 target_ulong pteh = args[2];
 target_ulong ptel = args[3];
-target_ulong page_shift = 12;
+unsigned apshift, spshift;
 target_ulong raddr;
 target_ulong index;
 uint64_t token;
 
-/* only handle 4k and 16M pages for now */
-if (pteh & HPTE64_V_LARGE) {
-#if 0 /* We don't support 64k pages yet */
-if ((ptel & 0xf000) == 0x1000) {
-/* 64k page */
-} else
-#endif
-if ((ptel & 0xff000) == 0) {
-/* 16M page */
-page_shift = 24;
-/* lowest AVA bit must be 0 for 16M pages */
-if (pteh & 0x80) {
-return H_PARAMETER;
-}
-} else {
-return H_PARAMETER;
-}
+apshift = ppc_hash64_hpte_page_shift_noslb(cpu, pteh, ptel, &spshift);
+if (!apshift) {
+/* Bad page size encoding */
+return H_PARAMETER;
 }
 
-raddr = (ptel & HPTE64_R_RPN) & ~((1ULL << page_shift) - 1);
+raddr = (ptel & HPTE64_R_RPN) & ~((1ULL << apshift) - 1);
 
 if (is_ram_address(spapr, raddr)) {
 /* Regular RAM - should have WIMG=0010 */
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 3284776..19ee942 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -512,6 +512,41 @@ static unsigned hpte_page_shift(const struct 
ppc_one_seg_page_size *sps,
 return 0; /* Bad page size encoding */
 }
 
+unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
+  uint64_t pte0, uint64_t pte1,
+  unsigned *seg_page_shift)
+{
+CPUPPCState *env = &cpu->env;
+int i;
+
+if (!(pte0 & HPTE64_V_LARGE)) {
+*seg_page_shift = 12;
+return 12;
+}
+
+/*
+ * The encodings in env->sps need to be carefully chosen so that
+ * this gives an unambiguous result.
+ */
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const struct ppc_one_seg_page_size *sps = &env->sps.sps[i];
+unsigned shift;
+
+if (!sps->page_shift) {
+break;
+}
+
+shift = hpte_page_shift(sps, pte0, pte1);
+if (shift) {
+*seg_page_shift = sps->page_shift;
+return shift;
+}
+}
+
+*seg_page_shift = 0;
+return 0;
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, target_ulong eaddr,
 int rwx, int mmu_idx)
 {
diff --git a/target-ppc/mmu-hash64.h b/target-ppc/mmu-hash64.h
index 293a951..34cf975 100644
--- a/target-ppc/mmu-hash64.h
+++ b/target-ppc/mmu-hash64.h
@@ -16,6 +16,9 @@ void ppc_hash64_store_hpte(PowerPCCPU *cpu, target_ulong 
index,
 void ppc_hash64_tlb_flush_hpte(PowerPCCPU *cpu,
target_ulong pte_index,
target_ulong pte0, target_ulong pte1);
+unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
+  uint64_t pte0, uint64_t pte1,
+  unsigned *seg_page_shift);
 #endif
 
 /*
-- 
2.5.0

[Qemu-devel] [PULL 35/39] target-ppc: Split 44x tlbiva from ppc_tlb_invalidate_one()

2016-01-28 Thread David Gibson

Currently the tlbiva instruction (used on 44x chips) and the tlbie
instruction (used on hash MMU chips) are both handled via
ppc_tlb_invalidate_one().  This is silly, because they're invoked from
different places, and do different things.

Clean this up by separating out the tlbiva instruction into its own
handling.  In fact the implementation is only a stub anyway.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/helper.h |  1 +
 target-ppc/mmu_helper.c | 14 ++
 target-ppc/translate.c  |  2 +-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 869be15..e5a8f7b 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -544,6 +544,7 @@ DEF_HELPER_2(74xx_tlbd, void, env, tl)
 DEF_HELPER_2(74xx_tlbi, void, env, tl)
 DEF_HELPER_FLAGS_1(tlbia, TCG_CALL_NO_RWG, void, env)
 DEF_HELPER_FLAGS_2(tlbie, TCG_CALL_NO_RWG, void, env, tl)
+DEF_HELPER_FLAGS_2(tlbiva, TCG_CALL_NO_RWG, void, env, tl)
 #if defined(TARGET_PPC64)
 DEF_HELPER_FLAGS_3(store_slb, TCG_CALL_NO_RWG, void, env, tl, tl)
 DEF_HELPER_2(load_slb_esid, tl, env, tl)
diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index 82ebe5d..04b1fe1 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -1971,10 +1971,6 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 ppc6xx_tlb_invalidate_virt(env, addr, 1);
 }
 break;
-case POWERPC_MMU_BOOKE:
-/* XXX: TODO */
-cpu_abort(CPU(cpu), "BookE MMU model is not implemented\n");
-break;
 case POWERPC_MMU_32B:
 case POWERPC_MMU_601:
 /* tlbie invalidate TLBs for all segments */
@@ -2116,6 +2112,16 @@ void helper_tlbie(CPUPPCState *env, target_ulong addr)
 ppc_tlb_invalidate_one(env, addr);
 }
 
+void helper_tlbiva(CPUPPCState *env, target_ulong addr)
+{
+PowerPCCPU *cpu = ppc_env_get_cpu(env);
+
+/* tlbiva instruction only exists on BookE */
+assert(env->mmu_model == POWERPC_MMU_BOOKE);
+/* XXX: TODO */
+cpu_abort(CPU(cpu), "BookE MMU model is not implemented\n");
+}
+
 /* Software driven TLBs management */
 /* PowerPC 602/603 software TLB load instructions helpers */
 static void do_6xx_tlb(CPUPPCState *env, target_ulong new_EPN, int is_code)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4be7eaa..a05a169 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -5904,7 +5904,7 @@ static void gen_tlbiva(DisasContext *ctx)
 }
 t0 = tcg_temp_new();
 gen_addr_reg_index(ctx, t0);
-gen_helper_tlbie(cpu_env, cpu_gpr[rB(ctx->opcode)]);
+gen_helper_tlbiva(cpu_env, cpu_gpr[rB(ctx->opcode)]);
 tcg_temp_free(t0);
 #endif
 }
-- 
2.5.0

[Qemu-devel] [PULL 38/39] target-ppc: Allow more page sizes for POWER7 & POWER8 in TCG

2016-01-28 Thread David Gibson

Now that the TCG and spapr code has been extended to allow (semi-)
arbitrary page encodings in the CPU's 'sps' table, we can add the many
page sizes supported by real POWER7 and POWER8 hardware that we previously
didn't support in TCG.

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/mmu-hash64.h |  2 ++
 target-ppc/translate_init.c | 32 
 2 files changed, 34 insertions(+)

diff --git a/target-ppc/mmu-hash64.h b/target-ppc/mmu-hash64.h
index 34cf975..ab0f86b 100644
--- a/target-ppc/mmu-hash64.h
+++ b/target-ppc/mmu-hash64.h
@@ -48,6 +48,8 @@ unsigned ppc_hash64_hpte_page_shift_noslb(PowerPCCPU *cpu,
 #define SLB_VSID_LLP_MASK   (SLB_VSID_L | SLB_VSID_LP)
 #define SLB_VSID_4K 0xULL
 #define SLB_VSID_64K0x0110ULL
+#define SLB_VSID_16M0x0100ULL
+#define SLB_VSID_16G0x0120ULL
 
 /*
  * Hash page table definitions
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 156d156..d557043 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8104,6 +8104,36 @@ static Property powerpc_servercpu_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+#ifdef CONFIG_SOFTMMU
+static const struct ppc_segment_page_sizes POWER7_POWER8_sps = {
+.sps = {
+{
+.page_shift = 12, /* 4K */
+.slb_enc = 0,
+.enc = { { .page_shift = 12, .pte_enc = 0 },
+ { .page_shift = 16, .pte_enc = 0x7 },
+ { .page_shift = 24, .pte_enc = 0x38 }, },
+},
+{
+.page_shift = 16, /* 64K */
+.slb_enc = SLB_VSID_64K,
+.enc = { { .page_shift = 16, .pte_enc = 0x1 },
+ { .page_shift = 24, .pte_enc = 0x8 }, },
+},
+{
+.page_shift = 24, /* 16M */
+.slb_enc = SLB_VSID_16M,
+.enc = { { .page_shift = 24, .pte_enc = 0 }, },
+},
+{
+.page_shift = 34, /* 16G */
+.slb_enc = SLB_VSID_16G,
+.enc = { { .page_shift = 34, .pte_enc = 0x3 }, },
+},
+}
+};
+#endif /* CONFIG_SOFTMMU */
+
 static void init_proc_POWER7 (CPUPPCState *env)
 {
 init_proc_book3s_64(env, BOOK3S_CPU_POWER7);
@@ -8167,6 +8197,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->mmu_model = POWERPC_MMU_2_06;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
+pcc->sps = &POWER7_POWER8_sps;
 #endif
 pcc->excp_model = POWERPC_EXCP_POWER7;
 pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
@@ -8247,6 +8278,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 pcc->mmu_model = POWERPC_MMU_2_07;
 #if defined(CONFIG_SOFTMMU)
 pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
+pcc->sps = &POWER7_POWER8_sps;
 #endif
 pcc->excp_model = POWERPC_EXCP_POWER7;
 pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
-- 
2.5.0

[Qemu-devel] [PULL 32/39] target-ppc: Rework SLB page size lookup

2016-01-28 Thread David Gibson

Currently, the ppc_hash64_page_shift() function looks up a page size based
on information in an SLB entry.  It open codes the bit translation for
existing CPUs; however, different CPU models can have different SLB
encodings.  We already store those in the 'sps' table in CPUPPCState, but
we don't currently enforce that the table actually matches the logic in
ppc_hash64_page_shift().

This patch reworks lookup of page size from SLB in several ways:
  * ppc_store_slb() will now fail (triggering an illegal instruction
exception) if given a bad SLB page size encoding
  * On success ppc_store_slb() stores a pointer to the relevant entry in
the page size table in the SLB entry.  This is looked up directly from
the published table of page size encodings, so can't get out of sync.
  * ppc_hash64_htab_lookup() and others now use this precached page size
information rather than decoding the SLB values
  * Now that callers have easy access to the page_shift,
ppc_hash64_pte_raddr() amounts to just a deposit64(), so remove it and
have the callers use deposit64() directly.
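
For readers skimming the series, the table-driven lookup described in the list above can be sketched in isolation.  This is only a minimal illustration, not the code from the patch: every ex_* name is hypothetical, and the mask and encoding values are assumptions chosen for the example rather than the definitions in mmu-hash64.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real definitions; values are illustrative. */
#define EX_LLP_MASK  0x130ULL  /* page size bits of the VSID word */
#define EX_ENC_4K    0x000ULL
#define EX_ENC_64K   0x110ULL
#define EX_ENC_16M   0x100ULL
#define EX_MAX_SIZES 8

struct ex_seg_page_size {
    unsigned page_shift;  /* 0 terminates the table */
    uint64_t slb_enc;     /* encoding expected in the VSID word */
};

/* Published table of supported segment page sizes (unused slots are zero). */
static const struct ex_seg_page_size ex_table[EX_MAX_SIZES] = {
    { 12, EX_ENC_4K },
    { 16, EX_ENC_64K },
    { 24, EX_ENC_16M },
};

/* Return the matching table entry, or NULL for a bad encoding. */
static const struct ex_seg_page_size *ex_lookup_sps(uint64_t vsid)
{
    for (size_t i = 0; i < EX_MAX_SIZES; i++) {
        const struct ex_seg_page_size *sps = &ex_table[i];

        if (!sps->page_shift) {
            break;  /* end of the populated entries */
        }
        if ((vsid & EX_LLP_MASK) == sps->slb_enc) {
            return sps;
        }
    }
    return NULL;
}

/* Convenience wrapper: page shift for a VSID word, 0 if it is rejected. */
static unsigned ex_page_shift_for(uint64_t vsid)
{
    const struct ex_seg_page_size *sps = ex_lookup_sps(vsid);

    return sps ? sps->page_shift : 0;
}
```

Caching the returned pointer in the SLB entry at store time is what lets later lookups read the page shift directly instead of decoding the VSID bits again.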

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/cpu.h|  1 +
 target-ppc/machine.c| 20 +
 target-ppc/mmu-hash64.c | 74 +++--
 3 files changed, 56 insertions(+), 39 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 2bc96b4..0820390 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -419,6 +419,7 @@ typedef struct ppc_slb_t ppc_slb_t;
 struct ppc_slb_t {
 uint64_t esid;
 uint64_t vsid;
+const struct ppc_one_seg_page_size *sps;
 };
 
 #define MAX_SLB_ENTRIES 64
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index b61c060..ca62d3e 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -2,6 +2,7 @@
 #include "hw/boards.h"
 #include "sysemu/kvm.h"
 #include "helper_regs.h"
+#include "mmu-hash64.h"
 
 static int cpu_load_old(QEMUFile *f, void *opaque, int version_id)
 {
@@ -352,11 +353,30 @@ static bool slb_needed(void *opaque)
 return (cpu->env.mmu_model & POWERPC_MMU_64);
 }
 
+static int slb_post_load(void *opaque, int version_id)
+{
+PowerPCCPU *cpu = opaque;
+CPUPPCState *env = &cpu->env;
+int i;
+
+/* We've pulled in the raw esid and vsid values from the migration
+ * stream, but we need to recompute the page size pointers */
+for (i = 0; i < env->slb_nr; i++) {
+if (ppc_store_slb(cpu, i, env->slb[i].esid, env->slb[i].vsid) < 0) {
+/* Migration source had bad values in its SLB */
+return -1;
+}
+}
+
+return 0;
+}
+
 static const VMStateDescription vmstate_slb = {
 .name = "cpu/slb",
 .version_id = 1,
 .minimum_version_id = 1,
 .needed = slb_needed,
+.post_load = slb_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_INT32_EQUAL(env.slb_nr, PowerPCCPU),
 VMSTATE_SLB_ARRAY(env.slb, PowerPCCPU, MAX_SLB_ENTRIES),
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 6e05643..b784791 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -19,6 +19,7 @@
  */
 #include "cpu.h"
 #include "exec/helper-proto.h"
+#include "qemu/error-report.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
 #include "mmu-hash64.h"
@@ -140,6 +141,8 @@ int ppc_store_slb(PowerPCCPU *cpu, target_ulong slot,
 {
 CPUPPCState *env = &cpu->env;
 ppc_slb_t *slb = &env->slb[slot];
+const struct ppc_one_seg_page_size *sps = NULL;
+int i;
 
 if (slot >= env->slb_nr) {
 return -1; /* Bad slot number */
@@ -154,8 +157,29 @@ int ppc_store_slb(PowerPCCPU *cpu, target_ulong slot,
 return -1; /* 1T segment on MMU that doesn't support it */
 }
 
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const struct ppc_one_seg_page_size *sps1 = &env->sps.sps[i];
+
+if (!sps1->page_shift) {
+break;
+}
+
+if ((vsid & SLB_VSID_LLP_MASK) == sps1->slb_enc) {
+sps = sps1;
+break;
+}
+}
+
+if (!sps) {
+error_report("Bad page size encoding in SLB store: slot "TARGET_FMT_lu
+ " esid 0x"TARGET_FMT_lx" vsid 0x"TARGET_FMT_lx,
+ slot, esid, vsid);
+return -1;
+}
+
 slb->esid = esid;
 slb->vsid = vsid;
+slb->sps = sps;
 
 LOG_SLB("%s: %d " TARGET_FMT_lx " - " TARGET_FMT_lx " => %016" PRIx64
 " %016" PRIx64 "\n", __func__, slot, esid, vsid,
@@ -394,24 +418,6 @@ static hwaddr ppc_hash64_pteg_search(PowerPCCPU *cpu, 
hwaddr hash,
 return -1;
 }
 
-static uint64_t ppc_hash64_page_shift(ppc_slb_t *slb)
-{
-uint64_t epnshift;
-
-/* Page size according to the SLB, which we use to generate the
- * EPN for hash table lookup..  When we implement more recent MMU
- * extensions this might be different from the actual page size
- * encoded in the PTE */
-

[Qemu-devel] [PULL 34/39] target-ppc: Remove unused mmu models from ppc_tlb_invalidate_one

2016-01-28 Thread David Gibson

ppc_tlb_invalidate_one() has a big switch handling many different MMU
types.  However, most of those branches can never be reached:

It is called from 3 places: from remove_hpte() and h_protect() in
spapr_hcall.c (which always has a 64-bit hash MMU type), and from
helper_tlbie() in mmu_helper.c.

Calls to helper_tlbie() are generated from gen_tlbie, gen_tlbiel and
gen_tlbiva.  The first two are only used with the PPC_MEM_TLBIE flag,
set only with 32-bit or 64-bit hash MMU models, and gen_tlbiva() is
used only on 440 and 460 models with the BookE mmu model.

This means the exhaustive list of MMU types which may call
ppc_tlb_invalidate_one() is: POWERPC_MMU_SOFT_6xx, POWERPC_MMU_601,
POWERPC_MMU_32B, POWERPC_MMU_SOFT_74xx, POWERPC_MMU_64B, POWERPC_MMU_2_03,
POWERPC_MMU_2_06, POWERPC_MMU_2_07 and POWERPC_MMU_BOOKE.

Clean up by removing logic for all other MMU types from
ppc_tlb_invalidate_one().

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/mmu_helper.c | 20 ++--
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index c040b17..82ebe5d 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -1971,25 +1971,10 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 ppc6xx_tlb_invalidate_virt(env, addr, 1);
 }
 break;
-case POWERPC_MMU_SOFT_4xx:
-case POWERPC_MMU_SOFT_4xx_Z:
-ppc4xx_tlb_invalidate_virt(env, addr, env->spr[SPR_40x_PID]);
-break;
-case POWERPC_MMU_REAL:
-cpu_abort(CPU(cpu), "No TLB for PowerPC 4xx in real mode\n");
-break;
-case POWERPC_MMU_MPC8xx:
-/* XXX: TODO */
-cpu_abort(CPU(cpu), "MPC8xx MMU model is not implemented\n");
-break;
 case POWERPC_MMU_BOOKE:
 /* XXX: TODO */
 cpu_abort(CPU(cpu), "BookE MMU model is not implemented\n");
 break;
-case POWERPC_MMU_BOOKE206:
-/* XXX: TODO */
-cpu_abort(CPU(cpu), "BookE 2.06 MMU model is not implemented\n");
-break;
 case POWERPC_MMU_32B:
 case POWERPC_MMU_601:
 /* tlbie invalidate TLBs for all segments */
@@ -2031,9 +2016,8 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, 
target_ulong addr)
 break;
 #endif /* defined(TARGET_PPC64) */
 default:
-/* XXX: TODO */
-cpu_abort(CPU(cpu), "Unknown MMU model\n");
-break;
+/* Should never reach here with other MMU models */
+assert(0);
 }
 #else
 ppc_tlb_invalidate_all(env);
-- 
2.5.0

[Qemu-devel] [PULL 24/39] target-ppc: gdbstub: fix spe registers for little-endian guests

2016-01-28 Thread David Gibson

From: Greg Kurz 

Let's reuse the ppc_maybe_bswap_register() helper, like we already do
with the general registers.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 02734a2..83942fe 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8849,6 +8849,7 @@ static int gdb_get_spe_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 if (n < 32) {
 #if defined(TARGET_PPC64)
 stl_p(mem_buf, env->gpr[n] >> 32);
+ppc_maybe_bswap_register(env, mem_buf, 4);
 #else
 stl_p(mem_buf, env->gprh[n]);
 #endif
@@ -8856,10 +8857,12 @@ static int gdb_get_spe_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 }
 if (n == 32) {
 stq_p(mem_buf, env->spe_acc);
+ppc_maybe_bswap_register(env, mem_buf, 8);
 return 8;
 }
 if (n == 33) {
 stl_p(mem_buf, env->spe_fscr);
+ppc_maybe_bswap_register(env, mem_buf, 4);
 return 4;
 }
 return 0;
@@ -8870,7 +8873,11 @@ static int gdb_set_spe_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 if (n < 32) {
 #if defined(TARGET_PPC64)
 target_ulong lo = (uint32_t)env->gpr[n];
-target_ulong hi = (target_ulong)ldl_p(mem_buf) << 32;
+target_ulong hi;
+
+ppc_maybe_bswap_register(env, mem_buf, 4);
+
+hi = (target_ulong)ldl_p(mem_buf) << 32;
 env->gpr[n] = lo | hi;
 #else
 env->gprh[n] = ldl_p(mem_buf);
@@ -8878,10 +8885,12 @@ static int gdb_set_spe_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 return 4;
 }
 if (n == 32) {
+ppc_maybe_bswap_register(env, mem_buf, 8);
 env->spe_acc = ldq_p(mem_buf);
 return 8;
 }
 if (n == 33) {
+ppc_maybe_bswap_register(env, mem_buf, 4);
 env->spe_fscr = ldl_p(mem_buf);
 return 4;
 }
-- 
2.5.0

[Qemu-devel] [PULL 33/39] target-ppc: Use actual page size encodings from HPTE

2016-01-28 Thread David Gibson

At present the 64-bit hash MMU code uses information from the SLB to
determine the page size of a translation.  We do need that information to
correctly look up the hash table.  However the MMU also allows a
possibly larger page size to be encoded into the HPTE itself, which is used
to populate the TLB.  At present qemu doesn't check that, and so doesn't
support the MPSS "Multiple Page Size per Segment" feature.

This makes a start on allowing this, by adding an hpte_page_shift()
function which looks up the page size of an HPTE.  We use this to validate
page size encodings on faults, and populate the qemu TLB with larger
page sizes when appropriate.
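
To illustrate why the actual page size matters when forming the real address, here is a small standalone sketch.  It assumes the real address is built by replacing the low page_shift bits of the RPN field with the low bits of the effective address, which is how the deposit64() calls in this patch are used; ex_deposit_low() and ex_real_addr() are hypothetical helpers written for this example, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low 'length' bits of 'value' with the low bits of 'field'. */
static uint64_t ex_deposit_low(uint64_t value, unsigned length, uint64_t field)
{
    uint64_t mask;

    assert(length > 0 && length < 64);
    mask = (1ULL << length) - 1;
    return (value & ~mask) | (field & mask);
}

/*
 * Combine the RPN bits of an HPTE with the in-page offset of an effective
 * address.  A larger page_shift takes more offset bits from eaddr and drops
 * any low RPN bits that only carried the page size encoding.
 */
static uint64_t ex_real_addr(uint64_t rpn_bits, unsigned page_shift,
                             uint64_t eaddr)
{
    return ex_deposit_low(rpn_bits, page_shift, eaddr);
}
```

With the same RPN bits, a 64kiB page keeps 16 offset bits from the effective address while a 16MiB page keeps 24, so using the segment's base page size for a larger page would yield the wrong real address.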

Signed-off-by: David Gibson 
Acked-by: Benjamin Herrenschmidt 
Reviewed-by: Alexander Graf 
---
 target-ppc/mmu-hash64.c | 63 ++---
 1 file changed, 60 insertions(+), 3 deletions(-)

diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index b784791..ee1e8bf 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -21,6 +21,7 @@
 #include "exec/helper-proto.h"
 #include "qemu/error-report.h"
 #include "sysemu/kvm.h"
+#include "qemu/error-report.h"
 #include "kvm_ppc.h"
 #include "mmu-hash64.h"
 
@@ -474,12 +475,50 @@ static hwaddr ppc_hash64_htab_lookup(PowerPCCPU *cpu,
 return pte_offset;
 }
 
+static unsigned hpte_page_shift(const struct ppc_one_seg_page_size *sps,
+uint64_t pte0, uint64_t pte1)
+{
+int i;
+
+if (!(pte0 & HPTE64_V_LARGE)) {
+if (sps->page_shift != 12) {
+/* 4kiB page in a non 4kiB segment */
+return 0;
+}
+/* Normal 4kiB page */
+return 12;
+}
+
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+const struct ppc_one_page_size *ps = &sps->enc[i];
+uint64_t mask;
+
+if (!ps->page_shift) {
+break;
+}
+
+if (ps->page_shift == 12) {
+/* L bit is set so this can't be a 4kiB page */
+continue;
+}
+
+mask = ((1ULL << ps->page_shift) - 1) & HPTE64_R_RPN;
+
+if ((pte1 & mask) == (ps->pte_enc << HPTE64_R_RPN_SHIFT)) {
+return ps->page_shift;
+}
+}
+
+return 0; /* Bad page size encoding */
+}
+
 int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, target_ulong eaddr,
 int rwx, int mmu_idx)
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
 ppc_slb_t *slb;
+unsigned apshift;
 hwaddr pte_offset;
 ppc_hash_pte64_t pte;
 int pp_prot, amr_prot, prot;
@@ -543,6 +582,18 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, 
target_ulong eaddr,
 qemu_log_mask(CPU_LOG_MMU,
 "found PTE at offset %08" HWADDR_PRIx "\n", pte_offset);
 
+/* Validate page size encoding */
+apshift = hpte_page_shift(slb->sps, pte.pte0, pte.pte1);
+if (!apshift) {
+error_report("Bad page size encoding in HPTE 0x%"PRIx64" - 0x%"PRIx64
+ " @ 0x%"HWADDR_PRIx, pte.pte0, pte.pte1, pte_offset);
+/* Not entirely sure what the right action here, but machine
+ * check seems reasonable */
+cs->exception_index = POWERPC_EXCP_MCHECK;
+env->error_code = 0;
+return 1;
+}
+
 /* 5. Check access permissions */
 
 pp_prot = ppc_hash64_pte_prot(cpu, slb, pte);
@@ -595,10 +646,10 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, 
target_ulong eaddr,
 
 /* 7. Determine the real address from the PTE */
 
-raddr = deposit64(pte.pte1 & HPTE64_R_RPN, 0, slb->sps->page_shift, eaddr);
+raddr = deposit64(pte.pte1 & HPTE64_R_RPN, 0, apshift, eaddr);
 
 tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
- prot, mmu_idx, TARGET_PAGE_SIZE);
+ prot, mmu_idx, 1ULL << apshift);
 
 return 0;
 }
@@ -609,6 +660,7 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 ppc_slb_t *slb;
 hwaddr pte_offset;
 ppc_hash_pte64_t pte;
+unsigned apshift;
 
 if (msr_dr == 0) {
 /* In real mode the top 4 effective address bits are ignored */
@@ -625,7 +677,12 @@ hwaddr ppc_hash64_get_phys_page_debug(PowerPCCPU *cpu, 
target_ulong addr)
 return -1;
 }
 
-return deposit64(pte.pte1 & HPTE64_R_RPN, 0, slb->sps->page_shift, addr)
+apshift = hpte_page_shift(slb->sps, pte.pte0, pte.pte1);
+if (!apshift) {
+return -1;
+}
+
+return deposit64(pte.pte1 & HPTE64_R_RPN, 0, apshift, addr)
 & TARGET_PAGE_MASK;
 }
 
-- 
2.5.0

[Qemu-devel] [PULL 29/39] target-ppc: Remove unused kvmppc_read_segment_page_sizes() stub

2016-01-28 Thread David Gibson

This stub function is in the !KVM ifdef in target-ppc/kvm_ppc.h.  However
no such function exists on the KVM side, or is ever used.

I think this originally referenced a function which read host page size
information from /proc, for which we now use the KVM GET_SMMU_INFO
extension instead.

In any case, it has no function now, so remove it.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Laurent Vivier 
Reviewed-by: Alexander Graf 
---
 target-ppc/kvm_ppc.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 5e1333d..62406ce 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -98,11 +98,6 @@ static inline int kvmppc_get_hypercall(CPUPPCState *env, 
uint8_t *buf, int buf_l
 return -1;
 }
 
-static inline int kvmppc_read_segment_page_sizes(uint32_t *prop, int maxcells)
-{
-return -1;
-}
-
 static inline int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level)
 {
 return -1;
-- 
2.5.0

[Qemu-devel] [PULL 30/39] target-ppc: Convert mmu-hash{32, 64}.[ch] from CPUPPCState to PowerPCCPU

2016-01-28 Thread David Gibson

Like a lot of places these files include a mixture of functions taking
both the older CPUPPCState *env and newer PowerPCCPU *cpu.  Move a step
closer to cleaning this up by standardizing on PowerPCCPU, except for the
helper_* functions which are called with the CPUPPCState * from tcg.

Callers and some related functions are updated as well; the boundaries of
what's changed here are a bit arbitrary.

Signed-off-by: David Gibson 
Reviewed-by: Laurent Vivier 
Reviewed-by: Alexander Graf 
---
 hw/ppc/spapr_hcall.c| 31 ++-
 target-ppc/kvm.c|  2 +-
 target-ppc/mmu-hash32.c | 68 +++--
 target-ppc/mmu-hash32.h | 30 ++-
 target-ppc/mmu-hash64.c | 80 +
 target-ppc/mmu-hash64.h | 21 ++---
 target-ppc/mmu_helper.c | 13 
 7 files changed, 136 insertions(+), 109 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index c4ae255..4707196 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -160,7 +160,7 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 pte_index &= ~7ULL;
 token = ppc_hash64_start_access(cpu, pte_index);
 for (; index < 8; index++) {
-if ((ppc_hash64_load_hpte0(env, token, index) & HPTE64_V_VALID) == 0) {
+if (!(ppc_hash64_load_hpte0(cpu, token, index) & HPTE64_V_VALID)) {
 break;
 }
 }
@@ -170,14 +170,14 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 }
 } else {
 token = ppc_hash64_start_access(cpu, pte_index);
-if (ppc_hash64_load_hpte0(env, token, 0) & HPTE64_V_VALID) {
+if (ppc_hash64_load_hpte0(cpu, token, 0) & HPTE64_V_VALID) {
 ppc_hash64_stop_access(token);
 return H_PTEG_FULL;
 }
 ppc_hash64_stop_access(token);
 }
 
-ppc_hash64_store_hpte(env, pte_index + index,
+ppc_hash64_store_hpte(cpu, pte_index + index,
   pteh | HPTE64_V_HPTE_DIRTY, ptel);
 
 args[0] = pte_index + index;
@@ -191,11 +191,12 @@ typedef enum {
 REMOVE_HW = 3,
 } RemoveResult;
 
-static RemoveResult remove_hpte(CPUPPCState *env, target_ulong ptex,
+static RemoveResult remove_hpte(PowerPCCPU *cpu, target_ulong ptex,
 target_ulong avpn,
 target_ulong flags,
 target_ulong *vp, target_ulong *rp)
 {
+CPUPPCState *env = &cpu->env;
 uint64_t token;
 target_ulong v, r, rb;
 
@@ -203,9 +204,9 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
target_ulong ptex,
 return REMOVE_PARM;
 }
 
-token = ppc_hash64_start_access(ppc_env_get_cpu(env), ptex);
-v = ppc_hash64_load_hpte0(env, token, 0);
-r = ppc_hash64_load_hpte1(env, token, 0);
+token = ppc_hash64_start_access(cpu, ptex);
+v = ppc_hash64_load_hpte0(cpu, token, 0);
+r = ppc_hash64_load_hpte1(cpu, token, 0);
 ppc_hash64_stop_access(token);
 
 if ((v & HPTE64_V_VALID) == 0 ||
@@ -215,7 +216,7 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
target_ulong ptex,
 }
 *vp = v;
 *rp = r;
-ppc_hash64_store_hpte(env, ptex, HPTE64_V_HPTE_DIRTY, 0);
+ppc_hash64_store_hpte(cpu, ptex, HPTE64_V_HPTE_DIRTY, 0);
 rb = compute_tlbie_rb(v, r, ptex);
 ppc_tlb_invalidate_one(env, rb);
 return REMOVE_SUCCESS;
@@ -224,13 +225,12 @@ static RemoveResult remove_hpte(CPUPPCState *env, 
target_ulong ptex,
 static target_ulong h_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
  target_ulong opcode, target_ulong *args)
 {
-CPUPPCState *env = &cpu->env;
 target_ulong flags = args[0];
 target_ulong pte_index = args[1];
 target_ulong avpn = args[2];
 RemoveResult ret;
 
-ret = remove_hpte(env, pte_index, avpn, flags,
+ret = remove_hpte(cpu, pte_index, avpn, flags,
   &args[0], &args[1]);
 
 switch (ret) {
@@ -271,7 +271,6 @@ static target_ulong h_remove(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 static target_ulong h_bulk_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
   target_ulong opcode, target_ulong *args)
 {
-CPUPPCState *env = &cpu->env;
 int i;
 
 for (i = 0; i < H_BULK_REMOVE_MAX_BATCH; i++) {
@@ -293,7 +292,7 @@ static target_ulong h_bulk_remove(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 return H_PARAMETER;
 }
 
-ret = remove_hpte(env, *tsh & H_BULK_REMOVE_PTEX, tsl,
+ret = remove_hpte(cpu, *tsh & H_BULK_REMOVE_PTEX, tsl,
   (*tsh & H_BULK_REMOVE_FLAGS) >> 26,
   &v, &r);
 
@@ -330,8 +329,8 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 }
 
 token = ppc_hash64_start_access(cpu, pte_index);
-v = ppc_hash64_load_hpte0(env, token, 0);
-

[Qemu-devel] [PULL 28/39] uninorth.c: add support for UniNorth kMacRISCPCIAddressSelect (0x48) register

2016-01-28 Thread David Gibson

From: Programmingkid 

Darwin/OS X use the undocumented kMacRISCPCIAddressSelect (0x48) to
configure PCI memory space size for mac99 machines. Without this
register, warnings similar to below are emitted to the console during boot:

AppleMacRiscPCI: bad range 2(8000:0100)
AppleMacRiscPCI: bad range 2(8100:1000)
AppleMacRiscPCI: bad range 2(8108:0008)

Based upon the algorithm in Darwin's AppleMacRiscPCI.cpp driver, set the
kMacRISCPCIAddressSelect register so that Darwin considers the PCI
memory space to be at 0x8000 (size 0x1000) which matches that
currently used by QEMU and OpenBIOS.

Signed-off-by: John Arbuckle 
Tested-by: Mark Cave-Ayland 
[commit message and comment revised as suggested by Mark Cave-Ayland]
Signed-off-by: David Gibson 
---
 hw/pci-host/uninorth.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/pci-host/uninorth.c b/hw/pci-host/uninorth.c
index 215b64f..d4aff84 100644
--- a/hw/pci-host/uninorth.c
+++ b/hw/pci-host/uninorth.c
@@ -330,6 +330,15 @@ static void unin_agp_pci_host_realize(PCIDevice *d, Error 
**errp)
 d->config[0x0C] = 0x08; // cache_line_size
 d->config[0x0D] = 0x10; // latency_timer
 //d->config[0x34] = 0x80; // capabilities_pointer
+/*
+ * Set kMacRISCPCIAddressSelect (0x48) register to indicate PCI
+ * memory space with base 0x8000, size 0x1000 for Apple's
+ * AppleMacRiscPCI driver
+ */
+d->config[0x48] = 0x0;
+d->config[0x49] = 0x0;
+d->config[0x4a] = 0x0;
+d->config[0x4b] = 0x1;
 }
 
 static void u3_agp_pci_host_realize(PCIDevice *d, Error **errp)
-- 
2.5.0

[Qemu-devel] [PULL 20/39] target-ppc: rename and export maybe_bswap_register()

2016-01-28 Thread David Gibson

From: Greg Kurz 

This helper will be used to support FP, Altivec and VSX registers when
the guest is little-endian.
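
As a rough illustration of what such a helper does, the sketch below reverses a register buffer in place only when the guest is little-endian.  It is an assumption-laden stand-in, not the exported function: the guest_is_le flag replaces the MSR check, a generic byte reversal replaces the per-width swaps, and all ex_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse 'len' bytes in place when the guest runs little-endian. */
static void ex_maybe_bswap(int guest_is_le, uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (!guest_is_le) {
        return;  /* big-endian guest: buffer is already in wire order */
    }
    for (size_t i = 0; i < len / 2; i++) {
        uint8_t tmp = buf[i];

        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = tmp;
    }
}

/* Store a 32-bit value big-endian, standing in for a target-order store. */
static void ex_store_be32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}

/* First byte the debugger would see for a 32-bit register. */
static uint8_t ex_first_wire_byte(uint32_t reg, int guest_is_le)
{
    uint8_t buf[4];

    ex_store_be32(buf, reg);
    ex_maybe_bswap(guest_is_le, buf, sizeof(buf));
    return buf[0];
}
```

On the read path the swap follows the store into the buffer; on the write path it has to come before the load, which is the ordering the later gdbstub patches in this series use.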

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/cpu.h |  1 +
 target-ppc/gdbstub.c | 10 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index b3b89e6..2bc96b4 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -2355,4 +2355,5 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
  */
 PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
 
+void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
 #endif /* !defined (__CPU_PPC_H__) */
diff --git a/target-ppc/gdbstub.c b/target-ppc/gdbstub.c
index 14675f4..b20bb0c 100644
--- a/target-ppc/gdbstub.c
+++ b/target-ppc/gdbstub.c
@@ -88,7 +88,7 @@ static int ppc_gdb_register_len(int n)
the proper ordering for the binary, and cannot be changed.
For system mode, TARGET_WORDS_BIGENDIAN is always set, and we must check
the current mode of the chip to see if we're running in little-endian.  */
-static void maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len)
+void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len)
 {
 #ifndef CONFIG_USER_ONLY
 if (!msr_le) {
@@ -158,7 +158,7 @@ int ppc_cpu_gdb_read_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 break;
 }
 }
-maybe_bswap_register(env, mem_buf, r);
+ppc_maybe_bswap_register(env, mem_buf, r);
 return r;
 }
 
@@ -214,7 +214,7 @@ int ppc_cpu_gdb_read_register_apple(CPUState *cs, uint8_t 
*mem_buf, int n)
 break;
 }
 }
-maybe_bswap_register(env, mem_buf, r);
+ppc_maybe_bswap_register(env, mem_buf, r);
 return r;
 }
 
@@ -227,7 +227,7 @@ int ppc_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 if (!r) {
 return r;
 }
-maybe_bswap_register(env, mem_buf, r);
+ppc_maybe_bswap_register(env, mem_buf, r);
 if (n < 32) {
 /* gprs */
 env->gpr[n] = ldtul_p(mem_buf);
@@ -277,7 +277,7 @@ int ppc_cpu_gdb_write_register_apple(CPUState *cs, uint8_t 
*mem_buf, int n)
 if (!r) {
 return r;
 }
-maybe_bswap_register(env, mem_buf, r);
+ppc_maybe_bswap_register(env, mem_buf, r);
 if (n < 32) {
 /* gprs */
 env->gpr[n] = ldq_p(mem_buf);
-- 
2.5.0

[Qemu-devel] [PULL 21/39] target-ppc: gdbstub: fix float registers for little-endian guests

2016-01-28 Thread David Gibson

From: Greg Kurz 

Let's reuse the ppc_maybe_bswap_register() helper, like we already do
with the general registers.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 4c61525..26b9b67 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8755,10 +8755,12 @@ static int gdb_get_float_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 {
 if (n < 32) {
 stfq_p(mem_buf, env->fpr[n]);
+ppc_maybe_bswap_register(env, mem_buf, 8);
 return 8;
 }
 if (n == 32) {
 stl_p(mem_buf, env->fpscr);
+ppc_maybe_bswap_register(env, mem_buf, 4);
 return 4;
 }
 return 0;
@@ -8767,10 +8769,12 @@ static int gdb_get_float_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 static int gdb_set_float_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 {
 if (n < 32) {
+ppc_maybe_bswap_register(env, mem_buf, 8);
 env->fpr[n] = ldfq_p(mem_buf);
 return 8;
 }
 if (n == 32) {
+ppc_maybe_bswap_register(env, mem_buf, 4);
 helper_store_fpscr(env, ldl_p(mem_buf), 0xffffffff);
 return 4;
 }
-- 
2.5.0

[Qemu-devel] [PULL 26/39] pseries: Allow TCG h_enter to work with hotplugged memory

2016-01-28 Thread David Gibson

The implementation of the H_ENTER hypercall for PAPR guests needs to
enforce correct access attributes on the inserted HPTE.  This means
determining if the HPTE's real address is a regular RAM address (which
requires attributes for coherent access) or an IO address (which requires
attributes for cache-inhibited access).

At the moment this check is implemented with (raddr < machine->ram_size),
but that only handles addresses in the base RAM area, not any hotplugged
RAM.

This patch corrects the problem with a new helper.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_hcall.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index e9c057d..c4ae255 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -84,10 +84,25 @@ static inline bool valid_pte_index(CPUPPCState *env, 
target_ulong pte_index)
 return true;
 }
 
+static bool is_ram_address(sPAPRMachineState *spapr, hwaddr addr)
+{
+MachineState *machine = MACHINE(spapr);
+MemoryHotplugState *hpms = &spapr->hotplug_memory;
+
+if (addr < machine->ram_size) {
+return true;
+}
+if ((addr >= hpms->base)
+&& ((addr - hpms->base) < memory_region_size(&hpms->mr))) {
+return true;
+}
+
+return false;
+}
+
 static target_ulong h_enter(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 target_ulong opcode, target_ulong *args)
 {
-MachineState *machine = MACHINE(spapr);
 CPUPPCState *env = &cpu->env;
 target_ulong flags = args[0];
 target_ulong pte_index = args[1];
@@ -119,7 +134,7 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 
 raddr = (ptel & HPTE64_R_RPN) & ~((1ULL << page_shift) - 1);
 
-if (raddr < machine->ram_size) {
+if (is_ram_address(spapr, raddr)) {
 /* Regular RAM - should have WIMG=0010 */
 if ((ptel & HPTE64_R_WIMG) != HPTE64_R_M) {
 return H_PARAMETER;
-- 
2.5.0

[Qemu-devel] [PULL 13/39] pseries: Clean up error handling in spapr_validate_node_memory()

2016-01-28 Thread David Gibson

Use error_setg() and return an error, rather than using an explicit exit().

Also improve messages, and be more explicit about which constraint failed.

Signed-off-by: David Gibson 
Reviewed-by: Bharata B Rao 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 447fa5d..b20b109 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1698,27 +1698,34 @@ static void 
spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
  * to SPAPR_MEMORY_BLOCK_SIZE(256MB), then refuse to start the guest
  * since we can't support such unaligned sizes with DRCONF_MEMORY.
  */
-static void spapr_validate_node_memory(MachineState *machine)
+static void spapr_validate_node_memory(MachineState *machine, Error **errp)
 {
 int i;
 
-if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE ||
-machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
-error_report("Can't support memory configuration where RAM size "
- "0x" RAM_ADDR_FMT " or maxmem size "
- "0x" RAM_ADDR_FMT " isn't aligned to %llu MB",
- machine->ram_size, machine->maxram_size,
- SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
-exit(EXIT_FAILURE);
+if (machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
+error_setg(errp, "Memory size 0x" RAM_ADDR_FMT
+   " is not aligned to %llu MiB",
+   machine->ram_size,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
+}
+
+if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE) {
+error_setg(errp, "Maximum memory size 0x" RAM_ADDR_FMT
+   " is not aligned to %llu MiB",
+   machine->maxram_size,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
 }
 
 for (i = 0; i < nb_numa_nodes; i++) {
 if (numa_info[i].node_mem % SPAPR_MEMORY_BLOCK_SIZE) {
-error_report("Can't support memory configuration where memory size"
- " %" PRIx64 " of node %d isn't aligned to %llu MB",
- numa_info[i].node_mem, i,
- SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
-exit(EXIT_FAILURE);
+error_setg(errp,
+   "Node %d memory size 0x%" PRIx64
+   " is not aligned to %llu MiB",
+   i, numa_info[i].node_mem,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
 }
 }
 }
@@ -1808,7 +1815,7 @@ static void ppc_spapr_init(MachineState *machine)
   XICS_IRQS);
 
 if (smc->dr_lmb_enabled) {
-spapr_validate_node_memory(machine);
+spapr_validate_node_memory(machine, &error_fatal);
 }
 
 /* init CPUs */
-- 
2.5.0

[Qemu-devel] [PULL 12/39] pseries: Clean up error handling of spapr_cpu_init()

2016-01-28 Thread David Gibson

Currently spapr_cpu_init() is hardcoded to handle any errors as fatal.
That works for now, since it's only called from initial setup where an
error here means we really can't proceed.

However, we'll want to handle this more flexibly for cpu hotplug in future
so generalize this using the error reporting infrastructure.  While we're
at it make a small cleanup in a related part of ppc_spapr_init() to use
error_report() instead of an old-style explicit fprintf().

Signed-off-by: David Gibson 
Reviewed-by: Bharata B Rao 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 672815f..447fa5d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1624,7 +1624,8 @@ static void spapr_boot_set(void *opaque, const char 
*boot_device,
 machine->boot_order = g_strdup(boot_device);
 }
 
-static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
+static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu,
+   Error **errp)
 {
 CPUPPCState *env = &cpu->env;
 
@@ -1642,7 +1643,13 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, 
PowerPCCPU *cpu)
 }
 
 if (cpu->max_compat) {
-ppc_set_compat(cpu, cpu->max_compat, &error_fatal);
+Error *local_err = NULL;
+
+ppc_set_compat(cpu, cpu->max_compat, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 }
 
 xics_cpu_setup(spapr->icp, cpu);
@@ -1811,10 +1818,10 @@ static void ppc_spapr_init(MachineState *machine)
 for (i = 0; i < smp_cpus; i++) {
 cpu = cpu_ppc_init(machine->cpu_model);
 if (cpu == NULL) {
-fprintf(stderr, "Unable to find PowerPC CPU definition\n");
+error_report("Unable to find PowerPC CPU definition");
 exit(1);
 }
-spapr_cpu_init(spapr, cpu);
+spapr_cpu_init(spapr, cpu, &error_fatal);
 }
 
 if (kvm_enabled()) {
-- 
2.5.0

[Qemu-devel] [PULL 25/39] target-ppc: gdbstub: Add VSX support

2016-01-28 Thread David Gibson

From: Anton Blanchard 

Add the XML and functions to get and set VSX registers.

Signed-off-by: Anton Blanchard 
(fixed little-endian guests)
Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 configure   |  6 +++---
 gdb-xml/power-vsx.xml   | 44 
 target-ppc/translate_init.c | 24 
 3 files changed, 71 insertions(+), 3 deletions(-)
 create mode 100644 gdb-xml/power-vsx.xml

diff --git a/configure b/configure
index 3506e44..297bfc7 100755
--- a/configure
+++ b/configure
@@ -5702,20 +5702,20 @@ case "$target_name" in
   ppc64)
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
-gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
+gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
   ppc64le)
 TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
-gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
+gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
   ppc64abi32)
 TARGET_ARCH=ppc64
 TARGET_BASE_ARCH=ppc
 TARGET_ABI_DIR=ppc
 echo "TARGET_ABI32=y" >> $config_target_mak
-gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml"
+gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
   ;;
   sh4|sh4eb)
 TARGET_ARCH=sh4
diff --git a/gdb-xml/power-vsx.xml b/gdb-xml/power-vsx.xml
new file mode 100644
index 000..fd290e9
--- /dev/null
+++ b/gdb-xml/power-vsx.xml
@@ -0,0 +1,44 @@
+
+
+
+
+
+
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 83942fe..156d156 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8897,6 +8897,26 @@ static int gdb_set_spe_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 return 0;
 }
 
+static int gdb_get_vsx_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
+{
+if (n < 32) {
+stq_p(mem_buf, env->vsr[n]);
+ppc_maybe_bswap_register(env, mem_buf, 8);
+return 8;
+}
+return 0;
+}
+
+static int gdb_set_vsx_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
+{
+if (n < 32) {
+ppc_maybe_bswap_register(env, mem_buf, 8);
+env->vsr[n] = ldq_p(mem_buf);
+return 8;
+}
+return 0;
+}
+
 static int ppc_fixup_cpu(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
@@ -9002,6 +9022,10 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
 gdb_register_coprocessor(cs, gdb_get_spe_reg, gdb_set_spe_reg,
  34, "power-spe.xml", 0);
 }
+if (pcc->insns_flags2 & PPC2_VSX) {
+gdb_register_coprocessor(cs, gdb_get_vsx_reg, gdb_set_vsx_reg,
+ 32, "power-vsx.xml", 0);
+}
 
 qemu_init_vcpu(cs);
 
-- 
2.5.0

[Qemu-devel] [PULL 19/39] target-ppc: kvm: fix floating point registers sync on little-endian hosts

2016-01-28 Thread David Gibson

From: Greg Kurz 

On VSX-capable CPUs, the 32 FP registers are mapped to the high bits
of the first 32 VSX registers. So if you have:

VSR31 = (uint128) 0x0102030405060708090a0b0c0d0e0f00

then

FPR31 = (uint64) 0x0102030405060708

The kernel stores the VSX registers in the fp_state struct using the
host-endian element ordering.

On big-endian:

fp_state.fpr[31][0] = 0x0102030405060708
fp_state.fpr[31][1] = 0x090a0b0c0d0e0f00

On little-endian:

fp_state.fpr[31][0] = 0x090a0b0c0d0e0f00
fp_state.fpr[31][1] = 0x0102030405060708

The KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls preserve this ordering, but
QEMU treats it as big-endian and always copies element [0] to the
fpr[] array and element [1] to the vsr[] array. This does not work on
little-endian hosts, where you will get:

(qemu) p $f31
0x90a0b0c0d0e0f00

instead of:

(qemu) p $f31
0x102030405060708

This patch fixes the element ordering for little-endian hosts.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/kvm.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 9940a90..4524999 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -650,8 +650,13 @@ static int kvm_put_fp(CPUState *cs)
 for (i = 0; i < 32; i++) {
 uint64_t vsr[2];
 
+#ifdef HOST_WORDS_BIGENDIAN
 vsr[0] = float64_val(env->fpr[i]);
 vsr[1] = env->vsr[i];
+#else
+vsr[0] = env->vsr[i];
+vsr[1] = float64_val(env->fpr[i]);
+#endif
 reg.addr = (uintptr_t) &vsr;
 reg.id = vsx ? KVM_REG_PPC_VSR(i) : KVM_REG_PPC_FPR(i);
 
@@ -721,10 +726,17 @@ static int kvm_get_fp(CPUState *cs)
 vsx ? "VSR" : "FPR", i, strerror(errno));
 return ret;
 } else {
+#ifdef HOST_WORDS_BIGENDIAN
 env->fpr[i] = vsr[0];
 if (vsx) {
 env->vsr[i] = vsr[1];
 }
+#else
+env->fpr[i] = vsr[1];
+if (vsx) {
+env->vsr[i] = vsr[0];
+}
+#endif
 }
 }
 }
-- 
2.5.0

[Qemu-devel] [PULL 23/39] target-ppc: gdbstub: fix altivec registers for little-endian guests

2016-01-28 Thread David Gibson

From: Greg Kurz 

Altivec registers are 128 bits wide. They are stored in memory as two
64-bit values that must be byteswapped when the guest is little-endian.
Let's reuse the ppc_maybe_bswap_register() helper for this.

We also need to fix the ordering of the 64-bit elements according to
the target endianness, for both system and user mode.
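
The two adjustments can be sketched separately from QEMU. This is an illustration under stated assumptions: need_half_swap() mirrors the truth table of the patched avr_need_swap() but takes host and guest endianness as explicit parameters, and reverse_bytes() is a hypothetical stand-in for the per-element byteswap that ppc_maybe_bswap_register() is expected to apply for a little-endian guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Whether the two 64-bit halves must be exchanged. Same truth table as the
 * patched avr_need_swap(): msr_le on big-endian hosts, !msr_le otherwise. */
static bool need_half_swap(bool host_big_endian, bool guest_little_endian)
{
    return host_big_endian ? guest_little_endian : !guest_little_endian;
}

/* Reverse len bytes in place. */
static void reverse_bytes(uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len / 2; i++) {
        uint8_t tmp = buf[i];

        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = tmp;
    }
}

/* Byteswap each 64-bit element of a 16-byte register image separately,
 * which is what two 8-byte swaps on mem_buf and mem_buf + 8 amount to. */
static void swap_elements_for_le_guest(uint8_t reg[16])
{
    reverse_bytes(reg, 8);
    reverse_bytes(reg + 8, 8);
}
```

Note that the bytes are reversed within each 64-bit element, not across the whole 128-bit value; the order of the two elements is handled by the half swap.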

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 41308c3..02734a2 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8754,9 +8754,9 @@ static void dump_ppc_insns (CPUPPCState *env)
 static bool avr_need_swap(CPUPPCState *env)
 {
 #ifdef HOST_WORDS_BIGENDIAN
-return false;
+return msr_le;
 #else
-return true;
+return !msr_le;
 #endif
 }
 
@@ -8800,14 +8800,18 @@ static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 stq_p(mem_buf, env->avr[n].u64[1]);
 stq_p(mem_buf+8, env->avr[n].u64[0]);
 }
+ppc_maybe_bswap_register(env, mem_buf, 8);
+ppc_maybe_bswap_register(env, mem_buf + 8, 8);
 return 16;
 }
 if (n == 32) {
 stl_p(mem_buf, env->vscr);
+ppc_maybe_bswap_register(env, mem_buf, 4);
 return 4;
 }
 if (n == 33) {
 stl_p(mem_buf, (uint32_t)env->spr[SPR_VRSAVE]);
+ppc_maybe_bswap_register(env, mem_buf, 4);
 return 4;
 }
 return 0;
@@ -8816,6 +8820,8 @@ static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 {
 if (n < 32) {
+ppc_maybe_bswap_register(env, mem_buf, 8);
+ppc_maybe_bswap_register(env, mem_buf + 8, 8);
 if (!avr_need_swap(env)) {
 env->avr[n].u64[0] = ldq_p(mem_buf);
 env->avr[n].u64[1] = ldq_p(mem_buf+8);
@@ -8826,10 +8832,12 @@ static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 return 16;
 }
 if (n == 32) {
+ppc_maybe_bswap_register(env, mem_buf, 4);
 env->vscr = ldl_p(mem_buf);
 return 4;
 }
 if (n == 33) {
+ppc_maybe_bswap_register(env, mem_buf, 4);
 env->spr[SPR_VRSAVE] = (target_ulong)ldl_p(mem_buf);
 return 4;
 }
-- 
2.5.0

[Qemu-devel] [PULL 11/39] ppc: Clean up error handling in ppc_set_compat()

2016-01-28 Thread David Gibson

Currently, ppc_set_compat() returns -1 for errors, and also (unconditionally)
reports an error message.  The caller in h_client_architecture_support()
may then report it again using an outdated fprintf().

Clean this up by using the modern error reporting mechanisms.  Also add
strerror(errno) to the error message.
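
The calling convention this moves to can be sketched with a toy error object. Everything below is a hypothetical stand-in (DemoError, demo_error_set_errno, demo_backend_set_mode, demo_set_mode); it is not QEMU's Error API, only an illustration of a void function that reports failure through an out-parameter and appends strerror() text to the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store "<msg>: <strerror(os_errno)>" in *errp if the caller asked for it. */
static void demo_error_set_errno(DemoError **errp, int os_errno,
                                 const char *msg)
{
    assert(msg != NULL);
    if (errp == NULL) {
        return; /* caller chose to ignore errors */
    }
    *errp = malloc(sizeof(**errp));
    if (*errp != NULL) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s: %s",
                 msg, strerror(os_errno));
    }
}

/* Hypothetical backend call: returns 0 or a negative errno value. */
static int demo_backend_set_mode(int mode)
{
    return mode == 0 ? -EINVAL : 0;
}

/* A void function that reports failure only through errp. */
static void demo_set_mode(int mode, DemoError **errp)
{
    int ret = demo_backend_set_mode(mode);

    if (ret < 0) {
        demo_error_set_errno(errp, -ret, "Unable to set mode");
    }
}
```

The caller decides the policy: it can pass NULL to ignore the error, or pass the address of a NULL pointer and report the message exactly once, which avoids the double reporting described above.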

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c  |  4 +---
 hw/ppc/spapr_hcall.c| 10 +-
 target-ppc/cpu.h|  2 +-
 target-ppc/translate_init.c | 13 +++--
 4 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 86e5023..672815f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1642,9 +1642,7 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
 }
 
 if (cpu->max_compat) {
-if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
-exit(1);
-}
+ppc_set_compat(cpu, cpu->max_compat, &error_fatal);
 }
 
 xics_cpu_setup(spapr->icp, cpu);
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 9dbdba9..e9c057d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -837,7 +837,7 @@ static target_ulong cas_get_option_vector(int vector, target_ulong table)
 typedef struct {
 PowerPCCPU *cpu;
 uint32_t cpu_version;
-int ret;
+Error *err;
 } SetCompatState;
 
 static void do_set_compat(void *arg)
@@ -845,7 +845,7 @@ static void do_set_compat(void *arg)
 SetCompatState *s = arg;
 
 cpu_synchronize_state(CPU(s->cpu));
-s->ret = ppc_set_compat(s->cpu, s->cpu_version);
+ppc_set_compat(s->cpu, s->cpu_version, &s->err);
 }
 
 #define get_compat_level(cpuver) ( \
@@ -930,13 +930,13 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 SetCompatState s = {
 .cpu = POWERPC_CPU(cs),
 .cpu_version = cpu_version,
-.ret = 0
+.err = NULL,
 };
 
 run_on_cpu(cs, do_set_compat, &s);
 
-if (s.ret < 0) {
-fprintf(stderr, "Unable to set compatibility mode\n");
+if (s.err) {
+error_report_err(s.err);
 return H_HARDWARE;
 }
 }
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 9706000..b3b89e6 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1210,7 +1210,7 @@ void ppc_store_msr (CPUPPCState *env, target_ulong value);
 
 void ppc_cpu_list (FILE *f, fprintf_function cpu_fprintf);
 int ppc_get_compat_smt_threads(PowerPCCPU *cpu);
-int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version);
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp);
 
 /* Time-base and decrementer management */
 #ifndef NO_CPU_IO_DEFS
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index d7e1a4e..4c61525 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9186,7 +9186,7 @@ int ppc_get_compat_smt_threads(PowerPCCPU *cpu)
 return ret;
 }
 
-int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp)
 {
 int ret = 0;
 CPUPPCState *env = &cpu->env;
@@ -9208,12 +9208,13 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
 break;
 }
 
-if (kvm_enabled() && kvmppc_set_compat(cpu, cpu->cpu_version) < 0) {
-error_report("Unable to set compatibility mode in KVM");
-ret = -1;
+if (kvm_enabled()) {
+ret = kvmppc_set_compat(cpu, cpu->cpu_version);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Unable to set CPU compatibility mode in KVM");
+}
 }
-
-return ret;
 }
 
 static gint ppc_cpu_compare_class_pvr(gconstpointer a, gconstpointer b)
-- 
2.5.0

[Qemu-devel] [PULL 09/39] spapr: Remove abuse of rtas_ld() in h_client_architecture_support

2016-01-28 Thread David Gibson

h_client_architecture_support() uses rtas_ld() for general purpose memory
access, despite the fact that it's not an RTAS routine at all and rtas_ld
makes things more awkward.

Clean this up by replacing rtas_ld() calls with appropriate ldXX_phys()
calls.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_hcall.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cebceea..9dbdba9 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -861,7 +861,8 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
   target_ulong opcode,
   target_ulong *args)
 {
-target_ulong list = args[0], ov_table;
+target_ulong list = ppc64_phys_to_real(args[0]);
+target_ulong ov_table, ov5;
 PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
 CPUState *cs;
 bool cpu_match = false, cpu_update = true, memory_update = false;
@@ -875,9 +876,9 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 for (counter = 0; counter < 512; ++counter) {
 uint32_t pvr, pvr_mask;
 
-pvr_mask = rtas_ld(list, 0);
+pvr_mask = ldl_be_phys(&address_space_memory, list);
 list += 4;
-pvr = rtas_ld(list, 0);
+pvr = ldl_be_phys(&address_space_memory, list);
 list += 4;
 
 trace_spapr_cas_pvr_try(pvr);
@@ -948,14 +949,13 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 /* For the future use: here @ov_table points to the first option vector */
 ov_table = list;
 
-list = cas_get_option_vector(5, ov_table);
-if (!list) {
+ov5 = cas_get_option_vector(5, ov_table);
+if (!ov5) {
 return H_SUCCESS;
 }
 
 /* @list now points to OV 5 */
-list += 2;
-ov5_byte2 = rtas_ld(list, 0) >> 24;
+ov5_byte2 = ldub_phys(&address_space_memory, ov5 + 2);
 if (ov5_byte2 & OV5_DRCONF_MEMORY) {
 memory_update = true;
 }
-- 
2.5.0

[Qemu-devel] [PULL 27/39] cuda.c: return error for unknown commands

2016-01-28 Thread David Gibson

From: Alyssa Milburn 

This avoids MacsBug hanging at startup in the absence of ADB mouse
input, by replying with an error (which is also what MOL does) when
it sends an unknown command (0x1c).
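
The reply layout can be sketched as a small builder. This is an illustration only: the numeric packet type values below are assumptions chosen for the sketch (the patch uses the symbolic ERROR_PACKET and CUDA_PACKET constants from cuda.c), and build_unknown_cmd_reply() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed packet type values, for illustration only. */
enum {
    DEMO_CUDA_PACKET = 1,
    DEMO_ERROR_PACKET = 2
};

/* Fill a 4-byte error reply for an unrecognized command byte and return
 * the reply length. */
static size_t build_unknown_cmd_reply(uint8_t obuf[4], uint8_t cmd)
{
    assert(obuf != NULL);
    obuf[0] = DEMO_ERROR_PACKET; /* packet type: error */
    obuf[1] = 0x2;               /* error code used by the patch */
    obuf[2] = DEMO_CUDA_PACKET;  /* type of the offending packet */
    obuf[3] = cmd;               /* the command that was not understood */
    return 4;
}
```

For the 0x1c command mentioned above, the reply echoes 0x1c in the last byte, so the guest can tell which request was rejected.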

Signed-off-by: Alyssa Milburn 
Signed-off-by: David Gibson 
---
 hw/misc/macio/cuda.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 3556852..5e4d5d5 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -605,6 +605,11 @@ static void cuda_receive_packet(CUDAState *s,
 }
 break;
 default:
+obuf[0] = ERROR_PACKET;
+obuf[1] = 0x2;
+obuf[2] = CUDA_PACKET;
+obuf[3] = data[0];
+cuda_send_packet_to_host(s, obuf, 4);
 break;
 }
 }
-- 
2.5.0

[Qemu-devel] [PULL 08/39] spapr: Remove rtas_st_buffer_direct()

2016-01-28 Thread David Gibson

rtas_st_buffer_direct() is a not particularly useful wrapper around
cpu_physical_memory_write().  All the callers are in
rtas_ibm_configure_connector, where it's better handled by a local helper.
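
A minimal sketch of the clamping such a local helper performs, with the work area modelled as a plain byte array. The name work_area_store() is hypothetical, and the guard for offsets at or past the end of the work area is an addition made for this sketch; it is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_WA_LEN 4096 /* same size as CC_WA_LEN in the patch */

/* Copy up to len bytes to wa + offset without running past the end of the
 * work area. Returns the number of bytes actually written. */
static size_t work_area_store(uint8_t *wa, size_t offset,
                              const void *buf, size_t len)
{
    size_t room;

    assert(wa != NULL && buf != NULL);
    if (offset >= DEMO_WA_LEN) {
        return 0; /* extra guard, not in the patch */
    }
    room = DEMO_WA_LEN - offset;
    if (len > room) {
        len = room; /* same effect as MIN(len, CC_WA_LEN - offset) */
    }
    memcpy(wa + offset, buf, len);
    return len;
}
```

Taking the base address and offset as separate arguments lets the helper compute the remaining room itself, so callers no longer repeat the subtraction at every call site.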

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_rtas.c| 17 ++---
 include/hw/ppc/spapr.h |  8 
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 8b702b5..eac1556 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -505,6 +505,13 @@ out:
 #define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
 #define CC_WA_LEN 4096
 
+static void configure_connector_st(target_ulong addr, target_ulong offset,
+   const void *buf, size_t len)
+{
+cpu_physical_memory_write(ppc64_phys_to_real(addr + offset),
+  buf, MIN(len, CC_WA_LEN - offset));
+}
+
 static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
  sPAPRMachineState *spapr,
  uint32_t token, uint32_t nargs,
@@ -570,8 +577,7 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
 /* provide the name of the next OF node */
 wa_offset = CC_VAL_DATA_OFFSET;
 rtas_st(wa_addr, CC_IDX_NODE_NAME_OFFSET, wa_offset);
-rtas_st_buffer_direct(wa_addr + wa_offset, CC_WA_LEN - wa_offset,
-  (uint8_t *)name, strlen(name) + 1);
+configure_connector_st(wa_addr, wa_offset, name, strlen(name) + 1);
 resp = SPAPR_DR_CC_RESPONSE_NEXT_CHILD;
 break;
 case FDT_END_NODE:
@@ -596,8 +602,7 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
 /* provide the name of the next OF property */
 wa_offset = CC_VAL_DATA_OFFSET;
 rtas_st(wa_addr, CC_IDX_PROP_NAME_OFFSET, wa_offset);
-rtas_st_buffer_direct(wa_addr + wa_offset, CC_WA_LEN - wa_offset,
-  (uint8_t *)name, strlen(name) + 1);
+configure_connector_st(wa_addr, wa_offset, name, strlen(name) + 1);
 
 /* provide the length and value of the OF property. data gets
  * placed immediately after NULL terminator of the OF property's
@@ -606,9 +611,7 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
 wa_offset += strlen(name) + 1,
 rtas_st(wa_addr, CC_IDX_PROP_LEN, prop_len);
 rtas_st(wa_addr, CC_IDX_PROP_DATA_OFFSET, wa_offset);
-rtas_st_buffer_direct(wa_addr + wa_offset, CC_WA_LEN - wa_offset,
-  (uint8_t *)((struct fdt_property *)prop)->data,
-  prop_len);
+configure_connector_st(wa_addr, wa_offset, prop->data, prop_len);
 resp = SPAPR_DR_CC_RESPONSE_NEXT_PROPERTY;
 break;
 case FDT_END:
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 1e10fc9..1f9e722 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -506,14 +506,6 @@ static inline void rtas_st(target_ulong phys, int n, uint32_t val)
 stl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4*n), val);
 }
 
-static inline void rtas_st_buffer_direct(target_ulong phys,
- target_ulong phys_len,
- uint8_t *buffer, uint16_t buffer_len)
-{
-cpu_physical_memory_write(ppc64_phys_to_real(phys), buffer,
-  MIN(buffer_len, phys_len));
-}
-
 typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, sPAPRMachineState *sm,
   uint32_t token,
   uint32_t nargs, target_ulong args,
-- 
2.5.0

[Qemu-devel] [PULL 10/39] spapr: Don't create ibm,dynamic-reconfiguration-memory w/o DR LMBs

2016-01-28 Thread David Gibson

From: Bharata B Rao 

If the guest doesn't have any dynamically reconfigurable (DR) logical memory
blocks (LMBs), then we shouldn't create the ibm,dynamic-reconfiguration-memory
device tree node.

Signed-off-by: Bharata B Rao 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 50e5a26..86e5023 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -763,6 +763,13 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
 int nr_nodes = nb_numa_nodes ? nb_numa_nodes : 1;
 
 /*
+ * Don't create the node if there are no DR LMBs.
+ */
+if (!nr_lmbs) {
+return 0;
+}
+
+/*
  * Allocate enough buffer size to fit in ibm,dynamic-memory
  * or ibm,associativity-lookup-arrays
  */
@@ -868,7 +875,7 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
 _FDT((spapr_fixup_cpu_dt(fdt, spapr)));
 }
 
-/* Generate memory nodes or ibm,dynamic-reconfiguration-memory node */
+/* Generate ibm,dynamic-reconfiguration-memory node if required */
 if (memory_update && smc->dr_lmb_enabled) {
 _FDT((spapr_populate_drconf_memory(spapr, fdt)));
 }
-- 
2.5.0

[Qemu-devel] [PULL 14/39] pseries: Clean up error handling in spapr_vga_init()

2016-01-28 Thread David Gibson

Use error_setg() to return an error rather than an explicit exit().
Previously it was an exit(0) instead of a non-zero exit code, which was
simply a bug.  Also improve the error message.

While we're at it change the type of spapr_vga_init() to bool since that's
how we're using it anyway.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b20b109..045c5a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1245,7 +1245,7 @@ static void spapr_rtc_create(sPAPRMachineState *spapr)
 }
 
 /* Returns whether we want to use VGA or not */
-static int spapr_vga_init(PCIBus *pci_bus)
+static bool spapr_vga_init(PCIBus *pci_bus, Error **errp)
 {
 switch (vga_interface_type) {
 case VGA_NONE:
@@ -1256,9 +1256,9 @@ static int spapr_vga_init(PCIBus *pci_bus)
 case VGA_VIRTIO:
 return pci_vga_init(pci_bus) != NULL;
 default:
-fprintf(stderr, "This vga model is not supported,"
-"currently it only supports -vga std\n");
-exit(0);
+error_setg(errp,
+   "Unsupported VGA mode, only -vga std or -vga virtio is supported");
+return false;
 }
 }
 
@@ -1933,7 +1933,7 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 /* Graphics */
-if (spapr_vga_init(phb->bus)) {
+if (spapr_vga_init(phb->bus, &error_fatal)) {
 spapr->has_graphics = true;
 machine->usb |= defaults_enabled() && !machine->usb_disabled;
 }
-- 
2.5.0

[Qemu-devel] [PULL 18/39] pseries: Clean up error reporting in htab migration functions

2016-01-28 Thread David Gibson

The functions for migrating the hash page table on pseries machine type
(htab_save_setup() and htab_load()) can report some errors with an
explicit fprintf() before returning an appropriate error code.  Change some
of these to use error_report() instead. htab_save_setup() is omitted for
now to avoid conflicts with some other in-progress work.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f8404d3..a9c9a95 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1533,7 +1533,7 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 int fd = -1;
 
 if (version_id < 1 || version_id > 1) {
-fprintf(stderr, "htab_load() bad version\n");
+error_report("htab_load() bad version");
 return -EINVAL;
 }
 
@@ -1554,8 +1554,8 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 
 fd = kvmppc_get_htab_fd(true);
 if (fd < 0) {
-fprintf(stderr, "Unable to open fd to restore KVM hash table: %s\n",
-strerror(errno));
+error_report("Unable to open fd to restore KVM hash table: %s",
+ strerror(errno));
 }
 }
 
@@ -1575,9 +1575,9 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 if ((index + n_valid + n_invalid) >
 (HTAB_SIZE(spapr) / HASH_PTE_SIZE_64)) {
 /* Bad index in stream */
-fprintf(stderr, "htab_load() bad index %d (%hd+%hd entries) "
-"in htab stream (htab_shift=%d)\n", index, n_valid, n_invalid,
-spapr->htab_shift);
+error_report(
+"htab_load() bad index %d (%hd+%hd entries) in htab stream (htab_shift=%d)",
+index, n_valid, n_invalid, spapr->htab_shift);
 return -EINVAL;
 }
 
-- 
2.5.0

[Qemu-devel] [PULL 16/39] pseries: Clean up error handling in xics_system_init()

2016-01-28 Thread David Gibson

Use the error handling infrastructure to pass an error out from
try_create_xics() instead of assuming &error_abort - the caller is in a
better position to decide on error handling policy.

Also change the error handling from an &error_abort to &error_fatal, since
this occurs during the initial machine construction and could be triggered
by bad configuration rather than a program error.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 045c5a1..59f0a16 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -111,7 +111,7 @@ static XICSState *try_create_xics(const char *type, int nr_servers,
 }
 
 static XICSState *xics_system_init(MachineState *machine,
-   int nr_servers, int nr_irqs)
+   int nr_servers, int nr_irqs, Error **errp)
 {
 XICSState *icp = NULL;
 
@@ -130,7 +130,7 @@ static XICSState *xics_system_init(MachineState *machine,
 }
 
 if (!icp) {
-icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, &error_abort);
+icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, errp);
 }
 
 return icp;
@@ -1812,7 +1812,7 @@ static void ppc_spapr_init(MachineState *machine)
 spapr->icp = xics_system_init(machine,
   DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(),
smp_threads),
-  XICS_IRQS);
+  XICS_IRQS, &error_fatal);
 
 if (smc->dr_lmb_enabled) {
 spapr_validate_node_memory(machine, &error_fatal);
-- 
2.5.0

[Qemu-devel] [PULL 15/39] pseries: Clean up error handling in spapr_rtas_register()

2016-01-28 Thread David Gibson

The errors detected in this function necessarily indicate bugs in the rest
of the qemu code, rather than an external or configuration problem.

So, a simple assert() is more appropriate than any more complex error
reporting.
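
The resulting shape can be sketched with a toy registration table. All names here are hypothetical (demo_register, demo_lookup, demo_table), and DEMO_TOKEN_MAX is an arbitrary bound chosen for the sketch; only the pattern of asserting on programmer errors is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TOKEN_BASE 0x2000
#define DEMO_TOKEN_MAX  (DEMO_TOKEN_BASE + 8) /* arbitrary for the sketch */

typedef int (*demo_fn)(int);

static struct {
    const char *name;
    demo_fn fn;
} demo_table[DEMO_TOKEN_MAX - DEMO_TOKEN_BASE];

/* Register a handler. A bad token or a duplicate registration can only be
 * caused by a bug in the calling code, so both are asserted. */
static void demo_register(int token, const char *name, demo_fn fn)
{
    assert((token >= DEMO_TOKEN_BASE) && (token < DEMO_TOKEN_MAX));

    token -= DEMO_TOKEN_BASE;

    assert(!demo_table[token].name);

    demo_table[token].name = name;
    demo_table[token].fn = fn;
}

/* Look up a handler; an unknown token at call time is a runtime condition,
 * so it is reported to the caller with NULL instead of an assert. */
static demo_fn demo_lookup(int token)
{
    if (token < DEMO_TOKEN_BASE || token >= DEMO_TOKEN_MAX) {
        return NULL;
    }
    return demo_table[token - DEMO_TOKEN_BASE].fn;
}

static int demo_double(int x)
{
    return 2 * x;
}
```

The split is the point of the sketch: registration errors are bugs and abort, while lookups of tokens supplied at run time are handled gracefully.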

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr_rtas.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index eac1556..130c917 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -664,17 +664,11 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
 void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
 {
-if (!((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX))) {
-fprintf(stderr, "RTAS invalid token 0x%x\n", token);
-exit(1);
-}
+assert((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX));
 
 token -= RTAS_TOKEN_BASE;
-if (rtas_table[token].name) {
-fprintf(stderr, "RTAS call \"%s\" is registered already as 0x%x\n",
-rtas_table[token].name, token);
-exit(1);
-}
+
+assert(!rtas_table[token].name);
 
 rtas_table[token].name = name;
 rtas_table[token].fn = fn;
-- 
2.5.0

[Qemu-devel] [PULL 22/39] target-ppc: gdbstub: introduce avr_need_swap()

2016-01-28 Thread David Gibson

From: Greg Kurz 

This helper will be used to support Altivec registers in little-endian guests.
This patch does not change functionality.

Note: I had to put the helper some lines away from the gdb_*_avr_reg()
routines to get a more readable patch.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 37 +++--
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 26b9b67..41308c3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8751,6 +8751,15 @@ static void dump_ppc_insns (CPUPPCState *env)
 }
 #endif
 
+static bool avr_need_swap(CPUPPCState *env)
+{
+#ifdef HOST_WORDS_BIGENDIAN
+return false;
+#else
+return true;
+#endif
+}
+
 static int gdb_get_float_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 {
 if (n < 32) {
@@ -8784,13 +8793,13 @@ static int gdb_set_float_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 {
 if (n < 32) {
-#ifdef HOST_WORDS_BIGENDIAN
-stq_p(mem_buf, env->avr[n].u64[0]);
-stq_p(mem_buf+8, env->avr[n].u64[1]);
-#else
-stq_p(mem_buf, env->avr[n].u64[1]);
-stq_p(mem_buf+8, env->avr[n].u64[0]);
-#endif
+if (!avr_need_swap(env)) {
+stq_p(mem_buf, env->avr[n].u64[0]);
+stq_p(mem_buf+8, env->avr[n].u64[1]);
+} else {
+stq_p(mem_buf, env->avr[n].u64[1]);
+stq_p(mem_buf+8, env->avr[n].u64[0]);
+}
 return 16;
 }
 if (n == 32) {
@@ -8807,13 +8816,13 @@ static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
 {
 if (n < 32) {
-#ifdef HOST_WORDS_BIGENDIAN
-env->avr[n].u64[0] = ldq_p(mem_buf);
-env->avr[n].u64[1] = ldq_p(mem_buf+8);
-#else
-env->avr[n].u64[1] = ldq_p(mem_buf);
-env->avr[n].u64[0] = ldq_p(mem_buf+8);
-#endif
+if (!avr_need_swap(env)) {
+env->avr[n].u64[0] = ldq_p(mem_buf);
+env->avr[n].u64[1] = ldq_p(mem_buf+8);
+} else {
+env->avr[n].u64[1] = ldq_p(mem_buf);
+env->avr[n].u64[0] = ldq_p(mem_buf+8);
+}
 return 16;
 }
 if (n == 32) {
-- 
2.5.0

[Qemu-devel] [PULL 17/39] pseries: Clean up error reporting in ppc_spapr_init()

2016-01-28 Thread David Gibson

This function includes a number of explicit fprintf()s for errors.
Change these to use error_report() instead.

Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since
the latter is the more usual idiom in qemu by a large margin.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
Reviewed-by: Markus Armbruster 
---
 hw/ppc/spapr.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 59f0a16..f8404d3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1788,8 +1788,8 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 if (spapr->rma_size > node0_size) {
-fprintf(stderr, "Error: Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")\n",
-spapr->rma_size);
+error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
+ spapr->rma_size);
 exit(1);
 }
 
@@ -1855,10 +1855,10 @@ static void ppc_spapr_init(MachineState *machine)
 ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
 
 if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
-error_report("Specified number of memory slots %" PRIu64
- " exceeds max supported %d",
+error_report("Specified number of memory slots %"
+ PRIu64" exceeds max supported %d",
  machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
-exit(EXIT_FAILURE);
+exit(1);
 }
 
 spapr->hotplug_memory.base = ROUND_UP(machine->ram_size,
@@ -1954,8 +1954,9 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 if (spapr->rma_size < (MIN_RMA_SLOF << 20)) {
-fprintf(stderr, "qemu: pSeries SLOF firmware requires >= "
-"%ldM guest RMA (Real Mode Area memory)\n", MIN_RMA_SLOF);
+error_report(
+"pSeries SLOF firmware requires >= %ldM guest RMA (Real Mode Area memory)",
+MIN_RMA_SLOF);
 exit(1);
 }
 
@@ -1971,8 +1972,8 @@ static void ppc_spapr_init(MachineState *machine)
 kernel_le = kernel_size > 0;
 }
 if (kernel_size < 0) {
-fprintf(stderr, "qemu: error loading %s: %s\n",
-kernel_filename, load_elf_strerror(kernel_size));
+error_report("error loading %s: %s",
+ kernel_filename, load_elf_strerror(kernel_size));
 exit(1);
 }
 
@@ -1985,8 +1986,8 @@ static void ppc_spapr_init(MachineState *machine)
 initrd_size = load_image_targphys(initrd_filename, initrd_base,
   load_limit - initrd_base);
 if (initrd_size < 0) {
-fprintf(stderr, "qemu: could not load initial ram disk '%s'\n",
-initrd_filename);
+error_report("could not load initial ram disk '%s'",
+ initrd_filename);
 exit(1);
 }
 } else {
-- 
2.5.0

[Qemu-devel] [PULL 07/39] spapr: Small fixes to rtas_ibm_get_system_parameter, remove rtas_st_buffer

2016-01-28 Thread David Gibson

rtas_st_buffer() appears in spapr.h as though it were a widely used helper,
but in fact it is only used for saving data in a format used by
rtas_ibm_get_system_parameter().  This changes it to a local helper more
specifically for that function.

While we're there, fix a couple of small defects in
rtas_ibm_get_system_parameter:
  - For the string value SPLPAR_CHARACTERISTICS, it wasn't including the
terminating \0 in the length, which it should according to LoPAPR
7.3.16.1
  - It now checks that the supplied buffer has at least enough space for
the length of the returned data, and returns an error if it does not.
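
The buffer format can be sketched on a plain byte array: a 2-byte big-endian length followed by the data, truncated to whatever space remains. The name sysparm_store() and the DEMO_* return codes are hypothetical placeholders for this sketch; they are not the RTAS status values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for the sketch, not the RTAS values. */
enum {
    DEMO_OK = 0,
    DEMO_PARAM_ERROR = -1
};

/* Write val_len as a 2-byte big-endian prefix, then as much of val as fits
 * in the remaining dst_len - 2 bytes. Fails if the prefix does not fit. */
static int sysparm_store(uint8_t *dst, size_t dst_len,
                         const void *val, uint16_t val_len)
{
    size_t n;

    assert(dst != NULL && (val != NULL || val_len == 0));
    if (dst_len < 2) {
        return DEMO_PARAM_ERROR;
    }
    dst[0] = (uint8_t)(val_len >> 8);
    dst[1] = (uint8_t)(val_len & 0xff);
    n = (dst_len - 2 < val_len) ? dst_len - 2 : val_len;
    if (n > 0) {
        memcpy(dst + 2, val, n);
    }
    return DEMO_OK;
}
```

For a string parameter the caller passes strlen() + 1 as the length, so the terminating \0 is counted and copied, as in the first fix listed above. Note that the length prefix always records the full value length, even when the data itself is truncated.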

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_rtas.c| 21 +
 include/hw/ppc/spapr.h | 28 +---
 2 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 34b12a3..8b702b5 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -228,6 +228,19 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 env->msr = 0;
 }
 
+static inline int sysparm_st(target_ulong addr, target_ulong len,
+ const void *val, uint16_t vallen)
+{
+hwaddr phys = ppc64_phys_to_real(addr);
+
+if (len < 2) {
+return RTAS_OUT_SYSPARM_PARAM_ERROR;
+}
+stw_be_phys(&address_space_memory, phys, vallen);
+cpu_physical_memory_write(phys + 2, val, MIN(len - 2, vallen));
+return RTAS_OUT_SUCCESS;
+}
+
 static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   sPAPRMachineState *spapr,
   uint32_t token, uint32_t nargs,
@@ -237,7 +250,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
 target_ulong parameter = rtas_ld(args, 0);
 target_ulong buffer = rtas_ld(args, 1);
 target_ulong length = rtas_ld(args, 2);
-target_ulong ret = RTAS_OUT_SUCCESS;
+target_ulong ret;
 
 switch (parameter) {
 case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
@@ -249,18 +262,18 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   current_machine->ram_size / M_BYTE,
   smp_cpus,
   max_cpus);
-rtas_st_buffer(buffer, length, (uint8_t *)param_val, strlen(param_val));
+ret = sysparm_st(buffer, length, param_val, strlen(param_val) + 1);
 g_free(param_val);
 break;
 }
 case RTAS_SYSPARM_DIAGNOSTICS_RUN_MODE: {
 uint8_t param_val = DIAGNOSTICS_RUN_MODE_DISABLED;
 
-rtas_st_buffer(buffer, length, &param_val, sizeof(param_val));
+ret = sysparm_st(buffer, length, &param_val, sizeof(param_val));
 break;
 }
 case RTAS_SYSPARM_UUID:
-rtas_st_buffer(buffer, length, qemu_uuid, (qemu_uuid_set ? 16 : 0));
+ret = sysparm_st(buffer, length, qemu_uuid, (qemu_uuid_set ? 16 : 0));
 break;
 default:
 ret = RTAS_OUT_NOT_SUPPORTED;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 53af76a..1e10fc9 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -408,14 +408,15 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi);
 #define RTAS_SLOT_PERM_ERR_LOG   2
 
 /* RTAS return codes */
-#define RTAS_OUT_SUCCESS0
-#define RTAS_OUT_NO_ERRORS_FOUND1
-#define RTAS_OUT_HW_ERROR   -1
-#define RTAS_OUT_BUSY   -2
-#define RTAS_OUT_PARAM_ERROR-3
-#define RTAS_OUT_NOT_SUPPORTED  -3
-#define RTAS_OUT_NO_SUCH_INDICATOR  -3
-#define RTAS_OUT_NOT_AUTHORIZED -9002
+#define RTAS_OUT_SUCCESS0
+#define RTAS_OUT_NO_ERRORS_FOUND1
+#define RTAS_OUT_HW_ERROR   -1
+#define RTAS_OUT_BUSY   -2
+#define RTAS_OUT_PARAM_ERROR-3
+#define RTAS_OUT_NOT_SUPPORTED  -3
+#define RTAS_OUT_NO_SUCH_INDICATOR  -3
+#define RTAS_OUT_NOT_AUTHORIZED -9002
+#define RTAS_OUT_SYSPARM_PARAM_ERROR-
 
 /* RTAS tokens */
 #define RTAS_TOKEN_BASE  0x2000
@@ -513,17 +514,6 @@ static inline void rtas_st_buffer_direct(target_ulong phys,
   MIN(buffer_len, phys_len));
 }
 
-static inline void rtas_st_buffer(target_ulong phys, target_ulong phys_len,
-  uint8_t *buffer, uint16_t buffer_len)
-{
-if (phys_len < 2) {
-return;
-}
-stw_be_phys(&address_space_memory,
-ppc64_phys_to_real(phys), buffer_len);
-rtas_st_buffer_direct(phys + 2, phys_len - 2, buffer, buffer_len);
-}
-
 typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, sPAPRMachineState *sm,
   uint32_t token,
   uint32_t nargs, target_ulong args,
-- 
2.5.0

[Qemu-devel] [PULL 02/39] target-ppc: use cpu_write_xer() helper in cpu_post_load

2016-01-28 Thread David Gibson

From: Mark Cave-Ayland 

Otherwise some internal xer variables fail to get set post-migration.
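
A rough sketch of why a plain assignment is not enough, assuming the CPU state keeps the SO, OV and CA flags in separate fields and stores the rest of XER without them. The DemoEnv struct, the demo_* helpers and the bit positions (the top three bits of the 32-bit XER) are assumptions made for this illustration, not a copy of QEMU's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions of the split-out flags (LSB-0 numbering). */
enum {
    DEMO_XER_SO = 31,
    DEMO_XER_OV = 30,
    DEMO_XER_CA = 29
};

/* Hypothetical CPU state: flags live outside the stored xer value. */
typedef struct DemoEnv {
    uint32_t xer;
    uint32_t so;
    uint32_t ov;
    uint32_t ca;
} DemoEnv;

/* Split a full XER value into the separate flag fields and the remainder. */
static void demo_write_xer(DemoEnv *env, uint32_t xer)
{
    assert(env != NULL);
    env->so = (xer >> DEMO_XER_SO) & 1;
    env->ov = (xer >> DEMO_XER_OV) & 1;
    env->ca = (xer >> DEMO_XER_CA) & 1;
    env->xer = xer & ~((1u << DEMO_XER_SO) | (1u << DEMO_XER_OV) |
                       (1u << DEMO_XER_CA));
}

/* Reassemble the architectural XER value from the split state. */
static uint32_t demo_read_xer(const DemoEnv *env)
{
    assert(env != NULL);
    return env->xer | (env->so << DEMO_XER_SO) | (env->ov << DEMO_XER_OV) |
           (env->ca << DEMO_XER_CA);
}
```

Under these assumptions, assigning the saved value straight to the xer field would leave the separate flag fields at their stale values, which is the kind of post-migration problem the message describes.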

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 target-ppc/machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index f4ac761..b61c060 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -168,7 +168,7 @@ static int cpu_post_load(void *opaque, int version_id)
 env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
 env->lr = env->spr[SPR_LR];
 env->ctr = env->spr[SPR_CTR];
-env->xer = env->spr[SPR_XER];
+cpu_write_xer(env, env->spr[SPR_XER]);
 #if defined(TARGET_PPC64)
 env->cfar = env->spr[SPR_CFAR];
 #endif
-- 
2.5.0

[Qemu-devel] [PULL 06/39] cuda: add missing fields to VMStateDescription

2016-01-28 Thread David Gibson

From: Mark Cave-Ayland 

Include some fields missed from the previous VMState conversion to the
migration stream, as well as the new SR_INT delay timer.

Signed-off-by: Mark Cave-Ayland 
Signed-off-by: David Gibson 
---
 hw/misc/macio/cuda.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 9db4c64..3556852 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -704,15 +704,17 @@ static const VMStateDescription vmstate_cuda_timer = {
 
 static const VMStateDescription vmstate_cuda = {
 .name = "cuda",
-.version_id = 2,
-.minimum_version_id = 2,
+.version_id = 3,
+.minimum_version_id = 3,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(a, CUDAState),
 VMSTATE_UINT8(b, CUDAState),
+VMSTATE_UINT8(last_b, CUDAState),
 VMSTATE_UINT8(dira, CUDAState),
 VMSTATE_UINT8(dirb, CUDAState),
 VMSTATE_UINT8(sr, CUDAState),
 VMSTATE_UINT8(acr, CUDAState),
+VMSTATE_UINT8(last_acr, CUDAState),
 VMSTATE_UINT8(pcr, CUDAState),
 VMSTATE_UINT8(ifr, CUDAState),
 VMSTATE_UINT8(ier, CUDAState),
@@ -727,6 +729,7 @@ static const VMStateDescription vmstate_cuda = {
 VMSTATE_STRUCT_ARRAY(timers, CUDAState, 2, 1,
  vmstate_cuda_timer, CUDATimer),
 VMSTATE_TIMER_PTR(adb_poll_timer, CUDAState),
+VMSTATE_TIMER_PTR(sr_delay_timer, CUDAState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.5.0
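
As a reminder of what bumping both version_id and minimum_version_id means for incoming streams, here is a simplified model of the version gate. ToyVMSD and toy_check_version are hypothetical names; the real check lives in QEMU's vmstate loading code and does more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reduction of a VMStateDescription to its version fields. */
typedef struct {
    const char *name;
    int version_id;          /* newest stream layout we understand */
    int minimum_version_id;  /* oldest stream layout we still accept */
} ToyVMSD;

/* Return 0 if a stream with 'stream_version' may be loaded, else -EINVAL. */
static int toy_check_version(const ToyVMSD *vmsd, int stream_version)
{
    assert(vmsd != NULL);
    if (stream_version > vmsd->version_id) {
        return -EINVAL;      /* stream is newer than this build */
    }
    if (stream_version < vmsd->minimum_version_id) {
        return -EINVAL;      /* stream predates the supported layout */
    }
    return 0;
}
```

With version_id = minimum_version_id = 3, as in the cuda change above, a version 2 stream is rejected outright rather than being parsed with the wrong field layout.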

[Qemu-devel] [PULL 01/39] target-ppc: Use sensible POWER8/POWER8E versions

2016-01-28 Thread David Gibson

From: Benjamin Herrenschmidt 

We never released anything older than POWER8 DD2.0 and POWER8E DD2.1,
so let's use these versions. Otherwise some firmware or Linux code
might fail to use some HW features that were non-functional in earlier
internal-only spins of the chip.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: David Gibson 
---
 target-ppc/cpu-models.c | 12 ++--
 target-ppc/cpu-models.h |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
index 4d5ab4b..349783e 100644
--- a/target-ppc/cpu-models.c
+++ b/target-ppc/cpu-models.c
@@ -1138,10 +1138,10 @@
 "POWER7 v2.3")
 POWERPC_DEF("POWER7+_v2.1",  CPU_POWERPC_POWER7P_v21,POWER7,
 "POWER7+ v2.1")
-POWERPC_DEF("POWER8E_v1.0",  CPU_POWERPC_POWER8E_v10,POWER8,
-"POWER8E v1.0")
-POWERPC_DEF("POWER8_v1.0",   CPU_POWERPC_POWER8_v10, POWER8,
-"POWER8 v1.0")
+POWERPC_DEF("POWER8E_v2.1",  CPU_POWERPC_POWER8E_v21,POWER8,
+"POWER8E v2.1")
+POWERPC_DEF("POWER8_v2.0",   CPU_POWERPC_POWER8_v20, POWER8,
+"POWER8 v2.0")
 POWERPC_DEF("970_v2.2",  CPU_POWERPC_970_v22,970,
 "PowerPC 970 v2.2")
 POWERPC_DEF("970fx_v1.0",CPU_POWERPC_970FX_v10,  970,
@@ -1389,8 +1389,8 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
 { "POWER5gs", "POWER5+_v2.1" },
 { "POWER7", "POWER7_v2.3" },
 { "POWER7+", "POWER7+_v2.1" },
-{ "POWER8E", "POWER8E_v1.0" },
-{ "POWER8", "POWER8_v1.0" },
+{ "POWER8E", "POWER8E_v2.1" },
+{ "POWER8", "POWER8_v2.0" },
 { "970", "970_v2.2" },
 { "970fx", "970fx_v3.1" },
 { "970mp", "970mp_v1.1" },
diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
index 9d80e72..2992427 100644
--- a/target-ppc/cpu-models.h
+++ b/target-ppc/cpu-models.h
@@ -557,9 +557,9 @@ enum {
 CPU_POWERPC_POWER7P_BASE   = 0x004A,
 CPU_POWERPC_POWER7P_v21= 0x004A0201,
 CPU_POWERPC_POWER8E_BASE   = 0x004B,
-CPU_POWERPC_POWER8E_v10= 0x004B0100,
+CPU_POWERPC_POWER8E_v21= 0x004B0201,
 CPU_POWERPC_POWER8_BASE= 0x004D,
-CPU_POWERPC_POWER8_v10 = 0x004D0100,
+CPU_POWERPC_POWER8_v20 = 0x004D0200,
 CPU_POWERPC_970_v22= 0x00390202,
 CPU_POWERPC_970FX_v10  = 0x00391100,
 CPU_POWERPC_970FX_v20  = 0x003C0200,
-- 
2.5.0
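
The new PVR constants can be read as a 16-bit family number followed by a revision. A small decoding sketch follows, assuming the major.minor split seen in the POWER8 entries above (0x0201 for v2.1, 0x0200 for v2.0). Other entries in the table, such as the 970FX ones, do not follow this split, so treat toy_decode_pvr as illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded view of a PVR value. */
typedef struct {
    uint16_t family;   /* upper 16 bits, e.g. 0x004D for POWER8 */
    uint8_t  major;    /* bits 15..8 of the revision half */
    uint8_t  minor;    /* bits 7..0 of the revision half */
} ToyPvr;

/* Split a PVR into family and major/minor revision (POWER8-style layout). */
static ToyPvr toy_decode_pvr(uint32_t pvr)
{
    ToyPvr p;

    p.family = (uint16_t)(pvr >> 16);
    p.major  = (uint8_t)((pvr >> 8) & 0xff);
    p.minor  = (uint8_t)(pvr & 0xff);
    return p;
}
```

For example, 0x004B0201 decodes to family 0x004B, revision 2.1, matching CPU_POWERPC_POWER8E_v21.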

[Qemu-devel] [PULL 00/39] ppc-for-2.6 queue 20160129

2016-01-28 Thread David Gibson

The following changes since commit 357e81c7e880f868833edf9f53cce1f3b09ea8ec:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160128' into 
staging (2016-01-28 11:46:34 +)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.6-20160129

for you to fetch changes up to 1699679e699276c0538008f6ca74cd04e6c68b42:

  target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro 
(2016-01-29 14:01:52 +1100)

This is similar to the 2016-01-25 pull request which was dropped due
to a build bug on 32-bit hosts.  In addition to fixing that bug, I've
added in the page size cleanup and one other small cleanup patch.



ppc patch queue for 2016-01-29

Currently accumulated patches for target-ppc, pseries machine type and
related devices.
  * Cleanup of error handling code in spapr
  * A number of fixes for Macintosh devices for the benefit of MacOS 9 and X
  * Remove some abuses of the RTAS memory access functions in spapr
  * Fixes for the gdbstub (and monitor debug) for VMX and VSX extensions.
  * Fix pseries machine hotplug memory under TCG
  * Clean up and extend handling of multiple page sizes with 64-bit hash MMUs


Alyssa Milburn (1):
  cuda.c: return error for unknown commands

Anton Blanchard (1):
  target-ppc: gdbstub: Add VSX support

Benjamin Herrenschmidt (1):
  target-ppc: Use sensible POWER8/POWER8E versions

Bharata B Rao (1):
  spapr: Don't create ibm,dynamic-reconfiguration-memory w/o DR LMBs

David Gibson (22):
  spapr: Small fixes to rtas_ibm_get_system_parameter, remove rtas_st_buffer
  spapr: Remove rtas_st_buffer_direct()
  spapr: Remove abuse of rtas_ld() in h_client_architecture_support
  ppc: Clean up error handling in ppc_set_compat()
  pseries: Clean up error handling of spapr_cpu_init()
  pseries: Clean up error handling in spapr_validate_node_memory()
  pseries: Clean up error handling in spapr_vga_init()
  pseries: Clean up error handling in spapr_rtas_register()
  pseries: Clean up error handling in xics_system_init()
  pseries: Clean up error reporting in ppc_spapr_init()
  pseries: Clean up error reporting in htab migration functions
  pseries: Allow TCG h_enter to work with hotplugged memory
  target-ppc: Remove unused kvmppc_read_segment_page_sizes() stub
  target-ppc: Convert mmu-hash{32,64}.[ch] from CPUPPCState to PowerPCCPU
  target-ppc: Rework ppc_store_slb
  target-ppc: Rework SLB page size lookup
  target-ppc: Use actual page size encodings from HPTE
  target-ppc: Remove unused mmu models from ppc_tlb_invalidate_one
  target-ppc: Split 44x tlbiva from ppc_tlb_invalidate_one()
  target-ppc: Add new TLB invalidate by HPTE call for hash64 MMUs
  target-ppc: Helper to determine page size information from hpte alone
  target-ppc: Allow more page sizes for POWER7 & POWER8 in TCG

Greg Kurz (6):
  target-ppc: kvm: fix floating point registers sync on little-endian hosts
  target-ppc: rename and export maybe_bswap_register()
  target-ppc: gdbstub: fix float registers for little-endian guests
  target-ppc: gdbstub: introduce avr_need_swap()
  target-ppc: gdbstub: fix altivec registers for little-endian guests
  target-ppc: gdbstub: fix spe registers for little-endian guests

James Clarke (1):
  target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro

Mark Cave-Ayland (5):
  target-ppc: use cpu_write_xer() helper in cpu_post_load
  macio: use the existing IDEDMA aiocb to hold the active DMA aiocb
  macio: add dma_active to VMStateDescription
  mac_dbdma: add DBDMA controller state to VMStateDescription
  cuda: add missing fields to VMStateDescription

Programmingkid (1):
  uninorth.c: add support for UniNorth kMacRISCPCIAddressSelect (0x48) 
register

 configure   |   6 +-
 gdb-xml/power-vsx.xml   |  44 +++
 hw/ide/macio.c  |  23 +-
 hw/ide/macio.c.orig | 634 
 hw/misc/macio/cuda.c|  12 +-
 hw/misc/macio/mac_dbdma.c   |  40 ++-
 hw/pci-host/uninorth.c  |   9 +
 hw/ppc/mac.h|   1 -
 hw/ppc/spapr.c  | 112 
 hw/ppc/spapr_hcall.c| 145 --
 hw/ppc/spapr_rtas.c |  50 ++--
 include/hw/ppc/spapr.h  |  36 +--
 target-ppc/cpu-models.c |  12 +-
 target-ppc/cpu-models.h |   4 +-
 target-ppc/cpu.h|  35 ++-
 target-ppc/gdbstub.c|  10 +-
 target-ppc/helper.h |   1 +
 target-ppc/kvm.c|  14 +-
 target-ppc/kvm_ppc.h|   5 -
 target-ppc/machine.c|  22 +-
 target-ppc/mmu-hash32.c |  68 +++--
 target-ppc/mmu-hash32.h |  30 ++-
 target-ppc/mmu-hash64.c | 270 +--
 target-ppc/mmu-hash64.h

[Qemu-devel] [PULL 04/39] macio: add dma_active to VMStateDescription

2016-01-28 Thread David Gibson

From: Mark Cave-Ayland 

Make sure that we include the value of dma_active in the migration stream.

Signed-off-by: Mark Cave-Ayland 
Acked-by: John Snow 
Signed-off-by: David Gibson 
---
 hw/ide/macio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 110af46..a39bdc0 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -516,11 +516,12 @@ static const MemoryRegionOps pmac_ide_ops = {
 
 static const VMStateDescription vmstate_pmac = {
 .name = "ide",
-.version_id = 3,
+.version_id = 4,
 .minimum_version_id = 0,
 .fields = (VMStateField[]) {
 VMSTATE_IDE_BUS(bus, MACIOIDEState),
 VMSTATE_IDE_DRIVES(bus.ifs, MACIOIDEState),
+VMSTATE_BOOL(dma_active, MACIOIDEState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.5.0

[Qemu-devel] [PULL 05/39] mac_dbdma: add DBDMA controller state to VMStateDescription

2016-01-28 Thread David Gibson

From: Mark Cave-Ayland 

Make sure that we include the DBDMA controller state in the migration
stream.

Signed-off-by: Mark Cave-Ayland 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 40 
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index 5ee8f02..161f49e 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -712,20 +712,52 @@ static const MemoryRegionOps dbdma_ops = {
 },
 };
 
-static const VMStateDescription vmstate_dbdma_channel = {
-.name = "dbdma_channel",
+static const VMStateDescription vmstate_dbdma_io = {
+.name = "dbdma_io",
+.version_id = 0,
+.minimum_version_id = 0,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(addr, struct DBDMA_io),
+VMSTATE_INT32(len, struct DBDMA_io),
+VMSTATE_INT32(is_last, struct DBDMA_io),
+VMSTATE_INT32(is_dma_out, struct DBDMA_io),
+VMSTATE_BOOL(processing, struct DBDMA_io),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_dbdma_cmd = {
+.name = "dbdma_cmd",
 .version_id = 0,
 .minimum_version_id = 0,
 .fields = (VMStateField[]) {
+VMSTATE_UINT16(req_count, dbdma_cmd),
+VMSTATE_UINT16(command, dbdma_cmd),
+VMSTATE_UINT32(phy_addr, dbdma_cmd),
+VMSTATE_UINT32(cmd_dep, dbdma_cmd),
+VMSTATE_UINT16(res_count, dbdma_cmd),
+VMSTATE_UINT16(xfer_status, dbdma_cmd),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_dbdma_channel = {
+.name = "dbdma_channel",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
 VMSTATE_UINT32_ARRAY(regs, struct DBDMA_channel, DBDMA_REGS),
+VMSTATE_STRUCT(io, struct DBDMA_channel, 0, vmstate_dbdma_io, DBDMA_io),
+VMSTATE_STRUCT(current, struct DBDMA_channel, 0, vmstate_dbdma_cmd,
+   dbdma_cmd),
 VMSTATE_END_OF_LIST()
 }
 };
 
 static const VMStateDescription vmstate_dbdma = {
 .name = "dbdma",
-.version_id = 2,
-.minimum_version_id = 2,
+.version_id = 3,
+.minimum_version_id = 3,
 .fields = (VMStateField[]) {
 VMSTATE_STRUCT_ARRAY(channels, DBDMAState, DBDMA_CHANNELS, 1,
  vmstate_dbdma_channel, DBDMA_channel),
-- 
2.5.0

[Qemu-devel] [PULL 00/39] ppc-for-2.6 queue 20160129

2016-01-28 Thread David Gibson

The following changes since commit 357e81c7e880f868833edf9f53cce1f3b09ea8ec:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160128' into 
staging (2016-01-28 11:46:34 +)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.6-20160129

for you to fetch changes up to 1699679e699276c0538008f6ca74cd04e6c68b42:

  target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro 
(2016-01-29 14:01:52 +1100)


ppc patch queue for 2016-01-29

Currently accumulated patches for target-ppc, pseries machine type and
related devices.
  * Cleanup of error handling code in spapr
  * A number of fixes for Macintosh devices for the benefit of MacOS 9 and X
  * Remove some abuses of the RTAS memory access functions in spapr
  * Fixes for the gdbstub (and monitor debug) for VMX and VSX extensions.
  * Fix pseries machine hotplug memory under TCG
  * Clean up and extend handling of multiple page sizes with 64-bit hash MMUs


Alyssa Milburn (1):
  cuda.c: return error for unknown commands

Anton Blanchard (1):
  target-ppc: gdbstub: Add VSX support

Benjamin Herrenschmidt (1):
  target-ppc: Use sensible POWER8/POWER8E versions

Bharata B Rao (1):
  spapr: Don't create ibm,dynamic-reconfiguration-memory w/o DR LMBs

David Gibson (22):
  spapr: Small fixes to rtas_ibm_get_system_parameter, remove rtas_st_buffer
  spapr: Remove rtas_st_buffer_direct()
  spapr: Remove abuse of rtas_ld() in h_client_architecture_support
  ppc: Clean up error handling in ppc_set_compat()
  pseries: Clean up error handling of spapr_cpu_init()
  pseries: Clean up error handling in spapr_validate_node_memory()
  pseries: Clean up error handling in spapr_vga_init()
  pseries: Clean up error handling in spapr_rtas_register()
  pseries: Clean up error handling in xics_system_init()
  pseries: Clean up error reporting in ppc_spapr_init()
  pseries: Clean up error reporting in htab migration functions
  pseries: Allow TCG h_enter to work with hotplugged memory
  target-ppc: Remove unused kvmppc_read_segment_page_sizes() stub
  target-ppc: Convert mmu-hash{32,64}.[ch] from CPUPPCState to PowerPCCPU
  target-ppc: Rework ppc_store_slb
  target-ppc: Rework SLB page size lookup
  target-ppc: Use actual page size encodings from HPTE
  target-ppc: Remove unused mmu models from ppc_tlb_invalidate_one
  target-ppc: Split 44x tlbiva from ppc_tlb_invalidate_one()
  target-ppc: Add new TLB invalidate by HPTE call for hash64 MMUs
  target-ppc: Helper to determine page size information from hpte alone
  target-ppc: Allow more page sizes for POWER7 & POWER8 in TCG

Greg Kurz (6):
  target-ppc: kvm: fix floating point registers sync on little-endian hosts
  target-ppc: rename and export maybe_bswap_register()
  target-ppc: gdbstub: fix float registers for little-endian guests
  target-ppc: gdbstub: introduce avr_need_swap()
  target-ppc: gdbstub: fix altivec registers for little-endian guests
  target-ppc: gdbstub: fix spe registers for little-endian guests

James Clarke (1):
  target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro

Mark Cave-Ayland (5):
  target-ppc: use cpu_write_xer() helper in cpu_post_load
  macio: use the existing IDEDMA aiocb to hold the active DMA aiocb
  macio: add dma_active to VMStateDescription
  mac_dbdma: add DBDMA controller state to VMStateDescription
  cuda: add missing fields to VMStateDescription

Programmingkid (1):
  uninorth.c: add support for UniNorth kMacRISCPCIAddressSelect (0x48) 
register

 configure   |   6 +-
 gdb-xml/power-vsx.xml   |  44 +++
 hw/ide/macio.c  |  23 +-
 hw/ide/macio.c.orig | 634 
 hw/misc/macio/cuda.c|  12 +-
 hw/misc/macio/mac_dbdma.c   |  40 ++-
 hw/pci-host/uninorth.c  |   9 +
 hw/ppc/mac.h|   1 -
 hw/ppc/spapr.c  | 112 
 hw/ppc/spapr_hcall.c| 145 --
 hw/ppc/spapr_rtas.c |  50 ++--
 include/hw/ppc/spapr.h  |  36 +--
 target-ppc/cpu-models.c |  12 +-
 target-ppc/cpu-models.h |   4 +-
 target-ppc/cpu.h|  35 ++-
 target-ppc/gdbstub.c|  10 +-
 target-ppc/helper.h |   1 +
 target-ppc/kvm.c|  14 +-
 target-ppc/kvm_ppc.h|   5 -
 target-ppc/machine.c|  22 +-
 target-ppc/mmu-hash32.c |  68 +++--
 target-ppc/mmu-hash32.h |  30 ++-
 target-ppc/mmu-hash64.c | 270 +--
 target-ppc/mmu-hash64.h |  30 ++-
 target-ppc/mmu_helper.c |  59 ++---
 target-ppc/translate.c  |   2 +-
 target-ppc/translate_init.c | 129 +++--
 27 files changed, 1382 insertions(+), 421 deletions(-)
 create mode 100644 gd

Re: [Qemu-devel] [PATCH v4 6/8] bcm2836: add bcm2836 soc device

2016-01-28 Thread Peter Crosthwaite

SoC in subject line.

On Fri, Jan 15, 2016 at 3:58 PM, Andrew Baumann
 wrote:
> This is the SoC for Raspberry Pi 2.
>
> Signed-off-by: Andrew Baumann 
> ---
>
> Notes:
> v4:
> * s/ic/control/
> * replace use of smp_cpus with enabled-cpus property
> * propagate errors rather than exit(1)
>
>  hw/arm/Makefile.objs |   2 +-
>  hw/arm/bcm2836.c | 165 
> +++
>  include/hw/arm/bcm2836.h |  34 ++
>  3 files changed, 200 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/bcm2836.c
>  create mode 100644 include/hw/arm/bcm2836.h
>
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index 82cc142..f55f8d2 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -11,7 +11,7 @@ obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o 
> pxa2xx_pic.o
>  obj-$(CONFIG_DIGIC) += digic.o
>  obj-y += omap1.o omap2.o strongarm.o
>  obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
> -obj-$(CONFIG_RASPI) += bcm2835_peripherals.o
> +obj-$(CONFIG_RASPI) += bcm2835_peripherals.o bcm2836.o
>  obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
>  obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp.o xlnx-ep108.o
>  obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
> diff --git a/hw/arm/bcm2836.c b/hw/arm/bcm2836.c
> new file mode 100644
> index 000..0fd6118
> --- /dev/null
> +++ b/hw/arm/bcm2836.c
> @@ -0,0 +1,165 @@
> +/*
> + * Raspberry Pi emulation (c) 2012 Gregory Estrade
> + * Upstreaming code cleanup [including bcm2835_*] (c) 2013 Jan Petrous
> + *
> + * Rasperry Pi 2 emulation and refactoring Copyright (c) 2015, Microsoft
> + * Written by Andrew Baumann
> + *
> + * This code is licensed under the GNU GPLv2 and later.
> + */
> +
> +#include "hw/arm/bcm2836.h"
> +#include "hw/arm/raspi_platform.h"
> +#include "hw/sysbus.h"
> +#include "exec/address-spaces.h"
> +
> +/* Peripheral base address seen by the CPU */
> +#define BCM2836_PERI_BASE   0x3F00
> +
> +/* "QA7" (Pi2) interrupt controller and mailboxes etc. */
> +#define BCM2836_CONTROL_BASE0x4000
> +
> +static void bcm2836_init(Object *obj)
> +{
> +BCM2836State *s = BCM2836(obj);
> +int n;
> +
> +for (n = 0; n < BCM2836_NCPUS; n++) {
> +object_initialize(&s->cpus[n], sizeof(s->cpus[n]),
> +  "cortex-a15-" TYPE_ARM_CPU);
> +object_property_add_child(obj, "cpu[*]", OBJECT(&s->cpus[n]),
> +  &error_abort);
> +}
> +
> +object_initialize(&s->control, sizeof(s->control), TYPE_BCM2836_CONTROL);
> +object_property_add_child(obj, "control", OBJECT(&s->control), NULL);
> +qdev_set_parent_bus(DEVICE(&s->control), sysbus_get_default());
> +
> +object_initialize(&s->peripherals, sizeof(s->peripherals),
> +  TYPE_BCM2835_PERIPHERALS);
> +object_property_add_child(obj, "peripherals", OBJECT(&s->peripherals),
> +  &error_abort);
> +qdev_set_parent_bus(DEVICE(&s->peripherals), sysbus_get_default());
> +}
> +
> +static void bcm2836_realize(DeviceState *dev, Error **errp)
> +{
> +BCM2836State *s = BCM2836(dev);
> +Object *obj;
> +Error *err = NULL;
> +int n;
> +
> +/* common peripherals from bcm2835 */
> +
> +obj = object_property_get_link(OBJECT(dev), "ram", &err);
> +if (obj == NULL) {
> +error_setg(errp, "%s: required ram link not found: %s",
> +   __func__, error_get_pretty(err));
> +return;
> +}
> +
> +object_property_add_const_link(OBJECT(&s->peripherals), "ram", obj, &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +object_property_set_bool(OBJECT(&s->peripherals), true, "realized", &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +sysbus_mmio_map_overlap(SYS_BUS_DEVICE(&s->peripherals), 0,
> +BCM2836_PERI_BASE, 1);
> +
> +/* bcm2836 interrupt controller (and mailboxes, etc.) */
> +object_property_set_bool(OBJECT(&s->control), true, "realized", &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +sysbus_mmio_map(SYS_BUS_DEVICE(&s->control), 0, BCM2836_CONTROL_BASE);
> +
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s->peripherals), 0,
> +qdev_get_gpio_in_named(DEVICE(&s->control), "gpu-irq", 0));
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s->peripherals), 1,
> +qdev_get_gpio_in_named(DEVICE(&s->control), "gpu-fiq", 0));
> +
> +for (n = 0; n < BCM2836_NCPUS; n++) {
> +/* Mirror bcm2836, which has clusterid set to 0xf
> + * TODO: this should be converted to a property of ARM_CPU
> + */
> +s->cpus[n].mp_affinity = 0xF00 | n;
> +
> +/* set periphbase/CBAR value for CPU-local registers */
> +object_property_set_int(OBJECT(&s->cpus[n]),
> +BCM2836_PERI_BASE + MC

Re: [Qemu-devel] [PATCH v4 5/8] bcm2836_control: add bcm2836 ARM control logic

2016-01-28 Thread Andrew Baumann

Hi Peter,

> From: Peter Crosthwaite [mailto:crosthwaitepe...@gmail.com]
> Sent: Thursday, 28 January 2016 20:38
> 
> On Fri, Jan 15, 2016 at 3:58 PM, Andrew Baumann
>  wrote:
> > This module is specific to the bcm2836 (Pi2). It implements the top
> > level interrupt controller, and mailboxes used for inter-processor
> > synchronisation.
[...]
> > +for (i = 0; i < BCM2836_NCORES; i++) {
> > +/* handle local timer interrupts for this core */
> > +if (s->timerirqs[i]) {
> > +assert(s->timerirqs[i] < (1 << IRQ_MAILBOX0)); /* sanity check */
> > +for (j = 0; j < IRQ_MAILBOX0; j++) {
> 
> I think <= IRQ_CNTVIRQ is cleaner, as it keeps "MAILBOX" out of the timer
> code.

Ok.

[...]
> > +typedef struct BCM2836ControlState {
> > +/*< private >*/
> > +SysBusDevice busdev;
> > +/*< public >*/
> > +MemoryRegion iomem;
> > +
> 
> > +/* interrupt status registers (not directly visible to user) */
> > +bool gpu_irq, gpu_fiq;
> > +uint8_t timerirqs[BCM2836_NCORES];
> > +
> 
> This ...
> 
> > +/* mailboxes */
> > +uint32_t mailboxes[BCM2836_NCORES * BCM2836_MBPERCORE];
> > +
> > +/* interrupt routing/control registers */
> > +uint8_t route_gpu_irq, route_gpu_fiq;
> > +uint32_t timercontrol[BCM2836_NCORES];
> > +uint32_t mailboxcontrol[BCM2836_NCORES];
> > +
> 
> > +/* interrupt source registers, post-routing (visible) */
> > +uint32_t irqsrc[BCM2836_NCORES];
> > +uint32_t fiqsrc[BCM2836_NCORES];
> > +
> 
> And this are absent from the VMSD, but after some thought they don't
> need to be as they are pure functions of the input pin state that is
> always refreshable from other state, no? I would group these together with a
> brief comment as to the above, and keep the migratable state (genuine
> device state) all together.

Yes, that was exactly the intention. I'll comment/revise as you suggest.

> Reviewed-by: Peter Crosthwaite 

Thanks for the review,
Andrew
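
To make the two review points above concrete (the loop bound expressed via IRQ_CNTVIRQ, and irqsrc/fiqsrc being derived state that never needs migrating), here is a cut-down model of the routing logic. ToyControl and toy_update are hypothetical names, NCORES = 4 is an assumption, and only the timer interrupts are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCORES        4   /* assumed core count */
#define IRQ_CNTVIRQ   3   /* last per-core timer interrupt, as in the patch */

#define IRQ_BIT(cntrl, num) (((cntrl) & (1u << (num))) != 0)
#define FIQ_BIT(cntrl, num) (((cntrl) & (1u << ((num) + 4))) != 0)

typedef struct {
    uint8_t  timerirqs[NCORES];     /* input line levels */
    uint32_t timercontrol[NCORES];  /* guest-programmed routing register */
    uint32_t irqsrc[NCORES];        /* derived: recomputed on every update */
    uint32_t fiqsrc[NCORES];        /* derived: recomputed on every update */
} ToyControl;

/* Recompute the visible source registers purely from inputs and routing. */
static void toy_update(ToyControl *s)
{
    int core, irq;

    assert(s != NULL);
    for (core = 0; core < NCORES; core++) {
        s->irqsrc[core] = 0;
        s->fiqsrc[core] = 0;
        for (irq = 0; irq <= IRQ_CNTVIRQ; irq++) {
            if (!(s->timerirqs[core] & (1u << irq))) {
                continue;                       /* line not asserted */
            }
            if (FIQ_BIT(s->timercontrol[core], irq)) {
                s->fiqsrc[core] |= 1u << irq;   /* FIQ routing wins */
            } else if (IRQ_BIT(s->timercontrol[core], irq)) {
                s->irqsrc[core] |= 1u << irq;
            }                                   /* else: masked */
        }
    }
}
```

Because toy_update() overwrites irqsrc and fiqsrc from scratch, any stale value in them is harmless once the update has run after load.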

Re: [Qemu-devel] [PATCH v4 5/8] bcm2836_control: add bcm2836 ARM control logic

2016-01-28 Thread Peter Crosthwaite

On Fri, Jan 15, 2016 at 3:58 PM, Andrew Baumann
 wrote:
> This module is specific to the bcm2836 (Pi2). It implements the top
> level interrupt controller, and mailboxes used for inter-processor
> synchronisation.
>
> Signed-off-by: Andrew Baumann 
> ---
>
> Notes:
> v4:
> * delete unused defs
> * s/localirqs/timerirqs/
> * factor out deliver_local() from bcm2836_control_update
> * use deposit32 in place of bit manipulation in set_local_irq
> * introduced register offset defs, and reduced comments in read/write handlers
> * delete commented code
> * s/_/-/ rename GPIOs
>
> v3:
>  * uint8 localirqs
>  * style tweaks
>  * add MR access size limits
>
>  hw/intc/Makefile.objs |   2 +-
>  hw/intc/bcm2836_control.c | 303 
> ++
>  include/hw/intc/bcm2836_control.h |  51 +++
>  3 files changed, 355 insertions(+), 1 deletion(-)
>  create mode 100644 hw/intc/bcm2836_control.c
>  create mode 100644 include/hw/intc/bcm2836_control.h
>
> diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
> index 2ad1204..6a13a39 100644
> --- a/hw/intc/Makefile.objs
> +++ b/hw/intc/Makefile.objs
> @@ -24,7 +24,7 @@ obj-$(CONFIG_GRLIB) += grlib_irqmp.o
>  obj-$(CONFIG_IOAPIC) += ioapic.o
>  obj-$(CONFIG_OMAP) += omap_intc.o
>  obj-$(CONFIG_OPENPIC_KVM) += openpic_kvm.o
> -obj-$(CONFIG_RASPI) += bcm2835_ic.o
> +obj-$(CONFIG_RASPI) += bcm2835_ic.o bcm2836_control.o
>  obj-$(CONFIG_SH4) += sh_intc.o
>  obj-$(CONFIG_XICS) += xics.o
>  obj-$(CONFIG_XICS_KVM) += xics_kvm.o
> diff --git a/hw/intc/bcm2836_control.c b/hw/intc/bcm2836_control.c
> new file mode 100644
> index 000..f0b7b0a
> --- /dev/null
> +++ b/hw/intc/bcm2836_control.c
> @@ -0,0 +1,303 @@
> +/*
> + * Rasperry Pi 2 emulation ARM control logic module.
> + * Copyright (c) 2015, Microsoft
> + * Written by Andrew Baumann
> + *
> + * Based on bcm2835_ic.c (Raspberry Pi emulation) (c) 2012 Gregory Estrade
> + * This code is licensed under the GNU GPLv2 and later.
> + *
> + * At present, only implements interrupt routing, and mailboxes (i.e.,
> + * not local timer, PMU interrupt, or AXI counters).
> + *
> + * Ref:
> + * https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2836/QA7_rev3.4.pdf
> + */
> +
> +#include "hw/intc/bcm2836_control.h"
> +
> +#define REG_GPU_ROUTE   0x0c
> +#define REG_TIMERCONTROL0x40
> +#define REG_MBOXCONTROL 0x50
> +#define REG_IRQSRC  0x60
> +#define REG_FIQSRC  0x70
> +#define REG_MBOX0_WR0x80
> +#define REG_MBOX0_RDCLR 0xc0
> +#define REG_LIMIT  0x100
> +
> +#define IRQ_BIT(cntrl, num) (((cntrl) & (1 << (num))) != 0)
> +#define FIQ_BIT(cntrl, num) (((cntrl) & (1 << ((num) + 4))) != 0)
> +
> +#define IRQ_CNTPSIRQ0
> +#define IRQ_CNTPNSIRQ   1
> +#define IRQ_CNTHPIRQ2
> +#define IRQ_CNTVIRQ 3
> +#define IRQ_MAILBOX04
> +#define IRQ_MAILBOX15
> +#define IRQ_MAILBOX26
> +#define IRQ_MAILBOX37
> +#define IRQ_GPU 8
> +#define IRQ_PMU 9
> +#define IRQ_AXI 10
> +#define IRQ_TIMER   11
> +#define IRQ_MAX IRQ_TIMER
> +
> +static void deliver_local(BCM2836ControlState *s, uint8_t core, uint8_t irq,
> +  uint32_t controlreg, uint8_t controlidx)
> +{
> +if (FIQ_BIT(controlreg, controlidx)) {
> +/* deliver a FIQ */
> +s->fiqsrc[core] |= (uint32_t)1 << irq;
> +} else if (IRQ_BIT(controlreg, controlidx)) {
> +/* deliver an IRQ */
> +s->irqsrc[core] |= (uint32_t)1 << irq;
> +} else {
> +/* the interrupt is masked */
> +}
> +}
> +
> +/* Update interrupts.  */
> +static void bcm2836_control_update(BCM2836ControlState *s)
> +{
> +int i, j;
> +
> +/* reset pending IRQs/FIQs */
> +for (i = 0; i < BCM2836_NCORES; i++) {
> +s->irqsrc[i] = s->fiqsrc[i] = 0;
> +}
> +
> +/* apply routing logic, update status regs */
> +if (s->gpu_irq) {
> +assert(s->route_gpu_irq < BCM2836_NCORES);
> +s->irqsrc[s->route_gpu_irq] |= (uint32_t)1 << IRQ_GPU;
> +}
> +
> +if (s->gpu_fiq) {
> +assert(s->route_gpu_fiq < BCM2836_NCORES);
> +s->fiqsrc[s->route_gpu_fiq] |= (uint32_t)1 << IRQ_GPU;
> +}
> +
> +for (i = 0; i < BCM2836_NCORES; i++) {
> +/* handle local timer interrupts for this core */
> +if (s->timerirqs[i]) {
> +assert(s->timerirqs[i] < (1 << IRQ_MAILBOX0)); /* sanity check */
> +for (j = 0; j < IRQ_MAILBOX0; j++) {

I think <= IRQ_CNTVIRQ is cleaner, as it keeps "MAILBOX" out of the timer code.

> +if ((s->timerirqs[i] & (1 << j)) != 0) {
> +/* local interrupt j is set */
> +deliver_local(s, i, j, s->timercontrol[i], j);
> +}
> +}
> +}
> +
> +/* handle mailboxes for this core */
> +for (j = 0; j < BCM2836_MBPERCORE;

Re: [Qemu-devel] [PATCHv2 00/10] Clean up page size handling for ppc 64-bit hash MMUs with TCG

2016-01-28 Thread David Gibson

On Thu, Jan 28, 2016 at 09:44:53PM +0100, Alexander Graf wrote:
> 
> 
> On 01/27/2016 11:13 AM, David Gibson wrote:
> >Encoding of page sizes on 64-bit hash MMUs for Power is rather arcane,
> >involving control bits in both the SLB and HPTE.  At present we
> >support a few of the options, but far fewer than real hardware.
> >
> >We're able to get away with that in practice, because guests use a
> >device tree property to determine which page sizes are available and
> >we are setting that to match.  However, the fact that the actual code
> >doesn't necessarily match what we put into the table of available page sizes
> >is another ugliness.
> >
> >This series makes a number of cleanups to the page size handling.  The
> >upshot is that afterwards the softmmu code operates off the same page
> >size encoding table that is advertised to the guests, ensuring that
> >they will be in sync.
> >
> >Finally, we extend the table of allowed sizes for POWER7 and POWER8 to
> >include the options allowed in hardware (including MPSS).  We can fix
> >other hash MMU based CPUs in future if anyone cares enough.
> >
> >For a simple benchmark I timed fully booting then cleanly shutting
> >down a TCG guest (RHEL7.2 userspace with a recent upstream kernel).
> >Repeated 5 times on the current master branch, my current ppc-for-2.6
> >branch and this branch.  It looks like it improves speed, although the
> >difference is pretty much negligible:
> >
> >master:  2m25 2m28 2m26 2m26 2m26
> >ppc-for-2.6:2m26 2m25 2m26 2m27 2m25
> >this series:2m20 2m23 2m23 2m25 2m21
> >
> >Please review, and I'll fold into ppc-for-2.6 for my next pull.
> >
> >Changes since v1:
> >   * Fix a couple of simple but serious bugs in logic
> >   * Did some rudimentary benchmarking
> >Changes since RFC:
> >   * Moved lookup of SLB encodings table from SLB lookup time to SLB
> >   store time
> 
> LGTM, apart from the comments that people already made. Please also provide
> changelogs in the individual patch files next time - it makes it easier for
> people who just try to see what changed from one version to another ;).
> 
> Reviewed-by: Alexander Graf 

Thanks, I've merged to ppc-for-2.6.

> Also, please just double sanity check that the code after your conversion
> still works well on 32bit hosts ;). I suppose you have a 32bit build
> environment by now, so that should be quite easy to pull off.

Yeah, will do.  I'm still pretty pissed that glib breaks the multiarch
build, which should be straightforward, but I have something workable.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PULL 00/28] ppc-for-2.6 queue 20160125

2016-01-28 Thread David Gibson

On Tue, Jan 26, 2016 at 09:13:34AM +, Peter Maydell wrote:
> On 26 January 2016 at 05:37, David Gibson  wrote:
> > Good grief.  And this would be why I don't generally test 32-bit
> > builds...
> 
> 32-bit on 64-bit host is a special case of a cross-compile,
> and cross-compiling is always pain... (My test 32-bit builds
> are just done on a natively 32-bit machine.)

Well, sort of.  With modern distro biarch / multiarch support it's
supposed to be a much easier case.  And is with most packages.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] target-ppc: mcrfs should always update FEX/VX and only clear exception bits

2016-01-28 Thread David Gibson

On Sun, Jan 24, 2016 at 03:41:26PM +, James Clarke wrote:
> Signed-off-by: James Clarke 

So, first, for a patch making a subtle behavioural change like this a
detailed commit message is absolutely essential.  In this case I can
take the description from 0/2, but in future please include
rationales like that in the individual patches, so they'll appear in
the git history without extra work on my part.

But.. there's a more serious bug here, so I've backed this out of
ppc-for-2.6...

[snip]
> @@ -2501,17 +2501,24 @@ static void gen_mcrfs(DisasContext *ctx)
>  {
>      TCGv tmp = tcg_temp_new();
>      int bfa;
> +    int nibble;
> +    int shift;
>  
>      if (unlikely(!ctx->fpu_enabled)) {
>          gen_exception(ctx, POWERPC_EXCP_FPU);
>          return;
>      }
> -    bfa = 4 * (7 - crfS(ctx->opcode));
> -    tcg_gen_shri_tl(tmp, cpu_fpscr, bfa);
> +    bfa = crfS(ctx->opcode);
> +    nibble = 7 - bfa;
> +    shift = 4 * nibble;
> +    tcg_gen_shri_tl(tmp, cpu_fpscr, shift);
>      tcg_gen_trunc_tl_i32(cpu_crf[crfD(ctx->opcode)], tmp);
> -    tcg_temp_free(tmp);
>      tcg_gen_andi_i32(cpu_crf[crfD(ctx->opcode)], cpu_crf[crfD(ctx->opcode)], 0xf);
> -    tcg_gen_andi_tl(cpu_fpscr, cpu_fpscr, ~(0xF << bfa));
> +    /* Only the exception bits (including FX) should be cleared if read */
> +    tcg_gen_andi_tl(tmp, cpu_fpscr, ~((0xF << shift) & FP_EX_CLEAR_BITS));
> +    /* FEX and VX need to be updated, so don't set fpscr directly */
> +    gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);

This doesn't compile.  For 64-bit targets we get:

  CC    ppc64-softmmu/target-ppc/translate.o
/home/dwg/src/qemu/target-ppc/translate.c: In function ‘gen_mcrfs’:
/home/dwg/src/qemu/target-ppc/translate.c:2520:42: error: passing argument 3 of ‘gen_helper_store_fpscr’ makes pointer from integer without a cast [-Werror=int-conversion]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                          ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i32 {aka struct TCGv_i32_d *}’ but argument is of type ‘int’

For 32-bit targets it's worse:

  CC    ppcemb-softmmu/target-ppc/translate.o
/home/dwg/src/qemu/target-ppc/translate.c: In function ‘gen_mcrfs’:
/home/dwg/src/qemu/target-ppc/translate.c:2520:37: error: passing argument 2 of ‘gen_helper_store_fpscr’ from incompatible pointer type [-Werror=incompatible-pointer-types]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                     ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i64 {aka struct TCGv_i64_d *}’ but argument is of type ‘TCGv_i32 {aka struct TCGv_i32_d *}’
/home/dwg/src/qemu/target-ppc/translate.c:2520:42: error: passing argument 3 of ‘gen_helper_store_fpscr’ makes pointer from integer without a cast [-Werror=int-conversion]
     gen_helper_store_fpscr(cpu_env, tmp, 1 << nibble);
                                          ^
In file included from /home/dwg/src/qemu/include/exec/helper-gen.h:59:0,
                 from /home/dwg/src/qemu/tcg/tcg-op.h:27,
                 from /home/dwg/src/qemu/target-ppc/translate.c:23:
/home/dwg/src/qemu/target-ppc/helper.h:56:58: note: expected ‘TCGv_i32 {aka struct TCGv_i32_d *}’ but argument is of type ‘int’


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v7 01/13] machine: Don't allow CPU toplogies with partially filled cores

2016-01-28 Thread David Gibson

On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:
> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> This is enforced by introducing MachineClass::validate_smp_config()
> that gets called from generic SMP parsing code. Machine type versions
> that want to enforce this can define this to the generic version
> provided.
> 
> Only sPAPR and PC machine types starting from version 2.6 enforce this in
> this patch.
> 
> Signed-off-by: Bharata B Rao 

I've been kind of lost in the back and forth about
threads/cores/sockets.

What, in the end, is the rationale for allowing partially filled
sockets, but not partially filled cores?
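
For reference while reading the thread, the rule in the quoted commit message can be sketched as below.  This is only an illustration, under the assumption that "partially filled core" means that the initial or the maximum CPU count is not a multiple of threads per core.  The function name is hypothetical and is not the patch's validate_smp_config().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when neither the boot-time CPU count
 * nor the hotplug limit can leave a core with only some threads present.
 * Assumption: a core is "partially filled" when the count is not a
 * multiple of threads-per-core.
 */
bool smp_config_fills_cores(unsigned cpus, unsigned threads, unsigned maxcpus)
{
    if (threads == 0) {
        return false;               /* nonsensical topology */
    }
    return cpus % threads == 0 && maxcpus % threads == 0;
}
```

Under this reading both -smp examples from the commit message are rejected, because 15 is not a multiple of 4 threads.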

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [RFC 0/3] Draft implementation of HPT resizing (qemu side)

2016-01-28 Thread David Gibson

On Thu, Jan 28, 2016 at 10:04:58PM +0100, Alexander Graf wrote:
> 
> 
> On 01/19/2016 12:02 PM, David Gibson wrote:
> >On Tue, Jan 19, 2016 at 01:18:17PM +0530, Bharata B Rao wrote:
> >>On Mon, Jan 18, 2016 at 04:44:38PM +1100, David Gibson wrote:
> >>>Here is a draft qemu implementation of my proposed PAPR extension for
> >>>allowing runtime resizing of a KVM/ppc64 guest's hash page table.
> >>>That in turn will allow for more flexible memory hotplug.
> >>>
> >>>This should work with the guest kernel side patches I also posted
> >>>recently [1].
> >>>
> >>>Still required to make this into a full implementation:
> >>>   * Guest needs to auto-resize HPT on memory hotplug events
> >>>
> >>>   * qemu needs to allocate HPT size based on current rather than
> >>> maximum memory if the guest is HPT resize aware
> >>>
> >>>   * KVM host side implementation
> >>>
> >>>   * PAPR standardization
> >>So with the current patchset (QEMU and guest kernel changes), I should
> >>be able to change the HTAB size of a PR guest right ? I see the below
> >>failure though:
> >Uh.. to be honest I haven't really considered the KVM case at all.
> >I'm kind of surprised it didn't just refuse to do anything.
> >
> >>[root@localhost ~]# cat /sys/kernel/debug/powerpc/pft-size
> >>24
> >>[root@localhost ~]# echo 26 > /sys/kernel/debug/powerpc/pft-size
> >>[   65.996845] lpar: Attempting to resize HPT to shift 26
> >>[   65.996845] lpar: Attempting to resize HPT to shift 26
> >>[   66.113596] lpar: HPT resize to shift 26 complete (109 ms / 6 ms)
> >>[   66.113596] lpar: HPT resize to shift 26 complete (109 ms / 6 ms)
> >>
> >>PR guest just hangs here while I see tons of below messages in
> >>the 1st level guest:
> >>
> >>KVM can't copy data from 0x3fff99e91400!
> >>...
> >>Couldn't emulate instruction 0x (op 0 xop 0)
> >>kvmppc_handle_exit_pr: emulation at 700 failed ()
> >Hm, not sure why that's happening.  At first I thought it was because
> >we weren't updating SDR1 with the address of the new htab, but that's
> >actually in there.  Maybe the KVM PR code isn't rereading it after
> >initial VM startup.
> 
> The KVM PR code doesn't care - it just rereads SDR1 on every pteg lookup ;).
> There's no caching at all.

Ok, no idea why it's not working then.  I'll investigate when I get a chance.

> Of course, the guest needs to invalidate all pending tlb entries if they're
> now invalid.
> 
> Does this work on real hardware? Say, a G5?

As Paulus says it would be possible to do HPT resizing on real
hardware, but the implementation I've done is specific to PAPR.  And
obviously qemu wouldn't be relevant to that case.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC handles mcrfs incorrectly

2016-01-28 Thread David Gibson

On Sun, Jan 24, 2016 at 03:41:24PM +, James Clarke wrote:
> Here is the description of the mcrfs instruction from the PowerPC Architecture
> Book, Version 2.02, Book I: PowerPC User Instruction Set Architecture
> (http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html),
> found on page 120:

Thanks, I've merged these fixes to ppc-for-2.6, for which I'll send a pull
request shortly.

> 
> The contents of FPSCR field BFA are copied to Condition Register field BF.
> All exception bits copied are set to 0 in the FPSCR. If the FX bit is
> copied, it is set to 0 in the FPSCR.
> 
> Special Registers Altered:
>     CR field BF
>     FX OX                     (if BFA=0)
>     UX ZX XX VXSNAN           (if BFA=1)
>     VXISI VXIDI VXZDZ VXIMZ   (if BFA=2)
>     VXVC                      (if BFA=3)
>     VXSOFT VXSQRT VXCVI       (if BFA=5)
> 
> However, currently every bit in FPSCR field BFA is set to 0, including ones
> not on that list.
> 
> I noticed this with the following simple C program:
> 
> #include <fenv.h>
> #include <stdio.h>
> 
> int main(int argc, char **argv) {
>     int ret;
>     ret = fegetround();
>     printf("Current rounding: %d\n", ret);
>     ret = fesetround(FE_UPWARD);
>     printf("Setting to FE_UPWARD (%d): %d\n", FE_UPWARD, ret);
>     ret = fegetround();
>     printf("Current rounding: %d\n", ret);
>     ret = fegetround();
>     printf("Current rounding: %d\n", ret);
>     return 0;
> }
> 
> which gave the output:
> 
> Current rounding: 0
> Setting to FE_UPWARD (2): 0
> Current rounding: 2
> Current rounding: 0
> 
> instead of (with these patches applied):
> 
> Current rounding: 0
> Setting to FE_UPWARD (2): 0
> Current rounding: 2
> Current rounding: 2
> 
> The relevant disassembly is in fegetround(), which, on my system, is:
> 
> __GI___fegetround:
> <+0>:   mcrfs  cr7, cr7
> <+4>:   mfcr   r3
> <+8>:   clrldi r3, r3, 62
> <+12>:  blr
> 
> What happens is that, the first time fegetround() is called, FPSCR field 7 is
> retrieved. However, because of the bug in mcrfs, the entirety of field 7 is
> set to 0, which includes the rounding mode.
> 
> There are other issues this will fix, such as condition flags not persisting
> when they should if read, and if you were to read a specific field with some
> exception bits set, but no others were set in the entire register, then the
> bits would be cleared correctly, but FEX/VX would not be updated to 0 as they
> should be.
> 
> The first commit is because some FP_ macros needed to calculate
> FP_EX_CLEAR_BITS did not exist, and I reordered all the FP_ macros so that
> they are defined in the same order as the FPSCR_ macros.
> 
> James Clarke (2):
>   target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro
>   target-ppc: mcrfs should always update FEX/VX and only clear exception
> bits
> 
>  target-ppc/cpu.h   | 37 -
>  target-ppc/translate.c | 15 +++
>  2 files changed, 39 insertions(+), 13 deletions(-)
> 
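
To make the quoted semantics concrete, here is a small stand-alone model of mcrfs acting on a 32-bit FPSCR value.  It is a sketch only: the bit positions (counted from the least significant bit) and the names EX_CLEAR_BITS and mcrfs_model() are assumptions made for illustration, and the FEX/VX recomputation mentioned in the cover letter is deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Exception bits from the "Special Registers Altered" list, per field.
 * Bit numbers are an assumption of this sketch (LSB = bit 0). */
#define FIELD0_EX (BIT(31) | BIT(28))                      /* FX OX */
#define FIELD1_EX (BIT(27) | BIT(26) | BIT(25) | BIT(24))  /* UX ZX XX VXSNAN */
#define FIELD2_EX (BIT(23) | BIT(22) | BIT(21) | BIT(20))  /* VXISI VXIDI VXZDZ VXIMZ */
#define FIELD3_EX (BIT(19))                                /* VXVC */
#define FIELD5_EX (BIT(10) | BIT(9) | BIT(8))              /* VXSOFT VXSQRT VXCVI */
#define EX_CLEAR_BITS \
    (FIELD0_EX | FIELD1_EX | FIELD2_EX | FIELD3_EX | FIELD5_EX)

/*
 * Copy FPSCR field bfa (0 = most significant nibble) out of *fpscr and
 * clear only the exception bits inside that field.  Returns the 4-bit
 * value that would be written to the CR field.
 */
uint32_t mcrfs_model(uint32_t *fpscr, unsigned bfa)
{
    unsigned shift = 4 * (7 - bfa);
    uint32_t field = (*fpscr >> shift) & 0xF;

    *fpscr &= ~((UINT32_C(0xF) << shift) & EX_CLEAR_BITS);
    return field;
}
```

In this model, reading field 7 (which holds the rounding mode) leaves the FPSCR untouched, which is the behaviour the fegetround() test case relies on.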

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCHv2 06/10] target-ppc: Remove unused mmu models from ppc_tlb_invalidate_one

2016-01-28 Thread David Gibson

On Thu, Jan 28, 2016 at 04:45:21PM +0100, Thomas Huth wrote:
> On 27.01.2016 11:13, David Gibson wrote:
> > ppc_tlb_invalidate_one() has a big switch handling many different MMU
> > types.  However, most of those branches can never be reached:
> > 
> > It is called from 3 places: from remove_hpte() and h_protect() in
> > spapr_hcall.c (which always has a 64-bit hash MMU type), and from
> > helper_tlbie() in mmu_helper.c.
> > 
> > Calls to helper_tlbie() are generated from gen_tlbie, gen_tlbiel and
> > gen_tlbiva.  The first two are only used with the PPC_MEM_TLBIE flag,
> > set only with 32-bit or 64-bit hash MMU models, and gen_tlbiva() is
> > used only on 440 and 460 models with the BookE mmu model.
> > 
> > This means the exhaustive list of MMU types which may call
> > ppc_tlb_invalidate_one() is: POWERPC_MMU_SOFT_6xx, POWERPC_MMU_601,
> > POWERPC_MMU_32B, POWERPC_MMU_SOFT_74xx, POWERPC_MMU_64B, POWERPC_MMU_2_03,
> > POWERPC_MMU_2_06, POWERPC_MMU_2_07 and POWERPC_MMU_BOOKE.
> > 
> > Clean up by removing logic for all other MMU types from
> > ppc_tlb_invalidate_one().
> > 
> > Signed-off-by: David Gibson 
> > ---
> ...
> > @@ -2031,9 +2016,8 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
> >          break;
> >  #endif /* defined(TARGET_PPC64) */
> >      default:
> > -        /* XXX: TODO */
> > -        cpu_abort(CPU(cpu), "Unknown MMU model\n");
> > -        break;
> > +        /* Should never reach here with other MMU models */
> > +        assert(0);
> >      }
> 
> May I suggest simply using "abort()" instead of "assert(0)"?

I actually prefer assert(0), because it documents that this is really
a "can't happen" rather than just "we can't cope".  It also means it
can be elided with -DNDEBUG.
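
The difference can be illustrated with a stand-alone sketch.  The enum and both functions are hypothetical stand-ins, not the code in mmu_helper.c; the only facts relied on are the standard C behaviour of assert() under NDEBUG and of abort().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical MMU kinds, standing in for the POWERPC_MMU_* models. */
enum mmu_kind { MMU_HASH32, MMU_HASH64, MMU_BOOKE, MMU_UNSUPPORTED };

/*
 * "Can't happen" style: the default branch documents an invariant.
 * When built with -DNDEBUG the assert expands to nothing, so control
 * falls out of the switch and reaches the final return.
 */
int invalidate_one_assert(enum mmu_kind kind)
{
    switch (kind) {
    case MMU_HASH32:
    case MMU_HASH64:
    case MMU_BOOKE:
        return 0;
    default:
        assert(0);
    }
    return -1;
}

/*
 * "We can't cope" style: abort() terminates the process regardless of
 * NDEBUG, so the check is never compiled out.
 */
int invalidate_one_abort(enum mmu_kind kind)
{
    switch (kind) {
    case MMU_HASH32:
    case MMU_HASH64:
    case MMU_BOOKE:
        return 0;
    default:
        abort();
    }
}
```

One consequence worth keeping in mind with the assert(0) form is that an NDEBUG build needs a sane fall-through path, which is why the sketch returns -1 after the switch.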

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v14 7/8] Implement new driver for block replication

2016-01-28 Thread Changlong Xie


On 01/28/2016 11:15 PM, Stefan Hajnoczi wrote:

On Thu, Jan 28, 2016 at 09:13:24AM +0800, Wen Congyang wrote:

On 01/27/2016 10:46 PM, Stefan Hajnoczi wrote:

On Wed, Jan 13, 2016 at 05:18:31PM +0800, Changlong Xie wrote:

+static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+{
+    Error *local_err = NULL;
+    int ret;
+
+    if (!s->secondary_disk->job) {
+        error_setg(errp, "Backup job is cancelled unexpectedly");
+        return;
+    }
+
+    block_job_do_checkpoint(s->secondary_disk->job, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    ret = s->active_disk->drv->bdrv_make_empty(s->active_disk);


What happens to in-flight requests to the active and hidden disks?


We MUST call do_checkpoint() when the VM is stopped.


Please document the environment under which the block replication
callback functions run.


OK



I'm concerned that the bdrv_drain_all() in vm_stop() can take a long
time if the disk is slow/failing.  bdrv_drain_all() blocks until all
in-flight I/O requests have completed.  What does the Primary do if the
Secondary becomes unresponsive?


Actually, we are aware of this problem, but currently there seems to be no
better way to resolve it. Do you have any ideas?





+    switch (s->mode) {
+    case REPLICATION_MODE_PRIMARY:
+        break;
+    case REPLICATION_MODE_SECONDARY:
+        s->active_disk = bs->file->bs;
+        if (!bs->file->bs->backing) {
+            error_setg(errp, "Active disk doesn't have backing file");
+            return;
+        }
+
+        s->hidden_disk = s->active_disk->backing->bs;
+        if (!s->hidden_disk->backing) {
+            error_setg(errp, "Hidden disk doesn't have backing file");
+            return;
+        }
+
+        s->secondary_disk = s->hidden_disk->backing->bs;
+        if (!s->secondary_disk->blk) {
+            error_setg(errp, "The secondary disk doesn't have block backend");
+            return;
+        }


Kevin: Is code allowed to stash away BlockDriverState pointers for
convenience or should it keep the BdrvChild pointers instead?  In order
for replication to work as expected, the graph shouldn't change but for
consistency maybe BdrvChild is best.


I asked Kevin about this on IRC and he agreed that BdrvChild should be
used instead of holding on to BlockDriverState * pointers.  Although
these pointers will not change during replication (if the op blockers
are set up correctly), it's more consistent and certainly safer to go
through BdrvChild.


Ok




+        /* start backup job now */
+        error_setg(&s->blocker,
+                   "block device is in use by internal backup job");
+        bdrv_op_block_all(s->top_bs, s->blocker);
+        bdrv_op_unblock(s->top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
+        bdrv_ref(s->hidden_disk);


Why is the explicit reference to hidden_disk (but not secondary_disk or
active_disk) necessary?


IIRC, we should reference the backup target before calling backup_start(),
and we will reference the backup source in backup_start().


I'm not sure why this is necessary since they are part of the backing
chain.



Just as Wen said, we should reference the backup target before calling
backup_start() to protect it from being destroyed if the backup job is
stopped unexpectedly.



If it is necessary, please add a comment so it's clear why the reference
is being taken.



Ok


Stefan

Re: [Qemu-devel] [PATCH v9 20/37] qmp: Don't abuse stack to track qmp-output root

2016-01-28 Thread Eric Blake

On 01/21/2016 06:58 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> The previous commit documented an inconsistency in how we are
>> using the stack of qmp-output-visitor.  Normally, pushing a
>> single top-level object puts the object on the stack twice:
>> once as the root, and once as the current container being
>> appended to; but popping that struct only pops once.  However,
>> qmp_output_add() was trying to either set up the added object
>> as the new root (works if you parse two top-level scalars in a
>> row: the second replaces the first as the root) or as a member
>> of the current container (works as long as you have an open
>> container on the stack; but if you have popped the first
>> top-level container, it then resolves to the root and still
>> tries to add into that existing container).
>>
>> Fix the stupidity by not tracking two separate things in the
>> stack.  Drop the now-useless qmp_output_first() while at it.
>>
>> Saved for a later patch: we still are rather sloppy in that
>> qmp_output_get_object() can be called in the middle of a parse,
>> rather than requiring that a visit is complete.
>>

>> +    switch (qobject_type(cur)) {
>> +    case QTYPE_QDICT:
>> +        assert(name);
>> +        qdict_put_obj(qobject_to_qdict(cur), name, value);
>> +        break;
>> +    case QTYPE_QLIST:
>> +        qlist_append_obj(qobject_to_qlist(cur), value);
>> +        break;
>> +    default:
>> +        g_assert_not_reached();
> 
> We usually just abort().

But there are definitely existing examples, and it is a bit more
self-documenting.


>>
>> @@ -230,7 +205,9 @@ static void qmp_output_type_any(Visitor *v, const char *name, QObject **obj,
>>  /* Finish building, and return the root object. Will not be NULL. */
>>  QObject *qmp_output_get_qobject(QmpOutputVisitor *qov)
>>  {
>> -QObject *obj = qmp_output_first(qov);
>> +/* FIXME: we should require that a visit occurred, and that it is
>> + * complete (no starts without a matching end) */
> 
> I agree the visit must complete before you can retrieve the value.
> 
> I think there are two sane ways to recover from errors:
> 
> 1. Require the client to empty the stack by calling the necessary end
>methods.
> 
> 2. Allow the client to reset or destroy the visitor without calling end
>methods.
> 
> *This* visitor would be fine with either.  I guess the others would be
> fine, too.  So it's a question of interface design.
> 
> I'm currently leaning towards 2, because "you must do A, B and C before
> you can destroy this object" would be weird.  What do you think?

Patches later in the series revisit the question, and adding a reset
also makes it more interesting to be able to reset at any point in a
partial visit.  For _this_ patch, I think we're okay, but this is
probably the cutoff for what I'm sending in the first round of v10,
saving all the trickier stuff for later.

> 
>> +QObject *obj = qov->root;
>>  if (obj) {
>>  qobject_incref(obj);
>>  } else {
>> @@ -248,16 +225,12 @@ void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
>>  {
>>  QStackEntry *e, *tmp;
>>
>> -/* The bottom QStackEntry, if any, owns the root QObject. See the
>> - * qmp_output_push_obj() invocations in qmp_output_add_obj(). */
>> -QObject *root = QTAILQ_EMPTY(&v->stack) ? NULL : qmp_output_first(v);
>> -
> 
> If we require end methods to be called, the stack must be empty here,
> rendering the following loop useless.

Yeah, an interesting observation that will affect what I do in 24/37.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-28 Thread Alex Williamson

On Fri, 2016-01-29 at 02:22 +, Kay, Allen M wrote:
> 
> > -Original Message-
> > From: iGVT-g [mailto:igvt-g-boun...@lists.01.org] On Behalf Of Alex
> > Williamson
> > Sent: Thursday, January 28, 2016 11:36 AM
> > To: Gerd Hoffmann; qemu-devel@nongnu.org
> > Cc: igv...@ml01.01.org; xen-de...@lists.xensource.com; Eduardo Habkost;
> > Stefano Stabellini; Cao jin; vfio-us...@redhat.com
> > Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset
> > tweaks
> > 
> > 
> > 1) The OpRegion MemoryRegion is mapped into system_memory through
> > programming of the 0xFC config space register.
> >  a) vfio-pci could pick an address to do this as it is realized.
> >  b) SeaBIOS/OVMF could program this.
> > 
> > Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to pick
> > an address and mark it as e820 reserved.  I'm not sure how to pick that
> > address.  We'd probably want to make the 0xFC config register read-
> > only.  1.b) has the issue you mentioned where in most cases the OpRegion
> > will be 8k, but the BIOS won't know how much address space it's mapping
> > into system memory when it writes the 0xFC register.  I don't know how
> > much of a problem this is since the BIOS can easily determine the size once
> > mapped and re-map it somewhere there's sufficient space.
> > Practically, it seems like it's always going to be 8K.  This of course requires
> > modification to every BIOS.  It also leaves the 0xFC register as a mapping
> > control rather than a pointer to the OpRegion in RAM, which doesn't really
> > match real hardware.  The BIOS would need to pick an address in this case.
> > 
> > 2) Read-only mappings version of 1)
> > 
> > Discussion: Really nothing changes from the issues above, just prevents any
> > possibility of the guest modifying anything in the host.  Xen apparently allows
> > write access to the host page already.
> > 
> > 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
> > 
> > Discussion: No benefit that I can see over above other than maybe allowing
> > write access that doesn't affect the host.
> > 
> > 4) Copy contents into a guest RAM location, mark it reserved, point to it via
> > 0xFC config as scratch register.
> >  a) Done by QEMU (vfio-pci)
> >  b) Done by SeaBIOS/OVMF
> > 
> > Discussion: This is the most like real hardware.  4.a) has the usual issue of
> > how to pick an address, but the benefit of not requiring BIOS changes (simply
> > mark the RAM reserved via existing methods).  4.b) would require passing a
> > buffer containing the contents of the OpRegion via fw_cfg and letting the
> > BIOS do the setup.  The latter of course requires modifying each BIOS for this
> > support.
> > 
> > Of course none of these support hotplug nor really can they since reserved
> > memory regions are not dynamic in the architecture.
> > 
> > In all cases, some piece of software needs to know where it can place the
> > OpRegion in guest memory.  It seems like there are advantages or
> > disadvantages whether that's done by QEMU or the BIOS, but we only need
> > to do it once if it's QEMU.  Suggestions, comments, preferences?
> > 
> 
> Hi Alex, another thing to consider is how to communicate to the guest driver
> that the address at 0xFC contains a valid GPA that can be accessed by the
> driver without causing an EPT fault, since the same driver will be used on
> other hypervisors and they may not EPT map OpRegion memory.  One idea proposed
> by the display driver team is to set bit0 of the address to 1 to indicate that
> OpRegion memory can be safely accessed by the guest driver.

Hi Allen,

Why is that any different than a guest accessing any other memory area
that it shouldn't?  The OpRegion starts with a 16-byte ID string, so if
the guest finds that it should feel fairly confident the OpRegion data
is valid.  The published spec also seems to define all bits of 0xfc as
valid, not implying any sort of alignment requirements, and the i915
driver does a memremap directly on the value read from 0xfc.  So I'm not
sure whether there's really a need to or ability to define any of those
bits in an ad hoc way to indicate mapping.  If we do things right, the
guest driver shouldn't even know it's running in a VM, at least for the
KVMGT-d case, so we need to be compatible with physical hardware.  Thanks,

Alex
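
As an illustration of the ID-string check mentioned above, a guest-side validity test could look roughly like the sketch below.  The signature text "IntelGraphicsMem" is taken to be the 16-byte OpRegion ID as used by the i915 driver; the function name and the idea of passing an already-mapped buffer are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OPREGION_SIGNATURE     "IntelGraphicsMem"
#define OPREGION_SIGNATURE_LEN 16

/*
 * Hypothetical helper: buf points at memory the guest mapped from the
 * address read out of config register 0xFC.  Returns true only when the
 * region is large enough and starts with the expected 16-byte ID string.
 */
bool opregion_signature_ok(const void *buf, size_t len)
{
    return len >= OPREGION_SIGNATURE_LEN &&
           memcmp(buf, OPREGION_SIGNATURE, OPREGION_SIGNATURE_LEN) == 0;
}
```

A check like this works on the data itself, so it needs no side-band flag such as a tagged low address bit.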

Re: [Qemu-devel] [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-28 Thread Kay, Allen M



> -Original Message-
> From: iGVT-g [mailto:igvt-g-boun...@lists.01.org] On Behalf Of Alex
> Williamson
> Sent: Thursday, January 28, 2016 11:36 AM
> To: Gerd Hoffmann; qemu-devel@nongnu.org
> Cc: igv...@ml01.01.org; xen-de...@lists.xensource.com; Eduardo Habkost;
> Stefano Stabellini; Cao jin; vfio-us...@redhat.com
> Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough chipset
> tweaks
> 
> 
> 1) The OpRegion MemoryRegion is mapped into system_memory through
> programming of the 0xFC config space register.
>  a) vfio-pci could pick an address to do this as it is realized.
>  b) SeaBIOS/OVMF could program this.
> 
> Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to pick
> an address and mark it as e820 reserved.  I'm not sure how to pick that
> address.  We'd probably want to make the 0xFC config register read-
> only.  1.b) has the issue you mentioned where in most cases the OpRegion
> will be 8k, but the BIOS won't know how much address space it's mapping
> into system memory when it writes the 0xFC register.  I don't know how
> much of a problem this is since the BIOS can easily determine the size once
> mapped and re-map it somewhere there's sufficient space.
> Practically, it seems like it's always going to be 8K.  This of course requires
> modification to every BIOS.  It also leaves the 0xFC register as a mapping
> control rather than a pointer to the OpRegion in RAM, which doesn't really
> match real hardware.  The BIOS would need to pick an address in this case.
> 
> 2) Read-only mappings version of 1)
> 
> Discussion: Really nothing changes from the issues above, just prevents any
> possibility of the guest modifying anything in the host.  Xen apparently allows
> write access to the host page already.
> 
> 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
> 
> Discussion: No benefit that I can see over above other than maybe allowing
> write access that doesn't affect the host.
> 
> 4) Copy contents into a guest RAM location, mark it reserved, point to it via
> 0xFC config as scratch register.
>  a) Done by QEMU (vfio-pci)
>  b) Done by SeaBIOS/OVMF
> 
> Discussion: This is the most like real hardware.  4.a) has the usual issue of
> how to pick an address, but the benefit of not requiring BIOS changes (simply
> mark the RAM reserved via existing methods).  4.b) would require passing a
> buffer containing the contents of the OpRegion via fw_cfg and letting the
> BIOS do the setup.  The latter of course requires modifying each BIOS for this
> support.
> 
> Of course none of these support hotplug nor really can they since reserved
> memory regions are not dynamic in the architecture.
> 
> In all cases, some piece of software needs to know where it can place the
> OpRegion in guest memory.  It seems like there are advantages or
> disadvantages whether that's done by QEMU or the BIOS, but we only need
> to do it once if it's QEMU.  Suggestions, comments, preferences?
> 

Hi Alex, another thing to consider is how to communicate to the guest driver
that the address at 0xFC contains a valid GPA that can be accessed by the
driver without causing an EPT fault, since the same driver will be used on
other hypervisors and they may not EPT map OpRegion memory.  One idea proposed
by the display driver team is to set bit0 of the address to 1 to indicate that
OpRegion memory can be safely accessed by the guest driver.

> 
> Another thing I notice in this series is the access to PCI config space of both
> the host bridge and the LPC bridge.  This prevents unprivileged use cases and
> is a barrier to libvirt support since it will need to provide access to the
> pci-sysfs files for the process.  Should vfio add additional device specific
> regions to expose the config space of these other devices?  I don't see that
> there's any write access necessary, so these would be read-only.  The comment in
> the kernel regarding why an unprivileged user can only access standard
> config space indicates that some devices lockup if unimplemented config
> space is accessed.  It seems like that's probably not an issue for recent-ish
> Intel host bridges and LPC devices.  If OpRegion, host bridge config, and LPC
> config were all provided through vfio, would there be any need for
> igd-passthrough switches on the machine type?  It seems like the QEMU vfio-pci
> driver could enable the necessary features and pre-fill the host and LPC
> bridge config items on demand when parsing an IGD device.  Thanks,
> 
> Alex
> 

Allen

[Qemu-devel] [PATCH v3] blockjob: Fix hang in block_job_finish_sync

2016-01-28 Thread Fam Zheng

With a mirror job running on a virtio-blk dataplane disk, sending "q" to
HMP will cause a dead loop in block_job_finish_sync.

This is because the aio_poll() only processes the AIO context of bs
which has no more work to do, while the main loop BH that is scheduled
for setting the job->completed flag is never processed.

Fix this by adding a flag in BlockJob structure, to track which context
to poll for the block job to make progress. Its value is set to true
when block_job_defer_to_main_loop() is called, and is checked in
block_job_finish_sync to determine which context to poll.

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 blockjob.c   | 5 -
 include/block/blockjob.h | 9 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index 80adb9d..25e1581 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -304,7 +304,9 @@ static int block_job_finish_sync(BlockJob *job,
         return -EBUSY;
     }
     while (!job->completed) {
-        aio_poll(bdrv_get_aio_context(bs), true);
+        aio_poll(job->deferred_to_main_loop ? qemu_get_aio_context() :
+                                              bdrv_get_aio_context(bs),
+                 true);
     }
     ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
     block_job_unref(job);
@@ -497,6 +499,7 @@ void block_job_defer_to_main_loop(BlockJob *job,
     data->aio_context = bdrv_get_aio_context(job->bs);
     data->fn = fn;
     data->opaque = opaque;
+    job->deferred_to_main_loop = true;
 
     qemu_bh_schedule(data->bh);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index d84ccd8..550de26 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -130,6 +130,11 @@ struct BlockJob {
      */
     bool ready;
 
+    /**
+     * Set to true when the job has deferred work to the main loop.
+     */
+    bool deferred_to_main_loop;
+
     /** Status that is published by the query-block-jobs QMP API */
     BlockDeviceIoStatus iostatus;
 
@@ -402,6 +407,10 @@ typedef void BlockJobDeferToMainLoopFn(BlockJob *job, void *opaque);
  * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
  * anything that uses bdrv_drain_all() in the main loop.
  *
+ * The job->deferred_to_main_loop flag will be set. Caller must clear it once
+ * the deferred work is done and the block job coroutine continues, unless it's
+ * completing immediately.
+ *
  * The @job AioContext is held while @fn executes.
  */
 void block_job_defer_to_main_loop(BlockJob *job,
-- 
2.4.3
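
The hang and the fix can be pictured with a toy model.  Everything here (Context, Job, poll_once, finish_sync) is a hypothetical simplification written for illustration and does not use the QEMU AioContext API; it only shows why polling a context that has no pending work never lets the completion callback run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Job {
    bool completed;
    bool deferred_to_main_loop;
} Job;

/* A context holds at most one pending callback: "mark this job completed". */
typedef struct Context {
    Job *completes;
} Context;

/* Run one pending callback in ctx; returns true if any work was done. */
static bool poll_once(Context *ctx)
{
    if (ctx->completes) {
        ctx->completes->completed = true;
        ctx->completes = NULL;
        return true;
    }
    return false;
}

/*
 * Model of the wait loop.  Without use_flag it always polls the device
 * context (the old behaviour); with use_flag it polls the main context
 * once the job has deferred work there.  Returns the number of polls
 * that were needed, or -1 if the job never completed within max_polls.
 */
int finish_sync(Job *job, Context *bs_ctx, Context *main_ctx,
                bool use_flag, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (job->completed) {
            return i;
        }
        Context *ctx = (use_flag && job->deferred_to_main_loop)
                       ? main_ctx : bs_ctx;
        poll_once(ctx);
    }
    return job->completed ? max_polls : -1;
}
```

With the completion callback queued on the main context, the old loop spins until the poll budget runs out, while the flag-aware loop completes after a single poll.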

Re: [Qemu-devel] [PATCH v2 1/3] linux-user/mmap.c: Set prot page flags for the correct region in mmap_frag()

2016-01-28 Thread Chen Gang


On 2016-01-28 22:54, Peter Maydell wrote:
> On 27 January 2016 at 01:37, Chen Gang  wrote:
>> Within one single call to target_mmap(), it should be OK.
>>
>> But multiple call to target_mmap(), may call mmap_frag() multiple times
>> for the same host page (also for the same target page). In our case:
>>
>>  - 4600 mmap2(0x0034,135168,PROT_READ,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x0034
>>
>>It will call mmap_frag() with start address 0x0034 + 128KB, and
>>set the target page with PAGE_VALID. But left the half below host
>>page without PAGE_VALID.
> 
> So, just to put some numbers in here:
> 
>  0x34 .. 0x34  0x35 .. 0x35 0x36 .. 0x360fff
>(64k, first host page)   (64k, second host page)  (4k guest page)
> 
> and we call mmap_frag() once for that last 4K fragment. It should:
>  * allocate a host page (since none is there yet)
>  * return to target_mmap, which will mark the range
>0x3f .. 0x360fff as PROT_VALID (together with the other
>read/write/etc permissions)
> 
> I think this part is definitely correct.
> 

Yes, I agree.

>>  - 4600 mmap2(0x00340000,135168,PROT_READ,MAP_SHARED|MAP_FIXED,8,0) =
>> 0x00340000
>>
>>It will call mmap_frag() with start address 0x00340000 + 128KB, and
>>check the half below host page which has no PAGE_VALID, then "prot1
>>== 0", mmap_frag() thinks "no page was there, so we allocate one".
> 
> On the second call, we again call mmap_frag for that last 4K.
> We check for any other valid guest pages in the 64k host page,
> and there aren't any. This will indeed cause us to mmap() again,
> which ideally we would not. But:
> 

OK.

> (1) Is this actually causing anything to fail? Calling host
> mmap() again is ever so slightly inefficient, but I don't think
> that it causes the guest to see anything wrong.
> 

To me, the situation is a little more complex (assume an 8KB host page and
a 4KB guest page):

 - The 1st mmap2() is MAP_PRIVATE, and the 2nd mmap2() is MAP_SHARED.

 - So if the 2nd mmap_frag() call with the same start address only calls
   mprotect() and does not call mmap2() again, will the target page
   still be MAP_PRIVATE? (The caller wants it to be MAP_SHARED.)

And theoretically, if the caller wants the 2 target pages within one host
page to have different mapping attributes (e.g. the top half of the host
page is MAP_SHARED, but the bottom half is MAP_PRIVATE):

 - I guess our current softmmu cannot do that (we would have to implement
   the softmmu again, just as rth said originally).

 - But luckily for me, Wine manages the whole memory on its own, and
   Windows also manages its whole memory on its own. They try to be as
   simple as they can. So I guess the current softmmu is enough for me.

> (2) If we do want to fix this, your fix is doing the wrong thing.
> It is correct that we don't mark the areas outside the guest page
> as PROT_VALID, because they are not valid guest memory. If you
> want to avoid the mmap() you need to change the condition we're using
> to decide whether to mmap() a fresh host page (it would need to
> look at the PROT_VALID bits within the new guest mapping, not just
> the ones outside it). Something like:
> 
> /* get the protection of the target pages outside the mapping,
>  * and check whether we already have a host page allocated
>  */
> prot1 = 0;
> havevalid = 0;
> for(addr = real_start; addr < real_end; addr++) {
> int pageprot = page_get_flags(addr);
> if (addr < start || addr >= end) {
> prot1 |= pageprot;
> }
> havevalid |= pageprot;
> }
> 
> if (!havevalid) {
> /* no page was there, so ... */
> ...
> }
>
 
What you said above sounds OK to me, as long as we do not consider
MAP_PRIVATE or MAP_SHARED.

After thinking about it again, I think we should leave the current code
untouched, but the related comment "/* no page was there, so ... */" needs
to be improved. I guess it should be:

 - If there is no host page, or only one target page, we need to call mmap2
   again, which will satisfy the 'flags' parameter (e.g. MAP_PRIVATE or
   MAP_SHARED); else FIXME: at present, 'flags' has to be skipped.


> But I think it's only worth making this change if we're fixing
> a real bug where the guest behaves wrongly.
> 

It sounds OK to me.

Thanks.
-- 
Chen Gang (陈刚)

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH V2] net/traffic-mirror:Add traffic-mirror

2016-01-28 Thread Li Zhijian




On 01/28/2016 01:44 PM, Jason Wang wrote:



On 01/27/2016 10:40 AM, Zhang Chen wrote:

From: ZhangChen 

Traffic-mirror is a netfilter plugin.
It gives qemu the ability to copy and mirror the guest's
net packets. We output the packets to a chardev.

usage:

-netdev tap,id=hn0
-chardev socket,id=mirror0,host=ip_primary,port=X,server,nowait
-traffic-mirror,id=m0,netdev=hn0,queue=tx/rx/all,outdev=mirror0

Signed-off-by: ZhangChen 
Signed-off-by: Wen Congyang 
Reviewed-by: Yang Hongyang 


Thanks for the patch. Several questions:

- I'm curious about how the patch was tested? Simple setup e.g:

-netdev tap,id=hn0 -device virtio-net-pci,netdev=hn0 -chardev
socket,id=c0,host=localhost,port=,server,nowait -object
traffic-mirror,netdev=hn0,outdev=c0,id=f0 -netdev
socket,id=s0,connect=127.0.0.1: -device e1000,netdev=s0

does not work for me.

Hi, Jason

I just tested the mirror using the command line above, and it doesn't work
for me either. I am looking into it, and it seems to be caused by the
-net socket problem that I once posted a patch to try to fix (see below):
[Qemu-devel] [PATCH] report a error message if -net socket can not connect to server
https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg00758.html

After applying this patch, the qemu monitor gives me the following message:
(qemu) qemu-system-x86_64: net socket is not connected Connection refused


Thanks
Li Zhijian

Re: [Qemu-devel] [PATCH v2] linux-user: Original qemu-binfmt-conf.h is only able to write configuration into /proc/sys/fs/binfmt_misc, and the configuration is lost on reboot.

2016-01-28 Thread Laurent Vivier



On 28/01/2016 23:29, Eric Blake wrote:
> On 01/28/2016 03:08 PM, Laurent Vivier wrote:
> 
> Subject line is TOOO long.  I suggest:
> 
> linux-user: Fix qemu-binfmt-conf.h to store config across reboot

OK. I was expecting this comment ;)

> 
>> This script can configure debian and systemd services to restore 
>> configuration on reboot. Moreover, it is able to manage binfmt 
>> credential and to configure the path of the interpreter.
>> 
> 
>> diff --git a/scripts/qemu-binfmt-conf.sh
>> b/scripts/qemu-binfmt-conf.sh old mode 100644 new mode 100755 
>> index 289b1a3..56bc88e --- a/scripts/qemu-binfmt-conf.sh +++
>> b/scripts/qemu-binfmt-conf.sh @@ -1,72 +1,314 @@ #!/bin/sh #
>> enable automatic i386/ARM/M68K/MIPS/SPARC/PPC/s390 program
>> execution by the kernel
>> 
> 
>> +aarch64_magic='\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00'
>> +aarch64_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
>> +aarch64_family=arm + +qemu_get_family() {
> 
>> + +usage() { +cat <<!EOF
> 
> Use of '!EOF' is an unusual heredoc delimiter; it fails miserably
> if history expansion is enabled.

Well, I learned that on HP-UX 10.20 (in the '90s), so I'm not surprised
it could cause some trouble now. I will change that.

>> +Usage: qemu-binfmt-conf.sh [--qemu-path
>> PATH][--debian][--systemd CPU]
> 
>> +   --credential: if yes, credential an security tokens are
> 
> s/an/and/

OK

> 
>> +echo -n ""
> 
> 'echo -n' is not portable.  Use 'printf' instead.

OK

> 
>> +for CPU in $qemu_target_list ; +do +echo -n
>> "$CPU "
> 
> and again.

OK

>> +done +echo
> 
> This loop results in trailing whitespace (not fatal, but nice to
> avoid where possible).  Also, using a shell loop is a waste of
> effort; when you can just do:
> 
> printf "%s " $qemu_target_list
> 
> and get the same effect.

OK

>> +} + +qemu_check_access() { +if [ ! -w "$1" ] ; then +
>> echo "ERROR: cannot write to $1" 1>&2 +exit 1
> 
> Checking whether a file is writable is often a TOCTTOU race; since
> you have to handle failures to redirect to the file anyways (in
> case the file changed between your check and the actual use), can
> you just skip the check as redundant?

Checking write access tells us whether the system supports
binfmt_misc, Debian packages or systemd, and whether we can write there
(are we root? see Alex's comment), so this check is really needed here.
No need to worry about TOCTTOU.

> 
> 
>> +qemu_check_debian() { +if [ ! -e /etc/debian_version ] ;
>> then +echo "WARNING: your system is not a Debian based
>> distro" 1>&2 +elif ! installed_dpkg binfmt-support ; then +
>> echo "WARNING: package binfmt-support is needed !" 1>&2
> 
> Trailing '!' in error messages is shouting at the user; I tend to
> avoid them.  But if you must use it, in English there is no space
> between the final word and the punctuation: s/needed !/needed!/

Yes, I often forget French and English differ in the use of
punctuation. :)
I will remove the '!'.

> 
>> +fi +qemu_check_access "$EXPORTDIR" +} + 
>> +qemu_check_systemd() { +if ! systemctl -q is-enabled
>> systemd-binfmt.service ; then +echo "WARNING:
>> systemd-binfmt.service is missing or disabled !" 1>&2
> 
> and again

OK

>> +qemu_generate_debian() { +cat > "$EXPORTDIR/qemu-$cpu"
>> <<!EOF
> 
> Again, !EOF is an unusual delimiter.

OK

> 
>> +qemu_set_binfmts() { +# probe cpu type +
>> host_family=$(qemu_get_family) + +# register the interpreter
>> for each cpu except for the native one + +for cpu in
>> ${qemu_target_list} ; do +magic=$(eval echo
>> \$${cpu}_magic) +mask=$(eval echo \$${cpu}_mask) +
>> family=$(eval echo \$${cpu}_family)
> 
> Use of eval is risky; fortunately, it looks like $qemu_target_list
> is under your control and can't be overridden by the user's
> environment to do something malicious.
> 
>> + +if [ "$magic" = "" -o "$mask" = "" -o "$family" = "" ]
>> ; then
> 
> "[ ... -o ... ]" is not portable.  Use "[ ... ] || [ ... ]"
> instead.

OK

Thank you!

Laurent

Re: [Qemu-devel] [PATCH v9 07/37] qapi: Improve generated event use of qapi visitor

2016-01-28 Thread Eric Blake

On 01/20/2016 08:19 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> All other successful clients of visit_start_struct() were paired
>> with an unconditional visit_end_struct(); but the generated
>> code for events was relying on qmp_output_visitor_cleanup() to
>> work on an incomplete visit.
> 

>> +++ b/scripts/qapi.py
>> @@ -1636,7 +1636,8 @@ def gen_err_check(label='out', skiperr=False):
>>   label=label)
>>
>>
>> -def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False):
>> +def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False,
>> + label='out'):
> 
> Probably clearer than label=None, but duplicates gen_err_check()'s
> default.  Fine with me.

Use of label=None resulted in literal "goto None;" in the generated file
(Python stringized it, rather than treating it as a hint to behave as if
the argument were not supplied).  So yes, I had to duplicate the default
value in both methods, to avoid having to do 'if label:' within the
implementations.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCHv2 00/10] Clean up page size handling for ppc 64-bit hash MMUs with TCG

2016-01-28 Thread Alexander Graf




On 01/27/2016 11:13 AM, David Gibson wrote:

Encoding of page sizes on 64-bit hash MMUs for Power is rather arcane,
involving control bits in both the SLB and HPTE.  At present we
support a few of the options, but far fewer than real hardware.

We're able to get away with that in practice, because guests use a
device tree property to determine which page sizes are available and
we are setting that to match.  However, the fact that the actual code
doesn't necessarily match what we put into the table of available page
sizes is another ugliness.

This series makes a number of cleanups to the page size handling.  The
upshot is that afterwards the softmmu code operates off the same page
size encoding table that is advertised to the guests, ensuring that
they will be in sync.

Finally, we extend the table of allowed sizes for POWER7 and POWER8 to
include the options allowed in hardware (including MPSS).  We can fix
other hash MMU based CPUs in future if anyone cares enough.

For a simple benchmark I timed fully booting then cleanly shutting
down a TCG guest (RHEL7.2 userspace with a recent upstream kernel).
Repeated 5 times on the current master branch, my current ppc-for-2.6
branch and this branch.  It looks like it improves speed, although the
difference is pretty much negligible:

master:       2m25 2m28 2m26 2m26 2m26
ppc-for-2.6:  2m26 2m25 2m26 2m27 2m25
this series:  2m20 2m23 2m23 2m25 2m21

Please review, and I'll fold into ppc-for-2.6 for my next pull.

Changes since v1:
   * Fix a couple of simple but serious bugs in logic
   * Did some rudimentary benchmarking
Changes since RFC:
   * Moved lookup of SLB encodings table from SLB lookup time to SLB
   store time


LGTM, apart from the comments that people already made. Please also 
provide changelogs in the individual patch files next time - it makes it 
easier for people who just try to see what changed from one version to 
another ;).


Reviewed-by: Alexander Graf 

Also, please just double sanity check that the code after your 
conversion still works well on 32bit hosts ;). I suppose you have a 
32bit build environment by now, so that should be quite easy to pull off.



Alex

Re: [Qemu-devel] [PATCH v2] ide: ahci: add check before calling dma_memory_unmap

2016-01-28 Thread Paolo Bonzini

On 28/01/2016 20:57, John Snow wrote:
> This is fine for now as it protects us against doing something stupid in
> a mechanical fashion, but I still wonder under which case we are
> "starting" the FIS or CLB engines without getting a valid address. (The
> unmap should only be happening when we /stop/ the engines, which implies
> they were started -- which points to a bug somewhere else, too.)

It can happen if both of them are registered to an MMIO address.  The
second map will fail.

Perhaps it's a better fix to switch FIS and CLB to
address_space_read/write, but this works as a simple fix for the SEGV.

Paolo

[Qemu-devel] [PATCH] build: Add include check on syscall.h

2016-01-28 Thread Lluís Vilanova

The LTTng tracing backend includes the system's "syscall.h", but QEMU
replaces it with its own for linux-user builds. This results in a double
include on some targets (when LTTng is enabled).

Signed-off-by: Lluís Vilanova 
---
 linux-user/aarch64/syscall.h  |5 +
 linux-user/alpha/syscall.h|5 +
 linux-user/arm/syscall.h  |4 
 linux-user/i386/syscall.h |5 +
 linux-user/m68k/syscall.h |4 
 linux-user/mips/syscall.h |4 
 linux-user/mips64/syscall.h   |4 
 linux-user/openrisc/syscall.h |5 +
 linux-user/ppc/syscall.h  |5 +
 linux-user/s390x/syscall.h|5 +
 linux-user/sh4/syscall.h  |5 +
 linux-user/sparc/syscall.h|5 +
 linux-user/sparc64/syscall.h  |5 +
 linux-user/x86_64/syscall.h   |5 +
 14 files changed, 66 insertions(+)

diff --git a/linux-user/aarch64/syscall.h b/linux-user/aarch64/syscall.h
index dc72a15..b2e63c0 100644
--- a/linux-user/aarch64/syscall.h
+++ b/linux-user/aarch64/syscall.h
@@ -1,3 +1,6 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
+
 struct target_pt_regs {
     uint64_t regs[31];
     uint64_t sp;
@@ -11,3 +14,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ   2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/alpha/syscall.h b/linux-user/alpha/syscall.h
index 245cff2..f3f7ee8 100644
--- a/linux-user/alpha/syscall.h
+++ b/linux-user/alpha/syscall.h
@@ -1,3 +1,6 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
+
 /* default linux values for the selectors */
 #define __USER_DS  (1)
 
@@ -255,3 +258,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ  4096
 #define TARGET_MLOCKALL_MCL_CURRENT 0x2000
 #define TARGET_MLOCKALL_MCL_FUTURE  0x4000
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/arm/syscall.h b/linux-user/arm/syscall.h
index 3844a96..795b99e 100644
--- a/linux-user/arm/syscall.h
+++ b/linux-user/arm/syscall.h
@@ -1,3 +1,5 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
 
 /* this struct defines the way the registers are stored on the
stack during a system call. */
@@ -48,3 +50,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ 2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/i386/syscall.h b/linux-user/i386/syscall.h
index 906aaac..527789b 100644
--- a/linux-user/i386/syscall.h
+++ b/linux-user/i386/syscall.h
@@ -1,3 +1,6 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
+
 /* default linux values for the selectors */
 #define __USER_CS  (0x23)
 #define __USER_DS  (0x2B)
@@ -150,3 +153,5 @@ struct target_vm86plus_struct {
 #define TARGET_MINSIGSTKSZ 2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/m68k/syscall.h b/linux-user/m68k/syscall.h
index 9218493..16db513 100644
--- a/linux-user/m68k/syscall.h
+++ b/linux-user/m68k/syscall.h
@@ -1,3 +1,5 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
 
 /* this struct defines the way the registers are stored on the
stack during a system call. */
@@ -23,3 +25,5 @@ struct target_pt_regs {
 #define TARGET_MLOCKALL_MCL_FUTURE  2
 
 void do_m68k_simcall(CPUM68KState *, int);
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/mips/syscall.h b/linux-user/mips/syscall.h
index 35ca23b..3d66419 100644
--- a/linux-user/mips/syscall.h
+++ b/linux-user/mips/syscall.h
@@ -1,3 +1,5 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
 
 /* this struct defines the way the registers are stored on the
stack during a system call. */
@@ -231,3 +233,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ 2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/mips64/syscall.h b/linux-user/mips64/syscall.h
index 6733107..850900b 100644
--- a/linux-user/mips64/syscall.h
+++ b/linux-user/mips64/syscall.h
@@ -1,3 +1,5 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
 
 /* this struct defines the way the registers are stored on the
stack during a system call. */
@@ -228,3 +230,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ  2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/openrisc/syscall.h b/linux-user/openrisc/syscall.h
index 8ac0365..dedec50 100644
--- a/linux-user/openrisc/syscall.h
+++ b/linux-user/openrisc/syscall.h
@@ -1,3 +1,6 @@
+#ifndef SYSCALL_H
+#define SYSCALL_H
+
 struct target_pt_regs {
 union {
 struct {
@@ -27,3 +30,5 @@ struct target_pt_regs {
 #define TARGET_MINSIGSTKSZ 2048
 #define TARGET_MLOCKALL_MCL_CURRENT 1
 #define TARGET_MLOCKALL_MCL_FUTURE  2
+
+#endif  /* SYSCALL_H */
diff --git a/linux-user/ppc/syscall.h b/linux-user/ppc/syscall.h
index 0daf5cd..eec5a5f 100644
--- a/linux-user/ppc/syscall.h
+++ b/linux-user/ppc/syscall.h
@@ -

Re: [Qemu-devel] [PATCH v2] linux-user: Original qemu-binfmt-conf.h is only able to write configuration into /proc/sys/fs/binfmt_misc, and the configuration is lost on reboot.

2016-01-28 Thread Eric Blake

On 01/28/2016 03:08 PM, Laurent Vivier wrote:

Subject line is TOOO long.  I suggest:

linux-user: Fix qemu-binfmt-conf.h to store config across reboot

> This script can configure debian and systemd services to restore
> configuration on reboot. Moreover, it is able to manage binfmt
> credential and to configure the path of the interpreter.
> 

> diff --git a/scripts/qemu-binfmt-conf.sh b/scripts/qemu-binfmt-conf.sh
> old mode 100644
> new mode 100755
> index 289b1a3..56bc88e
> --- a/scripts/qemu-binfmt-conf.sh
> +++ b/scripts/qemu-binfmt-conf.sh
> @@ -1,72 +1,314 @@
>  #!/bin/sh
>  # enable automatic i386/ARM/M68K/MIPS/SPARC/PPC/s390 program execution by 
> the kernel
>  

> +aarch64_magic='\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00'
> +aarch64_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
> +aarch64_family=arm
> +
> +qemu_get_family() {

> +
> +usage() {
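> +cat <<!EOF

Use of '!EOF' is an unusual heredoc delimiter; it fails miserably if
history expansion is enabled.

> +Usage: qemu-binfmt-conf.sh [--qemu-path PATH][--debian][--systemd CPU]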

> +   --credential: if yes, credential an security tokens are

s/an/and/

> +echo -n ""

'echo -n' is not portable.  Use 'printf' instead.

> +for CPU in $qemu_target_list ;
> +do
> +echo -n "$CPU "

and again.

> +done
> +echo

This loop results in trailing whitespace (not fatal, but nice to avoid
where possible).  Also, using a shell loop is a waste of effort when
you can just do:

printf "%s " $qemu_target_list

and get the same effect.
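
As a rough sketch of what I mean (the three-entry list is only a stand-in
for the real $qemu_target_list, and both function names are made up for
illustration): printf reuses its format for every argument, so no loop is
needed, and a second variant also avoids the trailing blank.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's real target list.
qemu_target_list="i386 arm aarch64"

# Print every CPU name followed by a space, then end the line.
# $qemu_target_list is deliberately unquoted so the shell splits it into
# one argument per CPU; printf reuses the "%s " format for each argument.
list_targets() {
    printf "%s " $qemu_target_list
    printf "\n"
}

# Variant without the trailing blank: load the names into the function's
# positional parameters and let "$*" join them with single spaces.
list_targets_trimmed() {
    set -- $qemu_target_list
    printf "%s\n" "$*"
}

list_targets
list_targets_trimmed
```

Both variants assume the default IFS, so that the unquoted expansion splits
on whitespace.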

> +}
> +
> +qemu_check_access() {
> +if [ ! -w "$1" ] ; then
> +echo "ERROR: cannot write to $1" 1>&2
> +exit 1

Checking whether a file is writable is often a TOCTTOU race; since you
have to handle failures to redirect to the file anyways (in case the
file changed between your check and the actual use), can you just skip
the check as redundant?
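
A minimal sketch of the alternative, assuming a hypothetical write_config
helper that is not part of this patch: attempt the write and report the
failure of the redirection itself, rather than testing -w beforehand.

```shell
#!/bin/sh
# write_config FILE CONTENT
# Hypothetical helper (not in the patch): there is no up-front '[ -w ]'
# test; the result of the redirection itself decides success or failure.
write_config() {
    # The braces let 2>/dev/null also silence the shell's own
    # "cannot create" message when the redirection fails.
    if ! { printf '%s\n' "$2" > "$1"; } 2>/dev/null; then
        echo "ERROR: cannot write to $1" 1>&2
        return 1
    fi
}
```

Whether the target changes before or after any check no longer matters
here, because the only thing examined is the outcome of the write.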


> +qemu_check_debian() {
> +if [ ! -e /etc/debian_version ] ; then
> +echo "WARNING: your system is not a Debian based distro" 1>&2
> +elif ! installed_dpkg binfmt-support ; then
> +echo "WARNING: package binfmt-support is needed !" 1>&2

Trailing '!' in error messages is shouting at the user; I tend to avoid
them.  But if you must use it, in English there is no space between the
final word and the punctuation: s/needed !/needed!/

> +fi
> +qemu_check_access "$EXPORTDIR"
> +}
> +
> +qemu_check_systemd() {
> +if ! systemctl -q is-enabled systemd-binfmt.service ; then
> +echo "WARNING: systemd-binfmt.service is missing or disabled !" 1>&2

and again

> +qemu_generate_debian() {
> +cat > "$EXPORTDIR/qemu-$cpu" <<!EOF
> +package qemu-$cpu

Again, !EOF is an unusual delimiter.
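
For illustration only, here is a sketch with a plain EOF delimiter (the
function names and the cpu value are hypothetical).  Quoting the delimiter
additionally turns off expansion in the body, which may or may not be what
you want for these templates:

```shell
#!/bin/sh
cpu=arm   # hypothetical value, for illustration only

# Unquoted delimiter: $cpu is expanded inside the heredoc body.
gen_expanded() {
    cat <<EOF
package qemu-$cpu
EOF
}

# Quoted delimiter: the body is copied literally, '$cpu' included.
gen_literal() {
    cat <<'EOF'
package qemu-$cpu
EOF
}

gen_expanded
gen_literal
```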

> +qemu_set_binfmts() {
> +# probe cpu type
> +host_family=$(qemu_get_family)
> +
> +# register the interpreter for each cpu except for the native one
> +
> +for cpu in ${qemu_target_list} ; do
> +magic=$(eval echo \$${cpu}_magic)
> +mask=$(eval echo \$${cpu}_mask)
> +family=$(eval echo \$${cpu}_family)

Use of eval is risky; fortunately, it looks like $qemu_target_list is
under your control and can't be overridden by the user's environment to
do something malicious.
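
If the list ever did come from outside, one possible way to keep eval safe
is to check the name first.  This is only a sketch: lookup and the sample
*_family values are hypothetical, not code from the patch.

```shell
#!/bin/sh
# Hypothetical per-CPU variables, shaped like the ones in the patch.
arm_family=arm
aarch64_family=arm
i386_family=i386

# lookup CPU SUFFIX: print the value of the variable named CPU_SUFFIX.
# Anything that is not a plain identifier is refused before it reaches
# eval, so a hostile name cannot smuggle in a command.
lookup() {
    name="${1}_${2}"
    case "$name" in
        [0-9]*|*[!A-Za-z0-9_]*)
            echo "ERROR: bad variable name: $name" 1>&2
            return 1
            ;;
    esac
    eval "printf '%s\n' \"\${$name}\""
}

lookup aarch64 family    # prints: arm
```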

> +
> +if [ "$magic" = "" -o "$mask" = "" -o "$family" = "" ] ; then

"[ ... -o ... ]" is not portable.  Use "[ ... ] || [ ... ]" instead.
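
A sketch of the portable spelling, wrapped in a hypothetical missing_field
helper (the sample values are made up).  POSIX marks -a and -o as
obsolescent and leaves test with more than four arguments unspecified,
which is why chaining separate [ ] commands is safer:

```shell
#!/bin/sh
# missing_field MAGIC MASK FAMILY
# Hypothetical helper: succeeds (returns 0) when at least one of the
# three values is empty. Each [ ] gets exactly two arguments.
missing_field() {
    [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]
}

# Made-up sample entry with an empty mask.
magic='\x7fELF'
mask=''
family=arm

if missing_field "$magic" "$mask" "$family"; then
    echo "skipping: incomplete entry" 1>&2
fi
```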

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH v4 4/5] util: [ppc] Use new error_report_abort() instead of abort()

2016-01-28 Thread Lluís Vilanova

Eric Blake writes:

> On 01/28/2016 02:41 PM, Lluís Vilanova wrote:
>> Signed-off-by: Lluís Vilanova 
>> ---
>> target-ppc/kvm.c|4 ++--
>> target-ppc/kvm_ppc.h|   15 +--
>> target-ppc/mmu-hash32.c |5 +++--
>> target-ppc/mmu_helper.c |3 +--
>> 4 files changed, 15 insertions(+), 12 deletions(-)
>> 

>> +++ b/target-ppc/kvm_ppc.h
>> @@ -9,6 +9,9 @@
>> #ifndef __KVM_PPC_H__
>> #define __KVM_PPC_H__
>> 
>> +#include "qemu/error-report.h"
>> +
>> +
>> #define TYPE_HOST_POWERPC_CPU "host-" TYPE_POWERPC_CPU
>> 
>> #ifdef CONFIG_KVM
>> @@ -220,36 +223,36 @@ static inline int kvmppc_get_htab_fd(bool write)
>> static inline int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize,
>> int64_t max_ns)
>> {
>> -abort();
>> +error_report_abort(" ");

> Aborting with an empty string with trailing spaces feels awkward.
> Either this should be a real message, or abort() was just fine.

See my other mail for why (I think) it makes sense to abort without an
additional message. Also, an empty string makes gcc grumpy with a warning.


Cheers,
  Lluis

Re: [Qemu-devel] [PATCH v4 5/5] doc: Introduce coding style for errors

2016-01-28 Thread Lluís Vilanova

Eric Blake writes:

> On 01/28/2016 02:41 PM, Lluís Vilanova wrote:
>> Gives some general guidelines for reporting errors in QEMU.
>> 
>> Signed-off-by: Lluís Vilanova 
>> ---
>> HACKING |   33 +
>> 1 file changed, 33 insertions(+)
>> 
>> diff --git a/HACKING b/HACKING
>> index 12fbc8a..f5783d4 100644
>> --- a/HACKING
>> +++ b/HACKING
>> @@ -157,3 +157,36 @@ painful. These are:
>> * you may assume that integers are 2s complement representation
>> * you may assume that right shift of a signed integer duplicates
>> the sign bit (ie it is an arithmetic shift, not a logical shift)
>> +
>> +7. Error reporting
>> +
>> +QEMU provides various mechanisms for reporting errors using a uniform 
>> format,
>> +ensuring the user will receive them (e.g., shown in QMP when necessary). You
>> +should use one of these mechanisms instead of manually reporting them 
>> (i.e., do
>> +not use 'printf()', 'exit()' or 'abort()').

> abort() for unreachable code may be okay, but I'm not sure how to word
> that.  Maybe "avoid use 'printf()' or 'exit()', and minimize use of
> 'abort()' situations that should be unreachable code".

Hmmm. I was thinking it'd be more informative to always use error_report_abort()
instead of abort:

* The program name is shown (great for multi-app logs)
* The aborting location is shown (a bit useful to quickly see where it comes
  from)
* A message can be provided

I think the message should be optional, since error_report_abort() already
provides more information than plain abort().

But if that seems unreasonable, I can reword it as:

  QEMU provides various mechanisms for reporting errors using a uniform format,
  ensuring the user will receive them (e.g., shown in QMP when necessary). You
  should use one of these mechanisms instead of manually reporting them; i.e.,
  do not use 'printf()' nor 'exit()', and minimize the use of 'abort()' to
  situations where code should be unreachable and an error message does not make
  sense.


> May be worth mentioning that if the user can trigger it (command line,
> hotplug, etc) then we want fatal; if it represents a programming bug
> that the user should not be able to trigger, then abort is okay.

So true. I'll add these.


>> +
>> +7.1. Simple error messages
>> +
>> +The 'error_report*()' functions in "include/qemu/error-report.h" will
>> +immediately report error messages to the user.
>> +
>> +WARNING: Do *not* use 'error_report_fatal()' or 'error_report_abort()' for
>> +errors that are (or can be) triggered by guest code (e.g., some 
>> unimplimented

> s/unimplimented/unimplemented/

Fixed in v5 (sorry I forgot about it).


>> +corner case in guest code translation or device code). Otherwise that can be
>> +abused by guest code to terminate QEMU. Instead, you should use
>> +'error_report()'.
>> +
>> +7.2. Errors in user inputs
>> +
>> +The 'loc_*()' functions in "include/qemu/error-report.h" will extend the
>> +messages from 'error_report*()' with references to locations in inputs 
>> provided
>> +by the user (e.g., command line arguments or configuration files).
>> +
>> +7.3. More complex error management
>> +
>> +The functions in "include/qapi/error.h" can be used to accumulate error 
>> messages
>> +in an 'Error' object, which can be propagated up the call chain where it is
>> +finally reported.
>> +
>> +WARNING: The special 'error_fatal' and 'error_abort' objects follow the same
>> +constrains as the 'error_report_fatal' and 'error_report_abort' functions.

> s/constrains/constraints/

Will add.


Thanks,
  Lluis

Re: [Qemu-devel] [RFC 0/3] Draft implementation of HPT resizing (qemu side)

2016-01-28 Thread Paul Mackerras

On Thu, Jan 28, 2016 at 10:04:58PM +0100, Alexander Graf wrote:
> 
> Does this work on real hardware? Say, a G5?

Do you mean, could a bare-metal kernel change its hashed page table?
It could - it would have to allocate a new table, copy over the bolted
mappings (at least), switch to real mode, change SDR1, switch back to
virtual mode.

Paul.

[Qemu-devel] [PATCH v2] linux-user: Original qemu-binfmt-conf.h is only able to write configuration into /proc/sys/fs/binfmt_misc, and the configuration is lost on reboot.

2016-01-28 Thread Laurent Vivier

This script can configure debian and systemd services to restore
configuration on reboot. Moreover, it is able to manage binfmt
credential and to configure the path of the interpreter.

The list of supported CPUs is:

i386 i486 alpha arm sparc32plus ppc ppc64 ppc64le
m68k mips mipsel mipsn32 mipsn32el mips64 mips64el
sh4 sh4eb s390x aarch64

Usage: qemu-binfmt-conf.sh [--qemu-path PATH][--debian][--systemd CPU]
   [--help][--credential yes|no][--exportdir PATH]

   Configure binfmt_misc to use qemu interpreter

   --help:       display this usage
   --qemu-path:  set path to qemu interpreter (/usr/local/bin)
   --debian:     don't write into /proc,
                 instead generate update-binfmts templates
   --systemd:    don't write into /proc,
                 instead generate file for systemd-binfmt.service
                 for the given CPU
   --exportdir:  define where to write configuration files
                 (default: /etc/binfmt.d or /usr/share/binfmts)
   --credential: if yes, credential an security tokens are
                 calculated according to the binary to interpret

To import templates with update-binfmts, use :

sudo update-binfmts --importdir /usr/share/binfmts --import qemu-CPU

To remove interpreter, use :

sudo update-binfmts --package qemu-CPU --remove qemu-CPU /usr/local/bin

With systemd, binfmt files are loaded by systemd-binfmt.service

The environment variable HOST_ARCH overrides 'uname', so that configuration
files can be generated for an architecture other than the current one.

Signed-off-by: Laurent Vivier 
---
v2: replace some ERRORS by WARNINGS to be able to use the script inside a
    package build
    check only the right to write in the directory, no need to be root
    merge systemd and binfmt_misc configuration generation
    s/qemu_generate_packages/qemu_generate_debian/
    add support of HOST_ARCH from debian, and update CPU families.
    allow to use --exportdir with --systemd and update "Usage".

 scripts/qemu-binfmt-conf.sh | 380 
 1 file changed, 311 insertions(+), 69 deletions(-)
 mode change 100644 => 100755 scripts/qemu-binfmt-conf.sh

diff --git a/scripts/qemu-binfmt-conf.sh b/scripts/qemu-binfmt-conf.sh
old mode 100644
new mode 100755
index 289b1a3..56bc88e
--- a/scripts/qemu-binfmt-conf.sh
+++ b/scripts/qemu-binfmt-conf.sh
@@ -1,72 +1,314 @@
 #!/bin/sh
 # enable automatic i386/ARM/M68K/MIPS/SPARC/PPC/s390 program execution by the 
kernel
 
-# load the binfmt_misc module
-if [ ! -d /proc/sys/fs/binfmt_misc ]; then
-  /sbin/modprobe binfmt_misc
-fi
-if [ ! -f /proc/sys/fs/binfmt_misc/register ]; then
-  mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
-fi
-
-# probe cpu type
-cpu=`uname -m`
-case "$cpu" in
-  i386|i486|i586|i686|i86pc|BePC|x86_64)
-cpu="i386"
-  ;;
-  m68k)
-cpu="m68k"
-  ;;
-  mips*)
-cpu="mips"
-  ;;
-  "Power Macintosh"|ppc|ppc64)
-cpu="ppc"
-  ;;
-  armv[4-9]*)
-cpu="arm"
-  ;;
-esac
-
-# register the interpreter for each cpu except for the native one
-if [ $cpu != "i386" ] ; then
-echo 
':i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/local/bin/qemu-i386:'
 > /proc/sys/fs/binfmt_misc/register
-echo 
':i486:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x06\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/local/bin/qemu-i386:'
 > /proc/sys/fs/binfmt_misc/register
-fi
-if [ $cpu != "alpha" ] ; then
-echo 
':alpha:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x26\x90:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/local/bin/qemu-alpha:'
 > /proc/sys/fs/binfmt_misc/register
-fi
-if [ $cpu != "arm" ] ; then
-echo   
':arm:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/local/bin/qemu-arm:'
 > /proc/sys/fs/binfmt_misc/register
-echo   
':armeb:M::\x7fELF\x01\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff:/usr/local/bin/qemu-armeb:'
 > /proc/sys/fs/binfmt_misc/register
-fi
-if [ $cpu != "aarch64" ] ; then
-echo 
':aarch64:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/local/bin/qemu-aarch64:'
 > /proc/sys/fs/binfmt_misc/register
-fi
-if [ $cpu != "sparc" ] ; then
-echo   
':sparc:M::\x7fELF\x01\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x02:\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff:/usr/local/bin/qemu-sparc:'
 > /proc/sys/fs/binfmt_misc/regi

Re: [Qemu-devel] [PATCH v5 5/5] doc: Introduce coding style for errors

2016-01-28 Thread Eric Blake

On 01/28/2016 02:53 PM, Lluís Vilanova wrote:
> Gives some general guidelines for reporting errors in QEMU.
> 
> Signed-off-by: Lluís Vilanova 
> ---
>  HACKING |   33 +
>  1 file changed, 33 insertions(+)

I'm not sure if my v4 review crossed paths with this, but I still see typos:

> +
> +WARNING: The special 'error_fatal' and 'error_abort' objects follow the same
> +constrains as the 'error_report_fatal' and 'error_report_abort' functions.

s/constrains/constraints/

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [PATCH v2 02/16] register: Add Register API

2016-01-28 Thread Alistair Francis

On Wed, Jan 27, 2016 at 6:46 AM, KONRAD Frederic
 wrote:
> Hi,
>
>
> Le 19/01/2016 23:34, Alistair Francis a écrit :
>>
>> From: Peter Crosthwaite 
>>
>> This API provides some encapsulation of registers and factors out some
>> common functionality to common code. Bits of device state (usually MMIO
>> registers) often have all sorts of access restrictions and semantics
>> associated with them. This API allows you to define what those
>> restrictions are on a bit-by-bit basis.
>>
>> Helper functions are then used to access the register which observe the
>> semantics defined by the RegisterAccessInfo struct.
>>
>> Some features:
>> Bits can be marked as read_only (ro field)
>> Bits can be marked as write-1-clear (w1c field)
>> Bits can be marked as reserved (rsvd field)
>> Reset values can be defined (reset)
>> Bits can throw guest errors when written certain values (ge0, ge1)
>> Bits can throw unimp errors when written certain values (ui0, ui1)
>> Bits can be marked clear on read (cor)
>> Pre and post action callbacks can be added to read and write ops
>> Verbose debugging info can be enabled/disabled
>>
>> Useful for defining device register spaces in a data driven way. Cuts
>> down on a lot of the verbosity and repetition in the switch-case blocks
>> in the standard foo_mmio_read/write functions.
>>
>> Also useful for automated generation of device models from hardware
>> design sources.
>>
>> Signed-off-by: Peter Crosthwaite 
>> Signed-off-by: Alistair Francis 
>> ---
>> changed from v2:
>> Simplified! Removed pre-read, nwx, wo
>> Removed byte loops (Gerd Review)
>> Made data pointer optional
>> Added fast paths for simple registers
>> Moved into hw/core and include/hw (Paolo Review)
>> changed from v1:
>> Rebranded as the "Register API" - I think that's probably what it is.
>> Near total rewrite of implementation.
>> De-arrayified reset (this is client/Memory APIs job).
>> Moved out of bitops into its own file (Blue review)
>> Added debug, the register pointer, and prefix to a struct (Blue Review)
>> Made 64-bit to play friendlier with memory API (Blue review)
>> Made backend storage uint8_t (MST review)
>> Added read/write callbacks (Blue review)
>> Added ui0, ui1 (Blue review)
>> Moved re-purposed width (now byte width defining actual storage size)
>> Arrayified ge0, ge1 (ui0, ui1 too) and added .reason
>> Added wo field (not an April fools joke - this has genuine meaning here)
>> Added we mask to write accessor
>>
>>   hw/core/Makefile.objs |   1 +
>>   hw/core/register.c| 186
>> ++
>>   include/hw/register.h | 132 +++
>>   3 files changed, 319 insertions(+)
>>   create mode 100644 hw/core/register.c
>>   create mode 100644 include/hw/register.h
>>
>> diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
>> index abb3560..bf95db5 100644
>> --- a/hw/core/Makefile.objs
>> +++ b/hw/core/Makefile.objs
>> @@ -14,4 +14,5 @@ common-obj-$(CONFIG_SOFTMMU) += machine.o
>>   common-obj-$(CONFIG_SOFTMMU) += null-machine.o
>>   common-obj-$(CONFIG_SOFTMMU) += loader.o
>>   common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
>> +common-obj-$(CONFIG_SOFTMMU) += register.o
>>   common-obj-$(CONFIG_PLATFORM_BUS) += platform-bus.o
>> diff --git a/hw/core/register.c b/hw/core/register.c
>> new file mode 100644
>> index 000..02a4376
>> --- /dev/null
>> +++ b/hw/core/register.c
>> @@ -0,0 +1,186 @@
>> +/*
>> + * Register Definition API
>> + *
>> + * Copyright (c) 2013 Xilinx Inc.
>> + * Copyright (c) 2013 Peter Crosthwaite 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "hw/register.h"
>> +#include "qemu/log.h"
>> +
>> +static inline void register_write_log(RegisterInfo *reg, int dir, uint64_t val,
>> +  int mask, const char *msg,
>> +  const char *reason)
>> +{
>> +qemu_log_mask(mask, "%s:%s bits %#" PRIx64 " %s write of %d%s%s\n",
>> +  reg->prefix, reg->access->name, val, msg, dir,
>> +  reason ? ": " : "", reason ? reason : "");
>> +}
>> +
>> +static inline void register_write_val(RegisterInfo *reg, uint64_t val)
>> +{
>> +if (!reg->data) {
>> +return;
>> +}
>> +switch (reg->data_size) {
>> +case 1:
>> +*(uint8_t *)reg->data = val;
>> +break;
>> +case 2:
>> +*(uint16_t *)reg->data = val;
>> +break;
>> +case 4:
>> +*(uint32_t *)reg->data = val;
>> +break;
>> +case 8:
>> +*(uint64_t *)reg->data = val;
>> +break;
>> +default:
>> +abort();
>> +}
>> +}
>> +
>> +static inline uint64_t register_read_val(RegisterInfo *reg)
>> +{
>> +switch (reg->data_size) {
>> +case 1:
>> +return *(uint8_t *)reg->data;
>> +case 2:
>> +return *(uint16_t *)reg->data;
>> +case 4:
>

[Qemu-devel] [PATCH v5 5/5] doc: Introduce coding style for errors

2016-01-28 Thread Lluís Vilanova

Gives some general guidelines for reporting errors in QEMU.

Signed-off-by: Lluís Vilanova 
---
 HACKING |   33 +
 1 file changed, 33 insertions(+)

diff --git a/HACKING b/HACKING
index 12fbc8a..aecc77c 100644
--- a/HACKING
+++ b/HACKING
@@ -157,3 +157,36 @@ painful. These are:
  * you may assume that integers are 2s complement representation
  * you may assume that right shift of a signed integer duplicates
the sign bit (ie it is an arithmetic shift, not a logical shift)
+
+7. Error reporting
+
+QEMU provides various mechanisms for reporting errors using a uniform format,
+ensuring the user will receive them (e.g., shown in QMP when necessary). You
+should use one of these mechanisms instead of manually reporting them (i.e., do
+not use 'printf()', 'exit()' or 'abort()').
+
+7.1. Simple error messages
+
+The 'error_report*()' functions in "include/qemu/error-report.h" will
+immediately report error messages to the user.
+
+WARNING: Do *not* use 'error_report_fatal()' or 'error_report_abort()' for
+errors that are (or can be) triggered by guest code (e.g., some unimplemented
+corner case in guest code translation or device code). Otherwise that can be
+abused by guest code to terminate QEMU. Instead, you should use
+'error_report()'.
+
+7.2. Errors in user inputs
+
+The 'loc_*()' functions in "include/qemu/error-report.h" will extend the
+messages from 'error_report*()' with references to locations in inputs provided
+by the user (e.g., command line arguments or configuration files).
+
+7.3. More complex error management
+
+The functions in "include/qapi/error.h" can be used to accumulate error messages
+in an 'Error' object, which can be propagated up the call chain where it is
+finally reported.
+
+WARNING: The special 'error_fatal' and 'error_abort' objects follow the same
+constrains as the 'error_report_fatal' and 'error_report_abort' functions.
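
Section 7.3 above describes accumulating a message in an error object and
propagating it up the call chain. The pattern can be sketched in plain C as
follows. This is only an illustration under stated assumptions: SketchError,
sketch_error_set(), parse_cpu_count() and configure_machine() are hypothetical
names invented here and are not the API declared in "include/qapi/error.h".

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error object; it only carries a formatted message. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Store a formatted message in *errp, if the caller asked for one. */
static void sketch_error_set(SketchError **errp, const char *fmt, ...)
{
    va_list ap;
    SketchError *err;

    if (!errp) {
        return;                 /* caller is not interested in details */
    }
    err = malloc(sizeof(*err));
    if (!err) {
        abort();                /* out of memory while reporting */
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Lowest level: detects the problem and fills in the error object. */
static int parse_cpu_count(const char *arg, SketchError **errp)
{
    char *end;
    long n = strtol(arg, &end, 10);

    if (end == arg || *end != '\0' || n < 1 || n > 1024) {
        sketch_error_set(errp, "invalid CPU count '%s'", arg);
        return -1;
    }
    return (int)n;
}

/* Middle level: does not report anything, it only propagates. */
static int configure_machine(const char *cpus, SketchError **errp)
{
    int n = parse_cpu_count(cpus, errp);

    if (n < 0) {
        return -1;              /* message is already stored in *errp */
    }
    return n;
}
```

In this sketch the top-level caller owns the error object: it decides whether
to print the message, and it frees the object afterwards.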

Re: [Qemu-devel] [PATCH v7 13/13] hmp: Add "info ppc-cpu-cores" command

2016-01-28 Thread Eric Blake

On 01/27/2016 10:49 PM, Bharata B Rao wrote:
> This is the hmp equivalent of "query ppc-cpu-cores"

The QMP command is spelled "query-ppc-cpu-cores".

Most HMP commands prefer '_' over '-'; so this should be 'info
ppc_cpu_cores'.

> 
> Signed-off-by: Bharata B Rao 
> ---
>  hmp-commands-info.hx | 16 
>  hmp.c| 31 +++
>  hmp.h|  1 +
>  3 files changed, 48 insertions(+)
> 

> +++ b/hmp.c
> @@ -2375,3 +2375,34 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict)
>  
>  qapi_free_RockerOfDpaGroupList(list);
>  }
> +
> +void hmp_info_ppc_cpu_cores(Monitor *mon, const QDict *qdict)
> +{
> +Error *err = NULL;
> +PPCCPUCoreList *ppc_cpu_core_list = qmp_query_ppc_cpu_cores(&err);
> +PPCCPUCoreList *s = ppc_cpu_core_list;
> +CpuInfoList *thread;
> +
> +while (s) {
> +monitor_printf(mon, "PowerPC CPU device: \"%s\"\n",
> +   s->value->id ? s->value->id : "");

This should probably be checking s->value->has_id rather than assuming
that s->value->id will be NULL when not present (well, I'd like to clean
up qapi to avoid the need for has_FOO when FOO is a pointer, but we're
not there yet).
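
A minimal sketch of that check, assuming a hypothetical CoreInfo struct that
stands in for the QAPI-generated type (the struct and the helper name are
illustrative only, not taken from the patch):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a generated struct with an optional member. */
typedef struct CoreInfo {
    bool has_id;    /* true only when 'id' was actually provided */
    char *id;       /* must not be relied upon when has_id is false */
} CoreInfo;

/* Return the id when present, otherwise an empty string. */
static const char *core_id_or_empty(const CoreInfo *info)
{
    return info->has_id ? info->id : "";
}
```

The point of testing the flag rather than the pointer is that the helper never
depends on what an absent member happens to contain.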

> +monitor_printf(mon, "  hotplugged: %s\n",
> +   s->value->hotplugged ? "true" : "false");
> +monitor_printf(mon, "  hotpluggable: %s\n",
> +   s->value->hotpluggable ? "true" : "false");
> +monitor_printf(mon, "  Threads:\n");
> +for (thread = s->value->threads; thread; thread = thread->next) {
> +monitor_printf(mon, "CPU #%" PRId64 ":", thread->value->CPU);
> +monitor_printf(mon, " nip=0x%016" PRIx64,
> +   thread->value->u.ppc->nip);

This uses value->u.ppc without first checking that the discriminator
value->arch is set to CPU_INFO_ARCH_PPC; could that be a problem down
the road?
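
To illustrate the concern, here is a sketch of guarding a union access with its
discriminator. The types below (Arch, ThreadInfo, PPCInfo, X86Info) and the
helper thread_get_nip() are hypothetical stand-ins, not the generated
CpuInfo definitions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical discriminator and per-architecture payloads. */
typedef enum { ARCH_X86, ARCH_PPC, ARCH_OTHER } Arch;
typedef struct { uint64_t pc; } X86Info;
typedef struct { uint64_t nip; } PPCInfo;

typedef struct ThreadInfo {
    Arch arch;                  /* tells which union member is valid */
    union {
        X86Info *x86;
        PPCInfo *ppc;
    } u;
} ThreadInfo;

/* Read nip only when the discriminator says the PPC member is in use. */
static bool thread_get_nip(const ThreadInfo *t, uint64_t *nip)
{
    if (t->arch != ARCH_PPC || t->u.ppc == NULL) {
        return false;
    }
    *nip = t->u.ppc->nip;
    return true;
}
```

A caller that gets false back can print a placeholder instead of reading a
union member that belongs to another architecture.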

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [RFC][PATCH v4 0/5] utils: Improve and document error reporting

2016-01-28 Thread Lluís Vilanova

Please ignore this version; I forgot to fix a typo (fixed in v5).

Cheers,
  Lluis

Lluís Vilanova writes:

> Adds leaner error-reporting functions for simple cases, and documents the
> purpose of the different facilities available in QEMU.

> Although not all printf+exit/abort are replaced with the proper functions, a few
> are ported as an example.


> Changes in v4
> =============

> * Introduce 'error_report_fatal()' and 'error_report_abort()' functions
>   [suggested by Thomas Huth].
> * Replace all existing uses of 'error_setg(error_fatal)' and
>   'error_setg(error_abort)' with 'error_report_fatal()' and
>   'error_report_abort()'.
> * Replace all uses of 'exit()' with 'error_report_fatal()' in 'target-ppc'.
> * Replace all uses of 'abort()' with 'error_report_abort()' in 'target-ppc'.

> Changes in v3
> =============

> * Drop special object 'error_warn' in favour of raw 'error_report()'
>   [suggested by Markus Armbruster].


> Changes in v2
> =============

> * Split in two patches.
> * Explicitly add a warning error object.


> Signed-off-by: Lluís Vilanova 
> ---

> Lluís Vilanova (5):
>   util: Introduce error reporting functions with fatal/abort
>   util: Use new error_report_fatal/abort instead of error_setg(&error_fatal/abort)
>   util: [ppc] Use new error_report_fatal() instead of exit()
>   util: [ppc] Use new error_report_abort() instead of abort()
>   doc: Introduce coding style for errors


>  HACKING |   33 ++
>  hw/block/fdc.c  |6 ++-
>  hw/ppc/spapr.c  |8 ++--
>  hw/ppc/spapr_drc.c  |2 +
>  include/qemu/error-report.h |   19 ++
>  target-ppc/kvm.c|9 ++---
>  target-ppc/kvm_ppc.h|   15 +---
>  target-ppc/mmu-hash32.c |5 ++-
>  target-ppc/mmu_helper.c |3 +-
>  target-ppc/translate.c  |7 ++--
>  target-ppc/translate_init.c |   80 
> +--
>  util/error.c|9 ++---
>  util/qemu-error.c   |   33 ++
>  13 files changed, 155 insertions(+), 74 deletions(-)


> To: qemu-devel@nongnu.org
> Cc: Stefan Hajnoczi 
> Cc: Dr. David Alan Gilbert 
> Cc: Thomas Huth 
> Cc: Markus Armbruster 
> Cc: Eric Blake

[Qemu-devel] [PATCH v5 4/5] util: [ppc] Use new error_report_abort() instead of abort()

2016-01-28 Thread Lluís Vilanova

Signed-off-by: Lluís Vilanova 
---
 target-ppc/kvm.c|4 ++--
 target-ppc/kvm_ppc.h|   15 +--
 target-ppc/mmu-hash32.c |5 +++--
 target-ppc/mmu_helper.c |3 +--
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 098a40d..e7596a2 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -587,7 +587,7 @@ static void kvm_get_one_spr(CPUState *cs, uint64_t id, int spr)
 
 default:
 /* Don't handle this size yet */
-abort();
+error_report_abort("Unhandled size: %d", id & KVM_REG_SIZE_MASK);
 }
 }
 }
@@ -617,7 +617,7 @@ static void kvm_put_one_spr(CPUState *cs, uint64_t id, int spr)
 
 default:
 /* Don't handle this size yet */
-abort();
+error_report_abort("Unhandled size: %d", id & KVM_REG_SIZE_MASK);
 }
 
     ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 5e1333d..07ff3fc 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -9,6 +9,9 @@
 #ifndef __KVM_PPC_H__
 #define __KVM_PPC_H__
 
+#include "qemu/error-report.h"
+
+
 #define TYPE_HOST_POWERPC_CPU "host-" TYPE_POWERPC_CPU
 
 #ifdef CONFIG_KVM
@@ -220,36 +223,36 @@ static inline int kvmppc_get_htab_fd(bool write)
 static inline int kvmppc_save_htab(QEMUFile *f, int fd, size_t bufsize,
int64_t max_ns)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline int kvmppc_load_htab_chunk(QEMUFile *f, int fd, uint32_t index,
  uint16_t n_valid, uint16_t n_invalid)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline uint64_t kvmppc_hash64_read_pteg(PowerPCCPU *cpu,
target_ulong pte_index)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline void kvmppc_hash64_free_pteg(uint64_t token)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline void kvmppc_hash64_write_pte(CPUPPCState *env,
target_ulong pte_index,
target_ulong pte0, target_ulong 
pte1)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline bool kvmppc_has_cap_fixup_hcalls(void)
 {
-abort();
+error_report_abort(" ");
 }
 
 static inline int kvmppc_enable_hwrng(void)
diff --git a/target-ppc/mmu-hash32.c b/target-ppc/mmu-hash32.c
index a00ae3c..9d1cc33 100644
--- a/target-ppc/mmu-hash32.c
+++ b/target-ppc/mmu-hash32.c
@@ -20,6 +20,7 @@
 
 #include "cpu.h"
 #include "exec/helper-proto.h"
+#include "qemu/error-report.h"
 #include "sysemu/kvm.h"
 #include "kvm_ppc.h"
 #include "mmu-hash32.h"
@@ -55,7 +56,7 @@ static int ppc_hash32_pp_prot(int key, int pp, int nx)
 break;
 
 default:
-abort();
+error_report_abort("Unhandled pp: %d", pp);
 }
 } else {
 switch (pp) {
@@ -73,7 +74,7 @@ static int ppc_hash32_pp_prot(int key, int pp, int nx)
 break;
 
 default:
-abort();
+error_report_abort("Unhandled pp: %d", pp);
 }
 }
 if (nx == 0) {
diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index 5217691..7ded975 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -1349,8 +1349,7 @@ static inline int check_physical(CPUPPCState *env, mmu_ctx_t *ctx,
 
 default:
 /* Caller's checks mean we should never get here for other models */
-abort();
-return -1;
+error_report_abort("Unhandled MMU model: %d", env->mmu_model);
 }
 
 return ret;

[Qemu-devel] [PATCH v5 2/5] util: Use new error_report_fatal/abort instead of error_setg(&error_fatal/abort)

2016-01-28 Thread Lluís Vilanova

Replaces all direct uses of 'error_setg(&error_fatal/abort)' with
'error_report_fatal/abort'. Also reimplements the former on top of the
latter.

Signed-off-by: Lluís Vilanova 
---
 hw/block/fdc.c |6 +++---
 hw/ppc/spapr.c |8 
 hw/ppc/spapr_drc.c |2 +-
 util/error.c   |9 +++--
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index e3b0e1e..8f0c947 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -347,9 +347,9 @@ static int pick_geometry(FDrive *drv)
 
 /* No match of any kind found -- fd_format is misconfigured, abort. */
 if (match == -1) {
-error_setg(&error_abort, "No candidate geometries present in table "
-   " for floppy drive type '%s'",
-   FloppyDriveType_lookup[drv->drive]);
+error_report_abort("No candidate geometries present in table "
+   " for floppy drive type '%s'",
+   FloppyDriveType_lookup[drv->drive]);
 }
 
 parse = &(fd_formats[match]);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 50e5a26..a5afea1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1031,7 +1031,7 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
  * For HV KVM, host kernel will return -ENOMEM when requested
  * HTAB size can't be allocated.
  */
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_report_abort("Failed to allocate HTAB of requested size, try with smaller maxmem");
 } else if (shift > 0) {
 /*
  * Kernel handles htab, we don't need to allocate one
@@ -1040,7 +1040,7 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
  * but we don't allow booting of such guests.
  */
 if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_report_abort("Failed to allocate HTAB of requested size, try with smaller maxmem");
 }
 
 spapr->htab_shift = shift;
@@ -1071,10 +1071,10 @@ static void spapr_reset_htab(sPAPRMachineState *spapr)
 
 shift = kvmppc_reset_htab(spapr->htab_shift);
 if (shift < 0) {
-error_setg(&error_abort, "Failed to reset HTAB");
+error_report_abort("Failed to reset HTAB");
 } else if (shift > 0) {
 if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Requested HTAB allocation failed during reset");
+error_report_abort("Requested HTAB allocation failed during reset");
 }
 
 /* Tell readers to update their file descriptor */
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index dccb908..0d8f5b4 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -322,7 +322,7 @@ static void prop_get_fdt(Object *obj, Visitor *v, void *opaque,
 break;
 }
 default:
-error_setg(&error_abort, "device FDT in unexpected state: %d", tag);
+error_report_abort("device FDT in unexpected state: %d", tag);
 }
 fdt_offset = fdt_offset_next;
 } while (fdt_depth != 0);
diff --git a/util/error.c b/util/error.c
index 57303fd..b8a9120 100644
--- a/util/error.c
+++ b/util/error.c
@@ -30,15 +30,12 @@ Error *error_fatal;
 
 static void error_handle_fatal(Error **errp, Error *err)
 {
+/* None of them has a hint, so error_report_err() is not necessary here */
 if (errp == &error_abort) {
-fprintf(stderr, "Unexpected error in %s() at %s:%d:\n",
-err->func, err->src, err->line);
-error_report_err(err);
-abort();
+error_report_abort_internal("%s", err->msg);
 }
 if (errp == &error_fatal) {
-error_report_err(err);
-exit(1);
+error_report_fatal("%s", err->msg);
 }
 }

[Qemu-devel] [PATCH v5 3/5] util: [ppc] Use new error_report_fatal() instead of exit()

2016-01-28 Thread Lluís Vilanova

Signed-off-by: Lluís Vilanova 
---
 target-ppc/kvm.c|5 +--
 target-ppc/translate.c  |7 ++--
 target-ppc/translate_init.c |   80 +--
 3 files changed, 44 insertions(+), 48 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 9940a90..098a40d 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -316,9 +316,8 @@ static long gethugepagesize(const char *mem_path)
 } while (ret != 0 && errno == EINTR);
 
 if (ret != 0) {
-fprintf(stderr, "Couldn't statfs() memory path: %s\n",
-strerror(errno));
-exit(1);
+error_report_fatal("Couldn't statfs() memory path: %s",
+   strerror(errno));
 }
 
 #define HUGETLBFS_MAGIC   0x958458f6
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4be7eaa..2dfbbc2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11574,10 +11574,9 @@ void gen_intermediate_code(CPUPPCState *env, struct TranslationBlock *tb)
 break;
 }
 if (tcg_check_temp_count()) {
-fprintf(stderr, "Opcode %02x %02x %02x (%08x) leaked temporaries\n",
-opc1(ctx.opcode), opc2(ctx.opcode), opc3(ctx.opcode),
-ctx.opcode);
-exit(1);
+error_report_fatal("Opcode %02x %02x %02x (%08x) leaked temporaries",
+   opc1(ctx.opcode), opc2(ctx.opcode), opc3(ctx.opcode),
+   ctx.opcode);
 }
 }
 if (tb->cflags & CF_LAST_IO)
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index d7e1a4e..dc9bbd6 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -619,8 +619,8 @@ static inline void _spr_register(CPUPPCState *env, int num,
 spr->oea_read != NULL || spr->oea_write != NULL ||
 #endif
 spr->uea_read != NULL || spr->uea_write != NULL) {
-printf("Error: Trying to register SPR %d (%03x) twice !\n", num, num);
-exit(1);
+error_report_fatal("Error: Trying to register SPR %d (%03x) twice !",
+   num, num);
 }
 #if defined(PPC_DEBUG_SPR)
 printf("*** register spr %d (%03x) %s val " TARGET_FMT_lx "\n", num, num,
@@ -1608,8 +1608,7 @@ static void gen_spr_BookE (CPUPPCState *env, uint64_t ivor_mask)
 for (i = 0; i < 64; i++) {
 if (ivor_mask & (1ULL << i)) {
 if (ivor_sprn[i] == SPR_BOOKE_IVORxx) {
-fprintf(stderr, "ERROR: IVOR %d SPR is not defined\n", i);
-exit(1);
+error_report_fatal("ERROR: IVOR %d SPR is not defined", i);
 }
 spr_register(env, ivor_sprn[i], ivor_names[i],
  SPR_NOACCESS, SPR_NOACCESS,
@@ -8319,14 +8318,14 @@ static void init_ppc_proc(PowerPCCPU *cpu)
 case POWERPC_FLAG_VRE:
 break;
 default:
-fprintf(stderr, "PowerPC MSR definition inconsistency\n"
-"Should define POWERPC_FLAG_SPE or POWERPC_FLAG_VRE\n");
-exit(1);
+error_report("PowerPC MSR definition inconsistency");
+error_report_fatal(
+"Should define POWERPC_FLAG_SPE or POWERPC_FLAG_VRE");
 }
 } else if (env->flags & (POWERPC_FLAG_SPE | POWERPC_FLAG_VRE)) {
-fprintf(stderr, "PowerPC MSR definition inconsistency\n"
-"Should not define POWERPC_FLAG_SPE nor POWERPC_FLAG_VRE\n");
-exit(1);
+error_report("PowerPC MSR definition inconsistency");
+error_report_fatal(
+"Should not define POWERPC_FLAG_SPE nor POWERPC_FLAG_VRE");
 }
 if (env->msr_mask & (1 << 17)) {
 switch (env->flags & (POWERPC_FLAG_TGPR | POWERPC_FLAG_CE)) {
@@ -8334,14 +8333,14 @@ static void init_ppc_proc(PowerPCCPU *cpu)
 case POWERPC_FLAG_CE:
 break;
 default:
-fprintf(stderr, "PowerPC MSR definition inconsistency\n"
-"Should define POWERPC_FLAG_TGPR or POWERPC_FLAG_CE\n");
-exit(1);
+error_report("PowerPC MSR definition inconsistency");
+error_report_fatal(
+"Should define POWERPC_FLAG_TGPR or POWERPC_FLAG_CE");
 }
 } else if (env->flags & (POWERPC_FLAG_TGPR | POWERPC_FLAG_CE)) {
-fprintf(stderr, "PowerPC MSR definition inconsistency\n"
-"Should not define POWERPC_FLAG_TGPR nor POWERPC_FLAG_CE\n");
-exit(1);
+error_report("PowerPC MSR definition inconsistency");
+error_report_fatal(
+"Should not define POWERPC_FLAG_TGPR nor POWERPC_FLAG_CE");
 }
 if (env->msr_mask & (1 << 10)) {
 switch (env->flags & (POWERPC_FLAG_SE | POWERPC_FLAG_DWE |
@@ -8351,17 +8350,17 @@ static void init_ppc_proc(PowerPCCPU *cpu)
 case POWERPC_FLAG_UBLE:
 break;
 default:
-fprintf(stderr, "PowerPC M

[Qemu-devel] [PATCH v5 1/5] util: Introduce error reporting functions with fatal/abort

2016-01-28 Thread Lluís Vilanova

Provide two lean functions to report error messages that fatal/abort
QEMU.

Signed-off-by: Lluís Vilanova 
---
 include/qemu/error-report.h |   19 +++
 util/qemu-error.c   |   33 +
 2 files changed, 52 insertions(+)

diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
index 7ab2355..6c2f142 100644
--- a/include/qemu/error-report.h
+++ b/include/qemu/error-report.h
@@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 const char *error_get_progname(void);
 extern bool enable_timestamp_msg;
 
+/* Report message and exit with error */
+void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
+void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
+/* Report message with caller location and abort */
+#define error_vreport_abort(fmt, ap)\
+do {\
+error_report_abort_caller_internal(__FILE__, __LINE__, __func__); \
+error_vreport_abort_internal(fmt, ap);  \
+} while (0)
+#define error_report_abort(fmt, ...)\
+do {\
+error_report_abort_caller_internal(__FILE__, __LINE__, __func__); \
+error_report_abort_internal(fmt, ##__VA_ARGS__);\
+} while (0)
+
+void error_report_abort_caller_internal(const char *file, int line, const char *func);
+void QEMU_NORETURN error_vreport_abort_internal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
+void QEMU_NORETURN error_report_abort_internal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
+
 #endif
diff --git a/util/qemu-error.c b/util/qemu-error.c
index ecf5708..3de002b 100644
--- a/util/qemu-error.c
+++ b/util/qemu-error.c
@@ -237,3 +237,36 @@ void error_report(const char *fmt, ...)
 error_vreport(fmt, ap);
 va_end(ap);
 }
+
+void error_vreport_fatal(const char *fmt, va_list ap)
+{
+error_vreport(fmt, ap);
+exit(1);
+}
+
+void error_report_fatal(const char *fmt, ...)
+{
+va_list ap;
+va_start(ap, fmt);
+error_vreport_fatal(fmt, ap);
+va_end(ap);
+}
+
+void error_report_abort_caller_internal(const char *file, int line, const char *func)
+{
+error_report("Unexpected error in %s() at %s:%d:", func, file, line);
+}
+
+void error_vreport_abort_internal(const char *fmt, va_list ap)
+{
+error_vreport(fmt, ap);
+abort();
+}
+
+void error_report_abort_internal(const char *fmt, ...)
+{
+va_list ap;
+va_start(ap, fmt);
+error_vreport_abort_internal(fmt, ap);
+va_end(ap);
+}
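
The abort variants above are macros rather than plain functions so that
__FILE__, __LINE__ and __func__ expand at the call site and name the caller
instead of the helper. A standalone sketch of that technique follows; it is
not the patch's code. The names record_caller(), REPORT_WITH_LOCATION() and
last_location are hypothetical, and the sketch stores the location in a buffer
instead of reporting it and aborting.

```c
#include <stdio.h>

/* Hypothetical buffer holding the most recently recorded call site. */
static char last_location[256];

/* Plain function: on its own it cannot know who called it. */
static void record_caller(const char *file, int line, const char *func)
{
    snprintf(last_location, sizeof(last_location),
             "%s() at %s:%d", func, file, line);
}

/*
 * Macro wrapper: the predefined identifiers are expanded where the macro
 * is used, so they describe the calling function.
 */
#define REPORT_WITH_LOCATION() \
    record_caller(__FILE__, __LINE__, __func__)

/* Example caller; its name is what ends up in last_location. */
static void failing_helper(void)
{
    REPORT_WITH_LOCATION();
}
```

Had record_caller() used __func__ internally, every report would name
record_caller itself, which is why the location is passed in as arguments.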

[Qemu-devel] [RFC][PATCH v5 0/5] utils: Improve and document error reporting

2016-01-28 Thread Lluís Vilanova

Adds leaner error-reporting functions for simple cases, and documents the
purpose of the different facilities available in QEMU.

Although not all printf+exit/abort are replaced with the proper functions, a few
are ported as an example.


Changes in v5
=============

* Fix typo in documentation [Eric Blake].


Changes in v4
=============

* Introduce 'error_report_fatal()' and 'error_report_abort()' functions
  [suggested by Thomas Huth].
* Replace all existing uses of 'error_setg(error_fatal)' and
  'error_setg(error_abort)' with 'error_report_fatal()' and
  'error_report_abort()'.
* Replace all uses of 'exit()' with 'error_report_fatal()' in 'target-ppc'.
* Replace all uses of 'abort()' with 'error_report_abort()' in 'target-ppc'.

Changes in v3
=============

* Drop special object 'error_warn' in favour of raw 'error_report()'
  [suggested by Markus Armbruster].


Changes in v2
=============

* Split in two patches.
* Explicitly add a warning error object.


Signed-off-by: Lluís Vilanova 
---

Lluís Vilanova (5):
  util: Introduce error reporting functions with fatal/abort
  util: Use new error_report_fatal/abort instead of error_setg(&error_fatal/abort)
  util: [ppc] Use new error_report_fatal() instead of exit()
  util: [ppc] Use new error_report_abort() instead of abort()
  doc: Introduce coding style for errors


 HACKING |   33 ++
 hw/block/fdc.c  |6 ++-
 hw/ppc/spapr.c  |8 ++--
 hw/ppc/spapr_drc.c  |2 +
 include/qemu/error-report.h |   19 ++
 target-ppc/kvm.c|9 ++---
 target-ppc/kvm_ppc.h|   15 +---
 target-ppc/mmu-hash32.c |5 ++-
 target-ppc/mmu_helper.c |3 +-
 target-ppc/translate.c  |7 ++--
 target-ppc/translate_init.c |   80 +--
 util/error.c|9 ++---
 util/qemu-error.c   |   33 ++
 13 files changed, 155 insertions(+), 74 deletions(-)


To: qemu-devel@nongnu.org
Cc: Stefan Hajnoczi 
Cc: Dr. David Alan Gilbert 
Cc: Thomas Huth 
Cc: Markus Armbruster 
Cc: Eric Blake

Re: [Qemu-devel] [PATCH v4 5/5] doc: Introduce coding style for errors

2016-01-28 Thread Eric Blake

On 01/28/2016 02:41 PM, Lluís Vilanova wrote:
> Gives some general guidelines for reporting errors in QEMU.
> 
> Signed-off-by: Lluís Vilanova 
> ---
>  HACKING |   33 +
>  1 file changed, 33 insertions(+)
> 
> diff --git a/HACKING b/HACKING
> index 12fbc8a..f5783d4 100644
> --- a/HACKING
> +++ b/HACKING
> @@ -157,3 +157,36 @@ painful. These are:
>   * you may assume that integers are 2s complement representation
>   * you may assume that right shift of a signed integer duplicates
> the sign bit (ie it is an arithmetic shift, not a logical shift)
> +
> +7. Error reporting
> +
> +QEMU provides various mechanisms for reporting errors using a uniform format,
> +ensuring the user will receive them (e.g., shown in QMP when necessary). You
> +should use one of these mechanisms instead of manually reporting them (i.e., do
> +not use 'printf()', 'exit()' or 'abort()').

abort() for unreachable code may be okay, but I'm not sure how to word
that.  Maybe "avoid using 'printf()' or 'exit()', and limit use of
'abort()' to situations that should be unreachable code".

May be worth mentioning that if the user can trigger it (command line,
hotplug, etc) then we want fatal; if it represents a programming bug
that the user should not be able to trigger, then abort is okay.
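
As a rough sketch of that rule (combined with the guest-triggered case from the
WARNING in 7.1), the decision could be written down like this. All names here
(FailureOrigin, FailureAction, action_for(), handle_failure()) are hypothetical
and only illustrate the policy; real code would go through the error_report*()
helpers rather than fprintf():

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical classification of who can trigger a failure. */
typedef enum {
    ORIGIN_USER,            /* command line, hotplug, config files */
    ORIGIN_GUEST,           /* guest code or guest-driven device access */
    ORIGIN_INTERNAL_BUG     /* should be unreachable */
} FailureOrigin;

typedef enum { ACTION_CONTINUE, ACTION_EXIT, ACTION_ABORT } FailureAction;

/* Map the origin of a failure to how the process should react. */
static FailureAction action_for(FailureOrigin origin)
{
    switch (origin) {
    case ORIGIN_USER:
        return ACTION_EXIT;         /* fatal: report and exit(1) */
    case ORIGIN_GUEST:
        return ACTION_CONTINUE;     /* report only; never terminate */
    case ORIGIN_INTERNAL_BUG:
        return ACTION_ABORT;        /* programming bug: abort() */
    }
    return ACTION_ABORT;            /* unknown origin: treat as a bug */
}

/* Print the message, then apply the action chosen above. */
static void handle_failure(FailureOrigin origin, const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    switch (action_for(origin)) {
    case ACTION_EXIT:
        exit(1);
    case ACTION_ABORT:
        abort();
    case ACTION_CONTINUE:
        break;
    }
}
```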

> +
> +7.1. Simple error messages
> +
> +The 'error_report*()' functions in "include/qemu/error-report.h" will
> +immediately report error messages to the user.
> +
> +WARNING: Do *not* use 'error_report_fatal()' or 'error_report_abort()' for
> +errors that are (or can be) triggered by guest code (e.g., some unimplimented

s/unimplimented/unimplemented/

> +corner case in guest code translation or device code). Otherwise that can be
> +abused by guest code to terminate QEMU. Instead, you should use
> +'error_report()'.
> +
> +7.2. Errors in user inputs
> +
> +The 'loc_*()' functions in "include/qemu/error-report.h" will extend the
> +messages from 'error_report*()' with references to locations in inputs provided
> +by the user (e.g., command line arguments or configuration files).
> +
> +7.3. More complex error management
> +
> +The functions in "include/qapi/error.h" can be used to accumulate error messages
> +in an 'Error' object, which can be propagated up the call chain where it is
> +finally reported.
> +
> +WARNING: The special 'error_fatal' and 'error_abort' objects follow the same
> +constrains as the 'error_report_fatal' and 'error_report_abort' functions.

s/constrains/constraints/

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


