[Qemu-devel] [PATCH v1 00/15] Multi-Arch Phase 1

2015-09-11 Thread Peter Crosthwaite
This is the first set of patches needed to enable Multi-arch system
emulation. For full context refer to RFCv3:

[PATCH v3 00/35] Multi Architecture System Emulation
https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg03929.html

This is the first patch-pack intended for merge.

Original cover, as well as overall series state below for further
information.

Regards,
Peter

Original Multi-arch patch series cover:

***

This is target-multi, a system-mode build that can support multiple
cpu-types.

Two architectures are initially converted: Microblaze and ARM. Step-by-step
conversion is done for each. A Microblaze is added to the Xilinx Zynq
platform as a test case. This will be elaborated more in future spins.
This use case is valid, as Microblazes can be added (any number of them!)
in Zynq FPGA programmable logic configuration.

The general approach (radically different to approach in V1 RFC) is to build
and prelink an object (arch-obj.o) per-arch containing:

1: target-foo/*
2: All uses of env internals and CPU_GET_ENV
* cputlb, translate-all, cpu-exec
* TCG backend

This means cputlb and friends are compiled multiple times, once for each arch.
The symbols for each of these pre-links are then localised to avoid link-time
name collisions. This is based on Paolo's suggestion to templatify cputlb and
friends; the net of what to multi-compile is just widened to include the TCG
stuff as well now.

Despite being some "major surgery" this approach actually solves many of the
big problems raised in V1. Big problems solved:

1: With the multi-compiled TCG backends there are now multiple tcg_ctx's, one
for each architecture. This solves the issue PMM raised WRT false positives on
TB hashing, as archs no longer share translation context.

2: There is no longer a need to reorder the CPU_COMMON within the ENV or the ENV
within the CPU. This was flagged as a performance issue by multiple people in
V1. All users of the env internals as well as ENV_GET_CPU are now in
multi-compile code, so multi-arch does not need to define a generic ENV, nor
does it need to define the problematic ENV_GET_CPU.

3: With the prelink symbol localisation, link time namespace collision of
helpers from multiple arches is no longer an issue. No need to bloat all the
function names with arch specific prefixes.

4: The architecture specifics used/defined by cpu-defs can now vary from arch to
arch (incl. target_ulong), greatly reducing the conversion effort needed. The
list of restrictions for multi-arch capability is much reduced since V1. No
target_long issues anymore.
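
As a rough illustration of point 4 (this is not code from the series), the
sketch below mimics two per-arch compilations of the same page-alignment
helper inside one file. The macro DEFINE_ARCH_PAGE_HELPERS, the arch32/arch64
prefixes and the 12-bit page size are all made up for the example. In the
build described above, the same source would instead be compiled once per
arch with that arch's target_ulong, and prelink symbol localisation (not name
prefixes) would keep the copies apart.

```c
#include <stdint.h>

/* Hypothetical page size, chosen only for this example. */
#define EXAMPLE_PAGE_BITS 12

/*
 * Stand-in for "compile the same source once per arch": each expansion
 * gets its own target_ulong width. The prefix is needed only because
 * this sketch lives in a single translation unit.
 */
#define DEFINE_ARCH_PAGE_HELPERS(prefix, target_ulong_t)                  \
    typedef target_ulong_t prefix##_target_ulong;                         \
                                                                          \
    /* Round an address down to the start of its page. */                 \
    static prefix##_target_ulong prefix##_page_align(                     \
        prefix##_target_ulong addr)                                       \
    {                                                                     \
        prefix##_target_ulong mask =                                      \
            ~(((prefix##_target_ulong)1 << EXAMPLE_PAGE_BITS) - 1);       \
        return addr & mask;                                               \
    }                                                                     \
                                                                          \
    /* Width in bits of this arch's target_ulong. */                      \
    static unsigned prefix##_target_ulong_bits(void)                      \
    {                                                                     \
        return (unsigned)(sizeof(prefix##_target_ulong) * 8);             \
    }

/* Two "arches" with different target_ulong widths coexist in one binary. */
DEFINE_ARCH_PAGE_HELPERS(arch32, uint32_t)
DEFINE_ARCH_PAGE_HELPERS(arch64, uint64_t)
```

Each expansion carries its own address type, so neither copy constrains the
other; code outside the per-arch copies never needs to know either width.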

include/exec/*.h and some of the common code need some refactoring to set up
this single vs multi compile split. Mostly code movements.

Some functions (like tcg_enabled) need to be listified for each of the
now-multiple TCG engines.
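
A minimal sketch of what "listifying" could look like, assuming each
multi-compiled TCG engine registers itself on a shared list and callers ask
an any/all question instead of reading a single global flag. The type
ExampleTCGEngine and every example_* function name here are hypothetical and
are not taken from the patches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-arch TCG engine record; not a QEMU type. */
typedef struct ExampleTCGEngine {
    const char *arch_name;
    bool enabled;
    struct ExampleTCGEngine *next;
} ExampleTCGEngine;

/* Head of the list that each multi-compiled engine would register on. */
static ExampleTCGEngine *example_tcg_engines;

/* Push an engine onto the front of the shared list. */
static void example_tcg_register(ExampleTCGEngine *engine)
{
    engine->next = example_tcg_engines;
    example_tcg_engines = engine;
}

/* True if at least one registered engine is enabled. */
static bool example_tcg_any_enabled(void)
{
    for (ExampleTCGEngine *e = example_tcg_engines; e; e = e->next) {
        if (e->enabled) {
            return true;
        }
    }
    return false;
}

/* True only if at least one engine is registered and all are enabled. */
static bool example_tcg_all_enabled(void)
{
    if (!example_tcg_engines) {
        return false;
    }
    for (ExampleTCGEngine *e = example_tcg_engines; e; e = e->next) {
        if (!e->enabled) {
            return false;
        }
    }
    return true;
}
```

Whether a given caller wants the "any" or the "all" answer depends on what it
was using the old single flag for, which is why both variants are shown.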

The interface between the multi-compiled and single-compiled files needs to be
virtualised using QOM cpu functions. But this is now a very low footprint
change as most of the virtualised hooks are now in multi-compiled code (they
only exist as text once). There are more new hooks than before, but the per
target change pattern is reduced.
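
The shape of such a virtualised interface can be sketched with plain function
pointers. This is only an analogy for QOM class hooks: ExampleCPUClass,
ExampleCPUState, the tlb_flush hook and the arch ids below are invented for
the example and do not correspond to real QEMU definitions.

```c
#include <stddef.h>

/* Hypothetical arch ids, used only to show which hook ran. */
enum { EXAMPLE_ARCH_NONE = 0, EXAMPLE_ARCH_ARM = 1, EXAMPLE_ARCH_MB = 2 };

typedef struct ExampleCPUState ExampleCPUState;

/* Hypothetical class struct: one hook table per arch. */
typedef struct ExampleCPUClass {
    const char *arch_name;
    void (*tlb_flush)(ExampleCPUState *cpu);
} ExampleCPUClass;

struct ExampleCPUState {
    const ExampleCPUClass *cc;
    unsigned flushes;   /* how many times the hook ran */
    int flushed_by;     /* arch id of the hook that ran last */
};

/* Per-arch hook bodies: these would live in multi-compiled code. */
static void example_arm_tlb_flush(ExampleCPUState *cpu)
{
    cpu->flushes++;
    cpu->flushed_by = EXAMPLE_ARCH_ARM;
}

static void example_mb_tlb_flush(ExampleCPUState *cpu)
{
    cpu->flushes++;
    cpu->flushed_by = EXAMPLE_ARCH_MB;
}

static const ExampleCPUClass example_arm_class = {
    "arm", example_arm_tlb_flush
};
static const ExampleCPUClass example_mb_class = {
    "microblaze", example_mb_tlb_flush
};

/*
 * Single-compiled core code: it knows nothing about any arch's env
 * layout and reaches arch code only through the class hook.
 */
static void example_core_flush_all(ExampleCPUState **cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        cpus[i]->cc->tlb_flush(cpus[i]);
    }
}
```

The core loop is written once and works for a mixed set of CPUs, which is the
property the single vs multi compile split relies on.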

For the implementation of the series, the trickiest part is (still) cpu.h
inclusion management. There is now more than one cpu.h, and different
parts of the tree need a different include scheme. target-multi defines
its own cpu.h, which contains the bare minimum defs needed by core code only.
The target-foo/cpu.h's are mostly the same but refactored to avoid collisions
with other cpu.h's. The inclusion scheme goes something like
this (for the multi-arch build):

*: Core code includes only target-multi/cpu.h
*: target-foo/ implementation code includes target-foo/cpu.h locally
*: System level code (e.g. mach models) can use multiple target-foo/cpu.h's

The hardest unanswered Q is (still) what to do about bootloading. Currently
each arch has its own architecture-specific bootloading, which may assume a
single architecture. I have applied some hacks to at least get this
RFC testable using a -kernel -firmware split, but going forward, being
able to associate an elf/image with a cpu explicitly needs to be
solved.

No support for KVM. I'm not sure if a mix of TCG and KVM is supported even for
a single arch? (That would be a prerequisite to MA KVM.)

***

Current review state of full multi-arch work in progress branch:

cpu-exec: Migrate some generic fns to cpu-exec-common
translate: Listify tcg_exec_init()  R:rth
translate-all: Move tcg_handle_interrupt() to -common   R:rth
tcg: split tcg_op_defs to -common
tcg: Move tcg_tb_ptr to -common
translate: move real_host_page setting to -common
cpus: Listify cpu_list() function
translate-common: Listify tcg_enabled()
core: Convert tcg_enabled() uses to any/all variants
exec-all: Move cpu_can_do_io() to qom/cpu.h R:rth
cputlb: move CPU_LOOP() for tlb_reset() to exec.c
cputlb: Change tlb_set_dirty() arg to cpu
include/exec: Move cputlb exec.c defs out 

Re: [Qemu-devel] [PATCH v1 00/15] Multi-Arch Phase 1

2015-09-11 Thread Paolo Bonzini


On 11/09/2015 07:39, Peter Crosthwaite wrote:
> This is the first set of patches needed to enable Multi-arch system
> emulation. For full context refer to RFCv3:
> 
> [PATCH v3 00/35] Multi Architecture System Emulation
> https://lists.gnu.org/archive/html/qemu-devel/2015-07/msg03929.html
> 
> This is the first patch-pack intended for merge.
> 
> Original cover, as well as overall series state below for further
> information.

I think we can already merge patches 1, 3, 4, 5, 6, 11, 12, 13, 15
(plus, patch 10 is gone). The others do not make much sense without
multiarch support.

I suppose the next part of the surgery could be

  target-*: Don't redefine cpu_exec()
  target-*: cpu.h: Undefine core code symbols
  arm: cpu: static inline cpu_arm_init()
  target-arm: Split cp helper API to new C file
  hw: arm: Explicitly include cpu.h for consumers
  hw: mb: Explicitly include cpu.h for consumers

Paolo
