On 11 May 2015 at 07:29, Peter Crosthwaite <crosthwaitepe...@gmail.com> wrote:
> This is target-multi, a system-mode build that can support multiple
> cpu-types. Patches 1-3 are the main infrastructure. The hard part
> is the per-target changes needed to get each arch into an includable
> state.

Interesting. This is something I'd thought we were still some way
from being able to do :-)

> The hardest part is what to do about bootloading. Currently each arch
> has its own architecture-specific bootloading, which may assume a
> single architecture. I have applied some hacks to at least get this
> RFC testable using a -kernel/-firmware split, but going forward being
> able to associate an elf/image with a CPU explicitly needs to be
> solved.

My first thought would be to leave the -kernel/-firmware stuff as
legacy (or at least with semantics defined by the board model in use)
and have per-CPU QOM properties for setting up images for genuinely
multi-CPU configs.

> For the implementation of this series, the trickiest part is cpu.h
> inclusion management. There is now more than one cpu.h, and different
> parts of the tree need a different include scheme. target-multi defines
> its own cpu.h containing only the bare-minimum defs needed by core code.
> target-foo/cpu.h are mostly the same but refactored to reuse common
> code (via target-multi/cpu-head.h). The inclusion scheme goes something like
> this (for the multi-arch build):
>
> 1: All obj-y modules include target-multi/cpu.h
> 2: Core code includes no other cpu.h's
> 3: target-foo/ implementation code includes target-foo/cpu.h
> 4: System level code (e.g. mach models) can use multiple target-foo/cpu.h's
>
> Point 4 means that the cpu.h's need to be refactored so they can be included
> one after the other. The interrupts for ARM and MB needed to be renamed to
> avoid namespace collisions. A few other defs needed multiple include guards,
> and a few defs which were only for user mode are compiled out or relocated. No
> attempt at support for multi-arch linux-user mode (if that even makes sense?).

I don't think it does make much sense -- our linux-user code hardwires
a lot of ABI details like size of 'long' and struct layouts. In any
case we should probably leave it for later.

> The env as handled by common code now needs to be architecture-agnostic. The
> MB and ARM envs are refactored to have CPU_COMMON as the first field(s),
> allowing QOM-style pointer casts to/from a generic env which contains only
> CPU_COMMON. Might need to lock down some struct packing for that, but it
> works for me so far.

Have you managed to retain the "generated code passes around a pointer
to an env which starts with the CPU-specific fields" arrangement? We give
the env structs the layout they have because it's a performance hit if the
registers aren't a short distance away from the pointer...

> The helper function namespace is going to be tricky. I haven't tackled the
> problem just yet, but looking for ideas on how we can avoid prefacing all
> helpers with arch prefixes to avoid link-time collisions because multiple
> arches use the same helper names.
>
> A lowest-common-denominator approach is taken on architecture specifics. E.g.
> TARGET_LONG is 64-bit, and the address space sizes and NUM_MMU_MODES are set
> to the maximum of all the supported arches.

...speaking of performance hits.

I'm not sure you can do lowest-common-denominator for TARGET_PAGE_SIZE,
incidentally. At minimum it will result in a perf hit for the CPUs with
larger pages (because we end up taking the hugepage support paths in the
cputlb.c code), and at worst TLB flushing in the target's helper routines
might not take out the right pages. (I think ARM has some theoretical
bugs here which we don't hit in practice; ARM already has to cope with
a TARGET_PAGE_SIZE smaller than its usual pagesize, though.)

> The remaining globally defined interfaces between core code and CPUs are
> QOMified per-cpu (P2)
>
> Microblaze translation needs a change pattern to allow conversion to 64-bit
> TARGET_LONG. Uses of TCGv need to be removed and made explicitly 32-bit.

Yeah, this will be a tedious job for the other targets (I had to do it
for ARM when I added the AArch64 support).
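For anyone following along, the kind of change involved looks roughly like this (fragment only, not compilable standalone; the exact register-array and field names are illustrative rather than taken from the series):

```c
/* Before: TCGv tracks TARGET_LONG_BITS, so it silently becomes
 * 64-bit once the multi-arch build fixes TARGET_LONG at 64 bits. */
TCGv t = tcg_temp_new();
tcg_gen_add_tl(t, cpu_R[dc->ra], cpu_R[dc->rb]);

/* After: the width is explicit and independent of TARGET_LONG_BITS. */
TCGv_i32 t = tcg_temp_new_i32();
tcg_gen_add_i32(t, cpu_R[dc->ra], cpu_R[dc->rb]);
```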

> This RFC will serve as a reference as I send bits and pieces to the respective
> maintainers (many major subsystems are patched).
>
> No support for KVM; I'm not sure if a mix of TCG and KVM is supported even for
> a single arch? (which would be a prerequisite for multi-arch KVM).

You can build a single binary which supports both TCG and KVM for a
particular architecture. You just can't swap back and forth between
TCG and KVM at runtime. We should probably start by supporting KVM
only on boards with a single CPU architecture. I don't think it's
in-principle impossible to get a setup with 4 KVM CPUs and one
TCG-emulated CPU to work, but it probably needs to wait until we've
got multi-threaded TCG working before we even think about it.

> Depends (not heavily) on my on-list disas QOMification. Test instructions
> available on request. I have tested ARM & MB elfs handshaking through shared
> memory and both printfing to the same UART (verifying system-level
> connectivity). -d in_asm works with the mix of disas arches coming out.

Did you do any benchmarking to see whether the performance hits are
noticeable in practice?

Do you give each CPU its own codegen buffer? (I'm thinking that some
of this might also be more easily done once multithreaded TCG is
complete, since that will properly split the data structures.)

thanks
-- PMM
