On Thu, Dec 03, 2015 at 12:04:58PM +1100, Alexey Kardashevskiy wrote:
> On 12/02/2015 04:29 PM, Benjamin Herrenschmidt wrote:
> >On Wed, 2015-12-02 at 13:24 +1100, Alexey Kardashevskiy wrote:
> >>>But on the whole I agree with you, since the LPC is part of the P8
> >>>chip, I think it makes sense to include it even with -nodefaults.
> >>
> >>POWER8 chips all have 8 threads per core but we do not always assume -smt
> >>...,threads=8, how are LPC or PHB different?
> >
> >First, for pseries which is paravirtualized it's a different can of
> >worms completely. For powernv, we *should* represent all 8 threads,
> >we just can't yet due to TCG limitations.
> 
> Out of curiosity - for pseries we should not? I know it works with various
> numbers of threads but is not that because we also control guest linux
> kernel and, for example, the Other OS (AIX) might be upset on
> non-multiply-of-2 number of threads?

There are several different cases here and I'm not sure which you're
thinking about.

1) Guest has different number of threads-per-core than the host

This one is just fine - PAPR defines how the guest should get the
number of threads from the device tree, and qemu sets that correctly.

2) Guest threads-per-core not a power of two

The PAPR thread mechanism allows this to be communicated to the guest,
and I don't know if PAPR explicitly permits or prohibits this
situation.  Guests could get confused by it, although that's arguably
a guest bug.

3) "Partially filled core", e.g. guest has 8 threads-per-core declared
   but only one vcpu available

This is the only one I can see as relying on Linux guest behaviour.
We kind of get away with this by accident with a Linux guest - it will
try to bring up all 8 threads, but fail non-fatally.  We shouldn't
allow this situation, although we do right now.  Bharata posted a
patch which would prevent this in qemu, and I have a BZ to make
libvirt not allow this construction either.
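
As a rough illustration of the power-of-two and partially-filled-core
cases, here is a minimal sketch of the kind of checks involved.  The
function names are hypothetical and this is not the code from Bharata's
patch or from qemu; it only assumes that a topology is described by a
vcpu count and a declared threads-per-core value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not QEMU's actual API. */

/* True if the declared threads-per-core value is a power of two. */
bool threads_is_power_of_two(unsigned threads)
{
    return threads != 0 && (threads & (threads - 1)) == 0;
}

/*
 * True if every declared core is fully populated, i.e. the number of
 * vcpus is a whole multiple of threads-per-core.  A "partially filled
 * core" (say, 1 vcpu with 8 threads-per-core) fails this check.
 */
bool cores_fully_populated(unsigned vcpus, unsigned threads)
{
    assert(threads != 0); /* zero threads-per-core is a caller bug */
    return vcpus % threads == 0;
}
```

With these, cores_fully_populated(1, 8) is false (the partially filled
case), while cores_fully_populated(16, 8) is true.  A non-power-of-two
value such as 6 threads-per-core fails threads_is_power_of_two() but
can still describe fully populated cores.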

> >>PHB is more interesting - how is the user supposed to add more?
> >
> >That's an open question. Since we model a real P8 chip we can only
> >model the PHBs as they exist on it, which is up to 3 per chip at
> >very specific XSCOM addresses. We could try to model some non-existing
> >P8 chip with more but bad things will happen when the FW try to assign
> >interrupt numbers for example.
> >
> >We simulate a machine that has been primed by HostBoot before OPAL
> >starts. So we rely on what the device-tree tells us of what PHB were
> >enabled but apart from that, we have to stick to the limitations.
> >
> >>And there always will be the default one
> >>which properties are set in a separate way (via -global, not -device). I
> >>found it sometime really annoying to debug the existing pseries which
> >>always adds a default PHB (I know, this was to make libvirt happy but this
> >>is not the case here).
> >>
> >>Out of curiosity - if we have 2 chips, will the system work if the second
> >>chip does not get any LPC or PHB attached?
> >
> >This is something I need to look into, there's a lot of work needed to
> >properly model "chips" that I haven't done yet, but what is there is
> >sufficient for a lot of usages already.
> 
> For now, if possible, I'd suggest implementing -nodefaults with no defaults
> whatsoever and create a config somewhere in the qemu tree to pass it via
> -readconfig to get reasonably configured machine so people will know what is
> expected to work but there will still be possibility for experiments (do not
> we secretly hope that other vendors will start designing/manufacturing their
> ppc64 chips?). It could be a config file per an actual POWER8 chip (we have
> two already).

I can see some benefit to that approach, but it does stray away from
current qemu practice (in general, not just compared to spapr).
Hmm.. not sure.

What I do think would be a good idea is to represent a POWER8 "chip"
as an instantiable qdev device, which will create the scoms and PHBs
under itself as per the hardware.  We can add device properties as
needed to make that construction more flexible.
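
To make the shape of that idea concrete, here is a plain C sketch that
deliberately does not use the real qdev/QOM API.  All type, field and
function names are hypothetical, and the limit of 3 PHBs is taken from
the "up to 3 per chip" remark quoted above.  The chip object creates its
own PHB children, and the PHB count and LPC presence stand in for
device properties.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* "up to 3 per chip", per the quoted discussion; an assumption here. */
#define P8_CHIP_MAX_PHBS 3

/* Hypothetical stand-in for a PHB child device. */
typedef struct P8Phb {
    int index;
    bool enabled;
} P8Phb;

/* Hypothetical stand-in for a POWER8 "chip" device. */
typedef struct P8Chip {
    int chip_id;
    bool has_lpc;                  /* property: LPC present on this chip */
    int num_phbs;                  /* property: how many PHBs to create */
    P8Phb phbs[P8_CHIP_MAX_PHBS];  /* children owned by the chip */
} P8Chip;

/*
 * Initialise a chip and create its PHB children.  Returns false, leaving
 * *chip untouched, if the requested PHB count does not fit the hardware.
 */
bool p8_chip_init(P8Chip *chip, int chip_id, int num_phbs, bool has_lpc)
{
    int i;

    assert(chip != NULL);
    if (num_phbs < 0 || num_phbs > P8_CHIP_MAX_PHBS) {
        return false;
    }

    memset(chip, 0, sizeof(*chip));
    chip->chip_id = chip_id;
    chip->has_lpc = has_lpc;
    chip->num_phbs = num_phbs;
    for (i = 0; i < num_phbs; i++) {
        chip->phbs[i].index = i;
        chip->phbs[i].enabled = true;
    }
    return true;
}
```

Note that nothing in the sketch ties a CPU count to the chip: asking for
4 PHBs is rejected, but the number of cores is left to be configured
separately.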

We probably don't want to link the number of CPUs to the chip qdevs,
partly because that doesn't really fit the qemu model, but also
because we'll probably want some extra flexibility.  e.g. making a UP
system for experimentation, even though a single chip would have
multiple cores (IIUC) - SMP TCG is super slow, so we probably want that.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
