On 11/23/20 12:16 PM, Greg Kurz wrote:
> On Mon, 23 Nov 2020 10:46:38 +0100
> Cédric Le Goater <c...@kaod.org> wrote:
> 
>> On 11/20/20 6:46 PM, Greg Kurz wrote:
>>> We're going to kill the "nr_ends" field in a subsequent patch.
>>
>> Why? It is one of the tables of the controller and it's part of
>> the main XIVE concepts. Conceptually, we could let the machine
>> dimension it with an arbitrary value as OPAL does. The controller
>> would fail when the table is fully used.
>>
> 
> The idea is that the sPAPR machine's only true need is to create a
> controller that can accommodate up to a certain number of vCPU ids.
> It doesn't really need to know about the END itself IMHO.
>
> This being said, if we decide to pass both spapr_max_server_number()
> and smp.max_cpus down to the backends as function arguments, we won't
> have to change "nr_ends" at all.

I would prefer that, but I am still not sure what they represent.

Looking at the sPAPR XIVE code, we deal with numbers/ranges in the 
following places today.

 * spapr_xive_dt() 

   It defines a range of interrupt numbers to be used by the guest 
   for the threads/vCPUs IPIs. It's a subset of interrupt numbers 
   in:

                [ SPAPR_IRQ_IPI, SPAPR_IRQ_EPOW )

   These are not vCPU ids.

   Since these interrupt numbers will be considered as free to use
   by the OS, it makes sense to pre-claim them. But claiming an 
   interrupt number in the guest can potentially set up, through 
   the KVM device, a mapping on the host and in HW. See below why
   this can be a problem.
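   To make the eager-claim problem concrete, here is a toy model (not
   QEMU code; the range bounds and helper name are made up for the
   sketch) of pre-claiming every interrupt number in the range that
   "ibm,xive-lisn-ranges" advertises to the guest:

   ```c
   #include <assert.h>
   #include <stdbool.h>
   #include <stdio.h>

   /* Hypothetical values, for illustration only: the real range is
    * [SPAPR_IRQ_IPI, SPAPR_IRQ_EPOW) as laid out by the sPAPR machine. */
   #define SPAPR_IRQ_IPI   0x0
   #define SPAPR_IRQ_EPOW  0x1000

   static bool claimed[SPAPR_IRQ_EPOW];

   /* Pre-claim every interrupt number in the IPI range up front.
    * The downside discussed above: each claim may set up a host-side/HW
    * mapping through the KVM device, all allocated on the chip where
    * the QEMU process happens to be running. */
   static void preclaim_ipi_range(void)
   {
       for (int lisn = SPAPR_IRQ_IPI; lisn < SPAPR_IRQ_EPOW; lisn++) {
           claimed[lisn] = true;
       }
   }

   int main(void)
   {
       preclaim_ipi_range();
       assert(claimed[SPAPR_IRQ_IPI]);
       assert(claimed[SPAPR_IRQ_EPOW - 1]);
       printf("all %d IPI numbers pre-claimed\n",
              SPAPR_IRQ_EPOW - SPAPR_IRQ_IPI);
       return 0;
   }
   ```

   Every number is backed whether or not the guest ever uses it, which
   is exactly what makes eager claiming wasteful and badly placed.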

 * kvmppc_xive_cpu_connect()   

   This sizes the NVT tables in OPAL for the guest. This is the  
   max number of vCPUs of the guest (not vCPU ids).
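   The count-vs-id distinction matters because with vSMT the vCPU ids
   are spread out. A rough sketch with made-up numbers (not taken from
   a real machine config):

   ```c
   #include <assert.h>
   #include <stdio.h>

   /* Illustrative values only. */
   #define MAX_CPUS   8   /* smp.max_cpus: how many vCPUs can exist */
   #define THREADS    2   /* threads per core in the guest          */
   #define VSMT       8   /* id stride per core with virtual SMT    */

   int main(void)
   {
       /* With vSMT, core n's first thread gets id n * VSMT, so the
        * largest vCPU id is much bigger than the vCPU count. */
       int cores = MAX_CPUS / THREADS;
       int max_server_number = cores * VSMT; /* roughly what
                                                spapr_max_server_number()
                                                accounts for */

       assert(max_server_number > MAX_CPUS);
       printf("max vCPU id space: %d, vCPU count: %d\n",
              max_server_number, MAX_CPUS);
       return 0;
   }
   ```

   Sizing the NVT tables wants the count; anything indexed by server
   number wants the id space.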

 * spapr_irq_init()

   This is where the IPI interrupt numbers are claimed today. 
   Directly in QEMU and KVM if the machine is running XIVE only,
   indirectly if it's dual: first in QEMU and then in KVM when
   the machine switches interrupt mode at CAS.

   The problem is that the underlying XIVE resources in HW are 
   allocated where the QEMU process is running. Which is not the
   best option when the vCPUs are pinned on different chips.

   My patchset was trying to improve that by claiming the IPI on
   demand when the vCPU is connected to the KVM device. But it
   was using the vCPU id as the IPI interrupt number, which is
   utterly wrong: the guest OS could use any number in the range
   exposed in the DT.
   
   The last patch you sent was going in the right direction, I think:
   claim the IPI when the guest OS requests it.

   
http://patchwork.ozlabs.org/project/qemu-devel/patch/160528045027.804522.6161091782230763832.st...@bahia.lan/
   
   But I don't understand why it was so complex. It should be like
   the MSIs claimed by PCI devices.
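   The MSI-like scheme could reduce to a plain allocate-on-first-use
   pattern. A toy model (invented helpers, not the actual QEMU API):

   ```c
   #include <assert.h>
   #include <stdbool.h>
   #include <stdio.h>

   #define IPI_RANGE_SIZE 4096  /* hypothetical size of the lisn range */

   static bool claimed[IPI_RANGE_SIZE];

   /* Claim an IPI number only when the guest OS actually asks for it,
    * so the backing HW resources can be allocated near the right vCPU.
    * Returns false if the number is outside the advertised range. */
   static bool claim_ipi(int lisn)
   {
       if (lisn < 0 || lisn >= IPI_RANGE_SIZE) {
           return false;
       }
       if (!claimed[lisn]) {
           claimed[lisn] = true;  /* would go through the KVM device here */
       }
       return true;
   }

   int main(void)
   {
       assert(claim_ipi(42));              /* claimed on first use */
       assert(claimed[42]);
       assert(!claimed[43]);               /* untouched numbers stay free */
       assert(!claim_ipi(IPI_RANGE_SIZE)); /* out of range is rejected */
       printf("on-demand claim ok\n");
       return 0;
   }
   ```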


All this to say that we need to better size the range in the
"ibm,xive-lisn-ranges" property if it's broken for vSMT.

Then, I think the IPIs can be treated just like the PCI MSIs,
but they need to be claimed first. That's the ugly part.

Should we add a special check in h_int_set_source_config to
deal with unclaimed IPIs that are being configured?


C.
