Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse
On Wed, 2020-10-07 at 19:23 +0200, Thomas Gleixner wrote:
> > It so happens that in Linux, we don't really architect the software
> > like that. So each of the PCI MSI domain, HPET, and IOAPIC have their
> > *own* message composer which has the same limits and composes basically
> > the same messages as if it was *their* format, not dictated to them by
> > the APIC upstream. And that's what we're both getting our panties in a
> > knot about, I think.
> 
> Are you actually reading what I write and caring to look at the code?
> 
> PCI-MSI does not have a compose message callback in the irq chip. The
> message is composed by the underlying parent domain. Same for HPET.
> 
> The only dodgy part is the IO/APIC for hysterical raisins and because
> I did not come around yet to sort that out.

Right. And the problem is that the "underlying parent domain" in this
case is actually x86_vector_domain. Whereas it probably makes more
sense, at least in theory, for there to be an *intermediate* domain
responsible for composing the Compat MSI messages.

Both the Compat-MSI and the IR-MSI domains would be children of the
generic x86_vector_domain, and then any given HPET, IOAPIC or PCI-MSI
domain would be a child of either one of those generic MSI domains.
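To make the shape of that concrete, here is a toy user-space model of the hierarchy being proposed (all names here are hypothetical, not the kernel's actual API): each domain may impose a destination-ID limit, and the effective limit for any leaf domain is the tightest limit on its parent chain, so a child can never target more CPUs than its parent message format allows.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (hypothetical names): a domain optionally imposes a
 * destination-ID limit; children inherit the tightest limit on the
 * path to the root, mirroring the Compat-MSI vs IR-MSI split above.
 */
struct toy_domain {
	const char *name;
	struct toy_domain *parent;
	unsigned int max_dest_id;	/* 0 = no limit imposed at this level */
};

/* Walk up the hierarchy; the tightest limit on the path wins. */
static unsigned int effective_dest_limit(const struct toy_domain *d)
{
	unsigned int limit = ~0u;

	for (; d; d = d->parent)
		if (d->max_dest_id && d->max_dest_id < limit)
			limit = d->max_dest_id;
	return limit;
}
```

With a vector root imposing no limit, a compat-MSI child limited to destination ID 254 (0xFF being broadcast), and an IR-MSI child imposing no limit, a PCI-MSI domain parented on compat-MSI inherits the 254 limit while one parented on IR-MSI does not.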

> > It really doesn't matter that much to the underlying generic irqdomain
> > support for limited affinities. Except that you want to make the
> > generic code support the concept of a child domain supporting *more*
> > CPUs than its parent, which really doesn't make much sense if you think
> > about it.
> 
> Right. So we really want to stick the restriction into a compat-MSI
> domain to make stuff match reality and to avoid banging the head against
> the wall sooner than later.

Right.


smime.p7s
Description: S/MIME cryptographic signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread Thomas Gleixner
On Wed, Oct 07 2020 at 16:46, David Woodhouse wrote:
> The PCI MSI domain, HPET, and even the IOAPIC are just the things out
> there on the bus which might perform those physical address cycles. And
> yes, as you say they're just a message store sending exactly the
> message that was composed for them. They know absolutely nothing about
> what the message means and how it is composed.

That's what I said.

> It so happens that in Linux, we don't really architect the software
> like that. So each of the PCI MSI domain, HPET, and IOAPIC have their
> *own* message composer which has the same limits and composes basically
> the same messages as if it was *their* format, not dictated to them by
> the APIC upstream. And that's what we're both getting our panties in a
> knot about, I think.

Are you actually reading what I write and caring to look at the code?

PCI-MSI does not have a compose message callback in the irq chip. The
message is composed by the underlying parent domain. Same for HPET.

The only dodgy part is the IO/APIC for hysterical raisins and because
I did not come around yet to sort that out.

> It really doesn't matter that much to the underlying generic irqdomain
> support for limited affinities. Except that you want to make the
> generic code support the concept of a child domain supporting *more*
> CPUs than its parent, which really doesn't make much sense if you think
> about it.

Right. So we really want to stick the restriction into a compat-MSI
domain to make stuff match reality and to avoid banging the head against
the wall sooner than later.

Thanks,

tglx



Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse



On 7 October 2020 17:02:59 BST, Thomas Gleixner  wrote:
>On Wed, Oct 07 2020 at 15:23, David Woodhouse wrote:
>> On Wed, 2020-10-07 at 16:05 +0200, Thomas Gleixner wrote:
>>> On Wed, Oct 07 2020 at 14:08, David Woodhouse wrote:
>>> > On 7 October 2020 13:59:00 BST, Thomas Gleixner wrote:
>>> > > On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
>>> > > > To fix *that* case, we really do need the whole series giving us per-
>>> > > > domain restricted affinity, and to use it for those MSIs/IOAPICs that
>>> > > > the IRQ remapping doesn't cover.
>>> > > 
>>> > > Which do not exist today.
>>> > 
>>> > Sure. But at patch 10/13 into this particular patch series, it *does*
>>> > exist.
>>> 
>>> As I told you before: Your ordering is wrong. We do not introduce bugs
>>> first and then fix them later.
>>
>> I didn't introduce that bug; it's been there for years. Fixing it
>> properly requires per-irqdomain affinity limits.
>>
>> There's a cute little TODO at least in the Intel irq-remapping driver,
>> noting that we should probably check if there are any IOAPICs that
>> aren't in the scope of any DRHD at all. But that's all.
>
>So someone forgot to remove the cute little TODO when this was added:
>
>	if (parse_ioapics_under_ir()) {
>		pr_info("Not enabling interrupt remapping\n");
>		goto error;
>	}

And HPET, and PCI devices including those that might be hotplugged in future 
and not be covered by any extant IOMMU's scope?

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread Thomas Gleixner
On Wed, Oct 07 2020 at 15:23, David Woodhouse wrote:
> On Wed, 2020-10-07 at 16:05 +0200, Thomas Gleixner wrote:
>> On Wed, Oct 07 2020 at 14:08, David Woodhouse wrote:
>> > On 7 October 2020 13:59:00 BST, Thomas Gleixner  wrote:
>> > > On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
>> > > > To fix *that* case, we really do need the whole series giving us per-
>> > > > domain restricted affinity, and to use it for those MSIs/IOAPICs that
>> > > > the IRQ remapping doesn't cover.
>> > > 
>> > > Which do not exist today.
>> > 
>> > Sure. But at patch 10/13 into this particular patch series, it *does*
>> > exist.
>> 
>> As I told you before: Your ordering is wrong. We do not introduce bugs
>> first and then fix them later.
>
> I didn't introduce that bug; it's been there for years. Fixing it
> properly requires per-irqdomain affinity limits.
>
> There's a cute little TODO at least in the Intel irq-remapping driver,
> noting that we should probably check if there are any IOAPICs that
> aren't in the scope of any DRHD at all. But that's all.

So someone forgot to remove the cute little TODO when this was added:

	if (parse_ioapics_under_ir()) {
		pr_info("Not enabling interrupt remapping\n");
		goto error;
	}

Thanks,

tglx


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse
On Wed, 2020-10-07 at 17:25 +0200, Thomas Gleixner wrote:
> It's clearly how the hardware works. MSI has a message store of some
> sorts and if the entry is enabled then the MSI chip (in PCI or
> elsewhere) will send exactly the message which is in that message
> store. It knows absolutely nothing about what the message means and how
> it is composed. The only things which MSI knows about is whether the
> message address is 64bit wide or not and whether the entries are
> maskable or not and how many entries it can store.
> 
> Software allocates a message target at the underlying irq domain (vector
> or remap) and that underlying irq domain defines the properties.
> 
> If qemu emulates it differently then it's qemu's problem, but that does
> not make it in any way something which influences the irq domain
> abstractions which are correctly modeled after how the hardware works.
> 
> > Not really the important part to deal with right now, either way.
> 
> Oh yes it is. We define that upfront and not after the fact.

The way the hardware works is that something handles physical address
cycles to addresses in the (on x86) 0xFEExxxxx range, and turns them
into actual interrupts on the appropriate CPU — where the APIC ID and
vector (etc.) are directly encoded in the bits of the address and the
data written. That compatibility x86 APIC MSI format is where the 8-bit
(or 15-bit) limit comes from.
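For reference, the compat-format encoding described above can be sketched in a few lines of plain C. The bit positions follow the Intel SDM for the simple case (physical destination mode, fixed delivery, edge trigger): the destination APIC ID occupies address bits 19:12, which is exactly where the 8-bit limit comes from; the 15-bit variant packs the extra destination bits into other address bits, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compat-format x86 MSI message, simplest case only: physical
 * destination mode (addr bit 3 = 0), no redirection hint (bit 2 = 0),
 * fixed delivery (data bits 10:8 = 0), edge trigger.
 */
static uint32_t compat_msi_addr(uint8_t dest_apicid)
{
	/* 0xFEExxxxx window; destination APIC ID in bits 19:12. */
	return 0xFEE00000u | ((uint32_t)dest_apicid << 12);
}

static uint32_t compat_msi_data(uint8_t vector)
{
	/* Vector in bits 7:0; everything else zero for this case. */
	return vector;
}
```

So a device targeting APIC ID 0x23 with vector 0x31 writes data 0x31 to address 0xFEE23000, and since the ID field is only 8 bits wide, anything above 255 simply cannot be expressed in this format.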

Then interrupt remapping comes along, and now those physical address
cycles are actually handled by the IOMMU — which can either handle the
compatibility format as before, or use a different format of
address/data bits and perform a lookup in its IRTE table.

The PCI MSI domain, HPET, and even the IOAPIC are just the things out
there on the bus which might perform those physical address cycles. And
yes, as you say they're just a message store sending exactly the
message that was composed for them. They know absolutely nothing about
what the message means and how it is composed.

It so happens that in Linux, we don't really architect the software
like that. So each of the PCI MSI domain, HPET, and IOAPIC have their
*own* message composer which has the same limits and composes basically
the same messages as if it was *their* format, not dictated to them by
the APIC upstream. And that's what we're both getting our panties in a
knot about, I think.

It really doesn't matter that much to the underlying generic irqdomain
support for limited affinities. Except that you want to make the
generic code support the concept of a child domain supporting *more*
CPUs than its parent, which really doesn't make much sense if you think
about it. But it isn't that hard to do, and if it means we don't have
to argue any more about the x86 hierarchy not matching the hardware
then it's a price worth paying. 





Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread Thomas Gleixner
On Wed, Oct 07 2020 at 16:05, David Woodhouse wrote:
> On Wed, 2020-10-07 at 16:05 +0200, Thomas Gleixner wrote:
>> The top most MSI irq chip does not even have a compose function, neither
>> for the remap nor for the vector case. The composition is done by the
>> parent domain from the data which the parent domain constructed. Same
>> for the IO/APIC just less clearly separated.
>> 
>> The top most chip just takes what the underlying domain constructed and
>> writes it to the message store, because that's what the top most chip
>> controls. It does not control the content.
>
> Sure, whatever. The way we arrange the IRQ domain hierarchy in x86
> Linux doesn't really match my understanding of the real hardware, or
> how qemu emulates it either. But it is at least internally self-
> consistent, and in this model it probably does make some sense to claim
> the 8-bit limit on x86_vector_domain itself, and *remove* that limit in
> the remapping irqdomain.

It's clearly how the hardware works. MSI has a message store of some
sorts and if the entry is enabled then the MSI chip (in PCI or
elsewhere) will send exactly the message which is in that message
store. It knows absolutely nothing about what the message means and how
it is composed. The only things which MSI knows about is whether the
message address is 64bit wide or not and whether the entries are
maskable or not and how many entries it can store.

Software allocates a message target at the underlying irq domain (vector
or remap) and that underlying irq domain defines the properties.

If qemu emulates it differently then it's qemu's problem, but that does
not make it in any way something which influences the irq domain
abstractions which are correctly modeled after how the hardware works.

> Not really the important part to deal with right now, either way.

Oh yes it is. We define that upfront and not after the fact.

Thanks,

tglx


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse
On Wed, 2020-10-07 at 16:05 +0200, Thomas Gleixner wrote:
> > > The information has to be a property of the relevant irq domains and the
> > > hierarchy allows you nicely to retrieve it from there instead of
> > > sprinkling this all over the place.
> > 
> > No. This is not a property of the parent domain per se. Especially if
> > you're thinking that we could inherit the affinity mask from the
> > parent, then twice no.
> > 
> > This is a property of the MSI domain itself, and how many bits of
> > destination ID the hardware at *this* level can interpret and pass on
> > to the parent domain.
> 
> Errm what?
> 
> The MSI domain does not know anything about what the underlying domain
> can handle and it shouldn't.
> 
> If MSI is on top of remapping then the remapping domain defines what the
> MSI domain can do and not the other way round. Ditto for the non
> remapped case in which the vector domain decides.
> 
> The top most MSI irq chip does not even have a compose function, neither
> for the remap nor for the vector case. The composition is done by the
> parent domain from the data which the parent domain constructed. Same
> for the IO/APIC just less clearly separated.
> 
> The top most chip just takes what the underlying domain constructed and
> writes it to the message store, because that's what the top most chip
> controls. It does not control the content.

Sure, whatever. The way we arrange the IRQ domain hierarchy in x86
Linux doesn't really match my understanding of the real hardware, or
how qemu emulates it either. But it is at least internally self-
consistent, and in this model it probably does make some sense to claim
the 8-bit limit on x86_vector_domain itself, and *remove* that limit in
the remapping irqdomain.

Not really the important part to deal with right now, either way.




Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse
On Wed, 2020-10-07 at 16:05 +0200, Thomas Gleixner wrote:
> On Wed, Oct 07 2020 at 14:08, David Woodhouse wrote:
> > On 7 October 2020 13:59:00 BST, Thomas Gleixner  wrote:
> > > On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
> > > > To fix *that* case, we really do need the whole series giving us per-
> > > > domain restricted affinity, and to use it for those MSIs/IOAPICs that
> > > > the IRQ remapping doesn't cover.
> > > 
> > > Which do not exist today.
> > 
> > Sure. But at patch 10/13 into this particular patch series, it *does*
> > exist.
> 
> As I told you before: Your ordering is wrong. We do not introduce bugs
> first and then fix them later.

I didn't introduce that bug; it's been there for years. Fixing it
properly requires per-irqdomain affinity limits.

There's a cute little TODO at least in the Intel irq-remapping driver,
noting that we should probably check if there are any IOAPICs that
aren't in the scope of any DRHD at all. But that's all.

If it happens, I think we'll end up doing the right thing and
instantiating a non-IR IOAPIC domain for it, and if we *don't* have any
CPU with an APIC ID above... (checks notes)... 7 ... then it'll just
about work out. Over 7 and we're screwed (because logical mode; see
also https://git.infradead.org/users/dwmw2/linux.git/commit/429f0c33f
for a straw man but that's getting even further down the rabbit hole)




Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread Thomas Gleixner
On Wed, Oct 07 2020 at 14:08, David Woodhouse wrote:
> On 7 October 2020 13:59:00 BST, Thomas Gleixner  wrote:
>>On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
>>> To fix *that* case, we really do need the whole series giving us per-
>>> domain restricted affinity, and to use it for those MSIs/IOAPICs that
>>> the IRQ remapping doesn't cover.
>>
>>Which do not exist today.
>
> Sure. But at patch 10/13 into this particular patch series, it *does*
> exist.

As I told you before: Your ordering is wrong. We do not introduce bugs
first and then fix them later.

>>And all of this is completely wrong to begin with.
>>
>>The information has to be a property of the relevant irq domains and the
>>hierarchy allows you nicely to retrieve it from there instead of
>>sprinkling this all over the place.
>
> No. This is not a property of the parent domain per se. Especially if
> you're thinking that we could inherit the affinity mask from the
> parent, then twice no.
>
> This is a property of the MSI domain itself, and how many bits of
> destination ID the hardware at *this* level can interpret and pass on
> to the parent domain.

Errm what?

The MSI domain does not know anything about what the underlying domain
can handle and it shouldn't.

If MSI is on top of remapping then the remapping domain defines what the
MSI domain can do and not the other way round. Ditto for the non
remapped case in which the vector domain decides.

The top most MSI irq chip does not even have a compose function, neither
for the remap nor for the vector case. The composition is done by the
parent domain from the data which the parent domain constructed. Same
for the IO/APIC just less clearly separated.

The top most chip just takes what the underlying domain constructed and
writes it to the message store, because that's what the top most chip
controls. It does not control the content.

Thanks,

tglx


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse



On 7 October 2020 13:59:00 BST, Thomas Gleixner  wrote:
>On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
>> On Tue, 2020-10-06 at 23:54 +0200, Thomas Gleixner wrote:
>>> On Mon, Oct 05 2020 at 16:28, David Woodhouse wrote:
>> This is the case I called out in the cover letter:
>>
>>    This patch series implements a per-domain "maximum affinity" set and
>>    uses it for the non-remapped IOAPIC and MSI domains on x86. As well as
>>    allowing more CPUs to be used without interrupt remapping, this also
>>    fixes the case where some IOAPICs or PCI devices aren't actually in
>>    scope of any active IOMMU and are operating without remapping.
>>
>> To fix *that* case, we really do need the whole series giving us per-
>> domain restricted affinity, and to use it for those MSIs/IOAPICs that
>> the IRQ remapping doesn't cover.
>
>Which do not exist today.

Sure. But at patch 10/13 into this particular patch series, it *does* exist.

(Ignoring, for the moment, that I'm about to rewrite half the preceding patches 
to take a different approach)


>>> >	ip->irqdomain->parent = parent;
>>> > +	if (parent == x86_vector_domain)
>>> > +		irq_domain_set_affinity(ip->irqdomain, _non_ir_cpumask);
>>>
>>> OMG
>>
>> This IOAPIC function may or may not be behind remapping.
>
>>> >	d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
>>> > +	irq_domain_set_affinity(d, _non_ir_cpumask);
>>>
>>> So here it's unconditional
>>
>> Yes, this code is only for the non-remapped case and there's a separate
>> arch_create_remap_msi_irq_domain() function a few lines further down
>> which creates the IR-PCI-MSI IRQ domain. So no need for a condition
>> here.
>>
>>> > +	if (parent == x86_vector_domain)
>>> > +		irq_domain_set_affinity(d, _non_ir_cpumask);
>>>
>>> And here we need a condition again. Completely obvious and reviewable - NOT.
>>
>> And HPET may or may not be behind remapping so again needs the
>> condition. I had figured that part was fairly obvious but can note it
>> in the commit comment.
>
>And all of this is completely wrong to begin with.
>
>The information has to be a property of the relevant irq domains and the
>hierarchy allows you nicely to retrieve it from there instead of
>sprinkling this all over the place.

No. This is not a property of the parent domain per se. Especially if you're 
thinking that we could inherit the affinity mask from the parent, then twice no.

This is a property of the MSI domain itself, and how many bits of destination 
ID the hardware at *this* level can interpret and pass on to the parent domain.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread Thomas Gleixner
On Wed, Oct 07 2020 at 08:48, David Woodhouse wrote:
> On Tue, 2020-10-06 at 23:54 +0200, Thomas Gleixner wrote:
>> On Mon, Oct 05 2020 at 16:28, David Woodhouse wrote:
> This is the case I called out in the cover letter:
>
> This patch series implements a per-domain "maximum affinity" set and
> uses it for the non-remapped IOAPIC and MSI domains on x86. As well as
> allowing more CPUs to be used without interrupt remapping, this also
> fixes the case where some IOAPICs or PCI devices aren't actually in
> scope of any active IOMMU and are operating without remapping.
>
> To fix *that* case, we really do need the whole series giving us per-
> domain restricted affinity, and to use it for those MSIs/IOAPICs that
> the IRQ remapping doesn't cover.

Which do not exist today.

>> >	ip->irqdomain->parent = parent;
>> > +	if (parent == x86_vector_domain)
>> > +		irq_domain_set_affinity(ip->irqdomain, _non_ir_cpumask);
>> 
>> OMG
>
> This IOAPIC function may or may not be behind remapping.

>> >	d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
>> > +	irq_domain_set_affinity(d, _non_ir_cpumask);
>> 
>> So here it's unconditional
>
> Yes, this code is only for the non-remapped case and there's a separate
> arch_create_remap_msi_irq_domain() function a few lines further down
> which creates the IR-PCI-MSI IRQ domain. So no need for a condition
> here.
>
>> > +	if (parent == x86_vector_domain)
>> > +		irq_domain_set_affinity(d, _non_ir_cpumask);
>> 
>> And here we need a condition again. Completely obvious and reviewable - NOT.
>
> And HPET may or may not be behind remapping so again needs the
> condition. I had figured that part was fairly obvious but can note it
> in the commit comment.

And all of this is completely wrong to begin with.

The information has to be a property of the relevant irq domains and the
hierarchy allows you nicely to retrieve it from there instead of
sprinkling this all over the place.

Thanks,

tglx


Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-07 Thread David Woodhouse
On Tue, 2020-10-06 at 23:54 +0200, Thomas Gleixner wrote:
> On Mon, Oct 05 2020 at 16:28, David Woodhouse wrote:
> 
> > From: David Woodhouse 
> > 
> > When interrupt remapping isn't enabled, only the first 255 CPUs can
> 
> No, only CPUs with an APICid < 255 

Ack.

> > receive external interrupts. Set the appropriate max affinity for
> > the IOAPIC and MSI IRQ domains accordingly.
> > 
> > This also fixes the case where interrupt remapping is enabled but some
> > devices are not within the scope of any active IOMMU.
> 
> What? If this fixes a pre-existing problem then
> 
>   1) Explain the problem proper
>   2) Have a patch at the beginning of the series which fixes it
>  independently of this pile
> 
> If it's fixing a problem in your pile, then you got the ordering wrong.

It's not that simple; there's not a single patch which fixes that and
which can go first. I can, and do, fix the "no IR" case in a simple
patch that goes first, simply by restricting the kernel to the APIC IDs
which can be targeted.

This is the case I called out in the cover letter:

This patch series implements a per-domain "maximum affinity" set and
uses it for the non-remapped IOAPIC and MSI domains on x86. As well as
allowing more CPUs to be used without interrupt remapping, this also
fixes the case where some IOAPICs or PCI devices aren't actually in
scope of any active IOMMU and are operating without remapping.

To fix *that* case, we really do need the whole series giving us per-
domain restricted affinity, and to use it for those MSIs/IOAPICs that
the IRQ remapping doesn't cover.
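The restriction being argued over can be reduced to a very small model (all names hypothetical, not the series' actual API): a domain whose messages go through interrupt remapping can target any CPU, while one composing compat-format messages can only target CPUs whose APIC ID fits the 8-bit destination field, with 0xFF reserved as broadcast.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the per-domain restriction: a non-IR domain can only
 * reach APIC IDs 0..254 (255 = 0xFF is broadcast in physical mode);
 * a domain behind interrupt remapping has no such restriction.
 */
struct toy_irqdomain {
	bool behind_ir;		/* is interrupt remapping in the path? */
};

static bool domain_can_target(const struct toy_irqdomain *d,
			      unsigned int apicid)
{
	return d->behind_ir || apicid < 255;
}
```

The point of carrying this per domain rather than globally is exactly the mixed case above: on one system, a remapped PCI-MSI domain and a non-remapped IOAPIC outside any DRHD scope would return different answers for the same CPU.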

> You didn't start kernel programming as of yesterday, so you really know
> how that works.
> 
> >	ip->irqdomain->parent = parent;
> > +	if (parent == x86_vector_domain)
> > +		irq_domain_set_affinity(ip->irqdomain, _non_ir_cpumask);
> 
> OMG

This IOAPIC function may or may not be behind remapping.

> 
> >	if (cfg->type == IOAPIC_DOMAIN_LEGACY ||
> >	    cfg->type == IOAPIC_DOMAIN_STRICT)
> > diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> > index 4d891967bea4..af5ce5c4da02 100644
> > --- a/arch/x86/kernel/apic/msi.c
> > +++ b/arch/x86/kernel/apic/msi.c
> > @@ -259,6 +259,7 @@ struct irq_domain * __init native_create_pci_msi_domain(void)
> >		pr_warn("Failed to initialize PCI-MSI irqdomain.\n");
> >	} else {
> >		d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
> > +		irq_domain_set_affinity(d, _non_ir_cpumask);
> 
> So here it's unconditional

Yes, this code is only for the non-remapped case and there's a separate
arch_create_remap_msi_irq_domain() function a few lines further down
which creates the IR-PCI-MSI IRQ domain. So no need for a condition
here.

> >	}
> >	return d;
> >  }
> > @@ -479,6 +480,8 @@ struct irq_domain *hpet_create_irq_domain(int hpet_id)
> >		irq_domain_free_fwnode(fn);
> >		kfree(domain_info);
> >	}
> > +	if (parent == x86_vector_domain)
> > +		irq_domain_set_affinity(d, _non_ir_cpumask);
> 
> And here we need a condition again. Completely obvious and reviewable - NOT.

And HPET may or may not be behind remapping so again needs the
condition. I had figured that part was fairly obvious but can note it
in the commit comment.




Re: [PATCH 10/13] x86/irq: Limit IOAPIC and MSI domains' affinity without IR

2020-10-06 Thread Thomas Gleixner
On Mon, Oct 05 2020 at 16:28, David Woodhouse wrote:

> From: David Woodhouse 
>
> When interrupt remapping isn't enabled, only the first 255 CPUs can

No, only CPUs with an APICid < 255 

> receive external interrupts. Set the appropriate max affinity for
> the IOAPIC and MSI IRQ domains accordingly.
>
> This also fixes the case where interrupt remapping is enabled but some
> devices are not within the scope of any active IOMMU.

What? If this fixes a pre-existing problem then

  1) Explain the problem proper
  2) Have a patch at the beginning of the series which fixes it
 independently of this pile

If it's fixing a problem in your pile, then you got the ordering wrong.

You didn't start kernel programming as of yesterday, so you really know
how that works.

>	ip->irqdomain->parent = parent;
> +	if (parent == x86_vector_domain)
> +		irq_domain_set_affinity(ip->irqdomain, _non_ir_cpumask);

OMG

>	if (cfg->type == IOAPIC_DOMAIN_LEGACY ||
>	    cfg->type == IOAPIC_DOMAIN_STRICT)
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 4d891967bea4..af5ce5c4da02 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -259,6 +259,7 @@ struct irq_domain * __init native_create_pci_msi_domain(void)
>		pr_warn("Failed to initialize PCI-MSI irqdomain.\n");
>	} else {
>		d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
> +		irq_domain_set_affinity(d, _non_ir_cpumask);

So here it's unconditional

>	}
>	return d;
>  }
> @@ -479,6 +480,8 @@ struct irq_domain *hpet_create_irq_domain(int hpet_id)
>		irq_domain_free_fwnode(fn);
>		kfree(domain_info);
>	}
> +	if (parent == x86_vector_domain)
> +		irq_domain_set_affinity(d, _non_ir_cpumask);

And here we need a condition again. Completely obvious and reviewable - NOT.

Thanks,

tglx