On 04.05.2022 15:00, Roger Pau Monné wrote:
> On Wed, May 04, 2022 at 02:12:58PM +0200, Jan Beulich wrote:
>> On 04.05.2022 14:01, Roger Pau Monné wrote:
>>> On Wed, May 04, 2022 at 12:51:25PM +0200, Jan Beulich wrote:
>>>> On 04.05.2022 12:30, Roger Pau Monné wrote:
>>>>> Right, ->iomem_caps is indeed too wide for our purpose.  What
>>>>> about using something like:
>>>>>
>>>>> else if ( is_pv_domain(d) )
>>>>> {
>>>>>     if ( !iomem_access_permitted(d, pfn, pfn) )
>>>>>         return 0;
>>>>
>>>> We can't return 0 here (as RAM pages also make it here when
>>>> !iommu_hwdom_strict), so I can at best take this as a vague outline
>>>> of what you really mean. And I don't want to rely on RAM pages being
>>>> (imo wrongly) represented by set bits in Dom0's iomem_caps.
>>>
>>> Well, yes, my suggestion was taking into account that ->iomem_caps for
>>> dom0 has mostly holes for things that shouldn't be mapped, but
>>> otherwise contains everything else as allowed (including RAM).
>>>
>>> We could instead do:
>>>
>>> else if ( is_pv_domain(d) && type != RAM_TYPE_CONVENTIONAL )
>>> {
>>>     ...
>>>
>>> So that we don't rely on RAM being 'allowed' in ->iomem_caps?
>>
>> This would feel to me like excess special casing.
> 
> What about placing this in the 'default:' label on the type switch a
> bit above?

I'd really like to stick to the present layout of where the special
casing is done, with PV and PVH logic at least next to each other. I
continue to think the construct I suggested (still visible below)
would do.

>>>>>     if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
>>>>>         return IOMMUF_readable;
>>>>> }
>>>>>
>>>>> That would get us a bit closer to allowed CPU side mappings, and we
>>>>> don't need to special case IO-APIC or HPET addresses as those are
>>>>> already added to ->iomem_caps or mmio_ro_ranges respectively by
>>>>> dom0_setup_permissions().
>>>>
>>>> This won't fit in a region of code framed by a (split) comment
>>>> saying "Check that it doesn't overlap with ...". Hence if anything
>>>> I could put something like this further down. Yet even then the
>>>> question remains what to do with ranges which pass
>>>> iomem_access_permitted() but
>>>> - aren't really MMIO,
>>>> - are inside MMCFG,
>>>> - are otherwise special.
>>>>
>>>> Or did you perhaps mean to suggest something like
>>>>
>>>> else if ( is_pv_domain(d) && iomem_access_permitted(d, pfn, pfn) &&
>>>>           rangeset_contains_singleton(mmio_ro_ranges, pfn) )
>>>>     return IOMMUF_readable;
>>>
>>> I don't think this would be fully correct, as we would still allow
>>> mappings of IO-APIC pages explicitly banned in ->iomem_caps by not
>>> handling those?
>>
>> CPU side mappings don't deal with the IO-APICs specifically. They only
>> care about iomem_caps and mmio_ro_ranges. Hence explicitly banned
>> IO-APIC pages cannot be mapped there either. (Of course we only do
>> such banning if IO-APIC pages weren't possible to represent in
>> mmio_ro_ranges, which should effectively be never.)
> 
> I think I haven't expressed myself correctly.
> 
> This construct won't return 0 for pfns not in iomem_caps, and hence
> could allow mapping of addresses not in iomem_caps?

I'm afraid I don't understand: There's an iomem_access_permitted()
in the conditional. How would this allow mapping pages outside of
iomem_caps? The default case higher up has already forced perms to
zero for any non-RAM page (unless iommu_hwdom_inclusive).

>>>> ? Then there would only remain the question of whether mapping r/o
>>>> MMCFG pages is okay (I don't think it is), but that could then be
>>>> special-cased similar to what's done further down for vPCI (by not
>>>> returning in the "else if", but merely updating "perms").
>>>
>>> Well part of the point of this is to make CPU and Device mappings
>>> more similar.  I don't think devices have any business in poking at
>>> the MMCFG range, so it's fine to explicitly ban that range.  But I
>>> would have also said the same for IO-APIC pages, so I'm unsure why are
>>> IO-APIC pages fine to be mapped RO, but not the MMCFG range.
>>
>> I wouldn't have wanted to allow r/o mappings of the IO-APICs, but
>> Linux plus the ACPI tables of certain vendors require us to permit
>> this. If we didn't, Dom0 would crash there during boot.
> 
> Right, but those are required for the CPU only.  I think it's a fine
> goal to try to have similar mappings for CPU and Devices, and then
> that would also cover MMCFG in the PV case.  Or else it fine to assume
> CPU vs Device mappings will be slightly different, and then don't add
> any mappings for IO-APIC, HPET or MMCFG to the Device page tables
> (likely there's more that could be added here).

It being different is what Andrew looks to strongly dislike. And I agree
with this up to a certain point, i.e. I'm having a hard time seeing why
we should put in MMCFG mappings just for this reason. But if consensus
was that consistency across all types of MMIO is the goal, then I could
live with also making MMCFG mappings ...

Jan


Reply via email to