On 17/02/2020 13:09, Jan Beulich wrote:
> On 10.02.2020 15:28, Andrew Cooper wrote:
>> On 05/02/2020 09:43, Jan Beulich wrote:
>>> Introduce IOMMU_PDE_NEXT_LEVEL_{MIN,MAX} to replace literal 1, 6, and 7
>>> instances. While doing so replace two uses of memset() by initializers.
>>>
>>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>> This does not look to be an improvement.  IOMMU_PDE_NEXT_LEVEL_MIN is
>> definitely bogus, and in all cases, a literal 1 is better, because that
>> is how we describe pagetable levels.
> I disagree.

A pagetable walking function which does:

while ( level > 1 )
{
    ...
    level--;
}

is far clearer and easier to follow than hiding 1 behind a constant
which isn't obviously 1.  Something like LEVEL_4K would at least be
something that makes sense in context, but a literal 1 is less verbose.
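
To make the point concrete, here is a minimal standalone sketch (hypothetical
names; next_table() is a made-up helper, not the actual Xen code) of the kind
of walk being described, where the literal 1 is self-evidently "the 4k level":

/* Sketch only: walk a standard hierarchy down to the level-1 (4k) table. */
static uint64_t *walk_to_l1(uint64_t *table, unsigned long dfn,
                            unsigned int level)
{
    while ( level > 1 )
    {
        /* Each level consumes 9 bits of the dfn. */
        uint64_t pde = table[(dfn >> ((level - 1) * 9)) & 0x1ff];

        table = next_table(pde);    /* hypothetical helper */
        level--;
    }

    return &table[dfn & 0x1ff];
}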

>  The device table entry's mode field is bounded by 1
> (min) and 6 (max) for the legitimate values to put there.

If by 1 you mean 0, then yes.  Coping properly with a mode of 0 looks
to be easier than putting in an arbitrary restriction.

OTOH, if you intended to restrict this to just the values we expect in a Xen
setup, then the answers are 3 and 4 only.  (The "correctness" of this
function depends on only running on Xen-written tables.  It doesn't
actually read the next-level field out of the PTE, and assumes that it
is a standard pagetable hierarchy.  Things will go wrong if it
encounters a superpage, or a next-level-7 entry.)
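
For reference, a hedged sketch of what actually consulting that field might
look like, on my reading of the spec (next-level field in bits 11:9 of a PDE,
with 0 and 7 meaning the entry points at a page rather than a table);
illustrative only, not the Xen code:

/* Sketch only: treat a PDE as a leaf if its next-level field says so,
 * instead of assuming a standard hierarchy. */
static bool pde_is_leaf(uint64_t pde)
{
    unsigned int next_level = (pde >> 9) & 7;

    return next_level == 0 || next_level == 7;
}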

>
>> Something to replace literal 6/7 probably is ok, but doesn't want to be
>> done like this.
>>
>> The majority of the problems here are caused by iommu_pde_from_dfn()'s
>> silly ABI.  The pt_mfn[] array is problematic (because it is used as a
>> 1-based array, not 0-based) and useless because both callers only want
>> the 4k-equivalent mfn.  Fixing the ABI gets rid of quite a lot of wasted
>> stack space, every use of '1', and every upper bound other than the
>> BUG_ON() and amd_iommu_get_paging_mode().
> I didn't mean to alter that function's behavior, at the very least
> not until being certain there wasn't a reason it was coded with this
> array approach. IOW the alternative to going with this patch
> (subject to corrections of course) is for me to drop it altogether,
> keeping the hard-coded numbers in place. Just let me know.

If you don't want to change the API, then I'll put it on my todo list.

As previously expressed, this patch on its own is not an improvement IMO.
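
To be concrete about the suggestion, I mean something shaped roughly like this
(hypothetical prototype, not a worked patch):

struct domain;

/* Hypothetical shape: hand back only the 4k-level table's mfn, rather
 * than filling a 1-based pt_mfn[] array for every level. */
int iommu_pde_from_dfn(struct domain *d, unsigned long dfn,
                       unsigned long *pt_mfn /* level-1 table only */,
                       bool map);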

>>> ---
>>> TBD: We should really honor the hats field of union
>>>      amd_iommu_ext_features, but the specification (or at least the
>>>      parts I did look at in the course of putting together this patch)
>>>      is unclear about the maximum valid value in case EFRSup is clear.
>> It is available from PCI config space (Misc0 register, cap+0x10) even on
>> first gen IOMMUs,
> I don't think any of the address size fields there matches what
> HATS is about (limiting of the values valid to put in a DTE's
> mode field). In fact I'm having some difficulty bringing the
> two in (sensible) sync.

It will confirm whether 4 levels are available or not, but TBH, we know
that anyway by virtue of being 64-bit.

Higher levels really don't matter because we don't support using them.
Were we to support using them (and I do have one use case in mind), it
would be entirely reasonable to restrict usage to systems which had EFR.
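
For completeness, a hedged sketch of what honoring HATS might look like, on
my reading of the spec (HATS in EFR bits 11:10, with 0/1/2 meaning 4/5/6
level support and 3 reserved); illustrative only:

/* Sketch only, under the bit-layout assumption above. */
static unsigned int dte_mode_max(uint64_t efr)
{
    unsigned int hats = (efr >> 10) & 3;

    return 4 + hats;    /* reserved value 3 would want rejecting */
}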

>
>> and the IVRS table in Type 10.
> Which may in turn be absent, i.e. the question of what to use as
> a default merely gets shifted.

One of Type 10 or 11 is mandatory for each IOMMU in the system.  One way
or another, the information is present.

~Andrew
