On 6/22/26 11:03 AM, Peter Maydell wrote:
> On Sun, 14 Jun 2026 at 20:08, Michael S. Tsirkin <[email protected]> wrote:
>>
>> From: Alejandro Jimenez <[email protected]>
>>
>> DTE[Mode] and PTE NextLevel encode page table levels as 1-based values, but
>> fetch_pte() currently uses a 0-based level counter, making the logic
>> harder to follow and requiring conversions between DTE mode and level.
>>
>> Switch the page table walk logic to use 1-based level accounting in
>> fetch_pte() and the relevant macro helpers. To further simplify the page
>> walking loop, split the root page table access from the walk i.e. rework
>> fetch_pte() to follow the DTE Page Table Root Pointer and retrieve the top
>> level pagetable entry before entering the loop, then iterate only over the
>> PDE/PTE entries.
>>
>> The reworked algorithm fixes a page walk bug where the page size was
>> calculated for the next level before checking if the current PTE was already
>> a leaf/hugepage. That caused hugepage mappings to be reported as 4K pages,
>> leading to performance degradation and failures in some setups.
>
> Hi; Coverity flags up an issue with this commit. It won't have any
> practical consequences, but it does show that there's something we
> could clean up.
>
>> +/*
>> + * Validate DTE fields and extract permissions and top level data required
>> to
>> + * initiate the page table walk.
>> + *
>> + * On success, returns 0 and stores:
>> + * - top_level: highest page-table level encoded in DTE[Mode]
>> + * - dte_perms: effective permissions from the DTE
>> + *
>> + * On failure, returns -AMDVI_FR_PT_ROOT_INV. This includes cases where:
>> + * - DTE permissions disallow read AND write
>> + * - DTE[Mode] is invalid for translation
>> + * - IOVA exceeds the address width supported by DTE[Mode]
>> + * In all such cases a page walk must be aborted.
>> + */
>> +static uint64_t amdvi_get_top_pt_level_and_perms(hwaddr address, uint64_t
>> dte,
>> + uint8_t *top_level,
>> + IOMMUAccessFlags
>> *dte_perms)
>
> This function returns 0 on success, or a small negative number on
> failure. But we have typed it as returning uint64_t.
>
>> @@ -662,40 +708,49 @@ static uint64_t large_pte_page_size(uint64_t pte)
>> static uint64_t fetch_pte(AMDVIAddressSpace *as, hwaddr address, uint64_t
>> dte,
>> uint64_t *pte, hwaddr *page_size)
>> {
>> - IOMMUAccessFlags perms = amdvi_get_perms(dte);
>> -
>> - uint8_t level, mode;
>> uint64_t pte_addr;
>> + uint8_t pt_level, next_pt_level;
>> + IOMMUAccessFlags perms;
>> + int ret;
>>
>> - *pte = dte;
>> *page_size = 0;
>>
>> - if (perms == IOMMU_NONE) {
>> + /*
>> + * Verify the DTE is properly configured before page walk, and extract
>> + * top pagetable level and permissions.
>> + */
>> + ret = amdvi_get_top_pt_level_and_perms(address, dte, &pt_level, &perms);
>
> At the callsite we take that uint64_t return value and put it into
> an "int" local variable. That potentially overflows. (In practice
> none of the values actually returned will overflow.)
>
>> + if (ret < 0) {
>> + return ret;
>
> Then we return that 'int' value from this function, which extends
> it back out to uint64_t again.
>
>> + }
>
> Since all of fetch_pte()'s callers put its return value into an 'int'
> local variable, and fetch_pte() is the only function that calls
> amdvi_get_top_pt_level_and_perms(), I think the right fix here is
> to make both amdvi_get_top_pt_level_and_perms() and fetch_pte()
> return "int" instead of "uint64_t". Then the type matches what the
> return variable is (0 on success, small negative on failure) and
> also the type of the variable that all the callers put it into.
>
Thank you for pointing this out.
This started when I initially introduced fetch_pte() as an unused helper,
returning the PTE value directly. After feedback, I changed it to return an
"int", but a later patch in the series accidentally reverted back to
"uint64_t". Then I repeated the same wrong pattern in
amdvi_get_top_pt_level_and_perms().
I'll send a patch making both amdvi_get_top_pt_level_and_perms() and
fetch_pte() return "int".
Thank you,
Alejandro
> thanks
> -- PMM