On 12/02/2019 11:20, David Gibson wrote:
> On Mon, Feb 11, 2019 at 06:48:01PM +1100, Alexey Kardashevskiy wrote:
>> We store 2 multilevel tables in iommu_table - one for the hardware and
>> one with the corresponding userspace addresses. Before allocating
>> the tables, the iommu_table_group_ops::get_table_size() hook returns
>> the combined size of the two and VFIO SPAPR TCE IOMMU driver adjusts
>> the locked_vm counter correctly. When the table is actually allocated,
>> the amount of allocated memory is stored in iommu_table::it_allocated_size
>> and used to adjust the locked_vm counter when we release the memory used
>> by the table; .get_table_size() and .create_table() calculate it
>> independently but the result is expected to be the same.
> 
> Any way we can remove that redundant calculation?  That seems like
> begging for bugs.


I do not see an easy way. One way could be adding a "dryrun" flag to
pnv_pci_ioda2_table_alloc_pages(), count allocated memory there and call
it from .get_table_size() but for multilevel TCEs it only allocates
first level...


>> Unfortunately the allocator does not add the userspace table size to
>> ::it_allocated_size so when we destroy the table because of VFIO PCI
>> unplug (i.e. VFIO container is gone but the userspace keeps running),
>> we decrement locked_vm by just a half of size of memory we are releasing.
>> As the result, we leak locked_vm and may not be able to allocate more
>> IOMMU tables after few iterations of hotplug/unplug.
>>
>> This adjusts it_allocated_size if the userspace addresses table was
>> requested (total_allocated_uas is initialized by zero).
>>
>> Fixes: 090bad39b "powerpc/powernv: Add indirect levels to it_userspace"
>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> 
> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>
> 
>> ---
>>  arch/powerpc/platforms/powernv/pci-ioda-tce.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c 
>> b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>> index 697449a..58146e1 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>> @@ -313,7 +313,7 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 
>> bus_offset,
>>                      page_shift);
>>      tbl->it_level_size = 1ULL << (level_shift - 3);
>>      tbl->it_indirect_levels = levels - 1;
>> -    tbl->it_allocated_size = total_allocated;
>> +    tbl->it_allocated_size = total_allocated + total_allocated_uas;
>>      tbl->it_userspace = uas;
>>      tbl->it_nid = nid;
>>  
> 

-- 
Alexey

Reply via email to