On 27/02/17 22:00, Michael Ellerman wrote:
> Alexey Kardashevskiy <a...@ozlabs.ru> writes:
> 
>> The IODA2 specification says that a 64-bit DMA address cannot use the
>> top 4 bits (3 are reserved and one is a "TVE select"); the bottom
>> page_shift bits cannot be used for multilevel table addressing either.
>>
>> The existing IODA2 table allocation code aligns the minimum TCE table
>> size to PAGE_SIZE so in the case of 64K system pages and 4K IOMMU pages,
>> we have 64-4-12=48 bits. Since a 64K page stores 8192 TCEs, i.e. needs
>> 13 bits per level, the maximum number of levels is 48/13 = 3 (rounded
>> down), so we physically cannot address more and EEH happens on DMA
>> accesses.
>>
>> This adds a check that rejects requests for too many levels.
>>
>> It is still possible to have 5 levels in the case of 4K system page size.
>>
>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>> ---
>>
>> The alternative would be allocating TCE tables as big as PAGE_SIZE but
>> only using part of each, but this would somewhat complicate the code
>> responsible for the overall amount of memory used for the TCE table.
>>
>> Or kmem_cache_create() could be used to allocate TCE table levels only
>> as big as we really need but that API does not seem to support NUMA nodes.
> 
> kmem_cache_alloc_node() ?


Yeah, discovered this later. Still, if a single level is used, then the
table is 4MB and kmem_cache_alloc_node() does not seem the right tool here
(although I cannot find any enforced upper limit).
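
For reference, the 4MB figure can be reproduced with a few lines of C.
This is only a sketch: it assumes 8-byte TCEs (which follows from 8192
TCEs fitting in a 64K page) and a 2GB DMA window with 4K IOMMU pages.
The window size is an assumption made for this example, and
tce_table_bytes() is a made-up helper, not a kernel function.

```c
#include <stdint.h>

#define TCE_ENTRY_SIZE 8ULL	/* bytes per TCE: 64K page / 8192 TCEs */

/*
 * Hypothetical helper: size in bytes of a single-level TCE table that
 * maps window_size bytes of DMA space using IOMMU pages of
 * (1 << iommu_page_shift) bytes.
 */
static uint64_t tce_table_bytes(uint64_t window_size,
				unsigned int iommu_page_shift)
{
	/* one TCE per IOMMU page in the window */
	return (window_size >> iommu_page_shift) * TCE_ENTRY_SIZE;
}
```

With a 2GB window and a page shift of 12 this gives 512K entries of
8 bytes each, i.e. 4MB, which is 64 contiguous 64K system pages.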

So to keep things simpler, I decided to stick to alloc_pages_node() and
avoid mixing memory allocation APIs.
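
The level limit from the patch description can be sketched the same
way. Again this only illustrates the arithmetic and is not the code in
the patch; max_tce_levels() and the macro names are made up for the
example, and it assumes each level occupies exactly one system page of
8-byte TCEs.

```c
#define DMA_ADDR_BITS		64
#define DMA_RESERVED_TOP_BITS	4	/* 3 reserved + 1 "TVE select" */
#define TCE_ENTRY_SHIFT		3	/* 8-byte TCEs */

/*
 * Hypothetical helper: how many table levels fit into the usable part
 * of a DMA address when every level is one system page in size.
 */
static unsigned int max_tce_levels(unsigned int sys_page_shift,
				   unsigned int iommu_page_shift)
{
	/* bits left after the top 4 bits and the in-page offset */
	unsigned int usable_bits = DMA_ADDR_BITS - DMA_RESERVED_TOP_BITS -
				   iommu_page_shift;
	/* index bits consumed by one level, e.g. 16 - 3 = 13 for 64K */
	unsigned int bits_per_level = sys_page_shift - TCE_ENTRY_SHIFT;

	return usable_bits / bits_per_level;	/* rounds down */
}
```

For 64K system pages and 4K IOMMU pages this is 48 / 13 = 3 levels; for
4K system pages it is 48 / 9 = 5, which matches the numbers above.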


-- 
Alexey
