Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM

Joel Fernandes Sat, 31 Jan 2026 12:03:06 -0800

Hi Dave,

> On Jan 30, 2026, at 10:01 PM, Dave Airlie <[email protected]> wrote:
> 
> On Sat, 31 Jan 2026 at 07:14, Joel Fernandes <[email protected]> wrote:
>> 
>> 
>> 
>>> On 1/29/2026 10:38 PM, John Hubbard wrote:
>>> On 1/29/26 5:59 PM, Joel Fernandes wrote:
>>>> On 1/29/26 8:12 PM, John Hubbard wrote:
>>>>> On 1/29/26 4:26 PM, Joel Fernandes wrote:
>>>>>> Based on the below discussion and research, I came up with some deadlock
>>>>>> scenarios that we need to handle in the v6 series of these patches.
>>>>>> [...]
>>>>>> memory allocations under locks that we need in the dma-fence signaling
>>>>>> critical path (when doing the virtual memory map/unmap)
>>>>> 
>>>>> unmap? Are you seeing any allocations happening during unmap? I don't
>>>>> immediately see any, but that sounds surprising.
>>>> 
>>>> Not allocations but we are acquiring locks during unmap. My understanding
>>>> is (at least some) unmaps have to also be done in the dma fence signaling
>>>> critical path (the run stage), but Danilo/you can correct me if I am wrong
>>>> on that. We cannot avoid all locking but those same locks cannot be held in
>>>> any other paths which do a memory allocation (as mentioned in one of the
>>>> deadlock scenarios), that is probably the main thing to check for unmap.
>>>> 
>>> 
>>> Right, OK we are on the same page now: no allocations happening on unmap,
>>> but it can still deadlock, because the driver is typically going to
>>> use a single lock to protect calls both map and unmap-related calls
>>> to the buddy allocator.
>> 
>> Yes exactly!
>> 
>>> 
>>> For the deadlock above, I think a good way to break that deadlock is
>>> to not allow taking that lock in a fence signaling calling path.
>>> 
>>> So during an unmap, instead of "lock, unmap/free, unlock" it should
>>> move the item to a deferred-free list, which is processed separately.
>>> Of course, this is a little complex, because the allocation and reclaim
>>> has to be aware of such lists if they get large.
>> Yes, also avoiding GFP_KERNEL allocations while holding any of these mm locks
>> (whichever we take during map). The GPU buddy actually does GFP_KERNEL
>> allocations internally which is problematic.
>> 
>> Some solutions / next steps:
>> 
>> 1. allocating (VRAM and system memory) outside mm locks just before 
>> acquiring them.
>> 
>> 2. pre-allocating both VRAM and system memory needed, before the DMA fence
>> critical paths (The issue is also to figure out how much memory to 
>> pre-allocate
>> for the page table pages based on the VM_BIND request. I think we can analyze
>> the page tables in the submit stage to make an estimate).
>> 
>> 3. Unfortunately, I am using gpu-buddy when allocating a VA range in the Vmm
>> (called virt_buddy), which itself does GFP_KERNEL memory allocations in the
>> allocate path. I am not sure what do yet about this. ISTR the maple tree also
>> has similar issues.
>> 
>> 4. Using non-reclaimable memory allocations where pre-allocation or
>> pre-allocated memory pools is not possible (I'd like to avoid this #4 so we
>> don't fail allocations when memory is scarce).
>> 
>> Will work on these issues for the v7. Thanks,
> 
> The way this works on nouveau at least (and I haven't yet read the
> nova code in depth).
> 
> Is we have 4 stages of vmm page table mgmt.
> 
> ref - locked with a ref lock - can allocate/free memory - just makes
> sure the page tables exist and are reference counted
> map - locked with a map lock - cannot allocate memory - fill in the
> PTEs in the page table
> unmap - locked with a map lock - cannot allocate memory - removes
> entries in PTEs
> unref - locked with a ref lock - can allocate/free memory - just drops
> references and frees (not sure if it ever merges).


Thanks for sharing this, yes this is similar to what I am coming up with.

One thing is OpenRM (and the Linux kernel) have finer grained locking.

But I think we can keep it simple initially like we do for Nouveau with 
additional complexity progressively added.

Joel Fernandes


> 
> So maps and unmaps can be in fence signalling paths, but unrefs are
> done in free job from a workqueue.
> 
> Dave.
>> 
>> --
>> Joel Fernandes
>>

Re: [PATCH RFC v6 05/26] nova-core: mm: Add support to use PRAMIN windows to write to VRAM

Reply via email to