Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
On 2018-09-14 01:52 PM, Christian König wrote:
Am 14.09.2018 um 19:47 schrieb Philip Yang:
On 2018-09-14 03:51 AM, Christian König wrote:
Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
On 2018-09-13 04:52 PM, Philip Yang wrote:
[SNIP]
   +    amdgpu_mn_read_unlock(amn);
+
amdgpu_mn_read_lock/unlock support recursive locking for multiple
overlapping or nested invalidation ranges. But if you'r locking and
unlocking in the same function. Is that still a concern?
I don't understand the possible recursive case, but
amdgpu_mn_read_lock() still support recursive locking.
Well the real problem is that unlocking them here won't work.

We need to hold the lock until we are sure that the operation which
updates the page tables is completed.

The reason for this change is because hmm mirror has invalidate_start
callback, no invalidate_end callback

Check mmu_notifier.c and hmm.c again, below is entire logic to update
CPU page tables and callback:

mn lock amn->lock is used to protect interval tree access because
user may submit/register new userptr anytime.
This is same for old and new way.

step 2 guarantee the GPU operation is done before updating CPU page
table.

So I think the change is safe. We don't need hold mn lock until the
CPU page tables update is completed.
No, that isn't even remotely correct. The lock doesn't protects the
interval tree.

Old:
    1. down_read_non_owner(&amn->lock)
    2. loop to handle BOs from node->bos through interval tree
amn->object nodes
        gfx: wait for pending BOs fence operation done, mark user
pages dirty
        kfd: evict user queues of the process, wait for queue
unmap/map operation done
    3. update CPU page tables
    4. up_read(&amn->lock)

New, switch step 3 and 4
    1. down_read_non_owner(&amn->lock)
    2. loop to handle BOs from node->bos through interval tree
amn->object nodes
        gfx: wait for pending BOs fence operation done, mark user
pages dirty
        kfd: evict user queues of the process, wait for queue
unmap/map operation done
    3. up_read(&amn->lock)
    4. update CPU page tables
The lock is there to make sure that we serialize page table updates
with command submission.
As I understand it, the idea is to prevent command submission (adding
new fences to BOs) while a page table invalidation is in progress.

Yes, exactly.

But do we really need another lock for this? Wouldn't the re-validation of
userptr BOs (currently calling get_user_pages) force synchronization
with the ongoing page table invalidation through the mmap_sem or other
MM locks?

No and yes. We don't hold any other locks while doing command submission, but I expect that HMM has its own mechanism to prevent that.

Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not using this mechanism correctly.

Regards,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to