On 2019-03-06 9:42 p.m., Yang, Philip wrote:
Userptr restore may race with a concurrent userptr invalidation after
hmm_vma_fault adds the range to the hmm->ranges list. In that case it
needs to call hmm_vma_range_done to remove the range from the hmm->ranges
list first, then reschedule the restore worker. Otherwise hmm_vma_fault
will add the same range to the list, this will
Hi Felix,
Thanks. There are other corner cases, so calling untrack at the end of the
restore userptr worker is a better place to clean up. I will submit a v2
patch to fix this issue completely.
Philip
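
To illustrate the corner case discussed in this thread, here is a minimal toy model in C. It is a sketch under stated assumptions, not the amdgpu or HMM code: every name in it (toy_mirror, toy_track, toy_untrack, toy_restore_once) is hypothetical. toy_track stands in for hmm_vma_fault adding a range to the list, toy_untrack stands in for hmm_vma_range_done removing it, and the "untrack at the end" flag models the single cleanup point suggested for the restore worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the kernel's HMM/amdgpu API. */
#define MAX_TRACKED 8

struct toy_range { unsigned long start, end; };

struct toy_mirror {
    const struct toy_range *tracked[MAX_TRACKED]; /* models hmm->ranges */
    size_t count;
};

/* Models hmm_vma_fault adding the range to the tracking list. */
static void toy_track(struct toy_mirror *m, const struct toy_range *r)
{
    assert(m->count < MAX_TRACKED);
    m->tracked[m->count++] = r;
}

/* Models hmm_vma_range_done: removes one entry for r, if present. */
static void toy_untrack(struct toy_mirror *m, const struct toy_range *r)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->tracked[i] == r) {
            m->tracked[i] = m->tracked[--m->count];
            return;
        }
    }
}

/* Counts how many times r currently appears on the list. */
static size_t toy_count(const struct toy_mirror *m, const struct toy_range *r)
{
    size_t n = 0;
    for (size_t i = 0; i < m->count; i++)
        if (m->tracked[i] == r)
            n++;
    return n;
}

/*
 * One restore attempt. Returns true on success; false means the caller
 * should reschedule the worker. With untrack_at_end set, the range is
 * removed on every exit path (an assumption of this toy model).
 */
static bool toy_restore_once(struct toy_mirror *m, const struct toy_range *r,
                             bool invalidated, bool untrack_at_end)
{
    toy_track(m, r);
    bool ok = !invalidated;   /* a concurrent invalidation forces a retry */
    if (untrack_at_end)
        toy_untrack(m, r);
    return ok;
}
```

In this model, a failed attempt followed by a retry without cleanup leaves the same range on the list twice, while cleaning up at the end of every attempt leaves the list empty regardless of which path failed.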
On 2019-03-06 3:01 p.m., Kuehling, Felix wrote:
Hmm, I'm not sure. This change probably fixes this issue, but there may
be other similar corner cases in other situations where the restore
worker fails and needs to retry. The better place to call untrack in
amdgpu_amdkfd_restore_userptr_worker would be at the very end. Anything
that's left