On 27/06/2022 19:58, Zeng, Oak wrote:


Thanks,
Oak

-----Original Message-----
From: Tvrtko Ursulin <tvrtko.ursu...@linux.intel.com>
Sent: June 27, 2022 4:30 AM
To: Zeng, Oak <oak.z...@intel.com>; Landwerlin, Lionel G
<lionel.g.landwer...@intel.com>; Vishwanathapura, Niranjana
<niranjana.vishwanathap...@intel.com>
Cc: Zanoni, Paulo R <paulo.r.zan...@intel.com>; intel-
g...@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Hellstrom,
Thomas <thomas.hellst...@intel.com>; Wilson, Chris P
<chris.p.wil...@intel.com>; Vetter, Daniel <daniel.vet...@intel.com>;
christian.koe...@amd.com; Auld, Matthew <matthew.a...@intel.com>
Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition


On 24/06/2022 21:23, Zeng, Oak wrote:
Let's compare "tlb invalidate at vm unbind" vs "tlb invalidate at backing
storage":

Correctness:
consider this sequence of:
1. unbind va1 from pa1,
2. then bind va1 to pa2. //user space has the freedom to do this as it
manages virtual address space 3. Submit shader code using va1, 4. Then
retire pa1.

If you don't perform tlb invalidate at step #1, in step #3, shader will use
stale entries in tlb and pa1 will be used for the shader. User want to use pa2.
So I don't think invalidate tlb at step #4 make correctness.

Define step 3. Is it a new execbuf? If so then there will be a TLB flush there.
Unless the plan is to stop doing that with eb3 but I haven't picked up on that
anywhere so far.

In Niranjana's latest patch series, he removed the TLB flushing from vm_unbind. 
He also said explicitly TLB invalidation will be performed at job submission 
and backing storage releasing time, which is the existing behavior of the 
current i915 driver.

I think if we invalidate TLB on each vm_unbind, then we don't need to 
invalidate at submission and backing storage releasing. It doesn't make a lot 
of sense to me to perform a tlb invalidation at execbuf time. Maybe it is a 
behavior for the old implicit binding programming model. For vm_bind and eb3, 
we separate the binding and job submission into two APIs. It is more natural 
the TLB invalidation be coupled with the vm bind/unbind, not job submission. So 
in my opinion we should remove tlb invalidation from submission and backing 
storage releasing and add it to vm unbind. This is method is cleaner to me.

You can propose this model (not flushing in eb3) but I have my doubts. Consider the pointlessness of flushing on N unbinds for 99% of clients which are not infinite compute batch. And consider how you make the behaviour consistent on all platforms (selective vs global tlb flush).

Also note that this discussion is orthogonal to unbind vs backing store release.

Regarding performance, we don't have data. In my opinion, we should make things 
work in a most straight forward way as the first step. Then consider 
performance improvement if necessary. Consider some delayed tlb invalidation at 
submission and backing release time without performance data support wasn't a 
good decision.

It is quite straightforward though. ;) It aligns with the eb2 model and argument can be made backing store release is (much) less frequent than unbind (consider softpin where client could trigger a lot of pointless flushes).

Regards,

Tvrtko

Reply via email to