Quoting Scott D Phillips (2018-01-23 14:42:43)
> TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
> Thus a cache line in the tiled surface is composed of a 2d area of
> 16x4 bytes of the linear surface.
> 
> Add a special case where the area being copied is 4-line aligned
> and a multiple of 4-lines so that entire cache lines will be
> written at a time.
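
For readers following the address math, here is a minimal sketch of how those bits map to a byte offset inside one Y tile. Only the low 6 bits come from the description above. The tile geometry (128 bytes wide, 32 rows, 16-byte columns), the names ytile_offset and rows_cover_whole_cache_lines, and the decision to ignore bit-6 swizzling are assumptions made for illustration, not the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed Y-tile geometry; only the low 6 address bits are stated in the thread. */
enum {
    YTILE_WIDTH  = 128, /* bytes per tile row */
    YTILE_HEIGHT = 32,  /* rows per tile */
    YTILE_SPAN   = 16,  /* bytes per column row (u3..u0) */
    CACHE_LINE   = 64   /* bytes per cache line */
};

/* Hypothetical helper: byte offset of (u, v) inside a single 4 KiB Y tile,
 * where u is the byte column and v is the row. Bit-6 swizzling is ignored. */
static uint32_t ytile_offset(uint32_t u, uint32_t v)
{
    assert(u < YTILE_WIDTH && v < YTILE_HEIGHT);
    return (u % YTILE_SPAN)                              /* bits 3:0  = u3..u0 */
         + v * YTILE_SPAN                                /* bits 8:4  = v4..v0 */
         + (u / YTILE_SPAN) * YTILE_SPAN * YTILE_HEIGHT; /* bits 11:9 = u6..u4 */
}

/* Hypothetical predicate for the special case: rows [y0, y1) start on a
 * 4-row boundary and span a multiple of 4 rows, so every 16x4 block touched
 * fills a whole cache line. */
static bool rows_cover_whole_cache_lines(uint32_t y0, uint32_t y1)
{
    const uint32_t rows_per_line = CACHE_LINE / YTILE_SPAN; /* 64 / 16 = 4 */
    return y0 % rows_per_line == 0 && (y1 - y0) % rows_per_line == 0;
}
```

With this layout, u in 0..15 and v in 0..3 map to offsets 0..63, which is the 16x4 block that shares one cache line.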

Looks correct; the mechanics of aligning to the write-combining buffer
(WCB) are sound. You can also apply the same special case to
ytiled_to_linear and use movntdqa for fast readback from WC memory.
-Chris
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
