Quoting Scott D Phillips (2018-01-23 14:42:43)
> TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
> Thus a cache line in the tiled surface is composed of a 2d area of
> 16x4 bytes of the linear surface.
>
> Add a special case where the area being copied is 4-line aligned
> and a multiple of 4-lines so that entire cache lines will be
> written at a time.
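
For reference, a minimal sketch of the bit layout quoted above, in plain C. It is not taken from the patch; the helper names (ytile_low6, rows_cover_whole_cachelines, ytile_cacheline_bytes_covered) are hypothetical and exist only for illustration. It assumes u is the byte column and v is the row within the tile, as in the quoted description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low 6 bits of a TileY address for byte column u and row v, following
 * the quoted layout "v1 v0 u3 u2 u1 u0":
 *   bits [3:0] = u[3:0], bits [5:4] = v[1:0].
 * Hypothetical helper, not a Mesa function. */
static uint32_t ytile_low6(uint32_t u, uint32_t v)
{
    return ((v & 0x3u) << 4) | (u & 0xfu);
}

/* True when the row span [y0, y1) starts on a 4-row boundary and covers
 * a whole number of 4-row groups, i.e. the "4-line aligned and a
 * multiple of 4-lines" condition from the quoted text.
 * Hypothetical helper, not a Mesa function. */
static bool rows_cover_whole_cachelines(uint32_t y0, uint32_t y1)
{
    assert(y1 >= y0);
    return (y0 % 4 == 0) && ((y1 - y0) % 4 == 0);
}

/* Count how many distinct offsets of one 64-byte cache line are touched
 * by a 16-byte-wide, 4-row block of the linear surface. */
static unsigned ytile_cacheline_bytes_covered(void)
{
    bool seen[64] = { false };
    unsigned count = 0;

    for (uint32_t v = 0; v < 4; v++) {
        for (uint32_t u = 0; u < 16; u++) {
            uint32_t off = ytile_low6(u, v);
            if (!seen[off]) {
                seen[off] = true;
                count++;
            }
        }
    }
    return count;
}
```

Under these assumptions, a 16x4-byte block maps one-to-one onto the 64 offsets of a single cache line, which is why a copy that is 4-row aligned and a multiple of 4 rows tall can fill whole cache lines.
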

Looks correct (the mechanics of aligning to the WCB are sound). You can
also apply the same to ytiled_to_linear and use movntdqa for fast WC
readback.
-Chris