On Fri, Jul 7, 2017 at 9:37 PM, Dave Airlie <airl...@gmail.com> wrote:
> On 8 July 2017 at 04:07, Christian König <deathsim...@vodafone.de> wrote:
>> On 07.07.2017 at 18:51, Marek Olšák wrote:
>>>
>>> On Fri, Jul 7, 2017 at 11:18 AM, Christian König
>>> <deathsim...@vodafone.de> wrote:
>>>>
>>>> What tiling format have the destination textures?
>>>>
>>>> Sounds like the offset is just added so that we distribute memory
>>>> accesses more equally over memory channels.
>>>
>>> You can't set an offset that is not aligned. The hardware ignores the
>>> low unaligned bits, so they have a different meaning. They specify
>>> pipe and bank rotation for macro tiling. It's like a state. It
>>> basically rotates the tile pattern.
>>
>>
>> Yeah, I know. That's what I meant with distributing memory accesses more
>> equally over all channels. The lower bits select a memory bank swizzle IIRC.
>>
>> I've tried years ago with R600 if shuffling them randomly could improve
>> performance, but MRT wasn't widely used and/or supported at that time.
>
> I'd known this and forgotten, the public CIK docs say bits 0..7 must be zero,
> but I have older docs which had more info. It would be nice if we could get
> proper docs released for the bottom bits considering AMD are using them in
> their drivers.

The low 8 bits of the address are unused and can't be set, because
CB_COLOR0_BASE is shifted by 8 bits. We are really talking about bits
starting from 8 going higher. E.g. 8K alignment gives you 5 bits that
can be used to express the rotation.

>
> It would be good to know what registers have the bits that matter (i.e. BASE,
> FMASK, CMASK, DCC, and resource descriptors.)
>
> Then I suppose we'd need to know the algorithm for programming them, and
> if we need to make any allocations bigger in order to do so.
>
> I expect this only starts to matter when we hit memory bandwidth limits,
> the deferred demo does 3 MRT, one depth at 2kx2k then samples from those
> down to 1280x720 displayed. This combined with a 3 instanced 57k vertex
> draw seemed to be enough to see the pain. (Maybe a GL example doing something
> similar might show the problem for radeonsi).

Addrlib contains the encoding code for the base address pipe/bank bits.

>
> The other open question I have, is does this just matter for MRT or does
> texture sampling also get some boost from it, my hack patch does it for only
> surfaces which will end up attached to the CB.

Yes, it should be done for read-only textures too.

>
> I'll update the patch to not call it an offset but name them the tile
> rotation bits.

The proper name is "tile swizzle" or "pipe/bank swizzle". On gfx9,
it's called "pipe/bank xor".

Marek
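
To illustrate the arithmetic behind "8K alignment gives you 5 bits": if the
base register holds the byte address shifted right by 8, then a surface
aligned to 2^13 bytes leaves the low 13 - 8 = 5 bits of the register field
free for the swizzle. The sketch below is only that arithmetic. The helper
names (swizzle_bits, pack_base) are made up for this example, and it does not
reproduce how addrlib actually derives the pipe/bank bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the mail: the base register stores (address >> 8). */
#define BASE_SHIFT 8

/* Hypothetical helper: number of low bits in the shifted base field that a
 * power-of-two byte alignment leaves free for the tile swizzle. */
static unsigned swizzle_bits(uint64_t alignment)
{
    unsigned log2_align = 0;

    while ((alignment >> log2_align) > 1)
        log2_align++;

    return log2_align > BASE_SHIFT ? log2_align - BASE_SHIFT : 0;
}

/* Hypothetical helper: build the register value from an aligned address and
 * a swizzle value. Swizzle bits that do not fit in the free range are
 * dropped by the mask. */
static uint32_t pack_base(uint64_t va, uint64_t alignment, uint32_t swizzle)
{
    uint64_t mask = ((uint64_t)1 << swizzle_bits(alignment)) - 1;

    /* The address itself must be aligned, so its low bits are zero. */
    assert((va & (alignment - 1)) == 0);

    return (uint32_t)(va >> BASE_SHIFT) | (uint32_t)(swizzle & mask);
}
```

With these helpers, swizzle_bits(8192) evaluates to 5, and
pack_base(0x100000, 8192, 0x15) yields 0x1015: the shifted base 0x1000 with
the swizzle in its five low bits.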