On Fri, Apr 8, 2016 at 12:26 PM, Hans de Goede <hdego...@redhat.com> wrote: > Hi, > > > On 08-04-16 18:06, Hans de Goede wrote: >> >> Hi, >> >> On 08-04-16 17:45, Ilia Mirkin wrote: >>> >>> On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede <hdego...@redhat.com> >>> wrote: >>>> >>>> When dealing with non vector variables the llvm register allocator >>>> will use TEMP[0].x then TEMP[0].y, etc. >>>> >>>> When loading something from a global buffer it will calculate the >>>> address to use, and store that in say TEMP[0].x, so it ends up >>>> generating: >>>> >>>> LOAD TEMP[0].y, MEMORY[0], TEMP[0] >>>> >>>> Expecting the contents of TEMP[0].y to become the 32 bits of data >>>> to which TEMP[0].x is pointing. But instead it will get the 32 bits of >>>> data at address (TEMP[0].x + 4). >>>> >>>> With the old RES[32767] code one could generate the following TGSI: >>>> >>>> LOAD TEMP[0].y, RES[32767].xxxx, TEMP[0] >>>> >>>> And things would work fine since the .xxxx swizzling postfix would >>>> be honored and when storing to y (the only component set in the >>>> dest-mask) >>>> the x component at address (TEMP[0].x) would be loaded, rather then the >>>> y component at (TEMP[0].y) >>>> >>>> Note that another approach would be to not increment the address by >>>> a 32 bit word for skipped (not set in destmask) components. >>>> >>>> The way I see it either: >>>> >>>> 1) We see that LOAD does not deal with vectors, but with flat memory, >>>> in which case skipping 4 bytes because x is not set in the destmask >>>> does not make sense, as that is a vector thing todo. >>>> >>>> 2) LOAD is vector layout aware in which case supporting swizzling >>>> makes sense. >>>> >>>> Currently we have a weird hybrid which is rather cumbersome to >>>> work with from a compiler pov. >>> >>> >>> And I guess LLVM never ends up generating any of the other "funny" >>> instructions like LIT and the such. Well, I have no problem adding the >>> swizzling logic, i.e. the way that LOAD will now work (logically) is >>> that it will >>> >>> (a) fetch 4 values from the coordinates provided (4 sequential dwords >>> from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in >>> the case of images) >>> (b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE >>> argument >>> (c) store that swizzled result into the destination based on the >>> writemask >>> >>> That would sound reasonable to me, and if I understand correctly, is >>> option 2 of your proposal. >> >> >> Yes that is option 2, and is basically what the patch which started this >> thread does. So that would work for me :) >> >>> We'd need some docs updates and buy-in from the other gallium driver >>> developers. >> >> >> What docs would need updating ? The TGSI docs I'm aware of are at: >> >> http://gallium.readthedocs.org/en/latest/tgsi.html >> >> I assume those have a source in the mesa src somewhere (I've not looked), >> but those mostly just look quite incomplete in general when it comes to >> TGSI >> (I've had to revert to figuring what things do from the mesa srcs quite >> often) >> >> Have I been looking at the wrong docs perhaps ? >> >> Note that them being incomplete is not intended as an excuse to not >> document >> this, I'm all for better documentation. >> >>> STORE remains unchanged, as the MEMORY/etc is in the destination, >>> where there is a writemask, which is presently used and will remain >>> effective. >> >> >> Right and note that the first src operand of STORE already takes swizzling >> into account, so the proposed option 2 will actually make the 2 more >> inline. > > > Erm, I mean the 2nd src operand of the store of-course, the actual src. > > On a related note, comparing handleLOAD and handleSTORE, this bit in > handleLOAD seems wrong: > > Value *off = fetchSrc(1, c); > > I believe that should be: > > Value *off = fetchSrc(1, 0);
Yep, that's wrong. I think I was waffling back and forth early on in the lifetime of the patchset about how it would work, and one of the options was to read each dword from a separate offset. (I think I started implementing atomic buffers well over a year ago, only to be stymied by the memory window issue and give up for a long time.) I eventually came to realize that was insanity. > > Just like handleSTORE does: > > off = fetchSrc(0, 0); > > And always using a 'c' of 0 seems correct here since we are dealing > with an address. > > Once I know which docs to update for this, I'll do a v2 of this patch > and add a preparation patch fixing the above to the v2 set. src/gallium/docs/source/tgsi.rst There are push hooks which make readthedocs.org re-pull from the mesa repo on every push so that things are up to date (well, it takes a few minutes to regenerate the html). -ilia _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev