On Fri, Apr 8, 2016 at 12:26 PM, Hans de Goede wrote:
> Hi,
>
>
> On 08-04-16 18:06, Hans de Goede wrote:
>>
>> Hi,
>>
>> On 08-04-16 17:45, Ilia Mirkin wrote:
>>>
>>> On Fri, Apr 8, 2016 at 11:28 AM, Hans de Goede
>>> wrote:
When dealing with non vector variables the llvm register allocator
will use TEMP[0].x then TEMP[0].y, etc.
When loading something from a global buffer it will calculate the
address to use, and store that in say TEMP[0].x, so it ends up
generating:
LOAD TEMP[0].y, MEMORY[0], TEMP[0]
Expecting the contents of TEMP[0].y to become the 32 bits of data
to which TEMP[0].x is pointing. But instead it will get the 32 bits of
data at address (TEMP[0].x + 4).
With the old RES[32767] code one could generate the following TGSI:
LOAD TEMP[0].y, RES[32767]., TEMP[0]
And things would work fine since the . swizzling postfix would
be honored and when storing to y (the only component set in the
dest-mask)
the x component at address (TEMP[0].x) would be loaded, rather then the
y component at (TEMP[0].y)
Note that another approach would be to not increment the address by
a 32 bit word for skipped (not set in destmask) components.
The way I see it either:
1) We see that LOAD does not deal with vectors, but with flat memory,
in which case skipping 4 bytes because x is not set in the destmask
does not make sense, as that is a vector thing todo.
2) LOAD is vector layout aware in which case supporting swizzling
makes sense.
Currently we have a weird hybrid which is rather cumbersome to
work with from a compiler pov.
>>>
>>>
>>> And I guess LLVM never ends up generating any of the other "funny"
>>> instructions like LIT and the such. Well, I have no problem adding the
>>> swizzling logic, i.e. the way that LOAD will now work (logically) is
>>> that it will
>>>
>>> (a) fetch 4 values from the coordinates provided (4 sequential dwords
>>> from src1.x in the case of buffer/memory, RGBA colors from src1.xyz in
>>> the case of images)
>>> (b) swizzle them according to the swizzle on the MEMORY/BUFFER/IMAGE
>>> argument
>>> (c) store that swizzled result into the destination based on the
>>> writemask
>>>
>>> That would sound reasonable to me, and if I understand correctly, is
>>> option 2 of your proposal.
>>
>>
>> Yes that is option 2, and is basically what the patch which started this
>> thread does. So that would work for me :)
>>
>>> We'd need some docs updates and buy-in from the other gallium driver
>>> developers.
>>
>>
>> What docs would need updating ? The TGSI docs I'm aware of are at:
>>
>> http://gallium.readthedocs.org/en/latest/tgsi.html
>>
>> I assume those have a source in the mesa src somewhere (I've not looked),
>> but those mostly just look quite incomplete in general when it comes to
>> TGSI
>> (I've had to revert to figuring what things do from the mesa srcs quite
>> often)
>>
>> Have I been looking at the wrong docs perhaps ?
>>
>> Note that them being incomplete is not intended as an excuse to not
>> document
>> this, I'm all for better documentation.
>>
>>> STORE remains unchanged, as the MEMORY/etc is in the destination,
>>> where there is a writemask, which is presently used and will remain
>>> effective.
>>
>>
>> Right and note that the first src operand of STORE already takes swizzling
>> into account, so the proposed option 2 will actually make the 2 more
>> inline.
>
>
> Erm, I mean the 2nd src operand of the store of-course, the actual src.
>
> On a related note, comparing handleLOAD and handleSTORE, this bit in
> handleLOAD seems wrong:
>
> Value *off = fetchSrc(1, c);
>
> I believe that should be:
>
> Value *off = fetchSrc(1, 0);
Yep, that's wrong. I think I was waffling back and forth early on in
the lifetime of the patchset about how it would work, and one of the
options was to read each dword from a separate offset. (I think I
started implementing atomic buffers well over a year ago, only to be
stymied by the memory window issue and give up for a long time.) I
eventually came to realize that was insanity.
>
> Just like handleSTORE does:
>
> off = fetchSrc(0, 0);
>
> And always using a 'c' of 0 seems correct here since we are dealing
> with an address.
>
> Once I know which docs to update for this, I'll do a v2 of this patch
> and add a preparation patch fixing the above to the v2 set.
src/gallium/docs/source/tgsi.rst
There are push hooks which make readthedocs.org re-pull from the mesa
repo on every push so that things are up to date (well, it takes a few
minutes to regenerate the html).
-ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev