On Wed, Oct 14, 2009 at 3:02 PM, Nicolai Hähnle <[email protected]> wrote:
> Alex, I added you to the CC in case you can help clarify the points on R500
> vertex programs.
>
> On Wednesday 14 October 2009 08:20:42, Ian Romanick wrote:
>> > Issue 2:
>> > 1) R500 supports unstructured branching in fragment programs but not in
>> > vertex programs, so I'm happy about leaving it out.
>>
>> Weird.  That's backwards from how other SM3 GPUs do it.  Usually you get
>> unstructured branching in the AoS vertex shader.
>
> I agree. To be honest, the vertex processor documentation for R500 confuses
> the hell out of me. The way it is written suggests that there is a JUMP
> instruction that can only jump based on a constant register, which seems
> extremely bizarre. The documentation is quite consistent about it, though:
> it tells us to use conditional write instructions to implement if-else
> statements.
>
> Maybe there is a part which is simply missing? I also see neither JUMP nor
> LOOP opcodes anywhere, just registers describing the first and last
> instruction pointer for a loop.
>

Flow control is not part of the actual shader code per se; it is
implemented in parallel with the vertex shader and interacts with it. See
the VAP_PVS_FLOW_CNTL_* regs.  r3xx-r5xx all have them, so it should be
possible on all three generations; r5xx supports longer programs, however.

Alex

>
>> > 2) R500 supports address registers as described in vertex programs
>> > (including input/output offsets), but has no address registers at all in
>> > fragment programs. A loop address register can be used as offsets in
>> > loops, but the values loaded into this register must be determined at
>> > compile time.
>>
>> I had intended to move the grammar for ARL and ARR out of the generic
>> GPU grammar and into the vertex program-specific grammar.  The intention
>> is that LOOP/ENDLOOP is the only way to load an address register in a
>> fragment program.  LOOP/ENDLOOP set the .x component and leave the other
>> components undefined.  Since the ENDLOOP restores the "previous" value
>> of the address register, the last ENDLOOP restores garbage.  My
>> intention was to provide consistent syntactic sugar over the constrained
>> functionality of the loop index.
>
> Sounds good.
>
> <snip>
>> > I think we can do everything you throw at us on R500. The only difficulty
>> > is that R500 is a bit schizophrenic in that vertex programs are very
>> > different from fragment programs, but we can emulate things. The only
>> > stupid weakness is that swizzling predicates in fragment programs is
>> > essentially impossible (the only natively supported swizzles are .rgba
>> > and the smears .rrrr, .gggg, .bbbb, .aaaa). Obviously we can emulate
>> > this.
>>
>> How painful would it be to emulate?  We could restrict the set of
>> available predicate swizzles.  I think this matches D3D, so it shouldn't
>> be a problem for Wine.
>
> I'd always be happier if I didn't have to do it, but it's certainly easier
> than what we're already doing for R300 fragment programs anyway. The question
> is whether you want to add a fragment-program-only restriction to the provided
> swizzles. I don't feel very strongly either way.
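
For what it's worth, one way such an emulation could go is to build an arbitrary swizzle out of the native smears, one masked move per distinct source component. This is only a rough sketch in Python: `emulate_swizzle`, the register names and the emitted MOV text are made up for illustration, and it assumes the value can be copied with per-component write masks into a destination that does not alias the source.

```python
# Swizzles the hardware is said to support natively in fragment programs.
NATIVE = {"rgba", "rrrr", "gggg", "bbbb", "aaaa"}


def emulate_swizzle(dst, src, swizzle):
    """Return pseudo-instructions computing dst = src.<swizzle>.

    Hypothetical helper: dst must not alias src, otherwise an earlier
    move would clobber a component that a later move still needs.
    """
    if len(swizzle) != 4 or any(c not in "rgba" for c in swizzle):
        raise ValueError("swizzle must be four components out of r, g, b, a")
    if swizzle in NATIVE:
        return [f"MOV {dst}, {src}.{swizzle};"]
    # Group destination components by the source component they read,
    # so each smear is emitted once with a combined write mask.
    groups = {}
    for dst_comp, src_comp in zip("rgba", swizzle):
        groups.setdefault(src_comp, []).append(dst_comp)
    return [
        f"MOV {dst}.{''.join(mask)}, {src}.{comp * 4};"
        for comp, mask in groups.items()
    ]
```

In the worst case (all four source components distinct and out of order) this costs four moves; a swizzle such as .rrgg needs two.
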
>
>> > Issue 11:
>> > R500 supposedly supports relative addressing of temporary registers in
>> > vertex programs, and also in fragment programs (but only using loop
>> > indices). I have never tested whether it actually works, though.
>>
>> This would be a good feature to have.  Would it be possible to hack up a
>> test?  Do you know of any limitations?
>
> Will do this weekend, at least for vertex programs; I don't know of any
> limitations.
>
> I don't know if I'll get to hacking something up for fragment programs soon,
> because that's slightly more involved (I haven't done fragment program loops
> yet).
>
>
>> > Issue 13:
>> > Similar to issue 2, R500 fragment programs support unstructured
>> > everything but vertex programs don't, so not overlapping sounds good to
>> > me.
>> >
>> > Issue 15:
>> > I know R500 fragment programs can support a CONT, but I'm not so familiar
>> > with the R500 vertex programs, and they seem generally less flexible.
>>
>> I didn't see an explicit CONT instruction.  If there's no unstructured
>> branch, there probably isn't a way to do it.
>>
>> > Issue 17:
>> > I would *expect* negative addressing offsets to work on R500, but somehow
>> > I haven't been able to get them to work. I'll see if I can look into it
>> > again.
>>
>> No hardware that I'm aware of supports true negative offsets in the
>> instructions.  This is made to work with program parameters by putting
>> the base of the array at a large enough positive offset to make the
>> largest negative offset be zero.  For example, if the program uses
>> my_array[A0.x - 10], the driver has to place my_array at parameter slot
>> 10 or higher.
>
> I see.
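
To make the placement rule above concrete, here is a small Python sketch of the arithmetic. The function names are hypothetical and this is not how any particular driver does it; it only restates the rule that the array base must sit far enough into the parameter file that the most negative constant offset still lands on slot 0 or above.

```python
def min_base_slot(offsets):
    """Smallest parameter slot at which an array may start so that every
    constant offset used with it resolves to a non-negative slot."""
    most_negative = min(offsets, default=0)
    return max(0, -most_negative)


def rebase_offset(offset, base_slot):
    """Offset the hardware instruction would encode once the array
    has been placed at base_slot."""
    return base_slot + offset
```

With the my_array[A0.x - 10] example, min_base_slot([-10]) gives 10, and the encoded offset becomes 0.
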
>
>> I don't think we can do similar trickery for attributes and results.  I
>> think we may have to leave the negative offsets just for program
>> parameters and only allow positive offsets for attributes and results.
>> Note that NV_gpu_program4 only allows positive offsets.  It can get away
>> with this because SM4 has general purpose integer instructions and any
>> register can be used for indirect addressing.
>
> Well, one possible trick that I believe Corbin suggested was transforming:
>
> ARL A0.x, R.x;
> MOV R, CONST[A0.x - 5];
>
> into:
>
> SUB TMP.x, R.x, 5;
> ARL A0.x, TMP.x;
> MOV R, CONST[A0.x];
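
A rough Python sketch of that rewrite, for illustration only. `fold_negative_offset` and its parameters are invented names, and the sketch assumes the adjusted value in the temporary is what gets loaded into A0.x. Since ARL takes the floor and the offset is an integer, subtracting before the ARL gives the same address as subtracting after it.

```python
def fold_negative_offset(index_reg, array, offset, dst="R", tmp="TMP"):
    """Return pseudo-instructions for dst = array[floor(index_reg) + offset].

    Non-negative offsets are left in the operand; negative ones are folded
    into the value loaded into A0.x. Hypothetical helper, not driver code.
    """
    if offset >= 0:
        suffix = f" + {offset}" if offset else ""
        return [
            f"ARL A0.x, {index_reg};",
            f"MOV {dst}, {array}[A0.x{suffix}];",
        ]
    return [
        f"SUB {tmp}.x, {index_reg}, {-offset};",  # fold the offset in first
        f"ARL A0.x, {tmp}.x;",                    # A0.x = floor(index) + offset
        f"MOV {dst}, {array}[A0.x];",
    ]
```

One catch: after the rewrite A0.x holds the adjusted index, so any other instruction that reuses the same address register value would need its own offset corrected.
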
>
>> > Issue 34:
>> > I don't see any support for an address register stack on R500, or
>> > anything else to provide for a subroutine stack.
>>
>> If you can do relative addressing of temporaries, you can fake a small
>> stack.  It's ugly, but it's possible.  Of course, without address
>> register math it's even more ugly.
>
> True, that's a good argument in favour of relative addressing of temporaries.
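
As a rough illustration of the faked stack, here is a Python simulation under stated assumptions: a block of temporaries is reserved for the stack, the stack pointer lives in an ordinary temporary, and it is reloaded into the address register around every access because there is no address register math. The class and its fields are hypothetical; the comments show the kind of instruction each step would stand for.

```python
class TempStack:
    """Simulates a small stack kept in relatively addressed temporaries."""

    def __init__(self, temps, base, depth):
        self.temps = temps   # the temporary register file (a list here)
        self.base = base     # first temporary reserved for the stack
        self.depth = depth   # number of temporaries reserved
        self.sp = 0.0        # stack pointer held in an ordinary temporary

    def push(self, value):
        a0 = int(self.sp)                    # ARL A0.x, SP.x
        if a0 >= self.depth:
            # Simulation-only check; hardware would silently overrun.
            raise OverflowError("stack block exhausted")
        self.temps[self.base + a0] = value   # MOV TEMP[A0.x + base], value
        self.sp += 1.0                       # ADD SP.x, SP.x, 1

    def pop(self):
        if self.sp < 1.0:
            # Simulation-only check, as above.
            raise IndexError("pop from empty stack")
        self.sp -= 1.0                       # SUB SP.x, SP.x, 1
        a0 = int(self.sp)                    # ARL A0.x, SP.x
        return self.temps[self.base + a0]    # MOV value, TEMP[A0.x + base]
```

Every push or pop costs an extra ARL plus an ADD or SUB on the ordinary temporary, which is the "even more ugly" part.
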
>
>> I'll post an updated version in the morning with the grammar change (for
>> ARL and ARR) and the documentation for the other predicate-set
>> instructions.
>
>
>

_______________________________________________
Mesa3d-dev mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
