On 02/25/2014 09:38 AM, Eric Anholt wrote:
> Matt Turner <matts...@gmail.com> writes:
> 
>> On Mon, Feb 24, 2014 at 10:15 AM, Eric Anholt <e...@anholt.net> wrote:
>>> I think we would do better by emitting
>>> ADD(y_minus_x, y, negate(x))
>>> MAC(dst, x, y_minus_x, a)
>>
>> MAC only takes two arguments, so
>>  - if you meant MAD, there's no MAD on platforms that don't have LRP
>>  - if you meant MAC(dst, ...) I don't see a way of doing it in only two
>> instructions, but we could do
>>
>> MOV(acc, x)
>> ADD(y_minus_x, y, negate(x))
>> MAC(dst, y_minus_x, a)
> 
> Oops, yeah, I was still thinking in terms of MAD.  This should still be
> better I think, while being an obvious translation of the LRP
> instruction:
> 
> ADD one_minus_a, negate(a), 1.0f
> MUL null, y, a
> MAC dst, x, one_minus_a
> 
> (multiplying y * a first to slightly reduce the stall pressure from
> one_minus_a)

Nice.  I agree this is better, but it's harder than you think.  We would
have to:

1. Create a MAC() emitter.
2. Add BRW_OPCODE_MAC to vec4_generator.
3. Add a new "enable accumulator writes" flag to vec4_instruction
   and make vec4_generator respect that.  (The MUL needs this.)
4. Fix up dead code elimination and other things to know about implicit
   accumulator writes.
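
To illustrate why items 3 and 4 matter, here is a rough Python sketch (not Mesa code). It models the three-instruction sequence quoted above with a single scalar accumulator, then runs a toy dead code elimination pass over it. The `Inst` class, the `writes_accumulator` flag name, and the simplified MAC semantics (dst = acc + src0 * src1) are all assumptions made for the illustration. The point is only that the MUL has a null destination, so a pass that does not know about the implicit accumulator write would consider it dead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inst:
    op: str                           # "ADD", "MUL" or "MAC"
    dst: Optional[str]                # None models the null register
    srcs: tuple                       # register names ("-a" = negated) or floats
    writes_accumulator: bool = False  # hypothetical flag from item 3


def value(src, regs):
    """Resolve a source operand: float constant, register, or negated register."""
    if isinstance(src, float):
        return src
    return -regs[src[1:]] if src.startswith("-") else regs[src]


def run(program, regs):
    """Execute the toy program; MAC is modeled as acc + src0 * src1."""
    regs, acc = dict(regs), 0.0
    for inst in program:
        a, b = (value(s, regs) for s in inst.srcs)
        result = {"ADD": a + b, "MUL": a * b, "MAC": acc + a * b}[inst.op]
        if inst.writes_accumulator:
            acc = result
        if inst.dst is not None:
            regs[inst.dst] = result
    return regs


def dead_code_eliminate(program, live_out, accumulator_aware):
    """Backward pass that keeps only instructions whose results are read later."""
    live, acc_live, kept = set(live_out), False, []
    for inst in reversed(program):
        needed = inst.dst in live
        if accumulator_aware and inst.writes_accumulator and acc_live:
            needed = True             # the implicit write feeds a later MAC
        if not needed:
            continue
        kept.append(inst)
        live.discard(inst.dst)
        if inst.writes_accumulator:
            acc_live = False
        if inst.op == "MAC":
            acc_live = True           # MAC reads the accumulator
        live.update(s.lstrip("-") for s in inst.srcs if isinstance(s, str))
    return kept[::-1]


# The sequence from the quoted mail: dst = x * (1 - a) + y * a
lrp = [
    Inst("ADD", "one_minus_a", ("-a", 1.0)),
    Inst("MUL", None, ("y", "a"), writes_accumulator=True),
    Inst("MAC", "dst", ("x", "one_minus_a")),
]
inputs = {"x": 2.0, "y": 10.0, "a": 0.25}

naive = dead_code_eliminate(lrp, {"dst"}, accumulator_aware=False)
aware = dead_code_eliminate(lrp, {"dst"}, accumulator_aware=True)
```

In this model the accumulator-unaware pass drops the MUL, and the MAC then reads a stale accumulator, so the y * a term is lost. The accumulator-aware pass keeps all three instructions. The real passes in the backend would need the equivalent bookkeeping.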

Given the severity of this problem (GPU hangs and crashes) and the fact
that it's a regression in 10.1---which we plan to ship in three days---I
would like to commit my existing patches and improve this after the release.

--Ken


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev