Re: [Mesa-dev] [PATCH 6/7] nir: Teach nir_opt_algebraic about adding and subtracting the same thing

Eero Tamminen Thu, 17 Dec 2015 01:21:46 -0800

Hi,

On 12/17/2015 01:52 AM, Matt Turner wrote:

On Tue, Dec 15, 2015 at 1:16 AM, Eduardo Lima Mitev <el...@igalia.com> wrote:

On 12/15/2015 09:28 AM, Kristian Høgsberg Kristensen wrote:

This optimizes a + b - b to just a. Modest shader-db results (BDW):


   total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
   instructions in affected programs:     61938 -> 61348 (-0.95%)
   total loops in shared programs:        2131 -> 2131 (0.00%)
   helped:                                263
   HURT:                                  0
   GAINED:                                0
   LOST:                                  0
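
For illustration, the a + b - b rewrite mentioned above can be sketched as a small tree rewriter. This is a toy under stated assumptions, not the code in nir_opt_algebraic: the tuple encoding, the `simplify` helper and the use of integer opcodes (`iadd`, `isub`) are choices made for this example only.

```python
# Toy expression rewriter; NOT Mesa's actual nir_opt_algebraic implementation.
# Expressions are nested tuples: (opcode, operand, operand). Leaves are strings.

def simplify(expr):
    """Rewrite (a + b) - b to a, working bottom-up through the tree."""
    if not isinstance(expr, tuple):
        return expr  # leaf: variable name or constant
    op, *args = expr
    args = [simplify(arg) for arg in args]
    # Match ('isub', ('iadd', a, b), b) and the commuted ('isub', ('iadd', b, a), b).
    if op == 'isub' and isinstance(args[0], tuple) and args[0][0] == 'iadd':
        _, x, y = args[0]
        if y == args[1]:
            return x
        if x == args[1]:
            return y
    return (op, *args)


print(simplify(('imul', ('isub', ('iadd', 'a', 'b'), 'b'), 'c')))
```

Each match removes two ALU operations from the expression tree, which is the kind of saving the instruction counts above measure.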


In HSW, I get these shader-db results:

total instructions in shared programs: 6257265 -> 6256788 (-0.01%)
instructions in affected programs: 46601 -> 46124 (-1.02%)
helped: 218
HURT: 0

total cycles in shared programs: 56010026 -> 56007760 (-0.00%)
cycles in affected programs: 1048392 -> 1046126 (-0.22%)
helped: 199
HURT: 154

total loops in shared programs: 1979 -> 1979 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   0
GAINED: 0

I wonder where those cycle HURTs come from. In any case, the net result
is positive.


I haven't confirmed, but I've seen cases that seem like the cycle
counts are wrong.

I have doubts about the correctness of latency values set in brw_schedule_instructions.cpp.

They were added mostly by Eric in 2012 & 2013. You added mad & lrp data in 2013 and Curro untyped atomics & surface reads in 2013. Both of them have an is_haswell check, but don't say anything about newer generations.

It seems that some of the values are from the spec and some from tests. However, for the test data, the code doesn't say on what exact HW and stepping the tests were run. Nor does it say where the sources for those tests are, so that one could try to reproduce the results, verify (with perf counters) that they actually are bound by what the test says, and update the data gotten from them for newer generations (i.e. GEN8+).

In addition to this, Mesa is lacking at least stall cycles for 3src register bank conflicts.



        - Eero

PS. cycle values are anyway going to be off: the code doesn't know memory latencies, as those depend on locality & cache utilization, and it doesn't take threading into account. But it only tries to schedule things so that HW is able to better compensate for latency, so it doesn't need to know how many cycles things take, just have a good enough estimate. :-)
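
A minimal sketch of why a rough estimate can be enough, assuming a simplified in-order model that issues one instruction per cycle and stalls until sources are ready. The `estimated_cycles` helper, the instruction names and the latency numbers are all made up for this example; none of it is taken from brw_schedule_instructions.cpp.

```python
# Toy in-order issue model; instruction names and latencies are hypothetical.

def estimated_cycles(order, deps, latency):
    """Return the cycle at which the last result becomes available."""
    ready = {}  # instruction name -> cycle its result is available
    clock = 0
    for name in order:
        # Stall until every source operand has been produced.
        start = max([clock] + [ready[d] for d in deps.get(name, [])])
        ready[name] = start + latency[name]
        clock = start + 1  # one instruction issued per cycle
    return max(ready.values())


deps = {'use': ['load']}                # 'use' consumes the result of 'load'
naive = ['load', 'use', 'alu1', 'alu2']
hoisted = ['load', 'alu1', 'alu2', 'use']

for load_latency in (10, 20):           # two different guesses for the load
    lat = {'load': load_latency, 'use': 2, 'alu1': 2, 'alu2': 2}
    print(load_latency,
          estimated_cycles(naive, deps, lat),
          estimated_cycles(hoisted, deps, lat))
```

In this toy model the order that moves independent work between the load and its use wins for both latency guesses, so the scheduling decision is the same even though the absolute cycle totals differ.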


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
