Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-22 Thread Iago Toral
On Wed, 2015-09-16 at 12:32 -0700, Kenneth Graunke wrote:
> On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote:
> > It seems that we have some bugs where we fail to compile shaders in gen6
> > because we do not having enough MRF registers available (see bugs 86469 and
> > 90631 for example). That triggered some discussion about the fact that SNB
> > might actually have 24 MRF registers available, but since the docs where not
> > very clear about this, it was suggested that it would be nice to try and
> > experiment if that was the case.
> > 
> > These series of patches implement such test, basically they turn our fixed
> > BRW_MAX_MRF into a macro that accepts the hardware generation and then 
> > changes
> > the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
> > gen6 (something similar can be done for the vec4 code, I just did not do it
> > yet).
> > 
> > The good news is that this seems to work fine, at least I can do a full 
> > piglit
> > run without issues in SNB.
> 
> Sweet!
> 
> > In fact, this seems to help a lot of tests when I
> > force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):
> > 
> > Using MRF registers 13-15 for spilling:
> > crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3
> > 
> > Using MRF registers 21-23 for spilling:
> > crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3
> > 
> > As you can see, we drop the fail ratio to almost 50%...
> 
> That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually
> make a difference.  Perhaps it's papering over a bug where we're failing
> to notice that MRFs are in use?  If so, we should probably fix that (in
> addition to making this change).

(...)

I've looking into this. The problem was happening because our UBO loads
(via sampler messages) use MRF14 in gen6, so as soon as we had one of
these, get_used_mrfs in brw_fs_reg_allocate.cpp would detect that MRF14
is used and fail building any shader that needs spilling. This is quite
extreme, since just because MRF14 is used by something else tt does not
mean that there is an actual conflict, in fact, the test I was playing
with (fs-bools.shader_test) was a case of a false positive.

I suppose that a better way to do this would be to check if any write to
MRF14 prior to the current instruction needing spilling has been already
consumed by a matching SEND instruction also happening before the
instruction we need to spill for. However, now that we moved spilling to
a different MRF range this can't happen any more, so I wonder if we want
to implement this.

One more thing that this brought to my attention is that we probably
want to move UBO loads in gen6 to MRFs > 15 (and < 21, where we do our
spills). This is because our URB writes in gen6 can now happen in MRF
1..15. The likelihood of a conflict, I guess, is very small anyway,
specially because URB writes are bound to be the last thing to happen,
but maybe there is a rare case where this could bite us so I think we
are probably better off moving our UBO loads out of the conflict zone.

BTW, apparently, this does not happen in gen7+ because that uses GRF
registers for SEND instructions instead of MRFs and this code checks for
MRF destinations.

Iago

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-17 Thread Iago Toral
On Wed, 2015-09-16 at 12:32 -0700, Kenneth Graunke wrote:
> On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote:
> > It seems that we have some bugs where we fail to compile shaders in gen6
> > because we do not having enough MRF registers available (see bugs 86469 and
> > 90631 for example). That triggered some discussion about the fact that SNB
> > might actually have 24 MRF registers available, but since the docs where not
> > very clear about this, it was suggested that it would be nice to try and
> > experiment if that was the case.
> > 
> > These series of patches implement such test, basically they turn our fixed
> > BRW_MAX_MRF into a macro that accepts the hardware generation and then 
> > changes
> > the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
> > gen6 (something similar can be done for the vec4 code, I just did not do it
> > yet).
> > 
> > The good news is that this seems to work fine, at least I can do a full 
> > piglit
> > run without issues in SNB.
> 
> Sweet!
> 
> > In fact, this seems to help a lot of tests when I
> > force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):
> > 
> > Using MRF registers 13-15 for spilling:
> > crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3
> > 
> > Using MRF registers 21-23 for spilling:
> > crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3
> > 
> > As you can see, we drop the fail ratio to almost 50%...
> 
> That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually
> make a difference.  Perhaps it's papering over a bug where we're failing
> to notice that MRFs are in use?  If so, we should probably fix that (in
> addition to making this change).

It could be, I will have a look at one of the affected tests and try to
understand what is going on when we hit that case.

> > The bad news is that, currently, we assert that MRF registers are within the
> > supported range in brw_reg.h. This works fine now because the limit does not
> > depend on the hardware generation, but these patches change that of course.
> > The natural way to fix this would be to pass a generation argument to
> > all brw_reg functions that can create a brw_reg, but I imagine that we don't
> > want to do that only for this, right?
> 
> Yeah...it does seem a bit funny to add a generation parameter to brw_reg
> functions just for an assert that the register number is in range.
> 
> What about adding the asserts in brw_set_src0 and brw_set_dest?  This
> would catch BLORP and the Gen4 clip/sf/gs code that emits assembly
> directly - it would catch everything.  But, unfortunately, at the last
> minute...when it might be harder to debug.  So, I do like adding the
> assertions to the generators as well.

Yeah, adding them to brw_eu_emit.c looks like the best choice, I'll do
that and also add asserts to the generator.

> > In that case, if we want to keep the
> > asserts (I think we do) we need a way around that limitatation. The first
> > patch in this series tries to move the asserts to the generator, but that 
> > won't
> > manage things like blorp and other modules that can emit code directly, so 
> > we
> > would lose the assert checks for those. Of course we could add individual
> > asserts for these as needed, but it is not ideal. Alternatively, we could 
> > add
> > a function wrapper to brw_message_reg that has the assert and use that
> > version of the function from these places. In that case, this wrapper might 
> > not
> > need to take in the generation number as parameter and could just check
> > with 16 as the limit, since we really only use MRF registers
> > beyond 16 for spilling, and we only handle spilling in code paths that end
> > up going through the generator.
> > 
> > Or maybe we think this is just not worth it if it only helps gen6...
> 
> I'd like to do it.

Great, I'll work on the patches and send them for review. Thanks for the
feedback!

Iago

> > 
> > what do you think? 
> > 
> > Iago Toral Quiroga (3):
> >   i965: Move MRF register asserts to the generator
> >   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
> >   i965/fs: Use MRF registers 21-23 for spilling on gen6
> > 
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c|  2 +-
> >  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++
> >  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> > --
> >  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
> >  src/mesa/drivers/dri/i965/brw_reg.h|  5 +---
> >  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  9 +---
> >  8 files changed, 37 insertions(+), 30 deletions(-)
> > 
> > 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-16 Thread Kenneth Graunke
On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote:
> It seems that we have some bugs where we fail to compile shaders in gen6
> because we do not having enough MRF registers available (see bugs 86469 and
> 90631 for example). That triggered some discussion about the fact that SNB
> might actually have 24 MRF registers available, but since the docs where not
> very clear about this, it was suggested that it would be nice to try and
> experiment if that was the case.
> 
> These series of patches implement such test, basically they turn our fixed
> BRW_MAX_MRF into a macro that accepts the hardware generation and then changes
> the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
> gen6 (something similar can be done for the vec4 code, I just did not do it
> yet).
> 
> The good news is that this seems to work fine, at least I can do a full piglit
> run without issues in SNB.

Sweet!

> In fact, this seems to help a lot of tests when I
> force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):
> 
> Using MRF registers 13-15 for spilling:
> crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3
> 
> Using MRF registers 21-23 for spilling:
> crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3
> 
> As you can see, we drop the fail ratio to almost 50%...

That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually
make a difference.  Perhaps it's papering over a bug where we're failing
to notice that MRFs are in use?  If so, we should probably fix that (in
addition to making this change).

> The bad news is that, currently, we assert that MRF registers are within the
> supported range in brw_reg.h. This works fine now because the limit does not
> depend on the hardware generation, but these patches change that of course.
> The natural way to fix this would be to pass a generation argument to
> all brw_reg functions that can create a brw_reg, but I imagine that we don't
> want to do that only for this, right?

Yeah...it does seem a bit funny to add a generation parameter to brw_reg
functions just for an assert that the register number is in range.

What about adding the asserts in brw_set_src0 and brw_set_dest?  This
would catch BLORP and the Gen4 clip/sf/gs code that emits assembly
directly - it would catch everything.  But, unfortunately, at the last
minute...when it might be harder to debug.  So, I do like adding the
assertions to the generators as well.

> In that case, if we want to keep the
> asserts (I think we do) we need a way around that limitatation. The first
> patch in this series tries to move the asserts to the generator, but that 
> won't
> manage things like blorp and other modules that can emit code directly, so we
> would lose the assert checks for those. Of course we could add individual
> asserts for these as needed, but it is not ideal. Alternatively, we could add
> a function wrapper to brw_message_reg that has the assert and use that
> version of the function from these places. In that case, this wrapper might 
> not
> need to take in the generation number as parameter and could just check
> with 16 as the limit, since we really only use MRF registers
> beyond 16 for spilling, and we only handle spilling in code paths that end
> up going through the generator.
> 
> Or maybe we think this is just not worth it if it only helps gen6...

I'd like to do it.

> 
> what do you think? 
> 
> Iago Toral Quiroga (3):
>   i965: Move MRF register asserts to the generator
>   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
>   i965/fs: Use MRF registers 21-23 for spilling on gen6
> 
>  src/mesa/drivers/dri/i965/brw_eu_emit.c|  2 +-
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> --
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_reg.h|  5 +---
>  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  9 +---
>  8 files changed, 37 insertions(+), 30 deletions(-)
> 
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-16 Thread Iago Toral Quiroga
It seems that we have some bugs where we fail to compile shaders in gen6
because we do not having enough MRF registers available (see bugs 86469 and
90631 for example). That triggered some discussion about the fact that SNB
might actually have 24 MRF registers available, but since the docs where not
very clear about this, it was suggested that it would be nice to try and
experiment if that was the case.

These series of patches implement such test, basically they turn our fixed
BRW_MAX_MRF into a macro that accepts the hardware generation and then changes
the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
gen6 (something similar can be done for the vec4 code, I just did not do it
yet).

The good news is that this seems to work fine, at least I can do a full piglit
run without issues in SNB. In fact, this seems to help a lot of tests when I
force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):

Using MRF registers 13-15 for spilling:
crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3

Using MRF registers 21-23 for spilling:
crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3

As you can see, we drop the fail ratio to almost 50%...

The bad news is that, currently, we assert that MRF registers are within the
supported range in brw_reg.h. This works fine now because the limit does not
depend on the hardware generation, but these patches change that of course.
The natural way to fix this would be to pass a generation argument to
all brw_reg functions that can create a brw_reg, but I imagine that we don't
want to do that only for this, right? In that case, if we want to keep the
asserts (I think we do) we need a way around that limitatation. The first
patch in this series tries to move the asserts to the generator, but that won't
manage things like blorp and other modules that can emit code directly, so we
would lose the assert checks for those. Of course we could add individual
asserts for these as needed, but it is not ideal. Alternatively, we could add
a function wrapper to brw_message_reg that has the assert and use that
version of the function from these places. In that case, this wrapper might not
need to take in the generation number as parameter and could just check
with 16 as the limit, since we really only use MRF registers
beyond 16 for spilling, and we only handle spilling in code paths that end
up going through the generator.

Or maybe we think this is just not worth it if it only helps gen6...

what do you think? 

Iago Toral Quiroga (3):
  i965: Move MRF register asserts to the generator
  i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
  i965/fs: Use MRF registers 21-23 for spilling on gen6

 src/mesa/drivers/dri/i965/brw_eu_emit.c|  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 --
 src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
 src/mesa/drivers/dri/i965/brw_reg.h|  5 +---
 .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  9 +---
 8 files changed, 37 insertions(+), 30 deletions(-)

-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev