Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6
On Wed, 2015-09-16 at 12:32 -0700, Kenneth Graunke wrote: > On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote: > > It seems that we have some bugs where we fail to compile shaders in gen6 > > because we do not having enough MRF registers available (see bugs 86469 and > > 90631 for example). That triggered some discussion about the fact that SNB > > might actually have 24 MRF registers available, but since the docs where not > > very clear about this, it was suggested that it would be nice to try and > > experiment if that was the case. > > > > These series of patches implement such test, basically they turn our fixed > > BRW_MAX_MRF into a macro that accepts the hardware generation and then > > changes > > the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for > > gen6 (something similar can be done for the vec4 code, I just did not do it > > yet). > > > > The good news is that this seems to work fine, at least I can do a full > > piglit > > run without issues in SNB. > > Sweet! > > > In fact, this seems to help a lot of tests when I > > force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs): > > > > Using MRF registers 13-15 for spilling: > > crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3 > > > > Using MRF registers 21-23 for spilling: > > crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3 > > > > As you can see, we drop the fail ratio to almost 50%... > > That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually > make a difference. Perhaps it's papering over a bug where we're failing > to notice that MRFs are in use? If so, we should probably fix that (in > addition to making this change). (...) I've looking into this. The problem was happening because our UBO loads (via sampler messages) use MRF14 in gen6, so as soon as we had one of these, get_used_mrfs in brw_fs_reg_allocate.cpp would detect that MRF14 is used and fail building any shader that needs spilling. This is quite extreme, since just because MRF14 is used by something else tt does not mean that there is an actual conflict, in fact, the test I was playing with (fs-bools.shader_test) was a case of a false positive. I suppose that a better way to do this would be to check if any write to MRF14 prior to the current instruction needing spilling has been already consumed by a matching SEND instruction also happening before the instruction we need to spill for. However, now that we moved spilling to a different MRF range this can't happen any more, so I wonder if we want to implement this. One more thing that this brought to my attention is that we probably want to move UBO loads in gen6 to MRFs > 15 (and < 21, where we do our spills). This is because our URB writes in gen6 can now happen in MRF 1..15. The likelihood of a conflict, I guess, is very small anyway, specially because URB writes are bound to be the last thing to happen, but maybe there is a rare case where this could bite us so I think we are probably better off moving our UBO loads out of the conflict zone. BTW, apparently, this does not happen in gen7+ because that uses GRF registers for SEND instructions instead of MRFs and this code checks for MRF destinations. Iago ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6
On Wed, 2015-09-16 at 12:32 -0700, Kenneth Graunke wrote: > On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote: > > It seems that we have some bugs where we fail to compile shaders in gen6 > > because we do not having enough MRF registers available (see bugs 86469 and > > 90631 for example). That triggered some discussion about the fact that SNB > > might actually have 24 MRF registers available, but since the docs where not > > very clear about this, it was suggested that it would be nice to try and > > experiment if that was the case. > > > > These series of patches implement such test, basically they turn our fixed > > BRW_MAX_MRF into a macro that accepts the hardware generation and then > > changes > > the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for > > gen6 (something similar can be done for the vec4 code, I just did not do it > > yet). > > > > The good news is that this seems to work fine, at least I can do a full > > piglit > > run without issues in SNB. > > Sweet! > > > In fact, this seems to help a lot of tests when I > > force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs): > > > > Using MRF registers 13-15 for spilling: > > crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3 > > > > Using MRF registers 21-23 for spilling: > > crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3 > > > > As you can see, we drop the fail ratio to almost 50%... > > That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually > make a difference. Perhaps it's papering over a bug where we're failing > to notice that MRFs are in use? If so, we should probably fix that (in > addition to making this change). It could be, I will have a look at one of the affected tests and try to understand what is going on when we hit that case. > > The bad news is that, currently, we assert that MRF registers are within the > > supported range in brw_reg.h. This works fine now because the limit does not > > depend on the hardware generation, but these patches change that of course. > > The natural way to fix this would be to pass a generation argument to > > all brw_reg functions that can create a brw_reg, but I imagine that we don't > > want to do that only for this, right? > > Yeah...it does seem a bit funny to add a generation parameter to brw_reg > functions just for an assert that the register number is in range. > > What about adding the asserts in brw_set_src0 and brw_set_dest? This > would catch BLORP and the Gen4 clip/sf/gs code that emits assembly > directly - it would catch everything. But, unfortunately, at the last > minute...when it might be harder to debug. So, I do like adding the > assertions to the generators as well. Yeah, adding them to brw_eu_emit.c looks like the best choice, I'll do that and also add asserts to the generator. > > In that case, if we want to keep the > > asserts (I think we do) we need a way around that limitatation. The first > > patch in this series tries to move the asserts to the generator, but that > > won't > > manage things like blorp and other modules that can emit code directly, so > > we > > would lose the assert checks for those. Of course we could add individual > > asserts for these as needed, but it is not ideal. Alternatively, we could > > add > > a function wrapper to brw_message_reg that has the assert and use that > > version of the function from these places. In that case, this wrapper might > > not > > need to take in the generation number as parameter and could just check > > with 16 as the limit, since we really only use MRF registers > > beyond 16 for spilling, and we only handle spilling in code paths that end > > up going through the generator. > > > > Or maybe we think this is just not worth it if it only helps gen6... > > I'd like to do it. Great, I'll work on the patches and send them for review. Thanks for the feedback! Iago > > > > what do you think? > > > > Iago Toral Quiroga (3): > > i965: Move MRF register asserts to the generator > > i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation > > i965/fs: Use MRF registers 21-23 for spilling on gen6 > > > > src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 +- > > src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++ > > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 27 > > -- > > src/mesa/drivers/dri/i965/brw_ir_vec4.h| 2 +- > > src/mesa/drivers/dri/i965/brw_reg.h| 5 +--- > > .../drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++-- > > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +--- > > 8 files changed, 37 insertions(+), 30 deletions(-) > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6
On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote: > It seems that we have some bugs where we fail to compile shaders in gen6 > because we do not having enough MRF registers available (see bugs 86469 and > 90631 for example). That triggered some discussion about the fact that SNB > might actually have 24 MRF registers available, but since the docs where not > very clear about this, it was suggested that it would be nice to try and > experiment if that was the case. > > These series of patches implement such test, basically they turn our fixed > BRW_MAX_MRF into a macro that accepts the hardware generation and then changes > the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for > gen6 (something similar can be done for the vec4 code, I just did not do it > yet). > > The good news is that this seems to work fine, at least I can do a full piglit > run without issues in SNB. Sweet! > In fact, this seems to help a lot of tests when I > force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs): > > Using MRF registers 13-15 for spilling: > crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3 > > Using MRF registers 21-23 for spilling: > crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3 > > As you can see, we drop the fail ratio to almost 50%... That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually make a difference. Perhaps it's papering over a bug where we're failing to notice that MRFs are in use? If so, we should probably fix that (in addition to making this change). > The bad news is that, currently, we assert that MRF registers are within the > supported range in brw_reg.h. This works fine now because the limit does not > depend on the hardware generation, but these patches change that of course. > The natural way to fix this would be to pass a generation argument to > all brw_reg functions that can create a brw_reg, but I imagine that we don't > want to do that only for this, right? Yeah...it does seem a bit funny to add a generation parameter to brw_reg functions just for an assert that the register number is in range. What about adding the asserts in brw_set_src0 and brw_set_dest? This would catch BLORP and the Gen4 clip/sf/gs code that emits assembly directly - it would catch everything. But, unfortunately, at the last minute...when it might be harder to debug. So, I do like adding the assertions to the generators as well. > In that case, if we want to keep the > asserts (I think we do) we need a way around that limitatation. The first > patch in this series tries to move the asserts to the generator, but that > won't > manage things like blorp and other modules that can emit code directly, so we > would lose the assert checks for those. Of course we could add individual > asserts for these as needed, but it is not ideal. Alternatively, we could add > a function wrapper to brw_message_reg that has the assert and use that > version of the function from these places. In that case, this wrapper might > not > need to take in the generation number as parameter and could just check > with 16 as the limit, since we really only use MRF registers > beyond 16 for spilling, and we only handle spilling in code paths that end > up going through the generator. > > Or maybe we think this is just not worth it if it only helps gen6... I'd like to do it. > > what do you think? > > Iago Toral Quiroga (3): > i965: Move MRF register asserts to the generator > i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation > i965/fs: Use MRF registers 21-23 for spilling on gen6 > > src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 +- > src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++ > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 27 > -- > src/mesa/drivers/dri/i965/brw_ir_vec4.h| 2 +- > src/mesa/drivers/dri/i965/brw_reg.h| 5 +--- > .../drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++-- > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +--- > 8 files changed, 37 insertions(+), 30 deletions(-) > > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6
It seems that we have some bugs where we fail to compile shaders in gen6 because we do not having enough MRF registers available (see bugs 86469 and 90631 for example). That triggered some discussion about the fact that SNB might actually have 24 MRF registers available, but since the docs where not very clear about this, it was suggested that it would be nice to try and experiment if that was the case. These series of patches implement such test, basically they turn our fixed BRW_MAX_MRF into a macro that accepts the hardware generation and then changes the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for gen6 (something similar can be done for the vec4 code, I just did not do it yet). The good news is that this seems to work fine, at least I can do a full piglit run without issues in SNB. In fact, this seems to help a lot of tests when I force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs): Using MRF registers 13-15 for spilling: crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3 Using MRF registers 21-23 for spilling: crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3 As you can see, we drop the fail ratio to almost 50%... The bad news is that, currently, we assert that MRF registers are within the supported range in brw_reg.h. This works fine now because the limit does not depend on the hardware generation, but these patches change that of course. The natural way to fix this would be to pass a generation argument to all brw_reg functions that can create a brw_reg, but I imagine that we don't want to do that only for this, right? In that case, if we want to keep the asserts (I think we do) we need a way around that limitatation. The first patch in this series tries to move the asserts to the generator, but that won't manage things like blorp and other modules that can emit code directly, so we would lose the assert checks for those. Of course we could add individual asserts for these as needed, but it is not ideal. Alternatively, we could add a function wrapper to brw_message_reg that has the assert and use that version of the function from these places. In that case, this wrapper might not need to take in the generation number as parameter and could just check with 16 as the limit, since we really only use MRF registers beyond 16 for spilling, and we only handle spilling in code paths that end up going through the generator. Or maybe we think this is just not worth it if it only helps gen6... what do you think? Iago Toral Quiroga (3): i965: Move MRF register asserts to the generator i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation i965/fs: Use MRF registers 21-23 for spilling on gen6 src/mesa/drivers/dri/i965/brw_eu_emit.c| 2 +- src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++ src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 27 -- src/mesa/drivers/dri/i965/brw_ir_vec4.h| 2 +- src/mesa/drivers/dri/i965/brw_reg.h| 5 +--- .../drivers/dri/i965/brw_schedule_instructions.cpp | 4 ++-- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +--- 8 files changed, 37 insertions(+), 30 deletions(-) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev