Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
On Wed, 2015-12-09 at 23:51 -0800, Jason Ekstrand wrote: > > On Dec 9, 2015 11:47 PM, "Iago Toral"wrote: > > > > On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote: > > > > > > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" > > > wrote: > > > > > > > > This code in brw_set_dest adjusts the execution size of any > > > instruction > > > > with a dst.width < 8. However, we don't want to do this with > > > instructions > > > > operating on doubles, since these will have a width of 4, but > still > > > > need an execution size of 8 (for SIMD8). Unfortunately, we can't > > > just check > > > > the size of the operands involved to detect if we are doing an > > > operation on > > > > doubles, because we can have instructions that do operations on > > > double > > > > operands interpreted as UD, operating on any of its 2 32-bit > > > components. > > > > > > > > Previous commits have made it so we never emit instructions with > a > > > horizontal > > > > width of 4 that don't have the correct execution size set for > > > gen7/gen8, so > > > > we can skip it in this case, avoiding the conflicts with fp64 > > > requirements. > > > > > > > > Expanding the same fix to other hardware generations requires > many > > > more > > > > changes but since we are not targetting fp64 support on them > > > > wer don't really care for now. > > > > --- > > > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +- > > > > 1 file changed, 13 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > > index 78f2c8c..50a8771 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, > brw_inst > > > *inst, struct brw_reg dest) > > > > /* Generators should set a default exec_size of either 8 > > > (SIMD4x2 or SIMD8) > > > > * or 16 (SIMD16), as that's normally correct. However, > when > > > dealing with > > > > * small registers, we automatically reduce it to match the > > > register size. > > > > +* > > > > +* In platforms that support fp64 we can emit instructions > with > > > a width of > > > > +* 4 that need two SIMD8 registers and an exec_size of 8 or > 16. > > > In these > > > > +* cases we need to make sure that these instructions have > their > > > exec sizes > > > > +* set properly when they are emitted and we can't rely on > this > > > code to fix > > > > +* it. > > > > */ > > > > - if (dest.width < BRW_EXECUTE_8) > > > > + bool fix_exec_size; > > > > + if (devinfo->gen == 7 || devinfo->gen == 8) > > > > > > If we're doing to take this approach, we definitely want to make > it > > > gen > 6 or something so we include future gens. Really gen > 4 is > > > probably doable since the only real problem is the legacy clipping > > > code. > > > > Strips and fans is also a problem, but it is certainly doable if we > want > > to do it. > > Yeah, my primary point is that we should make it as little of an > edge-case as possible. We could go back to at least gen6 and we > should go forward. That said, it'll take a little testing from the > Intel side. Sure, this sounds sensible to me. I'll scan the code for gen9 paths that work with a horizontal width of 4 and include patches to cover gen6 as well. When we have that I'll send the series for testing. Thanks Jason! Iago > > Iago > > > > > > > > + fix_exec_size = dest.width < BRW_EXECUTE_4; > > > > + else > > > > + fix_exec_size = dest.width < BRW_EXECUTE_8; > > > > + > > > > + if (fix_exec_size) > > > >brw_inst_set_exec_size(devinfo, inst, dest.width); > > > > } > > > > > > > > -- > > > > 2.1.4 > > > > > > > > ___ > > > > mesa-dev mailing list > > > > mesa-dev@lists.freedesktop.org > > > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > > > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
This code in brw_set_dest adjusts the execution size of any instruction with a dst.width < 8. However, we don't want to do this with instructions operating on doubles, since these will have a width of 4, but still need an execution size of 8 (for SIMD8). Unfortunately, we can't just check the size of the operands involved to detect if we are doing an operation on doubles, because we can have instructions that do operations on double operands interpreted as UD, operating on any of its 2 32-bit components. Previous commits have made it so we never emit instructions with a horizontal width of 4 that don't have the correct execution size set for gen7/gen8, so we can skip it in this case, avoiding the conflicts with fp64 requirements. Expanding the same fix to other hardware generations requires many more changes but since we are not targetting fp64 support on them wer don't really care for now. --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 78f2c8c..50a8771 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest) /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8) * or 16 (SIMD16), as that's normally correct. However, when dealing with * small registers, we automatically reduce it to match the register size. +* +* In platforms that support fp64 we can emit instructions with a width of +* 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these +* cases we need to make sure that these instructions have their exec sizes +* set properly when they are emitted and we can't rely on this code to fix +* it. */ - if (dest.width < BRW_EXECUTE_8) + bool fix_exec_size; + if (devinfo->gen == 7 || devinfo->gen == 8) + fix_exec_size = dest.width < BRW_EXECUTE_4; + else + fix_exec_size = dest.width < BRW_EXECUTE_8; + + if (fix_exec_size) brw_inst_set_exec_size(devinfo, inst, dest.width); } -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga"wrote: > > This code in brw_set_dest adjusts the execution size of any instruction > with a dst.width < 8. However, we don't want to do this with instructions > operating on doubles, since these will have a width of 4, but still > need an execution size of 8 (for SIMD8). Unfortunately, we can't just check > the size of the operands involved to detect if we are doing an operation on > doubles, because we can have instructions that do operations on double > operands interpreted as UD, operating on any of its 2 32-bit components. > > Previous commits have made it so we never emit instructions with a horizontal > width of 4 that don't have the correct execution size set for gen7/gen8, so > we can skip it in this case, avoiding the conflicts with fp64 requirements. > > Expanding the same fix to other hardware generations requires many more > changes but since we are not targetting fp64 support on them > wer don't really care for now. > --- > src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index 78f2c8c..50a8771 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest) > /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8) > * or 16 (SIMD16), as that's normally correct. However, when dealing with > * small registers, we automatically reduce it to match the register size. > +* > +* In platforms that support fp64 we can emit instructions with a width of > +* 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these > +* cases we need to make sure that these instructions have their exec sizes > +* set properly when they are emitted and we can't rely on this code to fix > +* it. > */ > - if (dest.width < BRW_EXECUTE_8) > + bool fix_exec_size; > + if (devinfo->gen == 7 || devinfo->gen == 8) If we're doing to take this approach, we definitely want to make it gen > 6 or something so we include future gens. Really gen > 4 is probably doable since the only real problem is the legacy clipping code. > + fix_exec_size = dest.width < BRW_EXECUTE_4; > + else > + fix_exec_size = dest.width < BRW_EXECUTE_8; > + > + if (fix_exec_size) >brw_inst_set_exec_size(devinfo, inst, dest.width); > } > > -- > 2.1.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote: > > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga"> wrote: > > > > This code in brw_set_dest adjusts the execution size of any > instruction > > with a dst.width < 8. However, we don't want to do this with > instructions > > operating on doubles, since these will have a width of 4, but still > > need an execution size of 8 (for SIMD8). Unfortunately, we can't > just check > > the size of the operands involved to detect if we are doing an > operation on > > doubles, because we can have instructions that do operations on > double > > operands interpreted as UD, operating on any of its 2 32-bit > components. > > > > Previous commits have made it so we never emit instructions with a > horizontal > > width of 4 that don't have the correct execution size set for > gen7/gen8, so > > we can skip it in this case, avoiding the conflicts with fp64 > requirements. > > > > Expanding the same fix to other hardware generations requires many > more > > changes but since we are not targetting fp64 support on them > > wer don't really care for now. > > --- > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +- > > 1 file changed, 13 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > index 78f2c8c..50a8771 100644 > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst > *inst, struct brw_reg dest) > > /* Generators should set a default exec_size of either 8 > (SIMD4x2 or SIMD8) > > * or 16 (SIMD16), as that's normally correct. However, when > dealing with > > * small registers, we automatically reduce it to match the > register size. > > +* > > +* In platforms that support fp64 we can emit instructions with > a width of > > +* 4 that need two SIMD8 registers and an exec_size of 8 or 16. > In these > > +* cases we need to make sure that these instructions have their > exec sizes > > +* set properly when they are emitted and we can't rely on this > code to fix > > +* it. > > */ > > - if (dest.width < BRW_EXECUTE_8) > > + bool fix_exec_size; > > + if (devinfo->gen == 7 || devinfo->gen == 8) > > If we're doing to take this approach, we definitely want to make it > gen > 6 or something so we include future gens. Really gen > 4 is > probably doable since the only real problem is the legacy clipping > code. Strips and fans is also a problem, but it is certainly doable if we want to do it. Iago > > + fix_exec_size = dest.width < BRW_EXECUTE_4; > > + else > > + fix_exec_size = dest.width < BRW_EXECUTE_8; > > + > > + if (fix_exec_size) > >brw_inst_set_exec_size(devinfo, inst, dest.width); > > } > > > > -- > > 2.1.4 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4
On Dec 9, 2015 11:47 PM, "Iago Toral"wrote: > > On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote: > > > > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" > > wrote: > > > > > > This code in brw_set_dest adjusts the execution size of any > > instruction > > > with a dst.width < 8. However, we don't want to do this with > > instructions > > > operating on doubles, since these will have a width of 4, but still > > > need an execution size of 8 (for SIMD8). Unfortunately, we can't > > just check > > > the size of the operands involved to detect if we are doing an > > operation on > > > doubles, because we can have instructions that do operations on > > double > > > operands interpreted as UD, operating on any of its 2 32-bit > > components. > > > > > > Previous commits have made it so we never emit instructions with a > > horizontal > > > width of 4 that don't have the correct execution size set for > > gen7/gen8, so > > > we can skip it in this case, avoiding the conflicts with fp64 > > requirements. > > > > > > Expanding the same fix to other hardware generations requires many > > more > > > changes but since we are not targetting fp64 support on them > > > wer don't really care for now. > > > --- > > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +- > > > 1 file changed, 13 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > index 78f2c8c..50a8771 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst > > *inst, struct brw_reg dest) > > > /* Generators should set a default exec_size of either 8 > > (SIMD4x2 or SIMD8) > > > * or 16 (SIMD16), as that's normally correct. However, when > > dealing with > > > * small registers, we automatically reduce it to match the > > register size. > > > +* > > > +* In platforms that support fp64 we can emit instructions with > > a width of > > > +* 4 that need two SIMD8 registers and an exec_size of 8 or 16. > > In these > > > +* cases we need to make sure that these instructions have their > > exec sizes > > > +* set properly when they are emitted and we can't rely on this > > code to fix > > > +* it. > > > */ > > > - if (dest.width < BRW_EXECUTE_8) > > > + bool fix_exec_size; > > > + if (devinfo->gen == 7 || devinfo->gen == 8) > > > > If we're doing to take this approach, we definitely want to make it > > gen > 6 or something so we include future gens. Really gen > 4 is > > probably doable since the only real problem is the legacy clipping > > code. > > Strips and fans is also a problem, but it is certainly doable if we want > to do it. Yeah, my primary point is that we should make it as little of an edge-case as possible. We could go back to at least gen6 and we should go forward. That said, it'll take a little testing from the Intel side. > Iago > > > > > + fix_exec_size = dest.width < BRW_EXECUTE_4; > > > + else > > > + fix_exec_size = dest.width < BRW_EXECUTE_8; > > > + > > > + if (fix_exec_size) > > >brw_inst_set_exec_size(devinfo, inst, dest.width); > > > } > > > > > > -- > > > 2.1.4 > > > > > > ___ > > > mesa-dev mailing list > > > mesa-dev@lists.freedesktop.org > > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev