On 09.09.2018 at 21:19, Axel Davy wrote:
> Tests done on several devices of all 3 vendors and
> of different generations showed that there are several
> ways of handling infs and NaN for d3d9.
>
> Tests showed Intel on Windows always clamps
> RCP, RSQ and LOG (thus preventing inf/NaN generation),
> for all shader versions (some vendor behaviours vary
> with shader versions).
> Doing this in nine avoids 0*inf issues for drivers
> that can't generate 0*inf=0 (which is controlled by
> TGSI's MUL_ZERO_WINS).
>
> For now clamp for all drivers. A later optimization
> would be to avoid clamping for drivers with MUL_ZERO_WINS
> for the specific shader versions where NV or AMD don't
> clamp.
>
> LOG and RSQ being already clamped, this patch only
> clamps RCP.
>
> Fixes:
> https://github.com/iXit/Mesa-3D/issues/316
>
> Signed-off-by: Axel Davy <davyax...@gmail.com>
> CC: <mesa-sta...@lists.freedesktop.org>
> ---
>  src/gallium/state_trackers/nine/nine_shader.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c
> index 7db07d8f69..5b8ad3f161 100644
> --- a/src/gallium/state_trackers/nine/nine_shader.c
> +++ b/src/gallium/state_trackers/nine/nine_shader.c
> @@ -2273,6 +2273,18 @@ DECL_SPECIAL(POW)
>      return D3D_OK;
>  }
>
> +DECL_SPECIAL(RCP)
> +{
> +    struct ureg_program *ureg = tx->ureg;
> +    struct ureg_dst dst = tx_dst_param(tx, &tx->insn.dst[0]);
> +    struct ureg_src src = tx_src_param(tx, &tx->insn.src[0]);
> +    struct ureg_dst tmp = tx_scratch(tx);
> +    ureg_RCP(ureg, tmp, src);
> +    ureg_MIN(ureg, tmp, ureg_imm1f(ureg, FLT_MAX), ureg_src(tmp));
> +    ureg_MAX(ureg, dst, ureg_imm1f(ureg, -FLT_MAX), ureg_src(tmp));

I'm not sure what the ureg_MAX is supposed to do. The MIN already gets
rid of all NaNs (iff the driver follows the d3d10-mandated behavior of
picking the non-NaN operand for min/max when one of the values is a NaN;
if it doesn't, doing both MIN and MAX isn't going to help either).

Roland

> +    return D3D_OK;
> +}
> +
>  DECL_SPECIAL(RSQ)
>  {
>      struct ureg_program *ureg = tx->ureg;
> @@ -2909,7 +2921,7 @@ static const struct sm1_op_info inst_table[] =
>      _OPI(SUB, NOP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, SPECIAL(SUB)), /* 3 */
>      _OPI(MAD, MAD, V(0,0), V(3,0), V(0,0), V(3,0), 1, 3, NULL), /* 4 */
>      _OPI(MUL, MUL, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 5 */
> -    _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, NULL), /* 6 */
> +    _OPI(RCP, RCP, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RCP)), /* 6 */
>      _OPI(RSQ, RSQ, V(0,0), V(3,0), V(0,0), V(3,0), 1, 1, SPECIAL(RSQ)), /* 7 */
>      _OPI(DP3, DP3, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 8 */
>      _OPI(DP4, DP4, V(0,0), V(3,0), V(0,0), V(3,0), 1, 2, NULL), /* 9 */
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev