On 01.02.2013 19:44, Christoph Bumiller wrote:
> On 01.02.2013 19:29, Brian Paul wrote:
>> The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt()
>> and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED
>> query says it's supported by the driver.
>>
>> Otherwise, sqrt(x) is implemented with x*rsq(x). The problem with
>> this is sqrt(0) must be handled specially because rsq(0) might be
>> Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined). In the
> That's why we do rcp(rsq(x)), that works correctly.

Yeah, though some drivers don't have a good rcp implementation. In
llvmpipe, for instance, there's a fast SSE2 rcp instruction, but its
precision isn't sufficient. You could add a Newton-Raphson step, but
that gets Infs and NaNs wrong too, so you'd need a select workaround.
That leaves using a division, but that is slow...
> I'm not sure we really need a cap for this though ... except to avoid
> modifying drivers ;)
>
> I'll advertise the cap anyway, I prefer to be able to handle it internally.
> But I like this change, lowering SQRT (or not) is device specific and
> shouldn't be done unconditionally just because the API can't represent it.

Agreed.

>
>> glsl-to-tgsi code we use an extra CMP to check if x is zero and then
>> replace the result of x*rsq(x) with zero.
>>
>> In the end, this makes sqrt() generate much more reasonable code for
>> drivers that can do square roots.
>>
>> Note that many of piglit's generated shader tests use the GLSL
>> distance() function.
>> ---
>>  src/gallium/docs/source/tgsi.rst           | 9 +++++++++
>>  src/gallium/include/pipe/p_defines.h       | 3 ++-
>>  src/gallium/include/pipe/p_shader_tokens.h | 2 +-
>>  3 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/docs/source/tgsi.rst
>> b/src/gallium/docs/source/tgsi.rst
>> index 548a9a3..5f03f32 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -89,6 +89,15 @@ This instruction replicates its result.
>>    dst = \frac{1}{\sqrt{|src.x|}}
>>
>>
>> +.. opcode:: SQRT - Square Root
>> +
>> +This instruction replicates its result.
>> +
>> +.. math::
>> +
>> +  dst = {\sqrt{src.x}}
>> +
>> +
>>  .. opcode:: EXP - Approximate Exponential Base 2
>>
>>  .. math::
>> diff --git a/src/gallium/include/pipe/p_defines.h
>> b/src/gallium/include/pipe/p_defines.h
>> index d0db5e4..fdf6e7f 100644
>> --- a/src/gallium/include/pipe/p_defines.h
>> +++ b/src/gallium/include/pipe/p_defines.h
>> @@ -542,7 +542,8 @@ enum pipe_shader_cap
>>     PIPE_SHADER_CAP_SUBROUTINES = 16, /* BGNSUB, ENDSUB, CAL, RET */
>>     PIPE_SHADER_CAP_INTEGERS = 17,
>>     PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS = 18,
>> -   PIPE_SHADER_CAP_PREFERRED_IR = 19
>> +   PIPE_SHADER_CAP_PREFERRED_IR = 19,
>> +   PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED = 20
>>  };
>>
>>  /**
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h
>> b/src/gallium/include/pipe/p_shader_tokens.h
>> index 3fb12fb..a9fb6aa 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -275,7 +275,7 @@ struct tgsi_property_data {
>>  #define TGSI_OPCODE_SUB 17
>>  #define TGSI_OPCODE_LRP 18
>>  #define TGSI_OPCODE_CND 19
>> - /* gap */
>> +#define TGSI_OPCODE_SQRT 20
>>  #define TGSI_OPCODE_DP2A 21
>>  /* gap */
>>  #define TGSI_OPCODE_FRC 24
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev