Ian Romanick <i...@freedesktop.org> writes: > On 01/25/2017 10:53 AM, Francisco Jerez wrote: >> Hi Ian, and thank you for your comments, >> >> Ian Romanick <i...@freedesktop.org> writes: >> >>> On 01/24/2017 03:26 PM, Francisco Jerez wrote: >>>> Will avoid a regression in a future commit that introduces some >>>> additional rcp operations. >>> >>> When I converted GLSL IR to ir_expression_operation.py, I was careful to >>> keep all the expressions the same. rcp and div had these weird guards. >>> GLSL doesn't require that NaN be generated, and quite a few old GPUs >>> don't. If the atan2 implementation depends on NaN being generated by >>> rcp, it may have problems on i915, r300, and similar GPUs. I don't know >>> what they generate, but it's not NaN and it's probably not 0.0. >> >> The atan2 implementation from patch 5 doesn't rely on NaNs being >> generated, but it does rely on the reciprocal operation handling zero >> and infinity correctly as specified by GLSL for the division operation. > > Okay. That is the problem on older GPUs that I was referring. > Specifically, all versions of the GLSL prior to 4.40 (!) say: > > Similarly, treatment of conditions such as divide by 0 may lead to > an unspecified result, but in no case should such a condition lead > to the interruption or termination of processing. >
I cannot find this paragraph in any of the GLSL 4.1+ specs. > I believe that DX11 requires the GLSL 4.40+ behavior, so all even > somewhat modern devices should just work. It's all the pre-DX11 > hardware that's might be a problem. > Actually I don't think PATCH 5 necessarily cares about infinities not getting generated -- The only requirement is for rcp(0) to return a fairly large value in absolute value (the larger the more accurate the result will be), even the sign of the result is pretty much irrelevant. That said there's a relatively straightforward change we could apply to PATCH 5 and 6 in order to make the calculation more robust against division by zero (it will probably make this patch unnecessary to avoid piglit regressions but I think we want it anyway because of GLSL 4.1+): We could leverage the coordinate rotation to turn (y, 0) into (0, y), avoiding division by zero along the whole vertical line -- The only remaining case where we could potentially divide by zero is along the left y=0 half-line, but the function jumps from -π to π along that line so returning an undefined value within [-π, π] for y=0 and x < 0 is very unlikely to hurt, because AFAICT all GLSL versions lacking well-defined divide by zero didn't require the implementation to represent the sign of zero consistently either, so the shader is unlikely to be able to generate a signed zero value accurately enough to notice the problem... > Now... talking to Jason just now, he reminded me that the spec also says > the following about built-in functions: > > Function parameters specified as angle are assumed to be in units > of radians. In no case will any of these functions result in a > divide by zero error. If the divisor of a ratio is 0, then results > will be undefined. > Heh... That could be interpreted as if atan2(y, 0) is undefined which would make this discussion moot -- I'll send a v2 of PATCH 5 and 6 anyway since it should be easy enough to get right. > We may be fine even on old, clunky hardware. Looking at the code in > patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces > undefined results. It looks like > tests/shaders/glsl-fs-atan-2.shader_test should hit that case. Anyone > have r300 or r400 hardware to test that? > > This patch doesn't affect that, and, even with the "unspecified result" > rule, it's clearly correct. This patch is > > Reviewed-by: Ian Romanick <ian.d.roman...@intel.com> > >>> That said, this matches NIR, and it's probably fine. >>> >>>> --- >>>> src/compiler/glsl/ir_expression_operation.py | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/src/compiler/glsl/ir_expression_operation.py >>>> b/src/compiler/glsl/ir_expression_operation.py >>>> index f91ac9b..4ac1ffb 100644 >>>> --- a/src/compiler/glsl/ir_expression_operation.py >>>> +++ b/src/compiler/glsl/ir_expression_operation.py >>>> @@ -422,7 +422,7 @@ ir_expression_operation = [ >>>> operation("neg", 1, source_types=numeric_types, c_expression={'u': >>>> "-((int) {src0})", 'default': "-{src0}"}), >>>> operation("abs", 1, source_types=signed_numeric_types, >>>> c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", >>>> 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}), >>>> operation("sign", 1, source_types=signed_numeric_types, >>>> c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > >>>> 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", >>>> 'i64': "({src0} > 0) - ({src0} < 0)"}), >>>> - operation("rcp", 1, source_types=real_types, c_expression={'f': >>>> "{src0} != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / >>>> {src0} : 0.0"}), >>>> + operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F >>>> / {src0}", 'd': "1.0 / {src0}"}), >>>> operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F >>>> / sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}), >>>> operation("sqrt", 1, source_types=real_types, c_expression={'f': >>>> "sqrtf({src0})", 'd': "sqrt({src0})"}), >>>> operation("exp", 1, source_types=(float_type,), >>>> c_expression="expf({src0})"), # Log base e on gentype >>>>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev