On 13/01/2017 19:50, Matteo Bruni wrote:
2017-01-13 3:37 GMT+01:00 Ilia Mirkin <imir...@alum.mit.edu>:
On Thu, Jan 12, 2017 at 9:13 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
Unless, of course, it's controlled by the same hardware bit... Clearly, we
can give you abs on rsq without denorm flushing (easy shader hacks) but
not the other way around.
OK, so somehow I missed that earlier. However there's an interesting
section in the PRM:

https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf

on PDF page 854, "Dismissed Legacy Behaviors" which has a list of
suggested IEEE 754 deviations for DX9. One of them is indeed that 0 *
x = 0, but another is that input NaNs be propagated with certain
exceptions. Also they suggest that RCP(0)/RSQ(0) = fmax. Interesting.
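
As a rough illustration of the RCP(0)/RSQ(0) = fmax suggestion, here is a minimal Python sketch. The function names `rcp` and `rsq` and the use of the largest finite float32 as fmax are assumptions for illustration only; signed zeros, negative RSQ inputs, and the NaN propagation exceptions are left out because they aren't spelled out here.

```python
import math

# Largest finite single-precision value, assumed to be what "fmax" means.
FMAX = (2.0 - 2.0 ** -23) * 2.0 ** 127

def rcp(x):
    # Legacy-style reciprocal: a zero input yields fmax instead of inf.
    if x == 0.0:
        return FMAX
    return 1.0 / x

def rsq(x):
    # Legacy-style reciprocal square root, for non-negative inputs only.
    if x == 0.0:
        return FMAX
    return 1.0 / math.sqrt(x)

print(rcp(0.0), rsq(0.0), rcp(2.0), rsq(4.0))
```

Under strict IEEE 754 both zero cases would give inf, which is what later feeds the 0 * inf problem.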

So at this point, the zero_wins thing is pretty much blown. i965
appears to have an all-or-nothing approach, and additionally that
approach doesn't match up exactly to what NVIDIA does (or at least I'm
not aware of a clamp-everything mode).

This will take some thought to figure out how something can be
specified so that a single spec works for both i965 and nv/amd. OTOH
we could have two different specs that just expose different things -
e.g. i965 could expose a MESA_shader_float_alt_mode or whatever which
is spec'd to do the things that the PRM says, and nv/amd have the
MESA_shader_float_zero_wins ext which does what we were talking about
earlier.

I'm open to other suggestions too.
Maybe we can go back to the original idea and have the extension
require that no NaNs can be generated by GLSL mathematical operators
and builtin functions (if no operand is a NaN?). It's possible that's
not exactly it but in any case the idea is to just specify expected
results, without requiring a specific route to get there. The
extension could introduce undefined behavior where necessary e.g.
allowing (but not requiring) INF results to be always flushed to fmax
when enabled.

For Intel that would work trivially. For AMD it should be a matter of
using the special instructions where necessary and "be careful" in a
few places (in the same vein as the RSQ and POW opcodes of ARB
programs Marek mentioned). Not sure about nouveau, I guess it should
be similar to AMD in the end.

Would that be too messy? Am I completely missing the point?

Specifying just the behaviour for NaN doesn't solve the 0*inf issue for MAD operations: 24 + 0*inf = NaN would get converted to 0, whereas the expected result is 24.
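
To make the MAD case concrete, here is a small Python sketch comparing the two rules. The helper names (`flush_nan`, `mul_zero_wins`, `mad_nan_flushed`, `mad_zero_wins`) are hypothetical and only model the two candidate specifications; they are not what any driver actually does.

```python
import math

INF = float("inf")

def flush_nan(x):
    # Hypothetical "NaN-only" rule: a NaN result is replaced by 0.
    return 0.0 if math.isnan(x) else x

def mul_zero_wins(a, b):
    # Hypothetical "zero wins" rule: 0 * x is 0 even when x is inf or NaN.
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b

def mad_nan_flushed(a, b, c):
    # IEEE multiply-add, with the NaN rule applied to the final result only.
    return flush_nan(a * b + c)

def mad_zero_wins(a, b, c):
    # The multiply already yields 0, so the addend survives.
    return mul_zero_wins(a, b) + c

print(mad_nan_flushed(0.0, INF, 24.0))  # NaN flushed after the add
print(mad_zero_wins(0.0, INF, 24.0))    # addend preserved
```

The first call returns 0 and the second returns 24, which is the discrepancy described above: flushing only the final NaN throws away the addend.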


Axel

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev