Re: [Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-14 Thread Matt Turner
On Wed, Aug 13, 2014 at 9:52 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
> I left all the variants as separate operations in the glsl ir. However for
> gallium I only added the fine version, as it seems like DDX can do pretty much
> whatever it wants. I was on the fence about adding coarse versions as well and
> then using the FragmentShaderDerivative hint to select one or the other in the
> glsl -> tgsi conversion.
>
> In the case of nv50/nvc0, doing the fine version is pretty much the only
> (easy) way of doing derivatives. I haven't traced the blob to see how it
> handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
> least for now. Curious about what the situation with other hardware is.

i965 already implements coarse and fine derivatives, selectable by the
derivatives hint, coarse default.

The calculation of the derivative itself isn't faster for coarse
derivatives, but it was discovered that if all of the samples of a
sample_d are from the same LOD, it's a bunch faster on Haswell at
least. See commit 848c0e72. And with coarse derivatives they are.

Maybe other hardware has similar optimizations?
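
A minimal sketch of why coarse derivatives put a whole 2x2 quad at one LOD, while fine derivatives may not. It assumes a single scalar texture coordinate per pixel and a simplified LOD formula, log2(max(|ddx|, |ddy|) * size); the helper names, the sample values and the texture size are illustrative and are not taken from the i965 driver or from commit 848c0e72.

```python
import math

def fine_derivs(q):
    # q[row][col] is the coordinate at each pixel of a 2x2 quad.
    # Fine: one ddx per row, one ddy per column.
    ddx = [[q[r][1] - q[r][0]] * 2 for r in range(2)]
    ddy = [[q[1][c] - q[0][c] for c in range(2)] for _ in range(2)]
    return ddx, ddy

def coarse_derivs(q):
    # Coarse: a single ddx and a single ddy shared by the whole quad.
    ddx = [[q[0][1] - q[0][0]] * 2 for _ in range(2)]
    ddy = [[q[1][0] - q[0][0]] * 2 for _ in range(2)]
    return ddx, ddy

def lods(ddx, ddy, size):
    # Simplified per-pixel LOD (an assumption, not the hardware formula).
    return [math.log2(max(abs(ddx[r][c]), abs(ddy[r][c])) * size)
            for r in range(2) for c in range(2)]

# A quad whose coordinate varies non-linearly, so the two rows differ.
quad = [[0.0, 0.01],
        [0.0, 0.04]]

coarse_lods = lods(*coarse_derivs(quad), size=256)
fine_lods = lods(*fine_derivs(quad), size=256)
print("distinct coarse LODs:", len(set(coarse_lods)))
print("distinct fine LODs:", len(set(fine_lods)))
```

With coarse derivatives every pixel in the quad feeds identical gradients to sample_d, so only one LOD is needed; with fine derivatives the rows and columns can disagree.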

> Also, the extension spec claims to require GLSL 4.00, which seems a little
> extreme. Instead I restrict it to core contexts. Let me know if I should
> change this.

Making it core-only doesn't help, nor does it satisfy the GLSL >= 4.0
requirement in the spec. I'm not sure if we have a way to arbitrarily
limit an extension to being exposed under certain GLSL versions... ?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-14 Thread Ian Romanick
On 08/13/2014 11:58 PM, Matt Turner wrote:
> On Wed, Aug 13, 2014 at 9:52 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>> I left all the variants as separate operations in the glsl ir. However for
>> gallium I only added the fine version, as it seems like DDX can do pretty much
>> whatever it wants. I was on the fence about adding coarse versions as well and
>> then using the FragmentShaderDerivative hint to select one or the other in the
>> glsl -> tgsi conversion.
>>
>> In the case of nv50/nvc0, doing the fine version is pretty much the only
>> (easy) way of doing derivatives. I haven't traced the blob to see how it
>> handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
>> least for now. Curious about what the situation with other hardware is.
>
> i965 already implements coarse and fine derivatives, selectable by the
> derivatives hint, coarse default.

I don't think that's the same thing.  The fine derivatives in i965
definitely do not meet this requirement:

    ...second-order fine derivatives, e.g., dFdxFine(dFdxFine(x))
    will properly reflect the difference between the independent
    fine derivatives computed within the 2x2 square.

As it is now, dFdxFine(dFdxFine(x*x*x)) will always be zero in the i965
driver.  Two pixels on the same line will have different dFdy, but the
dFdx will be the same.  Right?
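
A minimal sketch of that behaviour, assuming fine derivatives are plain differences inside one 2x2 quad (one dFdx per row, one dFdy per column). The function names and the quad position are illustrative only and do not mirror the i965 code.

```python
def dfdx_fine(q):
    # q[row][col]; both pixels in a row receive that row's difference.
    return [[q[r][1] - q[r][0]] * 2 for r in range(2)]

def dfdy_fine(q):
    # Both pixels in a column receive that column's difference.
    return [[q[1][c] - q[0][c] for c in range(2)] for _ in range(2)]

# Quad whose top-left pixel sits at x=4, y=6 (arbitrary choice).
xs = [[4, 5], [4, 5]]
ys = [[6, 6], [7, 7]]

# f = x*x*x*y, so the two rows have different first derivatives in x.
f = [[xs[r][c] ** 3 * ys[r][c] for c in range(2)] for r in range(2)]

first = dfdx_fine(f)        # differs per row, constant within a row
second = dfdx_fine(first)   # difference of two equal values

print("dFdxFine(f):          ", first)
print("dFdxFine(dFdxFine(f)):", second)
print("dFdyFine(f):          ", dfdy_fine(f))
```

Because the first dFdxFine writes the same value to both pixels of a row, a second dFdxFine along that row has nothing left to difference.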

Is there a piglit test for that specific part?  (I haven't looked at the
piglit list at all.)

> The calculation of the derivative itself isn't faster for coarse
> derivatives, but it was discovered that if all of the samples of a
> sample_d are from the same LOD, it's a bunch faster on Haswell at
> least. See commit 848c0e72. And with coarse derivatives they are.
>
> Maybe other hardware has similar optimizations?
>
>> Also, the extension spec claims to require GLSL 4.00, which seems a little
>> extreme. Instead I restrict it to core contexts. Let me know if I should
>> change this.
>
> Making it core-only doesn't help, nor does it satisfy the GLSL >= 4.0
> requirement in the spec. I'm not sure if we have a way to arbitrarily
> limit an extension to being exposed under certain GLSL versions... ?


Re: [Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-14 Thread Ilia Mirkin
On Thu, Aug 14, 2014 at 1:30 PM, Ian Romanick i...@freedesktop.org wrote:
> On 08/13/2014 11:58 PM, Matt Turner wrote:
>> On Wed, Aug 13, 2014 at 9:52 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>>> I left all the variants as separate operations in the glsl ir. However for
>>> gallium I only added the fine version, as it seems like DDX can do pretty much
>>> whatever it wants. I was on the fence about adding coarse versions as well and
>>> then using the FragmentShaderDerivative hint to select one or the other in the
>>> glsl -> tgsi conversion.
>>>
>>> In the case of nv50/nvc0, doing the fine version is pretty much the only
>>> (easy) way of doing derivatives. I haven't traced the blob to see how it
>>> handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
>>> least for now. Curious about what the situation with other hardware is.
>>
>> i965 already implements coarse and fine derivatives, selectable by the
>> derivatives hint, coarse default.
>
> I don't think that's the same thing.  The fine derivatives in i965
> definitely do not meet this requirement:
>
>     ...second-order fine derivatives, e.g., dFdxFine(dFdxFine(x))
>     will properly reflect the difference between the independent
>     fine derivatives computed within the 2x2 square.
>
> As it is now, dFdxFine(dFdxFine(x*x*x)) will always be zero in the i965
> driver.  Two pixels on the same line will have different dFdy, but the
> dFdx will be the same.  Right?

I sent a question about this to the list earlier (with no response
other than my own), but I believe that to be a typo in the spec. Look
at Issue 2, which explicitly talks about dFdxFine(dFdyFine(...)).
There's no way to get second-order derivatives in a single variable
with only 2 points, so it would want a larger block.
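
A minimal sketch of the sample-count argument, assuming unit pixel spacing and plain finite differences. The two helper functions are hypothetical and only count which pixels each stencil touches; they are not how any driver computes derivatives.

```python
def second_diff_x(f, x, y):
    # Second derivative in x alone: needs three samples along x,
    # at x-1, x and x+1, which is wider than a 2x2 block.
    return f(x + 1, y) - 2 * f(x, y) + f(x - 1, y)

def mixed_diff_xy(f, x, y):
    # Mixed derivative d2f/dxdy: needs only the four corners of a
    # single 2x2 block, (x, y) .. (x+1, y+1).
    return (f(x + 1, y + 1) - f(x, y + 1)) - (f(x + 1, y) - f(x, y))

cubic = lambda x, y: x ** 3    # d2/dx2 = 6x
product = lambda x, y: x * y   # d2/dxdy = 1

print("second difference of x^3 at x=4:", second_diff_x(cubic, 4, 6))
print("mixed difference of x*y at (4, 6):", mixed_diff_xy(product, 4, 6))
```

The mixed form is the one a dFdxFine(dFdyFine(...)) chain can actually evaluate inside one quad, which is consistent with reading the spec sentence as a typo.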


> Is there a piglit test for that specific part?  (I haven't looked at the
> piglit list at all.)

There's a piglit test for dFdxFine(dFdyFine()).


>> The calculation of the derivative itself isn't faster for coarse
>> derivatives, but it was discovered that if all of the samples of a
>> sample_d are from the same LOD, it's a bunch faster on Haswell at
>> least. See commit 848c0e72. And with coarse derivatives they are.
>>
>> Maybe other hardware has similar optimizations?
>>
>>> Also, the extension spec claims to require GLSL 4.00, which seems a little
>>> extreme. Instead I restrict it to core contexts. Let me know if I should
>>> change this.
>>
>> Making it core-only doesn't help, nor does it satisfy the GLSL >= 4.0
>> requirement in the spec. I'm not sure if we have a way to arbitrarily
>> limit an extension to being exposed under certain GLSL versions... ?


Re: [Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-14 Thread Ian Romanick
On 08/14/2014 10:30 AM, Ian Romanick wrote:
> On 08/13/2014 11:58 PM, Matt Turner wrote:
>> On Wed, Aug 13, 2014 at 9:52 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>>> I left all the variants as separate operations in the glsl ir. However for
>>> gallium I only added the fine version, as it seems like DDX can do pretty much
>>> whatever it wants. I was on the fence about adding coarse versions as well and
>>> then using the FragmentShaderDerivative hint to select one or the other in the
>>> glsl -> tgsi conversion.
>>>
>>> In the case of nv50/nvc0, doing the fine version is pretty much the only
>>> (easy) way of doing derivatives. I haven't traced the blob to see how it
>>> handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
>>> least for now. Curious about what the situation with other hardware is.
>>
>> i965 already implements coarse and fine derivatives, selectable by the
>> derivatives hint, coarse default.
>
> I don't think that's the same thing.  The fine derivatives in i965
> definitely do not meet this requirement:
>
>     ...second-order fine derivatives, e.g., dFdxFine(dFdxFine(x))
>     will properly reflect the difference between the independent
>     fine derivatives computed within the 2x2 square.
>
> As it is now, dFdxFine(dFdxFine(x*x*x)) will always be zero in the i965
> driver.  Two pixels on the same line will have different dFdy, but the
> dFdx will be the same.  Right?

Hm... the overview in the extension also says:

    For the fine-granularity derivative, two derivatives could
    be computed for each 2x2 group of pixels; one for the top
    row and one for the bottom row.

This matches the fine derivatives in the i965 driver, but it seems at
odds with the second-order derivative line in the GLSL 4.50 spec.  I
guess I'll submit a spec bug...

> Is there a piglit test for that specific part?  (I haven't looked at the
> piglit list at all.)
>
>> The calculation of the derivative itself isn't faster for coarse
>> derivatives, but it was discovered that if all of the samples of a
>> sample_d are from the same LOD, it's a bunch faster on Haswell at
>> least. See commit 848c0e72. And with coarse derivatives they are.
>>
>> Maybe other hardware has similar optimizations?
>>
>>> Also, the extension spec claims to require GLSL 4.00, which seems a little
>>> extreme. Instead I restrict it to core contexts. Let me know if I should
>>> change this.
>>
>> Making it core-only doesn't help, nor does it satisfy the GLSL >= 4.0
>> requirement in the spec. I'm not sure if we have a way to arbitrarily
>> limit an extension to being exposed under certain GLSL versions... ?


Re: [Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-14 Thread Ian Romanick
On 08/14/2014 10:35 AM, Ilia Mirkin wrote:
> On Thu, Aug 14, 2014 at 1:30 PM, Ian Romanick i...@freedesktop.org wrote:
>> On 08/13/2014 11:58 PM, Matt Turner wrote:
>>> On Wed, Aug 13, 2014 at 9:52 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
>>>> I left all the variants as separate operations in the glsl ir. However for
>>>> gallium I only added the fine version, as it seems like DDX can do pretty much
>>>> whatever it wants. I was on the fence about adding coarse versions as well and
>>>> then using the FragmentShaderDerivative hint to select one or the other in the
>>>> glsl -> tgsi conversion.
>>>>
>>>> In the case of nv50/nvc0, doing the fine version is pretty much the only
>>>> (easy) way of doing derivatives. I haven't traced the blob to see how it
>>>> handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
>>>> least for now. Curious about what the situation with other hardware is.
>>>
>>> i965 already implements coarse and fine derivatives, selectable by the
>>> derivatives hint, coarse default.
>>
>> I don't think that's the same thing.  The fine derivatives in i965
>> definitely do not meet this requirement:
>>
>>     ...second-order fine derivatives, e.g., dFdxFine(dFdxFine(x))
>>     will properly reflect the difference between the independent
>>     fine derivatives computed within the 2x2 square.
>>
>> As it is now, dFdxFine(dFdxFine(x*x*x)) will always be zero in the i965
>> driver.  Two pixels on the same line will have different dFdy, but the
>> dFdx will be the same.  Right?
>
> I sent a question about this to the list earlier (with no response
> other than my own), but I believe that to be a typo in the spec. Look
> at Issue 2, which explicitly talks about dFdxFine(dFdyFine(...)).
> There's no way to get second-order derivatives in a single variable
> with only 2 points, so it would want a larger block.

Right... I'm at SIGGRAPH, so I haven't been keeping up with the mailing
list very well.  I did some research in the Khronos (non-public) mailing
list archives, and I came to the same conclusion.  I've submitted a spec
bug, so hopefully this will be fixed soon.

>> Is there a piglit test for that specific part?  (I haven't looked at the
>> piglit list at all.)
>
> There's a piglit test for dFdxFine(dFdyFine()).

Excellent. :)

>>> The calculation of the derivative itself isn't faster for coarse
>>> derivatives, but it was discovered that if all of the samples of a
>>> sample_d are from the same LOD, it's a bunch faster on Haswell at
>>> least. See commit 848c0e72. And with coarse derivatives they are.
>>>
>>> Maybe other hardware has similar optimizations?
>>>
>>>> Also, the extension spec claims to require GLSL 4.00, which seems a little
>>>> extreme. Instead I restrict it to core contexts. Let me know if I should
>>>> change this.
>>>
>>> Making it core-only doesn't help, nor does it satisfy the GLSL >= 4.0
>>> requirement in the spec. I'm not sure if we have a way to arbitrarily
>>> limit an extension to being exposed under certain GLSL versions... ?


[Mesa-dev] [PATCH 0/5] Add ARB_derivative_control support

2014-08-13 Thread Ilia Mirkin
I left all the variants as separate operations in the glsl ir. However for
gallium I only added the fine version, as it seems like DDX can do pretty much
whatever it wants. I was on the fence about adding coarse versions as well and
then using the FragmentShaderDerivative hint to select one or the other in the
glsl -> tgsi conversion.

In the case of nv50/nvc0, doing the fine version is pretty much the only
(easy) way of doing derivatives. I haven't traced the blob to see how it
handles things yet. In any case, on nv50/nvc0 all this is completely moot, at
least for now. Curious about what the situation with other hardware is.

Also, the extension spec claims to require GLSL 4.00, which seems a little
extreme. Instead I restrict it to core contexts. Let me know if I should
change this.

I will try to send some piglits out for this soon, but it's all fairly
straightforward...

Ilia Mirkin (5):
  mesa: add ARB_derivative_control extension bit
  glsl: add ARB_derivative control support
  gallium: add opcodes/cap for fine derivative support
  mesa/st: add support for emitting fine derivative opcodes
  nv50,nvc0: add support for fine derivatives

 docs/GL3.txt                                      |  2 +-
 src/gallium/auxiliary/tgsi/tgsi_info.c            |  3 ++
 src/gallium/auxiliary/tgsi/tgsi_util.c            |  2 +
 src/gallium/docs/source/screen.rst                |  2 +
 src/gallium/docs/source/tgsi.rst                  | 12 +-
 src/gallium/drivers/freedreno/freedreno_screen.c  |  1 +
 src/gallium/drivers/i915/i915_screen.c            |  1 +
 src/gallium/drivers/ilo/ilo_screen.c              |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c          |  1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp |  4 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c    |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c    |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c    |  1 +
 src/gallium/drivers/r300/r300_screen.c            |  1 +
 src/gallium/drivers/r600/r600_pipe.c              |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c            |  1 +
 src/gallium/drivers/softpipe/sp_screen.c          |  1 +
 src/gallium/drivers/svga/svga_screen.c            |  1 +
 src/gallium/drivers/vc4/vc4_screen.c              |  1 +
 src/gallium/include/pipe/p_defines.h              |  1 +
 src/gallium/include/pipe/p_shader_tokens.h        |  5 ++-
 src/glsl/builtin_functions.cpp                    | 48 ++
 src/glsl/glcpp/glcpp-parse.y                      |  3 ++
 src/glsl/glsl_parser_extras.cpp                   |  1 +
 src/glsl/glsl_parser_extras.h                     |  2 +
 src/glsl/ir.h                                     |  4 ++
 src/glsl/ir_validate.cpp                          |  4 ++
 src/mesa/main/extensions.c                        |  1 +
 src/mesa/main/mtypes.h                            |  1 +
 src/mesa/state_tracker/st_extensions.c            |  3 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp        |  9 +++-
 31 files changed, 114 insertions(+), 6 deletions(-)

-- 
1.8.5.5
