[Mesa-dev] [PATCH] freedreno: document debug flag
Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I just peeked into the freedreno code, and noticed that it had some undocumented debug-code. Perhaps it could make sense to document it, like this? src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..3257618 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG bool (false) + +Output debug information for the freedreno driver. + .. _flags: -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: document debug flag
On Mon, Mar 25, 2013 at 6:25 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I just peeked into the freedreno code, and noticed that it had some undocumented debug-code. Perhaps it could make sense to document it, like this? src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..3257618 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG bool (false) + +Output debug information for the freedreno driver. + OK, so it turns out I was a bit trigger-happy. This isn't a normal bool, it's a bitflag. So perhaps something like this instead? diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index 3257618..8658412 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,9 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. -.. envvar:: FD_MESA_DEBUG bool (false) +.. envvar:: FD_MESA_DEBUG bitflag (false) -Output debug information for the freedreno driver. +Output debug information for the freedreno driver. Bit #0 enables debug +prints, while bit #1 enables TGSI and ASM dumps. .. _flags: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: document debug flag
On Mon, Mar 25, 2013 at 6:36 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 25, 2013 at 6:25 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I just peeked into the freedreno code, and noticed that it had some undocumented debug-code. Perhaps it could make sense to document it, like this? src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..3257618 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG bool (false) + +Output debug information for the freedreno driver. + OK, so it turns out I was a bit trigger-happy. This isn't a normal bool, it's a bitflag. So perhaps something like this instead? diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index 3257618..8658412 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,9 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. -.. envvar:: FD_MESA_DEBUG bool (false) +.. envvar:: FD_MESA_DEBUG bitflag (false) -Output debug information for the freedreno driver. +Output debug information for the freedreno driver. Bit #0 enables debug +prints, while bit #1 enables TGSI and ASM dumps. .. _flags: ...and with Rob's recent commit, I guess it should be: diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/deb index e081cbf..dc308e5 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG flags (0x0) + +Debug :ref:`flags` for the freedreno driver. + .. _flags: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: document debug flag
On Mon, Mar 25, 2013 at 9:28 PM, Rob Clark robdcl...@gmail.com wrote: On Mon, Mar 25, 2013 at 3:27 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 25, 2013 at 6:36 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 25, 2013 at 6:25 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I just peeked into the freedreno code, and noticed that it had some undocumented debug-code. Perhaps it could make sense to document it, like this? src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..3257618 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG bool (false) + +Output debug information for the freedreno driver. + OK, so it turns out I was a bit trigger-happy. This isn't a normal bool, it's a bitflag. So perhaps something like this instead? diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index 3257618..8658412 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,9 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. -.. envvar:: FD_MESA_DEBUG bool (false) +.. envvar:: FD_MESA_DEBUG bitflag (false) -Output debug information for the freedreno driver. +Output debug information for the freedreno driver. Bit #0 enables debug +prints, while bit #1 enables TGSI and ASM dumps. .. _flags: ...and with Rob's recent commit, I guess it should be: diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/deb index e081cbf..dc308e5 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG flags (0x0) + +Debug :ref:`flags` for the freedreno driver. + care to squash that down into one patch, and I'll push? Well, that IS one patch. But you might want something for git-am? I'll send one a bit later, sure! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] freedreno: document debug flag
Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- Here you go, a version of the patch prepared for git-am src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..dc308e5 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG flags (0x0) + +Debug :ref:`flags` for the freedreno driver. + .. _flags: -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: document debug flag
Ping? Anything wrong with it? On Tue, Mar 26, 2013 at 2:48 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- Here you go, a version of the patch prepared for git-am src/gallium/docs/source/debugging.rst | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/docs/source/debugging.rst b/src/gallium/docs/source/debugging.rst index e081cbf..dc308e5 100644 --- a/src/gallium/docs/source/debugging.rst +++ b/src/gallium/docs/source/debugging.rst @@ -81,6 +81,10 @@ Debug :ref:`flags` for the llvmpipe driver. Number of threads that the llvmpipe driver should use. +.. envvar:: FD_MESA_DEBUG flags (0x0) + +Debug :ref:`flags` for the freedreno driver. + .. _flags: -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] freedreno: document debug flag
On Thu, Apr 4, 2013 at 4:32 PM, Brian Paul bri...@vmware.com wrote: On 04/04/2013 05:20 AM, Erik Faye-Lund wrote: Ping? Anything wrong with it? Looks OK to me. Do you need someone to commit for you? Well, I don't have commit access. But I was expecting to see rob pick it up and push it out the next time he had some updates, but I can't see it in his repo either. Either would be fine by me. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/21] glsl: Write a new built-in function module.
On Thu, Sep 5, 2013 at 12:22 AM, Kenneth Graunke kenn...@whitecape.org wrote: +#define B0(X) ir_function_signature *_##X(); +#define B1(X) ir_function_signature *_##X(const glsl_type *); +#define B2(X) ir_function_signature *_##X(const glsl_type *, const glsl_type *); +#define B3(X) ir_function_signature *_##X(const glsl_type *, const glsl_type *, const glsl_type *); + B1(radians) + B1(degrees) + B1(sin) + B1(cos) snip + B1(bitCount) + B1(findLSB) + B1(findMSB) + B1(fma) +#undef B Shouldn't this be: #undef B0 #undef B1 #undef B2 #undef B3 ? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/21] glsl: Add IR builder shortcuts for a bunch of random opcodes.
diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp index e12ae3c..31a457d 100644 --- a/src/glsl/ir_builder.cpp +++ b/src/glsl/ir_builder.cpp @@ -264,6 +264,46 @@ abs(operand a) return expr(ir_unop_abs, a); } +ir_expression *neg(operand a) +{ + return expr(ir_unop_neg, a); +} + snip +ir_expression* +f2b(operand a) +{ + return expr(ir_unop_f2b, a); +} + Any specific reason for the inconsistent folding before the function name? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES
On Mon, Sep 9, 2013 at 1:31 PM, Roland Scheidegger srol...@vmware.com wrote: I'm not really convinced of this idea. There's already PIPE_CAP_MIXED_COLORBUFFER_FORMATS which is sort of similar - a requirement of ARB_fbo, but it isn't used to determine support of ARB_fbo or not (my guess is drivers want to advertize ARB_fbo even if they can't do it, and ARB_fbo doesn't really have a hard requirement for anything as you always can say no to fb supported). So I can't see why not supporting different width/height is treated different to not supporting different formats. Actually, ARB_framebuffer_object does have a hard requirement on rendering to differently sized attachments. FRAMEBUFFER_UNSUPPORTED is only allowed for format-issues. EXT_framebuffer_objects can still be supported on hardware that cannot meet this requirement. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glcpp: error on multiple #else directives
The preprocessor currently eats multiple #else directives int the same #if(def) ... #endif block. While I haven't been able to find anything that explicitly disallows it, it's nonsensical and should probably not be allowed. Add checks to reject the code. --- I'm not entirely sure why parser-skip_stack can be NULL after _glcpp_parser_skip_stack_change_if, but apparently it is for at least one test. I'm also not entirely sure if this should be an error or a warning. Thoughts? src/glsl/glcpp/glcpp-parse.y | 13 - src/glsl/glcpp/glcpp.h| 1 + src/glsl/glcpp/tests/118-multiple-else.c | 6 ++ src/glsl/glcpp/tests/118-multiple-else.c.expected | 8 4 files changed, 27 insertions(+), 1 deletion(-) create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c.expected diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 6eaa5f9..885c64d 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -332,7 +332,17 @@ control_line: } } | HASH_ELSE { - _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, multiple #else); + } + else + { + _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack) + parser-skip_stack-has_else = true; + } } NEWLINE | HASH_ENDIF { _glcpp_parser_skip_stack_pop (parser, @1); @@ -2012,6 +2022,7 @@ _glcpp_parser_skip_stack_push_if (glcpp_parser_t *parser, YYLTYPE *loc, node-type = SKIP_TO_ENDIF; } + node-has_else = false; node-next = parser-skip_stack; parser-skip_stack = node; } diff --git a/src/glsl/glcpp/glcpp.h b/src/glsl/glcpp/glcpp.h index 8aaa551..ccae96c 100644 --- a/src/glsl/glcpp/glcpp.h +++ b/src/glsl/glcpp/glcpp.h @@ -153,6 +153,7 @@ typedef enum skip_type { typedef struct skip_node { skip_type_t type; + bool has_else; YYLTYPE loc; /* location of the initial #if/#elif/... */ struct skip_node *next; } skip_node_t; diff --git a/src/glsl/glcpp/tests/118-multiple-else.c b/src/glsl/glcpp/tests/118-multiple-else.c new file mode 100644 index 000..62ad49c --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c @@ -0,0 +1,6 @@ +#if 0 +#else +int foo; +#else +int bar; +#endif diff --git a/src/glsl/glcpp/tests/118-multiple-else.c.expected b/src/glsl/glcpp/tests/118-multiple-else.c.expected new file mode 100644 index 000..eaec481 --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c.expected @@ -0,0 +1,8 @@ +0:4(1): preprocessor error: multiple #else + + +int foo; + +int bar; + + -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glcpp: error on multiple #else directives
On Mon, Sep 23, 2013 at 10:35 PM, Erik Faye-Lund kusmab...@gmail.com wrote: The preprocessor currently eats multiple #else directives int the same #if(def) ... #endif block. While I haven't been able to find anything that explicitly disallows it, it's nonsensical and should probably not be allowed. Add checks to reject the code. --- I'm not entirely sure why parser-skip_stack can be NULL after _glcpp_parser_skip_stack_change_if, but apparently it is for at least one test. I'm also not entirely sure if this should be an error or a warning. Thoughts? src/glsl/glcpp/glcpp-parse.y | 13 - src/glsl/glcpp/glcpp.h| 1 + src/glsl/glcpp/tests/118-multiple-else.c | 6 ++ src/glsl/glcpp/tests/118-multiple-else.c.expected | 8 4 files changed, 27 insertions(+), 1 deletion(-) create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c.expected diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 6eaa5f9..885c64d 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -332,7 +332,17 @@ control_line: } } | HASH_ELSE { - _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, multiple #else); + } + else + { + _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack) + parser-skip_stack-has_else = true; + } Seems the HASH_ELIF variants needs the same treatment (sorry for the b0rked white-space, gmail is not playing nice today): diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 885c64d..690f2da 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -310,6 +310,11 @@ control_line: _glcpp_parser_expand_and_lex_from (parser, ELIF_EXPANDED, $2); } + else if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, #elif after #else); + } else { _glcpp_parser_skip_stack_change_if (parser, @1, @@ -324,6 +329,11 @@ control_line: { glcpp_error( @1, parser, #elif with no expression); } + else if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, #elif after #else); + } else { _glcpp_parser_skip_stack_change_if (parser, @1, diff --git a/src/glsl/glcpp/tests/119-elif-after-else.c b/src/glsl/glcpp/tests/119-elif-after-else.c new file mode 100644 index 000..9b9e923 --- /dev/null +++ b/src/glsl/glcpp/tests/119-elif-after-else.c @@ -0,0 +1,6 @@ +#if 0 +#else +int foo; +#elif 0 +int bar; +#endif diff --git a/src/glsl/glcpp/tests/119-elif-after-else.c.expected b/src/glsl/glcpp/tests/119-elif-after-else.c.expected new file mode 100644 index 000..33f0513 --- /dev/null +++ b/src/glsl/glcpp/tests/119-elif-after-else.c.expected @@ -0,0 +1,8 @@ +0:4(1): preprocessor error: #elif after #else + + +int foo; + +int bar; + + ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] context sharing of framebuffer objects
On Mon, Sep 30, 2013 at 11:01 AM, Henri Verbeet hverb...@gmail.com wrote: On 30 September 2013 02:18, Dave Airlie airl...@gmail.com wrote: So this led me to look at the spec and the mesa code, and I noticed it appears at some point maybe around 3.1 that FBOs are no longer considered shared objects at least in core profile, but mesa always seems to share them, just wondering is someone can confirm I'm reading things correctly, and if so I might try and do a piglit test and a patch. AFAIK the only FBOs that can be shared are ones create through EXT_fbo. (Specifically, see issue 10 in the ARB_fbo spec, and Appendix D in the GL 3.0 spec.) This matches my reading of the spec as well, and kind of makes the world horrible. From the ARB_framebuffer_object spec: Dependencies on EXT_framebuffer_object Framebuffer objects created with the commands defined by the GL_EXT_framebuffer_object extension are defined to be shared, while FBOs created with commands defined by the OpenGL core or GL_ARB_framebuffer_object extension are defined *not* to be shared. Undefined behavior results when using FBOs created by EXT commands through non-EXT interfaces, or vice-versa. Yuck. Also see issue #10 in the spec: (10) Can ARB framebuffer objects be shared between contexts? ARB_framebuffer_object is supposed to be compatible with EXT_framebuffer_object, but also a subset of OpenGL 3.0. EXT_framebuffer_object (rev. 120) explicitly allows sharing in issue 76, but the 3.0 spec explicitly disallows it in Appendix D. Resolved: No. ARB_framebuffer_object is intended to capture the functionality that went into GL 3.0. Furthermore, given that the entry points and tokens in this extension and the core are identical there is no way that an implementation could differentiate FBOs created with this extension from those created by core GL. ADDITIONAL COMMENTS: Undefined behavior results when using FBOs created by EXT commands through non-EXT FBO interfaces, or vice-versa. See the Dependencies on EXT_framebuffer_object section above. It basically says the same thing, only with a bit more explanation. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glcpp: error on multiple #else directives
On Mon, Sep 23, 2013 at 11:02 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Sep 23, 2013 at 10:35 PM, Erik Faye-Lund kusmab...@gmail.com wrote: The preprocessor currently eats multiple #else directives int the same #if(def) ... #endif block. While I haven't been able to find anything that explicitly disallows it, it's nonsensical and should probably not be allowed. Add checks to reject the code. --- I'm not entirely sure why parser-skip_stack can be NULL after _glcpp_parser_skip_stack_change_if, but apparently it is for at least one test. I'm also not entirely sure if this should be an error or a warning. Thoughts? Ping? Any thoughts? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] libGL without X
On Mon, Oct 21, 2013 at 4:05 PM, Chris Healy cphe...@gmail.com wrote: I have a headless platform I need OpenGL to work on that does not have X. It is x86 with Intel HD 4000 graphics. Ultimately, I'm just wanting to use OpenGL to render to memory for encoding to H.264 and streaming. I'm trying to build Mesa for this platform without X and cannot get it to build libGL.so. What am I missing here? Is it not possible to use OpenGL without X? I was hoping I could use OpenGL with EGL for testing purposes. If you build mesa with GBM support, you should be able to render without X. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] libGL without X
On Tue, Oct 22, 2013 at 2:58 AM, Chris Healy cphe...@gmail.com wrote: On Mon, Oct 21, 2013 at 2:03 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:05 PM, Chris Healy cphe...@gmail.com wrote: I have a headless platform I need OpenGL to work on that does not have X. It is x86 with Intel HD 4000 graphics. Ultimately, I'm just wanting to use OpenGL to render to memory for encoding to H.264 and streaming. I'm trying to build Mesa for this platform without X and cannot get it to build libGL.so. What am I missing here? Is it not possible to use OpenGL without X? I was hoping I could use OpenGL with EGL for testing purposes. If you build mesa with GBM support, you should be able to render without X. I would still need to build Mesa with X so that libGL is built though, correct? Probably, yeah. But you can run the resulting binary without an x-server running. Dunno if that's sufficient for you or not. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: Optimize (not A) or (not B) into not (A and B).
On Thu, Oct 24, 2013 at 2:19 AM, Matt Turner matts...@gmail.com wrote: A few Serious Sam 3 shaders affected: instructions in affected programs: 4384 - 4344 (-0.91%) --- src/glsl/opt_algebraic.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 37b2f02..3bf0689 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -32,8 +32,11 @@ #include ir_visitor.h #include ir_rvalue_visitor.h #include ir_optimization.h +#include ir_builder.h #include glsl_types.h +using namespace ir_builder; + namespace { /** @@ -436,6 +439,15 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) this-progress = true; return new(mem_ctx) ir_constant(ir-type, data); + } else if (op_expr[0] op_expr[0]-operation == ir_unop_logic_not + op_expr[1] op_expr[1]-operation == ir_unop_logic_not) { + /* De Morgan's Law: + *(not A) or (not B) === not (A and B) + */ + temp = logic_not(logic_and(op_expr[0]-operands[0], +op_expr[1]-operands[0])); + return swizzle_if_required(ir, temp); + this-progress = true; Returning before reporting progress? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] glsl: Optimize (not A) and (not B) into not (A or B).
On Thu, Oct 24, 2013 at 2:19 AM, Matt Turner matts...@gmail.com wrote: No shader-db changes, but seems like a good idea. --- src/glsl/opt_algebraic.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 3bf0689..1ce9e2d 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -401,6 +401,15 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) } else if (is_vec_zero(op_const[0]) || is_vec_zero(op_const[1])) { this-progress = true; return ir_constant::zero(mem_ctx, ir-type); + } else if (op_expr[0] op_expr[0]-operation == ir_unop_logic_not + op_expr[1] op_expr[1]-operation == ir_unop_logic_not) { + /* De Morgan's Law: + *(not A) and (not B) === not (A or B) + */ + temp = logic_not(logic_or(op_expr[0]-operands[0], + op_expr[1]-operands[0])); + return swizzle_if_required(ir, temp); + this-progress = true; Same here, returning before updating progress? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Drop no-op shifts by 0.
Why is this tagged as i965/fs, when everything seems to happen in the glsl-optimizer? On Thu, Oct 24, 2013 at 5:53 PM, Eric Anholt e...@anholt.net wrote: I noticed this in a shader in Unigine Heaven that was spilling. While it doesn't really reduce register pressure, it shaves a few instructions anyway (7955 - 7882). --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 37b2f02..ff06cfc 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -387,6 +387,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) } break; + case ir_binop_rshift: + case ir_binop_lshift: + if (is_vec_zero(op_const[0])) + return ir-operands[1]; + else if (is_vec_zero(op_const[1])) + return ir-operands[0]; + break; + Maybe update progress inside the conditionals also? But wait a minute. x shifted by 0 is x, so the latter part looks correct. But the first conditional seems to assume that 0 sifted by x is x, but it's really 0, no? Shouldn't both cases return ir-operands[0]? What am I missing? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 6/6] mesa: OES_get_program_binary functionality
On Fri, Nov 1, 2013 at 10:16 AM, Tapani Pälli tapani.pa...@intel.com wrote: Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/shaderapi.c | 46 +++--- 1 file changed, 39 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 7da860d..67aee6b 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -57,6 +57,8 @@ #include ../glsl/ir.h #include ../glsl/ir_uniform.h #include ../glsl/program.h +#include ../glsl/ir_cache.h +#include git_sha1.h /** Define this to enable shader substitution (see below) */ #define SHADER_SUBST 0 @@ -212,7 +214,6 @@ attach_shader(struct gl_context *ctx, GLuint program, GLuint shader) struct gl_shader_program *shProg; struct gl_shader *sh; GLuint i, n; - const bool same_type_disallowed = _mesa_is_gles(ctx); shProg = _mesa_lookup_shader_program_err(ctx, program, glAttachShader); @@ -1630,8 +1631,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length, if (length != NULL) *length = 0; - (void) binaryFormat; - (void) binary; + size_t size = 0; + char *data = _mesa_program_serialize(shProg, size, MESA_GIT_SHA1); + + /* we have more data that can fit to user given buffer */ + if (size bufSize) { + _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (data) + free(data); + return; + } + + if (data) { + memcpy(binary, data, size); + free(data); + } + + if (length != NULL) + *length = size; + + *binaryFormat = 0; } void GLAPIENTRY @@ -1645,10 +1664,23 @@ _mesa_ProgramBinary(GLuint program, GLenum binaryFormat, if (!shProg) return; - (void) binaryFormat; - (void) binary; - (void) length; - _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (length = 0) + return; + + /* free possible existing data and initialize structure */ + _mesa_free_shader_program_data(ctx, shProg); + _mesa_init_shader_program(ctx, shProg); + + /* fill structure from a binary blob */ + if (_mesa_program_deserialize(shProg, binary, length, MESA_GIT_SHA1)) { Won't using the git-sha1 as a compatibility-criteria cause issues for developers with local changes? I'm not so worried about this for OES_get_program_binary itself, but once the shader-cache is in place it sounds like a potential source of difficult to track down misbehavior... ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 6/6] mesa: OES_get_program_binary functionality
On Fri, Nov 1, 2013 at 11:35 AM, Tapani Pälli tapani.pa...@intel.com wrote: On 11/01/2013 12:21 PM, Erik Faye-Lund wrote: On Fri, Nov 1, 2013 at 10:16 AM, Tapani Pälli tapani.pa...@intel.com wrote: Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/shaderapi.c | 46 +++--- 1 file changed, 39 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 7da860d..67aee6b 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -57,6 +57,8 @@ #include ../glsl/ir.h #include ../glsl/ir_uniform.h #include ../glsl/program.h +#include ../glsl/ir_cache.h +#include git_sha1.h /** Define this to enable shader substitution (see below) */ #define SHADER_SUBST 0 @@ -212,7 +214,6 @@ attach_shader(struct gl_context *ctx, GLuint program, GLuint shader) struct gl_shader_program *shProg; struct gl_shader *sh; GLuint i, n; - const bool same_type_disallowed = _mesa_is_gles(ctx); shProg = _mesa_lookup_shader_program_err(ctx, program, glAttachShader); @@ -1630,8 +1631,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length, if (length != NULL) *length = 0; - (void) binaryFormat; - (void) binary; + size_t size = 0; + char *data = _mesa_program_serialize(shProg, size, MESA_GIT_SHA1); + + /* we have more data that can fit to user given buffer */ + if (size bufSize) { + _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (data) + free(data); + return; + } + + if (data) { + memcpy(binary, data, size); + free(data); + } + + if (length != NULL) + *length = size; + + *binaryFormat = 0; } void GLAPIENTRY @@ -1645,10 +1664,23 @@ _mesa_ProgramBinary(GLuint program, GLenum binaryFormat, if (!shProg) return; - (void) binaryFormat; - (void) binary; - (void) length; - _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (length = 0) + return; + + /* free possible existing data and initialize structure */ + _mesa_free_shader_program_data(ctx, shProg); + _mesa_init_shader_program(ctx, shProg); + + /* fill structure from a binary blob */ + if (_mesa_program_deserialize(shProg, binary, length, MESA_GIT_SHA1)) { Won't using the git-sha1 as a compatibility-criteria cause issues for developers with local changes? I'm not so worried about this for OES_get_program_binary itself, but once the shader-cache is in place it sounds like a potential source of difficult to track down misbehavior... I agree it might be too aggressive criteria but it is hard to come up with better and as simple. That's not my objection. My objection is that this might give headaches for people with local modifications to the glsl-compiler. Local modifications does not affect the git-sha1. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 6/6] mesa: OES_get_program_binary functionality
On Fri, Nov 1, 2013 at 1:08 PM, Tapani Pälli tapani.pa...@intel.com wrote: On 11/01/2013 12:38 PM, Erik Faye-Lund wrote: On Fri, Nov 1, 2013 at 11:35 AM, Tapani Pälli tapani.pa...@intel.com wrote: On 11/01/2013 12:21 PM, Erik Faye-Lund wrote: On Fri, Nov 1, 2013 at 10:16 AM, Tapani Pälli tapani.pa...@intel.com wrote: Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/shaderapi.c | 46 +++--- 1 file changed, 39 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 7da860d..67aee6b 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -57,6 +57,8 @@ #include ../glsl/ir.h #include ../glsl/ir_uniform.h #include ../glsl/program.h +#include ../glsl/ir_cache.h +#include git_sha1.h /** Define this to enable shader substitution (see below) */ #define SHADER_SUBST 0 @@ -212,7 +214,6 @@ attach_shader(struct gl_context *ctx, GLuint program, GLuint shader) struct gl_shader_program *shProg; struct gl_shader *sh; GLuint i, n; - const bool same_type_disallowed = _mesa_is_gles(ctx); shProg = _mesa_lookup_shader_program_err(ctx, program, glAttachShader); @@ -1630,8 +1631,26 @@ _mesa_GetProgramBinary(GLuint program, GLsizei bufSize, GLsizei *length, if (length != NULL) *length = 0; - (void) binaryFormat; - (void) binary; + size_t size = 0; + char *data = _mesa_program_serialize(shProg, size, MESA_GIT_SHA1); + + /* we have more data that can fit to user given buffer */ + if (size bufSize) { + _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (data) + free(data); + return; + } + + if (data) { + memcpy(binary, data, size); + free(data); + } + + if (length != NULL) + *length = size; + + *binaryFormat = 0; } void GLAPIENTRY @@ -1645,10 +1664,23 @@ _mesa_ProgramBinary(GLuint program, GLenum binaryFormat, if (!shProg) return; - (void) binaryFormat; - (void) binary; - (void) length; - _mesa_error(ctx, GL_INVALID_OPERATION, __FUNCTION__); + if (length = 0) + return; + + /* free possible existing data and initialize structure */ + _mesa_free_shader_program_data(ctx, shProg); + _mesa_init_shader_program(ctx, shProg); + + /* fill structure from a binary blob */ + if (_mesa_program_deserialize(shProg, binary, length, MESA_GIT_SHA1)) { Won't using the git-sha1 as a compatibility-criteria cause issues for developers with local changes? I'm not so worried about this for OES_get_program_binary itself, but once the shader-cache is in place it sounds like a potential source of difficult to track down misbehavior... I agree it might be too aggressive criteria but it is hard to come up with better and as simple. That's not my objection. My objection is that this might give headaches for people with local modifications to the glsl-compiler. Local modifications does not affect the git-sha1. For the automatic shader cache this headache could be helped a bit with a environment variable or drirc setting that can be used during development. On the other hand an automatic cache must work in a transparent way so it should be always able to recover when it fails, so one should only see it as 'slower than usual' (since recompilation/relink required) sort of behaviour. The WIP of the automatic cache I sent some time earlier also marked (renamed) these 'problematic' cached shaders so that they can be detected on further runs and cache can ignore those. I agree that it might become problematic, on the other hand it is also easy to just wipe ~/.cache/mesa and disable cache. That won't help for programs that maintain their own (explicit) shader-cache, which was the intention of introducing binary shaders to OpenGL ES in the first place. Not sure if Nvidia or Imagination try to handles these cases with their cache implementations. I would assume they simply piggie-backed on their binary-shader support. But I somehow doubt they have such a lazy approach to binary shaders as this series attempts. I worked on ARM_mali_shader_binary for the Mali-200 drivers myself, and our approach was quite different from this, and surely much more robust. To be honest, I find the whole idea of serializing the IR quite repelling, as it goes against almost every intention of the extension. Add to that mesa's IR not at all being stable, well yeah, that's a super-highway to disaster. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 6/6] mesa: OES_get_program_binary functionality
Sorry for the late reply. On Fri, Nov 1, 2013 at 4:14 PM, Tapani tapani.pa...@intel.com wrote: On 11/01/2013 03:31 PM, Erik Faye-Lund wrote: Won't using the git-sha1 as a compatibility-criteria cause issues for developers with local changes? I'm not so worried about this for OES_get_program_binary itself, but once the shader-cache is in place it sounds like a potential source of difficult to track down misbehavior... I agree it might be too aggressive criteria but it is hard to come up with better and as simple. That's not my objection. My objection is that this might give headaches for people with local modifications to the glsl-compiler. Local modifications does not affect the git-sha1. For the automatic shader cache this headache could be helped a bit with a environment variable or drirc setting that can be used during development. On the other hand an automatic cache must work in a transparent way so it should be always able to recover when it fails, so one should only see it as 'slower than usual' (since recompilation/relink required) sort of behaviour. The WIP of the automatic cache I sent some time earlier also marked (renamed) these 'problematic' cached shaders so that they can be detected on further runs and cache can ignore those. I agree that it might become problematic, on the other hand it is also easy to just wipe ~/.cache/mesa and disable cache. That won't help for programs that maintain their own (explicit) shader-cache, which was the intention of introducing binary shaders to OpenGL ES in the first place. Ok, we are talking about the extension, I thought you referred to the automatic caching. For extension to work, we need at least more Piglit tests to ensure that it does not break. I was actually of talking about both. But for the caching, it's probably more forgivable, as developers probably know they changed the compiler and can step around it by flushing the cache. Especially if the build time gets included, like Pauls suggested. Of course every time you go and touch the code, some functionality may break, be it this extension or something else. That's completely irrelevant here. The rejection mechanism isn't intended to catch bugs, but to detect intentional format changes. So let's not confuse the issue. I'm not sure if Chromium, Qt or other users expect glBinaryProgram call to always succeed, hopefully not. If they do, they're buggy. But there is a chance for that, yes. But I'm actually talking about that they might get *told* that everything went well, and still get a broken shader. Or even a crash. Of course, this only applies when mesa is build with local modifications, but that happens a *lot* while debugging application issues. Perhaps bugs start to disappear, because applications take another non-buggy code-path? It'll probably only affect developers, and not happen in the wild. But I still don't think that's a good excuse. However, by a strict reading of the spec, I don't even yhink we're allowed to break the shaders for just any reason. The wording of the spec is An implementation may reject a program binary if it determines the program binary was produced by an incompatible or outdated version of the compiler. The way I read that, changes that doesn't modify the compiler aren't really allowed to reject previous shaders. While diverging from the spec on this *might* not have many real-world implications, at the very best your solution goes against the intention of this rejection-mechanism. Not sure if Nvidia or Imagination try to handles these cases with their cache implementations. I would assume they simply piggie-backed on their binary-shader support. But I somehow doubt they have such a lazy approach to binary shaders as this series attempts. I worked on ARM_mali_shader_binary for the Mali-200 drivers myself, and our approach was quite different from this, and surely much more robust. With such strong opinion It would be nice to have some more technical explanation. Why it was surely more robust? Easy; it didn't break compatibility for every change to the driver. It didn't even break when we updated the compiler. As a matter of fact, I'm not even aware of *any* compatibility breakages (apart from if you change the hardware, of course), ever. The implementation itself can be likely very different as it targets only a particular GPU while the approach here is meant to be more generic. Our implementation did also support multiple GPUs, however I was only involved in the initial Mali-200 work. But the Mali-200 does not have a unified ISA, so we had to touch on this area anyway. To be honest, I find the whole idea of serializing the IR quite repelling, as it goes against almost every intention of the extension. Add to that mesa's IR not at all being stable, well yeah, that's a super-highway to disaster. Again, I would appreciate a bit more technical approach if possible. I can't tell from
Re: [Mesa-dev] [v2 6/6] mesa: OES_get_program_binary functionality
On Wed, Nov 6, 2013 at 7:49 PM, Paul Berry stereotype...@gmail.com wrote: On 6 November 2013 09:35, Erik Faye-Lund kusmab...@gmail.com wrote: Sorry for the late reply. On Fri, Nov 1, 2013 at 4:14 PM, Tapani tapani.pa...@intel.com wrote: On 11/01/2013 03:31 PM, Erik Faye-Lund wrote: Won't using the git-sha1 as a compatibility-criteria cause issues for developers with local changes? I'm not so worried about this for OES_get_program_binary itself, but once the shader-cache is in place it sounds like a potential source of difficult to track down misbehavior... I agree it might be too aggressive criteria but it is hard to come up with better and as simple. That's not my objection. My objection is that this might give headaches for people with local modifications to the glsl-compiler. Local modifications does not affect the git-sha1. For the automatic shader cache this headache could be helped a bit with a environment variable or drirc setting that can be used during development. On the other hand an automatic cache must work in a transparent way so it should be always able to recover when it fails, so one should only see it as 'slower than usual' (since recompilation/relink required) sort of behaviour. The WIP of the automatic cache I sent some time earlier also marked (renamed) these 'problematic' cached shaders so that they can be detected on further runs and cache can ignore those. I agree that it might become problematic, on the other hand it is also easy to just wipe ~/.cache/mesa and disable cache. That won't help for programs that maintain their own (explicit) shader-cache, which was the intention of introducing binary shaders to OpenGL ES in the first place. Ok, we are talking about the extension, I thought you referred to the automatic caching. For extension to work, we need at least more Piglit tests to ensure that it does not break. I was actually of talking about both. But for the caching, it's probably more forgivable, as developers probably know they changed the compiler and can step around it by flushing the cache. Especially if the build time gets included, like Pauls suggested. Of course every time you go and touch the code, some functionality may break, be it this extension or something else. That's completely irrelevant here. The rejection mechanism isn't intended to catch bugs, but to detect intentional format changes. So let's not confuse the issue. I'm not sure if Chromium, Qt or other users expect glBinaryProgram call to always succeed, hopefully not. If they do, they're buggy. But there is a chance for that, yes. But I'm actually talking about that they might get *told* that everything went well, and still get a broken shader. Or even a crash. Of course, this only applies when mesa is build with local modifications, but that happens a *lot* while debugging application issues. Perhaps bugs start to disappear, because applications take another non-buggy code-path? It'll probably only affect developers, and not happen in the wild. But I still don't think that's a good excuse. However, by a strict reading of the spec, I don't even yhink we're allowed to break the shaders for just any reason. The wording of the spec is An implementation may reject a program binary if it determines the program binary was produced by an incompatible or outdated version of the compiler. The way I read that, changes that doesn't modify the compiler aren't really allowed to reject previous shaders. While diverging from the spec on this *might* not have many real-world implications, at the very best your solution goes against the intention of this rejection-mechanism. We have to regard any bug fix, feature improvement, or performance optimization that's applied to the parser, ast, optimization passes, or lowering passes as constituting a change to the compiler. Not quite. Bug-fixes are fine; if the compiled shader was buggy, you'll have a buggy binary shader, and it should be loaded with those bugs intact. Feature improvements also does not matter: surely, a pre-compiled shader won't use newer features. The same goes for performance optimization; a program that pre-compiled a shader from an old compiler surely won't expect to get a performance boost from new optimization passes. If so, they'll use source shaders instead, and pay the price of re-compiling. This is *exactly* what binary shaders are for. Otherwise when the user upgrades Mesa, any apps that use get_program_binary will fail to get the advantage of the upgraded code. Yes, as I said, that's a part of the contract. As a practical matter, nearly every sMesa release makes changes to one of those components, even point releases. So allowing old program binaries in the event that the compiler hasn't changed really wouldn't produce any end user benefit. Even if *some* modification happened
Re: [Mesa-dev] Anybody still need clear_depth_stencil() or clear_render_target()?
On Mon, Dec 2, 2013 at 11:08 PM, Andreas Hartmetz ahartm...@gmail.com wrote: Hello, I've been lurking for a while, this seems to be my first post. While trying to make some easy (ha) improvements in radeonsi I looked around in all the surrounding code to get a good picture of how things work. So I noticed something: All the Gallium drivers need to implement clear_depth_stencil() and clear_render_target() from pipe_context. Those implementations are generally using little to no acceleration, just making sure that there's any implementation at all. Still, it's quite a few lines over all the drivers. The two methods in question were introduced for Direct3D, which has no in-tree state tracker anymore, and for scissored clearing in OpenGL. But scissored clearing is already implemented as a fallback using quad drawing in the OpenGL state tracker (st_cb_clear.c, clear_with_quad()) in pretty much the same way as the unoptimized paths in drivers. So, the question arises: Does anybody still use those functions, maybe VMware in some out-of-tree code? If not I suggest removing them and I'd send patches to do that. Alternatively, they could be reduced to do just scissored clearing, and the OpenGL state tracker could call that in the hope that some GPU will have an optimized way to do it (how realistic is that? I have not looked at all the GPU drivers...). I looked at doing scissored clears for OpenGL using these functions (see https://github.com/grate-driver/mesa/commit/2cd9c115a64e98c8f3dbd0699023f89ccbc28b38), but when asking on IRC it seemed people wasn't too keen on going down that path. So my next alternative (which I haven't finished) is to simply roll this into the normal clear-methods, (see https://github.com/grate-driver/mesa/tree/scissor-clear-v2 for the first patch, but I'd also want masked clears in there as well). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Gallium3d Documentation
On Tue, Dec 10, 2013 at 1:41 AM, Christopher Corsi cco...@g.clemson.edu wrote: Currently it can do rasterize with TL, texturing, bit-blits, and handle a subset of functionality needed for OpenGL (scissor, blend modes, etc.). Shader support is planned. AFAIK, Gallium3D requires support for vertex and fragment shaders. Is there any documentation available to help, or if not, what would be the most basic implementations to look at in the source? Have a look in src/gallium/docs/, there's some documentation there. On Tue, Dec 10, 2013 at 1:41 AM, Christopher Corsi cco...@g.clemson.edu wrote: Hi All, I'm currently working on a project for teaching driver development using virtual machines. We have developed in QEMU a virtual graphics adapter capable of a reasonable feature set. Note that because it's used in a teaching course, it is not very high-performance, as it is designed to look and act as close to a real device as possible (reasonable bit of overhead, and internally uses OSMesa). Currently it can do rasterize with TL, texturing, bit-blits, and handle a subset of functionality needed for OpenGL (scissor, blend modes, etc.). Shader support is planned. Currently we teach the course with the driver being written 100% from scratch as a character special device and writing user-space libraries manually. I'd like to move away from that, but I don't have any knowledge of newer mesa internals (including writing gallium drivers). Is there any documentation available to help, or if not, what would be the most basic implementations to look at in the source? Thanks, Christopher ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glcpp: error on multiple #else directives
On Fri, Oct 11, 2013 at 10:48 PM, Ian Romanick i...@freedesktop.org wrote: Carl, Can you look at this patch and Erik's follow-up patch? You (still) know the glcpp much better than any of the rest of us. (Carl is currently out of town, so I know his response will be slow...) I guess slow doesn't mean *this* slow? :) Should I resend the patch, with the #elseif fix squashed in? On 09/23/2013 01:35 PM, Erik Faye-Lund wrote: The preprocessor currently eats multiple #else directives int the same #if(def) ... #endif block. While I haven't been able to find anything that explicitly disallows it, it's nonsensical and should probably not be allowed. Add checks to reject the code. --- I'm not entirely sure why parser-skip_stack can be NULL after _glcpp_parser_skip_stack_change_if, but apparently it is for at least one test. I'm also not entirely sure if this should be an error or a warning. Thoughts? src/glsl/glcpp/glcpp-parse.y | 13 - src/glsl/glcpp/glcpp.h| 1 + src/glsl/glcpp/tests/118-multiple-else.c | 6 ++ src/glsl/glcpp/tests/118-multiple-else.c.expected | 8 4 files changed, 27 insertions(+), 1 deletion(-) create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c.expected diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 6eaa5f9..885c64d 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -332,7 +332,17 @@ control_line: } } |HASH_ELSE { - _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, multiple #else); + } + else + { + _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack) + parser-skip_stack-has_else = true; + } } NEWLINE |HASH_ENDIF { _glcpp_parser_skip_stack_pop (parser, @1); @@ -2012,6 +2022,7 @@ _glcpp_parser_skip_stack_push_if (glcpp_parser_t *parser, YYLTYPE *loc, node-type = SKIP_TO_ENDIF; } + node-has_else = false; node-next = parser-skip_stack; parser-skip_stack = node; } diff --git a/src/glsl/glcpp/glcpp.h b/src/glsl/glcpp/glcpp.h index 8aaa551..ccae96c 100644 --- a/src/glsl/glcpp/glcpp.h +++ b/src/glsl/glcpp/glcpp.h @@ -153,6 +153,7 @@ typedef enum skip_type { typedef struct skip_node { skip_type_t type; + bool has_else; YYLTYPE loc; /* location of the initial #if/#elif/... */ struct skip_node *next; } skip_node_t; diff --git a/src/glsl/glcpp/tests/118-multiple-else.c b/src/glsl/glcpp/tests/118-multiple-else.c new file mode 100644 index 000..62ad49c --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c @@ -0,0 +1,6 @@ +#if 0 +#else +int foo; +#else +int bar; +#endif diff --git a/src/glsl/glcpp/tests/118-multiple-else.c.expected b/src/glsl/glcpp/tests/118-multiple-else.c.expected new file mode 100644 index 000..eaec481 --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c.expected @@ -0,0 +1,8 @@ +0:4(1): preprocessor error: multiple #else + + +int foo; + +int bar; + + ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] glcpp: error on multiple #else/#elif directives
The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. --- Here's a resend of an older patch (original 1379968503-30246-1-git-send-email-kusmab...@gmail.com), this time with a better commit message, and a similar treatment for elif-after-else. src/glsl/glcpp/glcpp-parse.y | 23 +- src/glsl/glcpp/glcpp.h | 1 + src/glsl/glcpp/tests/118-multiple-else.c | 6 ++ src/glsl/glcpp/tests/118-multiple-else.c.expected | 8 src/glsl/glcpp/tests/119-elif-after-else.c | 6 ++ .../glcpp/tests/119-elif-after-else.c.expected | 8 6 files changed, 51 insertions(+), 1 deletion(-) create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c create mode 100644 src/glsl/glcpp/tests/118-multiple-else.c.expected create mode 100644 src/glsl/glcpp/tests/119-elif-after-else.c create mode 100644 src/glsl/glcpp/tests/119-elif-after-else.c.expected diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 7edc274..451b728 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -310,6 +310,11 @@ control_line: _glcpp_parser_expand_and_lex_from (parser, ELIF_EXPANDED, $2); } + else if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, #elif after #else); + } else { _glcpp_parser_skip_stack_change_if (parser, @1, @@ -324,6 +329,11 @@ control_line: { glcpp_error( @1, parser, #elif with no expression); } + else if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, #elif after #else); + } else { _glcpp_parser_skip_stack_change_if (parser, @1, @@ -332,7 +342,17 @@ control_line: } } | HASH_ELSE { - _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack + parser-skip_stack-has_else) + { + glcpp_error( @1, parser, multiple #else); + } + else + { + _glcpp_parser_skip_stack_change_if (parser, @1, else, 1); + if (parser-skip_stack) + parser-skip_stack-has_else = true; + } } NEWLINE | HASH_ENDIF { _glcpp_parser_skip_stack_pop (parser, @1); @@ -2024,6 +2044,7 @@ _glcpp_parser_skip_stack_push_if (glcpp_parser_t *parser, YYLTYPE *loc, node-type = SKIP_TO_ENDIF; } + node-has_else = false; node-next = parser-skip_stack; parser-skip_stack = node; } diff --git a/src/glsl/glcpp/glcpp.h b/src/glsl/glcpp/glcpp.h index 8aaa551..ccae96c 100644 --- a/src/glsl/glcpp/glcpp.h +++ b/src/glsl/glcpp/glcpp.h @@ -153,6 +153,7 @@ typedef enum skip_type { typedef struct skip_node { skip_type_t type; + bool has_else; YYLTYPE loc; /* location of the initial #if/#elif/... */ struct skip_node *next; } skip_node_t; diff --git a/src/glsl/glcpp/tests/118-multiple-else.c b/src/glsl/glcpp/tests/118-multiple-else.c new file mode 100644 index 000..62ad49c --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c @@ -0,0 +1,6 @@ +#if 0 +#else +int foo; +#else +int bar; +#endif diff --git a/src/glsl/glcpp/tests/118-multiple-else.c.expected b/src/glsl/glcpp/tests/118-multiple-else.c.expected new file mode 100644 index 000..eaec481 --- /dev/null +++ b/src/glsl/glcpp/tests/118-multiple-else.c.expected @@ -0,0 +1,8 @@ +0:4(1): preprocessor error: multiple #else + + +int foo; + +int bar; + + diff --git a/src/glsl/glcpp/tests/119-elif-after-else.c b/src/glsl/glcpp/tests/119-elif-after-else.c new file mode 100644 index 000..9b9e923 --- /dev/null +++ b/src/glsl/glcpp/tests/119-elif-after-else.c @@ -0,0 +1,6 @@ +#if 0 +#else +int foo; +#elif 0 +int bar; +#endif diff --git a/src/glsl/glcpp/tests/119-elif-after-else.c.expected b/src/glsl/glcpp/tests/119-elif-after-else.c.expected new file mode 100644 index 000..33f0513 --- /dev/null +++ b/src/glsl/glcpp/tests/119-elif-after-else.c.expected @@ -0,0 +1,8 @@ +0:4(1): preprocessor error: #elif after #else + + +int foo; + +int
Re: [Mesa-dev] [PATCH 0/2] code de-duplication and non-pci support
On Sat, Dec 14, 2013 at 8:28 PM, Rob Clark robdcl...@gmail.com wrote: From: Rob Clark robcl...@freedesktop.org It seems that over time, code related to finding driver name, dealing with pci-id table, etc, has been copy/pasted everywhere it was needed. Which is lame. And annoying if you have a device which is not pci. This patchset refactors it out into a simple loader util lib which is statically linked wherever it is needed. Perhaps there is some room for sharing other util bits (like _eglLog, perhaps) later. And once all this code is collected in one place, the 2nd patch only has to fix one place to add support for platform devices ;-) Rob Clark (2): loader: refactor duplicated code into loader util lib loader: fallback to drmGetVersion() for non-pci devices Nice. I've replaced Thierry's patch that does the latter, but only for GBM, and the result still works fine. So: Tested-by: Erik Faye-Lund kusmab...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] radeon: Add bo statistics dumping support
On Wed, Jan 1, 2014 at 3:57 PM, Lauri Kasanen c...@gmx.com wrote: diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 28921be..2c33ddf 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -209,6 +216,24 @@ bool r600_common_screen_init(struct r600_common_screen *rscreen, return false; } + if (rscreen-debug_flags DBG_BO_STATS) { + char statsfile[80]; + const pid_t pid = getpid(); + +#ifdef __GLIBC__ + snprintf(statsfile, 80, /tmp/bostats.%u.%s, pid, program_invocation_short_name); +#else + snprintf(statsfile, 80, /tmp/bostats.%u, pid); +#endif + + rscreen-ws-bo_stats_file = fopen(statsfile, w); + if (!rscreen-ws-bo_stats_file) + fprintf(stderr, Failed to open bo stats file %s\n, statsfile); + else + fprintf(rscreen-ws-bo_stats_file, Started at %llu\n, + (unsigned long long) os_time_get_nano() / 100); stats_time_get()? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] radeon: Add bo statistics dumping support
On Fri, Jan 3, 2014 at 4:49 PM, Lauri Kasanen c...@gmx.com wrote: No measurable overhead when off (glxgears within 0.5%). Signed-off-by: Lauri Kasanen c...@gmx.com --- src/gallium/drivers/radeon/r600_pipe_common.c | 32 +++ src/gallium/drivers/radeon/r600_pipe_common.h | 1 + src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 17 ++ src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 9 src/gallium/winsys/radeon/drm/radeon_winsys.h | 6 + 5 files changed, 65 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 28921be..121aa49 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -26,11 +26,18 @@ #include r600_pipe_common.h #include r600_cs.h +#include ../../winsys/radeon/drm/radeon_drm_winsys.h +#include os/os_time.h #include tgsi/tgsi_parse.h #include util/u_format_s3tc.h #include util/u_upload_mgr.h #include inttypes.h +#ifdef __GLIBC__ +#define _GNU_SOURCE +#include errno.h +#endif + static const struct debug_named_value common_debug_options[] = { /* logging */ { tex, DBG_TEX, Print texture info }, @@ -38,6 +45,7 @@ static const struct debug_named_value common_debug_options[] = { { compute, DBG_COMPUTE, Print compute info }, { vm, DBG_VM, Print virtual addresses when creating resources }, { trace_cs, DBG_TRACE_CS, Trace cs and write rlockup_csid.c file with faulty cs }, + { bostats, DBG_BO_STATS, Write bo statistics to /tmp/bostats.pid[.name] }, /* shaders */ { fs, DBG_FS, Print fetch shaders }, @@ -209,6 +217,24 @@ bool r600_common_screen_init(struct r600_common_screen *rscreen, return false; } + if (rscreen-debug_flags DBG_BO_STATS) { + char statsfile[80]; + const pid_t pid = getpid(); + +#ifdef __GLIBC__ + snprintf(statsfile, 80, /tmp/bostats.%u.%s, pid, program_invocation_short_name); +#else + snprintf(statsfile, 80, /tmp/bostats.%u, pid); +#endif + + rscreen-ws-bo_stats_file = fopen(statsfile, w); + if (!rscreen-ws-bo_stats_file) + fprintf(stderr, Failed to open bo stats file %s\n, statsfile); + else + fprintf(rscreen-ws-bo_stats_file, Started at %llu\n, + stats_time_get(ws)); Another nit: the other stats seems to use @%llu instead of at %llu. But looking a bit more about this, since all code-sites for stat_time_get() seems to be variations of: if (ws-bo_stats_file) { fprintf(..., stats_time_get(ws)); } ...perhaps stats_time_get is a too low-level abstraction? I'm not saying it is, just asking :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/9] tgsi: implement micro_fma() in tgsi exec machine
On Sun, Jan 5, 2014 at 12:42 AM, Maxence Le Doré maxence.led...@gmail.com wrote: From: Maxence Le Doré Maxence Le Doré Huh, user.email not configured properly? static void +micro_fma(union tgsi_exec_channel *dst, + const union tgsi_exec_channel *src0, + const union tgsi_exec_channel *src1, + const union tgsi_exec_channel *src2) +{ + dst-f[0] = fmaf(src0-f[0],src1-f[0],src2-f[0]); fmaf isn't portable enough, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: rename min(), max() functions to fix MSVC build
On Tue, Jan 7, 2014 at 12:12 AM, Brian Paul bri...@vmware.com wrote: Evidently, there's some other definition of min and max that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Wouldn't it be easier to just make sure NOMINMAX is defined before including any system-headers? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: rename min(), max() functions to fix MSVC build
On Tue, Jan 7, 2014 at 3:59 PM, Brian Paul bri...@vmware.com wrote: On 01/06/2014 05:22 PM, Erik Faye-Lund wrote: On Tue, Jan 7, 2014 at 12:12 AM, Brian Paul bri...@vmware.com wrote: Evidently, there's some other definition of min and max that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Wouldn't it be easier to just make sure NOMINMAX is defined before including any system-headers? I wasn't aware of that option. I think I'll just leave things as-is for now to avoid churn. Why was this pushed out prematurely, then? Reviewing within two hours seems to be reasonable, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: rename min(), max() functions to fix MSVC build
On Tue, Jan 7, 2014 at 4:11 PM, Brian Paul bri...@vmware.com wrote: On 01/07/2014 08:03 AM, Erik Faye-Lund wrote: Why was this pushed out prematurely, then? Reviewing within two hours seems to be reasonable, no? I got Kenneth's R-b so I pushed it. I saw no reason to wait. Also, I don't like build breaks to linger any longer than necessary. Fair enough, thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/9] tgsi: implement micro_fma() in tgsi exec machine
On Tue, Jan 7, 2014 at 6:46 PM, Maxence Le Doré maxence.led...@gmail.com wrote: Indeed, this patch compiles on linux but may break complilation at least on MSVC version 1400. I'm planing to install visual studio 2005 on a ms xp or seven virtual machine to check the behavior when compiling in microsoft windows environment. I can save you the trouble: I have MSVC 2008 installed, and it does not support fmaf. There's an effort to implement it in Cygwin here: http://www.cygwin.com/ml/libc-alpha/2010-10/msg7.html However, it depends on fetestexcept(), which is also not supported. Another implementation I found was this, which claims to be exact: http://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/fmaf.c There's also an implementation in gnulib, but it looks quite complicated to me. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] llvmpipe: Honour pipe_rasterizer::point_quad_rasterization.
On Wed, Jan 8, 2014 at 6:27 PM, jfons...@vmware.com wrote: } - + + if (0) { + debug_printf( bbox: (%i, %i) - (%i, %i)\n, + bbox.x0, bbox.y0, + bbox.x1, bbox.y1); + } + if (!u_rect_test_intersection(setup-draw_regions[viewport_index], bbox)) { if (0) debug_printf(offscreen\n); LP_COUNT(nr_culled_tris); This part should probably be dropped, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] demos/pixeltest: Adapt the example for points too.
On Wed, Jan 8, 2014 at 6:30 PM, jfons...@vmware.com wrote: From: José Fonseca jfons...@vmware.com This example is very useful to understand the rasterization of lines. And with this change, points too. Adding a key / command-line option to switch modes is left for a future opportunity though. --- src/demos/pixeltest.c | 109 -- 1 file changed, 69 insertions(+), 40 deletions(-) diff --git a/src/demos/pixeltest.c b/src/demos/pixeltest.c index 0898290..0b6a580 100644 --- a/src/demos/pixeltest.c +++ b/src/demos/pixeltest.c @@ -31,6 +31,10 @@ float width = 1.0f; #define SIZE 128/* of non-zoomed drawing region */ #define ZOOM 32 /* scale factor for zooming */ +// TODO: Allow to switch mode via key and/or command-line option. +//GLenum mode = GL_POINT; +GLenum mode = GL_LINE; + C++/C99 comments in source-file normally using traditional C comments. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: add bits for clipping points as tris (d3d-style)
On Fri, Jan 10, 2014 at 10:33 AM, Lauri Kasanen c...@gmx.com wrote: On Fri, 10 Jan 2014 03:57:45 +0100 srol...@vmware.com wrote: From: Roland Scheidegger srol...@vmware.com OpenGL does whole-point clipping, that is a large point is either fully clipped or fully unclipped (the latter means it may extend beyond the viewport as long as the center is inside the viewport). d3d9 (d3d10 has no large points) however requires points to be clipped after they are expanded to a rectangle. (Note some IHVs are known to ignore GL rules at least with some hw/drivers.) Hence add a rasterizer bit indicating which way points should be clipped (some drivers probably will always ignore this), and add the draw interaction this requires. Drivers wanting to support this and using draw must support large points on their own as draw doesn't implement vp clipping on the expanded points (it potentially could but the complexity doesn't seem warranted), and the driver needs to do viewport scissoring on such points. Is there any possibility of this being exposed to user-space? With that hat on, it would have been useful to me several times in the past. Perhaps whoever was Mesa's ARB contact could propose an extension? I know nothing about how that happens, just would like this functionality. IIRC there were some draft by NVIDIA discussed in Khronos OpenGL ES meetings 6-7 years ago, but I don't think it ever resulted in a released extension spec. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] gallium endianness and hw drivers
On Wed, Jan 15, 2014 at 7:07 AM, Michel Dänzer mic...@daenzer.net wrote: On Die, 2014-01-14 at 00:22 +0100, Marek Olšák wrote: I think the format conversion functions should look like: #ifdef BIG_ENDIAN case PIPE_FORMAT_A8B8G8R8_UNORM: return hw_format_for_R8G8B8A8_UNORM; ... #else case PIPE_FORMAT_R8G8B8A8_UNORM: return hw_format_for_R8G8B8A8_UNORM; #endif which can be simplified to: case PIPE_FORMAT_RGBA_UNORM: return hw_format_for_R8G8B8A8_UNORM; So that the GPU can see the same formats, but they are different for the CPU. What do you think? That might be an option, but PIPE_FORMAT_R8G8B8A8_UNORM is useful on big endian hosts as well, e.g. it matches GL_RGBA / GL_UNSIGNED_BYTE exactly. Well, there's the added complication that GL_RGBA / GL_UNSIGNED_BYTE can be specified with an unaligned pointer on architectures that does not support unaligned reads (e.g ARM), and packed formats cannot. And then there's GL_RGB / GL_UNSIGNED_BYTE which further messes the treat-array-formats-as-if-they-were-packed strategy. I think one issue here is that we're trying to use a single format attribute for several purposes (API visible representation, GPU/driver internal representation). Indeed, and if we stopped doing that, the issues above should be solvable. I've actually been involved in the successful shipping of an OpenGL ES 2.0 implementation written for little-endian machines, on a big-endian machine. But we had the benefit of being able to make hardware-modifications, which made this a whole lot easier. IIRC, we just made the texture-fetches and pixel-writes use native endianess. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [V3 PATCH 0/8] mesa: Naming MESA_FORMATs to a specification
On Fri, Jan 17, 2014 at 9:22 AM, Christian König deathsim...@vodafone.de wrote: But as a general note on writing patches: Please limit the subject line to a sane length! Something between 60 and 80 chars should be ok. Indeed. The regex patterns are nice to have. But please, put them in the *body* of the commit message. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/6] glsl: Optimize log(exp(x)) and exp(log(x)) into x.
On Mon, Jan 20, 2014 at 8:18 AM, Eric Anholt e...@anholt.net wrote: --- src/glsl/opt_algebraic.cpp | 36 1 file changed, 36 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 6b0d992..4aa49e5 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -252,6 +252,42 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) } break; + case ir_unop_exp: + if (op_expr[0] == NULL) +break; How can this happen?! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize ~~x into x.
On Mon, Jan 20, 2014 at 8:18 AM, Eric Anholt e...@anholt.net wrote: --- src/glsl/opt_algebraic.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 332f0b7..6b0d992 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -218,6 +218,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) this-mem_ctx = ralloc_parent(ir); switch (ir-operation) { + case ir_unop_bit_not: + if (op_expr[0] == NULL) +break; How can this happen?! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/6] glsl: Optimize log(exp(x)) and exp(log(x)) into x.
On Mon, Jan 20, 2014 at 5:39 PM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Mon, Jan 20, 2014 at 8:18 AM, Eric Anholt e...@anholt.net wrote: --- src/glsl/opt_algebraic.cpp | 36 1 file changed, 36 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 6b0d992..4aa49e5 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -252,6 +252,42 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) } break; + case ir_unop_exp: + if (op_expr[0] == NULL) +break; How can this happen?! If the first operand itself isn't an expression (for example, it's a swizzle of something else). But what is something else? The GLSL grammar defines the arguments to be assignment_expressions, so they should always be expressions, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd()
On Mon, Jan 27, 2014 at 5:13 PM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com The caller can use this boolean to parameter to tell the pipe-loader Boolean to parameter? A superfluous to, perhaps? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/tgsi: correct typo propagated from NV_vertex_program1_1
In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F80 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I've also e-mailed NVIDIA's contact as listed in the specification, but I haven't gotten any reply yet. I think it's pretty obvious, though. I seriously doubt anyone implemented the old limit, as it would require quite a bit more gates AFAIK. src/gallium/auxiliary/tgsi/tgsi_exec.c | 8 src/gallium/docs/source/tgsi.rst | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 3d37eaa..7d35aeb 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -915,8 +915,8 @@ micro_rcc(union tgsi_exec_channel *dst, float recip = 1.0f / src-f[i]; if (recip 0.0f) { - if (recip 1.884467e+019f) { -dst-f[i] = 1.884467e+019f; + if (recip 1.84467e+019f) { +dst-f[i] = 1.84467e+019f; } else if (recip 5.42101e-020f) { dst-f[i] = 5.42101e-020f; @@ -926,8 +926,8 @@ micro_rcc(union tgsi_exec_channel *dst, } } else { - if (recip -1.884467e+019f) { -dst-f[i] = -1.884467e+019f; + if (recip -1.84467e+019f) { +dst-f[i] = -1.84467e+019f; } else if (recip -5.42101e-020f) { dst-f[i] = -5.42101e-020f; diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 0501aca..be42572 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -424,7 +424,7 @@ XXX cleanup on aisle three .. math:: - dst = (1 / src.x) 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020) + dst = (1 / src.x) 0 ? clamp(1 / src.x, 5.42101e-020, 1.84467e+019) : clamp(1 / src.x, -1.84467e+019, -5.42101e-020) .. opcode:: DPH - Homogeneous Dot Product -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] gallium/tgsi: use CLAMP instead of open-coded clamps
Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 26 -- 1 file changed, 4 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 3d37eaa..96809cd 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -914,28 +914,10 @@ micro_rcc(union tgsi_exec_channel *dst, for (i = 0; i 4; i++) { float recip = 1.0f / src-f[i]; - if (recip 0.0f) { - if (recip 1.884467e+019f) { -dst-f[i] = 1.884467e+019f; - } - else if (recip 5.42101e-020f) { -dst-f[i] = 5.42101e-020f; - } - else { -dst-f[i] = recip; - } - } - else { - if (recip -1.884467e+019f) { -dst-f[i] = -1.884467e+019f; - } - else if (recip -5.42101e-020f) { -dst-f[i] = -5.42101e-020f; - } - else { -dst-f[i] = recip; - } - } + if (recip 0.0f) + dst-f[i] = CLAMP(recip, 5.42101e-020f, 1.884467e+019f); + else + dst-f[i] = CLAMP(recip, -1.884467e+019f, -5.42101e-020f); } } -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] gallium/tgsi: correct typo propagated from NV_vertex_program1_1
In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F80 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 4 ++-- src/gallium/docs/source/tgsi.rst | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 96809cd..55da60a 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -915,9 +915,9 @@ micro_rcc(union tgsi_exec_channel *dst, float recip = 1.0f / src-f[i]; if (recip 0.0f) - dst-f[i] = CLAMP(recip, 5.42101e-020f, 1.884467e+019f); + dst-f[i] = CLAMP(recip, 5.42101e-020f, 1.84467e+019f); else - dst-f[i] = CLAMP(recip, -1.884467e+019f, -5.42101e-020f); + dst-f[i] = CLAMP(recip, -1.84467e+019f, -5.42101e-020f); } } diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 0501aca..be42572 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -424,7 +424,7 @@ XXX cleanup on aisle three .. math:: - dst = (1 / src.x) 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020) + dst = (1 / src.x) 0 ? clamp(1 / src.x, 5.42101e-020, 1.84467e+019) : clamp(1 / src.x, -1.84467e+019, -5.42101e-020) .. opcode:: DPH - Homogeneous Dot Product -- 1.8.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/tgsi: correct typo propagated from NV_vertex_program1_1
On Thu, Feb 6, 2014 at 5:30 PM, Brian Paul bri...@vmware.com wrote: On 02/06/2014 09:09 AM, Erik Faye-Lund wrote: In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F80 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I've also e-mailed NVIDIA's contact as listed in the specification, but I haven't gotten any reply yet. I think it's pretty obvious, though. I seriously doubt anyone implemented the old limit, as it would require quite a bit more gates AFAIK. src/gallium/auxiliary/tgsi/tgsi_exec.c | 8 src/gallium/docs/source/tgsi.rst | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 3d37eaa..7d35aeb 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -915,8 +915,8 @@ micro_rcc(union tgsi_exec_channel *dst, float recip = 1.0f / src-f[i]; if (recip 0.0f) { - if (recip 1.884467e+019f) { -dst-f[i] = 1.884467e+019f; + if (recip 1.84467e+019f) { +dst-f[i] = 1.84467e+019f; } else if (recip 5.42101e-020f) { dst-f[i] = 5.42101e-020f; @@ -926,8 +926,8 @@ micro_rcc(union tgsi_exec_channel *dst, } } else { - if (recip -1.884467e+019f) { -dst-f[i] = -1.884467e+019f; + if (recip -1.84467e+019f) { +dst-f[i] = -1.84467e+019f; } else if (recip -5.42101e-020f) { dst-f[i] = -5.42101e-020f; As long as you're at it, this code should probably be rewritten in terms of the CLAMP() macro to be more concise. Good point. I'll do that in a preceding clean-up patch, though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] gallium/tgsi: use CLAMP instead of open-coded clamps
On Fri, Feb 7, 2014 at 4:15 PM, Brian Paul bri...@vmware.com wrote: On 02/07/2014 05:45 AM, Erik Faye-Lund wrote: Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- src/gallium/auxiliary/tgsi/tgsi_exec.c | 26 -- 1 file changed, 4 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 3d37eaa..96809cd 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -914,28 +914,10 @@ micro_rcc(union tgsi_exec_channel *dst, for (i = 0; i 4; i++) { float recip = 1.0f / src-f[i]; - if (recip 0.0f) { - if (recip 1.884467e+019f) { -dst-f[i] = 1.884467e+019f; - } - else if (recip 5.42101e-020f) { -dst-f[i] = 5.42101e-020f; - } - else { -dst-f[i] = recip; - } - } - else { - if (recip -1.884467e+019f) { -dst-f[i] = -1.884467e+019f; - } - else if (recip -5.42101e-020f) { -dst-f[i] = -5.42101e-020f; - } - else { -dst-f[i] = recip; - } - } + if (recip 0.0f) + dst-f[i] = CLAMP(recip, 5.42101e-020f, 1.884467e+019f); + else + dst-f[i] = CLAMP(recip, -1.884467e+019f, -5.42101e-020f); } } For both, Reviewed-by: Brian Paul bri...@vmware.com Thanks. Do you need me to push these for you? I don't have push access, so yeah. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Flatland
On Fri, Feb 7, 2014 at 6:34 AM, Connor Abbott cwabbo...@gmail.com wrote: - It turns out that the original advantage of a tree-based IR, to be able to automatically generate pattern-matching code for optimizing certain code patterns, only really matters for CPU's with weird instruction sets with lots of exotic instructions; GPU's tend to be pretty regular and consistent in their ISA's, so being able to pattern-match with trees doesn't help us much here. From my understanding this point in Ian's talk wasn't really about optimizations, but rather about instruction selection. I know at least one shader compiler architecture where pattern matching optimizations have paid off quite well, but not necessarily for a tree-based ISA. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/tgsi: correct typo propagated from NV_vertex_program1_1
On Thu, Feb 6, 2014 at 5:09 PM, Erik Faye-Lund kusmab...@gmail.com wrote: In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F80 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I've also e-mailed NVIDIA's contact as listed in the specification, but I haven't gotten any reply yet. I just thought I'd follow up now that I got a reply: my suspicion was indeed right, and they'll push out an updated spec at some point soon-ish. So yeah, this was the right thing to do. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/20] automake: silence folder creation
On Tue, Mar 4, 2014 at 10:12 PM, Emil Velikov emil.l.veli...@gmail.com wrote: There is little gain in printing whenever a folder is created. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/gallium/auxiliary/Makefile.am | 8 src/glsl/Makefile.am | 4 ++-- src/mesa/Makefile.am | 4 ++-- src/mesa/drivers/dri/Makefile.am | 2 +- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/gallium/auxiliary/Makefile.am b/src/gallium/auxiliary/Makefile.am index 2d2d8d4..afeeef9 100644 --- a/src/gallium/auxiliary/Makefile.am +++ b/src/gallium/auxiliary/Makefile.am @@ -32,17 +32,17 @@ libgallium_la_SOURCES += \ endif indices/u_indices_gen.c: $(srcdir)/indices/u_indices_gen.py - $(MKDIR_P) indices + @$(MKDIR_P) indices $(AM_V_GEN) $(PYTHON2) $ $@ I tend to prefer $(AM_V_at) in cases like this, because it makes it show up again in verbose builds. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/20] automake: make install-lib-links less chatty
On Tue, Mar 4, 2014 at 10:12 PM, Emil Velikov emil.l.veli...@gmail.com wrote: There is little point in echoing everything that the script does to stdout. Wrap it in AM_V_GEN so that a reasonable message is printed as a indication of it's invocation. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- install-lib-links.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/install-lib-links.mk b/install-lib-links.mk index 73d9e14..9dd4c30 100644 --- a/install-lib-links.mk +++ b/install-lib-links.mk @@ -4,7 +4,7 @@ all-local : .libs/install-mesa-links .libs/install-mesa-links : $(lib_LTLIBRARIES) - $(MKDIR_P) $(top_builddir)/$(LIB_DIR) + $(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \ GEN for a directory, really? Why not $(AM_V_at)? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/20] automake: make install-lib-links less chatty
On Wed, Mar 5, 2014 at 12:42 PM, Emil Velikov emil.l.veli...@gmail.com wrote: On 05/03/14 10:13, Erik Faye-Lund wrote: On Tue, Mar 4, 2014 at 10:12 PM, Emil Velikov emil.l.veli...@gmail.com wrote: There is little point in echoing everything that the script does to stdout. Wrap it in AM_V_GEN so that a reasonable message is printed as a indication of it's invocation. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- install-lib-links.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/install-lib-links.mk b/install-lib-links.mk index 73d9e14..9dd4c30 100644 --- a/install-lib-links.mk +++ b/install-lib-links.mk @@ -4,7 +4,7 @@ all-local : .libs/install-mesa-links .libs/install-mesa-links : $(lib_LTLIBRARIES) - $(MKDIR_P) $(top_builddir)/$(LIB_DIR) + $(AM_V_GEN)$(MKDIR_P) $(top_builddir)/$(LIB_DIR); \ GEN for a directory, really? Why not $(AM_V_at)? Notice the trailing backspace, which will wrap the whole script in GEN. One could use AM_V_at for the folder and AM_V_GEN (or other) for the link/copy. IMHO the latter may be an overkill but meh... Right, I missed that, thanks. Makes sense to me, and I agree that it's overkill. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On Mon, Mar 10, 2014 at 11:54 PM, Matt Turner matts...@gmail.com wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + Is this safe? Since many GPUs implement pow(x, y) as exp2(log2(x) * y), this will give different results for if y comes from a uniform vs if it's a constant, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On Tue, Mar 11, 2014 at 2:50 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 10, 2014 at 11:54 PM, Matt Turner matts...@gmail.com wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + Is this safe? Since many GPUs implement pow(x, y) as exp2(log2(x) * y), this will give different results for if y comes from a uniform vs if it's a constant, no? To be a bit more clear: I don't think this is valid for expressions writing to variables marked as invariant (or expressions taking part in the calculations that leads up to invariant variable writes). I can't find anything allowing variance like this in the invariance section of the GLSL 3.30 specifications. In particular, the list following To guarantee invariance of a particular output variable across two programs, the following must also be true doesn't seem to require the values to be passed from the same source, only that the same values are passed. And in this case, the value 2.0 is usually exactly representable no matter what path it took there. Perhaps I'm being a bit too pedantic here, though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On Tue, Mar 11, 2014 at 7:27 PM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Tue, Mar 11, 2014 at 2:50 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 10, 2014 at 11:54 PM, Matt Turner matts...@gmail.com wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + Is this safe? Since many GPUs implement pow(x, y) as exp2(log2(x) * y), this will give different results for if y comes from a uniform vs if it's a constant, no? Yes, but that wouldn't be covered by the invariant keyword. To be a bit more clear: I don't think this is valid for expressions writing to variables marked as invariant (or expressions taking part in the calculations that leads up to invariant variable writes). I can't find anything allowing variance like this in the invariance section of the GLSL 3.30 specifications. In particular, the list following To guarantee invariance of a particular output variable across two programs, the following must also be true doesn't seem to require the values to be passed from the same source, only that the same values are passed. And in this case, the value 2.0 is usually exactly representable no matter what path it took there. Perhaps I'm being a bit too pedantic here, though. This file would do the same thing on the same expression tree in two different programs, so invariant is fine (we've probably got other problems with invariant, though). The keyword you're probably thinking of is precise, which isn't in GLSL we implement yet. Are you saying that this only rewrites x = pow(y, 2.0) and not const float e = 2.0; x = pow(y, e);? If so, my point is moot, indeed. But if that's *not* the case, then I think we're in trouble still. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On Wed, Mar 12, 2014 at 12:00 AM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Tue, Mar 11, 2014 at 7:27 PM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Tue, Mar 11, 2014 at 2:50 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 10, 2014 at 11:54 PM, Matt Turner matts...@gmail.com wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + Is this safe? Since many GPUs implement pow(x, y) as exp2(log2(x) * y), this will give different results for if y comes from a uniform vs if it's a constant, no? Yes, but that wouldn't be covered by the invariant keyword. To be a bit more clear: I don't think this is valid for expressions writing to variables marked as invariant (or expressions taking part in the calculations that leads up to invariant variable writes). I can't find anything allowing variance like this in the invariance section of the GLSL 3.30 specifications. In particular, the list following To guarantee invariance of a particular output variable across two programs, the following must also be true doesn't seem to require the values to be passed from the same source, only that the same values are passed. And in this case, the value 2.0 is usually exactly representable no matter what path it took there. Perhaps I'm being a bit too pedantic here, though. This file would do the same thing on the same expression tree in two different programs, so invariant is fine (we've probably got other problems with invariant, though). The keyword you're probably thinking of is precise, which isn't in GLSL we implement yet. Are you saying that this only rewrites x = pow(y, 2.0) and not const float e = 2.0; x = pow(y, e);? If so, my point is moot, indeed. But if that's *not* the case, then I think we're in trouble still. The second would also get rewritten, because other passes will move the 2.0 into the pow. I thought I understood your objection, but now I don't. I think you'll have to lay out the pair of shaders involving the invariant keyword that you think that would be broken by this pass. My understanding is that ---8--- invariant varying float v; attribute float a; const float e = 2.0; void main() { v = pow(a, e); } ---8--- and ---8--- invariant varying float v; attribute float a; uniform float e; void main() { v = pow(a, e); } ---8--- ...should produce the exact same result, as long as the latter is passed 2.0 as the uniform e. Because v is marked as invariant, the expressions writing to v are the same, and the values passed in are the same. If we rewrite the first one to do a * a, we get a different result on implementations that do exp2(log2(a) * 2.0) for the latter, due to floating-point normalization in the intermediate steps. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] glsl: Optimize pow(x, 2) into x * x.
On Wed, Mar 12, 2014 at 1:32 AM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Wed, Mar 12, 2014 at 12:00 AM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Tue, Mar 11, 2014 at 7:27 PM, Eric Anholt e...@anholt.net wrote: Erik Faye-Lund kusmab...@gmail.com writes: On Tue, Mar 11, 2014 at 2:50 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Mar 10, 2014 at 11:54 PM, Matt Turner matts...@gmail.com wrote: Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. --- src/glsl/opt_algebraic.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 5c49a78..8494bd9 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_vec_two(op_const[0])) return expr(ir_unop_exp2, ir-operands[1]); + if (is_vec_two(op_const[1])) { + ir_variable *x = new(ir) ir_variable(ir-operands[1]-type, x, + ir_var_temporary); + base_ir-insert_before(x); + base_ir-insert_before(assign(x, ir-operands[0])); + return mul(x, x); + } + Is this safe? Since many GPUs implement pow(x, y) as exp2(log2(x) * y), this will give different results for if y comes from a uniform vs if it's a constant, no? Yes, but that wouldn't be covered by the invariant keyword. To be a bit more clear: I don't think this is valid for expressions writing to variables marked as invariant (or expressions taking part in the calculations that leads up to invariant variable writes). I can't find anything allowing variance like this in the invariance section of the GLSL 3.30 specifications. In particular, the list following To guarantee invariance of a particular output variable across two programs, the following must also be true doesn't seem to require the values to be passed from the same source, only that the same values are passed. And in this case, the value 2.0 is usually exactly representable no matter what path it took there. Perhaps I'm being a bit too pedantic here, though. This file would do the same thing on the same expression tree in two different programs, so invariant is fine (we've probably got other problems with invariant, though). The keyword you're probably thinking of is precise, which isn't in GLSL we implement yet. Are you saying that this only rewrites x = pow(y, 2.0) and not const float e = 2.0; x = pow(y, e);? If so, my point is moot, indeed. But if that's *not* the case, then I think we're in trouble still. The second would also get rewritten, because other passes will move the 2.0 into the pow. I thought I understood your objection, but now I don't. I think you'll have to lay out the pair of shaders involving the invariant keyword that you think that would be broken by this pass. My understanding is that ---8--- invariant varying float v; attribute float a; const float e = 2.0; void main() { v = pow(a, e); } ---8--- and ---8--- invariant varying float v; attribute float a; uniform float e; void main() { v = pow(a, e); } ---8--- ...should produce the exact same result, as long as the latter is passed 2.0 as the uniform e. Because v is marked as invariant, the expressions writing to v are the same, and the values passed in are the same. If we rewrite the first one to do a * a, we get a different result on implementations that do exp2(log2(a) * 2.0) for the latter, due to floating-point normalization in the intermediate steps. I don't think that's what the spec authors intended from the keyword. I think what they intended was that if you had uniform float e in both cases, but different code for setting *other* lvalues, that you'd still get the same result in v. I think that *might* be correct, but it doesn't seem to be what's actually defined. Invariance comes from ESSL, and FWIW, I was one of the spec authors in this case. But I don't remember the details down to this level, and the spec doesn't seem to clarify either. However, since constant expressions are given some slack wrt how it's evaluated, I'm inclined to think that you're right about the spirit of the spec. AFAIR, we introduced invariant to get rid of ftransform (since we'd already gotten rid of fixed-function state), so that multi-pass rendering algorthms could be guaranteed that all passes ended up covering the exact same fragments at the exact same depth coordinate. And in cases like that, the inputs would really be of the same kind all the time, I guess. So yeah, perhaps. But I wouldn't feel safe about optimizations like this without a clarification from Khronos. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] include/pipe: all major Linux libc support endian.h
On Fri, Mar 14, 2014 at 7:57 PM, David Heidelberger david.heidelber...@ixit.cz wrote: --- src/gallium/include/pipe/p_config.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/include/pipe/p_config.h b/src/gallium/include/pipe/p_config.h index d603681..cd6f560 100644 --- a/src/gallium/include/pipe/p_config.h +++ b/src/gallium/include/pipe/p_config.h @@ -130,7 +130,7 @@ * Endian detection. */ -#ifdef __GLIBC__ +#if defined(PIPE_OS_LINUX) || defined(__GLIBC__) #include endian.h #if __BYTE_ORDER == __LITTLE_ENDIAN Doesn't this do something else than the subject suggests? All *major* linux libcs isn't the same as all linux libcs... ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] include/pipe: all major Linux libc support endian.h
On Tue, Mar 18, 2014 at 3:18 PM, David Heidelberger david.heidelber...@ixit.cz wrote: Dne 2014-03-18 13:43, Erik Faye-Lund napsal: On Fri, Mar 14, 2014 at 7:57 PM, David Heidelberger david.heidelber...@ixit.cz wrote: --- src/gallium/include/pipe/p_config.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/include/pipe/p_config.h b/src/gallium/include/pipe/p_config.h index d603681..cd6f560 100644 --- a/src/gallium/include/pipe/p_config.h +++ b/src/gallium/include/pipe/p_config.h @@ -130,7 +130,7 @@ * Endian detection. */ -#ifdef __GLIBC__ +#if defined(PIPE_OS_LINUX) || defined(__GLIBC__) #include endian.h #if __BYTE_ORDER == __LITTLE_ENDIAN Doesn't this do something else than the subject suggests? All *major* linux libcs isn't the same as all linux libcs... In this moment I'm not aware any libc without endian.h on linux, where is Mesa supposed to build. If there is any, let me know and I'll add it to #ifdef. Oh, OK. That's not what I read from the commit message, but thanks for clearing that up. I'm not aware of any either. At least not on platforms where Mesa with Gallium is a reasonable option. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/8] mesa: Enable GL_ARG_query_buffer_object extension
It seems the subject has a typo, and ARG - ARB would be appropriate. On Wed, Mar 19, 2014 at 10:30 PM, Rafal Mielniczuk rafal.mielnicz...@gmail.com wrote: Signed-off-by: Rafal Mielniczuk rafal.mielnicz...@gmail.com --- src/mesa/main/extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index a72284c..18710a6 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -123,6 +123,7 @@ static const struct extension extension_table[] = { { GL_ARB_point_parameters,o(EXT_point_parameters), GLL,1997 }, { GL_ARB_point_sprite,o(ARB_point_sprite), GL, 2003 }, { GL_ARB_provoking_vertex,o(EXT_provoking_vertex), GL, 2009 }, + { GL_ARB_query_buffer_object, o(dummy_true), GL, 2013 }, { GL_ARB_robustness, o(dummy_true), GL, 2010 }, { GL_ARB_sample_shading, o(ARB_sample_shading), GL, 2009 }, { GL_ARB_sampler_objects, o(dummy_true), GL, 2009 }, -- 1.9.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa branchpoint tags?
On Fri, Mar 28, 2014 at 8:25 AM, Eric Anholt e...@anholt.net wrote: I was looking at a bug report about old software, and wanted to see what development branch a quoted commit was on: anholt@eliezer:anholt/src/mesa-release% git describe 97217a40f97cdeae0304798b607f704deb0c3558 snb-magic-15797-g97217a4 That's... not terribly useful. It would be nice if git describe could be used so I could figure out what development branch a commit was for. You could also have used git branch --contains 97217a40f97cdeae0304798b607f704deb0c3558, which would've shown you that it was first introduced in the 9.2-branch. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mixing of hardware and software renderers
On Tue, May 6, 2014 at 6:51 AM, Patrick McMunn doctorwho...@gmail.com wrote: I'm using some older hardware - an ATI Radeon 9200 - which can only handle up to OpenGL 1.2. I was wondering if it's possible to use the hardware renderer generally and have the driver hand off the handling of functions which my video card can't handle (such as functions from a higher OpenGL version) to the software render and then the software render hand control back to the hardware renderer once it's finished. If this isn't currently possible, is this perhaps a feature which might appear in the future? To add to what others have said, this is not only currently not possible, but also incredible difficult to do in a standard-compliant way, because you need to have a software rasterizer that bitwise matches the result of the GPU. One example where this is needed is rasterization. If your software rasterizer doesn't provide the exact same result as the hardware rasterizer, you'll end up with gaps or overdraw between two adjecent polygons that ended up being using hardware and software rasterizing. To add to this, OpenGL allows slack in the rasterization result, and different GPUs rasterize differently. Matching these things is far from trivial, and requires knowledge about the hardware implementation that is usually not documented, even in the vendors' internal documentation. And do note that rasterization is just one example from a huge array of features that needs to match. So no, it's just not a feasible task. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa IR as a list of instructions
On Fri, May 30, 2014 at 9:22 PM, Eric Anholt e...@anholt.net wrote: However, talking yesterday about SSA and vector representations, we again failed to come up with anything that sounded compelling -- it appears that SSA is going to make obvious optimizations like dead code elimination in a vec4 IR harder. Ken and Matt pretty well convinced me that any new IR we work on should probably be scalar -- 965 is mostly scalar getting more so, my next project is scalar, and as far as I know current radeon and nouveau is scalar. Is anyone working on hardware with vector instructions these days? Yes, the Tegra vertex shader ISA is NV30/40, which is vector. Also, the Lima-guys might not be happy with a scalar-only ISA. Neither of these are supported in Mesa today, but Freedreno supports Adreno 2xx, Nouveau supports NV30/40, and there's the AMD R300 driver, all vector hardware. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Fixup glcpp tests for redefining a macro with whitespace changes.
On Thu, Jun 12, 2014 at 3:11 AM, Carl Worth cwo...@cworth.org wrote: Previously, the test suite was expecting the compiler to allow a redefintion of a macro with whitespace added, but gcc is more strict and allows only for changes in the amounts of whitespace, (but insists that whitespace exist or not in exactly the same places). See: https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html: These definitions are effectively the same: #define FOUR (2 + 2) #define FOUR (2+2) #define FOUR (2 /* two */ + 2) but these are not: #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) This change adjusts the existing redefine-macro-legitimate test to work with the more strict understanding, and adds a new redefine-whitespace test to verify that changes in the position of whitespace are flagged as errors. You may want to quote the relevant part from the C++ spec (16.3 Macro replacement, [cpp.replace]): Two replacement lists are identical if and only if the preprocessing tokens in both have the same number, ordering, spelling, and white-space separation, where all white-space separations are considered identical. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote: On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue abdiel.janul...@linux.intel.com wrote: v2: Check that the base type is float (Ian Romanick) Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/glsl/opt_algebraic.cpp | 34 ++ 1 file changed, 34 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index ac7514a..2ad561c 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -614,6 +614,40 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) break; + case ir_binop_min: + case ir_binop_max: + if (ir-type-base_type != GLSL_TYPE_FLOAT) + break; + /* Replace min(max) operations and its commutative combinations with + * a saturate operation + */ + for (int op = 0; op 2; op++) { + ir_expression *minmax = op_expr[op]; + ir_constant *outer_const = op_const[(op == 0) ? 1 : 0]; We use (1 - op) in other places in opt_algebraic. (Same comment below) + ir_expression_operation op_cond = (ir-operation == ir_binop_max) ? +ir_binop_min : ir_binop_max; + + if (!minmax || !outer_const || (minmax-operation != op_cond)) +continue; + + /* Found a min/max operation. Now try to see if its operands + * meet our conditions that we can do just a single saturate operation + */ + for (int minmax_op = 0; minmax_op 2; minmax_op++) { +ir_rvalue *inner_val_a = minmax-operands[minmax_op]; +ir_rvalue *inner_val_b = minmax-operands[(minmax_op == 0) ? 1 : 0]; + +if (!inner_val_a || !inner_val_b) + continue; + +/* Found a min (max(x, 0), 1.0) */ This comment tripped me up for a second. This really means that you've found either - min(max(x, 0.0), 1.0); or - max(min(x, 1.0), 0.0) Hmm, but are optimizing both of these to saturate OK? Shouldn't min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give 1.0? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
On Tue, Jul 8, 2014 at 8:22 AM, Matt Turner matts...@gmail.com wrote: On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote: This comment tripped me up for a second. This really means that you've found either - min(max(x, 0.0), 1.0); or - max(min(x, 1.0), 0.0) Hmm, but are optimizing both of these to saturate OK? Shouldn't min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give 1.0? Under standard IEEE rules, wouldn't the NaN propagate through all of these expressions? The GLSL 4.40 spec says Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. So it seems like we have a lot of flexibility here. Is there some text I'm missing? I think the point about flexibility is a bit weak or even misunderstood: surely this applies to the built-ins used, not what we generate after optimizing. So if we chose to do that, we'd need to prevent min and max from propagating NaN even when issued stand-alone, which might negatively impact performance in some ISAs. As to why I reacted, I was just remembering that I read somewhere (one of Humus' low-level shader optimization paper, IIRC) that the HLSL compiler refused to do some similar optimizations for NaN-reasons. Checking the spec a bit closer, though: - min(x, y) is defined as Returns y if y x, otherwise it returns x - max(x, y) is defined as Returns y if x y, otherwise it returns x. All comparisons with NaN should AFAIK fail, making both the first and second comparison return NaN, if NaN were to be properly supported. So my initial analysis was wrong. However, all of the following permutations of the same logic would still be inconsistent. min(max(0.0, x), 1.0) max(min(1.0, x), 0.0) min(1.0, max(0.0, x)) max(0.0, min(1.0, x)) min(1.0, max(x, 0.0)) max(0.0, min(x, 1.0)) I don't understand the code well enough to figure out if the patch optimizes these, though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
On Tue, Jul 8, 2014 at 5:44 PM, Roland Scheidegger srol...@vmware.com wrote: Am 08.07.2014 08:22, schrieb Matt Turner: On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote: This comment tripped me up for a second. This really means that you've found either - min(max(x, 0.0), 1.0); or - max(min(x, 1.0), 0.0) Hmm, but are optimizing both of these to saturate OK? Shouldn't min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give 1.0? Under standard IEEE rules, wouldn't the NaN propagate through all of these expressions? not under ieee754-2008 rules, no. min/max return the other number if one of them is a NaN. I'm not entirely sure what IEEE754-2008 has to say about min/max, as GLSL already defines those operations (and states that IEEE754 compliance isn't required). I was referring to the definition of the less-than comparison-operator, which GLSL seems to lean on. And that seems to always evaluate to 'false'. This is enforced by d3d10 so at least d3d10 capable hardware is going to honor that. http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v=vs.85%29.aspx The order of min/max though indeed matters. Right, this is very useful information, thanks. This makes me think that the perhaps the dependence on the less-than definition could be considered a spec-bug. A quick test reveals that NVIDIA's Windows driver implement both min(0.5, NaN) and min(NaN, 0.5) as returning 0.5. So I'd say: raise an issue with Khronos, and implement DX10 rules. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] gallium: add new semantics for tessellation
On Sat, Jul 19, 2014 at 4:59 PM, Ilia Mirkin imir...@alum.mit.edu wrote: +TGSI_SEMANTIC_TESSCOORD + + +For tessellation evaluation shaders, this semantic label indicates the +coordinates of the vertex being processed. This is available in XYZ; W is +undefined. + + This corresponds to gl_TessCoord ? +TGSI_SEMANTIC_TESSOUTER + + +For tessellation evaluation/control shaders, this semantic label indicates the +outer tessellation levels of the patch. Isoline tessellation will only have XY +defined, triangle will have XYZ and quads will have XYZW defined. This +corresponds to gl_TessLevelOuter. + +TGSI_SEMANTIC_TESSINNER + + +For tessellation evaluation/control shaders, this semantic label indicates the +inner tessellation levels of the patch. The X value is only defined for +triangle tessellation, while quads will have XY defined. This is entirely +undefined for isoline tessellation. + This corresponds to gl_TessLevelInner ? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: glsl/glcpp: Allow for '#' characters to appear in shader body
On Tue, Aug 5, 2014 at 11:22 PM, Carl Worth cwo...@cworth.org wrote: Kenneth Graunke kenn...@whitecape.org writes: I agree that this is pretty bogus. I'm coming around to thinking it's totally bogus. How about emitting a warning in the RETURN_TOKEN ('#') case? Thanks for the review, and thanks for suggesting the warning. I added the warning, then decided it needed some additional testing. So I whipped up a test that uses every possible non-identifier character to separate a macro name from its definition, like so: #define S* splat (defining a macro S to expand to * splat). This is first patch attached below. With this test, there's the question of whether it's even legal to have no space between a macro name and the replacement list in a macro definition. As usual, the GLSL specification is mute on the question, (deferring just to as is standard for C++). If we choose GCC as the standard, it does accept this case, (and emits a missing space warning). By accident, glcpp has always silently accepted this, until recently barfing, but only when '#' was the separating character. My next patch expanded this new test so that '#' is tested as well, (and ensuring the warning about the known-bogus '#' character was emitted). This is the second patch attached below. The expanded test looks really odd to me. Even though there are something like two-dozen pieces of punctuation used a separators, only the case with '#' emits a warning. The insight is that the warning really has nothing to do with #define FOO* but everything to do with the '#' character appearing out of proper context. So what we really want is better testing of illegal characters in various contexts. So I wrote two additional new tests. The first runs every possible legal character through glcpp, ensuring they are all passed through correctly. It turns out this test caught a long-standing bug, (glcpp was tripping up on vertical tab and form-feed for which the GLSL specification is explicit should be allowed). The second test runs every possible illegal character through glcpp, ensuring that an error is generated for each. Instead of attaching those two tests, I'll send patches for them to the list. They're independent of the current discussion. And what was _really_ striking when looking at the final test (as I first wrote it, but not as I will submit it), is that every illegal character was generating an error message except for '#' which was just generating a warning. So that sealed it for me that this treatment is bogus. So I'm inclined to leave Mesa breaking Reaction Quake, (but I'll go file a bug against that application). Now, what we could do if we were so inclined, would be to defer the errors for illegal characters until they actually appeared in the pre-processor output. That, is, a line such as: $ would be an immediate error, (Illegal character '$'), but a line such as: #define DOLLAR $ would be accepted with no error in and of itself, (the illegal character has appeared in a replacement list, but may not ever appear in the output to be seen by the compiler). Then, a subsequent line such as: DOLLAR would then emit the Illegal character '$' error. If we had all of that in place, that would un-break Reaction Quake in a way that wouldn't feel creepy to me, (and none of the tests would have odd-looking exceptional behavior). What you describe here seems to actually be what the standard requires: The relevant bit of the parse rule for a define statement of this type is like so (Section 16 [cpp] of the C++ spec): # define identifier replacement-list new-line ...And: replacement-list: pp-tokens(opt) pp-tokens: preprocessing-token pp-tokens preprocessing-token And Section 2.4 [lex.pptoken] defines: preprocessing-token: header-name identifier pp-number character-literal string-literal preprocessing-op-or-punc each non-white-space character that cannot be one of the above So, from the preprocessor's point of view, '#' is allowed in a macro, as it matches preprocessing-op-or-punc (or if we consider the # operator removed from the cpp-spec due to lack of strings, it still matches that each non-white-space character that cannot be one of the above-rule). It's AFAICT not until such a pp-token gets converted to a glsl-token that this might become an error. And (as you kind of pointed out), if a macro is never used, it never gets converted. Note that '$' is a bit different, as it's not a part of the preprocessor's character set, so using it might be interpreted as undefined behavior. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: glsl/glcpp: Allow for '#' characters to appear in shader body
On Fri, Aug 8, 2014 at 12:29 AM, Carl Worth cwo...@cworth.org wrote: Erik Faye-Lund kusmab...@gmail.com writes: Note that '$' is a bit different, as it's not a part of the preprocessor's character set, so using it might be interpreted as undefined behavior. Right. That could easily go either way. Is the phrase each non-white-space character restricted to characters of the original character set? This is a bit unclear. Section 2.1 [lex.phases] has the following to say: 1. Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations. Any source file character not in the basic source character set is replaced by the universal-character-name that designates that character. (An implementation may use any internal encoding, so long as an actual extended character encountered in the source file, and the same extended character expressed in the source file as a universal-character-name (i.e. using the \u notation), are handled equivalently.) snip 5. Each source character set member, escape sequence, or universal-character-name in character literals and string literals is converted to the corresponding member of the execution character set; if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character. The Any source file character not in the basic source character set is replaced by the universal-character-name that designates that character-bit seems to be the interesting part, but I'm not quite able to grok it. Appendix E lists universal-character-names, seemingly from ISO/IEC PDTR 10176 (which seems to be some kind of guidelines for programming language design). It seems to me like this is there to support non-ASCII identifiers and strings, but GLSL doesn't support either. I'm not able to come up with a conclusion here. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/9] i915: Use L8A8 instead of I8 to simulate A8 on gen2
On Thu, Aug 7, 2014 at 10:31 AM, ville.syrj...@linux.intel.com wrote: diff --git a/src/mesa/main/texformat.c b/src/mesa/main/texformat.c index c61a748..f414ea3 100644 --- a/src/mesa/main/texformat.c +++ b/src/mesa/main/texformat.c case GL_ALPHA12: case GL_ALPHA16: RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM16); RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM8); + RETURN_IF_SUPPORTED(MESA_FORMAT_L8A8_UNORM); break; I know this isn't exactly what your patch looked to support, but shouldn't MESA_FORMAT_L16A16_UNORM be tried as a lossless alternative also? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/9] i915: Use L8A8 instead of I8 to simulate A8 on gen2
On Fri, Aug 15, 2014 at 11:57 AM, Ville Syrjälä ville.syrj...@linux.intel.com wrote: On Fri, Aug 15, 2014 at 10:52:50AM +0200, Erik Faye-Lund wrote: On Thu, Aug 7, 2014 at 10:31 AM, ville.syrj...@linux.intel.com wrote: diff --git a/src/mesa/main/texformat.c b/src/mesa/main/texformat.c index c61a748..f414ea3 100644 --- a/src/mesa/main/texformat.c +++ b/src/mesa/main/texformat.c case GL_ALPHA12: case GL_ALPHA16: RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM16); RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM8); + RETURN_IF_SUPPORTED(MESA_FORMAT_L8A8_UNORM); break; I know this isn't exactly what your patch looked to support, but shouldn't MESA_FORMAT_L16A16_UNORM be tried as a lossless alternative also? I suppose, but I suspect you'll have a hard time finding hardware that supports L16A16 but not A16. Fair point. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Linking fails when not writing gl_Position.
On Tue, Sep 9, 2014 at 7:30 PM, Ian Romanick i...@freedesktop.org wrote: On 09/08/2014 01:10 AM, Tapani Pälli wrote: From: Kalyan Kondapally kalyan.kondapa...@intel.com According to GLSL-ES Spec(i.e. 1.0, 3.0), gl_Position value is undefined after the vertex processing stage if we don't write gl_Position. However, GLSL 1.10 Spec mentions that writing to gl_Position is mandatory. In case of GLSL-ES, it's not an error and atleast the linking should pass. Currently, Mesa throws an linker error in case we dont write to gl_position and Version is less then 140(GLSL) and 300(GLSL-ES). This patch changes it so that we don't report an error in case of GLSL-ES. Wow. We can add this to the list of ways OpenGL ES is just plain broken. Historical note: It's not broken, this is by design. The rationale for why we did this was to ease the burden of compile-time checks for the compiler - detecting if gl_Position is written no matter how the program branches is not trivial, so we decided against it. When we specified this, the typical mobile CPU clock-rate was *a lot* lower than it is today. Sure, it probably could have been fixed for GLES 3.0. I wasn't involved in those discussions, so I have no insight there. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Linking fails when not writing gl_Position.
On Thu, Sep 11, 2014 at 12:32 AM, Ian Romanick i...@freedesktop.org wrote: On 09/10/2014 01:53 PM, Erik Faye-Lund wrote: On Tue, Sep 9, 2014 at 7:30 PM, Ian Romanick i...@freedesktop.org wrote: On 09/08/2014 01:10 AM, Tapani Pälli wrote: From: Kalyan Kondapally kalyan.kondapa...@intel.com According to GLSL-ES Spec(i.e. 1.0, 3.0), gl_Position value is undefined after the vertex processing stage if we don't write gl_Position. However, GLSL 1.10 Spec mentions that writing to gl_Position is mandatory. In case of GLSL-ES, it's not an error and atleast the linking should pass. Currently, Mesa throws an linker error in case we dont write to gl_position and Version is less then 140(GLSL) and 300(GLSL-ES). This patch changes it so that we don't report an error in case of GLSL-ES. Wow. We can add this to the list of ways OpenGL ES is just plain broken. Historical note: It's not broken, this is by design. The rationale for why we did this was to ease the burden of compile-time checks for the compiler - detecting if gl_Position is written no matter how the program branches is not trivial, so we decided against it. When we specified this, the typical mobile CPU clock-rate was *a lot* lower than it is today. I'd call that broken by design. The overhead of checking that a shader statically accessed a variabl is infinitesimal, no matter how slow your CPU is. Not so. If you consider the minimal feature-set required for GLES 2.0, you'll realize that conformant implementations was expected to do *very* little program analysis, and compiling shader is something you're doing on a very restricted time-budget. When working on the Mali-200 driver for early customers, we also needed to keep the entire driver (including compiler) in the size-order of 100KB, due to ROM size limitations. I remember numbers being thrown around the working-group were that no one could spend more than around 50KB for the compiler, and around 30ms compilation time per shader (running on a 2-300 mhz ARM9 CPU, with 64MB of RAM). Sure, afterwards Android came around and changed a lot of that, but we couldn't have known (and a lot of people were explicitly betting *against* anything like that). So yeah, we really needed to keep stuff down, both for performance and size-reasons. These are the sort of cut-corners that make developers hate OpenGL development. Why is the screen black? No clue. Meh. GLES has way less of these than desktop GL of the same vintage, so color me unimpressed by that argument. But also, keep in mind that this was at a time where the total number of GLES adopters were zero. Yes, a some decisions were made that might seem questionable now that we *know* that GLES 2 became a success. But it might not even have become one if we didn't cut some of those corners. We went from zero to hundreds of millions of devices real quick, and initially all the chip-customers were *really* worried about even shipping a compiler on their device. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer
On Sat, Aug 2, 2014 at 4:09 AM, Ian Romanick i...@freedesktop.org wrote: diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 609d94b..7b089fa 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -266,30 +265,32 @@ validate_uniform_parameters(struct gl_context *ctx, */ if (shProg-UniformRemapTable[location] == INACTIVE_UNIFORM_EXPLICIT_LOCATION) - return false; + return NULL; - _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + unsigned loc; + _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + struct gl_uniform_storage *const uni = shProg-UniformStorage[loc]; - if (shProg-UniformStorage[*loc].array_elements == 0 count 1) { + if (uni-array_elements == 0 count 1) { I'm getting a NULL-pointer deference here when running piglit's spec/ARB_explicit_uniform_location/arb_explicit_uniform_location-use-of-unused-loc on yesterday's master. A quick git-log doesn't seem to contain a fix in today's master either. Perhaps something like this is in order? - if (uni-array_elements == 0 count 1) { + if (uni uni-array_elements == 0 count 1) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer
On Thu, Sep 11, 2014 at 1:27 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Sat, Aug 2, 2014 at 4:09 AM, Ian Romanick i...@freedesktop.org wrote: diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 609d94b..7b089fa 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -266,30 +265,32 @@ validate_uniform_parameters(struct gl_context *ctx, */ if (shProg-UniformRemapTable[location] == INACTIVE_UNIFORM_EXPLICIT_LOCATION) - return false; + return NULL; - _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + unsigned loc; + _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + struct gl_uniform_storage *const uni = shProg-UniformStorage[loc]; - if (shProg-UniformStorage[*loc].array_elements == 0 count 1) { + if (uni-array_elements == 0 count 1) { I'm getting a NULL-pointer deference here when running piglit's spec/ARB_explicit_uniform_location/arb_explicit_uniform_location-use-of-unused-loc on yesterday's master. A quick git-log doesn't seem to contain a fix in today's master either. Perhaps something like this is in order? - if (uni-array_elements == 0 count 1) { + if (uni uni-array_elements == 0 count 1) { Actually, that just moves the issue a few lines, as line 282 also dereferences 'uni'. Earlying out by checking for NULL in uni seems more appropriate, but if I do so without setting GL_INVALID_OEPRATION, the piglit test fails. Doing ---8--- diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 4cd2bca..4483948 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -268,6 +268,12 @@ validate_uniform_parameters(struct gl_context *ctx, return NULL; struct gl_uniform_storage *const uni = shProg-UniformRemapTable[location]; + if (!uni) { + _mesa_error(ctx, GL_INVALID_OPERATION, + %s(nonexistent uniform, location=%d), + caller, location); + return NULL; + } if (uni-array_elements == 0 count 1) { _mesa_error(ctx, GL_INVALID_OPERATION, ---8--- ...seems to fix both the crash and the test. Looking a bit more closely at the test, it seems somewhat appropriate. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer
On Thu, Sep 11, 2014 at 2:00 PM, Tapani Pälli tapani.pa...@intel.com wrote: On 09/11/2014 02:27 PM, Erik Faye-Lund wrote: On Sat, Aug 2, 2014 at 4:09 AM, Ian Romanick i...@freedesktop.org wrote: diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 609d94b..7b089fa 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -266,30 +265,32 @@ validate_uniform_parameters(struct gl_context *ctx, */ if (shProg-UniformRemapTable[location] == INACTIVE_UNIFORM_EXPLICIT_LOCATION) - return false; + return NULL; - _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + unsigned loc; + _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + struct gl_uniform_storage *const uni = shProg-UniformStorage[loc]; - if (shProg-UniformStorage[*loc].array_elements == 0 count 1) { + if (uni-array_elements == 0 count 1) { I'm getting a NULL-pointer deference here when running piglit's spec/ARB_explicit_uniform_location/arb_explicit_uniform_location-use-of-unused-loc on yesterday's master. A quick git-log doesn't seem to contain a fix in today's master either. Perhaps something like this is in order? - if (uni-array_elements == 0 count 1) { + if (uni uni-array_elements == 0 count 1) { fix is here: http://lists.freedesktop.org/archives/mesa-dev/2014-August/066752.html remaptable contains entries that are set NULL for inactive uniforms Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: check that uniform exists in glUniform* functions
On Thu, Aug 28, 2014 at 9:58 AM, Tapani Pälli tapani.pa...@intel.com wrote: Remap table for uniforms may contain empty entries when using explicit uniform locations. If no active/inactive variable exists with given location, remap table contains NULL. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/uniform_query.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 4cd2bca..5dcb528 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -239,7 +239,7 @@ validate_uniform_parameters(struct gl_context *ctx, * - if count is greater than one, and the uniform declared in the * shader is not an array variable, */ - if (location -1) { + if (location -1 || !shProg-UniformRemapTable[location]) { _mesa_error(ctx, GL_INVALID_OPERATION, %s(location=%d), caller, location); Nice. But shouldn't the the error-message be updated? With a location smaller than -1, it's very clear what's wrong. But when triggering the introduced condition, it's not. I think some wording should be added to make it clear that the location does not exist. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer
On Thu, Sep 11, 2014 at 8:35 PM, Ian Romanick i...@freedesktop.org wrote: On 09/11/2014 05:09 AM, Erik Faye-Lund wrote: On Thu, Sep 11, 2014 at 2:00 PM, Tapani Pälli tapani.pa...@intel.com wrote: On 09/11/2014 02:27 PM, Erik Faye-Lund wrote: On Sat, Aug 2, 2014 at 4:09 AM, Ian Romanick i...@freedesktop.org wrote: diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 609d94b..7b089fa 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -266,30 +265,32 @@ validate_uniform_parameters(struct gl_context *ctx, */ if (shProg-UniformRemapTable[location] == INACTIVE_UNIFORM_EXPLICIT_LOCATION) - return false; + return NULL; - _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + unsigned loc; + _mesa_uniform_split_location_offset(shProg, location, loc, array_index); + struct gl_uniform_storage *const uni = shProg-UniformStorage[loc]; - if (shProg-UniformStorage[*loc].array_elements == 0 count 1) { + if (uni-array_elements == 0 count 1) { I'm getting a NULL-pointer deference here when running piglit's spec/ARB_explicit_uniform_location/arb_explicit_uniform_location-use-of-unused-loc on yesterday's master. A quick git-log doesn't seem to contain a fix in today's master either. Perhaps something like this is in order? - if (uni-array_elements == 0 count 1) { + if (uni uni-array_elements == 0 count 1) { fix is here: http://lists.freedesktop.org/archives/mesa-dev/2014-August/066752.html remaptable contains entries that are set NULL for inactive uniforms Thanks. Does Tapani's patch fix the problem for you? If so, can we call that a Tested-by? :) It does indeed, so: Tested-by: Erik Faye-Lund kusmab...@gmail.com. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: check that uniform exists in glUniform* functions
On Thu, Sep 11, 2014 at 8:39 PM, Ian Romanick i...@freedesktop.org wrote: On 08/28/2014 12:58 AM, Tapani Pälli wrote: Remap table for uniforms may contain empty entries when using explicit uniform locations. If no active/inactive variable exists with given location, remap table contains NULL. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/uniform_query.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 4cd2bca..5dcb528 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -239,7 +239,7 @@ validate_uniform_parameters(struct gl_context *ctx, * - if count is greater than one, and the uniform declared in the * shader is not an array variable, */ - if (location -1) { + if (location -1 || !shProg-UniformRemapTable[location]) { _mesa_error(ctx, GL_INVALID_OPERATION, %s(location=%d), caller, location); return NULL; So... this would occur if someone explicitly put uniforms at locations 0 and 2, then queried location 1, right? What happens if someone queries location 0x7fff? Will it commit an out-of-bounds acces of UniformRemapTable, or is there an earlier check that's not visible in the patch? Outch. That check is right *after* this lookup, so no... that's not going to work as-is. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: improve accuracy of atan()
Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com --- I noticed that glsl-const-folding-01 failed due to a pretty inaccurate result for atan(1.0) * 4.0. The result should be pi, but we only produce 3.141298. The new approximation gets us to 3.141580, which is at least better than before. I also tried using one less term (this is what Microsoft's HLSL compiler does) as well as a bunch of other simpler approximations, but that didn't get us far enough to pass the test. Depressingly enough, it seems NVIDIA's proprietary drivers uses the same polynomial as the HLSL compiler, but passes the test because it uses an even more accurate atan implementation during constant-folding. We could also go down that path, by introducing an ir_unop_atan that gets lowered to a polynomial before code-generation. That would benefit drivers for hardware with atan-support (at least Mali supports this, according to http://limadriver.org/Lima+ISA/), but it worries me a bit to do constant folding with different precision than execution. But perhaps we already have that problem, only a bit more subtle? src/glsl/builtin_functions.cpp | 68 +++--- 1 file changed, 58 insertions(+), 10 deletions(-) diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 9be7f6d..1820dd5 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -442,6 +442,7 @@ private: ir_swizzle *matrix_elt(ir_variable *var, int col, int row); ir_expression *asin_expr(ir_variable *x); + void do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x); /** * Call function \param f with parameters specified as the linked @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type) ir_factory outer_then(outer_if-then_instructions, mem_ctx); /* Then...call atan(y/x) */ - ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, y_over_x); - outer_then.emit(assign(y_over_x, div(y, x))); - outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x), - imm(1.0f)); - outer_then.emit(assign(r, asin_expr(r))); + do_atan(body, glsl_type::float_type, r, div(y, x)); /* ...and fix it up: */ ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f))); @@ -2711,17 +2708,68 @@ builtin_builder::_atan2(const glsl_type *type) return sig; } +void +builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x) +{ + /* +* range-reduction, first step: +* +* / y_over_x if |y_over_x| = 1.0; +* x = +* \ 1.0 / y_over_x otherwise +*/ + ir_variable *x = body.make_temp(type, atan_x); + body.emit(assign(x, div(min2(abs(y_over_x), +imm(1.0f)), + max2(abs(y_over_x), +imm(1.0f); + + /* +* approximate atan by evaluating polynomial: +* +* x * 0.793128310355 - x^3 * 0.3326756418091246 + +* x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + +* x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 +*/ + ir_variable *tmp = body.make_temp(type, atan_tmp); + body.emit(assign(tmp, mul(x, x))); + body.emit(assign(tmp, mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f), + tmp), + imm(0.0536813784310406f)), + tmp), + imm(0.1173503194786851f
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Wed, Sep 24, 2014 at 12:39 AM, Erik Faye-Lund kusmab...@gmail.com wrote: This fixes the following piglit test: shaders/glsl-const-folding-01 Forgot to mention: no piglit regressions seen on Ivybridge. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Wed, Sep 24, 2014 at 4:10 AM, Ian Romanick i...@freedesktop.org wrote: On 09/23/2014 03:39 PM, Erik Faye-Lund wrote: Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49713 Ah! Thanks a lot for that link, very useful. We could also go down that path, by introducing an ir_unop_atan that gets lowered to a polynomial before code-generation. That would benefit drivers for hardware with atan-support (at least Mali supports this, according to http://limadriver.org/Lima+ISA/), but it worries me a bit to do constant folding with different precision than execution. But perhaps we already have that problem, only a bit more subtle? We used to implement constant folding for many built-in functions this way. Built-in functions like atan were detected in the constant folding process, and C library functions were used. Commit 363c14ae0 changed this to simplify the constant folding code. This test case began failing at that time, and Vinson submitted bug #49713. Aha, thanks for clearing that up. However, just to be clear, my alternative suggestion isn't to go back in that direction. It's to make atan a middle-class citizen, like ir_unop_[exp|log]. These always gets lowered to ir_unop_[exp|log]2, but in this case the lowering would go to arithmetic. Something like this (also no piglit regressions): --8-- diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 9be7f6d..cc6c8b3 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -442,6 +442,7 @@ private: ir_swizzle *matrix_elt(ir_variable *var, int col, int row); ir_expression *asin_expr(ir_variable *x); + void do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x); /** * Call function \param f with parameters specified as the linked @@ -2596,6 +2597,7 @@ builtin_builder::_degrees(const glsl_type *type) UNOP(sin, ir_unop_sin, always_available) UNOP(cos, ir_unop_cos, always_available) +UNOP(atan, ir_unop_atan, always_available) ir_function_signature * builtin_builder::_tan(const glsl_type *type) @@ -2684,11 +2686,7 @@ builtin_builder::_atan2(const glsl_type *type) ir_factory outer_then(outer_if-then_instructions, mem_ctx); /* Then...call atan(y/x) */ - ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, y_over_x); - outer_then.emit(assign(y_over_x, div(y, x))); - outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x), - imm(1.0f)); - outer_then.emit(assign(r, asin_expr(r))); + body.emit(assign(r, expr(ir_unop_atan, div(y, x; /* ...and fix it up: */ ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f))); @@ -2712,21 +2710,6 @@ builtin_builder::_atan2(const glsl_type *type) } ir_function_signature * -builtin_builder::_atan(const glsl_type *type) -{ - ir_variable *y_over_x = in_var(type, y_over_x); - MAKE_SIG(type, always_available, 1, y_over_x); - - ir_variable *t = body.make_temp(type, t); - body.emit(assign(t, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x), - imm(1.0f)); - - body.emit(ret(asin_expr(t))); - - return sig; -} - -ir_function_signature * builtin_builder::_sinh(const glsl_type *type) { ir_variable *x = in_var(type, x); diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp index 739a9f4..1ec7789 100644 --- a/src/glsl/ir.cpp +++ b/src/glsl/ir.cpp @@ -247,6 +247,7 @@ ir_expression::ir_expression(int op, ir_rvalue *op0) case ir_unop_cos: case ir_unop_sin_reduced: case ir_unop_cos_reduced: + case ir_unop_atan: case ir_unop_dFdx: case ir_unop_dFdx_coarse
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Wed, Sep 24, 2014 at 1:35 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Hm. Don't I need to expand this last immediate to a vector of type-components() size as well? If so, this patch should go on top: diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 1820dd5..bfa46eb 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -2749,13 +2749,11 @@ builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *r /* range-reduction fixup */ body.emit(assign(tmp, add(tmp, csel(greater(abs(y_over_x), - swizzle(imm(1.0f), - SWIZZLE_, - type-components())), + imm(1.0f, type-components())), add(mul(tmp, imm(-2.0f)), imm(M_PI_2f)), - imm(0.0f); + imm(0.0f, type-components()); /* sign fixup */ body.emit(assign(res, mul(tmp, sign(y_over_x; Ugh, it seems things fail piglit's shaders/glsl-fs-atan-1 pretty badly on swrast. ../../src/mesa/program/ir_to_mesa.cpp:1426: virtual void {anonymous}::ir_to_mesa_visitor::visit(ir_expression*): Assertion `!not supported' failed. Ugh, yeah. We don't seem to support ir_trip_csel on swrast, and there's no attempt to lower it unless it's condition is constant 0 or 1 (in opt_algebraic.cpp). So yeah, this regresses there, but looking at the other builtins, so does anything that uses gentype mix(gentype, gentype, genBtype) etc. However, that's at least blocked by a GLSL v130 check, which swrast does not support. So, I've got some more work to do :/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Thu, Sep 25, 2014 at 4:54 PM, Erik Faye-Lund kusmab...@gmail.com wrote: On Wed, Sep 24, 2014 at 1:35 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Hm. Don't I need to expand this last immediate to a vector of type-components() size as well? If so, this patch should go on top: diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 1820dd5..bfa46eb 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -2749,13 +2749,11 @@ builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *r /* range-reduction fixup */ body.emit(assign(tmp, add(tmp, csel(greater(abs(y_over_x), - swizzle(imm(1.0f), - SWIZZLE_, - type-components())), + imm(1.0f, type-components())), add(mul(tmp, imm(-2.0f)), imm(M_PI_2f)), - imm(0.0f); + imm(0.0f, type-components()); /* sign fixup */ body.emit(assign(res, mul(tmp, sign(y_over_x; Ugh, it seems things fail piglit's shaders/glsl-fs-atan-1 pretty badly on swrast. ../../src/mesa/program/ir_to_mesa.cpp:1426: virtual void {anonymous}::ir_to_mesa_visitor::visit(ir_expression*): Assertion `!not supported' failed. Ugh, yeah. We don't seem to support ir_trip_csel on swrast, and there's no attempt to lower it unless it's condition is constant 0 or 1 (in opt_algebraic.cpp). So yeah, this regresses there, but looking at the other builtins, so does anything that uses gentype mix(gentype, gentype, genBtype) etc. However, that's at least blocked by a GLSL v130 check, which swrast does not support. So, I've got some more work to do :/ OK, that was easier than I thought: diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index bfa46eb..c126b60 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -2748,12 +2748,11 @@ builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *r /* range-reduction fixup */ body.emit(assign(tmp, add(tmp, - csel(greater(abs(y_over_x), - imm(1.0f, type-components())), + mul(b2f(greater(abs(y_over_x), + imm(1.0f, type-components(, add(mul(tmp, imm(-2.0f)), - imm(M_PI_2f)), - imm(0.0f, type-components()); + imm(M_PI_2f)); /* sign fixup */ body.emit(assign(res, mul(tmp, sign(y_over_x; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: improve accuracy of atan()
Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/builtin_functions.cpp | 65 +++--- 1 file changed, 55 insertions(+), 10 deletions(-) diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 9be7f6d..c126b60 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -442,6 +442,7 @@ private: ir_swizzle *matrix_elt(ir_variable *var, int col, int row); ir_expression *asin_expr(ir_variable *x); + void do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x); /** * Call function \param f with parameters specified as the linked @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type) ir_factory outer_then(outer_if-then_instructions, mem_ctx); /* Then...call atan(y/x) */ - ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, y_over_x); - outer_then.emit(assign(y_over_x, div(y, x))); - outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x), - imm(1.0f)); - outer_then.emit(assign(r, asin_expr(r))); + do_atan(body, glsl_type::float_type, r, div(y, x)); /* ...and fix it up: */ ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f))); @@ -2711,17 +2708,65 @@ builtin_builder::_atan2(const glsl_type *type) return sig; } +void +builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x) +{ + /* +* range-reduction, first step: +* +* / y_over_x if |y_over_x| = 1.0; +* x = +* \ 1.0 / y_over_x otherwise +*/ + ir_variable *x = body.make_temp(type, atan_x); + body.emit(assign(x, div(min2(abs(y_over_x), +imm(1.0f)), + max2(abs(y_over_x), +imm(1.0f); + + /* +* approximate atan by evaluating polynomial: +* +* x * 0.793128310355 - x^3 * 0.3326756418091246 + +* x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + +* x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 +*/ + ir_variable *tmp = body.make_temp(type, atan_tmp); + body.emit(assign(tmp, mul(x, x))); + body.emit(assign(tmp, mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f), + tmp), + imm(0.0536813784310406f)), + tmp), + imm(0.1173503194786851f)), + tmp), + imm(0.1938924977115610f)), + tmp), + imm(0.3326756418091246f)), + tmp), + imm(0.793128310355f)), + x))); + + /* range-reduction fixup */ + body.emit(assign(tmp, add(tmp, + mul(b2f(greater(abs(y_over_x), + imm(1.0f, type-components(, + add(mul(tmp, + imm(-2.0f)), + imm(M_PI_2f)); + + /* sign fixup */ + body.emit(assign(res, mul(tmp, sign(y_over_x; +} + ir_function_signature * builtin_builder::_atan(const glsl_type *type) { ir_variable *y_over_x = in_var(type, y_over_x); MAKE_SIG(type, always_available, 1
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com Ping? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Fri, Oct 10, 2014 at 12:22 PM, Timothy Arceri t_arc...@yahoo.com.au wrote: On Mon, 2014-10-06 at 17:03 +0200, Erik Faye-Lund wrote: On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund kusmab...@gmail.com wrote: Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identitiy to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com Reviewed-by: Ian Romanick ian.d.roman...@intel.com Ping? Are you just looking for someone to commit this? Either that, or a reason for it to not be applied ;) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()
On Fri, Oct 10, 2014 at 8:44 PM, Olivier Galibert galib...@pobox.com wrote: Applied. Thanks :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 6/9] gallium/auxiliary: add dump functions for bind and transfer flags
On Sun, Nov 2, 2014 at 7:32 PM, David Heidelberg da...@ixit.cz wrote: v2: rename and extend support with code for C11 and MSVC (thanks to Brian) Signed-off-by: David Heidelberg da...@ixit.cz --- src/gallium/auxiliary/util/u_dump.h | 6 ++ src/gallium/auxiliary/util/u_dump_defines.c | 86 + 2 files changed, 92 insertions(+) diff --git a/src/gallium/auxiliary/util/u_dump.h b/src/gallium/auxiliary/util/u_dump.h index 58e7dfd..84ba1ed 100644 --- a/src/gallium/auxiliary/util/u_dump.h +++ b/src/gallium/auxiliary/util/u_dump.h @@ -88,6 +88,12 @@ util_dump_tex_filter(unsigned value, boolean shortened); const char * util_dump_query_type(unsigned value, boolean shortened); +const char * +util_dump_bind_flags(unsigned flags); + +const char * +util_dump_transfer_flags(unsigned flags); + /* * p_state.h, through a FILE diff --git a/src/gallium/auxiliary/util/u_dump_defines.c b/src/gallium/auxiliary/util/u_dump_defines.c index 03fd15d..20ae6c0 100644 --- a/src/gallium/auxiliary/util/u_dump_defines.c +++ b/src/gallium/auxiliary/util/u_dump_defines.c @@ -61,6 +61,36 @@ util_dump_enum_continuous(unsigned value, return names[value]; } +static const char * +util_dump_flags(unsigned flags, const char *prefix, +unsigned num_names, +const char **names) +{ +#if __STDC_VERSION__ = 201112 !defined __STDC_NO_THREADS__ + static _Thread_local char str[256]; +#elif defined(PIPE_CC_GCC) + static __thread char str[256]; +#elif defined(PIPE_CC_MSVC) + static __declspec(thread) char str[256]; +#else +#error Unsupported compiler: please find how to implement thread local storage on it +#endif Isn't failing compilation a bit aggressive? Can't we just lose the functionality instead? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Move mesa version scheme to a more common style ?
On Fri, Nov 14, 2014 at 3:39 PM, Emil Velikov emil.l.veli...@gmail.com wrote: Hello all, This is an old question that I had laying around - why doesn't mesa use a more conventional numbering for the development/rc releases ? Eg. mesa 10.4.0-rc1 - 10.3.99.901 mesa 10.4.0-rc2 - 10.3.99.902 ... mesa 10.4.0 - 10.4.0 mesa 10.4.1-rc1 - 10.4.0.901 ... you get the idea. Afaics most freedesktop project use it plus a big hunk of gnome. Does this really make it a more conventional numbering scheme? I've personally seen the 10.4.1-rc1-scheme way more often than the 10.4.0.901-scheme. In fact, I think this is the first time I've seen the latter used. But I haven't exactly gone out of my way looking for versioning schemes. Are there any objections if I move to the above format starting with mesa 10.4-rc1 ? I would appreciate any feedback over the next 2-3 days, and based on it I'll tag the first RC. Shouldn't it be the other way around? IMO we should have strong arguments for *changing* it, rather than keep going as-is... Any change can break something, so only changes that have clear benefits should be done, no? AFAICT, the current scheme conveys more relevant, obvious information than the proposed one, namely that it's a release *candidate* for v10.4.1. If no blocking issues are found, it'll become the *actual* release... ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: roll up a program for the sse4.1 check
On Sat, Nov 15, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote: On Sat, Nov 15, 2014 at 10:13 AM, Ilia Mirkin imir...@alum.mit.edu wrote: On Sat, Nov 15, 2014 at 12:04 PM, Emil Velikov emil.l.veli...@gmail.com wrote: So when checking/building sse code we have three possibilities: 1 Old compiler, throws an error when using -msse* 2 New compiler, user disables sse* (-mno-sse*) 3 New compiler, user doesn't disable sse The original code, added code for #1 but not #2. Later on we patched around the lack of handling #2 by wrapping the code in __SSE4_1__. Yet it lead to a missing/undefined symbol in case of #1 or #2, which might cause an issue for #2 when using the i965 driver. A bit later we fixed the undefined symbol by using #1, rather than updating it to handle #2. With this commit we set things straight :) To top it all up, conventions state that in case of conflicting (-enable-foo -disable-foo) options, the latter one takes precedence. Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test. Cc: Siavash Eliasi siavashser...@gmail.com Cc: Matt Turner matts...@gmail.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Man this thing is _very_ messy. Matt from the last hunk it seems that pixman might need fixing. Should be bother with that, or let people have fun when they hit it :P -Emil configure.ac | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 91e111b..9d1835e 100644 --- a/configure.ac +++ b/configure.ac @@ -252,7 +252,19 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) dnl dnl Optional flags, check for compiler support dnl -AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0]) +save_CFLAGS=$CFLAGS +CFLAGS=-msse4.1 $CFLAGS +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ +#include mmintrin.h +#include xmmintrin.h +#include emmintrin.h +#include smmintrin.h +int main () { +__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; +c = _mm_max_epu32(a, b); +return 0; This seems complicated. (a) Just #include immintrin.h (b) Is any of this even necessary? how about int main() { return !__SSE_4_1__; } Checking that you can actually using the intrinsics seens like a good plan. Pixman's configure.ac has been doing that for a long time. I'd rather copy that. It's not like we'd save much by not doing it. ...But then we cannot cross-compile with run-time SSE 4.1 support from a machine without SSE 4.1 support, no? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] configure.ac: roll up a program for the sse4.1 check
On Mon, Nov 17, 2014 at 3:50 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Mon, Nov 17, 2014 at 9:44 AM, Erik Faye-Lund kusmab...@gmail.com wrote: On Sat, Nov 15, 2014 at 7:18 PM, Matt Turner matts...@gmail.com wrote: On Sat, Nov 15, 2014 at 10:13 AM, Ilia Mirkin imir...@alum.mit.edu wrote: On Sat, Nov 15, 2014 at 12:04 PM, Emil Velikov emil.l.veli...@gmail.com wrote: So when checking/building sse code we have three possibilities: 1 Old compiler, throws an error when using -msse* 2 New compiler, user disables sse* (-mno-sse*) 3 New compiler, user doesn't disable sse The original code, added code for #1 but not #2. Later on we patched around the lack of handling #2 by wrapping the code in __SSE4_1__. Yet it lead to a missing/undefined symbol in case of #1 or #2, which might cause an issue for #2 when using the i965 driver. A bit later we fixed the undefined symbol by using #1, rather than updating it to handle #2. With this commit we set things straight :) To top it all up, conventions state that in case of conflicting (-enable-foo -disable-foo) options, the latter one takes precedence. Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test. Cc: Siavash Eliasi siavashser...@gmail.com Cc: Matt Turner matts...@gmail.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Man this thing is _very_ messy. Matt from the last hunk it seems that pixman might need fixing. Should be bother with that, or let people have fun when they hit it :P -Emil configure.ac | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 91e111b..9d1835e 100644 --- a/configure.ac +++ b/configure.ac @@ -252,7 +252,19 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) dnl dnl Optional flags, check for compiler support dnl -AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0]) +save_CFLAGS=$CFLAGS +CFLAGS=-msse4.1 $CFLAGS +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ +#include mmintrin.h +#include xmmintrin.h +#include emmintrin.h +#include smmintrin.h +int main () { +__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c; +c = _mm_max_epu32(a, b); +return 0; This seems complicated. (a) Just #include immintrin.h (b) Is any of this even necessary? how about int main() { return !__SSE_4_1__; } Checking that you can actually using the intrinsics seens like a good plan. Pixman's configure.ac has been doing that for a long time. I'd rather copy that. It's not like we'd save much by not doing it. ...But then we cannot cross-compile with run-time SSE 4.1 support from a machine without SSE 4.1 support, no? Why not? It's a compile-time test. Obviously if there's no compiler support, then you're SOL... Right, my bad. Thanks for setting the record straight. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/9] nir: Add lowering of POW instructions if the lower flag is set.
On Sun, Feb 1, 2015 at 10:17 PM, Eric Anholt e...@anholt.net wrote: This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense. --- src/glsl/nir/nir_opt_algebraic.py | 1 + 1 file changed, 1 insertion(+) diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py index 2414f71..41bfe04 100644 --- a/src/glsl/nir/nir_opt_algebraic.py +++ b/src/glsl/nir/nir_opt_algebraic.py @@ -107,6 +107,7 @@ optimizations = [ (('fexp', ('flog', a)), a), # e^ln(a) = a (('flog2', ('fexp2', a)), a), # lg2(2^a) = a (('flog', ('fexp', a)), a), # ln(e^a) = a + (('fpow', a, b), ('fexp2', ('fmul', ('flog2', a), b)), 'options-lower_fpow'), # 2^(lg2(a)*b) = a^b Maybe update the comment to be # a^b = 2^(lg2(a)*b) instead? That way you're consistent with the other rewrite-rules that seems to list the source expression first and the replacement second... (('fexp2', ('fmul', ('flog2', a), b)), ('fpow', a, b), '!options-lower_fpow'), # 2^(lg2(a)*b) = a^b (('fexp', ('fmul', ('flog', a), b)), ('fpow', a, b), '!options-lower_fpow'), # e^(ln(a)*b) = a^b (('fpow', a, 1.0), a), -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Expected wide line rendering with clipping
On Fri, Feb 6, 2015 at 1:11 PM, Iago Toral ito...@igalia.com wrote: Hi, Eduardo and I have been looking into a few dEQP test failures that deal with wide line rendering. There are a few of them that fail because of how clipping is implemented for this case. The problem in these cases seems to be that the hw renders the wide line as a parallelogram so that if an edge of the parallelogram is partially clipped, the other parallel edge will also be clipped at the same height/width so that the resulting wide line stays a parallelogram. The dEQP renders as if a wide line where a collection of 1px lines, so cliping one edge of the resulting line does not have implications for the other edge. This ASCII art illustrates the problem (* represent line pixels, | represents the viewport's rightmost edge): Expected by dEQP i965 rendering | | *| | **| | ***| | | | | | | | Is that drawing correct? I would expect it to look like this: | | *| | **|**|** ***| ***|* | | | | | | From the drawing, it looks like the line is exactly 45 degrees, so I believe this particular case is actually undefined, because of the following spec wording: Line segment rasterization begins by characterizing the segment as either x-major or y-major. x-major line segments have slope in the closed interval [−1, 1]; all other line segments are y-major (slope is determined by the segment’s endpoints). A 45 degree line should have a slope of 1, and should in theory be identified as an x-major line, giving the result dEQP expects (i965 is in this case rendering the line as similar to y-major, except y-major rendering should also generate fragments outside the viewport, as pointed out in my drawing above). However, since we're exactly at the break-off point, previous calculations are allowed to be off by some small value, leaving the characterization undefined. So I think in this case, both results are allowed, and the test is therefore somewhat bogus. If the line was tilted a tiny amount so it's guaranteed to be characterized as an x-major, then it should be correct again. Not having access to the tests myself, they might already have done this, though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] nir: Add algebraic optimizations for comparisons with identical operands.
On Sat, Feb 7, 2015 at 6:16 AM, Eric Anholt e...@anholt.net wrote: No change on shader-db on i965. --- src/glsl/nir/nir_opt_algebraic.py | 9 + 1 file changed, 9 insertions(+) diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py index a5fe19a..0512a8f 100644 --- a/src/glsl/nir/nir_opt_algebraic.py +++ b/src/glsl/nir/nir_opt_algebraic.py @@ -83,6 +83,15 @@ optimizations = [ (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))), (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)), (('fmin', ('fmax', a, 0.0), 1.0), ('fsat', a)), + # Comparison with the same args. Note that these are not done for + # the float versions because float equality is used to test for + # NaN. I think this comment is a bit odd, as it kind of describes the wrong part of the story (although it helps you to get in the right direction). The issue isn't that float equality is used to test for NaN, but that IEEE 754 defines any equality-comparison with NaN to evaluate to false. And that's the reason why you *can* use float equality to check for NaN. This also means that we can still do 'lt' and 'gt' safely for floats. In fact, GCC does these optimizations: ---8--- int eq(float a) { return a == a; } int neq(float a) { return a != a; } int lt(float a) { return a a; } int lte(float a) { return a = a; } int gt(float a) { return a a; } int gte(float a) { return a = a; } ---8--- If you compile that program (even with optimizations disabled), GCC will make lt and gt simply return 0. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: Don't use __builtin_clrsb in util_last_bit().
On Tue, Feb 3, 2015 at 2:28 AM, Matt Turner matts...@gmail.com wrote: Unclear circumstances lead to undefined symbols on x86. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916 A comment[1] on the gentoo bug-tracker suggest doing something along these lines instead (untested)... [1]: https://bugs.gentoo.org/show_bug.cgi?id=536916#c5 diff --git a/configure.ac b/configure.ac index c2d775e..e6daf40 100644 --- a/configure.ac +++ b/configure.ac @@ -175,6 +175,7 @@ fi dnl Check for compiler builtins AX_GCC_BUILTIN([__builtin_bswap32]) AX_GCC_BUILTIN([__builtin_bswap64]) +AX_GCC_BUILTIN([__builtin_clrsb]) AX_GCC_BUILTIN([__builtin_clz]) AX_GCC_BUILTIN([__builtin_clzll]) AX_GCC_BUILTIN([__builtin_ctz]) diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h index 19c7343..4591830 100644 --- a/src/gallium/auxiliary/util/u_math.h +++ b/src/gallium/auxiliary/util/u_math.h @@ -561,7 +561,7 @@ util_last_bit(unsigned u) static INLINE unsigned util_last_bit_signed(int i) { -#if defined(__GNUC__) ((__GNUC__ * 100 + __GNUC_MINOR__) = 407) !defined(__INTEL_COMPILER) +#if defined(HAVE___BUILTIN_CLRSB) return 31 - __builtin_clrsb(i); #else if (i = 0) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev