[Mesa-dev] [Bug 80183] [llvmpipe] triangles with vertices that map to raster positions viewport width/height are not displayed
https://bugs.freedesktop.org/show_bug.cgi?id=80183 --- Comment #14 from cgerlac...@gmail.com --- The problem is also reproducible with softpipe. I understand your concern and I will try to provide some sample code to reproduce the clipping error. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: Take memset out of rzalloc_size()
On 06.05.2015 21:51, Rob Clark wrote:
> On Wed, May 6, 2015 at 1:24 PM, Kenneth Graunke kenn...@whitecape.org wrote:
>> On Wednesday, May 06, 2015 03:35:27 PM Juha-Pekka Heikkila wrote:
>>> rzalloc_size() calls ralloc_size() to allocate memory. ralloc_size()
>>> uses calloc to get memory, thus zeroing in rzalloc_size is not
>>> necessary.
>>>
>>> Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
>>> ---
>>>  src/util/ralloc.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/src/util/ralloc.c b/src/util/ralloc.c
>>> index 01719c8..09f5fcd 100644
>>> --- a/src/util/ralloc.c
>>> +++ b/src/util/ralloc.c
>>> @@ -132,8 +132,6 @@ void *
>>>  rzalloc_size(const void *ctx, size_t size)
>>>  {
>>>     void *ptr = ralloc_size(ctx, size);
>>> -   if (likely(ptr != NULL))
>>> -      memset(ptr, 0, size);
>>>     return ptr;
>>>  }
>>>
>>
>> Wow, I have no idea why I did that. This is certainly
>> counter-intuitive. rzalloc() is supposed to guarantee zeroed memory.
>> ralloc() is not, but it looks like it always has for some reason.
>>
>> I'm somewhat inclined to change ralloc_size() to use malloc instead of
>> calloc. I wonder how many things would break :)
>
> try the change conditionally ifndef DEBUG?? (abusing --enable-debug
> as a proxy for
> --im-actually-a-mesa-dev-and-want-to-see-the-crashes)

I did try putting malloc in place of calloc and saw basically almost all Piglit tests start to fail. A handful of tests still worked, but I also saw crashes in many different places, so I thought I would first suggest just taking the memset out. :)

/Juha-Pekka
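The point under discussion can be sketched in a few lines of C. This is only a minimal illustration, not Mesa's ralloc: `sketch_alloc()` and `sketch_zalloc()` are hypothetical stand-ins for ralloc_size() and rzalloc_size(), and the parent-context bookkeeping of the real functions is left out. It shows why a calloc-backed allocator makes a following memset redundant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ralloc_size(). Per the thread, the real
 * function gets its memory from calloc(), so the block comes back zeroed
 * even though zeroing is not part of its documented contract. */
static void *sketch_alloc(size_t size)
{
   return calloc(1, size);
}

/* Hypothetical stand-in for rzalloc_size() with the memset removed, as
 * the patch proposes. It is only correct while sketch_alloc() keeps
 * using calloc(). */
static void *sketch_zalloc(size_t size)
{
   return sketch_alloc(size);
}

/* Returns 1 if every byte in the block is zero, 0 otherwise. */
static int all_zero(const void *ptr, size_t size)
{
   const unsigned char *p = ptr;

   for (size_t i = 0; i < size; i++) {
      if (p[i] != 0)
         return 0;
   }
   return 1;
}
```

The trade-off raised in the replies is visible here: once the memset is gone, the zeroing guarantee depends on an implementation detail of the underlying allocator, and switching that allocator to malloc would silently drop it.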
Re: [Mesa-dev] [PATCH] glapi: Add positional argument specifier.
On Wed, May 6, 2015 at 4:35 PM, Ian Romanick i...@freedesktop.org wrote:
> On 05/06/2015 03:45 PM, Kenneth Graunke wrote:
>> On Wednesday, May 06, 2015 12:48:30 PM Vinson Lee wrote:
>>> Fix build error introduced with commit 1c5a57a "glapi/es3.1: Add
>>> support for GLES versions > 3.0" with Python < 2.7.
>>>
>>>   File "src/mapi/glapi/gen/gl_genexec.py", line 230, in <module>
>>>     printer.Print(api)
>>>   File "src/mapi/glapi/gen/gl_XML.py", line 120, in Print
>>>     self.printBody(api)
>>>   File "src/mapi/glapi/gen/gl_genexec.py", line 187, in printBody
>>>     condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {})'.format(int(f.api_map['es2'] * 10)))
>>> ValueError: zero length field name in format
>>>
>>> Signed-off-by: Vinson Lee v...@freedesktop.org
>>> ---
>>>  src/mapi/glapi/gen/gl_genexec.py |    2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py
>>> index e58cdfc..4e76fe3 100644
>>> --- a/src/mapi/glapi/gen/gl_genexec.py
>>> +++ b/src/mapi/glapi/gen/gl_genexec.py
>>> @@ -184,7 +184,7 @@ class PrintCode(gl_XML.gl_print_base):
>>>                      condition_parts.append('ctx->API == API_OPENGLES')
>>>                  if 'es2' in f.api_map:
>>>                      if f.api_map['es2'] > 2.0:
>>> -                        condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {})'.format(int(f.api_map['es2'] * 10)))
>>> +                        condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {0})'.format(int(f.api_map['es2'] * 10)))
>>>                      else:
>>>                          condition_parts.append('ctx->API == API_OPENGLES2')
>>>                  if not condition_parts:
>>
>> Do we actually care at this point?
>
> Depends on whether or not you care about CentOS 6 or whatever version
> RHEL it is based on. :(
>
> https://bugs.freedesktop.org/show_bug.cgi?id=90346

This patch does not fix bug 90346. DispatchSanity_test.GLES2 still fails on CentOS 6.

>> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 2/2] glx: provide a way to disable DRI3 using an environment variable
On 06/05/15 19:47, Axel Davy wrote:
> On 06/05/2015 14:43, Martin Peres wrote:
>> diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
>> index ff77a91..5246737 100644
>> --- a/src/glx/dri3_glx.c
>> +++ b/src/glx/dri3_glx.c
>> @@ -2092,6 +2092,11 @@ dri3_create_display(Display * dpy)
>>     xcb_generic_error_t *error;
>>     const xcb_query_extension_reply_t *extension;
>>
>> +   if (getenv("MESA_GLX_DRI3_DISABLE")) {
>> +      ErrorMessageF("DRI3 disabled by the environment\n");
>> +      return NULL;
>> +   }
>> +
>>     xcb_prefetch_extension_data(c, xcb_dri3_id);
>>     xcb_prefetch_extension_data(c, xcb_present_id);
>>
>
> There is already a LIBGL_DRI3_DISABLE env var.
> Does this one bring something different?
>
> Yours,
>
> Axel Davy

Thanks Axel! I heard that there was such a variable, but no one could remember the name. I looked for it in the wrong place, it would seem!

Let's drop this patch for the moment. If the variable works as expected, I would suggest documenting it in envvars.html :)
Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option
On 05.05.2015 01:47, Tom Stellard wrote: On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote: On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote: On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote: On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote: The standard ICD file path is /etc/OpenCL/vendor/. However it doesn't fit well with custom build. This option allow ICD vendor file installation path override --- configure.ac | 6 ++ src/gallium/targets/opencl/Makefile.am | 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 095e23e..bf08d76 100644 --- a/configure.ac +++ b/configure.ac @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir], [D3D_DRIVER_INSTALL_DIR=$withval], [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d]) AC_SUBST([D3D_DRIVER_INSTALL_DIR]) +AC_ARG_WITH([icd-file-dir], +[AS_HELP_STRING([--with-icd-file-dir=DIR], +[directory for the OpenCL ICD vendor file @:@/etc/OpenCL/vendors@:@])], +[ICD_FILE_INSTALL_DIR=$withval], +[ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors]) What about making this default to ${sysconfdir}/OpenCL/vendors ? That way using --prefix should auto-make it go into the prefix instead of unexpectedly installing things outside of the specified prefix? That way a distro build which specifies --sysconfdir as /etc will get it in the right place, while by default it'll go into /usr/local/etc and a user can override the icd loader's default behaviour with OPENCL_VENDOR_PATH? I would prefer not to make this the default behavior, because it violates the spec and there could potentially be multiple icd implementations, which may or may not have the overrides. I think the best solution would be to rename the option to something like --enable-ocl-icd-respect-prefix (suggestions for other names encouraged). and have the option enable the behavior that Ilia is describing. This will give distros and advanced users a way to setup their system the way they want. 
It's just a very anti-autoconf thing to do to have make install fail by default unless you specify some hey, i actually want make install to work option. I think it's crazy to expect that, by default, people will want to write over their system installs, and having things go outside of the specified --prefix is very surprising (unless you force some other option). And asking the user to run make install as root is even crazier. My expectation is that, by default, when people specify --enable-opencl-icd they want an implementation that conforms to the specification. Unfortunately, this means installing icd files to /etc. There is no good solution here, but I'd rather have users specify a flag to get a sane build system, than requiring them to set a flag and set an environment variable just to get working OpenCL with the ICD loader. I guess I haven't hit this yet because there's no OpenCL support in nouveau or freedreno, but I made the same stink about vdpau when Emil tried to make it install to some system location by default. At least a few people seemed to agree with me back then... Does the vdpau spec also require installation to a specific system director (e.g. /etc/) ? Tom, I think ensuring that the OpenCL ICD loader can pick up the mesa.icd file is something for the distributor / administrator / user to worry about, not Mesa upstream. There's a similar situation with the drirc file, which is installed inside the prefix by default but only read from /etc/. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Thu, May 7, 2015 at 11:06 AM, Neil Roberts n...@linux.intel.com wrote:
> Commit 94ee908448 added a header size parameter to the function to
> create the LOAD_PAYLOAD instruction. However this broke
> opt_sampler_eot which manually constructs the instruction and so
> wasn't setting the header_size. This ends up making the parameters for
> the send message all have the wrong location and it all falls apart.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 3bf5866..02a1ad5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
>                                    load_payload->sources + 1);
>
>     new_load_payload->regs_written = load_payload->regs_written + 1;
> +   new_load_payload->header_size = 1;
>     tex_inst->mlen++;
>     tex_inst->header_size = 1;
>     tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
> --
> 1.9.3
Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd
Le 2015-05-07 18:55, Aaron Watry a écrit : I'm not sure what the final consensus will be on how to do this, but FWIW: Tested-By: Aaron Watry awa...@gmail.com I've tested this with 4 combinations: no --with-opencl-icd option specified : libOpenCL.so gets installed in ${prefix}/lib --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib --with-opencl-icd=standard : libMesaOpenCL.so installed in ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in ${prefix}/lib, icd in ${prefix}/etc//mesa.icd. I only specified --prefix, no other directories overridden in configure command. thanks EdB --Aaron On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote: The standard ICD file path is /etc/OpenCL/vendor/. However it doesn't fit well with custom build. This option allow ICD vendor file installation path override --- configure.ac [1] | 46 +++--- src/gallium/targets/opencl/Makefile.am | 2 +- 2 files changed, 33 insertions(+), 15 deletions(-) diff --git a/configure.ac [1] b/configure.ac [1] index 095e23e..90dba4e 100644 --- a/configure.ac [1] +++ b/configure.ac [1] @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl], [enable OpenCL library @:@default=disabled@:@])], [enable_opencl=$enableval], [enable_opencl=no]) -AC_ARG_ENABLE([opencl_icd], - [AS_HELP_STRING([--enable-opencl-icd], - [Build an OpenCL ICD library to be loaded by an ICD implementation - @:@default=disabled@:@])], - [enable_opencl_icd=$enableval], - [enable_opencl_icd=no]) AC_ARG_ENABLE([xlib-glx], [AS_HELP_STRING([--enable-xlib-glx], [make GLX library Xlib-based instead of DRI-based @:@default=disabled@:@])], @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers enable_gallium_loader=yes - if test x$enable_opencl_icd = xyes; then - OPENCL_LIBNAME=MesaOpenCL - else - OPENCL_LIBNAME=OpenCL - fi - if test x$have_libelf != xyes; then AC_MSG_ERROR([Clover 
requires libelf]) fi fi AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes) -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes) -AC_SUBST([OPENCL_LIBNAME]) dnl dnl Gallium configuration @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir], [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d]) AC_SUBST([D3D_DRIVER_INSTALL_DIR]) +dnl OpenCL ICD + +AC_ARG_WITH([opencl-icd], + [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@], + [Build an OpenCL ICD library to be loaded by an ICD implementation. + If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors. + @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors + @:@default=no@:@])], + [OPENCL_ICD=$withval], + [OPENCL_ICD=no]) + +case x$OPENCL_ICD in +xno) + OPENCL_LIBNAME=OpenCL + ;; +xstandard) + OPENCL_LIBNAME=MesaOpenCL + ICD_FILE_DIR=/etc/OpenCL/vendors + ;; +xsysconfdir) + OPENCL_LIBNAME=MesaOpenCL + ICD_FILE_DIR=$sysconfdir/OpenCL/vendors + ;; +*) + AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd]) + ;; +esac + +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno) +AC_SUBST([OPENCL_LIBNAME]) +AC_SUBST([ICD_FILE_DIR]) + dnl dnl Gallium helper functions dnl diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index 5daf327..781daa0 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym EXTRA_DIST = mesa.icd opencl.sym if HAVE_CLOVER_ICD -icddir = /etc/OpenCL/vendors/ +icddir = $(ICD_FILE_DIR) icd_DATA = mesa.icd endif -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev [2] Links: -- [1] http://configure.ac [2] http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection
On Thursday, May 07, 2015 04:39:14 PM Topi Pohjolainen wrote:
> Normally this is always needed, but for internal blits and clears we
> need to be able to disable it.
>
> CC: Kenneth Graunke kenn...@whitecape.org
> Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable
On Thursday, May 07, 2015 05:34:13 PM Martin Peres wrote:
> Suggested-by: Axel Davy axel.d...@ens.fr
> Signed-off-by: Martin Peres martin.pe...@intel.linux.com
> ---
>  docs/envvars.html | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/docs/envvars.html b/docs/envvars.html
> index 31d14a4..c0d5a51 100644
> --- a/docs/envvars.html
> +++ b/docs/envvars.html
> @@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
>  <li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
>  <li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers calls per second.
> +<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
>  </ul>
>

Documentation?!? :) Always nice to have.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
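What "the value does not matter" means in practice can be illustrated with a presence-only check. This is a hedged sketch modelled on the getenv() test in the dropped MESA_GLX_DRI3_DISABLE patch; `env_is_set()` and `dri3_disabled_by_env()` are hypothetical helper names, and that Mesa's own LIBGL_DRI3_DISABLE handling has exactly this shape is an assumption based on the documentation line above.

```c
/* Request POSIX declarations so setenv()/unsetenv() are visible to
 * callers even in strict C modes. */
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 when the variable exists in the
 * environment, whatever its value (even "0" or the empty string). */
static int env_is_set(const char *name)
{
   return getenv(name) != NULL;
}

/* Hypothetical wrapper with the same shape as the check in the dropped
 * patch, applied to the documented variable name. */
static int dri3_disabled_by_env(void)
{
   return env_is_set("LIBGL_DRI3_DISABLE");
}
```

With a check of this kind, running a program with LIBGL_DRI3_DISABLE=0 would still disable DRI3; only leaving the variable unset keeps it enabled.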
Re: [Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null
On Thu, May 7, 2015 at 6:20 AM, Neil Roberts n...@linux.intel.com wrote:
> opt_sampler_eot enables a direct write to framebuffer from a sample.
> In order to do this the sample message needs to have a message header
> so if there wasn't one already then the function adds one. In addition
> the function sets the destination register to null because it's no
> longer used. However it was only doing this in cases where it was
> adding a message header. This patch just moves setting the destination
> so that it happens even if there's a message header. In practice this
> doesn't seem to make any difference but it's a bit cleaner.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 1ca7ca6..72d408b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()
>     tex_inst->offset |= fb_write->target << 24;
>     tex_inst->eot = true;
> +   tex_inst->dst = reg_null_ud;
>     fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);
>
>     /* If a header is present, marking the eot is sufficient. Otherwise, we need
> @@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
>     tex_inst->header_present = true;
>     tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
>     tex_inst->src[0] = send_header;
> -   tex_inst->dst = reg_null_ud;
>
>     return true;
>  }
> --
> 1.9.3

LGTM.
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
[Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot
Commit 94ee908448 added a header size parameter to the function to create the LOAD_PAYLOAD instruction. However this broke opt_sampler_eot which manually constructs the instruction and so wasn't setting the header_size. This ends up making the parameters for the send message all have the wrong location and it all falls apart.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3bf5866..02a1ad5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
                                   load_payload->sources + 1);

    new_load_payload->regs_written = load_payload->regs_written + 1;
+   new_load_payload->header_size = 1;
    tex_inst->mlen++;
    tex_inst->header_size = 1;
    tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
--
1.9.3
Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option
On Thu, May 07, 2015 at 04:59:41PM +0900, Michel Dänzer wrote: On 05.05.2015 01:47, Tom Stellard wrote: On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote: On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote: On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote: On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote: The standard ICD file path is /etc/OpenCL/vendor/. However it doesn't fit well with custom build. This option allow ICD vendor file installation path override --- configure.ac | 6 ++ src/gallium/targets/opencl/Makefile.am | 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 095e23e..bf08d76 100644 --- a/configure.ac +++ b/configure.ac @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir], [D3D_DRIVER_INSTALL_DIR=$withval], [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d]) AC_SUBST([D3D_DRIVER_INSTALL_DIR]) +AC_ARG_WITH([icd-file-dir], +[AS_HELP_STRING([--with-icd-file-dir=DIR], +[directory for the OpenCL ICD vendor file @:@/etc/OpenCL/vendors@:@])], +[ICD_FILE_INSTALL_DIR=$withval], +[ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors]) What about making this default to ${sysconfdir}/OpenCL/vendors ? That way using --prefix should auto-make it go into the prefix instead of unexpectedly installing things outside of the specified prefix? That way a distro build which specifies --sysconfdir as /etc will get it in the right place, while by default it'll go into /usr/local/etc and a user can override the icd loader's default behaviour with OPENCL_VENDOR_PATH? I would prefer not to make this the default behavior, because it violates the spec and there could potentially be multiple icd implementations, which may or may not have the overrides. I think the best solution would be to rename the option to something like --enable-ocl-icd-respect-prefix (suggestions for other names encouraged). and have the option enable the behavior that Ilia is describing. 
This will give distros and advanced users a way to setup their system the way they want. It's just a very anti-autoconf thing to do to have make install fail by default unless you specify some hey, i actually want make install to work option. I think it's crazy to expect that, by default, people will want to write over their system installs, and having things go outside of the specified --prefix is very surprising (unless you force some other option). And asking the user to run make install as root is even crazier. My expectation is that, by default, when people specify --enable-opencl-icd they want an implementation that conforms to the specification. Unfortunately, this means installing icd files to /etc. There is no good solution here, but I'd rather have users specify a flag to get a sane build system, than requiring them to set a flag and set an environment variable just to get working OpenCL with the ICD loader. I guess I haven't hit this yet because there's no OpenCL support in nouveau or freedreno, but I made the same stink about vdpau when Emil tried to make it install to some system location by default. At least a few people seemed to agree with me back then... Does the vdpau spec also require installation to a specific system director (e.g. /etc/) ? Tom, I think ensuring that the OpenCL ICD loader can pick up the mesa.icd file is something for the distributor / administrator / user to worry about, not Mesa upstream. I don't really disagree with this in general. My position is that when there is a situation where it is impossible to follow both the API spec and build system best practices that it is more important to follow the API spec. I realize some people disagree with this, and I completely understand their rationale. For this particular situation, I'm happy with any solution that: 1. Allows a user to install the icd file to /etc if he or she wants to. and 2. 
Does not require the user to read the spec to know that /etc is the correct place to install it. I think EdB's latest patch is a good solution: http://lists.freedesktop.org/archives/mesa-dev/2015-May/083661.html -Tom There's a similar situation with the drirc file, which is installed inside the prefix by default but only read from /etc/. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot
On Thu, May 7, 2015 at 11:06 AM, Neil Roberts n...@linux.intel.com wrote:
> Commit 94ee908448 added a header size parameter to the function to
> create the LOAD_PAYLOAD instruction. However this broke
> opt_sampler_eot which manually constructs the instruction and so
> wasn't setting the header_size. This ends up making the parameters for
> the send message all have the wrong location and it all falls apart.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 3bf5866..02a1ad5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
>                                    load_payload->sources + 1);
>
>     new_load_payload->regs_written = load_payload->regs_written + 1;
> +   new_load_payload->header_size = 1;
>     tex_inst->mlen++;
>     tex_inst->header_size = 1;
>     tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
> --
> 1.9.3

Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
[Mesa-dev] [PATCH] i965/wm/gen7: Refactor state setup
CC: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h     |  9 +++
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 98 ---
 2 files changed, 74 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 26fdae6..5a52a74 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -264,6 +264,15 @@ void brw_update_renderbuffer_surfaces(struct brw_context *brw,

 /* gen7_wm_state.c */
 void
+gen7_upload_wm_state(struct brw_context *brw,
+                     const struct gl_program *fp,
+                     const struct brw_wm_prog_data *prog_data,
+                     bool multisampled_fbo, int min_inv_per_frag,
+                     bool kill_enable, bool color_buffer_write_enable,
+                     bool msaa_enabled, bool statistic_enable,
+                     bool line_stipple_enable, bool polygon_stipple_enable);
+
+void
 gen7_upload_ps_state(struct brw_context *brw,
                      const struct gl_fragment_program *fp,
                      const struct brw_stage_state *stage_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index b918275..b3fa5be 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -32,63 +32,53 @@
 #include "program/prog_statevars.h"
 #include "intel_batchbuffer.h"

-static void
-upload_wm_state(struct brw_context *brw)
+void
+gen7_upload_wm_state(struct brw_context *brw,
+                     const struct gl_program *fp,
+                     const struct brw_wm_prog_data *prog_data,
+                     bool multisampled_fbo, int min_inv_per_frag,
+                     bool kill_enable, bool color_buffer_write_enable,
+                     bool msaa_enabled, bool statistic_enable,
+                     bool line_stipple_enable, bool polygon_stipple_enable)
 {
-   struct gl_context *ctx = &brw->ctx;
-   /* BRW_NEW_FRAGMENT_PROGRAM */
-   const struct brw_fragment_program *fp =
-      brw_fragment_program_const(brw->fragment_program);
-   /* BRW_NEW_FS_PROG_DATA */
-   const struct brw_wm_prog_data *prog_data = brw->wm.prog_data;
    bool writes_depth = prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
    uint32_t dw1, dw2;

-   /* _NEW_BUFFERS */
-   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
-
    dw1 = dw2 = 0;
-   dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+      dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
    dw1 |= GEN7_WM_LINE_AA_WIDTH_1_0;
    dw1 |= GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5;

-   /* _NEW_LINE */
-   if (ctx->Line.StippleFlag)
+   if (line_stipple_enable)
       dw1 |= GEN7_WM_LINE_STIPPLE_ENABLE;

-   /* _NEW_POLYGON */
-   if (ctx->Polygon.StippleFlag)
+   if (polygon_stipple_enable)
       dw1 |= GEN7_WM_POLYGON_STIPPLE_ENABLE;

-   if (fp->program.Base.InputsRead & VARYING_BIT_POS)
+   if (fp->InputsRead & VARYING_BIT_POS)
       dw1 |= GEN7_WM_USES_SOURCE_DEPTH | GEN7_WM_USES_SOURCE_W;

    dw1 |= prog_data->computed_depth_mode << GEN7_WM_COMPUTED_DEPTH_MODE_SHIFT;
    dw1 |= prog_data->barycentric_interp_modes <<
       GEN7_WM_BARYCENTRIC_INTERPOLATION_MODE_SHIFT;

-   /* _NEW_COLOR, _NEW_MULTISAMPLE */
-   /* Enable if the pixel shader kernel generates and outputs oMask.
-    */
-   if (prog_data->uses_kill || ctx->Color.AlphaEnabled ||
-       ctx->Multisample.SampleAlphaToCoverage ||
-       prog_data->uses_omask) {
+   if (kill_enable)
       dw1 |= GEN7_WM_KILL_ENABLE;
-   }

-   /* _NEW_BUFFERS | _NEW_COLOR */
-   if (brw_color_buffer_write_enabled(brw) || writes_depth ||
-       dw1 & GEN7_WM_KILL_ENABLE) {
+   if (color_buffer_write_enable || writes_depth ||
+       dw1 & GEN7_WM_KILL_ENABLE)
       dw1 |= GEN7_WM_DISPATCH_ENABLE;
-   }
+
    if (multisampled_fbo) {
-      /* _NEW_MULTISAMPLE */
-      if (ctx->Multisample.Enabled)
+      if (msaa_enabled)
          dw1 |= GEN7_WM_MSRAST_ON_PATTERN;
       else
          dw1 |= GEN7_WM_MSRAST_OFF_PIXEL;

-      if (_mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false) > 1)
+      if (min_inv_per_frag > 1)
          dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
       else
          dw2 |= GEN7_WM_MSDISPMODE_PERPIXEL;
@@ -97,9 +87,8 @@ upload_wm_state(struct brw_context *brw)
       dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
    }

-   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN) {
+   if (fp->SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
       dw1 |= GEN7_WM_USES_INPUT_COVERAGE_MASK;
-   }

    BEGIN_BATCH(3);
    OUT_BATCH(_3DSTATE_WM << 16 | (3 - 2));
@@ -108,6 +97,49 @@ upload_wm_state(struct brw_context *brw)
    ADVANCE_BATCH();
 }

+static void
+upload_wm_state(struct brw_context *brw)
+{
+   struct gl_context *ctx = &brw->ctx;
+   /* BRW_NEW_FRAGMENT_PROGRAM */
+   const struct brw_fragment_program *fp =
Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
On 05/07/2015 12:57 AM, Marta Lofstedt wrote: From: Marta Lofstedt marta.lofst...@intel.com v2: only expose enums from GL_ARB_shader_image_load_store for gles 3.1 and GL core Signed-off-by: Marta Lofstedt marta.lofst...@intel.com --- src/mesa/main/get.c | 6 ++ src/mesa/main/get_hash_params.py | 17 - 2 files changed, 14 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 9898197..73739b6 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = { EXTRA_END }; +static const int extra_ARB_shader_image_load_store_es31[] = { + EXT(ARB_shader_image_load_store), + EXTRA_API_ES31, I think you're missing the patch that adds EXTRA_API_ES31. Did you forget to send that one out? Also, on a few of these patches, I think the old, non-_es31 set of requirements can be removed due to no longer being used. + EXTRA_END +}; + EXTRA_EXT(ARB_texture_cube_map); EXTRA_EXT(EXT_texture_array); EXTRA_EXT(NV_fog_distance); diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 513d5d2..85c2494 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -413,6 +413,14 @@ descriptor=[ { apis: [GL_CORE, GLES3], params: [ # GL_ARB_draw_indirect / GLES 3.1 [ DRAW_INDIRECT_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, extra_ARB_draw_indirect_es31 ], +# GL_ARB_shader_image_load_store / GLES 3.1 + [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), extra_ARB_shader_image_load_store_es31], + [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store_es31], + [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store_es31], + [ MAX_VERTEX_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_FRAGMENT_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_COMBINED_IMAGE_UNIFORMS, CONTEXT_INT(Const.MaxCombinedImageUniforms), extra_ARB_shader_image_load_store_es31], ]}, # Remaining enums are only in OpenGL @@ -780,15 +788,6 @@ descriptor=[ [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ], [ MAX_VERTEX_ATTRIB_BINDINGS, CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ], -# GL_ARB_shader_image_load_store - [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), extra_ARB_shader_image_load_store], - [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store], - [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store], - [ MAX_VERTEX_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), extra_ARB_shader_image_load_store], - [ MAX_GEOMETRY_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), extra_ARB_shader_image_load_store_and_geometry_shader], - [ MAX_FRAGMENT_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), extra_ARB_shader_image_load_store], - [ MAX_COMBINED_IMAGE_UNIFORMS, CONTEXT_INT(Const.MaxCombinedImageUniforms), extra_ARB_shader_image_load_store], - # GL_ARB_compute_shader [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader ], [ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), extra_ARB_compute_shader ], ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/13] SQUASH: nir: Update various components for the new list-based use/def sets
On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote: --- src/glsl/nir/nir_from_ssa.c | 11 +-- src/glsl/nir/nir_lower_locals_to_regs.c | 14 ++ src/glsl/nir/nir_lower_to_source_mods.c | 20 src/glsl/nir/nir_lower_vars_to_ssa.c| 3 ++- src/glsl/nir/nir_opt_gcm.c | 14 ++ src/glsl/nir/nir_opt_global_to_local.c | 13 ++--- src/glsl/nir/nir_opt_peephole_ffma.c| 9 - src/glsl/nir/nir_opt_peephole_select.c | 10 -- src/glsl/nir/nir_to_ssa.c | 19 ++- 9 files changed, 55 insertions(+), 58 deletions(-) diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c index 5e7deca..94d1ced 100644 --- a/src/glsl/nir/nir_from_ssa.c +++ b/src/glsl/nir/nir_from_ssa.c @@ -345,6 +345,7 @@ isolate_phi_nodes_block(nir_block *block, void *void_state) nir_parallel_copy_entry *entry = rzalloc(state-dead_ctx, nir_parallel_copy_entry); + entry-src.parent_instr = pcopy-instr; I don't think this change, or the one immediately below, are needed since nir_instr_rewrite_uses() will already set the parent_instr. 
nir_ssa_dest_init(pcopy-instr, entry-dest, phi-dest.ssa.num_components, src-src.ssa-name); exec_list_push_tail(pcopy-entries, entry-node); @@ -358,6 +359,7 @@ isolate_phi_nodes_block(nir_block *block, void *void_state) nir_parallel_copy_entry *entry = rzalloc(state-dead_ctx, nir_parallel_copy_entry); + entry-src.parent_instr = block_pcopy-instr; nir_ssa_dest_init(block_pcopy-instr, entry-dest, phi-dest.ssa.num_components, phi-dest.ssa.name); exec_list_push_tail(block_pcopy-entries, entry-node); @@ -503,7 +505,7 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state) } nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg), state-mem_ctx); - assert(def-uses-entries == 0 def-if_uses-entries == 0); + assert(list_empty(def-uses) list_empty(def-if_uses)); if (def-parent_instr-type == nir_instr_type_ssa_undef) return true; @@ -515,12 +517,9 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state) */ nir_dest *dest = exec_node_data(nir_dest, def, ssa); - _mesa_set_destroy(dest-ssa.uses, NULL); - _mesa_set_destroy(dest-ssa.if_uses, NULL); - *dest = nir_dest_for_reg(reg); - - _mesa_set_add(reg-defs, state-instr); + dest-reg.parent_instr = state-instr; + list_addtail(dest-reg.def_link, reg-defs); return true; } diff --git a/src/glsl/nir/nir_lower_locals_to_regs.c b/src/glsl/nir/nir_lower_locals_to_regs.c index bc6a3d3..28fdec5 100644 --- a/src/glsl/nir/nir_lower_locals_to_regs.c +++ b/src/glsl/nir/nir_lower_locals_to_regs.c @@ -269,18 +269,16 @@ lower_locals_to_regs_block(nir_block *block, void *void_state) static nir_block * compute_reg_usedef_lca(nir_register *reg) { - struct set_entry *entry; nir_block *lca = NULL; - set_foreach(reg-defs, entry) - lca = nir_dominance_lca(lca, ((nir_instr *)entry-key)-block); + list_for_each_entry(nir_dest, def_dest, reg-defs, reg.def_link) + lca = nir_dominance_lca(lca, def_dest-reg.parent_instr-block); - set_foreach(reg-uses, entry) - lca = nir_dominance_lca(lca, ((nir_instr *)entry-key)-block); + list_for_each_entry(nir_src, use_src, reg-uses, 
use_link) + lca = nir_dominance_lca(lca, use_src-parent_instr-block); - set_foreach(reg-if_uses, entry) { - nir_if *if_stmt = (nir_if *)entry-key; - nir_cf_node *prev_node = nir_cf_node_prev(if_stmt-cf_node); + list_for_each_entry(nir_src, use_src, reg-if_uses, use_link) { + nir_cf_node *prev_node = nir_cf_node_prev(use_src-parent_if-cf_node); assert(prev_node-type == nir_cf_node_block); lca = nir_dominance_lca(lca, nir_cf_node_as_block(prev_node)); } diff --git a/src/glsl/nir/nir_lower_to_source_mods.c b/src/glsl/nir/nir_lower_to_source_mods.c index 7b4a0f6..94c7e36 100644 --- a/src/glsl/nir/nir_lower_to_source_mods.c +++ b/src/glsl/nir/nir_lower_to_source_mods.c @@ -88,8 +88,8 @@ nir_lower_to_source_mods_block(nir_block *block, void *state) alu-src[i].swizzle[j] = parent-src[0].swizzle[alu-src[i].swizzle[j]]; } - if (parent-dest.dest.ssa.uses-entries == 0 - parent-dest.dest.ssa.if_uses-entries == 0) + if (list_empty(parent-dest.dest.ssa.uses) + list_empty(parent-dest.dest.ssa.if_uses)) nir_instr_remove(parent-instr); } @@ -131,13 +131,13 @@ nir_lower_to_source_mods_block(nir_block *block, void *state) if (nir_op_infos[alu-op].output_type != nir_type_float) continue; - if (alu-dest.dest.ssa.if_uses-entries != 0) + if
[Mesa-dev] [PATCH 4/5] clover: Add a mutex to guard queue::queued_events
This fixes a potential crash that can occur with a sequence like this:

Thread 0: Check if queue is not empty.
Thread 1: Remove item from queue, making it empty.
Thread 0: Do something assuming queue is not empty.
---
 src/gallium/state_trackers/clover/core/queue.cpp | 2 ++
 src/gallium/state_trackers/clover/core/queue.hpp | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/queue.cpp b/src/gallium/state_trackers/clover/core/queue.cpp
index 24f9326..87f9dcc 100644
--- a/src/gallium/state_trackers/clover/core/queue.cpp
+++ b/src/gallium/state_trackers/clover/core/queue.cpp
@@ -44,6 +44,7 @@ command_queue::flush() {
    pipe_screen *screen = device().pipe;
    pipe_fence_handle *fence = NULL;
 
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
    if (!queued_events.empty()) {
       pipe->flush(pipe, &fence, 0);
 
@@ -69,6 +70,7 @@ command_queue::profiling_enabled() const {
 
 void
 command_queue::sequence(hard_event &ev) {
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
    if (!queued_events.empty())
       queued_events.back()().chain(ev);
 
diff --git a/src/gallium/state_trackers/clover/core/queue.hpp b/src/gallium/state_trackers/clover/core/queue.hpp
index b7166e6..bddb86c 100644
--- a/src/gallium/state_trackers/clover/core/queue.hpp
+++ b/src/gallium/state_trackers/clover/core/queue.hpp
@@ -24,6 +24,7 @@
 #define CLOVER_CORE_QUEUE_HPP
 
 #include <deque>
+#include <mutex>
 
 #include "core/object.hpp"
 #include "core/context.hpp"
@@ -69,6 +70,7 @@ namespace clover {
 
       cl_command_queue_properties props;
       pipe_context *pipe;
+      std::mutex queued_events_mutex;
       std::deque<intrusive_ref<hard_event>> queued_events;
    };
 }
-- 
2.0.4
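
The sequence in this commit message is a check-then-act race: the emptiness test and the access that relies on it have to happen under one lock. A minimal sketch of that idea follows. The class `guarded_queue` and its methods are hypothetical names for illustration, not clover's `command_queue`.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a queue shared between threads. The names
// guarded_queue, push, pop_front and back_or are illustrative only.
class guarded_queue {
public:
   void push(int v) {
      std::lock_guard<std::mutex> lock(mutex_);
      items_.push_back(v);
   }

   // Removes the oldest element if there is one; returns false otherwise.
   bool pop_front() {
      std::lock_guard<std::mutex> lock(mutex_);
      if (items_.empty())
         return false;
      items_.pop_front();
      return true;
   }

   // The emptiness check and the access to back() happen under the same
   // lock, so no other thread can empty the queue in between.
   int back_or(int fallback) {
      std::lock_guard<std::mutex> lock(mutex_);
      if (items_.empty())
         return fallback;
      return items_.back();
   }

private:
   std::mutex mutex_;
   std::deque<int> items_;
};

// Single-threaded usage check of the interface.
inline int guarded_queue_demo() {
   guarded_queue q;
   assert(q.back_or(-1) == -1);
   q.push(7);
   return q.back_or(-1);
}
```

Exposing a combined operation such as `back_or` instead of separate `empty()` and `back()` calls keeps callers from reintroducing the race outside the lock.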
[Mesa-dev] [PATCH 1/5] clover: Add threadsafe wrappers for pipe_screen and pipe_context
Events can be added to an OpenCL command queue concurrently from multiple threads, but pipe_context and pipe_screen objects are not threadsafe. The threadsafe wrappers protect all pipe_screen and pipe_context function calls with a mutex, so we can safely use them with multiple threads. --- src/gallium/state_trackers/clover/Makefile.am | 6 +- src/gallium/state_trackers/clover/Makefile.sources | 4 + src/gallium/state_trackers/clover/core/device.cpp | 2 + .../clover/core/pipe_threadsafe_context.c | 272 + .../clover/core/pipe_threadsafe_screen.c | 184 ++ .../state_trackers/clover/core/threadsafe.h| 39 +++ src/gallium/targets/opencl/Makefile.am | 3 +- 7 files changed, 508 insertions(+), 2 deletions(-) create mode 100644 src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c create mode 100644 src/gallium/state_trackers/clover/core/pipe_threadsafe_screen.c create mode 100644 src/gallium/state_trackers/clover/core/threadsafe.h diff --git a/src/gallium/state_trackers/clover/Makefile.am b/src/gallium/state_trackers/clover/Makefile.am index f46d9ef..8b615ae 100644 --- a/src/gallium/state_trackers/clover/Makefile.am +++ b/src/gallium/state_trackers/clover/Makefile.am @@ -1,5 +1,6 @@ AUTOMAKE_OPTIONS = subdir-objects +include $(top_srcdir)/src/gallium/Automake.inc include Makefile.sources AM_CPPFLAGS = \ @@ -32,6 +33,9 @@ cl_HEADERS = \ $(top_srcdir)/include/CL/opencl.h endif +AM_CFLAGS = \ + $(GALLIUM_CFLAGS) + noinst_LTLIBRARIES = libclover.la libcltgsi.la libclllvm.la libcltgsi_la_CXXFLAGS = \ @@ -58,6 +62,6 @@ libclover_la_CXXFLAGS = \ libclover_la_LIBADD = \ libcltgsi.la libclllvm.la -libclover_la_SOURCES = $(CPP_SOURCES) +libclover_la_SOURCES = $(CPP_SOURCES) $(C_SOURCES) EXTRA_DIST = Doxyfile diff --git a/src/gallium/state_trackers/clover/Makefile.sources b/src/gallium/state_trackers/clover/Makefile.sources index 10bbda0..90e6b7e 100644 --- a/src/gallium/state_trackers/clover/Makefile.sources +++ b/src/gallium/state_trackers/clover/Makefile.sources @@ -53,6 
+53,10 @@ CPP_SOURCES := \ util/range.hpp \ util/tuple.hpp +C_SOURCES := \ + core/pipe_threadsafe_context.c \ + core/pipe_threadsafe_screen.c + LLVM_SOURCES := \ llvm/invocation.cpp diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index 42b45b7..b145027 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -22,6 +22,7 @@ #include core/device.hpp #include core/platform.hpp +#include core/threadsafe.h #include pipe/p_screen.h #include pipe/p_state.h @@ -47,6 +48,7 @@ device::device(clover::platform platform, pipe_loader_device *ldev) : pipe-destroy(pipe); throw error(CL_INVALID_DEVICE); } + pipe = pipe_threadsafe_screen(pipe); } device::~device() { diff --git a/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c new file mode 100644 index 000..f08f56c --- /dev/null +++ b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c @@ -0,0 +1,272 @@ +/* + * Copyright 2015 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Authors: Tom Stellard thomas.stell...@amd.com + * + */ + +#include stdio.h + +/** + * \file + * + * threadsafe_context is a wrapper around a pipe_context to make it thread + * safe. + */ + +#include os/os_thread.h +#include pipe/p_context.h +#include util/u_memory.h + +#include threadsafe.h + + + +struct threadsafe_context { + struct pipe_context base; + struct pipe_context *ctx; + pipe_mutex mutex; +}; + +static struct pipe_context *unwrap(struct pipe_context *ctx) { + if (!ctx) + return
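
The wrapping approach this patch describes (same interface, every call forwarded under a mutex) can be sketched as below. This is only an illustration under stated assumptions: `mini_context`, `plain_context`, `threadsafe_mini_context` and `threadsafe_wrap` are hypothetical names, and the real `pipe_context` has many more entry points than the single `submit` used here.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of a Gallium-style interface: a struct of
// function pointers.
struct mini_context {
   int (*submit)(mini_context *ctx, int work);
};

// A backend that is not thread safe on its own.
struct plain_context {
   mini_context base;   // first member, so the cast below is valid
   int total;
};

static int plain_submit(mini_context *ctx, int work) {
   plain_context *p = reinterpret_cast<plain_context *>(ctx);
   p->total += work;
   return p->total;
}

// The wrapper exposes the same interface, holds the wrapped context, and
// serializes every call through one mutex.
struct threadsafe_mini_context {
   mini_context base;   // first member, so the cast below is valid
   mini_context *ctx;   // the wrapped, non-threadsafe context
   std::mutex mutex;
};

static int threadsafe_submit(mini_context *ctx, int work) {
   threadsafe_mini_context *ts =
      reinterpret_cast<threadsafe_mini_context *>(ctx);
   std::lock_guard<std::mutex> lock(ts->mutex);
   return ts->ctx->submit(ts->ctx, work);
}

// Fills in the wrapper so callers can use ts->base like any other context.
static void threadsafe_wrap(threadsafe_mini_context *ts, mini_context *inner) {
   assert(inner != nullptr);
   ts->base.submit = threadsafe_submit;
   ts->ctx = inner;
}
```

Callers keep using the `mini_context` pointer they already have, which is why the wrapper can be dropped in at a single point such as device creation.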
[Mesa-dev] [PATCH 3/5] clover: Fix a bug with multi-threaded events
It was possible for some events never to get triggered if one thread was creating events and another thread was waiting for them. This patch consolidates soft_event::wait() and hard_event::wait() into event::wait() so that hard_event objects will now wait for all their dependencies to be submitted before flushing the command queue.
---
 src/gallium/state_trackers/clover/core/event.cpp | 19 +++
 src/gallium/state_trackers/clover/core/event.hpp | 9 ++---
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
index 3c9336e..da227bb 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -39,6 +39,7 @@ event::~event() {
 void
 event::trigger() {
    if (!--wait_count) {
+      signalled_cv.notify_all();
       action_ok(*this);
 
       while (!_chain.empty()) {
@@ -73,6 +74,15 @@ event::chain(event &ev) {
    ev.deps.push_back(*this);
 }
 
+void
+event::wait() {
+   for (event &ev : deps)
+      ev.wait();
+
+   std::unique_lock<std::mutex> lock(signalled_mutex);
+   signalled_cv.wait(lock, [=]{ return signalled(); });
+}
+
 hard_event::hard_event(command_queue &q, cl_command_type command,
                        const ref_vector<event> &deps, action action) :
    event(q.context(), deps, profile(q, action), [](event &ev){}),
@@ -117,9 +127,11 @@ hard_event::command() const {
 }
 
 void
-hard_event::wait() const {
+hard_event::wait() {
    pipe_screen *screen = queue()->device().pipe;
 
+   event::wait();
+
    if (status() == CL_QUEUED)
       queue()->flush();
 
@@ -206,9 +218,8 @@ soft_event::command() const {
 }
 
 void
-soft_event::wait() const {
-   for (event &ev : deps)
-      ev.wait();
+soft_event::wait() {
+   event::wait();
 
    if (status() != CL_COMPLETE)
       throw error(CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp b/src/gallium/state_trackers/clover/core/event.hpp
index d407c80..dffafb9 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -23,6 +23,7 @@
 #ifndef CLOVER_CORE_EVENT_HPP
 #define CLOVER_CORE_EVENT_HPP
 
+#include <condition_variable>
 #include <functional>
 
 #include "core/object.hpp"
@@ -68,7 +69,7 @@ namespace clover {
       virtual cl_int status() const = 0;
       virtual command_queue *queue() const = 0;
       virtual cl_command_type command() const = 0;
-      virtual void wait() const = 0;
+      virtual void wait();
 
       virtual struct pipe_fence_handle *fence() const {
          return NULL;
@@ -87,6 +88,8 @@ namespace clover {
       action action_ok;
       action action_fail;
       std::vector<intrusive_ref<event>> _chain;
+      std::condition_variable signalled_cv;
+      std::mutex signalled_mutex;
    };
 
    ///
@@ -111,7 +114,7 @@ namespace clover {
       virtual cl_int status() const;
       virtual command_queue *queue() const;
       virtual cl_command_type command() const;
-      virtual void wait() const;
+      virtual void wait();
 
       const lazy<cl_ulong> &time_queued() const;
       const lazy<cl_ulong> &time_submit() const;
@@ -149,7 +152,7 @@ namespace clover {
       virtual cl_int status() const;
       virtual command_queue *queue() const;
       virtual cl_command_type command() const;
-      virtual void wait() const;
+      virtual void wait();
    };
 }
 
-- 
2.0.4
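
The waiting scheme in this patch (block on a condition variable until the event reports itself signalled) can be illustrated with a small sketch. `simple_event` is a hypothetical, simplified class and not clover's `event`. Unlike the patch, it updates the counter under the same mutex the waiter holds, which is the usual way to rule out a missed wakeup.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical, simplified event: it becomes signalled once trigger() has
// been called wait_count times.
class simple_event {
public:
   explicit simple_event(unsigned wait_count) : wait_count_(wait_count) {}

   bool signalled() {
      std::lock_guard<std::mutex> lock(mutex_);
      return wait_count_ == 0;
   }

   // Called once per satisfied dependency.
   void trigger() {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(wait_count_ > 0);
      if (--wait_count_ == 0)
         cv_.notify_all();
   }

   // Blocks until the event is signalled. The predicate form of wait()
   // re-checks the condition after every wakeup, so spurious wakeups and
   // a trigger() that ran before wait() are both handled.
   void wait() {
      std::unique_lock<std::mutex> lock(mutex_);
      cv_.wait(lock, [this] { return wait_count_ == 0; });
   }

private:
   std::mutex mutex_;
   std::condition_variable cv_;
   unsigned wait_count_;
};
```

If the last `trigger()` has already run, `wait()` returns immediately because the predicate is evaluated before blocking.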
[Mesa-dev] [PATCH 2/5] clover: Replace open-coded event::signalled()
This consolidates signalled checks into the same place.
---
 src/gallium/state_trackers/clover/core/event.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
index 58de888..3c9336e 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -66,7 +66,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
-   if (wait_count) {
+   if (!signalled()) {
       ev.wait_count++;
       _chain.push_back(ev);
    }
-- 
2.0.4
Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util
Isn't this the same as src/util/simple_list.h? On 04/27/2015 09:03 PM, Jason Ekstrand wrote: The linked list in gallium is pretty much the kernel list and we would like to have a C-based linked list for all of mesa. Let's not duplicate and just steal the gallium one. --- src/gallium/auxiliary/Makefile.sources | 1 - src/gallium/auxiliary/hud/hud_private.h| 2 +- .../auxiliary/pipebuffer/pb_buffer_fenced.c| 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c| 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c | 2 +- src/gallium/auxiliary/util/u_debug_flush.c | 2 +- src/gallium/auxiliary/util/u_debug_memory.c| 2 +- src/gallium/auxiliary/util/u_dirty_surfaces.h | 2 +- src/gallium/auxiliary/util/u_double_list.h | 146 - src/gallium/drivers/freedreno/freedreno_context.h | 2 +- src/gallium/drivers/freedreno/freedreno_query_hw.h | 2 +- src/gallium/drivers/freedreno/freedreno_resource.h | 2 +- src/gallium/drivers/ilo/ilo_common.h | 2 +- src/gallium/drivers/nouveau/nouveau_buffer.h | 2 +- src/gallium/drivers/nouveau/nouveau_fence.c| 2 - src/gallium/drivers/nouveau/nouveau_fence.h| 2 +- src/gallium/drivers/nouveau/nouveau_mm.c | 2 +- src/gallium/drivers/nouveau/nv30/nv30_screen.h | 2 +- src/gallium/drivers/nouveau/nv50/nv50_resource.h | 2 +- src/gallium/drivers/r600/compute_memory_pool.c | 2 +- src/gallium/drivers/r600/evergreen_compute.c | 2 +- src/gallium/drivers/r600/r600_llvm.c | 2 +- src/gallium/drivers/r600/r600_pipe.h | 2 +- src/gallium/drivers/radeon/r600_pipe_common.h | 2 +- src/gallium/drivers/radeon/radeon_vce.h| 2 +- src/gallium/drivers/svga/svga_context.h| 2 +- src/gallium/drivers/svga/svga_resource_buffer.h| 2 - .../drivers/svga/svga_resource_buffer_upload.c | 1 - src/gallium/drivers/svga/svga_screen_cache.h | 2 +- src/gallium/state_trackers/nine/basetexture9.h | 2 +- 
src/gallium/state_trackers/nine/device9.h | 2 +- src/gallium/state_trackers/nine/nine_state.h | 2 +- src/gallium/state_trackers/nine/surface9.h | 2 +- src/gallium/state_trackers/omx/vid_dec.h | 2 +- src/gallium/state_trackers/omx/vid_enc.h | 2 +- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +- .../winsys/svga/drm/pb_buffer_simple_fenced.c | 2 +- src/gallium/winsys/svga/drm/vmw_fence.c| 2 +- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +- src/util/Makefile.sources | 1 + src/util/list.h| 146 + 44 files changed, 184 insertions(+), 189 deletions(-) delete mode 100644 src/gallium/auxiliary/util/u_double_list.h create mode 100644 src/util/list.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index ec7547c..62e6b94 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -197,7 +197,6 @@ C_SOURCES := \ util/u_dirty_surfaces.h \ util/u_dl.c \ util/u_dl.h \ - util/u_double_list.h \ util/u_draw.c \ util/u_draw.h \ util/u_draw_quad.c \ diff --git a/src/gallium/auxiliary/hud/hud_private.h b/src/gallium/auxiliary/hud/hud_private.h index 1606ada..c74dc3b 100644 --- a/src/gallium/auxiliary/hud/hud_private.h +++ b/src/gallium/auxiliary/hud/hud_private.h @@ -29,7 +29,7 @@ #define HUD_PRIVATE_H #include pipe/p_context.h -#include util/u_double_list.h +#include util/list.h struct hud_graph { /* initialized by common code */ diff --git a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c index 9e0cace..7840467 100644 --- a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c +++ b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c @@ -46,7 +46,7 @@ #include util/u_debug.h #include os/os_thread.h #include util/u_memory.h -#include util/u_double_list.h +#include util/list.h #include pb_buffer.h #include pb_buffer_fenced.h diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c 
b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c index 5eb8d06..5023687 100644 --- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c +++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c @@ -38,7 +38,7 @@ #include util/u_debug.h #include os/os_thread.h #include util/u_memory.h -#include util/u_double_list.h +#include util/list.h #include util/u_time.h #include
Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions
On 05/05/2015 11:21 AM, Neil Roberts wrote:
> Jason Ekstrand ja...@jlekstrand.net writes:
>
>> +static inline bool list_empty(struct list_head *list)
>> +{
>> +   return list->next == list;
>> +}
>
> It would be good if list.h also included stdbool.h in order to get the
> declaration of bool. However, will that cause problems on MSVC? Is the
> Gallium code compiled on MSVC in general?
>
>> +static inline unsigned list_length(struct list_head *list)
>> +{
>> +   unsigned length = 0;
>> +   for (struct list_head *node = list->next; node != list; node = node->next)
>> +      length++;
>> +   return length;
>> +}
>
> Any reason not to use one of the list iterator macros here? Is it safe
> to use a C99-ism outside of a macro in this header? Maybe MSVC supports
> this particular C99-ism anyway.
>
> For what it's worth, I'm strongly in favour of using these kernel-style
> lists instead of exec_list. The kernel ones seem much less confusing.

Huh? They're practically identical. The only difference is the kernel-style lists have a single sentinel node, and that node is impossible to identify in a crowd. The exec_lists use two sentinel nodes, and those nodes have one pointer of overlapping storage (head and tail are the next and prev pointers of one node, and tail and tail_pred are the next and prev pointers of the other). I thought there was some ASCII art in list.h that showed this, but that appears to not be the case...

This gives some convenience that you can walk through a list from any node in the list without having a pointer to the list itself. I don't know if we still do, but there used to be a few places where we took advantage of that.

Some of the APIs are (very) poorly named (I'm looking at you, insert_before), and I'd welcome patches to fix that up.

> Regards,
> - Neil
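
For readers following the single-sentinel discussion, here is a self-contained sketch of a kernel-style circular list with the two helpers under review. It is written from scratch for illustration; `list_inithead` and `list_addtail` are modelled on the names used in the Gallium header, but their bodies here are an assumption and not copied from Mesa.

```cpp
#include <cassert>

// Kernel-style circular doubly linked list. The list head itself is the
// single sentinel node: an empty list points back at itself.
struct list_head {
   list_head *prev;
   list_head *next;
};

static inline void list_inithead(list_head *item) {
   item->prev = item;
   item->next = item;
}

// Links item in just before the sentinel, i.e. at the tail of the list.
static inline void list_addtail(list_head *item, list_head *list) {
   assert(item != list);
   item->next = list;
   item->prev = list->prev;
   list->prev->next = item;
   list->prev = item;
}

static inline bool list_empty(const list_head *list) {
   return list->next == list;
}

// Walks the ring once, stopping when it gets back to the sentinel.
static inline unsigned list_length(const list_head *list) {
   unsigned length = 0;
   for (const list_head *node = list->next; node != list; node = node->next)
      length++;
   return length;
}
```

Because the sentinel looks like any other node, a walk has to be told which node is the head; that is the trade-off against the two-sentinel exec_list layout described above.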
[Mesa-dev] [PATCH] main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE
The GL_VERTEX_BINDING_STRIDE case does not return its value type, which causes glGetIntegeri_v to fail.

Signed-off-by: Marta Lofstedt marta.lofst...@linux.intel.com
---
 src/mesa/main/get.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6fc0f3f..9fb8fba 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1959,6 +1959,7 @@ find_value_indexed(const char *func, GLenum pname, GLuint index, union value *v)
       if (index >= ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs)
          goto invalid_value;
       v->value_int = ctx->Array.VAO->VertexBinding[VERT_ATTRIB_GENERIC(index)].Stride;
+      return TYPE_INT;
 
    /* ARB_shader_image_load_store */
    case GL_IMAGE_BINDING_NAME: {
-- 
1.9.1
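
The failure mode here is a switch case that falls through into the next one. The sketch below models it with hypothetical names (`find_value_before`, `find_value_after`, `QUERY_STRIDE`, `QUERY_IMAGE_NAME`). The body of the second case is an assumption chosen to make the effect visible and is not Mesa's actual code.

```cpp
#include <cassert>

enum value_type { TYPE_INVALID, TYPE_INT };
enum query { QUERY_STRIDE, QUERY_IMAGE_NAME };

struct result {
   int value_int;
};

// Models the code before the fix: the first case stores a value but does
// not return, so control continues into the next case.
static value_type find_value_before(query q, bool have_image_ext, result *v) {
   assert(v != nullptr);
   switch (q) {
   case QUERY_STRIDE:
      v->value_int = 16;
      [[fallthrough]];  // stands in for the missing "return TYPE_INT;"
   case QUERY_IMAGE_NAME:
      if (!have_image_ext)
         return TYPE_INVALID;
      v->value_int = 0;
      return TYPE_INT;
   }
   return TYPE_INVALID;
}

// Models the code after the fix: each case returns its own type.
static value_type find_value_after(query q, bool have_image_ext, result *v) {
   assert(v != nullptr);
   switch (q) {
   case QUERY_STRIDE:
      v->value_int = 16;
      return TYPE_INT;
   case QUERY_IMAGE_NAME:
      if (!have_image_ext)
         return TYPE_INVALID;
      v->value_int = 0;
      return TYPE_INT;
   }
   return TYPE_INVALID;
}
```

In the "before" version the stride query either reports an error or returns a value overwritten by the following case, depending on whether the unrelated extension is available.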
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On 05/07/2015 05:44 PM, Jason Ekstrand wrote:
> On May 7, 2015 5:38 PM, Ian Romanick i...@freedesktop.org wrote:
>> On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
>>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>>
>>> total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>>> instructions in affected programs:     1860859 -> 1848166 (-0.68%)
>>> helped:                                4387
>>> HURT:                                  4758
>>> GAINED:                                1499
>>>
>>> The gained programs are ARB vertex programs that were previously going
>>> through the vec4 backend. Now that we have prog_to_nir, ARB vertex
>>> programs can go through the scalar backend so they show up as gained
>>> in the shader-db results.
>>
>> I thought we already did this... why didn't this happen when NIR became
>> the default for the FS backend? And has that reason (assuming there was
>> one) been resolved?
>
> We couldn't do copy propagation of values in the attribute register file.
> That, in turn, was blocked on reworking the LOAD_PAYLOAD instruction. I
> pushed a series this morning that fixed both of those and cut 7.5% off of
> all SIMD8 VS instructions when using NIR. It also helps GLSL IR but by
> only 1% or so.
> --Jason

Ah, that's right. Make it so!

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

>>> ---
>>>  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
>>> index fd7420a..8615e5e 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_context.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_context.c
>>> @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>>>        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
>>>        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
>>>
>>> -      if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
>>> +      if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>>>           ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
>>>     }
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>
> total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> instructions in affected programs:     1860859 -> 1848166 (-0.68%)
> helped:                                4387
> HURT:                                  4758
> GAINED:                                1499
>
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend. Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as gained
> in the shader-db results.

Again, I'm kind of confused and disappointed that we're just okay with hurting 4700 programs without more analysis. I guess I'll go do that...

I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from 297 -> 161 instructions. More concerning, the number of send instructions drops from 36 to 12, and a loop that was 111 instructions long suddenly becomes

   START B1 <-B0 <-B2
cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
(+f0) break(8)  JIP: 24         UIP: 24
   END B1 ->B3 ->B2
   START B2 <-B1
add(8)          g42<1>D         g42<8,8,1>D     1D
while(8)        JIP: -32
   END B2 ->B1

That deserves a lot more investigation. I'll take a gamble and say something is broken.
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On 05/07/2015 06:17 PM, Matt Turner wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>
>> total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>> instructions in affected programs:     1860859 -> 1848166 (-0.68%)
>> helped:                                4387
>> HURT:                                  4758
>> GAINED:                                1499
>>
>> The gained programs are ARB vertex programs that were previously going
>> through the vec4 backend. Now that we have prog_to_nir, ARB vertex
>> programs can go through the scalar backend so they show up as gained
>> in the shader-db results.
>
> Again, I'm kind of confused and disappointed that we're just okay with
> hurting 4700 programs without more analysis. I guess I'll go do that...

Yeah... I think I just (foolishly) assumed it was mostly +/- small amounts given the % in affected programs.

> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from 297
> -> 161 instructions. More concerning, the number of send instructions
> drops from 36 to 12, and a loop that was 111 instructions long suddenly
> becomes
>
>    START B1 <-B0 <-B2
> cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
> (+f0) break(8)  JIP: 24         UIP: 24
>    END B1 ->B3 ->B2
>    START B2 <-B1
> add(8)          g42<1>D         g42<8,8,1>D     1D
> while(8)        JIP: -32
>    END B2 ->B1
>
> That deserves a lot more investigation. I'll take a gamble and say
> something is broken.

Yikes. I guess I'm surprised that piglit+gles3conform+deqp didn't already find
Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd
On Thu, May 7, 2015 at 5:27 PM, Jan Vesely jan.ves...@rutgers.edu wrote: On Thu, 2015-05-07 at 21:52 +0200, EdB wrote: Le 2015-05-07 18:55, Aaron Watry a écrit : I'm not sure what the final consensus will be on how to do this, but FWIW: Tested-By: Aaron Watry awa...@gmail.com I've tested this with 4 combinations: no --with-opencl-icd option specified : libOpenCL.so gets installed in ${prefix}/lib --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib --with-opencl-icd=standard : libMesaOpenCL.so installed in ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in ${prefix}/lib, icd in ${prefix}/etc//mesa.icd. I only specified --prefix, no other directories overridden in configure command. shouldn't this part go to ${prefix}/etc/OpenCL/vendors? Is it just a typo or did it install to ${prefix}/etc//? That was just a typo. It went to ${prefix}/etc/OpenCL/vendors/mesa.icd. --Aaron jan thanks EdB --Aaron On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote: The standard ICD file path is /etc/OpenCL/vendor/. However it doesn't fit well with custom build. 
This option allow ICD vendor file installation path override --- configure.ac [1] | 46 +++--- src/gallium/targets/opencl/Makefile.am | 2 +- 2 files changed, 33 insertions(+), 15 deletions(-) diff --git a/configure.ac [1] b/configure.ac [1] index 095e23e..90dba4e 100644 --- a/configure.ac [1] +++ b/configure.ac [1] @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl], [enable OpenCL library @:@default=disabled@:@])], [enable_opencl=$enableval], [enable_opencl=no]) -AC_ARG_ENABLE([opencl_icd], - [AS_HELP_STRING([--enable-opencl-icd], - [Build an OpenCL ICD library to be loaded by an ICD implementation - @:@default=disabled@:@])], -[enable_opencl_icd=$enableval], -[enable_opencl_icd=no]) AC_ARG_ENABLE([xlib-glx], [AS_HELP_STRING([--enable-xlib-glx], [make GLX library Xlib-based instead of DRI-based @:@default=disabled@:@])], @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers enable_gallium_loader=yes -if test x$enable_opencl_icd = xyes; then -OPENCL_LIBNAME=MesaOpenCL -else -OPENCL_LIBNAME=OpenCL -fi - if test x$have_libelf != xyes; then AC_MSG_ERROR([Clover requires libelf]) fi fi AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes) -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes) -AC_SUBST([OPENCL_LIBNAME]) dnl dnl Gallium configuration @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir], [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d]) AC_SUBST([D3D_DRIVER_INSTALL_DIR]) +dnl OpenCL ICD + +AC_ARG_WITH([opencl-icd], + [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@], +[Build an OpenCL ICD library to be loaded by an ICD implementation. + If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors. 
+ @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors + @:@default=no@:@])], +[OPENCL_ICD=$withval], +[OPENCL_ICD=no]) + +case x$OPENCL_ICD in +xno) +OPENCL_LIBNAME=OpenCL +;; +xstandard) +OPENCL_LIBNAME=MesaOpenCL +ICD_FILE_DIR=/etc/OpenCL/vendors +;; +xsysconfdir) +OPENCL_LIBNAME=MesaOpenCL +ICD_FILE_DIR=$sysconfdir/OpenCL/vendors +;; +*) +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd]) +;; +esac + +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno) +AC_SUBST([OPENCL_LIBNAME]) +AC_SUBST([ICD_FILE_DIR]) + dnl dnl Gallium helper functions dnl diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index 5daf327..781daa0 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym EXTRA_DIST = mesa.icd opencl.sym if HAVE_CLOVER_ICD -icddir = /etc/OpenCL/vendors/ +icddir = $(ICD_FILE_DIR) icd_DATA = mesa.icd endif -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev [2] Links: -- [1] http://configure.ac [2] http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list
[Mesa-dev] [PATCH 5/5] clover: Add a mutex to guard event::chain and event::wait_count
This mutex effectively prevents an event's chain or wait_count from being updated while it is in the process of triggering. Otherwise it may be possible to add to an event's chain after it has been triggered, which causes the chained event to never be triggered.
---
 src/gallium/state_trackers/clover/core/event.cpp | 3 +++
 src/gallium/state_trackers/clover/core/event.hpp | 1 +
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp b/src/gallium/state_trackers/clover/core/event.cpp
index da227bb..646fd38 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -38,6 +38,7 @@ event::~event() {
 
 void
 event::trigger() {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
    if (!--wait_count) {
       signalled_cv.notify_all();
       action_ok(*this);
@@ -54,6 +55,7 @@ event::abort(cl_int status) {
    _status = status;
    action_fail(*this);
 
+   std::lock_guard<std::mutex> lock(trigger_mutex);
    while (!_chain.empty()) {
       _chain.back()().abort(status);
       _chain.pop_back();
@@ -67,6 +69,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
    if (!signalled()) {
       ev.wait_count++;
       _chain.push_back(ev);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp b/src/gallium/state_trackers/clover/core/event.hpp
index dffafb9..a64fbba 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -90,6 +90,7 @@ namespace clover {
       std::vector<intrusive_ref<event>> _chain;
       std::condition_variable signalled_cv;
       std::mutex signalled_mutex;
+      std::mutex trigger_mutex;
    };
 
    ///
-- 
2.0.4
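
The lost-trigger problem this commit message describes comes from "is it already signalled?" and "append to the chain" not being one atomic step relative to triggering. A minimal sketch with hypothetical names (`chained_event`, callbacks instead of clover's event references) follows. Note one deliberate difference from the patch: the callbacks run after the lock is released, so a callback may itself chain or trigger other events without deadlocking.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified event that fires once trigger() has been
// called wait_count times and then runs every chained callback.
class chained_event {
public:
   explicit chained_event(unsigned wait_count) : wait_count_(wait_count) {}

   // Registers a callback to run when this event fires. Returns false if
   // the event has already fired, so the caller knows the callback was
   // not queued and will never be run by this event.
   bool chain(std::function<void()> on_trigger) {
      std::lock_guard<std::mutex> lock(mutex_);
      if (wait_count_ == 0)
         return false;
      chain_.push_back(std::move(on_trigger));
      return true;
   }

   void trigger() {
      std::vector<std::function<void()>> ready;
      {
         std::lock_guard<std::mutex> lock(mutex_);
         assert(wait_count_ > 0);
         if (--wait_count_ != 0)
            return;
         // Take ownership of the chain while still holding the lock, so a
         // concurrent chain() either lands before this swap or sees
         // wait_count_ == 0 and returns false.
         ready.swap(chain_);
      }
      for (auto &fn : ready)
         fn();
   }

private:
   std::mutex mutex_;
   unsigned wait_count_;
   std::vector<std::function<void()>> chain_;
};
```

Either outcome of the race is now well defined: the callback is queued and later run, or `chain()` reports that the event has already fired.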
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>
> total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> instructions in affected programs:     1860859 -> 1848166 (-0.68%)
> helped:                                4387
> HURT:                                  4758
> GAINED:                                1499
>
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend. Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as gained
> in the shader-db results.

I thought we already did this... why didn't this happen when NIR became the default for the FS backend? And has that reason (assuming there was one) been resolved?

> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index fd7420a..8615e5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
>     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
>
> -   if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> +   if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = nir_options;
>  }
[Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs:     1860859 -> 1848166 (-0.68%)
helped:                                4387
HURT:                                  4758
GAINED:                                1499

The gained programs are ARB vertex programs that were previously going through the vec4 backend. Now that we have prog_to_nir, ARB vertex programs can go through the scalar backend so they show up as gained in the shader-db results.
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index fd7420a..8615e5e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;

-   if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
+   if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
       ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = nir_options;
 }
-- 
2.4.0
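As a quick sanity check on the percentages quoted in the commit message, the relative change can be recomputed from the before/after instruction counts. The helper name `percent_change` is made up for this sketch and is not part of shader-db.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent.
// Negative values mean fewer instructions after the change.
double percent_change(long before, long after) {
    assert(before != 0);
    return 100.0 * static_cast<double>(after - before) /
           static_cast<double>(before);
}
```

Both rows drop by the same 12693 instructions; the percentages differ only because the second row divides by the smaller total of the affected programs.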
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On Thursday, May 07, 2015 04:50:39 PM Jason Ekstrand wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>
> total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> instructions in affected programs:     1860859 -> 1848166 (-0.68%)
> helped:                                4387
> HURT:                                  4758
> GAINED:                                1499
>
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend. Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as gained
> in the shader-db results.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index fd7420a..8615e5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
>     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
>
> -   if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> +   if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = nir_options;
>  }

We definitely want to throw the switch before 10.6, so that all the scalar backends are using NIR, and we'll be able to delete the deprecated ones post-release.

Acked-by: Kenneth Graunke <kenn...@whitecape.org>
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On May 7, 2015 5:38 PM, Ian Romanick <i...@freedesktop.org> wrote:
> On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
> > GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
> >
> > total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> > instructions in affected programs:     1860859 -> 1848166 (-0.68%)
> > helped:                                4387
> > HURT:                                  4758
> > GAINED:                                1499
> >
> > The gained programs are ARB vertex programs that were previously going
> > through the vec4 backend. Now that we have prog_to_nir, ARB vertex
> > programs can go through the scalar backend so they show up as gained
> > in the shader-db results.
>
> I thought we already did this... why didn't this happen when NIR became
> the default for the FS backend? And has that reason (assuming there was
> one) been resolved?

We couldn't do copy propagation of values in the attribute register file. That, in turn, was blocked on reworking the LOAD_PAYLOAD instruction. I pushed a series this morning that fixed both of those and cut 7.5% off of all SIMD8 VS instructions when using NIR. It also helps GLSL IR, but only by 1% or so.

--Jason

> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> > index fd7420a..8615e5e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
> >     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
> >     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
> >
> > -   if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> > +   if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
> >        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = nir_options;
> >  }
Re: [Mesa-dev] Initial amdgpu driver release
On Mon, Apr 20, 2015 at 6:33 PM, Alex Deucher <alexdeuc...@gmail.com> wrote:
> I'm pleased to announce the initial release of the new amdgpu driver.
> This is a partial replacement for the radeon driver for newer AMD asics.
> A number of components are still shared. Here is a comparison of the
> radeon and amdgpu stacks:
>
> 1. radeon stack
>    kernel driver: radeon.ko
>    libdrm: libdrm_radeon
>    mesa: radeon, r200, r300, r600, radeonsi
>    ddx: xf86-video-ati
>
> 2. amdgpu stack
>    kernel driver: amdgpu.ko
>    libdrm: libdrm_amdgpu
>    mesa: radeonsi
>    ddx: xf86-video-amdgpu
>
> Older asics will continue to be supported by the radeon stack; new asics
> will be supported by the amdgpu stack. CI (Sea Islands) asics have
> support in both driver stacks, but this is purely for testing purposes.
> CI parts are officially supported in the radeon stack. Support for CI on
> the amdgpu stack is determined by a config option in the kernel. CI
> support is not enabled by default for amdgpu.
>
> Most of our focus has been on Carrizo support, so there are some gaps in
> the dGPU support for Tonga and Iceland, notably power management. Those
> gaps will be filled in eventually. Also included in this code base are
> full register headers for just about every block on the asics.
>
> Barring the gaps mentioned above, the driver stack is functionally on
> par with radeon including:
> - OpenGL 3.3 support using the radeonsi mesa driver
> - Video decode support using UVD
> - Video encode support using VCE
>
> The code can be found in the amdgpu branches of the following git trees.
>
> xf86-video-amdgpu: http://cgit.freedesktop.org/~agd5f/xf86-video-amdgpu/log/?h=amdgpu
> libdrm: http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
> kernel: http://cgit.freedesktop.org/~agd5f/linux/log/?h=amdgpu
> mesa: http://cgit.freedesktop.org/~mareko/mesa/log/?h=amdgpu

Some updates on the latest source locations:

xf86-video-amdgpu: http://cgit.freedesktop.org/xorg/driver/xf86-video-amdgpu
libdrm: http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
kernel: http://cgit.freedesktop.org/amd/drm-amd/
mesa: http://cgit.freedesktop.org/mesa/mesa/log/?h=amdgpu

Alex

> To test the new driver stack you will need to specify a device section
> in your xorg.conf with the driver set to amdgpu rather than radeon.
>
> Please review!
>
> Thanks,
>
> The AMD Linux Driver Team
Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress
On 05/07/2015 05:21 AM, Pohjolainen, Topi wrote: On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote: Stop context creation if something failed. If something errored during context creation we'd segfault. Now will clean up and return error. Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- src/mesa/main/shared.c | 66 +++--- 1 file changed, 62 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c index 0b76cc0..cc05b05 100644 --- a/src/mesa/main/shared.c +++ b/src/mesa/main/shared.c @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx) mtx_init(shared-Mutex, mtx_plain); + /* Mutex and timestamp for texobj state validation */ + mtx_init(shared-TexMutex, mtx_recursive); + shared-TextureStateStamp = 0; Do you really need to move this here? I was going to ask the same thing. I think moving it here means that it can be unconditionally mtx_destroy'ed in the error path below. + shared-DisplayList = _mesa_NewHashTable(); + if (!shared-DisplayList) + goto error_out; + shared-TexObjects = _mesa_NewHashTable(); + if (!shared-TexObjects) + goto error_out; + shared-Programs = _mesa_NewHashTable(); + if (!shared-Programs) + goto error_out; shared-DefaultVertexProgram = gl_vertex_program(ctx-Driver.NewProgram(ctx, @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx) GL_FRAGMENT_PROGRAM_ARB, 0)); shared-ATIShaders = _mesa_NewHashTable(); + if (!shared-ATIShaders) + goto error_out; + shared-DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0); shared-ShaderObjects = _mesa_NewHashTable(); + if (!shared-ShaderObjects) + goto error_out; shared-BufferObjects = _mesa_NewHashTable(); + if (!shared-BufferObjects) + goto error_out; /* GL_ARB_sampler_objects */ shared-SamplerObjects = _mesa_NewHashTable(); + if (!shared-SamplerObjects) + goto error_out; /* Allocate the default buffer object */ shared-NullBufferObj = ctx-Driver.NewBufferObject(ctx, 0); + if (!shared-NullBufferObj) + 
goto error_out; /* Create default texture objects */ for (i = 0; i NUM_TEXTURE_TARGETS; i++) { @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx) }; STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS); shared-DefaultTex[i] = ctx-Driver.NewTextureObject(ctx, 0, targets[i]); + + if (!shared-DefaultTex[i]) + goto error_out; } /* sanity check */ assert(shared-DefaultTex[TEXTURE_1D_INDEX]-RefCount == 1); - /* Mutex and timestamp for texobj state validation */ - mtx_init(shared-TexMutex, mtx_recursive); - shared-TextureStateStamp = 0; - shared-FrameBuffers = _mesa_NewHashTable(); + if (!shared-FrameBuffers) + goto error_out; + shared-RenderBuffers = _mesa_NewHashTable(); + if (!shared-RenderBuffers) + goto error_out; shared-SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer, _mesa_key_pointer_equal); + if (!shared-SyncObjects) + goto error_out; return shared; + +error_out: + for (i = 0; i NUM_TEXTURE_TARGETS; i++) { + if (shared-DefaultTex[i]) { + ctx-Driver.DeleteTexture(ctx, shared-DefaultTex[i]); + } + } + + _mesa_reference_buffer_object(ctx, shared-NullBufferObj, NULL); + + _mesa_DeleteHashTable(shared-RenderBuffers); + _mesa_DeleteHashTable(shared-FrameBuffers); + _mesa_DeleteHashTable(shared-SamplerObjects); + _mesa_DeleteHashTable(shared-BufferObjects); + _mesa_DeleteHashTable(shared-ShaderObjects); + _mesa_DeleteHashTable(shared-ATIShaders); + _mesa_DeleteHashTable(shared-Programs); + _mesa_DeleteHashTable(shared-TexObjects); + _mesa_DeleteHashTable(shared-DisplayList); + + _mesa_reference_vertprog(ctx, shared-DefaultVertexProgram, NULL); + _mesa_reference_geomprog(ctx, shared-DefaultGeometryProgram, NULL); + _mesa_reference_fragprog(ctx, shared-DefaultFragmentProgram, NULL); + + mtx_destroy(shared-Mutex); + mtx_destroy(shared-TexMutex); + + free(shared); + return NULL; } -- 1.8.5.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list 
mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] nir: Translate image load, store and atomic intrinsics from GLSL IR.
On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez curroje...@riseup.net wrote: --- src/glsl/nir/glsl_to_nir.cpp | 125 +++ 1 file changed, 114 insertions(+), 11 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index f6b8331..a01ab3b 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -614,27 +614,130 @@ nir_visitor::visit(ir_call *ir) op = nir_intrinsic_atomic_counter_inc_var; } else if (strcmp(ir-callee_name(), __intrinsic_atomic_predecrement) == 0) { op = nir_intrinsic_atomic_counter_dec_var; + } else if (strcmp(ir-callee_name(), __intrinsic_image_load) == 0) { + op = nir_intrinsic_image_load; + } else if (strcmp(ir-callee_name(), __intrinsic_image_store) == 0) { + op = nir_intrinsic_image_store; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_add) == 0) { + op = nir_intrinsic_image_atomic_add; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_min) == 0) { + op = nir_intrinsic_image_atomic_min; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_max) == 0) { + op = nir_intrinsic_image_atomic_max; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_and) == 0) { + op = nir_intrinsic_image_atomic_and; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_or) == 0) { + op = nir_intrinsic_image_atomic_or; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_xor) == 0) { + op = nir_intrinsic_image_atomic_xor; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_exchange) == 0) { + op = nir_intrinsic_image_atomic_exchange; + } else if (strcmp(ir-callee_name(), __intrinsic_image_atomic_comp_swap) == 0) { + op = nir_intrinsic_image_atomic_comp_swap; } else { unreachable(not reached); } nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op); - ir_dereference *param = - (ir_dereference *) ir-actual_parameters.get_head(); - instr-variables[0] = evaluate_deref(instr-instr, param); - nir_ssa_dest_init(instr-instr, instr-dest, 
1, NULL); + + switch (op) { + case nir_intrinsic_atomic_counter_read_var: + case nir_intrinsic_atomic_counter_inc_var: + case nir_intrinsic_atomic_counter_dec_var: { + ir_dereference *param = +(ir_dereference *) ir-actual_parameters.get_head(); + instr-variables[0] = evaluate_deref(instr-instr, param); + nir_ssa_dest_init(instr-instr, instr-dest, 1, NULL); + break; + } + case nir_intrinsic_image_load: + case nir_intrinsic_image_store: + case nir_intrinsic_image_atomic_add: + case nir_intrinsic_image_atomic_min: + case nir_intrinsic_image_atomic_max: + case nir_intrinsic_image_atomic_and: + case nir_intrinsic_image_atomic_or: + case nir_intrinsic_image_atomic_xor: + case nir_intrinsic_image_atomic_exchange: + case nir_intrinsic_image_atomic_comp_swap: { + nir_load_const_instr *instr_zero = nir_load_const_instr_create(shader, 1); + instr_zero-value.u[0] = 0; + nir_instr_insert_after_cf_list(this-cf_node_list, instr_zero-instr); + + /* Set the image variable dereference. */ + exec_node *param = ir-actual_parameters.get_head(); + ir_dereference *image = (ir_dereference *)param; + const glsl_type *type = +image-variable_referenced()-type-without_array(); + + instr-variables[0] = evaluate_deref(instr-instr, image); + param = param-get_next(); + + /* Set the address argument, extending the coordinate vector to four + * components. + */ + const nir_src src_addr = evaluate_rvalue((ir_dereference *)param); + nir_alu_instr *instr_addr = nir_alu_instr_create(shader, nir_op_vec4); + nir_ssa_dest_init(instr_addr-instr, instr_addr-dest.dest, 4, NULL); + + for (int i = 0; i 4; i++) { +if (i type-coordinate_components()) { + instr_addr-src[i].src = src_addr; + instr_addr-src[i].swizzle[0] = i; +} else { + instr_addr-src[i].src = nir_src_for_ssa(instr_zero-def); I think it would better convey the intent to create an ssa_undef_instr and use that here instead of zero. Unless something else relies on the extra coordinates being zeroed? 
+} + } + + nir_instr_insert_after_cf_list(cf_node_list, instr_addr-instr); + instr-src[0] = nir_src_for_ssa(instr_addr-dest.dest.ssa); + param = param-get_next(); + + /* Set the sample argument, which should be zero for single-sample + * images. + */ + if
[Mesa-dev] [Bug 69101] prime: black window
https://bugs.freedesktop.org/show_bug.cgi?id=69101 higu...@gmx.net changed: What|Removed |Added CC||higu...@gmx.net -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glx: report which DRI version is used when in verbose debug mode
Hi, On 05/06/2015 08:28 PM, Kenneth Graunke wrote: I agree with Axel - I think LIBGL_DRI3_DISABLE=1 already does what you want, so patch 2 is unnecessary. That needs a patch to doc/envvars.html... - Eero ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it
On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote:
> I guess this should not really be able to segfault but still it seems
> to be able to during context creation.
>
> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikk...@gmail.com>
> ---
>  src/mesa/main/errors.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
> index 2aa1deb..6631b82 100644
> --- a/src/mesa/main/errors.c
> +++ b/src/mesa/main/errors.c
> @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... )

To me it looks like it would be better to just leave early already here:

   if (!ctx)
      return;

That avoids the extra indentation, and it doesn't look meaningful to call should_output() with a null context.

>     do_output = should_output(ctx, error, fmtString);
>
> -   mtx_lock(&ctx->DebugMutex);
> -   if (ctx->Debug) {
> -      do_log = debug_is_message_enabled(ctx->Debug,
> -                                        MESA_DEBUG_SOURCE_API,
> -                                        MESA_DEBUG_TYPE_ERROR,
> -                                        error_msg_id,
> -                                        MESA_DEBUG_SEVERITY_HIGH);
> +   if (ctx) {
> +      mtx_lock(&ctx->DebugMutex);
> +      if (ctx->Debug) {
> +         do_log = debug_is_message_enabled(ctx->Debug,
> +                                           MESA_DEBUG_SOURCE_API,
> +                                           MESA_DEBUG_TYPE_ERROR,
> +                                           error_msg_id,
> +                                           MESA_DEBUG_SEVERITY_HIGH);
> +      }
> +      else {
> +         do_log = GL_FALSE;
> +      }
> +      mtx_unlock(&ctx->DebugMutex);
>     }
>     else {
>        do_log = GL_FALSE;
>     }
> -   mtx_unlock(&ctx->DebugMutex);
>
>     if (do_output || do_log) {
>        char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
> @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... )
>        }
>
>        /* Log the error via ARB_debug_output if needed.*/
> -      if (do_log) {
> +      if (ctx && do_log) {
>           log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR,
>                   error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2);
>        }
>     }
>
>     /* Set the GL context error state for glGetError. */
> -   _mesa_record_error(ctx, error);
> +   if (ctx)
> +      _mesa_record_error(ctx, error);
>  }
>
>  void
> -- 
> 1.8.5.1
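The early-return shape suggested in the review can be sketched in isolation. Everything below is hypothetical (`Context`, `report_error`, the 256-byte buffer); it is not Mesa's `_mesa_error`, only an illustration of how a single guard at the top removes the need for the nested `if (ctx)` checks further down. Keeping only the first recorded error is likewise an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a context that remembers its error state.
struct Context {
    int last_error = 0;         // 0 means "no error recorded"
    std::string last_message;   // most recent formatted message
};

// Format and record an error. Returns false (and does nothing) when there
// is no context to record into, so no later code has to re-check `ctx`.
bool report_error(Context *ctx, int error, const char *fmt, ...) {
    assert(fmt != nullptr);
    if (!ctx)
        return false;           // leave early: nothing below touches ctx

    char buf[256];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);

    ctx->last_message = buf;
    if (ctx->last_error == 0)   // sketch assumption: first error sticks
        ctx->last_error = error;
    return true;
}
```

With the guard first, the rest of the function reads as straight-line code, which is the readability argument made in the review comment.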
Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions
On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote: v2: Get rid of the block parameter and make src a const reference Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 1 file changed, 38 insertions(+), 37 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index 43370cb..9c4ed0b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate) operands_match(a, b, negate); } +static fs_inst * +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate) Did you mean 'src' to be constant reference? It is only used for reading so it could be - you claim this in the commit message yourself :) +{ + int written = inst-regs_written; + int dst_width = inst-dst.width / 8; + fs_reg dst = inst-dst; + fs_inst *copy; + + if (written dst_width) { + fs_reg *sources = ralloc_array(v-mem_ctx, fs_reg, written / dst_width); + for (int i = 0; i written / dst_width; i++) + sources[i] = offset(src, i); + copy = v-LOAD_PAYLOAD(dst, sources, written / dst_width); + } else { + copy = v-MOV(dst, src); + copy-force_writemask_all = inst-force_writemask_all; + copy-src[0].negate = negate; + } + assert(copy-regs_written == written); + + return copy; +} + bool fs_visitor::opt_cse_local(bblock_t *block) { @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block) bool no_existing_temp = entry-tmp.file == BAD_FILE; if (no_existing_temp !entry-generator-dst.is_null()) { int written = entry-generator-regs_written; - int dst_width = entry-generator-dst.width / 8; - assert(written % dst_width == 0); - - fs_reg orig_dst = entry-generator-dst; - fs_reg tmp = fs_reg(GRF, alloc.allocate(written), - orig_dst.type, orig_dst.width); - entry-tmp = tmp; - 
entry-generator-dst = tmp; - - fs_inst *copy; - if (written dst_width) { - fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width); - for (int i = 0; i written / dst_width; i++) - sources[i] = offset(tmp, i); - copy = LOAD_PAYLOAD(orig_dst, sources, written / dst_width); - } else { - copy = MOV(orig_dst, tmp); - copy-force_writemask_all = - entry-generator-force_writemask_all; - } + assert((written * 8) % entry-generator-dst.width == 0); + + entry-tmp = fs_reg(GRF, alloc.allocate(written), + entry-generator-dst.type, + entry-generator-dst.width); + + fs_inst *copy = create_copy_instr(this, entry-generator, + entry-tmp, false); entry-generator-insert_after(block, copy); + + entry-generator-dst = entry-tmp; } /* dest - temp */ if (!inst-dst.is_null()) { - int written = inst-regs_written; - int dst_width = inst-dst.width / 8; - assert(written == entry-generator-regs_written); - assert(dst_width == entry-generator-dst.width / 8); + assert(inst-regs_written == entry-generator-regs_written); + assert(inst-dst.width == entry-generator-dst.width); assert(inst-dst.type == entry-tmp.type); - fs_reg dst = inst-dst; - fs_reg tmp = entry-tmp; - fs_inst *copy; - if (written dst_width) { - fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width); - for (int i = 0; i written / dst_width; i++) - sources[i] = offset(tmp, i); - copy = LOAD_PAYLOAD(dst, sources, written / dst_width); - } else { - copy = MOV(dst, tmp); - copy-force_writemask_all = inst-force_writemask_all; - copy-src[0].negate = negate; - } + + fs_inst *copy = create_copy_instr(this, inst, + entry-tmp, negate); inst-insert_before(block, copy); } -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress
On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote:
> Stop context creation if something failed. If something errored during
> context creation we'd segfault. Now will clean up and return error.
>
> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikk...@gmail.com>
> ---
>  src/mesa/main/shared.c | 66 +++---
>  1 file changed, 62 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
> index 0b76cc0..cc05b05 100644
> --- a/src/mesa/main/shared.c
> +++ b/src/mesa/main/shared.c
> @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>
>     mtx_init(&shared->Mutex, mtx_plain);
>
> +   /* Mutex and timestamp for texobj state validation */
> +   mtx_init(&shared->TexMutex, mtx_recursive);
> +   shared->TextureStateStamp = 0;

Do you really need to move this here?

> +
>     shared->DisplayList = _mesa_NewHashTable();
> +   if (!shared->DisplayList)
> +      goto error_out;
> +
>     shared->TexObjects = _mesa_NewHashTable();
> +   if (!shared->TexObjects)
> +      goto error_out;
> +
>     shared->Programs = _mesa_NewHashTable();
> +   if (!shared->Programs)
> +      goto error_out;
>
>     shared->DefaultVertexProgram =
>        gl_vertex_program(ctx->Driver.NewProgram(ctx,
> @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>                                                   GL_FRAGMENT_PROGRAM_ARB, 0));
>
>     shared->ATIShaders = _mesa_NewHashTable();
> +   if (!shared->ATIShaders)
> +      goto error_out;
> +
>     shared->DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0);
>
>     shared->ShaderObjects = _mesa_NewHashTable();
> +   if (!shared->ShaderObjects)
> +      goto error_out;
>
>     shared->BufferObjects = _mesa_NewHashTable();
> +   if (!shared->BufferObjects)
> +      goto error_out;
>
>     /* GL_ARB_sampler_objects */
>     shared->SamplerObjects = _mesa_NewHashTable();
> +   if (!shared->SamplerObjects)
> +      goto error_out;
>
>     /* Allocate the default buffer object */
>     shared->NullBufferObj = ctx->Driver.NewBufferObject(ctx, 0);
> +   if (!shared->NullBufferObj)
> +      goto error_out;
>
>     /* Create default texture objects */
>     for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
> @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>        };
>        STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS);
>        shared->DefaultTex[i] = ctx->Driver.NewTextureObject(ctx, 0, targets[i]);
> +
> +      if (!shared->DefaultTex[i])
> +         goto error_out;
>     }
>
>     /* sanity check */
>     assert(shared->DefaultTex[TEXTURE_1D_INDEX]->RefCount == 1);
>
> -   /* Mutex and timestamp for texobj state validation */
> -   mtx_init(&shared->TexMutex, mtx_recursive);
> -   shared->TextureStateStamp = 0;
> -
>     shared->FrameBuffers = _mesa_NewHashTable();
> +   if (!shared->FrameBuffers)
> +      goto error_out;
> +
>     shared->RenderBuffers = _mesa_NewHashTable();
> +   if (!shared->RenderBuffers)
> +      goto error_out;
>
>     shared->SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer,
>                                            _mesa_key_pointer_equal);
> +   if (!shared->SyncObjects)
> +      goto error_out;
>
>     return shared;
> +
> +error_out:
> +   for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
> +      if (shared->DefaultTex[i]) {
> +         ctx->Driver.DeleteTexture(ctx, shared->DefaultTex[i]);
> +      }
> +   }
> +
> +   _mesa_reference_buffer_object(ctx, &shared->NullBufferObj, NULL);
> +
> +   _mesa_DeleteHashTable(shared->RenderBuffers);
> +   _mesa_DeleteHashTable(shared->FrameBuffers);
> +   _mesa_DeleteHashTable(shared->SamplerObjects);
> +   _mesa_DeleteHashTable(shared->BufferObjects);
> +   _mesa_DeleteHashTable(shared->ShaderObjects);
> +   _mesa_DeleteHashTable(shared->ATIShaders);
> +   _mesa_DeleteHashTable(shared->Programs);
> +   _mesa_DeleteHashTable(shared->TexObjects);
> +   _mesa_DeleteHashTable(shared->DisplayList);
> +
> +   _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL);
> +   _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL);
> +   _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL);
> +
> +   mtx_destroy(&shared->Mutex);
> +   mtx_destroy(&shared->TexMutex);
> +
> +   free(shared);
> +   return NULL;
>  }
> -- 
> 1.8.5.1
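The single-label cleanup used by the patch only works if every resource is in a known state when the label is reached. A minimal sketch of that idea follows; all names (`Table`, `Shared`, `new_table`, `delete_table`, the failure-injection counters) are invented for the example. In particular, the sketch assumes the delete function accepts a null pointer; that is a property of this example, not a claim about `_mesa_DeleteHashTable`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a hash table; not Mesa's _mesa_HashTable.
struct Table { int unused; };

// Hypothetical test hooks: make the Nth new_table() call fail, and count
// how many tables are currently alive.
static int fail_at_call = -1;
static int call_count = 0;
static int live_tables = 0;

Table *new_table() {
    if (call_count++ == fail_at_call)
        return nullptr;
    Table *t = static_cast<Table *>(std::calloc(1, sizeof(Table)));
    if (t)
        ++live_tables;
    return t;
}

// Sketch assumption: deleting a null table is a harmless no-op.
void delete_table(Table *t) {
    if (!t)
        return;
    --live_tables;
    std::free(t);
}

struct Shared {
    Table *display_list;
    Table *tex_objects;
    Table *programs;
};

void free_shared(Shared *shared) {
    assert(shared != nullptr);
    // Reverse order of creation; members never reached are still null.
    delete_table(shared->programs);
    delete_table(shared->tex_objects);
    delete_table(shared->display_list);
    std::free(shared);
}

Shared *alloc_shared() {
    // calloc zeroes the struct, so every member starts out null.
    Shared *shared = static_cast<Shared *>(std::calloc(1, sizeof(Shared)));
    if (!shared)
        return nullptr;

    shared->display_list = new_table();
    if (!shared->display_list)
        goto error_out;

    shared->tex_objects = new_table();
    if (!shared->tex_objects)
        goto error_out;

    shared->programs = new_table();
    if (!shared->programs)
        goto error_out;

    return shared;

error_out:
    free_shared(shared);
    return nullptr;
}
```

The same reasoning explains the reply elsewhere in this thread about moving the `TexMutex` initialization to the top: anything the error path destroys unconditionally has to be initialized before the first possible jump to the label.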
Re: [Mesa-dev] [PATCH 14/27] nir: Add glsl_get_element_type() wrapper.
On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
> Signed-off-by: Abdiel Janulgue <abdiel.janul...@linux.intel.com>
> ---
>  src/glsl/nir/nir_types.cpp | 5 +++++
>  src/glsl/nir/nir_types.h   | 2 ++
>  2 files changed, 7 insertions(+)
>
> diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
> index f0d0b46..249678f 100644
> --- a/src/glsl/nir/nir_types.cpp
> +++ b/src/glsl/nir/nir_types.cpp
> @@ -82,6 +82,11 @@ glsl_get_base_type(const struct glsl_type *type)
>     return type->base_type;
>  }
>
> +const struct glsl_type *
> +glsl_get_element_type(const struct glsl_type *type)
> +{
> +   return type->element_type();

I've sent a patch to remove the element_type() helper. I have yet to see a case where just using is_array() and/or without_array() doesn't result in clearer code, with the added advantage (in most cases) of free multidimensional array support.

http://lists.freedesktop.org/archives/mesa-dev/2015-April/083195.html

> +}
> +
>  unsigned
>  glsl_get_vector_elements(const struct glsl_type *type)
>  {
> diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
> index 276d4ad..125f075 100644
> --- a/src/glsl/nir/nir_types.h
> +++ b/src/glsl/nir/nir_types.h
> @@ -49,6 +49,8 @@ const struct glsl_type *glsl_get_array_element(const struct glsl_type *type);
>
>  const struct glsl_type *glsl_get_column_type(const struct glsl_type *type);
>
> +const struct glsl_type *glsl_get_element_type(const struct glsl_type *type);
> +
>  enum glsl_base_type glsl_get_base_type(const struct glsl_type *type);
>
>  unsigned glsl_get_vector_elements(const struct glsl_type *type);
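To illustrate the "free multidimensional array support" point, here is a small sketch with a made-up `Type` struct (not Mesa's `glsl_type`; the member names merely echo the helpers mentioned in the reply). Peeling one array level at a time leaves the caller with another array for an array of arrays, whereas a loop that strips every level always lands on the underlying non-array type.

```cpp
#include <cassert>

// Hypothetical type descriptor: an array type points at its element type.
struct Type {
    const char *name;
    const Type *element;   // non-null only for array types
    unsigned length;       // number of elements; 0 for non-array types

    bool is_array() const { return element != nullptr; }

    // Peels exactly one array level; returns null for non-array types.
    const Type *element_type() const { return element; }

    // Peels every array level, however deeply the arrays are nested.
    const Type *without_array() const {
        const Type *t = this;
        while (t->is_array()) {
            assert(t->length > 0);   // sketch assumption: arrays are sized
            t = t->element;
        }
        return t;
    }
};
```

For a two-dimensional array such as `float[3][4]`, `element_type()` in this sketch returns the inner `float[4]`, so code that expects a scalar or vector type needs a second step, while `without_array()` returns `float` directly.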
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On Thu, May 7, 2015 at 6:17 PM, Matt Turner <matts...@gmail.com> wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> > GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
> >
> > total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> > instructions in affected programs:     1860859 -> 1848166 (-0.68%)
> > helped:                                4387
> > HURT:                                  4758
> > GAINED:                                1499
> >
> > The gained programs are ARB vertex programs that were previously going
> > through the vec4 backend. Now that we have prog_to_nir, ARB vertex
> > programs can go through the scalar backend so they show up as gained
> > in the shader-db results.
>
> Again, I'm kind of confused and disappointed that we're just okay with
> hurting 4700 programs without more analysis. I guess I'll go do that...
>
> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
> 297 -> 161 instructions. More concerning, the number of send
> instructions drops from 36 to 12, and a loop that was 111 instructions
> long suddenly becomes
>
> START B1 <-B0 <-B2
> cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
> (+f0) break(8)  JIP: 24         UIP: 24
> END B1 ->B3 ->B2
> START B2 <-B1
> add(8)          g42<1>D         g42<8,8,1>D     1D
> while(8)        JIP: -32
> END B2 ->B1
>
> That deserves a lot more investigation. I'll take a gamble and say
> something is broken.

I did a little looking at that shader and it looks like NIR dead-coded the contents of a for loop and, as a result, a bunch of stuff was promoted to push constants, hence fewer sampler messages. I didn't find anything broken but, then again, that's hard to do without being able to verifiably run the shader. I'll try and look at the places where we end up with more instructions.

--Jason
Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
On 05/08/2015 12:13 AM, Ian Romanick wrote: On 05/07/2015 12:57 AM, Marta Lofstedt wrote: From: Marta Lofstedt marta.lofst...@intel.com v2: only expose enums from GL_ARB_shader_image_load_store for gles 3.1 and GL core Signed-off-by: Marta Lofstedt marta.lofst...@intel.com --- src/mesa/main/get.c | 6 ++ src/mesa/main/get_hash_params.py | 17 - 2 files changed, 14 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 9898197..73739b6 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = { EXTRA_END }; +static const int extra_ARB_shader_image_load_store_es31[] = { + EXT(ARB_shader_image_load_store), + EXTRA_API_ES31, I think you're missing the patch that adds EXTRA_API_ES31. Did you forget to send that one out? Marta's series builds on top of my patch here that adds EXTRA_API_ES31: http://lists.freedesktop.org/archives/mesa-dev/2015-May/083593.html Also, on a few of these patches, I think the old, non-_es31 set of requirements can be removed due to no longer being used. 
+ EXTRA_END +}; + EXTRA_EXT(ARB_texture_cube_map); EXTRA_EXT(EXT_texture_array); EXTRA_EXT(NV_fog_distance); diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 513d5d2..85c2494 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -413,6 +413,14 @@ descriptor=[ { apis: [GL_CORE, GLES3], params: [ # GL_ARB_draw_indirect / GLES 3.1 [ DRAW_INDIRECT_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, extra_ARB_draw_indirect_es31 ], +# GL_ARB_shader_image_load_store / GLES 3.1 + [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), extra_ARB_shader_image_load_store_es31], + [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store_es31], + [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store_es31], + [ MAX_VERTEX_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_GEOMETRY_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_FRAGMENT_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), extra_ARB_shader_image_load_store_es31], + [ MAX_COMBINED_IMAGE_UNIFORMS, CONTEXT_INT(Const.MaxCombinedImageUniforms), extra_ARB_shader_image_load_store_es31], ]}, # Remaining enums are only in OpenGL @@ -780,15 +788,6 @@ descriptor=[ [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ], [ MAX_VERTEX_ATTRIB_BINDINGS, CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ], -# GL_ARB_shader_image_load_store - [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), extra_ARB_shader_image_load_store], - [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store], - [ MAX_IMAGE_SAMPLES, 
CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store], - [ MAX_VERTEX_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), extra_ARB_shader_image_load_store], - [ MAX_GEOMETRY_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), extra_ARB_shader_image_load_store_and_geometry_shader], - [ MAX_FRAGMENT_IMAGE_UNIFORMS, CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), extra_ARB_shader_image_load_store], - [ MAX_COMBINED_IMAGE_UNIFORMS, CONTEXT_INT(Const.MaxCombinedImageUniforms), extra_ARB_shader_image_load_store], - # GL_ARB_compute_shader [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader ], [ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), extra_ARB_compute_shader ], ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util
On Thu, May 7, 2015 at 5:30 PM, Ian Romanick i...@freedesktop.org wrote:
> Isn't this the same as src/util/simple_list.h?

In terms of being a two-pointer circularly linked list, yes. In terms of having a decent API, no.

1) Nothing in simple_list is namespaced in any way.
2) It's all macros with do-while around them instead of static inlines.
3) It assumes that you just put prev and next pointers in the structure you're putting in the list rather than having a node you embed. While this provides the type safety claimed at the top of simple_list.h, it requires that, if you want a list of struct foo's, you use an entire struct foo as the sentinel instead of a 2 or 3 pointer list structure.
4) Point 3 isn't quite true because there is a simple_node structure. However, it looks like a complete afterthought because none of the iterators or manipulators do anything with it.

I could probably extend the list, but I think you get the point. Sure, I could improve simple_list, but why do so when there's a perfectly good list in gallium that does everything simple_list does and more?

I did start working on replacing simple_list with the gallium list to get us down to two lists, but we use it in things like swrast and tnl so it turned into quite the spider-web. Eventually, I'd like to see simple_list die, but if we can at least restrict it back to the older parts of the code and remove it from util, that would make me happy enough for now.

--Jason

> On 04/27/2015 09:03 PM, Jason Ekstrand wrote:
>> The linked list in gallium is pretty much the kernel list and we would like to have a C-based linked list for all of mesa. Let's not duplicate and just steal the gallium one.
--- src/gallium/auxiliary/Makefile.sources | 1 - src/gallium/auxiliary/hud/hud_private.h| 2 +- .../auxiliary/pipebuffer/pb_buffer_fenced.c| 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c| 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c | 2 +- src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c | 2 +- src/gallium/auxiliary/util/u_debug_flush.c | 2 +- src/gallium/auxiliary/util/u_debug_memory.c| 2 +- src/gallium/auxiliary/util/u_dirty_surfaces.h | 2 +- src/gallium/auxiliary/util/u_double_list.h | 146 - src/gallium/drivers/freedreno/freedreno_context.h | 2 +- src/gallium/drivers/freedreno/freedreno_query_hw.h | 2 +- src/gallium/drivers/freedreno/freedreno_resource.h | 2 +- src/gallium/drivers/ilo/ilo_common.h | 2 +- src/gallium/drivers/nouveau/nouveau_buffer.h | 2 +- src/gallium/drivers/nouveau/nouveau_fence.c| 2 - src/gallium/drivers/nouveau/nouveau_fence.h| 2 +- src/gallium/drivers/nouveau/nouveau_mm.c | 2 +- src/gallium/drivers/nouveau/nv30/nv30_screen.h | 2 +- src/gallium/drivers/nouveau/nv50/nv50_resource.h | 2 +- src/gallium/drivers/r600/compute_memory_pool.c | 2 +- src/gallium/drivers/r600/evergreen_compute.c | 2 +- src/gallium/drivers/r600/r600_llvm.c | 2 +- src/gallium/drivers/r600/r600_pipe.h | 2 +- src/gallium/drivers/radeon/r600_pipe_common.h | 2 +- src/gallium/drivers/radeon/radeon_vce.h| 2 +- src/gallium/drivers/svga/svga_context.h| 2 +- src/gallium/drivers/svga/svga_resource_buffer.h| 2 - .../drivers/svga/svga_resource_buffer_upload.c | 1 - src/gallium/drivers/svga/svga_screen_cache.h | 2 +- src/gallium/state_trackers/nine/basetexture9.h | 2 +- src/gallium/state_trackers/nine/device9.h | 2 +- src/gallium/state_trackers/nine/nine_state.h | 2 +- src/gallium/state_trackers/nine/surface9.h | 2 +- src/gallium/state_trackers/omx/vid_dec.h | 2 +- src/gallium/state_trackers/omx/vid_enc.h | 2 +- 
src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +- .../winsys/svga/drm/pb_buffer_simple_fenced.c | 2 +- src/gallium/winsys/svga/drm/vmw_fence.c| 2 +- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +- src/util/Makefile.sources | 1 + src/util/list.h| 146 + 44 files changed, 184 insertions(+), 189 deletions(-) delete mode 100644 src/gallium/auxiliary/util/u_double_list.h create mode 100644 src/util/list.h diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index ec7547c..62e6b94 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -197,7 +197,6 @@ C_SOURCES := \ util/u_dirty_surfaces.h \ util/u_dl.c \ util/u_dl.h \ - util/u_double_list.h \
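To make point 3 of the reply concrete, here is a minimal sketch of the embedded-node style of list, where a bare two-pointer head acts as the sentinel and each element embeds a node. All names here (`demo_list`, `demo_list_init`, `demo_container_of`, `struct foo`) are hypothetical and chosen for illustration; this is not the contents of src/util/list.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-pointer node; the same type serves as the list sentinel. */
struct demo_list {
    struct demo_list *prev;
    struct demo_list *next;
};

/* An empty list is a sentinel that points at itself. */
static inline void demo_list_init(struct demo_list *head)
{
    head->prev = head;
    head->next = head;
}

/* Insert node at the tail, i.e. just before the sentinel. */
static inline void demo_list_addtail(struct demo_list *node,
                                     struct demo_list *head)
{
    node->next = head;
    node->prev = head->prev;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node from whatever list it is on. */
static inline void demo_list_del(struct demo_list *node)
{
    assert(node->next != NULL && node->prev != NULL);
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = NULL;
    node->next = NULL;
}

/* Recover the enclosing structure from a pointer to its embedded node. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The element type embeds a node; the sentinel does not need to be a foo. */
struct foo {
    int value;
    struct demo_list link;
};

/* Walk the list and add up the values of all elements. */
static int demo_sum(struct demo_list *head)
{
    int sum = 0;
    for (struct demo_list *n = head->next; n != head; n = n->next)
        sum += demo_container_of(n, struct foo, link)->value;
    return sum;
}
```

The design point is that the sentinel costs two pointers regardless of how large `struct foo` is, and static inline functions give the compiler real types to check.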
[Mesa-dev] [PATCH 1/2] nv50: keep track of PGRAPH state in nv50_screen
Normally this is kept in nv50_context, and on switching the active context, the state is copied from the previous context. However when the last context is destroyed, this is lost, and a new context might later be created. When the currently-active context is destroyed, save its state in the screen, and restore it when setting the current context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363 Reported-by: Matteo Bruni matteo.myst...@gmail.com Signed-off-by: Ilia Mirkin imir...@alum.mit.edu Cc: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/nouveau/nv50/nv50_context.c| 11 ++-- src/gallium/drivers/nouveau/nv50/nv50_context.h| 29 +- src/gallium/drivers/nouveau/nv50/nv50_screen.h | 24 ++ .../drivers/nouveau/nv50/nv50_state_validate.c | 2 ++ 4 files changed, 36 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c index 2cfd5db..5b5d391 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c @@ -138,8 +138,11 @@ nv50_destroy(struct pipe_context *pipe) { struct nv50_context *nv50 = nv50_context(pipe); - if (nv50_context_screen(nv50)-cur_ctx == nv50) - nv50_context_screen(nv50)-cur_ctx = NULL; + if (nv50-screen-cur_ctx == nv50) { + nv50-screen-cur_ctx = NULL; + /* Save off the state in case another context gets created */ + nv50-screen-save_state = nv50-state; + } nouveau_pushbuf_bufctx(nv50-base.pushbuf, NULL); nouveau_pushbuf_kick(nv50-base.pushbuf, nv50-base.pushbuf-channel); @@ -290,6 +293,10 @@ nv50_create(struct pipe_screen *pscreen, void *priv) pipe-get_sample_position = nv50_context_get_sample_position; if (!screen-cur_ctx) { + /* Restore the last context's state here, normally handled during + * context switch + */ + nv50-state = screen-save_state; screen-cur_ctx = nv50; nouveau_pushbuf_bufctx(screen-base.pushbuf, nv50-bufctx); } diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
b/src/gallium/drivers/nouveau/nv50/nv50_context.h index 45eb554..1f123ef 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h @@ -104,28 +104,7 @@ struct nv50_context { uint32_t dirty; boolean cb_dirty; - struct { - uint32_t instance_elts; /* bitmask of per-instance elements */ - uint32_t instance_base; - uint32_t interpolant_ctrl; - uint32_t semantic_color; - uint32_t semantic_psize; - int32_t index_bias; - boolean uniform_buffer_bound[3]; - boolean prim_restart; - boolean point_sprite; - boolean rt_serialize; - boolean flushed; - boolean rasterizer_discard; - uint8_t tls_required; - boolean new_tls_space; - uint8_t num_vtxbufs; - uint8_t num_vtxelts; - uint8_t num_textures[3]; - uint8_t num_samplers[3]; - uint8_t prim_size; - uint16_t scissor; - } state; + struct nv50_graph_state state; struct nv50_blend_stateobj *blend; struct nv50_rasterizer_stateobj *rast; @@ -191,12 +170,6 @@ nv50_context(struct pipe_context *pipe) return (struct nv50_context *)pipe; } -static INLINE struct nv50_screen * -nv50_context_screen(struct nv50_context *nv50) -{ - return nv50_screen(nv50-base.screen-base); -} - /* return index used in nv50_context arrays for a specific shader type */ static INLINE unsigned nv50_context_shader_stage(unsigned pipe) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h index f8ce365..881051b 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h @@ -25,10 +25,34 @@ struct nv50_context; struct nv50_blitter; +struct nv50_graph_state { + uint32_t instance_elts; /* bitmask of per-instance elements */ + uint32_t instance_base; + uint32_t interpolant_ctrl; + uint32_t semantic_color; + uint32_t semantic_psize; + int32_t index_bias; + boolean uniform_buffer_bound[3]; + boolean prim_restart; + boolean point_sprite; + boolean rt_serialize; + boolean flushed; + boolean rasterizer_discard; + 
uint8_t tls_required; + boolean new_tls_space; + uint8_t num_vtxbufs; + uint8_t num_vtxelts; + uint8_t num_textures[3]; + uint8_t num_samplers[3]; + uint8_t prim_size; + uint16_t scissor; +}; + struct nv50_screen { struct nouveau_screen base; struct nv50_context *cur_ctx; + struct nv50_graph_state save_state; struct nouveau_bo *code; struct nouveau_bo *uniforms; diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c index 85e19b4..116bf4b 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c +++
[Mesa-dev] [PATCH 2/2] nvc0: keep track of PGRAPH state in nvc0_screen
See identical commit for nv50. Destroying the current context and then creating a new one or switching to another existing context would cause the current state to not be properly initialized, so we save it off in the screen. Signed-off-by: Ilia Mirkin imir...@alum.mit.edu Cc: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/nouveau/nvc0/nvc0_context.c| 7 +- src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 24 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 25 ++ .../drivers/nouveau/nvc0/nvc0_state_validate.c | 2 ++ 4 files changed, 34 insertions(+), 24 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c index 7662fb5..7904984 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c @@ -139,8 +139,12 @@ nvc0_destroy(struct pipe_context *pipe) { struct nvc0_context *nvc0 = nvc0_context(pipe); - if (nvc0-screen-cur_ctx == nvc0) + if (nvc0-screen-cur_ctx == nvc0) { nvc0-screen-cur_ctx = NULL; + nvc0-screen-save_state = nvc0-state; + nvc0-screen-save_state.tfb = NULL; + } + /* Unset bufctx, we don't want to revalidate any resources after the flush. * Other contexts will always set their bufctx again on action calls. 
*/ @@ -303,6 +307,7 @@ nvc0_create(struct pipe_screen *pscreen, void *priv) pipe-get_sample_position = nvc0_context_get_sample_position; if (!screen-cur_ctx) { + nvc0-state = screen-save_state; screen-cur_ctx = nvc0; nouveau_pushbuf_bufctx(screen-base.pushbuf, nvc0-bufctx); } diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h index ef251f3..a8d7593 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h @@ -113,29 +113,7 @@ struct nvc0_context { uint32_t dirty; uint32_t dirty_cp; /* dirty flags for compute state */ - struct { - boolean flushed; - boolean rasterizer_discard; - boolean early_z_forced; - boolean prim_restart; - uint32_t instance_elts; /* bitmask of per-instance elements */ - uint32_t instance_base; - uint32_t constant_vbos; - uint32_t constant_elts; - int32_t index_bias; - uint16_t scissor; - uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */ - uint8_t num_vtxbufs; - uint8_t num_vtxelts; - uint8_t num_textures[6]; - uint8_t num_samplers[6]; - uint8_t tls_required; /* bitmask of shader types using l[] */ - uint8_t c14_bound; /* whether immediate array constbuf is bound */ - uint8_t clip_enable; - uint32_t clip_mode; - uint32_t uniform_buffer_bound[5]; - struct nvc0_transform_feedback_state *tfb; - } state; + struct nvc0_graph_state state; struct nvc0_blend_stateobj *blend; struct nvc0_rasterizer_stateobj *rast; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h index 8a1991f..bce0f4a 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h @@ -27,10 +27,35 @@ struct nvc0_context; struct nvc0_blitter; +struct nvc0_graph_state { + boolean flushed; + boolean rasterizer_discard; + boolean early_z_forced; + boolean prim_restart; + uint32_t instance_elts; /* bitmask of per-instance elements */ + uint32_t 
instance_base; + uint32_t constant_vbos; + uint32_t constant_elts; + int32_t index_bias; + uint16_t scissor; + uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */ + uint8_t num_vtxbufs; + uint8_t num_vtxelts; + uint8_t num_textures[6]; + uint8_t num_samplers[6]; + uint8_t tls_required; /* bitmask of shader types using l[] */ + uint8_t c14_bound; /* whether immediate array constbuf is bound */ + uint8_t clip_enable; + uint32_t clip_mode; + uint32_t uniform_buffer_bound[5]; + struct nvc0_transform_feedback_state *tfb; +}; + struct nvc0_screen { struct nouveau_screen base; struct nvc0_context *cur_ctx; + struct nvc0_graph_state save_state; int num_occlusion_queries_active; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c index 6051f12..d3ad81d 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c @@ -543,6 +543,8 @@ nvc0_switch_pipe_context(struct nvc0_context *ctx_to) if (ctx_from) ctx_to-state = ctx_from-state; + else + ctx_to-state = ctx_to-screen-save_state; ctx_to-dirty = ~0; ctx_to-viewports_dirty = ~0; -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
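The idea shared by both patches can be sketched in isolation: when the currently active context is destroyed, its state is parked in the screen, and the next context to become current picks it up from there. The sketch below uses hypothetical `demo_*` types and a two-field state struct; it is a simplified model of the approach, not the nouveau code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the per-context graph state. */
struct demo_graph_state {
    unsigned scissor;
    int index_bias;
};

struct demo_context;

struct demo_screen {
    struct demo_context *cur_ctx;       /* context whose state the HW holds */
    struct demo_graph_state save_state; /* state of the last destroyed one  */
};

struct demo_context {
    struct demo_screen *screen;
    struct demo_graph_state state;
};

/* Creation: if no context is current, inherit the state saved in the screen. */
static void demo_context_init(struct demo_context *ctx,
                              struct demo_screen *screen)
{
    ctx->screen = screen;
    ctx->state = (struct demo_graph_state){0};
    if (!screen->cur_ctx) {
        ctx->state = screen->save_state;
        screen->cur_ctx = ctx;
    }
}

/* Destruction: if this context is current, park its state in the screen. */
static void demo_context_fini(struct demo_context *ctx)
{
    assert(ctx->screen != NULL);
    if (ctx->screen->cur_ctx == ctx) {
        ctx->screen->cur_ctx = NULL;
        ctx->screen->save_state = ctx->state;
    }
}

/* Switch: copy from the previous context, or from the screen if none. */
static void demo_switch_context(struct demo_context *to)
{
    struct demo_context *from = to->screen->cur_ctx;

    if (from)
        to->state = from->state;
    else
        to->state = to->screen->save_state;
    to->screen->cur_ctx = to;
}
```

Note that the nvc0 patch additionally clears the `tfb` pointer in the saved copy, since it refers to an object owned by the destroyed context; the sketch has no such pointer member.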
Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option
On 08.05.2015 03:24, Tom Stellard wrote:
> For this particular situation, I'm happy with any solution that:
>
> 1. Allows a user to install the icd file to /etc if he or she wants to.

--sysconfdir=/etc

That covers drirc as well.

> 2. Does not require the user to read the spec to know that /etc is the
> correct place to install it.

I think the above is pretty standard for autotools projects.

I think it would be better to document this in the appropriate place(s) for OpenCL users than to add another convoluted option which doesn't really add any flexibility.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On Thu, May 7, 2015 at 8:49 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Thu, May 7, 2015 at 6:17 PM, Matt Turner matts...@gmail.com wrote: On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote: GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell: total instructions in shared programs: 2724483 - 2711790 (-0.47%) instructions in affected programs: 1860859 - 1848166 (-0.68%) helped:4387 HURT: 4758 GAINED:1499 The gained programs are ARB vertext programs that were previously going through the vec4 backend. Now that we have prog_to_nir, ARB vertex programs can go through the scalar backend so they show up as gained in the shader-db results. Again, I'm kind of confused and disappointed that we're just okay with hurting 4700 programs without more analysis. I guess I'll go do that... I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from 297 - 161 instructions. More concerning, the number of send instructions drop from 36 to 12, and a loop that was 111 instructions long suddenly becomes START B1 -B0 -B2 cmp.ge.f0(8)nullg428,8,1D g70,1,0D (+f0) break(8) JIP: 24 UIP: 24 END B1 -B3 -B2 START B2 -B1 add(8) g421D g428,8,1D 1D while(8)JIP: -32 END B2 -B1 That deserves a lot more investigation. I'll take a gamble and say something is broken. I did a little looking at that shader and it looks like NIR dead-coded the contents of a for loop and, as a result, a bunch of stuff was promoted to push constants, hence fewer sampler messages. I didn't find anything broken but, then again, that's hard to do without being able to verifiably run the shader. I'll try and look at the places where we end up with more instructions. --Jason Looking at the assembly even closer, it looks like NIR did 100% the right thing. The shader had a for loop that computes a bunch of values that either don't get used at all or are over-written before they are used. 
(I didn't check every value written in the loop but I did check a good half-dozen or so.) NIR, probably thanks to SSA, realized that these values were never used for anything, and dead-coded the entire contents of the for loop. The result was that the 12 (yes, 12) pull constant loads inside the loop went away and the 9 after the loop were promoted to push constants. Unfortunately, NIR isn't yet smart enough to remove the loop entirely but an empty loop isn't nearly as expensive as sampler invocations so I'm not too worried about it. I'll try and take a look at some of the hurt programs tomorrow. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
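The pattern described in this reply, a loop whose results are overwritten before anything reads them, can be illustrated with a small C stand-in. This is a hypothetical analogy, not the shader from shader-db: `demo_load` plays the role of a pull constant load, and the second function shows what is left once the dead loop body is removed while the loop itself remains.

```c
#include <assert.h>

/* Hypothetical stand-in for an expensive load inside the loop. */
static float demo_load(const float *table, int i)
{
    return table[i];
}

/* Before: every iteration writes tmp, but nothing reads it before the
 * unconditional overwrite after the loop. */
float demo_with_dead_loop(const float *table, int n)
{
    float tmp = 0.0f;

    assert(table);
    for (int i = 0; i < n; i++)
        tmp = demo_load(table, i) * 2.0f;
    tmp = table[0] + 1.0f;
    return tmp;
}

/* After dead-code elimination of the loop body: the loads are gone, and an
 * empty loop is left behind because the loop itself was not removed. */
float demo_after_dce(const float *table, int n)
{
    assert(table);
    for (int i = 0; i < n; i++) {
        /* empty */
    }
    return table[0] + 1.0f;
}
```

Both functions return the same value for any input, which is why dropping the loads is a legal transformation.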
Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+
On Thu, May 7, 2015 at 6:17 PM, Matt Turner matts...@gmail.com wrote: On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote: GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell: total instructions in shared programs: 2724483 - 2711790 (-0.47%) instructions in affected programs: 1860859 - 1848166 (-0.68%) helped:4387 HURT: 4758 GAINED:1499 The gained programs are ARB vertext programs that were previously going through the vec4 backend. Now that we have prog_to_nir, ARB vertex programs can go through the scalar backend so they show up as gained in the shader-db results. Again, I'm kind of confused and disappointed that we're just okay with hurting 4700 programs without more analysis. I guess I'll go do that... What confuses me more is why the results aren't better. When we first turned NIR on by default for FS, the shader-db results looked a lot better. On one branch (wip/nir-by-default-v2) I applied the ATTR copy-prop and we had the following: GLSL IR vs. NIR shader-db results on Broadwell (VS only): total instructions in shared programs: 7106293 - 7001640 (-1.47%) instructions in affected programs: 4604798 - 4500145 (-2.27%) helped:16786 HURT: 8442 GAINED:1563 LOST: 1526 The difference between gained/lost was due to capturing standard error. However, that shouldn't affect the over-all numbers that much. I think adding the improved ffma stuff probably made a bunch of the difference. As far as when we turn it on, I do think that we want to do it before the merge window closes if we can. Being able to delete the visitor after the branch would be really nice. Also, we want to get people testing it and reporting bugs because we're not going to find every bug in every vertex shader by inspection. I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from 297 - 161 instructions. 
More concerning, the number of send instructions drop from 36 to 12, and a loop that was 111 instructions long suddenly becomes START B1 -B0 -B2 cmp.ge.f0(8)nullg428,8,1D g70,1,0D (+f0) break(8) JIP: 24 UIP: 24 END B1 -B3 -B2 START B2 -B1 add(8) g421D g428,8,1D 1D while(8)JIP: -32 END B2 -B1 That deserves a lot more investigation. I'll take a gamble and say something is broken. Yes, that needs some investigation. I can also take a look at some of the hurt and/or really helped shaders as well and see what I find. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.
On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote: This patch implements the binding table enable command which is also used to allocate a binding table pool where hardware-generated binding table entries are flushed into. Each binding table offset in the binding table pool is unique per each shader stage that are enabled within a batch. Also insert the required brw_tracked_state objects to enable hw-generated binding tables in normal render path. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 ++ src/mesa/drivers/dri/i965/brw_context.c| 4 ++ src/mesa/drivers/dri/i965/brw_context.h| 5 ++ src/mesa/drivers/dri/i965/brw_state.h | 7 +++ src/mesa/drivers/dri/i965/brw_state_upload.c | 2 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 4 ++ 6 files changed, 92 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 459165a..a58e32e 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -44,6 +44,11 @@ #include brw_state.h #include intel_batchbuffer.h +/* Somehow the hw-binding table pool offset must start here, otherwise + * the GPU will hang + */ +#define HW_BT_START_OFFSET 256; I think we want to understand this a little better before enabling... + /** * Upload a shader stage's binding table as indirect state. * @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = { .emit = brw_gs_upload_binding_table, }; +/** + * Hardware-generated binding tables for the resource streamer + */ +void +gen7_disable_hw_binding_tables(struct brw_context *brw) +{ + BEGIN_BATCH(3); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC 16 | (3 - 2)); + OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) | + brw-is_haswell ? 
HSW_HW_BINDING_TABLE_RESERVED : 0); + OUT_BATCH(0); + ADVANCE_BATCH(); + + /* Pipe control workaround */ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); +} + +void +gen7_enable_hw_binding_tables(struct brw_context *brw) +{ + if (!brw-has_resource_streamer) { + gen7_disable_hw_binding_tables(brw); I started wondering why we really need this - RS is disabled by default and we haven't needed to do anything to disable it before. + return; + } + + if (!brw-hw_bt_pool.bo) { + /* From the BSpec, 3D Pipeline Resource Streamer Hardware Binding Tables: + * + * A maximum of 16,383 Binding tables are allowed in any batch buffer. + */ + int max_size = 16383 * 4; But does it really need this much all the time? I guess I need to go and read the spec. + brw-hw_bt_pool.bo = drm_intel_bo_alloc(brw-bufmgr, hw_bt, + max_size, 64); + brw-hw_bt_pool.next_offset = HW_BT_START_OFFSET; + } + + uint32_t dw1 = SET_FIELD(BRW_HW_BINDING_TABLE_ON, BRW_HW_BINDING_TABLE_ENABLE); + if (brw-is_haswell) + dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_MOCS) | HSW_HW_BINDING_TABLE_RESERVED; These are overflowing 80 columns. + + BEGIN_BATCH(3); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC 16 | (3 - 2)); + OUT_RELOC(brw-hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1); + OUT_RELOC(brw-hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, + brw-hw_bt_pool.bo-size); + ADVANCE_BATCH(); + + /* Pipe control workaround */ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); Would you have a spec reference for this? 
+} + +void +gen7_reset_rs_pool_offsets(struct brw_context *brw) +{ + brw-hw_bt_pool.next_offset = HW_BT_START_OFFSET; +} + +const struct brw_tracked_state gen7_hw_binding_tables = { + .dirty = { + .mesa = 0, + .brw = BRW_NEW_BATCH, + }, + .emit = gen7_enable_hw_binding_tables +}; + /** @} */ /** diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index c7e1e81..9c7ccae 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -953,6 +953,10 @@ intelDestroyContext(__DRIcontext * driContextPriv) if (brw-wm.base.scratch_bo) drm_intel_bo_unreference(brw-wm.base.scratch_bo); + gen7_reset_rs_pool_offsets(brw); + drm_intel_bo_unreference(brw-hw_bt_pool.bo); + brw-hw_bt_pool.bo = NULL; + drm_intel_gem_context_destroy(brw-hw_ctx); if (ctx-swrast_context) { diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 07626af..1c72b74 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1360,6 +1360,11 @@ struct brw_context uint32_t fast_clear_op;
Re: [Mesa-dev] [PATCH 06/27] i965: Define gather push constants opcodes
On Tue, Apr 28, 2015 at 11:08:03PM +0300, Abdiel Janulgue wrote:
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index da288d3..8079433 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -2209,6 +2209,29 @@ enum brw_wm_barycentric_interp_mode {
>  #define _3DSTATE_CONSTANT_HS 0x7819 /* GEN7+ */
>  #define _3DSTATE_CONSTANT_DS 0x781A /* GEN7+ */
>
> +/* Resource streamer gather constants */
> +#define _3DSTATE_GATHER_POOL_ALLOC 0x791A /* GEN7.5+ */
> +#define _3DSTATE_GATHER_CONSTANT_VS 0x7834
> +#define _3DSTATE_GATHER_CONSTANT_GS 0x7835
> +#define _3DSTATE_GATHER_CONSTANT_HS 0x7836
> +#define _3DSTATE_GATHER_CONSTANT_DS 0x7837
> +#define _3DSTATE_GATHER_CONSTANT_PS 0x7838
> +/* Only required in HSW */
> +#define HSW_GATHER_CONSTANTS_RESERVED (3 << 4)
> +
> +#define BRW_GATHER_CONSTANTS_ENABLE_SHIFT 11 /* GEN7.5+ */
> +#define BRW_GATHER_CONSTANTS_ENABLE_MASK INTEL_MASK(11, 11)
> +#define BRW_GATHER_CONSTANTS_ON 1
> +#define BRW_GATHER_CONSTANTS_OFF 0

Such as below for SO_FUNCTION_ENABLE:

#define BRW_GATHER_CONSTANTS_ENABLE (1 << 11) /* GEN7.5+ */

> +#define BRW_GATHER_BUFFER_VALID_SHIFT 16
> +#define BRW_GATHER_BUFFER_VALID_MASK INTEL_MASK(31, 16)
> +#define BRW_GATHER_BINDING_TABLE_BLOCK_SHIFT 12
> +#define BRW_GATHER_BINDING_TABLE_BLOCK_MASK INTEL_MASK(15, 12)
> +#define BRW_GATHER_CONST_BUFFER_OFFSET_SHIFT 8
> +#define BRW_GATHER_CONST_BUFFER_OFFSET_MASK INTEL_MASK(15, 8)
> +#define BRW_GATHER_CHANNEL_MASK_SHIFT 4
> +#define BRW_GATHER_CHANNEL_MASK_MASK INTEL_MASK(7, 4)
> +
>  #define _3DSTATE_STREAMOUT 0x781e /* GEN7+ */
>  /* DW1 */
>  # define SO_FUNCTION_ENABLE (1 << 31)
> --
> 1.9.1
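The review comment contrasts two ways of describing a one-bit field: a SHIFT/MASK pair used through a field-packing macro, and a single pre-shifted bit define. The sketch below shows that both produce the same dword. The `DEMO_*` macros are local, hypothetical definitions modeled on the SHIFT/MASK naming in the patch; they are assumptions for illustration and not copied from brw_defines.h.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits low..high inclusive (width must be below 32). */
#define DEMO_MASK(high, low) \
    ((((uint32_t)1 << ((high) - (low) + 1)) - 1) << (low))

/* Place a value into a field described by FIELD_SHIFT and FIELD_MASK. */
#define DEMO_SET_FIELD(value, field) \
    (((uint32_t)(value) << field##_SHIFT) & field##_MASK)

/* Extract a field described by FIELD_SHIFT and FIELD_MASK. */
#define DEMO_GET_FIELD(word, field) \
    (((uint32_t)(word) & field##_MASK) >> field##_SHIFT)

/* Style 1: SHIFT/MASK pair, as in the patch. */
#define DEMO_ENABLE_SHIFT 11
#define DEMO_ENABLE_MASK  DEMO_MASK(11, 11)
#define DEMO_BUFFER_VALID_SHIFT 16
#define DEMO_BUFFER_VALID_MASK  DEMO_MASK(31, 16)

/* Style 2: a single pre-shifted bit, as the reviewer suggests. */
#define DEMO_ENABLE_BIT ((uint32_t)1 << 11)

/* Build a dword from an enable flag and a 16-bit buffer-valid mask. */
static uint32_t demo_pack(unsigned enable, unsigned valid)
{
    assert(enable <= 1);
    return DEMO_SET_FIELD(enable, DEMO_ENABLE) |
           DEMO_SET_FIELD(valid, DEMO_BUFFER_VALID);
}
```

For a single-bit flag the pre-shifted define is shorter at the call site, while the SHIFT/MASK pair pays off for multi-bit fields where the value has to be masked.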
Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters
On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
> Now that we consider UBO constants as push constants, we need to include
> the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
> This information is needed to properly pack vector constants tightly next
> to each other.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
>  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
>  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
>  3 files changed, 37 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c
> index 97658d5..2dc3ea1 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -32,6 +32,7 @@
>  #include "brw_vec4_gs_visitor.h"
>  #include "brw_state.h"
>  #include "brw_ff_gs.h"
> +#include "glsl/nir/nir_types.h"
>
>  bool
> @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
>     c.prog_data.base.base.pull_param =
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     c.prog_data.base.base.nr_params = param_count;
> +   c.prog_data.base.base.nr_ubo_params = 0;
> +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
> +      for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
> +         const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
> +         const struct glsl_type *elem = glsl_get_element_type(type);
> +         int array_sz = elem ? glsl_get_array_size(type) : 1;
> +         int components = elem ? glsl_get_components(elem) : glsl_get_components(type);

As mentioned on the previous patch, I've sent a patch to remove the element
type helper. I'm not sure I understand why the nir wrappers need to be used
here; can you explain for my benefit?

Another way to write this without element type could be something like this:

   const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
   int array_sz = MAX2(glsl_get_array_size(type), 1);
   int components = glsl_get_components(glsl_get_type_without_array(type));

You would obviously need to wrap the without_array() helper instead.

Assuming arrays of arrays support is required here in future (the spec says
uniform blocks can be arrays of arrays, but I'm not overly familiar with the
code you're working on), the only bit missing would then be multiplying the
array size by the other array dimensions.

> +         c.prog_data.base.base.nr_ubo_params += components * array_sz;
> +      }
> +   }
>     c.prog_data.base.base.nr_gather_table = 0;
>     c.prog_data.base.base.gather_table = rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
> index 52333c9..86bef5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -37,6 +37,7 @@
>  #include "brw_state.h"
>  #include "program/prog_print.h"
>  #include "program/prog_parameter.h"
> +#include "glsl/nir/nir_types.h"
>
>  #include "util/ralloc.h"
>
> @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     stage_prog_data->nr_params = param_count;
> +   stage_prog_data->nr_ubo_params = 0;
> +   if (vs) {
> +      for (int i = 0; i < vs->NumUniformBlocks; i++) {
> +         for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
> +            const struct glsl_type *type = vs->UniformBlocks[i].Uniforms[p].Type;
> +            const struct glsl_type *elem = glsl_get_element_type(type);
> +            int array_sz = elem ? glsl_get_array_size(type) : 1;
> +            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
> +            stage_prog_data->nr_ubo_params += components * array_sz;
> +         }
> +      }
> +   }
>     stage_prog_data->nr_gather_table = 0;
>     stage_prog_data->gather_table = rzalloc_size(NULL, sizeof(*stage_prog_data->gather_table) *
>                                                  (stage_prog_data->nr_params +
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
> index 13a64d8..2060eab 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -38,6 +38,7 @@
>  #include "main/samplerobj.h"
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
> +#include "glsl/nir/nir_types.h"
>
>  #include "intel_mipmap_tree.h"
>  #include "util/ralloc.h"
> @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     prog_data.base.nr_params = param_count;
> +   prog_data.base.nr_ubo_params = 0;
> +   if (fs) {
> +      for (int i = 0; i < fs->NumUniformBlocks; i++) {
> +         for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
> +            const struct glsl_type *type = fs->UniformBlocks[i].Uniforms[p].Type;
> +            const struct glsl_type *elem = glsl_get_element_type(type);
> +
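For readers following the review above, here is a minimal sketch of the slot count the reviewer describes, extended to arrays of arrays by multiplying every array dimension together. It is not Mesa code: `struct mock_type` and `ubo_param_slots()` are hypothetical stand-ins for `glsl_type` and the helpers discussed in the thread, assumed only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for glsl_type; not Mesa's real structure. */
struct mock_type {
   int components;                /* vector components of a non-array type */
   int length;                    /* array length, 0 if not an array */
   const struct mock_type *elem;  /* element type, NULL if not an array */
};

/* Slots needed by one uniform: the product of all array dimensions
 * times the component count of the innermost (non-array) type.
 */
static int
ubo_param_slots(const struct mock_type *type)
{
   int array_sz = 1;

   assert(type != NULL);
   while (type->elem != NULL) {
      array_sz *= type->length;   /* fold in this array dimension */
      type = type->elem;          /* descend towards the element type */
   }
   return array_sz * type->components;
}
```

Under these assumptions a plain vec4 needs 4 slots, a vec4[3] needs 3 * 4 = 12, and a vec4[2][3] needs 2 * 3 * 4 = 24.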
[Mesa-dev] [PATCH v2 0/6] Continue enabling OpenGL ES 3.1
Changes to my previous patch set according to comments from Tapani Palli. This will only expose the enums for the respective extensions to gles 3.1 and GL Core.

Marta Lofstedt (6):
  mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
  mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1
  mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1
  mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1

 src/mesa/main/get.c              | 36
 src/mesa/main/get_hash_params.py | 88
 2 files changed, 80 insertions(+), 44 deletions(-)

--
1.9.1
[Mesa-dev] [PATCH v2 2/6] mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose ARB_shader_atomic_counters enums for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 23 +--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 73739b6..f5318d5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -361,6 +361,12 @@ static const int extra_ARB_shader_image_load_store_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_shader_atomic_counters_es31[] = {
+   EXT(ARB_shader_atomic_counters),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 85c2494..f9bf749 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -421,6 +421,18 @@ descriptor=[
   [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
   [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
   [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+# GL_ARB_shader_atomic_counters / GLES 3.1
+  [ "ATOMIC_COUNTER_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_ATOMIC_COUNTER_BUFFER_BINDINGS", "CONTEXT_INT(Const.MaxAtomicBufferBindings)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_ATOMIC_COUNTER_BUFFER_SIZE", "CONTEXT_INT(Const.MaxAtomicBufferSize)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_VERTEX_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_VERTEX_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_FRAGMENT_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_GEOMETRY_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.MaxCombinedAtomicBuffers)", "extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_COMBINED_ATOMIC_COUNTERS", "CONTEXT_INT(Const.MaxCombinedAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
 ]},

 # Remaining enums are only in OpenGL
@@ -771,18 +783,9 @@ descriptor=[
 # GL_ARB_separate_shader_objects
   [ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, GL_PROGRAM_PIPELINE_BINDING", "NO_EXTRA" ],

-# GL_ARB_shader_atomic_counters
-  [ "ATOMIC_COUNTER_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_ATOMIC_COUNTER_BUFFER_BINDINGS", "CONTEXT_INT(Const.MaxAtomicBufferBindings)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_ATOMIC_COUNTER_BUFFER_SIZE", "CONTEXT_INT(Const.MaxAtomicBufferSize)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_VERTEX_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_VERTEX_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_FRAGMENT_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters" ],
+# GL_ARB_shader_atomic_counters and geometry shaders
   [ "MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers)", "extra_ARB_shader_atomic_counters_and_geometry_shader" ],
   [ "MAX_GEOMETRY_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters_and_geometry_shader" ],
-  [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.MaxCombinedAtomicBuffers)", "extra_ARB_shader_atomic_counters" ],
-  [ "MAX_COMBINED_ATOMIC_COUNTERS", "CONTEXT_INT(Const.MaxCombinedAtomicCounters)", "extra_ARB_shader_atomic_counters" ],

 # GL_ARB_vertex_attrib_binding
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset)", "NO_EXTRA" ],
--
1.9.1
[Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2: only expose enums from GL_ARB_shader_image_load_store for gles 3.1 and GL core

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9898197..73739b6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_shader_image_load_store_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 513d5d2..85c2494 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -413,6 +413,14 @@ descriptor=[
 { "apis": ["GL_CORE", "GLES3"], "params": [
 # GL_ARB_draw_indirect / GLES 3.1
   [ "DRAW_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0", "extra_ARB_draw_indirect_es31" ],
+# GL_ARB_shader_image_load_store / GLES 3.1
+  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_VERTEX_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
 ]},

 # Remaining enums are only in OpenGL
@@ -780,15 +788,6 @@ descriptor=[
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset)", "NO_EXTRA" ],
   [ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings)", "NO_EXTRA" ],

-# GL_ARB_shader_image_load_store
-  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_VERTEX_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms)", "extra_ARB_shader_image_load_store_and_geometry_shader"],
-  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms)", "extra_ARB_shader_image_load_store"],
-
 # GL_ARB_compute_shader
   [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations)", "extra_ARB_compute_shader" ],
   [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS)", "extra_ARB_compute_shader" ],
--
1.9.1
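The `extra_*_es31` arrays in this series pair an extension with EXTRA_API_ES31. As far as I understand get.c, a query is accepted when any one entry of the array is satisfied, so the enum becomes visible either through the desktop extension or through a GLES 3.1 context. The sketch below only illustrates that "any entry matches" rule under this assumption; `struct mock_context`, `MOCK_EXT()`, the `MOCK_*` constants and `mock_check_extra()` are hypothetical names, not Mesa's real types or functions.

```c
#include <stdbool.h>

/* Hypothetical encodings; Mesa's real EXTRA_* values and EXT() offsets differ. */
enum { MOCK_EXTRA_END = 0, MOCK_EXTRA_API_ES31 = 1, MOCK_EXT_BASE = 100 };
#define MOCK_EXT(index) (MOCK_EXT_BASE + (index))

/* Hypothetical, heavily reduced stand-in for struct gl_context. */
struct mock_context {
   bool is_gles31;       /* true for an OpenGL ES 3.1 context */
   bool extensions[4];   /* per-extension enable flags */
};

/* Returns true if the query is allowed: either the list is empty, or at
 * least one listed requirement (API check or extension) is satisfied.
 */
static bool
mock_check_extra(const struct mock_context *ctx, const int *extra)
{
   bool checked = false;
   bool found = false;

   for (const int *e = extra; *e != MOCK_EXTRA_END; e++) {
      checked = true;
      if (*e == MOCK_EXTRA_API_ES31)
         found = found || ctx->is_gles31;
      else
         found = found || ctx->extensions[*e - MOCK_EXT_BASE];
   }
   return !checked || found;
}
```

With a list of `{ MOCK_EXT(2), MOCK_EXTRA_API_ES31, MOCK_EXTRA_END }`, a GLES 3.1 context passes even without the extension flag, a desktop context passes only when the flag at index 2 is set, and a context with neither is rejected.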
[Mesa-dev] [PATCH v2 6/6] mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_explicit_uniform_location enums for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 97d3bf0..6fc0f3f 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -385,6 +385,12 @@ static const int extra_ARB_compute_shader_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_explicit_uniform_location_es31[] = {
+   EXT(ARB_explicit_uniform_location),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 985f252..6b07888 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -454,6 +454,8 @@ descriptor=[
   [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE)", "extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS)", "extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS)", "extra_ARB_compute_shader_es31" ],
+# GL_ARB_explicit_uniform_location / GLES 3.1
+  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "extra_ARB_explicit_uniform_location_es31" ],
 ]},

 # Remaining enums are only in OpenGL
@@ -539,7 +541,6 @@ descriptor=[
   [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING)", "NO_EXTRA" ],
   [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH)", "NO_EXTRA" ],
   [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE)", "NO_EXTRA" ],
-  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations)", "extra_ARB_explicit_uniform_location" ],
   [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth)", "NO_EXTRA" ],
   [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst)", "NO_EXTRA" ],
   [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes)", "NO_EXTRA" ],
--
1.9.1
[Mesa-dev] [PATCH v2 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_texture_multisample enums for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index f5318d5..dcf4f0a 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -367,6 +367,12 @@ static const int extra_ARB_shader_atomic_counters_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_texture_multisample_es31[] = {
+   EXT(ARB_texture_multisample),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index f9bf749..10c32f2 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -433,6 +433,14 @@ descriptor=[
   [ "MAX_GEOMETRY_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
   [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.MaxCombinedAtomicBuffers)", "extra_ARB_shader_atomic_counters_es31" ],
   [ "MAX_COMBINED_ATOMIC_COUNTERS", "CONTEXT_INT(Const.MaxCombinedAtomicCounters)", "extra_ARB_shader_atomic_counters_es31" ],
+# GL_ARB_texture_multisample / GLES 3.1
+  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, TEXTURE_2D_MULTISAMPLE_INDEX", "extra_ARB_texture_multisample_es31" ],
+  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX", "extra_ARB_texture_multisample_es31" ],
+  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples)", "extra_ARB_texture_multisample_es31" ],
+  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples)", "extra_ARB_texture_multisample_es31" ],
+  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples)", "extra_ARB_texture_multisample_es31" ],
+  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask)", "extra_ARB_texture_multisample_es31" ],
+  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1)", "extra_ARB_texture_multisample_es31" ],
 ]},

 # Remaining enums are only in OpenGL
@@ -718,15 +726,6 @@ descriptor=[
   [ "TEXTURE_BUFFER_FORMAT_ARB", "LOC_CUSTOM, TYPE_INT, 0", "extra_texture_buffer_object" ],
   [ "TEXTURE_BUFFER_ARB", "LOC_CUSTOM, TYPE_INT, 0", "extra_texture_buffer_object" ],

-# GL_ARB_texture_multisample / GL 3.2
-  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, TEXTURE_2D_MULTISAMPLE_INDEX", "extra_ARB_texture_multisample" ],
-  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX", "extra_ARB_texture_multisample" ],
-  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples)", "extra_ARB_texture_multisample" ],
-  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples)", "extra_ARB_texture_multisample" ],
-  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples)", "extra_ARB_texture_multisample" ],
-  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask)", "extra_ARB_texture_multisample" ],
-  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1)", "extra_ARB_texture_multisample" ],
-
 # GL 3.0
   [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags)", "extra_version_30" ],
--
1.9.1
[Mesa-dev] [PATCH v2 4/6] mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_texture_gather enums for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 9 -
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index dcf4f0a..95868bf 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -373,6 +373,12 @@ static const int extra_ARB_texture_multisample_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_texture_gather_es31[] = {
+   EXT(ARB_texture_gather),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 10c32f2..50af078 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -441,6 +441,10 @@ descriptor=[
   [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples)", "extra_ARB_texture_multisample_es31" ],
   [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask)", "extra_ARB_texture_multisample_es31" ],
   [ "MAX_SAMPLE_MASK_WORDS", "CONST(1)", "extra_ARB_texture_multisample_es31" ],
+# GL_ARB_texture_gather / GLES 3.1
+  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MinProgramTextureGatherOffset)", "extra_ARB_texture_gather_es31"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MaxProgramTextureGatherOffset)", "extra_ARB_texture_gather_es31"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents)", "extra_ARB_texture_gather_es31"],
 ]},

 # Remaining enums are only in OpenGL
@@ -774,11 +778,6 @@ descriptor=[
 # GL_ARB_texture_cube_map_array
   [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX", "extra_ARB_texture_cube_map_array" ],

-# GL_ARB_texture_gather
-  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MinProgramTextureGatherOffset)", "extra_ARB_texture_gather"],
-  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MaxProgramTextureGatherOffset)", "extra_ARB_texture_gather"],
-  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents)", "extra_ARB_texture_gather"],
-
 # GL_ARB_separate_shader_objects
   [ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, GL_PROGRAM_PIPELINE_BINDING", "NO_EXTRA" ],
--
1.9.1
[Mesa-dev] [PATCH v2 5/6] mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1
From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_compute_shader enums for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c              | 6 ++
 src/mesa/main/get_hash_params.py | 19 +--
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 95868bf..97d3bf0 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -379,6 +379,12 @@ static const int extra_ARB_texture_gather_es31[] = {
    EXTRA_END
 };

+static const int extra_ARB_compute_shader_es31[] = {
+   EXT(ARB_compute_shader),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 50af078..985f252 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -445,6 +445,15 @@ descriptor=[
   [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MinProgramTextureGatherOffset)", "extra_ARB_texture_gather_es31"],
   [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", "CONTEXT_INT(Const.MaxProgramTextureGatherOffset)", "extra_ARB_texture_gather_es31"],
   [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents)", "extra_ARB_texture_gather_es31"],
+# GL_ARB_compute_shader / GLES 3.1
+  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_TEXTURE_IMAGE_UNITS", "CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_ATOMIC_COUNTERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTERS)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS)", "extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS)", "extra_ARB_compute_shader_es31" ],
 ]},

 # Remaining enums are only in OpenGL
@@ -789,16 +798,6 @@ descriptor=[
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset)", "NO_EXTRA" ],
   [ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings)", "NO_EXTRA" ],

-# GL_ARB_compute_shader
-  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_TEXTURE_IMAGE_UNITS", "CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_ATOMIC_COUNTERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTERS)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS)", "extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS)", "extra_ARB_compute_shader" ],
-
 # GL_ARB_gpu_shader5
   [ "MAX_GEOMETRY_SHADER_INVOCATIONS", "CONST(MAX_GEOMETRY_SHADER_INVOCATIONS)", "extra_ARB_gpu_shader5" ],
   [ "MIN_FRAGMENT_INTERPOLATION_OFFSET", "CONTEXT_FLOAT(Const.MinFragmentInterpolationOffset)", "extra_ARB_gpu_shader5" ],
--
1.9.1
Re: [Mesa-dev] [PATCH 07/27] i965: Enable gather push constants
On Tue, Apr 28, 2015 at 11:08:04PM +0300, Abdiel Janulgue wrote:
> The 3DSTATE_GATHER_POOL_ALLOC is used to enable or disable the gather
> push constants feature within a context. This patch provides the toggle
> functionality of using gather push constants to program constant data
> within a batch.
>
> Using gather push constants require that a gather pool be allocated so
> that the resource streamer can flush the packed constants it gathered.
> The pool is later referenced by the 3DSTATE_CONSTANT_* command to
> program the push constant data.
>
> Also introduce INTEL_UBO_GATHER to selectively enable which shader stage
> uses gather constants for ubo fetches.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_binding_tables.c | 43 +-
>  src/mesa/drivers/dri/i965/brw_context.c        | 37 ++
>  src/mesa/drivers/dri/i965/brw_context.h        | 10 ++
>  src/mesa/drivers/dri/i965/brw_state.h          | 1 +
>  4 files changed, 90 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> index c1d188e..4793fbc 100644
> --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> @@ -236,9 +236,47 @@ gen7_update_binding_table_from_array(struct brw_context *brw,
>     ADVANCE_BATCH();
>  }
>
> +static void
> +gen7_init_gather_pool(struct brw_context *brw)
> +{
> +   if (!brw->has_resource_streamer)
> +      return;
> +
> +   if (!brw->gather_pool.bo) {
> +      brw->gather_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "gather_pool",
> +                                               brw->gather_pool.size, 4096);
> +      brw->gather_pool.next_offset = 0;
> +   }
> +}
> +
> +void
> +gen7_toggle_gather_constants(struct brw_context *brw, bool enable)
> +{
> +   if (enable && !brw->has_resource_streamer)
> +      return;
> +
> +   uint32_t dw1 = brw->is_haswell ? HSW_GATHER_CONSTANTS_RESERVED : 0;
> +
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(_3DSTATE_GATHER_POOL_ALLOC << 16 | (3 - 2));
> +   if (enable) {
> +      dw1 |= SET_FIELD(BRW_GATHER_CONSTANTS_ON, BRW_GATHER_CONSTANTS_ENABLE) |
> +         (brw->is_haswell ? GEN7_MOCS_L3 : 0);

This should align with the previous line.

> +      OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
> +      OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> +                brw->gather_pool.bo->size);
> +   } else {
> +      OUT_BATCH(dw1);
> +      OUT_BATCH(0);
> +   }
> +   ADVANCE_BATCH();
> +}
> +
>  void
>  gen7_disable_hw_binding_tables(struct brw_context *brw)
>  {
> +   gen7_toggle_gather_constants(brw, false);
> +
>     BEGIN_BATCH(3);
>     OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
>     OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) |
> @@ -280,6 +318,9 @@ gen7_enable_hw_binding_tables(struct brw_context *brw)
>               brw->hw_bt_pool.bo->size);
>     ADVANCE_BATCH();
>
> +   gen7_init_gather_pool(brw);
> +   gen7_toggle_gather_constants(brw, true);
> +
>     /* Pipe control workaround */
>     brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
>  }
> @@ -288,6 +329,7 @@ void
>  gen7_reset_rs_pool_offsets(struct brw_context *brw)
>  {
>     brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
> +   brw->gather_pool.next_offset = 0;
>  }
>
>  const struct brw_tracked_state gen7_hw_binding_tables = {
> @@ -371,5 +413,4 @@ const struct brw_tracked_state gen6_binding_table_pointers = {
>     },
>     .emit = gen6_upload_binding_table_pointers,
>  };
> -

Not related to this patch.

>  /** @} */
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index 9c7ccae..685ca70 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -67,6 +67,7 @@
>  #include "tnl/tnl.h"
>  #include "tnl/t_pipeline.h"
>  #include "util/ralloc.h"
> +#include "util/u_atomic.h"
>
>  #include "glsl/nir/nir.h"
>
> @@ -692,6 +693,25 @@ brw_get_revision(int fd)
>     return revision;
>  }
>
> +static void
> +brw_process_intel_gather_variable(struct brw_context *brw)
> +{
> +   uint64_t INTEL_UBO_GATHER = 0;
> +
> +   static const struct dri_debug_control gather_control[] = {
> +      { "vs", (1 << MESA_SHADER_VERTEX)},
> +      { "gs", (1 << MESA_SHADER_GEOMETRY)},
> +      { "fs", (1 << MESA_SHADER_FRAGMENT)},

You can drop the outermost ().

> +      { NULL, 0 }
> +   };
> +   uint64_t intel_ubo_gather = driParseDebugString(getenv("INTEL_UBO_GATHER"), gather_control);

Wrap to next line, overflowing 80.

> +   (void) p_atomic_cmpxchg(&INTEL_UBO_GATHER, 0, intel_ubo_gather);
> +
> +   brw->vs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_VERTEX));
> +   brw->gs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_GEOMETRY));
> +   brw->fs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_FRAGMENT));

Here also, you can drop the outermost ().

> +}
> +
>  GLboolean
>  brwCreateContext(gl_api api, const struct gl_config
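As an illustration of the INTEL_UBO_GATHER handling quoted above, here is a minimal sketch of turning a comma-separated stage list into a bitmask, using the reviewer's suggested form without the outermost parentheses. It is a simplified stand-in written for this note: `mock_parse_debug_string()`, `struct mock_debug_control` and the `MOCK_STAGE_*` values are hypothetical and do not reproduce the real driParseDebugString() or Mesa's stage enum.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stage indices; the real values come from Mesa's stage enum. */
enum mock_stage {
   MOCK_STAGE_VERTEX = 0,
   MOCK_STAGE_GEOMETRY = 1,
   MOCK_STAGE_FRAGMENT = 2
};

/* Hypothetical stand-in for struct dri_debug_control. */
struct mock_debug_control {
   const char *name;
   uint64_t flag;
};

static const struct mock_debug_control gather_control[] = {
   { "vs", 1ull << MOCK_STAGE_VERTEX },
   { "gs", 1ull << MOCK_STAGE_GEOMETRY },
   { "fs", 1ull << MOCK_STAGE_FRAGMENT },
   { NULL, 0 }
};

/* OR together the flags of every comma-separated token that names a known
 * stage. Unknown tokens are ignored and a NULL string yields 0.
 */
static uint64_t
mock_parse_debug_string(const char *debug,
                        const struct mock_debug_control *control)
{
   uint64_t flags = 0;

   if (debug == NULL)
      return 0;

   while (*debug != '\0') {
      size_t len = strcspn(debug, ",");   /* length of the current token */

      for (const struct mock_debug_control *c = control; c->name != NULL; c++) {
         if (strlen(c->name) == len && strncmp(debug, c->name, len) == 0)
            flags |= c->flag;
      }
      debug += len;
      if (*debug == ',')
         debug++;                          /* skip the separator */
   }
   return flags;
}
```

With these assumed stage indices, the string "vs,fs" sets bits 0 and 2, so a per-stage test such as `mask & (1ull << MOCK_STAGE_GEOMETRY)` is zero for the geometry stage.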
Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions
On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote:
> On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
> > v2: Get rid of the block parameter and make src a const reference
> >
> > Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
> > Reviewed-by: Matt Turner matts...@gmail.com
> > Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75
> >  1 file changed, 38 insertions(+), 37 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > index 43370cb..9c4ed0b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
> >            operands_match(a, b, negate);
> >  }
> >
> > +static fs_inst *
> > +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)
>
> Did you mean 'src' to be constant reference? It is only used for reading so
> it could be - you claim this in the commit message yourself :)

Oops... I think what happened is that I tried to do it for is_copy_payload,
not create_copy_instr. But then is_copy_payload does actually change it, so I
put it back and somehow my brain leaked it into the commit message.
Unfortunately, it's already pushed so I can't change it now. However, I could
make a fixup if you'd like.
--Jason

> > +{
> > +   int written = inst->regs_written;
> > +   int dst_width = inst->dst.width / 8;
> > +   fs_reg dst = inst->dst;
> > +   fs_inst *copy;
> > +
> > +   if (written > dst_width) {
> > +      fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / dst_width);
> > +      for (int i = 0; i < written / dst_width; i++)
> > +         sources[i] = offset(src, i);
> > +      copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
> > +   } else {
> > +      copy = v->MOV(dst, src);
> > +      copy->force_writemask_all = inst->force_writemask_all;
> > +      copy->src[0].negate = negate;
> > +   }
> > +   assert(copy->regs_written == written);
> > +
> > +   return copy;
> > +}
> > +
> >  bool
> >  fs_visitor::opt_cse_local(bblock_t *block)
> >  {
> > @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
> >              bool no_existing_temp = entry->tmp.file == BAD_FILE;
> >              if (no_existing_temp && !entry->generator->dst.is_null()) {
> >                 int written = entry->generator->regs_written;
> > -               int dst_width = entry->generator->dst.width / 8;
> > -               assert(written % dst_width == 0);
> > -
> > -               fs_reg orig_dst = entry->generator->dst;
> > -               fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
> > -                                   orig_dst.type, orig_dst.width);
> > -               entry->tmp = tmp;
> > -               entry->generator->dst = tmp;
> > -
> > -               fs_inst *copy;
> > -               if (written > dst_width) {
> > -                  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
> > -                  for (int i = 0; i < written / dst_width; i++)
> > -                     sources[i] = offset(tmp, i);
> > -                  copy = LOAD_PAYLOAD(orig_dst, sources, written / dst_width);
> > -               } else {
> > -                  copy = MOV(orig_dst, tmp);
> > -                  copy->force_writemask_all =
> > -                     entry->generator->force_writemask_all;
> > -               }
> > +               assert((written * 8) % entry->generator->dst.width == 0);
> > +
> > +               entry->tmp = fs_reg(GRF, alloc.allocate(written),
> > +                                   entry->generator->dst.type,
> > +                                   entry->generator->dst.width);
> > +
> > +               fs_inst *copy = create_copy_instr(this, entry->generator,
> > +                                                 entry->tmp, false);
> >                 entry->generator->insert_after(block, copy);
> > +
> > +               entry->generator->dst = entry->tmp;
> >              }
> >
> >              /* dest <- temp */
> >              if (!inst->dst.is_null()) {
> > -               int written = inst->regs_written;
> > -               int dst_width = inst->dst.width / 8;
> > -               assert(written == entry->generator->regs_written);
> > -               assert(dst_width == entry->generator->dst.width / 8);
> > +               assert(inst->regs_written == entry->generator->regs_written);
> > +               assert(inst->dst.width == entry->generator->dst.width);
> >                 assert(inst->dst.type == entry->tmp.type);
> > -               fs_reg dst = inst->dst;
> > -               fs_reg tmp = entry->tmp;
> > -               fs_inst *copy;
> > -               if (written > dst_width) {
> > -                  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
> > -                  for (int i = 0; i < written / dst_width; i++)
> > -                     sources[i] = offset(tmp, i);
> > -                  copy = LOAD_PAYLOAD(dst, sources, written / dst_width);
> > -               } else {
> > -                  copy = MOV(dst, tmp);
> > -                  copy->force_writemask_all = inst->force_writemask_all;
> > -
[Mesa-dev] [PATCH 3/7] i965: Move texture swizzle resolving into dispatcher
From: Topi Pohjolainen topi.pohjolai...@intel.com

Reviewed-by: Matt Turner matts...@gmail.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h           | 4 ++--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 20 +++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 16 ++--
 src/mesa/drivers/dri/i965/gen8_surface_state.c    | 16 ++--
 4 files changed, 21 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index d599ba8..9e85dd7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -983,10 +983,10 @@ struct brw_context

    struct
    {
-      void (*update_texture_surface)(struct gl_context *ctx,
+      void (*update_texture_surface)(struct brw_context *brw,
                                      struct intel_mipmap_tree *mt,
                                      struct gl_texture_object *tObj,
-                                     uint32_t tex_format,
+                                     uint32_t tex_format, unsigned swizzle,
                                      uint32_t *surf_offset,
                                      bool for_gather);
       uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 7ed7e18..3dddf89 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -308,14 +308,13 @@ update_buffer_texture_surface(struct gl_context *ctx,
 }

 static void
-brw_update_texture_surface(struct gl_context *ctx,
+brw_update_texture_surface(struct brw_context *brw,
                            struct intel_mipmap_tree *mt,
                            struct gl_texture_object *tObj,
-                           uint32_t tex_format,
+                           uint32_t tex_format, unsigned swizzle /* unused */,
                            uint32_t *surf_offset,
                            bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
    struct intel_texture_object *intelObj = intel_texture_object(tObj);
    uint32_t *surf;

@@ -801,6 +800,17 @@ update_texture_surface(struct gl_context *ctx,
       struct intel_mipmap_tree *mt = intel_obj->mt;
       const struct gl_texture_image *firstImage = obj->Image[0][obj->BaseLevel];
       const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
+
+      /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
+       * texturing functions that return a float, as our code generation always
+       * selects the .x channel (which would always be 0).
+       */
+      const bool alpha_depth = obj->DepthMode == GL_ALPHA &&
+         (firstImage->_BaseFormat == GL_DEPTH_COMPONENT ||
+          firstImage->_BaseFormat == GL_DEPTH_STENCIL);
+      const unsigned swizzle = (unlikely(alpha_depth) ? SWIZZLE_XYZW :
+                                brw_get_texture_swizzle(&brw->ctx, obj));
+
       unsigned format = translate_tex_format(brw, intel_obj->_Format, sampler->sRGBDecode);
       if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
@@ -810,8 +820,8 @@ update_texture_surface(struct gl_context *ctx,
          format = BRW_SURFACEFORMAT_R8_UINT;
       }

-      brw->vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset,
-                                       for_gather);
+      brw->vtbl.update_texture_surface(brw, mt, obj, format, swizzle,
+                                       surf_offset, for_gather);
    }
 }

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 7e3ee67..7576b20 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -348,14 +348,13 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 }

 static void
-gen7_update_texture_surface(struct gl_context *ctx,
+gen7_update_texture_surface(struct brw_context *brw,
                             struct intel_mipmap_tree *mt,
                             struct gl_texture_object *obj,
-                            uint32_t tex_format,
+                            uint32_t tex_format, unsigned swizzle,
                             uint32_t *surf_offset,
                             bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
    struct intel_texture_object *intel_obj = intel_texture_object(obj);
    /* If this is a view with restricted NumLayers, then our effective depth
     * is not just the miptree depth.
@@ -363,17 +362,6 @@ gen7_update_texture_surface(struct gl_context
[Mesa-dev] [PATCH 5/7] i965: Pass texture target as parameter for surface setup
From: Topi Pohjolainen topi.pohjolai...@intel.com Also changed a couple of direct shifts into SET_FIELD(). Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 1 + src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 4 ++-- src/mesa/drivers/dri/i965/gen8_surface_state.c| 4 ++-- 4 files changed, 11 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 0e9ede9..6f08b06 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -986,6 +986,7 @@ struct brw_context void (*update_texture_surface)(struct brw_context *brw, struct intel_mipmap_tree *mt, struct gl_texture_object *tObj, + GLenum target, unsigned min_layer, unsigned max_layer, uint32_t tex_format, unsigned swizzle, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 92383e1..fa4e36d 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -310,7 +310,7 @@ update_buffer_texture_surface(struct gl_context *ctx, static void brw_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, - struct gl_texture_object *tObj, + struct gl_texture_object *tObj, GLenum target, unsigned min_layer /* unused */, unsigned max_layer /* unused */, uint32_t tex_format, unsigned swizzle /* unused */, @@ -352,10 +352,10 @@ brw_update_texture_surface(struct brw_context *brw, } } - surf[0] = (translate_tex_target(tObj-Target) BRW_SURFACE_TYPE_SHIFT | - BRW_SURFACE_MIPMAPLAYOUT_BELOW BRW_SURFACE_MIPLAYOUT_SHIFT | - BRW_SURFACE_CUBEFACE_ENABLES | - tex_format 
BRW_SURFACE_FORMAT_SHIFT); + surf[0] = SET_FIELD(translate_tex_target(target), BRW_SURFACE_TYPE) | + BRW_SURFACE_MIPMAPLAYOUT_BELOW BRW_SURFACE_MIPLAYOUT_SHIFT | + BRW_SURFACE_CUBEFACE_ENABLES | + tex_format BRW_SURFACE_FORMAT_SHIFT; surf[1] = mt-bo-offset64 + mt-offset; /* reloc */ @@ -827,7 +827,7 @@ update_texture_surface(struct gl_context *ctx, format = BRW_SURFACEFORMAT_R8_UINT; } - brw-vtbl.update_texture_surface(brw, mt, obj, + brw-vtbl.update_texture_surface(brw, mt, obj, obj-Target, obj-MinLayer, obj-MinLayer + depth, format, swizzle, surf_offset, for_gather); } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 9755236..89dba40 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -350,7 +350,7 @@ gen7_emit_texture_surface_state(struct brw_context *brw, static void gen7_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, -struct gl_texture_object *obj, +struct gl_texture_object *obj, GLenum target, unsigned min_layer, unsigned max_layer, uint32_t tex_format, unsigned swizzle, @@ -361,7 +361,7 @@ gen7_update_texture_surface(struct brw_context *brw, if (for_gather tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT) tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD; - gen7_emit_texture_surface_state(brw, mt, obj-Target, + gen7_emit_texture_surface_state(brw, mt, target, min_layer, max_layer, obj-MinLevel + obj-BaseLevel, obj-MinLevel + intel_obj-_MaxLevel + 1, diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c index 580c1a3..9858f5f 100644 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c @@ -249,7 +249,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw, static void gen8_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, -struct gl_texture_object *obj, +
[Mesa-dev] [PATCH 6/7] i965: Pass slice details as parameters for surface setup
From: Topi Pohjolainen topi.pohjolai...@intel.com Also changed a couple of direct shifts into SET_FIELD(). Fixes: arb_copy_image-formats -auto -fbo on ILK. In principle, minimum level settings are only for TextureView to use. We, however, also take advantage of that internally when blitting. Before this patch this wasn't taken into account for ILK in the surface setup. v2: - Removed extra whitespace and switched tabs to spaces (Matt) - Added assertion on minimum level (Ken). v3 (Curro): Reorder min_layer and effective_depth Reviewed-by: Matt Turner matts...@gmail.com (v1) Reviewed-by: Kenneth Graunke kenn...@whitecape.org (v1) Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. Pass a half-open interval of levels like emit_texture_surface_state does. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 3 ++- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 31 +++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++- src/mesa/drivers/dri/i965/gen8_surface_state.c| 11 +++- 4 files changed, 30 insertions(+), 25 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 6f08b06..2eb4251 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -985,10 +985,11 @@ struct brw_context { void (*update_texture_surface)(struct brw_context *brw, struct intel_mipmap_tree *mt, - struct gl_texture_object *tObj, GLenum target, unsigned min_layer, unsigned max_layer, + unsigned min_level, + unsigned max_level, uint32_t tex_format, unsigned swizzle, uint32_t *surf_offset, bool for_gather); diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index fa4e36d..de4bdc5 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -310,14 +310,15 @@ 
update_buffer_texture_surface(struct gl_context *ctx, static void brw_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, - struct gl_texture_object *tObj, GLenum target, + GLenum target, unsigned min_layer /* unused */, unsigned max_layer /* unused */, + unsigned min_level, + unsigned max_level, uint32_t tex_format, unsigned swizzle /* unused */, uint32_t *surf_offset, bool for_gather) { - struct intel_texture_object *intelObj = intel_texture_object(tObj); uint32_t *surf; surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, @@ -359,16 +360,16 @@ brw_update_texture_surface(struct brw_context *brw, surf[1] = mt-bo-offset64 + mt-offset; /* reloc */ - surf[2] = ((intelObj-_MaxLevel - tObj-BaseLevel) BRW_SURFACE_LOD_SHIFT | - (mt-logical_width0 - 1) BRW_SURFACE_WIDTH_SHIFT | - (mt-logical_height0 - 1) BRW_SURFACE_HEIGHT_SHIFT); + surf[2] = SET_FIELD(max_level - min_level - 1, BRW_SURFACE_LOD) | + SET_FIELD(mt-logical_width0 - 1, BRW_SURFACE_WIDTH) | + SET_FIELD(mt-logical_height0 - 1, BRW_SURFACE_HEIGHT); - surf[3] = (brw_get_surface_tiling_bits(mt-tiling) | - (mt-logical_depth0 - 1) BRW_SURFACE_DEPTH_SHIFT | - (mt-pitch - 1) BRW_SURFACE_PITCH_SHIFT); + surf[3] = brw_get_surface_tiling_bits(mt-tiling) | + SET_FIELD(mt-logical_depth0 - 1, BRW_SURFACE_DEPTH) | + SET_FIELD(mt-pitch - 1, BRW_SURFACE_PITCH); - surf[4] = (brw_get_surface_num_multisamples(mt-num_samples) | - SET_FIELD(tObj-BaseLevel - mt-first_level, BRW_SURFACE_MIN_LOD)); + surf[4] = brw_get_surface_num_multisamples(mt-num_samples) | + SET_FIELD(min_level - mt-first_level, BRW_SURFACE_MIN_LOD); surf[5] = mt-align_h == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0; @@ -827,8 +828,16 @@ update_texture_surface(struct gl_context *ctx, format = BRW_SURFACEFORMAT_R8_UINT; } - brw-vtbl.update_texture_surface(brw, mt, obj, obj-Target, + /* Minimum level is only supported for TextureView but internally it is + * also taken advantage of by meta blit path. 
The former is only enabled + * from gen7 onwards. + */ + assert(brw-gen = 7 || obj-MinLevel == 0 ||
[Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection
Normally this is always needed, but for internal blits and clears we need
to be able to disable it.

CC: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h     |  3 ++-
 src/mesa/drivers/dri/i965/gen6_wm_state.c | 14 +++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 18449c4..26fdae6 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -339,7 +339,8 @@ gen6_upload_wm_state(struct brw_context *brw,
                      bool multisampled_fbo, int min_inv_per_frag,
                      bool dual_source_blend_enable, bool kill_enable,
                      bool color_buffer_write_enable, bool msaa_enabled,
-                     bool line_stipple_enable, bool polygon_stipple_enable);
+                     bool line_stipple_enable, bool polygon_stipple_enable,
+                     bool statistic_enable);
 
 /* gen6_sf_state.c */
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index e5b0f5a..7081eb7 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -73,7 +73,8 @@ gen6_upload_wm_state(struct brw_context *brw,
                      bool multisampled_fbo, int min_inv_per_frag,
                      bool dual_source_blend_enable, bool kill_enable,
                      bool color_buffer_write_enable, bool msaa_enabled,
-                     bool line_stipple_enable, bool polygon_stipple_enable)
+                     bool line_stipple_enable, bool polygon_stipple_enable,
+                     bool statistic_enable)
 {
    uint32_t dw2, dw4, dw5, dw6, ksp0, ksp2;
 
@@ -109,7 +110,10 @@ gen6_upload_wm_state(struct brw_context *brw,
    }
 
    dw2 = dw4 = dw5 = dw6 = ksp2 = 0;
-   dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+      dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
    dw5 |= GEN6_WM_LINE_AA_WIDTH_1_0;
    dw5 |= GEN6_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
@@ -300,6 +304,9 @@ upload_wm_state(struct brw_context *brw)
       ctx->Multisample.SampleAlphaToCoverage ||
       prog_data->uses_omask;
 
+   /* Rendering against the gl-context is always taken into account. */
+   const bool statistic_enable = true;
+
    /* _NEW_LINE | _NEW_POLYGON | _NEW_BUFFERS | _NEW_COLOR |
     * _NEW_MULTISAMPLE
     */
@@ -308,7 +315,8 @@ upload_wm_state(struct brw_context *brw)
                         dual_src_blend_enable, kill_enable,
                         brw_color_buffer_write_enabled(brw),
                         ctx->Multisample.Enabled,
-                        ctx->Line.StippleFlag, ctx->Polygon.StippleFlag);
+                        ctx->Line.StippleFlag, ctx->Polygon.StippleFlag,
+                        statistic_enable);
 }
 
 const struct brw_tracked_state gen6_wm_state = {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
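The effect of the new parameter can be sketched in isolation. This is a minimal illustration rather than the driver code: the bit position used for SKETCH_WM_STATISTICS_ENABLE is an assumption made for the example, and sketch_wm_dw4() is a hypothetical helper that only models the one dword the patch touches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position, assumed for this sketch only; the real
 * definition of GEN6_WM_STATISTICS_ENABLE lives in brw_defines.h. */
#define SKETCH_WM_STATISTICS_ENABLE (1u << 31)

/* Build the statistics-related part of the WM dword.  Regular GL draws
 * would pass true; an internal blit or clear path would pass false. */
static uint32_t
sketch_wm_dw4(bool statistic_enable)
{
   uint32_t dw4 = 0;

   if (statistic_enable)
      dw4 |= SKETCH_WM_STATISTICS_ENABLE;

   return dw4;
}
```

With this shape, upload_wm_state() keeps its old behaviour by always passing true, while a caller that must not disturb the application's statistics can opt out.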
Re: [Mesa-dev] i965: Revision of texture surface setup refactoring
Pohjolainen, Topi topi.pohjolai...@intel.com writes:

> On Wed, May 06, 2015 at 02:56:53PM +0300, Francisco Jerez wrote:
>> Hi!
>>
>> Topi Pohjolainen topi.pohjolai...@intel.com writes:
>>
>>> This series moves all the decision making of values into common hardware
>>> independent dispatcher while leaving the hardware specific logic to deal
>>> with formatting only.
>>>
>>> Curro needed a similar refactor for gen7 and gen8. However, that makes it
>>> harder to apply the changes I needed that expand all the way to gen4. Ken
>>> helped me to notice that my refactoring can in fact address both
>>> relatively easily. For context, I added the patch from Curro that makes
>>> use of the texture surface setup logic along with a small patch making it
>>> compatible with the surface state refactoring found here.
>>>
>>> Curro, what do you think? I'm not too happy with reverting your work but
>>> overall this way it becomes cleaner, I think.
>>
>> *Shrug*, it seems weird to me that you opted to revert my patches even
>> though they are closer to where you want to get at than it was before my
>> patches.
>>
>> This is the current interface:
>>
>>    void (*emit_texture_surface_state)(struct brw_context *brw,
>>                                       struct intel_mipmap_tree *mt,
>>                                       GLenum target,
>>                                       unsigned min_layer,
>>                                       unsigned max_layer,
>>                                       unsigned min_level,
>>                                       unsigned max_level,
>>                                       unsigned format,
>>                                       unsigned swizzle,
>>                                       uint32_t *surf_offset,
>>                                       bool rw, bool for_gather);
>>
>> This is the old interface we both wanted to get rid of:
>>
>>    void (*update_texture_surface)(struct gl_context *ctx,
>>                                   unsigned unit,
>>                                   uint32_t *surf_offset,
>>                                   bool for_gather);
>>
>> This is the interface introduced by this series:
>>
>>    void (*update_texture_surface)(struct brw_context *brw,
>>                                   const struct intel_mipmap_tree *mt,
>>                                   GLenum target,
>>                                   uint32_t effective_depth,
>>                                   uint32_t min_layer,
>>                                   uint32_t min_lod,
>>                                   uint32_t mip_count,
>>                                   uint32_t tex_format,
>>                                   int swizzle,
>>                                   uint32_t *surf_offset,
>>                                   bool for_gather);
>>
>> AFAIK the only difference between your proposal and mine is the name (IMHO
>> emit_texture_surface_state is more consistent with the other
>> emit_*_surface_state hooks with similar semantics), the ordering of
>> arguments (and I find the ordering and naming of your effective_depth,
>> min_layer, min_lod and mip_count arguments rather asymmetric, they are both
>> pairs determining an interval of either layers or levels, it doesn't make
>> much sense to me that they are named and ordered inconsistently in your
>> series), the fact that you're using a min level/layer index + count instead
>> of half-open intervals like I did, and the fact that you're missing an rw
>> argument which is required for ARB_shader_image_load_store support. I fail
>> to see why a revert is justified or desirable, and I fail to see how your
>> proposal will work better on Gen4, since the difference between the two
>> interfaces is mostly cosmetic.
>
> I'm just looking at the end result. Here we don't need to introduce a new
> entry to the jump table, the changes are kept to the minimum and we both
> get an applicable interface. I didn't really intentionally choose between
> the interfaces - this was the outcome of trying to keep it as unintrusive
> as I could.
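The two conventions compared above (an absolute half-open range versus a start index plus count) can be related with a small conversion. The sketch below is only an illustration: struct sketch_lod_fields and sketch_levels_to_fields() are hypothetical names, and the arithmetic follows the form used in patch 6/7 of the series (max_level - min_level - 1 for the LOD field, min_level - mt->first_level for the minimum LOD).

```c
#include <assert.h>

/* Hypothetical container for the two derived values; not a Mesa type. */
struct sketch_lod_fields {
   unsigned mip_count_minus_one; /* value a "levels minus one" field expects */
   unsigned min_lod;             /* first level, relative to the miptree */
};

/* Convert an absolute half-open level range [min_level, max_level) into
 * the relative encoding.  first_level is the miptree's first level. */
static struct sketch_lod_fields
sketch_levels_to_fields(unsigned min_level, unsigned max_level,
                        unsigned first_level)
{
   struct sketch_lod_fields f;

   assert(min_level < max_level);    /* the range must be non-empty */
   assert(min_level >= first_level); /* the range must start inside the miptree */

   f.mip_count_minus_one = max_level - min_level - 1;
   f.min_lod = min_level - first_level;
   return f;
}
```

Keeping the absolute half-open form at the interface boundary means every caller passes the same kind of value, and the subtraction of the miptree offset happens in exactly one place.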
I've rebased your series on top of master. In fact the rebased version is a
lot less churn, two of your patches (PATCH 3 and 5) that were re-applying
changes you had previously reverted become empty, and the diffstat goes down
from +131/-195 to +89/-143. There were a number of subtle differences between
the two interfaces that weren't obvious at all by looking at the end result,
and I only noticed while looking at the actual diff between master (without
reverts) and your branch, namely:

 - Your mip_count argument expects the number of mipmap levels minus one,
   instead of the actual number of mipmap levels (we already discussed this
   earlier today to some extent).

 - Your min_lod argument isn't the absolute starting mipmap level, instead
   it's relative to mt->first_level. This could have bitten us in the future
   if some caller forgets to take this offset into account.

 - gen7_emit_texture_surface_state wasn't taking into account the
   work-around for the R32G32_FLOAT format in the texture
[Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null
opt_sampler_eot enables a direct write to the framebuffer from a sample. In
order to do this the sample message needs to have a message header, so if
there wasn't one already the function adds one. In addition, the function
sets the destination register to null because it's no longer used. However,
it was only doing this in cases where it was adding a message header.

This patch just moves setting the destination so that it happens even if
there's already a message header. In practice this doesn't seem to make any
difference but it's a bit cleaner.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1ca7ca6..72d408b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()
 
    tex_inst->offset |= fb_write->target << 24;
    tex_inst->eot = true;
+   tex_inst->dst = reg_null_ud;
    fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);
 
    /* If a header is present, marking the eot is sufficient. Otherwise, we need
@@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
    tex_inst->header_present = true;
    tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
    tex_inst->src[0] = send_header;
-   tex_inst->dst = reg_null_ud;
 
    return true;
 }
-- 
1.9.3
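The ordering change can be modelled with a toy example. This is a simplified sketch, not the compiler pass: struct sketch_inst is a hypothetical stand-in for the real instruction type, and it assumes (as the commit message suggests) that the function returns early when a header already exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a texture instruction; the fields are simplified and
 * are not the real fs_inst layout. */
struct sketch_inst {
   bool eot;
   bool header_present;
   bool dst_is_null;
};

/* Mark the instruction as end-of-thread.  The destination is nulled up
 * front, so both the early-return path and the add-a-header path see it. */
static void
sketch_mark_eot(struct sketch_inst *inst)
{
   inst->eot = true;
   inst->dst_is_null = true;

   if (inst->header_present)
      return; /* header already there, marking eot is sufficient */

   inst->header_present = true; /* stands in for building the header */
}
```

Before the patch, the equivalent of the dst_is_null assignment sat after the early return, so only instructions that needed a new header had their destination cleared.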
Re: [Mesa-dev] [PATCH 01/27] i965: Define HW-binding table and resource streamer control opcodes
On Sun, May 03, 2015 at 06:04:05PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:07:58PM +0300, Abdiel Janulgue wrote:
> > Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h |  1 +
> >  src/mesa/drivers/dri/i965/brw_defines.h | 24
> >  src/mesa/drivers/dri/i965/intel_reg.h   |  3 +++
> >  3 files changed, 28 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index a6d6787..07626af 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1105,6 +1105,7 @@ struct brw_context
> >     bool no_simd8;
> >     bool use_rep_send;
> >     bool scalar_vs;
> > +   bool has_resource_streamer;
>
> This should go to the next patch. Other than that all looks good - I checked
> the values against bspec and I couldn't find anything amiss.
>
> Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
>
> >
> >     /**
> >      * Some versions of Gen hardware don't do centroid interpolation correctly
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> > index a97a944..da288d3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1586,6 +1586,30 @@ enum brw_message_target {
> >  #define _3DSTATE_BINDING_TABLE_POINTERS_GS 0x7829 /* GEN7+ */
> >  #define _3DSTATE_BINDING_TABLE_POINTERS_PS 0x782A /* GEN7+ */
> >
> > +#define _3DSTATE_BINDING_TABLE_POOL_ALLOC 0x7919 /* GEN7.5+ */
> > +#define BRW_HW_BINDING_TABLE_ENABLE_SHIFT 11 /* GEN7.5+ */
> > +#define BRW_HW_BINDING_TABLE_ENABLE_MASK INTEL_MASK(11, 11)

Actually we usually do the booleans just as:

#define BRW_HW_BINDING_TABLE_ENABLE (1 << 11)

> > +#define BRW_HW_BINDING_TABLE_ON 1
> > +#define BRW_HW_BINDING_TABLE_OFF 0
> > +#define GEN7_HW_BT_MOCS_SHIFT 7
> > +#define GEN7_HW_BT_MOCS_MASK INTEL_MASK(10, 7)
> > +#define GEN8_HW_BT_MOCS_SHIFT 0
> > +#define GEN8_HW_BT_MOCS_MASK INTEL_MASK(6, 0)
> > +/* Only required in HSW */
> > +#define HSW_HW_BINDING_TABLE_RESERVED (3 << 5)
> > +
> > +#define _3DSTATE_BINDING_TABLE_EDIT_VS 0x7843 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_GS 0x7844 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_HS 0x7845 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_DS 0x7846 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_PS 0x7847 /* GEN7.5 */
> > +#define BRW_BINDING_TABLE_INDEX_SHIFT 16
> > +#define BRW_BINDING_TABLE_INDEX_MASK INTEL_MASK(23, 16)
> > +
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_ALL 3
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE1 2
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE0 1
> > +
> >  #define _3DSTATE_SAMPLER_STATE_POINTERS 0x7802 /* GEN6+ */
> >  # define PS_SAMPLER_STATE_CHANGE (1 << 12)
> >  # define GS_SAMPLER_STATE_CHANGE (1 << 9)
> > diff --git a/src/mesa/drivers/dri/i965/intel_reg.h b/src/mesa/drivers/dri/i965/intel_reg.h
> > index 488fb5b..9cdb3ca 100644
> > --- a/src/mesa/drivers/dri/i965/intel_reg.h
> > +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> > @@ -47,6 +47,9 @@
> >  /* Load a value from memory into a register. Only available on Gen7+. */
> >  #define GEN7_MI_LOAD_REGISTER_MEM (CMD_MI | (0x29 << 23))
> >  # define MI_LOAD_REGISTER_MEM_USE_GGTT (1 << 22)
> > +/* Haswell RS control */
> > +#define MI_RS_CONTROL (CMD_MI | (0x6 << 23))
> > +#define MI_RS_STORE_DATA_IMM (CMD_MI | (0x2b << 23))
> >
> >  /** @{
> >   *
> > --
> > 1.9.1
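The review comment contrasts two ways of defining register bits: a SHIFT/MASK pair for multi-bit fields and a plain single-bit define for booleans. The sketch below illustrates the difference under stated assumptions: SKETCH_MASK and SKETCH_SET_FIELD are local stand-ins written for this example (Mesa has helpers with similar names, whose exact definitions are not shown in this thread), and sketch_pool_alloc_dword() is a hypothetical function, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for illustration only.  SKETCH_MASK builds a mask that
 * covers bits high..low inclusive (valid for fields narrower than 32 bits). */
#define SKETCH_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))
#define SKETCH_SET_FIELD(value, field) \
   (((uint32_t)(value) << field ## _SHIFT) & field ## _MASK)

/* Multi-bit field: a SHIFT/MASK pair (bits 10:7, as in the quoted patch). */
#define SKETCH_MOCS_SHIFT 7
#define SKETCH_MOCS_MASK  SKETCH_MASK(10, 7)

/* Single-bit boolean, in the form suggested by the review (bit 11). */
#define SKETCH_HW_BINDING_TABLE_ENABLE (1u << 11)

/* Pack a MOCS value and the enable flag into one dword. */
static uint32_t
sketch_pool_alloc_dword(unsigned mocs, int enable)
{
   uint32_t dw = SKETCH_SET_FIELD(mocs, SKETCH_MOCS);

   if (enable)
      dw |= SKETCH_HW_BINDING_TABLE_ENABLE;

   return dw;
}
```

A one-bit mask built with the pair style is numerically the same as the single-bit define, so the suggestion is about brevity at the call site rather than a change in the emitted value.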
[Mesa-dev] [PATCH 7/7] i965: Drop the update_texture_surface vtbl hook.
At this point the update_texture_surface and emit_texture_surface_state hooks are almost equivalent, the only significant difference is that emit_texture_surface_state supports binding read-write surfaces. The name of the latter is more consistent with the other emit_something_surface_state hooks, so let's keep it. --- src/mesa/drivers/dri/i965/brw_context.h | 10 -- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 37 --- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 24 ++- src/mesa/drivers/dri/i965/gen8_surface_state.c| 18 --- 4 files changed, 23 insertions(+), 66 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 2eb4251..780edba 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -983,16 +983,6 @@ struct brw_context struct { - void (*update_texture_surface)(struct brw_context *brw, - struct intel_mipmap_tree *mt, - GLenum target, - unsigned min_layer, - unsigned max_layer, - unsigned min_level, - unsigned max_level, - uint32_t tex_format, unsigned swizzle, - uint32_t *surf_offset, - bool for_gather); uint32_t (*update_renderbuffer_surface)(struct brw_context *brw, struct gl_renderbuffer *rb, bool layered, unsigned unit, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index de4bdc5..870d699 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -308,16 +308,17 @@ update_buffer_texture_surface(struct gl_context *ctx, } static void -brw_update_texture_surface(struct brw_context *brw, - struct intel_mipmap_tree *mt, - GLenum target, - unsigned min_layer /* unused */, - unsigned max_layer /* unused */, - unsigned min_level, - unsigned max_level, - uint32_t tex_format, unsigned swizzle /* unused */, - uint32_t *surf_offset, - bool for_gather) +gen4_emit_texture_surface_state(struct brw_context *brw, +struct 
intel_mipmap_tree *mt, +GLenum target, +unsigned min_layer /* unused */, +unsigned max_layer /* unused */, +unsigned min_level, +unsigned max_level, +unsigned tex_format, +unsigned swizzle /* unused */, +uint32_t *surf_offset, +bool rw, bool for_gather) { uint32_t *surf; @@ -378,7 +379,8 @@ brw_update_texture_surface(struct brw_context *brw, *surf_offset + 4, mt-bo, surf[1] - mt-bo-offset64, - I915_GEM_DOMAIN_SAMPLER, 0); + I915_GEM_DOMAIN_SAMPLER, + (rw ? I915_GEM_DOMAIN_SAMPLER : 0)); } /** @@ -834,11 +836,12 @@ update_texture_surface(struct gl_context *ctx, */ assert(brw-gen = 7 || obj-MinLevel == 0 || brw-meta_in_progress); - brw-vtbl.update_texture_surface(brw, mt, obj-Target, - obj-MinLayer, obj-MinLayer + depth, - obj-MinLevel + obj-BaseLevel, - obj-MinLevel + intel_obj-_MaxLevel + 1, - format, swizzle, surf_offset, for_gather); + brw-vtbl.emit_texture_surface_state( + brw, mt, obj-Target, + obj-MinLayer, obj-MinLayer + depth, + obj-MinLevel + obj-BaseLevel, + obj-MinLevel + intel_obj-_MaxLevel + 1, + format, swizzle, surf_offset, false, for_gather); } } @@ -1071,8 +1074,8 @@ const struct brw_tracked_state brw_cs_abo_surfaces = { void gen4_init_vtable_surface_functions(struct brw_context *brw) { - brw-vtbl.update_texture_surface = brw_update_texture_surface; brw-vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface; brw-vtbl.emit_null_surface_state = brw_emit_null_surface_state; + brw-vtbl.emit_texture_surface_state = gen4_emit_texture_surface_state;
[Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location
From: Topi Pohjolainen topi.pohjolai...@intel.com All generations do the same exact dispatch and it could be therefore done in the hardware independent stage. Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 3 - src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 31 ++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++ src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++ 4 files changed, 83 insertions(+), 87 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 2fcdcfa..a6282f4 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw, uint32_t size, uint32_t *out_offset, bool dword_pitch); -void brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset); void brw_update_sol_surface(struct brw_context *brw, struct gl_buffer_object *buffer_obj, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 160dd2f..2b8040c 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw, } } -void -brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset) +static void +update_buffer_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset) { struct brw_context *brw = brw_context(ctx); struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current; @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx, struct gl_sampler_object *sampler = 
_mesa_get_samplerobj(ctx, unit); uint32_t *surf; - /* BRW_NEW_TEXTURE_BUFFER */ - if (tObj-Target == GL_TEXTURE_BUFFER) { - brw_update_buffer_texture_surface(ctx, unit, surf_offset); - return; - } - surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, surf_offset); @@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = { .emit = update_renderbuffer_surfaces, }; +static void +update_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset, + bool for_gather) +{ + struct brw_context *brw = brw_context(ctx); + struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; + + if (obj-Target == GL_TEXTURE_BUFFER) { + update_buffer_texture_surface(ctx, unit, surf_offset); + } else { + brw-vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather); + } +} static void update_stage_texture_surfaces(struct brw_context *brw, @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw, /* _NEW_TEXTURE */ if (ctx-Texture.Unit[unit]._Current) { -brw-vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather); +update_texture_surface(ctx, unit, surf_offset + s, for_gather); } } } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 15ab2b0..098b5c8 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx, struct brw_context *brw = brw_context(ctx); struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; - if (obj-Target == GL_TEXTURE_BUFFER) { - brw_update_buffer_texture_surface(ctx, unit, surf_offset); - - } else { - struct intel_texture_object *intel_obj = intel_texture_object(obj); - struct intel_mipmap_tree *mt = intel_obj-mt; - struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); - /* If this is a view with restricted NumLayers, then our effective depth - * is 
not just the miptree depth. - */ - const unsigned depth = (obj-Immutable obj-Target != GL_TEXTURE_3D ? - obj-NumLayers : mt-logical_depth0); - - /* Handling GL_ALPHA as a surface format override breaks 1.30+ style - * texturing functions that return a float, as
[Mesa-dev] [PATCH 4/7] i965: Refactor effective depth calculation
From: Topi Pohjolainen topi.pohjolai...@intel.com Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. Pass a half-open interval of layers like emit_texture_surface_state does. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++--- src/mesa/drivers/dri/i965/gen8_surface_state.c| 9 +++-- 4 files changed, 18 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 9e85dd7..0e9ede9 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -986,6 +986,8 @@ struct brw_context void (*update_texture_surface)(struct brw_context *brw, struct intel_mipmap_tree *mt, struct gl_texture_object *tObj, + unsigned min_layer, + unsigned max_layer, uint32_t tex_format, unsigned swizzle, uint32_t *surf_offset, bool for_gather); diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 3dddf89..92383e1 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -311,6 +311,8 @@ static void brw_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, struct gl_texture_object *tObj, + unsigned min_layer /* unused */, + unsigned max_layer /* unused */, uint32_t tex_format, unsigned swizzle /* unused */, uint32_t *surf_offset, bool for_gather) @@ -800,6 +802,11 @@ update_texture_surface(struct gl_context *ctx, struct intel_mipmap_tree *mt = intel_obj-mt; const struct gl_texture_image *firstImage = obj-Image[0][obj-BaseLevel]; const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); + /* If this is a view with 
restricted NumLayers, then our effective depth + * is not just the miptree depth. + */ + const unsigned depth = (obj-Immutable obj-Target != GL_TEXTURE_3D ? + obj-NumLayers : mt-logical_depth0); /* Handling GL_ALPHA as a surface format override breaks 1.30+ style * texturing functions that return a float, as our code generation always @@ -820,8 +827,9 @@ update_texture_surface(struct gl_context *ctx, format = BRW_SURFACEFORMAT_R8_UINT; } - brw-vtbl.update_texture_surface(brw, mt, obj, format, swizzle, - surf_offset, for_gather); + brw-vtbl.update_texture_surface(brw, mt, obj, + obj-MinLayer, obj-MinLayer + depth, + format, swizzle, surf_offset, for_gather); } } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 7576b20..9755236 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -351,22 +351,18 @@ static void gen7_update_texture_surface(struct brw_context *brw, struct intel_mipmap_tree *mt, struct gl_texture_object *obj, +unsigned min_layer, +unsigned max_layer, uint32_t tex_format, unsigned swizzle, uint32_t *surf_offset, bool for_gather) { struct intel_texture_object *intel_obj = intel_texture_object(obj); - /* If this is a view with restricted NumLayers, then our effective depth -* is not just the miptree depth. -*/ - const unsigned depth = (obj-Immutable obj-Target != GL_TEXTURE_3D ? - obj-NumLayers : mt-logical_depth0); - if (for_gather tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT) tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD; gen7_emit_texture_surface_state(brw, mt, obj-Target, - obj-MinLayer, obj-MinLayer + depth, + min_layer, max_layer, obj-MinLevel + obj-BaseLevel, obj-MinLevel + intel_obj-_MaxLevel + 1, tex_format, swizzle, diff
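A minimal sketch of the effective-depth rule that this patch hoists into the common dispatcher, assuming the garbled condition in the hunk originally read `obj->Immutable && obj->Target != GL_TEXTURE_3D`. The struct and function names below are hypothetical stand-ins, not Mesa's real types; the point is only that the caller computes a half-open layer interval [min_layer, max_layer) once and passes it down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the texture object fields the patch reads. */
struct sketch_tex_view {
   bool immutable;       /* models obj->Immutable */
   bool is_3d;           /* models obj->Target == GL_TEXTURE_3D */
   unsigned min_layer;   /* models obj->MinLayer */
   unsigned num_layers;  /* models obj->NumLayers */
};

/* Half-open interval of layers: min_layer is included, max_layer is not. */
struct sketch_layer_range {
   unsigned min_layer;
   unsigned max_layer;
};

/* For a view with restricted NumLayers the effective depth is the view's
 * layer count; otherwise it is the miptree depth (logical_depth0). */
static struct sketch_layer_range
sketch_effective_layers(const struct sketch_tex_view *view,
                        unsigned logical_depth0)
{
   const unsigned depth = (view->immutable && !view->is_3d) ?
                          view->num_layers : logical_depth0;
   const struct sketch_layer_range range = {
      view->min_layer, view->min_layer + depth
   };

   assert(range.max_layer >= range.min_layer);
   return range;
}
```

With a half-open interval the layer count is simply max_layer - min_layer, which is presumably why the rebase note mentions matching emit_texture_surface_state.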
[Mesa-dev] [PATCH 2/7] i965: Move tex miptree and format resolving into dispatcher
From: Topi Pohjolainen topi.pohjolai...@intel.com All hardware platforms have this in common, so do it in the hardware independent dispatcher. v2 (Matt): Removed extra whitespace. Reviewed-by: Matt Turner matts...@gmail.com (v1) Reviewed-by: Kenneth Graunke kenn...@whitecape.org (v1) Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 4 +++- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 26 --- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 17 ++- src/mesa/drivers/dri/i965/gen8_surface_state.c| 17 --- 4 files changed, 31 insertions(+), 33 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index a6282f4..d599ba8 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -984,7 +984,9 @@ struct brw_context struct { void (*update_texture_surface)(struct gl_context *ctx, - unsigned unit, + struct intel_mipmap_tree *mt, + struct gl_texture_object *tObj, + uint32_t tex_format, uint32_t *surf_offset, bool for_gather); uint32_t (*update_renderbuffer_surface)(struct brw_context *brw, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 2b8040c..7ed7e18 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -309,23 +309,19 @@ update_buffer_texture_surface(struct gl_context *ctx, static void brw_update_texture_surface(struct gl_context *ctx, - unsigned unit, + struct intel_mipmap_tree *mt, + struct gl_texture_object *tObj, + uint32_t tex_format, uint32_t *surf_offset, bool for_gather) { struct brw_context *brw = brw_context(ctx); - struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current; struct intel_texture_object *intelObj = intel_texture_object(tObj); - struct intel_mipmap_tree *mt = 
intelObj-mt; - struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); uint32_t *surf; surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, surf_offset); - uint32_t tex_format = translate_tex_format(brw, mt-format, - sampler-sRGBDecode); - if (for_gather) { /* Sandybridge's gather4 message is broken for integer formats. * To work around this, we pretend the surface is UNORM for @@ -801,7 +797,21 @@ update_texture_surface(struct gl_context *ctx, if (obj-Target == GL_TEXTURE_BUFFER) { update_buffer_texture_surface(ctx, unit, surf_offset); } else { - brw-vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather); + struct intel_texture_object *intel_obj = intel_texture_object(obj); + struct intel_mipmap_tree *mt = intel_obj-mt; + const struct gl_texture_image *firstImage = obj-Image[0][obj-BaseLevel]; + const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); + unsigned format = translate_tex_format(brw, intel_obj-_Format, + sampler-sRGBDecode); + if (obj-StencilSampling firstImage-_BaseFormat == GL_DEPTH_STENCIL) { + assert(brw-gen = 8); + mt = mt-stencil_mt; + assert(mt-format == MESA_FORMAT_S_UINT8); + format = BRW_SURFACEFORMAT_R8_UINT; + } + + brw-vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset, + for_gather); } } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 098b5c8..7e3ee67 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -349,16 +349,14 @@ gen7_emit_texture_surface_state(struct brw_context *brw, static void gen7_update_texture_surface(struct gl_context *ctx, -unsigned unit, +struct intel_mipmap_tree *mt, +struct gl_texture_object *obj, +uint32_t tex_format, uint32_t *surf_offset, bool for_gather) { struct brw_context *brw = brw_context(ctx); - struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; - struct intel_texture_object *intel_obj = 
intel_texture_object(obj); - struct
Re: [Mesa-dev] [PATCH 01/13] nir/validate: Validate SSA def parent instructiosn
I can't seem to find the cover email, so I'll respond to this one. Aside from my comments on patches 11 and 13, patches 1-5 and 11-13 are

Reviewed-by: Connor Abbott cwabbo...@gmail.com

and FWIW 6-10 are

Acked-by: Connor Abbott cwabbo...@gmail.com

although what's important there is that other people test those and make sure they don't break other things (particularly Windows).

On Tue, May 5, 2015 at 8:16 PM, Connor Abbott cwabbo...@gmail.com wrote:
> Typo in the subject line.
>
> On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
>> ---
>>  src/glsl/nir/nir_validate.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
>> index a7aa798..35a853d 100644
>> --- a/src/glsl/nir/nir_validate.c
>> +++ b/src/glsl/nir/nir_validate.c
>> @@ -236,6 +236,8 @@ validate_ssa_def(nir_ssa_def *def, validate_state *state)
>>     assert(!BITSET_TEST(state->ssa_defs_found, def->index));
>>     BITSET_SET(state->ssa_defs_found, def->index);
>>
>> +   assert(def->parent_instr == state->instr);
>> +
>>     assert(def->num_components <= 4);
>>
>>     ssa_def_validate_state *def_state = ralloc(state->ssa_defs,
>> --
>> 2.3.6

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp
On 05/07/2015 07:30 AM, Jason Ekstrand wrote:
> On Wed, May 6, 2015 at 7:29 PM, Matt Turner matts...@gmail.com wrote:
>> On Wed, May 6, 2015 at 7:09 PM, Ian Romanick i...@freedesktop.org wrote:
>>> From: Ian Romanick ian.d.roman...@intel.com
>>>
>>> It's a weird thing that provides some values related to 2**x. It's also already handled by a case in the switch.
>>>
>>> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
>>
>> The series is
>>
>> Reviewed-by: Matt Turner matts...@gmail.com
>
> I was going to complain about you making my SPIR-V -> NIR translator harder to write. But, based on the discussion by Ken and Ilia on IRC, it looks like basically no one's hardware does a base-e log. I'll just lower on-the-fly. I guess maybe we could do it with pow(x, e) but meh. If you'd like, the series is

Right. We currently unconditionally lower exp(x) to exp2(x * M_LOG2E) in the GLSL IR lowering code. I believe we picked that lowering because some older architectures lack a pow instruction. It may be worth trying the other way to see if we get better code.

> Acked-by: Jason Ekstrand jason.ekstr...@intel.com
>
> I can't say I read it enough to call it a review but I glanced through it and it seems ok.
> --Jason

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd
On Thu, 2015-05-07 at 21:52 +0200, EdB wrote:
> On 2015-05-07 18:55, Aaron Watry wrote:
>> I'm not sure what the final consensus will be on how to do this, but FWIW:
>>
>> Tested-By: Aaron Watry awa...@gmail.com
>>
>> I've tested this with 4 combinations:
>> no --with-opencl-icd option specified : libOpenCL.so gets installed in ${prefix}/lib
>> --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
>> --with-opencl-icd=standard : libMesaOpenCL.so installed in ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
>> --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.
>>
>> I only specified --prefix, no other directories overridden in configure command.

shouldn't this part go to ${prefix}/etc/OpenCL/vendors? Is it just a typo or did it install to ${prefix}/etc//?

jan

> thanks
> EdB
>
>> --Aaron
>>
>> On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:
>>> The standard ICD file path is /etc/OpenCL/vendors/. However, it doesn't fit well with custom builds. This option allows overriding the ICD vendor file installation path.
>>> ---
>>>  configure.ac | 46 +++---
>>>  src/gallium/targets/opencl/Makefile.am | 2 +-
>>>  2 files changed, 33 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/configure.ac b/configure.ac
>>> index 095e23e..90dba4e 100644
>>> --- a/configure.ac
>>> +++ b/configure.ac
>>> @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
>>>          [enable OpenCL library @:@default=disabled@:@])],
>>>     [enable_opencl=$enableval],
>>>     [enable_opencl=no])
>>> -AC_ARG_ENABLE([opencl_icd],
>>> -   [AS_HELP_STRING([--enable-opencl-icd],
>>> -          [Build an OpenCL ICD library to be loaded by an ICD implementation
>>> -           @:@default=disabled@:@])],
>>> -    [enable_opencl_icd=$enableval],
>>> -    [enable_opencl_icd=no])
>>>  AC_ARG_ENABLE([xlib-glx],
>>>      [AS_HELP_STRING([--enable-xlib-glx],
>>>          [make GLX library Xlib-based instead of DRI-based @:@default=disabled@:@])],
>>> @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
>>>      # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
>>>      enable_gallium_loader=yes
>>>
>>> -    if test x$enable_opencl_icd = xyes; then
>>> -        OPENCL_LIBNAME=MesaOpenCL
>>> -    else
>>> -        OPENCL_LIBNAME=OpenCL
>>> -    fi
>>> -
>>>      if test x$have_libelf != xyes; then
>>>          AC_MSG_ERROR([Clover requires libelf])
>>>      fi
>>>  fi
>>>  AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
>>> -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
>>> -AC_SUBST([OPENCL_LIBNAME])
>>>
>>>  dnl
>>>  dnl Gallium configuration
>>> @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
>>>      [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
>>>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
>>>
>>> +dnl OpenCL ICD
>>> +
>>> +AC_ARG_WITH([opencl-icd],
>>> +    [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
>>> +        [Build an OpenCL ICD library to be loaded by an ICD implementation.
>>> +         If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors.
>>> +         @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors
>>> +         @:@default=no@:@])],
>>> +    [OPENCL_ICD=$withval],
>>> +    [OPENCL_ICD=no])
>>> +
>>> +case x$OPENCL_ICD in
>>> +xno)
>>> +    OPENCL_LIBNAME=OpenCL
>>> +    ;;
>>> +xstandard)
>>> +    OPENCL_LIBNAME=MesaOpenCL
>>> +    ICD_FILE_DIR=/etc/OpenCL/vendors
>>> +    ;;
>>> +xsysconfdir)
>>> +    OPENCL_LIBNAME=MesaOpenCL
>>> +    ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
>>> +    ;;
>>> +*)
>>> +    AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd])
>>> +    ;;
>>> +esac
>>> +
>>> +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
>>> +AC_SUBST([OPENCL_LIBNAME])
>>> +AC_SUBST([ICD_FILE_DIR])
>>> +
>>>  dnl
>>>  dnl Gallium helper functions
>>>  dnl
>>> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
>>> index 5daf327..781daa0 100644
>>> --- a/src/gallium/targets/opencl/Makefile.am
>>> +++ b/src/gallium/targets/opencl/Makefile.am
>>> @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym
>>>
>>>  EXTRA_DIST = mesa.icd opencl.sym
>>>
>>>  if HAVE_CLOVER_ICD
>>> -icddir = /etc/OpenCL/vendors/
>>> +icddir = $(ICD_FILE_DIR)
>>>  icd_DATA = mesa.icd
>>>  endif
>>> --
>>> 2.1.0

--
Jan Vesely jan.ves...@rutgers.edu

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it
On 05/07/2015 05:17 AM, Pohjolainen, Topi wrote: On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote: I guess this should not really be able to segfault but still it seems to be able to during context creation. Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- src/mesa/main/errors.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c index 2aa1deb..6631b82 100644 --- a/src/mesa/main/errors.c +++ b/src/mesa/main/errors.c @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... ) To me it looks that it would be better to just leave early already here: if (!ctx) return; Avoids extra indentation and it doesn't look meaningful to call should_output() with null context. I like that plan. I don't think you can even get to _mesa_error (or _mesa_warning) without a context. Maybe add an assert(ctx != NULL)? do_output = should_output(ctx, error, fmtString); - mtx_lock(ctx-DebugMutex); - if (ctx-Debug) { - do_log = debug_is_message_enabled(ctx-Debug, -MESA_DEBUG_SOURCE_API, -MESA_DEBUG_TYPE_ERROR, -error_msg_id, -MESA_DEBUG_SEVERITY_HIGH); + if (ctx) { + mtx_lock(ctx-DebugMutex); + if (ctx-Debug) { + do_log = debug_is_message_enabled(ctx-Debug, + MESA_DEBUG_SOURCE_API, + MESA_DEBUG_TYPE_ERROR, + error_msg_id, + MESA_DEBUG_SEVERITY_HIGH); + } + else { + do_log = GL_FALSE; + } + mtx_unlock(ctx-DebugMutex); } else { do_log = GL_FALSE; } - mtx_unlock(ctx-DebugMutex); if (do_output || do_log) { char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH]; @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... ) } /* Log the error via ARB_debug_output if needed.*/ - if (do_log) { + if (ctx do_log) { log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR, error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2); } } /* Set the GL context error state for glGetError. 
*/ - _mesa_record_error(ctx, error); + if (ctx) + _mesa_record_error(ctx, error); } void -- 1.8.5.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
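The two review suggestions in this thread (return early when there is no context, or assert that there is one) can be sketched as follows. This is a hedged illustration only: `sketch_context` and both functions are hypothetical miniatures, not `struct gl_context` or `_mesa_error`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a context; not the real struct gl_context. */
struct sketch_context {
   bool debug_enabled;   /* stands in for ctx->Debug being set */
   unsigned logged;      /* number of messages sent to the debug log */
   int last_error;       /* stands in for the recorded GL error state */
};

/* Guard-clause variant: leave before touching anything that needs ctx.
 * Returns true if the error was recorded. */
static bool
sketch_report_error(struct sketch_context *ctx, int error)
{
   if (!ctx)
      return false;

   if (ctx->debug_enabled)
      ctx->logged++;

   ctx->last_error = error;
   return true;
}

/* Stricter variant: a missing context is treated as a caller bug. */
static void
sketch_report_error_strict(struct sketch_context *ctx, int error)
{
   assert(ctx != NULL);
   ctx->last_error = error;
}
```

The guard clause keeps the body at one indentation level, which is the readability point raised in the review; the assert variant only helps in debug builds, since NDEBUG compiles it out.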
Re: [Mesa-dev] [PATCH 11/13] nir/nir: Use a linked list instead of a hash set for use/def sets
Based on the testing you did, it sounds like switching to linked lists gives us some pretty good performance gains, but before we go ahead with this you should collect some numbers using http://anholt.net/compare-perf/ and put them on this commit message. Comparing list vs. no-list as well as NIR vs. non-NIR might be useful, so we can compare the time saved to the total time we spend doing NIR-related things. On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote: This commit switches us from the current setup of using hash sets for use/def sets to using linked lists. Doing so should save us quite a bit of memory because we aren't carrying around 3 hash sets per register and 2 per SSA value. It should also save us CPU time because adding/removing things from use/def sets is 4 pointer manipulations instead of a hash lookup. On the code complexity side of things, some things are now much easier and others are a bit harder. One of the operations we perform constantly in optimization passes is to replace one source with another. Due to the fact that an instruction can use the same SSA value multiple times, we had to iterate through the sources of the instruction and determine if the use we were replacing was the only one before removing it from the set of uses. With this patch, uses are per-source not per-instruction so we can just remove it safely. On the other hand, trying to iterate over all of the instructions that use a given value is more difficult. Fortunately, the two places we do that are the ffma peephole where it doesn't matter and GCM where we already gracefully handle duplicates visits to an instruction. Another aspect here is that using linked lists in this way can be tricky to get right. With sets, things were quite forgiving and the worst that happened if you didn't properly remove a use was that it would get caught in the validator. With linked lists, it can lead to linked list corruption which can be harder to track. 
However, we do just as much validation of the linked lists as we did of the sets so the validator should still catch these problems. While working on this series, the vast majority of the bugs I had to fix were caught by assertions. I don't think the lists are going to be that much worse than the sets. --- src/glsl/nir/nir.c | 228 +++- src/glsl/nir/nir.h | 45 +++-- src/glsl/nir/nir_validate.c | 158 +++--- 3 files changed, 194 insertions(+), 237 deletions(-) diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index b8f5dd4..be13c90 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -58,12 +58,9 @@ reg_create(void *mem_ctx, struct exec_list *list) nir_register *reg = ralloc(mem_ctx, nir_register); reg-parent_instr = NULL; - reg-uses = _mesa_set_create(reg, _mesa_hash_pointer, -_mesa_key_pointer_equal); - reg-defs = _mesa_set_create(reg, _mesa_hash_pointer, -_mesa_key_pointer_equal); - reg-if_uses = _mesa_set_create(reg, _mesa_hash_pointer, - _mesa_key_pointer_equal); + list_inithead(reg-uses); + list_inithead(reg-defs); + list_inithead(reg-if_uses); reg-num_components = 0; reg-num_array_elems = 0; @@ -1070,11 +1067,14 @@ update_if_uses(nir_cf_node *node) nir_if *if_stmt = nir_cf_node_as_if(node); - struct set *if_uses_set = if_stmt-condition.is_ssa ? 
- if_stmt-condition.ssa-if_uses : - if_stmt-condition.reg.reg-uses; - - _mesa_set_add(if_uses_set, if_stmt); + if_stmt-condition.parent_if = if_stmt; + if (if_stmt-condition.is_ssa) { + list_addtail(if_stmt-condition.use_link, + if_stmt-condition.ssa-if_uses); + } else { + list_addtail(if_stmt-condition.use_link, + if_stmt-condition.reg.reg-if_uses); + } } void @@ -1227,16 +1227,7 @@ cleanup_cf_node(nir_cf_node *node) foreach_list_typed(nir_cf_node, child, node, if_stmt-else_list) cleanup_cf_node(child); - struct set *if_uses; - if (if_stmt-condition.is_ssa) { - if_uses = if_stmt-condition.ssa-if_uses; - } else { - if_uses = if_stmt-condition.reg.reg-if_uses; - } - - struct set_entry *entry = _mesa_set_search(if_uses, if_stmt); - assert(entry); - _mesa_set_remove(if_uses, entry); + list_del(if_stmt-condition.use_link); break; } @@ -1293,9 +1284,9 @@ add_use_cb(nir_src *src, void *state) { nir_instr *instr = state; - struct set *uses_set = src-is_ssa ? src-ssa-uses : src-reg.reg-uses; - - _mesa_set_add(uses_set, instr); + src-parent_instr = instr; + list_addtail(src-use_link, +src-is_ssa ? src-ssa-uses : src-reg.reg-uses); return true; } @@
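The commit message's claim that adding or removing a use is "4 pointer manipulations" can be illustrated with a generic intrusive doubly linked list. This is a sketch under assumptions: the `sketch_` names are hypothetical and only modelled on the list_inithead/list_addtail/list_del calls visible in the diff, not copied from Mesa's list header.

```c
#include <assert.h>
#include <stddef.h>

/* Intrusive list node, embedded in each object that lives on a list. */
struct sketch_list {
   struct sketch_list *prev;
   struct sketch_list *next;
};

/* An empty list is a head that points at itself. */
static void
sketch_list_init(struct sketch_list *head)
{
   head->prev = head;
   head->next = head;
}

/* Append item before the head: exactly four pointer writes. */
static void
sketch_list_addtail(struct sketch_list *item, struct sketch_list *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* Unlink item without needing the head or any lookup. */
static void
sketch_list_del(struct sketch_list *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = NULL;
   item->next = NULL;
}

/* One node per source, so an instruction that reads the same value twice
 * appears twice on that value's use list. */
struct sketch_src {
   struct sketch_list use_link;
   int parent_instr;   /* hypothetical instruction id */
};

static unsigned
sketch_count_uses(const struct sketch_list *head)
{
   unsigned count = 0;

   for (const struct sketch_list *l = head->next; l != head; l = l->next)
      count++;
   return count;
}
```

Because each source carries its own link, replacing one source never requires scanning the instruction's other sources; the cost is that walking the list by instruction can visit the same instruction more than once, which is the trade-off the commit message describes.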
Re: [Mesa-dev] [PATCH 1/5] nir: Define image load, store and atomic intrinsics.
On IRC, Ken and I were discussing using a scheme inspired by SPIR-V, which has an OpImagePointer instruction that forms a pointer to the particular texel of the image as well as OpAtomic{Load,Store,Exchange,etc.} that operate on an image or shared buffer pointer. The advantages would be: * Makes translating from SPIR-V easier. * Reduces the number of intrinsics we need to add for SSBO support. * Reduces the combinatorial explosion enough that we can have separate versions for 2, 3, and 4 components and MS vs. non-MS without it being unbearable. I'm not sure how much of a benefit that would be though. The disadvantages I can think of are: * Doesn't actually save any code in the i965 backend, since we need to do different things depending on if the pointer is to an image or a shared buffer anyways. * We'd have to special case nir_convert_from_ssa to ignore the SSA value that's really a pointer since we don't have any real type-level support for pointers. * Since we lower to SSA before converting to i965, there are some ugly edge cases when the coordinate argument becomes part of a phi web and gets potentially overwritten before the instruction that uses the pointer. I don't have a preference one way or the other, and I guess we could always refactor it later if we wanted to, so assuming Ken is OK with this, then besides one minor comment on patch 4 the series is Reviewed-by: Connor Abbott cwabbo...@gmail.com On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez curroje...@riseup.net wrote: --- src/glsl/nir/nir_intrinsics.h | 27 +++ 1 file changed, 27 insertions(+) diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h index 8e28765..4b13c75 100644 --- a/src/glsl/nir/nir_intrinsics.h +++ b/src/glsl/nir/nir_intrinsics.h @@ -89,6 +89,33 @@ ATOMIC(inc, 0) ATOMIC(dec, 0) ATOMIC(read, NIR_INTRINSIC_CAN_ELIMINATE) +/* + * Image load, store and atomic intrinsics. + * + * All image intrinsics take an image target passed as a nir_variable. 
Image + * variables contain a number of memory and layout qualifiers that influence + * the semantics of the intrinsic. + * + * All image intrinsics take a four-coordinate vector and a sample index as + * first two sources, determining the location within the image that will be + * accessed by the intrinsic. Components not applicable to the image target + * in use are equal to zero by convention. Image store takes an additional + * four-component argument with the value to be written, and image atomic + * operations take either one or two additional scalar arguments with the same + * meaning as in the ARB_shader_image_load_store specification. + */ +INTRINSIC(image_load, 2, ARR(4, 1), true, 4, 1, 0, + NIR_INTRINSIC_CAN_ELIMINATE) +INTRINSIC(image_store, 3, ARR(4, 1, 4), false, 0, 1, 0, 0) +INTRINSIC(image_atomic_add, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_min, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_max, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_and, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_or, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_xor, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_exchange, 3, ARR(4, 1, 1), true, 1, 1, 0, 0) +INTRINSIC(image_atomic_comp_swap, 4, ARR(4, 1, 1, 1), true, 1, 1, 0, 0) + #define SYSTEM_VALUE(name, components) \ INTRINSIC(load_##name, 0, ARR(), true, components, 0, 0, \ NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) -- 2.3.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/27] i965: Store gather table information in the program data
On Tue, Apr 28, 2015 at 11:08:08PM +0300, Abdiel Janulgue wrote: The resource streamer is able to gather and pack sparsely-located constant data from any buffer object by a referring to a gather table This patch adds support for keeping track of these constant data fetches into a gather table. The gather table is generated from two sources. Ordinary uniform fetches are stored first. These are then combined with a separate table containing UBO entries. The separate entry for UBOs is needed to make it easier to generate the gather mask when combining and packing the constant data. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_context.h | 9 + src/mesa/drivers/dri/i965/brw_gs.c | 4 src/mesa/drivers/dri/i965/brw_program.c | 5 + src/mesa/drivers/dri/i965/brw_shader.cpp | 4 +++- src/mesa/drivers/dri/i965/brw_shader.h | 11 +++ src/mesa/drivers/dri/i965/brw_vs.c | 5 + src/mesa/drivers/dri/i965/brw_wm.c | 5 + 7 files changed, 42 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 7fd49e9..e25c64d 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -355,9 +355,12 @@ struct brw_stage_prog_data { GLuint nr_params; /** number of float params/constants */ GLuint nr_pull_params; + GLuint nr_ubo_params; + GLuint nr_gather_table; I would introduce these as non gl-types - we are trying to move away from them. Perhaps change nr_params and nr_pull_params while you are at it. 
unsigned curb_read_length; unsigned total_scratch; + unsigned max_ubo_const_block; /** * Register where the thread expects to find input data from the URB @@ -375,6 +378,12 @@ struct brw_stage_prog_data { */ const gl_constant_value **param; const gl_constant_value **pull_param; + struct { + int reg; + unsigned channel_mask; + unsigned const_block; + unsigned const_offset; + } *gather_table; }; Below in brw_shader.h you do: struct gather_table { int reg; unsigned channel_mask; unsigned const_block; unsigned const_offset; }; gather_table *ubo_gather_table; Why not here? /* Data about a particular attempt to compile a program. Note that diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index bea90d8..97658d5 100644 --- a/src/mesa/drivers/dri/i965/brw_gs.c +++ b/src/mesa/drivers/dri/i965/brw_gs.c @@ -70,6 +70,10 @@ brw_compile_gs_prog(struct brw_context *brw, c.prog_data.base.base.pull_param = rzalloc_array(NULL, const gl_constant_value *, param_count); c.prog_data.base.base.nr_params = param_count; + c.prog_data.base.base.nr_gather_table = 0; + c.prog_data.base.base.gather_table = + rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) * + (c.prog_data.base.base.nr_params + c.prog_data.base.base.nr_ubo_params)); Wrap this line. 
if (brw-gen = 7) { if (gp-program.OutputType == GL_POINTS) { diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index 81a0c19..f27c799 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -573,6 +573,10 @@ brw_stage_prog_data_compare(const struct brw_stage_prog_data *a, if (memcmp(a-pull_param, b-pull_param, a-nr_pull_params * sizeof(void *))) return false; + if (memcmp(a-gather_table, b-gather_table, + a-nr_gather_table * sizeof(*a-gather_table))) + return false; + return true; } @@ -583,6 +587,7 @@ brw_stage_prog_data_free(const void *p) ralloc_free(prog_data-param); ralloc_free(prog_data-pull_param); + ralloc_free(prog_data-gather_table); } void diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 0d6ac0c..8769f67 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -739,11 +739,13 @@ backend_visitor::backend_visitor(struct brw_context *brw, prog(prog), stage_prog_data(stage_prog_data), cfg(NULL), - stage(stage) + stage(stage), + ubo_gather_table(NULL) { debug_enabled = INTEL_DEBUG intel_debug_flag_for_shader_stage(stage); stage_name = _mesa_shader_stage_to_string(stage); stage_abbrev = _mesa_shader_stage_to_abbrev(stage); + this-nr_ubo_gather_table = 0; Any particular reason not to do this in the initializer along with the other members? } bool diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h index 8a3263e..db0018f 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.h +++
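A rough sketch of the gather-table bookkeeping discussed in this review, using a named entry struct as the reviewer suggests instead of an anonymous one. Everything here is an assumption for illustration: the `sketch_` types are hypothetical, calloc stands in for rzalloc_size, and the entry-count check in the comparison is an addition that the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Named entry type, so program data and visitor can share one definition. */
struct sketch_gather_entry {
   int reg;
   unsigned channel_mask;
   unsigned const_block;
   unsigned const_offset;
};

struct sketch_prog_data {
   unsigned nr_params;
   unsigned nr_ubo_params;
   unsigned nr_gather_table;
   struct sketch_gather_entry *gather_table;
};

/* Reserve zeroed room for every ordinary uniform plus every UBO entry. */
static bool
sketch_gather_table_alloc(struct sketch_prog_data *pd)
{
   const size_t capacity = (size_t)pd->nr_params + pd->nr_ubo_params;

   pd->nr_gather_table = 0;
   pd->gather_table = calloc(capacity ? capacity : 1,
                             sizeof(*pd->gather_table));
   return pd->gather_table != NULL;
}

/* Byte-wise comparison of the used entries, after checking the counts. */
static bool
sketch_gather_table_equal(const struct sketch_prog_data *a,
                          const struct sketch_prog_data *b)
{
   if (a->nr_gather_table != b->nr_gather_table)
      return false;

   return memcmp(a->gather_table, b->gather_table,
                 a->nr_gather_table * sizeof(*a->gather_table)) == 0;
}
```

Comparing with memcmp is only safe when the tables start out zeroed, since any padding bytes would otherwise hold indeterminate values; that is one reason to allocate with a zeroing allocator.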
Re: [Mesa-dev] [PATCH 13/27] i965: Assign hw-binding table index for uniform constant buffer block
On Tue, Apr 28, 2015 at 11:08:10PM +0300, Abdiel Janulgue wrote:
> Assign the uploaded uniform block with hardware binding table indices. This is indexed by the resource streamer to fetch the constant buffers referred to by our gather table entries.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/gen6_vs_state.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> index 7325c6e..bce597f 100644
> --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> @@ -72,9 +72,16 @@ gen6_upload_push_constants(struct brw_context *brw,
>        gl_constant_value *param;
>        int i;
>
> -      param = brw_state_batch(brw, type,
> -                              prog_data->nr_params * sizeof(gl_constant_value),
> +      uint32_t size = prog_data->nr_params * sizeof(gl_constant_value);

Const would be nice here.

> +      param = brw_state_batch(brw, type, size,
>                                32, &stage_state->push_const_offset);
> +      if (brw->gather_pool.bo != NULL) {
> +         uint32_t surf_offset = 0;
> +         brw_create_constant_surface(brw, brw->batch.bo, stage_state->push_const_offset,
> +                                     size, &surf_offset, false);
> +         gen7_update_binding_table(brw, stage_state->stage, BRW_UNIFORM_GATHER_INDEX_START,

Two lines overflowing 80 columns.

> +                                   surf_offset);
> +      }
>
>        STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));
>
> --
> 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location
On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote: From: Topi Pohjolainen topi.pohjolai...@intel.com All generations do the same exact dispatch and it could be therefore done in the hardware independent stage. Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 3 - src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 31 ++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++ src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++ 4 files changed, 83 insertions(+), 87 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 2fcdcfa..a6282f4 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw, uint32_t size, uint32_t *out_offset, bool dword_pitch); -void brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset); void brw_update_sol_surface(struct brw_context *brw, struct gl_buffer_object *buffer_obj, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 160dd2f..2b8040c 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw, } } -void -brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset) +static void +update_buffer_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset) { struct brw_context *brw = brw_context(ctx); struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current; @@ -320,12 +320,6 @@ 
brw_update_texture_surface(struct gl_context *ctx, struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); uint32_t *surf; - /* BRW_NEW_TEXTURE_BUFFER */ - if (tObj-Target == GL_TEXTURE_BUFFER) { - brw_update_buffer_texture_surface(ctx, unit, surf_offset); - return; - } - surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, surf_offset); @@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = { .emit = update_renderbuffer_surfaces, }; +static void +update_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset, + bool for_gather) +{ + struct brw_context *brw = brw_context(ctx); + struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; + + if (obj-Target == GL_TEXTURE_BUFFER) { + update_buffer_texture_surface(ctx, unit, surf_offset); In order to avoid extra level of indentation I used the following. I would have preferred it here also. if (obj-Target == GL_TEXTURE_BUFFER) { update_buffer_texture_surface(ctx, unit, surf_offset); return; } + } else { + brw-vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather); + } +} static void update_stage_texture_surfaces(struct brw_context *brw, @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw, /* _NEW_TEXTURE */ if (ctx-Texture.Unit[unit]._Current) { -brw-vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather); +update_texture_surface(ctx, unit, surf_offset + s, for_gather); } } } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 15ab2b0..098b5c8 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx, struct brw_context *brw = brw_context(ctx); struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; - if (obj-Target == GL_TEXTURE_BUFFER) { - brw_update_buffer_texture_surface(ctx, unit, 
surf_offset); - - } else { - struct intel_texture_object *intel_obj = intel_texture_object(obj); - struct intel_mipmap_tree *mt = intel_obj-mt; - struct gl_sampler_object *sampler =
Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location
On Thu, May 07, 2015 at 05:55:48PM +0300, Francisco Jerez wrote:
> Pohjolainen, Topi topi.pohjolai...@intel.com writes:
>
> > On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
> > > From: Topi Pohjolainen topi.pohjolai...@intel.com
> > >
> > > All generations do the same exact dispatch and it could be
> > > therefore done in the hardware independent stage.
> > >
> > > Reviewed-by: Matt Turner matts...@gmail.com
> > > Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> > > Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
> > > [ Francisco Jerez: Non-trivial rebase. ]
> > > Reviewed-by: Francisco Jerez curroje...@riseup.net
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_context.h           |  3 -
> > >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
> > >  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++
> > >  src/mesa/drivers/dri/i965/gen8_surface_state.c    | 67 ++
> > >  4 files changed, 83 insertions(+), 87 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > > index 2fcdcfa..a6282f4 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > > @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw,
> > >                                   uint32_t size,
> > >                                   uint32_t *out_offset,
> > >                                   bool dword_pitch);
> > > -void brw_update_buffer_texture_surface(struct gl_context *ctx,
> > > -                                       unsigned unit,
> > > -                                       uint32_t *surf_offset);
> > >  void
> > >  brw_update_sol_surface(struct brw_context *brw,
> > >                         struct gl_buffer_object *buffer_obj,
> > > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > index 160dd2f..2b8040c 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > > @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
> > >     }
> > >  }
> > >
> > > -void
> > > -brw_update_buffer_texture_surface(struct gl_context *ctx,
> > > -                                  unsigned unit,
> > > -                                  uint32_t *surf_offset)
> > > +static void
> > > +update_buffer_texture_surface(struct gl_context *ctx,
> > > +                              unsigned unit,
> > > +                              uint32_t *surf_offset)
> > >  {
> > >     struct brw_context *brw = brw_context(ctx);
> > >     struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
> > > @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
> > >     struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
> > >     uint32_t *surf;
> > >
> > > -   /* BRW_NEW_TEXTURE_BUFFER */
> > > -   if (tObj->Target == GL_TEXTURE_BUFFER) {
> > > -      brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> > > -      return;
> > > -   }
> > > -
> > >     surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> > >                            6 * 4, 32, surf_offset);
> > >
> > > @@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = {
> > >     .emit = update_renderbuffer_surfaces,
> > >  };
> > >
> > > +static void
> > > +update_texture_surface(struct gl_context *ctx,
> > > +                       unsigned unit,
> > > +                       uint32_t *surf_offset,
> > > +                       bool for_gather)
> > > +{
> > > +   struct brw_context *brw = brw_context(ctx);
> > > +   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
> > > +
> > > +   if (obj->Target == GL_TEXTURE_BUFFER) {
> > > +      update_buffer_texture_surface(ctx, unit, surf_offset);
> >
> > In order to avoid extra level of indentation I used the following. I would
> > have preferred it here also.
> >
> >    if (obj->Target == GL_TEXTURE_BUFFER) {
> >       update_buffer_texture_surface(ctx, unit, surf_offset);
> >       return;
> >    }
>
> I kept this as an indented block because it's harmless IMHO and it
> seemed a somewhat lesser evil than:
>
> 1/ Define all texture-specific variables (i.e. things that are not
>    applicable to buffer textures, including some pointer dereferences)
>    at the top level, which is what you did, but it seemed a bit dodgy.
>
> 2/ Mix statements and declarations. (Granted, this file is likely
>    already relying on other C99 features, so it wouldn't matter in
>    practice, it's just a codestyle itch)
>
> 3/ Declare stuff and leave it uninitialized until later.
>
> That said, the reason was largely subjective, and I don't really have a
> strong preference. As you are still the author of this commit you're
> free to format it as you wish, you can keep my R-b if you simply
> reindent this function.

If Ken and Matt are happy with this series, so am I. I'm just glad if we can land it.
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
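For readers comparing the two layouts debated in this thread, here is a minimal sketch in plain C. It is not the driver code: `enum target`, `struct tex_object`, the two `update_*` helpers and the call counters are hypothetical stand-ins so the example compiles on its own. Both variants dispatch identically; they differ only in indentation and in where later declarations could be placed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real Mesa structs. */
enum target { TARGET_2D, TARGET_BUFFER };

struct tex_object {
   enum target target;
};

/* Counters let a caller observe which path ran. */
static int buffer_calls;
static int generic_calls;

static void update_buffer_surface(void)  { buffer_calls++; }
static void update_generic_surface(void) { generic_calls++; }

/* Variant A: if/else block, the layout kept in the rebased patch. */
static void
dispatch_if_else(const struct tex_object *obj)
{
   assert(obj != NULL);

   if (obj->target == TARGET_BUFFER) {
      update_buffer_surface();
   } else {
      update_generic_surface();
   }
}

/* Variant B: early return, the layout preferred in the review comment. */
static void
dispatch_early_return(const struct tex_object *obj)
{
   assert(obj != NULL);

   if (obj->target == TARGET_BUFFER) {
      update_buffer_surface();
      return;
   }

   update_generic_surface();
}
```

In a function this small the choice is cosmetic. The trade-off raised in the thread only appears once the non-buffer path needs its own local variables.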
Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters
On Tue, Apr 28, 2015 at 11:08:13PM +0300, Abdiel Janulgue wrote:
> Now that we consider UBO constants as push constants, we need to include
> the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
> This information is needed to properly pack vector constants tightly next
> to each other.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
>  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
>  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
>  3 files changed, 37 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c
> index 97658d5..2dc3ea1 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -32,6 +32,7 @@
>  #include "brw_vec4_gs_visitor.h"
>  #include "brw_state.h"
>  #include "brw_ff_gs.h"
> +#include "glsl/nir/nir_types.h"
>
>  bool
> @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
>     c.prog_data.base.base.pull_param =
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     c.prog_data.base.base.nr_params = param_count;
> +   c.prog_data.base.base.nr_ubo_params = 0;
> +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
> +      for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
> +         const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
> +         const struct glsl_type *elem = glsl_get_element_type(type);
> +         int array_sz = elem ? glsl_get_array_size(type) : 1;
> +         int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
> +         c.prog_data.base.base.nr_ubo_params += components * array_sz;
> +      }
> +   }
>     c.prog_data.base.base.nr_gather_table = 0;
>     c.prog_data.base.base.gather_table = rzalloc_size(NULL,
>        sizeof(*c.prog_data.base.base.gather_table) *
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
> index 52333c9..86bef5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -37,6 +37,7 @@
>  #include "brw_state.h"
>  #include "program/prog_print.h"
>  #include "program/prog_parameter.h"
> +#include "glsl/nir/nir_types.h"
>
>  #include "util/ralloc.h"
>
> @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     stage_prog_data->nr_params = param_count;
>
> +   stage_prog_data->nr_ubo_params = 0;
> +   if (vs) {
> +      for (int i = 0; i < vs->NumUniformBlocks; i++) {
> +         for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
> +            const struct glsl_type *type = vs->UniformBlocks[i].Uniforms[p].Type;
> +            const struct glsl_type *elem = glsl_get_element_type(type);
> +            int array_sz = elem ? glsl_get_array_size(type) : 1;
> +            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
> +            stage_prog_data->nr_ubo_params += components * array_sz;
> +         }
> +      }
> +   }
>     stage_prog_data->nr_gather_table = 0;
>     stage_prog_data->gather_table = rzalloc_size(NULL,
>        sizeof(*stage_prog_data->gather_table) *
>        (stage_prog_data->nr_params +
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
> index 13a64d8..2060eab 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -38,6 +38,7 @@
>  #include "main/samplerobj.h"
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
> +#include "glsl/nir/nir_types.h"
>  #include "intel_mipmap_tree.h"
>
>  #include "util/ralloc.h"
> @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
>        rzalloc_array(NULL, const gl_constant_value *, param_count);
>     prog_data.base.nr_params = param_count;
>
> +   prog_data.base.nr_ubo_params = 0;
> +   if (fs) {
> +      for (int i = 0; i < fs->NumUniformBlocks; i++) {
> +         for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
> +            const struct glsl_type *type = fs->UniformBlocks[i].Uniforms[p].Type;
> +            const struct glsl_type *elem = glsl_get_element_type(type);
> +            int array_sz = elem ? glsl_get_array_size(type) : 1;
> +            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
> +            prog_data.base.nr_ubo_params += components * array_sz;
> +         }
> +      }
> +   }

I didn't check the exact details, but it looks to me like you could refactor this into its own routine - all three occurrences look awfully similar.

>     prog_data.base.nr_gather_table = 0;
>     prog_data.base.gather_table = rzalloc_size(NULL,
>        sizeof(*prog_data.base.gather_table) *
>        (prog_data.base.nr_params +
> --
> 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
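The refactoring suggested in the review could look roughly like the sketch below. It is an illustration only: `struct uniform_type`, `struct uniform_block` and `count_ubo_params()` are hypothetical, simplified stand-ins for the GLSL type queries used in the patch (`glsl_get_element_type()`, `glsl_get_array_size()`, `glsl_get_components()`), chosen so the example is self-contained. The idea is that the GS, VS and FS compile paths would each call one shared helper instead of repeating the nested loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified description of one uniform inside a block. */
struct uniform_type {
   unsigned components;   /* components per element, e.g. 4 for a vec4 */
   unsigned array_size;   /* 0 when the uniform is not an array */
};

/* Hypothetical, simplified uniform block. */
struct uniform_block {
   const struct uniform_type *uniforms;
   unsigned num_uniforms;
};

/* Sum the component slots of every uniform in every block.
 * A non-array uniform counts as a single element. */
static unsigned
count_ubo_params(const struct uniform_block *blocks, unsigned num_blocks)
{
   unsigned total = 0;

   assert(blocks != NULL || num_blocks == 0);

   for (unsigned i = 0; i < num_blocks; i++) {
      for (unsigned p = 0; p < blocks[i].num_uniforms; p++) {
         const struct uniform_type *type = &blocks[i].uniforms[p];
         const unsigned array_sz = type->array_size ? type->array_size : 1;

         total += type->components * array_sz;
      }
   }

   return total;
}
```

With a helper of this shape, the `if (vs)` and `if (fs)` guards from the patch would reduce to passing a block count of zero when no shader is present.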
Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp
On Wed, May 6, 2015 at 7:29 PM, Matt Turner matts...@gmail.com wrote:
> On Wed, May 6, 2015 at 7:09 PM, Ian Romanick i...@freedesktop.org wrote:
> > From: Ian Romanick ian.d.roman...@intel.com
> >
> > It's a weird thing that provides some values related to 2**x. It's
> > also already handled by a case in the switch.
> >
> > Signed-off-by: Ian Romanick ian.d.roman...@intel.com
>
> The series is
>
> Reviewed-by: Matt Turner matts...@gmail.com

I was going to complain about you making my SPIR-V -> NIR translator harder to write. But, based on the discussion by Ken and Ilia on IRC, it looks like basically no one's hardware does a base-e log. I'll just lower on-the-fly. I guess maybe we could do it with pow(x, e) but meh.

If you'd like, the series is

Acked-by: Jason Ekstrand jason.ekstr...@intel.com

I can't say I read it enough to call it a review but I glanced through it and it seems ok.

--Jason

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.
On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote: On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote: This patch implements the binding table enable command which is also used to allocate a binding table pool where hardware-generated binding table entries are flushed into. Each binding table offset in the binding table pool is unique per each shader stage that are enabled within a batch. Also insert the required brw_tracked_state objects to enable hw-generated binding tables in normal render path. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 ++ src/mesa/drivers/dri/i965/brw_context.c| 4 ++ src/mesa/drivers/dri/i965/brw_context.h| 5 ++ src/mesa/drivers/dri/i965/brw_state.h | 7 +++ src/mesa/drivers/dri/i965/brw_state_upload.c | 2 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 4 ++ 6 files changed, 92 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 459165a..a58e32e 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -44,6 +44,11 @@ #include brw_state.h #include intel_batchbuffer.h +/* Somehow the hw-binding table pool offset must start here, otherwise + * the GPU will hang + */ +#define HW_BT_START_OFFSET 256; I think we want to understand this a little better before enabling... + /** * Upload a shader stage's binding table as indirect state. * @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = { .emit = brw_gs_upload_binding_table, }; +/** + * Hardware-generated binding tables for the resource streamer + */ +void +gen7_disable_hw_binding_tables(struct brw_context *brw) +{ + BEGIN_BATCH(3); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC 16 | (3 - 2)); + OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) | + brw-is_haswell ? 
HSW_HW_BINDING_TABLE_RESERVED : 0); + OUT_BATCH(0); + ADVANCE_BATCH(); + + /* Pipe control workaround */ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); +} + +void +gen7_enable_hw_binding_tables(struct brw_context *brw) +{ + if (!brw-has_resource_streamer) { + gen7_disable_hw_binding_tables(brw); I started wondering why we really need this - RS is disabled by default and we haven't needed to do anything to disable it before. Right, patch number eight gave me the answer, we want to disable it for blorp. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms
On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
> Reserve space in the gather pool where the gathered uniforms are flushed.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> index 35d10ef..aebaa49 100644
> --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
>         */
>        assert(stage_state->push_const_size <= 32);
>     }
> +   /* Allocate gather pool space for uniform and UBO entries in 512-bit chunks*/
> +   if (brw->gather_pool.bo != NULL) {
> +      if (prog_data->nr_params > 0) {

I guess you could combine these conditions:

   if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)

Or even bail out early:

   if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
      return;

> +         int num_consts = ALIGN(prog_data->nr_params, 4) / 4;

This could be const, no big deal though.

> +         stage_state->push_const_offset = brw->gather_pool.next_offset;
> +         brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
> +      }
> +   }
>  }
>
>  static void
> --
> 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms
On Thu, May 7, 2015 at 10:52 AM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote:
> On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
> > Reserve space in the gather pool where the gathered uniforms are flushed.
> >
> > Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> > ---
> >  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > index 35d10ef..aebaa49 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
> >         */
> >        assert(stage_state->push_const_size <= 32);
> >     }
> > +   /* Allocate gather pool space for uniform and UBO entries in 512-bit chunks*/
> > +   if (brw->gather_pool.bo != NULL) {
> > +      if (prog_data->nr_params > 0) {
>
> I guess you combine these conditions:
>
>    if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)
>
> Or even bail out early:
>
>    if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
>       return;
>
> > +         int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
>
> This could be const, no big deal though.

And it could be DIV_ROUND_UP...

> > +         stage_state->push_const_offset = brw->gather_pool.next_offset;
> > +         brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
> > +      }
> > +   }
> >  }
> >
> >  static void
> > --
> > 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
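To make the DIV_ROUND_UP remark concrete, the sketch below compares the two spellings of the size calculation from the patch. The macros are defined locally for the example (Mesa ships its own `ALIGN` and `DIV_ROUND_UP`; `ALIGN_POT` here is a stand-in that assumes a power-of-two alignment), and both function names are hypothetical. The reading that parameters are grouped by four, and groups by four again into a 64-byte (512-bit) chunk, is taken from the patch's comment and arithmetic, not from hardware documentation.

```c
#include <assert.h>

/* Local stand-ins for this sketch only. */
#define ALIGN_POT(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Spelling used in the patch: align up, then divide. */
static unsigned
gather_pool_bytes_align(unsigned nr_params)
{
   assert(nr_params > 0);

   const unsigned num_consts = ALIGN_POT(nr_params, 4) / 4;
   return (ALIGN_POT(num_consts, 4) / 4) * 64;
}

/* Spelling suggested in the review: round-up division. */
static unsigned
gather_pool_bytes(unsigned nr_params)
{
   assert(nr_params > 0);

   const unsigned num_consts = DIV_ROUND_UP(nr_params, 4);
   return DIV_ROUND_UP(num_consts, 4) * 64;
}
```

Both functions return the same value for any parameter count that does not overflow, so the change is purely one of readability: 1 to 16 parameters reserve one 64-byte chunk, and 17 parameters reserve two.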
Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location
Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote: From: Topi Pohjolainen topi.pohjolai...@intel.com All generations do the same exact dispatch and it could be therefore done in the hardware independent stage. Reviewed-by: Matt Turner matts...@gmail.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com [ Francisco Jerez: Non-trivial rebase. ] Reviewed-by: Francisco Jerez curroje...@riseup.net --- src/mesa/drivers/dri/i965/brw_context.h | 3 - src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 31 ++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++ src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++ 4 files changed, 83 insertions(+), 87 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 2fcdcfa..a6282f4 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw, uint32_t size, uint32_t *out_offset, bool dword_pitch); -void brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset); void brw_update_sol_surface(struct brw_context *brw, struct gl_buffer_object *buffer_obj, diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 160dd2f..2b8040c 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw, } } -void -brw_update_buffer_texture_surface(struct gl_context *ctx, - unsigned unit, - uint32_t *surf_offset) +static void +update_buffer_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset) { struct brw_context *brw = brw_context(ctx); struct gl_texture_object *tObj = 
ctx-Texture.Unit[unit]._Current; @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx, struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit); uint32_t *surf; - /* BRW_NEW_TEXTURE_BUFFER */ - if (tObj-Target == GL_TEXTURE_BUFFER) { - brw_update_buffer_texture_surface(ctx, unit, surf_offset); - return; - } - surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, surf_offset); @@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = { .emit = update_renderbuffer_surfaces, }; +static void +update_texture_surface(struct gl_context *ctx, + unsigned unit, + uint32_t *surf_offset, + bool for_gather) +{ + struct brw_context *brw = brw_context(ctx); + struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current; + + if (obj-Target == GL_TEXTURE_BUFFER) { + update_buffer_texture_surface(ctx, unit, surf_offset); In order to avoid extra level of indentation I used the following. I would have preferred it here also. if (obj-Target == GL_TEXTURE_BUFFER) { update_buffer_texture_surface(ctx, unit, surf_offset); return; } I kept this as an indented block because it's harmless IMHO and it seemed a somewhat lesser evil than: 1/ Define all texture-specific variables (i.e. things that are not applicable to buffer textures, including some pointer dereferences) at the top level, which is what you did, but it seemed a bit dodgy. 2/ Mix statements and declarations. (Granted, this file is likely already relying on other C99 features, so it wouldn't matter in practice, it's just a codestyle itch) 3/ Declare stuff and leave it uninitialized until later. That said, the reason was largely subjective, and I don't really have a strong preference. As you are still the author of this commit you're free to format it as you wish, you can keep my R-b if you simply reindent this function. 
+ } else { + brw-vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather); + } +} static void update_stage_texture_surfaces(struct brw_context *brw, @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw, /* _NEW_TEXTURE */ if (ctx-Texture.Unit[unit]._Current) { -brw-vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather); +update_texture_surface(ctx, unit,
Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms
On Thu, May 07, 2015 at 05:52:12PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
> > Reserve space in the gather pool where the gathered uniforms are flushed.
> >
> > Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> > ---
> >  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > index 35d10ef..aebaa49 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
> >         */
> >        assert(stage_state->push_const_size <= 32);
> >     }
> > +   /* Allocate gather pool space for uniform and UBO entries in 512-bit chunks*/
> > +   if (brw->gather_pool.bo != NULL) {
> > +      if (prog_data->nr_params > 0) {
>
> I guess you combine these conditions:
>
>    if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)
>
> Or even bail out early:

Nevermind, you modify it even further in the next patch.

>    if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
>       return;
>
> > +         int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
>
> This could be const, no big deal though.
>
> > +         stage_state->push_const_offset = brw->gather_pool.next_offset;
> > +         brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
> > +      }
> > +   }
> >  }
> >
> >  static void
> > --
> > 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/27] i965: Assign hw-binding table index for each UBO constant buffer.
On Tue, Apr 28, 2015 at 11:08:09PM +0300, Abdiel Janulgue wrote:
> To be able to refer to a constant buffer, the resource streamer needs
> to index it with a hardware binding table entry. This blankets the ubo
> buffers with hardware binding table indices.
>
> Gather constants hardware fetches in 16-entry binding table blocks.
> So we need to use a block that is unused.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_context.h          | 11 +++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 ++
>  2 files changed, 17 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index e25c64d..276c359 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -678,6 +678,17 @@ struct brw_vs_prog_data {
>
>  #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
>
> +/** Start of hardware binding table index for uniform gather constant entries.
> + * This must be aligned to the start of a hardware binding table block (a block
> + * is a group 16 binding table entries).
> + */
> +#define BRW_UNIFORM_GATHER_INDEX_START 32
> +
> +/** Appended to the end of the binding table index for uniform constant buffers to indicate

Wrap this line.

> + * start of the UBO gather constant binding table.
> + */
> +#define BRW_UBO_GATHER_INDEX_APPEND 2
> +
>  /* Note: brw_gs_prog_data_compare() must be updated when adding fields to
>   * this struct!
>   */
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 161d140..ce61554 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -884,6 +884,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
>
>     uint32_t *surf_offsets =
>        &stage_state->surf_offset[prog_data->binding_table.ubo_start];
> +   bool use_gather = (brw->gather_pool.bo != NULL);

I would move this closer to the only use. This won't get re-used in the rest of the series.

>
>     for (int i = 0; i < shader->NumUniformBlocks; i++) {
>        struct gl_uniform_buffer_binding *binding;
> @@ -904,6 +905,11 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
>                                               bo->size - binding->Offset,
>                                               &surf_offsets[i],
>                                               dword_pitch);
> +      if (use_gather) {

Or simply:

   if (brw->gather_pool.bo) {

> +         int bt_idx = BRW_UNIFORM_GATHER_INDEX_START + BRW_UBO_GATHER_INDEX_APPEND + i;

Wrap this line.

> +         gen7_update_binding_table(brw, stage_state->stage,
> +                                   bt_idx, surf_offsets[i]);
> +      }
>     }
>
>     if (shader->NumUniformBlocks)
> --
> 1.9.1

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
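The index arithmetic in this patch can be checked with a small standalone sketch. The two `BRW_*` values are copied from the patch; `HW_BT_BLOCK_ENTRIES`, `starts_bt_block()` and `ubo_gather_bt_index()` are hypothetical names introduced here, with the block size of 16 taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the patch under review. */
#define BRW_UNIFORM_GATHER_INDEX_START 32
#define BRW_UBO_GATHER_INDEX_APPEND 2

/* Hypothetical name for the 16-entry block size from the commit message. */
#define HW_BT_BLOCK_ENTRIES 16

/* True when the index is the first entry of a binding table block. */
static bool
starts_bt_block(int index)
{
   return index % HW_BT_BLOCK_ENTRIES == 0;
}

/* Binding table index used for the i-th uniform block, as computed in
 * brw_upload_ubo_surfaces() in the patch. */
static int
ubo_gather_bt_index(int ubo)
{
   assert(ubo >= 0);

   return BRW_UNIFORM_GATHER_INDEX_START + BRW_UBO_GATHER_INDEX_APPEND + ubo;
}
```

With these values the gather region starts at entry 32, which is the first entry of the third 16-entry block, and the first UBO lands at entry 34.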
Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions
On Thu, May 07, 2015 at 07:26:12AM -0700, Jason Ekstrand wrote:
> On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi topi.pohjolai...@intel.com wrote:
> > On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
> > > v2: Get rid of the block parameter and make src a const reference
> > >
> > > Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
> > > Reviewed-by: Matt Turner matts...@gmail.com
> > > Reviewed-by: Kenneth Graunke kenn...@whitecape.org
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75
> > >  1 file changed, 38 insertions(+), 37 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > index 43370cb..9c4ed0b 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > > @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
> > >            operands_match(a, b, negate);
> > >  }
> > >
> > > +static fs_inst *
> > > +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)
> >
> > Did you mean 'src' to be constant reference? It is only used for reading so
> > it could be - you claim this in the commit message yourself :)
>
> Oops... I think what happened is that I tried to do it for
> is_copy_payload not create_copy_instr. But then is_copy_payload does
> actually change it so I put it back and somehow my brain leaked it
> into the commit message. Unfortunately, it's already pushed so I
> can't change it now. However, I could make a fixup if you'd like.
> --Jason

No big deal really, I'm sure compiler handles that for us anyway.

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable
Suggested-by: Axel Davy axel.d...@ens.fr
Signed-off-by: Martin Peres martin.pe...@intel.linux.com
---
 docs/envvars.html | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 31d14a4..c0d5a51 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
 <li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
 <li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
     calls per second.
+<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
 </ul>

--
2.4.0

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
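The documented semantics ("disable DRI3 if set, the value does not matter") correspond to a presence test on the environment variable. The sketch below shows that kind of check with the standard `getenv()` call; `dri3_disabled_by_env()` is a hypothetical helper written for this illustration and is not claimed to be how the Mesa loader implements it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns true when LIBGL_DRI3_DISABLE is present in the environment.
 * Only presence is tested, so "0", "1" and the empty string all count
 * as "set". */
static bool
dri3_disabled_by_env(void)
{
   return getenv("LIBGL_DRI3_DISABLE") != NULL;
}
```

Under this reading, running an application with `LIBGL_DRI3_DISABLE=0` would still disable DRI3, which is worth knowing when debugging end-user reports.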
Re: [Mesa-dev] [PATCH 26/27] i965: Disable gather push constants for null constants
On Tue, Apr 28, 2015 at 11:08:23PM +0300, Abdiel Janulgue wrote:
> Programming null constants with gather constant tables seems to
> be unsupported and results in a GPU lockup even with the prescribed
> GPU workarounds in the bspec. Found out by trial and error that
> disabling HW gather constant when the constant state for a stage
> needs to be nullified is the only way to go around the issue.

Just a general question. We keep resource streamer itself always enabled (except for blorp of course). Does it still do something meaningful without gather constants or should we disable them both?

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.
On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote: On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote: This patch implements the binding table enable command which is also used to allocate a binding table pool where hardware-generated binding table entries are flushed into. Each binding table offset in the binding table pool is unique per each shader stage that are enabled within a batch. Also insert the required brw_tracked_state objects to enable hw-generated binding tables in normal render path. Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 ++ src/mesa/drivers/dri/i965/brw_context.c| 4 ++ src/mesa/drivers/dri/i965/brw_context.h| 5 ++ src/mesa/drivers/dri/i965/brw_state.h | 7 +++ src/mesa/drivers/dri/i965/brw_state_upload.c | 2 + src/mesa/drivers/dri/i965/intel_batchbuffer.c | 4 ++ 6 files changed, 92 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 459165a..a58e32e 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -44,6 +44,11 @@ #include brw_state.h #include intel_batchbuffer.h +/* Somehow the hw-binding table pool offset must start here, otherwise + * the GPU will hang + */ +#define HW_BT_START_OFFSET 256; I think we want to understand this a little better before enabling... + /** * Upload a shader stage's binding table as indirect state. * @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = { .emit = brw_gs_upload_binding_table, }; +/** + * Hardware-generated binding tables for the resource streamer + */ +void +gen7_disable_hw_binding_tables(struct brw_context *brw) +{ + BEGIN_BATCH(3); + OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC 16 | (3 - 2)); + OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) | + brw-is_haswell ? 
HSW_HW_BINDING_TABLE_RESERVED : 0); + OUT_BATCH(0); + ADVANCE_BATCH(); + + /* Pipe control workaround */ + brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE); +} + +void +gen7_enable_hw_binding_tables(struct brw_context *brw) +{ + if (!brw-has_resource_streamer) { + gen7_disable_hw_binding_tables(brw); I started wondering why we really need this - RS is disabled by default and we haven't needed to do anything to disable it before. + return; + } + + if (!brw-hw_bt_pool.bo) { + /* From the BSpec, 3D Pipeline Resource Streamer Hardware Binding Tables: + * + * A maximum of 16,383 Binding tables are allowed in any batch buffer. + */ + int max_size = 16383 * 4; But does it really need this much all the time? I guess I need to go and read the spec. I haven't read through the entire series but it seems that we can calculate (at least for gather constants) pretty accurately how much we need space. Could we do it also here based on the program data of all stages? I maybe missing something and just throwing questions up in the air, so bare with me... ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/27] i965/fs: Append ir_binop_ubo_load entries to the gather table
On Tue, Apr 28, 2015 at 11:08:15PM +0300, Abdiel Janulgue wrote:
> When the const block and offset are immediate values. Otherwise just
> fall-back to the previous method of uploading the UBO constant data to
> GRF using pull constants.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp         | 11
>  src/mesa/drivers/dri/i965/brw_fs.h           |  4 ++
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +++-
>  3 files changed, 100 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 071ac59..031d807 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2273,6 +2273,7 @@ fs_visitor::assign_constant_locations()
>     }
>     stage_prog_data->nr_params = 0;
> +   stage_prog_data->nr_ubo_params = ubo_uniforms;
>     unsigned const_reg_access[uniforms];
>     memset(const_reg_access, 0, sizeof(const_reg_access));
> @@ -2302,6 +2303,16 @@ fs_visitor::assign_constant_locations()
>        stage_prog_data->gather_table[p].channel_mask = const_reg_access[i];
>     }
> +
> +   for (unsigned i = 0; i < this->nr_ubo_gather_table; i++) {
> +      int p = stage_prog_data->nr_gather_table++;
> +      stage_prog_data->gather_table[p].reg = this->ubo_gather_table[i].reg;
> +      stage_prog_data->gather_table[p].channel_mask = this->ubo_gather_table[i].channel_mask;
> +      stage_prog_data->gather_table[p].const_block = this->ubo_gather_table[i].const_block;
> +      stage_prog_data->gather_table[p].const_offset = this->ubo_gather_table[i].const_offset;
> +      stage_prog_data->max_ubo_const_block = MAX2(stage_prog_data->max_ubo_const_block,
> +                                                  this->ubo_gather_table[i].const_block);

These are all overflowing 80 columns.

> +   }
>  }
>  /**
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 32063f0..a48b2bb 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -417,6 +417,7 @@ public:
>     void setup_uniform_values(ir_variable *ir);
>     void setup_builtin_uniform_values(ir_variable *ir);
>     int implied_mrf_writes(fs_inst *inst);
> +   bool generate_ubo_gather_table(ir_expression* ir);
>     virtual void dump_instructions();
>     virtual void dump_instructions(const char *name);
> @@ -445,6 +446,9 @@ public:
>     /** Total number of direct uniforms we can get from NIR */
>     unsigned num_direct_uniforms;
> +   /** Number of ubo uniform variable components visited. */
> +   unsigned ubo_uniforms;
> +
>     /** Byte-offset for the next available spot in the scratch space buffer. */
>     unsigned last_scratch;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 4e99366..11e608b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1179,11 +1179,18 @@ fs_visitor::visit(ir_expression *ir)
>        emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, this->result, op[0], op[1]);
>        break;
>     case ir_binop_ubo_load: {
> +      /* Use gather push constants if at all possible, otherwise just
> +       * fall back to pull constants for UBOs
> +       */
> +      if (generate_ubo_gather_table(ir))
> +         break;
> +
>        /* This IR node takes a constant uniform block and a constant or
>         * variable byte offset within the block and loads a vector from that.
>         */
>        ir_constant *const_uniform_block = ir->operands[0]->as_constant();
>        ir_constant *const_offset = ir->operands[1]->as_constant();
> +

Not part of this patch.

>        fs_reg surf_index;
>        if (const_uniform_block) {
> @@ -4144,6 +4151,79 @@ fs_visitor::resolve_bool_comparison(ir_rvalue *rvalue, fs_reg *reg)
>     *reg = neg_result;
>  }
> +bool
> +fs_visitor::generate_ubo_gather_table(ir_expression *ir)
> +{
> +   ir_constant *const_uniform_block = ir->operands[0]->as_constant();
> +   ir_constant *const_offset = ir->operands[1]->as_constant();

These are only used for reading, let's use constant pointers.

> +
> +   if (ir->operation != ir_binop_ubo_load ||
> +       !brw->has_resource_streamer      ||
> +       !brw->fs_ubo_gather              ||
> +       !const_uniform_block             ||

Not really the style used elsewhere, don't align ||.

> +       !const_offset)
> +      return false;
> +
> +   /* Only allow 16 registers (128 uniform components) as push constants.
> +    */

Move the comment closing to the previous line.

> +   unsigned int max_push_components = 16 * 8;
> +   unsigned param_index = uniforms + ubo_uniforms;

These could both be declared as const.

> +   if ((param_index + ir->type->vector_elements) >= max_push_components)
> +      return false;
> +
> +   fs_reg reg;
> +   if (dispatch_width == 16) {
> +      for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {
> +         if
Re: [Mesa-dev] [PATCH 19/27] i965/fs/nir: Append nir_intrinsic_load_ubo entries to the gather table
On Tue, Apr 28, 2015 at 11:08:16PM +0300, Abdiel Janulgue wrote:
> When the const block and offset are immediate values. Otherwise just
> fall-back to the previous method of uploading the UBO constant data to
> GRF using pull constants.
>
> Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h       |  2 ++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 59
>  2 files changed, 61 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index a48b2bb..5247fa1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -418,6 +418,8 @@ public:
>     void setup_builtin_uniform_values(ir_variable *ir);
>     int implied_mrf_writes(fs_inst *inst);
>     bool generate_ubo_gather_table(ir_expression* ir);
> +   bool nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg dest,
> +                                      bool has_indirect);
>     virtual void dump_instructions();
>     virtual void dump_instructions(const char *name);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 3972581..b68f221 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1377,6 +1377,9 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>        has_indirect = true;
>        /* fallthrough */
>     case nir_intrinsic_load_ubo: {
> +      if (nir_generate_ubo_gather_table(instr, dest, has_indirect))
> +         break;
> +
>        nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);
>        fs_reg surf_index;
> @@ -1774,3 +1777,59 @@ fs_visitor::nir_emit_jump(nir_jump_instr *instr)
>        unreachable("unknown jump");
>     }
>  }
> +
> +bool
> +fs_visitor::nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg dest,
> +                                          bool has_indirect)
> +{
> +   nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);

Used only for reading, const.

> +
> +   if (!const_index || has_indirect || !brw->fs_ubo_gather || !brw->has_resource_streamer)

Wrap this line.

> +      return false;
> +
> +   /* Only allow 16 registers (128 uniform components) as push constants.
> +   */
> +   unsigned int max_push_components = 16 * 8;
> +   unsigned param_index = uniforms + ubo_uniforms;

These would be nicer as constants.

> +   if ((MAX2(param_index, num_direct_uniforms) +
> +       instr->num_components) > max_push_components)
> +      return false;
> +
> +   fs_reg uniform_reg;
> +   if (dispatch_width == 16) {
> +      for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {

Extra space.

> +         if ((this->ubo_gather_table[i].const_block ==
> +              const_index->u[0]) &&
> +             (this->ubo_gather_table[i].const_offset ==
> +              (unsigned) instr->const_index[0])) {

Here also.

> +            uniform_reg = fs_reg(UNIFORM, this->ubo_gather_table[i].reg);
> +            break;
> +         }
> +      }
> +      if (uniform_reg.file != UNIFORM) {
> +         /* Unlikely but this means that SIMD8 wasn't able to allocate push constant

Wrap this line.

> +          * registers for this ubo load. Fall back to pull-constant method.
> +          */
> +         return false;
> +      }
> +   }
> +
> +   if (uniform_reg.file != UNIFORM) {
> +      uniform_reg = fs_reg(UNIFORM, param_index);
> +      int gather = this->nr_ubo_gather_table++;
> +
> +      assert(instr->num_components <= 4);
> +      ubo_uniforms += instr->num_components;
> +      this->ubo_gather_table[gather].reg = uniform_reg.reg;
> +      this->ubo_gather_table[gather].const_block = const_index->u[0];
> +      this->ubo_gather_table[gather].const_offset = instr->const_index[0];
> +   }
> +
> +   for (unsigned j = 0; j < instr->num_components; j++) {
> +      fs_reg src = offset(retype(uniform_reg, dest.type), j);
> +      emit(MOV(dest, src));
> +      dest = offset(dest, 1);
> +   }
> +
> +   return true;
> +}
> --
> 1.9.1
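The control flow of nir_generate_ubo_gather_table() can be hard to follow inside a diff, so here is a rough standalone sketch of the same idea in plain C: the first (SIMD8) pass allocates a push-constant slot per constant UBO load, and the second (SIMD16) pass only reuses slots the first pass recorded, returning false so the caller falls back to pull constants otherwise. All names here (gather_state, gather_ubo_load, MAX_GATHER_ENTRIES) are hypothetical and not Mesa's. The sketch also simplifies the patch: it checks the 128-component budget only when allocating, and it ignores num_direct_uniforms.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PUSH_COMPONENTS (16 * 8)  /* 16 registers x 8 components, as in the patch */
#define MAX_GATHER_ENTRIES 32         /* arbitrary table bound for this sketch */

struct gather_entry {
   unsigned reg;           /* first uniform component assigned to this load */
   unsigned const_block;   /* constant UBO block index */
   unsigned const_offset;  /* constant byte offset within the block */
};

struct gather_state {
   struct gather_entry table[MAX_GATHER_ENTRIES];
   unsigned nr_entries;
   unsigned uniforms;      /* regular uniform components already assigned */
   unsigned ubo_uniforms;  /* UBO components pushed so far */
};

/* Returns true and stores the uniform register in *reg_out when the load can
 * be served from push constants; false means "use pull constants instead". */
bool gather_ubo_load(struct gather_state *s, bool second_pass,
                     unsigned block, unsigned offset,
                     unsigned num_components, unsigned *reg_out)
{
   assert(num_components <= 4);

   if (second_pass) {
      /* Second pass: only reuse what the first pass managed to allocate. */
      for (unsigned i = 0; i < s->nr_entries; i++) {
         if (s->table[i].const_block == block &&
             s->table[i].const_offset == offset) {
            *reg_out = s->table[i].reg;
            return true;
         }
      }
      return false;
   }

   /* First pass: allocate a new slot if the push-constant budget allows. */
   unsigned param_index = s->uniforms + s->ubo_uniforms;
   if (param_index + num_components > MAX_PUSH_COMPONENTS ||
       s->nr_entries == MAX_GATHER_ENTRIES)
      return false;

   struct gather_entry *e = &s->table[s->nr_entries++];
   e->reg = param_index;
   e->const_block = block;
   e->const_offset = offset;
   s->ubo_uniforms += num_components;
   *reg_out = param_index;
   return true;
}
```

Keeping the two passes in one function mirrors the patch; splitting allocation and lookup into separate helpers would be an equally reasonable design.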
[Mesa-dev] [PATCH] i965/fs: Don't forget the force_sechalf flag in lower_load_payload().
Regression from commit 41868bb6824c6106a55c8442006c1e2215abf567. Fixes a bunch of ARB_shader_image_load_store tests.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7e4ead0..0a62e46 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3512,6 +3512,7 @@ fs_visitor::lower_load_payload()
             fs_inst *mov = MOV(retype(dst, inst->src[i].type), inst->src[i]);
             mov->force_writemask_all = inst->force_writemask_all;
+            mov->force_sechalf = inst->force_sechalf;
             inst->insert_before(block, mov);
          }
          dst = offset(dst, 1);
--
2.3.5
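The fix above makes each MOV created by the lowering inherit one more execution flag from the instruction being lowered. As a minimal illustration of that pattern, and not Mesa's actual data structures, the sketch below groups the two flags named in the patch into a hypothetical struct and copies the whole group at once, so a flag added later cannot be left out of the copy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the execution-control flags on an instruction. */
struct exec_flags {
   bool force_writemask_all;
   bool force_sechalf;
};

/* Give each of the n MOVs produced by a lowering the flags of the
 * instruction being lowered, copying the whole flag group at once. */
void inherit_exec_flags(struct exec_flags *movs, size_t n,
                        const struct exec_flags *lowered)
{
   for (size_t i = 0; i < n; i++)
      movs[i] = *lowered;
}
```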
Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd
I'm not sure what the final consensus will be on how to do this, but FWIW:

Tested-By: Aaron Watry awa...@gmail.com

I've tested this with 4 combinations:

no --with-opencl-icd option specified: libOpenCL.so gets installed in ${prefix}/lib
--with-opencl-icd=no: libOpenCL.so gets installed in ${prefix}/lib
--with-opencl-icd=standard: libMesaOpenCL.so installed in ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
--with-opencl-icd=sysconfdir: libMesaOpenCL.so installed in ${prefix}/lib, icd in ${prefix}/etc//mesa.icd

I only specified --prefix; no other directories were overridden in the configure command.

--Aaron

On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:
> The standard ICD file path is /etc/OpenCL/vendors/. However it doesn't
> fit well with custom builds. This option allows overriding the ICD vendor
> file installation path.
> ---
>  configure.ac                           | 46 +++---
>  src/gallium/targets/opencl/Makefile.am |  2 +-
>  2 files changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 095e23e..90dba4e 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
>          [enable OpenCL library @:@default=disabled@:@])],
>      [enable_opencl=$enableval],
>      [enable_opencl=no])
> -AC_ARG_ENABLE([opencl_icd],
> -   [AS_HELP_STRING([--enable-opencl-icd],
> -          [Build an OpenCL ICD library to be loaded by an ICD implementation
> -           @:@default=disabled@:@])],
> -    [enable_opencl_icd=$enableval],
> -    [enable_opencl_icd=no])
>  AC_ARG_ENABLE([xlib-glx],
>      [AS_HELP_STRING([--enable-xlib-glx],
>          [make GLX library Xlib-based instead of DRI-based @:@default=disabled@:@])],
> @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
>      # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
>      enable_gallium_loader=yes
> -    if test x$enable_opencl_icd = xyes; then
> -        OPENCL_LIBNAME=MesaOpenCL
> -    else
> -        OPENCL_LIBNAME=OpenCL
> -    fi
> -
>      if test x$have_libelf != xyes; then
>          AC_MSG_ERROR([Clover requires libelf])
>      fi
>  fi
>  AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
> -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
> -AC_SUBST([OPENCL_LIBNAME])
>  dnl
>  dnl Gallium configuration
> @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
>      [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
> +dnl OpenCL ICD
> +
> +AC_ARG_WITH([opencl-icd],
> +    [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
> +        [Build an OpenCL ICD library to be loaded by an ICD implementation.
> +         If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors.
> +         @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors
> +         @:@default=no@:@])],
> +    [OPENCL_ICD=$withval],
> +    [OPENCL_ICD=no])
> +
> +case x$OPENCL_ICD in
> +xno)
> +    OPENCL_LIBNAME=OpenCL
> +    ;;
> +xstandard)
> +    OPENCL_LIBNAME=MesaOpenCL
> +    ICD_FILE_DIR=/etc/OpenCL/vendors
> +    ;;
> +xsysconfdir)
> +    OPENCL_LIBNAME=MesaOpenCL
> +    ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
> +    ;;
> +*)
> +    AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd])
> +    ;;
> +esac
> +
> +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
> +AC_SUBST([OPENCL_LIBNAME])
> +AC_SUBST([ICD_FILE_DIR])
> +
>  dnl
>  dnl Gallium helper functions
>  dnl
> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
> index 5daf327..781daa0 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym
>  EXTRA_DIST = mesa.icd opencl.sym
>  if HAVE_CLOVER_ICD
> -icddir = /etc/OpenCL/vendors/
> +icddir = $(ICD_FILE_DIR)
>  icd_DATA = mesa.icd
>  endif
> --
> 2.1.0
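The case statement in the patch above is effectively a small decision table from the --with-opencl-icd value to a library name and an ICD file directory. Purely as an illustration of that table, here it is restated as a C function; the struct and function names are invented for this sketch, and the real logic lives in configure.ac as shell, not in C.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct icd_config {
   const char *libname;  /* corresponds to OPENCL_LIBNAME */
   char icd_dir[256];    /* corresponds to ICD_FILE_DIR; empty if no ICD file is installed */
};

/* Maps the option value to the two outputs of the configure case statement.
 * Returns false for values the patch rejects with AC_MSG_ERROR. */
bool resolve_opencl_icd(const char *opt, const char *sysconfdir,
                        struct icd_config *out)
{
   out->libname = NULL;
   out->icd_dir[0] = '\0';

   if (strcmp(opt, "no") == 0) {
      out->libname = "OpenCL";
   } else if (strcmp(opt, "standard") == 0) {
      out->libname = "MesaOpenCL";
      snprintf(out->icd_dir, sizeof(out->icd_dir), "/etc/OpenCL/vendors");
   } else if (strcmp(opt, "sysconfdir") == 0) {
      out->libname = "MesaOpenCL";
      snprintf(out->icd_dir, sizeof(out->icd_dir), "%s/OpenCL/vendors", sysconfdir);
   } else {
      return false;
   }
   return true;
}
```

Only the "standard" row writes outside the configured prefix, which is the point the related --with-icd-file-dir thread argues about.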
Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option
On Thu, May 7, 2015 at 3:59 AM, Michel Dänzer mic...@daenzer.net wrote:
> On 05.05.2015 01:47, Tom Stellard wrote:
>> On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
>>> On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote:
>>>> On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
>>>>> On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote:
>>>>>> The standard ICD file path is /etc/OpenCL/vendors/. However it
>>>>>> doesn't fit well with custom builds. This option allows overriding
>>>>>> the ICD vendor file installation path.
>>>>>> ---
>>>>>>  configure.ac                           | 6 ++
>>>>>>  src/gallium/targets/opencl/Makefile.am | 2 +-
>>>>>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/configure.ac b/configure.ac
>>>>>> index 095e23e..bf08d76 100644
>>>>>> --- a/configure.ac
>>>>>> +++ b/configure.ac
>>>>>> @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
>>>>>>      [D3D_DRIVER_INSTALL_DIR=$withval],
>>>>>>      [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
>>>>>>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
>>>>>> +AC_ARG_WITH([icd-file-dir],
>>>>>> +    [AS_HELP_STRING([--with-icd-file-dir=DIR],
>>>>>> +        [directory for the OpenCL ICD vendor file @:@/etc/OpenCL/vendors@:@])],
>>>>>> +    [ICD_FILE_INSTALL_DIR=$withval],
>>>>>> +    [ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors])
>>>>>
>>>>> What about making this default to ${sysconfdir}/OpenCL/vendors? That
>>>>> way using --prefix should auto-make it go into the prefix instead of
>>>>> unexpectedly installing things outside of the specified prefix? That
>>>>> way a distro build which specifies --sysconfdir as /etc will get it in
>>>>> the right place, while by default it'll go into /usr/local/etc and a
>>>>> user can override the icd loader's default behaviour with
>>>>> OPENCL_VENDOR_PATH?
>>>>
>>>> I would prefer not to make this the default behavior, because it
>>>> violates the spec and there could potentially be multiple icd
>>>> implementations, which may or may not have the overrides.
>>>>
>>>> I think the best solution would be to rename the option to something
>>>> like --enable-ocl-icd-respect-prefix (suggestions for other names
>>>> encouraged) and have the option enable the behavior that Ilia is
>>>> describing. This will give distros and advanced users a way to set up
>>>> their system the way they want.
>>>
>>> It's just a very anti-autoconf thing to do to have make install fail by
>>> default unless you specify some "hey, I actually want make install to
>>> work" option. I think it's crazy to expect that, by default, people will
>>> want to write over their system installs, and having things go outside
>>> of the specified --prefix is very surprising (unless you force some
>>> other option). And asking the user to run make install as root is even
>>> crazier.
>>
>> My expectation is that, by default, when people specify
>> --enable-opencl-icd they want an implementation that conforms to the
>> specification. Unfortunately, this means installing icd files to /etc.
>> There is no good solution here, but I'd rather have users specify a flag
>> to get a sane build system than require them to set a flag and set an
>> environment variable just to get working OpenCL with the ICD loader.
>>
>>> I guess I haven't hit this yet because there's no OpenCL support in
>>> nouveau or freedreno, but I made the same stink about vdpau when Emil
>>> tried to make it install to some system location by default. At least a
>>> few people seemed to agree with me back then...
>>
>> Does the vdpau spec also require installation to a specific system
>> directory (e.g. /etc/)?
>
> Tom, I think ensuring that the OpenCL ICD loader can pick up the mesa.icd
> file is something for the distributor / administrator / user to worry
> about, not Mesa upstream. There's a similar situation with the drirc file,
> which is installed inside the prefix by default but only read from /etc/.

FTR, I fully agree with this assessment (it's the distributor's problem), but my main priority was making sure make install works.

Cheers,

-ilia
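Part of the disagreement in this thread is about where an ICD loader looks for vendor files: /etc/OpenCL/vendors by default, with OPENCL_VENDOR_PATH mentioned as an override. The sketch below shows that lookup rule in its simplest form. It is an assumption-laden illustration, not the code of any real loader: the function names are hypothetical, the override is treated as a single directory, and actual loaders may differ in whether and how they honor the variable, which is exactly the concern raised above.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEFAULT_ICD_VENDOR_DIR "/etc/OpenCL/vendors"

/* Picks the directory to scan for .icd files: a non-empty override wins,
 * otherwise the default system location is used. */
const char *icd_vendor_dir(const char *env_override)
{
   if (env_override != NULL && env_override[0] != '\0')
      return env_override;
   return DEFAULT_ICD_VENDOR_DIR;
}

/* Convenience wrapper reading the override from the environment. */
const char *icd_vendor_dir_from_env(void)
{
   return icd_vendor_dir(getenv("OPENCL_VENDOR_PATH"));
}
```

Under this rule, a mesa.icd installed into ${prefix}/etc/OpenCL/vendors is only found when the user points the override at that directory, while one installed into /etc is found without any configuration.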