date:20150507

[Mesa-dev] [Bug 80183] [llvmpipe] triangles with vertices that map to raster positions viewport width/height are not displayed

2015-05-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=80183

--- Comment #14 from cgerlac...@gmail.com ---
The problem is also reproducible with softpipe.

I understand your concern and I will try to provide some sample code to reproduce
the clipping error.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] util: Take memset out of rzalloc_size()

2015-05-07 Thread Juha-Pekka Heikkila

On 06.05.2015 21:51, Rob Clark wrote:
 On Wed, May 6, 2015 at 1:24 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Wednesday, May 06, 2015 03:35:27 PM Juha-Pekka Heikkila wrote:
 rzalloc_size() calls ralloc_size() to allocate memory. ralloc_size()
 uses calloc to get memory, so zeroing in rzalloc_size() is not
 necessary.

 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/util/ralloc.c | 2 --
  1 file changed, 2 deletions(-)

 diff --git a/src/util/ralloc.c b/src/util/ralloc.c
 index 01719c8..09f5fcd 100644
 --- a/src/util/ralloc.c
 +++ b/src/util/ralloc.c
 @@ -132,8 +132,6 @@ void *
  rzalloc_size(const void *ctx, size_t size)
  {
 void *ptr = ralloc_size(ctx, size);
 -   if (likely(ptr != NULL))
 -  memset(ptr, 0, size);
 return ptr;
  }



 Wow, I have no idea why I did that.  This is certainly
 counter-intuitive.

 rzalloc() is supposed to guarantee zeroed memory.  ralloc() is not, but
 it looks like it always has for some reason.  I'm somewhat inclined to
 change ralloc_size() to use malloc instead of calloc.

 I wonder how many things would break :)

 
 try the change conditionally ifndef DEBUG??  (abusing --enable-debug
 as a proxy for --im-actually-a-mesa-dev-and-want-to-see-the-crashes)
 
 

I did try putting malloc in place of calloc and saw almost all Piglit
tests start to fail. A handful of tests still worked, but there were
crashes in many different places, so I thought I would first suggest
just taking the memset out. :)

/Juha-Pekka
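
As an illustration of the distinction Kenneth describes (a zeroing entry point that guarantees zeroed memory regardless of how the plain allocator obtains it), here is a minimal sketch in C. The names sketch_alloc, sketch_zalloc and sketch_is_zeroed are hypothetical and are not part of ralloc; the sketch also ignores ralloc's context/parent handling entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical plain allocator: makes no promise about the contents. */
void *
sketch_alloc(size_t size)
{
   return malloc(size);
}

/* Hypothetical zeroing wrapper: the zero guarantee lives here, so it
 * does not depend on how sketch_alloc() obtains its memory. */
void *
sketch_zalloc(size_t size)
{
   void *ptr = sketch_alloc(size);
   if (ptr != NULL)
      memset(ptr, 0, size);
   return ptr;
}

/* Returns 1 if the first `size` bytes at ptr are all zero, else 0. */
int
sketch_is_zeroed(const void *ptr, size_t size)
{
   const unsigned char *bytes = ptr;
   assert(ptr != NULL);
   for (size_t i = 0; i < size; i++) {
      if (bytes[i] != 0)
         return 0;
   }
   return 1;
}
```

With this shape, switching sketch_alloc() between malloc and calloc would not change what callers of sketch_zalloc() can rely on; only callers that assumed zeroed memory from the plain allocator would be affected.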

Re: [Mesa-dev] [PATCH] glapi: Add positional argument specifier.

2015-05-07 Thread Vinson Lee

On Wed, May 6, 2015 at 4:35 PM, Ian Romanick i...@freedesktop.org wrote:
 On 05/06/2015 03:45 PM, Kenneth Graunke wrote:
 On Wednesday, May 06, 2015 12:48:30 PM Vinson Lee wrote:
 Fix build error introduced with commit 1c5a57a "glapi/es3.1: Add support
 for GLES versions > 3.0" with Python < 2.7.

   File "src/mapi/glapi/gen/gl_genexec.py", line 230, in <module>
     printer.Print(api)
   File "src/mapi/glapi/gen/gl_XML.py", line 120, in Print
     self.printBody(api)
   File "src/mapi/glapi/gen/gl_genexec.py", line 187, in printBody
     condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {})'.format(int(f.api_map['es2'] * 10)))
 ValueError: zero length field name in format

 Signed-off-by: Vinson Lee v...@freedesktop.org
 ---
  src/mapi/glapi/gen/gl_genexec.py |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py
 index e58cdfc..4e76fe3 100644
 --- a/src/mapi/glapi/gen/gl_genexec.py
 +++ b/src/mapi/glapi/gen/gl_genexec.py
 @@ -184,7 +184,7 @@ class PrintCode(gl_XML.gl_print_base):
              condition_parts.append('ctx->API == API_OPENGLES')
          if 'es2' in f.api_map:
              if f.api_map['es2'] > 2.0:
 -                condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {})'.format(int(f.api_map['es2'] * 10)))
 +                condition_parts.append('(ctx->API == API_OPENGLES2 && ctx->Version >= {0})'.format(int(f.api_map['es2'] * 10)))
              else:
                  condition_parts.append('ctx->API == API_OPENGLES2')
          if not condition_parts:


 Do we actually care at this point?

 Depends on whether or not you care about CentOS 6 or whatever version
 RHEL it is based on. :(

 https://bugs.freedesktop.org/show_bug.cgi?id=90346


This patch does not fix bug 90346. DispatchSanity_test.GLES2 still
fails on CentOS 6.

 Reviewed-by: Kenneth Graunke kenn...@whitecape.org


Re: [Mesa-dev] [PATCH 2/2] glx: provide a way to disable DRI3 using an environment variable

2015-05-07 Thread Martin Peres


On 06/05/15 19:47, Axel Davy wrote:

On 06/05/2015 14:43, Martin Peres wrote:

  diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index ff77a91..5246737 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -2092,6 +2092,11 @@ dri3_create_display(Display * dpy)
 xcb_generic_error_t  *error;
 const xcb_query_extension_reply_t*extension;
+   if (getenv("MESA_GLX_DRI3_DISABLE")) {
+      ErrorMessageF("DRI3 disabled by the environment\n");
+  return NULL;
+   }
+
 xcb_prefetch_extension_data(c, xcb_dri3_id);
 xcb_prefetch_extension_data(c, xcb_present_id);

There is already a LIBGL_DRI3_DISABLE env var.

Does this one bring something different ?

Yours,

Axel Davy


Thanks Axel! I heard that there was such a variable, but no-one could 
remember the name. I looked for it in the wrong place it would seem!


Let's drop this patch for the moment. If the variable works as expected, 
I would suggest documenting it in envvars.html :)
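
For reference, the check in the quoted patch only tests whether the variable is present in the environment, not what it contains. A minimal sketch of that kind of test follows, assuming a hypothetical helper name env_flag_is_set (this is not Mesa's code, and whether LIBGL_DRI3_DISABLE is evaluated exactly this way is an assumption):

```c
/* Exposes setenv()/unsetenv() for callers that want to toggle the flag. */
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when `name` exists in the environment,
 * whatever its value (even an empty string or "0"). */
bool
env_flag_is_set(const char *name)
{
   assert(name != NULL);
   return getenv(name) != NULL;
}
```

Under this convention, LIBGL_DRI3_DISABLE=0 would still count as set, because only presence is tested.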


Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Michel Dänzer

On 05.05.2015 01:47, Tom Stellard wrote:
 On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
 On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote:
 On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
 On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote:
 The standard ICD file path is /etc/OpenCL/vendors/.
 However, it doesn't fit well with custom builds.
 This option allows overriding the ICD vendor file installation path.
 ---
  configure.ac   | 6 ++
  src/gallium/targets/opencl/Makefile.am | 2 +-
  2 files changed, 7 insertions(+), 1 deletion(-)

 diff --git a/configure.ac b/configure.ac
 index 095e23e..bf08d76 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
  [D3D_DRIVER_INSTALL_DIR=$withval],
  [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
 +AC_ARG_WITH([icd-file-dir],
 +[AS_HELP_STRING([--with-icd-file-dir=DIR],
 +[directory for the OpenCL ICD vendor file 
 @:@/etc/OpenCL/vendors@:@])],
 +[ICD_FILE_INSTALL_DIR=$withval],
 +[ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors])

 What about making this default to ${sysconfdir}/OpenCL/vendors ? That
 way using --prefix should auto-make it go into the prefix instead of
 unexpectedly installing things outside of the specified prefix? That
 way a distro build which specifies --sysconfdir as /etc will get it in
 the right place, while by default it'll go into /usr/local/etc and a
 user can override the icd loader's default behaviour with
 OPENCL_VENDOR_PATH?


 I would prefer not to make this the default behavior, because it violates 
 the spec
 and there could potentially be multiple icd implementations, which may or 
 may not have
 the overrides.

 I think the best solution would be to rename the option to something like
 --enable-ocl-icd-respect-prefix (suggestions for other names encouraged).
 and have the option enable the behavior that Ilia is describing.

 This will give distros and advanced users a way to setup their system
 the way they want.

 It's just a very anti-autoconf thing to do to have make install fail
 by default unless you specify some hey, i actually want make install
 to work option.

 I think it's crazy to expect that, by default, people will want to
 write over their system installs, and having things go outside of the
 specified --prefix is very surprising (unless you force some other
 option). And asking the user to run make install as root is even
 crazier.

 
 My expectation is that, by default, when people specify --enable-opencl-icd
 they want an implementation that conforms to the specification.
 Unfortunately, this means installing icd files to /etc.
 
 There is no good solution here, but I'd rather have users specify a flag
 to get a sane build system, than requiring them to set a flag and set
 an environment variable just to get working OpenCL with the ICD loader.
 
 I guess I haven't hit this yet because there's no OpenCL support in
 nouveau or freedreno, but I made the same stink about vdpau when Emil
 tried to make it install to some system location by default. At least
 a few people seemed to agree with me back then...

 
 Does the vdpau spec also require installation to a specific system directory
 (e.g. /etc/) ?

Tom, I think ensuring that the OpenCL ICD loader can pick up the
mesa.icd file is something for the distributor / administrator / user to
worry about, not Mesa upstream.

There's a similar situation with the drirc file, which is installed
inside the prefix by default but only read from /etc/.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Thu, May 7, 2015 at 11:06 AM, Neil Roberts n...@linux.intel.com wrote:
 Commit 94ee908448 added a header size parameter to the function to
 create the LOAD_PAYLOAD instruction. However this broke
 opt_sampler_eot which manually constructs the instruction and so
 wasn't setting the header_size. This ends up making the parameters for
 the send message all have the wrong location and it all falls apart.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 3bf5866..02a1ad5 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
  load_payload->sources + 1);

 new_load_payload->regs_written = load_payload->regs_written + 1;
 +   new_load_payload->header_size = 1;
 tex_inst->mlen++;
 tex_inst->header_size = 1;
 tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
 --
 1.9.3


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread EdB


On 2015-05-07 18:55, Aaron Watry wrote:

I'm not sure what the final consensus will be on how to do this, but
FWIW:
Tested-By: Aaron Watry awa...@gmail.com

I've tested this with 4 combinations:

1. No --with-opencl-icd option specified: libOpenCL.so gets installed in ${prefix}/lib
2. --with-opencl-icd=no: libOpenCL.so gets installed in ${prefix}/lib
3. --with-opencl-icd=standard: libMesaOpenCL.so installed in ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
4. --with-opencl-icd=sysconfdir: libMesaOpenCL.so installed in ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified --prefix, no other directories overridden in configure command.



thanks

  EdB


--Aaron

 

On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:


The standard ICD file path is /etc/OpenCL/vendors/.
However, it doesn't fit well with custom builds.
This option allows overriding the ICD vendor file installation path.
---
 configure.ac                           | 46 +++---
 src/gallium/targets/opencl/Makefile.am |  2 +-
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/configure.ac b/configure.ac
index 095e23e..90dba4e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
          [enable OpenCL library @:@default=disabled@:@])],
    [enable_opencl=$enableval],
    [enable_opencl=no])
-AC_ARG_ENABLE([opencl_icd],
-   [AS_HELP_STRING([--enable-opencl-icd],
-          [Build an OpenCL ICD library to be loaded by an ICD
implementation
-           @:@default=disabled@:@])],
-    [enable_opencl_icd=$enableval],
-    [enable_opencl_icd=no])
 AC_ARG_ENABLE([xlib-glx],
     [AS_HELP_STRING([--enable-xlib-glx],
         [make GLX library Xlib-based instead of DRI-based
@:@default=disabled@:@])],
@@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
     # XXX: Use $enable_shared_pipe_drivers once converted to
use static/shared pipe-drivers
     enable_gallium_loader=yes

-    if test x$enable_opencl_icd = xyes; then
-        OPENCL_LIBNAME=MesaOpenCL
-    else
-        OPENCL_LIBNAME=OpenCL
-    fi
-
     if test x$have_libelf != xyes; then
        AC_MSG_ERROR([Clover requires libelf])
     fi
 fi
 AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
-AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
-AC_SUBST([OPENCL_LIBNAME])

 dnl
 dnl Gallium configuration
@@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
     [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
 AC_SUBST([D3D_DRIVER_INSTALL_DIR])

+dnl OpenCL ICD
+
+AC_ARG_WITH([opencl-icd],
+    [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
+        [Build an OpenCL ICD library to be loaded by an ICD implementation.
+         If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors.
+         @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors
+         @:@default=no@:@])],
+    [OPENCL_ICD=$withval],
+    [OPENCL_ICD=no])
+
+case x$OPENCL_ICD in
+xno)
+    OPENCL_LIBNAME=OpenCL
+    ;;
+xstandard)
+    OPENCL_LIBNAME=MesaOpenCL
+    ICD_FILE_DIR=/etc/OpenCL/vendors
+    ;;
+xsysconfdir)
+    OPENCL_LIBNAME=MesaOpenCL
+    ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
+    ;;
+*)
+    AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
--with-opencl-icd])
+    ;;
+esac
+
+AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
+AC_SUBST([OPENCL_LIBNAME])
+AC_SUBST([ICD_FILE_DIR])
+
 dnl
 dnl Gallium helper functions
 dnl
diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
index 5daf327..781daa0 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym
 EXTRA_DIST = mesa.icd opencl.sym

 if HAVE_CLOVER_ICD
-icddir = /etc/OpenCL/vendors/
+icddir = $(ICD_FILE_DIR)
 icd_DATA = mesa.icd
 endif

--
2.1.0


Re: [Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 04:39:14 PM Topi Pohjolainen wrote:
 Normally this is always needed, but for internal blits and clears
 we need to be able to disable it.
 
 CC: Kenneth Graunke kenn...@whitecape.org
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com

Reviewed-by: Kenneth Graunke kenn...@whitecape.org



Re: [Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 05:34:13 PM Martin Peres wrote:
 Suggested-by: Axel Davy axel.d...@ens.fr
 Signed-off-by: Martin Peres martin.pe...@intel.linux.com
 ---
  docs/envvars.html | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/docs/envvars.html b/docs/envvars.html
 index 31d14a4..c0d5a51 100644
 --- a/docs/envvars.html
 +++ b/docs/envvars.html
 @@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
  <li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
  <li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
  calls per second.
 +<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
  </ul>

Documentation?!? :)  Always nice to have.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org



Re: [Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Anuj Phogat

On Thu, May 7, 2015 at 6:20 AM, Neil Roberts n...@linux.intel.com wrote:
 opt_sampler_eot enables a direct write to framebuffer from a sample.
 In order to do this the sample message needs to have a message header
 so if there wasn't one already then the function adds one. In addition
 the function sets the destination register to null because it's no
 longer used. However it was only doing this in cases where it was
 adding a message header. This patch just moves setting the destination
 so that it happens even if there's a message header. In practice this
 doesn't seem to make any difference but it's a bit cleaner.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 1ca7ca6..72d408b 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()

 tex_inst->offset |= fb_write->target << 24;
 tex_inst->eot = true;
 +   tex_inst->dst = reg_null_ud;
 fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);

 /* If a header is present, marking the eot is sufficient. Otherwise, we need
 @@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
 tex_inst->header_present = true;
 tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
 tex_inst->src[0] = send_header;
 -   tex_inst->dst = reg_null_ud;

 return true;
  }
 --
 1.9.3


LGTM.
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com

[Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Neil Roberts

Commit 94ee908448 added a header size parameter to the function to
create the LOAD_PAYLOAD instruction. However this broke
opt_sampler_eot which manually constructs the instruction and so
wasn't setting the header_size. This ends up making the parameters for
the send message all have the wrong location and it all falls apart.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3bf5866..02a1ad5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
 load_payload->sources + 1);
 
new_load_payload->regs_written = load_payload->regs_written + 1;
+   new_load_payload->header_size = 1;
tex_inst->mlen++;
tex_inst->header_size = 1;
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
-- 
1.9.3


Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Tom Stellard

On Thu, May 07, 2015 at 04:59:41PM +0900, Michel Dänzer wrote:
 On 05.05.2015 01:47, Tom Stellard wrote:
  On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
  On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote:
  On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
  On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote:
  The standard ICD file path is /etc/OpenCL/vendors/.
  However, it doesn't fit well with custom builds.
  This option allows overriding the ICD vendor file installation path.
  ---
   configure.ac   | 6 ++
   src/gallium/targets/opencl/Makefile.am | 2 +-
   2 files changed, 7 insertions(+), 1 deletion(-)
 
  diff --git a/configure.ac b/configure.ac
  index 095e23e..bf08d76 100644
  --- a/configure.ac
  +++ b/configure.ac
  @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
   [D3D_DRIVER_INSTALL_DIR=$withval],
   [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
   AC_SUBST([D3D_DRIVER_INSTALL_DIR])
  +AC_ARG_WITH([icd-file-dir],
  +[AS_HELP_STRING([--with-icd-file-dir=DIR],
  +[directory for the OpenCL ICD vendor file 
  @:@/etc/OpenCL/vendors@:@])],
  +[ICD_FILE_INSTALL_DIR=$withval],
  +[ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors])
 
  What about making this default to ${sysconfdir}/OpenCL/vendors ? That
  way using --prefix should auto-make it go into the prefix instead of
  unexpectedly installing things outside of the specified prefix? That
  way a distro build which specifies --sysconfdir as /etc will get it in
  the right place, while by default it'll go into /usr/local/etc and a
  user can override the icd loader's default behaviour with
  OPENCL_VENDOR_PATH?
 
 
  I would prefer not to make this the default behavior, because it violates 
  the spec
  and there could potentially be multiple icd implementations, which may or 
  may not have
  the overrides.
 
  I think the best solution would be to rename the option to something like
  --enable-ocl-icd-respect-prefix (suggestions for other names encouraged).
  and have the option enable the behavior that Ilia is describing.
 
  This will give distros and advanced users a way to setup their system
  the way they want.
 
  It's just a very anti-autoconf thing to do to have make install fail
  by default unless you specify some hey, i actually want make install
  to work option.
 
  I think it's crazy to expect that, by default, people will want to
  write over their system installs, and having things go outside of the
  specified --prefix is very surprising (unless you force some other
  option). And asking the user to run make install as root is even
  crazier.
 
  
  My expectation is that, by default, when people specify --enable-opencl-icd
  they want an implementation that conforms to the specification.
  Unfortunately, this means installing icd files to /etc.
  
  There is no good solution here, but I'd rather have users specify a flag
  to get a sane build system, than requiring them to set a flag and set
  an environment variable just to get working OpenCL with the ICD loader.
  
  I guess I haven't hit this yet because there's no OpenCL support in
  nouveau or freedreno, but I made the same stink about vdpau when Emil
  tried to make it install to some system location by default. At least
  a few people seemed to agree with me back then...
 
  
  Does the vdpau spec also require installation to a specific system directory
  (e.g. /etc/) ?
 
 Tom, I think ensuring that the OpenCL ICD loader can pick up the
 mesa.icd file is something for the distributor / administrator / user to
 worry about, not Mesa upstream.
 

I don't really disagree with this in general.  My position is that when
there is a situation where it is impossible to follow both the API spec
and build system best practices that it is more important to follow the
API spec.

I realize some people disagree with this, and I completely understand
their rationale.

For this particular situation, I'm happy with any solution that:

1. Allows a user to install the icd file to /etc if he or she wants to.
and
2. Does not require the user to read the spec to know that /etc is the
correct place to install it.

I think EdB's latest patch is a good solution:
http://lists.freedesktop.org/archives/mesa-dev/2015-May/083661.html

-Tom

 There's a similar situation with the drirc file, which is installed
 inside the prefix by default but only read from /etc/.
 
 
 -- 
 Earthling Michel Dänzer   |   http://www.amd.com
 Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Anuj Phogat

On Thu, May 7, 2015 at 11:06 AM, Neil Roberts n...@linux.intel.com wrote:
 Commit 94ee908448 added a header size parameter to the function to
 create the LOAD_PAYLOAD instruction. However this broke
 opt_sampler_eot which manually constructs the instruction and so
 wasn't setting the header_size. This ends up making the parameters for
 the send message all have the wrong location and it all falls apart.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 3bf5866..02a1ad5 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
  load_payload->sources + 1);

 new_load_payload->regs_written = load_payload->regs_written + 1;
 +   new_load_payload->header_size = 1;
 tex_inst->mlen++;
 tex_inst->header_size = 1;
 tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
 --
 1.9.3


Reviewed-by: Anuj Phogat anuj.pho...@gmail.com

[Mesa-dev] [PATCH] i965/wm/gen7: Refactor state setup

2015-05-07 Thread Topi Pohjolainen

CC: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h |  9 +++
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 98 ---
 2 files changed, 74 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 26fdae6..5a52a74 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -264,6 +264,15 @@ void brw_update_renderbuffer_surfaces(struct brw_context 
*brw,
 
 /* gen7_wm_state.c */
 void
+gen7_upload_wm_state(struct brw_context *brw,
+ const struct gl_program *fp,
+ const struct brw_wm_prog_data *prog_data,
+ bool multisampled_fbo, int min_inv_per_frag,
+ bool kill_enable, bool color_buffer_write_enable,
+ bool msaa_enabled, bool statistic_enable,
+ bool line_stipple_enable, bool polygon_stipple_enable);
+
+void
 gen7_upload_ps_state(struct brw_context *brw,
  const struct gl_fragment_program *fp,
  const struct brw_stage_state *stage_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index b918275..b3fa5be 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -32,63 +32,53 @@
 #include "program/prog_statevars.h"
 #include "intel_batchbuffer.h"
 
-static void
-upload_wm_state(struct brw_context *brw)
+void
+gen7_upload_wm_state(struct brw_context *brw,
+ const struct gl_program *fp,
+ const struct brw_wm_prog_data *prog_data,
+ bool multisampled_fbo, int min_inv_per_frag,
+ bool kill_enable, bool color_buffer_write_enable,
+ bool msaa_enabled, bool statistic_enable,
+ bool line_stipple_enable, bool polygon_stipple_enable)
 {
-   struct gl_context *ctx = &brw->ctx;
-   /* BRW_NEW_FRAGMENT_PROGRAM */
-   const struct brw_fragment_program *fp =
-  brw_fragment_program_const(brw->fragment_program);
-   /* BRW_NEW_FS_PROG_DATA */
-   const struct brw_wm_prog_data *prog_data = brw->wm.prog_data;
bool writes_depth = prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
uint32_t dw1, dw2;
 
-   /* _NEW_BUFFERS */
-   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
-
dw1 = dw2 = 0;
-   dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+  dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
dw1 |= GEN7_WM_LINE_AA_WIDTH_1_0;
dw1 |= GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
-   /* _NEW_LINE */
-   if (ctx->Line.StippleFlag)
+   if (line_stipple_enable)
   dw1 |= GEN7_WM_LINE_STIPPLE_ENABLE;
 
-   /* _NEW_POLYGON */
-   if (ctx->Polygon.StippleFlag)
+   if (polygon_stipple_enable)
   dw1 |= GEN7_WM_POLYGON_STIPPLE_ENABLE;
 
-   if (fp->program.Base.InputsRead & VARYING_BIT_POS)
+   if (fp->InputsRead & VARYING_BIT_POS)
   dw1 |= GEN7_WM_USES_SOURCE_DEPTH | GEN7_WM_USES_SOURCE_W;
 
dw1 |= prog_data->computed_depth_mode << GEN7_WM_COMPUTED_DEPTH_MODE_SHIFT;
dw1 |= prog_data->barycentric_interp_modes <<
   GEN7_WM_BARYCENTRIC_INTERPOLATION_MODE_SHIFT;
 
-   /* _NEW_COLOR, _NEW_MULTISAMPLE */
-   /* Enable if the pixel shader kernel generates and outputs oMask.
-*/
-   if (prog_data->uses_kill || ctx->Color.AlphaEnabled ||
-   ctx->Multisample.SampleAlphaToCoverage ||
-   prog_data->uses_omask) {
+   if (kill_enable)
   dw1 |= GEN7_WM_KILL_ENABLE;
-   }
 
-   /* _NEW_BUFFERS | _NEW_COLOR */
-   if (brw_color_buffer_write_enabled(brw) || writes_depth ||
-   dw1 & GEN7_WM_KILL_ENABLE) {
+   if (color_buffer_write_enable || writes_depth ||
+   dw1 & GEN7_WM_KILL_ENABLE)
   dw1 |= GEN7_WM_DISPATCH_ENABLE;
-   }
+
if (multisampled_fbo) {
-  /* _NEW_MULTISAMPLE */
-  if (ctx->Multisample.Enabled)
+  if (msaa_enabled)
  dw1 |= GEN7_WM_MSRAST_ON_PATTERN;
   else
  dw1 |= GEN7_WM_MSRAST_OFF_PIXEL;
 
-  if (_mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, false) > 1)
+  if (min_inv_per_frag > 1)
  dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
   else
  dw2 |= GEN7_WM_MSDISPMODE_PERPIXEL;
@@ -97,9 +87,8 @@ upload_wm_state(struct brw_context *brw)
   dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
}
 
-   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN) {
+   if (fp->SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
   dw1 |= GEN7_WM_USES_INPUT_COVERAGE_MASK;
-   }
 
BEGIN_BATCH(3);
OUT_BATCH(_3DSTATE_WM << 16 | (3 - 2));
@@ -108,6 +97,49 @@ upload_wm_state(struct brw_context *brw)
ADVANCE_BATCH();
 }
 
+static void
+upload_wm_state(struct brw_context *brw)
+{
+   struct gl_context *ctx = &brw->ctx;
+   /* BRW_NEW_FRAGMENT_PROGRAM */
+   const struct brw_fragment_program *fp =
+

Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Ian Romanick

On 05/07/2015 12:57 AM, Marta Lofstedt wrote:
 From: Marta Lofstedt marta.lofst...@intel.com
 
 v2: only expose enums from GL_ARB_shader_image_load_store
 for gles 3.1 and GL core
 
 Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
 ---
  src/mesa/main/get.c  |  6 ++
  src/mesa/main/get_hash_params.py | 17 -
  2 files changed, 14 insertions(+), 9 deletions(-)
 
 diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
 index 9898197..73739b6 100644
 --- a/src/mesa/main/get.c
 +++ b/src/mesa/main/get.c
 @@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
 EXTRA_END
  };
  
 +static const int extra_ARB_shader_image_load_store_es31[] = {
 +   EXT(ARB_shader_image_load_store),
 +   EXTRA_API_ES31,

I think you're missing the patch that adds EXTRA_API_ES31.  Did you
forget to send that one out?

Also, on a few of these patches, I think the old, non-_es31 set of
requirements can be removed due to no longer being used.

 +   EXTRA_END
 +};
 +
  EXTRA_EXT(ARB_texture_cube_map);
  EXTRA_EXT(EXT_texture_array);
  EXTRA_EXT(NV_fog_distance);
 diff --git a/src/mesa/main/get_hash_params.py 
 b/src/mesa/main/get_hash_params.py
 index 513d5d2..85c2494 100644
 --- a/src/mesa/main/get_hash_params.py
 +++ b/src/mesa/main/get_hash_params.py
 @@ -413,6 +413,14 @@ descriptor=[
  { apis: [GL_CORE, GLES3], params: [
  # GL_ARB_draw_indirect / GLES 3.1
[ DRAW_INDIRECT_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, 
 extra_ARB_draw_indirect_es31 ],
 +# GL_ARB_shader_image_load_store / GLES 3.1
 +  [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, 
 CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_VERTEX_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_FRAGMENT_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
 extra_ARB_shader_image_load_store_es31],
 +  [ MAX_COMBINED_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.MaxCombinedImageUniforms), 
 extra_ARB_shader_image_load_store_es31],
  ]},
  
  # Remaining enums are only in OpenGL
 @@ -780,15 +788,6 @@ descriptor=[
[ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, 
 CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ],
[ MAX_VERTEX_ATTRIB_BINDINGS, 
 CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ],
  
 -# GL_ARB_shader_image_load_store
 -  [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), 
 extra_ARB_shader_image_load_store],
 -  [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, 
 CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
 extra_ARB_shader_image_load_store],
 -  [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), 
 extra_ARB_shader_image_load_store],
 -  [ MAX_VERTEX_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
 extra_ARB_shader_image_load_store],
 -  [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
 extra_ARB_shader_image_load_store_and_geometry_shader],
 -  [ MAX_FRAGMENT_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
 extra_ARB_shader_image_load_store],
 -  [ MAX_COMBINED_IMAGE_UNIFORMS, 
 CONTEXT_INT(Const.MaxCombinedImageUniforms), 
 extra_ARB_shader_image_load_store],
 -
  # GL_ARB_compute_shader
[ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, 
 CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader 
 ],
[ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
 extra_ARB_compute_shader ],
 


Re: [Mesa-dev] [PATCH 13/13] SQUASH: nir: Update various components for the new list-based use/def sets

2015-05-07 Thread Connor Abbott

On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
 ---
  src/glsl/nir/nir_from_ssa.c | 11 +--
  src/glsl/nir/nir_lower_locals_to_regs.c | 14 ++
  src/glsl/nir/nir_lower_to_source_mods.c | 20 
  src/glsl/nir/nir_lower_vars_to_ssa.c|  3 ++-
  src/glsl/nir/nir_opt_gcm.c  | 14 ++
  src/glsl/nir/nir_opt_global_to_local.c  | 13 ++---
  src/glsl/nir/nir_opt_peephole_ffma.c|  9 -
  src/glsl/nir/nir_opt_peephole_select.c  | 10 --
  src/glsl/nir/nir_to_ssa.c   | 19 ++-
  9 files changed, 55 insertions(+), 58 deletions(-)

 diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
 index 5e7deca..94d1ced 100644
 --- a/src/glsl/nir/nir_from_ssa.c
 +++ b/src/glsl/nir/nir_from_ssa.c
 @@ -345,6 +345,7 @@ isolate_phi_nodes_block(nir_block *block, void 
 *void_state)

   nir_parallel_copy_entry *entry = rzalloc(state->dead_ctx,
nir_parallel_copy_entry);
 + entry->src.parent_instr = &pcopy->instr;

I don't think this change, or the one immediately below, is needed,
since nir_instr_rewrite_uses() will already set the parent_instr.

   nir_ssa_dest_init(&pcopy->instr, &entry->dest,
 phi->dest.ssa.num_components, src->src.ssa->name);
   exec_list_push_tail(&pcopy->entries, &entry->node);
 @@ -358,6 +359,7 @@ isolate_phi_nodes_block(nir_block *block, void 
 *void_state)

nir_parallel_copy_entry *entry = rzalloc(state->dead_ctx,
 nir_parallel_copy_entry);
 +  entry->src.parent_instr = &block_pcopy->instr;
nir_ssa_dest_init(&block_pcopy->instr, &entry->dest,
  phi->dest.ssa.num_components, phi->dest.ssa.name);
exec_list_push_tail(&block_pcopy->entries, &entry->node);
 @@ -503,7 +505,7 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
 }

 nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg), state->mem_ctx);
 -   assert(def->uses->entries == 0 && def->if_uses->entries == 0);
 +   assert(list_empty(&def->uses) && list_empty(&def->if_uses));

 if (def->parent_instr->type == nir_instr_type_ssa_undef)
return true;
 @@ -515,12 +517,9 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
  */
 nir_dest *dest = exec_node_data(nir_dest, def, ssa);

 -   _mesa_set_destroy(dest->ssa.uses, NULL);
 -   _mesa_set_destroy(dest->ssa.if_uses, NULL);
 -
 *dest = nir_dest_for_reg(reg);
 -
 -   _mesa_set_add(reg->defs, state->instr);
 +   dest->reg.parent_instr = state->instr;
 +   list_addtail(&dest->reg.def_link, &reg->defs);

 return true;
  }
 diff --git a/src/glsl/nir/nir_lower_locals_to_regs.c 
 b/src/glsl/nir/nir_lower_locals_to_regs.c
 index bc6a3d3..28fdec5 100644
 --- a/src/glsl/nir/nir_lower_locals_to_regs.c
 +++ b/src/glsl/nir/nir_lower_locals_to_regs.c
 @@ -269,18 +269,16 @@ lower_locals_to_regs_block(nir_block *block, void 
 *void_state)
  static nir_block *
  compute_reg_usedef_lca(nir_register *reg)
  {
 -   struct set_entry *entry;
 nir_block *lca = NULL;

 -   set_foreach(reg->defs, entry)
 -      lca = nir_dominance_lca(lca, ((nir_instr *)entry->key)->block);
 +   list_for_each_entry(nir_dest, def_dest, &reg->defs, reg.def_link)
 +      lca = nir_dominance_lca(lca, def_dest->reg.parent_instr->block);

 -   set_foreach(reg->uses, entry)
 -      lca = nir_dominance_lca(lca, ((nir_instr *)entry->key)->block);
 +   list_for_each_entry(nir_src, use_src, &reg->uses, use_link)
 +      lca = nir_dominance_lca(lca, use_src->parent_instr->block);

 -   set_foreach(reg->if_uses, entry) {
 -      nir_if *if_stmt = (nir_if *)entry->key;
 -      nir_cf_node *prev_node = nir_cf_node_prev(&if_stmt->cf_node);
 +   list_for_each_entry(nir_src, use_src, &reg->if_uses, use_link) {
 +      nir_cf_node *prev_node = nir_cf_node_prev(&use_src->parent_if->cf_node);
assert(prev_node->type == nir_cf_node_block);
lca = nir_dominance_lca(lca, nir_cf_node_as_block(prev_node));
 }
 diff --git a/src/glsl/nir/nir_lower_to_source_mods.c 
 b/src/glsl/nir/nir_lower_to_source_mods.c
 index 7b4a0f6..94c7e36 100644
 --- a/src/glsl/nir/nir_lower_to_source_mods.c
 +++ b/src/glsl/nir/nir_lower_to_source_mods.c
 @@ -88,8 +88,8 @@ nir_lower_to_source_mods_block(nir_block *block, void 
 *state)
  alu->src[i].swizzle[j] = parent->src[0].swizzle[alu->src[i].swizzle[j]];
   }

 - if (parent->dest.dest.ssa.uses->entries == 0 &&
 - parent->dest.dest.ssa.if_uses->entries == 0)
 + if (list_empty(&parent->dest.dest.ssa.uses) &&
 + list_empty(&parent->dest.dest.ssa.if_uses))
  nir_instr_remove(&parent->instr);
}

 @@ -131,13 +131,13 @@ nir_lower_to_source_mods_block(nir_block *block, void 
 *state)
if (nir_op_infos[alu-op].output_type != nir_type_float)
   continue;

 -  if (alu->dest.dest.ssa.if_uses->entries != 0)
 +  if

[Mesa-dev] [PATCH 4/5] clover: Add a mutex to guard queue::queued_events

2015-05-07 Thread Tom Stellard

This fixes a potential crash on a sequence like this:

Thread 0: Check if queue is not empty.
Thread 1: Remove item from queue, making it empty.
Thread 0: Do something assuming queue is not empty.
---
 src/gallium/state_trackers/clover/core/queue.cpp | 2 ++
 src/gallium/state_trackers/clover/core/queue.hpp | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/queue.cpp 
b/src/gallium/state_trackers/clover/core/queue.cpp
index 24f9326..87f9dcc 100644
--- a/src/gallium/state_trackers/clover/core/queue.cpp
+++ b/src/gallium/state_trackers/clover/core/queue.cpp
@@ -44,6 +44,7 @@ command_queue::flush() {
pipe_screen *screen = device().pipe;
pipe_fence_handle *fence = NULL;
 
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
if (!queued_events.empty()) {
   pipe->flush(pipe, &fence, 0);
 
@@ -69,6 +70,7 @@ command_queue::profiling_enabled() const {
 
 void
 command_queue::sequence(hard_event &ev) {
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
if (!queued_events.empty())
   queued_events.back()().chain(ev);
 
diff --git a/src/gallium/state_trackers/clover/core/queue.hpp 
b/src/gallium/state_trackers/clover/core/queue.hpp
index b7166e6..bddb86c 100644
--- a/src/gallium/state_trackers/clover/core/queue.hpp
+++ b/src/gallium/state_trackers/clover/core/queue.hpp
@@ -24,6 +24,7 @@
 #define CLOVER_CORE_QUEUE_HPP
 
 #include <deque>
+#include <mutex>
 
 #include "core/object.hpp"
 #include "core/context.hpp"
@@ -69,6 +70,7 @@ namespace clover {
 
   cl_command_queue_properties props;
   pipe_context *pipe;
+  std::mutex queued_events_mutex;
   std::deque<intrusive_ref<hard_event>> queued_events;
};
 }
-- 
2.0.4
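
The check-then-act sequence described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not clover's code: `guarded_queue`, `try_pop` and `drain_concurrently` are hypothetical names, and the queue holds plain integers rather than events. The only point is that the emptiness test and the removal share one lock, so no other thread can empty the deque in between.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command queue; not clover's actual class.
class guarded_queue {
public:
   void push(int ev) {
      std::lock_guard<std::mutex> lock(mutex_);
      events_.push_back(ev);
   }

   // The emptiness check and the removal happen under one lock, so no
   // other thread can empty the deque between the test and the pop.
   bool try_pop(int &out) {
      std::lock_guard<std::mutex> lock(mutex_);
      if (events_.empty())
         return false;
      out = events_.front();
      events_.pop_front();
      return true;
   }

private:
   std::mutex mutex_;
   std::deque<int> events_;
};

// Drains the queue from several threads and returns the total number of
// items popped. Each thread only writes its own slot in counts.
int drain_concurrently(guarded_queue &q, int num_threads) {
   std::vector<int> counts(num_threads, 0);
   std::vector<std::thread> workers;
   for (int t = 0; t < num_threads; ++t)
      workers.emplace_back([&q, &counts, t] {
         int ev;
         while (q.try_pop(ev))
            ++counts[t];
      });
   for (auto &w : workers)
      w.join();
   int total = 0;
   for (int c : counts)
      total += c;
   return total;
}
```

If 1000 items are pushed and then drained by four threads, every item should be popped exactly once. An unguarded variant that calls empty() and pop_front() without a shared lock would be open to the interleaving listed in the commit message.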


[Mesa-dev] [PATCH 1/5] clover: Add threadsafe wrappers for pipe_screen and pipe_context

2015-05-07 Thread Tom Stellard

Events can be added to an OpenCL command queue concurrently from
multiple threads, but pipe_context and pipe_screen objects
are not threadsafe.  The threadsafe wrappers protect all pipe_screen
and pipe_context function calls with a mutex, so we can safely use
them with multiple threads.
---
 src/gallium/state_trackers/clover/Makefile.am  |   6 +-
 src/gallium/state_trackers/clover/Makefile.sources |   4 +
 src/gallium/state_trackers/clover/core/device.cpp  |   2 +
 .../clover/core/pipe_threadsafe_context.c  | 272 +
 .../clover/core/pipe_threadsafe_screen.c   | 184 ++
 .../state_trackers/clover/core/threadsafe.h|  39 +++
 src/gallium/targets/opencl/Makefile.am |   3 +-
 7 files changed, 508 insertions(+), 2 deletions(-)
 create mode 100644 
src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
 create mode 100644 
src/gallium/state_trackers/clover/core/pipe_threadsafe_screen.c
 create mode 100644 src/gallium/state_trackers/clover/core/threadsafe.h

diff --git a/src/gallium/state_trackers/clover/Makefile.am 
b/src/gallium/state_trackers/clover/Makefile.am
index f46d9ef..8b615ae 100644
--- a/src/gallium/state_trackers/clover/Makefile.am
+++ b/src/gallium/state_trackers/clover/Makefile.am
@@ -1,5 +1,6 @@
 AUTOMAKE_OPTIONS = subdir-objects
 
+include $(top_srcdir)/src/gallium/Automake.inc
 include Makefile.sources
 
 AM_CPPFLAGS = \
@@ -32,6 +33,9 @@ cl_HEADERS = \
$(top_srcdir)/include/CL/opencl.h
 endif
 
+AM_CFLAGS = \
+   $(GALLIUM_CFLAGS)
+
 noinst_LTLIBRARIES = libclover.la libcltgsi.la libclllvm.la
 
 libcltgsi_la_CXXFLAGS = \
@@ -58,6 +62,6 @@ libclover_la_CXXFLAGS = \
 libclover_la_LIBADD = \
libcltgsi.la libclllvm.la
 
-libclover_la_SOURCES = $(CPP_SOURCES)
+libclover_la_SOURCES = $(CPP_SOURCES) $(C_SOURCES)
 
 EXTRA_DIST = Doxyfile
diff --git a/src/gallium/state_trackers/clover/Makefile.sources 
b/src/gallium/state_trackers/clover/Makefile.sources
index 10bbda0..90e6b7e 100644
--- a/src/gallium/state_trackers/clover/Makefile.sources
+++ b/src/gallium/state_trackers/clover/Makefile.sources
@@ -53,6 +53,10 @@ CPP_SOURCES := \
util/range.hpp \
util/tuple.hpp
 
+C_SOURCES := \
+   core/pipe_threadsafe_context.c \
+   core/pipe_threadsafe_screen.c
+
 LLVM_SOURCES := \
llvm/invocation.cpp
 
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index 42b45b7..b145027 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -22,6 +22,7 @@
 
 #include "core/device.hpp"
 #include "core/platform.hpp"
+#include "core/threadsafe.h"
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
 
@@ -47,6 +48,7 @@ device::device(clover::platform &platform, pipe_loader_device *ldev) :
  pipe->destroy(pipe);
   throw error(CL_INVALID_DEVICE);
}
+   pipe = pipe_threadsafe_screen(pipe);
 }
 
 device::~device() {
diff --git a/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c 
b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
new file mode 100644
index 000..f08f56c
--- /dev/null
+++ b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
@@ -0,0 +1,272 @@
+/*
+ * Copyright 2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors: Tom Stellard thomas.stell...@amd.com
+ *
+ */
+
+#include <stdio.h>
+
+/**
+ * \file
+ *
+ * threadsafe_context is a wrapper around a pipe_context to make it thread
+ * safe.
+ */
+
+#include "os/os_thread.h"
+#include "pipe/p_context.h"
+#include "util/u_memory.h"
+
+#include "threadsafe.h"
+
+
+
+struct threadsafe_context {
+   struct pipe_context base;
+   struct pipe_context *ctx;
+   pipe_mutex mutex;
+};
+
+static struct pipe_context *unwrap(struct pipe_context *ctx) {
+   if (!ctx)
+  return
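
The patch text is cut off here in the archive. As a rough illustration of the wrapping technique the commit message describes (every call into a non-threadsafe object goes through a mutex), here is a minimal sketch. It does not use the real pipe_context or pipe_screen interfaces: `mini_context`, `threadsafe_mini_context`, `wrap_threadsafe` and the `submit` entry point are all hypothetical, and std::mutex stands in for pipe_mutex.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of a function-pointer based C interface
// (not the real pipe_context).
struct mini_context {
   int (*submit)(mini_context *ctx, int work);
   int submitted;   // state touched by submit(); unsafe to update concurrently
};

static int raw_submit(mini_context *ctx, int work) {
   ctx->submitted += work;
   return ctx->submitted;
}

// The wrapper embeds the interface as its first member, so a pointer to
// base can be handed to callers that expect a mini_context.
struct threadsafe_mini_context {
   mini_context base;      // entry points that take the lock
   mini_context *wrapped;  // the real, non-threadsafe object
   std::mutex mutex;
};

static int locked_submit(mini_context *ctx, int work) {
   // base is the first member of a standard-layout struct, so this cast
   // recovers the enclosing wrapper.
   auto *ts = reinterpret_cast<threadsafe_mini_context *>(ctx);
   std::lock_guard<std::mutex> lock(ts->mutex);
   return ts->wrapped->submit(ts->wrapped, work);
}

static mini_context *wrap_threadsafe(mini_context *ctx) {
   auto *ts = new threadsafe_mini_context;
   ts->base.submit = locked_submit;
   ts->base.submitted = 0;
   ts->wrapped = ctx;
   return &ts->base;
}

static void destroy_threadsafe(mini_context *ctx) {
   delete reinterpret_cast<threadsafe_mini_context *>(ctx);
}
```

Callers keep using the same function-pointer interface; only the object they hold changes. The cost is that all calls on one wrapped object are serialized.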

[Mesa-dev] [PATCH 3/5] clover: Fix a bug with multi-threaded events

2015-05-07 Thread Tom Stellard

It was possible for some events never to get triggered if one thread
was creating events and another thread was waiting for them.

This patch consolidates soft_event::wait() and hard_event::wait()
into event::wait() so that hard_event objects will now wait for
all their dependencies to be submitted before flushing the command
queue.
---
 src/gallium/state_trackers/clover/core/event.cpp | 19 +++
 src/gallium/state_trackers/clover/core/event.hpp |  9 ++---
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index 3c9336e..da227bb 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -39,6 +39,7 @@ event::~event() {
 void
 event::trigger() {
if (!--wait_count) {
+  signalled_cv.notify_all();
   action_ok(*this);
 
   while (!_chain.empty()) {
@@ -73,6 +74,15 @@ event::chain(event &ev) {
ev.deps.push_back(*this);
 }
 
+void
+event::wait() {
+   for (event &ev : deps)
+      ev.wait();
+
+   std::unique_lock<std::mutex> lock(signalled_mutex);
+   signalled_cv.wait(lock, [=]{ return signalled(); });
+}
+
 hard_event::hard_event(command_queue &q, cl_command_type command,
const ref_vector<event> &deps, action action) :
event(q.context(), deps, profile(q, action), [](event &ev){}),
@@ -117,9 +127,11 @@ hard_event::command() const {
 }
 
 void
-hard_event::wait() const {
+hard_event::wait() {
pipe_screen *screen = queue()->device().pipe;
 
+   event::wait();
+
if (status() == CL_QUEUED)
   queue()->flush();
 
@@ -206,9 +218,8 @@ soft_event::command() const {
 }
 
 void
-soft_event::wait() const {
-   for (event &ev : deps)
-  ev.wait();
+soft_event::wait() {
+   event::wait();
 
if (status() != CL_COMPLETE)
   throw error(CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp 
b/src/gallium/state_trackers/clover/core/event.hpp
index d407c80..dffafb9 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -23,6 +23,7 @@
 #ifndef CLOVER_CORE_EVENT_HPP
 #define CLOVER_CORE_EVENT_HPP
 
+#include <condition_variable>
 #include <functional>
 
 #include "core/object.hpp"
@@ -68,7 +69,7 @@ namespace clover {
   virtual cl_int status() const = 0;
   virtual command_queue *queue() const = 0;
   virtual cl_command_type command() const = 0;
-  virtual void wait() const = 0;
+  virtual void wait();
 
   virtual struct pipe_fence_handle *fence() const {
  return NULL;
@@ -87,6 +88,8 @@ namespace clover {
   action action_ok;
   action action_fail;
   std::vector<intrusive_ref<event>> _chain;
+  std::condition_variable signalled_cv;
+  std::mutex signalled_mutex;
};
 
///
@@ -111,7 +114,7 @@ namespace clover {
   virtual cl_int status() const;
   virtual command_queue *queue() const;
   virtual cl_command_type command() const;
-  virtual void wait() const;
+  virtual void wait();
 
   const lazy<cl_ulong> &time_queued() const;
   const lazy<cl_ulong> &time_submit() const;
@@ -149,7 +152,7 @@ namespace clover {
   virtual cl_int status() const;
   virtual command_queue *queue() const;
   virtual cl_command_type command() const;
-  virtual void wait() const;
+  virtual void wait();
};
 }
 
-- 
2.0.4
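
The wait/notify pattern used by event::wait() and event::trigger() above can be sketched in isolation. This is an illustrative example only: `one_shot_event` and `wait_for_triggers` are hypothetical names, and unlike the patch, the sketch updates the counter while holding the same mutex the waiter uses. That is a design choice of the sketch (it rules out a notification slipping between the waiter's predicate check and its sleep), not a statement about how clover behaves.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot event; names are illustrative, not clover's.
class one_shot_event {
public:
   explicit one_shot_event(unsigned wait_count) : wait_count_(wait_count) {}

   // Called once per satisfied dependency; wakes waiters when the
   // count reaches zero.
   void trigger() {
      std::lock_guard<std::mutex> lock(mutex_);
      if (wait_count_ > 0 && --wait_count_ == 0)
         cv_.notify_all();
   }

   bool signalled() {
      std::lock_guard<std::mutex> lock(mutex_);
      return wait_count_ == 0;
   }

   // Blocks until the count reaches zero. The predicate overload re-checks
   // the condition after every wakeup, so spurious wakeups are harmless.
   void wait() {
      std::unique_lock<std::mutex> lock(mutex_);
      cv_.wait(lock, [this] { return wait_count_ == 0; });
   }

private:
   std::mutex mutex_;
   std::condition_variable cv_;
   unsigned wait_count_;
};

// Triggers the event count times from another thread while the calling
// thread waits, then reports whether the event ended up signalled.
bool wait_for_triggers(unsigned count) {
   one_shot_event ev(count);
   std::thread producer([&ev, count] {
      for (unsigned i = 0; i < count; ++i)
         ev.trigger();
   });
   ev.wait();
   producer.join();
   return ev.signalled();
}
```

Calling wait_for_triggers(3) should return true regardless of whether the producer finishes before or after the waiter starts waiting, because the predicate is evaluated before the first sleep.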


[Mesa-dev] [PATCH 2/5] clover: Replace open-coded event::signalled()

2015-05-07 Thread Tom Stellard

This consolidates signalled checks into the same place.
---
 src/gallium/state_trackers/clover/core/event.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index 58de888..3c9336e 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -66,7 +66,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
-   if (wait_count) {
+   if (!signalled()) {
   ev.wait_count++;
   _chain.push_back(ev);
}
-- 
2.0.4


Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util

2015-05-07 Thread Ian Romanick

Isn't this the same as src/util/simple_list.h?

On 04/27/2015 09:03 PM, Jason Ekstrand wrote:
 The linked list in gallium is pretty much the kernel list and we would like
 to have a C-based linked list for all of mesa.  Let's not duplicate and
 just steal the gallium one.
 ---
  src/gallium/auxiliary/Makefile.sources |   1 -
  src/gallium/auxiliary/hud/hud_private.h|   2 +-
  .../auxiliary/pipebuffer/pb_buffer_fenced.c|   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c|   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c  |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c  |   2 +-
  src/gallium/auxiliary/util/u_debug_flush.c |   2 +-
  src/gallium/auxiliary/util/u_debug_memory.c|   2 +-
  src/gallium/auxiliary/util/u_dirty_surfaces.h  |   2 +-
  src/gallium/auxiliary/util/u_double_list.h | 146 
 -
  src/gallium/drivers/freedreno/freedreno_context.h  |   2 +-
  src/gallium/drivers/freedreno/freedreno_query_hw.h |   2 +-
  src/gallium/drivers/freedreno/freedreno_resource.h |   2 +-
  src/gallium/drivers/ilo/ilo_common.h   |   2 +-
  src/gallium/drivers/nouveau/nouveau_buffer.h   |   2 +-
  src/gallium/drivers/nouveau/nouveau_fence.c|   2 -
  src/gallium/drivers/nouveau/nouveau_fence.h|   2 +-
  src/gallium/drivers/nouveau/nouveau_mm.c   |   2 +-
  src/gallium/drivers/nouveau/nv30/nv30_screen.h |   2 +-
  src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   2 +-
  src/gallium/drivers/r600/compute_memory_pool.c |   2 +-
  src/gallium/drivers/r600/evergreen_compute.c   |   2 +-
  src/gallium/drivers/r600/r600_llvm.c   |   2 +-
  src/gallium/drivers/r600/r600_pipe.h   |   2 +-
  src/gallium/drivers/radeon/r600_pipe_common.h  |   2 +-
  src/gallium/drivers/radeon/radeon_vce.h|   2 +-
  src/gallium/drivers/svga/svga_context.h|   2 +-
  src/gallium/drivers/svga/svga_resource_buffer.h|   2 -
  .../drivers/svga/svga_resource_buffer_upload.c |   1 -
  src/gallium/drivers/svga/svga_screen_cache.h   |   2 +-
  src/gallium/state_trackers/nine/basetexture9.h |   2 +-
  src/gallium/state_trackers/nine/device9.h  |   2 +-
  src/gallium/state_trackers/nine/nine_state.h   |   2 +-
  src/gallium/state_trackers/nine/surface9.h |   2 +-
  src/gallium/state_trackers/omx/vid_dec.h   |   2 +-
  src/gallium/state_trackers/omx/vid_enc.h   |   2 +-
  src/gallium/winsys/radeon/drm/radeon_drm_bo.c  |   2 +-
  .../winsys/svga/drm/pb_buffer_simple_fenced.c  |   2 +-
  src/gallium/winsys/svga/drm/vmw_fence.c|   2 +-
  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c  |   2 +-
  src/util/Makefile.sources  |   1 +
  src/util/list.h| 146 
 +
  44 files changed, 184 insertions(+), 189 deletions(-)
  delete mode 100644 src/gallium/auxiliary/util/u_double_list.h
  create mode 100644 src/util/list.h
 
 diff --git a/src/gallium/auxiliary/Makefile.sources 
 b/src/gallium/auxiliary/Makefile.sources
 index ec7547c..62e6b94 100644
 --- a/src/gallium/auxiliary/Makefile.sources
 +++ b/src/gallium/auxiliary/Makefile.sources
 @@ -197,7 +197,6 @@ C_SOURCES := \
   util/u_dirty_surfaces.h \
   util/u_dl.c \
   util/u_dl.h \
 - util/u_double_list.h \
   util/u_draw.c \
   util/u_draw.h \
   util/u_draw_quad.c \
 diff --git a/src/gallium/auxiliary/hud/hud_private.h 
 b/src/gallium/auxiliary/hud/hud_private.h
 index 1606ada..c74dc3b 100644
 --- a/src/gallium/auxiliary/hud/hud_private.h
 +++ b/src/gallium/auxiliary/hud/hud_private.h
 @@ -29,7 +29,7 @@
  #define HUD_PRIVATE_H
  
  #include "pipe/p_context.h"
 -#include "util/u_double_list.h"
 +#include "util/list.h"
  
  struct hud_graph {
 /* initialized by common code */
 diff --git a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c 
 b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
 index 9e0cace..7840467 100644
 --- a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
 +++ b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
 @@ -46,7 +46,7 @@
  #include "util/u_debug.h"
  #include "os/os_thread.h"
  #include "util/u_memory.h"
 -#include "util/u_double_list.h"
 +#include "util/list.h"
  
  #include "pb_buffer.h"
  #include "pb_buffer_fenced.h"
 diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c 
 b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
 index 5eb8d06..5023687 100644
 --- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
 +++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
 @@ -38,7 +38,7 @@
  #include "util/u_debug.h"
  #include "os/os_thread.h"
  #include "util/u_memory.h"
 -#include "util/u_double_list.h"
 +#include "util/list.h"
  #include "util/u_time.h"
  
  #include

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-07 Thread Ian Romanick

On 05/05/2015 11:21 AM, Neil Roberts wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:
 
 +static inline bool list_empty(struct list_head *list)
 +{
 +   return list->next == list;
 +}
 
 It would be good if list.h also included stdbool.h in order to get the
 declaration of bool. However, will that cause problems on MSVC? Is the
 Gallium code compiled on MSVC in general?
 
 +static inline unsigned list_length(struct list_head *list)
 +{
 +   unsigned length = 0;
 +   for (struct list_head *node = list->next; node != list; node = node->next)
 +  length++;
 +   return length;
 +}
 
 Any reason not to use one of the list iterator macros here? Is it safe
 to use a C99-ism outside of a macro in this header? Maybe MSVC
 supports this particular C99-ism anyway.
 
 For what it's worth, I'm strongly in favour of using these kernel-style
 lists instead of exec_list. The kernel ones seem much less confusing.

Huh?  They're practically identical.  The only difference is the
kernel-style lists have a single sentinel node, and that node is
impossible to identify in a crowd.  The exec_lists use two sentinel
nodes, and those nodes have one pointer of overlapping storage (head and
tail are the next and prev pointers of one node, and tail and tail_pred
are the next and prev pointers of the other).  I thought there was some
ASCII art in list.h that showed this, but that appears to not be the case...

This gives some convenience that you can walk through a list from any
node in the list without having a pointer to the list itself.  I don't
know if we still do, but there used to be a few places where we took
advantage of that.

Some of the APIs are (very) poorly named (I'm looking at you,
insert_before), and I'd welcome patches to fix that up.
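
The single-sentinel layout discussed above can be sketched as follows. This is a hedged illustration, not the contents of src/util/list.h: `list_node`, `list_init`, `list_add_tail`, `list_remove`, `list_is_empty` and `list_count` are names chosen for this example. It shows why an emptiness test is a single pointer comparison, and why the sentinel cannot be told apart from an ordinary node without knowing which node is the head.

```cpp
#include <cassert>

// Hypothetical kernel-style node; the head of the list is just another node.
struct list_node {
   list_node *prev;
   list_node *next;
};

// An empty list is a sentinel that points at itself in both directions.
inline void list_init(list_node *head) {
   head->prev = head;
   head->next = head;
}

// Links item in just before the sentinel, i.e. at the tail.
inline void list_add_tail(list_node *item, list_node *head) {
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

// Unlinks item; no pointer to the list head is needed.
inline void list_remove(list_node *item) {
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = nullptr;
}

inline bool list_is_empty(const list_node *head) {
   return head->next == head;
}

// Walks from the sentinel until the walk returns to it.
inline unsigned list_count(const list_node *head) {
   unsigned n = 0;
   for (const list_node *node = head->next; node != head; node = node->next)
      ++n;
   return n;
}
```

Counting is linear in the list length, which is one reason callers that only need to know whether a list is empty are better served by the pointer comparison.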

 Regards,
 - Neil

[Mesa-dev] [PATCH] main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE

2015-05-07 Thread Marta Lofstedt

The return type for GL_VERTEX_BINDING_STRIDE is missing,
which causes glGetIntegeri_v to fail.

Signed-off-by: Marta Lofstedt marta.lofst...@linux.intel.com
---
 src/mesa/main/get.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6fc0f3f..9fb8fba 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1959,6 +1959,7 @@ find_value_indexed(const char *func, GLenum pname, GLuint index, union value *v)
   if (index >= ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs)
   goto invalid_value;
   v->value_int = ctx->Array.VAO->VertexBinding[VERT_ATTRIB_GENERIC(index)].Stride;
+  return TYPE_INT;
 
/* ARB_shader_image_load_store */
case GL_IMAGE_BINDING_NAME: {
-- 
1.9.1
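
The class of bug fixed above (a switch case that stores a value but does not return its type, so control falls through to the following case) can be shown with a small sketch. The names `value_type`, `query` and `find_value` are hypothetical and much simpler than Mesa's find_value_indexed(); the sketch only demonstrates the control flow.

```cpp
#include <cassert>

enum value_type { TYPE_INVALID, TYPE_INT };
enum query { QUERY_STRIDE, QUERY_OTHER };

// Hypothetical lookup: each case stores a value and must report its type.
value_type find_value(query q, int stride, int *out) {
   switch (q) {
   case QUERY_STRIDE:
      *out = stride;
      return TYPE_INT;   // without this return, control falls into the next case
   case QUERY_OTHER:
      break;
   }
   return TYPE_INVALID;
}
```

With the return removed from the first case, the stride would still be written to out, but the caller would see TYPE_INVALID and treat the query as failed.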


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:44 PM, Jason Ekstrand wrote:
 
 On May 7, 2015 5:38 PM, Ian Romanick i...@freedesktop.org wrote:

 On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
  GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
 
 total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
 instructions in affected programs: 1860859 -> 1848166 (-0.68%)
 helped:4387
 HURT:  4758
 GAINED:1499
 
  The gained programs are ARB vertex programs that were previously going
  through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
  programs can go through the scalar backend so they show up as
 gained in
  the shader-db results.

 I thought we already did this... why didn't this happen when NIR became
 the default for the FS backend?  And has that reason (assuming there was
 one) been resolved?
 
 We couldn't do copy propagation of values in the attribute register
 file.  That, in turn, was blocked on reworking the LOAD_PAYLOAD
 instruction.  I pushed a series this morning that fixed both of those
 and cut 7.5% off of all SIMD8 VS instructions when using NIR.  It also
 helps GLSL IR but by only 1% or so.
 --Jason

Ah, that's right.  Make it so!

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

  ---
   src/mesa/drivers/dri/i965/brw_context.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_context.c
 b/src/mesa/drivers/dri/i965/brw_context.c
  index fd7420a..8615e5e 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.c
  +++ b/src/mesa/drivers/dri/i965/brw_context.c
  @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
 
  -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
  +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
  }
 



Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Matt Turner

On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped:4387
HURT:  4758
GAINED:1499

 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.

Again, I'm kind of confused and disappointed that we're just okay with
hurting 4700 programs without more analysis. I guess I'll go do
that...

I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
297 -> 161 instructions. More concerning, the number of send
instructions drops from 36 to 12, and a loop that was 111 instructions
long suddenly becomes

   START B1 <-B0 <-B2
cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
(+f0) break(8)  JIP: 24         UIP: 24
   END B1 ->B3 ->B2
   START B2 <-B1
add(8)          g42<1>D         g42<8,8,1>D     1D
while(8)        JIP: -32
   END B2 ->B1

That deserves a lot more investigation. I'll take a gamble and say
something is broken.

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 06:17 PM, Matt Turner wrote:
 On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped:4387
HURT:  4758
GAINED:1499

 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.
 
 Again, I'm kind of confused and disappointed that we're just okay with
 hurting 4700 programs without more analysis. I guess I'll go do
 that...

Yeah... I think I just (foolishly) assumed it was mostly +/- small
amounts given the % in affected programs.

 I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
 297 -> 161 instructions. More concerning, the number of send
 instructions drops from 36 to 12, and a loop that was 111 instructions
 long suddenly becomes
 
    START B1 <-B0 <-B2
 cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
 (+f0) break(8)  JIP: 24         UIP: 24
    END B1 ->B3 ->B2
    START B2 <-B1
 add(8)          g42<1>D         g42<8,8,1>D     1D
 while(8)        JIP: -32
    END B2 ->B1
 
 That deserves a lot more investigation. I'll take a gamble and say
 something is broken.

Yikes.  I guess I'm surprised that piglit+gles3conform+deqp didn't
already find


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Aaron Watry

On Thu, May 7, 2015 at 5:27 PM, Jan Vesely jan.ves...@rutgers.edu wrote:

 On Thu, 2015-05-07 at 21:52 +0200, EdB wrote:
  Le 2015-05-07 18:55, Aaron Watry a écrit :
   I'm not sure what the final consensus will be on how to do this, but
   FWIW:
   Tested-By: Aaron Watry awa...@gmail.com
  
   I've tested this with 4 combinations:
   no --with-opencl-icd option specified : libOpenCL.so gets installed in
   ${prefix}/lib
   --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
   --with-opencl-icd=standard : libMesaOpenCL.so installed in
   ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
   --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in
   ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified
   --prefix, no other directories overridden in configure command.

 shouldn't this part go to ${prefix}/etc/OpenCL/vendors?
 Is it just a typo or did it install to ${prefix}/etc//?


That was just a typo.  It went to ${prefix}/etc/OpenCL/vendors/mesa.icd.

--Aaron


 jan

  
 
  thanks
 
 EdB
 
   --Aaron
  
  
  
   On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:
  
   The standard ICD file path is /etc/OpenCL/vendors/.
   However, it doesn't fit well with custom builds.
   This option allows overriding the ICD vendor file installation path.
   ---
 configure.ac                           | 46 +++---
 src/gallium/targets/opencl/Makefile.am |  2 +-
 2 files changed, 33 insertions(+), 15 deletions(-)
  
   diff --git a/configure.ac b/configure.ac
   index 095e23e..90dba4e 100644
   --- a/configure.ac
   +++ b/configure.ac
   @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
 [enable OpenCL library @:@default=disabled@:@])],
   [enable_opencl=$enableval],
   [enable_opencl=no])
   -AC_ARG_ENABLE([opencl_icd],
   -   [AS_HELP_STRING([--enable-opencl-icd],
   -  [Build an OpenCL ICD library to be loaded by an ICD
   implementation
   -   @:@default=disabled@:@])],
   -[enable_opencl_icd=$enableval],
   -[enable_opencl_icd=no])
AC_ARG_ENABLE([xlib-glx],
[AS_HELP_STRING([--enable-xlib-glx],
[make GLX library Xlib-based instead of DRI-based
   @:@default=disabled@:@])],
   @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
# XXX: Use $enable_shared_pipe_drivers once converted to
   use static/shared pipe-drivers
enable_gallium_loader=yes
  
   -if test x$enable_opencl_icd = xyes; then
   -OPENCL_LIBNAME=MesaOpenCL
   -else
   -OPENCL_LIBNAME=OpenCL
   -fi
   -
if test x$have_libelf != xyes; then
   AC_MSG_ERROR([Clover requires libelf])
fi
fi
AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
   -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
   -AC_SUBST([OPENCL_LIBNAME])
  
dnl
dnl Gallium configuration
   @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
[D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
AC_SUBST([D3D_DRIVER_INSTALL_DIR])
  
   +dnl OpenCL ICD
   +
   +AC_ARG_WITH([opencl-icd],
   +
   [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
   +[Build an OpenCL ICD library to be loaded by an ICD
   implementation.
   + If @:@standard@:@ the OpenCL ICD vendor file
   installs in /etc/OpenCL/vendors.
   + @:@sysconfdir@:@ installs the file in
   $sysconfdir/OpenCL/vendors
   + @:@default=no@:@])],
   +[OPENCL_ICD=$withval],
   +[OPENCL_ICD=no])
   +
   +case x$OPENCL_ICD in
   +xno)
   +OPENCL_LIBNAME=OpenCL
   +;;
   +xstandard)
   +OPENCL_LIBNAME=MesaOpenCL
   +ICD_FILE_DIR=/etc/OpenCL/vendors
   +;;
   +xsysconfdir)
   +OPENCL_LIBNAME=MesaOpenCL
   +ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
   +;;
   +*)
   +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
   --with-opencl-icd])
   +;;
   +esac
   +
   +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
   +AC_SUBST([OPENCL_LIBNAME])
   +AC_SUBST([ICD_FILE_DIR])
   +
dnl
dnl Gallium helper functions
dnl
   diff --git a/src/gallium/targets/opencl/Makefile.am
   b/src/gallium/targets/opencl/Makefile.am
   index 5daf327..781daa0 100644
   --- a/src/gallium/targets/opencl/Makefile.am
   +++ b/src/gallium/targets/opencl/Makefile.am
   @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES =
   opencl.sym
EXTRA_DIST = mesa.icd opencl.sym
  
if HAVE_CLOVER_ICD
   -icddir = /etc/OpenCL/vendors/
   +icddir = $(ICD_FILE_DIR)
icd_DATA = mesa.icd
endif
  
   --
   2.1.0
  

[Mesa-dev] [PATCH 5/5] clover: Add a mutex to guard event::chain and event::wait_count

2015-05-07 Thread Tom Stellard

This mutex effectively prevents an event's chain or wait_count from
being updated while it is in the process of triggering.  Otherwise it
may be possible to add to an event's chain after it has been triggered,
which causes the chained event to never be triggered.
---
 src/gallium/state_trackers/clover/core/event.cpp | 3 +++
 src/gallium/state_trackers/clover/core/event.hpp | 1 +
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index da227bb..646fd38 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -38,6 +38,7 @@ event::~event() {
 
 void
 event::trigger() {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
if (!--wait_count) {
   signalled_cv.notify_all();
   action_ok(*this);
@@ -54,6 +55,7 @@ event::abort(cl_int status) {
_status = status;
action_fail(*this);
 
+   std::lock_guard<std::mutex> lock(trigger_mutex);
while (!_chain.empty()) {
   _chain.back()().abort(status);
   _chain.pop_back();
@@ -67,6 +69,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
if (!signalled()) {
   ev.wait_count++;
   _chain.push_back(ev);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp 
b/src/gallium/state_trackers/clover/core/event.hpp
index dffafb9..a64fbba 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -90,6 +90,7 @@ namespace clover {
   std::vector<intrusive_ref<event>> _chain;
   std::condition_variable signalled_cv;
   std::mutex signalled_mutex;
+  std::mutex trigger_mutex;
};
 
///
-- 
2.0.4
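
As a rough illustration of the race described in the commit message above, here is a self-contained toy. `toy_event`, `add_dependency` and the `chained` member are hypothetical stand-ins, not clover's real classes. The sketch also departs from the patch in one respect: it releases the lock before firing the chained events, and it takes the target event's own lock when bumping its wait count.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for clover's event class; only the locking
// pattern is meant to mirror the idea of the patch.
class toy_event {
public:
   // wait_count is the number of dependencies that must trigger first;
   // zero means the event is already signalled.
   explicit toy_event(unsigned wait_count) : wait_count(wait_count) {}

   bool signalled() const {
      std::lock_guard<std::mutex> lock(trigger_mutex);
      return wait_count == 0;
   }

   // One dependency has completed. Fires the chained events once the
   // last dependency is gone.
   void trigger() {
      std::vector<toy_event *> to_fire;
      {
         std::lock_guard<std::mutex> lock(trigger_mutex);
         assert(wait_count > 0);
         if (--wait_count != 0)
            return;
         // Take the chain while still holding the lock, so a concurrent
         // chain() call either lands in this list or sees wait_count == 0.
         to_fire.swap(chained);
      }
      for (toy_event *ev : to_fire)
         ev->trigger();
   }

   // Make ev wait for this event, unless this event has already fired.
   void chain(toy_event &ev) {
      std::lock_guard<std::mutex> lock(trigger_mutex);
      if (wait_count != 0) {
         ev.add_dependency();
         chained.push_back(&ev);
      }
   }

private:
   void add_dependency() {
      std::lock_guard<std::mutex> lock(trigger_mutex);
      ++wait_count;
   }

   mutable std::mutex trigger_mutex;
   unsigned wait_count;
   std::vector<toy_event *> chained;
};
```

Without the lock, a `chain()` call could observe the event as not yet signalled, lose the CPU, and then append to a chain that `trigger()` has already walked, leaving the chained event waiting forever. With it, the check and the append happen as one step.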


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
 
total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped: 4387
HURT: 4758
GAINED: 1499
 
 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.

I thought we already did this... why didn't this happen when NIR became
the default for the FS backend?  And has that reason (assuming there was
one) been resolved?

 ---
  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index fd7420a..8615e5e 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)

    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
  
 -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
 +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
       ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
 }
  


[Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
   instructions in affected programs: 1860859 -> 1848166 (-0.68%)
   helped: 4387
   HURT: 4758
   GAINED: 1499

The gained programs are ARB vertex programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as gained in
the shader-db results.
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index fd7420a..8615e5e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
 
-  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
+  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
      ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
}
 
-- 
2.4.0
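
For readers checking the shader-db summary in the commit message, the percentages can be reproduced with a few lines of arithmetic. This is only a sketch: `percent_change` and `mean_delta` are hypothetical helper names, and treating "affected programs" as helped plus hurt is an assumption made here for illustration, not something the report states.

```cpp
#include <cassert>
#include <cmath>

// Relative change from before to after, in percent.
double percent_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}

// Mean instruction delta per affected program, assuming (for
// illustration only) that "affected" means helped + hurt.
double mean_delta(long before, long after, long helped, long hurt)
{
   assert(helped + hurt > 0);
   return static_cast<double>(after - before) /
          static_cast<double>(helped + hurt);
}
```

A small mean delta says nothing about the spread: a handful of shaders losing over a hundred instructions each can hide behind thousands that gain one or two, which is why the per-shader results matter more than the totals.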


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 04:50:39 PM Jason Ekstrand wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
 
total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped: 4387
HURT: 4758
GAINED: 1499
 
 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.
 ---
  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index fd7420a..8615e5e 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)

    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
  
 -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
 +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
       ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
 }
  
 

We definitely want to throw the switch before 10.6, so that all the
scalar backends are using NIR, and we'll be able to delete the
deprecated ones post-release.

Acked-by: Kenneth Graunke kenn...@whitecape.org



Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On May 7, 2015 5:38 PM, Ian Romanick i...@freedesktop.org wrote:

 On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
  GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
 
 total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
 instructions in affected programs: 1860859 -> 1848166 (-0.68%)
 helped: 4387
 HURT: 4758
 GAINED: 1499
 
  The gained programs are ARB vertex programs that were previously going
  through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
  programs can go through the scalar backend so they show up as gained in
  the shader-db results.

 I thought we already did this... why didn't this happen when NIR became
 the default for the FS backend?  And has that reason (assuming there was
 one) been resolved?

We couldn't do copy propagation of values in the attribute register file.
That, in turn, was blocked on reworking the LOAD_PAYLOAD instruction.  I
pushed a series this morning that fixed both of those and cut 7.5% off of
all SIMD8 VS instructions when using NIR.  It also helps GLSL IR but by
only 1% or so.
--Jason

  ---
   src/mesa/drivers/dri/i965/brw_context.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
  index fd7420a..8615e5e 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.c
  +++ b/src/mesa/drivers/dri/i965/brw_context.c
  @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
 
  -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
  +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
        ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = &nir_options;
  }
 


Re: [Mesa-dev] Initial amdgpu driver release

2015-05-07 Thread Alex Deucher

On Mon, Apr 20, 2015 at 6:33 PM, Alex Deucher alexdeuc...@gmail.com wrote:
 I'm pleased to announce the initial release of the new amdgpu driver.
 This is a partial replacement for the radeon driver for newer AMD
 asics.  A number of components are still shared.  Here is a comparison
 of the radeon and amdgpu stacks:

 1. radeon stack
 kernel driver: radeon.ko
 libdrm: libdrm_radeon
 mesa: radeon, r200, r300, r600, radeonsi
 ddx: xf86-video-ati

 2. amdgpu stack
 kernel driver: amdgpu.ko
 libdrm: libdrm_amdgpu
 mesa: radeonsi
 ddx: xf86-video-amdgpu

 Older asics will continue to be supported by the radeon stack; new
 asics will be supported by the amdgpu stack.  CI (Sea Islands) asics
 have support in both driver stacks, but this is purely for testing
 purposes.  CI parts are officially supported in the radeon stack.
 Support for CI on the amdgpu stack is determined by a config option in
 the kernel.  CI support is not enabled by default for amdgpu.

 Most of our focus has been on Carrizo support, so there are some gaps
 in the dGPU support for Tonga and Iceland, notably power management.
 Those gaps will be filled in eventually.

 Also included in this code base are full register headers for just
 about every block on the asics.

 Barring the gaps mentioned above, the driver stack is functionally on
 par with radeon including:
 - OpenGL 3.3 support using the radeonsi mesa driver
 - Video decode support using UVD
 - Video encode support using VCE

 The code can be found in the amdgpu branches of the following git trees.
 xf86-video-amdgpu:
 http://cgit.freedesktop.org/~agd5f/xf86-video-amdgpu/log/?h=amdgpu
 libdrm:
 http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
 kernel:
 http://cgit.freedesktop.org/~agd5f/linux/log/?h=amdgpu
 mesa:
 http://cgit.freedesktop.org/~mareko/mesa/log/?h=amdgpu

Some updates on the latest source locations:

xf86-video-amdgpu:
http://cgit.freedesktop.org/xorg/driver/xf86-video-amdgpu
libdrm:
http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
kernel:
http://cgit.freedesktop.org/amd/drm-amd/
mesa:
http://cgit.freedesktop.org/mesa/mesa/log/?h=amdgpu

Alex



 To test the new driver stack you will need to specify a device section
 in your xorg.conf with the driver set to amdgpu rather than radeon.

 Please review!

 Thanks,

 The AMD Linux Driver Team

Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:21 AM, Pohjolainen, Topi wrote:
 On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote:
 Stop context creation if something failed. If something errored
 during context creation we'd segfault. Now will clean up and
 return error.

 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/mesa/main/shared.c | 66 +++---
  1 file changed, 62 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
 index 0b76cc0..cc05b05 100644
 --- a/src/mesa/main/shared.c
 +++ b/src/mesa/main/shared.c
 @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
  
     mtx_init(&shared->Mutex, mtx_plain);
  
 +   /* Mutex and timestamp for texobj state validation */
 +   mtx_init(&shared->TexMutex, mtx_recursive);
 +   shared->TextureStateStamp = 0;
 
 Do you really need to move this here?

I was going to ask the same thing.  I think moving it here means that it
can be unconditionally mtx_destroy'ed in the error path below.

 +
     shared->DisplayList = _mesa_NewHashTable();
 +   if (!shared->DisplayList)
 +      goto error_out;
 +
     shared->TexObjects = _mesa_NewHashTable();
 +   if (!shared->TexObjects)
 +      goto error_out;
 +
     shared->Programs = _mesa_NewHashTable();
 +   if (!shared->Programs)
 +      goto error_out;
  
     shared->DefaultVertexProgram =
        gl_vertex_program(ctx->Driver.NewProgram(ctx,
 @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
                                                GL_FRAGMENT_PROGRAM_ARB, 0));
  
     shared->ATIShaders = _mesa_NewHashTable();
 +   if (!shared->ATIShaders)
 +      goto error_out;
 +
     shared->DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0);
  
     shared->ShaderObjects = _mesa_NewHashTable();
 +   if (!shared->ShaderObjects)
 +      goto error_out;
  
     shared->BufferObjects = _mesa_NewHashTable();
 +   if (!shared->BufferObjects)
 +      goto error_out;
  
     /* GL_ARB_sampler_objects */
     shared->SamplerObjects = _mesa_NewHashTable();
 +   if (!shared->SamplerObjects)
 +      goto error_out;
  
     /* Allocate the default buffer object */
     shared->NullBufferObj = ctx->Driver.NewBufferObject(ctx, 0);
 +   if (!shared->NullBufferObj)
 +      goto error_out;
  
     /* Create default texture objects */
     for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
 @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
        };
        STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS);
        shared->DefaultTex[i] = ctx->Driver.NewTextureObject(ctx, 0, targets[i]);
 +
 +      if (!shared->DefaultTex[i])
 +         goto error_out;
     }
  
     /* sanity check */
     assert(shared->DefaultTex[TEXTURE_1D_INDEX]->RefCount == 1);
  
 -   /* Mutex and timestamp for texobj state validation */
 -   mtx_init(&shared->TexMutex, mtx_recursive);
 -   shared->TextureStateStamp = 0;
 -
     shared->FrameBuffers = _mesa_NewHashTable();
 +   if (!shared->FrameBuffers)
 +      goto error_out;
 +
     shared->RenderBuffers = _mesa_NewHashTable();
 +   if (!shared->RenderBuffers)
 +      goto error_out;
  
     shared->SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer,
                                           _mesa_key_pointer_equal);
 +   if (!shared->SyncObjects)
 +      goto error_out;
  
     return shared;
 +
 +error_out:
 +   for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
 +      if (shared->DefaultTex[i]) {
 +         ctx->Driver.DeleteTexture(ctx, shared->DefaultTex[i]);
 +      }
 +   }
 +
 +   _mesa_reference_buffer_object(ctx, &shared->NullBufferObj, NULL);
 +
 +   _mesa_DeleteHashTable(shared->RenderBuffers);
 +   _mesa_DeleteHashTable(shared->FrameBuffers);
 +   _mesa_DeleteHashTable(shared->SamplerObjects);
 +   _mesa_DeleteHashTable(shared->BufferObjects);
 +   _mesa_DeleteHashTable(shared->ShaderObjects);
 +   _mesa_DeleteHashTable(shared->ATIShaders);
 +   _mesa_DeleteHashTable(shared->Programs);
 +   _mesa_DeleteHashTable(shared->TexObjects);
 +   _mesa_DeleteHashTable(shared->DisplayList);
 +
 +   _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL);
 +   _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL);
 +   _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL);
 +
 +   mtx_destroy(&shared->Mutex);
 +   mtx_destroy(&shared->TexMutex);
 +
 +   free(shared);
 +   return NULL;
  }
  
  
 -- 
 1.8.5.1
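
The point about initialising TexMutex early can be shown with a reduced sketch of the single-label cleanup pattern. Everything here is hypothetical (`toy_mutex`, `toy_shared`, `toy_table_create` and the `budget` knob that simulates allocation failure); it is not Mesa code. The idea is that every resource is either valid or NULL by the time any `goto error_out` can run, so the error path can release everything unconditionally.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical mutex stand-in that only tracks whether it is initialised.
struct toy_mutex { bool live; };

static void toy_mutex_init(toy_mutex *m) { m->live = true; }

static void toy_mutex_destroy(toy_mutex *m)
{
   assert(m->live);   // destroying an uninitialised mutex would be a bug
   m->live = false;
}

struct toy_shared {
   toy_mutex mutex;
   toy_mutex tex_mutex;
   void *display_list;
   void *tex_objects;
};

// Allocates a dummy table, or fails once the budget is used up.
static void *toy_table_create(int *budget)
{
   if (*budget <= 0)
      return nullptr;
   --*budget;
   return std::calloc(1, 16);
}

toy_shared *toy_alloc_shared(int budget)
{
   // calloc zeroes the table pointers, so free() on them is always safe.
   toy_shared *shared =
      static_cast<toy_shared *>(std::calloc(1, sizeof(toy_shared)));
   if (!shared)
      return nullptr;

   // Both mutexes are initialised before the first possible failure.
   toy_mutex_init(&shared->mutex);
   toy_mutex_init(&shared->tex_mutex);

   shared->display_list = toy_table_create(&budget);
   if (!shared->display_list)
      goto error_out;

   shared->tex_objects = toy_table_create(&budget);
   if (!shared->tex_objects)
      goto error_out;

   return shared;

error_out:
   std::free(shared->tex_objects);
   std::free(shared->display_list);
   toy_mutex_destroy(&shared->tex_mutex);
   toy_mutex_destroy(&shared->mutex);
   std::free(shared);
   return nullptr;
}

void toy_free_shared(toy_shared *shared)
{
   std::free(shared->tex_objects);
   std::free(shared->display_list);
   toy_mutex_destroy(&shared->tex_mutex);
   toy_mutex_destroy(&shared->mutex);
   std::free(shared);
}
```

Had the second mutex been initialised after the tables, the error path would need to know how far construction got before deciding whether to destroy it.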


Re: [Mesa-dev] [PATCH 4/5] nir: Translate image load, store and atomic intrinsics from GLSL IR.

2015-05-07 Thread Connor Abbott

On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez curroje...@riseup.net wrote:
 ---
  src/glsl/nir/glsl_to_nir.cpp | 125 +++
  1 file changed, 114 insertions(+), 11 deletions(-)

 diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
 index f6b8331..a01ab3b 100644
 --- a/src/glsl/nir/glsl_to_nir.cpp
 +++ b/src/glsl/nir/glsl_to_nir.cpp
 @@ -614,27 +614,130 @@ nir_visitor::visit(ir_call *ir)
          op = nir_intrinsic_atomic_counter_inc_var;
       } else if (strcmp(ir->callee_name(), "__intrinsic_atomic_predecrement") == 0) {
          op = nir_intrinsic_atomic_counter_dec_var;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_load") == 0) {
 +         op = nir_intrinsic_image_load;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_store") == 0) {
 +         op = nir_intrinsic_image_store;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_add") == 0) {
 +         op = nir_intrinsic_image_atomic_add;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_min") == 0) {
 +         op = nir_intrinsic_image_atomic_min;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_max") == 0) {
 +         op = nir_intrinsic_image_atomic_max;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_and") == 0) {
 +         op = nir_intrinsic_image_atomic_and;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_or") == 0) {
 +         op = nir_intrinsic_image_atomic_or;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_xor") == 0) {
 +         op = nir_intrinsic_image_atomic_xor;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_exchange") == 0) {
 +         op = nir_intrinsic_image_atomic_exchange;
 +      } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_comp_swap") == 0) {
 +         op = nir_intrinsic_image_atomic_comp_swap;
       } else {
          unreachable("not reached");
       }

       nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
 -      ir_dereference *param =
 -         (ir_dereference *) ir->actual_parameters.get_head();
 -      instr->variables[0] = evaluate_deref(&instr->instr, param);
 -      nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
 +
 +      switch (op) {
 +      case nir_intrinsic_atomic_counter_read_var:
 +      case nir_intrinsic_atomic_counter_inc_var:
 +      case nir_intrinsic_atomic_counter_dec_var: {
 +         ir_dereference *param =
 +            (ir_dereference *) ir->actual_parameters.get_head();
 +         instr->variables[0] = evaluate_deref(&instr->instr, param);
 +         nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
 +         break;
 +      }
 +      case nir_intrinsic_image_load:
 +      case nir_intrinsic_image_store:
 +      case nir_intrinsic_image_atomic_add:
 +      case nir_intrinsic_image_atomic_min:
 +      case nir_intrinsic_image_atomic_max:
 +      case nir_intrinsic_image_atomic_and:
 +      case nir_intrinsic_image_atomic_or:
 +      case nir_intrinsic_image_atomic_xor:
 +      case nir_intrinsic_image_atomic_exchange:
 +      case nir_intrinsic_image_atomic_comp_swap: {
 +         nir_load_const_instr *instr_zero = nir_load_const_instr_create(shader, 1);
 +         instr_zero->value.u[0] = 0;
 +         nir_instr_insert_after_cf_list(this->cf_node_list, &instr_zero->instr);
 +
 +         /* Set the image variable dereference. */
 +         exec_node *param = ir->actual_parameters.get_head();
 +         ir_dereference *image = (ir_dereference *)param;
 +         const glsl_type *type =
 +            image->variable_referenced()->type->without_array();
 +
 +         instr->variables[0] = evaluate_deref(&instr->instr, image);
 +         param = param->get_next();
 +
 +         /* Set the address argument, extending the coordinate vector to four
 +          * components.
 +          */
 +         const nir_src src_addr = evaluate_rvalue((ir_dereference *)param);
 +         nir_alu_instr *instr_addr = nir_alu_instr_create(shader, nir_op_vec4);
 +         nir_ssa_dest_init(&instr_addr->instr, &instr_addr->dest.dest, 4, NULL);
 +
 +         for (int i = 0; i < 4; i++) {
 +            if (i < type->coordinate_components()) {
 +               instr_addr->src[i].src = src_addr;
 +               instr_addr->src[i].swizzle[0] = i;
 +            } else {
 +               instr_addr->src[i].src = nir_src_for_ssa(&instr_zero->def);

I think it would better convey the intent to create an ssa_undef_instr
and use that here instead of zero. Unless something else relies on the
extra coordinates being zeroed?

 +            }
 +         }
 +
 +         nir_instr_insert_after_cf_list(cf_node_list, &instr_addr->instr);
 +         instr->src[0] = nir_src_for_ssa(&instr_addr->dest.dest.ssa);
 +         param = param->get_next();
 +
 +         /* Set the sample argument, which should be zero for single-sample
 +          * images.
 +          */
 + if

[Mesa-dev] [Bug 69101] prime: black window

2015-05-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=69101

higu...@gmx.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |higu...@gmx.net


Re: [Mesa-dev] [PATCH 1/2] glx: report which DRI version is used when in verbose debug mode

2015-05-07 Thread Eero Tamminen


Hi,

On 05/06/2015 08:28 PM, Kenneth Graunke wrote:

I agree with Axel - I think LIBGL_DRI3_DISABLE=1 already does what you
want, so patch 2 is unnecessary.


That needs a patch to doc/envvars.html...


- Eero


Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote:
 I guess this should not really be able to segfault but still it
 seems to be able to during context creation.
 
 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/mesa/main/errors.c | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)
 
 diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
 index 2aa1deb..6631b82 100644
 --- a/src/mesa/main/errors.c
 +++ b/src/mesa/main/errors.c
 @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... )
  
To me it looks like it would be better to just return early right here:

  if (!ctx)
     return;

That avoids the extra indentation, and it doesn't look meaningful to call
should_output() with a null context.

 
     do_output = should_output(ctx, error, fmtString);
  
 -   mtx_lock(&ctx->DebugMutex);
 -   if (ctx->Debug) {
 -      do_log = debug_is_message_enabled(ctx->Debug,
 -                                        MESA_DEBUG_SOURCE_API,
 -                                        MESA_DEBUG_TYPE_ERROR,
 -                                        error_msg_id,
 -                                        MESA_DEBUG_SEVERITY_HIGH);
 +   if (ctx) {
 +      mtx_lock(&ctx->DebugMutex);
 +      if (ctx->Debug) {
 +         do_log = debug_is_message_enabled(ctx->Debug,
 +                                           MESA_DEBUG_SOURCE_API,
 +                                           MESA_DEBUG_TYPE_ERROR,
 +                                           error_msg_id,
 +                                           MESA_DEBUG_SEVERITY_HIGH);
 +      }
 +      else {
 +         do_log = GL_FALSE;
 +      }
 +      mtx_unlock(&ctx->DebugMutex);
     }
     else {
        do_log = GL_FALSE;
     }
 -   mtx_unlock(&ctx->DebugMutex);
  
 if (do_output || do_log) {
char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
 @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, const char *fmtString, ... )
        }
  
        /* Log the error via ARB_debug_output if needed.*/
 -      if (do_log) {
 +      if (ctx && do_log) {
           log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR,
                   error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2);
        }
     }
  
     /* Set the GL context error state for glGetError. */
 -   _mesa_record_error(ctx, error);
 +   if (ctx)
 +      _mesa_record_error(ctx, error);
  }
  
  void
 -- 
 1.8.5.1
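
To make the early-return suggestion concrete, here is a minimal sketch of the guard-clause shape. `toy_context` and `toy_error` are hypothetical and heavily reduced; they are not Mesa's `gl_context` or `_mesa_error`. One consequence worth weighing: with the guard at the top, nothing at all is reported for a null context, whereas the patch as posted still goes through the should_output() path.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, heavily reduced stand-in for struct gl_context.
struct toy_context {
   std::mutex debug_mutex;
   bool debug_enabled = false;
   int last_error = 0;
   int logged = 0;
};

// Returns true when the error was recorded on a context.
bool toy_error(toy_context *ctx, int error)
{
   // Guard clause: nothing below can work without a context.
   if (!ctx)
      return false;

   bool do_log;
   {
      // The debug state is only read under its mutex.
      std::lock_guard<std::mutex> lock(ctx->debug_mutex);
      do_log = ctx->debug_enabled;
   }

   if (do_log)
      ctx->logged++;

   ctx->last_error = error;
   return true;
}
```

With the null check hoisted, the rest of the function needs no further `ctx` tests and keeps its original indentation.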
 

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
 v2: Get rid of the block parameter and make src a const reference
 
 Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
 Reviewed-by: Matt Turner matts...@gmail.com
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
 
  1 file changed, 38 insertions(+), 37 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 index 43370cb..9c4ed0b 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
operands_match(a, b, negate);
  }
  
 +static fs_inst *
 +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)

Did you mean 'src' to be a constant reference? It is only read here, so it
could be, and you claim as much in the commit message yourself :)

 +{
 +   int written = inst->regs_written;
 +   int dst_width = inst->dst.width / 8;
 +   fs_reg dst = inst->dst;
 +   fs_inst *copy;
 +
 +   if (written > dst_width) {
 +      fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / dst_width);
 +      for (int i = 0; i < written / dst_width; i++)
 +         sources[i] = offset(src, i);
 +      copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
 +   } else {
 +      copy = v->MOV(dst, src);
 +      copy->force_writemask_all = inst->force_writemask_all;
 +      copy->src[0].negate = negate;
 +   }
 +   assert(copy->regs_written == written);
 +
 +   return copy;
 +}
 +
  bool
  fs_visitor::opt_cse_local(bblock_t *block)
  {
 @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
           bool no_existing_temp = entry->tmp.file == BAD_FILE;
           if (no_existing_temp && !entry->generator->dst.is_null()) {
              int written = entry->generator->regs_written;
 -            int dst_width = entry->generator->dst.width / 8;
 -            assert(written % dst_width == 0);
 -
 -            fs_reg orig_dst = entry->generator->dst;
 -            fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
 -                                orig_dst.type, orig_dst.width);
 -            entry->tmp = tmp;
 -            entry->generator->dst = tmp;
 -
 -            fs_inst *copy;
 -            if (written > dst_width) {
 -               fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
 -               for (int i = 0; i < written / dst_width; i++)
 -                  sources[i] = offset(tmp, i);
 -               copy = LOAD_PAYLOAD(orig_dst, sources, written / dst_width);
 -            } else {
 -               copy = MOV(orig_dst, tmp);
 -               copy->force_writemask_all =
 -                  entry->generator->force_writemask_all;
 -            }
 +            assert((written * 8) % entry->generator->dst.width == 0);
 +
 +            entry->tmp = fs_reg(GRF, alloc.allocate(written),
 +                                entry->generator->dst.type,
 +                                entry->generator->dst.width);
 +
 +            fs_inst *copy = create_copy_instr(this, entry->generator,
 +                                              entry->tmp, false);
              entry->generator->insert_after(block, copy);
 +
 +            entry->generator->dst = entry->tmp;
           }
  
           /* dest <- temp */
           if (!inst->dst.is_null()) {
 -            int written = inst->regs_written;
 -            int dst_width = inst->dst.width / 8;
 -            assert(written == entry->generator->regs_written);
 -            assert(dst_width == entry->generator->dst.width / 8);
 +            assert(inst->regs_written == entry->generator->regs_written);
 +            assert(inst->dst.width == entry->generator->dst.width);
              assert(inst->dst.type == entry->tmp.type);
 -            fs_reg dst = inst->dst;
 -            fs_reg tmp = entry->tmp;
 -            fs_inst *copy;
 -            if (written > dst_width) {
 -               fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
 -               for (int i = 0; i < written / dst_width; i++)
 -                  sources[i] = offset(tmp, i);
 -               copy = LOAD_PAYLOAD(dst, sources, written / dst_width);
 -            } else {
 -               copy = MOV(dst, tmp);
 -               copy->force_writemask_all = inst->force_writemask_all;
 -               copy->src[0].negate = negate;
 -            }
 +
 +            fs_inst *copy = create_copy_instr(this, inst,
 +                                              entry->tmp, negate);
              inst->insert_before(block, copy);
  }
  
 -- 
 2.3.6
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote:
 Stop context creation if something failed. Previously, if something
 errored during context creation we'd segfault. Now we clean up and
 return an error.
 
 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/mesa/main/shared.c | 66 +++---
  1 file changed, 62 insertions(+), 4 deletions(-)
 
 diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
 index 0b76cc0..cc05b05 100644
 --- a/src/mesa/main/shared.c
 +++ b/src/mesa/main/shared.c
 @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
  
 mtx_init(&shared->Mutex, mtx_plain);
  
 +   /* Mutex and timestamp for texobj state validation */
 +   mtx_init(&shared->TexMutex, mtx_recursive);
 +   shared->TextureStateStamp = 0;

Do you really need to move this here?

 +
 shared->DisplayList = _mesa_NewHashTable();
 +   if (!shared->DisplayList)
 +  goto error_out;
 +
 shared->TexObjects = _mesa_NewHashTable();
 +   if (!shared->TexObjects)
 +  goto error_out;
 +
 shared->Programs = _mesa_NewHashTable();
 +   if (!shared->Programs)
 +  goto error_out;
  
 shared->DefaultVertexProgram =
gl_vertex_program(ctx->Driver.NewProgram(ctx,
 @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
   GL_FRAGMENT_PROGRAM_ARB, 0));
  
 shared->ATIShaders = _mesa_NewHashTable();
 +   if (!shared->ATIShaders)
 +  goto error_out;
 +
 shared->DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0);
  
 shared->ShaderObjects = _mesa_NewHashTable();
 +   if (!shared->ShaderObjects)
 +  goto error_out;
  
 shared->BufferObjects = _mesa_NewHashTable();
 +   if (!shared->BufferObjects)
 +  goto error_out;
  
 /* GL_ARB_sampler_objects */
 shared->SamplerObjects = _mesa_NewHashTable();
 +   if (!shared->SamplerObjects)
 +  goto error_out;
  
 /* Allocate the default buffer object */
 shared->NullBufferObj = ctx->Driver.NewBufferObject(ctx, 0);
 +   if (!shared->NullBufferObj)
 +   goto error_out;
  
 /* Create default texture objects */
 for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
 @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
};
STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS);
shared->DefaultTex[i] = ctx->Driver.NewTextureObject(ctx, 0, targets[i]);
 +
 +  if (!shared->DefaultTex[i])
 +  goto error_out;
 }
  
 /* sanity check */
 assert(shared->DefaultTex[TEXTURE_1D_INDEX]->RefCount == 1);
  
 -   /* Mutex and timestamp for texobj state validation */
 -   mtx_init(&shared->TexMutex, mtx_recursive);
 -   shared->TextureStateStamp = 0;
 -
 shared->FrameBuffers = _mesa_NewHashTable();
 +   if (!shared->FrameBuffers)
 +  goto error_out;
 +
 shared->RenderBuffers = _mesa_NewHashTable();
 +   if (!shared->RenderBuffers)
 +  goto error_out;
  
 shared->SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer,
_mesa_key_pointer_equal);
 +   if (!shared->SyncObjects)
 +  goto error_out;
  
 return shared;
 +
 +error_out:
 +   for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
 +  if (shared->DefaultTex[i]) {
 + ctx->Driver.DeleteTexture(ctx, shared->DefaultTex[i]);
 +  }
 +   }
 +
 +   _mesa_reference_buffer_object(ctx, &shared->NullBufferObj, NULL);
 +
 +   _mesa_DeleteHashTable(shared->RenderBuffers);
 +   _mesa_DeleteHashTable(shared->FrameBuffers);
 +   _mesa_DeleteHashTable(shared->SamplerObjects);
 +   _mesa_DeleteHashTable(shared->BufferObjects);
 +   _mesa_DeleteHashTable(shared->ShaderObjects);
 +   _mesa_DeleteHashTable(shared->ATIShaders);
 +   _mesa_DeleteHashTable(shared->Programs);
 +   _mesa_DeleteHashTable(shared->TexObjects);
 +   _mesa_DeleteHashTable(shared->DisplayList);
 +
 +   _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL);
 +   _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL);
 +   _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL);
 +
 +   mtx_destroy(&shared->Mutex);
 +   mtx_destroy(&shared->TexMutex);
 +
 +   free(shared);
 +   return NULL;
  }
  
  
 -- 
 1.8.5.1
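
The error path in the patch above follows the usual goto-cleanup idiom.
A minimal, self-contained sketch of that idiom follows. Everything in it
(`shared_state`, `try_alloc`, `fail_at`) is hypothetical and only stands
in for the Mesa objects; it also assumes the struct starts out zeroed so
that freeing a member that was never allocated is harmless. The real
destructors may or may not accept NULL, which the sketch does not try to
answer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared-state object with several
 * independently allocated members. */
struct shared_state {
   int *display_list;
   int *tex_objects;
   int *programs;
};

/* Test hook (hypothetical): fail the Nth call to try_alloc(), counting
 * from 1.  Zero means "never fail". */
static int fail_at;
static int alloc_count;

static void *
try_alloc(size_t size)
{
   if (fail_at != 0 && ++alloc_count == fail_at)
      return NULL;
   return calloc(1, size);
}

static void
free_shared_state(struct shared_state *shared)
{
   assert(shared != NULL);
   /* free(NULL) is a no-op, so partially built objects are fine. */
   free(shared->programs);
   free(shared->tex_objects);
   free(shared->display_list);
   free(shared);
}

static struct shared_state *
alloc_shared_state(void)
{
   /* calloc() zeroes the struct, so members that were never reached
    * are still NULL when the error path runs. */
   struct shared_state *shared = calloc(1, sizeof(*shared));
   if (!shared)
      return NULL;

   shared->display_list = try_alloc(sizeof(int));
   if (!shared->display_list)
      goto error_out;

   shared->tex_objects = try_alloc(sizeof(int));
   if (!shared->tex_objects)
      goto error_out;

   shared->programs = try_alloc(sizeof(int));
   if (!shared->programs)
      goto error_out;

   return shared;

error_out:
   /* Single exit point: release whatever was built so far. */
   free_shared_state(shared);
   return NULL;
}
```

With a single label, every new allocation only needs one extra check
and one extra line in the teardown, at the cost of requiring that the
teardown tolerate members that were never set up.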
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 14/27] nir: Add glsl_get_element_type() wrapper.

2015-05-07 Thread Timothy Arceri


On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/glsl/nir/nir_types.cpp | 5 +
  src/glsl/nir/nir_types.h   | 2 ++
  2 files changed, 7 insertions(+)
 
 diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
 index f0d0b46..249678f 100644
 --- a/src/glsl/nir/nir_types.cpp
 +++ b/src/glsl/nir/nir_types.cpp
 @@ -82,6 +82,11 @@ glsl_get_base_type(const struct glsl_type *type)
 return type->base_type;
  }
  
 +const struct glsl_type *
 +glsl_get_element_type(const struct glsl_type *type)
 +{
 +   return type->element_type();

I've sent a patch to remove the element_type() helper. I've yet to see a
case where just using is_array() and/or without_array() doesn't result in
clearer code, with the added advantage (in most cases) of free
multidimensional array support.

http://lists.freedesktop.org/archives/mesa-dev/2015-April/083195.html
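
To illustrate the point about multidimensional arrays, here is a toy
sketch. It assumes a without_array()-style helper that strips every
array level, which is what arrays of arrays would need. The `toy_type`
struct and the `toy_*` helpers are hypothetical and are not Mesa's
glsl_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type descriptor (hypothetical): an array type points at its
 * element type, a non-array type has element == NULL. */
struct toy_type {
   const char *name;
   const struct toy_type *element;
};

static bool
toy_is_array(const struct toy_type *t)
{
   return t->element != NULL;
}

/* Peels exactly one array level; returns NULL for non-arrays, so every
 * caller has to check and loop by hand for nested arrays. */
static const struct toy_type *
toy_element_type(const struct toy_type *t)
{
   return t->element;
}

/* Peels every array level; returns t itself for non-arrays, so callers
 * need neither a NULL check nor their own loop. */
static const struct toy_type *
toy_without_array(const struct toy_type *t)
{
   assert(t != NULL);
   while (toy_is_array(t))
      t = t->element;
   return t;
}
```

A caller that only wants "the thing stored in the array" can call
`toy_without_array()` unconditionally, whether the type is a scalar, a
one-dimensional array, or an array of arrays.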

 +}
  unsigned
  glsl_get_vector_elements(const struct glsl_type *type)
  {
 diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
 index 276d4ad..125f075 100644
 --- a/src/glsl/nir/nir_types.h
 +++ b/src/glsl/nir/nir_types.h
 @@ -49,6 +49,8 @@ const struct glsl_type *glsl_get_array_element(const struct glsl_type *type);
  
  const struct glsl_type *glsl_get_column_type(const struct glsl_type *type);
  
 +const struct glsl_type *glsl_get_element_type(const struct glsl_type *type);
 +
  enum glsl_base_type glsl_get_base_type(const struct glsl_type *type);
  
  unsigned glsl_get_vector_elements(const struct glsl_type *type);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 6:17 PM, Matt Turner matts...@gmail.com wrote:
 On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped:4387
HURT:  4758
GAINED:1499

 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.

 Again, I'm kind of confused and disappointed that we're just okay with
 hurting 4700 programs without more analysis. I guess I'll go do
 that...

 I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
 297 -> 161 instructions. More concerning, the number of send
 instructions drops from 36 to 12, and a loop that was 111 instructions
 long suddenly becomes

START B1 <-B0 <-B2
 cmp.ge.f0(8)    null          g42<8,8,1>D   g7<0,1,0>D
 (+f0) break(8)  JIP: 24 UIP: 24
END B1 ->B3 ->B2
START B2 <-B1
 add(8)          g42<1>D       g42<8,8,1>D   1D
 while(8)        JIP: -32
END B2 ->B1

 That deserves a lot more investigation. I'll take a gamble and say
 something is broken.

I did a little looking at that shader and it looks like NIR dead-coded
the contents of a for loop and, as a result, a bunch of stuff was
promoted to push constants, hence fewer sampler messages.  I didn't
find anything broken but, then again, that's hard to do without being
able to verifiably run the shader.  I'll try and look at the places
where we end up with more instructions.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Tapani Pälli




On 05/08/2015 12:13 AM, Ian Romanick wrote:

On 05/07/2015 12:57 AM, Marta Lofstedt wrote:

From: Marta Lofstedt marta.lofst...@intel.com

v2: only expose enums from GL_ARB_shader_image_load_store
for gles 3.1 and GL core

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
  src/mesa/main/get.c  |  6 ++
  src/mesa/main/get_hash_params.py | 17 -
  2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9898197..73739b6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
 EXTRA_END
  };

+static const int extra_ARB_shader_image_load_store_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXTRA_API_ES31,


I think you're missing the patch that adds EXTRA_API_ES31.  Did you
forget to send that one out?


Marta's series builds on top of my patch here that adds EXTRA_API_ES31:

http://lists.freedesktop.org/archives/mesa-dev/2015-May/083593.html


Also, on a few of these patches, I think the old, non-_es31 set of
requirements can be removed due to no longer being used.


+   EXTRA_END
+};
+
  EXTRA_EXT(ARB_texture_cube_map);
  EXTRA_EXT(EXT_texture_array);
  EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 513d5d2..85c2494 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -413,6 +413,14 @@ descriptor=[
  { "apis": ["GL_CORE", "GLES3"], "params": [
  # GL_ARB_draw_indirect / GLES 3.1
[ "DRAW_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0", "extra_ARB_draw_indirect_es31" ],
+# GL_ARB_shader_image_load_store / GLES 3.1
+  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_VERTEX_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms)", "extra_ARB_shader_image_load_store_es31"],
  ]},

  # Remaining enums are only in OpenGL
@@ -780,15 +788,6 @@ descriptor=[
[ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset)", "NO_EXTRA" ],
[ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings)", "NO_EXTRA" ],

-# GL_ARB_shader_image_load_store
-  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_VERTEX_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms)", "extra_ARB_shader_image_load_store_and_geometry_shader"],
-  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms)", "extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms)", "extra_ARB_shader_image_load_store"],
-
  # GL_ARB_compute_shader
[ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations)", "extra_ARB_compute_shader" ],
[ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS)", "extra_ARB_compute_shader" ],



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 5:30 PM, Ian Romanick i...@freedesktop.org wrote:
 Isn't this the same as src/util/simple_list.h?

In terms of being a two-pointer circularly linked list, yes.  In terms
of having a decent API, no.

1) Nothing in simple_list is namespaced in any way.
2) It's all macros with do-while around them instead of static inlines.
3) It assumes that you just put prev and next pointers in the
structure you're putting in the list rather than having a node you
embed.  While this provides the type safety claimed at the top of
simple_list.h, it requires that, if you want a list of struct foo's,
you use an entire struct foo as the sentinel instead of a 2 or 3
pointer list structure.
4) Point 3 isn't quite true because there is a simple_node structure.
However, it looks like a complete afterthought because none of the
iterators or manipulators do anything with it.

I could probably extend the list, but I think you get the point.
Sure, I could improve simple_list, but why do so when there's a
perfectly good list in gallium that does everything simple_list does
and more?

I did start working on replacing simple_list with the gallium list to
get us down to two lists, but we use it in things like swrast and tnl
so it turned into quite the spider-web.  Eventually, I'd like to see
simple_list die but if we can at least restrict it back to the older
parts of the code and remove it from util, that would make me happy
enough for now.
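
To make points 2 and 3 concrete, here is a rough sketch of the
embedded-node style: static inlines, a two-pointer sentinel, and a
macro to get back to the containing struct. All names here
(`list_node`, `list_init`, `list_add_tail`, `list_remove`,
`list_container`) are made up for the example and are not the
identifiers used by the gallium list or by simple_list.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-pointer circular list node. */
struct list_node {
   struct list_node *prev;
   struct list_node *next;
};

/* An empty list is a sentinel that points at itself. */
static inline void
list_init(struct list_node *head)
{
   head->prev = head;
   head->next = head;
}

static inline void
list_add_tail(struct list_node *item, struct list_node *head)
{
   assert(item != head);
   item->prev = head->prev;
   item->next = head;
   head->prev->next = item;
   head->prev = item;
}

static inline void
list_remove(struct list_node *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

/* Recover the containing struct from a pointer to its embedded node. */
#define list_container(ptr, type, member) \
   ((type *)((char *)(ptr) - offsetof(type, member)))

struct foo {
   int value;
   struct list_node link;   /* embedded node */
};

/* Walk the list; the sentinel is a bare list_node, not a struct foo. */
static int
sum_foos(struct list_node *head)
{
   int sum = 0;
   for (struct list_node *n = head->next; n != head; n = n->next)
      sum += list_container(n, struct foo, link)->value;
   return sum;
}
```

The sentinel costs two pointers regardless of how large struct foo is,
and the helpers are ordinary functions that a debugger can step into.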
--Jason

 On 04/27/2015 09:03 PM, Jason Ekstrand wrote:
 The linked list in gallium is pretty much the kernel list and we would like
 to have a C-based linked list for all of mesa.  Let's not duplicate and
 just steal the gallium one.
 ---
  src/gallium/auxiliary/Makefile.sources |   1 -
  src/gallium/auxiliary/hud/hud_private.h|   2 +-
  .../auxiliary/pipebuffer/pb_buffer_fenced.c|   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c|   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c  |   2 +-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c  |   2 +-
  src/gallium/auxiliary/util/u_debug_flush.c |   2 +-
  src/gallium/auxiliary/util/u_debug_memory.c|   2 +-
  src/gallium/auxiliary/util/u_dirty_surfaces.h  |   2 +-
  src/gallium/auxiliary/util/u_double_list.h | 146 -
  src/gallium/drivers/freedreno/freedreno_context.h  |   2 +-
  src/gallium/drivers/freedreno/freedreno_query_hw.h |   2 +-
  src/gallium/drivers/freedreno/freedreno_resource.h |   2 +-
  src/gallium/drivers/ilo/ilo_common.h   |   2 +-
  src/gallium/drivers/nouveau/nouveau_buffer.h   |   2 +-
  src/gallium/drivers/nouveau/nouveau_fence.c|   2 -
  src/gallium/drivers/nouveau/nouveau_fence.h|   2 +-
  src/gallium/drivers/nouveau/nouveau_mm.c   |   2 +-
  src/gallium/drivers/nouveau/nv30/nv30_screen.h |   2 +-
  src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   2 +-
  src/gallium/drivers/r600/compute_memory_pool.c |   2 +-
  src/gallium/drivers/r600/evergreen_compute.c   |   2 +-
  src/gallium/drivers/r600/r600_llvm.c   |   2 +-
  src/gallium/drivers/r600/r600_pipe.h   |   2 +-
  src/gallium/drivers/radeon/r600_pipe_common.h  |   2 +-
  src/gallium/drivers/radeon/radeon_vce.h|   2 +-
  src/gallium/drivers/svga/svga_context.h|   2 +-
  src/gallium/drivers/svga/svga_resource_buffer.h|   2 -
  .../drivers/svga/svga_resource_buffer_upload.c |   1 -
  src/gallium/drivers/svga/svga_screen_cache.h   |   2 +-
  src/gallium/state_trackers/nine/basetexture9.h |   2 +-
  src/gallium/state_trackers/nine/device9.h  |   2 +-
  src/gallium/state_trackers/nine/nine_state.h   |   2 +-
  src/gallium/state_trackers/nine/surface9.h |   2 +-
  src/gallium/state_trackers/omx/vid_dec.h   |   2 +-
  src/gallium/state_trackers/omx/vid_enc.h   |   2 +-
  src/gallium/winsys/radeon/drm/radeon_drm_bo.c  |   2 +-
  .../winsys/svga/drm/pb_buffer_simple_fenced.c  |   2 +-
  src/gallium/winsys/svga/drm/vmw_fence.c|   2 +-
  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c  |   2 +-
  src/util/Makefile.sources  |   1 +
  src/util/list.h| 146 +
  44 files changed, 184 insertions(+), 189 deletions(-)
  delete mode 100644 src/gallium/auxiliary/util/u_double_list.h
  create mode 100644 src/util/list.h

 diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
 index ec7547c..62e6b94 100644
 --- a/src/gallium/auxiliary/Makefile.sources
 +++ b/src/gallium/auxiliary/Makefile.sources
 @@ -197,7 +197,6 @@ C_SOURCES := \
   util/u_dirty_surfaces.h \
   util/u_dl.c \
   util/u_dl.h \
 - util/u_double_list.h \

[Mesa-dev] [PATCH 1/2] nv50: keep track of PGRAPH state in nv50_screen

2015-05-07 Thread Ilia Mirkin

Normally this is kept in nv50_context, and on switching the active
context, the state is copied from the previous context. However when the
last context is destroyed, this is lost, and a new context might later
be created. When the currently-active context is destroyed, save its
state in the screen, and restore it when setting the current context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Reported-by: Matteo Bruni matteo.myst...@gmail.com
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
Cc: mesa-sta...@lists.freedesktop.org
---
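
For reviewers, the idea reduced to a few lines. This is only a sketch:
`graph_state`, `screen` and `context` below are hypothetical, heavily
trimmed stand-ins for the driver structs, and the two functions only
model the bookkeeping, not any hardware programming.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the tracked hardware state. */
struct graph_state {
   uint32_t instance_base;
   uint16_t scissor;
};

struct context;

struct screen {
   struct context *cur_ctx;        /* context whose state the HW holds */
   struct graph_state save_state;  /* snapshot from the last current context */
};

struct context {
   struct screen *screen;
   struct graph_state state;
};

static void
context_create(struct context *ctx, struct screen *screen)
{
   assert(ctx != NULL && screen != NULL);
   ctx->screen = screen;
   ctx->state = (struct graph_state){0};

   if (!screen->cur_ctx) {
      /* No live current context: start from what the hardware was
       * last told rather than from zeroes. */
      ctx->state = screen->save_state;
      screen->cur_ctx = ctx;
   }
}

static void
context_destroy(struct context *ctx)
{
   if (ctx->screen->cur_ctx == ctx) {
      ctx->screen->cur_ctx = NULL;
      /* Keep a copy so the next context does not lose track. */
      ctx->screen->save_state = ctx->state;
   }
}
```

Without the snapshot, the second context would believe the hardware is
in its default state even though the first context had changed it.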
 src/gallium/drivers/nouveau/nv50/nv50_context.c| 11 ++--
 src/gallium/drivers/nouveau/nv50/nv50_context.h| 29 +-
 src/gallium/drivers/nouveau/nv50/nv50_screen.h | 24 ++
 .../drivers/nouveau/nv50/nv50_state_validate.c |  2 ++
 4 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index 2cfd5db..5b5d391 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -138,8 +138,11 @@ nv50_destroy(struct pipe_context *pipe)
 {
struct nv50_context *nv50 = nv50_context(pipe);
 
-   if (nv50_context_screen(nv50)->cur_ctx == nv50)
-  nv50_context_screen(nv50)->cur_ctx = NULL;
+   if (nv50->screen->cur_ctx == nv50) {
+  nv50->screen->cur_ctx = NULL;
+  /* Save off the state in case another context gets created */
+  nv50->screen->save_state = nv50->state;
+   }
nouveau_pushbuf_bufctx(nv50->base.pushbuf, NULL);
nouveau_pushbuf_kick(nv50->base.pushbuf, nv50->base.pushbuf->channel);
 
@@ -290,6 +293,10 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
pipe->get_sample_position = nv50_context_get_sample_position;
 
if (!screen->cur_ctx) {
+  /* Restore the last context's state here, normally handled during
+   * context switch
+   */
+  nv50->state = screen->save_state;
   screen->cur_ctx = nv50;
   nouveau_pushbuf_bufctx(screen->base.pushbuf, nv50->bufctx);
}
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index 45eb554..1f123ef 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -104,28 +104,7 @@ struct nv50_context {
uint32_t dirty;
boolean cb_dirty;
 
-   struct {
-  uint32_t instance_elts; /* bitmask of per-instance elements */
-  uint32_t instance_base;
-  uint32_t interpolant_ctrl;
-  uint32_t semantic_color;
-  uint32_t semantic_psize;
-  int32_t index_bias;
-  boolean uniform_buffer_bound[3];
-  boolean prim_restart;
-  boolean point_sprite;
-  boolean rt_serialize;
-  boolean flushed;
-  boolean rasterizer_discard;
-  uint8_t tls_required;
-  boolean new_tls_space;
-  uint8_t num_vtxbufs;
-  uint8_t num_vtxelts;
-  uint8_t num_textures[3];
-  uint8_t num_samplers[3];
-  uint8_t prim_size;
-  uint16_t scissor;
-   } state;
+   struct nv50_graph_state state;
 
struct nv50_blend_stateobj *blend;
struct nv50_rasterizer_stateobj *rast;
@@ -191,12 +170,6 @@ nv50_context(struct pipe_context *pipe)
return (struct nv50_context *)pipe;
 }
 
-static INLINE struct nv50_screen *
-nv50_context_screen(struct nv50_context *nv50)
-{
-   return nv50_screen(&nv50->base.screen->base);
-}
-
 /* return index used in nv50_context arrays for a specific shader type */
 static INLINE unsigned
 nv50_context_shader_stage(unsigned pipe)
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index f8ce365..881051b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -25,10 +25,34 @@ struct nv50_context;
 
 struct nv50_blitter;
 
+struct nv50_graph_state {
+   uint32_t instance_elts; /* bitmask of per-instance elements */
+   uint32_t instance_base;
+   uint32_t interpolant_ctrl;
+   uint32_t semantic_color;
+   uint32_t semantic_psize;
+   int32_t index_bias;
+   boolean uniform_buffer_bound[3];
+   boolean prim_restart;
+   boolean point_sprite;
+   boolean rt_serialize;
+   boolean flushed;
+   boolean rasterizer_discard;
+   uint8_t tls_required;
+   boolean new_tls_space;
+   uint8_t num_vtxbufs;
+   uint8_t num_vtxelts;
+   uint8_t num_textures[3];
+   uint8_t num_samplers[3];
+   uint8_t prim_size;
+   uint16_t scissor;
+};
+
 struct nv50_screen {
struct nouveau_screen base;
 
struct nv50_context *cur_ctx;
+   struct nv50_graph_state save_state;
 
struct nouveau_bo *code;
struct nouveau_bo *uniforms;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 85e19b4..116bf4b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++

[Mesa-dev] [PATCH 2/2] nvc0: keep track of PGRAPH state in nvc0_screen

2015-05-07 Thread Ilia Mirkin

See identical commit for nv50. Destroying the current context and then
creating a new one or switching to another existing context would cause
the current state to not be properly initialized, so we save it off in
the screen.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
Cc: mesa-sta...@lists.freedesktop.org
---
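
One detail differs from nv50: the saved copy has its tfb pointer
cleared. The sketch below shows one plausible reading of that, namely
that a pointer into a destroyed context must not survive in the
snapshot. That reading is an assumption, and every struct and function
here is a hypothetical, trimmed stand-in for the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object owned by one context. */
struct tfb_state {
   int active;
};

struct graph_state {
   uint32_t clip_mode;
   struct tfb_state *tfb;   /* may point into the owning context */
};

struct context;

struct screen {
   struct context *cur_ctx;
   struct graph_state save_state;
};

struct context {
   struct screen *screen;
   struct graph_state state;
   struct tfb_state tfb_storage;
};

static void
context_destroy(struct context *ctx)
{
   struct screen *screen = ctx->screen;

   if (screen->cur_ctx == ctx) {
      screen->cur_ctx = NULL;
      screen->save_state = ctx->state;
      /* Plain values can be copied, but this pointer is about to
       * dangle, so drop it from the snapshot. */
      screen->save_state.tfb = NULL;
   }
}

static void
switch_context(struct context *to)
{
   assert(to != NULL && to->screen != NULL);
   struct context *from = to->screen->cur_ctx;

   if (from)
      to->state = from->state;              /* live previous context */
   else
      to->state = to->screen->save_state;   /* previous one is gone */

   to->screen->cur_ctx = to;
}
```

The switch path needs the same fallback as context creation, since an
already existing context can become current after the previous current
context has been destroyed.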
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c|  7 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 24 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 25 ++
 .../drivers/nouveau/nvc0/nvc0_state_validate.c |  2 ++
 4 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
index 7662fb5..7904984 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
@@ -139,8 +139,12 @@ nvc0_destroy(struct pipe_context *pipe)
 {
struct nvc0_context *nvc0 = nvc0_context(pipe);
 
-   if (nvc0->screen->cur_ctx == nvc0)
+   if (nvc0->screen->cur_ctx == nvc0) {
   nvc0->screen->cur_ctx = NULL;
+  nvc0->screen->save_state = nvc0->state;
+  nvc0->screen->save_state.tfb = NULL;
+   }
+
/* Unset bufctx, we don't want to revalidate any resources after the flush.
 * Other contexts will always set their bufctx again on action calls.
 */
@@ -303,6 +307,7 @@ nvc0_create(struct pipe_screen *pscreen, void *priv)
pipe->get_sample_position = nvc0_context_get_sample_position;
 
if (!screen->cur_ctx) {
+  nvc0->state = screen->save_state;
   screen->cur_ctx = nvc0;
   nouveau_pushbuf_bufctx(screen->base.pushbuf, nvc0->bufctx);
}
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index ef251f3..a8d7593 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -113,29 +113,7 @@ struct nvc0_context {
uint32_t dirty;
uint32_t dirty_cp; /* dirty flags for compute state */
 
-   struct {
-  boolean flushed;
-  boolean rasterizer_discard;
-  boolean early_z_forced;
-  boolean prim_restart;
-  uint32_t instance_elts; /* bitmask of per-instance elements */
-  uint32_t instance_base;
-  uint32_t constant_vbos;
-  uint32_t constant_elts;
-  int32_t index_bias;
-  uint16_t scissor;
-  uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */
-  uint8_t num_vtxbufs;
-  uint8_t num_vtxelts;
-  uint8_t num_textures[6];
-  uint8_t num_samplers[6];
-  uint8_t tls_required; /* bitmask of shader types using l[] */
-  uint8_t c14_bound; /* whether immediate array constbuf is bound */
-  uint8_t clip_enable;
-  uint32_t clip_mode;
-  uint32_t uniform_buffer_bound[5];
-  struct nvc0_transform_feedback_state *tfb;
-   } state;
+   struct nvc0_graph_state state;
 
struct nvc0_blend_stateobj *blend;
struct nvc0_rasterizer_stateobj *rast;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
index 8a1991f..bce0f4a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
@@ -27,10 +27,35 @@ struct nvc0_context;
 
 struct nvc0_blitter;
 
+struct nvc0_graph_state {
+   boolean flushed;
+   boolean rasterizer_discard;
+   boolean early_z_forced;
+   boolean prim_restart;
+   uint32_t instance_elts; /* bitmask of per-instance elements */
+   uint32_t instance_base;
+   uint32_t constant_vbos;
+   uint32_t constant_elts;
+   int32_t index_bias;
+   uint16_t scissor;
+   uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */
+   uint8_t num_vtxbufs;
+   uint8_t num_vtxelts;
+   uint8_t num_textures[6];
+   uint8_t num_samplers[6];
+   uint8_t tls_required; /* bitmask of shader types using l[] */
+   uint8_t c14_bound; /* whether immediate array constbuf is bound */
+   uint8_t clip_enable;
+   uint32_t clip_mode;
+   uint32_t uniform_buffer_bound[5];
+   struct nvc0_transform_feedback_state *tfb;
+};
+
 struct nvc0_screen {
struct nouveau_screen base;
 
struct nvc0_context *cur_ctx;
+   struct nvc0_graph_state save_state;
 
int num_occlusion_queries_active;
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
index 6051f12..d3ad81d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
@@ -543,6 +543,8 @@ nvc0_switch_pipe_context(struct nvc0_context *ctx_to)
 
if (ctx_from)
   ctx_to->state = ctx_from->state;
+   else
+  ctx_to->state = ctx_to->screen->save_state;
 
ctx_to->dirty = ~0;
ctx_to->viewports_dirty = ~0;
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Michel Dänzer

On 08.05.2015 03:24, Tom Stellard wrote:
 For this particular situation, I'm happy with any solution that:
 
 1. Allows a user to install the icd file to /etc if he or she wants to.

--sysconfdir=/etc

That covers drirc as well.


 2. Does not require the user to read the spec to know that /etc is the
 correct place to install it.

I think the above is pretty standard for autotools projects. I think it
would be better to document this in the appropriate place(s) for OpenCL
users than to add another convoluted option which doesn't really add any
flexibility.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 8:49 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Thu, May 7, 2015 at 6:17 PM, Matt Turner matts...@gmail.com wrote:
 On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped:4387
HURT:  4758
GAINED:1499

 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.

 Again, I'm kind of confused and disappointed that we're just okay with
 hurting 4700 programs without more analysis. I guess I'll go do
 that...

 I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
 297 -> 161 instructions. More concerning, the number of send
 instructions drops from 36 to 12, and a loop that was 111 instructions
 long suddenly becomes

START B1 <-B0 <-B2
 cmp.ge.f0(8)    null          g42<8,8,1>D   g7<0,1,0>D
 (+f0) break(8)  JIP: 24 UIP: 24
END B1 ->B3 ->B2
START B2 <-B1
 add(8)          g42<1>D       g42<8,8,1>D   1D
 while(8)        JIP: -32
END B2 ->B1

 That deserves a lot more investigation. I'll take a gamble and say
 something is broken.

 I did a little looking at that shader and it looks like NIR dead-coded
 the contents of a for loop and, as a result, a bunch of stuff was
 promoted to push constants, hence fewer sampler messages.  I didn't
 find anything broken but, then again, that's hard to do without being
 able to verifiably run the shader.  I'll try and look at the places
 where we end up with more instructions.
 --Jason

Looking at the assembly even closer, it looks like NIR did 100% the
right thing.  The shader had a for loop that computes a bunch of
values that either don't get used at all or are over-written before
they are used.  (I didn't check every value written in the loop but I
did check a good half-dozen or so.)  NIR, probably thanks to SSA,
realized that these values were never used for anything, and
dead-coded the entire contents of the for loop.  The result was that
the 12 (yes, 12) pull constant loads inside the loop went away and the
9 after the loop were promoted to push constants.  Unfortunately, NIR
isn't yet smart enough to remove the loop entirely but an empty loop
isn't nearly as expensive as sampler invocations so I'm not too
worried about it.
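
Here is a toy C picture of the same pattern, not taken from the actual
shader: a loop whose results are all overwritten before anything reads
them, and what is left once the body is dead-coded. The function names
are made up for the illustration.

```c
#include <assert.h>

/* Before: the loop writes `scratch`, but every value it writes is
 * overwritten before being read, so the loop body is dead. */
static int
before_dce(const int *data, int n)
{
   assert(n >= 0);
   int scratch = 0;

   for (int i = 0; i < n; i++)
      scratch = data[i] * 2;   /* never read */

   scratch = 7;                /* overwrites whatever the loop left */
   return scratch + n;
}

/* After: the body is gone.  The empty loop is kept on purpose to
 * mirror the case where the loop itself is not removed yet. */
static int
after_dce(const int *data, int n)
{
   (void)data;                 /* the loads are no longer needed */

   for (int i = 0; i < n; i++)
      ;                        /* empty loop: only the counter remains */

   return 7 + n;
}
```

Both functions return the same value for any input, but the second one
no longer touches `data` at all, which is the analogue of the loads
inside the loop going away.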

I'll try and take a look at some of the hurt programs tomorrow.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 6:17 PM, Matt Turner matts...@gmail.com wrote:
 On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs: 1860859 -> 1848166 (-0.68%)
helped:4387
HURT:  4758
GAINED:1499

 The gained programs are ARB vertex programs that were previously going
 through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
 programs can go through the scalar backend so they show up as gained in
 the shader-db results.

 Again, I'm kind of confused and disappointed that we're just okay with
 hurting 4700 programs without more analysis. I guess I'll go do
 that...

What confuses me more is why the results aren't better.  When we first
turned NIR on by default for FS, the shader-db results looked a lot
better.  On one branch (wip/nir-by-default-v2) I applied the ATTR
copy-prop and we had the following:

GLSL IR vs. NIR shader-db results on Broadwell (VS only):

   total instructions in shared programs: 7106293 -> 7001640 (-1.47%)
   instructions in affected programs: 4604798 -> 4500145 (-2.27%)
   helped:16786
   HURT:  8442
   GAINED:1563
   LOST:  1526

The difference between gained/lost was due to capturing standard
error.  However, that shouldn't affect the overall numbers that
much.  I think adding the improved ffma stuff probably made a bunch of
the difference.
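
For reference, the percentages in the shader-db summaries are just the
relative change in instruction count.  A minimal sketch (percent_change is
a made-up helper for illustration, not part of shader-db):

```c
/* Relative change in percent, as printed in the shader-db summary lines.
 * Negative means fewer instructions after the change. */
double percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

With the totals quoted above, 2724483 -> 2711790 comes out at roughly
-0.47% and 7106293 -> 7001640 at roughly -1.47%.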

As far as when we turn it on, I do think that we want to do it before
the merge window closes if we can.  Being able to delete the visitor
after the branch would be really nice.  Also, we want to get people
testing it and reporting bugs because we're not going to find every
bug in every vertex shader by inspection.

 I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
 297 -> 161 instructions. More concerning, the number of send
 instructions drops from 36 to 12, and a loop that was 111 instructions
 long suddenly becomes

START B1 <-B0 <-B2
 cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
 (+f0) break(8)  JIP: 24         UIP: 24
END B1 ->B3 ->B2
START B2 <-B1
 add(8)          g42<1>D         g42<8,8,1>D     1D
 while(8)        JIP: -32
END B2 ->B1

 That deserves a lot more investigation. I'll take a gamble and say
 something is broken.

Yes, that needs some investigation.  I can also take a look at some of
the hurt and/or really helped shaders as well and see what I find.

Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
 This patch implements the binding table enable command which is also
 used to allocate a binding table pool where hardware-generated
 binding table entries are flushed into. Each binding table offset in
 the binding table pool is unique per each shader stage that are
 enabled within a batch.
 
 Also insert the required brw_tracked_state objects to enable
 hw-generated binding tables in normal render path.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
 ++
  src/mesa/drivers/dri/i965/brw_context.c|  4 ++
  src/mesa/drivers/dri/i965/brw_context.h|  5 ++
  src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
  src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
  src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
  6 files changed, 92 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
 b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 index 459165a..a58e32e 100644
 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
 +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 @@ -44,6 +44,11 @@
 #include "brw_state.h"
 #include "intel_batchbuffer.h"
  
 +/* Somehow the hw-binding table pool offset must start here, otherwise
 + * the GPU will hang
 + */
 +#define HW_BT_START_OFFSET 256;

I think we want to understand this a little better before enabling...

 +
  /**
   * Upload a shader stage's binding table as indirect state.
   *
 @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
 .emit = brw_gs_upload_binding_table,
  };
  
 +/**
 + * Hardware-generated binding tables for the resource streamer
 + */
 +void
 +gen7_disable_hw_binding_tables(struct brw_context *brw)
 +{
 +   BEGIN_BATCH(3);
 +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
 +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
 BRW_HW_BINDING_TABLE_ENABLE) |
 + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
 +   OUT_BATCH(0);
 +   ADVANCE_BATCH();
 +
 +   /* Pipe control workaround */
 +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
 +}
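
A side note on the second OUT_BATCH() above: in C, `|` binds tighter than
`?:`, so `a | h ? r : 0` is parsed as `(a | h) ? r : 0`.  That is harmless
if the field value happens to be zero, but it is fragile.  A minimal sketch
with made-up stand-in values (the TOY_* names are hypothetical, not the
real brw_defines.h constants):

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real register fields. */
#define TOY_ENABLE_FIELD  (1u << 11)
#define TOY_HSW_RESERVED  (3u << 5)

/* How "field | is_haswell ? RESERVED : 0" is actually parsed. */
uint32_t dw1_as_parsed(uint32_t field, int is_haswell)
{
    return (field | (uint32_t)is_haswell) ? TOY_HSW_RESERVED : 0;
}

/* With explicit parentheses around the conditional. */
uint32_t dw1_parenthesized(uint32_t field, int is_haswell)
{
    return field | (is_haswell ? TOY_HSW_RESERVED : 0);
}
```

The two agree when the field is zero, but with a non-zero field on a
non-Haswell part the first form yields the reserved bits and drops the
field entirely.
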
 +
 +void
 +gen7_enable_hw_binding_tables(struct brw_context *brw)
 +{
 +   if (!brw->has_resource_streamer) {
 +  gen7_disable_hw_binding_tables(brw);

I started wondering why we really need this - RS is disabled by default and
we haven't needed to do anything to disable it before.

 +  return;
 +   }
 +
 +   if (!brw->hw_bt_pool.bo) {
 +  /* From the BSpec, 3D Pipeline > Resource Streamer > Hardware Binding 
 Tables:
 +   *
 +   *  "A maximum of 16,383 Binding tables are allowed in any batch 
 buffer."
 +   */
 +  int max_size = 16383 * 4;

But does it really need this much all the time? I guess I need to go and
read the spec.

 +  brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt",
 +  max_size, 64);
 +  brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
 +   }
 +
 +   uint32_t dw1 = SET_FIELD(BRW_HW_BINDING_TABLE_ON, 
 BRW_HW_BINDING_TABLE_ENABLE);
 +   if (brw->is_haswell)
 +  dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_MOCS) | 
 HSW_HW_BINDING_TABLE_RESERVED;

These are overflowing 80 columns.

 +
 +   BEGIN_BATCH(3);
 +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
 +   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
 +   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
 + brw->hw_bt_pool.bo->size);
 +   ADVANCE_BATCH();
 +
 +   /* Pipe control workaround */
 +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);

Would you have a spec reference for this?

 +}
 +
 +void
 +gen7_reset_rs_pool_offsets(struct brw_context *brw)
 +{
 +   brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
 +}
 +
 +const struct brw_tracked_state gen7_hw_binding_tables = {
 +   .dirty = {
 +  .mesa = 0,
 +  .brw = BRW_NEW_BATCH,
 +   },
 +   .emit = gen7_enable_hw_binding_tables
 +};
 +
  /** @} */
  
  /**
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index c7e1e81..9c7ccae 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -953,6 +953,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)
 if (brw->wm.base.scratch_bo)
drm_intel_bo_unreference(brw->wm.base.scratch_bo);
  
 +   gen7_reset_rs_pool_offsets(brw);
 +   drm_intel_bo_unreference(brw->hw_bt_pool.bo);
 +   brw->hw_bt_pool.bo = NULL;
 +
 drm_intel_gem_context_destroy(brw->hw_ctx);
  
 if (ctx->swrast_context) {
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 07626af..1c72b74 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1360,6 +1360,11 @@ struct brw_context
uint32_t fast_clear_op;

Re: [Mesa-dev] [PATCH 06/27] i965: Define gather push constants opcodes

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:03PM +0300, Abdiel Janulgue wrote:
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_defines.h | 23 +++
  1 file changed, 23 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index da288d3..8079433 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -2209,6 +2209,29 @@ enum brw_wm_barycentric_interp_mode {
  #define _3DSTATE_CONSTANT_HS  0x7819 /* GEN7+ */
  #define _3DSTATE_CONSTANT_DS  0x781A /* GEN7+ */
  
 +/* Resource streamer gather constants */
 +#define _3DSTATE_GATHER_POOL_ALLOC0x791A /* GEN7.5+ */
 +#define _3DSTATE_GATHER_CONSTANT_VS   0x7834
 +#define _3DSTATE_GATHER_CONSTANT_GS   0x7835
 +#define _3DSTATE_GATHER_CONSTANT_HS   0x7836
 +#define _3DSTATE_GATHER_CONSTANT_DS   0x7837
 +#define _3DSTATE_GATHER_CONSTANT_PS   0x7838
 +/* Only required in HSW */
 +#define HSW_GATHER_CONSTANTS_RESERVED (3 << 4)
 +
 +#define BRW_GATHER_CONSTANTS_ENABLE_SHIFT 11 /* GEN7.5+ */
 +#define BRW_GATHER_CONSTANTS_ENABLE_MASK  INTEL_MASK(11, 11)
 +#define BRW_GATHER_CONSTANTS_ON   1
 +#define BRW_GATHER_CONSTANTS_OFF  0

This could be a single-bit define, as is done below for SO_FUNCTION_ENABLE:

   #define BRW_GATHER_CONSTANTS_ENABLE   (1 << 11) /* GEN7.5+ */
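
To illustrate that both styles produce the same dword bit, here is a small
self-contained sketch.  INTEL_MASK and SET_FIELD are modeled on the i965
helpers and redefined here so the example stands alone; the _BIT name and
the two functions are made up for illustration:

```c
#include <stdint.h>

/* Modeled on the i965 helpers; redefined so the example is standalone. */
#define INTEL_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))
#define SET_FIELD(value, field) \
   (((uint32_t)(value) << field ## _SHIFT) & field ## _MASK)

/* Style used in the patch: shift + mask + on/off values. */
#define BRW_GATHER_CONSTANTS_ENABLE_SHIFT 11
#define BRW_GATHER_CONSTANTS_ENABLE_MASK  INTEL_MASK(11, 11)
#define BRW_GATHER_CONSTANTS_ON           1
#define BRW_GATHER_CONSTANTS_OFF          0

/* Suggested style: one single-bit define (hypothetical name). */
#define BRW_GATHER_CONSTANTS_ENABLE_BIT   (1u << 11)

uint32_t gather_enable_via_field(int on)
{
   return on ? SET_FIELD(BRW_GATHER_CONSTANTS_ON, BRW_GATHER_CONSTANTS_ENABLE)
             : SET_FIELD(BRW_GATHER_CONSTANTS_OFF, BRW_GATHER_CONSTANTS_ENABLE);
}

uint32_t gather_enable_via_bit(int on)
{
   return on ? BRW_GATHER_CONSTANTS_ENABLE_BIT : 0;
}
```

For a one-bit field the single define is shorter at the call site; the
shift/mask pair only pays off for multi-bit fields.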

 +#define BRW_GATHER_BUFFER_VALID_SHIFT 16
 +#define BRW_GATHER_BUFFER_VALID_MASK  INTEL_MASK(31, 16)
 +#define BRW_GATHER_BINDING_TABLE_BLOCK_SHIFT  12
 +#define BRW_GATHER_BINDING_TABLE_BLOCK_MASK   INTEL_MASK(15, 12)
 +#define BRW_GATHER_CONST_BUFFER_OFFSET_SHIFT  8
 +#define BRW_GATHER_CONST_BUFFER_OFFSET_MASK   INTEL_MASK(15, 8)
 +#define BRW_GATHER_CHANNEL_MASK_SHIFT 4
 +#define BRW_GATHER_CHANNEL_MASK_MASK  INTEL_MASK(7, 4)
 +
  #define _3DSTATE_STREAMOUT0x781e /* GEN7+ */
  /* DW1 */
 # define SO_FUNCTION_ENABLE  (1 << 31)
 -- 
 1.9.1
 

Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters

2015-05-07 Thread Timothy Arceri

On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
 Now that we consider UBO constants as push constants, we need to include
 the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
 This information is needed to properly pack vector constants tightly next to
 each other.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
  3 files changed, 37 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
 b/src/mesa/drivers/dri/i965/brw_gs.c
 index 97658d5..2dc3ea1 100644
 --- a/src/mesa/drivers/dri/i965/brw_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_gs.c
 @@ -32,6 +32,7 @@
 #include "brw_vec4_gs_visitor.h"
 #include "brw_state.h"
 #include "brw_ff_gs.h"
 +#include "glsl/nir/nir_types.h"
  
 
  bool
 @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
 c.prog_data.base.base.pull_param =
rzalloc_array(NULL, const gl_constant_value *, param_count);
 c.prog_data.base.base.nr_params = param_count;
 +   c.prog_data.base.base.nr_ubo_params = 0;
 +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
 +  for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
 + const struct glsl_type *type = 
 gs->UniformBlocks[i].Uniforms[p].Type;
 + const struct glsl_type *elem = glsl_get_element_type(type);
 + int array_sz = elem ? glsl_get_array_size(type) : 1;
 + int components = elem ? glsl_get_components(elem) : 
 glsl_get_components(type);

As mentioned on the previous patch, I've sent a patch to remove the
element type helper. I'm not sure I understand why the nir wrappers
need to be used here; can you explain for my benefit?

Another way to write this without the element type could be something
like this:

const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
int array_sz = MAX2(glsl_get_array_size(type), 1);
int components = glsl_get_components(glsl_get_type_without_array(type));

You would obviously need to wrap the without_array() helper instead.

Assuming arrays of arrays support is required here in future (the spec
says uniform blocks can be arrays of arrays, but I'm not overly familiar
with the code you're working on), the only bit missing would be
multiplying the array size by the other array dimensions.
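
To make the arrays-of-arrays point concrete, here is a self-contained
sketch.  struct toy_type and the toy_* helpers are hypothetical stand-ins
for glsl_type and its wrappers, assuming a non-array type reports an array
size of 0:

```c
#include <stddef.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Toy stand-in for glsl_type. */
struct toy_type {
   int array_size;                 /* 0 when the type is not an array */
   int components;                 /* vector width of a non-array type */
   const struct toy_type *element; /* element type when array_size > 0 */
};

/* Hypothetical equivalent of a without_array() wrapper. */
const struct toy_type *toy_without_array(const struct toy_type *t)
{
   while (t->array_size > 0)
      t = t->element;
   return t;
}

/* The suggestion as written: only the outermost dimension is counted. */
int toy_slots_single_level(const struct toy_type *t)
{
   int array_sz = MAX2(t->array_size, 1);
   int components = toy_without_array(t)->components;
   return components * array_sz;
}

/* Extension for arrays of arrays: multiply through every dimension. */
int toy_slots_all_levels(const struct toy_type *t)
{
   int elems = 1;
   for (; t->array_size > 0; t = t->element)
      elems *= t->array_size;
   return elems * t->components;
}
```

For a plain vec4 or a vec4[3] the two functions agree (4 and 12 slots).
For vec4[2][3] the single-level version counts only the outer dimension,
giving 8 instead of 24.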


 + c.prog_data.base.base.nr_ubo_params += components * array_sz;
 +  }
 +   }
 c.prog_data.base.base.nr_gather_table = 0;
 c.prog_data.base.base.gather_table =
rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
 diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
 b/src/mesa/drivers/dri/i965/brw_vs.c
 index 52333c9..86bef5e 100644
 --- a/src/mesa/drivers/dri/i965/brw_vs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vs.c
 @@ -37,6 +37,7 @@
 #include "brw_state.h"
 #include "program/prog_print.h"
 #include "program/prog_parameter.h"
 +#include "glsl/nir/nir_types.h"
  
 #include "util/ralloc.h"
  
 @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
rzalloc_array(NULL, const gl_constant_value *, param_count);
 stage_prog_data->nr_params = param_count;
  
 +   stage_prog_data->nr_ubo_params = 0;
 +   if (vs) {
 +  for (int i = 0; i < vs->NumUniformBlocks; i++) {
 + for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
 +const struct glsl_type *type = 
 vs->UniformBlocks[i].Uniforms[p].Type;
 +const struct glsl_type *elem = glsl_get_element_type(type);
 +int array_sz = elem ? glsl_get_array_size(type) : 1;
 +int components = elem ? glsl_get_components(elem) : 
 glsl_get_components(type);
 +stage_prog_data->nr_ubo_params += components * array_sz;
 + }
 +  }
 +   }
 stage_prog_data->nr_gather_table = 0;
 stage_prog_data->gather_table = rzalloc_size(NULL, 
 sizeof(*stage_prog_data->gather_table) *
  (stage_prog_data-nr_params +
 diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
 b/src/mesa/drivers/dri/i965/brw_wm.c
 index 13a64d8..2060eab 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm.c
 @@ -38,6 +38,7 @@
 #include "main/samplerobj.h"
 #include "program/prog_parameter.h"
 #include "program/program.h"
 +#include "glsl/nir/nir_types.h"
 #include "intel_mipmap_tree.h"
  
 #include "util/ralloc.h"
 @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
rzalloc_array(NULL, const gl_constant_value *, param_count);
 prog_data.base.nr_params = param_count;
  
 +   prog_data.base.nr_ubo_params = 0;
 +   if (fs) {
 +  for (int i = 0; i < fs->NumUniformBlocks; i++) {
 + for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
 +const struct glsl_type *type = 
 fs->UniformBlocks[i].Uniforms[p].Type;
 +const struct glsl_type *elem = glsl_get_element_type(type);
 +

[Mesa-dev] [PATCH v2 0/6] Continu enabling Open Gl ES 3.1

2015-05-07 Thread Marta Lofstedt

Changes to my previous patch set according to comments
from Tapani Palli. This will only expose the enums
for the respective extensions to GLES 3.1 and GL Core.
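
A toy model of how an "extra" list such as { EXT(...), EXTRA_API_ES31,
EXTRA_END } can gate an enum, assuming the check passes when any listed
requirement is met.  All toy_* / TOY_* names are made up for illustration;
this is a simplified sketch, not the code in get.c:

```c
#include <stdbool.h>

/* Made-up encoding: one API marker plus extension slots from TOY_EXT_BASE. */
enum { TOY_EXTRA_END = 0, TOY_EXTRA_API_ES31 = 1, TOY_EXT_BASE = 100 };

struct toy_context {
   bool is_gles31;
   bool ext_enabled[4];  /* indexed by (entry - TOY_EXT_BASE) */
};

/* An enum is queryable if its list is empty or any entry is satisfied. */
bool toy_check_extra(const struct toy_context *ctx, const int *extra)
{
   bool checked = false;
   bool found = false;

   for (const int *e = extra; *e != TOY_EXTRA_END; e++) {
      checked = true;
      if (*e == TOY_EXTRA_API_ES31) {
         if (ctx->is_gles31)
            found = true;
      } else if (ctx->ext_enabled[*e - TOY_EXT_BASE]) {
         found = true;
      }
   }
   return !checked || found;
}
```

Under this model a GLES 3.1 context can query the enum even when the
desktop extension flag is off, and a desktop context still needs the
extension to be enabled.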

Marta Lofstedt (6):
  mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
  mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1
  mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1
  mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1

 src/mesa/main/get.c  | 36 
 src/mesa/main/get_hash_params.py | 88 
 2 files changed, 80 insertions(+), 44 deletions(-)

-- 
1.9.1


[Mesa-dev] [PATCH v2 2/6] mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2 :  only expose ARB_shader_atomic_counters enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 23 +--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 73739b6..f5318d5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -361,6 +361,12 @@ static const int extra_ARB_shader_image_load_store_es31[] 
= {
EXTRA_END
 };
 
+static const int extra_ARB_shader_atomic_counters_es31[] = {
+   EXT(ARB_shader_atomic_counters),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 85c2494..f9bf749 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -421,6 +421,18 @@ descriptor=[
   [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31],
   [ MAX_FRAGMENT_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31],
   [ MAX_COMBINED_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store_es31],
+# GL_ARB_shader_atomic_counters / GLES 3.1
+  [ ATOMIC_COUNTER_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_ATOMIC_COUNTER_BUFFER_BINDINGS, 
CONTEXT_INT(Const.MaxAtomicBufferBindings), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_ATOMIC_COUNTER_BUFFER_SIZE, CONTEXT_INT(Const.MaxAtomicBufferSize), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_VERTEX_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_VERTEX_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_FRAGMENT_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_GEOMETRY_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_COMBINED_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.MaxCombinedAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31 ],
+  [ MAX_COMBINED_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -771,18 +783,9 @@ descriptor=[
 # GL_ARB_separate_shader_objects
   [ PROGRAM_PIPELINE_BINDING, LOC_CUSTOM, TYPE_INT, 
GL_PROGRAM_PIPELINE_BINDING, NO_EXTRA ],
 
-# GL_ARB_shader_atomic_counters
-  [ ATOMIC_COUNTER_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_shader_atomic_counters ],
-  [ MAX_ATOMIC_COUNTER_BUFFER_BINDINGS, 
CONTEXT_INT(Const.MaxAtomicBufferBindings), extra_ARB_shader_atomic_counters 
],
-  [ MAX_ATOMIC_COUNTER_BUFFER_SIZE, CONTEXT_INT(Const.MaxAtomicBufferSize), 
extra_ARB_shader_atomic_counters ],
-  [ MAX_VERTEX_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters ],
-  [ MAX_VERTEX_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters ],
-  [ MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters ],
-  [ MAX_FRAGMENT_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters ],
+# GL_ARB_shader_atomic_counters and geometry shaders
   [ MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_and_geometry_shader ],
   [ MAX_GEOMETRY_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_and_geometry_shader ],
-  [ MAX_COMBINED_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.MaxCombinedAtomicBuffers), extra_ARB_shader_atomic_counters 
],
-  [ MAX_COMBINED_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters ],
 
 # GL_ARB_vertex_attrib_binding
   [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, 
CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2: only expose enums from GL_ARB_shader_image_load_store
for gles 3.1 and GL core

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9898197..73739b6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_shader_image_load_store_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 513d5d2..85c2494 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -413,6 +413,14 @@ descriptor=[
 { apis: [GL_CORE, GLES3], params: [
 # GL_ARB_draw_indirect / GLES 3.1
   [ DRAW_INDIRECT_BUFFER_BINDING, LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_draw_indirect_es31 ],
+# GL_ARB_shader_image_load_store / GLES 3.1
+  [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, 
CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_VERTEX_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_FRAGMENT_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31],
+  [ MAX_COMBINED_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store_es31],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -780,15 +788,6 @@ descriptor=[
   [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, 
CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ],
   [ MAX_VERTEX_ATTRIB_BINDINGS, 
CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ],
 
-# GL_ARB_shader_image_load_store
-  [ MAX_IMAGE_UNITS, CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store],
-  [ MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS, 
CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store],
-  [ MAX_IMAGE_SAMPLES, CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store],
-  [ MAX_VERTEX_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store],
-  [ MAX_GEOMETRY_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_and_geometry_shader],
-  [ MAX_FRAGMENT_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store],
-  [ MAX_COMBINED_IMAGE_UNIFORMS, 
CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store],
-
 # GL_ARB_compute_shader
   [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, 
CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader ],
   [ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 6/6] mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_explicit_uniform_location enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 97d3bf0..6fc0f3f 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -385,6 +385,12 @@ static const int extra_ARB_compute_shader_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_explicit_uniform_location_es31[] = {
+   EXT(ARB_explicit_uniform_location),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 985f252..6b07888 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -454,6 +454,8 @@ descriptor=[
   [ MAX_COMPUTE_SHARED_MEMORY_SIZE, CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader_es31 ],
   [ MAX_COMPUTE_UNIFORM_COMPONENTS, CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader_es31 ],
   [ MAX_COMPUTE_IMAGE_UNIFORMS, CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader_es31 ],
+# GL_ARB_explicit_uniform_location / GLES 3.1
+  [ MAX_UNIFORM_LOCATIONS, 
CONTEXT_INT(Const.MaxUserAssignableUniformLocations), 
extra_ARB_explicit_uniform_location_es31 ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -539,7 +541,6 @@ descriptor=[
   [ MAX_LIST_NESTING, CONST(MAX_LIST_NESTING), NO_EXTRA ],
   [ MAX_NAME_STACK_DEPTH, CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA ],
   [ MAX_PIXEL_MAP_TABLE, CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA ],
-  [ MAX_UNIFORM_LOCATIONS, 
CONTEXT_INT(Const.MaxUserAssignableUniformLocations), 
extra_ARB_explicit_uniform_location ],
   [ NAME_STACK_DEPTH, CONTEXT_INT(Select.NameStackDepth), NO_EXTRA ],
   [ PACK_LSB_FIRST, CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA ],
   [ PACK_SWAP_BYTES, CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_texture_multisample enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index f5318d5..dcf4f0a 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -367,6 +367,12 @@ static const int extra_ARB_shader_atomic_counters_es31[] = 
{
EXTRA_END
 };
 
+static const int extra_ARB_texture_multisample_es31[] = {
+   EXT(ARB_texture_multisample),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index f9bf749..10c32f2 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -433,6 +433,14 @@ descriptor=[
   [ MAX_GEOMETRY_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
   [ MAX_COMBINED_ATOMIC_COUNTER_BUFFERS, 
CONTEXT_INT(Const.MaxCombinedAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31 ],
   [ MAX_COMBINED_ATOMIC_COUNTERS, 
CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters_es31 ],
+# GL_ARB_texture_multisample / GLES 3.1
+  [ TEXTURE_BINDING_2D_MULTISAMPLE, LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample_es31 ],
+  [ TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY, LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample_es31 ],
+  [ MAX_COLOR_TEXTURE_SAMPLES, CONTEXT_INT(Const.MaxColorTextureSamples), 
extra_ARB_texture_multisample_es31 ],
+  [ MAX_DEPTH_TEXTURE_SAMPLES, CONTEXT_INT(Const.MaxDepthTextureSamples), 
extra_ARB_texture_multisample_es31 ],
+  [ MAX_INTEGER_SAMPLES, CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample_es31 ],
+  [ SAMPLE_MASK, CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample_es31 ],
+  [ MAX_SAMPLE_MASK_WORDS, CONST(1), extra_ARB_texture_multisample_es31 ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -718,15 +726,6 @@ descriptor=[
   [ TEXTURE_BUFFER_FORMAT_ARB, LOC_CUSTOM, TYPE_INT, 0, 
extra_texture_buffer_object ],
   [ TEXTURE_BUFFER_ARB, LOC_CUSTOM, TYPE_INT, 0, 
extra_texture_buffer_object ],
 
-# GL_ARB_texture_multisample / GL 3.2
-  [ TEXTURE_BINDING_2D_MULTISAMPLE, LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample ],
-  [ TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY, LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample ],
-  [ MAX_COLOR_TEXTURE_SAMPLES, CONTEXT_INT(Const.MaxColorTextureSamples), 
extra_ARB_texture_multisample ],
-  [ MAX_DEPTH_TEXTURE_SAMPLES, CONTEXT_INT(Const.MaxDepthTextureSamples), 
extra_ARB_texture_multisample ],
-  [ MAX_INTEGER_SAMPLES, CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample ],
-  [ SAMPLE_MASK, CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample ],
-  [ MAX_SAMPLE_MASK_WORDS, CONST(1), extra_ARB_texture_multisample ],
-
 # GL 3.0
   [ CONTEXT_FLAGS, CONTEXT_INT(Const.ContextFlags), extra_version_30 ],
 
-- 
1.9.1


[Mesa-dev] [PATCH v2 4/6] mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_texture_gather enums for
gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 9 -
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index dcf4f0a..95868bf 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -373,6 +373,12 @@ static const int extra_ARB_texture_multisample_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_texture_gather_es31[] = {
+   EXT(ARB_texture_gather),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 10c32f2..50af078 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -441,6 +441,10 @@ descriptor=[
   [ MAX_INTEGER_SAMPLES, CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample_es31 ],
   [ SAMPLE_MASK, CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample_es31 ],
   [ MAX_SAMPLE_MASK_WORDS, CONST(1), extra_ARB_texture_multisample_es31 ],
+# GL_ARB_texture_gather / GLES 3.1
+  [ MIN_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MinProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31],
+  [ MAX_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MaxProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31],
+  [ MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB, 
CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather_es31],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -774,11 +778,6 @@ descriptor=[
 # GL_ARB_texture_cube_map_array
   [ TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB, LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array ],
 
-# GL_ARB_texture_gather
-  [ MIN_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather],
-  [ MAX_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather],
-  [ MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB, 
CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather],
-
 # GL_ARB_separate_shader_objects
   [ PROGRAM_PIPELINE_BINDING, LOC_CUSTOM, TYPE_INT, 
GL_PROGRAM_PIPELINE_BINDING, NO_EXTRA ],
 
-- 
1.9.1


[Mesa-dev] [PATCH v2 5/6] mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt marta.lofst...@intel.com

v2 : only expose GL_ARB_compute_shader enums for
gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt marta.lofst...@intel.com
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 19 +--
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 95868bf..97d3bf0 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -379,6 +379,12 @@ static const int extra_ARB_texture_gather_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_compute_shader_es31[] = {
+   EXT(ARB_compute_shader),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 50af078..985f252 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -445,6 +445,15 @@ descriptor=[
   [ MIN_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MinProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31],
   [ MAX_PROGRAM_TEXTURE_GATHER_OFFSET, 
CONTEXT_INT(Const.MaxProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31],
   [ MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB, 
CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather_es31],
+# GL_ARB_compute_shader / GLES 3.1
+  [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, 
CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), 
extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_TEXTURE_IMAGE_UNITS, 
CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS), extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS, 
CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS), extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_ATOMIC_COUNTERS, CONST(MAX_COMPUTE_ATOMIC_COUNTERS), 
extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_SHARED_MEMORY_SIZE, CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_UNIFORM_COMPONENTS, CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader_es31 ],
+  [ MAX_COMPUTE_IMAGE_UNIFORMS, CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader_es31 ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -789,16 +798,6 @@ descriptor=[
   [ MAX_VERTEX_ATTRIB_RELATIVE_OFFSET, 
CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA ],
   [ MAX_VERTEX_ATTRIB_BINDINGS, 
CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA ],
 
-# GL_ARB_compute_shader
-  [ MAX_COMPUTE_WORK_GROUP_INVOCATIONS, 
CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_UNIFORM_BLOCKS, CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_TEXTURE_IMAGE_UNITS, 
CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS), extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS, 
CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS), extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_ATOMIC_COUNTERS, CONST(MAX_COMPUTE_ATOMIC_COUNTERS), 
extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_SHARED_MEMORY_SIZE, CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_UNIFORM_COMPONENTS, CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader ],
-  [ MAX_COMPUTE_IMAGE_UNIFORMS, CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader ],
-
 # GL_ARB_gpu_shader5
   [ MAX_GEOMETRY_SHADER_INVOCATIONS, 
CONST(MAX_GEOMETRY_SHADER_INVOCATIONS), extra_ARB_gpu_shader5 ],
   [ MIN_FRAGMENT_INTERPOLATION_OFFSET, 
CONTEXT_FLOAT(Const.MinFragmentInterpolationOffset), extra_ARB_gpu_shader5 ],
-- 
1.9.1
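
A minimal sketch of the gating this patch relies on, assuming that one satisfied entry in an "extra" list is enough to expose the enum (extension enabled, or the context is GLES 3.1). All `demo_*` names are hypothetical and are not part of get.c:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the entries of an extra_* array. */
enum demo_extra {
   DEMO_EXTRA_END = 0,
   DEMO_EXTRA_EXT_COMPUTE_SHADER,
   DEMO_EXTRA_API_ES31
};

/* Hypothetical, heavily reduced context state. */
struct demo_context {
   bool has_compute_shader_ext;
   bool is_es31;
};

static const int demo_extra_compute_shader_es31[] = {
   DEMO_EXTRA_EXT_COMPUTE_SHADER,
   DEMO_EXTRA_API_ES31,
   DEMO_EXTRA_END
};

/* Return true as soon as one entry of the list is satisfied. */
static bool
demo_check_extra(const struct demo_context *ctx, const int *extra)
{
   for (const int *e = extra; *e != DEMO_EXTRA_END; e++) {
      switch (*e) {
      case DEMO_EXTRA_EXT_COMPUTE_SHADER:
         if (ctx->has_compute_shader_ext)
            return true;
         break;
      case DEMO_EXTRA_API_ES31:
         if (ctx->is_es31)
            return true;
         break;
      }
   }
   return false;
}

/* Convenience wrapper: is a compute enum queryable in this configuration? */
static bool
demo_compute_enum_exposed(bool has_ext, bool is_es31)
{
   struct demo_context ctx = { has_ext, is_es31 };
   return demo_check_extra(&ctx, demo_extra_compute_shader_es31);
}
```

Under that assumption, a GLES 3.1 context can query the limits even when the desktop extension flag is not set, while a context with neither gets an error.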


Re: [Mesa-dev] [PATCH 07/27] i965: Enable gather push constants

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:04PM +0300, Abdiel Janulgue wrote:
 The 3DSTATE_GATHER_POOL_ALLOC is used to enable or disable the gather
 push constants feature within a context. This patch provides the toggle
 functionality of using gather push constants to program constant data
 within a batch.
 
 Using gather push constants requires that a gather pool be allocated so
 that the resource streamer can flush the packed constants it gathered.
 The pool is later referenced by the 3DSTATE_CONSTANT_* command to
 program the push constant data.
 
 Also introduce INTEL_UBO_GATHER to selectively enable which shader stage
 uses gather constants for ubo fetches.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_binding_tables.c | 43 
 +-
  src/mesa/drivers/dri/i965/brw_context.c| 37 ++
  src/mesa/drivers/dri/i965/brw_context.h| 10 ++
  src/mesa/drivers/dri/i965/brw_state.h  |  1 +
  4 files changed, 90 insertions(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
 b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 index c1d188e..4793fbc 100644
 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
 +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
 @@ -236,9 +236,47 @@ gen7_update_binding_table_from_array(struct brw_context *brw,
 ADVANCE_BATCH();
  }
  
 +static void
 +gen7_init_gather_pool(struct brw_context *brw)
 +{
 +   if (!brw->has_resource_streamer)
 +  return;
 +
 +   if (!brw->gather_pool.bo) {
 +  brw->gather_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "gather_pool",
 +   brw->gather_pool.size, 4096);
 +  brw->gather_pool.next_offset = 0;
 +   }
 +}
 +
 +void
 +gen7_toggle_gather_constants(struct brw_context *brw, bool enable)
 +{
 +   if (enable && !brw->has_resource_streamer)
 +  return;
 +
 +   uint32_t dw1 = brw->is_haswell ? HSW_GATHER_CONSTANTS_RESERVED : 0;
 +
 +   BEGIN_BATCH(3);
 +   OUT_BATCH(_3DSTATE_GATHER_POOL_ALLOC << 16 | (3 - 2));
 +   if (enable) {
 +  dw1 |= SET_FIELD(BRW_GATHER_CONSTANTS_ON, BRW_GATHER_CONSTANTS_ENABLE) |
 + (brw->is_haswell ? GEN7_MOCS_L3 : 0);

This should align with the previous line.

 +  OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
 +  OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
 +brw->gather_pool.bo->size);
 +   } else {
 +  OUT_BATCH(dw1);
 +  OUT_BATCH(0);
 +   }
 +   ADVANCE_BATCH();
 +}
 +
  void
  gen7_disable_hw_binding_tables(struct brw_context *brw)
  {
 +   gen7_toggle_gather_constants(brw, false);
 +
 BEGIN_BATCH(3);
 OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
 OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, BRW_HW_BINDING_TABLE_ENABLE) |
 @@ -280,6 +318,9 @@ gen7_enable_hw_binding_tables(struct brw_context *brw)
   brw->hw_bt_pool.bo->size);
 ADVANCE_BATCH();
  
 +   gen7_init_gather_pool(brw);
 +   gen7_toggle_gather_constants(brw, true);
 +
 /* Pipe control workaround */
 brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
  }
 @@ -288,6 +329,7 @@ void
  gen7_reset_rs_pool_offsets(struct brw_context *brw)
  {
 brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
 +   brw->gather_pool.next_offset = 0;
  }
  
  const struct brw_tracked_state gen7_hw_binding_tables = {
 @@ -371,5 +413,4 @@ const struct brw_tracked_state gen6_binding_table_pointers = {
 },
 .emit = gen6_upload_binding_table_pointers,
  };
 -

Not related to this patch.

  /** @} */
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index 9c7ccae..685ca70 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -67,6 +67,7 @@
  #include "tnl/tnl.h"
  #include "tnl/t_pipeline.h"
  #include "util/ralloc.h"
 +#include "util/u_atomic.h"
  
  #include "glsl/nir/nir.h"
  
 @@ -692,6 +693,25 @@ brw_get_revision(int fd)
 return revision;
  }
  
 +static void
 +brw_process_intel_gather_variable(struct brw_context *brw)
 +{
 +   uint64_t INTEL_UBO_GATHER = 0;
 +
 +   static const struct dri_debug_control gather_control[] = {
 +  { "vs", (1 << MESA_SHADER_VERTEX)},
 +  { "gs", (1 << MESA_SHADER_GEOMETRY)},
 +  { "fs", (1 << MESA_SHADER_FRAGMENT)},

You can drop the outermost ().

 +  { NULL, 0 }
 +   };
 +   uint64_t intel_ubo_gather = driParseDebugString(getenv("INTEL_UBO_GATHER"), gather_control);

Wrap this onto the next line; it overflows 80 columns.

 +   (void) p_atomic_cmpxchg(&INTEL_UBO_GATHER, 0, intel_ubo_gather);
 +
 +   brw->vs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_VERTEX));
 +   brw->gs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_GEOMETRY));
 +   brw->fs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_FRAGMENT));

Here also, you can drop the outermost ().
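
For illustration only, a standalone sketch of the same idea: a comma-separated value such as "vs,fs" is turned into a per-stage bitmask and then tested per stage. This is not the driver code. The `demo_*` names, the stage numbering and the parser are assumptions made for the example, and the flag expressions are written without the outer parentheses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stage indices; the patch uses the MESA_SHADER_* values. */
enum demo_stage {
   DEMO_STAGE_VERTEX = 0,
   DEMO_STAGE_GEOMETRY = 1,
   DEMO_STAGE_FRAGMENT = 2
};

struct demo_control {
   const char *name;
   uint64_t flag;
};

static const struct demo_control demo_gather_control[] = {
   { "vs", 1 << DEMO_STAGE_VERTEX },
   { "gs", 1 << DEMO_STAGE_GEOMETRY },
   { "fs", 1 << DEMO_STAGE_FRAGMENT },
   { NULL, 0 }
};

/* Parse a comma-separated list of stage names into a bitmask.
 * Unknown names are ignored; a NULL string yields an empty mask.
 */
static uint64_t
demo_parse_gather_string(const char *s)
{
   uint64_t mask = 0;

   if (!s)
      return 0;

   while (*s) {
      size_t len = strcspn(s, ",");

      for (const struct demo_control *c = demo_gather_control; c->name; c++) {
         if (strlen(c->name) == len && strncmp(c->name, s, len) == 0)
            mask |= c->flag;
      }

      s += len;
      if (*s == ',')
         s++;
   }

   return mask;
}

/* Test whether the given stage was selected in the mask. */
static bool
demo_stage_uses_gather(uint64_t mask, enum demo_stage stage)
{
   return mask & (1 << stage);
}
```

A caller would pass the result of `getenv()` straight to `demo_parse_gather_string()`, since an unset variable yields NULL and therefore an empty mask.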

 +}
 +
  GLboolean
  brwCreateContext(gl_api api,
const struct gl_config

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi
topi.pohjolai...@intel.com wrote:
 On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
 v2: Get rid of the block parameter and make src a const reference

 Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
 Reviewed-by: Matt Turner matts...@gmail.com
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
 
  1 file changed, 38 insertions(+), 37 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 index 43370cb..9c4ed0b 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
operands_match(a, b, negate);
  }

 +static fs_inst *
 +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)

 Did you mean 'src' to be a constant reference? It is only used for reading
 so it could be - you claim this in the commit message yourself :)

Oops...  I think what happened is that I tried to do it for
is_copy_payload not create_copy_instr.  But then is_copy_payload does
actually change it so I put it back and somehow my brain leaked it
into the commit message.  Unfortunately, it's already pushed so I
can't change it now.  However, I could make a fixup if you'd like.
--Jason

 +{
 +   int written = inst->regs_written;
 +   int dst_width = inst->dst.width / 8;
 +   fs_reg dst = inst->dst;
 +   fs_inst *copy;
 +
 +   if (written > dst_width) {
 +  fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / dst_width);
 +  for (int i = 0; i < written / dst_width; i++)
 + sources[i] = offset(src, i);
 +  copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
 +   } else {
 +  copy = v->MOV(dst, src);
 +  copy->force_writemask_all = inst->force_writemask_all;
 +  copy->src[0].negate = negate;
 +   }
 +   assert(copy->regs_written == written);
 +
 +   return copy;
 +}
 +
  bool
  fs_visitor::opt_cse_local(bblock_t *block)
  {
 @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
  bool no_existing_temp = entry->tmp.file == BAD_FILE;
  if (no_existing_temp && !entry->generator->dst.is_null()) {
 int written = entry->generator->regs_written;
 -   int dst_width = entry->generator->dst.width / 8;
 -   assert(written % dst_width == 0);
 -
 -   fs_reg orig_dst = entry->generator->dst;
 -   fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
 -   orig_dst.type, orig_dst.width);
 -   entry->tmp = tmp;
 -   entry->generator->dst = tmp;
 -
 -   fs_inst *copy;
 -   if (written > dst_width) {
 -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
 -  for (int i = 0; i < written / dst_width; i++)
 - sources[i] = offset(tmp, i);
 -  copy = LOAD_PAYLOAD(orig_dst, sources, written / dst_width);
 -   } else {
 -  copy = MOV(orig_dst, tmp);
 -  copy->force_writemask_all =
 - entry->generator->force_writemask_all;
 -   }
 +   assert((written * 8) % entry->generator->dst.width == 0);
 +
 +   entry->tmp = fs_reg(GRF, alloc.allocate(written),
 +   entry->generator->dst.type,
 +   entry->generator->dst.width);
 +
 +   fs_inst *copy = create_copy_instr(this, entry->generator,
 + entry->tmp, false);
 entry->generator->insert_after(block, copy);
 +
 +   entry->generator->dst = entry->tmp;
  }

  /* dest <- temp */
  if (!inst->dst.is_null()) {
 -   int written = inst->regs_written;
 -   int dst_width = inst->dst.width / 8;
 -   assert(written == entry->generator->regs_written);
 -   assert(dst_width == entry->generator->dst.width / 8);
 +   assert(inst->regs_written == entry->generator->regs_written);
 +   assert(inst->dst.width == entry->generator->dst.width);
 assert(inst->dst.type == entry->tmp.type);
 -   fs_reg dst = inst->dst;
 -   fs_reg tmp = entry->tmp;
 -   fs_inst *copy;
 -   if (written > dst_width) {
 -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
 -  for (int i = 0; i < written / dst_width; i++)
 - sources[i] = offset(tmp, i);
 -  copy = LOAD_PAYLOAD(dst, sources, written / dst_width);
 -   } else {
 -  copy = MOV(dst, tmp);
 -  copy->force_writemask_all = inst->force_writemask_all;
 -

[Mesa-dev] [PATCH 3/7] i965: Move texture swizzle resolving into dispatcher

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

Reviewed-by: Matt Turner matts...@gmail.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 20 +++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 16 ++--
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 16 ++--
 4 files changed, 21 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index d599ba8..9e85dd7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -983,10 +983,10 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct gl_context *ctx,
+  void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
- uint32_t tex_format,
+ uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 7ed7e18..3dddf89 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -308,14 +308,13 @@ update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct gl_context *ctx,
+brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
struct gl_texture_object *tObj,
-   uint32_t tex_format,
+   uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
struct intel_texture_object *intelObj = intel_texture_object(tObj);
uint32_t *surf;
 
@@ -801,6 +800,17 @@ update_texture_surface(struct gl_context *ctx,
   struct intel_mipmap_tree *mt = intel_obj->mt;
   const struct gl_texture_image *firstImage = obj->Image[0][obj->BaseLevel];
   const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
+
+  /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
+   * texturing functions that return a float, as our code generation always
+   * selects the .x channel (which would always be 0).
+   */
+  const bool alpha_depth = obj->DepthMode == GL_ALPHA &&
+ (firstImage->_BaseFormat == GL_DEPTH_COMPONENT ||
+  firstImage->_BaseFormat == GL_DEPTH_STENCIL);
+  const unsigned swizzle = (unlikely(alpha_depth) ? SWIZZLE_XYZW :
+brw_get_texture_swizzle(&brw->ctx, obj));
+
   unsigned format = translate_tex_format(brw, intel_obj->_Format,
  sampler->sRGBDecode);
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
@@ -810,8 +820,8 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset,
-   for_gather);
+  brw->vtbl.update_texture_surface(brw, mt, obj, format, swizzle,
+   surf_offset, for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 7e3ee67..7576b20 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -348,14 +348,13 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 }
 
 static void
-gen7_update_texture_surface(struct gl_context *ctx,
+gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
 struct gl_texture_object *obj,
-uint32_t tex_format,
+uint32_t tex_format, unsigned swizzle,
 uint32_t *surf_offset,
 bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
struct intel_texture_object *intel_obj = intel_texture_object(obj);
/* If this is a view with restricted NumLayers, then our effective depth
 * is not just the miptree depth.
@@ -363,17 +362,6 @@ gen7_update_texture_surface(struct gl_context

[Mesa-dev] [PATCH 5/7] i965: Pass texture target as parameter for surface setup

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

Also changed a couple of direct shifts into SET_FIELD().
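
As a rough sketch of what such a conversion looks like, assuming a macro that pastes `_SHIFT` and `_MASK` suffixes onto a field name. The `DEMO_*` names and bit positions below are hypothetical, chosen only for the example, and are not the i965 definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Shift a value into place and clamp it to the field's mask. */
#define DEMO_SET_FIELD(value, field) \
   (((uint32_t)(value) << field ## _SHIFT) & field ## _MASK)

/* Hypothetical field layouts, for illustration only. */
#define DEMO_SURFACE_TYPE_SHIFT   29
#define DEMO_SURFACE_TYPE_MASK    (0x7u << 29)
#define DEMO_SURFACE_FORMAT_SHIFT 18
#define DEMO_SURFACE_FORMAT_MASK  (0x1ffu << 18)

/* Pack two fields into one dword using the macro instead of raw shifts. */
static uint32_t
demo_pack_dword0(uint32_t type, uint32_t format)
{
   return DEMO_SET_FIELD(type, DEMO_SURFACE_TYPE) |
          DEMO_SET_FIELD(format, DEMO_SURFACE_FORMAT);
}
```

Compared with an open-coded `value << FIELD_SHIFT`, the macro form names the field once and, in this sketch, drops bits that would otherwise spill into a neighbouring field.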

Reviewed-by: Matt Turner matts...@gmail.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  1 +
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  4 ++--
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  4 ++--
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 0e9ede9..6f08b06 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -986,6 +986,7 @@ struct brw_context
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
+ GLenum target,
  unsigned min_layer,
  unsigned max_layer,
  uint32_t tex_format, unsigned swizzle,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 92383e1..fa4e36d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -310,7 +310,7 @@ update_buffer_texture_surface(struct gl_context *ctx,
 static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-   struct gl_texture_object *tObj,
+   struct gl_texture_object *tObj, GLenum target,
unsigned min_layer /* unused */,
unsigned max_layer /* unused */,
uint32_t tex_format, unsigned swizzle /* unused */,
@@ -352,10 +352,10 @@ brw_update_texture_surface(struct brw_context *brw,
   }
}
 
-   surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
- BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
- BRW_SURFACE_CUBEFACE_ENABLES |
- tex_format << BRW_SURFACE_FORMAT_SHIFT);
+   surf[0] = SET_FIELD(translate_tex_target(target), BRW_SURFACE_TYPE) |
+ BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
+ BRW_SURFACE_CUBEFACE_ENABLES |
+ tex_format << BRW_SURFACE_FORMAT_SHIFT;
 
surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
 
@@ -827,7 +827,7 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj,
+  brw->vtbl.update_texture_surface(brw, mt, obj, obj->Target,
obj->MinLayer, obj->MinLayer + depth,
format, swizzle, surf_offset, for_gather);
}
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 9755236..89dba40 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -350,7 +350,7 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 static void
 gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
-struct gl_texture_object *obj,
+struct gl_texture_object *obj, GLenum target,
 unsigned min_layer,
 unsigned max_layer,
 uint32_t tex_format, unsigned swizzle,
@@ -361,7 +361,7 @@ gen7_update_texture_surface(struct brw_context *brw,
if (for_gather && tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT)
   tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
 
-   gen7_emit_texture_surface_state(brw, mt, obj->Target,
+   gen7_emit_texture_surface_state(brw, mt, target,
min_layer, max_layer,
obj->MinLevel + obj->BaseLevel,
obj->MinLevel + intel_obj->_MaxLevel + 1,
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 580c1a3..9858f5f 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -249,7 +249,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
 static void
 gen8_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
-struct gl_texture_object *obj,
+

[Mesa-dev] [PATCH 6/7] i965: Pass slice details as parameters for surface setup

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

Also changed a couple of direct shifts into SET_FIELD().

Fixes: arb_copy_image-formats -auto -fbo on ILK. In principle,
minimum level settings are only for TextureView to use. We,
however, also take advantage of that internally when blitting.
Before this patch this wasn't taken into account for ILK in the
surface setup.

v2:
   - Removed extra whitespace and switched tabs to spaces (Matt)
   - Added assertion on minimum level (Ken).

v3 (Curro): Reorder min_layer and effective_depth

Reviewed-by: Matt Turner matts...@gmail.com (v1)
Reviewed-by: Kenneth Graunke kenn...@whitecape.org (v1)
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase.  Pass a half-open interval of
  levels like emit_texture_surface_state does. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  3 ++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 +++
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 11 +++-
 4 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 6f08b06..2eb4251 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -985,10 +985,11 @@ struct brw_context
{
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
- struct gl_texture_object *tObj,
  GLenum target,
  unsigned min_layer,
  unsigned max_layer,
+ unsigned min_level,
+ unsigned max_level,
  uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index fa4e36d..de4bdc5 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -310,14 +310,15 @@ update_buffer_texture_surface(struct gl_context *ctx,
 static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-   struct gl_texture_object *tObj, GLenum target,
+   GLenum target,
unsigned min_layer /* unused */,
unsigned max_layer /* unused */,
+   unsigned min_level,
+   unsigned max_level,
uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
 {
-   struct intel_texture_object *intelObj = intel_texture_object(tObj);
uint32_t *surf;
 
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
@@ -359,16 +360,16 @@ brw_update_texture_surface(struct brw_context *brw,
 
surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
 
-   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
- (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
- (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
+   surf[2] = SET_FIELD(max_level - min_level - 1, BRW_SURFACE_LOD) |
+ SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
+ SET_FIELD(mt->logical_height0 - 1, BRW_SURFACE_HEIGHT);
 
-   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
- (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
- (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
+   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
+ SET_FIELD(mt->logical_depth0 - 1, BRW_SURFACE_DEPTH) |
+ SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
 
-   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
-  SET_FIELD(tObj->BaseLevel - mt->first_level, BRW_SURFACE_MIN_LOD));
+   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
+ SET_FIELD(min_level - mt->first_level, BRW_SURFACE_MIN_LOD);
 
surf[5] = mt->align_h == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0;
 
@@ -827,8 +828,16 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj, obj->Target,
+  /* Minimum level is only supported for TextureView but internally it is
+   * also taken advantage of by meta blit path. The former is only enabled
+   * from gen7 onwards.
+   */
+  assert(brw->gen >= 7 || obj->MinLevel == 0 ||

[Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection

2015-05-07 Thread Topi Pohjolainen

Normally this is always needed, but for internal blits and clears
we need to be able to disable it.

CC: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/i965/brw_state.h |  3 ++-
 src/mesa/drivers/dri/i965/gen6_wm_state.c | 14 +++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 18449c4..26fdae6 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -339,7 +339,8 @@ gen6_upload_wm_state(struct brw_context *brw,
  bool multisampled_fbo, int min_inv_per_frag,
  bool dual_source_blend_enable, bool kill_enable,
  bool color_buffer_write_enable, bool msaa_enabled,
- bool line_stipple_enable, bool polygon_stipple_enable);
+ bool line_stipple_enable, bool polygon_stipple_enable,
+ bool statistic_enable);
 
 /* gen6_sf_state.c */
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index e5b0f5a..7081eb7 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -73,7 +73,8 @@ gen6_upload_wm_state(struct brw_context *brw,
  bool multisampled_fbo, int min_inv_per_frag,
  bool dual_source_blend_enable, bool kill_enable,
  bool color_buffer_write_enable, bool msaa_enabled,
- bool line_stipple_enable, bool polygon_stipple_enable)
+ bool line_stipple_enable, bool polygon_stipple_enable,
+ bool statistic_enable)
 {
uint32_t dw2, dw4, dw5, dw6, ksp0, ksp2;
 
@@ -109,7 +110,10 @@ gen6_upload_wm_state(struct brw_context *brw,
}
 
dw2 = dw4 = dw5 = dw6 = ksp2 = 0;
-   dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+  dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
dw5 |= GEN6_WM_LINE_AA_WIDTH_1_0;
dw5 |= GEN6_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
@@ -300,6 +304,9 @@ upload_wm_state(struct brw_context *brw)
 ctx->Multisample.SampleAlphaToCoverage ||
 prog_data->uses_omask;
 
+   /* Rendering against the gl-context is always taken into account. */
+   const bool statistic_enable = true;
+
/* _NEW_LINE | _NEW_POLYGON | _NEW_BUFFERS | _NEW_COLOR |
 * _NEW_MULTISAMPLE
 */
@@ -308,7 +315,8 @@ upload_wm_state(struct brw_context *brw)
 dual_src_blend_enable, kill_enable,
 brw_color_buffer_write_enabled(brw),
 ctx->Multisample.Enabled,
-ctx->Line.StippleFlag, ctx->Polygon.StippleFlag);
+ctx->Line.StippleFlag, ctx->Polygon.StippleFlag,
+statistic_enable);
 }
 
 const struct brw_tracked_state gen6_wm_state = {
-- 
1.9.3


Re: [Mesa-dev] i965: Revision of texture surface setup refactoring

2015-05-07 Thread Francisco Jerez

Pohjolainen, Topi topi.pohjolai...@intel.com writes:

 On Wed, May 06, 2015 at 02:56:53PM +0300, Francisco Jerez wrote:
 Hi!
 
 Topi Pohjolainen topi.pohjolai...@intel.com writes:
 
  This series moves all the decision making of values into a common
  hardware-independent dispatcher while leaving the hardware-specific
  logic to deal with formatting only.
 
  Curro needed a similar refactor for gen7 and gen8. However, that
  makes it harder to apply the changes I needed, which extend all the
  way to gen4. Ken helped me to notice that my refactoring can in
  fact address both relatively easily.
 
  For context, I added the patch from Curro that makes use of the
  texture surface setup logic along with a small patch making it
  compatible with the surface state refactoring found here.
 
  Curro, what do you think? I'm not too happy with reverting your
  work but overall this way it becomes cleaner, I think.
 
 
 *Shrug*, it seems weird to me that you opted to revert my patches even
 though they are closer to where you want to get at than it was before my
 patches.
 
 This is the current interface:
   void (*emit_texture_surface_state)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  GLenum target,
  unsigned min_layer,
  unsigned max_layer,
  unsigned min_level,
  unsigned max_level,
  unsigned format,
  unsigned swizzle,
  uint32_t *surf_offset,
  bool rw, bool for_gather);
 
 This is the old interface we both wanted to get rid of:
   void (*update_texture_surface)(struct gl_context *ctx,
  unsigned unit,
  uint32_t *surf_offset,
  bool for_gather);

 
 This is the interface introduced by this series:
 void (*update_texture_surface)(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
GLenum target, uint32_t effective_depth,
uint32_t min_layer,
uint32_t min_lod, uint32_t mip_count,
uint32_t tex_format, int swizzle,
uint32_t *surf_offset,
bool for_gather);
 
 AFAIK the only difference between your proposal and mine is the name
 (IMHO emit_texture_surface_state is more consistent with the other
 emit_*_surface_state hooks with similar semantics), the ordering of
 arguments (and I find the ordering and naming of your effective_depth,
 min_layer, min_lod and mip_count arguments rather asymmetric, they
 are both pairs determining an interval of either layers or levels, it
 doesn't make much sense to me that they are named and ordered
 inconsistently in your series), the fact that you're using a min
 level/layer index + count instead of half-open intervals like I did, and
 the fact that you're missing an rw argument which is required for
 ARB_shader_image_load_store support.
 
 I fail to see why a revert is justified or desirable, and I fail to see
 how your proposal will work better on Gen4, since the difference between
 the two interfaces is mostly cosmetic.
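
To make the two conventions concrete, here is a small sketch that converts a half-open level interval [min_level, max_level) into the values the hardware fields take, following the arithmetic used in PATCH 6/7 of this thread (`max_level - min_level - 1` and `min_level - mt->first_level`). The struct and function names are hypothetical:

```c
#include <assert.h>

/* Hypothetical holder for the two level-related field values. */
struct demo_lod_fields {
   unsigned lod;      /* number of mipmap levels minus one */
   unsigned min_lod;  /* first level, relative to the miptree's first level */
};

/* Convert a half-open interval of absolute levels into field values. */
static struct demo_lod_fields
demo_levels_to_fields(unsigned min_level, unsigned max_level,
                      unsigned first_level)
{
   struct demo_lod_fields f;

   assert(max_level > min_level);     /* the interval must not be empty */
   assert(min_level >= first_level);  /* cannot start before the miptree */

   f.lod = max_level - min_level - 1;
   f.min_lod = min_level - first_level;
   return f;
}
```

With the half-open form the caller passes absolute levels and the "minus one" and "relative to first level" adjustments happen in one place.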

 I'm just looking at the end result. Here we don't need to introduce a new entry
 to the jump table, the changes are kept to the minimum and we both get an
 applicable interface. I didn't really intentionally choose between the
 interfaces - this was the outcome of trying to keep it as unintrusive as I
 could.

I've rebased your series on top of master.  In fact the rebased version
is a lot less churn, two of your patches (PATCH 3 and 5) that were
re-applying changes you had previously reverted become empty, and the
diffstat goes down from +131/-195 to +89/-143.

There were a number of subtle differences between the two interfaces
that weren't obvious at all by looking at the end result, and I only
noticed while looking at the actual diff between master (without
reverts) and your branch, namely:

 - Your mip_count argument expects the number of mipmap levels minus
   one, instead of the actual number of mipmap levels (we already
   discussed this earlier today to some extent).

 - Your min_lod argument isn't the absolute starting mipmap level,
   instead it's relative to mt->first_level.  This could have bitten us
   in the future if some caller forgot to take this offset into
   account.

 - gen7_emit_texture_surface_state wasn't taking into account the
   work-around for the R32G32_FLOAT format in the texture

[Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Neil Roberts

opt_sampler_eot enables a direct write to framebuffer from a sample.
In order to do this the sample message needs to have a message header
so if there wasn't one already then the function adds one. In addition
the function sets the destination register to null because it's no
longer used. However it was only doing this in cases where it was
adding a message header. This patch just moves setting the destination
so that it happens even if there's a message header. In practice this
doesn't seem to make any difference but it's a bit cleaner.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1ca7ca6..72d408b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()
 
    tex_inst->offset |= fb_write->target << 24;
    tex_inst->eot = true;
+   tex_inst->dst = reg_null_ud;
    fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);
 
/* If a header is present, marking the eot is sufficient. Otherwise, we need
@@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
    tex_inst->header_present = true;
    tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
    tex_inst->src[0] = send_header;
-   tex_inst->dst = reg_null_ud;
 
    return true;
 }
-- 
1.9.3


Re: [Mesa-dev] [PATCH 01/27] i965: Define HW-binding table and resource streamer control opcodes

2015-05-07 Thread Pohjolainen, Topi

On Sun, May 03, 2015 at 06:04:05PM +0300, Pohjolainen, Topi wrote:
 On Tue, Apr 28, 2015 at 11:07:58PM +0300, Abdiel Janulgue wrote:
  Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_context.h |  1 +
   src/mesa/drivers/dri/i965/brw_defines.h | 24 
   src/mesa/drivers/dri/i965/intel_reg.h   |  3 +++
   3 files changed, 28 insertions(+)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
  b/src/mesa/drivers/dri/i965/brw_context.h
  index a6d6787..07626af 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.h
  +++ b/src/mesa/drivers/dri/i965/brw_context.h
  @@ -1105,6 +1105,7 @@ struct brw_context
  bool no_simd8;
  bool use_rep_send;
  bool scalar_vs;
  +   bool has_resource_streamer;
 
 This should go to the next patch. Other than that all looks good - I checked
 the values against bspec and I couldn't find anything amiss.
 
 Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
 
   
  /**
   * Some versions of Gen hardware don't do centroid interpolation 
  correctly
  diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
  b/src/mesa/drivers/dri/i965/brw_defines.h
  index a97a944..da288d3 100644
  --- a/src/mesa/drivers/dri/i965/brw_defines.h
  +++ b/src/mesa/drivers/dri/i965/brw_defines.h
  @@ -1586,6 +1586,30 @@ enum brw_message_target {
   #define _3DSTATE_BINDING_TABLE_POINTERS_GS 0x7829 /* GEN7+ */
   #define _3DSTATE_BINDING_TABLE_POINTERS_PS 0x782A /* GEN7+ */
   
  +#define _3DSTATE_BINDING_TABLE_POOL_ALLOC   0x7919 /* GEN7.5+ */
  +#define BRW_HW_BINDING_TABLE_ENABLE_SHIFT   11 /* GEN7.5+ */
  +#define BRW_HW_BINDING_TABLE_ENABLE_MASK    INTEL_MASK(11, 11)

Actually we usually do the booleans just as:

 #define BRW_HW_BINDING_TABLE_ENABLE (1 << 11)
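
To illustrate the suggestion, here is a small sketch comparing a single-bit flag with a shift/mask pair for the same one-bit field. EXAMPLE_MASK is a hypothetical stand-in for a high/low bit-range helper; its definition is an assumption made for this sketch and is not copied from the driver headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit-range helper: bits [high:low] set, everything else
 * clear.  Only valid for ranges narrower than 32 bits.
 */
#define EXAMPLE_MASK(high, low) \
   ((((uint32_t)1 << ((high) - (low) + 1)) - 1) << (low))

/* A boolean field expressed as a single bit. */
#define EXAMPLE_HW_BINDING_TABLE_ENABLE ((uint32_t)1 << 11)

/* The same field expressed as a shift/mask pair. */
#define EXAMPLE_ENABLE_SHIFT 11
#define EXAMPLE_ENABLE_MASK  EXAMPLE_MASK(11, 11)

/* Set the enable bit using the single-bit form. */
static uint32_t
example_set_enable(uint32_t dw)
{
   return dw | EXAMPLE_HW_BINDING_TABLE_ENABLE;
}

/* Write a value into the field using the shift/mask form. */
static uint32_t
example_set_enable_field(uint32_t dw, uint32_t value)
{
   assert(value <= 1); /* one-bit field */
   return (dw & ~EXAMPLE_ENABLE_MASK) |
          ((value << EXAMPLE_ENABLE_SHIFT) & EXAMPLE_ENABLE_MASK);
}
```

Both forms produce the same dword for a one-bit field; the single-bit define just needs less boilerplate at the call site.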

  +#define BRW_HW_BINDING_TABLE_ON             1
  +#define BRW_HW_BINDING_TABLE_OFF            0
  +#define GEN7_HW_BT_MOCS_SHIFT               7
  +#define GEN7_HW_BT_MOCS_MASK                INTEL_MASK(10, 7)
  +#define GEN8_HW_BT_MOCS_SHIFT               0
  +#define GEN8_HW_BT_MOCS_MASK                INTEL_MASK(6, 0)
  +/* Only required in HSW */
  +#define HSW_HW_BINDING_TABLE_RESERVED       (3 << 5)
  +
  +#define _3DSTATE_BINDING_TABLE_EDIT_VS  0x7843 /* GEN7.5 */
  +#define _3DSTATE_BINDING_TABLE_EDIT_GS  0x7844 /* GEN7.5 */
  +#define _3DSTATE_BINDING_TABLE_EDIT_HS  0x7845 /* GEN7.5 */
  +#define _3DSTATE_BINDING_TABLE_EDIT_DS  0x7846 /* GEN7.5 */
  +#define _3DSTATE_BINDING_TABLE_EDIT_PS  0x7847 /* GEN7.5 */
  +#define BRW_BINDING_TABLE_INDEX_SHIFT   16
  +#define BRW_BINDING_TABLE_INDEX_MASK    INTEL_MASK(23, 16)
  +
  +#define BRW_BINDING_TABLE_EDIT_TARGET_ALL   3
  +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE1 2
  +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE0 1
  +
   #define _3DSTATE_SAMPLER_STATE_POINTERS    0x7802 /* GEN6+ */
   # define PS_SAMPLER_STATE_CHANGE   (1 << 12)
   # define GS_SAMPLER_STATE_CHANGE   (1 << 9)
  diff --git a/src/mesa/drivers/dri/i965/intel_reg.h 
  b/src/mesa/drivers/dri/i965/intel_reg.h
  index 488fb5b..9cdb3ca 100644
  --- a/src/mesa/drivers/dri/i965/intel_reg.h
  +++ b/src/mesa/drivers/dri/i965/intel_reg.h
  @@ -47,6 +47,9 @@
   /* Load a value from memory into a register.  Only available on Gen7+. */
   #define GEN7_MI_LOAD_REGISTER_MEM  (CMD_MI | (0x29 << 23))
   # define MI_LOAD_REGISTER_MEM_USE_GGTT (1 << 22)
  +/* Haswell RS control */
  +#define MI_RS_CONTROL          (CMD_MI | (0x6 << 23))
  +#define MI_RS_STORE_DATA_IMM   (CMD_MI | (0x2b << 23))
   
   /** @{
*
  -- 
  1.9.1
  

[Mesa-dev] [PATCH 7/7] i965: Drop the update_texture_surface vtbl hook.

2015-05-07 Thread Francisco Jerez

At this point the update_texture_surface and
emit_texture_surface_state hooks are almost equivalent, the only
significant difference is that emit_texture_surface_state supports
binding read-write surfaces.  The name of the latter is more
consistent with the other emit_something_surface_state hooks, so let's
keep it.
---
 src/mesa/drivers/dri/i965/brw_context.h   | 10 --
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 37 ---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 24 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 18 ---
 4 files changed, 23 insertions(+), 66 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 2eb4251..780edba 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -983,16 +983,6 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- GLenum target,
- unsigned min_layer,
- unsigned max_layer,
- unsigned min_level,
- unsigned max_level,
- uint32_t tex_format, unsigned swizzle,
- uint32_t *surf_offset,
- bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
   struct gl_renderbuffer *rb,
   bool layered, unsigned unit,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index de4bdc5..870d699 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -308,16 +308,17 @@ update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct brw_context *brw,
-   struct intel_mipmap_tree *mt,
-   GLenum target,
-   unsigned min_layer /* unused */,
-   unsigned max_layer /* unused */,
-   unsigned min_level,
-   unsigned max_level,
-   uint32_t tex_format, unsigned swizzle /* unused */,
-   uint32_t *surf_offset,
-   bool for_gather)
+gen4_emit_texture_surface_state(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+GLenum target,
+unsigned min_layer /* unused */,
+unsigned max_layer /* unused */,
+unsigned min_level,
+unsigned max_level,
+unsigned tex_format,
+unsigned swizzle /* unused */,
+uint32_t *surf_offset,
+bool rw, bool for_gather)
 {
uint32_t *surf;
 
@@ -378,7 +379,8 @@ brw_update_texture_surface(struct brw_context *brw,
    *surf_offset + 4,
    mt->bo,
    surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER, 0);
+   I915_GEM_DOMAIN_SAMPLER,
+   (rw ? I915_GEM_DOMAIN_SAMPLER : 0));
 }
 
 /**
@@ -834,11 +836,12 @@ update_texture_surface(struct gl_context *ctx,
*/
   assert(brw->gen >= 7 || obj->MinLevel == 0 || brw->meta_in_progress);
 
-  brw->vtbl.update_texture_surface(brw, mt, obj->Target,
-   obj->MinLayer, obj->MinLayer + depth,
-   obj->MinLevel + obj->BaseLevel,
-   obj->MinLevel + intel_obj->_MaxLevel + 1,
-   format, swizzle, surf_offset, for_gather);
+  brw->vtbl.emit_texture_surface_state(
+ brw, mt, obj->Target,
+ obj->MinLayer, obj->MinLayer + depth,
+ obj->MinLevel + obj->BaseLevel,
+ obj->MinLevel + intel_obj->_MaxLevel + 1,
+ format, swizzle, surf_offset, false, for_gather);
}
 }
 
@@ -1071,8 +1074,8 @@ const struct brw_tracked_state brw_cs_abo_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
    brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
    brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
+   brw->vtbl.emit_texture_surface_state = gen4_emit_texture_surface_state;

[Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

All generations do the same exact dispatch and it could be
therefore done in the hardware independent stage.

Reviewed-by: Matt Turner matts...@gmail.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  3 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++
 4 files changed, 83 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 2fcdcfa..a6282f4 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw,
  uint32_t size,
  uint32_t *out_offset,
  bool dword_pitch);
-void brw_update_buffer_texture_surface(struct gl_context *ctx,
-   unsigned unit,
-   uint32_t *surf_offset);
 void
 brw_update_sol_surface(struct brw_context *brw,
struct gl_buffer_object *buffer_obj,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 160dd2f..2b8040c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
}
 }
 
-void
-brw_update_buffer_texture_surface(struct gl_context *ctx,
-  unsigned unit,
-  uint32_t *surf_offset)
+static void
+update_buffer_texture_surface(struct gl_context *ctx,
+  unsigned unit,
+  uint32_t *surf_offset)
 {
    struct brw_context *brw = brw_context(ctx);
    struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
@@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
uint32_t *surf;
 
-   /* BRW_NEW_TEXTURE_BUFFER */
-   if (tObj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-  return;
-   }
-
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
  6 * 4, 32, surf_offset);
 
@@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces = {
.emit = update_renderbuffer_surfaces,
 };
 
+static void
+update_texture_surface(struct gl_context *ctx,
+   unsigned unit,
+   uint32_t *surf_offset,
+   bool for_gather)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
+
+   if (obj->Target == GL_TEXTURE_BUFFER) {
+      update_buffer_texture_surface(ctx, unit, surf_offset);
+   } else {
+      brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
+   }
+}
 
 static void
 update_stage_texture_surfaces(struct brw_context *brw,
@@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
 
  /* _NEW_TEXTURE */
  if (ctx->Texture.Unit[unit]._Current) {
-brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather);
+update_texture_surface(ctx, unit, surf_offset + s, for_gather);
  }
   }
}
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 15ab2b0..098b5c8 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx,
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
 
-   if (obj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-
-   } else {
-  struct intel_texture_object *intel_obj = intel_texture_object(obj);
-  struct intel_mipmap_tree *mt = intel_obj->mt;
-  struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
-  /* If this is a view with restricted NumLayers, then our effective depth
-   * is not just the miptree depth.
-   */
-  const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
-  obj->NumLayers : mt->logical_depth0);
-
-  /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
-   * texturing functions that return a float, as

[Mesa-dev] [PATCH 4/7] i965: Refactor effective depth calculation

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

Reviewed-by: Matt Turner matts...@gmail.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase.  Pass a half-open interval of
  layers like emit_texture_surface_state does. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++---
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  9 +++--
 4 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9e85dd7..0e9ede9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -986,6 +986,8 @@ struct brw_context
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
+ unsigned min_layer,
+ unsigned max_layer,
  uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 3dddf89..92383e1 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -311,6 +311,8 @@ static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
struct gl_texture_object *tObj,
+   unsigned min_layer /* unused */,
+   unsigned max_layer /* unused */,
uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
@@ -800,6 +802,11 @@ update_texture_surface(struct gl_context *ctx,
   struct intel_mipmap_tree *mt = intel_obj->mt;
   const struct gl_texture_image *firstImage = obj->Image[0][obj->BaseLevel];
   const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
+  /* If this is a view with restricted NumLayers, then our effective depth
+   * is not just the miptree depth.
+   */
+  const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
+  obj->NumLayers : mt->logical_depth0);
 
   /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
* texturing functions that return a float, as our code generation always
@@ -820,8 +827,9 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj, format, swizzle,
-   surf_offset, for_gather);
+  brw->vtbl.update_texture_surface(brw, mt, obj,
+   obj->MinLayer, obj->MinLayer + depth,
+   format, swizzle, surf_offset, for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 7576b20..9755236 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -351,22 +351,18 @@ static void
 gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
 struct gl_texture_object *obj,
+unsigned min_layer,
+unsigned max_layer,
 uint32_t tex_format, unsigned swizzle,
 uint32_t *surf_offset,
 bool for_gather)
 {
struct intel_texture_object *intel_obj = intel_texture_object(obj);
-   /* If this is a view with restricted NumLayers, then our effective depth
-* is not just the miptree depth.
-*/
-   const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
-   obj->NumLayers : mt->logical_depth0);
-
if (for_gather && tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT)
   tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
 
gen7_emit_texture_surface_state(brw, mt, obj->Target,
-   obj->MinLayer, obj->MinLayer + depth,
+   min_layer, max_layer,
obj->MinLevel + obj->BaseLevel,
obj->MinLevel + intel_obj->_MaxLevel + 1,
tex_format, swizzle,
diff

[Mesa-dev] [PATCH 2/7] i965: Move tex miptree and format resolving into dispatcher

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen topi.pohjolai...@intel.com

All hardware platforms have this in common, so do it in the
hardware independent dispatcher.

v2 (Matt): Removed extra whitespace.

Reviewed-by: Matt Turner matts...@gmail.com (v1)
Reviewed-by: Kenneth Graunke kenn...@whitecape.org (v1)
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez curroje...@riseup.net
---
 src/mesa/drivers/dri/i965/brw_context.h   |  4 +++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 26 ---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 17 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 17 ---
 4 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a6282f4..d599ba8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -984,7 +984,9 @@ struct brw_context
struct
{
   void (*update_texture_surface)(struct gl_context *ctx,
- unsigned unit,
+ struct intel_mipmap_tree *mt,
+ struct gl_texture_object *tObj,
+ uint32_t tex_format,
  uint32_t *surf_offset,
  bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 2b8040c..7ed7e18 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -309,23 +309,19 @@ update_buffer_texture_surface(struct gl_context *ctx,
 
 static void
 brw_update_texture_surface(struct gl_context *ctx,
-   unsigned unit,
+   struct intel_mipmap_tree *mt,
+   struct gl_texture_object *tObj,
+   uint32_t tex_format,
uint32_t *surf_offset,
bool for_gather)
 {
struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
struct intel_texture_object *intelObj = intel_texture_object(tObj);
-   struct intel_mipmap_tree *mt = intelObj->mt;
-   struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
uint32_t *surf;
 
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
  6 * 4, 32, surf_offset);
 
-   uint32_t tex_format = translate_tex_format(brw, mt->format,
-  sampler->sRGBDecode);
-
if (for_gather) {
   /* Sandybridge's gather4 message is broken for integer formats.
* To work around this, we pretend the surface is UNORM for
@@ -801,7 +797,21 @@ update_texture_surface(struct gl_context *ctx,
if (obj->Target == GL_TEXTURE_BUFFER) {
   update_buffer_texture_surface(ctx, unit, surf_offset);
} else {
-  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
+  struct intel_texture_object *intel_obj = intel_texture_object(obj);
+  struct intel_mipmap_tree *mt = intel_obj->mt;
+  const struct gl_texture_image *firstImage = obj->Image[0][obj->BaseLevel];
+  const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
+  unsigned format = translate_tex_format(brw, intel_obj->_Format,
+ sampler->sRGBDecode);
+  if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) {
+ assert(brw->gen >= 8);
+ mt = mt->stencil_mt;
+ assert(mt->format == MESA_FORMAT_S_UINT8);
+ format = BRW_SURFACEFORMAT_R8_UINT;
+  }
+
+  brw->vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset,
+   for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 098b5c8..7e3ee67 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -349,16 +349,14 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 
 static void
 gen7_update_texture_surface(struct gl_context *ctx,
-unsigned unit,
+struct intel_mipmap_tree *mt,
+struct gl_texture_object *obj,
+uint32_t tex_format,
 uint32_t *surf_offset,
 bool for_gather)
 {
struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
-
struct intel_texture_object *intel_obj = intel_texture_object(obj);
-   struct

Re: [Mesa-dev] [PATCH 01/13] nir/validate: Validate SSA def parent instructiosn

2015-05-07 Thread Connor Abbott

I can't seem to find the cover email, so I'll respond to this one.
Aside from my comments on patches 11 and 13, patches 1-5 and 11-13 are

Reviewed-by: Connor Abbott cwabbo...@gmail.com

and FWIW 6-10 are

Acked-by: Connor Abbott cwabbo...@gmail.com

although what's important there is other people testing those and
making sure they don't break other things (particularly Windows).


On Tue, May 5, 2015 at 8:16 PM, Connor Abbott cwabbo...@gmail.com wrote:
 Typo in the subject line.

 On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
 ---
  src/glsl/nir/nir_validate.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
 index a7aa798..35a853d 100644
 --- a/src/glsl/nir/nir_validate.c
 +++ b/src/glsl/nir/nir_validate.c
 @@ -236,6 +236,8 @@ validate_ssa_def(nir_ssa_def *def, validate_state *state)
 assert(!BITSET_TEST(state->ssa_defs_found, def->index));
 BITSET_SET(state->ssa_defs_found, def->index);

 +   assert(def->parent_instr == state->instr);
 +
 assert(def->num_components <= 4);

 ssa_def_validate_state *def_state = ralloc(state->ssa_defs,
 --
 2.3.6


Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp

2015-05-07 Thread Ian Romanick

On 05/07/2015 07:30 AM, Jason Ekstrand wrote:
 On Wed, May 6, 2015 at 7:29 PM, Matt Turner matts...@gmail.com wrote:
 On Wed, May 6, 2015 at 7:09 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 It's a weird thing that provides some values related to 2**x.  It's also
 already handled by a case in the switch.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com

 The series is

 Reviewed-by: Matt Turner matts...@gmail.com
 
 I was going to complain about you making my SPIR-V -> NIR translator
 harder to write.  But, based on the discussion by Ken and Ilia on IRC,
 it looks like basically no one's hardware does a base-e log.  I'll
 just lower on-the-fly.  I guess maybe we could do it with pow(x, e)
 but meh.  If you'd like, the series is

Right.  We currently unconditionally lower exp(x) to exp2(x * M_LOG2E)
in the GLSL IR lowering code.  I believe we picked that lowering because
some older architectures lack a pow instruction.  It may be worth trying
the other way to see if we get better code.

 Acked-by: Jason Ekstrand jason.ekstr...@intel.com
 
 I can't say I read it enough to call it a review but I glanced through
 it and it seems ok.
 --Jason
 


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Jan Vesely

On Thu, 2015-05-07 at 21:52 +0200, EdB wrote:
 On 2015-05-07 18:55, Aaron Watry wrote:
  I'm not sure what the final consensus will be on how to do this, but
  FWIW:
  Tested-By: Aaron Watry awa...@gmail.com
  
  I've tested this with 4 combinations:
  no --with-opencl-icd option specified : libOpenCL.so gets installed in
  ${prefix}/lib
  --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
  --with-opencl-icd=standard : libMesaOpenCL.so installed in
  ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
  --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in
  ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified
  --prefix, no other directories overridden in configure command.

shouldn't this part go to ${prefix}/etc/OpenCL/vendors?
Is it just a typo or did it install to ${prefix}/etc//?

jan

  
 
 thanks
 
EdB
 
  --Aaron
  
   
  
  On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:
  
  The standard ICD file path is /etc/OpenCL/vendors/.
  However, it doesn't fit well with custom builds.
  This option allows overriding the ICD vendor file installation path.
  ---
   configure.ac                           | 46 +++---
   src/gallium/targets/opencl/Makefile.am |  2 +-
   2 files changed, 33 insertions(+), 15 deletions(-)
  
  diff --git a/configure.ac b/configure.ac
  index 095e23e..90dba4e 100644
  --- a/configure.ac
  +++ b/configure.ac
  @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
[enable OpenCL library @:@default=disabled@:@])],
  [enable_opencl=$enableval],
  [enable_opencl=no])
  -AC_ARG_ENABLE([opencl_icd],
  -   [AS_HELP_STRING([--enable-opencl-icd],
  -  [Build an OpenCL ICD library to be loaded by an ICD
  implementation
  -   @:@default=disabled@:@])],
  -[enable_opencl_icd=$enableval],
  -[enable_opencl_icd=no])
   AC_ARG_ENABLE([xlib-glx],
   [AS_HELP_STRING([--enable-xlib-glx],
   [make GLX library Xlib-based instead of DRI-based
  @:@default=disabled@:@])],
  @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
   # XXX: Use $enable_shared_pipe_drivers once converted to
  use static/shared pipe-drivers
   enable_gallium_loader=yes
  
  -if test x$enable_opencl_icd = xyes; then
  -OPENCL_LIBNAME=MesaOpenCL
  -else
  -OPENCL_LIBNAME=OpenCL
  -fi
  -
   if test x$have_libelf != xyes; then
  AC_MSG_ERROR([Clover requires libelf])
   fi
   fi
   AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
  -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
  -AC_SUBST([OPENCL_LIBNAME])
  
   dnl
   dnl Gallium configuration
  @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
   [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
   AC_SUBST([D3D_DRIVER_INSTALL_DIR])
  
  +dnl OpenCL ICD
  +
  +AC_ARG_WITH([opencl-icd],
  +   
  [AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
  +[Build an OpenCL ICD library to be loaded by an ICD
  implementation.
  + If @:@standard@:@ the OpenCL ICD vendor file
  installs in /etc/OpenCL/vendors.
  + @:@sysconfdir@:@ installs the file in
  $sysconfdir/OpenCL/vendors
  + @:@default=no@:@])],
  +[OPENCL_ICD=$withval],
  +[OPENCL_ICD=no])
  +
  +case x$OPENCL_ICD in
  +xno)
  +OPENCL_LIBNAME=OpenCL
  +;;
  +xstandard)
  +OPENCL_LIBNAME=MesaOpenCL
  +ICD_FILE_DIR=/etc/OpenCL/vendors
  +;;
  +xsysconfdir)
  +OPENCL_LIBNAME=MesaOpenCL
  +ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
  +;;
  +*)
  +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
  --with-opencl-icd])
  +;;
  +esac
  +
  +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
  +AC_SUBST([OPENCL_LIBNAME])
  +AC_SUBST([ICD_FILE_DIR])
  +
   dnl
   dnl Gallium helper functions
   dnl
  diff --git a/src/gallium/targets/opencl/Makefile.am
  b/src/gallium/targets/opencl/Makefile.am
  index 5daf327..781daa0 100644
  --- a/src/gallium/targets/opencl/Makefile.am
  +++ b/src/gallium/targets/opencl/Makefile.am
  @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES =
  opencl.sym
   EXTRA_DIST = mesa.icd opencl.sym
  
   if HAVE_CLOVER_ICD
  -icddir = /etc/OpenCL/vendors/
  +icddir = $(ICD_FILE_DIR)
   icd_DATA = mesa.icd
   endif
  
  --
  2.1.0
  

-- 
Jan Vesely jan.ves...@rutgers.edu



Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:17 AM, Pohjolainen, Topi wrote:
 On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote:
 I guess this should not really be able to segfault but still it
 seems to be able to during context creation.

 Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
 ---
  src/mesa/main/errors.c | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)

 diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
 index 2aa1deb..6631b82 100644
 --- a/src/mesa/main/errors.c
 +++ b/src/mesa/main/errors.c
 @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
 const char *fmtString, ... )
  
 To me it looks that it would be better to just leave early already here:
 
   if (!ctx)
  return;
 
 Avoids extra indentation and it doesn't look meaningful to call
 should_output() with null context.

I like that plan.

I don't think you can even get to _mesa_error (or _mesa_warning) without
a context.  Maybe add an assert(ctx != NULL)?
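
The two options discussed here (leaving early on a null context versus asserting that the context is never null) can be sketched as follows. The struct and function names are hypothetical stand-ins, not the real struct gl_context or _mesa_error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a context; not the real struct gl_context. */
struct example_context {
   int last_error;
};

/* Option 1: leave early when there is no context.
 * Returns 1 if the error was recorded, 0 if the call was ignored.
 */
static int
example_record_error(struct example_context *ctx, int error)
{
   if (!ctx)
      return 0;

   ctx->last_error = error;
   return 1;
}

/* Option 2: treat a missing context as a programming error. */
static void
example_record_error_strict(struct example_context *ctx, int error)
{
   assert(ctx != NULL);
   ctx->last_error = error;
}
```

The early return keeps the rest of the function free of per-use null checks, while the assert documents the expectation that callers always have a context and catches violations in debug builds.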

 do_output = should_output(ctx, error, fmtString);
  
 -   mtx_lock(&ctx->DebugMutex);
 -   if (ctx->Debug) {
 -  do_log = debug_is_message_enabled(ctx->Debug,
 -MESA_DEBUG_SOURCE_API,
 -MESA_DEBUG_TYPE_ERROR,
 -error_msg_id,
 -MESA_DEBUG_SEVERITY_HIGH);
 +   if (ctx) {
 +  mtx_lock(&ctx->DebugMutex);
 +  if (ctx->Debug) {
 + do_log = debug_is_message_enabled(ctx->Debug,
 +   MESA_DEBUG_SOURCE_API,
 +   MESA_DEBUG_TYPE_ERROR,
 +   error_msg_id,
 +   MESA_DEBUG_SEVERITY_HIGH);
 +  }
 +  else {
 + do_log = GL_FALSE;
 +  }
 +  mtx_unlock(&ctx->DebugMutex);
 }
 else {
do_log = GL_FALSE;
 }
 -   mtx_unlock(&ctx->DebugMutex);
  
 if (do_output || do_log) {
char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
 @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
 const char *fmtString, ... )
}
  
/* Log the error via ARB_debug_output if needed.*/
 -  if (do_log) {
 +  if (ctx && do_log) {
   log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR,
   error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2);
}
 }
  
 /* Set the GL context error state for glGetError. */
 -   _mesa_record_error(ctx, error);
 +   if (ctx)
 +  _mesa_record_error(ctx, error);
  }
  
  void
 -- 
 1.8.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/13] nir/nir: Use a linked list instead of a hash set for use/def sets

2015-05-07 Thread Connor Abbott

Based on the testing you did, it sounds like switching to linked lists
gives us some pretty good performance gains, but before we go ahead
with this you should collect some numbers using
http://anholt.net/compare-perf/ and put them in the commit message.
Comparing list vs. no-list as well as NIR vs. non-NIR might be useful,
so we can compare the time saved to the total time we spend doing
NIR-related things.

On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
 This commit switches us from the current setup of using hash sets for
 use/def sets to using linked lists.  Doing so should save us quite a bit of
 memory because we aren't carrying around 3 hash sets per register and 2 per
 SSA value.  It should also save us CPU time because adding/removing things
 from use/def sets is 4 pointer manipulations instead of a hash lookup.

 On the code complexity side of things, some things are now much easier and
 others are a bit harder.  One of the operations we perform constantly in
 optimization passes is to replace one source with another.  Due to the fact
 that an instruction can use the same SSA value multiple times, we had to
 iterate through the sources of the instruction and determine if the use we
 were replacing was the only one before removing it from the set of uses.
 With this patch, uses are per-source not per-instruction so we can just
 remove it safely.  On the other hand, trying to iterate over all of the
 instructions that use a given value is more difficult.  Fortunately, the
 two places we do that are the ffma peephole where it doesn't matter and GCM
 where we already gracefully handle duplicate visits to an instruction.

 Another aspect here is that using linked lists in this way can be tricky to
 get right.  With sets, things were quite forgiving and the worst that
 happened if you didn't properly remove a use was that it would get caught
 in the validator.  With linked lists, it can lead to linked list corruption
 which can be harder to track.  However, we do just as much validation of
 the linked lists as we did of the sets so the validator should still catch
 these problems.  While working on this series, the vast majority of the
 bugs I had to fix were caught by assertions.  I don't think the lists are
 going to be that much worse than the sets.
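
 As a rough illustration of the per-source use list described above,
 here is a small self-contained sketch.  All names (sketch_list,
 sketch_value, sketch_src and the helpers) are hypothetical and are not
 the NIR or util/list.h API; the sketch only shows why adding a use is
 four pointer writes and why replacing one source needs no scan over the
 instruction's other sources.

```c
#include <stddef.h>

/* Minimal intrusive circular doubly-linked list (hypothetical names). */
struct sketch_list {
   struct sketch_list *prev, *next;
};

static void sketch_list_init(struct sketch_list *head)
{
   head->prev = head;
   head->next = head;
}

/* Appending is exactly four pointer writes, no hashing involved. */
static void sketch_list_add_tail(struct sketch_list *item,
                                 struct sketch_list *head)
{
   item->prev = head->prev;
   item->next = head;
   head->prev->next = item;
   head->prev = item;
}

static void sketch_list_remove(struct sketch_list *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

static size_t sketch_list_count(const struct sketch_list *head)
{
   size_t n = 0;
   for (const struct sketch_list *it = head->next; it != head; it = it->next)
      n++;
   return n;
}

/* A value owns the list head; every source that reads it owns one link. */
struct sketch_value {
   struct sketch_list uses;
};

struct sketch_src {
   struct sketch_list use_link;
   struct sketch_value *value;
};

static void sketch_src_set(struct sketch_src *src, struct sketch_value *v)
{
   src->value = v;
   sketch_list_add_tail(&src->use_link, &v->uses);
}

/* Uses are per-source, so one source can be retargeted without checking
 * whether the same instruction uses the old value somewhere else. */
static void sketch_src_rewrite(struct sketch_src *src,
                               struct sketch_value *new_value)
{
   sketch_list_remove(&src->use_link);
   sketch_src_set(src, new_value);
}
```

 The flip side mentioned in the commit message also shows up here:
 walking a value's use list visits sources, so an instruction that reads
 the value twice is seen twice.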
 ---
  src/glsl/nir/nir.c  | 228 
 +++-
  src/glsl/nir/nir.h  |  45 +++--
  src/glsl/nir/nir_validate.c | 158 +++---
  3 files changed, 194 insertions(+), 237 deletions(-)

 diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
 index b8f5dd4..be13c90 100644
 --- a/src/glsl/nir/nir.c
 +++ b/src/glsl/nir/nir.c
 @@ -58,12 +58,9 @@ reg_create(void *mem_ctx, struct exec_list *list)
 nir_register *reg = ralloc(mem_ctx, nir_register);

     reg->parent_instr = NULL;
 -   reg->uses = _mesa_set_create(reg, _mesa_hash_pointer,
 -                                _mesa_key_pointer_equal);
 -   reg->defs = _mesa_set_create(reg, _mesa_hash_pointer,
 -                                _mesa_key_pointer_equal);
 -   reg->if_uses = _mesa_set_create(reg, _mesa_hash_pointer,
 -                                   _mesa_key_pointer_equal);
 +   list_inithead(&reg->uses);
 +   list_inithead(&reg->defs);
 +   list_inithead(&reg->if_uses);

     reg->num_components = 0;
     reg->num_array_elems = 0;
 @@ -1070,11 +1067,14 @@ update_if_uses(nir_cf_node *node)

 nir_if *if_stmt = nir_cf_node_as_if(node);

 -   struct set *if_uses_set = if_stmt->condition.is_ssa ?
 -                             if_stmt->condition.ssa->if_uses :
 -                             if_stmt->condition.reg.reg->uses;
 -
 -   _mesa_set_add(if_uses_set, if_stmt);
 +   if_stmt->condition.parent_if = if_stmt;
 +   if (if_stmt->condition.is_ssa) {
 +      list_addtail(&if_stmt->condition.use_link,
 +                   &if_stmt->condition.ssa->if_uses);
 +   } else {
 +      list_addtail(&if_stmt->condition.use_link,
 +                   &if_stmt->condition.reg.reg->if_uses);
 +   }
  }

  void
 @@ -1227,16 +1227,7 @@ cleanup_cf_node(nir_cf_node *node)
foreach_list_typed(nir_cf_node, child, node, if_stmt-else_list)
   cleanup_cf_node(child);

 -      struct set *if_uses;
 -      if (if_stmt->condition.is_ssa) {
 -         if_uses = if_stmt->condition.ssa->if_uses;
 -      } else {
 -         if_uses = if_stmt->condition.reg.reg->if_uses;
 -      }
 -
 -      struct set_entry *entry = _mesa_set_search(if_uses, if_stmt);
 -      assert(entry);
 -      _mesa_set_remove(if_uses, entry);
 +      list_del(&if_stmt->condition.use_link);
break;
 }

 @@ -1293,9 +1284,9 @@ add_use_cb(nir_src *src, void *state)
  {
 nir_instr *instr = state;

 -   struct set *uses_set = src->is_ssa ? src->ssa->uses : src->reg.reg->uses;
 -
 -   _mesa_set_add(uses_set, instr);
 +   src->parent_instr = instr;
 +   list_addtail(&src->use_link,
 +                src->is_ssa ? &src->ssa->uses : &src->reg.reg->uses);

 return true;
  }
 @@

Re: [Mesa-dev] [PATCH 1/5] nir: Define image load, store and atomic intrinsics.

2015-05-07 Thread Connor Abbott

On IRC, Ken and I were discussing using a scheme inspired by SPIR-V,
which has an OpImagePointer instruction that forms a pointer to the
particular texel of the image as well as
OpAtomic{Load,Store,Exchange,etc.} that operate on an image or shared
buffer pointer. The advantages would be:

* Makes translating from SPIR-V easier.
* Reduces the number of intrinsics we need to add for SSBO support.
* Reduces the combinatorial explosion enough that we can have separate
versions for 2, 3, and 4 components and MS vs. non-MS without it being
unbearable. I'm not sure how much of a benefit that would be though.

The disadvantages I can think of are:

* Doesn't actually save any code in the i965 backend, since we need to
do different things depending on if the pointer is to an image or a
shared buffer anyways.
* We'd have to special case nir_convert_from_ssa to ignore the SSA
value that's really a pointer since we don't have any real type-level
support for pointers.
* Since we lower to SSA before converting to i965, there are some ugly
edge cases when the coordinate argument becomes part of a phi web and
gets potentially overwritten before the instruction that uses the
pointer.

I don't have a preference one way or the other, and I guess we could
always refactor it later if we wanted to, so assuming Ken is OK with
this, then besides one minor comment on patch 4 the series is

Reviewed-by: Connor Abbott cwabbo...@gmail.com

On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez curroje...@riseup.net wrote:
 ---
  src/glsl/nir/nir_intrinsics.h | 27 +++
  1 file changed, 27 insertions(+)

 diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
 index 8e28765..4b13c75 100644
 --- a/src/glsl/nir/nir_intrinsics.h
 +++ b/src/glsl/nir/nir_intrinsics.h
 @@ -89,6 +89,33 @@ ATOMIC(inc, 0)
  ATOMIC(dec, 0)
  ATOMIC(read, NIR_INTRINSIC_CAN_ELIMINATE)

 +/*
 + * Image load, store and atomic intrinsics.
 + *
 + * All image intrinsics take an image target passed as a nir_variable.  Image
 + * variables contain a number of memory and layout qualifiers that influence
 + * the semantics of the intrinsic.
 + *
 + * All image intrinsics take a four-coordinate vector and a sample index as
 + * first two sources, determining the location within the image that will be
 + * accessed by the intrinsic.  Components not applicable to the image target
 + * in use are equal to zero by convention.  Image store takes an additional
 + * four-component argument with the value to be written, and image atomic
 + * operations take either one or two additional scalar arguments with the 
 same
 + * meaning as in the ARB_shader_image_load_store specification.
 + */
 +INTRINSIC(image_load, 2, ARR(4, 1), true, 4, 1, 0,
 +  NIR_INTRINSIC_CAN_ELIMINATE)
 +INTRINSIC(image_store, 3, ARR(4, 1, 4), false, 0, 1, 0, 0)
 +INTRINSIC(image_atomic_add, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_min, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_max, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_and, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_or, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_xor, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_exchange, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
 +INTRINSIC(image_atomic_comp_swap, 4, ARR(4, 1, 1, 1), true, 1, 1, 0, 0)
 +
  #define SYSTEM_VALUE(name, components) \
 INTRINSIC(load_##name, 0, ARR(), true, components, 0, 0, \
 NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 --
 2.3.5
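
The INTRINSIC(...) rows in the patch above follow a table-of-macros
("X-macro") pattern: one list of rows is expanded several times with
different macro definitions.  The sketch below shows the idea in a
self-contained form.  It is not the real nir_intrinsics.h machinery;
the SKETCH_ names are made up, the row list lives in a single macro
rather than a re-included header, and the num_variables/num_indices
columns are left out for brevity.  The column meanings are my reading
of the patch (source count, per-source component counts, has
destination, destination components, flags).

```c
#include <stdbool.h>

#define SKETCH_CAN_ELIMINATE 0x1u
#define ARR(...) { __VA_ARGS__ }

/* One row per intrinsic; the values mirror three rows of the patch. */
#define SKETCH_INTRINSICS(X)                                          \
   X(image_load,             2, ARR(4, 1),       true,  4,            \
     SKETCH_CAN_ELIMINATE)                                            \
   X(image_store,            3, ARR(4, 1, 4),    false, 0, 0)         \
   X(image_atomic_comp_swap, 4, ARR(4, 1, 1, 1), true,  1, 0)

/* First expansion: an enum with one entry per row. */
enum sketch_intrinsic {
#define X(name, nsrc, srcs, has_dest, dest, flags) SKETCH_##name,
   SKETCH_INTRINSICS(X)
#undef X
   SKETCH_NUM_INTRINSICS
};

struct sketch_intrinsic_info {
   const char *name;
   unsigned num_srcs;
   unsigned src_components[4];
   bool has_dest;
   unsigned dest_components;
   unsigned flags;
};

/* Second expansion: a metadata table indexed by the enum. */
static const struct sketch_intrinsic_info
sketch_infos[SKETCH_NUM_INTRINSICS] = {
#define X(name, nsrc, srcs, has_dest, dest, flags) \
   { #name, nsrc, srcs, has_dest, dest, flags },
   SKETCH_INTRINSICS(X)
#undef X
};
```

Because the enum and the table come from the same rows, adding an
intrinsic is a one-line change and the two can never get out of sync.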

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/27] i965: Store gather table information in the program data

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:08PM +0300, Abdiel Janulgue wrote:
 The resource streamer is able to gather and pack sparsely-located
 constant data from any buffer object by referring to a gather table.
 This patch adds support for keeping track of these constant data
 fetches into a gather table.
 
 The gather table is generated from two sources. Ordinary uniform fetches
 are stored first. These are then combined with a separate table containing
 UBO entries. The separate entry for UBOs is needed to make it easier to
 generate the gather mask when combining and packing the constant data.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.h  |  9 +
  src/mesa/drivers/dri/i965/brw_gs.c   |  4 
  src/mesa/drivers/dri/i965/brw_program.c  |  5 +
  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +++-
  src/mesa/drivers/dri/i965/brw_shader.h   | 11 +++
  src/mesa/drivers/dri/i965/brw_vs.c   |  5 +
  src/mesa/drivers/dri/i965/brw_wm.c   |  5 +
  7 files changed, 42 insertions(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 7fd49e9..e25c64d 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -355,9 +355,12 @@ struct brw_stage_prog_data {
  
 GLuint nr_params;   /** number of float params/constants */
 GLuint nr_pull_params;
 +   GLuint nr_ubo_params;
 +   GLuint nr_gather_table;

I would introduce these as non gl-types - we are trying to move away from
them. Perhaps change nr_params and nr_pull_params while you are at it.

  
 unsigned curb_read_length;
 unsigned total_scratch;
 +   unsigned max_ubo_const_block;
  
 /**
  * Register where the thread expects to find input data from the URB
 @@ -375,6 +378,12 @@ struct brw_stage_prog_data {
  */
 const gl_constant_value **param;
 const gl_constant_value **pull_param;
 +   struct {
 +  int reg;
 +  unsigned channel_mask;
 +  unsigned const_block;
 +  unsigned const_offset;
 +   } *gather_table;
  };

Below in brw_shader.h you do:

   struct gather_table {
  int reg;
  unsigned channel_mask;
  unsigned const_block;
  unsigned const_offset;
   };
   gather_table *ubo_gather_table;

Why not here?

  
  /* Data about a particular attempt to compile a program.  Note that
 diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
 b/src/mesa/drivers/dri/i965/brw_gs.c
 index bea90d8..97658d5 100644
 --- a/src/mesa/drivers/dri/i965/brw_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_gs.c
 @@ -70,6 +70,10 @@ brw_compile_gs_prog(struct brw_context *brw,
 c.prog_data.base.base.pull_param =
rzalloc_array(NULL, const gl_constant_value *, param_count);
 c.prog_data.base.base.nr_params = param_count;
 +   c.prog_data.base.base.nr_gather_table = 0;
 +   c.prog_data.base.base.gather_table =
 +  rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
 +   (c.prog_data.base.base.nr_params + 
 c.prog_data.base.base.nr_ubo_params));

Wrap this line.

  
     if (brw->gen >= 7) {
        if (gp->program.OutputType == GL_POINTS) {
 diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
 b/src/mesa/drivers/dri/i965/brw_program.c
 index 81a0c19..f27c799 100644
 --- a/src/mesa/drivers/dri/i965/brw_program.c
 +++ b/src/mesa/drivers/dri/i965/brw_program.c
 @@ -573,6 +573,10 @@ brw_stage_prog_data_compare(const struct 
 brw_stage_prog_data *a,
     if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * sizeof(void *)))
        return false;
  
 +   if (memcmp(a->gather_table, b->gather_table,
 +              a->nr_gather_table * sizeof(*a->gather_table)))
 +      return false;
 +
 return true;
  }
  
 @@ -583,6 +587,7 @@ brw_stage_prog_data_free(const void *p)
  
     ralloc_free(prog_data->param);
     ralloc_free(prog_data->pull_param);
 +   ralloc_free(prog_data->gather_table);
  }
  
  void
 diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
 b/src/mesa/drivers/dri/i965/brw_shader.cpp
 index 0d6ac0c..8769f67 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
 @@ -739,11 +739,13 @@ backend_visitor::backend_visitor(struct brw_context 
 *brw,
   prog(prog),
   stage_prog_data(stage_prog_data),
   cfg(NULL),
 - stage(stage)
 + stage(stage),
 + ubo_gather_table(NULL)
  {
     debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage);
     stage_name = _mesa_shader_stage_to_string(stage);
     stage_abbrev = _mesa_shader_stage_to_abbrev(stage);
 +   this->nr_ubo_gather_table = 0;

Any particular reason not to do this in the initializer along with the other
members?

  }
  
  bool
 diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
 b/src/mesa/drivers/dri/i965/brw_shader.h
 index 8a3263e..db0018f 100644
 --- a/src/mesa/drivers/dri/i965/brw_shader.h
 +++

Re: [Mesa-dev] [PATCH 13/27] i965: Assign hw-binding table index for uniform constant buffer block

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:10PM +0300, Abdiel Janulgue wrote:
 Assign the uploaded uniform block with hardware binding table indices.
 This is indexed by the resource streamer to fetch the constant buffers
 referred to by our gather table entries.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_vs_state.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
 b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 index 7325c6e..bce597f 100644
 --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 @@ -72,9 +72,16 @@ gen6_upload_push_constants(struct brw_context *brw,
gl_constant_value *param;
int i;
  
 -      param = brw_state_batch(brw, type,
 -                              prog_data->nr_params * sizeof(gl_constant_value),
 +      uint32_t size = prog_data->nr_params * sizeof(gl_constant_value);

Const would be nice here.

 +      param = brw_state_batch(brw, type, size,
                                32, &stage_state->push_const_offset);
 +      if (brw->gather_pool.bo != NULL) {
 +         uint32_t surf_offset = 0;
 +         brw_create_constant_surface(brw, brw->batch.bo, stage_state->push_const_offset,
 +                                     size, &surf_offset, false);
 +         gen7_update_binding_table(brw, stage_state->stage, BRW_UNIFORM_GATHER_INDEX_START,

Two lines overflowing 80 columns.

 +   surf_offset);
 +  }
  
STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));
  
 -- 
 1.9.1
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
 From: Topi Pohjolainen topi.pohjolai...@intel.com
 
 All generations do the same exact dispatch and it could be
 therefore done in the hardware independent stage.
 
 Reviewed-by: Matt Turner matts...@gmail.com
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 [ Francisco Jerez: Non-trivial rebase. ]
 Reviewed-by: Francisco Jerez curroje...@riseup.net
 ---
  src/mesa/drivers/dri/i965/brw_context.h   |  3 -
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
 +++
  src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++
  4 files changed, 83 insertions(+), 87 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 2fcdcfa..a6282f4 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
 *brw,
   uint32_t size,
   uint32_t *out_offset,
   bool dword_pitch);
 -void brw_update_buffer_texture_surface(struct gl_context *ctx,
 -   unsigned unit,
 -   uint32_t *surf_offset);
  void
  brw_update_sol_surface(struct brw_context *brw,
 struct gl_buffer_object *buffer_obj,
 diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 index 160dd2f..2b8040c 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
 }
  }
  
 -void
 -brw_update_buffer_texture_surface(struct gl_context *ctx,
 -  unsigned unit,
 -  uint32_t *surf_offset)
 +static void
 +update_buffer_texture_surface(struct gl_context *ctx,
 +  unsigned unit,
 +  uint32_t *surf_offset)
  {
 struct brw_context *brw = brw_context(ctx);
 struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current;
 @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
 struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
 uint32_t *surf;
  
 -   /* BRW_NEW_TEXTURE_BUFFER */
 -   if (tObj-Target == GL_TEXTURE_BUFFER) {
 -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
 -  return;
 -   }
 -
 surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 6 * 4, 32, surf_offset);
  
 @@ -795,6 +789,21 @@ const struct brw_tracked_state 
 gen6_renderbuffer_surfaces = {
 .emit = update_renderbuffer_surfaces,
  };
  
 +static void
 +update_texture_surface(struct gl_context *ctx,
 +   unsigned unit,
 +   uint32_t *surf_offset,
 +   bool for_gather)
 +{
 +   struct brw_context *brw = brw_context(ctx);
 +   struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current;
 +
 +   if (obj-Target == GL_TEXTURE_BUFFER) {
 +  update_buffer_texture_surface(ctx, unit, surf_offset);

To avoid an extra level of indentation I used the following. I would
have preferred it here as well.

   if (obj->Target == GL_TEXTURE_BUFFER) {
      update_buffer_texture_surface(ctx, unit, surf_offset);
      return;
   }

 +   } else {
 +  brw-vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
 +   }
 +}
  
  static void
  update_stage_texture_surfaces(struct brw_context *brw,
 @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
  
   /* _NEW_TEXTURE */
   if (ctx-Texture.Unit[unit]._Current) {
 -brw-vtbl.update_texture_surface(ctx, unit, surf_offset + s, 
 for_gather);
 +update_texture_surface(ctx, unit, surf_offset + s, for_gather);
   }
}
 }
 diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 index 15ab2b0..098b5c8 100644
 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 @@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx,
 struct brw_context *brw = brw_context(ctx);
 struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current;
  
 -   if (obj-Target == GL_TEXTURE_BUFFER) {
 -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
 -
 -   } else {
 -  struct intel_texture_object *intel_obj = intel_texture_object(obj);
 -  struct intel_mipmap_tree *mt = intel_obj-mt;
 -  struct gl_sampler_object *sampler =

Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:55:48PM +0300, Francisco Jerez wrote:
 Pohjolainen, Topi topi.pohjolai...@intel.com writes:
 
  On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
  From: Topi Pohjolainen topi.pohjolai...@intel.com
  
  All generations do the same exact dispatch and it could be
  therefore done in the hardware independent stage.
  
  Reviewed-by: Matt Turner matts...@gmail.com
  Reviewed-by: Kenneth Graunke kenn...@whitecape.org
  Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
  [ Francisco Jerez: Non-trivial rebase. ]
  Reviewed-by: Francisco Jerez curroje...@riseup.net
  ---
   src/mesa/drivers/dri/i965/brw_context.h   |  3 -
   src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
   src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
  +++
   src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 
  ++
   4 files changed, 83 insertions(+), 87 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
  b/src/mesa/drivers/dri/i965/brw_context.h
  index 2fcdcfa..a6282f4 100644
  --- a/src/mesa/drivers/dri/i965/brw_context.h
  +++ b/src/mesa/drivers/dri/i965/brw_context.h
  @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
  *brw,
uint32_t size,
uint32_t *out_offset,
bool dword_pitch);
  -void brw_update_buffer_texture_surface(struct gl_context *ctx,
  -   unsigned unit,
  -   uint32_t *surf_offset);
   void
   brw_update_sol_surface(struct brw_context *brw,
  struct gl_buffer_object *buffer_obj,
  diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
  b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  index 160dd2f..2b8040c 100644
  --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
  @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context 
  *brw,
  }
   }
   
  -void
  -brw_update_buffer_texture_surface(struct gl_context *ctx,
  -  unsigned unit,
  -  uint32_t *surf_offset)
  +static void
  +update_buffer_texture_surface(struct gl_context *ctx,
  +  unsigned unit,
  +  uint32_t *surf_offset)
   {
  struct brw_context *brw = brw_context(ctx);
  struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current;
  @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
  struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
  uint32_t *surf;
   
  -   /* BRW_NEW_TEXTURE_BUFFER */
  -   if (tObj-Target == GL_TEXTURE_BUFFER) {
  -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
  -  return;
  -   }
  -
  surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
   6 * 4, 32, surf_offset);
   
  @@ -795,6 +789,21 @@ const struct brw_tracked_state 
  gen6_renderbuffer_surfaces = {
  .emit = update_renderbuffer_surfaces,
   };
   
  +static void
  +update_texture_surface(struct gl_context *ctx,
  +   unsigned unit,
  +   uint32_t *surf_offset,
  +   bool for_gather)
  +{
  +   struct brw_context *brw = brw_context(ctx);
  +   struct gl_texture_object *obj = ctx-Texture.Unit[unit]._Current;
  +
  +   if (obj-Target == GL_TEXTURE_BUFFER) {
  +  update_buffer_texture_surface(ctx, unit, surf_offset);
 
  In order to avoid extra level of indentation I used the following. I would
  have preferred it here also.
 
if (obj-Target == GL_TEXTURE_BUFFER) {
   update_buffer_texture_surface(ctx, unit, surf_offset);
   return;
}
 
 I kept this as an indented block because it's harmless IMHO and it
 seemed a somewhat lesser evil than:
 1/ Define all texture-specific variables (i.e. things that are not
applicable to buffer textures, including some pointer dereferences)
at the top level, which is what you did, but it seemed a bit dodgy.
 2/ Mix statements and declarations. (Granted, this file is likely
already relying on other C99 features, so it wouldn't matter in
practice, it's just a codestyle itch)
 3/ Declare stuff and leave it uninitialized until later.
 
 That said, the reason was largely subjective, and I don't really have a
 strong preference.  As you are still the author of this commit you're
 free to format it as you wish, you can keep my R-b if you simply
 reindent this function.

If Ken and Matt are happy with this series, so am I. I'll just be glad if
we can land it.
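
For readers skimming the thread, here is a small sketch of the shape
being discussed: one hardware-independent entry point handles the
buffer-texture case itself and defers everything else to a
per-generation hook.  The types and names are hypothetical and far
simpler than the real brw_context/vtbl; the counters exist only so the
dispatch can be observed.

```c
enum sketch_target { SKETCH_TEXTURE_2D, SKETCH_TEXTURE_BUFFER };

struct sketch_ctx;
typedef void (*sketch_update_fn)(struct sketch_ctx *ctx, unsigned unit);

/* Hypothetical context: a target per texture unit plus one hook that a
 * hardware generation would fill in. */
struct sketch_ctx {
   enum sketch_target target[4];
   sketch_update_fn gen_update_texture_surface;
   int buffer_updates;
   int gen_updates;
};

/* Generation-independent path for buffer textures. */
static void
sketch_update_buffer_texture(struct sketch_ctx *ctx, unsigned unit)
{
   (void)unit;
   ctx->buffer_updates++;
}

/* Stand-in for one generation's implementation of the hook. */
static void
sketch_gen7_update_texture(struct sketch_ctx *ctx, unsigned unit)
{
   (void)unit;
   ctx->gen_updates++;
}

/* Single dispatch point, written with the early return. */
static void
sketch_update_texture_surface(struct sketch_ctx *ctx, unsigned unit)
{
   if (ctx->target[unit] == SKETCH_TEXTURE_BUFFER) {
      sketch_update_buffer_texture(ctx, unit);
      return;
   }

   ctx->gen_update_texture_surface(ctx, unit);
}
```

The if/else form from the patch behaves identically here; the choice
between the two only starts to matter once the non-buffer path needs
local declarations of its own, which is the trade-off Francisco lists.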
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:13PM +0300, Abdiel Janulgue wrote:
 Now that we consider UBO constants as push constants, we need to include
 the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
 This information is needed to properly pack vector constants tightly next to
 each other.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
  3 files changed, 37 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
 b/src/mesa/drivers/dri/i965/brw_gs.c
 index 97658d5..2dc3ea1 100644
 --- a/src/mesa/drivers/dri/i965/brw_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_gs.c
 @@ -32,6 +32,7 @@
  #include "brw_vec4_gs_visitor.h"
  #include "brw_state.h"
  #include "brw_ff_gs.h"
 +#include "glsl/nir/nir_types.h"
  
  
  bool
 @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
 c.prog_data.base.base.pull_param =
rzalloc_array(NULL, const gl_constant_value *, param_count);
 c.prog_data.base.base.nr_params = param_count;
 +   c.prog_data.base.base.nr_ubo_params = 0;
 +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
 +      for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
 +         const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
 +         const struct glsl_type *elem = glsl_get_element_type(type);
 +         int array_sz = elem ? glsl_get_array_size(type) : 1;
 +         int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
 +         c.prog_data.base.base.nr_ubo_params += components * array_sz;
 +      }
 +   }
 c.prog_data.base.base.nr_gather_table = 0;
 c.prog_data.base.base.gather_table =
rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
 diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
 b/src/mesa/drivers/dri/i965/brw_vs.c
 index 52333c9..86bef5e 100644
 --- a/src/mesa/drivers/dri/i965/brw_vs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vs.c
 @@ -37,6 +37,7 @@
  #include brw_state.h
  #include program/prog_print.h
  #include program/prog_parameter.h
 +#include glsl/nir/nir_types.h
  
  #include util/ralloc.h
  
 @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
rzalloc_array(NULL, const gl_constant_value *, param_count);
 stage_prog_data-nr_params = param_count;
  
 +   stage_prog_data->nr_ubo_params = 0;
 +   if (vs) {
 +      for (int i = 0; i < vs->NumUniformBlocks; i++) {
 +         for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
 +            const struct glsl_type *type = vs->UniformBlocks[i].Uniforms[p].Type;
 +            const struct glsl_type *elem = glsl_get_element_type(type);
 +            int array_sz = elem ? glsl_get_array_size(type) : 1;
 +            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
 +            stage_prog_data->nr_ubo_params += components * array_sz;
 +         }
 +      }
 +   }
 stage_prog_data-nr_gather_table = 0;
 stage_prog_data-gather_table = rzalloc_size(NULL, 
 sizeof(*stage_prog_data-gather_table) *
  (stage_prog_data-nr_params +
 diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
 b/src/mesa/drivers/dri/i965/brw_wm.c
 index 13a64d8..2060eab 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm.c
 @@ -38,6 +38,7 @@
  #include main/samplerobj.h
  #include program/prog_parameter.h
  #include program/program.h
 +#include glsl/nir/nir_types.h
  #include intel_mipmap_tree.h
  
  #include util/ralloc.h
 @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
rzalloc_array(NULL, const gl_constant_value *, param_count);
 prog_data.base.nr_params = param_count;
  
 +   prog_data.base.nr_ubo_params = 0;
 +   if (fs) {
 +      for (int i = 0; i < fs->NumUniformBlocks; i++) {
 +         for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
 +            const struct glsl_type *type = fs->UniformBlocks[i].Uniforms[p].Type;
 +            const struct glsl_type *elem = glsl_get_element_type(type);
 +            int array_sz = elem ? glsl_get_array_size(type) : 1;
 +            int components = elem ? glsl_get_components(elem) : glsl_get_components(type);
 +            prog_data.base.nr_ubo_params += components * array_sz;
 +         }
 +      }
 +   }

I didn't check the exact details, but it looks to me like you could refactor
this into its own routine - all three occurrences look awfully similar.
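
Something along these lines is what I have in mind, shown as a
standalone sketch.  The sketch_ types are simplified, hypothetical
stand-ins for the uniform block and glsl_type data (the real code would
keep calling glsl_get_element_type(), glsl_get_array_size() and
glsl_get_components()); only the counting logic, components times array
size summed over all uniforms of all blocks, is taken from the patch.

```c
/* Hypothetical, simplified view of a uniform's type. */
struct sketch_type {
   unsigned components;   /* components of the (element) type */
   unsigned array_size;   /* 0 when the uniform is not an array */
};

struct sketch_uniform {
   struct sketch_type type;
};

struct sketch_block {
   const struct sketch_uniform *uniforms;
   unsigned num_uniforms;
};

/* Shared helper: total number of scalar slots used by all UBO uniforms. */
static unsigned
sketch_count_ubo_params(const struct sketch_block *blocks,
                        unsigned num_blocks)
{
   unsigned total = 0;

   for (unsigned i = 0; i < num_blocks; i++) {
      for (unsigned p = 0; p < blocks[i].num_uniforms; p++) {
         const struct sketch_type *t = &blocks[i].uniforms[p].type;
         /* Non-array uniforms count as a single element. */
         const unsigned array_sz = t->array_size ? t->array_size : 1;

         total += t->components * array_sz;
      }
   }

   return total;
}
```

Each of the three compile functions would then reduce to a single call
that assigns the result to nr_ubo_params.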

 prog_data.base.nr_gather_table = 0;
 prog_data.base.gather_table = rzalloc_size(NULL, 
 sizeof(*prog_data.base.gather_table) *
(prog_data.base.nr_params +
 -- 
 1.9.1
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp

2015-05-07 Thread Jason Ekstrand

On Wed, May 6, 2015 at 7:29 PM, Matt Turner matts...@gmail.com wrote:
 On Wed, May 6, 2015 at 7:09 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 It's a weird thing that provides some values related to 2**x.  It's also
 already handled by a case in the switch.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com

 The series is

 Reviewed-by: Matt Turner matts...@gmail.com

I was going to complain about you making my SPIR-V -> NIR translator
harder to write.  But, based on the discussion by Ken and Ilia on IRC,
it looks like basically no one's hardware does a base-e log.  I'll
just lower on-the-fly.  I guess maybe we could do it with pow(x, e)
but meh.  If you'd like, the series is

Acked-by: Jason Ekstrand jason.ekstr...@intel.com

I can't say I read it enough to call it a review but I glanced through
it and it seems ok.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote:
 On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
  This patch implements the binding table enable command which is also
  used to allocate a binding table pool where hardware-generated
  binding table entries are flushed into. Each binding table offset in
  the binding table pool is unique for each shader stage that is
  enabled within a batch.
  
  Also insert the required brw_tracked_state objects to enable
  hw-generated binding tables in normal render path.
  
  Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
  ++
   src/mesa/drivers/dri/i965/brw_context.c|  4 ++
   src/mesa/drivers/dri/i965/brw_context.h|  5 ++
   src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
   src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
   src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
   6 files changed, 92 insertions(+)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
  b/src/mesa/drivers/dri/i965/brw_binding_tables.c
  index 459165a..a58e32e 100644
  --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
  +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
  @@ -44,6 +44,11 @@
   #include brw_state.h
   #include intel_batchbuffer.h
   
  +/* Somehow the hw-binding table pool offset must start here, otherwise
  + * the GPU will hang
  + */
  +#define HW_BT_START_OFFSET 256;
 
 I think we want to understand this a little better before enabling...
 
  +
   /**
* Upload a shader stage's binding table as indirect state.
*
  @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
  .emit = brw_gs_upload_binding_table,
   };
   
  +/**
  + * Hardware-generated binding tables for the resource streamer
  + */
  +void
  +gen7_disable_hw_binding_tables(struct brw_context *brw)
  +{
  +   BEGIN_BATCH(3);
  +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
  +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
  BRW_HW_BINDING_TABLE_ENABLE) |
  + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
  +   OUT_BATCH(0);
  +   ADVANCE_BATCH();
  +
  +   /* Pipe control workaround */
  +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
  +}
  +
  +void
  +gen7_enable_hw_binding_tables(struct brw_context *brw)
  +{
  +   if (!brw->has_resource_streamer) {
  +  gen7_disable_hw_binding_tables(brw);
 
 I started wondering why we really need this - RS is disabled by default and
 we haven't needed to do anything to disable it before.

Right, patch number eight gave me the answer, we want to disable it for blorp.
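
A side note on the second OUT_BATCH dword in gen7_disable_hw_binding_tables(), quoted above: it has the shape `A | cond ? B : 0`. In C, bitwise OR binds tighter than the conditional operator, so that expression groups as `(A | cond) ? B : 0`. The following is only a minimal sketch with made-up field values (the EXAMPLE_* names are hypothetical and not taken from the i965 headers) to show how the two groupings differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for register fields; the values are illustrative
 * only and are not taken from the i965 headers. */
#define EXAMPLE_ENABLE_FIELD   0x00000800u
#define EXAMPLE_RESERVED_BITS  0x00000003u

/* Groups as (EXAMPLE_ENABLE_FIELD | flag) ? EXAMPLE_RESERVED_BITS : 0.
 * The condition is always non-zero, so the enable field never reaches
 * the result. */
uint32_t dword_without_parens(bool flag)
{
   return EXAMPLE_ENABLE_FIELD | flag ? EXAMPLE_RESERVED_BITS : 0;
}

/* Explicit parentheses give the grouping that was presumably intended. */
uint32_t dword_with_parens(bool flag)
{
   return EXAMPLE_ENABLE_FIELD | (flag ? EXAMPLE_RESERVED_BITS : 0);
}

/* Usage: the same flag yields two different dwords. */
void precedence_example(void)
{
   assert(dword_without_parens(true) == EXAMPLE_RESERVED_BITS);
   assert(dword_with_parens(true) ==
          (EXAMPLE_ENABLE_FIELD | EXAMPLE_RESERVED_BITS));
}
```

Whether this matters for the patch depends on what the real field values are. Compilers typically flag the unparenthesized form with a parentheses warning.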

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
 Reserve space in the gather pool where the gathered uniforms are flushed.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
  1 file changed, 8 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
 b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 index 35d10ef..aebaa49 100644
 --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
 */
assert(stage_state->push_const_size <= 32);
 }
 +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
 chunks*/
 +   if (brw->gather_pool.bo != NULL) {
 +  if (prog_data->nr_params > 0) {

I guess you could combine these conditions:

  if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)

Or even bail out early:

  if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
     return;

 + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;

This could be const, no big deal though.

 + stage_state->push_const_offset = brw->gather_pool.next_offset;
 + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
 +  }
 +   }
  }
  
  static void
 -- 
 1.9.1
 
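
To make the early bail-out suggested above concrete, here is a minimal sketch of the reservation arithmetic from the quoted hunk. The struct layouts and the example_/EXAMPLE_ names are hypothetical stand-ins for the driver's own types, and the sketch assumes 32-bit uniform components (four per 128-bit constant, four constants per 512-bit chunk):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
#define EXAMPLE_ALIGN(v, a) (((v) + (a) - 1) & ~((a) - 1))

/* Hypothetical, trimmed-down stand-ins for the driver structures. */
struct example_gather_pool {
   void *bo;              /* NULL when the gather pool is not in use */
   uint32_t next_offset;  /* next free byte in the pool */
};

struct example_stage_state {
   uint32_t push_const_offset;
};

/* Reserve pool space for nr_params uniform components, in 64-byte
 * (512-bit) chunks, mirroring the arithmetic of the quoted hunk. */
void example_reserve_gather_space(struct example_gather_pool *pool,
                                  struct example_stage_state *stage,
                                  unsigned nr_params)
{
   /* Bail out early instead of nesting two if-blocks. */
   if (pool->bo == NULL || nr_params == 0)
      return;

   const unsigned num_consts = EXAMPLE_ALIGN(nr_params, 4u) / 4;

   stage->push_const_offset = pool->next_offset;
   pool->next_offset += (EXAMPLE_ALIGN(num_consts, 4u) / 4) * 64;
}
```

With five components this reserves one 64-byte chunk; with seventeen it reserves two.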

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Ilia Mirkin

On Thu, May 7, 2015 at 10:52 AM, Pohjolainen, Topi
topi.pohjolai...@intel.com wrote:
 On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
 Reserve space in the gather pool where the gathered uniforms are flushed.

 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
  1 file changed, 8 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
 b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 index 35d10ef..aebaa49 100644
 --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
 @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
 */
assert(stage_state->push_const_size <= 32);
 }
 +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
 chunks*/
 +   if (brw->gather_pool.bo != NULL) {
 +  if (prog_data->nr_params > 0) {

 I guess you could combine these conditions:

   if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)

 Or even bail out early:

   if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
      return;

 + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;

 This could be const, no big deal though.

And it could be DIV_ROUND_UP...


 + stage_state->push_const_offset = brw->gather_pool.next_offset;
 + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
 +  }
 +   }
  }

  static void
 --
 1.9.1

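
The DIV_ROUND_UP remark can be checked with a small sketch. The macros below are local definitions written for this illustration; Mesa has its own ALIGN and DIV_ROUND_UP, which these are only assumed to resemble:

```c
#include <assert.h>

/* Local definitions for illustration only. */
#define EXAMPLE_ALIGN(v, a)        (((v) + (a) - 1) & ~((a) - 1))
#define EXAMPLE_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of 4-component constants, written as in the quoted patch. */
unsigned consts_via_align(unsigned nr_params)
{
   return EXAMPLE_ALIGN(nr_params, 4u) / 4;
}

/* The same quantity, written as a rounding-up division. */
unsigned consts_via_div_round_up(unsigned nr_params)
{
   return EXAMPLE_DIV_ROUND_UP(nr_params, 4u);
}

/* Usage: both spellings agree over a range of inputs. */
void div_round_up_example(void)
{
   for (unsigned n = 0; n <= 256; n++)
      assert(consts_via_align(n) == consts_via_div_round_up(n));
}
```

The two forms compute the same value for a power-of-two divisor; the second states the intent (count of groups of four) more directly.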

Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Francisco Jerez

Pohjolainen, Topi topi.pohjolai...@intel.com writes:

 On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
 From: Topi Pohjolainen topi.pohjolai...@intel.com
 
 All generations do the same exact dispatch and it could be
 therefore done in the hardware independent stage.
 
 Reviewed-by: Matt Turner matts...@gmail.com
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 [ Francisco Jerez: Non-trivial rebase. ]
 Reviewed-by: Francisco Jerez curroje...@riseup.net
 ---
  src/mesa/drivers/dri/i965/brw_context.h   |  3 -
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
 +++
  src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 
 ++
  4 files changed, 83 insertions(+), 87 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 2fcdcfa..a6282f4 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
 *brw,
   uint32_t size,
   uint32_t *out_offset,
   bool dword_pitch);
 -void brw_update_buffer_texture_surface(struct gl_context *ctx,
 -   unsigned unit,
 -   uint32_t *surf_offset);
  void
  brw_update_sol_surface(struct brw_context *brw,
 struct gl_buffer_object *buffer_obj,
 diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 index 160dd2f..2b8040c 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
 }
  }
  
 -void
 -brw_update_buffer_texture_surface(struct gl_context *ctx,
 -  unsigned unit,
 -  uint32_t *surf_offset)
 +static void
 +update_buffer_texture_surface(struct gl_context *ctx,
 +  unsigned unit,
 +  uint32_t *surf_offset)
  {
 struct brw_context *brw = brw_context(ctx);
 struct gl_texture_object *tObj = ctx-Texture.Unit[unit]._Current;
 @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
 struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
 uint32_t *surf;
  
 -   /* BRW_NEW_TEXTURE_BUFFER */
 -   if (tObj-Target == GL_TEXTURE_BUFFER) {
 -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
 -  return;
 -   }
 -
 surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
6 * 4, 32, surf_offset);
  
 @@ -795,6 +789,21 @@ const struct brw_tracked_state 
 gen6_renderbuffer_surfaces = {
 .emit = update_renderbuffer_surfaces,
  };
  
 +static void
 +update_texture_surface(struct gl_context *ctx,
 +   unsigned unit,
 +   uint32_t *surf_offset,
 +   bool for_gather)
 +{
 +   struct brw_context *brw = brw_context(ctx);
 +   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
 +
 +   if (obj->Target == GL_TEXTURE_BUFFER) {
 +  update_buffer_texture_surface(ctx, unit, surf_offset);

 In order to avoid extra level of indentation I used the following. I would
 have preferred it here also.

   if (obj->Target == GL_TEXTURE_BUFFER) {
      update_buffer_texture_surface(ctx, unit, surf_offset);
      return;
   }

I kept this as an indented block because it's harmless IMHO and it
seemed a somewhat lesser evil than:
1/ Define all texture-specific variables (i.e. things that are not
   applicable to buffer textures, including some pointer dereferences)
   at the top level, which is what you did, but it seemed a bit dodgy.
2/ Mix statements and declarations. (Granted, this file is likely
   already relying on other C99 features, so it wouldn't matter in
   practice, it's just a codestyle itch)
3/ Declare stuff and leave it uninitialized until later.

That said, the reason was largely subjective, and I don't really have a
strong preference.  As you are still the author of this commit you're
free to format it as you wish, you can keep my R-b if you simply
reindent this function.

 +   } else {
 +  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
 +   }
 +}
  
  static void
  update_stage_texture_surfaces(struct brw_context *brw,
 @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
  
   /* _NEW_TEXTURE */
   if (ctx->Texture.Unit[unit]._Current) {
 -brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, 
 for_gather);
 +update_texture_surface(ctx, unit,
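
For readers comparing the two formatting options discussed in this message, the following is a minimal sketch of the same dispatch written both ways. The types and the example_ names are hypothetical and much simpler than the driver's; the counters exist only so the two forms can be compared:

```c
#include <assert.h>

enum example_target { EXAMPLE_TEXTURE_2D, EXAMPLE_TEXTURE_BUFFER };

/* Hypothetical stand-in for the context; counters record which path ran. */
struct example_ctx {
   enum example_target target;
   int buffer_updates;
   int texture_updates;
};

static void example_update_buffer(struct example_ctx *ctx)
{
   ctx->buffer_updates++;
}

static void example_update_texture(struct example_ctx *ctx)
{
   ctx->texture_updates++;
}

/* Form kept in the rebased patch: an if/else block. */
void example_dispatch_if_else(struct example_ctx *ctx)
{
   if (ctx->target == EXAMPLE_TEXTURE_BUFFER) {
      example_update_buffer(ctx);
   } else {
      example_update_texture(ctx);
   }
}

/* Form preferred in the review: return early, no else branch. */
void example_dispatch_early_return(struct example_ctx *ctx)
{
   if (ctx->target == EXAMPLE_TEXTURE_BUFFER) {
      example_update_buffer(ctx);
      return;
   }

   example_update_texture(ctx);
}
```

Both functions take the same path for every target, so the choice is one of style, as the message says.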

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:52:12PM +0300, Pohjolainen, Topi wrote:
 On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
  Reserve space in the gather pool where the gathered uniforms are flushed.
  
  Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
   1 file changed, 8 insertions(+)
  
  diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
  b/src/mesa/drivers/dri/i965/gen6_vs_state.c
  index 35d10ef..aebaa49 100644
  --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
  +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
  @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
  */
 assert(stage_state->push_const_size <= 32);
  }
  +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
  chunks*/
  +   if (brw->gather_pool.bo != NULL) {
  +  if (prog_data->nr_params > 0) {
 
 I guess you could combine these conditions:
 
   if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)
 
 Or even bail out early:

Never mind, you modify it even further in the next patch.

 
   if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
      return;
 
  + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
 
 This could be const, no big deal though.
 
  + stage_state->push_const_offset = brw->gather_pool.next_offset;
  + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
  +  }
  +   }
   }
   
   static void
  -- 
  1.9.1
  

Re: [Mesa-dev] [PATCH 12/27] i965: Assign hw-binding table index for each UBO constant buffer.

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:09PM +0300, Abdiel Janulgue wrote:
 To be able to refer to a constant buffer, the resource streamer needs
 to index it with a hardware binding table entry. This blankets the ubo
 buffers with hardware binding table indices.
 
 Gather constants hardware fetches in 16-entry binding table blocks.
 So we need to use a block that is unused.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_context.h  | 11 +++
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 ++
  2 files changed, 17 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index e25c64d..276c359 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -678,6 +678,17 @@ struct brw_vs_prog_data {
  
  #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
  
 +/** Start of hardware binding table index for uniform gather constant 
 entries.
 + *  This must be aligned to the start of a hardware binding table block (a 
 block
 + *  is a group 16 binding table entries).
 + */
 +#define BRW_UNIFORM_GATHER_INDEX_START 32
 +
 +/** Appended to the end of the binding table index for uniform constant 
 buffers to indicate

Wrap this line.

 + *  start of the UBO gather constant binding table.
 + */
 +#define BRW_UBO_GATHER_INDEX_APPEND 2
 +
  /* Note: brw_gs_prog_data_compare() must be updated when adding fields to
   * this struct!
   */
 diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 index 161d140..ce61554 100644
 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
 @@ -884,6 +884,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
  
 uint32_t *surf_offsets =
&stage_state->surf_offset[prog_data->binding_table.ubo_start];
 +   bool use_gather = (brw->gather_pool.bo != NULL);

I would move this closer to the only use. This won't get re-used in the
rest of the series.

  
 for (int i = 0; i < shader->NumUniformBlocks; i++) {
struct gl_uniform_buffer_binding *binding;
 @@ -904,6 +905,11 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
bo->size - binding->Offset,
&surf_offsets[i],
dword_pitch);
 +  if (use_gather) {

Or simply:

 if (brw->gather_pool.bo) {

 + int bt_idx = BRW_UNIFORM_GATHER_INDEX_START + 
 BRW_UBO_GATHER_INDEX_APPEND + i;

Wrap this line.

 + gen7_update_binding_table(brw, stage_state->stage,
 +   bt_idx, surf_offsets[i]);
 +  }
 }
  
 if (shader-NumUniformBlocks)
 -- 
 1.9.1
 
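
As a rough illustration of the index arithmetic in the quoted patch, the sketch below computes the binding table index for the i-th uniform block and the 16-entry block it falls in. The two BRW_ values are copied from the patch; the example_/EXAMPLE_ names are hypothetical helpers added for this sketch:

```c
#include <assert.h>

/* Values copied from the quoted patch. */
#define BRW_UNIFORM_GATHER_INDEX_START 32
#define BRW_UBO_GATHER_INDEX_APPEND 2

/* The commit message says the hardware fetches binding table entries in
 * blocks of 16. */
#define EXAMPLE_BT_BLOCK_ENTRIES 16

/* Binding table index used for the i-th uniform block. */
int example_ubo_bt_index(int i)
{
   assert(i >= 0);
   return BRW_UNIFORM_GATHER_INDEX_START + BRW_UBO_GATHER_INDEX_APPEND + i;
}

/* Number of the 16-entry block that a binding table index falls in. */
int example_bt_block(int bt_idx)
{
   return bt_idx / EXAMPLE_BT_BLOCK_ENTRIES;
}
```

With these values the start index sits on a block boundary (32 is a multiple of 16), uniform block 0 lands at index 34, and uniform block 14 would be the first to land in the following 16-entry block. The patch does not say whether that case needs special handling.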

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 07:26:12AM -0700, Jason Ekstrand wrote:
 On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi
 topi.pohjolai...@intel.com wrote:
  On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
  v2: Get rid of the block parameter and make src a const reference
 
  Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
  Reviewed-by: Matt Turner matts...@gmail.com
  Reviewed-by: Kenneth Graunke kenn...@whitecape.org
  ---
   src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
  
   1 file changed, 38 insertions(+), 37 deletions(-)
 
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
  b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
  index 43370cb..9c4ed0b 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
  @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool 
  *negate)
 operands_match(a, b, negate);
   }
 
  +static fs_inst *
  +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)
 
  Did you mean 'src' to be constant reference? It is only used for reading
  so it could be - you claim this in the commit message yourself :)
 
 Oops...  I think what happened is that I tried to do it for
 is_copy_payload not create_copy_instr.  But then is_copy_payload does
 actually change it so I put it back and somehow my brain leaked it
 into the commit message.  Unfortunately, it's already pushed so I
 can't change it now.  However, I could make a fixup if you'd like.
 --Jason

No big deal really, I'm sure the compiler handles that for us anyway.

[Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable

2015-05-07 Thread Martin Peres

Suggested-by: Axel Davy axel.d...@ens.fr
Signed-off-by: Martin Peres martin.pe...@intel.linux.com
---
 docs/envvars.html | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 31d14a4..c0d5a51 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
 <li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for 
debugging)
 <li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of 
glXSwapBuffers
 calls per second.
+<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
 </ul>
 
 
-- 
2.4.0

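
The phrase "if set (the value does not matter)" describes a presence test rather than a value test. A minimal sketch of such a test using the standard getenv() follows; the helper name is hypothetical, and this is not claimed to be how Mesa itself reads the variable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* True when the variable is present in the environment, whatever its
 * value, including the empty string. */
bool example_env_is_set(const char *name)
{
   assert(name != NULL);
   return getenv(name) != NULL;
}
```

Under this reading, `LIBGL_DRI3_DISABLE=0` and `LIBGL_DRI3_DISABLE=` would both count as set, which is why the documentation line spells out that the value is ignored.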

Re: [Mesa-dev] [PATCH 26/27] i965: Disable gather push constants for null constants

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:23PM +0300, Abdiel Janulgue wrote:
 Programming null constants with gather constant tables seems to
 be unsupported and results in a GPU lockup even with the prescribed
 GPU workarounds in the bspec. Found out by trial and error that
 disabling HW gather constant when the constant state for a stage
 needs to be nullified is the only way to go around the issue.

Just a general question. We keep the resource streamer itself always enabled
(except for blorp, of course). Does it still do something meaningful without
gather constants, or should we disable them both?

Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote:
 On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
  This patch implements the binding table enable command which is also
  used to allocate a binding table pool where hardware-generated
  binding table entries are flushed into. Each binding table offset in
  the binding table pool is unique per each shader stage that are
  enabled within a batch.
  
  Also insert the required brw_tracked_state objects to enable
  hw-generated binding tables in normal render path.
  
  Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
  ++
   src/mesa/drivers/dri/i965/brw_context.c|  4 ++
   src/mesa/drivers/dri/i965/brw_context.h|  5 ++
   src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
   src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
   src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
   6 files changed, 92 insertions(+)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
  b/src/mesa/drivers/dri/i965/brw_binding_tables.c
  index 459165a..a58e32e 100644
  --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
  +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
  @@ -44,6 +44,11 @@
   #include "brw_state.h"
   #include "intel_batchbuffer.h"
   
  +/* Somehow the hw-binding table pool offset must start here, otherwise
  + * the GPU will hang
  + */
  +#define HW_BT_START_OFFSET 256;
 
 I think we want to understand this a little better before enabling...
 
  +
   /**
* Upload a shader stage's binding table as indirect state.
*
  @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
  .emit = brw_gs_upload_binding_table,
   };
   
  +/**
  + * Hardware-generated binding tables for the resource streamer
  + */
  +void
  +gen7_disable_hw_binding_tables(struct brw_context *brw)
  +{
  +   BEGIN_BATCH(3);
  +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
  +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
  BRW_HW_BINDING_TABLE_ENABLE) |
  + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
  +   OUT_BATCH(0);
  +   ADVANCE_BATCH();
  +
  +   /* Pipe control workaround */
  +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
  +}
  +
  +void
  +gen7_enable_hw_binding_tables(struct brw_context *brw)
  +{
  +   if (!brw->has_resource_streamer) {
  +  gen7_disable_hw_binding_tables(brw);
 
 I started wondering why we really need this - RS is disabled by default and
 we haven't needed to do anything to disable it before.
 
  +  return;
  +   }
  +
  +   if (!brw->hw_bt_pool.bo) {
  +  /* From the BSpec, 3D Pipeline > Resource Streamer > Hardware 
  Binding Tables:
  +   *
  +   *  A maximum of 16,383 Binding tables are allowed in any batch 
  buffer.
  +   */
  +  int max_size = 16383 * 4;
 
 But does it really need this much all the time? I guess I need to go and
 read the spec.

I haven't read through the entire series, but it seems that we can calculate
(at least for gather constants) pretty accurately how much space we need.
Could we do it here as well, based on the program data of all stages? I may be
missing something and just throwing questions up in the air, so bear with me...

Re: [Mesa-dev] [PATCH 18/27] i965/fs: Append ir_binop_ubo_load entries to the gather table

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:15PM +0300, Abdiel Janulgue wrote:
 When the const block and offset are immediate values. Otherwise just
 fall-back to the previous method of uploading the UBO constant data to
 GRF using pull constants.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 11 
  src/mesa/drivers/dri/i965/brw_fs.h   |  4 ++
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 
 +++-
  3 files changed, 100 insertions(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 071ac59..031d807 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -2273,6 +2273,7 @@ fs_visitor::assign_constant_locations()
 }
  
 stage_prog_data->nr_params = 0;
 +   stage_prog_data->nr_ubo_params = ubo_uniforms;
  
 unsigned const_reg_access[uniforms];
 memset(const_reg_access, 0, sizeof(const_reg_access));
 @@ -2302,6 +2303,16 @@ fs_visitor::assign_constant_locations()
stage_prog_data->gather_table[p].channel_mask =
   const_reg_access[i];
 }
 +
 +   for (unsigned i = 0; i < this->nr_ubo_gather_table; i++) {
 +  int p = stage_prog_data->nr_gather_table++;
 +  stage_prog_data->gather_table[p].reg = this->ubo_gather_table[i].reg;
 +  stage_prog_data->gather_table[p].channel_mask = 
 this->ubo_gather_table[i].channel_mask;
 +  stage_prog_data->gather_table[p].const_block = 
 this->ubo_gather_table[i].const_block;
 +  stage_prog_data->gather_table[p].const_offset = 
 this->ubo_gather_table[i].const_offset;
 +  stage_prog_data->max_ubo_const_block = 
 MAX2(stage_prog_data->max_ubo_const_block,
 +  
 this->ubo_gather_table[i].const_block);

These are all overflowing 80 columns.

 +   }
  }
  
  /**
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index 32063f0..a48b2bb 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -417,6 +417,7 @@ public:
 void setup_uniform_values(ir_variable *ir);
 void setup_builtin_uniform_values(ir_variable *ir);
 int implied_mrf_writes(fs_inst *inst);
 +   bool generate_ubo_gather_table(ir_expression* ir);
  
 virtual void dump_instructions();
 virtual void dump_instructions(const char *name);
 @@ -445,6 +446,9 @@ public:
 /** Total number of direct uniforms we can get from NIR */
 unsigned num_direct_uniforms;
  
 +   /** Number of ubo uniform variable components visited. */
 +   unsigned ubo_uniforms;
 +
 /** Byte-offset for the next available spot in the scratch space buffer. 
 */
 unsigned last_scratch;
  
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 4e99366..11e608b 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1179,11 +1179,18 @@ fs_visitor::visit(ir_expression *ir)
emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, this-result, op[0], op[1]);
break;
 case ir_binop_ubo_load: {
 +  /* Use gather push constants if at all possible, otherwise just
 +   * fall back to pull constants for UBOs
 +   */
 +  if (generate_ubo_gather_table(ir))
 + break;
 +
/* This IR node takes a constant uniform block and a constant or
 * variable byte offset within the block and loads a vector from that.
 */
ir_constant *const_uniform_block = ir-operands[0]-as_constant();
ir_constant *const_offset = ir-operands[1]-as_constant();
 +

Not part of this patch.

fs_reg surf_index;
  
if (const_uniform_block) {
 @@ -4144,6 +4151,79 @@ fs_visitor::resolve_bool_comparison(ir_rvalue *rvalue, 
 fs_reg *reg)
 *reg = neg_result;
  }
  
 +bool
 +fs_visitor::generate_ubo_gather_table(ir_expression *ir)
 +{
 +   ir_constant *const_uniform_block = ir->operands[0]->as_constant();
 +   ir_constant *const_offset = ir->operands[1]->as_constant();

These are only used for reading, let's use constant pointers.

 +
 +   if (ir->operation != ir_binop_ubo_load ||
 +   !brw->has_resource_streamer||
 +   !brw->fs_ubo_gather||
 +   !const_uniform_block   ||

Not really the style used elsewhere, don't align ||.

 +   !const_offset)
 +  return false;
 +
 +  /* Only allow 16 registers (128 uniform components) as push constants.
 +   */

Move the comment closing to the previous line.

 +   unsigned int max_push_components = 16 * 8;
 +   unsigned param_index = uniforms + ubo_uniforms;

These could both be declared as const.

 +   if ((param_index + ir->type->vector_elements) >= max_push_components)
 +  return false;
 +
 +   fs_reg reg;
 +   if (dispatch_width == 16) {
 +  for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {
 + if

Re: [Mesa-dev] [PATCH 19/27] i965/fs/nir: Append nir_intrinsic_load_ubo entries to the gather table

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:16PM +0300, Abdiel Janulgue wrote:
 When the const block and offset are immediate values. Otherwise just
 fall-back to the previous method of uploading the UBO constant data to
 GRF using pull constants.
 
 Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/brw_fs.h   |  2 ++
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 59 
 
  2 files changed, 61 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
 b/src/mesa/drivers/dri/i965/brw_fs.h
 index a48b2bb..5247fa1 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.h
 +++ b/src/mesa/drivers/dri/i965/brw_fs.h
 @@ -418,6 +418,8 @@ public:
 void setup_builtin_uniform_values(ir_variable *ir);
 int implied_mrf_writes(fs_inst *inst);
 bool generate_ubo_gather_table(ir_expression* ir);
 +   bool nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg 
 dest,
 +  bool has_indirect);
  
 virtual void dump_instructions();
 virtual void dump_instructions(const char *name);
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 index 3972581..b68f221 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
 @@ -1377,6 +1377,9 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
 *instr)
has_indirect = true;
/* fallthrough */
 case nir_intrinsic_load_ubo: {
 +  if (nir_generate_ubo_gather_table(instr, dest, has_indirect))
 + break;
 +
nir_const_value *const_index = nir_src_as_const_value(instr-src[0]);
fs_reg surf_index;
  
 @@ -1774,3 +1777,59 @@ fs_visitor::nir_emit_jump(nir_jump_instr *instr)
unreachable(unknown jump);
 }
  }
 +
 +bool
 +fs_visitor::nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg 
 dest,
 +  bool has_indirect)
 +{
 +   nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);

Used only for reading, const.

 +
 +   if (!const_index || has_indirect || !brw->fs_ubo_gather || 
 !brw->has_resource_streamer)

Wrap this line.

 +  return false;
 +
 +   /* Only allow 16 registers (128 uniform components) as push constants.
 +*/
 +   unsigned int max_push_components = 16 * 8;
 +   unsigned param_index = uniforms + ubo_uniforms;

These would be nicer as constants.

 +   if ((MAX2(param_index, num_direct_uniforms) +
 +instr->num_components) > max_push_components)
 +  return false;
 +
 +   fs_reg uniform_reg;
 +   if (dispatch_width == 16) {
 +  for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {

Extra space.

 + if ((this->ubo_gather_table[i].const_block ==
 +  const_index->u[0]) &&
 + (this->ubo_gather_table[i].const_offset ==
 +  (unsigned) instr->const_index[0])) {

Here also.

 +uniform_reg = fs_reg(UNIFORM, this->ubo_gather_table[i].reg);
 +break;
 + }
 +  }
 +  if (uniform_reg.file != UNIFORM) {
 + /* Unlikely but this means that SIMD8 wasn't able to allocate push 
 constant

Wrap this line.

 +  * registers for this ubo load. Fall back to pull-constant method.
 +  */
 + return false;
 +  }
 +   }
 +
 +   if (uniform_reg.file != UNIFORM) {
 +  uniform_reg = fs_reg(UNIFORM, param_index);
 +  int gather = this->nr_ubo_gather_table++;
 +
 +  assert(instr->num_components <= 4);
 +  ubo_uniforms += instr->num_components;
 +  this->ubo_gather_table[gather].reg = uniform_reg.reg;
 +  this->ubo_gather_table[gather].const_block = const_index->u[0];
 +  this->ubo_gather_table[gather].const_offset = instr->const_index[0];
 +   }
 +
 +   for (unsigned j = 0; j < instr->num_components; j++) {
 +  fs_reg src = offset(retype(uniform_reg, dest.type), j);
 +  emit(MOV(dest, src));
 +  dest = offset(dest, 1);
 +   }
 +
 +   return true;
 +}
 -- 
 1.9.1
 
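
The gather table bookkeeping in the quoted function can be summarised as "look up the (block, offset) pair, otherwise append it if the push constant budget allows". The sketch below is a simplification under stated assumptions: it merges the SIMD8 append path and the SIMD16 lookup path into one helper, it uses a strict greater-than for the budget check, and all example_/EXAMPLE_ names and the table capacity are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* 16 registers of 8 components each, as in the quoted comment. */
#define EXAMPLE_MAX_PUSH_COMPONENTS (16 * 8)
/* Arbitrary capacity chosen for this sketch. */
#define EXAMPLE_MAX_GATHER_ENTRIES 64

struct example_gather_entry {
   unsigned reg;
   unsigned const_block;
   unsigned const_offset;
};

struct example_visitor {
   unsigned uniforms;      /* plain uniform components already assigned */
   unsigned ubo_uniforms;  /* UBO components assigned via the table */
   unsigned nr_entries;
   struct example_gather_entry table[EXAMPLE_MAX_GATHER_ENTRIES];
};

/* Find the register for (block, offset), appending a new entry if none
 * exists. Returns false when the load should use pull constants. */
bool example_gather_lookup(struct example_visitor *v,
                           unsigned block, unsigned offset,
                           unsigned num_components, unsigned *reg_out)
{
   assert(num_components >= 1 && num_components <= 4);

   /* Reuse the register if this (block, offset) pair was seen before. */
   for (unsigned i = 0; i < v->nr_entries; i++) {
      if (v->table[i].const_block == block &&
          v->table[i].const_offset == offset) {
         *reg_out = v->table[i].reg;
         return true;
      }
   }

   /* Respect the push constant budget before appending a new entry. */
   const unsigned param_index = v->uniforms + v->ubo_uniforms;
   if (param_index + num_components > EXAMPLE_MAX_PUSH_COMPONENTS ||
       v->nr_entries == EXAMPLE_MAX_GATHER_ENTRIES)
      return false;

   struct example_gather_entry *e = &v->table[v->nr_entries++];
   e->reg = param_index;
   e->const_block = block;
   e->const_offset = offset;
   v->ubo_uniforms += num_components;

   *reg_out = e->reg;
   return true;
}
```

A caller would try this helper first and emit the existing pull-constant load only when it returns false, which matches the "break if the gather table handled it" structure of the quoted hunk.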

[Mesa-dev] [PATCH] i965/fs: Don't forget the force_sechalf flag in lower_load_payload().

2015-05-07 Thread Francisco Jerez

Regression from commit 41868bb6824c6106a55c8442006c1e2215abf567.
Fixes a bunch of ARB_shader_image_load_store tests.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7e4ead0..0a62e46 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3512,6 +3512,7 @@ fs_visitor::lower_load_payload()
 fs_inst *mov = MOV(retype(dst, inst->src[i].type),
inst->src[i]);
 mov->force_writemask_all = inst->force_writemask_all;
+mov->force_sechalf = inst->force_sechalf;
 inst->insert_before(block, mov);
  }
  dst = offset(dst, 1);
-- 
2.3.5


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Aaron Watry

I'm not sure what the final consensus will be on how to do this, but FWIW:
Tested-By: Aaron Watry awa...@gmail.com

I've tested this with 4 combinations:
no --with-opencl-icd option specified : libOpenCL.so gets installed in
${prefix}/lib
--with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
--with-opencl-icd=standard : libMesaOpenCL.so installed in ${prefix}/lib,
icd in /etc/OpenCL/vendors/mesa.icd
--with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in ${prefix}/lib,
icd in ${prefix}/etc//mesa.icd.  I only specified --prefix, no other
directories overridden in configure command.

--Aaron


On Wed, May 6, 2015 at 4:34 PM, EdB edb+m...@sigluy.net wrote:

 The standard ICD file path is /etc/OpenCL/vendor/.
 However it doesn't fit well with custom build.
 This option allow ICD vendor file installation path override
 ---
  configure.ac   | 46
 +++---
  src/gallium/targets/opencl/Makefile.am |  2 +-
  2 files changed, 33 insertions(+), 15 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 095e23e..90dba4e 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
   [enable OpenCL library @:@default=disabled@:@])],
 [enable_opencl=$enableval],
 [enable_opencl=no])
 -AC_ARG_ENABLE([opencl_icd],
 -   [AS_HELP_STRING([--enable-opencl-icd],
 -  [Build an OpenCL ICD library to be loaded by an ICD implementation
 -   @:@default=disabled@:@])],
 -[enable_opencl_icd=$enableval],
 -[enable_opencl_icd=no])
  AC_ARG_ENABLE([xlib-glx],
  [AS_HELP_STRING([--enable-xlib-glx],
  [make GLX library Xlib-based instead of DRI-based @:@default=disabled@:@])],
 @@ -1689,19 +1683,11 @@ if test x$enable_opencl = xyes; then
  # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
  enable_gallium_loader=yes

 -if test x$enable_opencl_icd = xyes; then
 -OPENCL_LIBNAME=MesaOpenCL
 -else
 -OPENCL_LIBNAME=OpenCL
 -fi
 -
  if test x$have_libelf != xyes; then
 AC_MSG_ERROR([Clover requires libelf])
  fi
  fi
  AM_CONDITIONAL(HAVE_CLOVER, test x$enable_opencl = xyes)
 -AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$enable_opencl_icd = xyes)
 -AC_SUBST([OPENCL_LIBNAME])

  dnl
  dnl Gallium configuration
 @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
  [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
  AC_SUBST([D3D_DRIVER_INSTALL_DIR])

 +dnl OpenCL ICD
 +
 +AC_ARG_WITH([opencl-icd],
 +[AS_HELP_STRING([--with-opencl-icd=@:@no,standard,sysconfdir@:@],
 +[Build an OpenCL ICD library to be loaded by an ICD implementation.
 + If @:@standard@:@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors.
 + @:@sysconfdir@:@ installs the file in $sysconfdir/OpenCL/vendors
 + @:@default=no@:@])],
 +[OPENCL_ICD=$withval],
 +[OPENCL_ICD=no])
 +
 +case x$OPENCL_ICD in
 +xno)
 +OPENCL_LIBNAME=OpenCL
 +;;
 +xstandard)
 +OPENCL_LIBNAME=MesaOpenCL
 +ICD_FILE_DIR=/etc/OpenCL/vendors
 +;;
 +xsysconfdir)
 +OPENCL_LIBNAME=MesaOpenCL
 +ICD_FILE_DIR=$sysconfdir/OpenCL/vendors
 +;;
 +*)
 +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd])
 +;;
 +esac
 +
 +AM_CONDITIONAL(HAVE_CLOVER_ICD, test x$OPENCL_ICD != xno)
 +AC_SUBST([OPENCL_LIBNAME])
 +AC_SUBST([ICD_FILE_DIR])
 +
  dnl
  dnl Gallium helper functions
  dnl
 diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
 index 5daf327..781daa0 100644
 --- a/src/gallium/targets/opencl/Makefile.am
 +++ b/src/gallium/targets/opencl/Makefile.am
 @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym
  EXTRA_DIST = mesa.icd opencl.sym

  if HAVE_CLOVER_ICD
 -icddir = /etc/OpenCL/vendors/
 +icddir = $(ICD_FILE_DIR)
  icd_DATA = mesa.icd
  endif

 --
 2.1.0



Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Ilia Mirkin

On Thu, May 7, 2015 at 3:59 AM, Michel Dänzer mic...@daenzer.net wrote:
 On 05.05.2015 01:47, Tom Stellard wrote:
 On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
 On Mon, May 4, 2015 at 10:04 AM, Tom Stellard t...@stellard.net wrote:
 On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
 On Sat, May 2, 2015 at 1:19 PM, EdB edb+m...@sigluy.net wrote:
 The standard ICD file path is /etc/OpenCL/vendors/.
 However, it doesn't fit well with custom builds.
 This option allows overriding the ICD vendor file installation path.
 ---
  configure.ac   | 6 ++
  src/gallium/targets/opencl/Makefile.am | 2 +-
  2 files changed, 7 insertions(+), 1 deletion(-)

 diff --git a/configure.ac b/configure.ac
 index 095e23e..bf08d76 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
  [D3D_DRIVER_INSTALL_DIR=$withval],
  [D3D_DRIVER_INSTALL_DIR=${libdir}/d3d])
  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
 +AC_ARG_WITH([icd-file-dir],
 +[AS_HELP_STRING([--with-icd-file-dir=DIR],
 +[directory for the OpenCL ICD vendor file @:@/etc/OpenCL/vendors@:@])],
 +[ICD_FILE_INSTALL_DIR=$withval],
 +[ICD_FILE_INSTALL_DIR=/etc/OpenCL/vendors])

 What about making this default to ${sysconfdir}/OpenCL/vendors ? That
 way using --prefix should auto-make it go into the prefix instead of
 unexpectedly installing things outside of the specified prefix? That
 way a distro build which specifies --sysconfdir as /etc will get it in
 the right place, while by default it'll go into /usr/local/etc and a
 user can override the icd loader's default behaviour with
 OPENCL_VENDOR_PATH?
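
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not part of any patch in the thread: the function name default_icd_dir is hypothetical, the fallback to ${prefix}/etc reflects autoconf's usual default for sysconfdir, and whether a given ICD loader honours OPENCL_VENDOR_PATH is an assumption that depends on the loader in use.

```shell
#!/bin/sh
# Hypothetical helper: where the vendor file would land if the default
# were ${sysconfdir}/OpenCL/vendors.
# Usage: default_icd_dir PREFIX [SYSCONFDIR]
default_icd_dir() {
    prefix=$1
    # Assumption: sysconfdir defaults to ${prefix}/etc unless overridden.
    sysconfdir=${2:-$prefix/etc}
    echo "$sysconfdir/OpenCL/vendors"
}

# Local build with only --prefix=/usr/local: stays inside the prefix.
default_icd_dir /usr/local

# Distro build passing --sysconfdir=/etc: lands in the standard location.
default_icd_dir /usr /etc

# A user of the local build could then point the loader at the prefix
# (assuming the loader reads OPENCL_VENDOR_PATH).
OPENCL_VENDOR_PATH=$(default_icd_dir /usr/local)
export OPENCL_VENDOR_PATH
```

With this default, make install never writes outside the prefix unless the packager asks for it via --sysconfdir.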


 I would prefer not to make this the default behavior, because it violates
 the spec and there could potentially be multiple icd implementations, which
 may or may not have the overrides.

 I think the best solution would be to rename the option to something like
 --enable-ocl-icd-respect-prefix (suggestions for other names encouraged)
 and have the option enable the behavior that Ilia is describing.

 This will give distros and advanced users a way to set up their system
 the way they want.

 It's just a very anti-autoconf thing to do to have make install fail
 by default unless you specify some "hey, I actually want make install
 to work" option.

 I think it's crazy to expect that, by default, people will want to
 write over their system installs, and having things go outside of the
 specified --prefix is very surprising (unless you force some other
 option). And asking the user to run make install as root is even
 crazier.


 My expectation is that, by default, when people specify --enable-opencl-icd
 they want an implementation that conforms to the specification.
 Unfortunately, this means installing icd files to /etc.

 There is no good solution here, but I'd rather have users specify a flag
 to get a sane build system, than requiring them to set a flag and set
 an environment variable just to get working OpenCL with the ICD loader.

 I guess I haven't hit this yet because there's no OpenCL support in
 nouveau or freedreno, but I made the same stink about vdpau when Emil
 tried to make it install to some system location by default. At least
 a few people seemed to agree with me back then...


 Does the vdpau spec also require installation to a specific system directory
 (e.g. /etc/)?

 Tom, I think ensuring that the OpenCL ICD loader can pick up the
 mesa.icd file is something for the distributor / administrator / user to
 worry about, not Mesa upstream.

 There's a similar situation with the drirc file, which is installed
 inside the prefix by default but only read from /etc/.

FTR, I fully agree with this assessment (it's the distributor's
problem), but my main priority was making sure make install works.

Cheers,

  -ilia
