[Mesa-dev] [Bug 106905] Account request

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106905

            Bug ID: 106905
           Summary: Account request
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: gw.foss...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 140145
  --> https://bugs.freedesktop.org/attachment.cgi?id=140145&action=edit
SSH key

Name: Gert Wollny
Email: gw.foss...@gmail.com
Username: gerddie

For now I plan to continue contributing to the GLSL->TGSI layer, r600, and
virgl.

many thanks

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 106905] Account request

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106905

--- Comment #1 from Gert Wollny  ---
Created attachment 140146
  --> https://bugs.freedesktop.org/attachment.cgi?id=140146&action=edit
GPG key


[Mesa-dev] [Bug 106905] Account request

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106905

--- Comment #2 from Gert Wollny  ---
BTW: I've added these keys already to my gitlab.fdo account.


[Mesa-dev] [Bug 106903] radv: Fragment shader output goes to wrong attachments when render targets are sparse

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106903

--- Comment #1 from Samuel Pitoiset  ---
Well, AMDVLK hangs on Polaris/Vega here. (I recompiled that fragment shader
manually).


[Mesa-dev] [Bug 106877] The game Rise of the Tomb Raider lead to GPU hang when I try in same place jump into the hole.

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106877

--- Comment #6 from Samuel Pitoiset  ---
I can reproduce the hang as well. This seems to only affect Vega and LLVM 6
(latest LLVM trunk fixes the GPU hang on my side). I have no idea what
changed.


[Mesa-dev] [Bug 106897] Ubuntu 16.04. Mesa can't be built with specified configurations

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106897

Sergii Romantsov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |NOTABUG
             Status|REOPENED                    |RESOLVED

--- Comment #6 from Sergii Romantsov  ---
Sorry, I didn't immediately understand what you meant by "core Wayland repository".

If that means the user has to build Wayland manually or upgrade the system to
18.10 because just one header is missing, then it seems this is not an issue...


[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106906

            Bug ID: 106906
           Summary: Failed to recongnize keyword “sampler2DRect” and
                    "sampler2DRectShadow"
           Product: Mesa
           Version: 17.1
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: glsl-compiler
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: zhaowei.y...@samsung.com
        QA Contact: intel-3d-b...@lists.freedesktop.org


[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106906

--- Comment #1 from Zhaowei Yuan  ---
The CTS cases
"dEQP-GLES2.functional.shaders.keywords.reserved_keywords.sampler2DRectShadow_vertex"
and
"dEQP-GLES2.functional.shaders.keywords.reserved_keywords.sampler2DRectShadow_fragment"
check whether the shader compiler recognizes the reserved keywords "sampler2DRect"
and "sampler2DRectShadow".

The GLSL ES 1.0.17 spec says they are keywords reserved for future use, and that
using them results in an error.

I've fixed the problem with the following modification:
-sampler2DRect         DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
+sampler2DRect         TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
 sampler3DRect         KEYWORD(110, 100, 0, 0, SAMPLER3DRECT);
-sampler2DRectShadow   DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
+sampler2DRectShadow   TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);


[Mesa-dev] [PATCH] glsl: Take sampler2DRect and sampler2DRectShadow as reserved

2018-06-13 Thread zhaowei yuan
"sampler2DRect" and "sampler2DRectShadow" are specified as
reserved from GLSL 1.1 and GLSL ES 1.0

Signed-off-by: zhaowei yuan 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
---
 src/compiler/glsl/glsl_lexer.ll | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
index de6dc64..87b64e0 100644
--- a/src/compiler/glsl/glsl_lexer.ll
+++ b/src/compiler/glsl/glsl_lexer.ll
@@ -627,9 +627,9 @@ dmat4x4 TYPE_WITH_ALT(110, 100, 400, 0, yyextra->ARB_gpu_shader_fp64_enable, gl
 fvec2  KEYWORD(110, 100, 0, 0, FVEC2);
 fvec3  KEYWORD(110, 100, 0, 0, FVEC3);
 fvec4  KEYWORD(110, 100, 0, 0, FVEC4);
-sampler2DRect  DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
+sampler2DRect  TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
 sampler3DRect  KEYWORD(110, 100, 0, 0, SAMPLER3DRECT);
-sampler2DRectShadow    DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
+sampler2DRectShadow    TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
 sizeof KEYWORD(110, 100, 0, 0, SIZEOF);
 cast   KEYWORD(110, 100, 0, 0, CAST);
 namespace  KEYWORD(110, 100, 0, 0, NAMESPACE);
-- 
1.9.1


[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106906

--- Comment #2 from Zhaowei Yuan  ---
patch is posted here:
https://patchwork.freedesktop.org/patch/229229/


Re: [Mesa-dev] [PATCH v4 075/129] nir: convert lower_samplers_as_deref to deref instructions

2018-06-13 Thread Kenneth Graunke
On Tuesday, June 12, 2018 5:54:31 PM PDT Rob Clark wrote:
> On Tue, Jun 12, 2018 at 6:34 PM, Kenneth Graunke  
> wrote:
> > On Thursday, May 31, 2018 10:04:05 PM PDT Jason Ekstrand wrote:
> >> From: Rob Clark 
> >>
> >> This also removes the legacy version of lower_samplers.
> >
> > It does not, that's what patch 76 (the next one) does.
> >
> 
> (for lack of good way of viewing full patchset atm, I'll take your
> word for that, but that said) maybe just add the words "need for" into
> that sentence, rather than squashing this and the following patch
> together, to reduce the noise in the patch history..
> 
> BR,
> -R

Yeah, definitely, I'd just change the message, not the patch.


Re: [Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.

2018-06-13 Thread Kenneth Graunke
On Tuesday, June 12, 2018 1:38:03 PM PDT Rafael Antognolli wrote:
> On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote:
> > The UBO push analysis pass incorrectly assumed that all values would fit
> > within a 32B chunk, and only recorded a bit for the 32B chunk containing
> > the starting offset.
> > 
> > For example, if a UBO contained the following, tightly packed:
> > 
> >vec4 a;  // [0, 16)
> >float b; // [16, 20)
> >vec4 c;  // [20, 36)
> > 
> > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
> > which means that we ought to record two 32B chunks in the bitfield.
> > 
> > Similarly, dvec4s would suffer from the same problem.
> > ---
> >  src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > index d58fe3dd2e3..6d6ccf73ade 100644
> > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state *state, nir_block *block)
> >   if (offset >= 64)
> >  continue;
> >  
> > + /* The value might span multiple 32-byte chunks. */
> > + const int bytes = nir_intrinsic_dest_components(intrin) *
> > +   (nir_dest_bit_size(intrin->dest) / 8);
> > + const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, 32);
> > + const int regs = end - offset + 1;
> > +
> 
> But if I understood it correctly, offset is the first 32B chunk within
> the UBO block (it's actually an ubo "chunk offset"). And you calculate
> bytes by taking the number of components times the size of each
> component of the nir_intrinsic_load_ubo instruction (which apparently
> supports multiple components). So yeah, this makes sense to me.

Yeah, that's exactly right.  load_ubo can load up to 4 components.

> Take this review with a grain of salt (assuming what I wrote above is
> correct), but this looks simple enough. So it is
> 
> Reviewed-by: Rafael Antognolli 

Thanks!


[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

            Bug ID: 106907
           Summary: Correct Transform Feedback Varyings information is
                    expected after using ProgramBinary
           Product: Mesa
           Version: git
          Hardware: Other
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: xinghua@intel.com
        QA Contact: mesa-dev@lists.freedesktop.org

Steps:
1. Download chrome and install it on your Ubuntu,
https://www.google.com/chrome/?platform=linux&extra=devchannel;
2. Open
https://www.khronos.org/registry/webgl/sdk/tests/conformance2/transform_feedback/transform_feedback.html?webglVersion=2&quiet=0
3. The first time the link is opened, all cases pass. Click Chrome's refresh
button, and some cases fail.

Notes:
1. I could only reproduce it on Mesa git master; I could not reproduce it with
the system driver (Ubuntu 17.10). Might our latest code have introduced a
regression?
2. The case may be related to
https://bugs.freedesktop.org/show_bug.cgi?id=106810
3. Chrome caches the program binary. The second time the page runs, Chrome calls
glProgramBinary to avoid recompiling the shaders and relinking the program.
4. The failing cases use glGetProgramiv to query the number of transform
feedback varyings, and getTransformFeedbackVarying to query each varying's
size, type and name. The current program appears to have no transform feedback
varyings information.
5. I checked the Mesa code. glProgramBinary triggers the read_xfb function to
deserialize the binary, which creates a gl_transform_feedback_info object. That
object has a member named "NumVarying"; if the program binary has two
transform feedback varyings, its value is 2. A later glGetProgramiv call for the
number of transform feedback varyings, in shaderapi.c, reads the value from
"NumVarying", which is a member of TransformFeedback in gl_shader_program. I
found that the glProgramBinary implementation in Mesa does not copy the
transform feedback varyings count from the gl_transform_feedback_info object to
the TransformFeedback object.


[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

xinghua  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jljus...@gmail.com,
                   |                            |lem...@gmail.com,
                   |                            |yang...@intel.com,
                   |                            |yunchao...@intel.com


Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes

2018-06-13 Thread Eric Engestrom
On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote:
> From: Benjamin Gordon 
> 
> When building the Chrome OS Android container, we need to build copies
> of mesa that don't conflict with the Android system-supplied libraries.
> This adds options to create suffixed versions of EGL and GLES libraries:
> 
> libEGL.so -> libEGL${egl-lib-suffix}.so
> libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
> libGLESv2.so -> libGLES${gles-lib-suffix}.so
> 
> This is similar to what happens when --enable-libglvnd is specified, but
> without the side effects of linking against libglvnd.

This seems reasonable, and the meson side of this patch is correct,
but we need to document or prevent the interaction between
--enable-libglvnd and --with-egl-lib-suffix.

I can't think of a use-case for having both, so I suggest "if both are
enabled, error out"; scroll down for what this could look like in meson.

With that (and the corresponding autotools hunk):
Reviewed-by: Eric Engestrom 

> 
> Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b

(Note to whoever merges this patch: drop this line ^)

> Signed-off-by: Benjamin Gordon 
> ---
>  configure.ac| 14 ++
>  meson_options.txt   | 12 
>  src/egl/Makefile.am |  8 
>  src/egl/meson.build |  2 +-
>  src/mapi/Makefile.am| 28 ++--
>  src/mapi/es1api/meson.build |  2 +-
>  src/mapi/es2api/meson.build |  2 +-
>  7 files changed, 47 insertions(+), 21 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 35ade986d1..6070a2146b 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name],
>  [specify GL library name @<:@default=GL@:>@])],
>[GL_LIB=$withval],
>[GL_LIB="$DEFAULT_GL_LIB_NAME"])
> +AC_ARG_WITH([egl-lib-suffix],
> +  [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@],
> +[specify EGL library suffix @<:@default=none@:>@])],
> +  [EGL_LIB_SUFFIX=$withval],
> +  [EGL_LIB_SUFFIX=""])
> +AC_ARG_WITH([gles-lib-suffix],
> +  [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@],
> +[specify GLES library suffix @<:@default=none@:>@])],
> +  [GLES_LIB_SUFFIX=$withval],
> +  [GLES_LIB_SUFFIX=""])
>  AC_ARG_WITH([osmesa-lib-name],
>[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
>  [specify OSMesa library name @<:@default=OSMesa@:>@])],
>[OSMESA_LIB=$withval],
>[OSMESA_LIB=OSMesa])
>  AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
> +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""])
> +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""])
>  AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
>  
>  dnl
> @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then
>OSMESA_LIB="Mangled${OSMESA_LIB}"
>  fi
>  AC_SUBST([GL_LIB])
> +AC_SUBST([EGL_LIB_SUFFIX])
> +AC_SUBST([GLES_LIB_SUFFIX])
>  AC_SUBST([OSMESA_LIB])
>  
>  # Check for libdrm
> diff --git a/meson_options.txt b/meson_options.txt
> index ce7d87f1eb..9d84c3b5bb 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -298,3 +298,15 @@ option(
>choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
>description : 'List of tools to build.',
>  )
> +option(
> +  'egl-lib-suffix',
> +  type : 'string',
> +  value : '',
> +  description : 'Suffix to append to EGL library name.  Default: none.'
> +)
> +option(
> +  'gles-lib-suffix',
> +  type : 'string',
> +  value : '',
> +  description : 'Suffix to append to GLES library names.  Default: none.'
> +)
> diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
> index 086a4a1e63..c3aeeea007 100644
> --- a/src/egl/Makefile.am
> +++ b/src/egl/Makefile.am
> @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \
>  
>  else # USE_LIBGLVND
>  
> -lib_LTLIBRARIES = libEGL.la
> -libEGL_la_SOURCES =
> -libEGL_la_LIBADD = \
> +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la
> +libEGL@EGL_LIB_SUFFIX@_la_SOURCES =
> +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \
>   libEGL_common.la \
>   $(top_builddir)/src/mapi/shared-glapi/libglapi.la
> -libEGL_la_LDFLAGS = \
> +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \
>   -no-undefined \
>   -version-number 1:0 \
>   $(BSYMBOLIC) \
> diff --git a/src/egl/meson.build b/src/egl/meson.build
> index 6537e4bdee..b833fd1729 100644
> --- a/src/egl/meson.build
> +++ b/src/egl/meson.build
> @@ -148,7 +148,7 @@ if cc.has_function('mincore')
>  endif
>  

  if with_glvnd and get_option('egl-lib-suffix') != ''
error('''EGL lib suffix can't be used with libglvnd''')
  endif

>  if not with_glvnd
> -  egl_lib_name = 'EGL'
> +  egl_lib_name = 'EGL' + get_option('egl-lib-suffix')
>egl_lib_version = '1.0.0'
>  else
>egl_lib_name = 'EGL_mesa'
> diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
> index 3da1a193d2..a2b108adc9 100644
> --- a/src/mapi/Makefile.am
> +++ b/src/mapi/Makefile.am
> @@ -178,24 +178,24 @@ GLES_include_HEADERS = \
>   $(top_srcdir)/inc

[Mesa-dev] [PATCH v2] radv: update the ZRANGE_PRECISION value for the TC-compat bug

2018-06-13 Thread Samuel Pitoiset
On GFX8+, there is a bug that affects TC-compatible depth surfaces
when the ZRange is not reset after LateZ kills pixels.

The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
the last fast clear value. Because the value is set to 1 by default,
we only need to update it when clearing Z to 0.0.

Original patch from James Legg.

v2: - only update ZRANGE_PRECISION for depth aspects
- adjust base address in presence of stencil

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
CC: 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 94 
 1 file changed, 94 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 043b4a2f44a..b8724d6b937 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
}
 }
 
+static void
+radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
+struct radv_ds_buffer_info *ds,
+struct radv_image *image, VkImageLayout layout,
+bool requires_cond_write)
+{
+   uint32_t db_z_info = ds->db_z_info;
+   uint32_t db_z_info_reg;
+
+   if (!radv_image_is_tc_compat_htile(image))
+   return;
+
+   if (!radv_layout_has_htile(image, layout,
+  radv_image_queue_family_mask(image,
+   cmd_buffer->queue_family_index,
+   cmd_buffer->queue_family_index))) {
+   db_z_info &= C_028040_TILE_SURFACE_ENABLE;
+   }
+
+   db_z_info &= C_028040_ZRANGE_PRECISION;
+
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
+   db_z_info_reg = R_028038_DB_Z_INFO;
+   } else {
+   db_z_info_reg = R_028040_DB_Z_INFO;
+   }
+
+   /* When we don't know the last fast clear value we need to emit a
+* conditional packet, otherwise we can update DB_Z_INFO directly.
+*/
+   if (requires_cond_write) {
+   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
+
+   const uint32_t write_space = 0 << 8; /* register */
+   const uint32_t poll_space = 1 << 4; /* memory */
+   const uint32_t function = 3 << 0;   /* equal to the reference */
+   const uint32_t options = write_space | poll_space | function;
+   radeon_emit(cmd_buffer->cs, options);
+
+   /* poll address - location of the depth clear value */
+   uint64_t va = radv_buffer_get_va(image->bo);
+   va += image->offset + image->clear_value_offset;
+
+   /* In presence of stencil format, we have to adjust the base
+* address because the first value is the stencil clear value.
+*/
+   if (vk_format_is_stencil(image->vk_format))
+   va += 4;
+
+   radeon_emit(cmd_buffer->cs, va);
+   radeon_emit(cmd_buffer->cs, va >> 32);
+
+   radeon_emit(cmd_buffer->cs, fui(0.0f));  /* reference value */
+   radeon_emit(cmd_buffer->cs, (uint32_t)-1);   /* comparison mask */
+   radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write address low */
+   radeon_emit(cmd_buffer->cs, 0u); /* write address high */
+   radeon_emit(cmd_buffer->cs, db_z_info);
+   } else {
+   radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, db_z_info);
+   }
+}
+
 static void
 radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
  struct radv_ds_buffer_info *ds,
@@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
 
}
 
+   /* Update the ZRANGE_PRECISION value for the TC-compat bug. */
+   radv_update_zrange_precision(cmd_buffer, ds, image, layout, true);
+
 radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
   ds->pa_su_poly_offset_db_fmt_cntl);
 }
@@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
 radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* R_028028_DB_STENCIL_CLEAR */
 if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
 radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* R_02802C_DB_DEPTH_CLEAR */
+
+   /* Update the ZRANGE_PRECISION value for the TC-compat bug. This is
+* only needed when clearing Z to 0.0.
+*/
+   if ((aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
+   ds_clear_value.depth == 0.0) {
+   struct radv_framebuffer *framebuffer = 
cmd_buffer->state.framebuffer;
+   const struct radv_subpas

[Mesa-dev] [Bug 105396] tc compatible htile sets depth of htiles of discarded fragments to 1.0

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105396

--- Comment #9 from Samuel Pitoiset  ---
Can you confirm this patch fixes the issue ?
https://patchwork.freedesktop.org/patch/229236/


Re: [Mesa-dev] [PATCH v2] radv: update the ZRANGE_PRECISION value for the TC-compat bug

2018-06-13 Thread Bas Nieuwenhuizen
Thanks for figuring out the remaining issues,

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Jun 13, 2018 at 12:04 PM, Samuel Pitoiset
 wrote:
> On GFX8+, there is a bug that affects TC-compatible depth surfaces
> when the ZRange is not reset after LateZ kills pixels.
>
> The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
> the last fast clear value. Because the value is set to 1 by default,
> we only need to update it when clearing Z to 0.0.
>
> Original patch from James Legg.
>
> v2: - only update ZRANGE_PRECISION for depth aspects
> - adjust base address in presence of stencil
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
> CC: 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 94 
>  1 file changed, 94 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 043b4a2f44a..b8724d6b937 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer 
> *cmd_buffer,
> }
>  }
>
> +static void
> +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
> +struct radv_ds_buffer_info *ds,
> +struct radv_image *image, VkImageLayout layout,
> +bool requires_cond_write)
> +{
> +   uint32_t db_z_info = ds->db_z_info;
> +   uint32_t db_z_info_reg;
> +
> +   if (!radv_image_is_tc_compat_htile(image))
> +   return;
> +
> +   if (!radv_layout_has_htile(image, layout,
> +  radv_image_queue_family_mask(image,
> +   
> cmd_buffer->queue_family_index,
> +   
> cmd_buffer->queue_family_index))) {
> +   db_z_info &= C_028040_TILE_SURFACE_ENABLE;
> +   }
> +
> +   db_z_info &= C_028040_ZRANGE_PRECISION;
> +
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) 
> {
> +   db_z_info_reg = R_028038_DB_Z_INFO;
> +   } else {
> +   db_z_info_reg = R_028040_DB_Z_INFO;
> +   }
> +
> +   /* When we don't know the last fast clear value we need to emit a
> +* conditional packet, otherwise we can update DB_Z_INFO directly.
> +*/
> +   if (requires_cond_write) {
> +   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
> +
> +   const uint32_t write_space = 0 << 8;/* register */
> +   const uint32_t poll_space = 1 << 4; /* memory */
> +   const uint32_t function = 3 << 0;   /* equal to the 
> reference */
> +   const uint32_t options = write_space | poll_space | function;
> +   radeon_emit(cmd_buffer->cs, options);
> +
> +   /* poll address - location of the depth clear value */
> +   uint64_t va = radv_buffer_get_va(image->bo);
> +   va += image->offset + image->clear_value_offset;
> +
> +   /* In presence of stencil format, we have to adjust the base
> +* address because the first value is the stencil clear value.
> +*/
> +   if (vk_format_is_stencil(image->vk_format))
> +   va += 4;
> +
> +   radeon_emit(cmd_buffer->cs, va);
> +   radeon_emit(cmd_buffer->cs, va >> 32);
> +
> +   radeon_emit(cmd_buffer->cs, fui(0.0f));  /* reference 
> value */
> +   radeon_emit(cmd_buffer->cs, (uint32_t)-1);   /* 
> comparison mask */
> +   radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write 
> address low */
> +   radeon_emit(cmd_buffer->cs, 0u); /* write 
> address high */
> +   radeon_emit(cmd_buffer->cs, db_z_info);
> +   } else {
> +   radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, 
> db_z_info);
> +   }
> +}
> +
>  static void
>  radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
>   struct radv_ds_buffer_info *ds,
> @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer 
> *cmd_buffer,
>
> }
>
> +   /* Update the ZRANGE_PRECISION value for the TC-compat bug. */
> +   radv_update_zrange_precision(cmd_buffer, ds, image, layout, true);
> +
> radeon_set_context_reg(cmd_buffer->cs, 
> R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
>ds->pa_su_poly_offset_db_fmt_cntl);
>  }
> @@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer 
> *cmd_buffer,
> radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* 
> R_028028_DB_STENCIL_CLEAR */
> if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
> radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* 
> R_02802C_DB_DE

[Mesa-dev] [Bug 106910] Primus Segfaults after updating Mesa to 18.1.1

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106910

            Bug ID: 106910
           Summary: Primus Segfaults after updating Mesa to 18.1.1
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: sali...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 140148
  --> https://bugs.freedesktop.org/attachment.cgi?id=140148&action=edit
Journalctl traces

After upgrading Mesa to version 18.1.1, Primus (a bridge for Bumblebee, the NVIDIA
Optimus implementation) segfaults when running applications that require
GLX. Running the same applications with Intel or VirtualGL (another bridge for
Bumblebee that runs slower than Primus) works fine. After downgrading to Mesa
18.0.4 everything works again. The applications' logs don't say much.
The issue can be reproduced by:
1) Installing Bumblebee 3.2.1, Mesa 18.1.1 and primus 20151110
2) Running any GLX application with primusrun


[Mesa-dev] [Bug 106910] Primus Segfaults after updating Mesa to 18.1.1

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106910

--- Comment #1 from Alexander  ---
The same issue remains in git version as well.


[Mesa-dev] [Bug 105396] tc compatible htile sets depth of htiles of discarded fragments to 1.0

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105396

--- Comment #10 from James Legg  ---
Yes, that patch fixes it. Thanks.


Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check

2018-06-13 Thread Eric Engestrom
On Tuesday, 2018-06-12 17:50:20 -0700, Matt Turner wrote:
> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
> rework -latomic check) further extended the fixes in configure.ac but
> not in Meson. This commit extends those fixes to the Meson tests.
> 
> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)

Reviewed-by: Eric Engestrom 

> ---
>  meson.build | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/meson.build b/meson.build
> index 7dba52369b0..62200476216 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -836,7 +836,13 @@ endif
>  # Check for GCC style atomics
>  dep_atomic = null_dep
>  
> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
> +if cc.compiles('''#include <stdint.h>
> +  int main() {
> +struct {
> +  uint64_t *v;
> +} x;
> +return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
> +  }''',
> name : 'GCC atomic builtins')
>pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>  
> -- 
> 2.16.1


Re: [Mesa-dev] Mesa GitLab access approval process

2018-06-13 Thread Rob Clark
On Wed, Jun 13, 2018 at 12:43 AM, Jason Ekstrand  wrote:
> Since we've been on GitLab (it's been less than a week), we've already
> gotten a couple of developer access requests through GitLab.  As it stands,
> these just show up as an e-mail to the group owners with zero explanation or
> opportunity for the requester to provide justification for the request.
> This is clearly worse than the bugzilla system we had before.
>
> I don't think we want to change the general guidelines for getting commit
> access of ~2 dozen patches, good standing, and an understanding of the Mesa
> code review process.  However, we do need to do something else for
> requesting access so we have some real dialogue and provide opportunities
> for people from the same area of Mesa that the new developer wants to work
> in to vouch for them.
>
> My recommendation (if no one minds) would be to create a mesa "accounts"
> project that doesn't have a git repo or anything else and just provides an
> issue tracker.  People could then use that much in the same way as they've
> used Bugzilla in the past to request accounts.  If people would rather stick
> to bugzilla, that's fine with me.  I just thought this would be a relatively
> painless way to try out the issue tracker.
>

I guess in the long run, if we switch over to gitlab issue tracker for
"real" bugs, I was kinda expecting account requests would just be a
special component in the mesa project's issue tracker, instead of a
special project.

Either way, I agree w/ keeping the process of filing an issue/bz,
and keeping the same general guidelines (wherever the issues/bzs live,
ie same project, different project, or bugzilla).

BR,
-R


Re: [Mesa-dev] [PATCH] virgl: add ARB_tessellation_shader support. (v2)

2018-06-13 Thread Elie Tournier
On Wed, Jun 13, 2018 at 11:03:55AM +1000, Dave Airlie wrote:
> From: Dave Airlie 
> 
> This should add all the pieces to enable tess shaders on virgl.
> 
> v2: fixup transform to handle tess and strip out precise.
> set default for max patch varyings to work around issue when
> tess gets enabled from v1 caps but v2 caps aren't in place. (Elie)

Reviewed-by: Elie Tournier 
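
As a side note on the bind and delete hooks in the patch below: they appear to recover a 32-bit shader handle from the opaque void * state pointer with a cast through unsigned long. A minimal sketch of that round trip, using uintptr_t instead (a substitution made here because unsigned long is narrower than a pointer on some ABIs); both helper names are hypothetical and not part of the driver.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; the driver casts inline. */

/* Store a 32-bit object handle in an opaque state pointer. */
void *handle_to_ptr(uint32_t handle)
{
    return (void *)(uintptr_t)handle;
}

/* Recover the 32-bit object handle from the opaque state pointer. */
uint32_t ptr_to_handle(void *ptr)
{
    return (uint32_t)(uintptr_t)ptr;
}
```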
> ---
>  src/gallium/auxiliary/tgsi/tgsi_transform.c |  4 --
>  src/gallium/drivers/virgl/virgl_context.c   | 69 
> +
>  src/gallium/drivers/virgl/virgl_encode.c| 21 -
>  src/gallium/drivers/virgl/virgl_encode.h|  4 ++
>  src/gallium/drivers/virgl/virgl_protocol.h  |  5 +++
>  src/gallium/drivers/virgl/virgl_screen.c| 10 -
>  src/gallium/drivers/virgl/virgl_winsys.h|  2 +-
>  7 files changed, 107 insertions(+), 8 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.c b/src/gallium/auxiliary/tgsi/tgsi_transform.c
> index cd076c9e79e..4b2b10f50ad 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_transform.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_transform.c
> @@ -140,10 +140,6 @@ tgsi_transform_shader(const struct tgsi_token *tokens_in,
>return -1;
> }
> procType = parse.FullHeader.Processor.Processor;
> -   assert(procType == PIPE_SHADER_FRAGMENT ||
> -  procType == PIPE_SHADER_VERTEX ||
> -  procType == PIPE_SHADER_GEOMETRY);
> -
>  
> /**
>  **  Setup output shader
> diff --git a/src/gallium/drivers/virgl/virgl_context.c b/src/gallium/drivers/virgl/virgl_context.c
> index 8d701bb8f40..e6f8dc85256 100644
> --- a/src/gallium/drivers/virgl/virgl_context.c
> +++ b/src/gallium/drivers/virgl/virgl_context.c
> @@ -492,6 +492,18 @@ static void *virgl_create_vs_state(struct pipe_context *ctx,
> return virgl_shader_encoder(ctx, shader, PIPE_SHADER_VERTEX);
>  }
>  
> +static void *virgl_create_tcs_state(struct pipe_context *ctx,
> +   const struct pipe_shader_state *shader)
> +{
> +   return virgl_shader_encoder(ctx, shader, PIPE_SHADER_TESS_CTRL);
> +}
> +
> +static void *virgl_create_tes_state(struct pipe_context *ctx,
> +   const struct pipe_shader_state *shader)
> +{
> +   return virgl_shader_encoder(ctx, shader, PIPE_SHADER_TESS_EVAL);
> +}
> +
>  static void *virgl_create_gs_state(struct pipe_context *ctx,
> const struct pipe_shader_state *shader)
>  {
> @@ -534,6 +546,26 @@ virgl_delete_vs_state(struct pipe_context *ctx,
> virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER);
>  }
>  
> +static void
> +virgl_delete_tcs_state(struct pipe_context *ctx,
> +   void *tcs)
> +{
> +   uint32_t handle = (unsigned long)tcs;
> +   struct virgl_context *vctx = virgl_context(ctx);
> +
> +   virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER);
> +}
> +
> +static void
> +virgl_delete_tes_state(struct pipe_context *ctx,
> +  void *tes)
> +{
> +   uint32_t handle = (unsigned long)tes;
> +   struct virgl_context *vctx = virgl_context(ctx);
> +
> +   virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER);
> +}
> +
>  static void virgl_bind_vs_state(struct pipe_context *ctx,
>  void *vss)
>  {
> @@ -543,6 +575,24 @@ static void virgl_bind_vs_state(struct pipe_context *ctx,
> virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_VERTEX);
>  }
>  
> +static void virgl_bind_tcs_state(struct pipe_context *ctx,
> +   void *vss)
> +{
> +   uint32_t handle = (unsigned long)vss;
> +   struct virgl_context *vctx = virgl_context(ctx);
> +
> +   virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_TESS_CTRL);
> +}
> +
> +static void virgl_bind_tes_state(struct pipe_context *ctx,
> +   void *vss)
> +{
> +   uint32_t handle = (unsigned long)vss;
> +   struct virgl_context *vctx = virgl_context(ctx);
> +
> +   virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_TESS_EVAL);
> +}
> +
>  static void virgl_bind_gs_state(struct pipe_context *ctx,
> void *vss)
>  {
> @@ -801,6 +851,18 @@ static void virgl_set_clip_state(struct pipe_context *ctx,
> virgl_encoder_set_clip_state(vctx, clip);
>  }
>  
> +static void virgl_set_tess_state(struct pipe_context *ctx,
> + const float default_outer_level[4],
> + const float default_inner_level[2])
> +{
> +   struct virgl_context *vctx = virgl_context(ctx);
> +   struct virgl_screen *rs = virgl_screen(ctx->screen);
> +
> +   if (!rs->caps.caps.v1.bset.has_tessellation_shaders)
> +  return;
> +   virgl_encode_set_tess_state(vctx, default_outer_level, 
> default_inner_level);
> +}
> +
>  static void virgl_resource_copy_region(struct pipe_context *ctx,
>struct pipe_resource *dst,
>   

[Mesa-dev] [PATCH v3] radv: update the ZRANGE_PRECISION value for the TC-compat bug

2018-06-13 Thread Samuel Pitoiset
On GFX8+, there is a bug that affects TC-compatible depth surfaces
when the ZRange is not reset after LateZ kills pixels.

The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
the last fast clear value. Because the value is set to 1 by default,
we only need to update it when clearing Z to 0.0.

We also need to set the depth clear regs and to update
ZRANGE_PRECISION when initializing a TC-compat depth image to 0.

Original patch from James Legg.

v3: - check that subpass isn't NULL (needed for the next patch)
- set depth clear regs when initializing HTILE
v2: - only update ZRANGE_PRECISION for depth aspects
- adjust base address in presence of stencil

This fixes random CTS fails with
dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*
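
A minimal sketch of the rule described above, under stated assumptions: ZRANGE_PRECISION is taken to be a single bit at position 31 of DB_Z_INFO, and the macro and function names are local to this sketch rather than taken from the driver. The field is cleared only when the last fast clear value is bit-for-bit 0.0 and left at its default of 1 otherwise.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: ZRANGE_PRECISION is bit 31 of DB_Z_INFO. */
#define SKETCH_ZRANGE_PRECISION_MASK (1u << 31)

/* Reinterpret a float as its raw 32-bit pattern. */
uint32_t sketch_float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Return db_z_info with ZRANGE_PRECISION matching the last fast depth
 * clear value: 0 when the clear value is exactly 0.0, 1 otherwise. */
uint32_t sketch_zrange_precision_for_clear(uint32_t db_z_info, float clear_depth)
{
    db_z_info &= ~SKETCH_ZRANGE_PRECISION_MASK;
    if (sketch_float_bits(clear_depth) != sketch_float_bits(0.0f))
        db_z_info |= SKETCH_ZRANGE_PRECISION_MASK;
    return db_z_info;
}
```

The comparison is done on the raw bits because the patch compares the stored clear value against fui(0.0f) rather than using a floating-point comparison.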

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
CC: 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 108 +++
 1 file changed, 108 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 043b4a2f44a..53fb4988a8c 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
}
 }
 
+static void
+radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
+struct radv_ds_buffer_info *ds,
+struct radv_image *image, VkImageLayout layout,
+bool requires_cond_write)
+{
+   uint32_t db_z_info = ds->db_z_info;
+   uint32_t db_z_info_reg;
+
+   if (!radv_image_is_tc_compat_htile(image))
+   return;
+
+   if (!radv_layout_has_htile(image, layout,
+  radv_image_queue_family_mask(image,
+   
cmd_buffer->queue_family_index,
+   
cmd_buffer->queue_family_index))) {
+   db_z_info &= C_028040_TILE_SURFACE_ENABLE;
+   }
+
+   db_z_info &= C_028040_ZRANGE_PRECISION;
+
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) {
+   db_z_info_reg = R_028038_DB_Z_INFO;
+   } else {
+   db_z_info_reg = R_028040_DB_Z_INFO;
+   }
+
+   /* When we don't know the last fast clear value we need to emit a
+* conditional packet, otherwise we can update DB_Z_INFO directly.
+*/
+   if (requires_cond_write) {
+   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
+
+   const uint32_t write_space = 0 << 8;/* register */
+   const uint32_t poll_space = 1 << 4; /* memory */
+   const uint32_t function = 3 << 0;   /* equal to the 
reference */
+   const uint32_t options = write_space | poll_space | function;
+   radeon_emit(cmd_buffer->cs, options);
+
+   /* poll address - location of the depth clear value */
+   uint64_t va = radv_buffer_get_va(image->bo);
+   va += image->offset + image->clear_value_offset;
+
+   /* In presence of stencil format, we have to adjust the base
+* address because the first value is the stencil clear value.
+*/
+   if (vk_format_is_stencil(image->vk_format))
+   va += 4;
+
+   radeon_emit(cmd_buffer->cs, va);
+   radeon_emit(cmd_buffer->cs, va >> 32);
+
+   radeon_emit(cmd_buffer->cs, fui(0.0f));  /* reference 
value */
+   radeon_emit(cmd_buffer->cs, (uint32_t)-1);   /* comparison 
mask */
+   radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write 
address low */
+   radeon_emit(cmd_buffer->cs, 0u); /* write 
address high */
+   radeon_emit(cmd_buffer->cs, db_z_info);
+   } else {
+   radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, 
db_z_info);
+   }
+}
+
 static void
 radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
  struct radv_ds_buffer_info *ds,
@@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
 
}
 
+   /* Update the ZRANGE_PRECISION value for the TC-compat bug. */
+   radv_update_zrange_precision(cmd_buffer, ds, image, layout, true);
+
radeon_set_context_reg(cmd_buffer->cs, 
R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
   ds->pa_su_poly_offset_db_fmt_cntl);
 }
@@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* 
R_028028_DB_STENCIL_CLEAR */
if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT)
radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* 
R_02802C_DB_DEPTH_CLEAR */
+
+   /* Update t

Re: [Mesa-dev] [PATCH v3] radv: update the ZRANGE_PRECISION value for the TC-compat bug

2018-06-13 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 
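
For readers following the conditional-write path in the quoted patch, here is a CPU-side model of what the packet is understood to do. This is only a sketch: the function names are hypothetical, and the "equal" semantics are inferred from the comments in the patch, not from hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a conditional write with the "equal to the reference"
 * function: the register is written only when the polled memory word,
 * after masking, equals the reference value. Returns true if written. */
bool sketch_cond_write_equal(const uint32_t *poll_addr, uint32_t reference,
                             uint32_t mask, uint32_t *reg, uint32_t value)
{
    assert(poll_addr != NULL && reg != NULL);
    if ((*poll_addr & mask) != reference)
        return false;
    *reg = value;
    return true;
}

/* Address of the depth clear value: per the patch, when the format has
 * a stencil aspect the stencil clear value comes first, so the depth
 * word sits 4 bytes further on. */
uint64_t sketch_depth_clear_va(uint64_t clear_value_va, bool has_stencil)
{
    return has_stencil ? clear_value_va + 4 : clear_value_va;
}
```

With a reference of 0 (the bits of 0.0f) and a full mask, the register update only takes effect when the image was last fast-cleared to 0.0.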

On Wed, Jun 13, 2018 at 2:27 PM, Samuel Pitoiset
 wrote:
> On GFX8+, there is a bug that affects TC-compatible depth surfaces
> when the ZRange is not reset after LateZ kills pixels.
>
> The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match
> the last fast clear value. Because the value is set to 1 by default,
> we only need to update it when clearing Z to 0.0.
>
> We also need to set the depth clear regs and to update
> ZRANGE_PRECISION when initializing a TC-compat depth image to 0.
>
> Original patch from James Legg.
>
> v3: - check that subpass isn't NULL (needed for the next patch)
> - set depth clear regs when initializing HTILE
> v2: - only update ZRANGE_PRECISION for depth aspects
> - adjust base address in presence of stencil
>
> This fixes random CTS fails with
> dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.*
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
> CC: 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 108 +++
>  1 file changed, 108 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index 043b4a2f44a..53fb4988a8c 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer,
> }
>  }
>
> +static void
> +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer,
> +struct radv_ds_buffer_info *ds,
> +struct radv_image *image, VkImageLayout layout,
> +bool requires_cond_write)
> +{
> +   uint32_t db_z_info = ds->db_z_info;
> +   uint32_t db_z_info_reg;
> +
> +   if (!radv_image_is_tc_compat_htile(image))
> +   return;
> +
> +   if (!radv_layout_has_htile(image, layout,
> +  radv_image_queue_family_mask(image,
> +   
> cmd_buffer->queue_family_index,
> +   
> cmd_buffer->queue_family_index))) {
> +   db_z_info &= C_028040_TILE_SURFACE_ENABLE;
> +   }
> +
> +   db_z_info &= C_028040_ZRANGE_PRECISION;
> +
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) 
> {
> +   db_z_info_reg = R_028038_DB_Z_INFO;
> +   } else {
> +   db_z_info_reg = R_028040_DB_Z_INFO;
> +   }
> +
> +   /* When we don't know the last fast clear value we need to emit a
> +* conditional packet, otherwise we can update DB_Z_INFO directly.
> +*/
> +   if (requires_cond_write) {
> +   radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
> +
> +   const uint32_t write_space = 0 << 8;/* register */
> +   const uint32_t poll_space = 1 << 4; /* memory */
> +   const uint32_t function = 3 << 0;   /* equal to the 
> reference */
> +   const uint32_t options = write_space | poll_space | function;
> +   radeon_emit(cmd_buffer->cs, options);
> +
> +   /* poll address - location of the depth clear value */
> +   uint64_t va = radv_buffer_get_va(image->bo);
> +   va += image->offset + image->clear_value_offset;
> +
> +   /* In presence of stencil format, we have to adjust the base
> +* address because the first value is the stencil clear value.
> +*/
> +   if (vk_format_is_stencil(image->vk_format))
> +   va += 4;
> +
> +   radeon_emit(cmd_buffer->cs, va);
> +   radeon_emit(cmd_buffer->cs, va >> 32);
> +
> +   radeon_emit(cmd_buffer->cs, fui(0.0f));  /* reference 
> value */
> +   radeon_emit(cmd_buffer->cs, (uint32_t)-1);   /* 
> comparison mask */
> +   radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write 
> address low */
> +   radeon_emit(cmd_buffer->cs, 0u); /* write 
> address high */
> +   radeon_emit(cmd_buffer->cs, db_z_info);
> +   } else {
> +   radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, 
> db_z_info);
> +   }
> +}
> +
>  static void
>  radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
>   struct radv_ds_buffer_info *ds,
> @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
>
> }
>
> +   /* Update the ZRANGE_PRECISION value for the TC-compat bug. */
> +   radv_update_zrange_precision(cmd_buffer, ds, image, layout, true);
> +
> radeon_set_context_reg(cmd_buffer->cs, 
> R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
>ds->pa_su_poly_offset_db_fmt_cntl);
>  }
> @@ -1143,6 +1208,35 @@ radv_set_d

Re: [Mesa-dev] [PATCH 1/2] ac/gpu_info: report real total memory sizes

2018-06-13 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

for both. Thanks!
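
To illustrate the MIN2 to MAX2 change noted in the quoted commit message, a small sketch with a locally defined struct and macro (both are stand-ins invented here, not the libdrm or Mesa definitions). One plausible reading, which is an assumption on my part, is that an allocation only has to fit in one of the two heaps, so the larger per-heap limit is the meaningful bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Stand-in for the per-heap data the kernel reports. */
struct sketch_heap {
    uint64_t total_heap_size;
    uint64_t max_allocation;
};

/* Largest single allocation across VRAM and GTT, as in the new path. */
uint64_t sketch_max_alloc_size(const struct sketch_heap *vram,
                               const struct sketch_heap *gtt)
{
    assert(vram != NULL && gtt != NULL);
    return SKETCH_MAX2(vram->max_allocation, gtt->max_allocation);
}
```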

On Wed, Jun 13, 2018 at 3:15 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> The change from MIN2 to MAX2 is intentional.
> ---
>  src/amd/common/ac_gpu_info.c | 82 
>  1 file changed, 54 insertions(+), 28 deletions(-)
>
> diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
> index 6bee96b9eee..3b6600dcbc6 100644
> --- a/src/amd/common/ac_gpu_info.c
> +++ b/src/amd/common/ac_gpu_info.c
> @@ -91,21 +91,20 @@ static bool has_syncobj(int fd)
> return false;
> return value ? true : false;
>  }
>
>  bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
>struct radeon_info *info,
>struct amdgpu_gpu_info *amdinfo)
>  {
> struct drm_amdgpu_info_device device_info = {};
> struct amdgpu_buffer_size_alignments alignment_info = {};
> -   struct amdgpu_heap_info vram, vram_vis, gtt;
> struct drm_amdgpu_info_hw_ip dma = {}, compute = {}, uvd = {};
> struct drm_amdgpu_info_hw_ip uvd_enc = {}, vce = {}, vcn_dec = {};
> struct drm_amdgpu_info_hw_ip vcn_enc = {}, gfx = {};
> struct amdgpu_gds_resource_info gds = {};
> uint32_t vce_version = 0, vce_feature = 0, uvd_version = 0, 
> uvd_feature = 0;
> int r, i, j;
> drmDevicePtr devinfo;
>
> /* Get PCI info. */
> r = drmGetDevice2(fd, 0, &devinfo);
> @@ -132,40 +131,20 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
> fprintf(stderr, "amdgpu: amdgpu_query_info(dev_info) 
> failed.\n");
> return false;
> }
>
> r = amdgpu_query_buffer_size_alignment(dev, &alignment_info);
> if (r) {
> fprintf(stderr, "amdgpu: amdgpu_query_buffer_size_alignment 
> failed.\n");
> return false;
> }
>
> -   r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM, 0, &vram);
> -   if (r) {
> -   fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram) 
> failed.\n");
> -   return false;
> -   }
> -
> -   r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM,
> -   AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED,
> -   &vram_vis);
> -   if (r) {
> -   fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram_vis) 
> failed.\n");
> -   return false;
> -   }
> -
> -   r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_GTT, 0, &gtt);
> -   if (r) {
> -   fprintf(stderr, "amdgpu: amdgpu_query_heap_info(gtt) 
> failed.\n");
> -   return false;
> -   }
> -
> r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_DMA, 0, &dma);
> if (r) {
> fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(dma) 
> failed.\n");
> return false;
> }
>
> r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_GFX, 0, &gfx);
> if (r) {
> fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(gfx) 
> failed.\n");
> return false;
> @@ -256,20 +235,74 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev,
> fprintf(stderr, "amdgpu: amdgpu_query_sw_info(address32_hi) 
> failed.\n");
> return false;
> }
>
> r = amdgpu_query_gds_info(dev, &gds);
> if (r) {
> fprintf(stderr, "amdgpu: amdgpu_query_gds_info failed.\n");
> return false;
> }
>
> +   if (info->drm_minor >= 9) {
> +   struct drm_amdgpu_memory_info meminfo;
> +
> +   r = amdgpu_query_info(dev, AMDGPU_INFO_MEMORY, 
> sizeof(meminfo), &meminfo);
> +   if (r) {
> +   fprintf(stderr, "amdgpu: amdgpu_query_info(memory) 
> failed.\n");
> +   return false;
> +   }
> +
> +   /* Note: usable_heap_size values can be random and can't be 
> relied on. */
> +   info->gart_size = meminfo.gtt.total_heap_size;
> +   info->vram_size = meminfo.vram.total_heap_size;
> +   info->vram_vis_size = 
> meminfo.cpu_accessible_vram.total_heap_size;
> +
> +   info->max_alloc_size = MAX2(meminfo.vram.max_allocation,
> +   meminfo.gtt.max_allocation);
> +   } else {
> +   /* This is a deprecated interface, which reports usable sizes
> +* (total minus pinned), but the pinned size computation is
> +* buggy, so the values returned from these functions can be
> +* random.
> +*/
> +   struct amdgpu_heap_info vram, vram_vis, gtt;
> +
> +   r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM, 0, 
> &vram);
> +   if (r) {
> +   fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram) 
> failed.\n");

[Mesa-dev] [Bug 106897] Ubuntu 16.04. Mesa can't be built with specified configurations

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106897

--- Comment #7 from Timo Aaltonen  ---
such is life, 16.04 won't get a newer wayland, but 18.04 will.. eventually

for now, you can use a ppa for a backport with the necessary packaging changes:

https://launchpad.net/~ubuntu-x-swat/+archive/ubuntu/updates



[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106912

Bug ID: 106912
   Summary: radv: 16-bit depth buffer causes artifacts in Shadow
Warrior 2
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: philip.rebo...@tu-dortmund.de
QA Contact: mesa-dev@lists.freedesktop.org

Hello,

Shadow Warrior 2 uses D16_UNORM as a shadow map format, and clearing the depth
buffer in one render pass instance and rendering to it in another results in
parts of the depth buffer getting set to 1.0. The game renders correctly with
RADV_DEBUG=nohiz.

I wasn't able to reproduce this issue outside of DXVK so far, so here's a
Renderdoc capture of the issue (captured on Polaris 10):

  https://mega.nz/#!gfoWFDSC!rb9qsW9H6dGq_gsNvpdhPW82mSkZEy94PX-4Ey6BSTs

The render pass in question starts at EID 19473, which should be bookmarked.
For the capture I used Mesa 18.1.1, but the issue is still present in latest
-git.

Regards
- Philip



Re: [Mesa-dev] [PATCH v3 3/3] egl/android: Add DRM node probing and filtering

2018-06-13 Thread Rob Herring
+Amit and John

On Sat, Jun 9, 2018 at 11:27 AM, Robert Foss  wrote:
> This patch both adds support for probing & filtering DRM nodes
> and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
> gralloc call.
>
> Currently the filtering is based just on the driver name,
> and the desired name is supplied using the "drm.gpu.vendor_name"
> Android property.

There's a potential issue with this whole approach and that is
SELinux. With the way SELinux locks down accesses, getting probing
thru device files to work can be a pain. It may be better now than the
prior version because sysfs is not probed. I'll leave it to Amit or
John to comment.

Rob
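
The probing scheme described in the quoted commit message (walk the render nodes, then filter on the driver name) can be sketched roughly as below. This is not the code from the patch: the two helper functions and the SKETCH_PROPERTY_VALUE_MAX value are assumptions made for illustration, and only the enum values are copied from the patch. It relies on DRM render nodes being numbered from minor 128.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for Android's PROPERTY_VALUE_MAX. */
#define SKETCH_PROPERTY_VALUE_MAX 92

typedef enum {
    probe_error = -1,
    probe_success = 0,
    probe_filtered_out = 1,
    probe_no_driver = 2
} probe_ret_t;

/* Build the device path of the index-th render node. */
int sketch_render_node_path(char *buf, size_t len, int index)
{
    assert(buf != NULL && index >= 0);
    return snprintf(buf, len, "/dev/dri/renderD%d", 128 + index);
}

/* Accept a device when no vendor filter is set or the names match. */
probe_ret_t sketch_filter_by_vendor(const char *driver_name, const char *vendor)
{
    if (driver_name == NULL)
        return probe_error;
    if (vendor != NULL && vendor[0] != '\0' &&
        strncmp(driver_name, vendor, SKETCH_PROPERTY_VALUE_MAX) != 0)
        return probe_filtered_out;
    return probe_success;
}
```

A caller would open each path in turn, query the kernel driver name for the descriptor, and stop at the first node for which the filter reports probe_success.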

>
> Signed-off-by: Robert Foss 
> ---
>
> Changes since v2:
>  - Switch from drmGetDevices2 to manual renderD node iteration
>  - Add probe_res enum to communicate probing results better
>  - Avoid using _eglError() in internal static functions
>  - Avoid actually loading the driver while probing, just verify
>that it exists.
>  - Replace strlen call with the assumed length PROPERTY_VALUE_MAX
>
> Changes since v1:
>  - Do not rely on libdrm for probing
>  - Distinguish between errors and when no drm devices are found
>
> Changes since RFC:
>  - Rebased on newer libdrm drmHandleMatch patch
>  - Added support for driver probing
>
>
>  src/egl/drivers/dri2/platform_android.c | 222 ++--
>  1 file changed, 169 insertions(+), 53 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_android.c b/src/egl/drivers/dri2/platform_android.c
> index 4ba96aad90..a2cbe92d93 100644
> --- a/src/egl/drivers/dri2/platform_android.c
> +++ b/src/egl/drivers/dri2/platform_android.c
> @@ -27,12 +27,16 @@
>   * DEALINGS IN THE SOFTWARE.
>   */
>
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>
>  #include "loader.h"
>  #include "egl_dri2.h"
> @@ -1130,31 +1134,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *dpy)
> return (config_count != 0);
>  }
>
> -enum {
> -/* perform(const struct gralloc_module_t *mod,
> - * int op,
> - * int *fd);
> - */
> -GRALLOC_MODULE_PERFORM_GET_DRM_FD = 0x4002,
> -};
> -
> -static int
> -droid_open_device(struct dri2_egl_display *dri2_dpy)
> -{
> -   int fd = -1, err = -EINVAL;
> -
> -   if (dri2_dpy->gralloc->perform)
> - err = dri2_dpy->gralloc->perform(dri2_dpy->gralloc,
> -  GRALLOC_MODULE_PERFORM_GET_DRM_FD,
> -  &fd);
> -   if (err || fd < 0) {
> -  _eglLog(_EGL_WARNING, "fail to get drm fd");
> -  fd = -1;
> -   }
> -
> -   return (fd >= 0) ? fcntl(fd, F_DUPFD_CLOEXEC, 3) : -1;
> -}
> -
>  static const struct dri2_egl_display_vtbl droid_display_vtbl = {
> .authenticate = NULL,
> .create_window_surface = droid_create_window_surface,
> @@ -1215,6 +1194,168 @@ static const __DRIextension *droid_image_loader_extensions[] = {
> NULL,
>  };
>
> +EGLBoolean
> +droid_load_driver(_EGLDisplay *disp)
> +{
> +   struct dri2_egl_display *dri2_dpy = disp->DriverData;
> +   const char *err;
> +
> +   dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
> +   if (dri2_dpy->driver_name == NULL)
> +  return false;
> +
> +   dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == 
> DRM_NODE_RENDER;
> +
> +   if (!dri2_dpy->is_render_node) {
> +   #ifdef HAVE_DRM_GRALLOC
> +   /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM 
> names
> +* for backwards compatibility with drm_gralloc. (Do not use on new
> +* systems.) */
> +   dri2_dpy->loader_extensions = droid_dri2_loader_extensions;
> +   if (!dri2_load_driver(disp)) {
> +  err = "DRI2: failed to load driver";
> +  goto error;
> +   }
> +   #else
> +   err = "DRI2: handle is not for a render node";
> +   goto error;
> +   #endif
> +   } else {
> +   dri2_dpy->loader_extensions = droid_image_loader_extensions;
> +   if (!dri2_load_driver_dri3(disp)) {
> +  err = "DRI3: failed to load driver";
> +  goto error;
> +   }
> +}
> +
> +   return true;
> +
> +error:
> +   free(dri2_dpy->driver_name);
> +   dri2_dpy->driver_name = NULL;
> +   return false;
> +}
> +
> +static bool
> +droid_probe_driver(int fd)
> +{
> +   char *driver_name;
> +
> +   driver_name = loader_get_driver_for_fd(fd);
> +   if (driver_name == NULL)
> +  return false;
> +
> +   free(driver_name);
> +   return true;
> +}
> +
> +typedef enum {
> +   probe_error = -1,
> +   probe_success = 0,
> +   probe_filtered_out = 1,
> +   probe_no_driver = 2
> +} probe_ret_t;
> +
> +static probe_ret_t
> +droid_probe_device(_EGLDisplay *disp, int fd, char *vendor)
> +{
> +   int ret;
> +
> +   drmVersionPtr ver = drmGetVersion(fd);
> +   if (!ver)
> +  return probe_error;
> +
> +   if (vendor != NULL && ver->name != NULL &&
> +

Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check

2018-06-13 Thread Dylan Baker
Quoting Matt Turner (2018-06-12 17:50:20)
> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
> rework -latomic check) further extended the fixes in configure.ac but
> not in Meson. This commit extends those fixes to the Meson tests.
> 
> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)
> ---
>  meson.build | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/meson.build b/meson.build
> index 7dba52369b0..62200476216 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -836,7 +836,13 @@ endif
>  # Check for GCC style atomics
>  dep_atomic = null_dep
>  
> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
> +if cc.compiles('''#include <stdint.h>
> +  int main() {
> +struct {
> +  uint64_t *v;
> +} x;
> +return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
> +  }''',
> name : 'GCC atomic builtins')
>pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>  
> -- 
> 2.16.1
> 

Should patches 2 and 3 be cc 18.1?

Reviewed-by: Dylan Baker 




Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes

2018-06-13 Thread Dylan Baker
Quoting Eric Engestrom (2018-06-13 03:03:25)
> On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote:
> > From: Benjamin Gordon 
> > 
> > When building the Chrome OS Android container, we need to build copies
> > of mesa that don't conflict with the Android system-supplied libraries.
> > This adds options to create suffixed versions of EGL and GLES libraries:
> > 
> > libEGL.so -> libEGL${egl-lib-suffix}.so
> > libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
> > > libGLESv2.so -> libGLESv2${gles-lib-suffix}.so
> > 
> > This is similar to what happens when --enable-libglvnd is specified, but
> > without the side effects of linking against libglvnd.
> 
> This seems reasonable, and the meson side of this patch is correct,
> but we need to document or prevent the interaction between
> --enable-libglvnd and --with-egl-lib-suffix.
> 
> I can't think of a use-case for having both, so I suggest "if both are
> enabled, error out"; scroll down for what this could look like in meson.

Agreed, making it a hard error to use both makes sense to me.

> With that (and the corresponding autotools hunk):
> Reviewed-by: Eric Engestrom 
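
To make the naming scheme and the proposed hard error concrete, a small C sketch follows. It only models the intended behaviour (base name plus suffix, rejected when libglvnd is enabled); the function name and return convention are invented here, and the real logic lives in the build system, not in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Compose "lib<base><suffix>.so". Returns 0 on success, -1 when a
 * non-empty suffix is combined with libglvnd or the buffer is too small. */
int sketch_suffixed_lib_name(char *buf, size_t len, const char *base,
                             const char *suffix, bool with_glvnd)
{
    assert(buf != NULL && base != NULL && suffix != NULL);
    if (with_glvnd && strlen(suffix) != 0)
        return -1;
    int n = snprintf(buf, len, "lib%s%s.so", base, suffix);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, base "EGL" with suffix "_mesa" yields libEGL_mesa.so, while the same suffix with libglvnd enabled is refused.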
> 
> > 
> > Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b
> 
> (Note to whoever merges this patch: drop this line ^)
> 
> > Signed-off-by: Benjamin Gordon 
> > ---
> >  configure.ac| 14 ++
> >  meson_options.txt   | 12 
> >  src/egl/Makefile.am |  8 
> >  src/egl/meson.build |  2 +-
> >  src/mapi/Makefile.am| 28 ++--
> >  src/mapi/es1api/meson.build |  2 +-
> >  src/mapi/es2api/meson.build |  2 +-
> >  7 files changed, 47 insertions(+), 21 deletions(-)
> > 
> > diff --git a/configure.ac b/configure.ac
> > index 35ade986d1..6070a2146b 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name],
> >  [specify GL library name @<:@default=GL@:>@])],
> >[GL_LIB=$withval],
> >[GL_LIB="$DEFAULT_GL_LIB_NAME"])
> > +AC_ARG_WITH([egl-lib-suffix],
> > +  [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@],
> > +[specify EGL library suffix @<:@default=none@:>@])],
> > +  [EGL_LIB_SUFFIX=$withval],
> > +  [EGL_LIB_SUFFIX=""])
> > +AC_ARG_WITH([gles-lib-suffix],
> > +  [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@],
> > +[specify GLES library suffix @<:@default=none@:>@])],
> > +  [GLES_LIB_SUFFIX=$withval],
> > +  [GLES_LIB_SUFFIX=""])
> >  AC_ARG_WITH([osmesa-lib-name],
> >[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
> >  [specify OSMesa library name @<:@default=OSMesa@:>@])],
> >[OSMESA_LIB=$withval],
> >[OSMESA_LIB=OSMesa])
> >  AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
> > +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""])
> > +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""])
> >  AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
> >  
> >  dnl
> > @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then
> >OSMESA_LIB="Mangled${OSMESA_LIB}"
> >  fi
> >  AC_SUBST([GL_LIB])
> > +AC_SUBST([EGL_LIB_SUFFIX])
> > +AC_SUBST([GLES_LIB_SUFFIX])
> >  AC_SUBST([OSMESA_LIB])
> >  
> >  # Check for libdrm
> > diff --git a/meson_options.txt b/meson_options.txt
> > index ce7d87f1eb..9d84c3b5bb 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -298,3 +298,15 @@ option(
> >choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
> >description : 'List of tools to build.',
> >  )
> > +option(
> > +  'egl-lib-suffix',
> > +  type : 'string',
> > +  value : '',
> > +  description : 'Suffix to append to EGL library name.  Default: none.'
> > +)
> > +option(
> > +  'gles-lib-suffix',
> > +  type : 'string',
> > +  value : '',
> > +  description : 'Suffix to append to GLES library names.  Default: none.'
> > +)
> > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
> > index 086a4a1e63..c3aeeea007 100644
> > --- a/src/egl/Makefile.am
> > +++ b/src/egl/Makefile.am
> > @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \
> >  
> >  else # USE_LIBGLVND
> >  
> > -lib_LTLIBRARIES = libEGL.la
> > -libEGL_la_SOURCES =
> > -libEGL_la_LIBADD = \
> > +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la
> > +libEGL@EGL_LIB_SUFFIX@_la_SOURCES =
> > +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \
> >   libEGL_common.la \
> >   $(top_builddir)/src/mapi/shared-glapi/libglapi.la
> > -libEGL_la_LDFLAGS = \
> > +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \
> >   -no-undefined \
> >   -version-number 1:0 \
> >   $(BSYMBOLIC) \
> > diff --git a/src/egl/meson.build b/src/egl/meson.build
> > index 6537e4bdee..b833fd1729 100644
> > --- a/src/egl/meson.build
> > +++ b/src/egl/meson.build
> > @@ -148,7 +148,7 @@ if cc.has_function('mincore')
> >  endif
> >  
> 
>   if with_glvnd and get_option('egl-lib-suffix') != ''
> error('''EGL lib suffix can't be used with libglvnd''')
>   endif
> 
> >  if not with_glvnd
> > -  egl_lib

Re: [Mesa-dev] [PATCH 04/13] i965/draw: Fix adding the stencil bo to the depth cache

2018-06-13 Thread Nanley Chery
On Wed, Jun 13, 2018 at 09:25:02AM +0300, Pohjolainen, Topi wrote:
> On Tue, Jun 12, 2018 at 12:21:56PM -0700, Nanley Chery wrote:
> > Fix the case where only stencil writes are enabled on a depth stencil
> 
> Isn't this an issue even when depth writes are enabled? Both would add the
> same bo to cache?
> 

You're right. The message should omit the word "only". I think we'd be
adding the same BO pre-SNB, but I'm not sure.

-Nanley

> > texture. Found by inspection.
> > 
> > ---
> > 
> > I'm looking into writing a test for this.
> > 
> >  src/mesa/drivers/dri/i965/brw_draw.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
> > b/src/mesa/drivers/dri/i965/brw_draw.c
> > index 271456e0f7d..71461d7b0a7 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> > @@ -623,10 +623,10 @@ brw_postdraw_set_buffers_need_resolve(struct 
> > brw_context *brw)
> > }
> >  
> > if (stencil_irb && brw->stencil_write_enabled) {
> > -  brw_depth_cache_add_bo(brw, stencil_irb->mt->bo);
> >struct intel_mipmap_tree *stencil_mt =
> >   stencil_irb->mt->stencil_mt != NULL ?
> >   stencil_irb->mt->stencil_mt : stencil_irb->mt;
> > +  brw_depth_cache_add_bo(brw, stencil_mt->bo);
> >intel_miptree_finish_write(brw, stencil_mt, stencil_irb->mt_level,
> >   stencil_irb->mt_layer,
> >   stencil_irb->layer_count, 
> > ISL_AUX_USAGE_NONE);
> > -- 
> > 2.17.0
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
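
For readers following the thread, the fix in the patch above comes down to
resolving the stencil miptree first and only then picking the BO to add to
the cache. A minimal sketch, assuming simplified stand-in types (struct bo,
struct miptree and stencil_bo_for() are hypothetical names, not the i965
definitions):

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer object. */
struct bo {
   int id;
};

/* Hypothetical stand-in for a miptree. stencil_mt is NULL when the stencil
 * data lives in this tree itself rather than in a separate tree.
 */
struct miptree {
   struct bo *bo;
   struct miptree *stencil_mt;
};

/* Return the BO that actually receives stencil writes: the separate
 * stencil tree's BO if there is one, otherwise the tree's own BO.
 */
static struct bo *
stencil_bo_for(const struct miptree *mt)
{
   const struct miptree *stencil_mt =
      mt->stencil_mt != NULL ? mt->stencil_mt : mt;
   return stencil_mt->bo;
}
```

With a separate stencil tree the two BOs differ, which is the case the
original ordering missed.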


Re: [Mesa-dev] [PATCH 08/13] i965/miptree: Share the miptree format in miptree_create

2018-06-13 Thread Nanley Chery
On Wed, Jun 13, 2018 at 09:33:41AM +0300, Pohjolainen, Topi wrote:
> On Tue, Jun 12, 2018 at 12:22:00PM -0700, Nanley Chery wrote:
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 +--
> >  1 file changed, 15 insertions(+), 15 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 03628e3fd9f..97de30076e0 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -696,8 +696,19 @@ miptree_create(struct brw_context *brw,
> > if (devinfo->gen < 6 && _mesa_is_format_color_format(format))
> >tiling_flags &= ~ISL_TILING_Y0_BIT;
> >  
> > +   mesa_format mt_fmt;
> > +   if (_mesa_is_format_color_format(format)) {
> > +  mt_fmt = intel_lower_compressed_format(brw, format);
> > +   } else {
> > +  /* Fix up the Z miptree format for how we're splitting out separate
> > +   * stencil. Gen7 expects there to be no stencil bits in its depth 
> > buffer.
> > +   */
> > +  mt_fmt = (devinfo->gen < 6) ? format :
> > +   intel_depth_format_for_depthstencil_format(format);
> > +   }
> 
> I wonder if we need to add something of this sort for coverity not complaining
> later on (I don't know if it is clever to know what
> _mesa_is_format_color_format() does):
> 
>   } else {
>  unreachable("Format with invalid base");
>   }
> 
> 

Where would we be adding this unreachable? There is already an else case here.

-Nanley

> > +
> > if (format == MESA_FORMAT_S_UINT8)
> > -  return make_surface(brw, target, format, first_level, last_level,
> > +  return make_surface(brw, target, mt_fmt, first_level, last_level,
> >width0, height0, depth0, num_samples,
> >tiling_flags,
> >ISL_SURF_USAGE_STENCIL_BIT |
> > @@ -709,13 +720,8 @@ miptree_create(struct brw_context *brw,
> > const GLenum base_format = _mesa_get_format_base_format(format);
> > if ((base_format == GL_DEPTH_COMPONENT ||
> >  base_format == GL_DEPTH_STENCIL)) {
> > -  /* Fix up the Z miptree format for how we're splitting out separate
> > -   * stencil.  Gen7 expects there to be no stencil bits in its depth 
> > buffer.
> > -   */
> > -  const mesa_format depth_only_format =
> > - intel_depth_format_for_depthstencil_format(format);
> >struct intel_mipmap_tree *mt = make_surface(
> > - brw, target, devinfo->gen >= 6 ? depth_only_format : format,
> > + brw, target, mt_fmt,
> >   first_level, last_level,
> >   width0, height0, depth0, num_samples, tiling_flags,
> >   ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
> > @@ -733,19 +739,13 @@ miptree_create(struct brw_context *brw,
> >return mt;
> > }
> >  
> > -   mesa_format tex_format = format;
> > -   mesa_format etc_format = MESA_FORMAT_NONE;
> > uint32_t alloc_flags = 0;
> >  
> > -   format = intel_lower_compressed_format(brw, format);
> > -
> > -   etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> > -
> > if (flags & MIPTREE_CREATE_BUSY)
> >alloc_flags |= BO_ALLOC_BUSY;
> >  
> > struct intel_mipmap_tree *mt = make_surface(
> > - brw, target, format,
> > + brw, target, mt_fmt,
> >   first_level, last_level,
> >   width0, height0, depth0,
> >   num_samples, tiling_flags,
> > @@ -755,7 +755,7 @@ miptree_create(struct brw_context *brw,
> > if (!mt)
> >return NULL;
> >  
> > -   mt->etc_format = etc_format;
> > +   mt->etc_format = (mt_fmt != format) ? format : MESA_FORMAT_NONE;
> >  
> > if (!(flags & MIPTREE_CREATE_NO_AUX))
> >intel_miptree_choose_aux_usage(brw, mt);
> > -- 
> > 2.17.0
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.

2018-06-13 Thread Jason Ekstrand
I just reverted this in master because it regressed about 30K Vulkan CTS
tests.  More investigation needed?

On Wed, Jun 13, 2018 at 2:07 AM, Kenneth Graunke 
wrote:

> On Tuesday, June 12, 2018 1:38:03 PM PDT Rafael Antognolli wrote:
> > On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote:
> > > The UBO push analysis pass incorrectly assumed that all values would
> fit
> > > within a 32B chunk, and only recorded a bit for the 32B chunk
> containing
> > > the starting offset.
> > >
> > > For example, if a UBO contained the following, tightly packed:
> > >
> > >vec4 a;  // [0, 16)
> > >float b; // [16, 20)
> > >vec4 c;  // [20, 36)
> > >
> > > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1,
> > > which means that we ought to record two 32B chunks in the bitfield.
> > >
> > > Similarly, dvec4s would suffer from the same problem.
> > > ---
> > >  src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > index d58fe3dd2e3..6d6ccf73ade 100644
> > > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c
> > > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state
> *state, nir_block *block)
> > >   if (offset >= 64)
> > >  continue;
> > >
> > > + /* The value might span multiple 32-byte chunks. */
> > > + const int bytes = nir_intrinsic_dest_components(intrin) *
> > > +   (nir_dest_bit_size(intrin->dest) / 8);
> > > + const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes,
> 32);
> > > + const int regs = end - offset + 1;
> > > +
> >
> > But if I understood it correctly, offset is the first 32B chunk within
> > the UBO block (it's actually an ubo "chunk offset"). And you calculate
> > bytes by taking the number of components times the size of each
> > component of the nir_intrinsic_load_ubo instruction (which apparently
> > supports multiple components). So yeah, this makes sense to me.
>
> Yeah, that's exactly right.  load_ubo can load up to 4 components.
>
> > Take this review with a grain of salt (assuming what I wrote above is
> > correct), but this looks simple enough. So it is
> >
> > Reviewed-by: Rafael Antognolli 
>
> Thanks!
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
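
The chunk arithmetic from the commit message (vec4 c at [20, 36) touching
chunks 0 and 1) can be sketched on its own. This is only an illustration of
the idea, not the code from the reverted patch; chunks_touched() and
CHUNK_BYTES are hypothetical names, and the helper assumes size > 0 and a
start offset inside the first 64 chunks:

```c
#include <stdint.h>

/* Size in bytes of one chunk, as discussed in the thread. */
#define CHUNK_BYTES 32u

/* Return a bitmask with one bit set for every 32-byte chunk overlapped by
 * the half-open byte range [start, start + size).
 *
 * Assumptions (hypothetical helper): size > 0 and start / 32 < 64.
 */
static uint64_t
chunks_touched(uint32_t start, uint32_t size)
{
   const uint32_t first = start / CHUNK_BYTES;
   /* Index of the chunk holding the last byte, inclusive. */
   const uint32_t last = (start + size - 1) / CHUNK_BYTES;
   const uint32_t count = last - first + 1;

   /* Build `count` consecutive one-bits, then shift them up to `first`. */
   const uint64_t ones = (count >= 64) ? ~0ull : ((1ull << count) - 1);
   return ones << first;
}
```

Using the inclusive index of the last byte means a load that fits inside a
single chunk gives a count of exactly 1, while the tightly packed vec4 at
offset 20 gives 2.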


Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Add and use mt_surf_usage

2018-06-13 Thread Nanley Chery
On Wed, Jun 13, 2018 at 09:39:08AM +0300, Pohjolainen, Topi wrote:
> On Tue, Jun 12, 2018 at 12:22:02PM -0700, Nanley Chery wrote:
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 ---
> >  1 file changed, 26 insertions(+), 14 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index cfb83d15ecc..5e00da86d32 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -677,6 +677,23 @@ make_separate_stencil_surface(struct brw_context *brw,
> > return true;
> >  }
> >  
> > +/* Return the usual surface usage flags for the given format. */
> > +static isl_surf_usage_flags_t
> > +mt_surf_usage(mesa_format format)
> > +{
> > +   switch(_mesa_get_format_base_format(format)) {
> > +   case GL_DEPTH_COMPONENT:
> > +  return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT;
> > +   case GL_DEPTH_STENCIL:
> > +  return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_STENCIL_BIT |
> > + ISL_SURF_USAGE_TEXTURE_BIT;
> > +   case GL_STENCIL_INDEX:
> > +  return ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT;
> > +   default:
> > +  return ISL_SURF_USAGE_RENDER_TARGET_BIT | ISL_SURF_USAGE_TEXTURE_BIT;
> > +   }
> > +}
> > +
> >  static struct intel_mipmap_tree *
> >  miptree_create(struct brw_context *brw,
> > GLenum target,
> > @@ -713,8 +730,7 @@ miptree_create(struct brw_context *brw,
> >return make_surface(brw, target, mt_fmt, first_level, last_level,
> >width0, height0, depth0, num_samples,
> >tiling_flags,
> > -  ISL_SURF_USAGE_STENCIL_BIT |
> > -  ISL_SURF_USAGE_TEXTURE_BIT,
> 
> New logic also sets ISL_SURF_USAGE_DEPTH_BIT here.
> 

How so? The base format of MESA_FORMAT_S_UINT8 is GL_STENCIL_INDEX.

> > +  mt_surf_usage(mt_fmt),
> >alloc_flags,
> >0,
> >NULL);
> > @@ -726,7 +742,7 @@ miptree_create(struct brw_context *brw,
> >   brw, target, mt_fmt,
> >   first_level, last_level,
> >   width0, height0, depth0, num_samples, tiling_flags,
> > - ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
> > + mt_surf_usage(mt_fmt),
> >   alloc_flags, 0, NULL);
> >  
> >if (needs_separate_stencil(brw, mt, format) &&
> > @@ -746,8 +762,7 @@ miptree_create(struct brw_context *brw,
> >   first_level, last_level,
> >   width0, height0, depth0,
> >   num_samples, tiling_flags,
> > - ISL_SURF_USAGE_RENDER_TARGET_BIT |
> > - ISL_SURF_USAGE_TEXTURE_BIT,
> > + mt_surf_usage(mt_fmt),
> >   alloc_flags, 0, NULL);
> > if (!mt)
> >return NULL;
> > @@ -816,12 +831,11 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> >  
> > if ((base_format == GL_DEPTH_COMPONENT ||
> >  base_format == GL_DEPTH_STENCIL)) {
> > -  const mesa_format depth_only_format =
> > - intel_depth_format_for_depthstencil_format(format);
> > -  mt = make_surface(brw, target,
> > -devinfo->gen >= 6 ? depth_only_format : format,
> > +  mesa_format mt_fmt = (devinfo->gen < 6) ? format :
> > +   
> > intel_depth_format_for_depthstencil_format(format);
> > +  mt = make_surface(brw, target, mt_fmt,
> >  0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT,
> > -ISL_SURF_USAGE_DEPTH_BIT | 
> > ISL_SURF_USAGE_TEXTURE_BIT,
> > +mt_surf_usage(mt_fmt),
> >  0, pitch, bo);
> >if (!mt)
> >   return NULL;
> > @@ -836,8 +850,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> >mt = make_surface(brw, target, MESA_FORMAT_S_UINT8,
> >  0, 0, width, height, depth, 1,
> >  ISL_TILING_W_BIT,
> > -ISL_SURF_USAGE_STENCIL_BIT |
> > -ISL_SURF_USAGE_TEXTURE_BIT,
> > +mt_surf_usage(MESA_FORMAT_S_UINT8),
> 
> Same here, new logic also sets ISL_SURF_USAGE_DEPTH_BIT here.
> 

How so?

-Nanley

> >  0, pitch, bo);
> >if (!mt)
> >   return NULL;
> > @@ -862,8 +875,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> > mt = make_surface(brw, target, format,
> >   0, 0, width, height, depth, 1,
> >   1lu << tiling,
> > - ISL_SURF_USAGE_RENDER_TARGET_BIT |
> > - ISL_SURF_USAGE_T

[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106912

--- Comment #1 from Samuel Pitoiset  ---
Can you explain how to reproduce the issue in-game? I would like to know if
Vega is affected as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check

2018-06-13 Thread Matt Turner
On Wed, Jun 13, 2018 at 8:37 AM, Dylan Baker  wrote:
> Quoting Matt Turner (2018-06-12 17:50:20)
>> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
>> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
>> rework -latomic check) further extended the fixes in configure.ac but
>> not in Meson. This commit extends those fixes to the Meson tests.
>>
>> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)
>> ---
>>  meson.build | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/meson.build b/meson.build
>> index 7dba52369b0..62200476216 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -836,7 +836,13 @@ endif
>>  # Check for GCC style atomics
>>  dep_atomic = null_dep
>>
>> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, 
>> __ATOMIC_ACQUIRE); }',
>> +if cc.compiles('''#include <stdint.h>
>> +  int main() {
>> +struct {
>> +  uint64_t *v;
>> +} x;
>> +return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
>> +  }''',
>> name : 'GCC atomic builtins')
>>pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>>
>> --
>> 2.16.1
>>
>
> Should patches 2 and 3 be cc 18.1?

Yes, I will send you a list of 5 patches that I'd like to include in
18.1 including these.

(I *really* hate that we're using 64-bit atomics)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
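
As a standalone illustration of what the new check exercises: the test
program loads a 64-bit value through a pointer held in a struct. On some
32-bit targets GCC cannot inline that load and emits a call into libatomic
instead, which is why a 64-bit probe can behave differently from the old int
one. A minimal sketch (struct holder and load_acquire_u64() are hypothetical
names; the __atomic_load_n builtin and __ATOMIC_ACQUIRE come from the patch):

```c
#include <stdint.h>

/* Mirrors the shape of the configure-time test: a pointer to a 64-bit
 * value stored inside a struct.
 */
struct holder {
   uint64_t *v;
};

/* Perform a 64-bit atomic load with acquire ordering using the GCC
 * builtin that the Meson check probes for.
 */
static uint64_t
load_acquire_u64(const struct holder *h)
{
   return __atomic_load_n(h->v, __ATOMIC_ACQUIRE);
}
```

If this compiles but fails to link on a given target, that target probably
needs -latomic.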


Re: [Mesa-dev] [PATCH 11/13] i965/miptree: Refactor miptree_create

2018-06-13 Thread Nanley Chery
On Wed, Jun 13, 2018 at 09:44:14AM +0300, Pohjolainen, Topi wrote:
> On Tue, Jun 12, 2018 at 12:22:03PM -0700, Nanley Chery wrote:
> > Enable a future patch to create the r8stencil_mt in this function.
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 48 +--
> >  1 file changed, 12 insertions(+), 36 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 5e00da86d32..b078c759243 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -726,48 +726,24 @@ miptree_create(struct brw_context *brw,
> > intel_depth_format_for_depthstencil_format(format);
> > }
> >  
> > -   if (format == MESA_FORMAT_S_UINT8)
> > -  return make_surface(brw, target, mt_fmt, first_level, last_level,
> > -  width0, height0, depth0, num_samples,
> > -  tiling_flags,
> > -  mt_surf_usage(mt_fmt),
> > -  alloc_flags,
> > -  0,
> > -  NULL);
> > +   struct intel_mipmap_tree *mt =
> > +  make_surface(brw, target, mt_fmt, first_level, last_level,
> > +   width0, height0, depth0, num_samples,
> > +   tiling_flags, mt_surf_usage(mt_fmt),
> > +   alloc_flags, 0, NULL);
> >  
> > -   const GLenum base_format = _mesa_get_format_base_format(format);
> > -   if ((base_format == GL_DEPTH_COMPONENT ||
> > -base_format == GL_DEPTH_STENCIL)) {
> > -  struct intel_mipmap_tree *mt = make_surface(
> > - brw, target, mt_fmt,
> > - first_level, last_level,
> > - width0, height0, depth0, num_samples, tiling_flags,
> > - mt_surf_usage(mt_fmt),
> > - alloc_flags, 0, NULL);
> > -
> > -  if (needs_separate_stencil(brw, mt, format) &&
> > -  !make_separate_stencil_surface(brw, mt)) {
> > +   if (mt == NULL)
> > +  return NULL;
> > +
> > +   if (needs_separate_stencil(brw, mt, format)) {
> > +  if (!make_separate_stencil_surface(brw, mt)) {
> >   intel_miptree_release(&mt);
> >   return NULL;
> >}
> > -
> > -  if (!(flags & MIPTREE_CREATE_NO_AUX))
> > - intel_miptree_choose_aux_usage(brw, mt);
> > -
> > -  return mt;
> > }
> >  
> > -   struct intel_mipmap_tree *mt = make_surface(
> > - brw, target, mt_fmt,
> > - first_level, last_level,
> > - width0, height0, depth0,
> > - num_samples, tiling_flags,
> > - mt_surf_usage(mt_fmt),
> > - alloc_flags, 0, NULL);
> > -   if (!mt)
> > -  return NULL;
> > -
> > -   mt->etc_format = (mt_fmt != format) ? format : MESA_FORMAT_NONE;
> > +   if (_mesa_is_format_color_format(format) && mt_fmt != format)
> > +  mt->etc_format = format;
> 
> This relies on MESA_FORMAT_NONE == 0 and make_surface() to use calloc().
> Should we play safe and:
> 
>   else
>  mt->etc_format = MESA_FORMAT_NONE;
> 

Sure, I plan to change it to this:

 mt->etc_format = (_mesa_is_format_color_format(format) && mt_fmt != format) ?
  format : MESA_FORMAT_NONE;

v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi)

> >  
> > if (!(flags & MIPTREE_CREATE_NO_AUX))
> >intel_miptree_choose_aux_usage(brw, mt);
> > -- 
> > 2.17.0
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
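
The v2 form agreed on above sets the field on every path instead of relying
on zeroed allocation. A minimal sketch of that selection, assuming a
simplified format enum (enum fmt, its values and pick_etc_format() are
hypothetical, not the Mesa definitions):

```c
#include <stdbool.h>

/* Hypothetical stand-in for mesa_format; FMT_NONE happens to be 0 here,
 * but the helper below does not depend on that.
 */
enum fmt {
   FMT_NONE = 0,
   FMT_ETC2_RGB8,
   FMT_RGBX8,
};

/* Return the format to remember as the original compressed format: the
 * requested format when it is a color format that had to be lowered to a
 * different allocated format, and FMT_NONE otherwise.
 */
static enum fmt
pick_etc_format(bool is_color, enum fmt requested, enum fmt allocated)
{
   return (is_color && allocated != requested) ? requested : FMT_NONE;
}
```

Because the result is always assigned, it stays correct even if the
allocator stops zeroing memory or the "none" value is renumbered.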


[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106677

--- Comment #2 from Deepak  ---
(In reply to David Cuthbert from comment #0)
> I'm filing this currently so I have a place to keep notes on this bug.
> 
> Running the atom text editor under various OSes (tried Linux Mint 18.3,
> Ubuntu 18.04, and currently using Fedora 28) results in minor screen
> glitches, eventually followed by drawing going completely haywire. I
> recompiled vmwgfx.ko from the current HEAD which resulted in fewer glitches,
> but it never completely goes away.
> 
> The hangs are always immediately preceded by:
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command "(null)" causing device
> error.
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command buffer offset is 28
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command size is 24
> 

Hi David, thanks for the bug report. Do you see the command buffer error only
with the new top-of-tree vmwgfx? I tried to reproduce this bug on a clean
Ubuntu 18.04 with Atom installed from the software center. The Atom text
editor becomes unresponsive, but I don't see the kernel command buffer errors.

Will try with Fedora 28 later.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: Don't copy propagate from SSBO or shared variables either

2018-06-13 Thread Caio Marcelo de Oliveira Filho
Reviewed-by: Caio Marcelo de Oliveira Filho 

On Tue, Jun 12, 2018 at 03:48:13PM -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Since SSBOs can be written, copy propagating a read can cause the

Optional: maybe write "... can be written by other threads"?

> value to magically change.  SSBO reads are also very expensive, so
> doing it twice will be slower.
> 
> Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> total instructions in shared programs: 14399120 -> 14399119 (<.01%)
> instructions in affected programs: 684 -> 683 (-0.15%)
> helped: 1
> HURT: 0
> 
> total cycles in shared programs: 532978931 -> 532973113 (<.01%)
> cycles in affected programs: 530484 -> 524666 (-1.10%)
> helped: 1
> HURT: 0
> 
> Signed-off-by: Ian Romanick 
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp 
> b/src/compiler/glsl/opt_copy_propagation.cpp
> index 6220aa86da9..206dffe4f1c 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -347,6 +347,8 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
> if (lhs_var != NULL && rhs_var != NULL && lhs_var != rhs_var) {
>if (lhs_var->data.mode != ir_var_shader_storage &&
>lhs_var->data.mode != ir_var_shader_shared &&
> +  rhs_var->data.mode != ir_var_shader_storage &&
> +  rhs_var->data.mode != ir_var_shader_shared &&
>lhs_var->data.precise == rhs_var->data.precise) {
>   _mesa_hash_table_insert(acp, lhs_var, rhs_var);
>}
> -- 
> 2.14.4
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
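
The condition added by the patch can be read as a small predicate: an
assignment is only recorded as an available copy when neither side is
storage that may change outside the current straight-line code. A minimal
sketch, assuming simplified stand-in types (enum var_mode, struct var and
can_record_copy() are hypothetical, not the GLSL IR classes):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GLSL IR variable modes named in the
 * patch.
 */
enum var_mode {
   VAR_AUTO,
   VAR_UNIFORM,
   VAR_SHADER_STORAGE,
   VAR_SHADER_SHARED,
};

struct var {
   enum var_mode mode;
   bool precise;
};

/* SSBO and shared variables can be written behind the optimizer's back. */
static bool
is_externally_writable(const struct var *v)
{
   return v->mode == VAR_SHADER_STORAGE || v->mode == VAR_SHADER_SHARED;
}

/* Decide whether "lhs = rhs" may be recorded for copy propagation. */
static bool
can_record_copy(const struct var *lhs, const struct var *rhs)
{
   if (lhs == NULL || rhs == NULL || lhs == rhs)
      return false;
   if (is_externally_writable(lhs) || is_externally_writable(rhs))
      return false;
   return lhs->precise == rhs->precise;
}
```

Before the patch only the left-hand side was checked, so a copy whose source
was an SSBO could still be propagated and re-read later.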


Re: [Mesa-dev] [PATCH 2/2] glsl: Don't copy propagate elements from SSBO or shared variables either

2018-06-13 Thread Caio Marcelo de Oliveira Filho
Reviewed-by: Caio Marcelo de Oliveira Filho 


On Tue, Jun 12, 2018 at 03:48:14PM -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> Since SSBOs can be written, copy propagating a read can cause the
> value to magically change.  SSBO reads are also very expensive, so
> doing it twice will be slower.
> 
> The same shader was helped by this patch and the previous.
> 
> Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> total instructions in shared programs: 14399119 -> 14399113 (<.01%)
> instructions in affected programs: 683 -> 677 (-0.88%)
> helped: 1
> HURT: 0
> 
> total cycles in shared programs: 532973113 -> 532971865 (<.01%)
> cycles in affected programs: 524666 -> 523418 (-0.24%)
> helped: 1
> HURT: 0
> 
> Signed-off-by: Ian Romanick 
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
> ---
>  src/compiler/glsl/opt_copy_propagation_elements.cpp | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp 
> b/src/compiler/glsl/opt_copy_propagation_elements.cpp
> index 8bae424a1d0..8975e727522 100644
> --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
> @@ -544,6 +544,10 @@ 
> ir_copy_propagation_elements_visitor::add_copy(ir_assignment *ir)
> if (!lhs || !(lhs->type->is_scalar() || lhs->type->is_vector()))
>return;
>  
> +   if (lhs->var->data.mode == ir_var_shader_storage ||
> +   lhs->var->data.mode == ir_var_shader_shared)
> +  return;
> +
> ir_dereference_variable *rhs = ir->rhs->as_dereference_variable();
> if (!rhs) {
>ir_swizzle *swiz = ir->rhs->as_swizzle();
> @@ -560,6 +564,10 @@ 
> ir_copy_propagation_elements_visitor::add_copy(ir_assignment *ir)
>orig_swizzle[3] = swiz->mask.w;
> }
>  
> +   if (rhs->var->data.mode == ir_var_shader_storage ||
> +   rhs->var->data.mode == ir_var_shader_shared)
> +  return;
> +
> /* Move the swizzle channels out to the positions they match in the
>  * destination.  We don't want to have to rewrite the swizzle[]
>  * array every time we clear a bit of the write_mask.
> -- 
> 2.14.4
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Add and use mt_surf_usage

2018-06-13 Thread Pohjolainen, Topi
On Wed, Jun 13, 2018 at 09:25:37AM -0700, Nanley Chery wrote:
> On Wed, Jun 13, 2018 at 09:39:08AM +0300, Pohjolainen, Topi wrote:
> > On Tue, Jun 12, 2018 at 12:22:02PM -0700, Nanley Chery wrote:
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 ---
> > >  1 file changed, 26 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index cfb83d15ecc..5e00da86d32 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -677,6 +677,23 @@ make_separate_stencil_surface(struct brw_context 
> > > *brw,
> > > return true;
> > >  }
> > >  
> > > +/* Return the usual surface usage flags for the given format. */
> > > +static isl_surf_usage_flags_t
> > > +mt_surf_usage(mesa_format format)
> > > +{
> > > +   switch(_mesa_get_format_base_format(format)) {
> > > +   case GL_DEPTH_COMPONENT:
> > > +  return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT;
> > > +   case GL_DEPTH_STENCIL:
> > > +  return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_STENCIL_BIT |
> > > + ISL_SURF_USAGE_TEXTURE_BIT;
> > > +   case GL_STENCIL_INDEX:
> > > +  return ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT;
> > > +   default:
> > > +  return ISL_SURF_USAGE_RENDER_TARGET_BIT | 
> > > ISL_SURF_USAGE_TEXTURE_BIT;
> > > +   }
> > > +}
> > > +
> > >  static struct intel_mipmap_tree *
> > >  miptree_create(struct brw_context *brw,
> > > GLenum target,
> > > @@ -713,8 +730,7 @@ miptree_create(struct brw_context *brw,
> > >return make_surface(brw, target, mt_fmt, first_level, last_level,
> > >width0, height0, depth0, num_samples,
> > >tiling_flags,
> > > -  ISL_SURF_USAGE_STENCIL_BIT |
> > > -  ISL_SURF_USAGE_TEXTURE_BIT,
> > 
> > New logic also sets ISL_SURF_USAGE_DEPTH_BIT here.
> > 
> 
> How so? The base format of MESA_FORMAT_S_UINT8 is GL_STENCIL_INDEX.

Yeah, my bad, I misread completely, same further down, sorry for the noise :(

> 
> > > +  mt_surf_usage(mt_fmt),
> > >alloc_flags,
> > >0,
> > >NULL);
> > > @@ -726,7 +742,7 @@ miptree_create(struct brw_context *brw,
> > >   brw, target, mt_fmt,
> > >   first_level, last_level,
> > >   width0, height0, depth0, num_samples, tiling_flags,
> > > - ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
> > > + mt_surf_usage(mt_fmt),
> > >   alloc_flags, 0, NULL);
> > >  
> > >if (needs_separate_stencil(brw, mt, format) &&
> > > @@ -746,8 +762,7 @@ miptree_create(struct brw_context *brw,
> > >   first_level, last_level,
> > >   width0, height0, depth0,
> > >   num_samples, tiling_flags,
> > > - ISL_SURF_USAGE_RENDER_TARGET_BIT |
> > > - ISL_SURF_USAGE_TEXTURE_BIT,
> > > + mt_surf_usage(mt_fmt),
> > >   alloc_flags, 0, NULL);
> > > if (!mt)
> > >return NULL;
> > > @@ -816,12 +831,11 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> > >  
> > > if ((base_format == GL_DEPTH_COMPONENT ||
> > >  base_format == GL_DEPTH_STENCIL)) {
> > > -  const mesa_format depth_only_format =
> > > - intel_depth_format_for_depthstencil_format(format);
> > > -  mt = make_surface(brw, target,
> > > -devinfo->gen >= 6 ? depth_only_format : format,
> > > +  mesa_format mt_fmt = (devinfo->gen < 6) ? format :
> > > +   
> > > intel_depth_format_for_depthstencil_format(format);
> > > +  mt = make_surface(brw, target, mt_fmt,
> > >  0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT,
> > > -ISL_SURF_USAGE_DEPTH_BIT | 
> > > ISL_SURF_USAGE_TEXTURE_BIT,
> > > +mt_surf_usage(mt_fmt),
> > >  0, pitch, bo);
> > >if (!mt)
> > >   return NULL;
> > > @@ -836,8 +850,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> > >mt = make_surface(brw, target, MESA_FORMAT_S_UINT8,
> > >  0, 0, width, height, depth, 1,
> > >  ISL_TILING_W_BIT,
> > > -ISL_SURF_USAGE_STENCIL_BIT |
> > > -ISL_SURF_USAGE_TEXTURE_BIT,
> > > +mt_surf_usage(MESA_FORMAT_S_UINT8),
> > 
> > Same here, new logic also sets ISL_SURF_USAGE_DEPTH_BIT here.
> > 
> 
> How so?
> 
> -Nanley
> 
> > >  0, pitch, bo);
> > > 

Re: [Mesa-dev] [PATCH 08/13] i965/miptree: Share the miptree format in miptree_create

2018-06-13 Thread Pohjolainen, Topi
On Wed, Jun 13, 2018 at 09:20:55AM -0700, Nanley Chery wrote:
> On Wed, Jun 13, 2018 at 09:33:41AM +0300, Pohjolainen, Topi wrote:
> > On Tue, Jun 12, 2018 at 12:22:00PM -0700, Nanley Chery wrote:
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 +--
> > >  1 file changed, 15 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index 03628e3fd9f..97de30076e0 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -696,8 +696,19 @@ miptree_create(struct brw_context *brw,
> > > if (devinfo->gen < 6 && _mesa_is_format_color_format(format))
> > >tiling_flags &= ~ISL_TILING_Y0_BIT;
> > >  
> > > +   mesa_format mt_fmt;
> > > +   if (_mesa_is_format_color_format(format)) {
> > > +  mt_fmt = intel_lower_compressed_format(brw, format);
> > > +   } else {
> > > +  /* Fix up the Z miptree format for how we're splitting out separate
> > > +   * stencil. Gen7 expects there to be no stencil bits in its depth 
> > > buffer.
> > > +   */
> > > +  mt_fmt = (devinfo->gen < 6) ? format :
> > > +   intel_depth_format_for_depthstencil_format(format);
> > > +   }
> > 
> > I wonder if we need to add something of this sort for coverity not 
> > complaining
> > later on (I don't know if it is clever to know what
> > _mesa_is_format_color_format() does):
> > 
> >   } else {
> >  unreachable("Format with invalid base");
> >   }
> > 
> > 
> 
> Where would we be adding this unreachable? There is already an else case here.

Yeah, same thing as with the other patch, I was thinking the STENCIL_INDEX
case in my head and somehow stopped reading what you actually had here. This
is all fine, sorry.

> 
> -Nanley
> 
> > > +
> > > if (format == MESA_FORMAT_S_UINT8)
> > > -  return make_surface(brw, target, format, first_level, last_level,
> > > +  return make_surface(brw, target, mt_fmt, first_level, last_level,
> > >width0, height0, depth0, num_samples,
> > >tiling_flags,
> > >ISL_SURF_USAGE_STENCIL_BIT |
> > > @@ -709,13 +720,8 @@ miptree_create(struct brw_context *brw,
> > > const GLenum base_format = _mesa_get_format_base_format(format);
> > > if ((base_format == GL_DEPTH_COMPONENT ||
> > >  base_format == GL_DEPTH_STENCIL)) {
> > > -  /* Fix up the Z miptree format for how we're splitting out separate
> > > -   * stencil.  Gen7 expects there to be no stencil bits in its depth 
> > > buffer.
> > > -   */
> > > -  const mesa_format depth_only_format =
> > > - intel_depth_format_for_depthstencil_format(format);
> > >struct intel_mipmap_tree *mt = make_surface(
> > > - brw, target, devinfo->gen >= 6 ? depth_only_format : format,
> > > + brw, target, mt_fmt,
> > >   first_level, last_level,
> > >   width0, height0, depth0, num_samples, tiling_flags,
> > >   ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
> > > @@ -733,19 +739,13 @@ miptree_create(struct brw_context *brw,
> > >return mt;
> > > }
> > >  
> > > -   mesa_format tex_format = format;
> > > -   mesa_format etc_format = MESA_FORMAT_NONE;
> > > uint32_t alloc_flags = 0;
> > >  
> > > -   format = intel_lower_compressed_format(brw, format);
> > > -
> > > -   etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
> > > -
> > > if (flags & MIPTREE_CREATE_BUSY)
> > >alloc_flags |= BO_ALLOC_BUSY;
> > >  
> > > struct intel_mipmap_tree *mt = make_surface(
> > > - brw, target, format,
> > > + brw, target, mt_fmt,
> > >   first_level, last_level,
> > >   width0, height0, depth0,
> > >   num_samples, tiling_flags,
> > > @@ -755,7 +755,7 @@ miptree_create(struct brw_context *brw,
> > > if (!mt)
> > >return NULL;
> > >  
> > > -   mt->etc_format = etc_format;
> > > +   mt->etc_format = (mt_fmt != format) ? format : MESA_FORMAT_NONE;
> > >  
> > > if (!(flags & MIPTREE_CREATE_NO_AUX))
> > >intel_miptree_choose_aux_usage(brw, mt);
> > > -- 
> > > 2.17.0
> > > 


[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106677

--- Comment #3 from Thomas Hellström  ---
FWIW, no apparent problems on Fedora Rawhide with 4.18.0-rc0.

/Thomas



[Mesa-dev] [PATCH] radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8

2018-06-13 Thread Samuel Pitoiset
This causes rendering issues in Shadow Warrior 2 with DXVK.

Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index fae441ceb6..373072dd36 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -717,6 +717,14 @@ emit_fast_htile_clear(struct radv_cmd_buffer *cmd_buffer,
if ((clear_value.depth != 0.0 && clear_value.depth != 1.0) || !(aspects & VK_IMAGE_ASPECT_DEPTH_BIT))
goto fail;
 
+   /* GFX8 only supports 32-bit depth surfaces but we can enable TC-compat
+* HTILE for 16-bit surfaces if no Z planes are compressed. Though,
+* fast HTILE clears don't seem to work.
+*/
+   if (cmd_buffer->device->physical_device->rad_info.chip_class == VI &&
+   iview->image->vk_format == VK_FORMAT_D16_UNORM)
+   goto fail;
+
if (vk_format_aspects(iview->image->vk_format) & VK_IMAGE_ASPECT_STENCIL_BIT) {
if (clear_value.stencil != 0 || !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))
goto fail;
-- 
2.17.1



Re: [Mesa-dev] [PATCH] radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8

2018-06-13 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Wed, Jun 13, 2018 at 8:19 PM, Samuel Pitoiset wrote:
> This causes rendering issues in Shadow Warrior 2 with DXVK.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_meta_clear.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
> index fae441ceb6..373072dd36 100644
> --- a/src/amd/vulkan/radv_meta_clear.c
> +++ b/src/amd/vulkan/radv_meta_clear.c
> @@ -717,6 +717,14 @@ emit_fast_htile_clear(struct radv_cmd_buffer *cmd_buffer,
> if ((clear_value.depth != 0.0 && clear_value.depth != 1.0) || !(aspects & VK_IMAGE_ASPECT_DEPTH_BIT))
> goto fail;
>
> +   /* GFX8 only supports 32-bit depth surfaces but we can enable TC-compat
> +* HTILE for 16-bit surfaces if no Z planes are compressed. Though,
> +* fast HTILE clears don't seem to work.
> +*/
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class == VI &&
> +   iview->image->vk_format == VK_FORMAT_D16_UNORM)
> +   goto fail;
> +
> if (vk_format_aspects(iview->image->vk_format) & VK_IMAGE_ASPECT_STENCIL_BIT) {
> if (clear_value.stencil != 0 || !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))
> goto fail;
> --
> 2.17.1
>


[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106912

Samuel Pitoiset changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Samuel Pitoiset  ---
Fixed.
https://cgit.freedesktop.org/mesa/mesa/commit/?id=51e23d34190076159129dd7b449b95a1ac3d4949



[Mesa-dev] [PATCH 2/2] radv: don't check for linear images in emit_fast_color_clear()

2018-06-13 Thread Samuel Pitoiset
We don't enable CMASK for linear surfaces and addrlib only
enables DCC for tiled surfaces.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index 28050079f92..b52beb3861c 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1008,8 +1008,6 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
if (iview->image->info.array_size != iview->layer_count)
goto fail;
 
-   if (iview->image->surface.is_linear)
-   goto fail;
if (!radv_image_extent_compare(iview->image, &iview->extent))
goto fail;
 
-- 
2.17.1



[Mesa-dev] [PATCH 1/2] radv: don't check the number of levels in emit_fast_color_clear()

2018-06-13 Thread Samuel Pitoiset
This is useless because we don't support DCC/CMASK for mipmaps.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_meta_clear.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index fae441ceb66..28050079f92 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1008,9 +1008,6 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
if (iview->image->info.array_size != iview->layer_count)
goto fail;
 
-   if (iview->image->info.levels > 1)
-   goto fail;
-
if (iview->image->surface.is_linear)
goto fail;
if (!radv_image_extent_compare(iview->image, &iview->extent))
-- 
2.17.1



[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106677

--- Comment #4 from David Cuthbert  ---
Note that it takes some fiddling to reproduce this currently (the exact trigger
isn't known). I can go hours without seeing this issue.

I've been banging my head against the wall trying to get my extra logging to
work -- finally realized yesterday that vmwgfx.ko is being loaded in initramfs
and not from my filesystem. I'm attempting to reproduce it now with a rebuilt
initramfs.



[Mesa-dev] [Bug 106479] NDEBUG not defined for libamdgpu_addrlib

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106479

Samuel Pitoiset changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Samuel Pitoiset  ---
This has been fixed by Bas.
https://cgit.freedesktop.org/mesa/mesa/commit/?id=62e0e089d710835d9f79138377bcc37147f75ebd



[Mesa-dev] [Bug 106696] repeatable drm:amdgpu_job_timedout with vulkan toy

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106696

Samuel Pitoiset changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |NOTOURBUG
             Status|REOPENED                    |RESOLVED

--- Comment #8 from Samuel Pitoiset  ---
As Nicolai said, this is a known issue. Definitely unrelated to RADV. Please
don't re-open, thanks!



Re: [Mesa-dev] [PATCH v3 3/3] egl/android: Add DRM node probing and filtering

2018-06-13 Thread Rob Herring
On Wed, Jun 13, 2018 at 12:19 PM, Amit Pundir  wrote:
> On 13 June 2018 at 20:45, Rob Herring  wrote:
>>
>> +Amit and John
>>
>> On Sat, Jun 9, 2018 at 11:27 AM, Robert Foss wrote:
>> > This patch both adds support for probing & filtering DRM nodes
>> > and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
>> > gralloc call.
>> >
>> > Currently the filtering is based just on the driver name,
>> > and the desired name is supplied using the "drm.gpu.vendor_name"
>> > Android property.
>>
>> There's a potential issue with this whole approach and that is
>> SELinux. With the way SELinux locks down accesses, getting probing
>> thru device files to work can be a pain. It may be better now than the
>> prior version because sysfs is not probed. I'll leave it to Amit or
>> John to comment.
>
> Right.. so ICYMI, this patch is already pulled into external/mesa3d
> project of AOSP and I stumbled upon one such /dev/dri/ access denial
> on db820c recently.

That's a prior version of the patch series, which accesses sysfs too (via libdrm).

>
> In AOSP, zygote spawned apps already have access to GPU device nodes
> in the form of /dev/gpu_device file, but the missing part is the

It's "gpu_device" in terms of a SELinux context, right? Not an actual /dev path?

> open-read access to "/dev/dri/" which need to be allowed explicitly.

Or we need a way to just open a specific device.

Rob


[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106756

Samuel Pitoiset changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO



[Mesa-dev] [PATCH 0/8] i965: Don't recycle BOs until they are idle

2018-06-13 Thread Jason Ekstrand
The current BO cache puts BOs back into the recycle bucket the moment the
refcount hits zero.  If the BO is busy, we just don't re-use it until it
isn't or we re-use it for a render target which we assume will be used
first for drawing.  This patch series reworks the way the BO cache works a
bit so that we don't ever recycle a busy BO.  On the down side, it means
that we don't get the "keep busy BOs busy" heuristic (which we have no
proof actually helps).  On the up side, we can now easily use a MRU
heuristic instead of round-robin for all buffers and not just the busy
ones.  Will this be an improvement, a regression or a wash?  I don't know
but I doubt it will have a major effect one way or another.
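
To make the round-robin vs. MRU difference concrete, here is a minimal
standalone sketch. It is not the i965 code: the bucket is a plain array of
integer ids and every name in it (struct bucket, bucket_take_head,
bucket_take_tail, distinct_touched) is hypothetical. It only assumes what
the paragraph above says, namely that every entry in the bucket is already
idle, so either end of the list is a valid choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache bucket: an ordered list of idle
 * buffer ids.  Released buffers are always appended at the tail.
 */
#define BUCKET_CAP 8

struct bucket {
   int ids[BUCKET_CAP];
   size_t count;
};

static void
bucket_release(struct bucket *b, int id)
{
   assert(b->count < BUCKET_CAP);
   b->ids[b->count++] = id;
}

/* Take the oldest entry (head of the list): round-robin behaviour. */
static int
bucket_take_head(struct bucket *b)
{
   assert(b->count > 0);
   int id = b->ids[0];
   for (size_t i = 1; i < b->count; i++)
      b->ids[i - 1] = b->ids[i];
   b->count--;
   return id;
}

/* Take the most recently released entry (tail of the list): MRU. */
static int
bucket_take_tail(struct bucket *b)
{
   assert(b->count > 0);
   return b->ids[--b->count];
}

/* Run `rounds` allocate/release cycles on a bucket holding four idle
 * buffers and count how many distinct buffers were handed out.
 */
static unsigned
distinct_touched(int use_tail, unsigned rounds)
{
   struct bucket b = { .count = 0 };
   for (int id = 0; id < 4; id++)
      bucket_release(&b, id);

   unsigned seen_mask = 0;
   for (unsigned r = 0; r < rounds; r++) {
      int id = use_tail ? bucket_take_tail(&b) : bucket_take_head(&b);
      seen_mask |= 1u << id;
      bucket_release(&b, id);
   }

   unsigned n = 0;
   for (; seen_mask; seen_mask &= seen_mask - 1)
      n++;
   return n;
}
```

With four idle buffers and a steady allocate/release pattern, taking from
the head cycles through all four, while taking from the tail keeps handing
back the same one. Whether that translates into a measurable win on real
workloads is exactly the open question above.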

Jason Ekstrand (8):
  i965/bufmgr: Bail early in bo_busy if the BO is flagged idle
  i965/miptree: Stop setting BO_ALLOC_BUSY
  i965/bufmgr: Drop the BO_ALLOC_BUSY flag
  i965/bufmgr: Add a garbage collection mechanism
  i965/batch: Use brw_bo_unreference_bos_when_idle
  i965: Call intel_finish before destroying the context
  i965/bufmgr: Don't allow busy BOs to be returned to the pool
  i965/bufmgr: Allocate from the tail of the bucket free list

 src/mesa/drivers/dri/i965/brw_bufmgr.c| 186 +-
 src/mesa/drivers/dri/i965/brw_bufmgr.h|  18 +-
 src/mesa/drivers/dri/i965/brw_context.c   |   8 +
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |   8 +-
 src/mesa/drivers/dri/i965/intel_fbo.c |   2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  29 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   9 -
 src/mesa/drivers/dri/i965/intel_screen.c  |   2 +-
 .../drivers/dri/i965/intel_tex_validate.c |   2 +-
 9 files changed, 182 insertions(+), 82 deletions(-)

-- 
2.17.1



[Mesa-dev] [PATCH 3/8] i965/bufmgr: Drop the BO_ALLOC_BUSY flag

2018-06-13 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 ++
 src/mesa/drivers/dri/i965/brw_bufmgr.h |  1 -
 2 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 58bb559fdee..e9d3daa5985 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,11 +448,6 @@ int
 brw_bo_busy(struct brw_bo *bo)
 {
struct brw_bufmgr *bufmgr = bo->bufmgr;
-
-   /* If we know it's idle, don't bother with the kernel round trip */
-   if (bo->idle && !bo->external)
-  return false;
-
struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
 
int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
@@ -506,20 +501,11 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
struct bo_cache_bucket *bucket;
bool alloc_from_cache;
uint64_t bo_size;
-   bool busy = false;
bool zeroed = false;
 
-   if (flags & BO_ALLOC_BUSY)
-  busy = true;
-
if (flags & BO_ALLOC_ZEROED)
   zeroed = true;
 
-   /* BUSY does doesn't really jive with ZEROED as we have to wait for it to
-* be idle before we can memset.  Just disallow that combination.
-*/
-   assert(!(busy && zeroed));
-
/* Round the allocated size up to a power of two number of pages. */
bucket = bucket_for_size(bufmgr, size);
 
@@ -539,29 +525,17 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
 retry:
alloc_from_cache = false;
if (bucket != NULL && !list_empty(&bucket->head)) {
-  if (busy && !zeroed) {
- /* Allocate new render-target BOs from the tail (MRU)
-  * of the list, as it will likely be hot in the GPU
-  * cache and in the aperture for us.  If the caller
-  * asked us to zero the buffer, we don't want this
-  * because we are going to mmap it.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
- list_del(&bo->head);
+  /* For non-render-target BOs (where we're probably
+   * going to map it first thing in order to fill it
+   * with data), check if the last BO in the cache is
+   * unbusy, and only reuse in that case. Otherwise,
+   * allocating a new buffer is probably faster than
+   * waiting for the GPU to finish.
+   */
+  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+  if (!brw_bo_busy(bo)) {
  alloc_from_cache = true;
-  } else {
- /* For non-render-target BOs (where we're probably
-  * going to map it first thing in order to fill it
-  * with data), check if the last BO in the cache is
-  * unbusy, and only reuse in that case. Otherwise,
-  * allocating a new buffer is probably faster than
-  * waiting for the GPU to finish.
-  */
- bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
- if (!brw_bo_busy(bo)) {
-alloc_from_cache = true;
-list_del(&bo->head);
- }
+ list_del(&bo->head);
   }
 
   if (alloc_from_cache) {
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 32fc7a553c9..d3b3aadc0db 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -195,7 +195,6 @@ struct brw_bo {
bool cache_coherent;
 };
 
-#define BO_ALLOC_BUSY   (1<<0)
 #define BO_ALLOC_ZEROED (1<<1)
 
 /**
-- 
2.17.1



[Mesa-dev] [PATCH 6/8] i965: Call intel_finish before destroying the context

2018-06-13 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 9ced230ec14..98ec54f2ae3 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1099,6 +1099,14 @@ intelDestroyContext(__DRIcontext * driContextPriv)
   (struct brw_context *) driContextPriv->driverPrivate;
struct gl_context *ctx = &brw->ctx;
 
+   /* Wait for any outstanding rendering to be completed before we start
+* freeing anything.  It's probably safe to destroy the context while stuff
+* is still in flight since the kernel will reference count our BOs.  This
+* just ensures that everything is safe before we start destroying things
+* in case doing so has any side-effects.
+*/
+   intel_finish(ctx);
+
_mesa_meta_free(&brw->ctx);
 
if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
-- 
2.17.1



[Mesa-dev] [PATCH 7/8] i965/bufmgr: Don't allow busy BOs to be returned to the pool

2018-06-13 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 51 --
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index cfa32ff3726..ef918315c65 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -524,6 +524,14 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
  bo_size = page_size;
} else {
   bo_size = bucket->size;
+
+  /* If there's nothing in the bucket, call bufmgr_collect in the hopes
+   * that maybe we can free and re-use an old BO.  It should be safe to
+   * call list_empty() without taking a lock since it's just a pointer
+   * comparison and nothing bad will happen if we get it wrong.
+   */
+  if (list_empty(&bucket->head))
+ brw_bufmgr_collect(bufmgr);
}
 
mtx_lock(&bufmgr->lock);
@@ -539,31 +547,29 @@ retry:
* waiting for the GPU to finish.
*/
   bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
-  if (!brw_bo_busy(bo)) {
- alloc_from_cache = true;
- list_del(&bo->head);
+  assert(!brw_bo_busy(bo));
+
+  alloc_from_cache = true;
+  list_del(&bo->head);
+
+  if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) {
+ bo_free(bo);
+ brw_bo_cache_purge_bucket(bufmgr, bucket);
+ goto retry;
   }
 
-  if (alloc_from_cache) {
- if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) {
-bo_free(bo);
-brw_bo_cache_purge_bucket(bufmgr, bucket);
-goto retry;
- }
+  if (bo_set_tiling_internal(bo, tiling_mode, stride)) {
+ bo_free(bo);
+ goto retry;
+  }
 
- if (bo_set_tiling_internal(bo, tiling_mode, stride)) {
+  if (zeroed) {
+ void *map = brw_bo_map(NULL, bo, MAP_WRITE | MAP_RAW);
+ if (!map) {
 bo_free(bo);
 goto retry;
  }
-
- if (zeroed) {
-void *map = brw_bo_map(NULL, bo, MAP_WRITE | MAP_RAW);
-if (!map) {
-   bo_free(bo);
-   goto retry;
-}
-memset(map, 0, bo_size);
- }
+ memset(map, 0, bo_size);
   }
}
 
@@ -871,6 +877,13 @@ bo_unreference_final(struct brw_bo *bo, time_t time)
 
DBG("bo_unreference final: %d (%s)\n", bo->gem_handle, bo->name);
 
+   /* The only way an internal BO can be busy is if it's in use by one of our
+* (this screen's) batch buffers.  Since we always wait for the batch to be
+* idle before we unref the BOs it references, we can never get here with a
+* busy internal BO.
+*/
+   assert(bo->external || !brw_bo_busy(bo));
+
bucket = bucket_for_size(bufmgr, bo->size);
/* Put the buffer into our internal cache for reuse if we can. */
if (bufmgr->bo_reuse && bo->reusable && bucket != NULL &&
-- 
2.17.1



[Mesa-dev] [PATCH 4/8] i965/bufmgr: Add a garbage collection mechanism

2018-06-13 Thread Jason Ekstrand
While we can always trust the kernel to reference count things and not
actually free any memory until the GPU is done with it, that may not
actually do what we want.  We have to be careful, for instance, with
recycling buffers that we might immediately map.  This commit provides a
tagging mechanism that we can use to avoid unreferencing a BO until some
other BO (presumably a batch) goes idle.  The next commit will actually
start using the new mechanism.
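
As a rough illustration of the tagging idea only (this is not the patch
below), here is a standalone sketch. Everything in it is hypothetical:
struct buf, its `busy` flag standing in for the kernel busy query,
defer_unref() and collect(). It uses a singly linked list and no locking,
so it shows just the control flow: hold the references until the wait
buffer goes idle, then drop them all at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer: a refcount plus a flag standing in for
 * "the GPU is still using this".
 */
struct buf {
   int refcount;
   bool busy;
};

static void
buf_unref(struct buf *b)
{
   assert(b->refcount > 0);
   b->refcount--;
}

/* One pending request: drop the references in bufs[] once wait_buf
 * is no longer busy.
 */
struct unref_request {
   struct unref_request *next;
   struct buf *wait_buf;
   unsigned num_bufs;
   struct buf *bufs[];
};

struct manager {
   struct unref_request *head;
};

/* Queue the unreferences; takes its own reference on wait_buf so it
 * stays alive until the request is processed.
 */
static bool
defer_unref(struct manager *m, struct buf *wait_buf,
            struct buf **bufs, unsigned n)
{
   struct unref_request *req =
      malloc(sizeof(*req) + n * sizeof(req->bufs[0]));
   if (req == NULL)
      return false;

   req->wait_buf = wait_buf;
   wait_buf->refcount++;
   req->num_bufs = n;
   memcpy(req->bufs, bufs, n * sizeof(bufs[0]));

   req->next = m->head;
   m->head = req;
   return true;
}

/* Process every request whose wait buffer is idle; returns how many
 * requests were retired.  Busy requests stay queued.
 */
static unsigned
collect(struct manager *m)
{
   unsigned retired = 0;
   struct unref_request **link = &m->head;

   while (*link != NULL) {
      struct unref_request *req = *link;
      if (req->wait_buf->busy) {
         link = &req->next;
         continue;
      }

      *link = req->next;
      buf_unref(req->wait_buf);
      for (unsigned i = 0; i < req->num_bufs; i++)
         buf_unref(req->bufs[i]);
      free(req);
      retired++;
   }
   return retired;
}
```

The point of the indirection is that nothing referenced by a request can
reach refcount zero, and so be recycled, while the buffer it is tagged
with is still busy.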
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 102 +
 src/mesa/drivers/dri/i965/brw_bufmgr.h |  17 +
 2 files changed, 119 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index e9d3daa5985..cfa32ff3726 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -158,6 +158,12 @@ struct brw_bufmgr {
bool bo_reuse:1;
 
uint64_t initial_kflags;
+
+   /* List of struct bo_idle_unref_request
+*
+* See also brw_bufmgr_collect()
+*/
+   struct list_head unref_requests;
 };
 
 static int bo_set_tiling_internal(struct brw_bo *bo, uint32_t tiling_mode,
@@ -904,6 +910,92 @@ brw_bo_unreference(struct brw_bo *bo)
}
 }
 
+struct bo_idle_unref_request {
+   struct brw_bo *wait_bo;
+
+   struct list_head link;
+
+   unsigned num_unref_bos;
+   struct brw_bo *unref_bos[0];
+};
+
+void
+brw_bo_unreference_bos_when_idle(struct brw_bo *wait_bo,
+ struct brw_bo **unref_bos,
+ unsigned num_unref_bos)
+{
+   struct brw_bufmgr *bufmgr = wait_bo->bufmgr;
+
+   struct bo_idle_unref_request *req =
+  malloc(sizeof(*req) + num_unref_bos * sizeof(req->unref_bos[0]));
+
+   if (req == NULL) {
+  /* This should never happen.  If it does, we can always just stall and
+   * then unreference everything.
+   */
+  brw_bo_wait_rendering(wait_bo);
+  for (unsigned i = 0; i < num_unref_bos; i++)
+ brw_bo_unreference(unref_bos[i]);
+  return;
+   }
+
+   req->wait_bo = wait_bo;
+   brw_bo_reference(wait_bo);
+
+   req->num_unref_bos = num_unref_bos;
+   memcpy(req->unref_bos, unref_bos, num_unref_bos * sizeof(*unref_bos));
+
+   mtx_lock(&bufmgr->lock);
+   list_addtail(&req->link, &bufmgr->unref_requests);
+   mtx_unlock(&bufmgr->lock);
+}
+
+static void
+bufmgr_collect(struct brw_bufmgr *bufmgr, bool wait)
+{
+   mtx_lock(&bufmgr->lock);
+
+   struct list_head idle_list;
+   list_inithead(&idle_list);
+
+   /* Move all entries with idle BOs into the idle list */
+   list_for_each_entry_safe(struct bo_idle_unref_request, req,
+&bufmgr->unref_requests, link) {
+  if (wait) {
+ /* This case is only for when we're destroying the bufmgr so nothing
+  * should ever be busy.  We'll wait on it in release builds just to
+  * make sure.
+  */
+ assert(!brw_bo_busy(req->wait_bo));
+ brw_bo_wait(req->wait_bo, -1);
+  } else if (brw_bo_busy(req->wait_bo)) {
+ continue;
+  }
+
+  list_del(&req->link);
+  list_addtail(&req->link, &idle_list);
+   }
+
+   /* Drop the lock before we start unreferencing things */
+   mtx_unlock(&bufmgr->lock);
+
+   list_for_each_entry_safe(struct bo_idle_unref_request, req,
+&idle_list, link) {
+  brw_bo_unreference(req->wait_bo);
+  for (unsigned i = 0; i < req->num_unref_bos; i++)
+ brw_bo_unreference(req->unref_bos[i]);
+  list_del(&req->link);
+  free(req);
+   }
+   assert(list_empty(&idle_list));
+}
+
+void
+brw_bufmgr_collect(struct brw_bufmgr *bufmgr)
+{
+   bufmgr_collect(bufmgr, false);
+}
+
 static void
 bo_wait_with_stall_warning(struct brw_context *brw,
struct brw_bo *bo,
@@ -1270,12 +1362,20 @@ brw_bo_wait(struct brw_bo *bo, int64_t timeout_ns)
 
bo->idle = true;
 
+   /* We just had to call into the kernel to wait on a BO, something is now
+* idle so we may as well garbage collect.
+*/
+   brw_bufmgr_collect(bufmgr);
+
return ret;
 }
 
 void
 brw_bufmgr_destroy(struct brw_bufmgr *bufmgr)
 {
+   bufmgr_collect(bufmgr, true);
+   assert(list_empty(&bufmgr->unref_requests));
+
mtx_destroy(&bufmgr->lock);
 
/* Free any cached buffer objects we were going to reuse */
@@ -1731,5 +1831,7 @@ brw_bufmgr_init(struct gen_device_info *devinfo, int fd)
bufmgr->handle_table =
   _mesa_hash_table_create(NULL, key_hash_uint, key_uint_equal);
 
+   list_inithead(&bufmgr->unref_requests);
+
return bufmgr;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index d3b3aadc0db..644ba3a47aa 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -262,6 +262,23 @@ brw_bo_reference(struct brw_bo *bo)
  */
 void brw_bo_unreference(struct brw_bo *bo);
 
+/**
+ * Release references on a list of BOs when the given BO becomes idle.
+ *
+ * 

[Mesa-dev] [PATCH 1/8] i965/bufmgr: Bail early in bo_busy if the BO is flagged idle

2018-06-13 Thread Jason Ekstrand
This has the potential to make brw_bo_busy a bit cheaper for internal
BOs if someone has checked it for busy or waited on it before.  We
already do the same thing in brw_bo_wait.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 7ac3bcad3da..58bb559fdee 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,6 +448,11 @@ int
 brw_bo_busy(struct brw_bo *bo)
 {
struct brw_bufmgr *bufmgr = bo->bufmgr;
+
+   /* If we know it's idle, don't bother with the kernel round trip */
+   if (bo->idle && !bo->external)
+  return false;
+
struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
 
int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
-- 
2.17.1



[Mesa-dev] [PATCH 8/8] i965/bufmgr: Allocate from the tail of the bucket free list

2018-06-13 Thread Jason Ekstrand
The previous approach gave a sort of round-robin behavior which made
sense because we didn't want to walk the entire list looking for the
first idle BO.  Now that everything is idle, we can pick any BO in the
list and it should be fine.  Using the most recently used BO should give
us less overall thrash than the round-robin because we will be trying
to re-use BOs as much as possible.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index ef918315c65..02aea435e84 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -539,14 +539,10 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
 retry:
alloc_from_cache = false;
if (bucket != NULL && !list_empty(&bucket->head)) {
-  /* For non-render-target BOs (where we're probably
-   * going to map it first thing in order to fill it
-   * with data), check if the last BO in the cache is
-   * unbusy, and only reuse in that case. Otherwise,
-   * allocating a new buffer is probably faster than
-   * waiting for the GPU to finish.
+  /* Allocate BOs from the tail (MRU) of the list as it will likely be
+   * hotter in the GPU cache and in the aperture for us.
*/
-  bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+  bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
   assert(!brw_bo_busy(bo));
 
   alloc_from_cache = true;
-- 
2.17.1



[Mesa-dev] [PATCH 5/8] i965/batch: Use brw_bo_unreference_bos_when_idle

2018-06-13 Thread Jason Ekstrand
Instead of unreferencing all the BOs used by the freshly submitted batch
directly, ask the bufmgr to unref them for us once the batch goes idle.
This should more-or-less have the same effect except that we now wait to
unref the BOs until the batch is idle.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index df999ffeb1d..127d0c34bea 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -535,10 +535,12 @@ static void
 brw_new_batch(struct brw_context *brw)
 {
/* Unreference any BOs held by the previous batch, and reset counts. */
-   for (int i = 0; i < brw->batch.exec_count; i++) {
-  brw_bo_unreference(brw->batch.exec_bos[i]);
+   brw_bo_unreference_bos_when_idle(brw->batch.batch.bo,
+brw->batch.exec_bos,
+brw->batch.exec_count);
+
+   for (int i = 0; i < brw->batch.exec_count; i++)
   brw->batch.exec_bos[i] = NULL;
-   }
brw->batch.batch_relocs.reloc_count = 0;
brw->batch.state_relocs.reloc_count = 0;
brw->batch.exec_count = 0;
-- 
2.17.1



[Mesa-dev] [PATCH 2/8] i965/miptree: Stop setting BO_ALLOC_BUSY

2018-06-13 Thread Jason Ekstrand
It was never all that useful and no one had really demonstrated the
value of it in any concrete way.  It is, however, a very easy way to run
into trouble if you're not careful.  Let's just drop it and hope to
solve whatever problems it was solving in some other way.
---
 src/mesa/drivers/dri/i965/intel_fbo.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 29 +++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  9 --
 src/mesa/drivers/dri/i965/intel_screen.c  |  2 +-
 .../drivers/dri/i965/intel_tex_validate.c |  2 +-
 5 files changed, 14 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c
index fb84b738c08..5d446023d12 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -948,7 +948,7 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw,
  0, 0,
  width, height, 1,
  irb->mt->surf.samples,
- MIPTREE_CREATE_BUSY);
+ MIPTREE_CREATE_DEFAULT);
 
if (!invalidate)
   intel_miptree_copy_slice(brw, intel_image->mt,
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 6b89bf6848a..6a1d4fc670c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -554,7 +554,7 @@ make_surface(struct brw_context *brw, GLenum target, mesa_format format,
  unsigned first_level, unsigned last_level,
  unsigned width0, unsigned height0, unsigned depth0,
  unsigned num_samples, isl_tiling_flags_t tiling_flags,
- isl_surf_usage_flags_t isl_usage_flags, uint32_t alloc_flags,
+ isl_surf_usage_flags_t isl_usage_flags,
  unsigned row_pitch, struct brw_bo *bo)
 {
struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
@@ -630,7 +630,7 @@ make_surface(struct brw_context *brw, GLenum target, mesa_format format,
   BRW_MEMZONE_OTHER,
   isl_tiling_to_i915_tiling(
  mt->surf.tiling),
-  mt->surf.row_pitch, alloc_flags);
+  mt->surf.row_pitch, 0);
   if (!mt->bo)
  goto fail;
} else {
@@ -667,7 +667,7 @@ make_separate_stencil_surface(struct brw_context *brw,
  mt->surf.samples, ISL_TILING_W_BIT,
  ISL_SURF_USAGE_STENCIL_BIT |
  ISL_SURF_USAGE_TEXTURE_BIT,
- BO_ALLOC_BUSY, 0, NULL);
+ 0, NULL);
 
if (!mt->stencil_mt)
   return false;
@@ -697,7 +697,6 @@ miptree_create(struct brw_context *brw,
   ISL_TILING_W_BIT,
   ISL_SURF_USAGE_STENCIL_BIT |
   ISL_SURF_USAGE_TEXTURE_BIT,
-  BO_ALLOC_BUSY,
   0,
   NULL);
 
@@ -715,7 +714,7 @@ miptree_create(struct brw_context *brw,
  first_level, last_level,
  width0, height0, depth0, num_samples, ISL_TILING_Y0_BIT,
  ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
- BO_ALLOC_BUSY, 0, NULL);
+ 0, NULL);
 
   if (needs_separate_stencil(brw, mt, format) &&
   !make_separate_stencil_surface(brw, mt)) {
@@ -731,15 +730,11 @@ miptree_create(struct brw_context *brw,
 
mesa_format tex_format = format;
mesa_format etc_format = MESA_FORMAT_NONE;
-   uint32_t alloc_flags = 0;
 
format = intel_lower_compressed_format(brw, format);
 
etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE;
 
-   if (flags & MIPTREE_CREATE_BUSY)
-  alloc_flags |= BO_ALLOC_BUSY;
-
isl_tiling_flags_t tiling_flags = (flags & MIPTREE_CREATE_LINEAR) ?
   ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK;
 
@@ -754,7 +749,7 @@ miptree_create(struct brw_context *brw,
  num_samples, tiling_flags,
  ISL_SURF_USAGE_RENDER_TARGET_BIT |
  ISL_SURF_USAGE_TEXTURE_BIT,
- alloc_flags, 0, NULL);
+ 0, NULL);
if (!mt)
   return NULL;
 
@@ -828,7 +823,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
 devinfo->gen >= 6 ? depth_only_format : format,
 0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT,
-0, pitch, bo);
+pitch, bo);
   if (!mt)
  return NULL;
 
@@ -844,7 +839,7 @@ intel_miptree_

Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function

2018-06-13 Thread Jason Ekstrand
On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> This new function takes care of shuffle/unshuffle components of a
> particular bit-size in components with a different bit-size.
>
> If source type size is smaller than destination type size the operation
> needed is a component shuffle. The opposite case would be an unshuffle.
>
> The operation allows to skip first_component number of components from
> the source.
>
> Shuffle MOVs are retyped using integer types avoiding problems with denorms
> and float types. This allows to simplify uses of shuffle functions that are
> dealing with these retypes individually.
>
> Now there is a new restriction so source and destination can not overlap
> anymore when calling this shuffle function. Following patches that migrate
> to use this new function will take care individually of avoiding source
> and destination overlaps.
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 92 +++
>  1 file changed, 92 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 166da0aa6d7..1a9d3c41d1d 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const
> fs_builder &bld,
> }
>  }
>
> +/*
> + * This helper takes a source register and un/shuffles it into the
> destination
> + * register.
> + *
> + * If source type size is smaller than destination type size the operation
> + * needed is a component shuffle. The opposite case would be an
> unshuffle. If
> + * source/destination type size is equal a shuffle is done that would be
> + * equivalent to a simple MOV.
>

There's a sticky bit here if we want this to work with 64-bit types on gen7
and earlier because we only have DF there and not Q so the
brw_reg_type_from_bit_size below doesn't work.  If we care about that case
(and I'm not convinced we do), it should be easy enough to add a
type_sz(src.type) == type_sz(dst.type) case which just does MOVs from
source to dest.
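
For illustration, here is a plain-array model of the shuffle direction. It is
only a sketch, not Mesa code: shuffle_model and kLanes are made-up names, the
registers are modelled as std::vector, and the low sub-word is assumed to hold
the first source component. With equal source and destination sizes the ratio
is 1 and the loop degenerates to a straight copy, which is the "just does
MOVs" case mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Number of lanes per component in this model (SIMD8). Assumption.
constexpr unsigned kLanes = 8;

// Packs `components` source components, starting at `first_component`
// (counted in source components), into wider destination components.
// The caller must provide at least
// (first_component + components) * kLanes source elements.
template <typename Src, typename Dst>
std::vector<Dst>
shuffle_model(const std::vector<Src> &src,
              unsigned first_component, unsigned components)
{
   static_assert(std::is_unsigned<Src>::value &&
                 std::is_unsigned<Dst>::value,
                 "model uses unsigned integer lanes");
   static_assert(sizeof(Src) <= sizeof(Dst), "shuffle direction only");

   const unsigned size_ratio = sizeof(Dst) / sizeof(Src);
   const unsigned dst_components =
      (components + size_ratio - 1) / size_ratio;
   std::vector<Dst> dst(dst_components * kLanes, 0);

   for (unsigned i = 0; i < components; i++) {
      // Source component i lands in sub-word (i % size_ratio) of
      // destination component (i / size_ratio).
      const unsigned shift = 8 * sizeof(Src) * (i % size_ratio);
      for (unsigned lane = 0; lane < kLanes; lane++) {
         const uint64_t value = src[(i + first_component) * kLanes + lane];
         dst[(i / size_ratio) * kLanes + lane] |=
            static_cast<Dst>(value << shift);
      }
   }
   return dst;
}
```

Calling shuffle_model<uint16_t, uint32_t>(src, 1, 2) skips the first 16-bit
component and packs the next two into one 32-bit component per lane.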


> + *
> + * For example, if source is a 16-bit type and destination is 32-bit. A 3
> + * components .xyz 16-bit vector on SIMD8 would be.
> + *
> + *|x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8|
> + *|z1|z2|z3|z4|z5|z6|z7|z8|  |  |  |  |  |  |  |  |
> + *
> + * This helper will return the following 2 32-bit components with the
> 16-bit
> + * values shuffled:
> + *
> + *|x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8|
> + *|z1   |z2   |z3   |z4   |z5   |z6   |z7   |z8   |
> + *
> + * For unshuffle, the example would be the opposite, a 64-bit type source
> + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8
> + * would be:
> + *
> + *| x1l   x1h | x2l   x2h | x3l   x3h | x4l   x4h |
> + *| x5l   x5h | x6l   x6h | x7l   x7h | x8l   x8h |
> + *| y1l   y1h | y2l   y2h | y3l   y3h | y4l   y4h |
> + *| y5l   y5h | y6l   y6h | y7l   y7h | y8l   y8h |
> + *
> + * The returned result would be the following 4 32-bit components
> unshuffled:
> + *
> + *| x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l |
> + *| x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h |
> + *| y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l |
> + *| y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h |
> + *
> + * - Source and destination register must not be overlapped.
> + * - first_component parameter allows skipping source components.
> + */
> +void
> +shuffle_src_to_dst(const fs_builder &bld,
> +   const fs_reg &dst,
> +   const fs_reg &src,
> +   uint32_t first_component,
> +   uint32_t components)
> +{
> +   if (type_sz(src.type) <= type_sz(dst.type)) {
> +  /* Source is shuffled into destination */
> +  unsigned size_ratio = type_sz(dst.type) / type_sz(src.type);
> +#ifndef NDEBUG
> +  boolean src_dst_overlap = regions_overlap(dst,
> + type_sz(dst.type) * bld.dispatch_width() * components,
> + offset(src, bld, first_component * size_ratio),
>

Why do you need to multiply first_component by size_ratio?  It's already in
units of source components.


> + type_sz(src.type) * bld.dispatch_width() * components *
> size_ratio);
> +#endif
> +  assert(!src_dst_overlap);
>

If the only thing you're doing with src_dst_overlap is to assert on it, you
may as well put the regions_overlap call inside the assert and drop the
#ifndef.
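
A minimal sketch of that shape, using a hypothetical byte-range helper
(ranges_overlap) in place of the real regions_overlap so it stands alone:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for regions_overlap(): true when the byte ranges
// [a, a + a_size) and [b, b + b_size) intersect.
static bool
ranges_overlap(size_t a, size_t a_size, size_t b, size_t b_size)
{
   return a < b + b_size && b < a + a_size;
}

// Copies `size` bytes inside `buf` from offset `src` to offset `dst`.
static void
copy_checked(unsigned char *buf, size_t dst, size_t src, size_t size)
{
   // The whole check sits inside assert(), so it compiles away under
   // NDEBUG without an #ifndef block or a temporary variable.
   assert(!ranges_overlap(dst, size, src, size));

   for (size_t i = 0; i < size; i++)
      buf[dst + i] = buf[src + i];
}
```

This also avoids the unused-variable problem in release builds, since nothing
is declared outside the assert.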


> +
> +  brw_reg_type shuffle_type =
> + brw_reg_type_from_bit_size(8 * type_sz(src.type),
> +BRW_REGISTER_TYPE_D);
> +  for (unsigned i = 0; i < components; i++) {
> + fs_reg shuffle_component_i =
> +subscript(offset(dst, bld, i / size_ratio),
> +  shuffle_type, i % size_ratio);
> + bld.MOV(shuffle_component_i,
> + retype(offset(src, bld, i + first_component),
> shuffle_type));
> +  }
> 

[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106907

--- Comment #1 from Jordan Justen  ---
Any chance you might be able to write a small piglit test
that shows the bug? For example:

https://cgit.freedesktop.org/piglit/commit/?id=f1dc46ddf8c1



[Mesa-dev] [PATCH] radv: Fix output for sparse MRTs.

2018-06-13 Thread Bas Nieuwenhuizen
We need to init the cb_shader_format correctly with the changed
col_format, so this moves the col_format adjustment to before the
cb_shader_mask gets generated.

Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1 
---
 src/amd/vulkan/radv_pipeline.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index b8b425aca9f..6eeedc65a39 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -524,20 +524,21 @@ radv_pipeline_compute_spi_color_formats(struct 
radv_pipeline *pipeline,
col_format |= cf << (4 * i);
}
 
-   blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
-
-   if (blend->mrt0_is_dual_src)
-   col_format |= (col_format & 0xf) << 4;
-   blend->spi_shader_col_format = col_format;
-
/* If the i-th target format is set, all previous target formats must
 * be non-zero to avoid hangs.
 */
-   num_targets = (util_last_bit(blend->spi_shader_col_format) + 3) / 4;
+   num_targets = (util_last_bit(col_format) + 3) / 4;
for (unsigned i = 0; i < num_targets; i++) {
-   if (!(blend->spi_shader_col_format & (0xf << (i * 4))))
-   blend->spi_shader_col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+   if (!(col_format & (0xf << (i * 4)))) {
+   col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+   }
}
+
+   blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
+
+   if (blend->mrt0_is_dual_src)
+   col_format |= (col_format & 0xf) << 4;
+   blend->spi_shader_col_format = col_format;
 }
 
 static bool
-- 
2.17.0

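The hole-filling step can be modelled on a bare 32-bit value. This is a sketch
under assumptions, not the driver code: fill_sparse_targets and
num_used_targets are made-up names, and kShader32R stands in for
V_028714_SPI_SHADER_32_R with an assumed value of 1.

```cpp
#include <cstdint>

// Assumed stand-in for V_028714_SPI_SHADER_32_R; the real value comes
// from the register headers.
constexpr uint32_t kShader32R = 0x1;

// Number of 4-bit target slots up to and including the highest one set.
static unsigned
num_used_targets(uint32_t col_format)
{
   unsigned n = 0;
   for (unsigned i = 0; i < 8; i++) {
      if (col_format & (0xfu << (i * 4)))
         n = i + 1;
   }
   return n;
}

// Gives every empty slot below the highest used one a non-zero format.
static uint32_t
fill_sparse_targets(uint32_t col_format)
{
   const unsigned num_targets = num_used_targets(col_format);
   for (unsigned i = 0; i < num_targets; i++) {
      if (!(col_format & (0xfu << (i * 4))))
         col_format |= kShader32R << (i * 4);
   }
   return col_format;
}
```

Anything derived from col_format, as cb_shader_mask is in the patch, would
then be computed from the returned value rather than from the sparse input.
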


Re: [Mesa-dev] [PATCH mesa 1/9] vulkan: Add KHR_display extension using DRM [v8]

2018-06-13 Thread Jason Ekstrand
I'm trusting that not much changed other than what was explicitly called
out.  I didn't want to re-read in *that* much detail again. :-)

Reviewed-by: Jason Ekstrand 



On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> This adds support for the KHR_display extension to the vulkan
> WSI layer. Driver support will be added separately.
>
> v2:
> * fix double ;; in wsi_common_display.c
>
> * Move mode list from wsi_display to wsi_display_connector
>
> * Fix scope for wsi_display_mode and wsi_display_connector
>   allocs
>
> * Switch all allocations to vk_zalloc instead of vk_alloc.
>
> * Fix DRM failure in
>   wsi_display_get_physical_device_display_properties
>
>   When DRM fails, or when we don't have a master fd
>   (presumably due to application errors), just return 0
>   properties from this function, which is at least a valid
>   response.
>
> * Use vk_outarray for all property queries
>
>   This is a bit less error-prone than open-coding the same
>   stuff.
>
> * Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps
>
>   Until we have multi-plane support, we shouldn't pretend to
>   have any multi-plane semantics, even if undefined.
>
> Suggested-by: Jason Ekstrand 
>
> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to
>   vulkan_wsi_args
>
> Suggested-by: Eric Engestrom 
>
> v3:
> Add separate 'display_fd' and 'render_fd' arguments to
> wsi_device_init API. This allows drivers to use different FDs
> for the different aspects of the device.
>
> Use largest mode as display size when no preferred mode.
>
> If the display doesn't provide a preferred mode, we'll assume
> that the largest supported mode is the "physical size" of the
> device and report that.
>
> v4:
> Make wsi_image_state enumeration values uppercase.
> Follow more common mesa conventions.
>
> Remove 'render_fd' from wsi_device_init API.  The
> wsi_common_display code doesn't use this fd at all, so stop
> passing it in. This avoids any potential confusion over which
> fd to use when creating display-relative object handles.
>
> Remove call to wsi_create_prime_image which would never have
> been reached as the necessary condition (use_prime_blit) is
> never set.
>
> whitespace cleanups in wsi_common_display.c
>
> Suggested-by: Jason Ekstrand 
>
> Add depth/bpp info to available surface formats.  Instead of
> hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the
> requested format to find suitable values.
>
> Destroy kernel buffers and FBs when swapchain is destroyed. We
> were leaking both of these kernel objects across swapchain
> destruction.
>
> Note that wsi_display_wait_for_event waits for anything to
> happen.  wsi_display_wait_for_event is simply a yield so that
> the caller can then check to see if the desired state change
> has occurred.
>
> Record swapchain failures in chain for later return. If some
> asynchronous swapchain activity fails, we need to tell the
> application eventually. Record the failure in the swapchain
> and report it at the next acquire_next_image or queue_present
> call.
>
> Fix error returns from wsi_display_setup_connector.  If a
> malloc failed, then the result should be
> VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl
> failed and we're either VT switched away, or our lease has
> been revoked, in which case we should return
> VK_ERROR_OUT_OF_DATE_KHR.
>
> Make sure both sides of if/else brace use matches
>
> Note that we assume drmModeSetCrtc is synchronous. Add a
> comment explaining why we can idle any previous displayed
> image as soon as the mode set returns.
>
> Note that EACCES from drmModePageFlip means VT inactive.  When
> vt switched away drmModePageFlip returns EACCES. Poll once a
> second waiting until we get some other return value back.
>
> Clean up after alloc failure in
> wsi_display_surface_create_swapchain. Destroy any created
> images, free the swapchain.
>
> Remove physical_device from wsi_display_init_wsi. We never
> need this value, so remove it from the API and from the
> internal wsi_display structure.
>
> Use drmModeAddFB2 in wsi_display_image_init.  This takes a drm
> format instead of depth/bpp, which provides more control over
> the format of the data.
>
> v5:
> Set the 'currentStackIndex' member of the
> VkDisplayPlanePropertiesKHR record to zero, instead of
> indexing across all displays. This value is the stac

Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]

2018-06-13 Thread Jason Ekstrand
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> This adds support for the KHR_display extension to the anv Vulkan
> driver. The driver now attempts to open the master DRM node when the
> KHR_display extension is requested so that the common winsys code can
> perform the necessary operations.
>
> v2: Make sure primary fd is usable
>
> When KHR_display is selected, we try to open the primary node
> instead of the render node in case the user wants to use
> KHR_display for presentation. However, if we're actually going
> to end up using RandR leases, then we don't care if the
> resulting fd can't be used for display, but the kernel also
> prevents us from using it for drawing when someone else has
> master.
>
> v3:
> Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args
>
> Suggested-by: Eric Engestrom 
>
> v4:
> Adapt primary node usage to new wsi_device_init API
>
> v5:
> Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace between
> types and names. Wrap lines to 80 columns.
>
> Remove spurious MM_PER_PIXEL define
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
>
> fixup
> ---
>  src/intel/Makefile.sources |   3 +
>  src/intel/Makefile.vulkan.am   |   7 ++
>  src/intel/vulkan/anv_device.c  |  21 
>  src/intel/vulkan/anv_extensions.py |   1 +
>  src/intel/vulkan/anv_extensions_gen.py |   5 +-
>  src/intel/vulkan/anv_wsi_display.c | 129 +
>  src/intel/vulkan/meson.build   |   5 +
>  7 files changed, 169 insertions(+), 2 deletions(-)
>  create mode 100644 src/intel/vulkan/anv_wsi_display.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index f22e727553f..5f6cd96825b 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -254,6 +254,9 @@ VULKAN_WSI_WAYLAND_FILES := \
>  VULKAN_WSI_X11_FILES := \
> vulkan/anv_wsi_x11.c
>
> +VULKAN_WSI_DISPLAY_FILES := \
> +   vulkan/anv_wsi_display.c
> +
>  VULKAN_GEM_FILES := \
> vulkan/anv_gem.c
>
> diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
> index 4125cb205ad..9b7fbb74007 100644
> --- a/src/intel/Makefile.vulkan.am
> +++ b/src/intel/Makefile.vulkan.am
> @@ -192,6 +192,13 @@ VULKAN_SOURCES += $(VULKAN_WSI_WAYLAND_FILES)
>  VULKAN_LIB_DEPS += $(WAYLAND_CLIENT_LIBS)
>  endif
>
> +if HAVE_PLATFORM_DRM
> +VULKAN_CPPFLAGS += \
> +   -DVK_USE_PLATFORM_DISPLAY_KHR
> +
> +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES)
> +endif
> +
>  noinst_LTLIBRARIES += vulkan/libvulkan_common.la
>  vulkan_libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
>  vulkan_libvulkan_common_la_CFLAGS = $(VULKAN_CFLAGS)
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 56e91fe5de1..b3c6d1a8722 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -274,6 +274,7 @@ anv_physical_device_init_uuids(struct
> anv_physical_device *device)
>  static VkResult
>  anv_physical_device_init(struct anv_physical_device *device,
>   struct anv_instance *instance,
> + const char *primary_path,
>   const char *path)
>  {
> VkResult result;
> @@ -445,6 +446,25 @@ anv_physical_device_init(struct anv_physical_device
> *device,
> anv_physical_device_get_supported_extensions(device,
>
>  &device->supported_extensions);
>
> +   if (instance->enabled_extensions.KHR_display) {
> +  master_fd = open(path, O_RDWR | O_CLOEXEC);
>

Is this supposed to be opening primary_path instead?


> +  if (master_fd >= 0) {
> + /* prod the device with a GETPARAM call which will fail if
> +  * we don't have permission to even render on this device
> +  */
> + drm_i915_getparam_t gp;
> + memset(&gp, '\0', sizeof(gp));
> + int devid = 0;
> + gp.param = I915_PARAM_CHIPSET_ID;
> + gp.value = &devid;
> + int ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
> + if (ret < 0) {
> +close(master_fd);
> +master_fd = -1;
> + }
>

This could just be

if (anv_gem_get_param(master_fd, I915_PARAM_CHIPSET_ID) == 0) {
   close(master_fd);
   master_fd = -1;
}

No need to type out all that IOCTL stuff.
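
The overall pattern (open the node, probe it, drop the fd if the probe fails)
can be sketched on its own. open_if_usable and probe_fn are made-up names for
illustration; in the driver the probe would be the GETPARAM query above.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical probe callback: returns true when the fd is usable.
using probe_fn = bool (*)(int fd);

// Opens `path` and keeps the descriptor only if `probe` accepts it.
// Returns the fd on success, or -1 if the open or the probe fails.
static int
open_if_usable(const char *path, probe_fn probe)
{
   int fd = open(path, O_RDWR | O_CLOEXEC);
   if (fd < 0)
      return -1;

   if (!probe(fd)) {
      // Close on failure so the descriptor does not leak.
      close(fd);
      return -1;
   }
   return fd;
}
```

Keeping the probe separate from the open makes it obvious which fd is being
tested, which is the point of the question about path versus primary_path.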


> +  }
> +   }
> +
> device->local_fd = fd;
> device->master_fd = master_fd;
> return VK_SUCCESS;
> @@ -635,6 +655,7 @@ anv_enumerate_devices(struct anv_instance *instance)
>
>   result = anv_physical_device_init(&instance->physicalDevice,
>  instance,
> +devices[i]->nodes[DRM_NODE_PRIMARY],
>  devices[i]->nodes[DRM_NODE_RENDER]);
>   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
>  break;
> diff --git a/

Re: [Mesa-dev] [PATCH mesa 04/21] vulkan: Add EXT_direct_mode_display

2018-06-13 Thread Jason Ekstrand
On Mon, Jun 11, 2018 at 9:32 PM, Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> > This seems a bit odd.  Why is the FD not stored in the display?  What if
> > you acquire multiple displays for two-player VR?  If the master FD passed
> > in is not -1, we could just create a VkDisplayKHR object containing
> > it.
>
> You want to share the master_fd passed in at init_wsi time among all
> VkDisplayKHR objects, so you need to leave that FD in the global
> structure. However, you're right that when you use
> EXT_acquire_xlib_display, then you get a separate master_fd for each DRM
> output and need to have one per display.
>
> However, extending this code to support multiple master FDs looks tricky
> -- in the case where you have a single master_fd, then enumerating the
> DRM resources for that gives you all of the available
> connectors. However, if you have one DRM master per connector, then you
> need to enumerate each independently to get the complete set of
> available resources. For APIs which don't explicitly include a
> connector, I would have to go find a suitable master FD for each
> resource.
>
> How about I just disallow multiple leases for now? If you want multiple
> outputs, I think you'd want them on the same DRM master anyways, and we
> could get that by creating a new extension which had the application
> pass in a DRM master that had all of the resources you want to access.
>
>/* XXX no support for multiple leases yet */
>if (wsi->fd >= 0)
>   return VK_ERROR_OUT_OF_DATE_KHR;
>

That's fine with me.  As long as we do something sensible such as
disallowing it instead of just falling over.

--Jason


Re: [Mesa-dev] [PATCH mesa 4/9] vulkan: Add EXT_direct_mode_display [v2]

2018-06-13 Thread Jason Ekstrand
patches 4-6 are

Reviewed-by: Jason Ekstrand 

On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> Add support for the EXT_direct_mode_display extension. This just
> provides the vkReleaseDisplayEXT function.
>
> v2:
> Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace
> between types and names. Wrap lines to 80 columns.
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
> ---
>  src/vulkan/wsi/wsi_common_display.c | 18 ++
>  src/vulkan/wsi/wsi_common_display.h |  5 +
>  2 files changed, 23 insertions(+)
>
> diff --git a/src/vulkan/wsi/wsi_common_display.c
> b/src/vulkan/wsi/wsi_common_display.c
> index e529d2fc580..7a484c0df95 100644
> --- a/src/vulkan/wsi/wsi_common_display.c
> +++ b/src/vulkan/wsi/wsi_common_display.c
> @@ -1430,3 +1430,21 @@ wsi_display_finish_wsi(struct wsi_device
> *wsi_device,
>vk_free(alloc, wsi);
> }
>  }
> +
> +/*
> + * Implement vkReleaseDisplay
> + */
> +VkResult
> +wsi_release_display(VkPhysicalDevice physical_device,
> +struct wsi_device   *wsi_device,
> +VkDisplayKHR display)
> +{
> +   struct wsi_display *wsi =
> +  (struct wsi_display *) wsi_device->wsi[VK_ICD_WSI_PLATFORM_DISPLAY];
> +
> +   if (wsi->fd >= 0) {
> +  close(wsi->fd);
> +  wsi->fd = -1;
> +   }
> +   return VK_SUCCESS;
> +}
> diff --git a/src/vulkan/wsi/wsi_common_display.h
> b/src/vulkan/wsi/wsi_common_display.h
> index 4bb86cf2102..dd3a098f80a 100644
> --- a/src/vulkan/wsi/wsi_common_display.h
> +++ b/src/vulkan/wsi/wsi_common_display.h
> @@ -74,4 +74,9 @@ wsi_create_display_surface(VkInstance instance,
> const VkDisplaySurfaceCreateInfoKHR
> *pCreateInfo,
> VkSurfaceKHR *pSurface);
>
> +VkResult
> +wsi_release_display(VkPhysicalDevice physical_device,
> +struct wsi_device   *wsi_device,
> +VkDisplayKHR display);
> +
>  #endif
> --
> 2.17.1


Re: [Mesa-dev] [PATCH mesa 3/9] radv: Add KHR_display extension to radv [v4]

2018-06-13 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> This adds support for the KHR_display extension to the radv Vulkan
> driver. The driver now attempts to open the master DRM node when the
> KHR_display extension is requested so that the common winsys code can
> perform the necessary operations.
>
> v2:
> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to
>   vulkan_wsi_args
>
> Suggested-by: Eric Engestrom 
>
> v3:
> Adapt to new wsi_device_init API (added display_fd)
>
> v4:
> Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace
> between types and names. Wrap lines to 80 columns.
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
> ---
>  src/amd/vulkan/Makefile.am|   8 ++
>  src/amd/vulkan/Makefile.sources   |   3 +
>  src/amd/vulkan/meson.build|   5 +
>  src/amd/vulkan/radv_device.c  |  17 
>  src/amd/vulkan/radv_extensions.py |   7 +-
>  src/amd/vulkan/radv_private.h |   1 +
>  src/amd/vulkan/radv_wsi_display.c | 149 ++
>  7 files changed, 188 insertions(+), 2 deletions(-)
>  create mode 100644 src/amd/vulkan/radv_wsi_display.c
>
> diff --git a/src/amd/vulkan/Makefile.am b/src/amd/vulkan/Makefile.am
> index 18f263ab447..f4f99400275 100644
> --- a/src/amd/vulkan/Makefile.am
> +++ b/src/amd/vulkan/Makefile.am
> @@ -80,6 +80,14 @@ VULKAN_LIB_DEPS = \
> $(DLOPEN_LIBS) \
> -lm
>
> +if HAVE_PLATFORM_DRM
> +AM_CPPFLAGS += \
> +   -DVK_USE_PLATFORM_DISPLAY_KHR
> +
> +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES)
> +
> +endif
> +
>  if HAVE_PLATFORM_X11
>  AM_CPPFLAGS += \
> $(XCB_DRI3_CFLAGS) \
> diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
> index ccb956a2396..70d56e88cb3 100644
> --- a/src/amd/vulkan/Makefile.sources
> +++ b/src/amd/vulkan/Makefile.sources
> @@ -80,6 +80,9 @@ VULKAN_WSI_WAYLAND_FILES := \
>  VULKAN_WSI_X11_FILES := \
> radv_wsi_x11.c
>
> +VULKAN_WSI_DISPLAY_FILES := \
> +   radv_wsi_display.c
> +
>  VULKAN_GENERATED_FILES := \
> radv_entrypoints.c \
> radv_entrypoints.h \
> diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
> index b5a99fe91e1..15e69d582dd 100644
> --- a/src/amd/vulkan/meson.build
> +++ b/src/amd/vulkan/meson.build
> @@ -115,6 +115,11 @@ if with_platform_wayland
>libradv_files += files('radv_wsi_wayland.c')
>  endif
>
> +if with_platform_drm
> +  radv_flags += '-DVK_USE_PLATFORM_DISPLAY_KHR'
> +  libradv_files += files('radv_wsi_display.c')
> +endif
> +
>  libvulkan_radeon = shared_library(
>'vulkan_radeon',
>[libradv_files, radv_entrypoints, radv_extensions_c, vk_format_table_c],
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index ca091ee12ba..59ee503c8c2 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -274,6 +274,23 @@ radv_physical_device_init(struct
> radv_physical_device *device,
> goto fail;
> }
>
> +   if (instance->enabled_extensions.KHR_display) {
> +   master_fd = open(drm_device->nodes[DRM_NODE_PRIMARY],
> O_RDWR | O_CLOEXEC);
> +   if (master_fd >= 0) {
> +   uint32_t accel_working = 0;
> +   struct drm_amdgpu_info request = {
> +   .return_pointer =
> (uintptr_t)&accel_working,
> +   .return_size = sizeof(accel_working),
> +   .query = AMDGPU_INFO_ACCEL_WORKING
> +   };
> +
> +   if (drmCommandWrite(master_fd, DRM_AMDGPU_INFO,
> &request, sizeof (struct drm_amdgpu_info)) < 0 || !accel_working) {
> +   close(master_fd);
> +   master_fd = -1;
> +   }
> +   }
> +   }
> +
> device->master_fd = master_fd;
> device->local_fd = fd;
> device->ws->query_info(device->ws, &device->rad_info);
> diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py
> index a5b5a8dc34e..6f4fc71bfd8 100644
> --- a/src/amd/vulkan/radv_extensions.py
> +++ b/src/amd/vulkan/radv_extensions.py
> @@ -86,6 +86,7 @@ EXTENSIONS = [
>  Extension('VK_KHR_xcb_surface',   6,
> 'VK_USE_PLATFORM_XCB_KHR'),
>  Extension('VK_KHR_xlib_surface',  6,
> 'VK_USE_PLATFORM_XLIB_KHR'),
>  Extension('VK_KHR_multiview', 1, True),
> +Extension('VK_KHR_display',  23,
> 'VK_USE_PLATFORM_DISPLAY_KHR'),
>  Extension('VK_EXT_debug_report',  9, True),
>  Extension('VK_EXT_depth_range_unrestricted',  1, True),
>  Extension('VK_EXT_descriptor_indexing',   2, True),
> @@ -214,7 +215,7 @@ _TEMPLATE_C = Te

Re: [Mesa-dev] [PATCH mesa 3/9] radv: Add KHR_display extension to radv [v4]

2018-06-13 Thread Jason Ekstrand
On Wed, Jun 13, 2018 at 2:46 PM, Jason Ekstrand 
wrote:

> Reviewed-by: Jason Ekstrand 
>

With the caveat that I have no idea how the amdgpu kernel interface works.
:-)


> On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:
>
>> This adds support for the KHR_display extension to the radv Vulkan
>> driver. The driver now attempts to open the master DRM node when the
>> KHR_display extension is requested so that the common winsys code can
>> perform the necessary operations.
>>
>> v2:
>> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to
>>   vulkan_wsi_args
>>
>> Suggested-by: Eric Engestrom 
>>
>> v3:
>> Adapt to new wsi_device_init API (added display_fd)
>>
>> v4:
>> Adopt Jason Ekstrand's coding conventions
>>
>> Declare variables at first use, eliminate extra whitespace
>> between types and names. Wrap lines to 80 columns.
>>
>> Suggested-by: Jason Ekstrand 
>>
>> Signed-off-by: Keith Packard 
>> ---
>>  src/amd/vulkan/Makefile.am|   8 ++
>>  src/amd/vulkan/Makefile.sources   |   3 +
>>  src/amd/vulkan/meson.build|   5 +
>>  src/amd/vulkan/radv_device.c  |  17 
>>  src/amd/vulkan/radv_extensions.py |   7 +-
>>  src/amd/vulkan/radv_private.h |   1 +
>>  src/amd/vulkan/radv_wsi_display.c | 149 ++
>>  7 files changed, 188 insertions(+), 2 deletions(-)
>>  create mode 100644 src/amd/vulkan/radv_wsi_display.c
>>
>> diff --git a/src/amd/vulkan/Makefile.am b/src/amd/vulkan/Makefile.am
>> index 18f263ab447..f4f99400275 100644
>> --- a/src/amd/vulkan/Makefile.am
>> +++ b/src/amd/vulkan/Makefile.am
>> @@ -80,6 +80,14 @@ VULKAN_LIB_DEPS = \
>> $(DLOPEN_LIBS) \
>> -lm
>>
>> +if HAVE_PLATFORM_DRM
>> +AM_CPPFLAGS += \
>> +   -DVK_USE_PLATFORM_DISPLAY_KHR
>> +
>> +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES)
>> +
>> +endif
>> +
>>  if HAVE_PLATFORM_X11
>>  AM_CPPFLAGS += \
>> $(XCB_DRI3_CFLAGS) \
>> diff --git a/src/amd/vulkan/Makefile.sources
>> b/src/amd/vulkan/Makefile.sources
>> index ccb956a2396..70d56e88cb3 100644
>> --- a/src/amd/vulkan/Makefile.sources
>> +++ b/src/amd/vulkan/Makefile.sources
>> @@ -80,6 +80,9 @@ VULKAN_WSI_WAYLAND_FILES := \
>>  VULKAN_WSI_X11_FILES := \
>> radv_wsi_x11.c
>>
>> +VULKAN_WSI_DISPLAY_FILES := \
>> +   radv_wsi_display.c
>> +
>>  VULKAN_GENERATED_FILES := \
>> radv_entrypoints.c \
>> radv_entrypoints.h \
>> diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build
>> index b5a99fe91e1..15e69d582dd 100644
>> --- a/src/amd/vulkan/meson.build
>> +++ b/src/amd/vulkan/meson.build
>> @@ -115,6 +115,11 @@ if with_platform_wayland
>>libradv_files += files('radv_wsi_wayland.c')
>>  endif
>>
>> +if with_platform_drm
>> +  radv_flags += '-DVK_USE_PLATFORM_DISPLAY_KHR'
>> +  libradv_files += files('radv_wsi_display.c')
>> +endif
>> +
>>  libvulkan_radeon = shared_library(
>>'vulkan_radeon',
>>[libradv_files, radv_entrypoints, radv_extensions_c,
>> vk_format_table_c],
>> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
>> index ca091ee12ba..59ee503c8c2 100644
>> --- a/src/amd/vulkan/radv_device.c
>> +++ b/src/amd/vulkan/radv_device.c
>> @@ -274,6 +274,23 @@ radv_physical_device_init(struct
>> radv_physical_device *device,
>> goto fail;
>> }
>>
>> +   if (instance->enabled_extensions.KHR_display) {
>> +   master_fd = open(drm_device->nodes[DRM_NODE_PRIMARY],
>> O_RDWR | O_CLOEXEC);
>> +   if (master_fd >= 0) {
>> +   uint32_t accel_working = 0;
>> +   struct drm_amdgpu_info request = {
>> +   .return_pointer =
>> (uintptr_t)&accel_working,
>> +   .return_size = sizeof(accel_working),
>> +   .query = AMDGPU_INFO_ACCEL_WORKING
>> +   };
>> +
>> +   if (drmCommandWrite(master_fd, DRM_AMDGPU_INFO,
>> &request, sizeof (struct drm_amdgpu_info)) < 0 || !accel_working) {
>> +   close(master_fd);
>> +   master_fd = -1;
>> +   }
>> +   }
>> +   }
>> +
>> device->master_fd = master_fd;
>> device->local_fd = fd;
>> device->ws->query_info(device->ws, &device->rad_info);
>> diff --git a/src/amd/vulkan/radv_extensions.py
>> b/src/amd/vulkan/radv_extensions.py
>> index a5b5a8dc34e..6f4fc71bfd8 100644
>> --- a/src/amd/vulkan/radv_extensions.py
>> +++ b/src/amd/vulkan/radv_extensions.py
>> @@ -86,6 +86,7 @@ EXTENSIONS = [
>>  Extension('VK_KHR_xcb_surface',   6,
>> 'VK_USE_PLATFORM_XCB_KHR'),
>>  Extension('VK_KHR_xlib_surface',  6,
>> 'VK_USE_PLATFORM_XLIB_KHR'),
>>  Extension('VK_KHR_multiview', 1, True),
>> +Extension('VK_KHR_display',  23,
>> 'VK_USE_

Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]

2018-06-13 Thread Jason Ekstrand
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> This adds support for the KHR_display extension to the anv Vulkan
> driver. The driver now attempts to open the master DRM node when the
> KHR_display extension is requested so that the common winsys code can
> perform the necessary operations.
>
> v2: Make sure primary fd is usable
>
> When KHR_display is selected, we try to open the primary node
> instead of the render node in case the user wants to use
> KHR_display for presentation. However, if we're actually going
> to end up using RandR leases, then we don't care if the
> resulting fd can't be used for display, but the kernel also
> prevents us from using it for drawing when someone else has
> master.
>
> v3:
> Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args
>
> Suggested-by: Eric Engestrom 
>
> v4:
> Adapt primary node usage to new wsi_device_init API
>
> v5:
> Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace between
> types and names. Wrap lines to 80 columns.
>
> Remove spurious MM_PER_PIXEL define
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
>
> fixup
>

Did you mean to leave this in here?


> ---
>  src/intel/Makefile.sources |   3 +
>  src/intel/Makefile.vulkan.am   |   7 ++
>  src/intel/vulkan/anv_device.c  |  21 
>  src/intel/vulkan/anv_extensions.py |   1 +
>  src/intel/vulkan/anv_extensions_gen.py |   5 +-
>  src/intel/vulkan/anv_wsi_display.c | 129 +
>  src/intel/vulkan/meson.build   |   5 +
>  7 files changed, 169 insertions(+), 2 deletions(-)
>  create mode 100644 src/intel/vulkan/anv_wsi_display.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index f22e727553f..5f6cd96825b 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -254,6 +254,9 @@ VULKAN_WSI_WAYLAND_FILES := \
>  VULKAN_WSI_X11_FILES := \
> vulkan/anv_wsi_x11.c
>
> +VULKAN_WSI_DISPLAY_FILES := \
> +   vulkan/anv_wsi_display.c
> +
>  VULKAN_GEM_FILES := \
> vulkan/anv_gem.c
>
> diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am
> index 4125cb205ad..9b7fbb74007 100644
> --- a/src/intel/Makefile.vulkan.am
> +++ b/src/intel/Makefile.vulkan.am
> @@ -192,6 +192,13 @@ VULKAN_SOURCES += $(VULKAN_WSI_WAYLAND_FILES)
>  VULKAN_LIB_DEPS += $(WAYLAND_CLIENT_LIBS)
>  endif
>
> +if HAVE_PLATFORM_DRM
> +VULKAN_CPPFLAGS += \
> +   -DVK_USE_PLATFORM_DISPLAY_KHR
> +
> +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES)
> +endif
> +
>  noinst_LTLIBRARIES += vulkan/libvulkan_common.la
>  vulkan_libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
>  vulkan_libvulkan_common_la_CFLAGS = $(VULKAN_CFLAGS)
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 56e91fe5de1..b3c6d1a8722 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -274,6 +274,7 @@ anv_physical_device_init_uuids(struct
> anv_physical_device *device)
>  static VkResult
>  anv_physical_device_init(struct anv_physical_device *device,
>   struct anv_instance *instance,
> + const char *primary_path,
>   const char *path)
>  {
> VkResult result;
> @@ -445,6 +446,25 @@ anv_physical_device_init(struct anv_physical_device
> *device,
> anv_physical_device_get_supported_extensions(device,
>
>  &device->supported_extensions);
>
> +   if (instance->enabled_extensions.KHR_display) {
> +  master_fd = open(primary_path, O_RDWR | O_CLOEXEC);
> +  if (master_fd >= 0) {
> + /* prod the device with a GETPARAM call which will fail if
> +  * we don't have permission to even render on this device
> +  */
> + drm_i915_getparam_t gp;
> + memset(&gp, '\0', sizeof(gp));
> + int devid = 0;
> + gp.param = I915_PARAM_CHIPSET_ID;
> + gp.value = &devid;
> + int ret = drmIoctl(master_fd, DRM_IOCTL_I915_GETPARAM, &gp);
> + if (ret < 0) {
> +close(master_fd);
> +master_fd = -1;
> + }
> +  }
> +   }
> +
> device->local_fd = fd;
> device->master_fd = master_fd;
> return VK_SUCCESS;
> @@ -635,6 +655,7 @@ anv_enumerate_devices(struct anv_instance *instance)
>
>   result = anv_physical_device_init(&instance->physicalDevice,
>  instance,
> +devices[i]->nodes[DRM_NODE_PRIMARY],
>  devices[i]->nodes[DRM_NODE_RENDER]);
>   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
>  break;
> diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_
> extensions.py
> index 8160864685f..83c09a46741 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_exte

Re: [Mesa-dev] [PATCH] mesa: enable EXT_render_snorm extension

2018-06-13 Thread Eric Anholt
Tapani Pälli  writes:

> Patch sets additional formats renderable and enables the extension
> when OpenGL ES 3.1 is supported.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/extensions_table.h |  1 +
>  src/mesa/main/fbobject.c | 20 +++-
>  src/mesa/main/glformats.c|  9 +
>  3 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 79ef228b69..bc60475bea 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -245,6 +245,7 @@ EXT(EXT_polygon_offset_clamp, 
> ARB_polygon_offset_clamp
>  EXT(EXT_primitive_bounding_box  , OES_primitive_bounding_box 
> ,  x ,  x ,  x ,  31, 2014)
>  EXT(EXT_provoking_vertex, EXT_provoking_vertex   
> , GLL, GLC,  x ,  x , 2009)
>  EXT(EXT_read_format_bgra, dummy_true 
> ,  x ,  x , ES1, ES2, 2009)
> +EXT(EXT_render_snorm, dummy_true 
> ,  x ,  x ,  x,   31, 2014)

Since this is an extension beyond GLES 3.1, I think it shouldn't be
dummy_true -- at least V3D 3.3 should be able to do 3.1, and can't
render to snorm.




[Mesa-dev] [PATCH 1/4] nv50/ir: add preliminary support for OP_XMAD

2018-06-13 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|  3 ++-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  | 14 
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 12 +--
 .../drivers/nouveau/codegen/nv50_ir_print.cpp  | 20 +
 .../drivers/nouveau/codegen/nv50_ir_target.cpp |  7 +++---
 .../nouveau/codegen/nv50_ir_target_gm107.cpp   |  1 +
 .../nouveau/codegen/nv50_ir_target_nv50.cpp|  5 +++--
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 25 --
 8 files changed, 77 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
index 49425b98b9..99bf8de370 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
@@ -53,7 +53,8 @@ Modifier Modifier::operator*(const Modifier m) const
   b &= ~NV50_IR_MOD_NEG;
 
a = (this->bits ^ b)  & (NV50_IR_MOD_NOT | NV50_IR_MOD_NEG);
-   c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT);
+   c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT |
+NV50_IR_MOD_H1 | NV50_IR_MOD_SEXT);
 
return Modifier(a | c);
 }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index f4f3c70888..4deaf09989 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -58,6 +58,7 @@ enum operation
OP_FMA,
OP_SAD, // abs(src0 - src1) + src2
OP_SHLADD,
+   OP_XMAD, // extended multiply-add (GM107+), does a lot of things
OP_ABS,
OP_NEG,
OP_NOT,
@@ -251,6 +252,13 @@ enum operation
 #define NV50_IR_SUBOP_VOTE_ALL 0
 #define NV50_IR_SUBOP_VOTE_ANY 1
 #define NV50_IR_SUBOP_VOTE_UNI 2
+#define NV50_IR_SUBOP_XMAD_PSL (1 << 0)
+#define NV50_IR_SUBOP_XMAD_MRG (1 << 1)
+#define NV50_IR_SUBOP_XMAD_CLO (1 << 2)
+#define NV50_IR_SUBOP_XMAD_CHI (2 << 2)
+#define NV50_IR_SUBOP_XMAD_CSFU (3 << 2)
+#define NV50_IR_SUBOP_XMAD_CBCC (4 << 2)
+#define NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2)
 
 #define NV50_IR_SUBOP_MINMAX_LOW  1
 #define NV50_IR_SUBOP_MINMAX_MED  2
@@ -527,6 +535,9 @@ struct Storage
 #define NV50_IR_MOD_SAT (1 << 2)
 #define NV50_IR_MOD_NOT (1 << 3)
 #define NV50_IR_MOD_NEG_ABS (NV50_IR_MOD_NEG | NV50_IR_MOD_ABS)
+// modifiers only for XMAD
+#define NV50_IR_MOD_H1   (1 << 4)
+#define NV50_IR_MOD_SEXT (1 << 5)
 
 #define NV50_IR_INTERP_MODE_MASK   0x3
 #define NV50_IR_INTERP_LINEAR  (0 << 0)
@@ -556,11 +567,14 @@ public:
inline Modifier operator&(const Modifier m) const { return bits & m.bits; }
inline Modifier operator|(const Modifier m) const { return bits | m.bits; }
inline Modifier operator^(const Modifier m) const { return bits ^ m.bits; }
+   inline Modifier operator~() const { return ~bits; }
 
operation getOp() const;
 
inline int neg() const { return (bits & NV50_IR_MOD_NEG) ? 1 : 0; }
inline int abs() const { return (bits & NV50_IR_MOD_ABS) ? 1 : 0; }
+   inline int h1() const { return (bits & NV50_IR_MOD_H1) ? 1 : 0; }
+   inline int sext() const { return (bits & NV50_IR_MOD_SEXT) ? 1 : 0; }
 
inline operator bool() const { return bits ? true : false; }
 
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4d0589214d..a43b481a01 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -191,9 +191,16 @@ void
 LoadPropagation::checkSwapSrc01(Instruction *insn)
 {
const Target *targ = prog->getTarget();
-   if (!targ->getOpInfo(insn).commutative)
-  if (insn->op != OP_SET && insn->op != OP_SLCT && insn->op != OP_SUB)
+   if (!targ->getOpInfo(insn).commutative) {
+  if (insn->op != OP_SET && insn->op != OP_SLCT &&
+  insn->op != OP_SUB && insn->op != OP_XMAD)
  return;
+  // XMAD is only commutative if both the CBCC and MRG flags are not set.
+  if (insn->op == OP_XMAD && (insn->subOp & 0x1c) == 
NV50_IR_SUBOP_XMAD_CBCC)
+ return;
+  if (insn->op == OP_XMAD && (insn->subOp & NV50_IR_SUBOP_XMAD_MRG))
+ return;
+   }
if (insn->src(1).getFile() != FILE_GPR)
   return;
// This is the special OP_SET used for alphatesting, we can't reverse its
@@ -488,6 +495,7 @@ Modifier::applyTo(ImmediateValue& imm) const
  imm.reg.data.s32 = -imm.reg.data.s32;
   if (bits & NV50_IR_MOD_NOT)
  imm.reg.data.s32 = ~imm.reg.data.s32;
+  // NOTE: applying the h1 and sext modifiers is confusing and not very 
useful
   break;
 
case TYPE_F64:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
index cbb21f5f72..c4906c31a8 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
+++ b/src/

[Mesa-dev] [PATCH 4/4] nv50/ir: further optimize multiplication by immediates

2018-06-13 Thread Rhys Perry
Strongly mitigates the harm from the previous commit, which made many
integer multiplications much heavier on register and instruction
counts.

total instructions in shared programs : 5294693 -> 5268293 (-0.50%)
total gprs used in shared programs: 624962 -> 624196 (-0.12%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 21048 -> 20952 (-0.46%)

                local     shared        gpr       inst      bytes
    helped          1          0        368       1772       1772
      hurt          0          0         74         23         23

Signed-off-by: Rhys Perry 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 123 ++---
 src/util/bitscan.h |  26 +
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 84cb5eb04b..aaad4db479 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -371,6 +371,10 @@ private:
void tryCollapseChainedMULs(Instruction *, const int s, ImmediateValue&);
 
CmpInstruction *findOriginForTestWithZero(Value *);
+ 
+   Value *createMulMethod1(Value *a, unsigned b, Value *c);
+   Value *createMulMethod2(Value *a, unsigned b, Value *c);
+   Value *createMul(Value *a, unsigned b, Value *c);
 
unsigned int foldCount;
 
@@ -946,6 +950,97 @@ ConstantFolding::opnd3(Instruction *i, ImmediateValue 
&imm2)
   return;
}
 }
+ 
+Value *
+ConstantFolding::createMulMethod1(Value *a, unsigned b, Value *c)
+{
+   if (b == 1)
+  return a;
+
+   // Basically constant folded shift and add multiplication.
+   Value *res = c ? c : bld.loadImm(NULL, 0u);
+   bool resZero = !c;
+   unsigned ashift = 0;
+   while (b) {
+  if ((b & 1) && ashift) {
+ if (resZero)
+res = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), a, 
bld.mkImm(ashift));
+ else
+res = bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(), a, 
bld.mkImm(ashift), res);
+ resZero = false;
+  } else if (b & 1) {
+ if (resZero)
+res = a;
+ else
+res = bld.mkOp2v(OP_ADD, TYPE_U32, bld.getSSA(), res, a);
+ resZero = false;
+  }
+  b >>= 1;
+  ashift++;
+   }
+   return res;
+}
+
+Value *
+ConstantFolding::createMulMethod2(Value *a, unsigned b, Value *c)
+{
+   uint64_t b2 = u_next_power_of_two(b);
+   unsigned b2shift = ffsll(b2) - 1;
+   if (b2 != b) { // a * b2 - a * (b2 - b)
+  // mul1 = a * (b2 - b)
+  Value *mul1 = createMulMethod1(a, b2 - b, NULL);
+
+  if (b2shift < 32 && c) { // a * b2 - mul1 + c (implemented as a * b2 + c 
- mul1)
+ return bld.mkOp2v(OP_SUB, TYPE_U32, bld.getSSA(),
+   bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(),
+  a, bld.mkImm(b2shift), c),
+   mul1);
+  } else
+  if (b2shift < 32) { // a * b2 - mul1
+ Value *res = bld.getSSA();
+ Instruction *i = bld.mkOp3(OP_SHLADD, TYPE_U32, res, a, 
bld.mkImm(b2shift), mul1);
+ if (bld.getProgram()->getTarget()->isModSupported(i, 2, 
NV50_IR_MOD_NEG))
+i->src(2).mod *= Modifier(NV50_IR_MOD_NEG);
+ else
+i->setSrc(2, bld.mkOp1v(OP_NEG, TYPE_U32, bld.getSSA(), mul1));
+ return res;
+  } else
+  if (c) { // - mul1 + c (implemented as c - mul1)
+ return bld.mkOp2v(OP_SUB, TYPE_U32, bld.getSSA(), c, mul1);
+  } else { // - mul1
+ return bld.mkOp1v(OP_NEG, TYPE_U32, bld.getSSA(), mul1);
+  }
+   } else {
+  if (c) // a * b2 + c
+ return bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(), a, 
bld.mkImm(b2shift), c);
+  else // a * b2
+ return bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), a, 
bld.loadImm(NULL, b2shift));
+   }
+}
+
+Value *
+ConstantFolding::createMul(Value *a, unsigned b, Value *c)
+{
+   unsigned cost[2];
+
+   // Estimate cost for first method (a << i) + (a << j) + ...
+   cost[0] = u_bit_count64(b >> 1);
+
+   // Estimate cost for second method (a << i) - ((a << j) + (a << k) + ...)
+   uint64_t rounded_b = u_next_power_of_two(b);
+   cost[1] = rounded_b == b ? 1 : (u_bit_count64((rounded_b - b) >> 1) + 2);
+   if (c) cost[1]++;
+
+   // The general method, multiplication by XMADs, costs three instructions.
+   // So nothing larger than that or it could be making things worse.
+   if (cost[0] > 3 && cost[1] > 3)
+  return NULL;
+
+   if (cost[0] < cost[1])
+  return createMulMethod1(a, b, c);
+   else
+  return createMulMethod2(a, b, c);
+}
 
 void
 ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
@@ -1034,13 +1129,13 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue 
&imm0, int s)
  i->setSrc(s, i->getSrc(t));
  i->src(s).mod = 

[Mesa-dev] [PATCH 3/4] nv50/ir: optimize imul/imad to xmads

2018-06-13 Thread Rhys Perry
This hurts the shader-db numbers a good bit, though a few XMADs are much
faster than an IMUL or IMAD, and the cost is mitigated by the next commit,
which optimizes many multiplications by immediates into shorter and less
register-heavy instructions than the XMADs.

total instructions in shared programs : 5256901 -> 5294693 (0.72%)
total gprs used in shared programs: 624328 -> 624962 (0.10%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 20952 -> 21048 (0.46%)

                local     shared        gpr       inst      bytes
    helped          0          0         39          0          0
      hurt          1          0        334       2277       2277

Signed-off-by: Rhys Perry 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index a43b481a01..84cb5eb04b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2246,13 +2246,18 @@ AlgebraicOpt::visit(BasicBlock *bb)
 // 
=
 
 // ADD(SHL(a, b), c) -> SHLADD(a, b, c)
+// MUL(a, b) -> a few XMADs
+// MAD/FMA(a, b, c) -> a few XMADs
 class LateAlgebraicOpt : public Pass
 {
 private:
virtual bool visit(Instruction *);
 
void handleADD(Instruction *);
+   void handleMULMAD(Instruction *);
bool tryADDToSHLADD(Instruction *);
+
+   BuildUtil bld;
 };
 
 void
@@ -2312,6 +2317,49 @@ LateAlgebraicOpt::tryADDToSHLADD(Instruction *add)
 
return true;
 }
+ 
+// MUL(a, b) -> a few XMADs
+// MAD/FMA(a, b, c) -> a few XMADs
+void
+LateAlgebraicOpt::handleMULMAD(Instruction *i)
+{
+   // TODO: handle NV50_IR_SUBOP_MUL_HIGH
+   if (!prog->getTarget()->isOpSupported(OP_XMAD, TYPE_U32))
+  return;
+   if (isFloatType(i->dType) || typeSizeof(i->dType) != 4)
+  return;
+   if (i->subOp || i->usesFlags() || i->flagsDef >= 0)
+  return;
+
+   assert(!i->src(0).mod);
+   assert(!i->src(1).mod);
+   assert(i->op == OP_MUL ? 1 : !i->src(2).mod);
+
+   bld.setPosition(i, true);
+
+   Value *a = i->getSrc(0);
+   Value *b = i->getSrc(1);
+   Value *c = i->op == OP_MUL ? bld.mkImm(0) : i->getSrc(2);
+
+   Value *tmp0 = bld.getSSA();
+   Value *tmp1 = bld.getSSA();
+
+   Instruction *insn = bld.mkOp3(OP_XMAD, TYPE_U32, tmp0, b, a, c);
+   insn->setPredicate(i->cc, i->getPredicate());
+
+   insn = bld.mkOp3(OP_XMAD, TYPE_U32, tmp1, b, a, bld.mkImm(0));
+   insn->setPredicate(i->cc, i->getPredicate());
+   insn->src(1).mod = NV50_IR_MOD_H1;
+   insn->subOp = NV50_IR_SUBOP_XMAD_MRG;
+
+   insn = bld.mkOp3(OP_XMAD, TYPE_U32, i->getDef(0), b, tmp1, tmp0);
+   insn->setPredicate(i->cc, i->getPredicate());
+   insn->src(0).mod = NV50_IR_MOD_H1;
+   insn->src(1).mod = NV50_IR_MOD_H1;
+   insn->subOp = NV50_IR_SUBOP_XMAD_PSL | NV50_IR_SUBOP_XMAD_CBCC;
+
+   delete_Instruction(prog, i);
+}
 
 bool
 LateAlgebraicOpt::visit(Instruction *i)
@@ -2320,6 +2368,11 @@ LateAlgebraicOpt::visit(Instruction *i)
case OP_ADD:
   handleADD(i);
   break;
+   case OP_MUL:
+   case OP_MAD:
+   case OP_FMA:
+  handleMULMAD(i);
+  break;
default:
   break;
}
-- 
2.14.4



[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication

2018-06-13 Thread Rhys Perry
This series improves the performance of integer multiplication by removing
most uses of the very slow IMAD and IMUL. It depends on the
SHLADD/IndirectPropagation patches.

The first and second patches add support for the XMAD instruction in codegen.

The third patch replaces most IMADs and IMULs with a sequence of XMADs.
This is far faster but increases the total instructions in the shader-db
by 0.72%.
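
As a rough illustration of why a 32-bit multiply-add can be built from
16-bit pieces, the sketch below models the arithmetic in Python. It is not
the Mesa code and does not model the XMAD flags; the function name and the
split into "low" and "cross" terms are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF


def mad32_from_16bit_parts(a, b, c=0):
    """Compute (a * b + c) mod 2**32 using only 16x16-bit products."""
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF

    # Low partial product plus the addend.
    low = b_lo * a_lo + c
    # The two cross terms land in the upper half. a_hi * b_hi would be
    # shifted left by 32 bits, so it vanishes modulo 2**32.
    cross = b_lo * a_hi + b_hi * a_lo
    return (low + (cross << 16)) & MASK32
```

Three 16-bit multiplies appear here, which lines up with the three XMADs
the patch emits per multiplication; how the hardware folds the shift and
merge steps into each instruction is not shown.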

This number is significantly lowered by the fourth patch. It replaces many
multiplications by immediates with instructions that should be as fast as or
faster than the XMAD approach. They are also typically smaller and less
register heavy, so they decrease the total instruction count by 0.50%.
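
The two strategies for multiplying by an immediate can be modelled in a few
lines. This is a hedged Python sketch of the idea only: mul_shift_add,
mul_pow2_minus and next_pow2 are names invented here, and the real codegen
emits SHL/SHLADD/ADD/SUB instructions rather than evaluating integers.

```python
MASK32 = 0xFFFFFFFF


def next_pow2(b):
    """Smallest power of two >= b, for b >= 1 (hypothetical helper)."""
    return 1 << (b - 1).bit_length()


def mul_shift_add(a, b, c=0):
    """a * b + c as a sum of shifted copies of a, one per set bit of b."""
    res, shift = c, 0
    while b:
        if b & 1:
            res += a << shift
        b >>= 1
        shift += 1
    return res & MASK32


def mul_pow2_minus(a, b, c=0):
    """a * b + c as a * b2 - a * (b2 - b) + c, with b2 = next_pow2(b)."""
    b2 = next_pow2(b)
    b2shift = b2.bit_length() - 1
    return ((a << b2shift) - mul_shift_add(a, b2 - b) + c) & MASK32
```

For b = 7 the first form needs three terms, a + (a << 1) + (a << 2), while
the second needs only (a << 3) - a, which is why choosing between them by
estimated cost can pay off.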

This series gives roughly a 50% speedup in fragment-heavy scenarios with
Dolphin 5.0. All timings were made with interesting-looking fifos from
Dolphin's bugtracker:
     Wind Waker: 18 FPS -> 26 FPS at 3x internal resolution
     Wind Waker:  8 FPS -> 11 FPS at 5x internal resolution
   Paper Mario?: 26 FPS -> 42 FPS at 5x internal resolution
SpongeBob Movie: 19 FPS -> 30 FPS at 5x internal resolution

Unigine Heaven and Unigine Valley seem to run the same at low quality with
no anti-aliasing and no tessellation. SuperTuxKart and 0 A.D. also show no
change.

It's possible these patches may break something, especially the fourth
one. Piglit shows no functionality regressions, though the patches should
probably be tested for improvements or breakage with actual applications.

These patches can also be found on my github:
https://github.com/pendingchaos/mesa/tree/nv-xmad-v1

The final changes in shader-db are as follows:

total instructions in shared programs : 5256901 -> 5268293 (0.22%)
total gprs used in shared programs: 624328 -> 624196 (-0.02%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 20952 -> 20952 (0.00%)

                local     shared        gpr       inst      bytes
    helped          0          0        255        680        680
      hurt          0          0        128       1484       1484

Rhys Perry (4):
  nv50/ir: add preliminary support for OP_XMAD
  gm107/ir: add support for OP_XMAD on GM107+
  nv50/ir: optimize imul/imad to xmads
  nv50/ir: further optimize multiplication by immediates

 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|   3 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  14 ++
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  61 +++
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 188 +++--
 .../drivers/nouveau/codegen/nv50_ir_print.cpp  |  20 +++
 .../drivers/nouveau/codegen/nv50_ir_target.cpp |   7 +-
 .../nouveau/codegen/nv50_ir_target_gm107.cpp   |   5 +
 .../nouveau/codegen/nv50_ir_target_nv50.cpp|   5 +-
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp|  26 ++-
 src/util/bitscan.h |  26 +++
 10 files changed, 331 insertions(+), 24 deletions(-)

-- 
2.14.4



[Mesa-dev] [PATCH 2/4] gm107/ir: add support for OP_XMAD on GM107+

2018-06-13 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 61 ++
 .../nouveau/codegen/nv50_ir_target_gm107.cpp   |  6 ++-
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp|  1 +
 3 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 26826d6360..8ace77aa59 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -155,6 +155,7 @@ private:
void emitIMUL();
void emitIMAD();
void emitISCADD();
+   void emitXMAD();
void emitIMNMX();
void emitICMP();
void emitISET();
@@ -1881,6 +1882,63 @@ CodeEmitterGM107::emitISCADD()
emitGPR (0x08, insn->src(0));
emitGPR (0x00, insn->def(0));
 }
+ 
+void
+CodeEmitterGM107::emitXMAD()
+{
+   assert(insn->src(0).getFile() == FILE_GPR);
+
+   bool constbuf = false;
+   bool psl_mrg = true;
+   bool immediate = false;
+   if (insn->src(2).getFile() == FILE_MEMORY_CONST) {
+  assert(insn->src(1).getFile() == FILE_GPR);
+  constbuf = true;
+  psl_mrg = false;
+  emitInsn(0x5100);
+  emitGPR(0x27, insn->src(1));
+  emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(2));
+   } else if (insn->src(1).getFile() == FILE_MEMORY_CONST) {
+  assert(insn->src(2).getFile() == FILE_GPR);
+  constbuf = true;
+  emitInsn(0x4e00);
+  emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(1));
+  emitGPR(0x27, insn->src(2));
+   } else if (insn->src(1).getFile() == FILE_IMMEDIATE) {
+  assert(insn->src(2).getFile() == FILE_GPR);
+  assert(!insn->src(1).mod.h1());
+  immediate = true;
+  emitInsn(0x3600);
+  emitIMMD(0x14, 19, insn->src(1));
+  emitGPR(0x27, insn->src(2));
+   } else {
+  assert(insn->src(1).getFile() == FILE_GPR);
+  assert(insn->src(2).getFile() == FILE_GPR);
+  emitInsn(0x5b00);
+  emitGPR(0x14, insn->src(1));
+  emitGPR(0x27, insn->src(2));
+   }
+
+   if (insn->src(0).mod.sext())
+  emitField(0x30, 2, insn->src(1).mod.sext() ? 3 : 1);
+   else
+  emitField(0x30, 2, insn->src(1).mod.sext() ? 2 : 0);
+   emitField(0x35, 1, insn->src(0).mod.h1());
+   if (!immediate)
+  emitField(constbuf ? 0x34 : 0x23, 1, insn->src(1).mod.h1());
+
+   if (psl_mrg) {
+  emitField(constbuf ? 0x37 : 0x24, 1, insn->subOp & 
NV50_IR_SUBOP_XMAD_PSL ? 1 : 0);
+  emitField(constbuf ? 0x38 : 0x25, 1, insn->subOp & 
NV50_IR_SUBOP_XMAD_MRG ? 1 : 0);
+   }
+   emitField(0x32, constbuf ? 2 : 3, (insn->subOp >> 2) & 0x7);
+
+   emitX(constbuf ? 0x36 : 0x26);
+   emitCC(0x2f);
+
+   emitGPR(0x0, insn->def(0));
+   emitGPR(0x8, insn->src(0));
+}
 
 void
 CodeEmitterGM107::emitIMNMX()
@@ -3253,6 +3311,9 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
case OP_SHLADD:
   emitISCADD();
   break;
+   case OP_XMAD:
+  emitXMAD();
+  break;
case OP_MIN:
case OP_MAX:
   if (isFloatType(insn->dType)) {
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
index 24a1cbb8da..f918fbfdd3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
@@ -60,8 +60,11 @@ TargetGM107::isOpSupported(operation op, DataType ty) const
case OP_SQRT:
case OP_DIV:
case OP_MOD:
-   case OP_XMAD:
   return false;
+   case OP_XMAD:
+  if (isFloatType(ty))
+ return false;
+  break;
default:
   break;
}
@@ -230,6 +233,7 @@ TargetGM107::getLatency(const Instruction *insn) const
case OP_SUB:
case OP_VOTE:
case OP_XOR:
+   case OP_XMAD:
   if (insn->dType != TYPE_F64)
  return 6;
   break;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 66efa0135f..3b96c71f44 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -161,6 +161,7 @@ static const struct opProperties _initPropsGM107[] = {
{ OP_SUSTP,   0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
{ OP_SUREDB,  0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
{ OP_SUREDP,  0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
+   { OP_XMAD,0x0, 0x0, 0x0, 0x0, 0x6, 0x2 },
 };
 
 void TargetNVC0::initProps(const struct opProperties *props, int size)
-- 
2.14.4



[Mesa-dev] [Bug 106903] radv: Fragment shader output goes to wrong attachments when render targets are sparse

2018-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=106903

--- Comment #2 from Bas Nieuwenhuizen  ---
https://patchwork.freedesktop.org/patch/229361/ should fix this.



[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication

2018-06-13 Thread Rhys Perry
Forgot to CC you.


[Mesa-dev] [PATCH v2] configure.ac/meson.build: Add options for library suffixes

2018-06-13 Thread bmgordon
From: Benjamin Gordon 

When building the Chrome OS Android container, we need to build copies
of mesa that don't conflict with the Android system-supplied libraries.
This adds options to create suffixed versions of EGL and GLES libraries:

libEGL.so -> libEGL${egl-lib-suffix}.so
libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
libGLESv2.so -> libGLESv2${gles-lib-suffix}.so

This is similar to what happens when --enable-libglvnd is specified, but
without the side effects of linking against libglvnd.  To avoid
unexpected clashes with the suffixes appended by libglvnd, make it an
error to specify both --enable-libglvnd and --with-egl-lib-suffix.
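
The intended behaviour can be sketched as follows. This is a Python model
of the naming and the mutual-exclusion check only, not the build system
code; the function names and the "_arc" suffix are made up for the example,
and the GLESv2 name is assumed to be suffixed the same way as GLESv1_CM.

```python
def check_options(use_glvnd, egl_suffix):
    """Reject the combination that the patch turns into a hard error."""
    if use_glvnd and egl_suffix:
        raise ValueError("EGL lib suffix can't be used with libglvnd")


def suffixed_names(egl_suffix="", gles_suffix=""):
    """Library file names for a build without libglvnd."""
    return [
        "libEGL%s.so" % egl_suffix,
        "libGLESv1_CM%s.so" % gles_suffix,
        # Assumption: GLESv2 follows the same pattern as GLESv1_CM.
        "libGLESv2%s.so" % gles_suffix,
    ]
```

With empty suffixes the names are unchanged, so existing builds should not
be affected by the new options.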

Signed-off-by: Benjamin Gordon 
Reviewed-by: Eric Engestrom 
---
 configure.ac| 18 ++
 meson.build |  3 +++
 meson_options.txt   | 12 
 src/egl/Makefile.am |  8 
 src/egl/meson.build |  2 +-
 src/mapi/Makefile.am| 28 ++--
 src/mapi/es1api/meson.build |  2 +-
 src/mapi/es2api/meson.build |  2 +-
 8 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/configure.ac b/configure.ac
index 35ade986d1..95ec47266f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1511,14 +1511,30 @@ AC_ARG_WITH([gl-lib-name],
 [specify GL library name @<:@default=GL@:>@])],
   [GL_LIB=$withval],
   [GL_LIB="$DEFAULT_GL_LIB_NAME"])
+AC_ARG_WITH([egl-lib-suffix],
+  [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@],
+[specify EGL library suffix @<:@default=none@:>@])],
+  [EGL_LIB_SUFFIX=$withval],
+  [EGL_LIB_SUFFIX=""])
+AC_ARG_WITH([gles-lib-suffix],
+  [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@],
+[specify GLES library suffix @<:@default=none@:>@])],
+  [GLES_LIB_SUFFIX=$withval],
+  [GLES_LIB_SUFFIX=""])
 AC_ARG_WITH([osmesa-lib-name],
   [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
 [specify OSMesa library name @<:@default=OSMesa@:>@])],
   [OSMESA_LIB=$withval],
   [OSMESA_LIB=OSMesa])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
+AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""])
+AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""])
 AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
 
+if test "x$enable_libglvnd" = xyes -a "x$EGL_LIB_SUFFIX" != x; then
+AC_MSG_ERROR([EGL lib suffix can't be used with libglvnd])
+fi
+
 dnl
 dnl Mangled Mesa support
 dnl
@@ -1534,6 +1550,8 @@ if test "x${enable_mangling}" = "xyes" ; then
   OSMESA_LIB="Mangled${OSMESA_LIB}"
 fi
 AC_SUBST([GL_LIB])
+AC_SUBST([EGL_LIB_SUFFIX])
+AC_SUBST([GLES_LIB_SUFFIX])
 AC_SUBST([OSMESA_LIB])
 
 # Check for libdrm
diff --git a/meson.build b/meson.build
index e52b4a5109..ca081b1e0b 100644
--- a/meson.build
+++ b/meson.build
@@ -373,6 +373,9 @@ if with_glvnd
   elif with_glx == 'disabled' and not with_egl
 error('glvnd requires DRI based GLX and/or EGL')
   endif
+  if get_option('egl-lib-suffix') != ''
+error('''EGL lib suffix can't be used with libglvnd''')
+  endif
 endif
 
 # TODO: toggle for this
diff --git a/meson_options.txt b/meson_options.txt
index ce7d87f1eb..9d84c3b5bb 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -298,3 +298,15 @@ option(
   choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
   description : 'List of tools to build.',
 )
+option(
+  'egl-lib-suffix',
+  type : 'string',
+  value : '',
+  description : 'Suffix to append to EGL library name.  Default: none.'
+)
+option(
+  'gles-lib-suffix',
+  type : 'string',
+  value : '',
+  description : 'Suffix to append to GLES library names.  Default: none.'
+)
diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 086a4a1e63..c3aeeea007 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \
 
 else # USE_LIBGLVND
 
-lib_LTLIBRARIES = libEGL.la
-libEGL_la_SOURCES =
-libEGL_la_LIBADD = \
+lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la
+libEGL@EGL_LIB_SUFFIX@_la_SOURCES =
+libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \
libEGL_common.la \
$(top_builddir)/src/mapi/shared-glapi/libglapi.la
-libEGL_la_LDFLAGS = \
+libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \
-no-undefined \
-version-number 1:0 \
$(BSYMBOLIC) \
diff --git a/src/egl/meson.build b/src/egl/meson.build
index 6537e4bdee..b833fd1729 100644
--- a/src/egl/meson.build
+++ b/src/egl/meson.build
@@ -148,7 +148,7 @@ if cc.has_function('mincore')
 endif
 
 if not with_glvnd
-  egl_lib_name = 'EGL'
+  egl_lib_name = 'EGL' + get_option('egl-lib-suffix')
   egl_lib_version = '1.0.0'
 else
   egl_lib_name = 'EGL_mesa'
diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index 3da1a193d2..a2b108adc9 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -178,24 +178,24 @@ GLES_include_HEADERS = \
$(top_srcdir)/include/GLES/glext.h \
$(top_srcdir)/include/GLES/glplatform.h
 
-lib_LTLIBRARIES += es1api/libGLESv1_CM.la
+lib_LTLIBRARIES += es1api/libGLE

Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes

2018-06-13 Thread Benjamin Gordon
On Wed, Jun 13, 2018 at 9:46 AM Dylan Baker  wrote:

> Quoting Eric Engestrom (2018-06-13 03:03:25)
> > On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote:
> > > From: Benjamin Gordon 
> > >
> > > When building the Chrome OS Android container, we need to build copies
> > > of mesa that don't conflict with the Android system-supplied libraries.
> > > This adds options to create suffixed versions of EGL and GLES
> libraries:
> > >
> > > libEGL.so -> libEGL${egl-lib-suffix}.so
> > > libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
> > > libGLESv2.so -> libGLES${gles-lib-suffix}.so
> > >
> > > This is similar to what happens when --enable-libglvnd is specified,
> but
> > > without the side effects of linking against libglvnd.
> >
> > This seems reasonable, and the meson side of this patch is correct,
> > but we need to document or prevent the interaction between
> > --enable-libglvnd and --with-egl-lib-suffix.
> >
> > I can't think of a use-case for having both, so I suggest "if both are
> > enabled, error out"; scroll down for what this could look like in meson.
>
> Agreed, making it hard error to use both makes sense to me.
>

Thanks for the reviews.  I just sent a v2 that makes it an error to pass
both flags.


>
> > With that (and the corresponding autotools hunk):
> > Reviewed-by: Eric Engestrom 
> >
> > >
> > > Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b
> >
> > (Note to whoever merges this patch: drop this line ^)
> >
> > > Signed-off-by: Benjamin Gordon 
> > > ---
> > >  configure.ac| 14 ++
> > >  meson_options.txt   | 12 
> > >  src/egl/Makefile.am |  8 
> > >  src/egl/meson.build |  2 +-
> > >  src/mapi/Makefile.am| 28 ++--
> > >  src/mapi/es1api/meson.build |  2 +-
> > >  src/mapi/es2api/meson.build |  2 +-
> > >  7 files changed, 47 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/configure.ac b/configure.ac
> > > index 35ade986d1..6070a2146b 100644
> > > --- a/configure.ac
> > > +++ b/configure.ac
> > > @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name],
> > >  [specify GL library name @<:@default=GL@:>@])],
> > >[GL_LIB=$withval],
> > >[GL_LIB="$DEFAULT_GL_LIB_NAME"])
> > > +AC_ARG_WITH([egl-lib-suffix],
> > > +  [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@],
> > > +[specify EGL library suffix @<:@default=none@:>@])],
> > > +  [EGL_LIB_SUFFIX=$withval],
> > > +  [EGL_LIB_SUFFIX=""])
> > > +AC_ARG_WITH([gles-lib-suffix],
> > > +  [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@],
> > > +[specify GLES library suffix @<:@default=none@:>@])],
> > > +  [GLES_LIB_SUFFIX=$withval],
> > > +  [GLES_LIB_SUFFIX=""])
> > >  AC_ARG_WITH([osmesa-lib-name],
> > >[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
> > >  [specify OSMesa library name @<:@default=OSMesa@:>@])],
> > >[OSMESA_LIB=$withval],
> > >[OSMESA_LIB=OSMesa])
> > >  AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
> > > +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""])
> > > +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""])
> > >  AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
> > >
> > >  dnl
> > > @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then
> > >OSMESA_LIB="Mangled${OSMESA_LIB}"
> > >  fi
> > >  AC_SUBST([GL_LIB])
> > > +AC_SUBST([EGL_LIB_SUFFIX])
> > > +AC_SUBST([GLES_LIB_SUFFIX])
> > >  AC_SUBST([OSMESA_LIB])
> > >
> > >  # Check for libdrm
> > > diff --git a/meson_options.txt b/meson_options.txt
> > > index ce7d87f1eb..9d84c3b5bb 100644
> > > --- a/meson_options.txt
> > > +++ b/meson_options.txt
> > > @@ -298,3 +298,15 @@ option(
> > >choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
> > >description : 'List of tools to build.',
> > >  )
> > > +option(
> > > +  'egl-lib-suffix',
> > > +  type : 'string',
> > > +  value : '',
> > > +  description : 'Suffix to append to EGL library name.  Default: none.'
> > > +)
> > > +option(
> > > +  'gles-lib-suffix',
> > > +  type : 'string',
> > > +  value : '',
> > > +  description : 'Suffix to append to GLES library names.  Default: none.'
> > > +)
> > > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
> > > index 086a4a1e63..c3aeeea007 100644
> > > --- a/src/egl/Makefile.am
> > > +++ b/src/egl/Makefile.am
> > > @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \
> > >
> > >  else # USE_LIBGLVND
> > >
> > > -lib_LTLIBRARIES = libEGL.la
> > > -libEGL_la_SOURCES =
> > > -libEGL_la_LIBADD = \
> > > +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la
> > > +libEGL@EGL_LIB_SUFFIX@_la_SOURCES =
> > > +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \
> > >   libEGL_common.la \
> > >   $(top_builddir)/src/mapi/shared-glapi/libglapi.la
> > > -libEGL_la_LDFLAGS = \
> > > +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \
> > >   -no-undefined \
> > >   -version-number 1:0 \
> > >   $(BSYMBOLIC) \
> > > diff --git a/src/egl/

Re: [Mesa-dev] [PATCH mesa 7/9] vulkan: Add EXT_acquire_xlib_display [v3]

2018-06-13 Thread Jason Ekstrand
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard  wrote:

> This extension adds the ability to borrow an X RandR output for
> temporary use directly by a Vulkan application. For DRM, we use the
> Linux resource leasing mechanism.
>
> v2:
> Clean up xlib_lease detection
>
> * Use separate temporary '_xlib_lease' variable to hold the
>   option value to avoid changing the type of a variable.
>
> * Use boolean expressions instead of additional if statements
>   to compute resulting with_xlib_lease value.
>
> * Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to
>   vulkan_wsi_args
>
>   Suggested-by: Eric Engestrom 
>
> Move mode list from wsi_display to wsi_display_connector
>
> Fix scope for wsi_display_mode and wsi_display_connector allocs
>
>   Suggested-by: Jason Ekstrand 
>
> v3:
> Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace
> between types and names. Wrap lines to 80 columns.
>
> Explicitly forbid multiple DRM leases. Making the code support
> this looks tricky and will require additional thought.
>
> Use xcb_randr_output_t throughout the internals of the
> implementation. Convert at the public API
> (wsi_get_randr_output_display).
>
> Clean up check for usable active_crtc (possible when only the
> desired output is connected to the crtc).
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
>
> fixup for acquire
>
> fixup for RROutput type
>
> Signed-off-by: Keith Packard 
>
> fixup
>

Lots of "fixup".  Did you mean to actually comment on what that was?


> ---
>  configure.ac|  32 ++
>  meson.build |  11 +
>  meson_options.txt   |   7 +
>  src/vulkan/Makefile.am  |   5 +
>  src/vulkan/wsi/meson.build  |   5 +
>  src/vulkan/wsi/wsi_common_display.c | 493 
>  src/vulkan/wsi/wsi_common_display.h |  17 +
>  7 files changed, 570 insertions(+)
>

[...]


> +static bool
> +wsi_display_mode_matches_x(struct wsi_display_mode *wsi,
> +   xcb_randr_mode_info_t *xcb)
> +{
> +   return wsi->clock == (xcb->dot_clock + 500) / 1000 &&
> +  wsi->hdisplay == xcb->width &&
> +  wsi->hsync_start == xcb->hsync_start &&
> +  wsi->hsync_end == xcb->hsync_end &&
> +  wsi->htotal == xcb->htotal &&
> +  wsi->hskew == xcb->hskew &&
> +  wsi->vdisplay == xcb->height &&
> +  wsi->vsync_start == xcb->vsync_start &&
> +  wsi->vsync_end == xcb->vsync_end &&
> +  wsi->vtotal == xcb->vtotal &&
>

You're not checking vscan here.


> +  wsi->flags == xcb->mode_flags;
> +}
>
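
As a minimal sketch of the clock comparison in the quoted function, assuming the wsi mode stores its clock in kHz while the RandR mode info reports dot_clock in Hz: the (dot_clock + 500) / 1000 expression rounds to the nearest kHz instead of truncating. The struct and function names below are hypothetical and reduced to three fields for illustration; they are not the types used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round a dot clock in Hz to the nearest kHz, mirroring the
 * (dot_clock + 500) / 1000 expression in the quoted comparison. */
static uint32_t hz_to_khz_rounded(uint32_t hz)
{
   return (hz + 500) / 1000;
}

/* Hypothetical, reduced mode records (illustration only). */
struct khz_mode { uint32_t clock_khz; uint16_t hdisplay, vdisplay; };
struct hz_mode  { uint32_t dot_clock_hz; uint16_t width, height; };

/* Two modes match when the rounded clock and the sizes agree. */
static bool modes_match(const struct khz_mode *a, const struct hz_mode *b)
{
   return a->clock_khz == hz_to_khz_rounded(b->dot_clock_hz) &&
          a->hdisplay == b->width &&
          a->vdisplay == b->height;
}
```

With truncation instead of rounding, a dot clock of 148499700 Hz would compare as 148499 kHz and fail to match a 148500 kHz mode.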

[...]


> +static struct wsi_display_connector *
> +wsi_display_get_output(struct wsi_device *wsi_device,
> +   xcb_connection_t *connection,
> +   xcb_randr_output_t output)
> +{
> +   struct wsi_display *wsi =
> +  (struct wsi_display *) wsi_device->wsi[VK_ICD_WSI_PLATFORM_DISPLAY];
> +   struct wsi_display_connector *connector;
> +   uint32_t connector_id;
> +
> +   xcb_window_t root = wsi_display_output_to_root(connection, output);
> +   if (!root)
> +  return NULL;
> +
> +   xcb_randr_get_screen_resources_cookie_t src =
> +  xcb_randr_get_screen_resources(connection, root);
> +   xcb_randr_get_output_info_cookie_t oic =
> +  xcb_randr_get_output_info(connection, output, XCB_CURRENT_TIME);
> +   xcb_randr_get_screen_resources_reply_t *srr =
> +  xcb_randr_get_screen_resources_reply(connection, src, NULL);
> +   xcb_randr_get_output_info_reply_t *oir =
> +  xcb_randr_get_output_info_reply(connection, oic, NULL);
>

Why are you fetching these here and not lower down?  The only use of them
inside the "if (!connector)" block is to free them.  Seems to be a bit of a
waste.


> +
> +   /* See if we already have a connector for this output */
> +   connector = wsi_display_find_output(wsi_device, output);
> +
> +   if (!connector) {
> +  xcb_atom_t connector_id_atom = 0;
> +
> +  /*
> +   * Go get the kernel connector ID for this X output
> +   */
> +  connector_id = wsi_display_output_to_connector_id(connection,
> +                                                    &connector_id_atom,
> +                                                    output);
> +
> +  /* Any X server with lease support will have this atom */
> +  if (!connector_id) {
> + free(oir);
> + free(srr);
> + return NULL;
> +  }
> +
> +  /* See if we already have a connector for this id */
> +  connector = wsi_display_find_connector(wsi_device, connector_id);
> +
> +  if (connector == NULL) {
> + connector = wsi_display_alloc_connector(wsi, connector_id);
> + if (!connector) {
> +free(oir);
> +free(srr);
> +return NULL;
> + }
> + li

Re: [Mesa-dev] [PATCH 1/4] nv50/ir: add preliminary support for OP_XMAD

2018-06-13 Thread Karol Herbst
On Thu, Jun 14, 2018 at 12:02 AM, Rhys Perry  wrote:
> Signed-off-by: Rhys Perry 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.cpp|  3 ++-
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h  | 14 
>  .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 12 +--
>  .../drivers/nouveau/codegen/nv50_ir_print.cpp  | 20 +
>  .../drivers/nouveau/codegen/nv50_ir_target.cpp |  7 +++---
>  .../nouveau/codegen/nv50_ir_target_gm107.cpp   |  1 +
>  .../nouveau/codegen/nv50_ir_target_nv50.cpp|  5 +++--
>  .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 25 
> --
>  8 files changed, 77 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> index 49425b98b9..99bf8de370 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp
> @@ -53,7 +53,8 @@ Modifier Modifier::operator*(const Modifier m) const
>b &= ~NV50_IR_MOD_NEG;
>
> a = (this->bits ^ b)  & (NV50_IR_MOD_NOT | NV50_IR_MOD_NEG);
> -   c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT);
> +   c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT |
> +NV50_IR_MOD_H1 | NV50_IR_MOD_SEXT);
>
> return Modifier(a | c);
>  }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index f4f3c70888..4deaf09989 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -58,6 +58,7 @@ enum operation
> OP_FMA,
> OP_SAD, // abs(src0 - src1) + src2
> OP_SHLADD,
> +   OP_XMAD, // extended multiply-add (GM107+), does a lot of things
> OP_ABS,
> OP_NEG,
> OP_NOT,
> @@ -251,6 +252,13 @@ enum operation
>  #define NV50_IR_SUBOP_VOTE_ALL 0
>  #define NV50_IR_SUBOP_VOTE_ANY 1
>  #define NV50_IR_SUBOP_VOTE_UNI 2
> +#define NV50_IR_SUBOP_XMAD_PSL (1 << 0)
> +#define NV50_IR_SUBOP_XMAD_MRG (1 << 1)
> +#define NV50_IR_SUBOP_XMAD_CLO (1 << 2)
> +#define NV50_IR_SUBOP_XMAD_CHI (2 << 2)
> +#define NV50_IR_SUBOP_XMAD_CSFU (3 << 2)
> +#define NV50_IR_SUBOP_XMAD_CBCC (4 << 2)
> +#define NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2)

Please document what all of those subops do here, or at least the ones you
know.

>
>  #define NV50_IR_SUBOP_MINMAX_LOW  1
>  #define NV50_IR_SUBOP_MINMAX_MED  2
> @@ -527,6 +535,9 @@ struct Storage
>  #define NV50_IR_MOD_SAT (1 << 2)
>  #define NV50_IR_MOD_NOT (1 << 3)
>  #define NV50_IR_MOD_NEG_ABS (NV50_IR_MOD_NEG | NV50_IR_MOD_ABS)
> +// modifiers only for XMAD
> +#define NV50_IR_MOD_H1   (1 << 4)
> +#define NV50_IR_MOD_SEXT (1 << 5)

same here

>
>  #define NV50_IR_INTERP_MODE_MASK   0x3
>  #define NV50_IR_INTERP_LINEAR  (0 << 0)
> @@ -556,11 +567,14 @@ public:
> inline Modifier operator&(const Modifier m) const { return bits & m.bits; }
> inline Modifier operator|(const Modifier m) const { return bits | m.bits; }
> inline Modifier operator^(const Modifier m) const { return bits ^ m.bits; }
> +   inline Modifier operator~() const { return ~bits; }
>
> operation getOp() const;
>
> inline int neg() const { return (bits & NV50_IR_MOD_NEG) ? 1 : 0; }
> inline int abs() const { return (bits & NV50_IR_MOD_ABS) ? 1 : 0; }
> +   inline int h1() const { return (bits & NV50_IR_MOD_H1) ? 1 : 0; }
> +   inline int sext() const { return (bits & NV50_IR_MOD_SEXT) ? 1 : 0; }
>
> inline operator bool() const { return bits ? true : false; }
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 4d0589214d..a43b481a01 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -191,9 +191,16 @@ void
>  LoadPropagation::checkSwapSrc01(Instruction *insn)
>  {
> const Target *targ = prog->getTarget();
> -   if (!targ->getOpInfo(insn).commutative)
> -  if (insn->op != OP_SET && insn->op != OP_SLCT && insn->op != OP_SUB)
> +   if (!targ->getOpInfo(insn).commutative) {
> +  if (insn->op != OP_SET && insn->op != OP_SLCT &&
> +  insn->op != OP_SUB && insn->op != OP_XMAD)
>   return;
> +  // XMAD is only commutative if both the CBCC and MRG flags are not set.
> +  if (insn->op == OP_XMAD && (insn->subOp & 0x1c) == NV50_IR_SUBOP_XMAD_CBCC)
> + return;
> +  if (insn->op == OP_XMAD && (insn->subOp & NV50_IR_SUBOP_XMAD_MRG))
> + return;
> +   }
> if (insn->src(1).getFile() != FILE_GPR)
>return;
> // This is the special OP_SET used for alphatesting, we can't reverse its
> @@ -488,6 +495,7 @@ Modifier::applyTo(ImmediateValue& imm) const
>   imm.reg.data.s32 = -imm.reg.data.s32;
>if (bits & NV50_IR_MOD_NOT)
>   imm.reg.data.s32 = ~imm.reg.d
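
A minimal sketch of the swap check in the quoted checkSwapSrc01 hunk, using only the subop values from the quoted nv50_ir.h hunk. It shows that the literal 0x1c in the patch is the same value as NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2). The helper name xmad_srcs_swappable is hypothetical, and the sketch only restates the condition from the patch; it makes no claim about what the individual subops do in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted nv50_ir.h hunk. */
#define NV50_IR_SUBOP_XMAD_PSL        (1 << 0)
#define NV50_IR_SUBOP_XMAD_MRG        (1 << 1)
#define NV50_IR_SUBOP_XMAD_CLO        (1 << 2)
#define NV50_IR_SUBOP_XMAD_CHI        (2 << 2)
#define NV50_IR_SUBOP_XMAD_CSFU       (3 << 2)
#define NV50_IR_SUBOP_XMAD_CBCC       (4 << 2)
#define NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2)

/* Hypothetical helper: true when src0 and src1 of an XMAD may be swapped,
 * i.e. the cmode field is not CBCC and the MRG flag is not set. */
static bool xmad_srcs_swappable(uint32_t subop)
{
   if ((subop & NV50_IR_SUBOP_XMAD_CMODE_MASK) == NV50_IR_SUBOP_XMAD_CBCC)
      return false;
   if (subop & NV50_IR_SUBOP_XMAD_MRG)
      return false;
   return true;
}
```

Spelling the mask by name rather than as 0x1c keeps the check in sync with the defines if the field layout ever changes.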

[Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()

2018-06-13 Thread Dave Airlie
From: Dave Airlie 

GLSL 4.60 officially added this but games and older CTS suites actually
had shaders that did this, so we may as well enable it everywhere.
---
 src/compiler/glsl/glsl_parser.yy | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 91c10ce1a60..432fc874268 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -2706,6 +2706,7 @@ external_declaration:
| declaration{ $$ = $1; }
| pragma_statement   { $$ = NULL; }
| layout_defaults{ $$ = $1; }
+   | ';' { $$ = NULL; }
;
 
 function_definition:
-- 
2.17.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()

2018-06-13 Thread Timothy Arceri



On 14/06/18 09:53, Dave Airlie wrote:

From: Dave Airlie 

GLSL 4.60 officially added this but games and older CTS suites actually
had shaders that did this, so we may as well enable it everywhere.
---
  src/compiler/glsl/glsl_parser.yy | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 91c10ce1a60..432fc874268 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -2706,6 +2706,7 @@ external_declaration:
 | declaration{ $$ = $1; }
 | pragma_statement   { $$ = NULL; }
 | layout_defaults{ $$ = $1; }
+   | ';' { $$ = NULL; }


Should the $$ stuff be aligned with above? Otherwise:

Acked-by: Timothy Arceri 


 ;
  
  function_definition:





Re: [Mesa-dev] [PATCH] radv: Fix output for sparse MRTs.

2018-06-13 Thread Dave Airlie
On 14 June 2018 at 07:35, Bas Nieuwenhuizen  wrote:
> We need to init the cb_shader_mask correctly with the changed
> col_format, so this moves the col_format adjustment to before the
> point where cb_shader_mask gets generated.
>
> Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
> CC: 18.1 

Reviewed-by: Dave Airlie 

> ---
>  src/amd/vulkan/radv_pipeline.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index b8b425aca9f..6eeedc65a39 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -524,20 +524,21 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
> col_format |= cf << (4 * i);
> }
>
> -   blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
> -
> -   if (blend->mrt0_is_dual_src)
> -   col_format |= (col_format & 0xf) << 4;
> -   blend->spi_shader_col_format = col_format;
> -
> /* If the i-th target format is set, all previous target formats must
>  * be non-zero to avoid hangs.
>  */
> -   num_targets = (util_last_bit(blend->spi_shader_col_format) + 3) / 4;
> +   num_targets = (util_last_bit(col_format) + 3) / 4;
> for (unsigned i = 0; i < num_targets; i++) {
> -   if (!(blend->spi_shader_col_format & (0xf << (i * 4))))
> -   blend->spi_shader_col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
> +   if (!(col_format & (0xf << (i * 4)))) {
> +   col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
> +   }
> }
> +
> +   blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
> +
> +   if (blend->mrt0_is_dual_src)
> +   col_format |= (col_format & 0xf) << 4;
> +   blend->spi_shader_col_format = col_format;
>  }
>
>  static bool
> --
> 2.17.0
>
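
A minimal sketch of the hole-filling step from the quoted patch, assuming each render target occupies a 4-bit field in col_format. FMT_32_R is a placeholder standing in for V_028714_SPI_SHADER_32_R (its numeric value here is an assumption, not taken from the thread), and last_bit is a local stand-in with the semantics the sketch assumes for util_last_bit: index of the highest set bit plus one, or 0 when no bit is set.

```c
#include <stdint.h>

/* Placeholder for V_028714_SPI_SHADER_32_R; the value is assumed. */
#define FMT_32_R 1u

/* Index of the highest set bit plus one, or 0 when no bit is set. */
static unsigned last_bit(uint32_t v)
{
   unsigned n = 0;
   while (v) {
      n++;
      v >>= 1;
   }
   return n;
}

/* Give every empty target below the highest used one a non-zero format. */
static uint32_t fill_sparse_col_format(uint32_t col_format)
{
   unsigned num_targets = (last_bit(col_format) + 3) / 4;

   for (unsigned i = 0; i < num_targets; i++) {
      if (!(col_format & (0xfu << (i * 4))))
         col_format |= FMT_32_R << (i * 4);
   }
   return col_format;
}
```

The point of the reordering in the patch is that anything derived from col_format, such as the shader mask, has to be computed from the value returned here rather than from the value before the holes were filled.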


Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function

2018-06-13 Thread Chema Casanova
On 13/06/18 22:46, Jason Ekstrand wrote:
> On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo wrote:
> 
> This new function takes care of shuffle/unshuffle components of a
> particular bit-size in components with a different bit-size.
> 
> If source type size is smaller than destination type size the operation
> needed is a component shuffle. The opposite case would be an unshuffle.
> 
> The operation allows to skip first_component number of components from
> the source.
> 
> Shuffle MOVs are retyped using integer types avoiding problems with
> denorms
> and float types. This allows to simplify uses of shuffle functions
> that are
> dealing with these retypes individually.
> 
> Now there is a new restriction so source and destination can not overlap
> anymore when calling this shuffle function. Following patches that
> migrate
> to use this new function will take care individually of avoiding source
> and destination overlaps.
> ---
>  src/intel/compiler/brw_fs_nir.cpp | 92 +++
>  1 file changed, 92 insertions(+)
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 166da0aa6d7..1a9d3c41d1d 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const
> fs_builder &bld,
>     }
>  }
> 
> +/*
> + * This helper takes a source register and un/shuffles it into the
> destination
> + * register.
> + *
> + * If source type size is smaller than destination type size the
> operation
> + * needed is a component shuffle. The opposite case would be an
> unshuffle. If
> + * source/destination type size is equal a shuffle is done that
> would be
> + * equivalent to a simple MOV.
> 
> 
> There's a sticky bit here if we want this to work with 64-bit types on
> gen7 and earlier because we only have DF there and not Q so the
> brw_reg_type_from_bit_size below doesn't work.  If we care about that
> case (and I'm not convinced we do), it should be easy enough to add a
> type_sz(src.type) == type_sz(dst.type) case which just does MOVs from
> source to dest.

At the moment, the current uses of this function either read from 32-bit
or write to 32-bit. But I think that for completeness it would be nice to
have all cases covered. Doing plain MOVs in the case of equal sizes (which
would be quite normal) saves us from doing the shuffle calculation for the
simple case. So I'm going for it.

> + *
> + * For example, if source is a 16-bit type and destination is
> 32-bit. A 3
> + * components .xyz 16-bit vector on SIMD8 would be.
> + *
> + *    |x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8|
> + *    |z1|z2|z3|z4|z5|z6|z7|z8|  |  |  |  |  |  |  |  |
> + *
> + * This helper will return the following 2 32-bit components with
> the 16-bit
> + * values shuffled:
> + *
> + *    |x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8|
> + *    |z1   |z2   |z3   |z4   |z5   |z6   |z7   |z8   |
> + *
> + * For unshuffle, the example would be the opposite, a 64-bit type
> source
> + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8
> + * would be:
> + *
> + *    | x1l   x1h | x2l   x2h | x3l   x3h | x4l   x4h |
> + *    | x5l   x5h | x6l   x6h | x7l   x7h | x8l   x8h |
> + *    | y1l   y1h | y2l   y2h | y3l   y3h | y4l   y4h |
> + *    | y5l   y5h | y6l   y6h | y7l   y7h | y8l   y8h |
> + *
> + * The returned result would be the following 4 32-bit components
> unshuffled:
> + *
> + *    | x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l |
> + *    | x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h |
> + *    | y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l |
> + *    | y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h |
> + *
> + * - Source and destination register must not be overlapped.
> + * - first_component parameter allows skipping source components.
> + */
> +void
> +shuffle_src_to_dst(const fs_builder &bld,
> +                   const fs_reg &dst,
> +                   const fs_reg &src,
> +                   uint32_t first_component,
> +                   uint32_t components)
> +{
> +   if (type_sz(src.type) <= type_sz(dst.type)) {
> +      /* Source is shuffled into destination */
> +      unsigned size_ratio = type_sz(dst.type) / type_sz(src.type);
> +#ifndef NDEBUG
> +      boolean src_dst_overlap = regions_overlap(dst,
> +         type_sz(dst.type) * bld.dispatch_width() * components,
> +         offset(src, bld, first_component * size_ratio),
> 
> 
> Why do you need to multiply first_component by size_ratio?  It's already
> in units of source components.

Yes, that's wro
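
A minimal sketch of the 16-bit to 32-bit shuffle layout described in the quoted comment, written over plain arrays rather than fs_reg. It assumes SIMD8, a planar source (all channels of x, then all channels of y, and so on), and that the earlier component of each pair lands in the low half of the 32-bit value; that last point is an assumption about the diagram, not something the thread states. The function name shuffle_16_to_32 is hypothetical.

```c
#include <stdint.h>

#define SIMD_WIDTH 8

/* Pack 16-bit components into 32-bit components, channel by channel.
 * src holds `components` planes of SIMD_WIDTH 16-bit values; dst must
 * hold (components + 1) / 2 planes of SIMD_WIDTH 32-bit values. */
static void shuffle_16_to_32(uint32_t *dst, const uint16_t *src,
                             unsigned components)
{
   unsigned dst_components = (components + 1) / 2;

   for (unsigned d = 0; d < dst_components; d++) {
      for (unsigned ch = 0; ch < SIMD_WIDTH; ch++) {
         uint32_t lo = src[(2 * d) * SIMD_WIDTH + ch];
         uint32_t hi = 0;

         /* An odd component count leaves the last high half empty. */
         if (2 * d + 1 < components)
            hi = src[(2 * d + 1) * SIMD_WIDTH + ch];

         dst[d * SIMD_WIDTH + ch] = lo | (hi << 16);
      }
   }
}
```

For a 3-component .xyz source this produces two 32-bit components: the first pairs x with y per channel, the second carries z with an unused high half, matching the two-row diagram in the comment.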

Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()

2018-06-13 Thread Matt Turner
On Wed, Jun 13, 2018 at 4:53 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> GLSL 4.60 officially added this but games and older CTS suites actually
> had shaders that did this, so we may as well enable it everywhere.
> ---
>  src/compiler/glsl/glsl_parser.yy | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/compiler/glsl/glsl_parser.yy 
> b/src/compiler/glsl/glsl_parser.yy
> index 91c10ce1a60..432fc874268 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -2706,6 +2706,7 @@ external_declaration:
> | declaration{ $$ = $1; }
> | pragma_statement   { $$ = NULL; }
> | layout_defaults{ $$ = $1; }
> +   | ';' { $$ = NULL; }

Indentation.

Also, piglit test?

With those,

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()

2018-06-13 Thread Dave Airlie
On 14 June 2018 at 10:12, Matt Turner  wrote:
> On Wed, Jun 13, 2018 at 4:53 PM, Dave Airlie  wrote:
>> From: Dave Airlie 
>>
>> GLSL 4.60 officially added this but games and older CTS suites actually
>> had shaders that did this, so we may as well enable it everywhere.
>> ---
>>  src/compiler/glsl/glsl_parser.yy | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/compiler/glsl/glsl_parser.yy 
>> b/src/compiler/glsl/glsl_parser.yy
>> index 91c10ce1a60..432fc874268 100644
>> --- a/src/compiler/glsl/glsl_parser.yy
>> +++ b/src/compiler/glsl/glsl_parser.yy
>> @@ -2706,6 +2706,7 @@ external_declaration:
>> | declaration{ $$ = $1; }
>> | pragma_statement   { $$ = NULL; }
>> | layout_defaults{ $$ = $1; }
>> +   | ';' { $$ = NULL; }
>
> Indentation.
>
> Also, piglit test?

There is already a piglit test; unfortunately it only runs under GLSL 4.60.

I suppose I can send a patch to enable it to run from GLSL 1.30.

>
> Reviewed-by: Matt Turner 
Thanks,
Dave.


Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function

2018-06-13 Thread Jason Ekstrand
On Wed, Jun 13, 2018 at 5:07 PM, Chema Casanova 
wrote:

> On 13/06/18 22:46, Jason Ekstrand wrote:
> > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo wrote:
> >
> > This new function takes care of shuffle/unshuffle components of a
> > particular bit-size in components with a different bit-size.
> >
> > If source type size is smaller than destination type size the
> operation
> > needed is a component shuffle. The opposite case would be an
> unshuffle.
> >
> > The operation allows to skip first_component number of components
> from
> > the source.
> >
> > Shuffle MOVs are retyped using integer types avoiding problems with
> > denorms
> > and float types. This allows to simplify uses of shuffle functions
> > that are
> > dealing with these retypes individually.
> >
> > Now there is a new restriction so source and destination can not
> overlap
> > anymore when calling this shuffle function. Following patches that
> > migrate
> > to use this new function will take care individually of avoiding
> source
> > and destination overlaps.
> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 92
> +++
> >  1 file changed, 92 insertions(+)
> >
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 166da0aa6d7..1a9d3c41d1d 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const
> > fs_builder &bld,
> > }
> >  }
> >
> > +/*
> > + * This helper takes a source register and un/shuffles it into the
> > destination
> > + * register.
> > + *
> > + * If source type size is smaller than destination type size the
> > operation
> > + * needed is a component shuffle. The opposite case would be an
> > unshuffle. If
> > + * source/destination type size is equal a shuffle is done that
> > would be
> > + * equivalent to a simple MOV.
> >
> >
> > There's a sticky bit here if we want this to work with 64-bit types on
> > gen7 and earlier because we only have DF there and not Q so the
> > brw_reg_type_from_bit_size below doesn't work.  If we care about that
> > case (and I'm not convinced we do), it should be easy enough to add a
> > type_sz(src.type) == type_sz(dst.type) case which just does MOVs from
> > source to dest.
>
> At the moment, the current uses of this function either read from 32-bit
> or write to 32-bit. But I think that for completeness it would be nice to
> have all cases covered. Doing plain MOVs in the case of equal sizes (which
> would be quite normal) saves us from doing the shuffle calculation for the
> simple case. So I'm going for it.
>
> > + *
> > + * For example, if source is a 16-bit type and destination is
> > 32-bit. A 3
> > + * components .xyz 16-bit vector on SIMD8 would be.
> > + *
> > + *|x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8|
> > + *|z1|z2|z3|z4|z5|z6|z7|z8|  |  |  |  |  |  |  |  |
> > + *
> > + * This helper will return the following 2 32-bit components with
> > the 16-bit
> > + * values shuffled:
> > + *
> > + *|x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8|
> > + *|z1   |z2   |z3   |z4   |z5   |z6   |z7   |z8   |
> > + *
> > + * For unshuffle, the example would be the opposite, a 64-bit type
> > source
> > + * and a 32-bit destination. A 2 component .xy 64-bit vector on
> SIMD8
> > + * would be:
> > + *
> > + *| x1l   x1h | x2l   x2h | x3l   x3h | x4l   x4h |
> > + *| x5l   x5h | x6l   x6h | x7l   x7h | x8l   x8h |
> > + *| y1l   y1h | y2l   y2h | y3l   y3h | y4l   y4h |
> > + *| y5l   y5h | y6l   y6h | y7l   y7h | y8l   y8h |
> > + *
> > + * The returned result would be the following 4 32-bit components
> > unshuffled:
> > + *
> > + *| x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l |
> > + *| x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h |
> > + *| y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l |
> > + *| y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h |
> > + *
> > + * - Source and destination register must not be overlapped.
> > + * - first_component parameter allows skipping source components.
> > + */
> > +void
> > +shuffle_src_to_dst(const fs_builder &bld,
> > +   const fs_reg &dst,
> > +   const fs_reg &src,
> > +   uint32_t first_component,
> > +   uint32_t components)
> > +{
> > +   if (type_sz(src.type) <= type_sz(dst.type)) {
> > +  /* Source is shuffled into destination */
> > +  unsigned size_ratio = type_sz(dst.type) / type_sz(src.type);
> > +#ifndef NDEBUG
> > +  boolean src_dst_overlap = 

[Mesa-dev] [PATCH v2 2/5] mesa/util: add allow_glsl_builtin_const_expression driconf override

2018-06-13 Thread Timothy Arceri
Google Earth VR shaders use builtins in constant expressions with
GLSL 1.10. That feature wasn't allowed until GLSL 1.20.
---
 src/compiler/glsl/ast_function.cpp  | 3 ++-
 src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 +
 src/gallium/include/state_tracker/st_api.h  | 1 +
 src/gallium/state_trackers/dri/dri_screen.c | 2 ++
 src/mesa/main/mtypes.h  | 6 ++
 src/mesa/state_tracker/st_extensions.c  | 3 +++
 src/util/xmlpool/t_options.h| 5 +
 7 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_function.cpp 
b/src/compiler/glsl/ast_function.cpp
index 22d58e48c64..127aa1f91c4 100644
--- a/src/compiler/glsl/ast_function.cpp
+++ b/src/compiler/glsl/ast_function.cpp
@@ -529,7 +529,8 @@ generate_call(exec_list *instructions, 
ir_function_signature *sig,
 * If the function call is a constant expression, don't generate any
 * instructions; just generate an ir_constant.
 */
-   if (state->is_version(120, 100)) {
+   if (state->is_version(120, 100) ||
+   state->ctx->Const.AllowGLSLBuiltinConstantExpression) {
   ir_constant *value = sig->constant_expression_value(ctx,
   actual_parameters,
   NULL);
diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h 
b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
index 21dc599dc26..f25f2080080 100644
--- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
+++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
@@ -23,6 +23,7 @@ DRI_CONF_SECTION_DEBUG
DRI_CONF_DISABLE_SHADER_BIT_ENCODING("false")
DRI_CONF_FORCE_GLSL_VERSION(0)
DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false")
+   DRI_CONF_ALLOW_GLSL_BUILTIN_CONST_EXPRESSION("false")
DRI_CONF_ALLOW_GLSL_BUILTIN_VARIABLE_REDECLARATION("false")
DRI_CONF_ALLOW_GLSL_CROSS_STAGE_INTERPOLATION_MISMATCH("false")
DRI_CONF_ALLOW_HIGHER_COMPAT_VERSION("false")
diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index ec6e7844b87..1efc7f081d1 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -222,6 +222,7 @@ struct st_config_options
boolean force_glsl_extensions_warn;
unsigned force_glsl_version;
boolean allow_glsl_extension_directive_midshader;
+   boolean allow_glsl_builtin_const_expression;
boolean allow_glsl_builtin_variable_redeclaration;
boolean allow_higher_compat_version;
boolean glsl_zero_init;
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index aaee9870776..a86b7519364 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -74,6 +74,8 @@ dri_fill_st_options(struct dri_screen *screen)
   driQueryOptioni(optionCache, "force_glsl_version");
options->allow_glsl_extension_directive_midshader =
   driQueryOptionb(optionCache, "allow_glsl_extension_directive_midshader");
+   options->allow_glsl_builtin_const_expression =
+  driQueryOptionb(optionCache, "allow_glsl_builtin_const_expression");
options->allow_glsl_builtin_variable_redeclaration =
   driQueryOptionb(optionCache, 
"allow_glsl_builtin_variable_redeclaration");
options->allow_higher_compat_version =
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 482c42a4b2d..41ad783d4b1 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3716,6 +3716,12 @@ struct gl_constants
 */
GLboolean AllowGLSLExtensionDirectiveMidShader;
 
+   /**
+    * Allow builtins as part of constant expressions. This was not allowed
+    * until GLSL 1.20; this option allows it everywhere.
+*/
+   GLboolean AllowGLSLBuiltinConstantExpression;
+
/**
 * Allow GLSL built-in variables to be redeclared verbatim
 */
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 467d9b07596..7f44b4a80c0 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1133,6 +1133,9 @@ void st_init_extensions(struct pipe_screen *screen,
if (options->allow_glsl_extension_directive_midshader)
   consts->AllowGLSLExtensionDirectiveMidShader = GL_TRUE;
 
+   if (options->allow_glsl_builtin_const_expression)
+  consts->AllowGLSLBuiltinConstantExpression = GL_TRUE;
+
consts->MinMapBufferAlignment =
   screen->get_param(screen, PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT);
 
diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
index 3ada813d639..1a4945d6888 100644
--- a/src/util/xmlpool/t_options.h
+++ b/src/util/xmlpool/t_options.h
@@ -115,6 +115,11 @@ 
DRI_CONF_OPT_BEGIN_B(allow_glsl_extension_directive_midshader, def) \
 DRI_CONF_DESC(en,gettext("Allow GLSL #extension directives

[Mesa-dev] [PATCH v2 1/5] util: manually extract the program name from program_invocation_name

2018-06-13 Thread Timothy Arceri
Glibc has the same code to get program_invocation_short_name. However
for some reason the short name gets mangled for some wine apps.

For example with Google Earth VR I get:

program_invocation_name:
"/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"

program_invocation_short_name:
"e"
---
 src/util/xmlconfig.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/util/xmlconfig.c b/src/util/xmlconfig.c
index 60a6331c86c..ad943e2ce48 100644
--- a/src/util/xmlconfig.c
+++ b/src/util/xmlconfig.c
@@ -45,7 +45,16 @@
 /* These aren't declared in any libc5 header */
 extern char *program_invocation_name, *program_invocation_short_name;
 #endif
-#define GET_PROGRAM_NAME() program_invocation_short_name
+static const char *
+__getProgramName()
+{
+char * arg = strrchr(program_invocation_name, '/');
+if (arg)
+return arg+1;
+else
+return program_invocation_name;
+}
+#define GET_PROGRAM_NAME() __getProgramName()
 #elif defined(__CYGWIN__)
 #define GET_PROGRAM_NAME() program_invocation_short_name
 #elif defined(__FreeBSD__) && (__FreeBSD__ >= 2)
-- 
2.17.1
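
The basename extraction described in the commit message can be sketched in isolation as follows. This is a minimal standalone illustration, not the Mesa code itself: the name short_program_name and its parameter are hypothetical, and the path is passed in as an argument rather than read from glibc's program_invocation_name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the Mesa function): return the part of `path`
 * that follows the last '/', or `path` itself when it contains no '/'. */
static const char *
short_program_name(const char *path)
{
   assert(path != NULL);
   const char *slash = strrchr(path, '/');
   return slash ? slash + 1 : path;
}
```

Called with the Google Earth VR path quoted above, this returns "Earth.exe"; a name without any '/' is returned unchanged, and a path ending in '/' yields an empty string.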



[Mesa-dev] [PATCH v2 5/5] util: add allow_glsl_relaxed_es to drirc for Google Earth VR

2018-06-13 Thread Timothy Arceri
---
 src/util/drirc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/util/drirc b/src/util/drirc
index ff706d16001..7f91035ae8b 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -178,6 +178,7 @@ TODO: document the other workarounds.
 
 
 
+
 
 
 

[Mesa-dev] [PATCH v2 4/5] mesa/util: add allow_glsl_relaxed_es driconfig override

2018-06-13 Thread Timothy Arceri
This relaxes a number of ES shader restrictions allowing shaders
to follow more desktop GLSL like rules.

This initial implementation relaxes the following:

 - allows linking ES shaders with desktop shaders
 - allows mismatching precision qualifiers
 - always enables standard derivative builtins

These relaxations allow Google Earth VR shaders to compile.
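
The three relaxations listed above share one pattern: an existing ES-only check is skipped, or an ES-only feature is enabled, when the override is set. A minimal sketch of that gating, assuming simplified boolean inputs in place of Mesa's context and program structures (all names below are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for linker state; not Mesa's real structures. */
struct link_check {
   bool allow_relaxed_es;   /* models ctx->Const.AllowGLSLRelaxedES */
   bool first_shader_is_es; /* models prog->Shaders[0]->IsES */
   bool this_shader_is_es;  /* models prog->Shaders[i]->IsES */
};

/* True when mixing ES and desktop shaders should be a link error. */
static bool
es_desktop_mix_is_error(const struct link_check *c)
{
   assert(c);
   return !c->allow_relaxed_es &&
          c->this_shader_is_es != c->first_shader_is_es;
}

/* True when the ES precision-qualifier match should be checked at all. */
static bool
precision_check_applies(bool allow_relaxed_es, bool prog_is_es)
{
   return !allow_relaxed_es && prog_is_es;
}

/* True when the standard derivative builtins are available. */
static bool
derivatives_available(bool is_fragment, bool version_has_them,
                      bool oes_ext_enabled, bool allow_relaxed_es)
{
   return is_fragment &&
          (version_has_them || oes_ext_enabled || allow_relaxed_es);
}
```

With the override off, the sketch behaves like the checks before the patch; with it on, the first two predicates never report an error and the derivatives are available in any fragment shader.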
---
 src/compiler/glsl/builtin_functions.cpp   |  3 ++-
 src/compiler/glsl/linker.cpp  | 22 +++
 .../auxiliary/pipe-loader/driinfo_gallium.h   |  1 +
 src/gallium/include/state_tracker/st_api.h|  1 +
 src/gallium/state_trackers/dri/dri_screen.c   |  2 ++
 src/mesa/main/mtypes.h|  6 +
 src/mesa/state_tracker/st_extensions.c|  3 +++
 src/util/xmlpool/t_options.h  |  5 +
 8 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index efe90346d0e..7119903795f 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -446,7 +446,8 @@ fs_oes_derivatives(const _mesa_glsl_parse_state *state)
 {
return state->stage == MESA_SHADER_FRAGMENT &&
   (state->is_version(110, 300) ||
-   state->OES_standard_derivatives_enable);
+   state->OES_standard_derivatives_enable ||
+   state->ctx->Const.AllowGLSLRelaxedES);
 }
 
 static bool
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index e4bf634abe8..487a1ffcb05 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -894,7 +894,7 @@ validate_intrastage_arrays(struct gl_shader_program *prog,
  * Perform validation of global variables used across multiple shaders
  */
 static void
-cross_validate_globals(struct gl_shader_program *prog,
+cross_validate_globals(struct gl_context *ctx, struct gl_shader_program *prog,
struct exec_list *ir, glsl_symbol_table *variables,
bool uniforms_only)
 {
@@ -1115,7 +1115,8 @@ cross_validate_globals(struct gl_shader_program *prog,
  /* Check the precision qualifier matches for uniform variables on
   * GLSL ES.
   */
- if (prog->IsES && !var->get_interface_type() &&
+ if (!ctx->Const.AllowGLSLRelaxedES &&
+ prog->IsES && !var->get_interface_type() &&
  existing->data.precision != var->data.precision) {
 if ((existing->data.used && var->data.used) || prog->data->Version >= 300) {
linker_error(prog, "declarations for %s `%s` have "
@@ -1168,15 +1169,16 @@ cross_validate_globals(struct gl_shader_program *prog,
  * Perform validation of uniforms used across multiple shader stages
  */
 static void
-cross_validate_uniforms(struct gl_shader_program *prog)
+cross_validate_uniforms(struct gl_context *ctx,
+struct gl_shader_program *prog)
 {
glsl_symbol_table variables;
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] == NULL)
  continue;
 
-  cross_validate_globals(prog, prog->_LinkedShaders[i]->ir, &variables,
- true);
+  cross_validate_globals(ctx, prog, prog->_LinkedShaders[i]->ir,
+ &variables, true);
}
 }
 
@@ -2210,7 +2212,8 @@ link_intrastage_shaders(void *mem_ctx,
for (unsigned i = 0; i < num_shaders; i++) {
   if (shader_list[i] == NULL)
  continue;
-  cross_validate_globals(prog, shader_list[i]->ir, &variables, false);
+  cross_validate_globals(ctx, prog, shader_list[i]->ir, &variables,
+ false);
}
 
if (!prog->data->LinkStatus)
@@ -4807,7 +4810,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   min_version = MIN2(min_version, prog->Shaders[i]->Version);
   max_version = MAX2(max_version, prog->Shaders[i]->Version);
 
-  if (prog->Shaders[i]->IsES != prog->Shaders[0]->IsES) {
+  if (!ctx->Const.AllowGLSLRelaxedES &&
+  prog->Shaders[i]->IsES != prog->Shaders[0]->IsES) {
  linker_error(prog, "all shaders must use same shading "
   "language version\n");
  goto done;
@@ -4825,7 +4829,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
/* In desktop GLSL, different shader versions may be linked together.  In
 * GLSL ES, all shader versions must be the same.
 */
-   if (prog->Shaders[0]->IsES && min_version != max_version) {
+   if (!ctx->Const.AllowGLSLRelaxedES && prog->Shaders[0]->IsES && min_version != max_version) {
   linker_error(prog, "all shaders must use same shading "
"language version\n");
   goto done;
@@ -4951,7 +4955,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 * performed, then locations are assigned for uniforms, attributes, and
 * varyings.
 */
-   cross_validat

[Mesa-dev] [PATCH v2 3/5] util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR

2018-06-13 Thread Timothy Arceri
---
 src/util/drirc | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/util/drirc b/src/util/drirc
index c76f1ca4380..ff706d16001 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -176,6 +176,10 @@ TODO: document the other workarounds.
 
 
 
+
+
+
+
 
 
-- 
2.17.1



Re: [Mesa-dev] [PATCH 02/14] intel/compiler: new shuffle_for_32bit_write and shuffle_from_32bit_read

2018-06-13 Thread Jason Ekstrand
On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> These new shuffle functions deal with the shuffle/unshuffle operations
> needed for read/write operations using 32-bit components when the
> read/written components have a different bit-size (8, 16, 64-bits).
> Shuffle from 32-bit to 32-bit becomes a simple MOV.
>
> As the new function shuffle_src_to_dst takes of doing a shuffle or an
> unshuffle based on the different type_sz of source an destination this
> generic functions work with any source/destination assuming that writes
> use a 32-bit destination or reads use a 32-bit source.
>

I'm having a lot of trouble understanding this paragraph.  Would you mind
rephrasing it?


> To enable these new functions, there must be no
> source/destination overlap in the case of shuffle_from_32bit_read.
> That never happens in shuffle_for_32bit_write, because it allocates a new
> destination register, as was done in shuffle_64bit_data_for_32bit_write.
> ---
>  src/intel/compiler/brw_fs.h   | 11 +
>  src/intel/compiler/brw_fs_nir.cpp | 38 +++
>  2 files changed, 49 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index faf51568637..779170ecc95 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -519,6 +519,17 @@ void shuffle_16bit_data_for_32bit_write(const
> brw::fs_builder &bld,
>  const fs_reg &src,
>  uint32_t components);
>
> +void shuffle_from_32bit_read(const brw::fs_builder &bld,
> + const fs_reg &dst,
> + const fs_reg &src,
> + uint32_t first_component,
> + uint32_t components);
> +
> +fs_reg shuffle_for_32bit_write(const brw::fs_builder &bld,
> +   const fs_reg &src,
> +   uint32_t first_component,
> +   uint32_t components);
> +
>  fs_reg setup_imm_df(const brw::fs_builder &bld,
>  double v);
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index 1a9d3c41d1d..1f684149fd5 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -5454,6 +5454,44 @@ shuffle_src_to_dst(const fs_builder &bld,
> }
>  }
>
> +void
> +shuffle_from_32bit_read(const fs_builder &bld,
> +const fs_reg &dst,
> +const fs_reg &src,
> +uint32_t first_component,
> +uint32_t components)
> +{
> +   assert(type_sz(src.type) == 4);
> +
>

/* This function takes components in units of the destination type while
shuffle_src_to_dst takes components in units of the smallest type */


> +   if (type_sz(dst.type) > 4) {
> +  assert(type_sz(dst.type) == 8);
> +  first_component *= 2;
> +  components *= 2;
> +   }
> +
> +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> +}
> +
> +fs_reg
> +shuffle_for_32bit_write(const fs_builder &bld,
> +const fs_reg &src,
> +uint32_t first_component,
> +uint32_t components)
> +{
> +   fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_D,
> + DIV_ROUND_UP (components * type_sz(src.type),
> 4));
> +
>

/* This function takes components in units of the source type while
shuffle_src_to_dst takes components in units of the smallest type */

With those added and the commit message re-worded a bit,

Reviewed-by: Jason Ekstrand 


> +   if (type_sz(src.type) > 4) {
> +  assert(type_sz(src.type) == 8);
> +  first_component *= 2;
> +  components *= 2;
> +   }
> +
> +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> +
> +   return dst;
> +}
> +
>  fs_reg
>  setup_imm_df(const fs_builder &bld, double v)
>  {
> --
> 2.17.1
>
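
The unit conversion that the suggested comments describe can be sketched on its own. The snippet below is only an illustration under stated assumptions: component_range and to_32bit_units are hypothetical names, and type sizes are given in bytes, as type_sz() reports them in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration; not part of the patch under review. */
struct component_range {
   uint32_t first;
   uint32_t count;
};

/* Convert a range counted in `type_size`-byte components into the units
 * used when the other side of the shuffle is 32-bit. For 64-bit data each
 * component spans two 32-bit components, so both values double. For 8-,
 * 16- and 32-bit data the range is already in units of the smaller type
 * and stays unchanged. */
static struct component_range
to_32bit_units(uint32_t type_size, uint32_t first, uint32_t count)
{
   struct component_range r = { first, count };
   assert(type_size == 1 || type_size == 2 ||
          type_size == 4 || type_size == 8);
   if (type_size > 4) {
      r.first *= 2;
      r.count *= 2;
   }
   return r;
}
```

For example, two 64-bit components starting at component 1 become four 32-bit components starting at component 2, while three 16-bit components starting at component 1 are passed through as they are.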


Re: [Mesa-dev] [PATCH 05/14] intel/compiler: Use shuffle_from_32bit_write for 16-bits store_ssbo

2018-06-13 Thread Jason Ekstrand
s/from/for/ in the commit message.

On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <
jmcasan...@igalia.com> wrote:

> ---
>  src/intel/compiler/brw_fs_nir.cpp | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp
> b/src/intel/compiler/brw_fs_nir.cpp
> index ef7895262b8..a54935f7049 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4297,11 +4297,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
> &bld, nir_intrinsic_instr *instr
>   * aligned. Shuffling only one component would be the same as
>   * striding it.
>   */
> -fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_D,
> -  DIV_ROUND_UP(num_components, 2));
> -shuffle_16bit_data_for_32bit_write(bld, tmp, write_src,
> -   num_components);
> -write_src = tmp;
> +write_src = shuffle_for_32bit_write(bld, write_src, 0,
> +num_components);
>   }
>
>   fs_reg offset_reg;
> --
> 2.17.1
>
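
The register size removed in this hunk, DIV_ROUND_UP(num_components, 2), and the size that shuffle_for_32bit_write computes in patch 02, DIV_ROUND_UP(components * type_sz(src.type), 4), agree for 16-bit data. A small sketch of that arithmetic, assuming a locally defined DIV_ROUND_UP macro and a hypothetical helper name dwords_for_write:

```c
#include <assert.h>
#include <stdint.h>

/* Local definition for this sketch only. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of 32-bit slots needed to hold `components` values of
 * `type_size` bytes each (hypothetical helper name). */
static uint32_t
dwords_for_write(uint32_t components, uint32_t type_size)
{
   assert(type_size > 0);
   return DIV_ROUND_UP(components * type_size, 4);
}
```

With type_size equal to 2, the helper gives the same result as DIV_ROUND_UP(components, 2) for every component count from 1 to 4, so the generic sizing covers the 16-bit case that the removed lines handled by hand.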

