Re: [Mesa-dev] [PATCH 1/2] radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs

2017-09-06 Thread Axel Davy

Hi,

On 07/09/2017 00:35, Marek Olšák wrote:

+   /* Out-of-order rasterization can be enabled for these cases:
+    *
+    * - color-only rendering:
+    *   + blending must be enabled and commutative
+    *   + only when inexact behavior due to rounding is allowed
+    *
+    * - depth-only rendering:
+    *   + depth must force ordering



Why must depth force ordering in depth-only rendering? If the depth 
func is PIPE_FUNC_LESS, PIPE_FUNC_GREATER or similar, you get min or max 
behaviour, thus the order shouldn't matter.
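
The min/max argument can be illustrated with a small sketch. It assumes a depth-only pass with depth writes enabled and a strict less-than comparison standing in for PIPE_FUNC_LESS; the names `depth_test_less` and `resolve_depth` are hypothetical and not part of Mesa.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PIPE_FUNC_LESS depth test with depth writes
 * enabled: the incoming fragment replaces the stored value only if it is
 * closer. */
static float depth_test_less(float stored, float incoming)
{
   return incoming < stored ? incoming : stored;
}

/* Apply a sequence of fragment depths to one pixel, in the given order. */
static float resolve_depth(float clear_value, const float *frags, size_t count)
{
   float z = clear_value;
   for (size_t i = 0; i < count; i++)
      z = depth_test_less(z, frags[i]);
   return z;
}
```

Feeding the same fragments in any order leaves the minimum in the depth buffer, which is the point of the question. Whether the hardware needs additional conditions beyond this is exactly what is being asked.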



Yours,


Axel Davy

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102571] vulkaninfo fails with "trap divide error"

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102571

Vinson Lee  changed:

   What|Removed |Added

   Keywords||bisected, have-backtrace,
   ||regression
 CC||airl...@freedesktop.org,
   ||b...@basnieuwenhuizen.nl

--- Comment #1 from Vinson Lee  ---
commit 180c1b924e1ed3a2918fad9c5cbb653524de8233
Author: Bas Nieuwenhuizen 
Date:   Wed Aug 16 09:09:56 2017 +0200

ac/nir: Add shader support for multiviews.

It uses a user SGPR to pass the view index to the shaders, except
for the fragment shader where we use layer=view (which comes in
handy when we want to do the NV ext that allows us to execute pre-FS
stages once instead of per view).

Reviewed-by: Dave Airlie 

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] mesa/main: Fix GetTransformFeedbacki64 for glTransformFeedbackBufferBase

2017-09-06 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Tue, 2017-09-05 at 14:41 +0200, Iago Toral Quiroga wrote:
> The spec has special rules for querying buffer offsets and sizes
> when BindBufferBase is used, described  in the OpenGL 4.6 spec,
> section 6.8 Buffer Object State:
> 
>    "To query the starting offset or size of the range of a buffer
> object binding in an indexed array, call GetInteger64i_v with
> target set to respectively the starting offset or binding size
> name from table 6.5 for that array. Index must be in the range
> zero to the number of bind points supported minus one. If the
> starting offset or size was not specified when the buffer object
> was bound (e.g. if it was bound with BindBufferBase), or if no
> buffer object is bound to the target array at index, zero is
> returned."
> 
> Transform feedback buffer queries should follow the same rules, since
> it is the same case for them. There is a CTS test for this.
> 
> Fixes:
> KHR-GL45.direct_state_access.xfb_buffers
> ---
>  src/mesa/main/transformfeedback.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/src/mesa/main/transformfeedback.c b/src/mesa/main/transformfeedback.c
> index befc7c..a5ea2a5eb7 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -1402,12 +1402,34 @@ _mesa_GetTransformFeedbacki64_v(GLuint xfb, GLenum pname, GLuint index,
>    return;
> }
>  
> +   /**
> +    * This follows the same general rules used for BindBufferBase:
> +    *
> +    *   "To query the starting offset or size of the range of a buffer
> +    *    object binding in an indexed array, call GetInteger64i_v with
> +    *    target set to respectively the starting offset or binding size
> +    *    name from table 6.5 for that array. Index must be in the range
> +    *    zero to the number of bind points supported minus one. If the
> +    *    starting offset or size was not specified when the buffer object
> +    *    was bound (e.g. if it was bound with BindBufferBase), or if no
> +    *    buffer object is bound to the target array at index, zero is
> +    *    returned."
> +    */
> +   if (obj->RequestedSize[index] == 0 &&
> +   (pname == GL_TRANSFORM_FEEDBACK_BUFFER_START ||
> +pname == GL_TRANSFORM_FEEDBACK_BUFFER_SIZE)) {
> +  *param = 0;
> +  return;
> +   }
> +
> compute_transform_feedback_buffer_sizes(obj);
> switch(pname) {
> case GL_TRANSFORM_FEEDBACK_BUFFER_START:
> +  assert(obj->RequestedSize[index] > 0);
>    *param = obj->Offset[index];
>    break;
> case GL_TRANSFORM_FEEDBACK_BUFFER_SIZE:
> +  assert(obj->RequestedSize[index] > 0);
>    *param = obj->Size[index];
>    break;
> default:
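
The query rule quoted from the spec can be sketched as follows. This is a simplified illustration, not Mesa's code: `struct xfb_binding`, `enum xfb_query` and `query_xfb_binding` are hypothetical names, and a requested size of zero is assumed to mean "bound without a range", as in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified binding record (not Mesa's
 * gl_transform_feedback_object). */
struct xfb_binding {
   bool    has_buffer;      /* a buffer object is bound at this index */
   int64_t requested_size;  /* 0 when bound without a range (BindBufferBase) */
   int64_t offset;
   int64_t size;
};

enum xfb_query { XFB_QUERY_START, XFB_QUERY_SIZE };

static int64_t query_xfb_binding(const struct xfb_binding *b, enum xfb_query what)
{
   /* No buffer bound, or bound without an explicit range: the quoted spec
    * text says zero is returned for both start and size. */
   if (!b->has_buffer || b->requested_size == 0)
      return 0;
   return what == XFB_QUERY_START ? b->offset : b->size;
}
```

Only a binding made with an explicit range reports its real offset and size; everything else reports zero, even if the underlying buffer is large.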



[Mesa-dev] [Bug 102571] vulkaninfo fails with "trap divide error"

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102571

Bug ID: 102571
   Summary: vulkaninfo fails with "trap divide error"
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: danielr...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 134033
  --> https://bugs.freedesktop.org/attachment.cgi?id=134033&action=edit
backtrace of vulkaninfo crash

While testing vulkan support for my 390x on git master, I ran across the
following issue.

Running vulkaninfo crashes and this line shows up in the journal:
traps: vulkaninfo[30227] trap divide error ip:7f9777251563 sp:7fff7aa100e8
error:0 in libvulkan_radeon.so[7f97771ed000+1a5000]

I also ran git-bisect, which indicates that this issue was caused by the commit
180c1b924e1ed3a2918fad9c5cbb653524de8233

Attached is a backtrace of when it crashes, which is in
radv_pipeline_scratch_init of src/vulkan/radv_pipeline.c. The only divide by
zero error that looks possible in that function is on line 763 if
pipeline->shaders[i]->config.num_vgprs is zero.
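
The kind of guard that would avoid such a trap can be sketched as below. This is not the actual radv code or its eventual fix: `waves_for_vgprs` and its parameters are hypothetical, and returning the cap when a shader reports zero VGPRs is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: how many waves fit in a register file of
 * `total_vgprs` registers when each wave needs `num_vgprs`, capped at
 * `max_waves`. A shader reporting zero VGPRs would otherwise divide by
 * zero. */
static uint32_t waves_for_vgprs(uint32_t total_vgprs, uint32_t num_vgprs,
                                uint32_t max_waves)
{
   if (num_vgprs == 0)
      return max_waves;   /* no register pressure: only the cap applies */

   uint32_t waves = total_vgprs / num_vgprs;
   return waves < max_waves ? waves : max_waves;
}
```

With the zero check in place, a stage that uses no vector registers simply does not constrain the wave count.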

Additional information:
Running 4.13.0 kernel on NixOS

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] radv/gfx9: allocate events from uncached VA space

2017-09-06 Thread Dave Airlie
From: Dave Airlie 

This copies what amdgpu-pro does, and allocates the memory
for an event with an uncached mtype.

This fixes hangs with:
dEQP-VK.api.command_buffers.record_simul_use_primary

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c  | 2 +-
 src/amd/vulkan/radv_radeon_winsys.h   | 3 ++-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 9 -
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 0b25469..12f6fe6 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2793,7 +2793,7 @@ VkResult radv_CreateEvent(
 
event->bo = device->ws->buffer_create(device->ws, 8, 8,
  RADEON_DOMAIN_GTT,
- RADEON_FLAG_CPU_ACCESS);
+ RADEON_FLAG_VA_UNCACHED | RADEON_FLAG_CPU_ACCESS);
if (!event->bo) {
vk_free2(&device->alloc, pAllocator, event);
return VK_ERROR_OUT_OF_DEVICE_MEMORY;
diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h
index 8e2ba74..a9c1f54 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -51,7 +51,8 @@ enum radeon_bo_flag { /* bitfield */
RADEON_FLAG_GTT_WC =(1 << 0),
RADEON_FLAG_CPU_ACCESS =(1 << 1),
RADEON_FLAG_NO_CPU_ACCESS = (1 << 2),
-   RADEON_FLAG_VIRTUAL =   (1 << 3)
+   RADEON_FLAG_VIRTUAL =   (1 << 3),
+   RADEON_FLAG_VA_UNCACHED =   (1 << 4),
 };
 
 enum radeon_bo_usage { /* bitfield */
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
index 75444d5..0af5a39 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c
@@ -323,7 +323,14 @@ radv_amdgpu_winsys_bo_create(struct radeon_winsys *_ws,
goto error_bo_alloc;
}
 
-   r = amdgpu_bo_va_op(buf_handle, 0, size, va, 0, AMDGPU_VA_OP_MAP);
+   uint32_t raw_flags = AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE |
+   AMDGPU_VM_PAGE_EXECUTABLE;
+   if (flags & RADEON_FLAG_VA_UNCACHED)
+   raw_flags |= AMDGPU_VM_MTYPE_UC;
+
+   size = ALIGN(size, getpagesize());
+
+   r = amdgpu_bo_va_op_raw(ws->dev, buf_handle, 0, size, va, raw_flags, AMDGPU_VA_OP_MAP);
if (r)
goto error_va_map;
 
-- 
2.9.3
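
The patch above rounds the mapping size up to the page size before calling the raw VA map operation. A minimal sketch of that rounding follows; `align_up` is a hypothetical helper written for illustration, and Mesa's own ALIGN macro may be defined differently.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two, which holds for page sizes. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For the 8-byte event buffer and an assumed 4096-byte page, the mapped size becomes one full page.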



[Mesa-dev] [PATCH] radv: use simpler indirect packet 3 if possible.

2017-09-06 Thread Dave Airlie
From: Dave Airlie 

This fixes some observed hangs on CIK GPUs.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_cmd_buffer.c | 37 +++--
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index b372123..bc4aeb3 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2834,20 +2834,29 @@ radv_cs_emit_indirect_draw_packet(struct radv_cmd_buffer *cmd_buffer,
uint32_t base_reg = cmd_buffer->state.pipeline->graphics.vtx_base_sgpr;
assert(base_reg);
 
-   radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT_MULTI :
-  PKT3_DRAW_INDIRECT_MULTI,
-8, false));
-   radeon_emit(cs, 0);
-   radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
-   radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
-   radeon_emit(cs, (((base_reg + 8) - SI_SH_REG_OFFSET) >> 2) |
-   S_2C3_DRAW_INDEX_ENABLE(draw_id_enable) |
-   S_2C3_COUNT_INDIRECT_ENABLE(!!count_va));
-   radeon_emit(cs, draw_count); /* count */
-   radeon_emit(cs, count_va); /* count_addr */
-   radeon_emit(cs, count_va >> 32);
-   radeon_emit(cs, stride); /* stride */
-   radeon_emit(cs, di_src_sel);
+   if (draw_count == 1 && !count_va && !draw_id_enable) {
+   radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT :
+PKT3_DRAW_INDIRECT, 3, false));
+   radeon_emit(cs, 0);
+   radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, di_src_sel);
+   } else {
+   radeon_emit(cs, PKT3(indexed ? PKT3_DRAW_INDEX_INDIRECT_MULTI :
+PKT3_DRAW_INDIRECT_MULTI,
+8, false));
+   radeon_emit(cs, 0);
+   radeon_emit(cs, (base_reg - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, ((base_reg + 4) - SI_SH_REG_OFFSET) >> 2);
+   radeon_emit(cs, (((base_reg + 8) - SI_SH_REG_OFFSET) >> 2) |
+   S_2C3_DRAW_INDEX_ENABLE(draw_id_enable) |
+   S_2C3_COUNT_INDIRECT_ENABLE(!!count_va));
+   radeon_emit(cs, draw_count); /* count */
+   radeon_emit(cs, count_va); /* count_addr */
+   radeon_emit(cs, count_va >> 32);
+   radeon_emit(cs, stride); /* stride */
+   radeon_emit(cs, di_src_sel);
+   }
 }
 
 static void
-- 
2.9.4
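
The condition the patch uses to pick the shorter packet can be isolated as a predicate. The function name `can_use_simple_indirect` is hypothetical; the three conditions are taken directly from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate mirroring the condition in the patch: the
 * single-draw packet is only chosen when there is exactly one draw, no
 * GPU-side draw count address, and no draw ID is needed. */
static bool can_use_simple_indirect(uint32_t draw_count, uint64_t count_va,
                                    bool draw_id_enable)
{
   return draw_count == 1 && count_va == 0 && !draw_id_enable;
}
```

Every other combination falls back to the MULTI variant of the packet, as before the patch.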



Re: [Mesa-dev] [PATCH shader-db 3/3] run: add extension_in_string() helper

2017-09-06 Thread Timothy Arceri



On 21/08/17 20:27, Emil Velikov wrote:

From: Emil Velikov 

memmem() does not check which character follows the matched string.
Thus it will report a match even when the haystack is "foobar" while
we're looking for "foo".

Pull a small helper (based on piglit) that correctly handles this and use
it.

Note: when parsing through the shader we have a needle that is not
NUL-terminated, so let's keep the memmem in there for now.
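
The prefix-match problem described above can be reproduced with a standalone version of the helper. The sketch below restates the logic of extension_in_string() from the patch in a self-contained form; it assumes both arguments are non-NULL, NUL-terminated strings.

```c
#include <stdbool.h>
#include <string.h>

static bool extension_in_string(const char *haystack, const char *needle)
{
   const size_t needle_len = strlen(needle);

   if (needle_len == 0)
      return false;

   for (;;) {
      const char *s = strstr(haystack, needle);

      if (s == NULL)
         return false;

      /* Accept only if the match ends at a space or at the end of the
       * string, so "foo" does not match inside "foobar". */
      if (s[needle_len] == ' ' || s[needle_len] == '\0')
         return true;

      /* Partial match: continue searching after it. */
      haystack = s + needle_len;
   }
}
```

A plain strstr() would report "foo" as present in "foobar"; this helper does not. Note that this version, like the one in the patch, only checks the character after the match and not the one before it.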

Signed-off-by: Emil Velikov 
---
  run.c | 43 +++
  1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/run.c b/run.c
index d0d4598..0691d33 100644
--- a/run.c
+++ b/run.c
@@ -69,6 +69,35 @@ struct shader {
  int type;
  };
  
+static bool
+extension_in_string(const char *haystack, const char *needle)
+{
+const unsigned needle_len = strlen(needle);
+
+if (needle_len == 0)
+return false;
+
+while (true) {
+const char *const s = strstr(haystack, needle);
+
+if (s == NULL)
+return false;
+
+if (s[needle_len] == ' ' || s[needle_len] == '\0')
+return true;
+
+/* strstr found an extension whose name begins with
+ * needle, but whose name is not equal to needle.
+ * Restart the search at s + needle_len so that we
+ * don't just find the same extension again and go
+ * into an infinite loop.
+ */
+haystack = s + needle_len;
+}
+
+return false;
+}
+
  static struct shader *
  get_shaders(const struct context_info *core, const struct context_info *compat,
  const char *text, size_t text_size,
@@ -141,8 +170,8 @@ get_shaders(const struct context_info *core, const struct context_info *compat,
  extension_text += 1;
  const char *newline = memchr(extension_text, '\n',
   end_text - extension_text);
-if (memmem(info->extension_string, info->extension_string_len,
-   extension_text, newline - extension_text) == NULL) {
+if (memmem(info->extension_string, info->extension_string_len,
+   extension_text, newline - extension_text) == NULL) {
  fprintf(stderr, "SKIP: %s requires unavailable extension %.*s\n",
  shader_name, (int)(newline - extension_text), extension_text);
  return NULL;
@@ -415,7 +444,7 @@ main(int argc, char **argv)
  return -1;
  }
  
-if (!strstr(client_extensions, "EGL_MESA_platform_gbm")) {
+if (!extension_in_string(client_extensions, "EGL_MESA_platform_gbm")) {
  fprintf(stderr, "ERROR: Missing EGL_MESA_platform_gbm\n");
  return -1;
  }
@@ -458,7 +487,7 @@ main(int argc, char **argv)
  };
  for (int i = 0; i < ARRAY_SIZE(egl_extension); i++) {
  const char *extension_string = eglQueryString(egl_dpy, EGL_EXTENSIONS);
-if (strstr(extension_string, egl_extension[i]) == NULL) {
+if (!extension_in_string(extension_string, egl_extension[i])) {
  fprintf(stderr, "ERROR: Missing %s\n", egl_extension[i]);
  ret = -1;
  goto egl_terminate;
@@ -530,8 +559,7 @@ main(int argc, char **argv)
  
  core.max_glsl_version = get_glsl_version();
  
-if (memmem(core.extension_string, core.extension_string_len,
-   "GL_KHR_debug", strlen("GL_KHR_debug")) == NULL) {
+if (!extension_in_string(core.extension_string, "GL_KHR_debug")) {
  fprintf(stderr, "ERROR: Missing GL_KHR_debug\n");
  ret = -1;
  goto egl_terminate;
@@ -556,8 +584,7 @@ main(int argc, char **argv)
  
  compat.max_glsl_version = get_glsl_version();
  
-if (memmem(compat.extension_string, compat.extension_string_len,
-   "GL_KHR_debug", strlen("GL_KHR_debug")) == NULL) {
+if (!extension_in_string(compat.extension_string, "GL_KHR_debug")) {


Hi Emil,

I haven't looked into why but this segfaults when running on i965. 
Reverting these three patches makes things work again.




  fprintf(stderr, "ERROR: Missing GL_KHR_debug\n");
  ret = -1;
  goto egl_terminate;




[Mesa-dev] Mesa 17.2.0 missing git_sha1.h

2017-09-06 Thread dosse91
I'm trying to build Gallium using LLVM 4.0 on Windows, and it says that 
git_sha1.h is missing. Removing the include from version.c fixes the 
problem and Mesa compiles as usual.





Re: [Mesa-dev] [PATCH 09/10] egl/x11: Match depth 30 RGB visuals to 32-bit RGBA EGLConfigs.

2017-09-06 Thread Mario Kleiner

On 09/06/2017 03:18 PM, Eric Engestrom wrote:

On Tuesday, 2017-09-05 07:01:13 +0200, Mario Kleiner wrote:

Similar to the matching of 24 bit RGB visuals to 32-bit
RGBA EGLConfigs.

Fixes failure of piglit egl tests to select ARGB2101010
visuals via eglChooseConfig() with EGL_ALPHA_BITS 2 on
a depth 30 X-Screen.

Signed-off-by: Mario Kleiner 
---
  src/egl/drivers/dri2/platform_x11.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 062c8a4..df768ab 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -781,13 +781,14 @@ dri2_x11_add_configs_for_visuals(struct dri2_egl_display *dri2_dpy,
config_count++;
  
  /* Allow a 24-bit RGB visual to match a 32-bit RGBA EGLConfig.

+ * Ditto for 30-bit RGB visuals to match a 32-bit RGBA EGLConfig.
   * Otherwise it will only match a 32-bit RGBA visual.  On a
   * composited window manager on X11, this will make all of the
   * EGLConfigs with destination alpha get blended by the
   * compositor.  This is probably not what the application
   * wants... especially on drivers that only have 32-bit RGBA
   * EGLConfigs! */
-if (d.data->depth == 24) {
+if (d.data->depth == 24 || d.data->depth == 30) {
 rgba_masks[3] =
~(rgba_masks[0] | rgba_masks[1] | rgba_masks[2]);
 dri2_conf = dri2_add_config(disp, config, config_count + 1,
--
2.7.4



Haven't looked into it in detail, but I feel like the two switches in
swrastCreateDrawable() and dri2_create_image_khr_pixmap() would need
updating as well, wouldn't they? (probably as a separate patch though)



Thanks for the feedback. Will check this and do so in some add-on patches.
-mario
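
The mask arithmetic in the patched branch can be checked in isolation. In the sketch below, `implied_alpha_mask` is a hypothetical helper, and the channel masks used in the usage note are assumed typical layouts for depth 30 and depth 24 visuals, not values read from a real X server.

```c
#include <stdint.h>

/* Alpha mask implied by the bits of a 32-bit pixel that are not covered
 * by the RGB masks of the visual, as computed for rgba_masks[3] in the
 * patch. */
static uint32_t implied_alpha_mask(uint32_t red, uint32_t green, uint32_t blue)
{
   return ~(red | green | blue);
}
```

With assumed 10-bit channel masks 0x3ff00000, 0x000ffc00 and 0x000003ff, the remaining bits form the 2-bit alpha mask 0xc0000000. With 8-bit masks 0x00ff0000, 0x0000ff00 and 0x000000ff, the result is the usual 0xff000000.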


Re: [Mesa-dev] [PATCH 0/9] RadeonSI: Tessellation shader micro-optimizations

2017-09-06 Thread Dieter Nützel

Dear relentless Marek,

I gave it a shot on 'Unigine Heaven' with tessellation set to 'normal'.
As expected, I didn't find any significant effect...

So this series is:

Tested-by: Dieter Nützel 

on RX580, too.

Dieter

On 06.09.2017 19:03, Marek Olšák wrote:

Hi,

This series seemed like a good idea and results in eliminations of
shader instructions. However I haven't been able to find an app where
it has a measurable effect.

Please review.

Marek


Re: [Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0

2017-09-06 Thread Timothy Arceri

On 07/09/17 08:37, Marek Olšák wrote:

Ping.

On Sat, Sep 2, 2017 at 1:34 AM, Marek Olšák  wrote:

From: Marek Olšák 

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502

Cc: 17.2 
---
  src/mesa/state_tracker/st_draw.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index fe03a4a..2fe7070 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -191,23 +191,28 @@ st_draw_vbo(struct gl_context *ctx,
if (tfb_vertcount) {
   if (!st_transform_feedback_draw_init(tfb_vertcount, stream, &info))
  return;
}
 }

 assert(!indirect);

 /* do actual drawing */
 for (i = 0; i < nr_prims; i++) {
+  info.count = prims[i].count;
+
+  /* Skip no-op draw calls. */
+  if (!info.count && !tfb_vertcount)
+ continue;


I tried to review this; the bit I'm unclear about is what happens if 
info.count == 0 and tfb_vertcount != NULL.


It wasn't clear to me whether the 0 count was going to be an issue here. 
Anyway, it seems clear that this isn't going to break anything that isn't 
already broken, so for what it's worth:


Acked-by: Timothy Arceri 



+
info.mode = translate_prim(ctx, prims[i].mode);
info.start = start + prims[i].start;
-  info.count = prims[i].count;
info.start_instance = prims[i].base_instance;
info.instance_count = prims[i].num_instances;
info.index_bias = prims[i].basevertex;
info.drawid = prims[i].draw_id;
if (!ib) {
   info.min_index = info.start;
   info.max_index = info.start + info.count - 1;
}

if (ST_DEBUG & DEBUG_DRAW) {
--
2.7.4
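
The skip condition discussed above can be exercised on its own. The sketch below is a simplified model, not the state tracker code: `struct prim` and `count_submitted_draws` are hypothetical, and `has_tfb_count` stands in for tfb_vertcount != NULL in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down primitive record. */
struct prim { unsigned start, count; };

/* Count how many primitives would be submitted after the skip check.
 * When `has_tfb_count` is true, a zero prims[i].count does not cause the
 * draw to be skipped, matching the condition in the patch. */
static size_t count_submitted_draws(const struct prim *prims, size_t nr_prims,
                                    bool has_tfb_count)
{
   size_t submitted = 0;

   for (size_t i = 0; i < nr_prims; i++) {
      if (prims[i].count == 0 && !has_tfb_count)
         continue;   /* skip no-op draw calls */
      submitted++;
   }
   return submitted;
}
```

This makes the case from the review explicit: with count == 0 and a transform feedback count present, the draw still goes through and relies on later code to handle it.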




Re: [Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0

2017-09-06 Thread Dieter Nützel

Hello Marek,

go ahead.

This is:
Tested-by: Alexandre Demers 
Tested-by: Dieter Nützel 

Solves:
https://bugs.freedesktop.org/show_bug.cgi?id=102502

Dieter

On 07.09.2017 00:37, Marek Olšák wrote:

Ping.

On Sat, Sep 2, 2017 at 1:34 AM, Marek Olšák  wrote:

From: Marek Olšák 

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502

Cc: 17.2 
---
 src/mesa/state_tracker/st_draw.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index fe03a4a..2fe7070 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -191,23 +191,28 @@ st_draw_vbo(struct gl_context *ctx,
   if (tfb_vertcount) {
  if (!st_transform_feedback_draw_init(tfb_vertcount, stream, &info))
 return;
   }
}

assert(!indirect);

/* do actual drawing */
for (i = 0; i < nr_prims; i++) {
+  info.count = prims[i].count;
+
+  /* Skip no-op draw calls. */
+  if (!info.count && !tfb_vertcount)
+ continue;
+
   info.mode = translate_prim(ctx, prims[i].mode);
   info.start = start + prims[i].start;
-  info.count = prims[i].count;
   info.start_instance = prims[i].base_instance;
   info.instance_count = prims[i].num_instances;
   info.index_bias = prims[i].basevertex;
   info.drawid = prims[i].draw_id;
   if (!ib) {
  info.min_index = info.start;
  info.max_index = info.start + info.count - 1;
   }

   if (ST_DEBUG & DEBUG_DRAW) {
--
2.7.4




Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators

2017-09-06 Thread Matt Turner
On Wed, Sep 6, 2017 at 3:48 PM, Timothy Arceri  wrote:
>
> On 07/09/17 06:59, Matt Turner wrote:
>>
>> I feel like the commit message is missing some important information.
>> What does this fix? Do we have a piglit test? I don't see one from
>> you. I see that someone has replied with a Tested-by, so presumably
>> they know what it's intended to fix.
>>
>
> Huh??? Are you winding me up?

My apologies. I meant to reply to this a few days ago, forgot, forgot
the details, and wrote a reply based on faulty memory today.


[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102530

Alexandre Demers  changed:

   What|Removed |Added

 Status|NEEDINFO|RESOLVED
 Resolution|--- |FIXED

--- Comment #23 from Alexandre Demers  ---
This bug is indeed fixed in the latest git commit. I also offer my Reviewed-by for
Eric's patch, if needed, for dealing with the actual value of MESA_NO_ERROR.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 102502] [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102502

Alexandre Demers  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Alexandre Demers  ---
Marek's patch fixes the bug. Thanks. Add my tested-by if needed.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0

2017-09-06 Thread Alexandre Demers
You can add my tested-by to the patch.

-- 
Alexandre Demers



Re: [Mesa-dev] [PATCH mesa 2/2] mesa: allow user to set MESA_NO_ERROR=0

2017-09-06 Thread Timothy Arceri

Series:

Reviewed-by: Timothy Arceri 

Thanks!

On 07/09/17 00:23, Eric Engestrom wrote:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102530
Cc: Michel Dänzer 
Cc: Alexandre Demers 
Signed-off-by: Eric Engestrom 
---
  src/mesa/main/context.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index cd3eccea20..1c4232d298 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -123,6 +123,7 @@
  #include "shared.h"
  #include "shaderobj.h"
  #include "shaderimage.h"
+#include "util/debug.h"
  #include "util/disk_cache.h"
  #include "util/strtod.h"
  #include "stencil.h"
@@ -1213,7 +1214,7 @@ _mesa_initialize_context(struct gl_context *ctx,
 /* KHR_no_error is likely to crash, overflow memory, etc if an application
  * has errors so don't enable it for setuid processes.
  */
-   if (getenv("MESA_NO_ERROR")) {
+   if (env_var_as_boolean("MESA_NO_ERROR", false)) {
  #if !defined(_WIN32)
if (geteuid() == getuid())
  #endif
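
The behavioural difference between the two checks can be sketched as follows. `string_as_boolean` is a hypothetical parser written for illustration; Mesa's env_var_as_boolean() may accept a different set of spellings. The value would normally come from getenv("MESA_NO_ERROR").

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical parser: NULL means the variable is unset. Unrecognised
 * spellings fall back to the default. */
static bool string_as_boolean(const char *value, bool default_value)
{
   if (value == NULL)
      return default_value;
   if (strcmp(value, "1") == 0 || strcmp(value, "true") == 0)
      return true;
   if (strcmp(value, "0") == 0 || strcmp(value, "false") == 0)
      return false;
   return default_value;
}
```

The old test, getenv("MESA_NO_ERROR") != NULL, treats MESA_NO_ERROR=0 as enabled because the variable is set. A parser along these lines treats it as disabled, which is what the subject line asks for.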




Re: [Mesa-dev] [PATCH] disk_cache: make the thread queue resizable and low priority

2017-09-06 Thread Timothy Arceri

Seems reasonable.

Acked-by: Timothy Arceri 

On 07/09/17 08:20, Marek Olšák wrote:

From: Marek Olšák 

---
  src/util/disk_cache.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index b789a45..33e4dc8 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -351,27 +351,29 @@ disk_cache_create(const char *gpu_name, const char *timestamp,
}
 }
  
 /* Default to 1GB for maximum cache size. */

 if (max_size == 0) {
max_size = 1024*1024*1024;
 }
  
 cache->max_size = max_size;
  
-   /* A limit of 32 jobs was choosen as observations of Deus Ex start-up times

-* showed that we reached at most 11 jobs on an Intel i5-6400 CPU@2.70GHz
-* (a fairly modest desktop CPU). 1 thread was chosen because we don't
-* really care about getting things to disk quickly just that it's not
-* blocking other tasks.
+   /* 1 thread was chosen because we don't really care about getting things
+* to disk quickly just that it's not blocking other tasks.
+*
+* The queue will resize automatically when it's full, so adding new jobs
+* doesn't stall.
  */
-   util_queue_init(&cache->cache_queue, "disk_cache", 32, 1, 0);
+   util_queue_init(&cache->cache_queue, "disk_cache", 32, 1,
+   UTIL_QUEUE_INIT_RESIZE_IF_FULL |
+   UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY);
  
 uint8_t cache_version = CACHE_VERSION;

 size_t cv_size = sizeof(cache_version);
 cache->driver_keys_blob_size = cv_size;
  
 /* Create driver id keys */

 size_t ts_size = strlen(timestamp) + 1;
 size_t gpu_name_size = strlen(gpu_name) + 1;
 cache->driver_keys_blob_size += ts_size;
 cache->driver_keys_blob_size += gpu_name_size;
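
The idea of a queue that grows when full, rather than stalling the producer, can be sketched with a small ring buffer. This is a single-threaded illustration only: `struct job_ring` and its functions are hypothetical, the doubling policy is an assumption, and Mesa's util_queue is a thread-safe job queue that is not reproduced here.

```c
#include <stdlib.h>

/* Hypothetical single-threaded ring buffer of integer job IDs. */
struct job_ring {
   int *jobs;
   unsigned capacity;   /* allocated slots, must be non-zero */
   unsigned head;       /* index of the oldest job */
   unsigned count;      /* jobs currently queued */
};

static int job_ring_init(struct job_ring *q, unsigned capacity)
{
   q->jobs = capacity ? malloc(capacity * sizeof(*q->jobs)) : NULL;
   q->capacity = capacity;
   q->head = 0;
   q->count = 0;
   return q->jobs ? 0 : -1;
}

/* Add a job; when the ring is full, double it instead of waiting. */
static int job_ring_push(struct job_ring *q, int job)
{
   if (q->count == q->capacity) {
      unsigned new_capacity = q->capacity * 2;
      int *grown = malloc(new_capacity * sizeof(*grown));

      if (!grown)
         return -1;
      /* Copy in queue order so the oldest job lands at index 0. */
      for (unsigned i = 0; i < q->count; i++)
         grown[i] = q->jobs[(q->head + i) % q->capacity];
      free(q->jobs);
      q->jobs = grown;
      q->capacity = new_capacity;
      q->head = 0;
   }
   q->jobs[(q->head + q->count) % q->capacity] = job;
   q->count++;
   return 0;
}

/* Remove the oldest job; returns -1 when the ring is empty. */
static int job_ring_pop(struct job_ring *q, int *job)
{
   if (q->count == 0)
      return -1;
   *job = q->jobs[q->head];
   q->head = (q->head + 1) % q->capacity;
   q->count--;
   return 0;
}
```

The initial capacity then only sets the starting allocation; pushing beyond it costs a reallocation instead of blocking the caller.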




Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators

2017-09-06 Thread Timothy Arceri

On 04/09/17 13:33, Timothy Arceri wrote:
Sent too early, this breaks a piglit test. I'll try to track down the 
issue and resend.


Whoops, seems I replied to myself and not the list. I have a more 
comprehensive fix that unrolls even more loop variants in the works; I 
just need to track down a bug it exposes further down the line in 
either tgsi or llvm.



Re: [Mesa-dev] [PATCH 41/47] i965/fs: Add reuse_16bit_conversions_register optimization

2017-09-06 Thread Chema Casanova
Hi Connor and Curro,

On 28/08/17 12:24, Alejandro Piñeiro wrote:
> On 27/08/17 20:24, Connor Abbott wrote:
>> Hi,
>>
>> On Aug 25, 2017 9:28 AM, "Alejandro Piñeiro" wrote:
>>
>> On 24/08/17 21:07, Connor Abbott wrote:
>> >
>> > Hi Alejandro,
>>
>> Hi Connor,
>>
>> >
>> > This seems really suspicious. If the live ranges are really
>> > independent, then the register allocator should be able to
>> assign the
>> > two virtual registers to the same physical register if it needs to.
>>
>> Yes, it is true, the register allocator should be able to assign two
>> virtual registers to the same physical register. But that is done
>> at the
>> end (or really near the end), so late for the problem this
>> optimization
>> is trying to fix.
>>
>>
>> Well, my understanding is that the problem is long compilation times
>> due to spilling and our not-so-great implementation of it. So no,
>> register allocation is not late for the problem. As both Curro and I
>> explained, the change by itself can only pessimise register
>> allocation, so if it helps then it must be due to a bug in the
>> register allocator or a problem in a subsequent pass that's getting
>> hidden by this one.
> 
> Ok.
> 
>>
>> We are also reducing the amount of instructions used.
>>
>>
>> The comments in the source code say otherwise. Any instructions
>> eliminated were from spilling, which this pass only accidentally reduces.
> 
> Yes, sorry, I explained myself poorly. The optimization itself doesn't
> remove any instructions. But using it reduces the final number of
> instructions, although as you say, that is likely due to reduced
> spilling.
> 
>>
>>
>>
>> Probably not really clear on the commit message. When I say
>> "reduce the
>> pressure of the register allocator" I mean having a code that the
>> register allocator would be able to handle without using too much
>> time.
>> The problem this optimization tries to solve is that for some 16
>> bit CTS
>> tests (some with matrices and geometry shaders), the amount of virtual
>> registers used and instructions was really big. For the record,
>> initially, some tests needed 24 min just to compile. Right now, thanks
>> to other optimizations, the slower test without this optimization
>> needs
>> 1min 30 seconds. Adding some hacky timestamps, the time used  at
>> fs_visitor::allocate_registers (brw_fs.cpp:6096) is:
>>
>> * While trying to schedule using the three available pre mode
>> heuristics: 7 seconds
>> * Allocation with spilling: 63 seconds
>> * Final schedule using SCHEDULE_POST: 19 seconds
>>
>> With this optimization, the total time goes down to 14 seconds (10
>> + 0 +
>> 3 on the previous bullet point list).
>>
>> One could argue that 1min 30 seconds is okish. But taking into account
>> that it goes down to 14 seconds, even with some caveats (see below), I
>> still think that it is worth to use the optimization.
>>
>> And a final comment. For that same test, this is the final stats
>> (using
>> INTEL_DEBUG):
>>
>>  * With the optimization: SIMD8 shader: 4610 instructions. 0 loops.
>> 130320 cycles. 15:9 spills:fills.
>>  * Without the optimization: SIMD8 shader: 12312 instructions. 0
>> loops.
>> 174816 cycles. 751:1851 spills:fills.
>>
>>
>> So, the fact that it helps at all with SIMD8 shows that my theory is
>> wrong, but since your pass reduces spilling, it clearly must be
>> avoiding a bug somewhere else. You need to compare the IR for a shader
>> with the problem with and without this pass right before register
>> allocation. Maybe the sources and destinations of the conversion
>> instructions interfere without the change due to some other pass
>> that's increasing register pressure, in which case that's the problem,
>> but I doubt it.
> 
> Ok, thanks for the hints.

After some research we found that we need to adapt the live_variables
algorithm to support 32-bit to 16-bit conversions. Because of the HW
alignment restrictions, these conversions require the result register
to use stride=2, so the destination is not contiguous (stride != 1) and
is_partial_write returns true by definition. Any of the last three
conditions below can be true when we use 16-bit types.

bool
fs_inst::is_partial_write() const
{
   return ((this->predicate && this->opcode != BRW_OPCODE_SEL) ||
           (this->exec_size * type_sz(this->dst.type)) < 32 ||
           !this->dst.is_contiguous() ||
           this->dst.offset % REG_SIZE != 0);
}

So, at the check in the setup_one_write function in
brw_fs_live_variables.cpp, the variable isn't marked as completely
defined in the block.

   if (inst->dst.file == VGRF && !inst->is_partial_write()) {
      if (!BITSET_TEST(bd->use, var))
         BITSET_SET(bd->def, var);
   }

That makes that the live start of the variable is expected to defined
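
To make the point about 16-bit destinations concrete, here is a minimal sketch of the same four conditions on a stripped-down stand-in for fs_inst. The struct, its field names and the toy_ prefix are hypothetical and are not the i965 backend's own types; is_contiguous() is modelled simply as stride == 1, as the text above describes.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32 /* bytes per register, matching the "< 32" test above */

/* Hypothetical, simplified stand-in for the fields the check reads. */
struct toy_inst {
   bool predicated;      /* instruction carries a predicate */
   bool is_sel;          /* opcode is SEL */
   unsigned exec_size;   /* SIMD width */
   unsigned dst_type_sz; /* size of the destination type, in bytes */
   unsigned dst_stride;  /* destination stride, in elements */
   unsigned dst_offset;  /* destination offset, in bytes */
};

/* Same four conditions as is_partial_write(), on the toy struct. */
bool
toy_is_partial_write(const struct toy_inst *inst)
{
   return (inst->predicated && !inst->is_sel) ||
          inst->exec_size * inst->dst_type_sz < 32 ||
          inst->dst_stride != 1 ||
          inst->dst_offset % REG_SIZE != 0;
}

/* A SIMD8 32-bit write fills a register; a SIMD8 32->16-bit conversion
 * with a stride-2 destination trips two of the conditions at once. */
void
toy_partial_write_example(void)
{
   struct toy_inst mov32 = { false, false, 8, 4, 1, 0 };
   struct toy_inst cvt16 = { false, false, 8, 2, 2, 0 };

   assert(!toy_is_partial_write(&mov32));
   assert(toy_is_partial_write(&cvt16));
}
```

In this model the 16-bit conversion is flagged both because 8 * 2 bytes is less than 32 and because the stride is not 1, so the liveness check quoted above never records it as a full definition.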

Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators

2017-09-06 Thread Timothy Arceri


On 07/09/17 06:59, Matt Turner wrote:

I feel like the commit message is missing some important information.
What does this fix? Do we have a piglit test? I don't see one from
you. I see that someone has replied with a Tested-by, so presumably
they know what it's intended to fix.



Huh??? Are you winding me up?

"This code incorrectly assumed that loop terminators will always be
at the start of the loop. Fortunately we *seem* to avoid any bugs
because the unrolling code loops over and correctly handles the
terminators.

However the incorrect analysis can result in loops not being
unrolled at all.

... examples of loops that are unrolled/not unrolled ..."

You can't write a piglit test to detect if a loop is unrolled or not, 
only if it was unrolled correctly. I wrote a bunch of the latter when 
implementing unrolling in NIR.


Anyway, this patch broke one of those tests, as per my reply shortly after 
sending it. Further examination shows there are more bugs and 
limitations in the GLSL IR version of unrolling. I've updated the pass 
to be more like the NIR pass, but now I'm hitting bugs further down the 
line due to us exiting the GLSL IR optimisation loop after only a 
single iteration; it seems TGSI or LLVM doesn't handle the unrolled loop 
without the redundant ifs cleaned up in some cases.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0

2017-09-06 Thread Marek Olšák
Ping.

On Sat, Sep 2, 2017 at 1:34 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502
>
> Cc: 17.2 
> ---
>  src/mesa/state_tracker/st_draw.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
> index fe03a4a..2fe7070 100644
> --- a/src/mesa/state_tracker/st_draw.c
> +++ b/src/mesa/state_tracker/st_draw.c
> @@ -191,23 +191,28 @@ st_draw_vbo(struct gl_context *ctx,
>if (tfb_vertcount) {
>   if (!st_transform_feedback_draw_init(tfb_vertcount, stream, &info))
>  return;
>}
> }
>
> assert(!indirect);
>
> /* do actual drawing */
> for (i = 0; i < nr_prims; i++) {
> +  info.count = prims[i].count;
> +
> +  /* Skip no-op draw calls. */
> +  if (!info.count && !tfb_vertcount)
> + continue;
> +
>info.mode = translate_prim(ctx, prims[i].mode);
>info.start = start + prims[i].start;
> -  info.count = prims[i].count;
>info.start_instance = prims[i].base_instance;
>info.instance_count = prims[i].num_instances;
>info.index_bias = prims[i].basevertex;
>info.drawid = prims[i].draw_id;
>if (!ib) {
>   info.min_index = info.start;
>   info.max_index = info.start + info.count - 1;
>}
>
>if (ST_DEBUG & DEBUG_DRAW) {
> --
> 2.7.4
>
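
One thing visible in the quoted hunk is that max_index is computed as start + count - 1. The sketch below shows what that expression does for an empty draw, assuming the fields are unsigned; the struct and the toy_ names are hypothetical and only mirror the shape of the hunk. It is not a claim about the root cause of the linked bug.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical subset of a draw-info struct; fields assumed unsigned. */
struct toy_draw_info {
   unsigned start;
   unsigned count;
   unsigned min_index;
   unsigned max_index;
};

/* The range computation from the hunk, with no early-out. */
unsigned
toy_max_index(unsigned start, unsigned count)
{
   return start + count - 1;
}

/* Returns false when the draw is skipped as a no-op, true otherwise. */
bool
toy_setup_draw(struct toy_draw_info *info, unsigned start, unsigned count,
               bool has_tfb_vertcount)
{
   info->count = count;

   /* Same test as the patch: nothing to draw and no transform feedback. */
   if (!info->count && !has_tfb_vertcount)
      return false;

   info->start = start;
   info->min_index = info->start;
   info->max_index = toy_max_index(info->start, info->count);
   return true;
}

void
toy_draw_example(void)
{
   struct toy_draw_info info = {0};

   /* Without the skip, an empty draw at start 0 wraps around. */
   assert(toy_max_index(0, 0) == UINT_MAX);

   /* With the skip, the empty draw never reaches that computation. */
   assert(!toy_setup_draw(&info, 0, 0, false));
   assert(toy_setup_draw(&info, 4, 3, false) && info.max_index == 6);
}
```

Moving the count assignment ahead of the other fields, as the patch does, is what lets the check run before any of the per-primitive state is translated.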


[Mesa-dev] [PATCH 1/2] radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c  |   2 +
 src/gallium/drivers/radeonsi/si_pipe.h  |   1 +
 src/gallium/drivers/radeonsi/si_state.c | 143 +++-
 src/gallium/drivers/radeonsi/si_state.h |  10 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |   5 +
 5 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 640b57c..9642edd 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -1041,20 +1041,22 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
 sscreen->b.info.pfp_fw_version >= 121 &&
 sscreen->b.info.me_fw_version >= 87) ||
(sscreen->b.chip_class == CIK &&
 sscreen->b.info.pfp_fw_version >= 211 &&
 sscreen->b.info.me_fw_version >= 173) ||
(sscreen->b.chip_class == SI &&
 sscreen->b.info.pfp_fw_version >= 79 &&
 sscreen->b.info.me_fw_version >= 142);
 
sscreen->has_ds_bpermute = sscreen->b.chip_class >= VI;
+   sscreen->has_out_of_order_rast = sscreen->b.chip_class >= VI &&
+sscreen->b.info.max_se >= 2;
sscreen->has_msaa_sample_loc_bug = (sscreen->b.family >= CHIP_POLARIS10 &&
sscreen->b.family <= CHIP_POLARIS12) ||
   sscreen->b.family == CHIP_VEGA10 ||
   sscreen->b.family == CHIP_RAVEN;
sscreen->dpbb_allowed = sscreen->b.chip_class >= GFX9 &&
!(sscreen->b.debug_flags & DBG_NO_DPBB);
sscreen->dfsm_allowed = sscreen->dpbb_allowed &&
!(sscreen->b.debug_flags & DBG_NO_DFSM);
 
/* While it would be nice not to have this flag, we are constrained
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 8db7028..b8073ce 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -88,20 +88,21 @@ struct hash_table;
 struct u_suballocator;
 
 struct si_screen {
struct r600_common_screen   b;
unsignedgs_table_depth;
unsignedtess_offchip_block_dw_size;
boolhas_clear_state;
boolhas_distributed_tess;
boolhas_draw_indirect_multi;
boolhas_ds_bpermute;
+   boolhas_out_of_order_rast;
boolhas_msaa_sample_loc_bug;
booldpbb_allowed;
booldfsm_allowed;
boolllvm_has_working_vgpr_indexing;
 
/* Whether shaders are monolithic (1-part) or separate (3-part). */
booluse_monolithic_shaders;
boolrecord_llvm_ir;
 
mtx_t   shader_parts_mutex;
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 7e9140b..855ad27 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -416,20 +416,21 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx,
struct si_pm4_state *pm4 = &blend->pm4;
uint32_t sx_mrt_blend_opt[8] = {0};
uint32_t color_control = 0;
 
if (!blend)
return NULL;
 
blend->alpha_to_coverage = state->alpha_to_coverage;
blend->alpha_to_one = state->alpha_to_one;
blend->dual_src_blend = util_blend_state_is_dual(state, 0);
+   blend->logicop_enable = state->logicop_enable;
 
if (state->logicop_enable) {
color_control |= S_028808_ROP3(state->logicop_func | (state->logicop_func << 4));
} else {
color_control |= S_028808_ROP3(0xcc);
}
 
si_pm4_set_reg(pm4, R_028B70_DB_ALPHA_TO_MASK,
   S_028B70_ALPHA_TO_MASK_ENABLE(state->alpha_to_coverage) |
   S_028B70_ALPHA_TO_MASK_OFFSET0(2) |
@@ -623,20 +624,27 @@ static void si_bind_blend_state(struct pipe_context *ctx, void *state)
old_blend->blend_enable_4bit != blend->blend_enable_4bit ||
old_blend->need_src_alpha_4bit != blend->need_src_alpha_4bit)
sctx->do_update_shaders = true;
 
if (sctx->screen->dpbb_allowed &&
(!old_blend ||
 old_blend->alpha_to_coverage != blend->alpha_to_coverage ||
 old_blend->blend_enable_4bit != blend->blend_enable_4bit ||
 old_blend->cb_target_enabled_4bit != 

[Mesa-dev] [PATCH 2/2] ac/surface: add radeon_surf::has_stencil for convenience

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_surface.c|  2 ++
 src/amd/common/ac_surface.h|  1 +
 src/amd/vulkan/radv_device.c   |  6 +++---
 src/gallium/drivers/r600/evergreen_state.c |  2 +-
 src/gallium/drivers/r600/r600_blit.c   |  2 +-
 src/gallium/drivers/r600/r600_state_common.c   |  2 +-
 src/gallium/drivers/radeon/r600_texture.c  |  4 ++--
 src/gallium/drivers/radeonsi/si_blit.c |  2 +-
 src/gallium/drivers/radeonsi/si_state.c| 15 +++
 src/gallium/drivers/radeonsi/si_state_binning.c|  2 +-
 src/gallium/winsys/radeon/drm/radeon_drm_surface.c |  1 +
 11 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index 4edefc7..c6ff573 100644
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -648,20 +648,21 @@ static int gfx6_compute_surface(ADDR_HANDLE addrlib,
if (AddrSurfInfoIn.tileType == ADDR_DISPLAYABLE)
AddrSurfInfoIn.tileIndex = 10; /* 2D displayable */
else
AddrSurfInfoIn.tileIndex = 14; /* 2D non-displayable */
 
/* Addrlib doesn't set this if tileIndex is forced like above. */
AddrSurfInfoOut.macroModeIndex = cik_get_macro_tile_index(surf);
}
}
 
+   surf->has_stencil = !!(surf->flags & RADEON_SURF_SBUFFER);
surf->num_dcc_levels = 0;
surf->surf_size = 0;
surf->dcc_size = 0;
surf->dcc_alignment = 1;
surf->htile_size = 0;
surf->htile_slice_size = 0;
surf->htile_alignment = 1;
 
const bool only_stencil = (surf->flags & RADEON_SURF_SBUFFER) &&
  !(surf->flags & RADEON_SURF_ZBUFFER);
@@ -1070,20 +1071,21 @@ static int gfx9_compute_surface(ADDR_HANDLE addrlib,

);
if (r)
return r;
break;
 
default:
assert(0);
}
 
surf->u.gfx9.resource_type = AddrSurfInfoIn.resourceType;
+   surf->has_stencil = !!(surf->flags & RADEON_SURF_SBUFFER);
 
surf->num_dcc_levels = 0;
surf->surf_size = 0;
surf->dcc_size = 0;
surf->htile_size = 0;
surf->htile_slice_size = 0;
surf->u.gfx9.surf_offset = 0;
surf->u.gfx9.stencil_offset = 0;
surf->u.gfx9.fmask_size = 0;
surf->u.gfx9.cmask_size = 0;
diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
index b2620f9..13f3d7a 100644
--- a/src/amd/common/ac_surface.h
+++ b/src/amd/common/ac_surface.h
@@ -152,20 +152,21 @@ struct radeon_surf {
 /* Format properties. */
 unsignedblk_w:4;
 unsignedblk_h:4;
 unsignedbpe:5;
 /* Number of mipmap levels where DCC is enabled starting from level 0.
  * Non-zero levels may be disabled due to alignment constraints, but not
  * the first level.
  */
 unsignednum_dcc_levels:4;
 unsignedis_linear:1;
+unsignedhas_stencil:1;
 /* Displayable, thin, depth, rotated. AKA D,S,Z,R swizzle modes. */
 unsignedmicro_tile_mode:3;
 uint32_tflags;
 
 /* These are return values. Some of them can be set by the caller, but
  * they will be treated as hints (e.g. bankw, bankh) and might be
  * changed by the calculator.
  */
 
 /* Tile swizzle can be OR'd with low bits of the BASE_256B address.
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 7c218b1..b64a023 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3134,21 +3134,21 @@ radv_initialise_ds_surface(struct radv_device *device,
ds->offset_scale = 1.0f;
break;
case VK_FORMAT_S8_UINT:
stencil_only = true;
break;
default:
break;
}
 
format = radv_translate_dbformat(iview->image->vk_format);
-   stencil_format = iview->image->surface.flags & RADEON_SURF_SBUFFER ?
+   stencil_format = iview->image->surface.has_stencil ?
V_028044_STENCIL_8 : V_028044_STENCIL_INVALID;
 
uint32_t max_slice = radv_surface_layer_count(iview);
ds->db_depth_view = S_028008_SLICE_START(iview->base_layer) |
S_028008_SLICE_MAX(iview->base_layer + max_slice - 1);
 
ds->db_htile_data_base = 0;
ds->db_htile_surface = 0;
 
va = device->ws->buffer_get_va(iview->bo) + iview->image->offset;
@@ -3169,21 +3169,21 @@ radv_initialise_ds_surface(struct radv_device *device,
ds->db_stencil_info2 = 
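
A side note on the `!!` in `surf->has_stencil = !!(surf->flags & RADEON_SURF_SBUFFER)`: since has_stencil is declared as a 1-bit field, the double negation is what normalizes the masked flag to 0 or 1 before it is stored. The sketch below illustrates this with a hypothetical struct and a hypothetical flag value; all that matters for the illustration is that the flag is not bit 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit, chosen only so that it is not bit 0. */
#define TOY_SURF_SBUFFER (1u << 18)

/* Hypothetical cut-down surface struct with a 1-bit field. */
struct toy_surf {
   unsigned has_stencil:1;
   uint32_t flags;
};

/* Pattern used in the patch: normalize to 0/1, then store. */
void
toy_set_has_stencil(struct toy_surf *surf)
{
   surf->has_stencil = !!(surf->flags & TOY_SURF_SBUFFER);
}

/* Storing the masked value directly keeps only bit 0 of it. */
unsigned
toy_store_without_normalizing(uint32_t flags)
{
   struct toy_surf s = { 0, flags };

   s.has_stencil = flags & TOY_SURF_SBUFFER;
   return s.has_stencil;
}

void
toy_has_stencil_example(void)
{
   struct toy_surf s = { 0, TOY_SURF_SBUFFER };

   toy_set_has_stencil(&s);
   assert(s.has_stencil == 1);
   assert(toy_store_without_normalizing(TOY_SURF_SBUFFER) == 0);
}
```

Callers that previously tested `flags & RADEON_SURF_SBUFFER` in a boolean context, as in the radv hunk above, see the same truth value from the new field.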

[Mesa-dev] [PATCH] winsys/amdgpu: disable local BOs on Raven

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

It hangs with a high degree of reproducibility.
---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 897b4f0..4e9022f 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -404,21 +404,22 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct amdgpu_winsys *ws,
if (initial_domain & RADEON_DOMAIN_VRAM)
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_VRAM;
if (initial_domain & RADEON_DOMAIN_GTT)
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_GTT;
 
if (flags & RADEON_FLAG_NO_CPU_ACCESS)
   request.flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
if (flags & RADEON_FLAG_GTT_WC)
   request.flags |= AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (flags & RADEON_FLAG_NO_INTERPROCESS_SHARING &&
-   ws->info.drm_minor >= 20)
+   ws->info.drm_minor >= 20 &&
+   ws->info.family != CHIP_RAVEN)
   request.flags |= AMDGPU_GEM_CREATE_VM_ALWAYS_VALID;
 
r = amdgpu_bo_alloc(ws->dev, &request, &buf_handle);
if (r) {
   fprintf(stderr, "amdgpu: Failed to allocate a buffer:\n");
   fprintf(stderr, "amdgpu:size  : %"PRIu64" bytes\n", size);
   fprintf(stderr, "amdgpu:alignment : %u bytes\n", alignment);
   fprintf(stderr, "amdgpu:domains   : %u\n", initial_domain);
   goto error_bo_alloc;
}
-- 
2.7.4



[Mesa-dev] [PATCH] disk_cache: make the thread queue resizable and low priority

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/util/disk_cache.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index b789a45..33e4dc8 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -351,27 +351,29 @@ disk_cache_create(const char *gpu_name, const char *timestamp,
   }
}
 
/* Default to 1GB for maximum cache size. */
if (max_size == 0) {
   max_size = 1024*1024*1024;
}
 
cache->max_size = max_size;
 
-   /* A limit of 32 jobs was choosen as observations of Deus Ex start-up times
-* showed that we reached at most 11 jobs on an Intel i5-6400 CPU@2.70GHz
-* (a fairly modest desktop CPU). 1 thread was chosen because we don't
-* really care about getting things to disk quickly just that it's not
-* blocking other tasks.
+   /* 1 thread was chosen because we don't really care about getting things
+* to disk quickly just that it's not blocking other tasks.
+*
+* The queue will resize automatically when it's full, so adding new jobs
+* doesn't stall.
 */
-   util_queue_init(&cache->cache_queue, "disk_cache", 32, 1, 0);
+   util_queue_init(&cache->cache_queue, "disk_cache", 32, 1,
+   UTIL_QUEUE_INIT_RESIZE_IF_FULL |
+   UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY);
 
uint8_t cache_version = CACHE_VERSION;
size_t cv_size = sizeof(cache_version);
cache->driver_keys_blob_size = cv_size;
 
/* Create driver id keys */
size_t ts_size = strlen(timestamp) + 1;
size_t gpu_name_size = strlen(gpu_name) + 1;
cache->driver_keys_blob_size += ts_size;
cache->driver_keys_blob_size += gpu_name_size;
-- 
2.7.4



Re: [Mesa-dev] [PATCH 0/8] Resend of preprocessor series

2017-09-06 Thread Thomas Helland
I'm busy until Sunday, but I'll see if I can find the time
to address Nicolai's comments on Sunday evening.
I've addressed the build issues with the tests, and the
comment about using util_vsnprintf, so it's getting there.
I've also done some general polishing on comments, etc.



On 6 Sep 2017 at 23:00, "Dieter Nützel" wrote:

For the series:

Tested-by: Dieter Nützel 

But it does NOT apply on current git any longer.
Is a new version with Nicolai's comments addressed underway? ;-)

Dieter


On 29.08.2017 21:56, Thomas Helland wrote:

> This is a resend of the string buffer implementation and
> related patches sent out back in May. I've done one more
> change to the string buffer; using u_string.h for a compatible
> vsnprintf version to reduce the code even more. I've not been
> able to test this due to two build breakages (xmlpool and dri)
> that I'm still trying to figure out. But since I promised
> to send these out this evening, I'm sending them untested.
> I did test them thoroughly the last time around though,
> so I believe it should be mostly good as long as I haven't
> messed up the rebasing. I believe the string buffer part of
> the series is the most important; the rest I've not really
> gotten around to performance test much.
>
> Thomas Helland (7):
>   util: Add a string buffer implementation
>   util: Add tests for the string buffer
>   glsl: Change the parser to use the string buffer
>   glcpp: Use string_buffer for line continuation removal
>   glcpp: Avoid unnecessary call to strlen
>   port to gtest
>   fix test makefile
>
> Vladislav Egorov (1):
>   glcpp: Use Bloom filter before identifier search
>
>  configure.ac                                      |   2 +
>  src/compiler/glsl/glcpp/glcpp-lex.l               |   3 +-
>  src/compiler/glsl/glcpp/glcpp-parse.y             | 219 -
>  src/compiler/glsl/glcpp/glcpp.h                   |  18 +-
>  src/compiler/glsl/glcpp/pp.c                      |  64 ---
>  src/util/Makefile.am                              |   3 +-
>  src/util/Makefile.sources                         |   2 +
>  src/util/string_buffer.c                          | 155 +++
>  src/util/string_buffer.h                          |  87 +
>  src/util/tests/string_buffer/Makefile.am          |  38
>  src/util/tests/string_buffer/append_and_print.cpp | 221 ++
>  11 files changed, 633 insertions(+), 179 deletions(-)
>  create mode 100644 src/util/string_buffer.c
>  create mode 100644 src/util/string_buffer.h
>  create mode 100644 src/util/tests/string_buffer/Makefile.am
>  create mode 100644 src/util/tests/string_buffer/append_and_print.cpp
>


[Mesa-dev] [RFC] NIR serialization

2017-09-06 Thread Daniel Schürmann

Hello everyone!
Recently, we had a small discussion (off the list) about NIR 
serialization, which was previously discussed in [RFC] ARB_gl_spirv and 
NIR backend for radeonsi.


As this topic could be interesting to more people, I would like to 
share what has been discussed so far (you might want to read from the bottom up).


TL;DR:
- NIR serialization is in demand for shader cache
- could be done either directly (NIR binary form) or via SPIR-V
- Ian et al. are working on GLSL IR -> SPIR-V transformation, which 
could be adapted for a NIR -> SPIR-V pass

- in the NIR representation, some type information is lost
- thus, serialization via SPIR-V could NOT be a glslang alternative 
(otoh, the GLSL IR->SPIR-V pass could), only a spirv-opt alternative (if 
the output is valid SPIR-V)

- now, the question is if this is worth the additional effort

Kind regards,
Daniel

-------- Forwarded Message --------
Subject: Re: NIR serialization
Date: Tue, 5 Sep 2017 11:00:31 -0700
From: Ian Romanick
To: Daniel Schürmann, Nicolai Hähnle, Timothy Arceri




Sorry for taking so long to reply.  It was a long holiday weekend in the
US, and I was away.

On 09/01/2017 05:03 AM, Daniel Schürmann wrote:

A direct NIR binary serialization would also do the job (vc4/freedreno
was mentioned as well).
I only thought that SPIRV is preferable because
- deserialization for free
- cached shader size
- spirv-opt and glslang alternative

The term lossy doesn't make much sense to me with regard to
optimizations: aren't all optimizations lossy?


By lossy I mean there is a significant semantic change.  As soon as
GLSL IR is converted to NIR, Boolean types completely cease to exist.
They are replaced with integers that are either 0 or -1.  Similarly, all
matrix types cease to exist.  They are replaced by a set of vectors.
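
As a rough illustration of why this kind of lowering cannot be undone, the sketch below lowers a typed value to a bare 32-bit word the way the paragraph describes (true becomes all bits set, i.e. -1) and splits a 2x2 matrix into two column vectors. Every type and function name here is hypothetical; this is not NIR's actual data model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical typed value as a front end might see it. */
enum toy_type { TOY_TYPE_BOOL, TOY_TYPE_INT };

struct toy_typed_value {
   enum toy_type type;
   int32_t value;
};

/* Lowering drops the type tag and keeps only a 32-bit word. */
uint32_t
toy_lower(struct toy_typed_value v)
{
   if (v.type == TOY_TYPE_BOOL)
      return v.value ? UINT32_MAX : 0; /* true -> all bits set (-1) */
   return (uint32_t)v.value;
}

struct toy_vec2 {
   float x, y;
};

/* A column-major mat2 becomes two independent vec2 values. */
void
toy_lower_mat2(const float m[2][2], struct toy_vec2 cols[2])
{
   for (int c = 0; c < 2; c++) {
      cols[c].x = m[c][0];
      cols[c].y = m[c][1];
   }
}

/* After lowering, a bool "true" and the integer -1 are the same word,
 * so the original type cannot be recovered from the lowered form. */
void
toy_lossy_example(void)
{
   struct toy_typed_value b = { TOY_TYPE_BOOL, 1 };
   struct toy_typed_value i = { TOY_TYPE_INT, -1 };

   assert(toy_lower(b) == toy_lower(i));
}
```

Anything that later needs to know "this was a bool" or "these two vectors were one matrix" has to get that from side information kept elsewhere, not from the lowered values themselves.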

For the purpose of the on-disk cache, this probably doesn't matter.  It
does mean that additional information about, for example, types of
uniforms has to be tracked.  In a direct GLSL IR to SPIR-V translation,
type information is maintained, so the SPIR-V has all the necessary
information.

As a glslang replacement, maintaining type information is an absolute
requirement.  Users will use other tools to introspect the SPIR-V shader
to find locations of uniforms, shader inputs, offsets of values in UBOs,
etc.  If the types are changed in the SPIR-V shader that we emit, none
of that will work.  I plan to enable retrieval of portable SPIR-V both
from a Mesa driver and the standalone GLSL compiler.

Right now SPIR-V binaries will be quite large.  I have several ideas
that I plan to implement once we have OpenGL 4.6 done that should
dramatically reduce the size of SPIR-V... I'm actually hoping to present
that at FOSDEM.


The primary goal would be the lossless NIR-SPIRV-NIR round-trip.
Secondary, it would be desirable if we achieve valid SPIRV binaries
which preserve the semantics of the original shader.
The question is whether this is possible with the type information
that is available...

Ian: can you point me to your repository? I couldn't find it.


https://cgit.freedesktop.org/~idr/mesa/log/?h=emit-spirv


Kind regards,

Daniel


On 09/01/2017 12:16 PM, Nicolai Hähnle wrote:

In addition to using NIR-based optimizations, I believe Timothy
mentioned that a method for serializing NIR would help the shader disk
cache of i965. It would certainly help radeonsi if/when we switch to
the NIR backend, because we could compile new shader variants without
falling back all the way to GLSL. For that, a lossless NIR-SPIRV-NIR
path would do the job.

Not that falling back all the way to GLSL from radeonsi is impossible,
but it would also require a whole bunch of new groundwork to be done
-- basically, we would need multi-threaded GLSL compilation and linking.

Cheers,
Nicolai


On 01.09.2017 02:41, Ian Romanick wrote:

I have been working on GLSL IR to SPIR-V. I have a bunch of stuff in
the emit-spirv branch of my freedesktop.org tree.  Once that is done, it
should be pretty trivial to adapt it to NIR to SPIR-V, but I don't know
how useful that would be for Mesa.  Part of the problem is NIR loses a
lot of information about types (bool and matrix types), so a
SPIRV-NIR-SPIRV path would necessarily be lossy.

On the flip side, GLSL IR lacks a huge number of optimizations that
exist in NIR, so it's probably not a huge improvement over spirv-opt.

On 08/26/2017 02:33 PM, Nicolai Hähnle wrote:

Hey Ian,

Have you done any more concrete work on NIR serialization? See below...

Cheers,
Nicolai

On 26.08.2017 23:17, Daniel Schürmann wrote:

Hello Nicolai,

I'm a Master's student (CS) at TU Berlin and am currently writing an
OpenMP backend for clang using SPIR-V/OpenCL for my thesis. As I have
been interested in Mesa and graphics driver development for a long time, I
would like to get involved a little bit.

Recently, I read your [RFC] 

[Mesa-dev] Mesa 17.1.9 release candidate

2017-09-06 Thread Andres Gomez
Hello list,

The candidate for Mesa 17.1.9 is now available. Currently we have:
 - 27 queued
 - 0 nominated (outstanding)
 - and 3 rejected patches


In the current queue we have:

In Mesa Core we include a fix for a rendering problem detected while
using GoogleEarth with the VMware driver.

The state tracker received a couple of patches, one for properly
handling the vertex array double inputs and another for a redundant
initialization of the view template in the PBO downloads for the
ReadPixels implementation.

The GLSL compiler has received a fix for the counting of vertex shader
output slots.

The SPIR-V compiler has seen a fix for properly handling the
HelperInvocation builtin.

Intel's anv has solved some possible crashes by better dealing with
unknown VkFormat enums.

The etnaviv driver has also received some care.

AMD's radv has received several patches, including a couple of fixes
for improperly triggered asserts and another one for initializing the
usage flags in the command buffer.

Nouveau's nvc0 driver now performs a previously missing initialization
when querying HW statistics.

Gallivm has had a problem corrected with big endian architectures when
handling the color channels of a texture. Similarly, the llvmpipe
driver also got a fix for big endian architectures.

EGL has also received some fixes in the memory management avoiding
improper dereferences and plugging leaks, including some in the Wayland
platform.

From the build and integration point of view, we have improved the
detection of xlocale.h availability to allow newer builds with Glibc
2.26 or later to still make use of the locale-setting.

Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval
--------------------------------

Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.

The plan is to have 17.1.9 next Friday (8th of September), around or
shortly after 21:00 GMT.

If you have any questions or suggestions - be that about the current
patch queue or otherwise, please go ahead.


Trivial merge conflicts
-----------------------

commit 52e70819b4302d06dbfc3fd3dd655e79aefba7c8
Author: Michael Olbrich 

egl/dri2: only destroy created objects

(cherry picked from commit 81d5c31631840db704337489cf677cc596da79f5)

commit 018e602dc629dfbda52b5c7711b5867d72ce33c8
Author: Charmaine Lee 

vbo: fix offset in minmax cache key

(cherry picked from commit 2d93b462b4d978b0da417b35a7470e336bc4e783)

commit eb5eb5b26d3dffdbfac9c12a9113a16859a7f0c0
Author: Jason Ekstrand 

spirv: Add support for the HelperInvocation builtin

(cherry picked from commit e439908af9665b50443f1196cb55388c69d0c7d7)



Cheers,
Andres


Mesa stable queue
-----------------

Nominated (0)
=============


Queued (27)
===========

Andres Gomez (7):
  docs: add sha256 checksums for 17.1.8
  cherry-ignore: added 17.2 nominations.
  cherry-ignore: add "nir: Fix system_value_from_intrinsic for subgroups"
  cherry-ignore: add "i965: Fix crash in fallback GTT mapping."
  cherry-ignore: add "radeonsi/gfx9: always flush DB metadata on framebuffer changes"
  cherry-ignore: add "radv: Fix vkCopyImage with both depth and stencil aspects."
  cherry-ignore: add "radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug"

Bas Nieuwenhuizen (3):
  radv: Fix off by one in MAX_VBS assert.
  radv: Fix sparse BO mapping merging.
  radv: Actually set the cmd_buffer usage_flags.

Ben Crocker (1):
  llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load

Charmaine Lee (1):
  vbo: fix offset in minmax cache key

Christian Gmeiner (1):
  etnaviv: use correct param for etna_compatible_rs_format(..)

Emil Velikov (3):
  egl: don't NULL deref the .get_capabilities function pointer
  egl/wayland: plug leaks in dri2_wl_create_window_surface() error path
  egl/wayland: polish object teardown in dri2_wl_destroy_surface

Eric Engestrom (1):
  util: improve compiler guard

Grazvydas Ignotas (2):
  radv: clear dynamic_shader_stages on create
  radv: don't assert on empty hash table

Ilia Mirkin (2):
  glsl: fix counting of vertex shader output slots used by explicit vars
  st/mesa: fix handling of vertex array double inputs

Jason Ekstrand (2):
  anv/formats: Nicely handle unknown VkFormat enums
  spirv: Add support for the HelperInvocation builtin

Karol Herbst (1):
  nvc0: write 0 to pipeline_statistics.cs_invocations

Michael Olbrich (1):
  egl/dri2: only destroy created objects

Ray Strode (1):
  gallivm: correct channel shift logic on big endian

Roland Scheidegger (1):
  st/mesa: fix view template initialization in try_pbo_readpixels


Rejected (3)
============

Jason Ekstrand (1):
  nir: Fix system_value_from_intrinsic for subgroups

Depends on earlier commit 43ef75b394f which did not land in 

Re: [Mesa-dev] [PATCH 0/8] Resend of preprocessor series

2017-09-06 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

But it does NOT apply on current git any longer.
Is a new version with Nicolai's comments addressed underway? ;-)

Dieter

On 29.08.2017 21:56, Thomas Helland wrote:

This is a resend of the string buffer implementation and
related patches sent out back in May. I've done one more
change to the string buffer; using u_string.h for a compatible
vsnprintf version to reduce the code even more. I've not been
able to test this due to two build breakages (xmlpool and dri)
that I'm still trying to figure out. But since I promised
to send these out this evening, I'm sending them untested.
I did test them thoroughly the last time around though,
so I believe it should be mostly good as long as I haven't
messed up the rebasing. I believe the string buffer part of
the series is the most important; the rest I've not really
gotten around to performance test much.

Thomas Helland (7):
  util: Add a string buffer implementation
  util: Add tests for the string buffer
  glsl: Change the parser to use the string buffer
  glcpp: Use string_buffer for line continuation removal
  glcpp: Avoid unnecessary call to strlen
  port to gtest
  fix test makefile

Vladislav Egorov (1):
  glcpp: Use Bloom filter before identifier search

 configure.ac                                      |   2 +
 src/compiler/glsl/glcpp/glcpp-lex.l               |   3 +-
 src/compiler/glsl/glcpp/glcpp-parse.y             | 219 -
 src/compiler/glsl/glcpp/glcpp.h                   |  18 +-
 src/compiler/glsl/glcpp/pp.c                      |  64 ---
 src/util/Makefile.am                              |   3 +-
 src/util/Makefile.sources                         |   2 +
 src/util/string_buffer.c                          | 155 +++
 src/util/string_buffer.h                          |  87 +
 src/util/tests/string_buffer/Makefile.am          |  38
 src/util/tests/string_buffer/append_and_print.cpp | 221 ++
 11 files changed, 633 insertions(+), 179 deletions(-)
 create mode 100644 src/util/string_buffer.c
 create mode 100644 src/util/string_buffer.h
 create mode 100644 src/util/tests/string_buffer/Makefile.am
 create mode 100644 src/util/tests/string_buffer/append_and_print.cpp



Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators

2017-09-06 Thread Matt Turner
I feel like the commit message is missing some important information.
What does this fix? Do we have a piglit test? I don't see one from
you. I see that someone has replied with a Tested-by, so presumably
they know what it's intended to fix.


Re: [Mesa-dev] [PATCH 13/23] anv: Better support for Android logging

2017-09-06 Thread Chad Versace
On Sat 02 Sep 2017, Jason Ekstrand wrote:
> This is going to conflict badly with tapani's work to implement
> VK_EXT_debug_report (which I need to finish reviewing).

Understood. It probably conflicts with other in-flight patches too.


Re: [Mesa-dev] [PATCH 12/23] intel: Add simple logging façade for Android

2017-09-06 Thread Chad Versace
On Mon 04 Sep 2017, Eero Tamminen wrote:
> Hi,
> 
> On 02.09.2017 11:17, Chad Versace wrote:
> > I'm bringing up Vulkan in the Android container of Chrome OS (ARC++).
> > 
> > On Android, stdio goes to /dev/null. On Android, remote gdb is even more
> > painful than the usual remote gdb. On Android, nothing works like you
> > expect and debugging is hell. I need logging.
> 
> Would non-remote Gdb work better?
> 
> I.e. use a chroot containing your normal Linux setup inside your Android,
> and use tools from that to debug Android stuff outside the chroot.
> 
> Everything that doesn't need to be inside the debugged process (like
> LD_PRELOAD tools) such as Gdb, "perf" etc, should work fine as long as you
> mount /dev, /proc, /sys there, along with having (the non-stripped versions
> of) the Android binaries in same path within the chroot, as they're outside.
> 
> (At least that worked fine for me few years ago, when I needed to debug &
> profile Android stuff.  If security is nowadays tightened, you may need to
> use your own more relaxed kernel config.)

The ARC++ environment is locked down even more tightly than regular
Android. For example, all syscalls go through a translation table in the
kernel that may rewrite the arguments; and all syscalls must be
explicitly whitelisted. I expect local-gdb-through-a-chroot may have
difficulty there. Anyway, disk space is too scarce for a chroot.


Re: [Mesa-dev] [PATCH 12/23] intel: Add simple logging façade for Android

2017-09-06 Thread Chad Versace
On Tue 05 Sep 2017, Rob Herring wrote:
> On Sat, Sep 2, 2017 at 3:17 AM, Chad Versace  wrote:
> > I'm bringing up Vulkan in the Android container of Chrome OS (ARC++).
> >
> > On Android, stdio goes to /dev/null. On Android, remote gdb is even more
> > painful than the usual remote gdb. On Android, nothing works like you
> > expect and debugging is hell. I need logging.
> 
> We do!
> 
> You used to be able to do logwrapper at least for system level
> services, but that now is a pain to get working thanks to SELinux.
> 
> > This patch introduces a small, simple logging API that can easily wrap
> > Android's API. On non-Android platforms, this logger does nothing fancy.
> > It follows the time-honored Unix tradition of spewing everything to
> > stderr with minimal fuss.
> >
> > My goal here is not perfection. My goal is to make a minimal, clean API,
> > that people hate merely a little instead of a lot, and that's good
> > enough to let me bring up Android Vulkan.  And it needs to be fast,
> > which means it must be small. No one wants their game to miss frames
> > while aiming a flaming bow into the jaws of an angry robot t-rex, and
> > thus become t-rex breakfast, because some fool had too much fun designing
> > a bloated, ideal logging API.
> >
> > If people like it, perhaps we should quickly promote it to src/util.
> 
> The only thing I don't like is being Intel specific.

Of course, I would rename everything to be generic non-Intel (u_log,
perhaps?) when promoting to src/util.

> There's already a
> gallium API (with Android support floating around) as well as ddebug
> (which I started Android support for, but haven't gotten that working
> yet). Of course, some things still just call fprintf(stderr,...) or
> other C lib functions directly. I've hacked up files with "#define
> fprintf() ALOGE()" in places I've needed it.


Re: [Mesa-dev] [PATCH 12/23] intel: Add simple logging façade for Android

2017-09-06 Thread Chad Versace
On Sat 02 Sep 2017, Jason Ekstrand wrote:
> On Sat, Sep 2, 2017 at 1:17 AM, Chad Versace <chadvers...@chromium.org>
> wrote:

> My goal here is not perfection. My goal is to make a minimal, clean API,
> that people hate merely a little instead of a lot, and that's good
> enough to let me bring up Android Vulkan.  And it needs to be fast,
> which means it must be small. No one wants their game to miss frames
> while aiming a flaming bow into the jaws of an angry robot t-rex, and
> thus become t-rex breakfast, because some fool had too much fun designing
> a bloated, ideal logging API.
> 
> 
> I don't actually hate it at all. In fact, I rather like it.  Sadly, we
> probably need a bit more indirection in Vulkan thanks to VK_EXT_debug_report
> but it should be pretty easy to tie in here.
>  
> 
> If people like it, perhaps we should quickly promote it to src/util.
> 
> 
> I'd be a fan.

Would you rather see this land as-is, then promote? Or promote, then
land?

> +#define intel_loge(fmt, ...) intel_log(INTEL_LOG_ERROR, (INTEL_LOG_TAG), (fmt), ##__VA_ARGS__)
> +#define intel_logw(fmt, ...) intel_log(INTEL_LOG_WARN, (INTEL_LOG_TAG), (fmt), ##__VA_ARGS__)
> +#define intel_logi(fmt, ...) intel_log(INTEL_LOG_INFO, (INTEL_LOG_TAG), (fmt), ##__VA_ARGS__)
> +#ifdef DEBUG
> +#define intel_logd(fmt, ...) intel_log(INTEL_LOG_DEBUG, (INTEL_LOG_TAG), (fmt), ##__VA_ARGS__)
> +#else
> +#define intel_logd(fmt, ...) __intel_log_use_args((fmt), ##__VA_ARGS__)
> +#endif
> 
> 
> I'm not sure if ignoring debug logging is best done here or at some slightly
> higher level.  I think here is probably fine.

I had the same thoughts. I tentatively chose to do it here because
I wanted to keep the patch simple.
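
For anyone skimming the thread, here is a rough sketch of the behaviour under discussion: one entry point per level, everything written to stderr on non-Android builds, and debug messages dropped unless a DEBUG switch is set. It is written in Python purely as an illustration. The function names, the "E/tag: message" line format, and the DEBUG variable are assumptions standing in for the C macros quoted above, not the patch's actual implementation.

```python
import sys

# Stand-in for the C preprocessor's DEBUG define (an assumption of this sketch).
DEBUG = False

# One-letter level prefixes; the exact output format is hypothetical.
LEVELS = {"error": "E", "warn": "W", "info": "I", "debug": "D"}


def log(level, tag, fmt, *args, stream=None):
    """Write one formatted line to stderr (or to the given stream)."""
    stream = sys.stderr if stream is None else stream
    stream.write("%s/%s: %s\n" % (LEVELS[level], tag, fmt % args))


def logd(tag, fmt, *args, stream=None):
    """Debug-level wrapper: a no-op unless DEBUG is set."""
    if DEBUG:
        log("debug", tag, fmt, *args, stream=stream)
```

Deciding at the wrapper whether debug output is emitted keeps the core log() function unconditional, which matches the "keep the patch simple" reasoning above.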


Re: [Mesa-dev] [PATCH] fixup! anv/android: Disable surface and swapchain extensions

2017-09-06 Thread Jason Ekstrand
On Wed, Sep 6, 2017 at 1:09 PM, Chad Versace  wrote:

> ---
>
> Jason, did you envision a cleanup like this?
>

Yes, this is better.  Eventually, I think we can make it better.

--Jason


>  src/intel/vulkan/anv_extensions.py | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_
> extensions.py
> index 18062359d31..747b36b71f5 100644
> --- a/src/intel/vulkan/anv_extensions.py
> +++ b/src/intel/vulkan/anv_extensions.py
> @@ -44,7 +44,7 @@ class Extension:
>  else:
>  self.enable = enable;
>
> -# On Android, disable all surface and swapchain extensions. Android's
> Vulkan
> +# On Android, we disable all surface and swapchain extensions. Android's
> Vulkan
>  # loader implements VK_KHR_surface and VK_KHR_swapchain, and applications
>  # cannot access the driver's implementation. Moreoever, if the driver
> exposes
>  # the those extension strings, then tests dEQP-VK.api.info.instance.
> extensions
> @@ -66,7 +66,7 @@ EXTENSIONS = [
>  Extension('VK_KHR_external_semaphore_fd', 1, True),
>  Extension('VK_KHR_get_memory_requirements2',  1, True),
>  Extension('VK_KHR_get_physical_device_properties2',   1, True),
> -Extension('VK_KHR_get_surface_capabilities2', 1, '!ANDROID'),
> +Extension('VK_KHR_get_surface_capabilities2', 1,
> 'ANV_HAS_SURFACE'),
>  Extension('VK_KHR_incremental_present',   1, True),
>  Extension('VK_KHR_maintenance1',  1, True),
>  Extension('VK_KHR_push_descriptor',   1, True),
> @@ -74,12 +74,12 @@ EXTENSIONS = [
>  Extension('VK_KHR_sampler_mirror_clamp_to_edge',  1, True),
>  Extension('VK_KHR_shader_draw_parameters',1, True),
>  Extension('VK_KHR_storage_buffer_storage_class',  1, True),
> -Extension('VK_KHR_surface',  25, '!ANDROID'),
> -Extension('VK_KHR_swapchain',68, '!ANDROID'),
> +Extension('VK_KHR_surface',  25,
> 'ANV_HAS_SURFACE'),
> +Extension('VK_KHR_swapchain',68,
> 'ANV_HAS_SURFACE'),
>  Extension('VK_KHR_variable_pointers', 1, True),
> -Extension('VK_KHR_wayland_surface',   6,
> 'VK_USE_PLATFORM_WAYLAND_KHR && !ANDROID'),
> -Extension('VK_KHR_xcb_surface',   6,
> 'VK_USE_PLATFORM_XCB_KHR && !ANDROID'),
> -Extension('VK_KHR_xlib_surface',  6,
> 'VK_USE_PLATFORM_XLIB_KHR && !ANDROID'),
> +Extension('VK_KHR_wayland_surface',   6,
> 'VK_USE_PLATFORM_WAYLAND_KHR'),
> +Extension('VK_KHR_xcb_surface',   6,
> 'VK_USE_PLATFORM_XCB_KHR'),
> +Extension('VK_KHR_xlib_surface',  6,
> 'VK_USE_PLATFORM_XLIB_KHR'),
>  Extension('VK_KHX_multiview', 1, True),
>  ]
>
> @@ -176,6 +176,10 @@ _TEMPLATE = Template(COPYRIGHT + """
>  #   define ANDROID false
>  #endif
>
> +#define ANV_HAS_SURFACE (VK_USE_PLATFORM_WAYLAND_KHR || \\
> + VK_USE_PLATFORM_XCB_KHR || \\
> + VK_USE_PLATFORM_XLIB_KHR)
> +
>  bool
>  anv_instance_extension_supported(const char *name)
>  {
> --
> 2.13.5
>
>


[Mesa-dev] [PATCH] fixup! anv/android: Disable surface and swapchain extensions

2017-09-06 Thread Chad Versace
---

Jason, did you envision a cleanup like this?

 src/intel/vulkan/anv_extensions.py | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 18062359d31..747b36b71f5 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -44,7 +44,7 @@ class Extension:
 else:
 self.enable = enable;
 
-# On Android, disable all surface and swapchain extensions. Android's Vulkan
+# On Android, we disable all surface and swapchain extensions. Android's Vulkan
 # loader implements VK_KHR_surface and VK_KHR_swapchain, and applications
 # cannot access the driver's implementation. Moreoever, if the driver exposes
 # the those extension strings, then tests dEQP-VK.api.info.instance.extensions
@@ -66,7 +66,7 @@ EXTENSIONS = [
 Extension('VK_KHR_external_semaphore_fd', 1, True),
 Extension('VK_KHR_get_memory_requirements2',  1, True),
 Extension('VK_KHR_get_physical_device_properties2',   1, True),
-Extension('VK_KHR_get_surface_capabilities2', 1, '!ANDROID'),
+Extension('VK_KHR_get_surface_capabilities2', 1, 
'ANV_HAS_SURFACE'),
 Extension('VK_KHR_incremental_present',   1, True),
 Extension('VK_KHR_maintenance1',  1, True),
 Extension('VK_KHR_push_descriptor',   1, True),
@@ -74,12 +74,12 @@ EXTENSIONS = [
 Extension('VK_KHR_sampler_mirror_clamp_to_edge',  1, True),
 Extension('VK_KHR_shader_draw_parameters',1, True),
 Extension('VK_KHR_storage_buffer_storage_class',  1, True),
-Extension('VK_KHR_surface',  25, '!ANDROID'),
-Extension('VK_KHR_swapchain',68, '!ANDROID'),
+Extension('VK_KHR_surface',  25, 
'ANV_HAS_SURFACE'),
+Extension('VK_KHR_swapchain',68, 
'ANV_HAS_SURFACE'),
 Extension('VK_KHR_variable_pointers', 1, True),
-Extension('VK_KHR_wayland_surface',   6, 
'VK_USE_PLATFORM_WAYLAND_KHR && !ANDROID'),
-Extension('VK_KHR_xcb_surface',   6, 
'VK_USE_PLATFORM_XCB_KHR && !ANDROID'),
-Extension('VK_KHR_xlib_surface',  6, 
'VK_USE_PLATFORM_XLIB_KHR && !ANDROID'),
+Extension('VK_KHR_wayland_surface',   6, 
'VK_USE_PLATFORM_WAYLAND_KHR'),
+Extension('VK_KHR_xcb_surface',   6, 
'VK_USE_PLATFORM_XCB_KHR'),
+Extension('VK_KHR_xlib_surface',  6, 
'VK_USE_PLATFORM_XLIB_KHR'),
 Extension('VK_KHX_multiview', 1, True),
 ]
 
@@ -176,6 +176,10 @@ _TEMPLATE = Template(COPYRIGHT + """
 #   define ANDROID false
 #endif
 
+#define ANV_HAS_SURFACE (VK_USE_PLATFORM_WAYLAND_KHR || \\
+ VK_USE_PLATFORM_XCB_KHR || \\
+ VK_USE_PLATFORM_XLIB_KHR)
+
 bool
 anv_instance_extension_supported(const char *name)
 {
-- 
2.13.5
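
To make the effect of the new guard easier to see, here is a small, hypothetical Python model of the enable expressions touched by this patch. It assumes that each guard is either True or the name of a C macro, that an undefined platform macro counts as false, and that an Android build defines none of the three window-system platform macros. The helper name enabled_extensions() is made up for this sketch and is not part of anv_extensions.py.

```python
# Platform macros that feed ANV_HAS_SURFACE in the patch.
WSI_PLATFORMS = (
    "VK_USE_PLATFORM_WAYLAND_KHR",
    "VK_USE_PLATFORM_XCB_KHR",
    "VK_USE_PLATFORM_XLIB_KHR",
)

# (extension name, enable guard) pairs, a subset of the table in the patch.
EXTENSIONS = [
    ("VK_KHR_surface", "ANV_HAS_SURFACE"),
    ("VK_KHR_swapchain", "ANV_HAS_SURFACE"),
    ("VK_KHR_wayland_surface", "VK_USE_PLATFORM_WAYLAND_KHR"),
    ("VK_KHR_xcb_surface", "VK_USE_PLATFORM_XCB_KHR"),
    ("VK_KHR_xlib_surface", "VK_USE_PLATFORM_XLIB_KHR"),
    ("VK_KHX_multiview", True),
]


def enabled_extensions(defined_platforms):
    """Return the extensions whose guard holds for the given platform macros."""
    macros = {name: name in defined_platforms for name in WSI_PLATFORMS}
    # Mirrors the #define added to the template: true if any WSI platform is on.
    macros["ANV_HAS_SURFACE"] = any(macros[name] for name in WSI_PLATFORMS)
    return [
        name
        for name, enable in EXTENSIONS
        if enable is True or (enable is not False and macros[enable])
    ]
```

With no platform macros defined, only VK_KHX_multiview survives in this model, which is the behaviour the "!ANDROID" guards previously spelled out by hand.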



[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #11 from Gert Wollny  ---
Actually this patch seems to fix some issues I had with some applications
lately (specifically QtCreator didn't redraw properly and old screen content
would show up instead).

Tested-By: Gert Wollny 

on r600g/HD 6870



[Mesa-dev] [Bug 102565] u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102565

--- Comment #2 from Gert Wollny  ---
A workaround is to compile with --disable-libunwind, a patch to correct the
problem has been submitted to mesa-dev.



[Mesa-dev] [PATCH 1/2] mesa/st/tests: Fix classic build regressions introduced with 7be6d8fe12

2017-09-06 Thread Gert Wollny
Fixes the build in classic-only mode, i.e. the new state tracker tests are
only built when Gallium is enabled.
---
 src/mesa/Makefile.am | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index 865735be27..f2097eb209 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -19,7 +19,12 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-SUBDIRS = . main/tests state_tracker/tests
+SUBDIRS = . main/tests
+
+# state tracker tests depend on libmesagallium.la
+if HAVE_GALLIUM
+SUBDIRS += state_tracker/tests
+endif
 
 if HAVE_XLIB_GLX
 SUBDIRS += drivers/x11
-- 
2.13.5



[Mesa-dev] [PATCH 2/2] mesa/st/tests: Fix regressions with libunwind enabled introduced with 7be6d8fe12

2017-09-06 Thread Gert Wollny
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102565

Add the corresponding flags to link with libunwind.
---
 src/mesa/state_tracker/tests/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/tests/Makefile.am 
b/src/mesa/state_tracker/tests/Makefile.am
index fb64cf9dc2..12ae7fab10 100644
--- a/src/mesa/state_tracker/tests/Makefile.am
+++ b/src/mesa/state_tracker/tests/Makefile.am
@@ -31,6 +31,7 @@ st_renumerate_test_LDADD = \
$(top_builddir)/src/util/libmesautil.la \
$(top_builddir)/src/gtest/libgtest.la \
$(GALLIUM_COMMON_LIB_DEPS) \
+   $(LIBUNWIND_LIBS) \
$(LLVM_LIBS) \
$(PTHREAD_LIBS) \
$(DLOPEN_LIBS)
-- 
2.13.5



[Mesa-dev] glDrawBuffer crashes in case of surfaceless context

2017-09-06 Thread Volker Vogelhuber


I'm currently creating a surfaceless OpenGL context using the
EGL_KHR_surfaceless_context extension together with
eglGetPlatformDisplay/EGL_PLATFORM_GBM_MESA. So my default
framebuffer has no real buffers. I normally only render
to textures bound to FBOs. Due to an error on my side I called
glDrawBuffer with GL_FRONT_LEFT while no FBO was bound. This
results in a crash in intel_buffers.c because in intelDrawBuffer()
dri2InvalidateDrawable is called with a null pointer which is
not checked in dri2InvalidateDrawable() or anywhere before.
While the root cause for triggering the error is on my side,
I think it may be better to raise an error instead of crashing.
So I propose to add a check to brw->driContext->driDrawablePriv
within intelDrawBuffer. Probably if the driDrawablePriv is nullptr
one should not call intel_prepare_render either.

Regards
   Volker


[Mesa-dev] [PATCH] intel: Remove unused Kabylake pci id

2017-09-06 Thread Anuj Phogat
I missed this one in Mesa commit ebc5ccf.

Signed-off-by: Anuj Phogat 
Cc: Matt Turner 
---
 include/pci_ids/i965_pci_ids.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 4a51e44..655d579 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -149,7 +149,6 @@ CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x5917, kbl_gt2, "Intel(R) UHD Graphics 620 (Kabylake GT2)")
 CHIPSET(0x5912, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
 CHIPSET(0x5916, kbl_gt2, "Intel(R) HD Graphics 620 (Kaby Lake GT2)")
-CHIPSET(0x591A, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
 CHIPSET(0x591B, kbl_gt2, "Intel(R) HD Graphics 630 (Kaby Lake GT2)")
 CHIPSET(0x591D, kbl_gt2, "Intel(R) HD Graphics P630 (Kaby Lake GT2)")
 CHIPSET(0x591E, kbl_gt2, "Intel(R) HD Graphics 615 (Kaby Lake GT2)")
-- 
2.9.4



Re: [Mesa-dev] [PATCH 1/5] intel: Remove unused Kabylake pci ids

2017-09-06 Thread Anuj Phogat
On Tue, Sep 5, 2017 at 5:13 PM, Matt Turner  wrote:
> The series is
>
> Reviewed-by: Matt Turner 
>
> I think it should be tagged for the stable branch as well. Does anyone
> else have an opinion?
Yes, I'll tag them for stable and send to mesa-stable.

I left an unused pci-id (0x591A) in Mesa but I already pushed this series.
So, I'll send a separate patch to remove it.

>
> I tested a KBL-R system (the 0x5917 PCI ID) with it set as a GT1.5 and
> a GT2 and in both cases it passed piglit.
>
> Are you planning to send patches for the kernel and libdrm? If not, I
> can handle that.


Re: [Mesa-dev] [PATCH v2] i965/fs: Define new shader opcode to set rounding modes

2017-09-06 Thread Chema Casanova

On 05/09/17 23:41, Francisco Jerez wrote:
> Alejandro Piñeiro  writes:
> 
>> Although it is possible to emit them directly as AND/OR on brw_fs_nir,
>> having a specific opcode makes it easier to remove duplicate settings
>> later.
>>
>> v2: (Curro)
>>   - Set thread control to 'switch' when using the control register
>>   - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate
>> with the rounding mode.
>>   - Avoid magic numbers setting rounding mode field at control register.
>>
>> Signed-off-by:  Alejandro Piñeiro 
>> Signed-off-by:  Jose Maria Casanova Crespo 
>> ---
>>  src/intel/compiler/brw_eu.h |  3 +++
>>  src/intel/compiler/brw_eu_defines.h | 17 +
>>  src/intel/compiler/brw_eu_emit.c| 34 
>> +
>>  src/intel/compiler/brw_fs_generator.cpp |  5 +
>>  src/intel/compiler/brw_shader.cpp   |  4 
>>  5 files changed, 63 insertions(+)
>>
>> diff --git a/src/intel/compiler/brw_eu.h b/src/intel/compiler/brw_eu.h
>> index 8e597b212a6..106bf03530d 100644
>> --- a/src/intel/compiler/brw_eu.h
>> +++ b/src/intel/compiler/brw_eu.h
>> @@ -500,6 +500,9 @@ brw_broadcast(struct brw_codegen *p,
>>struct brw_reg src,
>>struct brw_reg idx);
>>  
>> +void
>> +brw_rounding_mode(struct brw_codegen *p,
>> +  enum brw_rnd_mode mode);
> 
> Missing whitespace line.

Ok

> 
>>  /***
>>   * brw_eu_util.c:
>>   */
>> diff --git a/src/intel/compiler/brw_eu_defines.h 
>> b/src/intel/compiler/brw_eu_defines.h
>> index da482b73c58..91d88fe8952 100644
>> --- a/src/intel/compiler/brw_eu_defines.h
>> +++ b/src/intel/compiler/brw_eu_defines.h
>> @@ -388,6 +388,9 @@ enum opcode {
>> SHADER_OPCODE_TYPED_SURFACE_WRITE,
>> SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL,
>>  
>> +
> 
> Redundant whitespace.

OK.

> 
>> +   SHADER_OPCODE_RND_MODE,
>> +
>> SHADER_OPCODE_MEMORY_FENCE,
>>  
>> SHADER_OPCODE_GEN4_SCRATCH_READ,
>> @@ -1214,4 +1217,18 @@ enum brw_message_target {
>>  /* R0 */
>>  # define GEN7_GS_PAYLOAD_INSTANCE_ID_SHIFT  27
>>  
>> +/* CR0.0[5:4] Floating-Point Rounding Modes
>> + *  Skylake PRM, Volume 7 Part 1, "Control Register", page 756
>> + */
>> +
>> +#define BRW_CR0_RND_MODE_MASK 0x30
>> +#define BRW_CR0_RND_MODE_SHIFT4
>> +
>> +enum PACKED brw_rnd_mode {
>> +   BRW_RND_MODE_RTNE = 0,  /* Round to Nearest or Even */
>> +   BRW_RND_MODE_RU = 1,/* Round Up, toward +inf */
>> +   BRW_RND_MODE_RD = 2,/* Round Down, toward -inf */
>> +   BRW_RND_MODE_RTZ = 3/* Round Toward Zero */
>> +};
>> +
>>  #endif /* BRW_EU_DEFINES_H */
>> diff --git a/src/intel/compiler/brw_eu_emit.c 
>> b/src/intel/compiler/brw_eu_emit.c
>> index 8c952e7da26..12164653e47 100644
>> --- a/src/intel/compiler/brw_eu_emit.c
>> +++ b/src/intel/compiler/brw_eu_emit.c
>> @@ -3530,3 +3530,37 @@ brw_WAIT(struct brw_codegen *p)
>> brw_inst_set_exec_size(devinfo, insn, BRW_EXECUTE_1);
>> brw_inst_set_mask_control(devinfo, insn, BRW_MASK_DISABLE);
>>  }
>> +
>> +/**
>> + * Changes the floating point rounding mode updating the control register
>> + * field defined at cr0.0[5-6] bits. This function supports the changes to
>> + * RTNE (00), RU (01), RD (10) and RTZ (11) rounding using bitwise 
>> operations.
>> + * Only RTNE and RTZ rounding are enabled at nir.
>> + */
>> +
> 
> Redundant whitespace.

OK.

> 
>> +void
>> +brw_rounding_mode(struct brw_codegen *p,
>> +  enum brw_rnd_mode mode)
>> +{
>> +   const unsigned bits  = mode << BRW_CR0_RND_MODE_SHIFT;
>> +
>> +   if (bits != BRW_CR0_RND_MODE_MASK) {
>> +  brw_inst *inst = brw_AND(p, brw_cr0_reg(0), brw_cr0_reg(0),
>> +   brw_imm_ud(~BRW_CR0_RND_MODE_MASK));
>> +
>> +  /* From the Skylake PRM, Volume 7, page 760:
>> +   *  "Implementation Restriction on Register Access: When the control
>> +   *   register is used as an explicit source and/or destination, 
>> hardware
>> +   *   does not ensure execution pipeline coherency. Software must set 
>> the
>> +   *   thread control field to ‘switch’ for an instruction that uses
>> +   *   control register as an explicit operand."
>> +   */
>> +  brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
>> +}
>> +
>> +   if (bits) {
>> +  brw_inst *inst = brw_OR(p, brw_cr0_reg(0), brw_cr0_reg(0),
>> +  brw_imm_ud(bits));
>> +  brw_inst_set_thread_control(p->devinfo, inst, BRW_THREAD_SWITCH);
>> +   }
>> +}
>> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
>> b/src/intel/compiler/brw_fs_generator.cpp
>> index afaec5c9497..ff9880ebfe8 100644
>> --- a/src/intel/compiler/brw_fs_generator.cpp
>> +++ b/src/intel/compiler/brw_fs_generator.cpp
>> @@ -2144,6 +2144,11 @@ fs_generator::generate_code(const cfg_t *cfg, int 
>> 
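
As an aside for readers, the AND/OR sequence in the quoted brw_rounding_mode() can be checked with a few lines of Python. This is only a sketch: it models cr0.0 as a plain 32-bit integer and records the operations that would be emitted. The mask and shift values are taken from the quoted patch (0x30 covers bits 4 and 5); the helper names rounding_mode_ops() and apply_ops() are invented for illustration.

```python
BRW_CR0_RND_MODE_MASK = 0x30
BRW_CR0_RND_MODE_SHIFT = 4

# Rounding mode encodings from the quoted enum.
RTNE, RU, RD, RTZ = 0, 1, 2, 3


def rounding_mode_ops(mode):
    """Return the (op, immediate) pairs the quoted function would emit."""
    bits = mode << BRW_CR0_RND_MODE_SHIFT
    ops = []
    # Clearing the field is skipped when every bit of it will be set anyway.
    if bits != BRW_CR0_RND_MODE_MASK:
        ops.append(("and", ~BRW_CR0_RND_MODE_MASK & 0xFFFFFFFF))
    # Setting bits is skipped when the target field is all zeros.
    if bits:
        ops.append(("or", bits))
    return ops


def apply_ops(cr0, ops):
    """Apply the recorded operations to a 32-bit control register value."""
    for op, imm in ops:
        cr0 = (cr0 & imm) if op == "and" else (cr0 | imm)
    return cr0
```

In this model RTNE needs only the AND, RTZ needs only the OR, and RU and RD need both, while every other bit of the register is left untouched.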

[Mesa-dev] [Bug 102502] [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102502

--- Comment #4 from Alexandre Demers  ---
(In reply to Dieter Nützel from comment #3)
> Hello Alexandre,
> 
> have you verified, that Marek's patch (fix) for this works?
> 
> [Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0
> https://lists.freedesktop.org/archives/mesa-dev/2017-September/168319.html
> 
> Greetings,
> Dieter

No, I was not aware of that patch (could have been linked here). I'll test it
later today.



[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #10 from Bruce Cherniak  ---
Tested-by: Bruce Cherniak 



[Mesa-dev] [Bug 102502] [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102502

--- Comment #3 from Dieter Nützel  ---
Hello Alexandre,

have you verified, that Marek's patch (fix) for this works?

[Mesa-dev] [PATCH] st/mesa: skip draw calls with pipe_draw_info::count == 0
https://lists.freedesktop.org/archives/mesa-dev/2017-September/168319.html

Greetings,
Dieter



Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators

2017-09-06 Thread Dieter Nützel

Tested-by: Dieter Nützel 

Dieter

Am 04.09.2017 05:29, schrieb Timothy Arceri:

This code incorrectly assumed that loop terminators will always be
at the start of the loop. Fortunately we *seem* to avoid any bugs
because the unrolling code loops over and correctly handles the
terminators.

However the incorrect analysis can result in loops not being
unrolled at all. For example the current code would unroll:

  int j = 0;
  do {
 if (j > 5)
break;

 ... do stuff ...

 j++;
  } while (j < 4);

But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:

  int j = 0;
  do {
 ... do stuff ...

 j++;
  } while (j < 4);

Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather than 2 as it
wasn't able to determine that "if (j > 5)" was redundant.

  int j = 0;
  do {
 if (j > 5)
break;

 ... do stuff ...

 if (bool(i))
break;

 j++;
  } while (j < 4);
---
 src/compiler/glsl/loop_analysis.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/compiler/glsl/loop_analysis.cpp
b/src/compiler/glsl/loop_analysis.cpp
index b9bae43536..253a405dfb 100644
--- a/src/compiler/glsl/loop_analysis.cpp
+++ b/src/compiler/glsl/loop_analysis.cpp
@@ -290,22 +290,20 @@ loop_analysis::visit_leave(ir_loop *ir)
foreach_in_list(ir_instruction, node, &ir->body_instructions) {
   /* Skip over declarations at the start of a loop.
*/
   if (node->as_variable())
 continue;

   ir_if *if_stmt = ((ir_instruction *) node)->as_if();

   if ((if_stmt != NULL) && is_loop_terminator(if_stmt))
 ls->insert(if_stmt);
-  else
-break;
}


foreach_in_list_safe(loop_variable, lv, &ls->variables) {
   /* Move variables that are already marked as being loop constant 
to

* a separate list.  These trivially don't need to be tested.
*/
   if (lv->is_loop_constant()) {
 lv->remove();
 ls->constants.push_tail(lv);
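
The effect of dropping the early "break" from the terminator scan can be illustrated with a toy model. The Python below is only a sketch and not Mesa's analysis: it assumes the do-while condition is represented as a conditional break at the end of the body, it uses a single counter j, and all names (find_terminators, iteration_limit, the node tuples) are hypothetical.

```python
def find_terminators(body, scan_whole_body):
    """Collect conditional breaks; the old behaviour stops at the first other node."""
    found = []
    for node in body:
        if node[0] == "decl":
            continue  # declarations at the start of the loop are skipped
        if node[0] == "break_if":
            found.append(node)
        elif not scan_whole_body:
            break  # models the "else break;" removed by the patch
    return found


def iteration_limit(body, terminators, cap=64):
    """Count how often the body's statement runs before a known terminator fires."""
    j = 0
    runs = 0
    for _ in range(cap):
        for node in body:
            kind = node[0]
            if kind == "stmt":
                runs += 1
            elif kind == "inc":
                j += 1
            elif kind == "break_if":
                known = any(node is t for t in terminators)
                if known and node[1](j):
                    return runs
    return None  # no limit could be derived from the known terminators


# do { if (j > 5) break; ...; j++; } while (j < 4);
with_early_break = [
    ("break_if", lambda j: j > 5),
    ("stmt",),
    ("inc",),
    ("break_if", lambda j: not (j < 4)),
]

# do { ...; j++; } while (j < 4);
without_early_break = with_early_break[1:]
```

In this model, scanning only the leading terminators yields a limit of 6 for the first loop and none for the second, while scanning the whole body yields 4 for both.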



Re: [Mesa-dev] [PATCH] gallium: build ddebug, noop, rbug, trace as part of auxiliary

2017-09-06 Thread Emil Velikov
On 6 September 2017 at 18:13, Marek Olšák  wrote:
> On Wed, Sep 6, 2017 at 2:08 PM, Emil Velikov  wrote:
>> On 6 September 2017 at 12:11, Marek Olšák  wrote:
>>> On Wed, Sep 6, 2017 at 1:02 PM, Marek Olšák  wrote:
>>>> On Wed, Sep 6, 2017 at 12:38 PM, Emil Velikov
>>>> wrote:
>>>>> On 4 September 2017 at 21:36, Marek Olšák  wrote:
>>>>>> From: Marek Olšák
>>>>>>
>>>>>> Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
>>>>>> (gallium build time is reduced by 15% when building only radeonsi)
>>>>>>
>>>>> Some of this can be attributed to a couple libraries less to link.
>>>>> Speaking of which, did you switch to the gold linker, it should
>>>>> utilise the multiple cores/threads nicely.
>>>>
>>>> Sadly no.
>>>
>>> How do I switch to the gold linker?
>>>
>> There's multiple ways:
>>  - manually (or via binutils-config --linker ld.gold) set the default 
>> /usr/bin/ld
>> By default Arch has _hardlink_ to ld.bfd
>>  - export LD=ld.gold // haven't tried it
>>  - Add -fuse-ld=gold to the LDFLAGS (you may need a libtool patch [1])
>>
>> HTH
>> Emil
>>
>> [1] 
>> http://git.savannah.gnu.org/cgit/libtool.git/commit/?id=f9970d99293faf908fdc153a653fa5781095fb7a
>
> Thanks. I used LDFLAGS and it does improve build times, but the linking
> still seems mostly single-threaded.
>
One can control the thread count at the different stages - see the
ld.gold --help.
You can also try lld it might be faster[1].

-Emil

[1] http://lld.llvm.org/#performance


Re: [Mesa-dev] [PATCH 1/2] st/glsl_to_tgsi: be precise about merging scopes

2017-09-06 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

Am 06.09.2017 11:54, schrieb Nicolai Hähnle:

From: Nicolai Hähnle 

enclosing_scope already contains enclosing_scope_first_read.
What we really want to check here -- not for correctness, but
for speed -- is whether last_read_scope already contains
enclosing_scope.
---
 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
index d984184e701..8ba1f5bb0be 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -526,22 +526,22 @@ lifetime 
temp_comp_access::get_required_lifetime()

   enclosing_scope_first_write = conditional->outermost_loop();
}

/* Evaluate the scope that is shared by all: required first write 
scope,

 * required first read before write scope, and last read scope.
 */
const prog_scope *enclosing_scope = enclosing_scope_first_read;
if 
(enclosing_scope_first_write->contains_range_of(*enclosing_scope))

   enclosing_scope = enclosing_scope_first_write;

-   if 
(enclosing_scope_first_read->contains_range_of(*enclosing_scope))

-  enclosing_scope = enclosing_scope_first_read;
+   if (last_read_scope->contains_range_of(*enclosing_scope))
+  enclosing_scope = last_read_scope;

while 
(!enclosing_scope->contains_range_of(*enclosing_scope_first_write) ||

   !enclosing_scope->contains_range_of(*last_read_scope)) {
   enclosing_scope = enclosing_scope->parent();
   assert(enclosing_scope);
}

/* Propagate the last read scope to the target scope */
while (enclosing_scope->nesting_depth() <
last_read_scope->nesting_depth()) {
   /* If the read is in a loop and we have to move up the scope we 
need to



[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #22 from Alexandre Demers  ---
(In reply to Emil Velikov from comment #21)
> The attached patch from Alexandre seems to be a noop.
> 
> This series from Eric treats MESA_NO_ERROR as boolean.
> https://patchwork.freedesktop.org/series/29888/

I like the boolean usage... and maybe prefer it to my proposed patch.



[Mesa-dev] [Bug 102564] swr: GPU Caps Viewer crashes with any 3D demo

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102564

Alex Granni  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unspecified                 |17.2

--- Comment #1 from Alex Granni  ---
Regression in 17.2.0. v17.1.8 is unaffected.



[Mesa-dev] [Bug 102565] u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102565

Vinson Lee  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |bisected
                 CC|                            |gw.foss...@gmail.com,
                   |                            |nhaeh...@gmail.com

--- Comment #1 from Vinson Lee  ---
7be6d8fe1250f3b1d5fb2347839567049526c5be is the first bad commit
commit 7be6d8fe1250f3b1d5fb2347839567049526c5be
Author: Gert Wollny 
Date:   Fri Jun 30 08:37:36 2017 +0200

mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

This patch adds a set of unit tests for the new lifetime tracker.

Reviewed-by: Nicolai Hähnle 

:100644 100644 fb6037eedc3e2c3919e77532b45b114c151b1cac
d0d4c0dfd1dec4e2710448182c6576e16400ffe6 M  configure.ac
:04 04 d302f957741f1bf87f1deb49599f26c98418e17f
d0219311e0b3c83109bff579fc951559d6769b97 M  src
bisect run success



[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #9 from Dieter Nützel  ---
(In reply to Thomas Hellström from comment #4)
> Created attachment 133985 [details] [review]
> Patch that should fix the issue
> 
> This patch fixes the issue on my side. Please verify!

Tested-by: Dieter Nützel 

on radeonsi/RX580

Thank you Thomas.



Re: [Mesa-dev] [PATCH] gallium: build ddebug, noop, rbug, trace as part of auxiliary

2017-09-06 Thread Marek Olšák
On Wed, Sep 6, 2017 at 2:08 PM, Emil Velikov  wrote:
> On 6 September 2017 at 12:11, Marek Olšák  wrote:
>> On Wed, Sep 6, 2017 at 1:02 PM, Marek Olšák  wrote:
>>> On Wed, Sep 6, 2017 at 12:38 PM, Emil Velikov  
>>> wrote:
>>>> On 4 September 2017 at 21:36, Marek Olšák  wrote:
>>>>> From: Marek Olšák
>>>>>
>>>>> Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
>>>>> (gallium build time is reduced by 15% when building only radeonsi)
>>>>>
>>>> Some of this can be attributed to a couple libraries less to link.
>>>> Speaking of which, did you switch to the gold linker, it should
>>>> utilise the multiple cores/threads nicely.
>>>
>>> Sadly no.
>>
>> How do I switch to the gold linker?
>>
> There's multiple ways:
>  - manually (or via binutils-config --linker ld.gold) set the default 
> /usr/bin/ld
> By default Arch has _hardlink_ to ld.bfd
>  - export LD=ld.gold // haven't tried it
>  - Add -fuse-ld=gold to the LDFLAGS (you may need a libtool patch [1])
>
> HTH
> Emil
>
> [1] 
> http://git.savannah.gnu.org/cgit/libtool.git/commit/?id=f9970d99293faf908fdc153a653fa5781095fb7a

Thanks. I used LDFLAGS and it does improve build times, but the linking
still seems mostly single-threaded.

Marek


Re: [Mesa-dev] [PATCH 2/2] radv: do not use a bitfield when dirtying the vertex buffers

2017-09-06 Thread Gustaw Smolarczyk
2017-09-06 15:53 GMT+02:00 Samuel Pitoiset :
> Useless to track which one has been updated because we
> re-upload all the vertex buffers in one shot.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 5 +++--
>  src/amd/vulkan/radv_private.h| 2 +-
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index ed2984eb5a..cc9a758d36 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1629,7 +1629,7 @@ radv_cmd_buffer_update_vertex_descriptors(struct 
> radv_cmd_buffer *cmd_buffer)
> radv_emit_userdata_address(cmd_buffer, 
> cmd_buffer->state.pipeline, MESA_SHADER_VERTEX,
>AC_UD_VS_VERTEX_BUFFERS, va);
> }
> -   cmd_buffer->state.vb_dirty = 0;
> +   cmd_buffer->state.vb_dirty = false;
>  }
>
>  static void
> @@ -2049,8 +2049,9 @@ void radv_CmdBindVertexBuffers(
> for (uint32_t i = 0; i < bindingCount; i++) {
> vb[firstBinding + i].buffer = 
> radv_buffer_from_handle(pBuffers[i]);
> vb[firstBinding + i].offset = pOffsets[i];
> -   cmd_buffer->state.vb_dirty |= 1 << (firstBinding + i);
> }
> +
> +   cmd_buffer->state.vb_dirty = true;
>  }
>
>  void radv_CmdBindIndexBuffer(
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
> index 65ec712707..7c5dac3240 100644
> --- a/src/amd/vulkan/radv_private.h
> +++ b/src/amd/vulkan/radv_private.h
> @@ -757,7 +757,7 @@ struct radv_attachment_state {
>  };
>
>  struct radv_cmd_state {
> -   uint32_t  vb_dirty;
> +   bool  vb_dirty;

Maybe move it below the dirty field in order to save 8 bytes? You
might also move the predicating field in order to save another 8.

That might be a candidate for a different commit though...

Gustaw
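
The size argument above comes down to alignment padding. Below is a minimal sketch, assuming a typical LP64 ABI (8-byte pointers, 1-byte bool). The two structs are hypothetical stand-ins and do not reproduce the real radv_cmd_state layout; they only show how a lone bool placed ahead of a pointer costs a full alignment slot, while the same bool placed after a 4-byte field fits into what would otherwise be tail padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real radv_cmd_state fields. */
struct state_before {
	bool      vb_dirty;   /* 1 byte, then padding up to the pointer's alignment */
	void     *pipeline;
	uint32_t  dirty;      /* followed by tail padding on LP64 */
};

struct state_after {
	void     *pipeline;
	uint32_t  dirty;
	bool      vb_dirty;   /* reuses what used to be tail padding */
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t bytes_saved(void)
{
	return sizeof(struct state_before) - sizeof(struct state_after);
}
```

On a 32-bit ABI the two layouts may well come out the same size, so the saving is ABI-dependent; a tool such as pahole can confirm the actual holes in the real struct.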


Re: [Mesa-dev] [PATCH 1/5] intel: Remove unused Kabylake pci ids

2017-09-06 Thread Anuj Phogat
On Tue, Sep 5, 2017 at 5:13 PM, Matt Turner  wrote:
> The series is
>
> Reviewed-by: Matt Turner 
>
> I think it should be tagged for the stable branch as well. Does anyone
> else have an opinion?
>
> I tested a KBL-R system (the 0x5917 PCI ID) with it set as a GT1.5 and
> a GT2 and in both cases it passed piglit.
>
> Are you planning to send patches for the kernel and libdrm? If not, I
> can handle that.
I'll send out the patches. Thanks for the review.


[Mesa-dev] [PATCH 8/9] radeonsi: don't read tcs_out_lds_layout.vertex_size from an SGPR

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

TCS outputs are usually not written to LDS, so no stats here.
---
 src/gallium/drivers/radeonsi/si_shader.c  | 21 +++--
 src/gallium/drivers/radeonsi/si_shader_internal.h |  2 --
 src/gallium/drivers/radeonsi/si_state_draw.c  |  3 +--
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index d622304..1a9fce9 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -319,20 +319,37 @@ get_tcs_in_patch_stride(struct si_shader_context *ctx)
 {
return unpack_param(ctx, ctx->param_vs_state_bits, 8, 13);
 }
 
 static LLVMValueRef
 get_tcs_out_patch_stride(struct si_shader_context *ctx)
 {
return unpack_param(ctx, ctx->param_tcs_out_lds_layout, 0, 13);
 }
 
+static unsigned get_tcs_out_vertex_dw_stride_constant(struct si_shader_context 
*ctx)
+{
+   assert(ctx->type == PIPE_SHADER_TESS_CTRL);
+
+   if (ctx->shader->key.mono.u.ff_tcs_inputs_to_copy)
+   return 
util_last_bit64(ctx->shader->key.mono.u.ff_tcs_inputs_to_copy) * 4;
+
+   return util_last_bit64(ctx->shader->selector->outputs_written) * 4;
+}
+
+static LLVMValueRef get_tcs_out_vertex_dw_stride(struct si_shader_context *ctx)
+{
+   unsigned stride = get_tcs_out_vertex_dw_stride_constant(ctx);
+
+   return LLVMConstInt(ctx->i32, stride, 0);
+}
+
 static LLVMValueRef
 get_tcs_out_patch0_offset(struct si_shader_context *ctx)
 {
return lp_build_mul_imm(&ctx->bld_base.uint_bld,
unpack_param(ctx,
 ctx->param_tcs_out_lds_offsets,
 0, 16),
4);
 }
 
@@ -1079,21 +1096,21 @@ static LLVMValueRef fetch_input_tcs(
 
 static LLVMValueRef fetch_output_tcs(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef dw_addr, stride;
 
if (reg->Register.Dimension) {
-   stride = unpack_param(ctx, ctx->param_tcs_out_lds_layout, 13, 
8);
+   stride = get_tcs_out_vertex_dw_stride(ctx);
dw_addr = get_tcs_out_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, stride, dw_addr);
} else {
dw_addr = get_tcs_out_current_patch_data_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, NULL, dw_addr);
}
 
return lds_load(bld_base, type, swizzle, dw_addr);
 }
 
@@ -1132,21 +1149,21 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
/* Only handle per-patch and per-vertex outputs here.
 * Vectors will be lowered to scalars and this function will be called 
again.
 */
if (reg->Register.File != TGSI_FILE_OUTPUT ||
(dst[0] && LLVMGetTypeKind(LLVMTypeOf(dst[0])) == 
LLVMVectorTypeKind)) {
si_llvm_emit_store(bld_base, inst, info, dst);
return;
}
 
if (reg->Register.Dimension) {
-   stride = unpack_param(ctx, ctx->param_tcs_out_lds_layout, 13, 
8);
+   stride = get_tcs_out_vertex_dw_stride(ctx);
dw_addr = get_tcs_out_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, reg, NULL, stride, dw_addr);
skip_lds_store = !sh_info->reads_pervertex_outputs;
} else {
dw_addr = get_tcs_out_current_patch_data_offset(ctx);
dw_addr = get_dw_address(ctx, reg, NULL, NULL, dw_addr);
skip_lds_store = !sh_info->reads_perpatch_outputs;
 
if (!reg->Register.Indirect) {
int name = 
sh_info->output_semantic_name[reg->Register.Index];
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
b/src/gallium/drivers/radeonsi/si_shader_internal.h
index 4ae8d85..023f9a6 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -154,22 +154,20 @@ struct si_shader_context {
/* API TCS */
/* Offsets where TCS outputs and TCS patch outputs live in LDS:
 *   [0:15] = TCS output patch0 offset / 16, max = NUM_PATCHES * 32 * 32
 *   [16:31] = TCS output patch0 offset for per-patch / 16
 * max = (NUM_PATCHES + 1) * 32*32
 */
int param_tcs_out_lds_offsets;
/* Layout of TCS outputs / TES inputs:
 *   [0:12] = stride between output patches in DW, num_outputs * 
num_vertices * 4
 *max = 32*32*4 + 32*4
-*   [13:20] = stride between output vertices in DW = num_inputs * 4
-* max = 32*4
 *   [26:31] = 

[Mesa-dev] [PATCH 9/9] radeonsi: don't read tcs_out_lds_layout.patch_stride from an SGPR

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

Same as before, writing TCS outputs to LDS is rare.
---
 src/gallium/drivers/radeonsi/si_shader.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 1a9fce9..f134cf8 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -313,43 +313,51 @@ static LLVMValueRef get_rel_patch_id(struct 
si_shader_context *ctx)
  *
  * All three shaders VS(LS), TCS, TES share the same LDS space.
  */
 
 static LLVMValueRef
 get_tcs_in_patch_stride(struct si_shader_context *ctx)
 {
return unpack_param(ctx, ctx->param_vs_state_bits, 8, 13);
 }
 
-static LLVMValueRef
-get_tcs_out_patch_stride(struct si_shader_context *ctx)
-{
-   return unpack_param(ctx, ctx->param_tcs_out_lds_layout, 0, 13);
-}
-
 static unsigned get_tcs_out_vertex_dw_stride_constant(struct si_shader_context 
*ctx)
 {
assert(ctx->type == PIPE_SHADER_TESS_CTRL);
 
if (ctx->shader->key.mono.u.ff_tcs_inputs_to_copy)
return 
util_last_bit64(ctx->shader->key.mono.u.ff_tcs_inputs_to_copy) * 4;
 
return util_last_bit64(ctx->shader->selector->outputs_written) * 4;
 }
 
 static LLVMValueRef get_tcs_out_vertex_dw_stride(struct si_shader_context *ctx)
 {
unsigned stride = get_tcs_out_vertex_dw_stride_constant(ctx);
 
return LLVMConstInt(ctx->i32, stride, 0);
 }
 
+static LLVMValueRef get_tcs_out_patch_stride(struct si_shader_context *ctx)
+{
+   if (ctx->shader->key.mono.u.ff_tcs_inputs_to_copy)
+   return unpack_param(ctx, ctx->param_tcs_out_lds_layout, 0, 13);
+
+   const struct tgsi_shader_info *info = &ctx->shader->selector->info;
+   unsigned tcs_out_vertices = 
info->properties[TGSI_PROPERTY_TCS_VERTICES_OUT];
+   unsigned vertex_dw_stride = get_tcs_out_vertex_dw_stride_constant(ctx);
+   unsigned num_patch_outputs = 
util_last_bit64(ctx->shader->selector->patch_outputs_written);
+   unsigned patch_dw_stride = tcs_out_vertices * vertex_dw_stride +
+  num_patch_outputs * 4;
+   return LLVMConstInt(ctx->i32, patch_dw_stride, 0);
+}
+
 static LLVMValueRef
 get_tcs_out_patch0_offset(struct si_shader_context *ctx)
 {
return lp_build_mul_imm(&ctx->bld_base.uint_bld,
unpack_param(ctx,
 ctx->param_tcs_out_lds_offsets,
 0, 16),
4);
 }
 
-- 
2.7.4
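
For readers checking the arithmetic in patches 8/9 and 9/9, here is a small CPU-side sketch of the two compile-time strides. It assumes util_last_bit64() returns the 1-based index of the highest set bit (0 for an empty mask); last_bit64() below is a stand-in written for this example, and the function and parameter names are illustrative rather than the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for util_last_bit64(): 1-based index of the highest set bit,
 * or 0 when no bit is set. */
static unsigned last_bit64(uint64_t mask)
{
	unsigned n = 0;

	while (mask) {
		n++;
		mask >>= 1;
	}
	return n;
}

/* Stride between output vertices in dwords: one vec4 (4 dwords) per
 * output slot, up to the highest slot that is written. */
static unsigned vertex_dw_stride(uint64_t outputs_written)
{
	return last_bit64(outputs_written) * 4;
}

/* Stride between output patches in dwords: all per-vertex outputs of
 * the patch followed by the per-patch outputs. */
static unsigned patch_dw_stride(unsigned tcs_out_vertices,
				uint64_t outputs_written,
				uint64_t patch_outputs_written)
{
	return tcs_out_vertices * vertex_dw_stride(outputs_written) +
	       last_bit64(patch_outputs_written) * 4;
}
```

With 3 output vertices, per-vertex outputs in slots 0..2 and per-patch outputs in slots 0..1, this gives 3 * 12 + 8 = 44 dwords per patch. Since every input is known when the TCS is compiled, the value can be emitted as an LLVM constant instead of being unpacked from an SGPR.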



[Mesa-dev] [PATCH 7/9] radeonsi/gfx9: don't read LS out vertex stride from an SGPR in monolithic HS

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

-44 bytes in a monolithic LS-HS binary.
---
 src/gallium/drivers/radeonsi/si_shader.c| 5 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 7 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 7c3bd8b..d622304 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -401,20 +401,25 @@ static LLVMValueRef get_num_tcs_out_vertices(struct 
si_shader_context *ctx)
 static LLVMValueRef get_tcs_in_vertex_dw_stride(struct si_shader_context *ctx)
 {
unsigned stride;
 
switch (ctx->type) {
case PIPE_SHADER_VERTEX:
stride = 
util_last_bit64(ctx->shader->selector->outputs_written);
return LLVMConstInt(ctx->i32, stride * 4, 0);
 
case PIPE_SHADER_TESS_CTRL:
+   if (ctx->screen->b.chip_class >= GFX9 &&
+   ctx->shader->is_monolithic) {
+   stride = 
util_last_bit64(ctx->shader->key.part.tcs.ls->outputs_written);
+   return LLVMConstInt(ctx->i32, stride * 4, 0);
+   }
return unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
 
default:
assert(0);
return NULL;
}
 }
 
 static LLVMValueRef get_instance_index_for_fetch(
struct si_shader_context *ctx,
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 25fcead..fe25598 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1284,21 +1284,26 @@ static inline void si_shader_selector_key(struct 
pipe_context *ctx,
  key, 
&key->part.tcs.ls_prolog);
key->part.tcs.ls = sctx->vs_shader.cso;
 
/* When the LS VGPR fix is needed, monolithic shaders
 * can:
 *  - avoid initializing EXEC in both the LS prolog
 *and the LS main part when !vs_needs_prolog
 *  - remove the fixup for unused input VGPRs
 */
key->part.tcs.ls_prolog.ls_vgpr_fix = sctx->ls_vgpr_fix;
-   key->opt.prefer_mono = sctx->ls_vgpr_fix;
+
+   /* The LS output / HS input layout can be communicated
+* directly instead of via user SGPRs for merged LS-HS.
+* The LS VGPR fix prefers this too.
+*/
+   key->opt.prefer_mono = 1;
}
 
key->part.tcs.epilog.prim_mode =

sctx->tes_shader.cso->info.properties[TGSI_PROPERTY_TES_PRIM_MODE];
key->part.tcs.epilog.invoc0_tess_factors_are_def =
sel->tcs_info.invoc0_tessfactors_are_def;
key->part.tcs.epilog.tes_reads_tess_factors =
sctx->tes_shader.cso->info.reads_tess_factors;
 
if (sel == sctx->fixed_func_tcs_shader.cso)
-- 
2.7.4



[Mesa-dev] [PATCH 2/9] radeonsi: remove 2 callbacks from si_shader_context

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c| 13 +
 src/gallium/drivers/radeonsi/si_shader_internal.h   | 13 ++---
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c |  4 ++--
 3 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index db8297d..861d82f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1486,23 +1486,23 @@ static LLVMValueRef load_sample_position(struct 
si_shader_context *ctx, LLVMValu
LLVMValueRef pos[4] = {
buffer_load_const(ctx, resource, offset0),
buffer_load_const(ctx, resource, offset1),
LLVMConstReal(ctx->f32, 0),
LLVMConstReal(ctx->f32, 0)
};
 
return lp_build_gather_values(gallivm, pos, 4);
 }
 
-static void declare_system_value(struct si_shader_context *ctx,
-unsigned index,
-const struct tgsi_full_declaration *decl)
+void si_load_system_value(struct si_shader_context *ctx,
+ unsigned index,
+ const struct tgsi_full_declaration *decl)
 {
struct lp_build_context *bld = &ctx->bld_base.base;
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMValueRef value = 0;
 
assert(index < RADEON_LLVM_MAX_SYSTEM_VALUES);
 
switch (decl->Semantic.Name) {
case TGSI_SEMANTIC_INSTANCEID:
value = ctx->abi.instance_id;
@@ -1763,22 +1763,22 @@ static void declare_system_value(struct 
si_shader_context *ctx,
}
 
default:
assert(!"unknown system value");
return;
}
 
ctx->system_values[index] = value;
 }
 
-static void declare_compute_memory(struct si_shader_context *ctx,
-   const struct tgsi_full_declaration *decl)
+void si_declare_compute_memory(struct si_shader_context *ctx,
+  const struct tgsi_full_declaration *decl)
 {
struct si_shader_selector *sel = ctx->shader->selector;
struct gallivm_state *gallivm = &ctx->gallivm;
 
LLVMTypeRef i8p = LLVMPointerType(ctx->i8, LOCAL_ADDR_SPACE);
LLVMValueRef var;
 
assert(decl->Declaration.MemType == TGSI_MEMORY_TYPE_SHARED);
assert(decl->Range.First == decl->Range.Last);
assert(!ctx->shared_memory);
@@ -5684,21 +5684,20 @@ static bool si_compile_tgsi_main(struct 
si_shader_context *ctx,
case PIPE_SHADER_GEOMETRY:
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;
break;
case PIPE_SHADER_FRAGMENT:
ctx->load_input = declare_input_fs;
ctx->abi.emit_outputs = si_llvm_return_fs_outputs;
bld_base->emit_epilogue = si_tgsi_emit_epilogue;
break;
case PIPE_SHADER_COMPUTE:
-   ctx->declare_memory_region = declare_compute_memory;
break;
default:
assert(!"Unsupported shader type");
return false;
}
 
ctx->abi.load_ubo = load_ubo;
ctx->abi.load_ssbo = load_ssbo;
 
create_function(ctx);
@@ -6338,22 +6337,20 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
 
si_init_shader_ctx(&ctx, sscreen, tm);
si_llvm_context_set_tgsi(&ctx, shader);
ctx.separate_prolog = !is_monolithic;
 
memset(shader->info.vs_output_param_offset, AC_EXP_PARAM_UNDEFINED,
   sizeof(shader->info.vs_output_param_offset));
 
shader->info.uses_instanceid = sel->info.uses_instanceid;
 
-   ctx.load_system_value = declare_system_value;
-
if (!si_compile_tgsi_main(&ctx, is_monolithic)) {
si_llvm_dispose(&ctx);
return -1;
}
 
if (is_monolithic && ctx.type == PIPE_SHADER_VERTEX) {
LLVMValueRef parts[2];
bool need_prolog = sel->vs_needs_prolog;
 
parts[1] = ctx.main_fn;
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
b/src/gallium/drivers/radeonsi/si_shader_internal.h
index f304295..1231ef4 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -71,27 +71,20 @@ struct si_shader_context {
struct ac_shader_abi abi;
 
/** This function is responsible for initilizing the inputs array and 
will be
  * called once for each input declared in the TGSI shader.
  */
void (*load_input)(struct si_shader_context *,
   unsigned input_index,
   const struct tgsi_full_declaration *decl,
   LLVMValueRef out[4]);
 
-   void (*load_system_value)(struct 

[Mesa-dev] [PATCH 4/9] radeonsi: don't always apply the PrimID instancing bug workaround on SI

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

It looks like commit 391673af7ad1565a5f6ac8fc2f8c9fcdd1fe9908 that should
have fixed the perf regression didn't really change much if anything.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 051dfea..363a4ae 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -206,21 +206,21 @@ static void si_emit_derived_tess_state(struct si_context 
*sctx,
/* The VGT HS block increments the patch ID unconditionally
 * within a single threadgroup. This results in incorrect
 * patch IDs when instanced draws are used.
 *
 * The intended solution is to restrict threadgroups to
 * a single instance by setting SWITCH_ON_EOI, which
 * should cause IA to split instances up. However, this
 * doesn't work correctly on SI when there is no other
 * SE to switch to.
 */
-   if (has_primid_instancing_bug)
+   if (has_primid_instancing_bug && tess_uses_primid)
*num_patches = 1;
 
sctx->last_num_patches = *num_patches;
 
output_patch0_offset = input_patch_size * *num_patches;
perpatch_output_offset = output_patch0_offset + 
pervertex_output_patch_size;
 
/* Compute userdata SGPRs. */
assert(((input_vertex_size / 4) & ~0xff) == 0);
assert(((output_vertex_size / 4) & ~0xff) == 0);
-- 
2.7.4



[Mesa-dev] [PATCH 6/9] radeonsi: don't read the LS output vertex stride from an SGPR in LS

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

Now it's able to generate ds_write2_b64 instead of ds_write2_b32.

-20 bytes in one shader binary. (having only 1 output)
---
 src/gallium/drivers/radeonsi/si_shader.c | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 32a6fa0..7c3bd8b 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -391,20 +391,38 @@ static LLVMValueRef get_num_tcs_out_vertices(struct 
si_shader_context *ctx)
ctx->shader->selector ?

ctx->shader->selector->info.properties[TGSI_PROPERTY_TCS_VERTICES_OUT] : 0;
 
/* If !tcs_out_vertices, it's either the fixed-func TCS or the TCS 
epilog. */
if (ctx->type == PIPE_SHADER_TESS_CTRL && tcs_out_vertices)
return LLVMConstInt(ctx->i32, tcs_out_vertices, 0);
 
return unpack_param(ctx, ctx->param_tcs_offchip_layout, 6, 6);
 }
 
+static LLVMValueRef get_tcs_in_vertex_dw_stride(struct si_shader_context *ctx)
+{
+   unsigned stride;
+
+   switch (ctx->type) {
+   case PIPE_SHADER_VERTEX:
+   stride = 
util_last_bit64(ctx->shader->selector->outputs_written);
+   return LLVMConstInt(ctx->i32, stride * 4, 0);
+
+   case PIPE_SHADER_TESS_CTRL:
+   return unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
+
+   default:
+   assert(0);
+   return NULL;
+   }
+}
+
 static LLVMValueRef get_instance_index_for_fetch(
struct si_shader_context *ctx,
unsigned param_start_instance, LLVMValueRef divisor)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
 
LLVMValueRef result = ctx->abi.instance_id;
 
/* The division must be done before START_INSTANCE is added. */
if (divisor != ctx->i32_1)
@@ -1040,21 +1058,21 @@ static LLVMValueRef desc_from_addr_base64k(struct 
si_shader_context *ctx,
 }
 
 static LLVMValueRef fetch_input_tcs(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef dw_addr, stride;
 
-   stride = unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
+   stride = get_tcs_in_vertex_dw_stride(ctx);
dw_addr = get_tcs_in_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, stride, dw_addr);
 
return lds_load(bld_base, type, swizzle, dw_addr);
 }
 
 static LLVMValueRef fetch_output_tcs(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
@@ -2603,21 +2621,21 @@ static void si_copy_tcs_inputs(struct 
lp_build_tgsi_context *bld_base)
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMValueRef invocation_id, buffer, buffer_offset;
LLVMValueRef lds_vertex_stride, lds_vertex_offset, lds_base;
uint64_t inputs;
 
invocation_id = unpack_param(ctx, ctx->param_tcs_rel_ids, 8, 5);
buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
buffer_offset = LLVMGetParam(ctx->main_fn, 
ctx->param_tcs_offchip_offset);
 
-   lds_vertex_stride = unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
+   lds_vertex_stride = get_tcs_in_vertex_dw_stride(ctx);
lds_vertex_offset = LLVMBuildMul(gallivm->builder, invocation_id,
 lds_vertex_stride, "");
lds_base = get_tcs_in_current_patch_offset(ctx);
lds_base = LLVMBuildAdd(gallivm->builder, lds_base, lds_vertex_offset, 
"");
 
inputs = ctx->shader->key.mono.u.ff_tcs_inputs_to_copy;
while (inputs) {
unsigned i = u_bit_scan64(&inputs);
 
LLVMValueRef lds_ptr = LLVMBuildAdd(gallivm->builder, lds_base,
@@ -3014,22 +3032,21 @@ static void si_set_es_return_value_for_gs(struct 
si_shader_context *ctx)
 
 static void si_llvm_emit_ls_epilogue(struct lp_build_tgsi_context *bld_base)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct si_shader *shader = ctx->shader;
struct tgsi_shader_info *info = &shader->selector->info;
struct gallivm_state *gallivm = &ctx->gallivm;
unsigned i, chan;
LLVMValueRef vertex_id = LLVMGetParam(ctx->main_fn,
  ctx->param_rel_auto_id);
-   LLVMValueRef vertex_dw_stride =
-   unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
+   LLVMValueRef vertex_dw_stride = get_tcs_in_vertex_dw_stride(ctx);
LLVMValueRef base_dw_addr = LLVMBuildMul(gallivm->builder, vertex_id,
 

[Mesa-dev] [PATCH 5/9] radeonsi: don't read the number of TCS out vertices from an SGPR in TCS

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

-16 bytes in one shader binary.
---
 src/gallium/drivers/radeonsi/si_shader.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index de58737..32a6fa0 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -378,20 +378,33 @@ get_tcs_out_current_patch_data_offset(struct 
si_shader_context *ctx)
get_tcs_out_patch0_patch_data_offset(ctx);
LLVMValueRef patch_stride = get_tcs_out_patch_stride(ctx);
LLVMValueRef rel_patch_id = get_rel_patch_id(ctx);
 
return LLVMBuildAdd(gallivm->builder, patch0_patch_data_offset,
LLVMBuildMul(gallivm->builder, patch_stride,
 rel_patch_id, ""),
"");
 }
 
+static LLVMValueRef get_num_tcs_out_vertices(struct si_shader_context *ctx)
+{
+   unsigned tcs_out_vertices =
+   ctx->shader->selector ?
+   
ctx->shader->selector->info.properties[TGSI_PROPERTY_TCS_VERTICES_OUT] : 0;
+
+   /* If !tcs_out_vertices, it's either the fixed-func TCS or the TCS 
epilog. */
+   if (ctx->type == PIPE_SHADER_TESS_CTRL && tcs_out_vertices)
+   return LLVMConstInt(ctx->i32, tcs_out_vertices, 0);
+
+   return unpack_param(ctx, ctx->param_tcs_offchip_layout, 6, 6);
+}
+
 static LLVMValueRef get_instance_index_for_fetch(
struct si_shader_context *ctx,
unsigned param_start_instance, LLVMValueRef divisor)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
 
LLVMValueRef result = ctx->abi.instance_id;
 
/* The division must be done before START_INSTANCE is added. */
if (divisor != ctx->i32_1)
@@ -797,21 +810,21 @@ static LLVMValueRef get_dw_address(struct 
si_shader_context *ctx,
  */
 static LLVMValueRef get_tcs_tes_buffer_address(struct si_shader_context *ctx,
   LLVMValueRef rel_patch_id,
LLVMValueRef vertex_index,
LLVMValueRef param_index)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMValueRef base_addr, vertices_per_patch, num_patches, total_vertices;
LLVMValueRef param_stride, constant16;
 
-   vertices_per_patch = unpack_param(ctx, ctx->param_tcs_offchip_layout, 
6, 6);
+   vertices_per_patch = get_num_tcs_out_vertices(ctx);
num_patches = unpack_param(ctx, ctx->param_tcs_offchip_layout, 0, 6);
total_vertices = LLVMBuildMul(gallivm->builder, vertices_per_patch,
  num_patches, "");
 
constant16 = LLVMConstInt(ctx->i32, 16, 0);
if (vertex_index) {
base_addr = LLVMBuildMul(gallivm->builder, rel_patch_id,
 vertices_per_patch, "");
 
base_addr = LLVMBuildAdd(gallivm->builder, base_addr,
@@ -1630,21 +1643,21 @@ void si_load_system_value(struct si_shader_context *ctx,
lp_build_add(bld, coord[0], 
coord[1]));
 
value = lp_build_gather_values(gallivm, coord, 4);
break;
}
 
case TGSI_SEMANTIC_VERTICESIN:
if (ctx->type == PIPE_SHADER_TESS_CTRL)
value = unpack_param(ctx, 
ctx->param_tcs_out_lds_layout, 26, 6);
else if (ctx->type == PIPE_SHADER_TESS_EVAL)
-   value = unpack_param(ctx, 
ctx->param_tcs_offchip_layout, 6, 6);
+   value = get_num_tcs_out_vertices(ctx);
else
assert(!"invalid shader stage for 
TGSI_SEMANTIC_VERTICESIN");
break;
 
case TGSI_SEMANTIC_TESSINNER:
case TGSI_SEMANTIC_TESSOUTER:
{
LLVMValueRef buffer, base, addr;
int param = 
si_shader_io_get_unique_index_patch(decl->Semantic.Name, 0);
 
-- 
2.7.4
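
The unpack_param(ctx, param, rshift, bitwidth) calls that this series replaces read a bitfield out of a 32-bit user SGPR. Below is a CPU-side model of that extraction, a sketch only: the real helper emits LLVM IR, and unpack_bits() and pack_offchip_layout() are hypothetical names for this example. The field positions ([0:5] number of patches, [6:11] vertices per patch) follow the calls visible in the diff; whether the driver stores raw or biased values there is not shown in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `bitwidth` bits starting at bit `rshift`.
 * Assumes rshift + bitwidth <= 32. */
static uint32_t unpack_bits(uint32_t value, unsigned rshift, unsigned bitwidth)
{
	uint32_t shifted = value >> rshift;

	if (bitwidth >= 32)
		return shifted;
	return shifted & ((1u << bitwidth) - 1);
}

/* Hypothetical packing that mirrors the two fields read in the diff. */
static uint32_t pack_offchip_layout(unsigned num_patches,
				    unsigned vertices_per_patch)
{
	return (num_patches & 0x3f) | ((vertices_per_patch & 0x3f) << 6);
}
```

Each such unpack costs a shift and a mask in the shader, and the compiler cannot fold anything that depends on the result. Replacing it with a constant is where the few saved bytes per shader quoted in the commit messages come from.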



[Mesa-dev] [PATCH 0/9] RadeonSI: Tessellation shader micro-optimizations

2017-09-06 Thread Marek Olšák
Hi,

This series seemed like a good idea and results in eliminations of
shader instructions. However I haven't been able to find an app where
it has a measurable effect.

Please review.

Marek


[Mesa-dev] [PATCH 3/9] radeonsi: optimize TCS epilog when invocation 0 writes tess factors

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

This removes the barrier and LDS stores and loads for tess factors
when it's possible. The removal of the barrier seems more important
to me though.

In one shader, it removes 17 * 4 bytes from the shader binary.
---
 src/gallium/drivers/radeonsi/si_shader.c  | 111 --
 src/gallium/drivers/radeonsi/si_shader.h  |   2 +
 src/gallium/drivers/radeonsi/si_shader_internal.h |   1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c   |   3 +
 4 files changed, 89 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 861d82f..de58737 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1084,21 +1084,21 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = &ctx->gallivm;
const struct tgsi_full_dst_register *reg = &inst->Dst[0];
const struct tgsi_shader_info *sh_info = &ctx->shader->selector->info;
unsigned chan_index;
LLVMValueRef dw_addr, stride;
LLVMValueRef buffer, base, buf_addr;
LLVMValueRef values[4];
bool skip_lds_store;
-   bool is_tess_factor = false;
+   bool is_tess_factor = false, is_tess_inner = false;
 
/* Only handle per-patch and per-vertex outputs here.
 * Vectors will be lowered to scalars and this function will be called 
again.
 */
if (reg->Register.File != TGSI_FILE_OUTPUT ||
(dst[0] && LLVMGetTypeKind(LLVMTypeOf(dst[0])) == 
LLVMVectorTypeKind)) {
si_llvm_emit_store(bld_base, inst, info, dst);
return;
}
 
@@ -1111,22 +1111,25 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
dw_addr = get_tcs_out_current_patch_data_offset(ctx);
dw_addr = get_dw_address(ctx, reg, NULL, NULL, dw_addr);
skip_lds_store = !sh_info->reads_perpatch_outputs;
 
if (!reg->Register.Indirect) {
int name = 
sh_info->output_semantic_name[reg->Register.Index];
 
/* Always write tess factors into LDS for the TCS 
epilog. */
if (name == TGSI_SEMANTIC_TESSINNER ||
name == TGSI_SEMANTIC_TESSOUTER) {
-   skip_lds_store = false;
+   /* The epilog doesn't read LDS if invocation 0 
defines tess factors. */
+   skip_lds_store = 
!sh_info->reads_tessfactor_outputs &&
+
ctx->shader->selector->tcs_info.invoc0_tessfactors_are_def;
is_tess_factor = true;
+   is_tess_inner = name == TGSI_SEMANTIC_TESSINNER;
}
}
}
 
buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
 
base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
buf_addr = get_tcs_tes_buffer_address_from_reg(ctx, reg, NULL);
 
 
@@ -1141,20 +1144,32 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
lds_store(bld_base, chan_index, dw_addr, value);
 
value = LLVMBuildBitCast(gallivm->builder, value, ctx->i32, "");
values[chan_index] = value;
 
if (inst->Dst[0].Register.WriteMask != 0xF && !is_tess_factor) {
ac_build_buffer_store_dword(&ctx->ac, buffer, value, 1,
buf_addr, base,
4 * chan_index, 1, 0, true, 
false);
}
+
+   /* Write tess factors into VGPRs for the epilog. */
+   if (is_tess_factor &&
+   ctx->shader->selector->tcs_info.invoc0_tessfactors_are_def) 
{
+   if (!is_tess_inner) {
+   LLVMBuildStore(gallivm->builder, value, /* 
outer */
+  
ctx->invoc0_tess_factors[chan_index]);
+   } else if (chan_index < 2) {
+   LLVMBuildStore(gallivm->builder, value, /* 
inner */
+  ctx->invoc0_tess_factors[4 + 
chan_index]);
+   }
+   }
}
 
if (inst->Dst[0].Register.WriteMask == 0xF && !is_tess_factor) {
LLVMValueRef value = lp_build_gather_values(gallivm,
values, 4);
ac_build_buffer_store_dword(&ctx->ac, buffer, value, 4, 
buf_addr,
base, 0, 1, 0, true, false);
}
 }
 
@@ -2605,32 +2620,36 @@ static void 

[Mesa-dev] [PATCH 1/9] tgsi/scan: add a new pass that analyzes tess factor writes

2017-09-06 Thread Marek Olšák
From: Marek Olšák 

The pass tries to deduce whether tess factors are always written by
invocation 0 (at least).

The implication for radeonsi is that it doesn't have to use a barrier
near the end of TCS, and doesn't have to use LDS for passing the tess
factors to the epilog.
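
As a reading aid for the writemask bookkeeping in this pass, here is a minimal sketch of the encoding used by get_inst_tessfactor_writemask() in the diff below: channels written to TESSINNER occupy bits 0..3 and channels written to TESSOUTER occupy bits 4..7, and masks from several instructions are OR-ed together. The function names in the sketch are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Map one destination register's 4-bit channel writemask into the
 * combined tess factor mask: inner in bits 0..3, outer in bits 4..7. */
static unsigned tessfactor_writemask(bool is_inner, unsigned reg_writemask)
{
	reg_writemask &= 0xf;
	return is_inner ? reg_writemask : reg_writemask << 4;
}

/* Accumulate the masks of two writes, as the pass does per block. */
static unsigned combine_writemasks(unsigned a, unsigned b)
{
	return a | b;
}
```

For example, a shader that writes TESSINNER.xy and TESSOUTER.xyzw ends up with the combined mask 0xf3.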
---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 188 +
 src/gallium/auxiliary/tgsi/tgsi_scan.h |  11 ++
 2 files changed, 199 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index db87ce3..612a8c6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -930,10 +930,198 @@ tgsi_scan_arrays(const struct tgsi_token *tokens,
   array->writemask |= dst->Register.WriteMask;
 }
  }
   }
}
 
tgsi_parse_free(&parse);
 
return;
 }
+
+static void
+check_no_subroutines(const struct tgsi_full_instruction *inst)
+{
+   switch (inst->Instruction.Opcode) {
+   case TGSI_OPCODE_BGNSUB:
+   case TGSI_OPCODE_ENDSUB:
+   case TGSI_OPCODE_CAL:
+  unreachable("subroutines unhandled");
+   }
+}
+
+static unsigned
+get_inst_tessfactor_writemask(const struct tgsi_shader_info *info,
+  const struct tgsi_full_instruction *inst)
+{
+   unsigned writemask = 0;
+
+   for (unsigned i = 0; i < inst->Instruction.NumDstRegs; i++) {
+  const struct tgsi_full_dst_register *dst = &inst->Dst[i];
+
+  if (dst->Register.File == TGSI_FILE_OUTPUT &&
+  !dst->Register.Indirect) {
+ unsigned name = info->output_semantic_name[dst->Register.Index];
+
+ if (name == TGSI_SEMANTIC_TESSINNER)
+writemask |= dst->Register.WriteMask;
+ else if (name == TGSI_SEMANTIC_TESSOUTER)
+writemask |= dst->Register.WriteMask << 4;
+  }
+   }
+   return writemask;
+}
+
+static unsigned
+get_block_tessfactor_writemask(const struct tgsi_shader_info *info,
+   struct tgsi_parse_context *parse,
+   unsigned end_opcode)
+{
+   struct tgsi_full_instruction *inst;
+   unsigned writemask = 0;
+
+   do {
+  tgsi_parse_token(parse);
+  assert(parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_INSTRUCTION);
+  inst = &parse->FullToken.FullInstruction;
+  check_no_subroutines(inst);
+
+  /* Recursively process nested blocks. */
+  switch (inst->Instruction.Opcode) {
+  case TGSI_OPCODE_IF:
+  case TGSI_OPCODE_UIF:
+ writemask |=
+get_block_tessfactor_writemask(info, parse, TGSI_OPCODE_ENDIF);
+ continue;
+
+  case TGSI_OPCODE_BGNLOOP:
+ writemask |=
+get_block_tessfactor_writemask(info, parse, TGSI_OPCODE_ENDLOOP);
+ continue;
+  }
+
+  writemask |= get_inst_tessfactor_writemask(info, inst);
+   } while (inst->Instruction.Opcode != end_opcode);
+
+   return writemask;
+}
+
+static void
+get_if_block_tessfactor_writemask(const struct tgsi_shader_info *info,
+  struct tgsi_parse_context *parse,
+  unsigned *upper_block_tf_writemask,
+  unsigned *cond_block_tf_writemask)
+{
+   struct tgsi_full_instruction *inst;
+   unsigned then_tessfactor_writemask = 0;
+   unsigned else_tessfactor_writemask = 0;
+   bool is_then = true;
+
+   do {
+  tgsi_parse_token(parse);
+  assert(parse->FullToken.Token.Type == TGSI_TOKEN_TYPE_INSTRUCTION);
+  inst = &parse->FullToken.FullInstruction;
+  check_no_subroutines(inst);
+
+  switch (inst->Instruction.Opcode) {
+  case TGSI_OPCODE_ELSE:
+ is_then = false;
+ continue;
+
+  /* Recursively process nested blocks. */
+  case TGSI_OPCODE_IF:
+  case TGSI_OPCODE_UIF:
+ get_if_block_tessfactor_writemask(info, parse,
+                                   is_then ? &then_tessfactor_writemask :
+                                             &else_tessfactor_writemask,
+                                   cond_block_tf_writemask);
+ continue;
+
+  case TGSI_OPCODE_BGNLOOP:
+ *cond_block_tf_writemask |=
+get_block_tessfactor_writemask(info, parse, TGSI_OPCODE_ENDLOOP);
+ continue;
+  }
+
+  /* Process an instruction in the current block. */
+  unsigned writemask = get_inst_tessfactor_writemask(info, inst);
+
+  if (writemask) {
+ if (is_then)
+then_tessfactor_writemask |= writemask;
+ else
+else_tessfactor_writemask |= writemask;
+  }
+   } while (inst->Instruction.Opcode != TGSI_OPCODE_ENDIF);
+
+   if (then_tessfactor_writemask || else_tessfactor_writemask) {
+  /* If both statements write the same tess factor channels,
+   * we can say that the upper block writes them too. */
+  *upper_block_tf_writemask |= then_tessfactor_writemask &
+   
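
The rule stated in the comment above (channels written by both branches of an if/else count as written by the enclosing block) can be sketched on its own. This is only an illustration under stated assumptions: merge_if_else_writemasks() and the TF_* macros are hypothetical names, the bit layout follows get_inst_tessfactor_writemask() (inner factors in bits 0-3, outer factors in bits 4-7), and treating "written on at least one path" as the conditional mask is an assumption of this sketch, not something the patch text shown here confirms.

```c
#include <assert.h>

/* Hypothetical mask layout mirroring get_inst_tessfactor_writemask():
 * bits 0-3 hold the TESSINNER channels, bits 4-7 the TESSOUTER channels. */
#define TF_INNER(mask) ((unsigned)(mask) & 0xfu)
#define TF_OUTER(mask) (((unsigned)(mask) & 0xfu) << 4)

/* Hypothetical helper, not part of the patch: fold the write masks of the
 * two branches of an if/else into the bookkeeping of the enclosing block. */
static void
merge_if_else_writemasks(unsigned then_mask, unsigned else_mask,
                         unsigned *unconditional, unsigned *conditional)
{
   /* Written on both paths: behaves like a write in the enclosing block. */
   *unconditional |= then_mask & else_mask;

   /* Written on at least one path: may or may not have happened
    * (assumption of this sketch). */
   *conditional |= then_mask | else_mask;
}
```

With a then-branch writing three outer channels plus one inner channel and an else-branch writing only the three outer channels, only the outer channels end up in the unconditional mask.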

Re: [Mesa-dev] [PATCH v6 3/6] mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

2017-09-06 Thread Ilia Mirkin
On Wed, Sep 6, 2017 at 12:29 PM, Emil Velikov  wrote:
> On 6 September 2017 at 17:17, Ilia Mirkin  wrote:
>> One might ask the question of why st/mesa is being built at all
>> without HAVE_GALLIUM...
>>
> The following snippet might answer your question (that or I'm having a
> dull moment).

Nope, I'm the dull one here. The issue isn't that mesa/state_tracker
(and thus tests) are being built, it's that state_tracker/tests is
being added unconditionally to the tests target higher up. All good.
Move along. Nothing to see here.

  -ilia


[Mesa-dev] [Bug 102565] u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102565

Bug ID: 102565
   Summary: u_debug_stack.c:114: undefined reference to
`_Ux86_64_getcontext'
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org

u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'

5c9af800cbce71f818b5d5e8ce9bfc5b56611360 (master 17.3.0-devel)

$ make check
[...]
  CXXLD    st-renumerate-test
../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In
function `debug_backtrace_capture':
mesa/src/gallium/auxiliary/util/u_debug_stack.c:114: undefined reference to
`_Ux86_64_getcontext'
mesa/src/gallium/auxiliary/util/u_debug_stack.c:115: undefined reference to
`_ULx86_64_init_local'
mesa/src/gallium/auxiliary/util/u_debug_stack.c:117: undefined reference to
`_ULx86_64_step'
mesa/src/gallium/auxiliary/util/u_debug_stack.c:120: undefined reference to
`_ULx86_64_step'
mesa/src/gallium/auxiliary/util/u_debug_stack.c:123: undefined reference to
`_ULx86_64_get_reg'
mesa/src/gallium/auxiliary/util/u_debug_stack.c:124: undefined reference to
`_ULx86_64_get_proc_info'

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #21 from Emil Velikov  ---
The attached patch from Alexandre seems to be a noop.

This series from Eric treats MESA_NO_ERROR as boolean.
https://patchwork.freedesktop.org/series/29888/

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v6 3/6] mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

2017-09-06 Thread Emil Velikov
On 6 September 2017 at 17:17, Ilia Mirkin  wrote:
> One might ask the question of why st/mesa is being built at all
> without HAVE_GALLIUM...
>
The following snippet might answer your question (that or I'm having a
dull moment).

src/mesa/Makefile.am:if HAVE_GALLIUM
src/mesa/Makefile.am-noinst_LTLIBRARIES += libmesagallium.la // aka st/mesa
src/mesa/Makefile.am-endif

-Emil


[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #20 from Alexandre Demers  ---
Created attachment 134015
  --> https://bugs.freedesktop.org/attachment.cgi?id=134015&action=edit
Enable KHR_NO_ERROR only if MESA_NO_ERROR's value is different from 0

From Michel Dänzer's comment, KHR_NO_ERROR is enabled whenever MESA_NO_ERROR is
set, with no regard to its actual value (which is not the case when set through
driconf).

This should take care of the problem, activating KHR_NO_ERROR only if
MESA_NO_ERROR is different from 0.

If this is fine, could someone commit it for me?
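
A minimal sketch of the difference between "the variable is set" and "the variable is set to a true value". The helper names flag_value_enabled() and env_flag_enabled() are hypothetical and are not claimed to be what the attached patch or Mesa itself uses; the set of strings treated as false is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret a raw option string as a boolean.
 * NULL (unset), "", "0" and "false" all mean "off" in this sketch. */
static bool
flag_value_enabled(const char *value)
{
   if (value == NULL || value[0] == '\0')
      return false;
   if (strcmp(value, "0") == 0 || strcmp(value, "false") == 0)
      return false;
   return true;
}

/* Hypothetical wrapper: look the option up in the environment. A plain
 * getenv(name) != NULL test would also return true for NAME=0. */
static bool
env_flag_enabled(const char *name)
{
   return flag_value_enabled(getenv(name));
}
```

With this shape, MESA_NO_ERROR=0 and an unset MESA_NO_ERROR behave the same way, which is the behaviour asked for above.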

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH mesa] glx: remove dead code

2017-09-06 Thread Emil Velikov
On 6 September 2017 at 17:17, Adam Jackson  wrote:
> On Fri, 2017-09-01 at 15:04 +0100, Eric Engestrom wrote:
>> These fields were added in 2d94601582 but never used; hasPresent was
>> never set, while the other ones were set but never read.
>
> I think this patch is wrong:
>
>> -   dri3_reply = xcb_dri3_query_version_reply(c, dri3_cookie, &error);
>> -   if (!dri3_reply) {
>> -  free(error);
>> -  goto no_extension;
>> -   }
>
> You're not just removing the dead stores into the display state, you're
> also removing the checks for whether the extensions exist at all. The
> DRI3 loader is definitely not going to work if DRI3 and Present aren't,
> er, present.
>
I'm not that much of an expert on things XCB, so perhaps a silly question.
Isn't the presence checked with the code just above the removed hunk? Namely:

extension = xcb_get_extension_data(c, &xcb_dri3_id);
if (!(extension && extension->present))
  return NULL;

extension = xcb_get_extension_data(c, &xcb_present_id);
if (!(extension && extension->present))
  return NULL;

-Emil


Re: [Mesa-dev] [PATCH v6 3/6] mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

2017-09-06 Thread Ilia Mirkin
One might ask the question of why st/mesa is being built at all
without HAVE_GALLIUM...

On Wed, Sep 6, 2017 at 12:11 PM, Emil Velikov  wrote:
> Hi Gert,
>
> This seems to have broken the "classic only" build - see
> https://travis-ci.org/evelikov/Mesa/jobs/272529714.
> In there the following is executed
>
> DRI_LOADERS="--enable-glx --enable-gbm --enable-egl
> --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
> DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
> GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine
> --disable-xvmc --disable-vdpau --disable-va --disable-omx
> --disable-gallium-osmesa"
> GALLIUM_DRIVERS=""
> VULKAN_DRIVERS=""
>
> ./autogen.sh --enable-debug \
> $DRI_LOADERS \
> --with-dri-drivers=$DRI_DRIVERS \
> $GALLIUM_ST \
> --with-gallium-drivers=$GALLIUM_DRIVERS \
> --with-vulkan-drivers=$VULKAN_DRIVERS \
> --disable-llvm-shared-libs
>
> make && make check
>
> On 4 July 2017 at 15:18, Gert Wollny  wrote:
>> This patch adds a set of unit tests for the new lifetime tracker.
>> ---
>>  configure.ac   |1 +
>>  src/mesa/Makefile.am   |2 +-
>>  src/mesa/state_tracker/tests/Makefile.am   |   36 +
>>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 1364
>>  4 files changed, 1402 insertions(+), 1 deletion(-)
>>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>>  create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>>
>> diff --git a/configure.ac b/configure.ac
>> index 1e7a3be73f..d49aa83082 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -2850,6 +2850,7 @@ AC_CONFIG_FILES([Makefile
>> src/mesa/drivers/osmesa/osmesa.pc
>> src/mesa/drivers/x11/Makefile
>> src/mesa/main/tests/Makefile
>> +   src/mesa/state_tracker/tests/Makefile
>> src/util/Makefile
>> src/util/tests/hash_table/Makefile
>> src/vulkan/Makefile])
>> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
>> index 97a9bbd8c2..865735be27 100644
>> --- a/src/mesa/Makefile.am
>> +++ b/src/mesa/Makefile.am
>> @@ -19,7 +19,7 @@
>>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>>  # IN THE SOFTWARE.
>>
>> -SUBDIRS = . main/tests
>> +SUBDIRS = . main/tests state_tracker/tests
>>
> New folder should be conditionally included, ideally with a comment
> "Tests depend on libmesagallium.la "
>
> if HAVE_GALLIUM
> SUBDIRS += state_tracker/tests
> endif
>
> Do give it a try and polish any other nitpicks.
>
> Thanks
> Emil


Re: [Mesa-dev] [PATCH mesa] glx: remove dead code

2017-09-06 Thread Adam Jackson
On Fri, 2017-09-01 at 15:04 +0100, Eric Engestrom wrote:
> These fields were added in 2d94601582 but never used; hasPresent was
> never set, while the other ones were set but never read.

I think this patch is wrong:

> -   dri3_reply = xcb_dri3_query_version_reply(c, dri3_cookie, &error);
> -   if (!dri3_reply) {
> -  free(error);
> -  goto no_extension;
> -   }

You're not just removing the dead stores into the display state, you're
also removing the checks for whether the extensions exist at all. The
DRI3 loader is definitely not going to work if DRI3 and Present aren't,
er, present.

- ajax


Re: [Mesa-dev] [PATCH v6 3/6] mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker

2017-09-06 Thread Emil Velikov
Hi Gert,

This seems to have broken the "classic only" build - see
https://travis-ci.org/evelikov/Mesa/jobs/272529714.
In there the following is executed

DRI_LOADERS="--enable-glx --enable-gbm --enable-egl
--with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine
--disable-xvmc --disable-vdpau --disable-va --disable-omx
--disable-gallium-osmesa"
GALLIUM_DRIVERS=""
VULKAN_DRIVERS=""

./autogen.sh --enable-debug \
$DRI_LOADERS \
--with-dri-drivers=$DRI_DRIVERS \
$GALLIUM_ST \
--with-gallium-drivers=$GALLIUM_DRIVERS \
--with-vulkan-drivers=$VULKAN_DRIVERS \
--disable-llvm-shared-libs

make && make check

On 4 July 2017 at 15:18, Gert Wollny  wrote:
> This patch adds a set of unit tests for the new lifetime tracker.
> ---
>  configure.ac   |1 +
>  src/mesa/Makefile.am   |2 +-
>  src/mesa/state_tracker/tests/Makefile.am   |   36 +
>  .../tests/test_glsl_to_tgsi_lifetime.cpp   | 1364
>  4 files changed, 1402 insertions(+), 1 deletion(-)
>  create mode 100644 src/mesa/state_tracker/tests/Makefile.am
>  create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp
>
> diff --git a/configure.ac b/configure.ac
> index 1e7a3be73f..d49aa83082 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2850,6 +2850,7 @@ AC_CONFIG_FILES([Makefile
> src/mesa/drivers/osmesa/osmesa.pc
> src/mesa/drivers/x11/Makefile
> src/mesa/main/tests/Makefile
> +   src/mesa/state_tracker/tests/Makefile
> src/util/Makefile
> src/util/tests/hash_table/Makefile
> src/vulkan/Makefile])
> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> index 97a9bbd8c2..865735be27 100644
> --- a/src/mesa/Makefile.am
> +++ b/src/mesa/Makefile.am
> @@ -19,7 +19,7 @@
>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>  # IN THE SOFTWARE.
>
> -SUBDIRS = . main/tests
> +SUBDIRS = . main/tests state_tracker/tests
>
New folder should be conditionally included, ideally with a comment
"Tests depend on libmesagallium.la "

if HAVE_GALLIUM
SUBDIRS += state_tracker/tests
endif

Do give it a try and polish any other nitpicks.

Thanks
Emil


Re: [Mesa-dev] [PATCH] glsl: disallow mixed varying types within a location

2017-09-06 Thread Ilia Mirkin
On Wed, Sep 6, 2017 at 12:13 AM, Timothy Arceri  wrote:
>
>
> On 06/09/17 11:59, Ilia Mirkin wrote:
>>
>> On Tue, Sep 5, 2017 at 9:54 PM, Timothy Arceri wrote:
>>>
>>>
>>> On 06/09/17 11:23, Ilia Mirkin wrote:
>>>>
>>>>
>>>> The enhanced layouts spec has all kinds of restrictions about what can
>>>> and cannot be mixed in a location. Integer/float(/presumably double)
>>>> can't occupy a single location, interpolation has to be the same, as
>>>> well as auxiliary storage (including patch!).
>>>>
>>>> The implication of this is ... don't specify explicit
>>>> locations/components if you want better packing, since the auto-packer
>>>> doesn't care at all about most of these restrictions. Sad.
>>>
>>>
>>>
>>> There are still use cases such as SSO, tessellation shaders and varyings
>>> used by interpolateAt (although we just enable the enhanced layout
>>> packing rules by default for those anyway) where we cannot use the
>>> auto-packer.
>>>
>>> As far as the patch goes this should really be in link_varyings.cpp
>>> rather than linker.cpp, also there is already related validation code in
>>> cross_validate_outputs_to_inputs() any reason for not just modifying the
>>> code there?
>>
>>
>> This applies to whole shader stages. So e.g. in SSO, you still
>> validate the inputs and the outputs. Similarly, you do this for vertex
>> shader inputs.
>
>
> I'm not following what you are trying to say here.
> cross_validate_outputs_to_inputs() does almost the same thing you are doing
> here, but also validates the outputs from one stage with the input from
> another. You just need to adjust the offsets for patches like you have done
> here and add the missing interpolation checks etc, the base type is already
> validated there.
>
>>
>> Ideally it'd be done earlier on, but we need to wait for the interface
>> types to go away, or else it'd be a disaster.
>>
>> Most of link_varyings is concerned with inter-stage logic. It could be
>> moved there, of course, just didn't really seem to belong.
>
>
> link_varyings should be used for all things varyings unless it really
> doesn't make sense. There is no reason to dump everything in linker.cpp. The
> only reason there are still bits in linker.cpp is because Paul left the
> project in the middle of re-factoring things and beside a single patch from
> me that moved more varying related code here a while ago, there has been
> pretty much zero effort in re-factoring any of the GLSL IR compiler into
> more sensible pieces.
>

OK, I'll redo it. I originally had hopes of doing it much earlier in
the flow, but then realized that I needed the interface lowering.
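
For readers following the thread, the per-location rule being discussed (no mixing of base type, interpolation or auxiliary storage within one location, and no overlapping components) can be sketched in plain C. Everything below is hypothetical: the types, the names and the 32-location limit are illustration only and are not taken from linker.cpp or link_varyings.cpp. The sketch also assumes every varying fits in a single location and that per-patch and per-vertex varyings share one table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Every name here is hypothetical and only models the rule under discussion. */
enum base_type { BASE_FLOAT, BASE_INT, BASE_DOUBLE };
enum interp_mode { INTERP_SMOOTH, INTERP_FLAT, INTERP_NOPERSPECTIVE };
enum aux_storage { AUX_NONE, AUX_CENTROID, AUX_SAMPLE, AUX_PATCH };

struct varying {
   unsigned location;        /* explicit location */
   unsigned component;       /* first component used, 0..3 */
   unsigned num_components;  /* number of components used */
   enum base_type type;
   enum interp_mode interp;
   enum aux_storage aux;
};

struct slot_state {
   unsigned used_mask;       /* components already claimed in this location */
   enum base_type type;
   enum interp_mode interp;
   enum aux_storage aux;
};

#define MAX_LOCATIONS 32

/* Returns false as soon as two varyings sharing a location overlap or
 * disagree on base type, interpolation or auxiliary storage. */
static bool
validate_explicit_locations(const struct varying *vars, size_t count)
{
   struct slot_state slots[MAX_LOCATIONS] = {0};

   for (size_t i = 0; i < count; i++) {
      const struct varying *var = &vars[i];

      if (var->location >= MAX_LOCATIONS || var->num_components == 0 ||
          var->component + var->num_components > 4)
         return false;

      unsigned mask = ((1u << var->num_components) - 1) << var->component;
      struct slot_state *slot = &slots[var->location];

      if (slot->used_mask) {
         if (slot->used_mask & mask)
            return false;   /* overlapping components */
         if (slot->type != var->type || slot->interp != var->interp ||
             slot->aux != var->aux)
            return false;   /* mixed qualifiers within one location */
      } else {
         slot->type = var->type;
         slot->interp = var->interp;
         slot->aux = var->aux;
      }
      slot->used_mask |= mask;
   }
   return true;
}
```

Two vec2 floats packed into components 0-1 and 2-3 of one location pass this check; replacing the second one with an integer vec2 fails it.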


[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #8 from Bruce Cherniak  ---
Sounds like the right solution then.  Thanks for taking a look at this.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #7 from Thomas Hellström  ---
(In reply to Bruce Cherniak from comment #6)
> Verified.  Your patch does fix this issue.
> 
> Does this negate the original intent of your "Reduce the number of
> frontbuffer flush calls"?

Nope, we're still skipping the frontbuffer flush when there was nothing
rendered to the frontbuffer, (which we didn't do before). But contrary to
before, we're updating the framebuffer state after each frontbuffer flush.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2 3/4] i965/screen: Report the correct number of image planes

2017-09-06 Thread Emil Velikov
On 5 September 2017 at 19:31, Jason Ekstrand  wrote:
> On Tue, Sep 5, 2017 at 10:25 AM, Emil Velikov wrote:
>>
>> Hi Jason,
>>
>> On 5 September 2017 at 16:48, Jason Ekstrand  wrote:
>> > For non-CCS images, we were reporting just one plane even though they
>> > may have multiple in the case of YUV.
>> >
>> > Reviewed-by: Ben Widawsky 
>> I think we want this for stable, right?
>
>
> Maybe?  Ben and I were debating it.  I don't think it would hurt to send it
> to stable but I also doubt it benefits us given that everything seems to be
> working.  I'm happy to add the tag if you'd like.
>
You and Ben know this code far better than me, so I'll leave it to you guys :-)

Thanks
Emil


[Mesa-dev] [Bug 102564] swr: GPU Caps Viewer crashes with any 3D demo

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102564

Alex Granni  changed:

               What|Removed                     |Added
----------------------------------------------------------------------------
 Attachment #134011|GPU Caps Viewer randomly    |GPU Caps Viewer crash in
        description|picked demo with OpenSWR -  |randomly picked demo with
                   |Visual Studio debug output  |OpenSWR - Visual Studio
                   |screenshot                  |debug output screenshot

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 17/17] i965: Disentangle batch and state buffer flushing.

2017-09-06 Thread Chris Wilson
Quoting Kenneth Graunke (2017-09-06 16:33:48)
> On Wednesday, September 6, 2017 5:26:10 AM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-09-06 01:09:50)
> > > We now flush the batch when either the batchbuffer or statebuffer
> > > reaches the original intended batch size, instead of when the sum of
> > > the two reaches a certain size (which makes no sense now that they're
> > > separate buffers).
> > > 
> > > With this change, we also need to update our "are we near the end?"
> > > estimate to require separate batch and state buffer space.  I obtained
> > > these estimates by looking at the size of draw calls in the Unreal 4
> > > Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).
> > > 
> > > This will increase the batch size by perhaps 2-4x, which will almost
> > > certainly have a performance impact, and may impact overall system
> > > responsiveness.
> > 
> > You also need to update DEBUG_FLUSH:
> > 
> > @@ -823,8 +826,8 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw,
> >int bytes_for_state = brw->batch.state_used;
> > >fprintf(stderr, "%s:%d: Batchbuffer flush with %4db (%0.1f%%) (pkt) + "
> > >"%4db (%0.1f%%) (state)\n", file, line,
> > > -  bytes_for_commands, 100.0f * bytes_for_commands / BATCH_SZ,
> > > -  bytes_for_state, 100.0f * bytes_for_state / STATE_SZ);
> > > +  bytes_for_commands, 100.0f * bytes_for_commands / brw->batch.bo->size,
> > > +  bytes_for_state, 100.0f * bytes_for_state / brw->batch.state_bo->size);
> > }
> 
> Ah...I'd actually meant to leave it this way.  The flushing still happens
> when we reach the target size (BATCH_SZ or STATE_SZ), even if we grow...
> I figured we could report the "we grew the batch" cases as "105% of
> the target size", so you can see that the batch is over-utilized...
> 
> Which I guess is a good point...with that model, we won't grow more than
> once anyway, because after we finish the one draw, we'll be over BATCH_SZ
> (or STATE_SZ) and flush.  So it might be reasonable to just allocate
> (BATCH_SZ * 2) and not have the pretense of making it continually grow...

Ok, that makes sense but I completely missed that was the intended
design as I read through the series. I was just expecting for everything
to keep on growing on demand until some heuristic said enough was
enough.

Just make sure that design is described in a comment somewhere around
grow_buffers().
-Chris
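
A minimal sketch of the model Kenneth describes in the quoted text, assuming the flush decision and the debug percentage are both taken against the target sizes rather than the (possibly grown) buffer sizes. BATCH_SZ and STATE_SZ use the values from the series; should_flush() and percent_of_target() are hypothetical names, not functions from intel_batchbuffer.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Target sizes as defined in the series. */
#define BATCH_SZ (8192 * sizeof(uint32_t))
#define STATE_SZ (8192 * sizeof(uint32_t))

/* Hypothetical: utilisation relative to the target size, not the BO size,
 * so a batch that had to grow reports more than 100%. */
static float
percent_of_target(size_t used, size_t target)
{
   return 100.0f * (float)used / (float)target;
}

/* Hypothetical flush check: flush once either buffer reaches its target,
 * independently of the other one. */
static bool
should_flush(size_t batch_used, size_t state_used)
{
   return batch_used >= BATCH_SZ || state_used >= STATE_SZ;
}
```

Under this model a buffer grows at most once per batch: the draw that overflowed finishes, the next check sees usage above the target, and the batch is flushed.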


Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.

2017-09-06 Thread Nicolai Hähnle

On 06.09.2017 11:47, Marek Olšák wrote:

On Wed, Sep 6, 2017 at 4:44 AM, Dave Airlie  wrote:

On 6 September 2017 at 03:11, Marek Olšák  wrote:

On Tue, Sep 5, 2017 at 5:50 PM, Brian Paul  wrote:

On 09/04/2017 05:29 AM, Marek Olšák wrote:


On Sun, Sep 3, 2017 at 1:18 PM, Dave Airlie  wrote:


From: Dave Airlie 

reduces size from 1144 to 1128.

Signed-off-by: Dave Airlie 
---
   src/mesa/main/mtypes.h | 10 +-
   1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d44897b..3d68a6d 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1012,7 +1012,6 @@ struct gl_texture_object
  struct gl_sampler_object Sampler;

  GLenum DepthMode;   /**< GL_ARB_depth_texture */



The patch looks good, but here are some ideas for future improvements:

GLenum can be uint16_t everywhere, because GL doesn't set higher bits:

typedef uint16_t GLenum16.
s/GLenum/GLenum16/


-   bool StencilSampling;   /**< Should we sample stencil instead of
depth? */

  GLfloat Priority;   /**< in [0,1] */
  GLint BaseLevel;/**< min mipmap level, OpenGL 1.2 */
@@ -1033,12 +1032,17 @@ struct gl_texture_object
  GLboolean Immutable;/**< GL_ARB_texture_storage */
  GLboolean _IsFloat; /**< GL_OES_float_texture */
  GLboolean _IsHalfFloat; /**< GL_OES_half_float_texture */
+   bool StencilSampling;   /**< Should we sample stencil instead of
depth? */
+   bool HandleAllocated;   /**< GL_ARB_bindless_texture */



All bools can be 1 bit:

bool x:1;
GLboolean y:1;

etc.



  GLuint MinLevel;/**< GL_ARB_texture_view */
  GLuint MinLayer;/**< GL_ARB_texture_view */
  GLuint NumLevels;   /**< GL_ARB_texture_view */
  GLuint NumLayers;   /**< GL_ARB_texture_view */



MinLevel, NumLevels can be ubyte (uint8_t). MinLayer, NumLayers can be
ushort (uint16_t)... simply by considering the range of possible
values.
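
A small sketch of the narrowing suggested above, assuming the value ranges really are as small as stated. The two structs are hypothetical stand-ins for a handful of gl_texture_object fields, not the real layout, and exact sizes depend on the ABI, so the sketch only relies on the narrow variant being smaller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: 32-bit enums and counters, with the bools
 * scattered between them so that each one drags in padding. */
struct tex_fields_wide {
   unsigned depth_mode;       /* GLenum stored as 32 bits */
   bool stencil_sampling;
   unsigned min_level;
   unsigned min_layer;
   unsigned num_levels;
   unsigned num_layers;
   bool handle_allocated;
};

/* Hypothetical "after" layout: 16-bit enum and layer counts, 8-bit level
 * counts, members ordered from widest to narrowest. */
struct tex_fields_narrow {
   uint16_t depth_mode;
   uint16_t min_layer;
   uint16_t num_layers;
   uint8_t min_level;
   uint8_t num_levels;
   bool stencil_sampling;
   bool handle_allocated;
};
```

Running pahole on both structs would show where the padding holes in the wide variant are; the narrowing itself still has to be checked by hand against the real value ranges.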



There's lots of opportunities along these lines in gl_texture_image. And
since we often have many gl_texture_images per gl_texture_object, and we
often have many textures, it'll probably have considerable impact.  I've
suggested this in the past but never got around to working on it.

I recall Eric Anholt mentioning a memory profiling tool that was helpful for
finding wasted space in structures, etc.  I don't recall the name right now.
Eric?


Dave used pahole for this patch series too. It can't obviously suggest
what I suggested above (like changing the types and bits).


Yup this was pahole, doing what Marek describes is definitely something
that can be done, but needs a lot more care and attention.

Replacing bool with unsigned :1 fields isn't always a win, as you then
have a mask/shift on the accesses so overall may end up slowing things
down, and increasing instruction count etc.


bool:1 is better than unsigned:1, because it has the behavior of
bool. (and the 1-byte alignment might also matter)

Testing bool:1 doesn't need a shift, only a mask, i.e. the compiler should do:
if (structure->somebyte & 0x04)
for testing a bool on the 3rd bit.

The OR operator has the same complexity:
if (structure->bool_val & 0x24)
for testing 2 bools on the 3rd and 6th bit.

And the AND operator:
if ((structure->bool_val & 0x24) == 0x24)


Setting a bool:1 to 1 is done by OR'ing the bit. Setting a bool to 0
is done by AND'ing the bit. No shifts anywhere.
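
The bool:1 points above can be tried with a short sketch. struct tex_flags and any_special_sampling() are hypothetical; how the compiler lays out and tests the bits is implementation-defined, so the comments describe the intent rather than guaranteed code generation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag block: four bool bit-fields that a typical compiler
 * packs into a single byte. */
struct tex_flags {
   bool immutable:1;
   bool is_float:1;
   bool stencil_sampling:1;
   bool handle_allocated:1;
};

/* Testing two flags at once; the compiler is free to turn this into one
 * load plus one mask test, since both bits live in the same storage unit. */
static bool
any_special_sampling(const struct tex_flags *flags)
{
   return flags->stencil_sampling || flags->handle_allocated;
}

/* Storing through a bool bit-field converts to bool first, so any non-zero
 * value becomes 1 instead of being truncated to its lowest bit. */
static void
set_is_float(struct tex_flags *flags, int value)
{
   flags->is_float = value;
}
```

An unsigned:1 field would instead keep only the lowest bit of the stored value, which is the behavioural difference mentioned above.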


In fairness, most of these do end up involving more instructions because 
x86 ops don't have memory/immediate forms. So for a simple if 
(structure->bool_bit) you end up with a MOV+TEST instead of a CMP.


I still think bitfields are worth it in many cases, though, and 
gl_texture_image may be a good candidate.


Cheers,
Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


[Mesa-dev] [Bug 102564] swr: GPU Caps Viewer crashes with any 3D demo

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102564

Bug ID: 102564
   Summary: swr: GPU Caps Viewer crashes with any 3D demo
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: liviupro...@yahoo.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 134011
  --> https://bugs.freedesktop.org/attachment.cgi?id=134011&action=edit
GPU Caps Viewer randomly picked demo with OpenSWR - Visual Studio debug output
screenshot

I was trying to benchmark swr against llvmpipe using GPU Caps Viewer
(http://www.geeks3d.com/category/geeks3d/gpu-caps-viewer-geeks3d/) as I
observed some performance gain with v17.2.0 when testing with other software,
but this didn't work as planned. I got an instant crash on any demo I tried
regardless of what OpenGL version the demo required. Initially I thought that
GL 3.3COMPAT context enforced with MESA_GL_VERSION_OVERRIDE is the cause[1]. I
thought maybe OpenSWR doesn't support this yet or it is buggy, but that doesn't
seem to be the case. Even if I remove that from the batch script I wrote to
pass variables to Mesa before launching GPU Caps Viewer and limit myself to GL
1.2 and 2.1 demos, I still get crashes for all demos, regardless of GL
requirements being met or not.
llvmpipe doesn't crash even if you attempt to start a demo for which llvmpipe
doesn't meet the requirements. It fails cleanly.

[1]I set MESA_GL_VERSION_OVERRIDE=3.3COMPAT for convenience as it allows
running GL 1.2, 2.1 and 3.x demos without having to restart GPU Caps Viewer
with GL 3.x content for GL 3.x demos and without for GL 1.2 and 2.1 demos.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 14/17] i965: Grow the batch/state buffers if we need space and can't flush.

2017-09-06 Thread Chris Wilson
Quoting Kenneth Graunke (2017-09-06 16:20:00)
> On Wednesday, September 6, 2017 3:08:44 AM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-09-06 01:09:47)
> > > Previously, we would just assert fail and die in this case.  The only
> > > safeguard is the "estimated max prim size" checks when starting a draw
> > > (or compute dispatch or BLORP operation)...which are woefully broken.
> > > 
> > > Growing is fairly straightforward:
> > > 
> > > 1. Allocate a new larger BO.
> > > 2. memcpy the existing contents over to the new buffer
> > > 3. Set the new BO to the same GTT offset as the old BO.  When emitting
> > >relocations, we write the presumed GTT offset of the target BO.  If
> > >we changed it, we'd have to update all the existing values (by
> > >walking the relocation list and looking at offsets), which is more
> > >expensive.  With the old BO freed, ideally the kernel could simply
> > >place the new BO at that offset anyway.
> > > 4. Update the validation list to contain the new BO.
> > > 5. Update the relocation list to have the GEM handle for the new BO
> > >(which we can skip if using I915_EXEC_HANDLE_LUT).
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 104 
> > > --
> > >  1 file changed, 99 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > index 909f56f9792..118f75c4d71 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > > @@ -43,6 +43,9 @@
> > >  #define BATCH_SZ (8192*sizeof(uint32_t))
> > >  #define STATE_SZ (8192*sizeof(uint32_t))
> > >  
> > > +/* Don't exceed this - batchbuffers need to fit in the ring! */
> > 
> > I don't understand this comment. I probably just have the wrong pov, you
> > say ring and I then think of the legacy/lrc ringbuffer in the kernel.
> 
> My understanding was that the legacy Gen4-7.5 ringbuffer mode allocated a
> ringbuffer that was...128kB large?  So if you exceeded that size, the
> batch would not fit in the ring at all, and execbuf would fail.

We don't copy the batch into the ring, we just stick a
MI_BATCH_BUFFER_START in there (with a flush before/after and a
breadcrumb, along with any context switch, change of mm, etc).

64k is indeed a magic limit for the state batch, but there's no limit on
the batch buffer size (after converting it to a pure command stream, the
only constraint is the prospect of a timeout). Well... there is an implicit
assumption that you don't exceed 256KiB for a batch (as a limit for a gen2
workaround and a different gen7 workaround).

Elsewhere, I have used 256KiB batches (and then shrunk to fit) simply
because of UINT16_MAX dwords. (Which is kind of why the kernel assumes
that the upper reasonable maximum size is 256KiB for its w/a.)
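
The arithmetic behind that figure can be sketched roughly as below. This is
only an illustration: the function name is made up and is not taken from the
kernel or from Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the largest batch, in bytes, whose length still
 * fits in a 16-bit count of dwords (4 bytes each).
 */
static uint32_t max_bytes_for_u16_dword_count(void)
{
    uint32_t bytes = (uint32_t)UINT16_MAX * (uint32_t)sizeof(uint32_t);

    /* 65535 dwords * 4 bytes = 262140 bytes, just under 256KiB (262144). */
    assert(bytes < 256u * 1024u);
    return bytes;
}
```

So a UINT16_MAX dword count lands four bytes short of a full 256KiB.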

> > > +   brw_bo_reference(new_bo);
> > > +   brw_bo_unreference(old_bo);
> > 
> > > +
> > > +   if (!batch->use_batch_first) {
> > > +  /* We're not using I915_EXEC_HANDLE_LUT, which means we need to go
> > > +   * update the relocation list entries to point at the new BO as well.
> > > +   * (With newer kernels, the "handle" is an offset into the validation
> > > +   * list, which remains unchanged, so we can skip this.)
> > > +   */
> > > +  replace_bo_in_reloc_list(&batch->batch_relocs,
> > > +   old_bo->gem_handle, new_bo->gem_handle);
> > > +  replace_bo_in_reloc_list(&batch->state_relocs,
> > > +   old_bo->gem_handle, new_bo->gem_handle);
> > > +   }
> > > +
> > > +   /* Drop the *bo_ptr reference.  This should free the old BO. */
> > > +   brw_bo_unreference(old_bo);
> > 
> > Ok, just took a double take even with the comment in place.
> > 
> > > +   *bo_ptr = new_bo;
> > > +   *map_ptr = new_map;
> > > +}
> 
> Yeah...the double unreference is kind of spooky.  The alternative is to
> make the batch and state buffers not referenced by the validation list,
> which does save a few atomics, but adds a != batch_bo && != state_bo
> check in the validation list loops...so...seemed better to just do this.

It's not terrible, it just looked fishy! But I didn't like any of my
suggestions for the comment either.
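
To spell out why two unreferences are balanced here, a toy model of the
reference counts may help. Everything below (toy_bo, toy_ref, toy_unref,
toy_swap) is hypothetical and only mimics the shape of the patch; it is not
the brw_bo API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted buffer object. */
struct toy_bo {
    int refcount;
    bool freed;   /* set instead of calling free(), so it can be inspected */
};

static void toy_ref(struct toy_bo *bo)
{
    bo->refcount++;
}

static void toy_unref(struct toy_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        bo->freed = true;
}

/* Swap new_bo in for the buffer held by both *bo_ptr and *list_slot.
 * The old buffer is assumed to hold two references (one per holder);
 * new_bo arrives with the single reference from its allocation.
 */
static void toy_swap(struct toy_bo **bo_ptr, struct toy_bo **list_slot,
                     struct toy_bo *new_bo)
{
    struct toy_bo *old_bo = *bo_ptr;

    *list_slot = new_bo;
    toy_ref(new_bo);     /* the list now holds a reference on the new BO */
    toy_unref(old_bo);   /* ...and gives up its reference on the old BO */

    toy_unref(old_bo);   /* drop the *bo_ptr reference; frees the old BO */
    *bo_ptr = new_bo;    /* *bo_ptr takes over the allocation reference */
}
```

Under those assumptions the old buffer ends at zero references and the new
one at two, one per holder.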
-Chris


[Mesa-dev] [Bug 102496] Frontbuffer rendering corruption on mesa master

2017-09-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102496

--- Comment #6 from Bruce Cherniak  ---
Verified.  Your patch does fix this issue.

Does this negate the original intent of your "Reduce the number of frontbuffer
flush calls"?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v9 0/7] mesa/st: glsl_to_tgsi: refined register merge algorithm

2017-09-06 Thread Nicolai Hähnle

On 06.09.2017 17:00, Dieter Nützel wrote:

On 06.09.2017 11:53, Nicolai Hähnle wrote:

I finally went over this again, and I think it's good enough to go in.
R-b and pushed!

Thanks for making all the adjustments and your patience :)

I did notice some minor things, I'm about to send out patches to clean
things up afterwards.

Cheers,
Nicolai


Wow,

this IS GREAT news!
But you've lost all my T-b ... ;-)))


Sorry about that. I guess you didn't send them on the last version?

Gert, for the future, the usual policy is that when you receive 
Tested-by (or other things) on earlier versions of commits, you add them as


Tested-by: Firstname Lastname  (vNN)

where vNN is whichever version of the series you received the Tested-by 
for. That way, they won't get lost :)


Cheers,
Nicolai




What I've found and hadn't mailed before is that I see ~50 KB smaller
*dr[i|v].so files with Gert's nice stuff.


Thank you very much for this, Gert!

I'll swap my Turks XT/6670 back in and test LS2015 on it again, to
verify whether the missing textures (people) appear now.


Greetings,
Dieter



On 24.08.2017 19:38, Gert Wollny wrote:

Dear all,

I thought I might send out this patch another time with its full history and
freshly rebased. All the changes that I applied were a result of reviews by
Nicolai (mostly) and Emil (thanks again to both of you).

The set is mirrored at
https://github.com/gerddie/mesa/tree/regrename-v9

The patch fixes a series of bugs where shader compilation fails with
   "translation from TGSI failed!"
Among these are
   * https://bugs.freedesktop.org/show_bug.cgi?id=65448 which
     I can confirm will be fixed for R600_DEBUG=nosb set (with sb enabled it
     will fail with a failing assertion in the sb code).
   * According to a user report against v5, the patch also fixes #99349

I can also confirm that the patch fixes the Piano and Volplosion benchmarks
implemented in gputest on BARTS (r600g).

The patch has no significant impact on runtime - not taking Dave's patch into
account that in itself reduces the register renaming run-time for shaders
with a large number of temporary registers.

The patch doesn't introduce piglit regressions (I tested the shader subset).
spec@glsl-1.50@execution@variable-indexing@gs-input-array-vec2-index-rd is
fixed though.

The algorithm works as follows:
- first the program is scanned, the loops, switch and if/else scopes are
   collected and for each temporary first and last reads and writes and the
   according scopes are collected, and it is recorded whether a variable is
   written conditionally, and whether loops have continue or break statements.
- then after the whole program has been scanned, the life times are estimated
   by merging the read and write scopes for each temporary on a per component
   basis,
- the life-times of the components are merged,
- the register mapping is evaluated, and
- the mapping is applied with the rename_temp_registers method implemented
   by Dave.
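
As a rough sketch of the "life-times of the components are merged" step:
the code below assumes that merging means taking the earliest begin and the
latest end over all components that are actually used. The struct and
function names are invented for illustration and are not the ones used in
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical life-time record: instruction indices of the first write
 * and the last read. begin < 0 marks a component that is never used.
 */
struct toy_lifetime {
    int begin;
    int end;
};

/* Merge the life-times of the four components (x, y, z, w) of one
 * temporary into a single life-time for the whole register.
 */
static struct toy_lifetime
toy_merge_component_lifetimes(const struct toy_lifetime comp[4])
{
    struct toy_lifetime out = { -1, -1 };

    assert(comp != NULL);
    for (int i = 0; i < 4; i++) {
        if (comp[i].begin < 0)
            continue;                       /* unused component */
        if (out.begin < 0 || comp[i].begin < out.begin)
            out.begin = comp[i].begin;      /* earliest first write */
        if (comp[i].end > out.end)
            out.end = comp[i].end;          /* latest last read */
    }
    return out;
}
```

A register whose components live over [2,5], [4,9] and [3,3] would get the
merged life-time [2,9] under this assumption.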

I've used the patches for quite some time now and so far I didn't encounter
any problems, many thanks for any comments,
  Gert

Patch history:

v2:
* significantly cut down on the memory allocations,
* expose only a minimal interface to register lifetime estimation and
  calculating the rename table,

v3: was broken and v4 restarted from v2

v4:
* split the changes into more patches
* correct formatting errors,
* remove the use of the STL with one exception: std::sort, since it is
  already used in st_glsl_to_tgsi.cpp and its run-time performance is
  significantly better than qsort. It is used in the register rename
  mapping evaluation. It can be disabled by commenting out the define
  USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
* add more tests and improve the life-time evaluation accordingly,
* further reduce memory allocations,
* rename functions and methods to better clarify what they are used for,
* remove unused methods and variables in prog_scope,
* eliminate the class tgsi_temp_lifetime,
* no longer require C++11 for the core library code, however, the tests
  make use of C++11 and the STL

v5:
* correct formatting following Emil's suggestions
* remove un-needed libraries for the tests

v6:
* the components are now tracked individually and the life time of a
  temporary is evaluated by merging the life-times of its components,
* BRK/CONT are now handled separately,
* the final algorithm to evaluate the life-times was simplified,
* read and write in the same instruction is now considered to be always
  well defined,
* adherence to the coding style was improved,
* the case scope level is now below the according switch scope level,
* the new register merge method replaces the old version, i.e. no environment
  variables to switch between implementations. In theory, one could also
  remove 

Re: [Mesa-dev] [PATCH 00/17] i965: Growing the batch buffer, separate state buffers

2017-09-06 Thread Kenneth Graunke
On Wednesday, September 6, 2017 2:21:51 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-09-06 01:09:33)
> > A couple issues remain: the series drops the malloc'd shadow copy of
> > the batch for non-LLC systems.  I haven't checked how dire this is.
> > The last patch also dramatically impacts batch sizes, which we'll need
> > to benchmark.  However, I thought I'd get the code out there for review
> > before finishing that - I don't expect the mechanics to change much.
> 
> It's a slight regression atm (at least looking at piglit/drawoverhead)
> all due to that dastardly readback in brw_emit_surface_state().
> 
> I've been chicken and been switching it with a temporary alloca there.
> -Chris

Yeah, it looks to be about -3 to -4% on Synmark's OglBatch[67] as well.

I'll try and fix ISL to handle the relocs so we can avoid the readback.



Re: [Mesa-dev] [PATCH 17/17] i965: Disentangle batch and state buffer flushing.

2017-09-06 Thread Kenneth Graunke
On Wednesday, September 6, 2017 5:26:10 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-09-06 01:09:50)
> > We now flush the batch when either the batchbuffer or statebuffer
> > reaches the original intended batch size, instead of when the sum of
> > the two reaches a certain size (which makes no sense now that they're
> > separate buffers).
> > 
> > With this change, we also need to update our "are we near the end?"
> > estimate to require separate batch and state buffer space.  I obtained
> > these estimates by looking at the size of draw calls in the Unreal 4
> > Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).
> > 
> > This will increase the batch size by perhaps 2-4x, which will almost
> > certainly have a performance impact, and may impact overall system
> > responsiveness.
> 
> You also need to update DEBUG_FLUSH:
> 
> @@ -823,8 +826,8 @@ _intel_batchbuffer_flush_fence(struct brw_context *brw,
>int bytes_for_state = brw->batch.state_used;
>fprintf(stderr, "%s:%d: Batchbuffer flush with %4db (%0.1f%%) (pkt) + "
>"%4db (%0.1f%%) (state)\n", file, line,
> -  bytes_for_commands, 100.0f * bytes_for_commands / BATCH_SZ,
> -  bytes_for_state, 100.0f * bytes_for_state / STATE_SZ);
> +  bytes_for_commands, 100.0f * bytes_for_commands / brw->batch.bo->size,
> +  bytes_for_state, 100.0f * bytes_for_state / brw->batch.state_bo->size);
> }

Ah...I'd actually meant to leave it this way.  The flushing still happens
when we reach the target size (BATCH_SZ or STATE_SZ), even if we grow...
I figured we could report the "we grew the batch" cases as "105% of
the target size", so you can see that the batch is over-utilized...

Which I guess is a good point...with that model, we won't grow more than
once anyway, because after we finish the one draw, we'll be over BATCH_SZ
(or STATE_SZ) and flush.  So it might be reasonable to just allocate
(BATCH_SZ * 2) and not have the pretense of making it continually grow...
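
To illustrate the "105% of the target size" idea, here is a minimal sketch
of the percentage computed against the fixed target rather than the current
BO size. BATCH_SZ is the define from the patch; percent_of_target is a
made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Target batch size, as defined in intel_batchbuffer.c. */
#define BATCH_SZ (8192 * sizeof(uint32_t))

/* Hypothetical helper: utilization relative to the flush target, so a
 * batch that had to grow reports more than 100%.
 */
static float percent_of_target(size_t bytes_used, size_t target_bytes)
{
    assert(target_bytes > 0);
    return 100.0f * (float)bytes_used / (float)target_bytes;
}
```

With this, a batch that used one and a half times BATCH_SZ would show up as
150%, whereas dividing by the grown BO's size would hide the overrun.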

*shrug*



Re: [Mesa-dev] [PATCH 17/17] i965: Disentangle batch and state buffer flushing.

2017-09-06 Thread Kenneth Graunke
On Wednesday, September 6, 2017 4:23:37 AM PDT Chris Wilson wrote:
> Quoting Chris Wilson (2017-09-06 11:13:54)
> > Quoting Kenneth Graunke (2017-09-06 01:09:50)
> > > We now flush the batch when either the batchbuffer or statebuffer
> > > reaches the original intended batch size, instead of when the sum of
> > > the two reaches a certain size (which makes no sense now that they're
> > > separate buffers).
> > > 
> > > With this change, we also need to update our "are we near the end?"
> > > estimate to require separate batch and state buffer space.  I obtained
> > > these estimates by looking at the size of draw calls in the Unreal 4
> > > Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).
> > > 
> > > This will increase the batch size by perhaps 2-4x, which will almost
> > > certainly have a performance impact, and may impact overall system
> > > responsiveness.
> > > 
> > > XXX: benchmark, may need a lot of tuning.
> > 
> > What were your thoughts on not flushing the batch when swapping the state,
> > since that just needs to re-emit STATE_BASE?

With STATE_BASE_ADDRESS being a pretty heavy pipeline stall, I figured
that we may as well simply flush and kick some of the work off.  It's also
simpler to implement.

I don't expect growing to be very common - the hope is that the pre-draw
batch/state estimates will cause us to flush before we hit the end of the
original intended batch size, but if our estimates are off (say there's a
big draw at an inopportune time) we can survive.

Growing actually ended up being pretty cheap, though, except for the
memcpy...
 
> Also given the restriction upon the surface state that only allows it to
> grow once, why not just make it 64k and replace upon filling?
> -Chris

Not sure I follow - the batch is restricted to not grow forever, but the
state buffer should have no restrictions.  I'm also growing buffers by
1.5x rather than the usual 2x, though I guess that means 32k -> 48k ->
72k...
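
The growth sequence can be sketched as follows. The function name is
invented, and the real patch may compute the new size differently; this
only shows the 1.5x arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: grow a buffer size by 1.5x using integer math. */
static uint32_t toy_grow_1_5x(uint32_t size)
{
    assert(size <= UINT32_MAX - size / 2);   /* guard against overflow */
    return size + size / 2;
}
```

Starting from 32768 bytes this gives 49152 and then 73728, i.e. the
32k -> 48k -> 72k progression.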



Re: [Mesa-dev] [PATCH 14/17] i965: Grow the batch/state buffers if we need space and can't flush.

2017-09-06 Thread Kenneth Graunke
On Wednesday, September 6, 2017 3:08:44 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-09-06 01:09:47)
> > Previously, we would just assert fail and die in this case.  The only
> > safeguard is the "estimated max prim size" checks when starting a draw
> > (or compute dispatch or BLORP operation)...which are woefully broken.
> > 
> > Growing is fairly straightforward:
> > 
> > 1. Allocate a new larger BO.
> > 2. memcpy the existing contents over to the new buffer
> > 3. Set the new BO to the same GTT offset as the old BO.  When emitting
> >relocations, we write the presumed GTT offset of the target BO.  If
> >we changed it, we'd have to update all the existing values (by
> >walking the relocation list and looking at offsets), which is more
> >expensive.  With the old BO freed, ideally the kernel could simply
> >place the new BO at that offset anyway.
> > 4. Update the validation list to contain the new BO.
> > 5. Update the relocation list to have the GEM handle for the new BO
> >(which we can skip if using I915_EXEC_HANDLE_LUT).
> > ---
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 104 
> > --
> >  1 file changed, 99 insertions(+), 5 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > index 909f56f9792..118f75c4d71 100644
> > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > @@ -43,6 +43,9 @@
> >  #define BATCH_SZ (8192*sizeof(uint32_t))
> >  #define STATE_SZ (8192*sizeof(uint32_t))
> >  
> > +/* Don't exceed this - batchbuffers need to fit in the ring! */
> 
> I don't understand this comment. I probably just have the wrong pov, you
> say ring and I then think of the legacy/lrc ringbuffer in the kernel.

My understanding was that the legacy Gen4-7.5 ringbuffer mode allocated a
ringbuffer that was...128kB large?  So if you exceeded that size, the
batch would not fit in the ring at all, and execbuf would fail.

65536 may be the wrong size here.

> > +#define MAX_BATCH_SIZE 65536
> > +
> >  static void
> >  intel_batchbuffer_reset(struct intel_batchbuffer *batch,
> >  struct brw_bufmgr *bufmgr);
> > @@ -228,6 +231,78 @@ intel_batchbuffer_free(struct intel_batchbuffer *batch)
> >_mesa_hash_table_destroy(batch->state_batch_sizes, NULL);
> >  }
> >  
> > +static void
> > +replace_bo_in_reloc_list(struct brw_reloc_list *rlist,
> > + uint32_t old_handle, uint32_t new_handle)
> > +{
> > +   for (int i = 0; i < rlist->reloc_count; i++) {
> > +  if (rlist->relocs[i].target_handle == old_handle)
> > + rlist->relocs[i].target_handle = new_handle;
> > +   }
> > +}
> > +
> > +static void
> > +grow_buffer(struct brw_context *brw,
> > +struct brw_bo **bo_ptr,
> > +uint32_t **map_ptr,
> > +unsigned existing_bytes,
> > +unsigned new_size)
> > +{
> > +   struct intel_batchbuffer *batch = &brw->batch;
> > +   struct brw_bufmgr *bufmgr = brw->bufmgr;
> > +
> > +   uint32_t *old_map = *map_ptr;
> > +   struct brw_bo *old_bo = *bo_ptr;
> > +
> > +   struct brw_bo *new_bo = brw_bo_alloc(bufmgr, old_bo->name, new_size, 4096);
> > +   uint32_t *new_map = brw_bo_map(brw, new_bo, MAP_READ | MAP_WRITE);
> > +
> > +   perf_debug("Growing %s - ran out of space\n", old_bo->name);
> > +
> > +   /* Copy existing data to the new larger buffer */
> > +   memcpy(new_map, old_map, existing_bytes);
> 
> Needs the sse41 treatment.

Good catch, thanks!

> > +
> > +   /* Try to put the new BO at the same GTT offset as the old BO (which
> > +* we're throwing away, so it doesn't need to be there).
> > +*
> > +* This guarantees that our relocations continue to work: values we've
> > +* already written into the buffer, values we're going to write into the
> > +* buffer, and the validation/relocation lists all will match.
> > +*/
> > +   new_bo->gtt_offset = old_bo->gtt_offset;
> > +   new_bo->index = old_bo->index;
> > +
> > +   /* Batch/state buffers are per-context, and if we've run out of space,
> > +* we must have actually used them before, so...they will be in the list.
> > +*/
> > +   assert(old_bo->index < batch->exec_count);
> > +   assert(batch->exec_bos[old_bo->index] == old_bo);
> > +
> > +   /* Update the validation list to use the new BO. */
> > +   batch->exec_bos[old_bo->index] = new_bo;
> > +   batch->validation_list[old_bo->index].handle = new_bo->gem_handle;
> 
> Nice touch.
> 
> > +   brw_bo_reference(new_bo);
> > +   brw_bo_unreference(old_bo);
> 
> > +
> > +   if (!batch->use_batch_first) {
> > +  /* We're not using I915_EXEC_HANDLE_LUT, which means we need to go
> > +   * update the relocation list entries to point at the new BO as well.
> > +   * (With newer kernels, the "handle" is an offset into the validation
> > +   * list, which remains unchanged, 

Re: [Mesa-dev] [PATCH 06/17] i965: Drop a useless ret == 0 check.

2017-09-06 Thread Kenneth Graunke
On Wednesday, September 6, 2017 2:31:11 AM PDT Chris Wilson wrote:
> Quoting Kenneth Graunke (2017-09-06 01:09:39)
> > Prior to the previous patch, we would pwrite the batchbuffer contents,
> > and wanted to skip the execbuffer if that failed.  Now, we write things
> > directly to the map, so we don't need this check.
> > ---
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 40 
> > ---
> >  1 file changed, 18 insertions(+), 22 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > index 9b37470f926..df094bb6047 100644
> > --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> > @@ -199,8 +199,6 @@ intel_batchbuffer_reset_to_saved(struct brw_context *brw)
> >  void
> >  intel_batchbuffer_free(struct intel_batchbuffer *batch)
> >  {
> > -   free(batch->cpu_map);
> 
> Stray from previous patch.
> -Chris

Fixed locally - squashed it into the previous patch.




Re: [Mesa-dev] [PATCH v9 0/7] mesa/st: glsl_to_tgsi: refined register merge algorithm

2017-09-06 Thread Dieter Nützel

On 06.09.2017 11:53, Nicolai Hähnle wrote:

I finally went over this again, and I think it's good enough to go in.
R-b and pushed!

Thanks for making all the adjustments and your patience :)

I did notice some minor things, I'm about to send out patches to clean
things up afterwards.

Cheers,
Nicolai


Wow,

this IS GREAT news!
But you've lost all my T-b ... ;-)))

What I've found and hadn't mailed before is that I see ~50 KB smaller
*dr[i|v].so files with Gert's nice stuff.


Thank you very much for this, Gert!

I'll swap my Turks XT/6670 back in and test LS2015 on it again, to
verify whether the missing textures (people) appear now.


Greetings,
Dieter



On 24.08.2017 19:38, Gert Wollny wrote:

Dear all,

I thought I might send out this patch another time with its full history and
freshly rebased. All the changes that I applied were a result of reviews by
Nicolai (mostly) and Emil (thanks again to both of you).

The set is mirrored at
https://github.com/gerddie/mesa/tree/regrename-v9

The patch fixes a series of bugs where shader compilation fails with
   "translation from TGSI failed!"
Among these are
   * https://bugs.freedesktop.org/show_bug.cgi?id=65448 which
     I can confirm will be fixed for R600_DEBUG=nosb set (with sb enabled it
     will fail with a failing assertion in the sb code).
   * According to a user report against v5, the patch also fixes #99349

I can also confirm that the patch fixes the Piano and Volplosion benchmarks
implemented in gputest on BARTS (r600g).

The patch has no significant impact on runtime - not taking Dave's patch into
account that in itself reduces the register renaming run-time for shaders
with a large number of temporary registers.

The patch doesn't introduce piglit regressions (I tested the shader subset).
spec@glsl-1.50@execution@variable-indexing@gs-input-array-vec2-index-rd is
fixed though.

The algorithm works as follows:
- first the program is scanned, the loops, switch and if/else scopes are
   collected and for each temporary first and last reads and writes and the
   according scopes are collected, and it is recorded whether a variable is
   written conditionally, and whether loops have continue or break statements.
- then after the whole program has been scanned, the life times are estimated
   by merging the read and write scopes for each temporary on a per component
   basis,
- the life-times of the components are merged,
- the register mapping is evaluated, and
- the mapping is applied with the rename_temp_registers method implemented
   by Dave.

I've used the patches for quite some time now and so far I didn't encounter
any problems, many thanks for any comments,
  Gert

Patch history:

v2:
* significantly cut down on the memory allocations,
* expose only a minimal interface to register lifetime estimation and
  calculating the rename table,

v3: was broken and v4 restarted from v2

v4:
* split the changes into more patches
* correct formatting errors,
* remove the use of the STL with one exception: std::sort, since it is
  already used in st_glsl_to_tgsi.cpp and its run-time performance is
  significantly better than qsort. It is used in the register rename
  mapping evaluation. It can be disabled by commenting out the define
  USE_STL_SORT in st_glsl_to_tgsi_temprename.cpp.
* add more tests and improve the life-time evaluation accordingly,
* further reduce memory allocations,
* rename functions and methods to better clarify what they are used for,
* remove unused methods and variables in prog_scope,
* eliminate the class tgsi_temp_lifetime,
* no longer require C++11 for the core library code, however, the tests
  make use of C++11 and the STL

v5:
* correct formatting following Emil's suggestions
* remove un-needed libraries for the tests

v6:
* the components are now tracked individually and the life time of a
  temporary is evaluated by merging the life-times of its components,
* BRK/CONT are now handled separately,
* the final algorithm to evaluate the life-times was simplified,
* read and write in the same instruction is now considered to be always
  well defined,
* adherence to the coding style was improved,
* the case scope level is now below the according switch scope level,
* the new register merge method replaces the old version, i.e. no environment
  variables to switch between implementations. In theory, one could also
  remove the function get_last_temp_read_first_temp_write, but it is still
  used in some code in a #if 0 block, so I didn't touch it.
* when compiled in debug mode and with the environment variable
  GLSL_TO_TGSI_RENAME_DEBUG specified the TGSI and resulting register
  lifetimes will be dumped to stderr. Here Nicolai suggested to use
  _mesa_register_file_name instead of my hand-backed array of strings,
  but 

Re: [Mesa-dev] [PATCH v2 3/3] docs/release-calendar: update and extend

2017-09-06 Thread Andres Gomez
On Tue, 2017-09-05 at 18:41 +0100, Emil Velikov wrote:
> From: Emil Velikov 
> 
> v2: Use correct 17.1.10 version, adjust some names.
> 
> Cc: Juan A. Suárez 
> Cc: Andres Gomez 
> Signed-off-by: Emil Velikov 
> Reviewed-by: Eric Engestrom 
> ---
>  docs/release-calendar.html | 33 -
>  1 file changed, 16 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/release-calendar.html b/docs/release-calendar.html
> index 554eb6a540f..56564b52ea8 100644
> --- a/docs/release-calendar.html
> +++ b/docs/release-calendar.html
> @@ -39,59 +39,58 @@ if you'd like to nominate a patch in the next stable 
> release.
>  Notes
>  
>  
> -17.1
> +17.1
>  2017-09-08
>  17.1.9
>  Andres Gomez
> -Final planned release for the 17.1 series
> +
>  
> -

This  is needed.

Other than that, this is:

Reviewed-by: Andres Gomez 

> -17.2
> -2017-08-25
> -17.2.0-rc6
> -Emil Velikov
> -May be promoted to 17.2.0 final
> +2017-09-22
> +17.1.10
> +Juan A. Suarez Romero
> +Final planned release for the 17.1 series
>  
>  
> -2017-09-08
> +17.2
> +2017-09-15
>  17.2.1
>  Emil Velikov
>  
>  
>  
> -2017-09-22
> +2017-09-29
>  17.2.2
>  Juan A. Suarez Romero
>  
>  
>  
> -2017-10-06
> +2017-10-13
>  17.2.3
>  Emil Velikov
>  
>  
>  
> -2017-10-20
> +2017-10-27
>  17.2.4
> -Juan A. Suarez Romero
> +Andres Gomez
>  
>  
>  
> -2017-11-03
> +2017-11-10
>  17.2.5
>  Andres Gomez
>  
>  
>  
> -2017-11-17
> +2017-11-24
>  17.2.6
>  Andres Gomez
>  
>  
>  
> -2017-12-01
> +2017-12-08
>  17.2.7
> -Andres Gomez
> +Emil Velikov
>  Final planned release for the 17.2 series
>  
>  
-- 
Br,

Andres



Re: [Mesa-dev] [PATCH 1/2] radv: remove unused radv_meta_saved_state::vertex_saved field

2017-09-06 Thread Bas Nieuwenhuizen
Correct,  the code that set it has been removed with
bcf705b62e00c45a178e07ef01e7d266f73c2acc.

With the old_vertex_bindings removed, this patch is 

Reviewed-by: Bas Nieuwenhuizen 

On Wed, Sep 6, 2017, at 16:24, Samuel Pitoiset wrote:
> Just noticed that old_vertex_bindings should also be removed in this
> patch.
> 
> On 09/06/2017 03:53 PM, Samuel Pitoiset wrote:
> > It's always false.
> > 
> > Signed-off-by: Samuel Pitoiset 
> > ---
> >   src/amd/vulkan/radv_meta.c | 6 --
> >   src/amd/vulkan/radv_meta.h | 1 -
> >   2 files changed, 7 deletions(-)
> > 
> > diff --git a/src/amd/vulkan/radv_meta.c b/src/amd/vulkan/radv_meta.c
> > index af56f493b4..b17076703a 100644
> > --- a/src/amd/vulkan/radv_meta.c
> > +++ b/src/amd/vulkan/radv_meta.c
> > @@ -43,7 +43,6 @@ radv_meta_save_novertex(struct radv_meta_saved_state 
> > *state,
> > dynamic_mask);
> >   
> > memcpy(state->push_constants, cmd_buffer->push_constants, 
> > MAX_PUSH_CONSTANTS_SIZE);
> > -   state->vertex_saved = false;
> >   }
> >   
> >   void
> > @@ -53,11 +52,6 @@ radv_meta_restore(const struct radv_meta_saved_state 
> > *state,
> > radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer), 
> > VK_PIPELINE_BIND_POINT_GRAPHICS,
> >  radv_pipeline_to_handle(state->old_pipeline));
> > cmd_buffer->state.descriptors[0] = state->old_descriptor_set0;
> > -   if (state->vertex_saved) {
> > -   memcpy(cmd_buffer->state.vertex_bindings, 
> > state->old_vertex_bindings,
> > -  sizeof(state->old_vertex_bindings));
> > -   cmd_buffer->state.vb_dirty |= (1 << 
> > RADV_META_VERTEX_BINDING_COUNT) - 1;
> > -   }
> >   
> > cmd_buffer->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
> >   
> > diff --git a/src/amd/vulkan/radv_meta.h b/src/amd/vulkan/radv_meta.h
> > index d84d8cb68c..8b7b664b22 100644
> > --- a/src/amd/vulkan/radv_meta.h
> > +++ b/src/amd/vulkan/radv_meta.h
> > @@ -36,7 +36,6 @@ extern "C" {
> >   #define RADV_META_VERTEX_BINDING_COUNT 2
> >   
> >   struct radv_meta_saved_state {
> > -   bool vertex_saved;
> > struct radv_vertex_binding 
> > old_vertex_bindings[RADV_META_VERTEX_BINDING_COUNT];
> > struct radv_descriptor_set *old_descriptor_set0;
> > struct radv_pipeline *old_pipeline;
> > 


Re: [Mesa-dev] [PATCH 1/2] radv: remove unused radv_meta_saved_state::vertex_saved field

2017-09-06 Thread Samuel Pitoiset

Just noticed that old_vertex_bindings should also be removed in this patch.

On 09/06/2017 03:53 PM, Samuel Pitoiset wrote:

It's always false.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_meta.c | 6 --
  src/amd/vulkan/radv_meta.h | 1 -
  2 files changed, 7 deletions(-)

diff --git a/src/amd/vulkan/radv_meta.c b/src/amd/vulkan/radv_meta.c
index af56f493b4..b17076703a 100644
--- a/src/amd/vulkan/radv_meta.c
+++ b/src/amd/vulkan/radv_meta.c
@@ -43,7 +43,6 @@ radv_meta_save_novertex(struct radv_meta_saved_state *state,
dynamic_mask);
  
  	memcpy(state->push_constants, cmd_buffer->push_constants, MAX_PUSH_CONSTANTS_SIZE);

-   state->vertex_saved = false;
  }
  
  void

@@ -53,11 +52,6 @@ radv_meta_restore(const struct radv_meta_saved_state *state,
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer), 
VK_PIPELINE_BIND_POINT_GRAPHICS,
 radv_pipeline_to_handle(state->old_pipeline));
cmd_buffer->state.descriptors[0] = state->old_descriptor_set0;
-   if (state->vertex_saved) {
-   memcpy(cmd_buffer->state.vertex_bindings, 
state->old_vertex_bindings,
-  sizeof(state->old_vertex_bindings));
-   cmd_buffer->state.vb_dirty |= (1 << 
RADV_META_VERTEX_BINDING_COUNT) - 1;
-   }
  
  	cmd_buffer->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
  
diff --git a/src/amd/vulkan/radv_meta.h b/src/amd/vulkan/radv_meta.h

index d84d8cb68c..8b7b664b22 100644
--- a/src/amd/vulkan/radv_meta.h
+++ b/src/amd/vulkan/radv_meta.h
@@ -36,7 +36,6 @@ extern "C" {
  #define RADV_META_VERTEX_BINDING_COUNT 2
  
  struct radv_meta_saved_state {

-   bool vertex_saved;
struct radv_vertex_binding 
old_vertex_bindings[RADV_META_VERTEX_BINDING_COUNT];
struct radv_descriptor_set *old_descriptor_set0;
struct radv_pipeline *old_pipeline;




[Mesa-dev] [PATCH mesa 2/2] mesa: allow user to set MESA_NO_ERROR=0

2017-09-06 Thread Eric Engestrom
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102530
Cc: Michel Dänzer 
Cc: Alexandre Demers 
Signed-off-by: Eric Engestrom 
---
 src/mesa/main/context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index cd3eccea20..1c4232d298 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -123,6 +123,7 @@
 #include "shared.h"
 #include "shaderobj.h"
 #include "shaderimage.h"
+#include "util/debug.h"
 #include "util/disk_cache.h"
 #include "util/strtod.h"
 #include "stencil.h"
@@ -1213,7 +1214,7 @@ _mesa_initialize_context(struct gl_context *ctx,
/* KHR_no_error is likely to crash, overflow memory, etc if an application
 * has errors so don't enable it for setuid processes.
 */
-   if (getenv("MESA_NO_ERROR")) {
+   if (env_var_as_boolean("MESA_NO_ERROR", false)) {
 #if !defined(_WIN32)
   if (geteuid() == getuid())
 #endif
-- 
Cheers,
  Eric
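
For context, the behavioural difference the patch relies on can be sketched
like this. The helpers below are hypothetical and the set of accepted
strings is an assumption; this is not the implementation of
env_var_as_boolean in src/util.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: interpret a string as a boolean, falling back to
 * default_value when it is NULL (variable unset) or not recognised.
 * The accepted spellings here are assumed for illustration.
 */
static bool toy_parse_bool(const char *str, bool default_value)
{
    if (str == NULL)
        return default_value;
    if (strcmp(str, "1") == 0 || strcmp(str, "true") == 0 ||
        strcmp(str, "yes") == 0)
        return true;
    if (strcmp(str, "0") == 0 || strcmp(str, "false") == 0 ||
        strcmp(str, "no") == 0)
        return false;
    return default_value;
}

/* Hypothetical wrapper that reads the value from the environment. */
static bool toy_env_as_bool(const char *name, bool default_value)
{
    assert(name != NULL);
    return toy_parse_bool(getenv(name), default_value);
}
```

A bare getenv() check treats MESA_NO_ERROR=0 as "set" and therefore
enabled, while a boolean parse of this kind treats "0" as false.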



[Mesa-dev] [PATCH mesa 1/2] util: rename include guard to avoid clash

2017-09-06 Thread Eric Engestrom
src/mesa/main/debug.h uses the same include guard.

Signed-off-by: Eric Engestrom 
---
 src/util/debug.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/util/debug.h b/src/util/debug.h
index 11a8561eb5..75ebc2ebff 100644
--- a/src/util/debug.h
+++ b/src/util/debug.h
@@ -21,8 +21,8 @@
  * IN THE SOFTWARE.
  */
 
-#ifndef _DEBUG_H
-#define _DEBUG_H
+#ifndef _UTIL_DEBUG_H
+#define _UTIL_DEBUG_H
 
 #include 
 #include 
@@ -46,4 +46,4 @@ env_var_as_boolean(const char *var_name, bool default_value);
 } /* extern C */
 #endif
 
-#endif /* _DEBUG_H */
+#endif /* _UTIL_DEBUG_H */
-- 
Cheers,
  Eric



Re: [Mesa-dev] [PATCH 5/5] radv/gfx9: fix tile swizzle handling for gfx9

2017-09-06 Thread Emil Velikov
Hi Dave, Bas,

On 21 August 2017 at 13:59, Emil Velikov  wrote:
> Hi Dave,
>
> On 15 August 2017 at 06:26, Dave Airlie  wrote:
>> From: David Airlie 
>>
>> This sets the tile swizzle up properly for gfx9.
>>
>> Signed-off-by: Dave Airlie 
>> ---
> Can you please provide a backport of this patch for 17.2.
>
> There are a couple of conflicts, and with the "disable radv on Vega" patch
> landed, I did not want to leave things in a broken state.
>
Friendly reminder that this patch needs a backport for the 17.2 branch.
ATM everything's fine, since Vega support is disabled :-)

Just a note if you're planning to flip it on.

-Emil

