Mesa (master): glsl: enable early_fragment_tests implicitly with post_depth_coverage

2017-02-23 Thread Iago Toral Quiroga
Module: Mesa
Branch: master
Commit: 42b9057447bde6a48c948ed71d23e935c250cef5
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=42b9057447bde6a48c948ed71d23e935c250cef5

Author: Iago Toral Quiroga 
Date:   Wed Feb 22 09:06:31 2017 +0100

glsl: enable early_fragment_tests implicitly with post_depth_coverage

From ARB_post_depth_coverage:

   "This extension allows the fragment shader to control whether values in
gl_SampleMaskIn[] reflect the coverage after application of the early
depth and stencil tests.  This feature can be enabled with the following
layout qualifier in the fragment shader:

   layout(post_depth_coverage) in;

Use of this feature implicitly enables early fragment tests."

And a bit later it also adds:

   "early_fragment_tests" requests that fragment tests be performed before
fragment shader execution, as described in section 15.2.4 "Early Fragment
Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
are declared, per-fragment tests will be performed after fragment shader
execution."

Fixes:
GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask

Reviewed-by: Marek Olšák 

---

 src/compiler/glsl/linker.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index b6f8bc4..7343e4e 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1881,7 +1881,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
   }
 
   linked_shader->Program->info.fs.early_fragment_tests |=
- shader->EarlyFragmentTests;
+ shader->EarlyFragmentTests || shader->PostDepthCoverage;
   linked_shader->Program->info.fs.inner_coverage |= shader->InnerCoverage;
   linked_shader->Program->info.fs.post_depth_coverage |=
  shader->PostDepthCoverage;
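
The effect of this one-line change can be looked at in isolation. The following is a minimal sketch, not Mesa code: `fs_layout`, `fs_info` and `merge_fs_layout` are hypothetical stand-ins for the per-shader qualifiers and the linked program info touched by the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the layout qualifiers parsed from one
 * fragment shader. */
struct fs_layout {
   bool early_fragment_tests;
   bool post_depth_coverage;
};

/* Hypothetical stand-in for the linked program's fragment shader info. */
struct fs_info {
   bool early_fragment_tests;
   bool post_depth_coverage;
};

/* Merge one shader's qualifiers into the linked info.  Declaring
 * post_depth_coverage implies early fragment tests, as the extension
 * text quoted in the commit message requires. */
void
merge_fs_layout(struct fs_info *info, const struct fs_layout *shader)
{
   info->early_fragment_tests |=
      shader->early_fragment_tests || shader->post_depth_coverage;
   info->post_depth_coverage |= shader->post_depth_coverage;
}
```

With this merge rule, a shader that declares only post_depth_coverage still ends up with early fragment tests enabled in the linked info.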

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit


Mesa (master): isl/state: fix assert on raw buffer surface state minimum size

2017-02-23 Thread Samuel Iglesias Gonsálvez
Module: Mesa
Branch: master
Commit: a9c488f2858f8a383dd50e557ec8a832bcb35f47
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a9c488f2858f8a383dd50e557ec8a832bcb35f47

Author: Samuel Iglesias Gonsálvez 
Date:   Wed Feb 22 12:27:15 2017 +0100

isl/state: fix assert on raw buffer surface state minimum size

From IVB PRM, SURFACE_STATE::Height:

"For typed buffer and structured buffer surfaces, the number of
 entries in the buffer ranges from 1 to 2^27 . For raw buffer
 surfaces, the number of entries in the buffer is the number of bytes
 which can range from 1 to 2^30."

The minimum value is 1, according to the spec. The spec quote
was already added into the code by 028f6d8317f00.

Fixes crashing tests under:

dEQP-VK.robustness.buffer_access.*

Signed-off-by: Samuel Iglesias Gonsálvez 
Reviewed-by: Jason Ekstrand 

---

 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index 29ec289..853bb11 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -671,7 +671,7 @@ isl_genX(buffer_fill_state_s)(void *state,
*/
   if (info->format == ISL_FORMAT_RAW) {
  assert(num_elements <= (1ull << 30));
- assert((num_elements & 3) == 0);
+ assert(num_elements > 0);
   } else {
  assert(num_elements <= (1ull << 27));
   }
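
The ranges from the PRM quote can be written as a standalone predicate. This is only a sketch under assumptions: `buffer_num_elements_valid` is a hypothetical helper, not an isl function, and it applies the lower bound of 1 to both buffer kinds as the quote states, whereas the patched code only asserts the lower bound for raw buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if num_elements is within the range the PRM quote allows:
 * 1..2^30 bytes for raw buffers, 1..2^27 entries otherwise.
 * Hypothetical helper for illustration only. */
bool
buffer_num_elements_valid(bool is_raw, uint64_t num_elements)
{
   const uint64_t max = is_raw ? (1ull << 30) : (1ull << 27);
   return num_elements >= 1 && num_elements <= max;
}
```

Note that a raw buffer whose size is not a multiple of 4 bytes passes this check, which is what the removed `(num_elements & 3) == 0` assert used to reject.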



Mesa (master): glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.

2017-02-23 Thread Kenneth Graunke
Module: Mesa
Branch: master
Commit: e6e8475b0f17e605e1c8251a076cc1d48734873b
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6e8475b0f17e605e1c8251a076cc1d48734873b

Author: Kenneth Graunke 
Date:   Wed Feb 22 17:16:01 2017 -0800

glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.

OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec,
cited above in the code).

One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API.  The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.

Signed-off-by: Kenneth Graunke 
Reviewed-by: Andres Gomez 

---

 src/compiler/glsl/linker.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 7343e4e..3eddbe2 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4743,6 +4743,16 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   "tessellation evaluation shader\n");
  goto done;
   }
+
+  if (prog->IsES) {
+ if (num_shaders[MESA_SHADER_TESS_EVAL] > 0 &&
+ num_shaders[MESA_SHADER_TESS_CTRL] == 0) {
+linker_error(prog, "GLSL ES requires non-separable programs "
+ "containing a tessellation evaluation shader to also "
+ "be linked with a tessellation control shader\n");
+goto done;
+ }
+  }
}
 
/* Compute shaders have additional restrictions. */
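
The new rule can be sketched as a small standalone check. `check_tess_stages` is a hypothetical function and its message text is paraphrased; it covers only the condition added by this patch, not the other tessellation checks that already exist in the linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns NULL if the stage combination is acceptable, otherwise a
 * message describing the link error.  num_tcs and num_tes count the
 * tessellation control and evaluation shaders attached to a
 * non-separable program.  Hypothetical helper for illustration only. */
const char *
check_tess_stages(bool is_es, unsigned num_tcs, unsigned num_tes)
{
   if (is_es && num_tes > 0 && num_tcs == 0)
      return "GLSL ES requires a tessellation control shader when a "
             "tessellation evaluation shader is present";
   return NULL;
}
```

A desktop GLSL program with a TES but no TCS is still accepted by this check, matching the commit message's point that OpenGL supplies an implicit passthrough shader.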



Mesa (master): anv/blorp/clear_subpass: Only set surface clear color for fast clears

2017-02-23 Thread Jason Ekstrand
Module: Mesa
Branch: master
Commit: 42b10b175d5e8dfb9c4c46edbc306e7fac6bd3ec
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=42b10b175d5e8dfb9c4c46edbc306e7fac6bd3ec

Author: Jason Ekstrand 
Date:   Tue Feb 21 18:28:38 2017 -0800

anv/blorp/clear_subpass: Only set surface clear color for fast clears

Not all clear colors are valid.  In particular, on Broadwell and
earlier, only 0/1 colors are allowed in surface state.  No CTS tests are
affected outright by this because, apparently, the CTS coverage for
different clear colors is pretty terrible.  However, when multisample
compression is enabled, we do hit it with CTS tests and this commit
prevents regressions when enabling MCS on Broadwell and earlier.

Reviewed-by: Lionel Landwerlin 
Cc: "13.0 17.0" 

---

 src/intel/vulkan/anv_blorp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 4e7078b..8db03e4 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1198,9 +1198,10 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
   struct blorp_surf surf;
   get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
att_state->aux_usage, &surf);
-  surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
 
   if (att_state->fast_clear) {
+ surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
+
  blorp_fast_clear(&batch, &surf, iview->isl.format,
   iview->isl.base_level,
   iview->isl.base_array_layer, fb->layers,
@@ -1224,7 +1225,7 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
  render_area.offset.x, render_area.offset.y,
  render_area.offset.x + render_area.extent.width,
  render_area.offset.y + render_area.extent.height,
- surf.clear_color, NULL);
+ vk_to_isl_color(att_state->clear_value.color), NULL);
   }
 
   att_state->pending_clear_aspects = 0;



Mesa (master): intel/isl: add MCS width constraint 16 samples

2017-02-23 Thread Jason Ekstrand
Module: Mesa
Branch: master
Commit: 34e29b2ebd2c296bad0bf6df986b3d75105c55ec
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=34e29b2ebd2c296bad0bf6df986b3d75105c55ec

Author: Lionel Landwerlin 
Date:   Mon Feb 20 16:10:30 2017 +

intel/isl: add MCS width constraint 16 samples

v3 (Jason Ekstrand): Add a comment explaining why

Signed-off-by: Lionel Landwerlin 
Reviewed-by: Jason Ekstrand 
Reviewed-by: Chad Versace 

---

 src/intel/isl/isl.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 1a47da5..d1fb7e4 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1417,6 +1417,16 @@ isl_surf_get_mcs_surf(const struct isl_device *dev,
assert(surf->levels == 1);
assert(surf->logical_level0_px.depth == 1);
 
+   /* The "Auxiliary Surface Pitch" field in RENDER_SURFACE_STATE is only 9
+* bits which means the maximum pitch of a compression surface is 512
+* tiles or 64KB (since MCS is always Y-tiled).  Since a 16x MCS buffer is
+* 64bpp, this gives us a maximum width of 8192 pixels.  We can create
+* larger multisampled surfaces, we just can't compress them.   For 2x, 4x,
+* and 8x, we have enough room for the full 16k supported by the hardware.
+*/
+   if (surf->samples == 16 && surf->logical_level0_px.width > 8192)
+  return false;
+
enum isl_format mcs_format;
switch (surf->samples) {
case 2:  mcs_format = ISL_FORMAT_MCS_2X;  break;
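
The arithmetic in the new comment can be checked with a short calculation. This is a sketch, not isl code: `mcs_max_width_px` and the two macros are hypothetical names, and the 128-byte Y-tile width is an assumption that is consistent with the comment's "512 tiles or 64KB".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a Y-tile is 128 bytes wide, so 512 tiles span 64KB. */
#define Y_TILE_WIDTH_BYTES 128u
/* A 9-bit pitch field can encode at most 2^9 = 512 tiles. */
#define AUX_PITCH_MAX_TILES (1u << 9)

/* Maximum surface width, in pixels, whose MCS row still fits in the
 * largest encodable pitch.  Hypothetical helper for illustration only. */
uint32_t
mcs_max_width_px(uint32_t mcs_bits_per_pixel)
{
   /* Only whole-byte formats are handled by this sketch. */
   assert(mcs_bits_per_pixel >= 8 && mcs_bits_per_pixel % 8 == 0);

   const uint32_t max_pitch_bytes = AUX_PITCH_MAX_TILES * Y_TILE_WIDTH_BYTES;
   return max_pitch_bytes / (mcs_bits_per_pixel / 8);
}
```

For the 64bpp MCS format used with 16x MSAA this gives 65536 / 8 = 8192 pixels, the limit tested in the hunk above; a 32bpp MCS format gives 16384 pixels, which covers the full 16k width mentioned in the comment.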



Mesa (master): anv: Enable MSAA compression

2017-02-23 Thread Jason Ekstrand
Module: Mesa
Branch: master
Commit: 261092f7d4f3142760fcce98ccb63b4efd47cc48
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=261092f7d4f3142760fcce98ccb63b4efd47cc48

Author: Jason Ekstrand 
Date:   Fri Feb 17 14:14:48 2017 -0800

anv: Enable MSAA compression

This just enables basic MSAA compression (no fast clears) for all
multisampled surfaces.  This improves the framerate of the Sascha
"multisampling" demo by 76% on my Sky Lake laptop.  Running Talos on
medium settings with 8x MSAA, this improves the framerate in the
benchmark by 80%.

Reviewed-by: Lionel Landwerlin 
Reviewed-by: Chad Versace 

---

 src/intel/vulkan/TODO  |  2 +-
 src/intel/vulkan/anv_blorp.c   |  3 ++-
 src/intel/vulkan/anv_image.c   |  9 +
 src/intel/vulkan/anv_pipeline.c| 19 +++
 src/intel/vulkan/genX_cmd_buffer.c |  5 +
 5 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO
index 6abda88..5366774 100644
--- a/src/intel/vulkan/TODO
+++ b/src/intel/vulkan/TODO
@@ -8,7 +8,7 @@ Missing Features:
 
 Performance:
  - Multi-{sampled/gen8,LOD} HiZ
- - Compressed multisample support
+ - MSAA fast clears
  - Pushing pieces of UBOs?
  - Enable guardband clipping
  - Use soft-pin to avoid relocations
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 8db03e4..2cde3b7 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1398,7 +1398,8 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
struct anv_attachment_state *att_state =
   &cmd_buffer->state.attachments[att];
 
-   if (att_state->aux_usage == ISL_AUX_USAGE_NONE)
+   if (att_state->aux_usage == ISL_AUX_USAGE_NONE ||
+   att_state->aux_usage == ISL_AUX_USAGE_MCS)
   return; /* Nothing to resolve */
 
assert(att_state->aux_usage == ISL_AUX_USAGE_CCS_E ||
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 47d0a1e..cd14293 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -238,6 +238,15 @@ make_surface(const struct anv_device *dev,
 }
  }
   }
+   } else if (aspect == VK_IMAGE_ASPECT_COLOR_BIT && vk_info->samples > 1) {
+  assert(image->aux_surface.isl.size == 0);
+  assert(!(vk_info->usage & VK_IMAGE_USAGE_STORAGE_BIT));
+  ok = isl_surf_get_mcs_surf(&dev->isl_dev, &anv_surf->isl,
+ &image->aux_surface.isl);
+  if (ok) {
+ add_surface(image, &image->aux_surface);
+ image->aux_usage = ISL_AUX_USAGE_MCS;
+  }
}
 
return VK_SUCCESS;
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 4410103..708b05a 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -228,6 +228,25 @@ static void
 populate_sampler_prog_key(const struct gen_device_info *devinfo,
   struct brw_sampler_prog_key_data *key)
 {
+   /* Almost all multisampled textures are compressed.  The only time when we
+* don't compress a multisampled texture is for 16x MSAA with a surface
+* width greater than 8k which is a bit of an edge case.  Since the sampler
+* just ignores the MCS parameter to ld2ms when MCS is disabled, it's safe
+* to tell the compiler to always assume compression.
+*/
+   key->compressed_multisample_layout_mask = ~0;
+
+   /* SkyLake added support for 16x MSAA.  With this came a new message for
+* reading from a 16x MSAA surface with compression.  The new message was
+* needed because now the MCS data is 64 bits instead of 32 or lower as is
+* the case for 8x, 4x, and 2x.  The key->msaa_16 bit-field controls which
+* message we use.  Fortunately, the 16x message works for 8x, 4x, and 2x
+* so we can just use it unconditionally.  This may not be quite as
+* efficient but it saves us from recompiling.
+*/
+   if (devinfo->gen >= 9)
+  key->msaa_16 = ~0;
+
/* XXX: Handle texture swizzle on HSW- */
for (int i = 0; i < MAX_SAMPLERS; i++) {
   /* Assume color sampler, no swizzling. (Works for BDW+) */
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 7af2b31..e3f84e3 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -222,6 +222,11 @@ color_attachment_compute_aux_usage(struct anv_device *device,
   att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
   att_state->fast_clear = false;
   return;
+   } else if (iview->image->aux_usage == ISL_AUX_USAGE_MCS) {
+  att_state->aux_usage = ISL_AUX_USAGE_MCS;
+  att_state->input_aux_usage = ISL_AUX_USAGE_MCS;
+  att_state->fast_clear = false;
+  return;
}
 
assert(iview->image->aux_surface.isl.usage & ISL_SURF_USAGE_CCS_BIT);


Mesa (master): intel/isl: Return surface creation success from aux helpers

2017-02-23 Thread Jason Ekstrand
Module: Mesa
Branch: master
Commit: 3885375195c9c62f7450beabb070a0e47cc11c58
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3885375195c9c62f7450beabb070a0e47cc11c58

Author: Jason Ekstrand 
Date:   Fri Feb 17 13:48:11 2017 -0800

intel/isl: Return surface creation success from aux helpers

The isl_surf_init call that each of these helpers make can, in theory,
fail.  We should propagate that up to the caller rather than just
silently ignoring it.

Reviewed-by: Topi Pohjolainen 
Reviewed-by: Chad Versace 

---

 src/intel/isl/isl.c  | 72 +---
 src/intel/isl/isl.h  |  4 +--
 src/intel/vulkan/anv_image.c |  5 +--
 3 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 82ab68d..1a47da5 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1323,7 +1323,7 @@ isl_surf_get_tile_info(const struct isl_device *dev,
isl_tiling_get_info(dev, surf->tiling, fmtl->bpb, tile_info);
 }
 
-void
+bool
 isl_surf_get_hiz_surf(const struct isl_device *dev,
   const struct isl_surf *surf,
   struct isl_surf *hiz_surf)
@@ -1391,20 +1391,20 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
 */
const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
 
-   isl_surf_init(dev, hiz_surf,
- .dim = surf->dim,
- .format = ISL_FORMAT_HIZ,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = surf->logical_level0_px.depth,
- .levels = surf->levels,
- .array_len = surf->logical_level0_px.array_len,
- .samples = samples,
- .usage = ISL_SURF_USAGE_HIZ_BIT,
- .tiling_flags = ISL_TILING_HIZ_BIT);
+   return isl_surf_init(dev, hiz_surf,
+.dim = surf->dim,
+.format = ISL_FORMAT_HIZ,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = surf->logical_level0_px.depth,
+.levels = surf->levels,
+.array_len = surf->logical_level0_px.array_len,
+.samples = samples,
+.usage = ISL_SURF_USAGE_HIZ_BIT,
+.tiling_flags = ISL_TILING_HIZ_BIT);
 }
 
-void
+bool
 isl_surf_get_mcs_surf(const struct isl_device *dev,
   const struct isl_surf *surf,
   struct isl_surf *mcs_surf)
@@ -1427,17 +1427,17 @@ isl_surf_get_mcs_surf(const struct isl_device *dev,
   unreachable("Invalid sample count");
}
 
-   isl_surf_init(dev, mcs_surf,
- .dim = ISL_SURF_DIM_2D,
- .format = mcs_format,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = 1,
- .levels = 1,
- .array_len = surf->logical_level0_px.array_len,
- .samples = 1, /* MCS surfaces are really single-sampled */
- .usage = ISL_SURF_USAGE_MCS_BIT,
- .tiling_flags = ISL_TILING_Y0_BIT);
+   return isl_surf_init(dev, mcs_surf,
+.dim = ISL_SURF_DIM_2D,
+.format = mcs_format,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = 1,
+.levels = 1,
+.array_len = surf->logical_level0_px.array_len,
+.samples = 1, /* MCS surfaces are really single-sampled */
+.usage = ISL_SURF_USAGE_MCS_BIT,
+.tiling_flags = ISL_TILING_Y0_BIT);
 }
 
 bool
@@ -1491,19 +1491,17 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
   return false;
}
 
-   isl_surf_init(dev, ccs_surf,
- .dim = surf->dim,
- .format = ccs_format,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = surf->logical_level0_px.depth,
- .levels = surf->levels,
- .array_len = surf->logical_level0_px.array_len,
- .samples = 1,
- .usage = ISL_SURF_USAGE_CCS_BIT,
- .tiling_flags = ISL_TILING_CCS_BIT);
-
-   return true;
+   return isl_surf_init(dev, ccs_surf,
+.dim = surf->dim,
+.format = ccs_format,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = surf->logical_level0_px.depth,
+.levels = surf-

Mesa (master): intel/isl: Apply render target alignment constraints for MCS

2017-02-23 Thread Jason Ekstrand
Module: Mesa
Branch: master
Commit: 042cc201f2869bb3a316729643e8e025f115
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=042cc201f2869bb3a316729643e8e025f115

Author: Pohjolainen, Topi 
Date:   Thu Feb 23 15:31:44 2017 +0200

intel/isl: Apply render target alignment constraints for MCS

v2: Instead of having the same block in isl_gen7,8,9.c, add it
once into isl.c::isl_choose_image_alignment_el().

Reviewed-by: Jason Ekstrand 
Signed-off-by: Topi Pohjolainen 

---

 src/intel/isl/isl.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index d1fb7e4..6eb1e93 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -480,7 +480,22 @@ isl_choose_image_alignment_el(const struct isl_device *dev,
   enum isl_msaa_layout msaa_layout,
   struct isl_extent3d *image_align_el)
 {
-   if (info->format == ISL_FORMAT_HIZ) {
+   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
+   if (fmtl->txc == ISL_TXC_MCS) {
+  assert(tiling == ISL_TILING_Y0);
+
+  /*
+   * IvyBrigde PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+   *
+   * Height, width, and layout of MCS buffer in this case must match with
+   * Render Target height, width, and layout. MCS buffer is tiledY.
+   *
+   * To avoid wasting memory, choose the smallest alignment possible:
+   * HALIGN_4 and VALIGN_4.
+   */
+  *image_align_el = isl_extent3d(4, 4, 1);
+  return;
+   } else if (info->format == ISL_FORMAT_HIZ) {
   assert(ISL_DEV_GEN(dev) >= 6);
   /* HiZ surfaces are always aligned to 16x8 pixels in the primary surface
* which works out to 2x2 HiZ elments.



Mesa (master): st/mesa: free shader cache buffer on fallback

2017-02-23 Thread Timothy Arceri
Module: Mesa
Branch: master
Commit: 987d8037cabaafaeba2cb8b82cb7fa7290dc4464
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=987d8037cabaafaeba2cb8b82cb7fa7290dc4464

Author: Timothy Arceri 
Date:   Thu Feb 23 14:50:58 2017 +1100

st/mesa: free shader cache buffer on fallback

Reviewed-by: Edward O'Callaghan 
Tested-by: Michel Dänzer 

---

 src/mesa/state_tracker/st_shader_cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c
index eb66f99..fba4b0a 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -242,13 +242,14 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
   return false;
 
struct st_context *st = st_context(ctx);
+   uint8_t *buffer = NULL;
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] == NULL)
  continue;
 
   unsigned char *sha1 = stage_sha1[i];
   size_t size;
-  uint8_t *buffer = (uint8_t *) disk_cache_get(ctx->Cache, sha1, &size);
+  buffer = (uint8_t *) disk_cache_get(ctx->Cache, sha1, &size);
   if (buffer) {
  struct blob_reader blob_reader;
  blob_reader_init(&blob_reader, buffer, size);
@@ -396,6 +397,7 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
return true;
 
 fallback_recompile:
+   free(buffer);
 
for (unsigned i = 0; i < prog->NumShaders; i++) {
   _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
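
The pattern used by this fix, hoisting the pointer out of the loop so that a shared cleanup label can free it, can be sketched on its own. Everything below is hypothetical (`fake_cache_get`, `fake_cache_release`, `load_stages`, the `live_buffers` counter); the sketch also frees the buffer on the success path and resets the pointer, which is a choice made here and is not visible in the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Counts buffers handed out and not yet freed, so a leak is observable. */
int live_buffers = 0;

/* Hypothetical cache lookup: returns a malloc'ed buffer the caller owns. */
uint8_t *
fake_cache_get(size_t size)
{
   uint8_t *buf = malloc(size);
   if (buf)
      live_buffers++;
   return buf;
}

/* Hypothetical release helper; releasing NULL is a no-op. */
void
fake_cache_release(uint8_t *buf)
{
   if (buf)
      live_buffers--;
   free(buf);
}

/* Loads num_stages entries; the entry at index bad_stage (or none if
 * negative) is treated as corrupt and triggers the fallback path. */
bool
load_stages(int num_stages, int bad_stage)
{
   uint8_t *buffer = NULL;   /* declared outside the loop so the label sees it */

   for (int i = 0; i < num_stages; i++) {
      buffer = fake_cache_get(16);
      if (!buffer || i == bad_stage)
         goto fallback;

      /* ... deserialize from buffer ... */
      fake_cache_release(buffer);
      buffer = NULL;   /* avoid a double free if a later stage falls back */
   }
   return true;

fallback:
   fake_cache_release(buffer);
   return false;
}
```

Declaring the pointer inside the loop body, as the code did before the patch, would put it out of scope at the label, so the fallback path could not release it.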



Mesa (master): st/mesa: fix crash in shader cache caused by race condition

2017-02-23 Thread Timothy Arceri
Module: Mesa
Branch: master
Commit: c24d0aaa9a197ccf7cbaa9154b840aed6397f6bd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=c24d0aaa9a197ccf7cbaa9154b840aed6397f6bd

Author: Timothy Arceri 
Date:   Thu Feb 23 14:42:07 2017 +1100

st/mesa: fix crash in shader cache caused by race condition

If a thread doesn't load GLSL IR from cache but does load TGSI
from cache (that was created by another thread), then it will
crash because it expects gl_program_parameter_list to have been
restored from the GLSL IR cache and to be non-null.

Reviewed-by: Edward O'Callaghan 
Tested-by: Michel Dänzer 

---

 src/mesa/state_tracker/st_shader_cache.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c
index 607e5b1..eb66f99 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -233,6 +233,14 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
   ralloc_free(buf);
}
 
+   /* Now that we have created the sha1 keys that will be used for writting to
+* the tgsi cache fallback to the regular glsl to tgsi path if we didn't
+* load the GLSL IR from cache. We do this as glsl to tgsi can alter things
+* such as gl_program_parameter_list which holds things like uniforms.
+*/
+   if (prog->data->LinkStatus != linking_skipped)
+  return false;
+
struct st_context *st = st_context(ctx);
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] == NULL)
@@ -389,12 +397,6 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
 
 fallback_recompile:
 
-   /* GLSL IR was compiled and linked so just fallback to the regular
-* glsl to tgsi path.
-*/
-   if (prog->data->LinkStatus != linking_skipped)
-  return false;
-
for (unsigned i = 0; i < prog->NumShaders; i++) {
   _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
}



Mesa (master): swr: fix index buffers with non-zero indices

2017-02-23 Thread Tim Rowley
Module: Mesa
Branch: master
Commit: dcac48bfee545660dffbf23bd92a0939b19ffd18
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=dcac48bfee545660dffbf23bd92a0939b19ffd18

Author: George Kyriazis 
Date:   Thu Feb  2 21:16:47 2017 -0600

swr: fix index buffers with non-zero indices

Fix an issue with index buffers that do not contain index 0.  Index 0
can be invalid if the (copied) vertex buffers are a subset of the
user's (which happens because we only copy the range between min & max).
Core will use an index passed in from the driver to replace invalid indices.

Only do this for calls that contain non-zero indices, to minimize performance
cost.

Reviewed-by: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/core/state.h|  1 +
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 60 +++---
 .../drivers/swr/rasterizer/jitter/fetch_jit.h  |  2 +
 src/gallium/drivers/swr/swr_draw.cpp   |  1 +
 src/gallium/drivers/swr/swr_state.cpp  |  4 ++
 5 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h
index 2f3b913..05347dc 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -524,6 +524,7 @@ struct SWR_VERTEX_BUFFER_STATE
 const uint8_t *pData;
 uint32_t size;
 uint32_t numaNode;
+uint32_t minVertex; // min vertex (for bounds checking)
 uint32_t maxVertex; // size / pitch.  precalculated value used by fetch shader for OOB checks
 uint32_t partialInboundsSize;   // size % pitch.  precalculated value used by fetch shader for partially OOB vertices
 };
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 901bce6..ffa7605 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -309,11 +309,29 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str
 
 Value* startVertexOffset = MUL(Z_EXT(startOffset, mInt64Ty), stride);
 
+Value *minVertex = NULL;
+Value *minVertexOffset = NULL;
+if (fetchState.bPartialVertexBuffer) {
+// fetch min index for low bounds checking
+minVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_minVertex)});
+minVertex = LOAD(minVertex);
+if (!fetchState.bDisableIndexOOBCheck) {
+minVertexOffset = MUL(Z_EXT(minVertex, mInt64Ty), stride);
+}
+}
+
 // Load from the stream.
 for(uint32_t lane = 0; lane < mVWidth; ++lane)
 {
 // Get index
 Value* index = VEXTRACT(vCurIndices, C(lane));
+
+if (fetchState.bPartialVertexBuffer) {
+// clamp below minvertex
+Value *isBelowMin = ICMP_SLT(index, minVertex);
+index = SELECT(isBelowMin, minVertex, index);
+}
+
 index = Z_EXT(index, mInt64Ty);
 
 Value*offset = MUL(index, stride);
@@ -321,10 +339,14 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str
 offset = ADD(offset, startVertexOffset);
 
 if (!fetchState.bDisableIndexOOBCheck) {
-// check for out of bound access, including partial OOB, and mask them to 0
+// check for out of bound access, including partial OOB, and replace them with minVertex
 Value *endOffset = ADD(offset, C((int64_t)info.Bpp));
 Value *oob = ICMP_ULE(endOffset, size);
-offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0));
+if (fetchState.bPartialVertexBuffer) {
+offset = SELECT(oob, offset, minVertexOffset);
+} else {
+offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0));
+}
 }
 
 Value*pointer = GEP(stream, offset);
@@ -732,6 +754,13 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState,
 Value *maxVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_maxVertex)});
 maxVertex = LOAD(maxVertex);
 
+Value *minVertex = NULL;
+if (fetchState.bPartialVertexBuffer) {
+// min vertex index for low bounds OOB checking
+minVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_minVertex)});
+minVertex = LOAD(minVertex);
+}
+
 Value *vCurIndices;
 Value *startOffset;
 if(ied.InstanceEnable)
@@ -769,9 +798,16 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState,
 
 // if we have a start offset, subtract from max vertex. Used for OOB check
 maxVert

Mesa (master): swr: add fetch shader cache

2017-02-23 Thread Tim Rowley
Module: Mesa
Branch: master
Commit: 669d8f626f64cee1bc74ef7869aac8585b6dcfe6
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=669d8f626f64cee1bc74ef7869aac8585b6dcfe6

Author: George Kyriazis 
Date:   Fri Feb 10 10:24:32 2017 -0600

swr: add fetch shader cache

For now, the cache key is all of FETCH_COMPILE_STATE.

Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.

Reviewed-by: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h |  2 +-
 src/gallium/drivers/swr/swr_draw.cpp  | 19 +++
 src/gallium/drivers/swr/swr_shader.cpp| 14 ++
 src/gallium/drivers/swr/swr_shader.h  | 15 +++
 src/gallium/drivers/swr/swr_state.cpp |  6 --
 src/gallium/drivers/swr/swr_state.h   |  9 +
 6 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
index 1547453..622608a 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
@@ -94,7 +94,7 @@ enum ComponentControl
 //
 struct FETCH_COMPILE_STATE
 {
-uint32_t numAttribs;
+uint32_t numAttribs {0};
 INPUT_ELEMENT_DESC layout[KNOB_NUM_ATTRIBUTES];
 SWR_FORMAT indexType;
 uint32_t cutIndex{ 0x };
diff --git a/src/gallium/drivers/swr/swr_draw.cpp b/src/gallium/drivers/swr/swr_draw.cpp
index c4d5e5c..4bdd3bb 100644
--- a/src/gallium/drivers/swr/swr_draw.cpp
+++ b/src/gallium/drivers/swr/swr_draw.cpp
@@ -141,19 +141,22 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
}
 
struct swr_vertex_element_state *velems = ctx->velems;
-   if (!velems->fsFunc
-   || (velems->fsState.cutIndex != info->restart_index)
-   || (velems->fsState.bEnableCutIndex != info->primitive_restart)) {
-
-  velems->fsState.cutIndex = info->restart_index;
-  velems->fsState.bEnableCutIndex = info->primitive_restart;
-
-  /* Create Fetch Shader */
+   velems->fsState.cutIndex = info->restart_index;
+   velems->fsState.bEnableCutIndex = info->primitive_restart;
+
+   swr_jit_fetch_key key;
+   swr_generate_fetch_key(key, velems);
+   auto search = velems->map.find(key);
+   if (search != velems->map.end()) {
+  velems->fsFunc = search->second;
+   } else {
   HANDLE hJitMgr = swr_screen(ctx->pipe.screen)->hJitMgr;
   velems->fsFunc = JitCompileFetch(hJitMgr, velems->fsState);
 
   debug_printf("fetch shader %p\n", velems->fsFunc);
   assert(velems->fsFunc && "Error: FetchShader = NULL");
+
+  velems->map.insert(std::make_pair(key, velems->fsFunc));
}
 
SwrSetFetchFunc(ctx->swrContext, velems->fsFunc);
diff --git a/src/gallium/drivers/swr/swr_shader.cpp b/src/gallium/drivers/swr/swr_shader.cpp
index 979a28b..676938c 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -61,6 +61,11 @@ bool operator==(const swr_jit_vs_key &lhs, const swr_jit_vs_key &rhs)
return !memcmp(&lhs, &rhs, sizeof(lhs));
 }
 
+bool operator==(const swr_jit_fetch_key &lhs, const swr_jit_fetch_key &rhs)
+{
+   return !memcmp(&lhs, &rhs, sizeof(lhs));
+}
+
 static void
 swr_generate_sampler_key(const struct lp_tgsi_info &info,
  struct swr_context *ctx,
@@ -157,6 +162,15 @@ swr_generate_vs_key(struct swr_jit_vs_key &key,
swr_generate_sampler_key(swr_vs->info, ctx, PIPE_SHADER_VERTEX, key);
 }
 
+void
+swr_generate_fetch_key(struct swr_jit_fetch_key &key,
+   struct swr_vertex_element_state *velems)
+{
+   memset(&key, 0, sizeof(key));
+
+   key.fsState = velems->fsState;
+}
+
 struct BuilderSWR : public Builder {
BuilderSWR(JitManager *pJitMgr, const char *pName)
   : Builder(pJitMgr)
diff --git a/src/gallium/drivers/swr/swr_shader.h b/src/gallium/drivers/swr/swr_shader.h
index 7e3399c..266573f 100644
--- a/src/gallium/drivers/swr/swr_shader.h
+++ b/src/gallium/drivers/swr/swr_shader.h
@@ -42,6 +42,9 @@ void swr_generate_vs_key(struct swr_jit_vs_key &key,
  struct swr_context *ctx,
  swr_vertex_shader *swr_vs);
 
+void swr_generate_fetch_key(struct swr_jit_fetch_key &key,
+struct swr_vertex_element_state *velems);
+
 struct swr_jit_sampler_key {
unsigned nr_samplers;
unsigned nr_sampler_views;
@@ -60,6 +63,10 @@ struct swr_jit_vs_key : swr_jit_sampler_key {
unsigned clip_plane_mask; // from rasterizer state & vs_info
 };
 
+struct swr_jit_fetch_key {
+   FETCH_COMPILE_STATE fsState;
+};
+
 namespace std
 {
 template <> struct hash {
@@ -75,7 +82,15 @@ template <> struct hash {
   return util_hash_crc32(&k, sizeof(k));
}
 };
+
+template <> stru
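
The caching pattern in the swr patch above can be sketched in isolation as follows. This is a minimal illustration, not the driver code: FetchState, FetchKey, FetchFunc, compile_fetch() and get_fetch_func() are hypothetical stand-ins for FETCH_COMPILE_STATE, swr_jit_fetch_key, the JIT function pointer, JitCompileFetch() and the lookup logic, and the FNV-1a hash is an assumption standing in for the CRC32 helper visible in the diff context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for FETCH_COMPILE_STATE (the real struct is larger).
struct FetchState {
   uint32_t cutIndex;
   uint32_t bEnableCutIndex;
};

struct FetchKey {
   FetchState fsState;
};

// Byte-wise equality, as in the patch. This is why the key is zeroed
// before it is filled in: any padding bytes must compare equal too.
inline bool operator==(const FetchKey &lhs, const FetchKey &rhs)
{
   return std::memcmp(&lhs, &rhs, sizeof(lhs)) == 0;
}

namespace std {
template <> struct hash<FetchKey> {
   size_t operator()(const FetchKey &k) const
   {
      // FNV-1a over the raw key bytes (an assumption; see lead-in).
      const unsigned char *p = reinterpret_cast<const unsigned char *>(&k);
      uint64_t h = 0xcbf29ce484222325ull;
      for (size_t i = 0; i < sizeof(k); ++i) {
         h ^= p[i];
         h *= 0x100000001b3ull;
      }
      return static_cast<size_t>(h);
   }
};
} // namespace std

using FetchFunc = int (*)();   // hypothetical shader entry point type
using FetchCache = std::unordered_map<FetchKey, FetchFunc>;

static int g_compile_count = 0;   // counts simulated JIT compiles
static int dummy_fetch_shader() { return 0; }

// Hypothetical replacement for the real JIT compile call.
inline FetchFunc compile_fetch(const FetchState &)
{
   ++g_compile_count;
   return &dummy_fetch_shader;
}

// Return the cached shader for this state, compiling it only on a miss.
inline FetchFunc get_fetch_func(FetchCache &cache, const FetchState &state)
{
   FetchKey key;
   std::memset(&key, 0, sizeof(key));
   key.fsState = state;

   auto search = cache.find(key);
   if (search != cache.end())
      return search->second;

   FetchFunc fn = compile_fetch(state);
   assert(fn && "fetch shader compile failed");
   cache.insert(std::make_pair(key, fn));
   return fn;
}
```

Calling get_fetch_func() twice with the same state should trigger a single compile; a state that differs in any byte produces a new cache entry.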

Mesa (master): radv: enable location at sample when persample is forced.

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: 58c97a0791bf71b31546b13c2b491a636555749c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=58c97a0791bf71b31546b13c2b491a636555749c

Author: Dave Airlie 
Date:   Thu Feb 23 14:24:20 2017 +1000

radv: enable location at sample when persample is forced.

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/vulkan/radv_cmd_buffer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index dd6deef..5b7564c 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -685,6 +685,9 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
   ps->config.spi_ps_input_addr);
 
+   if (ps->info.fs.force_persample)
+   spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
+
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
   S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
 

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit


Mesa (master): radv: add sample mask output support

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: ccb70d6f53464171639ee7809c9fe5ee3a86e54d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ccb70d6f53464171639ee7809c9fe5ee3a86e54d

Author: Dave Airlie 
Date:   Thu Feb 23 16:06:22 2017 +1000

radv: add sample mask output support

This adds support for writing the sample mask from the fragment shader.

This can be optimised later, as radeonsi does.

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c  | 8 ++--
 src/amd/common/ac_nir_to_llvm.h  | 1 +
 src/amd/vulkan/radv_cmd_buffer.c | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 6021647..9778581 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4753,13 +4753,17 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx)
ctx->shader_info->fs.writes_stencil = true;
stencil = to_float(ctx, LLVMBuildLoad(ctx->builder,
   ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
+   } else if (i == FRAG_RESULT_SAMPLE_MASK) {
+   ctx->shader_info->fs.writes_sample_mask = true;
+   samplemask = to_float(ctx, LLVMBuildLoad(ctx->builder,
+      ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
} else {
bool last = false;
for (unsigned j = 0; j < 4; j++)
values[j] = to_float(ctx, LLVMBuildLoad(ctx->builder,
   ctx->outputs[radeon_llvm_reg_index_soa(i, j)], ""));
 
-   if (!ctx->shader_info->fs.writes_z && !ctx->shader_info->fs.writes_stencil)
+   if (!ctx->shader_info->fs.writes_z && !ctx->shader_info->fs.writes_stencil && !ctx->shader_info->fs.writes_sample_mask)
last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
 
si_export_mrt_color(ctx, values, V_008DFC_SQ_EXP_MRT + index, last);
@@ -4767,7 +4771,7 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx)
}
}
 
-   if (depth || stencil)
+   if (depth || stencil || samplemask)
si_export_mrt_z(ctx, depth, stencil, samplemask);
else if (!index)
si_export_mrt_color(ctx, NULL, V_008DFC_SQ_EXP_NULL, true);
diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index c2662e2..6c2b78b 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -118,6 +118,7 @@ struct ac_shader_variant_info {
bool can_discard;
bool writes_z;
bool writes_stencil;
+   bool writes_sample_mask;
bool early_fragment_test;
bool writes_memory;
bool force_persample;
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 5b7564c..1e38cbe 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -674,6 +674,7 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
   S_02880C_Z_EXPORT_ENABLE(ps->info.fs.writes_z) |
   S_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(ps->info.fs.writes_stencil) |
   S_02880C_KILL_ENABLE(!!ps->info.fs.can_discard) |
+  S_02880C_MASK_EXPORT_ENABLE(ps->info.fs.writes_sample_mask) |
   S_02880C_Z_ORDER(z_order) |
   S_02880C_DEPTH_BEFORE_SHADER(ps->info.fs.early_fragment_test) |
   S_02880C_EXEC_ON_HIER_FAIL(ps->info.fs.writes_memory) |
@@ -694,6 +695,7 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
 
radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
+  ps->info.fs.writes_sample_mask ? V_028710_SPI_SHADER_32_ABGR :
   ps->info.fs.writes_stencil ? V_028710_SPI_SHADER_32_GR :
   ps->info.fs.writes_z ? V_028710_SPI_SHADER_32_R :
   V_028710_SPI_SHADER_ZERO);
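
The two decisions this patch touches (which Z export format to program, and whether a color export is the final export) can be sketched as plain functions. This is an illustrative sketch only: FsInfo, ZFormat, select_z_format() and is_last_color_export() are hypothetical names, and the enum values are placeholders, not the register encodings of V_028710_SPI_SHADER_*.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the fragment shader info flags used in the patch.
struct FsInfo {
   bool writes_z;
   bool writes_stencil;
   bool writes_sample_mask;
};

// Placeholder names for the V_028710_SPI_SHADER_* choices in the diff.
enum class ZFormat { Zero, R32, GR32, ABGR32 };

// Same priority order as the ternary chain in the patch: the sample mask
// needs the widest format, then stencil, then depth alone.
inline ZFormat select_z_format(const FsInfo &fs)
{
   return fs.writes_sample_mask ? ZFormat::ABGR32
        : fs.writes_stencil     ? ZFormat::GR32
        : fs.writes_z           ? ZFormat::R32
                                : ZFormat::Zero;
}

// A color export for output slot i is the last export only when no
// depth/stencil/sample-mask export follows and no higher output bit is set.
inline bool is_last_color_export(const FsInfo &fs, uint64_t output_mask,
                                 unsigned i)
{
   assert(i < 63);   // keeps the shift below well defined
   if (fs.writes_z || fs.writes_stencil || fs.writes_sample_mask)
      return false;
   return output_mask <= ((1ull << (i + 1)) - 1);
}
```

The point of the extra writes_sample_mask term is that a shader which only writes the sample mask still needs the Z export, so no color export may be flagged as the last one.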



Mesa (master): radv: fetch sample index via fmask for image coord as well.

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: 5e9ead0fa21eb2e3dfaca5485990110e17cc7b79
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=5e9ead0fa21eb2e3dfaca5485990110e17cc7b79

Author: Dave Airlie 
Date:   Wed Feb 22 14:29:09 2017 +1000

radv: fetch sample index via fmask for image coord as well.

This follows the txf_ms code. I can't figure out why amdgpu-pro
doesn't do this in their shaders; they must know something we don't.

This fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c | 180 
 1 file changed, 126 insertions(+), 54 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 0cc5810..63583fa 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2367,60 +2367,6 @@ static int image_type_to_components_count(enum glsl_sampler_dim dim, bool array)
return 0;
 }
 
-static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
-nir_intrinsic_instr *instr)
-{
-   const struct glsl_type *type = instr->variables[0]->var->type;
-   if(instr->variables[0]->deref.child)
-   type = instr->variables[0]->deref.child->type;
-
-   LLVMValueRef src0 = get_src(ctx, instr->src[0]);
-   LLVMValueRef coords[4];
-   LLVMValueRef masks[] = {
-   LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, false),
-   LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, false),
-   };
-   LLVMValueRef res;
-   int count;
-   enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
-   bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
-dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
-   bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
- dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
-
-   count = image_type_to_components_count(dim,
-  glsl_sampler_type_is_array(type));
-
-   if (count == 1) {
-   if (instr->src[0].ssa->num_components)
-   res = LLVMBuildExtractElement(ctx->builder, src0, masks[0], "");
-   else
-   res = src0;
-   } else {
-   int chan;
-   if (is_ms)
-   count--;
-   for (chan = 0; chan < count; ++chan) {
-   coords[chan] = LLVMBuildExtractElement(ctx->builder, src0, masks[chan], "");
-   }
-
-   if (add_frag_pos) {
-   for (chan = 0; chan < count; ++chan)
-   coords[chan] = LLVMBuildAdd(ctx->builder, coords[chan], LLVMBuildFPToUI(ctx->builder, ctx->frag_pos[chan], ctx->i32, ""), "");
-   }
-   if (is_ms) {
-   coords[count] = llvm_extract_elem(ctx, get_src(ctx, instr->src[1]), 0);
-   count++;
-   }
-
-   if (count == 3) {
-   coords[3] = LLVMGetUndef(ctx->i32);
-   count = 4;
-   }
-   res = ac_build_gather_values(&ctx->ac, coords, count);
-   }
-   return res;
-}
 
 static void build_type_name_for_intr(
 LLVMTypeRef type,
@@ -2483,6 +2429,132 @@ static void get_image_intr_name(const char *base_name,
 }
 }
 
+static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
+nir_intrinsic_instr *instr)
+{
+   const struct glsl_type *type = instr->variables[0]->var->type;
+   if(instr->variables[0]->deref.child)
+   type = instr->variables[0]->deref.child->type;
+
+   LLVMValueRef src0 = get_src(ctx, instr->src[0]);
+   LLVMValueRef coords[4];
+   LLVMValueRef masks[] = {
+   LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, false),
+   LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, false),
+   };
+   LLVMValueRef res;
+   LLVMValueRef sample_index = llvm_extract_elem(ctx, get_src(ctx, instr->src[1]), 0);
+
+   int count;
+   enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
+   bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
+dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
+   bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
+ dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
+
+   count = image_type_to_components_count(dim,
+  glsl_sampler_type_is_array(type));
+
+   if (is_ms) {
+   LLVMValueRef fmask_load_address[4];
+   LLVMValueRef params[7];
+   LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef da = ctx->i32

Mesa (master): radv/ac: refactor our fmask sample index fixup.

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: 8282c5c7710fb56231ea0e1b9d7b0f9295230e15
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8282c5c7710fb56231ea0e1b9d7b0f9295230e15

Author: Dave Airlie 
Date:   Thu Feb 23 12:20:25 2017 +1000

radv/ac: refactor our fmask sample index fixup.

This factors the FMASK sample index fixup out into a helper
shared by txf and image load.

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c | 229 +++-
 1 file changed, 107 insertions(+), 122 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 63583fa..6021647 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2429,6 +2429,95 @@ static void get_image_intr_name(const char *base_name,
 }
 }
 
+/* Adjust the sample index according to FMASK.
+ *
+ * For uncompressed MSAA surfaces, FMASK should return 0x76543210,
+ * which is the identity mapping. Each nibble says which physical sample
+ * should be fetched to get that sample.
+ *
+ * For example, 0x1100 means there are only 2 samples stored and
+ * the second sample covers 3/4 of the pixel. When reading samples 0
+ * and 1, return physical sample 0 (determined by the first two 0s
+ * in FMASK), otherwise return physical sample 1.
+ *
+ * The sample index should be adjusted as follows:
+ *   sample_index = (fmask >> (sample_index * 4)) & 0xF;
+ */
+static LLVMValueRef adjust_sample_index_using_fmask(struct nir_to_llvm_context *ctx,
+   LLVMValueRef coord_x, LLVMValueRef coord_y,
+   LLVMValueRef coord_z,
+   LLVMValueRef sample_index,
+   LLVMValueRef fmask_desc_ptr)
+{
+   LLVMValueRef fmask_load_address[4], params[7];
+   LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef da = coord_z ? ctx->i32one : ctx->i32zero;
+   LLVMValueRef res;
+   char intrinsic_name[64];
+
+   fmask_load_address[0] = coord_x;
+   fmask_load_address[1] = coord_y;
+   if (coord_z) {
+   fmask_load_address[2] = coord_z;
+   fmask_load_address[3] = LLVMGetUndef(ctx->i32);
+   }
+
+   params[0] = ac_build_gather_values(&ctx->ac, fmask_load_address, coord_z ? 4 : 2);
+   params[1] = fmask_desc_ptr;
+   params[2] = LLVMConstInt(ctx->i32, 15, false); /* dmask */
+   LLVMValueRef lwe = LLVMConstInt(ctx->i1, 0, false);
+   params[3] = glc;
+   params[4] = slc;
+   params[5] = lwe;
+   params[6] = da;
+
+   get_image_intr_name("llvm.amdgcn.image.load",
+   ctx->v4f32, /* vdata */
+   LLVMTypeOf(params[0]), /* coords */
+   LLVMTypeOf(params[1]), /* rsrc */
+   intrinsic_name, sizeof(intrinsic_name));
+
+   res = ac_emit_llvm_intrinsic(&ctx->ac, intrinsic_name, ctx->v4f32,
+params, 7, AC_FUNC_ATTR_READONLY);
+
+   res = to_integer(ctx, res);
+   LLVMValueRef four = LLVMConstInt(ctx->i32, 4, false);
+   LLVMValueRef F = LLVMConstInt(ctx->i32, 0xf, false);
+
+   LLVMValueRef fmask = LLVMBuildExtractElement(ctx->builder,
+res,
+ctx->i32zero, "");
+
+   LLVMValueRef sample_index4 =
+   LLVMBuildMul(ctx->builder, sample_index, four, "");
+   LLVMValueRef shifted_fmask =
+   LLVMBuildLShr(ctx->builder, fmask, sample_index4, "");
+   LLVMValueRef final_sample =
+   LLVMBuildAnd(ctx->builder, shifted_fmask, F, "");
+
+   /* Don't rewrite the sample index if WORD1.DATA_FORMAT of the FMASK
+* resource descriptor is 0 (invalid),
+*/
+   LLVMValueRef fmask_desc =
+   LLVMBuildBitCast(ctx->builder, params[1],
+ctx->v8i32, "");
+
+   LLVMValueRef fmask_word1 =
+   LLVMBuildExtractElement(ctx->builder, fmask_desc,
+   ctx->i32one, "");
+
+   LLVMValueRef word1_is_nonzero =
+   LLVMBuildICmp(ctx->builder, LLVMIntNE,
+ fmask_word1, ctx->i32zero, "");
+
+   /* Replace the MSAA sample index. */
+   sample_index =
+   LLVMBuildSelect(ctx->builder, word1_is_nonzero,
+   final_sample, sample_index, "");
+   return sample_index;
+}
+
 static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
 nir_intrinsic_instr *instr)
 {
@@ -2456,73 +2545,25 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
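
The nibble lookup described in the comment of adjust_sample_index_using_fmask() can be checked on the CPU with a few lines of scalar code. This is a sketch of the arithmetic only, under the assumption that the FMASK value and word 1 of the descriptor are already available as integers; fmask_lookup() and adjust_sample_index() are hypothetical names, and the real helper emits LLVM IR instead.

```cpp
#include <cassert>
#include <cstdint>

// Each nibble of fmask names the physical sample backing one logical sample.
// Only 8 nibbles fit in 32 bits, so sample_index must be below 8.
inline uint32_t fmask_lookup(uint32_t fmask, uint32_t sample_index)
{
   assert(sample_index < 8);
   return (fmask >> (sample_index * 4)) & 0xFu;
}

// Mirrors the final select in the helper: the index is left unchanged when
// word 1 of the FMASK descriptor is zero (no valid FMASK bound).
inline uint32_t adjust_sample_index(uint32_t fmask, uint32_t desc_word1,
                                    uint32_t sample_index)
{
   return desc_word1 != 0 ? fmask_lookup(fmask, sample_index) : sample_index;
}
```

With the identity mapping 0x76543210 every index maps to itself. With 0x1100, logical samples 0 and 1 read physical sample 0 and logical samples 2 and 3 read physical sample 1, which matches the example in the comment.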

Mesa (master): radv: add sample mask input support

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: bdcbe7c76bba3171f4f4c30b29e21f58c9a62856
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=bdcbe7c76bba3171f4f4c30b29e21f58c9a62856

Author: Dave Airlie 
Date:   Tue Jan 31 05:30:26 2017 +1000

radv: add sample mask input support

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index ca1416d..0cc5810 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -99,6 +99,7 @@ struct nir_to_llvm_context {
LLVMValueRef linear_sample, linear_center, linear_centroid;
LLVMValueRef front_face;
LLVMValueRef ancillary;
+   LLVMValueRef sample_coverage;
LLVMValueRef frag_pos[4];
 
LLVMBasicBlockRef continue_block;
@@ -532,7 +533,7 @@ static void create_function(struct nir_to_llvm_context *ctx)
arg_types[arg_idx++] = ctx->f32;  /* pos w float */
arg_types[arg_idx++] = ctx->i32;  /* front face */
arg_types[arg_idx++] = ctx->i32;  /* ancillary */
-   arg_types[arg_idx++] = ctx->f32;  /* sample coverage */
+   arg_types[arg_idx++] = ctx->i32;  /* sample coverage */
arg_types[arg_idx++] = ctx->i32;  /* fixed pt */
break;
default:
@@ -659,6 +660,7 @@ static void create_function(struct nir_to_llvm_context *ctx)
ctx->frag_pos[3] = LLVMGetParam(ctx->main_function, arg_idx++);
ctx->front_face = LLVMGetParam(ctx->main_function, arg_idx++);
ctx->ancillary = LLVMGetParam(ctx->main_function, arg_idx++);
+   ctx->sample_coverage = LLVMGetParam(ctx->main_function, arg_idx++);
break;
default:
unreachable("Shader stage not implemented");
@@ -3115,6 +3117,9 @@ static void visit_intrinsic(struct nir_to_llvm_context *ctx,
ctx->shader_info->fs.force_persample = true;
result = load_sample_pos(ctx);
break;
+   case nir_intrinsic_load_sample_mask_in:
+   result = ctx->sample_coverage;
+   break;
case nir_intrinsic_load_front_face:
result = ctx->front_face;
break;



Mesa (master): radv: fix interpolation at wrong place for offset interp

2017-02-23 Thread Dave Airlie
Module: Mesa
Branch: master
Commit: fc430c391b4be0e92bc9e297aaa260c674648ac2
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc430c391b4be0e92bc9e297aaa260c674648ac2

Author: Dave Airlie 
Date:   Thu Feb 23 14:24:20 2017 +1000

radv: fix interpolation at wrong place for offset interp

The code was interpolating at the offset from the sample,
not the offset from the center. Also, for persample interpolation
modes, the pixel center should be forced to be at the sample.

Reviewed-by: Bas Nieuwenhuizen 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c  | 6 --
 src/amd/vulkan/radv_cmd_buffer.c | 1 -
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index a74b906..ca1416d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2884,10 +2884,12 @@ static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
location = INTERP_CENTROID;
break;
case nir_intrinsic_interp_var_at_sample:
-   case nir_intrinsic_interp_var_at_offset:
location = INTERP_SAMPLE;
src0 = get_src(ctx, instr->src[0]);
break;
+   case nir_intrinsic_interp_var_at_offset:
+   location = INTERP_CENTER;
+   src0 = get_src(ctx, instr->src[0]);
default:
break;
}
@@ -2910,7 +2912,7 @@ static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
interp_param = lookup_interp_param(ctx, instr->variables[0]->var->data.interpolation, location);
attr_number = LLVMConstInt(ctx->i32, input_index, false);
 
-   if (location == INTERP_SAMPLE) {
+   if (location == INTERP_SAMPLE || location == INTERP_CENTER) {
LLVMValueRef ij_out[2];
LLVMValueRef ddxy_out = emit_ddxy_interp(ctx, interp_param);
 
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 4aa5df6..dd6deef 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -685,7 +685,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
   ps->config.spi_ps_input_addr);
 
-   spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(0);
radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
   S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
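
Why the base location matters for interpolateAtOffset can be seen from a scalar sketch. The assumption here, which the hunks above do not show, is that the offset is applied as a first-order correction using the screen-space derivatives of the barycentrics (the values emit_ddxy_interp() appears to supply). Barycentric, Gradients and offset_barycentric() are hypothetical names for illustration.

```cpp
#include <cassert>

// Barycentric (i, j) pair at some base location inside the pixel.
struct Barycentric {
   float i, j;
};

// Screen-space derivatives of the barycentrics (assumed inputs).
struct Gradients {
   float di_dx, dj_dx;
   float di_dy, dj_dy;
};

// First-order adjustment: base + d/dx * off_x + d/dy * off_y.
// The result is only correct if 'base' is the pixel-center value, since the
// offset is specified relative to the center, which is what the fix ensures.
inline Barycentric offset_barycentric(Barycentric base, Gradients g,
                                      float off_x, float off_y)
{
   return Barycentric{
      base.i + g.di_dx * off_x + g.di_dy * off_y,
      base.j + g.dj_dx * off_x + g.dj_dy * off_y,
   };
}
```

Starting from sample-location barycentrics instead would shift every result by the sample's own offset from the center, which is the bug the commit message describes.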
 
