Mesa (master): glsl: enable early_fragment_tests implicitly with post_depth_coverage
Module: Mesa
Branch: master
Commit: 42b9057447bde6a48c948ed71d23e935c250cef5
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=42b9057447bde6a48c948ed71d23e935c250cef5

Author: Iago Toral Quiroga
Date: Wed Feb 22 09:06:31 2017 +0100

glsl: enable early_fragment_tests implicitly with post_depth_coverage

From ARB_post_depth_coverage:

"This extension allows the fragment shader to control whether values in
 gl_SampleMaskIn[] reflect the coverage after application of the early
 depth and stencil tests. This feature can be enabled with the following
 layout qualifier in the fragment shader:

     layout(post_depth_coverage) in;

 Use of this feature implicitly enables early fragment tests."

And a bit later it also adds:

"early_fragment_tests" requests that fragment tests be performed before
 fragment shader execution, as described in section 15.2.4 "Early Fragment
 Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
 are declared, per-fragment tests will be performed after fragment shader
 execution."

Fixes: GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask

Reviewed-by: Marek Olšák

---

 src/compiler/glsl/linker.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index b6f8bc4..7343e4e 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -1881,7 +1881,7 @@ link_fs_inout_layout_qualifiers(struct gl_shader_program *prog,
       }
 
       linked_shader->Program->info.fs.early_fragment_tests |=
-         shader->EarlyFragmentTests;
+         shader->EarlyFragmentTests || shader->PostDepthCoverage;
       linked_shader->Program->info.fs.inner_coverage |= shader->InnerCoverage;
       linked_shader->Program->info.fs.post_depth_coverage |=
          shader->PostDepthCoverage;

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit
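A minimal sketch of the merge rule this patch introduces, written as standalone C++ rather than Mesa code. ShaderLayout, LinkedFsInfo and merge_fs_layout are hypothetical names chosen for illustration; only the OR-ing of the two qualifiers is taken from the diff.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the per-shader and linked-program
// layout flags touched by the patch; these are not Mesa's real structs.
struct ShaderLayout {
    bool early_fragment_tests = false;
    bool post_depth_coverage = false;
};

struct LinkedFsInfo {
    bool early_fragment_tests = false;
    bool post_depth_coverage = false;
};

// Merge one fragment shader's layout qualifiers into the linked program.
// post_depth_coverage implies early fragment tests, as the extension says.
inline void merge_fs_layout(LinkedFsInfo &linked, const ShaderLayout &shader)
{
    linked.early_fragment_tests |=
        (shader.early_fragment_tests || shader.post_depth_coverage);
    linked.post_depth_coverage |= shader.post_depth_coverage;
}
```

With this rule, a shader that declares only layout(post_depth_coverage) ends up with both flags set on the linked program.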
Mesa (master): isl/state: fix assert on raw buffer surface state minimum size
Module: Mesa
Branch: master
Commit: a9c488f2858f8a383dd50e557ec8a832bcb35f47
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=a9c488f2858f8a383dd50e557ec8a832bcb35f47

Author: Samuel Iglesias Gonsálvez
Date: Wed Feb 22 12:27:15 2017 +0100

isl/state: fix assert on raw buffer surface state minimum size

From IVB PRM, SURFACE_STATE::Height:

"For typed buffer and structured buffer surfaces, the number
 of entries in the buffer ranges from 1 to 2^27. For raw buffer
 surfaces, the number of entries in the buffer is the number of
 bytes which can range from 1 to 2^30."

The minimum value is 1, according to the spec. The spec quote was
already added into the code by 028f6d8317f00.

Fixes crashing tests under:

   dEQP-VK.robustness.buffer_access.*

Signed-off-by: Samuel Iglesias Gonsálvez
Reviewed-by: Jason Ekstrand

---

 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index 29ec289..853bb11 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -671,7 +671,7 @@ isl_genX(buffer_fill_state_s)(void *state,
     */
    if (info->format == ISL_FORMAT_RAW) {
       assert(num_elements <= (1ull << 30));
-      assert((num_elements & 3) == 0);
+      assert(num_elements > 0);
    } else {
       assert(num_elements <= (1ull << 27));
    }
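The ranges quoted from the PRM can be written as a small predicate. This is only a sketch: buffer_num_elements_in_range is a hypothetical helper, not an isl function, and it also checks the lower bound for typed buffers, which the patched code does not assert.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of isl: checks an element count against the
// ranges quoted from the IVB PRM (1..2^30 bytes for raw buffers, 1..2^27
// entries for typed and structured buffers).
inline bool buffer_num_elements_in_range(bool is_raw, std::uint64_t num_elements)
{
    const std::uint64_t max = is_raw ? (1ull << 30) : (1ull << 27);
    return num_elements >= 1 && num_elements <= max;
}
```

A raw buffer of 1 or 3 bytes passes this check, whereas the removed assert((num_elements & 3) == 0) would have rejected it.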
Mesa (master): glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.
Module: Mesa
Branch: master
Commit: e6e8475b0f17e605e1c8251a076cc1d48734873b
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6e8475b0f17e605e1c8251a076cc1d48734873b

Author: Kenneth Graunke
Date: Wed Feb 22 17:16:01 2017 -0800

glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.

OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec, cited
above in the code).

One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API. The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.

Signed-off-by: Kenneth Graunke
Reviewed-by: Andres Gomez

---

 src/compiler/glsl/linker.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 7343e4e..3eddbe2 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4743,6 +4743,16 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
                       "tessellation evaluation shader\n");
          goto done;
       }
+
+      if (prog->IsES) {
+         if (num_shaders[MESA_SHADER_TESS_EVAL] > 0 &&
+             num_shaders[MESA_SHADER_TESS_CTRL] == 0) {
+            linker_error(prog, "GLSL ES requires non-separable programs "
+                         "containing a tessellation evaluation shader to also "
+                         "be linked with a tessellation control shader\n");
+            goto done;
+         }
+      }
    }
 
    /* Compute shaders have additional restrictions. */
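The rule described above can be sketched as a standalone predicate. ProgramStages and es_tes_without_tcs are hypothetical names; the `separable` field is an assumption drawn from the commit title ("non-SSO") and the error text ("non-separable"), since the enclosing condition is not visible in the hunk.

```cpp
#include <cassert>

// Hypothetical, simplified view of the linker inputs; names are illustrative.
struct ProgramStages {
    bool is_es;              // program uses the ES shading language
    bool separable;          // separate shader objects (SSO) program
    unsigned num_tess_ctrl;  // tessellation control shaders attached
    unsigned num_tess_eval;  // tessellation evaluation shaders attached
};

// Returns true when linking should fail under the rule added by the patch:
// a non-separable ES program with a TES but no TCS.
inline bool es_tes_without_tcs(const ProgramStages &p)
{
    return !p.separable && p.is_es &&
           p.num_tess_eval > 0 && p.num_tess_ctrl == 0;
}
```

Under this sketch a desktop GL program with the same stage set still links, matching the implicit passthrough TCS that the commit message mentions.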
Mesa (master): anv/blorp/clear_subpass: Only set surface clear color for fast clears
Module: Mesa
Branch: master
Commit: 42b10b175d5e8dfb9c4c46edbc306e7fac6bd3ec
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=42b10b175d5e8dfb9c4c46edbc306e7fac6bd3ec

Author: Jason Ekstrand
Date: Tue Feb 21 18:28:38 2017 -0800

anv/blorp/clear_subpass: Only set surface clear color for fast clears

Not all clear colors are valid. In particular, on Broadwell and
earlier, only 0/1 colors are allowed in surface state. No CTS tests are
affected outright by this because, apparently, the CTS coverage for
different clear colors is pretty terrible. However, when multisample
compression is enabled, we do hit it with CTS tests and this commit
prevents regressions when enabling MCS on Broadwell and earlier.

Reviewed-by: Lionel Landwerlin
Cc: "13.0 17.0"

---

 src/intel/vulkan/anv_blorp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 4e7078b..8db03e4 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1198,9 +1198,10 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
       struct blorp_surf surf;
       get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
                                    att_state->aux_usage, &surf);
-      surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
 
       if (att_state->fast_clear) {
+         surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
+
          blorp_fast_clear(&batch, &surf, iview->isl.format,
                           iview->isl.base_level,
                           iview->isl.base_array_layer, fb->layers,
@@ -1224,7 +1225,7 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer *cmd_buffer)
                      render_area.offset.x, render_area.offset.y,
                      render_area.offset.x + render_area.extent.width,
                      render_area.offset.y + render_area.extent.height,
-                     surf.clear_color, NULL);
+                     vk_to_isl_color(att_state->clear_value.color), NULL);
       }
 
       att_state->pending_clear_aspects = 0;
Mesa (master): intel/isl: add MCS width constraint 16 samples
Module: Mesa
Branch: master
Commit: 34e29b2ebd2c296bad0bf6df986b3d75105c55ec
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=34e29b2ebd2c296bad0bf6df986b3d75105c55ec

Author: Lionel Landwerlin
Date: Mon Feb 20 16:10:30 2017 +

intel/isl: add MCS width constraint 16 samples

v3 (Jason Ekstrand): Add a comment explaining why

Signed-off-by: Lionel Landwerlin
Reviewed-by: Jason Ekstrand
Reviewed-by: Chad Versace

---

 src/intel/isl/isl.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 1a47da5..d1fb7e4 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1417,6 +1417,16 @@ isl_surf_get_mcs_surf(const struct isl_device *dev,
    assert(surf->levels == 1);
    assert(surf->logical_level0_px.depth == 1);
 
+   /* The "Auxiliary Surface Pitch" field in RENDER_SURFACE_STATE is only 9
+    * bits which means the maximum pitch of a compression surface is 512
+    * tiles or 64KB (since MCS is always Y-tiled). Since a 16x MCS buffer is
+    * 64bpp, this gives us a maximum width of 8192 pixels. We can create
+    * larger multisampled surfaces, we just can't compress them. For 2x, 4x,
+    * and 8x, we have enough room for the full 16k supported by the hardware.
+    */
+   if (surf->samples == 16 && surf->logical_level0_px.width > 8192)
+      return false;
+
    enum isl_format mcs_format;
    switch (surf->samples) {
    case 2:  mcs_format = ISL_FORMAT_MCS_2X;  break;
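The arithmetic in the new comment can be checked with a short sketch. The constant and function names are hypothetical; the 128-byte Y-tile row width is an assumption inferred from the comment's own figures (512 tiles giving 64KB).

```cpp
#include <cassert>
#include <cstdint>

// Assumptions taken from the comment in the patch: a 9-bit pitch field
// (at most 512 tiles) and a 64KB maximum pitch, i.e. 128 bytes per Y-tile row.
constexpr std::uint32_t kMaxPitchTiles = 1u << 9;  // 512
constexpr std::uint32_t kYTileWidthBytes = 128;
constexpr std::uint32_t kMaxPitchBytes = kMaxPitchTiles * kYTileWidthBytes;

// Widest surface whose MCS row still fits in the maximum pitch, given the
// size of one MCS element in bits.
constexpr std::uint32_t max_mcs_width_px(std::uint32_t mcs_bits_per_pixel)
{
    return kMaxPitchBytes / (mcs_bits_per_pixel / 8);
}

static_assert(kMaxPitchBytes == 64 * 1024, "512 tiles of 128 bytes is 64KB");
static_assert(max_mcs_width_px(64) == 8192, "64bpp MCS tops out at 8192 px");
```

By the same formula a 32bpp MCS element would allow 16384 pixels, which is consistent with the comment's remark that only 16x needs the extra check.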
Mesa (master): anv: Enable MSAA compression
Module: Mesa Branch: master Commit: 261092f7d4f3142760fcce98ccb63b4efd47cc48 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=261092f7d4f3142760fcce98ccb63b4efd47cc48 Author: Jason Ekstrand Date: Fri Feb 17 14:14:48 2017 -0800 anv: Enable MSAA compression This just enables basic MSAA compression (no fast clears) for all multisampled surfaces. This improves the framerate of the Sascha "multisampling" demo by 76% on my Sky Lake laptop. Running Talos on medium settings with 8x MSAA, this improves the framerate in the benchmark by 80%. Reviewed-by: Lionel Landwerlin Reviewed-by: Chad Versace --- src/intel/vulkan/TODO | 2 +- src/intel/vulkan/anv_blorp.c | 3 ++- src/intel/vulkan/anv_image.c | 9 + src/intel/vulkan/anv_pipeline.c| 19 +++ src/intel/vulkan/genX_cmd_buffer.c | 5 + 5 files changed, 36 insertions(+), 2 deletions(-) diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO index 6abda88..5366774 100644 --- a/src/intel/vulkan/TODO +++ b/src/intel/vulkan/TODO @@ -8,7 +8,7 @@ Missing Features: Performance: - Multi-{sampled/gen8,LOD} HiZ - - Compressed multisample support + - MSAA fast clears - Pushing pieces of UBOs? 
- Enable guardband clipping - Use soft-pin to avoid relocations diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c index 8db03e4..2cde3b7 100644 --- a/src/intel/vulkan/anv_blorp.c +++ b/src/intel/vulkan/anv_blorp.c @@ -1398,7 +1398,8 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer, struct anv_attachment_state *att_state = &cmd_buffer->state.attachments[att]; - if (att_state->aux_usage == ISL_AUX_USAGE_NONE) + if (att_state->aux_usage == ISL_AUX_USAGE_NONE || + att_state->aux_usage == ISL_AUX_USAGE_MCS) return; /* Nothing to resolve */ assert(att_state->aux_usage == ISL_AUX_USAGE_CCS_E || diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c index 47d0a1e..cd14293 100644 --- a/src/intel/vulkan/anv_image.c +++ b/src/intel/vulkan/anv_image.c @@ -238,6 +238,15 @@ make_surface(const struct anv_device *dev, } } } + } else if (aspect == VK_IMAGE_ASPECT_COLOR_BIT && vk_info->samples > 1) { + assert(image->aux_surface.isl.size == 0); + assert(!(vk_info->usage & VK_IMAGE_USAGE_STORAGE_BIT)); + ok = isl_surf_get_mcs_surf(&dev->isl_dev, &anv_surf->isl, + &image->aux_surface.isl); + if (ok) { + add_surface(image, &image->aux_surface); + image->aux_usage = ISL_AUX_USAGE_MCS; + } } return VK_SUCCESS; diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c index 4410103..708b05a 100644 --- a/src/intel/vulkan/anv_pipeline.c +++ b/src/intel/vulkan/anv_pipeline.c @@ -228,6 +228,25 @@ static void populate_sampler_prog_key(const struct gen_device_info *devinfo, struct brw_sampler_prog_key_data *key) { + /* Almost all multisampled textures are compressed. The only time when we +* don't compress a multisampled texture is for 16x MSAA with a surface +* width greater than 8k which is a bit of an edge case. Since the sampler +* just ignores the MCS parameter to ld2ms when MCS is disabled, it's safe +* to tell the compiler to always assume compression. 
+*/ + key->compressed_multisample_layout_mask = ~0; + + /* SkyLake added support for 16x MSAA. With this came a new message for +* reading from a 16x MSAA surface with compression. The new message was +* needed because now the MCS data is 64 bits instead of 32 or lower as is +* the case for 8x, 4x, and 2x. The key->msaa_16 bit-field controls which +* message we use. Fortunately, the 16x message works for 8x, 4x, and 2x +* so we can just use it unconditionally. This may not be quite as +* efficient but it saves us from recompiling. +*/ + if (devinfo->gen >= 9) + key->msaa_16 = ~0; + /* XXX: Handle texture swizzle on HSW- */ for (int i = 0; i < MAX_SAMPLERS; i++) { /* Assume color sampler, no swizzling. (Works for BDW+) */ diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 7af2b31..e3f84e3 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -222,6 +222,11 @@ color_attachment_compute_aux_usage(struct anv_device *device, att_state->input_aux_usage = ISL_AUX_USAGE_NONE; att_state->fast_clear = false; return; + } else if (iview->image->aux_usage == ISL_AUX_USAGE_MCS) { + att_state->aux_usage = ISL_AUX_USAGE_MCS; + att_state->input_aux_usage = ISL_AUX_USAGE_MCS; + att_state->fast_clear = false; + return; } assert(iview->image->aux_surface.isl.usage & ISL_SURF_USAGE_CCS_BIT); ___ mesa-commit mailing list mesa-commit@lists.freedesktop.org https://lists.freedesktop.org/mai
Mesa (master): intel/isl: Return surface creation success from aux helpers
Module: Mesa Branch: master Commit: 3885375195c9c62f7450beabb070a0e47cc11c58 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=3885375195c9c62f7450beabb070a0e47cc11c58 Author: Jason Ekstrand Date: Fri Feb 17 13:48:11 2017 -0800 intel/isl: Return surface creation success from aux helpers The isl_surf_init call that each of these helpers make can, in theory, fail. We should propagate that up to the caller rather than just silently ignoring it. Reviewed-by: Topi Pohjolainen Reviewed-by: Chad Versace --- src/intel/isl/isl.c | 72 +--- src/intel/isl/isl.h | 4 +-- src/intel/vulkan/anv_image.c | 5 +-- 3 files changed, 40 insertions(+), 41 deletions(-) diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c index 82ab68d..1a47da5 100644 --- a/src/intel/isl/isl.c +++ b/src/intel/isl/isl.c @@ -1323,7 +1323,7 @@ isl_surf_get_tile_info(const struct isl_device *dev, isl_tiling_get_info(dev, surf->tiling, fmtl->bpb, tile_info); } -void +bool isl_surf_get_hiz_surf(const struct isl_device *dev, const struct isl_surf *surf, struct isl_surf *hiz_surf) @@ -1391,20 +1391,20 @@ isl_surf_get_hiz_surf(const struct isl_device *dev, */ const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 
1 : surf->samples; - isl_surf_init(dev, hiz_surf, - .dim = surf->dim, - .format = ISL_FORMAT_HIZ, - .width = surf->logical_level0_px.width, - .height = surf->logical_level0_px.height, - .depth = surf->logical_level0_px.depth, - .levels = surf->levels, - .array_len = surf->logical_level0_px.array_len, - .samples = samples, - .usage = ISL_SURF_USAGE_HIZ_BIT, - .tiling_flags = ISL_TILING_HIZ_BIT); + return isl_surf_init(dev, hiz_surf, +.dim = surf->dim, +.format = ISL_FORMAT_HIZ, +.width = surf->logical_level0_px.width, +.height = surf->logical_level0_px.height, +.depth = surf->logical_level0_px.depth, +.levels = surf->levels, +.array_len = surf->logical_level0_px.array_len, +.samples = samples, +.usage = ISL_SURF_USAGE_HIZ_BIT, +.tiling_flags = ISL_TILING_HIZ_BIT); } -void +bool isl_surf_get_mcs_surf(const struct isl_device *dev, const struct isl_surf *surf, struct isl_surf *mcs_surf) @@ -1427,17 +1427,17 @@ isl_surf_get_mcs_surf(const struct isl_device *dev, unreachable("Invalid sample count"); } - isl_surf_init(dev, mcs_surf, - .dim = ISL_SURF_DIM_2D, - .format = mcs_format, - .width = surf->logical_level0_px.width, - .height = surf->logical_level0_px.height, - .depth = 1, - .levels = 1, - .array_len = surf->logical_level0_px.array_len, - .samples = 1, /* MCS surfaces are really single-sampled */ - .usage = ISL_SURF_USAGE_MCS_BIT, - .tiling_flags = ISL_TILING_Y0_BIT); + return isl_surf_init(dev, mcs_surf, +.dim = ISL_SURF_DIM_2D, +.format = mcs_format, +.width = surf->logical_level0_px.width, +.height = surf->logical_level0_px.height, +.depth = 1, +.levels = 1, +.array_len = surf->logical_level0_px.array_len, +.samples = 1, /* MCS surfaces are really single-sampled */ +.usage = ISL_SURF_USAGE_MCS_BIT, +.tiling_flags = ISL_TILING_Y0_BIT); } bool @@ -1491,19 +1491,17 @@ isl_surf_get_ccs_surf(const struct isl_device *dev, return false; } - isl_surf_init(dev, ccs_surf, - .dim = surf->dim, - .format = ccs_format, - .width = surf->logical_level0_px.width, - .height = 
surf->logical_level0_px.height, - .depth = surf->logical_level0_px.depth, - .levels = surf->levels, - .array_len = surf->logical_level0_px.array_len, - .samples = 1, - .usage = ISL_SURF_USAGE_CCS_BIT, - .tiling_flags = ISL_TILING_CCS_BIT); - - return true; + return isl_surf_init(dev, ccs_surf, +.dim = surf->dim, +.format = ccs_format, +.width = surf->logical_level0_px.width, +.height = surf->logical_level0_px.height, +.depth = surf->logical_level0_px.depth, +.levels = surf-
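The change from void to bool in these helpers follows a simple pattern, sketched below with stand-in types. Surf, surf_init, get_aux_surf and the 16384 limit are all hypothetical and only illustrate returning the init result instead of discarding it.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real isl types or limits.
struct Surf {
    unsigned width;
    unsigned height;
};

// Stand-in for a surface init call that can fail, so it reports success.
inline bool surf_init(Surf &out, unsigned width, unsigned height)
{
    const unsigned kMaxDim = 16384;  // illustrative limit only
    if (width == 0 || height == 0 || width > kMaxDim || height > kMaxDim)
        return false;
    out.width = width;
    out.height = height;
    return true;
}

// Aux helper in the style of the patch: return the init result to the
// caller rather than silently ignoring it.
inline bool get_aux_surf(const Surf &main_surf, Surf &aux)
{
    return surf_init(aux, main_surf.width, main_surf.height);
}
```

A caller can then skip enabling the auxiliary surface when the helper returns false.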
Mesa (master): intel/isl: Apply render target alignment constraints for MCS
Module: Mesa
Branch: master
Commit: 042cc201f2869bb3a316729643e8e025f115
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=042cc201f2869bb3a316729643e8e025f115

Author: Pohjolainen, Topi
Date: Thu Feb 23 15:31:44 2017 +0200

intel/isl: Apply render target alignment constraints for MCS

v2: Instead of having the same block in isl_gen7,8,9.c add it
    once into isl.c::isl_choose_image_alignment_el() instead.

Reviewed-by: Jason Ekstrand
Signed-off-by: Topi Pohjolainen

---

 src/intel/isl/isl.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index d1fb7e4..6eb1e93 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -480,7 +480,22 @@ isl_choose_image_alignment_el(const struct isl_device *dev,
                               enum isl_msaa_layout msaa_layout,
                               struct isl_extent3d *image_align_el)
 {
-   if (info->format == ISL_FORMAT_HIZ) {
+   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
+   if (fmtl->txc == ISL_TXC_MCS) {
+      assert(tiling == ISL_TILING_Y0);
+
+      /*
+       * IvyBrigde PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+       *
+       * Height, width, and layout of MCS buffer in this case must match with
+       * Render Target height, width, and layout. MCS buffer is tiledY.
+       *
+       * To avoid wasting memory, choose the smallest alignment possible:
+       * HALIGN_4 and VALIGN_4.
+       */
+      *image_align_el = isl_extent3d(4, 4, 1);
+      return;
+   } else if (info->format == ISL_FORMAT_HIZ) {
       assert(ISL_DEV_GEN(dev) >= 6);
       /* HiZ surfaces are always aligned to 16x8 pixels in the primary surface
        * which works out to 2x2 HiZ elments.
Mesa (master): st/mesa: free shader cache buffer on fallback
Module: Mesa
Branch: master
Commit: 987d8037cabaafaeba2cb8b82cb7fa7290dc4464
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=987d8037cabaafaeba2cb8b82cb7fa7290dc4464

Author: Timothy Arceri
Date: Thu Feb 23 14:50:58 2017 +1100

st/mesa: free shader cache buffer on fallback

Reviewed-by: Edward O'Callaghan
Tested-by: Michel Dänzer

---

 src/mesa/state_tracker/st_shader_cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c
index eb66f99..fba4b0a 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -242,13 +242,14 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
       return false;
 
    struct st_context *st = st_context(ctx);
+   uint8_t *buffer = NULL;
    for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
       if (prog->_LinkedShaders[i] == NULL)
          continue;
 
       unsigned char *sha1 = stage_sha1[i];
       size_t size;
-      uint8_t *buffer = (uint8_t *) disk_cache_get(ctx->Cache, sha1, &size);
+      buffer = (uint8_t *) disk_cache_get(ctx->Cache, sha1, &size);
       if (buffer) {
          struct blob_reader blob_reader;
          blob_reader_init(&blob_reader, buffer, size);
@@ -396,6 +397,7 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
    return true;
 
 fallback_recompile:
+   free(buffer);
 
    for (unsigned i = 0; i < prog->NumShaders; i++) {
       _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
Mesa (master): st/mesa: fix crash in shader cache caused by race condition
Module: Mesa
Branch: master
Commit: c24d0aaa9a197ccf7cbaa9154b840aed6397f6bd
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=c24d0aaa9a197ccf7cbaa9154b840aed6397f6bd

Author: Timothy Arceri
Date: Thu Feb 23 14:42:07 2017 +1100

st/mesa: fix crash in shader cache caused by race condition

If a thread doesn't load GLSL IR from cache but does load TGSI from
cache (that was created by another thread) then it will crash due to
expecting gl_program_parameter_list to have been restored from the
GLSL IR cache and not be null.

Reviewed-by: Edward O'Callaghan
Tested-by: Michel Dänzer

---

 src/mesa/state_tracker/st_shader_cache.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c
index 607e5b1..eb66f99 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -233,6 +233,14 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
       ralloc_free(buf);
    }
 
+   /* Now that we have created the sha1 keys that will be used for writting to
+    * the tgsi cache fallback to the regular glsl to tgsi path if we didn't
+    * load the GLSL IR from cache. We do this as glsl to tgsi can alter things
+    * such as gl_program_parameter_list which holds things like uniforms.
+    */
+   if (prog->data->LinkStatus != linking_skipped)
+      return false;
+
    struct st_context *st = st_context(ctx);
    for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
       if (prog->_LinkedShaders[i] == NULL)
@@ -389,12 +397,6 @@ st_load_tgsi_from_disk_cache(struct gl_context *ctx,
 
 fallback_recompile:
 
-   /* GLSL IR was compiled and linked so just fallback to the regular
-    * glsl to tgsi path.
-    */
-   if (prog->data->LinkStatus != linking_skipped)
-      return false;
-
    for (unsigned i = 0; i < prog->NumShaders; i++) {
       _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
    }
Mesa (master): swr: fix index buffers with non-zero indices
Module: Mesa Branch: master Commit: dcac48bfee545660dffbf23bd92a0939b19ffd18 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=dcac48bfee545660dffbf23bd92a0939b19ffd18 Author: George Kyriazis Date: Thu Feb 2 21:16:47 2017 -0600 swr: fix index buffers with non-zero indices Fix issue with index buffers that do not contain a 0 index. 0 index can be a non-valid index if the (copied) vertex buffers are a subset of the user's (which happens because we only copy the range between min & max). Core will use an index passed in from the driver to replace invalid indices. Only do this for calls that contain non-zero indices, to minimize performance cost. Reviewed-by: Bruce Cherniak --- src/gallium/drivers/swr/rasterizer/core/state.h| 1 + .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 60 +++--- .../drivers/swr/rasterizer/jitter/fetch_jit.h | 2 + src/gallium/drivers/swr/swr_draw.cpp | 1 + src/gallium/drivers/swr/swr_state.cpp | 4 ++ 5 files changed, 62 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h index 2f3b913..05347dc 100644 --- a/src/gallium/drivers/swr/rasterizer/core/state.h +++ b/src/gallium/drivers/swr/rasterizer/core/state.h @@ -524,6 +524,7 @@ struct SWR_VERTEX_BUFFER_STATE const uint8_t *pData; uint32_t size; uint32_t numaNode; +uint32_t minVertex; // min vertex (for bounds checking) uint32_t maxVertex; // size / pitch. precalculated value used by fetch shader for OOB checks uint32_t partialInboundsSize; // size % pitch. 
precalculated value used by fetch shader for partially OOB vertices }; diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp index 901bce6..ffa7605 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp +++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp @@ -309,11 +309,29 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str Value* startVertexOffset = MUL(Z_EXT(startOffset, mInt64Ty), stride); +Value *minVertex = NULL; +Value *minVertexOffset = NULL; +if (fetchState.bPartialVertexBuffer) { +// fetch min index for low bounds checking +minVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_minVertex)}); +minVertex = LOAD(minVertex); +if (!fetchState.bDisableIndexOOBCheck) { +minVertexOffset = MUL(Z_EXT(minVertex, mInt64Ty), stride); +} +} + // Load from the stream. for(uint32_t lane = 0; lane < mVWidth; ++lane) { // Get index Value* index = VEXTRACT(vCurIndices, C(lane)); + +if (fetchState.bPartialVertexBuffer) { +// clamp below minvertex +Value *isBelowMin = ICMP_SLT(index, minVertex); +index = SELECT(isBelowMin, minVertex, index); +} + index = Z_EXT(index, mInt64Ty); Value*offset = MUL(index, stride); @@ -321,10 +339,14 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str offset = ADD(offset, startVertexOffset); if (!fetchState.bDisableIndexOOBCheck) { -// check for out of bound access, including partial OOB, and mask them to 0 +// check for out of bound access, including partial OOB, and replace them with minVertex Value *endOffset = ADD(offset, C((int64_t)info.Bpp)); Value *oob = ICMP_ULE(endOffset, size); -offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0)); +if (fetchState.bPartialVertexBuffer) { +offset = SELECT(oob, offset, minVertexOffset); +} else { +offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0)); +} } Value*pointer = GEP(stream, offset); @@ -732,6 +754,13 @@ void 
FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState, Value *maxVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_maxVertex)}); maxVertex = LOAD(maxVertex); +Value *minVertex = NULL; +if (fetchState.bPartialVertexBuffer) { +// min vertex index for low bounds OOB checking +minVertex = GEP(streams, {C(ied.StreamIndex), C(SWR_VERTEX_BUFFER_STATE_minVertex)}); +minVertex = LOAD(minVertex); +} + Value *vCurIndices; Value *startOffset; if(ied.InstanceEnable) @@ -769,9 +798,16 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState, // if we have a start offset, subtract from max vertex. Used for OOB check maxVert
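A scalar sketch of the per-lane logic that the JitLoadVertices hunk above emits as LLVM IR. StreamInfo and fetch_offset are hypothetical names, and the sketch always performs the out-of-bounds check (the real code skips it when bDisableIndexOOBCheck is set).

```cpp
#include <cassert>
#include <cstdint>

// Illustrative per-stream parameters; not the SWR_VERTEX_BUFFER_STATE layout.
struct StreamInfo {
    std::int32_t min_vertex;  // lowest index covered by the copied buffer
    std::int64_t stride;      // bytes between consecutive vertices
    std::int64_t size;        // bytes available in the stream
    std::int64_t bpp;         // bytes read per fetch
};

inline std::int64_t fetch_offset(std::int32_t index, const StreamInfo &s,
                                 std::int64_t start_vertex_offset,
                                 bool partial_vertex_buffer)
{
    // Clamp indices below the minimum up to the minimum (signed compare).
    if (partial_vertex_buffer && index < s.min_vertex)
        index = s.min_vertex;

    std::int64_t offset = std::int64_t(index) * s.stride + start_vertex_offset;

    // A read that would run past the end of the stream is redirected to a
    // known-valid offset: the minimum vertex for partial buffers, else 0.
    const bool in_bounds =
        std::uint64_t(offset + s.bpp) <= std::uint64_t(s.size);
    if (!in_bounds)
        offset = partial_vertex_buffer
                     ? std::int64_t(s.min_vertex) * s.stride
                     : 0;
    return offset;
}
```

With min_vertex = 2 and a 16-byte stride, an out-of-range index resolves to byte offset 32 instead of 0, so the fetch stays inside the copied range.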
Mesa (master): swr: add fetch shader cache
Module: Mesa Branch: master Commit: 669d8f626f64cee1bc74ef7869aac8585b6dcfe6 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=669d8f626f64cee1bc74ef7869aac8585b6dcfe6 Author: George Kyriazis Date: Fri Feb 10 10:24:32 2017 -0600 swr: add fetch shader cache For now, the cache key is all of FETCH_COMPILE_STATE. Use new/delete for swr_vertex_element_state, since we have to call the constructors/destructors of the struct elements. Reviewed-by: Bruce Cherniak --- src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h | 2 +- src/gallium/drivers/swr/swr_draw.cpp | 19 +++ src/gallium/drivers/swr/swr_shader.cpp| 14 ++ src/gallium/drivers/swr/swr_shader.h | 15 +++ src/gallium/drivers/swr/swr_state.cpp | 6 -- src/gallium/drivers/swr/swr_state.h | 9 + 6 files changed, 50 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h index 1547453..622608a 100644 --- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h +++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h @@ -94,7 +94,7 @@ enum ComponentControl // struct FETCH_COMPILE_STATE { -uint32_t numAttribs; +uint32_t numAttribs {0}; INPUT_ELEMENT_DESC layout[KNOB_NUM_ATTRIBUTES]; SWR_FORMAT indexType; uint32_t cutIndex{ 0x }; diff --git a/src/gallium/drivers/swr/swr_draw.cpp b/src/gallium/drivers/swr/swr_draw.cpp index c4d5e5c..4bdd3bb 100644 --- a/src/gallium/drivers/swr/swr_draw.cpp +++ b/src/gallium/drivers/swr/swr_draw.cpp @@ -141,19 +141,22 @@ swr_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info) } struct swr_vertex_element_state *velems = ctx->velems; - if (!velems->fsFunc - || (velems->fsState.cutIndex != info->restart_index) - || (velems->fsState.bEnableCutIndex != info->primitive_restart)) { - - velems->fsState.cutIndex = info->restart_index; - velems->fsState.bEnableCutIndex = info->primitive_restart; - - /* Create Fetch Shader */ + velems->fsState.cutIndex = info->restart_index; + 
velems->fsState.bEnableCutIndex = info->primitive_restart; + + swr_jit_fetch_key key; + swr_generate_fetch_key(key, velems); + auto search = velems->map.find(key); + if (search != velems->map.end()) { + velems->fsFunc = search->second; + } else { HANDLE hJitMgr = swr_screen(ctx->pipe.screen)->hJitMgr; velems->fsFunc = JitCompileFetch(hJitMgr, velems->fsState); debug_printf("fetch shader %p\n", velems->fsFunc); assert(velems->fsFunc && "Error: FetchShader = NULL"); + + velems->map.insert(std::make_pair(key, velems->fsFunc)); } SwrSetFetchFunc(ctx->swrContext, velems->fsFunc); diff --git a/src/gallium/drivers/swr/swr_shader.cpp b/src/gallium/drivers/swr/swr_shader.cpp index 979a28b..676938c 100644 --- a/src/gallium/drivers/swr/swr_shader.cpp +++ b/src/gallium/drivers/swr/swr_shader.cpp @@ -61,6 +61,11 @@ bool operator==(const swr_jit_vs_key &lhs, const swr_jit_vs_key &rhs) return !memcmp(&lhs, &rhs, sizeof(lhs)); } +bool operator==(const swr_jit_fetch_key &lhs, const swr_jit_fetch_key &rhs) +{ + return !memcmp(&lhs, &rhs, sizeof(lhs)); +} + static void swr_generate_sampler_key(const struct lp_tgsi_info &info, struct swr_context *ctx, @@ -157,6 +162,15 @@ swr_generate_vs_key(struct swr_jit_vs_key &key, swr_generate_sampler_key(swr_vs->info, ctx, PIPE_SHADER_VERTEX, key); } +void +swr_generate_fetch_key(struct swr_jit_fetch_key &key, + struct swr_vertex_element_state *velems) +{ + memset(&key, 0, sizeof(key)); + + key.fsState = velems->fsState; +} + struct BuilderSWR : public Builder { BuilderSWR(JitManager *pJitMgr, const char *pName) : Builder(pJitMgr) diff --git a/src/gallium/drivers/swr/swr_shader.h b/src/gallium/drivers/swr/swr_shader.h index 7e3399c..266573f 100644 --- a/src/gallium/drivers/swr/swr_shader.h +++ b/src/gallium/drivers/swr/swr_shader.h @@ -42,6 +42,9 @@ void swr_generate_vs_key(struct swr_jit_vs_key &key, struct swr_context *ctx, swr_vertex_shader *swr_vs); +void swr_generate_fetch_key(struct swr_jit_fetch_key &key, +struct swr_vertex_element_state 
*velems); + struct swr_jit_sampler_key { unsigned nr_samplers; unsigned nr_sampler_views; @@ -60,6 +63,10 @@ struct swr_jit_vs_key : swr_jit_sampler_key { unsigned clip_plane_mask; // from rasterizer state & vs_info }; +struct swr_jit_fetch_key { + FETCH_COMPILE_STATE fsState; +}; + namespace std { template <> struct hash { @@ -75,7 +82,15 @@ template <> struct hash { return util_hash_crc32(&k, sizeof(k)); } }; + +template <> stru
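The caching scheme in this patch can be sketched with a plain std::unordered_map. Everything here is a stand-in: FetchState is a reduced, padding-free substitute for FETCH_COMPILE_STATE, FetchFunc replaces the JIT function pointer, and an FNV-1a byte hash is used where the driver calls util_hash_crc32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Reduced stand-in for the compile state; all fields are 32-bit so the
// struct has no padding and byte-wise comparison is well defined.
struct FetchState {
    std::uint32_t num_attribs;
    std::uint32_t cut_index;
    std::uint32_t enable_cut_index;
};

struct FetchKey {
    FetchState state;
};

// Keys compare byte-wise, mirroring the memcmp-based operator== in the patch.
inline bool operator==(const FetchKey &a, const FetchKey &b)
{
    return std::memcmp(&a, &b, sizeof(a)) == 0;
}

// FNV-1a over the key bytes; a stand-in for the driver's CRC32 hash.
struct FetchKeyHash {
    std::size_t operator()(const FetchKey &k) const
    {
        const unsigned char *p = reinterpret_cast<const unsigned char *>(&k);
        std::uint64_t h = 1469598103934665603ull;
        for (std::size_t i = 0; i < sizeof(k); ++i) {
            h ^= p[i];
            h *= 1099511628211ull;
        }
        return static_cast<std::size_t>(h);
    }
};

using FetchFunc = int;  // stand-in for the compiled fetch shader handle

struct FetchCache {
    std::unordered_map<FetchKey, FetchFunc, FetchKeyHash> map;
    int compiles = 0;

    // Look the state up first and "compile" only on a miss.
    FetchFunc get(const FetchState &s)
    {
        FetchKey key;
        std::memset(&key, 0, sizeof(key));
        key.state = s;
        auto it = map.find(key);
        if (it != map.end())
            return it->second;
        FetchFunc f = ++compiles;  // pretend this is the JIT compile
        map.emplace(key, f);
        return f;
    }
};
```

Compared with the earlier single-slot check on cutIndex and bEnableCutIndex, a map keyed on the whole state lets a draw switch back to a previously seen configuration without recompiling.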
Mesa (master): radv: enable location at sample when persample is forced.
Module: Mesa
Branch: master
Commit: 58c97a0791bf71b31546b13c2b491a636555749c
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=58c97a0791bf71b31546b13c2b491a636555749c

Author: Dave Airlie
Date: Thu Feb 23 14:24:20 2017 +1000

radv: enable location at sample when persample is forced.

Reviewed-by: Bas Nieuwenhuizen
Signed-off-by: Dave Airlie

---

 src/amd/vulkan/radv_cmd_buffer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index dd6deef..5b7564c 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -685,6 +685,9 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 	radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR,
 			       ps->config.spi_ps_input_addr);
 
+	if (ps->info.fs.force_persample)
+		spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
+
 	radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL,
 			       S_0286D8_NUM_INTERP(ps->info.fs.num_interp));
Mesa (master): radv: add sample mask output support
Module: Mesa
Branch: master
Commit: ccb70d6f53464171639ee7809c9fe5ee3a86e54d
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=ccb70d6f53464171639ee7809c9fe5ee3a86e54d

Author: Dave Airlie
Date: Thu Feb 23 16:06:22 2017 +1000

radv: add sample mask output support

This adds support to write to sample mask from the fragment shader.
We can optimise this later like radeonsi.

Reviewed-by: Bas Nieuwenhuizen
Signed-off-by: Dave Airlie

---

 src/amd/common/ac_nir_to_llvm.c  | 8 ++++++--
 src/amd/common/ac_nir_to_llvm.h  | 1 +
 src/amd/vulkan/radv_cmd_buffer.c | 2 ++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 6021647..9778581 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4753,13 +4753,17 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx)
 			ctx->shader_info->fs.writes_stencil = true;
 			stencil = to_float(ctx, LLVMBuildLoad(ctx->builder, ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
+		} else if (i == FRAG_RESULT_SAMPLE_MASK) {
+			ctx->shader_info->fs.writes_sample_mask = true;
+			samplemask = to_float(ctx, LLVMBuildLoad(ctx->builder,
+								  ctx->outputs[radeon_llvm_reg_index_soa(i, 0)], ""));
 		} else {
 			bool last = false;
 			for (unsigned j = 0; j < 4; j++)
 				values[j] = to_float(ctx, LLVMBuildLoad(ctx->builder, ctx->outputs[radeon_llvm_reg_index_soa(i, j)], ""));
-			if (!ctx->shader_info->fs.writes_z && !ctx->shader_info->fs.writes_stencil)
+			if (!ctx->shader_info->fs.writes_z && !ctx->shader_info->fs.writes_stencil && !ctx->shader_info->fs.writes_sample_mask)
 				last = ctx->output_mask <= ((1ull << (i + 1)) - 1);
 			si_export_mrt_color(ctx, values, V_008DFC_SQ_EXP_MRT + index, last);
@@ -4767,7 +4771,7 @@ handle_fs_outputs_post(struct nir_to_llvm_context *ctx)
 		}
 	}
-	if (depth || stencil)
+	if (depth || stencil || samplemask)
 		si_export_mrt_z(ctx, depth, stencil, samplemask);
 	else if (!index)
 		si_export_mrt_color(ctx, NULL, V_008DFC_SQ_EXP_NULL, true);
diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index c2662e2..6c2b78b 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -118,6 +118,7 @@ struct ac_shader_variant_info {
 			bool can_discard;
 			bool writes_z;
 			bool writes_stencil;
+			bool writes_sample_mask;
 			bool early_fragment_test;
 			bool writes_memory;
 			bool force_persample;
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 5b7564c..1e38cbe 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -674,6 +674,7 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 			       S_02880C_Z_EXPORT_ENABLE(ps->info.fs.writes_z) |
 			       S_02880C_STENCIL_TEST_VAL_EXPORT_ENABLE(ps->info.fs.writes_stencil) |
 			       S_02880C_KILL_ENABLE(!!ps->info.fs.can_discard) |
+			       S_02880C_MASK_EXPORT_ENABLE(ps->info.fs.writes_sample_mask) |
 			       S_02880C_Z_ORDER(z_order) |
 			       S_02880C_DEPTH_BEFORE_SHADER(ps->info.fs.early_fragment_test) |
 			       S_02880C_EXEC_ON_HIER_FAIL(ps->info.fs.writes_memory) |
@@ -694,6 +695,7 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 	radeon_set_context_reg(cmd_buffer->cs, R_0286E0_SPI_BARYC_CNTL, spi_baryc_cntl);
 	radeon_set_context_reg(cmd_buffer->cs, R_028710_SPI_SHADER_Z_FORMAT,
+			       ps->info.fs.writes_sample_mask ? V_028710_SPI_SHADER_32_ABGR :
 			       ps->info.fs.writes_stencil ? V_028710_SPI_SHADER_32_GR :
 			       ps->info.fs.writes_z ? V_028710_SPI_SHADER_32_R :
 			       V_028710_SPI_SHADER_ZERO);
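The chained conditional that programs SPI_SHADER_Z_FORMAT in the last hunk above picks the widest export format any enabled output needs, with the sample mask now taking priority. A minimal sketch of that priority order follows; the enum z_export_format and the function pick_z_export_format() are hypothetical names introduced here, and the enum values are placeholders rather than the real V_028710_SPI_SHADER_* register encodings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the V_028710_SPI_SHADER_* values. */
enum z_export_format {
    Z_FMT_ZERO,    /* nothing exported */
    Z_FMT_32_R,    /* depth only */
    Z_FMT_32_GR,   /* depth + stencil */
    Z_FMT_32_ABGR  /* depth + stencil + sample mask */
};

/* Same priority order as the patched ternary chain: sample mask first,
 * then stencil, then depth, otherwise no Z export at all. */
enum z_export_format pick_z_export_format(bool writes_z,
                                          bool writes_stencil,
                                          bool writes_sample_mask)
{
    if (writes_sample_mask)
        return Z_FMT_32_ABGR;
    if (writes_stencil)
        return Z_FMT_32_GR;
    if (writes_z)
        return Z_FMT_32_R;
    return Z_FMT_ZERO;
}

/* Small usage example: a shader that writes only the sample mask still
 * selects the widest format. */
void z_export_format_example(void)
{
    assert(pick_z_export_format(false, false, true) == Z_FMT_32_ABGR);
    assert(pick_z_export_format(true, false, false) == Z_FMT_32_R);
}
```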
Mesa (master): radv: fetch sample index via fmask for image coord as well.
Module: Mesa Branch: master Commit: 5e9ead0fa21eb2e3dfaca5485990110e17cc7b79 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=5e9ead0fa21eb2e3dfaca5485990110e17cc7b79 Author: Dave Airlie Date: Wed Feb 22 14:29:09 2017 +1000 radv: fetch sample index via fmask for image coord as well. This follows the txf_ms code, I can't figure out why amdgpu-pro doesn't do this in their shaders, they must know someone we don't. This fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_id.* Reviewed-by: Bas Nieuwenhuizen Signed-off-by: Dave Airlie --- src/amd/common/ac_nir_to_llvm.c | 180 1 file changed, 126 insertions(+), 54 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 0cc5810..63583fa 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -2367,60 +2367,6 @@ static int image_type_to_components_count(enum glsl_sampler_dim dim, bool array) return 0; } -static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx, -nir_intrinsic_instr *instr) -{ - const struct glsl_type *type = instr->variables[0]->var->type; - if(instr->variables[0]->deref.child) - type = instr->variables[0]->deref.child->type; - - LLVMValueRef src0 = get_src(ctx, instr->src[0]); - LLVMValueRef coords[4]; - LLVMValueRef masks[] = { - LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, false), - LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, false), - }; - LLVMValueRef res; - int count; - enum glsl_sampler_dim dim = glsl_get_sampler_dim(type); - bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS || -dim == GLSL_SAMPLER_DIM_SUBPASS_MS); - bool is_ms = (dim == GLSL_SAMPLER_DIM_MS || - dim == GLSL_SAMPLER_DIM_SUBPASS_MS); - - count = image_type_to_components_count(dim, - glsl_sampler_type_is_array(type)); - - if (count == 1) { - if (instr->src[0].ssa->num_components) - res = LLVMBuildExtractElement(ctx->builder, src0, masks[0], ""); - else - res = src0; - } else { - int chan; - if (is_ms) - 
count--; - for (chan = 0; chan < count; ++chan) { - coords[chan] = LLVMBuildExtractElement(ctx->builder, src0, masks[chan], ""); - } - - if (add_frag_pos) { - for (chan = 0; chan < count; ++chan) - coords[chan] = LLVMBuildAdd(ctx->builder, coords[chan], LLVMBuildFPToUI(ctx->builder, ctx->frag_pos[chan], ctx->i32, ""), ""); - } - if (is_ms) { - coords[count] = llvm_extract_elem(ctx, get_src(ctx, instr->src[1]), 0); - count++; - } - - if (count == 3) { - coords[3] = LLVMGetUndef(ctx->i32); - count = 4; - } - res = ac_build_gather_values(&ctx->ac, coords, count); - } - return res; -} static void build_type_name_for_intr( LLVMTypeRef type, @@ -2483,6 +2429,132 @@ static void get_image_intr_name(const char *base_name, } } +static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx, +nir_intrinsic_instr *instr) +{ + const struct glsl_type *type = instr->variables[0]->var->type; + if(instr->variables[0]->deref.child) + type = instr->variables[0]->deref.child->type; + + LLVMValueRef src0 = get_src(ctx, instr->src[0]); + LLVMValueRef coords[4]; + LLVMValueRef masks[] = { + LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, false), + LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, false), + }; + LLVMValueRef res; + LLVMValueRef sample_index = llvm_extract_elem(ctx, get_src(ctx, instr->src[1]), 0); + + int count; + enum glsl_sampler_dim dim = glsl_get_sampler_dim(type); + bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS || +dim == GLSL_SAMPLER_DIM_SUBPASS_MS); + bool is_ms = (dim == GLSL_SAMPLER_DIM_MS || + dim == GLSL_SAMPLER_DIM_SUBPASS_MS); + + count = image_type_to_components_count(dim, + glsl_sampler_type_is_array(type)); + + if (is_ms) { + LLVMValueRef fmask_load_address[4]; + LLVMValueRef params[7]; + LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false); + LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false); + LLVMValueRef da = ctx->i32
Mesa (master): radv/ac: refactor our fmask sample index fixup.
Module: Mesa Branch: master Commit: 8282c5c7710fb56231ea0e1b9d7b0f9295230e15 URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=8282c5c7710fb56231ea0e1b9d7b0f9295230e15 Author: Dave Airlie Date: Thu Feb 23 12:20:25 2017 +1000 radv/ac: refactor our fmask sample index fixup. This refactors out the sample index fixup between txf and image load. Reviewed-by: Bas Nieuwenhuizen Signed-off-by: Dave Airlie --- src/amd/common/ac_nir_to_llvm.c | 229 +++- 1 file changed, 107 insertions(+), 122 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 63583fa..6021647 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -2429,6 +2429,95 @@ static void get_image_intr_name(const char *base_name, } } +/* Adjust the sample index according to FMASK. + * + * For uncompressed MSAA surfaces, FMASK should return 0x76543210, + * which is the identity mapping. Each nibble says which physical sample + * should be fetched to get that sample. + * + * For example, 0x1100 means there are only 2 samples stored and + * the second sample covers 3/4 of the pixel. When reading samples 0 + * and 1, return physical sample 0 (determined by the first two 0s + * in FMASK), otherwise return physical sample 1. + * + * The sample index should be adjusted as follows: + * sample_index = (fmask >> (sample_index * 4)) & 0xF; + */ +static LLVMValueRef adjust_sample_index_using_fmask(struct nir_to_llvm_context *ctx, + LLVMValueRef coord_x, LLVMValueRef coord_y, + LLVMValueRef coord_z, + LLVMValueRef sample_index, + LLVMValueRef fmask_desc_ptr) +{ + LLVMValueRef fmask_load_address[4], params[7]; + LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false); + LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false); + LLVMValueRef da = coord_z ? 
ctx->i32one : ctx->i32zero; + LLVMValueRef res; + char intrinsic_name[64]; + + fmask_load_address[0] = coord_x; + fmask_load_address[1] = coord_y; + if (coord_z) { + fmask_load_address[2] = coord_z; + fmask_load_address[3] = LLVMGetUndef(ctx->i32); + } + + params[0] = ac_build_gather_values(&ctx->ac, fmask_load_address, coord_z ? 4 : 2); + params[1] = fmask_desc_ptr; + params[2] = LLVMConstInt(ctx->i32, 15, false); /* dmask */ + LLVMValueRef lwe = LLVMConstInt(ctx->i1, 0, false); + params[3] = glc; + params[4] = slc; + params[5] = lwe; + params[6] = da; + + get_image_intr_name("llvm.amdgcn.image.load", + ctx->v4f32, /* vdata */ + LLVMTypeOf(params[0]), /* coords */ + LLVMTypeOf(params[1]), /* rsrc */ + intrinsic_name, sizeof(intrinsic_name)); + + res = ac_emit_llvm_intrinsic(&ctx->ac, intrinsic_name, ctx->v4f32, +params, 7, AC_FUNC_ATTR_READONLY); + + res = to_integer(ctx, res); + LLVMValueRef four = LLVMConstInt(ctx->i32, 4, false); + LLVMValueRef F = LLVMConstInt(ctx->i32, 0xf, false); + + LLVMValueRef fmask = LLVMBuildExtractElement(ctx->builder, +res, +ctx->i32zero, ""); + + LLVMValueRef sample_index4 = + LLVMBuildMul(ctx->builder, sample_index, four, ""); + LLVMValueRef shifted_fmask = + LLVMBuildLShr(ctx->builder, fmask, sample_index4, ""); + LLVMValueRef final_sample = + LLVMBuildAnd(ctx->builder, shifted_fmask, F, ""); + + /* Don't rewrite the sample index if WORD1.DATA_FORMAT of the FMASK +* resource descriptor is 0 (invalid), +*/ + LLVMValueRef fmask_desc = + LLVMBuildBitCast(ctx->builder, params[1], +ctx->v8i32, ""); + + LLVMValueRef fmask_word1 = + LLVMBuildExtractElement(ctx->builder, fmask_desc, + ctx->i32one, ""); + + LLVMValueRef word1_is_nonzero = + LLVMBuildICmp(ctx->builder, LLVMIntNE, + fmask_word1, ctx->i32zero, ""); + + /* Replace the MSAA sample index. 
*/ + sample_index = + LLVMBuildSelect(ctx->builder, word1_is_nonzero, + final_sample, sample_index, ""); + return sample_index; +} + static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx, nir_intrinsic_instr *instr) { @@ -2456,73 +2545,25 @@ static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
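The nibble lookup described in the FMASK comment of the patch above, and the final select that skips the fixup when WORD1 of the FMASK descriptor is zero, can be checked on the CPU with ordinary integers. This is a sketch for intuition only: fmask_physical_sample() and adjust_sample_index() are hypothetical helper names, and the real code builds the equivalent LLVM IR rather than running C.

```c
#include <assert.h>
#include <stdint.h>

/* Each 4-bit nibble of fmask names the physical sample that backs one
 * logical sample: sample_index = (fmask >> (sample_index * 4)) & 0xF. */
uint32_t fmask_physical_sample(uint32_t fmask, uint32_t sample_index)
{
    assert(sample_index < 8); /* a 32-bit word holds 8 nibbles */
    return (fmask >> (sample_index * 4)) & 0xF;
}

/* Mirrors the select at the end of the patch: keep the original index
 * when the descriptor's second dword is 0 (invalid FMASK descriptor). */
uint32_t adjust_sample_index(uint32_t fmask, uint32_t desc_word1,
                             uint32_t sample_index)
{
    if (desc_word1 == 0)
        return sample_index;
    return fmask_physical_sample(fmask, sample_index);
}
```

With the identity value 0x76543210 every logical sample maps to itself. With 0x1100, the example from the comment, logical samples 0 and 1 read physical sample 0 and logical samples 2 and 3 read physical sample 1.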
Mesa (master): radv: add sample mask input support
Module: Mesa
Branch: master
Commit: bdcbe7c76bba3171f4f4c30b29e21f58c9a62856
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=bdcbe7c76bba3171f4f4c30b29e21f58c9a62856

Author: Dave Airlie
Date: Tue Jan 31 05:30:26 2017 +1000

radv: add sample mask input support

Reviewed-by: Bas Nieuwenhuizen
Signed-off-by: Dave Airlie

---

 src/amd/common/ac_nir_to_llvm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index ca1416d..0cc5810 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -99,6 +99,7 @@ struct nir_to_llvm_context {
 	LLVMValueRef linear_sample, linear_center, linear_centroid;
 	LLVMValueRef front_face;
 	LLVMValueRef ancillary;
+	LLVMValueRef sample_coverage;
 	LLVMValueRef frag_pos[4];
 	LLVMBasicBlockRef continue_block;
@@ -532,7 +533,7 @@ static void create_function(struct nir_to_llvm_context *ctx)
 		arg_types[arg_idx++] = ctx->f32; /* pos w float */
 		arg_types[arg_idx++] = ctx->i32; /* front face */
 		arg_types[arg_idx++] = ctx->i32; /* ancillary */
-		arg_types[arg_idx++] = ctx->f32; /* sample coverage */
+		arg_types[arg_idx++] = ctx->i32; /* sample coverage */
 		arg_types[arg_idx++] = ctx->i32; /* fixed pt */
 		break;
 	default:
@@ -659,6 +660,7 @@ static void create_function(struct nir_to_llvm_context *ctx)
 		ctx->frag_pos[3] = LLVMGetParam(ctx->main_function, arg_idx++);
 		ctx->front_face = LLVMGetParam(ctx->main_function, arg_idx++);
 		ctx->ancillary = LLVMGetParam(ctx->main_function, arg_idx++);
+		ctx->sample_coverage = LLVMGetParam(ctx->main_function, arg_idx++);
 		break;
 	default:
 		unreachable("Shader stage not implemented");
@@ -3115,6 +3117,9 @@ static void visit_intrinsic(struct nir_to_llvm_context *ctx,
 		ctx->shader_info->fs.force_persample = true;
 		result = load_sample_pos(ctx);
 		break;
+	case nir_intrinsic_load_sample_mask_in:
+		result = ctx->sample_coverage;
+		break;
 	case nir_intrinsic_load_front_face:
 		result = ctx->front_face;
 		break;
Mesa (master): radv: fix interpolation at wrong place for offset interp
Module: Mesa
Branch: master
Commit: fc430c391b4be0e92bc9e297aaa260c674648ac2
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=fc430c391b4be0e92bc9e297aaa260c674648ac2

Author: Dave Airlie
Date: Thu Feb 23 14:24:20 2017 +1000

radv: fix interpolation at wrong place for offset interp

The code was interpolating at the offset from the sample, not the
offset from the center. Also fix for persample interpolation modes
we should force the pixel center to be at the sample.

Reviewed-by: Bas Nieuwenhuizen
Signed-off-by: Dave Airlie

---

 src/amd/common/ac_nir_to_llvm.c  | 6 ++++--
 src/amd/vulkan/radv_cmd_buffer.c | 1 -
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index a74b906..ca1416d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2884,10 +2884,12 @@ static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
 		location = INTERP_CENTROID;
 		break;
 	case nir_intrinsic_interp_var_at_sample:
-	case nir_intrinsic_interp_var_at_offset:
 		location = INTERP_SAMPLE;
 		src0 = get_src(ctx, instr->src[0]);
 		break;
+	case nir_intrinsic_interp_var_at_offset:
+		location = INTERP_CENTER;
+		src0 = get_src(ctx, instr->src[0]);
 	default:
 		break;
 	}
@@ -2910,7 +2912,7 @@ static LLVMValueRef visit_interp(struct nir_to_llvm_context *ctx,
 	interp_param = lookup_interp_param(ctx, instr->variables[0]->var->data.interpolation, location);
 	attr_number = LLVMConstInt(ctx->i32, input_index, false);
-	if (location == INTERP_SAMPLE) {
+	if (location == INTERP_SAMPLE || location == INTERP_CENTER) {
 		LLVMValueRef ij_out[2];
 		LLVMValueRef ddxy_out = emit_ddxy_interp(ctx, interp_param);
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 4aa5df6..dd6deef 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -685,7 +685,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 	radeon_set_context_reg(cmd_buffer->cs, R_0286D0_SPI_PS_INPUT_ADDR, ps->config.spi_ps_input_addr);
-	spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(0);
 	radeon_set_context_reg(cmd_buffer->cs, R_0286D8_SPI_PS_IN_CONTROL, S_0286D8_NUM_INTERP(ps->info.fs.num_interp));