Re: [Mesa-dev] [PATCH] GL3: remove radeonsi occurrences in GL 4.2, already specified as "all DONE"
Reviewed-by: Edward O'Callaghan

On 2016-04-30 01:48, Fabio Pedretti wrote:
---
 docs/GL3.txt | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index bb2bb6e..5a6be41 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -148,17 +148,17 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
 GL 4.2, GLSL 4.20 -- all DONE: radeonsi

-  GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600, radeonsi)
+  GL_ARB_texture_compression_bptc                       DONE (i965, nvc0, r600)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
-  GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, radeonsi, softpipe)
+  GL_ARB_shader_atomic_counters                         DONE (i965, nvc0, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
-  GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
-  GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
-  GL_ARB_shader_image_load_store                        DONE (i965, radeonsi, softpipe)
+  GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
+  GL_ARB_base_instance                                  DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
+  GL_ARB_shader_image_load_store                        DONE (i965, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
-  GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
+  GL_ARB_internalformat_query                           DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 22/27] glsl: add helper for comparing arrays in varying packing pass
On 2016-03-31 21:57, Timothy Arceri wrote:
---
 src/compiler/glsl/lower_packed_varyings.cpp | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp b/src/compiler/glsl/lower_packed_varyings.cpp
index ad766bb..6e7a289 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -152,6 +152,31 @@
 using namespace ir_builder;

+/**
+ * If the var is an array check if it matches the array attributes of the
+ * packed var.
+ */
+static bool
+check_for_matching_arrays(ir_variable *packed_var, ir_variable *var)
+{
+   const glsl_type *pt = packed_var->type;
+   const glsl_type *vt = var->type;

I suppose it's OK to assume the call site always does the right thing
with this helper? Either way,
Reviewed-by: Edward O'Callaghan

+   bool array_match = true;
+
+   while (pt->is_array() || vt->is_array()) {
+      if (pt->is_array() != vt->is_array() ||
+          pt->length != vt->length) {
+         array_match = false;
+         break;
+      } else {
+         pt = pt->fields.array;
+         vt = vt->fields.array;
+      }
+   }
+
+   return array_match;
+}
+
 static bool
 needs_lowering(ir_variable *var, bool has_enhanced_layouts,
                bool disable_varying_packing)
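For anyone reading along without the GLSL IR in their head, here is a minimal sketch in plain C of what the helper above appears to check. The `toy_type` struct and the `toy_arrays_match` name are made up for illustration and only stand in for `glsl_type`; the real class carries far more state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for glsl_type: only what the comparison needs. */
struct toy_type {
   bool is_array;                 /* true if this level is an array */
   unsigned length;               /* number of elements at this level */
   const struct toy_type *elem;   /* element type, NULL for non-arrays */
};

/* Walk both types in lockstep. Every array level must exist on both
 * sides and have the same length, otherwise the types do not match. */
static bool
toy_arrays_match(const struct toy_type *pt, const struct toy_type *vt)
{
   while (pt->is_array || vt->is_array) {
      if (pt->is_array != vt->is_array || pt->length != vt->length)
         return false;
      pt = pt->elem;
      vt = vt->elem;
   }
   return true;
}
```

With a scalar, a `[4]` array and a `[2][4]` array built from this struct, `[4]` matches `[4]`, but `[2][4]` does not match `[4]` because the outer lengths differ. Two non-array types always match, since the loop body never runs.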
Re: [Mesa-dev] [PATCH 19/27] glsl: skip location and component packing validation on patch out
Acked-by: Edward O'Callaghan

On 2016-03-31 21:57, Timothy Arceri wrote:
These outputs have a separate location domain from per-vertex outputs
and need to be handled separately. For now just skip validation so we
don't invalidate valid shaders.
---
 src/compiler/glsl/link_varyings.cpp | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index d125a9f..4f57fb2 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -348,8 +348,12 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
    foreach_in_list(ir_instruction, node, producer->ir) {
       ir_variable *const var = node->as_variable();

-      if ((var == NULL) || (var->data.mode != ir_var_shader_out))
-         continue;
+      /* FIXME: We should also validate per patch outputs too rather than just
+       * skipping over them here.
+       */
+      if ((var == NULL) || var->data.patch ||
+          (var->data.mode != ir_var_shader_out))
+         continue;

       if (!var->data.explicit_location ||
           var->data.location < VARYING_SLOT_VAR0)
@@ -432,8 +436,12 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
    foreach_in_list(ir_instruction, node, consumer->ir) {
       ir_variable *const input = node->as_variable();

-      if ((input == NULL) || (input->data.mode != ir_var_shader_in))
-         continue;
+      /* FIXME: We should also validate per patch outputs too rather than just
+       * skipping over them here.
+       */
+      if ((input == NULL) || input->data.patch ||
+          (input->data.mode != ir_var_shader_in))
+         continue;

       if (strcmp(input->name, "gl_Color") == 0 && input->data.used) {
          const ir_variable *const front_color =
Re: [Mesa-dev] [PATCH 27/27] docs: mark ARB_enhanced_layouts as DONE
Acked-by: Edward O'Callaghan

On 2016-03-31 21:58, Timothy Arceri wrote:
---
 docs/GL3.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index f6248da..ede8cf5 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
-  GL_ARB_enhanced_layouts                               in progress (Timothy)
+  GL_ARB_enhanced_layouts                               DONE (all drivers)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
-  - specified vec4-slot component numbers               in progress
+  - specified vec4-slot component numbers               DONE
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
Re: [Mesa-dev] [PATCH 02/27] glsl: allow component qualifier on varying inputs
Reviewed-by: Edward O'Callaghan

On 2016-03-31 21:57, Timothy Arceri wrote:
---
 src/compiler/glsl/ast_type.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index 30c9eff..de3fdcc 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -146,6 +146,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
    input_layout_mask.flags.q.centroid = 1;
    /* Function params can have constant */
    input_layout_mask.flags.q.constant = 1;
+   input_layout_mask.flags.q.explicit_component = 1;
    input_layout_mask.flags.q.explicit_location = 1;
    input_layout_mask.flags.q.flat = 1;
    input_layout_mask.flags.q.in = 1;
Re: [Mesa-dev] [PATCH 4/4] radeonsi: Print a message when scratch allocation fails.
On 2016-04-20 11:46, Nicolai Hähnle wrote:
On 19.04.2016 17:50, Bas Nieuwenhuizen wrote:
Signed-off-by: Bas Nieuwenhuizen
---
 src/gallium/drivers/radeonsi/si_compute.c       | 5 -
 src/gallium/drivers/radeonsi/si_state_shaders.c | 5 -
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index b46a2fe..7d91ac6 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -215,8 +215,11 @@ static bool si_setup_compute_scratch_buffer(struct si_context *sctx,
                 scratch_needed, 256, false, RADEON_DOMAIN_VRAM,
                 RADEON_FLAG_NO_CPU_ACCESS);

-        if (!sctx->compute_scratch_buffer)
+        if (!sctx->compute_scratch_buffer) {
+            fprintf(stderr, "Warning: Failed to allocate the "
+                    "scratch buffer\n");
             return false;
+        }

Here and below, please change the "Warning" into "radeonsi" so
unsuspecting users will be more likely to understand what's going on.
With that changed, the patch is
Reviewed-by: Nicolai Hähnle

Wait, why not use the standard R600_ERR() macro that wraps fprintf() calls?

     }

     if (sctx->compute_scratch_buffer != shader->scratch_bo && scratch_needed) {
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index fef676b..2396b8e 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1692,8 +1692,11 @@ static bool si_update_spi_tmpring_size(struct si_context *sctx)
                 scratch_needed_size, 256, false, RADEON_DOMAIN_VRAM,
                 RADEON_FLAG_NO_CPU_ACCESS);

-        if (!sctx->scratch_buffer)
+        if (!sctx->scratch_buffer) {
+            fprintf(stderr, "Warning: Failed to allocate the "
+                    "scratch buffer\n");
             return false;
+        }

         sctx->emit_scratch_reloc = true;
     }
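To illustrate the point about wrapping fprintf() in one place, here is a rough sketch of such a wrapper. `toy_format_err` and `TOY_DRV_ERR` are hypothetical names, and this is not the real R600_ERR() definition; it only assumes that a driver-wide macro adds a fixed prefix plus the calling function and line.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format "radeonsi: <func>:<line>: <message>" into buf.
 * Returns the number of characters that would have been written. */
static int
toy_format_err(char *buf, size_t size, const char *func, int line,
               const char *fmt, ...)
{
   va_list ap;
   int n = snprintf(buf, size, "radeonsi: %s:%d: ", func, line);

   if (n < 0 || (size_t)n >= size)
      return n;

   va_start(ap, fmt);
   n += vsnprintf(buf + n, size - (size_t)n, fmt, ap);
   va_end(ap);
   return n;
}

/* Hypothetical macro (not R600_ERR): every call site gets the same prefix
 * and its own location without repeating the fprintf boilerplate. */
#define TOY_DRV_ERR(...) \
   do { \
      char toy_msg_[256]; \
      toy_format_err(toy_msg_, sizeof(toy_msg_), __func__, __LINE__, \
                     __VA_ARGS__); \
      fputs(toy_msg_, stderr); \
   } while (0)
```

A call site would then read `TOY_DRV_ERR("Failed to allocate the %s\n", "scratch buffer");`, which keeps the driver name in the message, as requested in the review, without each caller spelling it out.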
Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.
On 2016-04-20 09:29, Bas Nieuwenhuizen wrote:
I retract patches 1 and 2. Large scratch buffers are nice, but the
hardware only supports a 32-bit offset into them.

- Bas

On Wed, Apr 20, 2016 at 12:50 AM, Bas Nieuwenhuizen wrote:
Use the CE suballocator instead of the normal one as the usage is
most similar to the CE, i.e. only read and written on GPU and not
mapped to CPU.

Signed-off-by: Bas Nieuwenhuizen
---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 27 ++-
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 38e0ee6..264789d 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -222,31 +222,24 @@ static void si_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst,
  */
 static void si_cp_dma_realign_engine(struct si_context *sctx, unsigned size)
 {
+

trivial spurious '\n'

     uint64_t va;
     unsigned dma_flags = 0;
     unsigned scratch_size = CP_DMA_ALIGNMENT * 2;
+    unsigned offset;
+    struct r600_resource *tmp_buf;

     assert(size < CP_DMA_ALIGNMENT);

-    /* Use the scratch buffer as the dummy buffer. The 3D engine should be
-     * idle at this point.
-     */
-    if (!sctx->scratch_buffer ||
-        sctx->scratch_buffer->b.b.width0 < scratch_size) {
-        r600_resource_reference(&sctx->scratch_buffer, NULL);
-        sctx->scratch_buffer =
-            si_resource_create_custom(&sctx->screen->b.b,
-                                      PIPE_USAGE_DEFAULT,
-                                      scratch_size);
-        if (!sctx->scratch_buffer)
-            return;
-        sctx->emit_scratch_reloc = true;
-    }
+    u_suballocator_alloc(sctx->ce_suballocator, scratch_size, &offset,
+                         (struct pipe_resource**)&tmp_buf);
+    if (!tmp_buf)
+        return;

-    si_cp_dma_prepare(sctx, &sctx->scratch_buffer->b.b,
-                      &sctx->scratch_buffer->b.b, size, size, &dma_flags);
+    si_cp_dma_prepare(sctx, &tmp_buf->b.b,
+                      &tmp_buf->b.b, size, size, &dma_flags);

-    va = sctx->scratch_buffer->gpu_address;
+    va = tmp_buf->gpu_address + offset;
     si_emit_cp_dma_copy_buffer(sctx, va, va + CP_DMA_ALIGNMENT, size,
                                dma_flags);
 }
--
2.8.0
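The `va = tmp_buf->gpu_address + offset` line above is the part that is easy to miss when moving from a dedicated buffer to a suballocated slice. Below is a toy bump suballocator that shows the idea. All `toy_*` names are hypothetical, and Gallium's actual u_suballocator is more involved than this; treat it as a sketch of the offset arithmetic only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bump suballocator over one parent buffer. */
struct toy_suballocator {
   uint64_t gpu_address;  /* base address of the parent buffer */
   unsigned size;         /* parent buffer size in bytes */
   unsigned alignment;    /* power-of-two alignment of every slice */
   unsigned next;         /* first free byte */
};

/* Reserve `size` bytes. On success, store the slice offset and return true. */
static bool
toy_suballoc(struct toy_suballocator *sa, unsigned size, unsigned *offset)
{
   unsigned start = (sa->next + sa->alignment - 1) & ~(sa->alignment - 1);

   if (start > sa->size || size > sa->size - start)
      return false;

   *offset = start;
   sa->next = start + size;
   return true;
}

/* The address handed to the GPU is the parent base plus the slice offset. */
static uint64_t
toy_slice_address(const struct toy_suballocator *sa, unsigned offset)
{
   return sa->gpu_address + offset;
}
```

Several slices share one parent buffer here, so using the parent's base address alone (as the old scratch-buffer code could) would point every user at the same bytes.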
Re: [Mesa-dev] [PATCH 2/2] radeonsi: Implement ddx/ddy on VI using ds_bpermute
On 2016-04-16 20:20, Marek Olšák wrote:
On Sat, Apr 16, 2016 at 8:04 AM, Michel Dänzer wrote:
On 16.04.2016 14:51, Michel Dänzer wrote:
On 16.04.2016 11:39, Tom Stellard wrote:
The ds_bpermute instruction allows threads to transfer data directly
to or from the vgprs of other threads. These instructions use the lds
hardware to transfer data, but do not read or write lds memory.

DDX BEFORE:                        | DDX AFTER:
                                   |
v_mbcnt_lo_u32_b32_e64 v2, -1, 0   | v_mbcnt_lo_u32_b32_e64 v2, -1, 0
v_mbcnt_hi_u32_b32_e64 v2, -1, v2  | v_mbcnt_hi_u32_b32_e64 v2, -1, v2
v_lshlrev_b32_e32 v4, 2, v2        | v_and_b32_e32 v2, 0x3ffc, v2
v_and_b32_e32 v2, -4, v2           | v_lshlrev_b32_e32 v2, 2, v2
v_lshlrev_b32_e32 v3, 2, v2        | ds_bpermute_b32 v3, v2, v0
s_mov_b32 m0, -1                   | ds_bpermute_b32 v0, v2, v0 offset:4
ds_write_b32 v4, v0                | s_waitcnt lgkmcnt(0)
s_waitcnt lgkmcnt(0)               |
v_or_b32_e32 v0, 1, v2             |
v_lshlrev_b32_e32 v0, 2, v0        |
ds_read_b32 v1, v3                 |
ds_read_b32 v0, v0                 |
s_waitcnt lgkmcnt(0)               |
                                   |
LDS: 1 blocks                      | LDS: 0 blocks

Nice. Were these intrinsics already available in LLVM 3.6? If not, the
old code needs to be kept for backwards compatibility.

I can see now that you're taking care of this for the bpermute
intrinsic, but AFAICT the mbcnt intrinsics were only added in LLVM 3.8.

How do you feel about increasing the requirement to LLVM 3.8 for Mesa git?

+1 from me. Supporting more than two generations of LLVM is a bit much
to carry imho.

Marek
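For those who do not read GCN assembly daily, the "AFTER" column can be modelled on the CPU roughly as below. This is only a sketch under assumptions: a 64-lane wavefront, quads laid out as four consecutive lanes with lanes 0 and 1 being the left and right pixels of the top row, and a final subtraction that is not part of the quoted listing. The `toy_*` names are invented for the example.

```c
#include <stdint.h>

#define TOY_WAVE_SIZE 64

/* Model of a backwards permute: each lane reads the source value of the
 * lane selected by its byte address (lane index * 4), plus an offset. */
static void
toy_bpermute(float dst[TOY_WAVE_SIZE], const uint32_t addr[TOY_WAVE_SIZE],
             const float src[TOY_WAVE_SIZE], uint32_t byte_offset)
{
   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++) {
      uint32_t from = ((addr[lane] + byte_offset) / 4) % TOY_WAVE_SIZE;
      dst[lane] = src[from];
   }
}

/* ddx for every lane: right-hand pixel minus left-hand pixel of the top
 * row of the lane's quad (assumed layout, see the note above). */
static void
toy_ddx(float out[TOY_WAVE_SIZE], const float in[TOY_WAVE_SIZE])
{
   uint32_t addr[TOY_WAVE_SIZE];
   float left[TOY_WAVE_SIZE], right[TOY_WAVE_SIZE];

   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      addr[lane] = (lane & ~3u) << 2;  /* first lane of the quad, in bytes */

   toy_bpermute(left, addr, in, 0);    /* value held by quad lane 0 */
   toy_bpermute(right, addr, in, 4);   /* value held by quad lane 1 */

   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      out[lane] = right[lane] - left[lane];
}
```

No array standing in for LDS memory is needed, which mirrors the "LDS: 0 blocks" line in the comparison: the permute moves register values between lanes directly.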
Re: [Mesa-dev] [PATCH 1/6] gallium: use enums in p_defines.h
Patches 1 & 2 are,
Reviewed-by: Edward O'Callaghan

I'll have to spend some time looking at the others tomorrow.

On 2016-04-16 22:50, Marek Olšák wrote:
From: Marek Olšák

and remove number assignments which are consecutive
---
 src/gallium/include/pipe/p_defines.h | 378 +++
 1 file changed, 205 insertions(+), 173 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 1aef21d..6bb180d 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -51,49 +51,56 @@ enum pipe_error
    /* TODO */
 };

+enum {
+   PIPE_BLENDFACTOR_ONE = 1,
+   PIPE_BLENDFACTOR_SRC_COLOR,
+   PIPE_BLENDFACTOR_SRC_ALPHA,
+   PIPE_BLENDFACTOR_DST_ALPHA,
+   PIPE_BLENDFACTOR_DST_COLOR,
+   PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE,
+   PIPE_BLENDFACTOR_CONST_COLOR,
+   PIPE_BLENDFACTOR_CONST_ALPHA,
+   PIPE_BLENDFACTOR_SRC1_COLOR,
+   PIPE_BLENDFACTOR_SRC1_ALPHA,
+
+   PIPE_BLENDFACTOR_ZERO = 0x11,
+   PIPE_BLENDFACTOR_INV_SRC_COLOR,
+   PIPE_BLENDFACTOR_INV_SRC_ALPHA,
+   PIPE_BLENDFACTOR_INV_DST_ALPHA,
+   PIPE_BLENDFACTOR_INV_DST_COLOR,
+
+   PIPE_BLENDFACTOR_INV_CONST_COLOR = 0x17,
+   PIPE_BLENDFACTOR_INV_CONST_ALPHA,
+   PIPE_BLENDFACTOR_INV_SRC1_COLOR,
+   PIPE_BLENDFACTOR_INV_SRC1_ALPHA,
+};
+
+enum {
+   PIPE_BLEND_ADD,
+   PIPE_BLEND_SUBTRACT,
+   PIPE_BLEND_REVERSE_SUBTRACT,
+   PIPE_BLEND_MIN,
+   PIPE_BLEND_MAX,
+};

-#define PIPE_BLENDFACTOR_ONE                 0x1
-#define PIPE_BLENDFACTOR_SRC_COLOR           0x2
-#define PIPE_BLENDFACTOR_SRC_ALPHA           0x3
-#define PIPE_BLENDFACTOR_DST_ALPHA           0x4
-#define PIPE_BLENDFACTOR_DST_COLOR           0x5
-#define PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE  0x6
-#define PIPE_BLENDFACTOR_CONST_COLOR         0x7
-#define PIPE_BLENDFACTOR_CONST_ALPHA         0x8
-#define PIPE_BLENDFACTOR_SRC1_COLOR          0x9
-#define PIPE_BLENDFACTOR_SRC1_ALPHA          0x0A
-#define PIPE_BLENDFACTOR_ZERO                0x11
-#define PIPE_BLENDFACTOR_INV_SRC_COLOR       0x12
-#define PIPE_BLENDFACTOR_INV_SRC_ALPHA       0x13
-#define PIPE_BLENDFACTOR_INV_DST_ALPHA       0x14
-#define PIPE_BLENDFACTOR_INV_DST_COLOR       0x15
-#define PIPE_BLENDFACTOR_INV_CONST_COLOR     0x17
-#define PIPE_BLENDFACTOR_INV_CONST_ALPHA     0x18
-#define PIPE_BLENDFACTOR_INV_SRC1_COLOR      0x19
-#define PIPE_BLENDFACTOR_INV_SRC1_ALPHA      0x1A
-
-#define PIPE_BLEND_ADD               0
-#define PIPE_BLEND_SUBTRACT          1
-#define PIPE_BLEND_REVERSE_SUBTRACT  2
-#define PIPE_BLEND_MIN               3
-#define PIPE_BLEND_MAX               4
-
-#define PIPE_LOGICOP_CLEAR            0
-#define PIPE_LOGICOP_NOR              1
-#define PIPE_LOGICOP_AND_INVERTED     2
-#define PIPE_LOGICOP_COPY_INVERTED    3
-#define PIPE_LOGICOP_AND_REVERSE      4
-#define PIPE_LOGICOP_INVERT           5
-#define PIPE_LOGICOP_XOR              6
-#define PIPE_LOGICOP_NAND             7
-#define PIPE_LOGICOP_AND              8
-#define PIPE_LOGICOP_EQUIV            9
-#define PIPE_LOGICOP_NOOP             10
-#define PIPE_LOGICOP_OR_INVERTED      11
-#define PIPE_LOGICOP_COPY             12
-#define PIPE_LOGICOP_OR_REVERSE       13
-#define PIPE_LOGICOP_OR               14
-#define PIPE_LOGICOP_SET              15
+enum {
+   PIPE_LOGICOP_CLEAR,
+   PIPE_LOGICOP_NOR,
+   PIPE_LOGICOP_AND_INVERTED,
+   PIPE_LOGICOP_COPY_INVERTED,
+   PIPE_LOGICOP_AND_REVERSE,
+   PIPE_LOGICOP_INVERT,
+   PIPE_LOGICOP_XOR,
+   PIPE_LOGICOP_NAND,
+   PIPE_LOGICOP_AND,
+   PIPE_LOGICOP_EQUIV,
+   PIPE_LOGICOP_NOOP,
+   PIPE_LOGICOP_OR_INVERTED,
+   PIPE_LOGICOP_COPY,
+   PIPE_LOGICOP_OR_REVERSE,
+   PIPE_LOGICOP_OR,
+   PIPE_LOGICOP_SET,
+};

 #define PIPE_MASK_R  0x1
 #define PIPE_MASK_G  0x2
@@ -110,19 +117,23 @@ enum pipe_error
  * Inequality functions. Used for depth test, stencil compare, alpha
  * test, shadow compare, etc.
  */
-#define PIPE_FUNC_NEVER    0
-#define PIPE_FUNC_LESS     1
-#define PIPE_FUNC_EQUAL    2
-#define PIPE_FUNC_LEQUAL   3
-#define PIPE_FUNC_GREATER  4
-#define PIPE_FUNC_NOTEQUAL 5
-#define PIPE_FUNC_GEQUAL   6
-#define PIPE_FUNC_ALWAYS   7
+enum {
+   PIPE_FUNC_NEVER,
+   PIPE_FUNC_LESS,
+   PIPE_FUNC_EQUAL,
+   PIPE_FUNC_LEQUAL,
+   PIPE_FUNC_GREATER,
+   PIPE_FUNC_NOTEQUAL,
+   PIPE_FUNC_GEQUAL,
+   PIPE_FUNC_ALWAYS,
+};

 /** Polygon fill mode */
-#define PIPE_POLYGON_MODE_FILL  0
-#define PIPE_POLYGON_MODE_LINE  1
-#define PIPE_POLYGON_MODE_POINT 2
+enum {
+   PIPE_POLYGON_MODE_FILL,
+   PIPE_POLYGON_MODE_LINE,
+   PIPE_POLYGON_MODE_POINT,
+};

 /** Polygon face specification, eg for culling */
 #define PIPE_FACE_NONE           0
@@ -131,60 +142,72 @@ enum pipe_error
 #define PIPE_FACE_FRONT_AND_BACK (PIPE_FACE_FRONT | PIPE_FACE_BACK)

 /** Stencil ops */
-#define PIPE_STENCIL_OP_KEEP       0
-#define PIPE_STENCIL_OP_ZERO       1
-#define PIPE_STENCIL_OP_REPLACE    2
-#define PIPE_STENCIL_OP_INCR       3
-#define PIPE_STENCIL_OP_DECR       4
-#define PIPE_STENCIL_OP_INCR_WRAP  5
-#define PI
Re: [Mesa-dev] [PATCH 1/2] gallium/radeon: use enums in r600_query.h
This series is,
Reviewed-by: Edward O'Callaghan

On 2016-04-16 22:50, Marek Olšák wrote:
From: Marek Olšák
---
 src/gallium/drivers/radeon/r600_query.h | 43 ++---
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.h b/src/gallium/drivers/radeon/r600_query.h
index 9f3a917..6bb9374 100644
--- a/src/gallium/drivers/radeon/r600_query.h
+++ b/src/gallium/drivers/radeon/r600_query.h
@@ -40,26 +40,29 @@ struct r600_query;
 struct r600_query_hw;
 struct r600_resource;

-#define R600_QUERY_DRAW_CALLS           (PIPE_QUERY_DRIVER_SPECIFIC + 0)
-#define R600_QUERY_REQUESTED_VRAM       (PIPE_QUERY_DRIVER_SPECIFIC + 1)
-#define R600_QUERY_REQUESTED_GTT        (PIPE_QUERY_DRIVER_SPECIFIC + 2)
-#define R600_QUERY_BUFFER_WAIT_TIME     (PIPE_QUERY_DRIVER_SPECIFIC + 3)
-#define R600_QUERY_NUM_CS_FLUSHES       (PIPE_QUERY_DRIVER_SPECIFIC + 4)
-#define R600_QUERY_NUM_BYTES_MOVED      (PIPE_QUERY_DRIVER_SPECIFIC + 5)
-#define R600_QUERY_VRAM_USAGE           (PIPE_QUERY_DRIVER_SPECIFIC + 6)
-#define R600_QUERY_GTT_USAGE            (PIPE_QUERY_DRIVER_SPECIFIC + 7)
-#define R600_QUERY_GPU_TEMPERATURE      (PIPE_QUERY_DRIVER_SPECIFIC + 8)
-#define R600_QUERY_CURRENT_GPU_SCLK     (PIPE_QUERY_DRIVER_SPECIFIC + 9)
-#define R600_QUERY_CURRENT_GPU_MCLK     (PIPE_QUERY_DRIVER_SPECIFIC + 10)
-#define R600_QUERY_GPU_LOAD             (PIPE_QUERY_DRIVER_SPECIFIC + 11)
-#define R600_QUERY_NUM_COMPILATIONS     (PIPE_QUERY_DRIVER_SPECIFIC + 12)
-#define R600_QUERY_NUM_SHADERS_CREATED  (PIPE_QUERY_DRIVER_SPECIFIC + 13)
-#define R600_QUERY_GPIN_ASIC_ID         (PIPE_QUERY_DRIVER_SPECIFIC + 14)
-#define R600_QUERY_GPIN_NUM_SIMD        (PIPE_QUERY_DRIVER_SPECIFIC + 15)
-#define R600_QUERY_GPIN_NUM_RB          (PIPE_QUERY_DRIVER_SPECIFIC + 16)
-#define R600_QUERY_GPIN_NUM_SPI         (PIPE_QUERY_DRIVER_SPECIFIC + 17)
-#define R600_QUERY_GPIN_NUM_SE          (PIPE_QUERY_DRIVER_SPECIFIC + 18)
-#define R600_QUERY_FIRST_PERFCOUNTER    (PIPE_QUERY_DRIVER_SPECIFIC + 100)
+enum {
+   R600_QUERY_DRAW_CALLS = PIPE_QUERY_DRIVER_SPECIFIC,
+   R600_QUERY_REQUESTED_VRAM,
+   R600_QUERY_REQUESTED_GTT,
+   R600_QUERY_BUFFER_WAIT_TIME,
+   R600_QUERY_NUM_CS_FLUSHES,
+   R600_QUERY_NUM_BYTES_MOVED,
+   R600_QUERY_VRAM_USAGE,
+   R600_QUERY_GTT_USAGE,
+   R600_QUERY_GPU_TEMPERATURE,
+   R600_QUERY_CURRENT_GPU_SCLK,
+   R600_QUERY_CURRENT_GPU_MCLK,
+   R600_QUERY_GPU_LOAD,
+   R600_QUERY_NUM_COMPILATIONS,
+   R600_QUERY_NUM_SHADERS_CREATED,
+   R600_QUERY_GPIN_ASIC_ID,
+   R600_QUERY_GPIN_NUM_SIMD,
+   R600_QUERY_GPIN_NUM_RB,
+   R600_QUERY_GPIN_NUM_SPI,
+   R600_QUERY_GPIN_NUM_SE,
+
+   R600_QUERY_FIRST_PERFCOUNTER = PIPE_QUERY_DRIVER_SPECIFIC + 100,
+};

 enum {
    R600_QUERY_GROUP_GPIN = 0,
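A small standalone check of why the explicit `+ N` offsets can be dropped when converting to an enum, for anyone wanting to convince themselves. The `TOY_*` names and the base value 256 are placeholders for this sketch; the real PIPE_QUERY_DRIVER_SPECIFIC is defined in p_defines.h and may differ.

```c
/* Hypothetical base value, standing in for PIPE_QUERY_DRIVER_SPECIFIC. */
#define TOY_QUERY_DRIVER_SPECIFIC 256

enum toy_query {
   TOY_QUERY_DRAW_CALLS = TOY_QUERY_DRIVER_SPECIFIC, /* base + 0 */
   TOY_QUERY_REQUESTED_VRAM,                         /* base + 1 */
   TOY_QUERY_REQUESTED_GTT,                          /* base + 2 */

   /* An explicit initializer restarts the numbering from that value. */
   TOY_QUERY_FIRST_PERFCOUNTER = TOY_QUERY_DRIVER_SPECIFIC + 100,
   TOY_QUERY_SECOND_PERFCOUNTER,                     /* base + 101 */
};

/* Compile-time check: enumerators without initializers count up by one. */
_Static_assert(TOY_QUERY_REQUESTED_GTT == TOY_QUERY_DRIVER_SPECIFIC + 2,
               "enumerators without initializers count up by one");
```

The one thing to watch in a conversion like this is a gap in the old numbering: wherever the #define values were not consecutive, the enum needs an explicit initializer, as with the perfcounter entry above.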
Re: [Mesa-dev] [PATCH 0/4] Implement ARB_clear_texture for radeon drivers
Hi Jakob,

Unfortunately ARB_clear_texture is not as straightforward as taking the
implementation from nouveau. You will notice I sent an essentially
identical series to the ML last year as an RFC. You can find that work
sitting around on my GitHub here:
https://github.com/victoredwardocallaghan/mesa-GLwork

Also, I am somewhat surprised that the arb_clear_texture-float piglit
passed for you; what hardware did you test that on exactly?

In any case, as Ilia correctly pointed out, this implementation relies
specifically on nouveau's somewhat special local version of the usual
gallium helpers, and thus it isn't correct under usual conditions. More
precisely, changing the surface condition after the create_surface
callback is considered illegal under the usual gallium helpers and
framework.

Unfortunately this series is,
Nacked-by: Edward O'Callaghan

Kind Regards,
Edward.

On 2016-04-16 02:33, Jakob Sinclair wrote:
This series of patches implements ARB_clear_texture for r600 and
radeonsi. I only tested this with the radeonsi driver and just assumed
it would work on the r600 driver. If someone could test this with the
r600 driver it would be wonderful. This implementation was mostly based
on the nouveau implementation of the same function.

I don't have push access so someone reviewing this can push it.

Regards
Jakob Sinclair

Jakob Sinclair (4):
  gallium/radeon: add clear_texture function
  gallium/radeonsi: enable ARB_clear_texture
  gallium/r600: enable ARB_clear_texture
  docs/GL3.txt: mark ARB_clear_texture as done for r600 and radeonsi

 docs/GL3.txt                              |  2 +-
 src/gallium/drivers/r600/r600_pipe.c      |  2 +-
 src/gallium/drivers/radeon/r600_texture.c | 72 +++
 src/gallium/drivers/radeonsi/si_pipe.c    |  2 +-
 4 files changed, 75 insertions(+), 3 deletions(-)
Re: [Mesa-dev] [PATCH v2 01/20] radeonsi: lower compute shader arguments
Patches 2-4, 7-8, 12-14 & 17 are all:
Reviewed-by: Edward O'Callaghan

The series was:
Tested-by: Edward O'Callaghan

On 2016-04-14 05:29, Bas Nieuwenhuizen wrote:
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
---
 src/gallium/drivers/radeonsi/si_shader.c | 41
 src/gallium/drivers/radeonsi/si_shader.h |  7 ++
 2 files changed, 48 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index c58467d..1ccdcac 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1282,6 +1282,36 @@ static void declare_system_value(
         value = get_primitive_id(&radeon_bld->soa.bld_base, 0);
         break;

+    case TGSI_SEMANTIC_GRID_SIZE:
+        value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_GRID_SIZE);
+        break;
+
+    case TGSI_SEMANTIC_BLOCK_SIZE:
+    {
+        LLVMValueRef values[3];
+        unsigned i;
+        unsigned *properties = ctx->shader->selector->info.properties;
+        unsigned sizes[3] = {
+            properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH],
+            properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT],
+            properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH]
+        };
+
+        for (i = 0; i < 3; ++i)
+            values[i] = lp_build_const_int32(gallivm, sizes[i]);
+
+        value = lp_build_gather_values(gallivm, values, 3);
+        break;
+    }
+
+    case TGSI_SEMANTIC_BLOCK_ID:
+        value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_BLOCK_ID);
+        break;
+
+    case TGSI_SEMANTIC_THREAD_ID:
+        value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_THREAD_ID);
+        break;
+
     default:
         assert(!"unknown system value");
         return;
@@ -4823,6 +4853,14 @@ static void create_function(struct si_shader_context *ctx)
         }
         break;

+    case TGSI_PROCESSOR_COMPUTE:
+        params[SI_PARAM_GRID_SIZE] = v3i32;
+        params[SI_PARAM_BLOCK_ID] = v3i32;
+        last_sgpr = SI_PARAM_BLOCK_ID;
+
+        params[SI_PARAM_THREAD_ID] = v3i32;
+        num_params = SI_PARAM_THREAD_ID + 1;
+        break;
     default:
         assert(0 && "unimplemented shader");
         return;
@@ -5600,6 +5638,7 @@ void si_dump_shader_key(unsigned shader, union si_shader_key *key, FILE *f)
         break;

     case PIPE_SHADER_GEOMETRY:
+    case PIPE_SHADER_COMPUTE:
         break;

     case PIPE_SHADER_FRAGMENT:
@@ -5784,6 +5823,8 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
         else
             bld_base->emit_epilogue = si_llvm_return_fs_outputs;
         break;
+    case TGSI_PROCESSOR_COMPUTE:
+        break;
     default:
         assert(!"Unsupported shader type");
         return -1;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 013c8a2..5043d43 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -91,6 +91,7 @@ struct radeon_shader_reloc;
 #define SI_SGPR_TCS_OUT_LAYOUT   11  /* TCS & TES only */
 #define SI_SGPR_TCS_IN_LAYOUT    12  /* TCS only */
 #define SI_SGPR_ALPHA_REF        10  /* PS only */
+#define SI_SGPR_GRID_SIZE        10  /* CS only */

 #define SI_VS_NUM_USER_SGPR      15  /* API VS */
 #define SI_ES_NUM_USER_SGPR      14  /* API VS */
@@ -100,6 +101,7 @@ struct radeon_shader_reloc;
 #define SI_GS_NUM_USER_SGPR      10
 #define SI_GSCOPY_NUM_USER_SGPR  4
 #define SI_PS_NUM_USER_SGPR      11
+#define SI_CS_NUM_USER_SGPR      13

 /* LLVM function parameter indices */
 #define SI_PARAM_RW_BUFFERS      0
@@ -173,6 +175,11 @@ struct radeon_shader_reloc;
 #define SI_PARAM_SAMPLE_COVERAGE 21
 #define SI_PARAM_POS_FIXED_PT    22

+/* CS only parameters */
+#define SI_PARAM_GRID_SIZE       5
+#define SI_PARAM_BLOCK_ID        6
+#define SI_PARAM_THREAD_ID       7
+
 #define SI_NUM_PARAMS (SI_PARAM_POS_FIXED_PT + 9) /* +8 for COLOR[0..1] */

 struct si_shader;
Re: [Mesa-dev] [PATCH 1/3] glsl: removing double semi-colons
This series is,
Reviewed-by: Edward O'Callaghan

On 2016-04-14 02:43, Jakob Sinclair wrote:
Trivial change. Removing unnecessary semi-colons from the code.

I don't have push access so someone reviewing this can push it.

Signed-off-by: Jakob Sinclair
---
 src/compiler/glsl/ast_function.cpp      | 2 +-
 src/compiler/glsl/ir_rvalue_visitor.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ast_function.cpp b/src/compiler/glsl/ast_function.cpp
index db68d5d..f50c7bf 100644
--- a/src/compiler/glsl/ast_function.cpp
+++ b/src/compiler/glsl/ast_function.cpp
@@ -1690,7 +1690,7 @@ process_record_constructor(exec_list *instructions,
                           constructor_type->fields.structure[i].name,
                           ir->type->name,
                           constructor_type->fields.structure[i].type->name);
-         return ir_rvalue::error_value(ctx);;
+         return ir_rvalue::error_value(ctx);
       }

       node = node->next;
diff --git a/src/compiler/glsl/ir_rvalue_visitor.cpp b/src/compiler/glsl/ir_rvalue_visitor.cpp
index 6ab6cf0..addcc68 100644
--- a/src/compiler/glsl/ir_rvalue_visitor.cpp
+++ b/src/compiler/glsl/ir_rvalue_visitor.cpp
@@ -146,7 +146,7 @@ ir_rvalue_base_visitor::rvalue_visit(ir_discard *ir)
 ir_visitor_status
 ir_rvalue_base_visitor::rvalue_visit(ir_return *ir)
 {
-   handle_rvalue(&ir->value);;
+   handle_rvalue(&ir->value);
    return visit_continue;
 }
Re: [Mesa-dev] ARB_framebuffer_no_attachments for llvm and soft pipes
On 2016-04-11 22:27, Roland Scheidegger wrote:
On 10.04.2016 at 09:41, Edward O'Callaghan wrote:
All the piglits pass for these two as-is. However, some of the piglits
require SSBO support to run. I can't see why anything would actually
fail, but I thought I would make note of it in case someone felt this
patch should be held back until SSBO support is in both pipe drivers.
If not, we should be golden to flick these on too.

Edward O'Callaghan (1):
  llvmpipe,softpipe: Enable ARB_framebuffer_no_attachments

I'm not sure this is really a good idea (at least for llvmpipe). At
least the number of layers used internally is wrong (fb_max_layer is
going to be ~0 as it is always derived from the attached buffers). I
suppose though it might not actually matter. But I'm not sure if
there's really any point in this extension without images - while I
initially thought it wouldn't be able to do anything this isn't quite
true as there is indeed one side effect of fs possible even without
images, which is query results (such as occlusion queries, which is
what piglit uses). I think though just about all practical use cases
really would require image support, so I'm not convinced exposing this
just because we can is worth it. But OTOH why not...

That was essentially how I rationalized it also.

Reviewed-by: Roland Scheidegger

I haven't applied for commit access yet, so if you still want it you'll
have to merge it.

Kind Regards,
Edward.
Re: [Mesa-dev] [PATCH 1/2] tgsi: fix buffer overflow
This series is,
Reviewed-by: Edward O'Callaghan

On 2016-04-13 11:06, Thomas Hindoe Paaboel Andersen wrote:
Increase r to four channels as rgba is written to it
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index fb51051..41dd0f0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -4011,7 +4011,7 @@ static void
 exec_atomop_buf(struct tgsi_exec_machine *mach,
                 const struct tgsi_full_instruction *inst)
 {
-   union tgsi_exec_channel r[3];
+   union tgsi_exec_channel r[4];
    union tgsi_exec_channel value[4], value2[4];
    float rgba[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
    float rgba2[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
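As a side note on this class of bug, one way to make it harder to reintroduce is to size the destination with the same constant that drives the copy loop. The sketch below uses invented `TOY_*` names and only assumes four channels and four pixels per quad; it is not the actual tgsi_exec.c code.

```c
#define TOY_NUM_CHANNELS 4  /* r, g, b, a */
#define TOY_QUAD_SIZE    4  /* pixels processed together */

/* Hypothetical stand-in for union tgsi_exec_channel. */
union toy_channel {
   float f[TOY_QUAD_SIZE];
   unsigned u[TOY_QUAD_SIZE];
};

/* Copy all four channels. The destination is declared with the same
 * constant as the loop bound, so it cannot silently be one short. */
static void
toy_store_rgba(union toy_channel r[TOY_NUM_CHANNELS],
               float rgba[TOY_NUM_CHANNELS][TOY_QUAD_SIZE])
{
   for (unsigned chan = 0; chan < TOY_NUM_CHANNELS; chan++)
      for (unsigned pix = 0; pix < TOY_QUAD_SIZE; pix++)
         r[chan].f[pix] = rgba[chan][pix];
}
```

A caller would declare `union toy_channel r[TOY_NUM_CHANNELS];` rather than a literal 3 or 4, which is the same fix as above expressed through the named constant.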
Re: [Mesa-dev] [PATCH] util: Fix race condition on libgcrypt initialization
Reviewed-by: Edward O'Callaghan

On 2016-04-13 08:10, Mark Janes wrote:
Fixes intermittent Vulkan CTS failures within the test groups:
dEQP-VK.api.object_management.multithreaded_per_thread_device
dEQP-VK.api.object_management.multithreaded_per_thread_resources
dEQP-VK.api.object_management.multithreaded_shared_resources

Signed-off-by: Mark Janes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904
---
 src/util/mesa-sha1.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/util/mesa-sha1.c b/src/util/mesa-sha1.c
index faa1c87..ca6b89b 100644
--- a/src/util/mesa-sha1.c
+++ b/src/util/mesa-sha1.c
@@ -175,21 +175,24 @@ _mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20])

 #elif defined(HAVE_SHA1_IN_LIBGCRYPT) /* Use libgcrypt for SHA1 */

 #include
+#include "c11/threads.h"
+
+static void _mesa_libgcrypt_init(void)
+{
+   if (!gcry_check_version(NULL))
+      return NULL;
+   gcry_control(GCRYCTL_DISABLE_SECMEM, 0);
+   gcry_control(GCRYCTL_INITIALIZATION_FINISHED, 0);
+}

 struct mesa_sha1 *
 _mesa_sha1_init(void)
 {
-   static int init;
+   static once_flag flag = ONCE_FLAG_INIT;
    gcry_md_hd_t h;
    gcry_error_t err;

-   if (!init) {
-      if (!gcry_check_version(NULL))
-         return NULL;
-      gcry_control(GCRYCTL_DISABLE_SECMEM, 0);
-      gcry_control(GCRYCTL_INITIALIZATION_FINISHED, 0);
-      init = 1;
-   }
+   call_once(&flag, _mesa_libgcrypt_init);

    err = gcry_md_open(&h, GCRY_MD_SHA1, 0);
    if (err)
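For context on the race: with a plain `static int init` guard, two threads can both observe `init == 0` and both run the setup. The sketch below shows the call_once pattern in isolation. It assumes the C library ships C11 `<threads.h>` (Mesa uses its own "c11/threads.h" wrapper instead), and the `toy_*` names are made up. Note that the routine passed to call_once() returns void, so the sketch has nothing to return from it.

```c
#include <threads.h>

static int toy_init_count;   /* how many times the init routine ran */

/* One-time global setup would go here. call_once() takes a void(void)
 * routine, so a failure cannot be reported through a return value. */
static void
toy_library_init(void)
{
   toy_init_count++;
}

/* Every caller goes through call_once(); the flag guarantees the init
 * routine runs exactly once, and other callers wait until it finishes. */
static int
toy_hash_begin(void)
{
   static once_flag flag = ONCE_FLAG_INIT;

   call_once(&flag, toy_library_init);
   return toy_init_count;
}
```

Calling `toy_hash_begin()` any number of times, from any number of threads, should leave `toy_init_count` at 1. On older glibc versions this may need linking with -pthread.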
Re: [Mesa-dev] [PATCH] radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2
Reviewed-by: Edward O'Callaghan

On 2016-04-13 07:42, Nicolai Hähnle wrote:

From: Nicolai Hähnle

This is the last necessary bit for OpenGL 4.2 support. All driver-specific
functionality has already been implemented as part of extensions.
---
 docs/relnotes/11.3.0.html              | 7 ---
 src/gallium/drivers/radeonsi/si_pipe.c | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/relnotes/11.3.0.html b/docs/relnotes/11.3.0.html
index 9860ab0..1815cfc 100644
--- a/docs/relnotes/11.3.0.html
+++ b/docs/relnotes/11.3.0.html
@@ -22,11 +22,11 @@
 People who are concerned with stability and reliability should stick with
 a previous release or wait for Mesa 11.3.1.

-Mesa 11.3.0 implements the OpenGL 4.1 API, but the version reported by
+Mesa 11.3.0 implements the OpenGL 4.2 API, but the version reported by
 glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
 glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
-Some drivers don't support all the features required in OpenGL 4.1. OpenGL
-4.1 is only available if requested at context creation
+Some drivers don't support all the features required in OpenGL 4.2. OpenGL
+4.2 is only available if requested at context creation
 because compatibility contexts are not supported.

@@ -44,6 +44,7 @@
 Note: some of the new features are only available with certain drivers.

+OpenGL 4.2 on radeonsi
 GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi
 GL_ARB_internalformat_query2 on all drivers
 GL_ARB_robust_buffer_access_behavior on radeonsi

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 6b4b3d2..1dd7338 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -336,7 +336,8 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 		return 4;

 	case PIPE_CAP_GLSL_FEATURE_LEVEL:
-		return HAVE_LLVM >= 0x0307 ? 410 : 330;
+		return HAVE_LLVM >= 0x0309 ? 420 :
+		       HAVE_LLVM >= 0x0307 ? 410 : 330;

 	case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
 		return MIN2(sscreen->b.info.vram_size, 0x);
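For readers skimming the si_pipe.c hunk above, the nested conditional can be read as a small lookup on the LLVM version. The following is only a sketch: glsl_feature_level() is a hypothetical helper that takes the version at run time, whereas the patch tests the compile-time HAVE_LLVM macro. It assumes the 0xMMmm encoding seen in the diff (0x0309 for LLVM 3.9).

```c
#include <assert.h>

/* Hypothetical run-time stand-in for the HAVE_LLVM checks in the patch.
 * llvm_version is assumed to be encoded as 0xMMmm (0x0307 == LLVM 3.7). */
static int glsl_feature_level(unsigned llvm_version)
{
    return llvm_version >= 0x0309 ? 420 :
           llvm_version >= 0x0307 ? 410 : 330;
}
```

So a build against LLVM 3.8 would still report GLSL 4.10, and only 3.9 or newer gets 4.20.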
Re: [Mesa-dev] [PATCH 0/3] R600-GCN: Improving performance on APUs & IGPs
Reviewed-by: Edward O'Callaghan

Thanks for working on this,
Edward.

On 2016-04-12 05:02, Marek Olšák wrote:

Hi,

This disables buffer moves between VRAM and GTT by setting both of them as preferred heaps for APUs and IGPs.

Allocations go to VRAM if there is free space. If not, they go to GTT. If a buffer is evicted from VRAM to GTT, it will stay there. If it's evicted from GTT to swap, it can later be moved to either VRAM or GTT (whichever has free space).

Tonga with 128 MB VRAM (decreased for testing) is 2x faster in Heaven (1600x900 Ultra noAA) if I force has_dedicated_vram=false in Mesa, leading to literally 0 buffer evictions.

Please review.

Marek
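The placement rule described in the cover letter ("VRAM if there is free space, otherwise GTT") can be modelled in a few lines. This is a toy sketch only: struct heap_state and place_buffer() are hypothetical names, and the real placement decision is not made by code like this in Mesa. It just illustrates the fallback order when both heaps are preferred.

```c
#include <assert.h>
#include <stddef.h>

enum heap { HEAP_VRAM, HEAP_GTT };

/* Hypothetical bookkeeping: free bytes left in each heap. */
struct heap_state {
    size_t vram_free;
    size_t gtt_free;
};

/* Try VRAM first, fall back to GTT. Returns the chosen heap,
 * or -1 if neither heap has room for the buffer. */
static int place_buffer(struct heap_state *h, size_t size)
{
    if (size <= h->vram_free) {
        h->vram_free -= size;
        return HEAP_VRAM;
    }
    if (size <= h->gtt_free) {
        h->gtt_free -= size;
        return HEAP_GTT;
    }
    return -1;
}
```

Once a buffer has landed in GTT under this scheme nothing tries to move it back, which is where the "0 buffer evictions" observation comes from.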
Re: [Mesa-dev] [PATCH 0/5] R600, GCN: Guard Band support
I didn't see anything obviously wrong, so:

Reviewed-by: Edward O'Callaghan

But I have some general questions about the guard band. I'm not sure if this is the right place, but I'll just ask anyway. From my somewhat naive understanding, a guard band can lead to small gaps between polygons. In this implementation, what exactly happens at the screen bounds, where a clipped triangle could wind up with different edge slopes? Does this clip to the guard band edge, or does it clip to the screen edge, and if so, what is the pixel cost of that on SI?

On 2016-04-11 08:34, Marek Olšák wrote:

Hi,

This patch series adds Guard Band support into r600g and radeonsi.

It first implements the Guard Band in radeonsi, then it moves all radeonsi scissor & viewport code into gallium/radeon, and then r600g is switched to it and its original scissor & viewport code is deleted. The differences between the R600 and GCN code are almost none.

This should improve performance if clipping is the bottleneck.

Grigori Goronzy started this originally.

Please review.

Marek
Re: [Mesa-dev] [PATCH] r600g: fix typo in r600 register definitions
Acked-by: Edward O'Callaghan

On 2016-04-09 09:12, Marek Olšák wrote:

From: Marek Olšák

---
 src/gallium/drivers/r600/r600d.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600d.h b/src/gallium/drivers/r600/r600d.h
index 3d223ed..ef99573 100644
--- a/src/gallium/drivers/r600/r600d.h
+++ b/src/gallium/drivers/r600/r600d.h
@@ -780,7 +780,7 @@
 #define S_028D0C_STENCIL_COMPRESS_DISABLE(x)      (((x) & 0x1) << 5)
 #define S_028D0C_DEPTH_COMPRESS_DISABLE(x)        (((x) & 0x1) << 6)
 #define S_028D0C_COPY_CENTROID(x)                 (((x) & 0x1) << 7)
-#define S_028D0C_COPY_SAMPLE(x)                   (((x) & 0x1) << 8)
+#define S_028D0C_COPY_SAMPLE(x)                   (((x) & 0x03) << 8)
 #define S_028D0C_R700_PERFECT_ZPASS_COUNTS(x)     (((x) & 0x1) << 15)
 #define S_028D0C_CONSERVATIVE_Z_EXPORT(x)         (((x) & 0x03) << 13)
 #define G_028D0C_CONSERVATIVE_Z_EXPORT(x)         (((x) >> 13) & 0x03)
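To see why the mask matters, here is a small sketch of the two variants of the field macro. COPY_SAMPLE_OLD and COPY_SAMPLE_NEW are hypothetical names used only for this comparison; they mirror the before and after definitions of S_028D0C_COPY_SAMPLE in the diff, with a cast to uint32_t added for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Before the patch: only bit 0 of x survives the mask. */
#define COPY_SAMPLE_OLD(x) (((uint32_t)(x) & 0x1) << 8)

/* After the patch: a two-bit field starting at bit 8. */
#define COPY_SAMPLE_NEW(x) (((uint32_t)(x) & 0x03) << 8)

/* Convenience wrapper so the packed value can be inspected. */
static uint32_t pack_copy_sample(unsigned sample)
{
    return COPY_SAMPLE_NEW(sample);
}
```

With the old mask a sample index of 2 packs to 0 and an index of 3 packs to the same bits as 1, so the typo silently dropped the upper bit of the field.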
Re: [Mesa-dev] [PATCH 1/7] gallium/radeon: move pipeline stat context flags to common code
This series is, Reviewed-by: Edward O'Callaghan This definitely makes for a good cleanup, I was wondering about all the manual stuff myself.. On 2016-04-09 09:12, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeon/r600_pipe_common.h | 5 - src/gallium/drivers/radeonsi/si_pipe.h| 3 --- src/gallium/drivers/radeonsi/si_state.c | 8 src/gallium/drivers/radeonsi/si_state_draw.c | 4 ++-- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 7da7736..57af0ff 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -50,7 +50,10 @@ #define R600_RESOURCE_FLAG_FORCE_TILING (PIPE_RESOURCE_FLAG_DRV_PRIV << 2) #define R600_CONTEXT_STREAMOUT_FLUSH (1u << 0) -#define R600_CONTEXT_PRIVATE_FLAG (1u << 1) +/* Pipeline & streamout query controls. */ +#define R600_CONTEXT_START_PIPELINE_STATS (1u << 1) +#define R600_CONTEXT_STOP_PIPELINE_STATS (1u << 2) +#define R600_CONTEXT_PRIVATE_FLAG (1u << 3) /* special primitive types */ #define R600_PRIM_RECTANGLE_LIST PIPE_PRIM_MAX diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h index 8fcfcd2..f665c81 100644 --- a/src/gallium/drivers/radeonsi/si_pipe.h +++ b/src/gallium/drivers/radeonsi/si_pipe.h @@ -66,9 +66,6 @@ /* Compute only. */ #define SI_CONTEXT_FLUSH_WITH_INV_L2 (R600_CONTEXT_PRIVATE_FLAG << 13) /* TODO: merge with TC? */ #define SI_CONTEXT_FLAG_COMPUTE(R600_CONTEXT_PRIVATE_FLAG << 14) -/* Pipeline & streamout query controls. 
*/ -#define SI_CONTEXT_START_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 15) -#define SI_CONTEXT_STOP_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 16) #define SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER (SI_CONTEXT_FLUSH_AND_INV_CB | \ SI_CONTEXT_FLUSH_AND_INV_CB_META | \ diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 0c46425..94130a9 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -1358,11 +1358,11 @@ static void si_set_active_query_state(struct pipe_context *ctx, boolean enable) /* Pipeline stat & streamout queries. */ if (enable) { - sctx->b.flags &= ~SI_CONTEXT_STOP_PIPELINE_STATS; - sctx->b.flags |= SI_CONTEXT_START_PIPELINE_STATS; + sctx->b.flags &= ~R600_CONTEXT_STOP_PIPELINE_STATS; + sctx->b.flags |= R600_CONTEXT_START_PIPELINE_STATS; } else { - sctx->b.flags &= ~SI_CONTEXT_START_PIPELINE_STATS; - sctx->b.flags |= SI_CONTEXT_STOP_PIPELINE_STATS; + sctx->b.flags &= ~R600_CONTEXT_START_PIPELINE_STATS; + sctx->b.flags |= R600_CONTEXT_STOP_PIPELINE_STATS; } /* Occlusion queries. 
*/ diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c index 105c5fb..40cad50 100644 --- a/src/gallium/drivers/radeonsi/si_state_draw.c +++ b/src/gallium/drivers/radeonsi/si_state_draw.c @@ -722,11 +722,11 @@ void si_emit_cache_flush(struct si_context *si_ctx, struct r600_atom *atom) } } - if (sctx->flags & SI_CONTEXT_START_PIPELINE_STATS) { + if (sctx->flags & R600_CONTEXT_START_PIPELINE_STATS) { radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0)); radeon_emit(cs, EVENT_TYPE(V_028A90_PIPELINESTAT_START) | EVENT_INDEX(0)); - } else if (sctx->flags & SI_CONTEXT_STOP_PIPELINE_STATS) { + } else if (sctx->flags & R600_CONTEXT_STOP_PIPELINE_STATS) { radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0)); radeon_emit(cs, EVENT_TYPE(V_028A90_PIPELINESTAT_STOP) | EVENT_INDEX(0)); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
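A compact way to look at the flag move in this patch: the two pipeline-stat bits now sit below R600_CONTEXT_PRIVATE_FLAG, and driver-private flags that are defined relative to it shift up accordingly. The sketch below copies the four common definitions and SI_CONTEXT_FLAG_COMPUTE from the diff; set_pipeline_stats() is a hypothetical free function that mirrors the body of si_set_active_query_state() on a plain unsigned instead of sctx->b.flags.

```c
#include <assert.h>

/* Common flags as defined in r600_pipe_common.h after the patch. */
#define R600_CONTEXT_STREAMOUT_FLUSH       (1u << 0)
#define R600_CONTEXT_START_PIPELINE_STATS  (1u << 1)
#define R600_CONTEXT_STOP_PIPELINE_STATS   (1u << 2)
#define R600_CONTEXT_PRIVATE_FLAG          (1u << 3)

/* One radeonsi-private flag from the diff, defined relative to the base. */
#define SI_CONTEXT_FLAG_COMPUTE (R600_CONTEXT_PRIVATE_FLAG << 14)

/* Hypothetical helper: START and STOP are mutually exclusive, so setting
 * one always clears the other. All unrelated bits are left untouched. */
static unsigned set_pipeline_stats(unsigned flags, int enable)
{
    if (enable) {
        flags &= ~R600_CONTEXT_STOP_PIPELINE_STATS;
        flags |= R600_CONTEXT_START_PIPELINE_STATS;
    } else {
        flags &= ~R600_CONTEXT_START_PIPELINE_STATS;
        flags |= R600_CONTEXT_STOP_PIPELINE_STATS;
    }
    return flags;
}
```

Because the private flags are expressed as shifts of R600_CONTEXT_PRIVATE_FLAG, bumping the base from bit 1 to bit 3 needs no edits to the remaining SI_CONTEXT_* definitions.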
Re: [Mesa-dev] [PATCH] radeonsi: implement and rely on set_active_query_state
Reviewed-by: Edward O'Callaghan On 2016-04-08 18:58, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeonsi/si_blit.c | 3 --- src/gallium/drivers/radeonsi/si_pipe.h | 4 src/gallium/drivers/radeonsi/si_state.c | 32 +++- src/gallium/drivers/radeonsi/si_state_draw.c | 10 + 4 files changed, 45 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index c5ea8b1..aed783f 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -52,8 +52,6 @@ static void si_blitter_begin(struct pipe_context *ctx, enum si_blitter_op op) { struct si_context *sctx = (struct si_context *)ctx; - r600_suspend_nontimer_queries(&sctx->b); - util_blitter_save_vertex_buffer_slot(sctx->blitter, sctx->vertex_buffer); util_blitter_save_vertex_elements(sctx->blitter, sctx->vertex_elements); util_blitter_save_vertex_shader(sctx->blitter, sctx->vs_shader.cso); @@ -95,7 +93,6 @@ static void si_blitter_end(struct pipe_context *ctx) struct si_context *sctx = (struct si_context *)ctx; sctx->b.render_cond_force_off = false; - r600_resume_nontimer_queries(&sctx->b); } static unsigned u_max_sample(struct pipe_resource *r) diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h index 4158fc5..8fcfcd2 100644 --- a/src/gallium/drivers/radeonsi/si_pipe.h +++ b/src/gallium/drivers/radeonsi/si_pipe.h @@ -66,6 +66,9 @@ /* Compute only. */ #define SI_CONTEXT_FLUSH_WITH_INV_L2 (R600_CONTEXT_PRIVATE_FLAG << 13) /* TODO: merge with TC? */ #define SI_CONTEXT_FLAG_COMPUTE(R600_CONTEXT_PRIVATE_FLAG << 14) +/* Pipeline & streamout query controls. 
*/ +#define SI_CONTEXT_START_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 15) +#define SI_CONTEXT_STOP_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 16) #define SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER (SI_CONTEXT_FLUSH_AND_INV_CB | \ SI_CONTEXT_FLUSH_AND_INV_CB_META | \ @@ -289,6 +292,7 @@ struct si_context { booldb_stencil_clear; booldb_stencil_disable_expclear; unsignedps_db_shader_control; + boolocclusion_queries_disabled; /* Emitted draw state. */ int last_base_vertex; diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index a66bd30..6fbbb68 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -1352,6 +1352,26 @@ static void *si_create_db_flush_dsa(struct si_context *sctx) /* DB RENDER STATE */ +static void si_set_active_query_state(struct pipe_context *ctx, boolean enable) +{ + struct si_context *sctx = (struct si_context*)ctx; + + /* Pipeline stat & streamout queries. */ + if (enable) { + sctx->b.flags &= ~SI_CONTEXT_STOP_PIPELINE_STATS; + sctx->b.flags |= SI_CONTEXT_START_PIPELINE_STATS; + } else { + sctx->b.flags &= ~SI_CONTEXT_START_PIPELINE_STATS; + sctx->b.flags |= SI_CONTEXT_STOP_PIPELINE_STATS; + } + + /* Occlusion queries. 
*/ + if (sctx->occlusion_queries_disabled != !enable) { + sctx->occlusion_queries_disabled = !enable; + si_mark_atom_dirty(sctx, &sctx->db_render_state); + } +} + static void si_set_occlusion_query_state(struct pipe_context *ctx, bool enable) { struct si_context *sctx = (struct si_context*)ctx; @@ -1386,7 +1406,8 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s } /* DB_COUNT_CONTROL (occlusion queries) */ - if (sctx->b.num_occlusion_queries > 0) { + if (sctx->b.num_occlusion_queries > 0 && + !sctx->occlusion_queries_disabled) { bool perfect = sctx->b.num_perfect_occlusion_queries > 0; if (sctx->b.chip_class >= CIK) { @@ -3765,6 +3786,7 @@ void si_init_state_functions(struct si_context *sctx) sctx->b.b.set_min_samples = si_set_min_samples; sctx->b.b.set_tess_state = si_set_tess_state; + sctx->b.b.set_active_query_state = si_set_active_query_state; sctx->b.set_occlusion_query_state = si_set_occlusion_query_state; sctx->b.need_gfx_cs_space = si_need_gfx_cs_space; @@ -3995,6 +4017,14 @@ static void si_init_config(struct si_context *sctx) si_pm4_cmd_add(pm4, 0x8000); si_pm4_cmd_end(pm4, false); + /* This enables pipeline stat & streamout queries. +* They are only disabled by blits. +*/ + si_pm4_cmd_begin(pm4, PKT3_EVENT_WRITE); + si_pm4_cmd_add(pm4, EVENT_TYPE(V_028A90_PIPELINESTAT_START) | + EVENT_INDEX(0)); + si_pm4_cmd_end(pm4,
Re: [Mesa-dev] [PATCH] radeonsi: fix mask checking when emitting scissors and viewports
On 2016-04-08 19:00, Marek Olšák wrote:

From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_state.c | 12
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 8087d23..3894e1d 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -912,8 +912,10 @@ static void si_emit_scissors(struct si_context *sctx, struct r600_atom *atom)
 	bool scissor_enable = sctx->queued.named.rasterizer->scissor_enable;

 	/* The simple case: Only 1 viewport is active. */
-	if (mask & 1 &&
-	    !si_get_vs_info(sctx)->writes_viewport_index) {
+	if (!si_get_vs_info(sctx)->writes_viewport_index) {
+		if (!(mask & 1))

This seems a bit tentative. Did you want 1u here?

+			return;
+
 		radeon_set_context_reg_seq(cs, R_028250_PA_SC_VPORT_SCISSOR_0_TL, 2);
 		si_emit_one_scissor(cs, &sctx->viewports.states[0],
 				    scissor_enable ? &states[0] : NULL);
@@ -960,8 +962,10 @@ static void si_emit_viewports(struct si_context *sctx, struct r600_atom *atom)
 	unsigned mask = sctx->viewports.dirty_mask;

 	/* The simple case: Only 1 viewport is active. */
-	if (mask & 1 &&
-	    !si_get_vs_info(sctx)->writes_viewport_index) {
+	if (!si_get_vs_info(sctx)->writes_viewport_index) {
+		if (!(mask & 1))
+			return;
+
 		radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
 		radeon_emit(cs, fui(states[0].scale[0]));
 		radeon_emit(cs, fui(states[0].translate[0]));
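The control-flow change in this patch is easy to miss in diff form, so here is a sketch that reduces both versions to "which path is taken". Everything here is hypothetical scaffolding: enum emit_path, path_before() and path_after() do not exist in Mesa, and EMIT_GENERAL stands for whatever code follows the simple case, which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>

enum emit_path { EMIT_NOTHING, EMIT_VIEWPORT0, EMIT_GENERAL };

/* Old condition: the simple case is taken only if bit 0 is dirty AND the
 * vertex shader does not write the viewport index. Otherwise fall through. */
static enum emit_path path_before(unsigned mask, bool writes_viewport_index)
{
    if ((mask & 1) && !writes_viewport_index)
        return EMIT_VIEWPORT0;
    return EMIT_GENERAL;
}

/* New condition: with a single active viewport, either emit viewport 0
 * or return early. The general path is reserved for the multi-viewport case. */
static enum emit_path path_after(unsigned mask, bool writes_viewport_index)
{
    if (!writes_viewport_index) {
        if (!(mask & 1))
            return EMIT_NOTHING;
        return EMIT_VIEWPORT0;
    }
    return EMIT_GENERAL;
}
```

On the 1u question raised in the review: when mask is declared unsigned, as it is in si_emit_viewports, the literal 1 is converted to unsigned by the usual arithmetic conversions, so mask & 1 and mask & 1u evaluate identically.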
Re: [Mesa-dev] [PATCH] gallium/radeon: unify checking streamout enable state
Reviewed-by: Edward O'Callaghan

On 2016-04-08 19:00, Marek Olšák wrote:

From: Marek Olšák

---
 src/gallium/drivers/r600/r600_state_common.c  | 5 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h | 6 ++
 src/gallium/drivers/radeon/r600_streamout.c   | 6 --
 src/gallium/drivers/radeonsi/si_state_draw.c  | 3 +--
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index df41d3f..82babeb 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1841,8 +1841,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
 		ia_switch_on_eop = true;
 	}

-	if (rctx->b.streamout.streamout_enabled ||
-	    rctx->b.streamout.prims_gen_query_enabled)
+	if (r600_get_strmout_en(&rctx->b))
 		partial_vs_wave = true;

 	radeon_set_context_reg(cs, CM_R_028AA8_IA_MULTI_VGT_PARAM,
@@ -2018,7 +2017,7 @@ static void r600_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info
 	    rctx->b.family == CHIP_RV635) {
 		/* if we have gs shader or streamout we need to do a wait idle after every draw */
-		if (rctx->gs_shader || rctx->b.streamout.streamout_enabled) {
+		if (rctx->gs_shader || r600_get_strmout_en(&rctx->b)) {
 			radeon_set_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
 		}
 	}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 062c319..7da7736 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -639,6 +639,12 @@ r600_resource_reference(struct r600_resource **ptr, struct r600_resource *res)
 			(struct pipe_resource *)res);
 }

+static inline bool r600_get_strmout_en(struct r600_common_context *rctx)
+{
+	return rctx->streamout.streamout_enabled ||
+	       rctx->streamout.prims_gen_query_enabled;
+}
+
 static inline unsigned r600_tex_aniso_filter(unsigned filter)
 {
 	if (filter <= 1)   return 0;
diff --git a/src/gallium/drivers/radeon/r600_streamout.c b/src/gallium/drivers/radeon/r600_streamout.c
index e977ed9..fc9ec48 100644
--- a/src/gallium/drivers/radeon/r600_streamout.c
+++ b/src/gallium/drivers/radeon/r600_streamout.c
@@ -311,12 +311,6 @@ void r600_emit_streamout_end(struct r600_common_context *rctx)
  * are no buffers bound.
  */

-static bool r600_get_strmout_en(struct r600_common_context *rctx)
-{
-	return rctx->streamout.streamout_enabled ||
-	       rctx->streamout.prims_gen_query_enabled;
-}
-
 static void r600_emit_streamout_enable(struct r600_common_context *rctx,
 				       struct r600_atom *atom)
 {
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 84b850a..3863e59 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -882,8 +882,7 @@ void si_draw_vbo(struct pipe_context *ctx, const struct pipe_draw_info *info)
 	if ((sctx->b.family == CHIP_HAWAII ||
 	     sctx->b.family == CHIP_TONGA ||
 	     sctx->b.family == CHIP_FIJI) &&
-	    (sctx->b.streamout.streamout_enabled ||
-	     sctx->b.streamout.prims_gen_query_enabled)) {
+	    r600_get_strmout_en(&sctx->b)) {
 		sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC;
 	}
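The helper this patch moves into the shared header is small enough to reproduce in isolation. The sketch below uses cut-down, hypothetical stand-ins (struct streamout_state, struct common_context, get_strmout_en) for the Mesa types so that it compiles on its own. Only the two boolean fields and the OR come from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-ins for the relevant parts of r600_common_context. */
struct streamout_state {
    bool streamout_enabled;
    bool prims_gen_query_enabled;
};

struct common_context {
    struct streamout_state streamout;
};

/* Same logic as r600_get_strmout_en(): streamout hardware is considered
 * enabled if either real streamout or a primitives-generated query is on. */
static inline bool get_strmout_en(const struct common_context *rctx)
{
    return rctx->streamout.streamout_enabled ||
           rctx->streamout.prims_gen_query_enabled;
}
```

One thing worth noticing in the r600_draw_vbo hunk at line 2018: the old condition tested only streamout_enabled, so switching it to the helper also makes an active primitives-generated query trigger the wait-idle on those chips.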
Re: [Mesa-dev] [PATCH] svga: add some trivial null pointer checks
Reviewed-by: Edward O'Callaghan

+1 for defensive programming.

On 2016-04-07 06:00, Brian Paul wrote:

These small mallocs will probably never fail, but static analysis tools
may complain about the missing checks.
---
 src/gallium/drivers/svga/svga_pipe_blend.c        | 3 +++
 src/gallium/drivers/svga/svga_pipe_depthstencil.c | 3 +++
 src/gallium/drivers/svga/svga_pipe_rasterizer.c   | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0af80cd..0ba9313 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -142,6 +142,9 @@ svga_create_blend_state(struct pipe_context *pipe,
    struct svga_blend_state *blend = CALLOC_STRUCT( svga_blend_state );
    unsigned i;

+   if (!blend)
+      return NULL;
+
    /* Fill in the per-rendertarget blend state. We currently only
     * support independent blend enable and colormask per render target.
     */
diff --git a/src/gallium/drivers/svga/svga_pipe_depthstencil.c b/src/gallium/drivers/svga/svga_pipe_depthstencil.c
index d84ed1d..83fcdc3 100644
--- a/src/gallium/drivers/svga/svga_pipe_depthstencil.c
+++ b/src/gallium/drivers/svga/svga_pipe_depthstencil.c
@@ -134,6 +134,9 @@ svga_create_depth_stencil_state(struct pipe_context *pipe,
    struct svga_context *svga = svga_context(pipe);
    struct svga_depth_stencil_state *ds = CALLOC_STRUCT( svga_depth_stencil_state );

+   if (!ds)
+      return NULL;
+
    /* Don't try to figure out CW/CCW correspondence with
     * stencil[0]/[1] at this point. Presumably this can change as
     * back/front face are modified.
diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 8e0db53..d397c95 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -161,6 +161,9 @@ svga_create_rasterizer_state(struct pipe_context *pipe,
    struct svga_rasterizer_state *rast = CALLOC_STRUCT( svga_rasterizer_state );
    struct svga_screen *screen = svga_screen(pipe->screen);

+   if (!rast)
+      return NULL;
+
    /* need this for draw module. */
    rast->templ = *templ;
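The pattern being added in all three files is the same: check the allocation before the first dereference and hand NULL back to the caller. A standalone sketch, assuming plain calloc() in place of Mesa's CALLOC_STRUCT macro and a made-up struct blend_state with one field:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down state object for illustration only. */
struct blend_state {
    unsigned rt_count;
};

/* Allocate a zeroed state object. On allocation failure, return NULL
 * before any field is touched, as the patch does for the svga objects. */
static struct blend_state *create_blend_state(unsigned rt_count)
{
    struct blend_state *blend = calloc(1, sizeof(*blend));

    if (!blend)
        return NULL;

    blend->rt_count = rt_count;
    return blend;
}
```

The caller owns the returned object and releases it with free(). A NULL return is the only failure signal, so callers are expected to check for it.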
Re: [Mesa-dev] r600/compute: cleanup evergreen_compute.c
Nice cleanup. This series is,

Reviewed-by: Edward O'Callaghan

On 2016-04-07 07:40, Dave Airlie wrote:

This probably should have been cleaned up before merging, but we were a bit lax with it.

This is a bunch of cleanups and changes that make adding ARB_compute_support less of a task.

Dave.
Re: [Mesa-dev] [PATCH 0/4] ARB_robust_buffer_access_behavior for radeonsi
I had a hacked-up version of this last week which was very similar. This is much cleaner, hence this series is,

Reviewed-by: Edward O'Callaghan

On 2016-04-04 21:41, Bas Nieuwenhuizen wrote:

This series implements ARB_robust_buffer_access_behavior for the radeonsi driver.

There are some tests at: https://github.com/BNieuwenhuizen/piglit
These have not been sent yet as they depend on robust access context support in waffle.

Bas Nieuwenhuizen (4):
  radeonsi: use bounded indexing for constant buffers
  radeonsi: use bounded indexing for samplers
  expose ARB_robust_buffer_access_behavior
  radeonsi: mark ARB_robust_buffer_access_behavior as supported

 docs/GL3.txt                                     | 2 +-
 docs/relnotes/11.3.0.html                        | 1 +
 src/gallium/docs/source/screen.rst               | 4 +++-
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c           | 1 +
 src/gallium/drivers/ilo/ilo_screen.c             | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c         | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c           | 1 +
 src/gallium/drivers/r600/r600_pipe.c             | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c           | 1 +
 src/gallium/drivers/radeonsi/si_shader.c         | 10 +++---
 src/gallium/drivers/softpipe/sp_screen.c         | 1 +
 src/gallium/drivers/svga/svga_screen.c           | 1 +
 src/gallium/drivers/swr/swr_screen.cpp           | 1 +
 src/gallium/drivers/vc4/vc4_screen.c             | 1 +
 src/gallium/drivers/virgl/virgl_screen.c         | 1 +
 src/gallium/include/pipe/p_defines.h             | 1 +
 src/mesa/main/extensions_table.h                 | 1 +
 src/mesa/main/mtypes.h                           | 1 +
 src/mesa/main/version.c                          | 2 +-
 src/mesa/state_tracker/st_extensions.c           | 1 +
 24 files changed, 32 insertions(+), 6 deletions(-)
Re: [Mesa-dev] report ARB_cull_distance
Patches 1-5, 8 & 10 are,

Reviewed-by: Edward O'Callaghan

On 2016-04-04 12:15, Dave Airlie wrote:

Okay, I've taken Tobias' last work in progress, cleaned it up a bit, moved the rename out into a separate patch, and reordered things slightly.

I've dropped the separate passes; I think nearly all hw operates the same. I do wonder why we even have this as an option, since at least 965/gallium always require the compiler to lower, and I don't think we really have other consumers that would care about this.

Dave.
Re: [Mesa-dev] [PATCH 00/20] GL compute shaders for radeonsi
This series is,

Tested-By: Edward O'Callaghan

I didn't pick up anything majorly wrong with it, but with others' minor suggestions this series is also,

Reviewed-By: Edward O'Callaghan

Kind Regards,

On 2016-04-03 00:10, Bas Nieuwenhuizen wrote:

This series implements OpenGL compute shader for radeonsi. It is based off master + Nicolai Hähnle's SSBO patches.

It depends on two patches for LLVM that have not been committed yet:
- D18340
- D18559

The series is also available as the si-compute-shader branches of
- https://github.com/BNieuwenhuizen/llvm
- https://github.com/BNieuwenhuizen/mesa

Bas Nieuwenhuizen (20):
  radeonsi: set shader calling conventions
  radeonsi: lower compute shader arguments
  radeonsi: add shared memory
  radeonsi: implement shared memory load/store
  radeonsi: implement shared atomics
  radeonsi: set maximum work group size based on block size
  radeonsi: update shader count for compute shaders
  radeonsi: implement TGSI compute shader creation
  radeonsi: split input upload off from si_launch_grid
  radeonsi: don't pass scratch buffer to user SGPRs
  radeonsi: do per cs setup for compute shaders once per cs
  radeonsi: rework compute scratch buffer
  radeonsi: only emit compute shader state when switching shaders
  radeonsi: implement TGSI compute dispatch
  radeonsi: split texture decompression for compute shaders
  radeonsi: split setting graphics and compute descriptors
  radeonsi: do not do two full flushes on every compute dispatch
  radeonsi: clean up compute flush
  mesa/st: enable compute shaders if images are also supported
  radeonsi: enable TGSI support cap for compute shaders

 docs/GL3.txt                                   | 4 +-
 docs/relnotes/11.3.0.html                      | 1 +
 src/gallium/drivers/radeon/r600_pipe_common.c  | 21 +-
 src/gallium/drivers/radeon/radeon_llvm.h       | 3 +
 src/gallium/drivers/radeon/radeon_llvm_emit.c  | 17 +-
 .../drivers/radeon/radeon_setup_tgsi_llvm.c    | 4 +
 src/gallium/drivers/radeonsi/si_blit.c         | 13 +-
 src/gallium/drivers/radeonsi/si_compute.c      | 557 -
 src/gallium/drivers/radeonsi/si_descriptors.c  | 60 ++-
 src/gallium/drivers/radeonsi/si_hw_context.c   | 2 +
 src/gallium/drivers/radeonsi/si_pipe.c         | 4 +-
 src/gallium/drivers/radeonsi/si_pipe.h         | 11 +-
 src/gallium/drivers/radeonsi/si_shader.c       | 252 +-
 src/gallium/drivers/radeonsi/si_shader.h       | 10 +
 src/gallium/drivers/radeonsi/si_state.c        | 6 +-
 src/gallium/drivers/radeonsi/si_state.h        | 10 +-
 src/gallium/drivers/radeonsi/si_state_draw.c   | 31 +-
 src/mesa/state_tracker/st_extensions.c         | 6 +-
 18 files changed, 708 insertions(+), 304 deletions(-)
Re: [Mesa-dev] Split buffer block array into UBO and SSBO arrays
This series is,

Acked-by: Edward O'Callaghan

On 2016-04-03 21:16, Timothy Arceri wrote:

This is the final clean-up of the buffer block structures. With this series we just create two arrays to begin with and drop the combined array.

Note: to avoid code churn and regressions I intend to squash patches 1-4 before pushing; I've just sent them split up to make reviewing easier.
Re: [Mesa-dev] radeonsi, r600g ARB_framebuffer_no_attachments rebased
On 2016-04-03 14:55, Ilia Mirkin wrote:

On Sat, Apr 2, 2016 at 10:54 PM, Edward O'Callaghan wrote:

This series implements ARB_framebuffer_no_attachments for radeonsi & r600g. It is a rebase of the last previous patch series with the respective Rb's added. I have given back my R9 hw today and so this was the last time I ran piglit to confirm everything passes and that all our local applications that need the support are happy and work as expected.

Hi Edward,

Could you confirm whether you ran piglit on the ARB_framebuffer_no_attachments-specific tests only, or whether you did a full run and compared it to a baseline run?

Thanks,

-ilia

Hi Ilia,

I ran both and after:

$ piglit run tests/gpu results/gpu-reference
$ piglit summary html --overwrite summary/gpu results/gpu-reference
$ firefox summary/gpu/index.html

and to confirm the extension's functionality (+ some local proprietary medical applications):

$ piglit run tests/all -t framebuffer_no_attachments results/framebuffer_no_attachments
$ piglit summary html --overwrite summary/framebuffer_no_attachments results/framebuffer_no_attachments
$ firefox summary/framebuffer_no_attachments/index.html

Kind Regards,
Edward.
Re: [Mesa-dev] [PATCH 05/30] r600: refactor binding code for attach buffer to CB.
Reviewed-by: Edward O'Callaghan On 2016-03-31 18:03, Dave Airlie wrote: From: Dave Airlie This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 111 - 1 file changed, 78 insertions(+), 33 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index c151a1b..f7a2a8f 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -1056,6 +1056,71 @@ struct r600_tex_color_info { boolean export_16bpc; }; +static void evergreen_set_color_surface_buffer(struct r600_context *rctx, + struct r600_resource *res, + enum pipe_format pformat, + unsigned first_element, + unsigned last_element, + struct r600_tex_color_info *color) +{ + unsigned format, swap, ntype, endian; + const struct util_format_description *desc; + unsigned block_size = align(util_format_get_blocksize(res->b.b.format), 4); + unsigned pitch_alignment = + MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size); + unsigned pitch = align(res->b.b.width0, pitch_alignment); + int i; + unsigned width_elements; + + width_elements = last_element - first_element + 1; + + format = r600_translate_colorformat(rctx->b.chip_class, pformat); + swap = r600_translate_colorswap(pformat); + + endian = r600_colorformat_endian_swap(format); + + desc = util_format_description(pformat); + for (i = 0; i < 4; i++) { + if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID) { + break; + } + } + ntype = V_028C70_NUMBER_UNORM; + if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB) + ntype = V_028C70_NUMBER_SRGB; + else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) { + if (desc->channel[i].normalized) + ntype = V_028C70_NUMBER_SNORM; + else if (desc->channel[i].pure_integer) + ntype = V_028C70_NUMBER_SINT; + } else if (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED) { + 
if (desc->channel[i].normalized) + ntype = V_028C70_NUMBER_UNORM; + else if (desc->channel[i].pure_integer) + ntype = V_028C70_NUMBER_UINT; + } + pitch = (pitch / 8) - 1; + color->pitch = S_028C64_PITCH_TILE_MAX(pitch); + + color->info = S_028C70_ARRAY_MODE(V_028C70_ARRAY_LINEAR_ALIGNED); + color->info |= S_028C70_FORMAT(format) | + S_028C70_COMP_SWAP(swap) | + S_028C70_BLEND_CLAMP(0) | + S_028C70_BLEND_BYPASS(1) | + S_028C70_NUMBER_TYPE(ntype) | + S_028C70_ENDIAN(endian); + color->attrib = S_028C74_NON_DISP_TILING_ORDER(1); + color->ntype = ntype; + color->export_16bpc = false; + color->dim = width_elements - 1; + color->slice = (width_elements / 64) - 1; + color->view = 0; + color->offset = res->gpu_address >> 8; + + color->fmask = color->offset; + color->fmask_slice = 0; +} + static void evergreen_set_color_surface_common(struct r600_context *rctx, struct r600_texture *rtex, unsigned level, @@ -1239,47 +1304,27 @@ void evergreen_init_color_surface_rat(struct r600_context *rctx, struct r600_surface *surf) { struct pipe_resource *pipe_buffer = surf->base.texture; - unsigned format = r600_translate_colorformat(rctx->b.chip_class, -surf->base.format); - unsigned endian = r600_colorformat_endian_swap(format); - unsigned swap = r600_translate_colorswap(surf->base.format); - unsigned block_size = - align(util_format_get_blocksize(pipe_buffer->format), 4); - unsigned pitch_alignment = - MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size); - unsigned pitch = align(pipe_buffer->width0, pitch_alignment); - - surf->cb_color_base = r600_resource(pipe_buffer)->gpu_address >> 8; + struct r600_tex_color_info color; - surf->cb_color_pitch = (pitch / 8) - 1; + evergreen_set_color_surface_buffer(rctx, (struct r600_resource *)surf->base.texture, + surf->base.format, 0, pipe_buffer->width0, + &color
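The pitch arithmetic in evergreen_set_color_surface_buffer() can be tried out with plain integers. This is a sketch under assumptions: align_pot(), max2() and buffer_pitch_tile_max() are hypothetical helpers written for this example, standing in for Mesa's align() and MAX2() and for the sequence of statements in the patch. It assumes the resulting pitch alignment is a power of two, which holds for the power-of-two interleave and block sizes used below.

```c
#include <assert.h>

/* Round value up to a power-of-two alignment. */
static unsigned align_pot(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

static unsigned max2(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* Mirrors the steps in the patch: block size rounded up to 4 bytes,
 * pitch alignment of at least 64 elements, pitch rounded up to that
 * alignment, then expressed in units of 8 elements minus one (the value
 * that the patch passes to S_028C64_PITCH_TILE_MAX). */
static unsigned buffer_pitch_tile_max(unsigned width0,
                                      unsigned format_blocksize,
                                      unsigned pipe_interleave_bytes)
{
    unsigned block_size = align_pot(format_blocksize, 4);
    unsigned pitch_alignment = max2(64, pipe_interleave_bytes / block_size);
    unsigned pitch = align_pot(width0, pitch_alignment);

    return (pitch / 8) - 1;
}
```

For a 100-element buffer of 4-byte texels with a 256-byte interleave, the alignment comes out as 64, the pitch as 128, and the register field as 15.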
Re: [Mesa-dev] [PATCH 04/30] r600: refactor out CB setup.
Reviewed-by: Edward O'Callaghan On 2016-03-31 18:03, Dave Airlie wrote: From: Dave Airlie This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 257 + 1 file changed, 147 insertions(+), 110 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 356e708..c151a1b 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -1042,105 +1042,72 @@ static void evergreen_emit_scissor_state(struct r600_context *rctx, struct r600_ rstate->atom.num_dw = 0; } -/** - * This function intializes the CB* register values for RATs. It is meant - * to be used for 1D aligned buffers that do not have an associated - * radeon_surf. - */ -void evergreen_init_color_surface_rat(struct r600_context *rctx, - struct r600_surface *surf) -{ - struct pipe_resource *pipe_buffer = surf->base.texture; - unsigned format = r600_translate_colorformat(rctx->b.chip_class, -surf->base.format); - unsigned endian = r600_colorformat_endian_swap(format); - unsigned swap = r600_translate_colorswap(surf->base.format); - unsigned block_size = - align(util_format_get_blocksize(pipe_buffer->format), 4); - unsigned pitch_alignment = - MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size); - unsigned pitch = align(pipe_buffer->width0, pitch_alignment); - - surf->cb_color_base = r600_resource(pipe_buffer)->gpu_address >> 8; - - surf->cb_color_pitch = (pitch / 8) - 1; - - surf->cb_color_slice = 0; - - surf->cb_color_view = 0; - - surf->cb_color_info = - S_028C70_ENDIAN(endian) - | S_028C70_FORMAT(format) - | S_028C70_ARRAY_MODE(V_028C70_ARRAY_LINEAR_ALIGNED) - | S_028C70_NUMBER_TYPE(V_028C70_NUMBER_UINT) - | S_028C70_COMP_SWAP(swap) - | S_028C70_BLEND_BYPASS(1) /* We must set this bit because we - * are using NUMBER_UINT */ - | S_028C70_RAT(1) - ; - - 
surf->cb_color_attrib = S_028C74_NON_DISP_TILING_ORDER(1); - - /* For buffers, CB_COLOR0_DIM needs to be set to the number of -* elements. */ - surf->cb_color_dim = pipe_buffer->width0; - - /* Set the buffer range the GPU will have access to: */ - util_range_add(&r600_resource(pipe_buffer)->valid_buffer_range, - 0, pipe_buffer->width0); - - surf->cb_color_fmask = surf->cb_color_base; - surf->cb_color_fmask_slice = 0; -} +struct r600_tex_color_info { + unsigned info; + unsigned view; + unsigned dim; + unsigned pitch; + unsigned slice; + unsigned attrib; + unsigned ntype; + unsigned fmask; + unsigned fmask_slice; + uint64_t offset; + boolean export_16bpc; +}; -void evergreen_init_color_surface(struct r600_context *rctx, - struct r600_surface *surf) +static void evergreen_set_color_surface_common(struct r600_context *rctx, + struct r600_texture *rtex, + unsigned level, + unsigned first_layer, + unsigned last_layer, + enum pipe_format pformat, + struct r600_tex_color_info *color) { struct r600_screen *rscreen = rctx->screen; - struct r600_texture *rtex = (struct r600_texture*)surf->base.texture; - unsigned level = surf->base.u.tex.level; unsigned pitch, slice; - unsigned color_info, color_attrib, color_dim = 0, color_view; - unsigned format, swap, ntype, endian; - uint64_t offset, base_offset; unsigned non_disp_tiling, macro_aspect, tile_split, bankh, bankw, fmask_bankh, nbanks; + unsigned format, swap, ntype, endian; const struct util_format_description *desc; - int i; bool blend_clamp = 0, blend_bypass = 0; + int i; - offset = rtex->surface.level[level].offset; + color->offset = rtex->surface.level[level].offset; if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) { - assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer); - offset += rtex->surface.level[level].slice_size * - surf->base.u.tex.first_layer; - color_view = 0; + color->offset += (rtex->surface.level[level].slice_size * +
Re: [Mesa-dev] [PATCH 02/30] r600: factor out the code to initialise a buffer resource.
Reviewed-by: Edward O'Callaghan On 2016-03-31 18:03, Dave Airlie wrote: From: Dave Airlie This takes the code required to initialise a buffer resource out of the texture buffer code, into it's own function. This is going to be used for the image support later. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 81 +++--- 1 file changed, 53 insertions(+), 28 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index fa3d0b3..b68faed 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -596,56 +596,81 @@ static void *evergreen_create_sampler_state(struct pipe_context *ctx, return ss; } -static struct pipe_sampler_view * -texture_buffer_sampler_view(struct r600_context *rctx, - struct r600_pipe_sampler_view *view, - unsigned width0, unsigned height0) - +struct eg_buf_res_params { + enum pipe_format pipe_format; + unsigned first_element; + unsigned last_element; + unsigned char swizzle[4]; + bool uncached; +}; + +static void evergreen_fill_buffer_resource_words(struct r600_context *rctx, +struct pipe_resource *buffer, +struct eg_buf_res_params *params, +bool *skip_mip_address_reloc, +unsigned tex_resource_words[8]) { - struct r600_texture *tmp = (struct r600_texture*)view->base.texture; + struct r600_texture *tmp = (struct r600_texture*)buffer; uint64_t va; - int stride = util_format_get_blocksize(view->base.format); + int stride = util_format_get_blocksize(params->pipe_format); unsigned format, num_format, format_comp, endian; unsigned swizzle_res; - unsigned char swizzle[4]; const struct util_format_description *desc; - unsigned offset = view->base.u.buf.first_element * stride; - unsigned size = (view->base.u.buf.last_element - view->base.u.buf.first_element + 1) * stride; + unsigned offset = params->first_element * stride; + unsigned num_elements = (params->last_element - params->first_element + 1); + unsigned size = 
num_elements * stride; - swizzle[0] = view->base.swizzle_r; - swizzle[1] = view->base.swizzle_g; - swizzle[2] = view->base.swizzle_b; - swizzle[3] = view->base.swizzle_a; - - r600_vertex_data_type(view->base.format, + r600_vertex_data_type(params->pipe_format, &format, &num_format, &format_comp, &endian); - desc = util_format_description(view->base.format); + desc = util_format_description(params->pipe_format); - swizzle_res = r600_get_swizzle_combined(desc->swizzle, swizzle, TRUE); + swizzle_res = r600_get_swizzle_combined(desc->swizzle, params->swizzle, TRUE); va = tmp->resource.gpu_address + offset; - view->tex_resource = &tmp->resource; - - view->skip_mip_address_reloc = true; - view->tex_resource_words[0] = va; - view->tex_resource_words[1] = size - 1; - view->tex_resource_words[2] = S_030008_BASE_ADDRESS_HI(va >> 32UL) | + *skip_mip_address_reloc = true; + tex_resource_words[0] = va; + tex_resource_words[1] = size - 1; + tex_resource_words[2] = S_030008_BASE_ADDRESS_HI(va >> 32UL) | S_030008_STRIDE(stride) | S_030008_DATA_FORMAT(format) | S_030008_NUM_FORMAT_ALL(num_format) | S_030008_FORMAT_COMP_ALL(format_comp) | S_030008_ENDIAN_SWAP(endian); - view->tex_resource_words[3] = swizzle_res; + tex_resource_words[3] = swizzle_res | S_03000C_UNCACHED(params->uncached); /* * in theory dword 4 is for number of elements, for use with resinfo, * but it seems to utterly fail to work, the amd gpu shader analyser * uses a const buffer to store the element sizes for buffer txq */ - view->tex_resource_words[4] = 0; - view->tex_resource_words[5] = view->tex_resource_words[6] = 0; - view->tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER); + tex_resource_words[4] = num_elements; + tex_resource_words[5] = tex_resource_words[6] = 0; + tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER); +} + +static struct pipe_sampler_view * +texture_buffer_sampler_view(struct r600_context *rctx, + struct r600_pipe_sampler_view *view, + unsigned 
width0, unsigned height0) +{ + struct r600_texture *tmp = (struct r600_texture*)view->base.texture; + struct eg_buf_res_params params; + + memset(&params, 0, sizeof(params)); + +
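The range arithmetic that the patch moves into evergreen_fill_buffer_resource_words() can be sketched on its own. This is only an illustration: `demo_buf_range` and `demo_buffer_range` are hypothetical names, not part of the r600 driver, and the stride is passed in directly rather than derived from a pipe format.

```c
#include <assert.h>

/* Hypothetical result struct, for illustration only. */
struct demo_buf_range {
   unsigned offset;        /* byte offset of the first element */
   unsigned num_elements;  /* inclusive element count */
   unsigned size;          /* size of the range in bytes */
   unsigned word1;         /* size - 1, as stored in resource word 1 by the patch */
};

/* Mirror of the arithmetic in the patch: first/last are inclusive
 * element indices, stride is the element size in bytes. */
static struct demo_buf_range
demo_buffer_range(unsigned first_element, unsigned last_element, unsigned stride)
{
   struct demo_buf_range r;

   assert(last_element >= first_element);
   r.offset = first_element * stride;
   r.num_elements = last_element - first_element + 1;
   r.size = r.num_elements * stride;
   r.word1 = r.size - 1;
   return r;
}
```

For elements 4 through 7 of a 16-byte format, the range starts at byte 64 and covers 4 elements, or 64 bytes.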
Re: [Mesa-dev] [PATCH 03/30] r600: refactor texture resource words setup code.
Reviewed-by: Edward O'Callaghan On 2016-03-31 18:03, Dave Airlie wrote: From: Dave Airlie This refactors out the code to set up a texture resource so we can reuse it later from the images code. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 274 + 1 file changed, 158 insertions(+), 116 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index b68faed..356e708 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -649,93 +649,55 @@ static void evergreen_fill_buffer_resource_words(struct r600_context *rctx, tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER); } -static struct pipe_sampler_view * -texture_buffer_sampler_view(struct r600_context *rctx, - struct r600_pipe_sampler_view *view, - unsigned width0, unsigned height0) -{ - struct r600_texture *tmp = (struct r600_texture*)view->base.texture; - struct eg_buf_res_params params; - - memset(&params, 0, sizeof(params)); - - params.pipe_format = view->base.format; - params.first_element = view->base.u.buf.first_element; - params.last_element = view->base.u.buf.last_element; - params.swizzle[0] = view->base.swizzle_r; - params.swizzle[1] = view->base.swizzle_g; - params.swizzle[2] = view->base.swizzle_b; - params.swizzle[3] = view->base.swizzle_a; - - evergreen_fill_buffer_resource_words(rctx, view->base.texture, -&params, &view->skip_mip_address_reloc, -view->tex_resource_words); - view->tex_resource = &tmp->resource; - - if (tmp->resource.gpu_address) - LIST_ADDTAIL(&view->list, &rctx->b.texture_buffers); - return &view->base; -} +struct eg_tex_res_params { + enum pipe_format pipe_format; + int force_level; + unsigned width0; + unsigned height0; + unsigned first_level; + unsigned last_level; + unsigned first_layer; + unsigned last_layer; + unsigned target; + unsigned char swizzle[4]; +}; -struct pipe_sampler_view * -evergreen_create_sampler_view_custom(struct 
pipe_context *ctx, -struct pipe_resource *texture, -const struct pipe_sampler_view *state, -unsigned width0, unsigned height0, -unsigned force_level) +static int evergreen_fill_tex_resource_words(struct r600_context *rctx, +struct pipe_resource *texture, +struct eg_tex_res_params *params, +bool *skip_mip_address_reloc, +unsigned tex_resource_words[8]) { - struct r600_context *rctx = (struct r600_context*)ctx; - struct r600_screen *rscreen = (struct r600_screen*)ctx->screen; - struct r600_pipe_sampler_view *view = CALLOC_STRUCT(r600_pipe_sampler_view); + struct r600_screen *rscreen = (struct r600_screen*)rctx->b.b.screen; struct r600_texture *tmp = (struct r600_texture*)texture; unsigned format, endian; uint32_t word4 = 0, yuv_format = 0, pitch = 0; - unsigned char swizzle[4], array_mode = 0, non_disp_tiling = 0; + unsigned char array_mode = 0, non_disp_tiling = 0; unsigned height, depth, width; unsigned macro_aspect, tile_split, bankh, bankw, nbanks, fmask_bankh; - enum pipe_format pipe_format = state->format; struct radeon_surf_level *surflevel; unsigned base_level, first_level, last_level; unsigned dim, last_layer; uint64_t va; - if (!view) - return NULL; - - /* initialize base object */ - view->base = *state; - view->base.texture = NULL; - pipe_reference(NULL, &texture->reference); - view->base.texture = texture; - view->base.reference.count = 1; - view->base.context = ctx; - - if (state->target == PIPE_BUFFER) - return texture_buffer_sampler_view(rctx, view, width0, height0); - - swizzle[0] = state->swizzle_r; - swizzle[1] = state->swizzle_g; - swizzle[2] = state->swizzle_b; - swizzle[3] = state->swizzle_a; - tile_split = tmp->surface.tile_split; surflevel = tmp->surface.level; /* Texturing with separate depth and stencil. */ if (tmp->is_depth && !tmp->is_flushing_texture) { - switch (pipe_format) { + switch (params->pipe_format) { case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT: - pipe_format = PIPE_FORMAT_Z32_FLOAT; + params->pipe_format = PIPE_FORMAT_Z32_FLOAT;
Re: [Mesa-dev] [PATCH] mesa/st: Avoid a NULL-ptr dereference on possible missing callback
Determinism is always better regardless of how you get there. A null pointer dereference `on purpose` is a really poor idea in C. I am somewhat surprised you are asking if it's really better to rely on undefined behavior vs. an assert.

I thought about using an if branch and just returning, but I think it is better to crash deterministically with a reason rather than perhaps fail silently. Although I do see that in other places an if branch and a return were used in similar situations, so I would be willing to do the same if I must.

On 2016-03-28 16:08, Ilia Mirkin wrote:

When would that happen? When a user force-enables ARB_query_buffer_object for a driver that's not ready for it? Is hitting a deterministic assert in that case any better than hitting a null deref?

On Sun, Mar 27, 2016 at 11:52 PM, Edward O'Callaghan wrote:

Just because we miss a gallium driver callback don't dereference invalid memory.

Signed-off-by: Edward O'Callaghan
---
 src/mesa/state_tracker/st_cb_queryobj.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_cb_queryobj.c b/src/mesa/state_tracker/st_cb_queryobj.c
index cdb9efc..e9abc38 100644
--- a/src/mesa/state_tracker/st_cb_queryobj.c
+++ b/src/mesa/state_tracker/st_cb_queryobj.c
@@ -402,6 +402,7 @@ st_StoreQueryResult(struct gl_context *ctx, struct gl_query_object *q,
       index = 0;
    }
 
+   assert(pipe->get_query_result_resource);
    pipe->get_query_result_resource(pipe, stq->pq, wait, result_type, index,
                                    stObj->buffer, offset);
 }
-- 
2.5.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
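The two options weighed in this thread (assert on the callback, or branch and return) can be put side by side in a small standalone sketch. Everything here is hypothetical: `demo_pipe` and its `get_result` hook merely stand in for a driver vtable with an optional callback and are not the real gallium interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver vtable; not the real pipe_context. */
struct demo_pipe {
   /* Optional hook: a driver may leave this NULL. */
   void (*get_result)(struct demo_pipe *pipe, int *out);
};

/* Example hook implementation used only for the demonstration. */
static void demo_get_result(struct demo_pipe *pipe, int *out)
{
   (void)pipe;
   *out = 42;
}

/* Option A: assert first, so a debug build stops with a clear message
 * instead of jumping through a NULL function pointer. */
static void store_result_assert(struct demo_pipe *pipe, int *out)
{
   assert(pipe->get_result);
   pipe->get_result(pipe, out);
}

/* Option B: guard the call and report the failure to the caller. */
static bool store_result_guarded(struct demo_pipe *pipe, int *out)
{
   if (!pipe->get_result)
      return false;
   pipe->get_result(pipe, out);
   return true;
}
```

One thing to keep in mind with option A: when NDEBUG is defined, assert() expands to nothing, so a release build would still call through the NULL pointer. Option B behaves the same in every build but needs a caller that checks the return value.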
Re: [Mesa-dev] [PATCH] mesa: remove initialized field from uniform storage
Reviewed-by: Edward O'Callaghan On 2016-03-27 14:51, Timothy Arceri wrote: The only place this was used was in a gallium debug function that had to be manually enabled. --- src/compiler/glsl/ir_uniform.h | 5 src/compiler/glsl/link_uniform_initializers.cpp | 4 --- src/compiler/glsl/link_uniforms.cpp | 1 - src/mesa/main/shaderapi.c | 3 +- src/mesa/main/uniform_query.cpp | 4 --- src/mesa/state_tracker/st_draw.c| 37 - 6 files changed, 1 insertion(+), 53 deletions(-) diff --git a/src/compiler/glsl/ir_uniform.h b/src/compiler/glsl/ir_uniform.h index 1854279..e72e7b4 100644 --- a/src/compiler/glsl/ir_uniform.h +++ b/src/compiler/glsl/ir_uniform.h @@ -105,11 +105,6 @@ struct gl_uniform_storage { */ unsigned array_elements; - /** -* Has this uniform ever been set? -*/ - bool initialized; - struct gl_opaque_uniform_index opaque[MESA_SHADER_STAGES]; /** diff --git a/src/compiler/glsl/link_uniform_initializers.cpp b/src/compiler/glsl/link_uniform_initializers.cpp index 7d280cc..870bc5b 100644 --- a/src/compiler/glsl/link_uniform_initializers.cpp +++ b/src/compiler/glsl/link_uniform_initializers.cpp @@ -162,8 +162,6 @@ set_opaque_binding(void *mem_ctx, gl_shader_program *prog, } } } - - storage->initialized = true; } } @@ -267,8 +265,6 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program *prog, } } } - - storage->initialized = true; } } diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp index 09322c5..a16b34a 100644 --- a/src/compiler/glsl/link_uniforms.cpp +++ b/src/compiler/glsl/link_uniforms.cpp @@ -799,7 +799,6 @@ private: this->uniforms[id].name = ralloc_strdup(this->uniforms, name); this->uniforms[id].type = base_type; - this->uniforms[id].initialized = 0; this->uniforms[id].num_driver_storage = 0; this->uniforms[id].driver_storage = NULL; this->uniforms[id].atomic_buffer_index = -1; diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 5b882d6..92302f6 100644 --- a/src/mesa/main/shaderapi.c +++ 
b/src/mesa/main/shaderapi.c @@ -2568,7 +2568,6 @@ _mesa_UniformSubroutinesuiv(GLenum shadertype, GLsizei count, memcpy(&uni->storage[0], &indices[i], sizeof(GLuint) * uni_count); - uni->initialized = true; _mesa_propagate_uniforms_to_driver_storage(uni, 0, uni_count); i += uni_count; } while(i < count); @@ -2742,7 +2741,7 @@ _mesa_shader_init_subroutine_defaults(struct gl_shader *sh) for (j = 0; j < uni_count; j++) memcpy(&uni->storage[j], &val, sizeof(int)); - uni->initialized = true; + _mesa_propagate_uniforms_to_driver_storage(uni, 0, uni_count); } } diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 2ced201..ab5c3cd 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -815,8 +815,6 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, } } - uni->initialized = true; - _mesa_propagate_uniforms_to_driver_storage(uni, offset, count); /* If the uniform is a sampler, do the extra magic necessary to propagate @@ -1030,8 +1028,6 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, } } - uni->initialized = true; - _mesa_propagate_uniforms_to_driver_storage(uni, offset, count); } diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c index fdd59a3..3db5749 100644 --- a/src/mesa/state_tracker/st_draw.c +++ b/src/mesa/state_tracker/st_draw.c @@ -127,35 +127,6 @@ setup_index_buffer(struct st_context *st, /** - * Prior to drawing, check that any uniforms referenced by the - * current shader have been set. If a uniform has not been set, - * issue a warning. 
- */ -static void -check_uniforms(struct gl_context *ctx) -{ - struct gl_shader_program **shProg = ctx->_Shader->CurrentProgram; - unsigned j; - - for (j = 0; j < 3; j++) { - unsigned i; - - if (shProg[j] == NULL || !shProg[j]->LinkStatus) -continue; - - for (i = 0; i < shProg[j]->NumUniformStorage; i++) { - const struct gl_uniform_storage *u = &shProg[j]->UniformStorage[i]; - if (!u->initialized) { -_mesa_warning(ctx, - "Using shader with uninitialized uniform: %s", - u->name); - } - } - } -} - - -/** * Translate OpenGL primtive type (GL_POINTS, GL_TRIANGLE_STRIP, etc) to * the corresponding Gallium type. */ @@ -203,14 +174,6 @@ st_draw_vbo(struct gl_context *ctx, /* Validate state. */ if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) { st_validate_state(st, ST_PIPELINE_RENDER); - -#if 0 - if (MESA_VERBOSE & VERBOSE_GL
Re: [Mesa-dev] [PATCH 03/17] mesa/st: Set _NumSamples in update_framebuffer_state()
On 2016-03-25 22:20, Ilia Mirkin wrote:
On Mar 25, 2016 4:43 AM, wrote:
On 2016-03-25 14:02, Ilia Mirkin wrote:
On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan wrote:

Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported with a framebuffer using no attachment.

Signed-off-by: Edward O'Callaghan
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 51
 1 file changed, 51 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c
index ae883a2..07854ca 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -64,6 +64,44 @@ update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
    framebuffer->height = MIN2(framebuffer->height, surface->height);
 }
 
+/**
+ * Round up the requested multisample count to the next supported sample size.
+ */
+static unsigned
+framebuffer_quantize_num_samples(struct pipe_screen *screen, unsigned num_samples)
+{
+   int quantized_samples = 0;
+   bool msaa_mode_supported;
+
+   if (!num_samples)
+      return 0;
+
+   assert(screen);
+
+   /* Assumes the highest supported MSAA is x32 on any hardware */
+   for (unsigned msaa_mode = 32; msaa_mode >= 1; msaa_mode = msaa_mode/2) {

This should probably start at MaxFramebufferSamples right? Also msaa_mode >= num_samples? [then you can get rid of the if below]

I did it in this manner because I don't trust all C compilers to warn sufficiently on `num_samples' overflows turning this into an infinite loop even though it is an unsigned type. The micro-optimization serves no purpose because the optimizer will trivially reduce the loop down, not that it has that many iterations anyway. The loop as-is is both well bounded and deterministic, nice qualities to have. "Premature optimization is the root of all evil" ~ Donald Knuth.

I was going for clarity and simplicity, not runtime efficiency. Fewer lines of code to read, fewer conditions. For loop semantics are fairly well defined, compilers tend to get those things right.

I was not referring to loop semantics or whether a compiler can understand how to lower a loop correctly; you didn't really read what I said. The point is, parameterizing the loop with a function argument to save one line of code while losing some safety and determinism does hardly anything to make an argument for this to be changed, imho. I much prefer how it is now: a simple, constant, deterministic loop, very clear.

And lastly I don't know if it's a valid assumption that we can always just divide by 2. That said, I don't know of any hw that actually supports non-power-of-two MSAA levels, so perhaps it's OK.

You are way over-engineering here; it is a totally reasonable assumption, and if such hardware does exist which we would support (I can't see any in the tree currently as far as I am aware) then they can provide follow-up fixes.

I'm flexible on this one... If no one else cares, I don't care either.

+      assert(!(msaa_mode > 32 || msaa_mode == 0)); /* be safe from int overflows */
+      if (msaa_mode >= num_samples) {
+         /**
+          * For ARB_framebuffer_no_attachment, A format of
+          * PIPE_FORMAT_NONE implies what number of samples is
+          * supported for a framebuffer with no attachment. Thus the
+          * drivers callback must be adjusted for this.
+          */
+         msaa_mode_supported = screen->is_format_supported(screen,
+               PIPE_FORMAT_NONE, PIPE_TEXTURE_2D,
+               msaa_mode, PIPE_BIND_RENDER_TARGET);
+         /**
+          * Check if the MSAA mode that is higher than the requested
+          * num_samples is supported, and if so returning it.
+          */
+         if (msaa_mode_supported)
+            quantized_samples = msaa_mode;
+      }
+   }
+
+   return quantized_samples;
+}
 
 /**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
@@ -72,6 +110,8 @@ static void
 update_framebuffer_state( struct st_context *st )
 {
    struct pipe_framebuffer_state *framebuffer = &st->state.framebuffer;
+   struct pipe_context *pipe = st->pipe;
+   struct pipe_screen *screen = pipe->screen;
    struct gl_framebuffer *fb = st->ctx->DrawBuffer;
    struct st_renderbuffer *strb;
    GLuint i;
@@ -82,6 +122,17 @@ update_framebuffer_state( struct st_context *st )
    framebuffer->width = UINT_MAX;
    framebuffer->height = UINT_MAX;
 
+   /**
+    * Quantize the derived default number of samples:
+    *
+    * A query to the driver of supported MSAA values the
+    * hardware supports is done as to legalize the number
+    * of application requested samples, NumSamples.
+    * See commit eb9cf3c for more information.
+    */
+   fb->DefaultGeometry._NumSamples =
+      framebuffer_quantize_num_samples(screen, fb->DefaultGeometry.NumSamples);
+
    /*printf("
Re: [Mesa-dev] [PATCH 15/17] nvc0: handle the case where there are no framebuffer attachments
On 2016-03-25 14:29, Ilia Mirkin wrote: Please leave this and the next patch out of your series. I'm going to need to retest everything carefully once the core support is in (and I get a bit of time). Done. Be sure to remove my Rb if you made changes and i`ll review it again. Thanks, -ilia On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan wrote: From: Ilia Mirkin Signed-off-by: Ilia Mirkin Reviewed-by: Edward O'Callaghan --- src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 7 +++ src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 16 ++-- src/gallium/drivers/nouveau/nvc0/nvc0_surface.c| 4 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c index b7c6faf..add9a79 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c @@ -456,6 +456,13 @@ nvc0_fp_gen_header(struct nvc0_program *fp, struct nv50_ir_prog_info *info) fp->hdr[18] |= 0xf << info->out[i].slot[0]; } + /* TODO: figure out proper condition, but this makes things work when there +* are no "regular" outputs in the frag shader, used when there are no +* attachments. 
+*/ + if (info->numOutputs == 0) + fp->hdr[18] |= 0xf; + fp->fp.early_z = info->prop.fp.earlyFragTests; return 0; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c index 9c64482..53f574b 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c @@ -75,12 +75,11 @@ nvc0_validate_fb(struct nvc0_context *nvc0) struct nvc0_screen *screen = nvc0->screen; unsigned i, ms; unsigned ms_mode = NVC0_3D_MULTISAMPLE_MODE_MS1; +unsigned nr_cbufs = fb->nr_cbufs; bool serialize = false; nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_FB); -BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1); -PUSH_DATA (push, (076543210 << 4) | fb->nr_cbufs); BEGIN_NVC0(push, NVC0_3D(SCREEN_SCISSOR_HORIZ), 2); PUSH_DATA (push, fb->width << 16); PUSH_DATA (push, fb->height << 16); @@ -179,6 +178,18 @@ nvc0_validate_fb(struct nvc0_context *nvc0) PUSH_DATA (push, 0); } +if (nr_cbufs == 0 && !fb->zsbuf) { + unsigned samples = util_next_power_of_two(fb->samples); + + nvc0_fb_set_null_rt(push, 0); + + assert(samples <= 8); + ms_mode = ffs(samples) - 1; + nr_cbufs = 1; +} + +BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1); +PUSH_DATA (push, (076543210 << 4) | nr_cbufs); IMMED_NVC0(push, NVC0_3D(MULTISAMPLE_MODE), ms_mode); ms = 1 << ms_mode; @@ -592,6 +603,7 @@ nvc0_validate_derived_2(struct nvc0_context *nvc0) struct nouveau_pushbuf *push = nvc0->base.pushbuf; if (nvc0->zsa && nvc0->zsa->pipe.alpha.enabled && + nvc0->framebuffer.zsbuf && nvc0->framebuffer.nr_cbufs == 0) { nvc0_fb_set_null_rt(push, 0); BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1); diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c index e8b3a4d..d546957 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c @@ -1043,6 +1043,8 @@ nvc0_blitctx_pre_blit(struct nvc0_blitctx *ctx) ctx->saved.fb.width 
= nvc0->framebuffer.width; ctx->saved.fb.height = nvc0->framebuffer.height; + ctx->saved.fb.samples = nvc0->framebuffer.samples; + ctx->saved.fb.layers = nvc0->framebuffer.layers; ctx->saved.fb.nr_cbufs = nvc0->framebuffer.nr_cbufs; ctx->saved.fb.cbufs[0] = nvc0->framebuffer.cbufs[0]; ctx->saved.fb.zsbuf = nvc0->framebuffer.zsbuf; @@ -1110,6 +1112,8 @@ nvc0_blitctx_post_blit(struct nvc0_blitctx *blit) nvc0->framebuffer.width = blit->saved.fb.width; nvc0->framebuffer.height = blit->saved.fb.height; + nvc0->framebuffer.samples = blit->saved.fb.samples; + nvc0->framebuffer.layers = blit->saved.fb.layers; nvc0->framebuffer.nr_cbufs = blit->saved.fb.nr_cbufs; nvc0->framebuffer.cbufs[0] = blit->saved.fb.cbufs[0]; nvc0->framebuffer.zsbuf = blit->saved.fb.zsbuf; -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
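The attachment-less path in nvc0_validate_fb() rounds the sample count up to a power of two and then takes `ffs(samples) - 1`. Since the patch later computes `ms = 1 << ms_mode`, the mode value appears to be the base-2 logarithm of the sample count. A minimal sketch of that mapping follows; `demo_next_power_of_two` is a hypothetical stand-in for Mesa's util_next_power_of_two(), and a portable loop replaces the POSIX ffs() call so the example builds anywhere.

```c
#include <assert.h>

/* Hypothetical stand-in for util_next_power_of_two(); returns 1 for 0 and 1. */
static unsigned demo_next_power_of_two(unsigned x)
{
   unsigned p = 1;

   while (p < x)
      p *= 2;
   return p;
}

/* Index of the lowest set bit, i.e. what ffs(x) - 1 yields for x != 0. */
static unsigned demo_lowest_bit_index(unsigned x)
{
   unsigned index = 0;

   assert(x != 0);
   while (!(x & 1u)) {
      x >>= 1;
      index++;
   }
   return index;
}

/* Map a sample count to its log2 form: 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3. */
static unsigned demo_samples_to_ms_mode(unsigned samples)
{
   return demo_lowest_bit_index(demo_next_power_of_two(samples));
}
```

A request for 6 samples rounds up to 8 and so maps to mode 3; a zeroed sample count maps to mode 0, i.e. single-sampled.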
Re: [Mesa-dev] [PATCH 10/17] gallium/util: Ensure util_framebuffer_get_num_samples() is valid
On 2016-03-25 14:20, Ilia Mirkin wrote:

Instead of introducing buggy code in patch 6/17 and then fixing it up here, you need to fold this with patch 6 so that it's all done at the same time.

Yea, can do.

Cheers,

On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan wrote:

Upon context creation, internal driver structures are malloc()'ed and memset() to zero them. This results in an invalid number of samples 'by default'. Handle this in the simplest way to avoid elaborate and probably equally sub-optimal solutions.

V2: Minor, use "NOTE:" instead of "N.B." in comment.

Signed-off-by: Edward O'Callaghan
Reviewed-by: Marek Olšák
---
 src/gallium/auxiliary/util/u_framebuffer.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_framebuffer.c b/src/gallium/auxiliary/util/u_framebuffer.c
index 775f050..b020f27 100644
--- a/src/gallium/auxiliary/util/u_framebuffer.c
+++ b/src/gallium/auxiliary/util/u_framebuffer.c
@@ -204,9 +204,15 @@ util_framebuffer_get_num_samples(const struct pipe_framebuffer_state *fb)
     * In the case of ARB_framebuffer_no_attachment
     * we obtain the number of samples directly from
     * the framebuffer state.
+    *
+    * NOTE: fb->samples may wind up as zero due to memset()'s on internal
+    *       driver structures on their initialization and so we take the
+    *       MAX here to ensure we have a valid number of samples. However,
+    *       if samples is legitimately not getting set somewhere
+    *       multi-sampling will evidently break.
     */
    if (!(fb->nr_cbufs || fb->zsbuf))
-      return fb->samples;
+      return MAX2(fb->samples, 1);
 
    for (i = 0; i < fb->nr_cbufs; i++) {
       if (fb->cbufs[i]) {
-- 
2.5.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
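The effect of the MAX2() clamp on a zero-initialised state can be shown with a small standalone sketch. `demo_fb_state` is a hypothetical, trimmed-down stand-in for pipe_framebuffer_state, and MAX2 is redefined locally so the example compiles outside the Mesa tree.

```c
#include <assert.h>
#include <string.h>

/* Local copy of the helper macro, so the sketch is self-contained. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, trimmed-down framebuffer state; illustration only. */
struct demo_fb_state {
   unsigned nr_cbufs;
   void *zsbuf;
   unsigned samples;
};

/* With no attachments, take the sample count from the state itself,
 * clamped so that a memset()-zeroed struct still reports one sample. */
static unsigned demo_get_num_samples(const struct demo_fb_state *fb)
{
   if (!(fb->nr_cbufs || fb->zsbuf))
      return MAX2(fb->samples, 1);

   /* The real helper inspects the attached surfaces here; this sketch
    * simply returns 1 as a placeholder. */
   return 1;
}
```

A freshly zeroed state therefore reports 1 sample, while an explicitly configured attachment-less framebuffer with `samples = 4` still reports 4.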
Re: [Mesa-dev] [PATCH 03/17] mesa/st: Set _NumSamples in update_framebuffer_state()
On 2016-03-25 14:02, Ilia Mirkin wrote:
On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan wrote:

Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported with a framebuffer using no attachment.

Signed-off-by: Edward O'Callaghan
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 51
 1 file changed, 51 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c
index ae883a2..07854ca 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -64,6 +64,44 @@ update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
    framebuffer->height = MIN2(framebuffer->height, surface->height);
 }
 
+/**
+ * Round up the requested multisample count to the next supported sample size.
+ */
+static unsigned
+framebuffer_quantize_num_samples(struct pipe_screen *screen, unsigned num_samples)
+{
+   int quantized_samples = 0;
+   bool msaa_mode_supported;
+
+   if (!num_samples)
+      return 0;
+
+   assert(screen);
+
+   /* Assumes the highest supported MSAA is x32 on any hardware */
+   for (unsigned msaa_mode = 32; msaa_mode >= 1; msaa_mode = msaa_mode/2) {

This should probably start at MaxFramebufferSamples right? Also msaa_mode >= num_samples? [then you can get rid of the if below]

I did it in this manner because I don't trust all C compilers to warn sufficiently on `num_samples' overflows turning this into an infinite loop even though it is an unsigned type. The micro-optimization serves no purpose because the optimizer will trivially reduce the loop down, not that it has that many iterations anyway. The loop as-is is both well bounded and deterministic, nice qualities to have. "Premature optimization is the root of all evil" ~ Donald Knuth.

And lastly I don't know if it's a valid assumption that we can always just divide by 2. That said, I don't know of any hw that actually supports non-power-of-two MSAA levels, so perhaps it's OK.

You are way over-engineering here; it is a totally reasonable assumption, and if such hardware does exist which we would support (I can't see any in the tree currently as far as I am aware) then they can provide follow-up fixes.

+      assert(!(msaa_mode > 32 || msaa_mode == 0)); /* be safe from int overflows */
+      if (msaa_mode >= num_samples) {
+         /**
+          * For ARB_framebuffer_no_attachment, A format of
+          * PIPE_FORMAT_NONE implies what number of samples is
+          * supported for a framebuffer with no attachment. Thus the
+          * drivers callback must be adjusted for this.
+          */
+         msaa_mode_supported = screen->is_format_supported(screen,
+               PIPE_FORMAT_NONE, PIPE_TEXTURE_2D,
+               msaa_mode, PIPE_BIND_RENDER_TARGET);
+         /**
+          * Check if the MSAA mode that is higher than the requested
+          * num_samples is supported, and if so returning it.
+          */
+         if (msaa_mode_supported)
+            quantized_samples = msaa_mode;
+      }
+   }
+
+   return quantized_samples;
+}
 
 /**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
@@ -72,6 +110,8 @@ static void
 update_framebuffer_state( struct st_context *st )
 {
    struct pipe_framebuffer_state *framebuffer = &st->state.framebuffer;
+   struct pipe_context *pipe = st->pipe;
+   struct pipe_screen *screen = pipe->screen;
    struct gl_framebuffer *fb = st->ctx->DrawBuffer;
    struct st_renderbuffer *strb;
    GLuint i;
@@ -82,6 +122,17 @@ update_framebuffer_state( struct st_context *st )
    framebuffer->width = UINT_MAX;
    framebuffer->height = UINT_MAX;
 
+   /**
+    * Quantize the derived default number of samples:
+    *
+    * A query to the driver of supported MSAA values the
+    * hardware supports is done as to legalize the number
+    * of application requested samples, NumSamples.
+    * See commit eb9cf3c for more information.
+    */
+   fb->DefaultGeometry._NumSamples =
+      framebuffer_quantize_num_samples(screen, fb->DefaultGeometry.NumSamples);
+
    /*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/
 
    /* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
-- 
2.5.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
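To make the behaviour of the constant-bounded loop from this thread easier to follow, here is a standalone sketch of the same quantization idea. `demo_mode_supported` is a hypothetical predicate standing in for the driver's is_format_supported() query, and the set of supported modes (1x, 2x, 4x, 8x) is an assumption made purely for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability check standing in for the driver query;
 * the pretend hardware supports 1x, 2x, 4x and 8x only. */
static bool demo_mode_supported(unsigned samples)
{
   return samples == 1 || samples == 2 || samples == 4 || samples == 8;
}

/* Walk the power-of-two modes from 32 down to 1 and remember the last
 * (that is, the smallest) supported mode that still covers the request.
 * Returns 0 when nothing suitable exists or when no samples were asked for. */
static unsigned demo_quantize_num_samples(unsigned num_samples)
{
   unsigned quantized = 0;

   if (num_samples == 0)
      return 0;

   /* The bounds are constants, so the loop always runs six iterations. */
   for (unsigned mode = 32; mode >= 1; mode /= 2) {
      if (mode >= num_samples && demo_mode_supported(mode))
         quantized = mode;
   }

   return quantized;
}
```

Because the loop runs downward and overwrites its result, a request for 3 samples first matches 8 and then settles on 4, the smallest supported mode that is still large enough. A request for 16 samples yields 0 on this pretend hardware, since neither 16x nor 32x is supported.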
Re: [Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed
Reviewed-by: Edward O'Callaghan

On 2016-03-23 04:27, Marek Olšák wrote:

From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index b9bdd47..b8fde00 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen *screen,
    if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
       height = 1;
       depth = res->array_size;
-   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
+   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
+              type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
       if (sampler || res->target != PIPE_TEXTURE_3D)
          depth = res->array_size;
    } else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/17] gallium/aux: Fix u_blitter.c for layers/samples
Ah, you are correct: this is no longer needed in the push branch. We can drop this one from the series as it's a no-op. Please ignore it, and thanks for spotting it.

On 2016-03-22 02:43, Marek Olšák wrote:
Does this fix anything even? The blitter always binds something, thus this should have no effect.

Marek

On Sat, Mar 19, 2016 at 7:41 AM, Edward O'Callaghan wrote:
Signed-off-by: Edward O'Callaghan
---
 src/gallium/auxiliary/util/u_blitter.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index 43fbd8e..c4a32e8 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1566,11 +1566,13 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
    /* Initialize framebuffer state. */
    fb_state.width = dst->width;
    fb_state.height = dst->height;
-   fb_state.nr_cbufs = blit_depth || blit_stencil ? 0 : 1;
    fb_state.cbufs[0] = NULL;
    fb_state.zsbuf = NULL;

    if (blit_depth || blit_stencil) {
+      fb_state.nr_cbufs = 0;
+      fb_state.layers = 0;
+      fb_state.samples = 1;
       pipe->bind_blend_state(pipe, ctx->blend[0][0]);

       if (blit_depth && blit_stencil) {
@@ -1594,6 +1596,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
       }

    } else {
+      fb_state.nr_cbufs = 1;
       unsigned colormask = mask & PIPE_MASK_RGBA;

       pipe->bind_blend_state(pipe, ctx->blend[colormask][alpha_blend]);
--
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/11] radeonsi: shader buffer support (atomic counters, ssbo)
Hi Nicolai,

Thanks for taking over this work and going the whole nine yards with it!

This series is,
Reviewed-by: Edward O'Callaghan

Thanks again,
Edward.

On 2016-03-22 10:21, Nicolai Hähnle wrote:
Hi,

since shader images have laid most of the foundation, here are shader buffers now. This is the last extension missing for OpenGL 4.2 (we still need to turn on GLSL 4.2, but I think that only involves flipping a bit).

As with shader images, this extension needs bleeding edge LLVM - this time, important patches have not landed upstream yet, and if you want to try this code you'll need my LLVM branch at https://cgit.freedesktop.org/~nh/llvm/log/?h=images

(For those following along at home, the necessary LLVM patches for shader images have already landed upstream.)

In principle, there are two alternative implementations for shader buffers: using LLVM IR pointers with LLVM-native load/store instructions directly, or using intrinsics that operate on GCN buffer descriptors. This implementation uses the second approach.

A brief comparison between the two approaches:

1. The pointer approach would use FLAT memory instructions on CI+, which operate on 64 bit pointers rather than 128 bit buffer descriptors. This would reduce SGPR memory pressure slightly.

2. LLVM understands pointers for alias analysis, so it's possible that it would generate somewhat better code if we were to use pointers in the IR.

3. The buffer load/store instructions have built-in bounds checks. Bounds checks are required for an honest implementation of the ARB_robustness extension, which we claim to support.

The last point makes it obvious that the implementation really needs to use buffer intrinsics, but it'd be interesting to know how big the difference in code quality is versus something that uses pointers. To get the best of both worlds, we should really find a way to teach LLVM's alias analysis about what those buffer descriptors mean. For now, this current approach is the right way to do it.

Please review!
Thanks,
Nicolai

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.
Too quick, very nice cleanup, thanks. Reviewed-by: Edward O'Callaghan On 2016-03-22 12:58, Bas Nieuwenhuizen wrote: This removes any dependency on driver validation of the number of framebuffer samples. Signed-off-by: Bas Nieuwenhuizen --- src/mesa/drivers/dri/i965/brw_util.h | 5 +++-- src/mesa/drivers/dri/i965/gen6_cc.c| 6 +++--- src/mesa/drivers/dri/i965/gen6_multisample_state.c | 2 +- src/mesa/drivers/dri/i965/gen8_blend_state.c | 6 +++--- src/mesa/drivers/dri/i965/gen8_depth_state.c | 3 ++- src/mesa/drivers/dri/i965/gen8_sf_state.c | 4 ++-- src/mesa/main/framebuffer.c| 19 +++ src/mesa/main/framebuffer.h| 3 +++ src/mesa/main/mtypes.h | 1 - src/mesa/main/state.c | 17 - src/mesa/program/prog_statevars.c | 2 +- src/mesa/state_tracker/st_atom_rasterizer.c| 4 ++-- src/mesa/state_tracker/st_atom_shader.c| 2 +- src/mesa/swrast/s_points.c | 4 ++-- 14 files changed, 42 insertions(+), 36 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_util.h b/src/mesa/drivers/dri/i965/brw_util.h index 1f27e98..3e9a6ee 100644 --- a/src/mesa/drivers/dri/i965/brw_util.h +++ b/src/mesa/drivers/dri/i965/brw_util.h @@ -34,6 +34,7 @@ #define BRW_UTIL_H #include "brw_context.h" +#include "main/framebuffer.h" extern GLuint brw_translate_blend_factor( GLenum factor ); extern GLuint brw_translate_blend_equation( GLenum mode ); @@ -49,13 +50,13 @@ brw_get_line_width(struct brw_context *brw) * implementation-dependent maximum non-antialiased line width." */ float line_width = - CLAMP(!brw->ctx.Multisample._Enabled && !brw->ctx.Line.SmoothFlag + CLAMP(!_mesa_is_multisample_enabled(&brw->ctx) && !brw->ctx.Line.SmoothFlag ? 
roundf(brw->ctx.Line.Width) : brw->ctx.Line.Width, 0.0f, brw->ctx.Const.MaxLineWidth); uint32_t line_width_u3_7 = U_FIXED(line_width, 7); /* Line width of 0 is not allowed when MSAA enabled */ - if (brw->ctx.Multisample._Enabled) { + if (_mesa_is_multisample_enabled(&brw->ctx)) { if (line_width_u3_7 == 0) line_width_u3_7 = 1; } else if (brw->ctx.Line.SmoothFlag && line_width < 1.5f) { diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c b/src/mesa/drivers/dri/i965/gen6_cc.c index cee139b..f5a7d4d 100644 --- a/src/mesa/drivers/dri/i965/gen6_cc.c +++ b/src/mesa/drivers/dri/i965/gen6_cc.c @@ -198,14 +198,14 @@ gen6_upload_blend_state(struct brw_context *brw) if(!is_buffer_zero_integer_format) { /* _NEW_MULTISAMPLE */ blend[b].blend1.alpha_to_coverage = -ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToCoverage; +_mesa_is_multisample_enabled(ctx) && ctx->Multisample.SampleAlphaToCoverage; /* From SandyBridge PRM, volume 2 Part 1, section 8.2.3, BLEND_STATE: * DWord 1, Bit 30 (AlphaToOne Enable): * "If Dual Source Blending is enabled, this bit must be disabled" */ WARN_ONCE(ctx->Color.Blend[b]._UsesDualSrc && - ctx->Multisample._Enabled && + _mesa_is_multisample_enabled(ctx) && ctx->Multisample.SampleAlphaToOne, "HW workaround: disabling alpha to one with dual src " "blending\n"); @@ -213,7 +213,7 @@ gen6_upload_blend_state(struct brw_context *brw) blend[b].blend1.alpha_to_one = false; else blend[b].blend1.alpha_to_one = - ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToOne; + _mesa_is_multisample_enabled(ctx) && ctx->Multisample.SampleAlphaToOne; blend[b].blend1.alpha_to_coverage_dither = (brw->gen >= 7); } diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c index 8eb620d..fcd313a 100644 --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c @@ -171,7 +171,7 @@ gen6_determine_sample_mask(struct brw_context *brw) /* 
BRW_NEW_NUM_SAMPLES */ unsigned num_samples = brw->num_samples; - if (ctx->Multisample._Enabled) { + if (_mesa_is_multisample_enabled(ctx)) { if (ctx->Multisample.SampleCoverage) { coverage = ctx->Multisample.SampleCoverageValue; coverage_invert = ctx->Multisample.SampleCoverageInvert; diff --git a/src/mesa/drivers/dri/i965/gen8_blend_state.c b/src/mesa/drivers/dri/i965/gen8_blend_state.c index 786c79a..63186bd 100644 --- a/src/mesa/drivers/dri/i965/gen8_blend_state.c +++ b/src/mesa/drivers/dri/i965/gen8_blend_state.c @@ -65,7 +65,7 @@ gen8_upload_blend_state(struct brw_context *brw) if (rb_zero_type != GL_INT && rb_zero_type != GL_UNSIGNED_INT) { /* _NEW_MULTISAMPLE */ - if (ctx->Mul
Re: [Mesa-dev] [PATCH] tgsi: drop unused set_exec/kill_mask interfaces.
Reviewed-by: Edward O'Callaghan On 2016-03-22 11:29, Dave Airlie wrote: From: Dave Airlie These don't get used and haven't been in git history from what I can see, so drop them. Signed-off-by: Dave Airlie --- src/gallium/auxiliary/draw/draw_gs.c | 6 -- src/gallium/auxiliary/draw/draw_vs_exec.c | 6 -- src/gallium/auxiliary/tgsi/tgsi_exec.h| 25 - 3 files changed, 37 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index 6b33341..fcef31b 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -206,12 +206,6 @@ static unsigned tgsi_gs_run(struct draw_geometry_shader *shader, { struct tgsi_exec_machine *machine = shader->machine; - tgsi_set_exec_mask(machine, - 1, - input_primitives > 1, - input_primitives > 2, - input_primitives > 3); - /* run interpreter */ tgsi_exec_machine_run(machine); diff --git a/src/gallium/auxiliary/draw/draw_vs_exec.c b/src/gallium/auxiliary/draw/draw_vs_exec.c index abd64f5..3fd8ef3 100644 --- a/src/gallium/auxiliary/draw/draw_vs_exec.c +++ b/src/gallium/auxiliary/draw/draw_vs_exec.c @@ -159,12 +159,6 @@ vs_exec_run_linear( struct draw_vertex_shader *shader, input = (const float (*)[4])((const char *)input + input_stride); } - tgsi_set_exec_mask(machine, - 1, - max_vertices > 1, - max_vertices > 2, - max_vertices > 3); - /* run interpreter */ tgsi_exec_machine_run( machine ); diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h b/src/gallium/auxiliary/tgsi/tgsi_exec.h index 12a6875..991c3bf 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.h +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h @@ -196,10 +196,6 @@ struct tgsi_sampler #define TGSI_EXEC_TEMP_HALF_I (TGSI_EXEC_NUM_TEMPS + 3) #define TGSI_EXEC_TEMP_HALF_C 0 -/* execution mask, each value is either 0 or ~0 */ -#define TGSI_EXEC_MASK_I(TGSI_EXEC_NUM_TEMPS + 3) -#define TGSI_EXEC_MASK_C1 - /* 4 register buffer for various purposes */ #define TGSI_EXEC_TEMP_R0 (TGSI_EXEC_NUM_TEMPS + 4) #define 
TGSI_EXEC_NUM_TEMP_R4 @@ -397,27 +393,6 @@ boolean tgsi_check_soa_dependencies(const struct tgsi_full_instruction *inst); -static inline void -tgsi_set_kill_mask(struct tgsi_exec_machine *mach, unsigned mask) -{ - mach->Temps[TGSI_EXEC_TEMP_KILMASK_I].xyzw[TGSI_EXEC_TEMP_KILMASK_C].u[0] = - mask; -} - - -/** Set execution mask values prior to executing the shader */ -static inline void -tgsi_set_exec_mask(struct tgsi_exec_machine *mach, - boolean ch0, boolean ch1, boolean ch2, boolean ch3) -{ - int *mask = mach->Temps[TGSI_EXEC_MASK_I].xyzw[TGSI_EXEC_MASK_C].i; - mask[0] = ch0 ? ~0 : 0; - mask[1] = ch1 ? ~0 : 0; - mask[2] = ch2 ? ~0 : 0; - mask[3] = ch3 ? ~0 : 0; -} - - extern void tgsi_exec_set_constant_buffers(struct tgsi_exec_machine *mach, unsigned num_bufs, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: fix out-of-bounds indexing of shader images
Reviewed-by: Edward O'Callaghan On 2016-03-22 07:41, Nicolai Hähnle wrote: From: Nicolai Hähnle Results are undefined but may not crash. Without this change, out-of-bounds indexing can lead to VM faults and GPU hangs. Constant buffers, samplers, and possibly others will eventually need similar treatment to support GL_ARB_robust_buffer_access_behavior. --- src/gallium/drivers/radeonsi/si_shader.c | 44 +++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 9ad2290..1e4bf82 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -532,6 +532,37 @@ static LLVMValueRef get_indirect_index(struct si_shader_context *ctx, } /** + * Like get_indirect_index, but restricts the return value to a (possibly + * undefined) value inside [0..num). + */ +static LLVMValueRef get_bounded_indirect_index(struct si_shader_context *ctx, + const struct tgsi_ind_register *ind, + int rel_index, unsigned num) +{ + struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm; + LLVMBuilderRef builder = gallivm->builder; + LLVMValueRef result = get_indirect_index(ctx, ind, rel_index); + LLVMValueRef c_max = LLVMConstInt(ctx->i32, num - 1, 0); + LLVMValueRef cc; + + if (util_is_power_of_two(num)) { + result = LLVMBuildAnd(builder, result, c_max, ""); + } else { + /* In theory, this MAX pattern should result in code that is +* as good as the bit-wise AND above. +* +* In practice, LLVM generates worse code (at the time of +* writing), because its value tracking is not strong enough. +*/ + cc = LLVMBuildICmp(builder, LLVMIntULE, result, c_max, ""); + result = LLVMBuildSelect(builder, cc, result, c_max, ""); + } + + return result; +} + + +/** * Calculate a dword address given an input or output register and a stride. 
*/ static LLVMValueRef get_dw_address(struct si_shader_context *ctx, @@ -2814,7 +2845,18 @@ image_fetch_rsrc( LLVMValueRef rsrc_ptr; LLVMValueRef tmp; - ind_index = get_indirect_index(ctx, &image->Indirect, image->Register.Index); + /* From the GL_ARB_shader_image_load_store extension spec: +* +*If a shader performs an image load, store, or atomic +*operation using an image variable declared as an array, +*and if the index used to select an individual element is +*negative or greater than or equal to the size of the +*array, the results of the operation are undefined but may +*not lead to termination. +*/ + ind_index = get_bounded_indirect_index(ctx, &image->Indirect, + image->Register.Index, + SI_NUM_IMAGES); rsrc_ptr = LLVMGetParam(ctx->radeon_bld.main_fn, SI_PARAM_IMAGES); tmp = build_indexed_load_const(ctx, rsrc_ptr, ind_index); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
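The clamping strategy in `get_bounded_indirect_index` can be illustrated on plain integers. The following is a minimal sketch, not the LLVM IR builder code from the patch: `bound_index` and `is_power_of_two` are hypothetical helper names. A power-of-two array size is handled with a bit-wise AND, any other size with an unsigned compare-and-select against `num - 1`. A negative index reinterpreted as unsigned is a very large value, so both paths bring it back into range.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if v is a non-zero power of two. */
static bool
is_power_of_two(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Restrict index to [0, num). The value returned for an out-of-range index
 * is arbitrary but always valid, which matches "undefined results, but no
 * termination".
 */
static unsigned
bound_index(unsigned index, unsigned num)
{
   unsigned max;

   assert(num > 0);
   max = num - 1;

   if (is_power_of_two(num))
      return index & max;            /* wrap: cheap bit-wise AND */

   return index <= max ? index : max; /* clamp: unsigned compare + select */
}
```

Note that the two paths give different in-range values for an out-of-range index (wrapping versus clamping); since the result is undefined by the spec, either is acceptable.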
Re: [Mesa-dev] [PATCH 01/17] gallium: Add PIPE_CAP_MSAA_MODES
On 2016-03-21 21:06, Marek Olšák wrote: On Sat, Mar 19, 2016 at 5:09 PM, Ilia Mirkin wrote: On Sat, Mar 19, 2016 at 12:02 PM, Bas Nieuwenhuizen wrote: On Sat, Mar 19, 2016 at 4:25 PM, Ilia Mirkin wrote: On Sat, Mar 19, 2016 at 11:14 AM, Bas Nieuwenhuizen wrote: That would limit us to supporting sample counts for which we have texture formats. As far as I understand with radeonsi we can support 16 samples without any attachments, but all formats are limited to <= 8 samples. So you're going to end up with a situation where GL_MAX_SAMPLES is less than GL_MAX_FRAMEBUFFER_SAMPLES? I don't know that that's a useful thing to have. This implementation still has the problem of only supporting POT MSAA levels (although tbh I'm not 100% sure there's hw out there that supports NPOT MSAA levels). If people really want this, I think the way to go would be to make is_format_supported() work with PIPE_FORMAT_NONE and do it that way. Also, are you *sure* that's the case on radeonsi? I find it very odd that the rasterizer would support a higher MSAA level than the highest attachment would... I am pretty confident that this is the case. I just tested 16 samples (although this series seems to miss changing MaxFramebufferSamples), and the driver disallows any texture format with > 8 samples [1]. Furthermore the proprietary driver on Windows seems to have GL_MAX_SAMPLES=8 and GL_MAX_FRAMEBUFFER_SAMPLES=16 [2]. OK. I still think it's crazy, but it is what it is :) It's called EQAA (similar to CSAA). The hardware can do 16 unique depth samples, but only 8 unique color samples can be stored. Other than that, the rasterization hw supports 16x MSAA fully. Using PIPE_FORMAT_NONE to query the driver would probably be a bit less error prone than the current code that sets the masks, so that would be fine with me. 
Actually my earlier criticism about it only doing POT levels is a bit off -- after reading some more of the code, I just think that the settings to drivers were off - it should have been (1 << 8) | (1 << 4), etc. This works up to 32x MSAA, which is not supported by anyone (for real, although NVIDIA blob drivers do fake it). R300 can do 6x MSAA, but granted it won't support this extension. I do still prefer to avoid having separate places where this info is encoded... so I maintain my vote for using PIPE_FORMAT_NONE in is_format_supported. Same here. Same. I fixed this in the up-coming series. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 17/17] GL3.txt: Mark ARB_framebuffer_no_attachments as done
On 2016-03-19 21:08, Kai Wasserbäch wrote:
Edward O'Callaghan wrote on 19.03.2016 07:41:
Signed-off-by: Edward O'Callaghan
---
 docs/GL3.txt              | 2 +-
 docs/relnotes/11.3.0.html | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 3058996..b9fc86b 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -172,7 +172,7 @@ GL 4.3, GLSL 4.30:
   GL_KHR_debug                            DONE (all drivers)
   GL_ARB_explicit_uniform_location        DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport          DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
-  GL_ARB_framebuffer_no_attachments       DONE (i965)
+  GL_ARB_framebuffer_no_attachments       DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_internalformat_query2            DONE (i965)
   GL_ARB_invalidate_subdata               DONE (all drivers)
   GL_ARB_multi_draw_indirect              DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)

Should this also update the GL_ARB_framebuffer_no_attachments line in the OpenGL ES 3.1 section? Or is more work needed for that? In the latter case a small comment in the commit message might be nice.

Cheers,
Kai

I am not working on ES and don't really know much about it, so I'll leave that one to the 'experts'. My focus here is just regular GL.

Thanks,

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] st/mesa, radeonsi: some MemoryBarrier fixes
Hi Nicolai,

This series is,
Reviewed-by: Edward O'Callaghan

Thanks,

On 2016-03-19 14:37, Nicolai Hähnle wrote:
Hi,

these patches apply on top of my ARB_shader_image_load_store series. Together, they fix a few remaining fails with piglit's arb_shader_image_load_store-host-mem-barrier. You can see them in context at https://cgit.freedesktop.org/~nh/mesa/log/?h=ssbo

The basic assumption for how barrier bits are translated is that each Gallium object / binding point has its own PIPE_BARRIER_* bit, but the driver will automatically do the necessary invalidations/flushes for transfers and blit-type operations, as well as when the framebuffer state is changed.

This is still very tricky stuff to get right, but at least I think it's shaping up nicely for radeonsi, as evidenced by the fact that the host-mem-barrier test passes (and the control subtests also show that what we're doing here isn't just a no-op).

Please review!
Thanks,
Nicolai
---
 src/gallium/drivers/radeonsi/si_state.c   | 7 +++--
 src/gallium/include/pipe/p_defines.h      | 1 +
 .../state_tracker/st_cb_texturebarrier.c  | 25 +-
 3 files changed, 30 insertions(+), 3 deletions(-)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] configure.ac require libdrm 2.4.65 for amdgpu because of drmGetDevice
Reviewed-by: Edward O'Callaghan On 2016-03-14 03:46, Marek Olšák wrote: From: Marek Olšák --- configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 49be147..d768db6 100644 --- a/configure.ac +++ b/configure.ac @@ -70,7 +70,7 @@ AC_SUBST([OPENCL_VERSION]) dnl Versions for external dependencies LIBDRM_REQUIRED=2.4.60 LIBDRM_RADEON_REQUIRED=2.4.56 -LIBDRM_AMDGPU_REQUIRED=2.4.63 +LIBDRM_AMDGPU_REQUIRED=2.4.65 LIBDRM_INTEL_REQUIRED=2.4.61 LIBDRM_NVVIEUX_REQUIRED=2.4.66 LIBDRM_NOUVEAU_REQUIRED=2.4.66 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] mesa: gl_NumSamples should always be at least one
Reviewed-by: Edward O'Callaghan

I ran into the same issue.

On 2016-02-16 17:31, Ilia Mirkin wrote:
From ARB_sample_shading:

"gl_NumSamples is the total number of samples in the framebuffer, or one if rendering to a non-multisample framebuffer"

So make sure to always pass in at least 1.

Signed-off-by: Ilia Mirkin
---
 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c
index eed2412..489f75f 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -353,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const gl_state_index state[],
       }
       return;
    case STATE_NUM_SAMPLES:
-      ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer);
+      ((int *)value)[0] = MAX2(1, _mesa_geometric_samples(ctx->DrawBuffer));
       return;
    case STATE_DEPTH_RANGE:
       value[0] = ctx->ViewportArray[0].Near; /* near */

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
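The effect of the `MAX2(1, ...)` change can be shown with a tiny stand-alone sketch. `shader_visible_num_samples` is a hypothetical name used only for illustration; it assumes, as the commit message implies, that the geometric sample count is 0 for a non-multisample framebuffer.

```c
/*
 * Hypothetical helper mirroring MAX2(1, samples): the value exposed to the
 * shader as gl_NumSamples is never below one, even when the framebuffer
 * reports a sample count of 0.
 */
static int
shader_visible_num_samples(int geometric_samples)
{
   return geometric_samples > 1 ? geometric_samples : 1;
}
```

A non-multisample framebuffer therefore reports 1, while multisample framebuffers keep their actual sample count.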
Re: [Mesa-dev] [PATCH] mesa/dri/r200: Refrain from using symbol links in repo
Disregard, apparently this breaks out-of-tree builds. There is perhaps no good solution here, so I'll stay away from this can of worms for now.

On 2016-02-18 15:46, Edward O'Callaghan wrote:
Just use the relative path in the Makefile.source over symbol links that are not necessarily portable.

Untested as I don't have this old hardware.

Signed-off-by: Edward O'Callaghan
---
 src/mesa/drivers/dri/r200/Makefile.sources        | 58 +++
 src/mesa/drivers/dri/r200/radeon_buffer_objects.c | 1 -
 src/mesa/drivers/dri/r200/radeon_buffer_objects.h | 1 -
 src/mesa/drivers/dri/r200/radeon_chipset.h        | 1 -
 src/mesa/drivers/dri/r200/radeon_cmdbuf.h         | 1 -
 src/mesa/drivers/dri/r200/radeon_common.c         | 1 -
 src/mesa/drivers/dri/r200/radeon_common.h         | 1 -
 src/mesa/drivers/dri/r200/radeon_common_context.c | 1 -
 src/mesa/drivers/dri/r200/radeon_common_context.h | 1 -
 src/mesa/drivers/dri/r200/radeon_debug.c          | 1 -
 src/mesa/drivers/dri/r200/radeon_debug.h          | 1 -
 src/mesa/drivers/dri/r200/radeon_dma.c            | 1 -
 src/mesa/drivers/dri/r200/radeon_dma.h            | 1 -
 src/mesa/drivers/dri/r200/radeon_fbo.c            | 1 -
 src/mesa/drivers/dri/r200/radeon_fog.c            | 1 -
 src/mesa/drivers/dri/r200/radeon_fog.h            | 1 -
 src/mesa/drivers/dri/r200/radeon_mipmap_tree.c    | 1 -
 src/mesa/drivers/dri/r200/radeon_mipmap_tree.h    | 1 -
 src/mesa/drivers/dri/r200/radeon_pixel_read.c     | 1 -
 src/mesa/drivers/dri/r200/radeon_queryobj.c       | 1 -
 src/mesa/drivers/dri/r200/radeon_queryobj.h       | 1 -
 src/mesa/drivers/dri/r200/radeon_screen.c         | 1 -
 src/mesa/drivers/dri/r200/radeon_screen.h         | 1 -
 src/mesa/drivers/dri/r200/radeon_span.c           | 1 -
 src/mesa/drivers/dri/r200/radeon_span.h           | 1 -
 src/mesa/drivers/dri/r200/radeon_tex_copy.c       | 1 -
 src/mesa/drivers/dri/r200/radeon_texture.c        | 1 -
 src/mesa/drivers/dri/r200/radeon_texture.h        | 1 -
 src/mesa/drivers/dri/r200/radeon_tile.c           | 1 -
 src/mesa/drivers/dri/r200/radeon_tile.h           | 1 -
 30 files changed, 29 insertions(+), 58 deletions(-)
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_buffer_objects.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_buffer_objects.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_chipset.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_cmdbuf.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_common.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_common.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_common_context.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_common_context.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_debug.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_debug.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_dma.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_dma.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_fbo.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_fog.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_fog.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_mipmap_tree.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_mipmap_tree.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_pixel_read.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_queryobj.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_queryobj.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_screen.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_screen.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_span.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_span.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_tex_copy.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_texture.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_texture.h
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_tile.c
 delete mode 120000 src/mesa/drivers/dri/r200/radeon_tile.h

diff --git a/src/mesa/drivers/dri/r200/Makefile.sources b/src/mesa/drivers/dri/r200/Makefile.sources
index dbcb9af..ef2e7be 100644
--- a/src/mesa/drivers/dri/r200/Makefile.sources
+++ b/src/mesa/drivers/dri/r200/Makefile.sources
@@ -1,30 +1,30 @@
 R200_COMMON_FILES = \
-	radeon_buffer_objects.c \
-	radeon_buffer_objects.h \
-	radeon_cmdbuf.h \
-	radeon_common.c \
-	radeon_common.h \
-	radeon_common_context.c \
-	radeon_common_context.h \
-	radeon_debug.c \
-	radeon_debug.h \
-	radeon_dma.c \
-	radeon_dma.h \
-	radeon_fbo.c \
-	radeon_fog.c \
-	radeon_fog.h \
-	radeon_mipmap_tree.c \
-	radeon_mipmap_tree.h \
-	radeon_pixel_read.c \
-	radeon_queryobj.c \
-	radeon_queryobj.h \
-	radeon_span.c \
-	radeon_span.h \
-	radeon_tex_copy.c \
-	radeon_texture.c \
-	radeon_texture.h \
-	radeon_tile.c \
-	radeon_
Re: [Mesa-dev] [PATCH v3] clover: fix build failure since bfd695e
Thanks kindly. Reviewed-by: Edward O'Callaghan On 2016-02-14 09:39, Serge Martin wrote: --- src/gallium/state_trackers/clover/core/kernel.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index 41b3852..8396be9 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -76,9 +76,9 @@ kernel::launch(command_queue &q, exec.g_buffers.data(), g_handles.data()); // Fill information for the launch_grid() call. - info.block = pad_vector(q, block_size, 1).data(), - info.grid = pad_vector(q, reduced_grid_size, 1).data(), - info.pc = find(name_equals(_name), m.sysm).offset; + copy(pad_vector(q, block_size, 1), info.block); + copy(pad_vector(q, reduced_grid_size, 1), info.grid); + info.pc = find(name_equals(_name), m.syms).offset; info.input = exec.input.data(); q.pipe->launch_grid(q.pipe, &info); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] tgsi: break gigantic tgsi_scan_shader() function into pieces
This series is, Reviewed-by: Edward O'Callaghan On 2016-02-06 11:56, Brian Paul wrote: New functions for examining instructions, declarations, etc. --- src/gallium/auxiliary/tgsi/tgsi_scan.c | 739 + 1 file changed, 375 insertions(+), 364 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c index 687fb54..4199dbe 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c @@ -44,6 +44,375 @@ +static void +scan_instruction(struct tgsi_shader_info *info, + const struct tgsi_full_instruction *fullinst, + unsigned *current_depth) +{ + unsigned i; + + assert(fullinst->Instruction.Opcode < TGSI_OPCODE_LAST); + info->opcode_count[fullinst->Instruction.Opcode]++; + + switch (fullinst->Instruction.Opcode) { + case TGSI_OPCODE_IF: + case TGSI_OPCODE_UIF: + case TGSI_OPCODE_BGNLOOP: + (*current_depth)++; + info->max_depth = MAX2(info->max_depth, *current_depth); + break; + case TGSI_OPCODE_ENDIF: + case TGSI_OPCODE_ENDLOOP: + (*current_depth)--; + break; + default: + break; + } + + if (fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_CENTROID || + fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET || + fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE) { + const struct tgsi_full_src_register *src0 = &fullinst->Src[0]; + unsigned input; + + if (src0->Register.Indirect && src0->Indirect.ArrayID) + input = info->input_array_first[src0->Indirect.ArrayID]; + else + input = src0->Register.Index; + + /* For the INTERP opcodes, the interpolation is always + * PERSPECTIVE unless LINEAR is specified. 
+ */ + switch (info->input_interpolate[input]) { + case TGSI_INTERPOLATE_COLOR: + case TGSI_INTERPOLATE_CONSTANT: + case TGSI_INTERPOLATE_PERSPECTIVE: + switch (fullinst->Instruction.Opcode) { + case TGSI_OPCODE_INTERP_CENTROID: +info->uses_persp_opcode_interp_centroid = true; +break; + case TGSI_OPCODE_INTERP_OFFSET: +info->uses_persp_opcode_interp_offset = true; +break; + case TGSI_OPCODE_INTERP_SAMPLE: +info->uses_persp_opcode_interp_sample = true; +break; + } + break; + + case TGSI_INTERPOLATE_LINEAR: + switch (fullinst->Instruction.Opcode) { + case TGSI_OPCODE_INTERP_CENTROID: +info->uses_linear_opcode_interp_centroid = true; +break; + case TGSI_OPCODE_INTERP_OFFSET: +info->uses_linear_opcode_interp_offset = true; +break; + case TGSI_OPCODE_INTERP_SAMPLE: +info->uses_linear_opcode_interp_sample = true; +break; + } + break; + } + } + + if (fullinst->Instruction.Opcode >= TGSI_OPCODE_F2D && + fullinst->Instruction.Opcode <= TGSI_OPCODE_DSSG) + info->uses_doubles = true; + + for (i = 0; i < fullinst->Instruction.NumSrcRegs; i++) { + const struct tgsi_full_src_register *src = &fullinst->Src[i]; + int ind = src->Register.Index; + + /* Mark which inputs are effectively used */ + if (src->Register.File == TGSI_FILE_INPUT) { + unsigned usage_mask; + usage_mask = tgsi_util_get_inst_usage_mask(fullinst, i); + if (src->Register.Indirect) { +for (ind = 0; ind < info->num_inputs; ++ind) { + info->input_usage_mask[ind] |= usage_mask; +} + } else { +assert(ind >= 0); +assert(ind < PIPE_MAX_SHADER_INPUTS); +info->input_usage_mask[ind] |= usage_mask; + } + + if (info->processor == TGSI_PROCESSOR_FRAGMENT && + !src->Register.Indirect) { +unsigned name = + info->input_semantic_name[src->Register.Index]; +unsigned index = + info->input_semantic_index[src->Register.Index]; + +if (name == TGSI_SEMANTIC_POSITION && +(src->Register.SwizzleX == TGSI_SWIZZLE_Z || + src->Register.SwizzleY == TGSI_SWIZZLE_Z || + src->Register.SwizzleZ == TGSI_SWIZZLE_Z || + src->Register.SwizzleW == 
TGSI_SWIZZLE_Z)) + info->reads_z = TRUE; + +if (name == TGSI_SEMANTIC_COLOR) { + unsigned mask = + (1 << src->Register.SwizzleX) | + (1 << src->Register.SwizzleY) | + (1 << src->Register.SwizzleZ) | + (1 << src->Register.SwizzleW); + + info->colors_read |= mask << (index * 4); +} + } + } + + /* check for indirect register reads */ + if (src->Register.Indirect) { + info->indirect_files |= (1 << src->Register.File); + info->indirect_files_read |= (1 <<
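The `colors_read` bookkeeping in the hunk above packs one 4-bit channel mask per color semantic index. As a minimal sketch of that bit arithmetic only (the `SWIZZLE_*` enum and `color_read_bits` are local, hypothetical stand-ins, assuming swizzle values 0..3 for X..W as the shifts in the patch imply):

```c
/* Local stand-ins for the four swizzle selectors, numbered 0..3. */
enum { SWIZZLE_X = 0, SWIZZLE_Y = 1, SWIZZLE_Z = 2, SWIZZLE_W = 3 };

/*
 * Build the bits to OR into a colors_read-style mask: one bit per channel
 * actually selected by the source swizzle, shifted into the 4-bit slot that
 * belongs to the given semantic index.
 */
static unsigned
color_read_bits(unsigned swz_x, unsigned swz_y, unsigned swz_z,
                unsigned swz_w, unsigned semantic_index)
{
   unsigned mask = (1u << swz_x) | (1u << swz_y) |
                   (1u << swz_z) | (1u << swz_w);

   return mask << (semantic_index * 4);
}
```

A `.xxxx` swizzle on color 0 thus marks only the X channel as read, while an identity swizzle on color 1 sets all four bits of the second nibble.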
Re: [Mesa-dev] [PATCH] radeonsi: Dump LLVM IR before optimization passes
Reviewed-by: Edward O'Callaghan On 2016-02-04 13:28, Michel Dänzer wrote: From: Michel Dänzer Otherwise it's not possible to diagnose problems caused by optimization passes. Signed-off-by: Michel Dänzer --- src/gallium/drivers/radeonsi/si_shader.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 2192b21..d6c719f 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -4089,13 +4089,10 @@ int si_compile_llvm(struct si_screen *sscreen, int r = 0; unsigned count = p_atomic_inc_return(&sscreen->b.num_compilations); - if (r600_can_dump_shader(&sscreen->b, processor)) { + if (!(sscreen->b.debug_flags & DBG_NO_IR) && + r600_can_dump_shader(&sscreen->b, processor)) fprintf(stderr, "radeonsi: Compiling shader %d\n", count); - if (!(sscreen->b.debug_flags & DBG_NO_IR)) - LLVMDumpModule(mod); - } - if (!si_replace_shader(count, binary)) { r = radeon_llvm_compile(mod, binary, r600_get_llvm_processor_name(sscreen->b.family), tm, @@ -4177,6 +4174,11 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen, si_llvm_export_vs(bld_base, outputs, gsinfo->num_outputs); + /* Dump LLVM IR before any optimization passes */ + if (!(sscreen->b.debug_flags & DBG_NO_IR) && + r600_can_dump_shader(&sscreen->b, TGSI_PROCESSOR_GEOMETRY)) + LLVMDumpModule(bld_base->base.gallivm->module); + radeon_llvm_finalize_module(&si_shader_ctx->radeon_bld); if (dump) @@ -4383,9 +4385,15 @@ int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm, goto out; } + mod = bld_base->base.gallivm->module; + + /* Dump LLVM IR before any optimization passes */ + if (!(sscreen->b.debug_flags & DBG_NO_IR) && + r600_can_dump_shader(&sscreen->b, si_shader_ctx.type)) + LLVMDumpModule(mod); + radeon_llvm_finalize_module(&si_shader_ctx.radeon_bld); - mod = bld_base->base.gallivm->module; r = si_compile_llvm(sscreen, &shader->binary, 
&shader->config, tm, mod, debug, si_shader_ctx.type); if (r) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/4] radeonsi: experimental support for GPUPerfStudio
Can't see any serious issues here and a non-verified `working' instance of GPUPerfStudio is sure better than a crashing one! This series is, Reviewed-by: Edward O'Callaghan

On 2016-02-04 00:52, Nicolai Hähnle wrote:
Hi,

this bunch of patches meets GPUPerfStudio half-way in supporting the timing features on CI+ hardware. The latest version of GPUPerfStudio is required.

With these patches, GPUPerfStudio should recognize our driver as supported and offer its frame profiling features without crashing. It should also report reasonable numbers in the profile. However, I haven't fully validated the reported numbers, so while I'd like to get this merged now, it should still be considered as somewhat experimental.

Please review.

Thanks,
Nicolai
--
 .../drivers/radeon/r600_perfcounter.c   |  38 +++---
 src/gallium/drivers/radeon/r600_query.c |  80 ++-
 src/gallium/drivers/radeon/r600_query.h |  32 ++---
 .../drivers/radeonsi/si_perfcounter.c   | 121 +
 4 files changed, 201 insertions(+), 70 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] st/mesa: unify variants and delete functions for TCS, TES, GS
This series is, Reviewed-by: Edward O'Callaghan On 2016-01-31 02:50, Marek Olšák wrote: From: Marek Olšák no difference between those --- src/mesa/state_tracker/st_atom_shader.c | 6 +- src/mesa/state_tracker/st_cb_program.c | 18 ++- src/mesa/state_tracker/st_context.h | 6 +- src/mesa/state_tracker/st_program.c | 204 src/mesa/state_tracker/st_program.h | 88 +++--- 5 files changed, 108 insertions(+), 214 deletions(-) diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c index 0f9ea10..2d8a3c3 100644 --- a/src/mesa/state_tracker/st_atom_shader.c +++ b/src/mesa/state_tracker/st_atom_shader.c @@ -163,7 +163,7 @@ static void update_gp( struct st_context *st ) { struct st_geometry_program *stgp; - struct st_gp_variant_key key; + struct st_basic_variant_key key; if (!st->ctx->GeometryProgram._Current) { cso_set_geometry_shader_handle(st->cso_context, NULL); @@ -199,7 +199,7 @@ static void update_tcp( struct st_context *st ) { struct st_tessctrl_program *sttcp; - struct st_tcp_variant_key key; + struct st_basic_variant_key key; if (!st->ctx->TessCtrlProgram._Current) { cso_set_tessctrl_shader_handle(st->cso_context, NULL); @@ -235,7 +235,7 @@ static void update_tep( struct st_context *st ) { struct st_tesseval_program *sttep; - struct st_tep_variant_key key; + struct st_basic_variant_key key; if (!st->ctx->TessEvalProgram._Current) { cso_set_tesseval_shader_handle(st->cso_context, NULL); diff --git a/src/mesa/state_tracker/st_cb_program.c b/src/mesa/state_tracker/st_cb_program.c index 2c4eccf..6f9c53e 100644 --- a/src/mesa/state_tracker/st_cb_program.c +++ b/src/mesa/state_tracker/st_cb_program.c @@ -153,7 +153,8 @@ st_delete_program(struct gl_context *ctx, struct gl_program *prog) struct st_geometry_program *stgp = (struct st_geometry_program *) prog; - st_release_gp_variants(st, stgp); + st_release_basic_variants(st, stgp->Base.Base.Target, + &stgp->variants, &stgp->tgsi); if (stgp->glsl_to_tgsi) 
free_glsl_to_tgsi_visitor(stgp->glsl_to_tgsi); @@ -175,7 +176,8 @@ st_delete_program(struct gl_context *ctx, struct gl_program *prog) struct st_tessctrl_program *sttcp = (struct st_tessctrl_program *) prog; - st_release_tcp_variants(st, sttcp); + st_release_basic_variants(st, sttcp->Base.Base.Target, + &sttcp->variants, &sttcp->tgsi); if (sttcp->glsl_to_tgsi) free_glsl_to_tgsi_visitor(sttcp->glsl_to_tgsi); @@ -186,7 +188,8 @@ st_delete_program(struct gl_context *ctx, struct gl_program *prog) struct st_tesseval_program *sttep = (struct st_tesseval_program *) prog; - st_release_tep_variants(st, sttep); + st_release_basic_variants(st, sttep->Base.Base.Target, + &sttep->variants, &sttep->tgsi); if (sttep->glsl_to_tgsi) free_glsl_to_tgsi_visitor(sttep->glsl_to_tgsi); @@ -239,7 +242,8 @@ st_program_string_notify( struct gl_context *ctx, else if (target == GL_GEOMETRY_PROGRAM_NV) { struct st_geometry_program *stgp = (struct st_geometry_program *) prog; - st_release_gp_variants(st, stgp); + st_release_basic_variants(st, stgp->Base.Base.Target, +&stgp->variants, &stgp->tgsi); if (!st_translate_geometry_program(st, stgp)) return false; @@ -260,7 +264,8 @@ st_program_string_notify( struct gl_context *ctx, struct st_tessctrl_program *sttcp = (struct st_tessctrl_program *) prog; - st_release_tcp_variants(st, sttcp); + st_release_basic_variants(st, sttcp->Base.Base.Target, +&sttcp->variants, &sttcp->tgsi); if (!st_translate_tessctrl_program(st, sttcp)) return false; @@ -271,7 +276,8 @@ st_program_string_notify( struct gl_context *ctx, struct st_tesseval_program *sttep = (struct st_tesseval_program *) prog; - st_release_tep_variants(st, sttep); + st_release_basic_variants(st, sttep->Base.Base.Target, +&sttep->variants, &sttep->tgsi); if (!st_translate_tesseval_program(st, sttep)) return false; diff --git a/src/mesa/state_tracker/st_context.h b/src/mesa/state_tracker/st_context.h index 9db5f11..2883edf 100644 --- a/src/mesa/state_tracker/st_context.h +++ 
b/src/mesa/state_tracker/st_context.h @@ -166,9 +166,9 @@ struct st_context struct st_vp_variant *vp_variant; struct st_fp_variant *fp_variant; - struct st_gp_variant *gp_variant; - struct st_tcp_variant *tcp_variant; - struct st_tep_variant *tep_variant; + struct st_basic_variant *gp_variant; + struct st_basic_variant *tcp_variant; + struct st_basic_variant *tep_variant;
Re: [Mesa-dev] [PATCH 2/3] mesa: use geometric helper for computing min samples
Reviewed-by: Edward O'Callaghan On 2016-01-31 16:58, Ilia Mirkin wrote: In case we have a draw buffer without attachments, we should be looking at the default number of samples. Signed-off-by: Ilia Mirkin --- Still doesn't work properly on nvc0, but at least the right number of min samples gets passed along. src/mesa/program/program.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c index 0e78e6a..27867c4 100644 --- a/src/mesa/program/program.c +++ b/src/mesa/program/program.c @@ -31,6 +31,7 @@ #include "main/glheader.h" #include "main/context.h" +#include "main/framebuffer.h" #include "main/hash.h" #include "main/macros.h" #include "program.h" @@ -534,14 +535,14 @@ _mesa_get_min_invocations_per_fragment(struct gl_context *ctx, * forces per-sample shading" */ if (prog->IsSample && !ignore_sample_qualifier) - return MAX2(ctx->DrawBuffer->Visual.samples, 1); + return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1); if (prog->Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID | SYSTEM_BIT_SAMPLE_POS)) - return MAX2(ctx->DrawBuffer->Visual.samples, 1); + return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1); else if (ctx->Multisample.SampleShading) return MAX2(ceil(ctx->Multisample.MinSampleShadingValue * - ctx->DrawBuffer->Visual.samples), 1); + _mesa_geometric_samples(ctx->DrawBuffer)), 1); else return 1; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
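For readers following the thread, the selection logic after this patch can be sketched in isolation. This is a simplified model and not Mesa code: `struct fb_model`, `geometric_samples()` and `min_invocations_per_fragment()` are hypothetical stand-ins, the `per_sample` argument folds together the "sample qualifier" and "reads sample ID/position" cases from the patch, and the helper is assumed to fall back to the framebuffer's default sample count when nothing is attached, as the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the framebuffer state involved. */
struct fb_model {
    bool has_attachments;
    unsigned visual_samples;   /* sample count of the attached buffers */
    unsigned default_samples;  /* default used when nothing is attached */
};

static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* Assumed behaviour of the geometric helper: attachments win, otherwise
 * the framebuffer's default sample count is used. */
static unsigned geometric_samples(const struct fb_model *fb)
{
    return fb->has_attachments ? fb->visual_samples : fb->default_samples;
}

/* Ceiling for non-negative values; avoids a libm dependency. */
static unsigned ceil_nonneg(float v)
{
    unsigned n = (unsigned)v;
    return ((float)n < v) ? n + 1 : n;
}

/* Same three-way decision as the patched function, on the model above. */
static unsigned min_invocations_per_fragment(const struct fb_model *fb,
                                             bool per_sample,
                                             bool sample_shading,
                                             float min_sample_shading_value)
{
    unsigned samples = geometric_samples(fb);

    if (per_sample)
        return max_u(samples, 1);
    if (sample_shading)
        return max_u(ceil_nonneg(min_sample_shading_value * (float)samples), 1);
    return 1;
}
```

With an attachment-less framebuffer whose default is 4 samples, the per-sample path yields 4 in this model, whereas reading only the visual's sample count would clamp to 1.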
Re: [Mesa-dev] [PATCH 1/2] winsys/amdgpu: Handle RADEON_FLAG_NO_CPU_ACCESS
This series is, Reviewed-by: Edward O'Callaghan

Good job working out where this issue was.

On 2016-01-26 18:40, Michel Dänzer wrote:
From: Michel Dänzer

Failing to do this was resulting in the kernel driver unnecessarily leaving open the possibility of CPU access to tiled BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93862

(This change shouldn't be backported to stable branches, because released versions of xf86-video-amdgpu unnecessarily try to map the front buffer)

Signed-off-by: Michel Dänzer
---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 30a1aa8..1e997d9 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -292,6 +292,8 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct amdgpu_winsys *ws,
 		request.preferred_heap |= AMDGPU_GEM_DOMAIN_VRAM;
 		if (flags & RADEON_FLAG_CPU_ACCESS)
 			request.flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+		if (flags & RADEON_FLAG_NO_CPU_ACCESS)
+			request.flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
 	}
 	if (initial_domain & RADEON_DOMAIN_GTT) {
 		request.preferred_heap |= AMDGPU_GEM_DOMAIN_GTT;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
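The flag translation in this patch can be sketched on its own. The bit values and `SK_*` names below are placeholders made up for illustration (the real `RADEON_FLAG_*` and `AMDGPU_GEM_CREATE_*` definitions live in the winsys and kernel headers and may differ), and `gem_create_flags()` is a hypothetical helper; only the shape of the mapping is taken from the diff.

```c
#include <stdint.h>

/* Placeholder values for illustration only, not the real header values. */
enum {
    SK_DOMAIN_VRAM        = 1u << 0,
};
enum {
    SK_FLAG_CPU_ACCESS    = 1u << 0,
    SK_FLAG_NO_CPU_ACCESS = 1u << 1,
};
enum {
    SK_GEM_CPU_ACCESS_REQUIRED = 1u << 0,
    SK_GEM_NO_CPU_ACCESS       = 1u << 1,
};

/* Translate winsys-level flags into kernel allocation flags.  As in the
 * patch, both CPU-access hints are only forwarded for VRAM placements. */
static uint64_t gem_create_flags(unsigned initial_domain, unsigned flags)
{
    uint64_t out = 0;

    if (initial_domain & SK_DOMAIN_VRAM) {
        if (flags & SK_FLAG_CPU_ACCESS)
            out |= SK_GEM_CPU_ACCESS_REQUIRED;
        if (flags & SK_FLAG_NO_CPU_ACCESS)
            out |= SK_GEM_NO_CPU_ACCESS;
    }
    return out;
}
```

Before the patch, the second `if` was missing, so a "no CPU access" request from the driver never reached the kernel in this model.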
Re: [Mesa-dev] [PATCH v4 00/11] st/mesa: add shader buffer support
This whole series is now, Reviewed-by: Edward O'Callaghan On 2016-01-25 05:59, Ilia Mirkin wrote: I believe I've addressed the various feedback people had. There's the outstanding point of how to expose the atomic buffer bindings, but this is a larger issue and largely tangential to the actual code changed in this series. No one has commented on my glsl_to_tgsi bits, which I sort of expected. Unless I hear outcry to the contrary, I'm just going to push those unreviewed once everything else is good -- nobody knows that code particularly well, and I've run the dEQP tests, which leads me to believe it's at least mostly good. Ilia Mirkin (11): tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics glsl: always initialize image_* fields, copy them on interface init glsl: keep track of ssbo variable being accessed, add access params mesa: add PROGRAM_IMMEDIATE, PROGRAM_BUFFER st/mesa: add atomic counter support st/mesa: add support for SSBO binding and GLSL intrinsics st/mesa: use RESQ to find buffer size st/mesa: add support for memory barrier intrinsics st/mesa: add shader buffer barrier bit st/mesa: enable ARB_shader_storage_buffer_object when supported trace: add support for set_shader_buffers src/gallium/auxiliary/tgsi/tgsi_info.c| 2 +- src/gallium/docs/source/tgsi.rst | 17 ++ src/gallium/drivers/trace/tr_context.c| 40 +++ src/gallium/drivers/trace/tr_dump_state.c | 18 ++ src/gallium/drivers/trace/tr_dump_state.h | 2 + src/gallium/include/pipe/p_defines.h | 1 + src/gallium/include/pipe/p_shader_tokens.h| 7 +- src/glsl/builtin_variables.cpp| 5 + src/glsl/lower_buffer_access.cpp | 6 +- src/glsl/lower_buffer_access.h| 1 + src/glsl/lower_shared_reference.cpp | 6 +- src/glsl/lower_ubo_reference.cpp | 40 ++- src/glsl/nir/glsl_types.cpp | 5 + src/glsl/nir/glsl_types.h | 3 +- src/glsl/nir/shader_enums.h | 10 + src/mesa/Makefile.sources | 2 + src/mesa/main/mtypes.h| 2 + src/mesa/program/ir_to_mesa.cpp | 4 + src/mesa/state_tracker/st_atom.c | 10 + 
src/mesa/state_tracker/st_atom.h | 10 + src/mesa/state_tracker/st_atom_atomicbuf.c| 158 +++ src/mesa/state_tracker/st_atom_storagebuf.c | 194 + src/mesa/state_tracker/st_cb_bufferobjects.c | 4 + src/mesa/state_tracker/st_cb_texturebarrier.c | 4 + src/mesa/state_tracker/st_context.c | 2 + src/mesa/state_tracker/st_context.h | 2 + src/mesa/state_tracker/st_extensions.c| 30 ++ src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 392 -- 28 files changed, 949 insertions(+), 28 deletions(-) create mode 100644 src/mesa/state_tracker/st_atom_atomicbuf.c create mode 100644 src/mesa/state_tracker/st_atom_storagebuf.c ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: add DCC buffer for sampler views on new CS
Reviewed-by: Edward O'Callaghan On 2016-01-25 03:40, Nicolai Hähnle wrote: From: Nicolai Hähnle This fixes a VM fault and possible lockup in high memory pressure situations. Cc: "11.0 11.1" --- src/gallium/drivers/radeonsi/si_descriptors.c | 33 +++ 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index aad836d..6c79673 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -138,6 +138,22 @@ static void si_release_sampler_views(struct si_sampler_views *views) si_release_descriptors(&views->desc); } +static void si_sampler_view_add_buffers(struct si_context *sctx, + struct si_sampler_view *rview) +{ + if (rview->resource) { + radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, + rview->resource, RADEON_USAGE_READ, + r600_get_sampler_view_priority(rview->resource)); + } + + if (rview->dcc_buffer && rview->dcc_buffer != rview->resource) { + radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, + rview->dcc_buffer, RADEON_USAGE_READ, + RADEON_PRIO_DCC); + } +} + static void si_sampler_views_begin_new_cs(struct si_context *sctx, struct si_sampler_views *views) { @@ -149,12 +165,7 @@ static void si_sampler_views_begin_new_cs(struct si_context *sctx, struct si_sampler_view *rview = (struct si_sampler_view*)views->views[i]; - if (!rview->resource) - continue; - - radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, - rview->resource, RADEON_USAGE_READ, - r600_get_sampler_view_priority(rview->resource)); + si_sampler_view_add_buffers(sctx, rview); } if (!views->desc.buffer) @@ -176,15 +187,7 @@ static void si_set_sampler_view(struct si_context *sctx, unsigned shader, struct si_sampler_view *rview = (struct si_sampler_view*)view; - if (rview->resource) - radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, - rview->resource, RADEON_USAGE_READ, - r600_get_sampler_view_priority(rview->resource)); - - if (rview->dcc_buffer && 
rview->dcc_buffer != rview->resource) - radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, - rview->dcc_buffer, RADEON_USAGE_READ, - RADEON_PRIO_DCC); + si_sampler_view_add_buffers(sctx, rview); pipe_sampler_view_reference(&views->views[slot], view); memcpy(views->desc.list + slot*8, view_desc, 8*4); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
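The fix boils down to routing both call sites through one helper, so the new-CS path can no longer forget the DCC buffer. A minimal sketch of that idea follows, using a toy buffer list; `struct buffer_list`, `struct sampler_view_model` and the function names are hypothetical and stand in for the driver's real structures.

```c
#include <stddef.h>

#define MAX_BUFS 8

/* Toy stand-in for the per-command-stream buffer list. */
struct buffer_list {
    const void *bufs[MAX_BUFS];
    size_t count;
};

/* Toy stand-in for a sampler view: a texture plus an optional DCC buffer. */
struct sampler_view_model {
    const void *resource;
    const void *dcc_buffer;
};

static void list_add(struct buffer_list *l, const void *buf)
{
    if (l->count < MAX_BUFS)
        l->bufs[l->count++] = buf;
}

/* Shared helper: adds the resource and, if it is a distinct buffer,
 * the DCC buffer as well. */
static void sampler_view_add_buffers(struct buffer_list *l,
                                     const struct sampler_view_model *v)
{
    if (v->resource)
        list_add(l, v->resource);
    if (v->dcc_buffer && v->dcc_buffer != v->resource)
        list_add(l, v->dcc_buffer);
}

/* Rebuild the list for a fresh command stream using the same helper that
 * the bind path would use. */
static void begin_new_cs(struct buffer_list *l,
                         const struct sampler_view_model *views, size_t n)
{
    l->count = 0;
    for (size_t i = 0; i < n; i++)
        sampler_view_add_buffers(l, &views[i]);
}
```

In this model a view whose DCC buffer aliases its resource contributes one entry, and a view with a separate DCC buffer contributes two, on both the bind path and the new-CS path.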
Re: [Mesa-dev] [PATCH 1/4] radeonsi: allow using all CUs for tessellation and on-chip GS (v2)
This series is, Reviewed-by: Edward O'Callaghan On 2016-01-23 01:18, Marek Olšák wrote: From: Marek Olšák v2: After more discussion with hw teams, the kernel already contains the optimal settings allowing us to use all CUs. --- src/gallium/drivers/radeonsi/si_state.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index a3ddee8..67b2835 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -3701,9 +3701,9 @@ static void si_init_config(struct si_context *sctx) si_pm4_set_reg(pm4, R_028408_VGT_INDX_OFFSET, 0); if (sctx->b.chip_class >= CIK) { - si_pm4_set_reg(pm4, R_00B51C_SPI_SHADER_PGM_RSRC3_LS, S_00B51C_CU_EN(0xfffc)); + si_pm4_set_reg(pm4, R_00B51C_SPI_SHADER_PGM_RSRC3_LS, S_00B51C_CU_EN(0x)); si_pm4_set_reg(pm4, R_00B41C_SPI_SHADER_PGM_RSRC3_HS, 0); - si_pm4_set_reg(pm4, R_00B31C_SPI_SHADER_PGM_RSRC3_ES, S_00B31C_CU_EN(0xfffe)); + si_pm4_set_reg(pm4, R_00B31C_SPI_SHADER_PGM_RSRC3_ES, S_00B31C_CU_EN(0x)); si_pm4_set_reg(pm4, R_00B21C_SPI_SHADER_PGM_RSRC3_GS, S_00B21C_CU_EN(0x)); si_pm4_set_reg(pm4, R_00B118_SPI_SHADER_PGM_RSRC3_VS, S_00B118_CU_EN(0x)); si_pm4_set_reg(pm4, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(0)); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/7] radeonsi: geometry shader bug fix and cleanup
This series is, Reviewed-by: Edward O'Callaghan

On 2016-01-23 10:59, Nicolai Hähnle wrote:
Hi,

this series was prompted by a rendering bug reported for Dolphin. The bug is fixed in the first two patches, and the remainder is assorted cleanups that I noticed while working on the fix.

Please review.

Thanks,
Nicolai
--
 .../drivers/radeonsi/si_descriptors.c    |  8 +-
 src/gallium/drivers/radeonsi/si_shader.c | 14 ++--
 src/gallium/drivers/radeonsi/si_shader.h |  1 -
 .../drivers/radeonsi/si_state_shaders.c  | 78 +++---
 4 files changed, 62 insertions(+), 39 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 0/9] st/mesa: accelerate texture uploads from PBOs
To the best of my understanding, this series is now: Reviewed-by: Edward O'Callaghan

On 2016-01-22 06:37, Nicolai Hähnle wrote:
Hi everybody,

here's an updated version of the series. I decided to keep BUFFER_SAMPLER_VIEW_RGBA_ONLY as is, following Fredrik's point that it affects not only the sampler swizzle but also the texture format itself.

The major functionality changes are that we now try to fulfill larger alignments by adjusting the buf_offset appropriately (this is not needed for radeonsi, but I did some basic tests to make sure this works) and we don't use a geometry shader if the driver can handle layer writes in the VS.

Please review.

Thanks,
Nicolai
--
 src/gallium/docs/source/screen.rst        |   11 +
 .../drivers/freedreno/freedreno_screen.c  |    3 +
 src/gallium/drivers/i915/i915_screen.c    |    2 +
 src/gallium/drivers/ilo/ilo_screen.c      |    3 +
 src/gallium/drivers/llvmpipe/lp_screen.c  |    2 +
 .../drivers/nouveau/nv30/nv30_screen.c    |    2 +
 .../drivers/nouveau/nv50/nv50_screen.c    |    2 +
 .../drivers/nouveau/nvc0/nvc0_screen.c    |    2 +
 src/gallium/drivers/r300/r300_screen.c    |    2 +
 src/gallium/drivers/r600/r600_pipe.c      |    4 +
 src/gallium/drivers/radeon/r600_texture.c |   26 +-
 src/gallium/drivers/radeonsi/si_pipe.c    |    4 +
 src/gallium/drivers/softpipe/sp_screen.c  |    3 +
 src/gallium/drivers/svga/svga_screen.c    |    2 +
 src/gallium/drivers/vc4/vc4_screen.c      |    2 +
 src/gallium/drivers/virgl/virgl_screen.c  |    3 +
 src/gallium/include/pipe/p_defines.h      |    2 +
 src/mesa/state_tracker/st_cb_texture.c    | 1178 +++-
 src/mesa/state_tracker/st_cb_texture.h    |    5 +
 src/mesa/state_tracker/st_context.c       |    2 +
 src/mesa/state_tracker/st_context.h       |   13 +
 21 files changed, 1254 insertions(+), 19 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: Add option for SI scheduler
Reviewed-by: Edward O'Callaghan On 2016-01-22 04:35, Axel Davy wrote: Add a debug option to select the LLVM SI Machine Scheduler. R600_DEBUG=sisched Signed-off-by: Axel Davy --- The corresponding llvm patch is on llvm master, and should land soon for 3.8 branch src/gallium/drivers/radeon/r600_pipe_common.c | 1 + src/gallium/drivers/radeon/r600_pipe_common.h | 1 + src/gallium/drivers/radeonsi/si_pipe.c| 6 +- 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index e926f56..a9ce7b1 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -389,6 +389,7 @@ static const struct debug_named_value common_debug_options[] = { { "nodcc", DBG_NO_DCC, "Disable DCC." }, { "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." }, { "norbplus", DBG_NO_RB_PLUS, "Disable RB+ on Stoney." }, + { "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction Scheduler." 
}, DEBUG_NAMED_VALUE_END /* must be last */ }; diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index 27f6e98..3020421 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -87,6 +87,7 @@ #define DBG_NO_DCC (1llu << 43) #define DBG_NO_DCC_CLEAR (1llu << 44) #define DBG_NO_RB_PLUS (1llu << 45) +#define DBG_SI_SCHED (1llu << 46) #define R600_MAP_BUFFER_ALIGNMENT 64 diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c index f6ff4a8..51bcba7 100644 --- a/src/gallium/drivers/radeonsi/si_pipe.c +++ b/src/gallium/drivers/radeonsi/si_pipe.c @@ -215,7 +215,11 @@ static struct pipe_context *si_create_context(struct pipe_screen *screen, r600_target = radeon_llvm_get_r600_target(triple); sctx->tm = LLVMCreateTargetMachine(r600_target, triple, r600_get_llvm_processor_name(sscreen->b.family), - "+DumpCode,+vgpr-spilling", +#if HAVE_LLVM >= 0x0308 + sscreen->b.debug_flags & DBG_SI_SCHED ? + "+DumpCode,+vgpr-spilling,+si-scheduler" : +#endif + "+DumpCode,+vgpr-spilling", LLVMCodeGenLevelDefault, LLVMRelocDefault, LLVMCodeModelDefault); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
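The feature-string selection in the last hunk can be sketched without LLVM. The `DBG_SI_SCHED` bit and both strings are copied from the patch; `llvm_feature_string()` is a hypothetical helper, and the compile-time `HAVE_LLVM >= 0x0308` check is modelled here as a runtime boolean purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBG_SI_SCHED (1llu << 46)   /* bit value taken from the patch */

static const char FEATURES_DEFAULT[] = "+DumpCode,+vgpr-spilling";
static const char FEATURES_SISCHED[] = "+DumpCode,+vgpr-spilling,+si-scheduler";

/* Pick the LLVM target feature string.  The scheduler feature is only
 * requested when the debug flag is set and the LLVM build knows about it
 * (a preprocessor check in the patch, a plain parameter in this sketch). */
static const char *llvm_feature_string(uint64_t debug_flags,
                                       bool llvm_has_si_scheduler)
{
    if (llvm_has_si_scheduler && (debug_flags & DBG_SI_SCHED))
        return FEATURES_SISCHED;
    return FEATURES_DEFAULT;
}
```

Per the commit message, the flag itself is set from the environment with `R600_DEBUG=sisched`; on an LLVM without the scheduler the option is silently ignored in this model.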
Re: [Mesa-dev] [PATCH 1/2] radeonsi: add max waves / CU to shader stats
This series is, Reviewed-by: Edward O'Callaghan On 2016-01-20 12:39, Marek Olšák wrote: From: Marek Olšák --- src/gallium/drivers/radeonsi/si_shader.c | 33 +--- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 0c5fd32..5c536f8 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -3994,12 +3994,39 @@ static void si_shader_dump_stats(struct si_screen *sscreen, struct pipe_debug_callback *debug, unsigned processor) { + /* Compute the maximum number of waves. +* The pixel shader additionally allocates 1 - 48 blocks of LDS +* depending on non-compile times parameters. +*/ + unsigned ps_lds_size = processor == TGSI_PROCESSOR_FRAGMENT ? 1 : 0; + unsigned lds_size = ps_lds_size + conf->lds_size; + unsigned max_waves = 10; + + if (conf->num_sgprs) { + if (sscreen->b.chip_class >= VI) + max_waves = MIN2(max_waves, 800 / conf->num_sgprs); + else + max_waves = MIN2(max_waves, 512 / conf->num_sgprs); + } + + if (conf->num_vgprs) + max_waves = MIN2(max_waves, 256 / conf->num_vgprs); + + if (lds_size) + max_waves = MIN2(max_waves, 128 / lds_size); + if (r600_can_dump_shader(&sscreen->b, processor)) { fprintf(stderr, "*** SHADER STATS ***\n" - "SGPRS: %d\nVGPRS: %d\nCode Size: %d bytes\nLDS: %d blocks\n" - "Scratch: %d bytes per wave\n\n", + "SGPRS: %d\n" + "VGPRS: %d\n" + "Code Size: %d bytes\n" + "LDS: %d blocks\n" + "Scratch: %d bytes per wave\n" + "Max waves / CU: %d\n" + "\n", conf->num_sgprs, conf->num_vgprs, code_size, - conf->lds_size, conf->scratch_bytes_per_wave); + conf->lds_size, conf->scratch_bytes_per_wave, + max_waves); } pipe_debug_message(debug, SHADER_INFO, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
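The occupancy arithmetic added here is easy to try out standalone. The sketch below copies the constants from the patch (10 waves maximum, 512 or 800 SGPRs, 256 VGPRs, 128 LDS blocks); `struct shader_conf` and `max_waves_per_cu()` are hypothetical stand-ins for the driver's config struct and the inline computation, and the two booleans replace the processor-type and chip-class checks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the config fields the patch reads. */
struct shader_conf {
    unsigned num_sgprs;
    unsigned num_vgprs;
    unsigned lds_size;   /* in allocation blocks */
};

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Each resource limits how many waves fit; the result is the tightest
 * of those limits, capped at 10. */
static unsigned max_waves_per_cu(const struct shader_conf *conf,
                                 bool is_fragment, bool is_vi_or_later)
{
    /* Pixel shaders allocate at least one extra LDS block. */
    unsigned lds_size = conf->lds_size + (is_fragment ? 1 : 0);
    unsigned max_waves = 10;

    if (conf->num_sgprs)
        max_waves = min_u(max_waves,
                          (is_vi_or_later ? 800u : 512u) / conf->num_sgprs);
    if (conf->num_vgprs)
        max_waves = min_u(max_waves, 256 / conf->num_vgprs);
    if (lds_size)
        max_waves = min_u(max_waves, 128 / lds_size);

    return max_waves;
}
```

For example, a non-fragment shader with 48 SGPRs and 64 VGPRs is VGPR-bound in this model: 256 / 64 gives 4 waves, while the SGPR limit alone would still allow 10.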
Re: [Mesa-dev] [PATCH 06/10] st/mesa: use RESQ to find buffer size
Reviewed-by: Edward O'Callaghan On 2016-01-18 16:51, Ilia Mirkin wrote: --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 0aaa175..602e689 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -569,6 +569,7 @@ static bool is_resource_instruction(unsigned opcode) { switch (opcode) { + case TGSI_OPCODE_RESQ: case TGSI_OPCODE_LOAD: case TGSI_OPCODE_ATOMUADD: case TGSI_OPCODE_ATOMXCHG: @@ -,6 +2223,22 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) emit_asm(ir, TGSI_OPCODE_UP2H, result_dst, op[0]); break; + case ir_unop_get_buffer_size: { + ir_constant *const_offset = ir->operands[0]->as_constant(); + st_src_reg buffer( +PROGRAM_BUFFER, +ctx->Const.Program[shader->Stage].MaxAtomicBuffers + +(const_offset ? const_offset->value.u[0] : 0), +GLSL_TYPE_UINT); + if (!const_offset) { + buffer.reladdr = ralloc(mem_ctx, st_src_reg); + memcpy(buffer.reladdr, &sampler_reladdr, sizeof(sampler_reladdr)); + emit_arl(ir, sampler_reladdr, op[0]); + } + emit_asm(ir, TGSI_OPCODE_RESQ, result_dst)->buffer = buffer; + break; + } + case ir_unop_pack_snorm_2x16: case ir_unop_pack_unorm_2x16: case ir_unop_pack_snorm_4x8: @@ -2245,10 +2262,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) */ assert(!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()"); break; - - case ir_unop_get_buffer_size: - assert(!"Not implemented yet"); - break; } this->result = result_src; @@ -5133,6 +5146,7 @@ compile_tgsi_instruction(struct st_translate *t, src, num_src); return; + case TGSI_OPCODE_RESQ: case TGSI_OPCODE_LOAD: case TGSI_OPCODE_ATOMUADD: case TGSI_OPCODE_ATOMXCHG: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] tgsi: initialize Atomic field in tgsi_default_declaration
Reviewed-by: Edward O'Callaghan

On 2016-01-17 19:46, Ilia Mirkin wrote:
Spotted by Coverity.

Signed-off-by: Ilia Mirkin
---
 src/gallium/auxiliary/tgsi/tgsi_build.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index ea20746..83f5062 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -110,6 +110,7 @@ tgsi_default_declaration( void )
    declaration.Invariant = 0;
    declaration.Local = 0;
    declaration.Array = 0;
+   declaration.Atomic = 0;
    declaration.Padding = 0;
 
    return declaration;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: Print "LLVM emitted unknown config register" warning only once
This series is, Reviewed-by: Edward O'Callaghan

On 2016-01-15 14:23, Michel Dänzer wrote:
From: Michel Dänzer

Say "LLVM" instead of "Compiler" for clarity.

Signed-off-by: Michel Dänzer
---
 src/gallium/drivers/radeonsi/si_shader.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index cc9718e..3ab054c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3735,8 +3735,15 @@ void si_shader_binary_read_config(struct radeon_shader_binary *binary,
 				G_00B860_WAVESIZE(value) * 256 * 4 * 1;
 			break;
 		default:
-			fprintf(stderr, "Warning: Compiler emitted unknown "
-				"config register: 0x%x\n", reg);
+			{
+				static bool printed;
+
+				if (!printed) {
+					fprintf(stderr, "Warning: LLVM emitted unknown "
+						"config register: 0x%x\n", reg);
+					printed = true;
+				}
+			}
 			break;
 		}
 	}
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
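The warn-once idiom used in this patch, a function-local `static bool`, can be shown in a few lines. `warn_unknown_reg_once()` is a hypothetical wrapper written for this sketch; the message text is the one from the patch. Note that a plain static flag like this is not thread-safe, which the sketch does not try to address.

```c
#include <stdbool.h>
#include <stdio.h>

/* Print the warning for the first unknown register only.  Returns true if
 * the message was actually printed, false if it was suppressed. */
static bool warn_unknown_reg_once(unsigned reg)
{
    static bool printed;   /* zero-initialized, persists across calls */

    if (printed)
        return false;

    fprintf(stderr, "Warning: LLVM emitted unknown "
            "config register: 0x%x\n", reg);
    printed = true;
    return true;
}
```

One consequence of this design is that only the first unknown register value is ever reported; later, different registers are suppressed as well.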
Re: [Mesa-dev] [PATCH 1/7] glsl: enable offset layout qualifier for ARB_enhanced_layouts
This series is, Reviewed-by: Edward O'Callaghan

On 2016-01-11 14:13, Timothy Arceri wrote:
---
 src/glsl/glsl_parser.yy | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 6b634f2..b2b94f4 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1505,7 +1505,8 @@ layout_qualifier_id:
          $$.binding = $3;
       }
 
-      if (state->has_atomic_counters() &&
+      if ((state->has_atomic_counters() ||
+           state->has_enhanced_layouts()) &&
           match_layout_qualifier("offset", $1, state) == 0) {
          $$.flags.q.explicit_offset = 1;
          $$.offset = $3;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] st/mesa: remove dead code from mesa_to_tgsi
This series is: Reviewed-by: Edward O'Callaghan On 2016-01-08 12:12, Marek Olšák wrote: From: Marek Olšák These aren't part of ARB_fragment_program. --- src/mesa/state_tracker/st_mesa_to_tgsi.c | 51 1 file changed, 51 deletions(-) diff --git a/src/mesa/state_tracker/st_mesa_to_tgsi.c b/src/mesa/state_tracker/st_mesa_to_tgsi.c index 4b9dc99..d8f7b6c 100644 --- a/src/mesa/state_tracker/st_mesa_to_tgsi.c +++ b/src/mesa/state_tracker/st_mesa_to_tgsi.c @@ -475,24 +475,6 @@ static void emit_swz( struct st_translate *t, } -/** - * Negate the value of DDY to match GL semantics where (0,0) is the - * lower-left corner of the window. - * Note that the GL_ARB_fragment_coord_conventions extension will - * effect this someday. - */ -static void emit_ddy( struct st_translate *t, - struct ureg_dst dst, - const struct prog_src_register *SrcReg ) -{ - struct ureg_program *ureg = t->ureg; - struct ureg_src src = translate_src( t, SrcReg ); - src = ureg_negate( src ); - ureg_DDY( ureg, dst, src ); -} - - - static unsigned translate_opcode( unsigned op ) { @@ -714,10 +696,6 @@ compile_instruction( */ ureg_MOV( ureg, dst[0], ureg_imm1f(ureg, 0.5) ); break; - - case OPCODE_DDY: - emit_ddy( t, dst[0], &inst->SrcReg[0] ); - break; case OPCODE_RSQ: ureg_RSQ( ureg, dst[0], ureg_abs(src[0]) ); @@ -926,31 +904,6 @@ emit_wpos(struct st_context *st, /** - * OpenGL's fragment gl_FrontFace input is 1 for front-facing, 0 for back. - * TGSI uses +1 for front, -1 for back. - * This function converts the TGSI value to the GL value. Simply clamping/ - * saturating the value to [0,1] does the job. 
- */ -static void -emit_face_var( struct st_translate *t, - const struct gl_program *program ) -{ - struct ureg_program *ureg = t->ureg; - struct ureg_dst face_temp = ureg_DECL_temporary( ureg ); - struct ureg_src face_input = t->inputs[t->inputMapping[VARYING_SLOT_FACE]]; - - /* MOV_SAT face_temp, input[face] -*/ - face_temp = ureg_saturate( face_temp ); - ureg_MOV( ureg, face_temp, face_input ); - - /* Use face_temp as face input from here on: -*/ - t->inputs[t->inputMapping[VARYING_SLOT_FACE]] = ureg_src(face_temp); -} - - -/** * Translate Mesa program to TGSI format. * \param program the program to translate * \param numInputs number of input registers used @@ -1020,10 +973,6 @@ st_translate_mesa_program( emit_wpos(st_context(ctx), t, program, ureg); } - if (program->InputsRead & VARYING_BIT_FACE) { - emit_face_var( t, program ); - } - /* * Declare output attributes. */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: simplify gl_FragCoord behavior
This series is: Reviewed-by: Edward O'Callaghan On 2016-01-08 12:30, Marek Olšák wrote: From: Marek Olšák It will become a system value, not an input. --- src/gallium/drivers/radeonsi/si_state_shaders.c | 45 - 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c index 64adf69..460dda5 100644 --- a/src/gallium/drivers/radeonsi/si_state_shaders.c +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c @@ -399,30 +399,29 @@ static void si_shader_ps(struct si_shader *shader) if (!pm4) return; - for (i = 0; i < info->num_inputs; i++) { - switch (info->input_semantic_name[i]) { - case TGSI_SEMANTIC_POSITION: - /* SPI_BARYC_CNTL.POS_FLOAT_LOCATION -* Possible vaules: -* 0 -> Position = pixel center (default) -* 1 -> Position = pixel centroid -* 2 -> Position = at sample position -*/ - switch (info->input_interpolate_loc[i]) { - case TGSI_INTERPOLATE_LOC_CENTROID: - spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(1); - break; - case TGSI_INTERPOLATE_LOC_SAMPLE: - spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2); - break; - } + /* SPI_BARYC_CNTL.POS_FLOAT_LOCATION +* Possible vaules: +* 0 -> Position = pixel center +* 1 -> Position = pixel centroid +* 2 -> Position = at sample position +* +* From GLSL 4.5 specification, section 7.1: + * "The variable gl_FragCoord is available as an input variable from + *within fragment shaders and it holds the window relative coordinates +*(x, y, z, 1/w) values for the fragment. If multi-sampling, this +*value can be for any location within the pixel, or one of the +*fragment samples. The use of centroid does not further restrict +*this value to be inside the current primitive." +* + * Meaning that centroid has no effect and we can return anything within + * the pixel. Thus, return the value at sample position, because that's +* the most accurate one shaders can get. 
+*/ + spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2); - if (info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER] == - TGSI_FS_COORD_PIXEL_CENTER_INTEGER) - spi_baryc_cntl |= S_0286E0_POS_FLOAT_ULC(1); - break; - } - } + if (info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER] == + TGSI_FS_COORD_PIXEL_CENTER_INTEGER) + spi_baryc_cntl |= S_0286E0_POS_FLOAT_ULC(1); /* Find out what SPI_SHADER_COL_FORMAT and CB_SHADER_MASK should be. */ colors_written = info->colors_written; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
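Read in isolation, the net effect of the patch is that the position location no longer depends on the interpolation qualifier. The following is only a minimal sketch of that control flow, not the driver code: the `SKETCH_*` macros, their bit positions, and the enum are placeholders invented for illustration, since the real `S_0286E0_*` encoders are defined in the driver's register headers and are not shown in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder field encoders. The bit positions are assumptions made for
 * this sketch; the real S_0286E0_* macros are not shown in the patch. */
#define SKETCH_POS_FLOAT_LOCATION(x) (((uint32_t)(x) & 0x3u) << 0)
#define SKETCH_POS_FLOAT_ULC(x)      (((uint32_t)(x) & 0x1u) << 4)

/* Hypothetical stand-in for TGSI_PROPERTY_FS_COORD_PIXEL_CENTER values. */
enum sketch_pixel_center {
    SKETCH_PIXEL_CENTER_HALF_INTEGER,
    SKETCH_PIXEL_CENTER_INTEGER
};

/* After the patch there is no loop over inputs: the location field is
 * always 2 ("at sample position"), and only the pixel-center property
 * still changes the result. */
static uint32_t sketch_spi_baryc_cntl(enum sketch_pixel_center center)
{
    uint32_t cntl = 0;

    cntl |= SKETCH_POS_FLOAT_LOCATION(2);

    if (center == SKETCH_PIXEL_CENTER_INTEGER)
        cntl |= SKETCH_POS_FLOAT_ULC(1);

    return cntl;
}
```

With these placeholder encodings the half-integer case yields only the location field, and the integer case additionally sets the upper-left-corner bit.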
Re: [Mesa-dev] [PATCH 0/8] gl_FragCoord and gl_FrontFacing as system values
This series is: Reviewed-by: Edward O'Callaghan On 2016-01-08 12:29, Marek Olšák wrote: Hi, This series adds the possibility for drivers to get gl_FragCoord and gl_FrontFacing as system values. When FACE is a system value, it also changes its type to integer from floating-point. Each variable has its own Const flag / Gallium CAP, so drivers can choose whether they want this for each variable. This simplifies input handling in the radeonsi driver. With this, TGSI INPUT[i] becomes fragment shader input[i] in the hardware, so the driver doesn't have to do any translation of locations. Please review. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: combine if blocks
This series is: Reviewed-by: Edward O'Callaghan On 2016-01-08 15:25, Timothy Arceri wrote: --- src/glsl/link_uniforms.cpp | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index 47bb771..33b2d4c 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -532,6 +532,8 @@ public: */ if (var->is_interface_instance()) { ubo_byte_offset = 0; +process(var->get_interface_type(), +var->get_interface_type()->name); } else { const struct gl_uniform_block *const block = &prog->BufferInterfaceBlocks[ubo_block_index]; @@ -542,13 +544,8 @@ public: &block->Uniforms[var->data.location]; ubo_byte_offset = ubo_var->Offset; - } - - if (var->is_interface_instance()) -process(var->get_interface_type(), -var->get_interface_type()->name); - else process(var); + } } else { /* Store any explicit location and reset data location so we can * reuse this variable for storing the uniform slot number. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
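The refactor in this diff folds two tests of the same condition into one. A minimal sketch of that pattern, assuming made-up types (`sketch_var`, `sketch_result`) in place of the linker's real classes, shows that both shapes produce the same result:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the uniform-linking state; none of these
 * names exist in Mesa. */
struct sketch_var {
    bool is_interface_instance;
    int  block_offset;
};

enum sketch_path { SKETCH_PROCESS_INTERFACE, SKETCH_PROCESS_VAR };

struct sketch_result {
    int ubo_byte_offset;
    enum sketch_path path;
};

/* Before: the same condition is tested twice in a row. */
static struct sketch_result sketch_visit_before(const struct sketch_var *var)
{
    struct sketch_result r;

    if (var->is_interface_instance)
        r.ubo_byte_offset = 0;
    else
        r.ubo_byte_offset = var->block_offset;

    if (var->is_interface_instance)
        r.path = SKETCH_PROCESS_INTERFACE;
    else
        r.path = SKETCH_PROCESS_VAR;

    return r;
}

/* After: each branch does all of its work in one place. */
static struct sketch_result sketch_visit_after(const struct sketch_var *var)
{
    struct sketch_result r;

    if (var->is_interface_instance) {
        r.ubo_byte_offset = 0;
        r.path = SKETCH_PROCESS_INTERFACE;
    } else {
        r.ubo_byte_offset = var->block_offset;
        r.path = SKETCH_PROCESS_VAR;
    }

    return r;
}
```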
Re: [Mesa-dev] [PATCH 1/3] tgsi/scan: set if a fragment shader writes sample mask
This series is, Reviewed-by: Edward O'Callaghan On 2016-01-06 12:46, Marek Olšák wrote: From: Marek Olšák This will be used by radeonsi. --- src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++ src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 + 2 files changed, 3 insertions(+) diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c index e04f407..e3feed9 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c @@ -392,6 +392,8 @@ tgsi_scan_shader(const struct tgsi_token *tokens, } else if (semName == TGSI_SEMANTIC_STENCIL) { info->writes_stencil = TRUE; + } else if (semName == TGSI_SEMANTIC_SAMPLEMASK) { +info->writes_samplemask = TRUE; } } diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h b/src/gallium/auxiliary/tgsi/tgsi_scan.h index 7e9a559..a3e4378 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h @@ -82,6 +82,7 @@ struct tgsi_shader_info boolean reads_z; /**< does fragment shader read depth? */ boolean writes_z; /**< does fragment shader write Z value? */ boolean writes_stencil; /**< does fragment shader write stencil value? */ + boolean writes_samplemask; /**< does fragment shader write sample mask? */ boolean writes_edgeflag; /**< vertex shader outputs edgeflag */ boolean uses_kill; /**< KILL or KILL_IF instruction used? */ boolean uses_persp_center; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
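The scan step extended here amounts to mapping an output's semantic name to a flag in the info struct. A rough sketch follows, assuming simplified stand-in types: the enum, struct, and function names below are invented for illustration and are not the `tgsi_scan` API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for TGSI semantic names. */
enum sketch_semantic {
    SKETCH_SEM_COLOR,
    SKETCH_SEM_STENCIL,
    SKETCH_SEM_SAMPLEMASK
};

/* Reduced version of the per-shader info struct, keeping only the two
 * flags visible in the diff context. */
struct sketch_shader_info {
    bool writes_stencil;
    bool writes_samplemask;
};

/* Walk the fragment shader's output semantics and record which special
 * outputs are written. */
static void sketch_scan_outputs(const enum sketch_semantic *sems, size_t count,
                                struct sketch_shader_info *info)
{
    for (size_t i = 0; i < count; i++) {
        if (sems[i] == SKETCH_SEM_STENCIL)
            info->writes_stencil = true;
        else if (sems[i] == SKETCH_SEM_SAMPLEMASK)
            info->writes_samplemask = true;
    }
}
```

A driver can then test a single flag instead of re-scanning the output declarations itself.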
Re: [Mesa-dev] [PATCH 1/6] gallium: Remove unnecessary semicolons
On 2016-01-06 10:30, Brian Paul wrote: Series looks OK to me. Reviewed-by: Brian Paul Do you need someone to commit/push for you? I do, yes, thank you kindly. Edward. -Brian On 01/05/2016 03:07 AM, Edward O'Callaghan wrote: Fix a silly issue where MSVC case fall-through support needs an extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan Reviewed-by: Brian Paul --- src/gallium/auxiliary/draw/draw_pipe_aaline.c | 2 +- src/gallium/auxiliary/gallivm/lp_bld_swizzle.c | 2 +- src/gallium/auxiliary/nir/tgsi_to_nir.c| 2 +- src/gallium/auxiliary/util/u_surface.c | 3 ++- src/gallium/auxiliary/vl/vl_mpeg12_decoder.c | 2 +- src/gallium/state_trackers/nine/swapchain9.c | 2 +- src/gallium/state_trackers/omx/entrypoint.c| 2 +- src/gallium/state_trackers/vdpau/mixer.c | 2 +- 8 files changed, 9 insertions(+), 8 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c b/src/gallium/auxiliary/draw/draw_pipe_aaline.c index 3ce550a..e85ae16 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c +++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c @@ -938,7 +938,7 @@ draw_aaline_prepare_outputs(struct draw_context *draw, const struct pipe_rasterizer_state *rast = draw->rasterizer; /* update vertex attrib info */ - aaline->pos_slot = draw_current_shader_position_output(draw);; + aaline->pos_slot = draw_current_shader_position_output(draw); if (!rast->line_smooth) return; diff --git a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c index b1aef71..f571838 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c @@ -720,7 +720,7 @@ lp_build_transpose_aos_n(struct gallivm_state *gallivm, default: assert(0); - }; + } } diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index 94d992b..7c57759 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -1950,7 +1950,7 @@ 
tgsi_processor_to_shader_stage(unsigned processor) case TGSI_PROCESSOR_COMPUTE: return MESA_SHADER_COMPUTE; default: unreachable("invalid TGSI processor"); - }; + } } struct nir_shader * diff --git a/src/gallium/auxiliary/util/u_surface.c b/src/gallium/auxiliary/util/u_surface.c index 6aa44f9..c150d92 100644 --- a/src/gallium/auxiliary/util/u_surface.c +++ b/src/gallium/auxiliary/util/u_surface.c @@ -600,7 +600,8 @@ is_box_inside_resource(const struct pipe_resource *res, depth = res->array_size; assert(res->array_size % 6 == 0); break; - case PIPE_MAX_TEXTURE_TYPES:; + case PIPE_MAX_TEXTURE_TYPES: + break; } return box->x >= 0 && diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c index f5bb3a0..b5c7045 100644 --- a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c +++ b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c @@ -792,7 +792,7 @@ vl_mpeg12_end_frame(struct pipe_video_codec *decoder, for (j = 0; j < VL_MAX_REF_FRAMES; ++j) { if (!ref_frames[j] || !ref_frames[j][i]) continue; - vb[2] = vl_vb_get_mv(&buf->vertex_stream, j);; + vb[2] = vl_vb_get_mv(&buf->vertex_stream, j); dec->context->set_vertex_buffers(dec->context, 0, 3, vb); vl_mc_render_ref(i ? 
&dec->mc_c : &dec->mc_y, &buf->mc[i], ref_frames[j][i]); diff --git a/src/gallium/state_trackers/nine/swapchain9.c b/src/gallium/state_trackers/nine/swapchain9.c index 3f5be26..3b1a7a4 100644 --- a/src/gallium/state_trackers/nine/swapchain9.c +++ b/src/gallium/state_trackers/nine/swapchain9.c @@ -790,7 +790,7 @@ NineSwapChain9_Present( struct NineSwapChain9 *This, case D3DSWAPEFFECT_FLIP: UNTESTED(4); case D3DSWAPEFFECT_DISCARD: -/* rotate the queue */; +/* rotate the queue */ pipe_resource_reference(&res, This->buffers[0]->base.resource); for (i = 1; i <= This->params.BackBufferCount; i++) { NineSurface9_SetResourceResize(This->buffers[i - 1], diff --git a/src/gallium/state_trackers/omx/entrypoint.c b/src/gallium/state_trackers/omx/entrypoint.c index da9ca10..afcbd97 100644 --- a/src/gallium/state_trackers/omx/entrypoint.c +++ b/src/gallium/state_trackers/omx/entrypoint.c @@ -137,7 +137,7 @@ OMX_ERRORTYPE omx_workaround_Destructor(OMX_COMPONENTTYPE *comp) priv->state = OMX_StateInvalid; tsem_up(priv->messageSem); - /* wait for thread to exit */; + /* wait for thread to exit */ pthread_join(priv->messageHandlerThread, NULL); return omx_base_component_Destructor(comp); diff --git a/src/gallium/state_trackers/vdpau/mixer.c b/src/gallium/state_trackers/vdpau/mixer.c index c0b1ecc..dec79ff 100644 --- a/src/gallium/state_trackers/vdpau/mixer.c +++ b/sr
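The `u_surface.c` hunk is the one change in this patch that is not just a doubled semicolon: a trailing `case X:;` becomes `case X:` followed by `break;`. In C standards before C23, a label at the end of a switch needs a statement after it, which is why the bare `;` was there. Below is a small sketch of the cleaned-up shape, using invented enum and function names rather than the real `pipe_texture_target` values:

```c
#include <assert.h>

/* Hypothetical stand-in for a texture-target enum with a trailing
 * "count" member that must still be handled in the switch. */
enum sketch_tex {
    SKETCH_TEX_2D,
    SKETCH_TEX_CUBE_ARRAY,
    SKETCH_TEX_MAX
};

static int sketch_depth(enum sketch_tex target, int array_size)
{
    int depth = 1;

    switch (target) {
    case SKETCH_TEX_2D:
        break;
    case SKETCH_TEX_CUBE_ARRAY:
        depth = array_size;
        break;
    case SKETCH_TEX_MAX:
        /* An explicit break replaces "case SKETCH_TEX_MAX:;". */
        break;
    }

    return depth;
}
```

Listing every enumerator without a `default:` keeps compiler warnings about unhandled enum values useful, and the explicit `break;` states the intent more plainly than an empty statement.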
Re: [Mesa-dev] [PATCH] nir: few missing struct names
Reviewed-by: Edward O'Callaghan On 2016-01-05 05:27, Rob Clark wrote: From: Rob Clark nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs 'typedef struct nir_foo {} nir_foo'. But missing struct name tags is inconvenient when you need a fwd declaration without pulling in all of nir. So add missing struct name tag for nir_variable, and a couple other spots where it would likely be useful. Signed-off-by: Rob Clark --- src/glsl/nir/nir.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 4286738..bedcc0d 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -139,7 +139,7 @@ typedef enum { * ir_variable - it should be easy to translate between the two. */ -typedef struct { +typedef struct nir_variable { struct exec_node node; /** @@ -349,7 +349,7 @@ typedef struct { #define nir_foreach_variable(var, var_list) \ foreach_list_typed(nir_variable, var, node, var_list) -typedef struct { +typedef struct nir_register { struct exec_node node; unsigned num_components; /** < number of vector components */ @@ -443,7 +443,7 @@ nir_instr_is_last(nir_instr *instr) return exec_node_is_tail_sentinel(exec_node_get_next(&instr->node)); } -typedef struct { +typedef struct nir_ssa_def { /** for debugging only, can be NULL */ const char* name; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
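The motivation given in the commit message can be shown with a few lines of C: an anonymous `typedef struct { ... } foo` cannot be forward-declared, while a tagged one can. This is a minimal sketch with invented names (`sketch_variable` and its field), not the real `nir_variable` layout.

```c
#include <assert.h>

/* What a lightweight header can do once the struct has a tag: declare
 * the type and use pointers to it without including the full definition. */
struct sketch_variable;
unsigned sketch_variable_components(const struct sketch_variable *var);

/* The full definition, as it would appear in the "heavy" header. The tag
 * and the typedef name live in different namespaces, so both may be
 * spelled the same. */
typedef struct sketch_variable {
    unsigned num_components;
} sketch_variable;

unsigned sketch_variable_components(const struct sketch_variable *var)
{
    return var->num_components;
}
```

With `typedef struct { ... } sketch_variable;` the first two declarations would have nothing to refer to, so every user would have to include the full header.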
Re: [Mesa-dev] [PATCH 0/8] gallium: add shader buffer support
In this series patches 2-8 are: Reviewed-by: Edward O'Callaghan with some commentary on patch 1. Kind Regards, On 2016-01-03 15:37, Ilia Mirkin wrote: This provides enough support in TGSI to support shader buffers. I do away with the defunct TGSI_FILE_RESOURCE (renaming it into TGSI_FILE_IMAGE to work with pipe_image_view), and add a brand new TGSI_FILE_BUFFER. At the declaration level, this can have an ATOMIC qualifier (and later a SHARED qualifier for compute shaders). I also add memory qualifiers to LOAD/STORE opcodes, which can convey the coherent/volatile/restrict flags as specified in the GLSL. I also modified all of the formerly resource opcodes to work on both buffers and images. For images they will derive the format from the IMAGE declaration, while buffers are format-less by definition. This is still missing a way to implement memory barriers, that will come soon, and is not going to affect anything else I do in this series. For the full series I'm working on, you can look at https://github.com/imirkin/mesa/commits/atomic3 which exposes ARB_shader_atomic_counters and ARB_shader_storage_buffer_objects on nvc0+ (but it won't work on maxwell -- need to add emission of atomic ops and cache control). However this is a nice self-contained chunk to start with. 
Ilia Mirkin (8): tgsi: add ureg support for image decls ureg: add buffer support to ureg tgsi: provide a way to encode memory qualifiers for SSBO tgsi: add a is_store property tgsi: update atomic op docs gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT gallium: add a RESQ opcode to query info about a resource src/gallium/auxiliary/gallivm/lp_bld_limits.h | 1 + src/gallium/auxiliary/tgsi/tgsi_build.c| 112 -- src/gallium/auxiliary/tgsi/tgsi_dump.c | 25 +- src/gallium/auxiliary/tgsi/tgsi_exec.h | 1 + src/gallium/auxiliary/tgsi/tgsi_info.c | 446 ++--- src/gallium/auxiliary/tgsi/tgsi_info.h | 1 + src/gallium/auxiliary/tgsi/tgsi_parse.c| 8 +- src/gallium/auxiliary/tgsi/tgsi_parse.h| 3 +- src/gallium/auxiliary/tgsi/tgsi_strings.c | 12 +- src/gallium/auxiliary/tgsi/tgsi_strings.h | 2 + src/gallium/auxiliary/tgsi/tgsi_text.c | 42 +- src/gallium/auxiliary/tgsi/tgsi_ureg.c | 182 + src/gallium/auxiliary/tgsi/tgsi_ureg.h | 23 ++ src/gallium/docs/source/screen.rst | 8 + src/gallium/docs/source/tgsi.rst | 105 ++--- src/gallium/drivers/freedreno/freedreno_screen.c | 3 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/ilo/shader/toy_tgsi.c | 8 +- src/gallium/drivers/llvmpipe/lp_screen.c | 1 + .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 12 +- src/gallium/drivers/nouveau/nv30/nv30_screen.c | 3 + src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 + src/gallium/drivers/r300/r300_screen.c | 3 + src/gallium/drivers/r600/r600_pipe.c | 2 + src/gallium/drivers/radeonsi/si_pipe.c | 3 + src/gallium/drivers/softpipe/sp_screen.c | 1 + src/gallium/drivers/svga/svga_screen.c | 4 + src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 2 + src/gallium/drivers/vc4/vc4_screen.c | 3 + src/gallium/drivers/virgl/virgl_screen.c | 1 + src/gallium/include/pipe/p_defines.h | 2 + src/gallium/include/pipe/p_shader_tokens.h | 28 +- 34 files changed, 
729 insertions(+), 324 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/8] tgsi: add ureg support for image decls
There is quite a bit of rename churn happening here at the same time as the bring up of ureg support for image declarations. Would it be possible to split the rename churn out from the actual behavioral changes please? On 2016-01-03 15:37, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin --- src/gallium/auxiliary/tgsi/tgsi_build.c| 62 + src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 +-- src/gallium/auxiliary/tgsi/tgsi_parse.c| 4 +- src/gallium/auxiliary/tgsi/tgsi_parse.h| 2 +- src/gallium/auxiliary/tgsi/tgsi_strings.c | 4 +- src/gallium/auxiliary/tgsi/tgsi_text.c | 10 +-- src/gallium/auxiliary/tgsi/tgsi_ureg.c | 77 ++ src/gallium/auxiliary/tgsi/tgsi_ureg.h | 7 ++ src/gallium/drivers/ilo/shader/toy_tgsi.c | 8 +-- .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 12 +++- src/gallium/drivers/svga/svga_tgsi_vgpu10.c| 2 + src/gallium/include/pipe/p_shader_tokens.h | 7 +- 12 files changed, 153 insertions(+), 52 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c index fdb7feb..bb9d0cb 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_build.c +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c @@ -259,36 +259,39 @@ tgsi_build_declaration_semantic( return ds; } -static struct tgsi_declaration_resource -tgsi_default_declaration_resource(void) +static struct tgsi_declaration_image +tgsi_default_declaration_image(void) { - struct tgsi_declaration_resource dr; + struct tgsi_declaration_image di; - dr.Resource = TGSI_TEXTURE_BUFFER; - dr.Raw = 0; - dr.Writable = 0; - dr.Padding = 0; + di.Resource = TGSI_TEXTURE_BUFFER; + di.Raw = 0; + di.Writable = 0; + di.Format = 0; + di.Padding = 0; - return dr; + return di; } -static struct tgsi_declaration_resource -tgsi_build_declaration_resource(unsigned texture, -unsigned raw, -unsigned writable, -struct tgsi_declaration *declaration, -struct tgsi_header *header) +static struct tgsi_declaration_image +tgsi_build_declaration_image(unsigned texture, + unsigned format, + unsigned raw, + unsigned 
writable, + struct tgsi_declaration *declaration, + struct tgsi_header *header) { - struct tgsi_declaration_resource dr; + struct tgsi_declaration_image di; - dr = tgsi_default_declaration_resource(); - dr.Resource = texture; - dr.Raw = raw; - dr.Writable = writable; + di = tgsi_default_declaration_image(); + di.Resource = texture; + di.Format = format; + di.Raw = raw; + di.Writable = writable; declaration_grow(declaration, header); - return dr; + return di; } static struct tgsi_declaration_sampler_view @@ -364,7 +367,7 @@ tgsi_default_full_declaration( void ) full_declaration.Range = tgsi_default_declaration_range(); full_declaration.Semantic = tgsi_default_declaration_semantic(); full_declaration.Interp = tgsi_default_declaration_interp(); - full_declaration.Resource = tgsi_default_declaration_resource(); + full_declaration.Image = tgsi_default_declaration_image(); full_declaration.SamplerView = tgsi_default_declaration_sampler_view(); full_declaration.Array = tgsi_default_declaration_array(); @@ -454,20 +457,21 @@ tgsi_build_full_declaration( header ); } - if (full_decl->Declaration.File == TGSI_FILE_RESOURCE) { - struct tgsi_declaration_resource *dr; + if (full_decl->Declaration.File == TGSI_FILE_IMAGE) { + struct tgsi_declaration_image *di; if (maxsize <= size) { return 0; } - dr = (struct tgsi_declaration_resource *)&tokens[size]; + di = (struct tgsi_declaration_image *)&tokens[size]; size++; - *dr = tgsi_build_declaration_resource(full_decl->Resource.Resource, -full_decl->Resource.Raw, - full_decl->Resource.Writable, -declaration, -header); + *di = tgsi_build_declaration_image(full_decl->Image.Resource, + full_decl->Image.Format, + full_decl->Image.Raw, + full_decl->Image.Writable, + declaration, + header); } if (full_decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) { diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c b/src/gallium/auxiliary/tgsi/tgsi_dump.c index e29ffb3..dad3839 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c ++
Re: [Mesa-dev] [PATCH 0/5] Add ARB_indirect_parameters support
In this series patches 1-4 are: Reviewed-by: Edward O'Callaghan No idea what is happening in patch 5 to say anything either way. On 2016-01-03 07:38, Ilia Mirkin wrote: The nvc0 patch applies on top of some unpublished patches, see https://github.com/imirkin/mesa/commits/tmp4 for the full thing. The whole series applies on top of the ARB_multi_draw_indirect patches I sent earlier (with potential minor modifications). There is some type confusion between the ARB_indirect_parameters spec and the Khronos gl.xml/glcorearb.h files, I went with the latter's definitions. This passes the relatively simple piglit test I sent. Ilia Mirkin (5): glapi: add ARB_indirect_parameters definitions mesa: add parameter buffer, used for ARB_indirect_parameters mesa: add support for ARB_indirect_parameters draw functions st/mesa: expose ARB_indirect_parameters when the backend driver allows nvc0: add ARB_indirect_parameters support docs/relnotes/11.2.0.html | 1 + src/gallium/drivers/nouveau/nvc0/mme/com9097.mme | 157 + src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 125 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h | 4 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 4 +- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c| 29 +++- src/mapi/glapi/gen/ARB_indirect_parameters.xml | 30 src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 6 +- src/mesa/main/api_validate.c | 115 +++ src/mesa/main/api_validate.h | 16 +++ src/mesa/main/bufferobj.c | 15 ++ src/mesa/main/extensions_table.h | 1 + src/mesa/main/get.c| 5 + src/mesa/main/get_hash_params.py | 4 + src/mesa/main/mtypes.h | 2 + src/mesa/main/tests/dispatch_sanity.cpp| 4 + src/mesa/state_tracker/st_cb_bufferobjects.c | 1 + src/mesa/state_tracker/st_extensions.c | 1 + src/mesa/vbo/vbo_exec_array.c | 124 20 files changed, 638 insertions(+), 7 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_indirect_parameters.xml ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org 
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H
This series is: Reviewed-by: Edward O'Callaghan On 2016-01-03 11:37, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin --- src/gallium/docs/source/tgsi.rst | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index 955ece8..f69998f 100644 --- a/src/gallium/docs/source/tgsi.rst +++ b/src/gallium/docs/source/tgsi.rst @@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 2x2 quad. .. opcode:: PK2H - Pack Two 16-bit Floats - TBD +.. math:: + + dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars @@ -615,7 +617,11 @@ This instruction replicates its result. .. opcode:: UP2H - Unpack Two 16-Bit Floats - TBD +.. math:: + + dst.x = f16\_to\_f32(src0.x \& 0xffff) + + dst.y = f16\_to\_f32(src0.x >> 16) .. note:: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
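The bit layout described by the two formulas can be sketched without the float conversions, by treating each half as an already-converted 16-bit pattern. This is only an illustration of the packing, assuming the low half is selected with a `0xffff` mask; the helper names are invented, and the `f32_to_f16` / `f16_to_f32` steps are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* PK2H layout: first half in bits 0..15, second half in bits 16..31. */
static uint32_t sketch_pack_2x16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* UP2H layout, first result: the low 16 bits. */
static uint16_t sketch_unpack_lo(uint32_t packed)
{
    return (uint16_t)(packed & 0xffffu);
}

/* UP2H layout, second result: the high 16 bits. */
static uint16_t sketch_unpack_hi(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}
```

For example, the half-float bit patterns `0x3c00` (1.0) and `0xc000` (-2.0) pack to `0xc0003c00`, and unpacking returns the two original patterns.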
Re: [Mesa-dev] [PATCH v3] gallium/tests: fix build with clang compiler
omg I don't know why folks insist on using GNU C nested functions, they are insane. Thanks for working through this one! Reviewed-by: Edward O'Callaghan On 2016-01-03 04:20, Samuel Pitoiset wrote: Nested functions are supported as an extension in GNU C, but Clang doesn't support them. This fixes compilation errors when (manually) building compute.c, or when passing --enable-gallium-tests to the configure script. Changes from v3: - refactor by introducing test_default_init() Changes from v2: - fix typo Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset --- src/gallium/tests/trivial/compute.c | 603 1 file changed, 330 insertions(+), 273 deletions(-) diff --git a/src/gallium/tests/trivial/compute.c b/src/gallium/tests/trivial/compute.c index bcdfb11..5ce12ab 100644 --- a/src/gallium/tests/trivial/compute.c +++ b/src/gallium/tests/trivial/compute.c @@ -428,6 +428,35 @@ static void launch_grid(struct context *ctx, const uint *block_layout, pipe->launch_grid(pipe, block_layout, grid_layout, pc, input); } +static void test_default_init(void *p, int s, int x, int y) +{ +*(uint32_t *)p = 0xdeadbeef; +} + +/* test_system_values */ +static void test_system_values_expect(void *p, int s, int x, int y) +{ +int id = x / 16, sv = (x % 16) / 4, c = x % 4; +int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 }; +int bsz[] = { 4, 3, 5, 1}; +int gsz[] = { 5, 4, 1, 1}; + +switch (sv) { +case 0: +*(uint32_t *)p = tid[c] / bsz[c]; +break; +case 1: +*(uint32_t *)p = bsz[c]; +break; +case 2: +*(uint32_t *)p = gsz[c]; +break; +case 3: +*(uint32_t *)p = tid[c] % bsz[c]; +break; +} +} + static void test_system_values(struct context *ctx) { const char *src = "COMP\n" @@ -461,44 +490,31 @@ static void test_system_values(struct context *ctx) " STORE RES[0].xyzw, TEMP[0], SV[3]\n" " RET\n" "ENDSUB\n"; -void init(void *p, int s, int x, int y) { -*(uint32_t *)p = 0xdeadbeef; -} -void expect(void *p, int s, int x, int y) { -int id = x / 16, sv = (x % 16) / 4, c = x 
% 4; -int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 }; -int bsz[] = { 4, 3, 5, 1}; -int gsz[] = { 5, 4, 1, 1}; - -switch (sv) { -case 0: -*(uint32_t *)p = tid[c] / bsz[c]; -break; -case 1: -*(uint32_t *)p = bsz[c]; -break; -case 2: -*(uint32_t *)p = gsz[c]; -break; -case 3: -*(uint32_t *)p = tid[c] % bsz[c]; -break; -} -} printf("- %s\n", __func__); init_prog(ctx, 0, 0, 0, src, NULL); init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT, - 76800, 0, init); + 76800, 0, test_default_init); init_compute_resources(ctx, (int []) { 0, -1 }); launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, NULL); -check_tex(ctx, 0, expect, NULL); +check_tex(ctx, 0, test_system_values_expect, NULL); destroy_compute_resources(ctx); destroy_tex(ctx); destroy_prog(ctx); } +/* test_resource_access */ +static void test_resource_access_init0(void *p, int s, int x, int y) +{ +*(float *)p = 8.0 - (float)x; +} + +static void test_resource_access_expect(void *p, int s, int x, int y) +{ +*(float *)p = 8.0 - (float)((x + 4 * y) & 0x3f); +} + static void test_resource_access(struct context *ctx) { const char *src = "COMP\n" @@ -519,31 +535,33 @@ static void test_resource_access(struct context *ctx) " STORE RES[1].xyzw, TEMP[1], TEMP[0]\n" " RET\n" "ENDSUB\n"; -void init0(void *p, int s, int x, int y) { -*(float *)p = 8.0 - (float)x; -} -void init1(void *p, int s, int x, int y) { -*(uint32_t *)p = 0xdeadbeef; -} -void expect(void *p, int s, int x, int y) { -*(float *)p = 8.0 - (float)((x + 4*y) & 0x3f); -} printf("- %s\n", __func__); init_prog(ctx, 0, 0, 0, src, NULL); init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT, - 256, 0, init0); + 256, 0, test_resource_access_init0); init_tex(ctx, 1, PIPE_TEXTURE_2D, true, PIPE_FORMAT_R32_FLOAT, - 60, 12, init1); + 60, 12, test_default_init); init_compute_r
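The shape of the fix is portable C: callbacks that were nested inside each test become `static` functions at file scope and are passed by pointer. A reduced sketch of that pattern follows, assuming a made-up `sketch_fill` helper in place of the test's `init_tex`, whose real signature takes more arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Same callback shape as in the test file: element pointer, element
 * size, and x/y coordinates. */
typedef void (*sketch_init_fn)(void *p, int s, int x, int y);

/* File-scope replacement for a nested init() function. It compiles with
 * both GCC and Clang because it does not rely on the GNU C extension. */
static void sketch_default_init(void *p, int s, int x, int y)
{
    (void)s;
    (void)x;
    (void)y;
    *(uint32_t *)p = 0xdeadbeef;
}

/* Hypothetical helper standing in for init_tex(): calls the callback
 * once per element of a width x height buffer. */
static void sketch_fill(uint32_t *buf, int width, int height,
                        sketch_init_fn init)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            init(&buf[y * width + x], (int)sizeof(uint32_t), x, y);
}
```

Because none of the original nested functions captured local variables, hoisting them does not change behavior; it only requires unique names per test, which is what the `test_*_init` and `test_*_expect` prefixes in the patch provide.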
Re: [Mesa-dev] [PATCH 0/9] RadeonSI: Some shaders cleanups
Well can't disagree with anything in this series and it certainly makes `si_shader.c' a tiny bit easier to understand for me. Hopefully not too much fallout with the debug callback patch series as they will need to be rebased on top of this :| Thus this series is, Reviewed-by: Edward O'Callaghan On 2016-01-02 01:13, Marek Olšák wrote: Hi, These are shader cleanups mostly around si_compile_llvm. You may wonder about the "move si_shader_binary_upload out of xxx" patches. They are part of my one-variant-per-shader rework, which needs a lot of restructuring. Besides this, I have 2 more series of cleanup patches, which I will send when this lands. Please review. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/6] gallium/radeon: implement set_debug_callback
Fantastic thanks! This series is, Reviewed-by: Edward O'Callaghan On 2015-12-31 13:30, Nicolai Hähnle wrote: From: Nicolai Hähnle --- src/gallium/drivers/radeon/r600_pipe_common.c | 12 src/gallium/drivers/radeon/r600_pipe_common.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 9a5e987..41c7aa5 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -227,6 +227,17 @@ static enum pipe_reset_status r600_get_reset_status(struct pipe_context *ctx) return PIPE_UNKNOWN_CONTEXT_RESET; } +static void r600_set_debug_callback(struct pipe_context *ctx, + const struct pipe_debug_callback *cb) +{ + struct r600_common_context *rctx = (struct r600_common_context *)ctx; + + if (cb) + rctx->debug = *cb; + else + memset(&rctx->debug, 0, sizeof(rctx->debug)); +} + bool r600_common_context_init(struct r600_common_context *rctx, struct r600_common_screen *rscreen) { @@ -252,6 +263,7 @@ bool r600_common_context_init(struct r600_common_context *rctx, rctx->b.transfer_inline_write = u_default_transfer_inline_write; rctx->b.memory_barrier = r600_memory_barrier; rctx->b.flush = r600_flush_from_st; + rctx->b.set_debug_callback = r600_set_debug_callback; if (rscreen->info.drm_major == 2 && rscreen->info.drm_minor >= 43) { rctx->b.get_device_reset_status = r600_get_reset_status; diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index c3933b1d..a69e627 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -440,6 +440,8 @@ struct r600_common_context { * the GPU addresses are updated. */ struct list_headtexture_buffers; + struct pipe_debug_callback debug; + /* Copy one resource to another using async DMA. 
*/ void (*dma_copy)(struct pipe_context *ctx, struct pipe_resource *dst, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
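The setter in this patch follows a simple copy-or-clear pattern: a non-NULL argument is copied into the context, and NULL resets the stored callback so later code can test the function pointer before calling it. A self-contained sketch under assumed types is shown below; the struct layouts and the `sketch_report` helper are invented, and the real `pipe_debug_callback` has more members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced callback struct. */
struct sketch_debug_callback {
    void (*debug_message)(void *data, const char *msg);
    void *data;
};

struct sketch_context {
    struct sketch_debug_callback debug;
};

/* Copy the caller's callback by value, or clear it when cb is NULL. */
static void sketch_set_debug_callback(struct sketch_context *ctx,
                                      const struct sketch_debug_callback *cb)
{
    if (cb)
        ctx->debug = *cb;
    else
        memset(&ctx->debug, 0, sizeof(ctx->debug));
}

/* Invented consumer: only calls through the pointer when one is set. */
static void sketch_report(struct sketch_context *ctx, const char *msg)
{
    if (ctx->debug.debug_message)
        ctx->debug.debug_message(ctx->debug.data, msg);
}

/* Example callback that counts messages through its user-data pointer. */
static void sketch_count_messages(void *data, const char *msg)
{
    (void)msg;
    ++*(int *)data;
}
```

Storing a copy rather than the pointer means the caller's struct does not have to outlive the context.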
Re: [Mesa-dev] [PATCH 1/3] glsl: annotate ast_process_struct_or_iface_block_members() as static
This series is, Reviewed-by: Edward O'Callaghan On 2015-12-29 21:02, Timothy Arceri wrote: From: Emil Velikov Reviewed-by: Timothy Arceri --- src/glsl/ast_to_hir.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index bb35d72..d51f095 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -6201,7 +6201,7 @@ ast_type_specifier::hir(exec_list *instructions, * The number of fields processed. A pointer to the array structure fields is * stored in \c *fields_ret. */ -unsigned +static unsigned ast_process_struct_or_iface_block_members(exec_list *instructions, struct _mesa_glsl_parse_state *state, exec_list *declarations, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] V2 ARB_enhanced_layouts component qualifier support
On 2015-12-29 16:00, Timothy Arceri wrote: This series adds support for the component layout qualifier by enhancing the varying packing pass at the GLSL IR level. The advantage to this approach is that it's fairly simple and will work for all drivers, the disadvantage is that it relies on optimisation passes to clean up the mess. I'm personally a fan of this approach so I think abstract IR passes are the way to go over hand holding bulky driver backends. I see the main issue is around the interaction with tessellation; admittedly I don't fully understand the implications of patch 16, although patch 26 helped clarify the issue a little. It does sound to me that the duplicate dimensionality from AoA and so on are solvable at the IR level in any case. Thus, this series is: Reviewed-by: Edward O'Callaghan [PATCH 01/28] glsl: only add outward facing varyings to resourse list Patch 1: Bugfix for SSO but also required later in the series to add packed vertex inputs to the resource list. [PATCH 02/28] glsl: move lowering after matching validation [PATCH 03/28] glsl: don't change the varying type in validation code [PATCH 04/28] glsl: remove unused varyings before packing them Patches 2-4: Required to allow unused varyings with explicit locations to be removed before packing. 
[PATCH 05/28] glsl: create helper to remove outer vertex index array [PATCH 06/28] glsl: fix overlapping of varying locations for arrays [PATCH 07/28] glsl: don't try adding build-ins to explicit locations Patches 5-7: are SSO bugfixes [PATCH 08/28] glsl: parse component layout qualifier [PATCH 09/28] glsl: validate and store component layout qualifier in [PATCH 10/28] glsl: fix cross validation for explicit locations on [PATCH 11/28] glsl: cross validate varyings with a component [PATCH 12/28] glsl: update explicit location matching to support [PATCH 13/28] glsl: include varyings with explicit locations in slot [PATCH 14/28] glsl: pass disable_varying_packing bool to the lowering [PATCH 15/28] glsl: add support for packing varyings with explicit [PATCH 16/28] glsl: don't pack tessellation stages like we do other [PATCH 17/28] glsl: enable lowering of varyings with explicit [PATCH 18/28] glsl: validate linking of intrastage component [PATCH 19/28] glsl: add support for explicit components to frag [PATCH 20/28] glsl: pack vertex attributes with component layout [PATCH 21/28] glsl: pack fragment shader outputs with component [PATCH 22/28] glsl: get geometry shader vertex count from type when [PATCH 23/28] glsl: add pack varying to resource list for vertex [PATCH 24/28] glsl: make needs_lowering() a shared packing helper [PATCH 25/28] glsl: move packed varying creation code to a helper [PATCH 26/28] glsl: lower tessellation varyings packed with component [PATCH 27/28] mesa: add LOCATION_COMPONENT support to Patches 8-28: add the component layout qualifier support. [PATCH 28/28] docs: mark component layout qualifiers as DONE ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/28] glsl: don't pack tessellation stages like we do other stages
On 2015-12-29 16:00, Timothy Arceri wrote:

Tessellation shaders treat varyings as shared memory and invocations can access each other's varyings, therefore we can't use the existing method to lower them. This adds a check for these stages as following patches will allow explicit locations to be lowered even when the driver and existing tessellation checks ask for it to be disabled; we do this to enable support for the component layout qualifier.

I find this a little hard to read and understand. Could you brush it up a bit please, if that's OK?

---
 src/glsl/lower_packed_varyings.cpp | 62 +-
 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp
index 2899846..e4e9a35 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -737,40 +737,46 @@ lower_packed_varyings(void *mem_ctx, unsigned locations_used,
                       ir_variable_mode mode, unsigned gs_input_vertices,
                       gl_shader *shader, bool disable_varying_packing)
 {
-   exec_list *instructions = shader->ir;
    ir_function *main_func = shader->symbols->get_function("main");
    exec_list void_parameters;
    ir_function_signature *main_func_sig = main_func->matching_signature(NULL, &void_parameters, false);
-   exec_list new_instructions, new_variables;
-   lower_packed_varyings_visitor visitor(mem_ctx, locations_used, mode,
-                                         gs_input_vertices,
-                                         &new_instructions,
-                                         &new_variables,
-                                         disable_varying_packing);
-   visitor.run(shader);
-   if (mode == ir_var_shader_out) {
-      if (shader->Stage == MESA_SHADER_GEOMETRY) {
-         /* For geometry shaders, outputs need to be lowered before each call
-          * to EmitVertex()
-          */
-         lower_packed_varyings_gs_splicer splicer(mem_ctx, &new_instructions);
-
-         /* Add all the variables in first. */
-         main_func_sig->body.head->insert_before(&new_variables);
-         /* Now update all the EmitVertex instances */
-         splicer.run(instructions);
+   if (!(shader->Stage == MESA_SHADER_TESS_CTRL ||
+         shader->Stage == MESA_SHADER_TESS_EVAL)) {
+      exec_list *instructions = shader->ir;
+      exec_list new_instructions, new_variables;
+
+      lower_packed_varyings_visitor visitor(mem_ctx, locations_used, mode,
+                                            gs_input_vertices,
+                                            &new_instructions,
+                                            &new_variables,
+                                            disable_varying_packing);
+      visitor.run(shader);
+      if (mode == ir_var_shader_out) {
+         if (shader->Stage == MESA_SHADER_GEOMETRY) {
+            /* For geometry shaders, outputs need to be lowered before each
+             * call to EmitVertex()
+             */
+            lower_packed_varyings_gs_splicer splicer(mem_ctx,
+                                                     &new_instructions);
+
+            /* Add all the variables in first. */
+            main_func_sig->body.head->insert_before(&new_variables);
+
+            /* Now update all the EmitVertex instances */
+            splicer.run(instructions);
+         } else {
+            /* For other shader types, outputs need to be lowered at the end
+             * of main()
+             */
+            main_func_sig->body.append_list(&new_variables);
+            main_func_sig->body.append_list(&new_instructions);
+         }
       } else {
-         /* For other shader types, outputs need to be lowered at the end of
-          * main()
-          */
-         main_func_sig->body.append_list(&new_variables);
-         main_func_sig->body.append_list(&new_instructions);
+         /* Shader inputs need to be lowered at the beginning of main() */
+         main_func_sig->body.head->insert_before(&new_instructions);
+         main_func_sig->body.head->insert_before(&new_variables);
       }
-   } else {
-      /* Shader inputs need to be lowered at the beginning of main() */
-      main_func_sig->body.head->insert_before(&new_instructions);
-      main_func_sig->body.head->insert_before(&new_variables);
    }
 }

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 26/28] glsl: lower tessellation varyings packed with component layout qualifier
On 2015-12-29 16:00, Timothy Arceri wrote: For tessellation shaders we cannot just copy everything to the packed varyings like we do in other stages as tessellation uses shared memory for varyings, therefore it is only safe to copy array elements that the shader actually uses. This class searches the IR for uses of varyings and then creates instructions that copy those vars to a packed varying. This means it is easy to end up with duplicate copies if the varying is used more than once, also arrays of arrays create a duplicate copy for each dimension that exists. These issues are not easily resolved without breaking various corner cases so we leave it to a later IR stage to clean up the mess. Note that neither GLSL IR nor NIR can currently can't clean up the s/can\'t// duplicates when and indirect is used as an array index. This patch assumes that NIR will eventually be able to clean this up. --- src/glsl/lower_packed_varyings.cpp | 421 + 1 file changed, 421 insertions(+) diff --git a/src/glsl/lower_packed_varyings.cpp b/src/glsl/lower_packed_varyings.cpp index b606cc8..9522969 100644 --- a/src/glsl/lower_packed_varyings.cpp +++ b/src/glsl/lower_packed_varyings.cpp @@ -148,10 +148,28 @@ #include "ir.h" #include "ir_builder.h" #include "ir_optimization.h" +#include "ir_rvalue_visitor.h" #include "program/prog_instruction.h" +#include "util/hash_table.h" using namespace ir_builder; +/** + * Creates new type for and array when the base type changes. 
+ */ +static const glsl_type * +update_packed_array_type(const glsl_type *type, const glsl_type *packed_type) +{ + const glsl_type *element_type = type->fields.array; + if (element_type->is_array()) { + const glsl_type *new_array_type = +update_packed_array_type(element_type, packed_type); + return glsl_type::get_array_instance(new_array_type, type->length); + } else { + return glsl_type::get_array_instance(packed_type, type->length); + } +} + static bool needs_lowering(ir_variable *var, bool has_enhanced_layouts, bool disable_varying_packing) @@ -205,6 +223,51 @@ create_packed_var(void * const mem_ctx, const char *packed_name, return packed_var; } +/** + * Creates a packed varying for the tessellation packing. + */ +static ir_variable * +create_tess_packed_var(void *mem_ctx, ir_variable *unpacked_var) +{ + /* create packed varying name using location */ + char location_str[11]; + snprintf(location_str, 11, "%d", unpacked_var->data.location); + char *packed_name; + if ((ir_variable_mode) unpacked_var->data.mode == ir_var_shader_out) + packed_name = ralloc_asprintf(mem_ctx, "packed_out:%s", location_str); + else + packed_name = ralloc_asprintf(mem_ctx, "packed_in:%s", location_str); + + const glsl_type *packed_type; + switch (unpacked_var->type->without_array()->base_type) { + case GLSL_TYPE_UINT: + packed_type = glsl_type::uvec4_type; + break; + case GLSL_TYPE_INT: + packed_type = glsl_type::ivec4_type; + break; + case GLSL_TYPE_FLOAT: + packed_type = glsl_type::vec4_type; + break; + case GLSL_TYPE_DOUBLE: + packed_type = glsl_type::dvec4_type; + break; + default: + assert(!"Unexpected type in tess varying packing"); + return NULL; + } + + /* Create array new array type */ + if (unpacked_var->type->is_array()) { + packed_type = update_packed_array_type(unpacked_var->type, packed_type); + } + + return create_packed_var(mem_ctx, packed_name, packed_type, unpacked_var, +(ir_variable_mode) unpacked_var->data.mode, +unpacked_var->data.location, 
+unpacked_var->type->is_array()); +} + namespace { /** @@ -763,6 +826,296 @@ lower_packed_varyings_gs_splicer::visit_leave(ir_emit_vertex *ev) } +/** + * For tessellation shaders we cannot just copy everything to the packed + * varyings like we do in other stages as tessellation uses shared memory for + * varyings, therefore it is only safe to copy array elements that the shader + * actually uses. + * + * This class searches the IR for uses of varyings and then creates + * instructions that copy those vars to a packed varying. This means it is + * easy to end up with duplicate copies if the varying is used more than once, + * also arrays of arrays create a duplicate copy for each dimension that + * exists. These issues are not easily resolved without breaking various + * corner cases so we leave it to a later IR stage to clean up the mess. + */ +class lower_packed_varyings_tess_visitor : public ir_rvalue_visitor +{ +public: + lower_packed_varyings_tess_visitor(void *mem_ctx, hash_table *varyings, + ir_variable_mode mode) + : mem_ctx(mem_ctx), varyings(varyings), mode(mode) + { + } + + virtual ~lower_packed_varyings_tess_visitor() + { + } + + virtual ir
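The recursion in update_packed_array_type() quoted above can be hard to follow in a flattened diff, so here is a minimal C sketch of the same idea: walk down an array-of-arrays type and rebuild every dimension on top of a new base type. The `toy_type` struct and the `toy_*` helpers are hypothetical stand-ins, not Mesa's glsl_type API, and the sketch never frees what it allocates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for glsl_type: a base type or an array of elements. */
struct toy_type {
   const struct toy_type *element; /* NULL for a non-array (base) type */
   unsigned length;                /* array length, 0 for base types */
   const char *name;               /* base type name, NULL for arrays */
};

static const struct toy_type toy_float = { NULL, 0, "float" };
static const struct toy_type toy_vec4 = { NULL, 0, "vec4" };

/* Allocate a new array type; leaked on purpose to keep the sketch short. */
static const struct toy_type *
toy_array_of(const struct toy_type *element, unsigned length)
{
   struct toy_type *t = malloc(sizeof(*t));
   if (!t)
      return NULL;
   t->element = element;
   t->length = length;
   t->name = NULL;
   return t;
}

/* Rebuild `type` (which must be an array) with `packed` as its innermost
 * element type, keeping the length of every dimension.
 */
static const struct toy_type *
toy_update_packed_array_type(const struct toy_type *type,
                             const struct toy_type *packed)
{
   const struct toy_type *element;

   assert(type->element != NULL);
   element = type->element;

   if (element->element != NULL) {
      /* Inner dimension is itself an array: rebuild it first. */
      const struct toy_type *inner =
         toy_update_packed_array_type(element, packed);
      return toy_array_of(inner, type->length);
   }

   /* Innermost dimension: swap the base type. */
   return toy_array_of(packed, type->length);
}
```

For a float[3][2] input and a vec4 packed type this yields vec4[3][2], which matches what the quoted helper appears to do via glsl_type::get_array_instance().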
Re: [Mesa-dev] [PATCH v5] Add .mailmap
Should I be expecting to see myself on here? On 2015-12-28 20:50, Giuseppe Bilotta wrote: This adds a first tentative .mailmap file, to canonicize contributor name/emails in shortlogs and other statistical endeavours. Signed-off-by: Giuseppe Bilotta --- Hopefully the last time I need to submit this … .mailmap | 460 +++ 1 file changed, 460 insertions(+) create mode 100644 .mailmap diff --git a/.mailmap b/.mailmap new file mode 100644 index 000..10811c0 --- /dev/null +++ b/.mailmap @@ -0,0 +1,460 @@ +Aapo Tahkola + +Adam Jackson +Adam Jackson + +Adrian Marius Negreanu Adrian Negreanu +Adrian Marius Negreanu Negreanu Marius Adrian + +Dave Airlie +Dave Airlie airlied +Dave Airlie +Dave Airlie +Dave Airlie +Dave Airlie +Dave Airlie +Dave Airlie +Dave Airlie + +Alan Coopersmith + +Alan Hourihane +Alan Hourihane +Alan Hourihane + +Alexander Monakov + +Alexander von Gluck IV Alexander von Gluck + +Alex Corscadden +Alex Corscadden + +Alex Deucher +Alex Deucher +Alex Deucher +Alex Deucher +Alex Deucher +Alex Deucher + +Andreas Fänger + +Andreas Hartmetz + +Andre Heider +Andreas Heider + +Andreas Pokorny + +Andrew Randrianasulu +Andrew Randrianasulu + +Arthur Huillet Arthur HUILLET + +Benjamin Franzke ben + +Ben Skeggs +Ben Skeggs +Ben Skeggs +Ben Skeggs +Ben Skeggs +Ben Skeggs +Ben Skeggs + +Ben Widawsky Ben Widawsky + +Blair Sadewitz Blair Sadewitz + +Boris Peterbarg reist + +Brian Paul Brian +Brian Paul +Brian Paul +Brian Paul +Brian Paul brian +Brian Paul Brian +Brian Paul Brian +Brian Paul Brian +Brian Paul Brian +Brian Paul Brian +Brian Paul Brian +Brian Paul root +Brian Paul root +Brian Paul root +Brian Paul root + +Bruce Merry + +Carl-Philip Hänsch Carl-Philip Haensch +Carl-Philip Hänsch Carl-Philip Haensch +Carl-Philip Hänsch Carl-Philip Haensch + +Chad Versace +Chad Versace c...@chad-versace.us> +Chad Versace + +Chia-I Wu +Chia-I Wu Chia-Wu + +Chih-Wei Huang Chih-Wei Huang + +Christian König Christian Koenig +Christian König Christian König +Christian König 
Christian König + +Christoph Brill Christoph Bill +Christoph Brill + +Christoph Bumiller + +Christopher James Halse Rogers Christopher James Halse Rogers + +Claudio Ciccani +Claudio Ciccani + +Connor Abbott +Connor Abbott + +Corbin Simpson +Corbin Simpson + +Courtney Goeltzenleuchter + +Daniel Skinner sio + +Daniel Stone + +David Miller David S. Miller +David Miller Dave Miller +David Miller davem69 + +David Heidelberger David Heidelberg +David Heidelberger + +David Reveman + +Dieter Nützel Dieter Nützel + +Dmitry Cherkassov Dmitry Cherkasov + +Dylan Baker + +Emeric Grange Emeric + +Emil Velikov + +Eric Anholt Eric Anholt + +Eugeni Dodonov + +Fabian Bieler +Fabian Bieler <> + +Feng, Haitao Haitao Feng + +Frank Henigman + +George Sapountzis George Sapountzis + +Gwenole Beauchesne + +Hamish Marson hmarson + +Hans de Goede Hans de Goede + +Homer Hsing + +Hui Qi Tay + +Ian Romanick +Ian Romanick + +Jakob Bornecrantz +Jakob Bornecrantz +Jakob Bornecrantz +Jakob Bornecrantz +Jakob Bornecrantz +Jakob Bornecrantz + +Jakub Bogusz + +James Legg + +Jan Vesely Jan Vesely + +Jason Ekstrand + +Jeremy Huddleston +Jeremy Huddleston +Jeremy Huddleston +Jeremy Huddleston +Jeremy Huddleston Jeremy Huddleston Sequoia + +Jeremy Kolb + +Jerome Glisse +Jerome Glisse +Jerome Glisse John Doe +Jerome Glisse John Doe + +Jesse Barnes +Jesse Barnes +Jesse Barnes +Jesse Barnes +Jesse Barnes + +Joakim Sindholt +Joakim Sindholt + +Jochen Gerlach jtg + +Joel Bosveld + +Jonathan Adamczewski + +Jon Turney Jon TURNEY + +José Fonseca Jose Fonseca +José Fonseca Jose Fonseca +José Fonseca +José Fonseca +José Fonseca +José Fonseca +José Fonseca + +Jouk Jansen Jouk Jansen +Jouk Jansen Jouk Jansen +Jouk Jansen joukj +Jouk Jansen Jouk +Jouk Jansen Jouk +Jouk Jansen J.Jansen + +Juan Zhao + +Julien Cristau + +Julien Isorce + +Kalyan Kondapally + +Karl Schultz Karl Schultze +Karl Schultz unknown +Karl Schultz +Karl Schultz +Karl Schultz + +Keith Harrison sio2 + +Keith Packard +Keith Packard + +Keith Whitwell 
+Keith Whitwell keithw + +Kristian Høgsberg +Kristian Høgsberg +Kristian Høgsberg +Kristian Høgsberg +Kristian Høgsberg + +Krzesimir Nowak + +Li Peng + +Lucas Stach + +Maarten Lankhorst +Maarten Lankhorst + +Maciej Cencora + +Marc-André Lureau Marc-Andre Lureau + +Marc Dietrich Marc +Marc Dietrich marvin24 + +Marcin Ślusarz Marcin Slusarz + +Marek Olšák + +Mario Kleiner kleinerm +Mario Kleiner + +Mark Mueller + +Marta Lofstedt + +Martin Peres + +Mathias Fröhlich Mathias Froehlich +Mat
Re: [Mesa-dev] [PATCH 0/10] Tessellation shaders for Gen7/7.5.
Reviewed-by: Edward O'Callaghan

Congrats on getting this working, also thanks!

On 2015-12-25 12:34, Kenneth Graunke wrote:

This morning, I woke up and somehow "knew" what was causing my HS GPU hangs on Gen7/7.5. It turns out I was (completely) wrong, but through some miraculous series of illogical leaps, I arrived at a solution anyway.

I don't honestly know how I got it working on Christmas Eve after failing to figure it out for months on end. After exhausting every bit of documentation and every tool available, and finding zero information, somehow randomly flailing in the dark resulted in a solution, today of all days. Honestly, I had pretty much no hope for figuring this out, so I'm relieved to have it working at last...

It turns out that setting interleave on the EOT URB write does bad things. Fixing this fixed all the GPU hangs when releasing inputs one at a time. I then added back the ability to release inputs in pairs, which caused more GPU hangs. It turned out I needed to be more careful and enable both halves.

Everything seems to be working just fine now, so let's turn it on.

--Ken

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/13] i965: Only call brw_upload_tcs/tes_prog when using tessellation.
On 2015-12-22 21:20, Kenneth Graunke wrote: If there's no evaluation shader, tessellation is disabled. The upload functions would just bail. Instead, don't bother calling them. This will simplify the optional-TCS case a bit, as brw_upload_tcs can assume that we're doing tessellation. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_state_upload.c | 11 +-- src/mesa/drivers/dri/i965/brw_tcs.c | 17 - src/mesa/drivers/dri/i965/brw_tes.c | 9 - 3 files changed, 13 insertions(+), 24 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 56962d5..af9fb5b 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -678,8 +678,15 @@ brw_upload_programs(struct brw_context *brw, { if (pipeline == BRW_RENDER_PIPELINE) { brw_upload_vs_prog(brw); - brw_upload_tcs_prog(brw); - brw_upload_tes_prog(brw); + if (brw->tess_eval_program) { + brw_upload_tcs_prog(brw); + brw_upload_tes_prog(brw); + } else { + brw->tcs.prog_data = NULL; + brw->tcs.base.prog_data = NULL; + brw->tes.prog_data = NULL; + brw->tes.base.prog_data = NULL; + } if (brw->gen < 6) brw_upload_ff_gs_prog(brw); diff --git a/src/mesa/drivers/dri/i965/brw_tcs.c b/src/mesa/drivers/dri/i965/brw_tcs.c index b5eb4cd..037a2da 100644 --- a/src/mesa/drivers/dri/i965/brw_tcs.c +++ b/src/mesa/drivers/dri/i965/brw_tcs.c @@ -187,6 +187,10 @@ brw_upload_tcs_prog(struct brw_context *brw) /* BRW_NEW_TESS_CTRL_PROGRAM */ struct brw_tess_ctrl_program *tcp = (struct brw_tess_ctrl_program *) brw->tess_ctrl_program; + /* BRW_NEW_TESS_EVAL_PROGRAM */ + struct brw_tess_eval_program *tep = + (struct brw_tess_eval_program *) brw->tess_eval_program; + assert(tcp && tep); if (!brw_state_dirty(brw, _NEW_TEXTURE, @@ -195,15 +199,6 @@ brw_upload_tcs_prog(struct brw_context *brw) BRW_NEW_TESS_EVAL_PROGRAM)) return; - if (tcp == NULL) { - /* Other state atoms had better not try to access prog_data, since - * there's 
no HS program. - */ - brw->tcs.prog_data = NULL; - brw->tcs.base.prog_data = NULL; - return; - } - struct gl_program *prog = &tcp->program.Base; memset(&key, 0, sizeof(key)); @@ -216,13 +211,9 @@ brw_upload_tcs_prog(struct brw_context *brw) brw_populate_sampler_prog_key_data(ctx, prog, stage_state->sampler_count, &key.tex); - /* BRW_NEW_TESS_EVAL_PROGRAM */ /* We need to specialize our code generation for tessellation levels * based on the domain the DS is expecting to tessellate. */ - struct brw_tess_eval_program *tep = - (struct brw_tess_eval_program *) brw->tess_eval_program; - assert(tep); key.tes_primitive_mode = tep->program.PrimitiveMode; Does this compile? You've killed off *tep yet we still dereference it. if (!brw_search_cache(&brw->cache, BRW_CACHE_TCS_PROG, diff --git a/src/mesa/drivers/dri/i965/brw_tes.c b/src/mesa/drivers/dri/i965/brw_tes.c index 3c12706..4b2bf8c 100644 --- a/src/mesa/drivers/dri/i965/brw_tes.c +++ b/src/mesa/drivers/dri/i965/brw_tes.c @@ -241,15 +241,6 @@ brw_upload_tes_prog(struct brw_context *brw) BRW_NEW_TESS_EVAL_PROGRAM)) return; - if (tep == NULL) { - /* Other state atoms had better not try to access prog_data, since - * there's no TES program. - */ - brw->tes.prog_data = NULL; - brw->tes.base.prog_data = NULL; - return; - } - struct gl_program *prog = &tep->program.Base; memset(&key, 0, sizeof(key)); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
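On the "does this compile?" question: in the diff as quoted, the `tep` declaration is added near the top of brw_upload_tcs_prog() together with `assert(tcp && tep)`, so the later dereference still has a declaration in scope. The overall pattern (the caller decides whether tessellation is active, the callee only asserts) can be sketched in C as below. All `toy_*` names are hypothetical and the "upload" is a placeholder, not the i965 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the context and stage state. */
struct toy_stage {
   const void *prog_data;
};

struct toy_context {
   const void *tess_ctrl_program;
   const void *tess_eval_program;
   struct toy_stage tcs;
   struct toy_stage tes;
};

static void
toy_upload_tcs_prog(struct toy_context *ctx)
{
   const void *tcp = ctx->tess_ctrl_program;
   const void *tep = ctx->tess_eval_program;

   /* The caller guarantees tessellation is enabled, so just assert. */
   assert(tcp && tep);
   ctx->tcs.prog_data = tcp; /* placeholder for the real compile/cache lookup */
}

static void
toy_upload_tes_prog(struct toy_context *ctx)
{
   assert(ctx->tess_eval_program);
   ctx->tes.prog_data = ctx->tess_eval_program; /* placeholder */
}

static void
toy_upload_programs(struct toy_context *ctx)
{
   if (ctx->tess_eval_program) {
      toy_upload_tcs_prog(ctx);
      toy_upload_tes_prog(ctx);
   } else {
      /* No evaluation shader: tessellation is off, clear stale state here
       * instead of inside each upload function.
       */
      ctx->tcs.prog_data = NULL;
      ctx->tes.prog_data = NULL;
   }
}
```

The design choice this illustrates is moving the NULL handling to a single place in the caller, which lets the upload functions assume a valid program.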
Re: [Mesa-dev] [PATCH 1/8] draw: rework handling of non-existing outputs in emit code
Thanks for the most comprehensive cleanup Roland and fixing that minor regression we discussed. Happy holiday's. Reviewed-by: Edward O'Callaghan On 2015-12-22 14:00, srol...@vmware.com wrote: From: Roland Scheidegger Previously the code would just redirect requests for attributes which don't exist to use output 0. Rework this to output all zeros instead which seems more useful - in particular some extensions like ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by previous stages. That way, drivers don't have to special case this depending if the vs/gs outputs some attribute or not. --- src/gallium/auxiliary/draw/draw_pipe_vbuf.c | 52 + src/gallium/auxiliary/draw/draw_pt_emit.c | 12 +++ src/gallium/auxiliary/draw/draw_vertex.h| 4 +-- 3 files changed, 45 insertions(+), 23 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_pipe_vbuf.c b/src/gallium/auxiliary/draw/draw_pipe_vbuf.c index f36706c..81c4fed 100644 --- a/src/gallium/auxiliary/draw/draw_pipe_vbuf.c +++ b/src/gallium/auxiliary/draw/draw_pipe_vbuf.c @@ -74,9 +74,10 @@ struct vbuf_stage { unsigned max_indices; unsigned nr_indices; - /* Cache point size somewhere it's address won't change: + /* Cache point size somewhere its address won't change: */ float point_size; + float zero4[4]; struct translate_cache *cache; }; @@ -205,6 +206,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim ) struct translate_key hw_key; unsigned dst_offset; unsigned i; + const struct vertex_info *vinfo; vbuf->render->set_primitive(vbuf->render, prim); @@ -215,27 +217,33 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim ) * state change. */ vbuf->vinfo = vbuf->render->get_vertex_info(vbuf->render); - vbuf->vertex_size = vbuf->vinfo->size * sizeof(float); + vinfo = vbuf->vinfo; + vbuf->vertex_size = vinfo->size * sizeof(float); /* Translate from pipeline vertices to hw vertices. 
*/ dst_offset = 0; - for (i = 0; i < vbuf->vinfo->num_attribs; i++) { + for (i = 0; i < vinfo->num_attribs; i++) { unsigned emit_sz = 0; unsigned src_buffer = 0; enum pipe_format output_format; - unsigned src_offset = (vbuf->vinfo->attrib[i].src_index * 4 * sizeof(float) ); + unsigned src_offset = (vinfo->attrib[i].src_index * 4 * sizeof(float) ); - output_format = draw_translate_vinfo_format(vbuf->vinfo->attrib[i].emit); - emit_sz = draw_translate_vinfo_size(vbuf->vinfo->attrib[i].emit); + output_format = draw_translate_vinfo_format(vinfo->attrib[i].emit); + emit_sz = draw_translate_vinfo_size(vinfo->attrib[i].emit); /* doesn't handle EMIT_OMIT */ assert(emit_sz != 0); - if (vbuf->vinfo->attrib[i].emit == EMIT_1F_PSIZE) { -src_buffer = 1; -src_offset = 0; + if (vinfo->attrib[i].emit == EMIT_1F_PSIZE) { + src_buffer = 1; + src_offset = 0; + } + else if (vinfo->attrib[i].src_index == 255) { + /* elements which don't exist will get assigned zeros */ + src_buffer = 2; + src_offset = 0; } hw_key.element[i].type = TRANSLATE_ELEMENT_NORMAL; @@ -249,7 +257,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim ) dst_offset += emit_sz; } - hw_key.nr_elements = vbuf->vinfo->num_attribs; + hw_key.nr_elements = vinfo->num_attribs; hw_key.output_stride = vbuf->vertex_size; /* Don't bother with caching at this stage: @@ -261,6 +269,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim ) vbuf->translate = translate_cache_find(vbuf->cache, &hw_key); vbuf->translate->set_buffer(vbuf->translate, 1, &vbuf->point_size, 0, ~0); + vbuf->translate->set_buffer(vbuf->translate, 2, &vbuf->zero4[0], 0, ~0); } vbuf->point_size = vbuf->stage.draw->rasterizer->point_size; @@ -428,7 +437,7 @@ struct draw_stage *draw_vbuf_stage( struct draw_context *draw, struct vbuf_stage *vbuf = CALLOC_STRUCT(vbuf_stage); if (!vbuf) goto fail; - + vbuf->stage.draw = draw; vbuf->stage.name = "vbuf"; vbuf->stage.point = vbuf_first_point; @@ -437,29 +446,30 @@ struct draw_stage *draw_vbuf_stage( struct 
draw_context *draw, vbuf->stage.flush = vbuf_flush; vbuf->stage.reset_stipple_counter = vbuf_reset_stipple_counter; vbuf->stage.destroy = vbuf_destroy; - + vbuf->render = render; vbuf->max_indices = MIN2(render->max_indices, UNDEFINED_VERTEX_ID-1); - vbuf->indices = (ushort *) align_malloc( vbuf->max_indices * - sizeof(vbuf->indices[0]), - 16 ); + vbuf->indices = (ushort *) align_malloc(vbuf->max_indices * +sizeof(vbuf->indices[0]), +16); if (!vbuf->indices) goto fail; vbuf->cache = translate_cache_create(); - if (!vbuf->cache) + if (!vbuf->ca
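The core idea of the patch, as described in its commit message, is that an attribute whose source output does not exist is read from a buffer of zeros instead of being redirected to output 0. A minimal C sketch of that selection follows; the value 255 comes from the `src_index == 255` test in the quoted diff, while `toy_fetch_attrib` and the flat array layout are assumptions made for illustration only.

```c
#include <string.h>

#define TOY_SRC_INDEX_NONE 255 /* the quoted patch tests src_index == 255 */

/* Copy one vec4 attribute into dst, substituting zeros when the previous
 * stage did not write the requested output.
 */
static void
toy_fetch_attrib(const float (*outputs)[4], unsigned src_index, float dst[4])
{
   static const float zero4[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
   const float *src =
      (src_index == TOY_SRC_INDEX_NONE) ? zero4 : outputs[src_index];

   memcpy(dst, src, 4 * sizeof(float));
}
```

In the real patch the same effect seems to be achieved by binding `zero4` as translate buffer 2, so no per-attribute branch is needed at emit time.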
Re: [Mesa-dev] [PATCH] nir/builder: fix C90 build errors
On 2015-12-20 09:39, Rob Clark wrote: From: Rob Clark We are going to start using nir_builder.h from some gallium code, which is currently only C90. Which results in: In file included from nir/nir_emulate.c:26:0: ../../../src/glsl/nir/nir_builder.h: In function ‘nir_build_alu’: ../../../src/glsl/nir/nir_builder.h:132:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] unsigned num_components = op_info->output_size; ^ In file included from nir/nir_emulate.c:26:0: ../../../src/glsl/nir/nir_builder.h: In function ‘nir_ssa_for_src’: ../../../src/glsl/nir/nir_builder.h:271:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] nir_alu_src alu = { NIR_SRC_INIT }; ^ cc1: some warnings being treated as errors Signed-off-by: Rob Clark --- Not sure if I should just go ahead and push this sort of thing. Or if we can start requiring C99 for gallium? src/glsl/nir/nir_builder.h | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h index 332bb02..6f30306 100644 --- a/src/glsl/nir/nir_builder.h +++ b/src/glsl/nir/nir_builder.h @@ -115,6 +115,8 @@ nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0, { const nir_op_info *op_info = &nir_op_infos[op]; nir_alu_instr *instr = nir_alu_instr_create(build->shader, op); + unsigned num_components; + if (!instr) return NULL; @@ -129,7 +131,7 @@ nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0, /* Guess the number of components the destination temporary should have * based on our input sizes, if it's not fixed for the op. 
*/ - unsigned num_components = op_info->output_size; + num_components = op_info->output_size; if (num_components == 0) { for (unsigned i = 0; i < op_info->num_inputs; i++) { if (op_info->input_sizes[i] == 0) @@ -265,10 +267,11 @@ nir_channel(nir_builder *b, nir_ssa_def *def, unsigned c) static inline nir_ssa_def * nir_ssa_for_src(nir_builder *build, nir_src src, int num_components) { + nir_alu_src alu = { NIR_SRC_INIT }; + if (src.is_ssa && src.ssa->num_components == num_components) return src.ssa; - nir_alu_src alu = { NIR_SRC_INIT }; alu.src = src; for (int j = 0; j < 4; j++) alu.swizzle[j] = j; Reviewed-by: Edward O'Callaghan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
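The fix pattern in this patch is the usual one for -Werror=declaration-after-statement: hoist the declaration to the top of the block and turn the former initializer into a plain assignment. A small self-contained C sketch of that shape is below. `toy_guess_num_components` and its parameters are hypothetical, and the max-over-inputs loop body is an assumption about what the truncated original does, not a copy of nir_build_alu().

```c
#include <stddef.h>

/* Guess a destination component count: use the fixed output size if there
 * is one, otherwise take the largest source among unsized inputs.
 */
static unsigned
toy_guess_num_components(unsigned output_size,
                         const unsigned *input_sizes,
                         const unsigned *src_components,
                         unsigned num_inputs)
{
   /* All declarations come first, before any statement. */
   unsigned num_components;
   unsigned i;

   if (input_sizes == NULL || src_components == NULL)
      return 0;

   /* Formerly "unsigned num_components = ..." after the early return. */
   num_components = output_size;
   if (num_components == 0) {
      for (i = 0; i < num_inputs; i++) {
         if (input_sizes[i] == 0 && src_components[i] > num_components)
            num_components = src_components[i];
      }
   }

   return num_components;
}
```

Something like `cc -Werror=declaration-after-statement -c file.c` should accept this, whereas declaring `num_components` after the early return would trigger the error quoted in the commit message.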
Re: [Mesa-dev] [PATCH 3/3] radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts
On 2015-12-15 04:06, Nicolai Hähnle wrote:

From: Nicolai Hähnle

The incorrectly computed register count caused lockups.
---
 src/gallium/drivers/radeonsi/si_perfcounter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_perfcounter.c b/src/gallium/drivers/radeonsi/si_perfcounter.c
index a0ddff6..7ee1dae 100644
--- a/src/gallium/drivers/radeonsi/si_perfcounter.c
+++ b/src/gallium/drivers/radeonsi/si_perfcounter.c
@@ -436,7 +436,7 @@ static void si_pc_emit_select(struct r600_common_context *ctx,
 	dw = count + regs->num_prelude;
 	if (count >= regs->num_multi)
-		count += regs->num_multi;
+		dw += regs->num_multi;
 	radeon_set_uconfig_reg_seq(cs, regs->select0, dw);
 	for (idx = 0; idx < regs->num_prelude; ++idx)
 		radeon_emit(cs, 0);

This series is,
Reviewed-by: Edward O'Callaghan

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
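To make the one-line fix concrete: the dword count written into the register-sequence header has to include the extra multi-block registers, and the old code added them to `count` instead of `dw`. A small C sketch of the corrected arithmetic follows; `toy_regs` and `toy_select_dwords` are hypothetical names that only mirror the fields visible in the diff.

```c
/* Hypothetical reduced view of the register layout fields used in the diff. */
struct toy_regs {
   unsigned num_prelude;
   unsigned num_multi;
};

/* Number of dwords that follow the register-sequence header. */
static unsigned
toy_select_dwords(const struct toy_regs *regs, unsigned count)
{
   unsigned dw = count + regs->num_prelude;

   /* The fix: grow dw, not count, when the multi registers are in use. */
   if (count >= regs->num_multi)
      dw += regs->num_multi;

   return dw;
}
```

With `num_prelude = 1`, `num_multi = 2` and `count = 4` this gives 7 dwords; the pre-fix code would have announced 5 in the header, which is consistent with the lockups the commit message mentions.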
Re: [Mesa-dev] [PATCH 1/3] glsl: simplify interface matching
On 2015-12-13 16:25, Timothy Arceri wrote: This makes the code easier to follow, should be more efficient and will makes it easier to add matching via explicit locations in the following patch. This patch also replaces the hash table with the newer resizable hash table this should be more suitable as the table is likely to only contain a small number of entries. --- src/glsl/link_interface_blocks.cpp | 154 +++-- 1 file changed, 46 insertions(+), 108 deletions(-) diff --git a/src/glsl/link_interface_blocks.cpp b/src/glsl/link_interface_blocks.cpp index 936e2e0..61ba078 100644 --- a/src/glsl/link_interface_blocks.cpp +++ b/src/glsl/link_interface_blocks.cpp @@ -30,100 +30,52 @@ #include "glsl_symbol_table.h" #include "linker.h" #include "main/macros.h" -#include "program/hash_table.h" +#include "util/hash_table.h" namespace { /** - * Information about a single interface block definition that we need to keep - * track of in order to check linkage rules. - * - * Note: this class is expected to be short lived, so it doesn't make copies - * of the strings it references; it simply borrows the pointers from the - * ir_variable class. - */ -struct interface_block_definition -{ - /** -* Extract an interface block definition from an ir_variable that -* represents either the interface instance (for named interfaces), or a -* member of the interface (for unnamed interfaces). -*/ - explicit interface_block_definition(ir_variable *var) - : var(var), -type(var->get_interface_type()), -instance_name(NULL) - { - if (var->is_interface_instance()) { - instance_name = var->name; - } - explicitly_declared = (var->data.how_declared != ir_var_declared_implicitly); - } - /** -* Interface block ir_variable -*/ - ir_variable *var; - - /** -* Interface block type -*/ - const glsl_type *type; - - /** -* For a named interface block, the instance name. Otherwise NULL. 
-*/ - const char *instance_name; - - /** -* True if this interface block was explicitly declared in the shader; -* false if it was an implicitly declared built-in interface block. -*/ - bool explicitly_declared; -}; - - -/** * Check if two interfaces match, according to intrastage interface matching * rules. If they do, and the first interface uses an unsized array, it will * be updated to reflect the array size declared in the second interface. */ bool -intrastage_match(interface_block_definition *a, - const interface_block_definition *b, - ir_variable_mode mode, +intrastage_match(ir_variable *a, + ir_variable *b, struct gl_shader_program *prog) { /* Types must match. */ - if (a->type != b->type) { + if (a->get_interface_type() != b->get_interface_type()) { /* Exception: if both the interface blocks are implicitly declared, * don't force their types to match. They might mismatch due to the two * shaders using different GLSL versions, and that's ok. */ - if (a->explicitly_declared || b->explicitly_declared) + if (a->data.how_declared != ir_var_declared_implicitly || + b->data.how_declared != ir_var_declared_implicitly) return false; } /* Presence/absence of interface names must match. */ - if ((a->instance_name == NULL) != (b->instance_name == NULL)) + if (a->is_interface_instance() != b->is_interface_instance()) return false; /* For uniforms, instance names need not match. For shader ins/outs, * it's not clear from the spec whether they need to match, but * Mesa's implementation relies on them matching. */ - if (a->instance_name != NULL && - mode != ir_var_uniform && mode != ir_var_shader_storage && - strcmp(a->instance_name, b->instance_name) != 0) { + if (a->is_interface_instance() && b->data.mode != ir_var_uniform && + b->data.mode != ir_var_shader_storage && + strcmp(a->name, b->name) != 0) { return false; } /* If a block is an array then it must match across the shader. * Unsized arrays are also processed and matched agaist sized arrays. 
*/ - if (b->var->type != a->var->type && - (b->instance_name != NULL || a->instance_name != NULL) && - !validate_intrastage_arrays(prog, b->var, a->var)) + if (b->type != a->type && + (b->is_interface_instance() || a->is_interface_instance()) && + !validate_intrastage_arrays(prog, b, a)) return false; return true; @@ -139,43 +91,44 @@ intrastage_match(interface_block_definition *a, * This is used for tessellation control and geometry shader consumers. */ bool -interstage_match(const interface_block_definition *producer, - const interface_block_definition *consumer, +interstage_match(ir_variable *producer, + ir_variable *consumer,
Re: [Mesa-dev] [PATCH] radeonsi: also print hexadecimal values for register fields in the IB parser
On 2015-12-10 09:54, Marek Olšák wrote:

From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_debug.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c
index cce665e..034acf5 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -61,13 +61,16 @@ static void print_spaces(FILE *f, unsigned num)
 static void print_value(FILE *file, uint32_t value, int bits)
 {
 	/* Guess if it's int or float */
-	if (value <= (1 << 15))
-		fprintf(file, "%u\n", value);
-	else {
+	if (value <= (1 << 15)) {
+		if (value <= 9)
+			fprintf(file, "%u\n", value);
+		else
+			fprintf(file, "%u (0x%0*x)\n", value, bits / 4, value);
+	} else {
 		float f = uif(value);
 		if (fabs(f) < 10 && f*10 == floor(f*10))
-			fprintf(file, "%.1ff\n", f);
+			fprintf(file, "%.1ff (0x%0*x)\n", f, bits / 4, value);
 		else
 			/* Don't print more leading zeros than there are bits. */
 			fprintf(file, "0x%0*x\n", bits / 4, value);

Reviewed-by: Edward O'Callaghan

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
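For readers who want to try the new output format outside the driver, here is a self-contained C sketch of the same heuristic that formats into a buffer instead of a FILE. `toy_uif` and `toy_format_value` are hypothetical names; the bit reinterpretation uses memcpy rather than Mesa's uif(), and the "one decimal digit" test uses an integer cast instead of fabs()/floor() so the sketch needs no math library.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float. */
static float
toy_uif(uint32_t ui)
{
   float f;
   memcpy(&f, &ui, sizeof(f));
   return f;
}

/* Format a register field: small integers as decimal (with hex once they
 * have more than one digit), "nice" floats with their hex encoding, and
 * everything else as plain hex padded to the field width.
 */
static int
toy_format_value(char *buf, size_t size, uint32_t value, int bits)
{
   if (value <= (1u << 15)) {
      if (value <= 9)
         return snprintf(buf, size, "%u", (unsigned)value);
      return snprintf(buf, size, "%u (0x%0*x)", (unsigned)value,
                      bits / 4, (unsigned)value);
   } else {
      float f = toy_uif(value);

      /* |f| < 10 and f has at most one decimal digit. */
      if (f > -10.0f && f < 10.0f && (float)(int)(f * 10) == f * 10)
         return snprintf(buf, size, "%.1ff (0x%0*x)", f,
                         bits / 4, (unsigned)value);

      /* Don't print more leading zeros than there are bits. */
      return snprintf(buf, size, "0x%0*x", bits / 4, (unsigned)value);
   }
}
```

As a usage example, formatting 0x3f800000 with `bits = 32` should produce `1.0f (0x3f800000)`, and an 8-bit field holding 16 should produce `16 (0x10)`.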
Re: [Mesa-dev] [PATCH] radeonsi: don't call u_prims_for_vertices for patches and rectangles
On 2015-12-10 22:15, Marek Olšák wrote:

On Thu, Dec 10, 2015 at 4:01 AM, Michel Dänzer wrote:

On 10.12.2015 06:58, Marek Olšák wrote:

From: Marek Olšák

Both caused a crash due to a division by zero in that function. This is an alternative fix.

Cc: 11.0 11.1
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..e550011 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -216,6 +216,18 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
 	radeon_emit(cs, tcs_out_layout | (num_tcs_output_cp << 26));
 }
+static unsigned si_num_prims_for_vertices(const struct pipe_draw_info *info)
+{
+	switch (info->mode) {
+	case PIPE_PRIM_PATCHES:
+		return info->count / info->vertices_per_patch;
+	case R600_PRIM_RECTANGLE_LIST:
+		return info->count / 3;
+	default:
+		return u_prims_for_vertices(info->mode, info->count);
+	}
+}

I don't suppose it makes sense to handle PIPE_PRIM_PATCHES in u_prims_for_vertices? Either way,

u_prims_for_vertices has an assertion that fails if mode == PATCHES. That's sufficient.

Marek

I prefer this combined solution now. Many thanks,
Reviewed-by: Edward O'Callaghan

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash
On 2015-12-10 01:47, Marek Olšák wrote:
From: Marek Olšák

I guess the crash was because of division by zero.

Cc: 11.0 11.1
---
 src/gallium/auxiliary/util/u_prim.h          | 17 +
 src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_prim.h b/src/gallium/auxiliary/util/u_prim.h
index 3668015..4926af6 100644
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim)
  * For polygons, return the number of triangles.
  */
 static inline unsigned
-u_prims_for_vertices(unsigned prim, unsigned num)
+u_prims_for_vertices(unsigned prim, unsigned num, unsigned vertices_per_patch)
 {
-   const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
+   struct u_prim_vertex_count info;

-   if (num < info->min)
+   if (prim == PIPE_PRIM_PATCHES)
+      info.min = info.incr = vertices_per_patch;
+   else if (prim < PIPE_PRIM_MAX)

We already do this check in u_prim_vertex_count(), which returns NULL
when prim is out of bounds. Perhaps it would be better to avoid this
extra else-if branch and instead make the call in the else branch and
then assert on the NULL.

+      info = *u_prim_vertex_count(prim);
+   else {
+      assert(!"invalid prim type");
+      return 0;
+   }
+
+   if (num < info.min)
       return 0;

Well, combining this with my previous patch,
http://lists.freedesktop.org/archives/mesa-dev/2015-December/102729.html
I think we should still have an assert(info.incr != 0); here.

-   return 1 + ((num - info->min) / info->incr);
+   return 1 + ((num - info.min) / info.incr);
 }

 static inline boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr )
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..4ac9d0a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
 	if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
 	    (info->indirect ||
 	     (info->instance_count > 1 &&
-	      u_prims_for_vertices(info->mode, info->count) <= 1)))
+	      u_prims_for_vertices(info->mode, info->count,
+				   info->vertices_per_patch) <= 1)))
 		sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;

 	return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
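To make the arithmetic under review concrete, here is a small self-contained sketch of the 1 + (num - min) / incr formula with the patch-size special case and the suggested assert(info.incr != 0) guard. The enum, the table and its contents, and the function name are assumptions made for illustration; they are not the u_prim.h definitions.

```c
#include <assert.h>

/* Minimum vertex count for one primitive, and the number of extra
 * vertices needed for each additional primitive. */
struct sketch_vertex_count {
    unsigned min;
    unsigned incr;
};

/* Hypothetical subset of primitive types. */
enum sketch_prim {
    SKETCH_LINES,
    SKETCH_LINE_STRIP,
    SKETCH_TRIANGLES,
    SKETCH_TRIANGLE_STRIP,
    SKETCH_PATCHES,
    SKETCH_PRIM_MAX
};

/* Hypothetical table. Patches have no fixed size, so their entry is
 * zero and must never be used as a divisor. */
static const struct sketch_vertex_count sketch_counts[SKETCH_PRIM_MAX] = {
    [SKETCH_LINES]          = { 2, 2 },
    [SKETCH_LINE_STRIP]     = { 2, 1 },
    [SKETCH_TRIANGLES]      = { 3, 3 },
    [SKETCH_TRIANGLE_STRIP] = { 3, 1 },
    [SKETCH_PATCHES]        = { 0, 0 },
};

static unsigned sketch_prims_for_vertices(enum sketch_prim prim, unsigned num,
                                          unsigned vertices_per_patch)
{
    struct sketch_vertex_count info;

    if (prim == SKETCH_PATCHES) {
        /* The patch size comes from the draw call, not from the table. */
        info.min = info.incr = vertices_per_patch;
    } else {
        assert(prim < SKETCH_PRIM_MAX);
        info = sketch_counts[prim];
    }

    /* Guard suggested in the review: catch a zero divisor in debug builds. */
    assert(info.incr != 0);

    if (num < info.min)
        return 0;

    return 1 + ((num - info.min) / info.incr);
}
```

For example, 5 vertices of a triangle strip give 1 + (5 - 3) / 1 = 3 primitives, and 12 vertices with 4 vertices per patch give 1 + (12 - 4) / 4 = 3 patches. Whether the guard belongs before or after the early return is a style choice; the sketch places it before so that a zero patch size is caught regardless of num.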
Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports
Roland,

I could not, due to the ml size limit or something; it just bounces, hence the pull request.

Cheers,
Edward.

On 2015-12-10 02:38, Roland Scheidegger wrote:
On 09.12.2015 at 05:16, Edward O'Callaghan wrote:
This fixes my initial attempt so that piglit now passes 14/14. Thanks to a couple of tips from Roland in the previous patch I was able to fix the remaining issue. This should be golden now.

Great that you got it working! Please send the patches to the ml.

Roland