Re: [Mesa-dev] [PATCH] i965: Use SET_FIELD in 3DSTATE_STREAMOUT packets.
On Mon, Apr 06, 2015 at 04:12:32PM -0700, Kenneth Graunke wrote:
> Suggested by Topi Pohjolainen.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> Cc: Topi Pohjolainen topi.pohjolai...@intel.com

Thanks Ken!

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com

> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 16
>  src/mesa/drivers/dri/i965/gen8_sol_state.c | 16
>  2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index 7e9b285..3f99df9 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -245,17 +245,17 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
>         * point by reading less and offsetting the register index in the
>         * SO_DECLs.
>         */
> -      dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);
>     }
>
>     BEGIN_BATCH(3);
> diff --git a/src/mesa/drivers/dri/i965/gen8_sol_state.c b/src/mesa/drivers/dri/i965/gen8_sol_state.c
> index d98a226..58ead68 100644
> --- a/src/mesa/drivers/dri/i965/gen8_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_sol_state.c
> @@ -125,17 +125,17 @@ gen8_upload_3dstate_streamout(struct brw_context *brw, bool active,
>         * point by reading less and offsetting the register index in the
>         * SO_DECLs.
>         */
> -      dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);
>
>        /* Set buffer pitches; 0 means unbound. */
>        if (xfb_obj->Buffers[0])
> --
> 2.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
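For readers who have not met the macro, here is a minimal sketch of what a SET_FIELD-style helper does compared with a bare shift. The thread does not show the macro's definition, so the shift-and-mask form below is an assumption, and the EXAMPLE_* field layout is hypothetical rather than the real 3DSTATE_STREAMOUT layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. The real shifts and
 * masks live in the driver's hardware definitions and are not quoted in
 * this thread. */
#define EXAMPLE_READ_OFFSET_SHIFT 5
#define EXAMPLE_READ_OFFSET_MASK  (0x1u << 5)
#define EXAMPLE_READ_LENGTH_SHIFT 0
#define EXAMPLE_READ_LENGTH_MASK  (0x1fu << 0)

/* Assumed shape of the helper: shift the value into place, then mask it so
 * an oversized value cannot spill into neighbouring fields. */
#define SET_FIELD(value, field) \
   (((uint32_t)(value) << field ## _SHIFT) & field ## _MASK)

static uint32_t
pack_example_dword(unsigned read_offset, unsigned read_length)
{
   uint32_t dw = 0;

   assert(read_length >= 1); /* the field stores length - 1 */
   dw |= SET_FIELD(read_offset, EXAMPLE_READ_OFFSET);
   dw |= SET_FIELD(read_length - 1, EXAMPLE_READ_LENGTH);
   return dw;
}
```

With a bare shift, an out-of-range offset would silently corrupt adjacent bits; with the masked form it is truncated to the field instead.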
[Mesa-dev] [Bug 89599] symbol 'x86_64_entry_start' is already defined when building with LLVM/clang
https://bugs.freedesktop.org/show_bug.cgi?id=89599

--- Comment #2 from Tomasz Paweł Gajc tpg...@gmail.com ---
This patch fixes the issue:
https://abf.io/openmandriva/mesa/blob/master/mesa-10.5.2-hide-few-symbols-to-workaround-clang.patch
Re: [Mesa-dev] [PATCH] i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.
On Mon, Apr 06, 2015 at 05:06:39PM -0700, Kenneth Graunke wrote:
> This allows those formats to work with the meta PBO upload path.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com

> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index 7261c01..7524ad9 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -582,6 +582,14 @@ brw_init_surface_formats(struct brw_context *brw)
>        case BRW_SURFACEFORMAT_L16_FLOAT:
>           render = BRW_SURFACEFORMAT_R16_FLOAT;
>           break;
> +      case BRW_SURFACEFORMAT_I8_UNORM:
> +      case BRW_SURFACEFORMAT_L8_UNORM:
> +         render = BRW_SURFACEFORMAT_R8_UNORM;
> +         break;
> +      case BRW_SURFACEFORMAT_I16_UNORM:
> +      case BRW_SURFACEFORMAT_L16_UNORM:
> +         render = BRW_SURFACEFORMAT_R16_UNORM;
> +         break;
>        case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
>           /* XRGB is handled as ARGB because the chips in this family
>            * cannot render to XRGB targets.  This means that we have to
> --
> 2.3.5
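The idea in the hunk above, rendering to a single-channel intensity or luminance format by substituting the matching red-only format, can be sketched in isolation. The enum and function names below are hypothetical stand-ins, not the driver's BRW_SURFACEFORMAT_* values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the hardware surface formats. */
enum example_format {
   EX_I8_UNORM,
   EX_L8_UNORM,
   EX_I16_UNORM,
   EX_L16_UNORM,
   EX_R8_UNORM,
   EX_R16_UNORM,
   EX_FORMAT_COUNT
};

/* Pick the format to use as a render target: I/L formats fall back to the
 * red-only format with the same channel width; others are used directly. */
static enum example_format
render_format_for(enum example_format fmt)
{
   assert(fmt < EX_FORMAT_COUNT);

   switch (fmt) {
   case EX_I8_UNORM:
   case EX_L8_UNORM:
      return EX_R8_UNORM;
   case EX_I16_UNORM:
   case EX_L16_UNORM:
      return EX_R16_UNORM;
   default:
      return fmt;
   }
}
```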
Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips
On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
> Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
> ---
>
> Pushing this through a complete piglit run, but it seems to fix
> bin/arb-provoking-vertex-render on a3xx. Please take special care to
> double-check that I didn't mess up cw/ccw order or something. I'm
> especially weak on the quadstrip case.
>
>  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py b/src/gallium/auxiliary/indices/u_indices_gen.py
> index 687a717..b17d132 100644
> --- a/src/gallium/auxiliary/indices/u_indices_gen.py
> +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
> @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, outpv ):
>          tri( intype, outtype, ptr, v2, v0, v1 )
>
>  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
> -    do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
> -    do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
> +    if inpv == LAST:
> +        do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
> +        do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
> +    else:
> +        do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
> +        do_tri( intype, outtype, ptr+'+3', v0, v3, v2, inpv, outpv );

Erm, make that v0, v1, v2; v0, v2, v3. Oops :)

>  def name(intype, outtype, inpv, outpv, pr, prim):
>      if intype == GENERATE:
> @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
>      print ' i += 4;'
>      print ' goto restart;'
>      print ' }'
> -    do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
> +    if inpv == LAST:
> +        do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
> +    else:
> +        do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, outpv );
>      print ' }'
>      postamble()
>
> --
> 2.0.5
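To make the provoking-vertex point concrete, here is a small sketch of splitting one quad into two triangles so that the quad's provoking vertex stays in the provoking slot of both. It uses the corrected first-vertex order from the follow-up above; the names are hypothetical, and it ignores the extra rotation the generator's do_tri() applies when the input and output conventions differ.

```c
#include <assert.h>

enum pv_mode { PV_FIRST, PV_LAST };

/* Emit two triangles (6 indices) covering the quad v0..v3.
 * PV_LAST:  v3 is the last vertex of both triangles.
 * PV_FIRST: v0 is the first vertex of both triangles. */
static void
split_quad(enum pv_mode pv, unsigned v0, unsigned v1, unsigned v2,
           unsigned v3, unsigned out[6])
{
   assert(out);

   if (pv == PV_LAST) {
      out[0] = v0; out[1] = v1; out[2] = v3;
      out[3] = v1; out[4] = v2; out[5] = v3;
   } else {
      out[0] = v0; out[1] = v1; out[2] = v2;
      out[3] = v0; out[4] = v2; out[5] = v3;
   }
}
```

Both orderings keep the same winding as the original quad, which is the cw/ccw property the patch author asks reviewers to double-check.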
[Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES < 3.00
From: Kalyan Kondapally kalyan.kondapa...@intel.com

Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
Earlier versions allow 'constant-index-expression' indexing, where
the index can contain a loop induction variable.

Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
This change makes the 'sampler-array-index.frag' parser test in Piglit
pass, and fishgl.com works when running Chrome on the OpenGL ES 2.0
backend.

v2: small change and some more commit message (Tapani)

Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225
---
 src/glsl/ast_array_index.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index ecef651..b2609b6 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
     * dynamically uniform expression is undefined.
     */
    if (array->type->element_type()->is_sampler()) {
-      if (!state->is_version(130, 100)) {
+      if (!state->is_version(130, 300)) {
          if (state->es_shader) {
             _mesa_glsl_warning(&loc, state,
                                "sampler arrays indexed with non-constant "
--
2.1.0
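The one-line change is easier to follow with the version gate spelled out. The sketch below assumes is_version(desktop, es) means "the shader's language version is at least the threshold for its API flavour"; that reading, the struct, and the field names are illustrative assumptions, not the compiler's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the parser state. */
struct example_state {
   bool es_shader;    /* true for GLSL ES shaders */
   unsigned version;  /* e.g. 100, 300 for ES; 120, 130 for desktop */
};

/* Assumed semantics: compare against the desktop or the ES threshold
 * depending on the shader flavour; 0 means "never" for that flavour. */
static bool
is_version(const struct example_state *s, unsigned desktop, unsigned es)
{
   unsigned required = s->es_shader ? es : desktop;

   assert(s);
   return required != 0 && s->version >= required;
}
```

Under that reading, an ES 1.00 shader satisfied the old (130, 100) check and so skipped the warning branch, while with (130, 300) it fails the check and takes the branch that only warns.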
[Mesa-dev] [PATCH 4/5] nir: Allocate dereferences out of their parent instruction or deref.
Jason pointed out that variable dereferences in NIR are really part of
their parent instruction, and should have the same lifetime.

Unlike in GLSL IR, they're not used very often - just for intrinsic
variables, call parameters & return, and indirect samplers for
texturing. Also, nir_deref_var is the top-level concept, and
nir_deref_array/nir_deref_record are child nodes.

This patch attempts to allocate nir_deref_vars out of their parent
instruction, and any sub-dereferences out of their parent deref. It
enforces these restrictions in the validator as well.

This means that freeing an instruction should free its associated
dereference chain as well. The memory sweeper pass can also happily
ignore them.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/glsl_to_nir.cpp        | 47 -
 src/glsl/nir/nir.c                  | 6 ++---
 src/glsl/nir/nir_lower_var_copies.c | 8 +++
 src/glsl/nir/nir_split_var_copies.c | 4 ++--
 src/glsl/nir/nir_validate.c         | 13 ++
 src/mesa/program/prog_to_nir.c      | 9 ---
 6 files changed, 45 insertions(+), 42 deletions(-)

This is still a lot of churn, but surprisingly about even on LOC. With
the validator code in place, I suspect we can get this right going
forward without too much trouble.

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 80c5b3a..f61a47a 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -88,6 +88,8 @@ private:
    exec_list *cf_node_list;
    nir_instr *result; /* result of the expression tree last visited */

+   nir_deref_var *make_deref(void *mem_ctx, ir_instruction *ir);
+
    /* the head of the dereference chain we're creating */
    nir_deref_var *deref_head;
    /* the tail of the dereference chain we're creating */
@@ -156,6 +158,14 @@ nir_visitor::~nir_visitor()
    _mesa_hash_table_destroy(this->overload_table, NULL);
 }

+nir_deref_var *
+nir_visitor::make_deref(void *mem_ctx, ir_instruction *ir)
+{
+   ir->accept(this);
+   ralloc_steal(mem_ctx, this->deref_head);
+   return this->deref_head;
+}
+
 static nir_constant *
 constant_copy(ir_constant *ir, void *mem_ctx)
 {
@@ -582,13 +592,11 @@ void
 nir_visitor::visit(ir_return *ir)
 {
    if (ir->value != NULL) {
-      ir->value->accept(this);
       nir_intrinsic_instr *copy =
          nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);

-      copy->variables[0] = nir_deref_var_create(this->shader,
-                                                this->impl->return_var);
-      copy->variables[1] = this->deref_head;
+      copy->variables[0] = nir_deref_var_create(copy, this->impl->return_var);
+      copy->variables[1] = make_deref(copy, ir->value);
    }

    nir_jump_instr *instr = nir_jump_instr_create(this->shader, nir_jump_return);
@@ -613,8 +621,7 @@ nir_visitor::visit(ir_call *ir)
       nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
       ir_dereference *param =
          (ir_dereference *) ir->actual_parameters.get_head();
-      param->accept(this);
-      instr->variables[0] = this->deref_head;
+      instr->variables[0] = make_deref(instr, param);
       nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);

       nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
@@ -623,8 +630,7 @@ nir_visitor::visit(ir_call *ir)
          nir_intrinsic_instr_create(shader, nir_intrinsic_store_var);
       store_instr->num_components = 1;

-      ir->return_deref->accept(this);
-      store_instr->variables[0] = this->deref_head;
+      store_instr->variables[0] = make_deref(store_instr, ir->return_deref);
       store_instr->src[0].is_ssa = true;
       store_instr->src[0].ssa = &instr->dest.ssa;

@@ -642,13 +648,11 @@ nir_visitor::visit(ir_call *ir)

    unsigned i = 0;
    foreach_in_list(ir_dereference, param, &ir->actual_parameters) {
-      param->accept(this);
-      instr->params[i] = this->deref_head;
+      instr->params[i] = make_deref(instr, param);
       i++;
    }

-   ir->return_deref->accept(this);
-   instr->return_deref = this->deref_head;
+   instr->return_deref = make_deref(instr, ir->return_deref);
    nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
 }

@@ -663,12 +667,8 @@ nir_visitor::visit(ir_assignment *ir)
       nir_intrinsic_instr *copy =
          nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);

-      ir->lhs->accept(this);
-      copy->variables[0] = this->deref_head;
-
-      ir->rhs->accept(this);
-      copy->variables[1] = this->deref_head;
-
+      copy->variables[0] = make_deref(copy, ir->lhs);
+      copy->variables[1] = make_deref(copy, ir->rhs);

       if (ir->condition) {
          nir_if *if_stmt = nir_if_create(this->shader);
@@ -700,6 +700,7 @@ nir_visitor::visit(ir_assignment *ir)
          load->num_components = ir->lhs->type->vector_elements;
          nir_ssa_dest_init(&load->instr, &load->dest, num_components, NULL);
          load->variables[0] = lhs_deref;
+         ralloc_steal(load,
[Mesa-dev] [PATCH 2/5] nir: Allocate nir_phi_src values out of the nir_phi_instr.
Phi sources are part of the phi instruction and should have the same
lifetime.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir_lower_phis_to_scalar.c | 2 +-
 src/glsl/nir/nir_lower_vars_to_ssa.c    | 2 +-
 src/glsl/nir/nir_to_ssa.c               | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/nir/nir_lower_phis_to_scalar.c b/src/glsl/nir/nir_lower_phis_to_scalar.c
index 7cd93ea..4bdb800 100644
--- a/src/glsl/nir/nir_lower_phis_to_scalar.c
+++ b/src/glsl/nir/nir_lower_phis_to_scalar.c
@@ -223,7 +223,7 @@ lower_phis_to_scalar_block(nir_block *block, void *void_state)
             else
                nir_instr_insert_after_block(src->pred, &mov->instr);

-            nir_phi_src *new_src = ralloc(state->mem_ctx, nir_phi_src);
+            nir_phi_src *new_src = ralloc(new_phi, nir_phi_src);
             new_src->pred = src->pred;
             new_src->src = nir_src_for_ssa(&mov->dest.dest.ssa);

diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c b/src/glsl/nir/nir_lower_vars_to_ssa.c
index 86e6ab4..2ca74d7 100644
--- a/src/glsl/nir/nir_lower_vars_to_ssa.c
+++ b/src/glsl/nir/nir_lower_vars_to_ssa.c
@@ -642,7 +642,7 @@ add_phi_sources(nir_block *block, nir_block *pred,

       struct deref_node *node = entry->data;

-      nir_phi_src *src = ralloc(state->mem_ctx, nir_phi_src);
+      nir_phi_src *src = ralloc(phi, nir_phi_src);
       src->pred = pred;
       src->src.is_ssa = true;
       src->src.ssa = get_ssa_def_for_block(node, pred, state);
diff --git a/src/glsl/nir/nir_to_ssa.c b/src/glsl/nir/nir_to_ssa.c
index 47cf453..53ff547 100644
--- a/src/glsl/nir/nir_to_ssa.c
+++ b/src/glsl/nir/nir_to_ssa.c
@@ -47,7 +47,7 @@ insert_trivial_phi(nir_register *reg, nir_block *block, void *mem_ctx)
    set_foreach(block->predecessors, entry) {
       nir_block *pred = (nir_block *) entry->key;

-      nir_phi_src *src = ralloc(mem_ctx, nir_phi_src);
+      nir_phi_src *src = ralloc(instr, nir_phi_src);
       src->pred = pred;
       src->src.is_ssa = false;
       src->src.reg.base_offset = 0;
--
2.3.5
[Mesa-dev] [PATCH 3/5] nir: Allocate nir_ssa_def::uses/if_uses out of the instruction.
We can't allocate them out of the nir_ssa_def itself, because it may not
be ralloc'd (for example, nir_dest embeds a nir_ssa_def). However,
allocating them out of the instruction should work.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 0f807dd..85ff0f4 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1834,13 +1834,11 @@ void
 nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
                  unsigned num_components, const char *name)
 {
-   void *mem_ctx = ralloc_parent(instr);
-
    def->name = name;
    def->parent_instr = instr;
-   def->uses = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
+   def->uses = _mesa_set_create(instr, _mesa_hash_pointer,
                                 _mesa_key_pointer_equal);
-   def->if_uses = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
+   def->if_uses = _mesa_set_create(instr, _mesa_hash_pointer,
                                    _mesa_key_pointer_equal);
    def->num_components = num_components;

--
2.3.5
[Mesa-dev] [PATCH 1/5] nir: Allocate nir_call_instr::params out of the nir_call itself.
The lifetime of the params array needs to match the nir_call_instr
itself. So, allocate it using the instruction itself as the context.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This is the 'nir-memory-v2' branch in my tree.

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 5f86eca..0f807dd 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -445,7 +445,7 @@ nir_call_instr_create(void *mem_ctx, nir_function_overload *callee)
    instr->callee = callee;

    instr->num_params = callee->num_params;
-   instr->params = ralloc_array(mem_ctx, nir_deref_var *, instr->num_params);
+   instr->params = ralloc_array(instr, nir_deref_var *, instr->num_params);
    instr->return_deref = NULL;

    return instr;
--
2.3.5
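The lifetime argument behind this series can be illustrated with a toy hierarchical allocator. This is only a sketch of the parent/child ownership idea; the toy_* functions and the node struct are hypothetical and are not the ralloc API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy allocation header: every allocation remembers its parent and its
 * children, so freeing a parent can free everything allocated out of it. */
struct node {
   struct node *parent;
   struct node *first_child;
   struct node *next_sibling;
};

static int live_nodes = 0;

static struct node *
toy_alloc(struct node *parent)
{
   struct node *n = calloc(1, sizeof(*n));

   assert(n);
   n->parent = parent;
   if (parent) {
      n->next_sibling = parent->first_child;
      parent->first_child = n;
   }
   live_nodes++;
   return n;
}

/* Detach a node from its parent's child list, if it has a parent. */
static void
toy_unlink(struct node *n)
{
   if (!n->parent)
      return;

   struct node **p = &n->parent->first_child;
   while (*p != n)
      p = &(*p)->next_sibling;
   *p = n->next_sibling;
   n->parent = NULL;
   n->next_sibling = NULL;
}

/* Free a node and, recursively, everything allocated out of it. */
static void
toy_free(struct node *n)
{
   toy_unlink(n);
   while (n->first_child)
      toy_free(n->first_child); /* each call unlinks the child first */
   live_nodes--;
   free(n);
}
```

If the params array is a child of the call instruction rather than of the shader, freeing the instruction releases the array too, which is the behaviour the commit message asks for.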
[Mesa-dev] [PATCH 5/5] nir: Implement a nir_sweep() pass.
This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.

The expectation is that this will be called when finished building and
optimizing the shader. However, it's also fine to call it earlier, and
many times, to free up memory earlier.

v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by
  the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
  the defining instruction, and registers are handled as part of
  function implementations).
- Just steal instructions; don't walk them (no longer required).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/Makefile.sources | 1 +
 src/glsl/nir/nir.h        | 2 +
 src/glsl/nir/nir_sweep.c  | 151 ++
 3 files changed, 154 insertions(+)
 create mode 100644 src/glsl/nir/nir_sweep.c

This version is much simpler (= faster), thanks to the earlier changes.

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 9bdcb80..c471eca 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -59,6 +59,7 @@ NIR_FILES = \
 	nir/nir_search.c \
 	nir/nir_search.h \
 	nir/nir_split_var_copies.c \
+	nir/nir_sweep.c \
 	nir/nir_to_ssa.c \
 	nir/nir_types.h \
 	nir/nir_validate.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e6b7684..0f72301 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1650,6 +1650,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);

 bool nir_opt_remove_phis(nir_shader *shader);

+void nir_sweep(nir_shader *shader);
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/src/glsl/nir/nir_sweep.c b/src/glsl/nir/nir_sweep.c
new file mode 100644
index 000..b33d624
--- /dev/null
+++ b/src/glsl/nir/nir_sweep.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+/**
+ * \file nir_sweep.c
+ *
+ * The nir_sweep() pass performs a mark and sweep pass over a nir_shader's associated
+ * memory - anything still connected to the program will be kept, and any dead memory
+ * we dropped on the floor will be freed.
+ *
+ * The expectation is that drivers should call this when finished compiling the shader
+ * (after any optimization, lowering, and so on).  However, it's also fine to call it
+ * earlier, and even many times, trading CPU cycles for memory savings.
+ */
+
+#define steal_list(mem_ctx, type, list) \
+   foreach_list_typed(type, obj, node, list) { ralloc_steal(mem_ctx, obj); }
+
+static void sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node);
+
+static void
+sweep_block(nir_shader *nir, nir_block *block)
+{
+   ralloc_steal(nir, block);
+
+   nir_foreach_instr(block, instr) {
+      ralloc_steal(nir, instr);
+   }
+}
+
+static void
+sweep_if(nir_shader *nir, nir_if *iff)
+{
+   ralloc_steal(nir, iff);
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &iff->then_list) {
+      sweep_cf_node(nir, cf_node);
+   }
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &iff->else_list) {
+      sweep_cf_node(nir, cf_node);
+   }
+}
+
+static void
+sweep_loop(nir_shader *nir, nir_loop *loop)
+{
+   ralloc_steal(nir, loop);
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &loop->body) {
+      sweep_cf_node(nir, cf_node);
+   }
+}
+
+static void
+sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node)
+{
+   switch (cf_node->type) {
+   case
Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES < 3.00
Tapani Pälli tapani.pa...@intel.com writes:

> From: Kalyan Kondapally kalyan.kondapa...@intel.com
>
> Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
> Earlier versions allow 'constant-index-expression' indexing, where
> index can contain a loop induction variable.
>
> Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
> This change makes 'sampler-array-index.frag' parser test in Piglit
> pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.
>
> v2: small change and some more commit message (Tapani)
>
> Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225

Looks good, but did you check what happens now if the shader uses
actual variable indexing (i.e. which lowering cannot turn into a
constant) on an implementation that doesn't support it? Hopefully no
crashes or hangs?

> ---
>  src/glsl/ast_array_index.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
> index ecef651..b2609b6 100644
> --- a/src/glsl/ast_array_index.cpp
> +++ b/src/glsl/ast_array_index.cpp
> @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>      * dynamically uniform expression is undefined.
>      */
>     if (array->type->element_type()->is_sampler()) {
> -      if (!state->is_version(130, 100)) {
> +      if (!state->is_version(130, 300)) {
>           if (state->es_shader) {
>              _mesa_glsl_warning(&loc, state,
>                                 "sampler arrays indexed with non-constant "
> --
> 2.1.0
Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES < 3.00
On 04/07/2015 01:22 PM, Francisco Jerez wrote:
> Tapani Pälli tapani.pa...@intel.com writes:
>
>> From: Kalyan Kondapally kalyan.kondapa...@intel.com
>>
>> Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
>> Earlier versions allow 'constant-index-expression' indexing, where
>> index can contain a loop induction variable.
>>
>> Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
>> This change makes 'sampler-array-index.frag' parser test in Piglit
>> pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.
>>
>> v2: small change and some more commit message (Tapani)
>>
>> Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225
>
> Looks good, but did you check what happens now if the shader uses
> actual variable indexing (i.e. which lowering cannot turn into a
> constant) on an implementation that doesn't support it? Hopefully no
> crashes or hangs?

I could test something like this; can you suggest a good victim
platform and some ugly corner case? As a starter, I have a shader_test
that uses an expression with a uniform in it as the index.

As a plan B, I think loop analysis could store some information which
can then be used for additional validation of the array index in a
later step (skip it in the AST and check only later for ES 1.00).

>> ---
>>  src/glsl/ast_array_index.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
>> index ecef651..b2609b6 100644
>> --- a/src/glsl/ast_array_index.cpp
>> +++ b/src/glsl/ast_array_index.cpp
>> @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>>      * dynamically uniform expression is undefined.
>>      */
>>     if (array->type->element_type()->is_sampler()) {
>> -      if (!state->is_version(130, 100)) {
>> +      if (!state->is_version(130, 300)) {
>>           if (state->es_shader) {
>>              _mesa_glsl_warning(&loc, state,
>>                                 "sampler arrays indexed with non-constant "
>> --
>> 2.1.0
Re: [Mesa-dev] [PATCH 4/5] glsl: Consider active all elements of a shared/std140 block array
Besides fixing the mentioned dEQP crashes, this patch also generally
fixes instance arrays with UBOs.

The problem we have now is that each element in the UBO instance array
is a separate UBO mapped to a specific binding point (and thus, a
separate buffer), but we kill the instances that are not being
referenced in the shader code, so if we have something like this:

   layout(std140, binding=2) uniform Fragments {
      vec4 v0;
      vec4 v1;
   } inst[3];

And then the shader code only references inst[1], for example:

   vec4 tfOutput0 = inst[1].v0;

That UBO read for inst[1].v0 can fail because we are killing the UBOs
for inst[0] and inst[2], and we shouldn't.

I hit this while developing SSBO, which is the same thing, and this
patch fixes the problem.

Iago

On Wed, 2015-03-11 at 10:01 +0100, Eduardo Lima Mitev wrote:
> From: Antia Puentes apuen...@igalia.com
>
> Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks
> or block members) considers active 'shared' and 'std140' uniform
> blocks and uniform block arrays but did not include the block array
> elements. It was possible to have an active uniform block array
> without any elements marked as used, making the assertion
> ((b->num_array_elements > 0) == b->type->is_array())
> in link_uniform_blocks fail.
>
> Fixes the following 5 dEQP tests:
>
>  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
>  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
>  * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
>  * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
>  * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
> ---
>  src/glsl/link_uniform_block_active_visitor.cpp | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/glsl/link_uniform_block_active_visitor.cpp b/src/glsl/link_uniform_block_active_visitor.cpp
> index 292cde3..8379750 100644
> --- a/src/glsl/link_uniform_block_active_visitor.cpp
> +++ b/src/glsl/link_uniform_block_active_visitor.cpp
> @@ -105,6 +105,22 @@ link_uniform_block_active_visitor::visit(ir_variable *var)
>     assert(b->num_array_elements == 0);
>     assert(b->array_elements == NULL);
>     assert(b->type != NULL);
> +   assert(!b->type->is_array() || b->has_instance_name);
> +
> +   /* For uniform block arrays declared with a shared or std140 layout
> +    * qualifier, mark all its instances as used.
> +    */
> +   if (b->type->is_array() && b->type->length > 0) {
> +      b->num_array_elements = b->type->length;
> +      b->array_elements = reralloc(this->mem_ctx,
> +                                   b->array_elements,
> +                                   unsigned,
> +                                   b->num_array_elements);
> +
> +      for (unsigned i = 0; i < b->num_array_elements; i++) {
> +         b->array_elements[i] = i;
> +      }
> +   }
>
>     return visit_continue;
>  }
> @@ -146,6 +162,13 @@ link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
>     assert((b->num_array_elements == 0) == (b->array_elements == NULL));
>     assert(b->type != NULL);
>
> +   /* If the block array was declared with a shared or std140 layout qualifier,
> +    * all its instances have been already marked as used (see
> +    * link_uniform_block_active_visitor::visit(ir_variable *) function).
> +    */
> +   if (var->type->interface_packing == GLSL_INTERFACE_PACKING_PACKED)
> +      return visit_continue;
> +
>     ir_constant *c = ir->array_index->as_constant();
>     if (c) {
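The core of the first hunk, marking every instance of a block array as used instead of only the referenced ones, can be sketched on its own. The struct and function below are hypothetical simplifications (plain malloc instead of the linker's allocator), meant only to show the bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced version of the linker's per-block record. */
struct example_active_block {
   unsigned *array_elements;     /* indices of instances considered used */
   unsigned num_array_elements;
};

/* Mark all `length` instances of a block array as used, e.g. inst[0..2]
 * for `uniform Fragments { ... } inst[3];`. */
static void
mark_all_elements_used(struct example_active_block *b, unsigned length)
{
   assert(b->num_array_elements == 0 && b->array_elements == NULL);

   if (length == 0)
      return;

   b->array_elements = malloc(length * sizeof(unsigned));
   assert(b->array_elements);
   b->num_array_elements = length;

   for (unsigned i = 0; i < length; i++)
      b->array_elements[i] = i;
}
```

After this runs for a three-element array, inst[0] and inst[2] stay alive even if the shader only reads inst[1], which is the situation described in the reply above.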
Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.
Sorry for the delay. I was away over Easter.

On 02/04/15 19:02, Matt Turner wrote:
On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:

These were being defined in SCons, but it's not practical -- we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines everywhere.

To confirm, you're building external sources with gcc? I don't think these macros are useful for MSVC.

Correct.

No actual change in behavior for autoconf.
---
 configure.ac         |  2 +-
 include/c99_compat.h | 45 +
 scons/gallium.py     | 27 ---
 src/util/macros.h    |  2 ++
 4 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/configure.ac b/configure.ac
index 520cc22..1485bba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
 _SAVE_CPPFLAGS=$CPPFLAGS
 
 dnl Compiler macros
-DEFINES=
+DEFINES=-DHAVE_AUTOCONF
 AC_SUBST([DEFINES])
 case $host_os in
 linux*|*-gnu*|gnu*)
diff --git a/include/c99_compat.h b/include/c99_compat.h
index 4fc91bc..62ccd46 100644
--- a/include/c99_compat.h
+++ b/include/c99_compat.h

c99_compat.h doesn't seem like the right location. I know it seems like a nice place to add this since it's included everywhere, but I worry that in a few years we're going to be cleaning it up like we've been doing with compiler.h and friends. I might make a separate header to define these? Not sure.

I can move the defines out of c99_compat.h, e.g., to mesa/include/fallbackconfig.h. But I'd prefer to include fallbackconfig.h from c99_compat.h, as c99_compat.h is pretty much guaranteed to be included all the time.

Since probably all cases of #ifdef HAVE___* have a fallback, that runs the risk of never noticing that you weren't including the right header.

Precisely, this is all the more reason why it must be included from a header that's included all the time. If it depends on people adding the include case by case, it is bound to fail, as nobody else but us cares, and it will easily go unnoticed.

@@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
 #endif
 
+
+/* Fallback definitions, for when these headers are used by build systems which
+ * don't auto-detect these things. */
+#ifndef HAVE_AUTOCONF

I'd rather flip this condition around and not modify configure.ac. But maybe you can't do that because you're not actually building everything with scons?

No biggie either way.

I don't know. This seems nuts. I really don't like adding stuff to the autotools build system like this.

Sure.

I really don't know how to deal with this. What I'm hearing is that even the custom scons build system you guys use isn't sufficient for your own needs. You're not building the external source trees with the same build system...?

I think you might be getting the wrong idea. We don't build the .c files from external source trees. But we do need to include .h files, so we can interface with components in the Mesa tree. That is, I only need the .h files to make sense on their own (with Mesa components, namely mesa/src/gallium/include, and the gallium auxiliary libraries). But we have so many inline functions, so many #ifdef HAVE_foo, that unless all the defines match precisely, all hell breaks loose.

Gallium has from the start been integrated (i.e. embedded) in a myriad of places. It was always meant as a framework to write any sort of 3D driver, not just OpenGL drivers. Things were much worse when Gallium was used in Windows XP kernel land or on Windows CE. I'm glad that neither I nor anybody else has to deal with the quirkiness of keeping code portable across these platforms any more. Things are still much more uniform nowadays.

I mean, in all the build system work I've done I've tried to make sure scons continues working -- doing things like adding these HAVE_* definitions to it and such. It's kind of frustrating, and it's even more frustrating when even that isn't sufficient.

All I'm doing here is basically moving your defines out of scons's python files into C headers. Conceptually it's doing pretty much the same thing as before, but being in a header means that it's there for all build systems to use. Remember that Mesa itself is not just autoconf and SCons, there's also the Android build system.

I don't like it any more than you do, but this is the world we live in: the fact is that many platforms constrain how software must be built, to the point that building it with anything else is impracticable/impossible. Even if a build system that meets everybody's needs existed, we'd still face the legacy of existing software using other build systems.

To be honest, IMHO, Mesa source tree and build systems are a failure if they can't even sustain external interfaces. For many drivers, the external
Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES < 3.00
Tapani Pälli tapani.pa...@intel.com writes:
On 04/07/2015 01:22 PM, Francisco Jerez wrote:
Tapani Pälli tapani.pa...@intel.com writes:

From: Kalyan Kondapally kalyan.kondapa...@intel.com

Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00. Earlier versions allow 'constant-index-expression' indexing, where the index can contain a loop induction variable.

Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00. This change makes the 'sampler-array-index.frag' parser test in Piglit pass, and fishgl.com works when running Chrome on the OpenGL ES 2.0 backend.

v2: small change and some more commit message (Tapani)

Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225

Looks good, but did you check what happens now if the shader uses actual variable indexing (i.e. which lowering cannot turn into a constant) on an implementation that doesn't support it? Hopefully no crashes or hangs?

I could test something like this, can you suggest a good victim platform and some ugly corner case? I have a shader_test that has an expression with a uniform in it as the index, as a starter.

I guess SNB would be bad enough. The hardware actually supports dynamically uniform indexing of surfaces, but we don't implement it, and it would likely violate some assumptions in the back-end if it gets that far.

As a plan B, I think loop analysis could store some information which can then be used for additional validation of the array index in a later step (skip it in AST and check only later for ES 1.00).

Yeah, well. It seems rather annoying to get right at the GLSL IR level too, because you'd have to traverse variable defs and built-in function calls, except maybe after optimization (after loop unrolling and constant folding at least). At that point a valid ESSL 1.0 program should only have sampler arrays indexed by constants, which will probably make your job easier.
---
 src/glsl/ast_array_index.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index ecef651..b2609b6 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
     * dynamically uniform expression is undefined.
     */
    if (array->type->element_type()->is_sampler()) {
-      if (!state->is_version(130, 100)) {
+      if (!state->is_version(130, 300)) {
          if (state->es_shader) {
             _mesa_glsl_warning(&loc, state,
                                "sampler arrays indexed with non-constant "
--
2.1.0
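To make the effect of the one-line change easier to see, here is a minimal Python sketch of the version check. It assumes that is_version(glsl, es) compares the shader's language version against the first threshold for desktop GLSL and against the second for ES shaders, with 0 meaning "never"; that reading is an assumption of this sketch, not something spelled out in the thread. The helper name takes_warning_branch is hypothetical.

```python
def is_version(language_version, es_shader, required_glsl, required_glsl_es):
    """Model of the version test: pick the threshold matching the shader's API.

    Assumption of this sketch: a threshold of 0 means "no version qualifies".
    """
    required = required_glsl_es if es_shader else required_glsl
    return required != 0 and language_version >= required


def takes_warning_branch(language_version, es_shader, es_threshold):
    """True when the `if (!state->is_version(130, es_threshold))` body runs."""
    return not is_version(language_version, es_shader, 130, es_threshold)


# (language_version, es_shader) pairs: ESSL 1.00, ESSL 3.00, GLSL 1.20, GLSL 1.30
for version, es in [(100, True), (300, True), (120, False), (130, False)]:
    before = takes_warning_branch(version, es, 100)
    after = takes_warning_branch(version, es, 300)
    print(version, "ES" if es else "desktop", "before:", before, "after:", after)
```

Under these assumptions only ES shaders below 3.00 change behaviour: they now reach the branch that emits a warning, while desktop GLSL is unaffected.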
Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips
Weird, this seems to regress

bin/arb_shader_texture_lod-texgrad
bin/arb_shader_texture_lod-texgradcube

Visually they look the same, but piglit finds small differences.

On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

Pushing this through a complete piglit run, but it seems to fix bin/arb-provoking-vertex-render on a3xx. Please take special care to double-check that I didn't mess up cw/ccw order or something. I'm especially weak on the quadstrip case.

 src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py b/src/gallium/auxiliary/indices/u_indices_gen.py
index 687a717..b17d132 100644
--- a/src/gallium/auxiliary/indices/u_indices_gen.py
+++ b/src/gallium/auxiliary/indices/u_indices_gen.py
@@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, outpv ):
             tri( intype, outtype, ptr, v2, v0, v1 )
 
 def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
-    do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
-    do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
+    if inpv == LAST:
+        do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
+        do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
+    else:
+        do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
+        do_tri( intype, outtype, ptr+'+3', v0, v3, v2, inpv, outpv );

Erm, make that v0, v1, v2; v0, v2, v3. Oops :)

 def name(intype, outtype, inpv, outpv, pr, prim):
     if intype == GENERATE:
@@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
         print ' i += 4;'
         print ' goto restart;'
         print ' }'
-    do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    if inpv == LAST:
+        do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    else:
+        do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, outpv );
     print ' }'
     postamble()
--
2.0.5
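As a quick way to check the cw/ccw concern, the quad split can be modelled on its own. This is only a sketch: split_quad, provoking_vertex and signed_area are hypothetical helpers, not functions from u_indices_gen.py, the FIRST case uses the corrected order from the "Erm, make that v0, v1, v2; v0, v2, v3" follow-up, and the vertex rotation that do_tri performs when input and output conventions differ is not modelled.

```python
FIRST, LAST = "first", "last"


def split_quad(v0, v1, v2, v3, provoking):
    """Split a quad into two triangles that share the quad's provoking vertex."""
    if provoking == LAST:
        # Both triangles end with v3, the quad's last vertex.
        return [(v0, v1, v3), (v1, v2, v3)]
    # Both triangles start with v0, the quad's first vertex.
    return [(v0, v1, v2), (v0, v2, v3)]


def provoking_vertex(tri, provoking):
    """Return the vertex that supplies flat-shaded attributes for a triangle."""
    return tri[-1] if provoking == LAST else tri[0]


def signed_area(tri, positions):
    """Twice the signed area of a triangle; positive means counter-clockwise."""
    (ax, ay), (bx, by), (cx, cy) = (positions[v] for v in tri)
    return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)


# A unit square listed counter-clockwise, indexed 0..3.
square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
for convention in (FIRST, LAST):
    for tri in split_quad(0, 1, 2, 3, convention):
        print(convention, tri, provoking_vertex(tri, convention),
              signed_area(tri, square) > 0)
```

With either convention both triangles keep the quad's winding and agree on the provoking vertex; what differs is which diagonal the quad is split along.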
Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips
It will look different with llvmpipe if you use the right debug variables (GALLIVM_DEBUG=no_brilinear,no_quad_lod,no_rho_approx), though it will still fail.

I think the test may not be really valid. This is because if you use texgrad, the driver/hw probably will (or should) use per-pixel lod. But if you don't, it is of course per-quad. For the smallest mip it will only give the same results here if you pick the right (top/bottom, left/right) values for doing the lod calculations with implicit lod (that is, the ones from the actually active pixel in the quad, so if the active pixel was top/left you must calculate ddx with the top values, and ddy with the left values). And I don't think that is a requirement anywhere. At least that's what I remember...

And the cube test is probably not quite right either (though llvmpipe passes it with those mentioned variables, that is more due to the implementation of cube mapping - the cube face selection must be done per pixel and not per quad, and there's tons of code to get lods right, be it implicit or explicit).

Don't know though why this would regress the test, as it shouldn't affect it at all with last provoking vertex.

On 07.04.2015 at 16:28, Ilia Mirkin wrote:
Oh fun, those tests also fail with nvc0 and llvmpipe. But pass on softpipe. (The llvmpipe fail is visually different from the nvc0 and freedreno/a3xx one though.)
Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips
Oh fun, those tests also fail with nvc0 and llvmpipe. But they pass on softpipe. (The llvmpipe failure is visually different from the nvc0 and freedreno/a3xx one, though.)

On Tue, Apr 7, 2015 at 10:25 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
Weird, this seems to regress

bin/arb_shader_texture_lod-texgrad
bin/arb_shader_texture_lod-texgradcube

Visually they look the same, but piglit finds small differences.
Re: [Mesa-dev] [PATCH 4/5] nir: Allocate dereferences out of their parent instruction or deref.
Other than my nitpicking below this looks great! Thanks for working on this!

On Apr 7, 2015 2:32 AM, Kenneth Graunke kenn...@whitecape.org wrote:

Jason pointed out that variable dereferences in NIR are really part of their parent instruction, and should have the same lifetime.

Unlike in GLSL IR, they're not used very often - just for intrinsic variables, call parameters & return, and indirect samplers for texturing. Also, nir_deref_var is the top-level concept, and nir_deref_array/nir_deref_record are child nodes.

This patch attempts to allocate nir_deref_vars out of their parent instruction, and any sub-dereferences out of their parent deref. It enforces these restrictions in the validator as well.

This means that freeing an instruction should free its associated dereference chain as well. The memory sweeper pass can also happily ignore them.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/glsl_to_nir.cpp        | 47 -
 src/glsl/nir/nir.c                  |  6 ++---
 src/glsl/nir/nir_lower_var_copies.c |  8 +++
 src/glsl/nir/nir_split_var_copies.c |  4 ++--
 src/glsl/nir/nir_validate.c         | 13 ++
 src/mesa/program/prog_to_nir.c      |  9 ---
 6 files changed, 45 insertions(+), 42 deletions(-)

This is still a lot of churn, but surprisingly about even on LOC. With the validator code in place, I suspect we can get this right going forward without too much trouble.

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 80c5b3a..f61a47a 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -88,6 +88,8 @@ private:
    exec_list *cf_node_list;
    nir_instr *result; /* result of the expression tree last visited */
 
+   nir_deref_var *make_deref(void *mem_ctx, ir_instruction *ir);
+
    /* the head of the dereference chain we're creating */
    nir_deref_var *deref_head;
    /* the tail of the dereference chain we're creating */
@@ -156,6 +158,14 @@ nir_visitor::~nir_visitor()
    _mesa_hash_table_destroy(this->overload_table, NULL);
 }
 
+nir_deref_var *
+nir_visitor::make_deref(void *mem_ctx, ir_instruction *ir)

I'm not a huge fan of the name. Maybe evaluate_deref to match evaluate_rvalue or perhaps build_deref? In any case, it doesn't really matter so I won't quibble. It should, however, take a nir_instr instead of a void as its memory context. That makes it a bit more explicit.

+{
+   ir->accept(this);
+   ralloc_steal(mem_ctx, this->deref_head);
+   return this->deref_head;
+}
+
 static nir_constant *
 constant_copy(ir_constant *ir, void *mem_ctx)
 {
@@ -582,13 +592,11 @@ void
 nir_visitor::visit(ir_return *ir)
 {
    if (ir->value != NULL) {
-      ir->value->accept(this);
       nir_intrinsic_instr *copy =
          nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);
 
-      copy->variables[0] = nir_deref_var_create(this->shader,
-                                                this->impl->return_var);
-      copy->variables[1] = this->deref_head;
+      copy->variables[0] = nir_deref_var_create(copy, this->impl->return_var);
+      copy->variables[1] = make_deref(copy, ir->value);
    }
 
    nir_jump_instr *instr = nir_jump_instr_create(this->shader, nir_jump_return);
@@ -613,8 +621,7 @@ nir_visitor::visit(ir_call *ir)
       nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
       ir_dereference *param =
          (ir_dereference *) ir->actual_parameters.get_head();
-      param->accept(this);
-      instr->variables[0] = this->deref_head;
+      instr->variables[0] = make_deref(instr, param);
       nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
 
       nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
@@ -623,8 +630,7 @@ nir_visitor::visit(ir_call *ir)
          nir_intrinsic_instr_create(shader, nir_intrinsic_store_var);
       store_instr->num_components = 1;
 
-      ir->return_deref->accept(this);
-      store_instr->variables[0] = this->deref_head;
+      store_instr->variables[0] = make_deref(store_instr, ir->return_deref);
       store_instr->src[0].is_ssa = true;
       store_instr->src[0].ssa = &instr->dest.ssa;
 
@@ -642,13 +648,11 @@ nir_visitor::visit(ir_call *ir)
 
    unsigned i = 0;
    foreach_in_list(ir_dereference, param, &ir->actual_parameters) {
-      param->accept(this);
-      instr->params[i] = this->deref_head;
+      instr->params[i] = make_deref(instr, param);
       i++;
    }
 
-   ir->return_deref->accept(this);
-   instr->return_deref = this->deref_head;
+   instr->return_deref = make_deref(instr, ir->return_deref);
    nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
 }
 
@@ -663,12 +667,8 @@ nir_visitor::visit(ir_assignment *ir)
       nir_intrinsic_instr *copy =
          nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);
 
-      ir->lhs->accept(this);
-      copy->variables[0] = this->deref_head;
-
-
Re: [Mesa-dev] [PATCH 5/5] nir: Implement a nir_sweep() pass.
On Apr 7, 2015 2:32 AM, Kenneth Graunke kenn...@whitecape.org wrote:

This pass performs a mark and sweep pass over a nir_shader's associated memory - anything still connected to the program will be kept, and any dead memory we dropped on the floor will be freed.

The expectation is that this will be called when finished building and optimizing the shader. However, it's also fine to call it earlier, and many times, to free up memory earlier.

v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction)
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself)
- Don't sweep dereference chains (top-level dereferences are owned by the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of the defining instruction, and registers are handled as part of function implementations).
- Just steal instructions; don't walk them (no longer required).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/Makefile.sources |   1 +
 src/glsl/nir/nir.h        |   2 +
 src/glsl/nir/nir_sweep.c  | 151 ++
 3 files changed, 154 insertions(+)
 create mode 100644 src/glsl/nir/nir_sweep.c

This version is much simpler (= faster), thanks to the earlier changes.

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 9bdcb80..c471eca 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -59,6 +59,7 @@ NIR_FILES = \
 	nir/nir_search.c \
 	nir/nir_search.h \
 	nir/nir_split_var_copies.c \
+	nir/nir_sweep.c \
 	nir/nir_to_ssa.c \
 	nir/nir_types.h \
 	nir/nir_validate.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e6b7684..0f72301 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1650,6 +1650,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);
 
 bool nir_opt_remove_phis(nir_shader *shader);
 
+void nir_sweep(nir_shader *shader);
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/src/glsl/nir/nir_sweep.c b/src/glsl/nir/nir_sweep.c
new file mode 100644
index 0000000..b33d624
--- /dev/null
+++ b/src/glsl/nir/nir_sweep.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+/**
+ * \file nir_sweep.c
+ *
+ * The nir_sweep() pass performs a mark and sweep pass over a nir_shader's associated
+ * memory - anything still connected to the program will be kept, and any dead memory
+ * we dropped on the floor will be freed.
+ *
+ * The expectation is that drivers should call this when finished compiling the shader
+ * (after any optimization, lowering, and so on).  However, it's also fine to call it
+ * earlier, and even many times, trading CPU cycles for memory savings.
+ */
+
+#define steal_list(mem_ctx, type, list) \
+   foreach_list_typed(type, obj, node, list) { ralloc_steal(mem_ctx, obj); }
+
+static void sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node);
+
+static void
+sweep_block(nir_shader *nir, nir_block *block)
+{
+   ralloc_steal(nir, block);
+
+   nir_foreach_instr(block, instr) {
+      ralloc_steal(nir, instr);

We still need to walk the non-ssa sources and steal any indirect register uses. Either that or ensure that they're allocated out of the instruction.

+   }
+}
+
+static void
+sweep_if(nir_shader *nir, nir_if *iff)
+{
+   ralloc_steal(nir, iff);

If has a source that may have an indirect too.

With comments addressed, series is

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

+
+   foreach_list_typed(nir_cf_node, cf_node, node, &iff->then_list) {
+
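The ownership idea behind the pass can be illustrated with a toy model of a hierarchical allocator. This is a sketch under stated assumptions, not the code from nir_sweep.c, whose top-level function is cut off above: Node, steal, free and sweep are hypothetical Python stand-ins, and the "park everything under a rubbish context, steal the live objects back, free the rest" structure is one plausible way to build a sweep on top of ralloc_steal-style reparenting.

```python
class Node:
    """A toy allocation with a parent context, loosely modelled on ralloc."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False
        if parent is not None:
            steal(parent, self)


def steal(new_parent, node):
    """Reparent node under new_parent; its own children move along with it."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)


def free(node):
    """Free node and everything still parented to it."""
    for child in node.children:
        free(child)
    node.children = []
    node.freed = True


def sweep(shader, live_nodes):
    """Keep only live_nodes (and their subtrees) under shader; free the rest."""
    rubbish = Node("rubbish")
    for child in list(shader.children):
        steal(rubbish, child)       # assume everything is dead...
    for node in live_nodes:
        steal(shader, node)         # ...then rescue what is still reachable
    free(rubbish)                   # whatever was not rescued goes away


shader = Node("shader")
block = Node("block", shader)
instr = Node("instr", shader)
deref = Node("deref", instr)        # owned by its instruction, not the shader
dead = Node("dead_instr", shader)
sweep(shader, [block, instr])
print([child.name for child in shader.children], dead.freed, deref.freed)
```

The deref is never visited by sweep: because it is parented to its instruction, rescuing the instruction rescues it too, which is the reason the v2 notes can skip dereference chains and SSA defs.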
[Mesa-dev] [PATCH v2] glsl: fix assignment of multiple scalar and vecs to matrices.
When a vec has more elements than row components in a matrix, the code could end up failing an assert inside assign_to_matrix_column().

This patch makes sure that when there is still room in the matrix for more elements (but in other columns of the matrix), the data is actually assigned.

This patch fixes the following dEQP tests:

dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_vertex
dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_fragment

Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com
---

v2:
- Improve the patch following Ben's comments.

 src/glsl/ast_function.cpp | 110 +-
 1 file changed, 49 insertions(+), 61 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 918be69..0010ffe 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -1370,71 +1370,59 @@ emit_inline_matrix_constructor(const glsl_type *type,
    } else {
       const unsigned cols = type->matrix_columns;
       const unsigned rows = type->vector_elements;
+      unsigned remaining_slots = rows * cols;
       unsigned col_idx = 0;
       unsigned row_idx = 0;
 
       foreach_in_list(ir_rvalue, rhs, parameters) {
-         const unsigned components_remaining_this_column = rows - row_idx;
-         unsigned rhs_components = rhs->type->components();
-         unsigned rhs_base = 0;
-
-         /* Since the parameter might be used in the RHS of two assignments,
-          * generate a temporary and copy the paramter there.
-          */
-         ir_variable *rhs_var =
-            new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
-         instructions->push_tail(rhs_var);
-
-         ir_dereference *rhs_var_ref =
-            new(ctx) ir_dereference_variable(rhs_var);
-         ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
-         instructions->push_tail(inst);
-
-         /* Assign the current parameter to as many components of the matrix
-          * as it will fill.
-          *
-          * NOTE: A single vector parameter can span two matrix columns.  A
-          * single vec4, for example, can completely fill a mat2.
-          */
-         if (rhs_components >= components_remaining_this_column) {
-            const unsigned count = MIN2(rhs_components,
-                                        components_remaining_this_column);
-
-            rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
-
-            ir_instruction *inst = assign_to_matrix_column(var, col_idx,
-                                                           row_idx,
-                                                           rhs_var_ref, 0,
-                                                           count, ctx);
-            instructions->push_tail(inst);
-
-            rhs_base = count;
-
-            col_idx++;
-            row_idx = 0;
-         }
-
-         /* If there is data left in the parameter and components left to be
-          * set in the destination, emit another assignment.  It is possible
-          * that the assignment could be of a vec4 to the last element of the
-          * matrix.  In this case col_idx==cols, but there is still data
-          * left in the source parameter.  Obviously, don't emit an assignment
-          * to data outside the destination matrix.
-          */
-         if ((col_idx < cols) && (rhs_base < rhs_components)) {
-            const unsigned count = rhs_components - rhs_base;
-
-            rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
-
-            ir_instruction *inst = assign_to_matrix_column(var, col_idx,
-                                                           row_idx,
-                                                           rhs_var_ref,
-                                                           rhs_base,
-                                                           count, ctx);
-            instructions->push_tail(inst);
-
-            row_idx += count;
-         }
+         unsigned rhs_components = rhs->type->components();
+         unsigned rhs_base = 0;
+
+         if (remaining_slots == 0)
+            break;
+
+         /* Since the parameter might be used in the RHS of two assignments,
+          * generate a temporary and copy the paramter there.
+          */
+         ir_variable *rhs_var =
+            new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
+         instructions->push_tail(rhs_var);
+
+         ir_dereference *rhs_var_ref =
+            new(ctx) ir_dereference_variable(rhs_var);
+         ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
+         instructions->push_tail(inst);
+
+         do {
+            /* Assign the current parameter to as many components of the matrix
+             * as it will fill.
+             *
+             * NOTE: A single vector parameter can span two matrix columns.  A
+             * single vec4, for example, can
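Since the new loop body is cut off above, the following is only a sketch of the column-major filling the commit message describes, not the patch's actual code. fill_matrix and its tuple layout are hypothetical; the variable names mirror the ones visible in the diff.

```python
def fill_matrix(cols, rows, param_sizes):
    """Plan per-column assignments for a matrix constructor.

    Returns (param, rhs_base, col_idx, row_idx, count) tuples: copy `count`
    components of parameter `param`, starting at component `rhs_base`, into
    column `col_idx` starting at row `row_idx`.
    """
    assignments = []
    remaining_slots = cols * rows
    col_idx = 0
    row_idx = 0
    for param, rhs_components in enumerate(param_sizes):
        if remaining_slots == 0:
            break
        rhs_base = 0
        # Keep emitting assignments until this parameter is used up or the
        # matrix is full; one parameter may touch any number of columns.
        while rhs_base < rhs_components and remaining_slots > 0:
            count = min(rhs_components - rhs_base, rows - row_idx)
            assignments.append((param, rhs_base, col_idx, row_idx, count))
            rhs_base += count
            row_idx += count
            remaining_slots -= count
            if row_idx == rows:
                row_idx = 0
                col_idx += 1
    return assignments


# mat4x2(float, bvec4, ivec2, bool): 4 columns of 2 rows, the dEQP case above.
for step in fill_matrix(4, 2, [1, 4, 2, 1]):
    print(step)

# mat2(vec4): one parameter fills both columns.
print(fill_matrix(2, 2, [4]))  # [(0, 0, 0, 0, 2), (0, 2, 1, 0, 2)]
```

In the mat4x2 case the bvec4 starts in the second row of column 0 and ends in column 2, so it needs three separate column assignments, which a fixed pair of assignments per parameter cannot express.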
Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips
Mystery semi-solved? Previously u_primconvert would always select *FIRST* provoking order when flatshading wasn't enabled, but the quads would still follow the last logic. No big deal. I added support for quads to be able to follow the provoking vertex convention, but now the way that the quad is split into tris is different (to make it so that both tris start with vertex 0). This probably tickles one of the effects that you allude to. Soo I just changed it to only look at flatshading_first, which will now generally make it use the LAST provoking order. Problem solved? Not really, but piglits pass, and this seems more consistent. On Tue, Apr 7, 2015 at 11:06 AM, Roland Scheidegger srol...@vmware.com wrote: It will look different with llvmpipe if you use the right debug variables (GALLIVM_DEBUG=no_brilinear,no_quad_lod,no_rho_approx), though still fail. I think the test may not be really valid. This is because if you use texgrad, the driver/hw probably will (or should) use per-pixel lod. But if you don't, it is of course per-quad. For the smallest mip it will only give the same results here if you pick the right (top/bottom, left/right) values for doing the lod calculations with implicit lod (that is, the one from the actually active pixel in the quad, so if the active pixel was top/left you must calculate ddx with the top values, and ddy with the left values). And I don't think that is a requirement anywhere. At least that's what I remember... And the cube test is probably not quite right neither (though llvmpipe passes it with those mentioned variables, that is more due to the implementation of cube mapping though - the cube face selection must be done per pixel and not per quad and there's tons of code to get lods right be it implicit or explicit). Don't know though why the test would regress this, as it shouldn't affect it at all with last provoking vertex. Am 07.04.2015 um 16:28 schrieb Ilia Mirkin: Oh fun, those tests also fail with nvc0 and llvmpipe. 
But pass on softpipe. (The llvmpipe fail is visually different from the nvc0 and freedreno/a3xx one though.) On Tue, Apr 7, 2015 at 10:25 AM, Ilia Mirkin imir...@alum.mit.edu wrote: Weird, this seems to regress bin/arb_shader_texture_lod-texgrad bin/arb_shader_texture_lod-texgradcube Visually they look the same, but piglit finds small differences. On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote: Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- Pushing this through a complete piglit run, but it seems to fix bin/arb-provoking-vertex-render on a3xx. Please take special care to double-check that I didn't mess up cw/ccw order or something. I'm especially weak on the quadstrip case. src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py b/src/gallium/auxiliary/indices/u_indices_gen.py index 687a717..b17d132 100644 --- a/src/gallium/auxiliary/indices/u_indices_gen.py +++ b/src/gallium/auxiliary/indices/u_indices_gen.py @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, outpv ): tri( intype, outtype, ptr, v2, v0, v1 ) def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ): -do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv ); -do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv ); +if inpv == LAST: +do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv ); +do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv ); +else: +do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv ); +do_tri( intype, outtype, ptr+'+3', v0, v3, v2, inpv, outpv ); Erm, make that v0, v1, v2; v0, v2, v3. 
Oops :)

 def name(intype, outtype, inpv, outpv, pr, prim):
     if intype == GENERATE:
@@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
     print ' i += 4;'
     print ' goto restart;'
     print ' }'
-    do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    if inpv == LAST:
+        do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    else:
+        do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, outpv );
     print ' }'
     postamble()
--
2.0.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first
This should match to how drivers program hardware. It shouldn't matter
when flatshading isn't in effect, but somehow it seems to.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_primconvert.c b/src/gallium/auxiliary/indices/u_primconvert.c
index 00e65aa..70d3e85 100644
--- a/src/gallium/auxiliary/indices/u_primconvert.c
+++ b/src/gallium/auxiliary/indices/u_primconvert.c
@@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
     * we would actually need to save/restore rasterizer state.  As
     * it is, we just need to make note of the pv.
     */
-   pc->api_pv = (rast->flatshade
-                 && !rast->flatshade_first) ? PV_LAST : PV_FIRST;
+   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
 }

 void
--
2.0.5
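To see where the two selection rules in this patch disagree, here is a small standalone sketch. The enum and the function names pv_before_patch() and pv_after_patch() are made up for illustration; only the two expressions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PV_FIRST / PV_LAST constants. */
enum pv_mode_sketch { SKETCH_PV_FIRST, SKETCH_PV_LAST };

/* Selection as in the removed lines: LAST only when flatshading is
 * enabled and flatshade_first is not set.
 */
static enum pv_mode_sketch
pv_before_patch(bool flatshade, bool flatshade_first)
{
   return (flatshade && !flatshade_first) ? SKETCH_PV_LAST : SKETCH_PV_FIRST;
}

/* Selection as in the added line: flatshade is ignored entirely. */
static enum pv_mode_sketch
pv_after_patch(bool flatshade, bool flatshade_first)
{
   (void) flatshade;
   return flatshade_first ? SKETCH_PV_FIRST : SKETCH_PV_LAST;
}
```

The two only differ when both flatshade and flatshade_first are false: the old expression yields FIRST there, the new one yields LAST.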
Re: [Mesa-dev] [RFC 0/2] nir and ttn support for indirect/arrays
On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote: From: Rob Clark robcl...@freedesktop.org Introduce intrinsics to load/store global vars (since I'm not sure what the point is to have global as a var type if there is no way to access it, but maybe I'm missing something), and update ttn to generate global variables for arrays. By and large, variables are supposed to be accessed with nir_intrinsic_load/store_var. We then (optionally) lower to explicit index+offset intrinsics for shader inputs/outputs to make things easier on the backends. However, globals and locals are expected to be lowered to registers not intrinsics. With TGSI, I think they have an input/output register file, so it was easier for Eric to simply use the index+offset intrinsics right from the start. --Jason So, for example: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0..1] DCL TEMP[0..2], ARRAY(1), LOCAL DCL TEMP[3..4], LOCAL DCL ADDR[0] IMM[0] FLT32 {1., 2., 3., 0.} IMM[1] FLT32 {4., 5., 6., 7.} IMM[2] FLT32 {7., 8., 9., 0.} 0: MOV TEMP[0], IMM[0].xyzx 1: MOV TEMP[1], IMM[1].xyzx 2: MOV TEMP[2], IMM[2].xyzx 3: UARL ADDR[0].x, CONST[0]. 4: FSEQ TEMP[3].xyz, TEMP[ADDR[0].x](1).xyzz, CONST[1].xyzz 5: AND TEMP[3].y, TEMP[3]., TEMP[3]. 6: AND TEMP[3].x, TEMP[3]., TEMP[3]. 7: UCMP TEMP[4], TEMP[3]., IMM[0].wxwx, TEMP[4] 8: NOT TEMP[3].x, TEMP[3]. 
9: UCMP TEMP[4], TEMP[3]., IMM[0].xwwx, TEMP[4] 10: MOV OUT[0], TEMP[4] 11: END becomes: decl_var uniform vec4[2] uniform_0 (0, 0) decl_var shader_out vec4 out_0 (1, 0) decl_overload main returning void impl main { block block_0: /* preds: */ vec4 ssa_0 = load_const (0x3f80 /* 1.00 */, 0x4000 /* 2.00 */, 0x4040 /* 3.00 */, 0x /* 0.00 */ vec4 ssa_233 = load_const (0x3f80 /* 1.00 */, 0x4000 /* 2.00 */, 0x4040 /* 3.00 */, 0x3f80 /* 1.00 */ intrinsic store_global (ssa_233) () (0, 1) vec4 ssa_234 = load_const (0x4080 /* 4.00 */, 0x40a0 /* 5.00 */, 0x40c0 /* 6.00 */, 0x4080 /* 4.00 */ intrinsic store_global (ssa_234) () (1, 1) vec4 ssa_235 = load_const (0x40e0 /* 7.00 */, 0x4100 /* 8.00 */, 0x4110 /* 9.00 */, 0x40e0 /* 7.00 */ intrinsic store_global (ssa_235) () (2, 1) vec4 ssa_18 = intrinsic load_uniform () () (0, 1) vec4 ssa_25 = intrinsic load_global_indirect (ssa_18) () (0, 1) vec4 ssa_27 = intrinsic load_uniform () () (1, 1) vec1 ssa_132 = feq ssa_25, ssa_27 vec1 ssa_133 = feq ssa_25.y, ssa_27.y vec1 ssa_134 = feq ssa_25.z, ssa_27.z vec1 ssa_77 = iand ssa_133, ssa_134 vec1 ssa_79 = iand ssa_132, ssa_77 vec1 ssa_244 = bcsel ssa_79, ssa_0.w, ssa_0 vec1 ssa_246 = bcsel ssa_79, ssa_0, ssa_0.w vec1 ssa_254 = load_const (0x /* 0.00 */ vec1 ssa_255 = load_const (0x3f80 /* 1.00 */ vec4 ssa_230 = vec4 ssa_244, ssa_246, ssa_254, ssa_255 intrinsic store_output (ssa_230) () (0, 1) /* succs: block_1 */ block block_1: } note, in one of the opt passes the 'decl_var vec4[3] arr_1' is getting lost but I haven't debugged that yet Rob Clark (2): nir: add intrinsics for load/store global gallium/ttn: add support for temp arrays src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++-- src/glsl/nir/nir_intrinsics.h | 7 +- 2 files changed, 100 insertions(+), 23 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org 
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first
This looks good to me. Note that in general it is not true that this
doesn't affect things when flatshading isn't in effect. That only holds
with old-style semantics, where color is the only attribute which can be
flatshaded (and is done so with the rasterizer setting). It does not hold
if you have attributes which are simply interpolated with the flat
qualifier. Though this is probably another problem...

Roland

On 07.04.2015 at 18:12, Ilia Mirkin wrote:
> This should match to how drivers program hardware. It shouldn't matter
> when flatshading isn't in effect, but somehow it seems to.
>
> Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
> ---
>  src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c b/src/gallium/auxiliary/indices/u_primconvert.c
> index 00e65aa..70d3e85 100644
> --- a/src/gallium/auxiliary/indices/u_primconvert.c
> +++ b/src/gallium/auxiliary/indices/u_primconvert.c
> @@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
>      * we would actually need to save/restore rasterizer state.  As
>      * it is, we just need to make note of the pv.
>      */
> -   pc->api_pv = (rast->flatshade
> -                 && !rast->flatshade_first) ? PV_LAST : PV_FIRST;
> +   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
>  }
>
>  void
Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.
On Tue, Apr 7, 2015 at 5:14 AM, Jose Fonseca jfons...@vmware.com wrote: Sorry for the delay. I've been away during the Easter. On 02/04/15 19:02, Matt Turner wrote: On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote: These were being defined in SCons, but it's not practical -- we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines across everywhere. To confirm, you're building external sources with gcc? I don't think these macros are useful for MSVC. Correct. No actual change in behavior for autoconf. --- configure.ac | 2 +- include/c99_compat.h | 45 + scons/gallium.py | 27 --- src/util/macros.h| 2 ++ 4 files changed, 48 insertions(+), 28 deletions(-) diff --git a/configure.ac b/configure.ac index 520cc22..1485bba 100644 --- a/configure.ac +++ b/configure.ac @@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS _SAVE_CPPFLAGS=$CPPFLAGS dnl Compiler macros -DEFINES= +DEFINES=-DHAVE_AUTOCONF AC_SUBST([DEFINES]) case $host_os in linux*|*-gnu*|gnu*) diff --git a/include/c99_compat.h b/include/c99_compat.h index 4fc91bc..62ccd46 100644 --- a/include/c99_compat.h +++ b/include/c99_compat.h c99_compat.h doesn't seem like the right location. I know it seems like a nice place to add this since it's included everywhere, but I worry that in a few years we're going to be cleaning it up like we've been doing with compiler.h and friends. I might make a separate header to define these? Not sure. I can move the defines out of c99_compat.h , e.g., mesa/include/fallbackconfig.h. But I'd prefer to include fallbackconfig.h out of c99_compat.h , as c99_compat.h is pretty much guaranteed to be included all the time. Since probably all cases of #ifdef HAVE___* have a fallback, that runs the risk of never noticing that you weren't including the right header. 
Precisely, this is all the more reason why it must be included from a header that's included all the time. If it depends on people to add the include on a case-by-case it is bound to fail, as nobody else but us cares, and it will easily go unnoticed. @@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a, #endif + +/* Fallback definitions, for when these headers are used by build systems which + * don't auto-detect these things.*/ +#ifndef HAVE_AUTOCONF I'd rather flip this condition around and not modify configure.ac. But maybe you can't do that because you're not actually building everything with scons? No biggie either way. I don't know. This seems nuts. I really don't like adding stuff to the autotools build system like this. Sure. I really don't know how to deal with this. What I'm hearing is that even the custom scons build system you guys use isn't sufficient for your own needs. You're not building the external source trees with the same build system...? I think you might be getting the wrong idea. We don't build the .C files from external source trees. But we do need to include .h files, so we can interface with components in Mesa tree. That is, I only need the .h files to make sense on their own (with Mesa components, namely mesa/src/gallium/include, and gallium auxiliary libraries). But we have so many inlines functions, so many #ifdef HAVE_foo, that unless all the defines match precisely, the whole hell breaks loose. Gallium has from the start been integrated (ie. embedded) on a myriad of places. It was always meant as a framework to write any sort of 3d driver, not just OpenGL drivers. Things were much worse when Gallium was used on Windows XP kernel land or Windows CE. I'm glad that I or anybody else has to deal with the quirkiness of keeping code portable across these platforms. Things are still much more uniform nowadays. 
I mean, in all the build system work I've done I've tried to make sure scons continues working -- doing things like adding these HAVE_* definitions to it and such. It's kind of frustrating, and it's even more frustrating when even that isn't sufficient.

All I'm doing here is basically moving your defines out of scons's python files into C headers. Conceptually it's doing pretty much the same thing as before, but being in a header means that it's there for all build systems to take. Remember that Mesa itself is not just autoconf and Scons, there's also the Android build system.

I don't like it any more than you do, but this is the world we live in: the fact is that many platforms constrain how software must be built, to a point where it is impracticable/impossible to build. Even if a build system that meets everybody's needs existed, we'd still face the legacy of existing software using other build systems.

To be honest, IMHO, Mesa
Re: [Mesa-dev] [PATCH] glsl: check for forced_language_version in is_version()
Ping.

On 04/01/2015 02:38 PM, Brian Paul wrote:
> This is a follow-on fix from the earlier "glsl: allow ForceGLSLVersion to
> override #version directives" change. Since we're not changing the
> language_version field, we have to check forced_language_version here.
> ---
>  src/glsl/glsl_parser_extras.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
> index 1f5478b..dae7864 100644
> --- a/src/glsl/glsl_parser_extras.h
> +++ b/src/glsl/glsl_parser_extras.h
> @@ -105,8 +105,10 @@ struct _mesa_glsl_parse_state {
>     {
>        unsigned required_version = this->es_shader ?
>           required_glsl_es_version : required_glsl_version;
> +      unsigned this_version = this->forced_language_version
> +         ? this->forced_language_version : this->language_version;
>        return required_version != 0
> -         && this->language_version >= required_version;
> +         && this_version >= required_version;
>     }
>
>     bool check_version(unsigned required_glsl_version,
[Mesa-dev] [PATCH v2 2/2] indices: fix provoking vertex for quads/quadstrips
This allows drivers to provide consistent flat shading for quads.
Otherwise a driver that only supported tris would have to force last
provoking vertex when drawing quads (and would have to say that quads
don't follow the provoking vertex convention).

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

vmware folks -- Please test this out with svga. I see that you might
have a similar issue with how you determine api_pv as u_primconvert.
Also if you don't expect quads to follow, you can always just pass in
pv == LAST, but then you still end up having to force your rast state
to pv = LAST too, for quads.

A good piglit to play with is: bin/arb-provoking-vertex-render

 src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py b/src/gallium/auxiliary/indices/u_indices_gen.py
index 687a717..97c8e0d 100644
--- a/src/gallium/auxiliary/indices/u_indices_gen.py
+++ b/src/gallium/auxiliary/indices/u_indices_gen.py
@@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, outpv ):
         tri( intype, outtype, ptr, v2, v0, v1 )

 def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
-    do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
-    do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
+    if inpv == LAST:
+        do_tri( intype, outtype, ptr+'+0', v0, v1, v3, inpv, outpv );
+        do_tri( intype, outtype, ptr+'+3', v1, v2, v3, inpv, outpv );
+    else:
+        do_tri( intype, outtype, ptr+'+0', v0, v1, v2, inpv, outpv );
+        do_tri( intype, outtype, ptr+'+3', v0, v2, v3, inpv, outpv );

 def name(intype, outtype, inpv, outpv, pr, prim):
     if intype == GENERATE:
@@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
     print ' i += 4;'
     print ' goto restart;'
     print ' }'
-    do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    if inpv == LAST:
+        do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+    else:
+        do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, outpv );
     print ' }'
     postamble()
--
2.0.5
[Mesa-dev] [RFC 1/2] nir: add intrinsics for load/store global
From: Rob Clark robcl...@freedesktop.org

Seemed like these were missing in action? This is how it works for
other vars (uniform/shader_in/shader_out/etc), so seemed sensible.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/glsl/nir/nir_intrinsics.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 8e28765..a047324 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -122,6 +122,10 @@ SYSTEM_VALUE(invocation_id, 1)
    INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
              true, 0, 0, 2, flags)

+/* NOTE: global can be re-ordered, just not wrt. stores.. not sure if
+ * is a way to express that?
+ */
+LOAD(global, 0, NIR_INTRINSIC_CAN_ELIMINATE)
 LOAD(uniform, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(ubo, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(input, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
@@ -136,8 +140,9 @@ LOAD(input, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 #define STORE(name, num_indices, flags) \
    INTRINSIC(store_##name, 1, ARR(0), false, 0, 0, num_indices, flags) \
    INTRINSIC(store_##name##_indirect, 2, ARR(0, 1), false, 0, 0, \
-             num_indices, flags) \
+             num_indices, flags)

+STORE(global, 2, 0)/* num_indices should be ?? */
 STORE(output, 2, 0)
 /* STORE(ssbo, 3, 0) */

--
2.1.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
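The LOAD()/STORE() lines in this diff rely on the preprocessor to stamp out a direct and an _indirect variant per name. A self-contained sketch of that general technique follows. It is not the real nir_intrinsics.h machinery: every name in it (SKETCH_LOAD_LIST, sketch_op, sketch_infos, the flag values) is invented for illustration, and it uses a single list macro rather than re-including a header.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag bits, standing in for the NIR_INTRINSIC_* flags. */
#define SKETCH_CAN_ELIMINATE (1u << 0)
#define SKETCH_CAN_REORDER   (1u << 1)

/* One entry per load kind.  "global" is left without CAN_REORDER,
 * mirroring the NOTE in the patch about not reordering across stores.
 */
#define SKETCH_LOAD_LIST(LOAD) \
   LOAD(global,  SKETCH_CAN_ELIMINATE) \
   LOAD(uniform, SKETCH_CAN_ELIMINATE | SKETCH_CAN_REORDER) \
   LOAD(input,   SKETCH_CAN_ELIMINATE | SKETCH_CAN_REORDER)

/* First expansion: an enum with a direct and an indirect opcode each. */
#define AS_ENUM(name, flags) \
   sketch_op_load_##name, sketch_op_load_##name##_indirect,
enum sketch_op { SKETCH_LOAD_LIST(AS_ENUM) sketch_op_count };
#undef AS_ENUM

struct sketch_info {
   const char *name;
   unsigned num_srcs; /* the indirect variant takes one extra source */
   unsigned flags;
};

/* Second expansion: a matching info table, in the same order. */
#define AS_INFO(name, flags) \
   { "load_" #name, 0, (flags) }, \
   { "load_" #name "_indirect", 1, (flags) },
static const struct sketch_info sketch_infos[sketch_op_count] = {
   SKETCH_LOAD_LIST(AS_INFO)
};
#undef AS_INFO

static bool
sketch_can_reorder(enum sketch_op op)
{
   assert(op < sketch_op_count);
   return (sketch_infos[op].flags & SKETCH_CAN_REORDER) != 0;
}
```

Adding a new kind is then a one-line change to the list, which is roughly what the LOAD(global, ...) and STORE(global, ...) lines do in the patch.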
Re: [Mesa-dev] [PATCH] i965: Fix depth field setting in surface state for raw buffer on Gen7/8
On Mon, Apr 6, 2015 at 10:51 PM, Zhenyu Wang zhen...@linux.intel.com wrote:
> On Gen7/8 for RAW surface format, the depth field (surf[3]) in surface
> state means [30:21] bits of number of entries which is different from
> other surface format which uses [26:21] bits field.
>
> Signed-off-by: Zhenyu Wang zhen...@linux.intel.com

Is there a bugzilla that this fixes we can link from the commit message?
Either way, this looks good.

Reviewed-by: Kristian Høgsberg k...@bitplanet.net

> ---
>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 7 +--
>  src/mesa/drivers/dri/i965/gen8_surface_state.c    | 7 +--
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> index d9361d3..18bcb8a 100644
> --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> @@ -238,8 +238,11 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
>     surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
>     surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
> -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
> -             (pitch - 1);
> +   if (surface_format == BRW_SURFACEFORMAT_RAW)
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
> +   else
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
> +   surf[3] |= (pitch - 1);
>
>     surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 0007c95..ba59b05 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -129,8 +129,11 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
>     surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
> -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
> -             (pitch - 1);
> +   if (surface_format == BRW_SURFACEFORMAT_RAW)
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
> +   else
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
> +   surf[3] |= (pitch - 1);
>     surf[7] = SET_FIELD(HSW_SCS_RED, GEN7_SURFACE_SCS_R) |
>               SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
>               SET_FIELD(HSW_SCS_BLUE, GEN7_SURFACE_SCS_B) |
> --
> 2.1.4
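The bit slicing in this patch is easier to follow in isolation. The sketch below is an illustration only: struct buffer_dims_sketch and the two helpers are hypothetical names, and the masks are the ones visible in the diff (0x7f for width, 0x3fff for height, and 0x3f or 0x3ff for depth depending on whether the surface is RAW).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three size fields. */
struct buffer_dims_sketch {
   uint32_t width;  /* bits  6:0  of (buffer_size - 1) */
   uint32_t height; /* bits 20:7  of (buffer_size - 1) */
   uint32_t depth;  /* bits 26:21, or 30:21 for RAW buffers */
};

static struct buffer_dims_sketch
split_buffer_size_sketch(uint32_t buffer_size, bool is_raw)
{
   assert(buffer_size > 0);

   const uint32_t n = buffer_size - 1;
   const uint32_t depth_mask = is_raw ? 0x3ff : 0x3f;
   struct buffer_dims_sketch d;

   d.width = n & 0x7f;
   d.height = (n >> 7) & 0x3fff;
   d.depth = (n >> 21) & depth_mask;
   return d;
}

/* Reassemble the number of entries that the three fields encode. */
static uint32_t
join_buffer_dims_sketch(struct buffer_dims_sketch d)
{
   return (d.width | (d.height << 7) | (d.depth << 21)) + 1;
}
```

For a buffer_size of 1 << 28, the 6-bit mask drops bit 27 of (buffer_size - 1), so the fields encode only 1 << 27 entries. With the 10-bit RAW mask the full size survives the round trip.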
Re: [Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays
On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote: From: Rob Clark robcl...@freedesktop.org Since the rest of NIR really would rather have these as variables rather than registers, create a nir_variable per array. But rather than completely re-arrange ttn to be variable based rather than register based, keep the registers. In the cases where there is a matching var for the reg, ttn_emit_instruction will append the appropriate intrinsic to get things back from the shadow reg into the variable. NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give an array id. But those just kinda suck, and should really go away. AFAICT we don't get those from glsl. Might be an issue for some other state tracker. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++-- 1 file changed, 94 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index da935a4..1c7b313 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -44,6 +44,7 @@ struct ttn_reg_info { /** nir register containing this TGSI index. */ nir_register *reg; + nir_variable *var; /** Offset (in vec4s) from the start of var for this TGSI index. */ int offset; }; @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c) if (file == TGSI_FILE_TEMPORARY) { nir_register *reg; - if (c-scan-indirect_files (1 file)) { + for (i = 0; i array_size; i++) { reg = nir_local_reg_create(b-impl); reg-num_components = 4; - reg-num_array_elems = array_size; + c-temp_regs[decl-Range.First + i].reg = reg; + c-temp_regs[decl-Range.First + i].offset = 0; + } + if (decl-Declaration.Array) { + /* for arrays, the register created just serves as a + * shadow register. 
We append intrinsic_store_global + * after the tgsi instruction is translated to move + * back from the shadow register to the variable + */ + nir_variable *var = rzalloc(b-shader, nir_variable); + var-type = glsl_array_type(glsl_vec4_type(), array_size); + var-data.mode = nir_var_global; + var-name = ralloc_asprintf(var, arr_%d, decl-Array.ArrayID); + + exec_list_push_tail(b-shader-globals, var-node); for (i = 0; i array_size; i++) { -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = i; +c-temp_regs[decl-Range.First + i].var = var; } } else { - for (i = 0; i array_size; i++) { -reg = nir_local_reg_create(b-impl); -reg-num_components = 4; -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = 0; - } } } else if (file == TGSI_FILE_ADDRESS) { c-addr_reg = nir_local_reg_create(b-impl); @@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_TEMPORARY: - src.reg.reg = c-temp_regs[index].reg; - src.reg.base_offset = c-temp_regs[index].offset; - if (indirect) - src.reg.indirect = ttn_src_for_indirect(c, indirect); + if (c-temp_regs[index].var) { + nir_intrinsic_instr *load; + nir_alu_src indirect_address; + + assert(indirect); + + load = nir_intrinsic_instr_create(b-shader, + nir_intrinsic_load_global_indirect); + load-num_components = 4; + load-const_index[0] = index; + load-const_index[1] = 1; Why are we using an intrinsic that has an index and not nir_intrinsic_load_var with a deref? A short (2-element) deref chain will handle this for you and then the lower_vars_to_ssa pass will pick up on things like if all the indirect uses are actually constant and lower it to SSA values for you. If you use an index+offset intrinsic then it's completely opaque and the rest of NIR doesn't know what to do with it. 
+ + memset(indirect_address, 0, sizeof(indirect_address)); + indirect_address.src = nir_src_for_reg(c-addr_reg); + for (int i = 0; i 4; i++) +indirect_address.swizzle[i] = indirect-Swizzle; + load-src[0] = nir_src_for_ssa(nir_imov_alu(b, indirect_address, 1)); + + nir_ssa_dest_init(load-instr, load-dest, 4, NULL); + nir_instr_insert_after_cf_list(b-cf_node_list, load-instr); + + src = nir_src_for_ssa(load-dest.ssa); + + } else { + assert(!indirect); + src.reg.reg = c-temp_regs[index].reg; + src.reg.base_offset = c-temp_regs[index].offset; + } break; case
Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.
On 07/04/15 15:01, Emil Velikov wrote: On 7 April 2015 at 13:14, Jose Fonseca jfons...@vmware.com wrote: Sorry for the delay. I've been away during the Easter. On 02/04/15 19:02, Matt Turner wrote: On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote: These were being defined in SCons, but it's not practical -- we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines across everywhere. To confirm, you're building external sources with gcc? I don't think these macros are useful for MSVC. Correct. No actual change in behavior for autoconf. --- configure.ac | 2 +- include/c99_compat.h | 45 + scons/gallium.py | 27 --- src/util/macros.h| 2 ++ 4 files changed, 48 insertions(+), 28 deletions(-) diff --git a/configure.ac b/configure.ac index 520cc22..1485bba 100644 --- a/configure.ac +++ b/configure.ac @@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS _SAVE_CPPFLAGS=$CPPFLAGS dnl Compiler macros -DEFINES= +DEFINES=-DHAVE_AUTOCONF AC_SUBST([DEFINES]) case $host_os in linux*|*-gnu*|gnu*) diff --git a/include/c99_compat.h b/include/c99_compat.h index 4fc91bc..62ccd46 100644 --- a/include/c99_compat.h +++ b/include/c99_compat.h c99_compat.h doesn't seem like the right location. I know it seems like a nice place to add this since it's included everywhere, but I worry that in a few years we're going to be cleaning it up like we've been doing with compiler.h and friends. I might make a separate header to define these? Not sure. I can move the defines out of c99_compat.h , e.g., mesa/include/fallbackconfig.h. But I'd prefer to include fallbackconfig.h out of c99_compat.h , as c99_compat.h is pretty much guaranteed to be included all the time. Since probably all cases of #ifdef HAVE___* have a fallback, that runs the risk of never noticing that you weren't including the right header. 
Precisely, this is all the more reason why it must be included from a header that's included all the time. If it depends on people to add the include on a case-by-case it is bound to fail, as nobody else but us cares, and it will easily go unnoticed. @@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a, #endif + +/* Fallback definitions, for when these headers are used by build systems which + * don't auto-detect these things.*/ +#ifndef HAVE_AUTOCONF I'd rather flip this condition around and not modify configure.ac. But maybe you can't do that because you're not actually building everything with scons? No biggie either way. I don't know. This seems nuts. I really don't like adding stuff to the autotools build system like this. Sure. I really don't know how to deal with this. What I'm hearing is that even the custom scons build system you guys use isn't sufficient for your own needs. You're not building the external source trees with the same build system...? I think you might be getting the wrong idea. We don't build the .C files from external source trees. But we do need to include .h files, so we can interface with components in Mesa tree. That is, I only need the .h files to make sense on their own (with Mesa components, namely mesa/src/gallium/include, and gallium auxiliary libraries). But we have so many inlines functions, so many #ifdef HAVE_foo, that unless all the defines match precisely, the whole hell breaks loose. Gallium has from the start been integrated (ie. embedded) on a myriad of places. It was always meant as a framework to write any sort of 3d driver, not just OpenGL drivers. Things were much worse when Gallium was used on Windows XP kernel land or Windows CE. I'm glad that I or anybody else has to deal with the quirkiness of keeping code portable across these platforms. Things are still much more uniform nowadays. 
I mean, in all the build system work I've done I've tried to make sure scons continues working -- doing things like adding these HAVE_* definitions to it and such. It's kind of frustrating, and it's even more frustrating when even that isn't sufficient.

All I'm doing here is basically moving your defines out of scons's python files into C headers. Conceptually it's doing pretty much the same thing as before, but being in a header means that it's there for all build systems to take. Remember that Mesa itself is not just autoconf and Scons, there's also the Android build system.

I don't like it any more than you do, but this is the world we live in: the fact is that many platforms constrain how software must be built, to a point where it is impracticable/impossible to build. Even if a build system that meets everybody's needs existed, we'd still face the legacy of existing software using other build systems.

To be honest, IMHO, Mesa source tree and build systems are
[Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays
From: Rob Clark robcl...@freedesktop.org Since the rest of NIR really would rather have these as variables rather than registers, create a nir_variable per array. But rather than completely re-arrange ttn to be variable based rather than register based, keep the registers. In the cases where there is a matching var for the reg, ttn_emit_instruction will append the appropriate intrinsic to get things back from the shadow reg into the variable. NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give an array id. But those just kinda suck, and should really go away. AFAICT we don't get those from glsl. Might be an issue for some other state tracker. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++-- 1 file changed, 94 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index da935a4..1c7b313 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -44,6 +44,7 @@ struct ttn_reg_info { /** nir register containing this TGSI index. */ nir_register *reg; + nir_variable *var; /** Offset (in vec4s) from the start of var for this TGSI index. */ int offset; }; @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c) if (file == TGSI_FILE_TEMPORARY) { nir_register *reg; - if (c-scan-indirect_files (1 file)) { + for (i = 0; i array_size; i++) { reg = nir_local_reg_create(b-impl); reg-num_components = 4; - reg-num_array_elems = array_size; + c-temp_regs[decl-Range.First + i].reg = reg; + c-temp_regs[decl-Range.First + i].offset = 0; + } + if (decl-Declaration.Array) { + /* for arrays, the register created just serves as a + * shadow register. 
We append intrinsic_store_global + * after the tgsi instruction is translated to move + * back from the shadow register to the variable + */ + nir_variable *var = rzalloc(b-shader, nir_variable); + var-type = glsl_array_type(glsl_vec4_type(), array_size); + var-data.mode = nir_var_global; + var-name = ralloc_asprintf(var, arr_%d, decl-Array.ArrayID); + + exec_list_push_tail(b-shader-globals, var-node); for (i = 0; i array_size; i++) { -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = i; +c-temp_regs[decl-Range.First + i].var = var; } } else { - for (i = 0; i array_size; i++) { -reg = nir_local_reg_create(b-impl); -reg-num_components = 4; -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = 0; - } } } else if (file == TGSI_FILE_ADDRESS) { c-addr_reg = nir_local_reg_create(b-impl); @@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_TEMPORARY: - src.reg.reg = c-temp_regs[index].reg; - src.reg.base_offset = c-temp_regs[index].offset; - if (indirect) - src.reg.indirect = ttn_src_for_indirect(c, indirect); + if (c-temp_regs[index].var) { + nir_intrinsic_instr *load; + nir_alu_src indirect_address; + + assert(indirect); + + load = nir_intrinsic_instr_create(b-shader, + nir_intrinsic_load_global_indirect); + load-num_components = 4; + load-const_index[0] = index; + load-const_index[1] = 1; + + memset(indirect_address, 0, sizeof(indirect_address)); + indirect_address.src = nir_src_for_reg(c-addr_reg); + for (int i = 0; i 4; i++) +indirect_address.swizzle[i] = indirect-Swizzle; + load-src[0] = nir_src_for_ssa(nir_imov_alu(b, indirect_address, 1)); + + nir_ssa_dest_init(load-instr, load-dest, 4, NULL); + nir_instr_insert_after_cf_list(b-cf_node_list, load-instr); + + src = nir_src_for_ssa(load-dest.ssa); + + } else { + assert(!indirect); + src.reg.reg = c-temp_regs[index].reg; + src.reg.base_offset = 
c-temp_regs[index].offset; + } break; case TGSI_FILE_ADDRESS: @@ -340,29 +372,45 @@ ttn_get_dest(struct ttn_compile *c, struct tgsi_full_dst_register *tgsi_fdst) { struct tgsi_dst_register *tgsi_dst = tgsi_fdst-Register; nir_alu_dest dest; + unsigned index = tgsi_dst-Index; memset(dest, 0, sizeof(dest)); + dest.write_mask = tgsi_dst-WriteMask; + dest.saturate = false; + if (tgsi_dst-File == TGSI_FILE_TEMPORARY) { - dest.dest.reg.reg = c-temp_regs[tgsi_dst-Index].reg; - dest.dest.reg.base_offset = c-temp_regs[tgsi_dst-Index].offset; + dest.dest.reg.reg = c-temp_regs[index].reg; +
[Mesa-dev] [RFC 0/2] nir and ttn support for indirect/arrays
From: Rob Clark robcl...@freedesktop.org Introduce intrinsics to load/store global vars (since I'm not sure what the point is to have global as a var type if there is no way to access it, but maybe I'm missing something), and update ttn to generate global variables for arrays. So, for example: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0..1] DCL TEMP[0..2], ARRAY(1), LOCAL DCL TEMP[3..4], LOCAL DCL ADDR[0] IMM[0] FLT32 {1., 2., 3., 0.} IMM[1] FLT32 {4., 5., 6., 7.} IMM[2] FLT32 {7., 8., 9., 0.} 0: MOV TEMP[0], IMM[0].xyzx 1: MOV TEMP[1], IMM[1].xyzx 2: MOV TEMP[2], IMM[2].xyzx 3: UARL ADDR[0].x, CONST[0]. 4: FSEQ TEMP[3].xyz, TEMP[ADDR[0].x](1).xyzz, CONST[1].xyzz 5: AND TEMP[3].y, TEMP[3]., TEMP[3]. 6: AND TEMP[3].x, TEMP[3]., TEMP[3]. 7: UCMP TEMP[4], TEMP[3]., IMM[0].wxwx, TEMP[4] 8: NOT TEMP[3].x, TEMP[3]. 9: UCMP TEMP[4], TEMP[3]., IMM[0].xwwx, TEMP[4] 10: MOV OUT[0], TEMP[4] 11: END becomes: decl_var uniform vec4[2] uniform_0 (0, 0) decl_var shader_out vec4 out_0 (1, 0) decl_overload main returning void impl main { block block_0: /* preds: */ vec4 ssa_0 = load_const (0x3f80 /* 1.00 */, 0x4000 /* 2.00 */, 0x4040 /* 3.00 */, 0x /* 0.00 */ vec4 ssa_233 = load_const (0x3f80 /* 1.00 */, 0x4000 /* 2.00 */, 0x4040 /* 3.00 */, 0x3f80 /* 1.00 */ intrinsic store_global (ssa_233) () (0, 1) vec4 ssa_234 = load_const (0x4080 /* 4.00 */, 0x40a0 /* 5.00 */, 0x40c0 /* 6.00 */, 0x4080 /* 4.00 */ intrinsic store_global (ssa_234) () (1, 1) vec4 ssa_235 = load_const (0x40e0 /* 7.00 */, 0x4100 /* 8.00 */, 0x4110 /* 9.00 */, 0x40e0 /* 7.00 */ intrinsic store_global (ssa_235) () (2, 1) vec4 ssa_18 = intrinsic load_uniform () () (0, 1) vec4 ssa_25 = intrinsic load_global_indirect (ssa_18) () (0, 1) vec4 ssa_27 = intrinsic load_uniform () () (1, 1) vec1 ssa_132 = feq ssa_25, ssa_27 vec1 ssa_133 = feq ssa_25.y, ssa_27.y vec1 ssa_134 = feq ssa_25.z, ssa_27.z vec1 ssa_77 = iand ssa_133, ssa_134 vec1 ssa_79 = iand ssa_132, ssa_77 vec1 ssa_244 = bcsel ssa_79, 
ssa_0.w, ssa_0 vec1 ssa_246 = bcsel ssa_79, ssa_0, ssa_0.w vec1 ssa_254 = load_const (0x /* 0.00 */ vec1 ssa_255 = load_const (0x3f80 /* 1.00 */ vec4 ssa_230 = vec4 ssa_244, ssa_246, ssa_254, ssa_255 intrinsic store_output (ssa_230) () (0, 1) /* succs: block_1 */ block block_1: } note, in one of the opt passes the 'decl_var vec4[3] arr_1' is getting lost but I haven't debugged that yet Rob Clark (2): nir: add intrinsics for load/store global gallium/ttn: add support for temp arrays src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++-- src/glsl/nir/nir_intrinsics.h | 7 +- 2 files changed, 100 insertions(+), 23 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first
BTW, I should have mentioned -- this affects the way that quads are split up into tris, which is most likely the source of any differences from my later change which makes quads respect pv order.

On Tue, Apr 7, 2015 at 12:12 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> This should match how drivers program the hardware. It shouldn't
> matter when flatshading isn't in effect, but somehow it seems to.
>
> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> ---
>  src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c b/src/gallium/auxiliary/indices/u_primconvert.c
> index 00e65aa..70d3e85 100644
> --- a/src/gallium/auxiliary/indices/u_primconvert.c
> +++ b/src/gallium/auxiliary/indices/u_primconvert.c
> @@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
>      * we would actually need to save/restore rasterizer state. As
>      * it is, we just need to make note of the pv.
>      */
> -   pc->api_pv = (rast->flatshade
> -                 && !rast->flatshade_first) ? PV_LAST : PV_FIRST;
> +   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
>  }
>
>  void
> --
> 2.0.5
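To make the behavioural difference concrete, here is a small sketch that puts the old and new expressions from the patch side by side. The struct and enum are simplified stand-ins invented for this example (not the real pipe_rasterizer_state or primconvert types); only the two conditional expressions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in types for illustration only; the real ones live in gallium. */
enum example_pv { PV_FIRST, PV_LAST };

struct example_rasterizer {
   bool flatshade;
   bool flatshade_first;
};

/* Expression removed by the patch. */
static enum example_pv
pv_before_patch(const struct example_rasterizer *rast)
{
   return (rast->flatshade && !rast->flatshade_first) ? PV_LAST : PV_FIRST;
}

/* Expression added by the patch: flatshade is no longer consulted. */
static enum example_pv
pv_after_patch(const struct example_rasterizer *rast)
{
   return rast->flatshade_first ? PV_FIRST : PV_LAST;
}
```

Going through the four combinations, the two versions only disagree when both flatshade and flatshade_first are false: the old code picks PV_FIRST there, the new code picks PV_LAST. That is the case where flatshading "isn't in effect", which fits the observation that quad splitting still changes.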
Re: [Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays
On Tue, Apr 7, 2015 at 12:32 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote: From: Rob Clark robcl...@freedesktop.org Since the rest of NIR really would rather have these as variables rather than registers, create a nir_variable per array. But rather than completely re-arrange ttn to be variable based rather than register based, keep the registers. In the cases where there is a matching var for the reg, ttn_emit_instruction will append the appropriate intrinsic to get things back from the shadow reg into the variable. NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give an array id. But those just kinda suck, and should really go away. AFAICT we don't get those from glsl. Might be an issue for some other state tracker. Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++-- 1 file changed, 94 insertions(+), 22 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index da935a4..1c7b313 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -44,6 +44,7 @@ struct ttn_reg_info { /** nir register containing this TGSI index. */ nir_register *reg; + nir_variable *var; /** Offset (in vec4s) from the start of var for this TGSI index. */ int offset; }; @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c) if (file == TGSI_FILE_TEMPORARY) { nir_register *reg; - if (c-scan-indirect_files (1 file)) { + for (i = 0; i array_size; i++) { reg = nir_local_reg_create(b-impl); reg-num_components = 4; - reg-num_array_elems = array_size; + c-temp_regs[decl-Range.First + i].reg = reg; + c-temp_regs[decl-Range.First + i].offset = 0; + } + if (decl-Declaration.Array) { + /* for arrays, the register created just serves as a + * shadow register. 
We append intrinsic_store_global + * after the tgsi instruction is translated to move + * back from the shadow register to the variable + */ + nir_variable *var = rzalloc(b-shader, nir_variable); + var-type = glsl_array_type(glsl_vec4_type(), array_size); + var-data.mode = nir_var_global; + var-name = ralloc_asprintf(var, arr_%d, decl-Array.ArrayID); + + exec_list_push_tail(b-shader-globals, var-node); for (i = 0; i array_size; i++) { -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = i; +c-temp_regs[decl-Range.First + i].var = var; } } else { - for (i = 0; i array_size; i++) { -reg = nir_local_reg_create(b-impl); -reg-num_components = 4; -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = 0; - } } } else if (file == TGSI_FILE_ADDRESS) { c-addr_reg = nir_local_reg_create(b-impl); @@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_TEMPORARY: - src.reg.reg = c-temp_regs[index].reg; - src.reg.base_offset = c-temp_regs[index].offset; - if (indirect) - src.reg.indirect = ttn_src_for_indirect(c, indirect); + if (c-temp_regs[index].var) { + nir_intrinsic_instr *load; + nir_alu_src indirect_address; + + assert(indirect); + + load = nir_intrinsic_instr_create(b-shader, + nir_intrinsic_load_global_indirect); + load-num_components = 4; + load-const_index[0] = index; + load-const_index[1] = 1; Why are we using an intrinsic that has an index and not nir_intrinsic_load_var with a deref? A short (2-element) deref chain will handle this for you and then the lower_vars_to_ssa pass will pick up on things like if all the indirect uses are actually constant and lower it to SSA values for you. If you use an index+offset intrinsic then it's completely opaque and the rest of NIR doesn't know what to do with it. I am *assuming* here that the index refers to which var you are load/storing.. 
at least that is how it seemed to work for uniforms/inputs/outputs. Ofc I'm mostly just trying to infer about how things should work from reading code so entirely possible I'm missing something or haven't read the right parts of the code yet.. I'm starting to think more that I should have added a nir_intrinsic_{load,store}_var_indirect instead of new intrinsics for load/store_global(_indirect).. I guess that would fit in better with how variables already work. Although I couldn't see any obvious way for {load,store}_var to take
Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.
On 07/04/15 17:16, Matt Turner wrote: On Tue, Apr 7, 2015 at 5:14 AM, Jose Fonseca jfons...@vmware.com wrote: Sorry for the delay. I've been away during the Easter. On 02/04/15 19:02, Matt Turner wrote: On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote: These were being defined in SCons, but it's not practical -- we actually need to include Gallium headers from external source trees, with completely disjoint build infrastructure, and it's unsustainable to replicate the HAVE_xxx checks or even hard-coded defines across everywhere. To confirm, you're building external sources with gcc? I don't think these macros are useful for MSVC. Correct. No actual change in behavior for autoconf. --- configure.ac | 2 +- include/c99_compat.h | 45 + scons/gallium.py | 27 --- src/util/macros.h| 2 ++ 4 files changed, 48 insertions(+), 28 deletions(-) diff --git a/configure.ac b/configure.ac index 520cc22..1485bba 100644 --- a/configure.ac +++ b/configure.ac @@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS _SAVE_CPPFLAGS=$CPPFLAGS dnl Compiler macros -DEFINES= +DEFINES=-DHAVE_AUTOCONF AC_SUBST([DEFINES]) case $host_os in linux*|*-gnu*|gnu*) diff --git a/include/c99_compat.h b/include/c99_compat.h index 4fc91bc..62ccd46 100644 --- a/include/c99_compat.h +++ b/include/c99_compat.h c99_compat.h doesn't seem like the right location. I know it seems like a nice place to add this since it's included everywhere, but I worry that in a few years we're going to be cleaning it up like we've been doing with compiler.h and friends. I might make a separate header to define these? Not sure. I can move the defines out of c99_compat.h , e.g., mesa/include/fallbackconfig.h. But I'd prefer to include fallbackconfig.h out of c99_compat.h , as c99_compat.h is pretty much guaranteed to be included all the time. Since probably all cases of #ifdef HAVE___* have a fallback, that runs the risk of never noticing that you weren't including the right header. 
Precisely, this is all the more reason why it must be included from a header that's included all the time. If it depends on people to add the include on a case-by-case it is bound to fail, as nobody else but us cares, and it will easily go unnoticed. @@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a, #endif + +/* Fallback definitions, for when these headers are used by build systems which + * don't auto-detect these things.*/ +#ifndef HAVE_AUTOCONF I'd rather flip this condition around and not modify configure.ac. But maybe you can't do that because you're not actually building everything with scons? No biggie either way. I don't know. This seems nuts. I really don't like adding stuff to the autotools build system like this. Sure. I really don't know how to deal with this. What I'm hearing is that even the custom scons build system you guys use isn't sufficient for your own needs. You're not building the external source trees with the same build system...? I think you might be getting the wrong idea. We don't build the .C files from external source trees. But we do need to include .h files, so we can interface with components in Mesa tree. That is, I only need the .h files to make sense on their own (with Mesa components, namely mesa/src/gallium/include, and gallium auxiliary libraries). But we have so many inlines functions, so many #ifdef HAVE_foo, that unless all the defines match precisely, the whole hell breaks loose. Gallium has from the start been integrated (ie. embedded) on a myriad of places. It was always meant as a framework to write any sort of 3d driver, not just OpenGL drivers. Things were much worse when Gallium was used on Windows XP kernel land or Windows CE. I'm glad that I or anybody else has to deal with the quirkiness of keeping code portable across these platforms. Things are still much more uniform nowadays. 
I mean, in all the build system work I've done I've tried to make sure scons continues working -- doing things like adding these HAVE_* definitions to it and such. It's kind of frustrating, and it's even more frustrating when even that isn't sufficient.

All I'm doing here is basically moving your defines out of scons's python files into C headers. Conceptually it's doing pretty much the same thing as before, but being in a header means that it's there for all build systems to use. Remember that Mesa itself is not just autoconf and SCons; there's also the Android build system.

I don't like it any more than you do, but this is the world we live in: the fact is that many platforms constrain how software must be built, to a point where it becomes impracticable/impossible to build. Even if a build system that met everybody's needs existed, we'd still face the legacy of existing software using other build systems.

To be honest, IMHO, Mesa source tree and build
[Mesa-dev] [PATCH] swrast: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013 redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions intend to support __func__.
No functional changes.

Signed-off-by: Marius Predut <marius.pre...@intel.com>
---
 src/mesa/swrast/s_linetemp.h |    4 ++--
 src/mesa/swrast/s_span.c     |    2 +-
 src/mesa/swrast/s_tritemp.h  |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/swrast/s_linetemp.h b/src/mesa/swrast/s_linetemp.h
index 352c884..035a1e6 100644
--- a/src/mesa/swrast/s_linetemp.h
+++ b/src/mesa/swrast/s_linetemp.h
@@ -106,7 +106,7 @@ NAME( struct gl_context *ctx, const SWvertex *vert0, const SWvertex *vert1 )
    }

    /*
-   printf("%s():\n", __FUNCTION__);
+   printf("%s():\n", __func__);
    printf(" (%f, %f, %f) -> (%f, %f, %f)\n",
           vert0->attrib[VARYING_SLOT_POS][0],
           vert0->attrib[VARYING_SLOT_POS][1],
@@ -154,7 +154,7 @@ NAME( struct gl_context *ctx, const SWvertex *vert0, const SWvertex *vert1 )
       return;

    /*
-   printf("%s %d,%d %g %g %g %g %g %g %g %g\n", __FUNCTION__, dx, dy,
+   printf("%s %d,%d %g %g %g %g %g %g %g %g\n", __func__, dx, dy,
           vert0->attrib[VARYING_SLOT_COL1][0],
           vert0->attrib[VARYING_SLOT_COL1][1],
           vert0->attrib[VARYING_SLOT_COL1][2],
diff --git a/src/mesa/swrast/s_span.c b/src/mesa/swrast/s_span.c
index e304b6b..7bb5712 100644
--- a/src/mesa/swrast/s_span.c
+++ b/src/mesa/swrast/s_span.c
@@ -1144,7 +1144,7 @@ _swrast_write_rgba_span( struct gl_context *ctx, SWspan *span)
    struct gl_framebuffer *fb = ctx->DrawBuffer;

    /*
-   printf("%s() interp 0x%x array 0x%x\n", __FUNCTION__,
+   printf("%s() interp 0x%x array 0x%x\n", __func__,
           span->interpMask, span->arrayMask);
    */

diff --git a/src/mesa/swrast/s_tritemp.h b/src/mesa/swrast/s_tritemp.h
index fb73b2d..4b6d34c 100644
--- a/src/mesa/swrast/s_tritemp.h
+++ b/src/mesa/swrast/s_tritemp.h
@@ -156,7 +156,7 @@ static void NAME(struct gl_context *ctx, const SWvertex *v0,
 #endif

    /*
-   printf("%s()\n", __FUNCTION__);
+   printf("%s()\n", __func__);
    printf(" %g, %g, %g\n",
           v0->attrib[VARYING_SLOT_POS][0],
           v0->attrib[VARYING_SLOT_POS][1],
--
1.7.9.5
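For readers unfamiliar with the identifier the patch switches to: in C99, __func__ behaves like a static const char array holding the name of the enclosing function. A minimal sketch (the function name is made up for the example):

```c
#include <assert.h>
#include <string.h>

/* Returns the name the compiler assigns to __func__ in this function.
 * example_current_function is a hypothetical name used only here. */
static const char *
example_current_function(void)
{
   return __func__;
}
```

Unlike __FUNCTION__, which is a compiler extension, __func__ is part of the C99 standard, which is the motivation given in the commit message.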
Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.
On 7 April 2015 at 16:21, Jose Fonseca <jfons...@vmware.com> wrote:
> On 07/04/15 15:01, Emil Velikov wrote:
...
>> So let's see if I got this correct, apologies in advance if it comes out too blunt.
>>
>> Unless I'm mistaken the gallium interfaces are internal/private, so comparing them with public ones (like the Khronos OpenGL) seems like comparing apples to oranges. Yet as one tries to have/use gallium interfaces as if they were public, the idea of getting some of this #ifdef-ery into a single, isolated and easily manageable place is valid and honourable.
>
> From my POV, Gallium interfaces are public and always have been. Admittedly, there's no standards body, and the interface is neither stable nor does it provide backwards compatibility. But pretty much from as far as I can remember (which is 2007) there were external (as in out-of-tree) state-trackers and even external drivers.

Don't mean to be cheeky but do you have an example of a project that has a public interface that is neither stable nor backwards compatible? Don't think I've heard about one, so I must admit that I found your statement rather surprising. Then again I have been proven narrow minded on an occasion or two.

> BTW, another solution would be for autotools to generate a config.h. And have SCons, etc, include a hand-written drop-in config.h (living in a separate directory.) This is actually a practice that many projects (off the top of my head I can name zlib, libpng, tiff, etc) follow.

I have had the same idea for a few months now. Although that would likely be a slow and long transition, as I would like to avoid severe breakages.

> Another alternative is for me to pre-include this fakeconfig.h, i.e., `gcc --include fakeconfig.h`, MSVC's `/FIfakeconfig.h`

Imho this sounds like the better solution. This way when someone uses gallium interfaces as public they can include it explicitly.

>> Afaict the overhead of rebasing an integrated solution on top of newer mesa would be less than having it out-of-tree. Plus it seems like the better engineering approach. Perhaps I'm missing something and this does not hold true?
>
> I'm afraid it doesn't hold true. It's not worth going into specifics, but imagine the following: there's Mesa, there's our Product, and there's the Component linking both. Mesa has its build system. The Product has its build system. Both Product and Mesa are huge, so building one inside the other is just impractical. What you can do is choose to build the Component inside Mesa or inside the Product, but either way you'll end up with variations of the same problem. Which one is easier depends on how tightly the Component is integrated with Mesa vs the Product.
>
> The Component is sort of a Direct3D state tracker, and is way more tightly integrated into the rest of the Product than Mesa, as it really only needs the gallium headers and a few of the helper modules.

The situation sounds familiar, although I might be a bit biased on the topic. Let me reword your sentence with an example in place:

You have a Component (st/omx), which depends on Product (omx-bellagio) at compile/link time.

Do your platform(s) have tools similar to pkg-config or cmake's package-config? If so one should be able to tackle it as follows:
1. Bring some versioning into Product.
2. Make sure that Product's headers/libraries are available via the pkg-config(alike) tool.
3. Add the check for Product into the autotools/scons build.
4. Integrate (merge) Component into Mesa.

It does have one small catch though - Product cannot depend on Mesa at link time. One can get around this but it requires some non-trivial changes.

I suspect that the situation might be more elaborate than presented (or I did not fully understand it), and I'm not trying to push you to disclose any more information. Just saying that as presented it does not sound so complex.

Thanks for the comprehensive explanation.

Cheers,
Emil
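As a rough illustration of the "fallback definitions in a header" idea being debated in this thread, the sketch below shows how a header could supply a HAVE_* define itself when the build system did not. HAVE_AUTOCONF comes from the patch under discussion; the specific compiler check, the choice of HAVE___BUILTIN_EXPECT as the example define, and the example_* names are assumptions made for this sketch, not the contents of the actual patch.

```c
#include <assert.h>

/* Fallback block: only used when the build system did not run the
 * configure-time checks (i.e. HAVE_AUTOCONF is not defined). */
#ifndef HAVE_AUTOCONF
#  if defined(__GNUC__) && !defined(HAVE___BUILTIN_EXPECT)
#    define HAVE___BUILTIN_EXPECT 1   /* assumed safe on GCC-compatible compilers */
#  endif
#endif

/* Consumer code: inline helpers in headers key off the HAVE_* define,
 * so every build system must agree on its value. */
#ifdef HAVE___BUILTIN_EXPECT
#  define example_likely(x) __builtin_expect(!!(x), 1)
#else
#  define example_likely(x) (!!(x))
#endif

/* Hypothetical inline helper of the kind found in shared headers. */
static inline int
example_clamp_positive(int v)
{
   return example_likely(v > 0) ? v : 0;
}
```

The pre-include alternative mentioned above would move the first block into a separate file and force it in from the compiler command line, leaving the shared headers untouched.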
Re: [Mesa-dev] [PATCH] i965: Use SET_FIELD in 3DSTATE_STREAMOUT packets.
On Mon, Apr 6, 2015 at 4:12 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> Suggested by Topi Pohjolainen.
>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> Cc: Topi Pohjolainen <topi.pohjolai...@intel.com>
> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 16
>  src/mesa/drivers/dri/i965/gen8_sol_state.c | 16
>  2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index 7e9b285..3f99df9 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -245,17 +245,17 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
>         * point by reading less and offsetting the register index in the
>         * SO_DECLs.
>         */
> -      dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);
>
> -      dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
> -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
> +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
> +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);
>     }
>
>     BEGIN_BATCH(3);
> diff --git a/src/mesa/drivers/dri/i965/gen8_sol_state.c b/src/mesa/drivers/dri/i965/gen8_sol_state.c
> index d98a226..58ead68 100644
> --- a/src/mesa/drivers/dri/i965/gen8_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_sol_state.c
> @@ -125,17 +125,17 @@ gen8_upload_3dstate_streamout(struct brw_context *brw, bool active,
>      * point by reading less and offsetting the register index in the
>      * SO_DECLs.
>      */
> -   dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
> -   dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
> +   dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
> +   dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);
>
> -   dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
> -   dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
> +   dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
> +   dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);
>
> -   dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
> -   dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
> +   dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
> +   dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);
>
> -   dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
> -   dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
> +   dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
> +   dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);
>
>     /* Set buffer pitches; 0 means unbound. */
>     if (xfb_obj->Buffers[0])
> --
> 2.3.4

Reviewed-by: Anuj Phogat <anuj.pho...@gmail.com>
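For context on what the patch buys, here is a sketch of a SET_FIELD-style macro. It assumes the macro pastes _SHIFT and _MASK onto the field name, which is what the call sites in the diff suggest; the EXAMPLE_* field layout is invented for illustration and does not reflect the real SO_STREAM_* bit positions, and the driver's actual definition may differ (for example by asserting on overflow).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define EXAMPLE_READ_OFFSET_SHIFT 5
#define EXAMPLE_READ_OFFSET_MASK  (0x1u << 5)
#define EXAMPLE_READ_LENGTH_SHIFT 0
#define EXAMPLE_READ_LENGTH_MASK  (0x1fu << 0)

/* Shift the value into position, then clip it to the field's mask so an
 * out-of-range value cannot spill into neighbouring fields. */
#define SET_FIELD(value, field) \
   ((((uint32_t)(value)) << field ## _SHIFT) & field ## _MASK)

/* Packs two fields into one dword, mirroring the dw2 pattern in the patch. */
static uint32_t
pack_example_dw2(uint32_t read_offset, uint32_t read_length)
{
   uint32_t dw2 = 0;
   dw2 |= SET_FIELD(read_offset, EXAMPLE_READ_OFFSET);
   dw2 |= SET_FIELD(read_length - 1, EXAMPLE_READ_LENGTH);
   return dw2;
}
```

With a bare shift, an offset of 3 in a one-bit field would silently set the bit above it; with the masked form only the in-field bit survives. That containment is the main practical difference between the two spellings in the diff.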
Re: [Mesa-dev] [PATCH] nir/lower_tex_projector: Don't use designated initializers
Reviewed-by: Mark Janes <mark.a.ja...@intel.com>

Jason Ekstrand <ja...@jlekstrand.net> writes:
> These don't work in MSVC or in older versions of GCC
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89899
> Cc: Eric Anholt <e...@anholt.net>
> ---
>  src/glsl/nir/nir_lower_tex_projector.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/nir/nir_lower_tex_projector.c b/src/glsl/nir/nir_lower_tex_projector.c
> index 6327b23..6b0e9c3 100644
> --- a/src/glsl/nir/nir_lower_tex_projector.c
> +++ b/src/glsl/nir/nir_lower_tex_projector.c
> @@ -109,7 +109,8 @@ nir_lower_tex_projector_block(nir_block *block, void *void_state)
>        /* Now move the later tex sources down the array so that the projector
>         * disappears.
>         */
> -      nir_src dead = {.is_ssa = false, .ssa = NULL};
> +      nir_src dead;
> +      memset(&dead, 0, sizeof dead);
>        nir_instr_rewrite_src(&tex->instr, &tex->src[proj_index].src, dead);
>        memmove(&tex->src[proj_index], &tex->src[proj_index + 1],
> --
> 2.3.4
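The replacement pattern in the patch can be sketched in isolation as follows. The struct here is a deliberately simplified stand-in invented for this example; the real nir_src has more members and a different layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for nir_src, for illustration only. */
struct example_src {
   void *ssa;
   bool is_ssa;
};

/* Portable zero-initialisation: avoids the designated initializer
 * syntax ({.is_ssa = false, .ssa = NULL}) that the commit message says
 * MSVC and older GCC versions reject. */
static struct example_src
make_dead_src(void)
{
   struct example_src dead;
   /* All-bits-zero yields false and, on the platforms Mesa targets,
    * a null pointer. */
   memset(&dead, 0, sizeof dead);
   return dead;
}
```

One side effect worth noting is that memset also clears any padding and any members not named in the old initializer, so the result is at least as "zeroed" as the designated-initializer form.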
Re: [Mesa-dev] [PATCH 1/4] mesa/teximage: use correct extension for accept stencil texture.
On Monday, April 06, 2015 09:44:07 PM Pohjolainen, Topi wrote: On Mon, Apr 06, 2015 at 11:37:08AM -0700, Ian Romanick wrote: On 04/06/2015 08:33 AM, Pohjolainen, Topi wrote: On Sun, Apr 05, 2015 at 08:22:13PM +0300, Pohjolainen, Topi wrote: On Sun, Apr 05, 2015 at 08:06:50PM +0300, Pohjolainen, Topi wrote: On Sun, Apr 05, 2015 at 08:46:16AM -0400, Ilia Mirkin wrote: While this change is correct, the Intel guys will yell at you, because they're somehow misusing this in meta for Broadwell, s.t. this will cause crashes when blitting stencil. IMHO that's a problem that should be fixed in their driver and this can go on, but... it's also not my driver that's crashing -- they might feel differently :) As far as I can tell we only do: _mesa_TexParameteri(target, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX); which suppose to be the right thing to do - we select the stencil to be sampled instead of depth. And this won't hit the path below. I made the change locally and I'm now running piglit on broadwell. I noticed that _mesa_base_tex_format() is in turn used in src/mesa/drivers/common/meta_blit.c but we shouldn't go there with intel driver ever. On hardware older than broadwell we don't use meta and the one used on broadwell and newer is found in: src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c But lets see what piglit says. Right you are. This is more subtle, we will hit it when we actually create a temporary texture out of the given read renderbuffer. It seems that this was hit first time when formats where adjusted and then Jason added the conditional using ARB_stencil_texturing (which is not right either). Really sorry that this is hindering your work now. I'll try to take a look at this tomorrow. So far I can't come up with other things than pure hacks. I'll explain a little what happens in the intel stencil meta blit. 
Like I said, the driver creates a temporary texture out of the stencil attachment: const struct gl_renderbuffer_attachment *att = ctx-ReadBuffer-Attachment[BUFFER_STENCIL]; struct gl_renderbuffer *rb = att-Renderbuffer; struct gl_texture_object *tex_obj; ... if (!_mesa_meta_bind_rb_as_tex_image(ctx, rb, blit-tempTex, tex_obj, target)) { This gets wound back to the driver, a call to intel_bind_renderbuffer_tex_image() which in turn calls the core again. _mesa_init_teximage_fields(ctx, image, rb-Width, rb-Height, 1, 0, rb-InternalFormat, rb-Format); Here rb-InternalFormat is GL_STENCIL_INDEX that won't be accepted by _mesa_base_tex_format() anymore without ARB_texture_stencil8. As most of the texture image setting up logic takes place in the core, the boolean state flag (brw_context::meta_in_progress) we have in intel driver is not much help. It looks that we would need additional driver driven overriding. But I don't like that at all. On the platforms that use this path, don't we fake DEPTH_STENCIL textures by having separate depth and stencil surfaces? The implication being that all of the mechanism that does stencil texturing from DEPTH_STENCIL surfaces is the same as we would need to texture from STENCIL_INDEX8 surfaces. Wouldn't it be easier to just enable ARB_texture_stencil8 on those platforms? I'm sure you would know better than me :) Actually, you're the expert here :) I think that we can just turn on ARB_texture_stencil8 - I just hadn't done the core Mesa plumbing. Why don't we try and do that? signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.
On Mon, Apr 6, 2015 at 5:06 PM, Kenneth Graunke kenn...@whitecape.org wrote:
> This allows those formats to work with the meta PBO upload path.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index 7261c01..7524ad9 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -582,6 +582,14 @@ brw_init_surface_formats(struct brw_context *brw)
>        case BRW_SURFACEFORMAT_L16_FLOAT:
>           render = BRW_SURFACEFORMAT_R16_FLOAT;
>           break;
> +      case BRW_SURFACEFORMAT_I8_UNORM:
> +      case BRW_SURFACEFORMAT_L8_UNORM:
> +         render = BRW_SURFACEFORMAT_R8_UNORM;
> +         break;
> +      case BRW_SURFACEFORMAT_I16_UNORM:
> +      case BRW_SURFACEFORMAT_L16_UNORM:
> +         render = BRW_SURFACEFORMAT_R16_UNORM;
> +         break;
>        case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
>           /* XRGB is handled as ARGB because the chips in this family
>            * cannot render to XRGB targets.  This means that we have to
> --
> 2.3.5

Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
[Mesa-dev] [Bug 89899] nir/nir_lower_tex_projector.c:112: error: unknown field ‘ssa’ specified in initializer
https://bugs.freedesktop.org/show_bug.cgi?id=89899

Jason Ekstrand ja...@jlekstrand.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Jason Ekstrand ja...@jlekstrand.net ---
I just pushed a patch that should fix this. Reopen if it's still a problem.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [PATCH] i915: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Next MSVC versions intends to support __func__. No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/drivers/dri/i915/i830_state.c | 44 src/mesa/drivers/dri/i915/i830_texblend.c |4 +-- src/mesa/drivers/dri/i915/i830_texstate.c |2 +- src/mesa/drivers/dri/i915/i915_program.c |8 ++--- src/mesa/drivers/dri/i915/i915_state.c | 26 +++--- src/mesa/drivers/dri/i915/i915_tex_layout.c|4 +-- src/mesa/drivers/dri/i915/i915_texstate.c |2 +- src/mesa/drivers/dri/i915/i915_vtbl.c |2 +- src/mesa/drivers/dri/i915/intel_blit.c | 10 +++--- src/mesa/drivers/dri/i915/intel_clear.c|2 +- src/mesa/drivers/dri/i915/intel_context.c |2 +- src/mesa/drivers/dri/i915/intel_fbo.c |8 ++--- src/mesa/drivers/dri/i915/intel_mipmap_tree.c | 18 +- src/mesa/drivers/dri/i915/intel_pixel_bitmap.c |2 +- src/mesa/drivers/dri/i915/intel_pixel_copy.c |6 ++-- src/mesa/drivers/dri/i915/intel_pixel_read.c | 12 +++ src/mesa/drivers/dri/i915/intel_regions.c |8 ++--- src/mesa/drivers/dri/i915/intel_render.c |2 +- src/mesa/drivers/dri/i915/intel_state.c|6 ++-- src/mesa/drivers/dri/i915/intel_tex.c | 10 +++--- src/mesa/drivers/dri/i915/intel_tex_copy.c |4 +-- src/mesa/drivers/dri/i915/intel_tex_image.c| 18 +- src/mesa/drivers/dri/i915/intel_tex_subimage.c |2 +- src/mesa/drivers/dri/i915/intel_tris.c | 12 +++ 24 files changed, 107 insertions(+), 107 deletions(-) diff --git a/src/mesa/drivers/dri/i915/i830_state.c b/src/mesa/drivers/dri/i915/i830_state.c index 3e379f3..13adf56 100644 --- a/src/mesa/drivers/dri/i915/i830_state.c +++ b/src/mesa/drivers/dri/i915/i830_state.c @@ -56,7 +56,7 @@ i830StencilFuncSeparate(struct gl_context * ctx, GLenum face, GLenum func, GLint mask = mask 0xff; - DBG(%s : func: %s, ref : 0x%x, mask: 0x%x\n, __FUNCTION__, + DBG(%s : func: %s, ref : 0x%x, mask: 0x%x\n, __func__, 
_mesa_lookup_enum_by_nr(func), ref, mask); @@ -77,7 +77,7 @@ i830StencilMaskSeparate(struct gl_context * ctx, GLenum face, GLuint mask) { struct i830_context *i830 = i830_context(ctx); - DBG(%s : mask 0x%x\n, __FUNCTION__, mask); + DBG(%s : mask 0x%x\n, __func__, mask); mask = mask 0xff; @@ -94,7 +94,7 @@ i830StencilOpSeparate(struct gl_context * ctx, GLenum face, GLenum fail, GLenum struct i830_context *i830 = i830_context(ctx); int fop, dfop, dpop; - DBG(%s: fail : %s, zfail: %s, zpass : %s\n, __FUNCTION__, + DBG(%s: fail : %s, zfail: %s, zpass : %s\n, __func__, _mesa_lookup_enum_by_nr(fail), _mesa_lookup_enum_by_nr(zfail), _mesa_lookup_enum_by_nr(zpass)); @@ -261,7 +261,7 @@ i830BlendColor(struct gl_context * ctx, const GLfloat color[4]) struct i830_context *i830 = i830_context(ctx); GLubyte r, g, b, a; - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); UNCLAMPED_FLOAT_TO_UBYTE(r, color[RCOMP]); UNCLAMPED_FLOAT_TO_UBYTE(g, color[GCOMP]); @@ -315,7 +315,7 @@ i830_set_blend_state(struct gl_context * ctx) break; default: fprintf(stderr, [%s:%u] Invalid RGB blend equation (0x%04x).\n, - __FUNCTION__, __LINE__, ctx-Color.Blend[0].EquationRGB); + __func__, __LINE__, ctx-Color.Blend[0].EquationRGB); return; } @@ -343,7 +343,7 @@ i830_set_blend_state(struct gl_context * ctx) break; default: fprintf(stderr, [%s:%u] Invalid alpha blend equation (0x%04x).\n, - __FUNCTION__, __LINE__, ctx-Color.Blend[0].EquationA); + __func__, __LINE__, ctx-Color.Blend[0].EquationA); return; } @@ -378,7 +378,7 @@ i830_set_blend_state(struct gl_context * ctx) if (0) { fprintf(stderr, [%s:%u] STATE1: 0x%08x IALPHAB: 0x%08x blend is %sabled\n, - __FUNCTION__, __LINE__, i830-state.Ctx[I830_CTXREG_STATE1], + __func__, __LINE__, i830-state.Ctx[I830_CTXREG_STATE1], i830-state.Ctx[I830_CTXREG_IALPHAB], (ctx-Color.BlendEnabled) ? 
en : dis); } @@ -388,7 +388,7 @@ i830_set_blend_state(struct gl_context * ctx) static void i830BlendEquationSeparate(struct gl_context * ctx, GLenum modeRGB, GLenum modeA) { - DBG(%s - %s, %s\n, __FUNCTION__, + DBG(%s - %s, %s\n, __func__, _mesa_lookup_enum_by_nr(modeRGB), _mesa_lookup_enum_by_nr(modeA)); @@ -402,7 +402,7 @@ static void i830BlendFuncSeparate(struct gl_context * ctx, GLenum sfactorRGB, GLenum dfactorRGB, GLenum sfactorA, GLenum dfactorA) { - DBG(%s - RGB(%s, %s) A(%s, %s)\n, __FUNCTION__, + DBG(%s -
[Mesa-dev] [PATCH] glx: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013 redistributable package (RTM version number: 18.0.21005.1).
The next MSVC versions intend to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/glx/apple/apple_glx_log.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glx/apple/apple_glx_log.h b/src/glx/apple/apple_glx_log.h
index 4b1c531..b1a5538 100644
--- a/src/glx/apple/apple_glx_log.h
+++ b/src/glx/apple/apple_glx_log.h
@@ -39,14 +39,14 @@ __printflike(5, 6)
 void _apple_glx_log(int level, const char *file, const char *function, int line, const char *fmt, ...);
 #define apple_glx_log(l, f, args ...) \
-    _apple_glx_log(l, __FILE__, __FUNCTION__, __LINE__, f, ## args)
+    _apple_glx_log(l, __FILE__, __func__, __LINE__, f, ## args)

 __printflike(5, 0)
 void _apple_glx_vlog(int level, const char *file, const char *function, int line, const char *fmt, va_list v);
 #define apple_glx_vlog(l, f, v) \
-    _apple_glx_vlog(l, __FILE__, __FUNCTION__, __LINE__, f, v)
+    _apple_glx_vlog(l, __FILE__, __func__, __LINE__, f, v)

 /* This is just here to help the transition.
  * TODO: Replace calls to apple_glx_diagnostic
--
1.7.9.5
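For readers unfamiliar with the pattern the patch touches, here is a minimal sketch of a logging macro that picks up the caller's name through C99's __func__. The names demo_log, DEMO_LOG, demo_caller and demo_last_msg are made up for illustration and are not part of Mesa or apple_glx_log.h; the message is written to a buffer only so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink: stores "file:line function(): message" in a buffer. */
static char demo_last_msg[256];

static void
demo_log(const char *file, const char *function, int line,
         const char *fmt, ...)
{
   char body[128];
   va_list ap;

   assert(function != NULL);

   va_start(ap, fmt);
   vsnprintf(body, sizeof(body), fmt, ap);
   va_end(ap);

   snprintf(demo_last_msg, sizeof(demo_last_msg), "%s:%d %s(): %s",
            file, line, function, body);
}

/* __func__ is the C99 predefined identifier; __FUNCTION__ is the older
 * compiler extension that the patch replaces. Because the macro expands
 * at the call site, __func__ names the caller, not demo_log itself. */
#define DEMO_LOG(...) demo_log(__FILE__, __func__, __LINE__, __VA_ARGS__)

static const char *
demo_caller(int value)
{
   DEMO_LOG("value is %d", value);
   return demo_last_msg;
}

/* Small self-check: the caller's name must appear in the logged text. */
static void
demo_log_selfcheck(void)
{
   assert(strstr(demo_caller(1), "demo_caller()") != NULL);
}
```

The wrapper has to be a macro rather than a function: __func__ is defined per enclosing function, so a helper function would always report its own name.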
[Mesa-dev] [PATCH] i965: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Next MSVC versions intends to support __func__. No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/drivers/dri/common/utils.c|2 +- src/mesa/drivers/dri/i965/brw_blorp.cpp|2 +- src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |2 +- src/mesa/drivers/dri/i965/brw_context.c|4 +-- src/mesa/drivers/dri/i965/brw_draw_upload.c|2 +- src/mesa/drivers/dri/i965/brw_state_cache.c|4 +-- src/mesa/drivers/dri/i965/brw_tex_layout.c |2 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c |2 +- src/mesa/drivers/dri/i965/gen6_surface_state.c |2 +- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |2 +- src/mesa/drivers/dri/i965/gen8_surface_state.c |2 +- src/mesa/drivers/dri/i965/intel_blit.c |8 +++--- src/mesa/drivers/dri/i965/intel_fbo.c |8 +++--- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 ++-- src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |2 +- src/mesa/drivers/dri/i965/intel_pixel_copy.c |6 ++-- src/mesa/drivers/dri/i965/intel_pixel_draw.c | 14 - src/mesa/drivers/dri/i965/intel_pixel_read.c |8 +++--- src/mesa/drivers/dri/i965/intel_screen.c |4 +-- src/mesa/drivers/dri/i965/intel_tex.c | 10 +++ src/mesa/drivers/dri/i965/intel_tex_copy.c |4 +-- src/mesa/drivers/dri/i965/intel_tex_image.c| 16 +-- src/mesa/drivers/dri/i965/intel_tex_subimage.c |6 ++-- .../dri/i965/test_vec4_register_coalesce.cpp |2 +- 24 files changed, 72 insertions(+), 72 deletions(-) diff --git a/src/mesa/drivers/dri/common/utils.c b/src/mesa/drivers/dri/common/utils.c index bb22107..70d34e8 100644 --- a/src/mesa/drivers/dri/common/utils.c +++ b/src/mesa/drivers/dri/common/utils.c @@ -227,7 +227,7 @@ driCreateConfigs(mesa_format format, break; default: fprintf(stderr, [%s:%u] Unknown framebuffer type %s (%d).\n, - __FUNCTION__, __LINE__, + __func__, __LINE__, _mesa_get_format_name(format), format); return NULL; } 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp index df00b77..3b03f75 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp @@ -194,7 +194,7 @@ intel_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt, } DBG(%s %s to mt %p level %d layer %d\n, - __FUNCTION__, opname, mt, level, layer); + __func__, opname, mt, level, layer); if (brw-gen = 8) { gen8_hiz_exec(brw, mt, level, layer, op); diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp index 644cb41..d25e201 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp @@ -78,7 +78,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw, DBG(%s from %dx %s mt %p %d %d (%f,%f) (%f,%f) to %dx %s mt %p %d %d (%f,%f) (%f,%f) (flip %d,%d)\n, - __FUNCTION__, + __func__, src_mt-num_samples, _mesa_get_format_name(src_mt-format), src_mt, src_level, src_layer, src_x0, src_y0, src_x1, src_y1, dst_mt-num_samples, _mesa_get_format_name(dst_mt-format), dst_mt, diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index ed6fdff..a63d00b 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -722,7 +722,7 @@ brwCreateContext(gl_api api, struct brw_context *brw = rzalloc(NULL, struct brw_context); if (!brw) { - fprintf(stderr, %s: failed to alloc context\n, __FUNCTION__); + fprintf(stderr, %s: failed to alloc context\n, __func__); *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; return false; } @@ -778,7 +778,7 @@ brwCreateContext(gl_api api, if (!_mesa_initialize_context(ctx, api, mesaVis, shareCtx, functions)) { *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; - fprintf(stderr, %s: failed to init mesa context\n, __FUNCTION__); + fprintf(stderr, %s: failed to init mesa context\n, __func__); intelDestroyContext(driContextPriv); return false; } diff --git 
a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index 52dcb6f..623465f 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -413,7 +413,7 @@ brw_prepare_vertices(struct brw_context *brw) } if (0) - fprintf(stderr, %s %d..%d\n, __FUNCTION__, min_index, max_index); + fprintf(stderr, %s %d..%d\n, __func__,
[Mesa-dev] [PATCH] main: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Next MSVC versions intends to support __func__. No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/main/atifragshader.c |4 ++-- src/mesa/main/ffvertex_prog.c |6 +++--- src/mesa/main/format_unpack.py |4 ++-- src/mesa/main/glformats.c |2 +- src/mesa/main/mtypes.h |2 +- src/mesa/main/state.c |2 +- 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c index 9d967b9..9fc3552 100644 --- a/src/mesa/main/atifragshader.c +++ b/src/mesa/main/atifragshader.c @@ -476,7 +476,7 @@ _mesa_PassTexCoordATI(GLuint dst, GLuint coord, GLenum swizzle) curI-swizzle = swizzle; #if MESA_DEBUG_ATI_FS - _mesa_debug(ctx, %s(%s, %s, %s)\n, __FUNCTION__, + _mesa_debug(ctx, %s(%s, %s, %s)\n, __func__, _mesa_lookup_enum_by_nr(dst), _mesa_lookup_enum_by_nr(coord), _mesa_lookup_enum_by_nr(swizzle)); #endif @@ -549,7 +549,7 @@ _mesa_SampleMapATI(GLuint dst, GLuint interp, GLenum swizzle) curI-swizzle = swizzle; #if MESA_DEBUG_ATI_FS - _mesa_debug(ctx, %s(%s, %s, %s)\n, __FUNCTION__, + _mesa_debug(ctx, %s(%s, %s, %s)\n, __func__, _mesa_lookup_enum_by_nr(dst), _mesa_lookup_enum_by_nr(interp), _mesa_lookup_enum_by_nr(swizzle)); #endif diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c index 395b00e..edf7e33 100644 --- a/src/mesa/main/ffvertex_prog.c +++ b/src/mesa/main/ffvertex_prog.c @@ -619,13 +619,13 @@ static void emit_op3fn(struct tnl_program *p, #define emit_op3(p, op, dst, mask, src0, src1, src2) \ - emit_op3fn(p, op, dst, mask, src0, src1, src2, __FUNCTION__, __LINE__) + emit_op3fn(p, op, dst, mask, src0, src1, src2, __func__, __LINE__) #define emit_op2(p, op, dst, mask, src0, src1) \ -emit_op3fn(p, op, dst, mask, src0, src1, undef, __FUNCTION__, __LINE__) +emit_op3fn(p, op, dst, mask, src0, src1, 
undef, __func__, __LINE__) #define emit_op1(p, op, dst, mask, src0) \ -emit_op3fn(p, op, dst, mask, src0, undef, undef, __FUNCTION__, __LINE__) +emit_op3fn(p, op, dst, mask, src0, undef, undef, __func__, __LINE__) static struct ureg make_temp( struct tnl_program *p, struct ureg reg ) diff --git a/src/mesa/main/format_unpack.py b/src/mesa/main/format_unpack.py index 53bdf64..9917548 100644 --- a/src/mesa/main/format_unpack.py +++ b/src/mesa/main/format_unpack.py @@ -333,7 +333,7 @@ _mesa_unpack_rgba_row(mesa_format format, GLuint n, unpack_float_ycbcr_rev(src, dst, n); break; default: - _mesa_problem(NULL, %s: bad format %s, __FUNCTION__, + _mesa_problem(NULL, %s: bad format %s, __func__, _mesa_get_format_name(format)); return; } @@ -402,7 +402,7 @@ _mesa_unpack_uint_rgba_row(mesa_format format, GLuint n, break; %endfor default: - _mesa_problem(NULL, %s: bad format %s, __FUNCTION__, + _mesa_problem(NULL, %s: bad format %s, __func__, _mesa_get_format_name(format)); return; } diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c index 4e05229..8ced579 100644 --- a/src/mesa/main/glformats.c +++ b/src/mesa/main/glformats.c @@ -1393,7 +1393,7 @@ _mesa_base_format_has_channel(GLenum base_format, GLenum pname) return GL_FALSE; default: _mesa_warning(NULL, %s: Unexpected channel token 0x%x\n, - __FUNCTION__, pname); + __func__, pname); return GL_FALSE; } diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 8e1dba6..ab74489 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -4523,7 +4523,7 @@ struct gl_context #ifdef DEBUG extern int MESA_VERBOSE; extern int MESA_DEBUG_FLAGS; -# define MESA_FUNCTION __FUNCTION__ +# define MESA_FUNCTION __func__ #else # define MESA_VERBOSE 0 # define MESA_DEBUG_FLAGS 0 diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c index dadfb3c..ccf83d7 100644 --- a/src/mesa/main/state.c +++ b/src/mesa/main/state.c @@ -507,7 +507,7 @@ _mesa_set_varying_vp_inputs( struct gl_context *ctx, 
ctx-FragmentProgram._TexEnvProgram) { ctx-NewState |= _NEW_VARYING_VP_INPUTS; } - /*printf(%s %x\n, __FUNCTION__, varying_inputs);*/ + /*printf(%s %x\n, __func__, varying_inputs);*/ } } -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fix depth field setting in surface state for raw buffer on Gen7/8
On Mon, Apr 6, 2015 at 10:51 PM, Zhenyu Wang zhen...@linux.intel.com wrote:
> On Gen7/8, for the RAW surface format, the depth field (surf[3]) in the
> surface state holds bits [30:21] of the number of entries, unlike other
> surface formats, which use only bits [26:21].
>
> Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 7 +++++--
>  src/mesa/drivers/dri/i965/gen8_surface_state.c    | 7 +++++--
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> index d9361d3..18bcb8a 100644
> --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> @@ -238,8 +238,11 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
>     surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
>     surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
> -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
> -             (pitch - 1);
> +   if (surface_format == BRW_SURFACEFORMAT_RAW)
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
> +   else
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
> +   surf[3] |= (pitch - 1);
>
>     surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 0007c95..ba59b05 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -129,8 +129,11 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
>     surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
>               SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
> -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
> -             (pitch - 1);
> +   if (surface_format == BRW_SURFACEFORMAT_RAW)
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
> +   else
> +      surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
> +   surf[3] |= (pitch - 1);
>
>     surf[7] = SET_FIELD(HSW_SCS_RED, GEN7_SURFACE_SCS_R) |
>               SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
>               SET_FIELD(HSW_SCS_BLUE, GEN7_SURFACE_SCS_B) |
> --
> 2.1.4

Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
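To make the effect of the wider mask concrete, here is a small sketch of how (number of entries - 1) is split across the width, height and depth pieces, following the masks in the patch. The struct and function names are hypothetical, and the sketch ignores the real field positions inside the surface state dwords (what SET_FIELD handles); it only models the bit ranges described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three pieces of (entries - 1). */
struct demo_buffer_dims {
   uint32_t width;   /* bits  [6:0] of (entries - 1) */
   uint32_t height;  /* bits [20:7] of (entries - 1) */
   uint32_t depth;   /* bits [26:21], or [30:21] for RAW */
};

static struct demo_buffer_dims
demo_split_entries(uint32_t num_entries, bool is_raw)
{
   struct demo_buffer_dims d;
   uint32_t n;

   assert(num_entries > 0);
   n = num_entries - 1;

   d.width  = n & 0x7f;
   d.height = (n >> 7) & 0x3fff;
   /* RAW keeps 10 depth bits; every other format keeps only 6. */
   d.depth  = (n >> 21) & (is_raw ? 0x3ffu : 0x3fu);
   return d;
}

/* Reassemble the entry count the hardware would see from the pieces. */
static uint32_t
demo_join_entries(struct demo_buffer_dims d)
{
   return ((d.depth << 21) | (d.height << 7) | d.width) + 1;
}
```

With the 6-bit mask, a RAW buffer of 2^28 entries would read back as 2^27 entries; with the 10-bit mask it round-trips unchanged. Sizes that fit in 27 bits are unaffected either way.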
[Mesa-dev] [PATCH] state_tracker: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Next MSVC versions intends to support __func__. No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/state_tracker/st_atom.c |2 +- src/mesa/state_tracker/st_atom_constbuf.c |2 +- src/mesa/state_tracker/st_cb_clear.c |2 +- src/mesa/state_tracker/st_cb_texture.c | 18 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +- src/mesa/state_tracker/st_mesa_to_tgsi.c |2 +- src/mesa/state_tracker/st_program.c|2 +- src/mesa/state_tracker/st_texture.c|6 +++--- 8 files changed, 18 insertions(+), 18 deletions(-) diff --git a/src/mesa/state_tracker/st_atom.c b/src/mesa/state_tracker/st_atom.c index f0fe11f..428f2d9 100644 --- a/src/mesa/state_tracker/st_atom.c +++ b/src/mesa/state_tracker/st_atom.c @@ -183,7 +183,7 @@ void st_validate_state( struct st_context *st ) if (state-st == 0) return; - /*printf(%s %x/%x\n, __FUNCTION__, state-mesa, state-st);*/ + /*printf(%s %x/%x\n, __func__, state-mesa, state-st);*/ #ifdef DEBUG if (1) { diff --git a/src/mesa/state_tracker/st_atom_constbuf.c b/src/mesa/state_tracker/st_atom_constbuf.c index 7984bf7..a54e0d9 100644 --- a/src/mesa/state_tracker/st_atom_constbuf.c +++ b/src/mesa/state_tracker/st_atom_constbuf.c @@ -92,7 +92,7 @@ void st_upload_constants( struct st_context *st, if (ST_DEBUG DEBUG_CONSTANTS) { debug_printf(%s(shader=%d, numParams=%d, stateFlags=0x%x)\n, - __FUNCTION__, shader_type, params-NumParameters, + __func__, shader_type, params-NumParameters, params-StateFlags); _mesa_print_parameter_list(params); } diff --git a/src/mesa/state_tracker/st_cb_clear.c b/src/mesa/state_tracker/st_cb_clear.c index dd81a62..f10e906 100644 --- a/src/mesa/state_tracker/st_cb_clear.c +++ b/src/mesa/state_tracker/st_cb_clear.c @@ -247,7 +247,7 @@ clear_with_quad(struct gl_context *ctx, unsigned clear_buffers) 
util_framebuffer_get_num_layers(st-state.framebuffer); /* - printf(%s %s%s%s %f,%f %f,%f\n, __FUNCTION__, + printf(%s %s%s%s %f,%f %f,%f\n, __func__, color ? color, : , depth ? depth, : , stencil ? stencil : , diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c index 5c520b4..6b35d61 100644 --- a/src/mesa/state_tracker/st_cb_texture.c +++ b/src/mesa/state_tracker/st_cb_texture.c @@ -123,7 +123,7 @@ gl_target_to_pipe(GLenum target) static struct gl_texture_image * st_NewTextureImage(struct gl_context * ctx) { - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); (void) ctx; return (struct gl_texture_image *) ST_CALLOC_STRUCT(st_texture_image); } @@ -144,7 +144,7 @@ st_NewTextureObject(struct gl_context * ctx, GLuint name, GLenum target) { struct st_texture_object *obj = ST_CALLOC_STRUCT(st_texture_object); - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); _mesa_initialize_texture_object(ctx, obj-base, name, target); return obj-base; @@ -172,7 +172,7 @@ st_FreeTextureImageBuffer(struct gl_context *ctx, { struct st_texture_image *stImage = st_texture_image(texImage); - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); if (stImage-pt) { pipe_resource_reference(stImage-pt, NULL); @@ -405,7 +405,7 @@ guess_and_alloc_texture(struct st_context *st, GLuint ptWidth, ptHeight, ptDepth, ptLayers; enum pipe_format fmt; - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); assert(!stObj-pt); @@ -473,7 +473,7 @@ guess_and_alloc_texture(struct st_context *st, stObj-lastLevel = lastLevel; - DBG(%s returning %d\n, __FUNCTION__, (stObj-pt != NULL)); + DBG(%s returning %d\n, __func__, (stObj-pt != NULL)); return stObj-pt != NULL; } @@ -496,7 +496,7 @@ st_AllocTextureImageBuffer(struct gl_context *ctx, GLuint height = texImage-Height; GLuint depth = texImage-Depth; - DBG(%s\n, __FUNCTION__); + DBG(%s\n, __func__); assert(!stImage-pt); /* xxx this might be wrong */ @@ -1148,7 +1148,7 @@ st_GetTexImage(struct gl_context * ctx, } if (ST_DEBUG DEBUG_FALLBACK) 
- debug_printf(%s: fallback format translation\n, __FUNCTION__); + debug_printf(%s: fallback format translation\n, __func__); dstMesaFormat = _mesa_format_from_format_and_type(format, type); dstStride = _mesa_image_row_stride(ctx-Pack, width, format, type); @@ -1234,7 +1234,7 @@ fallback_copy_texsubimage(struct gl_context *ctx, struct pipe_transfer *transfer; if (ST_DEBUG DEBUG_FALLBACK) - debug_printf(%s: fallback processing\n, __FUNCTION__); + debug_printf(%s: fallback processing\n, __func__); if (st_fb_orientation(ctx-ReadBuffer) == Y_0_TOP) {
Re: [Mesa-dev] DMA_BUF render targets disabled for intel
On Thu 02 Apr 2015, Axel Davy wrote:
> Hi,
>
> You may be interested in this related bug report:
> https://bugs.freedesktop.org/show_bug.cgi?id=87452#c5
>
> Yours,
>
> Axel Davy
>
> On 02/04/2015 11:58, Volker Vogelhuber wrote:
> > We currently want to stream OpenGL output to an FPGA that does not provide an SG controller and should manage the transfers from the CPU memory to its own hardware. For that reason we want to have the OpenGL driver (intel baytrail) render to a specific memory area within the CPU system.
> >
> > Render to texture, as is possible e.g. on the PowerVR 530, seems not to be possible, as GL_TEXTURE_EXTERNAL_OES is not valid for glFramebufferTexture2D and, in contrast to the PowerVR OpenGL implementation, Mesa seems to prohibit the use of GL_TEXTURE_2D for textures created by glEGLImageTargetTexture2DOES (there is a check within Mesa where glEGLImageTargetTexture2DOES's target has to be equal to the target of the texture = GL_TEXTURE_EXTERNAL_OES != GL_TEXTURE_2D).
> >
> > So the only possible way to render to an EGLImage with memory allocated by myself seems to be to use glEGLImageTargetRenderbufferStorageOES and bind this render buffer to the FBO using glFramebufferRenderbuffer. But for some reason, it seems to be forbidden to use an EGLImage imported from a dmabuf as a render buffer. At least within src/mesa/drivers/dri/i965/intel_fbo.c there is a check:
> >
> >    /* Buffers originating from outside are for read-only. */
> >    if (image->dma_buf_imported) {
> >       _mesa_error(ctx, GL_INVALID_OPERATION,
> >                   "glEGLImageTargetRenderbufferStorage(dma buffers are read-only)");
> >       return;
> >    }
> >
> > This prevents me from doing what I wanted to do and I googled a bit. I found someone else that just removed that check:
> > https://github.com/kalyankondapally/Chromium-OzoneGBM/blob/master/0010-i965-remove-read-only-restriction-of-imported-buffer.patch

That patch isn't safe for general renderbuffer usage... details below. (As an aside, Chrome OS also has a similar patch in their Mesa tree. But it's safe for Chrome OS, at least for now).

> > and after I did so myself, it just worked as I wanted it to work. I only wonder why this limitation has been added. Is it just for some pedantic reasons or is there any good reason why EGLImages imported from dmabuf descriptors shouldn't be used for render targets?

There is a very good reason. It is not pedantic. And Tapani (CC'd) and I are working on enabling this. See [https://bugs.freedesktop.org/show_bug.cgi?id=87452#c7] for my work-in-progress patches.

The reason is that, on Intel chipsets Ivybridge and newer, the i965 driver often expects each color buffer to have an auxiliary metadata buffer that holds compression information. If the aux buffer does not exist, i965 will create it. If the metadata buffer and the real color buffer become unsynchronized (which is *very* likely when using a dma_buf as renderbuffer storage), you will get corrupt rendering. If you haven't got corrupt rendering, it's solely due to luck (and that luck is proportional to the density of cleared pixels in the buffer).

Therefore, i965 needs to be taught to disable aux buffers for dma_buf-backed storage. Before that happens, you risk corrupted images if you render to a dma_buf-backed renderbuffer.

If you apply Kalyan's patch on top of my (untested) patches, then that should safely enable what you're doing with the FPGA. (There may still be bugs with EGLImage orphaning semantics, but that likely won't affect you).
[Mesa-dev] [PATCH v2] i965/fs: Always invert predicate of SEL with swapped arguments
From: Ian Romanick ian.d.roman...@intel.com

Commit b616164 added an optimization of b2f generation of a comparison. It also included an extra optimization for the case where one of the comparison values is a constant zero. The trick was that some value was known to be zero, so that value could be used in the SEL instruction instead of potentially loading 0.0 into a register. This change switched the order of the arguments to the SEL, and, for some unknown reason, I thought that the predicate should therefore only be inverted for the == case. Clearly, it should always be inverted.

Fixes piglit fs-notEqual-of-expression.shader_test and fs-equal-of-expression.shader_test.

v2: Don't do the "register already has zero" optimization for the '== 0' case. In that case, the register does not have zero when we want to produce a zero result.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89722
Reviewed-by: Kenneth Graunke kenn...@whitecape.org [v1]
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index e6fb0cb..da0a08d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -502,15 +502,15 @@ fs_visitor::try_emit_b2f_of_comparison(ir_expression *ir)
     * and(16)         g4<1>D          g2<8,8,1>D      1D
     * and(16)         m6<1>D          -g4<8,8,1>D     0x3f800000UD
     *
-    * When the comparison is either == 0.0 or != 0.0 using the knowledge that
-    * the true (or false) case already results in zero would allow better code
-    * generation by possibly avoiding a load-immediate instruction.
+    * When the comparison is != 0.0 using the knowledge that the false case
+    * already results in zero would allow better code generation by possibly
+    * avoiding a load-immediate instruction.
     */
    ir_expression *cmp = ir->operands[0]->as_expression();
    if (cmp == NULL)
       return false;

-   if (cmp->operation == ir_binop_equal || cmp->operation == ir_binop_nequal) {
+   if (cmp->operation == ir_binop_nequal) {
       for (unsigned i = 0; i < 2; i++) {
          ir_constant *c = cmp->operands[i]->as_constant();
          if (c == NULL || !c->is_zero())
@@ -538,7 +538,7 @@ fs_visitor::try_emit_b2f_of_comparison(ir_expression *ir)
             fs_inst *inst = emit(SEL(this->result, op[i ^ 1], fs_reg(1.0f)));
             inst->predicate = BRW_PREDICATE_NORMAL;
-            inst->predicate_inverse = cmp->operation == ir_binop_equal;
+            inst->predicate_inverse = true;
             return true;
          }
       }
--
2.1.0
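The reasoning in the commit message can be checked with a tiny model of a predicated select. This is only a sketch under the assumption that SEL picks its first source when the (possibly inverted) flag is set and its second source otherwise; demo_sel and the other names are hypothetical and are not the i965 backend API.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a predicated select: src0 when the effective predicate holds,
 * src1 otherwise. predicate_inverse flips the flag before selecting. */
static float
demo_sel(bool flag, bool predicate_inverse, float src0, float src1)
{
   bool pick_src0 = predicate_inverse ? !flag : flag;
   return pick_src0 ? src0 : src1;
}

/* b2f(x != 0.0f) written the obvious way, used as the reference. */
static float
demo_b2f_nequal_reference(float x)
{
   return x != 0.0f ? 1.0f : 0.0f;
}

/* The optimized form with swapped arguments: x itself stands in for the
 * 0.0 operand, because x is known to be zero exactly when the comparison
 * is false. The constant 1.0 is now the second source. */
static float
demo_b2f_nequal_swapped(float x, bool predicate_inverse)
{
   bool flag = (x != 0.0f);
   return demo_sel(flag, predicate_inverse, x, 1.0f);
}

/* Self-check: only the inverted predicate matches the reference. */
static void
demo_b2f_selfcheck(void)
{
   assert(demo_b2f_nequal_swapped(0.0f, true) == demo_b2f_nequal_reference(0.0f));
   assert(demo_b2f_nequal_swapped(2.5f, true) == demo_b2f_nequal_reference(2.5f));
}
```

For the == case the same trick cannot work: the flag is false exactly when x is non-zero, so x is not a usable source of 0.0 for the false result, which is why v2 drops that case.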
[Mesa-dev] [PATCH] gallium/ttn: add support for temp arrays
From: Rob Clark robcl...@freedesktop.org

Since the rest of NIR really would rather have these as variables rather than registers, create a nir_variable per array. But rather than completely re-arrange ttn to be variable based rather than register based, keep the registers. In the cases where there is a matching var for the reg, ttn_emit_instruction will append the appropriate intrinsic to get things back from the shadow reg into the variable.

NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give an array id. But those just kinda suck, and should really go away. AFAICT we don't get those from glsl. Might be an issue for some other state tracker.

v2: rework to use load_var/store_var with deref chains

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 122 +++-
 1 file changed, 103 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index da935a4..f4c0bad 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -44,6 +44,7 @@
 struct ttn_reg_info {
    /** nir register containing this TGSI index. */
    nir_register *reg;
+   nir_variable *var;
    /** Offset (in vec4s) from the start of var for this TGSI index. */
    int offset;
 };
@@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)
    if (file == TGSI_FILE_TEMPORARY) {
       nir_register *reg;
-      if (c->scan->indirect_files & (1 << file)) {
+      nir_variable *var = NULL;
+
+      if (decl->Declaration.Array) {
+         /* for arrays, the register created just serves as a
+          * shadow register.  We append intrinsic_store_global
+          * after the tgsi instruction is translated to move
+          * back from the shadow register to the variable
+          */
+         var = rzalloc(b->shader, nir_variable);
+
+         var->type = glsl_array_type(glsl_vec4_type(), array_size);
+         var->data.mode = nir_var_global;
+         var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
+
+         exec_list_push_tail(&b->shader->globals, &var->node);
+      }
+
+      for (i = 0; i < array_size; i++) {
          reg = nir_local_reg_create(b->impl);
          reg->num_components = 4;
-         reg->num_array_elems = array_size;
-
-         for (i = 0; i < array_size; i++) {
-            c->temp_regs[decl->Range.First + i].reg = reg;
-            c->temp_regs[decl->Range.First + i].offset = i;
-         }
-      } else {
-         for (i = 0; i < array_size; i++) {
-            reg = nir_local_reg_create(b->impl);
-            reg->num_components = 4;
-            c->temp_regs[decl->Range.First + i].reg = reg;
-            c->temp_regs[decl->Range.First + i].offset = 0;
-         }
+         c->temp_regs[decl->Range.First + i].reg = reg;
+         c->temp_regs[decl->Range.First + i].var = var;
+         c->temp_regs[decl->Range.First + i].offset = i;
       }
    } else if (file == TGSI_FILE_ADDRESS) {
       c->addr_reg = nir_local_reg_create(b->impl);
@@ -245,6 +253,32 @@ ttn_emit_immediate(struct ttn_compile *c)
 static nir_src *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect);

+/* generate either a constant or indirect deref chain for accessing an
+ * array variable.
+ */
+static nir_deref_var *
+ttn_array_deref(struct ttn_compile *c, nir_variable *var, unsigned offset,
+                struct tgsi_ind_register *indirect)
+{
+   nir_builder *b = &c->build;
+   nir_deref_var *deref = nir_deref_var_create(b->shader, var);
+   nir_deref_array *arr = nir_deref_array_create(b->shader);
+
+   arr->base_offset = offset;
+   arr->deref.type = glsl_get_array_element(var->type);
+
+   if (indirect) {
+      arr->deref_array_type = nir_deref_array_type_indirect;
+      arr->indirect = nir_src_for_reg(c->addr_reg);
+   } else {
+      arr->deref_array_type = nir_deref_array_type_direct;
+   }
+
+   deref->deref.child = &arr->deref;
+
+   return deref;
+}
+
 static nir_src
 ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
                            struct tgsi_ind_register *indirect)
@@ -256,10 +290,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
    switch (file) {
    case TGSI_FILE_TEMPORARY:
-      src.reg.reg = c->temp_regs[index].reg;
-      src.reg.base_offset = c->temp_regs[index].offset;
-      if (indirect)
-         src.reg.indirect = ttn_src_for_indirect(c, indirect);
+      if (c->temp_regs[index].var) {
+         unsigned offset = c->temp_regs[index].offset;
+         nir_variable *var = c->temp_regs[index].var;
+         nir_intrinsic_instr *load;
+
+         load = nir_intrinsic_instr_create(b->shader,
+                                           nir_intrinsic_load_var);
+         load->num_components = 4;
+         load->variables[0] = ttn_array_deref(c, var, offset, indirect);
+
+         nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
+
Re: [Mesa-dev] [PATCH] i965: replace __FUNCTION__ with __func__
On Tue, Apr 7, 2015 at 12:05 PM, Marius Predut marius.pre...@intel.com wrote: Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Presumably not, since i965 isn't built with MSVC :) But yeah, the patch looks good. I'll collect all of the __func__ patches and commit them later assuming everything else looks good. Thanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Value Range Propagation in NIR (GSoC)
Hi,

For those that don't know, I've submitted a proposal for this year's GSoC. I've proposed to implement value range propagation and loop unrolling in NIR. Since I'm no expert on compilers, I've read up on some literature:

I started with "Constant propagation with conditional branches" (thanks Connor). This paper describes an algorithm, sparse conditional constant propagation (SCCP), that seems to be the de facto standard in compilers today.

I also found the paper "Accurate static branch prediction by value range propagation" (VRP). This describes a value range propagation implementation based on SCCP. (This also allows one to set heuristics to calculate educated guesses for the probability of a certain branch, but that's probably more than we're interested in.)

There is also a GCC paper (with whatever licensing issues may apply), "A propagation engine for GCC". They have a shared engine for doing all propagation passes. It handles the worklists and the logic to traverse them. The implementing passes then supply callbacks to define the lattice rules. They report back whether the instruction was interesting or not, and the propagation engine basically handles the rest. Maybe that's an interesting solution? Or it might not be worth the hassle? We already have copy propagation, and with value range propagation we probably don't want separate constant propagation? (I'm hoping to write the pass so that it handles both constants and value ranges.) The GCC guys have used this engine to get copy propagation that propagates copies across conditionals; maybe this makes such a solution more interesting?

Connor: I just remembered you saying something about your freedesktop git repo, so I poked around some and found that you have already done some work on VRP based on SCCP? How far did you get?

If we just want to make an SCCP-inspired VRP pass, then Connor has work in progress. Finishing that, and loop unrolling, may not be enough work for GSoC? Or maybe Connor wants to finish it off himself, and I should spend my time implementing some other pass instead, alongside loop unrolling?

Realising Connor has partially started on this, I thought it was a good idea to get some feedback and ideas from others (in case I need to change my proposal). All suggestions, ideas and opinions are more than welcome. Fire at will, I'm all ears =)

Regards,
Thomas
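To make the "shared engine plus callbacks" idea concrete, here is a minimal sketch in C. It is not GCC's engine and not NIR code: `struct node`, `struct lattice`, `propagate` and `visit_const_prop` are hypothetical names invented for illustration, the instruction set is reduced to constants, unknown inputs and adds, and users are found by a linear scan rather than by real use lists. The engine owns the worklist; the pass only supplies a callback that recomputes one lattice value and reports whether it changed.

```c
#include <stdbool.h>

#define MAX_NODES 16

/* Hypothetical instruction: a constant, an unknown input, or an add. */
enum node_kind { NODE_CONST, NODE_INPUT, NODE_ADD };

struct node {
   enum node_kind kind;
   int imm;      /* value for NODE_CONST */
   int src[2];   /* operand indices for NODE_ADD */
};

/* Three-level constant lattice: undefined, constant, varying. */
enum lattice_kind { LAT_UNDEF, LAT_CONST, LAT_VARYING };

struct lattice {
   enum lattice_kind kind;
   int value;
};

/* Pass-supplied callback: recompute the lattice value of node i and
 * return true if it changed (the node was "interesting"). */
typedef bool (*visit_fn)(const struct node *nodes, int i, struct lattice *lat);

/* Shared engine: owns the worklist and re-queues the users of any node
 * whose lattice value changed. Requires count <= MAX_NODES. */
static void
propagate(const struct node *nodes, int count, struct lattice *lat,
          visit_fn visit)
{
   int queue[MAX_NODES];
   bool queued[MAX_NODES];
   int head = 0, pending = count;

   for (int i = 0; i < count; i++) {
      lat[i].kind = LAT_UNDEF;
      lat[i].value = 0;
      queue[i] = i;
      queued[i] = true;
   }

   while (pending > 0) {
      int i = queue[head];
      head = (head + 1) % MAX_NODES;
      pending--;
      queued[i] = false;

      if (!visit(nodes, i, lat))
         continue;

      /* Linear scan for users; a real IR would walk a use list instead. */
      for (int u = 0; u < count; u++) {
         bool uses = nodes[u].kind == NODE_ADD &&
                     (nodes[u].src[0] == i || nodes[u].src[1] == i);
         if (uses && !queued[u]) {
            queue[(head + pending) % MAX_NODES] = u;
            pending++;
            queued[u] = true;
         }
      }
   }
}

/* Example pass: constant propagation over the three-level lattice. */
static bool
visit_const_prop(const struct node *nodes, int i, struct lattice *lat)
{
   struct lattice result;
   result.kind = LAT_UNDEF;
   result.value = 0;

   switch (nodes[i].kind) {
   case NODE_CONST:
      result.kind = LAT_CONST;
      result.value = nodes[i].imm;
      break;
   case NODE_INPUT:
      result.kind = LAT_VARYING;
      break;
   default: { /* NODE_ADD */
      struct lattice a = lat[nodes[i].src[0]];
      struct lattice b = lat[nodes[i].src[1]];
      if (a.kind == LAT_VARYING || b.kind == LAT_VARYING) {
         result.kind = LAT_VARYING;
      } else if (a.kind == LAT_CONST && b.kind == LAT_CONST) {
         result.kind = LAT_CONST;
         result.value = a.value + b.value;
      }
      break;
   }
   }

   bool changed = result.kind != lat[i].kind || result.value != lat[i].value;
   lat[i] = result;
   return changed;
}
```

A value range pass could, in principle, reuse `propagate` unchanged and swap in a different lattice and callback, which is the appeal of the shared-engine design. Whether that generality pays off in NIR is exactly the open question in the message above.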
[Mesa-dev] [PATCH 6/5] nir: Make nir_*_instr_create take a nir_shader instead of a void * context
Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com --- src/glsl/nir/nir.c | 36 ++-- src/glsl/nir/nir.h | 18 +- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index 1c6b603..c6e5361 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -381,11 +381,11 @@ alu_src_init(nir_alu_src *src) } nir_alu_instr * -nir_alu_instr_create(void *mem_ctx, nir_op op) +nir_alu_instr_create(nir_shader *shader, nir_op op) { unsigned num_srcs = nir_op_infos[op].num_inputs; nir_alu_instr *instr = - ralloc_size(mem_ctx, + ralloc_size(shader, sizeof(nir_alu_instr) + num_srcs * sizeof(nir_alu_src)); instr_init(instr-instr, nir_instr_type_alu); @@ -398,18 +398,18 @@ nir_alu_instr_create(void *mem_ctx, nir_op op) } nir_jump_instr * -nir_jump_instr_create(void *mem_ctx, nir_jump_type type) +nir_jump_instr_create(nir_shader *shader, nir_jump_type type) { - nir_jump_instr *instr = ralloc(mem_ctx, nir_jump_instr); + nir_jump_instr *instr = ralloc(shader, nir_jump_instr); instr_init(instr-instr, nir_instr_type_jump); instr-type = type; return instr; } nir_load_const_instr * -nir_load_const_instr_create(void *mem_ctx, unsigned num_components) +nir_load_const_instr_create(nir_shader *shader, unsigned num_components) { - nir_load_const_instr *instr = ralloc(mem_ctx, nir_load_const_instr); + nir_load_const_instr *instr = ralloc(shader, nir_load_const_instr); instr_init(instr-instr, nir_instr_type_load_const); nir_ssa_def_init(instr-instr, instr-def, num_components, NULL); @@ -418,11 +418,11 @@ nir_load_const_instr_create(void *mem_ctx, unsigned num_components) } nir_intrinsic_instr * -nir_intrinsic_instr_create(void *mem_ctx, nir_intrinsic_op op) +nir_intrinsic_instr_create(nir_shader *shader, nir_intrinsic_op op) { unsigned num_srcs = nir_intrinsic_infos[op].num_srcs; nir_intrinsic_instr *instr = - ralloc_size(mem_ctx, + ralloc_size(shader, sizeof(nir_intrinsic_instr) + num_srcs * sizeof(nir_src)); instr_init(instr-instr, 
nir_instr_type_intrinsic); @@ -438,9 +438,9 @@ nir_intrinsic_instr_create(void *mem_ctx, nir_intrinsic_op op) } nir_call_instr * -nir_call_instr_create(void *mem_ctx, nir_function_overload *callee) +nir_call_instr_create(nir_shader *shader, nir_function_overload *callee) { - nir_call_instr *instr = ralloc(mem_ctx, nir_call_instr); + nir_call_instr *instr = ralloc(shader, nir_call_instr); instr_init(instr-instr, nir_instr_type_call); instr-callee = callee; @@ -452,9 +452,9 @@ nir_call_instr_create(void *mem_ctx, nir_function_overload *callee) } nir_tex_instr * -nir_tex_instr_create(void *mem_ctx, unsigned num_srcs) +nir_tex_instr_create(nir_shader *shader, unsigned num_srcs) { - nir_tex_instr *instr = ralloc(mem_ctx, nir_tex_instr); + nir_tex_instr *instr = ralloc(shader, nir_tex_instr); instr_init(instr-instr, nir_instr_type_tex); dest_init(instr-dest); @@ -472,9 +472,9 @@ nir_tex_instr_create(void *mem_ctx, unsigned num_srcs) } nir_phi_instr * -nir_phi_instr_create(void *mem_ctx) +nir_phi_instr_create(nir_shader *shader) { - nir_phi_instr *instr = ralloc(mem_ctx, nir_phi_instr); + nir_phi_instr *instr = ralloc(shader, nir_phi_instr); instr_init(instr-instr, nir_instr_type_phi); dest_init(instr-dest); @@ -483,9 +483,9 @@ nir_phi_instr_create(void *mem_ctx) } nir_parallel_copy_instr * -nir_parallel_copy_instr_create(void *mem_ctx) +nir_parallel_copy_instr_create(nir_shader *shader) { - nir_parallel_copy_instr *instr = ralloc(mem_ctx, nir_parallel_copy_instr); + nir_parallel_copy_instr *instr = ralloc(shader, nir_parallel_copy_instr); instr_init(instr-instr, nir_instr_type_parallel_copy); exec_list_make_empty(instr-entries); @@ -494,9 +494,9 @@ nir_parallel_copy_instr_create(void *mem_ctx) } nir_ssa_undef_instr * -nir_ssa_undef_instr_create(void *mem_ctx, unsigned num_components) +nir_ssa_undef_instr_create(nir_shader *shader, unsigned num_components) { - nir_ssa_undef_instr *instr = ralloc(mem_ctx, nir_ssa_undef_instr); + nir_ssa_undef_instr *instr = ralloc(shader, 
nir_ssa_undef_instr); instr_init(instr-instr, nir_instr_type_ssa_undef); nir_ssa_def_init(instr-instr, instr-def, num_components, NULL); diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 0f72301..f9ca0f7 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1480,26 +1480,26 @@ void nir_metadata_require(nir_function_impl *impl, nir_metadata required); void nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved); /** creates an instruction with default swizzle/writemask/etc. with NULL registers */ -nir_alu_instr *nir_alu_instr_create(void *mem_ctx, nir_op op); +nir_alu_instr *nir_alu_instr_create(nir_shader *shader, nir_op op); -nir_jump_instr *nir_jump_instr_create(void *mem_ctx,
Re: [Mesa-dev] [PATCH] gallium/ttn: add support for temp arrays
On Tue, Apr 7, 2015 at 1:30 PM, Rob Clark robdcl...@gmail.com wrote: From: Rob Clark robcl...@freedesktop.org Since the rest of NIR really would rather have these as variables rather than registers, create a nir_variable per array. But rather than completely re-arrange ttn to be variable based rather than register based, keep the registers. In the cases where there is a matching var for the reg, ttn_emit_instruction will append the appropriate intrinsic to get things back from the shadow reg into the variable. NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give an array id. But those just kinda suck, and should really go away. AFAICT we don't get those from glsl. Might be an issue for some other state tracker. v2: rework to use load_var/store_var with deref chains Signed-off-by: Rob Clark robcl...@freedesktop.org --- src/gallium/auxiliary/nir/tgsi_to_nir.c | 122 +++- 1 file changed, 103 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c index da935a4..f4c0bad 100644 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c @@ -44,6 +44,7 @@ struct ttn_reg_info { /** nir register containing this TGSI index. */ nir_register *reg; + nir_variable *var; /** Offset (in vec4s) from the start of var for this TGSI index. */ int offset; }; @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c) if (file == TGSI_FILE_TEMPORARY) { nir_register *reg; - if (c-scan-indirect_files (1 file)) { + nir_variable *var = NULL; + + if (decl-Declaration.Array) { + /* for arrays, the register created just serves as a + * shadow register. 
We append intrinsic_store_global + * after the tgsi instruction is translated to move + * back from the shadow register to the variable + */ + var = rzalloc(b-shader, nir_variable); + + var-type = glsl_array_type(glsl_vec4_type(), array_size); + var-data.mode = nir_var_global; + var-name = ralloc_asprintf(var, arr_%d, decl-Array.ArrayID); + + exec_list_push_tail(b-shader-globals, var-node); + } + + for (i = 0; i array_size; i++) { reg = nir_local_reg_create(b-impl); reg-num_components = 4; - reg-num_array_elems = array_size; - - for (i = 0; i array_size; i++) { -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = i; - } - } else { - for (i = 0; i array_size; i++) { -reg = nir_local_reg_create(b-impl); -reg-num_components = 4; -c-temp_regs[decl-Range.First + i].reg = reg; -c-temp_regs[decl-Range.First + i].offset = 0; - } + c-temp_regs[decl-Range.First + i].reg = reg; + c-temp_regs[decl-Range.First + i].var = var; + c-temp_regs[decl-Range.First + i].offset = i; } } else if (file == TGSI_FILE_ADDRESS) { c-addr_reg = nir_local_reg_create(b-impl); @@ -245,6 +253,32 @@ ttn_emit_immediate(struct ttn_compile *c) static nir_src * ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect); +/* generate either a constant or indirect deref chain for accessing an + * array variable. + */ +static nir_deref_var * +ttn_array_deref(struct ttn_compile *c, nir_variable *var, unsigned offset, +struct tgsi_ind_register *indirect) +{ + nir_builder *b = c-build; + nir_deref_var *deref = nir_deref_var_create(b-shader, var); + nir_deref_array *arr = nir_deref_array_create(b-shader); As per code Ken just pushed today, deref_var's need to be created with the instruction as the memory context and all other derefs need to be created with the parent deref as the memory context. The validator will assert-fail if you don't. Other than that, this looks good to me. I can't speak for the other TTN bits but it looks ok. 
For whatever it's worth, Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com + + arr-base_offset = offset; + arr-deref.type = glsl_get_array_element(var-type); + + if (indirect) { + arr-deref_array_type = nir_deref_array_type_indirect; + arr-indirect = nir_src_for_reg(c-addr_reg); + } else { + arr-deref_array_type = nir_deref_array_type_direct; + } + + deref-deref.child = arr-deref; + + return deref; +} + static nir_src ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, struct tgsi_ind_register *indirect) @@ -256,10 +290,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index, switch (file) { case TGSI_FILE_TEMPORARY: - src.reg.reg = c-temp_regs[index].reg; - src.reg.base_offset =
Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES < 3.00
On 04/07/2015 03:22 AM, Francisco Jerez wrote:
> Tapani Pälli tapani.pa...@intel.com writes:
>
>> From: Kalyan Kondapally kalyan.kondapa...@intel.com
>>
>> Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00. Earlier versions allow 'constant-index-expression' indexing, where index can contain a loop induction variable.
>>
>> Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00. This change makes 'sampler-array-index.frag' parser test in Piglit pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.
>>
>> v2: small change and some more commit message (Tapani)
>>
>> Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225
>
> Looks good, but did you check what happens now if the shader uses actual variable indexing (i.e. which lowering cannot turn into a constant) on an implementation that doesn't support it? Hopefully no crashes or hangs?

I think we should add a post-link check that no dynamic indexing remains after all the optimizations are complete. The intention of the ES2 language was to allow cases where the dynamic indexing could be optimized away. This was redacted in ES3 because each optimizer was differently capable, so a shader that worked on one driver/GPU might fail on another... even from the same vendor.

Adding the post-link check should prevent the problems that Curro (rightly) worried about, and it should still allow the WebGL demo to work.

>> ---
>>  src/glsl/ast_array_index.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
>> index ecef651..b2609b6 100644
>> --- a/src/glsl/ast_array_index.cpp
>> +++ b/src/glsl/ast_array_index.cpp
>> @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>>         * dynamically uniform expression is undefined.
>>         */
>>        if (array->type->element_type()->is_sampler()) {
>> -         if (!state->is_version(130, 100)) {
>> +         if (!state->is_version(130, 300)) {
>>              if (state->es_shader) {
>>                 _mesa_glsl_warning(&loc, state,
>>                                    "sampler arrays indexed with non-constant "

It looks like this is what e3ded7f should have made this code.

Looking at the rest of the surrounding code, I don't think this is quite right... at the very least, it's not easy to follow. You can blame me and Paul for that. I think this is correct and easier to follow:

   if (!state->is_version(400, 0) && !state->ARB_gpu_shader5_enable) {
      if (state->is_version(130, 300))
         _mesa_glsl_error(&loc, state,
                          "sampler arrays indexed with non-constant "
                          "expressions are forbidden in GLSL %s "
                          "and later",
                          state->es_shader ? "ES 3.00" : "1.30");
      else if (state->es_shader)
         _mesa_glsl_warning(&loc, state,
                            "sampler arrays indexed with non-constant "
                            "expressions are optional in %s and will "
                            "be forbidden in GLSL ES 3.00 and later",
                            state->version_string());
      else
         _mesa_glsl_warning(&loc, state,
                            "sampler arrays indexed with non-constant "
                            "expressions will be forbidden in GLSL "
                            "1.30 and later");
   }

>> --
>> 2.1.0
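The branch structure of the suggested rewrite can be sketched in plain C to make the version cases easier to check. This is an illustration only, not the Mesa code: `struct shader_state`, `sampler_array_index_diag` and the `DIAG_*` names are hypothetical, and the `is_version()` helper below assumes that its first argument is the minimum desktop GLSL version, its second is the minimum GLSL ES version, and that 0 means "never in this flavor".

```c
#include <stdbool.h>

enum index_diag { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

/* Hypothetical stand-in for the parser state; only the fields the
 * check needs. */
struct shader_state {
   bool es_shader;
   unsigned language_version;   /* e.g. 100 or 300 for ES, 120/130/400 for desktop */
   bool arb_gpu_shader5;
};

/* Assumed semantics: compare against the ES or desktop minimum, where a
 * minimum of 0 means the feature is never available in that flavor. */
static bool
is_version(const struct shader_state *s, unsigned glsl, unsigned glsl_es)
{
   unsigned required = s->es_shader ? glsl_es : glsl;
   return required != 0 && s->language_version >= required;
}

/* Which diagnostic a non-constant sampler array index would produce,
 * following the if/else ladder proposed in the review. */
static enum index_diag
sampler_array_index_diag(const struct shader_state *s)
{
   if (is_version(s, 400, 0) || s->arb_gpu_shader5)
      return DIAG_NONE;      /* dynamic indexing is allowed outright */
   if (is_version(s, 130, 300))
      return DIAG_ERROR;     /* forbidden in GLSL 1.30+ and GLSL ES 3.00+ */
   return DIAG_WARNING;      /* older versions: optional, so only warn */
}
```

Under these assumptions, GLSL ES 1.00 and desktop GLSL 1.20 shaders get a warning, GLSL ES 3.00 and desktop GLSL 1.30 shaders get an error, and GLSL 4.00 or ARB_gpu_shader5 shaders get no diagnostic.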
Re: [Mesa-dev] [PATCH] i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.
On 04/06/2015 05:06 PM, Kenneth Graunke wrote:
> This allows those formats to work with the meta PBO upload path.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index 7261c01..7524ad9 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -582,6 +582,14 @@ brw_init_surface_formats(struct brw_context *brw)
>        case BRW_SURFACEFORMAT_L16_FLOAT:
>           render = BRW_SURFACEFORMAT_R16_FLOAT;
>           break;
> +      case BRW_SURFACEFORMAT_I8_UNORM:
> +      case BRW_SURFACEFORMAT_L8_UNORM:
> +         render = BRW_SURFACEFORMAT_R8_UNORM;
> +         break;

I wasn't sure this was correct, so I spent some time digging in the GL spec. Table 3.15 on page 179 (page 195 of the PDF) of the OpenGL 3.0 spec shows that this mapping is correct.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> +      case BRW_SURFACEFORMAT_I16_UNORM:
> +      case BRW_SURFACEFORMAT_L16_UNORM:
> +         render = BRW_SURFACEFORMAT_R16_UNORM;
> +         break;
>        case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
>           /* XRGB is handled as ARGB because the chips in this family
>            * cannot render to XRGB targets. This means that we have to
Re: [Mesa-dev] Value Range Propagation in NIR (GSoC)
Hi Thomas, Thanks for submitting a proposal! Some comments/answers below. On Tue, Apr 7, 2015 at 3:34 PM, Thomas Helland thomashellan...@gmail.com wrote: Hi, For those that don't know I've submitted a proposal for this years GSoC. I've proposed to implement value range propagation and loop unrolling in NIR. Since I'm no expert on compilers I've read up on some litterature: I started with Constant propagation with conditional branches (thanks Connor). This paper describes an algorithm, sparse conditional constant propagation, that seems to be the defacto standard in compilers today. I also found the paper; Accurate static branch prediction by value range propagation (VRP). This describes a value range propagation implementation based on SCCP. (This also allows one to set heuristics to calculate educated guesses for the probability of a certain branch, but that's probably more than we're interested in.) Thanks for mentioning that... I had forgotten the name of that paper. You're right in that the branch probability stuff isn't too useful for us. Also, it raises an important issue about back-edges from phi nodes; they present a more sophisticated method to handle it, but I think that for now we can just force back edges to have an infinite range unless they're constant. There is also a GCC paper (with whatever licensing issues that may apply); A propagation engine for GCC. They have a shared engine for doing all propagation passes. It handles the worklists, and the logic to traverse these. The implementing passes then supply callbacks to define the lattice rules. They reply back if the instruction was interesting or not, and the propagation engine basically handles the rest. Maybe that's an interesting solution? Or it might not be worth the hassle? We already have copy propagation, and with value range propagation we probably don't want separate constant propagation? (I'm hoping to write the pass so that it handles both constants and value ranges.) 
Yes, copy propagation probably won't be so useful once we have value range propagation; the former is a special case of the latter. Note that we have a nifty way of actually doing the constant folding (nir_constant_expressions.py and nir_constant_expressions.h), which you should still use if all the inputs are constant. The GCC guys have used this engine to get copy propagation that propagates copies accross conditionals, maybe this makes such a solution more interesting? I'm not so sure how useful such a general framework will be. Constant propagation that handles back-edges seems interesting, but I'm not sure it's worth the time to implement something this general as a first pass. Connor: I just remembered you saying something about your freedesktop git repo, so I poked around some and found that you have already done some work on VRP based on SCCP? How far did you get? I started on it, but then I realized that the approach I was using was too cumbersome/complicated so I don't think what I have is too useful. Feel free to work on it yourself, although Jason and I have discussed it so we have some ideas of how to do it. I've written a few notes on this below that you may find useful. - I have a branch I created while working on VRP that you'll probably find useful: http://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-worklist . The first two commits are already in master, but the last two should be useful for implementing SCCP/VRP (although they'll need to be rebased, obviously). - There's a comment in the SCCP paper (5.3, Nodes versus Edges) that says: An alternative way of implementing this would be to add nodes to the graph and then associate an ExecutableFlag with each node. An additional node must be inserted between any node that has more than one immediate successor and any successor node that has more than one immediate predecessor. 
I think this procedure is what's usually called splitting critical edges; in NIR, thanks to the structured control flow, there are never any critical edges except for one edge case you don't really have to care about too much (namely, an infinite loop with one basic block) and therefore you can just use the basic block worklist that I added in the branch mentioned above, rather than a worklist of basic block edges as the paper describes.

- The reason my pass was becoming so cumbersome was because I was trying to solve two problems at once. First, there's actually propagating the ranges. Then, there's taking into account restrictions on range due to branch predicates. For example, if I have something like:

   if (x > 0) {
      y = max(x, 0);
   }

then since the use of x is dominated by the then-branch of the if, x has to be greater than 0 and we can optimize it. This is a little contrived, but we have seen things like:

   if (foo)
      break;
   /* ... */
   if (foo)
      break;

in the wild, where we could get rid of the redundant break using this analysis by recognizing that since
Re: [Mesa-dev] [PATCH] st/mesa: align cube map arrays layers
On Tue, Apr 7, 2015 at 8:02 PM, Dave Airlie airl...@gmail.com wrote:
> From: Dave Airlie airl...@redhat.com
>
> We create textures internally for texsubimage, and we use the values from sub image to create a new texture, however we don't align these to valid sizes, and cube map arrays must have an array size aligned to 6.
>
> This fixes texsubimage cube_map_array on CAYMAN at least, (it was causing GPU hang and bad values), it probably also fixes it on radeonsi and evergreen.
>
> Signed-off-by: Dave Airlie airl...@redhat.com
> ---
>  src/mesa/state_tracker/st_texture.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_texture.c b/src/mesa/state_tracker/st_texture.c
> index ca7c83c..5c9a09c 100644
> --- a/src/mesa/state_tracker/st_texture.c
> +++ b/src/mesa/state_tracker/st_texture.c
> @@ -177,6 +177,8 @@ st_gl_texture_dims_to_pipe_dims(GLenum texture,
>        *widthOut = widthIn;
>        *heightOut = heightIn;
>        *depthOut = 1;
> +      if (depthIn % 6)
> +         depthIn = util_align_npot(depthIn, 6);
>        *layersOut = depthIn;

I'd just do this as

   *layersOut = util_align_npot(depthIn, 6)

But I assume this is the st_TexSubImage caller? Then I bet that instead

   /* TexSubImage only sets a single cubemap face. */
   if (gl_target == GL_TEXTURE_CUBE_MAP) {
      gl_target = GL_TEXTURE_2D;
   }

Should be changed to account for cube map arrays...

-ilia
Re: [Mesa-dev] Value Range Propagation in NIR (GSoC)
> Yes, copy propagation probably won't be so useful once we have value range propagation; the former is a special case of the latter.

Err, I meant constant propagation...
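One way to picture why constant propagation falls out of value range propagation: if each value carries a closed interval, a constant is simply an interval whose two ends coincide. The sketch below is a hypothetical illustration in C, not code from NIR or from any of the branches mentioned in this thread; `struct value_range` and the helper names are made up, and integer overflow is ignored for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lattice value for an integer: a closed interval [lo, hi]. */
struct value_range {
   int64_t lo;
   int64_t hi;
};

/* A constant is the degenerate range where both ends are equal. */
static bool
range_is_constant(struct value_range r)
{
   return r.lo == r.hi;
}

/* Merge the ranges arriving at a phi: the smallest interval covering both. */
static struct value_range
range_union(struct value_range a, struct value_range b)
{
   struct value_range r;
   r.lo = a.lo < b.lo ? a.lo : b.lo;
   r.hi = a.hi > b.hi ? a.hi : b.hi;
   return r;
}

/* Range of a + b. Overflow is ignored here; a real pass would have to
 * widen the result when the sum can wrap. */
static struct value_range
range_add(struct value_range a, struct value_range b)
{
   struct value_range r;
   r.lo = a.lo + b.lo;
   r.hi = a.hi + b.hi;
   return r;
}
```

With this representation, adding two constant ranges yields another constant range, so the "fold when all inputs are constant" case needs no separate pass; merging two different constants at a phi yields a proper range rather than giving up entirely.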
[Mesa-dev] [PATCH] st/mesa: align cube map arrays layers
From: Dave Airlie airl...@redhat.com

We create textures internally for texsubimage, and we use the values from sub image to create a new texture, however we don't align these to valid sizes, and cube map arrays must have an array size aligned to 6.

This fixes texsubimage cube_map_array on CAYMAN at least, (it was causing GPU hang and bad values), it probably also fixes it on radeonsi and evergreen.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/state_tracker/st_texture.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_texture.c b/src/mesa/state_tracker/st_texture.c
index ca7c83c..5c9a09c 100644
--- a/src/mesa/state_tracker/st_texture.c
+++ b/src/mesa/state_tracker/st_texture.c
@@ -177,6 +177,8 @@ st_gl_texture_dims_to_pipe_dims(GLenum texture,
       *widthOut = widthIn;
       *heightOut = heightIn;
       *depthOut = 1;
+      if (depthIn % 6)
+         depthIn = util_align_npot(depthIn, 6);
       *layersOut = depthIn;
       break;
    default:
--
2.1.0
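For readers unfamiliar with rounding up to a non-power-of-two alignment, here is a minimal sketch of the arithmetic involved. `align_up_npot` is a hypothetical stand-in written for this illustration; it is not copied from Mesa's `util_align_npot`, whose exact definition is not shown in this thread.

```c
#include <stddef.h>

/* Round value up to the next multiple of alignment. The alignment does
 * not need to be a power of two, so this uses a modulo rather than a
 * bit mask. alignment must be non-zero. */
static size_t
align_up_npot(size_t value, size_t alignment)
{
   size_t rem = value % alignment;
   return rem ? value + (alignment - rem) : value;
}
```

For a cube map array this means a layer count of 1 or 5 would become 6, and 7 would become 12, while exact multiples of 6 are returned unchanged. A helper written this way already handles the aligned case, so a caller would not need a separate modulo check before calling it.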
[Mesa-dev] [PATCH] st/mesa: convert sub image for cube map arrays to 2d arrays for upload
From: Dave Airlie airl...@redhat.com

Since we can subimage upload a number of cube map array layers that aren't a complete cube map array, we should specify things as a 2D array and blit from that.

Suggested by Ilia Mirkin as an alternate fix for texsubimage cube map array issues. Seems to work just as well.

Signed-off-by: Dave Airlie airl...@redhat.com
---
 src/mesa/state_tracker/st_cb_texture.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 5631361..40c6677 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -738,6 +738,11 @@ st_TexSubImage(struct gl_context *ctx, GLuint dims,
    if (gl_target == GL_TEXTURE_CUBE_MAP) {
       gl_target = GL_TEXTURE_2D;
    }
+   /* TexSubImage can specify subsets of cube map array faces
+    * so we need to upload via 2D array instead */
+   if (gl_target == GL_TEXTURE_CUBE_MAP_ARRAY) {
+      gl_target = GL_TEXTURE_2D_ARRAY;
+   }

    /* Initialize the source texture description. */
    memset(&src_templ, 0, sizeof(src_templ));
--
2.1.0
Re: [Mesa-dev] [PATCH] st/mesa: align cube map arrays layers
On 8 April 2015 at 10:43, Ilia Mirkin imir...@alum.mit.edu wrote:
> On Tue, Apr 7, 2015 at 8:02 PM, Dave Airlie airl...@gmail.com wrote:
>> From: Dave Airlie airl...@redhat.com
>>
>> We create textures internally for texsubimage, and we use the values from sub image to create a new texture, however we don't align these to valid sizes, and cube map arrays must have an array size aligned to 6.
>>
>> This fixes texsubimage cube_map_array on CAYMAN at least, (it was causing GPU hang and bad values), it probably also fixes it on radeonsi and evergreen.
>>
>> Signed-off-by: Dave Airlie airl...@redhat.com
>> ---
>>  src/mesa/state_tracker/st_texture.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/mesa/state_tracker/st_texture.c b/src/mesa/state_tracker/st_texture.c
>> index ca7c83c..5c9a09c 100644
>> --- a/src/mesa/state_tracker/st_texture.c
>> +++ b/src/mesa/state_tracker/st_texture.c
>> @@ -177,6 +177,8 @@ st_gl_texture_dims_to_pipe_dims(GLenum texture,
>>        *widthOut = widthIn;
>>        *heightOut = heightIn;
>>        *depthOut = 1;
>> +      if (depthIn % 6)
>> +         depthIn = util_align_npot(depthIn, 6);
>>        *layersOut = depthIn;
>
> I'd just do this as
>
>    *layersOut = util_align_npot(depthIn, 6)
>
> But I assume this is the st_TexSubImage caller? Then I bet that instead
>
>    /* TexSubImage only sets a single cubemap face. */
>    if (gl_target == GL_TEXTURE_CUBE_MAP) {
>       gl_target = GL_TEXTURE_2D;
>    }
>
> Should be changed to account for cube map arrays...

That works as well, belt and braces maybe just in case?

Dave.
Re: [Mesa-dev] [PATCH] i965: Fix depth field setting in surface state for raw buffer on Gen7/8
On 2015.04.07 09:18:08 -0700, Kristian Høgsberg wrote:
> On Mon, Apr 6, 2015 at 10:51 PM, Zhenyu Wang zhen...@linux.intel.com wrote:
>> On Gen7/8 for RAW surface format, the depth field (surf[3]) in surface state means [30:21] bits of number of entries which is different from other surface format which uses [26:21] bits field.
>>
>> Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
>
> Is there a bugzilla that this fixes we can link from the commit message? Either way, this looks good.
>
> Reviewed-by: Kristian Høgsberg k...@bitplanet.net

Not yet, as I have another client to use raw surface, so better to fix this with spec.

thanks

--
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
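The difference described in the commit message comes down to how many bits of the entry count are taken starting at bit 21: ten bits ([30:21]) for RAW buffers versus six bits ([26:21]) for other formats. The following is a rough sketch of that bit selection only. The function names are hypothetical, the input is assumed to be the already-encoded entry count, and nothing here is taken from the actual patch, which is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return bits [hi:lo] of x, shifted down so that bit lo lands at bit 0. */
static uint32_t
extract_bits(uint32_t x, unsigned hi, unsigned lo)
{
   unsigned width = hi - lo + 1;
   uint32_t mask = width >= 32 ? 0xffffffffu : ((1u << width) - 1u);
   return (x >> lo) & mask;
}

/* Hypothetical helper: the portion of the encoded entry count that would
 * go into the depth field. RAW buffers keep bits [30:21], other formats
 * only bits [26:21], as stated in the commit message. */
static uint32_t
buffer_depth_bits(uint32_t encoded_entries, bool is_raw)
{
   return is_raw ? extract_bits(encoded_entries, 30, 21)
                 : extract_bits(encoded_entries, 26, 21);
}
```

With the narrower mask, any entry count that sets bits 27 through 30 would be silently truncated, which is the kind of mismatch the patch appears to address for RAW surfaces.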
Re: [Mesa-dev] [PATCH 1/2] egl/dri2: implement platform_null (v2).
The name surfaceless suits me. Does this platform need to provide a hint to the user about buffer format? Platform drm does this via the EGL_NATIVE_VISUAL_ID query of eglGetConfigAttrib(), returning a gbm format value. Unless we do the same or similar here, how does the user robustly find the right format for allocating buffers?
Re: [Mesa-dev] [PATCH 1/2] egl/dri2: implement platform_null (v2).
On Tue 07 Apr 2015, Kristian Høgsberg wrote:
> On Tue, Apr 7, 2015 at 6:46 PM, Frank Henigman fjhenig...@google.com wrote:
>> The name surfaceless suits me. Does this platform need to provide a hint to the user about buffer format? Platform drm does this via the EGL_NATIVE_VISUAL_ID query of eglGetConfigAttrib(), returning a gbm format value. Unless we do the same or similar here, how does the user robustly find the right format for allocating buffers?
>
> GBM provides
>
>    int gbm_device_is_format_supported(struct gbm_device *gbm,
>                                       uint32_t format, uint32_t usage);
>
> and you can use that to find a format that works with GBM_BO_USE_RENDERING.

I don't think it makes sense to use EGL_NATIVE_VISUAL_ID here, so Kristian's suggestion sounds good to me.

The EGL_NATIVE_VISUAL_ID, of course, has the same type as the native format of EGLNativeWindowSurface. But this platform has no EGLNativeWindowSurface, so it has no EGL_NATIVE_VISUAL_ID.

On first thought, it seems like re-purposing EGL_NATIVE_VISUAL_ID for this platform, setting it to a gbm format, might be a good idea. But ultimately I think it's a bad idea because the platform isn't tied to gbm in any way. Today we use gbm_bo_create() to create the dma_buf storage, but tomorrow we might use a different API that doesn't understand gbm formats.
Re: [Mesa-dev] [PATCH 1/2] egl/dri2: implement platform_null (v2).
On Tue, Apr 7, 2015 at 6:46 PM, Frank Henigman fjhenig...@google.com wrote:
> The name surfaceless suits me. Does this platform need to provide a hint to the user about buffer format? Platform drm does this via the EGL_NATIVE_VISUAL_ID query of eglGetConfigAttrib(), returning a gbm format value. Unless we do the same or similar here, how does the user robustly find the right format for allocating buffers?

GBM provides

   int gbm_device_is_format_supported(struct gbm_device *gbm,
                                      uint32_t format, uint32_t usage);

and you can use that to find a format that works with GBM_BO_USE_RENDERING.

Kristian