[Mesa-dev] [Bug 59187] [Steam] Implement GLSL 1.30 (for older chipsets than SandyBridge)
https://bugs.freedesktop.org/show_bug.cgi?id=59187

kost BebiX <k...@ya.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@ya.ru

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] d3d1x: Remove.
(First email was too long, so re-sending just the interesting bits)

From: José Fonseca <jfons...@vmware.com>

Unused/unmaintained.
---
 configure.ac | 21 -
 src/gallium/docs/source/context.rst | 2 +-
 src/gallium/state_trackers/d3d1x/.gitignore | 20 -
 src/gallium/state_trackers/d3d1x/Makefile | 11 -
 src/gallium/state_trackers/d3d1x/Makefile.inc | 19 -
 .../state_trackers/d3d1x/d3d1xshader/Makefile | 16 -
 .../d3d1x/d3d1xshader/defs/files.txt | 41 -
 .../d3d1x/d3d1xshader/defs/interpolations.txt | 8 -
 .../d3d1x/d3d1xshader/defs/opcodes.txt | 207 --
 .../d3d1x/d3d1xshader/defs/operand_compnums.txt | 5 -
 .../d3d1x/d3d1xshader/defs/operand_index_reprs.txt | 5 -
 .../d3d1x/d3d1xshader/defs/operand_modes.txt | 4 -
 .../d3d1x/d3d1xshader/defs/shortfiles.txt | 41 -
 .../state_trackers/d3d1x/d3d1xshader/defs/svs.txt | 23 -
 .../d3d1x/d3d1xshader/defs/targets.txt | 13 -
 .../defs/token_instruction_extended_types.txt | 4 -
 .../defs/token_operand_extended_types.txt | 2 -
 .../state_trackers/d3d1x/d3d1xshader/gen-header.sh | 13 -
 .../state_trackers/d3d1x/d3d1xshader/gen-text.sh | 11 -
 .../d3d1x/d3d1xshader/include/dxbc.h | 125 -
 .../d3d1x/d3d1xshader/include/le32.h | 45 -
 .../state_trackers/d3d1x/d3d1xshader/include/sm4.h | 416
 .../d3d1x/d3d1xshader/src/dxbc_assemble.cpp | 59 -
 .../d3d1x/d3d1xshader/src/dxbc_dump.cpp | 43 -
 .../d3d1x/d3d1xshader/src/dxbc_parse.cpp | 87 -
 .../d3d1x/d3d1xshader/src/sm4_analyze.cpp | 122 -
 .../d3d1x/d3d1xshader/src/sm4_dump.cpp | 222 --
 .../d3d1x/d3d1xshader/src/sm4_parse.cpp | 445
 .../state_trackers/d3d1x/d3d1xshader/src/utils.h | 45 -
 .../d3d1x/d3d1xshader/tools/fxdis.cpp | 75 -
 .../state_trackers/d3d1x/d3d1xstutil/Makefile | 5 -
 .../d3d1x/d3d1xstutil/include/d3d1xstutil.h | 1110 -
 .../d3d1x/d3d1xstutil/src/d3d_sm4_enums.cpp | 42 -
 .../d3d1x/d3d1xstutil/src/dxgi_enums.cpp | 165 --
 .../state_trackers/d3d1x/d3d1xstutil/src/guids.cpp | 6 -
 src/gallium/state_trackers/d3d1x/d3dapi/Makefile | 4 -
 src/gallium/state_trackers/d3d1x/d3dapi/d3d10.idl | 1554
 .../state_trackers/d3d1x/d3dapi/d3d10_1.idl | 191 --
 .../state_trackers/d3d1x/d3dapi/d3d10misc.h | 47 -
 .../state_trackers/d3d1x/d3dapi/d3d10shader.idl | 269 ---
 src/gallium/state_trackers/d3d1x/d3dapi/d3d11.idl | 2492
 .../state_trackers/d3d1x/d3dapi/d3d11shader.idl | 287 ---
 .../state_trackers/d3d1x/d3dapi/d3dcommon.idl | 704 --
 src/gallium/state_trackers/d3d1x/d3dapi/dxgi.idl | 470
 .../state_trackers/d3d1x/d3dapi/dxgiformat.idl | 129 -
 .../state_trackers/d3d1x/d3dapi/dxgitype.idl | 84 -
 src/gallium/state_trackers/d3d1x/docs/Makefile | 5 -
 .../state_trackers/d3d1x/docs/coding_style.txt | 84 -
 .../d3d1x/docs/module_dependencies.dot | 25 -
 .../state_trackers/d3d1x/docs/source_layout.txt | 17 -
 src/gallium/state_trackers/d3d1x/dxgi/Makefile | 17 -
 .../state_trackers/d3d1x/dxgi/src/dxgi_loader.cpp | 206 --
 .../state_trackers/d3d1x/dxgi/src/dxgi_native.cpp | 1514
 .../state_trackers/d3d1x/dxgi/src/dxgi_private.h | 49 -
 .../state_trackers/d3d1x/dxgid3d10/Makefile | 4 -
 .../state_trackers/d3d1x/dxgid3d10/dxgid3d10.cpp | 149 --
 .../state_trackers/d3d1x/dxgid3d11/Makefile | 4 -
 .../state_trackers/d3d1x/dxgid3d11/dxgid3d11.cpp | 135 --
 src/gallium/state_trackers/d3d1x/gd3d10/Makefile | 20 -
 src/gallium/state_trackers/d3d1x/gd3d10/d3d10.pl | 12 -
 src/gallium/state_trackers/d3d1x/gd3d11/Makefile | 6 -
 src/gallium/state_trackers/d3d1x/gd3d11/d3d11.cpp | 241 --
 .../state_trackers/d3d1x/gd3d11/d3d11_context.h | 2132 -
 .../state_trackers/d3d1x/gd3d11/d3d11_misc.h | 92 -
 .../state_trackers/d3d1x/gd3d11/d3d11_objects.h | 737 --
 .../state_trackers/d3d1x/gd3d11/d3d11_screen.h | 1704 -
 src/gallium/state_trackers/d3d1x/gd3d1x/Makefile | 7 -
 .../state_trackers/d3d1x/gd3d1x/d3d1x_private.h | 97 -
 .../state_trackers/d3d1x/gd3d1x/d3d_enums.cpp | 123 -
 .../state_trackers/d3d1x/gd3d1x/sm4_to_tgsi.cpp | 856 ---
 .../state_trackers/d3d1x/gd3d1x/sm4_to_tgsi.h | 35 -
 .../d3d1x/gd3d1x/tools/dxbc2tgsi.cpp | 82 -
 src/gallium/state_trackers/d3d1x/gd3dapi/Makefile | 4 -
 .../state_trackers/d3d1x/gd3dapi/galliumcom.idl | 76 -
 .../d3d1x/gd3dapi/galliumd3d10_1.idl | 35 -
 .../state_trackers/d3d1x/gd3dapi/galliumd3d11.idl | 36 -
 .../state_trackers/d3d1x/gd3dapi/galliumdxgi.idl | 134 --
 .../state_trackers/d3d1x/mstools/download-mstools
Re: [Mesa-dev] [PATCH 1/2] mesa: Add GL_UNSIGNED_BYTE fast-path to fast_read_rgba_pixels_memcpy
On Sun, 2013-03-10 at 23:05 +0100, Martin Andersson wrote:
> ---
>  src/mesa/main/readpix.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
> index 2f130ae..349b0bc 100644
> --- a/src/mesa/main/readpix.c
> +++ b/src/mesa/main/readpix.c
> @@ -238,7 +238,7 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>     }
>     else if (rb->Format == MESA_FORMAT_XRGB8888 &&
>              format == GL_BGRA &&
> -            type == GL_UNSIGNED_INT_8_8_8_8_REV &&
> +            (type == GL_UNSIGNED_INT_8_8_8_8_REV || type == GL_UNSIGNED_BYTE) &&

This cannot be equivalent on little endian and big endian hosts at the same time. As it works for you, it's apparently equivalent on little endian.

I suspect ReadPixels could be made even faster with similar treatment as Marek has applied to TexSubImage etc.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Debian, X and DRI developer
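To make the endianness remark concrete, here is a minimal sketch in C. It assumes the usual OpenGL packing rules: GL_BGRA with GL_UNSIGNED_BYTE is four bytes in memory order B, G, R, A, while GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV is one 32-bit word with the first component (B) in the lowest bits, stored in host byte order. The helper names are hypothetical and are not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GL_BGRA + GL_UNSIGNED_BYTE: bytes are always B, G, R, A in memory. */
static void pack_bgra_ubyte(uint8_t out[4],
                            uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   out[0] = b;
   out[1] = g;
   out[2] = r;
   out[3] = a;
}

/* GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV: one 32-bit word, B in bits 0-7,
 * G in bits 8-15, R in bits 16-23, A in bits 24-31. The word is written
 * in host byte order, so the byte layout depends on the host. */
static void pack_bgra_8888_rev(uint8_t out[4],
                               uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   uint32_t word = (uint32_t)a << 24 | (uint32_t)r << 16 |
                   (uint32_t)g << 8 | b;
   memcpy(out, &word, sizeof word);
}

/* Returns 1 on a little endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
   const uint32_t one = 1;
   uint8_t first;
   memcpy(&first, &one, 1);
   return first == 1;
}

/* Returns 1 when both packings yield identical bytes on this host. */
static int layouts_match(void)
{
   uint8_t as_bytes[4], as_word[4];
   pack_bgra_ubyte(as_bytes, 0x11, 0x22, 0x33, 0x44);
   pack_bgra_8888_rev(as_word, 0x11, 0x22, 0x33, 0x44);
   return memcmp(as_bytes, as_word, 4) == 0;
}
```

On a little endian host the word is stored as B, G, R, A and the two layouts coincide; on a big endian host it is stored as A, R, G, B, so a plain memcpy would only be valid for one of the two types.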
[Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
Hi everybody,

this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating around on the list.

The solutions proposed or implemented so far are all more or less incomplete, so this approach was designed with both completeness and compatibility with existing code in mind.

Overall it's just an implementation of what Tom Stellard named solution #4 in this email thread: http://lists.freedesktop.org/archives/mesa-dev/2013-January/033264.html

Please review; as usual, comments are welcome.

Christian.
[Mesa-dev] [PATCH 1/7] tgsi/ureg: cleanup local temporary emission
From: Christian König <christian.koe...@amd.com>

Instead of emitting each temporary separately, emit them in a chunk.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 53 ++--
 1 file changed, 17 insertions(+), 36 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 3c2a923..9303dc7 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1260,30 +1260,11 @@ emit_decl_fs(struct ureg_program *ureg,
    out[3].decl_semantic.Index = semantic_index;
 }

-
-static void emit_decl( struct ureg_program *ureg,
-                       unsigned file,
-                       unsigned index,
-                       boolean local )
-{
-   union tgsi_any_token *out = get_tokens( ureg, DOMAIN_DECL, 2 );
-
-   out[0].value = 0;
-   out[0].decl.Type = TGSI_TOKEN_TYPE_DECLARATION;
-   out[0].decl.NrTokens = 2;
-   out[0].decl.File = file;
-   out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW;
-   out[0].decl.Local = local;
-
-   out[1].value = 0;
-   out[1].decl_range.First = index;
-   out[1].decl_range.Last = index;
-}
-
 static void emit_decl_range( struct ureg_program *ureg,
                              unsigned file,
                              unsigned first,
-                             unsigned count )
+                             unsigned count,
+                             boolean local )
 {
    union tgsi_any_token *out = get_tokens( ureg, DOMAIN_DECL, 2 );

@@ -1293,6 +1274,7 @@ static void emit_decl_range( struct ureg_program *ureg,
    out[0].decl.File = file;
    out[0].decl.UsageMask = TGSI_WRITEMASK_XYZW;
    out[0].decl.Semantic = 0;
+   out[0].decl.Local = local;

    out[1].value = 0;
    out[1].decl_range.First = first;
@@ -1450,7 +1432,7 @@ static void emit_decls( struct ureg_program *ureg )
    if (ureg->processor == TGSI_PROCESSOR_VERTEX) {
       for (i = 0; i < UREG_MAX_INPUT; i++) {
          if (ureg->vs_inputs[i/32] & (1 << (i%32))) {
-            emit_decl_range( ureg, TGSI_FILE_INPUT, i, 1 );
+            emit_decl_range( ureg, TGSI_FILE_INPUT, i, 1, FALSE );
          }
       }
    } else if (ureg->processor == TGSI_PROCESSOR_FRAGMENT) {
@@ -1496,7 +1478,7 @@ static void emit_decls( struct ureg_program *ureg )
    for (i = 0; i < ureg->nr_samplers; i++) {
       emit_decl_range( ureg,
                        TGSI_FILE_SAMPLER,
-                       ureg->sampler[i].Index, 1 );
+                       ureg->sampler[i].Index, 1, FALSE );
    }

    for (i = 0; i < ureg->nr_sampler_views; i++) {
@@ -1514,7 +1496,8 @@ static void emit_decls( struct ureg_program *ureg )
             emit_decl_range(ureg,
                             TGSI_FILE_CONSTANT,
                             ureg->const_decls.constant_range[i].first,
-                            ureg->const_decls.constant_range[i].last - ureg->const_decls.constant_range[i].first + 1);
+                            ureg->const_decls.constant_range[i].last - ureg->const_decls.constant_range[i].first + 1,
+                            FALSE);
          }
       }

@@ -1535,30 +1518,28 @@ static void emit_decls( struct ureg_program *ureg )
    }

    if (ureg->nr_temps) {
-      if (util_bitmask_get_first_index(ureg->local_temps) == UTIL_BITMASK_INVALID_INDEX) {
-         emit_decl_range( ureg,
-                          TGSI_FILE_TEMPORARY,
-                          0, ureg->nr_temps );
-
-      } else {
-         for (i = 0; i < ureg->nr_temps; i++) {
-            emit_decl( ureg, TGSI_FILE_TEMPORARY, i,
-                       util_bitmask_get(ureg->local_temps, i) );
-         }
+      for (i = 0; i < ureg->nr_temps;) {
+         boolean local = util_bitmask_get(ureg->local_temps, i);
+         unsigned first = i++;
+         while (i < ureg->nr_temps && local == util_bitmask_get(ureg->local_temps, i))
+            ++i;
+
+         emit_decl_range( ureg, TGSI_FILE_TEMPORARY, first,
+                          i - first, local );
       }
    }

    if (ureg->nr_addrs) {
       emit_decl_range( ureg,
                        TGSI_FILE_ADDRESS,
-                       0, ureg->nr_addrs );
+                       0, ureg->nr_addrs, FALSE );
    }

    if (ureg->nr_preds) {
       emit_decl_range(ureg,
                       TGSI_FILE_PREDICATE,
                       0,
-                      ureg->nr_preds);
+                      ureg->nr_preds, FALSE);
    }

    for (i = 0; i < ureg->nr_immediates; i++) {
--
1.7.9.5
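The chunked emission in the patch above boils down to grouping consecutive temporaries that share the same "local" flag into one declaration range. A minimal standalone sketch of that loop, assuming a plain int array in place of the util_bitmask and a hypothetical struct decl_range in place of the emitted tokens:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one emitted declaration range. */
struct decl_range {
   unsigned first;   /* index of the first temporary in the run */
   unsigned count;   /* number of temporaries in the run */
   int local;        /* shared "local" flag of the run */
};

/* Walks the flags once and writes one range per run of equal flags.
 * Writes at most max_out ranges and returns the number written. */
static size_t group_runs(const int *local_flags, unsigned nr_temps,
                         struct decl_range *out, size_t max_out)
{
   size_t nr = 0;
   unsigned i = 0;

   while (i < nr_temps && nr < max_out) {
      int local = local_flags[i];
      unsigned first = i++;

      /* Extend the run while the flag stays the same. */
      while (i < nr_temps && local_flags[i] == local)
         ++i;

      out[nr].first = first;
      out[nr].count = i - first;
      out[nr].local = local;
      nr++;
   }
   return nr;
}
```

For the flags {0, 0, 1, 1, 1, 0} this yields three ranges instead of six single-register declarations, and a program without local temporaries still collapses to one range, which is why the old special case could be dropped.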
[Mesa-dev] [PATCH 2/7] tgsi/ureg: implement support for array temporaries
From: Christian König <christian.koe...@amd.com>

Don't bother with free temporaries, just allocate them at the end and also emit them in their own declaration.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 55
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 38 +++---
 2 files changed, 69 insertions(+), 24 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 9303dc7..d5fa084 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -153,6 +153,7 @@ struct ureg_program

    struct util_bitmask *free_temps;
    struct util_bitmask *local_temps;
+   struct util_bitmask *decl_temps;
    unsigned nr_temps;

    struct const_decl const_decls;
@@ -547,13 +548,18 @@ static struct ureg_dst alloc_temporary( struct ureg_program *ureg,

    /* Or allocate a new one. */
-   if (i == UTIL_BITMASK_INVALID_INDEX)
+   if (i == UTIL_BITMASK_INVALID_INDEX) {
       i = ureg->nr_temps++;

-   util_bitmask_clear(ureg->free_temps, i);
+      if (local)
+         util_bitmask_set(ureg->local_temps, i);

-   if (local)
-      util_bitmask_set(ureg->local_temps, i);
+      /* Start a new declaration when the local flag changes */
+      if (!i || util_bitmask_get(ureg->local_temps, i - 1) != local)
+         util_bitmask_set(ureg->decl_temps, i);
+   }
+
+   util_bitmask_clear(ureg->free_temps, i);

    return ureg_dst_register( TGSI_FILE_TEMPORARY, i );
 }
@@ -568,6 +574,24 @@ struct ureg_dst ureg_DECL_local_temporary( struct ureg_program *ureg )
    return alloc_temporary(ureg, TRUE);
 }

+struct ureg_dst ureg_DECL_array_temporary( struct ureg_program *ureg,
+                                           unsigned size,
+                                           boolean local )
+{
+   unsigned i = ureg->nr_temps;
+   struct ureg_dst dst = ureg_dst_register( TGSI_FILE_TEMPORARY, i );
+
+   if (local)
+      util_bitmask_set(ureg->local_temps, i);
+
+   util_bitmask_set(ureg->decl_temps, i);
+
+   ureg->nr_temps += size;
+   util_bitmask_set(ureg->decl_temps, ureg->nr_temps);
+
+   return dst;
+}
+
 void ureg_release_temporary( struct ureg_program *ureg,
                              struct ureg_dst tmp )
 {
@@ -856,11 +880,11 @@ ureg_emit_src( struct ureg_program *ureg,
    }

    if (src.Dimension) {
+      out[0].src.Dimension = 1;
+      out[n].dim.Dimension = 0;
+      out[n].dim.Padding = 0;
       if (src.DimIndirect) {
-         out[0].src.Dimension = 1;
          out[n].dim.Indirect = 1;
-         out[n].dim.Dimension = 0;
-         out[n].dim.Padding = 0;
          out[n].dim.Index = src.DimensionIndex;
          n++;
          out[n].value = 0;
@@ -871,10 +895,7 @@ ureg_emit_src( struct ureg_program *ureg,
          out[n].src.SwizzleW = src.DimIndSwizzle;
          out[n].src.Index = src.DimIndIndex;
       } else {
-         out[0].src.Dimension = 1;
          out[n].dim.Indirect = 0;
-         out[n].dim.Dimension = 0;
-         out[n].dim.Padding = 0;
          out[n].dim.Index = src.DimensionIndex;
       }
       n++;
@@ -1520,9 +1541,10 @@ static void emit_decls( struct ureg_program *ureg )
    if (ureg->nr_temps) {
       for (i = 0; i < ureg->nr_temps;) {
          boolean local = util_bitmask_get(ureg->local_temps, i);
-         unsigned first = i++;
-         while (i < ureg->nr_temps && local == util_bitmask_get(ureg->local_temps, i))
-            ++i;
+         unsigned first = i;
+         i = util_bitmask_get_next_index(ureg->decl_temps, i + 1);
+         if (i == UTIL_BITMASK_INVALID_INDEX)
+            i = ureg->nr_temps;

          emit_decl_range( ureg, TGSI_FILE_TEMPORARY, first,
                           i - first, local );
@@ -1692,8 +1714,14 @@ struct ureg_program *ureg_create( unsigned processor )
    if (ureg->local_temps == NULL)
       goto no_local_temps;

+   ureg->decl_temps = util_bitmask_create();
+   if (ureg->decl_temps == NULL)
+      goto no_decl_temps;
+
    return ureg;

+no_decl_temps:
+   util_bitmask_destroy(ureg->local_temps);
 no_local_temps:
    util_bitmask_destroy(ureg->free_temps);
 no_free_temps:
@@ -1715,6 +1743,7 @@ void ureg_destroy( struct ureg_program *ureg )

    util_bitmask_destroy(ureg->free_temps);
    util_bitmask_destroy(ureg->local_temps);
+   util_bitmask_destroy(ureg->decl_temps);
    FREE(ureg);
 }

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index fb663e9..cd140de 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -71,17 +71,17 @@ struct ureg_src
  */
 struct ureg_dst
 {
-   unsigned File        : 4;  /* TGSI_FILE_ */
-   unsigned WriteMask   : 4;  /* TGSI_WRITEMASK_ */
-   unsigned Indirect    : 1;  /* BOOL */
-   unsigned Saturate    : 1;  /* BOOL */
-   unsigned Predicate   : 1;
-   unsigned PredNegate  : 1;  /* BOOL */
-   unsigned PredSwizzleX: 2;  /* TGSI_SWIZZLE_
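As I read the patch above, decl_temps marks the index at which each temporary declaration starts, so an array keeps its own declaration even when its neighbours have the same "local" flag. A minimal sketch of that bookkeeping, assuming plain byte arrays instead of util_bitmask; temp_pool, alloc_temp, alloc_array and list_decls are hypothetical names, and unlike the patch the sketch records the local flag for every array element rather than only the first:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEMPS 64

struct temp_pool {
   unsigned nr_temps;
   unsigned char local[MAX_TEMPS + 1];       /* "local" flag per temporary */
   unsigned char decl_start[MAX_TEMPS + 1];  /* 1 where a declaration begins */
};

struct decl_range {
   unsigned first;
   unsigned count;
   int local;
};

/* Allocates one temporary; a new declaration starts only when the
 * local flag differs from the previous temporary. */
static unsigned alloc_temp(struct temp_pool *p, int local)
{
   unsigned i = p->nr_temps++;
   assert(i < MAX_TEMPS);
   p->local[i] = local != 0;
   if (i == 0 || p->local[i - 1] != p->local[i])
      p->decl_start[i] = 1;
   return i;
}

/* Allocates `size` consecutive temporaries as one array and forces a
 * declaration boundary both before and after it. */
static unsigned alloc_array(struct temp_pool *p, unsigned size, int local)
{
   unsigned first = p->nr_temps;
   assert(size > 0 && first + size <= MAX_TEMPS);
   for (unsigned k = 0; k < size; k++)
      p->local[first + k] = local != 0;
   p->decl_start[first] = 1;
   p->nr_temps += size;
   p->decl_start[p->nr_temps] = 1;  /* whatever comes next starts anew */
   return first;
}

/* Lists one range per declaration by scanning to the next boundary. */
static size_t list_decls(const struct temp_pool *p,
                         struct decl_range *out, size_t max_out)
{
   size_t nr = 0;
   unsigned i = 0;

   while (i < p->nr_temps && nr < max_out) {
      unsigned first = i++;
      while (i < p->nr_temps && !p->decl_start[i])
         ++i;
      out[nr].first = first;
      out[nr].count = i - first;
      out[nr].local = p->local[first];
      nr++;
   }
   return nr;
}
```

With two non-local temporaries, a local array of four, and one more local temporary, the sketch lists three declarations: the trailing temporary is not merged into the array even though both are local, which is the property a driver would need to bound indirect accesses to a single array.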
[Mesa-dev] [PATCH 3/7] glsl_to_tgsi: use get_temp for all allocations
From: Christian König <christian.koe...@amd.com>

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 131ecb2..b2cccbc 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1078,13 +1078,11 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
        */
       assert((int) ir->num_state_slots == type_size(ir->type));

-      storage = new(mem_ctx) variable_storage(ir, PROGRAM_TEMPORARY,
-                                              this->next_temp);
-      this->variables.push_tail(storage);
-      this->next_temp += type_size(ir->type);
+      dst = st_dst_reg(get_temp(ir->type));
+
+      storage = new(mem_ctx) variable_storage(ir, dst.file, dst.index);

-      dst = st_dst_reg(st_src_reg(PROGRAM_TEMPORARY, storage->index,
-            native_integers ? ir->type->base_type : GLSL_TYPE_FLOAT));
+      this->variables.push_tail(storage);
    }


@@ -2052,11 +2050,11 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
          break;
       case ir_var_auto:
       case ir_var_temporary:
-         entry = new(mem_ctx) variable_storage(var, PROGRAM_TEMPORARY,
-                                               this->next_temp);
+         st_src_reg src = get_temp(var->type);
+
+         entry = new(mem_ctx) variable_storage(var, src.file, src.index);
          this->variables.push_tail(entry);

-         next_temp += type_size(var->type);
          break;
       }

@@ -2574,11 +2572,10 @@ glsl_to_tgsi_visitor::get_function_signature(ir_function_signature *sig)
       storage = find_variable_storage(param);
       assert(!storage);

-      storage = new(mem_ctx) variable_storage(param, PROGRAM_TEMPORARY,
-                                              this->next_temp);
-      this->variables.push_tail(storage);
+      st_src_reg src = get_temp(param->type);

-      this->next_temp += type_size(param->type);
+      storage = new(mem_ctx) variable_storage(param, src.file, src.index);
+      this->variables.push_tail(storage);
    }

    if (!sig->return_type->is_void()) {
--
1.7.9.5
[Mesa-dev] [PATCH 4/7] glsl_to_tgsi: allocate arrays separately
From: Christian König <christian.koe...@amd.com>

Instead of allocating everything as temporaries, use the new array allocation functions.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/mesa/main/mtypes.h | 1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 83 ++--
 2 files changed, 54 insertions(+), 30 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4f09513..f7499d0 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1905,6 +1905,7 @@ struct gl_transform_feedback_state
 typedef enum
 {
    PROGRAM_TEMPORARY,   /**< machine->Temporary[] */
+   PROGRAM_ARRAY,       /**< Arrays & Matrixes */
    PROGRAM_INPUT,       /**< machine->Inputs[] */
    PROGRAM_OUTPUT,      /**< machine->Outputs[] */
    PROGRAM_LOCAL_PARAM, /**< gl_program->LocalParams[] */
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b2cccbc..ce90188 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -85,6 +85,11 @@ extern "C" {
  */
 #define MAX_TEMPS 4096

+/**
+ * Maximum number of arrays
+ */
+#define MAX_ARRAYS 256
+
 /* will be 4 for GLSL 4.00 */
 #define MAX_GLSL_TEXTURE_OFFSET 1

@@ -315,6 +320,9 @@ public:

    int next_temp;

+   unsigned array_sizes[MAX_ARRAYS];
+   unsigned next_array;
+
    int num_address_regs;
    int samplers_used;
    bool indirect_addr_temps;
@@ -550,6 +558,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
    if (dst.reladdr) {
       switch(dst.file) {
       case PROGRAM_TEMPORARY:
+      case PROGRAM_ARRAY:
          this->indirect_addr_temps = true;
          break;
       case PROGRAM_LOCAL_PARAM:
@@ -571,6 +580,7 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
          if(inst->src[i].reladdr) {
             switch(inst->src[i].file) {
             case PROGRAM_TEMPORARY:
+            case PROGRAM_ARRAY:
                this->indirect_addr_temps = true;
                break;
             case PROGRAM_LOCAL_PARAM:
@@ -1005,17 +1015,26 @@ glsl_to_tgsi_visitor::get_temp(const glsl_type *type)
    st_src_reg src;

    src.type = native_integers ? type->base_type : GLSL_TYPE_FLOAT;
-   src.file = PROGRAM_TEMPORARY;
-   src.index = next_temp;
    src.reladdr = NULL;
-   next_temp += type_size(type);
+   src.negate = 0;
+
+   if (type->is_array() || type->is_matrix()) {
+      src.file = PROGRAM_ARRAY;
+      src.index = next_array << 16 | 0x8000;
+      array_sizes[next_array] = type_size(type);
+      ++next_array;
+
+   } else {
+      src.file = PROGRAM_TEMPORARY;
+      src.index = next_temp;
+      next_temp += type_size(type);
+   }

    if (type->is_array() || type->is_record()) {
       src.swizzle = SWIZZLE_NOOP;
    } else {
       src.swizzle = swizzle_for_size(type->vector_elements);
    }
-   src.negate = 0;

    return src;
 }
@@ -2975,6 +2994,7 @@ glsl_to_tgsi_visitor::glsl_to_tgsi_visitor()
 {
    result.file = PROGRAM_UNDEFINED;
    next_temp = 1;
+   next_array = 0;
    next_signature_id = 1;
    num_immediates = 0;
    current_function = NULL;
@@ -4011,6 +4031,7 @@ struct st_translate {
    struct ureg_program *ureg;

    struct ureg_dst temps[MAX_TEMPS];
+   struct ureg_dst arrays[MAX_ARRAYS];
    struct ureg_src *constants;
    struct ureg_src *immediates;
    struct ureg_dst outputs[PIPE_MAX_SHADER_OUTPUTS];
@@ -4129,16 +4150,30 @@ dst_register(struct st_translate *t,
              gl_register_file file,
              GLuint index)
 {
+   unsigned array;
+
    switch(file) {
    case PROGRAM_UNDEFINED:
       return ureg_dst_undef();

    case PROGRAM_TEMPORARY:
+      assert(index >= 0);
+      assert(index < (int) Elements(t->temps));
+
       if (ureg_dst_is_undef(t->temps[index]))
          t->temps[index] = ureg_DECL_local_temporary(t->ureg);

       return t->temps[index];

+   case PROGRAM_ARRAY:
+      array = index >> 16;
+
+      assert(array >= 0);
+      assert(array < (int) Elements(t->arrays));
+
+      return ureg_dst_array_offset(t->arrays[array],
+                                   (int)(index & 0xFFFF) - 0x8000);
+
    case PROGRAM_OUTPUT:
       if (t->procType == TGSI_PROCESSOR_VERTEX)
          assert(index < VERT_RESULT_MAX);
@@ -4173,11 +4208,8 @@ src_register(struct st_translate *t,
       return ureg_src_undef();

    case PROGRAM_TEMPORARY:
-      assert(index >= 0);
-      assert(index < (int) Elements(t->temps));
-      if (ureg_dst_is_undef(t->temps[index]))
-         t->temps[index] = ureg_DECL_local_temporary(t->ureg);
-      return ureg_src(t->temps[index]);
+   case PROGRAM_ARRAY:
+      return ureg_src(dst_register(t, file, index));

    case PROGRAM_ENV_PARAM:
    case PROGRAM_LOCAL_PARAM:
@@ -4259,8 +4291,10 @@ translate_dst(struct st_translate *t,
       }
    }

-   if (dst_reg->reladdr != NULL)
+   if (dst_reg->reladdr != NULL) {
+      assert(dst_reg->file != PROGRAM_TEMPORARY);
       dst =
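The PROGRAM_ARRAY index in this patch appears to pack two values into one integer: the array number in the upper 16 bits and an element offset, biased by 0x8000, in the lower 16 bits. A minimal sketch of that encoding, assuming a 16-bit offset field; the function names and the ARRAY_OFFSET_BIAS macro are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Bias added to the offset so that the low 16 bits can carry a signed
 * value; an offset of 0 is stored as 0x8000. */
#define ARRAY_OFFSET_BIAS 0x8000

/* Packs an array number and a signed element offset into one index. */
static uint32_t encode_array_index(unsigned array_id, int offset)
{
   assert(array_id <= 0xFFFF);
   assert(offset >= -0x8000 && offset <= 0x7FFF);
   return (uint32_t)array_id << 16 | (uint32_t)(offset + ARRAY_OFFSET_BIAS);
}

/* Recovers the array number from the upper 16 bits. */
static unsigned decode_array_id(uint32_t index)
{
   return index >> 16;
}

/* Recovers the signed element offset from the lower 16 bits. */
static int decode_array_offset(uint32_t index)
{
   return (int)(index & 0xFFFF) - ARRAY_OFFSET_BIAS;
}
```

One consequence of the bias is that code which adds a small positive or negative element offset to the packed index stays within the low 16 bits, so the array number is not disturbed.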
[Mesa-dev] [PATCH 6/7] tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
From: Christian König christian.koe...@amd.com Nobody seems to be using it, and only nv50 had a partial implementation. Signed-off-by: Christian König christian.koe...@amd.com --- src/gallium/auxiliary/tgsi/tgsi_build.c| 19 - src/gallium/auxiliary/tgsi/tgsi_dump.c | 38 - src/gallium/auxiliary/tgsi/tgsi_exec.c | 41 -- src/gallium/auxiliary/tgsi/tgsi_exec.h |2 - src/gallium/auxiliary/tgsi/tgsi_parse.c| 11 --- src/gallium/auxiliary/tgsi/tgsi_parse.h|6 -- src/gallium/auxiliary/tgsi/tgsi_sanity.c |2 - src/gallium/auxiliary/tgsi/tgsi_strings.c |2 - src/gallium/auxiliary/tgsi/tgsi_text.c | 43 -- .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 82 src/gallium/include/pipe/p_shader_tokens.h |6 +- 11 files changed, 2 insertions(+), 250 deletions(-) diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c index cb7b9b2..33cbbd8 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_build.c +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c @@ -336,7 +336,6 @@ tgsi_default_full_declaration( void ) full_declaration.Range = tgsi_default_declaration_range(); full_declaration.Semantic = tgsi_default_declaration_semantic(); full_declaration.Interp = tgsi_default_declaration_interp(); - full_declaration.ImmediateData.u = NULL; full_declaration.Resource = tgsi_default_declaration_resource(); full_declaration.SamplerView = tgsi_default_declaration_sampler_view(); @@ -425,24 +424,6 @@ tgsi_build_full_declaration( header ); } - if (full_decl-Declaration.File == TGSI_FILE_IMMEDIATE_ARRAY) { - unsigned i, j; - union tgsi_immediate_data *data; - - for (i = 0; i = dr-Last; ++i) { - for (j = 0; j 4; ++j) { -unsigned idx = i*4 + j; -if (maxsize = size) - return 0; -data = (union tgsi_immediate_data *) tokens[size]; -++size; - -*data = full_decl-ImmediateData.u[idx]; -declaration_grow( declaration, header ); - } - } - } - if (full_decl-Declaration.File == TGSI_FILE_RESOURCE) { struct tgsi_declaration_resource *dr; diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
b/src/gallium/auxiliary/tgsi/tgsi_dump.c index 3e6f76a..177be0f 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c @@ -347,44 +347,6 @@ iter_declaration( TXT( , INVARIANT ); } - - if (decl-Declaration.File == TGSI_FILE_IMMEDIATE_ARRAY) { - unsigned i; - char range_indent[4]; - - TXT( {); - - if (decl-Range.Last 10) - range_indent[0] = '\0'; - else if (decl-Range.Last 100) { - range_indent[0] = ' '; - range_indent[1] = '\0'; - } else if (decl-Range.Last 1000) { - range_indent[0] = ' '; - range_indent[1] = ' '; - range_indent[2] = '\0'; - } else { - range_indent[0] = ' '; - range_indent[1] = ' '; - range_indent[2] = ' '; - range_indent[3] = '\0'; - } - - dump_imm_data(iter, decl-ImmediateData.u, -4, TGSI_IMM_FLOAT32); - for(i = 1; i = decl-Range.Last; ++i) { - /* indent by strlen of: - * DCL IMMX[0..1] { */ - CHR('\n'); - TXT( ); - TXT( range_indent ); - dump_imm_data(iter, decl-ImmediateData.u + i, - 4, TGSI_IMM_FLOAT32); - } - - TXT( }); - } - EOL(); return TRUE; diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c index 6a74ef3..838c4a8 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c @@ -748,19 +748,6 @@ tgsi_exec_machine_bind_shader( ++mach-NumOutputs; } } - if (parse.FullToken.FullDeclaration.Declaration.File == - TGSI_FILE_IMMEDIATE_ARRAY) { -unsigned reg; -struct tgsi_full_declaration *decl = - parse.FullToken.FullDeclaration; -debug_assert(decl-Range.Last TGSI_EXEC_NUM_IMMEDIATES); -for (reg = decl-Range.First; reg = decl-Range.Last; ++reg) { - for( i = 0; i 4; i++ ) { - int idx = reg * 4 + i; - mach-ImmArray[reg][i] = decl-ImmediateData.u[idx].Float; - } -} - } memcpy(declarations + numDeclarations, parse.FullToken.FullDeclaration, sizeof(declarations[0])); @@ -1115,16 +1102,6 @@ fetch_src_file_channel(const struct tgsi_exec_machine *mach, } break; - case TGSI_FILE_TEMPORARY_ARRAY: - for (i = 0; i TGSI_QUAD_SIZE; i++) { - 
assert(index-i[i] TGSI_EXEC_NUM_TEMPS); -
[Mesa-dev] [PATCH 7/7] tgsi: use separate structure for indirect address
From: Christian König <christian.koe...@amd.com>

To further improve the optimization of source and destination indirect addressing we need the ability to store a reference to the declaration of the addressed operands.

Since most of the fields in tgsi_src_register don't apply to an indirect addressing operand, replace it with a separate tgsi_ind_register structure and thereby make room for extra information.

Signed-off-by: Christian König <christian.koe...@amd.com>
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 4 +-
 src/gallium/auxiliary/tgsi/tgsi_build.c | 109 +++-
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 28 -
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 8 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.c | 35 +--
 src/gallium/auxiliary/tgsi/tgsi_parse.h | 8 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c | 98 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 38 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h | 7 ++
 src/gallium/auxiliary/tgsi/tgsi_util.c | 18
 src/gallium/auxiliary/tgsi/tgsi_util.h | 3 +
 src/gallium/drivers/i915/i915_fpc.h | 8 +-
 src/gallium/drivers/nv30/nvfx_vertprog.c | 2 +-
 .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 6 ++
 src/gallium/drivers/r600/r600_llvm.c | 2 +-
 src/gallium/include/pipe/p_shader_tokens.h | 18 ++--
 16 files changed, 221 insertions(+), 171 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 69957fe..f8e011e 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -517,12 +517,12 @@ emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
 static LLVMValueRef
 get_indirect_index(struct lp_build_tgsi_soa_context *bld,
                    unsigned reg_file, unsigned reg_index,
-                   const struct tgsi_src_register *indirect_reg)
+                   const struct tgsi_ind_register *indirect_reg)
 {
    LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
    struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
    /* always use X component of address register */
-   unsigned swizzle = indirect_reg->SwizzleX;
+   unsigned swizzle = indirect_reg->Swizzle;
    LLVMValueRef base;
    LLVMValueRef rel;
    LLVMValueRef max_index;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 33cbbd8..e71a6ea 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -816,6 +816,43 @@ tgsi_build_src_register(
    return src_register;
 }

+static struct tgsi_ind_register
+tgsi_default_ind_register( void )
+{
+   struct tgsi_ind_register ind_register;
+
+   ind_register.File = TGSI_FILE_NULL;
+   ind_register.Swizzle = TGSI_SWIZZLE_X;
+   ind_register.Declaration = 0;
+
+   return ind_register;
+}
+
+static struct tgsi_ind_register
+tgsi_build_ind_register(
+   unsigned file,
+   unsigned swizzle,
+   unsigned declaration,
+   int index,
+   struct tgsi_instruction *instruction,
+   struct tgsi_header *header )
+{
+   struct tgsi_ind_register ind_register;
+
+   assert( file < TGSI_FILE_COUNT );
+   assert( swizzle <= TGSI_SWIZZLE_W );
+   assert( index >= -0x8000 && index <= 0x7FFF );
+
+   ind_register.File = file;
+   ind_register.Swizzle = swizzle;
+   ind_register.Index = index;
+   ind_register.Declaration = declaration;
+
+   instruction_grow( instruction, header );
+
+   return ind_register;
+}
+
 static struct tgsi_dimension
 tgsi_default_dimension( void )
 {
@@ -835,9 +872,9 @@ tgsi_default_full_src_register( void )
    struct tgsi_full_src_register full_src_register;

    full_src_register.Register = tgsi_default_src_register();
-   full_src_register.Indirect = tgsi_default_src_register();
+   full_src_register.Indirect = tgsi_default_ind_register();
    full_src_register.Dimension = tgsi_default_dimension();
-   full_src_register.DimIndirect = tgsi_default_src_register();
+   full_src_register.DimIndirect = tgsi_default_ind_register();

    return full_src_register;
 }
@@ -910,9 +947,9 @@ tgsi_default_full_dst_register( void )
    struct tgsi_full_dst_register full_dst_register;

    full_dst_register.Register = tgsi_default_dst_register();
-   full_dst_register.Indirect = tgsi_default_src_register();
+   full_dst_register.Indirect = tgsi_default_ind_register();
    full_dst_register.Dimension = tgsi_default_dimension();
-   full_dst_register.DimIndirect = tgsi_default_src_register();
+   full_dst_register.DimIndirect = tgsi_default_ind_register();

    return full_dst_register;
 }
@@ -1057,24 +1094,18 @@ tgsi_build_full_instruction(
          header );

       if( reg->Register.Indirect ) {
-         struct tgsi_src_register *ind;
+         struct tgsi_ind_register *ind;
Re: [Mesa-dev] [PATCH 1/2] d3d1x: Remove.
On 11.03.2013 11:26, Jose Fonseca wrote:
> (First email was too long, so re-sending just the interesting bits)

Please tell me removing this came to mind because you're going to release a better D3D9,10/11 state tracker :)
(Nah, I guess it would be too much trouble if there are no users for it ...)

This one *did* kind of work, notably also with wine, but it still has loads of bugs and I just don't have the time to improve it and then add the missing bits like deferred contexts, virtual functions, compute shader or UAV support. Also, gallium's still not completely able to support everything properly. It did acquire some of the missing parts, though, since the last time I touched it.

I had succeeded in making Unigine Heaven run (taking a little shortcut with sm4 to nv50; extending the gallium interface for features like tessellation that are still years ahead for all the other parties would not have been well received at that time, at least I had that impression), but all the more complex games I tested crashed somewhere and I wasn't going to try to debug binary blobs (most of them seemed to require those missing features, too).

Anyway, just meant to say, it *could* have been useful had someone finished it ... if only with wine. So I'm fine with removing it since I don't expect anyone to get back to it.

Trying to decide between farewell and good riddance for all the pain its bugs caused me.

> From: José Fonseca <jfons...@vmware.com>
>
> Unused/unmaintained.
Re: [Mesa-dev] [PATCH 1/2] mesa: Add GL_UNSIGNED_BYTE fast-path to fast_read_rgba_pixels_memcpy
On Mon, Mar 11, 2013 at 11:54 AM, Michel Dänzer mic...@daenzer.net wrote:
> On Sun, 2013-03-10 at 23:05 +0100, Martin Andersson wrote:
>> ---
>>  src/mesa/main/readpix.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>> index 2f130ae..349b0bc 100644
>> --- a/src/mesa/main/readpix.c
>> +++ b/src/mesa/main/readpix.c
>> @@ -238,7 +238,7 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>>     }
>>     else if (rb->Format == MESA_FORMAT_XRGB8888 &&
>>         format == GL_BGRA &&
>> -       type == GL_UNSIGNED_INT_8_8_8_8_REV
>> +       (type == GL_UNSIGNED_INT_8_8_8_8_REV || type == GL_UNSIGNED_BYTE)
>
> This cannot be equivalent on little endian and big endian hosts at the same time. As it works for you, it's apparently equivalent on little endian.

ok, I guess it is also undesirable to have lots of special cases there, with checks for lots of different combos of types and endianness.

> I suspect ReadPixels could be made even faster with similar treatment as Marek has applied to TexSubImage etc.

I have looked at TexSubImage and how the radeon dri and intel dri drivers implement glReadPixels (they use a blit call). But I did not understand how I could use it for glReadPixels. I will look at it some more, thanks.

> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer
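The endianness objection above can be illustrated with a minimal sketch. It assumes the usual reading of the two pixel types: GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV is one 32-bit word per pixel with B in the low bits and A in the high bits, while GL_BGRA with GL_UNSIGNED_BYTE is four separate bytes in memory order B, G, R, A. The function names are made up for illustration and are not Mesa code.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/*
 * Packed layout (GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV): one 32-bit word,
 * A in bits 31..24, R in 23..16, G in 15..8, B in 7..0.
 */
static uint32_t pack_bgra_8888_rev(uint8_t b, uint8_t g, uint8_t r, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}

/*
 * Byte layout (GL_BGRA + GL_UNSIGNED_BYTE): memory order B, G, R, A.
 * Returns 1 if the packed word has exactly that memory layout on this host.
 */
static int packed_matches_byte_order(void)
{
    const uint8_t expected[4] = { 0x11, 0x22, 0x33, 0x44 }; /* B, G, R, A */
    const uint32_t word = pack_bgra_8888_rev(0x11, 0x22, 0x33, 0x44);
    return memcmp(&word, expected, 4) == 0;
}
```

On a little-endian host the two layouts coincide, which is why the patch appeared to work; on a big-endian host the word is stored as A, R, G, B and a plain memcpy would return the channels in the wrong order.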
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On 11.03.2013 13:44, Christian König wrote:
> Hi everybody,
>
> this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list.

Nice, finally someone implements a proper solution.

However, it seems like this isn't used for arrays in the IN and OUT files (varyings). Would it be much more work to use it there, too ?

Fragment Shader inputs seem to be read with
if (index == 0) return in[0] else if (index == 1) ...
sequences.

And I may have spotted a bug in the following shader:

in vec4 vertex[2];
in vec4 color;
out vec4 value[4];
uniform int i, j;
void main()
{
   gl_Position = vertex[i];
   value[0] = vertex[0];
   value[1] = vertex[1];
   value[2] = vec4(0.0);
   value[3] = vec4(0.0);
   value[j] = color;
}

gives me

DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL OUT[2], GENERIC[13]
DCL OUT[3], GENERIC[14]
DCL OUT[4], GENERIC[15]
DCL CONST[0..1]
DCL TEMP[0..3], LOCAL
DCL TEMP[4], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {0., 0., 0., 0.}
  0: UARL ADDR[0].x, CONST[1].
  1: MOV TEMP[4], IN[ADDR[0].x]
     (not the bug) but this is invalid as there is no IN array, just single ones
  2: MOV TEMP[0], IN[0]
  3: MOV TEMP[1], IN[1]
  4: MOV TEMP[2], IMM[0].
  5: MOV TEMP[3], IMM[0].
  6: UARL ADDR[0].x, CONST[0].
  7: MOV TEMP[1][ADDR[0].x], IN[2]
     why is this TEMP[1][] ? The array seems to be the first declaration ...
  8: MOV OUT[1], TEMP[0]
  9: MOV OUT[2], TEMP[1]
 10: MOV OUT[3], TEMP[2]
 11: MOV OUT[4], TEMP[3]
 12: MOV OUT[0], TEMP[4]
 13: END

Ideally this would not use TEMP arrays at all though, but output arrays (I vaguely recall some radeon card doesn't support this though. Is that just outputs or also inputs ?).

> The solutions proposed or implemented so far all more or less incomplete, so this approach was designed in mind with both completeness and compatibility with existing code.
>
> Over all it's just an implementation of what Tom Stellard named solution #4 in this eMail thread: http://lists.freedesktop.org/archives/mesa-dev/2013-January/033264.html
>
> Please review and as usual comments are welcome,
> Christian.
Re: [Mesa-dev] [PATCH 1/2] d3d1x: Remove.
----- Original Message -----
> On 11.03.2013 11:26, Jose Fonseca wrote:
>> (First email was too long, so re-sending just the interesting bits)
>
> Please tell me removing this came to mind because you're going to release a better D3D9,10/11 state tracker :)
> (Nah I guess it would be too much trouble if there's no users for it ...)

No.. :) I just noticed it continues to be a distraction when doing interface changes or build cleanups.

> This one *did* kind of work, notably also with wine, but it still has loads of bugs and I just don't have the time to improve it; and then add those missing bits like deferred contexts, virtual functions, compute shader or UAV support.
>
> Also gallium's still not completely able to support everything properly. It did acquire some of the missing parts though since last time I touched it.
>
> I had succeeded in making Unigine Heaven run (taking a little shortcut with sm4 to nv50, extending the gallium interface for features like tessellation that are still years ahead for all the other parties would not have been well received at that time, at least I had that impression), but all the more complex games I tested crashed somewhere and I wasn't going to try to debug binary blobs (most of them seemed to require those missing features, too).
>
> Anyway, just meant to say, it *could* have been useful had someone finished it ... if only with wine.
>
> So I'm fine with removing it since I don't expect anyone to get back to it. Trying to decide between farewell and good riddance for all the pain its bugs caused me.

Thanks. For the record, I have no objection to this component's goals, I always found it a cool project, and I'll happily welcome it back if a maintainer comes along. But until then I'd rather we don't have dead code in master -- the live code is more than enough to keep us totally busy.

Jose
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On 11.03.2013 14:47, Christoph Bumiller wrote:
> On 11.03.2013 13:44, Christian König wrote:
>> Hi everybody,
>>
>> this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list.
>
> Nice, finally someone implements a proper solution.
>
> However, it seems like this isn't used for arrays in the IN and OUT files (varyings). Would it be much more work to use it there, too ?

Shouldn't be too much of a problem, but I just wanted to solve temporaries first and when that's working look at all the rest.

> Fragment Shader inputs seem to be read with
> if (index == 0) return in[0] else if (index == 1) ...
> sequences.

Well, as said before, it only handles temp arrays for now. That looks like the code that's generated if the driver reports that it has no indirect support, do you know off hand where exactly that's handled? The glsl_to_tgsi code is unfortunately hard to read at best.

> And I may have spotted a bug in the following shader:
>
> in vec4 vertex[2];
> in vec4 color;
> out vec4 value[4];
> uniform int i, j;
> void main()
> {
>    gl_Position = vertex[i];
>    value[0] = vertex[0];
>    value[1] = vertex[1];
>    value[2] = vec4(0.0);
>    value[3] = vec4(0.0);
>    value[j] = color;
> }
>
> gives me
>
> DCL IN[0]
> DCL IN[1]
> DCL IN[2]
> DCL OUT[0], POSITION
> DCL OUT[1], GENERIC[12]
> DCL OUT[2], GENERIC[13]
> DCL OUT[3], GENERIC[14]
> DCL OUT[4], GENERIC[15]
> DCL CONST[0..1]
> DCL TEMP[0..3], LOCAL
> DCL TEMP[4], LOCAL
> DCL ADDR[0]
> IMM[0] FLT32 {0., 0., 0., 0.}
>   0: UARL ADDR[0].x, CONST[1].
>   1: MOV TEMP[4], IN[ADDR[0].x]
>      (not the bug) but this is invalid as there is no IN array, just single ones
>   2: MOV TEMP[0], IN[0]
>   3: MOV TEMP[1], IN[1]
>   4: MOV TEMP[2], IMM[0].
>   5: MOV TEMP[3], IMM[0].
>   6: UARL ADDR[0].x, CONST[0].
>   7: MOV TEMP[1][ADDR[0].x], IN[2]
>      why is this TEMP[1][] ? The array seems to be the first declaration ...

I numbered the declarations starting with 1 (and not 0), so I could use 0 as the SPECIAL case saying that we want to address the whole range of registers and not just one declaration. I did this just for compatibility reasons, so I could look at handling temps only, and not bother too much with inputs/outputs.

Well, so far the patchset is just an RFC, and so I want to let the list see the patches before either implementing inputs/outputs as well or fully documenting such quirks/hacks.

>   8: MOV OUT[1], TEMP[0]
>   9: MOV OUT[2], TEMP[1]
>  10: MOV OUT[3], TEMP[2]
>  11: MOV OUT[4], TEMP[3]
>  12: MOV OUT[0], TEMP[4]
>  13: END
>
> Ideally this would not use TEMP arrays at all though, but output arrays (I vaguely recall some radeon card doesn't support this though. Is that just outputs or also inputs ?).

More or less correct, modern radeons don't have an output register space, but instead have export instructions. In the current driver we allocate registers for temps and outputs to work around this, but in the example above it wouldn't be necessary.

Inputs are just registers as well, either preloaded when starting the shader or filled in by special instructions (vector fetches, coordinate interpolation etc...).

Thanks for the comments,
Christian.
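The numbering convention described in this reply (0 means the whole register file, n selects the n-th declaration counted from 1) can be sketched as a small lookup. This is only an illustration under that assumption; `struct temp_decl` and `resolve_indirect_range` are hypothetical names, not part of the patch set or of Gallium.

```c
#include <stddef.h>

/* Hypothetical description of one "DCL TEMP[first..last]" line. */
struct temp_decl {
    unsigned first;
    unsigned last;
};

/*
 * Work out which registers an indirect access may touch.
 *   decl_id == 0: no extra information, the whole file [0..file_max].
 *   decl_id == n: only the n-th declaration, counted from 1.
 * Returns 0 on success, -1 if decl_id does not name a declaration.
 */
static int resolve_indirect_range(const struct temp_decl *decls,
                                  size_t num_decls, unsigned file_max,
                                  unsigned decl_id,
                                  unsigned *first, unsigned *last)
{
    if (decl_id == 0) {
        *first = 0;
        *last = file_max;
        return 0;
    }
    if (decl_id > num_decls)
        return -1;

    *first = decls[decl_id - 1].first;
    *last = decls[decl_id - 1].last;
    return 0;
}
```

With the two declarations from the dump, `DCL TEMP[0..3]` and `DCL TEMP[4]`, an access written as `TEMP[1][ADDR[0].x]` would resolve to registers 0..3 under this reading, which matches the explanation of why the array shows up as `TEMP[1][]`.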
Re: [Mesa-dev] [PATCH 7/7] tgsi: use separate structure for indirect address
Christian,

I didn't comment on the previous threads, but the approach mentioned in http://lists.freedesktop.org/archives/mesa-dev/2012-November/030476.html seems sensible to me.

I think after the first round we should have this in a branch to allow drivers to catch up with the interface change. Or is it possible for drivers to opt-in via a cap?

Also, I think that a nice summary of how this is supposed to work in src/gallium/docs is a must.

A few more remarks inline.

----- Original Message -----
> From: Christian König christian.koe...@amd.com
>
> To further improve the optimization of source and destination indirect addressing we need the ability to store a reference to the declaration of the addressed operands.

Just to be perfectly clear, declaration number does not refer to the n-th TEMP declaration, but declaration of TEMP[n], right?

That is, this

  DCL TEMP[1][0..70]
  DCL TEMP[2][0..7]
  MOV OUT[1], TEMP[1][ADDR[0].x]

and this

  DCL TEMP[2][0..7]
  DCL TEMP[1][0..70]
  MOV OUT[1], TEMP[1][ADDR[0].x]

are equivalent, right?

If so, I wonder if there is a name more descriptive than Declaration here. Maybe Range, or IndexableRange?

> Since most of the fields in tgsi_src_register doesn't apply for an indirect addressing operand replace it with a separate tgsi_ind_register structure and so make room for extra information.
> Signed-off-by: Christian König christian.koe...@amd.com
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 4 +-
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 109 +++-
>  src/gallium/auxiliary/tgsi/tgsi_dump.c | 28 -
>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 8 +-
>  src/gallium/auxiliary/tgsi/tgsi_parse.c | 35 +--
>  src/gallium/auxiliary/tgsi/tgsi_parse.h | 8 +-
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 98 +-
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 38 +++
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h | 7 ++
>  src/gallium/auxiliary/tgsi/tgsi_util.c | 18
>  src/gallium/auxiliary/tgsi/tgsi_util.h | 3 +
>  src/gallium/drivers/i915/i915_fpc.h | 8 +-
>  src/gallium/drivers/nv30/nvfx_vertprog.c | 2 +-
>  .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 6 ++
>  src/gallium/drivers/r600/r600_llvm.c | 2 +-
>  src/gallium/include/pipe/p_shader_tokens.h | 18 ++--
>  16 files changed, 221 insertions(+), 171 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 69957fe..f8e011e 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -517,12 +517,12 @@ emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
>  static LLVMValueRef
>  get_indirect_index(struct lp_build_tgsi_soa_context *bld,
>                     unsigned reg_file, unsigned reg_index,
> -                   const struct tgsi_src_register *indirect_reg)
> +                   const struct tgsi_ind_register *indirect_reg)
>  {
>     LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
>     struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
>     /* always use X component of address register */
> -   unsigned swizzle = indirect_reg->SwizzleX;
> +   unsigned swizzle = indirect_reg->Swizzle;
>     LLVMValueRef base;
>     LLVMValueRef rel;
>     LLVMValueRef max_index;
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index 33cbbd8..e71a6ea 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -816,6 +816,43 @@ tgsi_build_src_register(
>     return src_register;
>  }
>
> +static struct tgsi_ind_register
> +tgsi_default_ind_register( void )
> +{
> +   struct tgsi_ind_register ind_register;
> +
> +   ind_register.File = TGSI_FILE_NULL;
> +   ind_register.Swizzle = TGSI_SWIZZLE_X;
> +   ind_register.Declaration = 0;
> +
> +   return ind_register;
> +}
> +
> +static struct tgsi_ind_register
> +tgsi_build_ind_register(
> +   unsigned file,
> +   unsigned swizzle,
> +   unsigned declaration,
> +   int index,
> +   struct tgsi_instruction *instruction,
> +   struct tgsi_header *header )
> +{
> +   struct tgsi_ind_register ind_register;
> +
> +   assert( file < TGSI_FILE_COUNT );
> +   assert( swizzle <= TGSI_SWIZZLE_W );
> +   assert( index >= -0x8000 && index <= 0x7FFF );
> +
> +   ind_register.File = file;
> +   ind_register.Swizzle = swizzle;
> +   ind_register.Index = index;
> +   ind_register.Declaration = declaration;
> +
> +   instruction_grow( instruction, header );
> +
> +   return ind_register;
> +}
> +
>  static struct tgsi_dimension
>  tgsi_default_dimension( void )
>  {
> @@ -835,9 +872,9 @@ tgsi_default_full_src_register( void )
>     struct tgsi_full_src_register full_src_register;
>
>     full_src_register.Register = tgsi_default_src_register();
> -
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
I'm surprised this is faster.

In particular, for big things we'll be touching memory twice.

Did you measure the speed up?

Jose

----- Original Message -----
> ---
>  src/mesa/main/readpix.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
> index 349b0bc..0f5c84c 100644
> --- a/src/mesa/main/readpix.c
> +++ b/src/mesa/main/readpix.c
> @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>        }
>     } else if (copy_xrgb) {
>        /* convert xrgb -> argb */
> +      int alphaOffset = texelBytes - 1;
>        for (j = 0; j < height; j++) {
> -         GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
> +         memcpy(dst, map, width * texelBytes);
>           int i;
>           for (i = 0; i < width; i++) {
> -            dst4[i] = map4[i] | 0xff000000;  /* set A=0xff */
> +            dst[i * texelBytes + alphaOffset] = 0xff;  /* set A=0xff */
>           }
>           dst += dstStride;
>           map += stride;
> --
> 1.8.1.5
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
On Mon, Mar 11, 2013 at 9:56 AM, Jose Fonseca jfons...@vmware.com wrote:
> I'm surprised this is is faster.
>
> In particular, for big things we'll be touching memory twice.
>
> Did you measure the speed up?
>
> Jose

I'm sorry to be dull, but is there a SSE2 implementation of this somewhere for x86 / x64 CPUs?

Patrick

> ----- Original Message -----
>> ---
>>  src/mesa/main/readpix.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>> index 349b0bc..0f5c84c 100644
>> --- a/src/mesa/main/readpix.c
>> +++ b/src/mesa/main/readpix.c
>> @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>>        }
>>     } else if (copy_xrgb) {
>>        /* convert xrgb -> argb */
>> +      int alphaOffset = texelBytes - 1;
>>        for (j = 0; j < height; j++) {
>> -         GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
>> +         memcpy(dst, map, width * texelBytes);
>>           int i;
>>           for (i = 0; i < width; i++) {
>> -            dst4[i] = map4[i] | 0xff000000;  /* set A=0xff */
>> +            dst[i * texelBytes + alphaOffset] = 0xff;  /* set A=0xff */
>>           }
>>           dst += dstStride;
>>           map += stride;
>> --
>> 1.8.1.5
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On 11.03.2013 15:38, Christian König wrote:
> On 11.03.2013 14:47, Christoph Bumiller wrote:
>> On 11.03.2013 13:44, Christian König wrote:
>>> Hi everybody,
>>>
>>> this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list.
>>
>> Nice, finally someone implements a proper solution.
>>
>> However, it seems like this isn't used for arrays in the IN and OUT files (varyings). Would it be much more work to use it there, too ?
>
> Shouldn't be too much of a problem, but I just wanted to solve temporaries first and when that's working look at all the rest.
>
>> Fragment Shader inputs seem to be read with
>> if (index == 0) return in[0] else if (index == 1) ...
>> sequences.
>
> Well, as said before, it only handles temp arrays for now. That looks like the code that's generated if the driver reports that it has no indirect support, do you know off hand where exactly that's handled? The glsl_to_tgsi code is unfortunately hard to read at best.

Apologies, I didn't remember that I didn't advertise indirect support for fragment shaders, indirect inputs would be supported though.

The reason why I really want array support for inputs, too, is that input space location depends on semantic, and thus doesn't necessarily correspond to the TGSI order.

Treatment of arrays should be consistent in the end, right now it looks like we're having, if you read this like C code:

float temp0[4];
temp0[i] = x;

but

float in0, in1, in2, in3;
x = in[i];

>> why is this TEMP[1][] ? The array seems to be the first declaration ...
>
> I numbered the declarations starting with 1 (and not 0), so I could use 0 as the SPECIAL case saying that we want to address the whole range of registers and not just one declaration. I did this just for compatibility reasons, so I could look at handling temps only, and not bother too much with inputs/outputs.
>
> Well, so far the patchset is just an RFC, and so I want to let the list see the patches before either implementing inputs/outputs as well or fully documenting such quirks/hacks.

Ah, good to know. This should be documented (maybe it is and I missed it ?). At least in the comment above struct tgsi_ind_register's definition, which is what I'd look at first.

Thanks again for doing this.
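The C analogy in this message can be made concrete with a short sketch. It only illustrates the contrast being drawn (separately declared inputs force a compare ladder for a dynamic index, while an array allows one indexed read); the function names are hypothetical and nothing here is generated by glsl_to_tgsi.

```c
/*
 * Inputs declared as four unrelated scalars: a dynamic index cannot be
 * applied directly, so it is lowered into a chain of comparisons, like
 * the "if (index == 0) ... else if (index == 1) ..." sequences above.
 */
static float read_input_ladder(float in0, float in1, float in2, float in3,
                               int i)
{
    if (i == 0)
        return in0;
    else if (i == 1)
        return in1;
    else if (i == 2)
        return in2;
    return in3;
}

/* Inputs declared as one array: the same access is a single indexed read. */
static float read_input_array(const float in[4], int i)
{
    return in[i];
}
```

Both functions return the same value for indices 0 to 3; the difference is only in how much code the access costs and whether the backend can see that the four registers belong together.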
Re: [Mesa-dev] [PATCH 7/7] tgsi: use separate structure for indirect address
On 11.03.2013 15:52, Jose Fonseca wrote:
> Christian,
>
> I didn't comment on the previous threads, but the approach mentioned in http://lists.freedesktop.org/archives/mesa-dev/2012-November/030476.html seems sensible to me.
>
> I think after the first round we should have this in a branch to allow drivers to catch up with the interface change. Or is it possible for drivers to opt-in via a cap?

It's not the drivers that would need to change, but the state trackers. If the drivers just ignore the additional information, nothing should change for them.

For the state trackers my current approach also doesn't need them to change. The semantics are currently as follows: if Declaration==0 then we fall back to the old behavior, i.e. the whole register file is indirectly addressed. Otherwise the state tracker (currently only glsl_to_tgsi) has provided the necessary information in the Declaration field to indirectly address only a certain part of the register file.

> Also, I think that a nice summary of how this is supposed to work in src/gallium/docs is a must.

Of course, yes. So far I haven't written ANY documentation at all. I just coded the patch and wanted to discuss all the details first before proceeding with anything that wouldn't be accepted.

> A few more remarks inline.
>
> ----- Original Message -----
>> From: Christian König christian.koe...@amd.com
>>
>> To further improve the optimization of source and destination indirect addressing we need the ability to store a reference to the declaration of the addressed operands.
>
> Just to be perfectly clear, declaration number does not refer to the n-th TEMP declaration, but declaration of TEMP[n], right?

No, currently it indeed refers to the n-th TEMP declaration. But I'm still fighting with myself whether or not that's a good idea.

> That is, this
>
>   DCL TEMP[1][0..70]
>   DCL TEMP[2][0..7]
>   MOV OUT[1], TEMP[1][ADDR[0].x]
>
> and this
>
>   DCL TEMP[2][0..7]
>   DCL TEMP[1][0..70]
>   MOV OUT[1], TEMP[1][ADDR[0].x]
>
> are equivalent, right?
>
> If so, I wonder if there is a name more descriptive than Declaration here. Maybe Range, or IndexableRange?

Correct, yes. As said above, currently Declaration refers to the n-th declaration but I can easily add an Indirect flag to tgsi_declaration (there are still 6 bits of padding in it) and have an IndirectRangeID (or ArrayID, ArrayName, whatever, make your choice) token following the declaration.

>> Since most of the fields in tgsi_src_register doesn't apply for an indirect addressing operand replace it with a separate tgsi_ind_register structure and so make room for extra information.
>>
>> Signed-off-by: Christian König christian.koe...@amd.com
>> ---
>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 4 +-
>>  src/gallium/auxiliary/tgsi/tgsi_build.c | 109 +++-
>>  src/gallium/auxiliary/tgsi/tgsi_dump.c | 28 -
>>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 8 +-
>>  src/gallium/auxiliary/tgsi/tgsi_parse.c | 35 +--
>>  src/gallium/auxiliary/tgsi/tgsi_parse.h | 8 +-
>>  src/gallium/auxiliary/tgsi/tgsi_text.c | 98 +-
>>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 38 +++
>>  src/gallium/auxiliary/tgsi/tgsi_ureg.h | 7 ++
>>  src/gallium/auxiliary/tgsi/tgsi_util.c | 18
>>  src/gallium/auxiliary/tgsi/tgsi_util.h | 3 +
>>  src/gallium/drivers/i915/i915_fpc.h | 8 +-
>>  src/gallium/drivers/nv30/nvfx_vertprog.c | 2 +-
>>  .../drivers/nv50/codegen/nv50_ir_from_tgsi.cpp | 6 ++
>>  src/gallium/drivers/r600/r600_llvm.c | 2 +-
>>  src/gallium/include/pipe/p_shader_tokens.h | 18 ++--
>>  16 files changed, 221 insertions(+), 171 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> index 69957fe..f8e011e 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> @@ -517,12 +517,12 @@ emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
>>  static LLVMValueRef
>>  get_indirect_index(struct lp_build_tgsi_soa_context *bld,
>>                     unsigned reg_file, unsigned reg_index,
>> -                   const struct tgsi_src_register *indirect_reg)
>> +                   const struct tgsi_ind_register *indirect_reg)
>>  {
>>     LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
>>     struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
>>     /* always use X component of address register */
>> -   unsigned swizzle = indirect_reg->SwizzleX;
>> +   unsigned swizzle = indirect_reg->Swizzle;
>>     LLVMValueRef base;
>>     LLVMValueRef rel;
>>     LLVMValueRef max_index;
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
>> index 33cbbd8..e71a6ea 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
>> @@ -816,6 +816,43 @@
Re: [Mesa-dev] [PATCH 08/18] build: Move src/mapi/mapi/* to src/mapi/
----- Original Message -----
> how to build mesa in windows use mingw scons ?
>
> if use `scons platform=windows toolchain=crossmingw machine=x86 build=release mesagdi libgl-gdi` will happen [build\windows-x86\mesa\libmesa.a] Error 1 without any tips

Surely there must be more output than that.

Anyway, I wasn't referring to mingw scons in my email. Just `scons libgl-xlib` would be enough.

Jose
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
On Mon, Mar 11, 2013 at 3:56 PM, Jose Fonseca jfons...@vmware.com wrote:
> I'm surprised this is is faster.
>
> In particular, for big things we'll be touching memory twice.
>
> Did you measure the speed up?

I tested it with the previous patch, with GL_UNSIGNED_BYTE, and on that case it was faster, but since that patch was incorrect (I did not take endianness into account) this patch can also probably be discarded.

> Jose
>
> ----- Original Message -----
>> ---
>>  src/mesa/main/readpix.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>> index 349b0bc..0f5c84c 100644
>> --- a/src/mesa/main/readpix.c
>> +++ b/src/mesa/main/readpix.c
>> @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>>        }
>>     } else if (copy_xrgb) {
>>        /* convert xrgb -> argb */
>> +      int alphaOffset = texelBytes - 1;
>>        for (j = 0; j < height; j++) {
>> -         GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
>> +         memcpy(dst, map, width * texelBytes);
>>           int i;
>>           for (i = 0; i < width; i++) {
>> -            dst4[i] = map4[i] | 0xff000000;  /* set A=0xff */
>> +            dst[i * texelBytes + alphaOffset] = 0xff;  /* set A=0xff */
>>           }
>>           dst += dstStride;
>>           map += stride;
>> --
>> 1.8.1.5
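For reference, the two row loops discussed in this thread can be written side by side as a standalone sketch. It assumes 4 bytes per texel and uses made-up function names; it is not the Mesa code itself. The word variant sets alpha through a 32-bit OR, the byte variant copies the row and then stores 0xff into the last byte of each texel.

```c
#include <stdint.h>
#include <string.h>

/* One 32-bit OR per pixel: alpha lives in bits 31..24 of the word. */
static void xrgb_to_argb_words(uint8_t *dst, const uint8_t *src, int width)
{
    int i;
    for (i = 0; i < width; i++) {
        uint32_t texel;
        memcpy(&texel, src + 4 * i, 4);  /* load without alignment assumptions */
        texel |= 0xff000000u;            /* set A=0xff */
        memcpy(dst + 4 * i, &texel, 4);
    }
}

/* Copy the whole row first, then overwrite the last byte of each texel. */
static void xrgb_to_argb_bytes(uint8_t *dst, const uint8_t *src, int width)
{
    const int texel_bytes = 4;
    const int alpha_offset = texel_bytes - 1;
    int i;

    memcpy(dst, src, (size_t)width * texel_bytes);
    for (i = 0; i < width; i++)
        dst[i * texel_bytes + alpha_offset] = 0xff;  /* set A=0xff */
}
```

The byte variant always writes the fourth byte in memory, whereas the word variant writes whichever byte holds bits 31..24. Those are the same byte only on a little-endian host, which is the same endianness caveat raised against the first patch. The byte variant also passes over the destination row twice, which is the cost questioned at the start of the thread.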
Re: [Mesa-dev] [PATCH] mesa: Use correct functions for enum conversion.
On Sun, Mar 10, 2013 at 11:56 PM, Vinson Lee v...@freedesktop.org wrote:
> Fixes mixing enum types defects reported by Coverity.
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  src/mesa/main/errors.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
> index 97f1b8a..684f235 100644
> --- a/src/mesa/main/errors.c
> +++ b/src/mesa/main/errors.c
> @@ -651,8 +651,8 @@ _mesa_DebugMessageControlARB(GLenum gl_source, GLenum gl_type,
>        return;
>     }
>
> -   source = gl_enum_to_debug_severity(gl_source);
> -   type = gl_enum_to_debug_severity(gl_type);
> +   source = gl_enum_to_debug_source(gl_source);
> +   type = gl_enum_to_debug_type(gl_type);
>     severity = gl_enum_to_debug_severity(gl_severity);
>
>     control_app_messages(ctx, source, type, severity, count, ids, enabled);

Reviewed-by: Brian Paul bri...@vmware.com
Re: [Mesa-dev] [PATCH] gallivm: clean up passing derivatives around
I only skimmed, but looks good in principle.

Jose

----- Original Message -----
> From: Roland Scheidegger srol...@vmware.com
>
> Previously, the derivatives were calculated and passed in a packed form to the sample code (for implicit derivatives, explicit derivatives were packed to the same format).
> There's several reasons why this wasn't such a good idea:
> 1) the derivatives may not even be needed (not as bad as it sounds since llvm will just throw the calculations needed for them away but still)
> 2) the special packing format really shouldn't be part of the sampler interface
> 3) depending what the sample code actually does the derivatives will be processed differently, hence there is no ideal packing. For cube maps with explicit derivatives (which we don't do yet) for instance the packing looked downright useless, and for non-isotropic filtering we'd need different calculations too.
>
> So, instead just pass the derivatives as is (for explicit derivatives), or let the rho calculating sample code calculate them itself. This still does exactly the same packing stuff for implicit derivatives for now, though explicit ones are handled in a more straightforward manner (quick estimates show performance should be quite similar, though it is much easier to follow and also does the rho calculation per-pixel until the end, which we eventually need for spec compliance anyway).
>
> No piglit changes.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_quad.c | 14 +-
>  src/gallium/auxiliary/gallivm/lp_bld_sample.c | 271 +
>  src/gallium/auxiliary/gallivm/lp_bld_sample.h | 6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 11 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c | 21 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 122 +-
>  6 files changed, 196 insertions(+), 249 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_quad.c b/src/gallium/auxiliary/gallivm/lp_bld_quad.c
> index 8a0efed..1955add 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_quad.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_quad.c
> @@ -79,14 +79,9 @@ lp_build_ddy(struct lp_build_context *bld,
>  }
>
>  /*
> - * To be able to handle multiple quads at once in texture sampling and
> - * do lod calculations per quad, it is necessary to get the per-quad
> - * derivatives into the lp_build_rho function.
> - * For 8-wide vectors the packed derivative values for 3 coords would
> - * look like this, this scales to a arbitrary (multiple of 4) vector size:
> - * ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy
> + * Helper for building packed ddx/ddy vector for one coord (scalar per quad
> + * values). The vector will look like this (8-wide):
>   * dr1dx dr1dy _ _ dr2dx dr2dy _ _
> - * The second vector will be unused for 1d and 2d textures.
>   */
>  LLVMValueRef
>  lp_build_packed_ddx_ddy_onecoord(struct lp_build_context *bld,
> @@ -121,6 +116,11 @@ lp_build_packed_ddx_ddy_onecoord(struct lp_build_context *bld,
>  }
>
>
> +/*
> + * Helper for building packed ddx/ddy vector for one coord (scalar per quad
> + * values). The vector will look like this (8-wide):
> + * ds1dx ds1dy dt1dx dt1dy ds2dx ds2dy dt2dx dt2dy
> + */
>  LLVMValueRef
>  lp_build_packed_ddx_ddy_twocoord(struct lp_build_context *bld,
>                                   LLVMValueRef a, LLVMValueRef b)
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> index ef0631c..fc8bae7 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> @@ -46,6 +46,7 @@
>  #include "lp_bld_type.h"
>  #include "lp_bld_logic.h"
>  #include "lp_bld_pack.h"
> +#include "lp_bld_quad.h"
>
>
>  /*
> @@ -203,6 +204,9 @@ lp_sampler_static_sampler_state(struct lp_static_sampler_state *state,
>  static LLVMValueRef
>  lp_build_rho(struct lp_build_sample_context *bld,
>               unsigned texture_unit,
> +             LLVMValueRef s,
> +             LLVMValueRef t,
> +             LLVMValueRef r,
>               const struct lp_derivatives *derivs)
>  {
>     struct gallivm_state *gallivm = bld->gallivm;
> @@ -211,8 +215,8 @@ lp_build_rho(struct lp_build_sample_context *bld,
>     struct lp_build_context *float_bld = &bld->float_bld;
>     struct lp_build_context *coord_bld = &bld->coord_bld;
>     struct lp_build_context *perquadf_bld = &bld->perquadf_bld;
> -   const LLVMValueRef *ddx_ddy = derivs->ddx_ddy;
>     const unsigned dims = bld->dims;
> +   LLVMValueRef ddx_ddy[2];
>     LLVMBuilderRef builder = bld->gallivm->builder;
>     LLVMTypeRef i32t = LLVMInt32TypeInContext(bld->gallivm->context);
>     LLVMValueRef index0 = LLVMConstInt(i32t, 0, 0);
> @@ -229,59 +233,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
>     LLVMValueRef i32undef = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
>     LLVMValueRef rho_xvec, rho_yvec;
>
> -   abs_ddx_ddy[0] = lp_build_abs(coord_bld, ddx_ddy[0]);
> -   if (dims > 2) {
> -      abs_ddx_ddy[1] =
Re: [Mesa-dev] [PATCH] mesa: Use correct functions for enum conversion.
Vinson Lee v...@freedesktop.org writes:
> Fixes mixing enum types defects reported by Coverity.

Reviewed-by: Eric Anholt e...@anholt.net

I'm disappointed in gcc that there's -Wenum-compare, but nothing to complain about implicit conversions between enum types. (between an enum and int I'm fine with, unlike C++, but enum to enum is probably a mistake)

Also, further proof that we need more tests.
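The class of bug fixed by this patch can be reproduced in a few lines. The sketch below uses hypothetical enums and token values (not Mesa's types and not the real GLenum numbers) and shows what happens when a source token is run through the severity converter. C permits converting one enum type to another, so whether this is diagnosed depends on the compiler and flags; the explicit cast is only there to keep the sketch warning-free.

```c
/* Hypothetical stand-ins for the internal debug enums. */
enum debug_source   { SRC_API, SRC_APPLICATION, SRC_COUNT };
enum debug_severity { SEV_HIGH, SEV_MEDIUM, SEV_LOW, SEV_COUNT };

/* Hypothetical API tokens; the real GLenum values are different. */
#define TOKEN_SOURCE_API          0x100u
#define TOKEN_SOURCE_APPLICATION  0x101u
#define TOKEN_SEVERITY_HIGH       0x200u
#define TOKEN_SEVERITY_MEDIUM     0x201u
#define TOKEN_SEVERITY_LOW        0x202u

/* Correct converter for source tokens; SRC_COUNT means "unknown". */
static enum debug_source token_to_source(unsigned token)
{
    switch (token) {
    case TOKEN_SOURCE_API:         return SRC_API;
    case TOKEN_SOURCE_APPLICATION: return SRC_APPLICATION;
    default:                       return SRC_COUNT;
    }
}

/* Converter for severity tokens; SEV_COUNT means "unknown". */
static enum debug_severity token_to_severity(unsigned token)
{
    switch (token) {
    case TOKEN_SEVERITY_HIGH:   return SEV_HIGH;
    case TOKEN_SEVERITY_MEDIUM: return SEV_MEDIUM;
    case TOKEN_SEVERITY_LOW:    return SEV_LOW;
    default:                    return SEV_COUNT;
    }
}

/*
 * The mistake: the wrong converter is called, and its result is stored
 * in a variable of a different enum type.
 */
static enum debug_source buggy_token_to_source(unsigned token)
{
    return (enum debug_source)token_to_severity(token);
}
```

Here `buggy_token_to_source(TOKEN_SOURCE_API)` yields the numeric value of `SEV_COUNT`, which is not a valid `enum debug_source` at all, so every source token would be treated as unknown.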
[Mesa-dev] [Bug 61907] Indirect rendering of multi-texture vertex arrays broken
https://bugs.freedesktop.org/show_bug.cgi?id=61907

--- Comment #3 from Colin McDonald cjmmail10...@yahoo.co.uk ---
The piglit test tests/texturing/tex-skipped-unit.c demonstrates the problem, as it uses texture unit 1 with glTexCoordPointer.

Using direct rendering, the test passes:

$ piglit-run.py --tests=skip tests/all.tests results/tex
[Mon Mar 11 16:28:02 2013] :: running :: spec/!OpenGL 1.2/tex-skipped-unit
[Mon Mar 11 16:28:02 2013] :: pass :: spec/!OpenGL 1.2/tex-skipped-unit

Using indirect rendering, as required for a remote X display, it fails:

$ export LIBGL_ALWAYS_INDIRECT=1
$ piglit-run.py --tests=skip tests/all.tests results/tex
[Mon Mar 11 16:29:04 2013] :: running :: spec/!OpenGL 1.2/tex-skipped-unit
[Mon Mar 11 16:29:04 2013] :: fail :: spec/!OpenGL 1.2/tex-skipped-unit

Using LIBGL patched with the given updates, it passes again:

$ export LIBGL_ALWAYS_INDIRECT=1
$ export LD_LIBRARY_PATH=/home/patch/lib
$ piglit-run.py --tests=skip tests/all.tests results/tex
[Mon Mar 11 16:37:29 2013] :: running :: spec/!OpenGL 1.2/tex-skipped-unit
[Mon Mar 11 16:37:29 2013] :: pass :: spec/!OpenGL 1.2/tex-skipped-unit

I have to say that I've had all sorts of problems getting piglit to run on my Linux system. It's a straightforward CentOS 6.3 installation, but the software renderer doesn't appear to be working properly. Hardware rendering is better. I don't know what's going on with the system, but I don't think those problems are related to this indirect texture protocol issue. Hopefully you will see the same results as given above.
[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call
https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #7 from Keith Kriewall keith.kriew...@attachmate.com ---
In case it helps, it appears that MSVC always treats enum values as signed int. E.g. see:
http://compgroups.net/comp.lang.c++/problem-with-visual-c++-7.1.3088-and-bit-fields/1013665

GCC appears to use unsigned int if no enum values are negative.
http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Structures-unions-enumerations-and-bit_002dfields-implementation.html

The implication is that bit-fields may be a bit short if specified as GLuint.
[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call
https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #8 from Roland Scheidegger srol...@vmware.com ---
(In reply to comment #7)
> In case it helps, it appears that MSVC always treats enum values as signed
> int. E.g. see:
> http://compgroups.net/comp.lang.c++/problem-with-visual-c++-7.1.3088-and-bit-fields/1013665
>
> GCC appears to use unsigned int if no enum values are negative.
> http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Structures-unions-enumerations-and-bit_002dfields-implementation.html
>
> The implication is that bit-fields may be a bit short if specified as GLuint.

I think I'm missing how that could cause the bug we're seeing here.
FWIW struct ureg_dst actually looks definitely buggy to me, since IndirectSwizzle should be unsigned, not int, but I just noticed that now by accident and it's not an issue with ureg_src.
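The width concern from comment #7 can be sketched as a small calculation. This is only an illustration with hypothetical helper names, not code from Mesa: if a compiler treats an enum bit-field as signed, the field needs one more bit than the unsigned case to hold the same largest value.

```c
#include <assert.h>
#include <limits.h>

/* Bits needed to store max_value in an unsigned bit-field. */
static unsigned bits_unsigned(unsigned max_value)
{
   const unsigned limit = (unsigned)(sizeof(max_value) * CHAR_BIT);
   unsigned bits = 1;

   /* Grow the width until no set bit of max_value is left above it. */
   while (bits < limit && (max_value >> bits) != 0)
      bits++;
   return bits;
}

/* A signed (two's complement) bit-field spends one bit on the sign,
 * so the same non-negative range needs one extra bit. */
static unsigned bits_signed(unsigned max_value)
{
   return bits_unsigned(max_value) + 1;
}
```

For example, an enum whose largest value is 15 fits in a 4-bit unsigned field, but a 4-bit field that is treated as signed would read that bit pattern back as a negative number on a two's complement target.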
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb - argb special case in fast_read_rgba_pixels_memcpy
On 03/11/2013 07:56 AM, Jose Fonseca wrote:
> I'm surprised this is faster.
>
> In particular, for big things we'll be touching memory twice.
>
> Did you measure the speed up?

The second hit is cache-hot, so it may not be too expensive. I suspect memcpy is optimized to fill the cache in a more efficient manner than the old loop. Since the old loop did a read and a bit-wise or, it's also possible the compiler generated some really dumb code. We'd have to look at the assembly output to know.

As Patrick suggests, there's probably an SSE2 method to do this even faster. That may be worth investigating.

Once upon a time Matt Turner was talking about using pixman to accelerate operations like this in Mesa. It has a lot of highly optimized paths for just this sort of thing. Since it's used by other projects, it gets a lot more testing, etc. It may be worth looking at using that to solve this problem.

> Jose
>
> ----- Original Message -----
>> ---
>>  src/mesa/main/readpix.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>> index 349b0bc..0f5c84c 100644
>> --- a/src/mesa/main/readpix.c
>> +++ b/src/mesa/main/readpix.c
>> @@ -285,11 +285,12 @@ fast_read_rgba_pixels_memcpy( struct gl_context *ctx,
>>           }
>>        } else if (copy_xrgb) {
>>           /* convert xrgb -> argb */
>> +         int alphaOffset = texelBytes - 1;
>>           for (j = 0; j < height; j++) {
>> -            GLuint *dst4 = (GLuint *) dst, *map4 = (GLuint *) map;
>> +            memcpy(dst, map, width * texelBytes);
>>              int i;

At the very least, the declaration needs to be moved before the memcpy or it will break the build on Windows.

>>              for (i = 0; i < width; i++) {
>> -               dst4[i] = map4[i] | 0xff000000; /* set A=0xff */
>> +               dst[i * texelBytes + alphaOffset] = 0xff; /* set A=0xff */
>>              }
>>              dst += dstStride;
>>              map += stride;
>> --
>> 1.8.1.5
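As a standalone sketch of the approach in the quoted patch (not the Mesa code itself), the per-row work can be written with the declaration placed before the first statement, which is what the review comment asks for. The function name and the fixed 4-byte texel size are assumptions for the example, and it assumes alpha is the last byte of each texel in memory, as the patch does with alphaOffset = texelBytes - 1.

```c
#include <stddef.h>
#include <string.h>

/* Copy one row of 4-byte XRGB texels and force the alpha byte to 0xff.
 * Assumption: alpha is stored in the last byte of each texel. */
static void copy_row_xrgb_to_argb(unsigned char *dst,
                                  const unsigned char *src,
                                  size_t width)
{
   const size_t texel_bytes = 4;
   const size_t alpha_offset = texel_bytes - 1;
   size_t i;   /* declared before any statement, so C89 compilers accept it */

   /* First pass: bulk copy of the whole row. */
   memcpy(dst, src, width * texel_bytes);

   /* Second pass: touch only the alpha byte of each texel. */
   for (i = 0; i < width; i++)
      dst[i * texel_bytes + alpha_offset] = 0xff;
}
```

The row is touched twice, which is the cost being questioned in the thread; whether that beats the single read-or-write loop would still need to be measured.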
Re: [Mesa-dev] glxgears is faster but 3D render is so slow
I don't think you have to worry about the difference in buffer depths. If you really want a 24-bit depth buffer you can do 'export MESA_GLX_DEPTH_BITS=24'

-Brian

On 03/09/2013 12:48 AM, jupiter wrote:
> Hi Brian,
>
> Please see attached config.log.
>
> Let me make a correction: I mean 32 buffer bits and 24 depth bits in DRI, and 24 buffer bits and 16 depth bits in the xlib driver. Will it make a difference if I set 32 buffer bits and 24 depth bits for xlib? If so, how do I do it?
>
> Thank you.
>
> Kind regards.
>
> Jupiter
>
> On 3/8/13, jupiter jupiter@gmail.com wrote:
>> Hi Brian,
>>
>> I finally built Mesa with configuration --enable-xlib-glx --disable-dri --enable-gallium-llvm --with-llvm-shared-libs, with dependencies of llvm and drm. It does not work either, please see the following glxinfo. Please let me know if my configuration is not correct, or if there are any other ways I can try to make it work.
>>
>> $ glxinfo
>> name of display: :0.0
>> display: :0 screen: 0
>> direct rendering: Yes
>> server glx vendor string: Brian Paul
>> server glx version string: 1.4 Mesa 9.1-devel
>> server glx extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
>> client glx vendor string: Brian Paul
>> client glx version string: 1.4 Mesa 9.1-devel
>> client glx extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
>> GLX version: 1.4
>> GLX extensions: GLX_MESA_copy_sub_buffer, GLX_MESA_pixmap_colormap, GLX_MESA_release_buffers, GLX_ARB_get_proc_address, GLX_EXT_texture_from_pixmap, GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer
>> OpenGL vendor string: Brian Paul
>> OpenGL renderer string: Mesa X11
>> OpenGL version string: 2.1 Mesa 9.1-devel
>> OpenGL shading language version string: 1.20
>> OpenGL
extensions: GL_ARB_multisample, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_blend_color, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_copy_texture, GL_EXT_polygon_offset, GL_EXT_subtexture, GL_EXT_texture_object, GL_EXT_vertex_array, GL_EXT_compiled_vertex_array, GL_EXT_texture, GL_EXT_texture3D, GL_IBM_rasterpos_clip, GL_ARB_point_parameters, GL_EXT_draw_range_elements, GL_EXT_packed_pixels, GL_EXT_point_parameters, GL_EXT_rescale_normal, GL_EXT_separate_specular_color, GL_EXT_texture_edge_clamp, GL_SGIS_generate_mipmap, GL_SGIS_texture_border_clamp, GL_SGIS_texture_edge_clamp, GL_SGIS_texture_lod, GL_ARB_multitexture, GL_IBM_multimode_draw_arrays, GL_IBM_texture_mirrored_repeat, GL_3DFX_texture_compression_FXT1, GL_ARB_texture_cube_map, GL_ARB_texture_env_add, GL_ARB_transpose_matrix, GL_EXT_blend_func_separate, GL_EXT_fog_coord, GL_EXT_multi_draw_arrays, GL_EXT_secondary_color, GL_EXT_texture_env_add, GL_EXT_texture_filter_anisotropic, GL_EXT_texture_lod_bias, GL_INGR_blend_func_separate, GL_MESA_resize_buffers, GL_NV_blend_square, GL_NV_light_max_exponent, GL_NV_texgen_reflection, GL_NV_texture_env_combine4, GL_SUN_multi_draw_arrays, GL_ARB_texture_border_clamp, GL_ARB_texture_compression, GL_EXT_framebuffer_object, GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3, GL_MESA_window_pos, GL_NV_packed_depth_stencil, GL_NV_texture_rectangle, GL_ARB_depth_texture, GL_ARB_occlusion_query, GL_ARB_shadow, GL_ARB_texture_env_combine, GL_ARB_texture_env_crossbar, GL_ARB_texture_env_dot3, GL_ARB_texture_mirrored_repeat, GL_ARB_window_pos, GL_ATI_envmap_bumpmap, GL_ATI_fragment_shader, GL_EXT_stencil_two_side, GL_EXT_texture_cube_map, GL_NV_depth_clamp, GL_NV_point_sprite, GL_APPLE_packed_pixels, GL_APPLE_vertex_array_object, GL_ARB_draw_buffers, GL_ARB_fragment_program, GL_ARB_fragment_shader, GL_ARB_shader_objects, GL_ARB_vertex_program, GL_ARB_vertex_shader, GL_ATI_draw_buffers, GL_ATI_texture_env_combine3, GL_EXT_depth_bounds_test, GL_EXT_shadow_funcs, 
GL_EXT_stencil_wrap, GL_MESA_pack_invert, GL_MESA_ycbcr_texture, GL_ARB_depth_clamp, GL_ARB_fragment_program_shadow, GL_ARB_half_float_pixel, GL_ARB_occlusion_query2, GL_ARB_point_sprite, GL_ARB_shading_language_100, GL_ARB_sync, GL_ARB_texture_non_power_of_two, GL_ARB_vertex_buffer_object, GL_ATI_blend_equation_separate, GL_EXT_blend_equation_separate, GL_OES_read_format, GL_ARB_pixel_buffer_object, GL_ARB_texture_compression_rgtc, GL_ARB_texture_rectangle, GL_ATI_texture_compression_3dc, GL_EXT_pixel_buffer_object, GL_EXT_texture_compression_rgtc, GL_EXT_texture_mirror_clamp, GL_EXT_texture_rectangle, GL_EXT_texture_sRGB, GL_EXT_texture_shared_exponent, GL_ARB_framebuffer_object, GL_EXT_framebuffer_blit,
[Mesa-dev] [PATCH] i965/vs: Add IR dumping for immediates.
This makes dump_instructions more useful.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..8e65be8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -981,6 +981,22 @@ vec4_visitor::dump_instruction(vec4_instruction *inst)
       case UNIFORM:
          printf("u%d", inst->src[i].reg);
          break;
+      case IMM:
+         switch (inst->src[i].type) {
+         case BRW_REGISTER_TYPE_F:
+            printf("%fF", inst->src[i].imm.f);
+            break;
+         case BRW_REGISTER_TYPE_D:
+            printf("%dD", inst->src[i].imm.i);
+            break;
+         case BRW_REGISTER_TYPE_UD:
+            printf("%uU", inst->src[i].imm.u);
+            break;
+         default:
+            printf("???");
+            break;
+         }
+         break;
       case BAD_FILE:
          printf("(null)");
          break;
--
1.8.1.5
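The dumping scheme in this patch (a type tag selecting which member of the immediate to print, with a one-letter suffix) can be sketched in plain C. The struct, enum, and function names below are hypothetical stand-ins rather than the i965 types, and the sketch formats into a buffer instead of calling printf so the result can be checked:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the register type tags. */
enum imm_type { IMM_TYPE_F, IMM_TYPE_D, IMM_TYPE_UD, IMM_TYPE_OTHER };

/* An immediate is a type tag plus a union of the possible payloads. */
struct imm_value {
   enum imm_type type;
   union {
      float f;
      int32_t i;
      uint32_t u;
   } imm;
};

/* Format the immediate with the same suffixes the patch uses
 * (F, D, U) and "???" for types it does not know how to print. */
static int format_imm(char *buf, size_t size, const struct imm_value *v)
{
   switch (v->type) {
   case IMM_TYPE_F:
      return snprintf(buf, size, "%fF", v->imm.f);
   case IMM_TYPE_D:
      return snprintf(buf, size, "%dD", (int)v->imm.i);
   case IMM_TYPE_UD:
      return snprintf(buf, size, "%uU", (unsigned)v->imm.u);
   default:
      return snprintf(buf, size, "???");
   }
}
```

The default case matters: an unknown type prints a visible marker rather than reading a union member that was never written.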
Re: [Mesa-dev] [PATCH 09/18] build: Get rid of CORE_DIRS
Matt Turner matts...@gmail.com writes: A step toward working make dist/distcheck. --- configure.ac | 37 - src/Makefile.am | 30 +++--- src/mapi/Makefile.am | 42 ++ 3 files changed, 77 insertions(+), 32 deletions(-) create mode 100644 src/mapi/Makefile.am diff --git a/configure.ac b/configure.ac index 508b176..c023823 100644 --- a/configure.ac +++ b/configure.ac @@ -694,6 +694,13 @@ if test x$enable_gles2 = xyes; then fi AC_SUBST([API_DEFINES]) +AM_CONDITIONAL(HAVE_OPENGL, test x$enable_opengl = xyes) +AM_CONDITIONAL(HAVE_OPENGL_ES1, test x$enable_gles1 = xyes) +AM_CONDITIONAL(HAVE_OPENGL_ES2, test x$enable_gles2 = xyes) +AM_CONDITIONAL(NEED_OPENGL_COMMON, test x$enable_opengl = xyes -o \ +x$enable_gles1 = xyes -o \ +x$enable_gles2 = xyes) Looks like HAVE_OPENGL used to be set like NEED_OPENGL_COMMON, and was in use by egl-static/Makefile.am, so that would need updating. Also, if you're making major revisions to patches previously sent to the list, version information would be nice to have. diff --git a/src/Makefile.am b/src/Makefile.am index d6a7946..9e265d9 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -1,4 +1,28 @@ -SUBDIRS=$(SRC_DIRS) +# Copyright © 2013 Intel Corporation +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. 
> +#
> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> +# IN THE SOFTWARE.
>
> -all-local:
> -	$(MKDIR_P) $(top_builddir)/$(LIB_DIR)

Do we not need this any more? I guess it's in all of the subdirs that generate links into that directory, but it's odd to see it disappear in this commit.
Re: [Mesa-dev] [PATCH] Fix glGetInteger*(GL_SAMPLER_BINDING).
Alan Hourihane al...@fairlite.co.uk writes:

> On 03/06/13 18:36, Brian Paul wrote:
>> On 03/06/2013 11:23 AM, Alan Hourihane wrote:
>>> If the sampler object has been deleted on another context, an alternative context may reference the old sampler. So ensure the sampler object still exists.
>>
>> Alan, is this specified somewhere in a spec? I can't find a description of this behaviour and we don't do this for texture objects or buffer objects, etc.
>
> I can't see it specifically mentioned, apart from the note that when deleting the sampler object it should be unbound from the texture unit, and I did consider the case of buffer texture objects whether to do this there too.
>
> But getting the GL_SAMPLER_BINDING id when switching contexts and attempting to re-bind with glBindSampler() gives a GL error, which seems wrong to me.
>
> I checked with the NVIDIA driver and no GL error is generated.

I would guess that's because binding a non-genned name gens the object, rather than because of some sort of magic in the getter.
Re: [Mesa-dev] [PATCH 01/18] mesa: Replace MESA_VERSION with PACKAGE_VERSION.
Matt Turner matts...@gmail.com writes:

> One fewer place to have to update.

In a couple places here you changed MESA to Mesa in user-visible version strings. I think this is a reasonable and good thing to do, but it's not mentioned in the commit message. If you split that out into a separate patch, then patches 1-4 and 6-7 are

Reviewed-by: Eric Anholt e...@anholt.net

These commit messages are *really* terse, and there have been some interesting tidbits hidden in them, like removing the MKDIR_P, or case changes on version strings.
Re: [Mesa-dev] [PATCH 10/18] build: Get rid of SRC_DIRS
This patch didn't incorporate review from last time.
[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call
https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #9 from Keith Kriewall keith.kriew...@attachmate.com ---
Sorry, I didn't mean to imply that the signed issue is causing this problem. I've tried increasing the 'File' bit field size by one, and it made no obvious difference. I just wanted to note the difference between the compilers, in case it indirectly pertains to the problem.
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb - argb special case in fast_read_rgba_pixels_memcpy
----- Original Message -----
> On 03/11/2013 07:56 AM, Jose Fonseca wrote:
>> I'm surprised this is faster.
>>
>> In particular, for big things we'll be touching memory twice.
>>
>> Did you measure the speed up?
>
> The second hit is cache-hot, so it may not be too expensive.

Yes, but the size in question is 1900x1200, i.e. about 9 MB, which will trash the L1/L2 caches, and won't even fit in the L3 cache of several processors.

I'm afraid we'd be optimizing some cases at the expense of others.

I think that at the very least we should do this in 16KB/32KB or so chunks to avoid trashing the lower level caches.

> I suspect memcpy is optimized to fill the cache in a more efficient manner than the old loop. Since the old loop did a read and a bit-wise or, it's also possible the compiler generated some really dumb code. We'd have to look at the assembly output to know.
>
> As Patrick suggests, there's probably an SSE2 method to do this even faster. That may be worth investigating.

An SSE2 version is quite easy with intrinsics:

   __m128i pixels = _mm_loadu_si128((const __m128i *)src); // could use _mm_load_si128 with some checks
   pixels = _mm_or_si128(pixels, _mm_set1_epi32(0xff000000));
   _mm_storeu_si128((__m128i *)dst, pixels);
   src += sizeof(__m128i) / sizeof *src;
   dst += sizeof(__m128i) / sizeof *dst;

The hard part is the runtime check for SSE2 support...

Jose
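For reference, the intrinsic sequence discussed in this message can be wrapped into a complete row loop. This is a sketch under assumptions: the function name is made up, the SSE2 path is selected at compile time with the `__SSE2__` macro that GCC and Clang define (so it sidesteps, rather than solves, the runtime detection problem), and a scalar loop handles the leftover pixels and non-SSE2 builds.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* OR 0xff000000 into every 32-bit pixel of a row, four at a time when
 * SSE2 is available at compile time, one at a time otherwise. */
static void set_alpha_row(uint32_t *dst, const uint32_t *src, size_t width)
{
   size_t i = 0;

#if defined(__SSE2__)
   const __m128i alpha = _mm_set1_epi32((int)0xff000000u);

   /* Unaligned loads/stores, so no alignment checks are needed. */
   for (; i + 4 <= width; i += 4) {
      __m128i pixels = _mm_loadu_si128((const __m128i *)(src + i));
      pixels = _mm_or_si128(pixels, alpha);
      _mm_storeu_si128((__m128i *)(dst + i), pixels);
   }
#endif

   /* Scalar tail, and the whole row when SSE2 is not compiled in. */
   for (; i < width; i++)
      dst[i] = src[i] | 0xff000000u;
}
```

Unlike the memcpy variant, this reads and writes each pixel once, at the cost of depending on how the build enables SSE2.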
[Mesa-dev] [PATCH 1/2] i965: Apply depthstencil alignment workaround when doing fast clears.
Fast depth clears have the same depth/stencil alignment requirements as other drawing operations. Therefore, we need to call brw_workaround_depthstencil_alignment() from both the clear and drawing paths.

Without this fix, we get image corruption if the following conditions hold: (a) the first ever drawing operation to a depth miplevel (or the first drawing operation after having used the texture for sampling) is a clear, (b) the depth miplevel has a size that is eligible for fast depth clears, and (c) the depth miplevel has an offset within the miptree that isn't 8x8 aligned.

Fixes piglit "depthstencil-render-miplevels" tests with size 273.

NOTE: This is a candidate for stable branches
---
 src/mesa/drivers/dri/i965/brw_clear.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
index 53d8e54..cde1a06 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -40,6 +40,8 @@
 #include "intel_mipmap_tree.h"
 #include "intel_regions.h"
 
+#include "brw_context.h"
+
 #define FILE_DEBUG_FLAG DEBUG_BLIT
 
 static const char *buffer_names[] = {
@@ -219,7 +221,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
 static void
 brw_clear(struct gl_context *ctx, GLbitfield mask)
 {
-   struct intel_context *intel = intel_context(ctx);
+   struct brw_context *brw = brw_context(ctx);
+   struct intel_context *intel = &brw->intel;
 
    if (!_mesa_check_conditional_render(ctx))
       return;
@@ -229,6 +232,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask)
    }
 
    intel_prepare_render(intel);
+   brw_workaround_depthstencil_alignment(brw);
 
    if (mask & BUFFER_BIT_DEPTH) {
       if (brw_fast_clear_depth(ctx)) {
--
1.8.1.5
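Condition (c) above is a simple mask test. The sketch below uses a hypothetical helper name and a fixed 8x8 tile; the driver itself derives its masks elsewhere, so treat this only as an illustration of what "8x8 aligned" means for an (x, y) offset:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when both coordinates of an offset are multiples of 8.
 * The fixed mask of 7 (8 - 1) is an assumption for this example. */
static bool offset_is_8x8_aligned(uint32_t x, uint32_t y)
{
   const uint32_t tile_mask = 7;

   /* Any low bit set in either coordinate means the offset is unaligned. */
   return ((x | y) & tile_mask) == 0;
}
```

An offset such as (16, 8) passes this test, while (273, 0) or (0, 4) does not and would need the workaround path.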
[Mesa-dev] [PATCH 2/2] i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
Since apps typically begin rendering with a call to glClear(), it is likely that when brw_workaround_depthstencil_alignment() moves a miplevel to a temporary buffer, it can avoid doing a blit, since the contents of the miplevel are about to be erased. This patch adds the necessary plumbing to determine when brw_workaround_depthstencil_alignment() is being called as a consequence of glClear(), and avoids the unnecessary blit when it is safe to do so. --- src/mesa/drivers/dri/i965/brw_clear.c| 4 +++- src/mesa/drivers/dri/i965/brw_context.h | 3 ++- src/mesa/drivers/dri/i965/brw_draw.c | 2 +- src/mesa/drivers/dri/i965/brw_misc_state.c | 26 +++- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 2 +- src/mesa/drivers/dri/intel/intel_fbo.c | 10 +++-- src/mesa/drivers/dri/intel/intel_fbo.h | 3 ++- 7 files changed, 38 insertions(+), 12 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c index cde1a06..e740f65 100644 --- a/src/mesa/drivers/dri/i965/brw_clear.c +++ b/src/mesa/drivers/dri/i965/brw_clear.c @@ -223,6 +223,8 @@ brw_clear(struct gl_context *ctx, GLbitfield mask) { struct brw_context *brw = brw_context(ctx); struct intel_context *intel = brw-intel; + struct gl_framebuffer *fb = ctx-DrawBuffer; + bool partial_clear = ctx-Scissor.Enabled !noop_scissor(ctx, fb); if (!_mesa_check_conditional_render(ctx)) return; @@ -232,7 +234,7 @@ brw_clear(struct gl_context *ctx, GLbitfield mask) } intel_prepare_render(intel); - brw_workaround_depthstencil_alignment(brw); + brw_workaround_depthstencil_alignment(brw, partial_clear ? 
0 : mask); if (mask BUFFER_BIT_DEPTH) { if (brw_fast_clear_depth(ctx)) { diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index c34d6b1..5aa0081 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1129,7 +1129,8 @@ void brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt, struct intel_mipmap_tree *stencil_mt, uint32_t *out_tile_mask_x, uint32_t *out_tile_mask_y); -void brw_workaround_depthstencil_alignment(struct brw_context *brw); +void brw_workaround_depthstencil_alignment(struct brw_context *brw, + GLbitfield clear_mask); /*== * brw_queryobj.c diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 9c96f69..149497f 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -439,7 +439,7 @@ static bool brw_try_draw_prims( struct gl_context *ctx, /* This workaround has to happen outside of brw_state_upload() because it * may flush the batchbuffer for a blit, affecting the state flags. 
*/ - brw_workaround_depthstencil_alignment(brw); + brw_workaround_depthstencil_alignment(brw, 0); /* Resolves must occur after updating renderbuffers, updating context state, * and finalizing textures but before setting up any hardware state for diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c index 1024c42..bf367d0 100644 --- a/src/mesa/drivers/dri/i965/brw_misc_state.c +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c @@ -41,6 +41,7 @@ #include brw_defines.h #include main/fbobject.h +#include main/glformats.h /* Constant single cliprect for framebuffer object or DRI2 drawing */ static void upload_drawing_rect(struct brw_context *brw) @@ -328,7 +329,8 @@ get_stencil_miptree(struct intel_renderbuffer *irb) } void -brw_workaround_depthstencil_alignment(struct brw_context *brw) +brw_workaround_depthstencil_alignment(struct brw_context *brw, + GLbitfield clear_mask) { struct intel_context *intel = brw-intel; struct gl_context *ctx = intel-ctx; @@ -341,10 +343,24 @@ brw_workaround_depthstencil_alignment(struct brw_context *brw) struct intel_mipmap_tree *stencil_mt = get_stencil_miptree(stencil_irb); uint32_t tile_x = 0, tile_y = 0, stencil_tile_x = 0, stencil_tile_y = 0; uint32_t stencil_draw_x = 0, stencil_draw_y = 0; + bool invalidate_depth = clear_mask GL_DEPTH_BUFFER_BIT; + bool invalidate_stencil = clear_mask GL_STENCIL_BUFFER_BIT; if (depth_irb) depth_mt = depth_irb-mt; + if (depth_irb invalidate_depth +_mesa_is_depthstencil_format( + _mesa_get_format_base_format(depth_mt-format)) +!depth_mt-stencil_mt) { + /* Depth buffer contains interleaved stencil data, so it's only safe to + * invalidate it if we're also clearing stencil, and both depth_irb and + * stencil_irb point to the
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On 03/11/2013 06:44 AM, Christian König wrote:
> Hi everybody,
>
> this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list.
>
> The solutions proposed or implemented so far are all more or less incomplete, so this approach was designed with both completeness and compatibility with existing code in mind.
>
> Overall it's just an implementation of what Tom Stellard named solution #4 in this eMail thread:
> http://lists.freedesktop.org/archives/mesa-dev/2013-January/033264.html
>
> Please review and as usual comments are welcome,

I still don't quite get what's going on here.

In Christoph's reply, it seems he tested your patch and got TGSI code that looks like this:

DCL IN[0]
DCL IN[1]
DCL IN[2]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL OUT[2], GENERIC[13]
DCL OUT[3], GENERIC[14]
DCL OUT[4], GENERIC[15]
DCL CONST[0..1]
DCL TEMP[0..3], LOCAL
DCL TEMP[4], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {0., 0., 0., 0.}
  0: UARL ADDR[0].x, CONST[1].
  1: MOV TEMP[4], IN[ADDR[0].x]
     (not the bug) but this is invalid as there is no IN array, just single ones
  2: MOV TEMP[0], IN[0]
  3: MOV TEMP[1], IN[1]
  4: MOV TEMP[2], IMM[0].
  5: MOV TEMP[3], IMM[0].
  6: UARL ADDR[0].x, CONST[0].
  7: MOV TEMP[1][ADDR[0].x], IN[2]

What exactly does LOCAL mean on the temp declarations?

But in Jose's example, he wrote:

DCL TEMP[1][0..70]
DCL TEMP[2][0..7]
MOV OUT[1], TEMP[1][ADDR[0].x]

In this code, each chunk of temporaries has an explicit name as Marek suggested in his comments to the #4 proposal.

What exactly is your proposal doing? Can you please provide some more sample TGSI code to illustrate what you're doing? And how would it be extended for inputs/outputs?

Thanks.

-Brian
[Mesa-dev] Software Rendering without X
Hi All,

I don't have any graphics cards that support OpenGL, so I want to perform software rendering with Mesa without X, DRM, etc.

Also, can someone explain how functions such as glClear (with _mesa_clear as the actual implementation) are invoked without GLX?

Thanks
Ritvik
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb - argb special case in fast_read_rgba_pixels_memcpy
On 03/11/2013 11:30 AM, Jose Fonseca wrote:
> ----- Original Message -----
>> On 03/11/2013 07:56 AM, Jose Fonseca wrote:
>>> I'm surprised this is faster.
>>>
>>> In particular, for big things we'll be touching memory twice.
>>>
>>> Did you measure the speed up?
>>
>> The second hit is cache-hot, so it may not be too expensive.
>
> Yes, but the size in question is 1900x1200, i.e. about 9 MB, which will trash the L1/L2 caches, and won't even fit in the L3 cache of several processors.

But it's doing it line-by-line, right? So 1900 * 4 bytes per pixel is only ~8 KB.

> I'm afraid we'd be optimizing some cases at the expense of others.

That is probably true either way. To optimize this for everything, we'd need a lot more tests.

> I think that at the very least we should do this in 16KB/32KB or so chunks to avoid trashing the lower level caches.
>
>> I suspect memcpy is optimized to fill the cache in a more efficient manner than the old loop. Since the old loop did a read and a bit-wise or, it's also possible the compiler generated some really dumb code. We'd have to look at the assembly output to know.
>>
>> As Patrick suggests, there's probably an SSE2 method to do this even faster. That may be worth investigating.
>
> An SSE2 version is quite easy with intrinsics:
>
>    __m128i pixels = _mm_loadu_si128((const __m128i *)src); // could use _mm_load_si128 with some checks
>    pixels = _mm_or_si128(pixels, _mm_set1_epi32(0xff000000));
>    _mm_storeu_si128((__m128i *)dst, pixels);
>    src += sizeof(__m128i) / sizeof *src;
>    dst += sizeof(__m128i) / sizeof *dst;
>
> The hard part is the runtime check for SSE2 support...

We could start by doing something like this for 64-bit builds. SSE2 is always available there. :) If we're using the intrinsics anyway, it's probably even better to use PREFETCHNTA on the read.

Mesa has some code for detecting CPU capabilities, but I don't think it has been updated in ages... It looks like src/mesa/x86/common_x86.c detects MMX and SSE, but there's no code for anything after that.

> Jose
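The two sizes being compared in this exchange are easy to check. A minimal sketch with hypothetical helper names, using the 1900x1200 readback and 4 bytes per texel mentioned in the thread:

```c
#include <stddef.h>

/* Bytes touched when processing a single row. */
static size_t row_bytes(size_t width, size_t bytes_per_texel)
{
   return width * bytes_per_texel;
}

/* Bytes touched when processing the whole image in one pass. */
static size_t image_bytes(size_t width, size_t height, size_t bytes_per_texel)
{
   return row_bytes(width, bytes_per_texel) * height;
}
```

With those numbers, one row is 7,600 bytes (roughly 7.4 KiB) while the full image is 9,120,000 bytes (roughly 8.7 MiB). That is why a copy-then-patch pass per row can stay in the low-level caches even though the same two passes over the whole image would not.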
Re: [Mesa-dev] [PATCH] i965/vs: Add IR dumping for immediates.
Kenneth Graunke kenn...@whitecape.org writes:

> This makes dump_instructions more useful.

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
----- Original Message -----
On 03/11/2013 11:30 AM, Jose Fonseca wrote:
----- Original Message -----
On 03/11/2013 07:56 AM, Jose Fonseca wrote:

I'm surprised this is faster. In particular, for big things we'll be touching memory twice. Did you measure the speed up?

The second hit is cache-hot, so it may not be too expensive.

Yes, but the size in question is 1900x1200, i.e., 9MB, which will trash L1-L2 caches, and won't even fit on the L3 cache of several processors.

But it's doing it line-by-line, right? So 1900 * 4bpp is only ~8kb.

Oh I missed that. That looks quite sensible then.

I'm afraid we'd be optimizing some cases at expense of others.

That is probably true either way. To optimize this for everything, we'd need a lot more tests.

I think that at very least we should do this in 16KB/32KB or so chunks to avoid trashing the lower level caches.

I suspect memcpy is optimized to fill the cache in a more efficient manner than the old loop. Since the old loop did a read and a bit-wise or, it's also possible the compiler generated some really dumb code. We'd have to look at the assembly output to know.

As Patrick suggests, there's probably an SSE2 method to do this even faster. That may be worth investigating.

An SSE2 version is quite easy with intrinsics:

    __m128i pixels = _mm_loadu_si128((const __m128i *)src); // could use _mm_load_si128 with some checks
    pixels = _mm_or_si128(pixels, _mm_set1_epi32(0xff000000));
    _mm_storeu_si128((__m128i *)dst, pixels);
    src += sizeof(__m128i) / sizeof *src;
    dst += sizeof(__m128i) / sizeof *dst;

the hard part is the runtime check for sse2 support...

We could start by doing something like this for 64-bit builds. SSE2 is always available there. :)

If we're using the intrinsics anyway, it's probably even better to use PREFETCHNTA on the read.

Yes, that would avoid trashing the cache with one-time reads.

Mesa has some code for detecting CPU capabilities, but I don't think it has been updated in ages... It looks like src/mesa/x86/common_x86.c detects MMX and SSE, but there's no code for anything after that.

Gallium too. Should move that into somewhere shareable...

Jose
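For reference, the intrinsic snippet quoted in this thread could be wrapped into a complete routine roughly as below. This is only a sketch: copy_xrgb_to_argb is a made-up name, not a Mesa function, and it assumes 32-bit pixels whose alpha byte is the top byte. The SSE2 path is compiled only when the compiler reports SSE2 (always the case on x86-64, as noted in the thread); a scalar loop handles the tail and non-SSE2 builds.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define EXAMPLE_HAVE_SSE2 1
#endif

/* Hypothetical helper: copy n XRGB8888 pixels, forcing alpha to 0xff. */
static void
copy_xrgb_to_argb(uint32_t *dst, const uint32_t *src, size_t n)
{
   size_t i = 0;

#ifdef EXAMPLE_HAVE_SSE2
   /* Four pixels per iteration; unaligned loads/stores need no checks. */
   const __m128i alpha = _mm_set1_epi32((int)0xff000000u);
   for (; i + 4 <= n; i += 4) {
      __m128i pixels = _mm_loadu_si128((const __m128i *)(src + i));
      pixels = _mm_or_si128(pixels, alpha);
      _mm_storeu_si128((__m128i *)(dst + i), pixels);
   }
#endif

   /* Scalar tail (or the whole row when SSE2 is not available). */
   for (; i < n; i++)
      dst[i] = src[i] | 0xff000000u;
}
```

Calling this once per row keeps the working set at one scanline, which is the line-by-line behaviour discussed above.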
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
Ian Romanick i...@freedesktop.org writes:
On 03/11/2013 07:56 AM, Jose Fonseca wrote:

I'm surprised this is faster. In particular, for big things we'll be touching memory twice. Did you measure the speed up?

The second hit is cache-hot, so it may not be too expensive. I suspect memcpy is optimized to fill the cache in a more efficient manner than the old loop. Since the old loop did a read and a bit-wise or, it's also possible the compiler generated some really dumb code. We'd have to look at the assembly output to know.

This is readpixels. You are probably reading from uncached memory (assuming the driver didn't do something clever), so you want the biggest possible word read at a time (memcpy, not 32-bits in a loop), or if you're on a core2 or better CPU, you want to use movntdqa for the read so you get streaming performance.

If anyone's interested, there's some code in the movntdqa branch of my tree (for the ugly old span code and pre-automake), and the movnt branch of my tree (that does automake integration and is much prettier, but movntdqa is the instruction you want).
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On 11.03.2013 19:33, Brian Paul wrote:
On 03/11/2013 06:44 AM, Christian König wrote:

Hi everybody,

this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list. The solutions proposed or implemented so far are all more or less incomplete, so this approach was designed with both completeness and compatibility with existing code in mind. Overall it's just an implementation of what Tom Stellard named solution #4 in this email thread: http://lists.freedesktop.org/archives/mesa-dev/2013-January/033264.html

Please review and as usual comments are welcome,

I still don't quite get what's going on here. In Christoph's reply, it seems he tested your patch and got TGSI code that looks like this:

    DCL IN[0]
    DCL IN[1]
    DCL IN[2]
    DCL OUT[0], POSITION
    DCL OUT[1], GENERIC[12]
    DCL OUT[2], GENERIC[13]
    DCL OUT[3], GENERIC[14]
    DCL OUT[4], GENERIC[15]
    DCL CONST[0..1]
    DCL TEMP[0..3], LOCAL
    DCL TEMP[4], LOCAL
    DCL ADDR[0]
    IMM[0] FLT32 {0., 0., 0., 0.}
    0: UARL ADDR[0].x, CONST[1].
    1: MOV TEMP[4], IN[ADDR[0].x]
       (not the bug) but this is invalid as there is no IN array, just single ones
    2: MOV TEMP[0], IN[0]
    3: MOV TEMP[1], IN[1]
    4: MOV TEMP[2], IMM[0].
    5: MOV TEMP[3], IMM[0].
    6: UARL ADDR[0].x, CONST[0].
    7: MOV TEMP[1][ADDR[0].x], IN[2]

What exactly does LOCAL mean on the temp declarations?

That the register isn't used for parameter passing between subroutines. It was introduced a long time ago; see commit 2644952bd4dfa3b75112dee8dfd287a12d770705.

But in Jose's example, he wrote:

    DCL TEMP[1][0..70]
    DCL TEMP[2][0..7]
    MOV OUT[1], TEMP[1][ADDR[0].x]

In this code, each chunk of temporaries has an explicit name as Marek suggested in his comments to the #4 proposal.

The point is that TEMP (and all other spaces likewise) are still a single space, i.e. without duplicate indices. The only real change is that an indirect access is supplied with the index of the declaration whose range will be accessed.

What exactly is your proposal doing? Can you please provide some more sample TGSI code to illustrate what you're doing? And, how it would be extended for inputs/outputs?

Thanks.

-Brian
Re: [Mesa-dev] [PATCH 7/7] tgsi: use separate structure for indirect address
----- Original Message -----
On 11.03.2013 15:52, Jose Fonseca wrote:

Christian, I didn't comment on the previous threads, but the approach mentioned in http://lists.freedesktop.org/archives/mesa-dev/2012-November/030476.html seems sensible to me.

I think after the first round we should have this in a branch to allow drivers to catch up with the interface change. Or is it possible for drivers to opt-in via a cap?

It is not the drivers that need to change, but the state trackers. If the drivers just ignore the additional information, nothing should change for them.

I think that drivers like llvmpipe will choke on declarations like

    DCL TEMP[1][0..7]

So if the state trackers start emitting these I think we'll nee

For the state trackers my current approach also doesn't need them to change. Currently the semantics are as follows: if Declaration==0 then we fall back to the old behavior, i.e. the whole register file is indirectly addressed. Otherwise the state tracker (currently only glsl_to_tgsi) has provided the necessary information in the Declaration field to indirectly address only a certain part of the register file.

Yes, I like that. I think we could eventually be strict and forbid indirect declaration==0, but it's nice not to have to rush.

A few more remarks inline.

----- Original Message -----
From: Christian König christian.koe...@amd.com

To further improve the optimization of source and destination indirect addressing we need the ability to store a reference to the declaration of the addressed operands.

Just to be perfectly clear, declaration number does not refer to the n-th TEMP declaration, but declaration of TEMP[n], right?

No, currently it indeed refers to the n-th TEMP declaration. But I'm still fighting with myself whether or not that's a good idea.

I think that using the n-th TEMP[?] declaration instead of the declaration of TEMP[n] might be a bad precedent.

That is, this

    DCL TEMP[1][0..70]
    DCL TEMP[2][0..7]
    MOV OUT[1], TEMP[1][ADDR[0].x]

and this

    DCL TEMP[2][0..7]
    DCL TEMP[1][0..70]
    MOV OUT[1], TEMP[1][ADDR[0].x]

are equivalent, right? If so, I wonder if there is a name more descriptive than Declaration here. Maybe Range, or IndexableRange?

Correct, yes. As said above, currently Declaration refers to the n-th declaration but I can easily add an Indirect flag to tgsi_declaration

I think I'd prefer something along these lines.

(there are still 6 bits of padding in it) and have an IndirectRangeID (or ArrayID, ArrayName, whatever, make your choice) token following the declaration.

Array* sounds great to me.

Jose
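To make the semantics discussed in this thread concrete, here is a rough C sketch of how a consumer might resolve an indirect access to a register range. Everything in it is hypothetical: struct array_decl, struct index_range and indirect_range() are illustration-only names, not TGSI structures, and the ranges are assumed to be non-overlapping parts of a single register file. An ID of 0 falls back to the whole file (the old behaviour), and lookup goes by ID rather than by declaration position, so declaration order does not matter.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real TGSI token structures. */
struct array_decl {
   unsigned array_id;   /* 0 is reserved for "no array information" */
   unsigned first;      /* first register index covered by the array */
   unsigned last;       /* last register index covered by the array */
};

struct index_range {
   unsigned first;
   unsigned last;
};

/* Return the registers an indirect access with the given ID may touch. */
static struct index_range
indirect_range(const struct array_decl *decls, size_t num_decls,
               unsigned array_id, unsigned file_max)
{
   const struct index_range whole = { 0, file_max };
   size_t i;

   if (array_id == 0)
      return whole;   /* old behaviour: whole register file */

   for (i = 0; i < num_decls; i++) {
      if (decls[i].array_id == array_id) {
         const struct index_range r = { decls[i].first, decls[i].last };
         return r;
      }
   }

   return whole;      /* unknown ID: stay conservative */
}
```

A driver that ignores the ID simply keeps treating every indirect access as if it returned the whole file.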
[Mesa-dev] [PATCH 01/12] i965: Change fragment input related bitfields to 64-bit.
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.

This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums. The combination of these
two will require at least 55 bits in the bitfields.
---
 src/mesa/drivers/dri/i965/brw_context.h     |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp        |  7 ---
 src/mesa/drivers/dri/i965/brw_vs_constval.c | 18 +-
 src/mesa/drivers/dri/i965/brw_wm.c          |  2 +-
 src/mesa/drivers/dri/i965/brw_wm.h          |  2 +-
 5 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index c34d6b1..a8d802a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -978,7 +978,7 @@ struct brw_context
       /** Input sizes, calculated from active vertex program.
        * One bit per fragment program input attribute.
        */
-      GLbitfield input_size_masks[4];
+      GLbitfield64 input_size_masks[4];
 
       /** offsets in the batch to sampler default colors (texture border color)
        */
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..3ee2780 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1044,7 +1044,8 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
        */
       if (location >= FRAG_ATTRIB_TEX0 &&
           location <= FRAG_ATTRIB_TEX7 &&
-          k == 3 && !(c->key.proj_attrib_mask & (1 << location))) {
+          k == 3 && !(c->key.proj_attrib_mask
+                      & BITFIELD64_BIT(location))) {
          emit(BRW_OPCODE_MOV, attr, fs_reg(1.0f));
       } else {
          struct brw_reg interp = interp_reg(location, k);
@@ -2987,7 +2988,7 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
    }
 
    if (prog->Name != 0)
-      key.proj_attrib_mask = 0xffffffff;
+      key.proj_attrib_mask = ~(GLbitfield64) 0;
 
    if (intel->gen < 6)
       key.vp_outputs_written |= BITFIELD64_BIT(FRAG_ATTRIB_WPOS);
@@ -2997,7 +2998,7 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog)
          continue;
 
       if (prog->Name == 0)
-         key.proj_attrib_mask |= 1 << i;
+         key.proj_attrib_mask |= BITFIELD64_BIT(i);
 
       if (intel->gen < 6) {
          int vp_index = _mesa_vert_result_to_frag_attrib((gl_vert_result) i);
diff --git a/src/mesa/drivers/dri/i965/brw_vs_constval.c b/src/mesa/drivers/dri/i965/brw_vs_constval.c
index 48635c5..f6ac256 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_constval.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_constval.c
@@ -40,7 +40,7 @@ struct tracker {
    bool twoside;
 
    GLubyte active[PROGRAM_OUTPUT+1][MAX_PROGRAM_TEMPS];
-   GLbitfield size_masks[4];  /**< one bit per fragment program input attrib */
+   GLbitfield64 size_masks[4];  /**< one bit per fragment program input attrib */
 };
 
 
@@ -151,10 +151,10 @@ static void calc_sizes( struct tracker *t )
          continue;
 
       switch (get_output_size(t, vertRes)) {
-      case 4: t->size_masks[4-1] |= 1 << fragAttrib;
-      case 3: t->size_masks[3-1] |= 1 << fragAttrib;
-      case 2: t->size_masks[2-1] |= 1 << fragAttrib;
-      case 1: t->size_masks[1-1] |= 1 << fragAttrib;
+      case 4: t->size_masks[4-1] |= BITFIELD64_BIT(fragAttrib);
+      case 3: t->size_masks[3-1] |= BITFIELD64_BIT(fragAttrib);
+      case 2: t->size_masks[2-1] |= BITFIELD64_BIT(fragAttrib);
+      case 1: t->size_masks[1-1] |= BITFIELD64_BIT(fragAttrib);
          break;
       }
    }
@@ -200,10 +200,10 @@ static void calc_wm_input_sizes( struct brw_context *brw )
     * that correct code is generated.
     */
    if (vp->program.Base.NumInstructions == 0) {
-      brw->wm.input_size_masks[0] = ~0;
-      brw->wm.input_size_masks[1] = ~0;
-      brw->wm.input_size_masks[2] = ~0;
-      brw->wm.input_size_masks[3] = ~0;
+      brw->wm.input_size_masks[0] = ~(GLbitfield64) 0;
+      brw->wm.input_size_masks[1] = ~(GLbitfield64) 0;
+      brw->wm.input_size_masks[2] = ~(GLbitfield64) 0;
+      brw->wm.input_size_masks[3] = ~(GLbitfield64) 0;
       return;
    }
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index 77bede0..e9ef5c7 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -428,7 +428,7 @@ static void brw_wm_populate_key( struct brw_context *brw,
     * useful for programs using shaders.
     */
    if (ctx->Shader.CurrentFragmentProgram)
-      key->proj_attrib_mask = 0xffffffff;
+      key->proj_attrib_mask = ~(GLbitfield64) 0;
    else
       key->proj_attrib_mask = brw->wm.input_size_masks[4-1];
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm.h
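As a side note on why patch 01/12 swaps the plain shifts for 64-bit ones: once a slot index can reach 32 or more (the commit message mentions at least 55 bits), shifting a plain int by that amount is undefined in C, and a mask such as 0xffffffff no longer covers every slot. The sketch below is only an illustration; EXAMPLE_BIT64 is a stand-in whose definition is an assumption, not a copy of Mesa's BITFIELD64_BIT, and mask_has_slot is a made-up helper.

```c
#include <stdint.h>

/* Stand-in for a 64-bit "bit N" macro; the exact Mesa definition of
 * BITFIELD64_BIT is not quoted in this thread, so this is an assumption. */
#define EXAMPLE_BIT64(b) ((uint64_t)1 << (b))

/* Hypothetical helper: nonzero if the given slot is set in a 64-bit mask. */
static int
mask_has_slot(uint64_t mask, unsigned slot)
{
   return (mask & EXAMPLE_BIT64(slot)) != 0;
}
```

With this, slot 54 can be set and tested reliably, whereas a mask initialised to 0xffffffff reports it as absent and ~(uint64_t) 0 reports it as present.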
[Mesa-dev] [PATCH 02/12] mtypes.h: Add new gl_varying_slot enum, and bitfield defines.
Future patches will make use of the enum. It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.
---
 src/mesa/main/mtypes.h | 70 ++
 1 file changed, 70 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4f09513..96ef416 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -209,6 +209,76 @@ typedef enum
 
 
 /**
+ * Indexes for vertex shader outputs, geometry shader inputs/outputs, and
+ * fragment shader inputs.
+ *
+ * Note that some of these values are not available to all pipeline stages.
+ */
+typedef enum
+{
+   VARYING_SLOT_POS,
+   VARYING_SLOT_COL0, /* COL0 and COL1 must be contiguous */
+   VARYING_SLOT_COL1,
+   VARYING_SLOT_FOGC,
+   VARYING_SLOT_TEX0, /* TEX0-TEX7 must be contiguous */
+   VARYING_SLOT_TEX1,
+   VARYING_SLOT_TEX2,
+   VARYING_SLOT_TEX3,
+   VARYING_SLOT_TEX4,
+   VARYING_SLOT_TEX5,
+   VARYING_SLOT_TEX6,
+   VARYING_SLOT_TEX7,
+   VARYING_SLOT_PSIZ, /* Does not appear in FS */
+   VARYING_SLOT_BFC0, /* Does not appear in FS */
+   VARYING_SLOT_BFC1, /* Does not appear in FS */
+   VARYING_SLOT_EDGE, /* Does not appear in FS */
+   VARYING_SLOT_CLIP_VERTEX, /* Does not appear in FS */
+   VARYING_SLOT_CLIP_DIST0,
+   VARYING_SLOT_CLIP_DIST1,
+   VARYING_SLOT_PRIMITIVE_ID, /* Does not appear in VS */
+   VARYING_SLOT_LAYER, /* Appears only as GS output */
+   VARYING_SLOT_FACE, /* FS only */
+   VARYING_SLOT_PNTC, /* FS only */
+   VARYING_SLOT_VAR0, /* First generic varying slot */
+   VARYING_SLOT_MAX = VARYING_SLOT_VAR0 + MAX_VARYING
+} gl_varying_slot;
+
+
+/**
+ * Bitflags for varying slots.
+ */
+/*@{*/
+#define VARYING_BIT_POS BITFIELD64_BIT(VARYING_SLOT_POS)
+#define VARYING_BIT_COL0 BITFIELD64_BIT(VARYING_SLOT_COL0)
+#define VARYING_BIT_COL1 BITFIELD64_BIT(VARYING_SLOT_COL1)
+#define VARYING_BIT_FOGC BITFIELD64_BIT(VARYING_SLOT_FOGC)
+#define VARYING_BIT_TEX0 BITFIELD64_BIT(VARYING_SLOT_TEX0)
+#define VARYING_BIT_TEX1 BITFIELD64_BIT(VARYING_SLOT_TEX1)
+#define VARYING_BIT_TEX2 BITFIELD64_BIT(VARYING_SLOT_TEX2)
+#define VARYING_BIT_TEX3 BITFIELD64_BIT(VARYING_SLOT_TEX3)
+#define VARYING_BIT_TEX4 BITFIELD64_BIT(VARYING_SLOT_TEX4)
+#define VARYING_BIT_TEX5 BITFIELD64_BIT(VARYING_SLOT_TEX5)
+#define VARYING_BIT_TEX6 BITFIELD64_BIT(VARYING_SLOT_TEX6)
+#define VARYING_BIT_TEX7 BITFIELD64_BIT(VARYING_SLOT_TEX7)
+#define VARYING_BIT_TEX(U) BITFIELD64_BIT(VARYING_SLOT_TEX0 + (U))
+#define VARYING_BITS_TEX_ANY BITFIELD64_RANGE(VARYING_SLOT_TEX0, \
+                                              MAX_TEXTURE_COORD_UNITS)
+#define VARYING_BIT_PSIZ BITFIELD64_BIT(VARYING_SLOT_PSIZ)
+#define VARYING_BIT_BFC0 BITFIELD64_BIT(VARYING_SLOT_BFC0)
+#define VARYING_BIT_BFC1 BITFIELD64_BIT(VARYING_SLOT_BFC1)
+#define VARYING_BIT_EDGE BITFIELD64_BIT(VARYING_SLOT_EDGE)
+#define VARYING_BIT_CLIP_VERTEX BITFIELD64_BIT(VARYING_SLOT_CLIP_VERTEX)
+#define VARYING_BIT_CLIP_DIST0 BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST0)
+#define VARYING_BIT_CLIP_DIST1 BITFIELD64_BIT(VARYING_SLOT_CLIP_DIST1)
+#define VARYING_BIT_PRIMITIVE_ID BITFIELD64_BIT(VARYING_SLOT_PRIMITIVE_ID)
+#define VARYING_BIT_LAYER BITFIELD64_BIT(VARYING_SLOT_LAYER)
+#define VARYING_BIT_FACE BITFIELD64_BIT(VARYING_SLOT_FACE)
+#define VARYING_BIT_PNTC BITFIELD64_BIT(VARYING_SLOT_PNTC)
+#define VARYING_BIT_VAR(V) BITFIELD64_BIT(VARYING_SLOT_VAR0 + (V))
+/*@}*/
+
+
+/**
  * Indexes for vertex program result attributes. Note that
  * _mesa_vert_result_to_frag_attrib() and _mesa_frag_attrib_to_vert_result() make
  * assumptions about the layout of this enum.
--
1.8.1.5
[Mesa-dev] [PATCH 03/12] mtypes.h: Modify gl_vert_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_vert_result enum entirely.
---
 src/mesa/main/mtypes.h        | 67 +--
 src/mesa/program/prog_print.c |  4 +++
 2 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 96ef416..37cc2da 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -213,6 +213,11 @@ typedef enum
  * fragment shader inputs.
  *
  * Note that some of these values are not available to all pipeline stages.
+ *
+ * When this enum is updated, the following code must be updated too:
+ * - vertResults (in prog_print.c's arb_output_attrib_string())
+ * - _mesa_vert_result_to_frag_attrib()
+ * - _mesa_frag_attrib_to_vert_result()
  */
 typedef enum
 {
@@ -285,27 +290,27 @@ typedef enum
  */
 typedef enum
 {
-   VERT_RESULT_HPOS = 0,
-   VERT_RESULT_COL0 = 1,
-   VERT_RESULT_COL1 = 2,
-   VERT_RESULT_FOGC = 3,
-   VERT_RESULT_TEX0 = 4,
-   VERT_RESULT_TEX1 = 5,
-   VERT_RESULT_TEX2 = 6,
-   VERT_RESULT_TEX3 = 7,
-   VERT_RESULT_TEX4 = 8,
-   VERT_RESULT_TEX5 = 9,
-   VERT_RESULT_TEX6 = 10,
-   VERT_RESULT_TEX7 = 11,
-   VERT_RESULT_PSIZ = 12,
-   VERT_RESULT_BFC0 = 13,
-   VERT_RESULT_BFC1 = 14,
-   VERT_RESULT_EDGE = 15,
-   VERT_RESULT_CLIP_VERTEX = 16,
-   VERT_RESULT_CLIP_DIST0 = 17,
-   VERT_RESULT_CLIP_DIST1 = 18,
-   VERT_RESULT_VAR0 = 19, /**< shader varying */
-   VERT_RESULT_MAX = (VERT_RESULT_VAR0 + MAX_VARYING)
+   VERT_RESULT_HPOS = VARYING_SLOT_POS,
+   VERT_RESULT_COL0 = VARYING_SLOT_COL0,
+   VERT_RESULT_COL1 = VARYING_SLOT_COL1,
+   VERT_RESULT_FOGC = VARYING_SLOT_FOGC,
+   VERT_RESULT_TEX0 = VARYING_SLOT_TEX0,
+   VERT_RESULT_TEX1 = VARYING_SLOT_TEX1,
+   VERT_RESULT_TEX2 = VARYING_SLOT_TEX2,
+   VERT_RESULT_TEX3 = VARYING_SLOT_TEX3,
+   VERT_RESULT_TEX4 = VARYING_SLOT_TEX4,
+   VERT_RESULT_TEX5 = VARYING_SLOT_TEX5,
+   VERT_RESULT_TEX6 = VARYING_SLOT_TEX6,
+   VERT_RESULT_TEX7 = VARYING_SLOT_TEX7,
+   VERT_RESULT_PSIZ = VARYING_SLOT_PSIZ,
+   VERT_RESULT_BFC0 = VARYING_SLOT_BFC0,
+   VERT_RESULT_BFC1 = VARYING_SLOT_BFC1,
+   VERT_RESULT_EDGE = VARYING_SLOT_EDGE,
+   VERT_RESULT_CLIP_VERTEX = VARYING_SLOT_CLIP_VERTEX,
+   VERT_RESULT_CLIP_DIST0 = VARYING_SLOT_CLIP_DIST0,
+   VERT_RESULT_CLIP_DIST1 = VARYING_SLOT_CLIP_DIST1,
+   VERT_RESULT_VAR0 = VARYING_SLOT_VAR0, /**< shader varying */
+   VERT_RESULT_MAX = VARYING_SLOT_MAX
 } gl_vert_result;
 
 
@@ -421,12 +426,16 @@ typedef enum
 static inline int
 _mesa_vert_result_to_frag_attrib(gl_vert_result vert_result)
 {
-   if (vert_result >= VERT_RESULT_CLIP_DIST0)
-      return vert_result - VERT_RESULT_CLIP_DIST0 + FRAG_ATTRIB_CLIP_DIST0;
-   else if (vert_result <= VERT_RESULT_TEX7)
+   if (vert_result <= VERT_RESULT_TEX7)
       return vert_result;
-   else
+   else if (vert_result < VERT_RESULT_CLIP_DIST0)
+      return -1;
+   else if (vert_result <= VERT_RESULT_CLIP_DIST1)
+      return vert_result - VERT_RESULT_CLIP_DIST0 + FRAG_ATTRIB_CLIP_DIST0;
+   else if (vert_result < VERT_RESULT_VAR0)
       return -1;
+   else
+      return vert_result - VERT_RESULT_VAR0 + FRAG_ATTRIB_VAR0;
 }
 
 
@@ -443,10 +452,12 @@ _mesa_frag_attrib_to_vert_result(gl_frag_attrib frag_attrib)
 {
    if (frag_attrib <= FRAG_ATTRIB_TEX7)
       return frag_attrib;
-   else if (frag_attrib >= FRAG_ATTRIB_CLIP_DIST0)
-      return frag_attrib - FRAG_ATTRIB_CLIP_DIST0 + VERT_RESULT_CLIP_DIST0;
-   else
+   else if (frag_attrib < FRAG_ATTRIB_CLIP_DIST0)
       return -1;
+   else if (frag_attrib <= FRAG_ATTRIB_CLIP_DIST1)
+      return frag_attrib - FRAG_ATTRIB_CLIP_DIST0 + VERT_RESULT_CLIP_DIST0;
+   else /* frag_attrib >= FRAG_ATTRIB_VAR0 */
+      return frag_attrib - FRAG_ATTRIB_VAR0 + VERT_RESULT_VAR0;
 }
 
 
diff --git a/src/mesa/program/prog_print.c b/src/mesa/program/prog_print.c
index 7e7e081..e5592cf 100644
--- a/src/mesa/program/prog_print.c
+++ b/src/mesa/program/prog_print.c
@@ -263,6 +263,10 @@ arb_output_attrib_string(GLint index, GLenum progType)
       "result.(sixteen)", /* VERT_RESULT_CLIP_VERTEX */
       "result.(seventeen)", /* VERT_RESULT_CLIP_DIST0 */
       "result.(eighteen)", /* VERT_RESULT_CLIP_DIST1 */
+      "result.(nineteen)", /* VARYING_SLOT_PRIMITIVE_ID */
+      "result.(twenty)", /* VARYING_SLOT_LAYER */
+      "result.(twenty-one)", /* VARYING_SLOT_FACE */
+      "result.(twenty-two)", /* VARYING_SLOT_PNTC */
       "result.varying[0]",
       "result.varying[1]",
       "result.varying[2]",
--
1.8.1.5
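The reworked conversion helpers in patch 03/12 can be exercised in isolation with a small stand-alone sketch. The vertex-result values below follow the order of the new gl_varying_slot enum from patch 02/12; the fragment-attribute values (FACE and PNTC after TEX7, then CLIP_DIST0/1, then VAR0) are an assumption, since gl_frag_attrib is not quoted in this series. The VR_/FA_ names and both function names are illustrative only.

```c
/* Vertex result slots, numbered as in the new gl_varying_slot enum. */
enum {
   VR_TEX7 = 11, VR_PSIZ = 12, VR_CLIP_DIST0 = 17, VR_CLIP_DIST1 = 18,
   VR_PRIMITIVE_ID = 19, VR_VAR0 = 23
};

/* Fragment attribute slots; this layout is assumed for the example. */
enum {
   FA_TEX7 = 11, FA_FACE = 12, FA_CLIP_DIST0 = 14, FA_CLIP_DIST1 = 15,
   FA_VAR0 = 16
};

/* Mirrors the structure of the patched vert-result -> frag-attrib helper. */
static int
example_vert_result_to_frag_attrib(int vr)
{
   if (vr <= VR_TEX7)
      return vr;                               /* shared low slots */
   else if (vr < VR_CLIP_DIST0)
      return -1;                               /* PSIZ..CLIP_VERTEX: no FS input */
   else if (vr <= VR_CLIP_DIST1)
      return vr - VR_CLIP_DIST0 + FA_CLIP_DIST0;
   else if (vr < VR_VAR0)
      return -1;                               /* PRIMITIVE_ID..PNTC */
   else
      return vr - VR_VAR0 + FA_VAR0;           /* generic varyings */
}

/* Mirrors the structure of the patched frag-attrib -> vert-result helper. */
static int
example_frag_attrib_to_vert_result(int fa)
{
   if (fa <= FA_TEX7)
      return fa;
   else if (fa < FA_CLIP_DIST0)
      return -1;                               /* FACE, PNTC: FS only */
   else if (fa <= FA_CLIP_DIST1)
      return fa - FA_CLIP_DIST0 + VR_CLIP_DIST0;
   else /* fa >= FA_VAR0 */
      return fa - FA_VAR0 + VR_VAR0;
}
```

Under these assumed layouts, any slot that maps to a fragment attribute maps back to the same vertex result, and the two gaps (PSIZ..CLIP_VERTEX and PRIMITIVE_ID..PNTC) yield -1.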
[Mesa-dev] [PATCH 04/12] Replace gl_vert_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
---

Note: this patch is very large, so for purposes of mailing list
discussion I've trimmed it down to representative example hunks. To
see the complete patch, please refer to branch
combine_varying_slot_enums of git://github.com/stereotype441/mesa.git.

 src/glsl/builtin_variables.cpp                 | 20
 src/glsl/ir.h                                  |  2 +-
 src/glsl/link_varyings.cpp                     |  4 +-
 src/glsl/linker.cpp                            |  2 +-
 src/glsl/lower_packed_varyings.cpp             |  2 +-
 src/mesa/drivers/dri/i965/brw_clip_line.c      |  2 +-
 src/mesa/drivers/dri/i965/brw_clip_tri.c       |  4 +-
 src/mesa/drivers/dri/i965/brw_clip_unfilled.c  | 38 +++
 src/mesa/drivers/dri/i965/brw_clip_util.c      | 38 +++
 src/mesa/drivers/dri/i965/brw_context.h        | 35 +++---
 src/mesa/drivers/dri/i965/brw_fs.cpp           |  8 +--
 src/mesa/drivers/dri/i965/brw_gs.c             |  2 +-
 src/mesa/drivers/dri/i965/brw_gs.h             |  2 +-
 src/mesa/drivers/dri/i965/brw_gs_emit.c        |  4 +-
 src/mesa/drivers/dri/i965/brw_sf.c             |  6 +--
 src/mesa/drivers/dri/i965/brw_sf_emit.c        | 48 +-
 src/mesa/drivers/dri/i965/brw_vec4.h           |  4 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 58 +++---
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp      |  2 +-
 src/mesa/drivers/dri/i965/brw_vs.c             | 60 +++
 src/mesa/drivers/dri/i965/brw_vs_constval.c    | 14 +++---
 src/mesa/drivers/dri/i965/gen6_sf_state.c      | 18 +++
 src/mesa/drivers/dri/i965/gen7_sol_state.c     |  4 +-
 src/mesa/drivers/dri/r200/r200_tcl.c           | 14 +++---
 src/mesa/drivers/dri/r200/r200_vertprog.c      | 40 +++
 src/mesa/main/context.c                        |  6 +--
 src/mesa/main/ff_fragment_shader.cpp           |  7 ++-
 src/mesa/main/ffvertex_prog.c                  | 38 +++
 src/mesa/main/mtypes.h                         | 67 +++---
 src/mesa/program/prog_print.c                  | 22 -
 src/mesa/program/program.c                     |  2 +-
 src/mesa/program/program_parse.y               | 16 +++---
 src/mesa/program/programopt.c                  | 18 +++
 src/mesa/state_tracker/st_atom_rasterizer.c    |  2 +-
 src/mesa/state_tracker/st_cb_feedback.c        |  4 +-
 src/mesa/state_tracker/st_cb_rasterpos.c       |  6 +--
 src/mesa/state_tracker/st_context.h            |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp     | 12 ++---
 src/mesa/state_tracker/st_mesa_to_tgsi.c       | 12 ++---
 src/mesa/state_tracker/st_program.c            | 52 ++--
 src/mesa/state_tracker/st_program.h            |  8 +--
 src/mesa/tnl/t_context.c                       |  2 +-
 src/mesa/tnl/t_context.h                       |  2 +-
 src/mesa/tnl/t_pipeline.c                      |  2 +-
 src/mesa/tnl/t_vb_program.c                    | 38 +++
 45 files changed, 358 insertions(+), 391 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 53c4c51..531effd 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -47,8 +47,8 @@ struct builtin_variable {
 };
 
 static const builtin_variable builtin_core_vs_variables[] = {
-   { ir_var_shader_out, VERT_RESULT_HPOS, "vec4", "gl_Position" },
-   { ir_var_shader_out, VERT_RESULT_PSIZ, "float", "gl_PointSize" },
+   { ir_var_shader_out, VARYING_SLOT_POS, "vec4", "gl_Position" },
+   { ir_var_shader_out, VARYING_SLOT_PSIZ, "float", "gl_PointSize" },
 };
 
 static const builtin_variable builtin_core_fs_variables[] = {
@@ -96,12 +96,12 @@ static const builtin_variable builtin_110_deprecated_vs_variables[] = {
    { ir_var_shader_in, VERT_ATTRIB_TEX6, "vec4", "gl_MultiTexCoord6" },
    { ir_var_shader_in, VERT_ATTRIB_TEX7, "vec4", "gl_MultiTexCoord7" },
    { ir_var_shader_in, VERT_ATTRIB_FOG, "float", "gl_FogCoord" },
-   { ir_var_shader_out, VERT_RESULT_CLIP_VERTEX, "vec4", "gl_ClipVertex" },
-   { ir_var_shader_out, VERT_RESULT_COL0, "vec4", "gl_FrontColor" },
-   { ir_var_shader_out, VERT_RESULT_BFC0, "vec4", "gl_BackColor" },
-   { ir_var_shader_out, VERT_RESULT_COL1, "vec4", "gl_FrontSecondaryColor" },
-   { ir_var_shader_out, VERT_RESULT_BFC1, "vec4", "gl_BackSecondaryColor" },
-   { ir_var_shader_out, VERT_RESULT_FOGC, "float", "gl_FogFragCoord" },
+   { ir_var_shader_out, VARYING_SLOT_CLIP_VERTEX, "vec4", "gl_ClipVertex" },
+   { ir_var_shader_out, VARYING_SLOT_COL0, "vec4", "gl_FrontColor" },
+   { ir_var_shader_out, VARYING_SLOT_BFC0, "vec4", "gl_BackColor" },
+   { ir_var_shader_out, VARYING_SLOT_COL1, "vec4", "gl_FrontSecondaryColor" },
+   { ir_var_shader_out, VARYING_SLOT_BFC1, "vec4", "gl_BackSecondaryColor" },
+   {
[Mesa-dev] [PATCH 05/12] mtypes.h: Modify gl_geom_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_attrib enum entirely.
---
 src/mesa/main/mtypes.h | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1e62e19..b39c9c5 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -290,19 +290,19 @@ typedef enum
  */
 typedef enum
 {
-   GEOM_ATTRIB_POSITION = 0,
-   GEOM_ATTRIB_COLOR0 = 1,
-   GEOM_ATTRIB_COLOR1 = 2,
-   GEOM_ATTRIB_SECONDARY_COLOR0 = 3,
-   GEOM_ATTRIB_SECONDARY_COLOR1 = 4,
-   GEOM_ATTRIB_FOG_FRAG_COORD = 5,
-   GEOM_ATTRIB_POINT_SIZE = 6,
-   GEOM_ATTRIB_CLIP_VERTEX = 7,
-   GEOM_ATTRIB_PRIMITIVE_ID = 8,
-   GEOM_ATTRIB_TEX_COORD = 9,
-
-   GEOM_ATTRIB_VAR0 = 16,
-   GEOM_ATTRIB_MAX = (GEOM_ATTRIB_VAR0 + MAX_VARYING)
+   GEOM_ATTRIB_POSITION = VARYING_SLOT_POS,
+   GEOM_ATTRIB_COLOR0 = VARYING_SLOT_COL0,
+   GEOM_ATTRIB_COLOR1 = VARYING_SLOT_COL1,
+   GEOM_ATTRIB_SECONDARY_COLOR0 = VARYING_SLOT_BFC0,
+   GEOM_ATTRIB_SECONDARY_COLOR1 = VARYING_SLOT_BFC1,
+   GEOM_ATTRIB_FOG_FRAG_COORD = VARYING_SLOT_FOGC,
+   GEOM_ATTRIB_POINT_SIZE = VARYING_SLOT_PNTC,
+   GEOM_ATTRIB_CLIP_VERTEX = VARYING_SLOT_CLIP_VERTEX,
+   GEOM_ATTRIB_PRIMITIVE_ID = VARYING_SLOT_PRIMITIVE_ID,
+   GEOM_ATTRIB_TEX_COORD = VARYING_SLOT_TEX0,
+
+   GEOM_ATTRIB_VAR0 = VARYING_SLOT_VAR0,
+   GEOM_ATTRIB_MAX = VARYING_SLOT_MAX
 } gl_geom_attrib;
 
 /**
--
1.8.1.5
[Mesa-dev] [PATCH 06/12] Replace gl_geom_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*
---
 src/mesa/main/context.c             |  2 --
 src/mesa/main/mtypes.h              | 41 -
 src/mesa/program/program.c          |  2 +-
 src/mesa/state_tracker/st_program.c | 20 +-
 src/mesa/state_tracker/st_program.h |  8
 5 files changed, 15 insertions(+), 58 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 7073f4a..53a373d 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -349,7 +349,6 @@ dummy_enum_func(void)
    gl_texture_index ti = TEXTURE_2D_ARRAY_INDEX;
    gl_vert_attrib va = VERT_ATTRIB_POS;
    gl_varying_slot vs = VARYING_SLOT_POS;
-   gl_geom_attrib ga = GEOM_ATTRIB_POSITION;
    gl_geom_result gr = GEOM_RESULT_POS;
 
    (void) bi;
@@ -359,7 +358,6 @@ dummy_enum_func(void)
    (void) ti;
    (void) va;
    (void) vs;
-   (void) ga;
    (void) gr;
 }
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b39c9c5..c01d584 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -286,47 +286,6 @@ typedef enum
 /*/
 
 /**
- * Indexes for geometry program attributes.
- */
-typedef enum
-{
-   GEOM_ATTRIB_POSITION = VARYING_SLOT_POS,
-   GEOM_ATTRIB_COLOR0 = VARYING_SLOT_COL0,
-   GEOM_ATTRIB_COLOR1 = VARYING_SLOT_COL1,
-   GEOM_ATTRIB_SECONDARY_COLOR0 = VARYING_SLOT_BFC0,
-   GEOM_ATTRIB_SECONDARY_COLOR1 = VARYING_SLOT_BFC1,
-   GEOM_ATTRIB_FOG_FRAG_COORD = VARYING_SLOT_FOGC,
-   GEOM_ATTRIB_POINT_SIZE = VARYING_SLOT_PNTC,
-   GEOM_ATTRIB_CLIP_VERTEX = VARYING_SLOT_CLIP_VERTEX,
-   GEOM_ATTRIB_PRIMITIVE_ID = VARYING_SLOT_PRIMITIVE_ID,
-   GEOM_ATTRIB_TEX_COORD = VARYING_SLOT_TEX0,
-
-   GEOM_ATTRIB_VAR0 = VARYING_SLOT_VAR0,
-   GEOM_ATTRIB_MAX = VARYING_SLOT_MAX
-} gl_geom_attrib;
-
-/**
- * Bitflags for geometry attributes.
- * These are used in bitfields in many places.
- */
-/*@{*/
-#define GEOM_BIT_COLOR0 (1 << GEOM_ATTRIB_COLOR0)
-#define GEOM_BIT_COLOR1 (1 << GEOM_ATTRIB_COLOR1)
-#define GEOM_BIT_SCOLOR0 (1 << GEOM_ATTRIB_SECONDARY_COLOR0)
-#define GEOM_BIT_SCOLOR1 (1 << GEOM_ATTRIB_SECONDARY_COLOR1)
-#define GEOM_BIT_TEX_COORD (1 << GEOM_ATTRIB_TEX_COORD)
-#define GEOM_BIT_FOG_COORD (1 << GEOM_ATTRIB_FOG_FRAG_COORD)
-#define GEOM_BIT_POSITION (1 << GEOM_ATTRIB_POSITION)
-#define GEOM_BIT_POINT_SIDE (1 << GEOM_ATTRIB_POINT_SIZE)
-#define GEOM_BIT_CLIP_VERTEX (1 << GEOM_ATTRIB_CLIP_VERTEX)
-#define GEOM_BIT_PRIM_ID (1 << GEOM_ATTRIB_PRIMITIVE_ID)
-#define GEOM_BIT_VAR0 (1 << GEOM_ATTRIB_VAR0)
-
-#define GEOM_BIT_VAR(g) (1 << (GEOM_BIT_VAR0 + (g)))
-/*@}*/
-
-
-/**
  * Indexes for geometry program result attributes
  */
 typedef enum
diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index e235d6c..5cc18d4 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -936,7 +936,7 @@ _mesa_valid_register_index(const struct gl_context *ctx,
       case MESA_SHADER_FRAGMENT:
          return index < FRAG_ATTRIB_VAR0 + (GLint) ctx->Const.MaxVarying;
       case MESA_SHADER_GEOMETRY:
-         return index < GEOM_ATTRIB_VAR0 + (GLint) ctx->Const.MaxVarying;
+         return index < VARYING_SLOT_VAR0 + (GLint) ctx->Const.MaxVarying;
       default:
          return GL_FALSE;
       }
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index 109d421..8bc2a12 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -800,7 +800,7 @@ st_translate_geometry_program(struct st_context *st,
                               struct st_geometry_program *stgp,
                               const struct st_gp_variant_key *key)
 {
-   GLuint inputMapping[GEOM_ATTRIB_MAX];
+   GLuint inputMapping[VARYING_SLOT_MAX];
    GLuint outputMapping[GEOM_RESULT_MAX];
    struct pipe_context *pipe = st->pipe;
    GLuint attr;
@@ -844,7 +844,7 @@ st_translate_geometry_program(struct st_context *st,
     * Convert Mesa program inputs to TGSI input register semantics.
     */
    inputsRead = stgp->Base.Base.InputsRead;
-   for (attr = 0; attr < GEOM_ATTRIB_MAX; attr++) {
+   for (attr = 0; attr < VARYING_SLOT_MAX; attr++) {
       if ((inputsRead & BITFIELD64_BIT(attr)) != 0) {
          const GLuint slot = gs_num_inputs;
 
@@ -857,7 +857,7 @@ st_translate_geometry_program(struct st_context *st,
             stgp->index_to_input[vslot] = attr;
             ++vslot;
 
-            if (attr != GEOM_ATTRIB_PRIMITIVE_ID) {
+            if (attr != VARYING_SLOT_PRIMITIVE_ID) {
                gs_array_offset += 2;
             } else
                ++gs_builtin_inputs;
@@ -868,31 +868,31 @@ st_translate_geometry_program(struct st_context *st,
 #endif
 
          switch (attr) {
-         case GEOM_ATTRIB_PRIMITIVE_ID:
+         case VARYING_SLOT_PRIMITIVE_ID:
             stgp->input_semantic_name[slot] =
[Mesa-dev] [PATCH 07/12] mtypes.h: Modify gl_geom_result to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_geom_result enum entirely.
---
 src/mesa/main/mtypes.h | 41 -
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index c01d584..d760d21 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -290,27 +290,26 @@ typedef enum
  */
 typedef enum
 {
-   GEOM_RESULT_POS = 0,
-   GEOM_RESULT_COL0 = 1,
-   GEOM_RESULT_COL1 = 2,
-   GEOM_RESULT_SCOL0 = 3,
-   GEOM_RESULT_SCOL1 = 4,
-   GEOM_RESULT_FOGC = 5,
-   GEOM_RESULT_TEX0 = 6,
-   GEOM_RESULT_TEX1 = 7,
-   GEOM_RESULT_TEX2 = 8,
-   GEOM_RESULT_TEX3 = 9,
-   GEOM_RESULT_TEX4 = 10,
-   GEOM_RESULT_TEX5 = 11,
-   GEOM_RESULT_TEX6 = 12,
-   GEOM_RESULT_TEX7 = 13,
-   GEOM_RESULT_PSIZ = 14,
-   GEOM_RESULT_CLPV = 15,
-   GEOM_RESULT_PRID = 16,
-   GEOM_RESULT_LAYR = 17,
-   GEOM_RESULT_VAR0 = 18, /**< shader varying, should really be 16 */
-   /* ### we need to -2 because var0 is 18 instead 16 like in the others */
-   GEOM_RESULT_MAX = (GEOM_RESULT_VAR0 + MAX_VARYING - 2)
+   GEOM_RESULT_POS = VARYING_SLOT_POS,
+   GEOM_RESULT_COL0 = VARYING_SLOT_COL0,
+   GEOM_RESULT_COL1 = VARYING_SLOT_COL1,
+   GEOM_RESULT_SCOL0 = VARYING_SLOT_BFC0,
+   GEOM_RESULT_SCOL1 = VARYING_SLOT_BFC1,
+   GEOM_RESULT_FOGC = VARYING_SLOT_FOGC,
+   GEOM_RESULT_TEX0 = VARYING_SLOT_TEX0,
+   GEOM_RESULT_TEX1 = VARYING_SLOT_TEX1,
+   GEOM_RESULT_TEX2 = VARYING_SLOT_TEX2,
+   GEOM_RESULT_TEX3 = VARYING_SLOT_TEX3,
+   GEOM_RESULT_TEX4 = VARYING_SLOT_TEX4,
+   GEOM_RESULT_TEX5 = VARYING_SLOT_TEX5,
+   GEOM_RESULT_TEX6 = VARYING_SLOT_TEX6,
+   GEOM_RESULT_TEX7 = VARYING_SLOT_TEX7,
+   GEOM_RESULT_PSIZ = VARYING_SLOT_PSIZ,
+   GEOM_RESULT_CLPV = VARYING_SLOT_CLIP_VERTEX,
+   GEOM_RESULT_PRID = VARYING_SLOT_PRIMITIVE_ID,
+   GEOM_RESULT_LAYR = VARYING_SLOT_LAYER,
+   GEOM_RESULT_VAR0 = VARYING_SLOT_VAR0,
+   GEOM_RESULT_MAX = VARYING_SLOT_MAX
 } gl_geom_result;
 
 
--
1.8.1.5
[Mesa-dev] [PATCH 08/12] Replace gl_geom_result enum with gl_varying_slot.
This patch makes the following search-and-replace changes:

gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*
---
 src/mesa/main/context.c                    |  2 --
 src/mesa/main/mtypes.h                     | 28 --
 src/mesa/program/program.c                 |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |  2 +-
 src/mesa/state_tracker/st_program.c        | 38 +++---
 6 files changed, 22 insertions(+), 52 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 53a373d..d957a56 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -349,7 +349,6 @@ dummy_enum_func(void)
    gl_texture_index ti = TEXTURE_2D_ARRAY_INDEX;
    gl_vert_attrib va = VERT_ATTRIB_POS;
    gl_varying_slot vs = VARYING_SLOT_POS;
-   gl_geom_result gr = GEOM_RESULT_POS;
 
    (void) bi;
    (void) fi;
@@ -358,7 +357,6 @@ dummy_enum_func(void)
    (void) ti;
    (void) va;
    (void) vs;
-   (void) gr;
 }
 
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d760d21..d88d3d3 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -286,34 +286,6 @@ typedef enum
 /*/
 
 /**
- * Indexes for geometry program result attributes
- */
-typedef enum
-{
-   GEOM_RESULT_POS = VARYING_SLOT_POS,
-   GEOM_RESULT_COL0 = VARYING_SLOT_COL0,
-   GEOM_RESULT_COL1 = VARYING_SLOT_COL1,
-   GEOM_RESULT_SCOL0 = VARYING_SLOT_BFC0,
-   GEOM_RESULT_SCOL1 = VARYING_SLOT_BFC1,
-   GEOM_RESULT_FOGC = VARYING_SLOT_FOGC,
-   GEOM_RESULT_TEX0 = VARYING_SLOT_TEX0,
-   GEOM_RESULT_TEX1 = VARYING_SLOT_TEX1,
-   GEOM_RESULT_TEX2 = VARYING_SLOT_TEX2,
-   GEOM_RESULT_TEX3 = VARYING_SLOT_TEX3,
-   GEOM_RESULT_TEX4 = VARYING_SLOT_TEX4,
-   GEOM_RESULT_TEX5 = VARYING_SLOT_TEX5,
-   GEOM_RESULT_TEX6 = VARYING_SLOT_TEX6,
-   GEOM_RESULT_TEX7 = VARYING_SLOT_TEX7,
-   GEOM_RESULT_PSIZ = VARYING_SLOT_PSIZ,
-   GEOM_RESULT_CLPV = VARYING_SLOT_CLIP_VERTEX,
-   GEOM_RESULT_PRID = VARYING_SLOT_PRIMITIVE_ID,
-   GEOM_RESULT_LAYR = VARYING_SLOT_LAYER,
-   GEOM_RESULT_VAR0 = VARYING_SLOT_VAR0,
-   GEOM_RESULT_MAX = VARYING_SLOT_MAX
-} gl_geom_result;
-
-
-/**
  * Indexes for fragment program input attributes. Note that
  * _mesa_vert_result_to_frag_attrib() and frag_attrib_to_vert_result() make
  * assumptions about the layout of this enum.
diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index 5cc18d4..bc7ab1e 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -951,7 +951,7 @@ _mesa_valid_register_index(const struct gl_context *ctx,
       case MESA_SHADER_FRAGMENT:
          return index < FRAG_RESULT_DATA0 + (GLint) ctx->Const.MaxDrawBuffers;
       case MESA_SHADER_GEOMETRY:
-         return index < GEOM_RESULT_VAR0 + (GLint) ctx->Const.MaxVarying;
+         return index < VARYING_SLOT_VAR0 + (GLint) ctx->Const.MaxVarying;
       default:
          return GL_FALSE;
       }
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index ebb8068..b4f9465 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4148,7 +4148,7 @@ dst_register(struct st_translate *t,
       else if (t->procType == TGSI_PROCESSOR_FRAGMENT)
          assert(index < FRAG_RESULT_MAX);
       else
-         assert(index < GEOM_RESULT_MAX);
+         assert(index < VARYING_SLOT_MAX);
 
       assert(t->outputMapping[index] < Elements(t->outputs));
 
diff --git a/src/mesa/state_tracker/st_mesa_to_tgsi.c b/src/mesa/state_tracker/st_mesa_to_tgsi.c
index e494f45..a874e26 100644
--- a/src/mesa/state_tracker/st_mesa_to_tgsi.c
+++ b/src/mesa/state_tracker/st_mesa_to_tgsi.c
@@ -179,7 +179,7 @@ dst_register( struct st_translate *t,
       else if (t->procType == TGSI_PROCESSOR_FRAGMENT)
          assert(index < FRAG_RESULT_MAX);
       else
-         assert(index < GEOM_RESULT_MAX);
+         assert(index < VARYING_SLOT_MAX);
 
       assert(t->outputMapping[index] < Elements(t->outputs));
 
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index 8bc2a12..6afad9b 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -801,7 +801,7 @@ st_translate_geometry_program(struct st_context *st,
                               const struct st_gp_variant_key *key)
 {
    GLuint inputMapping[VARYING_SLOT_MAX];
-   GLuint outputMapping[GEOM_RESULT_MAX];
+   GLuint outputMapping[VARYING_SLOT_MAX];
    struct pipe_context *pipe = st->pipe;
    GLuint attr;
    GLbitfield64 inputsRead;
@@ -912,7 +912,7 @@ st_translate_geometry_program(struct st_context *st,
     * Determine number of outputs, the (default) output register
     * mapping and the semantic information for each output.
     */
-   for (attr = 0; attr < GEOM_RESULT_MAX; attr++) {
+   for (attr = 0; attr < VARYING_SLOT_MAX; attr++) {
       if
[Mesa-dev] [PATCH 09/12] mtypes.h: Modify gl_frag_attrib to refer to new gl_varying_slot enum.
This paves the way for eliminating the gl_frag_attrib enum entirely. --- src/mesa/main/mtypes.h| 45 ++- src/mesa/program/prog_print.c | 15 +++ 2 files changed, 34 insertions(+), 26 deletions(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index d88d3d3..cc11ca9 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -216,6 +216,7 @@ typedef enum * * When this enum is updated, the following code must be updated too: * - vertResults (in prog_print.c's arb_output_attrib_string()) + * - fragAttribs (in prog_print.c's arb_input_attrib_string()) * - _mesa_vert_result_to_frag_attrib() * - _mesa_frag_attrib_to_vert_result() */ @@ -292,24 +293,24 @@ typedef enum */ typedef enum { - FRAG_ATTRIB_WPOS = 0, - FRAG_ATTRIB_COL0 = 1, - FRAG_ATTRIB_COL1 = 2, - FRAG_ATTRIB_FOGC = 3, - FRAG_ATTRIB_TEX0 = 4, - FRAG_ATTRIB_TEX1 = 5, - FRAG_ATTRIB_TEX2 = 6, - FRAG_ATTRIB_TEX3 = 7, - FRAG_ATTRIB_TEX4 = 8, - FRAG_ATTRIB_TEX5 = 9, - FRAG_ATTRIB_TEX6 = 10, - FRAG_ATTRIB_TEX7 = 11, - FRAG_ATTRIB_FACE = 12, /** front/back face */ - FRAG_ATTRIB_PNTC = 13, /** sprite/point coord */ - FRAG_ATTRIB_CLIP_DIST0 = 14, - FRAG_ATTRIB_CLIP_DIST1 = 15, - FRAG_ATTRIB_VAR0 = 16, /** shader varying */ - FRAG_ATTRIB_MAX = (FRAG_ATTRIB_VAR0 + MAX_VARYING) + FRAG_ATTRIB_WPOS = VARYING_SLOT_POS, + FRAG_ATTRIB_COL0 = VARYING_SLOT_COL0, + FRAG_ATTRIB_COL1 = VARYING_SLOT_COL1, + FRAG_ATTRIB_FOGC = VARYING_SLOT_FOGC, + FRAG_ATTRIB_TEX0 = VARYING_SLOT_TEX0, + FRAG_ATTRIB_TEX1 = VARYING_SLOT_TEX1, + FRAG_ATTRIB_TEX2 = VARYING_SLOT_TEX2, + FRAG_ATTRIB_TEX3 = VARYING_SLOT_TEX3, + FRAG_ATTRIB_TEX4 = VARYING_SLOT_TEX4, + FRAG_ATTRIB_TEX5 = VARYING_SLOT_TEX5, + FRAG_ATTRIB_TEX6 = VARYING_SLOT_TEX6, + FRAG_ATTRIB_TEX7 = VARYING_SLOT_TEX7, + FRAG_ATTRIB_FACE = VARYING_SLOT_FACE, /** front/back face */ + FRAG_ATTRIB_PNTC = VARYING_SLOT_PNTC, /** sprite/point coord */ + FRAG_ATTRIB_CLIP_DIST0 = VARYING_SLOT_CLIP_DIST0, + FRAG_ATTRIB_CLIP_DIST1 = VARYING_SLOT_CLIP_DIST1, + FRAG_ATTRIB_VAR0 = 
VARYING_SLOT_VAR0, /** shader varying */ + FRAG_ATTRIB_MAX = VARYING_SLOT_MAX } gl_frag_attrib; @@ -329,11 +330,11 @@ _mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result) else if (vert_result VARYING_SLOT_CLIP_DIST0) return -1; else if (vert_result = VARYING_SLOT_CLIP_DIST1) - return vert_result - VARYING_SLOT_CLIP_DIST0 + FRAG_ATTRIB_CLIP_DIST0; + return vert_result; else if (vert_result VARYING_SLOT_VAR0) return -1; else - return vert_result - VARYING_SLOT_VAR0 + FRAG_ATTRIB_VAR0; + return vert_result; } @@ -352,9 +353,9 @@ _mesa_frag_attrib_to_vert_result(gl_frag_attrib frag_attrib) else if (frag_attrib FRAG_ATTRIB_CLIP_DIST0) return -1; else if (frag_attrib = FRAG_ATTRIB_CLIP_DIST1) - return frag_attrib - FRAG_ATTRIB_CLIP_DIST0 + VARYING_SLOT_CLIP_DIST0; + return frag_attrib; else /* frag_attrib = FRAG_ATTRIB_VAR0 */ - return frag_attrib - FRAG_ATTRIB_VAR0 + VARYING_SLOT_VAR0; + return frag_attrib; } diff --git a/src/mesa/program/prog_print.c b/src/mesa/program/prog_print.c index 8617cfb..d740bf7 100644 --- a/src/mesa/program/prog_print.c +++ b/src/mesa/program/prog_print.c @@ -139,10 +139,17 @@ arb_input_attrib_string(GLint index, GLenum progType) fragment.texcoord[5], fragment.texcoord[6], fragment.texcoord[7], - fragment.(twelve), /* FRAG_ATTRIB_FACE */ - fragment.(thirteen), /* FRAG_ATTRIB_PNTC */ - fragment.(fourteen), /* FRAG_ATTRIB_CLIP_DIST0 */ - fragment.(fifteen), /* FRAG_ATTRIB_CLIP_DIST1 */ + fragment.(twelve), /* VARYING_SLOT_PSIZ */ + fragment.(thirteen), /* VARYING_SLOT_BFC0 */ + fragment.(fourteen), /* VARYING_SLOT_BFC1 */ + fragment.(fifteen), /* VARYING_SLOT_EDGE */ + fragment.(sixteen), /* VARYING_SLOT_CLIP_VERTEX */ + fragment.(seventeen), /* FRAG_ATTRIB_CLIP_DIST0 */ + fragment.(eighteen), /* FRAG_ATTRIB_CLIP_DIST1 */ + fragment.(nineteen), /* VARYING_SLOT_PRIMITIVE_ID */ + fragment.(twenty), /* VARYING_SLOT_LAYER */ + fragment.(twenty-one), /* FRAG_ATTRIB_FACE */ + fragment.(twenty-two), /* FRAG_ATTRIB_PNTC */ fragment.varying[0], 
fragment.varying[1], fragment.varying[2], -- 1.8.1.5
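For readers following along, the effect of this patch can be sketched in isolation: once each fragment-attribute name is defined as the matching varying-slot value, the conversion helper no longer does any arithmetic and reduces to a range check that returns its argument unchanged. This is only a sketch. The sk_/SK_ names are made up for illustration and are not Mesa identifiers; the slot order is taken from the comments in the prog_print.c hunk above, and the comparison operators are assumed to be the usual <= and < that the archive dropped.

```c
#include <assert.h>

/* Cut-down stand-in for gl_varying_slot (hypothetical names).
 * Order follows the comments in the prog_print.c hunk of this patch. */
enum sk_varying_slot {
   SK_SLOT_POS, SK_SLOT_COL0, SK_SLOT_COL1, SK_SLOT_FOGC,
   SK_SLOT_TEX0, SK_SLOT_TEX1, SK_SLOT_TEX2, SK_SLOT_TEX3,
   SK_SLOT_TEX4, SK_SLOT_TEX5, SK_SLOT_TEX6, SK_SLOT_TEX7,
   SK_SLOT_PSIZ, SK_SLOT_BFC0, SK_SLOT_BFC1, SK_SLOT_EDGE,
   SK_SLOT_CLIP_VERTEX, SK_SLOT_CLIP_DIST0, SK_SLOT_CLIP_DIST1,
   SK_SLOT_PRIMITIVE_ID, SK_SLOT_LAYER, SK_SLOT_FACE, SK_SLOT_PNTC,
   SK_SLOT_VAR0
};

/* Fragment-side names become plain aliases of the slot values. */
enum sk_frag_attrib {
   SK_FRAG_WPOS = SK_SLOT_POS,
   SK_FRAG_CLIP_DIST0 = SK_SLOT_CLIP_DIST0,
   SK_FRAG_VAR0 = SK_SLOT_VAR0
};

/* Same shape as the helper after this patch: every accepted slot is
 * returned as-is, and slots with no fragment input give -1. */
static int
sk_vert_result_to_frag_attrib(int slot)
{
   assert(slot >= 0);
   if (slot <= SK_SLOT_TEX7)
      return slot;
   else if (slot < SK_SLOT_CLIP_DIST0)
      return -1;
   else if (slot <= SK_SLOT_CLIP_DIST1)
      return slot;
   else if (slot < SK_SLOT_VAR0)
      return -1;
   else
      return slot;
}
```

Because the helper is now an identity on every slot it accepts, callers only need to know whether a slot is accepted at all.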
[Mesa-dev] [PATCH 10/12] Get rid of _mesa_vert_result_to_frag_attrib().
Now that there is no difference between the enums that represent vertex outputs and fragment inputs, there's no need for a conversion function. But we still need to be able to detect when a given vertex output has no corresponding fragment input. So it is replaced by a new function, _mesa_varying_slot_in_fs(), which tells whether the given varying slot exists as an FS input or not. --- src/mesa/drivers/dri/i965/brw_fs.cpp| 12 - src/mesa/drivers/dri/i965/brw_vs_constval.c | 13 -- src/mesa/main/mtypes.h | 38 + 3 files changed, 27 insertions(+), 36 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 86f8cbb..ea4a56c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1265,7 +1265,7 @@ fs_visitor::calculate_urb_setup() continue; if (c-key.vp_outputs_written BITFIELD64_BIT(i)) { - int fp_index = _mesa_vert_result_to_frag_attrib((gl_varying_slot) i); +bool exists_in_fs = _mesa_varying_slot_in_fs((gl_varying_slot) i); /* The back color slot is skipped when the front color is * also written to. In addition, some slots can be @@ -1273,8 +1273,8 @@ fs_visitor::calculate_urb_setup() * fragment shader. So the register number must always be * incremented, mapped or not. 
*/ - if (fp_index = 0) - urb_setup[fp_index] = urb_next; + if (exists_in_fs) + urb_setup[i] = urb_next; urb_next++; } } @@ -3001,10 +3001,8 @@ brw_fs_precompile(struct gl_context *ctx, struct gl_shader_program *prog) key.proj_attrib_mask |= BITFIELD64_BIT(i); if (intel-gen 6) { - int vp_index = _mesa_vert_result_to_frag_attrib((gl_varying_slot) i); - - if (vp_index = 0) -key.vp_outputs_written |= BITFIELD64_BIT(vp_index); + if (_mesa_varying_slot_in_fs((gl_varying_slot) i)) +key.vp_outputs_written |= BITFIELD64_BIT(i); } } diff --git a/src/mesa/drivers/dri/i965/brw_vs_constval.c b/src/mesa/drivers/dri/i965/brw_vs_constval.c index e623b4c..782f9d7 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_constval.c +++ b/src/mesa/drivers/dri/i965/brw_vs_constval.c @@ -144,17 +144,14 @@ static void calc_sizes( struct tracker *t ) * which describes the fragment program input sizes. */ for (vertRes = 0; vertRes VARYING_SLOT_MAX; vertRes++) { - - /* map vertex program output index to fragment program input index */ - GLint fragAttrib = _mesa_vert_result_to_frag_attrib(vertRes); - if (fragAttrib 0) + if (!_mesa_varying_slot_in_fs(vertRes)) continue; switch (get_output_size(t, vertRes)) { - case 4: t-size_masks[4-1] |= BITFIELD64_BIT(fragAttrib); - case 3: t-size_masks[3-1] |= BITFIELD64_BIT(fragAttrib); - case 2: t-size_masks[2-1] |= BITFIELD64_BIT(fragAttrib); - case 1: t-size_masks[1-1] |= BITFIELD64_BIT(fragAttrib); + case 4: t-size_masks[4-1] |= BITFIELD64_BIT(vertRes); + case 3: t-size_masks[3-1] |= BITFIELD64_BIT(vertRes); + case 2: t-size_masks[2-1] |= BITFIELD64_BIT(vertRes); + case 1: t-size_masks[1-1] |= BITFIELD64_BIT(vertRes); break; } } diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index cc11ca9..f8a6911 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -217,7 +217,7 @@ typedef enum * When this enum is updated, the following code must be updated too: * - vertResults (in prog_print.c's arb_output_attrib_string()) * - fragAttribs (in 
prog_print.c's arb_input_attrib_string()) - * - _mesa_vert_result_to_frag_attrib() + * - _mesa_varying_slot_in_fs() * - _mesa_frag_attrib_to_vert_result() */ typedef enum @@ -288,8 +288,8 @@ typedef enum /** * Indexes for fragment program input attributes. Note that - * _mesa_vert_result_to_frag_attrib() and frag_attrib_to_vert_result() make - * assumptions about the layout of this enum. + * _mesa_frag_attrib_to_vert_result() makes assumptions about the layout of + * this enum. */ typedef enum { @@ -315,26 +315,22 @@ typedef enum /** - * Convert from a gl_varying_slot value for a vertex output to the - * corresponding gl_frag_attrib. - * - * Varying output values which have no corresponding gl_frag_attrib - * (VARYING_SLOT_PSIZ, VARYING_SLOT_BFC0, VARYING_SLOT_BFC1, and - * VARYING_SLOT_EDGE) are converted to a value of -1. + * Determine if the given gl_varying_slot appears in the fragment shader. */ -static inline int -_mesa_vert_result_to_frag_attrib(gl_varying_slot vert_result) +static inline GLboolean +_mesa_varying_slot_in_fs(gl_varying_slot slot) { - if (vert_result = VARYING_SLOT_TEX7) - return vert_result; - else if (vert_result VARYING_SLOT_CLIP_DIST0) - return -1; - else if
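The archived diff is cut off before the body of the new function, so the following is only a rough sketch of the idea in the commit message, not the code from the patch. It assumes a tiny made-up slot enum (the SK_ names are hypothetical) and treats the point-size and back-color slots as having no FS input, as the removed doc comment describes. It also mirrors the calculate_urb_setup() comment: the register counter advances for every written slot, mapped or not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down slot list; not Mesa's gl_varying_slot. */
enum { SK_POS, SK_COL0, SK_PSIZ, SK_BFC0, SK_VAR0, SK_SLOT_MAX };

/* Predicate in the spirit of _mesa_varying_slot_in_fs(): it answers
 * "does this slot exist as a fragment shader input?" instead of
 * translating an index. */
static bool
sk_slot_in_fs(int slot)
{
   switch (slot) {
   case SK_PSIZ:
   case SK_BFC0:
      return false;
   default:
      return true;
   }
}

/* Assign one register per written slot. Slots the FS cannot read
 * still consume a register but get no mapping (-1).
 * Returns the number of registers consumed. */
static int
sk_assign_urb(uint64_t outputs_written, int urb_setup[SK_SLOT_MAX])
{
   int urb_next = 0;

   assert(urb_setup != 0);
   for (int i = 0; i < SK_SLOT_MAX; i++) {
      urb_setup[i] = -1;
      if (outputs_written & (UINT64_C(1) << i)) {
         if (sk_slot_in_fs(i))
            urb_setup[i] = urb_next;
         urb_next++;
      }
   }
   return urb_next;
}
```

Since the slot index is now used directly on both sides, the caller indexes urb_setup with the same i it used to test the written-outputs mask.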
[Mesa-dev] [PATCH 11/12] Get rid of _mesa_frag_attrib_to_vert_result().
Now that there is no difference between the enums that represent vertex outputs and fragment inputs, there's no need for a conversion function.
---
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 15 +++
 src/mesa/main/mtypes.h                    | 26 +-
 2 files changed, 8 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 3da220d..74b232f 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -56,24 +56,23 @@ uint32_t get_attr_override(struct brw_vue_map *vue_map, int urb_entry_read_offset, int fs_attr, bool two_side_color, uint32_t *max_source_attr)
 {
-   int vs_attr = _mesa_frag_attrib_to_vert_result(fs_attr);
-   if (vs_attr < 0 || vs_attr == VARYING_SLOT_POS) {
-      /* These attributes will be overwritten by the fragment shader's
-       * interpolation code (see emit_interp() in brw_wm_fp.c), so just let
-       * them reference the first available attribute.
+   if (fs_attr == FRAG_ATTRIB_WPOS) {
+      /* This attribute will be overwritten by the fragment shader's
+       * interpolation code (see emit_interp() in brw_wm_fp.c), so just let it
+       * reference the first available attribute.
        */
       return 0;
    }
    /* Find the VUE slot for this attribute. */
-   int slot = vue_map->vert_result_to_slot[vs_attr];
+   int slot = vue_map->vert_result_to_slot[fs_attr];
    /* If there was only a back color written but not front, use back
     * as the color instead of undefined
     */
-   if (slot == -1 && vs_attr == VARYING_SLOT_COL0)
+   if (slot == -1 && fs_attr == VARYING_SLOT_COL0)
       slot = vue_map->vert_result_to_slot[VARYING_SLOT_BFC0];
-   if (slot == -1 && vs_attr == VARYING_SLOT_COL1)
+   if (slot == -1 && fs_attr == VARYING_SLOT_COL1)
       slot = vue_map->vert_result_to_slot[VARYING_SLOT_BFC1];
    if (slot == -1) {
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index f8a6911..9c431af 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -218,7 +218,6 @@ typedef enum
  * - vertResults (in prog_print.c's arb_output_attrib_string())
  * - fragAttribs (in prog_print.c's arb_input_attrib_string())
  * - _mesa_varying_slot_in_fs()
- * - _mesa_frag_attrib_to_vert_result()
  */
 typedef enum
 {
@@ -287,9 +286,7 @@ typedef enum
 /*/
 /**
- * Indexes for fragment program input attributes. Note that
- * _mesa_frag_attrib_to_vert_result() makes assumptions about the layout of
- * this enum.
+ * Indexes for fragment program input attributes.
  */
 typedef enum
 {
@@ -335,27 +332,6 @@ _mesa_varying_slot_in_fs(gl_varying_slot slot)
 /**
- * Convert from a gl_frag_attrib value to the corresponding gl_varying_slot
- * for a vertex output.
- *
- * gl_frag_attrib values which have no corresponding vertex output
- * (FRAG_ATTRIB_FACE and FRAG_ATTRIB_PNTC) are converted to a value of -1.
- */
-static inline int
-_mesa_frag_attrib_to_vert_result(gl_frag_attrib frag_attrib)
-{
-   if (frag_attrib <= FRAG_ATTRIB_TEX7)
-      return frag_attrib;
-   else if (frag_attrib < FRAG_ATTRIB_CLIP_DIST0)
-      return -1;
-   else if (frag_attrib <= FRAG_ATTRIB_CLIP_DIST1)
-      return frag_attrib;
-   else /* frag_attrib >= FRAG_ATTRIB_VAR0 */
-      return frag_attrib;
-}
-
-
-/**
  * Bitflags for fragment program input attributes.
  */
 /*@{*/
-- 
1.8.1.5
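The lookup that get_attr_override() performs after this change can be sketched on its own: index the slot table directly with the attribute, and if a front color was never written, fall back to the matching back color. This is an illustrative sketch only. The SK_ enum and sk_lookup_slot() are hypothetical names, and the plain int array stands in for vue_map->vert_result_to_slot.

```c
#include <assert.h>

/* Hypothetical four-entry slot list; not Mesa's enum. */
enum { SK_COL0, SK_COL1, SK_BFC0, SK_BFC1, SK_NUM_SLOTS };

/* slot_for[attr] is the VUE slot for attr, or -1 if it was not written. */
static int
sk_lookup_slot(const int slot_for[SK_NUM_SLOTS], int attr)
{
   assert(attr >= 0 && attr < SK_NUM_SLOTS);

   int slot = slot_for[attr];

   /* Only a back color was written: use it instead of undefined data. */
   if (slot == -1 && attr == SK_COL0)
      slot = slot_for[SK_BFC0];
   if (slot == -1 && attr == SK_COL1)
      slot = slot_for[SK_BFC1];

   return slot;
}
```

The same index serves as both the fragment-side attribute and the vertex-side table key, which is what makes the removed conversion call unnecessary.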
[Mesa-dev] [PATCH 12/12] Replace gl_frag_attrib enum with gl_varying_slot.
This patch makes the following search-and-replace changes: gl_frag_attrib - gl_varying_slot FRAG_ATTRIB_* - VARYING_SLOT_* FRAG_BIT_* - VARYING_BIT_* -- Note: this patch is very large, so for purposes of mailing list discussion I've trimmed it down to representative example hunks. To see the complete patch, please refer to branch combine_varying_slot_enums of git://github.com/stereotype441/mesa.git. src/glsl/builtin_variables.cpp | 26 ++-- src/glsl/ir.cpp| 2 +- src/glsl/ir.h | 2 +- src/glsl/link_varyings.cpp | 13 +- src/glsl/linker.cpp| 2 +- src/mesa/drivers/dri/i915/i915_fragprog.c | 64 src/mesa/drivers/dri/i915/i915_state.c | 2 +- src/mesa/drivers/dri/i915/intel_tris.c | 2 +- src/mesa/drivers/dri/i965/brw_fs.cpp | 24 +-- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_fp.cpp| 10 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +- src/mesa/drivers/dri/i965/brw_sf.c | 2 +- src/mesa/drivers/dri/i965/brw_vs_constval.c| 8 +- src/mesa/drivers/dri/i965/brw_wm.c | 10 +- src/mesa/drivers/dri/i965/brw_wm_iz.cpp| 2 +- src/mesa/drivers/dri/i965/brw_wm_state.c | 2 +- src/mesa/drivers/dri/i965/gen6_sf_state.c | 16 +- src/mesa/drivers/dri/i965/gen6_wm_state.c | 2 +- src/mesa/drivers/dri/i965/gen7_sf_state.c | 14 +- src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +- src/mesa/drivers/x11/xm_line.c | 8 +- src/mesa/main/context.c| 4 +- src/mesa/main/ff_fragment_shader.cpp | 38 ++--- src/mesa/main/ffvertex_prog.c | 14 +- src/mesa/main/mtypes.h | 62 +--- src/mesa/main/state.h | 2 +- src/mesa/main/texstate.c | 2 +- src/mesa/program/ir_to_mesa.cpp| 4 +- src/mesa/program/prog_execute.c| 6 +- src/mesa/program/prog_print.c | 16 +- src/mesa/program/program.c | 10 +- src/mesa/program/program_parse.y | 8 +- src/mesa/program/programopt.c | 14 +- src/mesa/state_tracker/st_atom_pixeltransfer.c | 4 +- src/mesa/state_tracker/st_atom_rasterizer.c| 4 +- src/mesa/state_tracker/st_cb_bitmap.c | 4 +- src/mesa/state_tracker/st_cb_drawpixels.c | 10 +- 
src/mesa/state_tracker/st_cb_drawtex.c | 2 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 26 ++-- src/mesa/state_tracker/st_mesa_to_tgsi.c | 12 +- src/mesa/state_tracker/st_program.c| 50 +++ src/mesa/swrast/s_aaline.c | 8 +- src/mesa/swrast/s_aalinetemp.h | 22 +-- src/mesa/swrast/s_aatritemp.h | 52 +++ src/mesa/swrast/s_alpha.c | 2 +- src/mesa/swrast/s_atifragshader.c | 10 +- src/mesa/swrast/s_context.c| 44 +++--- src/mesa/swrast/s_context.h| 6 +- src/mesa/swrast/s_copypix.c| 4 +- src/mesa/swrast/s_drawpix.c| 4 +- src/mesa/swrast/s_feedback.c | 24 +-- src/mesa/swrast/s_fog.c| 20 +-- src/mesa/swrast/s_fragprog.c | 12 +- src/mesa/swrast/s_lines.c | 12 +- src/mesa/swrast/s_linetemp.h | 62 src/mesa/swrast/s_logic.c | 2 +- src/mesa/swrast/s_masking.c| 2 +- src/mesa/swrast/s_points.c | 84 +-- src/mesa/swrast/s_span.c | 112 +++--- src/mesa/swrast/s_span.h | 12 +- src/mesa/swrast/s_texcombine.c | 6 +- src/mesa/swrast/s_texfilter.c | 2 +- src/mesa/swrast/s_triangle.c | 40 ++--- src/mesa/swrast/s_tritemp.h| 200 - src/mesa/swrast/s_zoom.c | 26 ++-- src/mesa/swrast/swrast.h | 6 +- src/mesa/swrast_setup/ss_context.c | 30 ++-- src/mesa/swrast_setup/ss_triangle.c| 12 +- src/mesa/swrast_setup/ss_tritmp.h | 92 ++-- src/mesa/tnl/t_context.c | 6 +- src/mesa/tnl_dd/t_dd_vb.c | 96 ++-- 72 files changed, 728 insertions(+), 791 deletions(-) diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index 531effd..b0c7a20 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -52,13 +52,13 @@ static const builtin_variable builtin_core_vs_variables[] = { }; static const builtin_variable
[Mesa-dev] [Bug 58718] Crash in src_register() during glClear() call
https://bugs.freedesktop.org/show_bug.cgi?id=58718

--- Comment #11 from Keith Kriewall keith.kriew...@attachmate.com ---
I just tried that (signed fields ahead of unsigned) and it didn't help in this case. The modified struct began as:

struct prog_src_register
{
   GLint Index:(INST_INDEX_BITS+1);  /**< Extra bit here for sign bit.
                                      * May be negative for relative addressing.
                                      */
   GLint Index2:(INST_INDEX_BITS+1); /**< Extra bit here for sign bit.
                                      * May be negative for relative
                                      * addressing.
                                      */
   GLuint File:4;                    /**< One of the PROGRAM_* register file values. */
   GLuint Swizzle:12;
   . . .
}
[Mesa-dev] shader_time update
Here's my rebase of Ken's shader_time updates for Haswell. It attempts to incorporate the review feedback, plus my discomfort with the discussion of addressing changes (it was superfluous: the messages we're sending use the same untyped byte offsets both ways, and RAW vs RGBA32F in the surface state doesn't matter). I hope this accurately reflects review, so I'd appreciate an ack from Ken/Paul if they agree.

While doing so, I also made the simple change I'd been thinking about to reduce the overhead of shader_time.

Available at the shader_time branch of my tree.
[Mesa-dev] [PATCH 4/5] i965: Specialize SURFACE_STATE creation for shader time.
From: Kenneth Graunke kenn...@whitecape.org This is basically a copy and paste of gen7_create_constant_surface, but with the parameters filled in to offer a simpler interface. It will diverge shortly. I didn't bother adding it to the vtable for now since shader time is only exposed on Gen7+. v2: Replace tabs in the new code (by anholt) Add back dropped memset() and add a comment about HSW channel selects. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt e...@anholt.net Reviewed-by: Paul Berry stereotype...@gmail.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_state.h |2 + src/mesa/drivers/dri/i965/brw_vs_surface_state.c |4 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c |4 +- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 41 + 4 files changed, 45 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h index ecc61c4..02ce57b 100644 --- a/src/mesa/drivers/dri/i965/brw_state.h +++ b/src/mesa/drivers/dri/i965/brw_state.h @@ -216,6 +216,8 @@ void gen7_set_surface_mcs_info(struct brw_context *brw, bool is_render_target); void gen7_check_surface_setup(uint32_t *surf, bool is_render_target); void gen7_init_vtable_surface_functions(struct brw_context *brw); +void gen7_create_shader_time_surface(struct brw_context *brw, + uint32_t *out_offset); /* brw_wm_sampler_state.c */ uint32_t translate_wrap_mode(GLenum wrap, bool using_nearest); diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c index 4da7eaa..9aa775f 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c @@ -142,9 +142,7 @@ brw_vs_upload_binding_table(struct brw_context *brw) int i; if (INTEL_DEBUG DEBUG_SHADER_TIME) { - intel-vtbl.create_constant_surface(brw, brw-shader_time.bo, 0, - brw-shader_time.bo-size, - brw-vs.surf_offset[SURF_INDEX_VS_SHADER_TIME]); + 
gen7_create_shader_time_surface(brw, brw-vs.surf_offset[SURF_INDEX_VS_SHADER_TIME]); assert(brw-vs.prog_data-num_surfaces = SURF_INDEX_VS_SHADER_TIME); brw-vs.prog_data-num_surfaces = SURF_INDEX_VS_SHADER_TIME; diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index 7979487..45a2ffa 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -1487,9 +1487,7 @@ brw_upload_wm_binding_table(struct brw_context *brw) int i; if (INTEL_DEBUG DEBUG_SHADER_TIME) { - intel-vtbl.create_constant_surface(brw, brw-shader_time.bo, 0, - brw-shader_time.bo-size, - brw-wm.surf_offset[SURF_INDEX_WM_SHADER_TIME]); + gen7_create_shader_time_surface(brw, brw-wm.surf_offset[SURF_INDEX_WM_SHADER_TIME]); } /* Might want to calculate nr_surfaces first, to avoid taking up so much diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index ef062d2..034be82 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -424,6 +424,47 @@ gen7_create_constant_surface(struct brw_context *brw, gen7_check_surface_setup(surf, false /* is_render_target */); } +/** + * Create a surface for shader time. 
+ */ +void +gen7_create_shader_time_surface(struct brw_context *brw, uint32_t *out_offset) +{ + struct intel_context *intel = brw-intel; + const int w = brw-shader_time.bo-size - 1; + + uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, +8 * 4, 32, out_offset); + memset(surf, 0, 8 * 4); + + surf[0] = BRW_SURFACE_BUFFER BRW_SURFACE_TYPE_SHIFT | + BRW_SURFACEFORMAT_R32G32B32A32_FLOAT BRW_SURFACE_FORMAT_SHIFT | + BRW_SURFACE_RC_READ_WRITE; + + surf[1] = brw-shader_time.bo-offset; /* reloc */ + + surf[2] = SET_FIELD(w 0x7f, GEN7_SURFACE_WIDTH) | + SET_FIELD((w 7) 0x1fff, GEN7_SURFACE_HEIGHT); + surf[3] = SET_FIELD((w 20) 0x7f, BRW_SURFACE_DEPTH) | + (16 - 1); /* stride between samples */ + + /* Unlike texture or renderbuffer surfaces, we only do untyped operations +* on the shader_time surface, so there's no need to set HSW channel +* overrides. +*/ + + /* Emit relocation to surface contents. Section 5.1.1 of the gen4 +* bspec (Data Cache) says that the data cache does not exist as +* a separate cache and is just the sampler cache. +*/ + drm_intel_bo_emit_reloc(intel-batch.bo, + *out_offset + 4, +
Re: [Mesa-dev] Software Rendering without X
On 03/11/2013 12:44 PM, ritvik_sha...@dell.com wrote:
> Hi All,
>
> I don't have any graphics cards that support OpenGL, so I want to perform software rendering with Mesa without X, DRM etc.
>
> Also, can someone explain how functions such as glClear (_mesa_clear being the actual implementation) are invoked without GLX?

See the Off-Screen Rendering page on the website. Or see the GL/osmesa.h header file. There are some demo programs in the Mesa demos package in src/osdemos/.

BTW, I'm actually on the verge of posting some patches which port the OSMesa interface to gallium...

-Brian
[Mesa-dev] [PATCH 5/5] i965: Make INTEL_DEBUG=shader_time use the RAW surface format.
From: Kenneth Graunke kenn...@whitecape.org

Untyped Atomic Operation messages are illegal for non-RAW formats. The IVB hardware proceeds happily (after all, who cares what the format of the surface is if you're doing untyped ops on it?), but later hardware apparently doesn't. The simulator for gen7 does complain, though.

v2: Rebase against updates to previous patches. (by anholt)

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Eric Anholt e...@anholt.net
Reviewed-by: Paul Berry stereotype...@gmail.com
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_defines.h           |    1 +
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |    5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index d0fe9be..efd5b4d 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -437,6 +437,7 @@
 #define BRW_SURFACEFORMAT_B10G10R10A2_SSCALED 0x1B9
 #define BRW_SURFACEFORMAT_B10G10R10A2_UINT    0x1BA
 #define BRW_SURFACEFORMAT_B10G10R10A2_SINT    0x1BB
+#define BRW_SURFACEFORMAT_RAW                 0x1FF
 #define BRW_SURFACE_FORMAT_SHIFT 18
 #define BRW_SURFACE_FORMAT_MASK  INTEL_MASK(26, 18)
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 034be82..db04253 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -438,15 +438,14 @@ gen7_create_shader_time_surface(struct brw_context *brw, uint32_t *out_offset)
    memset(surf, 0, 8 * 4);
    surf[0] = BRW_SURFACE_BUFFER << BRW_SURFACE_TYPE_SHIFT |
-             BRW_SURFACEFORMAT_R32G32B32A32_FLOAT << BRW_SURFACE_FORMAT_SHIFT |
+             BRW_SURFACEFORMAT_RAW << BRW_SURFACE_FORMAT_SHIFT |
              BRW_SURFACE_RC_READ_WRITE;
    surf[1] = brw->shader_time.bo->offset; /* reloc */
    surf[2] = SET_FIELD(w & 0x7f, GEN7_SURFACE_WIDTH) |
              SET_FIELD((w >> 7) & 0x1fff, GEN7_SURFACE_HEIGHT);
-   surf[3] = SET_FIELD((w >> 20) & 0x7f, BRW_SURFACE_DEPTH) |
-             (16 - 1); /* stride between samples */
+   surf[3] = SET_FIELD((w >> 20) & 0x7f, BRW_SURFACE_DEPTH);
    /* Unlike texture or renderbuffer surfaces, we only do untyped operations
     * on the shader_time surface, so there's no need to set HSW channel
-- 
1.7.10.4
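As an aside on the surf[2]/surf[3] lines in this hunk: the buffer size minus one is split across three bit fields (7 bits of width, 13 bits of height, 7 bits of depth). A small sketch of that arithmetic, assuming the shifts and masks read as w & 0x7f, (w >> 7) & 0x1fff and (w >> 20) & 0x7f; the sk_ names are made up for illustration, and the real code goes through SET_FIELD() to position each value in its dword.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three field values. */
struct sk_buffer_dims {
   uint32_t width;   /* bits  6:0  of (size - 1) */
   uint32_t height;  /* bits 19:7  of (size - 1) */
   uint32_t depth;   /* bits 26:20 of (size - 1) */
};

/* Split a buffer size in bytes the way the patch splits "w". */
static struct sk_buffer_dims
sk_split_buffer_size(uint32_t size_bytes)
{
   /* 7 + 13 + 7 = 27 bits are available for (size - 1). */
   assert(size_bytes > 0 && size_bytes <= (UINT32_C(1) << 27));

   const uint32_t w = size_bytes - 1;
   struct sk_buffer_dims d;

   d.width  = w & 0x7f;
   d.height = (w >> 7) & 0x1fff;
   d.depth  = (w >> 20) & 0x7f;
   return d;
}

/* Inverse, to check that no bits are lost. */
static uint32_t
sk_join_buffer_size(struct sk_buffer_dims d)
{
   return (d.width | (d.height << 7) | (d.depth << 20)) + 1;
}
```

For a 4096-byte buffer this gives width 127, height 31 and depth 0, and joining the fields recovers 4096.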
[Mesa-dev] [PATCH 1/5] i965: Split shader_time entries into separate cachelines.
This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS).

Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).
---
 src/mesa/drivers/dri/i965/brw_fs.cpp    |    2 +-
 src/mesa/drivers/dri/i965/brw_program.c |    2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp  |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8ce3954..dd3baa9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum shader_time_shader_type type,
    fs_reg offset_mrf = fs_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, fs_reg(shader_time_index * 64)));
    fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index 75eb6bc..c1aeefc 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -409,7 +409,7 @@ brw_collect_shader_time(struct brw_context *brw)
    uint32_t *times = brw->shader_time.bo->virtual;
    for (int i = 0; i < brw->shader_time.num_entries; i++) {
-      brw->shader_time.cumulative[i] += times[i];
+      brw->shader_time.cumulative[i] += times[i * 16];
    }
    /* Zero the BO out to clear it out for our next collection.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f319f32..b87d62b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,
    dst_reg offset_mrf = dst_reg(MRF, base_mrf);
    offset_mrf.type = BRW_REGISTER_TYPE_UD;
-   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
+   emit(MOV(offset_mrf, src_reg(shader_time_index * 64)));
    dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
    time_mrf.type = BRW_REGISTER_TYPE_UD;
-- 
1.7.10.4
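The two constants in this patch are the same stride seen from both sides: the shaders write each counter at byte offset index * 64, and the CPU reads the buffer as uint32_t, so it steps by 64 / 4 = 16 elements. A minimal sketch of that relationship, assuming 64 bytes is the cacheline size the subject line refers to; the sk_ names and the uint64_t accumulator type are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One counter per 64-byte line, as implied by the "* 64" in the patch. */
enum { SK_ENTRY_STRIDE_BYTES = 64 };

/* Byte offset the shader side would write entry "index" to. */
static uint32_t
sk_entry_byte_offset(uint32_t index)
{
   return index * SK_ENTRY_STRIDE_BYTES;
}

/* CPU-side collection: the buffer is viewed as dwords, so consecutive
 * entries are 64 / sizeof(uint32_t) = 16 elements apart. */
static void
sk_collect(const uint32_t *times, size_t num_entries, uint64_t *cumulative)
{
   assert(times != NULL && cumulative != NULL);

   const size_t stride_dwords = SK_ENTRY_STRIDE_BYTES / sizeof(uint32_t);

   for (size_t i = 0; i < num_entries; i++)
      cumulative[i] += times[i * stride_dwords];
}
```

The cost is a buffer 16 times larger for the same number of entries, in exchange for entries written by different shader stages no longer sharing a line.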
[Mesa-dev] [PATCH 3/5] i965: Fix INTEL_DEBUG=shader_time for Haswell.
From: Kenneth Graunke kenn...@whitecape.org

Haswell's Data Cache data port is a single unit, but it is split into two SFIDs to allow for more message types without adding more bits to the message descriptor. Untyped Atomic Operations are now message 0010 in the second data cache data port, rather than 6 in the first.

v2: Use the #defines from the previous commit. (by anholt)

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
Reviewed-by: Eric Anholt e...@anholt.net (v1)
---
 src/mesa/drivers/dri/i965/brw_defines.h |    1 +
 src/mesa/drivers/dri/i965/brw_eu_emit.c |   15 +++++++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 042abcd..d0fe9be 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -859,6 +859,7 @@ enum brw_message_target {
    GEN6_SFID_DATAPORT_CONSTANT_CACHE = 9,
    GEN7_SFID_DATAPORT_DATA_CACHE = 10,
+   HSW_SFID_DATAPORT_DATA_CACHE_1= 12,
 };
 #define GEN7_MESSAGE_TARGET_DP_DATA_CACHE 10
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 8ed8c4a..992e784 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2455,15 +2455,22 @@ void brw_shader_time_add(struct brw_compile *p,
    brw_set_src0(p, send, brw_vec1_reg(BRW_MESSAGE_REGISTER_FILE,
                                       base_mrf, 0));
 
+   uint32_t sfid, msg_type;
+   if (intel->is_haswell) {
+      sfid = HSW_SFID_DATAPORT_DATA_CACHE_1;
+      msg_type = HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP;
+   } else {
+      sfid = GEN7_SFID_DATAPORT_DATA_CACHE;
+      msg_type = GEN7_DATAPORT_DC_UNTYPED_ATOMIC_OP;
+   }
+
    bool header_present = false;
    bool eot = false;
    uint32_t mlen = 2; /* offset, value */
    uint32_t rlen = 0;
-   brw_set_message_descriptor(p, send,
-                              GEN7_SFID_DATAPORT_DATA_CACHE,
-                              mlen, rlen, header_present, eot);
+   brw_set_message_descriptor(p, send, sfid, mlen, rlen, header_present, eot);
 
-   send->bits3.ud |= 6 << 14; /* untyped atomic op */
+   send->bits3.ud |= msg_type << 14;
    send->bits3.ud |= 0 << 13; /* no return data */
    send->bits3.ud |= 1 << 12; /* SIMD8 mode */
    send->bits3.ud |= BRW_AOP_ADD << 8;
-- 
1.7.10.4
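For readers following the bit layout, here is a minimal standalone sketch of the selection logic in the patch above. It is not the driver code: the struct dc_message type and the untyped_atomic_message() helper are hypothetical names made up for illustration, the atomic opcode is passed in as a parameter instead of relying on the real BRW_AOP_ADD value, and only the SFID and message-type constants are taken from the patches in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the brw_defines.h hunks in this series. */
#define GEN7_SFID_DATAPORT_DATA_CACHE            10
#define HSW_SFID_DATAPORT_DATA_CACHE_1           12
#define GEN7_DATAPORT_DC_UNTYPED_ATOMIC_OP        6
#define HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP   2

/* Hypothetical container for the two things the patch computes. */
struct dc_message {
   uint32_t sfid;      /* which shared function receives the message */
   uint32_t desc_bits; /* the bits OR'ed into the message descriptor */
};

/* Hypothetical helper mirroring the if/else added to brw_shader_time_add().
 * atomic_op stands in for BRW_AOP_ADD; its value is the caller's business.
 */
static struct dc_message
untyped_atomic_message(bool is_haswell, uint32_t atomic_op)
{
   struct dc_message m;
   uint32_t msg_type;

   if (is_haswell) {
      m.sfid = HSW_SFID_DATAPORT_DATA_CACHE_1;
      msg_type = HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP;
   } else {
      m.sfid = GEN7_SFID_DATAPORT_DATA_CACHE;
      msg_type = GEN7_DATAPORT_DC_UNTYPED_ATOMIC_OP;
   }

   m.desc_bits = 0;
   m.desc_bits |= msg_type << 14;  /* message type */
   m.desc_bits |= 0u << 13;        /* no return data */
   m.desc_bits |= 1u << 12;        /* SIMD8 mode */
   m.desc_bits |= atomic_op << 8;  /* atomic operation */
   return m;
}
```

With an arbitrary placeholder opcode of 7, the Haswell path yields SFID 12 with descriptor bits 0x9700, and the Gen7 path yields SFID 10 with 0x19700. The only differences are the SFID and the message-type field at bit 14, which is exactly what the patch changes.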
Re: [Mesa-dev] remove mfeatures.h, take two
On Sat, Mar 2, 2013 at 7:29 AM, Brian Paul bri...@vmware.com wrote:
> I've respun my remove-mfeatures branch as remove-mfeatures-2. It's in my repo on freedesktop.org.
>
> This removes the mfeatures.h file and the last of the #ifdef FEATURE_x code from core Mesa. Note, however, that the scons/makefiles still define FEATURE_GL, FEATURE_ES1, FEATURE_ES2, etc. because that controls whether some source files are built, and in some places (like egl-static/egl_st.c) we need to test those symbols to avoid calling non-existent functions.
>
> If a few people can test, that'd be great.

Ping. Can anyone test the updated remove-mfeatures-2 branch? Thanks.

-Brian
[Mesa-dev] [PATCH 0/7] OSMesa interface for gallium
This series lets us use the OSMesa interface with the gallium llvmpipe/softpipe drivers. The OSMesa interface is pretty old and ugly, but I think most OSMesa users would rather not have to port their apps to a new interface.

When Mesa is configured with --enable-osmesa, a new lib/gallium/libOSMesa.so will be built.

Matt, my autoconf fu isn't very strong, so my Makefile.am files may need some tweaks. Plus, I see that you're working on more autoconf changes anyway.

-Brian
[Mesa-dev] [PATCH 4/7] targets/osmesa: new OSMesa gallium target
---
 src/gallium/targets/osmesa/target.c | 55 +++
 1 files changed, 55 insertions(+), 0 deletions(-)
 create mode 100644 src/gallium/targets/osmesa/target.c

diff --git a/src/gallium/targets/osmesa/target.c b/src/gallium/targets/osmesa/target.c
new file mode 100644
index 0000000..33e9351
--- /dev/null
+++ b/src/gallium/targets/osmesa/target.c
@@ -0,0 +1,55 @@
+/*
+ * Copyright (c) 2013 Brian Paul All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
+ * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include "target-helpers/inline_sw_helper.h"
+#include "target-helpers/inline_debug_helper.h"
+
+#include "sw/null/null_sw_winsys.h"
+
+
+struct pipe_screen *
+osmesa_create_screen(void);
+
+
+struct pipe_screen *
+osmesa_create_screen(void)
+{
+   struct sw_winsys *winsys;
+   struct pipe_screen *screen;
+
+   /* We use a null software winsys since we always just render to ordinary
+    * driver resources.
+    */
+   winsys = null_sw_create();
+   if (!winsys)
+      return NULL;
+
+   /* Create llvmpipe or softpipe screen */
+   screen = sw_screen_create(winsys);
+   if (!screen) {
+      winsys->destroy(winsys);
+      return NULL;
+   }
+
+   /* Inject optional trace, debug, etc. wrappers */
+   return debug_screen_wrap(screen);
+}
-- 
1.7.3.4
[Mesa-dev] [PATCH 6/7] configure: wire-up new OSMesa gallium state tracker and target
---
 configure.ac |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/configure.ac b/configure.ac
index 3204869..c42d0b0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -805,6 +805,8 @@ fi
 if test x$enable_osmesa = xyes; then
     DRIVER_DIRS="$DRIVER_DIRS osmesa"
+    GALLIUM_STATE_TRACKERS_DIRS="osmesa $GALLIUM_STATE_TRACKERS_DIRS"
+    GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS osmesa"
 fi
 AC_SUBST([SRC_DIRS])
@@ -2081,6 +2083,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/state_trackers/egl/Makefile
 		src/gallium/state_trackers/gbm/Makefile
 		src/gallium/state_trackers/glx/Makefile
+		src/gallium/state_trackers/osmesa/Makefile
 		src/gallium/state_trackers/vdpau/Makefile
 		src/gallium/state_trackers/vega/Makefile
 		src/gallium/state_trackers/xa/Makefile
@@ -2097,6 +2100,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/targets/egl-static/Makefile
 		src/gallium/targets/gbm/Makefile
 		src/gallium/targets/opencl/Makefile
+		src/gallium/targets/osmesa/Makefile
 		src/gallium/targets/pipe-loader/Makefile
 		src/gallium/targets/libgl-xlib/Makefile
 		src/gallium/targets/vdpau-nouveau/Makefile
-- 
1.7.3.4
[Mesa-dev] [PATCH 1/7] st/mesa: add PIPE_FORMAT_R16G16B16A16_UNORM renderbuffer support
To allow rendering in 16-bit/channel RGBA buffers.
---
 src/mesa/state_tracker/st_cb_fbo.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_fbo.c b/src/mesa/state_tracker/st_cb_fbo.c
index 72bc960..87c5b04 100644
--- a/src/mesa/state_tracker/st_cb_fbo.c
+++ b/src/mesa/state_tracker/st_cb_fbo.c
@@ -330,6 +330,9 @@ st_new_renderbuffer_fb(enum pipe_format format, int samples, boolean sw)
       /* accum buffer */
       strb->Base.InternalFormat = GL_RGBA16_SNORM;
       break;
+   case PIPE_FORMAT_R16G16B16A16_UNORM:
+      strb->Base.InternalFormat = GL_RGBA16;
+      break;
    case PIPE_FORMAT_R8_UNORM:
       strb->Base.InternalFormat = GL_R8;
       break;
-- 
1.7.3.4
[Mesa-dev] [PATCH 5/7] target/osmesa: add new Makefile.am
--- src/gallium/targets/osmesa/Makefile.am | 91 1 files changed, 91 insertions(+), 0 deletions(-) create mode 100644 src/gallium/targets/osmesa/Makefile.am diff --git a/src/gallium/targets/osmesa/Makefile.am b/src/gallium/targets/osmesa/Makefile.am new file mode 100644 index 000..e187a47 --- /dev/null +++ b/src/gallium/targets/osmesa/Makefile.am @@ -0,0 +1,91 @@ +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, +# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, +# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. 
+ +include $(top_srcdir)/src/gallium/Automake.inc + +AM_CFLAGS = $(GALLIUM_CFLAGS) + +AM_CPPFLAGS = \ + -I$(top_srcdir)/include \ + -I$(top_srcdir)/src/mapi \ + -I$(top_srcdir)/src/mesa \ + -I$(top_srcdir)/src/gallium/include \ + -I$(top_srcdir)/src/gallium/drivers \ + -I$(top_srcdir)/src/gallium/winsys \ + -I$(top_srcdir)/src/gallium/auxiliary \ + -DGALLIUM_SOFTPIPE \ + -DGALLIUM_TRACE + +lib_LTLIBRARIES = lib@OSMESA_LIB@.la + +lib@OSMESA_LIB@_la_SOURCES = target.c + +lib@OSMESA_LIB@_la_LDFLAGS = -module -version-number @OSMESA_VERSION@ -no-undefined + +if HAVE_SHARED_GLAPI +GLAPI_LIB = $(top_builddir)/src/mapi/shared-glapi/libglapi.la +else +GLAPI_LIB = $(top_builddir)/src/mapi/glapi/libglapi.la +endif + +lib@OSMESA_LIB@_la_LIBADD = \ + $(top_builddir)/src/mesa/libmesagallium.la \ + $(top_builddir)/src/gallium/auxiliary/libgallium.la \ + $(top_builddir)/src/gallium/winsys/sw/null/libws_null.la \ + $(top_builddir)/src/gallium/drivers/trace/libtrace.la \ + $(top_builddir)/src/gallium/drivers/softpipe/libsoftpipe.la \ +$(top_builddir)/src/gallium/state_trackers/osmesa/libosmesa.la \ + $(GLAPI_LIB) \ + $(OSMESA_LIB_DEPS) \ + $(CLOCK_LIB) + + + +if HAVE_MESA_LLVM +lib@OSMESA_LIB@_la_LINK = $(CXXLINK) $(lib@OSMESA_LIB@_la_LDFLAGS) +# Mention a dummy pure C++ file to trigger generation of the $(LINK) variable +nodist_EXTRA_lib@OSMESA_LIB@_la_SOURCES = dummy-cpp.cpp + +lib@OSMESA_LIB@_la_LIBADD += $(top_builddir)/src/gallium/drivers/llvmpipe/libllvmpipe.la $(LLVM_LIBS) +AM_CPPFLAGS += -DGALLIUM_LLVMPIPE +lib@OSMESA_LIB@_la_LDFLAGS += $(LLVM_LDFLAGS) +else +lib@OSMESA_LIB@_la_LINK = $(CXXLINK) $(lib@OSMESA_LIB@_la_LDFLAGS) +# Mention a dummy pure C file to trigger generation of the $(LINK) variable +nodist_EXTRA_lib@OSMESA_LIB@_la_SOURCES = dummy-c.c +endif + + + +if BUILD_SHARED +# Provide compatibility with scripts for the old Mesa build system for +# a while by putting a link to the driver into /lib of the build tree. 
+all-local: lib@OSMESA_LIB@.la + $(MKDIR_P) $(top_builddir)/$(LIB_DIR); + $(MKDIR_P) $(top_builddir)/$(LIB_DIR)/gallium; + ln -f .libs/lib@OSMESA_LIB@.so $(top_builddir)/$(LIB_DIR)/gallium/lib@OSMESA_LIB@.so; + ln -f .libs/lib@OSMESA_LIB@.so.@OSMESA_VERSION@ $(top_builddir)/$(LIB_DIR)/gallium/lib@OSMESA_LIB@.so.@OSMESA_VERSION@; + cp .libs/lib@OSMESA_LIB@.so.@OSMESA_VERSION@.0.0 $(top_builddir)/$(LIB_DIR)/gallium/ +endif + +# XXX fix-up? +#pkgconfigdir = $(libdir)/pkgconfig +#pkgconfig_DATA = osmesa.pc -- 1.7.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/7] docs: rewrite the OSMesa info / instructions
--- docs/osmesa.html | 65 - 1 files changed, 25 insertions(+), 40 deletions(-) diff --git a/docs/osmesa.html b/docs/osmesa.html index b0609cf..8487545 100644 --- a/docs/osmesa.html +++ b/docs/osmesa.html @@ -18,77 +18,62 @@ p -Mesa's off-screen rendering interface is used for rendering into -user-allocated blocks of memory. +Mesa's off-screen interface is used for rendering into user-allocated memory +without any sort of window system or operating system dependencies. That is, the GL_FRONT colorbuffer is actually a buffer in main memory, rather than a window on your display. -There are no window system or operating system dependencies. -One potential application is to use Mesa as an off-line, batch-style renderer. /p p -The bOSMesa/b API provides three basic functions for making off-screen +The OSMesa API provides three basic functions for making off-screen renderings: OSMesaCreateContext(), OSMesaMakeCurrent(), and OSMesaDestroyContext(). See the Mesa/include/GL/osmesa.h header for more information about the API functions. /p p -There are several examples of OSMesa in the mesa/demos repository. +The OSMesa interface may be used with any of three software renderers: /p +ol +lillvmpipe - this is the high-performance Gallium LLVM driver +lisoftpipe - this it the reference Gallium software driver +liswrast - this is the legacy Mesa software rasterizer +/ol -h2Deep color channels/h2 - p -For some applications 8-bit color channels don't have sufficient -precision. -OSMesa supports 16-bit and 32-bit color channels through the OSMesa interface. -When using 16-bit channels, channels are GLushorts and RGBA pixels occupy -8 bytes. -When using 32-bit channels, channels are GLfloats and RGBA pixels occupy -16 bytes. +There are several examples of OSMesa in the mesa/demos repository. /p -p -Before version 6.5.1, Mesa had to be recompiled to support exactly -one of 8, 16 or 32-bit channels. 
-With Mesa 6.5.1, Mesa can be compiled for either 8, 16 or 32-bit channels -and render into any of the smaller size channels. -For example, if Mesa's compiled for 32-bit channels, you can also render -16 and 8-bit channel images. -/p +h1Building OSMesa/h1 p -To build Mesa/OSMesa for 16 and 8-bit color channel support: +Configure and build Mesa with something like: + pre - make realclean - make linux-osmesa16 +configure --enable-osmesa --disable-driglx-direct --disable-dri --with-gallium-drivers=swrast +make /pre p -To build Mesa/OSMesa for 32, 16 and 8-bit color channel support: -pre - make realclean - make linux-osmesa32 -/pre +Make sure you have LLVM installed first if you want to use the llvmpipe driver. +/p p -You'll wind up with a library named libOSMesa16.so or libOSMesa32.so. -Otherwise, most Mesa configurations build an 8-bit/channel libOSMesa.so library -by default. +When the build is complete you should find: /p +pre +lib/libOSMesa.so (swrast-based OSMesa) +lib/gallium/libOSMsea.so (gallium-based OSMesa) +/pre p -If performance is important, compile Mesa for the channel size you're -most interested in. +Set your LD_LIBRARY_PATH to point to one directory or the other to select +the library you want to use. /p p -If you need to compile on a non-Linux platform, copy Mesa/configs/linux-osmesa16 -to a new config file and edit it as needed. Then, add the new config name to -the top-level Makefile. Send a patch to the Mesa developers too, if you're -inclined. +When you link your application, link with -lOSMesa /p /div -- 1.7.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/7] st/osmesa: new OSMesa gallium state tracker
--- src/gallium/state_trackers/osmesa/osmesa.c | 828 1 files changed, 828 insertions(+), 0 deletions(-) create mode 100644 src/gallium/state_trackers/osmesa/osmesa.c diff --git a/src/gallium/state_trackers/osmesa/osmesa.c b/src/gallium/state_trackers/osmesa/osmesa.c new file mode 100644 index 000..fafcf19 --- /dev/null +++ b/src/gallium/state_trackers/osmesa/osmesa.c @@ -0,0 +1,828 @@ +/* + * Copyright (c) 2013 Brian Paul All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN + * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + */ + + +/* + * Off-Screen rendering into client memory. + * State tracker for gallium (for softpipe and llvmpipe) + * + * Notes: + * + * If Gallium is built with LLVM support we use the llvmpipe driver. + * Otherwise we use softpipe. The GALLIUM_DRIVER environment variable + * may be set to softpipe or llvmpipe to override. + * + * With softpipe we could render directly into the user's buffer by using a + * display target resource. 
However, softpipe doesn't suport upside-down + * rendering which would be needed for the OSMESA_Y_UP=TRUE case. + * + * With llvmpipe we could only render directly into the user's buffer when its + * width and height is a multiple of the tile size (64 pixels). + * + * Because of these constraints we always render into ordinary resources then + * copy the results to the user's buffer in the flush_front() function which + * is called when the app calls glFlush/Finish. + * + * In general, the OSMesa interface is pretty ugly and not a good match + * for Gallium. But we're interested in doing the best we can to preserve + * application portability. With a little work we could come up with a + * much nicer, new off-screen Gallium interface... + */ + + +#include GL/osmesa.h + +#include glapi/glapi.h /* for OSMesaGetProcAddress below */ + +#include pipe/p_context.h +#include pipe/p_screen.h +#include pipe/p_state.h + +#include util/u_atomic.h +#include util/u_box.h +#include util/u_format.h +#include util/u_memory.h + +#include state_tracker/st_api.h +#include state_tracker/st_gl_api.h + + + +extern struct pipe_screen * +osmesa_create_screen(void); + + + +struct osmesa_buffer +{ + struct st_framebuffer_iface *stfb; + struct st_visual visual; + unsigned width, height; + + struct pipe_resource *textures[ST_ATTACHMENT_COUNT]; + + void *map; +}; + + +struct osmesa_context +{ + struct st_context_iface *stctx; + + struct osmesa_buffer *current_buffer; + + enum pipe_format depth_stencil_format, accum_format; + + GLenum format; /* User-specified context format */ + GLenum type; /* Buffer's data type */ + GLint user_row_length; /* user-specified number of pixels per row */ + GLboolean y_up;/* TRUE - Y increases upward */ + /* FALSE - Y increases downward */ +}; + + + +/** + * Called from the ST manager. + */ +static int +osmesa_st_get_param(struct st_manager *smapi, enum st_manager_param param) +{ + /* no-op */ + return 0; +} + + +/** + * Create/return singleton st_api object. 
+ */ +static struct st_api * +get_st_api(void) +{ + static struct st_api *stapi = NULL; + if (!stapi) { + stapi = st_gl_api_create(); + } + return stapi; +} + + +/** + * Create/return a singleton st_manager object. + */ +static struct st_manager * +get_st_manager(void) +{ + static struct st_manager *stmgr = NULL; + if (!stmgr) { + stmgr = CALLOC_STRUCT(st_manager); + if (stmgr) { + stmgr-screen = osmesa_create_screen(); + stmgr-get_param = osmesa_st_get_param; + stmgr-get_egl_image = NULL; + } + } + return stmgr; +} + + +static INLINE boolean +little_endian(void) +{ + const unsigned ui = 1; + return *((const char *) ui); +} + + +/** + * Given an OSMESA_x format and a GL_y type, return the best + * matching PIPE_FORMAT_z. + * Note that we can't exactly match all user format/type combinations + * with gallium formats. If we find this to be a
[Mesa-dev] [PATCH 3/7] st/osmesa: add new Makefile.am
--- src/gallium/state_trackers/osmesa/Makefile.am | 41 + 1 files changed, 41 insertions(+), 0 deletions(-) create mode 100644 src/gallium/state_trackers/osmesa/Makefile.am diff --git a/src/gallium/state_trackers/osmesa/Makefile.am b/src/gallium/state_trackers/osmesa/Makefile.am new file mode 100644 index 000..3182012 --- /dev/null +++ b/src/gallium/state_trackers/osmesa/Makefile.am @@ -0,0 +1,41 @@ +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the Software), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice (including the next +# paragraph) shall be included in all copies or substantial portions of the +# Software. +# +# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, +# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, +# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. 
+ +include $(top_srcdir)/src/gallium/Automake.inc + +AM_CFLAGS = $(GALLIUM_CFLAGS) + +AM_CPPFLAGS = \ + -I$(top_srcdir)/include \ + -I$(top_srcdir)/src/mapi \ + -I$(top_srcdir)/src/mesa \ + -I$(top_srcdir)/src/gallium/include \ + -I$(top_srcdir)/src/gallium/drivers \ + -I$(top_srcdir)/src/gallium/winsys \ + -I$(top_srcdir)/src/gallium/state_trackers/glx/xlib \ + -I$(top_srcdir)/src/gallium/auxiliary \ + -DGALLIUM_SOFTPIPE \ + -DGALLIUM_TRACE + +noinst_LTLIBRARIES = libosmesa.la + +libosmesa_la_SOURCES = \ + osmesa.c -- 1.7.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/5] i965: Add definitions for gen7+ data cache messages.
We were sparsely using some of these message types, but I'll just fill them all in now. It will be used for fixing shader_time on HSW. NOTE: This is a candidate for the 9.1 branch. --- src/mesa/drivers/dri/i965/brw_defines.h | 36 +++ 1 file changed, 36 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 6414e69..042abcd 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -967,7 +967,43 @@ enum brw_message_target { /* GEN7 */ #define GEN7_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE 10 +#define GEN7_DATAPORT_DC_OWORD_BLOCK_READ 0 +#define GEN7_DATAPORT_DC_UNALIGNED_OWORD_BLOCK_READ 1 +#define GEN7_DATAPORT_DC_OWORD_DUAL_BLOCK_READ 2 #define GEN7_DATAPORT_DC_DWORD_SCATTERED_READ 3 +#define GEN7_DATAPORT_DC_BYTE_SCATTERED_READ4 +#define GEN7_DATAPORT_DC_UNTYPED_SURFACE_READ 5 +#define GEN7_DATAPORT_DC_UNTYPED_ATOMIC_OP 6 +#define GEN7_DATAPORT_DC_MEMORY_FENCE 7 +#define GEN7_DATAPORT_DC_OWORD_BLOCK_WRITE 8 +#define GEN7_DATAPORT_DC_OWORD_DUAL_BLOCK_WRITE 10 +#define GEN7_DATAPORT_DC_DWORD_SCATTERED_WRITE 11 +#define GEN7_DATAPORT_DC_BYTE_SCATTERED_WRITE 12 +#define GEN7_DATAPORT_DC_UNTYPED_SURFACE_WRITE 13 + +/* HSW */ +#define HSW_DATAPORT_DC_PORT0_OWORD_BLOCK_READ 0 +#define HSW_DATAPORT_DC_PORT0_UNALIGNED_OWORD_BLOCK_READ1 +#define HSW_DATAPORT_DC_PORT0_OWORD_DUAL_BLOCK_READ 2 +#define HSW_DATAPORT_DC_PORT0_DWORD_SCATTERED_READ 3 +#define HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_READ 4 +#define HSW_DATAPORT_DC_PORT0_MEMORY_FENCE 7 +#define HSW_DATAPORT_DC_PORT0_OWORD_BLOCK_WRITE 8 +#define HSW_DATAPORT_DC_PORT0_OWORD_DUAL_BLOCK_WRITE10 +#define HSW_DATAPORT_DC_PORT0_DWORD_SCATTERED_WRITE 11 +#define HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_WRITE 12 + +#define HSW_DATAPORT_DC_PORT1_UNTYPED_SURFACE_READ 1 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP 2 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP_SIMD4X2 3 +#define HSW_DATAPORT_DC_PORT1_TYPED_SURFACE_READ5 
+#define HSW_DATAPORT_DC_PORT1_TYPED_ATOMIC_OP 6 +#define HSW_DATAPORT_DC_PORT1_TYPED_ATOMIC_OP_SIMD4X2 7 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_SURFACE_WRITE 9 +#define HSW_DATAPORT_DC_PORT1_MEDIA_BLOCK_WRITE 10 +#define HSW_DATAPORT_DC_PORT1_ATOMIC_COUNTER_OP 11 +#define HSW_DATAPORT_DC_PORT1_ATOMIC_COUNTER_OP_SIMD4X2 12 +#define HSW_DATAPORT_DC_PORT1_TYPED_SURFACE_WRITE 13 /* dataport atomic operations. */ #define BRW_AOP_AND 1 -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: Speedup the xrgb -> argb special case in fast_read_rgba_pixels_memcpy
On Mon, Mar 11, 2013 at 6:49 PM, Ian Romanick i...@freedesktop.org wrote:
> Once upon a time Matt Turner was talking about using pixman to accelerate operations like this in Mesa. It has a lot of highly optimized paths for just this sort of thing. Since it's used by other projects, it gets a lot more testing, etc. It may be worth looking at using that to solve this problem.

I think that using pixman or any other CPU-based solution is a waste of time (for dedicated GPUs at least). The OpenGL packing and unpacking can be implemented entirely on the GPU using streamout and TBOs, and we generally only want memcpy on the CPU side. That would also allow us to finally accelerate pixel buffer objects.

For now, the easiest and fastest solution is to do a blit, which should cover swizzling and format conversions. We just need to support a lot of texture formats or do the swizzling in the fragment shader. The destination of the blit can be a temporary texture allocated in GTT.

The author of the patch (at least I think it's him) has actually started working on the blit-based solution for ReadPixels in st/mesa, and the time spent in ReadPixels went from 2300 ms to 9 ms (so he can still spend an additional 7.6 ms on rendering and be at 60 fps).

Marek
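For context on what the CPU-side special case in the subject line amounts to, here is a rough sketch, not the code from the patch under review. It assumes pixels packed as 0xAARRGGBB in a uint32_t, where XRGB leaves the top byte undefined and ARGB wants it fully opaque. The function name copy_xrgb_to_argb is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy XRGB8888 pixels to ARGB8888 by forcing the
 * alpha byte to 0xff. Assumes 0xAARRGGBB packing within each 32-bit pixel.
 */
static void
copy_xrgb_to_argb(uint32_t *dst, const uint32_t *src, size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = src[i] | 0xff000000u;
}
```

The point made in the reply is that even a tight loop like this still runs on the CPU for every pixel. A blit would move both the copy and the alpha fix-up to the GPU.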
Re: [Mesa-dev] [RFC] Solving the TGSI indirect addressing optimization problem
On Mon, Mar 11, 2013 at 1:44 PM, Christian König deathsim...@vodafone.de wrote:
> Hi everybody,
>
> this problem has been open for quite some time now, with a bunch of different opinions and sometimes even patches floating on the list.
>
> The solutions proposed or implemented so far are all more or less incomplete, so this approach was designed with both completeness and compatibility with existing code in mind.
>
> Overall it's just an implementation of what Tom Stellard named solution #4 in this email thread: http://lists.freedesktop.org/archives/mesa-dev/2013-January/033264.html

Hi Christian,

this is definitely not solution #4. According to the TGSI dump Christoph posted, it looks more like #3.

Solution #4 completely changes the temporary file such that it becomes two-dimensional, with the first index being a literal and the second index being either a literal or ADDR[literal], and it would always be like that regardless of whether drivers support it or not. One-dimensional indexing of TEMP is not allowed. For backward compatibility, the drivers that do not support it would only get a single array declaration TEMP[0][0..n], and TEMP[0][...] would be everywhere in the code.

I don't know much about TGSI internals, so I can't review this. I'd just like to say that TGSI dumps should make sense (2D indexing should only be allowed with 2D declarations) and tgsi_text_translate should be able to do the reverse: convert the dumps back to TGSI tokens.

Marek
Re: [Mesa-dev] [PATCH 2/5] i965: Add definitions for gen7+ data cache messages.
On 03/11/2013 04:11 PM, Eric Anholt wrote: We were sparsely using some of these message types, but I'll just fill them all in now. It will be used for fixing shader_time on HSW. NOTE: This is a candidate for the 9.1 branch. --- src/mesa/drivers/dri/i965/brw_defines.h | 36 +++ 1 file changed, 36 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 6414e69..042abcd 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -967,7 +967,43 @@ enum brw_message_target { /* GEN7 */ #define GEN7_DATAPORT_WRITE_MESSAGE_OWORD_DUAL_BLOCK_WRITE 10 +#define GEN7_DATAPORT_DC_OWORD_BLOCK_READ 0 +#define GEN7_DATAPORT_DC_UNALIGNED_OWORD_BLOCK_READ 1 +#define GEN7_DATAPORT_DC_OWORD_DUAL_BLOCK_READ 2 #define GEN7_DATAPORT_DC_DWORD_SCATTERED_READ 3 +#define GEN7_DATAPORT_DC_BYTE_SCATTERED_READ4 +#define GEN7_DATAPORT_DC_UNTYPED_SURFACE_READ 5 +#define GEN7_DATAPORT_DC_UNTYPED_ATOMIC_OP 6 +#define GEN7_DATAPORT_DC_MEMORY_FENCE 7 +#define GEN7_DATAPORT_DC_OWORD_BLOCK_WRITE 8 +#define GEN7_DATAPORT_DC_OWORD_DUAL_BLOCK_WRITE 10 +#define GEN7_DATAPORT_DC_DWORD_SCATTERED_WRITE 11 +#define GEN7_DATAPORT_DC_BYTE_SCATTERED_WRITE 12 +#define GEN7_DATAPORT_DC_UNTYPED_SURFACE_WRITE 13 + +/* HSW */ +#define HSW_DATAPORT_DC_PORT0_OWORD_BLOCK_READ 0 +#define HSW_DATAPORT_DC_PORT0_UNALIGNED_OWORD_BLOCK_READ1 +#define HSW_DATAPORT_DC_PORT0_OWORD_DUAL_BLOCK_READ 2 +#define HSW_DATAPORT_DC_PORT0_DWORD_SCATTERED_READ 3 +#define HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_READ 4 +#define HSW_DATAPORT_DC_PORT0_MEMORY_FENCE 7 +#define HSW_DATAPORT_DC_PORT0_OWORD_BLOCK_WRITE 8 +#define HSW_DATAPORT_DC_PORT0_OWORD_DUAL_BLOCK_WRITE10 +#define HSW_DATAPORT_DC_PORT0_DWORD_SCATTERED_WRITE 11 +#define HSW_DATAPORT_DC_PORT0_BYTE_SCATTERED_WRITE 12 + +#define HSW_DATAPORT_DC_PORT1_UNTYPED_SURFACE_READ 1 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP 2 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_ATOMIC_OP_SIMD4X2 3 Is 
there some reason you omitted MEDIA_BLOCK_READ (4), but chose to include MEDIA_BLOCK_WRITE (10)? Otherwise, this patch is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org +#define HSW_DATAPORT_DC_PORT1_TYPED_SURFACE_READ5 +#define HSW_DATAPORT_DC_PORT1_TYPED_ATOMIC_OP 6 +#define HSW_DATAPORT_DC_PORT1_TYPED_ATOMIC_OP_SIMD4X2 7 +#define HSW_DATAPORT_DC_PORT1_UNTYPED_SURFACE_WRITE 9 +#define HSW_DATAPORT_DC_PORT1_MEDIA_BLOCK_WRITE 10 +#define HSW_DATAPORT_DC_PORT1_ATOMIC_COUNTER_OP 11 +#define HSW_DATAPORT_DC_PORT1_ATOMIC_COUNTER_OP_SIMD4X2 12 +#define HSW_DATAPORT_DC_PORT1_TYPED_SURFACE_WRITE 13 /* dataport atomic operations. */ #define BRW_AOP_AND 1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] i965: Split shader_time entries into separate cachelines.
On 03/11/2013 04:11 PM, Eric Anholt wrote:
> This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS).
>
> Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp    |    2 +-
>  src/mesa/drivers/dri/i965/brw_program.c |    2 +-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp  |    2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 8ce3954..dd3baa9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -621,7 +621,7 @@ fs_visitor::emit_shader_time_write(enum shader_time_shader_type type,
>     fs_reg offset_mrf = fs_reg(MRF, base_mrf);
>     offset_mrf.type = BRW_REGISTER_TYPE_UD;

Comments, please! An explanation that we're adding a ton of extra padding to be cacheline-aligned would be great.

Also, please note the units. Generally, I've found that I have to read through 3-4 layers of code to figure out what units some offset is in, and if people just left a /* 64 bytes */ it would've saved a ton of time.

> -   emit(MOV(offset_mrf, fs_reg(shader_time_index * 4)));
> +   emit(MOV(offset_mrf, fs_reg(shader_time_index * 64)));
>     fs_reg time_mrf = fs_reg(MRF, base_mrf + 1);
>     time_mrf.type = BRW_REGISTER_TYPE_UD;
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
> index 75eb6bc..c1aeefc 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -409,7 +409,7 @@ brw_collect_shader_time(struct brw_context *brw)
>     uint32_t *times = brw->shader_time.bo->virtual;
>     for (int i = 0; i < brw->shader_time.num_entries; i++) {
> -      brw->shader_time.cumulative[i] += times[i];
> +      brw->shader_time.cumulative[i] += times[i * 16];

/* 16 uint32_t values */ ?

Otherwise, this and patches 3-5 are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

>     }
>     /* Zero the BO out to clear it out for our next collection.
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index f319f32..b87d62b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1225,7 +1225,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,
>     dst_reg offset_mrf = dst_reg(MRF, base_mrf);
>     offset_mrf.type = BRW_REGISTER_TYPE_UD;
> -   emit(MOV(offset_mrf, src_reg(shader_time_index * 4)));
> +   emit(MOV(offset_mrf, src_reg(shader_time_index * 64)));
>     dst_reg time_mrf = dst_reg(MRF, base_mrf + 1);
>     time_mrf.type = BRW_REGISTER_TYPE_UD;
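To make the units question raised in the review concrete, here is a small standalone sketch of the stride arithmetic, written with the kind of named constants and comments being asked for. It is not the driver code: the macro and function names are hypothetical, and the 64-byte cacheline size is simply the value used in the patch.

```c
#include <stdint.h>

/* Hypothetical names; the sizes come from the patch under review. */
#define SHADER_TIME_STRIDE_BYTES  64  /* one cacheline per entry */
#define SHADER_TIME_STRIDE_DWORDS (SHADER_TIME_STRIDE_BYTES / sizeof(uint32_t))

/* Byte offset written into the message register on the GPU side. */
static uint32_t
shader_time_byte_offset(uint32_t index)
{
   return index * SHADER_TIME_STRIDE_BYTES;
}

/* CPU-side collection: the buffer is read as uint32_t values, so each
 * entry sits 16 dwords (64 bytes) after the previous one.
 */
static void
collect_shader_times(uint64_t *cumulative, const uint32_t *times,
                     int num_entries)
{
   for (int i = 0; i < num_entries; i++)
      cumulative[i] += times[i * SHADER_TIME_STRIDE_DWORDS];
}
```

Deriving the dword stride from the byte stride keeps the shader-side `* 64` and the CPU-side `* 16` from drifting apart if the padding ever changes.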
Re: [Mesa-dev] Mesa (master): Don't bother making compatibilty symlinks
Jon,

It looks like this commit (and the other ones in the series) aren't present in the mesa git tree. It also looks like the last commit was pushed twice, and is present with the hash from the second time.

Stéphane

On Tue, Mar 5, 2013 at 5:26 AM, Jon TURNEY jtur...@kemper.freedesktop.org wrote:

Module: Mesa
Branch: master
Commit: 5ee414013fee48d6a7575512f7e8cb7be153c416
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=5ee414013fee48d6a7575512f7e8cb7be153c416

Author: Jon TURNEY jon.tur...@dronecode.org.uk
Date: Fri Jan 11 19:14:32 2013 +

Don't bother making compatibilty symlinks

Don't bother making compatibilty symlinks. This doesn't work for us anyhow as we make .dll, not .so
---
 src/egl/main/Makefile.am                   |    8 +-------
 src/gallium/targets/libgl-xlib/Makefile.am |    6 ------
 src/glx/Makefile.am                        |    7 -------
 src/mapi/shared-glapi/Makefile.am          |    6 ------
 src/mesa/drivers/dri/swrast/Makefile.am    |    6 ------
 src/mesa/drivers/osmesa/Makefile.am        |    9 ---------
 src/mesa/drivers/x11/Makefile.am           |    9 ---------
 src/mesa/libdricore/Makefile.am            |    7 -------
 8 files changed, 1 insertions(+), 57 deletions(-)

diff --git a/src/egl/main/Makefile.am b/src/egl/main/Makefile.am
index ca5257a..bf8221d 100644
--- a/src/egl/main/Makefile.am
+++ b/src/egl/main/Makefile.am
@@ -1,3 +1,4 @@
+
 # Copyright © 2012 Intel Corporation
 #
 # Permission is hereby granted, free of charge, to any person obtaining a
@@ -116,13 +117,6 @@ libEGL_la_LIBADD += ../drivers/dri2/libegl_dri2.la
 libEGL_la_LIBADD += $(LIBUDEV_LIBS) $(DLOPEN_LIBS) $(LIBDRM_LIBS)
 endif
-# Provide compatibility with scripts for the old Mesa build system for
-# a while by putting a link to the driver into /lib of the build tree.
-all-local: libEGL.la
-	$(MKDIR_P) $(top_builddir)/$(LIB_DIR);
-	ln -f .libs/libEGL.so.1.0.0 $(top_builddir)/$(LIB_DIR)/libEGL.so.1
-	ln -sf libEGL.so.1 $(top_builddir)/$(LIB_DIR)/libEGL.so
-
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = egl.pc
diff --git a/src/gallium/targets/libgl-xlib/Makefile.am b/src/gallium/targets/libgl-xlib/Makefile.am
index cca0da4..4817766 100644
--- a/src/gallium/targets/libgl-xlib/Makefile.am
+++ b/src/gallium/targets/libgl-xlib/Makefile.am
@@ -70,9 +70,3 @@ libGL_la_LINK = $(CXXLINK) $(libGL_la_LDFLAGS)
 # Mention a dummy pure C file to trigger generation of the $(LINK) variable
 nodist_EXTRA_libGL_la_SOURCES = dummy-c.c
 endif
-
-# Provide compatibility with scripts for the old Mesa build system for
-# a while by putting a link to the driver into /lib of the build tree.
-all-local: libGL.la
-	$(MKDIR_P) $(top_builddir)/$(LIB_DIR)/gallium
-	ln -f .libs/libGL.so* $(top_builddir)/$(LIB_DIR)/gallium/
diff --git a/src/glx/Makefile.am b/src/glx/Makefile.am
index 4aa900a..5fc9408 100644
--- a/src/glx/Makefile.am
+++ b/src/glx/Makefile.am
@@ -106,10 +106,3 @@ GL_LDFLAGS = \
 lib@GL_LIB@_la_SOURCES =
 lib@GL_LIB@_la_LIBADD = $(GL_LIBS)
 lib@GL_LIB@_la_LDFLAGS = $(GL_LDFLAGS)
-
-# Provide compatibility with scripts for the old Mesa build system for
-# a while by putting a link to the driver into /lib of the build tree.
-all-local: lib@GL_LIB@.la
-	$(MKDIR_P) $(top_builddir)/$(LIB_DIR);
-	ln -f .libs/lib@GL_LIB@.so.1.2.0 $(top_builddir)/$(LIB_DIR)/lib@GL_LIB@.so.1
-	ln -sf lib@GL_LIB@.so.1 $(top_builddir)/$(LIB_DIR)/lib@GL_LIB@.so
diff --git a/src/mapi/shared-glapi/Makefile.am b/src/mapi/shared-glapi/Makefile.am
index d215c43..79e3334 100644
--- a/src/mapi/shared-glapi/Makefile.am
+++ b/src/mapi/shared-glapi/Makefile.am
@@ -24,9 +24,3 @@ AM_CPPFLAGS = \
 	-I$(top_builddir)/src/mapi \
 	-DMAPI_MODE_GLAPI \
 	-DMAPI_ABI_HEADER=\"shared-glapi/glapi_mapi_tmp.h\"
-
-all-local: libglapi.la
-	$(MKDIR_P) $(top_builddir)/$(LIB_DIR)
-	ln -f .libs/libglapi.so.0.0.0 $(top_builddir)/$(LIB_DIR)/libglapi.so.0.0.0
-	ln -sf libglapi.so.0.0.0 $(top_builddir)/$(LIB_DIR)/libglapi.so.0
-	ln -sf libglapi.so.0 $(top_builddir)/$(LIB_DIR)/libglapi.so
diff --git a/src/mesa/drivers/dri/swrast/Makefile.am b/src/mesa/drivers/dri/swrast/Makefile.am
index 3e53907..09a3dfd 100644
--- a/src/mesa/drivers/dri/swrast/Makefile.am
+++ b/src/mesa/drivers/dri/swrast/Makefile.am
@@ -46,9 +46,3 @@ swrast_dri_la_SOURCES = \
 swrast_dri_la_LDFLAGS = -module -avoid-version -shared
 swrast_dri_la_LIBADD = \
 	$(DRI_LIB_DEPS)
-
-# Provide compatibility with scripts for the old Mesa build system for
-# a while by putting a link to the driver into /lib of the build tree.
-all-local: swrast_dri.la
-	$(MKDIR_P) $(top_builddir)/$(LIB_DIR);
-	ln -f .libs/swrast_dri.so $(top_builddir)/$(LIB_DIR)/swrast_dri.so;
diff
Re: [Mesa-dev] remove mfeatures.h, take two
On Mon, Mar 11, 2013 at 5:28 PM, Brian Paul bri...@vmware.com wrote:

On Sat, Mar 2, 2013 at 7:29 AM, Brian Paul bri...@vmware.com wrote:

I've respun my remove-mfeatures branch as remove-mfeatures-2. It's in my repo on freedesktop.org.

This removes the mfeatures.h file and the last of the #ifdef FEATURE_x code from core Mesa. Note, however, that the scons/makefiles still define FEATURE_GL, FEATURE_ES1, FEATURE_ES2, etc. because that controls whether some source files are built and in some places (like egl-static/egl_st.c) we need to test those symbols to avoid calling nonexistent functions.

If a few people can test, that'd be great.

Ping. Can anyone test the updated remove-mfeatures-2 branch? Thanks.

I tested this with i965. It built and did not have regressions with piglit quick-driver.tests.

How much changed from v1? I gave a reviewed-by for the v1 series, so I wonder if you would be willing to add my r-b to the commit message for the patches that are unchanged from v1? If a lot changed, do you think v2 should be posted to the list for further code review?

Thanks,
-Jordan