Re: [Mesa-dev] [PATCH 02/50] glsl: Add "built-in" functions to do neg(fp64) (v2)
On 13.03.2018 at 05:24, Dave Airlie wrote:
> From: Elie Tournier
>
> v2: use mix.
>
> Signed-off-by: Elie Tournier
> ---
>  src/compiler/glsl/builtin_float64.h     | 51 +
>  src/compiler/glsl/builtin_functions.cpp |  4 +++
>  src/compiler/glsl/builtin_functions.h   |  3 ++
>  src/compiler/glsl/float64.glsl          | 24
>  src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
>  5 files changed, 83 insertions(+)
>
> diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
> index 7b57231..2898fc9 100644
> --- a/src/compiler/glsl/builtin_float64.h
> +++ b/src/compiler/glsl/builtin_float64.h
> @@ -17,3 +17,54 @@ fabs64(void *mem_ctx, builtin_available_predicate avail)
>     sig->replace_parameters(&sig_parameters);
>     return sig;
>  }
> +ir_function_signature *
> +is_nan(void *mem_ctx, builtin_available_predicate avail)
> +{
> +   ir_function_signature *const sig =
> +      new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
> +   ir_factory body(&sig->body, mem_ctx);
> +   sig->is_defined = true;
> +
> +   exec_list sig_parameters;
> +
> +   ir_variable *const r000C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
> +   sig_parameters.push_tail(r000C);
> +   ir_expression *const r000D = lshift(swizzle_y(r000C), body.constant(int(1)));
> +   ir_expression *const r000E = gequal(r000D, body.constant(4292870144u));
> +   ir_expression *const r000F = nequal(swizzle_x(r000C), body.constant(0u));
> +   ir_expression *const r0010 = bit_and(swizzle_y(r000C), body.constant(1048575u));
> +   ir_expression *const r0011 = nequal(r0010, body.constant(0u));
> +   ir_expression *const r0012 = logic_or(r000F, r0011);
> +   ir_expression *const r0013 = logic_and(r000E, r0012);
> +   body.emit(ret(r0013));
> +
> +   sig->replace_parameters(&sig_parameters);
> +   return sig;
> +}
> +ir_function_signature *
> +fneg64(void *mem_ctx, builtin_available_predicate avail)
> +{
> +   ir_function_signature *const sig =
> +      new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
> +   ir_factory body(&sig->body, mem_ctx);
> +   sig->is_defined = true;
> +
> +   exec_list sig_parameters;
> +
> +   ir_variable *const r0014 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
> +   sig_parameters.push_tail(r0014);
> +   ir_expression *const r0015 = lshift(swizzle_y(r0014), body.constant(int(1)));
> +   ir_expression *const r0016 = gequal(r0015, body.constant(4292870144u));
> +   ir_expression *const r0017 = nequal(swizzle_x(r0014), body.constant(0u));
> +   ir_expression *const r0018 = bit_and(swizzle_y(r0014), body.constant(1048575u));
> +   ir_expression *const r0019 = nequal(r0018, body.constant(0u));
> +   ir_expression *const r001A = logic_or(r0017, r0019);
> +   ir_expression *const r001B = logic_and(r0016, r001A);
> +   ir_expression *const r001C = bit_xor(swizzle_y(r0014), body.constant(2147483648u));
> +   body.emit(assign(r0014, expr(ir_triop_csel, r001B, swizzle_y(r0014), r001C), 0x02));
> +
> +   body.emit(ret(r0014));
> +
> +   sig->replace_parameters(&sig_parameters);
> +   return sig;
> +}
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index 133a896..9d88a31 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3346,6 +3346,10 @@ builtin_builder::create_builtins()
>                  generate_ir::fabs64(mem_ctx, integer_functions_supported),
>                  NULL);
>
> +   add_function("__builtin_fneg64",
> +                generate_ir::fneg64(mem_ctx, integer_functions_supported),
> +                NULL);
> +
>  #undef F
>  #undef FI
>  #undef FIUD_VEC
> diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h
> index deaf640..adec424 100644
> --- a/src/compiler/glsl/builtin_functions.h
> +++ b/src/compiler/glsl/builtin_functions.h
> @@ -70,6 +70,9 @@ udivmod64(void *mem_ctx, builtin_available_predicate avail);
>  ir_function_signature *
>  fabs64(void *mem_ctx, builtin_available_predicate avail);
>
> +ir_function_signature *
> +fneg64(void *mem_ctx, builtin_available_predicate avail);
> +
>  }
>
>  #endif /* BULITIN_FUNCTIONS_H */
> diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
> index d798d7e..fedf8b7 100644
> --- a/src/compiler/glsl/float64.glsl
> +++ b/src/compiler/glsl/float64.glsl
> @@ -6,6 +6,7 @@
>
>  #version 130
>  #extension GL_ARB_shader_bit_encoding : enable
> +#extension GL_EXT_shader_integer_mix : enable
>
>  /* Software IEEE floating-point rounding mode.
>   * GLSL spec section "4.7.1 Range and Precision":
> @@ -27,3 +28,26 @@ fabs64(uvec2 a)
>     a.y &= 0x7FFFu;
>     return a;
>  }
> +
> +/* Returns 1 if the double-precision floating-point value `a' is a NaN;
> + * otherwise returns 0.
> + */
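The generated IR in the quoted patch is dense, so here is a rough C++ rendering of what `is_nan` and `fneg64` appear to compute on the packed `uvec2` (low word in `x`, high word in `y`). The `Fp64Bits` struct and the `pack`/`unpack` helpers are made up for this sketch and are not part of Mesa; only the constants and the order of operations are taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the GLSL uvec2 that carries a double:
// x = low 32 bits, y = high 32 bits (sign, exponent, top of the mantissa).
struct Fp64Bits {
    uint32_t x;
    uint32_t y;
};

// Sketch-only helpers: move between a host double and its bit pattern.
inline Fp64Bits pack(double d) {
    uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return {static_cast<uint32_t>(u), static_cast<uint32_t>(u >> 32)};
}

inline double unpack(Fp64Bits b) {
    const uint64_t u = (static_cast<uint64_t>(b.y) << 32) | b.x;
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Mirrors the emitted expression: shifting the high word left by one drops
// the sign bit, so ">= 0xFFE00000" (4292870144) tests for an all-ones
// exponent; a NaN additionally needs a non-zero mantissa, i.e. a non-zero
// low word or non-zero low 20 bits (0xFFFFF = 1048575) of the high word.
inline bool is_nan(Fp64Bits a) {
    return ((a.y << 1) >= 0xFFE00000u) &&
           ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
}

// Mirrors fneg64: leave NaNs untouched, otherwise flip the sign bit
// (0x80000000 = 2147483648) of the high word.
inline Fp64Bits fneg64(Fp64Bits a) {
    a.y = is_nan(a) ? a.y : (a.y ^ 0x80000000u);
    return a;
}
```

For example, `unpack(fneg64(pack(1.5)))` gives `-1.5`, while a quiet NaN pattern such as `{0, 0x7FF80000}` comes back unchanged.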
[Mesa-dev] [PATCH 49/50] gallium: add pipe double support enum + docs
From: Dave Airlie

---
 src/gallium/docs/source/screen.rst   | 4 +++-
 src/gallium/include/pipe/p_defines.h | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index e375d67..42e4f32 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -361,7 +361,9 @@ The integer capabilities:
 * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the
   ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property.
 * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations
-  are supported.
+  are supported. PIPE_DOUBLES_HW indicates HW support for doubles,
+  PIPE_DOUBLES_EMULATE indicates the driver wants the state tracker to
+  lower doubles.
 * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported.
 * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
   operations are supported.
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index ed8eeb8..b104007 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -1098,6 +1098,13 @@ enum pipe_debug_type
    PIPE_DEBUG_TYPE_CONFORMANCE,
 };

+enum pipe_double_support
+{
+   PIPE_DOUBLES_NONE = 0,
+   PIPE_DOUBLES_HW = 1,
+   PIPE_DOUBLES_EMULATE = 2
+};
+
 #define PIPE_UUID_SIZE 16

 #ifdef __cplusplus
--
2.9.5

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 45/50] glsl: Add a lowering pass for 64-bit float frac()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/lower_instructions.cpp | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 3064eef..94b262d 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -181,6 +181,7 @@ private:
    void dmax_to_less(ir_expression *ir);
    void dfloor_to_dtrunc(ir_expression *ir);
    void dceil_to_dtrunc(ir_expression *ir);
+   void dfrac_to_dtrunc(ir_expression *ir);

    ir_expression *_carry(operand a, operand b);
 };
@@ -1761,6 +1762,24 @@ lower_instructions_visitor::dceil_to_dtrunc(ir_expression *ir)
    this->progress = true;
 }

+void
+lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
+{
+   ir_expression *const floor_expr =
+      new(ir) ir_expression(ir_unop_floor,
+                            ir->operands[0]->type, ir->operands[0]);
+   dfloor_to_dtrunc(floor_expr);
+   ir_expression *const neg_expr =
+      new(ir) ir_expression(ir_unop_neg,
+                            ir->operands[0]->type, floor_expr);
+
+   ir->operation = ir_binop_add;
+   ir->init_num_operands();
+   ir->operands[1] = neg_expr;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1926,6 +1945,12 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
          dmax_to_less(ir);
       break;

+   case ir_unop_fract:
+      if (lowering(DOPS_TO_DTRUNC) &&
+          ir->type->is_double())
+         dfrac_to_dtrunc(ir);
+      break;
+
    default:
       return visit_continue;
    }
--
2.9.5
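As a scalar illustration of the rewrite in `dfrac_to_dtrunc`, the pass turns `fract(x)` into an add of `x` and the negated floor. The sketch below uses the host `std::floor` in place of the lowered floor expression; `fract_as_add_neg` is a hypothetical name, not a Mesa function.

```cpp
#include <cassert>
#include <cmath>

// fract(x) rewritten in the shape the pass builds: x + (-floor(x)).
// std::floor stands in for the floor expression that the pass lowers further.
inline double fract_as_add_neg(double x) {
    const double floor_expr = std::floor(x);
    const double neg_expr = -floor_expr;
    return x + neg_expr;
}
```

For `x = -0.25` the floor is `-1.0`, so the result is `0.75`, matching the GLSL definition `x - floor(x)`.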
[Mesa-dev] [PATCH 44/50] glsl: Add a lowering pass for 64-bit float ceil()
From: Elie Tournier

[airlied: handle vector case]
Signed-off-by: Elie Tournier
---
 src/compiler/glsl/lower_instructions.cpp | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 03246e6..3064eef 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -180,6 +180,7 @@ private:
    void dmin_to_less(ir_expression *ir);
    void dmax_to_less(ir_expression *ir);
    void dfloor_to_dtrunc(ir_expression *ir);
+   void dceil_to_dtrunc(ir_expression *ir);

    ir_expression *_carry(operand a, operand b);
 };
@@ -1739,6 +1740,27 @@ lower_instructions_visitor::dfloor_to_dtrunc(ir_expression *ir)
    this->progress = true;
 }

+void
+lower_instructions_visitor::dceil_to_dtrunc(ir_expression *ir)
+{
+   /* if x < 0,                    ceil(x) = trunc(x)
+    * else if (x - trunc(x) == 0), ceil(x) = x
+    * else,                        ceil(x) = trunc(x) + 1
+    */
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *src = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *tr = trunc(src);
+
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   ir->operands[0] = logic_or(less(src, new(ir) ir_constant(0.0, vec_elem)),
+                              equal(src, tr));
+   ir->operands[1] = tr;
+   ir->operands[2] = add(tr, new(ir) ir_constant(1.0, vec_elem));
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1822,8 +1844,13 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;

    case ir_unop_ceil:
-      if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
-         dceil_to_dfrac(ir);
+      if (ir->type->is_double()) {
+         if (lowering(DOPS_TO_DFRAC)) {
+            dceil_to_dfrac(ir);
+         } else if (lowering(DOPS_TO_DTRUNC)) {
+            dceil_to_dtrunc(ir);
+         }
+      }
       break;

    case ir_unop_floor:
--
2.9.5
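The three-case comment in `dceil_to_dtrunc` collapses into a single select, which can be checked on scalars. This is only a sketch of the same condition using the host `std::trunc`; `ceil_via_trunc` is a name invented here.

```cpp
#include <cassert>
#include <cmath>

// csel(x < 0 || x == trunc(x), trunc(x), trunc(x) + 1), as in the patch.
inline double ceil_via_trunc(double x) {
    const double tr = std::trunc(x);
    const bool take_trunc = (x < 0.0) || (x == tr);
    return take_trunc ? tr : tr + 1.0;
}
```

Negative inputs truncate toward zero, which is already the ceiling; positive non-integers need the `+ 1`.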
[Mesa-dev] [PATCH 24/50] glsl: Add a lowering pass for 64-bit float less()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 17db074..b5f8c45 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -63,8 +63,10 @@
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
 #define EQ64 (1U << 6)
+#define LT64 (1U << 7)
+
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64)

-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index d5e0f32..24cc3cd 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -457,6 +457,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_binop_less:
+      if (lowering(LT64)) {
+         if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_flt64", generate_ir::flt64);
+      }
+      break;
+
    case ir_binop_mod:
       if (lowering(MOD64)) {
          if (ir->type->base_type == GLSL_TYPE_UINT64) {
--
2.9.5
[Mesa-dev] [PATCH 48/50] glsl: add lowering for mod64()
From: Elie Tournier

This lowers to floor using the same code as the float lowering; it also
fixes things to avoid creating more instructions that need lowering.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/ir_optimization.h      | 1 +
 src/compiler/glsl/lower_instructions.cpp | 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 38e35e3..5e6c82a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -58,6 +58,7 @@
 #define DOPS_TO_DTRUNC    0x80
 #define DRSQ_TO_DRCP      0x100
 #define DFMA_TO_DMULADD   0x200
+#define DMOD_TO_FLOOR     0x400

 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index b8f7224..d2a838c 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -359,6 +359,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)

    if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
       dfloor_to_dfrac(floor_expr);
+   else if (lowering(DOPS_TO_DTRUNC) && ir->type->is_double())
+      dfloor_to_dtrunc(floor_expr);

    ir_expression *const mul_expr =
       new(ir) ir_expression(ir_binop_mul,
@@ -369,6 +371,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
    ir->init_num_operands();
    ir->operands[0] = new(ir) ir_dereference_variable(x);
    ir->operands[1] = mul_expr;
+   if (ir->type->is_double())
+      sub_to_add_neg(ir);
    this->progress = true;
 }

@@ -1855,8 +1859,9 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;

    case ir_binop_mod:
-      if (lowering(MOD_TO_FLOOR) && (ir->type->is_float() || ir->type->is_double()))
-         mod_to_floor(ir);
+      if ((lowering(MOD_TO_FLOOR) && ir->type->is_float()) ||
+          (lowering(DMOD_TO_FLOOR) && ir->type->is_double()))
+         mod_to_floor(ir);
       break;

    case ir_binop_pow:
--
2.9.5
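A scalar sketch of what `mod_to_floor` followed by `sub_to_add_neg` amounts to for doubles: `mod(x, y)` becomes `x - y * floor(x / y)`, and the subtraction is then expressed as an add of a negation so that no new subtract needs lowering. `mod_via_floor` is a hypothetical name and the host `std::floor` stands in for the lowered floor.

```cpp
#include <cassert>
#include <cmath>

// mod(x, y) = x - y * floor(x / y), with the subtract written as add(neg).
inline double mod_via_floor(double x, double y) {
    const double floor_expr = std::floor(x / y);
    const double mul_expr = y * floor_expr;
    return x + (-mul_expr);
}
```

Unlike C's `fmod`, the result takes the sign of `y`: `mod_via_floor(-1.0, 3.0)` is `2.0`.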
[Mesa-dev] [PATCH 35/50] glsl: Add a lowering pass for 64-bit float round()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 6ef75f5..44d07bc 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -73,10 +73,11 @@
 #define F2D (1U << 14)
 #define SQRT64 (1U << 15)
 #define TRUNC64 (1U << 16)
+#define ROUND64 (1U << 17)

 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
                               ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-                              D2F | F2D | SQRT64 | TRUNC64)
+                              D2F | F2D | SQRT64 | TRUNC64 | ROUND64)

 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 3c34211..38c264f 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -466,6 +466,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_unop_round_even:
+      if (lowering(ROUND64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fround64", generate_ir::fround64);
+      }
+      break;
+
    case ir_unop_sign:
       if (lowering(SIGN64)) {
          if (ir->type->is_integer_64())
--
2.9.5
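Each patch in this part of the series adds one bit to the lowering mask and one `case` guarded by `lowering(FLAG)`. A minimal sketch of that gating follows, using only flag values that appear in the hunks of this thread; `SIGN64` is omitted because its value is not shown here, and the free function `lowering(enabled, flag)` is a stand-in for the visitor's member function, whose body is not part of these patches.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the ir_optimization.h hunks quoted in this thread.
constexpr uint32_t ABS64   = 1U << 4;
constexpr uint32_t NEG64   = 1U << 5;
constexpr uint32_t EQ64    = 1U << 6;
constexpr uint32_t LT64    = 1U << 7;
constexpr uint32_t ADD64   = 1U << 8;
constexpr uint32_t D2U     = 1U << 9;
constexpr uint32_t U2D     = 1U << 10;
constexpr uint32_t D2I     = 1U << 11;
constexpr uint32_t SQRT64  = 1U << 15;
constexpr uint32_t TRUNC64 = 1U << 16;
constexpr uint32_t ROUND64 = 1U << 17;

// Assumed shape of the check: an operation is lowered only if its bit is set.
constexpr bool lowering(uint32_t enabled, uint32_t flag) {
    return (enabled & flag) != 0;
}

// Partial mask for illustration; the real LOWER_ALL_DOUBLE_OPS has more bits.
constexpr uint32_t SOME_DOUBLE_OPS =
    ABS64 | NEG64 | EQ64 | LT64 | ADD64 | D2U | U2D | D2I |
    SQRT64 | TRUNC64 | ROUND64;
```

A caller that passes only `TRUNC64 | SQRT64` would leave `ir_unop_round_even` untouched, since `lowering(TRUNC64 | SQRT64, ROUND64)` is false.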
[Mesa-dev] [PATCH 34/50] glsl: Add a lowering pass for 64-bit float trunc()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 1b5d50a..6ef75f5 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -72,10 +72,11 @@
 #define D2F (1U << 13)
 #define F2D (1U << 14)
 #define SQRT64 (1U << 15)
+#define TRUNC64 (1U << 16)

 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
                               ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-                              D2F | F2D | SQRT64)
+                              D2F | F2D | SQRT64 | TRUNC64)

 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 4920150..3c34211 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -482,6 +482,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_unop_trunc:
+      if (lowering(TRUNC64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_ftrunc64", generate_ir::ftrunc64);
+      }
+      break;
+
    case ir_unop_u2d:
       if (lowering(U2D)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
--
2.9.5
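The body of `__builtin_ftrunc64` is not part of this patch, so the following is not Mesa's implementation. It is only a sketch of how a truncation can be done with integer operations on the IEEE 754 bit pattern, which is the general idea behind a soft-fp64 builtin; `trunc_bits` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Truncate toward zero by masking off the fractional mantissa bits.
inline double trunc_bits(double d) {
    uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    // Unbiased exponent: 11 exponent bits above the 52 mantissa bits.
    const int exponent = static_cast<int>((u >> 52) & 0x7FF) - 1023;
    if (exponent < 0) {
        u &= 0x8000000000000000ull;              // |d| < 1: keep only the sign
    } else if (exponent < 52) {
        u &= ~((1ull << (52 - exponent)) - 1);   // clear the fractional bits
    }
    // exponent >= 52: already integral, or Inf/NaN; leave the bits unchanged.
    std::memcpy(&d, &u, sizeof d);
    return d;
}
```

On the host this agrees with `std::trunc` for ordinary values, for example `trunc_bits(-3.5)` is `-3.0`.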
[Mesa-dev] [PATCH 39/50] glsl: Add a lowering pass for 64-bit float gequal()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/lower_64bit.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index c4b8e78..0dc6070 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -405,7 +405,8 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,

       body.emit(c);

-      if (ir->operation == ir_unop_d2b)
+      if (ir->operation == ir_unop_d2b ||
+          ir->operation == ir_binop_gequal)
          body.emit(assign(dst[i], logic_not(dst[i])));
    }

@@ -605,6 +606,7 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_binop_gequal:
    case ir_binop_less:
       if (lowering(LT64)) {
          if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
--
2.9.5
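The patch implements `gequal` by calling the less-than builtin and negating the result. On scalars that is `!(a < b)`, sketched below with a hypothetical `gequal_via_less`. One assumption worth flagging: if the less-than builtin returns false for NaN operands, as an IEEE comparison does, the negation reports true for NaN inputs, whereas a direct `a >= b` would report false.

```cpp
#include <cassert>
#include <cmath>

// gequal(a, b) expressed as logic_not(less(a, b)), as in the patch.
inline bool gequal_via_less(double a, double b) {
    return !(a < b);
}
```

For ordered inputs this matches `>=`; the NaN case is the only place where the two differ.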
[Mesa-dev] [PATCH 29/50] glsl: Add a lowering pass for 64-bit float d2i()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index a4cb7b2..3cc7f2e 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -67,9 +67,10 @@
 #define ADD64 (1U << 8)
 #define D2U (1U << 9)
 #define U2D (1U << 10)
+#define D2I (1U << 11)

 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64 | MUL64 | D2U | U2D)
+                              ADD64 | MUL64 | D2U | U2D | D2I)

 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 1e97306..7b2ffe8 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_unop_d2i:
+      if (lowering(D2I)) {
+         if (ir->type->base_type == GLSL_TYPE_INT)
+            *rvalue = handle_op(ir, "__builtin_fp64_to_int", generate_ir::fp64_to_int);
+      }
+      break;
+
    case ir_unop_d2u:
       if (lowering(D2U)) {
          if (ir->type->base_type == GLSL_TYPE_UINT)
--
2.9.5
[Mesa-dev] [PATCH 27/50] glsl: Add a lowering pass for 64-bit float d2u()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 6506e28..e3d573c 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -65,9 +65,10 @@
 #define EQ64 (1U << 6)
 #define LT64 (1U << 7)
 #define ADD64 (1U << 8)
+#define D2U (1U << 9)

 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64 | MUL64)
+                              ADD64 | MUL64 | D2U)

 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index f3a2633..1b90830 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_unop_d2u:
+      if (lowering(D2U)) {
+         if (ir->type->base_type == GLSL_TYPE_UINT)
+            *rvalue = handle_op(ir, "__builtin_fp64_to_uint", generate_ir::fp64_to_uint);
+      }
+      break;
+
    case ir_unop_neg:
       if (lowering(NEG64)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
--
2.9.5
[Mesa-dev] [PATCH 33/50] glsl: Add a lowering pass for 64-bit float sqrt()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index c649c80..1b5d50a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -71,10 +71,11 @@
 #define I2D (1U << 12)
 #define D2F (1U << 13)
 #define F2D (1U << 14)
+#define SQRT64 (1U << 15)

 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
                               ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-                              D2F | F2D)
+                              D2F | F2D | SQRT64)

 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 126c961..4920150 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -475,6 +475,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;

+   case ir_unop_sqrt:
+      if (lowering(SQRT64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fsqrt64", generate_ir::fsqrt64);
+      }
+      break;
+
    case ir_unop_u2d:
       if (lowering(U2D)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
--
2.9.5
Re: [Mesa-dev] soft fp64 support - main body (glsl/gallium)
On 13 March 2018 at 14:24, Dave Airlie wrote:
> This is the main code for the soft fp64 work. It's mostly Elie's
> code with a bunch of changes by me.
>

All the patches are in my tree here, along with some other bits:
https://cgit.freedesktop.org/~airlied/mesa/log/?h=glsl_arb_gpu_shader_fp64_v4

Dave.
[Mesa-dev] [PATCH 50/50] st/glsl: enable fp64 lowering support
From: Dave Airlie

This enables fp64 emulation if the driver requests it with
PIPE_CAP_DOUBLES set to PIPE_DOUBLES_EMULATE.

It moves the mat->vec lowering earlier as we don't want to hit any
matrix operation in double lowering, and if we lower div->rcp we end
up getting the wrong type of matrix mult, so just avoid that problem.

Otherwise it just enables all the fp64 lowering.
---
 src/mesa/state_tracker/st_extensions.c     |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 13 -
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 3b8e226..524b021 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1254,7 +1254,7 @@ void st_init_extensions(struct pipe_screen *screen,
    }
 #endif

-   if (screen->get_param(screen, PIPE_CAP_DOUBLES)) {
+   if (screen->get_param(screen, PIPE_CAP_DOUBLES) != PIPE_DOUBLES_NONE) {
       extensions->ARB_gpu_shader_fp64 = GL_TRUE;
       extensions->ARB_vertex_attrib_64bit = GL_TRUE;
    }
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b608635..abcadd0 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -7028,9 +7028,21 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
                                 options->EmitNoIndirectUniform);
       }

+      do_mat_op_to_vec(ir);
+
       if (!pscreen->get_param(pscreen, PIPE_CAP_INT64_DIVMOD))
          lower_64bit_instructions(ir, DIV64 | MOD64);

+      /* Enable double lowering if the hardware doesn't support doubles.
+       * The lowering requires GLSL >= 130.
+       */
+      if ((pscreen->get_param(pscreen, PIPE_CAP_DOUBLES) == PIPE_DOUBLES_EMULATE) &&
+          ctx->Const.GLSLVersion >= 130) {
+         lower_instructions(ir, DDIV_TO_MUL_RCP | DMIN_DMAX_TO_LESS | DOPS_TO_DTRUNC | DRSQ_TO_DRCP | DFMA_TO_DMULADD |
+                            DMOD_TO_FLOOR | (have_dfrexp ? 0 : DFREXP_DLDEXP_TO_ARITH));
+         lower_64bit_instructions(ir, LOWER_ALL_DOUBLE_OPS);
+      }
+
       if (ctx->Extensions.ARB_shading_language_packing) {
          unsigned lower_inst = LOWER_PACK_SNORM_2x16 |
                                LOWER_UNPACK_SNORM_2x16 |
@@ -7053,7 +7065,6 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
       if (!pscreen->get_param(pscreen, PIPE_CAP_TEXTURE_GATHER_OFFSETS))
          lower_offset_arrays(ir);

-      do_mat_op_to_vec(ir);

       if (stage == MESA_SHADER_FRAGMENT)
          lower_blend_equation_advanced(
--
2.9.5
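The two conditions this patch touches can be read side by side: the extension is exposed whenever the cap is anything other than `PIPE_DOUBLES_NONE`, while the lowering passes run only for `PIPE_DOUBLES_EMULATE` with GLSL 130 or later. The sketch below restates that decision in isolation. The enum mirrors patch 49; `Fp64Plan` and `plan_fp64` are names invented for the illustration and do not exist in Mesa.

```cpp
#include <cassert>

// Mirror of the enum added to p_defines.h in patch 49/50.
enum pipe_double_support {
    PIPE_DOUBLES_NONE = 0,
    PIPE_DOUBLES_HW = 1,
    PIPE_DOUBLES_EMULATE = 2
};

// Hypothetical summary of the two checks in st_extensions.c and
// st_glsl_to_tgsi.cpp.
struct Fp64Plan {
    bool expose_fp64;    // ARB_gpu_shader_fp64 / ARB_vertex_attrib_64bit
    bool lower_doubles;  // run lower_instructions + lower_64bit_instructions
};

inline Fp64Plan plan_fp64(int cap_doubles, unsigned glsl_version) {
    Fp64Plan plan;
    plan.expose_fp64 = (cap_doubles != PIPE_DOUBLES_NONE);
    plan.lower_doubles = (cap_doubles == PIPE_DOUBLES_EMULATE) &&
                         (glsl_version >= 130);
    return plan;
}
```

Read this way, a driver reporting `PIPE_DOUBLES_EMULATE` with a GLSL version below 130 would get the extension exposed without the lowering; whether that combination can occur in practice is not something the patch states.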
[Mesa-dev] [PATCH 47/50] glsl: add a lowering pass for dfma to dmuladd.
From: Dave Airlie

Just lowering dfma to dmuladd for now, I don't think it will matter
for anything we care about.

This also fixes the double dot to fma lowering to take this flag
into account and avoid creating further fma's.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/ir_optimization.h      |  1 +
 src/compiler/glsl/lower_instructions.cpp | 35
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index e6f9ad3..38e35e3 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -57,6 +57,7 @@
 #define DMIN_DMAX_TO_LESS 0x40
 #define DOPS_TO_DTRUNC    0x80
 #define DRSQ_TO_DRCP      0x100
+#define DFMA_TO_DMULADD   0x200

 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index d13a99b..b8f7224 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -184,7 +184,7 @@ private:
    void dceil_to_dtrunc(ir_expression *ir);
    void dfrac_to_dtrunc(ir_expression *ir);
    void drsq_to_drcp(ir_expression *ir);
-
+   void dfma_to_dmuladd(ir_expression *ir);
    ir_expression *_carry(operand a, operand b);
 };

@@ -873,9 +873,12 @@ lower_instructions_visitor::double_dot_to_fma(ir_expression *ir)
          assig = assign(temp, mul(swizzle(ir->operands[0]->clone(ir, NULL), i, 1),
                                   swizzle(ir->operands[1]->clone(ir, NULL), i, 1)));
       } else {
-         assig = assign(temp, fma(swizzle(ir->operands[0]->clone(ir, NULL), i, 1),
-                                  swizzle(ir->operands[1]->clone(ir, NULL), i, 1),
-                                  temp));
+         ir_expression *fma_expr = fma(swizzle(ir->operands[0]->clone(ir, NULL), i, 1),
+                                       swizzle(ir->operands[1]->clone(ir, NULL), i, 1),
+                                       temp);
+         if (lowering(DFMA_TO_DMULADD))
+            dfma_to_dmuladd(fma_expr);
+         assig = assign(temp, fma_expr);
       }
       this->base_ir->insert_before(assig);
    }
@@ -886,6 +889,8 @@ lower_instructions_visitor::double_dot_to_fma(ir_expression *ir)
    ir->operands[1] = swizzle(ir->operands[1], 0, 1);
    ir->operands[2] = new(ir) ir_dereference_variable(temp);

+   if (lowering(DFMA_TO_DMULADD))
+      dfma_to_dmuladd(ir);
    this->progress = true;

 }
@@ -1783,6 +1788,22 @@ lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
 }

 void
+lower_instructions_visitor::dfma_to_dmuladd(ir_expression *ir)
+{
+   ir_variable *temp = new(ir) ir_variable(ir->operands[0]->type, "temp",
+                                           ir_var_temporary);
+   ir_rvalue *arg = ir->operands[2];
+   ir_instruction &i = *base_ir;
+   i.insert_before(temp);
+   i.insert_before(assign(temp, mul(ir->operands[0], ir->operands[1])));
+
+   ir->operation = ir_binop_add;
+   ir->init_num_operands();
+   ir->operands[0] = new(ir) ir_dereference_variable(temp);
+   ir->operands[1] = arg->clone(ir, NULL);
+   this->progress = true;
+}
+void
 lower_instructions_visitor::drsq_to_drcp(ir_expression *ir)
 {
    ir_expression *const sqrt_expr =
@@ -1976,6 +1997,12 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
          dfrac_to_dtrunc(ir);
       break;

+   case ir_triop_fma:
+      if (lowering(DFMA_TO_DMULADD) &&
+          ir->type->is_double())
+         dfma_to_dmuladd(ir);
+      break;
+
    default:
       return visit_continue;
    }
--
2.9.5
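In scalar terms, `dfma_to_dmuladd` replaces a fused multiply-add with a separate multiply into a temporary followed by an add. The sketch below shows that shape next to the host `std::fma`; `muladd` is a hypothetical name. The practical difference is that the unfused form rounds twice, once after the multiply and once after the add, whereas a true fma rounds once, which is presumably the loss the commit message says should not matter here.

```cpp
#include <cassert>
#include <cmath>

// fma(a, b, c) lowered to: temp = a * b; result = temp + c.
inline double muladd(double a, double b, double c) {
    const double temp = a * b;  // first rounding
    return temp + c;            // second rounding
}
```

For inputs whose product is exactly representable the two agree, for example `muladd(2.0, 3.0, 4.0)` and `std::fma(2.0, 3.0, 4.0)` are both `10.0`. Note that a host compiler may itself contract `a * b + c` into an fma, so this sketch is not a reliable way to observe the rounding difference.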
[Mesa-dev] [PATCH 46/50] glsl: Add a lowering pass for 64-bit float rsq()
From: Elie Tournier

---
 src/compiler/glsl/ir_optimization.h      |  1 +
 src/compiler/glsl/lower_instructions.cpp | 25 +
 2 files changed, 26 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index ba0c101..e6f9ad3 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -56,6 +56,7 @@
 #define SQRT_TO_ABS_SQRT  0x20
 #define DMIN_DMAX_TO_LESS 0x40
 #define DOPS_TO_DTRUNC    0x80
+#define DRSQ_TO_DRCP      0x100

 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 94b262d..d13a99b 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -45,6 +45,7 @@
  * - DOPS_TO_DFRAC
  * - DMIN_DMAX_TO_LESS
  * - DOPS_TO_DTRUNC
+ * - DRSQ_TO_DRCP
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -182,6 +183,7 @@ private:
    void dfloor_to_dtrunc(ir_expression *ir);
    void dceil_to_dtrunc(ir_expression *ir);
    void dfrac_to_dtrunc(ir_expression *ir);
+   void drsq_to_drcp(ir_expression *ir);

    ir_expression *_carry(operand a, operand b);
 };
@@ -1780,6 +1782,22 @@ lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
    this->progress = true;
 }

+void
+lower_instructions_visitor::drsq_to_drcp(ir_expression *ir)
+{
+   ir_expression *const sqrt_expr =
+      new(ir) ir_expression(ir_unop_sqrt,
+                            ir->operands[0]->type, ir->operands[0]);
+   if (lowering(SQRT_TO_ABS_SQRT))
+      sqrt_to_abs_sqrt(sqrt_expr);
+
+   ir->operation = ir_unop_rcp;
+   ir->init_num_operands();
+   ir->operands[0] = sqrt_expr;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1928,6 +1946,13 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;

    case ir_unop_rsq:
+      if (lowering(DRSQ_TO_DRCP) &&
+          ir->type->is_double())
+         drsq_to_drcp(ir);
+      else if (lowering(SQRT_TO_ABS_SQRT))
+         sqrt_to_abs_sqrt(ir);
+      break;
+
    case ir_unop_sqrt:
       if (lowering(SQRT_TO_ABS_SQRT))
          sqrt_to_abs_sqrt(ir);
--
2.9.5
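On scalars, `drsq_to_drcp` rewrites `rsq(x)` as `rcp(sqrt(x))`, optionally taking the absolute value first when `SQRT_TO_ABS_SQRT` is also requested. A small sketch of that composition, with a hypothetical `rsq_via_rcp` and a plain `bool` standing in for the lowering flag:

```cpp
#include <cassert>
#include <cmath>

// rsq(x) -> rcp(sqrt(x)); with abs_sqrt set, sqrt(x) becomes sqrt(abs(x)).
inline double rsq_via_rcp(double x, bool abs_sqrt) {
    const double sqrt_expr = abs_sqrt ? std::sqrt(std::fabs(x)) : std::sqrt(x);
    return 1.0 / sqrt_expr;
}
```

With the flag set, `rsq_via_rcp(-4.0, true)` yields `0.5` instead of a NaN.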
[Mesa-dev] [PATCH 42/50] glsl: Add a lowering pass for 64-bit float max()
From: Elie Tournier

[airlied: update to handle max(dvec, double) case]
Signed-off-by: Elie Tournier
---
 src/compiler/glsl/lower_instructions.cpp | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 8c3d623..144bc41 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -177,6 +177,7 @@ private:
    void imul_high_to_mul(ir_expression *ir);
    void sqrt_to_abs_sqrt(ir_expression *ir);
    void dmin_to_less(ir_expression *ir);
+   void dmax_to_less(ir_expression *ir);

    ir_expression *_carry(operand a, operand b);
 };
@@ -1693,6 +1694,26 @@ lower_instructions_visitor::dmin_to_less(ir_expression *ir)
    this->progress = true;
 }

+void
+lower_instructions_visitor::dmax_to_less(ir_expression *ir)
+{
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *x_clone = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *y_clone = ir->operands[1]->clone(ir, NULL);
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   if (ir->operands[1]->type->vector_elements == 1 && vec_elem > 1) {
+      ir->operands[0] = less(ir->operands[0], swizzle(ir->operands[1], SWIZZLE_, vec_elem));
+      ir->operands[1] = swizzle(y_clone, SWIZZLE_, vec_elem);
+   } else {
+      ir->operands[0] = less(ir->operands[0], ir->operands[1]);
+      ir->operands[1] = y_clone;
+   }
+   ir->operands[2] = x_clone;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1842,6 +1863,12 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
          dmin_to_less(ir);
       break;

+   case ir_binop_max:
+      if (lowering(DMIN_DMAX_TO_LESS) &&
+          ir->type->is_double())
+         dmax_to_less(ir);
+      break;
+
    default:
       return visit_continue;
    }
--
2.9.5
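The `max(dvec, double)` case that the bracketed note mentions is the one where the scalar second operand has to be broadcast before the component-wise compare and select. The sketch below models that with `std::array`; `max_via_less` is a made-up name, and the loop plays the role of the per-component `csel(less(x, y), y, x)`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// max(x, y) with a vector x and a scalar y broadcast across the components.
template <std::size_t N>
std::array<double, N> max_via_less(const std::array<double, N>& x, double y) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        const bool cond = x[i] < y;   // less(x, broadcast(y))
        out[i] = cond ? y : x[i];     // csel(cond, y, x)
    }
    return out;
}
```

For `x = {1.0, 5.0, -2.0}` and `y = 2.0` the result is `{2.0, 5.0, 2.0}`.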
[Mesa-dev] [PATCH 43/50] glsl: Add a lowering pass for 64-bit float floor()
From: Elie Tournier

[airlied: handle vector cases]

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 1 +
 src/compiler/glsl/lower_instructions.cpp | 34 ++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index f9b688a..ba0c101 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -55,6 +55,7 @@
 #define DIV_TO_MUL_RCP (FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
 #define SQRT_TO_ABS_SQRT 0x20
 #define DMIN_DMAX_TO_LESS 0x40
+#define DOPS_TO_DTRUNC 0x80
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 144bc41..03246e6 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -44,6 +44,7 @@
  * - SAT_TO_CLAMP
  * - DOPS_TO_DFRAC
  * - DMIN_DMAX_TO_LESS
+ * - DOPS_TO_DTRUNC
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -178,6 +179,7 @@ private:
    void sqrt_to_abs_sqrt(ir_expression *ir);
    void dmin_to_less(ir_expression *ir);
    void dmax_to_less(ir_expression *ir);
+   void dfloor_to_dtrunc(ir_expression *ir);
 
    ir_expression *_carry(operand a, operand b);
 };
@@ -1714,6 +1716,29 @@ lower_instructions_visitor::dmax_to_less(ir_expression *ir)
    this->progress = true;
 }
 
+void
+lower_instructions_visitor::dfloor_to_dtrunc(ir_expression *ir)
+{
+   /*
+    * For x >= 0, floor(x) = trunc(x)
+    * For x < 0,
+    *    - if x is integer, floor(x) = x
+    *    - otherwise, floor(x) = trunc(x) - 1
+    */
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *src = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *tr = trunc(src);
+
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   ir->operands[0] = logic_or(gequal(src, new(ir) ir_constant(0.0, vec_elem)),
+                              equal(src, tr));
+   ir->operands[1] = tr;
+   ir->operands[2] = add(tr, new(ir) ir_constant(-1.0, vec_elem));
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1802,8 +1827,13 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
       break;
 
    case ir_unop_floor:
-      if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
-         dfloor_to_dfrac(ir);
+      if (ir->type->is_double()) {
+         if (lowering(DOPS_TO_DFRAC)) {
+            dfloor_to_dfrac(ir);
+         } else if (lowering(DOPS_TO_DTRUNC)) {
+            dfloor_to_dtrunc(ir);
+         }
+      }
       break;
 
    case ir_unop_round_even:
-- 
2.9.5
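The case analysis in the dfloor_to_dtrunc() comment can be checked on the CPU with a small C++ sketch. This is an illustration only: floor_via_trunc() is a hypothetical name, and std::trunc stands in for the IR trunc() that the pass emits. The select condition and the two selected values follow the patch.

```cpp
#include <cassert>
#include <cmath>

// floor(x) built from trunc(x), mirroring the csel emitted by the pass:
//   cond     = (x >= 0) || (x == trunc(x))
//   if true  -> trunc(x)
//   if false -> trunc(x) + (-1.0)
inline double floor_via_trunc(double x) {
    const double tr = std::trunc(x);
    const bool keep_trunc = (x >= 0.0) || (x == tr);
    return keep_trunc ? tr : tr + -1.0;
}
```

Note that in visit_leave() the DOPS_TO_DFRAC path is tested first, so dfloor_to_dtrunc() only runs when DOPS_TO_DTRUNC is set and DOPS_TO_DFRAC is not.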
[Mesa-dev] [PATCH 32/50] glsl: Add a lowering pass for 64-bit float f2d()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index e7860ef..c649c80 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -70,10 +70,11 @@
 #define D2I (1U << 11)
 #define I2D (1U << 12)
 #define D2F (1U << 13)
+#define F2D (1U << 14)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
                               ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-                              D2F)
+                              D2F | F2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index e76ebdc..126c961 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -445,6 +445,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_f2d:
+      if (lowering(F2D)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fp32_to_fp64", generate_ir::fp32_to_fp64, true);
+      }
+      break;
+
    case ir_unop_i2d:
       if (lowering(I2D)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
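Each patch in this part of the series adds one bit to a flag set and one case that tests it. A minimal sketch of that pattern follows. The bit values are copied from the ir_optimization.h hunks; LoweringMask and its lowering() member are hypothetical stand-ins, written on the assumption that the visitor's lowering() helper is a plain bitwise test against the mask a driver passes in.

```cpp
#include <cassert>
#include <cstdint>

// Bit values as they appear in the ir_optimization.h hunks.
constexpr std::uint32_t D2I = 1u << 11;
constexpr std::uint32_t I2D = 1u << 12;
constexpr std::uint32_t D2F = 1u << 13;
constexpr std::uint32_t F2D = 1u << 14;

// Hypothetical stand-in for the visitor state: a driver passes in the set
// of operations it wants lowered, and each case tests its own bit.
struct LoweringMask {
    std::uint32_t enabled;
    bool lowering(std::uint32_t flag) const { return (enabled & flag) != 0u; }
};
```

With a mask of D2F | F2D, for example, the f2d and d2f cases would fire while i2d and d2i would be left alone.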
[Mesa-dev] [PATCH 40/50] glsl: Add a lowering pass for 64-bit float nequal()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/lower_64bit.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 0dc6070..f085dae 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -406,7 +406,8 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
       body.emit(c);
 
       if (ir->operation == ir_unop_d2b ||
-          ir->operation == ir_binop_gequal)
+          ir->operation == ir_binop_gequal ||
+          ir->operation == ir_binop_nequal)
          body.emit(assign(dst[i], logic_not(dst[i])));
    }
 
@@ -599,6 +600,7 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_binop_nequal:
    case ir_binop_equal:
       if (lowering(EQ64)) {
          if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
[Mesa-dev] [PATCH 41/50] glsl: Add a lowering pass for 64-bit float min()
From: Elie Tournier

[airlied: update to handle min(dvec, double) case]

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 1 +
 src/compiler/glsl/lower_instructions.cpp | 33
 2 files changed, 34 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 3c99ae0..f9b688a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -54,6 +54,7 @@
 #define DDIV_TO_MUL_RCP 0x10
 #define DIV_TO_MUL_RCP (FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
 #define SQRT_TO_ABS_SQRT 0x20
+#define DMIN_DMAX_TO_LESS 0x40
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp b/src/compiler/glsl/lower_instructions.cpp
index 91f71b3..8c3d623 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -43,6 +43,7 @@
  * - BORROW_TO_ARITH
  * - SAT_TO_CLAMP
  * - DOPS_TO_DFRAC
+ * - DMIN_DMAX_TO_LESS
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -115,6 +116,12 @@
  * DOPS_TO_DFRAC:
  * --
  * Converts double trunc, ceil, floor, round to fract
+ *
+ * DMIN_DMAX_TO_LESS:
+ *
+ * Converts double min, max into less.
+ * min(x,y) = less(x,y) ? x : y;
+ * max(x,y) = less(x,y) ? y : x;
  */
 
 #include "c99_math.h"
@@ -169,6 +176,7 @@ private:
    void find_msb_to_float_cast(ir_expression *ir);
    void imul_high_to_mul(ir_expression *ir);
    void sqrt_to_abs_sqrt(ir_expression *ir);
+   void dmin_to_less(ir_expression *ir);
 
    ir_expression *_carry(operand a, operand b);
 };
@@ -1666,6 +1674,25 @@ lower_instructions_visitor::sqrt_to_abs_sqrt(ir_expression *ir)
    this->progress = true;
 }
 
+void
+lower_instructions_visitor::dmin_to_less(ir_expression *ir)
+{
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *x_clone = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *y_clone = ir->operands[1]->clone(ir, NULL);
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   if (ir->operands[1]->type->vector_elements == 1 && vec_elem > 1) {
+      ir->operands[0] = less(ir->operands[0], swizzle(ir->operands[1], SWIZZLE_, vec_elem));
+      ir->operands[2] = swizzle(y_clone, SWIZZLE_, vec_elem);
+   } else {
+      ir->operands[0] = less(ir->operands[0], ir->operands[1]);
+      ir->operands[2] = y_clone;
+   }
+   ir->operands[1] = x_clone;
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1809,6 +1836,12 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
          sqrt_to_abs_sqrt(ir);
       break;
 
+   case ir_binop_min:
+      if (lowering(DMIN_DMAX_TO_LESS) &&
+          ir->type->is_double())
+         dmin_to_less(ir);
+      break;
+
    default:
       return visit_continue;
    }
-- 
2.9.5
[Mesa-dev] [PATCH 38/50] glsl/lower_64bit: lower d2b using comparison
From: Dave Airlie

This just does a compare to 0 and inverts the result to lower d2b.

Not 100% sure this is always correct, but it passes piglit.
---
 src/compiler/glsl/lower_64bit.cpp | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 794cc3e..c4b8e78 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -349,7 +349,7 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
                                        ir_expression *ir,
                                        ir_function_signature *callee)
 {
-   const unsigned num_operands = ir->num_operands;
+   unsigned num_operands = ir->num_operands;
    ir_variable *src[4][4];
    ir_variable *dst[4];
    void *const mem_ctx = ralloc_parent(ir);
@@ -378,6 +378,16 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
       source_components = ir->operands[i]->type->vector_elements;
    }
 
+   if (ir->operation == ir_unop_d2b) {
+      for (unsigned i = 0; i < source_components; i++) {
+         src[1][i] = body.make_temp(glsl_type::uvec2_type, "zero");
+
+         body.emit(assign(src[1][i], body.constant(0u), 1));
+         body.emit(assign(src[1][i], body.constant(0u), 2));
+      }
+      num_operands++;
+   }
+
    for (unsigned i = 0; i < source_components; i++) {
       dst[i] = body.make_temp(result_type, "expanded_64bit_result");
 
@@ -394,6 +404,9 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
                               );
 
       body.emit(c);
+
+      if (ir->operation == ir_unop_d2b)
+         body.emit(assign(dst[i], logic_not(dst[i])));
    }
 
    ir_rvalue *rv;
@@ -475,6 +488,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_d2b:
+      if (lowering(EQ64)) {
+         if (ir->type->base_type == GLSL_TYPE_BOOL)
+            *rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+      }
+      break;
+
    case ir_unop_d2f:
       if (lowering(D2F)) {
          if (ir->type->base_type == GLSL_TYPE_FLOAT)
-- 
2.9.5
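To see what "compare to 0 and invert" does on awkward inputs, here is a rough C++ model that works on the two 32-bit halves of a double, the way the uvec2-based builtins do. Everything in it is an assumption made for illustration: U2, to_bits(), feq64() and d2b() are hypothetical names, and the equality test is a softfloat-style comparison written from memory, not the body of __builtin_feq64 (which is not part of this patch). Only the is_nan() constants are taken from the series (patch 02).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// A double as two 32-bit words, like a uvec2: x = low word, y = high word.
struct U2 { std::uint32_t x, y; };

inline U2 to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return U2{static_cast<std::uint32_t>(u), static_cast<std::uint32_t>(u >> 32)};
}

// NaN test with the constants used by is_nan() in patch 02:
// exponent all ones and a non-zero fraction.
inline bool is_nan(U2 a) {
    return (a.y << 1) >= 0xFFE00000u &&
           (a.x != 0u || (a.y & 0x000FFFFFu) != 0u);
}

// Assumed softfloat-style equality: NaN never compares equal,
// and +0 compares equal to -0.
inline bool feq64(U2 a, U2 b) {
    if (is_nan(a) || is_nan(b)) return false;
    return a.x == b.x &&
           (a.y == b.y || (a.x == 0u && ((a.y | b.y) << 1) == 0u));
}

// d2b as lowered by the patch: compare against an all-zero uvec2, then invert.
inline bool d2b(U2 a) { return !feq64(a, U2{0u, 0u}); }
```

Under these assumptions both zeros convert to false, and a NaN converts to true because the comparison fails before the inversion.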
[Mesa-dev] [PATCH 30/50] glsl: Add a lowering pass for 64-bit float i2d()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 3cc7f2e..f73faec 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -68,9 +68,10 @@
 #define D2U (1U << 9)
 #define U2D (1U << 10)
 #define D2I (1U << 11)
+#define I2D (1U << 12)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64 | MUL64 | D2U | U2D | D2I)
+                              ADD64 | MUL64 | D2U | U2D | D2I | I2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 7b2ffe8..2900409 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -438,6 +438,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_i2d:
+      if (lowering(I2D)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_int_to_fp64", generate_ir::int_to_fp64, true);
+      }
+      break;
+
    case ir_unop_neg:
       if (lowering(NEG64)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
[Mesa-dev] [PATCH 31/50] glsl: Add a lowering pass for 64-bit float d2f()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index f73faec..e7860ef 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -69,9 +69,11 @@
 #define U2D (1U << 10)
 #define D2I (1U << 11)
 #define I2D (1U << 12)
+#define D2F (1U << 13)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64 | MUL64 | D2U | U2D | D2I | I2D)
+                              ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
+                              D2F)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 2900409..e76ebdc 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_d2f:
+      if (lowering(D2F)) {
+         if (ir->type->base_type == GLSL_TYPE_FLOAT)
+            *rvalue = handle_op(ir, "__builtin_fp64_to_fp32", generate_ir::fp64_to_fp32);
+      }
+      break;
+
    case ir_unop_d2i:
       if (lowering(D2I)) {
          if (ir->type->base_type == GLSL_TYPE_INT)
-- 
2.9.5
[Mesa-dev] [PATCH 37/50] glsl/lower_64bit: handle any/all operations
From: Dave Airlie

This just splits them out and combines the results.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/lower_64bit.cpp | 61 ++-
 1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index ee6d6f9..794cc3e 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -60,6 +60,12 @@ ir_dereference_variable *compact_destination(ir_factory &,
 ir_dereference_variable *merge_destination(ir_factory &,
                                            const glsl_type *type,
                                            ir_variable *result[4]);
+ir_dereference_variable *all_equal_destination(ir_factory &,
+                                               const glsl_type *type,
+                                               ir_variable *result[4]);
+ir_dereference_variable *any_nequal_destination(ir_factory &,
+                                                const glsl_type *type,
+                                                ir_variable *result[4]);
 
 ir_rvalue *lower_op_to_function_call(ir_instruction *base_ir,
                                      ir_expression *ir,
@@ -297,6 +303,47 @@ lower_64bit::compact_destination(ir_factory ,
    return new(mem_ctx) ir_dereference_variable(compacted_result);
 }
 
+/*
+ * AND together the results from each comparison.
+ */
+ir_dereference_variable *
+lower_64bit::all_equal_destination(ir_factory &body,
+                                   const glsl_type *type,
+                                   ir_variable *result[4])
+{
+   ir_variable *const merged_result =
+      body.make_temp(glsl_type::bool_type, "all_result");
+
+   body.emit(assign(merged_result, result[0]));
+   for (unsigned i = 1; i < type->vector_elements; i++) {
+      body.emit(assign(merged_result, logic_and(merged_result, result[i])));
+   }
+
+   void *const mem_ctx = ralloc_parent(merged_result);
+   return new(mem_ctx) ir_dereference_variable(merged_result);
+}
+
+/*
+ * AND together the results from each comparison, then NOT the result.
+ */
+ir_dereference_variable *
+lower_64bit::any_nequal_destination(ir_factory &body,
+                                    const glsl_type *type,
+                                    ir_variable *result[4])
+{
+   ir_variable *const merged_result =
+      body.make_temp(glsl_type::bool_type, "any_result");
+
+   body.emit(assign(merged_result, result[0]));
+   for (unsigned i = 1; i < type->vector_elements; i++) {
+      body.emit(assign(merged_result, logic_and(merged_result, result[i])));
+   }
+
+   body.emit(assign(merged_result, logic_not(merged_result)));
+   void *const mem_ctx = ralloc_parent(merged_result);
+   return new(mem_ctx) ir_dereference_variable(merged_result);
+}
+
 ir_rvalue *
 lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
                                        ir_expression *ir,
@@ -350,7 +397,11 @@ lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
    }
 
    ir_rvalue *rv;
-   if (ir->type->is_64bit())
+   if (ir->operation == ir_binop_all_equal)
+      rv = all_equal_destination(body, ir->type, dst);
+   else if (ir->operation == ir_binop_any_nequal)
+      rv = any_nequal_destination(body, ir->type, dst);
+   else if (ir->type->is_64bit())
       rv = compact_destination(body, ir->type, dst);
    else
       rv = merge_destination(body, ir->type, dst);
@@ -560,6 +611,14 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_binop_all_equal:
+   case ir_binop_any_nequal:
+      if (lowering(EQ64)) {
+         if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE) {
+            *rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+         }
+      }
+      break;
    default:
       break;
    }
-- 
2.9.5
[Mesa-dev] [PATCH 25/50] glsl: Add a lowering pass for 64-bit float add()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index b5f8c45..691803e 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -64,8 +64,10 @@
 #define NEG64 (1U << 5)
 #define EQ64 (1U << 6)
 #define LT64 (1U << 7)
+#define ADD64 (1U << 8)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
+                              ADD64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 24cc3cd..eed1dba 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -440,6 +440,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_binop_add:
+      if (lowering(ADD64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fadd64", generate_ir::fadd64);
+      }
+      break;
+
    case ir_binop_div:
       if (lowering(DIV64)) {
          if (ir->type->base_type == GLSL_TYPE_UINT64) {
-- 
2.9.5
[Mesa-dev] [PATCH 23/50] glsl: Add a lowering pass for 64-bit float equal()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 3a406ce..17db074 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -62,8 +62,9 @@
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
+#define EQ64 (1U << 6)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 88df912..d5e0f32 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -450,6 +450,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_binop_equal:
+      if (lowering(EQ64)) {
+         if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+      }
+      break;
+
    case ir_binop_mod:
       if (lowering(MOD64)) {
          if (ir->type->base_type == GLSL_TYPE_UINT64) {
-- 
2.9.5
[Mesa-dev] [PATCH 22/50] glsl: Add a lowering pass for 64-bit float sign()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 2 +-
 src/compiler/glsl/lower_64bit.cpp | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 2d9728d..3a406ce 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -63,7 +63,7 @@
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index bc9e477..88df912 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -435,6 +435,8 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       if (lowering(SIGN64)) {
          if (ir->type->is_integer_64())
             *rvalue = handle_op(ir, "__builtin_sign64", generate_ir::sign64);
+         else if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fsign64", generate_ir::fsign64);
       }
       break;
 
-- 
2.9.5
[Mesa-dev] [PATCH 28/50] glsl: Add a lowering pass for 64-bit float u2d()
From: Elie Tournier

Handle non 64bit sources (airlied)

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index e3d573c..a4cb7b2 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -66,9 +66,10 @@
 #define LT64 (1U << 7)
 #define ADD64 (1U << 8)
 #define D2U (1U << 9)
+#define U2D (1U << 10)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64 | MUL64 | D2U)
+                              ADD64 | MUL64 | D2U | U2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index 1b90830..1e97306 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -447,6 +447,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_u2d:
+      if (lowering(U2D)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_uint_to_fp64", generate_ir::uint_to_fp64, true);
+      }
+      break;
+
    case ir_binop_add:
       if (lowering(ADD64)) {
          if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
[Mesa-dev] [PATCH 26/50] glsl: Add a lowering pass for 64-bit float mul()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 2 +-
 src/compiler/glsl/lower_64bit.cpp | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 691803e..6506e28 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -67,7 +67,7 @@
 #define ADD64 (1U << 8)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-                              ADD64)
+                              ADD64 | MUL64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index eed1dba..f3a2633 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -485,6 +485,8 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       if (lowering(MUL64)) {
          if (ir->type->is_integer_64())
             *rvalue = handle_op(ir, "__builtin_umul64", generate_ir::umul64);
+         else if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fmul64", generate_ir::fmul64);
       }
       break;
 
-- 
2.9.5
[Mesa-dev] [PATCH 12/50] glsl: Add "built-in" functions to do int_to_fp64(int) (v2)
From: Elie Tournierv2: use mix Signed-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 129 src/compiler/glsl/builtin_functions.cpp | 4 + src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 23 ++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 160 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 8ce2baa..b656fad 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -5533,3 +5533,132 @@ fp64_to_int(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +int_to_fp64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r08C0 = new(mem_ctx) ir_variable(glsl_type::int_type, "a", ir_var_function_in); + sig_parameters.push_tail(r08C0); + ir_variable *const r08C1 = body.make_temp(glsl_type::uvec2_type, "return_value"); + ir_variable *const r08C2 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zSign", ir_var_auto); + body.emit(r08C2); + ir_variable *const r08C3 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac1", ir_var_auto); + body.emit(r08C3); + ir_variable *const r08C4 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac0", ir_var_auto); + body.emit(r08C4); + body.emit(assign(r08C4, body.constant(0u), 0x01)); + + body.emit(assign(r08C3, body.constant(0u), 0x01)); + + /* IF CONDITION */ + ir_expression *const r08C6 = equal(r08C0, body.constant(int(0))); + ir_if *f08C5 = new(mem_ctx) ir_if(operand(r08C6).val); + exec_list *const f08C5_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + ir_variable *const r08C7 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "z", 
ir_var_auto); + body.emit(r08C7); + body.emit(assign(r08C7, body.constant(0u), 0x02)); + + body.emit(assign(r08C7, body.constant(0u), 0x01)); + + body.emit(assign(r08C1, r08C7, 0x03)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = >else_instructions; + + ir_expression *const r08C8 = less(r08C0, body.constant(int(0))); + ir_expression *const r08C9 = expr(ir_unop_b2i, r08C8); + body.emit(assign(r08C2, expr(ir_unop_i2u, r08C9), 0x01)); + + ir_variable *const r08CA = body.make_temp(glsl_type::uint_type, "mix_retval"); + ir_expression *const r08CB = less(r08C0, body.constant(int(0))); + ir_expression *const r08CC = neg(r08C0); + ir_expression *const r08CD = expr(ir_unop_i2u, r08CC); + ir_expression *const r08CE = expr(ir_unop_i2u, r08C0); + body.emit(assign(r08CA, expr(ir_triop_csel, r08CB, r08CD, r08CE), 0x01)); + + ir_variable *const r08CF = body.make_temp(glsl_type::int_type, "assignment_tmp"); + ir_expression *const r08D0 = equal(r08CA, body.constant(0u)); + ir_expression *const r08D1 = expr(ir_unop_find_msb, r08CA); + ir_expression *const r08D2 = sub(body.constant(int(31)), r08D1); + ir_expression *const r08D3 = expr(ir_triop_csel, r08D0, body.constant(int(32)), r08D2); + body.emit(assign(r08CF, add(r08D3, body.constant(int(-11))), 0x01)); + + /* IF CONDITION */ + ir_expression *const r08D5 = gequal(r08CF, body.constant(int(0))); + ir_if *f08D4 = new(mem_ctx) ir_if(operand(r08D5).val); + exec_list *const f08D4_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + body.emit(assign(r08C4, lshift(r08CA, r08CF), 0x01)); + + body.emit(assign(r08C3, body.constant(0u), 0x01)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = >else_instructions; + + ir_variable *const r08D6 = body.make_temp(glsl_type::int_type, "count"); + body.emit(assign(r08D6, neg(r08CF), 0x01)); + + ir_expression *const r08D7 = equal(r08D6, body.constant(int(0))); + ir_expression *const r08D8 = less(r08D6, body.constant(int(32))); + 
ir_expression *const r08D9 = rshift(r08CA, r08D6); + ir_expression *const r08DA = expr(ir_triop_csel, r08D8, r08D9, body.constant(0u)); + body.emit(assign(r08C4, expr(ir_triop_csel, r08D7, r08CA, r08DA), 0x01)); + + ir_expression *const r08DB = equal(r08D6, body.constant(int(0))); + ir_expression *const r08DC = less(r08D6, body.constant(int(32))); + ir_expression *const r08DD = neg(r08D6); + ir_expression *const r08DE = bit_and(r08DD, body.constant(int(31))); + ir_expression *const r08DF = lshift(r08CA, r08DE); +
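The generated IR above is hard to read, so here is a hedged C++ reading of the part of int_to_fp64() that is visible in this excerpt: zero check, sign extraction, absolute value, leading-zero count minus 11, and the two-word shift. The excerpt is cut off before the result is assembled, so the final packing line is an assumption based on the usual softfloat layout (exponent base 0x412, with the fraction's implicit bit carrying into the exponent field), not something shown in the patch. U2, count_leading_zeros32() and to_double() are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// A double as two 32-bit words, like a uvec2: x = low word, y = high word.
struct U2 { std::uint32_t x, y; };

// Portable leading-zero count; returns 32 for v == 0, like the csel in the IR.
inline int count_leading_zeros32(std::uint32_t v) {
    int n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0u && (v & bit) == 0u; bit >>= 1)
        ++n;
    return n;
}

inline U2 int_to_fp64(std::int32_t a) {
    if (a == 0) return U2{0u, 0u};
    const std::uint32_t sign = a < 0 ? 1u : 0u;
    // Unsigned negate avoids signed overflow for INT32_MIN.
    const std::uint32_t abs_a =
        a < 0 ? 0u - static_cast<std::uint32_t>(a) : static_cast<std::uint32_t>(a);
    const int shift = count_leading_zeros32(abs_a) - 11;   // range -11..20
    std::uint32_t frac0, frac1;
    if (shift >= 0) {
        frac0 = abs_a << shift;
        frac1 = 0u;
    } else {
        const int count = -shift;                           // range 1..11
        frac0 = abs_a >> count;
        frac1 = abs_a << (32 - count);                      // (-count) & 31 in the IR
    }
    // ASSUMPTION: softfloat-style packing; not visible in the truncated patch.
    const std::uint32_t exp = static_cast<std::uint32_t>(0x412 - shift);
    return U2{frac1, (sign << 31) + (exp << 20) + frac0};
}

// Reinterpret the two words as a host double, for checking only.
inline double to_double(U2 v) {
    const std::uint64_t u = (static_cast<std::uint64_t>(v.y) << 32) | v.x;
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}
```

Comparing to_double(int_to_fp64(a)) against a plain static_cast<double>(a) on the host is a quick way to sanity-check the sketch.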
[Mesa-dev] [PATCH 21/50] glsl: Add a lowering pass for 64-bit float neg()
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 370812f..2d9728d 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -61,8 +61,9 @@
 #define DIV64 (1U << 2)
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
+#define NEG64 (1U << 5)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index debedfc..bc9e477 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
       }
       break;
 
+   case ir_unop_neg:
+      if (lowering(NEG64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fneg64", generate_ir::fneg64);
+      }
+      break;
+
    case ir_unop_sign:
       if (lowering(SIGN64)) {
          if (ir->type->is_integer_64())
-- 
2.9.5
[Mesa-dev] [PATCH 16/50] glsl: Add "built-in" functions to do trunc(fp64) (v2)
From: Elie Tournierv2: use mix. Signed-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 62 + src/compiler/glsl/builtin_functions.cpp | 4 +++ src/compiler/glsl/builtin_functions.h | 3 ++ src/compiler/glsl/float64.glsl | 21 +++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 91 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 6fbe12d..f0222e1 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -6635,3 +6635,65 @@ fsqrt64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +ftrunc64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0A28 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0A28); + ir_variable *const r0A29 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zHi", ir_var_auto); + body.emit(r0A29); + ir_variable *const r0A2A = new(mem_ctx) ir_variable(glsl_type::uint_type, "zLo", ir_var_auto); + body.emit(r0A2A); + ir_variable *const r0A2B = body.make_temp(glsl_type::int_type, "assignment_tmp"); + ir_expression *const r0A2C = rshift(swizzle_y(r0A28), body.constant(int(20))); + ir_expression *const r0A2D = bit_and(r0A2C, body.constant(2047u)); + ir_expression *const r0A2E = expr(ir_unop_u2i, r0A2D); + body.emit(assign(r0A2B, add(r0A2E, body.constant(int(-1023))), 0x01)); + + ir_variable *const r0A2F = body.make_temp(glsl_type::int_type, "assignment_tmp"); + body.emit(assign(r0A2F, sub(body.constant(int(52)), r0A2B), 0x01)); + + ir_expression *const r0A30 = gequal(r0A2F, body.constant(int(32))); + ir_expression *const r0A31 = lshift(body.constant(4294967295u), r0A2F); + ir_expression *const 
r0A32 = expr(ir_triop_csel, r0A30, body.constant(0u), r0A31); + body.emit(assign(r0A2A, bit_and(r0A32, swizzle_x(r0A28)), 0x01)); + + ir_expression *const r0A33 = less(r0A2F, body.constant(int(33))); + ir_expression *const r0A34 = add(r0A2F, body.constant(int(-32))); + ir_expression *const r0A35 = lshift(body.constant(4294967295u), r0A34); + ir_expression *const r0A36 = expr(ir_triop_csel, r0A33, body.constant(4294967295u), r0A35); + body.emit(assign(r0A29, bit_and(r0A36, swizzle_y(r0A28)), 0x01)); + + ir_variable *const r0A37 = body.make_temp(glsl_type::uint_type, "mix_retval"); + ir_expression *const r0A38 = less(body.constant(int(52)), r0A2B); + ir_expression *const r0A39 = less(r0A2B, body.constant(int(0))); + ir_expression *const r0A3A = expr(ir_triop_csel, r0A39, body.constant(0u), r0A2A); + body.emit(assign(r0A37, expr(ir_triop_csel, r0A38, swizzle_x(r0A28), r0A3A), 0x01)); + + body.emit(assign(r0A2A, r0A37, 0x01)); + + ir_variable *const r0A3B = body.make_temp(glsl_type::uint_type, "mix_retval"); + ir_expression *const r0A3C = less(body.constant(int(52)), r0A2B); + ir_expression *const r0A3D = less(r0A2B, body.constant(int(0))); + ir_expression *const r0A3E = expr(ir_triop_csel, r0A3D, body.constant(0u), r0A29); + body.emit(assign(r0A3B, expr(ir_triop_csel, r0A3C, swizzle_y(r0A28), r0A3E), 0x01)); + + body.emit(assign(r0A29, r0A3B, 0x01)); + + ir_variable *const r0A3F = body.make_temp(glsl_type::uvec2_type, "vec_ctor"); + body.emit(assign(r0A3F, r0A37, 0x01)); + + body.emit(assign(r0A3F, r0A3B, 0x02)); + + body.emit(ret(r0A3F)); + + sig->replace_parameters(_parameters); + return sig; +} diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp index d919873..02618e0 100644 --- a/src/compiler/glsl/builtin_functions.cpp +++ b/src/compiler/glsl/builtin_functions.cpp @@ -3398,6 +3398,10 @@ builtin_builder::create_builtins() generate_ir::fsqrt64(mem_ctx, integer_functions_supported), NULL); + 
add_function("__builtin_ftrunc64", +generate_ir::ftrunc64(mem_ctx, integer_functions_supported), +NULL); + #undef F #undef FI #undef FIUD_VEC diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h index 2f72f51..4a6b922 100644 --- a/src/compiler/glsl/builtin_functions.h +++ b/src/compiler/glsl/builtin_functions.h @@ -109,6 +109,9 @@ fp32_to_fp64(void *mem_ctx, builtin_available_predicate avail); ir_function_signature * fsqrt64(void *mem_ctx, builtin_available_predicate avail); +ir_function_signature * +ftrunc64(void *mem_ctx, builtin_available_predicate avail); + } #endif /* BULITIN_FUNCTIONS_H */ diff --git a/src/compiler/glsl/float64.glsl
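The masking done by the generated ftrunc64 IR above can be restated on the host as a minimal C++ sketch. It assumes the uvec2 layout used in the patch (x is the low word, y is the high word); U2, pack and unpack are hypothetical helpers for this sketch only. The IR selects results with csel, while the sketch returns early, because C++ shifts by 32 or more are undefined.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// A double as two 32-bit words: x = low word, y = high word (sign, exponent, top fraction bits).
struct U2 { uint32_t x; uint32_t y; };

// Hypothetical host-side helpers, not part of the patch.
inline U2 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return U2{static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}
inline double unpack(U2 a) {
    uint64_t bits = (static_cast<uint64_t>(a.y) << 32) | a.x;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

inline U2 ftrunc64(U2 a) {
    int unbiased = static_cast<int>((a.y >> 20) & 0x7FFu) - 1023;
    if (unbiased > 52) return a;           // already integral (also covers Inf and NaN)
    if (unbiased < 0) return U2{0u, 0u};   // |a| < 1 becomes +0, as the csel chain in the IR does
    int frac_bits = 52 - unbiased;         // number of fraction bits to clear, 0..52
    uint32_t lo_mask = frac_bits >= 32 ? 0u : (0xFFFFFFFFu << frac_bits);
    uint32_t hi_mask = frac_bits < 33 ? 0xFFFFFFFFu : (0xFFFFFFFFu << (frac_bits - 32));
    return U2{a.x & lo_mask, a.y & hi_mask};
}

inline void ftrunc64_example() {
    assert(unpack(ftrunc64(pack(2.75))) == 2.0);
    assert(unpack(ftrunc64(pack(-2.75))) == -2.0);
}
```

Note that, like the IR, the sketch drops the sign for inputs with magnitude below 1, so -0.5 truncates to +0.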
[Mesa-dev] [PATCH 17/50] glsl: Add "built-in" functions to do round(fp64)
From: Elie TournierSigned-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 225 src/compiler/glsl/builtin_functions.cpp | 4 + src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 41 ++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 274 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index f0222e1..3cba289 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -6697,3 +6697,228 @@ ftrunc64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +fround64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0F1C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0F1C); + ir_variable *const r0F1D = body.make_temp(glsl_type::bool_type, "execute_flag"); + body.emit(assign(r0F1D, body.constant(true), 0x01)); + + ir_variable *const r0F1E = body.make_temp(glsl_type::uvec2_type, "return_value"); + ir_variable *const r0F1F = new(mem_ctx) ir_variable(glsl_type::uint_type, "aLo", ir_var_auto); + body.emit(r0F1F); + ir_variable *const r0F20 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aHi", ir_var_auto); + body.emit(r0F20); + ir_variable *const r0F21 = body.make_temp(glsl_type::int_type, "assignment_tmp"); + ir_expression *const r0F22 = rshift(swizzle_y(r0F1C), body.constant(int(20))); + ir_expression *const r0F23 = bit_and(r0F22, body.constant(2047u)); + ir_expression *const r0F24 = expr(ir_unop_u2i, r0F23); + body.emit(assign(r0F21, add(r0F24, body.constant(int(-1023))), 0x01)); + + body.emit(assign(r0F20, swizzle_y(r0F1C), 0x01)); + + body.emit(assign(r0F1F, swizzle_x(r0F1C), 0x01)); + + /* 
IF CONDITION */ + ir_expression *const r0F26 = less(r0F21, body.constant(int(20))); + ir_if *f0F25 = new(mem_ctx) ir_if(operand(r0F26).val); + exec_list *const f0F25_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + /* IF CONDITION */ + ir_expression *const r0F28 = less(r0F21, body.constant(int(0))); + ir_if *f0F27 = new(mem_ctx) ir_if(operand(r0F28).val); + exec_list *const f0F27_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + body.emit(assign(r0F20, bit_and(swizzle_y(r0F1C), body.constant(2147483648u)), 0x01)); + + /* IF CONDITION */ + ir_expression *const r0F2A = equal(r0F21, body.constant(int(-1))); + ir_expression *const r0F2B = nequal(swizzle_x(r0F1C), body.constant(0u)); + ir_expression *const r0F2C = logic_and(r0F2A, r0F2B); + ir_if *f0F29 = new(mem_ctx) ir_if(operand(r0F2C).val); + exec_list *const f0F29_parent_instructions = body.instructions; + +/* THEN INSTRUCTIONS */ +body.instructions = >then_instructions; + +body.emit(assign(r0F20, bit_or(r0F20, body.constant(1072693248u)), 0x01)); + + + body.instructions = f0F29_parent_instructions; + body.emit(f0F29); + + /* END IF */ + + body.emit(assign(r0F1F, body.constant(0u), 0x01)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = >else_instructions; + + ir_variable *const r0F2D = body.make_temp(glsl_type::uint_type, "assignment_tmp"); + body.emit(assign(r0F2D, rshift(body.constant(1048575u), r0F21), 0x01)); + + /* IF CONDITION */ + ir_expression *const r0F2F = bit_and(r0F20, r0F2D); + ir_expression *const r0F30 = equal(r0F2F, body.constant(0u)); + ir_expression *const r0F31 = equal(r0F1F, body.constant(0u)); + ir_expression *const r0F32 = logic_and(r0F30, r0F31); + ir_if *f0F2E = new(mem_ctx) ir_if(operand(r0F32).val); + exec_list *const f0F2E_parent_instructions = body.instructions; + +/* THEN INSTRUCTIONS */ +body.instructions = >then_instructions; + 
+body.emit(assign(r0F1E, r0F1C, 0x03)); + +body.emit(assign(r0F1D, body.constant(false), 0x01)); + + +/* ELSE INSTRUCTIONS */ +body.instructions = >else_instructions; + +ir_expression *const r0F33 = rshift(body.constant(524288u), r0F21); +body.emit(assign(r0F20, add(r0F20, r0F33), 0x01)); + +ir_expression *const r0F34 = expr(ir_unop_bit_not, r0F2D); +body.emit(assign(r0F20, bit_and(r0F20, r0F34), 0x01)); + +
[Mesa-dev] [PATCH 15/50] glsl: Add "built-in" functions to do sqrt(fp64)
From: Elie TournierThis currently uses fp64->fp32, sqrt(fp32), fp32->fp64. [airlied: The code is include from soft float for doing proper sqrt64 but it needs to be decided if we need to pursue this and how to optimise it better.] Signed-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 393 src/compiler/glsl/builtin_functions.cpp | 4 + src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 275 ++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 676 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 034d2d0..6fbe12d 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -6242,3 +6242,396 @@ fp32_to_fp64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +fsqrt64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r09A9 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r09A9); + ir_variable *const r09AA = body.make_temp(glsl_type::uvec2_type, "a"); + body.emit(assign(r09AA, r09A9, 0x03)); + + ir_variable *const r09AB = body.make_temp(glsl_type::float_type, "return_value"); + ir_variable *const r09AC = body.make_temp(glsl_type::uint_type, "extractFloat64FracHi_retval"); + body.emit(assign(r09AC, bit_and(swizzle_y(r09A9), body.constant(1048575u)), 0x01)); + + ir_variable *const r09AD = body.make_temp(glsl_type::int_type, "extractFloat64Exp_retval"); + ir_expression *const r09AE = rshift(swizzle_y(r09A9), body.constant(int(20))); + ir_expression *const r09AF = bit_and(r09AE, body.constant(2047u)); + body.emit(assign(r09AD, expr(ir_unop_u2i, r09AF), 0x01)); + + ir_variable *const r09B0 = 
body.make_temp(glsl_type::uint_type, "extractFloat64Sign_retval"); + body.emit(assign(r09B0, rshift(swizzle_y(r09A9), body.constant(int(31))), 0x01)); + + /* IF CONDITION */ + ir_expression *const r09B2 = equal(r09AD, body.constant(int(2047))); + ir_if *f09B1 = new(mem_ctx) ir_if(operand(r09B2).val); + exec_list *const f09B1_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + ir_variable *const r09B3 = new(mem_ctx) ir_variable(glsl_type::float_type, "rval", ir_var_auto); + body.emit(r09B3); + ir_expression *const r09B4 = lshift(swizzle_y(r09A9), body.constant(int(12))); + ir_expression *const r09B5 = rshift(swizzle_x(r09A9), body.constant(int(20))); + body.emit(assign(r09AA, bit_or(r09B4, r09B5), 0x02)); + + body.emit(assign(r09AA, lshift(swizzle_x(r09A9), body.constant(int(12))), 0x01)); + + ir_expression *const r09B6 = lshift(r09B0, body.constant(int(31))); + ir_expression *const r09B7 = bit_or(r09B6, body.constant(2143289344u)); + ir_expression *const r09B8 = rshift(swizzle_y(r09AA), body.constant(int(9))); + ir_expression *const r09B9 = bit_or(r09B7, r09B8); + body.emit(assign(r09B3, expr(ir_unop_bitcast_u2f, r09B9), 0x01)); + + ir_variable *const r09BA = body.make_temp(glsl_type::float_type, "mix_retval"); + ir_expression *const r09BB = bit_or(r09AC, swizzle_x(r09A9)); + ir_expression *const r09BC = nequal(r09BB, body.constant(0u)); + ir_expression *const r09BD = lshift(r09B0, body.constant(int(31))); + ir_expression *const r09BE = add(r09BD, body.constant(2139095040u)); + ir_expression *const r09BF = expr(ir_unop_bitcast_u2f, r09BE); + body.emit(assign(r09BA, expr(ir_triop_csel, r09BC, r09B3, r09BF), 0x01)); + + body.emit(assign(r09B3, r09BA, 0x01)); + + body.emit(assign(r09AB, r09BA, 0x01)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = >else_instructions; + + ir_variable *const r09C0 = body.make_temp(glsl_type::uint_type, "mix_retval"); + ir_expression *const r09C1 = lshift(r09AC, 
body.constant(int(10))); + ir_expression *const r09C2 = rshift(swizzle_x(r09A9), body.constant(int(22))); + ir_expression *const r09C3 = bit_or(r09C1, r09C2); + ir_expression *const r09C4 = lshift(swizzle_x(r09A9), body.constant(int(10))); + ir_expression *const r09C5 = nequal(r09C4, body.constant(0u)); + ir_expression *const r09C6 = expr(ir_unop_b2i, r09C5); + ir_expression *const r09C7 = expr(ir_unop_i2u, r09C6); + body.emit(assign(r09C0, bit_or(r09C3, r09C7), 0x01)); + + ir_variable *const r09C8 = body.make_temp(glsl_type::uint_type, "mix_retval"); + ir_expression *const r09C9 = nequal(r09AD,
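The commit message says the square root currently goes fp64->fp32, sqrt(fp32), fp32->fp64. A minimal host-side sketch of that round trip, using native C++ types rather than the uvec2 emulation, shows what it costs in precision. The function name sqrt_via_fp32 is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cmath>

// Round trip described in the commit message, with native types standing in
// for the emulated conversions.
inline double sqrt_via_fp32(double a) {
    float narrowed = static_cast<float>(a);   // fp64 -> fp32
    float root = std::sqrt(narrowed);         // sqrt(fp32)
    return static_cast<double>(root);         // fp32 -> fp64
}

inline void sqrt_example() {
    // Exact for values whose root is representable in fp32.
    assert(sqrt_via_fp32(4.0) == 2.0);
    // Otherwise only about single precision survives.
    assert(std::fabs(sqrt_via_fp32(2.0) - std::sqrt(2.0)) < 1e-7);
}
```

Inputs outside the fp32 range cannot survive the narrowing step at all, which is presumably part of why the full soft-float sqrt64 is kept in the patch.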
[Mesa-dev] [PATCH 18/50] glsl: Add "built-in" functions to do rcp(fp64)
From: Elie TournierSigned-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 1829 +++ src/compiler/glsl/builtin_functions.cpp |4 + src/compiler/glsl/builtin_functions.h |3 + src/compiler/glsl/float64.glsl | 10 + src/compiler/glsl/glcpp/glcpp-parse.y |1 + 5 files changed, 1847 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 3cba289..e8ef0b0 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -6922,3 +6922,1832 @@ fround64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +frcp64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0F45 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0F45); + ir_variable *const r0F46 = body.make_temp(glsl_type::uint_type, "z1Ptr"); + body.emit(assign(r0F46, sub(body.constant(2406117202u), swizzle_x(r0F45)), 0x01)); + + ir_expression *const r0F47 = sub(body.constant(3217938081u), swizzle_y(r0F45)); + ir_expression *const r0F48 = less(body.constant(2406117202u), swizzle_x(r0F45)); + ir_expression *const r0F49 = expr(ir_unop_b2i, r0F48); + ir_expression *const r0F4A = expr(ir_unop_i2u, r0F49); + body.emit(assign(r0F45, sub(r0F47, r0F4A), 0x02)); + + body.emit(assign(r0F45, r0F46, 0x01)); + + ir_variable *const r0F4B = new(mem_ctx) ir_variable(glsl_type::uint_type, "z1", ir_var_auto); + body.emit(r0F4B); + ir_variable *const r0F4C = new(mem_ctx) ir_variable(glsl_type::uint_type, "z0", ir_var_auto); + body.emit(r0F4C); + ir_expression *const r0F4D = lshift(swizzle_y(r0F45), body.constant(int(31))); + ir_expression *const r0F4E = rshift(r0F46, body.constant(int(1))); + 
body.emit(assign(r0F4B, bit_or(r0F4D, r0F4E), 0x01)); + + body.emit(assign(r0F4C, rshift(swizzle_y(r0F45), body.constant(int(1))), 0x01)); + + body.emit(assign(r0F45, r0F4C, 0x02)); + + body.emit(assign(r0F45, r0F4B, 0x01)); + + ir_variable *const r0F4F = body.make_temp(glsl_type::bool_type, "execute_flag"); + body.emit(assign(r0F4F, body.constant(true), 0x01)); + + ir_variable *const r0F50 = body.make_temp(glsl_type::uvec2_type, "return_value"); + ir_variable *const r0F51 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zSign", ir_var_auto); + body.emit(r0F51); + ir_variable *const r0F52 = new(mem_ctx) ir_variable(glsl_type::int_type, "bExp", ir_var_auto); + body.emit(r0F52); + ir_variable *const r0F53 = new(mem_ctx) ir_variable(glsl_type::int_type, "aExp", ir_var_auto); + body.emit(r0F53); + ir_variable *const r0F54 = new(mem_ctx) ir_variable(glsl_type::uint_type, "bFracHi", ir_var_auto); + body.emit(r0F54); + ir_variable *const r0F55 = new(mem_ctx) ir_variable(glsl_type::uint_type, "bFracLo", ir_var_auto); + body.emit(r0F55); + ir_variable *const r0F56 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFracHi", ir_var_auto); + body.emit(r0F56); + ir_variable *const r0F57 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFracLo", ir_var_auto); + body.emit(r0F57); + ir_variable *const r0F58 = new(mem_ctx) ir_variable(glsl_type::int_type, "zExp", ir_var_auto); + body.emit(r0F58); + ir_variable *const r0F59 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac2", ir_var_auto); + body.emit(r0F59); + ir_variable *const r0F5A = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac1", ir_var_auto); + body.emit(r0F5A); + ir_variable *const r0F5B = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac0", ir_var_auto); + body.emit(r0F5B); + body.emit(assign(r0F5B, body.constant(0u), 0x01)); + + body.emit(assign(r0F5A, body.constant(0u), 0x01)); + + body.emit(assign(r0F59, body.constant(0u), 0x01)); + + ir_variable *const r0F5C = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracLo_retval"); + body.emit(assign(r0F5C, swizzle_x(r0F45), 0x01)); + + body.emit(assign(r0F57, r0F5C, 0x01)); + + ir_variable *const r0F5D = body.make_temp(glsl_type::uint_type, "extractFloat64FracHi_retval"); + body.emit(assign(r0F5D, bit_and(r0F4C, body.constant(1048575u)), 0x01)); + + body.emit(assign(r0F56, r0F5D, 0x01)); + + ir_variable *const r0F5E = body.make_temp(glsl_type::uint_type, "extractFloat64FracLo_retval"); + body.emit(assign(r0F5E, swizzle_x(r0F45), 0x01)); + + body.emit(assign(r0F55, r0F5E, 0x01)); + + ir_variable *const r0F5F = body.make_temp(glsl_type::uint_type, "extractFloat64FracHi_retval"); + body.emit(assign(r0F5F, bit_and(r0F4C, body.constant(1048575u)), 0x01)); + +
[Mesa-dev] [PATCH 20/50] glsl: add define to lower all double operations
From: Dave Airlie

We will add all fp64 ops to this for now; later, drivers may want to lower only some.

Signed-off-by: Dave Airlie
---
 src/compiler/glsl/ir_optimization.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index c48d0a9..370812f 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -62,6 +62,7 @@
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
 
+#define LOWER_ALL_DOUBLE_OPS (ABS64)
 /**
  * \see class lower_packing_builtins_visitor
  */
--
2.9.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 19/50] glsl: Add a lowering pass for 64-bit float abs()
From: Elie Tournier

Squashed with:
glsl/lower_64bit: fix return type conversion (airlied)
Only do conversion for the 64-bit types, add a path to do result merging without conversion.

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/ir_optimization.h | 1 +
 src/compiler/glsl/lower_64bit.cpp   | 8
 2 files changed, 9 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 931bffb..c48d0a9 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -60,6 +60,7 @@
 #define SIGN64 (1U << 1)
 #define DIV64 (1U << 2)
 #define MOD64 (1U << 3)
+#define ABS64 (1U << 4)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp b/src/compiler/glsl/lower_64bit.cpp
index d181f63..debedfc 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -416,6 +416,14 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
    assert(ir != NULL);
 
    switch (ir->operation) {
+
+   case ir_unop_abs:
+      if (lowering(ABS64)) {
+         if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+            *rvalue = handle_op(ir, "__builtin_fabs64", generate_ir::fabs64);
+      }
+      break;
+
    case ir_unop_sign:
       if (lowering(SIGN64)) {
          if (ir->type->is_integer_64())
--
2.9.5
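The flag test plus type test in the hunk above can be modelled in isolation. The following is a minimal sketch, not the Mesa visitor: the flag values are copied from the ir_optimization.h hunks in this thread, while Op, BaseType and pick_builtin are hypothetical stand-ins for the IR enums and for handle_op.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Flag values as they appear in the quoted ir_optimization.h hunks.
constexpr uint32_t SIGN64 = 1u << 1;
constexpr uint32_t DIV64  = 1u << 2;
constexpr uint32_t MOD64  = 1u << 3;
constexpr uint32_t ABS64  = 1u << 4;
constexpr uint32_t LOWER_ALL_DOUBLE_OPS = ABS64;

// Hypothetical stand-ins for the IR operation and base-type enums.
enum class Op { Abs, Sign, Add };
enum class BaseType { Float, Double, Int64 };

// Returns the builtin to call, or an empty string when the expression is left alone.
inline std::string pick_builtin(uint32_t lower_mask, Op op, BaseType type) {
    switch (op) {
    case Op::Abs:
        // Only lower when the driver asked for it and the operand is a double.
        if ((lower_mask & ABS64) != 0u && type == BaseType::Double)
            return "__builtin_fabs64";
        break;
    default:
        break;
    }
    return "";
}

inline void lowering_example() {
    assert(pick_builtin(LOWER_ALL_DOUBLE_OPS, Op::Abs, BaseType::Double) == "__builtin_fabs64");
    assert(pick_builtin(DIV64 | MOD64, Op::Abs, BaseType::Double).empty());
}
```

Keeping one bit per operation is what lets a driver later request only a subset, as the LOWER_ALL_DOUBLE_OPS commit message anticipates.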
[Mesa-dev] [PATCH 11/50] glsl: Add "built-in" functions to do fp64_to_int(fp64) (v2)
From: Elie Tournierv2: use mix Signed-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 179 src/compiler/glsl/builtin_functions.cpp | 4 + src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 41 src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 228 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index c200447..8ce2baa 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -5354,3 +5354,182 @@ uint_to_fp64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +fp64_to_int(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::int_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r088E = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r088E); + ir_variable *const r088F = body.make_temp(glsl_type::bool_type, "execute_flag"); + body.emit(assign(r088F, body.constant(true), 0x01)); + + ir_variable *const r0890 = body.make_temp(glsl_type::int_type, "return_value"); + ir_variable *const r0891 = new(mem_ctx) ir_variable(glsl_type::uint_type, "absZ", ir_var_auto); + body.emit(r0891); + ir_variable *const r0892 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aSign", ir_var_auto); + body.emit(r0892); + ir_variable *const r0893 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFracHi", ir_var_auto); + body.emit(r0893); + ir_variable *const r0894 = body.make_temp(glsl_type::uint_type, "extractFloat64FracHi_retval"); + body.emit(assign(r0894, bit_and(swizzle_y(r088E), body.constant(1048575u)), 0x01)); + + body.emit(assign(r0893, r0894, 0x01)); + + ir_variable *const r0895 = body.make_temp(glsl_type::int_type, "extractFloat64Exp_retval"); + ir_expression 
*const r0896 = rshift(swizzle_y(r088E), body.constant(int(20))); + ir_expression *const r0897 = bit_and(r0896, body.constant(2047u)); + body.emit(assign(r0895, expr(ir_unop_u2i, r0897), 0x01)); + + body.emit(assign(r0892, rshift(swizzle_y(r088E), body.constant(int(31))), 0x01)); + + body.emit(assign(r0891, body.constant(0u), 0x01)); + + ir_variable *const r0898 = body.make_temp(glsl_type::int_type, "assignment_tmp"); + body.emit(assign(r0898, add(r0895, body.constant(int(-1043))), 0x01)); + + /* IF CONDITION */ + ir_expression *const r089A = gequal(r0898, body.constant(int(0))); + ir_if *f0899 = new(mem_ctx) ir_if(operand(r089A).val); + exec_list *const f0899_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + /* IF CONDITION */ + ir_expression *const r089C = less(body.constant(int(1054)), r0895); + ir_if *f089B = new(mem_ctx) ir_if(operand(r089C).val); + exec_list *const f089B_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = >then_instructions; + + /* IF CONDITION */ + ir_expression *const r089E = equal(r0895, body.constant(int(2047))); + ir_expression *const r089F = bit_or(r0894, swizzle_x(r088E)); + ir_expression *const r08A0 = expr(ir_unop_u2i, r089F); + ir_expression *const r08A1 = expr(ir_unop_i2b, r08A0); + ir_expression *const r08A2 = logic_and(r089E, r08A1); + ir_if *f089D = new(mem_ctx) ir_if(operand(r08A2).val); + exec_list *const f089D_parent_instructions = body.instructions; + +/* THEN INSTRUCTIONS */ +body.instructions = >then_instructions; + +body.emit(assign(r0892, body.constant(0u), 0x01)); + + + body.instructions = f089D_parent_instructions; + body.emit(f089D); + + /* END IF */ + + ir_expression *const r08A3 = expr(ir_unop_u2i, r0892); + ir_expression *const r08A4 = expr(ir_unop_i2b, r08A3); + body.emit(assign(r0890, expr(ir_triop_csel, r08A4, body.constant(int(-2147483648)), body.constant(int(2147483647))), 0x01)); + + body.emit(assign(r088F, 
body.constant(false), 0x01)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = >else_instructions; + + ir_variable *const r08A5 = body.make_temp(glsl_type::uint_type, "a0"); + body.emit(assign(r08A5, bit_or(r0894, body.constant(1048576u)), 0x01)); + + ir_expression *const r08A6 = equal(r0898, body.constant(int(0))); + ir_expression *const r08A7 = lshift(r08A5, r0898); + ir_expression *const r08A8 = neg(r0898); + ir_expression *const r08A9 = bit_and(r08A8, body.constant(int(31))); + ir_expression
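The excerpt above is cut off after the saturation branch, so the following C++ sketch only mirrors the visible part (exponent above 1054 saturates, and a NaN is treated as positive). Everything after that point is an assumption: it truncates toward zero and saturates results that do not fit, and it uses a 64-bit intermediate for brevity where the GLSL code has to work in 32-bit halves. U2 and pack are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

struct U2 { uint32_t x; uint32_t y; };  // x = low word, y = high word

// Hypothetical host-side helper, not part of the patch.
inline U2 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return U2{static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}

inline int32_t fp64_to_int(U2 a) {
    uint32_t frac_hi = a.y & 0x000FFFFFu;
    int exp = static_cast<int>((a.y >> 20) & 0x7FFu);
    uint32_t sign = a.y >> 31;

    if (exp > 1054) {  // |a| >= 2^32, Inf or NaN: the branch visible in the excerpt
        if (exp == 2047 && (frac_hi | a.x) != 0u)
            sign = 0u;  // NaN saturates to the positive limit
        return sign ? std::numeric_limits<int32_t>::min()
                    : std::numeric_limits<int32_t>::max();
    }
    if (exp < 1023)
        return 0;       // |a| < 1 truncates to zero

    // Assumed remainder: 53-bit significand with the implicit bit, shifted down.
    uint64_t sig = (static_cast<uint64_t>(frac_hi | 0x00100000u) << 32) | a.x;
    uint64_t mag = sig >> (1075 - exp);  // shift is in 21..52 here
    if (sign) {
        if (mag > 0x80000000ull) return std::numeric_limits<int32_t>::min();
        return static_cast<int32_t>(-static_cast<int64_t>(mag));
    }
    if (mag > 0x7FFFFFFFull) return std::numeric_limits<int32_t>::max();
    return static_cast<int32_t>(mag);
}

inline void fp64_to_int_example() {
    assert(fp64_to_int(pack(3.99)) == 3);
    assert(fp64_to_int(pack(-3.99)) == -3);
}
```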
[Mesa-dev] [PATCH 04/50] glsl: Add "built-in" functions to do eq(fp64, fp64)
From: Elie TournierSigned-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 104 src/compiler/glsl/builtin_functions.cpp | 4 ++ src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 44 ++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 156 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 8546048..2340c48 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -96,3 +96,107 @@ fsign64(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(_parameters); return sig; } +ir_function_signature * +extractFloat64FracLo(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0024 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0024); + ir_swizzle *const r0025 = swizzle_x(r0024); + body.emit(ret(r0025)); + + sig->replace_parameters(_parameters); + return sig; +} +ir_function_signature * +extractFloat64FracHi(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0026 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0026); + ir_expression *const r0027 = bit_and(swizzle_y(r0026), body.constant(1048575u)); + body.emit(ret(r0027)); + + sig->replace_parameters(_parameters); + return sig; +} +ir_function_signature * +extractFloat64Exp(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::int_type, 
avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0028 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0028); + ir_expression *const r0029 = rshift(swizzle_y(r0028), body.constant(int(20))); + ir_expression *const r002A = bit_and(r0029, body.constant(2047u)); + ir_expression *const r002B = expr(ir_unop_u2i, r002A); + body.emit(ret(r002B)); + + sig->replace_parameters(_parameters); + return sig; +} +ir_function_signature * +feq64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r002C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r002C); + ir_variable *const r002D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "b", ir_var_function_in); + sig_parameters.push_tail(r002D); + ir_variable *const r002E = body.make_temp(glsl_type::bool_type, "mix_retval"); + ir_expression *const r002F = rshift(swizzle_y(r002C), body.constant(int(20))); + ir_expression *const r0030 = bit_and(r002F, body.constant(2047u)); + ir_expression *const r0031 = expr(ir_unop_u2i, r0030); + ir_expression *const r0032 = equal(r0031, body.constant(int(2047))); + ir_expression *const r0033 = bit_and(swizzle_y(r002C), body.constant(1048575u)); + ir_expression *const r0034 = bit_or(r0033, swizzle_x(r002C)); + ir_expression *const r0035 = nequal(r0034, body.constant(0u)); + ir_expression *const r0036 = logic_and(r0032, r0035); + ir_expression *const r0037 = rshift(swizzle_y(r002D), body.constant(int(20))); + ir_expression *const r0038 = bit_and(r0037, body.constant(2047u)); + ir_expression *const r0039 = expr(ir_unop_u2i, r0038); + ir_expression *const r003A = equal(r0039, body.constant(int(2047))); + 
ir_expression *const r003B = bit_and(swizzle_y(r002D), body.constant(1048575u)); + ir_expression *const r003C = bit_or(r003B, swizzle_x(r002D)); + ir_expression *const r003D = nequal(r003C, body.constant(0u)); + ir_expression *const r003E = logic_and(r003A, r003D); + ir_expression *const r003F = logic_or(r0036, r003E); + ir_expression *const r0040 = equal(swizzle_x(r002C), swizzle_x(r002D)); + ir_expression *const r0041 = equal(swizzle_y(r002C), swizzle_y(r002D)); + ir_expression *const r0042 = equal(swizzle_x(r002C), body.constant(0u)); + ir_expression *const r0043 = bit_or(swizzle_y(r002C), swizzle_y(r002D)); + ir_expression *const r0044 = lshift(r0043, body.constant(int(1))); + ir_expression
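The three extract helpers above are complete in the excerpt, but the final combination in feq64 is cut off. The sketch below restates the helpers in C++ and finishes the comparison in the usual SoftFloat float64_eq shape; that ending is an assumption, not something the excerpt shows. U2 and pack are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

struct U2 { uint32_t x; uint32_t y; };  // x = low word, y = high word

// Hypothetical host-side helper, not part of the patch.
inline U2 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return U2{static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}

inline uint32_t extract_frac_lo(U2 a) { return a.x; }
inline uint32_t extract_frac_hi(U2 a) { return a.y & 0x000FFFFFu; }
inline int extract_exp(U2 a) { return static_cast<int>((a.y >> 20) & 0x7FFu); }

inline bool feq64(U2 a, U2 b) {
    bool a_nan = extract_exp(a) == 0x7FF && (extract_frac_hi(a) | extract_frac_lo(a)) != 0u;
    bool b_nan = extract_exp(b) == 0x7FF && (extract_frac_hi(b) | extract_frac_lo(b)) != 0u;
    if (a_nan || b_nan)
        return false;  // NaN never compares equal
    // Same bit pattern, or +0 versus -0 (everything except the sign bits is zero).
    return a.x == b.x &&
           (a.y == b.y || (a.x == 0u && ((a.y | b.y) << 1) == 0u));
}

inline void feq64_example() {
    assert(feq64(pack(1.0), pack(1.0)));
    assert(feq64(pack(0.0), pack(-0.0)));
}
```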
[Mesa-dev] [PATCH 02/50] glsl: Add "built-in" functions to do neg(fp64) (v2)
From: Elie Tournier

v2: use mix.

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 51 +
 src/compiler/glsl/builtin_functions.cpp | 4 +++
 src/compiler/glsl/builtin_functions.h   | 3 ++
 src/compiler/glsl/float64.glsl          | 24
 src/compiler/glsl/glcpp/glcpp-parse.y   | 1 +
 5 files changed, 83 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index 7b57231..2898fc9 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -17,3 +17,54 @@ fabs64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+is_nan(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r000C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r000C);
+   ir_expression *const r000D = lshift(swizzle_y(r000C), body.constant(int(1)));
+   ir_expression *const r000E = gequal(r000D, body.constant(4292870144u));
+   ir_expression *const r000F = nequal(swizzle_x(r000C), body.constant(0u));
+   ir_expression *const r0010 = bit_and(swizzle_y(r000C), body.constant(1048575u));
+   ir_expression *const r0011 = nequal(r0010, body.constant(0u));
+   ir_expression *const r0012 = logic_or(r000F, r0011);
+   ir_expression *const r0013 = logic_and(r000E, r0012);
+   body.emit(ret(r0013));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+fneg64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0014 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r0014);
+   ir_expression *const r0015 = lshift(swizzle_y(r0014), body.constant(int(1)));
+   ir_expression *const r0016 = gequal(r0015, body.constant(4292870144u));
+   ir_expression *const r0017 = nequal(swizzle_x(r0014), body.constant(0u));
+   ir_expression *const r0018 = bit_and(swizzle_y(r0014), body.constant(1048575u));
+   ir_expression *const r0019 = nequal(r0018, body.constant(0u));
+   ir_expression *const r001A = logic_or(r0017, r0019);
+   ir_expression *const r001B = logic_and(r0016, r001A);
+   ir_expression *const r001C = bit_xor(swizzle_y(r0014), body.constant(2147483648u));
+   body.emit(assign(r0014, expr(ir_triop_csel, r001B, swizzle_y(r0014), r001C), 0x02));
+
+   body.emit(ret(r0014));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index 133a896..9d88a31 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3346,6 +3346,10 @@ builtin_builder::create_builtins()
                 generate_ir::fabs64(mem_ctx, integer_functions_supported),
                 NULL);
 
+   add_function("__builtin_fneg64",
+                generate_ir::fneg64(mem_ctx, integer_functions_supported),
+                NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h
index deaf640..adec424 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -70,6 +70,9 @@ udivmod64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fabs64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+fneg64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index d798d7e..fedf8b7 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -6,6 +6,7 @@
 #version 130
 #extension GL_ARB_shader_bit_encoding : enable
+#extension GL_EXT_shader_integer_mix : enable
 
 /* Software IEEE floating-point rounding mode.
  * GLSL spec section "4.7.1 Range and Precision":
@@ -27,3 +28,26 @@ fabs64(uvec2 a)
    a.y &= 0x7FFFFFFFu;
    return a;
 }
+
+/* Returns 1 if the double-precision floating-point value `a' is a NaN;
+ * otherwise returns 0.
+ */
+bool
+is_nan(uvec2 a)
+{
+   return (0xFFE00000u <= (a.y<<1)) &&
+      ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
+}
+
+/* Negate value of a Float64 :
+ * Toggle the sign bit
+ */
+uvec2
+fneg64(uvec2 a)
+{
+   uint t = a.y;
+
+   t ^= (1u << 31);
+   a.y = mix(t, a.y, is_nan(a));
+   return a;
+}
diff --git
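The NaN test and the sign flip in this patch can be checked on the host with a minimal C++ sketch. The constants are the ones in the generated IR (4292870144u is 0xFFE00000, 1048575u is 0x000FFFFF, 2147483648u is 1u << 31); U2, pack and unpack are hypothetical helpers for the sketch, and the ternary stands in for GLSL's mix() with a bool selector.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

struct U2 { uint32_t x; uint32_t y; };  // x = low word, y = high word

// Hypothetical host-side helpers, not part of the patch.
inline U2 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return U2{static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32)};
}
inline double unpack(U2 a) {
    uint64_t bits = (static_cast<uint64_t>(a.y) << 32) | a.x;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// NaN: exponent all ones (checked with the sign shifted out) and a non-zero fraction.
inline bool is_nan(U2 a) {
    return (0xFFE00000u <= (a.y << 1)) &&
           ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
}

// Toggle the sign bit, but return NaNs untouched.
inline U2 fneg64(U2 a) {
    uint32_t t = a.y ^ (1u << 31);
    a.y = is_nan(a) ? a.y : t;  // mix(t, a.y, is_nan(a)) in the GLSL
    return a;
}

inline void fneg64_example() {
    assert(unpack(fneg64(pack(1.5))) == -1.5);
    assert(!is_nan(pack(std::numeric_limits<double>::infinity())));
}
```

The select form matters for the reason given in the cover letter: it avoids an if branch in the emitted shader code.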
[Mesa-dev] soft fp64 support - main body (glsl/gallium)
This is the main code for the soft fp64 work. It's mostly Elie's code with a bunch of changes by me.

This patchset has all the glsl lowering code (using float64.glsl; yes, I know checked-in files are bad, but not bad enough for anyone to have solved int64.glsl yet, so we have a precedent).

It introduces the builtin code for all the functions first. This code has seen some optimisation using findMSB and mix opcodes to remove if branches; I'm sure it could see a lot more. If statements are the enemy, especially when you hit glsl copy prop and the r600/sb backend.

The second part is just the lowering hooks to use the builtins, but also to do a bunch of non-builtin lowering.

Finally, the gallium patches add a new interpretation for PIPE_CAP_DOUBLES, allowing drivers to choose whether they want no fp64, hw fp64, or emulated fp64. I don't think we should be enabling this for everyone, just drivers who ask.

There is no r600 patch in this series. It's a one-liner, but the code does cause long compile times in both the glsl compiler and the r600 backend. However, I'd really like to get this stuff checked in so we have a known stable good base (it passes [1375/1375] skip: 5, pass: 1368, fail: 2 on r600 nosb at the moment). I think most of the remaining issues are not to be found in this code, but in fixes for the other parts of the stack.

Also, I'm not really interested in bikeshedding the nitty-gritty details of the fp64 emulation. The main goal for this code is to provide the fp64 bit so we can enable GL4.3 on evergreen GPUs. I don't think anyone is going to use it that often in practice, and if we can get it to the level that passes conformance (still WIP) then I'll be happy. I think optimising it to reduce CPU usage at compile time is way more important than optimising it to reduce GPU usage.

Dave.
[Mesa-dev] [PATCH 10/50] glsl: Add "built-in" functions to do uint_to_fp64(uint)
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 71 +
 src/compiler/glsl/builtin_functions.cpp |  4 ++
 src/compiler/glsl/builtin_functions.h   |  3 ++
 src/compiler/glsl/float64.glsl          | 22 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 101 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index 2dcaba40..c200447 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5283,3 +5283,74 @@ fp64_to_uint(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+uint_to_fp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0872 = new(mem_ctx) ir_variable(glsl_type::uint_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r0872);
+   ir_variable *const r0873 = body.make_temp(glsl_type::uvec2_type, "return_value");
+   /* IF CONDITION */
+   ir_expression *const r0875 = equal(r0872, body.constant(0u));
+   ir_if *f0874 = new(mem_ctx) ir_if(operand(r0875).val);
+   exec_list *const f0874_parent_instructions = body.instructions;
+
+      /* THEN INSTRUCTIONS */
+      body.instructions = &f0874->then_instructions;
+
+      body.emit(assign(r0873, ir_constant::zero(mem_ctx, glsl_type::uvec2_type), 0x03));
+
+
+      /* ELSE INSTRUCTIONS */
+      body.instructions = &f0874->else_instructions;
+
+      ir_variable *const r0876 = body.make_temp(glsl_type::int_type, "assignment_tmp");
+      ir_expression *const r0877 = equal(r0872, body.constant(0u));
+      ir_expression *const r0878 = expr(ir_unop_find_msb, r0872);
+      ir_expression *const r0879 = sub(body.constant(int(31)), r0878);
+      ir_expression *const r087A = expr(ir_triop_csel, r0877, body.constant(int(32)), r0879);
+      body.emit(assign(r0876, add(r087A, body.constant(int(21))), 0x01));
+
+      ir_variable *const r087B = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "z", ir_var_auto);
+      body.emit(r087B);
+      ir_expression *const r087C = sub(body.constant(int(1074)), r0876);
+      ir_expression *const r087D = expr(ir_unop_i2u, r087C);
+      ir_expression *const r087E = lshift(r087D, body.constant(int(20)));
+      ir_expression *const r087F = less(r0876, body.constant(int(32)));
+      ir_expression *const r0880 = neg(r0876);
+      ir_expression *const r0881 = bit_and(r0880, body.constant(int(31)));
+      ir_expression *const r0882 = rshift(r0872, r0881);
+      ir_expression *const r0883 = equal(r0876, body.constant(int(0)));
+      ir_expression *const r0884 = less(r0876, body.constant(int(64)));
+      ir_expression *const r0885 = add(r0876, body.constant(int(-32)));
+      ir_expression *const r0886 = lshift(r0872, r0885);
+      ir_expression *const r0887 = expr(ir_triop_csel, r0884, r0886, body.constant(0u));
+      ir_expression *const r0888 = expr(ir_triop_csel, r0883, body.constant(0u), r0887);
+      ir_expression *const r0889 = expr(ir_triop_csel, r087F, r0882, r0888);
+      body.emit(assign(r087B, add(r087E, r0889), 0x02));
+
+      ir_expression *const r088A = less(r0876, body.constant(int(32)));
+      ir_expression *const r088B = lshift(r0872, r0876);
+      ir_expression *const r088C = equal(r0876, body.constant(int(0)));
+      ir_expression *const r088D = expr(ir_triop_csel, r088C, r0872, body.constant(0u));
+      body.emit(assign(r087B, expr(ir_triop_csel, r088A, r088B, r088D), 0x01));
+
+      body.emit(assign(r0873, r087B, 0x03));
+
+
+   body.instructions = f0874_parent_instructions;
+   body.emit(f0874);
+
+   /* END IF */
+
+   body.emit(ret(r0873));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index a0fc9bc..20051b1 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3374,6 +3374,10 @@ builtin_builder::create_builtins()
                 generate_ir::fp64_to_uint(mem_ctx, integer_functions_supported),
                 NULL);
 
+   add_function("__builtin_uint_to_fp64",
+                generate_ir::uint_to_fp64(mem_ctx, integer_functions_supported),
+                NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h
index f99e3b7..a9674dc 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -91,6 +91,9 @@ fmul64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fp64_to_uint(void *mem_ctx,
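The generated builder code above is hard to read, so here is a hedged C++ sketch of the conversion it appears to encode: normalise the integer so its top set bit lands on bit 52 of the 64-bit pattern, then add an exponent of 1074 - shiftDist (the implicit bit carries the final +1 into the exponent field). The names U64Bits, count_leading_zeros32, join and bits_of are hypothetical helpers for this example, and ordinary if/else is used instead of csel because shift amounts of 32 or more are undefined in C++.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for GLSL uvec2: x = low 32 bits, y = high 32 bits.
struct U64Bits { uint32_t x; uint32_t y; };

// Number of leading zero bits; 32 for a == 0.
int count_leading_zeros32(uint32_t a) {
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0u && (a & bit) == 0u; bit >>= 1)
        ++n;
    return n;
}

// Convert an unsigned 32-bit integer to the bit pattern of the equal double.
U64Bits uint_to_fp64(uint32_t a) {
    if (a == 0u)
        return U64Bits{0u, 0u};
    int shiftDist = count_leading_zeros32(a) + 21;   // 21..52 for non-zero a
    uint32_t hi, lo;
    if (shiftDist < 32) {
        hi = a >> (32 - shiftDist);   // top bits, implicit bit lands on bit 20
        lo = a << shiftDist;
    } else {
        hi = a << (shiftDist - 32);
        lo = 0u;
    }
    // Adding (not OR-ing) lets the implicit bit bump the exponent by one.
    hi += uint32_t(1074 - shiftDist) << 20;
    return U64Bits{lo, hi};
}

// Test helpers: join the two words, and read the bits of a native double.
uint64_t join(U64Bits v) { return (uint64_t(v.y) << 32) | v.x; }
uint64_t bits_of(double d) {
    uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}
```

Every 32-bit integer is exactly representable as a double, so no rounding step is needed, and the result can be compared bit for bit against a native conversion.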
[Mesa-dev] [PATCH 09/50] glsl: Add "built-in" functions to do fp64_to_uint(fp64)
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 209
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl          |  61 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 278 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index ca56d3b..2dcaba40 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5074,3 +5074,212 @@ fmul64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+shift64Right(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0818 = new(mem_ctx) ir_variable(glsl_type::uint_type, "a0", ir_var_function_in);
+   sig_parameters.push_tail(r0818);
+   ir_variable *const r0819 = new(mem_ctx) ir_variable(glsl_type::uint_type, "a1", ir_var_function_in);
+   sig_parameters.push_tail(r0819);
+   ir_variable *const r081A = new(mem_ctx) ir_variable(glsl_type::int_type, "count", ir_var_function_in);
+   sig_parameters.push_tail(r081A);
+   ir_variable *const r081B = new(mem_ctx) ir_variable(glsl_type::uint_type, "z0Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r081B);
+   ir_variable *const r081C = new(mem_ctx) ir_variable(glsl_type::uint_type, "z1Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r081C);
+   ir_expression *const r081D = equal(r081A, body.constant(int(0)));
+   ir_expression *const r081E = less(r081A, body.constant(int(32)));
+   ir_expression *const r081F = neg(r081A);
+   ir_expression *const r0820 = bit_and(r081F, body.constant(int(31)));
+   ir_expression *const r0821 = lshift(r0818, r0820);
+   ir_expression *const r0822 = rshift(r0819, r081A);
+   ir_expression *const r0823 = bit_or(r0821, r0822);
+   ir_expression *const r0824 = less(r081A, body.constant(int(64)));
+   ir_expression *const r0825 = bit_and(r081A, body.constant(int(31)));
+   ir_expression *const r0826 = rshift(r0818, r0825);
+   ir_expression *const r0827 = expr(ir_triop_csel, r0824, r0826, body.constant(0u));
+   ir_expression *const r0828 = expr(ir_triop_csel, r081E, r0823, r0827);
+   body.emit(assign(r081C, expr(ir_triop_csel, r081D, r0818, r0828), 0x01));
+
+   ir_expression *const r0829 = equal(r081A, body.constant(int(0)));
+   ir_expression *const r082A = less(r081A, body.constant(int(32)));
+   ir_expression *const r082B = rshift(r0818, r081A);
+   ir_expression *const r082C = expr(ir_triop_csel, r082A, r082B, body.constant(0u));
+   body.emit(assign(r081B, expr(ir_triop_csel, r0829, r0818, r082C), 0x01));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+fp64_to_uint(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r082D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r082D);
+   ir_variable *const r082E = body.make_temp(glsl_type::uint_type, "return_value");
+   ir_variable *const r082F = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFracHi", ir_var_auto);
+   body.emit(r082F);
+   ir_variable *const r0830 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFracLo", ir_var_auto);
+   body.emit(r0830);
+   body.emit(assign(r0830, swizzle_x(r082D), 0x01));
+
+   ir_variable *const r0831 = body.make_temp(glsl_type::uint_type, "extractFloat64FracHi_retval");
+   body.emit(assign(r0831, bit_and(swizzle_y(r082D), body.constant(1048575u)), 0x01));
+
+   body.emit(assign(r082F, r0831, 0x01));
+
+   ir_variable *const r0832 = body.make_temp(glsl_type::int_type, "extractFloat64Exp_retval");
+   ir_expression *const r0833 = rshift(swizzle_y(r082D), body.constant(int(20)));
+   ir_expression *const r0834 = bit_and(r0833, body.constant(2047u));
+   body.emit(assign(r0832, expr(ir_unop_u2i, r0834), 0x01));
+
+   ir_variable *const r0835 = body.make_temp(glsl_type::uint_type, "extractFloat64Sign_retval");
+   body.emit(assign(r0835, rshift(swizzle_y(r082D), body.constant(int(31))), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r0837 = equal(r0832, body.constant(int(2047)));
+   ir_expression *const r0838 = bit_or(r0831, swizzle_x(r082D));
+   ir_expression *const r0839 = nequal(r0838, body.constant(0u));
+   ir_expression *const r083A = logic_and(r0837, r0839);
+   ir_if *f0836 = new(mem_ctx)
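The opening of fp64_to_uint above splits the input into its IEEE 754 fields before doing anything else. A minimal C++ sketch of that unpacking, using the masks and shifts visible in the builder code (1048575u, a shift by 20 with mask 2047u, a shift by 31), could look like this. U64Bits is a hypothetical stand-in for uvec2, and is_nan_fields is an illustrative name for the exp == 2047 and non-zero fraction test that forms the first IF condition.

```cpp
#include <cstdint>

// Hypothetical stand-in for GLSL uvec2: x = low 32 bits, y = high 32 bits.
struct U64Bits { uint32_t x; uint32_t y; };

// Low 32 bits of the 52-bit fraction.
uint32_t extractFloat64FracLo(U64Bits a) { return a.x; }

// High 20 bits of the 52-bit fraction (1048575u == 0x000FFFFF).
uint32_t extractFloat64FracHi(U64Bits a) { return a.y & 0x000FFFFFu; }

// 11-bit biased exponent (2047u == 0x7FF).
int extractFloat64Exp(U64Bits a) { return int((a.y >> 20) & 0x7FFu); }

// Sign bit as 0 or 1.
uint32_t extractFloat64Sign(U64Bits a) { return a.y >> 31; }

// NaN test expressed on the unpacked fields.
bool is_nan_fields(U64Bits a) {
    return extractFloat64Exp(a) == 2047 &&
           (extractFloat64FracHi(a) | extractFloat64FracLo(a)) != 0u;
}
```

For example, -2.5 has the bit pattern 0xC004000000000000, so the sign is 1, the biased exponent is 1024 and the high fraction word is 0x40000.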
[Mesa-dev] [PATCH 13/50] glsl: Add "built-in" functions to do fp64_to_fp32(fp64)
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 388
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl          | 100
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 496 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index b656fad..f937a2f 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5662,3 +5662,391 @@ int_to_fp64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+packFloat32(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::float_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r08EC = new(mem_ctx) ir_variable(glsl_type::uint_type, "zSign", ir_var_function_in);
+   sig_parameters.push_tail(r08EC);
+   ir_variable *const r08ED = new(mem_ctx) ir_variable(glsl_type::int_type, "zExp", ir_var_function_in);
+   sig_parameters.push_tail(r08ED);
+   ir_variable *const r08EE = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac", ir_var_function_in);
+   sig_parameters.push_tail(r08EE);
+   ir_variable *const r08EF = body.make_temp(glsl_type::float_type, "uintBitsToFloat_retval");
+   ir_expression *const r08F0 = lshift(r08EC, body.constant(int(31)));
+   ir_expression *const r08F1 = expr(ir_unop_i2u, r08ED);
+   ir_expression *const r08F2 = lshift(r08F1, body.constant(int(23)));
+   ir_expression *const r08F3 = add(r08F0, r08F2);
+   ir_expression *const r08F4 = add(r08F3, r08EE);
+   body.emit(assign(r08EF, expr(ir_unop_bitcast_u2f, r08F4), 0x01));
+
+   body.emit(ret(r08EF));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+roundAndPackFloat32(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::float_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r08F5 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zSign", ir_var_function_in);
+   sig_parameters.push_tail(r08F5);
+   ir_variable *const r08F6 = new(mem_ctx) ir_variable(glsl_type::int_type, "zExp", ir_var_function_in);
+   sig_parameters.push_tail(r08F6);
+   ir_variable *const r08F7 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zFrac", ir_var_function_in);
+   sig_parameters.push_tail(r08F7);
+   ir_variable *const r08F8 = body.make_temp(glsl_type::bool_type, "execute_flag");
+   body.emit(assign(r08F8, body.constant(true), 0x01));
+
+   ir_variable *const r08F9 = body.make_temp(glsl_type::float_type, "return_value");
+   ir_variable *const r08FA = new(mem_ctx) ir_variable(glsl_type::int_type, "roundBits", ir_var_auto);
+   body.emit(r08FA);
+   ir_expression *const r08FB = bit_and(r08F7, body.constant(127u));
+   body.emit(assign(r08FA, expr(ir_unop_u2i, r08FB), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r08FD = expr(ir_unop_i2u, r08F6);
+   ir_expression *const r08FE = gequal(r08FD, body.constant(253u));
+   ir_if *f08FC = new(mem_ctx) ir_if(operand(r08FE).val);
+   exec_list *const f08FC_parent_instructions = body.instructions;
+
+      /* THEN INSTRUCTIONS */
+      body.instructions = &f08FC->then_instructions;
+
+      /* IF CONDITION */
+      ir_expression *const r0900 = less(body.constant(int(253)), r08F6);
+      ir_expression *const r0901 = equal(r08F6, body.constant(int(253)));
+      ir_expression *const r0902 = expr(ir_unop_u2i, r08F7);
+      ir_expression *const r0903 = less(r0902, body.constant(int(-64)));
+      ir_expression *const r0904 = logic_and(r0901, r0903);
+      ir_expression *const r0905 = logic_or(r0900, r0904);
+      ir_if *f08FF = new(mem_ctx) ir_if(operand(r0905).val);
+      exec_list *const f08FF_parent_instructions = body.instructions;
+
+         /* THEN INSTRUCTIONS */
+         body.instructions = &f08FF->then_instructions;
+
+         ir_expression *const r0906 = lshift(r08F5, body.constant(int(31)));
+         ir_expression *const r0907 = add(r0906, body.constant(2139095040u));
+         body.emit(assign(r08F9, expr(ir_unop_bitcast_u2f, r0907), 0x01));
+
+         body.emit(assign(r08F8, body.constant(false), 0x01));
+
+
+         /* ELSE INSTRUCTIONS */
+         body.instructions = &f08FF->else_instructions;
+
+         ir_variable *const r0908 = body.make_temp(glsl_type::int_type, "assignment_tmp");
+         body.emit(assign(r0908, neg(r08F6), 0x01));
+
+         ir_variable *const r0909 = body.make_temp(glsl_type::bool_type, "assignment_tmp");
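packFloat32 in the builder code above boils down to one expression: (zSign << 31) + (zExp << 23) + zFrac, reinterpreted as a float. A small C++ sketch follows; pack_float32_bits and bits_to_float are names made up for this example, and the memcpy plays the role of uintBitsToFloat. The fields are added rather than OR-ed, so a fraction that overflows into bit 23 carries into the exponent field.

```cpp
#include <cstdint>
#include <cstring>

// Assemble a single-precision bit pattern from sign, biased exponent and fraction.
// Addition (not OR) lets an overflowing fraction carry into the exponent.
uint32_t pack_float32_bits(uint32_t zSign, int zExp, uint32_t zFrac) {
    return (zSign << 31) + (uint32_t(zExp) << 23) + zFrac;
}

// Reinterpret a 32-bit pattern as a float, like GLSL uintBitsToFloat().
float bits_to_float(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

As a usage example, pack_float32_bits(0u, 127, 0u) gives the pattern of 1.0f, and the overflow branch of roundAndPackFloat32 builds infinity the same way with the constant 2139095040u (0x7F800000).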
[Mesa-dev] [PATCH 08/50] glsl: Add "built-in" functions to do mul(fp64, fp64) (v2)
From: Elie Tournier

v2: use mix

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 1348 +++
 src/compiler/glsl/builtin_functions.cpp |    4 +
 src/compiler/glsl/builtin_functions.h   |    3 +
 src/compiler/glsl/float64.glsl          |  148
 src/compiler/glsl/glcpp/glcpp-parse.y   |    1 +
 5 files changed, 1504 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index 0ebfb42..ca56d3b 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -3726,3 +3726,1351 @@ fadd64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+mul32To64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r05FE = new(mem_ctx) ir_variable(glsl_type::uint_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r05FE);
+   ir_variable *const r05FF = new(mem_ctx) ir_variable(glsl_type::uint_type, "b", ir_var_function_in);
+   sig_parameters.push_tail(r05FF);
+   ir_variable *const r0600 = new(mem_ctx) ir_variable(glsl_type::uint_type, "z0Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r0600);
+   ir_variable *const r0601 = new(mem_ctx) ir_variable(glsl_type::uint_type, "z1Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r0601);
+   ir_variable *const r0602 = new(mem_ctx) ir_variable(glsl_type::uint_type, "z0", ir_var_auto);
+   body.emit(r0602);
+   ir_variable *const r0603 = new(mem_ctx) ir_variable(glsl_type::uint_type, "zMiddleA", ir_var_auto);
+   body.emit(r0603);
+   ir_variable *const r0604 = new(mem_ctx) ir_variable(glsl_type::uint_type, "z1", ir_var_auto);
+   body.emit(r0604);
+   ir_variable *const r0605 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0605, bit_and(r05FE, body.constant(65535u)), 0x01));
+
+   ir_variable *const r0606 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0606, rshift(r05FE, body.constant(int(16))), 0x01));
+
+   ir_variable *const r0607 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0607, bit_and(r05FF, body.constant(65535u)), 0x01));
+
+   ir_variable *const r0608 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0608, rshift(r05FF, body.constant(int(16))), 0x01));
+
+   ir_variable *const r0609 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0609, mul(r0606, r0607), 0x01));
+
+   ir_expression *const r060A = mul(r0605, r0608);
+   body.emit(assign(r0603, add(r060A, r0609), 0x01));
+
+   ir_expression *const r060B = mul(r0606, r0608);
+   ir_expression *const r060C = less(r0603, r0609);
+   ir_expression *const r060D = expr(ir_unop_b2i, r060C);
+   ir_expression *const r060E = expr(ir_unop_i2u, r060D);
+   ir_expression *const r060F = lshift(r060E, body.constant(int(16)));
+   ir_expression *const r0610 = rshift(r0603, body.constant(int(16)));
+   ir_expression *const r0611 = add(r060F, r0610);
+   body.emit(assign(r0602, add(r060B, r0611), 0x01));
+
+   body.emit(assign(r0603, lshift(r0603, body.constant(int(16))), 0x01));
+
+   ir_expression *const r0612 = mul(r0605, r0607);
+   body.emit(assign(r0604, add(r0612, r0603), 0x01));
+
+   ir_expression *const r0613 = less(r0604, r0603);
+   ir_expression *const r0614 = expr(ir_unop_b2i, r0613);
+   ir_expression *const r0615 = expr(ir_unop_i2u, r0614);
+   body.emit(assign(r0602, add(r0602, r0615), 0x01));
+
+   body.emit(assign(r0601, r0604, 0x01));
+
+   body.emit(assign(r0600, r0602, 0x01));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+mul64To128(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0616 = new(mem_ctx) ir_variable(glsl_type::uint_type, "a0", ir_var_function_in);
+   sig_parameters.push_tail(r0616);
+   ir_variable *const r0617 = new(mem_ctx) ir_variable(glsl_type::uint_type, "a1", ir_var_function_in);
+   sig_parameters.push_tail(r0617);
+   ir_variable *const r0618 = new(mem_ctx) ir_variable(glsl_type::uint_type, "b0", ir_var_function_in);
+   sig_parameters.push_tail(r0618);
+   ir_variable *const r0619 = new(mem_ctx) ir_variable(glsl_type::uint_type, "b1", ir_var_function_in);
+   sig_parameters.push_tail(r0619);
+   ir_variable *const r061A = new(mem_ctx) ir_variable(glsl_type::uint_type, "z0Ptr", ir_var_function_inout);
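The mul32To64 builder code above is the classic schoolbook multiply on 16-bit halves, with carries detected by unsigned comparisons. Transcribed into a C++ sketch for readability (reference parameters replace the inout z0Ptr/z1Ptr, and the local names aLow, aHigh, bLow, bHigh and zMiddleB are chosen for this example since the generated code only calls them assignment_tmp):

```cpp
#include <cstdint>

// Multiply two 32-bit values into a 64-bit result using only 32-bit arithmetic.
// z0 receives the high word, z1 the low word.
void mul32To64(uint32_t a, uint32_t b, uint32_t &z0, uint32_t &z1) {
    uint32_t aLow = a & 0x0000FFFFu;
    uint32_t aHigh = a >> 16;
    uint32_t bLow = b & 0x0000FFFFu;
    uint32_t bHigh = b >> 16;

    uint32_t zMiddleB = aHigh * bLow;
    uint32_t zMiddleA = aLow * bHigh + zMiddleB;   // may wrap around

    // High word: top product, plus the carry of the middle sum (worth 2^16 here),
    // plus the upper half of the middle sum.
    uint32_t hi = aHigh * bHigh +
                  ((uint32_t(zMiddleA < zMiddleB) << 16) + (zMiddleA >> 16));

    zMiddleA <<= 16;
    uint32_t lo = aLow * bLow + zMiddleA;          // may wrap around
    hi += uint32_t(lo < zMiddleA);                 // carry out of the low word

    z1 = lo;
    z0 = hi;
}
```

A wrapped unsigned sum is smaller than either operand, which is why `zMiddleA < zMiddleB` and `lo < zMiddleA` work as carry flags without any wider type.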
[Mesa-dev] [PATCH 14/50] glsl: Add "built-in" functions to do fp32_to_fp64(fp32)
From: Elie Tournier

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 192
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl          |  38 +++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 238 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index f937a2f..034d2d0 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6050,3 +6050,195 @@ fp64_to_fp32(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fp32_to_fp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r097F = new(mem_ctx) ir_variable(glsl_type::float_type, "f", ir_var_function_in);
+   sig_parameters.push_tail(r097F);
+   ir_variable *const r0980 = body.make_temp(glsl_type::bool_type, "execute_flag");
+   body.emit(assign(r0980, body.constant(true), 0x01));
+
+   ir_variable *const r0981 = body.make_temp(glsl_type::uvec2_type, "return_value");
+   ir_variable *const r0982 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aSign", ir_var_auto);
+   body.emit(r0982);
+   ir_variable *const r0983 = new(mem_ctx) ir_variable(glsl_type::int_type, "aExp", ir_var_auto);
+   body.emit(r0983);
+   ir_variable *const r0984 = new(mem_ctx) ir_variable(glsl_type::uint_type, "aFrac", ir_var_auto);
+   body.emit(r0984);
+   ir_variable *const r0985 = body.make_temp(glsl_type::uint_type, "floatBitsToUint_retval");
+   body.emit(assign(r0985, expr(ir_unop_bitcast_f2u, r097F), 0x01));
+
+   ir_variable *const r0986 = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+   body.emit(assign(r0986, bit_and(r0985, body.constant(8388607u)), 0x01));
+
+   body.emit(assign(r0984, r0986, 0x01));
+
+   ir_variable *const r0987 = body.make_temp(glsl_type::int_type, "assignment_tmp");
+   ir_expression *const r0988 = rshift(r0985, body.constant(int(23)));
+   ir_expression *const r0989 = bit_and(r0988, body.constant(255u));
+   body.emit(assign(r0987, expr(ir_unop_u2i, r0989), 0x01));
+
+   body.emit(assign(r0983, r0987, 0x01));
+
+   body.emit(assign(r0982, rshift(r0985, body.constant(int(31))), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r098B = equal(r0987, body.constant(int(255)));
+   ir_if *f098A = new(mem_ctx) ir_if(operand(r098B).val);
+   exec_list *const f098A_parent_instructions = body.instructions;
+
+      /* THEN INSTRUCTIONS */
+      body.instructions = &f098A->then_instructions;
+
+      /* IF CONDITION */
+      ir_expression *const r098D = nequal(r0986, body.constant(0u));
+      ir_if *f098C = new(mem_ctx) ir_if(operand(r098D).val);
+      exec_list *const f098C_parent_instructions = body.instructions;
+
+         /* THEN INSTRUCTIONS */
+         body.instructions = &f098C->then_instructions;
+
+         ir_variable *const r098E = body.make_temp(glsl_type::uint_type, "assignment_tmp");
+         body.emit(assign(r098E, lshift(r0985, body.constant(int(9))), 0x01));
+
+         ir_variable *const r098F = body.make_temp(glsl_type::uvec2_type, "vec_ctor");
+         ir_expression *const r0990 = lshift(r098E, body.constant(int(20)));
+         body.emit(assign(r098F, bit_or(r0990, body.constant(0u)), 0x01));
+
+         ir_expression *const r0991 = rshift(r098E, body.constant(int(12)));
+         ir_expression *const r0992 = lshift(r0982, body.constant(int(31)));
+         ir_expression *const r0993 = bit_or(r0992, body.constant(2146959360u));
+         body.emit(assign(r098F, bit_or(r0991, r0993), 0x02));
+
+         body.emit(assign(r0981, r098F, 0x03));
+
+         body.emit(assign(r0980, body.constant(false), 0x01));
+
+
+         /* ELSE INSTRUCTIONS */
+         body.instructions = &f098C->else_instructions;
+
+         ir_variable *const r0994 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "z", ir_var_auto);
+         body.emit(r0994);
+         ir_expression *const r0995 = lshift(r0982, body.constant(int(31)));
+         body.emit(assign(r0994, add(r0995, body.constant(2146435072u)), 0x02));
+
+         body.emit(assign(r0994, body.constant(0u), 0x01));
+
+         body.emit(assign(r0981, r0994, 0x03));
+
+         body.emit(assign(r0980, body.constant(false), 0x01));
+
+
+      body.instructions = f098C_parent_instructions;
+      body.emit(f098C);
+
+      /* END IF */
+
+
+      /* ELSE INSTRUCTIONS */
+      body.instructions = &f098A->else_instructions;
+
+      /* IF CONDITION */
+      ir_expression *const r0997 = equal(r0987, body.constant(int(0)));
+      ir_if *f0996 = new(mem_ctx) ir_if(operand(r0997).val);
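Since the excerpt stops before the zero/subnormal and normal paths, here is a hedged C++ sketch of the whole float-to-double widening. The NaN and infinity branches use the constants visible above (2146959360u = 0x7FF80000, 2146435072u = 0x7FF00000); the subnormal normalisation loop and the exponent rebias by 0x380 (1023 - 127) are assumptions based on the usual IEEE 754 layout, not something shown in the truncated patch. U64Bits, float_bits, join and bits_of are hypothetical helpers.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for GLSL uvec2: x = low 32 bits, y = high 32 bits.
struct U64Bits { uint32_t x; uint32_t y; };

// Widen a single-precision bit pattern to a double-precision bit pattern.
U64Bits fp32_to_fp64(uint32_t f) {
    uint32_t aFrac = f & 0x007FFFFFu;
    int aExp = int((f >> 23) & 0xFFu);
    uint32_t aSign = f >> 31;

    if (aExp == 0xFF) {
        if (aFrac != 0u) {
            // NaN: keep the payload and force the quiet bit, as in the builder code.
            uint32_t t = f << 9;
            return U64Bits{t << 20, (t >> 12) | (aSign << 31) | 0x7FF80000u};
        }
        return U64Bits{0u, (aSign << 31) + 0x7FF00000u};   // infinity
    }
    if (aExp == 0) {
        if (aFrac == 0u)
            return U64Bits{0u, aSign << 31};               // signed zero
        // Assumed subnormal handling: shift until the implicit bit appears.
        int shift = 0;
        while ((aFrac & 0x00800000u) == 0u) {
            aFrac <<= 1;
            ++shift;
        }
        aFrac &= 0x007FFFFFu;
        aExp = 1 - shift;
    }
    // Rebias the exponent and spread the 23 fraction bits over the 52-bit field.
    uint32_t hi = (aSign << 31) | (uint32_t(aExp + 0x380) << 20) | (aFrac >> 3);
    uint32_t lo = aFrac << 29;
    return U64Bits{lo, hi};
}

// Test helpers for comparing against native conversions.
uint32_t float_bits(float f) { uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }
uint64_t bits_of(double d) { uint64_t u; std::memcpy(&u, &d, sizeof u); return u; }
uint64_t join(U64Bits v) { return (uint64_t(v.y) << 32) | v.x; }
```

Because every float is exactly representable as a double, the widening never rounds, which is why this direction is so much shorter than fp64_to_fp32.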
[Mesa-dev] [PATCH 05/50] glsl: add utility function to extract 64-bit sign.
From: Elie Tournier

[airlied: left over from dropping le64]

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h | 18 ++
 src/compiler/glsl/float64.glsl      |  7 +++
 2 files changed, 25 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index 2340c48..6a8afea 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -200,3 +200,21 @@ feq64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+extractFloat64Sign(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0049 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r0049);
+   ir_expression *const r004A = rshift(swizzle_y(r0049), body.constant(int(31)));
+   body.emit(ret(r004A));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index 0cd7991..6d939c2 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -104,3 +104,10 @@ feq64(uvec2 a, uvec2 b)
       ((a.y == b.y) || ((a.x == 0u) && (((a.y | b.y)<<1) == 0u)));
    return mix(result, false, isaNaN || isbNaN);
 }
+
+/* Returns the sign bit of the double-precision floating-point value `a'.*/
+uint
+extractFloat64Sign(uvec2 a)
+{
+   return (a.y>>31);
+}
-- 
2.9.5
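The hunk context above shows only the tail of feq64. As a reading aid, here is a C++ sketch of what the full comparison plausibly looks like. The leading `(a.x == b.x) &&` term is an assumption (it is not visible in the excerpt), is_nan64 repeats the NaN test from the neg(fp64) patch, and U64Bits is a hypothetical stand-in for uvec2. The second half of the expression is what makes +0 and -0 compare equal.

```cpp
#include <cstdint>

// Hypothetical stand-in for GLSL uvec2: x = low 32 bits, y = high 32 bits.
struct U64Bits { uint32_t x; uint32_t y; };

// NaN test on raw bits: exponent all ones, fraction non-zero.
bool is_nan64(U64Bits a) {
    return (0xFFE00000u <= (a.y << 1)) &&
           ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
}

// Sign bit as 0 or 1, as in extractFloat64Sign.
uint32_t extractFloat64Sign(U64Bits a) { return a.y >> 31; }

// Equality: identical bits, or both values are zero regardless of sign.
// Any NaN operand forces the result to false.
bool feq64(U64Bits a, U64Bits b) {
    bool isaNaN = is_nan64(a);
    bool isbNaN = is_nan64(b);
    bool result = (a.x == b.x) &&   // assumed term, not shown in the excerpt
                  ((a.y == b.y) ||
                   ((a.x == 0u) && (((a.y | b.y) << 1) == 0u)));
    return (isaNaN || isbNaN) ? false : result;
}
```

The final ternary corresponds to mix(result, false, isaNaN || isbNaN) in the GLSL.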
[Mesa-dev] [PATCH 03/50] glsl: Add "built-in" function to do sign(fp64) (v2)
From: Elie Tournier

v2: use mix.

Signed-off-by: Elie Tournier
---
 src/compiler/glsl/builtin_float64.h     | 28
 src/compiler/glsl/builtin_functions.cpp |  4
 src/compiler/glsl/builtin_functions.h   |  3 +++
 src/compiler/glsl/float64.glsl          |  9 +
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 45 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h
index 2898fc9..8546048 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -68,3 +68,31 @@ fneg64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fsign64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+      new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r001D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
+   sig_parameters.push_tail(r001D);
+   ir_variable *const r001E = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "retval", ir_var_auto);
+   body.emit(r001E);
+   body.emit(assign(r001E, body.constant(0u), 0x01));
+
+   ir_expression *const r001F = lshift(swizzle_y(r001D), body.constant(int(1)));
+   ir_expression *const r0020 = bit_or(r001F, swizzle_x(r001D));
+   ir_expression *const r0021 = equal(r0020, body.constant(0u));
+   ir_expression *const r0022 = bit_and(swizzle_y(r001D), body.constant(2147483648u));
+   ir_expression *const r0023 = bit_or(r0022, body.constant(1072693248u));
+   body.emit(assign(r001E, expr(ir_triop_csel, r0021, body.constant(0u), r0023), 0x02));
+
+   body.emit(ret(r001E));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index 9d88a31..17aa868 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3350,6 +3350,10 @@ builtin_builder::create_builtins()
                 generate_ir::fneg64(mem_ctx, integer_functions_supported),
                 NULL);
 
+   add_function("__builtin_fsign64",
+                generate_ir::fsign64(mem_ctx, integer_functions_supported),
+                NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h
index adec424..7954373 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -73,6 +73,9 @@ fabs64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fneg64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+fsign64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index fedf8b7..f8eb1f3 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -51,3 +51,12 @@ fneg64(uvec2 a)
    a.y = mix(t, a.y, is_nan(a));
    return a;
 }
+
+uvec2
+fsign64(uvec2 a)
+{
+   uvec2 retval;
+   retval.x = 0u;
+   retval.y = mix((a.y & 0x80000000u) | 0x3FF00000u, 0u, (a.y << 1 | a.x) == 0u);
+   return retval;
+}
diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y b/src/compiler/glsl/glcpp/glcpp-parse.y
index b9506d8..666543b 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -2370,6 +2370,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
          add_builtin_define(parser, "__have_builtin_builtin_imod64", 1);
          add_builtin_define(parser, "__have_builtin_builtin_fabs64", 1);
          add_builtin_define(parser, "__have_builtin_builtin_fneg64", 1);
+         add_builtin_define(parser, "__have_builtin_builtin_fsign64", 1);
       }
    }
 
-- 
2.9.5
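fsign64 builds its result directly from bit patterns: the input's sign bit OR-ed with the high word of 1.0, or all zeros when the input is plus or minus zero. A C++ sketch of the same logic follows, using the constants that appear in the builder code as 2147483648u (0x80000000) and 1072693248u (0x3FF00000). U64Bits is a hypothetical stand-in for uvec2 and the ternary stands in for mix().

```cpp
#include <cstdint>

// Hypothetical stand-in for GLSL uvec2: x = low 32 bits, y = high 32 bits.
struct U64Bits { uint32_t x; uint32_t y; };

// sign(a) on raw bits: 0.0 for +/-0, otherwise +/-1.0 carrying a's sign bit.
U64Bits fsign64(U64Bits a) {
    U64Bits retval;
    retval.x = 0u;
    // Dropping the sign bit with << 1 makes +0 and -0 both test as zero.
    bool isZero = ((a.y << 1) | a.x) == 0u;
    retval.y = isZero ? 0u : ((a.y & 0x80000000u) | 0x3FF00000u);
    return retval;
}
```

For instance, 2.5 ({0u, 0x40040000u}) maps to {0u, 0x3FF00000u}, which is 1.0, and -3.0 maps to {0u, 0xBFF00000u}, which is -1.0.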
[Mesa-dev] [PATCH 01/50] glsl: Add "built-in" function to do abs(fp64)
From: Elie TournierSigned-off-by: Elie Tournier --- src/compiler/Makefile.sources | 1 + src/compiler/glsl/builtin_float64.h | 19 +++ src/compiler/glsl/builtin_functions.cpp | 4 src/compiler/glsl/builtin_functions.h | 3 +++ src/compiler/glsl/float64.glsl | 29 + src/compiler/glsl/generate_ir.cpp | 1 + src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 7 files changed, 58 insertions(+) create mode 100644 src/compiler/glsl/builtin_float64.h create mode 100644 src/compiler/glsl/float64.glsl diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index b29218e..ee223ae 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -22,6 +22,7 @@ LIBGLSL_FILES = \ glsl/builtin_functions.cpp \ glsl/builtin_functions.h \ glsl/builtin_int64.h \ + glsl/builtin_float64.h \ glsl/builtin_types.cpp \ glsl/builtin_variables.cpp \ glsl/generate_ir.cpp \ diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h new file mode 100644 index 000..7b57231 --- /dev/null +++ b/src/compiler/glsl/builtin_float64.h @@ -0,0 +1,19 @@ +ir_function_signature * +fabs64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail); + ir_factory body(>body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r000B = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r000B); + body.emit(assign(r000B, bit_and(swizzle_y(r000B), body.constant(2147483647u)), 0x02)); + + body.emit(ret(r000B)); + + sig->replace_parameters(_parameters); + return sig; +} diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp index 5f772c9..133a896 100644 --- a/src/compiler/glsl/builtin_functions.cpp +++ b/src/compiler/glsl/builtin_functions.cpp @@ -3342,6 +3342,10 @@ builtin_builder::create_builtins() generate_ir::umul64(mem_ctx, 
integer_functions_supported), NULL); + add_function("__builtin_fabs64", +generate_ir::fabs64(mem_ctx, integer_functions_supported), +NULL); + #undef F #undef FI #undef FIUD_VEC diff --git a/src/compiler/glsl/builtin_functions.h b/src/compiler/glsl/builtin_functions.h index 89ec9b7..deaf640 100644 --- a/src/compiler/glsl/builtin_functions.h +++ b/src/compiler/glsl/builtin_functions.h @@ -67,6 +67,9 @@ sign64(void *mem_ctx, builtin_available_predicate avail); ir_function_signature * udivmod64(void *mem_ctx, builtin_available_predicate avail); +ir_function_signature * +fabs64(void *mem_ctx, builtin_available_predicate avail); + } #endif /* BULITIN_FUNCTIONS_H */ diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl new file mode 100644 index 000..d798d7e --- /dev/null +++ b/src/compiler/glsl/float64.glsl @@ -0,0 +1,29 @@ +/* Compile with: + * + * glsl_compiler --version 130 --dump-builder float64.glsl > builtin_float64.h + * + */ + +#version 130 +#extension GL_ARB_shader_bit_encoding : enable + +/* Software IEEE floating-point rounding mode. + * GLSL spec section "4.7.1 Range and Precision": + * The rounding mode cannot be set and is undefined. + * But here, we are able to define the rounding mode at the compilation time. 
+ */ +#define FLOAT_ROUND_NEAREST_EVEN 0 +#define FLOAT_ROUND_TO_ZERO 1 +#define FLOAT_ROUND_DOWN 2 +#define FLOAT_ROUND_UP 3 +#define FLOAT_ROUNDING_MODE FLOAT_ROUND_NEAREST_EVEN + +/* Absolute value of a Float64 : + * Clear the sign bit + */ +uvec2 +fabs64(uvec2 a) +{ + a.y &= 0x7FFFFFFFu; + return a; +} diff --git a/src/compiler/glsl/generate_ir.cpp b/src/compiler/glsl/generate_ir.cpp index 255b048..e6ece48 100644 --- a/src/compiler/glsl/generate_ir.cpp +++ b/src/compiler/glsl/generate_ir.cpp @@ -29,5 +29,6 @@ using namespace ir_builder; namespace generate_ir { #include "builtin_int64.h" +#include "builtin_float64.h" } diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y b/src/compiler/glsl/glcpp/glcpp-parse.y index 913bce1..4e7affa 100644 --- a/src/compiler/glsl/glcpp/glcpp-parse.y +++ b/src/compiler/glsl/glcpp/glcpp-parse.y @@ -2368,6 +2368,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio add_builtin_define(parser, "__have_builtin_builtin_umod64", 1); add_builtin_define(parser, "__have_builtin_builtin_idiv64", 1); add_builtin_define(parser, "__have_builtin_builtin_imod64", 1); + add_builtin_define(parser, "__have_builtin_builtin_fabs64", 1); } } -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
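For readers who want to try the idea outside the GLSL compiler: the generated builder code for fabs64 ANDs the high word of the uvec2 with 2147483647u (0x7FFFFFFF), which clears the IEEE 754 sign bit. Below is a minimal C sketch of the same operation. It assumes the layout used in this series (x = low 32 bits, y = high 32 bits) and a host with IEEE 754 doubles; struct fp64_bits, pack_double, unpack_double and soft_fabs64 are hypothetical names for illustration, not Mesa API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GLSL's uvec2: x = low 32 bits, y = high 32 bits. */
struct fp64_bits {
    uint32_t x;
    uint32_t y;
};

/* Split a host double into its two 32-bit words. */
static struct fp64_bits pack_double(double d)
{
    uint64_t u;
    struct fp64_bits b;

    memcpy(&u, &d, sizeof u);
    b.x = (uint32_t)u;
    b.y = (uint32_t)(u >> 32);
    return b;
}

/* Reassemble a host double from the two 32-bit words. */
static double unpack_double(struct fp64_bits b)
{
    uint64_t u = ((uint64_t)b.y << 32) | b.x;
    double d;

    memcpy(&d, &u, sizeof d);
    return d;
}

/* Absolute value: clear bit 31 of the high word (the sign bit). */
static struct fp64_bits soft_fabs64(struct fp64_bits a)
{
    a.y &= 0x7FFFFFFFu;
    return a;
}
```

Because only the sign bit is touched, NaN payloads, infinities and denormals pass through unchanged, which is why no rounding-mode handling is needed for this particular builtin.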
[Mesa-dev] [PATCH 06/50] glsl: Add "built-in" functions to do lt(fp64, fp64)
From: Elie Tournier Signed-off-by: Elie Tournier --- src/compiler/glsl/builtin_float64.h | 135 src/compiler/glsl/builtin_functions.cpp | 4 + src/compiler/glsl/builtin_functions.h | 3 + src/compiler/glsl/float64.glsl | 42 ++ src/compiler/glsl/glcpp/glcpp-parse.y | 1 + 5 files changed, 185 insertions(+) diff --git a/src/compiler/glsl/builtin_float64.h b/src/compiler/glsl/builtin_float64.h index 6a8afea..f7e613f 100644 --- a/src/compiler/glsl/builtin_float64.h +++ b/src/compiler/glsl/builtin_float64.h @@ -218,3 +218,138 @@ extractFloat64Sign(void *mem_ctx, builtin_available_predicate avail) sig->replace_parameters(&sig_parameters); return sig; } +ir_function_signature * +lt64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail); + ir_factory body(&sig->body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r004B = new(mem_ctx) ir_variable(glsl_type::uint_type, "a0", ir_var_function_in); + sig_parameters.push_tail(r004B); + ir_variable *const r004C = new(mem_ctx) ir_variable(glsl_type::uint_type, "a1", ir_var_function_in); + sig_parameters.push_tail(r004C); + ir_variable *const r004D = new(mem_ctx) ir_variable(glsl_type::uint_type, "b0", ir_var_function_in); + sig_parameters.push_tail(r004D); + ir_variable *const r004E = new(mem_ctx) ir_variable(glsl_type::uint_type, "b1", ir_var_function_in); + sig_parameters.push_tail(r004E); + ir_expression *const r004F = less(r004B, r004D); + ir_expression *const r0050 = equal(r004B, r004D); + ir_expression *const r0051 = less(r004C, r004E); + ir_expression *const r0052 = logic_and(r0050, r0051); + ir_expression *const r0053 = logic_or(r004F, r0052); + body.emit(ret(r0053)); + + sig->replace_parameters(&sig_parameters); + return sig; +} +ir_function_signature * +flt64(void *mem_ctx, builtin_available_predicate avail) +{ + ir_function_signature *const sig = + new(mem_ctx) 
ir_function_signature(glsl_type::bool_type, avail); + ir_factory body(&sig->body, mem_ctx); + sig->is_defined = true; + + exec_list sig_parameters; + + ir_variable *const r0054 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in); + sig_parameters.push_tail(r0054); + ir_variable *const r0055 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, "b", ir_var_function_in); + sig_parameters.push_tail(r0055); + ir_variable *const r0056 = body.make_temp(glsl_type::bool_type, "return_value"); + ir_variable *const r0057 = new(mem_ctx) ir_variable(glsl_type::bool_type, "isbNaN", ir_var_auto); + body.emit(r0057); + ir_variable *const r0058 = new(mem_ctx) ir_variable(glsl_type::bool_type, "isaNaN", ir_var_auto); + body.emit(r0058); + ir_expression *const r0059 = rshift(swizzle_y(r0054), body.constant(int(20))); + ir_expression *const r005A = bit_and(r0059, body.constant(2047u)); + ir_expression *const r005B = expr(ir_unop_u2i, r005A); + ir_expression *const r005C = equal(r005B, body.constant(int(2047))); + ir_expression *const r005D = bit_and(swizzle_y(r0054), body.constant(1048575u)); + ir_expression *const r005E = bit_or(r005D, swizzle_x(r0054)); + ir_expression *const r005F = nequal(r005E, body.constant(0u)); + body.emit(assign(r0058, logic_and(r005C, r005F), 0x01)); + + ir_expression *const r0060 = rshift(swizzle_y(r0055), body.constant(int(20))); + ir_expression *const r0061 = bit_and(r0060, body.constant(2047u)); + ir_expression *const r0062 = expr(ir_unop_u2i, r0061); + ir_expression *const r0063 = equal(r0062, body.constant(int(2047))); + ir_expression *const r0064 = bit_and(swizzle_y(r0055), body.constant(1048575u)); + ir_expression *const r0065 = bit_or(r0064, swizzle_x(r0055)); + ir_expression *const r0066 = nequal(r0065, body.constant(0u)); + body.emit(assign(r0057, logic_and(r0063, r0066), 0x01)); + + /* IF CONDITION */ + ir_expression *const r0068 = logic_or(r0058, r0057); + ir_if *f0067 = new(mem_ctx) ir_if(operand(r0068).val); + exec_list *const 
f0067_parent_instructions = body.instructions; + + /* THEN INSTRUCTIONS */ + body.instructions = &f0067->then_instructions; + + body.emit(assign(r0056, body.constant(false), 0x01)); + + + /* ELSE INSTRUCTIONS */ + body.instructions = &f0067->else_instructions; + + ir_variable *const r0069 = body.make_temp(glsl_type::uint_type, "extractFloat64Sign_retval"); + body.emit(assign(r0069, rshift(swizzle_y(r0054), body.constant(int(31))), 0x01)); + + ir_variable *const r006A = body.make_temp(glsl_type::uint_type, "extractFloat64Sign_retval"); + body.emit(assign(r006A, rshift(swizzle_y(r0055), body.constant(int(31))), 0x01)); + + /* IF CONDITION */ + ir_expression *const r006C = nequal(r0069, r006A); + ir_if *f006B
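The patch body is cut off in this archive partway through flt64(), so here is a rough C sketch of what the visible part does, for anyone following along. The NaN test (exponent 0x7FF with a non-zero fraction) and the lt64() helper are read directly off the builder code above. The sign handling after that point is NOT visible here; the version below follows the usual SoftFloat-style comparison and should be treated as an assumption. All names (struct fp64_bits, pack_double, is_nan64, soft_flt64) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uvec2: x = low 32 bits, y = high 32 bits. */
struct fp64_bits {
    uint32_t x;
    uint32_t y;
};

static struct fp64_bits pack_double(double d)
{
    uint64_t u;
    struct fp64_bits b;

    memcpy(&u, &d, sizeof u);
    b.x = (uint32_t)u;
    b.y = (uint32_t)(u >> 32);
    return b;
}

/* Unsigned 64-bit "less than" on (high, low) word pairs, as in lt64(). */
static bool lt64(uint32_t a0, uint32_t a1, uint32_t b0, uint32_t b1)
{
    return a0 < b0 || (a0 == b0 && a1 < b1);
}

/* NaN: exponent field all ones and a non-zero 52-bit fraction. */
static bool is_nan64(struct fp64_bits a)
{
    uint32_t exp = (a.y >> 20) & 0x7FFu;
    uint32_t frac_hi = a.y & 0x000FFFFFu;

    return exp == 0x7FFu && (frac_hi | a.x) != 0u;
}

static bool soft_flt64(struct fp64_bits a, struct fp64_bits b)
{
    uint32_t a_sign, b_sign;

    /* Any comparison involving a NaN is false. */
    if (is_nan64(a) || is_nan64(b))
        return false;

    a_sign = a.y >> 31;
    b_sign = b.y >> 31;

    /* Assumed from here on: SoftFloat-style sign handling. */
    if (a_sign != b_sign) {
        /* a < b only if a is the negative one and they are not -0 and +0. */
        return a_sign != 0u && (((a.y | b.y) << 1) | a.x | b.x) != 0u;
    }

    /* Same sign: compare magnitudes, reversed when both are negative. */
    return a_sign ? lt64(b.y, b.x, a.y, a.x) : lt64(a.y, a.x, b.y, b.x);
}
```

The reason this works on raw bits is that, for same-sign finite values and infinities, the IEEE 754 encoding is monotonic in magnitude, so an unsigned integer compare of the two words gives the right ordering.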
[Mesa-dev] [PATCH] r600: fix abs for op3 sources
From: Roland Scheidegger If a src was referencing the same temp as the dst, the per-component copy code didn't work. e.g. cndge r0.xy, r0.xx, |r2|, r3 got expanded into mov r12.x, |r2| cndge r0.x, r0.x, r12, r3 mov r12.y, |r2| cndge r0.y, r0.x, r12, r3 hence for the second cndge r0.x was mistakenly the previous cndge result. Fix this by doing all the movs first, so there's no bogus alu.last in between. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905 --- src/gallium/drivers/r600/r600_shader.c | 110 + 1 file changed, 56 insertions(+), 54 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 6b5c42f86d..bd511c76ac 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -7076,33 +7076,42 @@ static int tgsi_helper_copy(struct r600_shader_ctx *ctx, struct tgsi_full_instru } static int tgsi_make_src_for_op3(struct r600_shader_ctx *ctx, - unsigned temp, int chan, + unsigned writemask, struct r600_bytecode_alu_src *bc_src, const struct r600_shader_src *shader_src) { struct r600_bytecode_alu alu; - int r; + int i, r; + int lasti = tgsi_last_instruction(writemask); + int temp_reg = 0; - r600_bytecode_src(bc_src, shader_src, chan); + r600_bytecode_src(&bc_src[0], shader_src, 0); + r600_bytecode_src(&bc_src[1], shader_src, 1); + r600_bytecode_src(&bc_src[2], shader_src, 2); + r600_bytecode_src(&bc_src[3], shader_src, 3); - /* op3 operands don't support abs modifier */ if (bc_src->abs) { - assert(temp!=0); /* we actually need the extra register, make sure it is allocated. */ - memset(&alu, 0, sizeof(struct r600_bytecode_alu)); - alu.op = ALU_OP1_MOV; - alu.dst.sel = temp; - alu.dst.chan = chan; - alu.dst.write = 1; + temp_reg = r600_get_temp(ctx); - alu.src[0] = *bc_src; - alu.last = true; // sufficient? 
- r = r600_bytecode_add_alu(ctx->bc, &alu); - if (r) - return r; - - memset(bc_src, 0, sizeof(*bc_src)); - bc_src->sel = temp; - bc_src->chan = chan; + for (i = 0; i < lasti + 1; i++) { + if (!(writemask & (1 << i))) + continue; + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); + alu.op = ALU_OP1_MOV; + alu.dst.sel = temp_reg; + alu.dst.chan = i; + alu.dst.write = 1; + alu.src[0] = bc_src[i]; + if (i == lasti) { + alu.last = 1; + } + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + memset(&bc_src[i], 0, sizeof(*bc_src)); + bc_src[i].sel = temp_reg; + bc_src[i].chan = i; + } } return 0; } @@ -7111,9 +7120,9 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, int dst) { struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction; struct r600_bytecode_alu alu; + struct r600_bytecode_alu_src srcs[4][4]; int i, j, r; int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask); - int temp_regs[4]; unsigned op = ctx->inst_info->op; if (op == ALU_OP3_MULADD_IEEE && @@ -7121,10 +7130,12 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, int dst) op = ALU_OP3_MULADD; for (j = 0; j < inst->Instruction.NumSrcRegs; j++) { - temp_regs[j] = 0; - if (ctx->src[j].abs) - temp_regs[j] = r600_get_temp(ctx); + r = tgsi_make_src_for_op3(ctx, inst->Dst[0].Register.WriteMask, + srcs[j], &ctx->src[j]); + if (r) + return r; } + for (i = 0; i < lasti + 1; i++) { if (!(inst->Dst[0].Register.WriteMask & (1 << i))) continue; @@ -7132,9 +7143,7 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, int dst) memset(&alu, 0, sizeof(struct r600_bytecode_alu)); alu.op = op; for (j = 0; j < inst->Instruction.NumSrcRegs; j++) { - r = tgsi_make_src_for_op3(ctx, temp_regs[j], i, &alu.src[j], &ctx->src[j]); - if (r) - return r; + alu.src[j] = srcs[j][i]; }
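To make the hazard in the commit message concrete, here is a small toy model in C. It is NOT the r600 assembler: it simply assumes that instructions in one group (everything up to and including the one marked last) read register values from before the group started, and that writes land when the group closes. Under that assumption, the old interleaved expansion of cndge r0.xy, r0.xx, |r2|, r3 lets the second cndge see the already-overwritten r0.x, while the movs-first ordering does not. All names (toy_alu, toy_run, and so on) and the register values are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NUM_REGS 16

enum toy_op { TOY_MOV_ABS, TOY_CNDGE };

struct toy_alu {
    enum toy_op op;
    int dst_reg, dst_chan;
    int src_reg[3], src_chan[3];
    bool last;              /* closes the current instruction group */
};

/* Old expansion: a mov with last set sits between the two cndge. */
static const struct toy_alu toy_interleaved[] = {
    { TOY_MOV_ABS, 12, 0, { 2, 0, 0 },  { 0, 0, 0 }, true  },
    { TOY_CNDGE,    0, 0, { 0, 12, 3 }, { 0, 0, 0 }, false },
    { TOY_MOV_ABS, 12, 1, { 2, 0, 0 },  { 1, 0, 0 }, true  },
    { TOY_CNDGE,    0, 1, { 0, 12, 3 }, { 0, 1, 1 }, true  },
};

/* Fixed expansion: all movs first, then both cndge in one group. */
static const struct toy_alu toy_movs_first[] = {
    { TOY_MOV_ABS, 12, 0, { 2, 0, 0 },  { 0, 0, 0 }, false },
    { TOY_MOV_ABS, 12, 1, { 2, 0, 0 },  { 1, 0, 0 }, true  },
    { TOY_CNDGE,    0, 0, { 0, 12, 3 }, { 0, 0, 0 }, false },
    { TOY_CNDGE,    0, 1, { 0, 12, 3 }, { 0, 1, 1 }, true  },
};

/* r0.x = -1, r2 = (-5, -6), r3 = (7, 8); everything else is zero. */
static void toy_init_regs(float regs[TOY_NUM_REGS][4])
{
    memset(regs, 0, sizeof(float) * TOY_NUM_REGS * 4);
    regs[0][0] = -1.0f;
    regs[2][0] = -5.0f;
    regs[2][1] = -6.0f;
    regs[3][0] = 7.0f;
    regs[3][1] = 8.0f;
}

/* Sources are read right away; writes become visible when a group closes. */
static void toy_run(float regs[TOY_NUM_REGS][4],
                    const struct toy_alu *prog, size_t count)
{
    float pending[TOY_NUM_REGS][4];
    bool dirty[TOY_NUM_REGS][4];

    memset(pending, 0, sizeof pending);
    memset(dirty, 0, sizeof dirty);

    for (size_t i = 0; i < count; i++) {
        const struct toy_alu *a = &prog[i];
        float s0 = regs[a->src_reg[0]][a->src_chan[0]];
        float s1 = regs[a->src_reg[1]][a->src_chan[1]];
        float s2 = regs[a->src_reg[2]][a->src_chan[2]];
        float result;

        if (a->op == TOY_MOV_ABS)
            result = s0 < 0.0f ? -s0 : s0;   /* mov dst, |src0| */
        else
            result = s0 >= 0.0f ? s1 : s2;   /* cndge */

        pending[a->dst_reg][a->dst_chan] = result;
        dirty[a->dst_reg][a->dst_chan] = true;

        if (!a->last)
            continue;

        /* Group closes: commit every pending write. */
        for (int r = 0; r < TOY_NUM_REGS; r++) {
            for (int c = 0; c < 4; c++) {
                if (dirty[r][c]) {
                    regs[r][c] = pending[r][c];
                    dirty[r][c] = false;
                }
            }
        }
    }
}
```

With these inputs, r0.x is negative, so both components should select r3, giving r0.xy = (7, 8). In the toy model the interleaved program ends with r0.y = 6 instead, because the second cndge reads the freshly written r0.x = 7.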
[Mesa-dev] [PATCH 1/2] r600: add simple ib dumping under a env var
From: Dave Airlie I've used this a lot when developing, and keep rebasing it around; it seems like it could be useful to have upstream. Setting R600_DUMP will write a /tmp/rad_dump_<N>.txt file for every command stream submitted to the hw. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/eg_debug.c | 16 src/gallium/drivers/r600/r600_hw_context.c | 4 src/gallium/drivers/r600/r600_pipe.h | 2 ++ 3 files changed, 22 insertions(+) diff --git a/src/gallium/drivers/r600/eg_debug.c b/src/gallium/drivers/r600/eg_debug.c index ceb7c16..990bd56 100644 --- a/src/gallium/drivers/r600/eg_debug.c +++ b/src/gallium/drivers/r600/eg_debug.c @@ -359,3 +359,19 @@ void eg_dump_debug_state(struct pipe_context *ctx, FILE *f, radeon_clear_saved_cs(>last_gfx); r600_resource_reference(>last_trace_buf, NULL); } + +void eg_dump_ib_to_file(struct r600_context *rctx, + struct radeon_winsys_cs *cs) +{ + static int ib_dump_id = 0; + char name[128]; + FILE *fl; + ib_dump_id++; + + snprintf(name, 127, "/tmp/rad_dump_%d.txt", ib_dump_id); + fl = fopen(name, "w+"); + eg_parse_ib(fl, cs->current.buf, cs->current.cdw, + -1, "IB", rctx->b.chip_class, + NULL, NULL); + fclose(fl); +} diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index 3ce1825..3031cdd 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -293,6 +293,10 @@ void r600_context_gfx_flush(void *context, unsigned flags, r600_resource_reference(&ctx->last_trace_buf, ctx->trace_buf); r600_resource_reference(&ctx->trace_buf, NULL); } + + if (getenv("R600_DUMP")) + eg_dump_ib_to_file(ctx, cs); + /* Flush the CS. 
*/ ws->cs_flush(cs, flags, &ctx->b.last_gfx_fence); if (fence) diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index 6d09093..87978ee 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -1077,4 +1077,6 @@ void r600_update_compressed_resource_state(struct r600_context *rctx, bool compu void eg_setup_buffer_constants(struct r600_context *rctx, int shader_type); void r600_update_driver_const_buffers(struct r600_context *rctx, bool compute_only); +void eg_dump_ib_to_file(struct r600_context *rctx, + struct radeon_winsys_cs *cs); #endif -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
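One observation on the dump helper as posted: it does not check the fopen() result before writing and closing. A minimal standalone sketch of the same "numbered dump file behind an environment variable" pattern, with that check added, could look like the following. It only writes raw dwords as hex; it does not call eg_parse_ib(), and toy_dump_enabled() and toy_dump_words() are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns non-zero if dumping was requested via the environment. */
static int toy_dump_enabled(void)
{
    return getenv("R600_DUMP") != NULL;
}

/*
 * Write ndw dwords to <dir>/rad_dump_<id>.txt, one hex word per line.
 * Returns 0 on success and -1 if the file could not be opened.
 */
static int toy_dump_words(const char *dir, const uint32_t *buf, unsigned ndw)
{
    static int dump_id = 0;
    char name[128];
    FILE *fl;

    dump_id++;
    snprintf(name, sizeof(name), "%s/rad_dump_%d.txt", dir, dump_id);

    fl = fopen(name, "w");
    if (!fl)
        return -1;      /* e.g. directory missing or not writable */

    for (unsigned i = 0; i < ndw; i++)
        fprintf(fl, "0x%08x\n", (unsigned)buf[i]);

    fclose(fl);
    return 0;
}
```

A caller would gate the dump the same way the patch does, e.g. if (toy_dump_enabled()) toy_dump_words("/tmp", words, count); the static counter is not thread-safe, which is probably fine for a debug-only path but worth keeping in mind.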
[Mesa-dev] [PATCH 2/2] r600: assert on double opcodes if we hit them
From: Dave Airlie This asserts on any double opcode getting into the shader assembler on GPUs that don't support them. This is a better way to find holes in the soft fp64 coverage than GPU hangs. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/r600_shader.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 6b5c42f..c2f5b8d 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -402,6 +402,17 @@ static bool ctx_needs_stack_workaround_8xx(struct r600_shader_ctx *ctx) return true; } +static bool ctx_has_doubles(struct r600_shader_ctx *ctx) +{ + if (ctx->bc->family == CHIP_ARUBA || + ctx->bc->family == CHIP_CAYMAN || + ctx->bc->family == CHIP_CYPRESS || + ctx->bc->family == CHIP_HEMLOCK) + return true; + else + return false; +} + static int tgsi_last_instruction(unsigned writemask) { int i, lasti = 0; @@ -4419,6 +4430,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool int use_tmp = 0; int swizzle_x = inst->Src[0].Register.SwizzleX; + assert (ctx_has_doubles(ctx)); if (singledest) { switch (write_mask) { case 0x1: @@ -4568,6 +4580,7 @@ static int tgsi_op3_64(struct r600_shader_ctx *ctx) int lasti = 3; int tmp = r600_get_temp(ctx); + assert (ctx_has_doubles(ctx)); for (i = 0; i < lasti + 1; i++) { memset(&alu, 0, sizeof(struct r600_bytecode_alu)); @@ -4987,6 +5000,7 @@ static int cayman_emit_double_instr(struct r600_shader_ctx *ctx) int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask); int t1 = ctx->temp_reg; + assert (ctx_has_doubles(ctx)); /* should only be one src regs */ assert(inst->Instruction.NumSrcRegs == 1); -- 2.9.5
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Quoting Mark Janes (2018-03-12 14:59:29) > Dylan Bakerwrites: > > > Quoting Mark Janes (2018-03-12 12:40:47) > >> Handling a screw-up could be done by maintainers by force-pushing the > >> commits off the WIP branch, and adding some annotations that prevent the > >> broken commit from being re-applied to WIP by automation. > >> > > > > That sounds like introducing a lot of developer headaches, the kind that > > make > > people not want to use the system. Take this scenario: > > > > 1. I push patches > > In this case the person is a developer, not a release manager > > > 2. CI starts > > 3. you push patches > > I'll call this person "developer 2" below. > > > 4. My CI fails > > At this point, developer 1 needs feedback that their patch nearly > created a problem for many end-users. Frankly, it's unacceptable for > developers to annotate a commit for the stable branches unless they are > confident that it is a *safe and necessary* fix for end-users. We have > almost zero verification between the developer and millions of users. > > > 5. I force-push > > A release manager would have to resolve the failure manually, not > developer 1. Developers can't force-push anything in my proposal. > > > Now both of our patches are removed, even though yours haven't gone through > > CI > > at all. > > Release manager would manually drop patches from developer 1, and leave > the patches from developer 2. CI would re-test patches from developer 2. > > > And if our tool isn't smart enough it will block your patches as well. > > In fact, I can't think of a way to make force pushes on a branch that > > multiple people work on *not* have race conditions. > > I agree that there is a race condition here. Right now our race > condition covers a weeks long window between the time a developer CC's > stable, and when the release manager starts applying patches. With an > automated implementation, the window narrows to a day or so. > > > I think that we should either: > > 1. 
Use gitlab and have CI run on PRs as well as on merged code. Either the PR > > will be red and gitlab can block the merge, or it will be green. It should be > > possible to have gitlab block code that cannot be cleanly merged. > > 2. Use merges and reverts. > > My 2 cents: choosing a specific git service is a step in the wrong > direction for mesa. I agree that providing a branch to a release > manager may be preferable to email, in the cases where a developer has > to backport patches. I think that letting the release manager take branches would be superior; that would mean that only the release manager should be doing pushes at that point, and some of the pain of force pushes is removed (the pain of tracking such a branch isn't, but the pain of pushing is). I bring up gitlab because the plan seems to be (sh) that fdo is migrating to gitlab as a whole, even if individual projects are free to continue using mailing lists instead of pull requests. Dylan
[Mesa-dev] [Bug 105444] Enable GL disk shader cache when transform feedback is enabled
https://bugs.freedesktop.org/show_bug.cgi?id=105444 Jordan Justen changed: What |Removed |Added Status |NEEDINFO |NEW --- Comment #1 from Jordan Justen --- From IRC, Tim mentions that we need to add prog->TransformFeedback->VaryingNames into the sha1 in shader_cache_read_program_metadata, similar to AttributeBindings.
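To illustrate why the varying names have to be part of the hash: two programs with identical shader source but different transform feedback varyings must not map to the same cache entry. The sketch below uses a toy FNV-1a hash purely as a stand-in for the sha1 the shader cache really uses; toy_hash_update() and toy_program_key() are hypothetical names and this is not the Mesa code path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy FNV-1a update step, standing in for a real sha1 update. */
static uint64_t toy_hash_update(uint64_t h, const void *data, size_t size)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < size; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;      /* 64-bit FNV prime */
    }
    return h;
}

/* Build a cache key from the shader source plus the xfb varying names. */
static uint64_t toy_program_key(const char *shader_source,
                                const char *const *varying_names,
                                unsigned num_varyings)
{
    uint64_t h = 14695981039346656037ULL;   /* 64-bit FNV offset basis */

    h = toy_hash_update(h, shader_source, strlen(shader_source));

    /* Hash the count and each name including its NUL terminator, so that
     * {"ab", "c"} and {"a", "bc"} do not produce the same byte stream. */
    h = toy_hash_update(h, &num_varyings, sizeof num_varyings);
    for (unsigned i = 0; i < num_varyings; i++)
        h = toy_hash_update(h, varying_names[i], strlen(varying_names[i]) + 1);

    return h;
}
```

Without the loop over the names, a program that captures gl_Position and one that captures a different varying would share a key, and the cache could hand back a binary built for the wrong transform feedback setup.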
Re: [Mesa-dev] [PATCH] meson: don't use compiler.has_header
Acked-by: Matt Turner Thanks!
Re: [Mesa-dev] [PATCH 1/7] vulkan: Add KHR_display extension to anv and radv using DRM
Jason Ekstrand writes: > On Fri, Feb 23, 2018 at 3:43 PM, Keith Packard wrote: > > Once we're sure that's what we want, create an MR against the spec that > just adds enough to the XML to reserve your extension number. That will > get merged almost immediately. Then make a second one with the actual > extension text and we'll iterate on that either in Khronos gitlab or, if > you prefer, you can send it as a patch to mesa-dev and then make a Khronos > MR once it's baked. I just wrote up the full extension description for both extensions I need (the one for passing a KMS fd to the driver, and the second to get the GPU timestamp for doing GOOGLE_display_timing): https://github.com/keith-packard/Vulkan-Docs > See also my comments about GEM handle ownership. Yeah, I think I've got that all cleaned up now -- the code no longer shares the same file for rendering and display. -- -keith
Re: [Mesa-dev] [PATCH 2/2] main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED
On 2018-03-12 14:51:56, Timothy Arceri wrote: > This only seems to be needed by i965. Alternatively, can't you just remove the: > > if (prog->sh.data->LinkStatus != LINKING_SKIPPED) > goto fail; > > from brw_disk_cache_upload_program() and let the cache search do its job? > > I believe it's possible to end up linking the GLSL IR i.e. > prog->sh.data->LinkStatus == LINKING_SUCCESS but still have the i965 > binary in the cache (although I guess that's a pretty big corner case). In the cover letter I mentioned that possibility, and why I decided to start with this instead. I would like to cover this corner case eventually, but I thought maybe tackling xform feedback might be a good next step. Regarding this patch, do you agree that this is perhaps a more accurate status than LINKING_SUCCESS? -Jordan > On 12/03/18 11:25, Jordan Justen wrote: > > This change allows the disk shader cache to work with programs loaded > > with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set, > > then they try to use the shader cache. > > > > Since the program loaded by ProgramBinary is similar to loading the > > shader from the disk cache, this is probably more appropriate. > > > > Cc: Timothy Arceri > > Signed-off-by: Jordan Justen > > --- > > src/mesa/main/program_binary.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/mesa/main/program_binary.c b/src/mesa/main/program_binary.c > > index 3df70059342..021f6315e72 100644 > > --- a/src/mesa/main/program_binary.c > > +++ b/src/mesa/main/program_binary.c > > @@ -287,5 +287,5 @@ _mesa_program_binary(struct gl_context *ctx, struct > > gl_shader_program *sh_prog, > > return; > > } > > > > - sh_prog->data->LinkStatus = LINKING_SUCCESS; > > + sh_prog->data->LinkStatus = LINKING_SKIPPED; > > } > >
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Dylan Bakerwrites: > Quoting Mark Janes (2018-03-12 12:40:47) >> Handling a screw-up could be done by maintainers by force-pushing the >> commits off the WIP branch, and adding some annotations that prevent the >> broken commit from being re-applied to WIP by automation. >> > > That sounds like introducing a lot of developer headaches, the kind that make > people not want to use the system. Take this scenario: > > 1. I push patches In this case the person is a developer, not a release manager > 2. CI starts > 3. you push patches I'll call this person "developer 2" below. > 4. My CI fails At this point, developer 1 needs feedback that their patch nearly created a problem for many end-users. Frankly, it's unacceptable for developers to annotate a commit for the stable branches unless they are confident that it is a *safe and necessary* fix for end-users. We have almost zero verification between the developer and millions of users. > 5. I force-push A release manager would have to resolve the failure manually, not developer 1. Developers can't force-push anything in my proposal. > Now both of our patches are removed, even though yours haven't gone through CI > at all. Release manager would manually drop patches from developer 1, and leave the patches from developer 2. CI would re-test patches from developer 2. > And if our tool isn't smart enough it will block your patches as well. > In fact, I can't think of a way to make force pushes on a branch that > multiple people work on *not* have race conditions. I agree that there is a race condition here. Right now our race condition covers a weeks long window between the time a developer CC's stable, and when the release manager starts applying patches. With an automated implementation, the window narrows to a day or so. > I think that we should either: > 1. Use gitlab and have CI run on PRs as well as on merged code. Either the PR >will be red and gitlab can block the merge, or it will be green. 
It should be > possible to have gitlab block code that cannot be cleanly merged. > 2. Use merges and reverts. My 2 cents: choosing a specific git service is a step in the wrong direction for mesa. I agree that providing a branch to a release manager may be preferable to email, in the cases where a developer has to backport patches.
Re: [Mesa-dev] [PATCH 1/2] glsl/serialize: Save shader program metadata sha1
Reviewed-by: Timothy Arceri On 12/03/18 11:25, Jordan Justen wrote: When the shader cache is used, this can be generated. In fact, the shader cache uses this sha1 to look up the serialized GL shader program. If a GL shader program is restored with ProgramBinary, the shaders are not available, and therefore the correct sha1 cannot be generated. If this is restored, then we can use the shader cache to restore the binary programs to the program that was loaded with ProgramBinary. Cc: Timothy Arceri Signed-off-by: Jordan Justen --- src/compiler/glsl/serialize.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/compiler/glsl/serialize.cpp b/src/compiler/glsl/serialize.cpp index 9d2033bddfa..1fdbaa990f4 100644 --- a/src/compiler/glsl/serialize.cpp +++ b/src/compiler/glsl/serialize.cpp @@ -1163,6 +1163,8 @@ extern "C" void serialize_glsl_program(struct blob *blob, struct gl_context *ctx, struct gl_shader_program *prog) { + blob_write_bytes(blob, prog->data->sha1, sizeof(prog->data->sha1)); + write_uniforms(blob, prog); write_hash_tables(blob, prog); @@ -1219,6 +1221,8 @@ deserialize_glsl_program(struct blob_reader *blob, struct gl_context *ctx, assert(prog->data->UniformStorage == NULL); + blob_copy_bytes(blob, prog->data->sha1, sizeof(prog->data->sha1)); + read_uniforms(blob, prog); read_hash_tables(blob, prog);
Re: [Mesa-dev] [PATCH 2/2] main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED
This only seems to be needed by i965. Alternatively, can't you just remove the: if (prog->sh.data->LinkStatus != LINKING_SKIPPED) goto fail; from brw_disk_cache_upload_program() and let the cache search do its job? I believe it's possible to end up linking the GLSL IR i.e. prog->sh.data->LinkStatus == LINKING_SUCCESS but still have the i965 binary in the cache (although I guess that's a pretty big corner case). On 12/03/18 11:25, Jordan Justen wrote: This change allows the disk shader cache to work with programs loaded with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set, then they try to use the shader cache. Since the program loaded by ProgramBinary is similar to loading the shader from the disk cache, this is probably more appropriate. Cc: Timothy Arceri Signed-off-by: Jordan Justen --- src/mesa/main/program_binary.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/program_binary.c b/src/mesa/main/program_binary.c index 3df70059342..021f6315e72 100644 --- a/src/mesa/main/program_binary.c +++ b/src/mesa/main/program_binary.c @@ -287,5 +287,5 @@ _mesa_program_binary(struct gl_context *ctx, struct gl_shader_program *sh_prog, return; } - sh_prog->data->LinkStatus = LINKING_SUCCESS; + sh_prog->data->LinkStatus = LINKING_SKIPPED; }
[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values
https://bugs.freedesktop.org/show_bug.cgi?id=105464 --- Comment #2 from Philip Rebohle --- Created attachment 138044 --> https://bugs.freedesktop.org/attachment.cgi?id=138044&action=edit Tessellation demo screenshot
Re: [Mesa-dev] [PATCH 2/3] nv50, nvc0: Support BGRX1010102 and RGBX1010102 for sampling.
Reviewed-by: Ilia Mirkin It should be possible to get rendering on them BTW, with a bit of state fixups for DST_ALPHA blending. Just haven't gotten around to it. On Mon, Mar 12, 2018 at 4:45 PM, Mario Kleiner wrote: > Add them as usable for textures, so they can be used by > Wayland drm in 10 bpc mode and for X11 compositing under > GLX and EGL. We need these formats to be supported at > least for sampling, otherwise GLX_texture_from_pixmap > and the equivalent EGL image extension won't work with > X11 drawables of depth 30 and just display an all black > window. > > Do not expose these formats as renderable, and thereby > not as a fbconfig/EGLConfig/Visual, as NVidia hw does > not support 10 bpc unorm formats without alpha channel. > > Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing, > and under Wayland+Weston drm backend with a Tesla and > Pascal gpu. > > Signed-off-by: Mario Kleiner > --- > src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c > b/src/gallium/drivers/nouveau/nv50/nv50_formats.c > index fc5deac..0ead8ac 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c > +++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c > @@ -153,7 +153,9 @@ const struct nv50_format > nv50_format_table[PIPE_FORMAT_COUNT] = > F3(A, R9G9B9E5_FLOAT, NONE, R, G, B, xx, FLOAT, E5B9G9R9_SHAREDEXP, T), > > C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, > TD), > + F3(A, R10G10B10X2_UNORM, RGB10_A2_UNORM, R, G, B, xx, UNORM, A2B10G10R10, > T), > C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, > IB), > + F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, > T), > C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T), > C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T), > C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR), > -- > 2.7.4
[Mesa-dev] [PATCH 2/3] nv50, nvc0: Support BGRX1010102 and RGBX1010102 for sampling.
Add them as usable for textures, so they can be used by Wayland drm in 10 bpc mode and for X11 compositing under GLX and EGL. We need these formats to be supported at least for sampling, otherwise GLX_texture_from_pixmap and the equivalent EGL image extension won't work with X11 drawables of depth 30 and just display an all-black window. Do not expose these formats as renderable, and thereby not as a fbconfig/EGLConfig/Visual, as NVidia hw does not support 10 bpc unorm formats without alpha channel. Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing, and under Wayland+Weston drm backend with a Tesla and Pascal gpu. Signed-off-by: Mario Kleiner --- src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c b/src/gallium/drivers/nouveau/nv50/nv50_formats.c index fc5deac..0ead8ac 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c @@ -153,7 +153,9 @@ const struct nv50_format nv50_format_table[PIPE_FORMAT_COUNT] = F3(A, R9G9B9E5_FLOAT, NONE, R, G, B, xx, FLOAT, E5B9G9R9_SHAREDEXP, T), C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, TD), + F3(A, R10G10B10X2_UNORM, RGB10_A2_UNORM, R, G, B, xx, UNORM, A2B10G10R10, T), C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, IB), + F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, T), C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T), C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T), C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR), -- 2.7.4
[Mesa-dev] More pieces for xbgr2101010/abgr2101010 support.
These are needed together with Daniel Stone's 10 bpc bgr patches to make nouveau's 10 bpc support more complete. All tested on nouveau on a nv96 as primary/display gpu and also with a radeon as prime renderoffload gpu, and then the other way round with radeon primary + nouveau renderoffload. Also tested under X11 DRI2 and DRI3/Present, GLX and EGL, composited and unredirected, and with Wayland's Weston, both normally and with prime renderoffload. Patch 1 completes Daniel's patches. Patch 2 makes weston work on nouveau with gbm-format=xbgr2101010, and enables x11 compositing of depth 30 drawables. Patch 3 makes sure we get the right colors when compositing on x11 + EGL. Some patches on top of weston master to test gbm-format=xbgr2101010 are here: https://github.com/kleinerm/weston/tree/westonnew10bpc -mario
[Mesa-dev] [PATCH 3/3] egl/x11: Handle both depth 30 formats for eglCreateImage().
We need to distinguish if a backing pixmap of a window is XRGB2101010 or XBGR2101010, as different gpu hw supports different formats. NVidia hw prefers XBGR, whereas AMD and Intel are happy with XRGB.

We use the red channel mask of the visual to distinguish at depth 30, but because we can't easily get the associated visual of a Pixmap, we use the visual of the x-screens root window instead as a proxy.

This fixes desktop composition of color depth 30 windows when the X11 compositor uses EGL.

Signed-off-by: Mario Kleiner
---
 src/egl/drivers/dri2/egl_dri2.h          |  7 ++
 src/egl/drivers/dri2/platform_x11.c      | 37 +++-
 src/egl/drivers/dri2/platform_x11_dri3.c |  7 +-
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index d36d02c..a399b06 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -402,6 +402,8 @@ EGLBoolean
 dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp);
 void
 dri2_teardown_x11(struct dri2_egl_display *dri2_dpy);
+unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy);
 #else
 static inline EGLBoolean
 dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp)
@@ -410,6 +412,11 @@ dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp)
 }
 static inline void
 dri2_teardown_x11(struct dri2_egl_display *dri2_dpy) {}
+static inline unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy)
+{
+   return 0;
+}
 #endif
 
 #ifdef HAVE_DRM_PLATFORM
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 6c287b4..da28981 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -209,6 +209,37 @@ get_xcb_screen(xcb_screen_iterator_t iter, int screen)
    return NULL;
 }
 
+static xcb_visualtype_t *
+get_xcb_visualtype(struct dri2_egl_display *dri2_dpy)
+{
+   xcb_visualtype_iterator_t visual_iter;
+   xcb_screen_t *screen = dri2_dpy->screen;
+   xcb_visualid_t visual_id = screen->root_visual;
+   xcb_depth_iterator_t depth_iter = xcb_screen_allowed_depths_iterator(screen);
+
+   for (; depth_iter.rem; xcb_depth_next(&depth_iter)) {
+      visual_iter = xcb_depth_visuals_iterator(depth_iter.data);
+
+      for (; visual_iter.rem; xcb_visualtype_next(&visual_iter)) {
+         if (visual_iter.data->visual_id == visual_id)
+            return visual_iter.data;
+      }
+   }
+
+   return NULL;
+}
+
+/* Get red channel mask of the root windows visual for our x-screen */
+unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy)
+{
+   unsigned int red_mask = 0;
+   xcb_visualtype_t *visual = get_xcb_visualtype(dri2_dpy);
+   if (visual)
+      red_mask = visual->red_mask;
+
+   return red_mask;
+}
 
 /**
  * Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
@@ -1050,7 +1081,11 @@ dri2_create_image_khr_pixmap(_EGLDisplay *disp, _EGLContext *ctx,
       format = __DRI_IMAGE_FORMAT_XRGB;
       break;
    case 30:
-      format = __DRI_IMAGE_FORMAT_XRGB2101010;
+      /* Different preferred formats for different hw */
+      if (dri2_x11_get_red_mask(dri2_dpy) == 0x3ff)
+         format = __DRI_IMAGE_FORMAT_XBGR2101010;
+      else
+         format = __DRI_IMAGE_FORMAT_XRGB2101010;
       break;
    case 32:
       format = __DRI_IMAGE_FORMAT_ARGB;
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index 2073c59..667c845 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -282,7 +282,12 @@ dri3_create_image_khr_pixmap(_EGLDisplay *disp, _EGLContext *ctx,
       format = __DRI_IMAGE_FORMAT_XRGB;
       break;
    case 30:
-      format = __DRI_IMAGE_FORMAT_XRGB2101010;
+      /* Different preferred formats for different hw */
+      if (dri2_x11_get_red_mask(dri2_dpy) == 0x3ff)
+         format = __DRI_IMAGE_FORMAT_XBGR2101010;
+      else
+         format = __DRI_IMAGE_FORMAT_XRGB2101010;
+
       break;
    case 32:
       format = __DRI_IMAGE_FORMAT_ARGB;
-- 
2.7.4
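For readers following the red-mask check in the patch above, here is a minimal standalone sketch of the idea. The enum and function names are hypothetical stand-ins (not the `__DRI_IMAGE_FORMAT_*` constants or any Mesa API); the only thing taken from the patch is that a red mask of 0x3ff selects the XBGR layout.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two depth 30 image formats. */
enum depth30_format {
   DEPTH30_XRGB2101010,   /* bits 29..20 red, 19..10 green, 9..0 blue */
   DEPTH30_XBGR2101010,   /* bits 29..20 blue, 19..10 green, 9..0 red */
};

/* Pick the depth 30 layout from a visual's red channel mask.
 * Red in the lowest 10 bits means the pixel is laid out as X:B:G:R. */
enum depth30_format
depth30_format_from_red_mask(uint32_t red_mask)
{
   return red_mask == 0x3ff ? DEPTH30_XBGR2101010 : DEPTH30_XRGB2101010;
}

/* Pack three 10-bit channels into one 32-bit pixel for the given layout.
 * The two unused top bits are left at zero. */
uint32_t
pack_depth30(enum depth30_format fmt, uint32_t r, uint32_t g, uint32_t b)
{
   r &= 0x3ff;
   g &= 0x3ff;
   b &= 0x3ff;

   if (fmt == DEPTH30_XBGR2101010)
      return (b << 20) | (g << 10) | r;

   return (r << 20) | (g << 10) | b;
}
```

Packing a pure red pixel with the wrong layout puts the value in the blue bits, which is the kind of color swap patch 3 is meant to avoid when compositing.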
[Mesa-dev] [PATCH 1/3] wayland-drm: Expose server-side xbgr2101010 and abgr2101010 formats.
This way the wayland server can signal support for these formats to wayland EGL clients. This is currently used by nouveau for 10 bpc support.

Tested with glmark2-wayland and glmark2-es2-wayland under weston to now expose 10 bpc EGL configs under nouveau.

Signed-off-by: Mario Kleiner
---
 src/egl/wayland/wayland-drm/wayland-drm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/egl/wayland/wayland-drm/wayland-drm.c b/src/egl/wayland/wayland-drm/wayland-drm.c
index 3c6696d..7d44d38 100644
--- a/src/egl/wayland/wayland-drm/wayland-drm.c
+++ b/src/egl/wayland/wayland-drm/wayland-drm.c
@@ -111,6 +111,8 @@ drm_create_buffer(struct wl_client *client, struct wl_resource *resource,
                   uint32_t stride, uint32_t format)
 {
         switch (format) {
+        case WL_DRM_FORMAT_ABGR2101010:
+        case WL_DRM_FORMAT_XBGR2101010:
         case WL_DRM_FORMAT_ARGB2101010:
         case WL_DRM_FORMAT_XRGB2101010:
         case WL_DRM_FORMAT_ARGB:
@@ -215,6 +217,10 @@ bind_drm(struct wl_client *client, void *data, uint32_t version, uint32_t id)
         wl_resource_post_event(resource, WL_DRM_FORMAT,
                                WL_DRM_FORMAT_XRGB2101010);
         wl_resource_post_event(resource, WL_DRM_FORMAT,
+                               WL_DRM_FORMAT_ABGR2101010);
+        wl_resource_post_event(resource, WL_DRM_FORMAT,
+                               WL_DRM_FORMAT_XBGR2101010);
+        wl_resource_post_event(resource, WL_DRM_FORMAT,
                                WL_DRM_FORMAT_ARGB);
         wl_resource_post_event(resource, WL_DRM_FORMAT,
                                WL_DRM_FORMAT_XRGB);
-- 
2.7.4
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Quoting Mark Janes (2018-03-12 12:40:47) > Dylan Bakerwrites: > > > Quoting Emil Velikov (2018-03-12 08:38:31) > >> On 12 March 2018 at 11:31, Juan A. Suarez Romero > >> wrote: > >> > On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote: > >> >> Ilia Mirkin writes: > >> >> > >> >> > On Tue, Mar 6, 2018 at 2:34 PM, Emil Velikov > >> >> > wrote: > >> >> > > So while others explore ways of improving the testing, let me > >> >> > > propose > >> >> > > a few ideas for improving the actual releasing process. > >> >> > > > >> >> > > > >> >> > > - Making the current state always visible - have a web page, git > >> >> > >branch and other ways for people to see which patches are picked, > >> >> > >require backports, etc. > >> >> > > >> >> > Yes please! A git branch that's available (and force-pushed freely) > >> >> > before the "you're screwed" announcement is going to help clear a lot > >> >> > of things up. > >> >> > >> >> I agree that early information is good. I don't agree that anyone > >> >> should force push. Release branches need to be protected. Proposed > >> >> release branches should only accept patches that have already been > >> >> vetted by the process on mesa master. > >> >> > >> Agreed - release branches should not be force-pushed. We can use > >> "wip/" ones instead. > > > > I also strongly agree with this, force pushes to live branches are > > *bad* (force pushing a pull request of a features branch are perfectly > > fine). I would much rather have reverts than force pushes. If we're > > going to automate this in such way that we think we need force pushes > > I'd much rather use merges (only for stable), so that we can simply > > revert the merge commit. > > > > Or, as other have suggested, not allowing the proposed patches to be > > pushed until CI has come back green would be even better. I've used > > this approach in several github based projects and it works very well > > for keeping the branch in question in good shape. 
>
> The patches need to be in a branch for any CI to test them. A WIP
> branch seems like a good thing for CI to poll.
>
> If CI fails at this point, then it means the developer messed up. No
> one should add a fixes/cc tag to a commit unless they have some
> confidence that it will work on top of the WIP branch (by *testing* it).
>
> Handling a screw-up could be done by maintainers by force-pushing the
> commits off the WIP branch, and adding some annotations that prevent the
> broken commit from being re-applied to WIP by automation.
>

That sounds like introducing a lot of developer headaches, the kind that make people not want to use the system. Take this scenario:

1. I push patches
2. CI starts
3. you push patches
4. My CI fails
5. I force-push

Now both of our patches are removed, even though yours haven't gone through CI at all. And if our tool isn't smart enough it will block your patches as well. In fact, I can't think of a way to make force pushes on a branch that multiple people work on *not* have race conditions.

I think that we should either:

1. Use gitlab and have CI run on PRs as well as on merged code. Either the PR will be red and gitlab can block the merge, or it will be green. It should be possible to have gitlab block code that cannot be cleanly merged.
2. Use merges and reverts.

Dylan
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
On 13 March 2018 at 05:58, Marek Olšák wrote:
> On Mon, Mar 12, 2018 at 3:05 PM, Dave Airlie wrote:
>> On 13 March 2018 at 03:59, Marek Olšák wrote:
>>> This is good, though some older distros only have libxcb 1.11.
>>
>> On those distros you likely just want to --disable-dri3 anyways.
>>
>> Dave.
>
> Good one. I know you don't care, but we are talking about the latest
> long-term stable version of a major distro.

Does that distro have dri3 support in its X server?

If so, then a follow up patch to lower this to 1.11 would be fine (actually I've posted a cleaner patch), but if you don't need dri3 support, then the follow up could just enable libxcb 1.11 support by dropping dri3.

Dave.
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
On Mon, Mar 12, 2018 at 3:05 PM, Dave Airlie wrote:
> On 13 March 2018 at 03:59, Marek Olšák wrote:
>> This is good, though some older distros only have libxcb 1.11.
>
> On those distros you likely just want to --disable-dri3 anyways.
>
> Dave.

Good one. I know you don't care, but we are talking about the latest long-term stable version of a major distro.

I know you don't care about the following either, but if Mesa can't use older libxcb, the PRO driver will have to ship its own libxcb for older distros. It's a terrible idea IMO.

Marek
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Emil Velikov writes:

> On 12 March 2018 at 11:31, Juan A. Suarez Romero wrote:
>> On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote:
>>> - Patches are applied to proposed stable branch by automation when the
>>>   associated commit is pushed to master. The existing commit message
>>>   annotations drive this process. There must be zero ambiguity in the
>>>   annotations (eg which stable branches need the patch).
>>>
>
> I would recommend a delay between the patch landing in master and the
> wip branch. In the past, we have multiple cases where a fix lands in
> master, which causes severe regressions.
> IMHO having a 24-48h period sounds reasonable, although it can be
> tweaked based on feedback.

Having a delay means developers cannot quickly verify that their stable-branch annotations correctly resulted in their patch being applied where they wanted.

In the rare case that we have bad patches applied through the process, we can treat it like a CI failure, where the maintainer steps in, force-pushes the bad patches off the WIP branch, and adds annotations to prevent automation from re-applying the commits later.
Re: [Mesa-dev] [PATCH v2 1/2] spirv: fix OpSConvert when the source is unsigned
On Mon, Mar 5, 2018 at 10:21 PM, Samuel Iglesias Gonsálvez <sigles...@igalia.com> wrote:

> OpSConvert interprets the MSB of the unsigned value as the sign bit and
> extends it to the new type. If we want to preserve the value, we need
> to use OpUConvert opcode.
>
> v2:
> - No need to check dst type.
> - Fix typo in comment.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
>  src/compiler/spirv/vtn_alu.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> index d0c9e316935..a5cefc35773 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -354,10 +354,26 @@ vtn_nir_alu_op_for_spirv_opcode(struct vtn_builder *b,
>     case SpvOpConvertFToS:
>     case SpvOpConvertSToF:
>     case SpvOpConvertUToF:
> -   case SpvOpSConvert:
>     case SpvOpFConvert:
>        return nir_type_conversion_op(src, dst, nir_rounding_mode_undef);
>
> +   case SpvOpSConvert: {
> +      nir_alu_type src_base = (nir_alu_type) nir_alu_type_get_base_type(src);
> +      if (src_base == nir_type_uint) {

Why are we predicating this on src_base == nir_type_uint? It seems to me as if we should just ignore the source and destination type except for the bit size.

> +         /* SPIR-V expects to interpret the unsigned value as signed and
> +          * do sign extend. Return the opcode accordingly.
> +          */
> +         unsigned dst_bit_size = nir_alu_type_get_type_size(dst);
> +         switch (dst_bit_size) {
> +         case 16: return nir_op_i2i16;
> +         case 32: return nir_op_i2i32;
> +         case 64: return nir_op_i2i64;

This can be nir_type_int | dst_bit_size. NIR types are convenient like that. :-)

> +         default:
> +            vtn_fail("Invalid nir alu bit size");
> +         }
> +      }
> +      return nir_type_conversion_op(src, dst, nir_rounding_mode_undef);
> +   }
>     /* Derivatives: */
>     case SpvOpDPdx: return nir_op_fddx;
>     case SpvOpDPdy: return nir_op_fddy;
> --
> 2.14.1
>
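As a small illustration of the distinction the commit message draws between OpSConvert and OpUConvert, the sketch below widens the same 16-bit pattern both ways in plain C. The function names are made up for this example and are not SPIR-V or NIR API; the cast to int16_t assumes the usual two's complement behaviour of common compilers.

```c
#include <stdint.h>

/* Sign-extend: the MSB of the 16-bit source is replicated into the upper
 * bits, which is how the commit message describes OpSConvert. The
 * conversion to int16_t is implementation-defined before C23, but wraps
 * as two's complement on the compilers Mesa targets. */
int32_t
sconvert_16_to_32(uint16_t bits)
{
   return (int32_t)(int16_t)bits;
}

/* Zero-extend: the upper bits are cleared, so the unsigned value is
 * preserved, matching the description of OpUConvert. */
uint32_t
uconvert_16_to_32(uint16_t bits)
{
   return (uint32_t)bits;
}
```

For the bit pattern 0xffff the first function yields -1 and the second 65535, while for patterns with a clear MSB, such as 0x7fff, both agree.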
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Dylan Bakerwrites: > Quoting Emil Velikov (2018-03-12 08:38:31) >> On 12 March 2018 at 11:31, Juan A. Suarez Romero wrote: >> > On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote: >> >> Ilia Mirkin writes: >> >> >> >> > On Tue, Mar 6, 2018 at 2:34 PM, Emil Velikov >> >> > wrote: >> >> > > So while others explore ways of improving the testing, let me propose >> >> > > a few ideas for improving the actual releasing process. >> >> > > >> >> > > >> >> > > - Making the current state always visible - have a web page, git >> >> > >branch and other ways for people to see which patches are picked, >> >> > >require backports, etc. >> >> > >> >> > Yes please! A git branch that's available (and force-pushed freely) >> >> > before the "you're screwed" announcement is going to help clear a lot >> >> > of things up. >> >> >> >> I agree that early information is good. I don't agree that anyone >> >> should force push. Release branches need to be protected. Proposed >> >> release branches should only accept patches that have already been >> >> vetted by the process on mesa master. >> >> >> Agreed - release branches should not be force-pushed. We can use >> "wip/" ones instead. > > I also strongly agree with this, force pushes to live branches are > *bad* (force pushing a pull request of a features branch are perfectly > fine). I would much rather have reverts than force pushes. If we're > going to automate this in such way that we think we need force pushes > I'd much rather use merges (only for stable), so that we can simply > revert the merge commit. > > Or, as other have suggested, not allowing the proposed patches to be > pushed until CI has come back green would be even better. I've used > this approach in several github based projects and it works very well > for keeping the branch in question in good shape. The patches need to be in a branch for any CI to test them. A WIP branch seems like a good thing for CI to poll. 
If CI fails at this point, then it means the developer messed up. No one should add a fixes/cc tag to a commit unless they have some confidence that it will work on top of the WIP branch (by *testing* it).

Handling a screw-up could be done by maintainers by force-pushing the commits off the WIP branch, and adding some annotations that prevent the broken commit from being re-applied to WIP by automation.

> Dylan
[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values
https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #1 from Philip Rebohle ---
Created attachment 138038
  --> https://bugs.freedesktop.org/attachment.cgi?id=138038&action=edit
Witcher 3 hull shader which may suffer from the same issue

FWIW, the tessellation demo works correctly with AMDVLK with the patched shader. On RADV, the tessellation levels are seemingly random, and the behaviour changes when just the location number is changed.

A similar issue occurs in The Witcher 3 when run with DXVK, where water surfaces have incorrect tessellation factors applied on RADV. It reportedly renders correctly on Nvidia. In this case however, there is only one single invocation writing per-patch outputs.

A workaround that makes this particular shader work correctly on RADV is to write all per-patch outputs to an array with storage class Private first, read the tessellation factors from that array, and finally copy the contents of the temporary array to the output array, which tells me that reading from the output array again returns incorrect results.
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
On 12 March 2018 at 18:48, Dave Airlie wrote:
> On 13 March 2018 at 03:24, Emil Velikov wrote:
>> Hi Dave,
>>
>> On 11 March 2018 at 23:26, Dave Airlie wrote:
>>> From: Dave Airlie
>>>
>>> I'm not sure everyone wants to be updating their dri3 in a forced
>>> march setting, this allows a nicer approach, esp when you want
>>> to build on distro that aren't brand new.
>>>
>>> I'm sure there are plenty of ways this patch could be cleaner,
>>> and I've also not built it against an updated dri3.
>>
>> Have you considered cases where the build server is using 1.12, while
>> at run-time we have 1.13?
>> Are you explicitly forbidding that, say via the packaging? It tends to
>> be allowed on most(all?) distributions.
>
> Yes I am because really who does that, and why do I care.
>
Sounds like I stepped on your toes here. Pardon, did not mean to.

All I've seen is distribution packaging ensuring the runtime version is at least equal to the build-time one. I have not seen the opposite, hence the question.

> If you build against a newer libxcb it won't run against the older one either,
> why do you expect building against the older one will magically work against
> a newer one with all the features?
>
Very often an updated version of a dependency is shipped, yet the package (say mesa) is not rebuilt. AFAICT there's no clear way to annotate this kind of 'hidden' dependency, thus package maintainers don't know about it. This causes a fair amount of time lost to user frustration and developer debugging.

>> That said, if updating XCB is a serious no-go, may I suggest something
>> like the following:
>> - add local fallback definitions/declarations
>> - add local functions (annotated as weak) which return 'the correct'
>>   value so that the fallback paths kick in
>
> I can sorta see the first part being useful, the second is definitely
> over engineering the solution.
>
> The thing is most of the features in dri3.1 are gated on the X server
> having support. Most people are not updating their X servers, I'm
> guessing apart from the modifiers devs there'll be at most 10 people
> who update their X server for this feature in advance of a distro
> moving them to it. I know I won't personally be going around all 10
> boxes I keep running updating their X server for a feature that
> doesn't add anything on those hw configurations yet. When distros move
> to the 1.20 X server they'll also move to newer xcb, this is for
> distros that won't move at all.
>
Hey, I'm just sharing an idea of what sounds like the more robust solution. It should work "for everyone" even though it seems like overkill. I dare not think of the xcb/xserver/mesa combinations that people use.

As long as people are on board with the fun experience mentioned above, don't mind me ;-)

HTH
Emil
[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values
https://bugs.freedesktop.org/show_bug.cgi?id=105464

Bug ID: 105464
Summary: Reading per-patch outputs in Tessellation Control Shader returns undefined values
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Vulkan/radeon
Assignee: mesa-dev@lists.freedesktop.org
Reporter: philip.rebo...@tu-dortmund.de
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 138037
  --> https://bugs.freedesktop.org/attachment.cgi?id=138037&action=edit
Patch for tessellation demo to reproduce the issue

As mentioned in the title, reading a per-patch output variable that was written by a different invocation produces undefined results even after a barrier() call.

The attached patch changes the tessellation control shader of the 'tessellation' demo from Sascha Willems' Vulkan samples in that it reads the tessellation levels from a per-patch output array. Note that the shader needs to be recompiled manually in order to reproduce the issue.
Re: [Mesa-dev] [PATCH 4/5 v6] clover/llvm: Add get_[cl|language]_version, validation and some helpers
ping. --Aaron On Thu, Mar 1, 2018 at 8:02 PM, Aaron Watrywrote: > Used to calculate the default CLC language version based on the --cl-std in > build args > and the device capabilities. > > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by: > 1) If you have -cl-std=CL1.1+ use the version specified > 2) If not, use the highest 1.x version that the device supports > > Curiously, there is no valid value for -cl-std=CL1.0 > > Validates requested cl-std against device_clc_version > > Signed-off-by: Aaron Watry > Cc: Pierre Moreau > > v6: (Pierre) Add more const and fix some whitespace > > v5: (Aaron) Use a collection of cl versions instead of switch cases > Consolidates the string, numeric version, and clc langstandard::kind > > v4: (Pierre) Split get_language_version addition and use into separate patches > Squash patches that add the helpers and validate the language standard > > v3: Change device_version to device_clc_version > > v2: (Pierre) Move create_compiler_instance changes to correct patch > to prevent temporary build breakage. 
> Convert version_str into unsigned and use it to find language version > Add build_error for unknown language version string > Whitespace fixes > --- > .../state_trackers/clover/llvm/invocation.cpp | 63 > ++ > 1 file changed, 63 insertions(+) > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 0bc06e..0f854b9049 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -63,6 +63,23 @@ using ::llvm::Module; > using ::llvm::raw_string_ostream; > > namespace { > + > + struct cl_version { > + std::string version_str; // CL Version > + unsigned version_number; // Numeric CL Version > + clang::LangStandard::Kind clc_lang_standard; // lang standard for > version > + }; > + > + static const unsigned ANY_VERSION = 999; > + const cl_version cl_versions[] = { > + { "1.0", 100, clang::LangStandard::lang_opencl10}, > + { "1.1", 110, clang::LangStandard::lang_opencl11}, > + { "1.2", 120, clang::LangStandard::lang_opencl12}, > + { "2.0", 200, clang::LangStandard::lang_opencl20}, > + { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't > exist > + { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't > exist > + }; > + > void > init_targets() { >static bool targets_initialized = false; > @@ -93,6 +110,52 @@ namespace { >return ctx; > } > > + const struct cl_version > + get_cl_version(const std::string _str, > + unsigned max = ANY_VERSION) { > + for (const struct cl_version version : cl_versions) { > + if (version.version_number == max || version.version_str == > version_str) { > +return version; > + } > + } > + throw build_error("Unknown/Unsupported language version"); > + } > + > + clang::LangStandard::Kind > + get_lang_standard_from_version_str(const std::string _str, > + bool is_build_opt = false) { > + /** > + * Per CL 2.0 spec, section 5.8.4.5: > + * If it's an option, use the value 
directly. > + * If it's a device version, clamp to max 1.x version, a.k.a. 1.2 > + */ > + const struct cl_version version = get_cl_version(version_str, > + is_build_opt ? ANY_VERSION : 120); > + return version.clc_lang_standard; > + } > + > + clang::LangStandard::Kind > + get_language_version(const std::vector , > +const std::string _version) { > + > + const std::string search = "-cl-std=CL"; > + > + for (auto opt: opts) { > + auto pos = opt.find(search); > + if (pos == 0){ > +const auto ver = opt.substr(pos + search.size()); > +const auto device_ver = get_cl_version(device_version); > +const auto requested = get_cl_version(ver); > +if (requested.version_number > device_ver.version_number) { > + throw build_error(); > +} > +return get_lang_standard_from_version_str(ver, true); > + } > + } > + > + return get_lang_standard_from_version_str(device_version); > + } > + > std::unique_ptr > create_compiler_instance(const device , > const std::vector , > -- > 2.14.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
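The selection rule described in the commit message (use `-cl-std=CLx.y` when given, otherwise the highest 1.x version the device supports) can be sketched in plain C as follows. This is a simplified illustration under stated assumptions, not the patch's C++ code: the names `cl_version_number` and `choose_clc_version` are hypothetical, the table omits the clang language standard, and errors are reported by returning 0 instead of throwing `build_error`.

```c
#include <stddef.h>
#include <string.h>

/* Simplified version table: string form and numeric form only. */
struct cl_version {
   const char *str;   /* e.g. "1.2" */
   unsigned number;   /* e.g. 120 */
};

static const struct cl_version cl_versions[] = {
   { "1.0", 100 }, { "1.1", 110 }, { "1.2", 120 }, { "2.0", 200 },
};

/* Map "1.2" to 120; returns 0 for an unknown version string. */
unsigned
cl_version_number(const char *str)
{
   for (size_t i = 0; i < sizeof(cl_versions) / sizeof(cl_versions[0]); i++) {
      if (strcmp(cl_versions[i].str, str) == 0)
         return cl_versions[i].number;
   }
   return 0;
}

/* Choose the CL C version for a build.
 * build_opt may be NULL or a single option such as "-cl-std=CL1.1".
 * Returns the numeric version, or 0 if a version is unknown or the
 * requested version exceeds what the device supports. */
unsigned
choose_clc_version(const char *build_opt, const char *device_version)
{
   static const char prefix[] = "-cl-std=CL";
   const size_t prefix_len = sizeof(prefix) - 1;
   const unsigned device = cl_version_number(device_version);

   if (device == 0)
      return 0;

   if (build_opt != NULL && strncmp(build_opt, prefix, prefix_len) == 0) {
      const unsigned requested = cl_version_number(build_opt + prefix_len);
      if (requested == 0 || requested > device)
         return 0;
      /* An explicit option is used as given. */
      return requested;
   }

   /* No option: clamp the device version to the highest 1.x, i.e. 1.2. */
   return device > 120 ? 120 : device;
}
```

With this sketch a 2.0 device without any `-cl-std` option builds as 1.2, and asking for CL2.0 on a 1.2 device is rejected.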
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
On 13 March 2018 at 03:59, Marek Olšák wrote:
> This is good, though some older distros only have libxcb 1.11.

On those distros you likely just want to --disable-dri3 anyways.

Dave.
Re: [Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation
Thomas Helland writes:

> Walking the whole hash table, inserting entries by hashing them first
> is just a really bad idea. We can simply memcpy the whole thing.
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp         | 13 --
>  .../glsl/opt_copy_propagation_elements.cpp         | 29 --
>  2 files changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
> index e904e6ede4..96667779da 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
>     this->killed_all = false;
>
>     /* Populate the initial acp with a copy of the original */
> -   struct hash_entry *entry;
> -   hash_table_foreach(orig_acp, entry) {
> -      _mesa_hash_table_insert(acp, entry->key, entry->data);
> -   }
> +   acp = _mesa_hash_table_clone(orig_acp, NULL);

Remove the creation of acp above.

>
>     visit_list_elements(this, instructions);
>
> @@ -271,10 +268,10 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
>     this->killed_all = false;
>
>     if (keep_acp) {
> -      struct hash_entry *entry;
> -      hash_table_foreach(orig_acp, entry) {
> -         _mesa_hash_table_insert(acp, entry->key, entry->data);
> -      }
> +      acp = _mesa_hash_table_clone(orig_acp, NULL);
> +   } else {
> +      acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> +                                    _mesa_key_pointer_equal);
>     }

Again, remove the old creation of the acp.

Other than that, these are:

Reviewed-by: Eric Anholt
Re: [Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation
Hi Thomas,

If I were you I'd split out the introduction of clone_acp() into a separate patch. Regardless of that suggestion, there seems to be a bug in this patch.

On 12 March 2018 at 17:55, Thomas Helland wrote:
> Walking the whole hash table, inserting entries by hashing them first
> is just a really bad idea. We can simply memcpy the whole thing.
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp         | 13 --
>  .../glsl/opt_copy_propagation_elements.cpp         | 29 --
>  2 files changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
> index e904e6ede4..96667779da 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
>     this->killed_all = false;
>
>     /* Populate the initial acp with a copy of the original */
> -   struct hash_entry *entry;
> -   hash_table_foreach(orig_acp, entry) {
> -      _mesa_hash_table_insert(acp, entry->key, entry->data);
> -   }
> +   acp = _mesa_hash_table_clone(orig_acp, NULL);
>
There's a _mesa_hash_table_create just above that should be removed.

HTH
Emil
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
On 13 March 2018 at 03:24, Emil Velikov wrote:
> Hi Dave,
>
> On 11 March 2018 at 23:26, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> I'm not sure everyone wants to be updating their dri3 in a forced
>> march setting, this allows a nicer approach, esp when you want
>> to build on distro that aren't brand new.
>>
>> I'm sure there are plenty of ways this patch could be cleaner,
>> and I've also not built it against an updated dri3.
>
> Have you considered cases where the build server is using 1.12, while
> at run-time we have 1.13?
> Are you explicitly forbidding that, say via the packaging? It tends to
> be allowed on most(all?) distributions.

Yes I am because really who does that, and why do I care.

If you build against a newer libxcb it won't run against the older one either, why do you expect building against the older one will magically work against a newer one with all the features?

> That said, if updating XCB is a serious no-go, may I suggest something
> like the following:
> - add local fallback definitions/declarations
> - add local functions (annotated as weak) which return 'the correct'
>   value so that the fallback paths kick in

I can sorta see the first part being useful, the second is definitely over engineering the solution.

The thing is most of the features in dri3.1 are gated on the X server having support. Most people are not updating their X servers, I'm guessing apart from the modifiers devs there'll be at most 10 people who update their X server for this feature in advance of a distro moving them to it. I know I won't personally be going around all 10 boxes I keep running updating their X server for a feature that doesn't add anything on those hw configurations yet. When distros move to the 1.20 X server they'll also move to newer xcb, this is for distros that won't move at all.

Dave.
Re: [Mesa-dev] [PATCH 1/2] util: Implement a hash table cloning function
Hi Thomas,

On 12 March 2018 at 17:55, Thomas Helland wrote:
> V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)
> ---
>  src/util/hash_table.c | 22 ++++++++++++++++++++++
>  src/util/hash_table.h |  2 ++
>  2 files changed, 24 insertions(+)
>
> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> index b7421a0144..f8d5d0f88a 100644
> --- a/src/util/hash_table.c
> +++ b/src/util/hash_table.c
> @@ -141,6 +141,28 @@ _mesa_hash_table_create(void *mem_ctx,
>     return ht;
>  }
>
> +struct hash_table *
> +_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx)
> +{
> +   struct hash_table *ht;
> +
> +   ht = ralloc(dst_mem_ctx, struct hash_table);
> +   if (ht == NULL)
> +      return NULL;
> +
> +   memcpy(ht, src, sizeof(struct hash_table));
> +
> +   ht->table = ralloc_array(ht, struct hash_entry, ht->size);
> +   if (ht->table == NULL) {
> +      ralloc_free(ht);
> +      return NULL;
> +   }
> +
> +   memcpy(ht->table, src->table, ht->size * sizeof(struct hash_entry));
> +

Thinking out loud: I'm wondering if it won't make sense to reuse
_mesa_hash_table_create, instead of open-coding it?

-Emil
Re: [Mesa-dev] [PATCH RESEND] spirv: Silence compiler warning about undefined srcs[0]
Reviewed-by: Ian Romanick

On 03/12/2018 11:21 AM, Eric Anholt wrote:
> v2: Use assume() at the srcs[] definition instead.
>
> Cc: Jason Ekstrand
> Cc: Ian Romanick
> Cc: Eric Engestrom
> ---
>  src/compiler/spirv/spirv_to_nir.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
> index 6a358c597316..3de45c47371e 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -2925,6 +2925,7 @@ vtn_handle_composite(struct vtn_builder *b, SpvOp opcode,
>
>     case SpvOpCompositeConstruct: {
>        unsigned elems = count - 3;
> +      assume(elems >= 1);
>        if (glsl_type_is_vector_or_scalar(type)) {
>           nir_ssa_def *srcs[4];
>           for (unsigned i = 0; i < elems; i++)
>
Re: [Mesa-dev] [PATCH mesa] omx: always define ENABLE_ST_OMX_{BELLAGIO, TIZONIA}
Quoting Eric Engestrom (2018-03-12 11:05:51) > On Monday, 2018-03-12 10:19:49 -0700, Dylan Baker wrote: > > Quoting Eric Engestrom (2018-03-12 07:33:27) > > > We're trying to be -Wundef clean so that we can turn it on (and > > > eventually make it an error). > > > > > > Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead > > > of #ifdef; I could've changed these, but the point of -Wundef is to > > > catch typos, so we might as well make the change the right way. > > > > > > Fixes: 83d4a5d5aea5a8a05be2 "st/omx/tizonia: Add H.264 decoder" > > > Fixes: b2f2236dc565dd1460f0 "st/omx/tizonia: Add H.264 encoder" > > > Fixes: c62cf1f165919bc74296 "st/omx/tizonia/h264d: Add EGLImage support" > > > Cc: Gurkirpal Singh> > > Signed-off-by: Eric Engestrom > > > --- > > > The meson hunk doesn't look pretty at all, but I'm planning on replacing > > > all the `pre_args` with a configuration_data(), which will allow to > > > simplify a lot of this #defines code. > > > --- > > > configure.ac | 4 > > > meson.build | 11 +-- > > > 2 files changed, 13 insertions(+), 2 deletions(-) > > > > > > diff --git a/configure.ac b/configure.ac > > > index 1553ce99da44bca4e826..6de4ceb2fb715505120e 100644 > > > --- a/configure.ac > > > +++ b/configure.ac > > > @@ -2281,6 +2281,8 @@ if test "x$enable_omx_bellagio" = xyes; then > > > PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= > > > $LIBOMXIL_BELLAGIO_REQUIRED]) > > > gallium_st="$gallium_st omx_bellagio" > > > AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL]) > > > +else > > > +AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0) > > > fi > > > AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes) > > > > > > @@ -2294,6 +2296,8 @@ if test "x$enable_omx_tizonia" = xyes; then > > > libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED]) > > > gallium_st="$gallium_st omx_tizonia" > > > AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL]) > > > +else > > > +AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0) > 
> > fi > > > AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes) > > > > > > diff --git a/meson.build b/meson.build > > > index b6e9692f192c528520e7..b9f7cd2aff5fc49e0d93 100644 > > > --- a/meson.build > > > +++ b/meson.build > > > @@ -504,7 +504,7 @@ if with_gallium_omx == 'bellagio' or with_gallium_omx > > > == 'auto' > > > 'libomxil-bellagio', required : with_gallium_omx == 'bellagio' > > >) > > >if dep_omx.found() > > > -pre_args += '-DENABLE_ST_OMX_BELLAGIO' > > > +pre_args += '-DENABLE_ST_OMX_BELLAGIO=1' > > > with_gallium_omx = 'bellagio' > > >endif > > > endif > > > @@ -525,7 +525,7 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx > > > == 'auto' > > >dependency('tizilheaders', required : with_gallium_omx == > > > 'tizonia'), > > > ] > > > if dep_omx.found() and dep_omx_other[0].found() and > > > dep_omx_other[1].found() > > > - pre_args += '-DENABLE_ST_OMX_TIZONIA' > > > + pre_args += '-DENABLE_ST_OMX_TIZONIA=1' > > >with_gallium_omx = 'tizonia' > > > else > > >with_gallium_omx = 'disabled' > > > @@ -533,6 +533,13 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx > > > == 'auto' > > >endif > > > endif > > > > > > +if with_gallium_omx != 'bellagio' > > > + pre_args += '-DENABLE_ST_OMX_BELLAGIO=0' > > > +endif > > > +if with_gallium_omx != 'tizonia' > > > + pre_args += '-DENABLE_ST_OMX_TIZONIA=0' > > > +endif > > > + > > > > This is fine as-is, but if you wanted to clean it up a little, you could do > > something like: > > > > pre_args += [ > > '-DENABLE_ST_OMX_BELLAGIO=' + with_gallium_omx == 'bellagio ? '1' : '0', > > '-DENABLE_ST_OMX_TIZONIA=' + with_gallium_omx == 'tizonia ? '1' : '0', > > ] > > That's what I was looking for but too tired to figure out :] > Thanks, I'll send a v2 with that tomorrow! Yup! I haven't tested this, and I have noticed some problems with using the ternary construct in meson (there's some problems with the parser), so this may not actually work. 
Dylan
Re: [Mesa-dev] Few issues with Meson
This is my cross file (Arch doesn't have a pkg-config for x86, so I have a shell
wrapper that sets PKG_CONFIG_PATH), you'll probably need to adjust some paths

```
[binaries]
c = '/usr/bin/gcc'
cpp = '/usr/bin/g++'
ar = '/usr/bin/ar'
strip = '/usr/bin/strip'
pkgconfig = '/home/dylan/.local/bin/pkg-config-lib32'
llvm-config = '/usr/bin/llvm-config32'

[properties]
c_args = ['-m32']
c_link_args = ['-m32']
cpp_args = ['-m32']
cpp_link_args = ['-m32']

[host_machine]
system = 'linux'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'

# vim: ft=dosini
```

meson build-x66 --cross-file should give you a working mesa for your arch.

There's some upstream discussion on how to choose llvm-config for non-cross
compilation cases, but that hasn't moved a whole lot recently.

Dylan

Quoting Mike Lothian (2018-03-12 10:57:01)
> Hi Dylan
>
> Do you have the link to patch on patchwork? I'll give it a go
>
> I'm using meson 0.45 however the cross-file requires more than just defining
> llvm-config, everything else is normally picked up from what portage is setting
> in the build environment - though strangely not if clang is used - I'll look
> into that sometime
>
> Regards
>
> Mike
>
> On Fri, 9 Mar 2018 at 16:37 Dylan Baker wrote:
>
> > Quoting Mike Lothian (2018-03-06 05:07:34)
> > > Hi
> > >
> > > When compiling wine I also noticed that the d3d.pc files didn't have
> > > moduledir set, so wine couldn't find it
> > >
> > > configure: error: pkg-config couldn't find Gallium Nine module
> >
> > I've sent a patch for this.
> >
> > > Regards
> > >
> > > Mike
> > >
> > > On Tue, 6 Mar 2018 at 02:17 Mike Lothian wrote:
> > >
> > > > Hi
> > > >
> > > > I've been trying to get a Gentoo ebuild ready for meson
> > > >
> > > > I've had to fudge the llvm-config for cross compiling a 32bit mesa on
> > > > a 64bit machine
> >
> > If you're using a new enough meson (0.45) you can specify the llvm-config you
> > want to use in the cross file.
> >
> > > > I notice that -Dvulkan-drivers= doesn't accept intel,radeon like
> > > > autotools used to, it also seems as long as one value is correct the
> > > > other is ignored
> >
> > we're using amd instead of radeon. After 18.0 branches I want to bump the meson
> > requirement so we can use meson's list argument type, which will check for such
> > problems.
> >
> > > > Also -Dva-libs-path= doesn't play well with absolute paths, or rather
> > > > install_megadrivers.py is doing something strange - normally gentoo
> > > > installs everything to a temporary image path then puts those files
> > > > into the live system. It seems install_megadrivers.py doesn't do this
> > > > and installs directly to the live system - I worked around it by
> > > > dropping the /usr
> >
> > There's a patch from someone in FreeBSD that might fix this (the way we do
> > symlinking in install_megadrivers is wrong).
> >
> > Sorry it took me so long to find this email, notmuch applied some odd tags to
> > it.
> >
> > Dylan
[Mesa-dev] [PATCH] meson: don't use compiler.has_header
Meson's compiler.has_header is completely useless, it only checks that a
header exists, not whether it's usable. This creates problems if a
header contains a conditional #error declaration, like so:

> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif

Compiler.has_header will return true in this case, even when compiling
for x86_64. This is useless.

Instead, we'll do a compile check so that any #error declarations will
be treated as errors, and compilation will work.

Fixes compilation on x32 architecture.

Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
CC: Matt Turner
Signed-off-by: Dylan Baker
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 3c63f384381..51b470253f5 100644
--- a/meson.build
+++ b/meson.build
@@ -912,7 +912,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
 endif

 foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
-  if cc.has_header(h)
+  if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
     pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
   endif
 endforeach
--
2.16.2
[Mesa-dev] [PATCH RESEND] spirv: Silence compiler warning about undefined srcs[0]
v2: Use assume() at the srcs[] definition instead.

Cc: Jason Ekstrand
Cc: Ian Romanick
Cc: Eric Engestrom
---
 src/compiler/spirv/spirv_to_nir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index 6a358c597316..3de45c47371e 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2925,6 +2925,7 @@ vtn_handle_composite(struct vtn_builder *b, SpvOp opcode,

    case SpvOpCompositeConstruct: {
       unsigned elems = count - 3;
+      assume(elems >= 1);
       if (glsl_type_is_vector_or_scalar(type)) {
          nir_ssa_def *srcs[4];
          for (unsigned i = 0; i < elems; i++)
--
2.16.2
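For readers unfamiliar with the pattern in the patch above, the following is a minimal sketch of how an assume()-style hint can tell the compiler that a loop runs at least once, so that reading srcs[0] afterwards is not flagged as possibly uninitialized. The demo_assume macro and demo_first_of function are hypothetical stand-ins written for this note; they are not Mesa's assume() definition, and whether a given compiler version actually drops the warning is not guaranteed.

```c
#include <assert.h>

/* Hypothetical assume()-style macro: on GCC/Clang, a false condition is
 * declared unreachable, which lets the optimizer rely on the condition.
 * Elsewhere it falls back to a plain assert(). */
#if defined(__GNUC__) || defined(__clang__)
#define demo_assume(expr) ((expr) ? (void)0 : __builtin_unreachable())
#else
#define demo_assume(expr) assert(expr)
#endif

/* Copy up to four values into a local array and return the first one.
 * Without the hint, a compiler may warn that srcs[0] could be read
 * uninitialized, because it cannot prove that count is non-zero. */
static int
demo_first_of(const int *values, unsigned count)
{
   int srcs[4];

   demo_assume(count >= 1 && count <= 4);
   for (unsigned i = 0; i < count; i++)
      srcs[i] = values[i];

   return srcs[0];
}
```

The caller remains responsible for keeping the condition true; passing a count of 0 to this sketch would be undefined behavior rather than a checked error.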
Re: [Mesa-dev] [PATCH v4 1/2] gallium/winsys/kms: Fix possible leak in map/unmap.
On 12 March 2018 at 17:45, Lepton Wu wrote:
> Ping. Any more comments or missing stuff to get this committed into master?
>

As things have changed a bit (the original map/unmap behaviour is
preserved) I was hoping that Tomasz will give it another look.

If he prefers, I could add some revision summary and keep him as
reviewer of v1?

Thanks
Emil
Re: [Mesa-dev] [PATCH mesa] omx: always define ENABLE_ST_OMX_{BELLAGIO, TIZONIA}
On Monday, 2018-03-12 10:19:49 -0700, Dylan Baker wrote: > Quoting Eric Engestrom (2018-03-12 07:33:27) > > We're trying to be -Wundef clean so that we can turn it on (and > > eventually make it an error). > > > > Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead > > of #ifdef; I could've changed these, but the point of -Wundef is to > > catch typos, so we might as well make the change the right way. > > > > Fixes: 83d4a5d5aea5a8a05be2 "st/omx/tizonia: Add H.264 decoder" > > Fixes: b2f2236dc565dd1460f0 "st/omx/tizonia: Add H.264 encoder" > > Fixes: c62cf1f165919bc74296 "st/omx/tizonia/h264d: Add EGLImage support" > > Cc: Gurkirpal Singh> > Signed-off-by: Eric Engestrom > > --- > > The meson hunk doesn't look pretty at all, but I'm planning on replacing > > all the `pre_args` with a configuration_data(), which will allow to > > simplify a lot of this #defines code. > > --- > > configure.ac | 4 > > meson.build | 11 +-- > > 2 files changed, 13 insertions(+), 2 deletions(-) > > > > diff --git a/configure.ac b/configure.ac > > index 1553ce99da44bca4e826..6de4ceb2fb715505120e 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -2281,6 +2281,8 @@ if test "x$enable_omx_bellagio" = xyes; then > > PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= > > $LIBOMXIL_BELLAGIO_REQUIRED]) > > gallium_st="$gallium_st omx_bellagio" > > AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL]) > > +else > > +AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0) > > fi > > AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes) > > > > @@ -2294,6 +2296,8 @@ if test "x$enable_omx_tizonia" = xyes; then > > libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED]) > > gallium_st="$gallium_st omx_tizonia" > > AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL]) > > +else > > +AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0) > > fi > > AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes) > > > > diff --git a/meson.build b/meson.build > > 
index b6e9692f192c528520e7..b9f7cd2aff5fc49e0d93 100644 > > --- a/meson.build > > +++ b/meson.build > > @@ -504,7 +504,7 @@ if with_gallium_omx == 'bellagio' or with_gallium_omx > > == 'auto' > > 'libomxil-bellagio', required : with_gallium_omx == 'bellagio' > >) > >if dep_omx.found() > > -pre_args += '-DENABLE_ST_OMX_BELLAGIO' > > +pre_args += '-DENABLE_ST_OMX_BELLAGIO=1' > > with_gallium_omx = 'bellagio' > >endif > > endif > > @@ -525,7 +525,7 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx == > > 'auto' > >dependency('tizilheaders', required : with_gallium_omx == 'tizonia'), > > ] > > if dep_omx.found() and dep_omx_other[0].found() and > > dep_omx_other[1].found() > > - pre_args += '-DENABLE_ST_OMX_TIZONIA' > > + pre_args += '-DENABLE_ST_OMX_TIZONIA=1' > >with_gallium_omx = 'tizonia' > > else > >with_gallium_omx = 'disabled' > > @@ -533,6 +533,13 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx > > == 'auto' > >endif > > endif > > > > +if with_gallium_omx != 'bellagio' > > + pre_args += '-DENABLE_ST_OMX_BELLAGIO=0' > > +endif > > +if with_gallium_omx != 'tizonia' > > + pre_args += '-DENABLE_ST_OMX_TIZONIA=0' > > +endif > > + > > This is fine as-is, but if you wanted to clean it up a little, you could do > something like: > > pre_args += [ > '-DENABLE_ST_OMX_BELLAGIO=' + with_gallium_omx == 'bellagio ? '1' : '0', > '-DENABLE_ST_OMX_TIZONIA=' + with_gallium_omx == 'tizonia ? '1' : '0', > ] That's what I was looking for but too tired to figure out :] Thanks, I'll send a v2 with that tomorrow! > > and take the pre_args out of the block above altogether. > > either way, > > Reviewed-by: Dylan Baker ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
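To make the motivation for the patch above concrete, here is a small sketch in C of the "always define to 0 or 1" convention. The macro names come from the patch; the demo_omx_backend helper and the in-file default values are assumptions added so the example compiles on its own (in a real build the values would come from the build system as -D flags). With every macro always defined, a misspelled name in an #if is the only undefined identifier left, so -Wundef can point at it instead of it silently evaluating to 0.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for -DENABLE_ST_OMX_BELLAGIO=1 and -DENABLE_ST_OMX_TIZONIA=0
 * as the build system would pass them; assumed defaults for this sketch. */
#ifndef ENABLE_ST_OMX_BELLAGIO
#define ENABLE_ST_OMX_BELLAGIO 1
#endif
#ifndef ENABLE_ST_OMX_TIZONIA
#define ENABLE_ST_OMX_TIZONIA 0
#endif

/* Hypothetical helper: report which backend the preprocessor selected.
 * Uses #if rather than #ifdef, so a macro defined to 0 counts as "off". */
static const char *
demo_omx_backend(void)
{
#if ENABLE_ST_OMX_BELLAGIO
   return "bellagio";
#elif ENABLE_ST_OMX_TIZONIA
   return "tizonia";
#else
   return "disabled";
#endif
}
```

Note that #ifdef would treat a macro defined to 0 as enabled, which is why the two styles should not be mixed once the 0/1 convention is adopted.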
Re: [Mesa-dev] [PATCH 0/2] Hash table cloning for copy propagation
I've also uploaded this series to my github, if you want to pull them
down from there [1]. I've also uploaded my previously talked about
pointer_map to my github account [2]. There's a pointer map, pointer
set, and some patches for nir in there, and some for disabling asserts
in some places. So it's not ready for primetime, but that series has
been tested recently, and has been stable for a couple months now.
Been tinkering with it and adding small pieces now and then.
What remains is a bench-a-tonne to ensure it is OK performance wise,
and cleaning it up for posting on the mailing list.

[1]: https://github.com/thohel/mesa/commits/hash-table-clone
[2]: https://github.com/thohel/mesa/commits/pointer_map

2018-03-12 18:55 GMT+01:00 Thomas Helland:
> This is a revival of some old patches I had around to improve
> the compile times in the glsl compiler by reducing the time
> spent inserting items in the hash table in opt_copy_propagation.
> I've only rebased this, as my system doesn't even want to compile
> anything right now. I also don't remember if it was thoroughly
> tested, so that will have to be done. Sending it out as Dave
> might be interested in this to mitigate some of the overhead
> his soft-double implementation incurs.
>
> CC: Dave Airlie
>
> Thomas Helland (2):
>   util: Implement a hash table cloning function
>   glsl: Use hash table cloning in copy propagation
>
>  src/compiler/glsl/opt_copy_propagation.cpp         | 13 +++++--------
>  .../glsl/opt_copy_propagation_elements.cpp         | 29 ++++++++++-------------------
>  src/util/hash_table.c                              | 22 ++++++++++++++++++++++
>  src/util/hash_table.h                              |  2 ++
>  4 files changed, 39 insertions(+), 27 deletions(-)
>
> --
> 2.15.1
>
Re: [Mesa-dev] [RFC] Mesa release improvements - Feature and Stable releases
Hi Andres,

On 12 March 2018 at 15:57, Andres Gomez wrote:
> On Mon, 2018-03-12 at 16:45 +0100, Juan A. Suarez Romero wrote:
>> On Mon, 2018-03-12 at 17:17 +0200, Andres Gomez wrote:
>
> [...]
>

I'm fully on board with your initial suggestion.

>> > My proposal would be, similarly to what Intel does to track [1] the
>> > stabilization for a release, 1 week (?) prior to the branching time to
>> > create a metabug in bugzilla (or GitLab in the future ?), to announce
>> > this metabug in mesa-dev and to let any developer who wants to see
>> > their feature into the coming release to open a blocking bug for this
>> > metabug explaining such feature and its progress. This way we can track
>> > the progress and the process will be more transparent. We can still be
>> > flexible to include the blocking features but the coordination will
>> > happen over these bugs.
>> >
>>
>> So, when the branch point is created? After the metabug is closed? or 1 week
>> after the metabug is created?
>>
>>
>> Not sure if this provide any difference on what we are doing now: create the
>> branchpoint, open a metabug with the desire features, and cherry-pick all the
>> patches that solves the metabug.
>>
>
> 18.1 example:
>
>    1. Create a Metabug for the 18.1 branch point.
>    2. Announce the Metabug in mesa-dev and give 1 week (?) for developers
>       to complete their features. Advice to block the Metabug with other
>       feature bugs.
>    3. Developers create bugs with the WIP features they want to include in
>       18.1 and block the Metabug.
>    4. After 1 week, check the status
>        * If there are no blockers, close the Metabug and create the 18.1
>          branch point.
>        * If there are blockers; coordinate with the developers of the
>          blockers and decide whether to give a bit more of margin if the
>          feature is almost complete or just remove the blocking bugs
>          leaving the WIP features out, close the Metabug and create the
>          18.1 branch point.
>    5. Release 18.1.0-rc1.
>    6.
Create a Metabug to track the status of the final 18.1.0 release.
>    7. Block this Metabug with regressions found from 18.1.0-rcX.
>    8. Once we reach stability, close the Metabug and announce the final
>       release of 18.1.0.
>

I might sound a bit negative, yet I'm not sure what this brings us.
Can you please elaborate?

The original goal is to have time based releases, as opposed to
feature ones. That was reiterated by developers not too long ago.

So far, there has been an announcement email 2-4 weeks before the
branch point, aiming to:
 - remind, and
 - seek feedback about required features

The email was also followed by a weekly ping/reminder. IIRC suggestions
and requests that are made in timely fashion* have always been
accepted.

If we adopt the above approach, this will:
 - lead to noticeable delays in the branch point, which combined with
 - the current delays getting the blocking bugs fixed.
equals
 - even greater delays and fewer time based releases

Furthermore I'm a bit worried that this might have a negative impact on
developers:
I don't know of any instances, yet some developers may put extra pressure
on themselves trying to get 'too many' features merged.
Leading to stress, burn out and others.

Perhaps we can somehow utilise your suggestion while ensuring that my
grim 'predictions' do not come true?

Thanks
Emil

* 3+days/a week before the branch point
Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb
This is good, though some older distros only have libxcb 1.11. Marek On Sun, Mar 11, 2018 at 7:26 PM, Dave Airliewrote: > From: Dave Airlie > > I'm not sure everyone wants to be updating their dri3 in a forced > march setting, this allows a nicer approach, esp when you want > to build on distro that aren't brand new. > > I'm sure there are plenty of ways this patch could be cleaner, > and I've also not built it against an updated dri3. > --- > configure.ac | 4 ++-- > src/egl/drivers/dri2/platform_x11_dri3.c | 4 > src/loader/loader_dri3_helper.c | 22 -- > src/loader/loader_dri3_helper.h | 3 ++- > 4 files changed, 24 insertions(+), 9 deletions(-) > > diff --git a/configure.ac b/configure.ac > index 1553ce9..6a1f139 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -92,9 +92,9 @@ WAYLAND_REQUIRED=1.11 > WAYLAND_PROTOCOLS_REQUIRED=1.8 > XCB_REQUIRED=1.9.3 > XCBDRI2_REQUIRED=1.8 > -XCBDRI3_REQUIRED=1.13 > +XCBDRI3_REQUIRED=1.12 > XCBGLX_REQUIRED=1.8.1 > -XCBPRESENT_REQUIRED=1.13 > +XCBPRESENT_REQUIRED=1.12 > XDAMAGE_REQUIRED=1.1 > XSHMFENCE_REQUIRED=1.1 > XVMC_REQUIRED=1.0.6 > diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c > b/src/egl/drivers/dri2/platform_x11_dri3.c > index dce3356..efe030a 100644 > --- a/src/egl/drivers/dri2/platform_x11_dri3.c > +++ b/src/egl/drivers/dri2/platform_x11_dri3.c > @@ -327,6 +327,7 @@ dri3_create_image_khr_pixmap_from_buffers(_EGLDisplay > *disp, _EGLContext *ctx, >EGLClientBuffer buffer, >const EGLint *attr_list) > { > +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0 > struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); > struct dri2_egl_image *dri2_img; > xcb_dri3_buffers_from_pixmap_cookie_t bp_cookie; > @@ -376,6 +377,9 @@ dri3_create_image_khr_pixmap_from_buffers(_EGLDisplay > *disp, _EGLContext *ctx, > } > > return _img->base; > +#else > + return NULL; > +#endif > } > > static _EGLImage * > diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c > index 585f7ce..624ef1b 100644 
> --- a/src/loader/loader_dri3_helper.c > +++ b/src/loader/loader_dri3_helper.c > @@ -389,6 +389,7 @@ dri3_handle_present_event(struct loader_dri3_drawable > *draw, > /* If the server tells us that our allocation is suboptimal, we >* reallocate once. >*/ > +#ifdef XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY > if (ce->mode == XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY && > draw->last_present_mode != ce->mode) { > for (int b = 0; b < ARRAY_SIZE(draw->buffers); b++) { > @@ -396,7 +397,7 @@ dri3_handle_present_event(struct loader_dri3_drawable > *draw, >draw->buffers[b]->reallocate = true; > } > } > - > +#endif > draw->last_present_mode = ce->mode; > > if (draw->vtable->show_fps) > @@ -903,10 +904,10 @@ loader_dri3_swap_buffers_msc(struct > loader_dri3_drawable *draw, > */ >if (!loader_dri3_have_image_blit(draw) && draw->cur_blit_source != -1) > options |= XCB_PRESENT_OPTION_COPY; > - > +#ifdef XCB_PRESENT_OPTION_SUBOPTIMAL >if (draw->multiplanes_available) > options |= XCB_PRESENT_OPTION_SUBOPTIMAL; > - > +#endif >back->busy = 1; >back->last_swap = draw->send_sbc; >xcb_present_pixmap(draw->conn, > @@ -1053,6 +1054,7 @@ image_format_to_fourcc(int format) > return 0; > } > > +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0 > static bool > has_supported_modifier(struct loader_dri3_drawable *draw, unsigned int > format, > uint64_t *modifiers, uint32_t count) > @@ -1087,6 +1089,7 @@ has_supported_modifier(struct loader_dri3_drawable > *draw, unsigned int format, > free(supported_modifiers); > return found; > } > +#endif > > /** loader_dri3_alloc_render_buffer > * > @@ -1132,6 +1135,7 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable > *draw, unsigned int format, >goto no_image; > > if (!draw->is_different_gpu) { > +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0 >if (draw->multiplanes_available && >draw->ext->image->base.version >= 15 && >draw->ext->image->queryDmaBufModifiers && > @@ -1195,7 +1199,7 @@ dri3_alloc_render_buffer(struct 
loader_dri3_drawable > *draw, unsigned int format, > buffer); > free(modifiers); >} > - > +#endif >if (!buffer->image) > buffer->image = draw->ext->image->createImage(draw->dri_screen, > width, height, > @@ -1272,6 +1276,7 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable > *draw, unsigned int
Re: [Mesa-dev] Few issues with Meson
Hi Dylan

Do you have the link to patch on patchwork? I'll give it a go

I'm using meson 0.45 however the cross-file requires more than just
defining llvm-config, everything else is normally picked up from what
portage is setting in the build environment - though strangely not if
clang is used - I'll look into that sometime

Regards

Mike

On Fri, 9 Mar 2018 at 16:37 Dylan Baker wrote:

> Quoting Mike Lothian (2018-03-06 05:07:34)
> > Hi
> >
> > When compiling wine I also noticed that the d3d.pc files didn't have
> > moduledir set, so wine couldn't find it
> >
> > configure: error: pkg-config couldn't find Gallium Nine module
>
> I've sent a patch for this.
>
> > Regards
> >
> > Mike
> >
> > On Tue, 6 Mar 2018 at 02:17 Mike Lothian wrote:
> >
> > > Hi
> > >
> > > I've been trying to get a Gentoo ebuild ready for meson
> > >
> > > I've had to fudge the llvm-config for cross compiling a 32bit mesa on
> > > a 64bit machine
>
> If you're using a new enough meson (0.45) you can specify the llvm-config you
> want to use in the cross file.
>
> > > I notice that -Dvulkan-drivers= doesn't accept intel,radeon like
> > > autotools used to, it also seems as long as one value is correct the
> > > other is ignored
>
> we're using amd instead of radeon. After 18.0 branches I want to bump the meson
> requirement so we can use meson's list argument type, which will check for such
> problems.
>
> > > Also -Dva-libs-path= doesn't play well with absolute paths, or rather
> > > install_megadrivers.py is doing something strange - normally gentoo
> > > installs everything to a temporary image path then puts those files
> > > into the live system. It seems install_megadrivers.py doesn't do this
> > > and installs directly to the live system - I worked around it by
> > > dropping the /usr
>
> There's a patch from someone in FreeBSD that might fix this (the way we do
> symlinking in install_megadrivers is wrong).
>
> Sorry it took me so long to find this email, notmuch applied some odd tags to
> it.
>
> Dylan
[Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first is just a really bad idea. We can simply memcpy the whole thing. --- src/compiler/glsl/opt_copy_propagation.cpp | 13 -- .../glsl/opt_copy_propagation_elements.cpp | 29 -- 2 files changed, 15 insertions(+), 27 deletions(-) diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp index e904e6ede4..96667779da 100644 --- a/src/compiler/glsl/opt_copy_propagation.cpp +++ b/src/compiler/glsl/opt_copy_propagation.cpp @@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions) this->killed_all = false; /* Populate the initial acp with a copy of the original */ - struct hash_entry *entry; - hash_table_foreach(orig_acp, entry) { - _mesa_hash_table_insert(acp, entry->key, entry->data); - } + acp = _mesa_hash_table_clone(orig_acp, NULL); visit_list_elements(this, instructions); @@ -271,10 +268,10 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp) this->killed_all = false; if (keep_acp) { - struct hash_entry *entry; - hash_table_foreach(orig_acp, entry) { - _mesa_hash_table_insert(acp, entry->key, entry->data); - } + acp = _mesa_hash_table_clone(orig_acp, NULL); + } else { + acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer, +_mesa_key_pointer_equal); } visit_list_elements(this, >body_instructions); diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp b/src/compiler/glsl/opt_copy_propagation_elements.cpp index 9f79fa9202..8bae424a1d 100644 --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp @@ -124,6 +124,12 @@ public: ralloc_free(mem_ctx); } + void clone_acp(hash_table *lhs, hash_table *rhs) + { + lhs_ht = _mesa_hash_table_clone(lhs, mem_ctx); + rhs_ht = _mesa_hash_table_clone(rhs, mem_ctx); + } + void create_acp() { lhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer, @@ -138,19 +144,6 @@ public: 
_mesa_hash_table_destroy(rhs_ht, NULL); } - void populate_acp(hash_table *lhs, hash_table *rhs) - { - struct hash_entry *entry; - - hash_table_foreach(lhs, entry) { - _mesa_hash_table_insert(lhs_ht, entry->key, entry->data); - } - - hash_table_foreach(rhs, entry) { - _mesa_hash_table_insert(rhs_ht, entry->key, entry->data); - } - } - void handle_loop(ir_loop *, bool keep_acp); virtual ir_visitor_status visit_enter(class ir_loop *); virtual ir_visitor_status visit_enter(class ir_function_signature *); @@ -395,10 +388,8 @@ ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions) this->kills = new(mem_ctx) exec_list; this->killed_all = false; - create_acp(); - /* Populate the initial acp with a copy of the original */ - populate_acp(orig_lhs_ht, orig_rhs_ht); + clone_acp(orig_lhs_ht, orig_rhs_ht); visit_list_elements(this, instructions); @@ -454,11 +445,11 @@ ir_copy_propagation_elements_visitor::handle_loop(ir_loop *ir, bool keep_acp) this->kills = new(mem_ctx) exec_list; this->killed_all = false; - create_acp(); - if (keep_acp) { /* Populate the initial acp with a copy of the original */ - populate_acp(orig_lhs_ht, orig_rhs_ht); + clone_acp(orig_lhs_ht, orig_rhs_ht); + } else { + create_acp(); } visit_list_elements(this, >body_instructions); -- 2.15.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] util: Implement a hash table cloning function
V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)
---
 src/util/hash_table.c | 22 ++++++++++++++++++++++
 src/util/hash_table.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index b7421a0144..f8d5d0f88a 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -141,6 +141,28 @@ _mesa_hash_table_create(void *mem_ctx,
    return ht;
 }

+struct hash_table *
+_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx)
+{
+   struct hash_table *ht;
+
+   ht = ralloc(dst_mem_ctx, struct hash_table);
+   if (ht == NULL)
+      return NULL;
+
+   memcpy(ht, src, sizeof(struct hash_table));
+
+   ht->table = ralloc_array(ht, struct hash_entry, ht->size);
+   if (ht->table == NULL) {
+      ralloc_free(ht);
+      return NULL;
+   }
+
+   memcpy(ht->table, src->table, ht->size * sizeof(struct hash_entry));
+
+   return ht;
+}
+
 /**
  * Frees the given hash table.
  *
diff --git a/src/util/hash_table.h b/src/util/hash_table.h
index d3e0758b26..3846dad4b4 100644
--- a/src/util/hash_table.h
+++ b/src/util/hash_table.h
@@ -62,6 +62,8 @@ _mesa_hash_table_create(void *mem_ctx,
                         uint32_t (*key_hash_function)(const void *key),
                         bool (*key_equals_function)(const void *a,
                                                     const void *b));
+struct hash_table *
+_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx);
 void _mesa_hash_table_destroy(struct hash_table *ht,
                               void (*delete_function)(struct hash_entry *entry));
 void _mesa_hash_table_clear(struct hash_table *ht,
--
2.15.1
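The idea behind the patch above can be tried outside of Mesa with a toy open-addressing table. The sketch below is a self-contained illustration, not Mesa's hash_table: all demo_* names are hypothetical, it uses malloc instead of ralloc, and it treats key 0 as "empty slot". It shows that copying the header and the slot array with memcpy yields an independent table with identical lookups, without rehashing each entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct demo_entry {
   uint32_t key;   /* 0 marks an empty slot in this toy table */
   uint32_t value;
};

struct demo_table {
   struct demo_entry *slots;
   uint32_t size;     /* number of slots */
   uint32_t entries;  /* number of occupied slots */
};

static struct demo_table *
demo_table_create(uint32_t size)
{
   struct demo_table *t = malloc(sizeof(*t));
   if (t == NULL)
      return NULL;

   t->slots = calloc(size, sizeof(*t->slots));
   if (t->slots == NULL) {
      free(t);
      return NULL;
   }
   t->size = size;
   t->entries = 0;
   return t;
}

/* Linear probing. The caller must use non-zero keys and keep at least
 * one slot free, otherwise the probe loops would not terminate. */
static void
demo_table_insert(struct demo_table *t, uint32_t key, uint32_t value)
{
   uint32_t i = key % t->size;

   while (t->slots[i].key != 0 && t->slots[i].key != key)
      i = (i + 1) % t->size;
   if (t->slots[i].key == 0)
      t->entries++;
   t->slots[i].key = key;
   t->slots[i].value = value;
}

static int
demo_table_lookup(const struct demo_table *t, uint32_t key, uint32_t *value)
{
   uint32_t i = key % t->size;

   while (t->slots[i].key != 0) {
      if (t->slots[i].key == key) {
         *value = t->slots[i].value;
         return 1;
      }
      i = (i + 1) % t->size;
   }
   return 0;
}

/* Clone by copying the header and the whole slot array; no hashing. */
static struct demo_table *
demo_table_clone(const struct demo_table *src)
{
   struct demo_table *t = malloc(sizeof(*t));
   if (t == NULL)
      return NULL;

   memcpy(t, src, sizeof(*t));
   t->slots = malloc(src->size * sizeof(*t->slots));
   if (t->slots == NULL) {
      free(t);
      return NULL;
   }
   memcpy(t->slots, src->slots, src->size * sizeof(*t->slots));
   return t;
}
```

The trade-off is that the copy is proportional to the table's capacity rather than to the number of live entries, so a large, mostly empty table costs more to clone this way than its entry count suggests.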
[Mesa-dev] [PATCH 0/2] Hash table cloning for copy propagation
This is a revival of some old patches I had around to improve the compile times in the GLSL compiler by reducing the time spent inserting items in the hash table in opt_copy_propagation. I've only rebased this, as my system doesn't even want to compile anything right now. I also don't remember if it was thoroughly tested, so that will have to be done. Sending it out as Dave might be interested in this to mitigate some of the overhead his soft-double implementation incurs.

CC: Dave Airlie

Thomas Helland (2):
  util: Implement a hash table cloning function
  glsl: Use hash table cloning in copy propagation

 src/compiler/glsl/opt_copy_propagation.cpp | 13 --
 .../glsl/opt_copy_propagation_elements.cpp | 29 --
 src/util/hash_table.c | 22 
 src/util/hash_table.h | 2 ++
 4 files changed, 39 insertions(+), 27 deletions(-)
-- 
2.15.1
[Mesa-dev] [PATCH v3] i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by returning a map)
and add new functions (un)map_tiled_memcpy that return a shadow buffer
populated with the intel_tiled_memcpy functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
v3: Add units to parameter names of tile_extents (Nanley Chery)
    Use _mesa_align_malloc for the shadow copy (Nanley)
    Continue using gtt maps on gen4 (Nanley)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 94 ---
 1 file changed, 86 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c6213b21629..fba17bf5b7b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -31,6 +31,7 @@
 #include "intel_image.h"
 #include "intel_mipmap_tree.h"
 #include "intel_tex.h"
+#include "intel_tiled_memcpy.h"
 #include "intel_blit.h"
 #include "intel_fbo.h"
@@ -3046,10 +3047,10 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
 }
 static void
-intel_miptree_map_gtt(struct brw_context *brw,
-                      struct intel_mipmap_tree *mt,
-                      struct intel_miptree_map *map,
-                      unsigned int level, unsigned int slice)
+intel_miptree_map_map(struct brw_context *brw,
+                      struct intel_mipmap_tree *mt,
+                      struct intel_miptree_map *map,
+                      unsigned int level, unsigned int slice)
 {
    unsigned int bw, bh;
    void *base;
@@ -3093,11 +3094,81 @@ intel_miptree_map_gtt(struct brw_context *brw,
 }
 static void
-intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
+intel_miptree_unmap_map(struct intel_mipmap_tree *mt)
 {
    intel_miptree_unmap_raw(mt);
 }
+/* Compute extent parameters for use with tiled_memcpy functions.
+ * xs are in units of bytes and ys are in units of strides.
+ */
+static inline void
+tile_extents(struct intel_mipmap_tree *mt, struct intel_miptree_map *map,
+             unsigned int level, unsigned int slice, unsigned int *x1_B,
+             unsigned int *x2_B, unsigned int *y1_el, unsigned int *y2_el)
+{
+   unsigned int block_width, block_height;
+   unsigned int x0_el, y0_el;
+
+   _mesa_get_format_block_size(mt->format, &block_width, &block_height);
+
+   assert(map->x % block_width == 0);
+   assert(map->y % block_height == 0);
+
+   intel_miptree_get_image_offset(mt, level, slice, &x0_el, &y0_el);
+   *x1_B = (map->x / block_width + x0_el) * mt->cpp;
+   *y1_el = map->y / block_height + y0_el;
+   *x2_B = (DIV_ROUND_UP(map->x + map->w, block_width) + x0_el) * mt->cpp;
+   *y2_el = DIV_ROUND_UP(map->y + map->h, block_height) + y0_el;
+}
+
+static void
+intel_miptree_map_tiled_memcpy(struct brw_context *brw,
+                               struct intel_mipmap_tree *mt,
+                               struct intel_miptree_map *map,
+                               unsigned int level, unsigned int slice)
+{
+   unsigned int x1, x2, y1, y2;
+   tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+   map->stride = _mesa_format_row_stride(mt->format, map->w);
+   map->buffer = map->ptr = _mesa_align_malloc(map->stride * (y2 - y1), 16);
+
+   assert(map->ptr);
+
+   if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
+      char *src = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+      src += mt->offset;
+
+      tiled_to_linear(x1, x2, y1, y2, map->ptr, src, map->stride,
+                      mt->surf.row_pitch, brw->has_swizzling, mt->surf.tiling,
+                      memcpy);
+
+      intel_miptree_unmap_raw(mt);
+   }
+}
+
+static void
+intel_miptree_unmap_tiled_memcpy(struct brw_context *brw,
+                                 struct intel_mipmap_tree *mt,
+                                 struct intel_miptree_map *map,
+                                 unsigned int level,
+                                 unsigned int slice)
+{
+   if (map->mode & GL_MAP_WRITE_BIT) {
+      unsigned int x1, x2, y1, y2;
+      tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+
+      char *dst = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+      dst += mt->offset;
+
+      linear_to_tiled(x1, x2, y1, y2, dst, map->ptr, mt->surf.row_pitch,
+                      map->stride, brw->has_swizzling, mt->surf.tiling, memcpy);
+
+      intel_miptree_unmap_raw(mt);
+   }
+   _mesa_align_free(map->buffer);
+   map->buffer = map->ptr = NULL;
+}
+
 static void
 intel_miptree_map_blit(struct brw_context *brw,
                        struct intel_mipmap_tree *mt,
@@ -3655,8 +3726,10 @@ intel_miptree_map(struct brw_context *brw,
               (mt->surf.row_pitch % 16 == 0)) {
       intel_miptree_map_movntdqa(brw, mt, map, level, slice);
 #endif
+   } else if (mt->surf.tiling != ISL_TILING_LINEAR && brw->screen->devinfo.gen > 4) {
+      intel_miptree_map_tiled_memcpy(brw, mt,
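To make the extent arithmetic in tile_extents easier to check by hand, here is a minimal standalone sketch of the same computation. The struct names and compute_extents are hypothetical and only mirror the formulas in the patch; the miptree image offset (x0_el, y0_el), block size, and bytes per element (cpp) are passed in directly instead of being queried from Mesa.

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical plain-data stand-ins for the map rectangle and the result. */
struct map_rect { unsigned x, y, w, h; };               /* in pixels */
struct extents { unsigned x1_B, x2_B, y1_el, y2_el; };  /* bytes / rows */

/* Same formulas as tile_extents in the patch: x values are converted to
 * bytes, y values stay in rows of blocks, and the far edges round up so a
 * partially covered block is included in full.
 */
static struct extents
compute_extents(struct map_rect r, unsigned block_w, unsigned block_h,
                unsigned cpp, unsigned x0_el, unsigned y0_el)
{
   /* The mapped origin must sit on a block boundary. */
   assert(r.x % block_w == 0);
   assert(r.y % block_h == 0);

   struct extents e;
   e.x1_B = (r.x / block_w + x0_el) * cpp;
   e.y1_el = r.y / block_h + y0_el;
   e.x2_B = (DIV_ROUND_UP(r.x + r.w, block_w) + x0_el) * cpp;
   e.y2_el = DIV_ROUND_UP(r.y + r.h, block_h) + y0_el;
   return e;
}
```

For example, with 4x4 blocks of 8 bytes each, an image offset of (2, 1) elements, and a map rectangle of 10x6 pixels at (8, 4), the extents are bytes 32 to 56 horizontally and rows 2 to 4 vertically.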
Re: [Mesa-dev] [PATCH v4 1/2] gallium/winsys/kms: Fix possible leak in map/unmap.
Ping. Any more comments or missing stuff to get this committed into master?

Thanks.

On Wed, Mar 7, 2018 at 2:39 PM, Lepton Wu wrote:
> If user calls map twice for kms_sw_displaytarget, the first mapped
> buffer could get leaked. Instead of calling mmap every time, just
> reuse previous mapping. Since user could map same displaytarget with
> different flags, we have to keep two different pointers, one for rw
> mapping and one for ro mapping.
>
> Change-Id: I65308f0ff2640bd57b2577c6a3469540c9722859
> Signed-off-by: Lepton Wu
> ---
>  .../winsys/sw/kms-dri/kms_dri_sw_winsys.c | 21 ---
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> index 22e1c936ac5..7fc40488c2e 100644
> --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> @@ -70,6 +70,7 @@ struct kms_sw_displaytarget
>
>     uint32_t handle;
>     void *mapped;
> +   void *ro_mapped;
>
>     int ref_count;
>     struct list_head link;
> @@ -198,16 +199,19 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
>        return NULL;
>
>     prot = (flags == PIPE_TRANSFER_READ) ? PROT_READ : (PROT_READ | PROT_WRITE);
> -   kms_sw_dt->mapped = mmap(0, kms_sw_dt->size, prot, MAP_SHARED,
> -                            kms_sw->fd, map_req.offset);
> -
> -   if (kms_sw_dt->mapped == MAP_FAILED)
> -      return NULL;
> +   void **ptr = (flags == PIPE_TRANSFER_READ) ? &kms_sw_dt->ro_mapped : &kms_sw_dt->mapped;
> +   if (!*ptr) {
> +      void *tmp = mmap(0, kms_sw_dt->size, prot, MAP_SHARED,
> +                       kms_sw->fd, map_req.offset);
> +      if (tmp == MAP_FAILED)
> +         return NULL;
> +      *ptr = tmp;
> +   }
>
>     DEBUG_PRINT("KMS-DEBUG: mapped buffer %u (size %u) at %p\n",
> -         kms_sw_dt->handle, kms_sw_dt->size, kms_sw_dt->mapped);
> +         kms_sw_dt->handle, kms_sw_dt->size, *ptr);
>
> -   return kms_sw_dt->mapped;
> +   return *ptr;
>  }
>
>  static struct kms_sw_displaytarget *
> @@ -278,9 +282,12 @@ kms_sw_displaytarget_unmap(struct sw_winsys *ws,
>     struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
>
>     DEBUG_PRINT("KMS-DEBUG: unmapped buffer %u (was %p)\n", kms_sw_dt->handle, kms_sw_dt->mapped);
> +   DEBUG_PRINT("KMS-DEBUG: unmapped buffer %u (was %p)\n", kms_sw_dt->handle, kms_sw_dt->ro_mapped);
>
>     munmap(kms_sw_dt->mapped, kms_sw_dt->size);
>     kms_sw_dt->mapped = NULL;
> +   munmap(kms_sw_dt->ro_mapped, kms_sw_dt->size);
> +   kms_sw_dt->ro_mapped = NULL;
>  }
>
>  static struct sw_displaytarget *
> --
> 2.16.2.395.g2e18187dfd-goog
>
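The idea in the quoted patch, one lazily created mapping per protection mode that is reused on later map calls, can be sketched in isolation as follows. The cached_maps struct and both functions are hypothetical names, not the winsys code; only mmap and munmap are real POSIX calls. Unlike the quoted patch, this sketch skips munmap for a pointer that was never mapped, which is a design choice made here rather than something taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical holder for one buffer's cached mappings. */
struct cached_maps {
   int fd;
   size_t size;
   off_t offset;
   void *rw;   /* read-write mapping, NULL until first requested */
   void *ro;   /* read-only mapping, NULL until first requested */
};

/* Return the cached mapping for the requested mode, creating it on first
 * use. Repeated calls with the same mode return the same pointer, so no
 * earlier mapping is overwritten and leaked.
 */
static void *
cached_map(struct cached_maps *c, int read_only)
{
   void **slot = read_only ? &c->ro : &c->rw;

   if (*slot == NULL) {
      int prot = read_only ? PROT_READ : (PROT_READ | PROT_WRITE);
      void *tmp = mmap(NULL, c->size, prot, MAP_SHARED, c->fd, c->offset);
      if (tmp == MAP_FAILED)
         return NULL;   /* leave the slot NULL so a later call can retry */
      *slot = tmp;
   }
   return *slot;
}

/* Drop whichever mappings exist. */
static void
cached_unmap_all(struct cached_maps *c)
{
   if (c->rw != NULL) {
      munmap(c->rw, c->size);
      c->rw = NULL;
   }
   if (c->ro != NULL) {
      munmap(c->ro, c->size);
      c->ro = NULL;
   }
}
```

Storing the mmap result in a temporary first keeps MAP_FAILED out of the cached slot, so a failed attempt is not mistaken for a valid mapping on the next call.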
Re: [Mesa-dev] [v4 PATCH 3/6] spirv_extensions: add list of extensions and to_string method
Adding Jason and Ian here for their opinions.

Quoting Alejandro Piñeiro (2018-03-12 01:31:02)
> On 11/03/18 18:08, Dylan Baker wrote:
> > Quoting Alejandro Piñeiro (2018-03-08 07:00:16)
> >> Ideally this should be generated somehow. One option would be to gather
> >> all the extension dependencies listed in the core grammar, but there
> >> would be the possibility of not including some of the extensions.
> >>
> >> Note that spirv-tools is doing it just slightly better, as it has a
> >> hardcoded list of extensions manually taken from the registry, that
> >> they parse to get the enum and the to_string method (see
> >> generate_grammar_tables.py).
> > If there were extensions not in the core grammar that we wanted to or
> > needed to support, are they available in a different format that is
> > still machine readable?
>
> Taking a look at the latest version of the core grammar [1], it seems that
> all the extensions except SPV_AMD_gcn_shader are now part of the core. For
> the latter, I found a json file as part of spirv-tools [3].
>
> But when I wrote this patch, some of the extensions were not part of the
> core, and as far as I saw, they were just listed on the registry [2]. I
> was not able to find an individual json core grammar for some extensions
> then. In the commit message I mention generate_grammar_tables. From a
> comment there:
>
> #Extensions to recognize, but which don't necessarily come from the SPIR-V
> #core grammar. Get this list from the SPIR-V registery web page.
>
> So right now one option would be to create that list from the core grammar
> plus the AMD grammar. But what would happen if a new extension is
> defined without a grammar file? Would we just write one ourselves? Would
> we ask Khronos (or whoever defined the spec) to provide one?
>
> BR
>
> [1] https://github.com/KhronosGroup/SPIRV-Headers/blob/master/include/spirv/1.0/spirv.core.grammar.json
> [2] https://www.khronos.org/registry/spir-v/extensions/KHR/
> [3] https://github.com/KhronosGroup/SPIRV-Tools/blob/master/source/extinst.spv-amd-gcn-shader.grammar.json
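For reference, the "hardcoded list that yields both the enum and the to_string method" approach discussed above can be sketched in C with an X-macro, so the list is written once and both artifacts are derived from it. This is only an illustration under assumptions: the TOY_ prefix, the enum, and toy_spv_extension_to_string are hypothetical names, not the ones used in the patch series or in spirv-tools, and the four extension names are just examples from the registry.

```c
#include <assert.h>
#include <string.h>

/* Single hand-maintained list; each ENTRY(name) line is one extension. */
#define TOY_SPV_EXTENSIONS(ENTRY)      \
   ENTRY(SPV_KHR_shader_draw_parameters) \
   ENTRY(SPV_KHR_multiview)              \
   ENTRY(SPV_KHR_variable_pointers)      \
   ENTRY(SPV_AMD_gcn_shader)

/* Expand the list into enum values: TOY_SPV_KHR_multiview, and so on. */
enum toy_spv_extension {
#define AS_ENUM(name) TOY_##name,
   TOY_SPV_EXTENSIONS(AS_ENUM)
#undef AS_ENUM
   TOY_SPV_EXTENSIONS_COUNT
};

/* Expand the same list into switch cases that return the extension name. */
static const char *
toy_spv_extension_to_string(enum toy_spv_extension ext)
{
   switch (ext) {
#define AS_CASE(name) case TOY_##name: return #name;
   TOY_SPV_EXTENSIONS(AS_CASE)
#undef AS_CASE
   default:
      return "unknown";
   }
}
```

Adding an extension then means adding one ENTRY line. Whether that list is written by hand or emitted by a script from the grammar JSON files is the open question in this thread; the sketch does not settle it.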