date:20180312

Re: [Mesa-dev] [PATCH 02/50] glsl: Add "built-in" functions to do neg(fp64) (v2)

2018-03-12 Thread Roland Scheidegger

On 13.03.2018 at 05:24, Dave Airlie wrote:
> From: Elie Tournier 
> 
> v2: use mix.
> 
> Signed-off-by: Elie Tournier 
> ---
>  src/compiler/glsl/builtin_float64.h | 51 
> +
>  src/compiler/glsl/builtin_functions.cpp |  4 +++
>  src/compiler/glsl/builtin_functions.h   |  3 ++
>  src/compiler/glsl/float64.glsl  | 24 
>  src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
>  5 files changed, 83 insertions(+)
> 
> diff --git a/src/compiler/glsl/builtin_float64.h 
> b/src/compiler/glsl/builtin_float64.h
> index 7b57231..2898fc9 100644
> --- a/src/compiler/glsl/builtin_float64.h
> +++ b/src/compiler/glsl/builtin_float64.h
> @@ -17,3 +17,54 @@ fabs64(void *mem_ctx, builtin_available_predicate avail)
> sig->replace_parameters(&sig_parameters);
> return sig;
>  }
> +ir_function_signature *
> +is_nan(void *mem_ctx, builtin_available_predicate avail)
> +{
> +   ir_function_signature *const sig =
> +  new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
> +   ir_factory body(&sig->body, mem_ctx);
> +   sig->is_defined = true;
> +
> +   exec_list sig_parameters;
> +
> +   ir_variable *const r000C = new(mem_ctx) 
> ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
> +   sig_parameters.push_tail(r000C);
> +   ir_expression *const r000D = lshift(swizzle_y(r000C), 
> body.constant(int(1)));
> +   ir_expression *const r000E = gequal(r000D, body.constant(4292870144u));
> +   ir_expression *const r000F = nequal(swizzle_x(r000C), body.constant(0u));
> +   ir_expression *const r0010 = bit_and(swizzle_y(r000C), 
> body.constant(1048575u));
> +   ir_expression *const r0011 = nequal(r0010, body.constant(0u));
> +   ir_expression *const r0012 = logic_or(r000F, r0011);
> +   ir_expression *const r0013 = logic_and(r000E, r0012);
> +   body.emit(ret(r0013));
> +
> +   sig->replace_parameters(&sig_parameters);
> +   return sig;
> +}
> +ir_function_signature *
> +fneg64(void *mem_ctx, builtin_available_predicate avail)
> +{
> +   ir_function_signature *const sig =
> +  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
> +   ir_factory body(&sig->body, mem_ctx);
> +   sig->is_defined = true;
> +
> +   exec_list sig_parameters;
> +
> +   ir_variable *const r0014 = new(mem_ctx) 
> ir_variable(glsl_type::uvec2_type, "a", ir_var_function_in);
> +   sig_parameters.push_tail(r0014);
> +   ir_expression *const r0015 = lshift(swizzle_y(r0014), 
> body.constant(int(1)));
> +   ir_expression *const r0016 = gequal(r0015, body.constant(4292870144u));
> +   ir_expression *const r0017 = nequal(swizzle_x(r0014), body.constant(0u));
> +   ir_expression *const r0018 = bit_and(swizzle_y(r0014), 
> body.constant(1048575u));
> +   ir_expression *const r0019 = nequal(r0018, body.constant(0u));
> +   ir_expression *const r001A = logic_or(r0017, r0019);
> +   ir_expression *const r001B = logic_and(r0016, r001A);
> +   ir_expression *const r001C = bit_xor(swizzle_y(r0014), 
> body.constant(2147483648u));
> +   body.emit(assign(r0014, expr(ir_triop_csel, r001B, swizzle_y(r0014), 
> r001C), 0x02));
> +
> +   body.emit(ret(r0014));
> +
> +   sig->replace_parameters(&sig_parameters);
> +   return sig;
> +}
> diff --git a/src/compiler/glsl/builtin_functions.cpp 
> b/src/compiler/glsl/builtin_functions.cpp
> index 133a896..9d88a31 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3346,6 +3346,10 @@ builtin_builder::create_builtins()
>  generate_ir::fabs64(mem_ctx, integer_functions_supported),
>  NULL);
>  
> +   add_function("__builtin_fneg64",
> +generate_ir::fneg64(mem_ctx, integer_functions_supported),
> +NULL);
> +
>  #undef F
>  #undef FI
>  #undef FIUD_VEC
> diff --git a/src/compiler/glsl/builtin_functions.h 
> b/src/compiler/glsl/builtin_functions.h
> index deaf640..adec424 100644
> --- a/src/compiler/glsl/builtin_functions.h
> +++ b/src/compiler/glsl/builtin_functions.h
> @@ -70,6 +70,9 @@ udivmod64(void *mem_ctx, builtin_available_predicate avail);
>  ir_function_signature *
>  fabs64(void *mem_ctx, builtin_available_predicate avail);
>  
> +ir_function_signature *
> +fneg64(void *mem_ctx, builtin_available_predicate avail);
> +
>  }
>  
>  #endif /* BULITIN_FUNCTIONS_H */
> diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
> index d798d7e..fedf8b7 100644
> --- a/src/compiler/glsl/float64.glsl
> +++ b/src/compiler/glsl/float64.glsl
> @@ -6,6 +6,7 @@
>  
>  #version 130
>  #extension GL_ARB_shader_bit_encoding : enable
> +#extension GL_EXT_shader_integer_mix : enable
>  
>  /* Software IEEE floating-point rounding mode.
>   * GLSL spec section "4.7.1 Range and Precision":
> @@ -27,3 +28,26 @@ fabs64(uvec2 a)
> a.y &= 0x7FFFFFFFu;
> return a;
>  }
> +
> +/* Returns 1 if the double-precision floating-point value `a' is a NaN;
> + * otherwise returns 0.
> + */
>
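
The generated IR quoted above is hard to read, so here is a minimal
C++ sketch of what is_nan and fneg64 appear to compute. It assumes the
uvec2 holds the low word in .x and the high word in .y; the struct
name Fp64Bits is made up for this illustration and is not part of the
patch.

```cpp
#include <cstdint>

// A double modelled as two 32-bit words: x = low word, y = high word
// (sign bit, 11 exponent bits, top 20 mantissa bits). Assumed layout.
struct Fp64Bits {
    std::uint32_t x;  // low 32 bits
    std::uint32_t y;  // high 32 bits
};

// NaN test: exponent all ones and a non-zero mantissa.
bool is_nan(Fp64Bits a)
{
    // Shifting y left by one drops the sign bit; 0xFFE00000 (4292870144)
    // is then an all-ones exponent with a zero high mantissa.
    const bool exp_all_ones = (a.y << 1) >= 0xFFE00000u;
    // 0x000FFFFF (1048575) selects the 20 mantissa bits stored in y.
    const bool mantissa_nonzero =
        (a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u);
    return exp_all_ones && mantissa_nonzero;
}

// Negation: flip the sign bit (0x80000000 == 2147483648) unless the
// input is a NaN, in which case the high word is left untouched.
Fp64Bits fneg64(Fp64Bits a)
{
    a.y = is_nan(a) ? a.y : (a.y ^ 0x80000000u);
    return a;
}
```

For example, 1.0 has the high word 0x3FF00000, so fneg64 would return
0xBFF00000 in .y, while a quiet NaN such as 0x7FF80000 passes through
unchanged.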

[Mesa-dev] [PATCH 49/50] gallium: add pipe double support enum + docs

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

---
 src/gallium/docs/source/screen.rst   | 4 +++-
 src/gallium/include/pipe/p_defines.h | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index e375d67..42e4f32 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -361,7 +361,9 @@ The integer capabilities:
 * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the
   ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property.
 * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations
-  are supported.
+  are supported. PIPE_DOUBLES_HW indicates HW support for doubles,
+  PIPE_DOUBLES_EMULATE indicates the driver wants the state tracker to
+  lower doubles.
 * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported.
 * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
   operations are supported.
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index ed8eeb8..b104007 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -1098,6 +1098,13 @@ enum pipe_debug_type
PIPE_DEBUG_TYPE_CONFORMANCE,
 };
 
+enum pipe_double_support
+{
+   PIPE_DOUBLES_NONE = 0,
+   PIPE_DOUBLES_HW = 1,
+   PIPE_DOUBLES_EMULATE = 2
+};
+
 #define PIPE_UUID_SIZE 16
 
 #ifdef __cplusplus
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 45/50] glsl: Add a lowering pass for 64-bit float frac()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/lower_instructions.cpp | 25 +
 1 file changed, 25 insertions(+)

diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 3064eef..94b262d 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -181,6 +181,7 @@ private:
void dmax_to_less(ir_expression *ir);
void dfloor_to_dtrunc(ir_expression *ir);
void dceil_to_dtrunc(ir_expression *ir);
+   void dfrac_to_dtrunc(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1761,6 +1762,24 @@ 
lower_instructions_visitor::dceil_to_dtrunc(ir_expression *ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
+{
+   ir_expression *const floor_expr =
+  new(ir) ir_expression(ir_unop_floor,
+ir->operands[0]->type, ir->operands[0]);
+   dfloor_to_dtrunc(floor_expr);
+   ir_expression *const neg_expr =
+  new(ir) ir_expression(ir_unop_neg,
+ir->operands[0]->type, floor_expr);
+
+   ir->operation = ir_binop_add;
+   ir->init_num_operands();
+   ir->operands[1] = neg_expr;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1926,6 +1945,12 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
  dmax_to_less(ir);
   break;
 
+   case ir_unop_fract:
+  if (lowering(DOPS_TO_DTRUNC) &&
+  ir->type->is_double())
+ dfrac_to_dtrunc(ir);
+  break;
+
default:
   return visit_continue;
}
-- 
2.9.5
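
As a rough illustration of the rewrite in this patch, fract(x) becomes
x + (-floor(x)), with the floor itself lowered through trunc. The
sketch below works on plain C++ doubles rather than GLSL IR. The body
of floor_via_trunc is an assumption, since dfloor_to_dtrunc is not
shown in this patch, and both function names are invented here.

```cpp
#include <cmath>

// floor() expressed through trunc(). This is only what a trunc-based
// floor lowering has to compute; the real dfloor_to_dtrunc may differ.
double floor_via_trunc(double x)
{
    const double t = std::trunc(x);
    // trunc rounds toward zero, so negative non-integers need one
    // more step down.
    return (x < 0.0 && t != x) ? t - 1.0 : t;
}

// fract(x) rewritten as an add of the negated floor, matching the
// ir_binop_add / ir_unop_neg pair the patch builds.
double fract_via_trunc(double x)
{
    return x + (-floor_via_trunc(x));
}
```

With this, fract_via_trunc(2.75) and fract_via_trunc(-0.25) both give
0.75, which matches the GLSL definition fract(x) = x - floor(x).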


[Mesa-dev] [PATCH 44/50] glsl: Add a lowering pass for 64-bit float ceil()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

[airlied: handle vector case]
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/lower_instructions.cpp | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 03246e6..3064eef 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -180,6 +180,7 @@ private:
void dmin_to_less(ir_expression *ir);
void dmax_to_less(ir_expression *ir);
void dfloor_to_dtrunc(ir_expression *ir);
+   void dceil_to_dtrunc(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1739,6 +1740,27 @@ 
lower_instructions_visitor::dfloor_to_dtrunc(ir_expression *ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::dceil_to_dtrunc(ir_expression *ir)
+{
+   /* if x < 0,                    ceil(x) = trunc(x)
+    * else if (x - trunc(x) == 0), ceil(x) = x
+    * else,                        ceil(x) = trunc(x) + 1
+    */
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *src = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *tr = trunc(src);
+
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   ir->operands[0] = logic_or(less(src, new(ir) ir_constant(0.0, vec_elem)),
+  equal(src, tr));
+   ir->operands[1] = tr;
+   ir->operands[2] = add(tr, new(ir) ir_constant(1.0, vec_elem));
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1822,8 +1844,13 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
   break;
 
case ir_unop_ceil:
-  if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
- dceil_to_dfrac(ir);
+  if (ir->type->is_double()) {
+ if (lowering(DOPS_TO_DFRAC)) {
+dceil_to_dfrac(ir);
+ } else if (lowering(DOPS_TO_DTRUNC)) {
+dceil_to_dtrunc(ir);
+ }
+  }
   break;
 
case ir_unop_floor:
-- 
2.9.5
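
The csel built in dceil_to_dtrunc can be read as the following scalar
C++ sketch. It uses std::trunc as a stand-in for the lowered trunc,
and the name ceil_via_trunc is made up for this example.

```cpp
#include <cmath>

// ceil(x) from trunc(x), following the comment in the patch:
//   x < 0            -> trunc(x)
//   x == trunc(x)    -> x (which equals trunc(x))
//   otherwise        -> trunc(x) + 1
double ceil_via_trunc(double x)
{
    const double tr = std::trunc(x);
    // csel(cond, tr, tr + 1): keep trunc(x) when x is negative or
    // already integral, otherwise step up by one.
    return (x < 0.0 || x == tr) ? tr : tr + 1.0;
}
```

For instance, ceil_via_trunc(1.25) gives 2.0 and ceil_via_trunc(-1.25)
gives -1.0.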


[Mesa-dev] [PATCH 24/50] glsl: Add a lowering pass for 64-bit float less()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 17db074..b5f8c45 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -63,8 +63,10 @@
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
 #define EQ64  (1U << 6)
+#define LT64  (1U << 7)
+
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index d5e0f32..24cc3cd 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -457,6 +457,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_less:
+  if (lowering(LT64)) {
+ if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_flt64", generate_ir::flt64);
+  }
+  break;
+
case ir_binop_mod:
   if (lowering(MOD64)) {
  if (ir->type->base_type == GLSL_TYPE_UINT64) {
-- 
2.9.5
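
For readers new to these flags, the sketch below shows how the bit
mask is meant to be combined and tested. The ABS64, NEG64, EQ64 and
LT64 values are copied from the hunk above; the SIGN64 value is not
visible there, so the one used here is an assumption. The free
function lowering() is a stand-in for the visitor's own check.

```cpp
#include <cstdint>

// Values visible in the hunk above.
constexpr std::uint32_t ABS64 = 1u << 4;
constexpr std::uint32_t NEG64 = 1u << 5;
constexpr std::uint32_t EQ64  = 1u << 6;
constexpr std::uint32_t LT64  = 1u << 7;
// Not shown in this patch; assumed value for illustration only.
constexpr std::uint32_t SIGN64 = 1u << 1;

constexpr std::uint32_t LOWER_ALL_DOUBLE_OPS =
    ABS64 | NEG64 | SIGN64 | EQ64 | LT64;

// Stand-in for the visitor's lowering() test: is `flag` enabled in
// the mask the caller passed to the lowering pass?
constexpr bool lowering(std::uint32_t enabled, std::uint32_t flag)
{
    return (enabled & flag) != 0u;
}
```

A caller that passes LOWER_ALL_DOUBLE_OPS gets the new less() lowering,
while a caller that passes only ABS64 | NEG64 does not.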


[Mesa-dev] [PATCH 48/50] glsl: add lowering for mod64()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

This lowers mod to floor using the same code as the float lowering.
It also lowers the resulting subtraction for doubles, to avoid
creating more instructions that need lowering.

Signed-off-by: Dave Airlie 
---
 src/compiler/glsl/ir_optimization.h  | 1 +
 src/compiler/glsl/lower_instructions.cpp | 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 38e35e3..5e6c82a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -58,6 +58,7 @@
 #define DOPS_TO_DTRUNC0x80
 #define DRSQ_TO_DRCP  0x100
 #define DFMA_TO_DMULADD   0x200
+#define DMOD_TO_FLOOR 0x400
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index b8f7224..d2a838c 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -359,6 +359,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
 
if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
   dfloor_to_dfrac(floor_expr);
+   else if (lowering(DOPS_TO_DTRUNC) && ir->type->is_double())
+  dfloor_to_dtrunc(floor_expr);
 
ir_expression *const mul_expr =
   new(ir) ir_expression(ir_binop_mul,
@@ -369,6 +371,8 @@ lower_instructions_visitor::mod_to_floor(ir_expression *ir)
ir->init_num_operands();
ir->operands[0] = new(ir) ir_dereference_variable(x);
ir->operands[1] = mul_expr;
+   if (ir->type->is_double())
+  sub_to_add_neg(ir);
this->progress = true;
 }
 
@@ -1855,8 +1859,9 @@ lower_instructions_visitor::visit_leave(ir_expression *ir)
   break;
 
case ir_binop_mod:
-  if (lowering(MOD_TO_FLOOR) && (ir->type->is_float() || 
ir->type->is_double()))
-mod_to_floor(ir);
+  if ((lowering(MOD_TO_FLOOR) && ir->type->is_float()) ||
+  (lowering(DMOD_TO_FLOOR) && ir->type->is_double()))
+ mod_to_floor(ir);
   break;
 
case ir_binop_pow:
-- 
2.9.5
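
In scalar terms, the mod lowering computes x - y * floor(x / y), and
for doubles the final subtraction is turned into an add of a negated
operand. A small C++ sketch, with std::floor standing in for the
lowered floor and an invented function name:

```cpp
#include <cmath>

// mod(x, y) = x - y * floor(x / y), written the way the pass leaves
// it for doubles: the subtraction becomes x + (-(y * floor(x / y))).
double mod_via_floor(double x, double y)
{
    const double floor_expr = std::floor(x / y);
    const double mul_expr = y * floor_expr;
    // sub_to_add_neg: a - b is rewritten as a + (-b).
    return x + (-mul_expr);
}
```

For example, mod_via_floor(5.5, 2.0) gives 1.5 and
mod_via_floor(-1.0, 3.0) gives 2.0, matching GLSL mod() rather than C
fmod().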


[Mesa-dev] [PATCH 35/50] glsl: Add a lowering pass for 64-bit float round()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 6ef75f5..44d07bc 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -73,10 +73,11 @@
 #define F2D   (1U << 14)
 #define SQRT64(1U << 15)
 #define TRUNC64   (1U << 16)
+#define ROUND64   (1U << 17)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
   ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-  D2F | F2D | SQRT64 | TRUNC64)
+  D2F | F2D | SQRT64 | TRUNC64 | ROUND64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 3c34211..38c264f 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -466,6 +466,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_round_even:
+  if (lowering(ROUND64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fround64", 
generate_ir::fround64);
+  }
+  break;
+
case ir_unop_sign:
   if (lowering(SIGN64)) {
  if (ir->type->is_integer_64())
-- 
2.9.5


[Mesa-dev] [PATCH 34/50] glsl: Add a lowering pass for 64-bit float trunc()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 1b5d50a..6ef75f5 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -72,10 +72,11 @@
 #define D2F   (1U << 13)
 #define F2D   (1U << 14)
 #define SQRT64(1U << 15)
+#define TRUNC64   (1U << 16)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
   ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-  D2F | F2D | SQRT64)
+  D2F | F2D | SQRT64 | TRUNC64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 4920150..3c34211 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -482,6 +482,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_trunc:
+  if (lowering(TRUNC64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_ftrunc64", 
generate_ir::ftrunc64);
+  }
+  break;
+
case ir_unop_u2d:
   if (lowering(U2D)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


[Mesa-dev] [PATCH 39/50] glsl: Add a lowering pass for 64-bit float gequal()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/lower_64bit.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index c4b8e78..0dc6070 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -405,7 +405,8 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
 
   body.emit(c);
 
-  if (ir->operation == ir_unop_d2b)
+  if (ir->operation == ir_unop_d2b ||
+  ir->operation == ir_binop_gequal)
  body.emit(assign(dst[i], logic_not(dst[i])));
}
 
@@ -605,6 +606,7 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_gequal:
case ir_binop_less:
   if (lowering(LT64)) {
  if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
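
The effect of this patch is that a >= b is emitted as a call to the
less-than builtin followed by a logical not. A scalar C++ sketch of
that shape, where flt64_stand_in is a hypothetical placeholder for
__builtin_flt64 and simply uses the native comparison:

```cpp
// Hypothetical placeholder for __builtin_flt64; the real builtin
// works on the uvec2 bit pattern, not on a native double.
bool flt64_stand_in(double a, double b)
{
    return a < b;
}

// gequal(a, b) lowered as logic_not(flt64(a, b)).
bool gequal_via_less(double a, double b)
{
    return !flt64_stand_in(a, b);
}
```

One thing to keep in mind: with native IEEE comparisons, !(a < b) is
true when either operand is a NaN, whereas a >= b is false. Whether
the real builtin compensates for that is not visible in this patch.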


[Mesa-dev] [PATCH 29/50] glsl: Add a lowering pass for 64-bit float d2i()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index a4cb7b2..3cc7f2e 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -67,9 +67,10 @@
 #define ADD64 (1U << 8)
 #define D2U   (1U << 9)
 #define U2D   (1U << 10)
+#define D2I   (1U << 11)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64 | MUL64 | D2U | U2D)
+  ADD64 | MUL64 | D2U | U2D | D2I)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 1e97306..7b2ffe8 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_d2i:
+  if (lowering(D2I)) {
+ if (ir->type->base_type == GLSL_TYPE_INT)
+*rvalue = handle_op(ir, "__builtin_fp64_to_int", 
generate_ir::fp64_to_int);
+  }
+  break;
+
case ir_unop_d2u:
   if (lowering(D2U)) {
  if (ir->type->base_type == GLSL_TYPE_UINT)
-- 
2.9.5


[Mesa-dev] [PATCH 27/50] glsl: Add a lowering pass for 64-bit float d2u()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 6506e28..e3d573c 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -65,9 +65,10 @@
 #define EQ64  (1U << 6)
 #define LT64  (1U << 7)
 #define ADD64 (1U << 8)
+#define D2U   (1U << 9)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64 | MUL64)
+  ADD64 | MUL64 | D2U)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index f3a2633..1b90830 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_d2u:
+  if (lowering(D2U)) {
+ if (ir->type->base_type == GLSL_TYPE_UINT)
+*rvalue = handle_op(ir, "__builtin_fp64_to_uint", 
generate_ir::fp64_to_uint);
+  }
+  break;
+
case ir_unop_neg:
   if (lowering(NEG64)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


[Mesa-dev] [PATCH 33/50] glsl: Add a lowering pass for 64-bit float sqrt()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index c649c80..1b5d50a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -71,10 +71,11 @@
 #define I2D   (1U << 12)
 #define D2F   (1U << 13)
 #define F2D   (1U << 14)
+#define SQRT64(1U << 15)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
   ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-  D2F | F2D)
+  D2F | F2D | SQRT64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 126c961..4920150 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -475,6 +475,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_sqrt:
+  if (lowering(SQRT64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fsqrt64", generate_ir::fsqrt64);
+  }
+  break;
+
case ir_unop_u2d:
   if (lowering(U2D)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


Re: [Mesa-dev] soft fp64 support - main body (glsl/gallium)

2018-03-12 Thread Dave Airlie

On 13 March 2018 at 14:24, Dave Airlie  wrote:
> This is the main code for the soft fp64 work. It's mostly Elie's
> code with a bunch of changes by me.
>

All the patches are in my tree here, along with some other bits:
https://cgit.freedesktop.org/~airlied/mesa/log/?h=glsl_arb_gpu_shader_fp64_v4

Dave.

[Mesa-dev] [PATCH 50/50] st/glsl: enable fp64 lowering support

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

This enables fp64 emulation if the driver requests it
with PIPE_CAP_DOUBLES set to PIPE_DOUBLES_EMULATE.

It moves the mat->vec lowering earlier, as we don't want to hit any
matrix operations in the double lowering. If we lower div->rcp we
also end up with the wrong type of matrix multiply, so moving the
pass avoids that problem.

Otherwise it just enables all the fp64 lowering.
---
 src/mesa/state_tracker/st_extensions.c |  2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 13 -
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 3b8e226..524b021 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1254,7 +1254,7 @@ void st_init_extensions(struct pipe_screen *screen,
}
 #endif
 
-   if (screen->get_param(screen, PIPE_CAP_DOUBLES)) {
+   if (screen->get_param(screen, PIPE_CAP_DOUBLES) != PIPE_DOUBLES_NONE) {
   extensions->ARB_gpu_shader_fp64 = GL_TRUE;
   extensions->ARB_vertex_attrib_64bit = GL_TRUE;
}
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index b608635..abcadd0 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -7028,9 +7028,21 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
  options->EmitNoIndirectUniform);
   }
 
+  do_mat_op_to_vec(ir);
+
   if (!pscreen->get_param(pscreen, PIPE_CAP_INT64_DIVMOD))
  lower_64bit_instructions(ir, DIV64 | MOD64);
 
+  /* Enable double lowering if the hardware doesn't support doubles.
+   * The lowering requires GLSL >= 130.
+   */
+  if ((pscreen->get_param(pscreen, PIPE_CAP_DOUBLES) == 
PIPE_DOUBLES_EMULATE) &&
+  ctx->Const.GLSLVersion >= 130) {
+ lower_instructions(ir, DDIV_TO_MUL_RCP | DMIN_DMAX_TO_LESS | 
DOPS_TO_DTRUNC | DRSQ_TO_DRCP | DFMA_TO_DMULADD |
+DMOD_TO_FLOOR | (have_dfrexp ? 0 : 
DFREXP_DLDEXP_TO_ARITH));
+ lower_64bit_instructions(ir, LOWER_ALL_DOUBLE_OPS);
+  }
+
   if (ctx->Extensions.ARB_shading_language_packing) {
  unsigned lower_inst = LOWER_PACK_SNORM_2x16 |
LOWER_UNPACK_SNORM_2x16 |
@@ -7053,7 +7065,6 @@ st_link_shader(struct gl_context *ctx, struct 
gl_shader_program *prog)
 
   if (!pscreen->get_param(pscreen, PIPE_CAP_TEXTURE_GATHER_OFFSETS))
  lower_offset_arrays(ir);
-  do_mat_op_to_vec(ir);
 
   if (stage == MESA_SHADER_FRAGMENT)
  lower_blend_equation_advanced(
-- 
2.9.5
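
The two decisions this patch makes from PIPE_CAP_DOUBLES can be
summarised in a small C++ sketch. The enum values are the ones from
patch 49/50; the two helper functions are hypothetical names invented
here and do not exist in the state tracker.

```cpp
// Values as introduced in patch 49/50.
enum pipe_double_support
{
    PIPE_DOUBLES_NONE = 0,
    PIPE_DOUBLES_HW = 1,
    PIPE_DOUBLES_EMULATE = 2
};

// Hypothetical helper: the fp64 extensions are exposed for both
// hardware and emulated support.
bool exposes_fp64(pipe_double_support cap)
{
    return cap != PIPE_DOUBLES_NONE;
}

// Hypothetical helper: the lowering passes only run when the driver
// asked for emulation and the GLSL version is at least 130.
bool needs_fp64_lowering(pipe_double_support cap, unsigned glsl_version)
{
    return cap == PIPE_DOUBLES_EMULATE && glsl_version >= 130;
}
```

A driver returning PIPE_DOUBLES_HW therefore still gets the
extensions, but none of the lowering passes.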


[Mesa-dev] [PATCH 47/50] glsl: add a lowering pass for dfma to dmuladd.

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

This only lowers dfma to dmuladd for now; I don't think the
difference will matter for anything we care about.

This also fixes the double dot-to-fma lowering to take this
flag into account and avoid creating further fmas.

Signed-off-by: Dave Airlie 
---
 src/compiler/glsl/ir_optimization.h  |  1 +
 src/compiler/glsl/lower_instructions.cpp | 35 
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index e6f9ad3..38e35e3 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -57,6 +57,7 @@
 #define DMIN_DMAX_TO_LESS 0x40
 #define DOPS_TO_DTRUNC0x80
 #define DRSQ_TO_DRCP  0x100
+#define DFMA_TO_DMULADD   0x200
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index d13a99b..b8f7224 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -184,7 +184,7 @@ private:
void dceil_to_dtrunc(ir_expression *ir);
void dfrac_to_dtrunc(ir_expression *ir);
void drsq_to_drcp(ir_expression *ir);
-
+   void dfma_to_dmuladd(ir_expression *ir);
ir_expression *_carry(operand a, operand b);
 };
 
@@ -873,9 +873,12 @@ 
lower_instructions_visitor::double_dot_to_fma(ir_expression *ir)
  assig = assign(temp, mul(swizzle(ir->operands[0]->clone(ir, NULL), i, 
1),
   swizzle(ir->operands[1]->clone(ir, NULL), i, 
1)));
   } else {
- assig = assign(temp, fma(swizzle(ir->operands[0]->clone(ir, NULL), i, 
1),
-  swizzle(ir->operands[1]->clone(ir, NULL), i, 
1),
-  temp));
+ ir_expression *fma_expr = fma(swizzle(ir->operands[0]->clone(ir, 
NULL), i, 1),
+   swizzle(ir->operands[1]->clone(ir, 
NULL), i, 1),
+   temp);
+ if (lowering(DFMA_TO_DMULADD))
+dfma_to_dmuladd(fma_expr);
+ assig = assign(temp, fma_expr);
   }
   this->base_ir->insert_before(assig);
}
@@ -886,6 +889,8 @@ lower_instructions_visitor::double_dot_to_fma(ir_expression 
*ir)
ir->operands[1] = swizzle(ir->operands[1], 0, 1);
ir->operands[2] = new(ir) ir_dereference_variable(temp);
 
+   if (lowering(DFMA_TO_DMULADD))
+  dfma_to_dmuladd(ir);
this->progress = true;
 
 }
@@ -1783,6 +1788,22 @@ 
lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
 }
 
 void
+lower_instructions_visitor::dfma_to_dmuladd(ir_expression *ir)
+{
+   ir_variable *temp = new(ir) ir_variable(ir->operands[0]->type, "temp",
+   ir_var_temporary);
+   ir_rvalue *arg = ir->operands[2];
+   ir_instruction &i = *base_ir;
+   i.insert_before(temp);
+   i.insert_before(assign(temp, mul(ir->operands[0], ir->operands[1])));
+
+   ir->operation = ir_binop_add;
+   ir->init_num_operands();
+   ir->operands[0] = new(ir) ir_dereference_variable(temp);
+   ir->operands[1] = arg->clone(ir, NULL);
+   this->progress = true;
+}
+void
 lower_instructions_visitor::drsq_to_drcp(ir_expression *ir)
 {
ir_expression *const sqrt_expr =
@@ -1976,6 +1997,12 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
  dfrac_to_dtrunc(ir);
   break;
 
+   case ir_triop_fma:
+  if (lowering(DFMA_TO_DMULADD) &&
+  ir->type->is_double())
+ dfma_to_dmuladd(ir);
+  break;
+
default:
   return visit_continue;
}
-- 
2.9.5
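
In scalar form, dfma_to_dmuladd replaces fma(a, b, c) with a
temporary holding a * b followed by an add. A C++ sketch, with an
invented function name:

```cpp
// fma(a, b, c) lowered to a separate multiply and add, mirroring the
// temp = a * b; result = temp + c sequence the pass emits.
double muladd(double a, double b, double c)
{
    const double temp = a * b;  // rounded once here
    return temp + c;            // and rounded again here
}
```

The difference from a true fused multiply-add is that the product is
rounded before the add, so the result can differ in the last bit.
Note that a C++ compiler is allowed to contract this expression back
into an fma on some targets, so this is only a model of the IR
rewrite.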


[Mesa-dev] [PATCH 46/50] glsl: Add a lowering pass for 64-bit float rsq()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

---
 src/compiler/glsl/ir_optimization.h  |  1 +
 src/compiler/glsl/lower_instructions.cpp | 25 +
 2 files changed, 26 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index ba0c101..e6f9ad3 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -56,6 +56,7 @@
 #define SQRT_TO_ABS_SQRT  0x20
 #define DMIN_DMAX_TO_LESS 0x40
 #define DOPS_TO_DTRUNC0x80
+#define DRSQ_TO_DRCP  0x100
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 94b262d..d13a99b 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -45,6 +45,7 @@
  * - DOPS_TO_DFRAC
  * - DMIN_DMAX_TO_LESS
  * - DOPS_TO_DTRUNC
+ * - DRSQ_TO_DRCP
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -182,6 +183,7 @@ private:
void dfloor_to_dtrunc(ir_expression *ir);
void dceil_to_dtrunc(ir_expression *ir);
void dfrac_to_dtrunc(ir_expression *ir);
+   void drsq_to_drcp(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1780,6 +1782,22 @@ 
lower_instructions_visitor::dfrac_to_dtrunc(ir_expression *ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::drsq_to_drcp(ir_expression *ir)
+{
+   ir_expression *const sqrt_expr =
+  new(ir) ir_expression(ir_unop_sqrt,
+ir->operands[0]->type, ir->operands[0]);
+   if (lowering(SQRT_TO_ABS_SQRT))
+  sqrt_to_abs_sqrt(sqrt_expr);
+
+   ir->operation = ir_unop_rcp;
+   ir->init_num_operands();
+   ir->operands[0] = sqrt_expr;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1928,6 +1946,13 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
   break;
 
case ir_unop_rsq:
+  if (lowering(DRSQ_TO_DRCP) &&
+  ir->type->is_double())
+ drsq_to_drcp(ir);
+  else if (lowering(SQRT_TO_ABS_SQRT))
+ sqrt_to_abs_sqrt(ir);
+  break;
+
case ir_unop_sqrt:
   if (lowering(SQRT_TO_ABS_SQRT))
  sqrt_to_abs_sqrt(ir);
-- 
2.9.5
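
The rewrite here is rsq(x) = rcp(sqrt(x)). As a scalar C++ sketch,
with std::sqrt standing in for the lowered sqrt and an invented
function name:

```cpp
#include <cmath>

// rsq(x) lowered to rcp(sqrt(x)): build the sqrt first, then take
// its reciprocal, as drsq_to_drcp does on the IR.
double rsq_via_rcp(double x)
{
    const double sqrt_expr = std::sqrt(x);
    return 1.0 / sqrt_expr;
}
```

For example, rsq_via_rcp(4.0) gives 0.5.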


[Mesa-dev] [PATCH 42/50] glsl: Add a lowering pass for 64-bit float max()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

[airlied: update to handle max(dvec, double) case]
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/lower_instructions.cpp | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 8c3d623..144bc41 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -177,6 +177,7 @@ private:
void imul_high_to_mul(ir_expression *ir);
void sqrt_to_abs_sqrt(ir_expression *ir);
void dmin_to_less(ir_expression *ir);
+   void dmax_to_less(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1693,6 +1694,26 @@ lower_instructions_visitor::dmin_to_less(ir_expression 
*ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::dmax_to_less(ir_expression *ir)
+{
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *x_clone = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *y_clone = ir->operands[1]->clone(ir, NULL);
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   if (ir->operands[1]->type->vector_elements == 1 && vec_elem > 1) {
+  ir->operands[0] = less(ir->operands[0], swizzle(ir->operands[1], 
SWIZZLE_, vec_elem));
+  ir->operands[1] = swizzle(y_clone, SWIZZLE_, vec_elem);
+   } else {
+  ir->operands[0] = less(ir->operands[0], ir->operands[1]);
+  ir->operands[1] = y_clone;
+   }
+   ir->operands[2] = x_clone;
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1842,6 +1863,12 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
  dmin_to_less(ir);
   break;
 
+   case ir_binop_max:
+  if (lowering(DMIN_DMAX_TO_LESS) &&
+  ir->type->is_double())
+ dmax_to_less(ir);
+  break;
+
default:
   return visit_continue;
}
-- 
2.9.5
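
The max(dvec, double) case handled above can be pictured as replicating the scalar across the vector width and then selecting per component. The sketch below is only a model under that assumption: dvec4, replicate() and max_via_less() are hypothetical names, and replicate() stands in for the swizzle call in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using dvec4 = std::array<double, 4>;

// Hypothetical stand-in for the swizzle in the patch: copy the scalar y
// into the first n components of a vector.
static dvec4 replicate(double y, std::size_t n)
{
   assert(n >= 1 && n <= 4);
   dvec4 v{};
   for (std::size_t i = 0; i < n; i++)
      v[i] = y;
   return v;
}

// Model of dmax_to_less() with a scalar second operand:
// csel(less(x, yv), yv, x), evaluated per component.
static dvec4 max_via_less(const dvec4 &x, double y, std::size_t n)
{
   const dvec4 yv = replicate(y, n);
   dvec4 r{};
   for (std::size_t i = 0; i < n; i++)
      r[i] = (x[i] < yv[i]) ? yv[i] : x[i];
   return r;
}
```

Components beyond n are left at zero here purely to keep the model simple; the IR version has no such padding because the result type carries its own width.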


[Mesa-dev] [PATCH 43/50] glsl: Add a lowering pass for 64-bit float floor()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

[airlied: handle vector cases]
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h  |  1 +
 src/compiler/glsl/lower_instructions.cpp | 34 ++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index f9b688a..ba0c101 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -55,6 +55,7 @@
 #define DIV_TO_MUL_RCP(FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
 #define SQRT_TO_ABS_SQRT  0x20
 #define DMIN_DMAX_TO_LESS 0x40
+#define DOPS_TO_DTRUNC0x80
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 144bc41..03246e6 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -44,6 +44,7 @@
  * - SAT_TO_CLAMP
  * - DOPS_TO_DFRAC
  * - DMIN_DMAX_TO_LESS
+ * - DOPS_TO_DTRUNC
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -178,6 +179,7 @@ private:
void sqrt_to_abs_sqrt(ir_expression *ir);
void dmin_to_less(ir_expression *ir);
void dmax_to_less(ir_expression *ir);
+   void dfloor_to_dtrunc(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1714,6 +1716,29 @@ lower_instructions_visitor::dmax_to_less(ir_expression 
*ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::dfloor_to_dtrunc(ir_expression *ir)
+{
+   /*
+* For x >= 0, floor(x) = trunc(x)
+* For x < 0,
+*- if x is integer, floor(x) = x
+*- otherwise, floor(x) = trunc(x) - 1
+*/
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *src = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *tr = trunc(src);
+
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   ir->operands[0] = logic_or(gequal(src, new(ir) ir_constant(0.0, vec_elem)),
+  equal(src, tr));
+   ir->operands[1] = tr;
+   ir->operands[2] = add(tr, new(ir) ir_constant(-1.0, vec_elem));
+
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1802,8 +1827,13 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
   break;
 
case ir_unop_floor:
-  if (lowering(DOPS_TO_DFRAC) && ir->type->is_double())
- dfloor_to_dfrac(ir);
+  if (ir->type->is_double()) {
+ if (lowering(DOPS_TO_DFRAC)) {
+dfloor_to_dfrac(ir);
+ } else if (lowering(DOPS_TO_DTRUNC)) {
+dfloor_to_dtrunc(ir);
+ }
+  }
   break;
 
case ir_unop_round_even:
-- 
2.9.5
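
The comment in dfloor_to_dtrunc() can be checked with a small scalar model. This is a sketch of the same select expression using std::trunc, not the IR code itself; floor_via_trunc() is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Scalar model of dfloor_to_dtrunc():
//   csel(x >= 0 || x == trunc(x), trunc(x), trunc(x) + -1.0)
static double floor_via_trunc(double x)
{
   const double tr = std::trunc(x);
   return (x >= 0.0 || x == tr) ? tr : tr + -1.0;
}
```

For example, floor_via_trunc(-2.7) takes the second branch and gives -3.0, while floor_via_trunc(-3.0) is already integral and is returned unchanged.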


[Mesa-dev] [PATCH 32/50] glsl: Add a lowering pass for 64-bit float f2d()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index e7860ef..c649c80 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -70,10 +70,11 @@
 #define D2I   (1U << 11)
 #define I2D   (1U << 12)
 #define D2F   (1U << 13)
+#define F2D   (1U << 14)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
   ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
-  D2F)
+  D2F | F2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index e76ebdc..126c961 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -445,6 +445,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_f2d:
+  if (lowering(F2D)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fp32_to_fp64", 
generate_ir::fp32_to_fp64, true);
+  }
+  break;
+
case ir_unop_i2d:
   if (lowering(I2D)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5
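
The conversion patches in this series all follow the same pattern: define one bit per operation, OR it into LOWER_ALL_DOUBLE_OPS, and test it with lowering(). A minimal sketch of that pattern is below. The bit values are copied from the ir_optimization.h hunks; the lowering_mask struct is hypothetical, and it assumes lowering() is a plain bit test on the requested mask.

```cpp
#include <cassert>

// Bit values as they appear in the ir_optimization.h hunks of this series.
constexpr unsigned D2I = 1U << 11;
constexpr unsigned I2D = 1U << 12;
constexpr unsigned D2F = 1U << 13;
constexpr unsigned F2D = 1U << 14;

// Hypothetical model of the visitor's mask check (assumed to be a bit test).
struct lowering_mask {
   unsigned lower;

   bool lowering(unsigned op) const
   {
      return (lower & op) != 0u;
   }
};
```

A driver that only wants the float/double conversions lowered would pass D2F | F2D; lowering(F2D) is then true and lowering(I2D) is false.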


[Mesa-dev] [PATCH 40/50] glsl: Add a lowering pass for 64-bit float nequal()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/lower_64bit.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 0dc6070..f085dae 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -406,7 +406,8 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
   body.emit(c);
 
   if (ir->operation == ir_unop_d2b ||
-  ir->operation == ir_binop_gequal)
+  ir->operation == ir_binop_gequal ||
+  ir->operation == ir_binop_nequal)
  body.emit(assign(dst[i], logic_not(dst[i])));
}
 
@@ -599,6 +600,7 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_nequal:
case ir_binop_equal:
   if (lowering(EQ64)) {
  if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


[Mesa-dev] [PATCH 41/50] glsl: Add a lowering pass for 64-bit float min()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

[airlied: update to handle min(dvec, double) case]

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h  |  1 +
 src/compiler/glsl/lower_instructions.cpp | 33 
 2 files changed, 34 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 3c99ae0..f9b688a 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -54,6 +54,7 @@
 #define DDIV_TO_MUL_RCP   0x10
 #define DIV_TO_MUL_RCP(FDIV_TO_MUL_RCP | DDIV_TO_MUL_RCP)
 #define SQRT_TO_ABS_SQRT  0x20
+#define DMIN_DMAX_TO_LESS 0x40
 
 /* Opertaions for lower_64bit_integer_instructions() */
 #define MUL64 (1U << 0)
diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 91f71b3..8c3d623 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -43,6 +43,7 @@
  * - BORROW_TO_ARITH
  * - SAT_TO_CLAMP
  * - DOPS_TO_DFRAC
+ * - DMIN_DMAX_TO_LESS
  *
  * SUB_TO_ADD_NEG:
  * ---
@@ -115,6 +116,12 @@
  * DOPS_TO_DFRAC:
  * --
  * Converts double trunc, ceil, floor, round to fract
+ *
+ * DMIN_DMAX_TO_LESS:
+ * 
+ * Converts double min, max into less.
+ * min(x,y) = less(x,y) ? x : y;
+ * max(x,y) = less(x,y) ? y : x;
  */
 
 #include "c99_math.h"
@@ -169,6 +176,7 @@ private:
void find_msb_to_float_cast(ir_expression *ir);
void imul_high_to_mul(ir_expression *ir);
void sqrt_to_abs_sqrt(ir_expression *ir);
+   void dmin_to_less(ir_expression *ir);
 
ir_expression *_carry(operand a, operand b);
 };
@@ -1666,6 +1674,25 @@ 
lower_instructions_visitor::sqrt_to_abs_sqrt(ir_expression *ir)
this->progress = true;
 }
 
+void
+lower_instructions_visitor::dmin_to_less(ir_expression *ir)
+{
+   const unsigned vec_elem = ir->type->vector_elements;
+   ir_rvalue *x_clone = ir->operands[0]->clone(ir, NULL);
+   ir_rvalue *y_clone = ir->operands[1]->clone(ir, NULL);
+   ir->operation = ir_triop_csel;
+   ir->init_num_operands();
+   if (ir->operands[1]->type->vector_elements == 1 && vec_elem > 1) {
+  ir->operands[0] = less(ir->operands[0], swizzle(ir->operands[1], 
SWIZZLE_, vec_elem));
+  ir->operands[2] = swizzle(y_clone, SWIZZLE_, vec_elem);
+   } else {
+  ir->operands[0] = less(ir->operands[0], ir->operands[1]);
+  ir->operands[2] = y_clone;
+   }
+   ir->operands[1] = x_clone;
+   this->progress = true;
+}
+
 ir_visitor_status
 lower_instructions_visitor::visit_leave(ir_expression *ir)
 {
@@ -1809,6 +1836,12 @@ lower_instructions_visitor::visit_leave(ir_expression 
*ir)
  sqrt_to_abs_sqrt(ir);
   break;
 
+   case ir_binop_min:
+  if (lowering(DMIN_DMAX_TO_LESS) &&
+  ir->type->is_double())
+ dmin_to_less(ir);
+  break;
+
default:
   return visit_continue;
}
-- 
2.9.5
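
The DMIN_DMAX_TO_LESS comment describes both selects in terms of a single less() comparison. As a scalar sketch (function names are hypothetical), the only difference between the two is which operand sits in the "true" slot of the select:

```cpp
#include <cassert>

// min(x, y) = less(x, y) ? x : y
static double dmin_via_less(double x, double y)
{
   return (x < y) ? x : y;
}

// max(x, y) = less(x, y) ? y : x
static double dmax_via_less(double x, double y)
{
   return (x < y) ? y : x;
}
```

Since an IEEE less-than comparison is false whenever either operand is NaN, both forms fall through to their "false" operand in that case, so the result depends on operand order.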


[Mesa-dev] [PATCH 38/50] glsl/lower_64bit: lower d2b using comparison

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

This lowers d2b by comparing the value against 0 and inverting the
result. Not 100% sure this is always correct, but it passes piglit.
---
 src/compiler/glsl/lower_64bit.cpp | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 794cc3e..c4b8e78 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -349,7 +349,7 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
ir_expression *ir,
ir_function_signature *callee)
 {
-   const unsigned num_operands = ir->num_operands;
+   unsigned num_operands = ir->num_operands;
ir_variable *src[4][4];
ir_variable *dst[4];
void *const mem_ctx = ralloc_parent(ir);
@@ -378,6 +378,16 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
  source_components = ir->operands[i]->type->vector_elements;
}
 
+   if (ir->operation == ir_unop_d2b) {
+  for (unsigned i = 0; i < source_components; i++) {
+ src[1][i] = body.make_temp(glsl_type::uvec2_type, "zero");
+
+ body.emit(assign(src[1][i], body.constant(0u), 1));
+ body.emit(assign(src[1][i], body.constant(0u), 2));
+  }
+  num_operands++;
+   }
+
for (unsigned i = 0; i < source_components; i++) {
   dst[i] = body.make_temp(result_type, "expanded_64bit_result");
 
@@ -394,6 +404,9 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
   );
 
   body.emit(c);
+
+  if (ir->operation == ir_unop_d2b)
+ body.emit(assign(dst[i], logic_not(dst[i])));
}
 
ir_rvalue *rv;
@@ -475,6 +488,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_d2b:
+  if (lowering(EQ64)) {
+ if (ir->type->base_type == GLSL_TYPE_BOOL)
+*rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+  }
+  break;
+
case ir_unop_d2f:
   if (lowering(D2F)) {
  if (ir->type->base_type == GLSL_TYPE_FLOAT)
-- 
2.9.5
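
To make the d2b lowering concrete, here is a sketch on raw bit patterns, with a double held as two 32-bit words (x = low, y = high). The is_nan() test follows the constants in patch 02 of this series. The feq64() body is an assumption modelled on a SoftFloat-style equality and is not copied from the series; to_bits() and d2b_lowered() are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct uvec2 {
   std::uint32_t x; // low word
   std::uint32_t y; // high word
};

static_assert(sizeof(double) == 8, "sketch assumes 64-bit IEEE doubles");

// Hypothetical helper: split a double into its two 32-bit words.
static uvec2 to_bits(double d)
{
   std::uint64_t u;
   std::memcpy(&u, &d, sizeof u);
   return {static_cast<std::uint32_t>(u), static_cast<std::uint32_t>(u >> 32)};
}

// NaN test using the same constants as the is_nan() builtin in patch 02.
static bool is_nan(uvec2 a)
{
   return ((a.y << 1) >= 0xFFE00000u) &&
          (a.x != 0u || (a.y & 0x000FFFFFu) != 0u);
}

// Assumed SoftFloat-style equality: NaN never compares equal, and +0 == -0.
static bool feq64(uvec2 a, uvec2 b)
{
   if (is_nan(a) || is_nan(b))
      return false;
   return a.x == b.x &&
          (a.y == b.y || (a.x == 0u && ((a.y | b.y) << 1) == 0u));
}

// d2b as lowered above: compare against an all-zero uvec2, then invert.
static bool d2b_lowered(uvec2 a)
{
   const uvec2 zero = {0u, 0u};
   return !feq64(a, zero);
}
```

Under this model both +0.0 and -0.0 convert to false and a NaN input converts to true, which agrees with defining bool(x) as x != 0.0.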


[Mesa-dev] [PATCH 30/50] glsl: Add a lowering pass for 64-bit float i2d()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 3cc7f2e..f73faec 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -68,9 +68,10 @@
 #define D2U   (1U << 9)
 #define U2D   (1U << 10)
 #define D2I   (1U << 11)
+#define I2D   (1U << 12)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64 | MUL64 | D2U | U2D | D2I)
+  ADD64 | MUL64 | D2U | U2D | D2I | I2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 7b2ffe8..2900409 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -438,6 +438,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_i2d:
+  if (lowering(I2D)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_int_to_fp64", 
generate_ir::int_to_fp64, true);
+  }
+  break;
+
case ir_unop_neg:
   if (lowering(NEG64)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


[Mesa-dev] [PATCH 31/50] glsl: Add a lowering pass for 64-bit float d2f()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index f73faec..e7860ef 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -69,9 +69,11 @@
 #define U2D   (1U << 10)
 #define D2I   (1U << 11)
 #define I2D   (1U << 12)
+#define D2F   (1U << 13)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64 | MUL64 | D2U | U2D | D2I | I2D)
+  ADD64 | MUL64 | D2U | U2D | D2I | I2D | \
+  D2F)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 2900409..e76ebdc 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_d2f:
+  if (lowering(D2F)) {
+ if (ir->type->base_type == GLSL_TYPE_FLOAT)
+*rvalue = handle_op(ir, "__builtin_fp64_to_fp32", 
generate_ir::fp64_to_fp32);
+  }
+  break;
+
case ir_unop_d2i:
   if (lowering(D2I)) {
  if (ir->type->base_type == GLSL_TYPE_INT)
-- 
2.9.5


[Mesa-dev] [PATCH 37/50] glsl/lower_64bit: handle any/all operations

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

This splits all_equal/any_nequal into per-component comparisons and
combines the results.

Signed-off-by: Dave Airlie 
---
 src/compiler/glsl/lower_64bit.cpp | 61 ++-
 1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index ee6d6f9..794cc3e 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -60,6 +60,12 @@ ir_dereference_variable *compact_destination(ir_factory &,
 ir_dereference_variable *merge_destination(ir_factory &,
const glsl_type *type,
ir_variable *result[4]);
+ir_dereference_variable *all_equal_destination(ir_factory &,
+  const glsl_type *type,
+  ir_variable *result[4]);
+ir_dereference_variable *any_nequal_destination(ir_factory &,
+   const glsl_type *type,
+   ir_variable *result[4]);
 
 ir_rvalue *lower_op_to_function_call(ir_instruction *base_ir,
  ir_expression *ir,
@@ -297,6 +303,47 @@ lower_64bit::compact_destination(ir_factory &body,
return new(mem_ctx) ir_dereference_variable(compacted_result);
 }
 
+/*
+ * AND together the results from each per-component comparison.
+ */
+ir_dereference_variable *
+lower_64bit::all_equal_destination(ir_factory &body,
+const glsl_type *type,
+ir_variable *result[4])
+{
+   ir_variable *const merged_result =
+  body.make_temp(glsl_type::bool_type, "all_result");
+
+   body.emit(assign(merged_result, result[0]));
+   for (unsigned i = 1; i < type->vector_elements; i++) {
+  body.emit(assign(merged_result, logic_and(merged_result, result[i])));
+   }
+
+   void *const mem_ctx = ralloc_parent(merged_result);
+   return new(mem_ctx) ir_dereference_variable(merged_result);
+}
+
+/*
+ * AND together the results from each per-component comparison, then NOT the result.
+ */
+ir_dereference_variable *
+lower_64bit::any_nequal_destination(ir_factory &body,
+const glsl_type *type,
+ir_variable *result[4])
+{
+   ir_variable *const merged_result =
+  body.make_temp(glsl_type::bool_type, "any_result");
+
+   body.emit(assign(merged_result, result[0]));
+   for (unsigned i = 1; i < type->vector_elements; i++) {
+  body.emit(assign(merged_result, logic_and(merged_result, result[i])));
+   }
+
+   body.emit(assign(merged_result, logic_not(merged_result)));
+   void *const mem_ctx = ralloc_parent(merged_result);
+   return new(mem_ctx) ir_dereference_variable(merged_result);
+}
+
 ir_rvalue *
 lower_64bit::lower_op_to_function_call(ir_instruction *base_ir,
ir_expression *ir,
@@ -350,7 +397,11 @@ lower_64bit::lower_op_to_function_call(ir_instruction 
*base_ir,
}
 
ir_rvalue *rv;
-   if (ir->type->is_64bit())
+   if (ir->operation == ir_binop_all_equal)
+  rv = all_equal_destination(body, ir->type, dst);
+   else if (ir->operation == ir_binop_any_nequal)
+  rv = any_nequal_destination(body, ir->type, dst);
+   else if (ir->type->is_64bit())
   rv = compact_destination(body, ir->type, dst);
else
   rv = merge_destination(body, ir->type, dst);
@@ -560,6 +611,14 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_all_equal:
+   case ir_binop_any_nequal:
+  if (lowering(EQ64)) {
+if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE) {
+*rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+}
+  }
+  break;
default:
   break;
}
-- 
2.9.5
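
The two merge helpers above reduce a set of per-component equality results to one bool. A scalar sketch of that reduction follows; the function names and the plain bool array are hypothetical stand-ins for the result[] temporaries, each of which is assumed to already hold the outcome of one feq64 call.

```cpp
#include <cassert>

// all_equal: AND together the per-component equality results.
static bool all_equal_merge(const bool eq[4], unsigned n)
{
   assert(n >= 1 && n <= 4);
   bool merged = eq[0];
   for (unsigned i = 1; i < n; i++)
      merged = merged && eq[i];
   return merged;
}

// any_nequal: same AND reduction, followed by a logical NOT.
static bool any_nequal_merge(const bool eq[4], unsigned n)
{
   return !all_equal_merge(eq, n);
}
```

Only one comparison builtin is needed this way, because any_nequal(a, b) is simply the negation of all_equal(a, b).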


[Mesa-dev] [PATCH 25/50] glsl: Add a lowering pass for 64-bit float add()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 4 +++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index b5f8c45..691803e 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -64,8 +64,10 @@
 #define NEG64 (1U << 5)
 #define EQ64  (1U << 6)
 #define LT64  (1U << 7)
+#define ADD64 (1U << 8)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
+  ADD64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 24cc3cd..eed1dba 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -440,6 +440,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_add:
+  if (lowering(ADD64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fadd64", generate_ir::fadd64);
+  }
+  break;
+
case ir_binop_div:
   if (lowering(DIV64)) {
  if (ir->type->base_type == GLSL_TYPE_UINT64) {
-- 
2.9.5


[Mesa-dev] [PATCH 23/50] glsl: Add a lowering pass for 64-bit float equal()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 3a406ce..17db074 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -62,8 +62,9 @@
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
+#define EQ64  (1U << 6)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 88df912..d5e0f32 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -450,6 +450,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_binop_equal:
+  if (lowering(EQ64)) {
+ if (ir->operands[0]->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_feq64", generate_ir::feq64);
+  }
+  break;
+
case ir_binop_mod:
   if (lowering(MOD64)) {
  if (ir->type->base_type == GLSL_TYPE_UINT64) {
-- 
2.9.5


[Mesa-dev] [PATCH 22/50] glsl: Add a lowering pass for 64-bit float sign()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 2 +-
 src/compiler/glsl/lower_64bit.cpp   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 2d9728d..3a406ce 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -63,7 +63,7 @@
 #define ABS64 (1U << 4)
 #define NEG64 (1U << 5)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index bc9e477..88df912 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -435,6 +435,8 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   if (lowering(SIGN64)) {
  if (ir->type->is_integer_64())
 *rvalue = handle_op(ir, "__builtin_sign64", generate_ir::sign64);
+else if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fsign64", generate_ir::fsign64);
   }
   break;
 
-- 
2.9.5


[Mesa-dev] [PATCH 28/50] glsl: Add a lowering pass for 64-bit float u2d()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Handle non-64-bit sources (airlied)

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index e3d573c..a4cb7b2 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -66,9 +66,10 @@
 #define LT64  (1U << 7)
 #define ADD64 (1U << 8)
 #define D2U   (1U << 9)
+#define U2D   (1U << 10)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64 | MUL64 | D2U)
+  ADD64 | MUL64 | D2U | U2D)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index 1b90830..1e97306 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -447,6 +447,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_u2d:
+  if (lowering(U2D)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_uint_to_fp64", 
generate_ir::uint_to_fp64, true);
+  }
+  break;
+
case ir_binop_add:
   if (lowering(ADD64)) {
  if (ir->type->base_type == GLSL_TYPE_DOUBLE)
-- 
2.9.5


[Mesa-dev] [PATCH 26/50] glsl: Add a lowering pass for 64-bit float mul()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 2 +-
 src/compiler/glsl/lower_64bit.cpp   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 691803e..6506e28 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -67,7 +67,7 @@
 #define ADD64 (1U << 8)
 
 #define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64 | SIGN64 | EQ64 | LT64 | \
-  ADD64)
+  ADD64 | MUL64)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index eed1dba..f3a2633 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -485,6 +485,8 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   if (lowering(MUL64)) {
  if (ir->type->is_integer_64())
 *rvalue = handle_op(ir, "__builtin_umul64", generate_ir::umul64);
+else if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fmul64", generate_ir::fmul64);
   }
   break;
 
-- 
2.9.5


[Mesa-dev] [PATCH 12/50] glsl: Add "built-in" functions to do int_to_fp64(int) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 129 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  23 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 160 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 8ce2baa..b656fad 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5533,3 +5533,132 @@ fp64_to_int(void *mem_ctx, builtin_available_predicate 
avail)
sig->replace_parameters(&sig_parameters);
return sig;
 }
+ir_function_signature *
+int_to_fp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r08C0 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r08C0);
+   ir_variable *const r08C1 = body.make_temp(glsl_type::uvec2_type, 
"return_value");
+   ir_variable *const r08C2 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zSign", ir_var_auto);
+   body.emit(r08C2);
+   ir_variable *const r08C3 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac1", ir_var_auto);
+   body.emit(r08C3);
+   ir_variable *const r08C4 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac0", ir_var_auto);
+   body.emit(r08C4);
+   body.emit(assign(r08C4, body.constant(0u), 0x01));
+
+   body.emit(assign(r08C3, body.constant(0u), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r08C6 = equal(r08C0, body.constant(int(0)));
+   ir_if *f08C5 = new(mem_ctx) ir_if(operand(r08C6).val);
+   exec_list *const f08C5_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f08C5->then_instructions;
+
+  ir_variable *const r08C7 = new(mem_ctx) 
ir_variable(glsl_type::uvec2_type, "z", ir_var_auto);
+  body.emit(r08C7);
+  body.emit(assign(r08C7, body.constant(0u), 0x02));
+
+  body.emit(assign(r08C7, body.constant(0u), 0x01));
+
+  body.emit(assign(r08C1, r08C7, 0x03));
+
+
+  /* ELSE INSTRUCTIONS */
+  body.instructions = &f08C5->else_instructions;
+
+  ir_expression *const r08C8 = less(r08C0, body.constant(int(0)));
+  ir_expression *const r08C9 = expr(ir_unop_b2i, r08C8);
+  body.emit(assign(r08C2, expr(ir_unop_i2u, r08C9), 0x01));
+
+  ir_variable *const r08CA = body.make_temp(glsl_type::uint_type, 
"mix_retval");
+  ir_expression *const r08CB = less(r08C0, body.constant(int(0)));
+  ir_expression *const r08CC = neg(r08C0);
+  ir_expression *const r08CD = expr(ir_unop_i2u, r08CC);
+  ir_expression *const r08CE = expr(ir_unop_i2u, r08C0);
+  body.emit(assign(r08CA, expr(ir_triop_csel, r08CB, r08CD, r08CE), 0x01));
+
+  ir_variable *const r08CF = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+  ir_expression *const r08D0 = equal(r08CA, body.constant(0u));
+  ir_expression *const r08D1 = expr(ir_unop_find_msb, r08CA);
+  ir_expression *const r08D2 = sub(body.constant(int(31)), r08D1);
+  ir_expression *const r08D3 = expr(ir_triop_csel, r08D0, 
body.constant(int(32)), r08D2);
+  body.emit(assign(r08CF, add(r08D3, body.constant(int(-11))), 0x01));
+
+  /* IF CONDITION */
+  ir_expression *const r08D5 = gequal(r08CF, body.constant(int(0)));
+  ir_if *f08D4 = new(mem_ctx) ir_if(operand(r08D5).val);
+  exec_list *const f08D4_parent_instructions = body.instructions;
+
+ /* THEN INSTRUCTIONS */
+ body.instructions = &f08D4->then_instructions;
+
+ body.emit(assign(r08C4, lshift(r08CA, r08CF), 0x01));
+
+ body.emit(assign(r08C3, body.constant(0u), 0x01));
+
+
+ /* ELSE INSTRUCTIONS */
+ body.instructions = &f08D4->else_instructions;
+
+ ir_variable *const r08D6 = body.make_temp(glsl_type::int_type, 
"count");
+ body.emit(assign(r08D6, neg(r08CF), 0x01));
+
+ ir_expression *const r08D7 = equal(r08D6, body.constant(int(0)));
+ ir_expression *const r08D8 = less(r08D6, body.constant(int(32)));
+ ir_expression *const r08D9 = rshift(r08CA, r08D6);
+ ir_expression *const r08DA = expr(ir_triop_csel, r08D8, r08D9, 
body.constant(0u));
+ body.emit(assign(r08C4, expr(ir_triop_csel, r08D7, r08CA, r08DA), 
0x01));
+
+ ir_expression *const r08DB = equal(r08D6, body.constant(int(0)));
+ ir_expression *const r08DC = less(r08D6, body.constant(int(32)));
+ ir_expression *const r08DD = neg(r08D6);
+ ir_expression *const r08DE = bit_and(r08DD, body.constant(int(31)));
+ ir_expression *const r08DF = lshift(r08CA, r08DE);
+

[Mesa-dev] [PATCH 21/50] glsl: Add a lowering pass for 64-bit float neg()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 3 ++-
 src/compiler/glsl/lower_64bit.cpp   | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 370812f..2d9728d 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -61,8 +61,9 @@
 #define DIV64 (1U << 2)
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
+#define NEG64 (1U << 5)
 
-#define LOWER_ALL_DOUBLE_OPS (ABS64)
+#define LOWER_ALL_DOUBLE_OPS (ABS64 | NEG64)
 /**
  * \see class lower_packing_builtins_visitor
  */
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index debedfc..bc9e477 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -424,6 +424,13 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
   }
   break;
 
+   case ir_unop_neg:
+  if (lowering(NEG64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fneg64", generate_ir::fneg64);
+  }
+  break;
+
case ir_unop_sign:
   if (lowering(SIGN64)) {
  if (ir->type->is_integer_64())
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
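
For readers following the lowering above: with NEG64 set, a double-typed ir_unop_neg is replaced by a call to __builtin_fneg64, which works on the two 32-bit halves of the double. Below is a minimal C++ sketch of that idea, not the Mesa implementation. It assumes the low word sits in .x and the high word in .y, as the builtins in this series do, and it assumes NaNs are passed through unchanged (the NaN test at the start of fneg64 in patch 02 suggests this, but the full body is not shown in this archive). Packed64, pack, unpack, is_nan_bits and fneg64_bits are illustrative names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the uvec2 the builtins operate on.
struct Packed64 {
    uint32_t x;  // low 32 bits of the double
    uint32_t y;  // high 32 bits: sign, exponent, top of the fraction
};

static Packed64 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

static double unpack(Packed64 p) {
    const uint64_t bits = (static_cast<uint64_t>(p.y) << 32) | p.x;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Same test as is_nan in patch 02: exponent all ones and a non-zero fraction.
static bool is_nan_bits(Packed64 a) {
    return ((a.y << 1) >= 0xFFE00000u) &&
           (a.x != 0u || (a.y & 0x000FFFFFu) != 0u);
}

// Assumed behaviour: flip the sign bit unless the input is a NaN.
static Packed64 fneg64_bits(Packed64 a) {
    if (!is_nan_bits(a))
        a.y ^= 0x80000000u;
    return a;
}
```

Only the high word is touched, so the operation stays a single integer XOR plus a select on hardware without native fp64.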

[Mesa-dev] [PATCH 16/50] glsl: Add "built-in" functions to do trunc(fp64) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix.

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 62 +
 src/compiler/glsl/builtin_functions.cpp |  4 +++
 src/compiler/glsl/builtin_functions.h   |  3 ++
 src/compiler/glsl/float64.glsl  | 21 +++
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 91 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 6fbe12d..f0222e1 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6635,3 +6635,65 @@ fsqrt64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+ftrunc64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0A28 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0A28);
+   ir_variable *const r0A29 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zHi", ir_var_auto);
+   body.emit(r0A29);
+   ir_variable *const r0A2A = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zLo", ir_var_auto);
+   body.emit(r0A2A);
+   ir_variable *const r0A2B = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+   ir_expression *const r0A2C = rshift(swizzle_y(r0A28), 
body.constant(int(20)));
+   ir_expression *const r0A2D = bit_and(r0A2C, body.constant(2047u));
+   ir_expression *const r0A2E = expr(ir_unop_u2i, r0A2D);
+   body.emit(assign(r0A2B, add(r0A2E, body.constant(int(-1023))), 0x01));
+
+   ir_variable *const r0A2F = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+   body.emit(assign(r0A2F, sub(body.constant(int(52)), r0A2B), 0x01));
+
+   ir_expression *const r0A30 = gequal(r0A2F, body.constant(int(32)));
+   ir_expression *const r0A31 = lshift(body.constant(4294967295u), r0A2F);
+   ir_expression *const r0A32 = expr(ir_triop_csel, r0A30, body.constant(0u), 
r0A31);
+   body.emit(assign(r0A2A, bit_and(r0A32, swizzle_x(r0A28)), 0x01));
+
+   ir_expression *const r0A33 = less(r0A2F, body.constant(int(33)));
+   ir_expression *const r0A34 = add(r0A2F, body.constant(int(-32)));
+   ir_expression *const r0A35 = lshift(body.constant(4294967295u), r0A34);
+   ir_expression *const r0A36 = expr(ir_triop_csel, r0A33, 
body.constant(4294967295u), r0A35);
+   body.emit(assign(r0A29, bit_and(r0A36, swizzle_y(r0A28)), 0x01));
+
+   ir_variable *const r0A37 = body.make_temp(glsl_type::uint_type, 
"mix_retval");
+   ir_expression *const r0A38 = less(body.constant(int(52)), r0A2B);
+   ir_expression *const r0A39 = less(r0A2B, body.constant(int(0)));
+   ir_expression *const r0A3A = expr(ir_triop_csel, r0A39, body.constant(0u), 
r0A2A);
+   body.emit(assign(r0A37, expr(ir_triop_csel, r0A38, swizzle_x(r0A28), 
r0A3A), 0x01));
+
+   body.emit(assign(r0A2A, r0A37, 0x01));
+
+   ir_variable *const r0A3B = body.make_temp(glsl_type::uint_type, 
"mix_retval");
+   ir_expression *const r0A3C = less(body.constant(int(52)), r0A2B);
+   ir_expression *const r0A3D = less(r0A2B, body.constant(int(0)));
+   ir_expression *const r0A3E = expr(ir_triop_csel, r0A3D, body.constant(0u), 
r0A29);
+   body.emit(assign(r0A3B, expr(ir_triop_csel, r0A3C, swizzle_y(r0A28), 
r0A3E), 0x01));
+
+   body.emit(assign(r0A29, r0A3B, 0x01));
+
+   ir_variable *const r0A3F = body.make_temp(glsl_type::uvec2_type, 
"vec_ctor");
+   body.emit(assign(r0A3F, r0A37, 0x01));
+
+   body.emit(assign(r0A3F, r0A3B, 0x02));
+
+   body.emit(ret(r0A3F));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index d919873..02618e0 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3398,6 +3398,10 @@ builtin_builder::create_builtins()
 generate_ir::fsqrt64(mem_ctx, integer_functions_supported),
 NULL);
 
+   add_function("__builtin_ftrunc64",
+generate_ir::ftrunc64(mem_ctx, integer_functions_supported),
+NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h 
b/src/compiler/glsl/builtin_functions.h
index 2f72f51..4a6b922 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -109,6 +109,9 @@ fp32_to_fp64(void *mem_ctx, builtin_available_predicate 
avail);
 ir_function_signature *
 fsqrt64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+ftrunc64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl
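
The generated IR for ftrunc64 above is easier to follow as ordinary integer code. The following is a hedged C++ sketch of the same masking scheme, written from the IR in this patch rather than from float64.glsl (which is cut off in this archive). Packed64, pack, unpack and ftrunc64_bits are illustrative names. The two early returns stand in for the final csel selects, which also keeps the C++ shifts inside the range 0..31.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the uvec2 argument: x = low word, y = high word.
struct Packed64 {
    uint32_t x;
    uint32_t y;
};

static Packed64 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

static double unpack(Packed64 p) {
    const uint64_t bits = (static_cast<uint64_t>(p.y) << 32) | p.x;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

static Packed64 ftrunc64_bits(Packed64 a) {
    // (a.y >> 20) & 2047 is the biased exponent; 1023 is the bias.
    const int unbiased_exp = static_cast<int>((a.y >> 20) & 0x7FFu) - 1023;

    if (unbiased_exp > 52)
        return a;              // no fractional bits left (also Inf and NaN)
    if (unbiased_exp < 0)
        return { 0u, 0u };     // |a| < 1: the IR selects 0 for both words

    // Number of fraction bits that lie below the binary point, in [0, 52].
    const int frac_bits = 52 - unbiased_exp;

    const uint32_t lo_mask =
        frac_bits >= 32 ? 0u : (0xFFFFFFFFu << frac_bits);
    const uint32_t hi_mask =
        frac_bits < 33 ? 0xFFFFFFFFu : (0xFFFFFFFFu << (frac_bits - 32));

    return { a.x & lo_mask, a.y & hi_mask };
}
```

Note that, like the IR it mirrors, this sketch returns an all-zero word pair when the unbiased exponent is negative, so the sign of inputs such as -0.5 is not preserved.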

[Mesa-dev] [PATCH 17/50] glsl: Add "built-in" functions to do round(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 225 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  41 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 274 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index f0222e1..3cba289 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6697,3 +6697,228 @@ ftrunc64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fround64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0F1C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0F1C);
+   ir_variable *const r0F1D = body.make_temp(glsl_type::bool_type, 
"execute_flag");
+   body.emit(assign(r0F1D, body.constant(true), 0x01));
+
+   ir_variable *const r0F1E = body.make_temp(glsl_type::uvec2_type, 
"return_value");
+   ir_variable *const r0F1F = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aLo", ir_var_auto);
+   body.emit(r0F1F);
+   ir_variable *const r0F20 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aHi", ir_var_auto);
+   body.emit(r0F20);
+   ir_variable *const r0F21 = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+   ir_expression *const r0F22 = rshift(swizzle_y(r0F1C), 
body.constant(int(20)));
+   ir_expression *const r0F23 = bit_and(r0F22, body.constant(2047u));
+   ir_expression *const r0F24 = expr(ir_unop_u2i, r0F23);
+   body.emit(assign(r0F21, add(r0F24, body.constant(int(-1023))), 0x01));
+
+   body.emit(assign(r0F20, swizzle_y(r0F1C), 0x01));
+
+   body.emit(assign(r0F1F, swizzle_x(r0F1C), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r0F26 = less(r0F21, body.constant(int(20)));
+   ir_if *f0F25 = new(mem_ctx) ir_if(operand(r0F26).val);
+   exec_list *const f0F25_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f0F25->then_instructions;
+
+  /* IF CONDITION */
+  ir_expression *const r0F28 = less(r0F21, body.constant(int(0)));
+  ir_if *f0F27 = new(mem_ctx) ir_if(operand(r0F28).val);
+  exec_list *const f0F27_parent_instructions = body.instructions;
+
+ /* THEN INSTRUCTIONS */
+ body.instructions = &f0F27->then_instructions;
+
+ body.emit(assign(r0F20, bit_and(swizzle_y(r0F1C), 
body.constant(2147483648u)), 0x01));
+
+ /* IF CONDITION */
+ ir_expression *const r0F2A = equal(r0F21, body.constant(int(-1)));
+ ir_expression *const r0F2B = nequal(swizzle_x(r0F1C), 
body.constant(0u));
+ ir_expression *const r0F2C = logic_and(r0F2A, r0F2B);
+ ir_if *f0F29 = new(mem_ctx) ir_if(operand(r0F2C).val);
+ exec_list *const f0F29_parent_instructions = body.instructions;
+
+/* THEN INSTRUCTIONS */
+body.instructions = &f0F29->then_instructions;
+
+body.emit(assign(r0F20, bit_or(r0F20, body.constant(1072693248u)), 
0x01));
+
+
+ body.instructions = f0F29_parent_instructions;
+ body.emit(f0F29);
+
+ /* END IF */
+
+ body.emit(assign(r0F1F, body.constant(0u), 0x01));
+
+
+ /* ELSE INSTRUCTIONS */
+ body.instructions = &f0F27->else_instructions;
+
+ ir_variable *const r0F2D = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+ body.emit(assign(r0F2D, rshift(body.constant(1048575u), r0F21), 
0x01));
+
+ /* IF CONDITION */
+ ir_expression *const r0F2F = bit_and(r0F20, r0F2D);
+ ir_expression *const r0F30 = equal(r0F2F, body.constant(0u));
+ ir_expression *const r0F31 = equal(r0F1F, body.constant(0u));
+ ir_expression *const r0F32 = logic_and(r0F30, r0F31);
+ ir_if *f0F2E = new(mem_ctx) ir_if(operand(r0F32).val);
+ exec_list *const f0F2E_parent_instructions = body.instructions;
+
+/* THEN INSTRUCTIONS */
+body.instructions = &f0F2E->then_instructions;
+
+body.emit(assign(r0F1E, r0F1C, 0x03));
+
+body.emit(assign(r0F1D, body.constant(false), 0x01));
+
+
+/* ELSE INSTRUCTIONS */
+body.instructions = &f0F2E->else_instructions;
+
+ir_expression *const r0F33 = rshift(body.constant(524288u), r0F21);
+body.emit(assign(r0F20, add(r0F20, r0F33), 0x01));
+
+ir_expression *const r0F34 = expr(ir_unop_bit_not, r0F2D);
+body.emit(assign(r0F20, bit_and(r0F20, r0F34), 0x01));
+
+

[Mesa-dev] [PATCH 15/50] glsl: Add "built-in" functions to do sqrt(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

This currently uses fp64->fp32, sqrt(fp32), fp32->fp64.

[airlied: The code is included from softfloat for doing a proper sqrt64,
but it still needs to be decided whether to pursue this and
how to optimise it better.]

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 393 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  | 275 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 676 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 034d2d0..6fbe12d 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6242,3 +6242,396 @@ fp32_to_fp64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fsqrt64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r09A9 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r09A9);
+   ir_variable *const r09AA = body.make_temp(glsl_type::uvec2_type, "a");
+   body.emit(assign(r09AA, r09A9, 0x03));
+
+   ir_variable *const r09AB = body.make_temp(glsl_type::float_type, 
"return_value");
+   ir_variable *const r09AC = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracHi_retval");
+   body.emit(assign(r09AC, bit_and(swizzle_y(r09A9), body.constant(1048575u)), 
0x01));
+
+   ir_variable *const r09AD = body.make_temp(glsl_type::int_type, 
"extractFloat64Exp_retval");
+   ir_expression *const r09AE = rshift(swizzle_y(r09A9), 
body.constant(int(20)));
+   ir_expression *const r09AF = bit_and(r09AE, body.constant(2047u));
+   body.emit(assign(r09AD, expr(ir_unop_u2i, r09AF), 0x01));
+
+   ir_variable *const r09B0 = body.make_temp(glsl_type::uint_type, 
"extractFloat64Sign_retval");
+   body.emit(assign(r09B0, rshift(swizzle_y(r09A9), body.constant(int(31))), 
0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r09B2 = equal(r09AD, body.constant(int(2047)));
+   ir_if *f09B1 = new(mem_ctx) ir_if(operand(r09B2).val);
+   exec_list *const f09B1_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f09B1->then_instructions;
+
+  ir_variable *const r09B3 = new(mem_ctx) 
ir_variable(glsl_type::float_type, "rval", ir_var_auto);
+  body.emit(r09B3);
+  ir_expression *const r09B4 = lshift(swizzle_y(r09A9), 
body.constant(int(12)));
+  ir_expression *const r09B5 = rshift(swizzle_x(r09A9), 
body.constant(int(20)));
+  body.emit(assign(r09AA, bit_or(r09B4, r09B5), 0x02));
+
+  body.emit(assign(r09AA, lshift(swizzle_x(r09A9), 
body.constant(int(12))), 0x01));
+
+  ir_expression *const r09B6 = lshift(r09B0, body.constant(int(31)));
+  ir_expression *const r09B7 = bit_or(r09B6, body.constant(2143289344u));
+  ir_expression *const r09B8 = rshift(swizzle_y(r09AA), 
body.constant(int(9)));
+  ir_expression *const r09B9 = bit_or(r09B7, r09B8);
+  body.emit(assign(r09B3, expr(ir_unop_bitcast_u2f, r09B9), 0x01));
+
+  ir_variable *const r09BA = body.make_temp(glsl_type::float_type, 
"mix_retval");
+  ir_expression *const r09BB = bit_or(r09AC, swizzle_x(r09A9));
+  ir_expression *const r09BC = nequal(r09BB, body.constant(0u));
+  ir_expression *const r09BD = lshift(r09B0, body.constant(int(31)));
+  ir_expression *const r09BE = add(r09BD, body.constant(2139095040u));
+  ir_expression *const r09BF = expr(ir_unop_bitcast_u2f, r09BE);
+  body.emit(assign(r09BA, expr(ir_triop_csel, r09BC, r09B3, r09BF), 0x01));
+
+  body.emit(assign(r09B3, r09BA, 0x01));
+
+  body.emit(assign(r09AB, r09BA, 0x01));
+
+
+  /* ELSE INSTRUCTIONS */
+  body.instructions = &f09B1->else_instructions;
+
+  ir_variable *const r09C0 = body.make_temp(glsl_type::uint_type, 
"mix_retval");
+  ir_expression *const r09C1 = lshift(r09AC, body.constant(int(10)));
+  ir_expression *const r09C2 = rshift(swizzle_x(r09A9), 
body.constant(int(22)));
+  ir_expression *const r09C3 = bit_or(r09C1, r09C2);
+  ir_expression *const r09C4 = lshift(swizzle_x(r09A9), 
body.constant(int(10)));
+  ir_expression *const r09C5 = nequal(r09C4, body.constant(0u));
+  ir_expression *const r09C6 = expr(ir_unop_b2i, r09C5);
+  ir_expression *const r09C7 = expr(ir_unop_i2u, r09C6);
+  body.emit(assign(r09C0, bit_or(r09C3, r09C7), 0x01));
+
+  ir_variable *const r09C8 = body.make_temp(glsl_type::uint_type, 
"mix_retval");
+  ir_expression *const r09C9 = nequal(r09AD,
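
The commit message says this version currently goes fp64->fp32, sqrt(fp32), fp32->fp64. As a rough illustration of what that round trip costs in precision, here is a small C++ sketch using native types in place of the soft-float conversions. sqrt64_via_fp32 is an illustrative name, not a function from the patch, and the sketch only makes sense for inputs within the range of a 32-bit float.

```cpp
#include <cassert>
#include <cmath>

// fp64 -> fp32, sqrt(fp32), fp32 -> fp64, as the commit message describes.
// Assumes the input fits in the range of a 32-bit float.
static double sqrt64_via_fp32(double a) {
    const float narrowed = static_cast<float>(a);  // fp64 -> fp32
    const float root = std::sqrt(narrowed);        // single-precision sqrt
    return static_cast<double>(root);              // fp32 -> fp64
}
```

For exactly representable cases such as 4.0 the result matches a full double-precision sqrt, but in general only about 24 bits of the 53-bit significand are meaningful, which is presumably why the note above leaves a proper sqrt64 open.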

[Mesa-dev] [PATCH 18/50] glsl: Add "built-in" functions to do rcp(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 1829 +++
 src/compiler/glsl/builtin_functions.cpp |4 +
 src/compiler/glsl/builtin_functions.h   |3 +
 src/compiler/glsl/float64.glsl  |   10 +
 src/compiler/glsl/glcpp/glcpp-parse.y   |1 +
 5 files changed, 1847 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 3cba289..e8ef0b0 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6922,3 +6922,1832 @@ fround64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+frcp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0F45 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0F45);
+   ir_variable *const r0F46 = body.make_temp(glsl_type::uint_type, "z1Ptr");
+   body.emit(assign(r0F46, sub(body.constant(2406117202u), swizzle_x(r0F45)), 
0x01));
+
+   ir_expression *const r0F47 = sub(body.constant(3217938081u), 
swizzle_y(r0F45));
+   ir_expression *const r0F48 = less(body.constant(2406117202u), 
swizzle_x(r0F45));
+   ir_expression *const r0F49 = expr(ir_unop_b2i, r0F48);
+   ir_expression *const r0F4A = expr(ir_unop_i2u, r0F49);
+   body.emit(assign(r0F45, sub(r0F47, r0F4A), 0x02));
+
+   body.emit(assign(r0F45, r0F46, 0x01));
+
+   ir_variable *const r0F4B = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z1", ir_var_auto);
+   body.emit(r0F4B);
+   ir_variable *const r0F4C = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z0", ir_var_auto);
+   body.emit(r0F4C);
+   ir_expression *const r0F4D = lshift(swizzle_y(r0F45), 
body.constant(int(31)));
+   ir_expression *const r0F4E = rshift(r0F46, body.constant(int(1)));
+   body.emit(assign(r0F4B, bit_or(r0F4D, r0F4E), 0x01));
+
+   body.emit(assign(r0F4C, rshift(swizzle_y(r0F45), body.constant(int(1))), 
0x01));
+
+   body.emit(assign(r0F45, r0F4C, 0x02));
+
+   body.emit(assign(r0F45, r0F4B, 0x01));
+
+   ir_variable *const r0F4F = body.make_temp(glsl_type::bool_type, 
"execute_flag");
+   body.emit(assign(r0F4F, body.constant(true), 0x01));
+
+   ir_variable *const r0F50 = body.make_temp(glsl_type::uvec2_type, 
"return_value");
+   ir_variable *const r0F51 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zSign", ir_var_auto);
+   body.emit(r0F51);
+   ir_variable *const r0F52 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"bExp", ir_var_auto);
+   body.emit(r0F52);
+   ir_variable *const r0F53 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"aExp", ir_var_auto);
+   body.emit(r0F53);
+   ir_variable *const r0F54 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"bFracHi", ir_var_auto);
+   body.emit(r0F54);
+   ir_variable *const r0F55 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"bFracLo", ir_var_auto);
+   body.emit(r0F55);
+   ir_variable *const r0F56 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFracHi", ir_var_auto);
+   body.emit(r0F56);
+   ir_variable *const r0F57 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFracLo", ir_var_auto);
+   body.emit(r0F57);
+   ir_variable *const r0F58 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"zExp", ir_var_auto);
+   body.emit(r0F58);
+   ir_variable *const r0F59 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac2", ir_var_auto);
+   body.emit(r0F59);
+   ir_variable *const r0F5A = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac1", ir_var_auto);
+   body.emit(r0F5A);
+   ir_variable *const r0F5B = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac0", ir_var_auto);
+   body.emit(r0F5B);
+   body.emit(assign(r0F5B, body.constant(0u), 0x01));
+
+   body.emit(assign(r0F5A, body.constant(0u), 0x01));
+
+   body.emit(assign(r0F59, body.constant(0u), 0x01));
+
+   ir_variable *const r0F5C = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracLo_retval");
+   body.emit(assign(r0F5C, swizzle_x(r0F45), 0x01));
+
+   body.emit(assign(r0F57, r0F5C, 0x01));
+
+   ir_variable *const r0F5D = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracHi_retval");
+   body.emit(assign(r0F5D, bit_and(r0F4C, body.constant(1048575u)), 0x01));
+
+   body.emit(assign(r0F56, r0F5D, 0x01));
+
+   ir_variable *const r0F5E = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracLo_retval");
+   body.emit(assign(r0F5E, swizzle_x(r0F45), 0x01));
+
+   body.emit(assign(r0F55, r0F5E, 0x01));
+
+   ir_variable *const r0F5F = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracHi_retval");
+   body.emit(assign(r0F5F, bit_and(r0F4C, body.constant(1048575u)), 0x01));
+
+

[Mesa-dev] [PATCH 20/50] glsl: add define to lower all double operations

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

We will add all fp64 ops to this for now; later, drivers
may want to lower only some of them.

Signed-off-by: Dave Airlie 
---
 src/compiler/glsl/ir_optimization.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index c48d0a9..370812f 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -62,6 +62,7 @@
 #define MOD64 (1U << 3)
 #define ABS64 (1U << 4)
 
+#define LOWER_ALL_DOUBLE_OPS (ABS64)
 /**
  * \see class lower_packing_builtins_visitor
  */
-- 
2.9.5
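
A short C++ sketch of how such a mask is meant to be consumed may help here. The bit values are copied from the ir_optimization.h context in this patch; wants_lowering is a hypothetical helper written for this illustration, not the macro used in lower_64bit.cpp.

```cpp
#include <cassert>

// Bit values copied from the ir_optimization.h context above.
constexpr unsigned MOD64 = 1U << 3;
constexpr unsigned ABS64 = 1U << 4;

// As of this patch the "all double ops" mask contains only ABS64.
constexpr unsigned LOWER_ALL_DOUBLE_OPS = ABS64;

// Hypothetical helper: does the requested mask ask for this operation?
constexpr bool wants_lowering(unsigned mask, unsigned op) {
    return (mask & op) != 0;
}
```

A driver that wants everything passes LOWER_ALL_DOUBLE_OPS; a driver that wants only some operations can OR together the individual bits instead.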

[Mesa-dev] [PATCH 19/50] glsl: Add a lowering pass for 64-bit float abs()

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Squashed with:
glsl/lower_64bit: fix return type conversion (airlied)

Only do conversion for the 64-bit types, add a path
to do result merging without conversion.

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/ir_optimization.h | 1 +
 src/compiler/glsl/lower_64bit.cpp   | 8 
 2 files changed, 9 insertions(+)

diff --git a/src/compiler/glsl/ir_optimization.h 
b/src/compiler/glsl/ir_optimization.h
index 931bffb..c48d0a9 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -60,6 +60,7 @@
 #define SIGN64 (1U << 1)
 #define DIV64 (1U << 2)
 #define MOD64 (1U << 3)
+#define ABS64 (1U << 4)
 
 /**
  * \see class lower_packing_builtins_visitor
diff --git a/src/compiler/glsl/lower_64bit.cpp 
b/src/compiler/glsl/lower_64bit.cpp
index d181f63..debedfc 100644
--- a/src/compiler/glsl/lower_64bit.cpp
+++ b/src/compiler/glsl/lower_64bit.cpp
@@ -416,6 +416,14 @@ lower_64bit_visitor::handle_rvalue(ir_rvalue **rvalue)
assert(ir != NULL);
 
switch (ir->operation) {
+
+   case ir_unop_abs:
+  if (lowering(ABS64)) {
+ if (ir->type->base_type == GLSL_TYPE_DOUBLE)
+*rvalue = handle_op(ir, "__builtin_fabs64", generate_ir::fabs64);
+  }
+  break;
+
case ir_unop_sign:
   if (lowering(SIGN64)) {
  if (ir->type->is_integer_64())
-- 
2.9.5
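
For context on what the call to __builtin_fabs64 stands for: the body of fabs64 is not part of this excerpt, so the following C++ sketch only shows the conventional bit-level approach, which is an assumption here. It clears the sign bit in the high word of the double and leaves everything else alone. Packed64, pack, unpack and fabs64_bits are illustrative names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the uvec2 argument: x = low word, y = high word.
struct Packed64 {
    uint32_t x;
    uint32_t y;
};

static Packed64 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

static double unpack(Packed64 p) {
    const uint64_t bits = (static_cast<uint64_t>(p.y) << 32) | p.x;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Assumed behaviour: abs() only needs to clear bit 31 of the high word.
static Packed64 fabs64_bits(Packed64 a) {
    a.y &= 0x7FFFFFFFu;
    return a;
}
```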

[Mesa-dev] [PATCH 11/50] glsl: Add "built-in" functions to do fp64_to_int(fp64) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 179 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  41 
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 228 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index c200447..8ce2baa 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5354,3 +5354,182 @@ uint_to_fp64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fp64_to_int(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::int_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r088E = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r088E);
+   ir_variable *const r088F = body.make_temp(glsl_type::bool_type, 
"execute_flag");
+   body.emit(assign(r088F, body.constant(true), 0x01));
+
+   ir_variable *const r0890 = body.make_temp(glsl_type::int_type, 
"return_value");
+   ir_variable *const r0891 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"absZ", ir_var_auto);
+   body.emit(r0891);
+   ir_variable *const r0892 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aSign", ir_var_auto);
+   body.emit(r0892);
+   ir_variable *const r0893 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFracHi", ir_var_auto);
+   body.emit(r0893);
+   ir_variable *const r0894 = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracHi_retval");
+   body.emit(assign(r0894, bit_and(swizzle_y(r088E), body.constant(1048575u)), 
0x01));
+
+   body.emit(assign(r0893, r0894, 0x01));
+
+   ir_variable *const r0895 = body.make_temp(glsl_type::int_type, 
"extractFloat64Exp_retval");
+   ir_expression *const r0896 = rshift(swizzle_y(r088E), 
body.constant(int(20)));
+   ir_expression *const r0897 = bit_and(r0896, body.constant(2047u));
+   body.emit(assign(r0895, expr(ir_unop_u2i, r0897), 0x01));
+
+   body.emit(assign(r0892, rshift(swizzle_y(r088E), body.constant(int(31))), 
0x01));
+
+   body.emit(assign(r0891, body.constant(0u), 0x01));
+
+   ir_variable *const r0898 = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+   body.emit(assign(r0898, add(r0895, body.constant(int(-1043))), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r089A = gequal(r0898, body.constant(int(0)));
+   ir_if *f0899 = new(mem_ctx) ir_if(operand(r089A).val);
+   exec_list *const f0899_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f0899->then_instructions;
+
+  /* IF CONDITION */
+  ir_expression *const r089C = less(body.constant(int(1054)), r0895);
+  ir_if *f089B = new(mem_ctx) ir_if(operand(r089C).val);
+  exec_list *const f089B_parent_instructions = body.instructions;
+
+ /* THEN INSTRUCTIONS */
+ body.instructions = &f089B->then_instructions;
+
+ /* IF CONDITION */
+ ir_expression *const r089E = equal(r0895, body.constant(int(2047)));
+ ir_expression *const r089F = bit_or(r0894, swizzle_x(r088E));
+ ir_expression *const r08A0 = expr(ir_unop_u2i, r089F);
+ ir_expression *const r08A1 = expr(ir_unop_i2b, r08A0);
+ ir_expression *const r08A2 = logic_and(r089E, r08A1);
+ ir_if *f089D = new(mem_ctx) ir_if(operand(r08A2).val);
+ exec_list *const f089D_parent_instructions = body.instructions;
+
+/* THEN INSTRUCTIONS */
+body.instructions = &f089D->then_instructions;
+
+body.emit(assign(r0892, body.constant(0u), 0x01));
+
+
+ body.instructions = f089D_parent_instructions;
+ body.emit(f089D);
+
+ /* END IF */
+
+ ir_expression *const r08A3 = expr(ir_unop_u2i, r0892);
+ ir_expression *const r08A4 = expr(ir_unop_i2b, r08A3);
+ body.emit(assign(r0890, expr(ir_triop_csel, r08A4, 
body.constant(int(-2147483648)), body.constant(int(2147483647))), 0x01));
+
+ body.emit(assign(r088F, body.constant(false), 0x01));
+
+
+ /* ELSE INSTRUCTIONS */
+ body.instructions = &f089B->else_instructions;
+
+ ir_variable *const r08A5 = body.make_temp(glsl_type::uint_type, "a0");
+ body.emit(assign(r08A5, bit_or(r0894, body.constant(1048576u)), 
0x01));
+
+ ir_expression *const r08A6 = equal(r0898, body.constant(int(0)));
+ ir_expression *const r08A7 = lshift(r08A5, r0898);
+ ir_expression *const r08A8 = neg(r0898);
+ ir_expression *const r08A9 = bit_and(r08A8, body.constant(int(31)));
+ ir_expression

[Mesa-dev] [PATCH 04/50] glsl: Add "built-in" functions to do eq(fp64, fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 104 
 src/compiler/glsl/builtin_functions.cpp |   4 ++
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  44 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 156 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 8546048..2340c48 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -96,3 +96,107 @@ fsign64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+extractFloat64FracLo(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0024 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0024);
+   ir_swizzle *const r0025 = swizzle_x(r0024);
+   body.emit(ret(r0025));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+extractFloat64FracHi(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0026 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0026);
+   ir_expression *const r0027 = bit_and(swizzle_y(r0026), 
body.constant(1048575u));
+   body.emit(ret(r0027));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+extractFloat64Exp(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::int_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0028 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0028);
+   ir_expression *const r0029 = rshift(swizzle_y(r0028), 
body.constant(int(20)));
+   ir_expression *const r002A = bit_and(r0029, body.constant(2047u));
+   ir_expression *const r002B = expr(ir_unop_u2i, r002A);
+   body.emit(ret(r002B));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+feq64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r002C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r002C);
+   ir_variable *const r002D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"b", ir_var_function_in);
+   sig_parameters.push_tail(r002D);
+   ir_variable *const r002E = body.make_temp(glsl_type::bool_type, 
"mix_retval");
+   ir_expression *const r002F = rshift(swizzle_y(r002C), 
body.constant(int(20)));
+   ir_expression *const r0030 = bit_and(r002F, body.constant(2047u));
+   ir_expression *const r0031 = expr(ir_unop_u2i, r0030);
+   ir_expression *const r0032 = equal(r0031, body.constant(int(2047)));
+   ir_expression *const r0033 = bit_and(swizzle_y(r002C), 
body.constant(1048575u));
+   ir_expression *const r0034 = bit_or(r0033, swizzle_x(r002C));
+   ir_expression *const r0035 = nequal(r0034, body.constant(0u));
+   ir_expression *const r0036 = logic_and(r0032, r0035);
+   ir_expression *const r0037 = rshift(swizzle_y(r002D), 
body.constant(int(20)));
+   ir_expression *const r0038 = bit_and(r0037, body.constant(2047u));
+   ir_expression *const r0039 = expr(ir_unop_u2i, r0038);
+   ir_expression *const r003A = equal(r0039, body.constant(int(2047)));
+   ir_expression *const r003B = bit_and(swizzle_y(r002D), 
body.constant(1048575u));
+   ir_expression *const r003C = bit_or(r003B, swizzle_x(r002D));
+   ir_expression *const r003D = nequal(r003C, body.constant(0u));
+   ir_expression *const r003E = logic_and(r003A, r003D);
+   ir_expression *const r003F = logic_or(r0036, r003E);
+   ir_expression *const r0040 = equal(swizzle_x(r002C), swizzle_x(r002D));
+   ir_expression *const r0041 = equal(swizzle_y(r002C), swizzle_y(r002D));
+   ir_expression *const r0042 = equal(swizzle_x(r002C), body.constant(0u));
+   ir_expression *const r0043 = bit_or(swizzle_y(r002C), swizzle_y(r002D));
+   ir_expression *const r0044 = lshift(r0043, body.constant(int(1)));
+   ir_expression
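
The three extract helpers and the start of feq64 above translate directly into integer code. Here is a hedged C++ sketch: the helpers follow the IR in this patch exactly, while the tail of the comparison (everything after the shift of a.y | b.y) is cut off in this archive and is filled in on the assumption that it follows the usual softfloat equality rule, where +0 and -0 compare equal and any NaN compares unequal. All names are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the uvec2 arguments: x = low word, y = high word.
struct Packed64 {
    uint32_t x;
    uint32_t y;
};

static Packed64 pack(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

// Direct counterparts of extractFloat64FracLo/FracHi/Exp in the patch.
static uint32_t extract_frac_lo(Packed64 a) { return a.x; }
static uint32_t extract_frac_hi(Packed64 a) { return a.y & 0x000FFFFFu; }
static int extract_exp(Packed64 a) {
    return static_cast<int>((a.y >> 20) & 0x7FFu);
}

// NaN: exponent all ones and a non-zero fraction.
static bool is_nan_bits(Packed64 a) {
    return extract_exp(a) == 0x7FF &&
           (extract_frac_hi(a) | extract_frac_lo(a)) != 0u;
}

static bool feq64_bits(Packed64 a, Packed64 b) {
    if (is_nan_bits(a) || is_nan_bits(b))
        return false;
    // Assumed tail: shifting out the sign bits makes +0 and -0 compare equal.
    const bool both_zero = a.x == 0u && ((a.y | b.y) << 1) == 0u;
    return a.x == b.x && (a.y == b.y || both_zero);
}
```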

[Mesa-dev] [PATCH 02/50] glsl: Add "built-in" functions to do neg(fp64) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix.

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 51 +
 src/compiler/glsl/builtin_functions.cpp |  4 +++
 src/compiler/glsl/builtin_functions.h   |  3 ++
 src/compiler/glsl/float64.glsl  | 24 
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 83 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 7b57231..2898fc9 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -17,3 +17,54 @@ fabs64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+is_nan(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r000C = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r000C);
+   ir_expression *const r000D = lshift(swizzle_y(r000C), 
body.constant(int(1)));
+   ir_expression *const r000E = gequal(r000D, body.constant(4292870144u));
+   ir_expression *const r000F = nequal(swizzle_x(r000C), body.constant(0u));
+   ir_expression *const r0010 = bit_and(swizzle_y(r000C), 
body.constant(1048575u));
+   ir_expression *const r0011 = nequal(r0010, body.constant(0u));
+   ir_expression *const r0012 = logic_or(r000F, r0011);
+   ir_expression *const r0013 = logic_and(r000E, r0012);
+   body.emit(ret(r0013));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+fneg64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0014 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0014);
+   ir_expression *const r0015 = lshift(swizzle_y(r0014), 
body.constant(int(1)));
+   ir_expression *const r0016 = gequal(r0015, body.constant(4292870144u));
+   ir_expression *const r0017 = nequal(swizzle_x(r0014), body.constant(0u));
+   ir_expression *const r0018 = bit_and(swizzle_y(r0014), 
body.constant(1048575u));
+   ir_expression *const r0019 = nequal(r0018, body.constant(0u));
+   ir_expression *const r001A = logic_or(r0017, r0019);
+   ir_expression *const r001B = logic_and(r0016, r001A);
+   ir_expression *const r001C = bit_xor(swizzle_y(r0014), 
body.constant(2147483648u));
+   body.emit(assign(r0014, expr(ir_triop_csel, r001B, swizzle_y(r0014), 
r001C), 0x02));
+
+   body.emit(ret(r0014));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 133a896..9d88a31 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3346,6 +3346,10 @@ builtin_builder::create_builtins()
 generate_ir::fabs64(mem_ctx, integer_functions_supported),
 NULL);
 
+   add_function("__builtin_fneg64",
+generate_ir::fneg64(mem_ctx, integer_functions_supported),
+NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h 
b/src/compiler/glsl/builtin_functions.h
index deaf640..adec424 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -70,6 +70,9 @@ udivmod64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fabs64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+fneg64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index d798d7e..fedf8b7 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -6,6 +6,7 @@
 
 #version 130
 #extension GL_ARB_shader_bit_encoding : enable
+#extension GL_EXT_shader_integer_mix : enable
 
 /* Software IEEE floating-point rounding mode.
  * GLSL spec section "4.7.1 Range and Precision":
@@ -27,3 +28,26 @@ fabs64(uvec2 a)
    a.y &= 0x7FFFFFFFu;
    return a;
 }
+
+/* Returns 1 if the double-precision floating-point value `a' is a NaN;
+ * otherwise returns 0.
+ */
+bool
+is_nan(uvec2 a)
+{
+   return (0xFFE00000u <= (a.y<<1)) &&
+  ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
+}
+
+/* Negate value of a Float64 :
+ * Toggle the sign bit
+ */
+uvec2
+fneg64(uvec2 a)
+{
+   uint t = a.y;
+
+   t ^= (1u << 31);
+   a.y = mix(t, a.y, is_nan(a));
+   return a;
+}
diff --git
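
For readers following the bit manipulation in this patch, here is a minimal C++ sketch of the same is_nan/fneg64 logic. It assumes the uvec2 layout used by float64.glsl (x = low word, y = high word); the U64Pair struct and the use of a ternary in place of GLSL's mix() are illustrative assumptions, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the GLSL uvec2 that carries one double:
// x holds the low 32 bits, y holds sign, exponent and the top 20 mantissa bits.
struct U64Pair {
    uint32_t x;
    uint32_t y;
};

// NaN test: exponent field all ones and a non-zero mantissa.
inline bool is_nan(U64Pair a) {
    // Shifting y left by one discards the sign bit, so 0xFFE00000 is the
    // all-ones exponent in its shifted position (4292870144u in the IR).
    return (static_cast<uint32_t>(a.y << 1) >= 0xFFE00000u) &&
           ((a.x != 0u) || ((a.y & 0x000FFFFFu) != 0u));
}

// Negation: toggle the sign bit, but return NaN inputs unchanged.
// The ternary plays the role of mix(t, a.y, is_nan(a)) in the GLSL.
inline U64Pair fneg64(U64Pair a) {
    const uint32_t flipped = a.y ^ (1u << 31);
    a.y = is_nan(a) ? a.y : flipped;
    return a;
}
```

With this sketch, negating 1.0 (high word 0x3FF00000) yields high word 0xBFF00000, while a quiet NaN (high word 0x7FF80000) comes back untouched.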

[Mesa-dev] soft fp64 support - main body (glsl/gallium)

2018-03-12 Thread Dave Airlie

This is the main code for the soft fp64 work. It's mostly Elie's
code with a bunch of changes by me.

This patchset has all the GLSL lowering code. (It uses float64.glsl;
yes, I know checked-in files are bad, but not bad enough for anyone
to have solved int64.glsl yet, so we have a precedent.)

It introduces the builtin code for all the functions first.
This code has seen some optimisation using findMSB and mix opcodes
to remove if branches; I'm sure it could see a lot more. If statements
are the enemy, especially when you hit GLSL copy propagation and the
r600/sb backend.

The second part is just the lowering hooks to use the builtins,
but also to do a bunch of non-builtin lowering.

Finally, the gallium patches add a new interpretation of PIPE_CAP_DOUBLES,
allowing drivers to choose whether they want no fp64, hw fp64, or emulated fp64.
I don't think we should be enabling this for everyone, just for drivers that ask.

There is no r600 patch in this series; it's a one-liner. The code does
cause long compile times in both the GLSL compiler and the r600
backend, but I'd really like to get this stuff checked in so we have
a known-good stable base (it passes
[1375/1375] skip: 5, pass: 1368, fail: 2
on r600 nosb at the moment).

I think most of the remaining issues are not to be found in this code,
but fixes for the other parts of the stack.

Also, I'm not really interested in bikeshedding the nitty-gritty details
of the fp64 emulation. The main goal for this code is to provide the
fp64 bit so we can enable GL4.3 on evergreen GPUs. I don't think anyone
is going to use it that often in practice, and if we can get it to the
level that passes conformance (still WIP) then I'll be happy. I think
optimising it to reduce CPU usage at compile time is way more important
than optimising it to reduce GPU usage.

Dave.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/50] glsl: Add "built-in" functions to do uint_to_fp64(uint)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 71 +
 src/compiler/glsl/builtin_functions.cpp |  4 ++
 src/compiler/glsl/builtin_functions.h   |  3 ++
 src/compiler/glsl/float64.glsl  | 22 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 101 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 2dcaba40..c200447 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5283,3 +5283,74 @@ fp64_to_uint(void *mem_ctx, builtin_available_predicate 
avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+uint_to_fp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0872 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0872);
+   ir_variable *const r0873 = body.make_temp(glsl_type::uvec2_type, 
"return_value");
+   /* IF CONDITION */
+   ir_expression *const r0875 = equal(r0872, body.constant(0u));
+   ir_if *f0874 = new(mem_ctx) ir_if(operand(r0875).val);
+   exec_list *const f0874_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f0874->then_instructions;
+
+  body.emit(assign(r0873, ir_constant::zero(mem_ctx, 
glsl_type::uvec2_type), 0x03));
+
+
+  /* ELSE INSTRUCTIONS */
+  body.instructions = &f0874->else_instructions;
+
+  ir_variable *const r0876 = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+  ir_expression *const r0877 = equal(r0872, body.constant(0u));
+  ir_expression *const r0878 = expr(ir_unop_find_msb, r0872);
+  ir_expression *const r0879 = sub(body.constant(int(31)), r0878);
+  ir_expression *const r087A = expr(ir_triop_csel, r0877, 
body.constant(int(32)), r0879);
+  body.emit(assign(r0876, add(r087A, body.constant(int(21))), 0x01));
+
+  ir_variable *const r087B = new(mem_ctx) 
ir_variable(glsl_type::uvec2_type, "z", ir_var_auto);
+  body.emit(r087B);
+  ir_expression *const r087C = sub(body.constant(int(1074)), r0876);
+  ir_expression *const r087D = expr(ir_unop_i2u, r087C);
+  ir_expression *const r087E = lshift(r087D, body.constant(int(20)));
+  ir_expression *const r087F = less(r0876, body.constant(int(32)));
+  ir_expression *const r0880 = neg(r0876);
+  ir_expression *const r0881 = bit_and(r0880, body.constant(int(31)));
+  ir_expression *const r0882 = rshift(r0872, r0881);
+  ir_expression *const r0883 = equal(r0876, body.constant(int(0)));
+  ir_expression *const r0884 = less(r0876, body.constant(int(64)));
+  ir_expression *const r0885 = add(r0876, body.constant(int(-32)));
+  ir_expression *const r0886 = lshift(r0872, r0885);
+  ir_expression *const r0887 = expr(ir_triop_csel, r0884, r0886, 
body.constant(0u));
+  ir_expression *const r0888 = expr(ir_triop_csel, r0883, 
body.constant(0u), r0887);
+  ir_expression *const r0889 = expr(ir_triop_csel, r087F, r0882, r0888);
+  body.emit(assign(r087B, add(r087E, r0889), 0x02));
+
+  ir_expression *const r088A = less(r0876, body.constant(int(32)));
+  ir_expression *const r088B = lshift(r0872, r0876);
+  ir_expression *const r088C = equal(r0876, body.constant(int(0)));
+  ir_expression *const r088D = expr(ir_triop_csel, r088C, r0872, 
body.constant(0u));
+  body.emit(assign(r087B, expr(ir_triop_csel, r088A, r088B, r088D), 0x01));
+
+  body.emit(assign(r0873, r087B, 0x03));
+
+
+   body.instructions = f0874_parent_instructions;
+   body.emit(f0874);
+
+   /* END IF */
+
+   body.emit(ret(r0873));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index a0fc9bc..20051b1 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3374,6 +3374,10 @@ builtin_builder::create_builtins()
 generate_ir::fp64_to_uint(mem_ctx, 
integer_functions_supported),
 NULL);
 
+   add_function("__builtin_uint_to_fp64",
+generate_ir::uint_to_fp64(mem_ctx, 
integer_functions_supported),
+NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h 
b/src/compiler/glsl/builtin_functions.h
index f99e3b7..a9674dc 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -91,6 +91,9 @@ fmul64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fp64_to_uint(void *mem_ctx,
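
The generated IR for uint_to_fp64 above is hard to read, so here is a hedged C++ sketch of what it appears to compute: count leading zeros via findMSB, shift the integer so its top set bit lands on the implicit-one position, and add the exponent. The helper names (find_msb, uint_to_fp64_bits, bits_to_double) are hypothetical, and a 64-bit integer stands in for the uvec2 pair to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Index of the highest set bit, or -1 for zero, matching what GLSL's
// findMSB returns for unsigned inputs.
inline int find_msb(uint32_t v) {
    int idx = -1;
    while (v != 0u) {
        v >>= 1;
        ++idx;
    }
    return idx;
}

// Convert an unsigned 32-bit integer to the bit pattern of a double.
inline uint64_t uint_to_fp64_bits(uint32_t a) {
    if (a == 0u)
        return 0u;
    // Leading-zero count plus 21, as in the IR; the result lies in 21..52.
    const int shift = (31 - find_msb(a)) + 21;
    // After the shift the leading one sits at bit 52 (the implicit bit).
    const uint64_t frac = static_cast<uint64_t>(a) << shift;
    // 1074 matches the constant in the IR. Adding frac lets the implicit
    // bit carry into the exponent field, giving a biased exponent of
    // 1075 - shift.
    const uint64_t exp = static_cast<uint64_t>(1074 - shift);
    return (exp << 52) + frac;
}

// Reinterpret a 64-bit pattern as a double, for checking results.
inline double bits_to_double(uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Every 32-bit unsigned integer is exactly representable as a double, so no rounding step is needed here, which is presumably why this builtin is so much shorter than fp64_to_uint.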

[Mesa-dev] [PATCH 09/50] glsl: Add "built-in" functions to do fp64_to_uint(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 209 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  61 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 278 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index ca56d3b..2dcaba40 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5074,3 +5074,212 @@ fmul64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+shift64Right(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0818 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a0", ir_var_function_in);
+   sig_parameters.push_tail(r0818);
+   ir_variable *const r0819 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a1", ir_var_function_in);
+   sig_parameters.push_tail(r0819);
+   ir_variable *const r081A = new(mem_ctx) ir_variable(glsl_type::int_type, 
"count", ir_var_function_in);
+   sig_parameters.push_tail(r081A);
+   ir_variable *const r081B = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z0Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r081B);
+   ir_variable *const r081C = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z1Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r081C);
+   ir_expression *const r081D = equal(r081A, body.constant(int(0)));
+   ir_expression *const r081E = less(r081A, body.constant(int(32)));
+   ir_expression *const r081F = neg(r081A);
+   ir_expression *const r0820 = bit_and(r081F, body.constant(int(31)));
+   ir_expression *const r0821 = lshift(r0818, r0820);
+   ir_expression *const r0822 = rshift(r0819, r081A);
+   ir_expression *const r0823 = bit_or(r0821, r0822);
+   ir_expression *const r0824 = less(r081A, body.constant(int(64)));
+   ir_expression *const r0825 = bit_and(r081A, body.constant(int(31)));
+   ir_expression *const r0826 = rshift(r0818, r0825);
+   ir_expression *const r0827 = expr(ir_triop_csel, r0824, r0826, 
body.constant(0u));
+   ir_expression *const r0828 = expr(ir_triop_csel, r081E, r0823, r0827);
+   body.emit(assign(r081C, expr(ir_triop_csel, r081D, r0818, r0828), 0x01));
+
+   ir_expression *const r0829 = equal(r081A, body.constant(int(0)));
+   ir_expression *const r082A = less(r081A, body.constant(int(32)));
+   ir_expression *const r082B = rshift(r0818, r081A);
+   ir_expression *const r082C = expr(ir_triop_csel, r082A, r082B, 
body.constant(0u));
+   body.emit(assign(r081B, expr(ir_triop_csel, r0829, r0818, r082C), 0x01));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+fp64_to_uint(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r082D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r082D);
+   ir_variable *const r082E = body.make_temp(glsl_type::uint_type, 
"return_value");
+   ir_variable *const r082F = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFracHi", ir_var_auto);
+   body.emit(r082F);
+   ir_variable *const r0830 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFracLo", ir_var_auto);
+   body.emit(r0830);
+   body.emit(assign(r0830, swizzle_x(r082D), 0x01));
+
+   ir_variable *const r0831 = body.make_temp(glsl_type::uint_type, 
"extractFloat64FracHi_retval");
+   body.emit(assign(r0831, bit_and(swizzle_y(r082D), body.constant(1048575u)), 
0x01));
+
+   body.emit(assign(r082F, r0831, 0x01));
+
+   ir_variable *const r0832 = body.make_temp(glsl_type::int_type, 
"extractFloat64Exp_retval");
+   ir_expression *const r0833 = rshift(swizzle_y(r082D), 
body.constant(int(20)));
+   ir_expression *const r0834 = bit_and(r0833, body.constant(2047u));
+   body.emit(assign(r0832, expr(ir_unop_u2i, r0834), 0x01));
+
+   ir_variable *const r0835 = body.make_temp(glsl_type::uint_type, 
"extractFloat64Sign_retval");
+   body.emit(assign(r0835, rshift(swizzle_y(r082D), body.constant(int(31))), 
0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r0837 = equal(r0832, body.constant(int(2047)));
+   ir_expression *const r0838 = bit_or(r0831, swizzle_x(r082D));
+   ir_expression *const r0839 = nequal(r0838, body.constant(0u));
+   ir_expression *const r083A = logic_and(r0837, r0839);
+   ir_if *f0836 = new(mem_ctx)

[Mesa-dev] [PATCH 13/50] glsl: Add "built-in" functions to do fp64_to_fp32(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 388 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  | 100 
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 496 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index b656fad..f937a2f 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -5662,3 +5662,391 @@ int_to_fp64(void *mem_ctx, builtin_available_predicate 
avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+packFloat32(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::float_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r08EC = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zSign", ir_var_function_in);
+   sig_parameters.push_tail(r08EC);
+   ir_variable *const r08ED = new(mem_ctx) ir_variable(glsl_type::int_type, 
"zExp", ir_var_function_in);
+   sig_parameters.push_tail(r08ED);
+   ir_variable *const r08EE = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac", ir_var_function_in);
+   sig_parameters.push_tail(r08EE);
+   ir_variable *const r08EF = body.make_temp(glsl_type::float_type, 
"uintBitsToFloat_retval");
+   ir_expression *const r08F0 = lshift(r08EC, body.constant(int(31)));
+   ir_expression *const r08F1 = expr(ir_unop_i2u, r08ED);
+   ir_expression *const r08F2 = lshift(r08F1, body.constant(int(23)));
+   ir_expression *const r08F3 = add(r08F0, r08F2);
+   ir_expression *const r08F4 = add(r08F3, r08EE);
+   body.emit(assign(r08EF, expr(ir_unop_bitcast_u2f, r08F4), 0x01));
+
+   body.emit(ret(r08EF));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+roundAndPackFloat32(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::float_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r08F5 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zSign", ir_var_function_in);
+   sig_parameters.push_tail(r08F5);
+   ir_variable *const r08F6 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"zExp", ir_var_function_in);
+   sig_parameters.push_tail(r08F6);
+   ir_variable *const r08F7 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zFrac", ir_var_function_in);
+   sig_parameters.push_tail(r08F7);
+   ir_variable *const r08F8 = body.make_temp(glsl_type::bool_type, 
"execute_flag");
+   body.emit(assign(r08F8, body.constant(true), 0x01));
+
+   ir_variable *const r08F9 = body.make_temp(glsl_type::float_type, 
"return_value");
+   ir_variable *const r08FA = new(mem_ctx) ir_variable(glsl_type::int_type, 
"roundBits", ir_var_auto);
+   body.emit(r08FA);
+   ir_expression *const r08FB = bit_and(r08F7, body.constant(127u));
+   body.emit(assign(r08FA, expr(ir_unop_u2i, r08FB), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r08FD = expr(ir_unop_i2u, r08F6);
+   ir_expression *const r08FE = gequal(r08FD, body.constant(253u));
+   ir_if *f08FC = new(mem_ctx) ir_if(operand(r08FE).val);
+   exec_list *const f08FC_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f08FC->then_instructions;
+
+  /* IF CONDITION */
+  ir_expression *const r0900 = less(body.constant(int(253)), r08F6);
+  ir_expression *const r0901 = equal(r08F6, body.constant(int(253)));
+  ir_expression *const r0902 = expr(ir_unop_u2i, r08F7);
+  ir_expression *const r0903 = less(r0902, body.constant(int(-64)));
+  ir_expression *const r0904 = logic_and(r0901, r0903);
+  ir_expression *const r0905 = logic_or(r0900, r0904);
+  ir_if *f08FF = new(mem_ctx) ir_if(operand(r0905).val);
+  exec_list *const f08FF_parent_instructions = body.instructions;
+
+ /* THEN INSTRUCTIONS */
+ body.instructions = &f08FF->then_instructions;
+
+ ir_expression *const r0906 = lshift(r08F5, body.constant(int(31)));
+ ir_expression *const r0907 = add(r0906, body.constant(2139095040u));
+ body.emit(assign(r08F9, expr(ir_unop_bitcast_u2f, r0907), 0x01));
+
+ body.emit(assign(r08F8, body.constant(false), 0x01));
+
+
+ /* ELSE INSTRUCTIONS */
+ body.instructions = &f08FF->else_instructions;
+
+ ir_variable *const r0908 = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+ body.emit(assign(r0908, neg(r08F6), 0x01));
+
+ ir_variable *const r0909 = body.make_temp(glsl_type::bool_type, 
"assignment_tmp");
+
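
As a reading aid for packFloat32 above, a small C++ sketch of the same packing step follows. It assumes the IR is a direct translation of (zSign<<31) + (zExp<<23) + zFrac followed by a bitcast; the function name pack_float32 and the memcpy-based bitcast are illustrative choices, not the patch's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assemble a single-precision float from sign, biased exponent and fraction.
// The fields are added rather than OR-ed, so a fraction that overflows into
// bit 23 carries into the exponent field.
inline float pack_float32(uint32_t zSign, int32_t zExp, uint32_t zFrac) {
    const uint32_t bits =
        (zSign << 31) + (static_cast<uint32_t>(zExp) << 23) + zFrac;
    float f;
    std::memcpy(&f, &bits, sizeof f);  // stands in for uintBitsToFloat
    return f;
}
```

For example, pack_float32(0, 127, 0) gives 1.0f, and pack_float32(0, 126, 0x800000) also gives 1.0f because the fraction's bit 23 bumps the exponent from 126 to 127. The overflow branch in roundAndPackFloat32 builds infinity the same way, as (zSign<<31) + 0x7F800000 (2139095040u in the IR).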

[Mesa-dev] [PATCH 08/50] glsl: Add "built-in" functions to do mul(fp64, fp64) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 1348 +++
 src/compiler/glsl/builtin_functions.cpp |4 +
 src/compiler/glsl/builtin_functions.h   |3 +
 src/compiler/glsl/float64.glsl  |  148 
 src/compiler/glsl/glcpp/glcpp-parse.y   |1 +
 5 files changed, 1504 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 0ebfb42..ca56d3b 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -3726,3 +3726,1351 @@ fadd64(void *mem_ctx, builtin_available_predicate 
avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+mul32To64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r05FE = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r05FE);
+   ir_variable *const r05FF = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"b", ir_var_function_in);
+   sig_parameters.push_tail(r05FF);
+   ir_variable *const r0600 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z0Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r0600);
+   ir_variable *const r0601 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z1Ptr", ir_var_function_inout);
+   sig_parameters.push_tail(r0601);
+   ir_variable *const r0602 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z0", ir_var_auto);
+   body.emit(r0602);
+   ir_variable *const r0603 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"zMiddleA", ir_var_auto);
+   body.emit(r0603);
+   ir_variable *const r0604 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z1", ir_var_auto);
+   body.emit(r0604);
+   ir_variable *const r0605 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0605, bit_and(r05FE, body.constant(65535u)), 0x01));
+
+   ir_variable *const r0606 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0606, rshift(r05FE, body.constant(int(16))), 0x01));
+
+   ir_variable *const r0607 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0607, bit_and(r05FF, body.constant(65535u)), 0x01));
+
+   ir_variable *const r0608 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0608, rshift(r05FF, body.constant(int(16))), 0x01));
+
+   ir_variable *const r0609 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0609, mul(r0606, r0607), 0x01));
+
+   ir_expression *const r060A = mul(r0605, r0608);
+   body.emit(assign(r0603, add(r060A, r0609), 0x01));
+
+   ir_expression *const r060B = mul(r0606, r0608);
+   ir_expression *const r060C = less(r0603, r0609);
+   ir_expression *const r060D = expr(ir_unop_b2i, r060C);
+   ir_expression *const r060E = expr(ir_unop_i2u, r060D);
+   ir_expression *const r060F = lshift(r060E, body.constant(int(16)));
+   ir_expression *const r0610 = rshift(r0603, body.constant(int(16)));
+   ir_expression *const r0611 = add(r060F, r0610);
+   body.emit(assign(r0602, add(r060B, r0611), 0x01));
+
+   body.emit(assign(r0603, lshift(r0603, body.constant(int(16))), 0x01));
+
+   ir_expression *const r0612 = mul(r0605, r0607);
+   body.emit(assign(r0604, add(r0612, r0603), 0x01));
+
+   ir_expression *const r0613 = less(r0604, r0603);
+   ir_expression *const r0614 = expr(ir_unop_b2i, r0613);
+   ir_expression *const r0615 = expr(ir_unop_i2u, r0614);
+   body.emit(assign(r0602, add(r0602, r0615), 0x01));
+
+   body.emit(assign(r0601, r0604, 0x01));
+
+   body.emit(assign(r0600, r0602, 0x01));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+mul64To128(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::void_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0616 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a0", ir_var_function_in);
+   sig_parameters.push_tail(r0616);
+   ir_variable *const r0617 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a1", ir_var_function_in);
+   sig_parameters.push_tail(r0617);
+   ir_variable *const r0618 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"b0", ir_var_function_in);
+   sig_parameters.push_tail(r0618);
+   ir_variable *const r0619 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"b1", ir_var_function_in);
+   sig_parameters.push_tail(r0619);
+   ir_variable *const r061A = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"z0Ptr", ir_var_function_inout);
+
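
The mul32To64 IR above builds a 32x32 -> 64-bit product out of 16-bit halves. The following C++ sketch mirrors that sequence of operations step by step, as far as it can be read from the IR; the snake_case function names and the mul32_to_64_combined helper are hypothetical additions for checking the result.

```cpp
#include <cassert>
#include <cstdint>

// Multiply two 32-bit values into a 64-bit result using only 32-bit
// arithmetic. z0 receives the high word, z1 the low word.
inline void mul32_to_64(uint32_t a, uint32_t b, uint32_t &z0, uint32_t &z1) {
    const uint32_t aLow = a & 0xFFFFu;
    const uint32_t aHigh = a >> 16;
    const uint32_t bLow = b & 0xFFFFu;
    const uint32_t bHigh = b >> 16;

    const uint32_t zMiddleB = aHigh * bLow;
    uint32_t zMiddleA = aLow * bHigh + zMiddleB;  // may wrap around

    // A wrapped middle sum is detected by comparing against one addend;
    // the lost carry is worth 1 << 16 in the high word.
    uint32_t hi = aHigh * bHigh +
                  ((static_cast<uint32_t>(zMiddleA < zMiddleB) << 16) +
                   (zMiddleA >> 16));

    zMiddleA <<= 16;
    const uint32_t lo = aLow * bLow + zMiddleA;   // may wrap around
    hi += static_cast<uint32_t>(lo < zMiddleA);   // carry out of the low word

    z0 = hi;
    z1 = lo;
}

// Convenience wrapper for checking against native 64-bit multiplication.
inline uint64_t mul32_to_64_combined(uint32_t a, uint32_t b) {
    uint32_t hi = 0u;
    uint32_t lo = 0u;
    mul32_to_64(a, b, hi, lo);
    return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

The unsigned comparisons used for carry detection are branch-free, which fits the cover letter's stated aim of avoiding if statements in these builtins.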

[Mesa-dev] [PATCH 14/50] glsl: Add "built-in" functions to do fp32_to_fp64(fp32)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 192 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  38 +++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 238 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index f937a2f..034d2d0 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -6050,3 +6050,195 @@ fp64_to_fp32(void *mem_ctx, builtin_available_predicate 
avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+fp32_to_fp64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r097F = new(mem_ctx) ir_variable(glsl_type::float_type, 
"f", ir_var_function_in);
+   sig_parameters.push_tail(r097F);
+   ir_variable *const r0980 = body.make_temp(glsl_type::bool_type, 
"execute_flag");
+   body.emit(assign(r0980, body.constant(true), 0x01));
+
+   ir_variable *const r0981 = body.make_temp(glsl_type::uvec2_type, 
"return_value");
+   ir_variable *const r0982 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aSign", ir_var_auto);
+   body.emit(r0982);
+   ir_variable *const r0983 = new(mem_ctx) ir_variable(glsl_type::int_type, 
"aExp", ir_var_auto);
+   body.emit(r0983);
+   ir_variable *const r0984 = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"aFrac", ir_var_auto);
+   body.emit(r0984);
+   ir_variable *const r0985 = body.make_temp(glsl_type::uint_type, 
"floatBitsToUint_retval");
+   body.emit(assign(r0985, expr(ir_unop_bitcast_f2u, r097F), 0x01));
+
+   ir_variable *const r0986 = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+   body.emit(assign(r0986, bit_and(r0985, body.constant(8388607u)), 0x01));
+
+   body.emit(assign(r0984, r0986, 0x01));
+
+   ir_variable *const r0987 = body.make_temp(glsl_type::int_type, 
"assignment_tmp");
+   ir_expression *const r0988 = rshift(r0985, body.constant(int(23)));
+   ir_expression *const r0989 = bit_and(r0988, body.constant(255u));
+   body.emit(assign(r0987, expr(ir_unop_u2i, r0989), 0x01));
+
+   body.emit(assign(r0983, r0987, 0x01));
+
+   body.emit(assign(r0982, rshift(r0985, body.constant(int(31))), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r098B = equal(r0987, body.constant(int(255)));
+   ir_if *f098A = new(mem_ctx) ir_if(operand(r098B).val);
+   exec_list *const f098A_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f098A->then_instructions;
+
+  /* IF CONDITION */
+  ir_expression *const r098D = nequal(r0986, body.constant(0u));
+  ir_if *f098C = new(mem_ctx) ir_if(operand(r098D).val);
+  exec_list *const f098C_parent_instructions = body.instructions;
+
+ /* THEN INSTRUCTIONS */
+ body.instructions = &f098C->then_instructions;
+
+ ir_variable *const r098E = body.make_temp(glsl_type::uint_type, 
"assignment_tmp");
+ body.emit(assign(r098E, lshift(r0985, body.constant(int(9))), 0x01));
+
+ ir_variable *const r098F = body.make_temp(glsl_type::uvec2_type, 
"vec_ctor");
+ ir_expression *const r0990 = lshift(r098E, body.constant(int(20)));
+ body.emit(assign(r098F, bit_or(r0990, body.constant(0u)), 0x01));
+
+ ir_expression *const r0991 = rshift(r098E, body.constant(int(12)));
+ ir_expression *const r0992 = lshift(r0982, body.constant(int(31)));
+ ir_expression *const r0993 = bit_or(r0992, 
body.constant(2146959360u));
+ body.emit(assign(r098F, bit_or(r0991, r0993), 0x02));
+
+ body.emit(assign(r0981, r098F, 0x03));
+
+ body.emit(assign(r0980, body.constant(false), 0x01));
+
+
+ /* ELSE INSTRUCTIONS */
+ body.instructions = &f098C->else_instructions;
+
+ ir_variable *const r0994 = new(mem_ctx) 
ir_variable(glsl_type::uvec2_type, "z", ir_var_auto);
+ body.emit(r0994);
+ ir_expression *const r0995 = lshift(r0982, body.constant(int(31)));
+ body.emit(assign(r0994, add(r0995, body.constant(2146435072u)), 
0x02));
+
+ body.emit(assign(r0994, body.constant(0u), 0x01));
+
+ body.emit(assign(r0981, r0994, 0x03));
+
+ body.emit(assign(r0980, body.constant(false), 0x01));
+
+
+  body.instructions = f098C_parent_instructions;
+  body.emit(f098C);
+
+  /* END IF */
+
+
+  /* ELSE INSTRUCTIONS */
+  body.instructions = &f098A->else_instructions;
+
+  /* IF CONDITION */
+  ir_expression *const r0997 = equal(r0987, body.constant(int(0)));
+  ir_if *f0996 = new(mem_ctx) ir_if(operand(r0997).val);

[Mesa-dev] [PATCH 05/50] glsl: add utility function to extract 64-bit sign.

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

[airlied: left over from dropping le64]
Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 18 ++
 src/compiler/glsl/float64.glsl  |  7 +++
 2 files changed, 25 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 2340c48..6a8afea 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -200,3 +200,21 @@ feq64(void *mem_ctx, builtin_available_predicate avail)
    sig->replace_parameters(&sig_parameters);
    return sig;
 }
+ir_function_signature *
+extractFloat64Sign(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uint_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0049 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0049);
+   ir_expression *const r004A = rshift(swizzle_y(r0049), 
body.constant(int(31)));
+   body.emit(ret(r004A));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index 0cd7991..6d939c2 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -104,3 +104,10 @@ feq64(uvec2 a, uvec2 b)
   ((a.y == b.y) || ((a.x == 0u) && (((a.y | b.y)<<1) == 0u)));
return mix(result, false, isaNaN || isbNaN);
 }
+
+/* Returns the sign bit of the double-precision floating-point value `a'.*/
+uint
+extractFloat64Sign(uvec2 a)
+{
+   return (a.y>>31);
+}
-- 
2.9.5
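
Note: the helper in this patch can be checked on the host with a minimal C sketch. The uvec2 struct and pack_double() below are hypothetical stand-ins that are not part of the patch; the sketch assumes the host double is IEEE 754 binary64 and that, as in float64.glsl, .x holds the low word and .y the high word.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GLSL's uvec2: x = low 32 bits, y = high 32 bits. */
struct uvec2 { uint32_t x, y; };

/* Helper (not in the patch): reinterpret a host double as two 32-bit words. */
static struct uvec2 pack_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    struct uvec2 v = { (uint32_t)bits, (uint32_t)(bits >> 32) };
    return v;
}

/* Mirrors extractFloat64Sign(): the sign is bit 31 of the high word. */
static uint32_t extract_float64_sign(struct uvec2 a)
{
    return a.y >> 31;
}
```

Because only the high word is inspected, negative zero reports a sign of 1 just like any other negative value.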

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 03/50] glsl: Add "built-in" function to do sign(fp64) (v2)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

v2: use mix.

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 28 
 src/compiler/glsl/builtin_functions.cpp |  4 
 src/compiler/glsl/builtin_functions.h   |  3 +++
 src/compiler/glsl/float64.glsl  |  9 +
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 5 files changed, 45 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 2898fc9..8546048 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -68,3 +68,31 @@ fneg64(void *mem_ctx, builtin_available_predicate avail)
sig->replace_parameters(&sig_parameters);
return sig;
 }
+ir_function_signature *
+fsign64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r001D = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r001D);
+   ir_variable *const r001E = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"retval", ir_var_auto);
+   body.emit(r001E);
+   body.emit(assign(r001E, body.constant(0u), 0x01));
+
+   ir_expression *const r001F = lshift(swizzle_y(r001D), 
body.constant(int(1)));
+   ir_expression *const r0020 = bit_or(r001F, swizzle_x(r001D));
+   ir_expression *const r0021 = equal(r0020, body.constant(0u));
+   ir_expression *const r0022 = bit_and(swizzle_y(r001D), 
body.constant(2147483648u));
+   ir_expression *const r0023 = bit_or(r0022, body.constant(1072693248u));
+   body.emit(assign(r001E, expr(ir_triop_csel, r0021, body.constant(0u), 
r0023), 0x02));
+
+   body.emit(ret(r001E));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 9d88a31..17aa868 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3350,6 +3350,10 @@ builtin_builder::create_builtins()
 generate_ir::fneg64(mem_ctx, integer_functions_supported),
 NULL);
 
+   add_function("__builtin_fsign64",
+generate_ir::fsign64(mem_ctx, integer_functions_supported),
+NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h 
b/src/compiler/glsl/builtin_functions.h
index adec424..7954373 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -73,6 +73,9 @@ fabs64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 fneg64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+fsign64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index fedf8b7..f8eb1f3 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -51,3 +51,12 @@ fneg64(uvec2 a)
a.y = mix(t, a.y, is_nan(a));
return a;
 }
+
+uvec2
+fsign64(uvec2 a)
+{
+   uvec2 retval;
+   retval.x = 0u;
+   retval.y = mix((a.y & 0x80000000u) | 0x3FF00000u, 0u, (a.y << 1 | a.x) == 0u);
+   return retval;
+}
diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index b9506d8..666543b 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -2370,6 +2370,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
  add_builtin_define(parser, "__have_builtin_builtin_imod64", 1);
  add_builtin_define(parser, "__have_builtin_builtin_fabs64", 1);
  add_builtin_define(parser, "__have_builtin_builtin_fneg64", 1);
+ add_builtin_define(parser, "__have_builtin_builtin_fsign64", 1);
   }
}
 
-- 
2.9.5
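
Note: a minimal C sketch of the same selection logic, for checking bit patterns on the host. The uvec2 typedef is a hypothetical stand-in, and the masks are written as 0x80000000 and 0x3FF00000, i.e. the 2147483648u and 1072693248u constants from the generated builder code.

```c
#include <stdint.h>

/* Hypothetical stand-in for GLSL's uvec2: x = low word, y = high word. */
typedef struct { uint32_t x, y; } uvec2;

/* Returns the encoding of +1.0 or -1.0 with the sign of a, or 0.0 for +/-0.0. */
static uvec2 fsign64(uvec2 a)
{
    uvec2 r;
    r.x = 0u;
    /* Shifting the sign bit out leaves zero only for +0.0 and -0.0. */
    int is_zero = ((a.y << 1) | a.x) == 0u;
    r.y = is_zero ? 0u : ((a.y & 0x80000000u) | 0x3FF00000u);
    return r;
}
```

As written, the logic has no special case for NaN, so a NaN input also comes back as +1.0 or -1.0 depending on its sign bit.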


[Mesa-dev] [PATCH 01/50] glsl: Add "built-in" function to do abs(fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/Makefile.sources   |  1 +
 src/compiler/glsl/builtin_float64.h | 19 +++
 src/compiler/glsl/builtin_functions.cpp |  4 
 src/compiler/glsl/builtin_functions.h   |  3 +++
 src/compiler/glsl/float64.glsl  | 29 +
 src/compiler/glsl/generate_ir.cpp   |  1 +
 src/compiler/glsl/glcpp/glcpp-parse.y   |  1 +
 7 files changed, 58 insertions(+)
 create mode 100644 src/compiler/glsl/builtin_float64.h
 create mode 100644 src/compiler/glsl/float64.glsl

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index b29218e..ee223ae 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -22,6 +22,7 @@ LIBGLSL_FILES = \
glsl/builtin_functions.cpp \
glsl/builtin_functions.h \
glsl/builtin_int64.h \
+   glsl/builtin_float64.h \
glsl/builtin_types.cpp \
glsl/builtin_variables.cpp \
glsl/generate_ir.cpp \
diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
new file mode 100644
index 0000000..7b57231
--- /dev/null
+++ b/src/compiler/glsl/builtin_float64.h
@@ -0,0 +1,19 @@
+ir_function_signature *
+fabs64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::uvec2_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r000B = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r000B);
+   body.emit(assign(r000B, bit_and(swizzle_y(r000B), 
body.constant(2147483647u)), 0x02));
+
+   body.emit(ret(r000B));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 5f772c9..133a896 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3342,6 +3342,10 @@ builtin_builder::create_builtins()
 generate_ir::umul64(mem_ctx, integer_functions_supported),
 NULL);
 
+   add_function("__builtin_fabs64",
+generate_ir::fabs64(mem_ctx, integer_functions_supported),
+NULL);
+
 #undef F
 #undef FI
 #undef FIUD_VEC
diff --git a/src/compiler/glsl/builtin_functions.h 
b/src/compiler/glsl/builtin_functions.h
index 89ec9b7..deaf640 100644
--- a/src/compiler/glsl/builtin_functions.h
+++ b/src/compiler/glsl/builtin_functions.h
@@ -67,6 +67,9 @@ sign64(void *mem_ctx, builtin_available_predicate avail);
 ir_function_signature *
 udivmod64(void *mem_ctx, builtin_available_predicate avail);
 
+ir_function_signature *
+fabs64(void *mem_ctx, builtin_available_predicate avail);
+
 }
 
 #endif /* BULITIN_FUNCTIONS_H */
diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
new file mode 100644
index 0000000..d798d7e
--- /dev/null
+++ b/src/compiler/glsl/float64.glsl
@@ -0,0 +1,29 @@
+/* Compile with:
+ *
+ * glsl_compiler --version 130 --dump-builder float64.glsl > builtin_float64.h
+ *
+ */
+
+#version 130
+#extension GL_ARB_shader_bit_encoding : enable
+
+/* Software IEEE floating-point rounding mode.
+ * GLSL spec section "4.7.1 Range and Precision":
+ * The rounding mode cannot be set and is undefined.
+ * But here, we are able to define the rounding mode at the compilation time.
+ */
+#define FLOAT_ROUND_NEAREST_EVEN    0
+#define FLOAT_ROUND_TO_ZERO         1
+#define FLOAT_ROUND_DOWN            2
+#define FLOAT_ROUND_UP              3
+#define FLOAT_ROUNDING_MODE         FLOAT_ROUND_NEAREST_EVEN
+
+/* Absolute value of a Float64 :
+ * Clear the sign bit
+ */
+uvec2
+fabs64(uvec2 a)
+{
+   a.y &= 0x7FFFFFFFu;
+   return a;
+}
diff --git a/src/compiler/glsl/generate_ir.cpp 
b/src/compiler/glsl/generate_ir.cpp
index 255b048..e6ece48 100644
--- a/src/compiler/glsl/generate_ir.cpp
+++ b/src/compiler/glsl/generate_ir.cpp
@@ -29,5 +29,6 @@ using namespace ir_builder;
 namespace generate_ir {
 
 #include "builtin_int64.h"
+#include "builtin_float64.h"
 
 }
diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index 913bce1..4e7affa 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -2368,6 +2368,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
  add_builtin_define(parser, "__have_builtin_builtin_umod64", 1);
  add_builtin_define(parser, "__have_builtin_builtin_idiv64", 1);
  add_builtin_define(parser, "__have_builtin_builtin_imod64", 1);
+ add_builtin_define(parser, "__have_builtin_builtin_fabs64", 1);
   }
}
 
-- 
2.9.5
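
Note: a minimal C sketch of the sign-clearing step, using the mask 0x7FFFFFFF (the 2147483647u constant in the generated builder code). The uvec2 typedef is a hypothetical stand-in for the GLSL type, with .y as the high word.

```c
#include <stdint.h>

/* Hypothetical stand-in for GLSL's uvec2: x = low word, y = high word. */
typedef struct { uint32_t x, y; } uvec2;

/* Clear bit 31 of the high word; the low word is left untouched. */
static uvec2 fabs64(uvec2 a)
{
    a.y &= 0x7FFFFFFFu;
    return a;
}
```

The operation never looks at the exponent or fraction, so infinities and NaNs only lose their sign bit.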


[Mesa-dev] [PATCH 06/50] glsl: Add "built-in" functions to do lt(fp64, fp64)

2018-03-12 Thread Dave Airlie

From: Elie Tournier 

Signed-off-by: Elie Tournier 
---
 src/compiler/glsl/builtin_float64.h | 135 
 src/compiler/glsl/builtin_functions.cpp |   4 +
 src/compiler/glsl/builtin_functions.h   |   3 +
 src/compiler/glsl/float64.glsl  |  42 ++
 src/compiler/glsl/glcpp/glcpp-parse.y   |   1 +
 5 files changed, 185 insertions(+)

diff --git a/src/compiler/glsl/builtin_float64.h 
b/src/compiler/glsl/builtin_float64.h
index 6a8afea..f7e613f 100644
--- a/src/compiler/glsl/builtin_float64.h
+++ b/src/compiler/glsl/builtin_float64.h
@@ -218,3 +218,138 @@ extractFloat64Sign(void *mem_ctx, 
builtin_available_predicate avail)
sig->replace_parameters(&sig_parameters);
return sig;
 }
+ir_function_signature *
+lt64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r004B = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a0", ir_var_function_in);
+   sig_parameters.push_tail(r004B);
+   ir_variable *const r004C = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"a1", ir_var_function_in);
+   sig_parameters.push_tail(r004C);
+   ir_variable *const r004D = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"b0", ir_var_function_in);
+   sig_parameters.push_tail(r004D);
+   ir_variable *const r004E = new(mem_ctx) ir_variable(glsl_type::uint_type, 
"b1", ir_var_function_in);
+   sig_parameters.push_tail(r004E);
+   ir_expression *const r004F = less(r004B, r004D);
+   ir_expression *const r0050 = equal(r004B, r004D);
+   ir_expression *const r0051 = less(r004C, r004E);
+   ir_expression *const r0052 = logic_and(r0050, r0051);
+   ir_expression *const r0053 = logic_or(r004F, r0052);
+   body.emit(ret(r0053));
+
+   sig->replace_parameters(&sig_parameters);
+   return sig;
+}
+ir_function_signature *
+flt64(void *mem_ctx, builtin_available_predicate avail)
+{
+   ir_function_signature *const sig =
+  new(mem_ctx) ir_function_signature(glsl_type::bool_type, avail);
+   ir_factory body(&sig->body, mem_ctx);
+   sig->is_defined = true;
+
+   exec_list sig_parameters;
+
+   ir_variable *const r0054 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"a", ir_var_function_in);
+   sig_parameters.push_tail(r0054);
+   ir_variable *const r0055 = new(mem_ctx) ir_variable(glsl_type::uvec2_type, 
"b", ir_var_function_in);
+   sig_parameters.push_tail(r0055);
+   ir_variable *const r0056 = body.make_temp(glsl_type::bool_type, 
"return_value");
+   ir_variable *const r0057 = new(mem_ctx) ir_variable(glsl_type::bool_type, 
"isbNaN", ir_var_auto);
+   body.emit(r0057);
+   ir_variable *const r0058 = new(mem_ctx) ir_variable(glsl_type::bool_type, 
"isaNaN", ir_var_auto);
+   body.emit(r0058);
+   ir_expression *const r0059 = rshift(swizzle_y(r0054), 
body.constant(int(20)));
+   ir_expression *const r005A = bit_and(r0059, body.constant(2047u));
+   ir_expression *const r005B = expr(ir_unop_u2i, r005A);
+   ir_expression *const r005C = equal(r005B, body.constant(int(2047)));
+   ir_expression *const r005D = bit_and(swizzle_y(r0054), 
body.constant(1048575u));
+   ir_expression *const r005E = bit_or(r005D, swizzle_x(r0054));
+   ir_expression *const r005F = nequal(r005E, body.constant(0u));
+   body.emit(assign(r0058, logic_and(r005C, r005F), 0x01));
+
+   ir_expression *const r0060 = rshift(swizzle_y(r0055), 
body.constant(int(20)));
+   ir_expression *const r0061 = bit_and(r0060, body.constant(2047u));
+   ir_expression *const r0062 = expr(ir_unop_u2i, r0061);
+   ir_expression *const r0063 = equal(r0062, body.constant(int(2047)));
+   ir_expression *const r0064 = bit_and(swizzle_y(r0055), 
body.constant(1048575u));
+   ir_expression *const r0065 = bit_or(r0064, swizzle_x(r0055));
+   ir_expression *const r0066 = nequal(r0065, body.constant(0u));
+   body.emit(assign(r0057, logic_and(r0063, r0066), 0x01));
+
+   /* IF CONDITION */
+   ir_expression *const r0068 = logic_or(r0058, r0057);
+   ir_if *f0067 = new(mem_ctx) ir_if(operand(r0068).val);
+   exec_list *const f0067_parent_instructions = body.instructions;
+
+  /* THEN INSTRUCTIONS */
+  body.instructions = &f0067->then_instructions;
+
+  body.emit(assign(r0056, body.constant(false), 0x01));
+
+
+  /* ELSE INSTRUCTIONS */
+  body.instructions = &f0067->else_instructions;
+
+  ir_variable *const r0069 = body.make_temp(glsl_type::uint_type, 
"extractFloat64Sign_retval");
+  body.emit(assign(r0069, rshift(swizzle_y(r0054), 
body.constant(int(31))), 0x01));
+
+  ir_variable *const r006A = body.make_temp(glsl_type::uint_type, 
"extractFloat64Sign_retval");
+  body.emit(assign(r006A, rshift(swizzle_y(r0055), 
body.constant(int(31))), 0x01));
+
+  /* IF CONDITION */
+  ir_expression *const r006C = nequal(r0069, r006A);
+  ir_if *f006B
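
Note: the two pieces of this patch that are fully visible above, the lt64 word-pair comparison and the NaN test at the top of flt64, can be sketched in C as follows. The function and parameter names are illustrative; a0/b0 are taken to be the high words, which follows from the comparison order in the builder code. The rest of flt64 is cut off in this archive and is not reconstructed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* (a0:a1) < (b0:b1) as unsigned 64-bit values; a0 and b0 are the high words. */
static bool lt64(uint32_t a0, uint32_t a1, uint32_t b0, uint32_t b1)
{
    return (a0 < b0) || ((a0 == b0) && (a1 < b1));
}

/* NaN test as emitted for isaNaN/isbNaN: exponent all ones, fraction nonzero.
 * 2047 == 0x7FF is the exponent mask, 1048575 == 0xFFFFF the high fraction bits. */
static bool is_nan64(uint32_t lo, uint32_t hi)
{
    bool exp_all_ones = ((hi >> 20) & 0x7FFu) == 0x7FFu;
    bool frac_nonzero = ((hi & 0x000FFFFFu) | lo) != 0u;
    return exp_all_ones && frac_nonzero;
}
```

An infinity has the same all-ones exponent but a zero fraction, which is why both halves of the test are needed.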

[Mesa-dev] [PATCH] r600: fix abs for op3 sources

2018-03-12 Thread sroland

From: Roland Scheidegger 

If a src was referencing the same temp as the dst, the per-component
copy code didn't work.
e.g.
  cndge r0.xy, r0.xx, |r2|, r3
got expanded into
  mov  r12.x, |r2|
  cndge r0.x, r0.x, r12, r3
  mov  r12.y, |r2|
  cndge r0.y, r0.x, r12, r3
hence for the second cndge r0.x was mistakenly the previous cndge result.
Fix this by doing all the movs first, so there's no bogus alu.last in between.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905
---
 src/gallium/drivers/r600/r600_shader.c | 110 +
 1 file changed, 56 insertions(+), 54 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 6b5c42f86d..bd511c76ac 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -7076,33 +7076,42 @@ static int tgsi_helper_copy(struct r600_shader_ctx 
*ctx, struct tgsi_full_instru
 }
 
 static int tgsi_make_src_for_op3(struct r600_shader_ctx *ctx,
- unsigned temp, int chan,
+ unsigned writemask,
  struct r600_bytecode_alu_src *bc_src,
  const struct r600_shader_src *shader_src)
 {
struct r600_bytecode_alu alu;
-   int r;
+   int i, r;
+   int lasti = tgsi_last_instruction(writemask);
+   int temp_reg = 0;
 
-   r600_bytecode_src(bc_src, shader_src, chan);
+   r600_bytecode_src(&bc_src[0], shader_src, 0);
+   r600_bytecode_src(&bc_src[1], shader_src, 1);
+   r600_bytecode_src(&bc_src[2], shader_src, 2);
+   r600_bytecode_src(&bc_src[3], shader_src, 3);
 
-   /* op3 operands don't support abs modifier */
if (bc_src->abs) {
-   assert(temp!=0);  /* we actually need the extra register, 
make sure it is allocated. */
-   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
-   alu.op = ALU_OP1_MOV;
-   alu.dst.sel = temp;
-   alu.dst.chan = chan;
-   alu.dst.write = 1;
+   temp_reg = r600_get_temp(ctx);
 
-   alu.src[0] = *bc_src;
-   alu.last = true; // sufficient?
-   r = r600_bytecode_add_alu(ctx->bc, &alu);
-   if (r)
-   return r;
-
-   memset(bc_src, 0, sizeof(*bc_src));
-   bc_src->sel = temp;
-   bc_src->chan = chan;
+   for (i = 0; i < lasti + 1; i++) {
+   if (!(writemask & (1 << i)))
+   continue;
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP1_MOV;
+   alu.dst.sel = temp_reg;
+   alu.dst.chan = i;
+   alu.dst.write = 1;
+   alu.src[0] = bc_src[i];
+   if (i == lasti) {
+   alu.last = 1;
+   }
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+   memset(&bc_src[i], 0, sizeof(*bc_src));
+   bc_src[i].sel = temp_reg;
+   bc_src[i].chan = i;
+   }
}
return 0;
 }
@@ -7111,9 +7120,9 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, int 
dst)
 {
struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
struct r600_bytecode_alu alu;
+   struct r600_bytecode_alu_src srcs[4][4];
int i, j, r;
int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
-   int temp_regs[4];
unsigned op = ctx->inst_info->op;
 
if (op == ALU_OP3_MULADD_IEEE &&
@@ -7121,10 +7130,12 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, 
int dst)
op = ALU_OP3_MULADD;
 
for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
-   temp_regs[j] = 0;
-   if (ctx->src[j].abs)
-   temp_regs[j] = r600_get_temp(ctx);
+   r = tgsi_make_src_for_op3(ctx, inst->Dst[0].Register.WriteMask,
+ srcs[j], &ctx->src[j]);
+   if (r)
+   return r;
}
+
for (i = 0; i < lasti + 1; i++) {
if (!(inst->Dst[0].Register.WriteMask & (1 << i)))
continue;
@@ -7132,9 +7143,7 @@ static int tgsi_op3_dst(struct r600_shader_ctx *ctx, int 
dst)
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = op;
for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
-   r = tgsi_make_src_for_op3(ctx, temp_regs[j], i, &alu.src[j], &ctx->src[j]);
-   if (r)
-   return r;
+   alu.src[j] = srcs[j][i];
}
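
Note: the ordering problem from the commit message can be reproduced with a small host-side model. This is a simplified sketch, not driver code: it assumes that ALU ops issued in one instruction group all read their sources before any result is written back, and it models only the x and y channels of the cndge example. All names are illustrative.

```c
/* cndge picks b when a >= 0, otherwise c. */
static float cndge(float a, float b, float c) { return a >= 0.0f ? b : c; }

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Old expansion: a mov that ends the group sits between the two cndge ops,
 * so the second cndge reads r0.x after the first one has overwritten it. */
static void expand_interleaved(float r0[2], const float r2[2], const float r3[2])
{
    float r12[2];
    r12[0] = absf(r2[0]);
    r0[0] = cndge(r0[0], r12[0], r3[0]);
    r12[1] = absf(r2[1]);
    r0[1] = cndge(r0[0], r12[1], r3[1]);   /* r0[0] is already clobbered */
}

/* Fixed expansion: all movs first, then both cndge ops in one group,
 * so both read the value r0.x had before the group started. */
static void expand_movs_first(float r0[2], const float r2[2], const float r3[2])
{
    float r12[2];
    r12[0] = absf(r2[0]);
    r12[1] = absf(r2[1]);
    float old_x = r0[0];
    r0[0] = cndge(old_x, r12[0], r3[0]);
    r0[1] = cndge(old_x, r12[1], r3[1]);
}
```

With r0.x negative and r3.x non-negative, the interleaved order selects |r2.y| for the y channel where r3.y is expected, which is the kind of wrong result the patch addresses.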

[Mesa-dev] [PATCH 1/2] r600: add simple ib dumping under a env var

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

I've used this a lot when developing, and keep rebasing it around a lot,
seems like it could be useful to have upstream.

Setting R600_DUMP will create a /tmp/rad_dump_<id>.txt file for every command
stream submitted to the hw.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/eg_debug.c| 16 
 src/gallium/drivers/r600/r600_hw_context.c |  4 
 src/gallium/drivers/r600/r600_pipe.h   |  2 ++
 3 files changed, 22 insertions(+)

diff --git a/src/gallium/drivers/r600/eg_debug.c 
b/src/gallium/drivers/r600/eg_debug.c
index ceb7c16..990bd56 100644
--- a/src/gallium/drivers/r600/eg_debug.c
+++ b/src/gallium/drivers/r600/eg_debug.c
@@ -359,3 +359,19 @@ void eg_dump_debug_state(struct pipe_context *ctx, FILE *f,
radeon_clear_saved_cs(>last_gfx);
r600_resource_reference(>last_trace_buf, NULL);
 }
+
+void eg_dump_ib_to_file(struct r600_context *rctx,
+   struct radeon_winsys_cs *cs)
+{
+   static int ib_dump_id = 0;
+   char name[128];
+   FILE *fl;
+   ib_dump_id++;
+
+   snprintf(name, 127, "/tmp/rad_dump_%d.txt", ib_dump_id);
+   fl = fopen(name, "w+");
+   eg_parse_ib(fl, cs->current.buf, cs->current.cdw,
+   -1, "IB", rctx->b.chip_class,
+   NULL, NULL);
+   fclose(fl);
+}
diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
b/src/gallium/drivers/r600/r600_hw_context.c
index 3ce1825..3031cdd 100644
--- a/src/gallium/drivers/r600/r600_hw_context.c
+++ b/src/gallium/drivers/r600/r600_hw_context.c
@@ -293,6 +293,10 @@ void r600_context_gfx_flush(void *context, unsigned flags,
r600_resource_reference(&ctx->last_trace_buf, ctx->trace_buf);
r600_resource_reference(&ctx->trace_buf, NULL);
}
+
+   if (getenv("R600_DUMP"))
+   eg_dump_ib_to_file(ctx, cs);
+
/* Flush the CS. */
ws->cs_flush(cs, flags, &ctx->b.last_gfx_fence);
if (fence)
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index 6d09093..87978ee 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -1077,4 +1077,6 @@ void r600_update_compressed_resource_state(struct 
r600_context *rctx, bool compu
 
 void eg_setup_buffer_constants(struct r600_context *rctx, int shader_type);
 void r600_update_driver_const_buffers(struct r600_context *rctx, bool 
compute_only);
+void eg_dump_ib_to_file(struct r600_context *rctx,
+   struct radeon_winsys_cs *cs);
 #endif
-- 
2.9.5


[Mesa-dev] [PATCH 2/2] r600: assert on double opcodes if we hit them

2018-03-12 Thread Dave Airlie

From: Dave Airlie 

This asserts on any double opcode getting into the shader
assembler on gpus that don't support them. This is a better
way to find holes in the soft fp64 coverage than gpu hangs.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/r600_shader.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 6b5c42f..c2f5b8d 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -402,6 +402,17 @@ static bool ctx_needs_stack_workaround_8xx(struct 
r600_shader_ctx *ctx)
return true;
 }
 
+static bool ctx_has_doubles(struct r600_shader_ctx *ctx)
+{
+   if (ctx->bc->family == CHIP_ARUBA ||
+   ctx->bc->family == CHIP_CAYMAN ||
+   ctx->bc->family == CHIP_CYPRESS ||
+   ctx->bc->family == CHIP_HEMLOCK)
+   return true;
+   else
+   return false;
+}
+
 static int tgsi_last_instruction(unsigned writemask)
 {
int i, lasti = 0;
@@ -4419,6 +4430,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx 
*ctx, bool singledest, bool
int use_tmp = 0;
int swizzle_x = inst->Src[0].Register.SwizzleX;
 
+   assert (ctx_has_doubles(ctx));
if (singledest) {
switch (write_mask) {
case 0x1:
@@ -4568,6 +4580,7 @@ static int tgsi_op3_64(struct r600_shader_ctx *ctx)
int lasti = 3;
int tmp = r600_get_temp(ctx);
 
+   assert (ctx_has_doubles(ctx));
for (i = 0; i < lasti + 1; i++) {
 
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -4987,6 +5000,7 @@ static int cayman_emit_double_instr(struct 
r600_shader_ctx *ctx)
int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
int t1 = ctx->temp_reg;
 
+   assert (ctx_has_doubles(ctx));
/* should only be one src regs */
assert(inst->Instruction.NumSrcRegs == 1);
 
-- 
2.9.5
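
Note: the family check could also be written as a switch, which some may find easier to extend. A minimal sketch follows; the enum is a hypothetical subset for illustration only (its values do not match the real radeon family enum), and the list of families with fp64 support is taken as-is from the patch.

```c
#include <stdbool.h>

/* Hypothetical subset of the chip family enum, for illustration only. */
enum chip_family {
    CHIP_CEDAR,
    CHIP_CYPRESS,
    CHIP_HEMLOCK,
    CHIP_CAYMAN,
    CHIP_ARUBA
};

/* Same predicate as ctx_has_doubles(), keyed on the family alone. */
static bool family_has_doubles(enum chip_family family)
{
    switch (family) {
    case CHIP_ARUBA:
    case CHIP_CAYMAN:
    case CHIP_CYPRESS:
    case CHIP_HEMLOCK:
        return true;
    default:
        return false;
    }
}
```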


Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements

2018-03-12 Thread Dylan Baker

Quoting Mark Janes (2018-03-12 14:59:29)
> Dylan Baker  writes:
> 
> > Quoting Mark Janes (2018-03-12 12:40:47)
> >> Handling a screw-up could be done by maintainers by force-pushing the
> >> commits off the WIP branch, and adding some annotations that prevent the
> >> broken commit from being re-applied to WIP by automation.
> >> 
> >
> > That sounds like introducing a lot of developer headaches, the kind that make
> > people not want to use the system. Take this scenario:
> >
> > 1. I push patches
> 
> In this case the person is a developer, not a release manager
> 
> > 2. CI starts
> > 3. you push patches
> 
> I'll call this person "developer 2" below.
> 
> > 4. My CI fails
> 
> At this point, developer 1 needs feedback that their patch nearly
> created a problem for many end-users.  Frankly, it's unacceptable for
> developers to annotate a commit for the stable branches unless they are
> confident that it is a *safe and necessary* fix for end-users.  We have
> almost zero verification between the developer and millions of users.
> 
> > 5. I force-push
> 
> A release manager would have to resolve the failure manually, not
> developer 1.  Developers can't force-push anything in my proposal.
> 
> > Now both of our patches are removed, even though yours haven't gone through CI
> > at all.
> 
> Release manager would manually drop patches from developer 1, and leave
> the patches from developer 2.  CI would re-test patches from developer 2.
> 
> > And if our tool isn't smart enough it will block your patches as well.
> > In fact, I can't think of a way to make force pushes on a branch that
> > multiple people work on *not* have race conditions.
> 
> I agree that there is a race condition here.  Right now our race
> condition covers a weeks-long window between the time a developer CCs
> stable, and when the release manager starts applying patches.  With an
> automated implementation, the window narrows to a day or so.
> 
> > I think that we should either:
> > 1. Use gitlab and have CI run on PRs as well as on merged code. Either the PR
> >    will be red and gitlab can block the merge, or it will be green. It should be
> >    possible to have gitlab block code that cannot be cleanly merged.
> > 2. Use merges and reverts.
> 
> My 2 cents: choosing a specific git service is a step in the wrong
> direction for mesa.  I agree that providing a branch to a release
> manager may be preferable to email, in the cases where a developer has
> to backport patches.

I think that letting the release manager take branches would be superior, that
would mean that only the release manager should be doing pushes at that point
and some of the pain of force pushes is removed (the pain of tracking such a
branch isn't, but the pain of pushing is).

I bring up gitlab because the plan seems to be (sh) that fdo is migrating to
gitlab as a whole, even if individual projects are free to continue using
mailing lists instead of pull requests.

Dylan



[Mesa-dev] [Bug 105444] Enable GL disk shader cache when transform feedback is enabled

2018-03-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105444

Jordan Justen  changed:

   What|Removed |Added

 Status|NEEDINFO|NEW

--- Comment #1 from Jordan Justen  ---
From IRC: Tim mentions that we need to add
prog->TransformFeedback->VaryingNames into the sha1 in
shader_cache_read_program_metadata, similar to
AttributeBindings.
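
Note: the idea of folding the varying names into the cache key can be sketched as below. This is illustration only: 64-bit FNV-1a is used as a stand-in for SHA-1 so the example stays self-contained, and key_with_varyings() is a hypothetical name, not a Mesa function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the SHA-1 update step: 64-bit FNV-1a over a byte range. */
static uint64_t fnv1a_update(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Hypothetical: mix every varying name into the key. The NUL terminator is
 * hashed too, so the names "ab" and "a", "b" give different byte streams. */
static uint64_t key_with_varyings(uint64_t key, const char *const *names,
                                  size_t count)
{
    for (size_t i = 0; i < count; i++)
        key = fnv1a_update(key, names[i], strlen(names[i]) + 1);
    return key;
}
```

With the names included, two programs that differ only in their transform feedback varyings no longer share a key.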

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] meson: don't use compiler.has_header

2018-03-12 Thread Matt Turner

Acked-by: Matt Turner 

Thanks!

Re: [Mesa-dev] [PATCH 1/7] vulkan: Add KHR_display extension to anv and radv using DRM

2018-03-12 Thread Keith Packard

Jason Ekstrand  writes:

> On Fri, Feb 23, 2018 at 3:43 PM, Keith Packard  wrote:
>
> Once we're sure that's what we want, create an MR against the spec that
> just adds enough to the XML to reserve your extension number.  That will
> get merged almost immediately.  Then make a second one with the actual
> extension text and we'll iterate on that either in Khronos gitlab or, if
> you prefer, you can send it as a patch to mesa-dev and then make a Khrons
> MR once it's baked.

I just wrote up the full extension description for both extensions I
need (the one for passing a KMS fd to the driver, and the second to get
the GPU timestamp for doing GOOGLE_display_timing):

https://github.com/keith-packard/Vulkan-Docs

> See also my comments about GEM handle ownership.

Yeah, I think I've got that all cleaned up now -- the code no longer
shares the same file for rendering and display.

-- 
-keith



Re: [Mesa-dev] [PATCH 2/2] main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED

2018-03-12 Thread Jordan Justen

On 2018-03-12 14:51:56, Timothy Arceri wrote:
> This only seems to be needed by i965. Alternatively, can't you just remove the:
> 
> if (prog->sh.data->LinkStatus != LINKING_SKIPPED)
>goto fail;
> 
> from brw_disk_cache_upload_program() and let the cache search do its job?
> 
> I believe it's possible to end up linking the GLSL IR, i.e. 
> prog->sh.data->LinkStatus == LINKING_SUCCESS, but still have the i965 
> binary in the cache (although I guess that's a pretty big corner case).

In the cover letter I mentioned that possibility, and why I decided to
start with this instead. I would like to cover this corner case
eventually, but I thought maybe tackling xform feedback might be a good
next step.

Regarding this patch, do you agree that this is perhaps a more
accurate status than LINKING_SUCCESS?

-Jordan

> On 12/03/18 11:25, Jordan Justen wrote:
> > This change allows the disk shader cache to work with programs loaded
> > with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
> > then they try to use the shader cache.
> > 
> > Since the program loaded by ProgramBinary is similar to loading the
> > shader from the disk cache, this is probably more appropriate.
> > 
> > Cc: Timothy Arceri 
> > Signed-off-by: Jordan Justen 
> > ---
> >   src/mesa/main/program_binary.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/main/program_binary.c b/src/mesa/main/program_binary.c
> > index 3df70059342..021f6315e72 100644
> > --- a/src/mesa/main/program_binary.c
> > +++ b/src/mesa/main/program_binary.c
> > @@ -287,5 +287,5 @@ _mesa_program_binary(struct gl_context *ctx, struct 
> > gl_shader_program *sh_prog,
> > return;
> >  }
> >   
> > -   sh_prog->data->LinkStatus = LINKING_SUCCESS;
> > +   sh_prog->data->LinkStatus = LINKING_SKIPPED;
> >   }
> > 

Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements

2018-03-12 Thread Mark Janes

Dylan Baker  writes:

> Quoting Mark Janes (2018-03-12 12:40:47)
>> Handling a screw-up could be done by maintainers by force-pushing the
>> commits off the WIP branch, and adding some annotations that prevent the
>> broken commit from being re-applied to WIP by automation.
>> 
>
> That sounds like introducing a lot of developer headaches, the kind that make
> people not want to use the system. Take this scenario:
>
> 1. I push patches

In this case the person is a developer, not a release manager

> 2. CI starts
> 3. you push patches

I'll call this person "developer 2" below.

> 4. My CI fails

At this point, developer 1 needs feedback that their patch nearly
created a problem for many end-users.  Frankly, it's unacceptable for
developers to annotate a commit for the stable branches unless they are
confident that it is a *safe and necessary* fix for end-users.  We have
almost zero verification between the developer and millions of users.

> 5. I force-push

A release manager would have to resolve the failure manually, not
developer 1.  Developers can't force-push anything in my proposal.

> Now both of our patches are removed, even though yours haven't gone through CI
> at all.

Release manager would manually drop patches from developer 1, and leave
the patches from developer 2.  CI would re-test patches from developer 2.

> And if our tool isn't smart enough it will block your patches as well.
> In fact, I can't think of a way to make force pushes on a branch that
> multiple people work on *not* have race conditions.

I agree that there is a race condition here.  Right now our race
condition covers a weeks-long window between the time a developer CCs
stable, and when the release manager starts applying patches.  With an
automated implementation, the window narrows to a day or so.

> I think that we should either:
> 1. Use gitlab and have CI run on PRs as well as on merged code. Either the PR
>    will be red and gitlab can block the merge, or it will be green. It should be
>    possible to have gitlab block code that cannot be cleanly merged.
> 2. Use merges and reverts.

My 2 cents: choosing a specific git service is a step in the wrong
direction for mesa.  I agree that providing a branch to a release
manager may be preferable to email, in the cases where a developer has
to backport patches.

Re: [Mesa-dev] [PATCH 1/2] glsl/serialize: Save shader program metadata sha1

2018-03-12 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 12/03/18 11:25, Jordan Justen wrote:

When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to lookup the serialized GL shader
program.

If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.

Cc: Timothy Arceri 
Signed-off-by: Jordan Justen 
---
  src/compiler/glsl/serialize.cpp | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/compiler/glsl/serialize.cpp b/src/compiler/glsl/serialize.cpp
index 9d2033bddfa..1fdbaa990f4 100644
--- a/src/compiler/glsl/serialize.cpp
+++ b/src/compiler/glsl/serialize.cpp
@@ -1163,6 +1163,8 @@ extern "C" void
  serialize_glsl_program(struct blob *blob, struct gl_context *ctx,
 struct gl_shader_program *prog)
  {
+   blob_write_bytes(blob, prog->data->sha1, sizeof(prog->data->sha1));
+
 write_uniforms(blob, prog);
  
 write_hash_tables(blob, prog);

@@ -1219,6 +1221,8 @@ deserialize_glsl_program(struct blob_reader *blob, struct gl_context *ctx,
  
 assert(prog->data->UniformStorage == NULL);
  
+   blob_copy_bytes(blob, prog->data->sha1, sizeof(prog->data->sha1));
+
 read_uniforms(blob, prog);
  
 read_hash_tables(blob, prog);
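
The round trip this patch relies on can be sketched as follows. This is only a minimal illustration, not Mesa's blob API: `toy_blob`, `toy_write_bytes` and `toy_copy_bytes` are hypothetical stand-ins for `struct blob`, `blob_write_bytes()` and `blob_copy_bytes()`, and the 32-bit payload stands in for everything serialized after the sha1. The point is that the 20-byte sha1 is written first and read first, so writer and reader stay in step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SHA1_LEN 20

/* Hypothetical fixed-size buffer standing in for Mesa's struct blob. */
struct toy_blob {
   uint8_t data[256];
   size_t size;     /* bytes written so far */
   size_t read_pos; /* offset of the next byte to read */
};

/* Append len bytes; returns false if the buffer would overflow. */
static bool
toy_write_bytes(struct toy_blob *blob, const void *src, size_t len)
{
   assert(blob != NULL && src != NULL);
   if (len > sizeof(blob->data) - blob->size)
      return false;
   memcpy(blob->data + blob->size, src, len);
   blob->size += len;
   return true;
}

/* Copy len bytes out of the blob; returns false on an over-read. */
static bool
toy_copy_bytes(struct toy_blob *blob, void *dst, size_t len)
{
   assert(blob != NULL && dst != NULL);
   if (len > blob->size - blob->read_pos)
      return false;
   memcpy(dst, blob->data + blob->read_pos, len);
   blob->read_pos += len;
   return true;
}

/* Writer side: the sha1 goes first, then the rest of the program data. */
static bool
toy_serialize(struct toy_blob *blob, const uint8_t sha1[TOY_SHA1_LEN],
              uint32_t payload)
{
   return toy_write_bytes(blob, sha1, TOY_SHA1_LEN) &&
          toy_write_bytes(blob, &payload, sizeof(payload));
}

/* Reader side: fields must be consumed in the order they were written. */
static bool
toy_deserialize(struct toy_blob *blob, uint8_t sha1[TOY_SHA1_LEN],
                uint32_t *payload)
{
   return toy_copy_bytes(blob, sha1, TOY_SHA1_LEN) &&
          toy_copy_bytes(blob, payload, sizeof(*payload));
}
```

If the reader skipped the sha1, every later field would be read 20 bytes too early, which is why the write and the read are added at matching positions in the two hunks.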




Re: [Mesa-dev] [PATCH 2/2] main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED

2018-03-12 Thread Timothy Arceri


This only seems to be needed by i965. Alternatively, can't you just remove the:

   if (prog->sh.data->LinkStatus != LINKING_SKIPPED)
  goto fail;

from brw_disk_cache_upload_program() and let the cache search do its job?

I believe it's possible to end up linking the GLSL IR, i.e. 
prog->sh.data->LinkStatus == LINKING_SUCCESS, but still have the i965 
binary in the cache (although I guess that's a pretty big corner case).


On 12/03/18 11:25, Jordan Justen wrote:

This change allows the disk shader cache to work with programs loaded
with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
then they try to use the shader cache.

Since the program loaded by ProgramBinary is similar to loading the
shader from the disk cache, this is probably more appropriate.

Cc: Timothy Arceri 
Signed-off-by: Jordan Justen 
---
  src/mesa/main/program_binary.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/program_binary.c b/src/mesa/main/program_binary.c
index 3df70059342..021f6315e72 100644
--- a/src/mesa/main/program_binary.c
+++ b/src/mesa/main/program_binary.c
@@ -287,5 +287,5 @@ _mesa_program_binary(struct gl_context *ctx, struct gl_shader_program *sh_prog,
return;
 }
  
-   sh_prog->data->LinkStatus = LINKING_SUCCESS;
+   sh_prog->data->LinkStatus = LINKING_SKIPPED;
  }
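
The two policies discussed in this thread can be compared with a small sketch. Everything here is hypothetical: `toy_link_status` is a simplified stand-in for the link states named above, and `toy_use_cached_binary` is not a Mesa or i965 function. The `strict` flag models a driver that only consults the cache when linking was skipped; clearing it models the suggestion to drop that check and let the cache search decide.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the link states discussed in the thread. */
enum toy_link_status {
   TOY_LINKING_FAILURE,
   TOY_LINKING_SUCCESS,
   TOY_LINKING_SKIPPED,
};

/*
 * Decide whether a cached driver binary gets used.
 * strict  = only look in the cache when linking was skipped
 * !strict = always let the cache search decide
 */
static bool
toy_use_cached_binary(enum toy_link_status status, bool binary_in_cache,
                      bool strict)
{
   /* A program that failed to link should never reach this point. */
   assert(status != TOY_LINKING_FAILURE);

   if (strict && status != TOY_LINKING_SKIPPED)
      return false;

   return binary_in_cache;
}
```

Under the strict policy a program marked as successfully linked never reaches the cache even if a matching binary exists, which is the corner case mentioned in the reply.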



[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values

2018-03-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #2 from Philip Rebohle  ---
Created attachment 138044
  --> https://bugs.freedesktop.org/attachment.cgi?id=138044&action=edit
Tessellation demo screenshot

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH 2/3] nv50, nvc0: Support BGRX1010102 and RGBX1010102 for sampling.

2018-03-12 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

It should be possible to get rendering on them BTW, with a bit of
state fixups for DST_ALPHA blending. Just haven't gotten around to it.

On Mon, Mar 12, 2018 at 4:45 PM, Mario Kleiner
 wrote:
> Add them as usable for textures, so they can be used by
> Wayland drm in 10 bpc mode and for X11 compositing under
> GLX and EGL. We need these formats to be supported at
> least for sampling, otherwise GLX_texture_from_pixmap
> and the equivalent EGL image extension won't work with
> X11 drawables of depth 30 and just display an all black
> window.
>
> Do not expose these formats as renderable, and thereby
> not as a fbconfig/EGLConfig/Visual, as NVidia hw does
> not support 10 bpc unorm formats without alpha channel.
>
> Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing,
> and under Wayland+Weston drm backend with a Tesla and
> Pascal gpu.
>
> Signed-off-by: Mario Kleiner 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
> index fc5deac..0ead8ac 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
> @@ -153,7 +153,9 @@ const struct nv50_format nv50_format_table[PIPE_FORMAT_COUNT] =
> F3(A, R9G9B9E5_FLOAT, NONE, R, G, B, xx, FLOAT, E5B9G9R9_SHAREDEXP, T),
>
> C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, TD),
> +   F3(A, R10G10B10X2_UNORM, RGB10_A2_UNORM, R, G, B, xx, UNORM, A2B10G10R10, T),
> C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, IB),
> +   F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, T),
> C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T),
> C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T),
> C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR),
> --
> 2.7.4
>

[Mesa-dev] [PATCH 2/3] nv50, nvc0: Support BGRX1010102 and RGBX1010102 for sampling.

2018-03-12 Thread Mario Kleiner

Add them as usable for textures, so they can be used by
Wayland drm in 10 bpc mode and for X11 compositing under
GLX and EGL. We need these formats to be supported at
least for sampling, otherwise GLX_texture_from_pixmap
and the equivalent EGL image extension won't work with
X11 drawables of depth 30 and just display an all black
window.

Do not expose these formats as renderable, and thereby
not as a fbconfig/EGLConfig/Visual, as NVidia hw does
not support 10 bpc unorm formats without alpha channel.

Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing,
and under Wayland+Weston drm backend with a Tesla and
Pascal gpu.

Signed-off-by: Mario Kleiner 
---
 src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats.c b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
index fc5deac..0ead8ac 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_formats.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_formats.c
@@ -153,7 +153,9 @@ const struct nv50_format nv50_format_table[PIPE_FORMAT_COUNT] =
F3(A, R9G9B9E5_FLOAT, NONE, R, G, B, xx, FLOAT, E5B9G9R9_SHAREDEXP, T),
 
C4(A, R10G10B10A2_UNORM, RGB10_A2_UNORM, R, G, B, A, UNORM, A2B10G10R10, TD),
+   F3(A, R10G10B10X2_UNORM, RGB10_A2_UNORM, R, G, B, xx, UNORM, A2B10G10R10, T),
C4(A, B10G10R10A2_UNORM, BGR10_A2_UNORM, B, G, R, A, UNORM, A2B10G10R10, IB),
+   F3(A, B10G10R10X2_UNORM, BGR10_A2_UNORM, B, G, R, xx, UNORM, A2B10G10R10, T),
C4(A, R10G10B10A2_SNORM, NONE, R, G, B, A, SNORM, A2B10G10R10, T),
C4(A, B10G10R10A2_SNORM, NONE, B, G, R, A, SNORM, A2B10G10R10, T),
C4(A, R10G10B10A2_UINT, RGB10_A2_UINT, R, G, B, A, UINT, A2B10G10R10, TR),
-- 
2.7.4
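
What "usable for sampling" means for an X2 format can be sketched in plain C. This is an illustration under stated assumptions, not driver code: `toy_rgba` and `toy_unpack_x2_10bpc` are hypothetical names, the bit layout (three 10-bit channels in bits 0-29, padding in bits 30-31) is assumed from the format names, and the missing alpha channel is assumed to read back as 1.0.

```c
#include <assert.h>
#include <stdint.h>

struct toy_rgba {
   float r, g, b, a;
};

/*
 * Unpack a 32-bit 2:10:10:10 pixel whose top two bits are padding (X).
 * red_shift is 0 when red sits in the low 10 bits (R10G10B10X2-style
 * layout) and 20 when it sits in bits 20-29 (B10G10R10X2-style layout).
 */
static struct toy_rgba
toy_unpack_x2_10bpc(uint32_t pixel, unsigned red_shift)
{
   assert(red_shift == 0 || red_shift == 20);

   const unsigned blue_shift = 20 - red_shift;
   struct toy_rgba out;

   out.r = (float)((pixel >> red_shift) & 0x3ff) / 1023.0f;
   out.g = (float)((pixel >> 10) & 0x3ff) / 1023.0f;
   out.b = (float)((pixel >> blue_shift) & 0x3ff) / 1023.0f;
   out.a = 1.0f; /* the X bits are ignored; alpha is assumed opaque */
   return out;
}
```

Whatever ends up in the two padding bits does not affect the result, which matches the intent of exposing these formats for texturing only.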


[Mesa-dev] More pieces for xbgr2101010/abgr2101010 support.

2018-03-12 Thread Mario Kleiner

These are needed together with Daniel Stone's 10 bpc bgr patches
to make nouveau's 10 bpc support more complete.

All tested on nouveau on a nv96 as primary/display gpu and also
with a radeon as prime renderoffload gpu, and then the other way
round with radeon primary + nouveau renderoffload. Also tested
under X11 DRI2 and DRI3/Present, GLX and EGL, composited and
unredirected, and with Wayland's weston, both normally and with
prime renderoffload.

Patch 1 completes Daniel's patches. Patch 2 makes weston work
on nouveau with gbm-format=xbgr2101010, and enables x11 compositing
of depth 30 drawables. Patch 3 makes sure we get the right colors
when compositing on x11 + EGL.

Some patches on top of weston master to test gbm-format=xbgr2101010
are here: https://github.com/kleinerm/weston/tree/westonnew10bpc

-mario


[Mesa-dev] [PATCH 3/3] egl/x11: Handle both depth 30 formats for eglCreateImage().

2018-03-12 Thread Mario Kleiner

We need to distinguish if a backing pixmap of a window is
XRGB2101010 or XBGR2101010, as different gpu hw supports
different formats. NVidia hw prefers XBGR, whereas AMD and
Intel are happy with XRGB.

We use the red channel mask of the visual to distinguish at
depth 30, but because we can't easily get the associated
visual of a Pixmap, we use the visual of the x-screen's root
window instead as a proxy.

This fixes desktop composition of color depth 30 windows
when the X11 compositor uses EGL.

Signed-off-by: Mario Kleiner 
---
 src/egl/drivers/dri2/egl_dri2.h  |  7 ++
 src/egl/drivers/dri2/platform_x11.c  | 37 +++-
 src/egl/drivers/dri2/platform_x11_dri3.c |  7 +-
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index d36d02c..a399b06 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -402,6 +402,8 @@ EGLBoolean
 dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp);
 void
 dri2_teardown_x11(struct dri2_egl_display *dri2_dpy);
+unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy);
 #else
 static inline EGLBoolean
 dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp)
@@ -410,6 +412,11 @@ dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp)
 }
 static inline void
 dri2_teardown_x11(struct dri2_egl_display *dri2_dpy) {}
+static inline unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy)
+{
+   return 0;
+}
 #endif
 
 #ifdef HAVE_DRM_PLATFORM
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 6c287b4..da28981 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -209,6 +209,37 @@ get_xcb_screen(xcb_screen_iterator_t iter, int screen)
 return NULL;
 }
 
+static xcb_visualtype_t *
+get_xcb_visualtype(struct dri2_egl_display *dri2_dpy)
+{
+   xcb_visualtype_iterator_t visual_iter;
+   xcb_screen_t *screen = dri2_dpy->screen;
+   xcb_visualid_t visual_id = screen->root_visual;
+   xcb_depth_iterator_t depth_iter = xcb_screen_allowed_depths_iterator(screen);
+
+   for (; depth_iter.rem; xcb_depth_next(&depth_iter)) {
+  visual_iter = xcb_depth_visuals_iterator(depth_iter.data);
+
+  for (; visual_iter.rem; xcb_visualtype_next(&visual_iter)) {
+ if (visual_iter.data->visual_id == visual_id)
+return visual_iter.data;
+  }
+   }
+
+   return NULL;
+}
+
+/* Get red channel mask of the root window's visual for our x-screen */
+unsigned int
+dri2_x11_get_red_mask(struct dri2_egl_display *dri2_dpy)
+{
+   unsigned int red_mask = 0;
+   xcb_visualtype_t *visual = get_xcb_visualtype(dri2_dpy);
+   if (visual)
+  red_mask = visual->red_mask;
+
+   return red_mask;
+}
 
 /**
  * Called via eglCreateWindowSurface(), drv->API.CreateWindowSurface().
@@ -1050,7 +1081,11 @@ dri2_create_image_khr_pixmap(_EGLDisplay *disp, _EGLContext *ctx,
   format = __DRI_IMAGE_FORMAT_XRGB8888;
   break;
case 30:
-  format = __DRI_IMAGE_FORMAT_XRGB2101010;
+  /* Different preferred formats for different hw */
+  if (dri2_x11_get_red_mask(dri2_dpy) == 0x3ff)
+ format = __DRI_IMAGE_FORMAT_XBGR2101010;
+  else
+ format = __DRI_IMAGE_FORMAT_XRGB2101010;
   break;
case 32:
   format = __DRI_IMAGE_FORMAT_ARGB8888;
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c b/src/egl/drivers/dri2/platform_x11_dri3.c
index 2073c59..667c845 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -282,7 +282,12 @@ dri3_create_image_khr_pixmap(_EGLDisplay *disp, _EGLContext *ctx,
   format = __DRI_IMAGE_FORMAT_XRGB8888;
   break;
case 30:
-  format = __DRI_IMAGE_FORMAT_XRGB2101010;
+  /* Different preferred formats for different hw */
+  if (dri2_x11_get_red_mask(dri2_dpy) == 0x3ff)
+ format = __DRI_IMAGE_FORMAT_XBGR2101010;
+  else
+ format = __DRI_IMAGE_FORMAT_XRGB2101010;
+
   break;
case 32:
   format = __DRI_IMAGE_FORMAT_ARGB8888;
-- 
2.7.4
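
The depth-30 decision in this patch comes down to where the red channel sits in the 32-bit pixel. A minimal sketch of that mapping, using hypothetical `toy_` names rather than the real `__DRI_IMAGE_FORMAT_*` constants: in XBGR2101010 red occupies bits 0-9 (mask 0x3ff), in XRGB2101010 it occupies bits 20-29 (mask 0x3ff00000).

```c
#include <assert.h>
#include <stdint.h>

enum toy_format {
   TOY_FORMAT_XRGB2101010, /* red in bits 20-29 */
   TOY_FORMAT_XBGR2101010, /* red in bits 0-9 */
};

/* Mirror of the depth-30 branch in the patch: pick by red channel mask. */
static enum toy_format
toy_depth30_format(uint32_t red_mask)
{
   return red_mask == 0x3ff ? TOY_FORMAT_XBGR2101010
                            : TOY_FORMAT_XRGB2101010;
}

/* Inverse mapping, useful for checking that the choice round-trips. */
static uint32_t
toy_red_mask(enum toy_format format)
{
   switch (format) {
   case TOY_FORMAT_XRGB2101010:
      return 0x3ffu << 20;
   case TOY_FORMAT_XBGR2101010:
      return 0x3ffu;
   }
   assert(!"unknown format");
   return 0;
}
```

Note that a red mask of 0, which is what the helper in the patch returns when no visual is found, falls through to XRGB2101010, so the previous behaviour is kept as the default.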


[Mesa-dev] [PATCH 1/3] wayland-drm: Expose server-side xbgr2101010 and abgr2101010 formats.

2018-03-12 Thread Mario Kleiner

This way the wayland server can signal support for these formats
to wayland EGL clients. This is currently used by nouveau for 10
bpc support.

Tested with glmark2-wayland and glmark2-es2-wayland under weston
to now expose 10 bpc EGL configs under nouveau.

Signed-off-by: Mario Kleiner 
---
 src/egl/wayland/wayland-drm/wayland-drm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/egl/wayland/wayland-drm/wayland-drm.c b/src/egl/wayland/wayland-drm/wayland-drm.c
index 3c6696d..7d44d38 100644
--- a/src/egl/wayland/wayland-drm/wayland-drm.c
+++ b/src/egl/wayland/wayland-drm/wayland-drm.c
@@ -111,6 +111,8 @@ drm_create_buffer(struct wl_client *client, struct wl_resource *resource,
  uint32_t stride, uint32_t format)
 {
 switch (format) {
+case WL_DRM_FORMAT_ABGR2101010:
+case WL_DRM_FORMAT_XBGR2101010:
 case WL_DRM_FORMAT_ARGB2101010:
 case WL_DRM_FORMAT_XRGB2101010:
 case WL_DRM_FORMAT_ARGB8888:
@@ -215,6 +217,10 @@ bind_drm(struct wl_client *client, void *data, uint32_t version, uint32_t id)
wl_resource_post_event(resource, WL_DRM_FORMAT,
   WL_DRM_FORMAT_XRGB2101010);
wl_resource_post_event(resource, WL_DRM_FORMAT,
+  WL_DRM_FORMAT_ABGR2101010);
+   wl_resource_post_event(resource, WL_DRM_FORMAT,
+  WL_DRM_FORMAT_XBGR2101010);
+   wl_resource_post_event(resource, WL_DRM_FORMAT,
   WL_DRM_FORMAT_ARGB8888);
wl_resource_post_event(resource, WL_DRM_FORMAT,
   WL_DRM_FORMAT_XRGB8888);
-- 
2.7.4


Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements

2018-03-12 Thread Dylan Baker

Quoting Mark Janes (2018-03-12 12:40:47)
> Dylan Baker  writes:
> 
> > Quoting Emil Velikov (2018-03-12 08:38:31)
> >> On 12 March 2018 at 11:31, Juan A. Suarez Romero  wrote:
> >> > On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote:
> >> >> Ilia Mirkin  writes:
> >> >>
> >> >> > On Tue, Mar 6, 2018 at 2:34 PM, Emil Velikov  wrote:
> >> >> > > So while others explore ways of improving the testing, let me propose
> >> >> > > a few ideas for improving the actual releasing process.
> >> >> > >
> >> >> > >
> >> >> > >  - Making the current state always visible - have a web page, git
> >> >> > >branch and other ways for people to see which patches are picked,
> >> >> > >require backports, etc.
> >> >> >
> >> >> > Yes please! A git branch that's available (and force-pushed freely)
> >> >> > before the "you're screwed" announcement is going to help clear a lot
> >> >> > of things up.
> >> >>
> >> >> I agree that early information is good.  I don't agree that anyone
> >> >> should force push.  Release branches need to be protected.  Proposed
> >> >> release branches should only accept patches that have already been
> >> >> vetted by the process on mesa master.
> >> >>
> >> Agreed - release branches should not be force-pushed. We can use
> >> "wip/" ones instead.
> >
> > I also strongly agree with this, force pushes to live branches are
> > *bad* (force pushing a pull request of a feature branch is perfectly
> > fine). I would much rather have reverts than force pushes. If we're
> > going to automate this in such a way that we think we need force pushes
> > I'd much rather use merges (only for stable), so that we can simply
> > revert the merge commit.
> >
> > Or, as others have suggested, not allowing the proposed patches to be
> > pushed until CI has come back green would be even better. I've used
> > this approach in several github based projects and it works very well
> > for keeping the branch in question in good shape.
> 
> The patches need to be in a branch for any CI to test them.  A WIP
> branch seems like a good thing for CI to poll.
> 
> If CI fails at this point, then it means the developer messed up.  No
> one should add a fixes/cc tag to a commit unless they have some
> confidence that it will work on top of the WIP branch (by *testing* it).
> 
> Handling a screw-up could be done by maintainers by force-pushing the
> commits off the WIP branch, and adding some annotations that prevent the
> broken commit from being re-applied to WIP by automation.
> 

That sounds like introducing a lot of developer headaches, the kind that make
people not want to use the system. Take this scenario:

1. I push patches
2. CI starts
3. you push patches
4. My CI fails
5. I force-push

Now both of our patches are removed, even though yours haven't gone through CI
at all. And if our tool isn't smart enough it will block your patches as well.
In fact, I can't think of a way to make force pushes on a branch that multiple
people work on *not* have race conditions.

I think that we should either:
1. Use gitlab and have CI run on PRs as well as on merged code. Either the PR
   will be red and gitlab can block the merge, or it will be green. It should be
   possible to have gitlab block code that cannot be cleanly merged.
2. Use merges and reverts.

Dylan



Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Dave Airlie

On 13 March 2018 at 05:58, Marek Olšák  wrote:
> On Mon, Mar 12, 2018 at 3:05 PM, Dave Airlie  wrote:
>> On 13 March 2018 at 03:59, Marek Olšák  wrote:
>>> This is good, though some older distros only have libxcb 1.11.
>>
>> On those distros you likely just want to --disable-dri3 anyways.
>>
>> Dave.
>
> Good one. I know you don't care, but we are talking about the latest
> long-term stable version of a major distro.

Does that distro have dri3 support in its X server?

If so, then a follow-up patch to lower this to 1.11 would be fine (actually
I've posted a cleaner patch), but if you don't need dri3 support, then
the follow-up could just enable libxcb 1.11 support by dropping dri3.
Dave.

Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Marek Olšák

On Mon, Mar 12, 2018 at 3:05 PM, Dave Airlie  wrote:
> On 13 March 2018 at 03:59, Marek Olšák  wrote:
>> This is good, though some older distros only have libxcb 1.11.
>
> On those distros you likely just want to --disable-dri3 anyways.
>
> Dave.

Good one. I know you don't care, but we are talking about the latest
long-term stable version of a major distro.

I know you don't care about the following either, but if Mesa can't
use older libxcb, the PRO driver will have to ship its own libxcb for
older distros. It's a terrible idea IMO.

Marek

Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements

2018-03-12 Thread Mark Janes

Emil Velikov  writes:

> On 12 March 2018 at 11:31, Juan A. Suarez Romero  wrote:
>> On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote:
>>>  - Patches are applied to proposed stable branch by automation when the
>>>associated commit is pushed to master.  The existing commit message
>>>annotations drive this process.  There must be zero ambiguity in the
>>>annotations (eg which stable branches need the patch).
>>>
>
> I would recommend a delay between the patch landing in master and the
> wip branch. In the past, we have multiple cases where a fix lands in
> master, which causes severe regressions.
> IMHO having a 24-48h period sounds reasonable, although it can be
> tweaked based on feedback.

Having a delay means developers cannot quickly verify that their
stable-branch annotations correctly resulted in their patch being
applied where they wanted.

In the rare case that we have bad patches applied through the process,
we can treat it like a CI failure, where the maintainer steps in,
force-pushes the bad patches off the WIP branch, and adds annotations to
prevent automation from re-applying the commits later.

Re: [Mesa-dev] [PATCH v2 1/2] spirv: fix OpSConvert when the source is unsigned

2018-03-12 Thread Jason Ekstrand

On Mon, Mar 5, 2018 at 10:21 PM, Samuel Iglesias Gonsálvez <
sigles...@igalia.com> wrote:

> OpSConvert interprets the MSB of the unsigned value as the sign bit and
> extends it to the new type. If we want to preserve the value, we need
> to use OpUConvert opcode.
>
> v2:
> - No need to check dst type.
> - Fix typo in comment.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/compiler/spirv/vtn_alu.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> index d0c9e316935..a5cefc35773 100644
> --- a/src/compiler/spirv/vtn_alu.c
> +++ b/src/compiler/spirv/vtn_alu.c
> @@ -354,10 +354,26 @@ vtn_nir_alu_op_for_spirv_opcode(struct vtn_builder *b,
> case SpvOpConvertFToS:
> case SpvOpConvertSToF:
> case SpvOpConvertUToF:
> -   case SpvOpSConvert:
> case SpvOpFConvert:
>return nir_type_conversion_op(src, dst, nir_rounding_mode_undef);
>
> +   case SpvOpSConvert: {
> +  nir_alu_type src_base = (nir_alu_type) nir_alu_type_get_base_type(src);
> +  if (src_base == nir_type_uint) {
>

Why are we predicating this on src_base == nir_type_uint?  It seems to me
as if we should just ignore the source and destination type except for the
bit size.


> + /* SPIR-V expects to interpret the unsigned value as signed and
> +  * do sign extend. Return the opcode accordingly.
> +  */
> + unsigned dst_bit_size = nir_alu_type_get_type_size(dst);
> + switch (dst_bit_size) {
> + case 16:   return nir_op_i2i16;
> + case 32:   return nir_op_i2i32;
> + case 64:   return nir_op_i2i64;
>

This can be nir_type_int | dst_bit_size. NIR types are convenient like
that. :-)


> + default:
> +vtn_fail("Invalid nir alu bit size");
> + }
> +  }
> +  return nir_type_conversion_op(src, dst, nir_rounding_mode_undef);
> +   }
> /* Derivatives: */
> case SpvOpDPdx: return nir_op_fddx;
> case SpvOpDPdy: return nir_op_fddy;
> --
> 2.14.1
>
>
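
The difference between the two conversions under discussion can be shown on raw bits. This is a small sketch with hypothetical helper names, not NIR or SPIR-V code: `toy_sign_extend` treats the top bit of the source width as a sign bit regardless of how the value was declared (the OpSConvert behaviour described in the commit message), while `toy_zero_extend` preserves the unsigned value (the OpUConvert behaviour).

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low src_bits of value (OpUConvert-like). */
static uint32_t
toy_zero_extend(uint32_t value, unsigned src_bits)
{
   assert(src_bits >= 1 && src_bits < 32);
   return value & ((1u << src_bits) - 1u);
}

/* Sign-extend: treat bit (src_bits - 1) as the sign bit (OpSConvert-like). */
static uint32_t
toy_sign_extend(uint32_t value, unsigned src_bits)
{
   assert(src_bits >= 1 && src_bits < 32);

   const uint32_t low = value & ((1u << src_bits) - 1u);
   const uint32_t sign = 1u << (src_bits - 1);

   /* (low ^ sign) - sign copies the sign bit upward via unsigned wraparound. */
   return (low ^ sign) - sign;
}
```

For a 16-bit source holding 0x8000, sign extension gives 0xffff8000 and zero extension gives 0x00008000; the result depends only on the bit pattern and the bit sizes, not on whether the source type was declared signed or unsigned.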

Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements

2018-03-12 Thread Mark Janes

Dylan Baker  writes:

> Quoting Emil Velikov (2018-03-12 08:38:31)
>> On 12 March 2018 at 11:31, Juan A. Suarez Romero  wrote:
>> > On Fri, 2018-03-09 at 12:12 -0800, Mark Janes wrote:
>> >> Ilia Mirkin  writes:
>> >>
>> >> > On Tue, Mar 6, 2018 at 2:34 PM, Emil Velikov  
>> >> > wrote:
>> >> > > So while others explore ways of improving the testing, let me propose
>> >> > > a few ideas for improving the actual releasing process.
>> >> > >
>> >> > >
>> >> > >  - Making the current state always visible - have a web page, git
>> >> > >branch and other ways for people to see which patches are picked,
>> >> > >require backports, etc.
>> >> >
>> >> > Yes please! A git branch that's available (and force-pushed freely)
>> >> > before the "you're screwed" announcement is going to help clear a lot
>> >> > of things up.
>> >>
>> >> I agree that early information is good.  I don't agree that anyone
>> >> should force push.  Release branches need to be protected.  Proposed
>> >> release branches should only accept patches that have already been
>> >> vetted by the process on mesa master.
>> >>
>> Agreed - release branches should not be force-pushed. We can use
>> "wip/" ones instead.
>
> I also strongly agree with this, force pushes to live branches are
> *bad* (force pushing a pull request of a feature branch is perfectly
> fine). I would much rather have reverts than force pushes. If we're
> going to automate this in such a way that we think we need force pushes
> I'd much rather use merges (only for stable), so that we can simply
> revert the merge commit.
>
> Or, as others have suggested, not allowing the proposed patches to be
> pushed until CI has come back green would be even better. I've used
> this approach in several github based projects and it works very well
> for keeping the branch in question in good shape.

The patches need to be in a branch for any CI to test them.  A WIP
branch seems like a good thing for CI to poll.

If CI fails at this point, then it means the developer messed up.  No
one should add a fixes/cc tag to a commit unless they have some
confidence that it will work on top of the WIP branch (by *testing* it).

Handling a screw-up could be done by maintainers by force-pushing the
commits off the WIP branch, and adding some annotations that prevent the
broken commit from being re-applied to WIP by automation.

> Dylan

[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values

2018-03-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105464

--- Comment #1 from Philip Rebohle  ---
Created attachment 138038
  --> https://bugs.freedesktop.org/attachment.cgi?id=138038&action=edit
Witcher 3 hull shader which may suffer from the same issue

FWIW, the tessellation demo works correctly with AMDVLK with the patched
shader. On RADV, the tessellation levels are seemingly random, and its
behaviour changes by just changing the location number.

A similar issue occurs in The Witcher 3 when run with DXVK, where water
surfaces have incorrect tessellation factors applied on RADV. It reportedly
renders correctly on Nvidia.

In this case however, there is only one single invocation writing per-patch
outputs. A workaround that makes this particular shader work correctly on RADV
is to write all per-patch outputs to an array with storage class Private first,
reading the tessellation factors from that array, and finally copying the
contents of the temporary array to the output array, which tells me that
reading from the output array again returns incorrect results.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Emil Velikov

On 12 March 2018 at 18:48, Dave Airlie  wrote:
> On 13 March 2018 at 03:24, Emil Velikov  wrote:
>> Hi Dave,
>>
>> On 11 March 2018 at 23:26, Dave Airlie  wrote:
>>> From: Dave Airlie 
>>>
>>> I'm not sure everyone wants to be updating their dri3 in a forced
>>> march setting, this allows a nicer approach, esp when you want
>>> to build on distro that aren't brand new.
>>>
>>> I'm sure there are plenty of ways this patch could be cleaner,
>>> and I've also not built it against an updated dri3.
>>
>> Have you considered cases where the build server is using 1.12, while
>> at run-time we have 1.13?
>> Are you explicitly forbidding that, say via the packaging? It tends to
>> be allowed on most(all?) distributions.
>
> Yes I am because really who does that, and why do I care.
>
Sounds like I stepped on your toes here. Pardon, did not mean to.

All I've seen is distribution packaging ensuring the runtime version
is at least equal to the build-time one.
I have not seen the opposite, hence the question.

> If you build against a newer libxcb it won't run against the older one either,
> why do you expect building against the older one will magically work against
> a newer one with all the features?
>
Very often an updated version of a dependency is shipped, yet the
package (say mesa) is not rebuilt.
AFAICT there's no clear way to annotate this kind of 'hidden'
dependency, thus package maintainers don't know about it.

This causes a fair amount of time lost to user frustration and
developers debugging.

>> That said, if updating XCB is a serious no-go, may I suggest something
>> like the following:
>>  - add local fallback definitions/declarations
>>  - add local functions (annotated as weak) which return 'the correct'
>> value so that the fallback paths kick in
>
> I can sorta see the first part being useful, the second is definitely
> over engineering the solution.
>
> The thing is most of the features in dri3.1 are gated on the X server
> having support. Most people are not updating their X servers, I'm guessing
> apart from the modifiers devs there'll be at most 10 people who update
> their X server for this feature in advance of a distro moving them to it.
> I know I won't personally be going around all 10 boxes I keep running
> updating their X server for a feature that doesn't add anything on those
> hw configurations yet. When distros move to the 1.20 X server they'll
> also move to newer xcb, this is for distros that won't move at all.
>
Hey, I'm just sharing an idea of what sounds like the more robust
solution. It should work "for everyone" even though it seems like
overkill.
I dare not think of the xcb/xserver/mesa combinations that people use.

As long as people are on board with the fun experience mentioned
above, don't mind me ;-)

HTH
Emil

[Mesa-dev] [Bug 105464] Reading per-patch outputs in Tessellation Control Shader returns undefined values

2018-03-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105464

Bug ID: 105464
   Summary: Reading per-patch outputs in Tessellation Control
Shader returns undefined values
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: philip.rebo...@tu-dortmund.de
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 138037
  --> https://bugs.freedesktop.org/attachment.cgi?id=138037&action=edit
Patch for tessellation demo to reproduce the issue

As mentioned in the title, reading a per-patch output variable that was written
by a different invocation produces undefined results even after a barrier()
call. 

The attached patch changes the tessellation control shader of the
'tessellation' demo from Sascha Willems' Vulkan samples in that it reads the
tessellation levels from a per-patch output array. Note that the shader needs
to be recompiled manually in order to reproduce the issue.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 4/5 v6] clover/llvm: Add get_[cl|language]_version, validation and some helpers

2018-03-12 Thread Aaron Watry

ping.

--Aaron

On Thu, Mar 1, 2018 at 8:02 PM, Aaron Watry  wrote:
> Used to calculate the default CLC language version based on the --cl-std in 
> build args
> and the device capabilities.
>
> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>  1) If you have -cl-std=CL1.1+ use the version specified
>  2) If not, use the highest 1.x version that the device supports
>
> Curiously, there is no valid value for -cl-std=CL1.0
>
> Validates requested cl-std against device_clc_version
>
> Signed-off-by: Aaron Watry 
> Cc: Pierre Moreau 
>
> v6: (Pierre) Add more const and fix some whitespace
>
> v5: (Aaron) Use a collection of cl versions instead of switch cases
> Consolidates the string, numeric version, and clc langstandard::kind
>
> v4: (Pierre) Split get_language_version addition and use into separate patches
> Squash patches that add the helpers and validate the language standard
>
> v3: Change device_version to device_clc_version
>
> v2: (Pierre) Move create_compiler_instance changes to correct patch
> to prevent temporary build breakage.
> Convert version_str into unsigned and use it to find language version
> Add build_error for unknown language version string
> Whitespace fixes
> ---
>  .../state_trackers/clover/llvm/invocation.cpp  | 63 
> ++
>  1 file changed, 63 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 0bc06e..0f854b9049 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -63,6 +63,23 @@ using ::llvm::Module;
>  using ::llvm::raw_string_ostream;
>
>  namespace {
> +
> +   struct cl_version {
> +  std::string version_str; // CL Version
> +  unsigned version_number; // Numeric CL Version
> +  clang::LangStandard::Kind clc_lang_standard; // lang standard for 
> version
> +   };
> +
> +   static const unsigned ANY_VERSION = 999;
> +   const cl_version cl_versions[] = {
> +  { "1.0", 100, clang::LangStandard::lang_opencl10},
> +  { "1.1", 110, clang::LangStandard::lang_opencl11},
> +  { "1.2", 120, clang::LangStandard::lang_opencl12},
> +  { "2.0", 200, clang::LangStandard::lang_opencl20},
> +  { "2.1", 210, clang::LangStandard::lang_unspecified}, //2.1 doesn't 
> exist
> +  { "2.2", 220, clang::LangStandard::lang_unspecified}, //2.2 doesn't 
> exist
> +   };
> +
> void
> init_targets() {
>static bool targets_initialized = false;
> @@ -93,6 +110,52 @@ namespace {
>return ctx;
> }
>
> +   const struct cl_version
> +   get_cl_version(const std::string &version_str,
> +  unsigned max = ANY_VERSION) {
> +  for (const struct cl_version version : cl_versions) {
> + if (version.version_number == max || version.version_str == 
> version_str) {
> +return version;
> + }
> +  }
> +  throw build_error("Unknown/Unsupported language version");
> +   }
> +
> +   clang::LangStandard::Kind
> +   get_lang_standard_from_version_str(const std::string &version_str,
> +  bool is_build_opt = false) {
> +   /**
> +   * Per CL 2.0 spec, section 5.8.4.5:
> +   * If it's an option, use the value directly.
> +   * If it's a device version, clamp to max 1.x version, a.k.a. 1.2
> +   */
> +  const struct cl_version version = get_cl_version(version_str,
> +  is_build_opt ? ANY_VERSION : 120);
> +  return version.clc_lang_standard;
> +   }
> +
> +   clang::LangStandard::Kind
> +   get_language_version(const std::vector<std::string> &opts,
> +const std::string &device_version) {
> +
> +  const std::string search = "-cl-std=CL";
> +
> +  for (auto opt: opts) {
> + auto pos = opt.find(search);
> + if (pos == 0){
> +const auto ver = opt.substr(pos + search.size());
> +const auto device_ver = get_cl_version(device_version);
> +const auto requested = get_cl_version(ver);
> +if (requested.version_number > device_ver.version_number) {
> +   throw build_error();
> +}
> +return get_lang_standard_from_version_str(ver, true);
> + }
> +  }
> +
> +  return get_lang_standard_from_version_str(device_version);
> +   }
> +
> std::unique_ptr
> create_compiler_instance(const device ,
>  const std::vector ,
> --
> 2.14.1
>
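
For illustration, the selection rule from the commit message quoted above
(use the -cl-std value if given and supported by the device, otherwise clamp
the device version to the highest 1.x version) can be sketched in plain C.
This is a simplified sketch, not the clover code: the function and table
names are hypothetical, versions are plain numbers rather than
clang::LangStandard kinds, and errors are reported by returning 0 instead of
throwing build_error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical version table, mirroring the string/number pairs in the patch. */
struct sketch_cl_version {
   const char *str;
   unsigned number;
};

static const struct sketch_cl_version sketch_versions[] = {
   { "1.0", 100 }, { "1.1", 110 }, { "1.2", 120 }, { "2.0", 200 },
};

/* Returns the numeric version for a string such as "1.2", or 0 if unknown. */
unsigned
sketch_lookup_version(const char *str)
{
   const size_t n = sizeof(sketch_versions) / sizeof(sketch_versions[0]);
   for (size_t i = 0; i < n; i++) {
      if (strcmp(sketch_versions[i].str, str) == 0)
         return sketch_versions[i].number;
   }
   return 0;
}

/*
 * Returns the CL C version to compile for, or 0 on error (unknown version,
 * or a requested version newer than what the device supports).
 */
unsigned
sketch_choose_clc_version(const char *const *opts, size_t num_opts,
                          unsigned device_version)
{
   static const char prefix[] = "-cl-std=CL";
   const size_t prefix_len = sizeof(prefix) - 1;

   for (size_t i = 0; i < num_opts; i++) {
      if (strncmp(opts[i], prefix, prefix_len) == 0) {
         const unsigned requested = sketch_lookup_version(opts[i] + prefix_len);
         if (requested == 0 || requested > device_version)
            return 0;
         return requested; /* explicit option: use it directly */
      }
   }

   /* No option given: clamp the device version to the highest 1.x, i.e. 1.2. */
   return device_version > 120 ? 120 : device_version;
}
```

For example, a 2.0 device with no -cl-std option ends up at 120, while
requesting CL2.0 on a 1.2 device is rejected.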

Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Dave Airlie

On 13 March 2018 at 03:59, Marek Olšák  wrote:
> This is good, though some older distros only have libxcb 1.11.

On those distros you likely just want to --disable-dri3 anyways.

Dave.

Re: [Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation

2018-03-12 Thread Eric Anholt

Thomas Helland  writes:

> Walking the whole hash table, inserting entries by hashing them first
> is just a really bad idea. We can simply memcpy the whole thing.
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp | 13 --
>  .../glsl/opt_copy_propagation_elements.cpp | 29 
> --
>  2 files changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp 
> b/src/compiler/glsl/opt_copy_propagation.cpp
> index e904e6ede4..96667779da 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list 
> *instructions)
> this->killed_all = false;
>  
> /* Populate the initial acp with a copy of the original */
> -   struct hash_entry *entry;
> -   hash_table_foreach(orig_acp, entry) {
> -  _mesa_hash_table_insert(acp, entry->key, entry->data);
> -   }
> +   acp = _mesa_hash_table_clone(orig_acp, NULL);

Remove creation of acp above

>  
> visit_list_elements(this, instructions);
>  
> @@ -271,10 +268,10 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, 
> bool keep_acp)
> this->killed_all = false;
>  
> if (keep_acp) {
> -  struct hash_entry *entry;
> -  hash_table_foreach(orig_acp, entry) {
> - _mesa_hash_table_insert(acp, entry->key, entry->data);
> -  }
> +  acp = _mesa_hash_table_clone(orig_acp, NULL);
> +   } else {
> +  acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> +_mesa_key_pointer_equal);
> }

Again, remove the old creation of the acp.

Other than that, these are:

Reviewed-by: Eric Anholt 



Re: [Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation

2018-03-12 Thread Emil Velikov

Hi Thomas,

If I were you I'd split out the introduction of clone_acp() into a
separate patch.
Regardless of that suggestion, there seems to be a bug in this patch.

On 12 March 2018 at 17:55, Thomas Helland  wrote:
> Walking the whole hash table, inserting entries by hashing them first
> is just a really bad idea. We can simply memcpy the whole thing.
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp | 13 --
>  .../glsl/opt_copy_propagation_elements.cpp | 29 
> --
>  2 files changed, 15 insertions(+), 27 deletions(-)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp 
> b/src/compiler/glsl/opt_copy_propagation.cpp
> index e904e6ede4..96667779da 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list 
> *instructions)
> this->killed_all = false;
>
> /* Populate the initial acp with a copy of the original */
> -   struct hash_entry *entry;
> -   hash_table_foreach(orig_acp, entry) {
> -  _mesa_hash_table_insert(acp, entry->key, entry->data);
> -   }
> +   acp = _mesa_hash_table_clone(orig_acp, NULL);
>
There's a _mesa_hash_table_create just above that should be removed.

HTH
Emil

Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Dave Airlie

On 13 March 2018 at 03:24, Emil Velikov  wrote:
> Hi Dave,
>
> On 11 March 2018 at 23:26, Dave Airlie  wrote:
>> From: Dave Airlie 
>>
>> I'm not sure everyone wants to be updating their dri3 in a forced
>> march setting, this allows a nicer approach, esp when you want
>> to build on distro that aren't brand new.
>>
>> I'm sure there are plenty of ways this patch could be cleaner,
>> and I've also not built it against an updated dri3.
>
> Have you considered cases where the build server is using 1.12, while
> at run-time we have 1.13?
> Are you explicitly forbidding that, say via the packaging? It tends to
> be allowed on most(all?) distributions.

Yes I am because really who does that, and why do I care.

If you build against a newer libxcb it won't run against the older one either,
why do you expect building against the older one will magically work against
a newer one with all the features?

> That said, if updating XCB is a serious no-go, may I suggest something
> like the following:
>  - add local fallback definitions/declarations
>  - add local functions (annotated as weak) which return 'the correct'
> value so that the fallback paths kick in

I can sorta see the first part being useful, the second is definitely
over engineering
the solution.

The thing is most of the features in dri3.1 are gated on the X server
having support,
Most people are not updating their X servers, I'm guessing apart from
the modifiers
devs there'll be at most 10 people who update their X server for this
feature in advance
of a distro moving them to it. I know I won't personally be going
around all 10 boxes I
keep running updating their X server for a feature that doesn't add
anything on those
hw configurations yet. When distros move to the 1.20 X server they'll
also move to newer
xcb, this is for distros that won't move at all.

Dave.

Re: [Mesa-dev] [PATCH 1/2] util: Implement a hash table cloning function

2018-03-12 Thread Emil Velikov

Hi Thomas,

On 12 March 2018 at 17:55, Thomas Helland  wrote:
> V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)
> ---
>  src/util/hash_table.c | 22 ++
>  src/util/hash_table.h |  2 ++
>  2 files changed, 24 insertions(+)
>
> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> index b7421a0144..f8d5d0f88a 100644
> --- a/src/util/hash_table.c
> +++ b/src/util/hash_table.c
> @@ -141,6 +141,28 @@ _mesa_hash_table_create(void *mem_ctx,
> return ht;
>  }
>
> +struct hash_table *
> +_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx)
> +{
> +   struct hash_table *ht;
> +
> +   ht = ralloc(dst_mem_ctx, struct hash_table);
> +   if (ht == NULL)
> +  return NULL;
> +
> +   memcpy(ht, src, sizeof(struct hash_table));
> +
> +   ht->table = ralloc_array(ht, struct hash_entry, ht->size);
> +   if (ht->table == NULL) {
> +  ralloc_free(ht);
> +  return NULL;
> +   }
> +
> +   memcpy(ht->table, src->table, ht->size * sizeof(struct hash_entry));
> +
Thinking out loud:

I'm wondering if it won't make sense to reuse _mesa_hash_table_create,
instead of open-coding it?

-Emil
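
For illustration, the clone-by-memcpy approach in the quoted patch can be
shown on a toy table. This is a minimal sketch and not Mesa's hash_table: the
toy_* names, the fixed-size open-addressing layout, the trivial pointer hash
and the use of malloc instead of ralloc are all assumptions made for the
example. The point is that copying the slot array verbatim keeps every entry
at its probe position, so nothing needs to be rehashed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy open-addressing table keyed by pointer; a NULL key marks an empty slot. */
struct toy_entry {
   const void *key;
   void *data;
};

struct toy_table {
   struct toy_entry *table;
   uint32_t size;    /* number of slots, fixed at creation */
   uint32_t entries; /* number of occupied slots */
};

static uint32_t
toy_hash(const void *key)
{
   return (uint32_t)((uintptr_t)key >> 3);
}

struct toy_table *
toy_create(uint32_t size)
{
   struct toy_table *ht = malloc(sizeof(*ht));
   if (ht == NULL)
      return NULL;
   ht->table = calloc(size, sizeof(*ht->table));
   if (ht->table == NULL) {
      free(ht);
      return NULL;
   }
   ht->size = size;
   ht->entries = 0;
   return ht;
}

void
toy_insert(struct toy_table *ht, const void *key, void *data)
{
   /* No resizing in this sketch: always keep at least one slot empty. */
   assert(key != NULL && ht->entries + 1 < ht->size);
   uint32_t i = toy_hash(key) % ht->size;
   while (ht->table[i].key != NULL && ht->table[i].key != key)
      i = (i + 1) % ht->size;
   if (ht->table[i].key == NULL)
      ht->entries++;
   ht->table[i].key = key;
   ht->table[i].data = data;
}

void *
toy_search(const struct toy_table *ht, const void *key)
{
   uint32_t i = toy_hash(key) % ht->size;
   while (ht->table[i].key != NULL) {
      if (ht->table[i].key == key)
         return ht->table[i].data;
      i = (i + 1) % ht->size;
   }
   return NULL;
}

/* Clone without rehashing: copy the header, then the slot array verbatim. */
struct toy_table *
toy_clone(const struct toy_table *src)
{
   struct toy_table *ht = malloc(sizeof(*ht));
   if (ht == NULL)
      return NULL;
   memcpy(ht, src, sizeof(*ht));
   ht->table = malloc(src->size * sizeof(*ht->table));
   if (ht->table == NULL) {
      free(ht);
      return NULL;
   }
   memcpy(ht->table, src->table, src->size * sizeof(*ht->table));
   return ht;
}
```

The clone owns its own slot array, so inserting into it afterwards does not
affect the source table.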

Re: [Mesa-dev] [PATCH RESEND] spirv: Silence compiler warning about undefined srcs[0]

2018-03-12 Thread Ian Romanick

Reviewed-by: Ian Romanick 

On 03/12/2018 11:21 AM, Eric Anholt wrote:
> v2: Use assume() at the srcs[] definition instead.
> 
> Cc: Jason Ekstrand 
> Cc: Ian Romanick 
> Cc: Eric Engestrom 
> ---
>  src/compiler/spirv/spirv_to_nir.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/compiler/spirv/spirv_to_nir.c 
> b/src/compiler/spirv/spirv_to_nir.c
> index 6a358c597316..3de45c47371e 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -2925,6 +2925,7 @@ vtn_handle_composite(struct vtn_builder *b, SpvOp 
> opcode,
>  
> case SpvOpCompositeConstruct: {
>unsigned elems = count - 3;
> +  assume(elems >= 1);
>if (glsl_type_is_vector_or_scalar(type)) {
>   nir_ssa_def *srcs[4];
>   for (unsigned i = 0; i < elems; i++)
> 


Re: [Mesa-dev] [PATCH mesa] omx: always define ENABLE_ST_OMX_{BELLAGIO, TIZONIA}

2018-03-12 Thread Dylan Baker

Quoting Eric Engestrom (2018-03-12 11:05:51)
> On Monday, 2018-03-12 10:19:49 -0700, Dylan Baker wrote:
> > Quoting Eric Engestrom (2018-03-12 07:33:27)
> > > We're trying to be -Wundef clean so that we can turn it on (and
> > > eventually make it an error).
> > > 
> > > Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead
> > > of #ifdef; I could've changed these, but the point of -Wundef is to
> > > catch typos, so we might as well make the change the right way.
> > > 
> > > Fixes: 83d4a5d5aea5a8a05be2 "st/omx/tizonia: Add H.264 decoder"
> > > Fixes: b2f2236dc565dd1460f0 "st/omx/tizonia: Add H.264 encoder"
> > > Fixes: c62cf1f165919bc74296 "st/omx/tizonia/h264d: Add EGLImage support"
> > > Cc: Gurkirpal Singh 
> > > Signed-off-by: Eric Engestrom 
> > > ---
> > > The meson hunk doesn't look pretty at all, but I'm planning on replacing
> > > all the `pre_args` with a configuration_data(), which will allow to
> > > simplify a lot of this #defines code.
> > > ---
> > >  configure.ac |  4 
> > >  meson.build  | 11 +--
> > >  2 files changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/configure.ac b/configure.ac
> > > index 1553ce99da44bca4e826..6de4ceb2fb715505120e 100644
> > > --- a/configure.ac
> > > +++ b/configure.ac
> > > @@ -2281,6 +2281,8 @@ if test "x$enable_omx_bellagio" = xyes; then
> > >  PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= 
> > > $LIBOMXIL_BELLAGIO_REQUIRED])
> > >  gallium_st="$gallium_st omx_bellagio"
> > >  AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
> > > +else
> > > +AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
> > >  fi
> > >  AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
> > >  
> > > @@ -2294,6 +2296,8 @@ if test "x$enable_omx_tizonia" = xyes; then
> > > libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
> > >  gallium_st="$gallium_st omx_tizonia"
> > >  AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL])
> > > +else
> > > +AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
> > >  fi
> > >  AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
> > >  
> > > diff --git a/meson.build b/meson.build
> > > index b6e9692f192c528520e7..b9f7cd2aff5fc49e0d93 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -504,7 +504,7 @@ if with_gallium_omx == 'bellagio' or with_gallium_omx 
> > > == 'auto'
> > >  'libomxil-bellagio', required : with_gallium_omx == 'bellagio'
> > >)
> > >if dep_omx.found()
> > > -pre_args += '-DENABLE_ST_OMX_BELLAGIO'
> > > +pre_args += '-DENABLE_ST_OMX_BELLAGIO=1'
> > >  with_gallium_omx = 'bellagio'
> > >endif
> > >  endif
> > > @@ -525,7 +525,7 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx 
> > > == 'auto'
> > >dependency('tizilheaders', required : with_gallium_omx == 
> > > 'tizonia'),
> > >  ]
> > >  if dep_omx.found() and dep_omx_other[0].found() and 
> > > dep_omx_other[1].found()
> > > -  pre_args += '-DENABLE_ST_OMX_TIZONIA'
> > > +  pre_args += '-DENABLE_ST_OMX_TIZONIA=1'
> > >with_gallium_omx = 'tizonia'
> > >  else
> > >with_gallium_omx = 'disabled'
> > > @@ -533,6 +533,13 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx 
> > > == 'auto'
> > >endif
> > >  endif
> > >  
> > > +if with_gallium_omx != 'bellagio'
> > > +  pre_args += '-DENABLE_ST_OMX_BELLAGIO=0'
> > > +endif
> > > +if with_gallium_omx != 'tizonia'
> > > +  pre_args += '-DENABLE_ST_OMX_TIZONIA=0'
> > > +endif
> > > +
> > 
> > This is fine as-is, but if you wanted to clean it up a little, you could do
> > something like:
> > 
> > pre_args += [
> >   '-DENABLE_ST_OMX_BELLAGIO=' + (with_gallium_omx == 'bellagio' ? '1' : '0'),
> >   '-DENABLE_ST_OMX_TIZONIA=' + (with_gallium_omx == 'tizonia' ? '1' : '0'),
> > ]
> 
> That's what I was looking for but too tired to figure out :]
> Thanks, I'll send a v2 with that tomorrow!

Yup!

I haven't tested this, and I have noticed some problems with using the ternary
construct in meson (there's some problems with the parser), so this may not
actually work.

Dylan
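
For illustration, the reason for always defining the macro to 0 or 1 can be
seen in a small C sketch. The macro name below is hypothetical, and the
#define stands in for what the build system would pass as
-DENABLE_SKETCH_FEATURE=0.

```c
#include <assert.h>

/*
 * Stand-in for a build-system definition such as -DENABLE_SKETCH_FEATURE=0.
 * The macro name is made up for this sketch.
 */
#define ENABLE_SKETCH_FEATURE 0

int
sketch_feature_enabled(void)
{
#if ENABLE_SKETCH_FEATURE
   return 1;
#else
   return 0;
#endif
}
```

Because the macro always has a value, a misspelled name in an #if line is an
undefined identifier and gets reported by -Wundef, whereas the same typo in an
#ifdef would silently select the disabled branch.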



Re: [Mesa-dev] Few issues with Meson

2018-03-12 Thread Dylan Baker

This is my cross file (Arch doesn't have a pkg-config for x86, so I have a shell
wrapper that sets PKG_CONFIG_PATH), you'll probably need to adjust some paths

```
[binaries]
c = '/usr/bin/gcc'
cpp = '/usr/bin/g++'
ar = '/usr/bin/ar'
strip = '/usr/bin/strip'
pkgconfig = '/home/dylan/.local/bin/pkg-config-lib32'
llvm-config = '/usr/bin/llvm-config32'

[properties]
c_args = ['-m32']
c_link_args = ['-m32']
cpp_args = ['-m32']
cpp_link_args = ['-m32']

[host_machine]
system = 'linux'
cpu_family = 'x86'
cpu = 'i686'
endian = 'little'

# vim: ft=dosini
```

meson build-x66 --cross-file   should give you a working
mesa for your arch.

There's some upstream discussion on how to choose llvm-config for non-cross
compilation cases, but that hasn't moved a whole lot recently.

Dylan

Quoting Mike Lothian (2018-03-12 10:57:01)
> Hi Dylan
> 
> Do you have the link to patch on patchwork? I'll give it a go
> 
> I'm using meson 0.45 however the cross-file requires more than just defining
> llvm-config, everything else is normally picked up from what portage is 
> setting
> in the build environment - though strangely not if clang is used - I'll look
> into that sometime
> 
> Regards
> 
> Mike
> 
> On Fri, 9 Mar 2018 at 16:37 Dylan Baker  wrote:
> 
> Quoting Mike Lothian (2018-03-06 05:07:34)
> > Hi
> >
> > When compiling wine I also noticed that the d3d.pc files didn't have
> moduledir
> > set, so wine couldn't find it
> >
> > configure: error: pkg-config couldn't find Gallium Nine module
> 
> I've sent a patch for this.
> 
> >
> > Regards
> >
> > Mike
> >
> > On Tue, 6 Mar 2018 at 02:17 Mike Lothian  wrote:
> >
> >     Hi
> >
> >     I've been trying to get a Gentoo ebuild ready for meson
> >
> >     I've had to fudge the llvm-config for cross compiling a 32bit mesa 
> on
> >     a 64bit machine
> 
> If you're using a new enough meson (0.45) you can specify the llvm-config
> you
> want to use in the cross file.
> 
> >
> >     I notice that -Dvulkan-drivers= doesn't accept intel,radeon like
> >     autotools used to, it also seems as long as one value is correct the
> >     other is ignored
> 
> we're using amd instead of radeon. After 18.0 branches I want to bump the
> meson
> requirement so we can use meson's list argument type, which will check for
> such
> problems.
> 
> >
> >     Also -Dva-libs-path= doesn't play well with absolute paths, or 
> rather
> >     install_megadrivers.py is doing something strange - normally gentoo
> >     installs everything to a temporary image path then puts those files
> >     into the live system. It seems install_megadrivers.py doesn't do 
> this
> >     and installs directly to the live system - I worked around it by
> >     dropping the /usr
> 
> There's a patch from someone in FreeBSD that might fix this (the way we do
> symlinking in install_megadrivers is wrong).
> 
> Sorry it took me so long to find this email, notmuch applied some odd tags
> to
> it.
> 
> Dylan
> 



[Mesa-dev] [PATCH] meson: don't use compiler.has_header

2018-03-12 Thread Dylan Baker

Meson's compiler.has_header is completely useless, it only checks that a
header exists, not whether it's usable. This creates problems if a
header contains a conditional #error declaration, like so:

> #if __x86_64__
> # error "Doesn't work with x86_64!"
> #endif

Compiler.has_header will return true in this case, even when compiling
for x86_64. This is useless.

Instead, we'll do a compile check so that any #error declarations will
be treated as errors, and compilation will work.

Fixes compilation on x32 architecture.

Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746
meson bug: https://github.com/mesonbuild/meson/issues/2246
CC: Matt Turner 
Signed-off-by: Dylan Baker 
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 3c63f384381..51b470253f5 100644
--- a/meson.build
+++ b/meson.build
@@ -912,7 +912,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
 endif
 
 foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
-  if cc.has_header(h)
+  if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
 pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
   endif
 endforeach
-- 
2.16.2


[Mesa-dev] [PATCH RESEND] spirv: Silence compiler warning about undefined srcs[0]

2018-03-12 Thread Eric Anholt

v2: Use assume() at the srcs[] definition instead.

Cc: Jason Ekstrand 
Cc: Ian Romanick 
Cc: Eric Engestrom 
---
 src/compiler/spirv/spirv_to_nir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 6a358c597316..3de45c47371e 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2925,6 +2925,7 @@ vtn_handle_composite(struct vtn_builder *b, SpvOp opcode,
 
case SpvOpCompositeConstruct: {
   unsigned elems = count - 3;
+  assume(elems >= 1);
   if (glsl_type_is_vector_or_scalar(type)) {
  nir_ssa_def *srcs[4];
  for (unsigned i = 0; i < elems; i++)
-- 
2.16.2
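
For illustration, an assume()-style macro and its effect on a loop like the
one in the patch can be sketched as follows. This is a sketch under
assumptions, not the definition from Mesa's headers: the sketch_* names are
hypothetical, and __builtin_unreachable() is a GCC/Clang builtin.

```c
#include <assert.h>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE() __builtin_unreachable()
#else
#define SKETCH_UNREACHABLE() ((void)0)
#endif

/* Checks the condition in debug builds and tells the optimizer it holds. */
#define sketch_assume(expr) \
   ((expr) ? (void)0 : (assert(!"assumption failed"), SKETCH_UNREACHABLE()))

/*
 * Mirrors the shape of the patched code: the first three words are a header,
 * the rest are sources. Without the assumption the compiler cannot prove the
 * loop runs at least once, so it may warn that srcs[0] is uninitialized.
 */
unsigned
sketch_first_source(const unsigned *words, unsigned count)
{
   unsigned elems = count - 3;
   sketch_assume(elems >= 1 && elems <= 4);

   unsigned srcs[4];
   for (unsigned i = 0; i < elems; i++)
      srcs[i] = words[3 + i];

   return srcs[0];
}
```

The upper bound on elems is an addition made for the sketch so that the
fixed-size array cannot overflow; the patch itself only asserts the lower
bound.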


Re: [Mesa-dev] [PATCH v4 1/2] gallium/winsys/kms: Fix possible leak in map/unmap.

2018-03-12 Thread Emil Velikov

On 12 March 2018 at 17:45, Lepton Wu  wrote:
> Ping.  Any more comments or missing stuff to get this committed into master?
>
As things have changed a bit (the original map/unmap behaviour is
preserved) I was hoping that Tomasz will give it another look.
If he prefers, I could add some revision summary and keep him as reviewer of v1?

Thanks
Emil

Re: [Mesa-dev] [PATCH mesa] omx: always define ENABLE_ST_OMX_{BELLAGIO, TIZONIA}

2018-03-12 Thread Eric Engestrom

On Monday, 2018-03-12 10:19:49 -0700, Dylan Baker wrote:
> Quoting Eric Engestrom (2018-03-12 07:33:27)
> > We're trying to be -Wundef clean so that we can turn it on (and
> > eventually make it an error).
> > 
> > Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead
> > of #ifdef; I could've changed these, but the point of -Wundef is to
> > catch typos, so we might as well make the change the right way.
> > 
> > Fixes: 83d4a5d5aea5a8a05be2 "st/omx/tizonia: Add H.264 decoder"
> > Fixes: b2f2236dc565dd1460f0 "st/omx/tizonia: Add H.264 encoder"
> > Fixes: c62cf1f165919bc74296 "st/omx/tizonia/h264d: Add EGLImage support"
> > Cc: Gurkirpal Singh 
> > Signed-off-by: Eric Engestrom 
> > ---
> > The meson hunk doesn't look pretty at all, but I'm planning on replacing
> > all the `pre_args` with a configuration_data(), which will allow to
> > simplify a lot of this #defines code.
> > ---
> >  configure.ac |  4 
> >  meson.build  | 11 +--
> >  2 files changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/configure.ac b/configure.ac
> > index 1553ce99da44bca4e826..6de4ceb2fb715505120e 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -2281,6 +2281,8 @@ if test "x$enable_omx_bellagio" = xyes; then
> >  PKG_CHECK_MODULES([OMX_BELLAGIO], [libomxil-bellagio >= 
> > $LIBOMXIL_BELLAGIO_REQUIRED])
> >  gallium_st="$gallium_st omx_bellagio"
> >  AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 1, [Use Bellagio for OMX IL])
> > +else
> > +AC_DEFINE([ENABLE_ST_OMX_BELLAGIO], 0)
> >  fi
> >  AM_CONDITIONAL(HAVE_ST_OMX_BELLAGIO, test "x$enable_omx_bellagio" = xyes)
> >  
> > @@ -2294,6 +2296,8 @@ if test "x$enable_omx_tizonia" = xyes; then
> > libtizplatform >= $LIBOMXIL_TIZONIA_REQUIRED])
> >  gallium_st="$gallium_st omx_tizonia"
> >  AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 1, [Use Tizoina for OMX IL])
> > +else
> > +AC_DEFINE([ENABLE_ST_OMX_TIZONIA], 0)
> >  fi
> >  AM_CONDITIONAL(HAVE_ST_OMX_TIZONIA, test "x$enable_omx_tizonia" = xyes)
> >  
> > diff --git a/meson.build b/meson.build
> > index b6e9692f192c528520e7..b9f7cd2aff5fc49e0d93 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -504,7 +504,7 @@ if with_gallium_omx == 'bellagio' or with_gallium_omx 
> > == 'auto'
> >  'libomxil-bellagio', required : with_gallium_omx == 'bellagio'
> >)
> >if dep_omx.found()
> > -pre_args += '-DENABLE_ST_OMX_BELLAGIO'
> > +pre_args += '-DENABLE_ST_OMX_BELLAGIO=1'
> >  with_gallium_omx = 'bellagio'
> >endif
> >  endif
> > @@ -525,7 +525,7 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx == 
> > 'auto'
> >dependency('tizilheaders', required : with_gallium_omx == 'tizonia'),
> >  ]
> >  if dep_omx.found() and dep_omx_other[0].found() and 
> > dep_omx_other[1].found()
> > -  pre_args += '-DENABLE_ST_OMX_TIZONIA'
> > +  pre_args += '-DENABLE_ST_OMX_TIZONIA=1'
> >with_gallium_omx = 'tizonia'
> >  else
> >with_gallium_omx = 'disabled'
> > @@ -533,6 +533,13 @@ if with_gallium_omx == 'tizonia' or with_gallium_omx 
> > == 'auto'
> >endif
> >  endif
> >  
> > +if with_gallium_omx != 'bellagio'
> > +  pre_args += '-DENABLE_ST_OMX_BELLAGIO=0'
> > +endif
> > +if with_gallium_omx != 'tizonia'
> > +  pre_args += '-DENABLE_ST_OMX_TIZONIA=0'
> > +endif
> > +
> 
> This is fine as-is, but if you wanted to clean it up a little, you could do
> something like:
> 
> pre_args += [
>   '-DENABLE_ST_OMX_BELLAGIO=' + (with_gallium_omx == 'bellagio' ? '1' : '0'),
>   '-DENABLE_ST_OMX_TIZONIA=' + (with_gallium_omx == 'tizonia' ? '1' : '0'),
> ]

That's what I was looking for but too tired to figure out :]
Thanks, I'll send a v2 with that tomorrow!

> 
> and take the pre_args out of the block above altogether.
> 
> either way,
> 
> Reviewed-by: Dylan Baker 

Re: [Mesa-dev] [PATCH 0/2] Hash table cloning for copy propagation

2018-03-12 Thread Thomas Helland

I've also uploaded this series to my github, if you want to
pull them down from there [1].

I've also uploaded my previously talked about pointer_map
to my github account [2]. There's a pointer map, pointer set,
and some patches for nir in there, and some for disabling
asserts in some places. So it's not ready for primetime,
but that series has been tested recently, and has been
stable for a couple months now. Been tinkering with it
and adding small pieces now and then. What remains is
a bench-a-tonne to ensure it is OK performance wise,
and cleaning it up for posting on the mailing list.

[1]: https://github.com/thohel/mesa/commits/hash-table-clone
[2]: https://github.com/thohel/mesa/commits/pointer_map

2018-03-12 18:55 GMT+01:00 Thomas Helland :
> This is a revival of some old patches I had around to improve
> the compile times in the glsl compiler by reducing the time
> spent inserting items in the hash table in opt_copy_propagation.
> I've only rebased this, as my system doesn't even want to compile
> anything right now. I also don't remember if it was thoroughly
> tested, so that will have to be done. Sending it out as Dave
> might be interested in this to mitigate some of the overhead
> his soft-double implementation incurs.
>
> CC: Dave Airlie 
>
> Thomas Helland (2):
>   util: Implement a hash table cloning function
>   glsl: Use hash table cloning in copy propagation
>
>  src/compiler/glsl/opt_copy_propagation.cpp | 13 --
>  .../glsl/opt_copy_propagation_elements.cpp | 29 
> --
>  src/util/hash_table.c  | 22 
>  src/util/hash_table.h  |  2 ++
>  4 files changed, 39 insertions(+), 27 deletions(-)
>
> --
> 2.15.1
>

Re: [Mesa-dev] [RFC] Mesa release improvements - Feature and Stable releases

2018-03-12 Thread Emil Velikov

Hi Andres,

On 12 March 2018 at 15:57, Andres Gomez  wrote:
> On Mon, 2018-03-12 at 16:45 +0100, Juan A. Suarez Romero wrote:
>> >
>> On Mon, 2018-03-12 at 17:17 +0200, Andres Gomez wrote:
>
> [...]
>
I'm fully on board with your initial suggestion.


>> > My proposal would be, similarly to what Intel does to track [1] the
>> > stabilization for a release, 1 week (?) prior to the branching time to
>> > create a metabug in bugzilla (or GitLab in the future ?), to announce
>> > this metabug in mesa-dev and to let any developer who wants to see
>> > their feature into the coming release to open a blocking bug for this
>> > metabug explaining such feature and its progress. This way we can track
>> > the progress and the process will be more transparent. We can still be
>> > flexible to include the blocking features but the coordination will
>> > happen over these bugs.
>> >
>>
>> So, when the branch point is created? After the metabug is closed? or 1 week
>> after the metabug is created?
>>
>>
>> Not sure if this provide any difference on what we are doing now: create the
>> branchpoint, open a metabug with the desire features, and cherry-pick all the
>> patches that solves the metabug.
>>
>
> 18.1 example:
>
>1. Create a Metabug for the 18.1 branch point.
>2. Announce the Metabug in mesa-dev and give 1 week (?) for developers
>   to complete their features. Advise to block the Metabug with other
>   feature bugs.
>3. Developers create bugs with the WIP features they want to include in
>   18.1 and block the Metabug.
>4. After 1 week, check the status
>* If there are no blockers, close the Metabug and create the 18.1
>   branch point.
>* If there are blockers; coordinate with the developers of the
>   blockers and decide whether to give a bit more of margin if the
>   feature is almost complete or just remove the blocking bugs
>   leaving the WIP features out, close the Metabug and create the
>   18.1 branch point.
>5. Release 18.1.0-rc1.
>6. Create a Metabug to track the status of the final 18.1.0 release.
>7. Block this Metabug with regressions found from 18.1.0-rcX.
>8. Once we reach stability, close the Metabug and announce the final
>   release of 18.1.0.
>
I might sound a bit negative, yet I'm not sure what this brings us.
Can you please elaborate?

The original goal is to have time-based releases, as opposed to
feature-based ones.
That was reiterated by developers not too long ago.

So far, there has been an announcement email 2-4 weeks before the
branch point, aiming to:
 - remind, and
 - seek feedback about required features

The email was also followed by a weekly ping/reminder.

IIRC suggestions and requests that are made in a timely fashion* have
always been accepted.
If we adopt the above approach, it will lead to noticeable delays in
the branch point. Combined with the current delays in getting the
blocking bugs fixed, that means even greater delays and releases that
are less time-based.

Furthermore, I'm a bit worried that this might have a negative impact on
developers:
I don't know of any instances, yet some developers may put extra pressure
on themselves trying to get 'too many' features merged, leading to
stress, burnout and the like.


Perhaps we can somehow utilise your suggestion while ensuring that my
grim 'predictions' do not come true?


Thanks
Emil

* 3+ days to a week before the branch point
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] [rfc] dri3: allow building against older xcb

2018-03-12 Thread Marek Olšák

This is good, though some older distros only have libxcb 1.11.

Marek

On Sun, Mar 11, 2018 at 7:26 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> I'm not sure everyone wants to be updating their dri3 in a forced
> march setting, this allows a nicer approach, esp when you want
> to build on distros that aren't brand new.
>
> I'm sure there are plenty of ways this patch could be cleaner,
> and I've also not built it against an updated dri3.
> ---
>  configure.ac |  4 ++--
>  src/egl/drivers/dri2/platform_x11_dri3.c |  4 
>  src/loader/loader_dri3_helper.c  | 22 --
>  src/loader/loader_dri3_helper.h  |  3 ++-
>  4 files changed, 24 insertions(+), 9 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 1553ce9..6a1f139 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -92,9 +92,9 @@ WAYLAND_REQUIRED=1.11
>  WAYLAND_PROTOCOLS_REQUIRED=1.8
>  XCB_REQUIRED=1.9.3
>  XCBDRI2_REQUIRED=1.8
> -XCBDRI3_REQUIRED=1.13
> +XCBDRI3_REQUIRED=1.12
>  XCBGLX_REQUIRED=1.8.1
> -XCBPRESENT_REQUIRED=1.13
> +XCBPRESENT_REQUIRED=1.12
>  XDAMAGE_REQUIRED=1.1
>  XSHMFENCE_REQUIRED=1.1
>  XVMC_REQUIRED=1.0.6
> diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
> b/src/egl/drivers/dri2/platform_x11_dri3.c
> index dce3356..efe030a 100644
> --- a/src/egl/drivers/dri2/platform_x11_dri3.c
> +++ b/src/egl/drivers/dri2/platform_x11_dri3.c
> @@ -327,6 +327,7 @@ dri3_create_image_khr_pixmap_from_buffers(_EGLDisplay 
> *disp, _EGLContext *ctx,
>EGLClientBuffer buffer,
>const EGLint *attr_list)
>  {
> +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0
> struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
> struct dri2_egl_image *dri2_img;
> xcb_dri3_buffers_from_pixmap_cookie_t bp_cookie;
> @@ -376,6 +377,9 @@ dri3_create_image_khr_pixmap_from_buffers(_EGLDisplay 
> *disp, _EGLContext *ctx,
> }
>
> return &dri2_img->base;
> +#else
> +   return NULL;
> +#endif
>  }
>
>  static _EGLImage *
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index 585f7ce..624ef1b 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -389,6 +389,7 @@ dri3_handle_present_event(struct loader_dri3_drawable 
> *draw,
>  /* If the server tells us that our allocation is suboptimal, we
>* reallocate once.
>*/
> +#ifdef XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY
>   if (ce->mode == XCB_PRESENT_COMPLETE_MODE_SUBOPTIMAL_COPY &&
>   draw->last_present_mode != ce->mode) {
>  for (int b = 0; b < ARRAY_SIZE(draw->buffers); b++) {
> @@ -396,7 +397,7 @@ dri3_handle_present_event(struct loader_dri3_drawable 
> *draw,
>draw->buffers[b]->reallocate = true;
>  }
>   }
> -
> +#endif
>   draw->last_present_mode = ce->mode;
>
>   if (draw->vtable->show_fps)
> @@ -903,10 +904,10 @@ loader_dri3_swap_buffers_msc(struct 
> loader_dri3_drawable *draw,
> */
>if (!loader_dri3_have_image_blit(draw) && draw->cur_blit_source != -1)
>   options |= XCB_PRESENT_OPTION_COPY;
> -
> +#ifdef XCB_PRESENT_OPTION_SUBOPTIMAL
>if (draw->multiplanes_available)
>   options |= XCB_PRESENT_OPTION_SUBOPTIMAL;
> -
> +#endif
>back->busy = 1;
>back->last_swap = draw->send_sbc;
>xcb_present_pixmap(draw->conn,
> @@ -1053,6 +1054,7 @@ image_format_to_fourcc(int format)
> return 0;
>  }
>
> +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0
>  static bool
>  has_supported_modifier(struct loader_dri3_drawable *draw, unsigned int 
> format,
> uint64_t *modifiers, uint32_t count)
> @@ -1087,6 +1089,7 @@ has_supported_modifier(struct loader_dri3_drawable 
> *draw, unsigned int format,
> free(supported_modifiers);
> return found;
>  }
> +#endif
>
>  /** loader_dri3_alloc_render_buffer
>   *
> @@ -1132,6 +1135,7 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable 
> *draw, unsigned int format,
>goto no_image;
>
> if (!draw->is_different_gpu) {
> +#if XCB_DRI3_MAJOR_VERSION == 1 && XCB_DRI3_MINOR_VERSION > 0
>if (draw->multiplanes_available &&
>draw->ext->image->base.version >= 15 &&
>draw->ext->image->queryDmaBufModifiers &&
> @@ -1195,7 +1199,7 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable 
> *draw, unsigned int format,
>  buffer);
>   free(modifiers);
>}
> -
> +#endif
>if (!buffer->image)
>   buffer->image = draw->ext->image->createImage(draw->dri_screen,
> width, height,
> @@ -1272,6 +1276,7 @@ dri3_alloc_render_buffer(struct loader_dri3_drawable 
> *draw, unsigned int
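
The guards in the quoted patch compile the newer code paths only when the installed headers are recent enough. A minimal sketch of that pattern follows; the `DEMO_*` names are hypothetical stand-ins, not the real xcb macros. It uses an "at least" comparison, which, unlike `MAJOR == 1 && MINOR > 0`, stays true after a major version bump. It also keeps in mind that `#ifdef` only sees preprocessor macros: if a library declares a constant as an enumerator (worth checking in the installed header), an `#ifdef` on that name is always false.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical version macros; a real library header would define these.
 * The defaults below model an "old" library, version 1.0. */
#ifndef DEMO_LIB_MAJOR_VERSION
#define DEMO_LIB_MAJOR_VERSION 1
#define DEMO_LIB_MINOR_VERSION 0
#endif

/* True when the library version is (major, minor) or newer. */
#define DEMO_LIB_AT_LEAST(major, minor) \
   (DEMO_LIB_MAJOR_VERSION > (major) || \
    (DEMO_LIB_MAJOR_VERSION == (major) && DEMO_LIB_MINOR_VERSION >= (minor)))

/* Newer code path when available, otherwise a stub that reports failure. */
const char *
create_image_from_buffers(void)
{
#if DEMO_LIB_AT_LEAST(1, 1)
   return "image built from per-plane buffers";
#else
   return NULL;
#endif
}

/* Feature-test on a macro: only works if the name really is a macro. */
unsigned
present_options(unsigned options)
{
#ifdef DEMO_PRESENT_OPTION_SUBOPTIMAL
   options |= DEMO_PRESENT_OPTION_SUBOPTIMAL;
#endif
   return options;
}
```

Building with `-DDEMO_LIB_MAJOR_VERSION=1 -DDEMO_LIB_MINOR_VERSION=1` would select the newer path in `create_image_from_buffers()`.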

Re: [Mesa-dev] Few issues with Meson

2018-03-12 Thread Mike Lothian

Hi Dylan

Do you have the link to the patch on patchwork? I'll give it a go

I'm using meson 0.45 however the cross-file requires more than just
defining llvm-config, everything else is normally picked up from what
portage is setting in the build environment - though strangely not if clang
is used - I'll look into that sometime

Regards

Mike

On Fri, 9 Mar 2018 at 16:37 Dylan Baker  wrote:

> Quoting Mike Lothian (2018-03-06 05:07:34)
> > Hi
> >
> > When compiling wine I also noticed that the d3d.pc files didn't have
> > moduledir set, so wine couldn't find it
> >
> > configure: error: pkg-config couldn't find Gallium Nine module
>
> I've sent a patch for this.
>
> >
> > Regards
> >
> > Mike
> >
> > On Tue, 6 Mar 2018 at 02:17 Mike Lothian  wrote:
> >
> > Hi
> >
> > I've been trying to get a Gentoo ebuild ready for meson
> >
> > I've had to fudge the llvm-config for cross compiling a 32bit mesa on
> > a 64bit machine
>
> If you're using a new enough meson (0.45) you can specify the llvm-config
> you want to use in the cross file.
>
> >
> > I notice that -Dvulkan-drivers= doesn't accept intel,radeon like
> > autotools used to, it also seems as long as one value is correct the
> > other is ignored
>
> we're using amd instead of radeon. After 18.0 branches I want to bump the
> meson requirement so we can use meson's list argument type, which will
> check for such problems.
>
> >
> > Also -Dva-libs-path= doesn't play well with absolute paths, or rather
> > install_megadrivers.py is doing something strange - normally gentoo
> > installs everything to a temporary image path then puts those files
> > into the live system. It seems install_megadrivers.py doesn't do this
> > and installs directly to the live system - I worked around it by
> > dropping the /usr
>
> There's a patch from someone in FreeBSD that might fix this (the way we do
> symlinking in install_megadrivers is wrong).
>
> Sorry it took me so long to find this email, notmuch applied some odd
> tags to it.
>
> Dylan
>

[Mesa-dev] [PATCH 2/2] glsl: Use hash table cloning in copy propagation

2018-03-12 Thread Thomas Helland

Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.
---
 src/compiler/glsl/opt_copy_propagation.cpp | 13 --
 .../glsl/opt_copy_propagation_elements.cpp | 29 --
 2 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation.cpp 
b/src/compiler/glsl/opt_copy_propagation.cpp
index e904e6ede4..96667779da 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -220,10 +220,7 @@ ir_copy_propagation_visitor::handle_if_block(exec_list 
*instructions)
this->killed_all = false;
 
/* Populate the initial acp with a copy of the original */
-   struct hash_entry *entry;
-   hash_table_foreach(orig_acp, entry) {
-  _mesa_hash_table_insert(acp, entry->key, entry->data);
-   }
+   acp = _mesa_hash_table_clone(orig_acp, NULL);
 
visit_list_elements(this, instructions);
 
@@ -271,10 +268,10 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, 
bool keep_acp)
this->killed_all = false;
 
if (keep_acp) {
-  struct hash_entry *entry;
-  hash_table_foreach(orig_acp, entry) {
- _mesa_hash_table_insert(acp, entry->key, entry->data);
-  }
+  acp = _mesa_hash_table_clone(orig_acp, NULL);
+   } else {
+  acp = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+_mesa_key_pointer_equal);
}
 
visit_list_elements(this, &ir->body_instructions);
diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp 
b/src/compiler/glsl/opt_copy_propagation_elements.cpp
index 9f79fa9202..8bae424a1d 100644
--- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
+++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
@@ -124,6 +124,12 @@ public:
   ralloc_free(mem_ctx);
}
 
+   void clone_acp(hash_table *lhs, hash_table *rhs)
+   {
+  lhs_ht = _mesa_hash_table_clone(lhs, mem_ctx);
+  rhs_ht = _mesa_hash_table_clone(rhs, mem_ctx);
+   }
+
void create_acp()
{
   lhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
@@ -138,19 +144,6 @@ public:
   _mesa_hash_table_destroy(rhs_ht, NULL);
}
 
-   void populate_acp(hash_table *lhs, hash_table *rhs)
-   {
-  struct hash_entry *entry;
-
-  hash_table_foreach(lhs, entry) {
- _mesa_hash_table_insert(lhs_ht, entry->key, entry->data);
-  }
-
-  hash_table_foreach(rhs, entry) {
- _mesa_hash_table_insert(rhs_ht, entry->key, entry->data);
-  }
-   }
-
void handle_loop(ir_loop *, bool keep_acp);
virtual ir_visitor_status visit_enter(class ir_loop *);
virtual ir_visitor_status visit_enter(class ir_function_signature *);
@@ -395,10 +388,8 @@ 
ir_copy_propagation_elements_visitor::handle_if_block(exec_list *instructions)
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
 
-   create_acp();
-
/* Populate the initial acp with a copy of the original */
-   populate_acp(orig_lhs_ht, orig_rhs_ht);
+   clone_acp(orig_lhs_ht, orig_rhs_ht);
 
visit_list_elements(this, instructions);
 
@@ -454,11 +445,11 @@ ir_copy_propagation_elements_visitor::handle_loop(ir_loop 
*ir, bool keep_acp)
this->kills = new(mem_ctx) exec_list;
this->killed_all = false;
 
-   create_acp();
-
if (keep_acp) {
   /* Populate the initial acp with a copy of the original */
-  populate_acp(orig_lhs_ht, orig_rhs_ht);
+  clone_acp(orig_lhs_ht, orig_rhs_ht);
+   } else {
+  create_acp();
}
 
visit_list_elements(this, &ir->body_instructions);
-- 
2.15.1


[Mesa-dev] [PATCH 1/2] util: Implement a hash table cloning function

2018-03-12 Thread Thomas Helland

V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav)
---
 src/util/hash_table.c | 22 ++
 src/util/hash_table.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index b7421a0144..f8d5d0f88a 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -141,6 +141,28 @@ _mesa_hash_table_create(void *mem_ctx,
return ht;
 }
 
+struct hash_table *
+_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx)
+{
+   struct hash_table *ht;
+
+   ht = ralloc(dst_mem_ctx, struct hash_table);
+   if (ht == NULL)
+  return NULL;
+
+   memcpy(ht, src, sizeof(struct hash_table));
+
+   ht->table = ralloc_array(ht, struct hash_entry, ht->size);
+   if (ht->table == NULL) {
+  ralloc_free(ht);
+  return NULL;
+   }
+
+   memcpy(ht->table, src->table, ht->size * sizeof(struct hash_entry));
+
+   return ht;
+}
+
 /**
  * Frees the given hash table.
  *
diff --git a/src/util/hash_table.h b/src/util/hash_table.h
index d3e0758b26..3846dad4b4 100644
--- a/src/util/hash_table.h
+++ b/src/util/hash_table.h
@@ -62,6 +62,8 @@ _mesa_hash_table_create(void *mem_ctx,
 uint32_t (*key_hash_function)(const void *key),
 bool (*key_equals_function)(const void *a,
 const void *b));
+struct hash_table *
+_mesa_hash_table_clone(struct hash_table *src, void *dst_mem_ctx);
 void _mesa_hash_table_destroy(struct hash_table *ht,
   void (*delete_function)(struct hash_entry 
*entry));
 void _mesa_hash_table_clear(struct hash_table *ht,
-- 
2.15.1
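
The idea behind the clone can be sketched on a toy open-addressing table. This is a simplified illustration, not Mesa's `struct hash_table`: the `table_*` names, the linear probing and the use of `malloc` instead of ralloc are all assumptions made for the example. Copying the slot array byte for byte is valid because each entry's position depends only on its hash and the table size, and both are preserved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified table; pointer keys, NULL key marks a free slot. */
struct entry {
   uint32_t hash;
   const void *key;
   void *data;
};

struct table {
   struct entry *slots;
   uint32_t size;     /* number of slots */
   uint32_t entries;  /* number of occupied slots */
};

static uint32_t
hash_ptr(const void *p)
{
   return (uint32_t)((uintptr_t)p >> 2) * 2654435761u;
}

struct table *
table_create(uint32_t size)
{
   struct table *t = malloc(sizeof(*t));
   if (t == NULL)
      return NULL;
   t->slots = calloc(size, sizeof(*t->slots));
   if (t->slots == NULL) {
      free(t);
      return NULL;
   }
   t->size = size;
   t->entries = 0;
   return t;
}

/* Linear probing; assumes key != NULL and that the table never fills up. */
void
table_insert(struct table *t, const void *key, void *data)
{
   uint32_t h = hash_ptr(key);
   uint32_t i = h % t->size;
   while (t->slots[i].key != NULL && t->slots[i].key != key)
      i = (i + 1) % t->size;
   if (t->slots[i].key == NULL)
      t->entries++;
   t->slots[i].hash = h;
   t->slots[i].key = key;
   t->slots[i].data = data;
}

void *
table_search(const struct table *t, const void *key)
{
   uint32_t i = hash_ptr(key) % t->size;
   while (t->slots[i].key != NULL) {
      if (t->slots[i].key == key)
         return t->slots[i].data;
      i = (i + 1) % t->size;
   }
   return NULL;
}

/* Clone without rehashing: copy the header, then the whole slot array. */
struct table *
table_clone(const struct table *src)
{
   struct table *t = malloc(sizeof(*t));
   if (t == NULL)
      return NULL;
   memcpy(t, src, sizeof(*t));
   t->slots = malloc(src->size * sizeof(*t->slots));
   if (t->slots == NULL) {
      free(t);
      return NULL;
   }
   memcpy(t->slots, src->slots, src->size * sizeof(*t->slots));
   return t;
}

void
table_destroy(struct table *t)
{
   free(t->slots);
   free(t);
}
```

The clone owns its own slot array, so inserting into it leaves the source untouched. The copy is shallow, though: keys and data pointers are shared between the two tables.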


[Mesa-dev] [PATCH 0/2] Hash table cloning for copy propagation

2018-03-12 Thread Thomas Helland

This is a revival of some old patches I had around to improve
the compile times in the glsl compiler by reducing the time
spent inserting items in the hash table in opt_copy_propagation.
I've only rebased this, as my system doesn't even want to compile
anything right now. I also don't remember if it was thoroughly
tested, so that will have to be done. Sending it out as Dave
might be interested in this to mitigate some of the overhead
his soft-double implementation incurs.

CC: Dave Airlie 

Thomas Helland (2):
  util: Implement a hash table cloning function
  glsl: Use hash table cloning in copy propagation

 src/compiler/glsl/opt_copy_propagation.cpp | 13 --
 .../glsl/opt_copy_propagation_elements.cpp | 29 --
 src/util/hash_table.c  | 22 
 src/util/hash_table.h  |  2 ++
 4 files changed, 39 insertions(+), 27 deletions(-)

-- 
2.15.1


[Mesa-dev] [PATCH v3] i965/miptree: Use cpu tiling/detiling when mapping

2018-03-12 Thread Scott D Phillips

Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)

v3: Add units to parameter names of tile_extents (Nanley Chery)
Use _mesa_align_malloc for the shadow copy (Nanley)
Continue using gtt maps on gen4 (Nanley)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 94 ---
 1 file changed, 86 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c6213b21629..fba17bf5b7b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -31,6 +31,7 @@
 #include "intel_image.h"
 #include "intel_mipmap_tree.h"
 #include "intel_tex.h"
+#include "intel_tiled_memcpy.h"
 #include "intel_blit.h"
 #include "intel_fbo.h"
 
@@ -3046,10 +3047,10 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
 }
 
 static void
-intel_miptree_map_gtt(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- struct intel_miptree_map *map,
- unsigned int level, unsigned int slice)
+intel_miptree_map_map(struct brw_context *brw,
+  struct intel_mipmap_tree *mt,
+  struct intel_miptree_map *map,
+  unsigned int level, unsigned int slice)
 {
unsigned int bw, bh;
void *base;
@@ -3093,11 +3094,81 @@ intel_miptree_map_gtt(struct brw_context *brw,
 }
 
 static void
-intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
+intel_miptree_unmap_map(struct intel_mipmap_tree *mt)
 {
intel_miptree_unmap_raw(mt);
 }
 
+/* Compute extent parameters for use with tiled_memcpy functions.
+ * xs are in units of bytes and ys are in units of strides. */
+static inline void
+tile_extents(struct intel_mipmap_tree *mt, struct intel_miptree_map *map,
+ unsigned int level, unsigned int slice, unsigned int *x1_B,
+ unsigned int *x2_B, unsigned int *y1_el, unsigned int *y2_el)
+{
+   unsigned int block_width, block_height;
+   unsigned int x0_el, y0_el;
+
+   _mesa_get_format_block_size(mt->format, &block_width, &block_height);
+
+   assert(map->x % block_width == 0);
+   assert(map->y % block_height == 0);
+
+   intel_miptree_get_image_offset(mt, level, slice, &x0_el, &y0_el);
+   *x1_B = (map->x / block_width + x0_el) * mt->cpp;
+   *y1_el = map->y / block_height + y0_el;
+   *x2_B = (DIV_ROUND_UP(map->x + map->w, block_width) + x0_el) * mt->cpp;
+   *y2_el = DIV_ROUND_UP(map->y + map->h, block_height) + y0_el;
+}
+
+static void
+intel_miptree_map_tiled_memcpy(struct brw_context *brw,
+   struct intel_mipmap_tree *mt,
+   struct intel_miptree_map *map,
+   unsigned int level, unsigned int slice)
+{
+   unsigned int x1, x2, y1, y2;
+   tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+   map->stride = _mesa_format_row_stride(mt->format, map->w);
+   map->buffer = map->ptr = _mesa_align_malloc(map->stride * (y2 - y1), 16);
+
+   assert(map->ptr);
+
+   if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
+  char *src = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+  src += mt->offset;
+
+  tiled_to_linear(x1, x2, y1, y2, map->ptr, src, map->stride,
+  mt->surf.row_pitch, brw->has_swizzling, mt->surf.tiling,
+  memcpy);
+
+  intel_miptree_unmap_raw(mt);
+   }
+}
+
+static void
+intel_miptree_unmap_tiled_memcpy(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ struct intel_miptree_map *map,
+ unsigned int level,
+ unsigned int slice)
+{
+   if (map->mode & GL_MAP_WRITE_BIT) {
+  unsigned int x1, x2, y1, y2;
+  tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+
+  char *dst = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+  dst += mt->offset;
+
+  linear_to_tiled(x1, x2, y1, y2, dst, map->ptr, mt->surf.row_pitch,
+  map->stride, brw->has_swizzling, mt->surf.tiling, 
memcpy);
+
+  intel_miptree_unmap_raw(mt);
+   }
+   _mesa_align_free(map->buffer);
+   map->buffer = map->ptr = NULL;
+}
+
 static void
 intel_miptree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
@@ -3655,8 +3726,10 @@ intel_miptree_map(struct brw_context *brw,
   (mt->surf.row_pitch % 16 == 0)) {
   intel_miptree_map_movntdqa(brw, mt, map, level, slice);
 #endif
+   } else if (mt->surf.tiling != ISL_TILING_LINEAR && brw->screen->devinfo.gen > 4) {
+  intel_miptree_map_tiled_memcpy(brw, mt,
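
The extent arithmetic in `tile_extents` above can be tried in isolation. The sketch below repeats the same formulas on plain integers; the `struct extents` wrapper and the `compute_extents` name are hypothetical and only serve to make the example self-contained. X extents come out in bytes, Y extents in rows of blocks.

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical container for the four outputs. */
struct extents {
   unsigned x1_B, x2_B;    /* horizontal range, in bytes */
   unsigned y1_el, y2_el;  /* vertical range, in element (block) rows */
};

/* map_* describe the mapped rectangle in pixels, block_* the format's block
 * size in pixels, x0_el/y0_el the image offset in elements, and cpp the
 * number of bytes per element. */
struct extents
compute_extents(unsigned map_x, unsigned map_y, unsigned map_w, unsigned map_h,
                unsigned block_w, unsigned block_h,
                unsigned x0_el, unsigned y0_el, unsigned cpp)
{
   struct extents e;

   /* The start of the rectangle must sit on a block boundary. */
   assert(map_x % block_w == 0);
   assert(map_y % block_h == 0);

   e.x1_B = (map_x / block_w + x0_el) * cpp;
   e.y1_el = map_y / block_h + y0_el;
   /* The end is rounded up so that partially covered blocks are included. */
   e.x2_B = (DIV_ROUND_UP(map_x + map_w, block_w) + x0_el) * cpp;
   e.y2_el = DIV_ROUND_UP(map_y + map_h, block_h) + y0_el;
   return e;
}
```

For an uncompressed format with 4 bytes per pixel, a 16x2 rectangle at (8, 4) gives bytes 32..96 and rows 4..6. For a 4x4 block format with 8 bytes per block and an image offset of (2, 1) elements, a 6x3 rectangle at (4, 8) gives bytes 24..40 and rows 3..4.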

Re: [Mesa-dev] [PATCH v4 1/2] gallium/winsys/kms: Fix possible leak in map/unmap.

2018-03-12 Thread Lepton Wu

Ping.  Any more comments or missing stuff to get this committed into master?

Thanks.

On Wed, Mar 7, 2018 at 2:39 PM, Lepton Wu  wrote:
> If user calls map twice for kms_sw_displaytarget, the first mapped
> buffer could get leaked. Instead of calling mmap every time, just
> reuse previous mapping. Since user could map same displaytarget with
> different flags, we have to keep two different pointers, one for rw
> mapping and one for ro mapping.
>
> Change-Id: I65308f0ff2640bd57b2577c6a3469540c9722859
> Signed-off-by: Lepton Wu 
> ---
>  .../winsys/sw/kms-dri/kms_dri_sw_winsys.c | 21 ---
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
> b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> index 22e1c936ac5..7fc40488c2e 100644
> --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
> @@ -70,6 +70,7 @@ struct kms_sw_displaytarget
>
> uint32_t handle;
> void *mapped;
> +   void *ro_mapped;
>
> int ref_count;
> struct list_head link;
> @@ -198,16 +199,19 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
>return NULL;
>
> prot = (flags == PIPE_TRANSFER_READ) ? PROT_READ : (PROT_READ | 
> PROT_WRITE);
> -   kms_sw_dt->mapped = mmap(0, kms_sw_dt->size, prot, MAP_SHARED,
> -kms_sw->fd, map_req.offset);
> -
> -   if (kms_sw_dt->mapped == MAP_FAILED)
> -  return NULL;
> +   void **ptr = (flags == PIPE_TRANSFER_READ) ? &kms_sw_dt->ro_mapped : 
> &kms_sw_dt->mapped;
> +   if (!*ptr) {
> +  void *tmp = mmap(0, kms_sw_dt->size, prot, MAP_SHARED,
> +   kms_sw->fd, map_req.offset);
> +  if (tmp == MAP_FAILED)
> + return NULL;
> +  *ptr = tmp;
> +   }
>
> DEBUG_PRINT("KMS-DEBUG: mapped buffer %u (size %u) at %p\n",
> - kms_sw_dt->handle, kms_sw_dt->size, kms_sw_dt->mapped);
> + kms_sw_dt->handle, kms_sw_dt->size, *ptr);
>
> -   return kms_sw_dt->mapped;
> +   return *ptr;
>  }
>
>  static struct kms_sw_displaytarget *
> @@ -278,9 +282,12 @@ kms_sw_displaytarget_unmap(struct sw_winsys *ws,
> struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
>
> DEBUG_PRINT("KMS-DEBUG: unmapped buffer %u (was %p)\n", 
> kms_sw_dt->handle, kms_sw_dt->mapped);
> +   DEBUG_PRINT("KMS-DEBUG: unmapped buffer %u (was %p)\n", 
> kms_sw_dt->handle, kms_sw_dt->ro_mapped);
>
> munmap(kms_sw_dt->mapped, kms_sw_dt->size);
> kms_sw_dt->mapped = NULL;
> +   munmap(kms_sw_dt->ro_mapped, kms_sw_dt->size);
> +   kms_sw_dt->ro_mapped = NULL;
>  }
>
>  static struct sw_displaytarget *
> --
> 2.16.2.395.g2e18187dfd-goog
>
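
The caching scheme in the quoted patch (one remembered pointer per access mode, created on first use) can be sketched without `mmap`. In the illustration below, `backend_map` and `backend_unmap` are hypothetical stand-ins built on `malloc` and `free`, and `map_calls` exists only so the reuse is observable. Unlike the quoted unmap hunk, this sketch checks each pointer before releasing it; that is a choice made for the example, not a statement about what the patch needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum access { ACCESS_READ, ACCESS_READ_WRITE };

/* Hypothetical display target with one cached mapping per access mode. */
struct target {
   size_t size;
   void *mapped;     /* cached read-write mapping, NULL if none */
   void *ro_mapped;  /* cached read-only mapping, NULL if none */
   int map_calls;    /* how often the backend was actually asked to map */
};

/* Stand-in for mmap(): hands out a fresh buffer on every call. */
static void *
backend_map(struct target *t)
{
   t->map_calls++;
   return malloc(t->size);
}

/* Stand-in for munmap(). */
static void
backend_unmap(void *p)
{
   free(p);
}

/* Return the cached mapping for this access mode, creating it on first use. */
void *
target_map(struct target *t, enum access a)
{
   void **slot = (a == ACCESS_READ) ? &t->ro_mapped : &t->mapped;

   if (*slot == NULL) {
      void *tmp = backend_map(t);
      if (tmp == NULL)
         return NULL;   /* leave the cache untouched on failure */
      *slot = tmp;
   }
   return *slot;
}

/* Release whichever mappings exist and reset both cache slots. */
void
target_unmap(struct target *t)
{
   if (t->mapped != NULL) {
      backend_unmap(t->mapped);
      t->mapped = NULL;
   }
   if (t->ro_mapped != NULL) {
      backend_unmap(t->ro_mapped);
      t->ro_mapped = NULL;
   }
}
```

Mapping twice with the same mode returns the same pointer, so no earlier mapping is overwritten and lost. A read-only and a read-write request each get their own mapping.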

Re: [Mesa-dev] [v4 PATCH 3/6] spirv_extensions: add list of extensions and to_string method

2018-03-12 Thread Dylan Baker

Adding Jason and Ian here for their opinions.

Quoting Alejandro Piñeiro (2018-03-12 01:31:02)
> On 11/03/18 18:08, Dylan Baker wrote:
> > Quoting Alejandro Piñeiro (2018-03-08 07:00:16)
> >> Ideally this should be generated somehow. One option would be to gather
> >> all the extension dependencies listed on the core grammar, but there
> >> would be the possibility of not including some of the extensions.
> >>
> >> Note that spirv-tools is doing it just slightly better, as it has a
> >> hardcoded list of extensions manually taken from the registry, that
> >> they parse to get the enum and the to_string method (see
> >> generate_grammar_tables.py).
> > If there were extensions not in the core grammar that we wanted to or 
> > needed to
> > support, are they available in a different format that is still machine
> > readable?
> 
> Taking a look at the latest version of the core grammar [1], it seems that
> all the extensions except SPV_AMD_gcn_shader are now part of the core. For
> the latter, I found a json file as part of spirv-tools [3]
> 
> But when I wrote this patch, some of the extensions were not part of the
> core, and as far as I saw, they were just listed on the registry [2]. I
> was not able to find an individual json core grammar for some extensions
> then. In the commit message I mention generate_grammar_tables. From a
> comment there:
> 
>  #Extensions to recognize, but which don't necessarily come from the SPIR-V
>  #core grammar. Get this list from the SPIR-V registery web page.
> 
> So right now one option would be to create that list from the core grammar
> plus the AMD grammar one. But what would happen if a new extension is
> defined without a grammar file? Would we just write one ourselves? Would
> we ask khronos (or whoever defined the spec) to provide one?
> 
> BR
> 
> 
> [1]
> https://github.com/KhronosGroup/SPIRV-Headers/blob/master/include/spirv/1.0/spirv.core.grammar.json
> [2] https://www.khronos.org/registry/spir-v/extensions/KHR/
> [3]
> https://github.com/KhronosGroup/SPIRV-Tools/blob/master/source/extinst.spv-amd-gcn-shader.grammar.json
> 
> 
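
One way to keep a hand-maintained extension list, its enum and its to_string method in sync is an X-macro, sketched below. This is only an illustration of the idea discussed above, not what the patch or spirv-tools does; the three extension names were picked as examples, and `SPV_EXTENSION_LIST`, `spv_extension_to_string` and `SPV_EXTENSIONS_COUNT` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Single hand-maintained list; adding a line here updates both the enum
 * and the string conversion below. */
#define SPV_EXTENSION_LIST(ENTRY)     \
   ENTRY(SPV_KHR_16bit_storage)          \
   ENTRY(SPV_KHR_shader_draw_parameters) \
   ENTRY(SPV_AMD_gcn_shader)

/* Expand each entry into an enumerator. */
#define AS_ENUM(name) name,
enum spv_extension {
   SPV_EXTENSION_LIST(AS_ENUM)
   SPV_EXTENSIONS_COUNT
};
#undef AS_ENUM

/* Expand each entry into a case that returns the enumerator's own name. */
#define AS_CASE(name) case name: return #name;
const char *
spv_extension_to_string(enum spv_extension ext)
{
   switch (ext) {
   SPV_EXTENSION_LIST(AS_CASE)
   default:
      return "unknown";
   }
}
#undef AS_CASE
```

A generator script reading a grammar file could emit the same list, so the hand-written and generated variants need not differ in how the rest of the code consumes them.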


