Re: [Mesa-dev] radeonsi: NIR - Polaris triangle sprinkling running UH SOLVED - finally

2019-02-15 Thread Timothy Arceri



On 13/2/19 8:28 am, Dieter Nützel wrote:

Hello Marek, Timo, Nicolai,

Timo SOLVED this long-standing NIR corruption on Polaris with his 'nir: 
rewrite varying component packing' commit.


It was triggered with

commit 86b52d42368ac496fe24bc6674e754c323381635
Author: Marek Olšák 
Date:   Fri Jul 13 00:23:36 2018 -0400

     radeonsi: reduce LDS stalls by 40% for tessellation

     40% is the decrease in the LGKM counter (which includes SMEM too)
     for the GFX9 LSHS stage.

     This will make the LDS size slightly larger, but I wasn't able to increase
     the patch stride without corruption, so I'm increasing the vertex stride.


and now finally SOLVED with

commit 26aa460940f6222565ad5eb40a21c2377c59c3a6
Author: Timothy Arceri 
Date:   Mon Dec 10 10:23:51 2018 +1100

     nir: rewrite varying component packing

     There are a number of reasons for the rewrite.

     1. Adding support for packing tess patch varyings in a sane way.

     2. Making use of qsort allowing the code to be much easier to
    follow.

     3. Fixes a bug where different interp types caused component
    packing to be skipped for all varyings in some scenarios.

     4. Allows us to add a crude live range analysis for deciding
    which components should be packed together. This support can
    optionally be added in a future patch.

     Reviewed-by: Jason Ekstrand 

Maybe it should be backported (Cc: ) for 19.0?


I'd rather it wasn't, since NIR isn't the default for radeonsi and this code
is used by other drivers like RADV. I'd rather have more testing in the dev
branch.


This change really shouldn't fix anything, it could just be that the 
change avoids the bug. I'm really not sure.


One thing this change did fix was a bug with doubles, but I wasn't aware 
of anything actually using doubles so I doubt this is it.




I hope my bisect helps to bring some more understanding of this Polaris
NIR bug.


Now, hunting for the (last) 19.0+ EQAA regression (DiRT Rally, black
squares like radv/DXVK corruption, NOT NIR related) and the 'meson' OpenCL
(Clover) build error.


Greetings,
Dieter

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 31/41] ac/nir: implement 16-bit pack/unpack opcodes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index bad1c2a990e..f6ad1aa7e77 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1015,6 +1015,30 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
break;
}
 
+   case nir_op_pack_32_2x16_split: {
+   LLVMValueRef tmp = ac_build_gather_values(&ctx->ac, src, 2);
+   result = LLVMBuildBitCast(ctx->ac.builder, tmp, ctx->ac.i32, "");
+   break;
+   }
+
+   case nir_op_unpack_32_2x16_split_x: {
+   LLVMValueRef tmp = LLVMBuildBitCast(ctx->ac.builder, src[0],
+   ctx->ac.v2i16,
+   "");
+   result = LLVMBuildExtractElement(ctx->ac.builder, tmp,
+ctx->ac.i32_0, "");
+   break;
+   }
+
+   case nir_op_unpack_32_2x16_split_y: {
+   LLVMValueRef tmp = LLVMBuildBitCast(ctx->ac.builder, src[0],
+   ctx->ac.v2i16,
+   "");
+   result = LLVMBuildExtractElement(ctx->ac.builder, tmp,
+ctx->ac.i32_1, "");
+   break;
+   }
+
case nir_op_cube_face_coord: {
src[0] = ac_to_float(&ctx->ac, src[0]);
LLVMValueRef results[2];
-- 
2.20.1
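
For readers following along, the behaviour these three opcodes are expected to have can be sketched in plain C. This is only an illustration, not Mesa code: the function names simply mirror the NIR opcode names, check_roundtrip is a made-up helper, and the sketch assumes the x half sits in the low 16 bits (which is what bitcasting a two-element i16 vector to i32 gives on a little-endian target).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins named after the NIR opcodes; not Mesa code. */

/* Combine two 16-bit halves into one 32-bit word; x is assumed to be the low half. */
static uint32_t pack_32_2x16_split(uint16_t x, uint16_t y)
{
    return (uint32_t)x | ((uint32_t)y << 16);
}

/* Low 16 bits (element 0 of the v2i16 view). */
static uint16_t unpack_32_2x16_split_x(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* High 16 bits (element 1 of the v2i16 view). */
static uint16_t unpack_32_2x16_split_y(uint32_t v)
{
    return (uint16_t)(v >> 16);
}

/* Usage (hypothetical helper): packing then unpacking returns both halves. */
static void check_roundtrip(uint16_t x, uint16_t y)
{
    uint32_t packed = pack_32_2x16_split(x, y);
    assert(unpack_32_2x16_split_x(packed) == x);
    assert(unpack_32_2x16_split_y(packed) == y);
}
```

Calling check_roundtrip(0x1234, 0xabcd) exercises all three helpers at once.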


[Mesa-dev] [PATCH v2 32/41] ac/nir: add 8-bit types to glsl_base_to_llvm_type

2019-02-15 Thread Rhys Perry
v2: remove 16-bit additions and rebase

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index f6ad1aa7e77..defbfdf4297 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3969,6 +3969,9 @@ glsl_base_to_llvm_type(struct ac_llvm_context *ac,
case GLSL_TYPE_BOOL:
case GLSL_TYPE_SUBROUTINE:
return ac->i32;
+   case GLSL_TYPE_INT8:
+   case GLSL_TYPE_UINT8:
+   return ac->i8;
case GLSL_TYPE_INT16:
case GLSL_TYPE_UINT16:
return ac->i16;
-- 
2.20.1


[Mesa-dev] [PATCH v2 25/41] nir: make bitfield_reverse and ifind_msb work with all integers

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/compiler/nir/nir_opcodes.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index dc4cd9ac63d..0f40bd6c548 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -350,7 +350,7 @@ unop_convert("unpack_64_2x32_split_y", tuint32, tuint64, "src0 >> 32")
 # Bit operations, part of ARB_gpu_shader5.
 
 
-unop("bitfield_reverse", tuint32, """
+unop("bitfield_reverse", tuint, """
 /* we're not winning any awards for speed here, but that's ok */
 dst = 0;
 for (unsigned bit = 0; bit < 32; bit++)
@@ -374,7 +374,7 @@ for (int bit = bit_size - 1; bit >= 0; bit--) {
 }
 """)
 
-unop("ifind_msb", tint32, """
+unop_convert("ifind_msb", tint32, tint, """
 dst = -1;
 for (int bit = 31; bit >= 0; bit--) {
/* If src0 < 0, we're looking for the first 0 bit.
-- 
2.20.1
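
As a rough illustration of what width-generic versions of these two opcodes compute, here is a small C sketch. It is not the NIR constant-folding code: the explicit bit_size parameter and the 64-bit carrier types are assumptions made for the example, and ifind_msb expects its input already sign-extended to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not Mesa code: bit_size is passed explicitly. */

/* Reverse the lowest bit_size bits of src. */
static uint64_t bitfield_reverse(uint64_t src, unsigned bit_size)
{
    assert(bit_size >= 1 && bit_size <= 64);
    uint64_t dst = 0;
    for (unsigned bit = 0; bit < bit_size; bit++)
        dst |= ((src >> bit) & 1u) << (bit_size - 1 - bit);
    return dst;
}

/* Index of the most significant "interesting" bit, or -1 if there is none.
 * For negative inputs that is the first 0 bit, otherwise the first 1 bit.
 * src must be sign-extended from bit_size bits. */
static int32_t ifind_msb(int64_t src, unsigned bit_size)
{
    assert(bit_size >= 1 && bit_size <= 64);
    for (int bit = (int)bit_size - 1; bit >= 0; bit--) {
        int is_set = (int)(((uint64_t)src >> bit) & 1u);
        if (is_set != (src < 0))
            return bit;
    }
    return -1;
}
```

The result of ifind_msb stays a 32-bit integer regardless of the input width, which matches the switch to unop_convert with a tint32 destination in the patch.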


[Mesa-dev] [PATCH v2 33/41] ac/nir, radv: create an array of varying output types

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c   | 68 +++
 src/amd/common/ac_shader_abi.h|  1 +
 src/amd/vulkan/radv_nir_to_llvm.c |  3 ++
 3 files changed, 72 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index defbfdf4297..5821c18aeb1 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4238,6 +4238,68 @@ static void visit_cf_list(struct ac_nir_context *ctx,
}
 }
 
+static unsigned traverse_var_component_slots(struct ac_llvm_context *ctx, bool vs_in,
+struct nir_variable *var, unsigned cur_offset,
+const struct glsl_type *cur_type,
+void (*cb)(struct ac_llvm_context *, unsigned, enum glsl_base_type, void *),
+void *cbdata)
+{
+   if (glsl_type_is_struct(cur_type)) {
+   for (unsigned i = 0; i < glsl_get_length(cur_type); i++) {
+   const struct glsl_type *ft = glsl_get_struct_field(cur_type, i);
+   cur_offset = traverse_var_component_slots(ctx, vs_in, var, cur_offset, ft, cb, cbdata);
+   }
+   return (cur_offset + 3) / 4 * 4;
+   }
+
+   enum glsl_base_type base_type = glsl_get_base_type(glsl_without_array_or_matrix(cur_type));
+
+   unsigned stride = glsl_get_component_slots(glsl_without_array_or_matrix(cur_type));
+   if (!var->data.compact)
+   stride = (stride + 3) / 4 * 4;
+   unsigned arr_len = MAX2(glsl_get_matrix_columns(cur_type), 1);
+   if (glsl_type_is_array(cur_type))
+   arr_len *= glsl_get_aoa_size(cur_type);
+   for (unsigned i = 0; i < arr_len; i++) {
+   for (unsigned j = 0; j < glsl_get_component_slots(glsl_without_array_or_matrix(cur_type)); j++) {
+   cb(ctx, cur_offset + var->data.location_frac + j, base_type, cbdata);
+   }
+   cur_offset += stride;
+   }
+   return cur_offset;
+}
+
+static void setup_output_type(struct ac_llvm_context *ctx, unsigned index, enum glsl_base_type base, void *output_types)
+{
+   LLVMTypeRef type;
+   switch (base) {
+   case GLSL_TYPE_INT8:
+   case GLSL_TYPE_UINT8:
+   type = ctx->i8;
+   break;
+   case GLSL_TYPE_INT16:
+   case GLSL_TYPE_UINT16:
+   type = ctx->i16;
+   break;
+   case GLSL_TYPE_FLOAT16:
+   type = ctx->f16;
+   break;
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_BOOL:
+   case GLSL_TYPE_INT64:
+   case GLSL_TYPE_UINT64:
+   type = ctx->i32;
+   break;
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_DOUBLE:
+   default:
+   type = ctx->f32;
+   break;
+   }
+   ((LLVMTypeRef*)output_types)[index] = type;
+}
+
 void
 ac_handle_shader_output_decl(struct ac_llvm_context *ctx,
 struct ac_shader_abi *abi,
@@ -4275,6 +4337,9 @@ ac_handle_shader_output_decl(struct ac_llvm_context *ctx,
   ac_build_alloca_undef(ctx, type, "");
}
}
+
+   traverse_var_component_slots(ctx, false, variable, output_loc * 4,
+variable->type, &setup_output_type, abi->output_types);
 }
 
 static void
@@ -4328,6 +4393,9 @@ void ac_nir_translate(struct ac_llvm_context *ac, struct ac_shader_abi *abi,
 
ctx.main_function = LLVMGetBasicBlockParent(LLVMGetInsertBlock(ctx.ac.builder));
 
+   for (unsigned i = 0; i < AC_LLVM_MAX_OUTPUTS * 4; i++)
+   ctx.abi->output_types[i] = ac->i32;
+
nir_foreach_variable(variable, &nir->outputs)
ac_handle_shader_output_decl(&ctx.ac, ctx.abi, nir, variable,
 ctx.stage);
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index ee18e6c1923..274deeb13a4 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -69,6 +69,7 @@ struct ac_shader_abi {
LLVMValueRef view_index;
 
LLVMValueRef outputs[AC_LLVM_MAX_OUTPUTS * 4];
+   LLVMTypeRef output_types[AC_LLVM_MAX_OUTPUTS * 4];
 
/* For VS and PS: pre-loaded shader inputs.
 *
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index d3795eec403..8fdaee72036 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -3910,6 +3910,9 @@ radv_compile_gs_copy_shader(struct ac_llvm_compiler *ac_llvm,
ctx.gs_max_out_vertices = geom_shader->info.gs.vertices_out;
ac_setup_rings(&ctx);
 
+   for (unsigned i = 0; i < AC_LLVM_MAX_OUTPUTS * 4; i++)
+   ctx.abi.output_types[i] = ctx.ac.i32;
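
The slot arithmetic in traverse_var_component_slots relies on rounding component counts up to whole vec4 slots unless the variable is compact. A minimal C sketch of just that arithmetic follows; align_to_vec4 and advance_offset are hypothetical names introduced for illustration and do not exist in Mesa.

```c
#include <assert.h>

/* Round a component count up to the next multiple of 4 (one vec4 slot). */
static unsigned align_to_vec4(unsigned components)
{
    return (components + 3) / 4 * 4;
}

/* Hypothetical helper: component offset just past `len` array elements that
 * each use `components` components, starting at `offset`. `compact` mirrors
 * var->data.compact: compact variables are not padded to vec4 slots. */
static unsigned advance_offset(unsigned offset, unsigned components,
                               unsigned len, int compact)
{
    assert(components > 0);
    unsigned stride = compact ? components : align_to_vec4(components);
    return offset + stride * len;
}
```

For example, two vec3 elements occupy 8 component slots when padded, while six compact scalars occupy 6.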

[Mesa-dev] [PATCH] panfrost: Fix clipping region

2019-02-15 Thread Alyssa Rosenzweig
Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index 97df92258da..a4d0719fdc5 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -497,10 +497,17 @@ panfrost_viewport(struct panfrost_context *ctx,
  * (somewhat) asymmetric ints. */
 
 struct mali_viewport ret = {
-.clip_minx = viewport_x0,
-.clip_miny = viewport_y0,
-.clip_maxx = viewport_x1,
-.clip_maxy = viewport_x1,
+/* By default, do no viewport clipping, i.e. clip to (-inf,
+ * inf) in each direction. Clipping to the viewport in theory
+ * should work, but in practice causes issues when we're not
+ * explicitly trying to scissor */
+
+.clip_minx = -inff,
+.clip_miny = -inff,
+.clip_maxx = inff,
+.clip_maxy = inff,
+
+/* We always perform depth clipping (TODO: Can this be disabled?) */
 
 .clip_minz = depth_clip_near,
 .clip_maxz = depth_clip_far,
-- 
2.20.1
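
To show why clipping to (-inf, inf) amounts to no clipping at all, here is a small standalone C sketch. The struct and function names are hypothetical stand-ins for the clip fields of struct mali_viewport and are not panfrost code; INFINITY comes from the C standard math.h header.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in for the clip_min/clip_max x/y fields. */
struct clip_rect {
    float minx, miny, maxx, maxy;
};

/* A clip rectangle that rejects nothing in x or y. */
static struct clip_rect unbounded_clip(void)
{
    struct clip_rect r = { -INFINITY, -INFINITY, INFINITY, INFINITY };
    return r;
}

/* Returns nonzero when (x, y) lies inside the clip rectangle. */
static int clip_contains(const struct clip_rect *r, float x, float y)
{
    assert(r != NULL);
    return x >= r->minx && x <= r->maxx && y >= r->miny && y <= r->maxy;
}
```

Every finite coordinate passes clip_contains for the rectangle returned by unbounded_clip, so any scissoring has to come from elsewhere.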


[Mesa-dev] [PATCH v2 35/41] radv: store all fragment shader inputs as f32

2019-02-15 Thread Rhys Perry
v2: rebase

Signed-off-by: Rhys Perry 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index 2002a744545..01b8b097ea1 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2056,7 +2056,6 @@ static void interp_fs_input(struct radv_shader_context *ctx,
LLVMValueRef attr_number;
unsigned chan;
LLVMValueRef i, j;
-   bool interp = !LLVMIsUndef(interp_param);
 
attr_number = LLVMConstInt(ctx->ac.i32, attr, false);
 
@@ -2070,7 +2069,7 @@ static void interp_fs_input(struct radv_shader_context *ctx,
 * fs.interp cannot be used on integers, because they can be equal
 * to NaN.
 */
-   if (interp) {
+   if (interp_param) {
interp_param = LLVMBuildBitCast(ctx->ac.builder, interp_param,
ctx->ac.v2f32, "");
 
@@ -2083,7 +2082,7 @@ static void interp_fs_input(struct radv_shader_context *ctx,
for (chan = 0; chan < 4; chan++) {
LLVMValueRef llvm_chan = LLVMConstInt(ctx->ac.i32, chan, false);
 
-   if (interp) {
+   if (interp_param) {
result[chan] = ac_build_fs_interp(&ctx->ac,
  llvm_chan,
  attr_number,
@@ -2095,7 +2094,6 @@ static void interp_fs_input(struct radv_shader_context *ctx,
  attr_number,
  prim_mask);
result[chan] = LLVMBuildBitCast(ctx->ac.builder, result[chan], ctx->ac.i32, "");
-   result[chan] = LLVMBuildTruncOrBitCast(ctx->ac.builder, result[chan], LLVMTypeOf(interp_param), "");
}
}
 }
@@ -2123,10 +2121,6 @@ handle_fs_input_decl(struct radv_shader_context *ctx,
 
interp = lookup_interp_param(&ctx->abi, variable->data.interpolation, interp_type);
}
-   bool is_16bit = glsl_type_is_16bit(glsl_without_array(variable->type));
-   LLVMTypeRef type = is_16bit ? ctx->ac.i16 : ctx->ac.i32;
-   if (interp == NULL)
-   interp = LLVMGetUndef(type);
 
for (unsigned i = 0; i < attrib_count; ++i)
ctx->inputs[ac_llvm_reg_index_soa(idx + i, 0)] = interp;
@@ -2187,7 +2181,7 @@ handle_fs_inputs(struct radv_shader_context *ctx,
if (ctx->shader_info->info.ps.uses_input_attachments ||
ctx->shader_info->info.needs_multiview_view_index) {
ctx->input_mask |= 1ull << VARYING_SLOT_LAYER;
-   ctx->inputs[ac_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)] = LLVMGetUndef(ctx->ac.i32);
+   ctx->inputs[ac_llvm_reg_index_soa(VARYING_SLOT_LAYER, 0)] = NULL;
}
 
for (unsigned i = 0; i < RADEON_LLVM_MAX_INPUTS; ++i) {
@@ -2203,7 +2197,7 @@ handle_fs_inputs(struct radv_shader_context *ctx,
interp_fs_input(ctx, index, interp_param, ctx->abi.prim_mask,
inputs);
 
-   if (LLVMIsUndef(interp_param))
+   if (!interp_param)
ctx->shader_info->fs.flat_shaded_mask |= 1u << index;
if (i >= VARYING_SLOT_VAR0)
ctx->abi.fs_input_attr_indices[i - VARYING_SLOT_VAR0] = index;
-- 
2.20.1
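
The patch switches the "no interpolation" marker from an LLVM undef value to a plain NULL pointer, which is then used to build the flat-shaded mask. The idea can be sketched in standalone C as below. The function and its parameters are hypothetical and only model the NULL check; the real code indexes the mask with a running input counter rather than the array position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: params[i] == NULL means input i has no interpolation
 * parameter and is therefore treated as flat shaded. */
static uint32_t flat_shaded_mask(const void *const *params, unsigned count)
{
    assert(count <= 32);
    uint32_t mask = 0;
    for (unsigned i = 0; i < count; i++) {
        if (!params[i])
            mask |= 1u << i;
    }
    return mask;
}
```

With four inputs where the second and fourth have no interpolation parameter, the resulting mask is 0xa.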


Re: [Mesa-dev] [PATCH v2 37/41] radv, ac: implement 16-bit interpolation

2019-02-15 Thread Rhys Perry
This patch can be ignored. I forgot to delete it and it ended up getting sent.
"[PATCH v2 37/41] WIP: radv, ac: implement 16-bit interpolation" is
the correct one.

On Sat, 16 Feb 2019 at 00:23, Rhys Perry  wrote:
>
> v2: add to patch series
>
> Signed-off-by: Rhys Perry 
> ---
>  src/amd/common/ac_llvm_build.c   | 33 +---
>  src/amd/common/ac_llvm_build.h   |  3 ++-
>  src/amd/common/ac_nir_to_llvm.c  | 14 +++---
>  src/amd/vulkan/radv_nir_to_llvm.c| 27 ++-
>  src/amd/vulkan/radv_pipeline.c   | 19 --
>  src/amd/vulkan/radv_shader.h |  1 +
>  src/gallium/drivers/radeonsi/si_shader.c |  2 +-
>  7 files changed, 69 insertions(+), 30 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index dff369aae7f..be2c2251a21 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -937,27 +937,40 @@ ac_build_fs_interp(struct ac_llvm_context *ctx,
>LLVMValueRef attr_number,
>LLVMValueRef params,
>LLVMValueRef i,
> -  LLVMValueRef j)
> +  LLVMValueRef j,
> +  int word)
>  {
> -   LLVMValueRef args[5];
> +   LLVMValueRef args[6];
> LLVMValueRef p1;
>
> args[0] = i;
> args[1] = llvm_chan;
> args[2] = attr_number;
> -   args[3] = params;
> -
> -   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1",
> -   ctx->f32, args, 4, AC_FUNC_ATTR_READNONE);
> +   if (word >= 0) {
> +   args[3] = LLVMConstInt(ctx->i1, word, false);
> +   args[4] = params;
> +   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1.f16",
> +   ctx->f16, args, 5, AC_FUNC_ATTR_READNONE);
> +   } else {
> +   args[3] = params;
> +   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1",
> +   ctx->f32, args, 4, AC_FUNC_ATTR_READNONE);
> +   }
>
> args[0] = p1;
> args[1] = j;
> args[2] = llvm_chan;
> args[3] = attr_number;
> -   args[4] = params;
> -
> -   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2",
> - ctx->f32, args, 5, AC_FUNC_ATTR_READNONE);
> +   if (word >= 0) {
> +   args[4] = LLVMConstInt(ctx->i1, word, false);
> +   args[5] = params;
> +   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2.f16",
> + ctx->f16, args, 6, AC_FUNC_ATTR_READNONE);
> +   } else {
> +   args[4] = params;
> +   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2",
> + ctx->f32, args, 5, AC_FUNC_ATTR_READNONE);
> +   }
>  }
>
>  LLVMValueRef
> diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> index 61c9b5e4b6c..655427567c4 100644
> --- a/src/amd/common/ac_llvm_build.h
> +++ b/src/amd/common/ac_llvm_build.h
> @@ -224,7 +224,8 @@ ac_build_fs_interp(struct ac_llvm_context *ctx,
>LLVMValueRef attr_number,
>LLVMValueRef params,
>LLVMValueRef i,
> -  LLVMValueRef j);
> +  LLVMValueRef j,
> +  int word);
>
>  LLVMValueRef
>  ac_build_fs_interp_mov(struct ac_llvm_context *ctx,
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index bf7024c68e4..939b8eb13de 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -3120,8 +3120,15 @@ static LLVMValueRef visit_interp(struct ac_nir_context *ctx,
> LLVMValueRef j = LLVMBuildExtractElement(
> ctx->ac.builder, interp_param, ctx->ac.i32_1, "");
>
> +   /* This fp16 handling isn't technically correct
> +* but should be correct for the attributes we
> +* are actually going to use. */
> +   bool fp16 = instr->dest.ssa.bit_size == 16;
> +   int word = fp16 ? 0 : -1;
> v = ac_build_fs_interp(&ctx->ac, llvm_chan, attr_number,
> -  ctx->abi->prim_mask, i, j);
> +  ctx->abi->prim_mask, i, j, word);
> +   if (fp16)
> +   v = ac_build_reinterpret(&ctx->ac, v, ctx->ac.f32);
> } else {
> v = ac_build_fs_interp_mov(&ctx->ac, LLVMConstInt(ctx->ac.i32, 2, false),
>  

[Mesa-dev] [PATCH v2 34/41] ac/nir: store all outputs as f32

2019-02-15 Thread Rhys Perry
v2: rebase
v2: fix 64-bit visit_load_var()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c   | 14 ++
 src/amd/vulkan/radv_nir_to_llvm.c | 22 +-
 2 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 5821c18aeb1..bf7024c68e4 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2114,7 +2114,10 @@ static LLVMValueRef visit_load_var(struct ac_nir_context *ctx,
unreachable("unhandle variable mode");
}
ret = ac_build_varying_gather_values(&ctx->ac, values, ve, comp);
-   return LLVMBuildBitCast(ctx->ac.builder, ret, get_def_type(ctx, &instr->dest.ssa), "");
+   if (instr->dest.ssa.bit_size == 16)
+   return ac_build_reinterpret(&ctx->ac, ret, get_def_type(ctx, &instr->dest.ssa));
+   else
+   return LLVMBuildBitCast(ctx->ac.builder, ret, get_def_type(ctx, &instr->dest.ssa), "");
 }
 
 static void
@@ -2152,6 +2155,11 @@ visit_store_var(struct ac_nir_context *ctx,
 
writemask = writemask << comp;
 
+   LLVMTypeRef type = ctx->ac.f32;
+   if (LLVMGetTypeKind(LLVMTypeOf(src)) == LLVMVectorTypeKind)
+   type = LLVMVectorType(ctx->ac.f32, LLVMGetVectorSize(LLVMTypeOf(src)));
+   src = ac_build_reinterpret(&ctx->ac, src, type);
+
switch (deref->mode) {
case nir_var_shader_out:
 
@@ -4329,12 +4337,10 @@ ac_handle_shader_output_decl(struct ac_llvm_context *ctx,
}
}
 
-   bool is_16bit = glsl_type_is_16bit(glsl_without_array(variable->type));
-   LLVMTypeRef type = is_16bit ? ctx->f16 : ctx->f32;
for (unsigned i = 0; i < attrib_count; ++i) {
for (unsigned chan = 0; chan < 4; chan++) {
abi->outputs[ac_llvm_reg_index_soa(output_loc + i, chan)] =
-  ac_build_alloca_undef(ctx, type, "");
+  ac_build_alloca_undef(ctx, ctx->f32, "");
}
}
 
diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c
index 8fdaee72036..2002a744545 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2305,6 +2305,7 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
 
bool is_16bit = ac_get_type_size(LLVMTypeOf(values[0])) == 2;
if (ctx->stage == MESA_SHADER_FRAGMENT) {
+   bool is_16bit = ac_get_type_size(LLVMTypeOf(values[0])) == 2;
unsigned index = target - V_008DFC_SQ_EXP_MRT;
unsigned col_format = (ctx->options->key.fs.col_format >> (4 * index)) & 0xf;
bool is_int8 = (ctx->options->key.fs.is_int8 >> index) & 1;
@@ -2421,16 +2422,8 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
return;
}
 
-   if (is_16bit) {
-   for (unsigned chan = 0; chan < 4; chan++) {
-   values[chan] = LLVMBuildBitCast(ctx->ac.builder, values[chan], ctx->ac.i16, "");
-   args->out[chan] = LLVMBuildZExt(ctx->ac.builder, values[chan], ctx->ac.i32, "");
-   }
-   } else
-   memcpy(&args->out[0], values, sizeof(values[0]) * 4);
-
-   for (unsigned i = 0; i < 4; ++i)
-   args->out[i] = ac_to_float(&ctx->ac, args->out[i]);
+   for (unsigned chan = 0; chan < 4; chan++)
+   args->out[chan] = ac_build_reinterpret(&ctx->ac, values[chan], ctx->ac.f32);
 }
 
 static void
@@ -3137,9 +3130,12 @@ handle_fs_outputs_post(struct radv_shader_context *ctx)
if (i < FRAG_RESULT_DATA0)
continue;
 
-   for (unsigned j = 0; j < 4; j++)
-   values[j] = ac_to_float(&ctx->ac,
-   radv_load_output(ctx, i, j));
+   for (unsigned j = 0; j < 4; j++) {
+   values[j] = radv_load_output(ctx, i, j);
+   unsigned index = ac_llvm_reg_index_soa(i, 0);
+   LLVMTypeRef new_type = ctx->abi.output_types[index];
+   values[j] = ac_build_reinterpret(&ctx->ac, values[j], new_type);
+   }
 
bool ret = si_export_mrt_color(ctx, values,
   i - FRAG_RESULT_DATA0,
-- 
2.20.1
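
The removed is_16bit branch zero-extended 16-bit values to 32 bits before treating them as floats. Assuming ac_build_reinterpret performs an equivalent widening (the helper itself is not shown in this patch, so that is a guess), the idea of keeping a 16-bit payload in an f32 slot looks roughly like this in plain C. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Hypothetical: place a 16-bit payload in the low half of an f32 slot. */
static float store_u16_in_f32(uint16_t bits)
{
    uint32_t wide = bits;                /* zero-extend to 32 bits */
    float slot;
    memcpy(&slot, &wide, sizeof(slot));  /* reinterpret the bits, no conversion */
    return slot;
}

/* Hypothetical: recover the 16-bit payload from the slot. */
static uint16_t load_u16_from_f32(float slot)
{
    uint32_t wide;
    memcpy(&wide, &slot, sizeof(wide));
    assert((wide >> 16) == 0);           /* upper half was zero-filled on store */
    return (uint16_t)wide;
}
```

The point of the sketch is that no numeric conversion happens in either direction; only the container type changes.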


[Mesa-dev] [PATCH v2 37/41] WIP: radv, ac: implement 16-bit interpolation

2019-02-15 Thread Rhys Perry
v2: add to patch series

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c   | 33 +---
 src/amd/common/ac_llvm_build.h   |  3 ++-
 src/amd/common/ac_nir_to_llvm.c  | 14 +++---
 src/amd/vulkan/radv_nir_to_llvm.c| 27 ++-
 src/amd/vulkan/radv_pipeline.c   | 19 --
 src/amd/vulkan/radv_shader.h |  1 +
 src/gallium/drivers/radeonsi/si_shader.c |  2 +-
 7 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index dff369aae7f..be2c2251a21 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -937,27 +937,40 @@ ac_build_fs_interp(struct ac_llvm_context *ctx,
   LLVMValueRef attr_number,
   LLVMValueRef params,
   LLVMValueRef i,
-  LLVMValueRef j)
+  LLVMValueRef j,
+  int word)
 {
-   LLVMValueRef args[5];
+   LLVMValueRef args[6];
LLVMValueRef p1;
 
args[0] = i;
args[1] = llvm_chan;
args[2] = attr_number;
-   args[3] = params;
-
-   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1",
-   ctx->f32, args, 4, AC_FUNC_ATTR_READNONE);
+   if (word >= 0) {
+   args[3] = LLVMConstInt(ctx->i1, word, false);
+   args[4] = params;
+   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1.f16",
+   ctx->f16, args, 5, AC_FUNC_ATTR_READNONE);
+   } else {
+   args[3] = params;
+   p1 = ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p1",
+   ctx->f32, args, 4, AC_FUNC_ATTR_READNONE);
+   }
 
args[0] = p1;
args[1] = j;
args[2] = llvm_chan;
args[3] = attr_number;
-   args[4] = params;
-
-   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2",
- ctx->f32, args, 5, AC_FUNC_ATTR_READNONE);
+   if (word >= 0) {
+   args[4] = LLVMConstInt(ctx->i1, word, false);
+   args[5] = params;
+   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2.f16",
+ ctx->f16, args, 6, AC_FUNC_ATTR_READNONE);
+   } else {
+   args[4] = params;
+   return ac_build_intrinsic(ctx, "llvm.amdgcn.interp.p2",
+ ctx->f32, args, 5, AC_FUNC_ATTR_READNONE);
+   }
 }
 
 LLVMValueRef
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index 61c9b5e4b6c..655427567c4 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -224,7 +224,8 @@ ac_build_fs_interp(struct ac_llvm_context *ctx,
   LLVMValueRef attr_number,
   LLVMValueRef params,
   LLVMValueRef i,
-  LLVMValueRef j);
+  LLVMValueRef j,
+  int word);
 
 LLVMValueRef
 ac_build_fs_interp_mov(struct ac_llvm_context *ctx,
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index bf7024c68e4..939b8eb13de 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3120,8 +3120,15 @@ static LLVMValueRef visit_interp(struct ac_nir_context *ctx,
LLVMValueRef j = LLVMBuildExtractElement(
ctx->ac.builder, interp_param, ctx->ac.i32_1, "");
 
+   /* This fp16 handling isn't technically correct
+* but should be correct for the attributes we
+* are actually going to use. */
+   bool fp16 = instr->dest.ssa.bit_size == 16;
+   int word = fp16 ? 0 : -1;
v = ac_build_fs_interp(&ctx->ac, llvm_chan, attr_number,
-  ctx->abi->prim_mask, i, j);
+  ctx->abi->prim_mask, i, j, word);
+   if (fp16)
+   v = ac_build_reinterpret(&ctx->ac, v, ctx->ac.f32);
} else {
v = ac_build_fs_interp_mov(&ctx->ac, LLVMConstInt(ctx->ac.i32, 2, false),
   llvm_chan, attr_number, ctx->abi->prim_mask);
@@ -3134,8 +3141,9 @@ static LLVMValueRef visit_interp(struct ac_nir_context *ctx,
result[chan] = LLVMBuildExtractElement(ctx->ac.builder, gather, attrib_idx, "");
 
}
-   return ac_build_varying_gather_values(&ctx->ac, result, instr->num_components,
- var->data.location_frac);
+   LLVMValueRef ret = a
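
ac_build_fs_interp issues the interpolation as a p1 step followed by a p2 step. A numeric sketch of that two-step evaluation is given below, assuming the usual formulation P0 + i * P10 + j * P20, where P10 and P20 are the attribute deltas along the two barycentric directions. The struct and function names are hypothetical, the sketch works on plain floats, and it says nothing about how the f16 variants pick a 16-bit word.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-attribute parameters: base value and the two deltas. */
struct attr_params {
    float p0, p10, p20;
};

/* First step: apply the i barycentric coordinate. */
static float interp_p1(const struct attr_params *a, float i)
{
    return a->p0 + i * a->p10;
}

/* Second step: add the j contribution to the p1 result. */
static float interp_p2(const struct attr_params *a, float p1, float j)
{
    return p1 + j * a->p20;
}

/* Full interpolation, chaining both steps as the patch does. */
static float interp(const struct attr_params *a, float i, float j)
{
    assert(a != NULL);
    return interp_p2(a, interp_p1(a, i), j);
}
```

With p0 = 1, p10 = 2, p20 = 4 and (i, j) = (0.5, 0.25), the two steps give 2 and then 3.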

[Mesa-dev] [PATCH v2 38/41] WIP: ac, radv: run LLVM's SLP vectorizer

2019-02-15 Thread Rhys Perry
v2: rebase
v2: move LLVMAddSLPVectorizePass to after LLVMAddEarlyCSEMemSSAPass
v2: run unconditionally on GFX9 and later
v2: mark as WIP because it can make 32-bit code much worse

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_util.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_llvm_util.c b/src/amd/common/ac_llvm_util.c
index 69446863b95..8d78b5a850b 100644
--- a/src/amd/common/ac_llvm_util.c
+++ b/src/amd/common/ac_llvm_util.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "c11/threads.h"
 #include "gallivm/lp_bld_misc.h"
 #include "util/u_math.h"
@@ -175,7 +176,7 @@ static LLVMTargetMachineRef ac_create_target_machine(enum radeon_family family,
 }
 
 static LLVMPassManagerRef ac_create_passmgr(LLVMTargetLibraryInfoRef target_library_info,
-   bool check_ir)
+   bool check_ir, enum radeon_family family)
 {
LLVMPassManagerRef passmgr = LLVMCreatePassManager();
if (!passmgr)
@@ -203,6 +204,9 @@ static LLVMPassManagerRef ac_create_passmgr(LLVMTargetLibraryInfoRef target_libr
LLVMAddCFGSimplificationPass(passmgr);
/* This is recommended by the instruction combining pass. */
LLVMAddEarlyCSEMemSSAPass(passmgr);
+   /* vectorization is disabled on pre-GFX9 because it's not very useful there */
+   if (family >= CHIP_VEGA10)
+   LLVMAddSLPVectorizePass(passmgr);
LLVMAddInstructionCombiningPass(passmgr);
return passmgr;
 }
@@ -327,7 +331,7 @@ ac_init_llvm_compiler(struct ac_llvm_compiler *compiler,
goto fail;
 
compiler->passmgr = ac_create_passmgr(compiler->target_library_info,
- tm_options & AC_TM_CHECK_IR);
+ tm_options & AC_TM_CHECK_IR, family);
if (!compiler->passmgr)
goto fail;
 
-- 
2.20.1


[Mesa-dev] [PATCH v2 39/41] ac/nir: generate better code for nir_op_f2f16_rtz

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 939b8eb13de..8bfc63958ca 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -889,7 +889,9 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
src[0] = LLVMBuildFPTrunc(ctx->ac.builder, src[0], ctx->ac.f32, "");
LLVMValueRef param[2] = { src[0], ctx->ac.f32_0 };
result = ac_build_cvt_pkrtz_f16(&ctx->ac, param);
-   result = LLVMBuildExtractElement(ctx->ac.builder, result, ctx->ac.i32_0, "");
+   // generates better code than an extractelement with slp vectorization
+   result = LLVMBuildBitCast(ctx->ac.builder, result, ctx->ac.i32, "");
+   result = LLVMBuildTrunc(ctx->ac.builder, result, ctx->ac.i16, "");
break;
case nir_op_f2f16_rtne:
case nir_op_f2f16:
-- 
2.20.1
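
For readers following along, here is a rough CPU-side model of what the hunk above computes: convert to half with round-toward-zero, pack two halves into one 32-bit word, then take the low 16 bits with a plain integer truncation instead of a vector element extract. This is only a sketch. The helper names f32_to_f16_rtz, cvt_pkrtz and low_half are hypothetical, and clamping overflow to the largest finite half is an assumption about round-toward-zero behaviour, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU model of a float -> half conversion that rounds toward zero. */
static uint16_t f32_to_f16_rtz(float f)
{
    uint32_t x;
    assert(sizeof(f) == sizeof(x));
    memcpy(&x, &f, sizeof(x));

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp = (x >> 23) & 0xffu;
    uint32_t man = x & 0x7fffffu;

    if (exp == 0xffu) /* Inf stays Inf, NaN stays NaN */
        return (uint16_t)(sign | 0x7c00u | (man ? (0x200u | (man >> 13)) : 0u));

    int e = (int)exp - 127 + 15; /* rebias the exponent for half precision */
    if (e >= 31) /* too large: assumed to clamp to the largest finite half */
        return (uint16_t)(sign | 0x7bffu);
    if (e <= 0) { /* result is a half denormal or zero */
        if (e < -10)
            return (uint16_t)sign;
        man |= 0x800000u; /* make the implicit leading one explicit */
        return (uint16_t)(sign | (man >> (14 - e)));
    }
    /* Normal case: dropping the low 13 mantissa bits truncates toward zero. */
    return (uint16_t)(sign | ((uint32_t)e << 10) | (man >> 13));
}

/* Pack two converted halves into one 32-bit word, low operand in the low bits. */
static uint32_t cvt_pkrtz(float lo, float hi)
{
    return (uint32_t)f32_to_f16_rtz(lo) | ((uint32_t)f32_to_f16_rtz(hi) << 16);
}

/* Integer analogue of "bitcast to i32, then trunc to i16". */
static uint16_t low_half(uint32_t packed)
{
    return (uint16_t)packed;
}
```

With the second operand fixed at 0.0, as in the patch, low_half(cvt_pkrtz(x, 0.0f)) is just the round-toward-zero half of x.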


[Mesa-dev] [PATCH v2 41/41] radv, docs: expose float16, int16 and int8 features and extensions

2019-02-15 Thread Rhys Perry
v2: rebase
v2: mark VK_KHR_8bit_storage as DONE in features.txt

Signed-off-by: Rhys Perry 
---
 docs/features.txt |  2 +-
 src/amd/vulkan/radv_device.c  | 17 +
 src/amd/vulkan/radv_extensions.py |  4 
 src/amd/vulkan/radv_shader.c  |  3 +++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/features.txt b/docs/features.txt
index 6c2b6d59377..ded753b0182 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -439,7 +439,7 @@ Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_variable_pointers  DONE (anv, radv)
 
 Khronos extensions that are not part of any Vulkan version:
-  VK_KHR_8bit_storage   DONE (anv)
+  VK_KHR_8bit_storage   DONE (anv, radv)
   VK_KHR_android_surfacenot started
   VK_KHR_create_renderpass2 DONE (anv, radv)
   VK_KHR_displayDONE (anv, radv)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 0fef92773e1..4137b778466 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -877,6 +877,23 @@ void radv_GetPhysicalDeviceFeatures2(
features->bufferDeviceAddressMultiDevice = false;
break;
}
+   case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR: {
+   VkPhysicalDeviceFloat16Int8FeaturesKHR *features =
+   (VkPhysicalDeviceFloat16Int8FeaturesKHR*)ext;
+   bool enabled = pdevice->rad_info.chip_class >= VI;
+   features->shaderFloat16 = enabled && HAVE_LLVM >= 
0x0800;
+   features->shaderInt8 = enabled;
+   break;
+   }
+   case 
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR: {
+   VkPhysicalDevice8BitStorageFeaturesKHR *features =
+   (VkPhysicalDevice8BitStorageFeaturesKHR*)ext;
+   bool enabled = pdevice->rad_info.chip_class >= VI;
+   features->storageBuffer8BitAccess = enabled;
+   features->uniformAndStorageBuffer8BitAccess = enabled;
+   features->storagePushConstant8 = enabled;
+   break;
+   }
default:
break;
}
diff --git a/src/amd/vulkan/radv_extensions.py 
b/src/amd/vulkan/radv_extensions.py
index f218598f123..e38cfcfdcbe 100644
--- a/src/amd/vulkan/radv_extensions.py
+++ b/src/amd/vulkan/radv_extensions.py
@@ -91,6 +91,8 @@ EXTENSIONS = [
 Extension('VK_KHR_xlib_surface',  6, 
'VK_USE_PLATFORM_XLIB_KHR'),
 Extension('VK_KHR_multiview', 1, True),
 Extension('VK_KHR_display',  23, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
+Extension('VK_KHR_shader_float16_int8',   1, 
'device->rad_info.chip_class >= VI'),
+Extension('VK_KHR_8bit_storage',  1, 
'device->rad_info.chip_class >= VI'),
 Extension('VK_EXT_direct_mode_display',   1, 
'VK_USE_PLATFORM_DISPLAY_KHR'),
 Extension('VK_EXT_acquire_xlib_display',  1, 
'VK_USE_PLATFORM_XLIB_XRANDR_EXT'),
 Extension('VK_EXT_buffer_device_address', 1, True),
@@ -121,6 +123,8 @@ EXTENSIONS = [
 Extension('VK_AMD_shader_core_properties',1, True),
 Extension('VK_AMD_shader_info',   1, True),
 Extension('VK_AMD_shader_trinary_minmax', 1, True),
+Extension('VK_AMD_gpu_shader_half_float', 1, 
'device->rad_info.chip_class >= VI && HAVE_LLVM >= 0x0800'),
+Extension('VK_AMD_gpu_shader_int16',  1, 
'device->rad_info.chip_class >= VI'),
 Extension('VK_GOOGLE_decorate_string',1, True),
 Extension('VK_GOOGLE_hlsl_functionality1',1, True),
 ]
diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index adba730ad8b..44dea8e7203 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -249,6 +249,9 @@ radv_shader_compile_to_nir(struct radv_device *device,
.transform_feedback = true,
.trinary_minmax = true,
.variable_pointers = true,
+   .float16 = true,
+   .storage_8bit = true,
+   .int8 = true,
},
.ubo_ptr_type = glsl_vector_type(GLSL_TYPE_UINT, 2),
.ssbo_ptr_type = glsl_vector_type(GLSL_TYPE_UINT, 2),
-- 
2.20.1
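
The feature gating in radv_GetPhysicalDeviceFeatures2 above boils down to two conditions: the chip generation and, for float16 only, the LLVM version. A minimal stand-alone sketch of that logic follows. The enum is a hypothetical subset in generation order and the struct and function names are made up for illustration; only the conditions themselves come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of chip generations, ordered oldest to newest. */
enum chip_class { SI, CIK, VI, GFX9 };

/* Illustrative stand-in for the two feature bits the patch fills in. */
struct float16_int8_features {
    bool shader_float16;
    bool shader_int8;
};

static struct float16_int8_features
query_float16_int8(enum chip_class chip, unsigned llvm_version)
{
    /* Both features need VI or newer; float16 also needs LLVM >= 8.0. */
    bool enabled = chip >= VI;
    struct float16_int8_features f = {
        .shader_float16 = enabled && llvm_version >= 0x0800,
        .shader_int8 = enabled,
    };
    return f;
}
```

The same pair of conditions shows up again as strings in radv_extensions.py, so the extension list and the feature query stay in agreement.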


[Mesa-dev] [PATCH v2 40/41] ac/nir: have nir_op_f2f16 round to zero

2019-02-15 Thread Rhys Perry
In the hope that one day LLVM will then be able to generate code with
vectorized v_cvt_pkrtz_f16_f32 instructions.

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8bfc63958ca..7a5e95506f2 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -884,6 +884,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
result = LLVMBuildUIToFP(ctx->ac.builder, src[0], 
ac_to_float_type(&ctx->ac, def_type), "");
break;
case nir_op_f2f16_rtz:
+   case nir_op_f2f16:
src[0] = ac_to_float(&ctx->ac, src[0]);
if (LLVMTypeOf(src[0]) == ctx->ac.f64)
src[0] = LLVMBuildFPTrunc(ctx->ac.builder, src[0], 
ctx->ac.f32, "");
@@ -894,7 +895,6 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
result = LLVMBuildTrunc(ctx->ac.builder, result, ctx->ac.i16, 
"");
break;
case nir_op_f2f16_rtne:
-   case nir_op_f2f16:
case nir_op_f2f32:
case nir_op_f2f64:
src[0] = ac_to_float(&ctx->ac, src[0]);
-- 
2.20.1


[Mesa-dev] [PATCH v2 28/41] ac/nir: implement 8 and 16 bit ac_build_imsb

2019-02-15 Thread Rhys Perry
v2: fix C++ style comment

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index ec87a7b9343..c986f800fa4 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1531,6 +1531,10 @@ ac_build_imsb(struct ac_llvm_context *ctx,
  LLVMValueRef arg,
  LLVMTypeRef dst_type)
 {
+   /* TODO: support 64-bit integers */
+   if (LLVMTypeOf(arg) != ctx->i32)
+   arg = LLVMBuildSExt(ctx->builder, arg, ctx->i32, "");
+
LLVMValueRef msb = ac_build_intrinsic(ctx, "llvm.amdgcn.sffbh.i32",
  dst_type, &arg, 1,
  AC_FUNC_ATTR_READNONE);
-- 
2.20.1
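
A short note on why the patch uses a sign extension here rather than a zero extension. The signed most-significant-bit search looks for the highest bit that differs from the sign bit, so widening must keep the sign. The plain-C sketch below models this on the CPU. All function names are hypothetical, and the -1 result for inputs 0 and -1 follows the usual findMSB convention, which is an assumption about the surrounding code.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit signed find-MSB: index of the highest bit that differs from the
 * sign bit, or -1 when there is none (inputs 0 and -1). */
static int imsb32(int32_t v)
{
    uint32_t u = v < 0 ? ~(uint32_t)v : (uint32_t)v;
    int msb = -1;
    while (u) {
        msb++;
        u >>= 1;
    }
    return msb;
}

/* Narrow inputs are sign-extended first, mirroring the SExt in the patch. */
static int imsb8(int8_t v)   { return imsb32((int32_t)v); }
static int imsb16(int16_t v) { return imsb32((int32_t)v); }

/* Counter-example: zero-extending turns a negative input into a positive one. */
static int imsb8_zext(int8_t v) { return imsb32((int32_t)(uint8_t)v); }
```

For the 8-bit value -2 the sign-extended path gives 0, while the zero-extended variant gives 7 because it sees 0xFE as a positive number.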


[Mesa-dev] [PATCH v2 10/41] ac/nir: make ac_build_clamp work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zerof() and ac_get_onef()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index b53d9c7ff8c..667f9700764 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1597,16 +1597,20 @@ ac_build_umsb(struct ac_llvm_context *ctx,
 LLVMValueRef ac_build_fmin(struct ac_llvm_context *ctx, LLVMValueRef a,
   LLVMValueRef b)
 {
+   char intr[64];
+   snprintf(intr, sizeof(intr), "llvm.minnum.f%d", ac_get_elem_bits(ctx, 
LLVMTypeOf(a)));
LLVMValueRef args[2] = {a, b};
-   return ac_build_intrinsic(ctx, "llvm.minnum.f32", ctx->f32, args, 2,
+   return ac_build_intrinsic(ctx, intr, LLVMTypeOf(a), args, 2,
  AC_FUNC_ATTR_READNONE);
 }
 
 LLVMValueRef ac_build_fmax(struct ac_llvm_context *ctx, LLVMValueRef a,
   LLVMValueRef b)
 {
+   char intr[64];
+   snprintf(intr, sizeof(intr), "llvm.maxnum.f%d", ac_get_elem_bits(ctx, 
LLVMTypeOf(a)));
LLVMValueRef args[2] = {a, b};
-   return ac_build_intrinsic(ctx, "llvm.maxnum.f32", ctx->f32, args, 2,
+   return ac_build_intrinsic(ctx, intr, LLVMTypeOf(a), args, 2,
  AC_FUNC_ATTR_READNONE);
 }
 
@@ -1633,8 +1637,9 @@ LLVMValueRef ac_build_umin(struct ac_llvm_context *ctx, 
LLVMValueRef a,
 
 LLVMValueRef ac_build_clamp(struct ac_llvm_context *ctx, LLVMValueRef value)
 {
-   return ac_build_fmin(ctx, ac_build_fmax(ctx, value, ctx->f32_0),
-ctx->f32_1);
+   LLVMTypeRef t = LLVMTypeOf(value);
+   return ac_build_fmin(ctx, ac_build_fmax(ctx, value, LLVMConstReal(t, 
0.0)),
+LLVMConstReal(t, 1.0));
 }
 
 void ac_build_export(struct ac_llvm_context *ctx, struct ac_export_args *a)
-- 
2.20.1
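
Several patches in this series replace a per-size switch with an intrinsic name built from the element width. The pattern can be sketched on its own as below. The helper name intrinsic_name and its parameters are hypothetical; the patches call snprintf inline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a name such as "llvm.minnum.f16" or "llvm.cttz.i64" from a base
 * name, a type letter ('f' or 'i') and an element width in bits. */
static void intrinsic_name(char *buf, size_t size, const char *base,
                           char kind, unsigned bits)
{
    assert(kind == 'f' || kind == 'i');
    snprintf(buf, size, "%s.%c%u", base, kind, bits);
}
```

A 64-byte buffer, as used in the patches, is comfortably large for every name that appears in this series.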


[Mesa-dev] [PATCH v2 14/41] ac/nir: make ac_build_fdiv support 16-bit floats

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 23e454385d7..fb871a47400 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -661,7 +661,7 @@ ac_build_fdiv(struct ac_llvm_context *ctx,
 * If we do (num * (1 / den)), LLVM does:
 *return num * v_rcp_f32(den);
 */
-   LLVMValueRef one = LLVMTypeOf(num) == ctx->f64 ? ctx->f64_1 : 
ctx->f32_1;
+   LLVMValueRef one = LLVMConstReal(LLVMTypeOf(num), 1.0);
LLVMValueRef rcp = LLVMBuildFDiv(ctx->builder, one, den, "");
LLVMValueRef ret = LLVMBuildFMul(ctx->builder, num, rcp, "");
 
-- 
2.20.1


[Mesa-dev] [PATCH v2 21/41] ac/nir: implement 16-bit shifts

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 75bb19031bf..bad1c2a990e 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -672,20 +672,17 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_ishl:
result = LLVMBuildShl(ctx->ac.builder, src[0],
- LLVMBuildZExt(ctx->ac.builder, src[1],
-   LLVMTypeOf(src[0]), ""),
+ ac_build_ui_cast(&ctx->ac, src[1], 
LLVMTypeOf(src[0])),
  "");
break;
case nir_op_ishr:
result = LLVMBuildAShr(ctx->ac.builder, src[0],
-  LLVMBuildZExt(ctx->ac.builder, src[1],
-LLVMTypeOf(src[0]), ""),
+  ac_build_ui_cast(&ctx->ac, src[1], 
LLVMTypeOf(src[0])),
   "");
break;
case nir_op_ushr:
result = LLVMBuildLShr(ctx->ac.builder, src[0],
-  LLVMBuildZExt(ctx->ac.builder, src[1],
-LLVMTypeOf(src[0]), ""),
+  ac_build_ui_cast(&ctx->ac, src[1], 
LLVMTypeOf(src[0])),
   "");
break;
case nir_op_ilt32:
-- 
2.20.1


[Mesa-dev] [PATCH v2 11/41] ac/nir: make ac_build_fract work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 667f9700764..db937eb66fb 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2049,16 +2049,9 @@ void ac_build_waitcnt(struct ac_llvm_context *ctx, 
unsigned simm16)
 LLVMValueRef ac_build_fract(struct ac_llvm_context *ctx, LLVMValueRef src0,
unsigned bitsize)
 {
-   LLVMTypeRef type;
-   char *intr;
-
-   if (bitsize == 32) {
-   intr = "llvm.floor.f32";
-   type = ctx->f32;
-   } else {
-   intr = "llvm.floor.f64";
-   type = ctx->f64;
-   }
+   LLVMTypeRef type = ac_float_of_size(ctx, bitsize);
+   char intr[64];
+   snprintf(intr, sizeof(intr), "llvm.floor.f%d", bitsize);
 
LLVMValueRef params[] = {
src0,
-- 
2.20.1


[Mesa-dev] [PATCH v2 16/41] ac/nir: implement half-float nir_op_frsq

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index cba0cec3e8f..8b0e07d2930 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -788,8 +788,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
case nir_op_frsq:
result = emit_intrin_1f_param(&ctx->ac, "llvm.sqrt",
  ac_to_float_type(&ctx->ac, 
def_type), src[0]);
-   result = ac_build_fdiv(&ctx->ac, instr->dest.dest.ssa.bit_size 
== 32 ? ctx->ac.f32_1 : ctx->ac.f64_1,
-  result);
+   result = ac_build_fdiv(&ctx->ac, 
LLVMConstReal(LLVMTypeOf(result), 1.0), result);
break;
case nir_op_frexp_exp:
src[0] = ac_to_float(&ctx->ac, src[0]);
-- 
2.20.1


[Mesa-dev] [PATCH v2 15/41] ac/nir: implement half-float nir_op_frcp

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 741059b5f1a..cba0cec3e8f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -657,8 +657,7 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_frcp:
src[0] = ac_to_float(&ctx->ac, src[0]);
-   result = ac_build_fdiv(&ctx->ac, instr->dest.dest.ssa.bit_size 
== 32 ? ctx->ac.f32_1 : ctx->ac.f64_1,
-  src[0]);
+   result = ac_build_fdiv(&ctx->ac, 
LLVMConstReal(LLVMTypeOf(src[0]), 1.0), src[0]);
break;
case nir_op_iand:
result = LLVMBuildAnd(ctx->ac.builder, src[0], src[1], "");
-- 
2.20.1


[Mesa-dev] [PATCH v2 18/41] radv: lower 16-bit flrp

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/vulkan/radv_shader.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 1dcb0606246..adba730ad8b 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -53,6 +53,7 @@
 static const struct nir_shader_compiler_options nir_options = {
.vertex_id_zero_based = true,
.lower_scmp = true,
+   .lower_flrp16 = true,
.lower_flrp32 = true,
.lower_flrp64 = true,
.lower_device_index_to_zero = true,
-- 
2.20.1


[Mesa-dev] [PATCH v2 17/41] ac/nir: implement half-float nir_op_ldexp

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8b0e07d2930..0e5946dfdb3 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -829,8 +829,10 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
break;
case nir_op_ldexp:
src[0] = ac_to_float(&ctx->ac, src[0]);
-   if (ac_get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) == 32)
+   if (ac_get_elem_bits(&ctx->ac, def_type) == 32)
result = ac_build_intrinsic(&ctx->ac, 
"llvm.amdgcn.ldexp.f32", ctx->ac.f32, src, 2, AC_FUNC_ATTR_READNONE);
+   else if (ac_get_elem_bits(&ctx->ac, def_type) == 16)
+   result = ac_build_intrinsic(&ctx->ac, 
"llvm.amdgcn.ldexp.f16", ctx->ac.f16, src, 2, AC_FUNC_ATTR_READNONE);
else
result = ac_build_intrinsic(&ctx->ac, 
"llvm.amdgcn.ldexp.f64", ctx->ac.f64, src, 2, AC_FUNC_ATTR_READNONE);
break;
-- 
2.20.1


[Mesa-dev] [PATCH v2 26/41] ac/nir: make ac_find_lsb work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero() and ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 33 ++---
 1 file changed, 6 insertions(+), 27 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index aa92c55c822..61085db9320 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2474,30 +2474,11 @@ LLVMValueRef ac_find_lsb(struct ac_llvm_context *ctx,
 LLVMTypeRef dst_type,
 LLVMValueRef src0)
 {
-   unsigned src0_bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(src0));
-   const char *intrin_name;
-   LLVMTypeRef type;
-   LLVMValueRef zero;
-
-   switch (src0_bitsize) {
-   case 64:
-   intrin_name = "llvm.cttz.i64";
-   type = ctx->i64;
-   zero = ctx->i64_0;
-   break;
-   case 32:
-   intrin_name = "llvm.cttz.i32";
-   type = ctx->i32;
-   zero = ctx->i32_0;
-   break;
-   case 16:
-   intrin_name = "llvm.cttz.i16";
-   type = ctx->i16;
-   zero = ctx->i16_0;
-   break;
-   default:
-   unreachable(!"invalid bitsize");
-   }
+   LLVMTypeRef type = LLVMTypeOf(src0);
+   unsigned src0_bitsize = ac_get_elem_bits(ctx, type);
+   char intrin_name[64];
+   LLVMValueRef zero = LLVMConstInt(type, 0, false);
+   snprintf(intrin_name, sizeof(intrin_name), "llvm.cttz.i%d", 
src0_bitsize);
 
LLVMValueRef params[2] = {
src0,
@@ -2518,9 +2499,7 @@ LLVMValueRef ac_find_lsb(struct ac_llvm_context *ctx,
  params, 2,
  AC_FUNC_ATTR_READNONE);
 
-   if (src0_bitsize == 64) {
-   lsb = LLVMBuildTrunc(ctx->builder, lsb, ctx->i32, "");
-   }
+   lsb = ac_build_ui_cast(ctx, lsb, ctx->i32);
 
/* TODO: We need an intrinsic to skip this conditional. */
/* Check for zero: */
-- 
2.20.1
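
As a reference for what the generalized ac_find_lsb is expected to return, here is a width-agnostic CPU sketch: count trailing zeros at the source width, return the index as a 32-bit value, and handle zero separately. The function name and the bits parameter are hypothetical, and the -1 result for a zero input is an assumption based on the findLSB convention; the "Check for zero" code is not fully visible in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit of a 'bits'-wide value, or -1 if it is zero.
 * The result is always 32-bit, whatever the source width. */
static int32_t find_lsb(uint64_t v, unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
    if (bits < 64)
        v &= (UINT64_C(1) << bits) - 1; /* ignore bits above the source width */
    if (v == 0)
        return -1;

    int32_t idx = 0;
    while (!(v & 1)) {
        v >>= 1;
        idx++;
    }
    return idx;
}
```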


[Mesa-dev] [PATCH v2 12/41] ac/nir: make ac_build_isign work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 27 ---
 1 file changed, 4 insertions(+), 23 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index db937eb66fb..3b2257e8bf0 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2064,30 +2064,11 @@ LLVMValueRef ac_build_fract(struct ac_llvm_context 
*ctx, LLVMValueRef src0,
 LLVMValueRef ac_build_isign(struct ac_llvm_context *ctx, LLVMValueRef src0,
unsigned bitsize)
 {
-   LLVMValueRef cmp, val, zero, one;
-   LLVMTypeRef type;
-
-   switch (bitsize) {
-   case 64:
-   type = ctx->i64;
-   zero = ctx->i64_0;
-   one = ctx->i64_1;
-   break;
-   case 32:
-   type = ctx->i32;
-   zero = ctx->i32_0;
-   one = ctx->i32_1;
-   break;
-   case 16:
-   type = ctx->i16;
-   zero = ctx->i16_0;
-   one = ctx->i16_1;
-   break;
-   default:
-   unreachable(!"invalid bitsize");
-   break;
-   }
+   LLVMTypeRef type = LLVMIntTypeInContext(ctx->context, bitsize);
+   LLVMValueRef zero = LLVMConstInt(type, 0, false);
+   LLVMValueRef one = LLVMConstInt(type, 1, false);
 
+   LLVMValueRef cmp, val;
cmp = LLVMBuildICmp(ctx->builder, LLVMIntSGT, src0, zero, "");
val = LLVMBuildSelect(ctx->builder, cmp, one, src0, "");
cmp = LLVMBuildICmp(ctx->builder, LLVMIntSGE, val, zero, "");
-- 
2.20.1
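
The compare/select sequence in ac_build_isign is easier to read as scalar code. The sketch below assumes the final select, which falls outside the hunk shown, picks -1 for negative values; the function name isign is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Integer sign via two compare/select steps. */
static int64_t isign(int64_t x)
{
    int64_t val = x > 0 ? 1 : x; /* positives collapse to 1 */
    return val >= 0 ? val : -1;  /* negatives collapse to -1, zero stays 0 */
}
```

Because the constants are created from the value's own integer type in the patch, the same two steps apply unchanged at 8, 16, 32 or 64 bits.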


[Mesa-dev] [PATCH v2 19/41] ac/nir: support half floats in emit_b2f

2019-02-15 Thread Rhys Perry
This seems to generate fine code, even though the IR is a bit ugly.

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 0e5946dfdb3..e459001c1cf 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -316,14 +316,20 @@ static LLVMValueRef emit_b2f(struct ac_llvm_context *ctx,
 unsigned bitsize)
 {
LLVMValueRef result = LLVMBuildAnd(ctx->builder, src0,
-  LLVMBuildBitCast(ctx->builder, 
LLVMConstReal(ctx->f32, 1.0), ctx->i32, ""),
+  LLVMBuildBitCast(ctx->builder, 
ctx->f32_1, ctx->i32, ""),
   "");
result = LLVMBuildBitCast(ctx->builder, result, ctx->f32, "");
 
-   if (bitsize == 32)
+   switch (bitsize) {
+   case 16:
+   return LLVMBuildFPTrunc(ctx->builder, result, ctx->f16, "");
+   case 32:
return result;
-
-   return LLVMBuildFPExt(ctx->builder, result, ctx->f64, "");
+   case 64:
+   return LLVMBuildFPExt(ctx->builder, result, ctx->f64, "");
+   default:
+   unreachable("Unsupported bit size.");
+   }
 }
 
 static LLVMValueRef emit_f2b(struct ac_llvm_context *ctx,
-- 
2.20.1
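
The "bit ugly" IR mentioned above comes from a bit trick: AND the boolean with the bit pattern of 1.0f, then reinterpret the result as a float and convert to the destination size. Here is a CPU sketch of the 32-bit and 64-bit cases, assuming booleans are encoded as 0 or all-ones in 32 bits. The function names are hypothetical, and the 16-bit case is left out because portable C has no half type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Boolean (0 or 0xffffffff) to 1.0f/0.0f by masking the bits of 1.0f. */
static float b2f32(uint32_t b)
{
    assert(b == 0 || b == UINT32_MAX);

    const float one = 1.0f;
    uint32_t one_bits;
    memcpy(&one_bits, &one, sizeof(one_bits));

    uint32_t masked = b & one_bits; /* all-ones keeps 1.0f, zero gives +0.0f */
    float result;
    memcpy(&result, &masked, sizeof(result));
    return result;
}

/* Wider results are produced by extending the 32-bit value afterwards. */
static double b2f64(uint32_t b)
{
    return (double)b2f32(b);
}
```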


[Mesa-dev] [PATCH v2 27/41] ac/nir: make ac_build_umsb work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero() and ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 38 +++---
 1 file changed, 7 insertions(+), 31 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 61085db9320..ec87a7b9343 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1555,36 +1555,12 @@ ac_build_umsb(struct ac_llvm_context *ctx,
  LLVMValueRef arg,
  LLVMTypeRef dst_type)
 {
-   const char *intrin_name;
-   LLVMTypeRef type;
-   LLVMValueRef highest_bit;
-   LLVMValueRef zero;
-   unsigned bitsize;
-
-   bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(arg));
-   switch (bitsize) {
-   case 64:
-   intrin_name = "llvm.ctlz.i64";
-   type = ctx->i64;
-   highest_bit = LLVMConstInt(ctx->i64, 63, false);
-   zero = ctx->i64_0;
-   break;
-   case 32:
-   intrin_name = "llvm.ctlz.i32";
-   type = ctx->i32;
-   highest_bit = LLVMConstInt(ctx->i32, 31, false);
-   zero = ctx->i32_0;
-   break;
-   case 16:
-   intrin_name = "llvm.ctlz.i16";
-   type = ctx->i16;
-   highest_bit = LLVMConstInt(ctx->i16, 15, false);
-   zero = ctx->i16_0;
-   break;
-   default:
-   unreachable(!"invalid bitsize");
-   break;
-   }
+   LLVMTypeRef type = LLVMTypeOf(arg);
+   unsigned bitsize = ac_get_elem_bits(ctx, type);
+   LLVMValueRef highest_bit = LLVMConstInt(type, bitsize - 1, false);
+   LLVMValueRef zero = LLVMConstInt(type, 0, false);
+   char intrin_name[64];
+   snprintf(intrin_name, sizeof(intrin_name), "llvm.ctlz.i%d", bitsize);
 
LLVMValueRef params[2] = {
arg,
@@ -1598,7 +1574,7 @@ ac_build_umsb(struct ac_llvm_context *ctx,
/* The HW returns the last bit index from MSB, but TGSI/NIR wants
 * the index from LSB. Invert it by doing "31 - msb". */
msb = LLVMBuildSub(ctx->builder, highest_bit, msb, "");
-   msb = LLVMBuildTruncOrBitCast(ctx->builder, msb, ctx->i32, "");
+   msb = ac_build_ui_cast(ctx, msb, dst_type);
 
/* check for zero */
return LLVMBuildSelect(ctx->builder,
-- 
2.20.1
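
The key change above is that the "31" in "31 - msb" becomes "bitsize - 1", so the inversion works for any width. A CPU sketch of that computation follows. The function name and bits parameter are hypothetical, and returning -1 for a zero input is an assumption based on the findMSB convention.

```c
#include <assert.h>
#include <stdint.h>

/* Index, counted from the LSB, of the highest set bit of a 'bits'-wide
 * unsigned value, or -1 if the value is zero. */
static int32_t umsb(uint64_t v, unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
    if (bits < 64)
        v &= (UINT64_C(1) << bits) - 1;
    if (v == 0)
        return -1;

    unsigned highest_bit = bits - 1;
    unsigned leading_zeros = 0; /* what a count-leading-zeros would report */
    while (!(v & (UINT64_C(1) << (highest_bit - leading_zeros))))
        leading_zeros++;

    /* Convert "distance from the MSB" into "index from the LSB". */
    return (int32_t)(highest_bit - leading_zeros);
}
```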


[Mesa-dev] [PATCH v2 24/41] ac/nir: implement 8 and 16 bit ac_build_readlane

2019-02-15 Thread Rhys Perry
v2: don't use ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 71eaac4b7bd..aa92c55c822 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2868,9 +2868,15 @@ ac_build_readlane(struct ac_llvm_context *ctx, 
LLVMValueRef src, LLVMValueRef la
 {
LLVMTypeRef src_type = LLVMTypeOf(src);
src = ac_to_integer(ctx, src);
-   unsigned bits = LLVMGetIntTypeWidth(LLVMTypeOf(src));
+   unsigned src_bits = LLVMGetIntTypeWidth(LLVMTypeOf(src));
+   unsigned bits = src_bits;
LLVMValueRef ret;
 
+   if (bits < 32) {
+   src = LLVMBuildZExt(ctx->builder, src, ctx->i32, "");
+   bits = 32;
+   }
+
if (bits == 32) {
ret = _ac_build_readlane(ctx, src, lane);
} else {
@@ -2887,6 +2893,10 @@ ac_build_readlane(struct ac_llvm_context *ctx, 
LLVMValueRef src, LLVMValueRef la
LLVMConstInt(ctx->i32, i, 0), 
"");
}
}
+
+   if (src_bits < 32)
+   ret = LLVMBuildTrunc(ctx->builder, ret, 
LLVMIntTypeInContext(ctx->context, src_bits), "");
+
return LLVMBuildBitCast(ctx->builder, ret, src_type, "");
 }
 
-- 
2.20.1
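
The readlane change follows a widen, operate, narrow pattern: the underlying primitive only handles 32-bit values, so narrower sources are zero-extended before the call and truncated afterwards. A toy CPU model of that pattern is sketched below. The wave is modelled as a plain array, and WAVE_SIZE, readlane32 and readlane16 are all hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 64 /* hypothetical number of lanes in the toy model */

/* Stand-in for the 32-bit-only primitive. */
static uint32_t readlane32(const uint32_t lanes[WAVE_SIZE], unsigned lane)
{
    assert(lane < WAVE_SIZE);
    return lanes[lane];
}

/* 16-bit variant built on top of the 32-bit primitive. */
static uint16_t readlane16(const uint16_t lanes[WAVE_SIZE], unsigned lane)
{
    uint32_t widened[WAVE_SIZE];
    for (unsigned i = 0; i < WAVE_SIZE; i++)
        widened[i] = lanes[i]; /* zero-extend each lane to 32 bits */

    /* Truncate back to the source width after the 32-bit read. */
    return (uint16_t)readlane32(widened, lane);
}
```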


[Mesa-dev] [PATCH v2 30/41] ac/nir: make ac_build_bitfield_reverse work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 46738faea9d..dff369aae7f 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2100,28 +2100,14 @@ LLVMValueRef ac_build_bit_count(struct ac_llvm_context 
*ctx, LLVMValueRef src0)
 LLVMValueRef ac_build_bitfield_reverse(struct ac_llvm_context *ctx,
   LLVMValueRef src0)
 {
-   LLVMValueRef result;
-   unsigned bitsize;
-
-   bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(src0));
+   unsigned bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(src0));
 
-   switch (bitsize) {
-   case 32:
-   result = ac_build_intrinsic(ctx, "llvm.bitreverse.i32", 
ctx->i32,
-   (LLVMValueRef []) { src0 }, 1,
-   AC_FUNC_ATTR_READNONE);
-   break;
-   case 16:
-   result = ac_build_intrinsic(ctx, "llvm.bitreverse.i16", 
ctx->i16,
-   (LLVMValueRef []) { src0 }, 1,
-   AC_FUNC_ATTR_READNONE);
-   break;
-   default:
-   unreachable(!"invalid bitsize");
-   break;
-   }
+   char name[64];
+   snprintf(name, sizeof(name), "llvm.bitreverse.i%d", bitsize);
 
-   return result;
+   return ac_build_intrinsic(ctx, name, LLVMTypeOf(src0),
+ (LLVMValueRef []) { src0 }, 1,
+ AC_FUNC_ATTR_READNONE);
 }
 
 #define AC_EXP_TARGET  0
-- 
2.20.1
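
For anyone checking results against the generalized ac_build_bitfield_reverse, a width-parameterized bit reverse is easy to write on the CPU. This is only a reference sketch with a hypothetical name; it does not reflect how the hardware or LLVM implement llvm.bitreverse.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the low 'bits' bits of v; bit 0 ends up at position bits - 1. */
static uint64_t bitfield_reverse(uint64_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);

    uint64_t r = 0;
    for (unsigned i = 0; i < bits; i++) {
        r = (r << 1) | (v & 1); /* move the current lowest bit into r */
        v >>= 1;
    }
    return r;
}
```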


[Mesa-dev] [PATCH v2 23/41] ac/nir: implement 16-bit ac_build_ddxy

2019-02-15 Thread Rhys Perry
v2: rebase

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index fb871a47400..71eaac4b7bd 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1481,6 +1481,11 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
LLVMValueRef tl, trbl;
LLVMValueRef result;
 
+   int size = ac_get_type_size(LLVMTypeOf(val));
+
+   if (size == 2)
+   val = LLVMBuildZExt(ctx->builder, val, ctx->i32, "");
+
for (unsigned i = 0; i < 4; ++i) {
tl_lanes[i] = i & mask;
trbl_lanes[i] = (i & mask) + idx;
@@ -1493,12 +1498,19 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
 trbl_lanes[0], trbl_lanes[1],
 trbl_lanes[2], trbl_lanes[3]);
 
-   tl = LLVMBuildBitCast(ctx->builder, tl, ctx->f32, "");
-   trbl = LLVMBuildBitCast(ctx->builder, trbl, ctx->f32, "");
+   if (size == 2) {
+   tl = LLVMBuildTrunc(ctx->builder, tl, ctx->i16, "");
+   trbl = LLVMBuildTrunc(ctx->builder, trbl, ctx->i16, "");
+   }
+
+   LLVMTypeRef type = ac_float_of_size(ctx, size * 8);
+   tl = LLVMBuildBitCast(ctx->builder, tl, type, "");
+   trbl = LLVMBuildBitCast(ctx->builder, trbl, type, "");
result = LLVMBuildFSub(ctx->builder, trbl, tl, "");
 
-   result = ac_build_intrinsic(ctx, "llvm.amdgcn.wqm.f32", ctx->f32,
-   &result, 1, 0);
+   result = ac_build_intrinsic(ctx,
+   LLVMTypeOf(val) == ctx->f32 ? "llvm.amdgcn.wqm.f32" : 
"llvm.amdgcn.wqm.f16", type,
+   &result, 1, 0);
 
return result;
 }
-- 
2.20.1


[Mesa-dev] [PATCH v2 29/41] ac/nir: make ac_build_bit_count work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 33 +++--
 1 file changed, 7 insertions(+), 26 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index c986f800fa4..46738faea9d 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2085,35 +2085,16 @@ LLVMValueRef ac_build_fsign(struct ac_llvm_context 
*ctx, LLVMValueRef src0,
 
 LLVMValueRef ac_build_bit_count(struct ac_llvm_context *ctx, LLVMValueRef src0)
 {
-   LLVMValueRef result;
-   unsigned bitsize;
+   unsigned bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(src0));
 
-   bitsize = ac_get_elem_bits(ctx, LLVMTypeOf(src0));
+   char name[64];
+   snprintf(name, sizeof(name), "llvm.ctpop.i%d", bitsize);
 
-   switch (bitsize) {
-   case 64:
-   result = ac_build_intrinsic(ctx, "llvm.ctpop.i64", ctx->i64,
-   (LLVMValueRef []) { src0 }, 1,
-   AC_FUNC_ATTR_READNONE);
-
-   result = LLVMBuildTrunc(ctx->builder, result, ctx->i32, "");
-   break;
-   case 32:
-   result = ac_build_intrinsic(ctx, "llvm.ctpop.i32", ctx->i32,
-   (LLVMValueRef []) { src0 }, 1,
-   AC_FUNC_ATTR_READNONE);
-   break;
-   case 16:
-   result = ac_build_intrinsic(ctx, "llvm.ctpop.i16", ctx->i16,
-   (LLVMValueRef []) { src0 }, 1,
-   AC_FUNC_ATTR_READNONE);
-   break;
-   default:
-   unreachable(!"invalid bitsize");
-   break;
-   }
+   LLVMValueRef result = ac_build_intrinsic(ctx, name, LLVMTypeOf(src0),
+(LLVMValueRef []) { src0 }, 1,
+AC_FUNC_ATTR_READNONE);
 
-   return result;
+   return ac_build_ui_cast(ctx, result, ctx->i32);
 }
 
 LLVMValueRef ac_build_bitfield_reverse(struct ac_llvm_context *ctx,
-- 
2.20.1


[Mesa-dev] [PATCH v2 36/41] radv: handle all fragment output types

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 55 ---
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 01b8b097ea1..c46eabf3656 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2297,9 +2297,7 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
if (!values)
return;
 
-   bool is_16bit = ac_get_type_size(LLVMTypeOf(values[0])) == 2;
if (ctx->stage == MESA_SHADER_FRAGMENT) {
-   bool is_16bit = ac_get_type_size(LLVMTypeOf(values[0])) == 2;
unsigned index = target - V_008DFC_SQ_EXP_MRT;
unsigned col_format = (ctx->options->key.fs.col_format >> (4 * 
index)) & 0xf;
bool is_int8 = (ctx->options->key.fs.is_int8 >> index) & 1;
@@ -2310,6 +2308,28 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
LLVMValueRef (*packi)(struct ac_llvm_context *ctx, LLVMValueRef 
args[2],
  unsigned bits, bool hi) = NULL;
 
+   if (LLVMTypeOf(values[0]) == ctx->ac.f16 &&
+   col_format != V_028714_SPI_SHADER_FP16_ABGR) {
+   for (unsigned chan = 0; chan < 4; chan++)
+   values[chan] = LLVMBuildFPExt(ctx->ac.builder,
+ values[chan],
+ ctx->ac.f32, "");
+   }
+
+   if (LLVMTypeOf(values[0]) == ctx->ac.i16 || LLVMTypeOf(values[0]) == ctx->ac.i8) {
+   if (col_format == V_028714_SPI_SHADER_SINT16_ABGR) {
+   for (unsigned chan = 0; chan < 4; chan++)
+   values[chan] = LLVMBuildSExt(ctx->ac.builder,
+   values[chan],
+   ctx->ac.i32, "");
+   } else {
+   for (unsigned chan = 0; chan < 4; chan++)
+   values[chan] = LLVMBuildZExt(ctx->ac.builder,
+   values[chan],
+   ctx->ac.i32, "");
+   }
+   }
+
switch(col_format) {
case V_028714_SPI_SHADER_ZERO:
args->enabled_channels = 0; /* writemask */
@@ -2335,12 +2355,16 @@ si_llvm_init_export_args(struct radv_shader_context 
*ctx,
 
case V_028714_SPI_SHADER_FP16_ABGR:
args->enabled_channels = 0x5;
-   packf = ac_build_cvt_pkrtz_f16;
-   if (is_16bit) {
-   for (unsigned chan = 0; chan < 4; chan++)
-   values[chan] = LLVMBuildFPExt(ctx->ac.builder,
-   values[chan],
-   ctx->ac.f32, "");
+   if (LLVMTypeOf(values[0]) == ctx->ac.f16) {
+   packi = ac_build_cvt_pk_u16;
+   for (unsigned chan = 0; chan < 4; chan++) {
+   values[chan] = ac_to_integer(&ctx->ac, values[chan]);
+   values[chan] = LLVMBuildZExt(ctx->ac.builder,
+   values[chan],
+   ctx->ac.i32, "");
+   }
+   } else {
+   packf = ac_build_cvt_pkrtz_f16;
}
break;
 
@@ -2357,23 +2381,11 @@ si_llvm_init_export_args(struct radv_shader_context 
*ctx,
case V_028714_SPI_SHADER_UINT16_ABGR:
args->enabled_channels = 0x5;
packi = ac_build_cvt_pk_u16;
-   if (is_16bit) {
-   for (unsigned chan = 0; chan < 4; chan++)
-   values[chan] = 
LLVMBuildZExt(ctx->ac.builder,
- 
ac_to_integer(&ctx->ac, values[chan]),
- 
ctx->ac.i32, "");
-   }
break;
 
case V_028714_SPI_SHADER_SINT16_ABGR:
args->enabled_channels = 0x5;
packi = ac_build_cvt_pk_i16;
-   if (is_16bit) {
-   for (unsig

[Mesa-dev] [PATCH v2 22/41] compiler/nir: add lowering option for 16-bit ffma

2019-02-15 Thread Rhys Perry
The lowering needs to be disabled for sufficient precision to pass
deqp-vk's 16-bit fma test on radv.

Signed-off-by: Rhys Perry 
---
 src/broadcom/compiler/nir_to_vir.c| 1 +
 src/compiler/nir/nir.h| 1 +
 src/compiler/nir/nir_opt_algebraic.py | 4 +++-
 src/gallium/drivers/radeonsi/si_get.c | 1 +
 src/gallium/drivers/vc4/vc4_program.c | 1 +
 5 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/broadcom/compiler/nir_to_vir.c 
b/src/broadcom/compiler/nir_to_vir.c
index d983f91e718..6c0a623096a 100644
--- a/src/broadcom/compiler/nir_to_vir.c
+++ b/src/broadcom/compiler/nir_to_vir.c
@@ -2471,6 +2471,7 @@ const nir_shader_compiler_options v3d_nir_options = {
 .lower_fdiv = true,
 .lower_find_lsb = true,
 .lower_ffma = true,
+.lower_ffma16 = true,
 .lower_flrp32 = true,
 .lower_fpow = true,
 .lower_fsat = true,
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 740c64d2a94..8df275f4aa3 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2111,6 +2111,7 @@ typedef struct nir_function {
 
 typedef struct nir_shader_compiler_options {
bool lower_fdiv;
+   bool lower_ffma16;
bool lower_ffma;
bool fuse_ffma;
bool lower_flrp16;
diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 71c626e1b3f..63dff878d35 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -136,7 +136,9 @@ optimizations = [
(('~fadd', a, ('fmul', ('b2f', 'c@1'), ('fadd', b, ('fneg', a, 
('bcsel', c, b, a), 'options->lower_flrp32'),
(('~fadd@32', a, ('fmul', c , ('fadd', b, ('fneg', a, ('flrp', 
a, b, c), '!options->lower_flrp32'),
(('~fadd@64', a, ('fmul', c , ('fadd', b, ('fneg', a, ('flrp', 
a, b, c), '!options->lower_flrp64'),
-   (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
+   (('ffma@16', a, b, c), ('fadd', ('fmul', a, b), c), 
'options->lower_ffma16'),
+   (('ffma@32', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
+   (('ffma@64', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
(('~fadd', ('fmul', a, b), c), ('ffma', a, b, c), 'options->fuse_ffma'),
 
(('fdot4', ('vec4', a, b,   c,   1.0), d), ('fdph',  ('vec3', a, b, c), d)),
diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index f8ca02d4fcf..5bf107ef6fe 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -491,6 +491,7 @@ static const struct nir_shader_compiler_options nir_options 
= {
.lower_fdiv = true,
.lower_sub = true,
.lower_ffma = true,
+   .lower_ffma16 = true,
.lower_pack_snorm_2x16 = true,
.lower_pack_snorm_4x8 = true,
.lower_pack_unorm_2x16 = true,
diff --git a/src/gallium/drivers/vc4/vc4_program.c 
b/src/gallium/drivers/vc4/vc4_program.c
index 2d0a52bb5fb..8be258cbba4 100644
--- a/src/gallium/drivers/vc4/vc4_program.c
+++ b/src/gallium/drivers/vc4/vc4_program.c
@@ -2234,6 +2234,7 @@ static const nir_shader_compiler_options nir_options = {
 .lower_extract_word = true,
 .lower_fdiv = true,
 .lower_ffma = true,
+.lower_ffma16 = true,
 .lower_flrp32 = true,
 .lower_fpow = true,
 .lower_fsat = true,
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

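The precision argument in the commit message above can be illustrated with a small sketch. It uses 32-bit floats as a stand-in for the 16-bit case, and the function names are hypothetical (they are not part of Mesa). The lowered form rounds twice, once after the multiply and once after the add, while a fused multiply-add rounds once; the fused form is emulated here in double precision, which is exact for the product of two floats but can double-round in general.

```c
/* Lowered form: fmul followed by fadd, with a rounding step after each.
 * The volatile store keeps the compiler from contracting the two
 * operations back into a single fused instruction. */
static float ffma_lowered(float a, float b, float c)
{
    volatile float product = a * b;   /* first rounding */
    return product + c;               /* second rounding */
}

/* Fused form, emulated (an assumption of this sketch, not how hardware
 * does it): the product of two floats is exact in a double, so the low
 * bits of a * b survive until the final narrowing to float. */
static float ffma_fused_emulated(float a, float b, float c)
{
    double wide = (double)a * (double)b + (double)c;
    return (float)wide;
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26, which rounds to 1.0 as a float. The lowered form therefore returns 0, while the fused form keeps the -2^-26 term.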
[Mesa-dev] [PATCH v2 20/41] ac/nir: make emit_b2i work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index e459001c1cf..75bb19031bf 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -347,11 +347,7 @@ static LLVMValueRef emit_b2i(struct ac_llvm_context *ctx,
 unsigned bitsize)
 {
LLVMValueRef result = LLVMBuildAnd(ctx->builder, src0, ctx->i32_1, "");
-
-   if (bitsize == 32)
-   return result;
-
-   return LLVMBuildZExt(ctx->builder, result, ctx->i64, "");
+   return ac_build_ui_cast(ctx, result, LLVMIntTypeInContext(ctx->context, bitsize));
 }
 
 static LLVMValueRef emit_i2b(struct ac_llvm_context *ctx,
-- 
2.20.1


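The diff above replaces the hard-coded 32/64-bit cases with ac_build_ui_cast(), which (per patch 03/41) zero-extends when the target is wider, truncates when it is narrower, and returns the value unchanged otherwise. A minimal sketch of those semantics on plain integers follows. It does not use the LLVM API, and the names ending in _model are hypothetical.

```c
#include <stdint.h>

/* Keep only the low `bits` bits of v (bits is expected to be 1..64). */
static uint64_t mask_to_bits(uint64_t v, unsigned bits)
{
    return bits >= 64 ? v : v & ((UINT64_C(1) << bits) - 1);
}

/* Unsigned cast between integer widths: widening zero-extends,
 * narrowing truncates, equal widths pass the value through. */
static uint64_t ui_cast_model(uint64_t v, unsigned old_bits, unsigned new_bits)
{
    uint64_t src = mask_to_bits(v, old_bits);

    if (new_bits > old_bits)
        return src;                          /* zext: upper bits are zero */
    if (new_bits < old_bits)
        return mask_to_bits(src, new_bits);  /* trunc: drop upper bits */
    return src;
}

/* b2i: AND the 32-bit boolean with 1, then cast to the requested width. */
static uint64_t b2i_model(uint32_t nir_bool, unsigned bitsize)
{
    return ui_cast_model(nir_bool & 1u, 32, bitsize);
}
```

Because the cast handles both directions, the same path covers 8-, 16-, 32- and 64-bit results.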
[Mesa-dev] [PATCH v2 05/41] ac/nir: implement 8-bit ssbo stores

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 17d952d1ae8..89a78b43c6f 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1524,7 +1524,7 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
 
LLVMValueRef rsrc = ctx->abi->load_ssbo(ctx->abi,
get_src(ctx, instr->src[1]), true);
-   LLVMValueRef base_data = ac_to_float(&ctx->ac, src_data);
+   LLVMValueRef base_data = src_data;
base_data = ac_trim_vector(&ctx->ac, base_data, instr->num_components);
LLVMValueRef base_offset = get_src(ctx, instr->src[2]);
 
@@ -1565,7 +1565,25 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
offset = LLVMBuildAdd(ctx->ac.builder, base_offset,
  LLVMConstInt(ctx->ac.i32, start * elem_size_bytes, false), "");
}
-   if (num_bytes == 2) {
+   if (num_bytes == 1) {
+   store_name = "llvm.amdgcn.tbuffer.store.i32";
+   data_type = ctx->ac.i32;
+   data = LLVMBuildZExt(ctx->ac.builder, data, data_type, "");
+   LLVMValueRef tbuffer_params[] = {
+   data,
+   rsrc,
+   ctx->ac.i32_0, /* vindex */
+   offset,/* voffset */
+   ctx->ac.i32_0,
+   ctx->ac.i32_0,
+   LLVMConstInt(ctx->ac.i32, 1, false), // dfmt (= 8bit)
+   LLVMConstInt(ctx->ac.i32, 4, false), // nfmt (= uint)
+   glc,
+   ctx->ac.i1false,
+   };
+   ac_build_intrinsic(&ctx->ac, store_name,
+  ctx->ac.voidt, tbuffer_params, 10, 0);
+   } else if (num_bytes == 2) {
store_name = "llvm.amdgcn.tbuffer.store.i32";
data_type = ctx->ac.i32;
LLVMValueRef tbuffer_params[] = {
-- 
2.20.1


[Mesa-dev] [PATCH v2 03/41] ac: add various helpers for float16/int16/int8

2019-02-15 Thread Rhys Perry
v2: remove ac_get_one(), ac_get_zero(), ac_get_onef() and ac_get_zerof()
v2: remove ac_int_of_size()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c  | 55 ++---
 src/amd/common/ac_llvm_build.h  | 15 +++--
 src/amd/common/ac_nir_to_llvm.c | 30 +-
 3 files changed, 79 insertions(+), 21 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 9395bd1bbda..b53d9c7ff8c 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -87,12 +87,16 @@ ac_llvm_context_init(struct ac_llvm_context *ctx,
ctx->v4f32 = LLVMVectorType(ctx->f32, 4);
ctx->v8i32 = LLVMVectorType(ctx->i32, 8);
 
+   ctx->i8_0 = LLVMConstInt(ctx->i8, 0, false);
+   ctx->i8_1 = LLVMConstInt(ctx->i8, 1, false);
ctx->i16_0 = LLVMConstInt(ctx->i16, 0, false);
ctx->i16_1 = LLVMConstInt(ctx->i16, 1, false);
ctx->i32_0 = LLVMConstInt(ctx->i32, 0, false);
ctx->i32_1 = LLVMConstInt(ctx->i32, 1, false);
ctx->i64_0 = LLVMConstInt(ctx->i64, 0, false);
ctx->i64_1 = LLVMConstInt(ctx->i64, 1, false);
+   ctx->f16_0 = LLVMConstReal(ctx->f16, 0.0);
+   ctx->f16_1 = LLVMConstReal(ctx->f16, 1.0);
ctx->f32_0 = LLVMConstReal(ctx->f32, 0.0);
ctx->f32_1 = LLVMConstReal(ctx->f32, 1.0);
ctx->f64_0 = LLVMConstReal(ctx->f64, 0.0);
@@ -201,7 +205,9 @@ ac_get_type_size(LLVMTypeRef type)
 
 static LLVMTypeRef to_integer_type_scalar(struct ac_llvm_context *ctx, 
LLVMTypeRef t)
 {
-   if (t == ctx->f16 || t == ctx->i16)
+   if (t == ctx->i8)
+   return ctx->i8;
+   else if (t == ctx->f16 || t == ctx->i16)
return ctx->i16;
else if (t == ctx->f32 || t == ctx->i32)
return ctx->i32;
@@ -281,6 +287,42 @@ ac_to_float(struct ac_llvm_context *ctx, LLVMValueRef v)
return LLVMBuildBitCast(ctx->builder, v, ac_to_float_type(ctx, type), 
"");
 }
 
+LLVMTypeRef ac_float_of_size(struct ac_llvm_context *ctx, unsigned bit_size)
+{
+   switch (bit_size) {
+   case 16:
+   return ctx->f16;
+   case 32:
+   return ctx->f32;
+   case 64:
+   return ctx->f64;
+   default:
+   unreachable("Unhandled bit size");
+   }
+}
+
+LLVMValueRef ac_build_ui_cast(struct ac_llvm_context *ctx, LLVMValueRef v, 
LLVMTypeRef t)
+{
+   unsigned new_bit_size = ac_get_elem_bits(ctx, t);
+   unsigned old_bit_size = ac_get_elem_bits(ctx, LLVMTypeOf(v));
+   if (new_bit_size > old_bit_size)
+   return LLVMBuildZExt(ctx->builder, v, t, "");
+   else if (new_bit_size < old_bit_size)
+   return LLVMBuildTrunc(ctx->builder, v, t, "");
+   else
+   return v;
+}
+
+LLVMValueRef ac_build_reinterpret(struct ac_llvm_context *ctx, LLVMValueRef v, 
LLVMTypeRef t)
+{
+   if (LLVMTypeOf(v) == t)
+   return v;
+
+   v = ac_to_integer(ctx, v);
+   v = ac_build_ui_cast(ctx, v, ac_to_integer_type(ctx, t));
+   return LLVMBuildBitCast(ctx->builder, v, t, "");
+}
+
 
 LLVMValueRef
 ac_build_intrinsic(struct ac_llvm_context *ctx, const char *name,
@@ -1338,15 +1380,18 @@ LLVMValueRef 
ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context *ctx,
 }
 
 LLVMValueRef
-ac_build_tbuffer_load_short(struct ac_llvm_context *ctx,
+ac_build_tbuffer_load_short_byte(struct ac_llvm_context *ctx,
LLVMValueRef rsrc,
LLVMValueRef vindex,
LLVMValueRef voffset,
LLVMValueRef soffset,
LLVMValueRef immoffset,
-   LLVMValueRef glc)
+   LLVMValueRef glc,
+   unsigned size)
 {
+   assert(size == 1 || size == 2);
const char *name = "llvm.amdgcn.tbuffer.load.i32";
+   int data_format = size == 1 ? V_008F0C_BUF_DATA_FORMAT_8 : 
V_008F0C_BUF_DATA_FORMAT_16;
LLVMTypeRef type = ctx->i32;
LLVMValueRef params[] = {
rsrc,
@@ -1354,13 +1399,13 @@ ac_build_tbuffer_load_short(struct ac_llvm_context *ctx,
voffset,
soffset,
immoffset,
-   LLVMConstInt(ctx->i32, 
V_008F0C_BUF_DATA_FORMAT_16, false),
+   LLVMConstInt(ctx->i32, data_format, false),
LLVMConstInt(ctx->i32, 
V_008F0C_BUF_NUM_FORMAT_UINT, false),
glc,
ctx->i1false,
};
LLVMValueRef res = ac_build_intrinsic(ctx, name, type, params, 9, 0);
-   return LLVMBuildTrunc(ctx->builder, res, ctx->i16, "");
+   return LLVMBuildTrunc(ctx->builder, res, 
LLVMIntTypeInContext(ctx->context, size * 8), "");
 }
 
 /**
diff 

[Mesa-dev] [PATCH v2 09/41] ac/nir: fix 64-bit nir_op_f2f16_rtz

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 691d444db05..741059b5f1a 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -886,6 +886,8 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
break;
case nir_op_f2f16_rtz:
src[0] = ac_to_float(&ctx->ac, src[0]);
+   if (LLVMTypeOf(src[0]) == ctx->ac.f64)
+   src[0] = LLVMBuildFPTrunc(ctx->ac.builder, src[0], ctx->ac.f32, "");
LLVMValueRef param[2] = { src[0], ctx->ac.f32_0 };
result = ac_build_cvt_pkrtz_f16(&ctx->ac, param);
result = LLVMBuildExtractElement(ctx->ac.builder, result, ctx->ac.i32_0, "");
-- 
2.20.1


[Mesa-dev] [PATCH v2 02/41] radv: ensure export arguments are always float

2019-02-15 Thread Rhys Perry
So that the signature is correct and consistent, the inputs to an export
intrinsic should always be 32-bit floats.

This and the previous commit fix a large number of crashes in the
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_*
tests.

Fixes: b722b29f10d ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index a8268c44ecf..d3795eec403 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2429,12 +2429,8 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
} else
memcpy(&args->out[0], values, sizeof(values[0]) * 4);
 
-   for (unsigned i = 0; i < 4; ++i) {
-   if (!(args->enabled_channels & (1 << i)))
-   continue;
-
+   for (unsigned i = 0; i < 4; ++i)
args->out[i] = ac_to_float(&ctx->ac, args->out[i]);
-   }
 }
 
 static void
-- 
2.20.1


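The change above applies the float bitcast to all four export channels rather than only the enabled ones. As a rough illustration of what "bitcast to float" means here, the sketch below reinterprets 32-bit integer patterns as floats without changing any bits. It assumes IEEE-754 binary32 floats, and the helper names are hypothetical (this is not ac_to_float()).

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float; no numeric conversion. */
static float bitcast_u32_to_f32(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
static uint32_t bitcast_f32_to_u32(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Convert every channel unconditionally, mirroring the simplified loop:
 * disabled channels get the float type too, so the call signature is
 * the same regardless of the channel mask. */
static void export_channels_as_float(const uint32_t in[4], float out[4])
{
    for (unsigned i = 0; i < 4; ++i)
        out[i] = bitcast_u32_to_f32(in[i]);
}
```

The payload bits are unchanged by the round trip, so integer data exported through float-typed arguments arrives intact.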
[Mesa-dev] [PATCH v2 07/41] ac/nir: implement 8-bit nir_load_const_instr

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index b260142c177..f39232b91a1 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1114,6 +1114,10 @@ static void visit_load_const(struct ac_nir_context *ctx,
 
for (unsigned i = 0; i < instr->def.num_components; ++i) {
switch (instr->def.bit_size) {
+   case 8:
+   values[i] = LLVMConstInt(element_type,
+instr->value.u8[i], false);
+   break;
case 16:
values[i] = LLVMConstInt(element_type,
 instr->value.u16[i], false);
-- 
2.20.1


[Mesa-dev] [PATCH v2 13/41] ac/nir: make ac_build_fsign work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zerof() and ac_get_onef()

Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_llvm_build.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 3b2257e8bf0..23e454385d7 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -2079,19 +2079,11 @@ LLVMValueRef ac_build_isign(struct ac_llvm_context *ctx, LLVMValueRef src0,
 LLVMValueRef ac_build_fsign(struct ac_llvm_context *ctx, LLVMValueRef src0,
unsigned bitsize)
 {
-   LLVMValueRef cmp, val, zero, one;
-   LLVMTypeRef type;
-
-   if (bitsize == 32) {
-   type = ctx->f32;
-   zero = ctx->f32_0;
-   one = ctx->f32_1;
-   } else {
-   type = ctx->f64;
-   zero = ctx->f64_0;
-   one = ctx->f64_1;
-   }
+   LLVMTypeRef type = ac_float_of_size(ctx, bitsize);
+   LLVMValueRef zero = LLVMConstReal(type, 0.0);
+   LLVMValueRef one = LLVMConstReal(type, 1.0);
 
+   LLVMValueRef cmp, val;
cmp = LLVMBuildFCmp(ctx->builder, LLVMRealOGT, src0, zero, "");
val = LLVMBuildSelect(ctx->builder, cmp, one, src0, "");
cmp = LLVMBuildFCmp(ctx->builder, LLVMRealOGE, val, zero, "");
-- 
2.20.1


[Mesa-dev] [PATCH v2 08/41] ac/nir: implement 8-bit conversions

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index f39232b91a1..691d444db05 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -858,12 +858,14 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
src[i] = ac_to_integer(&ctx->ac, src[i]);
result = ac_build_gather_values(&ctx->ac, src, num_components);
break;
+   case nir_op_f2i8:
case nir_op_f2i16:
case nir_op_f2i32:
case nir_op_f2i64:
src[0] = ac_to_float(&ctx->ac, src[0]);
result = LLVMBuildFPToSI(ctx->ac.builder, src[0], def_type, "");
break;
+   case nir_op_f2u8:
case nir_op_f2u16:
case nir_op_f2u32:
case nir_op_f2u64:
@@ -898,15 +900,14 @@ static void visit_alu(struct ac_nir_context *ctx, const 
nir_alu_instr *instr)
else
result = LLVMBuildFPTrunc(ctx->ac.builder, src[0], 
ac_to_float_type(&ctx->ac, def_type), "");
break;
+   case nir_op_u2u8:
case nir_op_u2u16:
case nir_op_u2u32:
case nir_op_u2u64:
src[0] = ac_to_integer(&ctx->ac, src[0]);
-   if (ac_get_elem_bits(&ctx->ac, LLVMTypeOf(src[0])) < 
ac_get_elem_bits(&ctx->ac, def_type))
-   result = LLVMBuildZExt(ctx->ac.builder, src[0], 
def_type, "");
-   else
-   result = LLVMBuildTrunc(ctx->ac.builder, src[0], 
def_type, "");
+   result = ac_build_ui_cast(&ctx->ac, src[0], def_type);
break;
+   case nir_op_i2i8:
case nir_op_i2i16:
case nir_op_i2i32:
case nir_op_i2i64:
-- 
2.20.1


[Mesa-dev] [PATCH v2 04/41] ac/nir: implement 8-bit push constant, ssbo and ubo loads

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 37 +++--
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index bed52490bad..17d952d1ae8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1399,7 +1399,30 @@ static LLVMValueRef visit_load_push_constant(struct 
ac_nir_context *ctx,
 
ptr = ac_build_gep0(&ctx->ac, ctx->abi->push_constants, addr);
 
-   if (instr->dest.ssa.bit_size == 16) {
+   if (instr->dest.ssa.bit_size == 8) {
+   unsigned load_dwords = instr->dest.ssa.num_components > 1 ? 2 : 
1;
+   LLVMTypeRef vec_type = 
LLVMVectorType(LLVMInt8TypeInContext(ctx->ac.context), 4 * load_dwords);
+   ptr = ac_cast_ptr(&ctx->ac, ptr, vec_type);
+   LLVMValueRef res = LLVMBuildLoad(ctx->ac.builder, ptr, "");
+
+   LLVMValueRef params[3];
+   if (load_dwords > 1) {
+   LLVMValueRef res_vec = 
LLVMBuildBitCast(ctx->ac.builder, res, LLVMVectorType(ctx->ac.i32, 2), "");
+   params[0] = LLVMBuildExtractElement(ctx->ac.builder, 
res_vec, LLVMConstInt(ctx->ac.i32, 1, false), "");
+   params[1] = LLVMBuildExtractElement(ctx->ac.builder, 
res_vec, LLVMConstInt(ctx->ac.i32, 0, false), "");
+   } else {
+   res = LLVMBuildBitCast(ctx->ac.builder, res, 
ctx->ac.i32, "");
+   params[0] = ctx->ac.i32_0;
+   params[1] = res;
+   }
+   params[2] = addr;
+   res = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.alignbyte", 
ctx->ac.i32, params, 3, 0);
+
+   res = LLVMBuildTrunc(ctx->ac.builder, res, 
LLVMIntTypeInContext(ctx->ac.context, instr->dest.ssa.num_components * 8), "");
+   if (instr->dest.ssa.num_components > 1)
+   res = LLVMBuildBitCast(ctx->ac.builder, res, 
LLVMVectorType(LLVMInt8TypeInContext(ctx->ac.context), 
instr->dest.ssa.num_components), "");
+   return res;
+   } else if (instr->dest.ssa.bit_size == 16) {
unsigned load_dwords = instr->dest.ssa.num_components / 2 + 1;
LLVMTypeRef vec_type = 
LLVMVectorType(LLVMInt16TypeInContext(ctx->ac.context), 2 * load_dwords);
ptr = ac_cast_ptr(&ctx->ac, ptr, vec_type);
@@ -1676,7 +1699,7 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
LLVMValueRef immoffset = LLVMConstInt(ctx->ac.i32, i * 
elem_size_bytes, false);
 
LLVMValueRef ret;
-   if (load_bytes == 2) {
+   if (load_bytes <= 2) {
ret = ac_build_tbuffer_load_short_byte(&ctx->ac,
   rsrc,
   vindex,
@@ -1684,7 +1707,7 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
   ctx->ac.i32_0,
   immoffset,
   glc,
-  2);
+  load_bytes);
} else {
const char *load_name;
LLVMTypeRef data_type;
@@ -1700,6 +1723,7 @@ static LLVMValueRef visit_load_buffer(struct 
ac_nir_context *ctx,
data_type = ctx->ac.v2f32;
break;
case 4:
+   case 3:
load_name = "llvm.amdgcn.buffer.load.f32";
data_type = ctx->ac.f32;
break;
@@ -1746,7 +1770,8 @@ static LLVMValueRef visit_load_ubo_buffer(struct 
ac_nir_context *ctx,
if (instr->dest.ssa.bit_size == 64)
num_components *= 2;
 
-   if (instr->dest.ssa.bit_size == 16) {
+   if (instr->dest.ssa.bit_size == 16 || instr->dest.ssa.bit_size == 8) {
+   unsigned size = instr->dest.ssa.bit_size / 8;
LLVMValueRef results[num_components];
for (unsigned i = 0; i < num_components; ++i) {
results[i] = ac_build_tbuffer_load_short_byte(&ctx->ac,
@@ -1754,9 +1779,9 @@ static LLVMValueRef visit_load_ubo_buffer(struct 
ac_nir_context *ctx,
  
ctx->ac.i32_0,
  offset,
  
ctx->ac.i32_0,
- 
LLVMConstInt(ctx->ac.i32, 2 * i, 0),
+

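The 8-bit push constant path above loads one or two dwords and passes them, together with the byte address, to llvm.amdgcn.alignbyte. The sketch below models the behaviour this relies on. It is an assumption of this note rather than something stated in the patch: the two dwords are treated as one 64-bit value, shifted right by 8 times the low two bits of the offset, and the low 32 bits are kept. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed model of alignbyte: concatenate {hi, lo}, shift right by whole
 * bytes, keep the low dword. Only the low two bits of the byte offset
 * are used in this sketch. */
static uint32_t alignbyte_model(uint32_t hi, uint32_t lo, uint32_t byte_offset)
{
    uint64_t pair = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(pair >> (8u * (byte_offset & 3u)));
}

/* A single 8-bit load is then a truncation of the aligned dword. */
static uint8_t load_u8_model(uint32_t hi, uint32_t lo, uint32_t addr)
{
    return (uint8_t)alignbyte_model(hi, lo, addr);
}
```

This matches the argument order in the diff, where the dword at index 1 is passed first and the dword at index 0 second.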
[Mesa-dev] [PATCH v2 06/41] ac/nir: fix 16-bit ssbo stores

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry 
---
 src/amd/common/ac_nir_to_llvm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 89a78b43c6f..b260142c177 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1586,6 +1586,8 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
} else if (num_bytes == 2) {
store_name = "llvm.amdgcn.tbuffer.store.i32";
data_type = ctx->ac.i32;
+   data = LLVMBuildBitCast(ctx->ac.builder, data, ctx->ac.i16, "");
+   data = LLVMBuildZExt(ctx->ac.builder, data, data_type, "");
LLVMValueRef tbuffer_params[] = {
data,
rsrc,
-- 
2.20.1


[Mesa-dev] [PATCH v2 01/41] radv: bitcast 16-bit outputs to integers

2019-02-15 Thread Rhys Perry
16-bit outputs are stored as 16-bit floats in the outputs array, so they
have to be bitcast.

Fixes: b722b29f10d ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry 
---
 src/amd/vulkan/radv_nir_to_llvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
b/src/amd/vulkan/radv_nir_to_llvm.c
index 7f74678d5f1..a8268c44ecf 100644
--- a/src/amd/vulkan/radv_nir_to_llvm.c
+++ b/src/amd/vulkan/radv_nir_to_llvm.c
@@ -2365,7 +2365,7 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
if (is_16bit) {
for (unsigned chan = 0; chan < 4; chan++)
values[chan] = 
LLVMBuildZExt(ctx->ac.builder,
- 
values[chan],
+ 
ac_to_integer(&ctx->ac, values[chan]),
  
ctx->ac.i32, "");
}
break;
@@ -2376,7 +2376,7 @@ si_llvm_init_export_args(struct radv_shader_context *ctx,
if (is_16bit) {
for (unsigned chan = 0; chan < 4; chan++)
values[chan] = 
LLVMBuildSExt(ctx->ac.builder,
- 
values[chan],
+ 
ac_to_integer(&ctx->ac, values[chan]),
  
ctx->ac.i32, "");
}
break;
-- 
2.20.1


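Since the 16-bit outputs are held as half floats, the value has to be viewed as a 16-bit integer pattern before it can be zero- or sign-extended. The sketch below shows the two extensions on a raw 16-bit pattern, which stands in for the result of the bitcast. The function names are hypothetical and nothing here uses the LLVM API.

```c
#include <stdint.h>

/* Zero-extension: the upper 16 bits of the result are always zero. */
static uint32_t zext16_to_32(uint16_t bits)
{
    return (uint32_t)bits;
}

/* Sign-extension: bit 15 is copied into the upper 16 bits. Written with
 * unsigned arithmetic so it does not depend on how the platform converts
 * out-of-range values to signed types. */
static uint32_t sext16_to_32(uint16_t bits)
{
    uint32_t x = bits;
    return (x ^ 0x8000u) - 0x8000u;
}
```

For the pattern 0xFFFF the two results differ (0x0000FFFF against 0xFFFFFFFF), which is why the UINT16 and SINT16 cases in the diff use different extend instructions.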
[Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

2019-02-15 Thread Rhys Perry
This series adds support for:
- VK_KHR_shader_float16_int8
- VK_AMD_gpu_shader_half_float
- VK_AMD_gpu_shader_int16
- VK_KHR_8bit_storage
on VI+. Half floats are disabled on LLVM 7 because of a bug causing large
memory usage and long (or unbounded) compilation times with some CTS
tests.

It is written against the following patch series:
- https://patchwork.freedesktop.org/series/53454/ (v4)
- https://patchwork.freedesktop.org/series/53660/ (v1)

With LLVM 9, there are no reproducible Vulkan CTS regressions with Vega
and VI except for
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_float_64_to_16.*
which fail or crash because of unrelated radv bugs with 64-bit varyings
and because the tests use VK_FORMAT_R64_SFLOAT as a vertex format even
though radv does not support it.

With LLVM 9, there are no reproducible piglit regressions except for
glsl-array-bounds-12.shader_test because of an LLVM bug when
SLP vectorization is enabled.

With LLVM 8, there are no reproducible Vulkan CTS regressions with Vega
and VI except for those seen with LLVM 9 and a couple of tests that fail
because of an LLVM bug after the SLP vectorizer and because 16-bit
interpolation currently has no fallback on LLVM versions before LLVM 9.

With LLVM 7, there are no reproducible Vulkan CTS regressions with Vega
and VI except for those seen with LLVM 9 and a couple of tests that fail
because of an LLVM bug after the SLP vectorizer.

The SLP vectorization patch is marked as WIP because it exposes LLVM bugs
with piglit's glsl-array-bounds-12.shader_test, some Vulkan CTS tests and
a shader-db test for a game I can't remember. It also over-vectorizes
32-bit code, which can significantly worsen generated code quality.

The 16-bit interpolation patch is marked as WIP because it currently
requires intrinsics only available in LLVM 9 and does not have a fallback.

A branch on GitHub containing this series can be found at:
https://github.com/pendingchaos/mesa/commits/radv_fp16_int16_int8_v2

v2: rebase
v2: implement 16-bit interpolation
v2: move LLVMAddSLPVectorizePass to after LLVMAddEarlyCSEMemSSAPass
v2: run vectorization unconditionally on GFX9 and later
v2: remove ac_get_one(), ac_get_zero(), ac_get_onef() and ac_get_zerof()
v2: remove ac_int_of_size()
v2: fix 64-bit visit_load_var()
v2: mark VK_KHR_8bit_storage as DONE in features.txt
v2: mark SLP vectorization patch as WIP
v2: fix C++ style comment

Rhys Perry (41):
  radv: bitcast 16-bit outputs to integers
  radv: ensure export arguments are always float
  ac: add various helpers for float16/int16/int8
  ac/nir: implement 8-bit push constant, ssbo and ubo loads
  ac/nir: implement 8-bit ssbo stores
  ac/nir: fix 16-bit ssbo stores
  ac/nir: implement 8-bit nir_load_const_instr
  ac/nir: implement 8-bit conversions
  ac/nir: fix 64-bit nir_op_f2f16_rtz
  ac/nir: make ac_build_clamp work on all bit sizes
  ac/nir: make ac_build_fract work on all bit sizes
  ac/nir: make ac_build_isign work on all bit sizes
  ac/nir: make ac_build_fsign work on all bit sizes
  ac/nir: make ac_build_fdiv support 16-bit floats
  ac/nir: implement half-float nir_op_frcp
  ac/nir: implement half-float nir_op_frsq
  ac/nir: implement half-float nir_op_ldexp
  radv: lower 16-bit flrp
  ac/nir: support half floats in emit_b2f
  ac/nir: make emit_b2i work on all bit sizes
  ac/nir: implement 16-bit shifts
  compiler/nir: add lowering option for 16-bit ffma
  ac/nir: implement 16-bit ac_build_ddxy
  ac/nir: implement 8 and 16 bit ac_build_readlane
  nir: make bitfield_reverse and ifind_msb work with all integers
  ac/nir: make ac_find_lsb work on all bit sizes
  ac/nir: make ac_build_umsb work on all bit sizes
  ac/nir: implement 8 and 16 bit ac_build_imsb
  ac/nir: make ac_build_bit_count work on all bit sizes
  ac/nir: make ac_build_bitfield_reverse work on all bit sizes
  ac/nir: implement 16-bit pack/unpack opcodes
  ac/nir: add 8-bit types to glsl_base_to_llvm_type
  ac/nir,radv: create an array of varying output types
  ac/nir: store all outputs as f32
  radv: store all fragment shader inputs as f32
  radv: handle all fragment output types
  WIP: radv,ac: implement 16-bit interpolation
  WIP: ac,radv: run LLVM's SLP vectorizer
  ac/nir: generate better code for nir_op_f2f16_rtz
  ac/nir: have nir_op_f2f16 round to zero
  radv,docs: expose float16, int16 and int8 features and extensions

 docs/features.txt|   2 +-
 src/amd/common/ac_llvm_build.c   | 325 +++
 src/amd/common/ac_llvm_build.h   |  18 +-
 src/amd/common/ac_llvm_util.c|   8 +-
 src/amd/common/ac_nir_to_llvm.c  | 268 +++
 src/amd/common/ac_shader_abi.h   |   1 +
 src/amd/vulkan/radv_device.c |  17 ++
 src/amd/vulkan/radv_extensions.py|   4 +
 src/amd/vulkan/radv_nir_to_llvm.c| 123 +
 src/amd/vulkan/radv_pipeline.c   |  19 +-
 src/amd/vulkan/radv_shader.c |   4 +
 src/amd/vulkan/radv_shade

Re: [Mesa-dev] [PATCH v5 02/40] intel/compiler: add a NIR pass to lower conversions

2019-02-15 Thread Jason Ekstrand
On Fri, Feb 15, 2019 at 2:22 AM Iago Toral Quiroga 
wrote:

> Some conversions are not directly supported in hardware and need to be
> split into two conversion instructions going through an intermediate type.
> Doing this at the NIR level reduces the complexity of the backend.
>
> v2:
>  - Consider fp16 rounding conversion opcodes
>  - Properly handle swizzles on conversion sources.
>
> v3
>  - Run the pass earlier, right after nir_opt_algebraic_late (Jason)
>  - NIR alu output types already have the bit-size (Jason)
>  - Use 'is_conversion' to identify conversion operations (Jason)
>
> v4:
>  - Be careful about the intermediate types we use so we don't lose
>range and avoid incorrect rounding semantics (Jason)
>
> Reviewed-by: Topi Pohjolainen  (v1)
> ---
>  src/intel/Makefile.sources|   1 +
>  src/intel/compiler/brw_nir.c  |   2 +
>  src/intel/compiler/brw_nir.h  |   2 +
>  .../compiler/brw_nir_lower_conversions.c  | 169 ++
>  src/intel/compiler/meson.build|   1 +
>  5 files changed, 175 insertions(+)
>  create mode 100644 src/intel/compiler/brw_nir_lower_conversions.c
>
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 94a28d370e8..9975daa3ad1 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -83,6 +83,7 @@ COMPILER_FILES = \
> compiler/brw_nir_analyze_boolean_resolves.c \
> compiler/brw_nir_analyze_ubo_ranges.c \
> compiler/brw_nir_attribute_workarounds.c \
> +   compiler/brw_nir_lower_conversions.c \
> compiler/brw_nir_lower_cs_intrinsics.c \
> compiler/brw_nir_lower_image_load_store.c \
> compiler/brw_nir_lower_mem_access_bit_sizes.c \
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 9dbf06004a4..7e3dbc9e447 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -876,6 +876,8 @@ brw_postprocess_nir(nir_shader *nir, const struct
> brw_compiler *compiler,
>
> OPT(nir_opt_algebraic_late);
>
> +   OPT(brw_nir_lower_conversions);
> +
> OPT(nir_lower_to_source_mods, nir_lower_all_source_mods);
> OPT(nir_copy_prop);
> OPT(nir_opt_dce);
> diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h
> index bc81950d47e..662b2627e95 100644
> --- a/src/intel/compiler/brw_nir.h
> +++ b/src/intel/compiler/brw_nir.h
> @@ -114,6 +114,8 @@ void brw_nir_lower_tcs_outputs(nir_shader *nir, const
> struct brw_vue_map *vue,
> GLenum tes_primitive_mode);
>  void brw_nir_lower_fs_outputs(nir_shader *nir);
>
> +bool brw_nir_lower_conversions(nir_shader *nir);
> +
>  bool brw_nir_lower_image_load_store(nir_shader *nir,
>  const struct gen_device_info
> *devinfo);
>  void brw_nir_rewrite_image_intrinsic(nir_intrinsic_instr *intrin,
> diff --git a/src/intel/compiler/brw_nir_lower_conversions.c
> b/src/intel/compiler/brw_nir_lower_conversions.c
> new file mode 100644
> index 000..9aff30b568b
> --- /dev/null
> +++ b/src/intel/compiler/brw_nir_lower_conversions.c
> @@ -0,0 +1,169 @@
> +/*
> + * Copyright © 2018 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> next
> + * paragraph) shall be included in all copies or substantial portions of
> the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_nir.h"
> +#include "compiler/nir/nir_builder.h"
> +
> +static nir_op
> +get_conversion_op(nir_alu_type src_type,
> +  unsigned src_bit_size,
> +  nir_alu_type dst_type,
> +  unsigned dst_bit_size,
> +  nir_rounding_mode rounding_mode)
> +{
> +   nir_alu_type src_full_type = (nir_alu_type) (src_type | src_bit_size);
> +   nir_alu_type dst_full_type = (nir_alu_type) (dst_type | dst_bit_size);
> +
> +   return nir_type_conversion_op(src_full_type, dst_full_type,
> rounding_mode);

Re: [Mesa-dev] [PATCH v6 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-15 Thread Nanley Chery
On Fri, Feb 15, 2019 at 03:29:41PM +0200, Eleni Maria Stea wrote:
> GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the
> compressed EAC/ETC2 images to non-compressed RGBA images. When
> GetCompressed* functions were called, the pixels were returned in this
> RGBA format and not the compressed format that was expected.
> 
> To fix this problem, we use a secondary shadow miptree to store the
> decompressed data for rendering and the main miptree to store the
> compressed data so that the Get functions work. Each time the main
> miptree is written with compressed data, we decompress it to RGB and
> update the shadow. Then we use the shadow for rendering.
> 
> v2:
>- Fixes in the commit message (Nanley Chery)
>- Reversed the changes in brw_get_texture_swizzle and swapped the b, g
>values at the time that we decompress the data in the function:
>intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>- Simplified the format checks in the miptree_create function of the
>intel_mipmap_tree.c and reserved the call of the
>intel_lower_compressed_format for the case that we are faking the ETC
>support (Nanley Chery)
>- Removed the check for the auxiliary usage for the shadow miptree at
>creation (miptree_create of intel_mipmap_tree.c) as we won't use
>auxiliary buffers with these types of trees (Nanley Chery)
>- Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
>removed the unnecessary checks (Nanley Chery)
>- Fixed an unrelated indentation change (Nanley Chery)
>- Modified the function intel_miptree_finish_write to set the
>mt->shadow_needs_update to true to catch all the cases when we need to
>update the miptree (Nanley Chery)
>- In order to update the shadow miptree during the unmap of the
>main and always map the main (Nanley Chery), the following change was
>necessary: split the previous update function, which updated all
>the mipmap levels, into two functions: one that updates one
>level and one that updates all of them. Used the first during unmap
>and the second before the rendering.
>- Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
>miptree should be mapped each time and reversed all the changes in the
>higher level texture functions that upload data to textures as they
>aren't needed anymore.
>- Replaced the boolean needs_fake_etc with an inline function that
>checks when we need to fake the ETC compression (Nanley Chery)
>- Removed the initialization of the strides in the update function as
>the values will be overwritten by the intel_miptree_map call (Nanley
>Chery)
>- Used minify instead of division in the new update function
>intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
>Chery)
>- Removed the depth from the calculation of the number of slices in
>the new update function (intel_miptree_update_etc_shadow_levels of
>intel_mipmap_tree.c) as we don't need to support 3D ETC images.
>(Nanley Chery)
> 
> v3:
>   - Renamed the rgba_fmt in function miptree_create
>   (intel_mipmap_tree.c) to decomp_format as the format is not always in
>   rgba order. (Nanley Chery)
>   - Documented the new usage for the shadow miptree in the comment above
>   the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
>   Chery)
>   - Removed the redundant flags from the mapping of the miptrees in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the switch from surface's logical level to physical level in
>   the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
>   (Nanley Chery)
>   - Excluded the Baytrail GPUs from the check for the ETC emulation as
>   they support the ETC formats natively. (Nanley Chery)
>   - Simplified the check if the format is BGRA in
>   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
> 
> v4:
>   - Removed the functions intel_miptree_(map|unmap)_etc and the check if
>we need to call them as with the new changes, they became unreachable.
>(Nanley Chery)
>   - We'd rather calculate the level width and height using the shadow
>   miptree instead of the main in intel_miptree_update_etc_shadow_levels of
>   intel_mipmap_tree.c (Nanley Chery)
>   - Fixed the format in the mt_surface_usage, set at the miptree creation,
>in miptree_create of intel_mipmap_tree.c (Nanley Chery)
> 
> v5:
>   - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
>   - Update the flag shadow_needs_update outside the function
>   intel_miptree_update_etc_shadow (Nanley Chery)
>   - Fixed indentation error (Nanley Chery)
> 
> v6:
>   - Fixed typo in commit message (Nanley Chery)
>   - Simplified the assignment of the mt_fmt in the miptree_create of the
>   intel_mipmap_tree.c (Nanley Chery)
>   - Combined declarations and assignments where it was possible 
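
The v2 note about using minify instead of division can be illustrated with a small sketch. example_minify below follows the usual shift-and-clamp definition of a minify helper; both function names and the sample sizes are made up for illustration and are not taken from the patch.

```c
/* Size of a mip level: halve once per level, but never drop below 1.
 * Assumes levels is smaller than the bit width of unsigned. */
static unsigned
example_minify(unsigned value, unsigned levels)
{
   unsigned shifted = value >> levels;
   return shifted > 0 ? shifted : 1;
}

/* Hypothetical naive variant: plain integer division by 2^levels.
 * For small textures this reaches 0, which is not a valid level size. */
static unsigned
example_level_size_by_division(unsigned value, unsigned levels)
{
   return value / (1u << levels);
}
```

For a 4-texel-wide image at level 3, the division variant yields 0 while the clamped variant yields 1, which is presumably why the review asked for minify.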

Re: [Mesa-dev] [PATCH] Revert "glsl: relax input->output validation for SSO programs"

2019-02-15 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

But I think you should add something like the following to the commit 
message:


"This was fixed properly by commit ..."

Once you push "glsl/linker: don't fail non static used inputs without 
matching outputs"


On 9/2/19 4:06 am, Andres Gomez wrote:

This reverts commit 1aa5738e666a9534c7e5b46f077327e6d647c64f.

This patch incorrectly assumed that for SSOs no inner interface
matching check was needed.

 From the ARB_separate_shader_objects spec v.25:

   " With separable program objects, interfaces between shader stages
 may involve the outputs from one program object and the inputs
 from a second program object.  For such interfaces, it is not
 possible to detect mismatches at link time, because the programs
 are linked separately.  When each such program is linked, all
 inputs or outputs interfacing with another program stage are
 treated as active.  The linker will generate an executable that
 assumes the presence of a compatible program on the other side of
 the interface.  If a mismatch between programs occurs, no GL error
 will be generated, but some or all of the inputs on the interface
 will be undefined."

Fixes: 1aa5738e666 ("glsl: relax input->output validation for SSO programs")
Cc: Tapani Pälli 
Cc: Timothy Arceri 
Cc: Ilia Mirkin 
Cc: Samuel Iglesias Gonsálvez 
Cc: Ian Romanick 
Signed-off-by: Andres Gomez 
---
  src/compiler/glsl/link_varyings.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 3969c0120b3..4efdfcbc4f6 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -804,7 +804,7 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
   */
  assert(!input->data.assigned);
  if (input->data.used && !input->get_interface_type() &&
-!input->data.explicit_location && !prog->SeparateShader)
+!input->data.explicit_location)
 linker_error(prog,
  "%s shader input `%s' "
  "has no matching output in the previous stage\n",

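The effect of the one-line change above can be sketched as a pair of predicates. This is a simplified model, not the linker code: struct example_input and both function names are invented, and the sketch assumes the check only runs when no same-named output was found in the previous stage, which the error text suggests but the hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ir_variable fields the check reads. */
struct example_input {
   bool used;                 /* statically used by the shader */
   bool in_interface_block;   /* has an interface type */
   bool explicit_location;    /* declared with layout(location = N) */
};

/* Condition before the revert: separable programs were exempt. */
static bool
example_error_before_revert(const struct example_input *in, bool separable)
{
   return in->used && !in->in_interface_block &&
          !in->explicit_location && !separable;
}

/* Condition after the revert: separable programs are checked too. */
static bool
example_error_after_revert(const struct example_input *in)
{
   return in->used && !in->in_interface_block && !in->explicit_location;
}
```

The only case whose result changes is a used, non-block input without an explicit location in a separable program: it was accepted before the revert and is reported afterwards.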

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 4/6] glsl/linker: don't fail non static used inputs without matching outputs

2019-02-15 Thread Timothy Arceri
Actually you probably want to add this to the commit message also since 
it will also be needed if the revert patch gets picked up by stable:


Fixes: 1aa5738e666 ("glsl: relax input->output validation for SSO programs")

On 7/2/19 2:58 am, Andres Gomez wrote:

If there is no Static Use of an input variable, the linker shouldn't
fail whenever there is no defined matching output variable in the
previous stage.

 From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec:

   " Only the input variables that are statically read need to be
 written by the previous stage; it is allowed to have superfluous
 declarations of input variables."

Now, we complete this exception whenever the input variable has an
explicit location. Previously, 18004c338f6 ("glsl: fail when a
shader's input var has not an equivalent out var in previous") took
care of the cases in which the input variable didn't have an explicit
location.

v2: do the location based interface matching check regardless of
 whether it is a separable program or not (Ilia).

Cc: Timothy Arceri 
Cc: Iago Toral Quiroga 
Cc: Samuel Iglesias Gonsálvez 
Cc: Tapani Pälli 
Cc: Ian Romanick 
Cc: Ilia Mirkin 
Signed-off-by: Andres Gomez 
---
  src/compiler/glsl/link_varyings.cpp | 16 ++--
  1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index e5f7d3e322a..36908d95263 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -808,8 +808,20 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
  
 output = output_explicit_locations[idx][input->data.location_frac].var;
  
-   if (output == NULL ||

-   input->data.location != output->data.location) {
+   if (output == NULL) {
+  /* A linker failure should only happen when there is no
+   * output declaration and there is Static Use of the
+   * declared input.
+   */
+  if (input->data.used) {
+ linker_error(prog,
+  "%s shader input `%s' with explicit location 
"
+  "has no matching output\n",
+  
_mesa_shader_stage_to_string(consumer->Stage),
+  input->name);
+ break;
+  }
+   } else if (input->data.location != output->data.location) {
linker_error(prog,
 "%s shader input `%s' with explicit location "
 "has no matching output\n",

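As a rough model of the new control flow in the hunk above: a missing output is only an error when the input is statically used, while a location mismatch with an existing output is always an error. The struct and function names below are hypothetical and carry only the two fields the decision depends on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of a shader variable to what the check needs. */
struct example_var {
   int location;
   bool used;     /* only meaningful for the input */
};

/* Returns true when the explicit-location check would report a link error. */
static bool
example_explicit_input_fails(const struct example_var *input,
                             const struct example_var *output)
{
   if (output == NULL) {
      /* No output declaration: fail only on Static Use of the input. */
      return input->used;
   }
   /* An output exists: its location has to match the input's. */
   return input->location != output->location;
}
```

Under this model a superfluous, unused input declaration links cleanly, which matches the GLSL spec text quoted in the commit message.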


Re: [Mesa-dev] [PATCH v2 4/6] glsl/linker: don't fail non static used inputs without matching outputs

2019-02-15 Thread Timothy Arceri
If the updated piglit tests pass on the Nvidia blob as per my reply to 
those patches, and this patch passes on both the new and old piglit tests, 
then this patch is:


Reviewed-by: Timothy Arceri 

Thanks for fixing this!


[Mesa-dev] [Bug 109107] gallium/st/va: change va max_profiles when using Radeon VCN Hardware

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109107

Michael Eagle changed:

   What    |Removed |Added

   See Also|        |https://bugs.freedesktop.org/show_bug.cgi?id=109648

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH] panfrost: Fix various leaks unmapping resources

2019-02-15 Thread Alyssa Rosenzweig
v2: Don't check for NULL before free()

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_resource.c | 19 +++
 src/gallium/drivers/panfrost/pan_screen.h   |  4 +++-
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_resource.c 
b/src/gallium/drivers/panfrost/pan_resource.c
index f2cff7c80df..1287193c0e9 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -287,16 +287,18 @@ panfrost_destroy_bo(struct panfrost_screen *screen, 
struct panfrost_bo *pbo)
 {
struct panfrost_bo *bo = (struct panfrost_bo *)pbo;
 
-if (bo->entry[0] != NULL) {
-/* Most allocations have an entry to free */
-bo->entry[0]->freed = true;
-pb_slab_free(&screen->slabs, &bo->entry[0]->base);
+for (int l = 0; l < MAX_MIP_LEVELS; ++l) {
+if (bo->entry[l] != NULL) {
+/* Most allocations have an entry to free */
+bo->entry[l]->freed = true;
+pb_slab_free(&screen->slabs, &bo->entry[l]->base);
+}
 }
 
 if (bo->tiled) {
 /* Tiled has a malloc'd CPU, so just plain ol' free needed */
 
-for (int l = 0; bo->cpu[l]; l++) {
+for (int l = 0; l < MAX_MIP_LEVELS; ++l) {
 free(bo->cpu[l]);
 }
 }
@@ -509,9 +511,10 @@ panfrost_slab_can_reclaim(void *priv, struct pb_slab_entry 
*entry)
 static void
 panfrost_slab_free(void *priv, struct pb_slab *slab)
 {
-/* STUB */
-//struct panfrost_memory *mem = (struct panfrost_memory *) slab;
-printf("stub: Tried to free slab\n");
+struct panfrost_memory *mem = (struct panfrost_memory *) slab;
+struct panfrost_screen *screen = (struct panfrost_screen *) priv;
+
+screen->driver->free_slab(screen, mem);
 }
 
 static void
diff --git a/src/gallium/drivers/panfrost/pan_screen.h 
b/src/gallium/drivers/panfrost/pan_screen.h
index b89d921c71f..afb3d34b5b1 100644
--- a/src/gallium/drivers/panfrost/pan_screen.h
+++ b/src/gallium/drivers/panfrost/pan_screen.h
@@ -58,7 +58,9 @@ struct panfrost_driver {
   int extra_flags,
   int commit_count,
   int extent);
-   void (*enable_counters) (struct panfrost_screen *screen);
+void (*free_slab) (struct panfrost_screen *screen,
+   struct panfrost_memory *mem);
+void (*enable_counters) (struct panfrost_screen *screen);
 };
 
 struct panfrost_screen {
-- 
2.20.1


Re: [Mesa-dev] [PATCH v6 0/5] improved the support for ETC2 formats on Gen 7

2019-02-15 Thread Nanley Chery
On Fri, Feb 15, 2019 at 03:29:39PM +0200, Eleni Maria Stea wrote:
> Intel Gen7 GPUs don't support the ETC2 formats natively, so in order to
> show the pixels properly we decompress them and create decompressed
> miptrees. The problem is that functions that map the miptrees for
> reading (for example the GetCompressed* calls), which are supposed to
> read compressed pixel values, would read decompressed values instead,
> unless we prevented this with assertions that make user programs
> either crash or malfunction.
> 
> These patches attempt to solve this problem by using two miptrees:
> the main one to store the ETC values and the generic shadow
> (mt->shadow) to store the decompressed values. Each time the main
> miptree is mapped for writing, we set a flag indicating that the shadow
> needs an update, and we check this flag before every draw call to update
> the shadow miptree. (We perform the check right before drawing to avoid
> missing changes from functions like CopyImageSubData in the next
> frame.) Then we map the shadow for sampling. This way, we can render the
> images using the decompressed pixels of the shadow, but we return the
> compressed ones from the main miptree when the texture is mapped for
> reading.
> 
> Also, the OES_copy_image extension, which couldn't work on Gen 7 due to
> the lack of ETC support, is now re-enabled.
> 
> Finally, the following glcts and piglit tests pass:
> 
> On HSW (previously failing):
> 
> KHR-GL46.direct_state_access.textures_compressed_subimage
> 
> On HSW and IVB (previously skipped):
> -
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_alpha8_etc2_eac.*
>(6 tests)
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_etc2.*
>(6 tests)
> dEQP-GLES31.functional.texture.border_clamp.formats.compressed_srgb8_punchthrough_alpha1_etc2.*
>(6 tests)
> 
> On HSW, IVB, SNB (previously skipped):
> ---
> dEQP-GLES3.functional.texture.format.compressed.*
>(12 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_eac_srgb8_alpha8.*
>(36 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_srgb8.*
>(36 tests)
> dEQP-GLES3.functional.texture.wrap.etc2_srgb8_punchthrough_alpha1.*
>(36 tests)
> 
> piglit.spec.!opengl es 3_0.oes_compressed_etc2_texture-miptree_gles3 
>(srgb8, srgb8-alpha, srgb8-punchthrough-alpha1)
> piglit.spec.arb_es3_compatibility.oes_compressed_etc2_texture-miptree
>(srgb8 compat, srgb8 core, srgb8-alpha8 compat, srgb8-alpha8 core,
> srgb8-punchthrough-alpha1 compat, srgb8-punchthrough-alpha1 core)
> (9 tests)
> 
> Total tests passing: 148
> 
> Eleni Maria Stea (4):
>   i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.
>   i965: Fixed the CopyImageSubData for ETC2 on Gen < 8
>   i965: Enabled the OES_copy_image extension on Gen 7 GPUs
>   i965: Removed the field etc_format from the struct intel_mipmap_tree
> 

These patches are
Reviewed-by: Nanley Chery 

I like how this series turned out. Thank you!

> Nanley Chery (1):
>   i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*
> 
>  src/mesa/drivers/dri/i965/brw_draw.c  |   5 +
>  .../drivers/dri/i965/brw_wm_surface_state.c   |  15 +-
>  src/mesa/drivers/dri/i965/intel_extensions.c  |  16 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 170 ++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  48 +++--
>  5 files changed, 149 insertions(+), 105 deletions(-)
> 
> -- 
> 2.20.1
> 
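
The two-miptree scheme from the cover letter can be modelled in a few lines. This is only a toy under stated assumptions: struct example_miptree, the function names and the block count are invented, and example_decode is a placeholder that makes the two copies differ rather than real ETC2 decoding. Only the shadow_needs_update flag name comes from the changelog.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_BLOCKS 4

struct example_miptree {
   uint8_t compressed[EXAMPLE_BLOCKS];   /* main: what Get* calls return */
   uint8_t shadow[EXAMPLE_BLOCKS];       /* shadow: what sampling reads */
   bool shadow_needs_update;
};

/* Placeholder for ETC2 decoding; it just inverts the bits. */
static uint8_t
example_decode(uint8_t block)
{
   return (uint8_t)(block ^ 0xff);
}

/* Writing to the main miptree only marks the shadow as stale. */
static void
example_write(struct example_miptree *mt, const uint8_t *data)
{
   memcpy(mt->compressed, data, EXAMPLE_BLOCKS);
   mt->shadow_needs_update = true;
}

/* Right before a draw, refresh the shadow if needed and sample from it. */
static const uint8_t *
example_prepare_draw(struct example_miptree *mt)
{
   if (mt->shadow_needs_update) {
      for (int i = 0; i < EXAMPLE_BLOCKS; i++)
         mt->shadow[i] = example_decode(mt->compressed[i]);
      mt->shadow_needs_update = false;
   }
   return mt->shadow;
}
```

Deferring the refresh to draw time means several writes between two draws cost one decompression, and the compressed copy is never modified by rendering.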

Re: [Mesa-dev] [PATCH 3/3] panfrost: Swap order of tiled texture (de)alloc

2019-02-15 Thread Alyssa Rosenzweig
> Am I reading this correctly, that now we free a slab [for a tiled
> texture] before allocating new one? The commit message seems rather
> cryptic.

Yeah, it's a super minor fix. Rather than allocating a new slab for the
texture and then freeing the old slab, we free then allocate.
Practically, it's identical (since there's no dependency, and they ought
to be the same size...), but it does avoid having to expand the pool if
we're right on the edge. Almost nitpick level, but hey, if it makes
memory use slightly lower and performance slightly more predictable :)
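
The "right on the edge" case can be sketched with a counting model of a pool. Everything here is hypothetical (struct example_pool and its helpers are not the pb_slab API); the pool simply grows by one slab whenever an allocation finds it full.

```c
/* Toy pool: tracks capacity, live slabs, and how often it had to grow. */
struct example_pool {
   unsigned capacity;
   unsigned in_use;
   unsigned grows;
};

static void
example_alloc(struct example_pool *p)
{
   if (p->in_use == p->capacity) {
      p->capacity++;   /* pool is full, expand it */
      p->grows++;
   }
   p->in_use++;
}

/* Assumes at least one slab is currently in use. */
static void
example_free(struct example_pool *p)
{
   p->in_use--;
}

/* Old order: allocate the new slab, then free the old one. */
static unsigned
example_replace_alloc_first(struct example_pool p)
{
   example_alloc(&p);
   example_free(&p);
   return p.grows;
}

/* New order: free the old slab, then allocate the new one. */
static unsigned
example_replace_free_first(struct example_pool p)
{
   example_free(&p);
   example_alloc(&p);
   return p.grows;
}
```

With a full pool, the alloc-first order forces one expansion while the free-first order forces none; with spare capacity the two orders behave the same.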

[Mesa-dev] [Bug 109646] New video compositor compute shader render glitches mpv

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109646

--- Comment #4 from bmil...@gmail.com ---
Ok, some updates:

1. The stutter/framedrop was related to the codec, not your patches. Switching
to dav1d improved it.
2. The last patch you posted fixed the black UI issues.
3. Got a comparison shot of how the jagginess looks between mesa before and
after your implementation: https://i.imgur.com/A6ko6ap.png


[Mesa-dev] [Bug 109201] Deep Rock Galactic: GPU Hang (Steam Play) (DXVK)

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109201

Alexander changed:

   What      |Removed |Added

   Resolution|---     |FIXED
   Status    |NEW     |RESOLVED

--- Comment #16 from Alexander ---
OK, it works now: the bug no longer occurs, and it even works with Mesa
18.3.3. I used Mesa 18.3.2 before.

Thanks!


[Mesa-dev] [Bug 109599] small shadows are not drawn in Heroes of the Storm

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109599

--- Comment #8 from tempel.jul...@gmail.com ---
Regarding the "hedges regression": Sorry, false alarm. It looks the same on
Windows DX11, the game developer apparently has recently downgraded the game's
visuals.


[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #30 from Mark Janes  ---
(In reply to Mark Janes from comment #28)
>   https://android-review.googlesource.com/c/platform/external/deqp/+/901894

Mesa still asserts with this fix.  I also tested Andrii's mesa patch with the
dEQP fix and the test fails.

Since non-mesa drivers have found issues with the original dEQP change, I
suspect there are still deeper problems with the test.


Re: [Mesa-dev] [PATCH 1/3] panfrost: Fix various leaks unmapping resources

2019-02-15 Thread Alyssa Rosenzweig
> Nit: staying consistent with "foo != NULL" vs "foo" checks helps a
> lot.

Which form is preferred?

> free(NULL); is perfectly valid.

Huh, TIL, thank you.

> The function pointer seems to be NULL. Did you forget to git add the
> file which sets it?

See my comment in the other mail about the overlay. Generally, functions
of the form "screen->driver->..." are specific to the kernel module in
question, and we're not able to upstream the code specific to the vendor
kernel for various reasons. It may be a good idea to stub out the
corresponding routines in pan_drm.c, but that's up to Rob and Tomeu.
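
For reference, the free(NULL) point and the fixed-bound loop from the patch can be sketched together. EXAMPLE_MAX_LEVELS and struct example_bo are invented for the sketch and are not the panfrost definitions; the counter exists only so the sketch has something to report.

```c
#include <stdlib.h>

#define EXAMPLE_MAX_LEVELS 4

struct example_bo {
   void *cpu[EXAMPLE_MAX_LEVELS];   /* per-level CPU copies, may be NULL */
};

/* Frees every level and returns how many slots actually held memory. */
static int
example_free_levels(struct example_bo *bo)
{
   int released = 0;

   for (int l = 0; l < EXAMPLE_MAX_LEVELS; ++l) {
      if (bo->cpu[l] != NULL)
         released++;
      free(bo->cpu[l]);    /* free(NULL) is a no-op, so no guard is needed */
      bo->cpu[l] = NULL;   /* makes a second call harmless */
   }

   return released;
}
```

A loop that stops at the first NULL slot would miss any level allocated after a gap, and would run off the end of the array if every slot were populated; the fixed bound avoids both.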

[Mesa-dev] [Bug 109647] /usr/include/xf86drm.h:40:10: fatal error: drm.h: No such file or directory

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109647

Bug ID: 109647
   Summary: /usr/include/xf86drm.h:40:10: fatal error: drm.h: No
such file or directory
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org

autotools build error

  CC   tegra_screen.lo
In file included from tegra_screen.c:33:
/usr/include/xf86drm.h:40:10: fatal error: drm.h: No such file or directory
 #include <drm.h>
          ^~~~~~~


commit f1374805a86d0d506557e61efbc09e23caa7a038
Author: Eric Engestrom 
Date:   Tue Feb 12 18:18:03 2019 +

drm-uapi: use local files, not system libdrm

There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"

Signed-off-by: Eric Engestrom 
Reviewed-by: Kristian H. Kristensen 


Re: [Mesa-dev] [PATCH] panfrost: Backport driver to Mali T600/T700

2019-02-15 Thread Alyssa Rosenzweig
> - about 1/5 of the patch seems to be white space changes

...Oops. Any tips for avoiding this type of diff churn in the future? I
suppose it's not inherently harmful, but maybe it could make merging
more difficult than strictly necessary.

> - doesn't seem like BIFROST is defined anywhere

Indeed it's not; Bifrost is not yet supported, but at least this way we
can share headers with the out-of-tree work on Bifrost (is anyone
working on these parts right now..?)

> - other drivers keep details like is_t6xx, require_sfbd, others in
> driver/screen specific struct

Aye, that'll be fixed next patch :)

> - the __LP64__ checks seems suspicious, no other mesa driver has those

Is there a better way to handle mixed bit-ness? We have shared memory
(sort of -- separate MMUs, separate address spaces, but they're mapped
together with shared physical RAM and we opt for SAME_VA where gpu_va ==
user_cpu_va). As such, 32-bit Mali and 64-bit Mali behave differently,
since pointers are larger and then some fields get rearranged to pack
tighter/less-so depending on pointer sizes. There's no real benefit to
support both modes in the same build of the driver; by far, having a
32-bit build for armhf with 32-bit Mali descriptors and a 64-bit build
for aarch64 with 64-bit descriptors is the sane approach. Accordingly,
I reasoned that __LP64__ is the cleanest way to check what type of
system we're building for, and from there which descriptor flavour we
should use. Is there something inherently problematic about this scheme?

In theory we can mix and match, the hardware can do both regardless of
the CPU as far as I know, but that complicates things dramatically for
little benefit.

Keep in mind that Midgard onwards uses descriptors in shared memory,
rather than a true command stream, so it's possible no other mesa driver
does this since no other mesa-supported hardware needs this.

Thank you,

Alyssa
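
A minimal sketch of the __LP64__ scheme described above, assuming an ILP32 or LP64 target (it would not hold on a 64-bit platform that leaves __LP64__ undefined). The type and struct names are hypothetical and do not reflect the real Mali descriptor layout.

```c
#include <stdint.h>

/* Pick the descriptor pointer width from the build target. */
#ifdef __LP64__
typedef uint64_t example_gpu_ptr;   /* 64-bit descriptor flavour */
#else
typedef uint32_t example_gpu_ptr;   /* 32-bit descriptor flavour */
#endif

/* Hypothetical descriptor: its size and padding follow the pointer width. */
struct example_descriptor {
   example_gpu_ptr next;   /* GPU VA of the next descriptor */
   uint32_t flags;
};

/* With SAME_VA, the GPU address is just the CPU address. */
static example_gpu_ptr
example_gpu_va(const void *cpu_ptr)
{
   return (example_gpu_ptr)(uintptr_t)cpu_ptr;
}
```

Because the typedef is resolved at compile time, each build carries exactly one descriptor flavour, which is the one-flavour-per-build approach argued for above.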

Re: [Mesa-dev] [PATCH] panfrost: Fix build; depend on libdrm

2019-02-15 Thread Alyssa Rosenzweig
> Feel free to reuse:
> meson: panfrost: add missing libdrm dependency

Hmm?

Re: [Mesa-dev] [PATCH 2/3] panfrost: Free imported BOs

2019-02-15 Thread Alyssa Rosenzweig
> Seems like a file is missing - git add pan_screen.c perhaps? Neither
> the function pointer nor the imported/imported_size are set anywhere.

This is defined in the out-of-tree overlay for working with the vendor
kernel. When the DRM driver gains support for these features (Rob and
Tomeu are making good progress here), it will be added in tree in
pan_drm.c, which is currently just a stub for future work.

[Mesa-dev] [PATCH v3] gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba

2019-02-15 Thread Zhu, James
Fixes: 9364d66cb7f7 (Add video compositor compute shader render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Problems 1 and 4 are caused by the incomplete blend compute shader
implementation, so revert rgba back to the fragment shader.

Signed-off-by: James Zhu 
---
 src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index 8731ad9..a8f3620 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
  debug_printf("Unable to create YCbCr-to-RGB weave fragment 
shader.\n");
  return false;
   }
+   }
 
-  c->fs_rgba = create_frag_shader_rgba(c);
-  if (!c->fs_rgba) {
- debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
- return false;
-  }
+   c->fs_rgba = create_frag_shader_rgba(c);
+   if (!c->fs_rgba) {
+  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
+  return false;
}
 
return true;
@@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
} else {
   c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
   c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
-  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
}
+   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
 }
 
 static bool
@@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state *s,
assert(layer < VL_COMPOSITOR_MAX_LAYERS);
 
s->used_layers |= 1 << layer;
-   if (c->pipe_compute_supported)
-  s->layers[layer].cs = c->cs_rgba;
-   else
-  s->layers[layer].fs = c->fs_rgba;
+   s->layers[layer].fs = c->fs_rgba;
s->layers[layer].samplers[0] = c->sampler_linear;
s->layers[layer].samplers[1] = NULL;
s->layers[layer].samplers[2] = NULL;
-- 
2.7.4


[Mesa-dev] [PATCH v2] gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba

2019-02-15 Thread Zhu, James
Bugzilla bug 109646 - New video compositor compute shader render glitches mpv
https://bugs.freedesktop.org/show_bug.cgi?id=109646
Problems 1 and 4 are caused by the incomplete blend compute shader
implementation, so revert rgba back to the fragment shader from
commit 9364d66cb7f7deb83876a44bb4e29e8105141c16.

Signed-off-by: James Zhu 
---
 src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index 8731ad9..a8f3620 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
  debug_printf("Unable to create YCbCr-to-RGB weave fragment 
shader.\n");
  return false;
   }
+   }
 
-  c->fs_rgba = create_frag_shader_rgba(c);
-  if (!c->fs_rgba) {
- debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
- return false;
-  }
+   c->fs_rgba = create_frag_shader_rgba(c);
+   if (!c->fs_rgba) {
+  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
+  return false;
}
 
return true;
@@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
} else {
   c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
   c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
-  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
}
+   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
 }
 
 static bool
@@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state *s,
assert(layer < VL_COMPOSITOR_MAX_LAYERS);
 
s->used_layers |= 1 << layer;
-   if (c->pipe_compute_supported)
-  s->layers[layer].cs = c->cs_rgba;
-   else
-  s->layers[layer].fs = c->fs_rgba;
+   s->layers[layer].fs = c->fs_rgba;
s->layers[layer].samplers[0] = c->sampler_linear;
s->layers[layer].samplers[1] = NULL;
s->layers[layer].samplers[2] = NULL;
-- 
2.7.4


Re: [Mesa-dev] [PATCH] gallium/auxiliary/vl: Revert rgba back to fragment shader

2019-02-15 Thread Liu, Leo

On 2/15/19 3:42 PM, Zhu, James wrote:
> Bugzilla Bug 109646 - New video compositor compute shader render glitches mpv
> Problems 1 and 4 are caused by the incomplete blend compute shader
> implementation, so revert rgba back to the fragment shader.

Please look at other commit messages for how to put the Bugzilla link here,
and likewise for how to add a "Fixes:" tag.

Leo


>
> Signed-off-by: James Zhu 
> ---
>   src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
>   1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/auxiliary/vl/vl_compositor.c b/src/gallium/auxiliary/vl/vl_compositor.c
> index 8731ad9..a8f3620 100644
> --- a/src/gallium/auxiliary/vl/vl_compositor.c
> +++ b/src/gallium/auxiliary/vl/vl_compositor.c
> @@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
>debug_printf("Unable to create YCbCr-to-RGB weave fragment 
> shader.\n");
>return false;
> }
> +   }
>   
> -  c->fs_rgba = create_frag_shader_rgba(c);
> -  if (!c->fs_rgba) {
> - debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
> - return false;
> -  }
> +   c->fs_rgba = create_frag_shader_rgba(c);
> +   if (!c->fs_rgba) {
> +  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
> +  return false;
>  }
>   
>  return true;
> @@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
>  } else {
> c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
> c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
> -  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
>  }
> +   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
>   }
>   
>   static bool
> @@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state *s,
>  assert(layer < VL_COMPOSITOR_MAX_LAYERS);
>   
>  s->used_layers |= 1 << layer;
> -   if (c->pipe_compute_supported)
> -  s->layers[layer].cs = c->cs_rgba;
> -   else
> -  s->layers[layer].fs = c->fs_rgba;
> +   s->layers[layer].fs = c->fs_rgba;
>  s->layers[layer].samplers[0] = c->sampler_linear;
>  s->layers[layer].samplers[1] = NULL;
>  s->layers[layer].samplers[2] = NULL;

[Mesa-dev] [Bug 109646] New video compositor compute shader render glitches mpv

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109646

--- Comment #3 from bmil...@gmail.com ---
(In reply to jam...@amd.com from comment #2)
> bmil...@gmail.com: 
> Problem 1,4: the blend implementation seems incomplete with compute shader.
> Problem 3: could you provide the clip and the screen capture? I want to
> check on my bench.
> Problem 2: So far I didn't see on my bench. I will try to reproduce here.

Also found an extra problem with this test case. Both vaapi and vdpau are
dropping lots of frames on it.

Could you try:
mpv --vo=vaapi --hwdec=vaapi https://www.youtube.com/watch?v=LXb3EKWsInQ
mpv --vo=vdpau --hwdec=vdpau https://www.youtube.com/watch?v=LXb3EKWsInQ

Then resize down the window.
This reproduces the jagginess and also introduces constant stuttering.

Let me know if you can reproduce, I'm at work now but I can provide captures
and logs when I come home if needed.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #31 from andrii simiklit  ---
(In reply to Mark Janes from comment #30)
> (In reply to Mark Janes from comment #28)
> >   https://android-review.googlesource.com/c/platform/external/deqp/+/901894
> 
> Mesa still asserts with this fix.  I also tested Andrii's mesa patch with
> the dEQP fix and the test fails.
Do you mean Chris's dEQP fix here?
It looks like that fix accounts for some GL limitations but
doesn't affect the expectations about binding points.

Also, the assertion is a separate issue; I created a piglit test for it:
https://patchwork.freedesktop.org/patch/286287/
But yes, we are unable to fix the test failure without first fixing the
assertion, because of the crash :-)

> 
> Since non-mesa drivers have found issues with the original dEQP change, I
> suspect there are still deeper problems with the test.
Possibly they have the same issue with binding-point mismatches after
optimizations by the GLSL compiler.
They could try this fix/hack for dEQP, which has already helped us:
https://github.com/asimiklit/deqp/commit/91cff8150944213f6da533e281ee76d95ca00f21
If it helps them, we will know that it is a common issue, and that could
expedite this:
https://github.com/KhronosGroup/OpenGL-API/issues/46

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] gallium/auxiliary/vl: Revert rgba back to fragment shader

2019-02-15 Thread Zhu, James
Bugzilla Bug 109646 - New video compositor compute shader render glitches mpv
Problems 1 and 4 are caused by the incomplete blend compute shader
implementation, so revert rgba back to the fragment shader.

Signed-off-by: James Zhu 
---
 src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c b/src/gallium/auxiliary/vl/vl_compositor.c
index 8731ad9..a8f3620 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
  debug_printf("Unable to create YCbCr-to-RGB weave fragment shader.\n");
  return false;
   }
+   }
 
-  c->fs_rgba = create_frag_shader_rgba(c);
-  if (!c->fs_rgba) {
- debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
- return false;
-  }
+   c->fs_rgba = create_frag_shader_rgba(c);
+   if (!c->fs_rgba) {
+  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
+  return false;
}
 
return true;
@@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
} else {
   c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
   c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
-  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
}
+   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
 }
 
 static bool
@@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state *s,
assert(layer < VL_COMPOSITOR_MAX_LAYERS);
 
s->used_layers |= 1 << layer;
-   if (c->pipe_compute_supported)
-  s->layers[layer].cs = c->cs_rgba;
-   else
-  s->layers[layer].fs = c->fs_rgba;
+   s->layers[layer].fs = c->fs_rgba;
s->layers[layer].samplers[0] = c->sampler_linear;
s->layers[layer].samplers[1] = NULL;
s->layers[layer].samplers[2] = NULL;
-- 
2.7.4


[Mesa-dev] [Bug 109646] New video compositor compute shader render glitches mpv

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109646

--- Comment #2 from jam...@amd.com  ---
bmil...@gmail.com: 
Problem 1,4: the blend implementation seems incomplete with compute shader.
Problem 3: could you provide the clip and the screen capture? I want to check
on my bench.
Problem 2: So far I didn't see on my bench. I will try to reproduce here.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [Bug 109646] New video compositor compute shader render glitches mpv

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109646

--- Comment #1 from leoxs...@gmail.com ---
Thanks for the testing. We'll have a look.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 109646] New video compositor compute shader render glitches mpv

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109646

            Bug ID: 109646
           Summary: New video compositor compute shader render glitches mpv
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: bmil...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

If I play a video with mpv --vo=vaapi OR --vo=vdpau after this commit
https://gitlab.freedesktop.org/mesa/mesa/commit/f6ac0b5d7187ebb6839fc884e1dbfa8f1dd21eac
I get a few problems:
1. Press i to show mpv info and it will show black rectangles instead. 
2. Going fullscreen fails silently sometimes.
3. Video looks really jaggy when downsized, more than the usual with those --vo
backends.
4. Places where the UI should be transparent are black instead (related to
first issue?).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] panfrost: Fix build; depend on libdrm

2019-02-15 Thread Emil Velikov
On Fri, 15 Feb 2019 at 08:55, Alyssa Rosenzweig  wrote:
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/gallium/drivers/panfrost/meson.build | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/panfrost/meson.build b/src/gallium/drivers/panfrost/meson.build
> index 17fdb5e1e0d..8ea0b8d2025 100644
> --- a/src/gallium/drivers/panfrost/meson.build
> +++ b/src/gallium/drivers/panfrost/meson.build
> @@ -84,6 +84,7 @@ libpanfrost = static_library(
>[files_panfrost, midgard_nir_algebraic_c],
>dependencies: [
>  dep_thread,
> +dep_libdrm,

Feel free to reuse:
meson: panfrost: add missing libdrm dependency

Either way, the patch is
Reviewed-by: Emil Velikov 

-Emil

Re: [Mesa-dev] [PATCH 3/3] panfrost: Swap order of tiled texture (de)alloc

2019-02-15 Thread Emil Velikov
On Fri, 15 Feb 2019 at 08:50, Alyssa Rosenzweig  wrote:
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/gallium/drivers/panfrost/pan_resource.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
> index b13461013f5..4be371ba32e 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.c
> +++ b/src/gallium/drivers/panfrost/pan_resource.c
> @@ -415,12 +415,6 @@ panfrost_tile_texture(struct panfrost_screen *screen, 
> struct panfrost_resource *
>
>  int swizzled_sz = panfrost_swizzled_size(width, height, 
> bytes_per_pixel);
>
> -/* Allocate the transfer given that known size but do not copy */
> -struct pb_slab_entry *entry = pb_slab_alloc(&screen->slabs, 
> swizzled_sz, HEAP_TEXTURE);
> -struct panfrost_memory_entry *p_entry = (struct 
> panfrost_memory_entry *) entry;
> -struct panfrost_memory *backing = (struct panfrost_memory *) 
> entry->slab;
> -uint8_t *swizzled = backing->cpu + p_entry->offset;
> -
>  /* Save the entry. But if there was already an entry here (from a
>   * previous upload of the resource), free that one so we don't leak 
> */
>
> @@ -429,6 +423,12 @@ panfrost_tile_texture(struct panfrost_screen *screen, 
> struct panfrost_resource *
>  pb_slab_free(&screen->slabs, &bo->entry[level]->base);
>  }
>
> +/* Allocate the transfer given that known size but do not copy */
> +struct pb_slab_entry *entry = pb_slab_alloc(&screen->slabs, 
> swizzled_sz, HEAP_TEXTURE);
> +struct panfrost_memory_entry *p_entry = (struct 
> panfrost_memory_entry *) entry;
> +struct panfrost_memory *backing = (struct panfrost_memory *) 
> entry->slab;
> +uint8_t *swizzled = backing->cpu + p_entry->offset;
> +
Am I reading this correctly, that now we free a slab [for a tiled
texture] before allocating new one? The commit message seems rather
cryptic.

HTH
Emil

[Mesa-dev] [Bug 109645] build error on arm64: tegra_screen.c:33: /usr/include/xf86drm.h:41:10: fatal error: drm.h: No such file or directory

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109645

            Bug ID: 109645
           Summary: build error on arm64: tegra_screen.c:33:
                    /usr/include/xf86drm.h:41:10: fatal error: drm.h: No
                    such file or directory
           Product: Mesa
           Version: git
          Hardware: ARM
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: pedretti.fa...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org
                CC: fdo-b...@engestrom.ch

tegra has failed to build for a couple of days:

In file included from ../src/gallium/drivers/tegra/tegra_screen.c:33:
/usr/include/xf86drm.h:41:10: fatal error: drm.h: No such file or directory
 #include <drm.h>
          ^~~~~~~
compilation terminated.

Full build log:
https://launchpadlibrarian.net/411385733/buildlog_ubuntu-cosmic-arm64.mesa_19.1~git1902150730.08bfd7~oibaf~c_BUILDING.txt.gz

Not bisected, but presumably related to:
https://cgit.freedesktop.org/mesa/mesa/commit/?id=f1374805a86d0d506557e61efbc09e23caa7a038

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535

Mark Janes  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |109594


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109594
[Bug 109594] totem assert failure: totem: src/intel/genxml/gen9_pack.h:72:
__gen_uint: Assertion `v <= max' failed.
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 2/3] panfrost: Free imported BOs

2019-02-15 Thread Emil Velikov
On Fri, 15 Feb 2019 at 08:50, Alyssa Rosenzweig  wrote:
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/gallium/drivers/panfrost/pan_resource.c | 4 
>  src/gallium/drivers/panfrost/pan_resource.h | 6 ++
>  src/gallium/drivers/panfrost/pan_screen.h   | 2 ++
>  3 files changed, 12 insertions(+)
>
> diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
> index fb9b8e63c83..b13461013f5 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.c
> +++ b/src/gallium/drivers/panfrost/pan_resource.c
> @@ -313,6 +313,10 @@ panfrost_destroy_bo(struct panfrost_screen *screen, 
> struct panfrost_bo *pbo)
>  /* TODO */
>  printf("--leaking checksum (%zd bytes)--\n", 
> bo->checksum_slab.size);
>  }
> +
> +if (bo->imported) {
> +screen->driver->free_imported_bo(screen, bo);
> +}
>  }
>
>  static void
> diff --git a/src/gallium/drivers/panfrost/pan_resource.h b/src/gallium/drivers/panfrost/pan_resource.h
> index 78baffbd1b2..48c0ca7fbb1 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.h
> +++ b/src/gallium/drivers/panfrost/pan_resource.h
> @@ -45,6 +45,12 @@ struct panfrost_bo {
>  /* Memory entry corresponding to gpu above */
>  struct panfrost_memory_entry *entry[MAX_MIP_LEVELS];
>
> +/* Set if this bo was imported rather than allocated */
> +bool imported;
> +
> +/* Number of bytes of the imported allocation */
> +size_t imported_size;
> +
>  /* Set for tiled, clear for linear. */
>  bool tiled;
>
> diff --git a/src/gallium/drivers/panfrost/pan_screen.h b/src/gallium/drivers/panfrost/pan_screen.h
> index afb3d34b5b1..646923c9864 100644
> --- a/src/gallium/drivers/panfrost/pan_screen.h
> +++ b/src/gallium/drivers/panfrost/pan_screen.h
> @@ -60,6 +60,8 @@ struct panfrost_driver {
>int extent);
>  void (*free_slab) (struct panfrost_screen *screen,
> struct panfrost_memory *mem);
> +void (*free_imported_bo) (struct panfrost_screen *screen,
> + struct panfrost_bo *bo);

Seems like a file is missing - git add pan_screen.c perhaps? Neither
the function pointer nor the imported/imported_size are set anywhere.

HTH
-Emil
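
To illustrate the concern about the unset function pointer: a minimal
sketch, assuming a driver vtable whose hook may have been left NULL. All
names here (demo_bo, demo_driver, demo_destroy_bo) are hypothetical and
are not the panfrost structures; the point is only that an optional hook
should be checked, or guaranteed to be installed, before it is called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object: only the fields needed for the sketch. */
struct demo_bo {
   bool imported;   /* set if the BO was imported rather than allocated */
   bool released;   /* set once the import has been released */
};

/* Hypothetical driver vtable with an optional hook. */
struct demo_driver {
   void (*free_imported_bo)(struct demo_bo *bo);
};

/* Example hook implementation: just records that it ran. */
static void
demo_free_imported(struct demo_bo *bo)
{
   bo->released = true;
}

/* Returns false if an imported BO could not be released because the
 * hook was never installed, instead of jumping through a NULL pointer. */
static bool
demo_destroy_bo(const struct demo_driver *drv, struct demo_bo *bo)
{
   if (!bo->imported)
      return true;               /* nothing imported, nothing to do */

   if (!drv->free_imported_bo)
      return false;              /* hook missing: report, don't crash */

   drv->free_imported_bo(bo);
   return true;
}
```

With the hook left NULL, demo_destroy_bo() reports failure for an
imported BO; once demo_free_imported is installed it releases the BO.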

Re: [Mesa-dev] [PATCH 1/3] panfrost: Fix various leaks unmapping resources

2019-02-15 Thread Emil Velikov
Hi Alyssa,

On Fri, 15 Feb 2019 at 08:50, Alyssa Rosenzweig  wrote:
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  src/gallium/drivers/panfrost/pan_resource.c | 22 -
>  src/gallium/drivers/panfrost/pan_screen.h   |  4 +++-
>  2 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
> index 7fa00117a28..fb9b8e63c83 100644
> --- a/src/gallium/drivers/panfrost/pan_resource.c
> +++ b/src/gallium/drivers/panfrost/pan_resource.c
> @@ -287,17 +287,20 @@ panfrost_destroy_bo(struct panfrost_screen *screen, 
> struct panfrost_bo *pbo)
>  {
> struct panfrost_bo *bo = (struct panfrost_bo *)pbo;
>
> -if (bo->entry[0] != NULL) {
> -/* Most allocations have an entry to free */
> -bo->entry[0]->freed = true;
> -pb_slab_free(&screen->slabs, &bo->entry[0]->base);
> +for (int l = 0; l < MAX_MIP_LEVELS; ++l) {
> +if (bo->entry[l] != NULL) {
Nit: staying consistent with "foo != NULL" vs "foo" checks helps a lot.

> +/* Most allocations have an entry to free */
> +bo->entry[l]->freed = true;
> +pb_slab_free(&screen->slabs, &bo->entry[l]->base);
> +}
>  }
>
>  if (bo->tiled) {
>  /* Tiled has a malloc'd CPU, so just plain ol' free needed */
>
> -for (int l = 0; bo->cpu[l]; l++) {
> -free(bo->cpu[l]);
> +for (int l = 0; l < MAX_MIP_LEVELS; ++l) {
> +if (bo->cpu[l])
free(NULL); is perfectly valid.

> +free(bo->cpu[l]);
>  }
>  }
>
> @@ -509,9 +512,10 @@ panfrost_slab_can_reclaim(void *priv, struct 
> pb_slab_entry *entry)
>  static void
>  panfrost_slab_free(void *priv, struct pb_slab *slab)
>  {
> -/* STUB */
> -//struct panfrost_memory *mem = (struct panfrost_memory *) slab;
> -printf("stub: Tried to free slab\n");
> +struct panfrost_memory *mem = (struct panfrost_memory *) slab;
> +struct panfrost_screen *screen = (struct panfrost_screen *) priv;
> +
> +screen->driver->free_slab(screen, mem);
The function pointer seems to be NULL. Did you forget to git add the
file which sets it?

HTH
-Emil
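
On the free(NULL) remark: the C standard defines free() on a null
pointer as a no-op, so a per-level cleanup loop needs no guard. A minimal
sketch with hypothetical names (demo_bo, DEMO_MAX_LEVELS and
demo_bo_free_cpu are illustrative, not the panfrost code):

```c
#include <stdlib.h>

#define DEMO_MAX_LEVELS 4   /* hypothetical stand-in for MAX_MIP_LEVELS */

struct demo_bo {
   void *cpu[DEMO_MAX_LEVELS];   /* per-level CPU copies; unused ones are NULL */
};

/* Free every level unconditionally. Returns how many levels actually
 * held memory; the count exists only to make the sketch observable. */
static int
demo_bo_free_cpu(struct demo_bo *bo)
{
   int freed = 0;

   for (int l = 0; l < DEMO_MAX_LEVELS; ++l) {
      if (bo->cpu[l])
         freed++;             /* bookkeeping only, not a safety check */

      free(bo->cpu[l]);       /* fine even when the slot is NULL */
      bo->cpu[l] = NULL;      /* avoid a dangling pointer or double free */
   }

   return freed;
}
```

Resetting each slot to NULL also makes a second call harmless, which is
convenient when destruction can be reached from more than one path.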

Re: [Mesa-dev] [PATCH] panfrost: Backport driver to Mali T600/T700

2019-02-15 Thread Emil Velikov
Hi Alyssa,

On Thu, 14 Feb 2019 at 01:58, Alyssa Rosenzweig  wrote:
>
> There are a few differences between Mali T860 (Panfrost's primary
> reference target) and the older Midgard generations (T600/T700):
>
>  - Miscellaneous different magic numbers. It's not clear what these
> numbers mean on either the old or new configurations yet.
>
>  - Errata fixes. T800 is the final Midgard generation and presumably the
> least buggy. Older Midgard has some extra hardware errata we have to
> work around.
>
> - SFBD vs MFBD split. Essentially, older Midgard use a Single
> FrameBuffer Descriptor (SFBD), which corresponds to single
> render-target rendering. Newer Midgard (T760+) use a Multiple
> FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these
> descriptors serve the same function, but we implement both, depending on
> the version of the hardware.
>
> - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and
> vice versa for 64-bit. Our target T760 systems are 32-bit whereas our
> target T860 systems are 64-bit. More work is needed in this area.
>
> This patch fixes support in these areas for supporting older Midgard
> hardware. It is tested on Mali T760 and Mali T860.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  .../drivers/panfrost/include/panfrost-job.h   |  21 +-
>  src/gallium/drivers/panfrost/meson.build  |   1 +
>  src/gallium/drivers/panfrost/pan_assemble.c   |   4 +-
>  src/gallium/drivers/panfrost/pan_blending.c   |   4 +-
>  src/gallium/drivers/panfrost/pan_context.c| 541 ++
>  src/gallium/drivers/panfrost/pan_context.h|  31 +-
>  6 files changed, 340 insertions(+), 262 deletions(-)
>
Some random ideas which came to mind while skimming through:
- about 1/5 of the patch seems to be white space changes
- doesn't seem like BIFROST is defined anywhere
- other drivers keep details like is_t6xx, require_sfbd, others in
driver/screen specific struct
- the __LP64__ checks seem suspicious, no other mesa driver has those

HTH
-Emil
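
The suggestion to keep details like is_t6xx and require_sfbd in a
screen-specific struct could look roughly like this. A minimal sketch
only: demo_screen, demo_screen_init and the integer gpu_model are
hypothetical, and the SFBD cut-off simply follows the "T760+" wording of
the commit message quoted above.

```c
#include <stdbool.h>

/* Hypothetical per-screen capability flags, filled in once at screen
 * creation instead of being decided by compile-time checks. */
struct demo_screen {
   bool is_t6xx;        /* T600-series Midgard */
   bool require_sfbd;   /* needs the Single FrameBuffer Descriptor */
};

/* gpu_model is an assumed numeric model such as 620, 760 or 860. */
static struct demo_screen
demo_screen_init(unsigned gpu_model)
{
   struct demo_screen screen;

   screen.is_t6xx = gpu_model >= 600 && gpu_model < 700;

   /* Per the quoted commit message, T760 and newer use MFBD. */
   screen.require_sfbd = gpu_model < 760;

   return screen;
}
```

Code paths then branch on screen->require_sfbd at run time, so one
binary can serve both descriptor layouts.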

Re: [Mesa-dev] [PATCH v2] egl/dri2: try to bind old context if bindContext failed

2019-02-15 Thread Emil Velikov
On Wed, 13 Feb 2019 at 09:32, Luigi Santivetti
 wrote:
>
> Hello Emil,
>
> thanks for your feedback, I agree, dri2_make_current() looks complex.
> I'll comment inline.
>
> Emil Velikov  writes:
>
> > Hi all,
> >
> > Haven't looked at whether this has landed or not.
> >
> > On Tue, 5 Feb 2019 at 16:41, Eric Engestrom  wrote:
> >>
> >> On Friday, 2019-02-01 13:36:27 +, Luigi Santivetti wrote:
> >> > Before this change, if bindContext() failed then dri2_make_current() would
> >> > rebind the old EGL context and surfaces and return EGL_BAD_MATCH. However,
> >> > it wouldn't rebind the DRI context and surfaces, thus leaving it in an
> >> > inconsistent and unrecoverable state.
> >> >
> >> > After this change, dri2_make_current() tries to bind the old DRI context
> >> > and surfaces when bindContext() failed. If unable to do so, it leaves EGL
> >> > and the DRI driver in a consistent state, it reports an error and returns
> >> > EGL_BAD_MATCH.
> >>
> >> Admittedly I don't understand everything in this function, but your
> >> patch looks reasonable.
> >> Acked-by: Eric Engestrom 
> >>
> >> I ran it through our CI and no regression was spotted, so there's
> >> that :)
> >>
> >> If Emil doesn't raise any concern by the end of the week, I'll push
> >> your patch.
> >>
> >
> > My biggest concern, which is unrelated to this patch. As Eric's alluded:
> >
> > As-is the function is fairly confusing and convoluted, hence why I did
> > not really get to looking at the patch earlier.
> > I've tried to untangle it with
> > 675719817e7bf7c5b9da22c02252aca77a41338d, although it did not cover
> > all cases.
> >
> > No doubt Luigi spent some time trying to get this right, yet it makes
> > the function even harder to follow.
>
> Having spent some time on it and with the present design of
> dri2_make_current(), I don't think this change can address issues
> other than the DRI bindContext().
>
> > Can we try simplifying things up?
> >
>
> Are you suggesting to split the work in refactoring first and then
> re-implementing this change on top it? If so, could you suggest what you
> believe is to be improved?
>
> Some weak points I found are:
> 1. Convoluted control flow, due to many if/else
> 2. Variable naming, it is questionable: the prefix "tmp_" used in the
> asserts only, "cctx" and "ctx" respectively for DRI and EGL.
>
Off the top of my head:
Patch 1: factor the error handling outside of _eglBindContext() and
use only as needed
Patch 2: fold the separate old_ctx hunks - glFlush, old_dpy, unbindContext
Patch 3: do not conflate the unbind with the bindContext() failure -
perhaps try something like my earlier commit

Can you give that a stab please?

FWIW bindContext() cannot fail for the in-tree drivers, since the only
probable error case has been handled further up in the EGL stack.
Or at least, the drivers lack actual error handling/reporting - but
that's a topic for another day :-P

-Emil
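
The failure handling discussed in this thread can be sketched in
isolation: try the new binding, fall back to the old one, and otherwise
end up cleanly unbound. This is a minimal illustration under the
assumption of a hypothetical bind helper; none of the names below are
the real EGL or DRI entry points.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context; "bindable" simulates whether binding succeeds. */
struct demo_ctx {
   bool bindable;
};

/* Hypothetical thread state: the currently bound context, if any. */
struct demo_state {
   struct demo_ctx *current;
};

enum demo_result {
   DEMO_OK,            /* new context bound */
   DEMO_ROLLED_BACK,   /* bind failed, old context restored */
   DEMO_UNBOUND        /* bind failed, restore failed, nothing bound */
};

/* Stand-in for the driver's bind call; binding NULL always succeeds. */
static bool
demo_bind(struct demo_state *s, struct demo_ctx *ctx)
{
   if (ctx && !ctx->bindable)
      return false;

   s->current = ctx;
   return true;
}

static enum demo_result
demo_make_current(struct demo_state *s, struct demo_ctx *new_ctx)
{
   struct demo_ctx *old_ctx = s->current;

   s->current = NULL;                 /* the old context is unbound first */

   if (demo_bind(s, new_ctx))
      return DEMO_OK;

   /* Binding failed: try to restore the previous context. */
   if (old_ctx && demo_bind(s, old_ctx))
      return DEMO_ROLLED_BACK;

   /* Restoring failed too: stay consistently unbound. */
   return DEMO_UNBOUND;
}
```

Keeping the rollback in one place, separate from the unbind step, is one
way to read the "do not conflate the unbind with the bindContext()
failure" suggestion above.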

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #29 from Mark Janes  ---
*** Bug 109260 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #28 from Mark Janes  ---
Chris Forbes at google found a patch that explains at least some of the reason
for this test regression:

  https://android-review.googlesource.com/c/platform/external/deqp/+/901894

I just realized that this bug is a dupe of the one I opened when the test
regressed.  Since the discussion is here, I'll close that one.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 3/3] egl/sl: use kms_swrast with vgem instead of a random GPU

2019-02-15 Thread Emil Velikov
On Tue, 5 Feb 2019 at 15:35, Emil Velikov  wrote:
>
> From: Emil Velikov 
>
> VGEM and kms_swrast were introduced to work with one another.
>
> All we do is CPU rendering to dumb buffers. There is no reason to carve
> out GPU memory, increasing the memory pressure on a device that could
> make a better use of it.
>
> For kms_swrast to work properly we require the primary node, as the dumb
> buffer ioctls are not exposed via the render node.
>
> Note that this requires libdrm commit 3df8a7f0 ("xf86drm: fallback to
> MODALIAS for OF less platform devices")
>
> Signed-off-by: Emil Velikov 
> ---
>  src/egl/drivers/dri2/platform_surfaceless.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/platform_surfaceless.c b/src/egl/drivers/dri2/platform_surfaceless.c
> index e1151e3585c..54c6856c63c 100644
> --- a/src/egl/drivers/dri2/platform_surfaceless.c
> +++ b/src/egl/drivers/dri2/platform_surfaceless.c
> @@ -286,10 +286,11 @@ surfaceless_probe_device(_EGLDisplay *dpy, bool swrast)
> for (i = 0; i < num_devices; ++i) {
>device = devices[i];
>
> -  if (!(device->available_nodes & (1 << DRM_NODE_RENDER)))
> +  const unsigned node_type = swrast ? DRM_NODE_PRIMARY : DRM_NODE_RENDER;
> +  if (!(device->available_nodes & (1 << node_type)))
>   continue;
>
> -  dri2_dpy->fd = loader_open_device(device->nodes[DRM_NODE_RENDER]);
> +  dri2_dpy->fd = loader_open_device(device->nodes[node_type]);
>if (dri2_dpy->fd < 0)
>   continue;
>
> @@ -300,10 +301,17 @@ surfaceless_probe_device(_EGLDisplay *dpy, bool swrast)
>   continue;
>}
>
> -  if (swrast)
> - dri2_dpy->driver_name = strdup("kms_swrast");
> -  else
> - dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
> +  dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd);
> +  if (swrast) {
> + /* Use kms swrast only with vgem */
> + if (strcmp(dri2_dpy->driver_name, "vgem") != 0) {
> +free(dri2_dpy->driver_name);
> +dri2_dpy->driver_name = NULL;
> + } else {
> +free(dri2_dpy->driver_name);
> +dri2_dpy->driver_name = strdup("kms_swrast");
> + }
> +  }
>
Chad, Gurchetan can you look at the series when you have some time?

Thanks
Emil
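
The selection logic in the quoted hunk can be summarised with a small
stand-alone sketch: software rendering wants the primary node of a vgem
device, hardware rendering wants any render node. The enum values and
struct below are hypothetical stand-ins, not the libdrm types, and the
function name demo_pick_device is made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical node indices, used as bit positions in available_nodes. */
enum demo_node {
   DEMO_NODE_PRIMARY = 0,
   DEMO_NODE_RENDER = 2
};

/* Hypothetical device description. */
struct demo_device {
   unsigned available_nodes;    /* bitmask, one bit per node type */
   const char *kernel_driver;   /* e.g. "vgem" or "amdgpu" */
};

/* Returns the index of the first suitable device, or -1 if none fits. */
static int
demo_pick_device(const struct demo_device *devs, int count, bool swrast)
{
   const unsigned node_type = swrast ? DEMO_NODE_PRIMARY : DEMO_NODE_RENDER;

   for (int i = 0; i < count; ++i) {
      if (!(devs[i].available_nodes & (1u << node_type)))
         continue;   /* device lacks the node type we need */

      /* Software rendering is only paired with vgem. */
      if (swrast && strcmp(devs[i].kernel_driver, "vgem") != 0)
         continue;

      return i;
   }

   return -1;
}
```

The primary node is required in the swrast case because, as the commit
message notes, the dumb buffer ioctls are not exposed via the render
node.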

[Mesa-dev] [ANNOUNCE] Mesa 18.3.4 release candidate

2019-02-15 Thread Emil Velikov
Hello list,

The candidate for Mesa 18.3.4 is now available. Currently we have:
 - 34 queued
 - 1 nominated (outstanding)
 - and 16 rejected patches

The current queue consists of:

A fix in the XvMC state-tracker, which was causing some video attributes to
not take effect. On the video front the VAAPI state tracker has seen
improvements with VP9 streams while the amdgpu driver advertises all available
profiles.

On the Intel side we have compiler fixes and extra PCI IDs for Coffee Lake and
Ice Lake parts. In the Broadcom drivers a couple of memory leaks were
addressed.

Other drivers such as radeonsi, nouveau and freedreno have also seen some
love. The RADV driver has been fixed to compile correctly with GCC 9,
amongst other changes.

The Xlib-based libGL has been fixed to work with X servers that lack
the MIT-SHM extension, such as XMing.

To top it off we have a few fixes to the meson build system.


For more details, take a look at section "Mesa stable queue" below.


Testing reports/general approval

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 18.3.4 this Saturday 16th February, around or shortly
after 17 GMT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please let me know.


Trivial merge conflicts
---

commit 434f19a8dc5b24e69415e0a36ed067369ea8a8fe
Author: Rob Clark 

freedreno: stop frob'ing pipe_resource::nr_samples

(cherry picked from commit c3baa077bf6db9f9d46be62ed7cbbc3167e68c8f)


Cheers,
Emil


Mesa stable queue
-

Nominated (1)
=

Jason Ekstrand (1):

  367b0ede4d9 intel/fs: Bail in optimize_extract_to_float if we have modifiers


Queued (34)
===

Bart Oldeman (1):
  gallium-xlib: query MIT-SHM before using it.

Bas Nieuwenhuizen (2):
  radv: Only look at pImmutableSamples if the descriptor has a sampler.
  amd/common: Use correct writemask for shared memory stores.

Dylan Baker (2):
  get-pick-list: Add --pretty=medium to the arguments for Cc patches
  meson: Add dependency on genxml to anvil

Emil Velikov (4):
  docs: add sha256 checksums for 18.3.3
  cherry-ignore: nv50,nvc0: add explicit settings for recent caps
  cherry-ignore: add more 19.0 only nominations from Ilia
  cherry-ignore: radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8

Eric Engestrom (2):
  xvmc: fix string comparison
  xvmc: fix string comparison

Ernestas Kulik (2):
  vc4: Fix leak in HW queries error path
  v3d: Fix leak in resource setup error path

Iago Toral Quiroga (1):
  intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments

Ilia Mirkin (1):
  nvc0: we have 16k-sized framebuffers, fix default scissors

Jason Ekstrand (3):
  intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()
  intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode
  nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks

Juan A. Suarez Romero (1):
  anv/cmd_buffer: check for NULL framebuffer

Kenneth Graunke (1):
  st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048

Kristian H. Kristensen (1):
  freedreno/a6xx: Emit blitter dst with OUT_RELOCW

Leo Liu (2):
  st/va: fix the incorrect max profiles report
  st/va/vp9: set max reference as default of VP9 reference number

Marek Olšák (4):
  meson: drop the xcb-xrandr version requirement
  gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0
  radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0
  winsys/amdgpu: don't drop manually added fence dependencies

Mario Kleiner (2):
  egl/wayland: Allow client->server format conversion for PRIME offload. 
(v2)
  egl/wayland-drm: Only announce formats via wl_drm which the driver 
supports.

Oscar Blumberg (1):
  radeonsi: Fix guardband computation for large render targets

Rob Clark (1):
  freedreno: stop frob'ing pipe_resource::nr_samples

Rodrigo Vivi (1):
  intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.

Samuel Pitoiset (2):
  radv: fix compiler issues with GCC 9
  radv: always export gl_SampleMask when the fragment shader uses it


Rejected (16)
=============

Ilia Mirkin (7):
  38f542783fa nv50,nvc0: add explicit settings for recent caps
  399215eb7a0 nvc0: add support for handling indirect draws with attrib 
conversion
  4443b6ddf2e nvc0/ir: always use CG mode for loads from atomic-only buffers
  5de5beedf21 nvc0/ir: fix second tex argument after levelZero optimization
  162352e6711 nvc0: fix 3d images on kepler
  e00799d3dc0 nv50,nvc0: use condition for occlusion queries when already 
complete
  6adb9b38bfb nvc0: stick zero values for the compute invocation counts

Reason: Explicit 19.0 only nominations as confirmed by Ilia on IRC


Boyan Ding (3):

Re: [Mesa-dev] [PATCH] radv: fix invalid element type when filling vertex input default values

2019-02-15 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Fri, Feb 15, 2019 at 3:57 PM Samuel Pitoiset
 wrote:
>
> The elements added into a vector should have the same type as the
> first one, otherwise this hits an assertion in LLVM.
>
> Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex 
> input fetches")
> reported-by: Philip Rebohle 
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_nir_to_llvm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_nir_to_llvm.c 
> b/src/amd/vulkan/radv_nir_to_llvm.c
> index f1fc392292a..28221b2889a 100644
> --- a/src/amd/vulkan/radv_nir_to_llvm.c
> +++ b/src/amd/vulkan/radv_nir_to_llvm.c
> @@ -2089,8 +2089,10 @@ radv_fixup_vertex_input_fetches(struct 
> radv_shader_context *ctx,
> elemtype = LLVMTypeOf(value);
> }
>
> -   for (unsigned i = num_channels; i < 4; i++)
> +   for (unsigned i = num_channels; i < 4; i++) {
> chan[i] = i == 3 ? one : zero;
> +   chan[i] = ac_to_float(&ctx->ac, chan[i]);
> +   }
>
> return ac_build_gather_values(&ctx->ac, chan, 4);
>  }
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109333] mesa, meson: Need ability to remember PKG_CONFIG_PATH

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109333

--- Comment #5 from Emil Velikov  ---
Food for thought:

AFAICT the concept of reproducible builds relies heavily on having a consistent
environment. It even suggests explicitly setting SOURCE_DATE_EPOCH, although
there could be other variables, whether required explicitly by the initiative
or implicitly by the tools used during the process.

https://reproducible-builds.org/docs/source-date-epoch/

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 109064] temp_comp_access::get_required_live_range: enclosing_scope_first_write is NULL

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109064

--- Comment #1 from Alex Xu (Hello71)  ---
Still broken in 19.0.0_rc4.

#0  0x7f0322468821 in get_temp_registers_required_live_ranges(void*,
exec_list*, int, register_live_range*, int, array_live_range*)
() from /usr/lib64/dri/r600_dri.so
#1  0x7f032245d555 in glsl_to_tgsi_visitor::merge_registers() () from
/usr/lib64/dri/r600_dri.so
#2  0x7f032245ebb4 in get_mesa_program_tgsi(gl_context*,
gl_shader_program*, gl_linked_shader*) () from /usr/lib64/dri/r600_dri.so
#3  0x7f032245f063 in st_link_shader () from /usr/lib64/dri/r600_dri.so
#4  0x7f03223fb189 in _mesa_glsl_link_shader () from
/usr/lib64/dri/r600_dri.so
#5  0x7f03222d2dc8 in link_program_error.part () from
/usr/lib64/dri/r600_dri.so
#6  0x7f0d256cd728 in set_glsl_shader_program.isra ()
   from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/wined3d.dll.so
#7  0x7f0d256cf1f4 in shader_glsl_select () from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/wined3d.dll.so
#8  0x7f0d25688649 in draw_primitive () from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/wined3d.dll.so
#9  0x7f0d2568f03f in wined3d_cs_exec_draw () from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/wined3d.dll.so
#10 0x7f0d2568fb4c in wined3d_cs_run () from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/wined3d.dll.so
#11 0x7bcc2dd2 in call_thread_func () from
/usr/lib/wine-any-4.1/bin/../../../lib64/wine-any-4.1/wine/ntdll.dll.so

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] radv: write the alpha channel of MRT0 when alpha coverage is enabled

2019-02-15 Thread Samuel Pitoiset
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_pipeline.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 9745a1f2aa7..6b54da2e31b 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -511,6 +511,13 @@ radv_pipeline_compute_spi_color_formats(struct 
radv_pipeline *pipeline,
 
if (subpass->color_attachments[i].attachment == 
VK_ATTACHMENT_UNUSED) {
cf = V_028714_SPI_SHADER_ZERO;
+
+   if (blend->need_src_alpha & (1 << i)) {
+   /* Write the alpha channel of MRT0 when alpha 
coverage is
+* enabled because the depth attachment needs 
it.
+*/
+   col_format |= V_028714_SPI_SHADER_32_ABGR;
+   }
} else {
struct radv_render_pass_attachment *attachment = 
pass->attachments + subpass->color_attachments[i].attachment;
bool blend_enable =
@@ -689,6 +696,7 @@ radv_pipeline_init_blend_state(struct radv_pipeline 
*pipeline,
 
if (vkms && vkms->alphaToCoverageEnable) {
blend.db_alpha_to_mask |= S_028B70_ALPHA_TO_MASK_ENABLE(1);
+   blend.need_src_alpha |= 0x1;
}
 
blend.cb_target_mask = 0;
-- 
2.20.1


[Mesa-dev] [Bug 109333] mesa, meson: Need ability to remember PKG_CONFIG_PATH

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109333

--- Comment #4 from Jan Vesely  ---
(In reply to Dylan Baker from comment #3)
> What exactly are you wanting? We (meson) have been trying not to use
> environment variables when possible because they are so awful to use on
> Windows. Our goal has been to make config files more comprehensive (which is
> something I'm working on) so that you can define an entire test setup
> through the file.

It's not required for my setup. It was just an idea to have a way to configure
the state of the environment explicitly in meson. It could scrub the existing
environment (similar to sudo), and only allow -E options to explicitly set it
up.
It'd enable a cleaner, more controlled environment, for better reproducibility
and isolation of builds.
I guess it goes against the design principle of ignoring env, so feel free to
ignore the suggestion.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] nir: remove simple dead if detection from nir_opt_dead_cf()

2019-02-15 Thread Connor Abbott
Reviewed-by: Connor Abbott 

I agree that if we ever need to bring this back, we should just check for
both branches empty and no phis afterwards.

On Thu, Feb 14, 2019 at 2:38 AM Timothy Arceri 
wrote:

> This was probably useful when it was first written, however it
> looks to be no longer necessary.
>
> As far as I can tell these days dce is smart enough to remove useless
> instructions from if branches. Once this is done
> nir_opt_peephole_select() will end up removing the empty if.
>
> Removing this support reduces the dolphin uber shader compilation
> time by around 60%. Compile time is reduced due to no longer having
> to compute the live ssa defs metadata so much.
>
> No shader-db changes on i965 or radeonsi.
> ---
>  src/compiler/nir/nir_opt_dead_cf.c | 9 ++---
>  1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_dead_cf.c
> b/src/compiler/nir/nir_opt_dead_cf.c
> index 14986732096..053c5743527 100644
> --- a/src/compiler/nir/nir_opt_dead_cf.c
> +++ b/src/compiler/nir/nir_opt_dead_cf.c
> @@ -180,7 +180,7 @@ def_not_live_out(nir_ssa_def *def, void *state)
>  }
>
>  /*
> - * Test if a loop node or if node is dead. Such nodes are dead if:
> + * Test if a loop node is dead. Such nodes are dead if:
>   *
>   * 1) It has no side effects (i.e. intrinsics which could possibly affect
> the
>   * state of the program aside from producing an SSA value, indicated by a
> lack
> @@ -198,7 +198,7 @@ def_not_live_out(nir_ssa_def *def, void *state)
>  static bool
>  node_is_dead(nir_cf_node *node)
>  {
> -   assert(node->type == nir_cf_node_loop || node->type == nir_cf_node_if);
> +   assert(node->type == nir_cf_node_loop);
>
> nir_block *before = nir_cf_node_as_block(nir_cf_node_prev(node));
> nir_block *after = nir_cf_node_as_block(nir_cf_node_next(node));
> @@ -230,11 +230,6 @@ dead_cf_block(nir_block *block)
>  {
> nir_if *following_if = nir_block_get_following_if(block);
> if (following_if) {
> -  if (node_is_dead(&following_if->cf_node)) {
> - nir_cf_node_remove(&following_if->cf_node);
> - return true;
> -  }
> -
>if (!nir_src_is_const(following_if->condition))
>   return false;
>
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109641] GLX swrast driver leaks shared memory segments

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109641

Bug ID: 109641
   Summary: GLX swrast driver leaks shared memory segments
   Product: Mesa
   Version: 18.0
  Hardware: All
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/llvmpipe
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: marcello.blanca...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Using Gnome 3.28 (CentOS 7.6) with llvmpipe driver, Mesa 18.0, but same problem
exists with Gnome 3.30 (Ubuntu 18.10), Mesa 18.2.

Opened a Gnome Terminal window.

New shared memory segments are created every time the terminal window is
resized (found it with: ipcs -m | wc -l).

It looks like XShmAttach() is called in src/glx/drisw_glx.c:XCreateDrawable(), but
XShmDetach() is never called. Should it be called in XDestroyDrawable()?
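
For illustration only (this is not the drisw_glx.c code), here is a minimal
sketch of the System V shared memory lifecycle that `ipcs -m` reports on.
The helper name shm_roundtrip is hypothetical; the shmget/shmat/shmdt/shmctl
calls are the standard SysV ones. It shows the pairing the report is asking
about: every attach needs a matching detach, and the segment needs to be
marked for removal, otherwise it stays listed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment, attach it, touch it and
 * release it again. Returns 0 on success, -1 if any step fails. */
static inline int
shm_roundtrip(size_t size)
{
   assert(size > 0);

   int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
   if (id < 0)
      return -1;

   void *addr = shmat(id, NULL, 0);
   if (addr == (void *)-1) {
      shmctl(id, IPC_RMID, NULL); /* do not leave the segment behind */
      return -1;
   }

   ((char *)addr)[0] = 1; /* use the mapping */

   /* Both steps matter: shmdt() drops this process's mapping, and IPC_RMID
    * marks the segment for destruction once the last attachment is gone. */
   int ret = shmdt(addr);
   if (shmctl(id, IPC_RMID, NULL) < 0)
      ret = -1;
   return ret;
}
```

Skipping the release half on every resize would make the segment count grow
in the way described above. Whether XDestroyDrawable() is the right place
for the X-side equivalent is the open question of this report.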

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH v6 2/5] i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.

2019-02-15 Thread Eleni Maria Stea
Gen < 8 GPUs cannot sample ETC2 formats. So far, the driver converted the
compressed EAC/ETC2 images to non-compressed RGBA images. When
GetCompressed* functions were called, the pixels were returned in this
RGBA format and not the compressed format that was expected.

To fix this problem, we use a secondary shadow miptree to store the
decompressed data for rendering and the main miptree to store the
compressed data so that the Get functions work. Each time the main miptree
is written with compressed data, we decompress it to RGB and update the
shadow. Then we use the shadow for rendering.
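
A rough sketch of the idea above, not the i965 code: all names (fake_tree,
fake_decompress, ...) are hypothetical, and the "decompression" is a
placeholder byte transform standing in for a real ETC2 decoder. It only
illustrates the main/shadow split with a needs-update flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_BYTES = 16 };

/* Hypothetical stand-in for a miptree pair: `compressed` plays the main
 * miptree, `shadow` the decompressed copy used for rendering. */
struct fake_tree {
   unsigned char compressed[MAX_BYTES];
   unsigned char shadow[MAX_BYTES];
   size_t size;
   bool shadow_needs_update;
};

/* Placeholder "decompression": a real driver would decode ETC2 blocks. */
static inline void
fake_decompress(unsigned char *dst, const unsigned char *src, size_t size)
{
   for (size_t i = 0; i < size; i++)
      dst[i] = (unsigned char)(src[i] ^ 0xff);
}

/* Writes always land in the main copy and only mark the shadow stale. */
static inline void
fake_tree_write(struct fake_tree *t, const unsigned char *data, size_t size)
{
   assert(size <= MAX_BYTES);
   memcpy(t->compressed, data, size);
   t->size = size;
   t->shadow_needs_update = true;
}

/* Called before rendering: refresh the shadow only if it is stale. */
static inline const unsigned char *
fake_tree_get_shadow(struct fake_tree *t)
{
   if (t->shadow_needs_update) {
      fake_decompress(t->shadow, t->compressed, t->size);
      t->shadow_needs_update = false;
   }
   return t->shadow;
}

/* Get-style queries read the main copy, so they see compressed data. */
static inline const unsigned char *
fake_tree_get_compressed(const struct fake_tree *t)
{
   return t->compressed;
}
```

With this split, a Get-style query never sees decompressed pixels, and the
decompression cost is paid only when the main copy has changed since the
last render.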

v2:
   - Fixes in the commit message (Nanley Chery)
   - Reversed the changes in brw_get_texture_swizzle and swapped the b, g
   values at the time that we decompress the data in the function:
   intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
   - Simplified the format checks in the miptree_create function of the
   intel_mipmap_tree.c and reserved the call of the
   intel_lower_compressed_format for the case that we are faking the ETC
   support (Nanley Chery)
   - Removed the check for the auxiliary usage for the shadow miptree at
   creation (miptree_create of intel_mipmap_tree.c) as we won't use
   auxiliary buffers with these types of trees (Nanley Chery)
   - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and
   removed the unnecessary checks (Nanley Chery)
   - Fixed an unrelated indentation change (Nanley Chery)
   - Modified the function intel_miptree_finish_write to set the
   mt->shadow_needs_update to true to catch all the cases when we need to
   update the miptree (Nanley Chery)
   - In order to update the shadow miptree during the unmap of the
   main and always map the main (Nanley Chery), the following change was
   necessary: split the previous update function, which updated all
   the mipmap levels, into two functions: one that updates one
   level and one that updates all of them. Used the first during unmap
   and the second before the rendering.
   - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which
   miptree should be mapped each time and reversed all the changes in the
   higher level texture functions that upload data to textures as they
   aren't needed anymore.
   - Replaced the boolean needs_fake_etc with an inline function that
   checks when we need to fake the ETC compression (Nanley Chery)
   - Removed the initialization of the strides in the update function as
   the values will be overwritten by the intel_miptree_map call (Nanley
   Chery)
   - Used minify instead of division in the new update function
   intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley
   Chery)
   - Removed the depth from the calculation of the number of slices in
   the new update function (intel_miptree_update_etc_shadow_levels of
   intel_mipmap_tree.c) as we don't need to support 3D ETC images.
   (Nanley Chery)
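
As an aside on the "minify instead of division" item: a small sketch of the
usual mip level size rule, with a hypothetical helper name (minify_dim)
rather than the helper the patch actually calls. Each level halves the base
size, and the result is clamped so that it never reaches 0.

```c
#include <assert.h>

/* Hypothetical helper: size of one dimension at a given mip level. */
static inline unsigned
minify_dim(unsigned base_size, unsigned level)
{
   assert(level < sizeof(unsigned) * 8);
   unsigned size = base_size >> level;
   return size > 0 ? size : 1;
}
```

A plain shift or division gives 0 for a 4 pixel wide texture at level 3,
while the clamped form gives 1, which is what the last levels of a
non-square mip chain need.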

v3:
  - Renamed the rgba_fmt in function miptree_create
  (intel_mipmap_tree.c) to decomp_format as the format is not always in
  rgba order. (Nanley Chery)
  - Documented the new usage for the shadow miptree in the comment above
  the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley
  Chery)
  - Removed the redundant flags from the mapping of the miptrees in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)
  - Fixed the switch from surface's logical level to physical level in
  the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c
  (Nanley Chery)
  - Excluded the Baytrail GPUs from the check for the ETC emulation as
  they support the ETC formats natively. (Nanley Chery)
  - Simplified the check if the format is BGRA in
  intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery)

v4:
  - Removed the functions intel_miptree_(map|unmap)_etc and the check if
   we need to call them as with the new changes, they became unreachable.
   (Nanley Chery)
  - We'd rather calculate the level width and height using the shadow
  miptree instead of the main in intel_miptree_update_etc_shadow_levels of
  intel_mipmap_tree.c (Nanley Chery)
  - Fixed the format in the mt_surface_usage, set at the miptree creation,
   in miptree_create of intel_mipmap_tree.c (Nanley Chery)

v5:
  - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery)
  - Update the flag shadow_needs_update outside the function
  intel_miptree_update_etc_shadow (Nanley Chery)
  - Fixed indentation error (Nanley Chery)

v6:
  - Fixed typo in commit message (Nanley Chery)
  - Simplified the assignment of the mt_fmt in the miptree_create of the
  intel_mipmap_tree.c (Nanley Chery)
  - Combined declarations and assignments where it was possible in the
  intel_miptree_update_etc_shadow and
  intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c
  (Nanley Chery)
---
 .../drivers/dri/i965/brw_wm_surface_state.c   |   5 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 174 +++

[Mesa-dev] [PATCH 2/2] anv/image: fix offset's alignment to the surface alignment

2019-02-15 Thread Samuel Iglesias Gonsálvez
Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/vulkan/anv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 3999c7399d0..f4a65044a3b 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -142,7 +142,7 @@ add_surface(struct anv_image *image, struct anv_surface 
*surf, uint32_t plane)
surf->isl.alignment_B);
   /* Plane offset is always 0 when it's disjoint. */
} else {
-  surf->offset = align_u32(image->size, surf->isl.alignment_B);
+  surf->offset = util_align_npot(image->size, surf->isl.alignment_B);
   /* Determine plane's offset only once when the first surface is added. */
   if (image->planes[plane].size == 0)
  image->planes[plane].offset = image->size;
-- 
2.19.1


[Mesa-dev] [PATCH v6 1/5] i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*

2019-02-15 Thread Eleni Maria Stea
From: Nanley Chery 

Use more generic field names. We'll reuse these fields for a workaround
with ASTC miptrees.

Reviewed-by: Eleni Maria Stea 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  8 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 16 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 14 +++---
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index ece3197a858..c55182d7ffb 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -571,15 +571,15 @@ static void brw_update_texture_surface(struct gl_context 
*ctx,
 
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
  if (devinfo->gen <= 7) {
-assert(mt->r8stencil_mt && 
!mt->stencil_mt->r8stencil_needs_update);
-mt = mt->r8stencil_mt;
+assert(mt->shadow_mt && !mt->stencil_mt->shadow_needs_update);
+mt = mt->shadow_mt;
  } else {
 mt = mt->stencil_mt;
  }
  format = ISL_FORMAT_R8_UINT;
   } else if (devinfo->gen <= 7 && mt->format == MESA_FORMAT_S_UINT8) {
- assert(mt->r8stencil_mt && !mt->r8stencil_needs_update);
- mt = mt->r8stencil_mt;
+ assert(mt->shadow_mt && !mt->shadow_needs_update);
+ mt = mt->shadow_mt;
  format = ISL_FORMAT_R8_UINT;
   }
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index fe77d72fae4..e364fed2cc7 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1214,7 +1214,7 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
 
   brw_bo_unreference((*mt)->bo);
   intel_miptree_release(&(*mt)->stencil_mt);
-  intel_miptree_release(&(*mt)->r8stencil_mt);
+  intel_miptree_release(&(*mt)->shadow_mt);
   intel_miptree_aux_buffer_free((*mt)->aux_buf);
   free_aux_state_map((*mt)->aux_state);
 
@@ -2427,7 +2427,7 @@ intel_miptree_finish_write(struct brw_context *brw,
switch (mt->aux_usage) {
case ISL_AUX_USAGE_NONE:
   if (mt->format == MESA_FORMAT_S_UINT8 && devinfo->gen <= 7)
- mt->r8stencil_needs_update = true;
+ mt->shadow_needs_update = true;
   break;
 
case ISL_AUX_USAGE_MCS:
@@ -2933,9 +2933,9 @@ intel_update_r8stencil(struct brw_context *brw,
 
assert(src->surf.size_B > 0);
 
-   if (!mt->r8stencil_mt) {
+   if (!mt->shadow_mt) {
   assert(devinfo->gen > 6); /* Handle MIPTREE_LAYOUT_GEN6_HIZ_STENCIL */
-  mt->r8stencil_mt = make_surface(
+  mt->shadow_mt = make_surface(
 brw,
 src->target,
 MESA_FORMAT_R_UINT8,
@@ -2949,13 +2949,13 @@ intel_update_r8stencil(struct brw_context *brw,
 ISL_TILING_Y0_BIT,
 ISL_SURF_USAGE_TEXTURE_BIT,
 BO_ALLOC_BUSY, 0, NULL);
-  assert(mt->r8stencil_mt);
+  assert(mt->shadow_mt);
}
 
-   if (src->r8stencil_needs_update == false)
+   if (src->shadow_needs_update == false)
   return;
 
-   struct intel_mipmap_tree *dst = mt->r8stencil_mt;
+   struct intel_mipmap_tree *dst = mt->shadow_mt;
 
for (int level = src->first_level; level <= src->last_level; level++) {
   const unsigned depth = src->surf.dim == ISL_SURF_DIM_3D ?
@@ -2975,7 +2975,7 @@ intel_update_r8stencil(struct brw_context *brw,
}
 
brw_cache_flush_for_read(brw, dst->bo);
-   src->r8stencil_needs_update = false;
+   src->shadow_needs_update = false;
 }
 
 static void *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 17668944adc..1a7507023a1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -294,16 +294,16 @@ struct intel_mipmap_tree
struct intel_mipmap_tree *stencil_mt;
 
/**
-* \brief Stencil texturing miptree for sampling from a stencil texture
+* \brief Shadow miptree for sampling when the main isn't supported by HW.
 *
-* Some hardware doesn't support sampling from the stencil texture as
-* required by the GL_ARB_stencil_texturing extenion. To workaround this we
-* blit the texture into a new texture that can be sampled.
+* To workaround various sampler bugs and limitations, we blit the main
+* texture into a new texture that can be sampled.
 *
-* \see intel_update_r8stencil()
+* This miptree may be used for:
+* - Stencil texturing (pre-BDW) as required by GL_ARB_stencil_texturing.
 */
-   struct intel_mipmap_tree *r8stencil_mt;
-   bool r8stencil_needs_update;
+   struct intel_mipmap_tree *shadow_mt;
+   bool shadow_needs_update;
 
/**
 * \brief CCS, MC

[Mesa-dev] [PATCH 1/2] isl: remove the cache line size alignment requirement

2019-02-15 Thread Samuel Iglesias Gonsálvez
There are formats whose bpp is not a power of two, and
that can cause problems in the checks we do.

The cacheline size was a requirement for using the BLT engine, which
we don't use anymore except for a few things on old HW, so we drop it.

Fixes CTS's CL#3500 test:

dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r8g8b8_unorm

Signed-off-by: Samuel Iglesias Gonsálvez 
---
 src/intel/isl/isl.c | 21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index eaaa28014a3..7f1f2339931 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1381,20 +1381,6 @@ isl_calc_row_pitch(const struct isl_device *dev,
uint32_t alignment_B =
   isl_calc_row_pitch_alignment(surf_info, tile_info);
 
-   /* If pitch isn't given and it can be chosen freely, align it by cache line
-* allowing one to use blit engine on the surface.
-*/
-   if (surf_info->row_pitch_B == 0 && tile_info->tiling == ISL_TILING_LINEAR) {
-  /* From the Broadwell PRM docs for XY_SRC_COPY_BLT::SourceBaseAddress:
-   *
-   *"Base address of the destination surface: X=0, Y=0. Lower 32bits
-   *of the 48bit addressing. When Src Tiling is enabled (Bit_15
-   *enabled), this address must be 4KB-aligned. When Tiling is not
-   *enabled, this address should be CL (64byte) aligned."
-   */
-  alignment_B = MAX2(alignment_B, 64);
-   }
-
const uint32_t min_row_pitch_B =
   isl_calc_min_row_pitch(dev, surf_info, tile_info, phys_total_el,
  alignment_B);
@@ -1527,12 +1513,13 @@ isl_surf_init_s(const struct isl_device *dev,
   base_alignment_B = MAX(1, info->min_alignment_B);
   if (info->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) {
  if (isl_format_is_yuv(info->format)) {
-base_alignment_B = MAX(base_alignment_B, fmtl->bpb / 4);
+base_alignment_B = isl_align_npot(base_alignment_B, fmtl->bpb / 4);
  } else {
-base_alignment_B = MAX(base_alignment_B, fmtl->bpb / 8);
+base_alignment_B = isl_align_npot(base_alignment_B, fmtl->bpb / 8);
  }
+  } else {
+ base_alignment_B = isl_round_up_to_power_of_two(base_alignment_B);
   }
-  base_alignment_B = isl_round_up_to_power_of_two(base_alignment_B);
} else {
   const uint32_t total_h_tl =
  isl_align_div(phys_total_el.h, tile_info.logical_extent_el.height);
-- 
2.19.1
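
To illustrate why non-power-of-two sizes are a problem here (r8g8b8_unorm is
3 bytes per texel), a minimal sketch with hypothetical helpers align_pot and
align_npot. These are not the Mesa or isl implementations, just the two
common ways of rounding up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round value up to a power-of-two alignment.
 * The bit mask is only valid when alignment is a power of two. */
static inline uint32_t
align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: round value up to any non-zero alignment. */
static inline uint32_t
align_npot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0);
   uint32_t rem = value % alignment;
   return rem ? value + (alignment - rem) : value;
}
```

If the mask form were applied to value 4 and alignment 3, it would compute
(4 + 2) & ~2 = 4, which is not a multiple of 3. The modulo form returns 6.
For power-of-two alignments both forms agree.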


[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #27 from asimiklit  ---
(In reply to Ilia Mirkin from comment #24)
> """
> But according to next comment it should be 0 because of the BlockB[0] was
> optimized
> and there is the BlockB[1] only:
> 
>/* The ARB_shading_language_420pack spec says:
> *
> *If the binding identifier is used with a uniform block instanced as
> *an array then the first element of the array takes the specified
> *block binding and each subsequent element takes the next consecutive
> *uniform block binding point.
> */
> """
> 
> I don't think that's enough justification for the "mesa" way of doing it.
> Whether a block is eliminated or not is not specified by GLSL spec. Would be
> good to get some more experienced opinions on this. [Not mine...]

I have asked about it here:
https://github.com/KhronosGroup/OpenGL-API/issues/46
Hope that somebody will clarify it)

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 2/2] i965: Be resilient in the face of GPU hangs

2019-02-15 Thread Chris Wilson
Quoting Chris Wilson (2019-02-14 12:05:00)
> If we hang the GPU and end up banning our context, we will no longer be
> able to submit and abort with an error (exit(1) no less). As we submit
> minimal incremental batches that rely on the logical context state of
> previous batches, we can not rely on the kernel's recovery mechanism
> which tries to restore the context back to a "golden" renderstate (the
> default HW setup) and replay the batches in flight. Instead, we must
> create a new context and set it up, including all the lost register
> settings that we only apply once during setup, before allowing the user to
> continue rendering. The batches already submitted are lost
> (unrecoverable) so there will be a momentary glitch and lost rendering
> across frames, but the application should be able to recover and
> continue on fairly oblivious.
> 
> To make wedging even more likely, we use a new "no recovery" context
> parameter that tells the kernel to not even attempt to replay any
> batches in flight against the default context image, as experience shows
> the HW is not always robust enough to cope with the conflicting state.
> 
> Cc: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c| 25 +++
>  src/mesa/drivers/dri/i965/brw_bufmgr.h|  2 ++
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c | 19 ++
>  3 files changed, 46 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index b33a30930db..289b39cd584 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -1589,6 +1589,16 @@ init_cache_buckets(struct brw_bufmgr *bufmgr)
> }
>  }
>  
> +static void init_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id)
> +{
> +   struct drm_i915_gem_context_param p = {
> +  .ctx_id = ctx_id,
> +  .param = 0x7, // I915_CONTEXT_PARAM_RECOVERABLE,
> +   };
> +
> +   drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
> +}
> +
>  uint32_t
>  brw_create_hw_context(struct brw_bufmgr *bufmgr)
>  {
> @@ -1599,6 +1609,8 @@ brw_create_hw_context(struct brw_bufmgr *bufmgr)
>return 0;
> }
>  
> +   init_context(bufmgr, create.ctx_id);
> +
> return create.ctx_id;
>  }
>  
> @@ -1621,6 +1633,19 @@ brw_hw_context_set_priority(struct brw_bufmgr *bufmgr,
> return err;
>  }
>  
> +int
> +brw_hw_context_get_priority(struct brw_bufmgr *bufmgr, uint32_t ctx_id)
> +{
> +   struct drm_i915_gem_context_param p = {
> +  .ctx_id = ctx_id,
> +  .param = I915_CONTEXT_PARAM_PRIORITY,
> +   };
> +
> +   drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p);
> +
> +   return p.value; /* on error, return 0 i.e. default priority */
> +}
> +
>  void
>  brw_destroy_hw_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id)
>  {
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h 
> b/src/mesa/drivers/dri/i965/brw_bufmgr.h
> index 32fc7a553c9..886b2e607ce 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
> @@ -356,6 +356,8 @@ uint32_t brw_create_hw_context(struct brw_bufmgr *bufmgr);
>  int brw_hw_context_set_priority(struct brw_bufmgr *bufmgr,
>  uint32_t ctx_id,
>  int priority);
> +int
> +brw_hw_context_get_priority(struct brw_bufmgr *bufmgr, uint32_t ctx_id);
>  
>  void brw_destroy_hw_context(struct brw_bufmgr *bufmgr, uint32_t ctx_id);
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
> b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index 8097392d22b..afb6e2401e3 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -748,6 +748,18 @@ execbuffer(int fd,
> return ret;
>  }
>  
> +static void recreate_context(struct brw_context *brw)
> +{
> +   struct brw_bufmgr *bufmgr = brw->bufmgr;
> +   int prio;
> +
> +   prio = brw_hw_context_get_priority(bufmgr, brw->hw_ctx);
> +   brw_destroy_hw_context(bufmgr, brw->hw_ctx);
> +
> +   brw->hw_ctx = brw_create_hw_context(bufmgr);
> +   brw_hw_context_set_priority(bufmgr, brw->hw_ctx, prio);

Hmm, fwiw we can make this into a clone operation in the kernel. That
way, we can also do things like move across the ppgtt. We will have the
ability to do that shortly...
-Chris

Re: [Mesa-dev] [RFC] gpu/docs: Clarify what userspace means for gl

2019-02-15 Thread Daniel Vetter
On Thu, Feb 14, 2019 at 05:47:06PM -0500, Rob Clark wrote:
> On Thu, Feb 14, 2019 at 4:00 AM Daniel Vetter  wrote:
> >
> > Clear rules avoid arguing.
> >
> > I think it'd be good to have an equally solid list on the kms side.
> > But kms is much more meant to be a standard, and the list of userspace
> > projects we've accepted in the past is constantly shifting and
> > adjusting. So I figured I'll leave that as an exercise for later on.
> >
> > v2: Try to clarify that we don't want a mesa driver just for mesa's
> > sake, and more clearly exclude anything that just doesn't make sense
> > technically.  Example would be a compute driver that makes sense to be
> > merged into drm (for kernel side code-sharing), but where the intended
> > use is some single-source CUDA-style compute without ever bothering
> > about any of the 3D/rendering side baggage that comes with gl/vk.
> >
> > v3: Drop vulkan for now, the situation there isn't as obviously
> > clear-cut as on the gl side, and I don't want to tank this idea on a
> > hot discussion about vk and mesa. Plus I think once we have 1-2 more
> > vk drivers in mesa the situation on the vk side is clear-cut too, and
> > we can do a follow-up patch to add vk to the list where we expect the
> > > userspace to be in upstream mesa. That would give a nice precedent to
> > make it clear that this isn't cast in stone, but meant to reflect
> > reality and should be adjusted as needed.
> >
> > v4: Fix typo.
> >
> > Signed-off-by: Daniel Vetter 
> > ---
> > Hi all,
> >
> > I discussed this a bit with a few people over the past few months (I
> > think), to get a feel for where the consensus might be. Goal here isn't
> > > anything aspirational (like with the recent igt patch), but just to
> > > document current expectations, so that there's no confusion or companies
> > > with failed projects that had no reason to fail. The reasoning is really
> > > the same as for the patch to document open source userspace requirements
> > > a few years ago; that one is still extremely useful.
> >
> > For obvious reasons needs solid support from both mesa and kernel people,
> > or it won't land.
> >
> > Thoughts, hot takes, comments, also acks all very much welcome.
> >
> > Thanks, Daniel
> > ---
> >  Documentation/gpu/drm-uapi.rst | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> > index c9fd23efd957..79f78c3fa458 100644
> > --- a/Documentation/gpu/drm-uapi.rst
> > +++ b/Documentation/gpu/drm-uapi.rst
> > @@ -105,6 +105,29 @@ is already rather painful for the DRM subsystem, with 
> > multiple different uAPIs
> >  for the same thing co-existing. If we add a few more complete mistakes 
> > into the
> >  mix every year it would be entirely unmanageable.
> >
> > > +Below are some clarifications on what this means for specific areas in DRM.
> > +
> > +Compute&Rendering Userspace
> > +---
> > +
> > > +Userspace API for enabling compute and rendering blocks which are capable of at
> > > +least supporting one of the OpenGL or OpenGL ES standards from Khronos need to
> > > +be enabled in the upstream `Mesa3D project`.
> > > +
> > > +Mesa3D is the canonical upstream for these areas because it is a fully
> > > +compliant, performant and cross-vendor implementation that supports all kernel
> > > +drivers in DRM. It is therefore the best platform to validate userspace API and
> > > +especially make sure that cross-vendor interoperation is assured.
> > +
> 
> I'm not entirely sure how I feel about *requiring* a mesa driver.  I'd
> defn *very strongly recommend* a mesa driver, and preferably a gallium
> driver at that.  But the current blurb here doesn't capture what I
> think is the most important reason for that: I'm already familiar with
> the core mesa and gallium APIs, which will make it far easier for
> me to review the userspace driver code, compared to something which is
> an entirely different codebase.

If we put "strongly recommend" and then a vendor submits a driver with
their own gl, we'll have endless amounts of arguing, and I think in the
end that driver won't land anyway, so it's all wasted effort. This is a big
difference compared to the recent igt requirement, since we've merged tons
of features without igts, and because of various reasons, we'll continue
to accept justified exceptions. If all we can agree on is "strongly
recommend" then I think we don't need to document that, and will just keep
suffering through the inevitable arguing :-)

I agree that it should be a gallium driver, but the kernel isn't really
the right place to document what a mesa driver should look like.

> In any case, I think it is important that people look at the open src
> userspace, not that it just exists.  Upstreaming to mesa is a good way
> to make sure that happens.

We already require (further up in this doc) that people push their open
source into the canonical upstream, to prevent dodgi

[Mesa-dev] [PATCH v6 5/5] i965: Removed the field etc_format from the struct intel_mipmap_tree

2019-02-15 Thread Eleni Maria Stea
After the previous changes to emulate the ETC/EAC formats using the
secondary shadow miptree, the etc_format field of the intel_mipmap_tree
struct became redundant and the remaining check that used it has been
replaced. (Nanley Chery)
---
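
Note: the idea behind dropping the field is that "this miptree is
ETC-emulated" can be derived from state the miptree already carries, so a
cached copy of the compressed format is not needed. A minimal, standalone
sketch of that derivation follows. All sketch_* names and the simplified
format enum are hypothetical, and the body of the predicate is an assumption
for illustration, not the actual intel_miptree_has_etc_shadow()
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified format enum; not Mesa's mesa_format. */
enum sketch_format { SKETCH_FMT_NONE, SKETCH_FMT_RGBA8, SKETCH_FMT_ETC2_RGB8 };

struct sketch_miptree {
   enum sketch_format format;         /* format the application requested */
   struct sketch_miptree *shadow_mt;  /* decompressed copy, or NULL */
};

bool sketch_format_is_etc(enum sketch_format f)
{
   return f == SKETCH_FMT_ETC2_RGB8;
}

/*
 * Derived predicate (assumed logic): the tree is emulated when the
 * hardware cannot sample ETC, the main tree holds an ETC format, and a
 * decompressed shadow tree has been allocated.
 */
bool sketch_has_etc_shadow(bool hw_supports_etc,
                           const struct sketch_miptree *mt)
{
   return !hw_supports_etc &&
          sketch_format_is_etc(mt->format) &&
          mt->shadow_mt != NULL;
}

/* Pick the format the sampler should see, without any cached etc_format. */
enum sketch_format sketch_sampling_format(bool hw_supports_etc,
                                          const struct sketch_miptree *mt)
{
   if (sketch_has_etc_shadow(hw_supports_etc, mt))
      return mt->shadow_mt->format;
   return mt->format;
}
```

With this shape, the main tree's format stays the compressed one and only
the surface-state path redirects to the shadow tree's format.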
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  2 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  7 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h| 10 --
 3 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 19a46fcf243..a0984791614 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -520,7 +520,7 @@ static void brw_update_texture_surface(struct gl_context *ctx,
   * is safe because texture views aren't allowed on depth/stencil.
   */
  mesa_fmt = mt->format;
-  } else if (mt->etc_format != MESA_FORMAT_NONE) {
+  } else if (intel_miptree_has_etc_shadow(brw, mt)) {
  mesa_fmt = mt->shadow_mt->format;
   } else if (plane > 0) {
  mesa_fmt = mt->format;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7146fcb6582..426782c5883 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -706,7 +706,6 @@ miptree_create(struct brw_context *brw,
 
if (intel_miptree_needs_fake_etc(brw, mt)) {
   mesa_format decomp_format = intel_lower_compressed_format(brw, format);
-  mt->etc_format = format;
   mt->shadow_mt = make_surface(brw, target, decomp_format, first_level,
last_level, width0, height0, depth0,
num_samples, tiling_flags,
@@ -717,10 +716,6 @@ miptree_create(struct brw_context *brw,
  intel_miptree_release(&mt);
  return NULL;
   }
-
-  mt->shadow_mt->etc_format = MESA_FORMAT_NONE;
-   } else {
-  mt->etc_format = MESA_FORMAT_NONE;
}
 
if (needs_separate_stencil(brw, mt, format)) {
@@ -1302,8 +1297,6 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
   mt_format = MESA_FORMAT_Z24_UNORM_S8_UINT;
if (mt->format == MESA_FORMAT_Z_FLOAT32 && mt->stencil_mt)
   mt_format = MESA_FORMAT_Z32_FLOAT_S8X24_UINT;
-   if (mt->etc_format != MESA_FORMAT_NONE)
-  mt_format = mt->etc_format;
 
if (_mesa_get_srgb_format_linear(image->TexFormat) !=
_mesa_get_srgb_format_linear(mt_format))
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 752aeaaf9b7..3e53a0049cc 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -215,21 +215,11 @@ struct intel_mipmap_tree
 * MESA_FORMAT_Z_FLOAT32, otherwise for MESA_FORMAT_Z24_UNORM_S8_UINT objects it will be
 * MESA_FORMAT_Z24_UNORM_X8_UINT.
 *
-* For ETC1/ETC2 textures, this is one of the uncompressed mesa texture
-* formats if the hardware lacks support for ETC1/ETC2. See @ref etc_format.
-*
 * @see RENDER_SURFACE_STATE.SurfaceFormat
 * @see 3DSTATE_DEPTH_BUFFER.SurfaceFormat
 */
mesa_format format;
 
-   /**
-* This variable stores the value of ETC compressed texture format
-*
-* @see RENDER_SURFACE_STATE.SurfaceFormat
-*/
-   mesa_format etc_format;
-
GLuint first_level;
GLuint last_level;
 
-- 
2.20.1
