Re: [Mesa-dev] [PATCH 6/6] r600: don't emit tes samplers/views when tes isn't active

2018-01-02 Thread Konstantin Kharlamov

Yeah, the testing was done on AMD TURKS.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH 6/6] r600: don't emit tes samplers/views when tes isn't active

2018-01-02 Thread Konstantin Kharlamov
Sorry, I don't have the expertise to give an r-b, but here's my t-b :) I 
found no statistically significant changes in "the big keybench" of 
`vblank_mode=0 ./xonotic-linux64-glx`.


But note, there's trailing whitespace in patch 5 (first "+" after "@@ 
-1267,6 +1268,20 @@") and in patch 6 (first "+" after "@@ -1723,6 +1723,21 
@@").


Tested-by: Konstantin Kharlamov 

On 03.01.2018 05:25, srol...@vmware.com wrote:

From: Roland Scheidegger 

Similar to const buffers. The driver must not emit any tes-related state if tes
is disabled: the hw slots are all shared with VS, so emitting it would
overwrite the VS state (the mesa state tracker might not do this, but it would
be perfectly legal for it to do so).
Nevertheless, I think the dirty state tracking logic in the driver is
fundamentally flawed when tes is disabled/enabled, since it looks to me like
the VS (and TES) state would not get re-emitted to the correct slots (if it's
not dirty anyway). Unless I'm missing something...
Theoretically, the overwrite problem could be solved by using non-overlapping
resource slots for TES and VS (since we're not even close to using half the
resource slots), but that wouldn't work for constant buffers or samplers, and
VS changes would still need to be propagated to both LS and VS, so it's
probably not a useful idea.
Unfortunately there's zero coverage of this in piglit, since all tessellation
shader tests are just shader_runner tests, which are unsuitable for testing
any kind of state dependency tracking issues (so I can't even quickly hack
something up to prove it and fix it...).
TCS otoh is just fine - like GS it has its own hw slots.
---
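Not part of the patch: a minimal, self-contained sketch of the overwrite
problem described above, assuming a toy model in which VS and TES emit into
one shared slot. `toy_ctx`, `hw_slot` and the `toy_emit_*` functions are
hypothetical names for illustration, not r600 code; only the early return on
a NULL `tes_shader` mirrors the change in this patch.

```c
#include <stddef.h>

/* Toy model only: one "hardware" sampler slot shared by VS and TES.
 * Every name here is hypothetical and does not exist in r600. */
struct toy_ctx {
	const void *tes_shader; /* NULL when tessellation is disabled */
	int vs_sampler;         /* state bound for the VS */
	int tes_sampler;        /* state bound for the TES */
	int hw_slot;            /* the single slot both stages emit to */
};

static void toy_emit_vs_sampler(struct toy_ctx *ctx)
{
	ctx->hw_slot = ctx->vs_sampler;
}

/* Unguarded variant: always writes the shared slot. */
static void toy_emit_tes_sampler_unguarded(struct toy_ctx *ctx)
{
	ctx->hw_slot = ctx->tes_sampler;
}

/* Guarded variant: skips emission while no TES is bound. */
static void toy_emit_tes_sampler(struct toy_ctx *ctx)
{
	if (!ctx->tes_shader)
		return;
	ctx->hw_slot = ctx->tes_sampler;
}
```

With `tes_shader == NULL`, the guarded emit leaves the VS value in the slot,
while the unguarded one clobbers it. The sketch does not model the
dirty-tracking concern raised in the commit message.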
  src/gallium/drivers/r600/evergreen_state.c   |  4 
  src/gallium/drivers/r600/r600_state_common.c | 15 +++
  2 files changed, 19 insertions(+)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 4cc48dfa11..fb1de9cbf4 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2334,6 +2334,8 @@ static void evergreen_emit_tcs_sampler_views(struct r600_context *rctx, struct r
 
 static void evergreen_emit_tes_sampler_views(struct r600_context *rctx, struct r600_atom *atom)
 {
+   if (!rctx->tes_shader)
+       return;
    evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_TESS_EVAL].views,
                                 EG_FETCH_CONSTANTS_OFFSET_VS + R600_MAX_CONST_BUFFERS, 0);
 }
@@ -2404,6 +2406,8 @@ static void evergreen_emit_tcs_sampler_states(struct r600_context *rctx, struct
 
 static void evergreen_emit_tes_sampler_states(struct r600_context *rctx, struct r600_atom *atom)
 {
+   if (!rctx->tes_shader)
+       return;
    evergreen_emit_sampler_states(rctx, &rctx->samplers[PIPE_SHADER_TESS_EVAL], 18,
                                  R_00A414_TD_VS_SAMPLER0_BORDER_INDEX, 0);
 }
diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 4364350487..a434156c16 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1723,6 +1723,21 @@ static bool r600_update_derived_state(struct r600_context *rctx)
            UPDATE_SHADER_CLIP(R600_HW_STAGE_VS, vs);
        }
    }
+   
+   /*
+    * XXX: I believe there's some fatal flaw in the dirty state logic when
+    * enabling/disabling tes.
+    * VS/ES share all buffer/resource/sampler slots. If TES is enabled,
+    * it will therefore overwrite the VS slots. If it now gets disabled,
+    * the VS needs to rebind all buffer/resource/sampler slots - not only
+    * has TES overwritten the corresponding slots, but when the VS was
+    * operating as LS the things with corresponding dirty bits got bound
+    * to LS slots and won't reflect what is dirty as VS stage even if the
+    * TES didn't overwrite it. The story for re-enabled TES is similar.
+    * In any case, we're not allowed to submit any TES state when
+    * TES is disabled (the state tracker may not do this but this looks
+    * like an optimization to me, not something which can be relied on).
+    */
 
    /* Update clip misc state. */
    if (clip_so_current) {




Re: [Mesa-dev] Retrieving useful debug infos about (AMD) Mesa and Firefox quirks

2018-01-02 Thread Tapani Pälli

Hi;

On 01/02/2018 10:46 PM, Germano Massullo wrote:

Hi everyone!

Bug report

"Latest mesa breaks firefox on kde plasma with compositing on" 
https://bugs.freedesktop.org/show_bug.cgi?id=103699


is about Intel cards, and it looks like the bug has been fixed with an 
Intel-only patch. Personally, I am experiencing the same bug on the AMDGPU 
(FOSS) driver with an RX480 card, so I opened bug report 
https://bugs.freedesktop.org/show_bug.cgi?id=104216


As I mentioned in bug #103699, this is a different bug/issue. With 
#103699, Firefox menus (which are separate windows from the browser itself) 
are invisible, and there are no black or white squares as in your report.


Since my Firefox user experience has been ruined for several weeks, I 
would like to ask if there is anything I can do to retrieve more useful 
information for you Mesa developers.


Thank you very much


// Tapani


Re: [Mesa-dev] Radeonsi NIR tess support V3

2018-01-02 Thread Timothy Arceri
I should add that this is just initial support; there is more work to be 
done, but this fixes around 1800 piglit tests.


On 03/01/18 16:04, Timothy Arceri wrote:

V3:
- rebased on recent changes/fixes

V2:
- addressed feedback from Nicolai

The following patches lack a reviewed-by:

1-3, 5, 20




[Mesa-dev] [PATCH v3 07/20] ac: move some helpers to ac_llvm_build.c

2018-01-02 Thread Timothy Arceri
We will call these from the radeonsi NIR backend.

Reviewed-by: Nicolai Hähnle 
---
 src/amd/common/ac_llvm_build.c  | 24 +
 src/amd/common/ac_llvm_build.h  |  8 ++
 src/amd/common/ac_nir_to_llvm.c | 58 +
 3 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index c74a47a799..0ea5e7f4ca 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -99,6 +99,30 @@ ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
ctx->empty_md = LLVMMDNodeInContext(ctx->context, NULL, 0);
 }
 
+int
+ac_get_llvm_num_components(LLVMValueRef value)
+{
+   LLVMTypeRef type = LLVMTypeOf(value);
+   unsigned num_components = LLVMGetTypeKind(type) == LLVMVectorTypeKind
+ ? LLVMGetVectorSize(type)
+ : 1;
+   return num_components;
+}
+
+LLVMValueRef
+ac_llvm_extract_elem(struct ac_llvm_context *ac,
+LLVMValueRef value,
+int index)
+{
+   int count = ac_get_llvm_num_components(value);
+
+   if (count == 1)
+   return value;
+
+   return LLVMBuildExtractElement(ac->builder, value,
+  LLVMConstInt(ac->i32, index, false), "");
+}
+
 unsigned
 ac_get_type_size(LLVMTypeRef type)
 {
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index 6427d5315a..3c81e2d43d 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -83,6 +83,14 @@ void
 ac_llvm_context_init(struct ac_llvm_context *ctx, LLVMContextRef context,
 enum chip_class chip_class, enum radeon_family family);
 
+int
+ac_get_llvm_num_components(LLVMValueRef value);
+
+LLVMValueRef
+ac_llvm_extract_elem(struct ac_llvm_context *ac,
+LLVMValueRef value,
+int index);
+
 unsigned ac_get_type_size(LLVMTypeRef type);
 
 LLVMTypeRef ac_to_integer_type(struct ac_llvm_context *ctx, LLVMTypeRef t);
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 786d7fcb09..56cb4c5d36 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1127,32 +1127,10 @@ static void create_function(struct nir_to_llvm_context *ctx,
ctx->shader_info->num_user_sgprs = user_sgpr_idx;
 }
 
-static int get_llvm_num_components(LLVMValueRef value)
-{
-   LLVMTypeRef type = LLVMTypeOf(value);
-   unsigned num_components = LLVMGetTypeKind(type) == LLVMVectorTypeKind
- ? LLVMGetVectorSize(type)
- : 1;
-   return num_components;
-}
-
-static LLVMValueRef llvm_extract_elem(struct ac_llvm_context *ac,
- LLVMValueRef value,
- int index)
-{
-   int count = get_llvm_num_components(value);
-
-   if (count == 1)
-   return value;
-
-   return LLVMBuildExtractElement(ac->builder, value,
-  LLVMConstInt(ac->i32, index, false), "");
-}
-
 static LLVMValueRef trim_vector(struct ac_llvm_context *ctx,
 LLVMValueRef value, unsigned count)
 {
-   unsigned num_components = get_llvm_num_components(value);
+   unsigned num_components = ac_get_llvm_num_components(value);
if (count == num_components)
return value;
 
@@ -2453,7 +2431,7 @@ static void visit_store_ssbo(struct ac_nir_context *ctx,
 
} else {
assert(count == 1);
-   if (get_llvm_num_components(base_data) > 1)
+   if (ac_get_llvm_num_components(base_data) > 1)
data = LLVMBuildExtractElement(ctx->ac.builder, base_data,
   LLVMConstInt(ctx->ac.i32, start, false), "");
else
@@ -2480,9 +2458,9 @@ static LLVMValueRef visit_atomic_ssbo(struct ac_nir_context *ctx,
int arg_count = 0;
 
if (instr->intrinsic == nir_intrinsic_ssbo_atomic_comp_swap) {
-   params[arg_count++] = llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[3]), 0);
+   params[arg_count++] = ac_llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[3]), 0);
}
-   params[arg_count++] = llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[2]), 0);
+   params[arg_count++] = ac_llvm_extract_elem(&ctx->ac, get_src(ctx, instr->src[2]), 0);
params[arg_count++] = ctx->abi->load_ssbo(ctx->abi,
 get_src(ctx, instr->src[0]),
 true);
@@ -2959,7 +2937,7 @@ store_tcs_output(struct ac_shader_abi *abi,
for (unsigned chan = 0; chan < 8; chan++) {
 

[Mesa-dev] [PATCH v3 14/20] ac/radeonsi: add load_tess_coord() to the abi

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
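Not part of the patch: a plain-C sketch of the coordinate the new callback
produces, assuming ordinary floats in place of LLVM values. `toy_prim_mode`
and `toy_tess_coord` are hypothetical names; the (u, v, 1-u-v) rule for
triangles is the one stated in the code of this patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the primitive modes; not Mesa's enums. */
enum toy_prim_mode { TOY_PRIM_TRIANGLES, TOY_PRIM_QUADS, TOY_PRIM_ISOLINES };

/* Fill out[0..num_components-1] with the tessellation coordinate.
 * For triangles the third component is the barycentric weight 1 - (u + v);
 * for the other modes it stays 0, like the coord[] initialiser in the patch. */
static void toy_tess_coord(enum toy_prim_mode mode, float u, float v,
			   unsigned num_components, float *out)
{
	float coord[4] = { u, v, 0.0f, 0.0f };

	assert(num_components <= 4);

	if (mode == TOY_PRIM_TRIANGLES)
		coord[2] = 1.0f - (u + v);

	for (unsigned i = 0; i < num_components; i++)
		out[i] = coord[i];
}
```

For u = 0.25 and v = 0.5 in triangle mode the third component is 0.25; in
quad mode it is 0.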
 src/amd/common/ac_nir_to_llvm.c  | 20 +--
 src/amd/common/ac_shader_abi.h   |  4 +++
 src/gallium/drivers/radeonsi/si_shader.c | 42 +++-
 3 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 3259e14584..02986c2a9b 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4146,9 +4146,11 @@ visit_end_primitive(struct nir_to_llvm_context *ctx,
 }
 
 static LLVMValueRef
-visit_load_tess_coord(struct nir_to_llvm_context *ctx,
- const nir_intrinsic_instr *instr)
+load_tess_coord(struct ac_shader_abi *abi, LLVMTypeRef type,
+   unsigned num_components)
 {
+   struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
+
LLVMValueRef coord[4] = {
ctx->tes_u,
ctx->tes_v,
@@ -4160,9 +4162,8 @@ visit_load_tess_coord(struct nir_to_llvm_context *ctx,
coord[2] = LLVMBuildFSub(ctx->builder, ctx->ac.f32_1,
LLVMBuildFAdd(ctx->builder, coord[0], coord[1], ""), "");
 
-   LLVMValueRef result = ac_build_gather_values(&ctx->ac, coord, instr->num_components);
-   return LLVMBuildBitCast(ctx->builder, result,
-   get_def_type(ctx->nir, &instr->dest.ssa), "");
+   LLVMValueRef result = ac_build_gather_values(&ctx->ac, coord, num_components);
+   return LLVMBuildBitCast(ctx->builder, result, type, "");
 }
 
 static void visit_intrinsic(struct ac_nir_context *ctx,
@@ -4357,9 +4358,13 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_end_primitive:
visit_end_primitive(ctx->nctx, instr);
break;
-   case nir_intrinsic_load_tess_coord:
-   result = visit_load_tess_coord(ctx->nctx, instr);
+   case nir_intrinsic_load_tess_coord: {
+   LLVMTypeRef type = ctx->nctx ?
+   get_def_type(ctx->nctx->nir, &instr->dest.ssa) :
+   NULL;
+   result = ctx->abi->load_tess_coord(ctx->abi, type, instr->num_components);
break;
+   }
case nir_intrinsic_load_patch_vertices_in:
result = LLVMConstInt(ctx->ac.i32, ctx->nctx->options->key.tcs.input_vertices, false);
break;
@@ -6691,6 +6696,7 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
} else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) {
ctx.tes_primitive_mode = shaders[i]->info.tess.primitive_mode;
ctx.abi.load_tess_inputs = load_tes_input;
+   ctx.abi.load_tess_coord = load_tess_coord;
} else if (shaders[i]->info.stage == MESA_SHADER_VERTEX) {
if (shader_info->info.vs.needs_instance_id) {
ctx.shader_info->vs.vgpr_comp_cnt =
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index d5d7c9c327..277e4efe47 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -99,6 +99,10 @@ struct ac_shader_abi {
  bool is_compact,
  unsigned writemask);
 
+   LLVMValueRef (*load_tess_coord)(struct ac_shader_abi *abi,
+   LLVMTypeRef type,
+   unsigned num_components);
+
LLVMValueRef (*load_ubo)(struct ac_shader_abi *abi, LLVMValueRef index);
 
/**
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index c642279f41..73bf7245be 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1899,11 +1899,33 @@ static LLVMValueRef load_sample_position(struct si_shader_context *ctx, LLVMValu
return lp_build_gather_values(&ctx->gallivm, pos, 4);
 }
 
+static LLVMValueRef si_load_tess_coord(struct ac_shader_abi *abi,
+  LLVMTypeRef type,
+  unsigned num_components)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct lp_build_context *bld = &ctx->bld_base.base;
+
+   LLVMValueRef coord[4] = {
+   LLVMGetParam(ctx->main_fn, ctx->param_tes_u),
+   LLVMGetParam(ctx->main_fn, ctx->param_tes_v),
+   ctx->ac.f32_0,
+   ctx->ac.f32_0
+   };
+
+   /* For triangles, the vector should be (u, v, 1-u-v). */
+   if (ctx->shader->selector->info.properties[TGSI_PROPERTY_TES_PRIM_MODE] ==
+   PIPE_PRIM_TRIANGLES)
+   coord[2] = lp_build_sub(bld, ctx->ac.f32_1,
+   lp_build_add(bld, coord[0], coord[1]));
+
+   return 

[Mesa-dev] [PATCH v3 10/20] radeonsi: add unpack_llvm_param() helper

2018-01-02 Thread Timothy Arceri
This allows us to pass the llvm param directly rather than looking
it up.

Reviewed-by: Nicolai Hähnle 
---
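Not part of the patch: a plain-integer sketch of bitfield extraction with a
right shift and a bit width. The hunks in this patch do not show the
shift/mask body of the helper, so the logic here is an assumption about what
"extract a bitfield" means for `rshift` and `bitwidth`; `toy_unpack_bits` is
a hypothetical name and works on `uint32_t` rather than LLVM values.

```c
#include <assert.h>
#include <stdint.h>

/* Return the bitwidth-wide field of value that starts at bit rshift. */
static uint32_t toy_unpack_bits(uint32_t value, unsigned rshift,
				unsigned bitwidth)
{
	assert(bitwidth >= 1 && rshift + bitwidth <= 32);

	value >>= rshift;

	/* When the field reaches bit 31 the shift alone is enough, and
	 * skipping the mask avoids an undefined 32-bit shift by 32. */
	if (rshift + bitwidth < 32)
		value &= (UINT32_C(1) << bitwidth) - 1;

	return value;
}
```

For example, extracting 8 bits at offset 8 from 0xABCD1234 yields 0x12, and
16 bits at offset 16 yields 0xABCD.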
 src/gallium/drivers/radeonsi/si_shader.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 11831a7864..f1589f495f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -241,13 +241,10 @@ unsigned si_shader_io_get_unique_index(unsigned semantic_name, unsigned index)
 /**
  * Get the value of a shader input parameter and extract a bitfield.
  */
-static LLVMValueRef unpack_param(struct si_shader_context *ctx,
-unsigned param, unsigned rshift,
-unsigned bitwidth)
+static LLVMValueRef unpack_llvm_param(struct si_shader_context *ctx,
+ LLVMValueRef value, unsigned rshift,
+ unsigned bitwidth)
 {
-   LLVMValueRef value = LLVMGetParam(ctx->main_fn,
- param);
-
if (LLVMGetTypeKind(LLVMTypeOf(value)) == LLVMFloatTypeKind)
value = ac_to_integer(&ctx->ac, value);
 
@@ -264,6 +261,15 @@ static LLVMValueRef unpack_param(struct si_shader_context *ctx,
return value;
 }
 
+static LLVMValueRef unpack_param(struct si_shader_context *ctx,
+unsigned param, unsigned rshift,
+unsigned bitwidth)
+{
+   LLVMValueRef value = LLVMGetParam(ctx->main_fn, param);
+
+   return unpack_llvm_param(ctx, value, rshift, bitwidth);
+}
+
 static LLVMValueRef get_rel_patch_id(struct si_shader_context *ctx)
 {
switch (ctx->type) {
-- 
2.14.3



[Mesa-dev] [PATCH v3 19/20] ac: add load_tess_level() to the abi

2018-01-02 Thread Timothy Arceri
Fixes the following piglit tests in radeonsi:

vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tcs-tes-tessinner-tessouter-inputs-tris.shader_test
vs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tes-tessinner-tessouter-inputs-tris.shader_test

Reviewed-by: Nicolai Hähnle 
---
 src/amd/common/ac_nir_to_llvm.c  | 6 ++
 src/amd/common/ac_shader_abi.h   | 4 
 src/gallium/drivers/radeonsi/si_shader.c | 1 +
 3 files changed, 11 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 02986c2a9b..1ca132850d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4365,6 +4365,12 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
result = ctx->abi->load_tess_coord(ctx->abi, type, instr->num_components);
break;
}
+   case nir_intrinsic_load_tess_level_outer:
+   result = ctx->abi->load_tess_level(ctx->abi, shader_io_get_unique_index(VARYING_SLOT_TESS_LEVEL_OUTER));
+   break;
+   case nir_intrinsic_load_tess_level_inner:
+   result = ctx->abi->load_tess_level(ctx->abi, shader_io_get_unique_index(VARYING_SLOT_TESS_LEVEL_INNER));
+   break;
case nir_intrinsic_load_patch_vertices_in:
result = LLVMConstInt(ctx->ac.i32, ctx->nctx->options->key.tcs.input_vertices, false);
break;
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index 277e4efe47..992ed52cf7 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -103,6 +103,10 @@ struct ac_shader_abi {
LLVMTypeRef type,
unsigned num_components);
 
+   LLVMValueRef (*load_tess_level)(struct ac_shader_abi *abi,
+   int param);
+
+
LLVMValueRef (*load_ubo)(struct ac_shader_abi *abi, LLVMValueRef index);
 
/**
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index e7e668c9ac..7aa942da36 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5977,6 +5977,7 @@ static bool si_compile_tgsi_main(struct si_shader_context *ctx,
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_tes;
ctx->abi.load_tess_inputs = si_nir_load_input_tes;
ctx->abi.load_tess_coord = si_load_tess_coord;
+   ctx->abi.load_tess_level = si_load_tess_level;
if (shader->key.as_es)
ctx->abi.emit_outputs = si_llvm_emit_es_epilogue;
else
-- 
2.14.3



[Mesa-dev] [PATCH v3 17/20] st/glsl_to_nir/radeonsi: enable tessellation shaders

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 2 ++
 src/mesa/state_tracker/st_glsl_to_nir.cpp| 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index f96bf7c2d2..5ac020d9fc 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -157,6 +157,8 @@ void si_nir_scan_shader(const struct nir_shader *nir,
 
assert(nir->info.stage == MESA_SHADER_VERTEX ||
   nir->info.stage == MESA_SHADER_GEOMETRY ||
+  nir->info.stage == MESA_SHADER_TESS_CTRL ||
+  nir->info.stage == MESA_SHADER_TESS_EVAL ||
   nir->info.stage == MESA_SHADER_FRAGMENT);
 
info->processor = pipe_shader_type_from_mesa(nir->info.stage);
diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index 276450a64a..5683df 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -675,7 +675,9 @@ st_finalize_nir(struct st_context *st, struct gl_program *prog,
   &nir->num_outputs,
   nir->info.stage);
   st_nir_fixup_varying_slots(st, &nir->outputs);
-   } else if (nir->info.stage == MESA_SHADER_GEOMETRY) {
+   } else if (nir->info.stage == MESA_SHADER_GEOMETRY ||
+  nir->info.stage == MESA_SHADER_TESS_CTRL ||
+  nir->info.stage == MESA_SHADER_TESS_EVAL) {
   sort_varyings(&nir->inputs);
   st_nir_assign_var_locations(&nir->inputs,
   &nir->num_inputs,
-- 
2.14.3



[Mesa-dev] [PATCH v3 20/20] ac: rework ac_llvm_extract_elem()

2018-01-02 Thread Timothy Arceri
Simplifies the logic a little and asserts that index is 0 for non-vector values.

Suggested-by: Nicolai Hähnle 
---
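Not part of the patch: a toy model of the reworked control flow, assuming a
tagged struct in place of an LLVM value. `toy_value` and `toy_extract_elem`
are hypothetical names. As in the diff of this patch, a non-vector is
returned as-is after asserting the index is 0, and anything tagged as a
vector goes through the element extraction path; the bounds assert on the
vector path is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LLVM value that is a scalar or a vector. */
struct toy_value {
	bool is_vector;
	unsigned num_components; /* 1 for scalars */
	float comp[4];
};

static float toy_extract_elem(const struct toy_value *value, int index)
{
	if (!value->is_vector) {
		/* A scalar has exactly one element, so only index 0 is valid. */
		assert(index == 0);
		return value->comp[0];
	}

	/* Sketch-only check; the patch leaves this to the IR builder. */
	assert(index >= 0 && (unsigned)index < value->num_components);
	return value->comp[index];
}
```

The check is now on the kind of the value rather than on its component
count, so a request for a non-zero index of a scalar trips the assert
instead of silently returning the scalar.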
 src/amd/common/ac_llvm_build.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 0ea5e7f4ca..8a3a2abf17 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -114,10 +114,10 @@ ac_llvm_extract_elem(struct ac_llvm_context *ac,
 LLVMValueRef value,
 int index)
 {
-   int count = ac_get_llvm_num_components(value);
-
-   if (count == 1)
+   if (LLVMGetTypeKind(LLVMTypeOf(value)) != LLVMVectorTypeKind) {
+   assert(index == 0);
return value;
+   }
 
return LLVMBuildExtractElement(ac->builder, value,
   LLVMConstInt(ac->i32, index, false), "");
-- 
2.14.3



[Mesa-dev] [PATCH v3 15/20] radeonsi: add dummy implementation of si_nir_scan_tess_ctrl()

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.h|  3 +++
 src/gallium/drivers/radeonsi/si_shader_nir.c| 19 +++
 src/gallium/drivers/radeonsi/si_state_shaders.c |  1 +
 3 files changed, 23 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index c981d3562e..c449aa9684 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -652,6 +652,9 @@ const char *si_get_shader_name(const struct si_shader *shader, unsigned processo
 /* si_shader_nir.c */
 void si_nir_scan_shader(const struct nir_shader *nir,
struct tgsi_shader_info *info);
+void si_nir_scan_tess_ctrl(const struct nir_shader *nir,
+  const struct tgsi_shader_info *info,
+  struct tgsi_tessctrl_info *out);
 void si_lower_nir(struct si_shader_selector *sel);
 
 /* Inline helpers. */
diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index d2760b03bc..f96bf7c2d2 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -130,6 +130,25 @@ static void scan_instruction(struct tgsi_shader_info *info,
}
 }
 
+void si_nir_scan_tess_ctrl(const struct nir_shader *nir,
+  const struct tgsi_shader_info *info,
+  struct tgsi_tessctrl_info *out)
+{
+   memset(out, 0, sizeof(*out));
+
+   if (nir->info.stage != MESA_SHADER_TESS_CTRL)
+   return;
+
+   /* Initial value = true. Here the pass will accumulate results from
+* multiple segments surrounded by barriers. If tess factors aren't
+* written at all, it's a shader bug and we don't care if this will be
+* true.
+*/
+   out->tessfactors_are_def_in_all_invocs = true;
+
+   /* TODO: Implement scanning of tess factors, see tgsi backend. */
+}
+
 void si_nir_scan_shader(const struct nir_shader *nir,
struct tgsi_shader_info *info)
 {
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 9143f61fcd..446e417294 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -2002,6 +2002,7 @@ static void *si_create_shader_selector(struct pipe_context *ctx,
sel->nir = state->ir.nir;
 
si_nir_scan_shader(sel->nir, &sel->info);
+   si_nir_scan_tess_ctrl(sel->nir, &sel->info, &sel->tcs_info);
 
si_lower_nir(sel);
}
-- 
2.14.3



[Mesa-dev] [PATCH v3 18/20] radeonsi: add si_load_tess_level() helper

2018-01-02 Thread Timothy Arceri
This will be shared by the tgsi and nir backends.

Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.c | 31 ++-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 73bf7245be..e7e668c9ac 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1922,6 +1922,22 @@ static LLVMValueRef si_load_tess_coord(struct ac_shader_abi *abi,
return lp_build_gather_values(&ctx->gallivm, coord, 4);
 }
 
+static LLVMValueRef si_load_tess_level(struct ac_shader_abi *abi, int param)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   LLVMValueRef buffer, base, addr;
+
+   buffer = desc_from_addr_base64k(ctx, ctx->param_tcs_offchip_addr_base64k);
+
+   base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
+   addr = get_tcs_tes_buffer_address(ctx, get_rel_patch_id(ctx), NULL,
+ LLVMConstInt(ctx->i32, param, 0));
+
+   return buffer_load(&ctx->bld_base, ctx->f32,
+  ~0, buffer, base, addr, true);
+
+}
+
 void si_load_system_value(struct si_shader_context *ctx,
  unsigned index,
  const struct tgsi_full_declaration *decl)
@@ -2039,20 +2055,9 @@ void si_load_system_value(struct si_shader_context *ctx,
break;
 
case TGSI_SEMANTIC_TESSINNER:
-   case TGSI_SEMANTIC_TESSOUTER:
-   {
-   LLVMValueRef buffer, base, addr;
+   case TGSI_SEMANTIC_TESSOUTER: {
int param = si_shader_io_get_unique_index_patch(decl->Semantic.Name, 0);
-
-   buffer = desc_from_addr_base64k(ctx, ctx->param_tcs_offchip_addr_base64k);
-
-   base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
-   addr = get_tcs_tes_buffer_address(ctx, get_rel_patch_id(ctx), NULL,
- LLVMConstInt(ctx->i32, param, 0));
-
-   value = buffer_load(&ctx->bld_base, ctx->f32,
-   ~0, buffer, base, addr, true);
-
+   value = si_load_tess_level(&ctx->abi, param);
break;
}
 
-- 
2.14.3



[Mesa-dev] [PATCH v3 16/20] gallium/tgsi: add patch support to tgsi_get_gl_varying_semantic()

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
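Not part of the patch: a small sketch of the slot-to-semantic split that this
patch adds to the default case. The slot numbers and all `toy_`/`TOY_` names
are hypothetical; only the ordering VAR0 < PATCH0 matters. The generic index
is simplified to `attr - VAR0` here, whereas the real code calls
tgsi_get_generic_gl_varying_index().

```c
#include <assert.h>

enum toy_semantic { TOY_SEMANTIC_GENERIC, TOY_SEMANTIC_PATCH };

/* Hypothetical slot numbering, chosen only so that VAR0 < PATCH0. */
#define TOY_SLOT_VAR0   32u
#define TOY_SLOT_PATCH0 64u

static void toy_get_varying_semantic(unsigned attr,
				     enum toy_semantic *semantic_name,
				     unsigned *semantic_index)
{
	assert(attr >= TOY_SLOT_VAR0);

	if (attr >= TOY_SLOT_PATCH0) {
		/* Per-patch varyings get their own semantic, indexed from 0. */
		*semantic_name = TOY_SEMANTIC_PATCH;
		*semantic_index = attr - TOY_SLOT_PATCH0;
	} else {
		/* Simplification: the real index comes from a helper. */
		*semantic_name = TOY_SEMANTIC_GENERIC;
		*semantic_index = attr - TOY_SLOT_VAR0;
	}
}
```

With these made-up numbers, slot 66 maps to PATCH index 2 and slot 35 maps to
GENERIC index 3.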
 src/gallium/auxiliary/tgsi/tgsi_from_mesa.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_from_mesa.c b/src/gallium/auxiliary/tgsi/tgsi_from_mesa.c
index c014115918..659156b519 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_from_mesa.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_from_mesa.c
@@ -154,9 +154,14 @@ tgsi_get_gl_varying_semantic(gl_varying_slot attr,
default:
   assert(attr >= VARYING_SLOT_VAR0 ||
  (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7));
-  *semantic_name = TGSI_SEMANTIC_GENERIC;
-  *semantic_index =
- tgsi_get_generic_gl_varying_index(attr, needs_texcoord_semantic);
+  if (attr >= VARYING_SLOT_PATCH0) {
+ *semantic_name = TGSI_SEMANTIC_PATCH;
+ *semantic_index = attr - VARYING_SLOT_PATCH0;
+  } else {
+ *semantic_name = TGSI_SEMANTIC_GENERIC;
+ *semantic_index =
+tgsi_get_generic_gl_varying_index(attr, needs_texcoord_semantic);
+  }
   break;
}
 }
-- 
2.14.3



[Mesa-dev] [PATCH v3 12/20] radeonsi/nir: gather tess properties

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
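Not part of the patch: a sketch of the `(spacing + 1) % 3` translation used
in this patch for TGSI_PROPERTY_TES_SPACING. The two enums are hypothetical
copies whose values are assumed to follow Mesa's ordering at the time
(UNSPECIFIED, EQUAL, FRACTIONAL_ODD, FRACTIONAL_EVEN on the GL side and
FRACTIONAL_ODD, FRACTIONAL_EVEN, EQUAL on the pipe side); the STATIC_ASSERTs
in the patch check the same relationship. Rejecting UNSPECIFIED is a choice
of this sketch, not something the patch does.

```c
#include <assert.h>

/* Assumed GL-side ordering; hypothetical names. */
enum toy_gl_tess_spacing {
	TOY_GL_SPACING_UNSPECIFIED,
	TOY_GL_SPACING_EQUAL,
	TOY_GL_SPACING_FRACTIONAL_ODD,
	TOY_GL_SPACING_FRACTIONAL_EVEN
};

/* Assumed pipe-side ordering; hypothetical names. */
enum toy_pipe_tess_spacing {
	TOY_PIPE_SPACING_FRACTIONAL_ODD,
	TOY_PIPE_SPACING_FRACTIONAL_EVEN,
	TOY_PIPE_SPACING_EQUAL
};

static enum toy_pipe_tess_spacing
toy_translate_spacing(enum toy_gl_tess_spacing spacing)
{
	assert(spacing != TOY_GL_SPACING_UNSPECIFIED);

	/* 1 -> 2, 2 -> 0, 3 -> 1: a rotation of the three valid values. */
	return (enum toy_pipe_tess_spacing)((spacing + 1) % 3);
}
```

The arithmetic works because the three valid GL values are consecutive and
the pipe enum is the same list rotated by one position.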
 src/gallium/drivers/radeonsi/si_shader_nir.c | 29 
 1 file changed, 29 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index 4138e04dcb..d2760b03bc 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -83,6 +83,9 @@ static void scan_instruction(struct tgsi_shader_info *info,
case nir_intrinsic_load_instance_id:
info->uses_instanceid = 1;
break;
+   case nir_intrinsic_load_invocation_id:
+   info->uses_invocationid = true;
+   break;
case nir_intrinsic_load_vertex_id:
info->uses_vertexid = 1;
break;
@@ -95,6 +98,10 @@ static void scan_instruction(struct tgsi_shader_info *info,
case nir_intrinsic_load_primitive_id:
info->uses_primid = 1;
break;
+   case nir_intrinsic_load_tess_level_inner:
+   case nir_intrinsic_load_tess_level_outer:
+   info->reads_tess_factors = true;
+   break;
case nir_intrinsic_image_store:
case nir_intrinsic_image_atomic_add:
case nir_intrinsic_image_atomic_min:
@@ -137,6 +144,28 @@ void si_nir_scan_shader(const struct nir_shader *nir,
info->num_tokens = 2; /* indicate that the shader is non-empty */
info->num_instructions = 2;
 
+   if (nir->info.stage == MESA_SHADER_TESS_CTRL) {
+   info->properties[TGSI_PROPERTY_TCS_VERTICES_OUT] =
+   nir->info.tess.tcs_vertices_out;
+   }
+
+   if (nir->info.stage == MESA_SHADER_TESS_EVAL) {
+   if (nir->info.tess.primitive_mode == GL_ISOLINES)
+   info->properties[TGSI_PROPERTY_TES_PRIM_MODE] = 
GL_LINES;
+   else
+   info->properties[TGSI_PROPERTY_TES_PRIM_MODE] = 
nir->info.tess.primitive_mode;
+
+   STATIC_ASSERT((TESS_SPACING_EQUAL + 1) % 3 == 
PIPE_TESS_SPACING_EQUAL);
+   STATIC_ASSERT((TESS_SPACING_FRACTIONAL_ODD + 1) % 3 ==
+ PIPE_TESS_SPACING_FRACTIONAL_ODD);
+   STATIC_ASSERT((TESS_SPACING_FRACTIONAL_EVEN + 1) % 3 ==
+ PIPE_TESS_SPACING_FRACTIONAL_EVEN);
+
+   info->properties[TGSI_PROPERTY_TES_SPACING] = 
(nir->info.tess.spacing + 1) % 3;
+   info->properties[TGSI_PROPERTY_TES_VERTEX_ORDER_CW] = 
!nir->info.tess.ccw;
+   info->properties[TGSI_PROPERTY_TES_POINT_MODE] = 
nir->info.tess.point_mode;
+   }
+
if (nir->info.stage == MESA_SHADER_GEOMETRY) {
info->properties[TGSI_PROPERTY_GS_INPUT_PRIM] = 
nir->info.gs.input_primitive;
info->properties[TGSI_PROPERTY_GS_OUTPUT_PRIM] = 
nir->info.gs.output_primitive;
-- 
2.14.3
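
The spacing remap in the hunk above relies on the two enums being a rotation of each other. A small standalone sketch of that idea follows. The enum names are hypothetical and the numeric values are an assumption modelled on the Mesa headers, so treat them as illustrative; only the (x + 1) % 3 rule and the !ccw inversion come from the patch.

```c
/* Assumed layout of the GL-side enum (hypothetical names). */
enum gl_spacing_model {
	GL_SPACING_UNSPECIFIED = 0,
	GL_SPACING_EQUAL = 1,
	GL_SPACING_FRACTIONAL_ODD = 2,
	GL_SPACING_FRACTIONAL_EVEN = 3
};

/* Assumed layout of the gallium-side enum (hypothetical names). */
enum pipe_spacing_model {
	PIPE_SPACING_FRACTIONAL_ODD = 0,
	PIPE_SPACING_FRACTIONAL_EVEN = 1,
	PIPE_SPACING_EQUAL = 2
};

/* Compile-time checks in the spirit of the STATIC_ASSERTs in the patch:
 * if either enum is renumbered, the build fails instead of silently
 * producing the wrong spacing. */
_Static_assert((GL_SPACING_EQUAL + 1) % 3 == PIPE_SPACING_EQUAL,
	       "equal spacing must map onto itself");
_Static_assert((GL_SPACING_FRACTIONAL_ODD + 1) % 3 == PIPE_SPACING_FRACTIONAL_ODD,
	       "fractional odd spacing must map onto itself");
_Static_assert((GL_SPACING_FRACTIONAL_EVEN + 1) % 3 == PIPE_SPACING_FRACTIONAL_EVEN,
	       "fractional even spacing must map onto itself");

/* The remap itself, as written in the patch. */
static unsigned remap_spacing(unsigned gl_spacing)
{
	return (gl_spacing + 1) % 3;
}

/* The patch stores "clockwise" while NIR records "counter-clockwise". */
static unsigned vertex_order_cw(int ccw)
{
	return !ccw;
}
```

Guarding the arithmetic with static assertions keeps the remap branch-free while still catching a change to either enum at build time.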



[Mesa-dev] [PATCH v3 08/20] radeonsi: add nir support for tcs outputs

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.c | 124 +++
 1 file changed, 124 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 816396bf86..233f161f1c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1434,6 +1434,129 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
}
 }
 
+static void si_nir_store_output_tcs(struct ac_shader_abi *abi,
+   LLVMValueRef vertex_index,
+   LLVMValueRef param_index,
+   unsigned const_index,
+   unsigned location,
+   unsigned driver_location,
+   LLVMValueRef src,
+   unsigned component,
+   bool is_patch,
+   bool is_compact,
+   unsigned writemask)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct tgsi_shader_info *info = &ctx->shader->selector->info;
+   LLVMValueRef dw_addr, stride;
+   LLVMValueRef buffer, base, addr;
+   LLVMValueRef values[4];
+   bool skip_lds_store;
+   bool is_tess_factor = false, is_tess_inner = false;
+
+   driver_location = driver_location / 4;
+
+   if (param_index) {
+   /* Add the constant index to the indirect index */
+   param_index = LLVMBuildAdd(ctx->ac.builder, param_index,
+  LLVMConstInt(ctx->i32, const_index, 
0), "");
+   } else {
+   if (const_index != 0)
+   param_index = LLVMConstInt(ctx->i32, const_index, 0);
+   }
+
+   if (!is_patch) {
+   stride = get_tcs_out_vertex_dw_stride(ctx);
+   dw_addr = get_tcs_out_current_patch_offset(ctx);
+   dw_addr = get_dw_address_from_generic_indices(ctx, stride, 
dw_addr,
+ vertex_index, 
param_index,
+ driver_location,
+ 
info->output_semantic_name,
+ 
info->output_semantic_index,
+ is_patch);
+
+   skip_lds_store = !info->reads_pervertex_outputs;
+   } else {
+   dw_addr = get_tcs_out_current_patch_data_offset(ctx);
+   dw_addr = get_dw_address_from_generic_indices(ctx, NULL, 
dw_addr,
+ vertex_index, 
param_index,
+ driver_location,
+ 
info->output_semantic_name,
+ 
info->output_semantic_index,
+ is_patch);
+
+   skip_lds_store = !info->reads_perpatch_outputs;
+
+   if (!param_index) {
+   int name = info->output_semantic_name[driver_location];
+
+   /* Always write tess factors into LDS for the TCS 
epilog. */
+   if (name == TGSI_SEMANTIC_TESSINNER ||
+   name == TGSI_SEMANTIC_TESSOUTER) {
+   /* The epilog doesn't read LDS if invocation 0 
defines tess factors. */
+   skip_lds_store = 
!info->reads_tessfactor_outputs &&
+
ctx->shader->selector->tcs_info.tessfactors_are_def_in_all_invocs;
+   is_tess_factor = true;
+   is_tess_inner = name == TGSI_SEMANTIC_TESSINNER;
+   }
+   }
+   }
+
+   buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
+
+   base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
+
+   addr = get_tcs_tes_buffer_address_from_generic_indices(ctx, 
vertex_index,
+  param_index, 
driver_location,
+  
info->output_semantic_name,
+  
info->output_semantic_index,
+  is_patch);
+
+   for (unsigned chan = 0; chan < 4; chan++) {
+   if (!(writemask & (1 << chan)))
+   continue;
+   LLVMValueRef value = ac_llvm_extract_elem(&ctx->ac, src, chan - component);
+
+   

[Mesa-dev] [PATCH v3 04/20] radeonsi: add get_dw_address_from_generic_indices() helper

2018-01-02 Thread Timothy Arceri
This will be used by both the tgsi and nir backends.

Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.c | 76 +++-
 1 file changed, 46 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 647a5a4d40..0696020c41 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -827,6 +827,38 @@ LLVMValueRef si_get_bounded_indirect_index(struct 
si_shader_context *ctx,
return si_llvm_bound_index(ctx, result, num);
 }
 
+static LLVMValueRef get_dw_address_from_generic_indices(struct 
si_shader_context *ctx,
+   LLVMValueRef 
vertex_dw_stride,
+   LLVMValueRef base_addr,
+   LLVMValueRef 
vertex_index,
+   LLVMValueRef 
param_index,
+   unsigned input_index,
+   ubyte *name,
+   ubyte *index,
+   bool is_patch)
+{
+   if (vertex_dw_stride) {
+   base_addr = LLVMBuildAdd(ctx->ac.builder, base_addr,
+LLVMBuildMul(ctx->ac.builder, 
vertex_index,
+ vertex_dw_stride, ""), 
"");
+   }
+
+   if (param_index) {
+   base_addr = LLVMBuildAdd(ctx->ac.builder, base_addr,
+LLVMBuildMul(ctx->ac.builder, 
param_index,
+ LLVMConstInt(ctx->i32, 4, 
0), ""), "");
+   }
+
+   int param = is_patch ?
+   si_shader_io_get_unique_index_patch(name[input_index],
+   index[input_index]) :
+   si_shader_io_get_unique_index(name[input_index],
+ index[input_index]);
+
+   /* Add the base address of the element. */
+   return LLVMBuildAdd(ctx->ac.builder, base_addr,
+   LLVMConstInt(ctx->i32, param * 4, 0), "");
+}
 
 /**
  * Calculate a dword address given an input or output register and a stride.
@@ -839,8 +871,10 @@ static LLVMValueRef get_dw_address(struct 
si_shader_context *ctx,
 {
struct tgsi_shader_info *info = &ctx->shader->selector->info;
ubyte *name, *index, *array_first;
-   int first, param;
+   int input_index;
struct tgsi_full_dst_register reg;
+   LLVMValueRef vertex_index = NULL;
+   LLVMValueRef ind_index = NULL;
 
/* Set the register description. The address computation is the same
 * for sources and destinations. */
@@ -858,17 +892,11 @@ static LLVMValueRef get_dw_address(struct 
si_shader_context *ctx,
/* If the register is 2-dimensional (e.g. an array of vertices
 * in a primitive), calculate the base address of the vertex. */
if (reg.Register.Dimension) {
-   LLVMValueRef index;
-
if (reg.Dimension.Indirect)
-   index = si_get_indirect_index(ctx, ,
+   vertex_index = si_get_indirect_index(ctx, 
,
  1, reg.Dimension.Index);
else
-   index = LLVMConstInt(ctx->i32, reg.Dimension.Index, 0);
-
-   base_addr = LLVMBuildAdd(ctx->ac.builder, base_addr,
-LLVMBuildMul(ctx->ac.builder, index,
- vertex_dw_stride, ""), 
"");
+   vertex_index = LLVMConstInt(ctx->i32, 
reg.Dimension.Index, 0);
}
 
/* Get information about the register. */
@@ -887,34 +915,22 @@ static LLVMValueRef get_dw_address(struct 
si_shader_context *ctx,
 
if (reg.Register.Indirect) {
/* Add the relative address of the element. */
-   LLVMValueRef ind_index;
-
if (reg.Indirect.ArrayID)
-   first = array_first[reg.Indirect.ArrayID];
+   input_index = array_first[reg.Indirect.ArrayID];
else
-   first = reg.Register.Index;
+   input_index = reg.Register.Index;
 
ind_index = si_get_indirect_index(ctx, ,
- 1, reg.Register.Index - 
first);
-
-   base_addr = LLVMBuildAdd(ctx->ac.builder, base_addr,
-   LLVMBuildMul(ctx->ac.builder, ind_index,
-LLVMConstInt(ctx->i32, 4, 0), 
""), "");
-
-   param = 

[Mesa-dev] [PATCH v3 06/20] ac: add store_tcs_outputs() to the abi

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/amd/common/ac_nir_to_llvm.c | 63 +
 src/amd/common/ac_shader_abi.h  | 12 
 2 files changed, 51 insertions(+), 24 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index ea8bdd338e..786d7fcb09 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2902,65 +2902,64 @@ load_tcs_output(struct nir_to_llvm_context *ctx,
 }
 
 static void
-store_tcs_output(struct nir_to_llvm_context *ctx,
-nir_intrinsic_instr *instr,
+store_tcs_output(struct ac_shader_abi *abi,
+LLVMValueRef vertex_index,
+LLVMValueRef param_index,
+unsigned const_index,
+unsigned location,
+unsigned driver_location,
 LLVMValueRef src,
+unsigned component,
+bool is_patch,
+bool is_compact,
 unsigned writemask)
 {
+   struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
LLVMValueRef dw_addr;
LLVMValueRef stride = NULL;
LLVMValueRef buf_addr = NULL;
-   LLVMValueRef vertex_index = NULL;
-   LLVMValueRef indir_index = NULL;
-   unsigned const_index = 0;
unsigned param;
-   const unsigned comp = instr->variables[0]->var->data.location_frac;
-   const bool per_vertex = nir_is_per_vertex_io(instr->variables[0]->var, 
ctx->stage);
-   const bool is_compact = instr->variables[0]->var->data.compact;
bool store_lds = true;
 
-   if (instr->variables[0]->var->data.patch) {
-   if (!(ctx->tcs_patch_outputs_read & (1U << 
(instr->variables[0]->var->data.location - VARYING_SLOT_PATCH0
+   if (is_patch) {
+   if (!(ctx->tcs_patch_outputs_read & (1U << (location - 
VARYING_SLOT_PATCH0
store_lds = false;
} else {
-   if (!(ctx->tcs_outputs_read & (1ULL << 
instr->variables[0]->var->data.location)))
+   if (!(ctx->tcs_outputs_read & (1ULL << location)))
store_lds = false;
}
-   get_deref_offset(ctx->nir, instr->variables[0],
-false, NULL, per_vertex ? _index : NULL,
-_index, _index);
 
-   param = 
shader_io_get_unique_index(instr->variables[0]->var->data.location);
-   if (instr->variables[0]->var->data.location == VARYING_SLOT_CLIP_DIST0 
&&
+   param = shader_io_get_unique_index(location);
+   if (location == VARYING_SLOT_CLIP_DIST0 &&
is_compact && const_index > 3) {
const_index -= 3;
param++;
}
 
-   if (!instr->variables[0]->var->data.patch) {
+   if (!is_patch) {
stride = unpack_param(&ctx->ac, ctx->tcs_out_layout, 13, 8);
dw_addr = get_tcs_out_current_patch_offset(ctx);
} else {
dw_addr = get_tcs_out_current_patch_data_offset(ctx);
}
 
-   mark_tess_output(ctx, instr->variables[0]->var->data.patch, param);
+   mark_tess_output(ctx, is_patch, param);
 
dw_addr = get_dw_address(ctx, dw_addr, param, const_index, is_compact, 
vertex_index, stride,
-indir_index);
+param_index);
buf_addr = get_tcs_tes_buffer_address_params(ctx, param, const_index, 
is_compact,
-vertex_index, indir_index);
+vertex_index, param_index);
 
bool is_tess_factor = false;
-   if (instr->variables[0]->var->data.location == 
VARYING_SLOT_TESS_LEVEL_INNER ||
-   instr->variables[0]->var->data.location == 
VARYING_SLOT_TESS_LEVEL_OUTER)
+   if (location == VARYING_SLOT_TESS_LEVEL_INNER ||
+   location == VARYING_SLOT_TESS_LEVEL_OUTER)
is_tess_factor = true;
 
unsigned base = is_compact ? const_index : 0;
for (unsigned chan = 0; chan < 8; chan++) {
if (!(writemask & (1 << chan)))
continue;
-   LLVMValueRef value = llvm_extract_elem(&ctx->ac, src, chan - comp);
+   LLVMValueRef value = llvm_extract_elem(&ctx->ac, src, chan - component);
 
if (store_lds || is_tess_factor)
ac_lds_store(&ctx->ac, dw_addr, value);
@@ -3266,7 +3265,22 @@ visit_store_var(struct ac_nir_context *ctx,
case nir_var_shader_out:
 
if (ctx->stage == MESA_SHADER_TESS_CTRL) {
-   store_tcs_output(ctx->nctx, instr, src, writemask);
+   LLVMValueRef vertex_index = NULL;
+   LLVMValueRef indir_index = NULL;
+   unsigned const_index = 0;
+   const unsigned location = 
instr->variables[0]->var->data.location;
+  

[Mesa-dev] [PATCH v3 13/20] radeonsi: make si_llvm_emit_tcs_epilogue compatible with emit_outputs abi

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 4811277233..c642279f41 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3245,9 +3245,12 @@ si_insert_input_ptr_as_2xi32(struct si_shader_context 
*ctx, LLVMValueRef ret,
 }
 
 /* This only writes the tessellation factor levels. */
-static void si_llvm_emit_tcs_epilogue(struct lp_build_tgsi_context *bld_base)
+static void si_llvm_emit_tcs_epilogue(struct ac_shader_abi *abi,
+ unsigned max_outputs,
+ LLVMValueRef *addrs)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
LLVMBuilderRef builder = ctx->ac.builder;
LLVMValueRef rel_patch_id, invocation_id, tf_lds_offset;
 
@@ -5955,7 +5958,8 @@ static bool si_compile_tgsi_main(struct si_shader_context 
*ctx,
bld_base->emit_fetch_funcs[TGSI_FILE_OUTPUT] = fetch_output_tcs;
bld_base->emit_store = store_output_tcs;
ctx->abi.store_tcs_outputs = si_nir_store_output_tcs;
-   bld_base->emit_epilogue = si_llvm_emit_tcs_epilogue;
+   ctx->abi.emit_outputs = si_llvm_emit_tcs_epilogue;
+   bld_base->emit_epilogue = si_tgsi_emit_epilogue;
break;
case PIPE_SHADER_TESS_EVAL:
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_tes;
-- 
2.14.3
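
The signature change above follows a pattern that recurs through this series: the shared code only sees a callback table, and the driver recovers its own context from the pointer to that table. A minimal sketch of the pattern is below. All names are hypothetical, and the pointer arithmetic stands in for whatever si_shader_context_from_abi() really does (I assume a container_of-style lookup, which the thread does not show).

```c
#include <stddef.h>

/* Callback table seen by the shared code (hypothetical model of the abi). */
struct shader_abi {
	void (*emit_outputs)(struct shader_abi *abi,
			     unsigned max_outputs, int *addrs);
};

/* Driver-private context that embeds the table. */
struct driver_ctx {
	int epilogues_emitted;
	struct shader_abi abi;
};

/* Recover the enclosing context from a pointer to its embedded member. */
static struct driver_ctx *driver_ctx_from_abi(struct shader_abi *abi)
{
	return (struct driver_ctx *)((char *)abi -
				     offsetof(struct driver_ctx, abi));
}

/* Driver-side epilogue with the abi-compatible signature. The output
 * arguments are unused here, much as the TCS epilogue only cares about
 * tessellation factors. */
static void driver_emit_tcs_epilogue(struct shader_abi *abi,
				     unsigned max_outputs, int *addrs)
{
	struct driver_ctx *ctx = driver_ctx_from_abi(abi);

	(void)max_outputs;
	(void)addrs;
	ctx->epilogues_emitted++;
}

/* Shared code: knows nothing about driver_ctx, only the table. */
static int run_shared_frontend(struct shader_abi *abi)
{
	if (!abi->emit_outputs)
		return -1;
	abi->emit_outputs(abi, 0, NULL);
	return 0;
}
```

With this shape, both the TGSI path and the NIR path can reach the same epilogue, since neither needs to know the concrete context type.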



[Mesa-dev] [PATCH v3 09/20] ac: add {tcs,tes}_patch_id to the abi

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/amd/common/ac_nir_to_llvm.c   | 20 ++--
 src/amd/common/ac_shader_abi.h|  2 ++
 src/gallium/drivers/radeonsi/si_shader.c  | 17 -
 src/gallium/drivers/radeonsi/si_shader_internal.h |  2 --
 4 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 56cb4c5d36..bdf171b24e 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -111,10 +111,8 @@ struct nir_to_llvm_context {
LLVMValueRef oc_lds;
LLVMValueRef merged_wave_info;
LLVMValueRef tess_factor_offset;
-   LLVMValueRef tcs_patch_id;
LLVMValueRef tcs_rel_ids;
LLVMValueRef tes_rel_patch_id;
-   LLVMValueRef tes_patch_id;
LLVMValueRef tes_u;
LLVMValueRef tes_v;
 
@@ -684,7 +682,7 @@ declare_tes_input_vgprs(struct nir_to_llvm_context *ctx, 
struct arg_info *args)
add_arg(args, ARG_VGPR, ctx->ac.f32, &ctx->tes_u);
add_arg(args, ARG_VGPR, ctx->ac.f32, &ctx->tes_v);
add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->tes_rel_patch_id);
-   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->tes_patch_id);
+   add_arg(args, ARG_VGPR, ctx->ac.i32, &ctx->abi.tes_patch_id);
 }
 
 static void
@@ -850,7 +848,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
>view_index);
 
add_arg(, ARG_VGPR, ctx->ac.i32,
-   &ctx->tcs_patch_id);
+   &ctx->abi.tcs_patch_id);
add_arg(, ARG_VGPR, ctx->ac.i32,
>tcs_rel_ids);
 
@@ -878,7 +876,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_arg(, ARG_SGPR, ctx->ac.i32,
>tess_factor_offset);
add_arg(, ARG_VGPR, ctx->ac.i32,
-   &ctx->tcs_patch_id);
+   &ctx->abi.tcs_patch_id);
add_arg(, ARG_VGPR, ctx->ac.i32,
>tcs_rel_ids);
}
@@ -4218,11 +4216,13 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
ctx->nctx->shader_info->gs.uses_prim_id = true;
result = ctx->abi->gs_prim_id;
} else if (ctx->stage == MESA_SHADER_TESS_CTRL) {
-   ctx->nctx->shader_info->tcs.uses_prim_id = true;
-   result = ctx->nctx->tcs_patch_id;
+   if (ctx->nctx)
+   ctx->nctx->shader_info->tcs.uses_prim_id = true;
+   result = ctx->abi->tcs_patch_id;
} else if (ctx->stage == MESA_SHADER_TESS_EVAL) {
-   ctx->nctx->shader_info->tcs.uses_prim_id = true;
-   result = ctx->nctx->tes_patch_id;
+   if (ctx->nctx)
+   ctx->nctx->shader_info->tcs.uses_prim_id = true;
+   result = ctx->abi->tes_patch_id;
} else
fprintf(stderr, "Unknown primitive id intrinsic: %d", 
ctx->stage);
break;
@@ -6545,7 +6545,7 @@ static void ac_nir_fixup_ls_hs_input_vgprs(struct 
nir_to_llvm_context *ctx)
ctx->abi.instance_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->rel_auto_id, ctx->abi.instance_id, "");
ctx->vs_prim_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->abi.vertex_id, ctx->vs_prim_id, "");
ctx->rel_auto_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->tcs_rel_ids, ctx->rel_auto_id, "");
-   ctx->abi.vertex_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->tcs_patch_id, ctx->abi.vertex_id, "");
+   ctx->abi.vertex_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->abi.tcs_patch_id, ctx->abi.vertex_id, "");
 }
 
 static void prepare_gs_input_vgprs(struct nir_to_llvm_context *ctx)
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index fd2ec06fb1..6f526d9f25 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -42,6 +42,8 @@ struct ac_shader_abi {
LLVMValueRef draw_id;
LLVMValueRef vertex_id;
LLVMValueRef instance_id;
+   LLVMValueRef tcs_patch_id;
+   LLVMValueRef tes_patch_id;
LLVMValueRef gs_prim_id;
LLVMValueRef gs_invocation_id;
LLVMValueRef frag_pos[4];
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 233f161f1c..11831a7864 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -763,11 +763,9 @@ static LLVMValueRef get_primitive_id(struct 
si_shader_context *ctx,
return LLVMGetParam(ctx->main_fn,

[Mesa-dev] [PATCH v3 11/20] ac/radeonsi: add tcs_rel_ids to the abi

2018-01-02 Thread Timothy Arceri
Reviewed-by: Nicolai Hähnle 
---
 src/amd/common/ac_nir_to_llvm.c   | 15 +++
 src/amd/common/ac_shader_abi.h|  1 +
 src/gallium/drivers/radeonsi/si_shader.c  | 19 ++-
 src/gallium/drivers/radeonsi/si_shader_internal.h |  1 -
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index bdf171b24e..3259e14584 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -111,7 +111,6 @@ struct nir_to_llvm_context {
LLVMValueRef oc_lds;
LLVMValueRef merged_wave_info;
LLVMValueRef tess_factor_offset;
-   LLVMValueRef tcs_rel_ids;
LLVMValueRef tes_rel_patch_id;
LLVMValueRef tes_u;
LLVMValueRef tes_v;
@@ -402,7 +401,7 @@ static LLVMValueRef get_rel_patch_id(struct 
nir_to_llvm_context *ctx)
 {
switch (ctx->stage) {
case MESA_SHADER_TESS_CTRL:
-   return unpack_param(&ctx->ac, ctx->tcs_rel_ids, 0, 8);
+   return unpack_param(&ctx->ac, ctx->abi.tcs_rel_ids, 0, 8);
case MESA_SHADER_TESS_EVAL:
return ctx->tes_rel_patch_id;
break;
@@ -850,7 +849,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_arg(, ARG_VGPR, ctx->ac.i32,
>abi.tcs_patch_id);
add_arg(, ARG_VGPR, ctx->ac.i32,
-   &ctx->tcs_rel_ids);
+   &ctx->abi.tcs_rel_ids);
 
declare_vs_input_vgprs(ctx, );
} else {
@@ -878,7 +877,7 @@ static void create_function(struct nir_to_llvm_context *ctx,
add_arg(, ARG_VGPR, ctx->ac.i32,
>abi.tcs_patch_id);
add_arg(, ARG_VGPR, ctx->ac.i32,
-   &ctx->tcs_rel_ids);
+   &ctx->abi.tcs_rel_ids);
}
break;
case MESA_SHADER_TESS_EVAL:
@@ -4206,7 +4205,7 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
break;
case nir_intrinsic_load_invocation_id:
if (ctx->stage == MESA_SHADER_TESS_CTRL)
-   result = unpack_param(&ctx->ac, ctx->nctx->tcs_rel_ids, 8, 5);
+   result = unpack_param(&ctx->ac, ctx->abi->tcs_rel_ids, 8, 5);
else
result = ctx->abi->gs_invocation_id;
break;
@@ -6154,8 +6153,8 @@ write_tess_factors(struct nir_to_llvm_context *ctx)
 {
unsigned stride, outer_comps, inner_comps;
struct ac_build_if_state if_ctx, inner_if_ctx;
-   LLVMValueRef invocation_id = unpack_param(&ctx->ac, ctx->tcs_rel_ids, 8, 5);
-   LLVMValueRef rel_patch_id = unpack_param(&ctx->ac, ctx->tcs_rel_ids, 0, 8);
+   LLVMValueRef invocation_id = unpack_param(&ctx->ac, ctx->abi.tcs_rel_ids, 8, 5);
+   LLVMValueRef rel_patch_id = unpack_param(&ctx->ac, ctx->abi.tcs_rel_ids, 0, 8);
unsigned tess_inner_index, tess_outer_index;
LLVMValueRef lds_base, lds_inner, lds_outer, byteoffset, buffer;
LLVMValueRef out[6], vec0, vec1, tf_base, inner[4], outer[4];
@@ -6544,7 +6543,7 @@ static void ac_nir_fixup_ls_hs_input_vgprs(struct 
nir_to_llvm_context *ctx)
  ctx->ac.i32_0, "");
ctx->abi.instance_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->rel_auto_id, ctx->abi.instance_id, "");
ctx->vs_prim_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->abi.vertex_id, ctx->vs_prim_id, "");
-   ctx->rel_auto_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->tcs_rel_ids, ctx->rel_auto_id, "");
+   ctx->rel_auto_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->abi.tcs_rel_ids, ctx->rel_auto_id, "");
ctx->abi.vertex_id = LLVMBuildSelect(ctx->ac.builder, hs_empty, 
ctx->abi.tcs_patch_id, ctx->abi.vertex_id, "");
 }
 
diff --git a/src/amd/common/ac_shader_abi.h b/src/amd/common/ac_shader_abi.h
index 6f526d9f25..d5d7c9c327 100644
--- a/src/amd/common/ac_shader_abi.h
+++ b/src/amd/common/ac_shader_abi.h
@@ -43,6 +43,7 @@ struct ac_shader_abi {
LLVMValueRef vertex_id;
LLVMValueRef instance_id;
LLVMValueRef tcs_patch_id;
+   LLVMValueRef tcs_rel_ids;
LLVMValueRef tes_patch_id;
LLVMValueRef gs_prim_id;
LLVMValueRef gs_invocation_id;
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index f1589f495f..4811277233 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -274,7 +274,7 @@ static LLVMValueRef get_rel_patch_id(struct 
si_shader_context *ctx)
 {
switch (ctx->type) {
case PIPE_SHADER_TESS_CTRL:
-   return unpack_param(ctx, ctx->param_tcs_rel_ids, 0, 8);
+   return 

[Mesa-dev] [PATCH v3 05/20] radeonsi: add si_nir_load_input_tcs()

2018-01-02 Thread Timothy Arceri
V2: drop type param and just use ctx->i32
---
 src/gallium/drivers/radeonsi/si_shader.c | 45 
 1 file changed, 45 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0696020c41..816396bf86 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1208,6 +1208,50 @@ static LLVMValueRef fetch_input_tcs(
return lds_load(bld_base, tgsi2llvmtype(bld_base, type), swizzle, 
dw_addr);
 }
 
+static LLVMValueRef si_nir_load_input_tcs(struct ac_shader_abi *abi,
+ LLVMValueRef vertex_index,
+ LLVMValueRef param_index,
+ unsigned const_index,
+ unsigned location,
+ unsigned driver_location,
+ unsigned component,
+ unsigned num_components,
+ bool is_patch,
+ bool is_compact)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct tgsi_shader_info *info = &ctx->shader->selector->info;
+   struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
+   LLVMValueRef dw_addr, stride;
+
+   driver_location = driver_location / 4;
+
+   stride = get_tcs_in_vertex_dw_stride(ctx);
+   dw_addr = get_tcs_in_current_patch_offset(ctx);
+
+   if (param_index) {
+   /* Add the constant index to the indirect index */
+   param_index = LLVMBuildAdd(ctx->ac.builder, param_index,
+  LLVMConstInt(ctx->i32, const_index, 
0), "");
+   } else {
+   param_index = LLVMConstInt(ctx->i32, const_index, 0);
+   }
+
+   dw_addr = get_dw_address_from_generic_indices(ctx, stride, dw_addr,
+ vertex_index, param_index,
+ driver_location,
+ info->input_semantic_name,
+ 
info->input_semantic_index,
+ is_patch);
+
+   LLVMValueRef value[4];
+   for (unsigned i = 0; i < num_components + component; i++) {
+   value[i] = lds_load(bld_base, ctx->i32, i, dw_addr);
+   }
+
+   return ac_build_varying_gather_values(&ctx->ac, value, num_components, component);
+}
+
 static LLVMValueRef fetch_output_tcs(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
@@ -5778,6 +5822,7 @@ static bool si_compile_tgsi_main(struct si_shader_context 
*ctx,
break;
case PIPE_SHADER_TESS_CTRL:
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_tcs;
+   ctx->abi.load_tess_inputs = si_nir_load_input_tcs;
bld_base->emit_fetch_funcs[TGSI_FILE_OUTPUT] = fetch_output_tcs;
bld_base->emit_store = store_output_tcs;
bld_base->emit_epilogue = si_llvm_emit_tcs_epilogue;
-- 
2.14.3
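
The address that si_nir_load_input_tcs() asks for is built by get_dw_address_from_generic_indices() from patch 04. Its arithmetic can be modelled with plain integers, as in the sketch below. The struct and function names are hypothetical; NULL LLVM values are modelled as boolean flags, and the unique slot index is passed in directly rather than looked up from the semantic name and index arrays.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain-integer model of the helper's inputs (hypothetical names). */
struct dw_addr_inputs {
	uint32_t base_addr;        /* dword offset of the current patch */
	bool has_vertex_stride;    /* false models vertex_dw_stride == NULL */
	uint32_t vertex_dw_stride; /* dwords per vertex */
	uint32_t vertex_index;
	bool has_param_index;      /* false models param_index == NULL */
	uint32_t param_index;      /* indirect plus constant index */
	uint32_t unique_index;     /* result of the unique-index lookup */
};

/* Every slot is a vec4, hence the factor of 4 dwords per slot. */
static uint32_t model_dw_address(const struct dw_addr_inputs *in)
{
	uint32_t addr = in->base_addr;

	if (in->has_vertex_stride)
		addr += in->vertex_index * in->vertex_dw_stride;

	if (in->has_param_index)
		addr += in->param_index * 4;

	/* Add the base address of the element. */
	return addr + in->unique_index * 4;
}
```

For a per-vertex input with base 100, stride 16, vertex 2, param index 3 and unique slot 5, the model gives 100 + 32 + 12 + 20 = 164 dwords; a per-patch value with no stride and no indirect index gives 100 + 20 = 120.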



[Mesa-dev] [PATCH v3 03/20] ac: call load_tcs_input() via the abi

2018-01-02 Thread Timothy Arceri
This also enables some code sharing with tes.

V2: drop type param and just use ctx->i32
---
 src/amd/common/ac_nir_to_llvm.c | 36 +---
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 4127b24239..ea8bdd338e 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2832,35 +2832,33 @@ get_dw_address(struct nir_to_llvm_context *ctx,
 }
 
 static LLVMValueRef
-load_tcs_input(struct nir_to_llvm_context *ctx,
-  nir_intrinsic_instr *instr)
+load_tcs_input(struct ac_shader_abi *abi,
+  LLVMValueRef vertex_index,
+  LLVMValueRef indir_index,
+  unsigned const_index,
+  unsigned location,
+  unsigned driver_location,
+  unsigned component,
+  unsigned num_components,
+  bool is_patch,
+  bool is_compact)
 {
+   struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
LLVMValueRef dw_addr, stride;
-   unsigned const_index;
-   LLVMValueRef vertex_index;
-   LLVMValueRef indir_index;
-   unsigned param;
LLVMValueRef value[4], result;
-   const bool per_vertex = nir_is_per_vertex_io(instr->variables[0]->var, 
ctx->stage);
-   const bool is_compact = instr->variables[0]->var->data.compact;
-   param = 
shader_io_get_unique_index(instr->variables[0]->var->data.location);
-   get_deref_offset(ctx->nir, instr->variables[0],
-false, NULL, per_vertex ? _index : NULL,
-_index, _index);
+   unsigned param = shader_io_get_unique_index(location);
 
stride = unpack_param(&ctx->ac, ctx->tcs_in_layout, 13, 8);
dw_addr = get_tcs_in_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, dw_addr, param, const_index, is_compact, 
vertex_index, stride,
 indir_index);
 
-   unsigned comp = instr->variables[0]->var->data.location_frac;
-   for (unsigned i = 0; i < instr->num_components + comp; i++) {
+   for (unsigned i = 0; i < num_components + component; i++) {
value[i] = ac_lds_load(&ctx->ac, dw_addr);
dw_addr = LLVMBuildAdd(ctx->builder, dw_addr,
   ctx->ac.i32_1, "");
}
-   result = ac_build_varying_gather_values(&ctx->ac, value, instr->num_components, comp);
-   result = LLVMBuildBitCast(ctx->builder, result, get_def_type(ctx->nir, &instr->dest.ssa), "");
+   result = ac_build_varying_gather_values(&ctx->ac, value, num_components, component);
return result;
 }
 
@@ -3127,9 +3125,8 @@ static LLVMValueRef visit_load_var(struct ac_nir_context 
*ctx,
 
switch (instr->variables[0]->var->data.mode) {
case nir_var_shader_in:
-   if (ctx->stage == MESA_SHADER_TESS_CTRL)
-   return load_tcs_input(ctx->nctx, instr);
-   if (ctx->stage == MESA_SHADER_TESS_EVAL) {
+   if (ctx->stage == MESA_SHADER_TESS_CTRL ||
+   ctx->stage == MESA_SHADER_TESS_EVAL) {
LLVMValueRef result;
LLVMValueRef vertex_index = NULL;
LLVMValueRef indir_index = NULL;
@@ -6698,6 +6695,7 @@ LLVMModuleRef 
ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
} else if (shaders[i]->info.stage == MESA_SHADER_TESS_CTRL) {
ctx.tcs_outputs_read = shaders[i]->info.outputs_read;
ctx.tcs_patch_outputs_read = 
shaders[i]->info.patch_outputs_read;
+   ctx.abi.load_tess_inputs = load_tcs_input;
} else if (shaders[i]->info.stage == MESA_SHADER_TESS_EVAL) {
ctx.tes_primitive_mode = 
shaders[i]->info.tess.primitive_mode;
ctx.abi.load_tess_inputs = load_tes_input;
-- 
2.14.3
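
Several hunks in this series call unpack_param() with a shift and a bit width, for example (13, 8) for the vertex stride in the layout word and (0, 8) / (8, 5) for the two fields of tcs_rel_ids. The helper itself is not quoted in the thread, so the following is a sketch of what I assume it computes, written on plain integers instead of LLVM values; the function name is hypothetical.

```c
#include <stdint.h>

/* Extract `bitwidth` bits starting at bit `rshift` from a packed 32-bit
 * word. Assumes rshift < 32 and 1 <= bitwidth <= 32 - rshift. The mask is
 * skipped when the field reaches the top bit, since the shift alone has
 * already cleared everything above it. */
static uint32_t unpack_bits(uint32_t param, unsigned rshift, unsigned bitwidth)
{
	uint32_t value = param >> rshift;

	if (rshift + bitwidth < 32)
		value &= (1u << bitwidth) - 1;

	return value;
}
```

Under that reading, a tcs_rel_ids word holding relative patch 17 in bits 0..7 and invocation 3 in bits 8..12 yields unpack_bits(word, 0, 8) == 17 and unpack_bits(word, 8, 5) == 3.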



[Mesa-dev] Radeonsi NIR tess support V3

2018-01-02 Thread Timothy Arceri
V3:
- rebased on recent changes/fixes

V2:
- addressed feedback from Nicolai

The following patches lack a reviewed-by:

1-3, 5, 20 


[Mesa-dev] [PATCH v3 01/20] radeonsi: add si_nir_load_input_tes()

2018-01-02 Thread Timothy Arceri
V2: drop type param and just use ctx->i32
---
 src/gallium/drivers/radeonsi/si_shader.c  | 48 +++
 src/gallium/drivers/radeonsi/si_shader_internal.h | 11 ++
 2 files changed, 59 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index ef1b460f45..bb251986ff 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1229,6 +1229,54 @@ static LLVMValueRef fetch_input_tes(
   buffer, base, addr, true);
 }
 
+LLVMValueRef si_nir_load_input_tes(struct ac_shader_abi *abi,
+  LLVMValueRef vertex_index,
+  LLVMValueRef param_index,
+  unsigned const_index,
+  unsigned location,
+  unsigned driver_location,
+  unsigned component,
+  unsigned num_components,
+  bool is_patch,
+  bool is_compact)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct tgsi_shader_info *info = &ctx->shader->selector->info;
+   LLVMValueRef buffer, base, addr;
+
+   driver_location = driver_location / 4;
+
+   buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
+
+   base = LLVMGetParam(ctx->main_fn, ctx->param_tcs_offchip_offset);
+
+   if (param_index) {
+   /* Add the constant index to the indirect index */
+   param_index = LLVMBuildAdd(ctx->ac.builder, param_index,
+  LLVMConstInt(ctx->i32, const_index, 
0), "");
+   } else {
+   param_index = LLVMConstInt(ctx->i32, const_index, 0);
+   }
+
+   addr = get_tcs_tes_buffer_address_from_generic_indices(ctx, 
vertex_index,
+  param_index, 
driver_location,
+  
info->input_semantic_name,
+  
info->input_semantic_index,
+  is_patch);
+
+   /* TODO: This will generate rather ordinary llvm code, although it
+* should be easy for the optimiser to fix up. In future we might want
+* to refactor buffer_load(), but for now this maximises code sharing
+* between the NIR and TGSI backends.
+*/
+   LLVMValueRef value[4];
+   for (unsigned i = component; i < num_components + component; i++) {
+   value[i] = buffer_load(&ctx->bld_base, ctx->i32, i, buffer, base, addr, true);
+   }
+
+   return ac_build_varying_gather_values(&ctx->ac, value, num_components, component);
+}
+
 static void store_output_tcs(struct lp_build_tgsi_context *bld_base,
 const struct tgsi_full_instruction *inst,
 const struct tgsi_opcode_info *info,
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
b/src/gallium/drivers/radeonsi/si_shader_internal.h
index e05927c7fd..378bfc1a7a 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -277,6 +277,17 @@ LLVMValueRef si_llvm_emit_fetch(struct 
lp_build_tgsi_context *bld_base,
enum tgsi_opcode_type type,
unsigned swizzle);
 
+LLVMValueRef si_nir_load_input_tes(struct ac_shader_abi *abi,
+  LLVMValueRef vertex_index,
+  LLVMValueRef param_index,
+  unsigned const_index,
+  unsigned location,
+  unsigned driver_location,
+  unsigned component,
+  unsigned num_components,
+  bool is_patch,
+  bool is_compact);
+
 LLVMValueRef si_llvm_load_input_gs(struct ac_shader_abi *abi,
   unsigned input_index,
   unsigned vtx_offset_param,
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 02/20] ac: add load_tes_inputs() to the abi

2018-01-02 Thread Timothy Arceri
V2: drop type param and just use ctx->i32
---
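Editor's note: moving load_tes_input() behind the ABI means the shared NIR
translator only calls a function pointer, and each driver supplies its own
implementation. The following is a minimal sketch of that pattern, not Mesa
code: every toy_* name is hypothetical, plain ints stand in for LLVMValueRef,
and the 16-bytes-per-location layout is an assumption made for illustration.
Only the "component * 4" byte offset mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shader ABI: shared code only sees this
 * table of callbacks, never a driver-specific context. */
struct toy_abi {
    int (*load_tess_inputs)(struct toy_abi *abi, unsigned location,
                            unsigned component, unsigned num_components);
};

/* A driver embeds the ABI in its own context and recovers the context
 * from the abi pointer, in the spirit of nir_to_llvm_context_from_abi(). */
struct toy_driver_ctx {
    struct toy_abi abi;
    int base;
};

static struct toy_driver_ctx *toy_ctx_from_abi(struct toy_abi *abi)
{
    return (struct toy_driver_ctx *)((char *)abi -
                                     offsetof(struct toy_driver_ctx, abi));
}

/* Driver-side implementation: computes a toy byte address. */
static int toy_driver_load(struct toy_abi *abi, unsigned location,
                           unsigned component, unsigned num_components)
{
    struct toy_driver_ctx *ctx = toy_ctx_from_abi(abi);

    assert(component + num_components <= 4);
    /* Assumed layout: 16 bytes per location, 4 bytes per component. */
    return ctx->base + (int)(location * 16 + component * 4);
}

/* Shared code: knows nothing about the driver, just forwards. */
static int toy_visit_load(struct toy_abi *abi, unsigned location,
                          unsigned component, unsigned num_components)
{
    return abi->load_tess_inputs(abi, location, component, num_components);
}
```

A second driver would fill in the same callback with its own function, which
is how the patch lets radv and radeonsi share visit_load_var().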
 src/amd/common/ac_nir_to_llvm.c  | 62 
 src/amd/common/ac_shader_abi.h   | 11 ++
 src/gallium/drivers/radeonsi/si_shader.c |  1 +
 3 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index d9f2cb408c..4127b24239 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2984,39 +2984,36 @@ store_tcs_output(struct nir_to_llvm_context *ctx,
 }
 
 static LLVMValueRef
-load_tes_input(struct nir_to_llvm_context *ctx,
-  const nir_intrinsic_instr *instr)
+load_tes_input(struct ac_shader_abi *abi,
+  LLVMValueRef vertex_index,
+  LLVMValueRef param_index,
+  unsigned const_index,
+  unsigned location,
+  unsigned driver_location,
+  unsigned component,
+  unsigned num_components,
+  bool is_patch,
+  bool is_compact)
 {
+   struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
LLVMValueRef buf_addr;
LLVMValueRef result;
-   LLVMValueRef vertex_index = NULL;
-   LLVMValueRef indir_index = NULL;
-   unsigned const_index = 0;
-   unsigned param;
-   const bool per_vertex = nir_is_per_vertex_io(instr->variables[0]->var, 
ctx->stage);
-   const bool is_compact = instr->variables[0]->var->data.compact;
+   unsigned param = shader_io_get_unique_index(location);
 
-   get_deref_offset(ctx->nir, instr->variables[0],
-false, NULL, per_vertex ? &vertex_index : NULL,
-&const_index, &indir_index);
-   param = 
shader_io_get_unique_index(instr->variables[0]->var->data.location);
-   if (instr->variables[0]->var->data.location == VARYING_SLOT_CLIP_DIST0 
&&
-   is_compact && const_index > 3) {
+   if (location == VARYING_SLOT_CLIP_DIST0 && is_compact && const_index > 
3) {
const_index -= 3;
param++;
}
 
-   unsigned comp = instr->variables[0]->var->data.location_frac;
buf_addr = get_tcs_tes_buffer_address_params(ctx, param, const_index,
-is_compact, vertex_index, 
indir_index);
+is_compact, vertex_index, 
param_index);
 
-   LLVMValueRef comp_offset = LLVMConstInt(ctx->ac.i32, comp * 4, false);
+   LLVMValueRef comp_offset = LLVMConstInt(ctx->ac.i32, component * 4, 
false);
buf_addr = LLVMBuildAdd(ctx->builder, buf_addr, comp_offset, "");
 
-   result = ac_build_buffer_load(&ctx->ac, ctx->hs_ring_tess_offchip, instr->num_components, NULL,
+   result = ac_build_buffer_load(&ctx->ac, ctx->hs_ring_tess_offchip, num_components, NULL,
  buf_addr, ctx->oc_lds, is_compact ? (4 * const_index) : 0, 1, 0, true, false);
-   result = trim_vector(&ctx->ac, result, instr->num_components);
-   result = LLVMBuildBitCast(ctx->builder, result, get_def_type(ctx->nir, &instr->dest.ssa), "");
+   result = trim_vector(&ctx->ac, result, num_components);
return result;
 }
 
@@ -3132,8 +3129,28 @@ static LLVMValueRef visit_load_var(struct ac_nir_context 
*ctx,
case nir_var_shader_in:
if (ctx->stage == MESA_SHADER_TESS_CTRL)
return load_tcs_input(ctx->nctx, instr);
-   if (ctx->stage == MESA_SHADER_TESS_EVAL)
-   return load_tes_input(ctx->nctx, instr);
+   if (ctx->stage == MESA_SHADER_TESS_EVAL) {
+   LLVMValueRef result;
+   LLVMValueRef vertex_index = NULL;
+   LLVMValueRef indir_index = NULL;
+   unsigned const_index = 0;
+   unsigned location = 
instr->variables[0]->var->data.location;
+   unsigned driver_location = 
instr->variables[0]->var->data.driver_location;
+   const bool is_patch =  
instr->variables[0]->var->data.patch;
+   const bool is_compact = 
instr->variables[0]->var->data.compact;
+
+   get_deref_offset(ctx, instr->variables[0],
+false, NULL, is_patch ? NULL : &vertex_index,
+&const_index, &indir_index);
+
+   result = ctx->abi->load_tess_inputs(ctx->abi, 
vertex_index, indir_index,
+   const_index, 
location, driver_location,
+   
instr->variables[0]->var->data.location_frac,
+   
instr->num_components,
+   is_patch, 
is_compact);
+   return LLVMBuildBitCast(ctx->ac.builder, result, 
get_def_type(ctx, 

Re: [Mesa-dev] 10-bit Mesa/Gallium support

2018-01-02 Thread Mario Kleiner

On 12/31/2017 05:53 PM, Ilia Mirkin wrote:

On Thu, Nov 23, 2017 at 1:31 PM, Mario Kleiner
 wrote:

On 11/23/2017 06:45 PM, Ilia Mirkin wrote:


On Thu, Nov 23, 2017 at 12:35 PM, Marek Olšák  wrote:


Hi everybody,

Mario, feel free to push your patches if you haven't yet. (except the
workaround)



Hi,

just started 10 minutes ago with rebasing my current patchset against mesa
master. Will need some adjustments and retesting against i965.

I was also just "sort of done" with a mesa/gallium 10 bit version. I think
i'll submit rev 3 later today or tomorrow and maybe we'll need to sort this
out then, what goes where. I'll compare with Marek's branch...

The current state of my series for AMD here is that radeon-kms + ati-ddx
works nicely under exa (and with a slightly patched weston), but the ati-ddx
also needed some small patches which i have to send out. On amdgpu-kms i
know it works under my patched weston branch.

What is completely missing is glamor support, ergo support for at least
amdgpu-ddx and modesetting-ddx -- and xwayland.


For AMD, I applied Mario's patches (except Wayland - that didn't
apply) and added initial Gallium support:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=10bit

What's the status of Glamor?

Do we have patches for xf86-video-amdgpu? The closed driver should have
10-bit support, meaning we should have DDX patches already somewhere,
right?



Somewhere there must be some, as the amdgpu-pro driver with the proprietary
libGL supported depth 30 at least in some version i tested earlier this
year?



I'd like to test this out with nouveau as well... do I understand
correctly that I shouldn't need anything special to check if it
basically works? i.e. I apply the patches, start Xorg in bpp=30 mode,
and then if glxgears works then I'm done? Is there a good way that I'm
really in 30bpp mode as far as all the software is concerned? (I don't
have a colorimeter or whatever fancy hw to *really* tell the
difference, although I do have a "deep color" TV.) If used with a
24bpp display, is the hw supposed to dither somehow?

-ilia



nouveau is quite a bit of work and it's not so clear how to proceed.

My current series does do proper xrgb2101010 / argb2101010 rendering under
gallium on both nv50 and nvc0 (Tested under GeForce 9600 for tesla, GTX 970
and 1050 for maxwell and pascal). I used PRIME render offload under both
DRI3/Present and Wayland/Weston with both intel and amd as display gpus, so
i know the drivers work together properly and nouveau-gallium renders
correctly.

The display side for native scanout on Nvidia is somewhat broken atm.:

1. Since Linux 4.10 with the switch of nouveau-kms to atomic modesetting,
using drmAddFB() with depth/bpp 30/32 maps to xrgb2101010 format, but
nouveau-kms doesn't support xrgb2101010, so setting Xorg to depth 30 will
end in a server-abort with modesetting failure. nouveau before Linux 4.10
mapped 30/32 to xbgr2101010 which seems to be supported since nv50. If i
boot with a < 4.10 kernel i get a picture at least on the old GeForce 9600
and GT330M.

If i hack nouveau-ddx to use a xrgb2101010 color channel mask (red in msb's,
blue in lsbs) instead of the correct xbgr2101010 mask, then i can get
nouveau-gallium to render 10 bits, but of course with swapped red and blue
channels. Switching dithering on via xrandr lets rendered 10 bit
images reach an 8 bpc display, as confirmed via colorimeter. I hope a
deep color TV might work without dithering.

According to

https://github.com/envytools/envytools/blob/master/rnndb/display/nv_evo.xml

gpu's since kepler gk104 support xrgb2101010 scanout. With a hacked
nouveau-kms i can get the maxwell and pascal cards to accept xrgb2101010,
but the display is beyond weird. So far i couldn't make much sense of the
pixeltrash -- some of it remotely resembles a desktop, but something is
going wrong badly. Also the xbgr2101010 mode doesn't work correctly. The same
is true for Wayland+Weston and even if i run Weston with pixman, keeping
Mesa out of the picture. So nouveau-kms needs some work for all modern
nvidia gpu's. Gamma table handling changed quite a bit, so maybe something
is wrong there.

2. We might also need some work for exa on nvc0+, but it's not clear what
problems are caused by kernel side, and what in exa.

3. In principle the clean solution for nouveau would be to upgrade the ddx
to drmAddFB2 ioctl, and use xbgr2101010 scanout to support everything back
to nv50+, but everything we have in X or Wayland is meant for xrgb2101010
not xbgr2101010. And we run into ambiguities of what, e.g., a depth 30
pixmap means in some extensions like GLX_EXT_texture_from_pixmap. And the GLX
extension generally seems to have unresolved problems with ARGB formats
instead of ABGR formats, which is why Mesa doesn't expose ARGB by default --
only on Android atm.
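
Editor's note: the channel-order problem from point 1 can be made concrete
with a small sketch. It assumes the usual DRM fourcc layouts, where
xrgb2101010 keeps red in the most significant 10-bit field and xbgr2101010
keeps blue there. The helper names are made up for this illustration and are
not part of any driver.

```c
#include <assert.h>
#include <stdint.h>

/* XRGB2101010: bits 31:30 unused, red 29:20, green 19:10, blue 9:0. */
static uint32_t pack_xrgb2101010(uint32_t r, uint32_t g, uint32_t b)
{
    assert(r < 1024 && g < 1024 && b < 1024);
    return (r << 20) | (g << 10) | b;
}

/* XBGR2101010: bits 31:30 unused, blue 29:20, green 19:10, red 9:0. */
static uint32_t pack_xbgr2101010(uint32_t r, uint32_t g, uint32_t b)
{
    assert(r < 1024 && g < 1024 && b < 1024);
    return (b << 20) | (g << 10) | r;
}

/* Exchange the top and bottom 10-bit fields, leaving green and the two
 * unused bits alone. This converts between the two layouts, and it is
 * also what scanout effectively does when a buffer rendered in one
 * layout is interpreted as the other. */
static uint32_t swap_red_blue_2101010(uint32_t px)
{
    uint32_t hi = (px >> 20) & 0x3ff;
    uint32_t lo = px & 0x3ff;

    return (px & 0xc00ffc00u) | (lo << 20) | hi;
}
```

Pure red packed as xrgb2101010 is 0x3ff00000; read back as xbgr2101010 the
same word is pure blue, which matches the swapped channels described above.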

So on nouveau everything except the gallium bits is quite a bit messy at the
moment, but the gallium bits work according 

[Mesa-dev] [PATCH 4/6] r600: RFC: use GET_BUFFER_RESINFO vtx fetch on eg instead of setting up consts

2018-01-02 Thread sroland
From: Roland Scheidegger 

Contrary to what the comment said, this appears to work just fine on my rv770
(tested with piglit textureSize 140 fs/vs samplerBuffer).
I have no clue though if it's actually preferable to use it (unfortunately
we cannot get rid of the tex constants completely, as we still require them
for cube map txq).
Albeit filling in the format (1 channel or 4?) and the stuff related to mega-
or mini-fetch (what the hell is this...) is just a guess based on other usage
of vtx fetch instructions...
The docs (for eg, not cayman) suggest this has to be done through tc cache
but it seems to work either way (since it actually just fetches the value from
the buffer descriptor I'm not sure why caches would be involved).
---
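Editor's note: the value the patch writes into dword 4 of the buffer resource
is an element count. A minimal sketch of that arithmetic follows, assuming
that "stride" is the byte size of one element of the buffer's format; the
function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Element count a buffer resource would report for a size query:
 * size in bytes divided by the per-element stride, rounded down.
 * Mirrors "params->size / stride" from the diff below. */
static uint32_t buffer_num_elements(uint32_t size_bytes, uint32_t stride_bytes)
{
    assert(stride_bytes != 0);
    return size_bytes / stride_bytes;
}
```

For example, a 4096-byte buffer viewed with a 16-byte format holds 256
elements, and a trailing partial element is not counted.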
 src/gallium/drivers/r600/evergreen_state.c   |  7 ++--
 src/gallium/drivers/r600/r600_asm.c  |  3 +-
 src/gallium/drivers/r600/r600_shader.c   | 59 ++--
 src/gallium/drivers/r600/r600_state_common.c | 39 +++---
 4 files changed, 50 insertions(+), 58 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index f5b8e7115d..f645791a2c 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -653,11 +653,12 @@ static void evergreen_fill_buffer_resource_words(struct 
r600_context *rctx,
S_030008_ENDIAN_SWAP(endian);
tex_resource_words[3] = swizzle_res | 
S_03000C_UNCACHED(params->uncached);
/*
-* in theory dword 4 is for number of elements, for use with resinfo,
-* but it seems to utterly fail to work, the amd gpu shader analyser
+* dword 4 is for number of elements, for use with resinfo,
+* albeit the amd gpu shader analyser
 * uses a const buffer to store the element sizes for buffer txq
 */
-   tex_resource_words[4] = 0;
+   tex_resource_words[4] = params->size / stride;
+
tex_resource_words[5] = tex_resource_words[6] = 0;
tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER);
 }
diff --git a/src/gallium/drivers/r600/r600_asm.c 
b/src/gallium/drivers/r600/r600_asm.c
index d6bd561f01..92c2bdf27c 100644
--- a/src/gallium/drivers/r600/r600_asm.c
+++ b/src/gallium/drivers/r600/r600_asm.c
@@ -1510,7 +1510,8 @@ int cm_bytecode_add_cf_end(struct r600_bytecode *bc)
 /* common to all 3 families */
 static int r600_bytecode_vtx_build(struct r600_bytecode *bc, struct 
r600_bytecode_vtx *vtx, unsigned id)
 {
-   bc->bytecode[id] = S_SQ_VTX_WORD0_BUFFER_ID(vtx->buffer_id) |
+   bc->bytecode[id] = S_SQ_VTX_WORD0_VTX_INST(vtx->op) |
+   S_SQ_VTX_WORD0_BUFFER_ID(vtx->buffer_id) |
S_SQ_VTX_WORD0_FETCH_TYPE(vtx->fetch_type) |
S_SQ_VTX_WORD0_SRC_GPR(vtx->src_gpr) |
S_SQ_VTX_WORD0_SRC_SEL_X(vtx->src_sel_x);
diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 8a36bcf1b4..51c38a6e00 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -6949,31 +6949,48 @@ static int do_vtx_fetch_inst(struct r600_shader_ctx 
*ctx, boolean src_requires_l
 static int r600_do_buffer_txq(struct r600_shader_ctx *ctx, int reg_idx, int 
offset)
 {
struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
-   struct r600_bytecode_alu alu;
int r;
int id = tgsi_tex_get_src_gpr(ctx, reg_idx) + offset;
+   int sampler_index_mode = inst->Src[reg_idx].Indirect.Index == 2 ? 2 : 
0; // CF_INDEX_1 : CF_INDEX_NONE
 
-   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
-   alu.op = ALU_OP1_MOV;
-   alu.src[0].sel = R600_SHADER_BUFFER_INFO_SEL;
-   if (ctx->bc->chip_class >= EVERGREEN) {
-   /* with eg each dword is either buf size or number of cubes */
-   alu.src[0].sel += id / 4;
-   alu.src[0].chan = id % 4;
-   } else {
+   if (ctx->bc->chip_class < EVERGREEN) {
+   struct r600_bytecode_alu alu;
+   memset(&alu, 0, sizeof(struct r600_bytecode_alu));
+   alu.op = ALU_OP1_MOV;
+   alu.src[0].sel = R600_SHADER_BUFFER_INFO_SEL;
/* r600 we have them at channel 2 of the second dword */
alu.src[0].sel += (id * 2) + 1;
alu.src[0].chan = 1;
+   alu.src[0].kc_bank = R600_BUFFER_INFO_CONST_BUFFER;
+   tgsi_dst(ctx, &inst->Dst[0], 0, &alu);
+   alu.last = 1;
+   r = r600_bytecode_add_alu(ctx->bc, &alu);
+   if (r)
+   return r;
+   return 0;
+   } else {
+   struct r600_bytecode_vtx vtx;
+   memset(&vtx, 0, sizeof(vtx));
+   vtx.op = FETCH_OP_GDS_MIN_UINT; /* aka GET_BUFFER_RESINFO */
+   vtx.buffer_id = id + R600_MAX_CONST_BUFFERS;
+ 

[Mesa-dev] [PATCH 2/6] r600: don't use vtx offset for load_sample_position

2018-01-02 Thread sroland
From: Roland Scheidegger 

The offset looks bogus to me. Albeit in the end it doesn't matter, by the
looks of it offsets smaller than 4 get ignored there (not sure of the rules,
I suppose either non-dword aligned offsets never work there or the offset
must be at least aligned to the size of a single element).
---
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index e28882b2e5..792da950b3 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -1284,7 +1284,7 @@ static int load_sample_position(struct r600_shader_ctx 
*ctx, struct r600_shader_
vtx.num_format_all = 2;
vtx.format_comp_all = 1;
vtx.use_const_fields = 0;
-   vtx.offset = 1; // first element is size of buffer
+   vtx.offset = 0;
vtx.endian = r600_endian_swap(32);
vtx.srf_mode_all = 1; /* SRF_MODE_NO_ZERO */
 
-- 
2.12.3



[Mesa-dev] [PATCH 3/6] r600: fix sampler indexing with texture buffers sampling

2018-01-02 Thread sroland
From: Roland Scheidegger 

This fixes the new piglit test.
(I could not actually figure out where the hell that index_1 parameter comes
from but in any case it's completely the same as for ordinary texturing...)
While here also fix up the logic for early exit of setting up driver consts.
---
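Editor's note: the early-exit fix is easiest to see when images or buffers is
NULL. The old condition required both pointers to be non-NULL before it would
ever return early, while the new one treats a missing object as "not dirty".
The sketch below reproduces only the two boolean expressions; struct
toy_state and the function names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_state {
    bool dirty_buffer_constants;
};

/* Condition as written before the patch. */
static bool can_skip_old(bool samplers_dirty,
                         const struct toy_state *images,
                         const struct toy_state *buffers)
{
    return !samplers_dirty &&
           (images && !images->dirty_buffer_constants) &&
           (buffers && !buffers->dirty_buffer_constants);
}

/* Condition as written after the patch. */
static bool can_skip_new(bool samplers_dirty,
                         const struct toy_state *images,
                         const struct toy_state *buffers)
{
    return !samplers_dirty &&
           !(images && images->dirty_buffer_constants) &&
           !(buffers && buffers->dirty_buffer_constants);
}
```

With nothing dirty and both pointers NULL, can_skip_old() returns false (so
the constants get set up again needlessly) and can_skip_new() returns true.
When both objects exist the two conditions agree.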
 src/gallium/drivers/r600/r600_shader.c   | 2 ++
 src/gallium/drivers/r600/r600_state_common.c | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 792da950b3..8a36bcf1b4 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -6856,6 +6856,7 @@ static int do_vtx_fetch_inst(struct r600_shader_ctx *ctx, 
boolean src_requires_l
struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
int src_gpr, r, i;
int id = tgsi_tex_get_src_gpr(ctx, 1);
+   int sampler_index_mode = inst->Src[1].Indirect.Index == 2 ? 2 : 0; // 
CF_INDEX_1 : CF_INDEX_NONE
 
src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
if (src_requires_loading) {
@@ -6887,6 +6888,7 @@ static int do_vtx_fetch_inst(struct r600_shader_ctx *ctx, 
boolean src_requires_l
vtx.dst_sel_z = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7;  
/* SEL_Z */
vtx.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7;  
/* SEL_W */
vtx.use_const_fields = 1;
+   vtx.buffer_index_mode = sampler_index_mode;
 
if ((r = r600_bytecode_add_vtx(ctx->bc, &vtx)))
return r;
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index e9dd80fa96..4429246d31 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1380,8 +1380,8 @@ void eg_setup_buffer_constants(struct r600_context *rctx, 
int shader_type)
}
 
if (!samplers->views.dirty_buffer_constants &&
-   (images && !images->dirty_buffer_constants) &&
-   (buffers && !buffers->dirty_buffer_constants))
+   !(images && images->dirty_buffer_constants) &&
+   !(buffers && buffers->dirty_buffer_constants))
return;
 
if (images)
-- 
2.12.3



[Mesa-dev] [PATCH 1/6] r600: increase number of ubos by one to 14

2018-01-02 Thread sroland
From: Roland Scheidegger 

Ideally we'd support 16 (d3d11 requires 15, and mesa subtracts one for non-ubo
constants), but that's kind of impossible (it would be only doable if either
we'd somehow merge the mesa non-ubo constants with the driver constants, or
only use the driver constants with vtx fetch instead of through the kcache
mechanism - the latter probably wouldn't be too bad).
For now just do as the comment already said, place the gs ring (not really
a const buffer in any case) which is only ever referred to through vc fetch
clauses at index 16. Throw in a couple asserts for good measure to make sure
the hw limit isn't exceeded.
---
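Editor's note: a sketch of the resulting slot layout, using the macro
relationships from r600_pipe.h in this patch. The TOY_ prefix and the helper
are mine, to make clear this is an illustration and not the header itself.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_USER_CONST_BUFFERS   14
#define TOY_MAX_HW_CONST_BUFFERS     16

/* Driver buffers start right after the user buffers. */
#define TOY_BUFFER_INFO_CONST_BUFFER (TOY_MAX_USER_CONST_BUFFERS)
#define TOY_LDS_INFO_CONST_BUFFER    (TOY_MAX_USER_CONST_BUFFERS + 1)
#define TOY_GS_RING_CONST_BUFFER     (TOY_MAX_USER_CONST_BUFFERS + 2)

/* True if the index can be programmed as a real hw constant buffer. */
static bool toy_is_hw_const_slot(unsigned index)
{
    return index < TOY_MAX_HW_CONST_BUFFERS;
}
```

With 14 user buffers the buffer info lands at index 14, the LDS info at 15
and the GS ring at 16. Only the GS ring falls outside the 16 hw slots, which
is acceptable because it is bound as a buffer resource and never through
kcache.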
 src/gallium/drivers/r600/evergreen_state.c |  1 +
 src/gallium/drivers/r600/r600_asm.c|  1 +
 src/gallium/drivers/r600/r600_pipe.h   | 10 ++
 src/gallium/drivers/r600/r600_state.c  |  1 +
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 81b7c4a285..f5b8e7115d 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2168,6 +2168,7 @@ static void evergreen_emit_constant_buffers(struct 
r600_context *rctx,
va = rbuffer->gpu_address + cb->buffer_offset;
 
if (!gs_ring_buffer) {
+   assert(buffer_index < R600_MAX_HW_CONST_BUFFERS);
radeon_set_context_reg_flag(cs, reg_alu_constbuf_size + 
buffer_index * 4,

DIV_ROUND_UP(cb->buffer_size, 256), pkt_flags);
radeon_set_context_reg_flag(cs, reg_alu_const_cache + 
buffer_index * 4, va >> 8,
diff --git a/src/gallium/drivers/r600/r600_asm.c 
b/src/gallium/drivers/r600/r600_asm.c
index 69b2d142c1..d6bd561f01 100644
--- a/src/gallium/drivers/r600/r600_asm.c
+++ b/src/gallium/drivers/r600/r600_asm.c
@@ -1008,6 +1008,7 @@ static int r600_bytecode_alloc_inst_kcache_lines(struct 
r600_bytecode *bc,
continue;
 
bank = alu->src[i].kc_bank;
+   assert(bank < R600_MAX_HW_CONST_BUFFERS);
line = (sel-512)>>4;
index_mode = alu->src[i].kc_rel ? 1 : 0; // V_SQ_CF_INDEX_0 / 
V_SQ_CF_INDEX_NONE
 
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index e042edf2b4..cb84bc1998 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -69,11 +69,12 @@
 #define R600_MAX_DRAW_CS_DWORDS58
 #define R600_MAX_PFP_SYNC_ME_DWORDS16
 
-#define R600_MAX_USER_CONST_BUFFERS 13
+#define EG_MAX_ATOMIC_BUFFERS 8
+
+#define R600_MAX_USER_CONST_BUFFERS 14
 #define R600_MAX_DRIVER_CONST_BUFFERS 3
 #define R600_MAX_CONST_BUFFERS (R600_MAX_USER_CONST_BUFFERS + 
R600_MAX_DRIVER_CONST_BUFFERS)
-
-#define EG_MAX_ATOMIC_BUFFERS 8
+#define R600_MAX_HW_CONST_BUFFERS 16
 
 /* start driver buffers after user buffers */
 #define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
@@ -84,7 +85,8 @@
 #define R600_LDS_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
 /*
  * Note GS doesn't use a constant buffer binding, just a resource index,
- * so it's fine to have it exist at index 16.
+ * so it's fine to have it exist at index 16. I.e. it's not actually
+ * a const buffer, just a buffer resource.
  */
 #define R600_GS_RING_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 2)
 /* Currently R600_MAX_CONST_BUFFERS just fits on the hw, which has a limit
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 253ff57a98..89cf7d2e50 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -1712,6 +1712,7 @@ static void r600_emit_constant_buffers(struct 
r600_context *rctx,
offset = cb->buffer_offset;
 
if (!gs_ring_buffer) {
+   assert(buffer_index < R600_MAX_HW_CONST_BUFFERS);
radeon_set_context_reg(cs, reg_alu_constbuf_size + 
buffer_index * 4,
   DIV_ROUND_UP(cb->buffer_size, 
256));
radeon_set_context_reg(cs, reg_alu_const_cache + 
buffer_index * 4, offset >> 8);
-- 
2.12.3



[Mesa-dev] [PATCH 6/6] r600: don't emit tes samplers/views when tes isn't active

2018-01-02 Thread sroland
From: Roland Scheidegger 

Similar to const buffers. The driver must not emit any tes-related state if tes
is disabled, since the hw slots are all shared by VS, therefore it would
overwrite them (the mesa state tracker might not do this, but it would be
perfectly legal to do so).
Nevertheless I think the dirty state tracking logic in the driver is
fundamentally flawed when tes is disabled/enabled, since it looks to me like
the VS (and TES) state would not get reemitted to the correct slots (if it's
not dirty anyway). Unless I'm missing something...
Theoretically, the overwrite problem could be solved by using non-overlapping
resource slots for TES and VS (since we're not even close to using half the
resource slots), but it wouldn't work for constant buffers nor samplers, and
for VS would still need to propagate changes to both LS and VS, so probably
not a useful idea.
Unfortunately there's zero coverage of this with piglit, since all tessellation
shader tests are just shader_runner tests, which are unsuitable for testing
any kind of state dependency tracking issues (so I can't even quickly hack
something up to prove it and fix it...).
TCS otoh is just fine - like GS it has its own hw slots.
---
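Editor's note: a toy model of the overwrite problem described above. It
assumes nothing about the real register layout: a single array stands in for
the hw slots that VS and TES share, and all names are hypothetical. The guard
in toy_emit_tes_views() corresponds to the "if (!rctx->tes_shader) return;"
added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_SLOTS 4

struct toy_ctx {
    bool tes_active;
    int hw_slots[TOY_NUM_SLOTS];  /* shared by VS and TES in this model */
    int vs_views[TOY_NUM_SLOTS];
    int tes_views[TOY_NUM_SLOTS];
};

static void toy_emit_vs_views(struct toy_ctx *c)
{
    for (int i = 0; i < TOY_NUM_SLOTS; i++)
        c->hw_slots[i] = c->vs_views[i];
}

static void toy_emit_tes_views(struct toy_ctx *c)
{
    /* Without this check, stale TES state would clobber the VS slots. */
    if (!c->tes_active)
        return;
    for (int i = 0; i < TOY_NUM_SLOTS; i++)
        c->hw_slots[i] = c->tes_views[i];
}
```

The sketch does not address the second concern raised above, namely that
state is not re-emitted to the right slots when TES is toggled.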
 src/gallium/drivers/r600/evergreen_state.c   |  4 
 src/gallium/drivers/r600/r600_state_common.c | 15 +++
 2 files changed, 19 insertions(+)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 4cc48dfa11..fb1de9cbf4 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2334,6 +2334,8 @@ static void evergreen_emit_tcs_sampler_views(struct 
r600_context *rctx, struct r
 
 static void evergreen_emit_tes_sampler_views(struct r600_context *rctx, struct 
r600_atom *atom)
 {
+   if (!rctx->tes_shader)
+   return;
evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_TESS_EVAL].views,
 EG_FETCH_CONSTANTS_OFFSET_VS + 
R600_MAX_CONST_BUFFERS, 0);
 }
@@ -2404,6 +2406,8 @@ static void evergreen_emit_tcs_sampler_states(struct 
r600_context *rctx, struct
 
 static void evergreen_emit_tes_sampler_states(struct r600_context *rctx, 
struct r600_atom *atom)
 {
+   if (!rctx->tes_shader)
+   return;
evergreen_emit_sampler_states(rctx, &rctx->samplers[PIPE_SHADER_TESS_EVAL], 18,
  R_00A414_TD_VS_SAMPLER0_BORDER_INDEX, 0);
 }
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 4364350487..a434156c16 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1723,6 +1723,21 @@ static bool r600_update_derived_state(struct 
r600_context *rctx)
UPDATE_SHADER_CLIP(R600_HW_STAGE_VS, vs);
}
}
+   
+   /*
+* XXX: I believe there's some fatal flaw in the dirty state logic when
+* enabling/disabling tes.
+* VS/ES share all buffer/resource/sampler slots. If TES is enabled,
+* it will therefore overwrite the VS slots. If it now gets disabled,
+* the VS needs to rebind all buffer/resource/sampler slots - not only
+* has TES overwritten the corresponding slots, but when the VS was
+* operating as LS the things with corresponding dirty bits got bound
+* to LS slots and won't reflect what is dirty as VS stage even if the
+* TES didn't overwrite it. The story for re-enabled TES is similar.
+* In any case, we're not allowed to submit any TES state when
+* TES is disabled (the state tracker may not do this but this looks
+* like an optimization to me, not something which can be relied on).
+*/
 
/* Update clip misc state. */
if (clip_so_current) {
-- 
2.12.3



[Mesa-dev] [PATCH 5/6] r600: increase number of UBOs to 15

2018-01-02 Thread sroland
From: Roland Scheidegger 

With the exception of the default tess levels only ever accessed
by the default tcs shader, the LDS_INFO const buffer was only accessed by vtx
instructions, and not through kcache. No idea why really, but use this to our
advantage by not using a constant buffer slot for it. This just requires us to
throw the default tess levels into the "normal" driver const buffer instead.
Alternatively, could access those constants via vtx instructions too, but then
we couldn't use an ordinary ureg prog accessing them as constants and would have
to generate that directly when compiling the default tcs shader. (Another
alternative would be to put all lds info into the ordinary driver const
buffer, albeit we'd maybe need to increase the fixed size as it can't fit
alongside the ucp since vs needs access to the lds info too.)
---
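Editor's note: two details of this patch sketched in isolation, with
hypothetical TOY_ names. First, with 15 user buffers the LDS info buffer moves
to index 16, so the emit path has to skip the kcache registers for it. Second,
the default tess levels are four outer plus two inner floats, which is where
the 6 * 4 byte size comes from.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_USER_CONST_BUFFERS  15
#define TOY_MAX_HW_CONST_BUFFERS    16
#define TOY_LDS_INFO_CONST_BUFFER   (TOY_MAX_USER_CONST_BUFFERS + 1)
#define TOY_TCS_DEFAULT_LEVELS_SIZE (6 * 4)

_Static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

/* Mirrors the new "if (buffer_index < R600_MAX_HW_CONST_BUFFERS)" test:
 * only real hw slots get their size/address registers programmed. */
static bool toy_needs_kcache_regs(unsigned buffer_index)
{
    return buffer_index < TOY_MAX_HW_CONST_BUFFERS;
}

/* Pack outer[4] followed by inner[2], as evergreen_set_tess_state() does. */
static void toy_pack_tess_levels(float dst[6], const float outer[4],
                                 const float inner[2])
{
    memcpy(dst, outer, sizeof(float) * 4);
    memcpy(dst + 4, inner, sizeof(float) * 2);
}
```

Index 16 fails the toy_needs_kcache_regs() test, so the LDS info buffer is
reachable only through vtx fetches, as the commit message says.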
 src/gallium/drivers/r600/evergreen_state.c   | 15 --
 src/gallium/drivers/r600/r600_pipe.h | 13 
 src/gallium/drivers/r600/r600_state_common.c | 31 +---
 3 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index f645791a2c..4cc48dfa11 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2168,8 +2168,7 @@ static void evergreen_emit_constant_buffers(struct 
r600_context *rctx,
 
va = rbuffer->gpu_address + cb->buffer_offset;
 
-   if (!gs_ring_buffer) {
-   assert(buffer_index < R600_MAX_HW_CONST_BUFFERS);
+   if (buffer_index < R600_MAX_HW_CONST_BUFFERS) {
radeon_set_context_reg_flag(cs, reg_alu_constbuf_size + 
buffer_index * 4,

DIV_ROUND_UP(cb->buffer_size, 256), pkt_flags);
radeon_set_context_reg_flag(cs, reg_alu_const_cache + 
buffer_index * 4, va >> 8,
@@ -3880,7 +3879,7 @@ static void evergreen_set_tess_state(struct pipe_context 
*ctx,
 
memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4);
memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2);
-   rctx->tess_state_dirty = true;
+   rctx->driver_consts[PIPE_SHADER_TESS_CTRL].tcs_default_levels_dirty = 
true;
 }
 
 static void evergreen_setup_immed_buffer(struct r600_context *rctx,
@@ -4344,7 +4343,7 @@ void evergreen_setup_tess_constants(struct r600_context 
*rctx, const struct pipe
unsigned input_vertex_size, output_vertex_size;
unsigned input_patch_size, pervertex_output_patch_size, 
output_patch_size;
unsigned output_patch0_offset, perpatch_output_offset, lds_size;
-   uint32_t values[16];
+   uint32_t values[8];
unsigned num_waves;
unsigned num_pipes = rctx->screen->b.info.r600_max_quad_pipes;
unsigned wave_divisor = (16 * num_pipes);
@@ -4364,7 +4363,6 @@ void evergreen_setup_tess_constants(struct r600_context 
*rctx, const struct pipe
 
if (rctx->lds_alloc != 0 &&
rctx->last_ls == ls &&
-   !rctx->tess_state_dirty &&
rctx->last_num_tcs_input_cp == num_tcs_input_cp &&
rctx->last_tcs == tcs)
return;
@@ -4411,17 +4409,12 @@ void evergreen_setup_tess_constants(struct r600_context 
*rctx, const struct pipe
 
rctx->lds_alloc = (lds_size | (num_waves << 14));
 
-   memcpy(&values[8], rctx->tess_state, 6 * sizeof(float));
-   values[14] = 0;
-   values[15] = 0;
-
-   rctx->tess_state_dirty = false;
rctx->last_ls = ls;
rctx->last_tcs = tcs;
rctx->last_num_tcs_input_cp = num_tcs_input_cp;
 
constbuf.user_buffer = values;
-   constbuf.buffer_size = 16 * 4;
+   constbuf.buffer_size = 8 * 4;
 
rctx->b.b.set_constant_buffer(&rctx->b.b, PIPE_SHADER_VERTEX,
  R600_LDS_INFO_CONST_BUFFER, &constbuf);
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index cb84bc1998..112b5cbb83 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -71,7 +71,7 @@
 
 #define EG_MAX_ATOMIC_BUFFERS 8
 
-#define R600_MAX_USER_CONST_BUFFERS 14
+#define R600_MAX_USER_CONST_BUFFERS 15
 #define R600_MAX_DRIVER_CONST_BUFFERS 3
 #define R600_MAX_CONST_BUFFERS (R600_MAX_USER_CONST_BUFFERS + 
R600_MAX_DRIVER_CONST_BUFFERS)
 #define R600_MAX_HW_CONST_BUFFERS 16
@@ -80,12 +80,17 @@
 #define R600_BUFFER_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS)
 #define R600_UCP_SIZE (4*4*8)
 #define R600_CS_BLOCK_GRID_SIZE (8 * 4)
+#define R600_TCS_DEFAULT_LEVELS_SIZE (6 * 4)
 #define R600_BUFFER_INFO_OFFSET (R600_UCP_SIZE)
 
+/*
+ * We only access this buffer through vtx clauses hence it's fine to exist
+ * at index beyond 15.
+ */
 #define R600_LDS_INFO_CONST_BUFFER (R600_MAX_USER_CONST_BUFFERS + 1)
 /*
  * Note GS doesn't use a constant buffer 

[Mesa-dev] [egl/android: Implement the eglSwapinterval for Android] egl/android: Implement the eglSwapinterval for Android.

2018-01-02 Thread zhongmin . wu
From: Zhongmin Wu 

Implement eglSwapInterval for the Android platform to
enable async mode for some GFX benchmarks.

Change-Id: I3576d8b92862719dae11c31e2adc2d77cb5a0b64
Signed-off-by: Zhongmin Wu 
---
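Editor's note: the control flow of droid_swap_interval() can be sketched with
a mock window. Nothing below is Android's ANativeWindow or Mesa's EGL types;
the toy_* structs and the rule that negative intervals are rejected are
assumptions made for the example. What it shows is the pattern in the patch:
a non-zero return from the window hook means failure, and the cached interval
is updated only on success.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a native window with a swap-interval hook. */
struct toy_window {
    int (*set_swap_interval)(struct toy_window *w, int interval);
    int applied_interval;
};

struct toy_surface {
    struct toy_window *window;
    int swap_interval;
};

/* Mock hook: returns 0 on success, non-zero on error. */
static int toy_window_set_interval(struct toy_window *w, int interval)
{
    if (interval < 0)
        return -1;
    w->applied_interval = interval;
    return 0;
}

/* Same shape as droid_swap_interval(): forward to the window, and only
 * remember the new interval if the window accepted it. */
static bool toy_swap_interval(struct toy_surface *surf, int interval)
{
    if (surf->window->set_swap_interval(surf->window, interval))
        return false;
    surf->swap_interval = interval;
    return true;
}
```

An interval of 0 is the case the benchmarks care about, since it asks for
swaps that are not throttled to the display refresh.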
 src/egl/drivers/dri2/platform_android.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_android.c 
b/src/egl/drivers/dri2/platform_android.c
index f6a24cd..f9c74ee 100644
--- a/src/egl/drivers/dri2/platform_android.c
+++ b/src/egl/drivers/dri2/platform_android.c
@@ -476,6 +476,18 @@ droid_destroy_surface(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surf)
return EGL_TRUE;
 }
 
+static EGLBoolean droid_swap_interval(_EGLDriver *drv, _EGLDisplay *dpy,
+_EGLSurface *surf, EGLint interval) {
+
+   struct dri2_egl_surface *dri2_surf = dri2_egl_surface(surf);
+   struct ANativeWindow *window = dri2_surf->window;
+   if (window->setSwapInterval(window, interval)) {
+  return EGL_FALSE;
+   }
+   surf->SwapInterval = interval;
+   return EGL_TRUE;
+}
+
 static int
 update_buffers(struct dri2_egl_surface *dri2_surf)
 {
@@ -1300,6 +1312,7 @@ static const struct dri2_egl_display_vtbl 
droid_display_vtbl = {
.swap_buffers = droid_swap_buffers,
.swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage, /* 
Android implements the function */
.swap_buffers_region = dri2_fallback_swap_buffers_region,
+   .swap_interval = droid_swap_interval,
 #if ANDROID_API_LEVEL >= 23
.set_damage_region = droid_set_damage_region,
 #else
@@ -1443,6 +1456,8 @@ dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *dpy)
 
dri2_setup_screen(dpy);
 
+   dri2_setup_swap_interval(dpy, 1);
+
if (!droid_add_configs_for_visuals(drv, dpy)) {
   err = "DRI2: failed to add configs";
   goto cleanup;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] util: use zlib's CRC32 implementaion for larger buffers

2018-01-02 Thread Grazvydas Ignotas
On Wed, Jan 3, 2018 at 3:09 AM, Ian Romanick  wrote:
> On 01/02/2018 04:52 PM, Grazvydas Ignotas wrote:
>> On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick  wrote:
>>> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
 zlib provides a faster slice-by-4 CRC32 implementation than the
 traditional single byte lookup one used by mesa. As most supported
 platforms now link zlib unconditionally, we can easily use it.
 For small buffers the old implementation is still used as it's faster
 with cold cache (first call), as indicated by some throughput
 benchmarking (avg MB/s, n=100, zlib 1.2.8):

        i5-6600K                   C2D E4500
 size   mesa zlib                  mesa zlib
 4        66   43  -35% +/- 4.8%     43   22  -49% +/- 9.6%
 32      193  171  -11% +/- 5.8%    129   49  -61% +/- 7.2%
 64      256  267    4% +/- 4.1%    171   63  -63% +/- 5.4%
 128     317  389   22% +/- 5.8%    253   89  -64% +/- 4.2%
 256     364  596   63% +/- 5.6%    304  166  -45% +/- 2.8%
 512     401  838  108% +/- 5.3%    338  296  -12% +/- 3.1%
 1024    420 1036  146% +/- 7.6%    375  461   23% +/- 3.7%
 1M      443 1443  225% +/- 2.1%    403 1175  191% +/- 0.9%
 100M    448 1452  224% +/- 0.3%    406 1214  198% +/- 0.3%

 With hot cache (repeated calls) zlib almost always wins on both CPUS.
 It has been verified the calculation results stay the same after this
 change.

 Signed-off-by: Grazvydas Ignotas 
 ---
  src/util/crc32.c | 13 +
  1 file changed, 13 insertions(+)

 diff --git a/src/util/crc32.c b/src/util/crc32.c
 index f2e01c6..0cffa49 100644
 --- a/src/util/crc32.c
 +++ b/src/util/crc32.c
 @@ -31,12 +31,20 @@
   *
   * @author Jose Fonseca
   */


 +#ifdef HAVE_ZLIB
 +#include <zlib.h>
 +#endif
  #include "crc32.h"

 +/* For small buffers it's faster to avoid the library call.
 + * The optimal threshold depends on CPU characteristics, it is hoped
 + * the choice below is reasonable for typical modern CPU.
 + */
 +#define ZLIB_SIZE_THRESHOLD 64
>>>
>>> For the actual users of this function in Mesa, is it even possible to
>>> pass less than 64 bytes (I'm assuming that's the units)?
>>
>> Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32
>> of a single byte.
>
> It can be done, but that's not my question.  My question is: would any
> of the existing users actually do this?  I thought the main user was the
> various disk cache / GetProgramBinary kind of things.  You won't have a
> shader binary that's a single byte... less than 64 also seems unlikely.
> Unless there are other users?

You're most likely right, I'll just drop this.

>

  static const uint32_t
  util_crc32_table[256] = {
 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
 @@ -112,10 +120,15 @@ uint32_t
  util_hash_crc32(const void *data, size_t size)
  {
 const uint8_t *p = data;
 uint32_t crc = 0xffffffff;

 +#ifdef HAVE_ZLIB
 +   if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
>>>  ^^
>>> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
>>>  I'm not 100% sure what you're trying to accomplish here, but I think
>>> you want 'size > 0'.  I'm not familiar with this zlib function, so I
>>> don't know what its expectations for size are.
>>
>> zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit
>> architectures, so if someone decides to CRC32 >= 4GiB buffer, this
>> function will do the wrong thing without such check. Newer zlib has
>> crc32_z that takes size_t, but I was trying to avoid build system
>> complications of detecting that function...
>
> Ok.  That makes sense.  I'll bet this adds a warning about tautological
> compares only on 32-bit, then.

I don't seem to be getting a warning on gcc 7.2.0 or 5.4.0 as well as
clang 3.8.0 at least.

>  I was going to suggest comparing with
> 0xffffffff instead, but GCC emits worse code for that... and it probably
> still results in the warning on 32-bit.  Maybe just a comment (based on
> your reply) that describes what's happening for the next person that
> reads the code.

ok


Re: [Mesa-dev] [PATCH 3/3] util: use zlib's CRC32 implementaion for larger buffers

2018-01-02 Thread Ian Romanick
On 01/02/2018 04:52 PM, Grazvydas Ignotas wrote:
> On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick  wrote:
>> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
>>> zlib provides a faster slice-by-4 CRC32 implementation than the
>>> traditional single byte lookup one used by mesa. As most supported
>>> platforms now link zlib unconditionally, we can easily use it.
>>> For small buffers the old implementation is still used as it's faster
>>> with cold cache (first call), as indicated by some throughput
>>> benchmarking (avg MB/s, n=100, zlib 1.2.8):
>>>
>>>        i5-6600K                   C2D E4500
>>> size   mesa zlib                  mesa zlib
>>> 4        66   43  -35% +/- 4.8%     43   22  -49% +/- 9.6%
>>> 32      193  171  -11% +/- 5.8%    129   49  -61% +/- 7.2%
>>> 64      256  267    4% +/- 4.1%    171   63  -63% +/- 5.4%
>>> 128     317  389   22% +/- 5.8%    253   89  -64% +/- 4.2%
>>> 256     364  596   63% +/- 5.6%    304  166  -45% +/- 2.8%
>>> 512     401  838  108% +/- 5.3%    338  296  -12% +/- 3.1%
>>> 1024    420 1036  146% +/- 7.6%    375  461   23% +/- 3.7%
>>> 1M      443 1443  225% +/- 2.1%    403 1175  191% +/- 0.9%
>>> 100M    448 1452  224% +/- 0.3%    406 1214  198% +/- 0.3%
>>>
>>> With hot cache (repeated calls) zlib almost always wins on both CPUS.
>>> It has been verified the calculation results stay the same after this
>>> change.
>>>
>>> Signed-off-by: Grazvydas Ignotas 
>>> ---
>>>  src/util/crc32.c | 13 +
>>>  1 file changed, 13 insertions(+)
>>>
>>> diff --git a/src/util/crc32.c b/src/util/crc32.c
>>> index f2e01c6..0cffa49 100644
>>> --- a/src/util/crc32.c
>>> +++ b/src/util/crc32.c
>>> @@ -31,12 +31,20 @@
>>>   *
>>>   * @author Jose Fonseca
>>>   */
>>>
>>>
>>> +#ifdef HAVE_ZLIB
>>> +#include <zlib.h>
>>> +#endif
>>>  #include "crc32.h"
>>>
>>> +/* For small buffers it's faster to avoid the library call.
>>> + * The optimal threshold depends on CPU characteristics, it is hoped
>>> + * the choice below is reasonable for typical modern CPU.
>>> + */
>>> +#define ZLIB_SIZE_THRESHOLD 64
>>
>> For the actual users of this function in Mesa, is it even possible to
>> pass less than 64 bytes (I'm assuming that's the units)?
> 
> Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32
> of a single byte.

It can be done, but that's not my question.  My question is: would any
of the existing users actually do this?  I thought the main user was the
various disk cache / GetProgramBinary kind of things.  You won't have a
shader binary that's a single byte... less than 64 also seems unlikely.
Unless there are other users?

>>>
>>>  static const uint32_t
>>>  util_crc32_table[256] = {
>>> 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba,
>>> 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
>>> @@ -112,10 +120,15 @@ uint32_t
>>>  util_hash_crc32(const void *data, size_t size)
>>>  {
>>> const uint8_t *p = data;
>>> uint32_t crc = 0xffffffff;
>>>
>>> +#ifdef HAVE_ZLIB
>>> +   if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
>>  ^^
>> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
>>  I'm not 100% sure what you're trying to accomplish here, but I think
>> you want 'size > 0'.  I'm not familiar with this zlib function, so I
>> don't know what its expectations for size are.
> 
> zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit
> architectures, so if someone decides to CRC32 >= 4GiB buffer, this
> function will do the wrong thing without such check. Newer zlib has
> crc32_z that takes size_t, but I was trying to avoid build system
> complications of detecting that function...

Ok.  That makes sense.  I'll bet this adds a warning about tautological
compares only on 32-bit, then.  I was going to suggest comparing with
0xffffffff instead, but GCC emits worse code for that... and it probably
still results in the warning on 32-bit.  Maybe just a comment (based on
your reply) that describes what's happening for the next person that
reads the code.

> Gražvydas


Re: [Mesa-dev] [PATCH 1/3] spirv: Add a mechanism for dumping failing shaders

2018-01-02 Thread Ian Romanick
On 01/02/2018 04:24 PM, Grazvydas Ignotas wrote:
> On Tue, Jan 2, 2018 at 6:30 PM, Jason Ekstrand  wrote:
>> ---
>>  src/compiler/spirv/spirv_to_nir.c | 29 +
>>  src/compiler/spirv/vtn_private.h  |  1 +
>>  2 files changed, 30 insertions(+)
>>
>> diff --git a/src/compiler/spirv/spirv_to_nir.c 
>> b/src/compiler/spirv/spirv_to_nir.c
>> index dcff56f..751fb03 100644
>> --- a/src/compiler/spirv/spirv_to_nir.c
>> +++ b/src/compiler/spirv/spirv_to_nir.c
>> @@ -31,6 +31,9 @@
>>  #include "nir/nir_constant_expressions.h"
>>  #include "spirv_info.h"
>>
>> +#include 
>> +#include 
>> +
>>  void
>>  vtn_log(struct vtn_builder *b, enum nir_spirv_debug_level level,
>>  size_t spirv_offset, const char *message)
>> @@ -94,6 +97,27 @@ vtn_log_err(struct vtn_builder *b,
>> ralloc_free(msg);
>>  }
>>
>> +static void
>> +vtn_dump_shader(struct vtn_builder *b, const char *path, const char *prefix)
>> +{
>> +   static int idx = 0;
>> +
>> +   char filename[1024];
>> +   int len = snprintf(filename, sizeof(filename), "%s/%s-%d.spirv",
>> +  path, prefix, idx++);
>> +   if (len < 0 || len >= sizeof(filename))
>> +  return;
>> +
>> +   int fd = open(filename, O_CREAT | O_CLOEXEC | O_WRONLY, 0777);
>> +   if (fd < 0)
>> +  return;
>> +
>> +   write(fd, b->spirv, b->spirv_word_count * 4);
> 
> Feel free to ignore, but what about * sizeof(b->spirv[0]) ?
> 
> also, this emits a not-so-useful warning for me:
> warning: ignoring return value of ‘write’, declared with attribute
> warn_unused_result [-Wunused-result]
> (and no, sticking (void) before write() doesn't help)

Oh, that's annoying... but technically correct.  The problem is that
write() might only write part of your data.  If the returned size is
less than the size you asked, you have to try again.  This is why people
use fopen/fwrite/fclose. :)  Either that or:

   ssize_t remain = b->spirv_word_count * 4;
   unsigned offset = 0;

   /* write() may accept fewer bytes than requested, so loop until done. */
   do {
      ssize_t written =
         write(fd, (uint8_t *) b->spirv + offset, remain);

      if (written < 0) {
         /* Error.  Bail out. */
         break;
      }

      remain -= written;
      offset += written;
   } while (remain > 0);

I'll let Jason pick. :D
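
For comparison, the fopen/fwrite/fclose route mentioned above could look roughly like the sketch below. This is only an illustration under assumptions: dump_words() and its parameters are hypothetical names, not anything from the patch, and the word buffer stands in for b->spirv with b->spirv_word_count entries. stdio retries short writes internally, so a single fwrite() return value is enough to tell whether everything was handed over.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write word_count 32-bit words to filename.
 * Returns true only if every word was written and the file closed cleanly.
 */
static bool
dump_words(const char *filename, const uint32_t *words, size_t word_count)
{
   FILE *fp = fopen(filename, "wb");
   if (fp == NULL)
      return false;

   /* fwrite() reports the number of complete items written. */
   size_t written = fwrite(words, sizeof(words[0]), word_count, fp);
   bool ok = (written == word_count);

   /* Buffered data is flushed at fclose(), so its result matters too. */
   if (fclose(fp) != 0)
      ok = false;

   return ok;
}
```

Using sizeof(words[0]) as the item size also covers the "* 4" remark earlier in the thread, and the return value gives the compiler nothing to warn about.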

> Gražvydas


Re: [Mesa-dev] [PATCH 3/3] util: use zlib's CRC32 implementaion for larger buffers

2018-01-02 Thread Grazvydas Ignotas
On Tue, Jan 2, 2018 at 11:38 PM, Ian Romanick  wrote:
> On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
>> zlib provides a faster slice-by-4 CRC32 implementation than the
>> traditional single byte lookup one used by mesa. As most supported
>> platforms now link zlib unconditionally, we can easily use it.
>> For small buffers the old implementation is still used as it's faster
>> with cold cache (first call), as indicated by some throughput
>> benchmarking (avg MB/s, n=100, zlib 1.2.8):
>>
>>        i5-6600K                   C2D E4500
>> size   mesa zlib                  mesa zlib
>> 4        66   43  -35% +/- 4.8%     43   22  -49% +/- 9.6%
>> 32      193  171  -11% +/- 5.8%    129   49  -61% +/- 7.2%
>> 64      256  267    4% +/- 4.1%    171   63  -63% +/- 5.4%
>> 128     317  389   22% +/- 5.8%    253   89  -64% +/- 4.2%
>> 256     364  596   63% +/- 5.6%    304  166  -45% +/- 2.8%
>> 512     401  838  108% +/- 5.3%    338  296  -12% +/- 3.1%
>> 1024    420 1036  146% +/- 7.6%    375  461   23% +/- 3.7%
>> 1M      443 1443  225% +/- 2.1%    403 1175  191% +/- 0.9%
>> 100M    448 1452  224% +/- 0.3%    406 1214  198% +/- 0.3%
>>
>> With hot cache (repeated calls) zlib almost always wins on both CPUS.
>> It has been verified the calculation results stay the same after this
>> change.
>>
>> Signed-off-by: Grazvydas Ignotas 
>> ---
>>  src/util/crc32.c | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/src/util/crc32.c b/src/util/crc32.c
>> index f2e01c6..0cffa49 100644
>> --- a/src/util/crc32.c
>> +++ b/src/util/crc32.c
>> @@ -31,12 +31,20 @@
>>   *
>>   * @author Jose Fonseca
>>   */
>>
>>
>> +#ifdef HAVE_ZLIB
>> +#include <zlib.h>
>> +#endif
>>  #include "crc32.h"
>>
>> +/* For small buffers it's faster to avoid the library call.
>> + * The optimal threshold depends on CPU characteristics, it is hoped
>> + * the choice below is reasonable for typical modern CPU.
>> + */
>> +#define ZLIB_SIZE_THRESHOLD 64
>
> For the actual users of this function in Mesa, is it even possible to
> pass less than 64 bytes (I'm assuming that's the units)?

Hmm why wouldn't it be? The unit is a byte, and you can compute CRC32
of a single byte.
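
To make the "single byte lookup" approach concrete, here is a minimal sketch of a table-driven CRC32 that works for any size, one byte included. The names sketch_crc32_init() and sketch_crc32() are made up for this example, and the table is built at run time instead of being a static array. It follows the standard CRC-32 convention with a final inversion, which is an assumption of this sketch and may not match the last step of util_hash_crc32() in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; not the identifiers used in src/util/crc32.c. */
static uint32_t sketch_crc32_table[256];

/* Build the 256-entry lookup table for the reflected polynomial 0xedb88320. */
static void
sketch_crc32_init(void)
{
   for (uint32_t i = 0; i < 256; i++) {
      uint32_t c = i;
      for (int bit = 0; bit < 8; bit++)
         c = (c & 1) ? (c >> 1) ^ 0xedb88320u : c >> 1;
      sketch_crc32_table[i] = c;
   }
}

/* One table lookup per input byte; any size is valid, including 1. */
static uint32_t
sketch_crc32(const void *data, size_t size)
{
   const uint8_t *p = data;
   uint32_t crc = 0xffffffffu;

   while (size--)
      crc = sketch_crc32_table[(crc ^ *p++) & 0xff] ^ (crc >> 8);

   return ~crc;
}
```

The generated table starts with the same entries as the one quoted below (0x00000000, 0x77073096, ...). Slice-by-4 keeps the same recurrence but uses four tables to consume four bytes per step.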

>
>>
>>  static const uint32_t
>>  util_crc32_table[256] = {
>> 0x, 0x77073096, 0xee0e612c, 0x990951ba,
>> 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
>> @@ -112,10 +120,15 @@ uint32_t
>>  util_hash_crc32(const void *data, size_t size)
>>  {
>> const uint8_t *p = data;
>> uint32_t crc = 0x;
>>
>> +#ifdef HAVE_ZLIB
>> +   if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
>  ^^
> This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
>  I'm not 100% sure what you're trying to accomplish here, but I think
> you want 'size > 0'.  I'm not familiar with this zlib function, so I
> don't know what its expectations for size are.

zlib's uInt is always 32bit while size_t is 64bit on most (all?) 64bit
architectures, so if someone decides to CRC32 >= 4GiB buffer, this
function will do the wrong thing without such check. Newer zlib has
crc32_z that takes size_t, but I was trying to avoid build system
complications of detecting that function...
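
The cast-and-compare idiom can be shown in isolation with a small sketch. The names here are hypothetical: narrow_len_t stands in for zlib's uInt, and the size is passed as uint64_t (an assumption made so the example behaves the same on 32-bit and 64-bit hosts, where size_t differs).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for zlib's uInt in this sketch. */
typedef uint32_t narrow_len_t;

/* True when size survives a round trip through the narrower type,
 * i.e. when no high bits would be silently dropped by the cast.
 */
static bool
fits_in_narrow_len(uint64_t size)
{
   return (narrow_len_t)size == size;
}
```

A 4 GiB length fails the check, so the caller can fall back to the byte-wise loop instead of handing a truncated length to the library.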

Gražvydas


Re: [Mesa-dev] [PATCH 1/3] spirv: Add a mechanism for dumping failing shaders

2018-01-02 Thread Grazvydas Ignotas
On Tue, Jan 2, 2018 at 6:30 PM, Jason Ekstrand  wrote:
> ---
>  src/compiler/spirv/spirv_to_nir.c | 29 +
>  src/compiler/spirv/vtn_private.h  |  1 +
>  2 files changed, 30 insertions(+)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c 
> b/src/compiler/spirv/spirv_to_nir.c
> index dcff56f..751fb03 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -31,6 +31,9 @@
>  #include "nir/nir_constant_expressions.h"
>  #include "spirv_info.h"
>
> +#include 
> +#include 
> +
>  void
>  vtn_log(struct vtn_builder *b, enum nir_spirv_debug_level level,
>  size_t spirv_offset, const char *message)
> @@ -94,6 +97,27 @@ vtn_log_err(struct vtn_builder *b,
> ralloc_free(msg);
>  }
>
> +static void
> +vtn_dump_shader(struct vtn_builder *b, const char *path, const char *prefix)
> +{
> +   static int idx = 0;
> +
> +   char filename[1024];
> +   int len = snprintf(filename, sizeof(filename), "%s/%s-%d.spirv",
> +  path, prefix, idx++);
> +   if (len < 0 || len >= sizeof(filename))
> +  return;
> +
> +   int fd = open(filename, O_CREAT | O_CLOEXEC | O_WRONLY, 0777);
> +   if (fd < 0)
> +  return;
> +
> +   write(fd, b->spirv, b->spirv_word_count * 4);

Feel free to ignore, but what about * sizeof(b->spirv[0]) ?

also, this emits a not-so-useful warning for me:
warning: ignoring return value of ‘write’, declared with attribute
warn_unused_result [-Wunused-result]
(and no, sticking (void) before write() doesn't help)

Gražvydas


Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Ian Romanick
On 01/02/2018 04:05 PM, Jason Ekstrand wrote:
> On January 2, 2018 15:22:51 Ian Romanick  wrote:
> 
>> On 01/01/2018 08:09 PM, Jason Ekstrand wrote:
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
>>> ---
>>>  src/compiler/spirv/vtn_variables.c | 31 +--
>>>  1 file changed, 25 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/compiler/spirv/vtn_variables.c
>>> b/src/compiler/spirv/vtn_variables.c
>>> index d69b056..48797f6 100644
>>> --- a/src/compiler/spirv/vtn_variables.c
>>> +++ b/src/compiler/spirv/vtn_variables.c
>>> @@ -1899,6 +1899,28 @@ vtn_create_variable(struct vtn_builder *b,
>>> struct vtn_value *val,
>>>     }
>>>  }
>>>
>>> +static void
>>> +vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
>>> +   struct vtn_type *dst_type, struct vtn_type
>>> *src_type)
>>> +{
>>> +   if (dst_type->val == src_type->val)
>>> +  return;
>>> +
>>> +   if (dst_type->type == src_type->type) {
>>> +  /* Early versions of GLSLang would re-emit types unnecessarily
>>> and you
>>> +   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes
>>> which have
>>> +   * mismatched source and destination types.
>>> +   */
>>> +  vtn_warn("Source and destination types of %s do not match",
>>> +   spirv_op_to_string(opcode));
>>
>> This is deep compare vs. shallow compare, right?  Looking at the SPIR-V
>> spec, it's not clear to me which kind of "equality" is necessary.  What
>> does the validator do?  I'm just wondering if we should even bother
>> emitting a warning since this may just be a sub-optimal SPIR-V binary.
>> Emitting this particular warning here is a bit misleading.  The real
>> problem is that there are multiple identical types with different names,
>> right?
> 
> That's an interesting question.  The SPIR-V spec is definitely unclear
> on this point and, if you wind back the clock a bit, I think you'll find that
> the working group didn't really have consensus for quite some time on
> what "the same type" really means.  At this point in time, I believe the
> consensus is that it means the same SPIR-V id (in this patch we compare
> value pointers but that's equivalent).  However, this consensus was
> reached long after the initial SPIR-V spec was released.

"Same ID" is what I would have expected because it means the loader
doesn't have to do these deep checks.  The rest of that background
information also matches my expectations. :)

> We certainly could ditch the warning and keep the vtn_fail_if only using
> a looser condition.  That said, if the consensus is going to be that
> they must have the same id then I'm a fan of enforcing the rules even if
> it's just a warning.

Yeah, that makes sense.  I'm just thinking about how someone would
respond to seeing that warning come out of the driver.  My initial
reaction would probably be, "Huh?  They're both scalar, signed, 32-bit
integer.  What gives?"  Maybe changing the message to:

   vtn_warn("Source and destination types of %s do not have the same "
"ID (but are compatible): %d vs %d",
spirv_op_to_string(opcode),
/* are the actual IDs even available here? */,
...);

> Also, fyi, I sent a new version of this patch today which uses a new
> vtn_types_compatible helper which is a bit more obvious in its behavior
> than comparing ->type.

Ah... I overlooked that.

>>> +   } else {
>>> +  vtn_fail("Source and destination types of %s do not match: %s
>>> vs. %s",
>>> +   spirv_op_to_string(opcode),
>>> +   glsl_get_type_name(dst_type->type),
>>> +   glsl_get_type_name(src_type->type));
>>> +   }
>>> +}
>>> +
>>>  void
>>>  vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
>>>   const uint32_t *w, unsigned count)
>>> @@ -1975,8 +1997,7 @@ vtn_handle_variables(struct vtn_builder *b,
>>> SpvOp opcode,
>>>    struct vtn_value *dest = vtn_value(b, w[1],
>>> vtn_value_type_pointer);
>>>    struct vtn_value *src = vtn_value(b, w[2],
>>> vtn_value_type_pointer);
>>>
>>> -  vtn_fail_if(dest->type->deref != src->type->deref,
>>> -  "Dereferenced pointer types to OpCopyMemory do not
>>> match");
>>> +  vtn_assert_types_equal(b, opcode, dest->type->deref,
>>> src->type->deref);
>>>
>>>    vtn_variable_copy(b, dest->pointer, src->pointer);
>>>    break;
>>> @@ -1988,8 +2009,7 @@ vtn_handle_variables(struct vtn_builder *b,
>>> SpvOp opcode,
>>>    struct vtn_value *src_val = vtn_value(b, w[3],
>>> vtn_value_type_pointer);
>>>    struct vtn_pointer *src = src_val->pointer;
>>>
>>> -  vtn_fail_if(res_type != src_val->type->deref,
>>> -  "Result and pointer types of OpLoad do not match");
>>> +  vtn_assert_types_equal(b, opcode, res_type,
>>> src_val->type->deref);
>>>
>>>    if (src->mode == vtn_variable_mode_image ||
>>>    src->mode == 

Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Jason Ekstrand

On January 2, 2018 15:22:51 Ian Romanick  wrote:


On 01/01/2018 08:09 PM, Jason Ekstrand wrote:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
---
 src/compiler/spirv/vtn_variables.c | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c

index d69b056..48797f6 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1899,6 +1899,28 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,

}
 }

+static void
+vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
+   struct vtn_type *dst_type, struct vtn_type *src_type)
+{
+   if (dst_type->val == src_type->val)
+  return;
+
+   if (dst_type->type == src_type->type) {
+  /* Early versions of GLSLang would re-emit types unnecessarily and you
+   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which have
+   * mismatched source and destination types.
+   */
+  vtn_warn("Source and destination types of %s do not match",
+   spirv_op_to_string(opcode));


This is deep compare vs. shallow compare, right?  Looking at the SPIR-V
spec, it's not clear to me which kind of "equality" is necessary.  What
does the validator do?


I believe so, though I haven't dug into the validator sources.  See also

https://github.com/SaschaWillems/Vulkan/issues/345


 I'm just wondering if we should even bother
emitting a warning since this may just be a sub-optimal SPIR-V binary.
Emitting this particular warning here is a bit misleading.  The real
problem is that there are multiple identical types with different names,
right?


+   } else {
+  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
+   spirv_op_to_string(opcode),
+   glsl_get_type_name(dst_type->type),
+   glsl_get_type_name(src_type->type));
+   }
+}
+
 void
 vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
@@ -1975,8 +1997,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
   struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);

-  vtn_fail_if(dest->type->deref != src->type->deref,
-  "Dereferenced pointer types to OpCopyMemory do not match");
+  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);

   vtn_variable_copy(b, dest->pointer, src->pointer);
   break;
@@ -1988,8 +2009,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
   struct vtn_pointer *src = src_val->pointer;

-  vtn_fail_if(res_type != src_val->type->deref,
-  "Result and pointer types of OpLoad do not match");
+  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);

   if (src->mode == vtn_variable_mode_image ||
   src->mode == vtn_variable_mode_sampler) {
@@ -2006,8 +2026,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_pointer *dest = dest_val->pointer;
   struct vtn_value *src_val = vtn_untyped_value(b, w[2]);

-  vtn_fail_if(dest_val->type->deref != src_val->type,
-  "Value and pointer types of OpStore do not match");
+  vtn_assert_types_equal(b, opcode, dest_val->type->deref, src_val->type);

   if (glsl_type_is_sampler(dest->type->type)) {
  vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "








Re: [Mesa-dev] [PATCH 3/3] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Jason Ekstrand

On January 2, 2018 11:59:15 Alejandro Piñeiro  wrote:


nitpick, from the commit message: is "loosten" an English word? Perhaps
do you mean "loosen"? (said the non-native English speaker).


Yes, Matt pointed that out as well.  I may be a native but my spelling 
ability is a bit sub-par some days...



The patch looks good. My only concern is that as far as I understand, we
are accepting non-strict SPIR-V shaders (quoting from previous patch:
"Technically, the SPIR-V rules require the exact same type ID but this
lets us internally be a bit looser."). I'm wondering if that could cause
problems in the future if we are loosening the check for any shader, and
this loosening should be conditional. Although perhaps that would be too
much hassle right now.


It's possible.  However, I think this approach is ok given that the rule 
we are actually enforcing (with the vtn_fail) is that the two types must be 
identical in their construction (minus decorations).  The primary reason we 
need this check at all is to ensure that loading and storing composite 
types works out ok.  Unless we get our source and destination types mixed 
up somewhere, the decorations don't matter when it comes to compatibility 
of the resulting SSA value.
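
As a rough illustration of the three outcomes described here (same ID, same construction with a different ID, real mismatch), consider the toy model below. Everything in it is hypothetical: struct toy_type, toy_same_construction() and toy_classify() are invented for this sketch and are not the vtn_type or vtn_types_compatible code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_kind { TOY_SCALAR, TOY_VECTOR, TOY_STRUCT };

/* Toy stand-in for a SPIR-V type; not Mesa's struct vtn_type. */
struct toy_type {
   uint32_t id;                            /* plays the role of the SPIR-V id */
   enum toy_kind kind;
   unsigned bit_size;                      /* scalars and vectors */
   unsigned length;                        /* components or member count */
   const struct toy_type *const *members;  /* struct members only */
   uint32_t decorations;                   /* deliberately never compared */
};

enum toy_match { TOY_SAME_ID, TOY_COMPATIBLE, TOY_MISMATCH };

/* Deep comparison of how two types are built, ignoring ids and decorations. */
static bool
toy_same_construction(const struct toy_type *a, const struct toy_type *b)
{
   if (a->kind != b->kind || a->length != b->length)
      return false;

   if (a->kind == TOY_STRUCT) {
      for (unsigned i = 0; i < a->length; i++) {
         if (!toy_same_construction(a->members[i], b->members[i]))
            return false;
      }
      return true;
   }

   return a->bit_size == b->bit_size;
}

/* Same id: silent.  Same construction: would warn.  Otherwise: would fail. */
static enum toy_match
toy_classify(const struct toy_type *dst, const struct toy_type *src)
{
   if (dst->id == src->id)
      return TOY_SAME_ID;

   return toy_same_construction(dst, src) ? TOY_COMPATIBLE : TOY_MISMATCH;
}
```

In this model, two 32-bit scalar types emitted under different ids classify as TOY_COMPATIBLE, which corresponds to the warning case, while a 32-bit versus 64-bit pair is a TOY_MISMATCH.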


Could this change in the future?  Quite possibly.  If it does, then we will 
have to evaluate what checks we want to make.  There is a possibility that 
we may choose to be stricter.  However, as it stands today, the apps are 
breaking the rules so anyone who sees that warning should go fix their app.



In any case, nitpicks and concerns apart:
Reviewed-by: Alejandro Piñeiro 

On 02/01/18 17:30, Jason Ekstrand wrote:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
---
 src/compiler/spirv/vtn_variables.c | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c

index d69b056..a74d0ce 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1899,6 +1899,29 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,

}
 }

+static void
+vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
+   struct vtn_type *dst_type,
+   struct vtn_type *src_type)
+{
+   if (dst_type->val == src_type->val)
+  return;
+
+   if (vtn_types_compatible(b, dst_type, src_type)) {
+  /* Early versions of GLSLang would re-emit types unnecessarily and you
+   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which have
+   * mismatched source and destination types.
+   */
+  vtn_warn("Source and destination types of %s do not match",
+   spirv_op_to_string(opcode));
+   } else {
+  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
+   spirv_op_to_string(opcode),
+   glsl_get_type_name(dst_type->type),
+   glsl_get_type_name(src_type->type));
+   }
+}
+
 void
 vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
@@ -1975,8 +1998,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
   struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);

-  vtn_fail_if(dest->type->deref != src->type->deref,
-  "Dereferenced pointer types to OpCopyMemory do not match");
+  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);

   vtn_variable_copy(b, dest->pointer, src->pointer);
   break;
@@ -1988,8 +2010,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
   struct vtn_pointer *src = src_val->pointer;

-  vtn_fail_if(res_type != src_val->type->deref,
-  "Result and pointer types of OpLoad do not match");
+  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);

   if (src->mode == vtn_variable_mode_image ||
   src->mode == vtn_variable_mode_sampler) {
@@ -2006,8 +2027,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_pointer *dest = dest_val->pointer;
   struct vtn_value *src_val = vtn_untyped_value(b, w[2]);

-  vtn_fail_if(dest_val->type->deref != src_val->type,
-  "Value and pointer types of OpStore do not match");
+  vtn_assert_types_equal(b, opcode, dest_val->type->deref, src_val->type);

   if (glsl_type_is_sampler(dest->type->type)) {
  vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "








Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Jason Ekstrand

On January 2, 2018 15:22:51 Ian Romanick  wrote:


On 01/01/2018 08:09 PM, Jason Ekstrand wrote:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
---
 src/compiler/spirv/vtn_variables.c | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c

index d69b056..48797f6 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1899,6 +1899,28 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,

}
 }

+static void
+vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
+   struct vtn_type *dst_type, struct vtn_type *src_type)
+{
+   if (dst_type->val == src_type->val)
+  return;
+
+   if (dst_type->type == src_type->type) {
+  /* Early versions of GLSLang would re-emit types unnecessarily and you
+   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which have
+   * mismatched source and destination types.
+   */
+  vtn_warn("Source and destination types of %s do not match",
+   spirv_op_to_string(opcode));


This is deep compare vs. shallow compare, right?  Looking at the SPIR-V
spec, it's not clear to me which kind of "equality" is necessary.  What
does the validator do?  I'm just wondering if we should even bother
emitting a warning since this may just be a sub-optimal SPIR-V binary.
Emitting this particular warning here is a bit misleading.  The real
problem is that there are multiple identical types with different names,
right?


That's an interesting question.  The SPIR-V spec is definitely unclear on 
this point and, if you wind back the clock a bit, I think you'll find that the 
working group didn't really have consensus for quite some time on what "the 
same type" really means.  At this point in time, I believe the consensus is 
that it means the same SPIR-V id (in this patch we compare value pointers 
but that's equivalent).  However, this consensus was reached long after the 
initial SPIR-V spec was released.


We certainly could ditch the warning and keep the vtn_fail_if only using a 
looser condition.  That said, if the consensus is going to be that they 
must have the same id then I'm a fan of enforcing the rules even if it's 
just a warning.


Also, fyi, I sent a new version of this patch today which uses a new 
vtn_types_compatible helper which is a bit more obvious in its behavior 
than comparing ->type.
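
To make the distinction concrete, here is a minimal sketch of the three outcomes discussed above: same id, structurally identical but declared twice, and genuinely different. The toy_* names are hypothetical stand-ins made up for this illustration, not Mesa's struct vtn_type or the vtn_types_compatible helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a SPIR-V type; NOT Mesa's struct vtn_type. */
enum toy_kind { TOY_FLOAT, TOY_INT, TOY_ARRAY };

struct toy_type {
   unsigned id;                  /* SPIR-V result id that declared the type */
   enum toy_kind kind;
   unsigned length;              /* element count, arrays only */
   const struct toy_type *elem;  /* element type, arrays only */
};

enum toy_match { TOY_SAME_ID, TOY_SAME_SHAPE, TOY_MISMATCH };

/* Deep (structural) comparison that ignores the declaring id. */
static bool
toy_same_shape(const struct toy_type *a, const struct toy_type *b)
{
   if (a->kind != b->kind)
      return false;
   if (a->kind != TOY_ARRAY)
      return true;
   return a->length == b->length && toy_same_shape(a->elem, b->elem);
}

static enum toy_match
toy_compare(const struct toy_type *a, const struct toy_type *b)
{
   assert(a != NULL && b != NULL);

   if (a->id == b->id)
      return TOY_SAME_ID;     /* strict rule satisfied, nothing to report */
   if (toy_same_shape(a, b))
      return TOY_SAME_SHAPE;  /* duplicate declaration: candidate for a warning */
   return TOY_MISMATCH;       /* genuinely different: candidate for a failure */
}
```

Two float types declared under ids 1 and 2 would come back as TOY_SAME_SHAPE here, which corresponds to the "warn but keep going" case in the patch.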



+   } else {
+  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
+   spirv_op_to_string(opcode),
+   glsl_get_type_name(dst_type->type),
+   glsl_get_type_name(src_type->type));
+   }
+}
+
 void
 vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
@@ -1975,8 +1997,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
   struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);

-  vtn_fail_if(dest->type->deref != src->type->deref,
-  "Dereferenced pointer types to OpCopyMemory do not match");
+  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);

   vtn_variable_copy(b, dest->pointer, src->pointer);
   break;
@@ -1988,8 +2009,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
   struct vtn_pointer *src = src_val->pointer;

-  vtn_fail_if(res_type != src_val->type->deref,
-  "Result and pointer types of OpLoad do not match");
+  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);

   if (src->mode == vtn_variable_mode_image ||
   src->mode == vtn_variable_mode_sampler) {
@@ -2006,8 +2026,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_pointer *dest = dest_val->pointer;
   struct vtn_value *src_val = vtn_untyped_value(b, w[2]);

-  vtn_fail_if(dest_val->type->deref != src_val->type,
-  "Value and pointer types of OpStore do not match");
+  vtn_assert_types_equal(b, opcode, dest_val->type->deref, src_val->type);

   if (glsl_type_is_sampler(dest->type->type)) {
  vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "








Re: [Mesa-dev] [PATCH] mesa/bindless: fix missing image _Layer initialization

2018-01-02 Thread Ian Romanick
Once you can see the then-block before this else-block, it becomes
obvious. :)

Reviewed-by: Ian Romanick 

On 12/29/2017 09:30 PM, Ilia Mirkin wrote:
> Some later code relies on _Layer to set first/last_layer. Make sure it's
> always initialized.
> 
> Detected by valgrind's conditional jump/move with uninit value logic.
> 
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/main/texturebindless.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/main/texturebindless.c b/src/mesa/main/texturebindless.c
> index f062ea904a1..9aaa0367c2d 100644
> --- a/src/mesa/main/texturebindless.c
> +++ b/src/mesa/main/texturebindless.c
> @@ -327,6 +327,7 @@ get_image_handle(struct gl_context *ctx, struct 
> gl_texture_object *texObj,
> } else {
>imgObj.Layered = GL_FALSE;
>imgObj.Layer = 0;
> +  imgObj._Layer = 0;
> }
>  
> /* Request a new image handle from the driver. */
> 



[Mesa-dev] [Bug 104214] Dota crashes when switching from game to desktop

2018-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104214

Ian Romanick  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i...@freedesktop.org

--- Comment #12 from Ian Romanick  ---
Created attachment 136512
  --> https://bugs.freedesktop.org/attachment.cgi?id=136512=edit
Fail gracefully when make_surface returns NULL

Does this patch help?  If there is any difference in behavior with this patch,
can you describe it?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 3/3] util: use zlib's CRC32 implementaion for larger buffers

2018-01-02 Thread Ian Romanick
On 12/28/2017 05:56 PM, Grazvydas Ignotas wrote:
> zlib provides a faster slice-by-4 CRC32 implementation than the
> traditional single byte lookup one used by mesa. As most supported
> platforms now link zlib unconditionally, we can easily use it.
> For small buffers the old implementation is still used as it's faster
> with cold cache (first call), as indicated by some throughput
> benchmarking (avg MB/s, n=100, zlib 1.2.8):
> 
>       i5-6600K                    C2D E4500
> size  mesa  zlib                  mesa  zlib
> 4       66    43  -35% +/- 4.8%     43    22  -49% +/- 9.6%
> 32     193   171  -11% +/- 5.8%    129    49  -61% +/- 7.2%
> 64     256   267    4% +/- 4.1%    171    63  -63% +/- 5.4%
> 128    317   389   22% +/- 5.8%    253    89  -64% +/- 4.2%
> 256    364   596   63% +/- 5.6%    304   166  -45% +/- 2.8%
> 512    401   838  108% +/- 5.3%    338   296  -12% +/- 3.1%
> 1024   420  1036  146% +/- 7.6%    375   461   23% +/- 3.7%
> 1M     443  1443  225% +/- 2.1%    403  1175  191% +/- 0.9%
> 100M   448  1452  224% +/- 0.3%    406  1214  198% +/- 0.3%
> 
> With hot cache (repeated calls) zlib almost always wins on both CPUs.
> It has been verified the calculation results stay the same after this
> change.
> 
> Signed-off-by: Grazvydas Ignotas 
> ---
>  src/util/crc32.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/src/util/crc32.c b/src/util/crc32.c
> index f2e01c6..0cffa49 100644
> --- a/src/util/crc32.c
> +++ b/src/util/crc32.c
> @@ -31,12 +31,20 @@
>   * 
>   * @author Jose Fonseca
>   */
>  
>  
> +#ifdef HAVE_ZLIB
> +#include <zlib.h>
> +#endif
>  #include "crc32.h"
>  
> +/* For small buffers it's faster to avoid the library call.
> + * The optimal threshold depends on CPU characteristics, it is hoped
> + * the choice below is reasonable for typical modern CPU.
> + */
> +#define ZLIB_SIZE_THRESHOLD 64

For the actual users of this function in Mesa, is it even possible to
pass less than 64 bytes (I'm assuming that's the units)?

>  
>  static const uint32_t 
>  util_crc32_table[256] = {
> 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 
> 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, 
> @@ -112,10 +120,15 @@ uint32_t
>  util_hash_crc32(const void *data, size_t size)
>  {
> const uint8_t *p = data;
> uint32_t crc = 0xffffffff;
>   
> +#ifdef HAVE_ZLIB
> +   if (size >= ZLIB_SIZE_THRESHOLD && (uInt)size == size)
 ^^
This comparison is always true (unless sizeof(uInt) != sizeof(size_t)?).
 I'm not 100% sure what you're trying to accomplish here, but I think
you want 'size > 0'.  I'm not familiar with this zlib function, so I
don't know what it's expectations for size are.
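
For reference, a small sketch of what a cast-and-compare check like this guards against. It assumes zlib's uInt is a plain unsigned int and that size_t may be wider, as on typical 64-bit platforms; fits_in_uint and to_uint_len are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when size survives a round trip through unsigned int, i.e. when
 * passing it to an API that takes an unsigned int length would not
 * silently truncate it.
 */
static bool
fits_in_uint(size_t size)
{
   return (unsigned int)size == size;
}

/* Narrow a length that the caller has already checked. */
static unsigned int
to_uint_len(size_t size)
{
   assert(fits_in_uint(size));
   return (unsigned int)size;
}
```

Where size_t and unsigned int have the same width the check is indeed always true; where size_t is 64 bits and unsigned int is 32 bits, it is false for any size above UINT_MAX.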

> +  return ~crc32(0, data, size);
> +#endif
> +
> while (size--)
>crc = util_crc32_table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
> 
> return crc;
>  }
> 
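
The `~crc32(0, data, size)` in the patch relies on the two implementations differing only in the final bit inversion. The sketch below illustrates that relationship without linking zlib. It assumes the Mesa loop starts from 0xffffffff and skips the final inversion, and that zlib follows the standard CRC-32 convention; both function names are made up for this example, and a bit-at-a-time loop stands in for the lookup table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-32 with the reflected polynomial 0xedb88320, starting from
 * 0xffffffff and WITHOUT the final inversion (the convention assumed
 * here for the table-based loop in the patch).
 */
static uint32_t
crc32_no_final_xor(const void *data, size_t size)
{
   const uint8_t *p = data;
   uint32_t crc = 0xffffffff;

   while (size--) {
      crc ^= *p++;
      for (int bit = 0; bit < 8; bit++)
         crc = (crc & 1u) ? (crc >> 1) ^ 0xedb88320u : crc >> 1;
   }
   return crc;
}

/* Standard CRC-32: the same loop followed by a final inversion. */
static uint32_t
crc32_standard(const void *data, size_t size)
{
   return ~crc32_no_final_xor(data, size);
}

/* Check against the well-known CRC-32 check value for "123456789". */
static void
crc32_self_check(void)
{
   assert(crc32_standard("123456789", 9) == 0xcbf43926u);
}
```

Since the standard result is the inverted raw value, inverting it once more gives back the raw value, which is why the patch can return the complement of the library result and keep existing hashes stable.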



Re: [Mesa-dev] [PATCH] i965: Drop support for the legacy SNORM -> Float equation.

2018-01-02 Thread Ian Romanick
I had forgotten all about this issue.

Reviewed-by: Ian Romanick 

On 12/25/2017 10:56 PM, Kenneth Graunke wrote:
> Older OpenGL defines two equations for converting from signed-normalized
> to floating point data.  These are:
> 
> f = (2c + 1)/(2^b - 1)            (equation 2.2)
> f = max{c/(2^(b-1) - 1), -1.0}    (equation 2.3)
> 
> Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be
> used in all scenarios, and remove equation 2.2.  DirectX uses equation
> 2.3 as well.  Intel hardware only supports equation 2.3, so Gen7.5+
> systems that use the vertex fetcher hardware to do the conversions
> always get formula 2.3.
> 
> This can make a big difference for 10-10-10-2 formats - the 2-bit value
> can represent 0 with equation 2.3, and cannot with equation 2.2.
> 
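
A minimal sketch of the two equations quoted above, which makes the 2-bit case easy to check by hand. The function names are hypothetical and this is not the driver's code, just the arithmetic.

```c
#include <assert.h>

/* Legacy rule, equation 2.2: f = (2c + 1) / (2^b - 1). */
static float
snorm_to_float_legacy(int c, int bits)
{
   assert(bits >= 2 && bits <= 16);
   return (2.0f * (float)c + 1.0f) / (float)((1 << bits) - 1);
}

/* Current rule, equation 2.3: f = max{c / (2^(b-1) - 1), -1.0}. */
static float
snorm_to_float(int c, int bits)
{
   assert(bits >= 2 && bits <= 16);
   float f = (float)c / (float)((1 << (bits - 1)) - 1);
   return f < -1.0f ? -1.0f : f;
}
```

For the 2-bit channel, c ranges over -2..1. Equation 2.2 yields -1, -1/3, 1/3, 1 and so never produces 0, while equation 2.3 yields -1, -1, 0, 1.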
> Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES.
> Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use
> the new rules, at least in core profile.  That would leave Gen4-6 doing
> something different than all other hardware, which seems...lame.
> 
> With context version promotion, applications that requested a pre-4.2
> context may get promoted to 4.2, and thus get the new rules.  Zero cases
> have been reported of this being a problem.  However, we've received a
> report that following the old rules breaks expectations.  SuperTuxKart
> apparently renders the cars red when following equation 2.2, and works
> correctly when following equation 2.3:
> 
> https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405
> 
> So, this patch deletes the legacy equation 2.2 support entirely, making
> all hardware and APIs consistently use the new equation 2.3 rules.
> 
> If we ever find an application that truly requires the old formula, then
> we'd likely want that application to work on modern hardware, too.  We'd
> likely restore this support as a driconf option.  Until then, drop it.
> 
> This commit will regress Piglit's draw-vertices-2101010 test on
> pre-Haswell without the corresponding Piglit patch to accept either
> formula:
> 
> draw-vertices-2101010: Accept either SNORM conversion formula.
> ---
>  src/intel/blorp/blorp.c|  3 +--
>  src/intel/compiler/brw_compiler.h  |  1 -
>  src/intel/compiler/brw_nir.c   |  4 +--
>  src/intel/compiler/brw_nir.h   |  2 --
>  src/intel/compiler/brw_nir_attribute_workarounds.c | 29 
> ++
>  src/intel/compiler/brw_vec4.cpp|  7 ++
>  src/intel/compiler/brw_vec4_vs.h   |  5 +---
>  src/intel/compiler/brw_vec4_vs_visitor.cpp |  6 ++---
>  src/intel/vulkan/anv_pipeline.c|  2 +-
>  src/mesa/drivers/dri/i965/brw_vs.c |  1 -
>  10 files changed, 15 insertions(+), 45 deletions(-)
> 
> diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c
> index 8a9d2fd3b97..e8a2c6135f5 100644
> --- a/src/intel/blorp/blorp.c
> +++ b/src/intel/blorp/blorp.c
> @@ -223,8 +223,7 @@ blorp_compile_vs(struct blorp_context *blorp, void 
> *mem_ctx,
>  
> const unsigned *program =
>brw_compile_vs(compiler, blorp->driver_ctx, mem_ctx,
> - _key, vs_prog_data, nir,
> - false, -1, NULL);
> + _key, vs_prog_data, nir, -1, NULL);
>  
> return program;
>  }
> diff --git a/src/intel/compiler/brw_compiler.h 
> b/src/intel/compiler/brw_compiler.h
> index 28aed833245..0060c381c0d 100644
> --- a/src/intel/compiler/brw_compiler.h
> +++ b/src/intel/compiler/brw_compiler.h
> @@ -1123,7 +1123,6 @@ brw_compile_vs(const struct brw_compiler *compiler, 
> void *log_data,
> const struct brw_vs_prog_key *key,
> struct brw_vs_prog_data *prog_data,
> const struct nir_shader *shader,
> -   bool use_legacy_snorm_formula,
> int shader_time_index,
> char **error_str);
>  
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 265c63efdda..dbddef0d04d 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -211,7 +211,6 @@ remap_patch_urb_offsets(nir_block *block, nir_builder *b,
>  
>  void
>  brw_nir_lower_vs_inputs(nir_shader *nir,
> -bool use_legacy_snorm_formula,
>  const uint8_t *vs_attrib_wa_flags)
>  {
> /* Start with the location of the variable's base. */
> @@ -230,8 +229,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
>  
> add_const_offset_to_base(nir, nir_var_shader_in);
>  
> -   brw_nir_apply_attribute_workarounds(nir, use_legacy_snorm_formula,
> -   vs_attrib_wa_flags);
> +   brw_nir_apply_attribute_workarounds(nir, vs_attrib_wa_flags);
>  
> /* The last step is to remap VERT_ATTRIB_* to actual registers */
>  
> diff --git 

Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Ian Romanick
On 01/01/2018 08:09 PM, Jason Ekstrand wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
> ---
>  src/compiler/spirv/vtn_variables.c | 31 +--
>  1 file changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/src/compiler/spirv/vtn_variables.c 
> b/src/compiler/spirv/vtn_variables.c
> index d69b056..48797f6 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1899,6 +1899,28 @@ vtn_create_variable(struct vtn_builder *b, struct 
> vtn_value *val,
> }
>  }
>  
> +static void
> +vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
> +   struct vtn_type *dst_type, struct vtn_type *src_type)
> +{
> +   if (dst_type->val == src_type->val)
> +  return;
> +
> +   if (dst_type->type == src_type->type) {
> +  /* Early versions of GLSLang would re-emit types unnecessarily and you
> +   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which 
> have
> +   * mismatched source and destination types.
> +   */
> +  vtn_warn("Source and destination types of %s do not match",
> +   spirv_op_to_string(opcode));

This is deep compare vs. shallow compare, right?  Looking at the SPIR-V
spec, it's not clear to me which kind of "equality" is necessary.  What
does the validator do?  I'm just wondering if we should even bother
emitting a warning since this may just be a sub-optimal SPIR-V binary.
Emitting this particular warning here is a bit misleading.  The real
problem is that there are multiple identical types with different names,
right?

> +   } else {
> +  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
> +   spirv_op_to_string(opcode),
> +   glsl_get_type_name(dst_type->type),
> +   glsl_get_type_name(src_type->type));
> +   }
> +}
> +
>  void
>  vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
>   const uint32_t *w, unsigned count)
> @@ -1975,8 +1997,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
>struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);
>  
> -  vtn_fail_if(dest->type->deref != src->type->deref,
> -  "Dereferenced pointer types to OpCopyMemory do not match");
> +  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);
>  
>vtn_variable_copy(b, dest->pointer, src->pointer);
>break;
> @@ -1988,8 +2009,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
>struct vtn_pointer *src = src_val->pointer;
>  
> -  vtn_fail_if(res_type != src_val->type->deref,
> -  "Result and pointer types of OpLoad do not match");
> +  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);
>  
>if (src->mode == vtn_variable_mode_image ||
>src->mode == vtn_variable_mode_sampler) {
> @@ -2006,8 +2026,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_pointer *dest = dest_val->pointer;
>struct vtn_value *src_val = vtn_untyped_value(b, w[2]);
>  
> -  vtn_fail_if(dest_val->type->deref != src_val->type,
> -  "Value and pointer types of OpStore do not match");
> +  vtn_assert_types_equal(b, opcode, dest_val->type->deref, 
> src_val->type);
>  
>if (glsl_type_is_sampler(dest->type->type)) {
>   vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "
> 



[Mesa-dev] Retrieving useful debug infos about (AMD) Mesa and Firefox quirks

2018-01-02 Thread Germano Massullo
Hi everyone!

Bugreport

"Latest mesa breaks firefox on kde plasma with compositing on"
https://bugs.freedesktop.org/show_bug.cgi?id=103699

is about Intel cards, and it looks like the bug has been fixed with an
Intel only patch. Personally, I am experiencing the same bug on AMDGPU
(FOSS) driver, RX480 card, so I opened bugreport
https://bugs.freedesktop.org/show_bug.cgi?id=104216

Since my Firefox user experience has been ruined for several weeks, I
would like to ask if there is anything I can do to retrieve more useful
information for you Mesa developers.

Thank you very much



Re: [Mesa-dev] [PATCH v3 3/4] meson: build clover

2018-01-02 Thread Dylan Baker
ping.

Quoting Dylan Baker (2017-12-15 10:54:18)
> This has only been compile tested.
> 
> v2: - Have a single option for opencl (Eric E)
> - fix typo "tgis" -> "tgsi" (Curro)
> - Don't add "lib" to pipe loader libraries, which matches the
>   autotools behavior
> v3: - Remove trailing whitespace
> - Make PIPE_SEARCH_DIR an absolute path
> 
> cc: Curro Jerez 
> cc: Jan Vesely 
> cc: Aaron Watry 
> Signed-off-by: Dylan Baker 
> ---
>  include/meson.build   |  19 
>  meson.build   |  29 +-
>  meson_options.txt |   7 ++
>  src/gallium/auxiliary/pipe-loader/meson.build |   3 +-
>  src/gallium/meson.build   |  12 ++-
>  src/gallium/state_trackers/clover/meson.build | 122 
> ++
>  src/gallium/targets/opencl/meson.build|  73 +++
>  src/gallium/targets/pipe-loader/meson.build   |  77 
>  8 files changed, 336 insertions(+), 6 deletions(-)
>  create mode 100644 src/gallium/state_trackers/clover/meson.build
>  create mode 100644 src/gallium/targets/opencl/meson.build
>  create mode 100644 src/gallium/targets/pipe-loader/meson.build
> 
> diff --git a/include/meson.build b/include/meson.build
> index e4dae91cede..a2e7ce6580e 100644
> --- a/include/meson.build
> +++ b/include/meson.build
> @@ -78,3 +78,22 @@ if with_gallium_st_nine
>  subdir : 'd3dadapter',
>)
>  endif
> +
> +# Only install the headers if we are building a stand alone implementation 
> and
> +# not an ICD enabled implementation
> +if with_gallium_opencl and not with_opencl_icd
> +  install_headers(
> +'CL/cl.h',
> +'CL/cl.hpp',
> +'CL/cl_d3d10.h',
> +'CL/cl_d3d11.h',
> +'CL/cl_dx9_media_sharing.h',
> +'CL/cl_egl.h',
> +'CL/cl_ext.h',
> +'CL/cl_gl.h',
> +'CL/cl_gl_ext.h',
> +'CL/cl_platform.h',
> +'CL/opencl.h',
> +subdir: 'CL'
> +  )
> +endif
> diff --git a/meson.build b/meson.build
> index 842d441199e..74b2d5c49dc 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -583,6 +583,22 @@ if with_gallium_st_nine
>endif
>  endif
>  
> +_opencl = get_option('gallium-opencl')
> +if _opencl != 'disabled'
> +  if not with_gallium
> +error('OpenCL Clover implementation requires at least one gallium 
> driver.')
> +  endif
> +
> +  # TODO: altivec?
> +  dep_clc = dependency('libclc')
> +  with_gallium_opencl = true
> +  with_opencl_icd = _opencl == 'icd'
> +else
> +  dep_clc = []
> +  with_gallium_opencl = false
> +  with_opencl_icd = false
> +endif
> +
>  gl_pkgconfig_c_flags = []
>  if with_platform_x11
>if with_any_vk or (with_glx == 'dri' and with_dri_platform == 'drm')
> @@ -930,7 +946,7 @@ dep_thread = dependency('threads')
>  if dep_thread.found() and host_machine.system() != 'windows'
>pre_args += '-DHAVE_PTHREAD'
>  endif
> -if with_amd_vk or with_gallium_radeonsi or with_gallium_r600 # TODO: clover
> +if with_amd_vk or with_gallium_radeonsi or with_gallium_r600 or 
> with_gallium_opencl
>dep_elf = dependency('libelf', required : false)
>if not dep_elf.found()
>  dep_elf = cc.find_library('elf')
> @@ -972,12 +988,19 @@ if with_amd_vk or with_gallium_radeonsi or 
> with_gallium_r600
>  llvm_modules += 'asmparser'
>endif
>  endif
> +if with_gallium_opencl
> +  llvm_modules += [
> +'all-targets', 'linker', 'coverage', 'instrumentation', 'ipo', 
> 'irreader',
> +'lto', 'option', 'objcarcopts', 'profiledata',
> +  ]
> +  # TODO: optional modules
> +endif
>  
>  _llvm = get_option('llvm')
>  if _llvm == 'auto'
>dep_llvm = dependency(
>  'llvm', version : '>= 3.9.0', modules : llvm_modules,
> -required : with_amd_vk or with_gallium_radeonsi or with_gallium_swr,
> +required : with_amd_vk or with_gallium_radeonsi or with_gallium_swr or 
> with_gallium_opencl,
>)
>with_llvm = dep_llvm.found()
>  elif _llvm == 'true'
> @@ -1154,8 +1177,6 @@ else
>dep_lmsensors = []
>  endif
>  
> -# TODO: clover
> -
>  # TODO: gallium tests
>  
>  # TODO: various libdirs
> diff --git a/meson_options.txt b/meson_options.txt
> index 4f4db5b7d26..894378985fd 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -120,6 +120,13 @@ option(
>value : false,
>description : 'build gallium "nine" Direct3D 9.x state tracker.',
>  )
> +option(
> +  'gallium-opencl',
> +  type : 'combo',
> +  choices : ['icd', 'standalone', 'disabled'],
> +  value : 'disabled',
> +  description : 'build gallium "clover" OpenCL state tracker.',
> +)
>  option(
>'d3d-drivers-path',
>type : 'string',
> diff --git a/src/gallium/auxiliary/pipe-loader/meson.build 
> b/src/gallium/auxiliary/pipe-loader/meson.build
> index 9b12432aea0..869a2935149 100644
> --- a/src/gallium/auxiliary/pipe-loader/meson.build
> +++ b/src/gallium/auxiliary/pipe-loader/meson.build
> @@ -60,7 +60,8 @@ 

Re: [Mesa-dev] FOSDEM 2018: Graphics DevRoom: Call for speaker.

2018-01-02 Thread Navare, Manasi D

Hi Luc,

I have submitted a proposal for a talk in Graphics Dev Room, but haven’t heard 
anything regarding the selection. Is there a timeline by which the 
notifications will be sent so that the travel can be planned accordingly?

Has anyone else heard about the submissions?

Regards
Manasi


Hi,

At FOSDEM on Saturday the 3rd of February 2018, there will be another graphics 
DevRoom. URL: https://fosdem.org/2018/

The focus of this DevRoom is of course the same as the previous editions, 
namely:
* Graphics drivers: from display to media to 3d drivers, both in kernel or 
userspace. Be it part of DRM, KMS, (direct)FB, V4L, Xorg, Mesa...
* Input drivers: kernel and userspace.
* Windowing systems: X, Wayland, Mir, directFB, ...
* Even colour management, low level toolkit stuff, and other areas which i 
might have overlooked above are accepted.

Slots will be handed out on a first come, first serve basis. The best slots 
will go to those who apply the earliest. We have the devroom from
10:30 until 19:00, giving us 8h30, so eight 50 minute talks and one 20 minute 
talk are available.

Talk Submission:


Like the last few years, the pentabarf system will be used for talk submission.

https://penta.fosdem.org/submission/FOSDEM18

Remember that FOSDEM is not like XDC, it's not some 50 odd people meeting with 
a sliding schedule which only gets filled out on the last day. Upwards of 1 
people are visiting this event, and most of them get a printed booklet or use 
the schedule on the FOSDEM website or an app for their phone to figure out what 
to watch or participate in next. 
So please put some effort in your talk submission and details.

Since this is an open source community event, please refrain from turning in a 
talk that is a pure corporate or product commercial. Also, if you are unsure on 
whether you can come or not (this is FOSDEM, why are you not there anyway?), 
please wait with submitting your talk. Submitting a talk and then not turning 
up because you could not be bothered is a sure-fire way to get larted and then 
to never be allowed to talk again.

When in pentabarf, please give the abstract and description, for both the event 
and the speaker, some thought. The abstract should be a shortened description, 
and the event abstract will sometimes even be printed directly in the booklet. 
BUT, on the website the abstract is immediately followed by the full 
description. If your abstract is fully descriptive, while terse, you might get 
away with just the abstract.

All talks will be recorded, and will be streamed out live, and will later be 
made available as CC-BY after a few days.

As for deadlines, the FOSDEM organizers want to have a finished schedule by the 
15th of December. Don't count on this deadline: first come first serve! The 
worst slots will be assigned to those who come last, which could be pretty dire 
given that there is the traditional FOSDEM beer event the night before ;)

Please try to re-use your accounts from the previous years, i hope that this 
year you can actually recycle your data. If you have forgotten your password, 
then you can reset it here: 
https://penta.fosdem.org/user/forgot_password If there are any issues, just 
poke me here or on IRC.

Necessary information:
----------------------

Below is a list of what i need to see filled in in pentabarf when you apply for 
a devroom before i consider it a valid submission. Remember: 
first come, first serve. The best slots (which are on Saturday
afternoon) are for the earliest submissions.

On your personal page:
* General:
  * First and last name
  * Nickname
  * Image
* Contact:
  * email address
  * mobile number (this is a very hard requirement as there will be no
   other reliable form of emergency communication on the day)
* Description:
  * Abstract
  * Description

Create an event:
* On the General page:
  * Event title
  * Event subtitle.
  * Track: Graphics Devroom
  * Event type: Lecture (talk) or Meeting (BoF)
* Persons:
  * Add yourself as speaker.
* Description:
  * Abstract:
  * Full Description
* Links:
  * Add relevant links.

Everything else can be ignored or will be filled in by me or the FOSDEM 
organizers. Remember, i will only schedule your talk after the basics are 
somewhat filled in (you still can change them until December 15th).

I will be keeping a keen eye on your submissions and will come back with 
further questions or make small fixes as needed. Feel free to poke me with any 
questions or anything, both on irc (libv@freenode) and on email.

That's about it. Hope to see you all at FOSDEM :)

Luc Verhaegen.
___
dri-devel mailing list
dri-de...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Mesa-dev] [PATCH 3/3] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Alejandro Piñeiro
nitpick, from the commit message: is "loosten" an English word? Perhaps
you mean "loosen"? (said the non-native English speaker).

The patch looks good. My only concern is that, as far as I understand, we
are accepting non-strict SPIR-V shaders (quoting from the previous patch:
"Technically, the SPIR-V rules require the exact same type ID but this
lets us internally be a bit looser."). I'm wondering if that could cause
problems in the future, since we are loosening the check for every shader,
and whether this loosening should be conditional. Although perhaps that
would be too much hassle right now.

In any case, nitpicks and concerns apart:
Reviewed-by: Alejandro Piñeiro 

On 02/01/18 17:30, Jason Ekstrand wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
> ---
>  src/compiler/spirv/vtn_variables.c | 32 ++--
>  1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c 
> b/src/compiler/spirv/vtn_variables.c
> index d69b056..a74d0ce 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1899,6 +1899,29 @@ vtn_create_variable(struct vtn_builder *b, struct 
> vtn_value *val,
> }
>  }
>  
> +static void
> +vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
> +   struct vtn_type *dst_type,
> +   struct vtn_type *src_type)
> +{
> +   if (dst_type->val == src_type->val)
> +  return;
> +
> +   if (vtn_types_compatible(b, dst_type, src_type)) {
> +  /* Early versions of GLSLang would re-emit types unnecessarily and you
> +   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which 
> have
> +   * mismatched source and destination types.
> +   */
> +  vtn_warn("Source and destination types of %s do not match",
> +   spirv_op_to_string(opcode));
> +   } else {
> +  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
> +   spirv_op_to_string(opcode),
> +   glsl_get_type_name(dst_type->type),
> +   glsl_get_type_name(src_type->type));
> +   }
> +}
> +
>  void
>  vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
>   const uint32_t *w, unsigned count)
> @@ -1975,8 +1998,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
>struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);
>  
> -  vtn_fail_if(dest->type->deref != src->type->deref,
> -  "Dereferenced pointer types to OpCopyMemory do not match");
> +  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);
>  
>vtn_variable_copy(b, dest->pointer, src->pointer);
>break;
> @@ -1988,8 +2010,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
>struct vtn_pointer *src = src_val->pointer;
>  
> -  vtn_fail_if(res_type != src_val->type->deref,
> -  "Result and pointer types of OpLoad do not match");
> +  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);
>  
>if (src->mode == vtn_variable_mode_image ||
>src->mode == vtn_variable_mode_sampler) {
> @@ -2006,8 +2027,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
> opcode,
>struct vtn_pointer *dest = dest_val->pointer;
>struct vtn_value *src_val = vtn_untyped_value(b, w[2]);
>  
> -  vtn_fail_if(dest_val->type->deref != src_val->type,
> -  "Value and pointer types of OpStore do not match");
> +  vtn_assert_types_equal(b, opcode, dest_val->type->deref, 
> src_val->type);
>  
>if (glsl_type_is_sampler(dest->type->type)) {
>   vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "




Re: [Mesa-dev] [PATCH 2/3] spirv: Add a vtn_types_compatible helper

2018-01-02 Thread Alejandro Piñeiro
Reviewed-by: Alejandro Piñeiro 

On 02/01/18 17:30, Jason Ekstrand wrote:
> ---
>  src/compiler/spirv/spirv_to_nir.c | 52 
> +++
>  src/compiler/spirv/vtn_private.h  |  3 +++
>  2 files changed, 55 insertions(+)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c 
> b/src/compiler/spirv/spirv_to_nir.c
> index 751fb03..5004d81 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -538,6 +538,58 @@ struct member_decoration_ctx {
> struct vtn_type *type;
>  };
>  
> +/** Returns true if two types are "compatible", i.e. you can do an OpLoad,
> + * OpStore, or OpCopyMemory between them without breaking anything.
> + * Technically, the SPIR-V rules require the exact same type ID but this lets
> + * us internally be a bit looser.
> + */
> +bool
> +vtn_types_compatible(struct vtn_builder *b,
> + struct vtn_type *t1, struct vtn_type *t2)
> +{
> +   if (t1->val == t2->val)
> +  return true;
> +
> +   if (t1->base_type != t2->base_type)
> +  return false;
> +
> +   switch (t1->base_type) {
> +   case vtn_base_type_void:
> +   case vtn_base_type_scalar:
> +   case vtn_base_type_vector:
> +   case vtn_base_type_matrix:
> +   case vtn_base_type_image:
> +   case vtn_base_type_sampler:
> +   case vtn_base_type_sampled_image:
> +  return t1->type == t2->type;
> +
> +   case vtn_base_type_array:
> +  return t1->length == t2->length &&
> + vtn_types_compatible(b, t1->array_element, t2->array_element);
> +
> +   case vtn_base_type_pointer:
> +  return vtn_types_compatible(b, t1->deref, t2->deref);
> +
> +   case vtn_base_type_struct:
> +  if (t1->length != t2->length)
> + return false;
> +
> +  for (unsigned i = 0; i < t1->length; i++) {
> + if (!vtn_types_compatible(b, t1->members[i], t2->members[i]))
> +return false;
> +  }
> +  return true;
> +
> +   case vtn_base_type_function:
> +  /* This case shouldn't get hit since you can't copy around function
> +   * types.  Just require them to be identical.
> +   */
> +  return false;
> +   }
> +
> +   vtn_fail("Invalid base type");
> +}
> +
>  /* does a shallow copy of a vtn_type */
>  
>  static struct vtn_type *
> diff --git a/src/compiler/spirv/vtn_private.h 
> b/src/compiler/spirv/vtn_private.h
> index 374643a..f2b53e1 100644
> --- a/src/compiler/spirv/vtn_private.h
> +++ b/src/compiler/spirv/vtn_private.h
> @@ -365,6 +365,9 @@ struct vtn_type {
> };
>  };
>  
> +bool vtn_types_compatible(struct vtn_builder *b,
> +  struct vtn_type *t1, struct vtn_type *t2);
> +
>  struct vtn_variable;
>  
>  enum vtn_access_mode {



Re: [Mesa-dev] [PATCH 1/3] spirv: Add a mechanism for dumping failing shaders

2018-01-02 Thread Alejandro Piñeiro
Reviewed-by: Alejandro Piñeiro 

On 02/01/18 17:30, Jason Ekstrand wrote:
> ---
>  src/compiler/spirv/spirv_to_nir.c | 29 +
>  src/compiler/spirv/vtn_private.h  |  1 +
>  2 files changed, 30 insertions(+)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c 
> b/src/compiler/spirv/spirv_to_nir.c
> index dcff56f..751fb03 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -31,6 +31,9 @@
>  #include "nir/nir_constant_expressions.h"
>  #include "spirv_info.h"
>  
> +#include 
> +#include 
> +
>  void
>  vtn_log(struct vtn_builder *b, enum nir_spirv_debug_level level,
>  size_t spirv_offset, const char *message)
> @@ -94,6 +97,27 @@ vtn_log_err(struct vtn_builder *b,
> ralloc_free(msg);
>  }
>  
> +static void
> +vtn_dump_shader(struct vtn_builder *b, const char *path, const char *prefix)
> +{
> +   static int idx = 0;
> +
> +   char filename[1024];
> +   int len = snprintf(filename, sizeof(filename), "%s/%s-%d.spirv",
> +  path, prefix, idx++);
> +   if (len < 0 || len >= sizeof(filename))
> +  return;
> +
> +   int fd = open(filename, O_CREAT | O_CLOEXEC | O_WRONLY, 0777);
> +   if (fd < 0)
> +  return;
> +
> +   write(fd, b->spirv, b->spirv_word_count * 4);
> +   close(fd);
> +
> +   vtn_info("SPIR-V shader dumped to %s", filename);
> +}
> +
>  void
>  _vtn_warn(struct vtn_builder *b, const char *file, unsigned line,
>const char *fmt, ...)
> @@ -117,6 +141,10 @@ _vtn_fail(struct vtn_builder *b, const char *file, 
> unsigned line,
> file, line, fmt, args);
> va_end(args);
>  
> +   const char *dump_path = getenv("MESA_SPIRV_FAIL_DUMP_PATH");
> +   if (dump_path)
> +  vtn_dump_shader(b, dump_path, "fail");
> +
> longjmp(b->fail_jump, 1);
>  }
>  
> @@ -3690,6 +3718,7 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
> /* Initialize the stn_builder object */
> struct vtn_builder *b = rzalloc(NULL, struct vtn_builder);
> b->spirv = words;
> +   b->spirv_word_count = word_count;
> b->file = NULL;
> b->line = -1;
> b->col = -1;
> diff --git a/src/compiler/spirv/vtn_private.h 
> b/src/compiler/spirv/vtn_private.h
> index f7d8f49..374643a 100644
> --- a/src/compiler/spirv/vtn_private.h
> +++ b/src/compiler/spirv/vtn_private.h
> @@ -531,6 +531,7 @@ struct vtn_builder {
> jmp_buf fail_jump;
>  
> const uint32_t *spirv;
> +   size_t spirv_word_count;
>  
> nir_shader *shader;
> const struct spirv_to_nir_options *options;



Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Matt Turner
The word is loosen.


[Mesa-dev] [PATCH] swr/rast: fix build break for llvm-6

2018-01-02 Thread Tim Rowley
LLVM API change.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381
---
 src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
index 3f0772c942..59672bb545 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
@@ -498,7 +498,11 @@ std::unique_ptr 
JitCache::getObject(const llvm::Module* M)
 break;
 }
 
+#if LLVM_VERSION_MAJOR < 6
 pBuf = 
llvm::MemoryBuffer::getNewUninitMemBuffer(size_t(header.GetBufferSize()));
+#else
+pBuf = 
llvm::WritableMemoryBuffer::getNewUninitMemBuffer(size_t(header.GetBufferSize()));
+#endif
 if (!fread(const_cast(pBuf->getBufferStart()), 
header.GetBufferSize(), 1, fpIn))
 {
 pBuf = nullptr;
-- 
2.14.1



Re: [Mesa-dev] [PATCH] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Eero Tamminen

Hi,

On 02.01.2018 06:11, Jason Ekstrand wrote:

This patch depends on the first three patches of this series:

https://patchwork.freedesktop.org/series/35469/


I tested a subset of Sascha Willems' demos with the above patch series
plus this patch.  Without them, the raytracing demo (still) crashes;
with them, it works fine.


Tested-by: Eero Tamminen 


On Mon, Jan 1, 2018 at 8:09 PM, Jason Ekstrand wrote:


Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424

---
  src/compiler/spirv/vtn_variables.c | 31
+--
  1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c
b/src/compiler/spirv/vtn_variables.c
index d69b056..48797f6 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1899,6 +1899,28 @@ vtn_create_variable(struct vtn_builder *b,
struct vtn_value *val,
     }
  }

+static void
+vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
+                       struct vtn_type *dst_type, struct vtn_type
*src_type)
+{
+   if (dst_type->val == src_type->val)
+      return;
+
+   if (dst_type->type == src_type->type) {
+      /* Early versions of GLSLang would re-emit types
unnecessarily and you
+       * would end up with OpLoad, OpStore, or OpCopyMemory opcodes
which have
+       * mismatched source and destination types.
+       */
+      vtn_warn("Source and destination types of %s do not match",
+               spirv_op_to_string(opcode));
+   } else {
+      vtn_fail("Source and destination types of %s do not match: %s
vs. %s",
+               spirv_op_to_string(opcode),
+               glsl_get_type_name(dst_type->type),
+               glsl_get_type_name(src_type->type));
+   }
+}
+
  void
  vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
                       const uint32_t *w, unsigned count)
@@ -1975,8 +1997,7 @@ vtn_handle_variables(struct vtn_builder *b,
SpvOp opcode,
        struct vtn_value *dest = vtn_value(b, w[1],
vtn_value_type_pointer);
        struct vtn_value *src = vtn_value(b, w[2],
vtn_value_type_pointer);

-      vtn_fail_if(dest->type->deref != src->type->deref,
-                  "Dereferenced pointer types to OpCopyMemory do
not match");
+      vtn_assert_types_equal(b, opcode, dest->type->deref,
src->type->deref);

        vtn_variable_copy(b, dest->pointer, src->pointer);
        break;
@@ -1988,8 +2009,7 @@ vtn_handle_variables(struct vtn_builder *b,
SpvOp opcode,
        struct vtn_value *src_val = vtn_value(b, w[3],
vtn_value_type_pointer);
        struct vtn_pointer *src = src_val->pointer;

-      vtn_fail_if(res_type != src_val->type->deref,
-                  "Result and pointer types of OpLoad do not match");
+      vtn_assert_types_equal(b, opcode, res_type,
src_val->type->deref);

        if (src->mode == vtn_variable_mode_image ||
            src->mode == vtn_variable_mode_sampler) {
@@ -2006,8 +2026,7 @@ vtn_handle_variables(struct vtn_builder *b,
SpvOp opcode,
        struct vtn_pointer *dest = dest_val->pointer;
        struct vtn_value *src_val = vtn_untyped_value(b, w[2]);

-      vtn_fail_if(dest_val->type->deref != src_val->type,
-                  "Value and pointer types of OpStore do not match");
+      vtn_assert_types_equal(b, opcode, dest_val->type->deref,
src_val->type);

        if (glsl_type_is_sampler(dest->type->type)) {
           vtn_warn("OpStore of a sampler detected.  Doing
on-the-fly copy "
--
2.5.0.400.gff86faf









[Mesa-dev] [PATCH 3/3] spirv: Loosten the validation for load/store type matching

2018-01-02 Thread Jason Ekstrand
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
---
 src/compiler/spirv/vtn_variables.c | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index d69b056..a74d0ce 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1899,6 +1899,29 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,
}
 }
 
+static void
+vtn_assert_types_equal(struct vtn_builder *b, SpvOp opcode,
+   struct vtn_type *dst_type,
+   struct vtn_type *src_type)
+{
+   if (dst_type->val == src_type->val)
+  return;
+
+   if (vtn_types_compatible(b, dst_type, src_type)) {
+  /* Early versions of GLSLang would re-emit types unnecessarily and you
+   * would end up with OpLoad, OpStore, or OpCopyMemory opcodes which have
+   * mismatched source and destination types.
+   */
+  vtn_warn("Source and destination types of %s do not match",
+   spirv_op_to_string(opcode));
+   } else {
+  vtn_fail("Source and destination types of %s do not match: %s vs. %s",
+   spirv_op_to_string(opcode),
+   glsl_get_type_name(dst_type->type),
+   glsl_get_type_name(src_type->type));
+   }
+}
+
 void
 vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
  const uint32_t *w, unsigned count)
@@ -1975,8 +1998,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *dest = vtn_value(b, w[1], vtn_value_type_pointer);
   struct vtn_value *src = vtn_value(b, w[2], vtn_value_type_pointer);
 
-  vtn_fail_if(dest->type->deref != src->type->deref,
-  "Dereferenced pointer types to OpCopyMemory do not match");
+  vtn_assert_types_equal(b, opcode, dest->type->deref, src->type->deref);
 
   vtn_variable_copy(b, dest->pointer, src->pointer);
   break;
@@ -1988,8 +2010,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_value *src_val = vtn_value(b, w[3], vtn_value_type_pointer);
   struct vtn_pointer *src = src_val->pointer;
 
-  vtn_fail_if(res_type != src_val->type->deref,
-  "Result and pointer types of OpLoad do not match");
+  vtn_assert_types_equal(b, opcode, res_type, src_val->type->deref);
 
   if (src->mode == vtn_variable_mode_image ||
   src->mode == vtn_variable_mode_sampler) {
@@ -2006,8 +2027,7 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode,
   struct vtn_pointer *dest = dest_val->pointer;
   struct vtn_value *src_val = vtn_untyped_value(b, w[2]);
 
-  vtn_fail_if(dest_val->type->deref != src_val->type,
-  "Value and pointer types of OpStore do not match");
+  vtn_assert_types_equal(b, opcode, dest_val->type->deref, src_val->type);
 
   if (glsl_type_is_sampler(dest->type->type)) {
  vtn_warn("OpStore of a sampler detected.  Doing on-the-fly copy "
-- 
2.5.0.400.gff86faf
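
For readers following the control flow: the helper above has three outcomes
(identical types pass silently, compatible types warn, anything else fails),
and _vtn_fail leaves via longjmp(b->fail_jump, 1). A minimal stand-alone
sketch of that pattern follows. Everything named toy_* is hypothetical and
not part of Mesa, and the setjmp site is an assumption, since the patches in
this series only show the longjmp side.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for vtn_builder: just the jump buffer and a counter. */
struct toy_builder {
   jmp_buf fail_jump;
   unsigned warnings;
};

/* Result of comparing a source and destination type (precomputed here). */
enum toy_match { TOY_IDENTICAL, TOY_COMPATIBLE, TOY_MISMATCH };

/* Report a fatal error and unwind to the setjmp site; never returns. */
static void
toy_fail(struct toy_builder *b, const char *msg)
{
   fprintf(stderr, "error: %s\n", msg);
   longjmp(b->fail_jump, 1);
}

/* Three tiers: identical is silent, compatible warns, mismatch is fatal. */
static void
toy_assert_types_equal(struct toy_builder *b, enum toy_match m)
{
   switch (m) {
   case TOY_IDENTICAL:
      return;
   case TOY_COMPATIBLE:
      b->warnings++;
      fprintf(stderr, "warning: types differ but are compatible\n");
      return;
   case TOY_MISMATCH:
      toy_fail(b, "source and destination types do not match");
   }
}

/* Returns true if every check passed, false if any check hit toy_fail(). */
bool
toy_translate(struct toy_builder *b, const enum toy_match *checks,
              size_t count)
{
   assert(b != NULL);
   b->warnings = 0;

   /* setjmp returns 0 on the direct call and 1 after longjmp(..., 1). */
   if (setjmp(b->fail_jump))
      return false;

   for (size_t i = 0; i < count; i++)
      toy_assert_types_equal(b, checks[i]);

   return true;
}
```

The warning counter lives behind the builder pointer rather than in a local
variable, so its value is still well defined after the longjmp.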



[Mesa-dev] [PATCH 2/3] spirv: Add a vtn_types_compatible helper

2018-01-02 Thread Jason Ekstrand
---
 src/compiler/spirv/spirv_to_nir.c | 52 +++
 src/compiler/spirv/vtn_private.h  |  3 +++
 2 files changed, 55 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 751fb03..5004d81 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -538,6 +538,58 @@ struct member_decoration_ctx {
struct vtn_type *type;
 };
 
+/** Returns true if two types are "compatible", i.e. you can do an OpLoad,
+ * OpStore, or OpCopyMemory between them without breaking anything.
+ * Technically, the SPIR-V rules require the exact same type ID but this lets
+ * us internally be a bit looser.
+ */
+bool
+vtn_types_compatible(struct vtn_builder *b,
+ struct vtn_type *t1, struct vtn_type *t2)
+{
+   if (t1->val == t2->val)
+  return true;
+
+   if (t1->base_type != t2->base_type)
+  return false;
+
+   switch (t1->base_type) {
+   case vtn_base_type_void:
+   case vtn_base_type_scalar:
+   case vtn_base_type_vector:
+   case vtn_base_type_matrix:
+   case vtn_base_type_image:
+   case vtn_base_type_sampler:
+   case vtn_base_type_sampled_image:
+  return t1->type == t2->type;
+
+   case vtn_base_type_array:
+  return t1->length == t2->length &&
+ vtn_types_compatible(b, t1->array_element, t2->array_element);
+
+   case vtn_base_type_pointer:
+  return vtn_types_compatible(b, t1->deref, t2->deref);
+
+   case vtn_base_type_struct:
+  if (t1->length != t2->length)
+ return false;
+
+  for (unsigned i = 0; i < t1->length; i++) {
+ if (!vtn_types_compatible(b, t1->members[i], t2->members[i]))
+return false;
+  }
+  return true;
+
+   case vtn_base_type_function:
+  /* This case shouldn't get hit since you can't copy around function
+   * types.  Just require them to be identical.
+   */
+  return false;
+   }
+
+   vtn_fail("Invalid base type");
+}
+
 /* does a shallow copy of a vtn_type */
 
 static struct vtn_type *
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 374643a..f2b53e1 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -365,6 +365,9 @@ struct vtn_type {
};
 };
 
+bool vtn_types_compatible(struct vtn_builder *b,
+  struct vtn_type *t1, struct vtn_type *t2);
+
 struct vtn_variable;
 
 enum vtn_access_mode {
-- 
2.5.0.400.gff86faf
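
To illustrate what "compatible" means here, below is a reduced sketch of the
same recursive structural comparison on a toy type representation. struct
toy_type and its fields are hypothetical stand-ins, not the real vtn_type
layout; the id field plays the role of the SPIR-V type ID that the patch
compares through t1->val.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type kinds; the real enum has more cases. */
enum toy_base_type {
   TOY_SCALAR,
   TOY_ARRAY,
   TOY_POINTER,
   TOY_STRUCT,
};

struct toy_type {
   enum toy_base_type base_type;
   unsigned id;                           /* stands in for the type ID */
   unsigned scalar_bits;                  /* TOY_SCALAR only */
   unsigned length;                       /* array length or member count */
   const struct toy_type *element;        /* array element or pointee */
   const struct toy_type *const *members; /* TOY_STRUCT only */
};

bool
toy_types_compatible(const struct toy_type *t1, const struct toy_type *t2)
{
   assert(t1 != NULL && t2 != NULL);

   /* Same ID: the strict rule is already satisfied. */
   if (t1->id == t2->id)
      return true;

   if (t1->base_type != t2->base_type)
      return false;

   switch (t1->base_type) {
   case TOY_SCALAR:
      return t1->scalar_bits == t2->scalar_bits;

   case TOY_ARRAY:
      return t1->length == t2->length &&
             toy_types_compatible(t1->element, t2->element);

   case TOY_POINTER:
      return toy_types_compatible(t1->element, t2->element);

   case TOY_STRUCT:
      if (t1->length != t2->length)
         return false;

      /* Members must match pairwise, in declaration order. */
      for (unsigned i = 0; i < t1->length; i++) {
         if (!toy_types_compatible(t1->members[i], t2->members[i]))
            return false;
      }
      return true;
   }

   return false;
}
```

Two arrays of four 32-bit scalars declared under different IDs compare as
compatible with this sketch, which is the re-emitted-type situation the
series is meant to tolerate.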



[Mesa-dev] [PATCH 1/3] spirv: Add a mechanism for dumping failing shaders

2018-01-02 Thread Jason Ekstrand
---
 src/compiler/spirv/spirv_to_nir.c | 29 +
 src/compiler/spirv/vtn_private.h  |  1 +
 2 files changed, 30 insertions(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index dcff56f..751fb03 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -31,6 +31,9 @@
 #include "nir/nir_constant_expressions.h"
 #include "spirv_info.h"
 
+#include 
+#include 
+
 void
 vtn_log(struct vtn_builder *b, enum nir_spirv_debug_level level,
 size_t spirv_offset, const char *message)
@@ -94,6 +97,27 @@ vtn_log_err(struct vtn_builder *b,
ralloc_free(msg);
 }
 
+static void
+vtn_dump_shader(struct vtn_builder *b, const char *path, const char *prefix)
+{
+   static int idx = 0;
+
+   char filename[1024];
+   int len = snprintf(filename, sizeof(filename), "%s/%s-%d.spirv",
+  path, prefix, idx++);
+   if (len < 0 || len >= sizeof(filename))
+  return;
+
+   int fd = open(filename, O_CREAT | O_CLOEXEC | O_WRONLY, 0777);
+   if (fd < 0)
+  return;
+
+   write(fd, b->spirv, b->spirv_word_count * 4);
+   close(fd);
+
+   vtn_info("SPIR-V shader dumped to %s", filename);
+}
+
 void
 _vtn_warn(struct vtn_builder *b, const char *file, unsigned line,
   const char *fmt, ...)
@@ -117,6 +141,10 @@ _vtn_fail(struct vtn_builder *b, const char *file, 
unsigned line,
file, line, fmt, args);
va_end(args);
 
+   const char *dump_path = getenv("MESA_SPIRV_FAIL_DUMP_PATH");
+   if (dump_path)
+  vtn_dump_shader(b, dump_path, "fail");
+
longjmp(b->fail_jump, 1);
 }
 
@@ -3690,6 +3718,7 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
/* Initialize the stn_builder object */
struct vtn_builder *b = rzalloc(NULL, struct vtn_builder);
b->spirv = words;
+   b->spirv_word_count = word_count;
b->file = NULL;
b->line = -1;
b->col = -1;
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index f7d8f49..374643a 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -531,6 +531,7 @@ struct vtn_builder {
jmp_buf fail_jump;
 
const uint32_t *spirv;
+   size_t spirv_word_count;
 
nir_shader *shader;
const struct spirv_to_nir_options *options;
-- 
2.5.0.400.gff86faf
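
As a companion to vtn_dump_shader above, here is a stand-alone sketch of the
same idea: build the file name with snprintf, reject a truncated path, then
write the words out. Note that this sketch swaps in stdio (fopen/fwrite)
where the patch uses open/write, so that a short write is easy to detect.
toy_dump_words and its parameters are hypothetical and not part of Mesa.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Writes word_count 32-bit words to "<dir>/<prefix>-<idx>.spirv".
 * The generated name is stored in out_name.
 * Returns 0 on success and -1 on any failure.
 */
int
toy_dump_words(const char *dir, const char *prefix, int idx,
               const uint32_t *words, size_t word_count,
               char *out_name, size_t out_name_size)
{
   assert(dir && prefix && words && out_name);

   int len = snprintf(out_name, out_name_size, "%s/%s-%d.spirv",
                      dir, prefix, idx);
   if (len < 0 || (size_t)len >= out_name_size)
      return -1; /* formatting failed or the path was truncated */

   FILE *f = fopen(out_name, "wb");
   if (f == NULL)
      return -1;

   /* fwrite reports how many whole words were actually written. */
   size_t written = fwrite(words, sizeof(uint32_t), word_count, f);
   int close_err = fclose(f);

   return (written == word_count && close_err == 0) ? 0 : -1;
}
```

With the patch itself applied, setting MESA_SPIRV_FAIL_DUMP_PATH to an
existing directory before starting the application is what enables the dump
from _vtn_fail.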



[Mesa-dev] [Bug 104457] Resetting rcs0 after gpu hang

2018-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104457

--- Comment #1 from Chris Smith ---
Created attachment 136498
  --> https://bugs.freedesktop.org/attachment.cgi?id=136498&action=edit
glxinfo > glxinfo.txt

I've attached the output of glxinfo to this bug report.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 104457] Resetting rcs0 after gpu hang

2018-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104457

            Bug ID: 104457
           Summary: Resetting rcs0 after gpu hang
           Product: Mesa
           Version: 17.3
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: ch...@hichris.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 136497
  --> https://bugs.freedesktop.org/attachment.cgi?id=136497&action=edit
sudo cat /sys/class/drm/card0/error > crashdump.txt

Original post here: https://github.com/ValveSoftware/portal2/issues/295

I've attached the output of /sys/class/drm/card0/error to this bug report.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 104214] Dota crashes when switching from game to desktop

2018-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104214

--- Comment #11 from Cyril ---
I used git bisect (between tags mesa-17.2.6 and mesa-17.3.1) and tested at
each step whether I could launch the game. It ended on this commit:


15e208c4ccdd94582a459d0066b587f91caf270c is the first bad commit
commit 15e208c4ccdd94582a459d0066b587f91caf270c
Author: Thomas Hellstrom 
Date:   Thu Sep 14 13:09:05 2017 +0200

loader/dri3: Don't accidently free buffer holding new back content

Avoid freeing buffers holding new back content
(with GLX_SWAP_COPY_OML and GLX_SWAP_EXCHANGE_OML)
Previously that would have resulted in back buffer content becoming
incorrect after a swap, although I haven't managed to trigger such a
situation yet.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Sinclair Yeh 



I was able to launch Dota by reverting this commit, but I guess that's not
the proper fix :D

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103152] Mesa 17.2 cannot be built on ARM with GCC 7.2

2018-01-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103152

Icenowy Zheng changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|Other                       |ARM
                 OS|All                         |Linux (All)
           Priority|medium                      |high

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.