date:20171019

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Iago Toral

On Fri, 2017-10-20 at 13:18 +1100, Timothy Arceri wrote:
> 
> On 20/10/17 03:31, Iago Toral Quiroga wrote:
> > The existing code was checking the whole interface variable rather
> > than its members, which is not what we want: we want to check
> > aliasing for each member in the interface variable.
> > 
> > Surprisingly, there are piglit tests that verify this and were
> > passing due to a bug in the existing code: when we were computing
> > the last component used by an interface variable we would use
> > the 'vector' path and multiply by vector_elements, which is 0 for
> > interface variables. This made the loop that checks for aliasing
> > be a no-op and not add the interface variable to the list of
> > outputs
> > so then we would fail to link when we did not see a matching output
> > for the same input in the next stage. Since the tests expect a
> > linker error to happen, they would pass, but not for the right
> > reason.
> > 
> > Unfortunately, the current implementation uses ir_variable
> > instances
> > to keep track of explicit locations. Since we don't have
> > ir_variables instances for individual interface members, we need
> > to have a custom struct with the data we need. This struct has
> > the ir_variable (which for interface members is the whole
> > interface variable), plus the data that we need to validate for
> > each aliased location, for now only the base type, which for
> > interface members we will take from the appropriate field inside
> > the interface variable.
> > 
> > Later patches will expand this custom struct so we can also check
> > other requirements for location aliasing, specifically that
> > we have matching interpolation and auxiliary storage, that once
> > again, we will take from the appropriate field members for the
> > interface variables.
> > 
> > Fixes:
> > KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations
> > 
> > Fixes:
> > tests/spec/arb_separate_shader_objects/execution/layout-location-
> > named-block.shader_test -auto -fbo
> > 
> > Fixes (these were passing before but for incorrect reasons):
> > tests/spec/arb_enhanced_layouts/linker/block-member-
> > locations/named-block-member-location-overlap.shader_test
> > tests/spec/arb_enhanced_layouts/linker/block-member-
> > locations/named-block-member-mixed-order-overlap.shader_test
> > 
> > ---
> >   src/compiler/glsl/link_varyings.cpp | 45
> > +++--
> >   1 file changed, 33 insertions(+), 12 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/link_varyings.cpp
> > b/src/compiler/glsl/link_varyings.cpp
> > index 687bd50e52..5dc2704321 100644
> > --- a/src/compiler/glsl/link_varyings.cpp
> > +++ b/src/compiler/glsl/link_varyings.cpp
> > @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable
> > *var, gl_shader_stage stage)
> >  return var->data.location - location_start;
> >   }
> >   
> > +struct explicit_location_info {
> > +   ir_variable *var;
> > +   unsigned base_type;
> > +};
> > +
> >   static bool
> > -check_location_aliasing(ir_variable *explicit_locations[][4],
> > +check_location_aliasing(struct explicit_location_info
> > explicit_locations[][4],
> >   ir_variable *var,
> >   unsigned location,
> >   unsigned component,
> > @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable
> > *explicit_locations[][4],
> >  while (location < location_limit) {
> > unsigned i = component;
> > while (i < last_comp) {
> > - if (explicit_locations[location][i] != NULL) {
> > + if (explicit_locations[location][i].var != NULL) {
> >   linker_error(prog,
> >    "%s shader has multiple outputs
> > explicitly "
> >    "assigned to location %d and component
> > %d\n",
> > @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable
> > *explicit_locations[][4],
> >    /* Make sure all component at this location have the
> > same type.
> > */
> >    for (unsigned j = 0; j < 4; j++) {
> > -if (explicit_locations[location][j] &&
> > -(explicit_locations[location][j]->type-
> > >without_array()
> > - ->base_type != type->without_array()->base_type)) 
> > {
> > +if (explicit_locations[location][j].var &&
> > +explicit_locations[location][j].base_type !=
> > +  type->without_array()->base_type) {
> >  linker_error(prog,
> >   "Varyings sharing the same location
> > must "
> >   "have the same underlying numerical
> > type. "
> > @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable
> > *explicit_locations[][4],
> >   }
> >    }
> >   
> > - explicit_locations[location][i] = var;
> > + explicit_locations[location][i].var = var;
> > + explicit_locations[location][i].base_type =

Re: [Mesa-dev] [PATCH] radv: ensure correct outinfo is picked.

2017-10-19 Thread Timothy Arceri




On 20/10/17 15:12, Dave Airlie wrote:

From: Dave Airlie 

This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about 
multiple stages.)
Signed-off-by: Dave Airlie 
---
  src/amd/vulkan/radv_pipeline.c | 27 ++-
  1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c16b5e3..9ffeda8 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1410,12 +1410,19 @@ static uint32_t si_vgt_gs_mode(struct 
radv_shader_variant *gs)
   S_028A40_GS_WRITE_OPTIMIZE(1);
  }
  
-static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline)

+static struct ac_vs_output_info *get_vs_output_info(struct radv_pipeline 
*pipeline)
  {
-   struct radv_shader_variant *vs;
-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
+   if (radv_pipeline_has_gs(pipeline))
+   return >gs_copy_shader->info.vs.outinfo;
+   else if (radv_pipeline_has_tess(pipeline))
+   return 
>shaders[MESA_SHADER_TESS_EVAL]->info.tes.outinfo;
+   else
+   return >shaders[MESA_SHADER_VERTEX]->info.vs.outinfo;
+}
  
-	struct ac_vs_output_info *outinfo = >info.vs.outinfo;

+static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline)
+{
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;


extra ;

  
  	pipeline->graphics.vgt_primitiveid_en = false;

pipeline->graphics.vgt_gs_mode = 0;
@@ -1430,10 +1437,7 @@ static void calculate_vgt_gs_mode(struct radv_pipeline 
*pipeline)
  
  static void calculate_pa_cl_vs_out_cntl(struct radv_pipeline *pipeline)

  {
-   struct radv_shader_variant *vs;
-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
-
-   struct ac_vs_output_info *outinfo = >info.vs.outinfo;
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;


extra ;

  
  	unsigned clip_dist_mask, cull_dist_mask, total_mask;

clip_dist_mask = outinfo->clip_dist_mask;
@@ -1476,13 +1480,10 @@ static uint32_t offset_to_ps_input(uint32_t offset, 
bool flat_shade)
  
  static void calculate_ps_inputs(struct radv_pipeline *pipeline)

  {
-   struct radv_shader_variant *ps, *vs;
-   struct ac_vs_output_info *outinfo;
+   struct radv_shader_variant *ps;
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;


extra ;

With those removed:

Reviewed-by: Timothy Arceri 

  
  	ps = pipeline->shaders[MESA_SHADER_FRAGMENT];

-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
-
-   outinfo = >info.vs.outinfo;
  
  	unsigned ps_offset = 0;
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation

2017-10-19 Thread Ian Romanick

On 09/13/2017 11:43 PM, aravindan.muthuku...@intel.com wrote:
> From: Aravindan Muthukumar 
> 
> Avoiding the loop which was running with O(n) complexity.
> Now the complexity has been reduced to O(1)
> 
> Algorithm calculates the index using matrix method.
> Matrix arrangement is as below:
> Assuming PAGE_SIZE is 4096.
> 
>  1*4096   2*40963*40964*4096
>  5*4096   6*40967*40968*4096

I think adding one more row to this chart would make it more clear.  The
two rows shown also follow a simpler pattern, and that made some of the
complexity below seem confusing.

   10*4096  12*4096   14*4096   16*4096

>   ...  ...   ...   ...
>   ...  ...   ...   ...
>   ...  ...   ...   max_cache_size
> 
> From this matrix its cleary seen that every row
   clearly

> follows the below way:
>  ...   ...   ...n
>n+(1/4)n  n+(1/2)n  n+(3/4)n2n
> 
> Row is calulated as log2(size/PAGE_SIZE)
> Column is calculated as converting the difference
> between the elements to fit into power size of two
> and indexing it.
> 
> Final Index is (row*4)+(col-1)
> 
> Tested with Intel Mesa CI.
> 
> Improves performance of 3d Mark on Broxton.
> Analyzed using Compare Perf Analyser:
> Average : 201.2 +/- 65.4836 (n=20)

Is 201 the improvement or the absolute score?  Do not quote absolute scores.

> Percentage : 0.705966% +/- 0.229767% (n=20)

Eero: Can you reproduce this result on BXT or other platforms?  Just
curious...

> v2: Review comments regarding cosmetics and asserts implemented
> 
> Signed-off-by: Aravindan Muthukumar 
> Signed-off-by: Kedar Karanje 
> Reviewed-by: Yogesh Marathe 
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 
> --
>  1 file changed, 39 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c 
> b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 8017219..8013ccb 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -87,6 +87,8 @@
>  
>  #define memclear(s) memset(, 0, sizeof(s))
>  
> +#define PAGE_SIZE 4096
> +
>  #define FILE_DEBUG_FLAG DEBUG_BUFMGR
>  
>  static inline int
> @@ -181,19 +183,45 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t 
> pitch, uint32_t tiling)
> return ALIGN(pitch, tile_width);
>  }
>  
> +static inline int
> +ilog2_round_up(int value)
> +{
> +   assert(value != 0);
> +   return 32 - __builtin_clz(value - 1);
> +}
> +
> +/*
> + * This function finds the correct bucket fit for the input size.
> + * The function works with O(1) complexity when the requested size
> + * was queried instead of iterating the size through all the buckets.
> + */
>  static struct bo_cache_bucket *
>  bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size)
>  {
> -   int i;
> +   int index = -1;

I don't see how execution could get to that test without index being set
again, so it should be safe to remove the initialization.

> +   int row, col = 0;
> +   int pages, pages_log2;

Move the declarations of row, col, pages, and pages_log2 into the else
case, initialize them with the only assignment to them, and make them const.

Since all of these values, including index, get values derived from
size, I believe the should all be unsigned.  In that case, remove the
index >= 0 test below.

>  
> -   for (i = 0; i < bufmgr->num_buckets; i++) {
> -  struct bo_cache_bucket *bucket = >cache_bucket[i];
> -  if (bucket->size >= size) {
> - return bucket;
> -  }
> +   /* condition for size less  than 4*4096 (16KB) page size */
 ^   ^ Remove one space here.
 |
 Capitalize and add a period at the end of the sentence.

> +   if(size <= 4 * PAGE_SIZE) {
^ Add a space here.

> +  index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;;
  Remove extra trailing semicolon. ^

> +   } else {
> +  /* Number of pages of page size */
> +  pages = DIV_ROUND_UP(size, PAGE_SIZE);
> +  pages_log2 = ilog2_round_up(pages) - 1;

Isn't this going to be the same result as _mesa_logbase2(pages)?

> +
> +  /* Finding the row and column of the matrix */
> +  row = pages_log2 - 1;
> +  col = DIV_ROUND_UP((pages - (1 << pages_log2)),
> +(1 << (pages_log2 - 2)));
   ^^
   This should  |
   be aligned   |
   here. ---+

Removing some of the extra parenthesis would also make it easier to read.

So... I feel like this function doesn't need to special case size < 16k
like this.  I haven't benchmarked it, but the following is about 30
bytes shorter and avoids the conditional branch.  I also think it's
easier to understand.

   const unsigned pages = DIV_ROUND_UP(size, PAGE_SIZE);

   /* Row

[Mesa-dev] [PATCH] radv: ensure correct outinfo is picked.

2017-10-19 Thread Dave Airlie

From: Dave Airlie 

This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about 
multiple stages.)
Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_pipeline.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c16b5e3..9ffeda8 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1410,12 +1410,19 @@ static uint32_t si_vgt_gs_mode(struct 
radv_shader_variant *gs)
   S_028A40_GS_WRITE_OPTIMIZE(1);
 }
 
-static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline)
+static struct ac_vs_output_info *get_vs_output_info(struct radv_pipeline 
*pipeline)
 {
-   struct radv_shader_variant *vs;
-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
+   if (radv_pipeline_has_gs(pipeline))
+   return >gs_copy_shader->info.vs.outinfo;
+   else if (radv_pipeline_has_tess(pipeline))
+   return 
>shaders[MESA_SHADER_TESS_EVAL]->info.tes.outinfo;
+   else
+   return >shaders[MESA_SHADER_VERTEX]->info.vs.outinfo;
+}
 
-   struct ac_vs_output_info *outinfo = >info.vs.outinfo;
+static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline)
+{
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;
 
pipeline->graphics.vgt_primitiveid_en = false;
pipeline->graphics.vgt_gs_mode = 0;
@@ -1430,10 +1437,7 @@ static void calculate_vgt_gs_mode(struct radv_pipeline 
*pipeline)
 
 static void calculate_pa_cl_vs_out_cntl(struct radv_pipeline *pipeline)
 {
-   struct radv_shader_variant *vs;
-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
-
-   struct ac_vs_output_info *outinfo = >info.vs.outinfo;
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;
 
unsigned clip_dist_mask, cull_dist_mask, total_mask;
clip_dist_mask = outinfo->clip_dist_mask;
@@ -1476,13 +1480,10 @@ static uint32_t offset_to_ps_input(uint32_t offset, 
bool flat_shade)
 
 static void calculate_ps_inputs(struct radv_pipeline *pipeline)
 {
-   struct radv_shader_variant *ps, *vs;
-   struct ac_vs_output_info *outinfo;
+   struct radv_shader_variant *ps;
+   struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);;
 
ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
-   vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : 
(radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] :  
pipeline->shaders[MESA_SHADER_VERTEX]);
-
-   outinfo = >info.vs.outinfo;
 
unsigned ps_offset = 0;
 
-- 
2.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/4] glsl/linker: outputs in the same location must share auxiliary storage

2017-10-19 Thread Timothy Arceri


I only really skimmed over them but these look fine to me 3-4:

Reviewed-by: Timothy Arceri 

Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri




On 20/10/17 13:36, Ilia Mirkin wrote:

On Thu, Oct 19, 2017 at 10:18 PM, Timothy Arceri  wrote:



On 20/10/17 03:31, Iago Toral Quiroga wrote:


The existing code was checking the whole interface variable rather
than its members, which is not what we want: we want to check
aliasing for each member in the interface variable.

Surprisingly, there are piglit tests that verify this and were
passing due to a bug in the existing code: when we were computing
the last component used by an interface variable we would use
the 'vector' path and multiply by vector_elements, which is 0 for
interface variables. This made the loop that checks for aliasing
be a no-op and not add the interface variable to the list of outputs
so then we would fail to link when we did not see a matching output
for the same input in the next stage. Since the tests expect a
linker error to happen, they would pass, but not for the right
reason.

Unfortunately, the current implementation uses ir_variable instances
to keep track of explicit locations. Since we don't have
ir_variables instances for individual interface members, we need
to have a custom struct with the data we need. This struct has
the ir_variable (which for interface members is the whole
interface variable), plus the data that we need to validate for
each aliased location, for now only the base type, which for
interface members we will take from the appropriate field inside
the interface variable.

Later patches will expand this custom struct so we can also check
other requirements for location aliasing, specifically that
we have matching interpolation and auxiliary storage, that once
again, we will take from the appropriate field members for the
interface variables.

Fixes:
KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations

Fixes:

tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test
-auto -fbo

Fixes (these were passing before but for incorrect reasons):

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test

---
   src/compiler/glsl/link_varyings.cpp | 45
+++--
   1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp
b/src/compiler/glsl/link_varyings.cpp
index 687bd50e52..5dc2704321 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var,
gl_shader_stage stage)
  return var->data.location - location_start;
   }
   +struct explicit_location_info {
+   ir_variable *var;
+   unsigned base_type;
+};
+
   static bool
-check_location_aliasing(ir_variable *explicit_locations[][4],
+check_location_aliasing(struct explicit_location_info
explicit_locations[][4],
   ir_variable *var,
   unsigned location,
   unsigned component,
@@ -427,7 +432,7 @@ check_location_aliasing(ir_variable
*explicit_locations[][4],
  while (location < location_limit) {
 unsigned i = component;
 while (i < last_comp) {
- if (explicit_locations[location][i] != NULL) {
+ if (explicit_locations[location][i].var != NULL) {
   linker_error(prog,
"%s shader has multiple outputs explicitly "
"assigned to location %d and component %d\n",
@@ -439,9 +444,9 @@ check_location_aliasing(ir_variable
*explicit_locations[][4],
/* Make sure all component at this location have the same type.
 */
for (unsigned j = 0; j < 4; j++) {
-if (explicit_locations[location][j] &&
-(explicit_locations[location][j]->type->without_array()
- ->base_type != type->without_array()->base_type)) {
+if (explicit_locations[location][j].var &&
+explicit_locations[location][j].base_type !=
+  type->without_array()->base_type) {
  linker_error(prog,
   "Varyings sharing the same location must "
   "have the same underlying numerical type. "
@@ -450,7 +455,9 @@ check_location_aliasing(ir_variable
*explicit_locations[][4],
   }
}
   - explicit_locations[location][i] = var;
+ explicit_locations[location][i].var = var;
+ explicit_locations[location][i].base_type =
+type->without_array()->base_type;
i++;
  /* We need to do some special handling for doubles as dvec3
and
@@ -482,8 +489,8 @@ cross_validate_outputs_to_inputs(struct gl_context
*ctx,
gl_linked_shader *consumer)
   {
  glsl_symbol_table parameters;
-   ir_variable

Re: [Mesa-dev] [PATCH] glsl: Allow precision mismatch on dead uniform with GLSL ES 1.00 (v3)

2017-10-19 Thread Tomasz Figa

Hi Ian, Kenneth,

On Wed, Sep 27, 2017 at 2:57 AM, Tomasz Figa  wrote:
> Commit 259fc505454ea6a67aeacf6cdebf1398d9947759 added linker error for
> mismatching uniform precision, as required by GLES 3.0 specification and
> conformance test-suite.
>
> Several Android applications, including Forge of Empires, have shaders
> which violate this rule, on uniforms that are declared but not used
> further in shader code. The problem affects a big number of Android
> games, including ones built on top of one of the common 2D graphics
> engines and other GLES implementations accept this, which poses a serious
> application compatibility issue.
>
> Starting from GLSL ES 3.0, declarations with conflicting precision
> qualifiers are explicitly prohibited. However GLSL ES 1.00 does not
> clearly specify the behavior, except that
>
>   "Uniforms are defined to behave as if they are using the same storage in
>   the vertex and fragment processors and may be implemented this way.
>   If uniforms are used in both the vertex and fragment shaders, developers
>   should be warned if the precisions are different. Conversion of
>   precision should never be implicit."
>
> The word "used" is not clear in this context and might refer to
>  1) declared (same as GLES 3.x)
>  2) referred after post-processing, or
>  3) linked after all optimizations are done.
>
> Looking at existing applications, 2) or 3) seems to be widely adopted.
> To avoid compatibility issues, turn the error into a warning if GLSL ES
> version is lower than 3.0 and the data is dead in at least one of the
> shaders.
>
> v3:
>  - Add a comment explaining the behavior.
>  - Fix bad copy/paste in commit message (s/varyings/uniforms).

Would you be able to take a look?

Ian, I believe your previous NAK was due to the confusing erroneous
copy/paste from the freedesktop bug I made in commit message. Looking
at last comment from Kenneth there, we were going to go with my patch,
but things remained quiet after that.

Thanks,
Tomasz

> v2:
>  - Change the behavior only for GLSL ES 1.00 shaders.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532
> Signed-off-by: Tomasz Figa 
> ---
>  src/compiler/glsl/linker.cpp | 32 
>  1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index f352c5385ca..693c5ae3664 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -1121,10 +1121,34 @@ cross_validate_globals(struct gl_shader_program *prog,
>   if (prog->IsES && (prog->data->Version != 310 ||
>  !var->get_interface_type()) &&
>   existing->data.precision != var->data.precision) {
> -linker_error(prog, "declarations for %s `%s` have "
> - "mismatching precision qualifiers\n",
> - mode_string(var), var->name);
> -return;
> +/*
> + * GLSL ES 3.00 is the first version that explicitly requires
> + * that uniform declarations have matching precision qualifiers.
> + *
> + * The only relevant part of GLSL ES 1.00 spec,
> + *
> + *  "If uniforms are used in both the vertex and fragment 
> shaders,
> + *   developers should be warned if the precisions are different.
> + *   Conversion of precision should never be implicit."
> + *
> + * leaves too wide field for intepretation. However, judging by
> + * applications and implementations existing in the wild, it 
> seems
> + * to be widely assumed that declarations alone are not enough to
> + * fail the link.
> + *
> + * Thus, in case of GLSL ES < 3.00, trigger an error only if the
> + * uniform is actually referenced in the code of both shaders.
> + */
> +if ((existing->data.used && var->data.used) || 
> prog->data->Version >= 300) {
> +   linker_error(prog, "declarations for %s `%s` have "
> +"mismatching precision qualifiers\n",
> +mode_string(var), var->name);
> +   return;
> +} else {
> +   linker_warning(prog, "declarations for %s `%s` have "
> +  "mismatching precision qualifiers\n",
> +  mode_string(var), var->name);
> +}
>   }
>} else
>   variables->add_variable(var);
> --
> 2.14.1.992.g2c7b836f3a-goog
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Ilia Mirkin

On Thu, Oct 19, 2017 at 10:18 PM, Timothy Arceri  wrote:
>
>
> On 20/10/17 03:31, Iago Toral Quiroga wrote:
>>
>> The existing code was checking the whole interface variable rather
>> than its members, which is not what we want: we want to check
>> aliasing for each member in the interface variable.
>>
>> Surprisingly, there are piglit tests that verify this and were
>> passing due to a bug in the existing code: when we were computing
>> the last component used by an interface variable we would use
>> the 'vector' path and multiply by vector_elements, which is 0 for
>> interface variables. This made the loop that checks for aliasing
>> be a no-op and not add the interface variable to the list of outputs
>> so then we would fail to link when we did not see a matching output
>> for the same input in the next stage. Since the tests expect a
>> linker error to happen, they would pass, but not for the right
>> reason.
>>
>> Unfortunately, the current implementation uses ir_variable instances
>> to keep track of explicit locations. Since we don't have
>> ir_variables instances for individual interface members, we need
>> to have a custom struct with the data we need. This struct has
>> the ir_variable (which for interface members is the whole
>> interface variable), plus the data that we need to validate for
>> each aliased location, for now only the base type, which for
>> interface members we will take from the appropriate field inside
>> the interface variable.
>>
>> Later patches will expand this custom struct so we can also check
>> other requirements for location aliasing, specifically that
>> we have matching interpolation and auxiliary storage, that once
>> again, we will take from the appropriate field members for the
>> interface variables.
>>
>> Fixes:
>> KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations
>>
>> Fixes:
>>
>> tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test
>> -auto -fbo
>>
>> Fixes (these were passing before but for incorrect reasons):
>>
>> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test
>>
>> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test
>>
>> ---
>>   src/compiler/glsl/link_varyings.cpp | 45
>> +++--
>>   1 file changed, 33 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/compiler/glsl/link_varyings.cpp
>> b/src/compiler/glsl/link_varyings.cpp
>> index 687bd50e52..5dc2704321 100644
>> --- a/src/compiler/glsl/link_varyings.cpp
>> +++ b/src/compiler/glsl/link_varyings.cpp
>> @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var,
>> gl_shader_stage stage)
>>  return var->data.location - location_start;
>>   }
>>   +struct explicit_location_info {
>> +   ir_variable *var;
>> +   unsigned base_type;
>> +};
>> +
>>   static bool
>> -check_location_aliasing(ir_variable *explicit_locations[][4],
>> +check_location_aliasing(struct explicit_location_info
>> explicit_locations[][4],
>>   ir_variable *var,
>>   unsigned location,
>>   unsigned component,
>> @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable
>> *explicit_locations[][4],
>>  while (location < location_limit) {
>> unsigned i = component;
>> while (i < last_comp) {
>> - if (explicit_locations[location][i] != NULL) {
>> + if (explicit_locations[location][i].var != NULL) {
>>   linker_error(prog,
>>"%s shader has multiple outputs explicitly "
>>"assigned to location %d and component %d\n",
>> @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable
>> *explicit_locations[][4],
>>/* Make sure all component at this location have the same type.
>> */
>>for (unsigned j = 0; j < 4; j++) {
>> -if (explicit_locations[location][j] &&
>> -(explicit_locations[location][j]->type->without_array()
>> - ->base_type != type->without_array()->base_type)) {
>> +if (explicit_locations[location][j].var &&
>> +explicit_locations[location][j].base_type !=
>> +  type->without_array()->base_type) {
>>  linker_error(prog,
>>   "Varyings sharing the same location must "
>>   "have the same underlying numerical type. "
>> @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable
>> *explicit_locations[][4],
>>   }
>>}
>>   - explicit_locations[location][i] = var;
>> + explicit_locations[location][i].var = var;
>> + explicit_locations[location][i].base_type =
>> +type->without_array()->base_type;
>>i++;
>>  /* We need to do some special handling for

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri




On 20/10/17 03:31, Iago Toral Quiroga wrote:

The existing code was checking the whole interface variable rather
than its members, which is not what we want: we want to check
aliasing for each member in the interface variable.

Surprisingly, there are piglit tests that verify this and were
passing due to a bug in the existing code: when we were computing
the last component used by an interface variable we would use
the 'vector' path and multiply by vector_elements, which is 0 for
interface variables. This made the loop that checks for aliasing
be a no-op and not add the interface variable to the list of outputs
so then we would fail to link when we did not see a matching output
for the same input in the next stage. Since the tests expect a
linker error to happen, they would pass, but not for the right
reason.

Unfortunately, the current implementation uses ir_variable instances
to keep track of explicit locations. Since we don't have
ir_variables instances for individual interface members, we need
to have a custom struct with the data we need. This struct has
the ir_variable (which for interface members is the whole
interface variable), plus the data that we need to validate for
each aliased location, for now only the base type, which for
interface members we will take from the appropriate field inside
the interface variable.

Later patches will expand this custom struct so we can also check
other requirements for location aliasing, specifically that
we have matching interpolation and auxiliary storage, that once
again, we will take from the appropriate field members for the
interface variables.

Fixes:
KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations

Fixes:
tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test
 -auto -fbo

Fixes (these were passing before but for incorrect reasons):
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test
tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test

---
  src/compiler/glsl/link_varyings.cpp | 45 +++--
  1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp 
b/src/compiler/glsl/link_varyings.cpp
index 687bd50e52..5dc2704321 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var, 
gl_shader_stage stage)
 return var->data.location - location_start;
  }
  
+struct explicit_location_info {

+   ir_variable *var;
+   unsigned base_type;
+};
+
  static bool
-check_location_aliasing(ir_variable *explicit_locations[][4],
+check_location_aliasing(struct explicit_location_info explicit_locations[][4],
  ir_variable *var,
  unsigned location,
  unsigned component,
@@ -427,7 +432,7 @@ check_location_aliasing(ir_variable 
*explicit_locations[][4],
 while (location < location_limit) {
unsigned i = component;
while (i < last_comp) {
- if (explicit_locations[location][i] != NULL) {
+ if (explicit_locations[location][i].var != NULL) {
  linker_error(prog,
   "%s shader has multiple outputs explicitly "
   "assigned to location %d and component %d\n",
@@ -439,9 +444,9 @@ check_location_aliasing(ir_variable 
*explicit_locations[][4],
   /* Make sure all component at this location have the same type.
*/
   for (unsigned j = 0; j < 4; j++) {
-if (explicit_locations[location][j] &&
-(explicit_locations[location][j]->type->without_array()
- ->base_type != type->without_array()->base_type)) {
+if (explicit_locations[location][j].var &&
+explicit_locations[location][j].base_type !=
+  type->without_array()->base_type) {
 linker_error(prog,
  "Varyings sharing the same location must "
  "have the same underlying numerical type. "
@@ -450,7 +455,9 @@ check_location_aliasing(ir_variable 
*explicit_locations[][4],
  }
   }
  
- explicit_locations[location][i] = var;

+ explicit_locations[location][i].var = var;
+ explicit_locations[location][i].base_type =
+type->without_array()->base_type;
   i++;
  
   /* We need to do some special handling for doubles as dvec3 and

@@ -482,8 +489,8 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx,
   gl_linked_shader *consumer)
  {
 glsl_symbol_table parameters;
-   ir_variable *explicit_locations[MAX_VARYINGS_INCL_PATCH][4] =
-  { {NULL, NULL} };
+   struct explicit_location_info 
explicit_locations[MAX_VARYINGS_INCL_PATCH][4] =
+

Re: [Mesa-dev] [PATCH v2 25/32] i965: add cache fallback support using serialized nir

2017-10-19 Thread Timothy Arceri


On 20/10/17 09:13, Timothy Arceri wrote:
IMO you should just do this in brw_link() when its cheap. You will be 
doing the deserialization at draw time here which is not what we want. 
Can also drop the serialized_nir params if you follow my suggestions in 
patch 14.


I still think you might want to always restore the nir rather than doing 
this at draw time, but it's not going to be a huge overhead compared to 
the compile so for now.


Acked-by: Timothy Arceri 

Something you probably want to do as a follow up is free the serialised 
NIR after you load it.




On 19/10/17 16:32, Jordan Justen wrote:

If the i965 gen program cannot be loaded from the cache, then we
fallback to using a serialized nir program.

This is based on "i965: add cache fallback support" by Timothy Arceri
. Tim's version was written to fallback
to compiling from source, and therefore had to be much more complex.
After Connor and Jason implemented nir serialization, I was able to
rewrite and greatly simplify this patch.

Signed-off-by: Jordan Justen 
---
  src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 
++-

  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c

index d89df846d5..790fad6925 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -24,6 +24,7 @@
  #include "compiler/blob.h"
  #include "compiler/glsl/ir_uniform.h"
  #include "compiler/glsl/shader_cache.h"
+#include "compiler/nir/nir_serialize.h"
  #include "main/mtypes.h"
  #include "util/disk_cache.h"
  #include "util/macros.h"
@@ -79,6 +80,27 @@ gen_shader_sha1(struct brw_context *brw, struct 
gl_program *prog,

 _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
  }
+static void
+fallback_to_full_recompile(struct brw_context *brw, struct gl_program 
*prog,

+   gl_shader_stage stage)
+{
+   prog->program_written_to_cache = false;
+   if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) {
+  fprintf(stderr, "falling back to nir %s.\n",
+  _mesa_shader_stage_to_abbrev(prog->info.stage));
+   }
+
+   if (!prog->nir) {
+  assert(prog->serialized_nir && prog->serialized_nir_size > 0);
+  const struct nir_shader_compiler_options *options =
+ brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions;
+  struct blob_reader reader;
+  blob_reader_init(, prog->serialized_nir,
+   prog->serialized_nir_size);
+  prog->nir = nir_deserialize(NULL, options, );
+   }
+}
+
  static void
  read_program_data(struct gl_program *glprog, struct blob_reader 
*binary,

    struct brw_stage_prog_data *prog_data,
@@ -298,6 +320,9 @@ brw_disk_cache_upload_program(struct brw_context 
*brw, gl_shader_stage stage)

 prog->sh.LinkedTransformFeedback->api_enabled)
    return false;
+   if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK)
+  goto FAIL;
+
 if (prog->sh.data->LinkStatus != linking_skipped)
    goto FAIL;
@@ -311,7 +336,7 @@ brw_disk_cache_upload_program(struct brw_context 
*brw, gl_shader_stage stage)

 return true;
  FAIL:
-   /*FIXME: Fall back and compile from source here. */
+   fallback_to_full_recompile(brw, prog, stage);
 return false;
  }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 14/32] glsl/shader_cache: Save and restore serialized nir in gl_program

2017-10-19 Thread Timothy Arceri


On 20/10/17 09:03, Timothy Arceri wrote:
This and the previous patch feels wrong. The glsl shader cache shouldn't 
be handling writing nir to disk. IMO you should add this functionality 
to nir_serialize.c (maybe rename is nir_cache.c ??). That way we have 
continue with the nir is a toolkit theme and we can easily reused the 
util to write nir to disk in Vulkan which could be useful for reducing 
compile times.


Doing it this way we could deserialize it in brw_link() and serialize it 
in glsl_to_nir() no need for the extra serialized_nir params.


After discussing this on IRC Jordan has changed my mind.

Reviewed-by: Timothy Arceri 



On 19/10/17 16:32, Jordan Justen wrote:

Signed-off-by: Jordan Justen 
---
  src/compiler/glsl/shader_cache.cpp | 16 
  1 file changed, 16 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp 
b/src/compiler/glsl/shader_cache.cpp

index ca90cfde35..f43bd6b17e 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata, 
gl_linked_shader *shader)

 }
 write_shader_parameters(metadata, glprog->Parameters);
+
+   assert((glprog->serialized_nir == NULL) ==
+  (glprog->serialized_nir_size == 0));
+   blob_write_uint32(metadata, (uint32_t)glprog->serialized_nir_size);
+   if (glprog->serialized_nir_size > 0) {
+  blob_write_bytes(metadata, glprog->serialized_nir,
+   glprog->serialized_nir_size);
+   }
  }
  static void
@@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata,
 glprog->Parameters = _mesa_new_parameter_list();
 read_shader_parameters(metadata, glprog->Parameters);
+
+   glprog->serialized_nir_size = (size_t)blob_read_uint32(metadata);
+   if (glprog->serialized_nir_size > 0) {
+  glprog->serialized_nir =
+ (uint8_t*)ralloc_size(glprog, glprog->serialized_nir_size);
+  blob_copy_bytes(metadata, glprog->serialized_nir,
+  glprog->serialized_nir_size);
+   }
  }
  static void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 09/32] glsl_to_nir: Zero nir_variable struct for valgrind & nir_serialize

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:57 PM PDT Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> ---
>  src/compiler/glsl/glsl_to_nir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
> b/src/compiler/glsl/glsl_to_nir.cpp
> index 63694fd41f..1d1085ffbc 100644
> --- a/src/compiler/glsl/glsl_to_nir.cpp
> +++ b/src/compiler/glsl/glsl_to_nir.cpp
> @@ -311,7 +311,7 @@ nir_visitor::visit(ir_variable *ir)
> if (ir->data.mode == ir_var_shader_shared)
>return;
>  
> -   nir_variable *var = ralloc(shader, nir_variable);
> +   nir_variable *var = rzalloc(shader, nir_variable);
> var->type = ir->type;
> var->name = ralloc_strdup(var, ir->name);
>  
> 

Patches 9-10:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 08/32] nir: Zero nir_load_const_instr::value for valgrind & nir_serialize

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:56 PM PDT Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> ---
>  src/compiler/nir/nir.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index fe48451694..cbba9c8749 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -481,6 +481,7 @@ nir_load_const_instr_create(nir_shader *shader, unsigned 
> num_components,
>  unsigned bit_size)
>  {
> nir_load_const_instr *instr = ralloc(shader, nir_load_const_instr);
> +   memset(>value, 0, sizeof(instr->value));

I would just rzalloc here.  Preemptively, that patch would get a:
Reviewed-by: Kenneth Graunke 

> instr_init(>instr, nir_instr_type_load_const);
>  
> nir_ssa_def_init(>instr, >def, num_components, bit_size, 
> NULL);
> 



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 07/32] intel/nir: Zero local index const struct for valgrind & nir_serialize

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:55 PM PDT Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> ---
>  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/intel/compiler/brw_nir_lower_cs_intrinsics.c 
> b/src/intel/compiler/brw_nir_lower_cs_intrinsics.c
> index f9322654e7..d27727624c 100644
> --- a/src/intel/compiler/brw_nir_lower_cs_intrinsics.c
> +++ b/src/intel/compiler/brw_nir_lower_cs_intrinsics.c
> @@ -116,6 +116,7 @@ lower_cs_intrinsics_convert_block(struct 
> lower_intrinsics_state *state,
>   nir_ssa_def *local_index = nir_load_local_invocation_index(b);
>  
>   nir_const_value uvec3;
> + memset(, 0, sizeof(uvec3));
>   uvec3.u32[0] = 1;
>   uvec3.u32[1] = size[0];
>   uvec3.u32[2] = size[0] * size[1];
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 06/32] nir: Zero local_size const struct for valgrind & nir_serialize

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:54 PM PDT Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> ---
>  src/compiler/nir/nir_lower_system_values.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/compiler/nir/nir_lower_system_values.c 
> b/src/compiler/nir/nir_lower_system_values.c
> index ba20d3083f..39b1a260bd 100644
> --- a/src/compiler/nir/nir_lower_system_values.c
> +++ b/src/compiler/nir/nir_lower_system_values.c
> @@ -58,6 +58,7 @@ convert_block(nir_block *block, nir_builder *b)
>*/
>  
>   nir_const_value local_size;
> + memset(_size, 0, sizeof(local_size));
>   local_size.u32[0] = b->shader->info.cs.local_size[0];
>   local_size.u32[1] = b->shader->info.cs.local_size[1];
>   local_size.u32[2] = b->shader->info.cs.local_size[2];
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 05/32] glsl: Zero per_vertex_accumulator::fields for valgrind & nir_serialize

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:53 PM PDT Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> ---
>  src/compiler/glsl/builtin_variables.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/compiler/glsl/builtin_variables.cpp 
> b/src/compiler/glsl/builtin_variables.cpp
> index ea2d897cc8..d3cf12475b 100644
> --- a/src/compiler/glsl/builtin_variables.cpp
> +++ b/src/compiler/glsl/builtin_variables.cpp
> @@ -318,6 +318,7 @@ per_vertex_accumulator::per_vertex_accumulator()
> : fields(),
>   num_fields(0)
>  {
> +   memset(fields, 0, sizeof(fields));
>  }

This shouldn't fix anything, we're calling the fields() constructor,
which should value-initialize every element of the fields array,
calling the constructor for glsl_struct_type on each element.

That constructor is supposed to zero initialize items.  Maybe it is
missing some of them?

FWIW, I trying to figure out the behavior of this by reading the
rules here:

https://stackoverflow.com/questions/1628311/array-initialisation

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 03/32] nir/intrinsics: Set the correct num_indices for load_output

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:51 PM PDT Jordan Justen wrote:
> From: Jason Ekstrand 
> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/nir/nir_intrinsics.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_intrinsics.h 
> b/src/compiler/nir/nir_intrinsics.h
> index 0de7080bfa..cefd18be90 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -434,7 +434,7 @@ INTRINSIC(load_interpolated_input, 2, ARR(2, 1), true, 0, 
> 0,
>  /* src[] = { buffer_index, offset }. No const_index */
>  LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
>  /* src[] = { offset }. const_index[] = { base, component } */
> -LOAD(output, 1, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
> +LOAD(output, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE)
>  /* src[] = { vertex, offset }. const_index[] = { base, component } */
>  LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, 
> NIR_INTRINSIC_CAN_ELIMINATE)
>  /* src[] = { offset }. const_index[] = { base } */
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 02/32] nir: Get rid of nir_shader::stage

2017-10-19 Thread Kenneth Graunke

On Wednesday, October 18, 2017 10:31:50 PM PDT Jordan Justen wrote:
> From: Jason Ekstrand 
> 
> It's redundant with nir_shader::info::stage.

I hate the extra wordiness here, but removing the redundancy is
certainly a reasonable thing to do.

So, as much as I dislike this,
Reviewed-by: Kenneth Graunke 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Revert absolute mode for constant buffer pointers.

2017-10-19 Thread Kristian Høgsberg

On Thu, Oct 19, 2017 at 4:18 PM, Kenneth Graunke  wrote:
> The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2
> registers at context initialization time.  Instead, they're inherited
> from whatever happened to be running on the GPU prior to first run of a
> new context.  So, when we started setting these, other contexts in the
> system started inheriting our values.  Since this controls whether
> 3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong
> setting is fatal for almost any process which isn't expecting this.
>
> Unfortunately, VA-API and Beignet don't initialize this (nor does older
> Mesa), so they will die horribly if we start doing this.  UXA and SNA
> don't use any push constants, so they are unaffected.
>
> The kernel developers are apparently uninterested in making the proto-
> context initialize these registers to guarantee deterministic behavior.

Could somebody from the kernel team elaborate here? This is obviously
broken and undermines the security and containerization that hw
contexts are supposed to provide. I'm really curious what the thinking
is here...

Kristian

> Apparently, the "Default Value" of registers in the documentation is a
> total lie, and cannot be relied upon by userspace.  So, we need to find
> a new solution.  One would be to patch VA-API and Beignet to initialize
> these, then get every distributor to ship those in tandem with the newer
> Mesa version.  I'm unclear when va-intel-driver releases might happen.
> Another option would be to hack Mesa to restore the register back to the
> historical default (relative mode) at the end of each batch.  This is
> also gross, as it has non-zero cost, and is also relying on userspace to
> be "polite" to other broken userspace.  A large part of the motivation
> for contexts was to provide proper process isolation, so we should not
> have to do this.  But, we're already doing it in some cases, because
> our kernel fixes to enforce isolation were reverted.
>
> In the meantime, I guess we'll just revert this patch and abandon using
> the feature.  It will lead to fewer pushed UBOs on Broadwell+, which may
> lead to lower performance, but I don't have any data on the impact.
>
> Cc: "17.2" 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774
> ---
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 24 
>  src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
>  2 files changed, 1 insertion(+), 25 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index 16f44d03bbe..23e4ebda259 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -101,30 +101,6 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
>OUT_BATCH(0);
>ADVANCE_BATCH();
> }
> -
> -   /* Set the "CONSTANT_BUFFER Address Offset Disable" bit, so
> -* 3DSTATE_CONSTANT_XS buffer 0 is an absolute address.
> -*
> -* On Gen6-7.5, we use an execbuf parameter to do this for us.
> -* However, the kernel ignores that when execlists are in use.
> -* Fortunately, we can just write the registers from userspace
> -* on Gen8+, and they're context saved/restored.
> -*/
> -   if (devinfo->gen >= 9) {
> -  BEGIN_BATCH(3);
> -  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> -  OUT_BATCH(CS_DEBUG_MODE2);
> -  OUT_BATCH(REG_MASK(CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) |
> -CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE);
> -  ADVANCE_BATCH();
> -   } else if (devinfo->gen == 8) {
> -  BEGIN_BATCH(3);
> -  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> -  OUT_BATCH(INSTPM);
> -  OUT_BATCH(REG_MASK(INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) |
> -INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE);
> -  ADVANCE_BATCH();
> -   }
>  }
>
>  static inline const struct brw_tracked_state *
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index ea04a72e860..776c2898d5b 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -2510,7 +2510,7 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen)
> screen->compiler = brw_compiler_create(screen, devinfo);
> screen->compiler->shader_debug_log = shader_debug_log_mesa;
> screen->compiler->shader_perf_log = shader_perf_log_mesa;
> -   screen->compiler->constant_buffer_0_is_relative = devinfo->gen < 8;
> +   screen->compiler->constant_buffer_0_is_relative = true;
> screen->compiler->supports_pull_constants = true;
>
> screen->has_exec_fence =
> --
> 2.14.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/fs: Use align1 mode on ternary instructions on Gen10+

2017-10-19 Thread Matt Turner

Align1 mode offers some nice features over align16, like access to more
data types and the ability to use a 16-bit immediate. This patch does
not start using any new features. It just emits ternary instructions in
align1 mode.
---
 src/intel/compiler/brw_fs_generator.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index 2622a91917..bdf2f916cb 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1729,13 +1729,15 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 
   case BRW_OPCODE_MAD:
  assert(devinfo->gen >= 6);
-brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_MAD(p, dst, src[0], src[1], src[2]);
 break;
 
   case BRW_OPCODE_LRP:
  assert(devinfo->gen >= 6);
-brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_LRP(p, dst, src[0], src[1], src[2]);
 break;
 
@@ -1833,7 +1835,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 
   case BRW_OPCODE_BFE:
  assert(devinfo->gen >= 7);
- brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_BFE(p, dst, src[0], src[1], src[2]);
  break;
 
@@ -1843,7 +1846,8 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
  break;
   case BRW_OPCODE_BFI2:
  assert(devinfo->gen >= 7);
- brw_set_default_access_mode(p, BRW_ALIGN_16);
+ if (devinfo->gen < 10)
+brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_BFI2(p, dst, src[0], src[1], src[2]);
  break;
 
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 10/13] i965: Add align1 ternary instruction disassembler support

2017-10-19 Thread Matt Turner

On Fri, Sep 29, 2017 at 5:11 PM, Scott D Phillips
 wrote:
> Matt Turner  writes:
>
>> ---
>>  src/intel/compiler/brw_disasm.c | 399 
>> +---
>>  src/intel/compiler/brw_eu_defines.h |  11 -
>>  2 files changed, 322 insertions(+), 88 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_disasm.c 
>> b/src/intel/compiler/brw_disasm.c
>> index 3726172e5d..7215735967 100644
>> --- a/src/intel/compiler/brw_disasm.c
>> +++ b/src/intel/compiler/brw_disasm.c
>> @@ -30,6 +30,7 @@
>>  #include "brw_reg.h"
>>  #include "brw_inst.h"
>>  #include "brw_eu.h"
>> +#include "util/half_float.h"
>>
>>  static bool
>>  has_jip(const struct gen_device_info *devinfo, enum opcode opcode)
>> @@ -237,13 +238,6 @@ static const char *const access_mode[2] = {
>> [1] = "align16",
>>  };
>>
>> -static const char *const three_source_reg_encoding[] = {
>> -   [BRW_3SRC_TYPE_F]  = "F",
>> -   [BRW_3SRC_TYPE_D]  = "D",
>> -   [BRW_3SRC_TYPE_UD] = "UD",
>> -   [BRW_3SRC_TYPE_DF] = "DF",
>> -};
>> -
>>  static const char *const reg_file[4] = {
>> [0] = "A",
>> [1] = "g",
>> @@ -762,17 +756,17 @@ dest(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *inst)
>>  static int
>>  dest_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
>> *inst)
>>  {
>> +   bool is_align1 = brw_inst_3src_access_mode(devinfo, inst) == BRW_ALIGN_1;
>> int err = 0;
>> +   unsigned flags = 0;
>> uint32_t reg_file;
>> -   enum brw_reg_type type =
>> -  brw_hw_3src_type_to_reg_type(devinfo,
>> -   brw_inst_3src_a16_dst_hw_type(devinfo, 
>> inst),
>> -   0);
>> -   unsigned dst_subreg_nr =
>> -  brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
>> -  brw_reg_type_to_size(type);
>> -
>> -   if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, inst))
>> +   unsigned subreg_nr;
>> +   unsigned hw_type;
>> +   enum brw_reg_type type;
>> +
>> +   if (is_align1 && brw_inst_3src_a1_dst_reg_file(devinfo, inst))
>> +  reg_file = BRW_ARCHITECTURE_REGISTER_FILE;
>> +   else if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, 
>> inst))
>>reg_file = BRW_MESSAGE_REGISTER_FILE;
>> else
>>reg_file = BRW_GENERAL_REGISTER_FILE;
>> @@ -780,13 +774,32 @@ dest_3src(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *ins
>> err |= reg(file, reg_file, brw_inst_3src_dst_reg_nr(devinfo, inst));
>> if (err == -1)
>>return 0;
>> -   if (dst_subreg_nr)
>> -  format(file, ".%u", dst_subreg_nr);
>> +
>> +   if (is_align1) {
>> +  flags |= IS_ALIGN1;
>> +
>> +  if (brw_inst_3src_a1_exec_type(devinfo, inst) ==
>> +  BRW_ALIGN1_3SRC_EXEC_TYPE_INT)
>> + flags |= IS_INTEGER;
>> +
>> +  hw_type = brw_inst_3src_a1_dst_hw_type(devinfo, inst);
>> +  subreg_nr = brw_inst_3src_a1_dst_subreg_nr(devinfo, inst);
>> +   } else {
>> +  hw_type = brw_inst_3src_a16_dst_hw_type(devinfo, inst);
>> +  subreg_nr = brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4;
>> +   }
>> +   type = brw_hw_3src_type_to_reg_type(devinfo, hw_type, flags);
>> +   subreg_nr /= brw_reg_type_to_size(type);
>> +
>> +   if (subreg_nr)
>> +  format(file, ".%u", subreg_nr);
>> string(file, "<1>");
>> -   err |= control(file, "writemask", writemask,
>> -  brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
>> -   err |= control(file, "dest reg encoding", three_source_reg_encoding,
>> -  brw_inst_3src_a16_dst_hw_type(devinfo, inst), NULL);
>> +
>> +   if (!is_align1) {
>> +  err |= control(file, "writemask", writemask,
>> + brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL);
>> +   }
>> +   string(file, brw_reg_type_to_letters(type));
>>
>> return 0;
>>  }
>> @@ -931,36 +944,169 @@ src_da16(FILE *file,
>> return err;
>>  }
>>
>> +static enum brw_vertical_stride
>> +vstride_from_align1_3src_vstride(enum gen10_align1_3src_vertical_stride 
>> vstride)
>> +{
>> +   switch (vstride) {
>> +   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_0: return BRW_VERTICAL_STRIDE_0;
>> +   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_2: return BRW_VERTICAL_STRIDE_2;
>> +   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_4: return BRW_VERTICAL_STRIDE_4;
>> +   case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_8: return BRW_VERTICAL_STRIDE_8;
>> +   default:
>> +  unreachable("not reached");
>> +   }
>> +}
>> +
>> +static enum brw_horizontal_stride
>> +hstride_from_align1_3src_hstride(enum 
>> gen10_align1_3src_src_horizontal_stride hstride)
>> +{
>> +   switch (hstride) {
>> +   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_0: return 
>> BRW_HORIZONTAL_STRIDE_0;
>> +   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_1: return 
>> BRW_HORIZONTAL_STRIDE_1;
>> +   case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_2: return 
>> BRW_HORIZONTAL_STRIDE_2;
>> +   case

[Mesa-dev] [PATCH] i965: Add align1 ternary instruction emission support

2017-10-19 Thread Matt Turner

---
 src/intel/compiler/brw_eu_emit.c | 219 +--
 1 file changed, 166 insertions(+), 53 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 7495a19fd8..4f98860044 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -678,64 +678,177 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
 
gen7_convert_mrf_to_grf(p, );
 
-   assert(brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_16);
-
-   assert(dest.file == BRW_GENERAL_REGISTER_FILE ||
- dest.file == BRW_MESSAGE_REGISTER_FILE);
assert(dest.nr < 128);
+   assert(src0.nr < 128);
+   assert(src1.nr < 128);
+   assert(src2.nr < 128);
assert(dest.address_mode == BRW_ADDRESS_DIRECT);
-   assert(dest.type == BRW_REGISTER_TYPE_F  ||
-  dest.type == BRW_REGISTER_TYPE_DF ||
-  dest.type == BRW_REGISTER_TYPE_D  ||
-  dest.type == BRW_REGISTER_TYPE_UD);
-   if (devinfo->gen == 6) {
-  brw_inst_set_3src_a16_dst_reg_file(devinfo, inst,
- dest.file == 
BRW_MESSAGE_REGISTER_FILE);
-   }
-   brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr);
-   brw_inst_set_3src_a16_dst_subreg_nr(devinfo, inst, dest.subnr / 16);
-   brw_inst_set_3src_a16_dst_writemask(devinfo, inst, dest.writemask);
-
-   assert(src0.file == BRW_GENERAL_REGISTER_FILE);
assert(src0.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src0.nr < 128);
-   brw_inst_set_3src_a16_src0_swizzle(devinfo, inst, src0.swizzle);
-   brw_inst_set_3src_a16_src0_subreg_nr(devinfo, inst, 
get_3src_subreg_nr(src0));
-   brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
-   brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs);
-   brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate);
-   brw_inst_set_3src_a16_src0_rep_ctrl(devinfo, inst,
-   src0.vstride == BRW_VERTICAL_STRIDE_0);
-
-   assert(src1.file == BRW_GENERAL_REGISTER_FILE);
assert(src1.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src1.nr < 128);
-   brw_inst_set_3src_src1_reg_nr(devinfo, inst, src1.nr);
-   brw_inst_set_3src_src1_abs(devinfo, inst, src1.abs);
-   brw_inst_set_3src_src1_negate(devinfo, inst, src1.negate);
-   brw_inst_set_3src_a16_src1_rep_ctrl(devinfo, inst,
-   src1.vstride == BRW_VERTICAL_STRIDE_0);
-
-   assert(src2.file == BRW_GENERAL_REGISTER_FILE);
assert(src2.address_mode == BRW_ADDRESS_DIRECT);
-   assert(src2.nr < 128);
-   brw_inst_set_3src_a16_src2_swizzle(devinfo, inst, src2.swizzle);
-   brw_inst_set_3src_a16_src2_subreg_nr(devinfo, inst, 
get_3src_subreg_nr(src2));
-   brw_inst_set_3src_src2_reg_nr(devinfo, inst, src2.nr);
-   brw_inst_set_3src_src2_abs(devinfo, inst, src2.abs);
-   brw_inst_set_3src_src2_negate(devinfo, inst, src2.negate);
-   brw_inst_set_3src_a16_src2_rep_ctrl(devinfo, inst,
-   src2.vstride == BRW_VERTICAL_STRIDE_0);
-
-   if (devinfo->gen >= 7) {
-  /* Set both the source and destination types based on dest.type,
-   * ignoring the source register types.  The MAD and LRP emitters ensure
-   * that all four types are float.  The BFE and BFI2 emitters, however,
-   * may send us mixed D and UD types and want us to ignore that and use
-   * the destination type.
-   */
-  brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type);
-  brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type);
+
+   if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
+  assert(dest.file == BRW_GENERAL_REGISTER_FILE ||
+ dest.file == BRW_ARCHITECTURE_REGISTER_FILE);
+
+  if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE) {
+ brw_inst_set_3src_a1_dst_reg_file(devinfo, inst,
+   BRW_ALIGN1_3SRC_ACCUMULATOR);
+ brw_inst_set_3src_dst_reg_nr(devinfo, inst, BRW_ARF_ACCUMULATOR);
+  } else {
+ brw_inst_set_3src_a1_dst_reg_file(devinfo, inst,
+   
BRW_ALIGN1_3SRC_GENERAL_REGISTER_FILE);
+ brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr);
+  }
+  brw_inst_set_3src_a1_dst_subreg_nr(devinfo, inst, dest.subnr / 8);
+
+  brw_inst_set_3src_a1_dst_hstride(devinfo, inst, 
BRW_ALIGN1_3SRC_DST_HORIZONTAL_STRIDE_1);
+
+  if (brw_reg_type_is_floating_point(dest.type)) {
+ brw_inst_set_3src_a1_exec_type(devinfo, inst,
+BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT);
+  } else {
+ brw_inst_set_3src_a1_exec_type(devinfo, inst,
+BRW_ALIGN1_3SRC_EXEC_TYPE_INT);
+  }
+
+  brw_inst_set_3src_a1_dst_type(devinfo, inst, dest.type);
+  brw_inst_set_3src_a1_src0_type(devinfo, inst, src0.type);
+  brw_inst_set_3src_a1_src1_type(devinfo, inst, src1.type);
+  brw_inst_set_3src_a1_src2_type(devinfo, inst, src2.type);
+
+

[Mesa-dev] [PATCH] i965: Revert absolute mode for constant buffer pointers.

2017-10-19 Thread Kenneth Graunke

The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2
registers at context initialization time.  Instead, they're inherited
from whatever happened to be running on the GPU prior to first run of a
new context.  So, when we started setting these, other contexts in the
system started inheriting our values.  Since this controls whether
3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong
setting is fatal for almost any process which isn't expecting this.

Unfortunately, VA-API and Beignet don't initialize this (nor does older
Mesa), so they will die horribly if we start doing this.  UXA and SNA
don't use any push constants, so they are unaffected.

The kernel developers are apparently uninterested in making the proto-
context initialize these registers to guarantee deterministic behavior.
Apparently, the "Default Value" of registers in the documentation is a
total lie, and cannot be relied upon by userspace.  So, we need to find
a new solution.  One would be to patch VA-API and Beignet to initialize
these, then get every distributor to ship those in tandem with the newer
Mesa version.  I'm unclear when va-intel-driver releases might happen.
Another option would be to hack Mesa to restore the register back to the
historical default (relative mode) at the end of each batch.  This is
also gross, as it has non-zero cost, and is also relying on userspace to
be "polite" to other broken userspace.  A large part of the motivation
for contexts was to provide proper process isolation, so we should not
have to do this.  But, we're already doing it in some cases, because
our kernel fixes to enforce isolation were reverted.

In the meantime, I guess we'll just revert this patch and abandon using
the feature.  It will lead to fewer pushed UBOs on Broadwell+, which may
lead to lower performance, but I don't have any data on the impact.

Cc: "17.2" 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774
---
 src/mesa/drivers/dri/i965/brw_state_upload.c | 24 
 src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
 2 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 16f44d03bbe..23e4ebda259 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -101,30 +101,6 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
   OUT_BATCH(0);
   ADVANCE_BATCH();
}
-
-   /* Set the "CONSTANT_BUFFER Address Offset Disable" bit, so
-* 3DSTATE_CONSTANT_XS buffer 0 is an absolute address.
-*
-* On Gen6-7.5, we use an execbuf parameter to do this for us.
-* However, the kernel ignores that when execlists are in use.
-* Fortunately, we can just write the registers from userspace
-* on Gen8+, and they're context saved/restored.
-*/
-   if (devinfo->gen >= 9) {
-  BEGIN_BATCH(3);
-  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
-  OUT_BATCH(CS_DEBUG_MODE2);
-  OUT_BATCH(REG_MASK(CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) |
-CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE);
-  ADVANCE_BATCH();
-   } else if (devinfo->gen == 8) {
-  BEGIN_BATCH(3);
-  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
-  OUT_BATCH(INSTPM);
-  OUT_BATCH(REG_MASK(INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) |
-INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE);
-  ADVANCE_BATCH();
-   }
 }
 
 static inline const struct brw_tracked_state *
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index ea04a72e860..776c2898d5b 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -2510,7 +2510,7 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen)
screen->compiler = brw_compiler_create(screen, devinfo);
screen->compiler->shader_debug_log = shader_debug_log_mesa;
screen->compiler->shader_perf_log = shader_perf_log_mesa;
-   screen->compiler->constant_buffer_0_is_relative = devinfo->gen < 8;
+   screen->compiler->constant_buffer_0_is_relative = true;
screen->compiler->supports_pull_constants = true;
 
screen->has_exec_fence =
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: Add align1 ternary instruction-word support

2017-10-19 Thread Matt Turner

Reviewed-by: Scott D Phillips 
---
 src/intel/compiler/brw_inst.h | 108 ++
 1 file changed, 108 insertions(+)

diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 1ddee77164..40723f9012 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -268,6 +268,114 @@ REG_TYPE(src)
 #undef REG_TYPE
 
 /**
+ * Three-source align1 instructions:
+ *  @{
+ */
+/* Reserved 127:126 */
+/* src2_reg_nr same in align16 */
+FC(3src_a1_src2_subreg_nr, 117, 113, devinfo->gen >= 10)
+FC(3src_a1_src2_hstride,   112, 111, devinfo->gen >= 10)
+/* Reserved 110:109. src2 vstride is an implied parameter */
+FC(3src_a1_src2_hw_type,   108, 106, devinfo->gen >= 10)
+/* Reserved 105 */
+/* src1_reg_nr same in align16 */
+FC(3src_a1_src1_subreg_nr,  96,  92, devinfo->gen >= 10)
+FC(3src_a1_src1_hstride,91,  90, devinfo->gen >= 10)
+FC(3src_a1_src1_vstride,89,  88, devinfo->gen >= 10)
+FC(3src_a1_src1_hw_type,87,  85, devinfo->gen >= 10)
+/* Reserved 84 */
+/* src0_reg_nr same in align16 */
+FC(3src_a1_src0_subreg_nr,  75,  71, devinfo->gen >= 10)
+FC(3src_a1_src0_hstride,70,  69, devinfo->gen >= 10)
+FC(3src_a1_src0_vstride,68,  67, devinfo->gen >= 10)
+FC(3src_a1_src0_hw_type,66,  64, devinfo->gen >= 10)
+/* dst_reg_nr same in align16 */
+FC(3src_a1_dst_subreg_nr,   55,  54, devinfo->gen >= 10)
+FC(3src_a1_special_acc, 55,  52, devinfo->gen >= 10) /* aliases 
dst_subreg_nr */
+/* Reserved 51:50 */
+FC(3src_a1_dst_hstride, 49,  49, devinfo->gen >= 10)
+FC(3src_a1_dst_hw_type, 48,  46, devinfo->gen >= 10)
+FC(3src_a1_src2_reg_file,   45,  45, devinfo->gen >= 10)
+FC(3src_a1_src1_reg_file,   44,  44, devinfo->gen >= 10)
+FC(3src_a1_src0_reg_file,   43,  43, devinfo->gen >= 10)
+/* Source Modifier fields same in align16 */
+FC(3src_a1_dst_reg_file,36,  36, devinfo->gen >= 10)
+FC(3src_a1_exec_type,   35,  35, devinfo->gen >= 10)
+/* Fields below this same in align16 */
+/** @} */
+
+#define REG_TYPE(reg) \
+static inline void\
+brw_inst_set_3src_a1_##reg##_type(const struct gen_device_info *devinfo,  \
+  brw_inst *inst, enum brw_reg_type type) \
+{ \
+   UNUSED enum gen10_align1_3src_exec_type exec_type =\
+  (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo,  \
+inst);\
+   if (brw_reg_type_is_floating_point(type)) {\
+  assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT);   \
+   } else {   \
+  assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_INT); \
+   }  \
+   unsigned hw_type = brw_reg_type_to_a1_hw_3src_type(devinfo, type); \
+   brw_inst_set_3src_a1_##reg##_hw_type(devinfo, inst, hw_type);  \
+} \
+  \
+static inline enum brw_reg_type   \
+brw_inst_3src_a1_##reg##_type(const struct gen_device_info *devinfo,  \
+  const brw_inst *inst)   \
+{ \
+   enum gen10_align1_3src_exec_type exec_type =   \
+  (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo,  \
+inst);\
+   unsigned hw_type = brw_inst_3src_a1_##reg##_hw_type(devinfo, inst);\
+   return brw_a1_hw_3src_type_to_reg_type(devinfo, hw_type, exec_type);   \
+}
+
+REG_TYPE(dst)
+REG_TYPE(src0)
+REG_TYPE(src1)
+REG_TYPE(src2)
+#undef REG_TYPE
+
+/**
+ * Three-source align1 instruction immediates:
+ *  @{
+ */
+static inline uint16_t
+brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
+   const brw_inst *insn)
+{
+   assert(devinfo->gen >= 10);
+   return brw_inst_bits(insn, 82, 67);
+}
+
+static inline uint16_t
+brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
+   const brw_inst *insn)
+{
+   assert(devinfo->gen >= 10);
+   return brw_inst_bits(insn, 124, 109);
+}
+
+static inline void
+brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
+   brw_inst *insn, uint16_t value)
+{
+   assert(devinfo->gen >= 10);
+   return brw_inst_set_bits(insn, 82, 67, value);
+}
+
+static inline

Re: [Mesa-dev] [PATCH 09/13] i965: Add align1 ternary instruction-word support

2017-10-19 Thread Matt Turner

On Fri, Sep 29, 2017 at 5:08 PM, Scott D Phillips
 wrote:
> Matt Turner  writes:
>
>> ---
>>  src/intel/compiler/brw_inst.h | 114 
>> ++
>>  1 file changed, 114 insertions(+)
>>
>> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
>> index e6169057e3..b9c03fa88f 100644
>> --- a/src/intel/compiler/brw_inst.h
>> +++ b/src/intel/compiler/brw_inst.h
>> @@ -268,6 +268,120 @@ REG_TYPE(src)
>>  #undef REG_TYPE
>>
>>  /**
>> + * Three-source align1 instructions:
>> + *  @{
>> + */
>> +/* Reserved 127:126 */
>> +/* src2_reg_nr same in align16 */
>> +FC(3src_a1_src2_subreg_nr, 117, 113, devinfo->gen >= 10)
>> +FC(3src_a1_src2_hstride,   112, 111, devinfo->gen >= 10)
>> +/* Reserved 110:109. src2 vstride is an implied parameter */
>> +FC(3src_a1_src2_hw_type,   108, 106, devinfo->gen >= 10)
>> +/* Reserved 105 */
>> +/* src1_reg_nr same in align16 */
>> +FC(3src_a1_src1_subreg_nr,  96,  92, devinfo->gen >= 10)
>> +FC(3src_a1_src1_hstride,91,  90, devinfo->gen >= 10)
>> +FC(3src_a1_src1_vstride,89,  88, devinfo->gen >= 10)
>> +FC(3src_a1_src1_hw_type,87,  85, devinfo->gen >= 10)
>> +/* Reserved 84 */
>> +/* src0_reg_nr same in align16 */
>> +FC(3src_a1_src0_subreg_nr,  75,  71, devinfo->gen >= 10)
>> +FC(3src_a1_src0_hstride,70,  69, devinfo->gen >= 10)
>> +FC(3src_a1_src0_vstride,68,  67, devinfo->gen >= 10)
>> +FC(3src_a1_src0_hw_type,66,  64, devinfo->gen >= 10)
>> +/* dst_reg_nr same in align16 */
>> +FC(3src_a1_dst_subreg_nr,   55,  54, devinfo->gen >= 10)
>> +FC(3src_a1_special_acc, 55,  52, devinfo->gen >= 10) /* aliases 
>> dst_subreg_nr */
>> +/* Reserved 51:50 */
>> +FC(3src_a1_dst_hstride, 49,  49, devinfo->gen >= 10)
>> +FC(3src_a1_dst_hw_type, 48,  46, devinfo->gen >= 10)
>> +FC(3src_a1_src2_reg_file,   45,  45, devinfo->gen >= 10)
>> +FC(3src_a1_src1_reg_file,   44,  44, devinfo->gen >= 10)
>> +FC(3src_a1_src0_reg_file,   43,  43, devinfo->gen >= 10)
>> +/* Source Modifier fields same in align16 */
>> +FC(3src_a1_dst_reg_file,36,  36, devinfo->gen >= 10)
>> +FC(3src_a1_exec_type,   35,  35, devinfo->gen >= 10)
>> +/* Fields below this same in align16 */
>> +/** @} */
>> +
>> +#define REG_TYPE(reg)   
>>   \
>> +static inline void  
>>   \
>> +brw_inst_set_3src_a1_##reg##_type(const struct gen_device_info *devinfo,
>>   \
>> +  brw_inst *inst, enum brw_reg_type type)   
>>   \
>> +{   
>>   \
>> +   enum gen10_align1_3src_exec_type exec_type = 
>>   \
>> +  (enum gen10_align1_3src_exec_type) 
>> brw_inst_3src_a1_exec_type(devinfo,  \
>> +inst);  
>>   \
>
> add MAYBE_UNUSED on exec_type here to silence a bunch of warnings.

Indeed. I didn't see that until I built a non-debug build. Fixed locally.

> Patches 4-7,9:
>
> Reviewed-by: Scott D Phillips 

Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: Add align1 ternary instruction support to conversion functions

2017-10-19 Thread Matt Turner

---
 src/intel/compiler/brw_disasm.c   | 16 ++-
 src/intel/compiler/brw_inst.h |  4 +-
 src/intel/compiler/brw_reg_type.c | 99 ---
 src/intel/compiler/brw_reg_type.h | 16 +--
 4 files changed, 101 insertions(+), 34 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index e91961028e..ef7f623ceb 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -764,9 +764,7 @@ dest_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
 {
int err = 0;
uint32_t reg_file;
-   enum brw_reg_type type =
-  brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_dst_hw_type(devinfo, 
inst));
+   enum brw_reg_type type = brw_inst_3src_a16_dst_type(devinfo, inst);
unsigned dst_subreg_nr =
   brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -934,9 +932,7 @@ static int
 src0_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   enum brw_reg_type type =
-  brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst);
unsigned src0_subreg_nr =
   brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -966,9 +962,7 @@ static int
 src1_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   enum brw_reg_type type =
-  brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst);
unsigned src1_subreg_nr =
   brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
@@ -999,9 +993,7 @@ static int
 src2_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst 
*inst)
 {
int err = 0;
-   enum brw_reg_type type =
-  brw_hw_3src_type_to_reg_type(devinfo,
-   brw_inst_3src_a16_src_hw_type(devinfo, 
inst));
+   enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst);
unsigned src2_subreg_nr =
   brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 /
   brw_reg_type_to_size(type);
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index 0cc1a3e911..1ddee77164 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -251,7 +251,7 @@ static inline void  
  \
 brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \
brw_inst *inst, enum brw_reg_type type)\
 { \
-   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type);\
+   unsigned hw_type = brw_reg_type_to_a16_hw_3src_type(devinfo, type);\
brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type); \
 } \
   \
@@ -260,7 +260,7 @@ brw_inst_3src_a16_##reg##_type(const struct gen_device_info 
*devinfo, \
const brw_inst *inst)  \
 { \
unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst);   \
-   return brw_hw_3src_type_to_reg_type(devinfo, hw_type); \
+   return brw_a16_hw_3src_type_to_reg_type(devinfo, hw_type); \
 }
 
 REG_TYPE(dst)
diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index f5aadf88bb..c73be15f16 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -84,20 +84,57 @@ static const struct {
  * and unsigned doublewords, so a new field is also available in the da3src
  * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select
  * dst and shared-src types.
+ *
+ * CNL adds support for 3-src instructions in align1 mode, and with it support
+ * for most register types.
  */
 enum hw_3src_reg_type {
GEN7_3SRC_TYPE_F  = 0,
GEN7_3SRC_TYPE_D  = 1,
GEN7_3SRC_TYPE_UD = 2,
GEN7_3SRC_TYPE_DF = 3,
+
+   /** When ExecutionDatatype is 1: @{ */
+   GEN10_ALIGN1_3SRC_REG_TYPE_HF = 0b000,
+   GEN10_ALIGN1_3SRC_REG_TYPE_F  = 0b001,
+   GEN10_ALIGN1_3SRC_REG_TYPE_DF = 0b010,
+   /** @} */
+
+   /** When ExecutionDatatype is 0: @{ */
+   GEN10_ALIGN1_3SRC_REG_TYPE_UD = 0b000,
+   GEN10_ALIGN1_3SRC_REG_TYPE_D  = 0b001,
+   GEN10_ALIGN1_3SRC_REG_TYPE_UW = 0b010,
+   GEN10_ALIGN1_3SRC_REG_TYPE_W  = 0b011,
+

Re: [Mesa-dev] [PATCH 12/13] i965/fs: Use align1 mode on ternary instructions on Gen10+

2017-10-19 Thread Matt Turner

On Fri, Sep 29, 2017 at 5:20 PM, Scott D Phillips
 wrote:
> Matt Turner  writes:
>
>> Align1 mode offers some nice features over align16, like access to more
>> data types and the ability to use a 16-bit immediate. This patch does
>> not start using any new features. It just emits ternary instructions in
>> align1 mode.
>> ---
>>  src/intel/compiler/brw_fs_generator.cpp | 12 
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs_generator.cpp 
>> b/src/intel/compiler/brw_fs_generator.cpp
>> index afaec5c949..03ee26ccd4 100644
>> --- a/src/intel/compiler/brw_fs_generator.cpp
>> +++ b/src/intel/compiler/brw_fs_generator.cpp
>> @@ -1728,14 +1728,16 @@ fs_generator::generate_code(const cfg_t *cfg, int 
>> dispatch_width)
>>
>>case BRW_OPCODE_MAD:
>>   assert(devinfo->gen >= 6);
>> -  brw_set_default_access_mode(p, BRW_ALIGN_16);
>> + if (devinfo->gen < 10)
>> +brw_set_default_access_mode(p, BRW_ALIGN_16);
>
> Not setting anything gets BRW_ALIGN_1 because its enum value is 0. Maybe
> add a comment for that if you want.

Actually, it's set at the top of the loop:

  brw_set_default_access_mode(p, BRW_ALIGN_1);

along with some other defaults. This is done, because align1 mode
matches well with the scalar mode that the fragment shader runs in
(and on Gen8+ all other stages too).

>>   brw_MAD(p, dst, src[0], src[1], src[2]);
>>break;
>>
>>case BRW_OPCODE_LRP:
>>   assert(devinfo->gen >= 6);
>>brw_set_default_access_mode(p, BRW_ALIGN_16);
>> - brw_LRP(p, dst, src[0], src[1], src[2]);
>> + if (devinfo->gen < 10)
>> +brw_LRP(p, dst, src[0], src[1], src[2]);
>
> Copy paste error

Whoops. Thanks. Fixed locally.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 08/13] i965: Add align1 ternary instruction support to conversion functions

2017-10-19 Thread Matt Turner

On Fri, Sep 29, 2017 at 5:07 PM, Scott D Phillips
 wrote:
> Matt Turner  writes:
>
>> ---
>>  src/intel/compiler/brw_disasm.c   | 12 ---
>>  src/intel/compiler/brw_inst.h |  4 +--
>>  src/intel/compiler/brw_reg_type.c | 76 
>> ---
>>  src/intel/compiler/brw_reg_type.h |  7 ++--
>>  4 files changed, 79 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_disasm.c 
>> b/src/intel/compiler/brw_disasm.c
>> index aab4a65b7d..3726172e5d 100644
>> --- a/src/intel/compiler/brw_disasm.c
>> +++ b/src/intel/compiler/brw_disasm.c
>> @@ -766,7 +766,8 @@ dest_3src(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *ins
>> uint32_t reg_file;
>> enum brw_reg_type type =
>>brw_hw_3src_type_to_reg_type(devinfo,
>> -   brw_inst_3src_a16_dst_hw_type(devinfo, 
>> inst));
>> +   brw_inst_3src_a16_dst_hw_type(devinfo, 
>> inst),
>> +   0);
>> unsigned dst_subreg_nr =
>>brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 /
>>brw_reg_type_to_size(type);
>> @@ -936,7 +937,8 @@ src0_3src(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *ins
>> int err = 0;
>> enum brw_reg_type type =
>>brw_hw_3src_type_to_reg_type(devinfo,
>> -   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst));
>> +   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst),
>> +   0);
>> unsigned src0_subreg_nr =
>>brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 /
>>brw_reg_type_to_size(type);
>> @@ -968,7 +970,8 @@ src1_3src(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *ins
>> int err = 0;
>> enum brw_reg_type type =
>>brw_hw_3src_type_to_reg_type(devinfo,
>> -   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst));
>> +   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst),
>> +   0);
>> unsigned src1_subreg_nr =
>>brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 /
>>brw_reg_type_to_size(type);
>> @@ -1001,7 +1004,8 @@ src2_3src(FILE *file, const struct gen_device_info 
>> *devinfo, const brw_inst *ins
>> int err = 0;
>> enum brw_reg_type type =
>>brw_hw_3src_type_to_reg_type(devinfo,
>> -   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst));
>> +   brw_inst_3src_a16_src_hw_type(devinfo, 
>> inst),
>> +   0);
>> unsigned src2_subreg_nr =
>>brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 /
>>brw_reg_type_to_size(type);
>> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
>> index 0cc1a3e911..e6169057e3 100644
>> --- a/src/intel/compiler/brw_inst.h
>> +++ b/src/intel/compiler/brw_inst.h
>> @@ -251,7 +251,7 @@ static inline void   
>>  \
>>  brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo,   
>>   \
>> brw_inst *inst, enum brw_reg_type type)  
>>   \
>>  {   
>>   \
>> -   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type);  
>>   \
>> +   unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type, 0);   
>>   \
>> brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type);   
>>   \
>>  }   
>>   \
>>  
>>   \
>> @@ -260,7 +260,7 @@ brw_inst_3src_a16_##reg##_type(const struct 
>> gen_device_info *devinfo, \
>> const brw_inst *inst)
>>   \
>>  {   
>>   \
>> unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst); 
>>   \
>> -   return brw_hw_3src_type_to_reg_type(devinfo, hw_type);   
>>   \
>> +   return brw_hw_3src_type_to_reg_type(devinfo, hw_type, 0);
>>   \
>>  }
>>
>>  REG_TYPE(dst)
>> diff --git a/src/intel/compiler/brw_reg_type.c 
>> b/src/intel/compiler/brw_reg_type.c
>> index d65ebaee48..7fb4a1e62a 100644
>> --- a/src/intel/compiler/brw_reg_type.c
>> +++ b/src/intel/compiler/brw_reg_type.c
>> @@ -84,20 +84,55 @@ static const struct {
>>   * and unsigned doublewords, so a new field is also available in the da3src
>>   * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select
>>   * dst and shared-src types.
>> + *
>> + * CNL adds support for 3-src instructions in align1

Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather

2017-10-19 Thread Timothy Arceri


On 20/10/17 09:38, Jordan Justen wrote:

On 2017-10-19 15:07:45, Timothy Arceri wrote:

Maybe you should just do:

prog->nir->info = prog->info;

After you restore nir from the cache?


We only deserialize from nir if the gen program restore fails. So,
hopefully prog->nir will be NULL.


IMO we should always restore the NIR. See my comments on 14 and 25.

Basically we want to do all the restores at link time to avoid ever 
falling back at draw time.




-Jordan



On 19/10/17 16:32, Jordan Justen wrote:

When a program is restored from the shader cache, prog->nir will be
NULL, but prog->info will be restored.

Signed-off-by: Jordan Justen 
---
   src/mesa/drivers/dri/i965/brw_wm.c   |  4 ++--
   src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
   2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 69d8e61e40..e511f0f70b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
}
   
/* gather4 for RG32* is broken in multiple ways on Gen7. */

- if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) {
+ if (devinfo->gen == 7 && prog->info.uses_texture_gather) {
   switch (img->InternalFormat) {
   case GL_RG32I:
   case GL_RG32UI: {
@@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
/* Gen6's gather4 is broken for UINT/SINT; we treat them as
 * UNORM/FLOAT instead and fix it in the shader.
 */
- if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) {
+ if (devinfo->gen == 6 && prog->info.uses_texture_gather) {
   key->gen6_gather_wa[s] = 
gen6_gather_workaround(img->InternalFormat);
}
   
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c

index f4e9cf48c6..4f454dae44 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw)
   * allows the surface format to be overriden for only the
   * gather4 messages. */
  if (devinfo->gen < 8) {
-  if (vs && vs->nir->info.uses_texture_gather)
+  if (vs && vs->info.uses_texture_gather)
update_stage_texture_surfaces(brw, vs, >vs.base, true, 0);
-  if (tcs && tcs->nir->info.uses_texture_gather)
+  if (tcs && tcs->info.uses_texture_gather)
update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0);
-  if (tes && tes->nir->info.uses_texture_gather)
+  if (tes && tes->info.uses_texture_gather)
update_stage_texture_surfaces(brw, tes, >tes.base, true, 0);
-  if (gs && gs->nir->info.uses_texture_gather)
+  if (gs && gs->info.uses_texture_gather)
update_stage_texture_surfaces(brw, gs, >gs.base, true, 0);
-  if (fs && fs->nir->info.uses_texture_gather)
+  if (fs && fs->info.uses_texture_gather)
update_stage_texture_surfaces(brw, fs, >wm.base, true, 0);
  }
   
@@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context *brw)

   * gather4 messages.
   */
  if (devinfo->gen < 8) {
-  if (cs && cs->nir->info.uses_texture_gather)
+  if (cs && cs->info.uses_texture_gather)
update_stage_texture_surfaces(brw, cs, >cs.base, true, 0);
  }
   


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather

2017-10-19 Thread Jordan Justen

On 2017-10-19 15:07:45, Timothy Arceri wrote:
> Maybe you should just do:
> 
> prog->nir->info = prog->info;
> 
> After you restore nir from the cache?

We only deserialize from nir if the gen program restore fails. So,
hopefully prog->nir will be NULL.

-Jordan

> 
> On 19/10/17 16:32, Jordan Justen wrote:
> > When a program is restored from the shader cache, prog->nir will be
> > NULL, but prog->info will be restored.
> > 
> > Signed-off-by: Jordan Justen 
> > ---
> >   src/mesa/drivers/dri/i965/brw_wm.c   |  4 ++--
> >   src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
> >   2 files changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> > b/src/mesa/drivers/dri/i965/brw_wm.c
> > index 69d8e61e40..e511f0f70b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> > @@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context 
> > *ctx,
> >}
> >   
> >/* gather4 for RG32* is broken in multiple ways on Gen7. */
> > - if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) {
> > + if (devinfo->gen == 7 && prog->info.uses_texture_gather) {
> >   switch (img->InternalFormat) {
> >   case GL_RG32I:
> >   case GL_RG32UI: {
> > @@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context 
> > *ctx,
> >/* Gen6's gather4 is broken for UINT/SINT; we treat them as
> > * UNORM/FLOAT instead and fix it in the shader.
> > */
> > - if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) {
> > + if (devinfo->gen == 6 && prog->info.uses_texture_gather) {
> >   key->gen6_gather_wa[s] = 
> > gen6_gather_workaround(img->InternalFormat);
> >}
> >   
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > index f4e9cf48c6..4f454dae44 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > @@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw)
> >   * allows the surface format to be overriden for only the
> >   * gather4 messages. */
> >  if (devinfo->gen < 8) {
> > -  if (vs && vs->nir->info.uses_texture_gather)
> > +  if (vs && vs->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, vs, >vs.base, true, 0);
> > -  if (tcs && tcs->nir->info.uses_texture_gather)
> > +  if (tcs && tcs->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0);
> > -  if (tes && tes->nir->info.uses_texture_gather)
> > +  if (tes && tes->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, tes, >tes.base, true, 0);
> > -  if (gs && gs->nir->info.uses_texture_gather)
> > +  if (gs && gs->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, gs, >gs.base, true, 0);
> > -  if (fs && fs->nir->info.uses_texture_gather)
> > +  if (fs && fs->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, fs, >wm.base, true, 0);
> >  }
> >   
> > @@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context 
> > *brw)
> >   * gather4 messages.
> >   */
> >  if (devinfo->gen < 8) {
> > -  if (cs && cs->nir->info.uses_texture_gather)
> > +  if (cs && cs->info.uses_texture_gather)
> >update_stage_texture_surfaces(brw, cs, >cs.base, true, 0);
> >  }
> >   
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] swr: Rework scratch space allocation

2017-10-19 Thread Cherniak, Bruce

Reviewed-by: Bruce Cherniak  

> On Oct 19, 2017, at 4:40 PM, George Kyriazis  
> wrote:
> 
> Remove allocation of > 2kbyte buffers into context memory in
> swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
> and shader constants to a scratch space to be used by the upcoming draw.)
> 
> Large shader constant allocations need to be done in the circular scratch
> buffer instead of context memory, because their values persist across
> render calls.
> 
> Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
> buffers will get too large for the circular scratch space.
> 
> Fixes render issues with CEI Ensight.
> ---
> src/gallium/drivers/swr/swr_scratch.cpp | 51 ++---
> src/gallium/drivers/swr/swr_screen.cpp  |  2 +-
> 2 files changed, 23 insertions(+), 30 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/swr_scratch.cpp 
> b/src/gallium/drivers/swr/swr_scratch.cpp
> index d298a48..c31f3c2 100644
> --- a/src/gallium/drivers/swr/swr_scratch.cpp
> +++ b/src/gallium/drivers/swr/swr_scratch.cpp
> @@ -28,8 +28,6 @@
> #include "swr_fence_work.h"
> #include "api.h"
> 
> -#define SCRATCH_SINGLE_ALLOCATION_LIMIT 2048
> -
> void *
> swr_copy_to_scratch_space(struct swr_context *ctx,
>   struct swr_scratch_space *space,
> @@ -40,41 +38,36 @@ swr_copy_to_scratch_space(struct swr_context *ctx,
>assert(space);
>assert(size);
> 
> -   if (size >= SCRATCH_SINGLE_ALLOCATION_LIMIT) {
> -  /* Use per draw SwrAllocDrawContextMemory for larger copies */
> -  ptr = ctx->api.pfnSwrAllocDrawContextMemory(ctx->swrContext, size, 4);
> -   } else {
> -  /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */
> -  unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT;
> -
> -  /* Need to grow space */
> -  if (max_size_in_flight > space->current_size) {
> - space->current_size = max_size_in_flight;
> +   /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */
> +   unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT;
> 
> - if (space->base) {
> -/* defer delete, use aligned-free */
> -struct swr_screen *screen = swr_screen(ctx->pipe.screen);
> -swr_fence_work_free(screen->flush_fence, space->base, true);
> -space->base = NULL;
> - }
> +   /* Need to grow space */
> +   if (max_size_in_flight > space->current_size) {
> +  space->current_size = max_size_in_flight;
> 
> - if (!space->base) {
> -space->base = (uint8_t *)AlignedMalloc(space->current_size, 
> -   sizeof(void *));
> -space->head = (void *)space->base;
> - }
> +  if (space->base) {
> + /* defer delete, use aligned-free */
> + struct swr_screen *screen = swr_screen(ctx->pipe.screen);
> + swr_fence_work_free(screen->flush_fence, space->base, true);
> + space->base = NULL;
>   }
> 
> -  /* Wrap */
> -  if (((uint8_t *)space->head + size)
> -  >= ((uint8_t *)space->base + space->current_size)) {
> - space->head = space->base;
> +  if (!space->base) {
> + space->base = (uint8_t *)AlignedMalloc(space->current_size, 
> +sizeof(void *));
> + space->head = (void *)space->base;
>   }
> +   }
> 
> -  ptr = space->head;
> -  space->head = (uint8_t *)space->head + size;
> +   /* Wrap */
> +   if (((uint8_t *)space->head + size)
> +   >= ((uint8_t *)space->base + space->current_size)) {
> +  space->head = space->base;
>}
> 
> +   ptr = space->head;
> +   space->head = (uint8_t *)space->head + size;
> +
>/* Copy user_buffer to scratch */
>if (user_buffer)
>   memcpy(ptr, user_buffer, size);
> diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
> b/src/gallium/drivers/swr/swr_screen.cpp
> index 46b3a00..b21c35e 100644
> --- a/src/gallium/drivers/swr/swr_screen.cpp
> +++ b/src/gallium/drivers/swr/swr_screen.cpp
> @@ -57,7 +57,7 @@
> #define SWR_MAX_TEXTURE_ARRAY_LAYERS 512 /* 8K x 512 / 8K x 8K x 512 */
> 
> /* Default max client_copy_limit */
> -#define SWR_CLIENT_COPY_LIMIT 32768
> +#define SWR_CLIENT_COPY_LIMIT 8192
> 
> /* Flag indicates creation of alternate surface, to prevent recursive loop
>  * in resource creation when msaa_force_enable is set. */
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri


On 20/10/17 08:45, Ilia Mirkin wrote:

On Thu, Oct 19, 2017 at 5:35 PM, Timothy Arceri  wrote:



On 20/10/17 08:27, Timothy Arceri wrote:




On 20/10/17 08:19, Timothy Arceri wrote:


On 20/10/17 04:21, Ilia Mirkin wrote:


On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin 
wrote:


On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:


On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:


Will this work with SSO shaders? Presumably the validation still has
to happen, but I don't think cross_validate_outputs_to_inputs() will
end up getting called.



The piglit tests I mention use SSO so it seems to be working for this.
See for example:

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
block-member-location-overlap.shader_test



Ah great. I'm a little curious how, since I don't see how
cross_validate_outputs_to_inputs would get called for SSO shaders.
Perhaps I'm looking at old code.

Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
- can you try with that? It's just using a pipeline, but both shaders
are ending up in it.



BTW, my solution to all this was

https://patchwork.freedesktop.org/patch/175959/

but Tim hated it, and I didn't have the time to properly respond.



Hate is a strong word, the problem is it duplicated some of the
checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks
should be pulled into a helper/helpers that can also be used by SSO.



Oh and your patch was also missing all the component checking logic which
we also should be doing for SSO. Moving the checks into helpers will give us
these check for free.



Thinking about it some more, I don't see how your patch works for SSO.

The only place you can validate the SSO pipeline is via
_mesa_validate_program_pipeline() since we can't know what the other stages
will be at link time.


You don't need to validate the pipeline. You need to make sure each
individual program links, and the code is called at glLinkShader()
time, at which point it will fail the link if the interfaces have
things they're not supposed to.



Oh right, this is just for validating the outward facing interfaces of a 
SSO program. I think we have discussed this before :P

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 32/32] disk_cache: Add support for MESA_GLSL_CACHE_TIMESTAMP in debug builds

2017-10-19 Thread Timothy Arceri


Maybe add to docs?

On 19/10/17 16:32, Jordan Justen wrote:

The MESA_GLSL_CACHE_TIMESTAMP environment variable can be set to
override the driver timestamp. Usually the driver will specify a hash
of their driver build so the cache items become invalid with each
driver build.

We don't guarantee a stable serialized shader cache format, so
changing the timestamp for each build is required for safety.

Nevertheless, during debug, making small changes to the driver may be
known to be safe. The driver developer can use this variable to keep
the timestamp consistent. When debugging issues on an application for
which the shader cache greatly lowers the startup time, this can save
the developer significant time.

Signed-off-by: Jordan Justen 
---
  src/util/disk_cache.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index fde6e2e097..54f48a8ba5 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -208,6 +208,18 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp,
 if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", false))
goto fail;
  
+#ifdef DEBUG

+   /* For debug builds, MESA_GLSL_CACHE_TIMESTAMP can be set to override the
+* driver specified timestamp. This will allow small changes to be made to
+* the driver without invalidating the cache. Given that this is normally
+* unsafe, it is only allowed for debug builds.
+*/
+   const char *timestamp_override = getenv("MESA_GLSL_CACHE_TIMESTAMP");
+   if (timestamp_override) {
+  timestamp = timestamp_override;
+   }
+#endif
+
 /* Determine path for cache based on the first defined name as follows:
  *
  *   $MESA_GLSL_CACHE_DIR


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 29/32] disk_cache: Fix issue reading GLSL metadata

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 28/32] glsl/shader_cache: Save fs (BlendSupport) metadata

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 27/32] i965: Initialize sha1 hash of dri config options

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 25/32] i965: add cache fallback support using serialized nir

2017-10-19 Thread Timothy Arceri

IMO you should just do this in brw_link() when its cheap. You will be 
doing the deserialization at draw time here which is not what we want. 
Can also drop the serialized_nir params if you follow my suggestions in 
patch 14.


On 19/10/17 16:32, Jordan Justen wrote:

If the i965 gen program cannot be loaded from the cache, then we
fallback to using a serialized nir program.

This is based on "i965: add cache fallback support" by Timothy Arceri
. Tim's version was written to fallback
to compiling from source, and therefore had to be much more complex.
After Connor and Jason implemented nir serialization, I was able to
rewrite and greatly simplify this patch.

Signed-off-by: Jordan Justen 
---
  src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++-
  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c
index d89df846d5..790fad6925 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -24,6 +24,7 @@
  #include "compiler/blob.h"
  #include "compiler/glsl/ir_uniform.h"
  #include "compiler/glsl/shader_cache.h"
+#include "compiler/nir/nir_serialize.h"
  #include "main/mtypes.h"
  #include "util/disk_cache.h"
  #include "util/macros.h"
@@ -79,6 +80,27 @@ gen_shader_sha1(struct brw_context *brw, struct gl_program 
*prog,
 _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
  }
  
+static void

+fallback_to_full_recompile(struct brw_context *brw, struct gl_program *prog,
+   gl_shader_stage stage)
+{
+   prog->program_written_to_cache = false;
+   if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) {
+  fprintf(stderr, "falling back to nir %s.\n",
+  _mesa_shader_stage_to_abbrev(prog->info.stage));
+   }
+
+   if (!prog->nir) {
+  assert(prog->serialized_nir && prog->serialized_nir_size > 0);
+  const struct nir_shader_compiler_options *options =
+ brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions;
+  struct blob_reader reader;
+  blob_reader_init(, prog->serialized_nir,
+   prog->serialized_nir_size);
+  prog->nir = nir_deserialize(NULL, options, );
+   }
+}
+
  static void
  read_program_data(struct gl_program *glprog, struct blob_reader *binary,
struct brw_stage_prog_data *prog_data,
@@ -298,6 +320,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, 
gl_shader_stage stage)
 prog->sh.LinkedTransformFeedback->api_enabled)
return false;
  
+   if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK)

+  goto FAIL;
+
 if (prog->sh.data->LinkStatus != linking_skipped)
goto FAIL;
  
@@ -311,7 +336,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage)

 return true;
  
  FAIL:

-   /*FIXME: Fall back and compile from source here. */
+   fallback_to_full_recompile(brw, prog, stage);
 return false;
  }
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 22/32] i965: Add shader cache support for compute

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather

2017-10-19 Thread Timothy Arceri


Maybe you should just do:

prog->nir->info = prog->info;

After you restore nir from the cache?

On 19/10/17 16:32, Jordan Justen wrote:

When a program is restored from the shader cache, prog->nir will be
NULL, but prog->info will be restored.

Signed-off-by: Jordan Justen 
---
  src/mesa/drivers/dri/i965/brw_wm.c   |  4 ++--
  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
b/src/mesa/drivers/dri/i965/brw_wm.c
index 69d8e61e40..e511f0f70b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
   }
  
   /* gather4 for RG32* is broken in multiple ways on Gen7. */

- if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) {
+ if (devinfo->gen == 7 && prog->info.uses_texture_gather) {
  switch (img->InternalFormat) {
  case GL_RG32I:
  case GL_RG32UI: {
@@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
   /* Gen6's gather4 is broken for UINT/SINT; we treat them as
* UNORM/FLOAT instead and fix it in the shader.
*/
- if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) {
+ if (devinfo->gen == 6 && prog->info.uses_texture_gather) {
  key->gen6_gather_wa[s] = 
gen6_gather_workaround(img->InternalFormat);
   }
  
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c

index f4e9cf48c6..4f454dae44 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw)
  * allows the surface format to be overriden for only the
  * gather4 messages. */
 if (devinfo->gen < 8) {
-  if (vs && vs->nir->info.uses_texture_gather)
+  if (vs && vs->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, vs, >vs.base, true, 0);
-  if (tcs && tcs->nir->info.uses_texture_gather)
+  if (tcs && tcs->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0);
-  if (tes && tes->nir->info.uses_texture_gather)
+  if (tes && tes->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, tes, >tes.base, true, 0);
-  if (gs && gs->nir->info.uses_texture_gather)
+  if (gs && gs->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, gs, >gs.base, true, 0);
-  if (fs && fs->nir->info.uses_texture_gather)
+  if (fs && fs->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, fs, >wm.base, true, 0);
 }
  
@@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context *brw)

  * gather4 messages.
  */
 if (devinfo->gen < 8) {
-  if (cs && cs->nir->info.uses_texture_gather)
+  if (cs && cs->info.uses_texture_gather)
   update_stage_texture_surfaces(brw, cs, >cs.base, true, 0);
 }
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 14/32] glsl/shader_cache: Save and restore serialized nir in gl_program

2017-10-19 Thread Timothy Arceri

This and the previous patch feels wrong. The glsl shader cache shouldn't 
be handling writing nir to disk. IMO you should add this functionality 
to nir_serialize.c (maybe rename is nir_cache.c ??). That way we have 
continue with the nir is a toolkit theme and we can easily reused the 
util to write nir to disk in Vulkan which could be useful for reducing 
compile times.


Doing it this way we could deserialize it in brw_link() and serialize it 
in glsl_to_nir() no need for the extra serialized_nir params.


On 19/10/17 16:32, Jordan Justen wrote:

Signed-off-by: Jordan Justen 
---
  src/compiler/glsl/shader_cache.cpp | 16 
  1 file changed, 16 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp 
b/src/compiler/glsl/shader_cache.cpp
index ca90cfde35..f43bd6b17e 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata, 
gl_linked_shader *shader)
 }
  
 write_shader_parameters(metadata, glprog->Parameters);

+
+   assert((glprog->serialized_nir == NULL) ==
+  (glprog->serialized_nir_size == 0));
+   blob_write_uint32(metadata, (uint32_t)glprog->serialized_nir_size);
+   if (glprog->serialized_nir_size > 0) {
+  blob_write_bytes(metadata, glprog->serialized_nir,
+   glprog->serialized_nir_size);
+   }
  }
  
  static void

@@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata,
  
 glprog->Parameters = _mesa_new_parameter_list();

 read_shader_parameters(metadata, glprog->Parameters);
+
+   glprog->serialized_nir_size = (size_t)blob_read_uint32(metadata);
+   if (glprog->serialized_nir_size > 0) {
+  glprog->serialized_nir =
+ (uint8_t*)ralloc_size(glprog, glprog->serialized_nir_size);
+  blob_copy_bytes(metadata, glprog->serialized_nir,
+  glprog->serialized_nir_size);
+   }
  }
  
  static void



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 12/32] nir: Add hooks for testing serialization

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 11/32] nir: add serialization and deserialization

2017-10-19 Thread Timothy Arceri


Acked-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] radv: Enable tessellation shaders for GFX9.

2017-10-19 Thread Bas Nieuwenhuizen

It mostly works now.
---
 src/amd/vulkan/radv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 7f306db5c48..125498809ec 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -429,7 +429,7 @@ void radv_GetPhysicalDeviceFeatures(
.imageCubeArray   = true,
.independentBlend = true,
.geometryShader   = !is_gfx9,
-   .tessellationShader   = !is_gfx9,
+   .tessellationShader   = true,
.sampleRateShading= true,
.dualSrcBlend = true,
.logicOp  = true,
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] ac/nir: init full exec mask for merged shaders.

2017-10-19 Thread Bas Nieuwenhuizen

From: Dave Airlie 

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/common/ac_llvm_build.c  | 8 
 src/amd/common/ac_llvm_build.h  | 1 +
 src/amd/common/ac_nir_to_llvm.c | 3 +++
 3 files changed, 12 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 949f181aace..e5cd23e0251 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -1734,3 +1734,11 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ctx,
*num_param_exports = exports.num;
}
 }
+
+void ac_init_exec_full_mask(struct ac_llvm_context *ctx)
+{
+   LLVMValueRef full_mask = LLVMConstInt(ctx->i64, ~0ull, 0);
+   ac_build_intrinsic(ctx,
+  "llvm.amdgcn.init.exec", ctx->voidt,
+  _mask, 1, AC_FUNC_ATTR_CONVERGENT);
+}
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index f0b5875b423..aa2a2899ab4 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -281,6 +281,7 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ac,
uint8_t *vs_output_param_offset,
uint32_t num_outputs,
uint8_t *num_param_exports);
+void ac_init_exec_full_mask(struct ac_llvm_context *ctx);
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 242675654d2..b6a7d43789d 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -6488,6 +6488,9 @@ LLVMModuleRef 
ac_translate_nir_to_llvm(LLVMTargetMachineRef tm,
ctx.abi.load_ssbo = radv_load_ssbo;
ctx.abi.load_sampler_desc = radv_get_sampler_desc;
 
+   if (shader_count >= 2)
+   ac_init_exec_full_mask();
+
if (ctx.ac.chip_class == GFX9 &&
shaders[shader_count - 1]->stage == MESA_SHADER_TESS_CTRL)
ac_nir_fixup_ls_hs_input_vgprs();
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 10/32] glsl_to_nir: Zero nir_constant in constant_copy for valgrind & nir_serialize

2017-10-19 Thread Timothy Arceri


5-10:

Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 04/32] compiler/types: Support [de]serializing void types

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Ilia Mirkin

On Thu, Oct 19, 2017 at 5:35 PM, Timothy Arceri  wrote:
>
>
> On 20/10/17 08:27, Timothy Arceri wrote:
>>
>>
>>
>> On 20/10/17 08:19, Timothy Arceri wrote:
>>>
>>> On 20/10/17 04:21, Ilia Mirkin wrote:

 On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin 
 wrote:
>
> On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:
>>
>> On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:
>>>
>>> Will this work with SSO shaders? Presumably the validation still has
>>> to happen, but I don't think cross_validate_outputs_to_inputs() will
>>> end up getting called.
>>
>>
>> The piglit tests I mention use SSO so it seems to be working for this.
>> See for example:
>>
>> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
>> block-member-location-overlap.shader_test
>
>
> Ah great. I'm a little curious how, since I don't see how
> cross_validate_outputs_to_inputs would get called for SSO shaders.
> Perhaps I'm looking at old code.
>
> Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
> - can you try with that? It's just using a pipeline, but both shaders
> are ending up in it.


 BTW, my solution to all this was

 https://patchwork.freedesktop.org/patch/175959/

 but Tim hated it, and I didn't have the time to properly respond.
>>>
>>>
>>> Hate is a strong word, the problem is it duplicated some of the
>>> checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks
>>> should be pulled into a helper/helpers that can also be used by SSO.
>>
>>
>> Oh and your patch was also missing all the component checking logic which
>> we also should be doing for SSO. Moving the checks into helpers will give us
>> these check for free.
>
>
> Thinking about it some more, I don't see how your patch works for SSO.
>
> The only place you can validate the SSO pipeline is via
> _mesa_validate_program_pipeline() since we can't know what the other stages
> will be at link time.

You don't need to validate the pipeline. You need to make sure each
individual program links, and the code is called at glLinkShader()
time, at which point it will fail the link if the interfaces have
things they're not supposed to.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 03/32] nir/intrinsics: Set the correct num_indices for load_output

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 02/32] nir: Get rid of nir_shader::stage

2017-10-19 Thread Timothy Arceri


Acked-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 01/32] glsl: move shader_cache type handling to glsl_types

2017-10-19 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] swr: Rework scratch space allocation

2017-10-19 Thread George Kyriazis

Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)

Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.

Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.

Fixes render issues with CEI Ensight.
---
 src/gallium/drivers/swr/swr_scratch.cpp | 51 ++---
 src/gallium/drivers/swr/swr_screen.cpp  |  2 +-
 2 files changed, 23 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_scratch.cpp 
b/src/gallium/drivers/swr/swr_scratch.cpp
index d298a48..c31f3c2 100644
--- a/src/gallium/drivers/swr/swr_scratch.cpp
+++ b/src/gallium/drivers/swr/swr_scratch.cpp
@@ -28,8 +28,6 @@
 #include "swr_fence_work.h"
 #include "api.h"
 
-#define SCRATCH_SINGLE_ALLOCATION_LIMIT 2048
-
 void *
 swr_copy_to_scratch_space(struct swr_context *ctx,
   struct swr_scratch_space *space,
@@ -40,41 +38,36 @@ swr_copy_to_scratch_space(struct swr_context *ctx,
assert(space);
assert(size);
 
-   if (size >= SCRATCH_SINGLE_ALLOCATION_LIMIT) {
-  /* Use per draw SwrAllocDrawContextMemory for larger copies */
-  ptr = ctx->api.pfnSwrAllocDrawContextMemory(ctx->swrContext, size, 4);
-   } else {
-  /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */
-  unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT;
-
-  /* Need to grow space */
-  if (max_size_in_flight > space->current_size) {
- space->current_size = max_size_in_flight;
+   /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */
+   unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT;
 
- if (space->base) {
-/* defer delete, use aligned-free */
-struct swr_screen *screen = swr_screen(ctx->pipe.screen);
-swr_fence_work_free(screen->flush_fence, space->base, true);
-space->base = NULL;
- }
+   /* Need to grow space */
+   if (max_size_in_flight > space->current_size) {
+  space->current_size = max_size_in_flight;
 
- if (!space->base) {
-space->base = (uint8_t *)AlignedMalloc(space->current_size, 
-   sizeof(void *));
-space->head = (void *)space->base;
- }
+  if (space->base) {
+ /* defer delete, use aligned-free */
+ struct swr_screen *screen = swr_screen(ctx->pipe.screen);
+ swr_fence_work_free(screen->flush_fence, space->base, true);
+ space->base = NULL;
   }
 
-  /* Wrap */
-  if (((uint8_t *)space->head + size)
-  >= ((uint8_t *)space->base + space->current_size)) {
- space->head = space->base;
+  if (!space->base) {
+ space->base = (uint8_t *)AlignedMalloc(space->current_size, 
+sizeof(void *));
+ space->head = (void *)space->base;
   }
+   }
 
-  ptr = space->head;
-  space->head = (uint8_t *)space->head + size;
+   /* Wrap */
+   if (((uint8_t *)space->head + size)
+   >= ((uint8_t *)space->base + space->current_size)) {
+  space->head = space->base;
}
 
+   ptr = space->head;
+   space->head = (uint8_t *)space->head + size;
+
/* Copy user_buffer to scratch */
if (user_buffer)
   memcpy(ptr, user_buffer, size);
diff --git a/src/gallium/drivers/swr/swr_screen.cpp 
b/src/gallium/drivers/swr/swr_screen.cpp
index 46b3a00..b21c35e 100644
--- a/src/gallium/drivers/swr/swr_screen.cpp
+++ b/src/gallium/drivers/swr/swr_screen.cpp
@@ -57,7 +57,7 @@
 #define SWR_MAX_TEXTURE_ARRAY_LAYERS 512 /* 8K x 512 / 8K x 8K x 512 */
 
 /* Default max client_copy_limit */
-#define SWR_CLIENT_COPY_LIMIT 32768
+#define SWR_CLIENT_COPY_LIMIT 8192
 
 /* Flag indicates creation of alternate surface, to prevent recursive loop
  * in resource creation when msaa_force_enable is set. */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri




On 20/10/17 08:27, Timothy Arceri wrote:



On 20/10/17 08:19, Timothy Arceri wrote:

On 20/10/17 04:21, Ilia Mirkin wrote:
On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin  
wrote:

On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:

On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:

Will this work with SSO shaders? Presumably the validation still has
to happen, but I don't think cross_validate_outputs_to_inputs() will
end up getting called.


The piglit tests I mention use SSO so it seems to be working for this.
See for example:

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
block-member-location-overlap.shader_test


Ah great. I'm a little curious how, since I don't see how
cross_validate_outputs_to_inputs would get called for SSO shaders.
Perhaps I'm looking at old code.

Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
- can you try with that? It's just using a pipeline, but both shaders
are ending up in it.


BTW, my solution to all this was

https://patchwork.freedesktop.org/patch/175959/

but Tim hated it, and I didn't have the time to properly respond.


Hate is a strong word, the problem is it duplicated some of the 
checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The 
checks should be pulled into a helper/helpers that can also be used by 
SSO.


Oh and your patch was also missing all the component checking logic 
which we also should be doing for SSO. Moving the checks into helpers 
will give us these check for free.


Thinking about it some more, I don't see how your patch works for SSO.

The only place you can validate the SSO pipeline is via 
_mesa_validate_program_pipeline() since we can't know what the other 
stages will be at link time.









   -ilia


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri




On 20/10/17 08:19, Timothy Arceri wrote:

On 20/10/17 04:21, Ilia Mirkin wrote:
On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin  
wrote:

On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:

On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:

Will this work with SSO shaders? Presumably the validation still has
to happen, but I don't think cross_validate_outputs_to_inputs() will
end up getting called.


The piglit tests I mention use SSO so it seems to be working for this.
See for example:

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
block-member-location-overlap.shader_test


Ah great. I'm a little curious how, since I don't see how
cross_validate_outputs_to_inputs would get called for SSO shaders.
Perhaps I'm looking at old code.

Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
- can you try with that? It's just using a pipeline, but both shaders
are ending up in it.


BTW, my solution to all this was

https://patchwork.freedesktop.org/patch/175959/

but Tim hated it, and I didn't have the time to properly respond.


Hate is a strong word, the problem is it duplicated some of the 
checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The 
checks should be pulled into a helper/helpers that can also be used by SSO.


Oh and your patch was also missing all the component checking logic 
which we also should be doing for SSO. Moving the checks into helpers 
will give us these check for free.







   -ilia


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Ilia Mirkin

On Thu, Oct 19, 2017 at 5:19 PM, Timothy Arceri  wrote:
> On 20/10/17 04:21, Ilia Mirkin wrote:
>>
>> On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin 
>> wrote:
>>>
>>> On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:

 On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:
>
> Will this work with SSO shaders? Presumably the validation still has
> to happen, but I don't think cross_validate_outputs_to_inputs() will
> end up getting called.


 The piglit tests I mention use SSO so it seems to be working for this.
 See for example:

 tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
 block-member-location-overlap.shader_test
>>>
>>>
>>> Ah great. I'm a little curious how, since I don't see how
>>> cross_validate_outputs_to_inputs would get called for SSO shaders.
>>> Perhaps I'm looking at old code.
>>>
>>> Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
>>> - can you try with that? It's just using a pipeline, but both shaders
>>> are ending up in it.
>>
>>
>> BTW, my solution to all this was
>>
>> https://patchwork.freedesktop.org/patch/175959/
>>
>> but Tim hated it, and I didn't have the time to properly respond.
>
>
> Hate is a strong word, the problem is it duplicated some of the checks/logic
> in cross_validate_outputs_to_inputs() unnecessarily. The checks should be
> pulled into a helper/helpers that can also be used by SSO.

Once I looked at the other code, I hated the duplication too - they do
look sadly similar. But fixing it seemed difficult, since they were
counting slightly different things, and I didn't have time to
investigate in depth (and continue to not have that time).

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Timothy Arceri


On 20/10/17 04:21, Ilia Mirkin wrote:

On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin  wrote:

On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:

On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:

Will this work with SSO shaders? Presumably the validation still has
to happen, but I don't think cross_validate_outputs_to_inputs() will
end up getting called.


The piglit tests I mention use SSO so it seems to be working for this.
See for example:

tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
block-member-location-overlap.shader_test


Ah great. I'm a little curious how, since I don't see how
cross_validate_outputs_to_inputs would get called for SSO shaders.
Perhaps I'm looking at old code.

Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
- can you try with that? It's just using a pipeline, but both shaders
are ending up in it.


BTW, my solution to all this was

https://patchwork.freedesktop.org/patch/175959/

but Tim hated it, and I didn't have the time to properly respond.


Hate is a strong word, the problem is it duplicated some of the 
checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The 
checks should be pulled into a helper/helpers that can also be used by SSO.




   -ilia


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 1/8] egl: add dri2_egl_surface_free_outdated_buffers_and_update_size() helper (v2)

2017-10-19 Thread Gurchetan Singh

De-duplicating and then trimming down works for me.

On Thu, Oct 19, 2017 at 3:31 AM, Emil Velikov 
wrote:

> On 18 October 2017 at 23:36, Gurchetan Singh
>  wrote:
> >> Then again, I'd suggest keeping that as separate series. These patches
> >> started as a way to minimise the duplication we have in drivers/dri2.
> >
> > I'm fine with dri2_$action_$object.  We can modify the existing functions
> > later, but I recommend adopting more concise conventions in this
> patchset,
> > i.e:
> >
> > dri2_egl_surface_record_buffers_and_update_back_buffer -->
> > dri2_set_back_buffer_surface
> > dri2_egl_surface_free_outdated_buffers_and_update_size -->
> > dri2_fixup_surface
> > dri2_egl_surface_update_buffer_age --> dri2_update_age_surface
> > dri2_egl_surface_get_image_front --> dri2_get_front_image_surface
> >
> Sure thing, let's use consistent names with this series.
>
> >> goal the series is to a) remove a handful of the ifdef spaghetti and
> >
> > I agree, struct dri2_egl_surface can be refactored. I would advocate a
> > solution where the surface (a) has everything a platform needs but
> nothing
> > else (b) has a minimal amount of duplication.  I would like to look at
> the
> > struct and see if it defines buffers[5], it must mean the platform
> > implements get_buffers_with_format for example.  If a platform doesn't
> > define color_buffers, it means EXT_buffer_age is not used for whatever
> > reason.  Everything has dri_image_front -- then everything must use the
> > image extension.  I think this type of self-consistency is useful, from a
> > code is documentation point of view.  Here's pseudo-code of what I would
> > want:
> >
> I agreed with the goals but I would swap the order/priority -
> de-duplicate, then trim down.
> By de-duplicating/refactoring one could add support for X/Y fairly
> easily. Once all that is done, we can polish ;-)
>
> I fear that otherwise there will be a lot of unnecessary churn.
>
> Thanks
> Emil
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 5/5] etnaviv: fix implicit conversion warning

2017-10-19 Thread Christian Gmeiner

Galliums query_type used in APIs is unsigned.

Signed-off-by: Christian Gmeiner 
Reviewed-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/etnaviv/etnaviv_query.h| 2 +-
 src/gallium/drivers/etnaviv/etnaviv_query_sw.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_query.h 
b/src/gallium/drivers/etnaviv/etnaviv_query.h
index e099e10f7c..8927266057 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_query.h
+++ b/src/gallium/drivers/etnaviv/etnaviv_query.h
@@ -44,7 +44,7 @@ struct etna_query_funcs {
 struct etna_query {
const struct etna_query_funcs *funcs;
bool active;
-   int type;
+   unsigned type;
 };
 
 static inline struct etna_query *
diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_sw.c 
b/src/gallium/drivers/etnaviv/etnaviv_query_sw.c
index ea79467b61..dd9bac3850 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_query_sw.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_query_sw.c
@@ -42,7 +42,7 @@ etna_sw_destroy_query(struct etna_context *ctx, struct 
etna_query *q)
 }
 
 static uint64_t
-read_counter(struct etna_context *ctx, int type)
+read_counter(struct etna_context *ctx, unsigned type)
 {
switch (type) {
case PIPE_QUERY_PRIMITIVES_EMITTED:
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/5] etnaviv: add support for occlusion queries

2017-10-19 Thread Christian Gmeiner

Passes most occlusion query piglits. The following piglits are broken:
- spec@arb_occlusion_query@occlusion_query_meta_fragments
- spec@arb_occlusion_query@occlusion_query_meta_save
- spec@arb_occlusion_query2@render

v1 -> v2:
 - use one sample provider for all occlusion queries tyes
 - add comment about 'magic' value 0x1DF5E76

Signed-off-by: Christian Gmeiner 
---
 src/gallium/drivers/etnaviv/etnaviv_query_hw.c | 78 ++
 1 file changed, 78 insertions(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c 
b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c
index a1dead335c..0f3cd7257b 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c
@@ -30,9 +30,73 @@
 #include "util/u_memory.h"
 
 #include "etnaviv_context.h"
+#include "etnaviv_debug.h"
+#include "etnaviv_emit.h"
 #include "etnaviv_query_hw.h"
 #include "etnaviv_screen.h"
 
+/*
+ * Occlusion Query:
+ *
+ * OCCLUSION_COUNTER and OCCLUSION_PREDICATE differ only in how they
+ * interpret results
+ */
+
+static void
+occlusion_start(struct etna_hw_query *hq, struct etna_context *ctx)
+{
+   struct etna_resource *rsc = etna_resource(hq->prsc);
+   struct etna_reloc r = {
+  .bo = rsc->bo,
+  .flags = ETNA_RELOC_WRITE
+   };
+
+   if (hq->samples > 63) {
+  hq->samples = 63;
+  BUG("samples overflow");
+   }
+
+   r.offset = hq->samples * 8; /* 64bit value */
+
+   etna_set_state_reloc(ctx->stream, VIVS_GL_OCCLUSION_QUERY_ADDR, );
+}
+
+static void
+occlusion_stop(struct etna_hw_query *hq, struct etna_context *ctx)
+{
+   /* 0x1DF5E76 is the value used by blob - but any random value will work */
+   etna_set_state(ctx->stream, VIVS_GL_OCCLUSION_QUERY_CONTROL, 0x1DF5E76);
+}
+
+static void
+occlusion_suspend(struct etna_hw_query *hq, struct etna_context *ctx)
+{
+   occlusion_stop(hq, ctx);
+}
+
+static void
+occlusion_resume(struct etna_hw_query *hq, struct etna_context *ctx)
+{
+   hq->samples++;
+   occlusion_start(hq, ctx);
+}
+
+static void
+occlusion_result(struct etna_hw_query *hq, void *buf,
+ union pipe_query_result *result)
+{
+   uint64_t sum = 0;
+   uint64_t *ptr = (uint64_t *)buf;
+
+   for (unsigned i = 0; i <= hq->samples; i++)
+  sum += *(ptr + i);
+
+   if (hq->base.type == PIPE_QUERY_OCCLUSION_COUNTER)
+  result->u64 = sum;
+   else
+  result->b = !!sum;
+}
+
 static void
 etna_hw_destroy_query(struct etna_context *ctx, struct etna_query *q)
 {
@@ -44,6 +108,14 @@ etna_hw_destroy_query(struct etna_context *ctx, struct 
etna_query *q)
FREE(hq);
 }
 
+static const struct etna_hw_sample_provider occlusion_provider = {
+   .start = occlusion_start,
+   .stop = occlusion_stop,
+   .suspend = occlusion_suspend,
+   .resume = occlusion_resume,
+   .result = occlusion_result,
+};
+
 static void
 realloc_query_bo(struct etna_context *ctx, struct etna_hw_query *hq)
 {
@@ -153,6 +225,12 @@ static inline const struct etna_hw_sample_provider *
 query_sample_provider(unsigned query_type)
 {
switch (query_type) {
+   case PIPE_QUERY_OCCLUSION_COUNTER:
+  /* fallthrough */
+   case PIPE_QUERY_OCCLUSION_PREDICATE:
+  /* fallthrough */
+   case PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE:
+  return _provider;
default:
   return NULL;
}
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 4/5] etnaviv: enable occlusion query if GPU supports it

2017-10-19 Thread Christian Gmeiner

Signed-off-by: Christian Gmeiner 
Reviewed-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 738605a4f3..009bc73c14 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -319,8 +319,9 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 
/* Timer queries. */
case PIPE_CAP_QUERY_TIME_ELAPSED:
-   case PIPE_CAP_OCCLUSION_QUERY:
   return 0;
+   case PIPE_CAP_OCCLUSION_QUERY:
+  return VIV_FEATURE(screen, chipMinorFeatures1, HALTI0);
case PIPE_CAP_QUERY_TIMESTAMP:
   return 1;
case PIPE_CAP_QUERY_PIPELINE_STATISTICS:
-- 
2.13.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/5] etnaviv: update headers from rnndb

2017-10-19 Thread Christian Gmeiner

Update to etna_viv commit 6c9c706.

Signed-off-by: Christian Gmeiner 
Reviewed-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/etnaviv/hw/cmdstream.xml.h |  36 ++-
 src/gallium/drivers/etnaviv/hw/common.xml.h| 117 
 src/gallium/drivers/etnaviv/hw/isa.xml.h   |   4 +-
 src/gallium/drivers/etnaviv/hw/state.xml.h | 197 --
 src/gallium/drivers/etnaviv/hw/state_3d.xml.h  | 357 -
 5 files changed, 622 insertions(+), 89 deletions(-)

diff --git a/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h 
b/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h
index f8d76b0105..e12188ea52 100644
--- a/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h
+++ b/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h
@@ -8,9 +8,9 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng
 git clone git://0x04.net/rules-ng-ng
 
 The rules-ng-ng source files this header was generated from are:
-- cmdstream.xml (  15289 bytes, from 2017-09-29 11:52:39)
-- copyright.xml (   1597 bytes, from 2016-12-08 16:37:56)
-- common.xml(  23529 bytes, from 2017-09-29 11:52:39)
+- cmdstream.xml (  16595 bytes, from 2017-10-05 21:20:32)
+- copyright.xml (   1597 bytes, from 2016-11-13 13:46:17)
+- common.xml(  26135 bytes, from 2017-10-05 21:20:32)
 
 Copyright (C) 2012-2017 by the following authors:
 - Wladimir J. van der Laan 
@@ -52,6 +52,8 @@ DEALINGS IN THE SOFTWARE.
 #define FE_OPCODE_RETURN   0x000b
 #define FE_OPCODE_DRAW_INSTANCED   0x000c
 #define FE_OPCODE_CHIP_SELECT  0x000d
+#define FE_OPCODE_WAIT_FENCE   0x000f
+#define FE_OPCODE_SNAP_PAGES   0x0013
 #define PRIMITIVE_TYPE_POINTS  0x0001
 #define PRIMITIVE_TYPE_LINES   0x0002
 #define PRIMITIVE_TYPE_LINE_STRIP  0x0003
@@ -192,6 +194,9 @@ DEALINGS IN THE SOFTWARE.
 #define VIV_FE_STALL_TOKEN_TO__MASK0x1f00
 #define VIV_FE_STALL_TOKEN_TO__SHIFT   8
 #define VIV_FE_STALL_TOKEN_TO(x)   (((x) << 
VIV_FE_STALL_TOKEN_TO__SHIFT) & VIV_FE_STALL_TOKEN_TO__MASK)
+#define VIV_FE_STALL_TOKEN_UNK28__MASK 0x3000
+#define VIV_FE_STALL_TOKEN_UNK28__SHIFT28
+#define VIV_FE_STALL_TOKEN_UNK28(x)(((x) << 
VIV_FE_STALL_TOKEN_UNK28__SHIFT) & VIV_FE_STALL_TOKEN_UNK28__MASK)
 
 #define VIV_FE_CALL0x
 
@@ -266,5 +271,30 @@ DEALINGS IN THE SOFTWARE.
 #define VIV_FE_DRAW_INSTANCED_START_INDEX__SHIFT   0
 #define VIV_FE_DRAW_INSTANCED_START_INDEX(x)   (((x) << 
VIV_FE_DRAW_INSTANCED_START_INDEX__SHIFT) & 
VIV_FE_DRAW_INSTANCED_START_INDEX__MASK)
 
+#define VIV_FE_WAIT_FENCE  0x
+
+#define VIV_FE_WAIT_FENCE_HEADER   0x
+#define VIV_FE_WAIT_FENCE_HEADER_OP__MASK  0xf800
+#define VIV_FE_WAIT_FENCE_HEADER_OP__SHIFT 27
+#define VIV_FE_WAIT_FENCE_HEADER_OP_WAIT_FENCE 0x7800
+#define VIV_FE_WAIT_FENCE_HEADER_UNK16__MASK   0x0003
+#define VIV_FE_WAIT_FENCE_HEADER_UNK16__SHIFT  16
+#define VIV_FE_WAIT_FENCE_HEADER_UNK16(x)  (((x) << 
VIV_FE_WAIT_FENCE_HEADER_UNK16__SHIFT) & VIV_FE_WAIT_FENCE_HEADER_UNK16__MASK)
+#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__MASK   0x
+#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__SHIFT  0
+#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT(x)  (((x) << 
VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__SHIFT) & 
VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__MASK)
+
+#define VIV_FE_WAIT_FENCE_ADDRESS  0x0004
+
+#define VIV_FE_SNAP_PAGES  0x
+
+#define VIV_FE_SNAP_PAGES_HEADER   0x
+#define VIV_FE_SNAP_PAGES_HEADER_OP__MASK  0xf800
+#define VIV_FE_SNAP_PAGES_HEADER_OP__SHIFT 27
+#define VIV_FE_SNAP_PAGES_HEADER_OP_SNAP_PAGES 0x9800
+#define VIV_FE_SNAP_PAGES_HEADER_UNK0__MASK0x001f
+#define VIV_FE_SNAP_PAGES_HEADER_UNK0__SHIFT   0
+#define VIV_FE_SNAP_PAGES_HEADER_UNK0(x)   (((x) << 
VIV_FE_SNAP_PAGES_HEADER_UNK0__SHIFT) & VIV_FE_SNAP_PAGES_HEADER_UNK0__MASK)
+
 
 #endif /* CMDSTREAM_XML */
diff --git a/src/gallium/drivers/etnaviv/hw/common.xml.h 
b/src/gallium/drivers/etnaviv/hw/common.xml.h
index 85c4990b61..57369d741d 100644
--- a/src/gallium/drivers/etnaviv/hw/common.xml.h
+++

[Mesa-dev] [PATCH v2 2/5] etnaviv: add basic infrastructure for hw queries

2017-10-19 Thread Christian Gmeiner

No hardware query is supported yet.

v1 -> v2
 - removed query_type from strcut etna_hw_sample_provider

Signed-off-by: Christian Gmeiner 
---
 src/gallium/drivers/etnaviv/Makefile.sources   |   2 +
 src/gallium/drivers/etnaviv/etnaviv_context.c  |  11 ++
 src/gallium/drivers/etnaviv/etnaviv_context.h  |   3 +
 src/gallium/drivers/etnaviv/etnaviv_query.c|   3 +
 src/gallium/drivers/etnaviv/etnaviv_query_hw.c | 185 +
 src/gallium/drivers/etnaviv/etnaviv_query_hw.h |  88 
 6 files changed, 292 insertions(+)
 create mode 100644 src/gallium/drivers/etnaviv/etnaviv_query_hw.c
 create mode 100644 src/gallium/drivers/etnaviv/etnaviv_query_hw.h

diff --git a/src/gallium/drivers/etnaviv/Makefile.sources 
b/src/gallium/drivers/etnaviv/Makefile.sources
index 60275c949d..ea8df807f6 100644
--- a/src/gallium/drivers/etnaviv/Makefile.sources
+++ b/src/gallium/drivers/etnaviv/Makefile.sources
@@ -27,6 +27,8 @@ C_SOURCES :=  \
etnaviv_internal.h \
etnaviv_query.c \
etnaviv_query.h \
+   etnaviv_query_hw.c \
+   etnaviv_query_hw.h \
etnaviv_query_sw.c \
etnaviv_query_sw.h \
etnaviv_rasterizer.c \
diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c 
b/src/gallium/drivers/etnaviv/etnaviv_context.c
index 67aab6a68c..65c20d2f83 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_context.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
@@ -34,6 +34,7 @@
 #include "etnaviv_emit.h"
 #include "etnaviv_fence.h"
 #include "etnaviv_query.h"
+#include "etnaviv_query_hw.h"
 #include "etnaviv_rasterizer.h"
 #include "etnaviv_screen.h"
 #include "etnaviv_shader.h"
@@ -260,6 +261,9 @@ etna_draw_vbo(struct pipe_context *pctx, const struct 
pipe_draw_info *info)
   if (ctx->sampler_view[i])
  resource_read(ctx, ctx->sampler_view[i]->texture);
 
+   list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node)
+  resource_written(ctx, hq->prsc);
+
ctx->stats.prims_emitted += u_reduced_prims_for_vertices(info->mode, 
info->count);
ctx->stats.draw_calls++;
 
@@ -299,10 +303,16 @@ etna_flush(struct pipe_context *pctx, struct 
pipe_fence_handle **fence,
struct etna_context *ctx = etna_context(pctx);
int out_fence_fd = -1;
 
+   list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node)
+  etna_hw_query_suspend(hq, ctx);
+
etna_cmd_stream_flush2(ctx->stream, ctx->in_fence_fd,
  (flags & PIPE_FLUSH_FENCE_FD) ? _fence_fd :
  NULL);
 
+   list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node)
+  etna_hw_query_resume(hq, ctx);
+
if (fence)
   *fence = etna_fence_create(pctx, out_fence_fd);
 }
@@ -436,6 +446,7 @@ etna_context_create(struct pipe_screen *pscreen, void 
*priv, unsigned flags)
   goto fail;
 
slab_create_child(>transfer_pool, >transfer_pool);
+   list_inithead(>active_hw_queries);
 
return pctx;
 
diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.h 
b/src/gallium/drivers/etnaviv/etnaviv_context.h
index bf2b265f5e..2903e0963d 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_context.h
+++ b/src/gallium/drivers/etnaviv/etnaviv_context.h
@@ -180,6 +180,9 @@ struct etna_context {
 
struct pipe_debug_callback debug;
int in_fence_fd;
+
+   /* list of active hardware queries */
+   struct list_head active_hw_queries;
 };
 
 static inline struct etna_context *
diff --git a/src/gallium/drivers/etnaviv/etnaviv_query.c 
b/src/gallium/drivers/etnaviv/etnaviv_query.c
index a416a7cb0f..9e897cd75a 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_query.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_query.c
@@ -30,6 +30,7 @@
 
 #include "etnaviv_context.h"
 #include "etnaviv_query.h"
+#include "etnaviv_query_hw.h"
 #include "etnaviv_query_sw.h"
 
 static struct pipe_query *
@@ -40,6 +41,8 @@ etna_create_query(struct pipe_context *pctx, unsigned 
query_type,
struct etna_query *q;
 
q = etna_sw_create_query(ctx, query_type);
+   if (!q)
+  q = etna_hw_create_query(ctx, query_type);
 
return (struct pipe_query *)q;
 }
diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c 
b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c
new file mode 100644
index 00..a1dead335c
--- /dev/null
+++ b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c
@@ -0,0 +1,185 @@
+/*
+ * Copyright (c) 2017 Etnaviv Project
+ * Copyright (C) 2017 Zodiac Inflight Innovations
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sub license,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this

Re: [Mesa-dev] [PATCH 4/4] meson: Add support for EGL glvnd

2017-10-19 Thread Lyude Paul

Works just fine for me, and patch looks good.

Reviewed-by: Lyude Paul 

On Wed, 2017-10-18 at 16:55 -0700, Dylan Baker wrote:
> Signed-off-by: Dylan Baker 
> ---
>  src/egl/meson.build | 46 --
>  1 file changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/src/egl/meson.build b/src/egl/meson.build
> index ade6810bf91..8ea8a5bbb69 100644
> --- a/src/egl/meson.build
> +++ b/src/egl/meson.build
> @@ -70,6 +70,34 @@ linux_dmabuf_unstable_v1_client_protocol_h =
> custom_target(
>command : [prog_wl_scanner, 'client-header', '@INPUT@', '@OUTPUT@'],
>  )
>  
> +g_egldispatchstubs_c = custom_target(
> +  'g_egldispatchstubs.c',
> +  input : [
> +'generate/gen_egl_dispatch.py', 'generate/eglFunctionList.py',
> +'generate/egl.xml', 'generate/egl_other.xml'
> +  ],
> +  output : 'g_egldispatchstubs.c',
> +  command : [
> +prog_python2, '@INPUT0@', 'source', '@INPUT1@', '@INPUT2@', '@INPUT3@'
> +  ],
> +  depend_files : files('generate/genCommon.py'),
> +  capture : true,
> +)
> +
> +g_egldispatchstubs_h = custom_target(
> +  'g_egldispatchstubs.h',
> +  input : [
> +'generate/gen_egl_dispatch.py', 'generate/eglFunctionList.py',
> +'generate/egl.xml', 'generate/egl_other.xml'
> +  ],
> +  output : 'g_egldispatchstubs.h',
> +  command : [
> +prog_python2, '@INPUT0@', 'header', '@INPUT1@', '@INPUT2@', '@INPUT3@'
> +  ],
> +  depend_files : files('generate/genCommon.py'),
> +  capture : true,
> +)
> +
>  if with_platform_x11
>files_egl += files('drivers/dri2/platform_x11.c')
>if with_dri3
> @@ -107,8 +135,22 @@ if cc.has_function('mincore')
>c_args_for_egl += '-DHAVE_MINCORE'
>  endif
>  
> +if not with_glvnd
> +  egl_lib_name = 'EGL'
> +  egl_lib_version = '1.0.0'
> +else
> +  egl_lib_name = 'EGL_mesa'
> +  egl_lib_version = '0'
> +  files_egl += [g_egldispatchstubs_h, g_egldispatchstubs_c]
> +  files_egl += files('main/eglglvnd.c', 'main/egldispatchstubs.c')
> +  install_data(
> +'main/50_mesa.json',
> +install_dir : join_paths(get_option('datadir'), 'glvnd',
> 'egl_vendor.d')
> +  )
> +endif
> +
>  libegl = shared_library(
> -  'EGL',
> +  egl_lib_name,
>files_egl,
>c_args : [
>  c_vis_args,
> @@ -125,7 +167,7 @@ libegl = shared_library(
>link_args : [ld_args_bsymbolic, ld_args_gc_sections],
>dependencies : [deps_for_egl, dep_dl, dep_libdrm, dep_clock, dep_thread],
>install : true,
> -  version : '1.0.0',
> +  version : egl_lib_version,
>  )
>  
>  pkg.generate(
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] etnaviv: add support for occlusion queries

2017-10-19 Thread Wladimir J. van der Laan

> There is one difference - how the sum is interpreted - uint64_t vs. bool
> value. In general the code

Ok in that case it's ok like this, just looked like unnecessary/accidental
duplication.

Regards,
Wladimir
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT

2017-10-19 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

On Thu, Oct 19, 2017 at 8:54 PM, Fredrik Höglund  wrote:
> The Vulkan specification says:
>
>"... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
> PIPE_BIT in the source stage mask will effectively not wait for
> any prior commands to complete."
>
> Signed-off-by: Fredrik Höglund 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 147235006fa..4b9d49cd2bd 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1813,8 +1813,7 @@ static void radv_stage_flush(struct radv_cmd_buffer 
> *cmd_buffer,
>   VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT |
>   VK_PIPELINE_STAGE_ALL_COMMANDS_BIT)) {
> cmd_buffer->state.flush_bits |= 
> RADV_CMD_FLAG_PS_PARTIAL_FLUSH;
> -   } else if (src_stage_mask & (VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT |
> -VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
> +   } else if (src_stage_mask & (VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
>  VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
>  VK_PIPELINE_STAGE_VERTEX_SHADER_BIT)) {
> cmd_buffer->state.flush_bits |= 
> RADV_CMD_FLAG_VS_PARTIAL_FLUSH;
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] radv: be smarter with descriptors when emitting secondary buffers

2017-10-19 Thread Bas Nieuwenhuizen

On Wed, Oct 11, 2017 at 5:18 PM, Samuel Pitoiset
 wrote:
> If the secondary buffers don't use any descriptors we don't
> have to re-emit the ones from the primary command buffer.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 22637950c4..69ca16b52d 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2615,6 +2615,29 @@ void radv_CmdSetStencilReference(
> cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE;
>  }
>
> +static bool
> +radv_cmd_buffer_has_descriptors(struct radv_cmd_buffer *cmd_buffer)
> +{
> +   struct radv_cmd_state *state = _buffer->state;
> +   int i;
> +
> +   for (i = 0; i < MAX_SETS; i++) {
> +   if (cmd_buffer->state.descriptors[i])
> +   return true;
> +   }
This does not work, because of the meta restore stuff resetting to NULL.

> +
> +   if ((state->pipeline &&
> +state->pipeline->need_indirect_descriptor_sets) ||
> +   (state->compute_pipeline &&
> +state->compute_pipeline->need_indirect_descriptor_sets))
> +   return true;

This seems unnecessary to me here.

> +
> +   if (cmd_buffer->push_descriptors.capacity > 0)
> +   return true;

This as well, we have a separate descriptor set per cmd buffer, so
what is in there does not matter.

Furthermore, if the parent command buffer and child command buffer
have different emitted pipelines the user SGPR assignment is different
and we need a flush too.


> +
> +   return false;
> +}
> +
>  void radv_CmdExecuteCommands(
> VkCommandBuffer commandBuffer,
> uint32_tcommandBufferCount,
> @@ -2675,6 +2698,11 @@ void radv_CmdExecuteCommands(
> secondary->state.last_primitive_reset_en;
> }
>
> +   /* Only re-emit all descriptors when needed. */
> +   if (radv_cmd_buffer_has_descriptors(secondary)) {
> +   radv_mark_descriptor_sets_dirty(primary);
> +   }
> +
> primary->state.last_primitive_reset_index = 
> secondary->state.last_primitive_reset_index;
> primary->state.last_ia_multi_vgt_param = 
> secondary->state.last_ia_multi_vgt_param;
> }
> @@ -2684,7 +2712,6 @@ void radv_CmdExecuteCommands(
>  */
> primary->state.dirty |= RADV_CMD_DIRTY_PIPELINE;
> primary->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_ALL;
> -   radv_mark_descriptor_sets_dirty(primary);
>  }
>
>  VkResult radv_CreateCommandPool(
> --
> 2.14.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 8/9] radv: add radv_emit_shaders_prefetch()

2017-10-19 Thread Bas Nieuwenhuizen

On Tue, Oct 17, 2017 at 11:03 AM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 38 ++
>  1 file changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index ec4e34966c..e72ef5ffb7 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -606,6 +606,30 @@ radv_emit_shader_prefetch(struct radv_cmd_buffer 
> *cmd_buffer,
> si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
>  }
>
> +static void
> +radv_emit_shaders_prefetch(struct radv_cmd_buffer *cmd_buffer,
> +  struct radv_pipeline *pipeline)
> +{
> +   radv_emit_shader_prefetch(cmd_buffer,
> + pipeline->shaders[MESA_SHADER_VERTEX]);

Do this per stage instead of grouping, and also check the vertex
shader for NULL. Consequence of the merged shaders on vega.

> +
> +   if (pipeline->shaders[MESA_SHADER_TESS_EVAL]) {
> +   radv_emit_shader_prefetch(cmd_buffer,
> + 
> pipeline->shaders[MESA_SHADER_TESS_CTRL]);
> +   radv_emit_shader_prefetch(cmd_buffer,
> + 
> pipeline->shaders[MESA_SHADER_TESS_EVAL]);
> +   }
> +
> +   if (pipeline->shaders[MESA_SHADER_GEOMETRY]) {
> +   radv_emit_shader_prefetch(cmd_buffer,
> + 
> pipeline->shaders[MESA_SHADER_GEOMETRY]);
> +   radv_emit_shader_prefetch(cmd_buffer, 
> pipeline->gs_copy_shader);
> +   }
> +
> +   radv_emit_shader_prefetch(cmd_buffer,
> + pipeline->shaders[MESA_SHADER_FRAGMENT]);
> +}
> +
>  static void
>  radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
> struct radv_pipeline *pipeline,
> @@ -615,8 +639,6 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
> uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
> unsigned export_count;
>
> -   radv_emit_shader_prefetch(cmd_buffer, shader);
> -
> export_count = MAX2(1, outinfo->param_exports);
> radeon_set_context_reg(cmd_buffer->cs, R_0286C4_SPI_VS_OUT_CONFIG,
>S_0286C4_VS_EXPORT_COUNT(export_count - 1));
> @@ -662,8 +684,6 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer,
>  {
> uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
>
> -   radv_emit_shader_prefetch(cmd_buffer, shader);
> -
> radeon_set_context_reg(cmd_buffer->cs, 
> R_028AAC_VGT_ESGS_RING_ITEMSIZE,
>outinfo->esgs_itemsize / 4);
> radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B320_SPI_SHADER_PGM_LO_ES, 
> 4);
> @@ -680,8 +700,6 @@ radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer,
> uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
> uint32_t rsrc2 = shader->rsrc2;
>
> -   radv_emit_shader_prefetch(cmd_buffer, shader);
> -
> radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B520_SPI_SHADER_PGM_LO_LS, 
> 2);
> radeon_emit(cmd_buffer->cs, va >> 8);
> radeon_emit(cmd_buffer->cs, va >> 40);
> @@ -702,8 +720,6 @@ radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer,
>  {
> uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset;
>
> -   radv_emit_shader_prefetch(cmd_buffer, shader);
> -
> radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B420_SPI_SHADER_PGM_LO_HS, 
> 4);
> radeon_emit(cmd_buffer->cs, va >> 8);
> radeon_emit(cmd_buffer->cs, va >> 40);
> @@ -835,8 +851,6 @@ radv_emit_geometry_shader(struct radv_cmd_buffer 
> *cmd_buffer,
>
> va = radv_buffer_get_va(gs->bo) + gs->bo_offset;
>
> -   radv_emit_shader_prefetch(cmd_buffer, gs);
> -
> radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B220_SPI_SHADER_PGM_LO_GS, 
> 4);
> radeon_emit(cmd_buffer->cs, va >> 8);
> radeon_emit(cmd_buffer->cs, va >> 40);
> @@ -875,8 +889,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer 
> *cmd_buffer,
> ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
> va = radv_buffer_get_va(ps->bo) + ps->bo_offset;
>
> -   radv_emit_shader_prefetch(cmd_buffer, ps);
> -
> radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B020_SPI_SHADER_PGM_LO_PS, 
> 4);
> radeon_emit(cmd_buffer->cs, va >> 8);
> radeon_emit(cmd_buffer->cs, va >> 40);
> @@ -953,6 +965,8 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer 
> *cmd_buffer)
> radv_emit_fragment_shader(cmd_buffer, pipeline);
> radv_emit_vgt_vertex_reuse(cmd_buffer, pipeline);
>
> +   radv_emit_shaders_prefetch(cmd_buffer, pipeline);
> +
> cmd_buffer->scratch_size_needed =
>   MAX2(cmd_buffer->scratch_size_needed,
>

[Mesa-dev] [PATCH] radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT

2017-10-19 Thread Fredrik Höglund

The Vulkan specification says:

   "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
PIPE_BIT in the source stage mask will effectively not wait for
any prior commands to complete."

Signed-off-by: Fredrik Höglund 
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 147235006fa..4b9d49cd2bd 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1813,8 +1813,7 @@ static void radv_stage_flush(struct radv_cmd_buffer 
*cmd_buffer,
  VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT |
  VK_PIPELINE_STAGE_ALL_COMMANDS_BIT)) {
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_PS_PARTIAL_FLUSH;
-   } else if (src_stage_mask & (VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT |
-VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
+   } else if (src_stage_mask & (VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
 VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
 VK_PIPELINE_STAGE_VERTEX_SHADER_BIT)) {
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_VS_PARTIAL_FLUSH;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] radv: copy indirect lowering settings from radeonsi

2017-10-19 Thread Bas Nieuwenhuizen

Reviewed-by: Bas Nieuwenhuizen 

as well.

On Thu, Oct 19, 2017 at 12:27 AM, Timothy Arceri  wrote:
> It looks the original indirect mask was probably copied from
> ANV.
>
> Sascha Willems demo results:
>
> tessellation ~4000 -> ~4200 fps
>
> V2: continue lowering local indirect due to llvm deficiencies.
>
> Cc: Alex Smith 
> ---
>  src/amd/vulkan/radv_shader.c | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 055787a705..faba0c50e9 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -238,22 +238,47 @@ radv_shader_compile_to_nir(struct radv_device *device,
> NIR_PASS_V(nir, nir_lower_constant_initializers, ~0);
> NIR_PASS_V(nir, nir_lower_system_values);
> NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays);
> }
>
> /* Vulkan uses the separate-shader linking model */
> nir->info.separate_shader = true;
>
> nir_shader_gather_info(nir, entry_point->impl);
>
> +   /* While it would be nice not to have this flag, we are constrained
> +* by the reality that LLVM 5.0 doesn't have working VGPR indexing
> +* on GFX9.
> +*/
> +   bool llvm_has_working_vgpr_indexing =
> +   device->physical_device->rad_info.chip_class <= VI;
> +
> +   /* TODO: Indirect indexing of GS inputs is unimplemented.
> +*
> +* TCS and TES load inputs directly from LDS or offchip memory, so
> +* indirect indexing is trivial.
> +*/
> nir_variable_mode indirect_mask = 0;
> -   indirect_mask |= nir_var_shader_in;
> +   if (nir->stage == MESA_SHADER_GEOMETRY ||
> +   (nir->stage != MESA_SHADER_TESS_CTRL &&
> +nir->stage != MESA_SHADER_TESS_EVAL &&
> +!llvm_has_working_vgpr_indexing)) {
> +   indirect_mask |= nir_var_shader_in;
> +   }
> +
> +   /* TODO: We shouldn't need to do this, however LLVM isn't currently
> +* smart enough to handle indirects without causing excess spilling
> +* causing the gpu to hang.
> +*
> +* See the following thread for more details of the problem:
> +* 
> https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html
> +*/
> indirect_mask |= nir_var_local;
>
> nir_lower_indirect_derefs(nir, indirect_mask);
>
> static const nir_lower_tex_options tex_options = {
>   .lower_txp = ~0,
> };
>
> nir_lower_tex(nir, _options);
>
> --
> 2.13.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-19 Thread Manasi Navare

On Wed, Oct 18, 2017 at 09:28:01PM +0200, Daniel Vetter wrote:
> On Wed, Oct 18, 2017 at 6:59 PM, Michel Dänzer  wrote:
> > On 18/10/17 12:15 PM, Nicolai Hähnle wrote:
> >> On 18.10.2017 10:10, Daniel Vetter wrote:
> >>> On Tue, Oct 17, 2017 at 09:01:52PM +0200, Nicolai Hähnle wrote:
>  On 17.10.2017 19:16, Daniel Vetter wrote:
> > On Tue, Oct 17, 2017 at 5:40 PM, Michel Dänzer 
> > wrote:
> >> On 17/10/17 05:04 PM, Daniel Vetter wrote:
> >>> On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
>  On 17/10/17 02:22 PM, Daniel Vetter wrote:
> > On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> >> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> >
> >>> Common sense suggests that there need to be two side to
> >>> FreeSync / VESA
> >>> Adaptive Sync support:
> >>>
> >>> 1. Query the display capabilities. This means querying minimum
> >>> / maximum
> >>> refresh duration, plus possibly a query for when the
> >>> earliest/latest
> >>> timing of the *next* refresh.
> >>>
> >>> 2. Signal desired present time. This means passing a target
> >>> timer value
> >>> instead of a target vblank count, e.g. something like this for
> >>> the KMS
> >>> interface:
> >>>
> >>> int drmModePageFlipTarget64(int fd, uint32_t crtc_id,
> >>> uint32_t fb_id,
> >>> uint32_t flags, void *user_data,
> >>> uint64_t target);
> >>>
> >>> + a flag to indicate whether target is the vblank count or
> >>> the
> >>> CLOCK_MONOTONIC (?) time in ns.
> >>
> >> drmModePageFlip(Target) is part of the pre-atomic KMS API, but
> >> adapative
> >> sync should probably only be supported via the atomic API,
> >> presumably
> >> via output properties.
> >
> > +1
> >
> > At least now that DC is on track to land properly, and you want
> > to do this
> > for DC-only anyway there's no reason to pimp the legacy interfaces
> > further. And atomic is soo much easier to extend.
> >
> > The big question imo is where we need to put the flag on the kms
> > side,
> > since freesync is not just about presenting earlier, but also about
> > presenting later. But for backwards compat we can't stretch the
> > refresh
> > rate by default for everyone, or clients that rely on high
> > precision
> > timestamps and regular refresh will get a bad surprise.
> 
>  The idea described above is that adaptive sync would be used for
>  flips
>  with a target timestamp. Apps which don't want to use adaptive sync
>  wouldn't set a target timestamp.
> 
> 
> > I think a boolean enable_freesync property is probably what we
> > want, which
> > enables freesync for as long as it's set.
> 
>  The question then becomes under what circumstances the property
>  is (not)
>  set. Not sure offhand this will actually solve any problem, or
>  just push
>  it somewhere else.
> >>>
> >>> I thought that's what the driconf switch is for, with a policy of
> >>> "please
> >>> schedule asap" instead of a specific timestamp.
> >>
> >> The driconf switch is just for the user's intention to use adaptive
> >> sync
> >> when possible. A property as you suggest cannot be set by the client
> >> directly, because it can't know when adaptive sync can actually be
> >> used
> >> (only when its window is fullscreen and using page flipping). So the
> >> property would have to be set by the X server/driver / Wayland
> >> compositor / ... instead. The question is whether such a property is
> >> actually needed, or whether the kernel could just enable adaptive sync
> >> when there's a flip with a target timestamp, and disable it when
> >> there's
> >> a flip without a target timestamp, or something like that.
> >
> > If your adaptive sync also supports extending the vblank beyond the
> > nominal limit, then you can't do that with a per-flip flag. Because
> > absent of a userspace requesting adaptive sync you must flip at the
> > nominal vrefresh rate. So if your userspace is a tad bit late with the
> > frame and would like to extend the frame to avoid missing a frame
> > entirely it'll be too late by the time the vblank actually gets
> > submitted. That's a bit a variation of what Ville brought up about
> > what we're going to do when the timestamp was missed by the time all
> > the depending fences signalled.
> 
>  These are very

Re: [Mesa-dev] [PATCH 3/3] radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it

2017-10-19 Thread Bas Nieuwenhuizen

r-b for the series. MAybe add a fixes tag?

On Thu, Oct 19, 2017 at 12:35 PM, Samuel Pitoiset
 wrote:
> Only on CIK and later. We should only update VGT_INDEX_TYPE but
> it seems easier to re-emit all the index buffer packets.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index b3f0ad0da7..13489b9faf 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -1770,8 +1770,17 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer 
> *cmd_buffer,
> if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_FRAMEBUFFER)
> radv_emit_framebuffer_state(cmd_buffer);
>
> -   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_INDEX_BUFFER)
> -   radv_emit_index_buffer(cmd_buffer);
> +   if (indexed_draw) {
> +   if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_INDEX_BUFFER)
> +   radv_emit_index_buffer(cmd_buffer);
> +   } else {
> +   /* On CI and later, non-indexed draws overwrite 
> VGT_INDEX_TYPE,
> +* so the state must be re-emitted before the next indexed
> +* draw.
> +*/
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class 
> >= CIK)
> +   cmd_buffer->state.dirty |= 
> RADV_CMD_DIRTY_INDEX_BUFFER;
> +   }
>
> ia_multi_vgt_param = si_get_ia_multi_vgt_param(cmd_buffer, 
> instanced_draw, indirect_draw, draw_vertex_count);
> if (cmd_buffer->state.last_ia_multi_vgt_param != ia_multi_vgt_param) {
> --
> 2.14.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radv: move DB_COUNT_CONTROL initialization to si_emit_config()

2017-10-19 Thread Bas Nieuwenhuizen

r-b

On Thu, Oct 19, 2017 at 4:25 PM, Samuel Pitoiset
 wrote:
> CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 1 -
>  src/amd/vulkan/si_cmd_buffer.c   | 5 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 2791f1e096..7cffeec482 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -2123,7 +2123,6 @@ VkResult radv_BeginCommandBuffer(
> switch (cmd_buffer->queue_family_index) {
> case RADV_QUEUE_GENERAL:
> emit_gfx_buffer_state(cmd_buffer);
> -   radv_set_db_count_control(cmd_buffer);
> break;
> case RADV_QUEUE_COMPUTE:
> si_init_compute(cmd_buffer);
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index 1e8b43d4fa..d24b63a5b5 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -528,6 +528,11 @@ si_emit_config(struct radv_physical_device 
> *physical_device,
> radeon_emit(cs, S_028A04_MIN_SIZE(radv_pack_float_12p4(0)) |
> S_028A04_MAX_SIZE(radv_pack_float_12p4(8192/2)));
>
> +   if (!physical_device->has_clear_state) {
> +   radeon_set_context_reg(cs, R_028004_DB_COUNT_CONTROL,
> +  S_028004_ZPASS_INCREMENT_DISABLE(1));
> +   }
> +
> si_emit_compute(physical_device, cs);
>  }
>
> --
> 2.14.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: don't build gallium dri target if gallium is disabled

2017-10-19 Thread Rafael Antognolli

On Thu, Oct 19, 2017 at 10:32:46AM -0700, Dylan Baker wrote:
> Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
> the megadriver install script to attempt to install drivers without any
> actual drivers being built.

Tested-by: Rafael Antognolli 

> fixes: 66f97f6640f5316b36177fd1053f0027eb6ec6cc ("meson: build radeonsi")
> Reported-by: Rafael Antognolli 
> Signed-off-by: Dylan Baker 
> ---
>  src/gallium/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/meson.build b/src/gallium/meson.build
> index a65b32c658e..97347819d60 100644
> --- a/src/gallium/meson.build
> +++ b/src/gallium/meson.build
> @@ -67,7 +67,7 @@ subdir('state_trackers/dri')
>  # TODO: virgl
>  # TODO: winsys/sw/xlib
>  # TODO: clover
> -if with_dri
> +if with_dri and with_gallium
>subdir('targets/dri')
>  endif
>  # TODO: xlib-glx
> -- 
> 2.14.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 12/12] anv: Add support for the variablePointers feature

2017-10-19 Thread Jason Ekstrand

Not to be confused with variablePointersStorageBuffer which is the
subset of VK_KHR_variable_pointers required to enable the extension.
This means we now have "full" support for variable pointers.
---
 src/intel/vulkan/anv_device.c   | 2 +-
 src/intel/vulkan/anv_pipeline.c | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index a305afe..256fc41 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -699,7 +699,7 @@ void anv_GetPhysicalDeviceFeatures2KHR(
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES_KHR: {
  VkPhysicalDeviceVariablePointerFeaturesKHR *features = (void *)ext;
  features->variablePointersStorageBuffer = true;
- features->variablePointers = false;
+ features->variablePointers = true;
  break;
   }
 
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 92995ee..a200fec 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -124,6 +124,7 @@ anv_shader_compile_to_nir(struct anv_pipeline *pipeline,
}
 
struct spirv_to_nir_options spirv_options = {
+  .lower_workgroup_access_to_offsets = true,
   .caps = {
  .float64 = device->instance->physicalDevice.info.gen >= 8,
  .int64 = device->instance->physicalDevice.info.gen >= 8,
@@ -388,10 +389,8 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
if (stage != MESA_SHADER_COMPUTE)
   NIR_PASS_V(nir, anv_nir_lower_multiview, pipeline->subpass->view_mask);
 
-   if (stage == MESA_SHADER_COMPUTE) {
-  NIR_PASS_V(nir, brw_nir_lower_cs_shared);
+   if (stage == MESA_SHADER_COMPUTE)
   prog_data->total_shared = nir->num_shared;
-   }
 
nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir));
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/12] spirv: Refactor the base case of offset_pointer_dereference

2017-10-19 Thread Jason Ekstrand

This makes us key off of !offset instead of !block_index.  It also puts
the guts inside a switch statement so that we can handle more than just
UBOs and SSBOs.
---
 src/compiler/spirv/vtn_variables.c | 38 +++---
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 0909af3..6a7cba4 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -161,24 +161,32 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b,
   idx++;
}
 
-   if (!block_index) {
+   if (!offset) {
+  /* This is the first access chain so we don't have a block index */
+  assert(!block_index);
+
   assert(base->var);
-  if (glsl_type_is_array(type->type)) {
- /* We need at least one element in the chain */
- assert(deref_chain->length >= 1);
+  switch (base->mode) {
+  case vtn_variable_mode_ubo:
+  case vtn_variable_mode_ssbo:
+ if (glsl_type_is_array(type->type)) {
+/* We need at least one element in the chain */
+assert(deref_chain->length >= 1);
+
+nir_ssa_def *desc_arr_idx =
+   vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
+block_index = vtn_variable_resource_index(b, base->var, 
desc_arr_idx);
+type = type->array_element;
+idx++;
+ } else {
+block_index = vtn_variable_resource_index(b, base->var, NULL);
+ }
+ offset = nir_imm_int(>nb, 0);
+ break;
 
- nir_ssa_def *desc_arr_idx =
-vtn_access_link_as_ssa(b, deref_chain->link[0], 1);
- block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx);
- type = type->array_element;
- idx++;
-  } else {
- block_index = vtn_variable_resource_index(b, base->var, NULL);
+  default:
+ unreachable("Invalid offset pointer mode");
   }
-
-  /* This is the first access chain so we also need an offset */
-  assert(!offset);
-  offset = nir_imm_int(>nb, 0);
}
assert(offset);
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/12] spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op

2017-10-19 Thread Jason Ekstrand

---
 src/compiler/spirv/spirv_to_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index a2dcbcf..96ecff6 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2102,7 +2102,7 @@ get_ssbo_nir_atomic_op(SpvOp opcode)
 }
 
 static nir_intrinsic_op
-get_shared_nir_atomic_op(SpvOp opcode)
+get_var_nir_atomic_op(SpvOp opcode)
 {
switch (opcode) {
case SpvOpAtomicLoad:  return nir_intrinsic_load_var;
@@ -2169,7 +2169,7 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, 
SpvOp opcode,
if (ptr->mode == vtn_variable_mode_workgroup) {
   nir_deref_var *deref = vtn_pointer_to_deref(b, ptr);
   const struct glsl_type *deref_type = nir_deref_tail(>deref)->type;
-  nir_intrinsic_op op = get_shared_nir_atomic_op(opcode);
+  nir_intrinsic_op op = get_var_nir_atomic_op(opcode);
   atomic = nir_intrinsic_instr_create(b->nb.shader, op);
   atomic->variables[0] = nir_deref_var_clone(deref, atomic);
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/12] spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index

2017-10-19 Thread Jason Ekstrand

There is no good reason why we should have the same logic repeated in
get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference.  If
we're a bit more careful about how we do things, we can just use the one
function and get rid of the other entirely.  This also makes the push
constant special case a lot more clear.
---
 src/compiler/spirv/vtn_variables.c | 56 ++
 1 file changed, 21 insertions(+), 35 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index b135b03..8272c72 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -503,45 +503,27 @@ vtn_local_store(struct vtn_builder *b, struct 
vtn_ssa_value *src,
}
 }
 
-static nir_ssa_def *
-get_vulkan_resource_index(struct vtn_builder *b, struct vtn_pointer *ptr,
-  struct vtn_type **type, unsigned *chain_idx)
-{
-   /* Push constants have no explicit binding */
-   if (ptr->mode == vtn_variable_mode_push_constant) {
-  *chain_idx = 0;
-  *type = ptr->var->type;
-  return NULL;
-   }
-
-   if (glsl_type_is_array(ptr->var->type->type)) {
-  assert(ptr->chain->length > 0);
-  nir_ssa_def *desc_array_index =
- vtn_access_link_as_ssa(b, ptr->chain->link[0], 1);
-  *chain_idx = 1;
-  *type = ptr->var->type->array_element;
-  return vtn_variable_resource_index(b, ptr->var, desc_array_index);
-   } else {
-  *chain_idx = 0;
-  *type = ptr->var->type;
-  return vtn_variable_resource_index(b, ptr->var, NULL);
-   }
-}
-
 nir_ssa_def *
 vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr,
   nir_ssa_def **index_out, unsigned *end_idx_out)
 {
-   if (ptr->offset) {
-  assert(ptr->block_index);
+   if (vtn_pointer_uses_ssa_offset(b, ptr)) {
+  if (!ptr->offset) {
+ assert(ptr->mode == vtn_variable_mode_workgroup);
+ struct vtn_access_chain chain = {
+.length = 0,
+ };
+ ptr = vtn_ssa_offset_pointer_dereference(b, ptr, );
+  }
   *index_out = ptr->block_index;
   return ptr->offset;
}
 
-   unsigned idx = 0;
-   struct vtn_type *type;
-   *index_out = get_vulkan_resource_index(b, ptr, , );
+   assert(ptr->mode == vtn_variable_mode_push_constant);
+   *index_out = NULL;
 
+   unsigned idx = 0;
+   struct vtn_type *type = ptr->var->type;
nir_ssa_def *offset = nir_imm_int(>nb, 0);
for (; idx < ptr->chain->length; idx++) {
   enum glsl_base_type base_type = glsl_get_base_type(type->type);
@@ -1894,15 +1876,19 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp 
opcode,
   const uint32_t offset = ptr->var->type->offsets[w[4]];
   const uint32_t stride = ptr->var->type->members[w[4]]->stride;
 
-  unsigned chain_idx;
-  struct vtn_type *type;
-  nir_ssa_def *index =
- get_vulkan_resource_index(b, ptr, , _idx);
+  if (!ptr->block_index) {
+ assert(ptr->mode == vtn_variable_mode_workgroup);
+ struct vtn_access_chain chain = {
+.length = 0,
+ };
+ ptr = vtn_ssa_offset_pointer_dereference(b, ptr, );
+ assert(ptr->block_index);
+  }
 
   nir_intrinsic_instr *instr =
  nir_intrinsic_instr_create(b->nb.shader,
 nir_intrinsic_get_buffer_size);
-  instr->src[0] = nir_src_for_ssa(index);
+  instr->src[0] = nir_src_for_ssa(ptr->block_index);
   nir_ssa_dest_init(>instr, >dest, 1, 32, NULL);
   nir_builder_instr_insert(>nb, >instr);
   nir_ssa_def *buf_size = >dest.ssa;
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/12] spirv: Add theoretical support for single component pointers

2017-10-19 Thread Jason Ekstrand

Up until now, all pointers have been ivec2s.  We're about to add support
for pointers to workgroup storage and those are going to be uints.
---
 src/compiler/spirv/vtn_variables.c | 30 --
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 8272c72..74204f1 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1500,7 +1500,7 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct 
vtn_pointer *ptr)
assert(ptr->ptr_type);
assert(ptr->ptr_type->type);
 
-   if (!ptr->offset || !ptr->block_index) {
+   if (!ptr->offset) {
   /* If we don't have an offset then we must be a pointer to the variable
* itself.
*/
@@ -1518,15 +1518,22 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct 
vtn_pointer *ptr)
   ptr = vtn_ssa_offset_pointer_dereference(b, ptr, );
}
 
-   assert(ptr->offset && ptr->block_index);
-   return nir_vec2(>nb, ptr->block_index, ptr->offset);
+   assert(ptr->offset);
+   if (ptr->block_index) {
+  assert(ptr->mode == vtn_variable_mode_ubo ||
+ ptr->mode == vtn_variable_mode_ssbo);
+  return nir_vec2(>nb, ptr->block_index, ptr->offset);
+   } else {
+  unreachable("Invalid pointer");
+  return ptr->offset;
+   }
 }
 
 struct vtn_pointer *
 vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa,
  struct vtn_type *ptr_type)
 {
-   assert(ssa->num_components == 2 && ssa->bit_size == 32);
+   assert(ssa->num_components <= 2 && ssa->bit_size == 32);
assert(ptr_type->base_type == vtn_base_type_pointer);
assert(ptr_type->deref->base_type != vtn_base_type_pointer);
/* This pointer type needs to have actual storage */
@@ -1537,8 +1544,19 @@ vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def 
*ssa,
  ptr_type, NULL);
ptr->type = ptr_type->deref;
ptr->ptr_type = ptr_type;
-   ptr->block_index = nir_channel(>nb, ssa, 0);
-   ptr->offset = nir_channel(>nb, ssa, 1);
+
+   if (ssa->num_components > 1) {
+  assert(ssa->num_components == 2);
+  assert(ptr->mode == vtn_variable_mode_ubo ||
+ ptr->mode == vtn_variable_mode_ssbo);
+  ptr->block_index = nir_channel(>nb, ssa, 0);
+  ptr->offset = nir_channel(>nb, ssa, 1);
+   } else {
+  assert(ssa->num_components == 1);
+  unreachable("Invalid pointer");
+  ptr->block_index = NULL;
+  ptr->offset = ssa;
+   }
 
return ptr;
 }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/12] spirv: Refactor a couple of pointer query helpers

2017-10-19 Thread Jason Ekstrand

This commit moves them both into vtn_variables.c towards the top, makes
them take a vtn_builder, and replaces a hand-rolled instance of
is_external_block with a function call.
---
 src/compiler/spirv/vtn_private.h   |  7 ---
 src/compiler/spirv/vtn_variables.c | 35 +--
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 6110b6f..63bebe3 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -372,13 +372,6 @@ struct vtn_pointer {
struct nir_ssa_def *offset;
 };
 
-static inline bool
-vtn_pointer_uses_ssa_offset(struct vtn_pointer *ptr)
-{
-   return ptr->mode == vtn_variable_mode_ubo ||
-  ptr->mode == vtn_variable_mode_ssbo;
-}
-
 struct vtn_variable {
enum vtn_variable_mode mode;
 
diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 6a7cba4..b135b03 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -57,6 +57,23 @@ vtn_access_chain_extend(struct vtn_builder *b, struct 
vtn_access_chain *old,
return chain;
 }
 
+static bool
+vtn_pointer_uses_ssa_offset(struct vtn_builder *b,
+struct vtn_pointer *ptr)
+{
+   return ptr->mode == vtn_variable_mode_ubo ||
+  ptr->mode == vtn_variable_mode_ssbo;
+}
+
+static bool
+vtn_pointer_is_external_block(struct vtn_builder *b,
+  struct vtn_pointer *ptr)
+{
+   return ptr->mode == vtn_variable_mode_ssbo ||
+  ptr->mode == vtn_variable_mode_ubo ||
+  ptr->mode == vtn_variable_mode_push_constant;
+}
+
 /* Dereference the given base pointer by the access chain */
 static struct vtn_pointer *
 vtn_access_chain_pointer_dereference(struct vtn_builder *b,
@@ -236,7 +253,7 @@ vtn_pointer_dereference(struct vtn_builder *b,
 struct vtn_pointer *base,
 struct vtn_access_chain *deref_chain)
 {
-   if (vtn_pointer_uses_ssa_offset(base)) {
+   if (vtn_pointer_uses_ssa_offset(b, base)) {
   return vtn_ssa_offset_pointer_dereference(b, base, deref_chain);
} else {
   return vtn_access_chain_pointer_dereference(b, base, deref_chain);
@@ -873,14 +890,6 @@ vtn_block_store(struct vtn_builder *b, struct 
vtn_ssa_value *src,
  0, 0, dst->chain, chain_idx, dst->type, );
 }
 
-static bool
-vtn_pointer_is_external_block(struct vtn_pointer *ptr)
-{
-   return ptr->mode == vtn_variable_mode_ssbo ||
-  ptr->mode == vtn_variable_mode_ubo ||
-  ptr->mode == vtn_variable_mode_push_constant;
-}
-
 static void
 _vtn_variable_load_store(struct vtn_builder *b, bool load,
  struct vtn_pointer *ptr,
@@ -940,7 +949,7 @@ _vtn_variable_load_store(struct vtn_builder *b, bool load,
 struct vtn_ssa_value *
 vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src)
 {
-   if (vtn_pointer_is_external_block(src)) {
+   if (vtn_pointer_is_external_block(b, src)) {
   return vtn_block_load(b, src);
} else {
   struct vtn_ssa_value *val = NULL;
@@ -953,7 +962,7 @@ void
 vtn_variable_store(struct vtn_builder *b, struct vtn_ssa_value *src,
struct vtn_pointer *dest)
 {
-   if (vtn_pointer_is_external_block(dest)) {
+   if (vtn_pointer_is_external_block(b, dest)) {
   assert(dest->mode == vtn_variable_mode_ssbo);
   vtn_block_store(b, src, dest);
} else {
@@ -1761,9 +1770,7 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,
  nir_shader_add_variable(b->shader, var->members[i]);
   }
} else {
-  assert(var->mode == vtn_variable_mode_ubo ||
- var->mode == vtn_variable_mode_ssbo ||
- var->mode == vtn_variable_mode_push_constant);
+  assert(vtn_pointer_is_external_block(b, val->pointer));
}
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 11/12] spirv: Add support for lowering workgroup access to offsets

2017-10-19 Thread Jason Ekstrand

Before, we always left workgroup variables as shared nir_variables and
let the driver call nir_lower_io.  This adds an option to do the
lowering directly in spirv_to_nir.  To do this, we implicitly assign the
variables a std430 layout and then treat them like a UBO or SSBO and
immediately lower all the way to an offset.

As a side-effect, the spirv_to_nir pass now handles variable pointers
for workgroup variables.
---
 src/compiler/spirv/nir_spirv.h |   8 +++
 src/compiler/spirv/spirv_to_nir.c  | 130 +
 src/compiler/spirv/vtn_private.h   |  17 -
 src/compiler/spirv/vtn_variables.c |  54 +--
 4 files changed, 190 insertions(+), 19 deletions(-)

diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 234b0ce..58df3e1 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -43,6 +43,14 @@ struct nir_spirv_specialization {
 };
 
 struct spirv_to_nir_options {
+   /* Whether or not to lower all workgroup variable access to offsets
+* up-front.  This means you will _shared intrinsics instead of _var
+* for workgroup data access.
+*
+* This is currently required for full variable pointers support.
+*/
+   bool lower_workgroup_access_to_offsets;
+
struct {
   bool float64;
   bool image_ms_array;
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 96ecff6..1a612ae 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -729,6 +729,64 @@ translate_image_format(SpvImageFormat format)
}
 }
 
+static struct vtn_type *
+vtn_type_layout_std430(struct vtn_builder *b, struct vtn_type *type,
+   uint32_t *size_out, uint32_t *align_out)
+{
+   switch (type->base_type) {
+   case vtn_base_type_scalar: {
+  uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
+  *size_out = comp_size;
+  *align_out = comp_size;
+  return type;
+   }
+
+   case vtn_base_type_vector: {
+  uint32_t comp_size = glsl_get_bit_size(type->type) / 8;
+  assert(type->length > 0 && type->length <= 4);
+  unsigned align_comps = type->length == 3 ? 4 : type->length;
+  *size_out = comp_size * type->length,
+  *align_out = comp_size * align_comps;
+  return type;
+   }
+
+   case vtn_base_type_matrix:
+   case vtn_base_type_array: {
+  /* We're going to add an array stride */
+  type = vtn_type_copy(b, type);
+  uint32_t elem_size, elem_align;
+  type->array_element = vtn_type_layout_std430(b, type->array_element,
+   _size, _align);
+  type->stride = vtn_align_u32(elem_size, elem_align);
+  *size_out = type->stride * type->length;
+  *align_out = elem_align;
+  return type;
+   }
+
+   case vtn_base_type_struct: {
+  /* We're going to add member offsets */
+  type = vtn_type_copy(b, type);
+  uint32_t offset = 0;
+  uint32_t align = 0;
+  for (unsigned i = 0; i < type->length; i++) {
+ uint32_t mem_size, mem_align;
+ type->members[i] = vtn_type_layout_std430(b, type->members[i],
+   _size, _align);
+ offset = vtn_align_u32(offset, mem_align);
+ type->offsets[i] = offset;
+ offset += mem_size;
+ align = MAX2(align, mem_align);
+  }
+  *size_out = offset;
+  *align_out = align;
+  return type;
+   }
+
+   default:
+  unreachable("Invalid SPIR-V type for std430");
+   }
+}
+
 static void
 vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
 const uint32_t *w, unsigned count)
@@ -878,6 +936,19 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
   */
  val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2);
   }
+
+  if (storage_class == SpvStorageClassWorkgroup &&
+  b->options->lower_workgroup_access_to_offsets) {
+ uint32_t size, align;
+ val->type->deref = vtn_type_layout_std430(b, val->type->deref,
+   , );
+ val->type->length = size;
+ val->type->align = align;
+ /* These can actually be stored to nir_variables and used as SSA
+  * values so they need a real glsl_type.
+  */
+ val->type->type = glsl_uint_type();
+  }
   break;
}
 
@@ -2102,6 +2173,32 @@ get_ssbo_nir_atomic_op(SpvOp opcode)
 }
 
 static nir_intrinsic_op
+get_shared_nir_atomic_op(SpvOp opcode)
+{
+   switch (opcode) {
+   case SpvOpAtomicLoad:  return nir_intrinsic_load_shared;
+   case SpvOpAtomicStore: return nir_intrinsic_store_shared;
+#define OP(S, N) case SpvOp##S: return nir_intrinsic_shared_##N;
+   OP(AtomicExchange, atomic_exchange)
+   OP(AtomicCompareExchange,  atomic_comp_swap)
+   OP(AtomicIIncrement,   atomic_add)
+   OP(AtomicIDecrement,   atomic_add)
+   OP(AtomicIAdd,

[Mesa-dev] [PATCH 04/12] spirv: Add a switch statement for the block store opcode

2017-10-19 Thread Jason Ekstrand

This parallels what we do for vtn_block_load except that we don't yet
support anything except SSBO loads through this path.
---
 src/compiler/spirv/vtn_variables.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 7bfac46..0909af3 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -848,11 +848,20 @@ static void
 vtn_block_store(struct vtn_builder *b, struct vtn_ssa_value *src,
 struct vtn_pointer *dst)
 {
+   nir_intrinsic_op op;
+   switch (dst->mode) {
+   case vtn_variable_mode_ssbo:
+  op = nir_intrinsic_store_ssbo;
+  break;
+   default:
+  unreachable("Invalid block variable mode");
+   }
+
nir_ssa_def *offset, *index = NULL;
unsigned chain_idx;
offset = vtn_pointer_to_offset(b, dst, , _idx);
 
-   _vtn_block_load_store(b, nir_intrinsic_store_ssbo, false, index, offset,
+   _vtn_block_load_store(b, op, false, index, offset,
  0, 0, dst->chain, chain_idx, dst->type, );
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 03/12] spirv: Use a dereference instead of vtn_variable_resource_index

2017-10-19 Thread Jason Ekstrand

This is equivalent and means we don't have resource index code scattered
about.
---
 src/compiler/spirv/vtn_variables.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index a7e6ae0..7bfac46 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1492,11 +1492,9 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct 
vtn_pointer *ptr)
assert(ptr->ptr_type);
assert(ptr->ptr_type->type);
 
-   if (ptr->offset && ptr->block_index) {
-  return nir_vec2(>nb, ptr->block_index, ptr->offset);
-   } else {
-  /* If we don't have an offset or block index, then we must be a pointer
-   * to the variable itself.
+   if (!ptr->offset || !ptr->block_index) {
+  /* If we don't have an offset then we must be a pointer to the variable
+   * itself.
*/
   assert(!ptr->offset && !ptr->block_index);
 
@@ -1506,9 +1504,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct 
vtn_pointer *ptr)
*/
   assert(ptr->var && ptr->var->type->base_type == vtn_base_type_struct);
 
-  return nir_vec2(>nb, vtn_variable_resource_index(b, ptr->var, NULL),
-  nir_imm_int(>nb, 0));
+  struct vtn_access_chain chain = {
+ .length = 0,
+  };
+  ptr = vtn_ssa_offset_pointer_dereference(b, ptr, );
}
+
+   assert(ptr->offset && ptr->block_index);
+   return nir_vec2(>nb, ptr->block_index, ptr->offset);
 }
 
 struct vtn_pointer *
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/12] spirv: Only emit functions which are actually used

2017-10-19 Thread Jason Ekstrand

Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint.  This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
---
 src/compiler/spirv/spirv_to_nir.c | 29 +
 src/compiler/spirv/vtn_cfg.c  |  2 ++
 src/compiler/spirv/vtn_private.h  |  3 +++
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index d22c3dc..e5e12ff 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -1394,8 +1394,11 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp 
opcode,
  const uint32_t *w, unsigned count)
 {
struct vtn_type *res_type = vtn_value(b, w[1], vtn_value_type_type)->type;
-   struct nir_function *callee =
-  vtn_value(b, w[3], vtn_value_type_function)->func->impl->function;
+   struct vtn_function *vtn_callee =
+  vtn_value(b, w[3], vtn_value_type_function)->func;
+   struct nir_function *callee = vtn_callee->impl->function;
+
+   vtn_callee->referenced = true;
 
nir_call_instr *call = nir_call_instr_create(b->nb.shader, callee);
for (unsigned i = 0; i < call->num_params; i++) {
@@ -3367,12 +3370,22 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
 
vtn_build_cfg(b, words, word_end);
 
-   foreach_list_typed(struct vtn_function, func, node, >functions) {
-  b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer,
-   _mesa_key_pointer_equal);
-
-  vtn_function_emit(b, func, vtn_handle_body_instruction);
-   }
+   assert(b->entry_point->value_type == vtn_value_type_function);
+   b->entry_point->func->referenced = true;
+
+   bool progress;
+   do {
+  progress = false;
+  foreach_list_typed(struct vtn_function, func, node, >functions) {
+ if (func->referenced && !func->emitted) {
+b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer,
+ _mesa_key_pointer_equal);
+
+vtn_function_emit(b, func, vtn_handle_body_instruction);
+progress = true;
+ }
+  }
+   } while (progress);
 
assert(b->entry_point->value_type == vtn_value_type_function);
nir_function *entry_point = b->entry_point->func->impl->function;
diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 13f0221..70bbccb 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -783,4 +783,6 @@ vtn_function_emit(struct vtn_builder *b, struct 
vtn_function *func,
 */
if (b->has_loop_continue)
   nir_repair_ssa_impl(func->impl);
+
+   func->emitted = true;
 }
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 728c1ff..2551e0b 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -159,6 +159,9 @@ struct vtn_block {
 struct vtn_function {
struct exec_node node;
 
+   bool referenced;
+   bool emitted;
+
nir_function_impl *impl;
struct vtn_block *start_block;
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/12] spirv: Convert the supported_extensions struct to spirv_options

2017-10-19 Thread Jason Ekstrand

This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.
---
 src/amd/vulkan/radv_shader.c  | 23 +--
 src/compiler/spirv/nir_spirv.h| 26 ++
 src/compiler/spirv/spirv_to_nir.c | 10 +-
 src/compiler/spirv/vtn_private.h  |  2 +-
 src/intel/vulkan/anv_pipeline.c   | 20 +++-
 5 files changed, 44 insertions(+), 37 deletions(-)

diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
index 055787a..7190ca6 100644
--- a/src/amd/vulkan/radv_shader.c
+++ b/src/amd/vulkan/radv_shader.c
@@ -194,19 +194,22 @@ radv_shader_compile_to_nir(struct radv_device *device,
spec_entries[i].data32 = *(const 
uint32_t *)data;
}
}
-   const struct nir_spirv_supported_extensions supported_ext = {
-   .draw_parameters = true,
-   .float64 = true,
-   .image_read_without_format = true,
-   .image_write_without_format = true,
-   .tessellation = true,
-   .int64 = true,
-   .multiview = true,
-   .variable_pointers = true,
+   const struct spirv_to_nir_options spirv_options = {
+   .caps = {
+   .draw_parameters = true,
+   .float64 = true,
+   .image_read_without_format = true,
+   .image_write_without_format = true,
+   .tessellation = true,
+   .int64 = true,
+   .multiview = true,
+   .variable_pointers = true,
+   },
};
entry_point = spirv_to_nir(spirv, module->size / 4,
   spec_entries, num_spec_entries,
-  stage, entrypoint_name, 
_ext, _options);
+  stage, entrypoint_name,
+  _options, _options);
nir = entry_point->shader;
assert(nir->stage == stage);
nir_validate_shader(nir);
diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h
index 83577fb..234b0ce 100644
--- a/src/compiler/spirv/nir_spirv.h
+++ b/src/compiler/spirv/nir_spirv.h
@@ -42,24 +42,26 @@ struct nir_spirv_specialization {
};
 };
 
-struct nir_spirv_supported_extensions {
-   bool float64;
-   bool image_ms_array;
-   bool tessellation;
-   bool draw_parameters;
-   bool image_read_without_format;
-   bool image_write_without_format;
-   bool int64;
-   bool multiview;
-   bool variable_pointers;
+struct spirv_to_nir_options {
+   struct {
+  bool float64;
+  bool image_ms_array;
+  bool tessellation;
+  bool draw_parameters;
+  bool image_read_without_format;
+  bool image_write_without_format;
+  bool int64;
+  bool multiview;
+  bool variable_pointers;
+   } caps;
 };
 
 nir_function *spirv_to_nir(const uint32_t *words, size_t word_count,
struct nir_spirv_specialization *specializations,
unsigned num_specializations,
gl_shader_stage stage, const char *entry_point_name,
-   const struct nir_spirv_supported_extensions *ext,
-   const nir_shader_compiler_options *options);
+   const struct spirv_to_nir_options *options,
+   const nir_shader_compiler_options *nir_options);
 
 #ifdef __cplusplus
 }
diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index e5e12ff..a2dcbcf 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2677,7 +2677,7 @@ stage_for_execution_model(SpvExecutionModel model)
 }
 
 #define spv_check_supported(name, cap) do {\
-  if (!(b->ext && b->ext->name))   \
+  if (!(b->options && b->options->caps.name))  \
  vtn_warn("Unsupported SPIR-V capability: %s",  \
   spirv_capability_to_string(cap)); \
} while(0)
@@ -3317,8 +3317,8 @@ nir_function *
 spirv_to_nir(const uint32_t *words, size_t word_count,
  struct nir_spirv_specialization *spec, unsigned num_spec,
  gl_shader_stage stage, const char *entry_point_name,
- const struct nir_spirv_supported_extensions *ext,
- const nir_shader_compiler_options *options)
+ const struct spirv_to_nir_options *options,
+ const nir_shader_compiler_options *nir_options)
 {
const uint32_t *word_end = words + word_count;
 
@@ -3340,7 +3340,7 @@

[Mesa-dev] [PATCH 01/12] spirv: Drop the impl field from vtn_builder

2017-10-19 Thread Jason Ekstrand

We have a nir_builder and it has an impl field.
---
 src/compiler/spirv/spirv_to_nir.c  | 9 -
 src/compiler/spirv/vtn_cfg.c   | 2 +-
 src/compiler/spirv/vtn_private.h   | 1 -
 src/compiler/spirv/vtn_variables.c | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 079ff0f..d22c3dc 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -117,7 +117,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant 
*constant,
 
  load->value = constant->values[0];
 
- nir_instr_insert_before_cf_list(>impl->body, >instr);
+ nir_instr_insert_before_cf_list(>nb.impl->body, >instr);
  val->def = >def;
   } else {
  assert(glsl_type_is_matrix(type));
@@ -133,7 +133,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant 
*constant,
 
 load->value = constant->values[i];
 
-nir_instr_insert_before_cf_list(>impl->body, >instr);
+nir_instr_insert_before_cf_list(>nb.impl->body, >instr);
 col_val->def = >def;
 
 val->elems[i] = col_val;
@@ -1410,7 +1410,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp 
opcode,
 
  /* Make a temporary to store the argument in */
  nir_variable *tmp =
-nir_local_variable_create(b->impl, arg_ssa->type, "arg_tmp");
+nir_local_variable_create(b->nb.impl, arg_ssa->type, "arg_tmp");
  call->params[i] = nir_deref_var_create(call, tmp);
 
  vtn_local_store(b, arg_ssa, call->params[i]);
@@ -1420,7 +1420,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp 
opcode,
nir_variable *out_tmp = NULL;
assert(res_type->type == callee->return_type);
if (!glsl_type_is_void(callee->return_type)) {
-  out_tmp = nir_local_variable_create(b->impl, callee->return_type,
+  out_tmp = nir_local_variable_create(b->nb.impl, callee->return_type,
   "out_tmp");
   call->return_deref = nir_deref_var_create(call, out_tmp);
}
@@ -3368,7 +3368,6 @@ spirv_to_nir(const uint32_t *words, size_t word_count,
vtn_build_cfg(b, words, word_end);
 
foreach_list_typed(struct vtn_function, func, node, >functions) {
-  b->impl = func->impl;
   b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer,
_mesa_key_pointer_equal);
 
diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index 25ff254..13f0221 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -606,7 +606,7 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head 
*cf_list,
  if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) {
 struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]);
 vtn_local_store(b, src,
-nir_deref_var_create(b, b->impl->return_var));
+nir_deref_var_create(b, b->nb.impl->return_var));
  }
 
  if (block->branch_type != vtn_branch_type_none) {
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 8458462..728c1ff 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -463,7 +463,6 @@ struct vtn_builder {
nir_builder nb;
 
nir_shader *shader;
-   nir_function_impl *impl;
const struct nir_spirv_supported_extensions *ext;
struct vtn_block *block;
 
diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 997b66f..a7e6ae0 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1731,7 +1731,7 @@ vtn_create_variable(struct vtn_builder *b, struct 
vtn_value *val,
 
if (var->mode == vtn_variable_mode_local) {
   assert(var->members == NULL && var->var != NULL);
-  nir_function_impl_add_variable(b->impl, var->var);
+  nir_function_impl_add_variable(b->nb.impl, var->var);
} else if (var->var) {
   nir_shader_add_variable(b->shader, var->var);
} else if (var->members) {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/12] anv: Add support for the variablePointers feature

2017-10-19 Thread Jason Ekstrand

Not to be confused with variablePointersStorageBuffer which is the
subset of VK_KHR_variable_pointers required to enable the extension.
This gives us "full" support for variable pointers.

The approach chosen here was to do the lowering to _shared intrinsics
directly in spirv_to_nir instead of using the _var intrinsics and using
nir_lower_io.  Pointers with a storage class of Workgroup are given an
implicit std430 layout and now go through the same offset pointer paths as
UBO and SSBO access.  The whole thing really ended up working out rather
cleanly.

There are some downsides to this approach.  One, is that we can't delete
unused shared variables post-optimization.  Also, the driver may be able to
handle better than std430.  Both of these can lead to some waisted SLM
space.  This also means that we can't do any deref-based load/store
elimination optimizations on SLM but we didn't really before so that's no
great loss; SLM is now exactly as hard to optimize as SSBOs.

Connor, Yes, I know that this is not quite the approach you were suggesting
on IRC.  I considered how we might add some sort of deref intrinsic and I
don't see a good way of doing so without rewriting large chunks of NIR.  I
think that rewrite is probably worth it some day but that day is not today.
We people asking for this feature so I really don't want to delay on a
major NIR rewrite.

Cc: Connor Abbott 
Cc: Chad Versace 
Cc: Dave Airlie 

Jason Ekstrand (12):
  spirv: Drop the impl field from vtn_builder
  spirv: Only emit functions which are actually used
  spirv: Use a dereference instead of vtn_variable_resource_index
  spirv: Add a switch statement for the block store opcode
  spirv: Refactor the base case of offset_pointer_dereference
  spirv: Convert the supported_extensions struct to spirv_options
  spirv: Refactor a couple of pointer query helpers
  spirv: Use offset_pointer_dereference to instead of
get_vulkan_resource_index
  spirv: Add theoretical support for single component pointers
  spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
  spirv: Add support for lowering workgroup access to offsets
  anv: Add support for the variablePointers feature

 src/amd/vulkan/radv_shader.c   |  23 ++--
 src/compiler/spirv/nir_spirv.h |  34 --
 src/compiler/spirv/spirv_to_nir.c  | 180 -
 src/compiler/spirv/vtn_cfg.c   |   4 +-
 src/compiler/spirv/vtn_private.h   |  30 +++--
 src/compiler/spirv/vtn_variables.c | 229 -
 src/intel/vulkan/anv_device.c  |   2 +-
 src/intel/vulkan/anv_pipeline.c|  25 ++--
 8 files changed, 372 insertions(+), 155 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 5/5] etnaviv: fix implicit conversion warning

2017-10-19 Thread Wladimir J. van der Laan

On Tue, Oct 17, 2017 at 10:38:17PM +0200, Christian Gmeiner wrote:
> Galliums query_type used in APIs is unsigned.

Reviewed-by: Wladimir J. van der Laan 

> Signed-off-by: Christian Gmeiner 
> ---
>  src/gallium/drivers/etnaviv/etnaviv_query.h| 2 +-
>  src/gallium/drivers/etnaviv/etnaviv_query_sw.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] etnaviv: add support for occlusion queries

2017-10-19 Thread Wladimir J. van der Laan

On Tue, Oct 17, 2017 at 10:38:15PM +0200, Christian Gmeiner wrote:
> Passes most occlusion query piglits. The following piglits are broken:
> - spec@arb_occlusion_query@occlusion_query_meta_fragments
> - spec@arb_occlusion_query@occlusion_query_meta_save
> - spec@arb_occlusion_query2@render
> 
> Signed-off-by: Christian Gmeiner 

Comments inline

> +static void
> +occlusion_stop(struct etna_hw_query *hq, struct etna_context *ctx)
> +{
> +   etna_set_state(ctx->stream, VIVS_GL_OCCLUSION_QUERY_CONTROL, 0x1DF5E76);

Does the actual value written matter here?
If so, it'd make sense to define a constant (or bit definitions in the rnndb).
If not it's fine like this, or add a comment that it's just a "random" nonce.

> +static const struct etna_hw_sample_provider occlusion_counter = {
> +   .query_type = PIPE_QUERY_OCCLUSION_COUNTER,
> +   .start = occlusion_start,
> +   .stop = occlusion_stop,
> +   .suspend = occlusion_suspend,
> +   .resume = occlusion_resume,
> +   .result = occlusion_counter_result,
> +};
> +
> +static const struct etna_hw_sample_provider occlusion_predicate = {
> +   .query_type = PIPE_QUERY_OCCLUSION_PREDICATE,
> +   .start = occlusion_start,
> +   .stop = occlusion_stop,
> +   .suspend = occlusion_suspend,
> +   .resume = occlusion_resume,
> +   .result = occlusion_predicate_result,
> +};
> +
> +static const struct etna_hw_sample_provider occlusion_predicate_conservative 
> = {
> +   .query_type = PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE,
> +   .start = occlusion_start,
> +   .stop = occlusion_stop,
> +   .suspend = occlusion_suspend,
> +   .resume = occlusion_resume,
> +   .result = occlusion_predicate_result,
> +};

Is it intentional that this defines the same fields three times?

Why not return the same structure for all three cases? Is this expected to 
change to
specific implementations soon in another patch?

Wladimir
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/5] etnaviv: enable occlusion query if GPU supports it

2017-10-19 Thread Wladimir J. van der Laan

On Tue, Oct 17, 2017 at 10:38:16PM +0200, Christian Gmeiner wrote:
> Signed-off-by: Christian Gmeiner 

Reviewed-by: Wladimir J. van der Laan 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_screen.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index 738605a4f3..009bc73c14 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -319,8 +319,9 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
>  
> /* Timer queries. */
> case PIPE_CAP_QUERY_TIME_ELAPSED:
> -   case PIPE_CAP_OCCLUSION_QUERY:
>return 0;
> +   case PIPE_CAP_OCCLUSION_QUERY:
> +  return VIV_FEATURE(screen, chipMinorFeatures1, HALTI0);
> case PIPE_CAP_QUERY_TIMESTAMP:
>return 1;
> case PIPE_CAP_QUERY_PIPELINE_STATISTICS:
> -- 
> 2.13.6
> 
> ___
> etnaviv mailing list
> etna...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/etnaviv
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/5] etnaviv: add basic infrastructure for hw queries

2017-10-19 Thread Wladimir J. van der Laan

On Tue, Oct 17, 2017 at 10:38:14PM +0200, Christian Gmeiner wrote:
> No hardware query is supported yet.
> 
> Signed-off-by: Christian Gmeiner 

Reviewed-by: Wladimir J. van der Laan 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask

2017-10-19 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Thu, Oct 19, 2017 at 6:40 PM, Tim Rowley  wrote:
> A number of double/int64 operations don't have matching
> read and write usage masks, which the fallthrough case of
> tgsi_util_get_inst_usage_mask assumes for componentwise
> tagged instructions.
>
> No regressions in llvmpipe piglit; fixes a large number of
> swr regressions.
> ---
>  src/gallium/auxiliary/tgsi/tgsi_util.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c 
> b/src/gallium/auxiliary/tgsi/tgsi_util.c
> index cfce59093c..afe5690ce0 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_util.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
> @@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct 
> tgsi_full_instruction *inst,
>read_mask = TGSI_WRITEMASK_XYZ;
>break;
>
> +   case TGSI_OPCODE_DSEQ:
> +   case TGSI_OPCODE_DSNE:
> +   case TGSI_OPCODE_DSLT:
> +   case TGSI_OPCODE_DSGE:
> case TGSI_OPCODE_DP4:
> case TGSI_OPCODE_PK4B:
> case TGSI_OPCODE_PK4UB:
> case TGSI_OPCODE_D2F:
> +   case TGSI_OPCODE_D2I:
> +   case TGSI_OPCODE_D2U:
> case TGSI_OPCODE_I2F:
> case TGSI_OPCODE_U2F:
> +   case TGSI_OPCODE_U64SEQ:
> +   case TGSI_OPCODE_U64SNE:
> +   case TGSI_OPCODE_U64SLT:
> +   case TGSI_OPCODE_U64SGE:
> case TGSI_OPCODE_U642F:
> +   case TGSI_OPCODE_I64SLT:
> +   case TGSI_OPCODE_I64SGE:
> case TGSI_OPCODE_I642F:
>read_mask = TGSI_WRITEMASK_XYZW;
>break;
> --
> 2.11.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] meson: don't build gallium dri target if gallium is disabled

2017-10-19 Thread Dylan Baker

Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
the megadriver install script to attempt to install drivers without any
actual drivers being built.

fixes: 66f97f6640f5316b36177fd1053f0027eb6ec6cc ("meson: build radeonsi")
Reported-by: Rafael Antognolli 
Signed-off-by: Dylan Baker 
---
 src/gallium/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/meson.build b/src/gallium/meson.build
index a65b32c658e..97347819d60 100644
--- a/src/gallium/meson.build
+++ b/src/gallium/meson.build
@@ -67,7 +67,7 @@ subdir('state_trackers/dri')
 # TODO: virgl
 # TODO: winsys/sw/xlib
 # TODO: clover
-if with_dri
+if with_dri and with_gallium
   subdir('targets/dri')
 endif
 # TODO: xlib-glx
-- 
2.14.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] docs: add documentation for building with meson

2017-10-19 Thread Dylan Baker

Quoting Emil Velikov (2017-10-19 09:44:37)
> On 18 October 2017 at 18:27, Dylan Baker  wrote:
> 
> >> > +Note that in meson this defaults to "debug", and  not setting it to
> >> > +"release" will yield non-optimal performance and binary size
> >> Ouch, can we change that?
> >
> > When I did an informal poll in the Intel cube the universal consensus was 
> > that
> > defaulting to debug was a feature. If the wider community disagrees, yes, 
> > it can be
> > changed.
> >
> The intent behind is amazing, but it subtly changes behaviour over any
> of the existing build systems.
> 
> Past experience suggests that listing such changes in release
> notes/elsewhere people do miss it.
> Thus as users get annoyed with the optimised build there'll be some
> bad noise/publicity.
> 
> Just something to keep in mind.

That is good to keep in mind.

> 
> > I'll fix up the other stuff too.
> >
> Great thanks.
> 
> -Emil


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Ilia Mirkin

On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin  wrote:
> On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:
>> On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:
>>> Will this work with SSO shaders? Presumably the validation still has
>>> to happen, but I don't think cross_validate_outputs_to_inputs() will
>>> end up getting called.
>>
>> The piglit tests I mention use SSO so it seems to be working for this.
>> See for example:
>>
>> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
>> block-member-location-overlap.shader_test
>
> Ah great. I'm a little curious how, since I don't see how
> cross_validate_outputs_to_inputs would get called for SSO shaders.
> Perhaps I'm looking at old code.
>
> Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
> - can you try with that? It's just using a pipeline, but both shaders
> are ending up in it.

BTW, my solution to all this was

https://patchwork.freedesktop.org/patch/175959/

but Tim hated it, and I didn't have the time to properly respond.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/4] meson: build libEGL

2017-10-19 Thread Dylan Baker

Quoting Daniel Stone (2017-10-19 07:34:28)
> Hi Dylan,
> 
> On 19 October 2017 at 01:55, Dylan Baker  wrote:
> > This is based heavily on Daniel Stone's work for the same, rebased on
> > master and with a number of TODO's fixed.
> >
> > This does not implement glvnd (which is coming in a later patch)
> >
> > Meson builds egl slightly differently than autotools, namely it doesn't
> > build an intermediate shared library. It doesn't do this because meson
> > doesn't have problems with the name of the library being dynamically
> > generated, so the glvnd and non-glvnd code can follow the same path.
> 
> Thanks a million for picking this up, and fixing all the egregious
> bugs / gaps / variable name inconsistency! Is this available in a
> fully-baked git branch somewhere?

https://github.com/dcbaker/mesa submit/meson-egl

> 
> > +linux_dmabuf_unstable_v1_protocol_c = custom_target(
> > +  'linux-dmabuf-unstable-v1-protocol.c',
> > +  input : wayland_dmabuf_xml,
> > +  output : 'linux-dmabuf-unstable-v1-protocol.c',
> > +  command : [prog_wl_scanner, 'code', '@INPUT@', '@OUTPUT@'],
> > +)
> > +
> > +linux_dmabuf_unstable_v1_client_protocol_h = custom_target(
> > +  'linux-dmabuf-unstable-v1-client-protocol.h',
> > +  input : wayland_dmabuf_xml,
> > +  output : 'linux-dmabuf-unstable-v1-client-protocol.h',
> > +  command : [prog_wl_scanner, 'client-header', '@INPUT@', '@OUTPUT@'],
> > +)
> 
> Could you please move these into src/egl/wayland/wayland-drm? They're
> not actually wl_drm of course, but I have patches out to use
> linux-dmabuf inside Vulkan as well, so having wl_drm and
> zwp_linux_dmabuf in the same place where it's just built once seems
> like an idea. Then it'd be trivial to hoist that out of src/egl/
> later.

Jason and I talked about this, and he thought that the better thing to do would
be to make a src/wsi folder that could have things like src/wsi/wayland,
src/wsi/x11, etc. and we could move it in there. I like that idea too, would
that be okay as follow up work?

> 
> > +libwayland_egl = shared_library(
> > +  'wayland-egl',
> > +  'wayland-egl.c',
> > +  c_args : [c_vis_args],
> > +  link_args : ld_args_gc_sections,
> > +  version : '1.0.0',
> > +  install : true,
> >  )
> 
> As a drive-by musing, is there a reason c_vis_args isn't part of the
> global arguments? I realised after the fact that I'd left it out of
> quite a few places in my branch.

There are actually some targets that don't have visibility flags set for them.
Maybe those are just oversights?

> 
> Other than that, it looks good to me, so:
> Reviewed-by: Daniel Stone 
> 
> Cheers,
> Daniel


signature.asc
Description: signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/5] etnaviv: update headers from rnndb

2017-10-19 Thread Wladimir J. van der Laan

On Tue, Oct 17, 2017 at 10:38:13PM +0200, Christian Gmeiner wrote:
> Update to etna_viv commit 6c9c706.
> 
> Signed-off-by: Christian Gmeiner 

Reviewed-by: Wladimir J. van der Laan 

> ---
>  src/gallium/drivers/etnaviv/hw/cmdstream.xml.h |  36 ++-
>  src/gallium/drivers/etnaviv/hw/common.xml.h| 117 
>  src/gallium/drivers/etnaviv/hw/isa.xml.h   |   4 +-
>  src/gallium/drivers/etnaviv/hw/state.xml.h | 197 --
>  src/gallium/drivers/etnaviv/hw/state_3d.xml.h  | 357 
> -
>  5 files changed, 622 insertions(+), 89 deletions(-)
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask

2017-10-19 Thread Roland Scheidegger

Reviewed-by: Roland Scheidegger 

Albeit the way those masks are derived looks quite error-prone in
general (especially for new opcodes).

Am 19.10.2017 um 18:40 schrieb Tim Rowley:
> A number of double/int64 operations don't have matching
> read and write usage masks, which the fallthrough case of
> tgsi_util_get_inst_usage_mask assumes for componentwise
> tagged instructions.
> 
> No regressions in llvmpipe piglit; fixes a large number of
> swr regressions.
> ---
>  src/gallium/auxiliary/tgsi/tgsi_util.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c 
> b/src/gallium/auxiliary/tgsi/tgsi_util.c
> index cfce59093c..afe5690ce0 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_util.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
> @@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct 
> tgsi_full_instruction *inst,
>read_mask = TGSI_WRITEMASK_XYZ;
>break;
>  
> +   case TGSI_OPCODE_DSEQ:
> +   case TGSI_OPCODE_DSNE:
> +   case TGSI_OPCODE_DSLT:
> +   case TGSI_OPCODE_DSGE:
> case TGSI_OPCODE_DP4:
> case TGSI_OPCODE_PK4B:
> case TGSI_OPCODE_PK4UB:
> case TGSI_OPCODE_D2F:
> +   case TGSI_OPCODE_D2I:
> +   case TGSI_OPCODE_D2U:
> case TGSI_OPCODE_I2F:
> case TGSI_OPCODE_U2F:
> +   case TGSI_OPCODE_U64SEQ:
> +   case TGSI_OPCODE_U64SNE:
> +   case TGSI_OPCODE_U64SLT:
> +   case TGSI_OPCODE_U64SGE:
> case TGSI_OPCODE_U642F:
> +   case TGSI_OPCODE_I64SLT:
> +   case TGSI_OPCODE_I64SGE:
> case TGSI_OPCODE_I642F:
>read_mask = TGSI_WRITEMASK_XYZW;
>break;
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables

2017-10-19 Thread Ilia Mirkin

On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral  wrote:
> On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote:
>> Will this work with SSO shaders? Presumably the validation still has
>> to happen, but I don't think cross_validate_outputs_to_inputs() will
>> end up getting called.
>
> The piglit tests I mention use SSO so it seems to be working for this.
> See for example:
>
> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-
> block-member-location-overlap.shader_test

Ah great. I'm a little curious how, since I don't see how
cross_validate_outputs_to_inputs would get called for SSO shaders.
Perhaps I'm looking at old code.

Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED"
- can you try with that? It's just using a pipeline, but both shaders
are ending up in it.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] docs: add documentation for building with meson

2017-10-19 Thread Emil Velikov

On 18 October 2017 at 18:27, Dylan Baker  wrote:

>> > +Note that in meson this defaults to "debug", and  not setting it to
>> > +"release" will yield non-optimal performance and binary size
>> Ouch, can we change that?
>
> When I did an informal poll in the Intel cube the universal consensus was that
> defaulting to debug was a feature. If the wider community disagrees, yes, it 
> can be
> changed.
>
The intent behind is amazing, but it subtly changes behaviour over any
of the existing build systems.

Past experience suggests that listing such changes in release
notes/elsewhere people do miss it.
Thus as users get annoyed with the optimised build there'll be some
bad noise/publicity.

Just something to keep in mind.

> I'll fix up the other stuff too.
>
Great thanks.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask

2017-10-19 Thread Tim Rowley

A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.

No regressions in llvmpipe piglit; fixes a large number of
swr regressions.
---
 src/gallium/auxiliary/tgsi/tgsi_util.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c 
b/src/gallium/auxiliary/tgsi/tgsi_util.c
index cfce59093c..afe5690ce0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
@@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct 
tgsi_full_instruction *inst,
   read_mask = TGSI_WRITEMASK_XYZ;
   break;
 
+   case TGSI_OPCODE_DSEQ:
+   case TGSI_OPCODE_DSNE:
+   case TGSI_OPCODE_DSLT:
+   case TGSI_OPCODE_DSGE:
case TGSI_OPCODE_DP4:
case TGSI_OPCODE_PK4B:
case TGSI_OPCODE_PK4UB:
case TGSI_OPCODE_D2F:
+   case TGSI_OPCODE_D2I:
+   case TGSI_OPCODE_D2U:
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_U2F:
+   case TGSI_OPCODE_U64SEQ:
+   case TGSI_OPCODE_U64SNE:
+   case TGSI_OPCODE_U64SLT:
+   case TGSI_OPCODE_U64SGE:
case TGSI_OPCODE_U642F:
+   case TGSI_OPCODE_I64SLT:
+   case TGSI_OPCODE_I64SGE:
case TGSI_OPCODE_I642F:
   read_mask = TGSI_WRITEMASK_XYZW;
   break;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/7] swr: rasterizer update

2017-10-19 Thread Cherniak, Bruce

Reviewed-by: Bruce Cherniak  


> On Oct 19, 2017, at 8:12 AM, Tim Rowley  wrote:
> 
> Highlights are code cleanups, some more simd16 work (disabled by default),
> and tuning for the Intel Xeon Phi architecture.
> 
> Tim Rowley (7):
>  swr/rast: Minor changes for os-x
>  swr/rast: Miscellaneous viewport array code changes
>  swr/rast: Fix indentation
>  swr/rast: Change DS memory allocation
>  swr/rast: Widen fetch shader to SIMD16 (disabled for now)
>  swr/rast: Add api to override draws in flight
>  swr: knob overrides for Intel Xeon Phi
> 
> src/gallium/drivers/swr/rasterizer/core/api.cpp|  26 +-
> src/gallium/drivers/swr/rasterizer/core/api.h  |   4 +
> src/gallium/drivers/swr/rasterizer/core/binner.cpp |  45 ++-
> src/gallium/drivers/swr/rasterizer/core/clip.h |  14 +-
> src/gallium/drivers/swr/rasterizer/core/context.h  |   2 +
> .../drivers/swr/rasterizer/core/frontend.cpp   |  26 +-
> src/gallium/drivers/swr/rasterizer/core/pa.h   |  24 +-
> src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp |   4 +-
> src/gallium/drivers/swr/rasterizer/core/state.h|   3 +-
> .../drivers/swr/rasterizer/core/threads.cpp|  24 +-
> .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 441 -
> src/gallium/drivers/swr/swr_context.cpp|  27 ++
> src/gallium/drivers/swr/swr_context.h  |   2 +
> src/gallium/drivers/swr/swr_loader.cpp |   4 +
> src/gallium/drivers/swr/swr_scratch.cpp|   2 +-
> src/gallium/drivers/swr/swr_screen.h   |   3 +
> 16 files changed, 575 insertions(+), 76 deletions(-)
> 
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 >

1 - 100 of 157 matches

Mail list logo