Re: [Mesa-dev] [PATCH] gallium/swr: update rasterizer (532172)

2016-03-22 Thread Stéphane Marchesin
On Tue, Mar 22, 2016 at 8:55 PM, Rowley, Timothy O
 wrote:
>
>> On Mar 22, 2016, at 3:51 PM, Justen, Jordan L  
>> wrote:
>>
>> What does 532172 in the subject refer to?
>
> swr rasterizer development happens in another source control system.  532172 
> is a revision id to checkpoint where we’ve pushed the changes publicly.
>
>> From this commit message, it seems clear that this single patch is
>> doing a whole lot. Usually that's a good sign that it should be split
>> into multiple patches.
>>
>> However, since this is only changing your driver, you can probably
>> take any sort of patches that you like. :)
>>
>> There is arguably little value to sending out a patch like this, since
>> it is very difficult to review. In other words, perhaps if you are
>> going to make big, unreviewable patches like this that only change
>> your driver, then you might as well just push them straight away.
>>
>> (But, it would be better, in my opinion, to try to split up the
>> changes and let them be reviewed.)
>
> Yes, there’s a lot in this patch.  I froze the public version of the 
> rasterizer when I began the upstreaming process mid February, so this is 
> syncing up with about a month’s worth of development.
>
> I also have this change as a series of 81 commits.  Not sure if that would be 
> preferable by the community or if people would be interested in reviewing the 
> series, as issues with early commits might already be addressed later in the 
> patch set.

From a consumer perspective, I am less interested in swr if I can't
bisect it to find and fix issues locally. Landing such large patches
would definitely prevent bisectability. Even if you have another
upstream repo (after all, that's what git is about), what prevents you
from turning all these commits into mesa commits, possibly with a
script if that's too tedious?

Stéphane


>
> -Tim
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/swr: update rasterizer (532172)

2016-03-22 Thread Kenneth Graunke
On Wednesday, March 23, 2016 3:55:10 AM PDT Rowley, Timothy O wrote:
> 
> > On Mar 22, 2016, at 3:51 PM, Justen, Jordan L wrote:
> > 
> > What does 532172 in the subject refer to?
> 
> swr rasterizer development happens in another source control system.  532172
> is a revision id to checkpoint where we’ve pushed the changes publicly.

That's an awkward situation we've not run into before.

If the code is going to live in the upstream Mesa git repository, then
it seems like the best long term plan is to reverse the workflow: make
upstream Mesa the canonical repository, do development upstream, and
pull changes from upstream into any internal repositories.

Obviously, that's a huge process change - presumably you have a bunch
of people working in some Intel perforce system - but working in the
public is very beneficial.  It's also the mark of a true open source
project, rather than simply "available source".

I don't know how much control you have over this, though...?




Re: [Mesa-dev] [PATCH] gallium/swr: update rasterizer (532172)

2016-03-22 Thread Jordan Justen
On 2016-03-22 20:55:10, Rowley, Timothy O wrote:
> 
> > On Mar 22, 2016, at 3:51 PM, Justen, Jordan L  
> > wrote:
> > 
> > What does 532172 in the subject refer to?
> 
> swr rasterizer development happens in another source control system.
> 532172 is a revision id to checkpoint where we’ve pushed the changes
> publicly.
> 
> > From this commit message, it seems clear that this single patch is
> > doing a whole lot. Usually that's a good sign that it should be split
> > into multiple patches.
> > 
> > However, since this is only changing your driver, you can probably
> > take any sort of patches that you like. :)
> > 
> > There is arguably little value to sending out a patch like this, since
> > it is very difficult to review. In other words, perhaps if you are
> > going to make big, unreviewable patches like this that only change
> > your driver, then you might as well just push them straight away.
> > 
> > (But, it would be better, in my opinion, to try to split up the
> > changes and let them be reviewed.)
> 
> Yes, there’s a lot in this patch. I froze the public version of the
> rasterizer when I began the upstreaming process mid February, so
> this is syncing up with about a month’s worth of development.
> 
> I also have this change as a series of 81 commits. Not sure if that
> would be preferable by the community or if people would be
> interested in reviewing the series, as issues with early commits
> might already be addressed later in the patch set.
> 

There seem to be some things working against community code review.

* Expected broken commits earlier in the series (We would normally ask
  that commits are cleaned up before posting them.)

* External development (What would happen to any code review asking
  for reworks, given that the patches are already merged elsewhere?)

* A large backlog of changes. :)

For those reasons, I don't see much value in posting this, or the 81
patches, to mesa-dev. Maybe going forward there won't be such a backlog,
and code review would then be possible. (And, of course, anything
outside the openswr driver code would require code review.)

I still think it would be better to see the 81 commits split up in the
history as long as they won't cause problems for others. Since most
people are unlikely to be building openswr, I don't think the commits
will affect them.

We rarely use merges, but perhaps it is appropriate since openswr is
developed externally. You could start a branch at the last openswr
commit, add your 81 commits. Then you could merge the resulting branch
into master.

-Jordan


Re: [Mesa-dev] [PATCH] radeonsi: fix out-of-bounds indexing of shader images

2016-03-22 Thread Michel Dänzer
On 22.03.2016 05:41, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> Results are undefined but may not crash. Without this change, out-of-bounds
> indexing can lead to VM faults and GPU hangs.
> 
> Constant buffers, samplers, and possibly others will eventually need similar
> treatment to support GL_ARB_robust_buffer_access_behavior.

Reviewed-and-Tested-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
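
The commit message quoted above asks for out-of-bounds indexing to give undefined results without crashing. One common way to get that behaviour is to clamp the index before it is used to fetch a descriptor. The sketch below is only an illustration of that idea: `clamp_image_index` and its parameters are hypothetical names, and nothing here is taken from the radeonsi patch itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the radeonsi patch): force a shader-supplied
// image index into [0, num_slots - 1]. An out-of-range index then reads some
// valid slot, so the result is undefined but no invalid address is touched.
inline std::size_t clamp_image_index(std::size_t index, std::size_t num_slots)
{
   assert(num_slots > 0);
   return std::min(index, num_slots - 1);
}
```

With 16 slots, index 3 is returned unchanged and index 100 maps to slot 15.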


Re: [Mesa-dev] [PATCH] gallium/swr: update rasterizer (532172)

2016-03-22 Thread Rowley, Timothy O

> On Mar 22, 2016, at 3:51 PM, Justen, Jordan L  
> wrote:
> 
> What does 532172 in the subject refer to?

swr rasterizer development happens in another source control system.  532172 is 
a revision id to checkpoint where we’ve pushed the changes publicly.

> From this commit message, it seems clear that this single patch is
> doing a whole lot. Usually that's a good sign that it should be split
> into multiple patches.
> 
> However, since this is only changing your driver, you can probably
> take any sort of patches that you like. :)
> 
> There is arguably little value to sending out a patch like this, since
> it is very difficult to review. In other words, perhaps if you are
> going to make big, unreviewable patches like this that only change
> your driver, then you might as well just push them straight away.
> 
> (But, it would be better, in my opinion, to try to split up the
> changes and let them be reviewed.)

Yes, there’s a lot in this patch.  I froze the public version of the rasterizer 
when I began the upstreaming process mid February, so this is syncing up with 
about a month’s worth of development.

I also have this change as a series of 81 commits.  Not sure if that would be 
preferable by the community or if people would be interested in reviewing the 
series, as issues with early commits might already be addressed later in the 
patch set.

-Tim




Re: [Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed

2016-03-22 Thread Michel Dänzer
On 23.03.2016 02:27, Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/gallium/drivers/radeonsi/si_state.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_state.c 
> b/src/gallium/drivers/radeonsi/si_state.c
> index b9bdd47..b8fde00 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen *screen,
>   if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
>   height = 1;
>   depth = res->array_size;
> - } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
> + } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
> +type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
>   if (sampler || res->target != PIPE_TEXTURE_3D)
>   depth = res->array_size;
>   } else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)
> 

Reviewed-and-Tested-by: Michel Dänzer 


P.S. The incorrect code actually seemed to "fix" the
"spec@ext_framebuffer_multisample_blit_scaled@blit-scaled
samples={2,4,6,8} with gl_texture_2d_multisample_array" tests for me,
i.e. those tests were originally failing, then passing since Nicolai's
image changes, and now failing again with this fix. Might be worth
looking into, or maybe they were just passing by chance with the
incorrect code?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH 3/3] mesa/st: Remove GLSLVersion clamping

2016-03-22 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_extensions.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index b03f531..6645189 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -846,10 +846,8 @@ void st_init_extensions(struct pipe_screen *screen,
 
/* Figure out GLSL support. */
glsl_feature_level = screen->get_param(screen, PIPE_CAP_GLSL_FEATURE_LEVEL);
-
+   /* Set GLSLVersion to PIPE_CAP_GLSL_FEATURE_LEVEL */
consts->GLSLVersion = glsl_feature_level;
-   if (glsl_feature_level >= 410)
-  consts->GLSLVersion = 410;
 
_mesa_override_glsl_version(consts);
 
-- 
2.5.5



[Mesa-dev] [PATCH 2/3] mesa/st: Trivial, use glsl_feature_level var consistently

2016-03-22 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_extensions.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 2fdaba0..b03f531 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -854,7 +854,7 @@ void st_init_extensions(struct pipe_screen *screen,
_mesa_override_glsl_version(consts);
 
if (options->force_glsl_version > 0 &&
-   options->force_glsl_version <= consts->GLSLVersion) {
+   options->force_glsl_version <= glsl_feature_level) {
   consts->ForceGLSLVersion = options->force_glsl_version;
}
 
@@ -865,12 +865,12 @@ void st_init_extensions(struct pipe_screen *screen,
 
/* This extension needs full OpenGL 3.2, but we don't know if that's
 * supported at this point. Only check the GLSL version. */
-   if (consts->GLSLVersion >= 150 &&
+   if (glsl_feature_level >= 150 &&
screen->get_param(screen, PIPE_CAP_TGSI_VS_LAYER_VIEWPORT)) {
   extensions->AMD_vertex_shader_layer = GL_TRUE;
}
 
-   if (consts->GLSLVersion >= 130) {
+   if (glsl_feature_level >= 130) {
   consts->NativeIntegers = GL_TRUE;
   consts->MaxClipPlanes = 8;
 
@@ -1054,7 +1054,7 @@ void st_init_extensions(struct pipe_screen *screen,
 * Assume that ES3 is supported if GLSL 3.30 is supported.
 * (OpenGL 3.3 is a requirement for that extension.)
 */
-   if (consts->GLSLVersion >= 330 &&
+   if (glsl_feature_level >= 330 &&
/* Requirements for ETC2 emulation. */
screen->is_format_supported(screen, PIPE_FORMAT_R8G8B8A8_UNORM,
PIPE_TEXTURE_2D, 0,
-- 
2.5.5



[Mesa-dev] Misc assortment of minor fixes

2016-03-22 Thread Edward O'Callaghan
The only functional change in this series is taking off the brakes on
GLSL versions higher than 4.1. This will likely be relevant by week's
end.



[Mesa-dev] [PATCH 1/3] radeon/r600_query.c: Minor style fix

2016-03-22 Thread Edward O'Callaghan
Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/radeon/r600_query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index f8b6241..f9a5721 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -1066,7 +1066,7 @@ void r600_query_init_backend_mask(struct 
r600_common_context *ctx)
item_mask = 0x3;
}
 
-   while(num_tile_pipes--) {
+   while (num_tile_pipes--) {
i = backend_map & item_mask;
mask |= (1<<i);
backend_map >>= item_width;
-- 
2.5.5



Re: [Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed

2016-03-22 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-23 04:27, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index b9bdd47..b8fde00 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen 
*screen,

if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
height = 1;
depth = res->array_size;
-   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
+   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
+  type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
if (sampler || res->target != PIPE_TEXTURE_3D)
depth = res->array_size;
} else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)




[Mesa-dev] [PATCH v2] compiler/glsl: allow sequence op as const expression for gles 1.0

2016-03-22 Thread Lars Hamre
v2: Fixed regression pointed out by Eduardo Lima Mitev

Allow the sequence operator to be a constant expression in GLSL ES versions
prior to GLSL ES 3.0.

Fixes the following piglit test:
   /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert

This is similar to the logic from process_initializer() which performs the
same check for constant variable initialization with sequence operators.

Signed-off-by: Lars Hamre 

---
 src/compiler/glsl/ast_to_hir.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 5262bd8..35def8e 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2125,7 +2125,9 @@ process_array_size(exec_node *node,
}

ir_constant *const size = ir->constant_expression_value();
-   if (size == NULL || array_size->has_sequence_subexpression()) {
+   if (size == NULL ||
+   (state->is_version(120, 300) &&
+array_size->has_sequence_subexpression())) {
   _mesa_glsl_error(& loc, state, "array size must be a "
"constant valued expression");
   return 0;
--
2.5.0
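
To see why the thresholds in `state->is_version(120, 300)` matter, here is a minimal sketch of the version check. It assumes `is_version(glsl, glsl_es)` returns true when the shader's version is at least the threshold for its own API (desktop or ES). `ShaderVersion` and `rejects_sequence_array_size` are hypothetical stand-ins for illustration, not Mesa's parser state.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the parser state used in the patch.
struct ShaderVersion {
   bool es_shader;             // true for GLSL ES, false for desktop GLSL
   unsigned language_version;  // e.g. 100, 120, 300, 430

   // True when this shader's version is at least the threshold for its API.
   // A threshold of 0 is treated as "never true" for that API.
   bool is_version(unsigned required_glsl, unsigned required_glsl_es) const
   {
      const unsigned required = es_shader ? required_glsl_es : required_glsl;
      return required != 0 && language_version >= required;
   }
};

// Mirrors the shape of the patched condition: a sequence operator in an
// array size is rejected only at or above the given thresholds.
inline bool rejects_sequence_array_size(const ShaderVersion &v,
                                        unsigned glsl_threshold)
{
   return v.is_version(glsl_threshold, 300);
}
```

Under these assumptions a GLSL ES 1.00 shader is accepted with either threshold, while a desktop GLSL 1.20 shader is rejected with 120 but accepted with 430. That is consistent with the glsl-1.20 regression reported against the first version of the patch, presumably because that test expects a compile error.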



Re: [Mesa-dev] [PATCH] compiler/glsl: Allow the sequence operator to be a constant expression

2016-03-22 Thread Lars Hamre
You are correct, it should be state->is_version(120, 300).
I will submit an updated patch.

On Tue, Mar 22, 2016 at 3:32 PM, Eduardo Lima Mitev 
wrote:

> On 03/22/2016 02:48 PM, Lars Hamre wrote:
>
>> Resending this patch because it received no response last week.
>>
>> Allow the sequence operator to be a constant expression in GLSL ES
>> versions prior
>> to GLSL ES 3.0
>>
>> Fixes the following piglit test:
>>
>> /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert
>>
>>
> I confirm this fixes the above test, but it also regresses test:
>
>
> /all/spec/glsl-1.20/compiler/structure-and-array-operations/array-size-sequence-in-parenthesis.vert.
>
> Maybe you are missing a version check?
>
> Eduardo
>
>> This mirrors the logic from process_initializer() which performs the
>> same check for constant variable initialization with sequence operators.
>>
>> Section 4.3.3 (Constant Expressions) of the GLSL 4.30.9 spec and of the
>> GLSL ES 3.00.4 spec say that the result of a sequence operator is not a
>> constant expression; however, we should not mandate that for lower GLSL
>> versions.
>>
>> Signed-off-by: Lars Hamre 
>>
>> ---
>>   src/compiler/glsl/ast_to_hir.cpp | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/glsl/ast_to_hir.cpp
>> b/src/compiler/glsl/ast_to_hir.cpp
>> index 5262bd8..4037468 100644
>> --- a/src/compiler/glsl/ast_to_hir.cpp
>> +++ b/src/compiler/glsl/ast_to_hir.cpp
>> @@ -2125,7 +2125,9 @@ process_array_size(exec_node *node,
>>  }
>>
>>  ir_constant *const size = ir->constant_expression_value();
>> -   if (size == NULL || array_size->has_sequence_subexpression()) {
>> +   if (size == NULL ||
>> +   (state->is_version(430, 300) &&
>> +array_size->has_sequence_subexpression())) {
>> _mesa_glsl_error(& loc, state, "array size must be a "
>>  "constant valued expression");
>> return 0;
>> --
>> 2.5.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>


Re: [Mesa-dev] [PATCH 5/8] tgsi: add support for image operations to tgsi_exec.

2016-03-22 Thread Dave Airlie
>> +   int dim;
>> +   switch (tgsi_tex) {
>> +   case TGSI_TEXTURE_BUFFER:
>> +   case TGSI_TEXTURE_1D:
>> +  dim = 1;
>> +  break;
>> +   case TGSI_TEXTURE_2D:
>> +   case TGSI_TEXTURE_RECT:
>> +   case TGSI_TEXTURE_1D_ARRAY:
>> +   case TGSI_TEXTURE_2D_MSAA:
>> +  dim = 2;
>> +  break;
>> +   case TGSI_TEXTURE_3D:
>> +   case TGSI_TEXTURE_CUBE:
>> +   case TGSI_TEXTURE_2D_ARRAY:
>> +   case TGSI_TEXTURE_2D_ARRAY_MSAA:
>> +   case TGSI_TEXTURE_CUBE_ARRAY:
>> +  dim = 3;
>> +  break;
>> +   default:
>> +  assert(!"unknown texture target");
>> +  dim = 0;
>> +  break;
>> +   }
>> +
>> +   if (sample) {
>> +  switch (tgsi_tex) {
>> +  case TGSI_TEXTURE_2D_MSAA:
>> + *sample = 3;
>> + break;
>> +  case TGSI_TEXTURE_2D_ARRAY_MSAA:
>> + *sample = 4;
>> + break;
>> +  default:
>> + *sample = 0;
>> + break;
>> +  }
>> +   }
>> +   return dim;
>> +}
>
>
> That function seems to do two independent things.  Can this be two
> functions?
>
Probably, I was just following local style

tgsi_util_get_texture_coord_dim

was what I copied.

Dave.
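
For reference, the split the reviewer asks about could look roughly like the sketch below: one function for the coordinate dimension and one for the value the original writes through `sample`. The `TexTarget` enum and both function names are hypothetical stand-ins for the `TGSI_TEXTURE_*` targets; the case tables are copied from the quoted hunk.

```cpp
#include <cassert>

// Hypothetical stand-in for the TGSI_TEXTURE_* targets in the quoted hunk.
enum TexTarget {
   TEX_BUFFER, TEX_1D, TEX_2D, TEX_RECT, TEX_1D_ARRAY, TEX_2D_MSAA,
   TEX_3D, TEX_CUBE, TEX_2D_ARRAY, TEX_2D_ARRAY_MSAA, TEX_CUBE_ARRAY
};

// First job: how many coordinate components the target uses.
inline int image_coord_dim(TexTarget t)
{
   switch (t) {
   case TEX_BUFFER:
   case TEX_1D:
      return 1;
   case TEX_2D:
   case TEX_RECT:
   case TEX_1D_ARRAY:
   case TEX_2D_MSAA:
      return 2;
   case TEX_3D:
   case TEX_CUBE:
   case TEX_2D_ARRAY:
   case TEX_2D_ARRAY_MSAA:
   case TEX_CUBE_ARRAY:
      return 3;
   }
   assert(!"unknown texture target");
   return 0;
}

// Second job: the value the original function stores through `sample`
// (non-zero only for the two MSAA targets).
inline int image_sample_value(TexTarget t)
{
   switch (t) {
   case TEX_2D_MSAA:
      return 3;
   case TEX_2D_ARRAY_MSAA:
      return 4;
   default:
      return 0;
   }
}
```

Callers that only need the dimension no longer have to pass a null `sample` pointer, at the cost of two switches over the same enum.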


Re: [Mesa-dev] [PATCH 3/8] tgsi: introduce NonHelperMask

2016-03-22 Thread Dave Airlie
On 23 March 2016 at 01:37, Brian Paul  wrote:
> On 03/21/2016 04:02 PM, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> This is a mask of which of the current 2x2 grid are non-helper
>> invocations. This allows us to mask off the helper invocations
>> later for the image operations.
>
>
> Can you elaborate on what a helper invocation is somewhere in the comments?

It's defined in the GLSL 4.5 spec.

"A helper invocation is a fragment-shader invocation that is created
solely for the purposes of evaluating derivatives for use in
non-helper fragment-shader invocations."

Then there is a big chunk of text. I could add a comment saying it's
in the spec.

Dave.
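
As an illustration of what such a mask could be used for, here is a small sketch: one bit per invocation of the 2x2 quad, set when that invocation is a real (non-helper) fragment, and ANDed with the execution mask before an image store. All names (`make_non_helper_mask`, `masked_store`, `QUAD_SIZE`) are hypothetical and the layout is an assumption, not the tgsi_exec implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned QUAD_SIZE = 4;  // one 2x2 grid of fragment invocations

// Bit i is set when invocation i covers a real fragment, i.e. it is not a
// helper created only to evaluate derivatives.
inline std::uint8_t make_non_helper_mask(const std::array<bool, QUAD_SIZE> &covered)
{
   std::uint8_t mask = 0;
   for (unsigned i = 0; i < QUAD_SIZE; i++) {
      if (covered[i])
         mask = static_cast<std::uint8_t>(mask | (1u << i));
   }
   return mask;
}

// Store one value per invocation, skipping helper invocations so they never
// produce visible side effects. Returns the number of values written.
inline unsigned masked_store(std::array<int, QUAD_SIZE> &dst,
                             const std::array<int, QUAD_SIZE> &values,
                             std::uint8_t exec_mask,
                             std::uint8_t non_helper_mask)
{
   const std::uint8_t write_mask =
      static_cast<std::uint8_t>(exec_mask & non_helper_mask);
   unsigned writes = 0;
   for (unsigned i = 0; i < QUAD_SIZE; i++) {
      if (write_mask & (1u << i)) {
         dst[i] = values[i];
         writes++;
      }
   }
   return writes;
}
```

Helper invocations still execute the shader so derivatives work; only their stores are suppressed.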


[Mesa-dev] [PATCH v2 09/15] i965/fs: Get rid of the param_size array

2016-03-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 -
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 --
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 ---
 4 files changed, 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6db491a..1a62029 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1024,7 +1024,6 @@ fs_visitor::import_uniforms(fs_visitor *v)
this->push_constant_loc = v->push_constant_loc;
this->pull_constant_loc = v->pull_constant_loc;
this->uniforms = v->uniforms;
-   this->param_size = v->param_size;
 }
 
 fs_reg *
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index d4acc87..8c412f5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -326,8 +326,6 @@ public:
 
const struct brw_vue_map *input_vue_map;
 
-   int *param_size;
-
int *virtual_grf_start;
int *virtual_grf_end;
brw::fs_live_variables *live_intervals;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 2cec97b..14480fb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -179,15 +179,6 @@ fs_visitor::nir_setup_uniforms()
   return;
 
uniforms = nir->num_uniforms / 4;
-
-   nir_foreach_variable(var, &nir->uniforms) {
-  /* UBO's and atomics don't take up space in the uniform file */
-  if (var->interface_type != NULL || var->type->contains_atomic())
- continue;
-
-  if (type_size_scalar(var->type) > 0)
- param_size[var->data.driver_location / 4] = 
type_size_scalar(var->type);
-   }
 }
 
 static bool
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index dc61d09..f1da218 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1063,9 +1063,6 @@ fs_visitor::init()
 
this->spilled_any_registers = false;
this->do_dual_src = false;
-
-   if (dispatch_width == 8)
-  this->param_size = rzalloc_array(mem_ctx, int, 
stage_prog_data->nr_params);
 }
 
 fs_visitor::~fs_visitor()
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 14/15] i965/fs: Rename demote_pull_constants to lower_constant_loads

2016-03-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 6c7c8cd..2d12449 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2031,7 +2031,7 @@ fs_visitor::assign_constant_locations()
  * or VARYING_PULL_CONSTANT_LOAD instructions which load values into VGRFs.
  */
 void
-fs_visitor::demote_pull_constants()
+fs_visitor::lower_constant_loads()
 {
const unsigned index = stage_prog_data->binding_table.pull_constants_start;
 
@@ -5096,7 +5096,7 @@ fs_visitor::optimize()
bld = fs_builder(this, 64);
 
assign_constant_locations();
-   demote_pull_constants();
+   lower_constant_loads();
 
validate();
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 8c412f5..6afb9b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -139,7 +139,7 @@ public:
void split_virtual_grfs();
bool compact_virtual_grfs();
void assign_constant_locations();
-   void demote_pull_constants();
+   void lower_constant_loads();
void invalidate_live_intervals();
void calculate_live_intervals();
void calculate_register_pressure();
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 08/15] i965/fs: Stop relying on param_size in assign_constant_locations

2016-03-22 Thread Jason Ekstrand
Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode.  With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore.  The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other.  In order to deal with
this, we make the first pass just flag what needs pulling and handle
assigning pull constant locations until later.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 44 ++--
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 57397f2..6db491a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1938,14 +1938,12 @@ fs_visitor::assign_constant_locations()
if (dispatch_width != min_dispatch_width)
   return;
 
-   unsigned int num_pull_constants = 0;
-
-   pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
-   memset(pull_constant_loc, -1, sizeof(pull_constant_loc[0]) * uniforms);
-
bool is_live[uniforms];
memset(is_live, 0, sizeof(is_live));
 
+   bool needs_pull[uniforms];
+   memset(needs_pull, 0, sizeof(needs_pull));
+
/* First, we walk through the instructions and do two things:
 *
 *  1) Figure out which uniforms are live.
@@ -1961,20 +1959,15 @@ fs_visitor::assign_constant_locations()
  if (inst->src[i].file != UNIFORM)
 continue;
 
- if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
-int uniform = inst->src[0].nr;
+ int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 
-/* If this array isn't already present in the pull constant buffer,
- * add it.
- */
-if (pull_constant_loc[uniform] == -1) {
-   assert(param_size[uniform]);
-   for (int j = 0; j < param_size[uniform]; j++)
-  pull_constant_loc[uniform + j] = num_pull_constants++;
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
+for (unsigned j = 0; j < inst->src[2].ud / 4; j++) {
+   is_live[constant_nr + j] = true;
+   needs_pull[constant_nr + j] = true;
 }
  } else {
 /* Mark the the one accessed uniform as live */
-int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 if (constant_nr >= 0 && constant_nr < (int) uniforms)
is_live[constant_nr] = true;
  }
@@ -1991,26 +1984,23 @@ fs_visitor::assign_constant_locations()
 */
unsigned int max_push_components = 16 * 8;
unsigned int num_push_constants = 0;
+   unsigned int num_pull_constants = 0;
 
push_constant_loc = ralloc_array(mem_ctx, int, uniforms);
+   pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
 
for (unsigned int i = 0; i < uniforms; i++) {
-  if (!is_live[i] || pull_constant_loc[i] != -1) {
- /* This UNIFORM register is either dead, or has already been demoted
-  * to a pull const.  Mark it as no longer living in the param[] array.
-  */
- push_constant_loc[i] = -1;
+  push_constant_loc[i] = -1;
+  pull_constant_loc[i] = -1;
+
+  if (!is_live[i])
  continue;
-  }
 
-  if (num_push_constants < max_push_components) {
- /* Retain as a push constant.  Record the location in the params[]
-  * array.
-  */
+  if (!needs_pull[i] && num_push_constants < max_push_components) {
+ /* Retain as a push constant */
  push_constant_loc[i] = num_push_constants++;
   } else {
- /* Demote to a pull constant. */
- push_constant_loc[i] = -1;
+ /* We have to pull it */
  pull_constant_loc[i] = num_pull_constants++;
   }
}
-- 
2.5.0.400.gff86faf
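
The two-pass structure described in the commit message can be sketched outside the compiler as follows. This is a simplified model under stated assumptions: `UniformAccess`, `ConstantLocations` and `assign_locations` are hypothetical names, and each access is reduced to a start slot, a slot count and an "indirect" flag rather than real instructions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one uniform read.
struct UniformAccess {
   unsigned first;   // first uniform slot read
   unsigned count;   // number of slots read (1 for a direct read)
   bool indirect;    // true for a ranged, MOV_INDIRECT-style read
};

struct ConstantLocations {
   std::vector<int> push;  // push location per slot, -1 if not pushed
   std::vector<int> pull;  // pull location per slot, -1 if not pulled
};

inline ConstantLocations assign_locations(unsigned uniforms,
                                          const std::vector<UniformAccess> &accesses,
                                          unsigned max_push_components)
{
   std::vector<bool> is_live(uniforms, false);
   std::vector<bool> needs_pull(uniforms, false);

   // Pass 1: only flag slots. Overlapping indirect ranges simply set the
   // same flags twice, so no range has to contain the other.
   for (const UniformAccess &a : accesses) {
      for (unsigned j = 0; j < a.count; j++) {
         const unsigned slot = a.first + j;
         if (slot >= uniforms)
            continue;
         is_live[slot] = true;
         if (a.indirect)
            needs_pull[slot] = true;
      }
   }

   // Pass 2: hand out push and pull locations in slot order.
   ConstantLocations loc{std::vector<int>(uniforms, -1),
                         std::vector<int>(uniforms, -1)};
   unsigned num_push = 0;
   unsigned num_pull = 0;
   for (unsigned i = 0; i < uniforms; i++) {
      if (!is_live[i])
         continue;
      if (!needs_pull[i] && num_push < max_push_components)
         loc.push[i] = static_cast<int>(num_push++);
      else
         loc.pull[i] = static_cast<int>(num_pull++);
   }
   return loc;
}
```

Deferring the location assignment to the second pass is what makes partially overlapping ranges harmless: each slot gets exactly one pull location no matter how many ranges touch it.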



[Mesa-dev] [PATCH v2 15/15] i965/fs: Push small uniform arrays

2016-03-22 Thread Jason Ekstrand
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations.  The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing.  Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up.  To do this,
we use an algorithm similar to the one in split_virtual_grfs.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 76 +---
 1 file changed, 53 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2d12449..ebb7579 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1926,9 +1926,7 @@ fs_visitor::compact_virtual_grfs()
  * maximum number of fragment shader uniform components (64).  If
  * there are too many of these, they'd fill up all of register space.
  * So, this will push some of them out to the pull constant buffer and
- * update the program to load them.  We also use pull constants for all
- * indirect constant loads because we don't support indirect accesses in
- * registers yet.
+ * update the program to load them.
  */
 void
 fs_visitor::assign_constant_locations()
@@ -1940,15 +1938,18 @@ fs_visitor::assign_constant_locations()
bool is_live[uniforms];
memset(is_live, 0, sizeof(is_live));
 
-   bool needs_pull[uniforms];
-   memset(needs_pull, 0, sizeof(needs_pull));
+   /* For each uniform slot, a value of true indicates that the given slot and
+* the next slot must remain contiguous.  This is used to keep us from
+* splitting arrays apart.
+*/
+   bool contiguous[uniforms];
+   memset(contiguous, 0, sizeof(contiguous));
 
/* First, we walk through the instructions and do two things:
 *
 *  1) Figure out which uniforms are live.
 *
-*  2) Find all indirect access of uniform arrays and flag them as needing
-* to go into the pull constant buffer.
+*  2) Mark any indirectly used ranges of registers as contiguous.
 *
 * Note that we don't move constant-indexed accesses to arrays.  No
 * testing has been done of the performance impact of this choice.
@@ -1961,12 +1962,16 @@ fs_visitor::assign_constant_locations()
  int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 
  if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
-for (unsigned j = 0; j < inst->src[2].ud / 4; j++) {
-   is_live[constant_nr + j] = true;
-   needs_pull[constant_nr + j] = true;
+assert(inst->src[2].ud % 4 == 0);
+unsigned last = constant_nr + (inst->src[2].ud / 4) - 1;
+assert(last < uniforms);
+
+for (unsigned j = constant_nr; j < last; j++) {
+   is_live[j] = true;
+   contiguous[j] = true;
 }
+is_live[last] = true;
  } else {
-/* Mark the the one accessed uniform as live */
 if (constant_nr >= 0 && constant_nr < (int) uniforms)
is_live[constant_nr] = true;
  }
@@ -1981,26 +1986,49 @@ fs_visitor::assign_constant_locations()
 * If changing this value, note the limitation about total_regs in
 * brw_curbe.c.
 */
-   unsigned int max_push_components = 16 * 8;
+   const unsigned int max_push_components = 16 * 8;
+
+   /* We push small arrays, but no bigger than 16 floats.  This is big enough
+* for a vec4 but hopefully not large enough to push out other stuff.  We
+* should probably use a better heuristic at some point.
+*/
+   const unsigned int max_chunk_size = 16;
+
unsigned int num_push_constants = 0;
unsigned int num_pull_constants = 0;
 
push_constant_loc = ralloc_array(mem_ctx, int, uniforms);
pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
 
-   for (unsigned int i = 0; i < uniforms; i++) {
-  push_constant_loc[i] = -1;
-  pull_constant_loc[i] = -1;
+   int chunk_start = -1;
+   for (unsigned u = 0; u < uniforms; u++) {
+  push_constant_loc[u] = -1;
+  pull_constant_loc[u] = -1;
 
-  if (!is_live[i])
+  if (!is_live[u])
  continue;
 
-  if (!needs_pull[i] && num_push_constants < max_push_components) {
- /* Retain as a push constant */
- push_constant_loc[i] = num_push_constants++;
-  } else {
- /* We have to pull it */
- pull_constant_loc[i] = num_pull_constants++;
+  /* This is the first live uniform in the chunk */
+  if (chunk_start < 0)
+ chunk_start = u;
+
+  /* If this element does not need to be contiguous with the next, we
+   * split at this point and everything between chunk_start and u forms a
+   * single chunk.
+   */
+  if (!contiguous[u]) {
+ unsigned chunk_size = u - chunk_start + 1;
+
+ 
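The hunk above is cut off just after chunk_size is computed, so the push/pull decision itself is not visible here. The following is a minimal standalone sketch of the chunking idea, not the patch's code. It assumes a chunk is pushed only when it is no larger than max_chunk_size and still fits under max_push_components, and is pulled otherwise. The names ConstantLayout and assign_locations are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: one location per uniform slot, -1 if unused.
struct ConstantLayout {
   std::vector<int> push_loc;
   std::vector<int> pull_loc;
   unsigned num_push = 0;
   unsigned num_pull = 0;
};

// is_live[u]:    slot u is read by some instruction.
// contiguous[u]: slot u must stay adjacent to slot u + 1 (same array).
ConstantLayout
assign_locations(const std::vector<bool> &is_live,
                 const std::vector<bool> &contiguous,
                 unsigned max_push_components,
                 unsigned max_chunk_size)
{
   const unsigned uniforms = is_live.size();
   assert(contiguous.size() == uniforms);

   ConstantLayout out;
   out.push_loc.assign(uniforms, -1);
   out.pull_loc.assign(uniforms, -1);

   int chunk_start = -1;
   for (unsigned u = 0; u < uniforms; u++) {
      if (!is_live[u])
         continue;

      /* First live uniform in the chunk. */
      if (chunk_start < 0)
         chunk_start = u;

      /* Keep growing the chunk while the next slot must stay adjacent. */
      if (contiguous[u])
         continue;

      const unsigned chunk_size = u - chunk_start + 1;

      /* ASSUMED policy: push small chunks while there is room, else pull. */
      const bool push = chunk_size <= max_chunk_size &&
                        out.num_push + chunk_size <= max_push_components;

      for (unsigned j = chunk_start; j <= u; j++) {
         if (push)
            out.push_loc[j] = out.num_push++;
         else
            out.pull_loc[j] = out.num_pull++;
      }
      chunk_start = -1;
   }
   return out;
}
```

With a scalar in slot 0, a three-element indirectly accessed array in slots 1..3, a dead slot 4, a scalar in slot 5, and a chunk limit of 2, this sketch pushes the two scalars and pulls the array as one unit.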

[Mesa-dev] [PATCH v2 10/15] i965/vec4: Inline get_pull_constant_offset

2016-03-22 Thread Jason Ekstrand
It's not really doing enough anymore to justify a helper function.
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 --
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 37 ++
 2 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index d43a5a8..9c40ed7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -278,8 +278,6 @@ public:
 
src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst,
  src_reg *reladdr, int reg_offset);
-   src_reg get_pull_constant_offset(bblock_t *block, vec4_instruction *inst,
-   src_reg *reladdr, int reg_offset);
void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
  dst_reg dst,
  src_reg orig_src,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index d30330a..cfa58a6 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1404,27 +1404,6 @@ vec4_visitor::get_scratch_offset(bblock_t *block, vec4_instruction *inst,
}
 }
 
-src_reg
-vec4_visitor::get_pull_constant_offset(bblock_t * block, vec4_instruction *inst,
-  src_reg *reladdr, int reg_offset)
-{
-   if (reladdr) {
-  src_reg index = src_reg(this, glsl_type::int_type);
-
-  emit_before(block, inst, ADD(dst_reg(index), *reladdr,
-   brw_imm_d(reg_offset * 16)));
-
-  return index;
-   } else if (devinfo->gen >= 8) {
-  /* Store the offset in a GRF so we can send-from-GRF. */
-  src_reg offset = src_reg(this, glsl_type::int_type);
-  emit_before(block, inst, MOV(dst_reg(offset), brw_imm_d(reg_offset * 16)));
-  return offset;
-   } else {
-  return brw_imm_d(reg_offset * 16);
-   }
-}
-
 /**
  * Emits an instruction before @inst to load the value named by @orig_src
  * from scratch space at @base_offset to @temp.
@@ -1606,8 +1585,20 @@ vec4_visitor::emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
 {
int reg_offset = base_offset + orig_src.reg_offset;
const unsigned index = prog_data->base.binding_table.pull_constants_start;
-   src_reg offset = get_pull_constant_offset(block, inst, orig_src.reladdr,
- reg_offset);
+
+   src_reg offset;
+   if (orig_src.reladdr) {
+  offset = src_reg(this, glsl_type::int_type);
+
+  emit_before(block, inst, ADD(dst_reg(offset), *orig_src.reladdr,
+   brw_imm_d(reg_offset * 16)));
+   } else if (devinfo->gen >= 8) {
+  /* Store the offset in a GRF so we can send-from-GRF. */
+  offset = src_reg(this, glsl_type::int_type);
+  emit_before(block, inst, MOV(dst_reg(offset), brw_imm_d(reg_offset * 16)));
+   } else {
+  offset = brw_imm_d(reg_offset * 16);
+   }
 
emit_pull_constant_load_reg(temp,
brw_imm_ud(index),
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 02/15] i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions

2016-03-22 Thread Jason Ekstrand
It should work fine without it and the visitor can set it if it wants.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index c883fe3..35400cb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -366,7 +366,6 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
assert(inst->exec_size == 8 || devinfo->gen >= 8);
 
brw_MOV(p, addr, indirect_byte_offset);
-   brw_inst_set_mask_control(devinfo, brw_last_inst, BRW_MASK_DISABLE);
brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
 }
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 05/15] nir: Add another index to load_uniform to specify the range read

2016-03-22 Thread Jason Ekstrand
---
 src/compiler/nir/nir.h| 7 +++
 src/compiler/nir/nir_intrinsics.h | 6 +-
 src/compiler/nir/nir_lower_io.c   | 5 +
 src/compiler/nir/nir_print.c  | 1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 36f90fc..f686b74 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -938,6 +938,12 @@ typedef enum {
 */
NIR_INTRINSIC_UCP_ID = 4,
 
+   /**
+* The amount of data, starting from BASE, that this instruction may
+* access.  This is used to provide bounds if the offset is not constant.
+*/
+   NIR_INTRINSIC_RANGE = 5,
+
NIR_INTRINSIC_NUM_INDEX_FLAGS,
 
 } nir_intrinsic_index_flag;
@@ -1001,6 +1007,7 @@ INTRINSIC_IDX_ACCESSORS(write_mask, WRMASK, unsigned)
 INTRINSIC_IDX_ACCESSORS(base, BASE, int)
 INTRINSIC_IDX_ACCESSORS(stream_id, STREAM_ID, unsigned)
 INTRINSIC_IDX_ACCESSORS(ucp_id, UCP_ID, unsigned)
+INTRINSIC_IDX_ACCESSORS(range, RANGE, unsigned)
 
 /**
  * \group texture information
diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
index 3ba1563..2d6b7b7 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -293,6 +293,10 @@ SYSTEM_VALUE(helper_invocation, 1, 0, xx, xx, xx)
  * of the start of the variable being loaded and and the offset source is a
  * offset into that variable.
  *
+ * Uniform load operations have a second "range" index that specifies the
+ * range (starting at base) of the data from which we are loading.  If
+ * const_index[1] == 0, then the range is unknown.
+ *
  * Some load operations such as UBO/SSBO load and per_vertex loads take an
  * additional source to specify which UBO/SSBO/vertex to load from.
  *
@@ -306,7 +310,7 @@ SYSTEM_VALUE(helper_invocation, 1, 0, xx, xx, xx)
INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, num_indices, idx0, idx1, idx2, flags)
 
 /* src[] = { offset }. const_index[] = { base } */
-LOAD(uniform, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+LOAD(uniform, 1, 2, BASE, RANGE, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { buffer_index, offset }. No const_index */
 LOAD(ubo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { offset }. const_index[] = { base } */
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index d9af8bf..508e1ec 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -277,6 +277,11 @@ nir_lower_io_block(nir_block *block, void *void_state)
  nir_intrinsic_set_base(load,
 intrin->variables[0]->var->data.driver_location);
 
+ if (load->intrinsic == nir_intrinsic_load_uniform) {
+nir_intrinsic_set_range(load,
+   state->type_size(intrin->variables[0]->var->type));
+ }
+
  if (per_vertex)
 load->src[0] = nir_src_for_ssa(vertex_index);
 
diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index d3d5b84..99e85c3 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -502,6 +502,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, print_state *state)
   [NIR_INTRINSIC_WRMASK] = "wrmask",
   [NIR_INTRINSIC_STREAM_ID] = "stream-id",
   [NIR_INTRINSIC_UCP_ID] = "ucp-id",
+  [NIR_INTRINSIC_RANGE] = "range",
};
for (unsigned idx = 1; idx < NIR_INTRINSIC_NUM_INDEX_FLAGS; idx++) {
   if (!info->index_map[idx])
-- 
2.5.0.400.gff86faf
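
To illustrate how a backend might consume the new RANGE index, here is a small standalone sketch. It is not NIR code: UniformLoad and touched_dword_slots are hypothetical names, and the sketch assumes that base and range are both expressed in bytes and are dword-aligned, which depends on the driver's type_size callback.

```cpp
#include <cassert>

// Hypothetical stand-in for the two const_index[] values of load_uniform.
struct UniformLoad {
   unsigned base;   // start of the variable (assumed to be in bytes)
   unsigned range;  // bytes that may be accessed from base; 0 = unknown
};

struct SlotRange {
   bool known;      // false when the load carries no range information
   unsigned first;  // first 32-bit uniform slot that may be read
   unsigned last;   // last 32-bit uniform slot that may be read
};

// Convert base/range into the inclusive span of dword slots that an
// indirect load could touch.
SlotRange
touched_dword_slots(const UniformLoad &load)
{
   if (load.range == 0)
      return { false, 0, 0 };

   assert(load.base % 4 == 0 && load.range % 4 == 0);
   const unsigned first = load.base / 4;
   const unsigned last = first + load.range / 4 - 1;
   return { true, first, last };
}
```

A consumer that gets known == false would have to fall back to a conservative choice, for example treating the whole uniform file as potentially read.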



[Mesa-dev] [PATCH v2 07/15] i965/fs: Get rid of reladdr

2016-03-22 Thread Jason Ekstrand
We aren't using it anymore.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 7 +--
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2cf093a..57397f2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) :
 {
this->reg_offset = 0;
this->subreg_offset = 0;
-   this->reladdr = NULL;
this->stride = 1;
if (this->file == IMM &&
(this->type != BRW_REGISTER_TYPE_V &&
@@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const
 {
return (this->backend_reg::equals(r) &&
subreg_offset == r.subreg_offset &&
-   !reladdr && !r.reladdr &&
stride == r.stride);
 }
 
@@ -4781,9 +4779,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
  break;
   case UNIFORM:
  fprintf(file, "u%d", inst->src[i].nr + inst->src[i].reg_offset);
- if (inst->src[i].reladdr) {
-fprintf(file, "+reladdr");
- } else if (inst->src[i].subreg_offset) {
+ if (inst->src[i].subreg_offset) {
 fprintf(file, "+%d.%d", inst->src[i].reg_offset,
 inst->src[i].subreg_offset);
  }
@@ -4894,7 +4890,6 @@ fs_visitor::get_instruction_generating_reg(fs_inst *start,
 {
if (end == start ||
end->is_partial_write() ||
-   reg.reladdr ||
!reg.equals(end->dst)) {
   return NULL;
} else {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index c3eec2e..e4f20f4 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -58,8 +58,6 @@ public:
 */
int subreg_offset;
 
-   fs_reg *reladdr;
-
/** Register region horizontal stride */
uint8_t stride;
 };
@@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx)
 static inline bool
 is_uniform(const fs_reg &reg)
 {
-   return (reg.stride == 0 || reg.is_null()) &&
-  (!reg.reladdr || is_uniform(*reg.reladdr));
+   return (reg.stride == 0 || reg.is_null());
 }
 
 /**
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 01/15] i965/fs: Add support for doing MOV_INDIRECT on uniforms

2016-03-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index eaff953..7d15794 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -853,7 +853,10 @@ fs_inst::regs_read(int arg) const
  assert(src[2].file == IMM);
  unsigned region_length = src[2].ud;
 
- if (src[0].file == FIXED_GRF) {
+ if (src[0].file == UNIFORM) {
+assert(region_length % 4 == 0);
+return region_length / 4;
+ } else if (src[0].file == FIXED_GRF) {
 /* If the start of the region is not register aligned, then
  * there's some portion of the register that's technically
  * unread at the beginning.
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 13/15] i965/vec4: Get rid of the uniform_size array

2016-03-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 
 src/mesa/drivers/dri/i965/brw_vec4.h  |  2 --
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp|  9 -
 src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp|  2 --
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 11 ---
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  1 -
 6 files changed, 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index d468380..65e57ba 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -496,11 +496,6 @@ vec4_visitor::split_uniform_registers()
 inst->src[i].reg_offset = 0;
   }
}
-
-   /* Update that everything is now vector-sized. */
-   for (int i = 0; i < this->uniforms; i++) {
-  this->uniform_size[i] = 1;
-   }
 }
 
 void
@@ -558,7 +553,6 @@ vec4_visitor::pack_uniform_registers()
 * push constants.
 */
for (int src = 0; src < uniforms; src++) {
-  assert(src < uniform_array_size);
   int size = chans_used[src];
 
   if (size == 0)
@@ -1610,8 +1604,6 @@ vec4_visitor::setup_uniforms(int reg)
 * matter what, or the GPU would hang.
 */
if (devinfo->gen < 6 && this->uniforms == 0) {
-  assert(this->uniforms < this->uniform_array_size);
-
   stage_prog_data->param =
  reralloc(NULL, stage_prog_data->param, const gl_constant_value *, 4);
   for (unsigned int i = 0; i < 4; i++) {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index f93805a..6143f65 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -115,8 +115,6 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int *uniform_size;
-   int uniform_array_size; /*< Size of the uniform_size array */
int uniforms;
 
src_reg shader_start_time;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 66275bb..585674f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -132,15 +132,6 @@ void
 vec4_visitor::nir_setup_uniforms()
 {
uniforms = nir->num_uniforms / 16;
-
-   nir_foreach_variable(var, &nir->uniforms) {
-  /* UBO's and atomics don't take up space in the uniform file */
-  if (var->interface_type != NULL || var->type->contains_atomic())
- continue;
-
-  if (type_size_vec4(var->type) > 0)
- uniform_size[var->data.driver_location / 16] = type_size_vec4(var->type);
-   }
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
index 2046b94..0ce48b8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp
@@ -59,8 +59,6 @@ vec4_tcs_visitor::emit_nir_code()
* copies VS outputs to TES inputs.
*/
   uniforms = 2;
-  uniform_size[0] = 1;
-  uniform_size[1] = 1;
 
   uint64_t varyings = key->outputs_written;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index ee86e03..4cfbc14 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1726,17 +1726,6 @@ vec4_visitor::vec4_visitor(const struct brw_compiler *compiler,
this->max_grf = devinfo->gen >= 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
this->uniforms = 0;
-
-   /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires
-* at least one. See setup_uniforms() in brw_vec4.cpp.
-*/
-   this->uniform_array_size = 1;
-   if (prog_data) {
-  this->uniform_array_size =
- MAX2(DIV_ROUND_UP(stage_prog_data->nr_params, 4), 1);
-   }
-
-   this->uniform_size = rzalloc_array(mem_ctx, int, this->uniform_array_size);
 }
 
 vec4_visitor::~vec4_visitor()
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
index f3cfc88..39f0c0b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
@@ -161,7 +161,6 @@ void
 vec4_vs_visitor::setup_uniform_clipplane_values()
 {
for (int i = 0; i < key->nr_userclip_plane_consts; ++i) {
-  assert(this->uniforms < uniform_array_size);
   this->userplane[i] = dst_reg(UNIFORM, this->uniforms);
   this->userplane[i].type = BRW_REGISTER_TYPE_F;
   for (int j = 0; j < 4; ++j) {
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 06/15] i965/fs: Use MOV_INDIRECT for all indirect uniform loads

2016-03-22 Thread Jason Ekstrand
Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load.  We also have to
rework some of the other bits of the backend to handle this new form of
uniform load.  The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 72 +++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 54 +++-
 2 files changed, 87 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 91487c9..2cf093a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1963,8 +1963,8 @@ fs_visitor::assign_constant_locations()
  if (inst->src[i].file != UNIFORM)
 continue;
 
- if (inst->src[i].reladdr) {
-int uniform = inst->src[i].nr;
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
+int uniform = inst->src[0].nr;
 
 /* If this array isn't already present in the pull constant buffer,
  * add it.
@@ -2046,49 +2046,63 @@ fs_visitor::assign_constant_locations()
 void
 fs_visitor::demote_pull_constants()
 {
-   foreach_block_and_inst (block, fs_inst, inst, cfg) {
+   const unsigned index = stage_prog_data->binding_table.pull_constants_start;
+
+   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
+  /* Set up the annotation tracking for new generated instructions. */
+  const fs_builder ibld(this, block, inst);
+
   for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file != UNIFORM)
continue;
 
- int pull_index;
+ /* We'll handle this case later */
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0)
+continue;
+
  unsigned location = inst->src[i].nr + inst->src[i].reg_offset;
- if (location >= uniforms) /* Out of bounds access */
-pull_index = -1;
- else
-pull_index = pull_constant_loc[location];
+ if (location >= uniforms)
+continue; /* Out of bounds access */
+
+ int pull_index = pull_constant_loc[location];
 
  if (pull_index == -1)
continue;
 
- /* Set up the annotation tracking for new generated instructions. */
- const fs_builder ibld(this, block, inst);
- const unsigned index = stage_prog_data->binding_table.pull_constants_start;
- fs_reg dst = vgrf(glsl_type::float_type);
-
  assert(inst->src[i].stride == 0);
 
- /* Generate a pull load into dst. */
- if (inst->src[i].reladdr) {
-VARYING_PULL_CONSTANT_LOAD(ibld, dst,
-   brw_imm_ud(index),
-   *inst->src[i].reladdr,
-   pull_index * 4);
-inst->src[i].reladdr = NULL;
-inst->src[i].stride = 1;
- } else {
-const fs_builder ubld = ibld.exec_all().group(8, 0);
-struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & ~15);
-ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
-  dst, brw_imm_ud(index), offset);
-inst->src[i].set_smear(pull_index & 3);
- }
- brw_mark_surface_used(prog_data, index);
+ fs_reg dst = vgrf(glsl_type::float_type);
+ const fs_builder ubld = ibld.exec_all().group(8, 0);
+ struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & ~15);
+ ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
+   dst, brw_imm_ud(index), offset);
 
  /* Rewrite the instruction to use the temporary VGRF. */
  inst->src[i].file = VGRF;
  inst->src[i].nr = dst.nr;
  inst->src[i].reg_offset = 0;
+ inst->src[i].set_smear(pull_index & 3);
+
+ brw_mark_surface_used(prog_data, index);
+  }
+
+  if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT &&
+  inst->src[0].file == UNIFORM) {
+
+ unsigned location = inst->src[0].nr + inst->src[0].reg_offset;
+ if (location >= uniforms)
+continue; /* Out of bounds access */
+
+ int pull_index = pull_constant_loc[location];
+ assert(pull_index >= 0); /* This had better be pull */
+
+ VARYING_PULL_CONSTANT_LOAD(ibld, inst->dst,
+brw_imm_ud(index),
+inst->src[1],
+pull_index * 4);
+ inst->remove(block);
+
+ brw_mark_surface_used(prog_data, index);
   }
}
invalidate_live_intervals();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4de5599..2cec97b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ 
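
The direct (non-indirect) path in demote_pull_constants above turns a dword-granular pull_index into a 16-byte-aligned buffer offset plus a component select. A standalone sketch of just that arithmetic follows; PullAddress and pull_address are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical helper type: where a pulled dword lives in the buffer.
struct PullAddress {
   unsigned vec4_byte_offset;  // byte offset of the enclosing vec4
   unsigned component;         // which of the four dwords to smear (0..3)
};

// pull_index counts 32-bit components in the pull constant buffer.
PullAddress
pull_address(unsigned pull_index)
{
   return {
      (pull_index * 4u) & ~15u,  // bytes, rounded down to a 16-byte boundary
      pull_index & 3u            // component within that vec4
   };
}
```

For example, pull_index 6 is byte 24, which lies in the vec4 starting at byte 16, component 2. That matches the `(pull_index * 4) & ~15` offset and the `set_smear(pull_index & 3)` call in the hunk.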

[Mesa-dev] [PATCH v2 12/15] i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD

2016-03-22 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1a62029..6c7c8cd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -174,7 +174,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
 * CSE can later notice that those loads are all the same and eliminate
 * the redundant ones.
 */
-   fs_reg vec4_offset = vgrf(glsl_type::int_type);
+   fs_reg vec4_offset = vgrf(glsl_type::uint_type);
bld.ADD(vec4_offset, varying_offset, brw_imm_ud(const_offset & ~0xf));
 
int scale = 1;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v2 04/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware

2016-03-22 Thread Jason Ekstrand
While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant.  This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible so we should handle it.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 67 +-
 2 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d41c8a8..91487c9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4481,6 +4481,10 @@ get_lowered_simd_width(const struct brw_device_info *devinfo,
case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
   return 8;
 
+   case SHADER_OPCODE_MOV_INDIRECT:
+  /* Prior to Broadwell, we only have 8 address subregisters */
+  return devinfo->gen < 8 ? 8 : inst->exec_size;
+
default:
   return inst->exec_size;
}
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 35400cb..4e89a8a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -351,22 +351,63 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
 
unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr;
 
-   /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
-   struct brw_reg addr = vec8(brw_address_reg(0));
+   if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) {
+  imm_byte_offset += indirect_byte_offset.ud;
 
-   /* The destination stride of an instruction (in bytes) must be greater
-* than or equal to the size of the rest of the instruction.  Since the
-* address register is of type UW, we can't use a D-type instruction.
-* In order to get around this, re re-type to UW and use a stride.
-*/
-   indirect_byte_offset =
-  retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
+  reg.nr = imm_byte_offset / REG_SIZE;
+  reg.subnr = imm_byte_offset % REG_SIZE;
+  brw_MOV(p, dst, reg);
+   } else {
+  /* Prior to Broadwell, there are only 8 address registers. */
+  assert(inst->exec_size == 8 || devinfo->gen >= 8);
+
+  /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
+  struct brw_reg addr = vec8(brw_address_reg(0));
+
+  /* The destination stride of an instruction (in bytes) must be greater
+   * than or equal to the size of the rest of the instruction.  Since the
+   * address register is of type UW, we can't use a D-type instruction.
+   * In order to get around this, we re-type to UW and use a stride.
+   */
+  indirect_byte_offset =
+ retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
+
+  struct brw_reg ind_src;
+  if (devinfo->gen < 8) {
+ /* Prior to broadwell, we have a restriction that the bottom 5 bits
+  * of the base offset and the bottom 5 bits of the indirect must add
+  * to less than 32.  In other words, the hardware needs to be able to
+  * add the bottom five bits of the two to get the subnumber and add
+  * the next 7 bits of each to get the actual register number.  Since
+  * the indirect may cause us to cross a register boundary, this makes
+  * it almost useless.  We could try and do something clever where we
+  * use a actual base offset if base_offset % 32 == 0 but that would
+  * mean we were generating different code depending on the base
+  * offset.  Instead, for the sake of consistency, we'll just do the
+  * add ourselves.
+  */
+ brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
+ ind_src = brw_VxH_indirect(0, 0);
+  } else {
+ brw_MOV(p, addr, indirect_byte_offset);
+ ind_src = brw_VxH_indirect(0, imm_byte_offset);
+  }
 
-   /* Prior to Broadwell, there are only 8 address registers. */
-   assert(inst->exec_size == 8 || devinfo->gen >= 8);
+  brw_inst *mov = brw_MOV(p, dst, retype(ind_src, dst.type));
 
-   brw_MOV(p, addr, indirect_byte_offset);
-   brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
+  if (devinfo->gen == 6 && dst.file == BRW_MESSAGE_REGISTER_FILE &&
+  !inst->get_next()->is_head_sentinel() &&
+  ((fs_inst *)inst->get_next())->mlen > 0) {
+ /* From the Sandybridge PRM:
+  *
+  *"[Errata: DevSNB(SNB)] If MRF register is updated by any
+  *instruction that “indexed/indirect” source AND is followed by a
+  *send, the instruction requires a “Switch”. This is to avoid
+  *race condition where send may dispatch before MRF is updated."
+  */
+ brw_inst_set_thread_control(devinfo, mov, BRW_THREAD_SWITCH);
+  }
+   }
 }
 
 void
-- 
2.5.0.400.gff86faf
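
Two pieces of arithmetic in this patch are easy to check in isolation: folding a constant indirect into the register number and sub-register byte offset, and the pre-Broadwell rule that the low five bits of the base and of the indirect must sum to less than 32. The sketch below is standalone. RegAddr, fold_constant_indirect and low_bits_fit are hypothetical names, and REG_SIZE is taken to be 32 bytes.

```cpp
#include <cassert>

constexpr unsigned REG_SIZE = 32;  // bytes per GRF (assumed)

// Hypothetical register address: register number plus byte offset in it.
struct RegAddr {
   unsigned nr;
   unsigned subnr;
};

// Constant-indirect case: add the indirect to the flat byte offset and
// split it back into register number and sub-register offset.
RegAddr
fold_constant_indirect(RegAddr base, unsigned indirect_bytes)
{
   const unsigned byte_offset =
      base.nr * REG_SIZE + base.subnr + indirect_bytes;
   return { byte_offset / REG_SIZE, byte_offset % REG_SIZE };
}

// Pre-Broadwell restriction described in the comment above: the bottom
// five bits of the two offsets must add up to less than 32.
bool
low_bits_fit(unsigned base_bytes, unsigned indirect_bytes)
{
   return (base_bytes & 31u) + (indirect_bytes & 31u) < 32u;
}
```

Since a run-time indirect cannot be guaranteed to satisfy low_bits_fit, the patch sidesteps the restriction on gen < 8 by adding the immediate offset into the address register itself and using a zero base.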


[Mesa-dev] [PATCH v2 03/15] i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr

2016-03-22 Thread Jason Ekstrand
The subnr field is in bytes so we don't need to multiply by type_sz.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7d15794..d41c8a8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -870,7 +870,7 @@ fs_inst::regs_read(int arg) const
  * unread portion at the beginning.
  */
 if (src[0].subnr)
-   region_length += src[0].subnr * type_sz(src[0].type);
+   region_length += src[0].subnr;
 
 return DIV_ROUND_UP(region_length, REG_SIZE);
  } else {
-- 
2.5.0.400.gff86faf
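
A quick worked example of the fix, as a standalone sketch rather than the driver function. File, div_round_up and regs_read_indirect are hypothetical names, REG_SIZE is taken to be 32 bytes, and the UNIFORM branch mirrors the one added in patch 01 of this series.

```cpp
#include <cassert>

constexpr unsigned REG_SIZE = 32;  // bytes per GRF (assumed)

enum class File { UNIFORM, FIXED_GRF };

constexpr unsigned
div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

// How many registers a MOV_INDIRECT source region covers.
// subnr and region_length are both in bytes.
unsigned
regs_read_indirect(File file, unsigned subnr, unsigned region_length)
{
   if (file == File::UNIFORM) {
      /* Uniform "registers" are single dwords. */
      assert(region_length % 4 == 0);
      return region_length / 4;
   }

   /* subnr is already a byte count, so it is added as-is. */
   region_length += subnr;
   return div_round_up(region_length, REG_SIZE);
}
```

With a 64-byte region starting 16 bytes into a GRF, this gives DIV_ROUND_UP(80, 32) = 3 registers. Multiplying subnr by a 4-byte type size, as the old code did, would have yielded DIV_ROUND_UP(128, 32) = 4.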



[Mesa-dev] [PATCH v2 11/15] i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants

2016-03-22 Thread Jason Ekstrand
This commit moves us to an instruction based model rather than a
register-based model for indirects.  This is more accurate anyway as we
have to emit instructions to resolve the reladdr.  It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.

One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants.  This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h   |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 86 --
 4 files changed, 50 insertions(+), 51 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index baf72a2..d468380 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -805,7 +805,7 @@ vec4_visitor::move_push_constants_to_pull_constants()
 dst_reg temp = dst_reg(this, glsl_type::vec4_type);
 
 emit_pull_constant_load(block, inst, temp, inst->src[i],
-pull_constant_loc[uniform]);
+pull_constant_loc[uniform], src_reg());
 
 inst->src[i].file = temp.file;
  inst->src[i].nr = temp.nr;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 9c40ed7..f93805a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -287,7 +287,8 @@ public:
void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
dst_reg dst,
src_reg orig_src,
-   int base_offset);
+   int base_offset,
+src_reg indirect);
void emit_pull_constant_load_reg(dst_reg dst,
 src_reg surf_index,
 src_reg offset,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index eef3940..66275bb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -708,12 +708,14 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
  /* Offsets are in bytes but they should always be multiples of 16 */
  assert(const_offset->u32[0] % 16 == 0);
  src.reg_offset = const_offset->u32[0] / 16;
+
+ emit(MOV(dest, src));
   } else {
- src_reg tmp = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_D, 1);
- src.reladdr = new(mem_ctx) src_reg(tmp);
-  }
+ src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 1);
 
-  emit(MOV(dest, src));
+ emit(SHADER_OPCODE_MOV_INDIRECT, dest, src,
+  indirect, brw_imm_ud(instr->const_index[1]));
+  }
   break;
}
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index cfa58a6..ee86e03 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1581,16 +1581,16 @@ vec4_visitor::move_grf_array_access_to_scratch()
 void
 vec4_visitor::emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
  dst_reg temp, src_reg orig_src,
- int base_offset)
+ int base_offset, src_reg indirect)
 {
int reg_offset = base_offset + orig_src.reg_offset;
const unsigned index = prog_data->base.binding_table.pull_constants_start;
 
src_reg offset;
-   if (orig_src.reladdr) {
+   if (indirect.file != BAD_FILE) {
   offset = src_reg(this, glsl_type::int_type);
 
-  emit_before(block, inst, ADD(dst_reg(offset), *orig_src.reladdr,
+  emit_before(block, inst, ADD(dst_reg(offset), indirect,
brw_imm_d(reg_offset * 16)));
} else if (devinfo->gen >= 8) {
   /* Store the offset in a GRF so we can send-from-GRF. */
@@ -1625,59 +1625,55 @@ vec4_visitor::move_uniform_array_access_to_pull_constants()
 {
int pull_constant_loc[this->uniforms];
memset(pull_constant_loc, -1, sizeof(pull_constant_loc));
-   bool nested_reladdr;
 
-   /* Walk through and find array access of uniforms.  Put a copy of that
-* uniform in the pull constant buffer.
-*
-* Note that we don't move constant-indexed accesses to arrays.  No
-* testing has been done of the performance impact of this choice.
+   /* First, walk through the instructions and determine which things need to
+* be pulled.  We mark something as needing to be pulled by setting
+* pull_constant_loc to 0.
 */
-   do {
-  

[Mesa-dev] [PATCH v2 00/15] i965: Rework uniform handling in the back-end

2016-03-22 Thread Jason Ekstrand
This is mostly a re-send of a patch series I've had floating around in one
form or another for quite some time.  It's basically the same except that
the original version was missing a work-around for Sandy Bridge.  For a
while, I wasn't really pushing to get it merged because I couldn't
demonstrate any actual performance benefit from pushing arrays.  However,
with the Vulkan API, the concept of push constants is directly exposed to
the user and we really need to be able to indirect on them.  This series
makes the FS backend 100% ready for indirect push constants;  vec4 will
take a little more work.

It's worth noting that we've been carrying these patches around in our
Vulkan driver for probably 3 or 4 months now and it's working great.

For those that prefer to review on a branch:

https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/i965-uniforms

I think Kristian has mostly reviewed these patches.  However, he never sent
any R-Bs to the list.  I'd also like Ken or Matt to look at it from a
design perspective.

Jason Ekstrand (15):
  i965/fs: Add support for doing MOV_INDIRECT on uniforms
  i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions
  i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr
  i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
  nir: Add another index to load_uniform to specify the range read
  i965/fs: Use MOV_INDIRECT for all indirect uniform loads
  i965/fs: Get rid of reladdr
  i965/fs: Stop relying on param_size in assign_constant_locations
  i965/fs: Get rid of the param_size array
  i965/vec4: Inline get_pull_constant_offset
  i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push
constants
  i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD
  i965/vec4: Get rid of the uniform_size array
  i965/fs: Rename demote_pull_constants to lower_constant_loads
  i965/fs: Push small uniform arrays

 src/compiler/nir/nir.h|   7 +
 src/compiler/nir/nir_intrinsics.h |   6 +-
 src/compiler/nir/nir_lower_io.c   |   5 +
 src/compiler/nir/nir_print.c  |   1 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 189 +-
 src/mesa/drivers/dri/i965/brw_fs.h|   4 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp|  68 ++--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |  63 +---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |   3 -
 src/mesa/drivers/dri/i965/brw_ir_fs.h |   5 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  10 +-
 src/mesa/drivers/dri/i965/brw_vec4.h  |   7 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp|  19 +--
 src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp|   2 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 130 ++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |   1 -
 16 files changed, 292 insertions(+), 228 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.

2016-03-22 Thread Kenneth Graunke
On Tuesday, March 22, 2016 2:31:42 PM PDT Ian Romanick wrote:
> On 03/17/2016 09:18 AM, Martin Peres wrote:
> > On 16/03/16 19:33, Kenneth Graunke wrote:
> >> The SIN and COS instructions on Intel hardware can produce values
> >> slightly outside of the [-1.0, 1.0] range for a small set of values.
> >> Obviously, this can break everyone's expectations about trig functions.
> >>
> >> According to an internal presentation, the COS instruction can produce
> >> a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
> >> suggested workaround is to multiply by 0.99997, scaling down the
> >> amplitude slightly.  Apparently this also minimizes the error function,
> >> reducing the maximum error from 0.00006 to about 0.00003.
> >>
> >> I chose to apply this only when not saturating, as saturate already
> >> clamps to 1.0.  This may or may not be a good idea.
> >>
> >> Fixes 16 dEQP precision tests
> >>
> >> dEQP-GLES31.functional.shaders.builtin_functions.precision.
> >> {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4}.
> >>
> >> at the cost of making every sin and cos call more expensive.
> >>
> >> Signed-off-by: Kenneth Graunke 
> >> ---
> >>   src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26
> >> --
> >>   src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26
> >> --
> >>   2 files changed, 40 insertions(+), 12 deletions(-)
> >>
> >> This has been in the Vulkan tree for a while - we needed it to pass the
> >> Vulkan CTS, as it contains these same dEQP tests.
> >>
> >> I haven't run shader-db yet, but I don't expect we'll like the results.
> >>
> >> The patch is pretty sketchy, too.  I'm sort of tempted to hide it behind
> >> an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day...
> > 
> > FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion.
> 
> Can you explain this result?  -10.45% of what?  Instructions?  FPS?  And
> this is comparing what to what?  Before this patch to after?

I think FPS goes down by 10.45% when applying this patch.  Pretty dire.

--Ken




[Mesa-dev] [Bug 94512] X segfaults with glx-tls enabled in a x32 environment

2016-03-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94512

--- Comment #5 from EoD  ---
(In reply to Michel Dänzer from comment #4)
> Sounds like it's not a driver specific issue then but a general one with
> GLX-TLS on x32.

X did not segfault when I used my Barts (radeon + r600) instead of the Tonga
(amdgpu + radeonsi), although I had no acceleration on either. So there might be
two issues here which I mixed up by accident. I'll stick to the glx-tls issue
from now on.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] i965: Work around SIN/COS output range problem.

2016-03-22 Thread Ian Romanick
On 03/17/2016 09:18 AM, Martin Peres wrote:
> On 16/03/16 19:33, Kenneth Graunke wrote:
>> The SIN and COS instructions on Intel hardware can produce values
>> slightly outside of the [-1.0, 1.0] range for a small set of values.
>> Obviously, this can break everyone's expectations about trig functions.
>>
>> According to an internal presentation, the COS instruction can produce
>> a value up to 1.000027 for inputs in the range (0.08296, 0.09888).  One
>> suggested workaround is to multiply by 0.99997, scaling down the
>> amplitude slightly.  Apparently this also minimizes the error function,
>> reducing the maximum error from 0.00006 to about 0.00003.
>>
>> I chose to apply this only when not saturating, as saturate already
>> clamps to 1.0.  This may or may not be a good idea.
>>
>> Fixes 16 dEQP precision tests
>>
>> dEQP-GLES31.functional.shaders.builtin_functions.precision.
>> {cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4}.
>>
>> at the cost of making every sin and cos call more expensive.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>   src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 26
>> --
>>   src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 26
>> --
>>   2 files changed, 40 insertions(+), 12 deletions(-)
>>
>> This has been in the Vulkan tree for a while - we needed it to pass the
>> Vulkan CTS, as it contains these same dEQP tests.
>>
>> I haven't run shader-db yet, but I don't expect we'll like the results.
>>
>> The patch is pretty sketchy, too.  I'm sort of tempted to hide it behind
>> an INTEL_STRICT_CONFORMANCE=1 option, like we had way back in the day...
> 
> FYI, a quick run on hsw_gt2 shows -10.45% on Gputest:voplosion.

Can you explain this result?  -10.45% of what?  Instructions?  FPS?  And
this is comparing what to what?  Before this patch to after?



Re: [Mesa-dev] [PATCH] meta: Use ARB_explicit_attrib_location in the rest of the meta shaders.

2016-03-22 Thread Ian Romanick
I thought I already had this patch, but it looks like it was on my to-do
list.  I had patches to use GL_ARB_explicit_uniform_location, and those
needed GL_ARB_explicit_attrib_location (to get the layout keyword).

https://patchwork.freedesktop.org/patch/74220/ (and others... I'm not
sure why I haven't landed these.)

Every driver that supports GLSL also supports
GL_ARB_explicit_attrib_location.

On 03/15/2016 12:05 PM, Kenneth Graunke wrote:
> This is cleaner than using glBindAttribLocation().
> 
> Not all drivers support the extension, but I don't think those drivers
> use GLSL in the first place.  Apparently some Meta shaders already use
> GL_ARB_explicit_attrib_location, so I think it should be okay.
> 
> Honestly, I'm not sure how the old code worked anyway - we bound the
> attribute location for "texcoords", while all the shaders capitalized
> or spelled it differently.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/common/meta.c  | 17 ++---
>  src/mesa/drivers/common/meta_blit.c | 15 +--
>  2 files changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index ab78f45..b05dfc7 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -207,8 +207,6 @@ _mesa_meta_compile_and_link_program(struct gl_context 
> *ctx,
> _mesa_DeleteShader(fs);
> _mesa_AttachShader(*program, vs);
> _mesa_DeleteShader(vs);
> -   _mesa_BindAttribLocation(*program, 0, "position");
> -   _mesa_BindAttribLocation(*program, 1, "texcoords");
> _mesa_meta_link_program_with_debug(ctx, *program);
>  
> _mesa_UseProgram(*program);
> @@ -230,19 +228,15 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>  {
> char *vs_source, *fs_source;
> struct blit_shader *shader = choose_blit_shader(target, table);
> -   const char *vs_input, *vs_output, *fs_input, *vs_preprocess, 
> *fs_preprocess;
> +   const char *fs_input, *vs_preprocess, *fs_preprocess;
> void *mem_ctx;
>  
> if (ctx->Const.GLSLVersion < 130) {
>vs_preprocess = "";
> -  vs_input = "attribute";
> -  vs_output = "varying";
>fs_preprocess = "#extension GL_EXT_texture_array : enable";
>fs_input = "varying";
> } else {
>vs_preprocess = "#version 130";
> -  vs_input = "in";
> -  vs_output = "out";
>fs_preprocess = "#version 130";
>fs_input = "in";
>shader->func = "texture";
> @@ -259,15 +253,16 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>  
> vs_source = ralloc_asprintf(mem_ctx,
>  "%s\n"
> -"%s vec2 position;\n"
> -"%s vec4 textureCoords;\n"
> -"%s vec4 texCoords;\n"
> +"#extension GL_ARB_explicit_attrib_location: enable\n"
> +"layout(location = 0) in vec2 position;\n"
> +"layout(location = 1) in vec4 textureCoords;\n"
> +"out vec4 texCoords;\n"
>  "void main()\n"
>  "{\n"
>  "   texCoords = textureCoords;\n"
>  "   gl_Position = vec4(position, 0.0, 1.0);\n"
>  "}\n",
> -vs_preprocess, vs_input, vs_input, vs_output);
> +vs_preprocess);
>  
> fs_source = ralloc_asprintf(mem_ctx,
>  "%s\n"
> diff --git a/src/mesa/drivers/common/meta_blit.c 
> b/src/mesa/drivers/common/meta_blit.c
> index 5d80f7d..179dc0d 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -168,8 +168,9 @@ setup_glsl_msaa_blit_scaled_shader(struct gl_context *ctx,
>  
> static const char vs_source[] =
> "#version 130\n"
> -   "in vec2 position;\n"
> -   "in vec3 textureCoords;\n"
> +   "#extension GL_ARB_explicit_attrib_location: 
> enable\n"
> +   "layout(location = 0) in vec2 position;\n"
> +   "layout(location = 1) in vec3 
> textureCoords;\n"
> "out vec2 texCoords;\n"
> "flat out int layer;\n"
> "void main()\n"
> @@ -384,8 +385,9 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
>  
>vs_source = ralloc_asprintf(mem_ctx,
>"#version 130\n"
> -  "in vec2 position;\n"
> -  "in %s textureCoords;\n"
> +  "#extension 
> GL_ARB_explicit_attrib_location: enable\n"
> +  "layout(location = 0) in vec2 position;\n"
> +  "layout(location = 1) in %s 
> textureCoords;\n"
>"out %s texCoords;\n"
> 

Re: [Mesa-dev] [PATCH] gallium/swr: update rasterizer (532172)

2016-03-22 Thread Jordan Justen
What does 532172 in the subject refer to?

On 2016-03-22 12:45:48, Tim Rowley wrote:
> Highlights include:
>   * code style fixes
>   * start removing win32 types
>   * switch DC/DS rings to ringbuffer datastructure
>   * rdtsc bucket support for shaders
>   * address some coverity issues
>   * user clip planes
>   * global arena
>   * support llvm-svn

From this commit message, it seems clear that this single patch is
doing a whole lot. Usually that's a good sign that it should be split
into multiple patches.

However, since this is only changing your driver, you can probably
take any sort of patches that you like. :)

There is arguably little value to sending out a patch like this, since
it is very difficult to review. In other words, perhaps if you are
going to make big, unreviewable patches like this that only change
your driver, then you might as well just push them straight away.

(But, it would be better, in my opinion, to try to split up the
changes and let them be reviewed.)

-Jordan


Re: [Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed

2016-03-22 Thread Nicolai Hähnle

On 22.03.2016 12:27, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_state.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index b9bdd47..b8fde00 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen *screen,
if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
height = 1;
depth = res->array_size;
-   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
+   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
+  type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
if (sampler || res->target != PIPE_TEXTURE_3D)
depth = res->array_size;
} else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)



Looks good to me.

Reviewed-by: Nicolai Hähnle 


Re: [Mesa-dev] [PATCH 02/11] radeonsi: implement set_shader_buffers

2016-03-22 Thread Marek Olšák
On Tue, Mar 22, 2016 at 9:15 PM, Marek Olšák  wrote:
> On Tue, Mar 22, 2016 at 12:21 AM, Nicolai Hähnle  wrote:
>> From: Nicolai Hähnle 
>>
>> ---
>>  src/gallium/drivers/radeonsi/si_descriptors.c |  61 +-
>>  src/gallium/drivers/radeonsi/si_pipe.h|   1 +
>>  src/gallium/drivers/radeonsi/si_shader.c  |   5 +-
>>  src/gallium/drivers/radeonsi/si_shader.h  | 114 
>> +-
>>  4 files changed, 123 insertions(+), 58 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
>> b/src/gallium/drivers/radeonsi/si_descriptors.c
>> index c7c30bf..72bd50f 100644
>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>> @@ -746,6 +746,55 @@ static void si_set_constant_buffer(struct pipe_context 
>> *ctx, uint shader, uint s
>> buffers->desc.list_dirty = true;
>>  }
>>
>> +/* SHADER BUFFERS */
>> +
>> +static void si_set_shader_buffers(struct pipe_context *ctx, unsigned shader,
>> + unsigned start_slot, unsigned count,
>> + struct pipe_shader_buffer *sbuffers)
>> +{
>> +   struct si_context *sctx = (struct si_context *)ctx;
>> +   struct si_buffer_resources *buffers = &sctx->shader_buffers[shader];
>> +   unsigned i;
>> +
>> +   assert(start_slot + count <= SI_NUM_SHADER_BUFFERS);
>
> SI_NUM_SHADER_BUFFERS should be defined in this patch.

BTW, this will check if all commits can be built:

git rebase -i origin -x "make -j4 >/dev/null"

Marek


Re: [Mesa-dev] [PATCH] nir: fix dangling ssadef->name ptrs

2016-03-22 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Tue, Mar 22, 2016 at 12:35 PM, Rob Clark  wrote:

> From: Rob Clark 
>
> In many places, the convention is to pass an existing ssadef name ptr
> when constructing/initializing a new nir_ssa_def.  But that goes badly
> (as noticed by garbage in nir_print output) when the original string
> gets freed.
>
> Just use ralloc_strdup() instead, and add ralloc_free() in the two
> places that would care (not that the strings wouldn't eventually get
> freed anyways).
>
> Also fixup the nir_search code which was directly setting ssadef->name
> to use the parent instruction as memctx.
>
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/nir.c| 4 +++-
>  src/compiler/nir/nir_search.c | 6 +++---
>  src/compiler/nir/nir_to_ssa.c | 2 ++
>  3 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index b114981..20f1a18 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -1317,12 +1317,13 @@ nir_instr_rewrite_dest(nir_instr *instr, nir_dest
> *dest, nir_dest new_dest)
>src_add_all_uses(dest->reg.indirect, instr, NULL);
>  }
>
> +/* note: does *not* take ownership of 'name' */
>  void
>  nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
>   unsigned num_components,
>   unsigned bit_size, const char *name)
>  {
> -   def->name = name;
> +   def->name = ralloc_strdup(instr, name);
> def->parent_instr = instr;
> list_inithead(&def->uses);
> list_inithead(&def->if_uses);
> @@ -1339,6 +1340,7 @@ nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
> }
>  }
>
> +/* note: does *not* take ownership of 'name' */
>  void
>  nir_ssa_dest_init(nir_instr *instr, nir_dest *dest,
>   unsigned num_components, unsigned bit_size,
> diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
> index 6df662a..842ff65 100644
> --- a/src/compiler/nir/nir_search.c
> +++ b/src/compiler/nir/nir_search.c
> @@ -464,7 +464,7 @@ construct_value(const nir_search_value *value,
>
>switch (c->type) {
>case nir_type_float:
> - load->def.name = ralloc_asprintf(mem_ctx, "%f", c->data.d);
> + load->def.name = ralloc_asprintf(load, "%f", c->data.d);
>   switch (bitsize->dest_size) {
>   case 32:
>  load->value.f32[0] = c->data.d;
> @@ -478,7 +478,7 @@ construct_value(const nir_search_value *value,
>   break;
>
>case nir_type_int:
> - load->def.name = ralloc_asprintf(mem_ctx, "%ld", c->data.i);
> + load->def.name = ralloc_asprintf(load, "%ld", c->data.i);
>   switch (bitsize->dest_size) {
>   case 32:
>  load->value.i32[0] = c->data.i;
> @@ -492,7 +492,7 @@ construct_value(const nir_search_value *value,
>   break;
>
>case nir_type_uint:
> - load->def.name = ralloc_asprintf(mem_ctx, "%lu", c->data.u);
> + load->def.name = ralloc_asprintf(load, "%lu", c->data.u);
>   switch (bitsize->dest_size) {
>   case 32:
>  load->value.u32[0] = c->data.u;
> diff --git a/src/compiler/nir/nir_to_ssa.c b/src/compiler/nir/nir_to_ssa.c
> index 0640607..d588d7d 100644
> --- a/src/compiler/nir/nir_to_ssa.c
> +++ b/src/compiler/nir/nir_to_ssa.c
> @@ -221,6 +221,7 @@ rewrite_def_forwards(nir_dest *dest, void *_state)
> list_del(&dest->reg.def_link);
> nir_ssa_dest_init(state->parent_instr, dest, reg->num_components,
>   reg->bit_size, name);
> +   ralloc_free(name);
>
> /* push our SSA destination on the stack */
> state->states[index].index++;
> @@ -274,6 +275,7 @@ rewrite_alu_instr_forward(nir_alu_instr *instr,
> rewrite_state *state)
>list_del(&instr->dest.dest.reg.def_link);
>nir_ssa_dest_init(&instr->instr, &instr->dest.dest, num_components,
>  reg->bit_size, name);
> +  ralloc_free(name);
>
>if (nir_op_infos[instr->op].output_size == 0) {
>   /*
> --
> 2.5.0
>
>


Re: [Mesa-dev] [PATCH 02/11] radeonsi: implement set_shader_buffers

2016-03-22 Thread Marek Olšák
On Tue, Mar 22, 2016 at 12:21 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c |  61 +-
>  src/gallium/drivers/radeonsi/si_pipe.h|   1 +
>  src/gallium/drivers/radeonsi/si_shader.c  |   5 +-
>  src/gallium/drivers/radeonsi/si_shader.h  | 114 
> +-
>  4 files changed, 123 insertions(+), 58 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index c7c30bf..72bd50f 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -746,6 +746,55 @@ static void si_set_constant_buffer(struct pipe_context 
> *ctx, uint shader, uint s
> buffers->desc.list_dirty = true;
>  }
>
> +/* SHADER BUFFERS */
> +
> +static void si_set_shader_buffers(struct pipe_context *ctx, unsigned shader,
> + unsigned start_slot, unsigned count,
> + struct pipe_shader_buffer *sbuffers)
> +{
> +   struct si_context *sctx = (struct si_context *)ctx;
> +   struct si_buffer_resources *buffers = &sctx->shader_buffers[shader];
> +   unsigned i;
> +
> +   assert(start_slot + count <= SI_NUM_SHADER_BUFFERS);

SI_NUM_SHADER_BUFFERS should be defined in this patch.

Marek


Re: [Mesa-dev] history of the dusscutions in the mailing list

2016-03-22 Thread Samuel Pitoiset



On 03/22/2016 09:01 PM, Iurie wrote:

hello,
where is it possible to see the history of the discussions on the mailing list?


https://lists.freedesktop.org/archives/mesa-dev/



thanks




[Mesa-dev] history of the dusscutions in the mailing list

2016-03-22 Thread Iurie
hello,
where is it possible to see the history of the discussions on the mailing list?

thanks


Re: [Mesa-dev] [PATCH] mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.

2016-03-22 Thread Bas Nieuwenhuizen
Sure, no problem.

- Bas

On Tue, Mar 22, 2016 at 8:27 PM, Brian Paul  wrote:
> If you can wait until tomorrow, Bas, I'll do an overnight piglit run to check
> for regressions.
>
> -Brian
>
>
> On 03/22/2016 12:31 PM, Marek Olšák wrote:
>>
>> Reviewed-by: Marek Olšák 
>>
>> Somebody from Intel or VMWare might want to take a look too.
>>
>> Marek
>>
>> On Tue, Mar 22, 2016 at 2:58 AM, Bas Nieuwenhuizen
>>  wrote:
>>>
>>> This removes any dependency on driver validation of the number of
>>> framebuffer samples.
>>>
>>> Signed-off-by: Bas Nieuwenhuizen 
>>> ---
>>>   src/mesa/drivers/dri/i965/brw_util.h   |  5 +++--
>>>   src/mesa/drivers/dri/i965/gen6_cc.c|  6 +++---
>>>   src/mesa/drivers/dri/i965/gen6_multisample_state.c |  2 +-
>>>   src/mesa/drivers/dri/i965/gen8_blend_state.c   |  6 +++---
>>>   src/mesa/drivers/dri/i965/gen8_depth_state.c   |  3 ++-
>>>   src/mesa/drivers/dri/i965/gen8_sf_state.c  |  4 ++--
>>>   src/mesa/main/framebuffer.c| 19
>>> +++
>>>   src/mesa/main/framebuffer.h|  3 +++
>>>   src/mesa/main/mtypes.h |  1 -
>>>   src/mesa/main/state.c  | 17
>>> -
>>>   src/mesa/program/prog_statevars.c  |  2 +-
>>>   src/mesa/state_tracker/st_atom_rasterizer.c|  4 ++--
>>>   src/mesa/state_tracker/st_atom_shader.c|  2 +-
>>>   src/mesa/swrast/s_points.c |  4 ++--
>>>   14 files changed, 42 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_util.h
>>> b/src/mesa/drivers/dri/i965/brw_util.h
>>> index 1f27e98..3e9a6ee 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_util.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_util.h
>>> @@ -34,6 +34,7 @@
>>>   #define BRW_UTIL_H
>>>
>>>   #include "brw_context.h"
>>> +#include "main/framebuffer.h"
>>>
>>>   extern GLuint brw_translate_blend_factor( GLenum factor );
>>>   extern GLuint brw_translate_blend_equation( GLenum mode );
>>> @@ -49,13 +50,13 @@ brw_get_line_width(struct brw_context *brw)
>>>   * implementation-dependent maximum non-antialiased line width."
>>>   */
>>>  float line_width =
>>> -  CLAMP(!brw->ctx.Multisample._Enabled && !brw->ctx.Line.SmoothFlag
>>> +  CLAMP(!_mesa_is_multisample_enabled(&brw->ctx) &&
>>> !brw->ctx.Line.SmoothFlag
>>>   ? roundf(brw->ctx.Line.Width) : brw->ctx.Line.Width,
>>>   0.0f, brw->ctx.Const.MaxLineWidth);
>>>  uint32_t line_width_u3_7 = U_FIXED(line_width, 7);
>>>
>>>  /* Line width of 0 is not allowed when MSAA enabled */
>>> -   if (brw->ctx.Multisample._Enabled) {
>>> +   if (_mesa_is_multisample_enabled(&brw->ctx)) {
>>> if (line_width_u3_7 == 0)
>>>line_width_u3_7 = 1;
>>>  } else if (brw->ctx.Line.SmoothFlag && line_width < 1.5f) {
>>> diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c
>>> b/src/mesa/drivers/dri/i965/gen6_cc.c
>>> index cee139b..f5a7d4d 100644
>>> --- a/src/mesa/drivers/dri/i965/gen6_cc.c
>>> +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
>>> @@ -198,14 +198,14 @@ gen6_upload_blend_state(struct brw_context *brw)
>>> if(!is_buffer_zero_integer_format) {
>>>/* _NEW_MULTISAMPLE */
>>>blend[b].blend1.alpha_to_coverage =
>>> -ctx->Multisample._Enabled &&
>>> ctx->Multisample.SampleAlphaToCoverage;
>>> +_mesa_is_multisample_enabled(ctx) &&
>>> ctx->Multisample.SampleAlphaToCoverage;
>>>
>>>  /* From SandyBridge PRM, volume 2 Part 1, section 8.2.3,
>>> BLEND_STATE:
>>>   * DWord 1, Bit 30 (AlphaToOne Enable):
>>>   * "If Dual Source Blending is enabled, this bit must be
>>> disabled"
>>>   */
>>>WARN_ONCE(ctx->Color.Blend[b]._UsesDualSrc &&
>>> -   ctx->Multisample._Enabled &&
>>> +   _mesa_is_multisample_enabled(ctx) &&
>>>  ctx->Multisample.SampleAlphaToOne,
>>>  "HW workaround: disabling alpha to one with dual src
>>> "
>>>  "blending\n");
>>> @@ -213,7 +213,7 @@ gen6_upload_blend_state(struct brw_context *brw)
>>>   blend[b].blend1.alpha_to_one = false;
>>>   else
>>>  blend[b].blend1.alpha_to_one =
>>> -  ctx->Multisample._Enabled &&
>>> ctx->Multisample.SampleAlphaToOne;
>>> +  _mesa_is_multisample_enabled(ctx) &&
>>> ctx->Multisample.SampleAlphaToOne;
>>>
>>>blend[b].blend1.alpha_to_coverage_dither = (brw->gen >= 7);
>>> }
>>> diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
>>> b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
>>> index 8eb620d..fcd313a 100644
>>> --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
>>> +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
>>> @@ -171,7 +171,7 

[Mesa-dev] [PATCH] nir: fix dangling ssadef->name ptrs

2016-03-22 Thread Rob Clark
From: Rob Clark 

In many places, the convention is to pass an existing ssadef name ptr
when constructing/initializing a new nir_ssa_def.  But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.

Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).

Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.

Signed-off-by: Rob Clark 
---
 src/compiler/nir/nir.c| 4 +++-
 src/compiler/nir/nir_search.c | 6 +++---
 src/compiler/nir/nir_to_ssa.c | 2 ++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index b114981..20f1a18 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1317,12 +1317,13 @@ nir_instr_rewrite_dest(nir_instr *instr, nir_dest 
*dest, nir_dest new_dest)
   src_add_all_uses(dest->reg.indirect, instr, NULL);
 }
 
+/* note: does *not* take ownership of 'name' */
 void
 nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
  unsigned num_components,
  unsigned bit_size, const char *name)
 {
-   def->name = name;
+   def->name = ralloc_strdup(instr, name);
def->parent_instr = instr;
list_inithead(&def->uses);
list_inithead(&def->if_uses);
@@ -1339,6 +1340,7 @@ nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
}
 }
 
+/* note: does *not* take ownership of 'name' */
 void
 nir_ssa_dest_init(nir_instr *instr, nir_dest *dest,
  unsigned num_components, unsigned bit_size,
diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
index 6df662a..842ff65 100644
--- a/src/compiler/nir/nir_search.c
+++ b/src/compiler/nir/nir_search.c
@@ -464,7 +464,7 @@ construct_value(const nir_search_value *value,
 
   switch (c->type) {
   case nir_type_float:
- load->def.name = ralloc_asprintf(mem_ctx, "%f", c->data.d);
+ load->def.name = ralloc_asprintf(load, "%f", c->data.d);
  switch (bitsize->dest_size) {
  case 32:
 load->value.f32[0] = c->data.d;
@@ -478,7 +478,7 @@ construct_value(const nir_search_value *value,
  break;
 
   case nir_type_int:
- load->def.name = ralloc_asprintf(mem_ctx, "%ld", c->data.i);
+ load->def.name = ralloc_asprintf(load, "%ld", c->data.i);
  switch (bitsize->dest_size) {
  case 32:
 load->value.i32[0] = c->data.i;
@@ -492,7 +492,7 @@ construct_value(const nir_search_value *value,
  break;
 
   case nir_type_uint:
- load->def.name = ralloc_asprintf(mem_ctx, "%lu", c->data.u);
+ load->def.name = ralloc_asprintf(load, "%lu", c->data.u);
  switch (bitsize->dest_size) {
  case 32:
 load->value.u32[0] = c->data.u;
diff --git a/src/compiler/nir/nir_to_ssa.c b/src/compiler/nir/nir_to_ssa.c
index 0640607..d588d7d 100644
--- a/src/compiler/nir/nir_to_ssa.c
+++ b/src/compiler/nir/nir_to_ssa.c
@@ -221,6 +221,7 @@ rewrite_def_forwards(nir_dest *dest, void *_state)
list_del(&dest->reg.def_link);
nir_ssa_dest_init(state->parent_instr, dest, reg->num_components,
  reg->bit_size, name);
+   ralloc_free(name);
 
/* push our SSA destination on the stack */
state->states[index].index++;
@@ -274,6 +275,7 @@ rewrite_alu_instr_forward(nir_alu_instr *instr, 
rewrite_state *state)
   list_del(&instr->dest.dest.reg.def_link);
   nir_ssa_dest_init(&instr->instr, &instr->dest.dest, num_components,
 reg->bit_size, name);
+  ralloc_free(name);
 
   if (nir_op_infos[instr->op].output_size == 0) {
  /*
-- 
2.5.0



Re: [Mesa-dev] [PATCH] compiler/glsl: Allow the sequence operator to be a constant expression

2016-03-22 Thread Eduardo Lima Mitev

On 03/22/2016 02:48 PM, Lars Hamre wrote:

Resending this patch because it received no response last week.

Allow the sequence operator to be a constant expression in GLSL ES versions 
prior
to GLSL ES 3.0

Fixes the following piglit test:
/all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert



I confirm this fixes the above test, but it also regresses test:

/all/spec/glsl-1.20/compiler/structure-and-array-operations/array-size-sequence-in-parenthesis.vert.

Maybe you are missing a version check?

Eduardo


This mirrors the logic from process_initializer() which performs the
same check for constant variable initialization with sequence operators.

Section 4.3.3 (Constant Expressions) of the GLSL 4.30.9 spec and of the
GLSL ES 3.00.4 spec say that the result of a sequence operator is not a
constant expression; however, we should not mandate that for lower GLSL
versions.

Signed-off-by: Lars Hamre 

---
  src/compiler/glsl/ast_to_hir.cpp | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 5262bd8..4037468 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2125,7 +2125,9 @@ process_array_size(exec_node *node,
 }

 ir_constant *const size = ir->constant_expression_value();
-   if (size == NULL || array_size->has_sequence_subexpression()) {
+   if (size == NULL ||
+   (state->is_version(430, 300) &&
+array_size->has_sequence_subexpression())) {
_mesa_glsl_error(& loc, state, "array size must be a "
 "constant valued expression");
return 0;
--
2.5.0
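
For anyone following the version-check question, here is a minimal standalone sketch of what the condition in the hunk above evaluates to. The struct and helper names are made up for illustration and are not Mesa's; the assumption is that is_version(desktop, es) means "at least this version for the current flavour".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parser state: only what the check needs. */
struct glsl_version {
   bool es;           /* true for GLSL ES */
   unsigned version;  /* e.g. 100, 120, 300, 430 */
};

/* Assumed meaning of is_version(desktop, es): the shader's language version
 * is at least the threshold given for its flavour. */
static bool
is_version(const struct glsl_version *v, unsigned desktop, unsigned es)
{
   unsigned required = v->es ? es : desktop;
   return required != 0 && v->version >= required;
}

/* Returns true when the array size expression would be rejected under the
 * condition used in the patch. */
static bool
array_size_rejected(const struct glsl_version *v,
                    bool has_constant_value, bool has_sequence)
{
   if (!has_constant_value)
      return true;
   return is_version(v, 430, 300) && has_sequence;
}
```

Under these assumptions a GLSL 1.20 shader with a sequence operator in the array size is no longer rejected, which matches the regression reported above for the glsl-1.20 test.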



Re: [Mesa-dev] [PATCH] mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.

2016-03-22 Thread Brian Paul
If you can wait until tomorrow, Bas, I'll do an overnight piglit run to 
check for regressions.


-Brian

On 03/22/2016 12:31 PM, Marek Olšák wrote:

Reviewed-by: Marek Olšák 

Somebody from Intel or VMware might want to take a look too.

Marek

On Tue, Mar 22, 2016 at 2:58 AM, Bas Nieuwenhuizen
 wrote:

This removes any dependency on driver validation of the number of
framebuffer samples.

Signed-off-by: Bas Nieuwenhuizen 
---
  src/mesa/drivers/dri/i965/brw_util.h   |  5 +++--
  src/mesa/drivers/dri/i965/gen6_cc.c|  6 +++---
  src/mesa/drivers/dri/i965/gen6_multisample_state.c |  2 +-
  src/mesa/drivers/dri/i965/gen8_blend_state.c   |  6 +++---
  src/mesa/drivers/dri/i965/gen8_depth_state.c   |  3 ++-
  src/mesa/drivers/dri/i965/gen8_sf_state.c  |  4 ++--
  src/mesa/main/framebuffer.c| 19 +++
  src/mesa/main/framebuffer.h|  3 +++
  src/mesa/main/mtypes.h |  1 -
  src/mesa/main/state.c  | 17 -
  src/mesa/program/prog_statevars.c  |  2 +-
  src/mesa/state_tracker/st_atom_rasterizer.c|  4 ++--
  src/mesa/state_tracker/st_atom_shader.c|  2 +-
  src/mesa/swrast/s_points.c |  4 ++--
  14 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_util.h 
b/src/mesa/drivers/dri/i965/brw_util.h
index 1f27e98..3e9a6ee 100644
--- a/src/mesa/drivers/dri/i965/brw_util.h
+++ b/src/mesa/drivers/dri/i965/brw_util.h
@@ -34,6 +34,7 @@
  #define BRW_UTIL_H

  #include "brw_context.h"
+#include "main/framebuffer.h"

  extern GLuint brw_translate_blend_factor( GLenum factor );
  extern GLuint brw_translate_blend_equation( GLenum mode );
@@ -49,13 +50,13 @@ brw_get_line_width(struct brw_context *brw)
  * implementation-dependent maximum non-antialiased line width."
  */
 float line_width =
-  CLAMP(!brw->ctx.Multisample._Enabled && !brw->ctx.Line.SmoothFlag
+  CLAMP(!_mesa_is_multisample_enabled(&brw->ctx) && !brw->ctx.Line.SmoothFlag
  ? roundf(brw->ctx.Line.Width) : brw->ctx.Line.Width,
  0.0f, brw->ctx.Const.MaxLineWidth);
 uint32_t line_width_u3_7 = U_FIXED(line_width, 7);

 /* Line width of 0 is not allowed when MSAA enabled */
-   if (brw->ctx.Multisample._Enabled) {
+   if (_mesa_is_multisample_enabled(&brw->ctx)) {
if (line_width_u3_7 == 0)
   line_width_u3_7 = 1;
 } else if (brw->ctx.Line.SmoothFlag && line_width < 1.5f) {
diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c 
b/src/mesa/drivers/dri/i965/gen6_cc.c
index cee139b..f5a7d4d 100644
--- a/src/mesa/drivers/dri/i965/gen6_cc.c
+++ b/src/mesa/drivers/dri/i965/gen6_cc.c
@@ -198,14 +198,14 @@ gen6_upload_blend_state(struct brw_context *brw)
if(!is_buffer_zero_integer_format) {
   /* _NEW_MULTISAMPLE */
   blend[b].blend1.alpha_to_coverage =
-ctx->Multisample._Enabled && 
ctx->Multisample.SampleAlphaToCoverage;
+_mesa_is_multisample_enabled(ctx) && 
ctx->Multisample.SampleAlphaToCoverage;

 /* From SandyBridge PRM, volume 2 Part 1, section 8.2.3, BLEND_STATE:
  * DWord 1, Bit 30 (AlphaToOne Enable):
  * "If Dual Source Blending is enabled, this bit must be disabled"
  */
   WARN_ONCE(ctx->Color.Blend[b]._UsesDualSrc &&
-   ctx->Multisample._Enabled &&
+   _mesa_is_multisample_enabled(ctx) &&
 ctx->Multisample.SampleAlphaToOne,
 "HW workaround: disabling alpha to one with dual src "
 "blending\n");
@@ -213,7 +213,7 @@ gen6_upload_blend_state(struct brw_context *brw)
  blend[b].blend1.alpha_to_one = false;
  else
 blend[b].blend1.alpha_to_one =
-  ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToOne;
+  _mesa_is_multisample_enabled(ctx) && 
ctx->Multisample.SampleAlphaToOne;

   blend[b].blend1.alpha_to_coverage_dither = (brw->gen >= 7);
}
diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c 
b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 8eb620d..fcd313a 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -171,7 +171,7 @@ gen6_determine_sample_mask(struct brw_context *brw)
 /* BRW_NEW_NUM_SAMPLES */
 unsigned num_samples = brw->num_samples;

-   if (ctx->Multisample._Enabled) {
+   if (_mesa_is_multisample_enabled(ctx)) {
if (ctx->Multisample.SampleCoverage) {
   coverage = ctx->Multisample.SampleCoverageValue;
   coverage_invert = ctx->Multisample.SampleCoverageInvert;
diff --git a/src/mesa/drivers/dri/i965/gen8_blend_state.c 
b/src/mesa/drivers/dri/i965/gen8_blend_state.c
index 

Re: [Mesa-dev] [PATCH] radeonsi: fix out-of-bounds indexing of shader images

2016-03-22 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Mon, Mar 21, 2016 at 9:41 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Results are undefined but may not crash. Without this change, out-of-bounds
> indexing can lead to VM faults and GPU hangs.
>
> Constant buffers, samplers, and possibly others will eventually need similar
> treatment to support GL_ARB_robust_buffer_access_behavior.
> ---
>  src/gallium/drivers/radeonsi/si_shader.c | 44 
> +++-
>  1 file changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 9ad2290..1e4bf82 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -532,6 +532,37 @@ static LLVMValueRef get_indirect_index(struct 
> si_shader_context *ctx,
>  }
>
>  /**
> + * Like get_indirect_index, but restricts the return value to a (possibly
> + * undefined) value inside [0..num).
> + */
> +static LLVMValueRef get_bounded_indirect_index(struct si_shader_context *ctx,
> +  const struct tgsi_ind_register 
> *ind,
> +  int rel_index, unsigned num)
> +{
> +   struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm;
> +   LLVMBuilderRef builder = gallivm->builder;
> +   LLVMValueRef result = get_indirect_index(ctx, ind, rel_index);
> +   LLVMValueRef c_max = LLVMConstInt(ctx->i32, num - 1, 0);
> +   LLVMValueRef cc;
> +
> +   if (util_is_power_of_two(num)) {
> +   result = LLVMBuildAnd(builder, result, c_max, "");
> +   } else {
> +   /* In theory, this MAX pattern should result in code that is
> +* as good as the bit-wise AND above.
> +*
> +* In practice, LLVM generates worse code (at the time of
> +* writing), because its value tracking is not strong enough.
> +*/
> +   cc = LLVMBuildICmp(builder, LLVMIntULE, result, c_max, "");
> +   result = LLVMBuildSelect(builder, cc, result, c_max, "");
> +   }
> +
> +   return result;
> +}
> +
> +
> +/**
>   * Calculate a dword address given an input or output register and a stride.
>   */
>  static LLVMValueRef get_dw_address(struct si_shader_context *ctx,
> @@ -2814,7 +2845,18 @@ image_fetch_rsrc(
> LLVMValueRef rsrc_ptr;
> LLVMValueRef tmp;
>
> -   ind_index = get_indirect_index(ctx, &image->Indirect, image->Register.Index);
> +   /* From the GL_ARB_shader_image_load_store extension spec:
> +*
> +*If a shader performs an image load, store, or atomic
> +*operation using an image variable declared as an array,
> +*and if the index used to select an individual element is
> +*negative or greater than or equal to the size of the
> +*array, the results of the operation are undefined but 
> may
> +*not lead to termination.
> +*/
> +   ind_index = get_bounded_indirect_index(ctx, &image->Indirect,
> +  image->Register.Index,
> +  SI_NUM_IMAGES);
>
> rsrc_ptr = LLVMGetParam(ctx->radeon_bld.main_fn, 
> SI_PARAM_IMAGES);
> tmp = build_indexed_load_const(ctx, rsrc_ptr, ind_index);
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
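
As a plain-C illustration of the clamping that get_bounded_indirect_index() builds in LLVM IR in the patch above, the sketch below applies the same two strategies to an ordinary unsigned index. The function names here are illustrative only, and num is assumed to be at least 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n has exactly one bit set. */
static bool
is_power_of_two(uint32_t n)
{
   return n != 0 && (n & (n - 1)) == 0;
}

/* Restrict index to [0, num). Out-of-range inputs map to some in-range
 * value; which one is unspecified by design. Assumes num >= 1. */
static uint32_t
bounded_index(uint32_t index, uint32_t num)
{
   uint32_t max = num - 1;

   if (is_power_of_two(num))
      return index & max;            /* cheap mask, wraps around */

   return index <= max ? index : max; /* unsigned compare + select */
}
```

Because the comparison is unsigned, a negative index reinterpreted as a large unsigned value also ends up inside the range, which is what the quoted extension text requires.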


[Mesa-dev] [Bug 94088] [llvmpipe] SIGFPE pthread_barrier_destroy.c:40

2016-03-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94088

Steve Langasek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REOPENED|RESOLVED

--- Comment #4 from Steve Langasek  ---
Sorry, you're quite right, this code path is not used except when additional
threads are spawned. Please ignore.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 00/11] radeonsi: shader buffer support (atomic counters, ssbo)

2016-03-22 Thread Marek Olšák
On Tue, Mar 22, 2016 at 12:21 AM, Nicolai Hähnle  wrote:
> Hi,
>
> since shader images have laid most of the foundation, here are shader buffers
> now. This is the last extension missing for OpenGL 4.2 (we still need to turn
> on GLSL 4.2, but I think that only involves flipping a bit).
>
> As with shader images, this extension needs bleeding edge LLVM - this time,
> important patches have not landed upstream yet, and if you want to try this
> code you'll need my LLVM branch at 
> https://cgit.freedesktop.org/~nh/llvm/log/?h=images
>
> (For those following along at home, the necessary LLVM patches for shader
> images have already landed upstream.)
>
> In principle, there are two alternative implementations for shader buffers:
> using LLVM IR pointers with LLVM-native load/store instructions directly, or
> using intrinsics that operate on GCN buffer descriptors. This implementation
> uses the second approach. A brief comparison between the two approaches:
>
> 1. The pointer approach would use FLAT memory instructions on CI+, which
>operate on 64 bit pointers rather than 128 bit buffer descriptors. This
>would reduce SGPR memory pressure slightly.
>
> 2. LLVM understands pointers for alias analysis, so it's possible that it
>would generate somewhat better code if we were to use pointers in the
>IR.
>
> 3. The buffer load/store instructions have built-in bounds checks. Bounds
>checks are required for an honest implementation of the ARB_robustness
>extension, which we claim to support.
>
> The last point makes it obvious that the implementation really needs to use
> buffer intrinsics, but it'd be interesting to know how big the difference
> in code quality is versus something that uses pointers. To get the best of
> both worlds, we should really find a way to teach LLVM's alias analysis
> about what those buffer descriptors mean. For now, this current approach is
> the right way to do it.
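
To make point 3 of the comparison above concrete, here is a rough C model of the difference between the two access styles. This is only a sketch: the struct is a simplified, hypothetical stand-in (real GCN buffer descriptors are 128 bits and carry more fields), and returning 0 for an out-of-range load is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified buffer descriptor: base pointer plus size. */
struct buffer_desc {
   const uint32_t *base;
   uint32_t num_dwords;
};

/* Descriptor-style load: the bounds check is part of the access itself.
 * The value returned for an out-of-range index (0) is an assumption. */
static uint32_t
buffer_load_dword(const struct buffer_desc *desc, uint32_t index)
{
   return index < desc->num_dwords ? desc->base[index] : 0;
}

/* Pointer-style load: no size travels with the pointer, so the caller
 * would have to add its own check to get robust behaviour. */
static uint32_t
pointer_load_dword(const uint32_t *ptr, uint32_t index)
{
   return ptr[index];
}
```

The pointer variant is smaller and easier for alias analysis to reason about, but robustness then depends on every caller guarding the index.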

Using 64-bit pointers is annoying from the radeonsi perspective,
because it requires a shader key bit that determines whether to use
the pointers. I've tried to implement it for constant buffers, but the
improvement in code quality was negligible and the added code
complexity and shader variant totally weren't worth it.

Marek


Re: [Mesa-dev] [PATCH] mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.

2016-03-22 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Somebody from Intel or VMware might want to take a look too.

Marek

On Tue, Mar 22, 2016 at 2:58 AM, Bas Nieuwenhuizen
 wrote:
> This removes any dependency on driver validation of the number of
> framebuffer samples.
>
> Signed-off-by: Bas Nieuwenhuizen 
> ---
>  src/mesa/drivers/dri/i965/brw_util.h   |  5 +++--
>  src/mesa/drivers/dri/i965/gen6_cc.c|  6 +++---
>  src/mesa/drivers/dri/i965/gen6_multisample_state.c |  2 +-
>  src/mesa/drivers/dri/i965/gen8_blend_state.c   |  6 +++---
>  src/mesa/drivers/dri/i965/gen8_depth_state.c   |  3 ++-
>  src/mesa/drivers/dri/i965/gen8_sf_state.c  |  4 ++--
>  src/mesa/main/framebuffer.c| 19 +++
>  src/mesa/main/framebuffer.h|  3 +++
>  src/mesa/main/mtypes.h |  1 -
>  src/mesa/main/state.c  | 17 -
>  src/mesa/program/prog_statevars.c  |  2 +-
>  src/mesa/state_tracker/st_atom_rasterizer.c|  4 ++--
>  src/mesa/state_tracker/st_atom_shader.c|  2 +-
>  src/mesa/swrast/s_points.c |  4 ++--
>  14 files changed, 42 insertions(+), 36 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_util.h 
> b/src/mesa/drivers/dri/i965/brw_util.h
> index 1f27e98..3e9a6ee 100644
> --- a/src/mesa/drivers/dri/i965/brw_util.h
> +++ b/src/mesa/drivers/dri/i965/brw_util.h
> @@ -34,6 +34,7 @@
>  #define BRW_UTIL_H
>
>  #include "brw_context.h"
> +#include "main/framebuffer.h"
>
>  extern GLuint brw_translate_blend_factor( GLenum factor );
>  extern GLuint brw_translate_blend_equation( GLenum mode );
> @@ -49,13 +50,13 @@ brw_get_line_width(struct brw_context *brw)
>  * implementation-dependent maximum non-antialiased line width."
>  */
> float line_width =
> -  CLAMP(!brw->ctx.Multisample._Enabled && !brw->ctx.Line.SmoothFlag
> +  CLAMP(!_mesa_is_multisample_enabled(&brw->ctx) && !brw->ctx.Line.SmoothFlag
>  ? roundf(brw->ctx.Line.Width) : brw->ctx.Line.Width,
>  0.0f, brw->ctx.Const.MaxLineWidth);
> uint32_t line_width_u3_7 = U_FIXED(line_width, 7);
>
> /* Line width of 0 is not allowed when MSAA enabled */
> -   if (brw->ctx.Multisample._Enabled) {
> +   if (_mesa_is_multisample_enabled(&brw->ctx)) {
>if (line_width_u3_7 == 0)
>   line_width_u3_7 = 1;
> } else if (brw->ctx.Line.SmoothFlag && line_width < 1.5f) {
> diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c 
> b/src/mesa/drivers/dri/i965/gen6_cc.c
> index cee139b..f5a7d4d 100644
> --- a/src/mesa/drivers/dri/i965/gen6_cc.c
> +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
> @@ -198,14 +198,14 @@ gen6_upload_blend_state(struct brw_context *brw)
>if(!is_buffer_zero_integer_format) {
>   /* _NEW_MULTISAMPLE */
>   blend[b].blend1.alpha_to_coverage =
> -ctx->Multisample._Enabled && 
> ctx->Multisample.SampleAlphaToCoverage;
> +_mesa_is_multisample_enabled(ctx) && 
> ctx->Multisample.SampleAlphaToCoverage;
>
> /* From SandyBridge PRM, volume 2 Part 1, section 8.2.3, BLEND_STATE:
>  * DWord 1, Bit 30 (AlphaToOne Enable):
>  * "If Dual Source Blending is enabled, this bit must be disabled"
>  */
>   WARN_ONCE(ctx->Color.Blend[b]._UsesDualSrc &&
> -   ctx->Multisample._Enabled &&
> +   _mesa_is_multisample_enabled(ctx) &&
> ctx->Multisample.SampleAlphaToOne,
> "HW workaround: disabling alpha to one with dual src "
> "blending\n");
> @@ -213,7 +213,7 @@ gen6_upload_blend_state(struct brw_context *brw)
>  blend[b].blend1.alpha_to_one = false;
>  else
> blend[b].blend1.alpha_to_one =
> -  ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToOne;
> +  _mesa_is_multisample_enabled(ctx) && 
> ctx->Multisample.SampleAlphaToOne;
>
>   blend[b].blend1.alpha_to_coverage_dither = (brw->gen >= 7);
>}
> diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c 
> b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
> index 8eb620d..fcd313a 100644
> --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
> @@ -171,7 +171,7 @@ gen6_determine_sample_mask(struct brw_context *brw)
> /* BRW_NEW_NUM_SAMPLES */
> unsigned num_samples = brw->num_samples;
>
> -   if (ctx->Multisample._Enabled) {
> +   if (_mesa_is_multisample_enabled(ctx)) {
>if (ctx->Multisample.SampleCoverage) {
>   coverage = ctx->Multisample.SampleCoverageValue;
>   coverage_invert = ctx->Multisample.SampleCoverageInvert;
> diff --git a/src/mesa/drivers/dri/i965/gen8_blend_state.c 
> b/src/mesa/drivers/dri/i965/gen8_blend_state.c
> index 

[Mesa-dev] [PATCH 09/21] nir: Add a phi node placement helper

2016-03-22 Thread Jason Ekstrand
Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis.  Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.

v2: Add better documentation

Reviewed-by: Jordan Justen 
Cc: Connor Abbot 
---
 src/compiler/Makefile.sources  |   2 +
 src/compiler/nir/Makefile.sources  |   2 +
 src/compiler/nir/nir_phi_builder.c | 295 +
 src/compiler/nir/nir_phi_builder.h | 115 +++
 4 files changed, 414 insertions(+)
 create mode 100644 src/compiler/nir/nir_phi_builder.c
 create mode 100644 src/compiler/nir/nir_phi_builder.h

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 9f3bcf0..9ecff37 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -214,6 +214,8 @@ NIR_FILES = \
nir/nir_opt_peephole_select.c \
nir/nir_opt_remove_phis.c \
nir/nir_opt_undef.c \
+   nir/nir_phi_builder.c \
+   nir/nir_phi_builder.h \
nir/nir_print.c \
nir/nir_remove_dead_variables.c \
nir/nir_search.c \
diff --git a/src/compiler/nir/Makefile.sources 
b/src/compiler/nir/Makefile.sources
index f31547b..db3eecc 100644
--- a/src/compiler/nir/Makefile.sources
+++ b/src/compiler/nir/Makefile.sources
@@ -58,6 +58,8 @@ NIR_FILES = \
nir_opt_peephole_select.c \
nir_opt_remove_phis.c \
nir_opt_undef.c \
+   nir_phi_builder.c \
+   nir_phi_builder.h \
nir_print.c \
nir_remove_dead_variables.c \
nir_search.c \
diff --git a/src/compiler/nir/nir_phi_builder.c 
b/src/compiler/nir/nir_phi_builder.c
new file mode 100644
index 000..a39e360
--- /dev/null
+++ b/src/compiler/nir/nir_phi_builder.c
@@ -0,0 +1,295 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir_phi_builder.h"
+#include "nir/nir_vla.h"
+
+struct nir_phi_builder {
+   nir_shader *shader;
+   nir_function_impl *impl;
+
+   /* Copied from the impl for easy access */
+   unsigned num_blocks;
+
+   /* Array of all blocks indexed by block->index. */
+   nir_block **blocks;
+
+   /* Hold on to the values so we can easily iterate over them. */
+   struct exec_list values;
+
+   /* Worklist for phi adding */
+   unsigned iter_count;
+   unsigned *work;
+   nir_block **W;
+};
+
+#define NEEDS_PHI ((nir_ssa_def *)(intptr_t)-1)
+
+struct nir_phi_builder_value {
+   struct exec_node node;
+
+   struct nir_phi_builder *builder;
+
+   /* Needed so we can create phis and undefs */
+   unsigned num_components;
+   unsigned bit_size;
+
+   /* The list of phi nodes associated with this value.  Phi nodes are not
+* added directly.  Instead, they are created, the instr->block pointer
+* set, and then added to this list.  Later, in phi_builder_finish, we
+* set up their sources and add them to the top of their respective
+* blocks.
+*/
+   struct exec_list phis;
+
+   /* Array of SSA defs, indexed by block.  For each block, this array has
+* one of three types of values:
+*
+*  - NULL. Indicates that there is no known definition in this block.  If
+*you need to find one, look at the block's immediate dominator.
+*
+*  - NEEDS_PHI. Indicates that the block may need a phi node but none has
+*been created yet.  If a def is requested for a block, a phi will need
+*to be created.
+*
+*  - A regular SSA def.  This will be either the result of a phi node or
+*one of the defs provided by nir_phi_builder_value_set_block_def().
+*/
+   nir_ssa_def *defs[0];
+};
+
+static bool
+fill_block_array(nir_block *block, void *void_data)
+{
+   nir_block **blocks = void_data;
+   blocks[block->index] = block;
+   return 
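
The three-state defs[] array documented in the patch above suggests a lookup that walks up the dominator tree until it finds something. The following is a minimal sketch of that idea only, not the patch's actual code: the types, the idom[] array (with -1 for the entry block's parent), and the function name are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nir_ssa_def. */
struct def { int id; };

/* Same sentinel trick as in the patch: an invalid pointer used as a marker. */
#define NEEDS_PHI ((struct def *)(intptr_t)-1)

enum lookup_kind {
   LOOKUP_UNDEF,      /* no definition dominates the block */
   LOOKUP_NEEDS_PHI,  /* a phi would have to be created in .block */
   LOOKUP_DEF         /* .def is the reaching definition */
};

struct lookup {
   enum lookup_kind kind;
   int block;
   struct def *def;
};

/* Walk from 'block' up through immediate dominators until a non-NULL entry
 * is found. idom[b] is the immediate dominator of b, or -1 at the entry. */
static struct lookup
lookup_def(struct def *const *defs, const int *idom, int block)
{
   struct lookup r = { LOOKUP_UNDEF, -1, NULL };
   int b = block;

   while (b >= 0 && defs[b] == NULL)
      b = idom[b];

   if (b < 0)
      return r;

   r.block = b;
   if (defs[b] == NEEDS_PHI) {
      r.kind = LOOKUP_NEEDS_PHI;
   } else {
      r.kind = LOOKUP_DEF;
      r.def = defs[b];
   }
   return r;
}
```

A real implementation would go on to create the phi or undef and cache the result; the sketch stops at classifying the three cases.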

Re: [Mesa-dev] Mesa (master): 23 new commits

2016-03-22 Thread Marek Olšák
On Tue, Mar 22, 2016 at 8:46 AM, Michel Dänzer  wrote:
> On 22.03.2016 05:34, Nicolai Hähnle wrote:
>>
>> URL:
>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=e85cf35a6516c44e33663fcd9637c6b434bb63ee
>> Author: Nicolai Hähnle 
>> Date:   Sat Feb 6 18:32:13 2016 -0500
>>
>> radeonsi: implement set_shader_images (v2)
>>
>> Whether DCC is disabled depends on the access flags with which the image
>> is bound: image_load supports DCC, but store and atomic don't.
>>
>> v2: remove an unnecessary masking of images->desc.enabled_mask
>>
>> Reviewed-by: Marek Olšák 
>
> This change broke a bunch of MSAA related piglit tests for me on Kaveri,
> e.g.
>
> spec@!opengl 3.2@layered-rendering@clear-color-all-types 2d_multisample_array single_level
> spec@arb_texture_multisample@texelfetch fs sampler2dmsarray 4 1x129x9-98x129x9
> spec@arb_texture_multisample@texelfetch@2-fs-isampler2dmsarray
>
> and related tests. Any ideas what could be wrong?

Yes, I have a fix.

Marek


[Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed

2016-03-22 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index b9bdd47..b8fde00 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen *screen,
if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
height = 1;
depth = res->array_size;
-   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
+   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
+  type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
if (sampler || res->target != PIPE_TEXTURE_3D)
depth = res->array_size;
} else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)
-- 
2.5.0



Re: [Mesa-dev] [PATCH 23/29] nir: add i2d and u2d opcodes

2016-03-22 Thread Jason Ekstrand
On Mar 22, 2016 8:19 AM, "Samuel Iglesias Gonsálvez" 
wrote:
>
>
>
> On 21/03/16 23:56, Jason Ekstrand wrote:
> > On Mon, Mar 21, 2016 at 5:06 AM, Samuel Iglesias Gonsálvez <
> > sigles...@igalia.com> wrote:
> >
> >> From: Iago Toral Quiroga 
> >>
> >> ---
> >>  src/compiler/nir/glsl_to_nir.cpp | 6 ++
> >>  src/compiler/nir/nir_opcodes.py  | 2 ++
> >>  2 files changed, 8 insertions(+)
> >>
> >> diff --git a/src/compiler/nir/glsl_to_nir.cpp
> >> b/src/compiler/nir/glsl_to_nir.cpp
> >> index 952d787..d087a77 100644
> >> --- a/src/compiler/nir/glsl_to_nir.cpp
> >> +++ b/src/compiler/nir/glsl_to_nir.cpp
> >> @@ -1357,6 +1357,12 @@ nir_visitor::visit(ir_expression *ir)
> >> case ir_unop_d2i:  result = nir_d2i(&b, srcs[0]);   break;
> >> case ir_unop_d2u:  result = nir_d2u(&b, srcs[0]);   break;
> >> case ir_unop_d2b:  result = nir_d2b(&b, srcs[0]);   break;
> >> +   case ir_unop_i2d:
> >> +  result = supports_ints ? nir_i2d(&b, srcs[0]) : nir_fmov(&b, srcs[0]);
> >> +  break;
> >> +   case ir_unop_u2d:
> >> +  result = supports_ints ? nir_u2d(&b, srcs[0]) : nir_fmov(&b, srcs[0]);
> >>
> >
> > If you're going to be using the u2d opcode, you'd better support integers.
> >
>
> We did the same as for the integer-to-float conversions, to keep this code
> aligned with what they do.

Right.  I don't think that would be correct for hardware that doesn't have
integers anyway.  You would want an ftrunc in the non-integer case, with an
abs for f2u.  NIR has yet to be used on any platform that doesn't support
native integers, so all those code paths are dead anyway.
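
A rough C model of the float-only fallback suggested above (truncate, plus an abs for the unsigned case) might look like the following. All names are made up for illustration. The cast stands in for ftrunc only to keep the sketch free of libm, and is valid only while the value fits in int32_t.

```c
#include <stdint.h>

/* Stand-in for ftrunc: round toward zero. The int32_t cast is just a
 * modelling shortcut and assumes x is within int32_t range. */
static float
sketch_ftrunc(float x)
{
   return (float)(int32_t)x;
}

/* Stand-in for fabs. */
static float
sketch_fabs(float x)
{
   return x < 0.0f ? -x : x;
}

/* "Integers" kept in float registers: f2i becomes a truncation... */
static float
float_only_f2i(float x)
{
   return sketch_ftrunc(x);
}

/* ...and f2u additionally drops the sign, following the suggestion above. */
static float
float_only_f2u(float x)
{
   return sketch_fabs(sketch_ftrunc(x));
}
```

A plain fmov, by contrast, would pass -2.75 through unchanged, which is why it is not a correct replacement on such hardware.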

> We can add an assert for supports_ints here and only call nir_u2d but,
> to be consistent, we would need to make similar changes to i2d, u2f and i2f
> too in a separate patch.

Feel free to add an assert. I don't think updating the others is needed.
It's not that f2u is invalid without integers so much as that no hardware
that supports doubles will lack native integers.

> What do you think?
>
> Sam
>
> >
> >> +  break;
> >> case ir_unop_i2u:
> >> case ir_unop_u2i:
> >> case ir_unop_bitcast_i2f:
> >> diff --git a/src/compiler/nir/nir_opcodes.py
> >> b/src/compiler/nir/nir_opcodes.py
> >> index a161ac1..cf6ce83 100644
> >> --- a/src/compiler/nir/nir_opcodes.py
> >> +++ b/src/compiler/nir/nir_opcodes.py
> >> @@ -164,6 +164,7 @@ unop_convert("f2u", tuint32, tfloat32, "src0") #
> >> Float-to-unsigned conversion
> >>  unop_convert("d2i", tint32, tfloat64, "src0") # Double-to-integer
> >> conversion.
> >>  unop_convert("d2u", tuint32, tfloat64, "src0") # Double-to-unsigned
> >> conversion.
> >>  unop_convert("i2f", tfloat32, tint32, "src0") # Integer-to-float
> >> conversion.
> >> +unop_convert("i2d", tfloat64, tint32, "src0") # Integer-to-double
> >> conversion.
> >>  # Float-to-boolean conversion
> >>  unop_convert("f2b", tbool, tfloat32, "src0 != 0.0f")
> >>  unop_convert("d2b", tbool, tfloat64, "src0 != 0.0")
> >> @@ -173,6 +174,7 @@ unop_convert("b2f", tfloat32, tbool, "src0 ? 1.0f :
> >> 0.0f")
> >>  unop_convert("i2b", tbool, tint32, "src0 != 0")
> >>  unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int
> >> conversion
> >>  unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float
> >> conversion.
> >> +unop_convert("u2d", tfloat64, tuint32, "src0") # Unsigned-to-double
> >> conversion.
> >>  # double-to-float conversion
> >>  unop_convert("d2f", tfloat32, tfloat64, "src0") # Single to double
> >> precision
> >>  unop_convert("f2d", tfloat64, tfloat32, "src0") # Double to single
> >> precision
> >> --
> >> 2.5.0
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>
> >


Re: [Mesa-dev] [PATCH 19/29] nir: fix up bit sizes for undefined alu sources

2016-03-22 Thread Jason Ekstrand
On Mar 22, 2016 8:18 AM, "Samuel Iglesias Gonsálvez" 
wrote:
>
>
>
> On 21/03/16 23:54, Jason Ekstrand wrote:
> > On Mon, Mar 21, 2016 at 5:05 AM, Samuel Iglesias Gonsálvez <
> > sigles...@igalia.com> wrote:
> >
> >> From: Iago Toral Quiroga 
> >>
> >> Undefined sources in alu operations don't have a valid bit size because
> >> they are uninitialized. Simply ignoring undefined sources for bit size
> >> validation is not enough since drivers can check and operate with the
> >> bit-size and that can lead to issues later on. Instead, fix undefined
> >> sources to always have a compatible bit size.
> >>
> >
> > I'm not sure what I think about this.  I think I'd rather have undefs
> > simply have the right bitsize.
> >
>
> With undefined sources you cannot get the bit size from the sources
> themselves because it is not initialized.

Doesn't patch 3 and the discussion on it imply that undefs should have
valid sizes?

> In that case, we pick the bit size from
> the ALU opcode's input definition. If it is unsized, then we use the
> destination size.

I don't think pulling the implicit size from the source size is ever
correct.  If you have an explicitly sized source that means it doesn't
affect and isn't affected by the implicit size.
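
To spell out the rule being discussed, here is a small sketch of how the expected bit size of an ALU source could be decided from the opcode's type information. The function name and the convention that 0 means "unsized" or "no constraint" are assumptions for this example, not NIR API.

```c
/* src_type_size:    explicit size of the opcode's source type, 0 if unsized
 * output_type_size: explicit size of the opcode's output type, 0 if unsized
 * dest_bit_size:    bit size of the actual destination
 *
 * Returns the bit size the source is expected to have, or 0 when this
 * rule places no constraint on it. */
static unsigned
expected_src_bit_size(unsigned src_type_size, unsigned output_type_size,
                      unsigned dest_bit_size)
{
   /* An explicitly sized source is independent of the implicit size. */
   if (src_type_size != 0)
      return src_type_size;

   /* Unsized source and unsized output share one implicit size, which
    * the destination determines. */
   if (output_type_size == 0)
      return dest_bit_size;

   return 0;
}
```

The first branch is the point made above: a source with an explicit size never feeds into, or takes its size from, the implicit one.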

> I think this is the right bitsize... or am I missing something?
>
> Sam
>
> >
> >> v2 (Sam):
> >> - Use helper to get type size from nir_alu_type.
> >> ---
> >>  src/compiler/nir/nir_validate.c | 10 ++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/src/compiler/nir/nir_validate.c
> >> b/src/compiler/nir/nir_validate.c
> >> index 9f18d1c..645c15a 100644
> >> --- a/src/compiler/nir/nir_validate.c
> >> +++ b/src/compiler/nir/nir_validate.c
> >> @@ -180,9 +180,11 @@ validate_alu_src(nir_alu_instr *instr, unsigned
> >> index, validate_state *state)
> >>
> >> unsigned num_components;
> >> unsigned src_bit_size;
> >> +   bool is_undef = false;
> >> if (src->src.is_ssa) {
> >>src_bit_size = src->src.ssa->bit_size;
> >>num_components = src->src.ssa->num_components;
> >> +  is_undef = src->src.ssa->parent_instr->type ==
> >> nir_instr_type_ssa_undef;
> >> } else {
> >>src_bit_size = src->src.reg.reg->bit_size;
> >>if (src->src.reg.reg->is_packed)
> >> @@ -205,12 +207,20 @@ validate_alu_src(nir_alu_instr *instr, unsigned
> >> index, validate_state *state)
> >>
> >> if (nir_alu_type_get_type_size(src_type)) {
> >>/* This source has an explicit bit size */
> >> +  if (is_undef) {
> >> + src_bit_size = nir_alu_type_get_type_size(src_type);
> >> + src->src.ssa->bit_size = src_bit_size;
> >> +  }
> >>assert(nir_alu_type_get_type_size(src_type) == src_bit_size);
> >> } else {
> >>if (!nir_alu_type_get_type_size(nir_op_infos[instr->op].output_type)) {
> >>   unsigned dest_bit_size =
> >>  instr->dest.dest.is_ssa ? instr->dest.dest.ssa.bit_size
> >>  : instr->dest.dest.reg.reg->bit_size;
> >> + if (is_undef) {
> >> +src_bit_size = dest_bit_size;
> >> +src->src.ssa->bit_size = dest_bit_size;
> >> + }
> >>   assert(dest_bit_size == src_bit_size);
> >>}
> >> }
> >> --
> >> 2.5.0
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>
> >


Re: [Mesa-dev] [PATCH] egl: adds EGL_KHR_reusable_sync to egl_dri

2016-03-22 Thread Marek Olšák
On Tue, Mar 22, 2016 at 1:06 AM, dw kim  wrote:
> On Mon, Mar 21, 2016 at 08:35:20PM +0100, Marek Olšák wrote:
>> On Wed, Mar 9, 2016 at 2:28 AM, Dongwon Kim  wrote:
>> > This patch enables an EGL extension, EGL_KHR_reusable_sync.
>> > This new extension basically provides a way for multiple APIs or
>> > threads to be executed synchronously via a "reusable sync"
>> > primitive shared by those threads/API calls.
>> >
>> > This was implemented based on the specification at
>> >
>> > https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_reusable_sync.txt
>> >
>> > Signed-off-by: Dongwon Kim 
>> > ---
>> >  src/egl/drivers/dri2/egl_dri2.c | 197 
>> > ++--
>> >  src/egl/drivers/dri2/egl_dri2.h |   2 +
>> >  src/egl/main/eglapi.c   |   8 ++
>> >  src/egl/main/eglsync.c  |   3 +-
>> >  4 files changed, 200 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/src/egl/drivers/dri2/egl_dri2.c 
>> > b/src/egl/drivers/dri2/egl_dri2.c
>> > index 8f50f0c..78164e4 100644
>> > --- a/src/egl/drivers/dri2/egl_dri2.c
>> > +++ b/src/egl/drivers/dri2/egl_dri2.c
>> > @@ -38,6 +38,8 @@
>> >  #include 
>> >  #include 
>> >  #include 
>> > +#include 
>> > +#include 
>> >  #ifdef HAVE_LIBDRM
>> >  #include 
>> >  #include 
>> > @@ -623,6 +625,8 @@ dri2_setup_screen(_EGLDisplay *disp)
>> >   disp->Extensions.KHR_cl_event2 = EGL_TRUE;
>> > }
>> >
>> > +   disp->Extensions.KHR_reusable_sync = EGL_TRUE;
>> > +
>> > if (dri2_dpy->image) {
>> >if (dri2_dpy->image->base.version >= 10 &&
>> >dri2_dpy->image->getCapabilities != NULL) {
>> > @@ -2389,14 +2393,33 @@ dri2_egl_ref_sync(struct dri2_egl_sync *sync)
>> > p_atomic_inc(&sync->refcount);
>> >  }
>> >
>> > -static void
>> > +static EGLint
>> >  dri2_egl_unref_sync(struct dri2_egl_display *dri2_dpy,
>> >  struct dri2_egl_sync *dri2_sync)
>> >  {
>> > +   EGLint ret;
>> > +
>> > if (p_atomic_dec_zero(&dri2_sync->refcount)) {
>> > -  dri2_dpy->fence->destroy_fence(dri2_dpy->dri_screen, 
>> > dri2_sync->fence);
>> > +  /* mutex and cond should be freed if not freed yet. */
>> > +  if (dri2_sync->mutex)
>> > + free(dri2_sync->mutex);
>> > +
>> > +  if (dri2_sync->cond) {
>> > + ret = pthread_cond_destroy(dri2_sync->cond);
>> > +
>> > + if (ret)
>> > +return EGL_FALSE;
>> > +
>> > + free(dri2_sync->cond);
>> > +  }
>> > +
>> > +  if (dri2_sync->fence)
>> > + dri2_dpy->fence->destroy_fence(dri2_dpy->dri_screen, 
>> > dri2_sync->fence);
>> > +
>> >free(dri2_sync);
>> > }
>> > +
>> > +   return EGL_TRUE;
>> >  }
>> >
>> >  static _EGLSync *
>> > @@ -2408,6 +2431,7 @@ dri2_create_sync(_EGLDriver *drv, _EGLDisplay *dpy,
>> > struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
>> > struct dri2_egl_context *dri2_ctx = dri2_egl_context(ctx);
>> > struct dri2_egl_sync *dri2_sync;
>> > +   EGLint ret;
>> >
>> > dri2_sync = calloc(1, sizeof(struct dri2_egl_sync));
>> > if (!dri2_sync) {
>> > @@ -2450,6 +2474,23 @@ dri2_create_sync(_EGLDriver *drv, _EGLDisplay *dpy,
>> >  dri2_sync->fence, 0, 0))
>> >   dri2_sync->base.SyncStatus = EGL_SIGNALED_KHR;
>> >break;
>> > +
>> > +   case EGL_SYNC_REUSABLE_KHR:
>> > +  dri2_sync->cond = calloc(1, sizeof(pthread_cond_t));
>> > +  dri2_sync->mutex = calloc(1, sizeof(pthread_mutex_t));
>> > +  ret = pthread_cond_init(dri2_sync->cond, NULL);
>> > +
>> > +  if (ret) {
>> > + _eglError(EGL_BAD_PARAMETER, "eglCreateSyncKHR");
>> > + free(dri2_sync->cond);
>> > + free(dri2_sync->mutex);
>> > + free(dri2_sync);
>> > + return NULL;
>> > +  }
>> > +
>> > +  /* initial status of reusable sync must be "unsignaled" */
>> > +  dri2_sync->base.SyncStatus = EGL_UNSIGNALED_KHR;
>> > +  break;
>> > }
>> >
>> > p_atomic_set(&dri2_sync->refcount, 1);
>> > @@ -2461,9 +2502,33 @@ dri2_destroy_sync(_EGLDriver *drv, _EGLDisplay 
>> > *dpy, _EGLSync *sync)
>> >  {
>> > struct dri2_egl_display *dri2_dpy = dri2_egl_display(dpy);
>> > struct dri2_egl_sync *dri2_sync = dri2_egl_sync(sync);
>> > +   EGLint ret = EGL_TRUE;
>> > +   EGLint err;
>> >
>> > -   dri2_egl_unref_sync(dri2_dpy, dri2_sync);
>> > -   return EGL_TRUE;
>> > +   /* if type of sync is EGL_SYNC_REUSABLE_KHR and it is not signaled yet,
>> > +* then unlock all threads possibly blocked by the reusable sync before
>> > +* destroying it.
>> > +*/
>> > +   if (dri2_sync->base.Type == EGL_SYNC_REUSABLE_KHR &&
>> > +   dri2_sync->base.SyncStatus == EGL_UNSIGNALED_KHR) {
>> > +  dri2_sync->base.SyncStatus = EGL_SIGNALED_KHR;
>> > +  /* unblock all threads currently blocked by sync */
>> > +  ret = pthread_cond_broadcast(dri2_sync->cond);
>> > +
>> > +  if (ret) {
>> > + 

Re: [Mesa-dev] [V2 00/19] Add infrastructure for GL_OES_texture_compression_astc

2016-03-22 Thread Brian Paul

On 03/21/2016 03:03 PM, Anuj Phogat wrote:

I don't have hardware that supports this extension, and I realized that only
after writing these patches. This blocks testing of these patches,
so I'm sure a few things will be left out of this series. But I
think it'll be nice to have 90% of the infrastructure ready when we
have hardware to enable the extension.

V2, NEW patches are based on the feedback by Brian Paul.


The v2 changes look good to me.

For the series, Reviewed-by: Brian Paul 





Anuj Phogat (19):

V1  mesa: Add block depth field in struct gl_format_info
NEW mesa: Add a helper function to query 3D block sizes
NEW mesa: Add an assert for BlockDepth in _mesa_get_format_block_size()
NEW mesa: Handle 3d block sizes in getteximage error checks
V2  mesa: Handle 3d block sizes in teximage error checks
NEW mesa: Handle 3d block sizes in _mesa_compute_compressed_pixelstore
V2  mesa: Account for block depth in _mesa_format_image_size()
V1  glapi: Update dispatch XML files for OES_texture_compression_astc.xml
V1  mesa: Add mesa formats for astc 3d formats
V1  mesa: Add entries for astc 3d formats initializing struct
 gl_format_info
V1  mesa: Add OES_texture_compression_astc to extension table and
 gl_extensions
V1  mesa: Align the values of #define's in glheader.h
V1  mesa: Add the missing defines for GL_OES_texture_compression_astc
V1  mesa: Add a helper function is_astc_3d_format()
V1  mesa: Account for astc 3d formats in _mesa_is_astc_format()
V1  mesa: Handle astc 3d formats in _mesa_base_tex_format()
V1  mesa: Handle astc 3d formats in _mesa_get_compressed_formats()
V1  mesa: Enable translation between astc 3d gl formats and mesa formats
V1  swrast: Add texfetch_funcs entries for astc 3d formats

  src/mapi/glapi/gen/Makefile.am |   1 +
  .../glapi/gen/OES_texture_compression_astc.xml |  61 +++
  src/mapi/glapi/gen/gl_API.xml  |   2 +
  src/mesa/main/extensions_table.h   |   1 +
  src/mesa/main/format_info.py   |   5 +-
  src/mesa/main/format_parser.py |  15 +-
  src/mesa/main/formats.c|  72 ++-
  src/mesa/main/formats.csv  | 550 +++--
  src/mesa/main/formats.h|  25 +
  src/mesa/main/glformats.c  |  54 +-
  src/mesa/main/glheader.h   |  81 +--
  src/mesa/main/mtypes.h |   1 +
  src/mesa/main/texcompress.c| 109 
  src/mesa/main/texgetimage.c|  21 +-
  src/mesa/main/teximage.c   |  19 +-
  src/mesa/main/texstore.c   |   6 +-
  src/mesa/swrast/s_texfetch.c   |  23 +-
  17 files changed, 703 insertions(+), 343 deletions(-)
  create mode 100644 src/mapi/glapi/gen/OES_texture_compression_astc.xml





Re: [Mesa-dev] [PATCH 23/29] nir: add i2d and u2d opcodes

2016-03-22 Thread Samuel Iglesias Gonsálvez


On 22/03/16 16:19, Samuel Iglesias Gonsálvez wrote:
> 
> 
> On 21/03/16 23:56, Jason Ekstrand wrote:
>> On Mon, Mar 21, 2016 at 5:06 AM, Samuel Iglesias Gonsálvez <
>> sigles...@igalia.com> wrote:
>>
>>> From: Iago Toral Quiroga 
>>>
>>> ---
>>>  src/compiler/nir/glsl_to_nir.cpp | 6 ++
>>>  src/compiler/nir/nir_opcodes.py  | 2 ++
>>>  2 files changed, 8 insertions(+)
>>>
>>> diff --git a/src/compiler/nir/glsl_to_nir.cpp
>>> b/src/compiler/nir/glsl_to_nir.cpp
>>> index 952d787..d087a77 100644
>>> --- a/src/compiler/nir/glsl_to_nir.cpp
>>> +++ b/src/compiler/nir/glsl_to_nir.cpp
>>> @@ -1357,6 +1357,12 @@ nir_visitor::visit(ir_expression *ir)
>>> case ir_unop_d2i:  result = nir_d2i(&b, srcs[0]);   break;
>>> case ir_unop_d2u:  result = nir_d2u(&b, srcs[0]);   break;
>>> case ir_unop_d2b:  result = nir_d2b(&b, srcs[0]);   break;
>>> +   case ir_unop_i2d:
>>> +  result = supports_ints ? nir_i2d(&b, srcs[0]) : nir_fmov(&b,
>>> srcs[0]);
>>> +  break;
>>> +   case ir_unop_u2d:
>>> +  result = supports_ints ? nir_u2d(&b, srcs[0]) : nir_fmov(&b,
>>> srcs[0]);
>>>
>>
>> If you're going to be using the u2d opcode, you'd better support integers.
>>
> 
> We did the same as for the integer-to-float conversions, to keep this code
> aligned with what they do.
> 
> We can add an assert for supports_ints here and call only nir_u2d but,
> to be consistent, we would need to make similar changes to i2d, u2f, and i2f
> too, in a separate patch.
> 

Actually i2d would be done in this patch.

Sam

> What do you think?
> 
> Sam
> 
>>
>>> +  break;
>>> case ir_unop_i2u:
>>> case ir_unop_u2i:
>>> case ir_unop_bitcast_i2f:
>>> diff --git a/src/compiler/nir/nir_opcodes.py
>>> b/src/compiler/nir/nir_opcodes.py
>>> index a161ac1..cf6ce83 100644
>>> --- a/src/compiler/nir/nir_opcodes.py
>>> +++ b/src/compiler/nir/nir_opcodes.py
>>> @@ -164,6 +164,7 @@ unop_convert("f2u", tuint32, tfloat32, "src0") #
>>> Float-to-unsigned conversion
>>>  unop_convert("d2i", tint32, tfloat64, "src0") # Double-to-integer
>>> conversion.
>>>  unop_convert("d2u", tuint32, tfloat64, "src0") # Double-to-unsigned
>>> conversion.
>>>  unop_convert("i2f", tfloat32, tint32, "src0") # Integer-to-float
>>> conversion.
>>> +unop_convert("i2d", tfloat64, tint32, "src0") # Integer-to-double
>>> conversion.
>>>  # Float-to-boolean conversion
>>>  unop_convert("f2b", tbool, tfloat32, "src0 != 0.0f")
>>>  unop_convert("d2b", tbool, tfloat64, "src0 != 0.0")
>>> @@ -173,6 +174,7 @@ unop_convert("b2f", tfloat32, tbool, "src0 ? 1.0f :
>>> 0.0f")
>>>  unop_convert("i2b", tbool, tint32, "src0 != 0")
>>>  unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int
>>> conversion
>>>  unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float
>>> conversion.
>>> +unop_convert("u2d", tfloat64, tuint32, "src0") # Unsigned-to-double
>>> conversion.
>>>  # double-to-float conversion
>>>  unop_convert("d2f", tfloat32, tfloat64, "src0") # Single to double
>>> precision
>>>  unop_convert("f2d", tfloat64, tfloat32, "src0") # Double to single
>>> precision
>>> --
>>> 2.5.0
>>>
>>>
>>
> 


[Mesa-dev] [Bug 94088] [llvmpipe] SIGFPE pthread_barrier_destroy.c:40

2016-03-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94088

--- Comment #3 from Roland Scheidegger  ---
(In reply to Steve Langasek from comment #2)
> Hello,
> 
> The patch for this bug is incomplete.  In between the calls to
> pipe_barrier_init() and pipe_barrier_destroy() are calls to
> pipe_barrier_wait(), which is implemented on top of pthread_barrier_wait().
> 
> Since pipe_barrier_init() has not been called, the calls to
> pthread_barrier_wait() have undefined behavior, as per
>  pthread_barrier_wait.html>,
> .
> 
> The applied commit is sufficient to fix the immediate SIGFPE problem with
> glibc, but the API is still being used incorrectly and could result in
> future crashes on other implementations.

I can't see how this could possibly happen. Unless I'm missing something,
pipe_barrier_wait() is only called in the thread main function, which will
never get called if we don't have any threads to begin with.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 5/8] tgsi: add support for image operations to tgsi_exec.

2016-03-22 Thread Brian Paul

On 03/21/2016 04:02 PM, Dave Airlie wrote:

From: Dave Airlie 

This adds support for load/store/atomic operations on images
along with image tracking support.

Signed-off-by: Dave Airlie 
---
  src/gallium/auxiliary/draw/draw_gs.c  |   2 +-
  src/gallium/auxiliary/draw/draw_vs_exec.c |   2 +-
  src/gallium/auxiliary/tgsi/tgsi_exec.c| 229 +-
  src/gallium/auxiliary/tgsi/tgsi_exec.h|  40 +-
  src/gallium/drivers/softpipe/sp_fs_exec.c |   4 +-
  5 files changed, 271 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_gs.c 
b/src/gallium/auxiliary/draw/draw_gs.c
index 6b33341..c4ced9f 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -687,7 +687,7 @@ void draw_geometry_shader_prepare(struct 
draw_geometry_shader *shader,
 if (!use_llvm && shader && shader->machine->Tokens != 
shader->state.tokens) {
tgsi_exec_machine_bind_shader(shader->machine,
  shader->state.tokens,
-draw->gs.tgsi.sampler);
+draw->gs.tgsi.sampler, NULL);
 }
  }

diff --git a/src/gallium/auxiliary/draw/draw_vs_exec.c 
b/src/gallium/auxiliary/draw/draw_vs_exec.c
index abd64f5..8c759d4 100644
--- a/src/gallium/auxiliary/draw/draw_vs_exec.c
+++ b/src/gallium/auxiliary/draw/draw_vs_exec.c
@@ -70,7 +70,7 @@ vs_exec_prepare( struct draw_vertex_shader *shader,
 if (evs->machine->Tokens != shader->state.tokens) {
tgsi_exec_machine_bind_shader(evs->machine,
  shader->state.tokens,
-draw->vs.tgsi.sampler);
+draw->vs.tgsi.sampler, NULL);
 }
  }

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index fa1c916..fe82a95 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -853,7 +853,8 @@ void
  tgsi_exec_machine_bind_shader(
 struct tgsi_exec_machine *mach,
 const struct tgsi_token *tokens,
-   struct tgsi_sampler *sampler)
+   struct tgsi_sampler *sampler,
+   struct tgsi_image *image)
  {
 uint k;
 struct tgsi_parse_context parse;
@@ -871,6 +872,7 @@ tgsi_exec_machine_bind_shader(

 mach->Tokens = tokens;
 mach->Sampler = sampler;
+   mach->Image = image;

 if (!tokens) {
/* unbind and free all */
@@ -3706,6 +3708,206 @@ exec_dfracexp(struct tgsi_exec_machine *mach,
 }
  }

+static int
+get_image_coord_dim(int tgsi_tex, int *sample)
+{
+   int dim;
+   switch (tgsi_tex) {
+   case TGSI_TEXTURE_BUFFER:
+   case TGSI_TEXTURE_1D:
+  dim = 1;
+  break;
+   case TGSI_TEXTURE_2D:
+   case TGSI_TEXTURE_RECT:
+   case TGSI_TEXTURE_1D_ARRAY:
+   case TGSI_TEXTURE_2D_MSAA:
+  dim = 2;
+  break;
+   case TGSI_TEXTURE_3D:
+   case TGSI_TEXTURE_CUBE:
+   case TGSI_TEXTURE_2D_ARRAY:
+   case TGSI_TEXTURE_2D_ARRAY_MSAA:
+   case TGSI_TEXTURE_CUBE_ARRAY:
+  dim = 3;
+  break;
+   default:
+  assert(!"unknown texture target");
+  dim = 0;
+  break;
+   }
+
+   if (sample) {
+  switch (tgsi_tex) {
+  case TGSI_TEXTURE_2D_MSAA:
+ *sample = 3;
+ break;
+  case TGSI_TEXTURE_2D_ARRAY_MSAA:
+ *sample = 4;
+ break;
+  default:
+ *sample = 0;
+ break;
+  }
+   }
+   return dim;
+}


That function seems to do two independent things.  Can this be two 
functions?
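
A minimal sketch of what such a split could look like. The enum and both
function names (tex_target, TEX_*, image_coord_dim, image_sample_channel)
are simplified stand-ins invented for illustration, not the real TGSI
definitions; the returned values mirror the ones in the quoted function.

```c
#include <assert.h>

/* Hypothetical stand-ins for the TGSI_TEXTURE_* targets in the patch. */
enum tex_target {
   TEX_BUFFER, TEX_1D, TEX_2D, TEX_RECT, TEX_1D_ARRAY, TEX_2D_MSAA,
   TEX_3D, TEX_CUBE, TEX_2D_ARRAY, TEX_2D_ARRAY_MSAA, TEX_CUBE_ARRAY
};

/* Number of coordinate channels needed to address a texel. */
static int
image_coord_dim(enum tex_target target)
{
   switch (target) {
   case TEX_BUFFER:
   case TEX_1D:
      return 1;
   case TEX_2D:
   case TEX_RECT:
   case TEX_1D_ARRAY:
   case TEX_2D_MSAA:
      return 2;
   case TEX_3D:
   case TEX_CUBE:
   case TEX_2D_ARRAY:
   case TEX_2D_ARRAY_MSAA:
   case TEX_CUBE_ARRAY:
      return 3;
   }
   assert(!"unknown texture target");
   return 0;
}

/* Channel that carries the sample index, or 0 for non-MSAA targets. */
static int
image_sample_channel(enum tex_target target)
{
   switch (target) {
   case TEX_2D_MSAA:
      return 3;
   case TEX_2D_ARRAY_MSAA:
      return 4;
   default:
      return 0;
   }
}
```

With this shape a caller asks for the coordinate count and the sample
channel separately, so no call site has to pass a NULL output pointer.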





+
+static void
+exec_load(struct tgsi_exec_machine *mach,
+  const struct tgsi_full_instruction *inst)
+{
+   union tgsi_exec_channel r[4], sample_r;
+   uint unit;
+   int sample;
+   int i, j;
+   int dim;
+   uint chan;
+   float rgba[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
+   struct tgsi_image_params params;
+   int kilmask = mach->Temps[TEMP_KILMASK_I].xyzw[TEMP_KILMASK_C].u[0];
+
+   unit = fetch_sampler_unit(mach, inst, 0);
+   dim = get_image_coord_dim(inst->Memory.Texture, &sample);
+   assert(dim <= 3);
+
+   params.execmask = mach->ExecMask & mach->NonHelperMask & ~kilmask;
+   params.unit = unit;
+   params.tgsi_tex_instr = inst->Memory.Texture;
+   params.format = inst->Memory.Format;
+
+   for (i = 0; i < dim; i++) {
+  IFETCH(&r[i], 1, TGSI_CHAN_X + i);
+   }
+
+   if (sample)
+  IFETCH(&sample_r, 1, TGSI_CHAN_X + sample);
+
+   mach->Image->load(mach->Image, &params,
+ r[0].i, r[1].i, r[2].i, sample_r.i,
+ rgba);
+   for (j = 0; j < TGSI_QUAD_SIZE; j++) {
+  r[0].f[j] = rgba[0][j];
+  r[1].f[j] = rgba[1][j];
+  r[2].f[j] = rgba[2][j];
+  r[3].f[j] = rgba[3][j];
+   }
+   for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
+  if (inst->Dst[0].Register.WriteMask & (1 << chan)) {
+ store_dest(mach, &r[chan], &inst->Dst[0], inst, chan, 
TGSI_EXEC_DATA_FLOAT);
+  }
+   }
+}
+
+static void
+exec_store(struct tgsi_exec_machine 

Re: [Mesa-dev] [PATCH 7/8] softpipe: add image support to softpipe

2016-03-22 Thread Brian Paul

A bunch of nit-picks below.

Overall, I'd like to see more comments on the new functions to explain 
what's going on.



On 03/21/2016 04:02 PM, Dave Airlie wrote:

From: Dave Airlie 

This adds support for ARB_shader_image_load_store to softpipe.

Signed-off-by: Dave Airlie 
---
  src/gallium/auxiliary/tgsi/tgsi_exec.h  |   4 +-
  src/gallium/drivers/softpipe/Makefile.sources   |   2 +
  src/gallium/drivers/softpipe/sp_context.c   |  20 +-
  src/gallium/drivers/softpipe/sp_context.h   |   2 +
  src/gallium/drivers/softpipe/sp_flush.c |  26 +
  src/gallium/drivers/softpipe/sp_flush.h |   2 +
  src/gallium/drivers/softpipe/sp_fs_exec.c   |   6 +-
  src/gallium/drivers/softpipe/sp_image.c | 643 
  src/gallium/drivers/softpipe/sp_image.h |  37 ++
  src/gallium/drivers/softpipe/sp_state.h |   7 +-
  src/gallium/drivers/softpipe/sp_state_derived.c |   3 +-
  src/gallium/drivers/softpipe/sp_state_image.c   |  57 +++
  src/gallium/drivers/softpipe/sp_texture.c   |   8 +-
  src/gallium/drivers/softpipe/sp_texture.h   |   4 +-
  14 files changed, 809 insertions(+), 12 deletions(-)
  create mode 100644 src/gallium/drivers/softpipe/sp_image.c
  create mode 100644 src/gallium/drivers/softpipe/sp_image.h
  create mode 100644 src/gallium/drivers/softpipe/sp_state_image.c

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h 
b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index 9ff8a72..99051ed 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -518,8 +518,10 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
 case PIPE_SHADER_CAP_TGSI_DROUND_SUPPORTED:
 case PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED:
 case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
-   case PIPE_SHADER_CAP_MAX_SHADER_IMAGES:
return 0;
+   case PIPE_SHADER_CAP_MAX_SHADER_IMAGES:
+  return PIPE_MAX_SHADER_IMAGES;
+
 case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
 }
diff --git a/src/gallium/drivers/softpipe/Makefile.sources 
b/src/gallium/drivers/softpipe/Makefile.sources
index 2af3d6a..efe8846 100644
--- a/src/gallium/drivers/softpipe/Makefile.sources
+++ b/src/gallium/drivers/softpipe/Makefile.sources
@@ -10,6 +10,7 @@ C_SOURCES := \
sp_flush.h \
sp_fs_exec.c \
sp_fs.h \
+   sp_image.c \
sp_limits.h \
sp_prim_vbuf.c \
sp_prim_vbuf.h \
@@ -31,6 +32,7 @@ C_SOURCES := \
sp_state_blend.c \
sp_state_clip.c \
sp_state_derived.c \
+   sp_state_image.c \
sp_state.h \
sp_state_rasterizer.c \
sp_state_sampler.c \
diff --git a/src/gallium/drivers/softpipe/sp_context.c 
b/src/gallium/drivers/softpipe/sp_context.c
index d2a3220..30b0276 100644
--- a/src/gallium/drivers/softpipe/sp_context.c
+++ b/src/gallium/drivers/softpipe/sp_context.c
@@ -50,7 +50,7 @@
  #include "sp_query.h"
  #include "sp_screen.h"
  #include "sp_tex_sample.h"
-
+#include "sp_image.h"

  static void
  softpipe_destroy( struct pipe_context *pipe )
@@ -199,6 +199,10 @@ softpipe_create_context(struct pipe_screen *screen,
softpipe->tgsi.sampler[i] = sp_create_tgsi_sampler();
 }

+   for (i = 0; i < PIPE_SHADER_TYPES; i++) {
+  softpipe->tgsi.image[i] = sp_create_tgsi_image();
+   }
+
 softpipe->dump_fs = debug_get_bool_option( "SOFTPIPE_DUMP_FS", FALSE );
 softpipe->dump_gs = debug_get_bool_option( "SOFTPIPE_DUMP_GS", FALSE );

@@ -216,6 +220,7 @@ softpipe_create_context(struct pipe_screen *screen,
 softpipe_init_streamout_funcs(&softpipe->pipe);
 softpipe_init_texture_funcs( &softpipe->pipe );
 softpipe_init_vertex_funcs(&softpipe->pipe);
+   softpipe_init_image_funcs(&softpipe->pipe);

 softpipe->pipe.set_framebuffer_state = softpipe_set_framebuffer_state;

@@ -223,7 +228,8 @@ softpipe_create_context(struct pipe_screen *screen,

 softpipe->pipe.clear = softpipe_clear;
 softpipe->pipe.flush = softpipe_flush_wrapped;
-
+   softpipe->pipe.texture_barrier = softpipe_texture_barrier;
+   softpipe->pipe.memory_barrier = softpipe_memory_barrier;
 softpipe->pipe.render_condition = softpipe_render_condition;

 /*
@@ -272,6 +278,16 @@ softpipe_create_context(struct pipe_screen *screen,
  (struct tgsi_sampler *)
 softpipe->tgsi.sampler[PIPE_SHADER_GEOMETRY]);

+   draw_image(softpipe->draw,
+  PIPE_SHADER_VERTEX,
+  (struct tgsi_image *)
+  softpipe->tgsi.image[PIPE_SHADER_VERTEX]);
+
+   draw_image(softpipe->draw,
+  PIPE_SHADER_GEOMETRY,
+  (struct tgsi_image *)
+  softpipe->tgsi.image[PIPE_SHADER_GEOMETRY]);
+
 if (debug_get_bool_option( "SOFTPIPE_NO_RAST", FALSE ))
softpipe->no_rast = TRUE;

diff --git a/src/gallium/drivers/softpipe/sp_context.h 
b/src/gallium/drivers/softpipe/sp_context.h
index d18bbe6..20a1235 100644
--- 

Re: [Mesa-dev] [PATCH 3/8] tgsi: introduce NonHelperMask

2016-03-22 Thread Brian Paul

On 03/21/2016 04:02 PM, Dave Airlie wrote:

From: Dave Airlie 

This is a mask of which of the current 2x2 grid are non-helper
invocations. This allows us to mask off the helper invocations
later for the image operations.


Can you elaborate on what a helper invocation is somewhere in the comments?
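
For context: in GLSL terms, a helper invocation is a fragment shader
invocation that runs for a pixel of the 2x2 quad which the primitive does
not cover. It exists only so that derivatives can be computed, and its
image stores and atomics must have no visible effect. A minimal sketch of
how such a mask could gate side effects, following the execmask expression
used in patch 5/8 of this series; the function name and the four-lane
assumption are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

#define QUAD_MASK 0xfu  /* one bit per pixel of a 2x2 quad (assumption) */

/*
 * Lanes that may perform image loads/stores/atomics: currently executing,
 * not a helper invocation, and not killed earlier in the shader.
 */
static uint32_t
image_exec_mask(uint32_t exec_mask, uint32_t non_helper_mask,
                uint32_t kill_mask)
{
   /* Only the four quad lanes are meaningful in this sketch. */
   assert((non_helper_mask & ~QUAD_MASK) == 0);
   return exec_mask & non_helper_mask & ~kill_mask;
}
```

For example, with all four lanes executing, lanes 0 and 1 covered by the
primitive and lane 0 killed, only lane 1 would be allowed to write.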




Signed-off-by: Dave Airlie 
---
  src/gallium/auxiliary/tgsi/tgsi_exec.c | 2 ++
  src/gallium/auxiliary/tgsi/tgsi_exec.h | 2 ++
  2 files changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index a44a05c..fa1c916 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -5199,6 +5199,8 @@ tgsi_exec_machine_run( struct tgsi_exec_machine *mach )
default_mask = 0x1;
 }

+   if (mach->NonHelperMask == 0)
+  mach->NonHelperMask = default_mask;
 mach->CondMask = default_mask;
 mach->LoopMask = default_mask;
 mach->ContMask = default_mask;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h 
b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index 011c9c3..05ae388 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -317,6 +317,8 @@ struct tgsi_exec_machine
 struct tgsi_exec_vector   QuadPos;
 float Face;/**< +1 if front facing, -1 if back 
facing */
 bool  flatshade_color;
+
+   uint NonHelperMask;  /**< non-helpers */
 /* Conditional execution masks */
 uint CondMask;  /**< For IF/ELSE/ENDIF */
 uint LoopMask;  /**< For BGNLOOP/ENDLOOP */





Re: [Mesa-dev] [PATCH 1/8] tgsi_exec: add support for up to 3 array registers

2016-03-22 Thread Brian Paul

In the subject, should that be "address" instead of "array"?


On 03/21/2016 04:02 PM, Dave Airlie wrote:

From: Dave Airlie 

Signed-off-by: Dave Airlie 
---
  src/gallium/auxiliary/tgsi/tgsi_exec.h | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h 
b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index 12a6875..011c9c3 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -205,12 +205,14 @@ struct tgsi_sampler
  #define TGSI_EXEC_NUM_TEMP_R4

  #define TGSI_EXEC_TEMP_ADDR (TGSI_EXEC_NUM_TEMPS + 8)
+#define TGSI_EXEC_TEMP_ADDR1(TGSI_EXEC_NUM_TEMPS + 9)
+#define TGSI_EXEC_TEMP_ADDR2(TGSI_EXEC_NUM_TEMPS + 10)

  /* predicate register */
-#define TGSI_EXEC_TEMP_P0   (TGSI_EXEC_NUM_TEMPS + 9)
+#define TGSI_EXEC_TEMP_P0   (TGSI_EXEC_NUM_TEMPS + 11)
  #define TGSI_EXEC_NUM_PREDS 1

-#define TGSI_EXEC_NUM_TEMP_EXTRAS   10
+#define TGSI_EXEC_NUM_TEMP_EXTRAS   12








Re: [Mesa-dev] [PATCH 23/29] nir: add i2d and u2d opcodes

2016-03-22 Thread Samuel Iglesias Gonsálvez


On 21/03/16 23:56, Jason Ekstrand wrote:
> On Mon, Mar 21, 2016 at 5:06 AM, Samuel Iglesias Gonsálvez <
> sigles...@igalia.com> wrote:
> 
>> From: Iago Toral Quiroga 
>>
>> ---
>>  src/compiler/nir/glsl_to_nir.cpp | 6 ++
>>  src/compiler/nir/nir_opcodes.py  | 2 ++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/src/compiler/nir/glsl_to_nir.cpp
>> b/src/compiler/nir/glsl_to_nir.cpp
>> index 952d787..d087a77 100644
>> --- a/src/compiler/nir/glsl_to_nir.cpp
>> +++ b/src/compiler/nir/glsl_to_nir.cpp
>> @@ -1357,6 +1357,12 @@ nir_visitor::visit(ir_expression *ir)
>> case ir_unop_d2i:  result = nir_d2i(&b, srcs[0]);   break;
>> case ir_unop_d2u:  result = nir_d2u(&b, srcs[0]);   break;
>> case ir_unop_d2b:  result = nir_d2b(&b, srcs[0]);   break;
>> +   case ir_unop_i2d:
>> +  result = supports_ints ? nir_i2d(&b, srcs[0]) : nir_fmov(&b,
>> srcs[0]);
>> +  break;
>> +   case ir_unop_u2d:
>> +  result = supports_ints ? nir_u2d(&b, srcs[0]) : nir_fmov(&b,
>> srcs[0]);
>>
> 
> If you're going to be using the u2d opcode, you'd better support integers.
> 

We did the same as for the integer-to-float conversions, to keep this code
aligned with what they do.

We can add an assert for supports_ints here and call only nir_u2d but,
to be consistent, we would need to make similar changes to i2d, u2f, and i2f
too, in a separate patch.

What do you think?

Sam
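
The two options discussed here can be contrasted in a small, self-contained
sketch. Everything in it (the conv_op enum and both function names) is a
hypothetical stand-in for illustration and not NIR's builder API: the first
function mirrors the fallback in the quoted patch, the second the
assert-based variant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the opcodes the visitor could emit. */
enum conv_op { OP_FMOV, OP_U2D };

/* As in the quoted patch: fall back to a move without integer support. */
static enum conv_op
pick_u2d_with_fallback(bool supports_ints)
{
   return supports_ints ? OP_U2D : OP_FMOV;
}

/* Alternative under discussion: integer support is a precondition. */
static enum conv_op
pick_u2d_strict(bool supports_ints)
{
   assert(supports_ints);
   return OP_U2D;
}
```

The strict variant turns a silently wrong conversion into an assertion
failure on drivers without integer support, at the cost of diverging from
how i2f and u2f are handled until those are changed as well.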

> 
>> +  break;
>> case ir_unop_i2u:
>> case ir_unop_u2i:
>> case ir_unop_bitcast_i2f:
>> diff --git a/src/compiler/nir/nir_opcodes.py
>> b/src/compiler/nir/nir_opcodes.py
>> index a161ac1..cf6ce83 100644
>> --- a/src/compiler/nir/nir_opcodes.py
>> +++ b/src/compiler/nir/nir_opcodes.py
>> @@ -164,6 +164,7 @@ unop_convert("f2u", tuint32, tfloat32, "src0") #
>> Float-to-unsigned conversion
>>  unop_convert("d2i", tint32, tfloat64, "src0") # Double-to-integer
>> conversion.
>>  unop_convert("d2u", tuint32, tfloat64, "src0") # Double-to-unsigned
>> conversion.
>>  unop_convert("i2f", tfloat32, tint32, "src0") # Integer-to-float
>> conversion.
>> +unop_convert("i2d", tfloat64, tint32, "src0") # Integer-to-double
>> conversion.
>>  # Float-to-boolean conversion
>>  unop_convert("f2b", tbool, tfloat32, "src0 != 0.0f")
>>  unop_convert("d2b", tbool, tfloat64, "src0 != 0.0")
>> @@ -173,6 +174,7 @@ unop_convert("b2f", tfloat32, tbool, "src0 ? 1.0f :
>> 0.0f")
>>  unop_convert("i2b", tbool, tint32, "src0 != 0")
>>  unop_convert("b2i", tint32, tbool, "src0 ? 1 : 0") # Boolean-to-int
>> conversion
>>  unop_convert("u2f", tfloat32, tuint32, "src0") # Unsigned-to-float
>> conversion.
>> +unop_convert("u2d", tfloat64, tuint32, "src0") # Unsigned-to-double
>> conversion.
>>  # double-to-float conversion
>>  unop_convert("d2f", tfloat32, tfloat64, "src0") # Single to double
>> precision
>>  unop_convert("f2d", tfloat64, tfloat32, "src0") # Double to single
>> precision
>> --
>> 2.5.0
>>
>>
> 


Re: [Mesa-dev] [PATCH 19/29] nir: fix up bit sizes for undefined alu sources

2016-03-22 Thread Samuel Iglesias Gonsálvez


On 21/03/16 23:54, Jason Ekstrand wrote:
> On Mon, Mar 21, 2016 at 5:05 AM, Samuel Iglesias Gonsálvez <
> sigles...@igalia.com> wrote:
> 
>> From: Iago Toral Quiroga 
>>
>> Undefined sources in alu operations don't have a valid bit size because
>> they are uninitialized. Simply ignoring undefined sources for bit size
>> validation is not enough since drivers can check and operate with the
>> bit-size and that can lead to issues later on. Instead, fix undefined
>> sources to always have a compatible bit size.
>>
> 
> I'm not sure what I think about this.  I think I'd rather have undefs
> simply have the right bitsize.
> 

With undefined sources you cannot get the bitsize from themselves
because it is not initialized. In that case, we pick the bit size from
the ALU opcode's input definition. If it is unsized, then we use the
destination size.

I think this is the right bitsize... or am I missing something?

Sam
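
The selection rule described above, as a minimal sketch. The function name
is made up for illustration, and a type size of 0 is assumed to mean
"unsized", matching how the quoted patch tests the result of
nir_alu_type_get_type_size():

```c
#include <assert.h>

/*
 * Bit size to give an undefined ALU source: the size required by the
 * opcode's input type if it has one, otherwise the destination's size.
 */
static unsigned
undef_src_bit_size(unsigned src_type_size, unsigned dest_bit_size)
{
   assert(dest_bit_size != 0);
   return src_type_size ? src_type_size : dest_bit_size;
}
```

So an undef feeding an opcode with a 64-bit input type gets 64 bits, while
an undef feeding an unsized opcode inherits whatever the destination uses.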

> 
>> v2 (Sam):
>> - Use helper to get type size from nir_alu_type.
>> ---
>>  src/compiler/nir/nir_validate.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_validate.c
>> b/src/compiler/nir/nir_validate.c
>> index 9f18d1c..645c15a 100644
>> --- a/src/compiler/nir/nir_validate.c
>> +++ b/src/compiler/nir/nir_validate.c
>> @@ -180,9 +180,11 @@ validate_alu_src(nir_alu_instr *instr, unsigned
>> index, validate_state *state)
>>
>> unsigned num_components;
>> unsigned src_bit_size;
>> +   bool is_undef = false;
>> if (src->src.is_ssa) {
>>src_bit_size = src->src.ssa->bit_size;
>>num_components = src->src.ssa->num_components;
>> +  is_undef = src->src.ssa->parent_instr->type ==
>> nir_instr_type_ssa_undef;
>> } else {
>>src_bit_size = src->src.reg.reg->bit_size;
>>if (src->src.reg.reg->is_packed)
>> @@ -205,12 +207,20 @@ validate_alu_src(nir_alu_instr *instr, unsigned
>> index, validate_state *state)
>>
>> if (nir_alu_type_get_type_size(src_type)) {
>>/* This source has an explicit bit size */
>> +  if (is_undef) {
>> + src_bit_size = nir_alu_type_get_type_size(src_type);
>> + src->src.ssa->bit_size = src_bit_size;
>> +  }
>>assert(nir_alu_type_get_type_size(src_type) == src_bit_size);
>> } else {
>>if
>> (!nir_alu_type_get_type_size(nir_op_infos[instr->op].output_type)) {
>>   unsigned dest_bit_size =
>>  instr->dest.dest.is_ssa ? instr->dest.dest.ssa.bit_size
>>  : instr->dest.dest.reg.reg->bit_size;
>> + if (is_undef) {
>> +src_bit_size = dest_bit_size;
>> +src->src.ssa->bit_size = dest_bit_size;
>> + }
>>   assert(dest_bit_size == src_bit_size);
>>}
>> }
>> --
>> 2.5.0
>>
>>
> 


Re: [Mesa-dev] [PATCH 14/29] nir: add support for printing double immediates

2016-03-22 Thread Samuel Iglesias Gonsálvez
On 21/03/16 23:49, Jason Ekstrand wrote:
> On Mon, Mar 21, 2016 at 5:05 AM, Samuel Iglesias Gonsálvez <
> sigles...@igalia.com> wrote:
> 
>> From: Connor Abbott 
>>
>> ---
>>  src/compiler/nir/nir_print.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
>> index 30a8233..0b3f954 100644
>> --- a/src/compiler/nir/nir_print.c
>> +++ b/src/compiler/nir/nir_print.c
>> @@ -719,7 +719,10 @@ print_load_const_instr(nir_load_const_instr *instr,
>> print_state *state)
>> * and then print the float in a comment for readability.
>> */
>>
>> -  fprintf(fp, "0x%08x /* %f */", instr->value.u32[i],
>> instr->value.f32[i]);
>> +  if (instr->def.bit_size == 64)
>> + fprintf(fp, "%f", instr->value.f64[i]);
>>
> 
> Let's print out the 64-bit integer here as well.  64-bit integer support
> may happen.
> 

OK, I will do the change.

Sam
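
A minimal sketch of what printing both the raw 64-bit integer and the
double could look like. The union and the function name are simplified
stand-ins for illustration, not the actual nir_print.c code; the output
layout copies the existing 32-bit path (hex value, then the float in a
comment).

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one component of a constant value. */
union const_value {
   uint32_t u32;
   float f32;
   uint64_t u64;
   double f64;
};

/* Format one component as hex followed by its float value in a comment. */
static int
format_const(char *buf, size_t len, unsigned bit_size, union const_value v)
{
   assert(bit_size == 32 || bit_size == 64);

   if (bit_size == 64)
      return snprintf(buf, len, "0x%016" PRIx64 " /* %f */", v.u64, v.f64);

   return snprintf(buf, len, "0x%08" PRIx32 " /* %f */", v.u32,
                   (double)v.f32);
}
```

Keeping the hex form next to the float means a 64-bit integer constant
stays readable even when its bit pattern is meaningless as a double.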

> 
>> +  else
>> + fprintf(fp, "0x%08x /* %f */", instr->value.u32[i],
>> instr->value.f32[i]);
>> }
>>
>> fprintf(fp, ")");
>> --
>> 2.5.0
>>
>>
> 


Re: [Mesa-dev] [PATCH 03/29] nir/vars_to_ssa: adapt to different bit sizes

2016-03-22 Thread Samuel Iglesias Gonsálvez


On 21/03/16 22:03, Jason Ekstrand wrote:
> On Mon, Mar 21, 2016 at 5:05 AM, Samuel Iglesias Gonsálvez <
> sigles...@igalia.com> wrote:
> 
>> From: Connor Abbott 
>>
>> v2 (Sam):
>> - Keep using nir_op_imov when calling nir_alu_instr_create() at
>> rename_variables_block(). nir_op_fmov is not needed anymore.
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez 
>> ---
>>  src/compiler/nir/nir_lower_vars_to_ssa.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c
>> b/src/compiler/nir/nir_lower_vars_to_ssa.c
>> index 2331791..511662e 100644
>> --- a/src/compiler/nir/nir_lower_vars_to_ssa.c
>> +++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
>> @@ -543,6 +543,8 @@ get_ssa_def_for_block(struct deref_node *node,
>> nir_block *block,
>> nir_ssa_undef_instr *undef =
>>nir_ssa_undef_instr_create(state->shader,
>>   glsl_get_vector_elements(node->type));
>> +   undef->def.bit_size =
>> +  glsl_get_bit_size(glsl_get_base_type(node->type));
>>
> 
> Can we instead make nir_ssa_undef_instr_create take a bit size?  That seems
> better than setting it manually.  We probably want to do the same for
> load_const.
> 

Sure, I will do it for both.

Sam
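
A rough sketch of the constructor shape being agreed on, where the bit size
is a required argument instead of a field patched up by each caller. The
struct and function below are simplified, hypothetical stand-ins and not
the real nir_ssa_undef_instr_create():

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an SSA undef instruction. */
struct undef_instr {
   unsigned num_components;
   unsigned bit_size;
};

/* Create an undef with its bit size fixed at construction time. */
static struct undef_instr *
undef_instr_create(unsigned num_components, unsigned bit_size)
{
   struct undef_instr *undef;

   assert(bit_size != 0);

   undef = calloc(1, sizeof(*undef));
   if (!undef)
      return NULL;

   undef->num_components = num_components;
   undef->bit_size = bit_size;
   return undef;
}
```

With the size in the signature, a caller cannot forget to set it, which is
the failure mode the manual assignments in this patch are guarding against.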

> 
>> nir_instr_insert_before_cf_list(&state->impl->body, &undef->instr);
>> def_stack_push(node, &undef->def, state);
>> return &undef->def;
>> @@ -627,6 +629,7 @@ rename_variables_block(nir_block *block, struct
>> lower_variables_state *state)
>> nir_ssa_undef_instr *undef =
>>nir_ssa_undef_instr_create(state->shader,
>>   intrin->num_components);
>> +   undef->def.bit_size = intrin->dest.ssa.bit_size;
>>
>> nir_instr_insert_before(&intrin->instr, &undef->instr);
>> nir_instr_remove(&intrin->instr);
>> --
>> 2.5.0
>>
>>
> 


Re: [Mesa-dev] [PATCH] swrast: silence texture_slices warning

2016-03-22 Thread Brian Paul

I just pushed a (trivial) patch that fixed this.

-Brian

On 03/21/2016 05:52 PM, Dave Airlie wrote:

From: Dave Airlie 

In file included from ../../src/compiler/glsl/list.h:74:0,
  from ./main/mtypes.h:47,
  from ./main/errors.h:43,
  from ./main/imports.h:44,
  from ./main/context.h:52,
  from swrast/s_texture.c:30:
swrast/s_texture.c: In function ‘check_map_teximage’:
swrast/s_texture.c:191:34: warning: passing argument 1 of ‘texture_slices’ 
discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
 assert(slice < texture_slices(texImage));
   ^
swrast/s_texture.c:63:1: note: expected ‘struct gl_texture_image *’ but 
argument is of type ‘const struct gl_texture_image *’
  texture_slices(struct gl_texture_image *texImage)
  ^

Signed-off-by: Dave Airlie 
---
  src/mesa/swrast/s_texture.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
index 25918e3..d35bea9 100644
--- a/src/mesa/swrast/s_texture.c
+++ b/src/mesa/swrast/s_texture.c
@@ -60,7 +60,7 @@ _swrast_delete_texture_image(struct gl_context *ctx,
  }

  static unsigned int
-texture_slices(struct gl_texture_image *texImage)
+texture_slices(const struct gl_texture_image *texImage)
  {
 if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY)
return texImage->Height;
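
As a rough illustration of why the one-word change silences the warning: a
helper that only reads through its argument can take a pointer to const, and
then callers holding a const pointer no longer discard the qualifier. The
sketch_* names below are hypothetical, cut-down stand-ins for the Mesa
structures, and the fallback to the depth field is an assumption (only the
1D-array branch is visible in the diff above).

```c
#include <assert.h>

/* Hypothetical, cut-down stand-ins for the Mesa structures. */
enum sketch_target { SKETCH_TEXTURE_1D_ARRAY, SKETCH_TEXTURE_2D_ARRAY };

struct sketch_texture_object {
   enum sketch_target target;
};

struct sketch_texture_image {
   const struct sketch_texture_object *tex_object;
   unsigned height;
   unsigned depth;
};

/* Read-only helper: taking a pointer to const lets callers that only
 * hold a const image call it without discarding the qualifier. */
static unsigned
sketch_texture_slices(const struct sketch_texture_image *img)
{
   assert(img && img->tex_object);

   if (img->tex_object->target == SKETCH_TEXTURE_1D_ARRAY)
      return img->height;
   return img->depth; /* assumed fallback, not shown in the diff */
}

/* A caller that only has a const pointer, as check_map_teximage does. */
static int
sketch_slice_in_range(const struct sketch_texture_image *img, unsigned slice)
{
   return slice < sketch_texture_slices(img);
}
```

Had sketch_texture_slices() taken a non-const pointer, the call inside
sketch_slice_in_range() would produce the same -Wdiscarded-qualifiers
warning quoted above.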





Re: [Mesa-dev] [RFCv3 04/11] nir: allow pre-resolved sampler uniform locations

2016-03-22 Thread Rob Clark
So I've been looking a bit more at this, and skipping the
shader_program->UniformStorage[location].opaque[stage].index offset
is, I think, not the right thing to do.

The original issue I was having with this seems to be a disagreement
about location idx into UniformStorage.  In
parcel_out_uniform_storage::handle_samplers we end up w/
UniformStorage[id] where id comes from UniformHash, but id
doesn't match up with var->data.location in NIR (at least in some
cases, seems to work sometimes but that might just be a happy
accident, ie. when both just happen to be zero).

BR,
-R

On Sun, Jan 31, 2016 at 3:16 PM, Rob Clark  wrote:
> From: Rob Clark 
>
> With TGSI, the ir_variable::data.location gets fixed up to be a stage
> local location (rather than program global).  In this case we need to
> skip the UniformStorage[location] lookup.
>
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/nir_lower_samplers.c | 23 ---
>  1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/src/compiler/nir/nir_lower_samplers.c 
> b/src/compiler/nir/nir_lower_samplers.c
> index 96e8291..c95a474 100644
> --- a/src/compiler/nir/nir_lower_samplers.c
> +++ b/src/compiler/nir/nir_lower_samplers.c
> @@ -129,14 +129,18 @@ lower_sampler(nir_tex_instr *instr, const struct 
> gl_shader_program *shader_progr
>instr->sampler_array_size = array_elements;
> }
>
> -   if (location > shader_program->NumUniformStorage - 1 ||
> -   !shader_program->UniformStorage[location].opaque[stage].active) {
> -  assert(!"cannot return a sampler");
> -  return;
> -   }
> +   if (!shader_program) {
> +  instr->sampler_index += location;
> +   } else {
> +  if (location > shader_program->NumUniformStorage - 1 ||
> +  !shader_program->UniformStorage[location].opaque[stage].active) {
> + assert(!"cannot return a sampler");
> + return;
> +  }
>
> -   instr->sampler_index +=
> -  shader_program->UniformStorage[location].opaque[stage].index;
> +  instr->sampler_index +=
> + shader_program->UniformStorage[location].opaque[stage].index;
> +   }
>
> instr->sampler = NULL;
>  }
> @@ -176,6 +180,11 @@ lower_impl(nir_function_impl *impl, const struct 
> gl_shader_program *shader_progr
> nir_foreach_block(impl, lower_block_cb, &state);
>  }
>
> +/* Call with a null 'shader_program' if uniform locations are
> + * already local to the shader, ie. skipping the
> + * shader_program->UniformStorage[location].opaque[stage].index
> + * lookup
> + */
>  void
>  nir_lower_samplers(nir_shader *shader,
> const struct gl_shader_program *shader_program)
> --
> 2.5.0
>
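
To make the two index spaces in the quoted patch concrete, here is a minimal
sketch of the lookup it performs: with a null program the location is treated
as already stage-local and added directly, otherwise it is used as an index
into the program-wide uniform storage. All sketch_* names are hypothetical
simplifications (a single stage, no array handling), not the actual
nir_lower_samplers code. The bounds check is written as `>=` rather than the
patch's `> N - 1`, which would wrap around for an empty storage array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplification: per-uniform sampler info for one stage. */
struct sketch_opaque {
   unsigned index;
   bool active;
};

struct sketch_uniform_storage {
   struct sketch_opaque opaque;
};

struct sketch_program {
   unsigned num_uniform_storage;
   const struct sketch_uniform_storage *uniform_storage;
};

/* Adds the resolved offset to *sampler_index and returns true, or
 * returns false (leaving *sampler_index alone) when 'location' does
 * not name an active sampler in the program-wide storage. */
static bool
sketch_resolve_sampler(const struct sketch_program *prog,
                       unsigned location, unsigned *sampler_index)
{
   assert(sampler_index != NULL);

   if (!prog) {
      /* Location is assumed to be stage-local already. */
      *sampler_index += location;
      return true;
   }

   if (location >= prog->num_uniform_storage ||
       !prog->uniform_storage[location].opaque.active)
      return false;

   /* Location is program-global: translate through the storage table. */
   *sampler_index += prog->uniform_storage[location].opaque.index;
   return true;
}
```

The follow-up at the top of this message is about the second branch: if the
location stored in the NIR variable is not the same id that was used to fill
the storage table, the translation reads the wrong entry.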


[Mesa-dev] [PATCH] compiler/glsl: Allow the sequence operator to be a constant expression

2016-03-22 Thread Lars Hamre
Resending this patch because it received no response last week.

Allow the sequence operator to be a constant expression in GLSL ES versions
prior to GLSL ES 3.0.

Fixes the following piglit test:
   /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert

This mirrors the logic from process_initializer() which performs the
same check for constant variable initialization with sequence operators.

Section 4.3.3 (Constant Expressions) of the GLSL 4.30.9 spec and of the
GLSL ES 3.00.4 spec say that the result of a sequence operator is not a
constant expression; however, we should not mandate that for lower GLSL
versions.

Signed-off-by: Lars Hamre 

---
 src/compiler/glsl/ast_to_hir.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 5262bd8..4037468 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2125,7 +2125,9 @@ process_array_size(exec_node *node,
}

ir_constant *const size = ir->constant_expression_value();
-   if (size == NULL || array_size->has_sequence_subexpression()) {
+   if (size == NULL ||
+   (state->is_version(430, 300) &&
+array_size->has_sequence_subexpression())) {
   _mesa_glsl_error(& loc, state, "array size must be a "
"constant valued expression");
   return 0;
--
2.5.0



Re: [Mesa-dev] [PATCH 11/17] gallium/aux: Fix u_blitter.c for layers/samples

2016-03-22 Thread eocallaghan
Ah, you are correct: this is no longer needed in the push branch. We can
drop this one from the series as it's a no-op. Please ignore it, and thanks
for spotting it.


On 2016-03-22 02:43, Marek Olšák wrote:

Does this fix anything even? The blitter always binds something, thus
this should have no effect.

Marek

On Sat, Mar 19, 2016 at 7:41 AM, Edward O'Callaghan
 wrote:

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/auxiliary/util/u_blitter.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c 
b/src/gallium/auxiliary/util/u_blitter.c

index 43fbd8e..c4a32e8 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1566,11 +1566,13 @@ void util_blitter_blit_generic(struct 
blitter_context *blitter,

/* Initialize framebuffer state. */
fb_state.width = dst->width;
fb_state.height = dst->height;
-   fb_state.nr_cbufs = blit_depth || blit_stencil ? 0 : 1;
fb_state.cbufs[0] = NULL;
fb_state.zsbuf = NULL;

if (blit_depth || blit_stencil) {
+  fb_state.nr_cbufs = 0;
+  fb_state.layers = 0;
+  fb_state.samples = 1;
   pipe->bind_blend_state(pipe, ctx->blend[0][0]);

   if (blit_depth && blit_stencil) {
@@ -1594,6 +1596,7 @@ void util_blitter_blit_generic(struct 
blitter_context *blitter,

   }

} else {
+  fb_state.nr_cbufs = 1;
   unsigned colormask = mask & PIPE_MASK_RGBA;

   pipe->bind_blend_state(pipe, 
ctx->blend[colormask][alpha_blend]);

--
2.5.0



Re: [Mesa-dev] Mesa (master): 23 new commits

2016-03-22 Thread Michel Dänzer
On 22.03.2016 05:34, Nicolai Hähnle wrote:
> 
> URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=e85cf35a6516c44e33663fcd9637c6b434bb63ee
> Author: Nicolai Hähnle 
> Date:   Sat Feb 6 18:32:13 2016 -0500
> 
> radeonsi: implement set_shader_images (v2)
> 
> Whether DCC is disabled depends on the access flags with which the image
> is bound: image_load supports DCC, but store and atomic don't.
> 
> v2: remove an unnecessary masking of images->desc.enabled_mask
> 
> Reviewed-by: Marek Olšák 

This change broke a bunch of MSAA related piglit tests for me on Kaveri,
e.g.

spec@!opengl 3.2@layered-rendering@clear-color-all-types 2d_multisample_array single_level
spec@arb_texture_multisample@texelfetch fs sampler2dmsarray 4 1x129x9-98x129x9
spec@arb_texture_multisample@texelfetch@2-fs-isampler2dmsarray

and related tests. Any ideas what could be wrong?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer