date:20140519

Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.

2014-05-19 Thread Rogovin, Kevin

>> For OpenGL ES, I propose a simpler solution:
>> - don't touch ARB_texture_float at all
>> - add OES_texture_float to gl_extensions
>> - add OES_texture_float_linear to gl_extensions
>> - define OES_texture_half_float as o(OES_texture_float)
>> - define OES_texture_half_float_linear as o(OES_texture_float_linear)
>>
>> Then, drivers can enable the extensions as they see fit.

>That sounds like a happy medium.  It seems like we could use
>ARB_texture_float as the enable for OES_texture_float, but I'm not
>crying over one extra flag.

I think it is actually the most unhappy medium. The patch as-is enables floating point
textures in GLES2 on hardware targets without affecting any DRI drivers (or the
Gallium state tracker). That was the original purpose of the patch. On one side, having
4 separate booleans gives complete resolution to the situation for floating point
textures. Having just 2 means that some resolution is provided but it is not complete;
it will need to be revisited and will leave whoever made or pushed the patch embarrassed
about not dotting the i's and crossing the t's. Additionally, whether we add 2 or 4
booleans, if we leave ARB_texture_float in place we are still left with the situation
that the booleans are not orthogonal. Also, what does ARB_texture_float support
then mean? What contract is it satisfying that mesa/main can rely upon?

Going further, what happens when/if we want to add support for GL_ARB_ES2_compatibility
and also expose OES extensions (as NVIDIA does)? [I admit exposing OES extensions in
a non-ES context sounds gross, but the whole point of ES2_compatibility is to make ports
from GLES to GL almost a no-op, so the OES extensions should come too].

>It will mean that a bunch of extension checks in the code will need to
>be expanded.
>
>We'll probably also want a negative test that verifies an error is
>generated for glTexParameteri(..., GL_LINEAR_MIPMAP_LINEAR) when
>OES_texture_float_linear (or OES_texture_half_float_linear) is not
>supported.

This is the other reason why I do not want to go down the multiple-booleans route
initially: the patch would then touch much more code; the all-or-nothing approach avoids
all sorts of additional ickiness.

Let's put the patch in as-is (because from the point of view of mesa/main it looks correct)
and then follow up with a subsequent patch, after some discussion, to support situations
like the r300 partial floating point texture support.
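
To make the four-boolean idea concrete, here is a minimal sketch, assuming
hypothetical field and function names (they are not the actual members of
Mesa's struct gl_extensions). It only shows how GL_ARB_texture_float could be
derived from four independent OES flags, following the reading of the ARB
spec given in this thread.

```c
#include <stdbool.h>

/* Hypothetical per-driver flags; the names are illustrative only and are
 * not the real members of Mesa's struct gl_extensions. */
struct float_tex_caps {
   bool OES_texture_float;
   bool OES_texture_half_float;
   bool OES_texture_float_linear;
   bool OES_texture_half_float_linear;
};

/* Advertise ARB_texture_float only when all four OES pieces are present.
 * This encodes the interpretation discussed in the thread, not a
 * statement of what Mesa does. */
static bool
derive_ARB_texture_float(const struct float_tex_caps *caps)
{
   return caps->OES_texture_float &&
          caps->OES_texture_half_float &&
          caps->OES_texture_float_linear &&
          caps->OES_texture_half_float_linear;
}
```

With this shape, a driver with partial support (the r300 case) would set only
the flags it can honour and the ARB extension would simply not be derived.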

-Kevin

> Marek
>
> On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin  
> wrote:
>> Hi,
>>
>>   Each of the four extensions is right now set to be advertised if and only
>> if a GL context would advertise GL_ARB_texture_float:
>>
>> { "GL_OES_texture_float",             o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float",        o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_float_linear",      o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float_linear", o(ARB_texture_float), ES2, 2005 },
>>
>> From my interpretation of ARB_texture_float, that extension requires both
>> 16-bit and 32-bit textures and the ability to filter such textures linearly. Did
>> I misunderstand the specification? If I got the specification correct, then
>> the r300 should not be advertising any of the extensions, for otherwise it
>> would be advertising GL_ARB_texture_float.
>>
>> However, the r300 does give an example of the ability to support some of the OES
>> extensions but not all. Previously Matt asked if there was an example or need
>> and I thought not. It turns out I was wrong and there is a need, at least for
>> the r300. Supporting that granularity is going to be a bigger patch since it
>> would require changing the data structure struct gl_extensions to have four
>> entries and in turn additional logic to combine them into
>> GL_ARB_texture_float. The correct, more-work way to do it would be to
>> remove ARB_texture_float from gl_extensions, add a GLboolean for each of the
>> 4 OES extensions, change each driver to fill them correctly, and then add
>> logic in creating the extension string(s) to check whether all 4
>> OES extensions are TRUE and, if so, advertise GL_ARB_texture_float. We could
>> also instead just add the 4 OES booleans and have additional logic in
>> mesa/main to set each of them to TRUE if ARB_texture_float is true. The latter
>> solution, though easier, is less clean and begging for trouble later.
>> Regardless, let's first get this patch as-is into Mesa, then do the "right"
>> thing to allow a backend to support a subset of the OES extensions without
>> needing to support the ARB extension.
>>
>> -Kevin
>>
>>
>>
>> 
>> From: Marek Olšák [mar...@gmail.com]
>> Sent: Friday, May 16, 2014 4:33 PM
>> To: Rogovin, Kevin
>> Cc: mesa-dev@lists.freedesktop.org
>> Subject

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Kenneth Graunke

On 05/19/2014 03:52 PM, Chris Forbes wrote:
> If you're going to do that, you'd really want to add draw buffer count
> to the cache key (and i guess this might be the point where you
> convert the blit shader cache to be a hashtable), to avoid recompiling
> all the time if the app does two blits with the same target but
> different draw buffer counts.
> 
> This all seems like a huge amount of extra machinery to avoid using
> gl_FragColor and having the backend just take care of it, though. What
> do we actually gain from this?

One thing that's bothered me about our blit code...the integer RT
support is rather sketchy.

Eric pointed out that it ought to work: we interpret the integer source
buffer as float, copy those bits to gl_FragColor - which takes a float -
and then write the bits out as if the destination were float.  It should
preserve the bits, and filtering should be off...

One annoying thing is that there's no int/uint equivalent to
gl_FragColor...so if you want to write to all the render targets, you
have to do something like this.  (Or, we'd have to add something to the
language...)
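
For illustration, the "copy the bits through a float" idea can be sketched on
the CPU side in plain C. This is only an analogy under stated assumptions: the
function names are made up, and memcpy is guaranteed to preserve bits whereas
a real GPU pipeline may not be (for example if it touches NaN or denormal
bit patterns), which is part of why the approach feels sketchy.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret 32 integer bits as a float without converting the value. */
static float
bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Reinterpret a float's storage as 32 integer bits. */
static uint32_t
float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return bits;
}

/* Hypothetical stand-in for one texel of an integer blit routed through a
 * float-typed output: read as float, write as float, no arithmetic. */
static uint32_t
blit_texel_via_float(uint32_t src)
{
   float as_float = bits_to_float(src);
   return float_to_bits(as_float);
}
```

As long as nothing performs arithmetic or filtering on the intermediate float,
the integer payload comes out unchanged.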

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS

2014-05-19 Thread Tapani


On 05/19/2014 08:26 PM, Ian Romanick wrote:

On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Patch adds new implementation dependent value required by the
GL_ARB_explicit_uniform_location extension. Default value for user
assignable locations is calculated as sum of MaxUniformComponents
for each stage.

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/context.c  | 10 +-
  src/mesa/main/get.c  |  1 +
  src/mesa/main/get_hash_params.py |  1 +
  src/mesa/main/mtypes.h   |  5 +
  src/mesa/main/tests/enum_strings.cpp |  1 +
  5 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 860ae86..8b77df1 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx)
 ctx->Const.MaxUniformBlockSize = 16384;
 ctx->Const.UniformBufferOffsetAlignment = 1;
  
-   for (i = 0; i < MESA_SHADER_STAGES; i++)

+   /* GL_ARB_explicit_uniform_location, initial value calculated
+* as sum of MaxUniformComponents for each stage.
+*/
+   ctx->Const.MaxUserAssignableUniformLocations = 0;
+
+   for (i = 0; i < MESA_SHADER_STAGES; i++) {
init_program_limits(ctx, i, &ctx->Const.Program[i]);
+  ctx->Const.MaxUserAssignableUniformLocations +=
+ ctx->Const.Program[i].MaxUniformComponents;
+   }

This is just going to set ctx->Const.MaxUserAssignableUniformLocations
to 4 * 4 * MAX_UNIFORMS, and that's probably not what we want.  Maybe
just set 4 * MAX_UNIFORMS with a comment saying it's, "MAX_UNIFORMS for
each possible shader stage."


There should be many more locations than uniforms, though (?).
MAX_UNIFORMS refers to the count of available vec4 uniforms, and each of these
should have 4 locations available. Also, the value from the above formula
nicely matches the binary drivers, so IMO it shouldn't be 'too much'.
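
The two candidate formulas can be compared with a little arithmetic. The
sketch below assumes a hypothetical MAX_UNIFORMS of 4096 and four shader
stages (the count implied by the review comment); the real values come from
Mesa's configuration and the driver, so treat the numbers as examples only.

```c
/* Hypothetical values, for illustration only. */
#define SKETCH_MAX_UNIFORMS  4096u  /* vec4 uniforms per stage (assumed) */
#define SKETCH_SHADER_STAGES 4u     /* stage count implied by the review */

/* The patch's formula: sum the per-stage component limits, where each vec4
 * uniform contributes 4 components. */
static unsigned
locations_from_components(void)
{
   unsigned total = 0;
   for (unsigned i = 0; i < SKETCH_SHADER_STAGES; i++)
      total += 4u * SKETCH_MAX_UNIFORMS;
   return total;
}

/* The reviewer's suggestion: MAX_UNIFORMS for each possible shader stage. */
static unsigned
locations_from_uniform_count(void)
{
   return SKETCH_SHADER_STAGES * SKETCH_MAX_UNIFORMS;
}
```

Under these assumed numbers the first formula gives 4 * 4 * 4096 = 65536
locations and the second gives 4 * 4096 = 16384.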



 ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES;
 ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH;
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6d95790..8b50441 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array);
  EXTRA_EXT(ARB_compute_shader);
  EXTRA_EXT(ARB_gpu_shader5);
  EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
+EXTRA_EXT(ARB_explicit_uniform_location);
  
  static const int

  extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 06d0bba..5709d42 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -474,6 +474,7 @@ descriptor=[
[ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING), NO_EXTRA" ],
[ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA" ],
[ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA" ],
+  [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations), NO_EXTRA" ],

Ditto on Petri's comment.


[ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth), NO_EXTRA" ],
[ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA" ],
[ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA" ],
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7ac6bbe..fefbe06 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3311,6 +3311,11 @@ struct gl_constants
 GLuint UniformBufferOffsetAlignment;
 /** @} */
  
+   /**

+* GL_ARB_explicit_uniform_location
+*/
+   GLuint MaxUserAssignableUniformLocations;
+
 /** GL_ARB_geometry_shader4 */
 GLuint MaxGeometryOutputVertices;
 GLuint MaxGeometryTotalOutputComponents;
diff --git a/src/mesa/main/tests/enum_strings.cpp b/src/mesa/main/tests/enum_strings.cpp
index 3795700..298ff6a 100644
--- a/src/mesa/main/tests/enum_strings.cpp
+++ b/src/mesa/main/tests/enum_strings.cpp
@@ -787,6 +787,7 @@ const struct enum_info everything[] = {
 { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" },
 { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" },
 { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" },
+   { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" },
 { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" },
 { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" },
 { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" },




Re: [Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.

2014-05-19 Thread Tapani


On 05/19/2014 08:21 PM, Ian Romanick wrote:

Either this patch should:

  - Delete the extension enable flag
  - Change the table in extensions.c to use dummy_true

or

The next patch needs to not say "all drivers that support GLSL".

I think we should just enable it everywhere.


OK, I was following the way GL_ARB_explicit_attrib_location was
enabled. That one is still only for "all drivers that support GLSL", and
you really need GLSL to be able to use attributes or uniforms. I can
enable it everywhere via dummy_true.



On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Signed-off-by: Tapani Pälli 
---
  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
  src/mesa/state_tracker/st_extensions.c   | 1 +
  2 files changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 15fcd30..f8abf98 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
 ctx->Extensions.ARB_draw_instanced = true;
 ctx->Extensions.ARB_ES2_compatibility = true;
 ctx->Extensions.ARB_explicit_attrib_location = true;
+   ctx->Extensions.ARB_explicit_uniform_location = true;
 ctx->Extensions.ARB_fragment_coord_conventions = true;
 ctx->Extensions.ARB_fragment_program = true;
 ctx->Extensions.ARB_fragment_program_shadow = true;
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 3e1e45d..5b11e7b 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
 ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
 ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
 ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
+   ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
 ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
 ctx->Extensions.ARB_fragment_program = GL_TRUE;
 ctx->Extensions.ARB_fragment_shader = GL_TRUE;




Re: [Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location

2014-05-19 Thread Tapani


On 05/19/2014 08:18 PM, Ian Romanick wrote:

On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Patch adds a preprocessor define for the extension and stores explicit
location data for uniforms during AST->HIR conversion. It also sets
layout token to be available when having the extension in place.

Signed-off-by: Tapani Pälli 
---
  src/glsl/ast_to_hir.cpp   | 37 +
  src/glsl/glcpp/glcpp-parse.y  |  3 +++
  src/glsl/glsl_lexer.ll|  1 +
  src/glsl/glsl_parser_extras.h | 14 ++
  4 files changed, 55 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 8d55ee3..7431ad7 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2170,6 +2170,43 @@ validate_explicit_location(const struct ast_type_qualifier *qual,
  {
 bool fail = false;
  
+   /* Checks for GL_ARB_explicit_uniform_location. */

+   if (qual->flags.q.uniform) {
+

Extra blank line.

oops


+  if (!state->check_explicit_uniform_location_allowed(loc, var))
+ return;
+
+  const struct gl_context *const ctx = state->ctx;
+  unsigned max_loc = qual->location + var->type->component_slots() - 1;

I think that over counts for this purpose, and we can blame confusing
nomenclature.  component_slots for a mat4 is 4, so a mat4 uniform counts
4*4 against the GL_MAX_VERTEX_UNIFORM_COMPONENTS limit.  However, it
only has one "location" (as returned by glGetUniformLocation), so it
only counts 1 against the GL_MAX_UNIFORM_LOCATIONS limit.


I see, I was considering structs and arrays when writing this part and
forgot about matrices. I assume matrices are the only special case here,
though? Everything else gets the correct location count via
component_slots().
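
A rough sketch of location counting, as opposed to component counting, might
look like the following. The toy type description and every name in it are
hypothetical (this is not Mesa's glsl_type API). It assumes, per my reading of
the extension spec, that scalars, vectors and matrices each take one location,
arrays take one run per element, and struct members are laid out one after
another.

```c
#include <stddef.h>

enum sketch_kind {
   SKETCH_SCALAR, SKETCH_VECTOR, SKETCH_MATRIX, SKETCH_ARRAY, SKETCH_STRUCT
};

/* Toy type description; hypothetical, not Mesa's glsl_type. */
struct sketch_type {
   enum sketch_kind kind;
   unsigned length;                         /* array length or field count */
   const struct sketch_type *element;       /* element type for arrays */
   const struct sketch_type *const *fields; /* field types for structs */
};

/* Number of uniform locations a variable of type t would occupy. */
static unsigned
count_uniform_locations(const struct sketch_type *t)
{
   switch (t->kind) {
   case SKETCH_ARRAY:
      return t->length * count_uniform_locations(t->element);
   case SKETCH_STRUCT: {
      unsigned n = 0;
      for (unsigned i = 0; i < t->length; i++)
         n += count_uniform_locations(t->fields[i]);
      return n;
   }
   default:
      /* Scalars, vectors and matrices: a single location each. */
      return 1;
   }
}
```

Note that under this assumption vectors are also counted once, so a
component-based count would over-count them as well as matrices.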



+
+  /* ARB_explicit_uniform_location specification states:
+   *
+   * "The explicitly defined locations and the generated locations
+   * must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one."
+   *
+   * "Valid locations for default-block uniform variable locations
+   * are in the range of 0 to the implementation-defined maximum
+   * number of uniform locations."
+   */
+  if (qual->location < 0) {
+ _mesa_glsl_error(loc, state,
+  "explicit location < 0 for uniform %s", var->name);
+ return;
+  }
+
+  if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) {
+ _mesa_glsl_error(loc, state, "location qualifier for uniform %s "
+  ">= MAX_UNIFORM_LOCATIONS (%u)",
+  var->name,
+  ctx->Const.MaxUserAssignableUniformLocations);
+ return;
+  }
+
+  var->data.explicit_location = true;
+  var->data.location = qual->location;
+  return;
+   }
+
 /* Between GL_ARB_explicit_attrib_location an
  * GL_ARB_separate_shader_objects, the inputs and outputs of any shader
  * stage can be assigned explicit locations.  The checking here associates
diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f28d853..6d42138 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
  if (extensions->ARB_explicit_attrib_location)
     add_builtin_define(parser, "GL_ARB_explicit_attrib_location", 1);
  
+	  if (extensions->ARB_explicit_uniform_location)

+add_builtin_define(parser, "GL_ARB_explicit_uniform_location", 1);
+
  if (extensions->ARB_shader_texture_lod)
 add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1);
  
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll

index 7602351..83f0b6d 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -393,6 +393,7 @@ layout  {
  || yyextra->AMD_conservative_depth_enable
  || yyextra->ARB_conservative_depth_enable
  || yyextra->ARB_explicit_attrib_location_enable
+ || yyextra->ARB_explicit_uniform_location_enable
|| yyextra->has_separate_shader_objects()
  || yyextra->ARB_uniform_buffer_object_enable
  || yyextra->ARB_fragment_coord_conventions_enable
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c53c583..20879a0 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state {
return true;
 }
  
+   bool check_explicit_uniform_location_allowed(YYLTYPE *locp,

+const ir_variable *var)
+   {
+  /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */
+  if (ctx->Version < 33 && !ctx->Extensions.ARB_explicit_attrib_location) {
+ _mesa_glsl_error(locp, this,

[Mesa-dev] ARB_sso layout() + other qualifiers

2014-05-19 Thread Chris Forbes

Hi Ian,

When I was writing the `precise` support I found some error cases in
the GLSL parser where we reject combinations of layout() with
invariant / interpolation / etc qualifiers.

This seems to be consistent with the GLSL 1.50 grammar (or, at least,
admits all the examples that were given in various GLSL specs and
extension specs), but I don't think it works any more with SSO, since
you'd want to be able to do rendezvous-by-location on qualified
variables. The body of the ARB_sso spec doesn't clearly make the
changes required to allow this, but various parts of the spec hint at
it being possible, with the most obvious being in the resolution of
issue 13:

13. How are interpolation modifiers handled for separate shader
programs?

RESOLVED:  GLSL only provides interpolation modifiers for user-
defined varyings. These modifiers can be used in conjunction
with the layout location qualifiers specified in this extension.
Such modifiers must match.

I propose relaxing the rules in type_qualifier as follows:

* If neither GLSL 4.20 nor ARB_shading_language_420pack is supported,
then require layout qualifiers to precede any other qualifiers;
continue to disallow multiple layout qualifiers.

* Remove all other error generation for combining layout with
invariant / interpolation / (with my other patches) precise.

I think this retains all the useful current behavior, and will accept
all the examples I've seen.
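
As a rough illustration of the proposed rule, here is a small sketch in C.
The enum, the function name and the has_420pack flag are all hypothetical, and
the sketch reads both restrictions (ordering and the single-layout limit) as
applying only when neither GLSL 4.20 nor ARB_shading_language_420pack is
available, which is one interpretation of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical qualifier kinds, for illustration only. */
enum sketch_qual {
   SKETCH_Q_LAYOUT,
   SKETCH_Q_INVARIANT,
   SKETCH_Q_INTERPOLATION,
   SKETCH_Q_PRECISE
};

/* Returns true when the qualifier sequence is acceptable under the
 * relaxed rule sketched above. */
static bool
sketch_qualifiers_ok(const enum sketch_qual *quals, size_t count,
                     bool has_420pack)
{
   if (has_420pack)
      return true;   /* no ordering or multiplicity restriction assumed */

   size_t layouts = 0;
   bool seen_other = false;

   for (size_t i = 0; i < count; i++) {
      if (quals[i] == SKETCH_Q_LAYOUT) {
         /* layout() must come first and may appear only once. */
         if (seen_other || ++layouts > 1)
            return false;
      } else {
         seen_other = true;
      }
   }
   return true;
}
```

Combining layout with invariant, interpolation or precise qualifiers is
accepted in every case; only ordering and repetition are checked.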

-- Chris

Re: [Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions

2014-05-19 Thread Tapani


On 05/19/2014 08:09 PM, Ian Romanick wrote:

On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Support inactive uniforms that have explicit location set in
glUniform* functions.

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/uniform_query.cpp | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 5f1af08..e33800a 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx,
return false;
 }
  
+   /* If the driver storage pointer in remap table is -1, we ignore silently.

+*
+* GL_ARB_explicit_uniform_location spec says:
+* "What happens if Uniform* is called with an explicitly defined
+* uniform location, but that uniform is deemed inactive by the
+* linker?
+*
+* RESOLVED: The call is ignored for inactive uniform variables and
+* no error is generated."
+*
+*/
+   if (ctx->Extensions.ARB_explicit_uniform_location &&
+  shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1)
+  return false;
+

Do we actually need to check
ctx->Extensions.ARB_explicit_uniform_location?  It seems like
UniformRemapTable will only have -1 in it for that case, right?


Yes, the extension check can be removed.
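
A minimal sketch of the lookup without the extension check, assuming a
simplified remap table. All names here are hypothetical stand-ins, not the
real validate_uniform_parameters(). The point is that the sentinel alone is
enough to recognise an inactive uniform with an explicit location.

```c
#include <stddef.h>

/* Hypothetical stand-in for gl_uniform_storage. */
struct sketch_storage { int dummy; };

/* Sentinel for "location reserved by an inactive uniform" (name assumed). */
#define SKETCH_INACTIVE ((struct sketch_storage *) -1)

enum sketch_result { SKETCH_OK, SKETCH_IGNORED, SKETCH_ERROR };

static enum sketch_result
sketch_validate_location(struct sketch_storage *const *remap_table,
                         unsigned table_size, int location)
{
   /* GL silently ignores location -1. */
   if (location == -1)
      return SKETCH_IGNORED;

   if (location < 0 || (unsigned) location >= table_size)
      return SKETCH_ERROR;

   /* Inactive uniform with an explicit location: ignore, no error. */
   if (remap_table[location] == SKETCH_INACTIVE)
      return SKETCH_IGNORED;

   /* A slot nobody reserved is treated as an invalid location here. */
   if (remap_table[location] == NULL)
      return SKETCH_ERROR;

   return SKETCH_OK;
}
```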


 _mesa_uniform_split_location_offset(shProg, location, loc, array_index);
  
 if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) {





Re: [Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations

2014-05-19 Thread Tapani


On 05/19/2014 08:07 PM, Ian Romanick wrote:

On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Patch refactors the existing uniform processing so explicit locations
are taken in to account during variable processing. These locations
are temporarily stored in gl_uniform_storage before actual locations
are set.

The 'remap_location' variable in gl_uniform_storage is changed to be
signed so that we can use 0 as a valid explicit location and '-1' as
identifier that no explicit location has been defined.

When locations are set, UniformRemapTable is first populated with
uniforms that have explicit location set (inactive and actives ones),
rest are put after explicit location slots.

Signed-off-by: Tapani Pälli 
---
  src/glsl/ir_uniform.h  |  5 +++--
  src/glsl/link_uniforms.cpp | 56 +-
  2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
index 3508509..9dc4a8e 100644
--- a/src/glsl/ir_uniform.h
+++ b/src/glsl/ir_uniform.h
@@ -181,9 +181,10 @@ struct gl_uniform_storage {
  
 /**

  * The 'base location' for this uniform in the uniform remap table. For
-* arrays this is the first element in the array.
+* arrays this is the first element in the array. It needs to be signed
+* so that we can use 0 as valid location and -1 as initial value
  */
-   unsigned remap_location;
+   int remap_location;

You could use ~0u instead of -1, right?  A #define for the magic value
will also help.


Sure, I can move to using ~0u. Should be enough to never become a problem.
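
Something along these lines, as a minimal sketch. The macro and struct names
are placeholders I made up, not names from the tree.

```c
#include <stdbool.h>

/* Placeholder name for the magic value: no explicit location assigned. */
#define SKETCH_UNMAPPED_UNIFORM_LOC (~0u)

/* Cut-down stand-in for gl_uniform_storage. */
struct sketch_uniform {
   unsigned remap_location;   /* stays unsigned; 0 is a valid location */
};

static void
sketch_uniform_init(struct sketch_uniform *u)
{
   u->remap_location = SKETCH_UNMAPPED_UNIFORM_LOC;
}

static bool
sketch_has_explicit_location(const struct sketch_uniform *u)
{
   return u->remap_location != SKETCH_UNMAPPED_UNIFORM_LOC;
}
```

This keeps the field unsigned while still letting 0 be a valid explicit
location.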


  };
  
  #ifdef __cplusplus

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 29dc0b1..0f99082 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -387,6 +387,9 @@ public:
 void set_and_process(struct gl_shader_program *prog,
ir_variable *var)
 {
+  current_var = var;
+  field_counter = 0;
+
ubo_block_index = -1;
if (var->is_in_uniform_block()) {
   if (var->is_interface_instance() && var->type->is_array()) {
@@ -543,6 +546,22 @@ private:
   return;
}
  
+  /* Assign explicit locations. */

+  if (current_var->data.explicit_location) {
+ /* Set sequential locations for struct fields. */
+ if (current_var->type->is_record()) {

I think you can accomplish the same thing with record_type != NULL.


ok, I can change


+const unsigned entries = MAX2(1, this->uniforms[id].array_elements);
+this->uniforms[id].remap_location =
+   current_var->data.location + field_counter;
+   field_counter += entries;

Weird indentation.


will fix


+ } else {
+this->uniforms[id].remap_location = current_var->data.location;
+ }
+  } else {
+ /* Initialize to -1 to indicate that no explicit location is set */
+ this->uniforms[id].remap_location = -1;
+  }
+
this->uniforms[id].name = ralloc_strdup(this->uniforms, name);
this->uniforms[id].type = base_type;
this->uniforms[id].initialized = 0;
@@ -598,6 +617,17 @@ public:
 gl_texture_index targets[MAX_SAMPLERS];
  
 /**

+* Current variable being processed.
+*/
+   ir_variable *current_var;
+
+   /**
+* Field counter is used to take care that uniform structures
+* with explicit locations get sequential locations.
+*/
+   unsigned field_counter;
+
+   /**
  * Mask of samplers used by the current shader stage.
  */
 unsigned shader_samplers_used;
@@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
 prog->UniformStorage = NULL;
 prog->NumUserUniformStorage = 0;
  
-   ralloc_free(prog->UniformRemapTable);

-   prog->UniformRemapTable = NULL;
-   prog->NumUniformRemapTable = 0;
-
 if (prog->UniformHash != NULL) {
prog->UniformHash->clear();
 } else {
@@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program *prog)
   sizeof(prog->_LinkedShaders[i]->SamplerTargets));
 }
  
-   /* Build the uniform remap table that is used to set/get uniform locations */

+   /* Reserve all the explicit locations of the active uniforms. */
+   for (unsigned i = 0; i < num_user_uniforms; i++) {
+  if (uniforms[i].remap_location != -1) {
+ /* How many new entries for this uniform? */
+ const unsigned entries = MAX2(1, uniforms[i].array_elements);
+
+ /* Set remap table entries point to correct gl_uniform_storage. */
+ for (unsigned j = 0; j < entries; j++) {
+unsigned element_loc = uniforms[i].remap_location + j;
+assert(prog->UniformRemapTable[element_loc] ==
+   (gl_uniform_storage *) -1);
+prog->UniformRemapTable[element_loc] = &uniforms[i];
+ }
+  }
+   }
+
+   /* Reserve locations for rest of the uniforms. */
 for (unsigned

Re: [Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations

2014-05-19 Thread Tapani


On 05/19/2014 07:51 PM, Ian Romanick wrote:

On 04/09/2014 02:56 AM, Tapani Pälli wrote:

Patch initializes the UniformRemapTable for explicit locations. This
needs to happen before optimizations to make sure all inactive uniforms
get their explicit locations correctly.

Signed-off-by: Tapani Pälli 
---
  src/glsl/linker.cpp | 99 +
  1 file changed, 99 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 7c194a2..1b4cb63 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -74,6 +74,7 @@
  #include "link_varyings.h"
  #include "ir_optimization.h"
  #include "ir_rvalue_visitor.h"
+#include "ir_uniform.h"
  
  extern "C" {

  #include "main/shaderobj.h"
@@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct gl_shader_program *prog)
linker_error(prog, "Too many combined image uniforms and fragment outputs");
  }
  
+

+/**
+ * Initializes explicit location slots point to -1 for a variable,
+ * checks for overlaps between other uniforms using explicit locations.
+ */
+static bool
+reserve_explicit_locations(struct gl_shader_program *prog,
+   string_to_uint_map *map, ir_variable *var)
+{
+   unsigned max_loc = var->data.location + var->type->component_slots() - 1;
+
+   /* Resize remap table if locations do not fit in the current one. */
+   if (max_loc + 1 > prog->NumUniformRemapTable) {
+  prog->UniformRemapTable =
+ reralloc(prog, prog->UniformRemapTable,
+  gl_uniform_storage *,
+  max_loc + 1);
+  prog->NumUniformRemapTable = max_loc + 1;
+   }
+
+   for (unsigned i = 0; i < var->type->component_slots(); i++) {

You should check the code that gets generated for this.  I suspect this
will translate to a call to component_slots per iteration of the loop.
Maybe just call it once above (since it is also used to calculate max_loc).


OK, will change.


+  unsigned loc = var->data.location + i;
+
+  /* Check if location is already used. */
+  if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) {

So... -1 means that an inactive uniform has that location explicitly
assigned?  I'm inferring that from comments in the next patch. Maybe we
should have a descriptive #define

#define INACTIVE_UNIFORM_EXPLICIT_LOCATION ((gl_uniform_storage *) -1)


Yep, that makes it easier to read.
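
Putting the two review points together (compute the slot count once, and name
the sentinel), a simplified sketch of the reservation step could look like
this. It assumes a fixed-size table and hypothetical names. Unlike the patch,
it checks the whole range before marking anything, and it leaves out the
"same uniform seen from another stage" case.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TABLE_SIZE 16u

struct sketch_storage;   /* opaque stand-in for gl_uniform_storage */

/* Descriptive name for the magic pointer value (name assumed). */
#define SKETCH_INACTIVE_EXPLICIT_LOCATION ((struct sketch_storage *) -1)

/* Reserve `slots` consecutive locations starting at `base`.
 * The caller computes `slots` once and passes it in, instead of
 * re-evaluating it on every loop iteration. */
static bool
sketch_reserve_locations(struct sketch_storage **table,
                         unsigned base, unsigned slots)
{
   if (slots > SKETCH_TABLE_SIZE || base > SKETCH_TABLE_SIZE - slots)
      return false;   /* range does not fit in the table */

   /* First pass: refuse if any slot is already reserved. */
   for (unsigned i = 0; i < slots; i++) {
      if (table[base + i] == SKETCH_INACTIVE_EXPLICIT_LOCATION)
         return false;
   }

   /* Second pass: mark the range as reserved. */
   for (unsigned i = 0; i < slots; i++)
      table[base + i] = SKETCH_INACTIVE_EXPLICIT_LOCATION;

   return true;
}
```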


+
+ /* Possibly same uniform from a different stage, this is ok. */
+ unsigned hash_loc;
+ if (map->get(hash_loc, var->name) && hash_loc == loc - i)
+   continue;
+
+ /* ARB_explicit_uniform_location specification states:
+  *
+  * "No two default-block uniform variables in the program can have
+  * the same location, even if they are unused, otherwise a compiler
+  * or linker error will be generated."
+  */
+ linker_error(prog, "location qualifier "
+  "for uniform %s "
+  "overlaps previously used location",
+  var->name);

Minor nit (which you can take or leave).  I usually like to have fewer
breaks in strings.  I would have split this up as:

  linker_error(prog,
   "location qualifier for uniform %s overlaps "
   "previously used location",
   var->name);


ok




+ return false;
+  }
+
+  prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1;
+   }
+
+   /* Note, base location used for arrays. */
+   map->put(var->data.location, var->name);
+
+   return true;
+}
+
+/**
+ * Check and reserve all explicit uniform locations, called before
+ * any optimizations happen to handle also inactive uniforms and
+ * inactive array elements that may get trimmed away.
+ */
+static void
+check_explicit_uniform_locations(struct gl_context *ctx,
+ struct gl_shader_program *prog)
+{
+   if (!ctx->Extensions.ARB_explicit_uniform_location)
+  return;
+
+   /* This map is used to detect if overlapping explicit locations
+* occur with the same uniform (from different stage) or a different one.
+*/
+   string_to_uint_map *uniform_map = new string_to_uint_map;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_shader *sh = prog->_LinkedShaders[i];
+
+  if (!sh)
+ continue;
+
+  foreach_list(node, sh->ir) {
+ ir_variable *var = ((ir_instruction *)node)->as_variable();
+ if ((var && var->data.mode == ir_var_uniform) &&
+ var->data.explicit_location) {
+if (!reserve_explicit_locations(prog, uniform_map, var))
+   return;
+
+/* Initialize locations that were allocated but left unused. */
+for (unsigned i = 0; i < prog->NumUniformRemapTable; i++)
+   if (prog->UniformRemapTable[i] != (gl_uniform_storage *) -1)
+  prog->U

[Mesa-dev] [PATCH] tgsi: add GS_INVOCATIONS to property names array

2014-05-19 Thread Ilia Mirkin

In commit 4be146b1, I neglected to add the new property to the strings
array. This causes the string '(null)' to be printed instead when
converting a GS shader to text.

Signed-off-by: Ilia Mirkin 
Cc: "10.2" 
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index 5b6e47f..34dec4f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -120,7 +120,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"FS_COORD_PIXEL_CENTER",
"FS_COLOR0_WRITES_ALL_CBUFS",
"FS_DEPTH_LAYOUT",
-   "VS_PROHIBIT_UCPS"
+   "VS_PROHIBIT_UCPS",
+   "GS_INVOCATIONS",
 };
 
 const char *tgsi_type_names[5] =
-- 
1.8.5.5
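
For readers wondering where the '(null)' comes from, here is a small sketch
using made-up names (not the real TGSI enums). An array sized by an enum count
but given fewer initializers has its remaining slots zero-initialized, so the
missing entry is a NULL pointer; some C libraries then print "(null)" for %s,
although formally that is undefined behaviour.

```c
#include <stddef.h>

/* Made-up property enum, standing in for the real one. */
enum sketch_property {
   SKETCH_PROP_A,
   SKETCH_PROP_B,
   SKETCH_PROP_C,          /* the newly added property */
   SKETCH_PROPERTY_COUNT
};

/* One initializer short: the slot for SKETCH_PROP_C is implicitly NULL. */
static const char *sketch_names_missing[SKETCH_PROPERTY_COUNT] = {
   "PROP_A",
   "PROP_B"
};

/* Fixed table, with a trailing comma so the next addition is a one-line diff. */
static const char *sketch_names_fixed[SKETCH_PROPERTY_COUNT] = {
   "PROP_A",
   "PROP_B",
   "PROP_C",
};

/* Defensive lookup that never hands a NULL pointer to a printf-style call. */
static const char *
sketch_property_name(const char *const *names, enum sketch_property p)
{
   return names[p] != NULL ? names[p] : "(missing)";
}
```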


Re: [Mesa-dev] [PATCH 16/23] targets/vdpau: convert to static and shared pipe-drivers

2014-05-19 Thread Jonathan Gray

On Mon, May 19, 2014 at 11:57:58PM +0100, Emil Velikov wrote:
> On 18/05/14 12:22, Jonathan Gray wrote:
> [snip]
> > 
> > Currently I run my autotools builds like this:
> > 
> > export LDFLAGS=-L/usr/local/lib
> > export CPPFLAGS="-I/usr/local/include -I/usr/local/include/libelf"
> > export AUTOMAKE_VERSION=1.12
> > export AUTOCONF_VERSION=2.69
> > export LEX=/usr/local/bin/gflex
> > ./autogen.sh \
> > --with-gallium-drivers=r300,r600,radeonsi,swrast,nouveau \
> > --with-dri-drivers=i915,i965,r200,radeon,swrast \
> > --disable-silent-rules \
> > --enable-r600-llvm-compiler --enable-gallium-llvm \
> > --disable-llvm-shared-libs \
> > --enable-gles1 --enable-gles2 \
> > --enable-shared-glapi \
> > --disable-osmesa \
> > --enable-debug \
> > --enable-gbm \
> > --with-egl-platforms="x11,drm" \
> > --prefix=/usr/mesa
> > 
> I'm a bit curious how xenocara's CVS is going to handle the symlinks when
> building dri/radeon, dri/r200 and st/dri (all gallium dri drivers). AFAICS it
> will fail miserably :\
> If interested you can rework the former two and effectively drop a handful
> symbol redefinition, shed off some code and size off the classic dri. I'm
> planning to address the st/dri case after this series is merged.

I'm not really sure what xenocara has to do with the autotools build?
As said before xenocara uses its own set of makefiles, i.e.
http://www.openbsd.org/cgi-bin/cvsweb/xenocara/lib/libGL/dri/radeon/Makefile?rev=HEAD;content-type=text%2Fplain
http://www.openbsd.org/cgi-bin/cvsweb/xenocara/lib/libGL/dri/r600g/Makefile?rev=HEAD;content-type=text%2Fplain
with separate directories for libglapi libGL libEGL libGLESv1_CM libGLESv2
that refer to the source in
http://www.openbsd.org/cgi-bin/cvsweb/xenocara/dist/Mesa/

> 
> From the above configure one cannot determine if you're building vdpau.
> Current code enables the vdpau targets if the vdpau package is available. Can
> you confirm if this is the case or not ?

My autotools builds are not done on a system with vdpau installed.
The resulting target list from configure here looks like:

Gallium:         yes
Target dirs:     dri-nouveau dri-swrast r300/dri r600/dri radeonsi/dri
Winsys dirs:     nouveau/drm radeon/drm sw sw/dri sw/xlib
Driver dirs:     galahad identity llvmpipe noop nouveau r300 r600 radeonsi rbug softpipe trace
Trackers dirs:   dri

The build does not seem to reference gallium/state_trackers/vdpau but
does build mesa/main/vdpau.c and mesa/state_tracker/st_vdpau.c

It would be nice to have the possibility of building the gallium
vdpau code in future however.

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Anuj Phogat

On Mon, May 19, 2014 at 3:52 PM, Chris Forbes  wrote:
> If you're going to do that, you'd really want to add draw buffer count
> to the cache key (and i guess this might be the point where you
> convert the blit shader cache to be a hashtable), to avoid recompiling
> all the time if the app does two blits with the same target but
> different draw buffer counts.
>
> This all seems like a huge amount of extra machinery to avoid using
> gl_FragColor and having the backend just take care of it, though. What
> do we actually gain from this?
>
Right, it doesn't look worth doing. I was avoiding 'gl_FragColor'
just because it's deprecated in GLSL 130. Using 'gl_FragColor' here
will work perfectly fine. However, it seems it won't work in the fragment
shader for msaa blits, because the msaa blit shader uses non-vec4
output types. Although blitting to multiple multisample buffers
is not a common use case, we'll have a similar shader recompilation
problem there when the draw buffer count changes.

For now, I'll go ahead and make changes to use 'gl_FragColor' for
non-multisample blits.
>
> On Tue, May 20, 2014 at 10:22 AM, Anuj Phogat  wrote:
>> On Mon, May 19, 2014 at 3:12 PM, Chris Forbes  wrote:
>>> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat  wrote:
>>>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>>> struct blit_shader *shader = choose_blit_shader(target, table);
>>>> const char *vs_input, *vs_output, *fs_input, *fs_output;
>>>> const char *vs_preprocess = "", *fs_preprocess = "";
>>>> -   const char *fs_output_decl = "";
>>>> +   const char *fs_output_decl = "", *for_loop = "";
>>>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>>>
>>> You can't depend on the number of bound draw buffers here. These
>>> shaders get generated on first use, and cached for the life of the
>>> context.
>> Nice catch. I'll add a condition to recompile the shader if number of
>> draw buffers change.

Re: [Mesa-dev] glsl: ideas how to improve dead code elimination?

2014-05-19 Thread Matt Turner

On Mon, May 19, 2014 at 10:56 AM, Aras Pranckevicius  wrote:
> Hi,
>
> When Mesa's GLSL compiler is faced with a code like this:
>
> // vec4 packednormal exists
> vec3 normal;
> normal.xy = packednormal.wy * 2.0 - 1.0;
> normal.z = sqrt(1.0 - dot(normal.xy, normal.xy));
> // now do not use the "normal" at all anywhere
>
> Then the dead code elimination pass will not be able to eliminate the
> "normal" variable, nor anything that lead to it (possibly sampling textures
> into packed normal, etc.).
>
> This is because variable refcounting visitor sees "normal" as having four
> references, but only two assignments, and can not consider it dead. Even if
> these two references are from assignment to normal.z where both LHS & RHS
> reference the same variable.
>
> Any ideas on how to improve this?
>
>
> If the original code was doing something like this, then dead code
> elimination is able to "properly" eliminate this whole thing:
>
> // vec4 packednormal exists
> vec3 normal;
> vec2 nxy = packednormal.wy * 2.0 - 1.0;
> float nz = sqrt(1.0 - dot(nxy, nxy));
> normal.xy = nxy;
> normal.z = nz;
> // now do not use the "normal" at all anywhere

Eric is working on a better GLSL IR dead code elimination pass. I'm
not sure of the current status.

It's in his tree:

   git://people.freedesktop.org/~anholt/mesa deadcode
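
The counting problem described in the quoted message can be illustrated with a small model. This is a sketch under stated assumptions, not Mesa's actual visitor: the struct and function names are hypothetical, and the "dead" rule is assumed to be that every reference to a variable is an assignment target. Under that rule, the self-read in `normal.z = sqrt(1.0 - dot(normal.xy, normal.xy))` gives four references against two assignments, so the variable is kept, while the rewritten form with `nxy` and `nz` gives two against two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-variable bookkeeping, loosely modelled on the counts
 * described in the thread; not Mesa's actual data structure. */
struct demo_var_refs {
   unsigned referenced;  /* every appearance, on either side of '=' */
   unsigned assigned;    /* appearances as the target of an assignment */
};

/* Record one assignment "var = <rhs>" where the right-hand side reads
 * the same variable rhs_self_reads times. */
static void demo_record_assignment(struct demo_var_refs *v,
                                   unsigned rhs_self_reads)
{
   v->referenced += 1 + rhs_self_reads;  /* the LHS counts as a reference too */
   v->assigned += 1;
}

/* Assumed rule: a variable is dead only if every reference to it is an
 * assignment target, i.e. it is never read. */
static bool demo_is_dead(const struct demo_var_refs *v)
{
   assert(v->referenced >= v->assigned);
   return v->referenced == v->assigned;
}

/* normal.xy = ...; normal.z = sqrt(1.0 - dot(normal.xy, normal.xy)); */
static struct demo_var_refs demo_count_original(void)
{
   struct demo_var_refs normal = {0, 0};
   demo_record_assignment(&normal, 0);
   demo_record_assignment(&normal, 2);
   return normal;
}

/* normal.xy = nxy; normal.z = nz; */
static struct demo_var_refs demo_count_rewritten(void)
{
   struct demo_var_refs normal = {0, 0};
   demo_record_assignment(&normal, 0);
   demo_record_assignment(&normal, 0);
   return normal;
}
```

A pass that ignored reads occurring only inside assignments to the same otherwise-unread variable would treat both forms alike; whether the deadcode branch does this is not stated in the thread.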

Re: [Mesa-dev] [PATCH] loader: allow alternative methods for PCI identification.

2014-05-19 Thread Emil Velikov

On 15/05/14 05:39, Gary Wong wrote:
> loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt
> all available strategies to identify the hardware, instead of conditionally
> compiling in a single test.  The existing libudev and DRM approaches have
> been retained, and another simple alternative of looking up the answer in
> the /sys filesystem (available on Linux) is added.
> 
> This should assist Linux systems that mount /sys but do not include
> libudev (Android?), give Mesa a fighting chance of running on systems
> where libudev is uninstalled/inaccessible/broken at runtime, and provides
> a hook where non-Linux systems (BSD?) could implement their own PCI
> identification.
> 
Hi Gary,

Are you trying to get mesa working under GNU Hurd ? IIRC Jonathan is able to
get mesa working under OpenBSD and I would expect other non-linux platforms to
just work(tm). Although with that said I may have broken Android (not sure if
autohell detects it as a linux platform).

As you can notice I'm not a huge fan of adding yet another way of retrieving
the device/driver name although I would not object if you're willing to split
this patch a bit, have the option off by default and fix bugs if/when they pop
up :)

Cheers,
Emil
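
For readers following the patch description quoted above, the "attempt all available strategies" idea can be sketched as a list of lookup functions tried in order. This is only an illustration under assumptions: the typedef, the stub functions and the ID values are hypothetical and do not reflect the real loader.c code, which talks to libudev, DRM and /sys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical signature: fill in the IDs for fd, return true on success. */
typedef bool (*demo_pci_id_fn)(int fd, int *vendor_id, int *chip_id);

/* Stub standing in for a libudev lookup that is unavailable at runtime. */
static bool demo_udev_lookup(int fd, int *vendor_id, int *chip_id)
{
   (void)fd;
   (void)vendor_id;
   (void)chip_id;
   return false;
}

/* Stub standing in for a /sys lookup; the IDs are made-up example values. */
static bool demo_sysfs_lookup(int fd, int *vendor_id, int *chip_id)
{
   (void)fd;
   *vendor_id = 0x1234;
   *chip_id = 0x5678;
   return true;
}

/* Try each strategy in order and stop at the first one that succeeds. */
static bool demo_get_pci_id(int fd, const demo_pci_id_fn *strategies,
                            size_t count, int *vendor_id, int *chip_id)
{
   assert(vendor_id != NULL && chip_id != NULL);
   for (size_t i = 0; i < count; i++) {
      if (strategies[i](fd, vendor_id, chip_id))
         return true;
   }
   return false;
}
```

With `{demo_udev_lookup, demo_sysfs_lookup}` as the strategy list, the first stub fails and the second supplies the IDs; a list containing only the failing stub reports failure to the caller.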

> Signed-off-by: Gary Wong 
> ---
>  configure.ac|  51 
>  src/loader/loader.c | 173 +++-
>  2 files changed, 195 insertions(+), 29 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index d3e96de..fe572cd 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -818,13 +818,31 @@ fi
>  
>  case "$host_os" in
>  linux*)
> -need_libudev=yes ;;
> +need_pci_id=yes ;;
>  *)
> -need_libudev=no ;;
> +need_pci_id=no ;;
>  esac
>  
> -PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
> -  have_libudev=yes, have_libudev=no)
> +AC_ARG_ENABLE([libudev],
> +[AS_HELP_STRING([--disable-libudev],
> +[disable libudev PCI identification @<:@default=enabled on supported platforms@:>@])],
> +[attempt_libudev="$enableval"],
> +[attempt_libudev=yes]
> +) 
> +
> +if test "x$attempt_libudev" = "xyes"; then
> +PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
> +  have_libudev=yes, have_libudev=no)
> +else
> +have_libudev=no
> +fi
> +
> +AC_ARG_ENABLE([sysfs],
> +[AS_HELP_STRING([--disable-sysfs],
> +[disable /sys PCI identification @<:@default=enabled on supported platforms@:>@])],
> +[have_sysfs="$enableval"],
> +[have_sysfs=yes]
> +)
>  
>  if test "x$enable_dri" = xyes; then
>  if test "$enable_static" = yes; then
> @@ -910,8 +928,15 @@ xyesno)
>  ;;
>  esac
>  
> +have_pci_id=no
>  if test "$have_libudev" = yes; then
>  DEFINES="$DEFINES -DHAVE_LIBUDEV"
> +have_pci_id=yes
> +fi
> +
> +if test "$have_sysfs" = yes; then
> +DEFINES="$DEFINES -DHAVE_SYSFS"
> +have_pci_id=yes
>  fi
>  
>  # This is outside the case (above) so that it is invoked even for non-GLX
> @@ -1013,8 +1038,8 @@ if test "x$enable_dri" = xyes; then
>  DEFINES="$DEFINES -DHAVE_DRI3"
>  fi
>  
> -if test "x$have_libudev" != xyes; then
> -AC_MSG_ERROR([libudev-dev required for building DRI])
> +if test "x$have_pci_id" != xyes; then
> +AC_MSG_ERROR([libudev-dev or sysfs required for building DRI])
>  fi
>  
>  case "$host_cpu" in
> @@ -1183,8 +1208,8 @@ if test "x$enable_gbm" = xauto; then
>  esac
>  fi
>  if test "x$enable_gbm" = xyes; then
> -if test "x$need_libudev$have_libudev" = xyesno; then
> -AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED])
> +if test "x$need_pci_id$have_pci_id" = xyesno; then
> +AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED or sysfs])
>  fi
>  
>  if test "x$enable_dri" = xyes; then
> @@ -1202,7 +1227,7 @@ if test "x$enable_gbm" = xyes; then
>  fi
>  fi
>  AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes)
> -if test "x$need_libudev" = xyes; then
> +if test "x$need_pci_id$have_libudev" = xyesyes; then
>  GBM_PC_REQ_PRIV="libudev >= $LIBUDEV_REQUIRED"
>  else
>  GBM_PC_REQ_PRIV=""
> @@ -1491,9 +1516,9 @@ for plat in $egl_platforms; do
>   ;;
>   esac
>  
> -case "$plat$need_libudev$have_libudev" in
> +case "$plat$need_pci_id$have_pci_id" in
>  waylandyesno|drmyesno)
> -AC_MSG_ERROR([cannot build $plat platform without udev >= $LIBUDEV_REQUIRED]) ;;
> +AC_MSG_ERROR([cannot build $plat platform without udev >= $LIBUDEV_REQUIRED or sysfs]) ;;
>  esac
>  done
>  
> @@ -1766,8 +1791,8 @@ gallium_require_llvm() {
>  
>  gallium_require_drm_loader() {
>  if test "x$enable_gallium_loader" = xyes; then
> -if test "x$need_libudev$have_libudev" = xyesno; then
> -AC_MSG_ERROR([Gallium drm loader requires libudev

Re: [Mesa-dev] [PATCH 16/23] targets/vdpau: convert to static and shared pipe-drivers

2014-05-19 Thread Emil Velikov

On 18/05/14 12:22, Jonathan Gray wrote:
[snip]
> 
> Currently I run my autotools builds like this:
> 
> export LDFLAGS=-L/usr/local/lib
> export CPPFLAGS="-I/usr/local/include -I/usr/local/include/libelf"
> export AUTOMAKE_VERSION=1.12
> export AUTOCONF_VERSION=2.69
> export LEX=/usr/local/bin/gflex
> ./autogen.sh \
> --with-gallium-drivers=r300,r600,radeonsi,swrast,nouveau \
> --with-dri-drivers=i915,i965,r200,radeon,swrast \
> --disable-silent-rules \
> --enable-r600-llvm-compiler --enable-gallium-llvm \
> --disable-llvm-shared-libs \
> --enable-gles1 --enable-gles2 \
> --enable-shared-glapi \
> --disable-osmesa \
> --enable-debug \
> --enable-gbm \
> --with-egl-platforms="x11,drm" \
> --prefix=/usr/mesa
> 
I'm a bit curious how xenocara's CVS is going to handle the symlinks when
building dri/radeon, dri/r200 and st/dri (all gallium dri drivers). AFAICS it
will fail miserably :\
If interested you can rework the former two and effectively drop a handful of
symbol redefinitions, shed some code and trim the size of the classic dri. I'm
planning to address the st/dri case after this series is merged.

From the above configure one cannot determine if you're building vdpau.
Current code enables the vdpau targets if the vdpau package is available. Can
you confirm if this is the case or not ?

Cheers,
Emil

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Chris Forbes

If you're going to do that, you'd really want to add draw buffer count
to the cache key (and i guess this might be the point where you
convert the blit shader cache to be a hashtable), to avoid recompiling
all the time if the app does two blits with the same target but
different draw buffer counts.

This all seems like a huge amount of extra machinery to avoid using
gl_FragColor and having the backend just take care of it, though. What
do we actually gain from this?

On Tue, May 20, 2014 at 10:22 AM, Anuj Phogat  wrote:
> On Mon, May 19, 2014 at 3:12 PM, Chris Forbes  wrote:
>> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat  wrote:
>>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>> struct blit_shader *shader = choose_blit_shader(target, table);
>>> const char *vs_input, *vs_output, *fs_input, *fs_output;
>>> const char *vs_preprocess = "", *fs_preprocess = "";
>>> -   const char *fs_output_decl = "";
>>> +   const char *fs_output_decl = "", *for_loop = "";
>>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>>
>> You can't depend on the number of bound draw buffers here. These
>> shaders get generated on first use, and cached for the life of the
>> context.
> Nice catch. I'll add a condition to recompile the shader if number of
> draw buffers change.
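
The cache-key suggestion at the top of this message can be sketched as follows. This is an illustration under assumptions, not meta.c code: the structs, the fixed-size slot array and the program counter are hypothetical, and a hashtable as mentioned above would replace the linear scan. The point is only that two blits with the same target but different draw buffer counts each compile once and then hit the cache.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_CACHE_SLOTS 8

/* Hypothetical cache key: texture target plus draw buffer count. */
struct demo_blit_key {
   int target;
   int draw_buf_count;
};

struct demo_cache_entry {
   bool valid;
   struct demo_blit_key key;
   unsigned program;       /* stand-in for a compiled shader program */
};

struct demo_blit_cache {
   struct demo_cache_entry slots[DEMO_CACHE_SLOTS];
   unsigned next_program;  /* last program id handed out */
   unsigned compiles;      /* how many times a "compile" happened */
};

/* Return the cached program for (target, draw_buf_count), compiling a new
 * one only when that exact key has not been seen before. */
static unsigned demo_get_program(struct demo_blit_cache *cache,
                                 int target, int draw_buf_count)
{
   int free_slot = -1;

   for (int i = 0; i < DEMO_CACHE_SLOTS; i++) {
      const struct demo_cache_entry *e = &cache->slots[i];
      if (e->valid && e->key.target == target &&
          e->key.draw_buf_count == draw_buf_count)
         return e->program;
      if (!e->valid && free_slot < 0)
         free_slot = i;
   }

   assert(free_slot >= 0);  /* a real cache would evict or grow instead */

   struct demo_cache_entry *slot = &cache->slots[free_slot];
   slot->valid = true;
   slot->key.target = target;
   slot->key.draw_buf_count = draw_buf_count;
   slot->program = ++cache->next_program;
   cache->compiles++;
   return slot->program;
}
```

Starting from a zero-initialized cache, requesting (target 1, count 1), then (target 1, count 4), then (target 1, count 1) again results in two compiles, and the first and third requests return the same program.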

[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78914

Roland Scheidegger  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Front/Backfaces do not      |[llvmpipe] Front/Backfaces
                   |cover the same pixels when  |do not cover the same
                   |rasterized                  |pixels when rasterized
          Component|Mesa core                   |Other

--- Comment #1 from Roland Scheidegger  ---
So, in order to get front and backface tris, you draw essentially the same tri
twice, but once you draw index 0,1,2 and once you draw 0,2,1? I could see this
getting different results for interpolated attributes (in fact I know it will
happen...). I am not actually sure it's guaranteed to get the same results,
this is very tricky to get right (the reason is the interpolation /
interpolation setup is not quite symmetric wrt all triangle corners, the float
math can give different results). Though this should only affect interpolated
attribute values, not rasterization itself (which happens with fixed point
math). If it actually rasterizes different pixels this is a bug. Hence if you
could provide some minimal test case that would be great.
This only affects llvmpipe right?
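
The claim that fixed-point rasterization should not depend on vertex order can be illustrated with integer edge functions. This is a sketch under assumptions and not llvmpipe's rasterizer: the names are hypothetical, pixels on an edge are counted as covered, and no top-left fill rule is applied. Swapping two vertices negates every edge value exactly in integer arithmetic, so the set of covered sample points is the same for both windings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_pt {
   int32_t x, y;
};

/* Edge function: exact in integer arithmetic as long as nothing overflows. */
static int64_t demo_edge(struct demo_pt a, struct demo_pt b, struct demo_pt p)
{
   return (int64_t)(b.x - a.x) * (p.y - a.y) -
          (int64_t)(b.y - a.y) * (p.x - a.x);
}

/* A point is covered when all three edge values share a sign (zero allowed),
 * so triangles of either winding are accepted. */
static bool demo_covers(struct demo_pt v0, struct demo_pt v1,
                        struct demo_pt v2, struct demo_pt p)
{
   const int64_t e0 = demo_edge(v0, v1, p);
   const int64_t e1 = demo_edge(v1, v2, p);
   const int64_t e2 = demo_edge(v2, v0, p);

   return (e0 >= 0 && e1 >= 0 && e2 >= 0) ||
          (e0 <= 0 && e1 <= 0 && e2 <= 0);
}

/* Count covered integer sample points in a size x size grid. */
static int demo_count_covered(struct demo_pt v0, struct demo_pt v1,
                              struct demo_pt v2, int size)
{
   int count = 0;

   assert(size > 0);
   for (int y = 0; y < size; y++) {
      for (int x = 0; x < size; x++) {
         const struct demo_pt p = { x, y };
         if (demo_covers(v0, v1, v2, p))
            count++;
      }
   }
   return count;
}
```

Counting coverage for vertices in the order 0,1,2 and again in the order 0,2,1 gives the same number of points in this model. Floating-point attribute interpolation has no such exact symmetry, which matches the distinction drawn in the comment above.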


Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Anuj Phogat

On Mon, May 19, 2014 at 3:12 PM, Chris Forbes  wrote:
> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat  wrote:
>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>> struct blit_shader *shader = choose_blit_shader(target, table);
>> const char *vs_input, *vs_output, *fs_input, *fs_output;
>> const char *vs_preprocess = "", *fs_preprocess = "";
>> -   const char *fs_output_decl = "";
>> +   const char *fs_output_decl = "", *for_loop = "";
>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>
> You can't depend on the number of bound draw buffers here. These
> shaders get generated on first use, and cached for the life of the
> context.
Nice catch. I'll add a condition to recompile the shader if the number of
draw buffers changes.

Re: [Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Anuj Phogat

On Mon, May 19, 2014 at 2:45 PM, Matt Turner  wrote:
> On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat  wrote:
>> _mesa_meta_setup_blit_shader() currently generates a fragment shader
>> which, irrespective of the number of draw buffers, writes the color
>> to only one output variable. Current shader rely on an undefined
>> behavior and possibly works by chance.
>>
>> From OpenGL 4.0  spec, page 256:
>>   "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
>>set of draw buffers into which the single fragment color defined by
>>gl_FragColor is written. If a fragment shader writes to gl_FragData,
>>or a user-defined varying out variable, DrawBuffers specifies a set
>>of draw buffers into which each of the multiple output colors defined
>>by these variables are separately written. If a fragment shader writes
>>to none of gl_FragColor, gl_FragData, nor any user defined varying out
>>variables, the values of the fragment colors following shader execution
>>are undefined, and may differ for each fragment color."
>>
>> OpenGL 4.4 spec, page 463, added an additional line in this section:
>>   "If some, but not all user-defined output variables are written, the
>>values of fragment colors corresponding to unwritten variables are
>>similarly undefined."
>>
>> Cc: 
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/common/meta.c | 23 ++-
>>  1 file changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
>> index 87609b4..4897cd9 100644
>> --- a/src/mesa/drivers/common/meta.c
>> +++ b/src/mesa/drivers/common/meta.c
>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>> struct blit_shader *shader = choose_blit_shader(target, table);
>> const char *vs_input, *vs_output, *fs_input, *fs_output;
>> const char *vs_preprocess = "", *fs_preprocess = "";
>> -   const char *fs_output_decl = "";
>> +   const char *fs_output_decl = "", *for_loop = "";
>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>>
>> if (ctx->Const.GLSLVersion < 130) {
>>vs_input = "attribute";
>> @@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>fs_preprocess = "#extension GL_EXT_texture_array : enable";
>>fs_output = "gl_FragColor";
>> } else {
>> -  vs_preprocess = fs_preprocess = "#version 130";
>> +  vs_preprocess = "#version 130";
>>vs_input = fs_input = "in";
>>vs_output = "out";
>> -  fs_output = "out_color";
>> -  fs_output_decl = "out vec4 out_color;";
>>shader->func = "texture";
>> +  if (draw_buf_count > 1) {
>> + fs_preprocess = ralloc_asprintf(mem_ctx,
>> + "#version 130\n"
>> + "#define NUM_DRAW_BUFS %d",
>> + draw_buf_count);
>> + fs_output = "out_color[i]";
>> + fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];";
>> + for_loop = "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n   ";
>> +  } else {
>> + fs_preprocess = "#version 130";
>> + fs_output = "out_color";
>> + fs_output_decl = "out vec4 out_color;";
>> +  }
>
> It's safe to emit a loop with only one iterations. The compiler will
> happily optimize that (it's going to unroll all of these loops
> anyway). Emitting GLSL code for the for loop unconditionally seems
> like it would clean this up some.
>
I wasn't sure if that'll generate extra instructions for one
iteration. I'll clean it up before pushing.

> With the comments for these two patches addressed, both of these are
>
> Reviewed-by: Matt Turner 

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Chris Forbes

On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat  wrote:
> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
> struct blit_shader *shader = choose_blit_shader(target, table);
> const char *vs_input, *vs_output, *fs_input, *fs_output;
> const char *vs_preprocess = "", *fs_preprocess = "";
> -   const char *fs_output_decl = "";
> +   const char *fs_output_decl = "", *for_loop = "";
> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;

You can't depend on the number of bound draw buffers here. These
shaders get generated on first use, and cached for the life of the
context.

Re: [Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Matt Turner

On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat  wrote:
> _mesa_meta_setup_blit_shader() currently generates a fragment shader
> which, irrespective of the number of draw buffers, writes the color
> to only one output variable. Current shader rely on an undefined
> behavior and possibly works by chance.
>
> From OpenGL 4.0  spec, page 256:
>   "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
>set of draw buffers into which the single fragment color defined by
>gl_FragColor is written. If a fragment shader writes to gl_FragData,
>or a user-defined varying out variable, DrawBuffers specifies a set
>of draw buffers into which each of the multiple output colors defined
>by these variables are separately written. If a fragment shader writes
>to none of gl_FragColor, gl_FragData, nor any user defined varying out
>variables, the values of the fragment colors following shader execution
>are undefined, and may differ for each fragment color."
>
> OpenGL 4.4 spec, page 463, added an additional line in this section:
>   "If some, but not all user-defined output variables are written, the
>values of fragment colors corresponding to unwritten variables are
>similarly undefined."
>
> Cc: 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/common/meta.c | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index 87609b4..4897cd9 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
> struct blit_shader *shader = choose_blit_shader(target, table);
> const char *vs_input, *vs_output, *fs_input, *fs_output;
> const char *vs_preprocess = "", *fs_preprocess = "";
> -   const char *fs_output_decl = "";
> +   const char *fs_output_decl = "", *for_loop = "";
> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>
> if (ctx->Const.GLSLVersion < 130) {
>vs_input = "attribute";
> @@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>fs_preprocess = "#extension GL_EXT_texture_array : enable";
>fs_output = "gl_FragColor";
> } else {
> -  vs_preprocess = fs_preprocess = "#version 130";
> +  vs_preprocess = "#version 130";
>vs_input = fs_input = "in";
>vs_output = "out";
> -  fs_output = "out_color";
> -  fs_output_decl = "out vec4 out_color;";
>shader->func = "texture";
> +  if (draw_buf_count > 1) {
> + fs_preprocess = ralloc_asprintf(mem_ctx,
> + "#version 130\n"
> + "#define NUM_DRAW_BUFS %d",
> + draw_buf_count);
> + fs_output = "out_color[i]";
> + fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];";
> + for_loop = "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n   ";
> +  } else {
> + fs_preprocess = "#version 130";
> + fs_output = "out_color";
> + fs_output_decl = "out vec4 out_color;";
> +  }

It's safe to emit a loop with only one iteration. The compiler will
happily optimize that (it's going to unroll all of these loops
anyway). Emitting GLSL code for the for loop unconditionally seems
like it would clean this up some.

With the comments for these two patches addressed, both of these are

Reviewed-by: Matt Turner 
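
The suggestion to emit the loop unconditionally could look roughly like the sketch below. It is an assumption-laden illustration, not the code that was pushed: `demo_build_fs` and `demo_fs_has_loop` are hypothetical helpers, the sampler type and texture coordinates are hard-coded instead of coming from `shader->type` and `shader->texcoords`, and the generated GLSL has not been run through a GLSL compiler here. A single code path handles one draw buffer and several, since NUM_DRAW_BUFS is simply 1 in the common case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build an illustrative GLSL 1.30 fragment shader that always loops over
 * NUM_DRAW_BUFS outputs. Returns true if the source fit into buf. */
static bool demo_build_fs(char *buf, size_t size, int draw_buf_count)
{
   assert(draw_buf_count >= 1);

   const int n = snprintf(buf, size,
                          "#version 130\n"
                          "#define NUM_DRAW_BUFS %d\n"
                          "uniform sampler2D texSampler;\n"
                          "in vec4 texCoords;\n"
                          "out vec4 out_color[NUM_DRAW_BUFS];\n"
                          "void main()\n"
                          "{\n"
                          "   vec4 color = texture(texSampler, texCoords.xy);\n"
                          "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n"
                          "      out_color[i] = color;\n"
                          "   gl_FragDepth = color.x;\n"
                          "}\n",
                          draw_buf_count);

   return n >= 0 && (size_t)n < size;
}

/* Check that the generated source contains the unconditional loop. */
static bool demo_fs_has_loop(const char *src)
{
   return strstr(src, "for (int i = 0; i < NUM_DRAW_BUFS; i++)") != NULL;
}
```

Note that this removes the branch in the generator but not the dependence on the draw buffer count, which the cached-shader concern raised elsewhere in the thread still applies to.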

Re: [Mesa-dev] [PATCH 1/2] meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code

2014-05-19 Thread Matt Turner

On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat  wrote:
> Cc: 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/common/meta.c | 97 +++---
>  1 file changed, 44 insertions(+), 53 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index 3ef3f79..87609b4 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -242,10 +242,26 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>   GLenum target,
>   struct blit_shader_table *table)
>  {
> -   const char *vs_source;
> -   char *fs_source;
> +   char *vs_source, *fs_source;
> void *const mem_ctx = ralloc_context(NULL);
> struct blit_shader *shader = choose_blit_shader(target, table);
> +   const char *vs_input, *vs_output, *fs_input, *fs_output;
> +   const char *vs_preprocess = "", *fs_preprocess = "";
> +   const char *fs_output_decl = "";
> +
> +   if (ctx->Const.GLSLVersion < 130) {
> +  vs_input = "attribute";
> +  vs_output = fs_input = "varying";
> +  fs_preprocess = "#extension GL_EXT_texture_array : enable";
> +  fs_output = "gl_FragColor";
> +   } else {
> +  vs_preprocess = fs_preprocess = "#version 130";
> +  vs_input = fs_input = "in";
> +  vs_output = "out";
> +  fs_output = "out_color";

vs_output means "vertex shader output keyword" but fs_output means
"fragment shader output variable". Maybe change
s/fs_output/fs_output_var/?

> +  fs_output_decl = "out vec4 out_color;";
> +  shader->func = "texture";
> +   }

This block would be clearer if we assigned the same variables in the
same order. Instead of initializing variables, I'd set them in both
blocks. Multiple assignments on the same line also make it less
obviously correct.
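
One way to read the two comments above is sketched here. It is a hypothetical restructuring, not the revised patch: the struct and function names are invented for illustration, and only the keyword strings are taken from the quoted diff. Every field is assigned in both branches, in the same order, one assignment per line, and the fragment output is named `fs_output_var` to distinguish it from the `vs_output` keyword.

```c
#include <assert.h>

/* Hypothetical grouping of the per-version shader strings. */
struct demo_shader_keywords {
   const char *vs_preprocess;
   const char *vs_input;
   const char *vs_output;
   const char *fs_preprocess;
   const char *fs_input;
   const char *fs_output_var;
   const char *fs_output_decl;
};

/* Every field is set in both branches, in the same order, so a missing or
 * mismatched assignment is easy to spot when reading the two blocks. */
static struct demo_shader_keywords demo_pick_keywords(int glsl_version)
{
   struct demo_shader_keywords k;

   assert(glsl_version > 0);

   if (glsl_version < 130) {
      k.vs_preprocess = "";
      k.vs_input = "attribute";
      k.vs_output = "varying";
      k.fs_preprocess = "#extension GL_EXT_texture_array : enable";
      k.fs_input = "varying";
      k.fs_output_var = "gl_FragColor";
      k.fs_output_decl = "";
   } else {
      k.vs_preprocess = "#version 130";
      k.vs_input = "in";
      k.vs_output = "out";
      k.fs_preprocess = "#version 130";
      k.fs_input = "in";
      k.fs_output_var = "out_color";
      k.fs_output_decl = "out vec4 out_color;";
   }

   return k;
}
```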

>
> assert(shader != NULL);
>
> @@ -254,57 +270,32 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>return;
> }
>
> -   if (ctx->Const.GLSLVersion < 130) {
> -  vs_source =
> - "attribute vec2 position;\n"
> - "attribute vec4 textureCoords;\n"
> - "varying vec4 texCoords;\n"
> - "void main()\n"
> - "{\n"
> - "   texCoords = textureCoords;\n"
> - "   gl_Position = vec4(position, 0.0, 1.0);\n"
> - "}\n";
> -
> -  fs_source = ralloc_asprintf(mem_ctx,
> -  "#extension GL_EXT_texture_array : enable\n"
> -  "#extension GL_ARB_texture_cube_map_array: enable\n"
> -  "uniform %s texSampler;\n"
> -  "varying vec4 texCoords;\n"
> -  "void main()\n"
> -  "{\n"
> -  "   gl_FragColor = %s(texSampler, %s);\n"
> -  "   gl_FragDepth = gl_FragColor.x;\n"
> -  "}\n",
> -  shader->type,
> -  shader->func, shader->texcoords);
> -   }
> -   else {
> -  vs_source = ralloc_asprintf(mem_ctx,
> -  "#version 130\n"
> -  "in vec2 position;\n"
> -  "in vec4 textureCoords;\n"
> -  "out vec4 texCoords;\n"
> -  "void main()\n"
> -  "{\n"
> -  "   texCoords = textureCoords;\n"
> -  "   gl_Position = vec4(position, 0.0, 1.0);\n"
> -  "}\n");
> -  fs_source = ralloc_asprintf(mem_ctx,
> -  "#version 130\n"
> -  "#extension GL_ARB_texture_cube_map_array: enable\n"
> -  "uniform %s texSampler;\n"
> -  "in vec4 texCoords;\n"
> -  "out vec4 out_color;\n"
> -  "\n"
> -  "void main()\n"
> -  "{\n"
> -  "   out_color = texture(texSampler, %s);\n"
> -  "   gl_FragDepth = out_color.x;\n"
> -  "}\n",
> -  shader->type,
> -  shader->texcoords);
> -   }
> -
> +   vs_source = ralloc_asprintf(mem_ctx,
> +"%s\n"
> +"%s vec2 position;\n"
> +"%s vec4 textureCoords;\n"
> +"%s vec4 texCoords;\n"
> +"void main()\n"
> +"{\n"
> +"   texCoords = textureCoords;\n"
> +"   gl_Position = vec4(position, 0.0, 1.0);\n"
> +"}\n",
> +vs_preprocess, vs_input,

Re: [Mesa-dev] [Mesa-stable] [PATCH 4/4] meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code.

2014-05-19 Thread Chris Forbes

Series is:

Reviewed-by: Chris Forbes 

On Mon, May 19, 2014 at 6:12 PM, Kenneth Graunke  wrote:
> This is a replacement for bd44ac8b5ca08016bb064b37edaec95eccfdbcd5
> that should actually work.
>
> Fixes Piglit's copyteximage-border on swrast, as well as one of
> es3conform's packed_pixels_pixelstore test.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
> Signed-off-by: Kenneth Graunke 
> Cc: "10.2" 
> ---
>  src/mesa/drivers/common/meta.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index f90d5bd..b194b6e 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -2860,13 +2860,13 @@ copytexsubimage_using_blit_framebuffer(struct gl_context *ctx, GLuint dims,
>  * are too strict for CopyTexImage.  We know meta will be fine with format
>  * changes.
>  */
> -   _mesa_meta_and_swrast_BlitFramebuffer(ctx, x, y,
> - x + width, y + height,
> - xoffset, yoffset,
> - xoffset + width, yoffset + height,
> - mask, GL_NEAREST);
> +   mask = _mesa_meta_BlitFramebuffer(ctx, x, y,
> + x + width, y + height,
> + xoffset, yoffset,
> + xoffset + width, yoffset + height,
> + mask, GL_NEAREST);
> ctx->Meta->Blit.no_ctsi_fallback = false;
> -   success = true;
> +   success = mask == 0x0;
>
>   out:
> _mesa_lock_texture(ctx, texObj);
> --
> 1.9.2
>

[Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers

2014-05-19 Thread Anuj Phogat

_mesa_meta_setup_blit_shader() currently generates a fragment shader
which, irrespective of the number of draw buffers, writes the color
to only one output variable. The current shader relies on undefined
behavior and possibly works by chance.

From OpenGL 4.0 spec, page 256:
  "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a
   set of draw buffers into which the single fragment color defined by
   gl_FragColor is written. If a fragment shader writes to gl_FragData,
   or a user-defined varying out variable, DrawBuffers specifies a set
   of draw buffers into which each of the multiple output colors defined
   by these variables are separately written. If a fragment shader writes
   to none of gl_FragColor, gl_FragData, nor any user defined varying out
   variables, the values of the fragment colors following shader execution
   are undefined, and may differ for each fragment color."

OpenGL 4.4 spec, page 463, added an additional line in this section:
  "If some, but not all user-defined output variables are written, the
   values of fragment colors corresponding to unwritten variables are
   similarly undefined."

Cc: 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/common/meta.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 87609b4..4897cd9 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
struct blit_shader *shader = choose_blit_shader(target, table);
const char *vs_input, *vs_output, *fs_input, *fs_output;
const char *vs_preprocess = "", *fs_preprocess = "";
-   const char *fs_output_decl = "";
+   const char *fs_output_decl = "", *for_loop = "";
+   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
 
if (ctx->Const.GLSLVersion < 130) {
   vs_input = "attribute";
@@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
   fs_preprocess = "#extension GL_EXT_texture_array : enable";
   fs_output = "gl_FragColor";
} else {
-  vs_preprocess = fs_preprocess = "#version 130";
+  vs_preprocess = "#version 130";
   vs_input = fs_input = "in";
   vs_output = "out";
-  fs_output = "out_color";
-  fs_output_decl = "out vec4 out_color;";
   shader->func = "texture";
+  if (draw_buf_count > 1) {
+ fs_preprocess = ralloc_asprintf(mem_ctx,
+ "#version 130\n"
+ "#define NUM_DRAW_BUFS %d",
+ draw_buf_count);
+ fs_output = "out_color[i]";
+ fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];";
+ for_loop = "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n   ";
+  } else {
+ fs_preprocess = "#version 130";
+ fs_output = "out_color";
+ fs_output_decl = "out vec4 out_color;";
+  }
}
 
assert(shader != NULL);
@@ -291,11 +303,12 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
 "void main()\n"
 "{\n"
 "   vec4 color = %s(texSampler, %s);\n"
+"%s"
 "   %s = color;\n"
 "   gl_FragDepth = color.x;\n"
 "}\n",
 fs_preprocess, shader->type, fs_input, fs_output_decl,
-shader->func, shader->texcoords, fs_output);
+shader->func, shader->texcoords, for_loop, fs_output);
 
_mesa_meta_compile_and_link_program(ctx, vs_source, fs_source,
ralloc_asprintf(mem_ctx, "%s blit",
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
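
Editorial sketch: the patch above makes the generated fragment shader write every bound color draw buffer, so no output is left undefined. The stand-alone C fragment below illustrates the same string-building idea with plain snprintf(). The helper name build_blit_fs(), the fixed sampler2D type and the texCoords.xy swizzle are assumptions made for illustration; this is not the Mesa code, which uses ralloc_asprintf() and a per-target sampler type.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not a Mesa function): writes a GLSL 1.30 fragment
 * shader into buf that copies one texel to every color draw buffer.
 * Returns the snprintf() result. */
static int
build_blit_fs(char *buf, size_t size, int draw_buf_count)
{
   assert(buf != NULL && draw_buf_count >= 1);

   if (draw_buf_count > 1) {
      /* One output array element per draw buffer, all written in a loop. */
      return snprintf(buf, size,
                      "#version 130\n"
                      "#define NUM_DRAW_BUFS %d\n"
                      "uniform sampler2D texSampler;\n"
                      "in vec4 texCoords;\n"
                      "out vec4 out_color[NUM_DRAW_BUFS];\n"
                      "void main()\n"
                      "{\n"
                      "   vec4 color = texture(texSampler, texCoords.xy);\n"
                      "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n"
                      "      out_color[i] = color;\n"
                      "}\n",
                      draw_buf_count);
   }

   /* Single draw buffer: a scalar output is enough. */
   return snprintf(buf, size, "%s",
                   "#version 130\n"
                   "uniform sampler2D texSampler;\n"
                   "in vec4 texCoords;\n"
                   "out vec4 out_color;\n"
                   "void main()\n"
                   "{\n"
                   "   out_color = texture(texSampler, texCoords.xy);\n"
                   "}\n");
}
```

Calling build_blit_fs(buf, sizeof(buf), 4) would produce a shader whose loop assigns out_color[0] through out_color[3].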

[Mesa-dev] [PATCH 1/2] meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code

2014-05-19 Thread Anuj Phogat

Cc: 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/common/meta.c | 97 +++---
 1 file changed, 44 insertions(+), 53 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 3ef3f79..87609b4 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -242,10 +242,26 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
  GLenum target,
  struct blit_shader_table *table)
 {
-   const char *vs_source;
-   char *fs_source;
+   char *vs_source, *fs_source;
void *const mem_ctx = ralloc_context(NULL);
struct blit_shader *shader = choose_blit_shader(target, table);
+   const char *vs_input, *vs_output, *fs_input, *fs_output;
+   const char *vs_preprocess = "", *fs_preprocess = "";
+   const char *fs_output_decl = "";
+
+   if (ctx->Const.GLSLVersion < 130) {
+  vs_input = "attribute";
+  vs_output = fs_input = "varying";
+  fs_preprocess = "#extension GL_EXT_texture_array : enable";
+  fs_output = "gl_FragColor";
+   } else {
+  vs_preprocess = fs_preprocess = "#version 130";
+  vs_input = fs_input = "in";
+  vs_output = "out";
+  fs_output = "out_color";
+  fs_output_decl = "out vec4 out_color;";
+  shader->func = "texture";
+   }
 
assert(shader != NULL);
 
@@ -254,57 +270,32 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
   return;
}
 
-   if (ctx->Const.GLSLVersion < 130) {
-  vs_source =
- "attribute vec2 position;\n"
- "attribute vec4 textureCoords;\n"
- "varying vec4 texCoords;\n"
- "void main()\n"
- "{\n"
- "   texCoords = textureCoords;\n"
- "   gl_Position = vec4(position, 0.0, 1.0);\n"
- "}\n";
-
-  fs_source = ralloc_asprintf(mem_ctx,
-  "#extension GL_EXT_texture_array : enable\n"
-  "#extension GL_ARB_texture_cube_map_array: enable\n"
-  "uniform %s texSampler;\n"
-  "varying vec4 texCoords;\n"
-  "void main()\n"
-  "{\n"
-  "   gl_FragColor = %s(texSampler, %s);\n"
-  "   gl_FragDepth = gl_FragColor.x;\n"
-  "}\n",
-  shader->type,
-  shader->func, shader->texcoords);
-   }
-   else {
-  vs_source = ralloc_asprintf(mem_ctx,
-  "#version 130\n"
-  "in vec2 position;\n"
-  "in vec4 textureCoords;\n"
-  "out vec4 texCoords;\n"
-  "void main()\n"
-  "{\n"
-  "   texCoords = textureCoords;\n"
-  "   gl_Position = vec4(position, 0.0, 1.0);\n"
-  "}\n");
-  fs_source = ralloc_asprintf(mem_ctx,
-  "#version 130\n"
-  "#extension GL_ARB_texture_cube_map_array: enable\n"
-  "uniform %s texSampler;\n"
-  "in vec4 texCoords;\n"
-  "out vec4 out_color;\n"
-  "\n"
-  "void main()\n"
-  "{\n"
-  "   out_color = texture(texSampler, %s);\n"
-  "   gl_FragDepth = out_color.x;\n"
-  "}\n",
-  shader->type,
-  shader->texcoords);
-   }
-
+   vs_source = ralloc_asprintf(mem_ctx,
+"%s\n"
+"%s vec2 position;\n"
+"%s vec4 textureCoords;\n"
+"%s vec4 texCoords;\n"
+"void main()\n"
+"{\n"
+"   texCoords = textureCoords;\n"
+"   gl_Position = vec4(position, 0.0, 1.0);\n"
+"}\n",
+vs_preprocess, vs_input, vs_input, vs_output);
+
+   fs_source = ralloc_asprintf(mem_ctx,
+"%s\n"
+"#extension GL_ARB_texture_cube_map_array: enable\n"
+"uniform %s texSampler;\n"
+"%s vec4 texCoords;\n"
+"%s\n"
+"void main()\n"
+"{\n"
+"   vec4 color = %s(texSampler, %s);\n"
+"   %s = color;\n"
+"   gl_FragDepth = color.x;\n"
+"}\n",
+fs_preprocess, shader->type, fs_input, fs_output_decl,
+shader->func, shader->texcoords, fs_output);
 
_mesa_

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/4] meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer.

2014-05-19 Thread Courtney Goeltzenleuchter

Looks good.

Reviewed-by: Courtney Goeltzenleuchter 


On Mon, May 19, 2014 at 12:12 AM, Kenneth Graunke wrote:

> These aren't necessary - all of the following code is predicated on mask
> being non-zero, so no code will get executed anyway.
>
> Signed-off-by: Kenneth Graunke 
> Cc: "10.2" 
> ---
>  src/mesa/drivers/common/meta_blit.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta_blit.c
> b/src/mesa/drivers/common/meta_blit.c
> index beb1ea5..bd6118b 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -705,10 +705,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
>filter, dstFlipX, dstFlipY,
>use_glsl_version, false)) {
>   mask &= ~GL_COLOR_BUFFER_BIT;
> - if (mask == 0x0) {
> -_mesa_meta_end(ctx);
> -return;
> - }
>}
> }
>
> @@ -718,10 +714,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
>filter, dstFlipX, dstFlipY,
>use_glsl_version, true)) {
>   mask &= ~GL_DEPTH_BUFFER_BIT;
> - if (mask == 0x0) {
> -_mesa_meta_end(ctx);
> -return;
> - }
>}
> }
>
> --
> 1.9.2
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-stable
>



-- 
Courtney Goeltzenleuchter
LunarG
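
Editorial sketch: the quoted patch relies on every later step being guarded by a test on mask, so returning early once mask reaches zero changes nothing observable. The small C model below illustrates that argument. COLOR_BIT, DEPTH_BIT and blit_steps() are hypothetical stand-ins, not Mesa identifiers, and the model ignores everything the real function does besides the mask bookkeeping.

```c
#include <assert.h>

/* Stand-ins for GL_COLOR_BUFFER_BIT / GL_DEPTH_BUFFER_BIT. */
enum { COLOR_BIT = 0x1, DEPTH_BIT = 0x2 };

/* Hypothetical model of the control flow: each step runs only if its bit
 * is set in mask and clears that bit when done.  With early_return set,
 * the function also bails out as soon as mask becomes zero.
 * Returns the remaining mask; *work_done counts the steps executed. */
static unsigned
blit_steps(unsigned mask, int early_return, int *work_done)
{
   assert(work_done != 0);
   *work_done = 0;

   if (mask & COLOR_BIT) {
      (*work_done)++;
      mask &= ~(unsigned)COLOR_BIT;
      if (early_return && mask == 0)
         return mask;
   }

   if (mask & DEPTH_BIT) {
      (*work_done)++;
      mask &= ~(unsigned)DEPTH_BIT;
      if (early_return && mask == 0)
         return mask;
   }

   return mask;
}
```

For every input mask, both variants return the same remaining mask and execute the same number of steps, which is the property the patch depends on.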

Re: [Mesa-dev] [PATCH] loader: allow alternative methods for PCI identification.

2014-05-19 Thread Gary Wong

On Wed, May 14, 2014 at 10:39:05PM -0600, Gary Wong wrote:
> loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt
> all available strategies to identify the hardware, instead of conditionally
> compiling in a single test.  The existing libudev and DRM approaches have
> been retained, and another simple alternative of looking up the answer in
> the /sys filesystem (available on Linux) is added.

Hi folks,

Any feedback on this patch?  I'd like to push it to master if there
are no objections.

Thanks,
Gary.
-- 
 Gary Wong g...@gnu.org http://www.cs.utah.edu/~gtw/
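
Editorial sketch: the quoted description says the loader now tries every available identification strategy in turn instead of compiling in a single one. A minimal C illustration of that fallback pattern follows. The type pci_id_strategy, the function get_pci_id() and the two stub strategies are hypothetical names invented for this example, and the IDs are placeholder values; the real loader functions and their signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strategy signature: returns 1 and fills in the IDs on
 * success, 0 if this method cannot identify the device. */
typedef int (*pci_id_strategy)(int fd, int *vendor_id, int *chip_id);

/* Stub standing in for a method that is compiled in but fails at run time. */
static int
strategy_unavailable(int fd, int *vendor_id, int *chip_id)
{
   (void)fd; (void)vendor_id; (void)chip_id;
   return 0;
}

/* Stub standing in for a method that succeeds (placeholder IDs). */
static int
strategy_fixed(int fd, int *vendor_id, int *chip_id)
{
   (void)fd;
   *vendor_id = 0x1234;
   *chip_id = 0x5678;
   return 1;
}

/* Try each strategy in order and stop at the first one that succeeds.
 * NULL entries model methods that were not compiled in. */
static int
get_pci_id(const pci_id_strategy *strategies, size_t count,
           int fd, int *vendor_id, int *chip_id)
{
   assert(vendor_id != NULL && chip_id != NULL);

   for (size_t i = 0; i < count; i++) {
      if (strategies[i] && strategies[i](fd, vendor_id, chip_id))
         return 1;
   }
   return 0;
}
```

Ordering the table from most to least preferred method keeps the policy in one place, and adding a new lookup (such as a /sys-based one) becomes a one-line change to the table.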

[Mesa-dev] [Bug 78771] egl not works on 10.0.x and 10.1.x with black screen

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78771

U. Artie Eoff  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #10 from U. Artie Eoff  ---
Works for me...

wayland (1.4) 1.4.0-0-g4b4cd00
  --disable-documentation
  --disable-static
drm (master) libdrm-2.4.52-0-g46d451c
  --enable-static=yes --enable-udev --enable-libkms
  --disable-nouveau-experimental-api --disable-radeon
  --disable-nouveau --enable-exynos-experimental-api
mesa (10.1) mesa-10.1.2-0-gbde3135
  --enable-gles1 --enable-gles2 --with-egl-platforms=drm,wayland
  --disable-glx --enable-shared-glapi --enable-texture-float
  --enable-gbm --enable-gallium-llvm
  --with-dri-drivers=i915,i965,swrast
  --with-gallium-drivers=swrast,svga
cairo (1.12) 1.12.16-0-g8e11a42
  --with-pic --enable-fc --enable-ft --enable-egl --enable-glesv2
  --enable-ps --enable-pdf --enable-script --enable-svg
  --enable-tee --disable-xlib --disable-xcb --disable-gtk-doc
  --disable-static
weston (1.4) 1.4.0-0-g1811312
  --disable-static --disable-setuid-install --enable-simple-clients
  --enable-clients
  --disable-libunwind --disable-xwayland --disable-xwayland-test
  --disable-x11-compositor --disable-rpi-compositor
  --enable-demo-clients-install

...I tested 32-bit on the NDiS-166 and still don't see any issues.  Perhaps
your issue is caused by one of the custom IVI patches, which is beyond the
scope here.


[Mesa-dev] [PATCH 20/23] i965/fs: Loop over instruction lists and generate code.

2014-05-19 Thread Matt Turner

Small code reduction. Will let us move the program header code into a
common place in generate_assembly().
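
Editorial sketch: the refactoring replaces two near-identical SIMD8/SIMD16 blocks with one loop over an array of instruction lists, deriving the dispatch width from the index as (i + 1) * 8. The C fragment below shows only that loop shape. generate_all() and its parameters are hypothetical stand-ins; the real code also pads to a 64-byte boundary and records prog_offset_16 before the SIMD16 pass.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for generate_assembly(): instead of emitting code
 * it records the dispatch width chosen for each non-NULL list.
 * Returns how many lists were processed. */
static int
generate_all(const void *simd8_list, const void *simd16_list,
             unsigned widths_out[2])
{
   const void *lists[] = { simd8_list, simd16_list };
   int generated = 0;

   /* At least one program must be present, as in the original assert. */
   assert(simd8_list || simd16_list);

   for (unsigned i = 0; i < ARRAY_SIZE(lists); i++) {
      if (!lists[i])
         continue;

      /* Index 0 -> SIMD8, index 1 -> SIMD16. */
      widths_out[generated++] = (i + 1) * 8;
   }

   return generated;
}
```

Anything specific to the second program (alignment, saving its start offset) then sits behind a single `i == 1` check inside the loop, which is what the patch does.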
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 56 ++---
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 46 +---
 2 files changed, 42 insertions(+), 60 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 914fb29..bae39c1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1825,45 +1825,35 @@ fs_generator::generate_assembly(exec_list *simd8_instructions,
assert(simd8_instructions || simd16_instructions);
 
const struct gl_program *prog = fp ? &fp->Base : NULL;
+   exec_list *instructions[] = { simd8_instructions, simd16_instructions };
 
-   if (simd8_instructions) {
-  struct annotation *annotation;
-  int num_annotations;
+   for (unsigned i = 0; i < ARRAY_SIZE(instructions); i++) {
+  if (instructions[i]) {
+ if (i == 1) {
+/* align to 64 byte boundary. */
+while (p->next_insn_offset % 64) {
+   brw_NOP(p);
+}
 
-  dispatch_width = 8;
-  generate_code(simd8_instructions, &num_annotations, &annotation);
-  brw_compact_instructions(p, 0, num_annotations, annotation);
+/* Save off the start of this SIMD16 program */
+prog_data->prog_offset_16 = p->next_insn_offset;
 
-  if (unlikely(debug_flag)) {
- dump_assembly(p->store, num_annotations, annotation, brw, prog,
-   brw_disassemble);
- ralloc_free(annotation);
-  }
-   }
-
-   if (simd16_instructions) {
-  /* align to 64 byte boundary. */
-  while (p->next_insn_offset % 64) {
- brw_NOP(p);
-  }
-
-  /* Save off the start of this SIMD16 program */
-  prog_data->prog_offset_16 = p->next_insn_offset;
-
-  brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+ }
 
-  struct annotation *annotation;
-  int num_annotations;
+ struct annotation *annotation;
+ int num_annotations;
 
-  dispatch_width = 16;
-  generate_code(simd16_instructions, &num_annotations, &annotation);
-  brw_compact_instructions(p, prog_data->prog_offset_16,
-   num_annotations, annotation);
+ dispatch_width = (i + 1) * 8;
+ generate_code(instructions[i], &num_annotations, &annotation);
+ brw_compact_instructions(p, prog_data->prog_offset_16,
+  num_annotations, annotation);
 
-  if (unlikely(debug_flag)) {
- dump_assembly(p->store, num_annotations, annotation, brw, prog,
-   brw_disassemble);
- ralloc_free(annotation);
+ if (unlikely(debug_flag)) {
+dump_assembly(p->store, num_annotations, annotation, brw, prog,
+  brw_disassemble);
+ralloc_free(annotation);
+ }
   }
}
 
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 272f668..f498cd5 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -1314,38 +1314,30 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
 {
assert(simd8_instructions || simd16_instructions);
 
-   if (simd8_instructions) {
-  struct annotation *annotation;
-  int num_annotations;
+   exec_list *instructions[] = { simd8_instructions, simd16_instructions };
 
-  dispatch_width = 8;
-  generate_code(simd8_instructions, &num_annotations, &annotation);
+   for (unsigned i = 0; i < ARRAY_SIZE(instructions); i++) {
+  if (instructions[i]) {
+ if (i == 1) {
+/* Align to a 64-byte boundary. */
+while (next_inst_offset % 64)
+   NOP();
 
-  if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
- dump_assembly(store, num_annotations, annotation, brw, prog,
-   gen8_disassemble);
- ralloc_free(annotation);
-  }
-   }
-
-   if (simd16_instructions) {
-  /* Align to a 64-byte boundary. */
-  while (next_inst_offset % 64)
- NOP();
-
-  /* Save off the start of this SIMD16 program */
-  prog_data->prog_offset_16 = next_inst_offset;
+/* Save off the start of this SIMD16 program */
+prog_data->prog_offset_16 = next_inst_offset;
+ }
 
-  struct annotation *annotation;
-  int num_annotations;
+ struct annotation *annotation;
+ int num_annotations;
 
-  dispatch_width = 16;
-  generate_code(simd16_instructions, &num_annotations, &annotation);
+ dispatch_width = (i + 1) * 8;
+ generate_code(instructions[i], &num_annotations, &annotation);
 
-  if (unlik

[Mesa-dev] [PATCH 09/23] i965/gen8/fs: Print disassembly after compaction.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_fs.h  |   3 +-
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 138 +++-
 2 files changed, 65 insertions(+), 76 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d26b972..1390895 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -740,7 +740,8 @@ public:
  unsigned *assembly_size);
 
 private:
-   void generate_code(exec_list *instructions);
+   void generate_code(exec_list *instructions, int *num_annotations,
+  struct annotation **annotation);
void generate_fb_write(fs_inst *inst);
void generate_linterp(fs_inst *inst, struct brw_reg dst,
  struct brw_reg *src);
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 9df5b73..7e90ee6 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -883,12 +883,9 @@ gen8_fs_generator::generate_untyped_surface_read(fs_inst *ir,
 }
 
 void
-gen8_fs_generator::generate_code(exec_list *instructions)
+gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations,
+ struct annotation **annotation)
 {
-   int last_native_inst_offset = next_inst_offset;
-   const char *last_annotation_string = NULL;
-   const void *last_annotation_ir = NULL;
-
if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
   if (prog) {
  fprintf(stderr,
@@ -905,53 +902,52 @@ gen8_fs_generator::generate_code(exec_list *instructions)
   }
}
 
+   int block_num = 0;
+   int ann_num = 0;
+   int ann_size = 1024;
cfg_t *cfg = NULL;
-   if (unlikely(INTEL_DEBUG & DEBUG_WM))
+   struct annotation *ann = NULL;
+
+   if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
   cfg = new(mem_ctx) cfg_t(instructions);
+  ann = rzalloc_array(NULL, struct annotation, ann_size);
+   }
 
foreach_list(node, instructions) {
   fs_inst *ir = (fs_inst *) node;
   struct brw_reg src[3], dst;
 
   if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
- foreach_list(node, &cfg->block_list) {
-bblock_link *link = (bblock_link *)node;
-bblock_t *block = link->block;
-
-if (block->start == ir) {
-   fprintf(stderr, "   START B%d", block->block_num);
-   foreach_list(predecessor_node, &block->parents) {
-  bblock_link *predecessor_link =
- (bblock_link *)predecessor_node;
-  bblock_t *predecessor_block = predecessor_link->block;
-  fprintf(stderr, " <-B%d", predecessor_block->block_num);
-   }
-   fprintf(stderr, "\n");
-}
+ if (ann_num == ann_size) {
+ann_size *= 2;
+ann = reralloc(NULL, ann, struct annotation, ann_size);
  }
 
- if (last_annotation_ir != ir->ir) {
-last_annotation_ir = ir->ir;
-if (last_annotation_ir) {
-   fprintf(stderr, "   ");
-   if (prog) {
-  ((ir_instruction *) ir->ir)->fprint(stderr);
-   } else if (prog) {
-  const prog_instruction *fpi;
-  fpi = (const prog_instruction *) ir->ir;
-  fprintf(stderr, "%d: ", (int)(fpi - prog->Instructions));
-  _mesa_fprint_instruction_opt(stderr,
-   fpi,
-   0, PROG_PRINT_DEBUG, NULL);
-   }
-   fprintf(stderr, "\n");
-}
+ ann[ann_num].offset = next_inst_offset;
+ ann[ann_num].ir = ir->ir;
+ ann[ann_num].annotation = ir->annotation;
+
+ if (cfg->blocks[block_num]->start == ir) {
+ann[ann_num].block_start = cfg->blocks[block_num];
  }
- if (last_annotation_string != ir->annotation) {
-last_annotation_string = ir->annotation;
-if (last_annotation_string)
-   fprintf(stderr, "   %s\n", last_annotation_string);
+
+ /* There is no hardware DO instruction on Gen6+, so since DO always
+  * starts a basic block, we need to set the .block_start of the next
+  * instruction's annotation with a pointer to the bblock started by
+  * the DO.
+  *
+  * There's also only complication from emitting an annotation without
+  * a corresponding hardware instruction to disassemble.
+  */
+ if (brw->gen >= 6 && ir->opcode == BRW_OPCODE_DO) {
+ann_num--;
  }
+
+ if (cfg->blocks[block_num]->end == ir) {
+ann[ann_num].block_end = cfg->blocks[block_num];
+block_num++;
+ }
+ ann_num++;
   }
 
   for (unsigned int i = 0; i < 3; i++) {
@@ -1295,44 +

[Mesa-dev] [PATCH 16/23] i965: Emit 0.0:F sources with type VF instead.

2014-05-19 Thread Matt Turner

Number of compacted instructions: 817752 -> 827404 (1.18%)
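
Editorial sketch: the patch recognises a 0.0:F immediate by checking that the raw 32-bit immediate is zero. The C fragment below illustrates why a bit-pattern test is a natural way to do that, assuming IEEE 754 binary32 floats: only +0.0f has an all-zero encoding, while -0.0f carries the sign bit. float_bits() and is_positive_zero_imm() are hypothetical helpers written for this note, not driver functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
static uint32_t
float_bits(float f)
{
   uint32_t u;

   assert(sizeof(f) == sizeof(u));
   memcpy(&u, &f, sizeof(u));   /* well-defined way to reinterpret bits */
   return u;
}

/* Hypothetical predicate mirroring the "immediate bits == 0x0" test. */
static int
is_positive_zero_imm(float f)
{
   return float_bits(f) == 0x0;
}
```

Under that assumption, is_positive_zero_imm(-0.0f) is false, so a negative-zero immediate would not be retyped by a check of this form.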
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index d8efa01..1810233 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -357,6 +357,22 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
   } else {
  insn->bits1.da1.src1_reg_type = BRW_HW_REG_TYPE_UD;
   }
+
+  /* Compacted instructions only have 12-bits (plus 1 for the other 20)
+   * for immediate values. Presumably the hardware engineers realized
+   * that the only useful floating-point value that could be represented
+   * in this format is 0.0, which can also be represented as a VF-typed
+   * immediate, so they gave us the previously mentioned mapping on IVB+.
+   *
+   * Strangely, we do have a mapping for imm:f in src1, so we don't need
+   * to do this there.
+   *
+   * If we see a 0.0:F, change the type to VF so that it can be compacted.
+   */
+  if (insn->bits3.ud == 0x0 &&
+  insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_F) {
+ insn->bits1.da1.src0_reg_type = BRW_HW_REG_IMM_TYPE_VF;
+  }
}
else
{
-- 
1.8.3.2


[Mesa-dev] [PATCH 23/23] i965/gen8: Print number of instructions directly.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 5 +
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 0ac00f9..90743ee 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -1313,10 +1313,13 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
 
  struct annotation *annotation;
  int num_annotations;
+ int start_offset = next_inst_offset;
 
  dispatch_width = (i + 1) * 8;
  generate_code(instructions[i], &num_annotations, &annotation);
 
+ int before_size = next_inst_offset - start_offset;
+
  if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
 if (this->prog) {
fprintf(stderr,
@@ -1331,6 +1334,8 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
dispatch_width);
 }
+fprintf(stderr, "SIMD%d shader: %d instructions.\n",
+dispatch_width, before_size / 16);
 dump_assembly(store, num_annotations, annotation, brw, prog,
   gen8_disassemble);
 ralloc_free(annotation);
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index 9f19a0a..3447ebf 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -944,6 +944,8 @@ gen8_vec4_generator::generate_assembly(exec_list *instructions,
default_state.exec_size = BRW_EXECUTE_8;
generate_code(instructions, &num_annotations, &annotation);
 
+   int before_size = next_inst_offset;
+
if (unlikely(debug_flag)) {
   if (shader_prog) {
  fprintf(stderr, "Native code for %s vertex shader %d:\n",
@@ -952,6 +954,7 @@ gen8_vec4_generator::generate_assembly(exec_list *instructions,
   } else {
  fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
   }
+  fprintf(stderr, "vec4 shader: %d instructions.\n", before_size / 16);
   dump_assembly(store, num_annotations, annotation, brw, prog,
 gen8_disassemble);
   ralloc_free(annotation);
-- 
1.8.3.2


[Mesa-dev] [PATCH 21/23] i965: Print shader header in generate_assembly().

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 29 ++-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 17 ++---
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 29 ++-
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 17 ++---
 4 files changed, 40 insertions(+), 52 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index bae39c1..f70e7b2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1325,22 +1325,6 @@ void
 fs_generator::generate_code(exec_list *instructions, int *num_annotations,
 struct annotation **annotation)
 {
-   if (unlikely(debug_flag)) {
-  if (prog) {
- fprintf(stderr,
- "Native code for %s fragment shader %d (SIMD%d dispatch):\n",
- prog->Label ? prog->Label : "unnamed",
- prog->Name, dispatch_width);
-  } else if (fp) {
- fprintf(stderr,
- "Native code for fragment program %d (SIMD%d dispatch):\n",
- fp->Base.Id, dispatch_width);
-  } else {
- fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
- dispatch_width);
-  }
-   }
-
int block_num = 0;
int ann_num = 0;
int ann_size = 1024;
@@ -1850,6 +1834,19 @@ fs_generator::generate_assembly(exec_list *simd8_instructions,
   num_annotations, annotation);
 
  if (unlikely(debug_flag)) {
+if (this->prog) {
+   fprintf(stderr,
+   "Native code for %s fragment shader %d (SIMD%d dispatch):\n",
+   this->prog->Label ? this->prog->Label : "unnamed",
+   this->prog->Name, dispatch_width);
+} else if (fp) {
+   fprintf(stderr,
+   "Native code for fragment program %d (SIMD%d dispatch):\n",
+   fp->Base.Id, dispatch_width);
+} else {
+   fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
+   dispatch_width);
+}
 dump_assembly(p->store, num_annotations, annotation, brw, prog,
   brw_disassemble);
 ralloc_free(annotation);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 5980aad..819ed10 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1264,16 +1264,6 @@ void
 vec4_generator::generate_code(exec_list *instructions, int *num_annotations,
   struct annotation **annotation)
 {
-   if (unlikely(debug_flag)) {
-  if (shader_prog) {
- fprintf(stderr, "Native code for %s vertex shader %d:\n",
- shader_prog->Label ? shader_prog->Label : "unnamed",
- shader_prog->Name);
-  } else {
- fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
-  }
-   }
-
int block_num = 0;
int ann_num = 0;
int ann_size = 1024;
@@ -1378,6 +1368,13 @@ vec4_generator::generate_assembly(exec_list *instructions,
brw_compact_instructions(p, 0, num_annotations, annotation);
 
if (unlikely(debug_flag)) {
+  if (shader_prog) {
+ fprintf(stderr, "Native code for %s vertex shader %d:\n",
+ shader_prog->Label ? shader_prog->Label : "unnamed",
+ shader_prog->Name);
+  } else {
+ fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
+  }
   dump_assembly(p->store, num_annotations, annotation, brw, prog,
 brw_disassemble);
   ralloc_free(annotation);
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index f498cd5..0ac00f9 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -886,22 +886,6 @@ void
 gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations,
  struct annotation **annotation)
 {
-   if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
-  if (prog) {
- fprintf(stderr,
- "Native code for %s fragment shader %d (SIMD%d dispatch):\n",
-shader_prog->Label ? shader_prog->Label : "unnamed",
-shader_prog->Name, dispatch_width);
-  } else if (fp) {
- fprintf(stderr,
- "Native code for fragment program %d (SIMD%d dispatch):\n",
- prog->Id, dispatch_width);
-  } else {
- fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
- dispatch_width);
-  }
-   }
-
int block_num = 0;
int ann_num = 0;
int ann_size = 1

[Mesa-dev] [PATCH 04/23] i965/fs+blorp: Remove left over dump_file arguments.

2014-05-19 Thread Matt Turner

These were used by the blorp unit test programs.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp| 20 
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.h  |  5 ++---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 13 ++---
 5 files changed, 15 insertions(+), 29 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 3da6388..118af27 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -519,8 +519,7 @@ public:
brw_blorp_blit_program(struct brw_context *brw,
   const brw_blorp_blit_prog_key *key, bool debug_flag);
 
-   const GLuint *compile(struct brw_context *brw, GLuint *program_size,
- FILE *dump_file = stderr);
+   const GLuint *compile(struct brw_context *brw, GLuint *program_size);
 
brw_blorp_prog_data prog_data;
 
@@ -634,8 +633,7 @@ brw_blorp_blit_program::brw_blorp_blit_program(
 
 const GLuint *
 brw_blorp_blit_program::compile(struct brw_context *brw,
-GLuint *program_size,
-FILE *dump_file)
+GLuint *program_size)
 {
/* Sanity checks */
if (key->dst_tiled_w && key->rt_samples > 0) {
@@ -790,7 +788,7 @@ brw_blorp_blit_program::compile(struct brw_context *brw,
 */
render_target_write();
 
-   return get_program(program_size, dump_file);
+   return get_program(program_size);
 }
 
 void
@@ -2146,7 +2144,7 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context *brw,
   brw_blorp_blit_program prog(brw, &this->wm_prog_key,
   INTEL_DEBUG & DEBUG_BLORP);
   GLuint program_size;
-  const GLuint *program = prog.compile(brw, &program_size, stderr);
+  const GLuint *program = prog.compile(brw, &program_size);
   brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
&this->wm_prog_key, sizeof(this->wm_prog_key),
program, program_size,
@@ -2155,13 +2153,3 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context *brw,
}
return prog_offset;
 }
-
-void
-brw_blorp_blit_test_compile(struct brw_context *brw,
-const brw_blorp_blit_prog_key *key,
-FILE *out)
-{
-   GLuint program_size;
-   brw_blorp_blit_program prog(brw, key, true /* debug_flag */);
-   prog.compile(brw, &program_size, out);
-}
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 4910b6c..33fa606 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -41,9 +41,9 @@ brw_blorp_eu_emitter::~brw_blorp_eu_emitter()
 }
 
 const unsigned *
-brw_blorp_eu_emitter::get_program(unsigned *program_size, FILE *dump_file)
+brw_blorp_eu_emitter::get_program(unsigned *program_size)
 {
-   return generator.generate_assembly(NULL, &insts, program_size, dump_file);
+   return generator.generate_assembly(NULL, &insts, program_size);
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
index 8a93f05..bc927fe 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
@@ -33,7 +33,7 @@ protected:
explicit brw_blorp_eu_emitter(struct brw_context *brw, bool debug_flag);
~brw_blorp_eu_emitter();
 
-   const unsigned *get_program(unsigned *program_size, FILE *dump_file);
+   const unsigned *get_program(unsigned *program_size);
 
void emit_kill_if_outside_rect(const struct brw_reg &x,
   const struct brw_reg &y,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 7a87aed..8acad2f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -608,11 +608,10 @@ public:
 
const unsigned *generate_assembly(exec_list *simd8_instructions,
  exec_list *simd16_instructions,
- unsigned *assembly_size,
- FILE *dump_file = NULL);
+ unsigned *assembly_size);
 
 private:
-   void generate_code(exec_list *instructions, FILE *dump_file);
+   void generate_code(exec_list *instructions);
void generate_fb_write(fs_inst *inst);
void generate_blorp_fb_write(fs_inst *inst);
void generate_pixel_xy(struct brw_reg dst, bool is_x);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 878b0e0..bf3f32c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1321,7 +1321,7 @@ fs_ge

[Mesa-dev] [PATCH 08/23] i965/vec4: Print disassembly after compaction.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_vec4.h |   4 +-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 109 +--
 2 files changed, 66 insertions(+), 47 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index a86972a..3a1eb12 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -36,6 +36,7 @@ extern "C" {
 
 #include "brw_context.h"
 #include "brw_eu.h"
+#include "intel_asm_printer.h"
 
 #ifdef __cplusplus
 }; /* extern "C" */
@@ -650,7 +651,8 @@ public:
const unsigned *generate_assembly(exec_list *insts, unsigned *asm_size);
 
 private:
-   void generate_code(exec_list *instructions);
+   void generate_code(exec_list *instructions, int *num_annotations,
+  struct annotation **annotation);
void generate_vec4_instruction(vec4_instruction *inst,
   struct brw_reg dst,
   struct brw_reg *src);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index a91bfe7..2176de4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -21,6 +21,7 @@
  */
 
 #include "brw_vec4.h"
+#include "brw_cfg.h"
 
 extern "C" {
 #include "brw_eu.h"
@@ -1260,12 +1261,9 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
 }
 
 void
-vec4_generator::generate_code(exec_list *instructions)
+vec4_generator::generate_code(exec_list *instructions, int *num_annotations,
+  struct annotation **annotation)
 {
-   int last_native_insn_offset = 0;
-   const char *last_annotation_string = NULL;
-   const void *last_annotation_ir = NULL;
-
if (unlikely(debug_flag)) {
   if (shader_prog) {
  fprintf(stderr, "Native code for %s vertex shader %d:\n",
@@ -1276,32 +1274,52 @@ vec4_generator::generate_code(exec_list *instructions)
   }
}
 
+   int block_num = 0;
+   int ann_num = 0;
+   int ann_size = 1024;
+   cfg_t *cfg = NULL;
+   struct annotation *ann = NULL;
+
+   if (unlikely(debug_flag)) {
+  cfg = new(mem_ctx) cfg_t(instructions);
+  ann = rzalloc_array(NULL, struct annotation, ann_size);
+   }
+
foreach_list(node, instructions) {
   vec4_instruction *inst = (vec4_instruction *)node;
   struct brw_reg src[3], dst;
 
   if (unlikely(debug_flag)) {
-if (last_annotation_ir != inst->ir) {
-   last_annotation_ir = inst->ir;
-   if (last_annotation_ir) {
-  fprintf(stderr, "   ");
-   if (shader_prog) {
-  ((ir_instruction *) last_annotation_ir)->fprint(stderr);
-   } else {
-  const prog_instruction *vpi;
-  vpi = (const prog_instruction *) inst->ir;
-  fprintf(stderr, "%d: ", (int)(vpi - prog->Instructions));
-  _mesa_fprint_instruction_opt(stderr, vpi, 0,
-   PROG_PRINT_DEBUG, NULL);
-   }
-  fprintf(stderr, "\n");
-   }
-}
-if (last_annotation_string != inst->annotation) {
-   last_annotation_string = inst->annotation;
-   if (last_annotation_string)
-  fprintf(stderr, "   %s\n", last_annotation_string);
-}
+ if (ann_num == ann_size) {
+ann_size *= 2;
+ann = reralloc(NULL, ann, struct annotation, ann_size);
+ }
+
+ ann[ann_num].offset = p->next_insn_offset;
+ ann[ann_num].ir = inst->ir;
+ ann[ann_num].annotation = inst->annotation;
+
+ if (cfg->blocks[block_num]->start == inst) {
+ann[ann_num].block_start = cfg->blocks[block_num];
+ }
+
+ /* There is no hardware DO instruction on Gen6+, so since DO always
+  * starts a basic block, we need to set the .block_start of the next
+  * instruction's annotation with a pointer to the bblock started by
+  * the DO.
+  *
+  * There's also only complication from emitting an annotation without
+  * a corresponding hardware instruction to disassemble.
+  */
+ if (brw->gen >= 6 && inst->opcode == BRW_OPCODE_DO) {
+ann_num--;
+ }
+
+ if (cfg->blocks[block_num]->end == inst) {
+ann[ann_num].block_end = cfg->blocks[block_num];
+block_num++;
+ }
+ ann_num++;
   }
 
   for (unsigned int i = 0; i < 3; i++) {
@@ -1332,38 +1350,37 @@ vec4_generator::generate_code(exec_list *instructions)
  if (inst->no_dd_check)
 last->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
   }
-
-  if (unlikely(debug_flag)) {
-brw_disassemble(brw, p->store,
-last_native_insn_offset, p->next_insn_offset, stderr);
-  }
-
-  last_nativ

[Mesa-dev] [PATCH 13/23] i965: Use next_offset() in instruction compaction code.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_eu_compact.c | 20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index 40d1fc2..f6f055f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -765,11 +765,7 @@ brw_compact_instructions(struct brw_compile *p, int start_offset,
  break;
   }
 
-  if (insn->header.cmpt_control) {
- offset += 8;
-  } else {
- offset += 16;
-  }
+  offset = next_offset(store, offset);
}
 
/* p->nr_insn is counting the number of uncompacted instructions still, so
@@ -792,22 +788,12 @@ brw_compact_instructions(struct brw_compile *p, int start_offset,
  while (start_offset + old_ip[offset / 8] * 8 != annotation[i].offset) {
 assert(start_offset + old_ip[offset / 8] * 8 <
annotation[i].offset);
-struct brw_instruction *insn = store + offset;
-if (insn->header.cmpt_control) {
-   offset += 8;
-} else {
-   offset += 16;
-}
+offset = next_offset(store, offset);
  }
 
  annotation[i].offset = start_offset + offset;
 
- struct brw_instruction *insn = store + offset;
- if (insn->header.cmpt_control) {
-offset += 8;
- } else {
-offset += 16;
- }
+ offset = next_offset(store, offset);
   }
 
   annotation[num_annotations].offset = p->next_insn_offset;
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/23] i965/fs: Print disassembly after compaction.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_fs.h |   4 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 156 -
 2 files changed, 77 insertions(+), 83 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 111e994..d26b972 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -46,6 +46,7 @@ extern "C" {
 #include "brw_eu.h"
 #include "brw_wm.h"
 #include "brw_shader.h"
+#include "intel_asm_printer.h"
 }
 #include "gen8_generator.h"
 #include "glsl/glsl_types.h"
@@ -611,7 +612,8 @@ public:
  unsigned *assembly_size);
 
 private:
-   void generate_code(exec_list *instructions);
+   void generate_code(exec_list *instructions, int *num_annotations,
+  struct annotation **annotation);
void generate_fb_write(fs_inst *inst);
void generate_blorp_fb_write(fs_inst *inst);
void generate_pixel_xy(struct brw_reg dst, bool is_x);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 132d5cd..b0b3b56 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1322,12 +1322,9 @@ fs_generator::generate_untyped_surface_read(fs_inst *inst, struct brw_reg dst,
 }
 
 void
-fs_generator::generate_code(exec_list *instructions)
+fs_generator::generate_code(exec_list *instructions, int *num_annotations,
+struct annotation **annotation)
 {
-   int last_native_insn_offset = p->next_insn_offset;
-   const char *last_annotation_string = NULL;
-   const void *last_annotation_ir = NULL;
-
if (unlikely(debug_flag)) {
   if (prog) {
  fprintf(stderr,
@@ -1344,54 +1341,52 @@ fs_generator::generate_code(exec_list *instructions)
   }
}
 
+   int block_num = 0;
+   int ann_num = 0;
+   int ann_size = 1024;
cfg_t *cfg = NULL;
-   if (unlikely(debug_flag))
+   struct annotation *ann = NULL;
+
+   if (unlikely(debug_flag)) {
   cfg = new(mem_ctx) cfg_t(instructions);
+  ann = rzalloc_array(NULL, struct annotation, ann_size);
+   }
 
foreach_list(node, instructions) {
   fs_inst *inst = (fs_inst *)node;
   struct brw_reg src[3], dst;
 
   if (unlikely(debug_flag)) {
-foreach_list(node, &cfg->block_list) {
-   bblock_link *link = (bblock_link *)node;
-   bblock_t *block = link->block;
-
-   if (block->start == inst) {
-  fprintf(stderr, "   START B%d", block->block_num);
-  foreach_list(predecessor_node, &block->parents) {
- bblock_link *predecessor_link =
-(bblock_link *)predecessor_node;
- bblock_t *predecessor_block = predecessor_link->block;
- fprintf(stderr, " <-B%d", predecessor_block->block_num);
-  }
-  fprintf(stderr, "\n");
-   }
-}
+ if (ann_num == ann_size) {
+ann_size *= 2;
+ann = reralloc(NULL, ann, struct annotation, ann_size);
+ }
 
-if (last_annotation_ir != inst->ir) {
-   last_annotation_ir = inst->ir;
-   if (last_annotation_ir) {
-  fprintf(stderr, "   ");
-   if (prog)
-  ((ir_instruction *)inst->ir)->fprint(stderr);
-   else {
-  const prog_instruction *fpi;
-  fpi = (const prog_instruction *)inst->ir;
-  fprintf(stderr, "%d: ",
-  (int)(fpi - (fp ? fp->Base.Instructions : 0)));
-  _mesa_fprint_instruction_opt(stderr,
-   fpi,
-   0, PROG_PRINT_DEBUG, NULL);
-   }
-  fprintf(stderr, "\n");
-   }
-}
-if (last_annotation_string != inst->annotation) {
-   last_annotation_string = inst->annotation;
-   if (last_annotation_string)
-  fprintf(stderr, "   %s\n", last_annotation_string);
-}
+ ann[ann_num].offset = p->next_insn_offset;
+ ann[ann_num].ir = inst->ir;
+ ann[ann_num].annotation = inst->annotation;
+
+ if (cfg->blocks[block_num]->start == inst) {
+ann[ann_num].block_start = cfg->blocks[block_num];
+ }
+
+ /* There is no hardware DO instruction on Gen6+, so since DO always
+  * starts a basic block, we need to set the .block_start of the next
+  * instruction's annotation with a pointer to the bblock started by
+  * the DO.
+  *
+  * There's also only complication from emitting an annotation without
+  * a corresponding hardware instruction to disassemble.
+  */
+ if (brw->gen >= 6 && inst->opcode == BRW_OPCODE_DO) {
+ann_num--;
+ }
+
+ if (cfg->blocks[block_num]->e

[Mesa-dev] [PATCH 19/23] i965/fs: Use next_insn_offset rather than nr_insn.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 4 ++--
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 872b5a4..914fb29 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1843,12 +1843,12 @@ fs_generator::generate_assembly(exec_list *simd8_instructions,
 
if (simd16_instructions) {
   /* align to 64 byte boundary. */
-  while ((p->nr_insn * sizeof(struct brw_instruction)) % 64) {
+  while (p->next_insn_offset % 64) {
  brw_NOP(p);
   }
 
   /* Save off the start of this SIMD16 program */
-  prog_data->prog_offset_16 = p->nr_insn * sizeof(struct brw_instruction);
+  prog_data->prog_offset_16 = p->next_insn_offset;
 
   brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED);
 
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 9011bff..272f668 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -1330,11 +1330,11 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
 
if (simd16_instructions) {
   /* Align to a 64-byte boundary. */
-  while ((nr_inst * sizeof(gen8_instruction)) % 64)
+  while (next_inst_offset % 64)
  NOP();
 
   /* Save off the start of this SIMD16 program */
-  prog_data->prog_offset_16 = nr_inst * sizeof(gen8_instruction);
+  prog_data->prog_offset_16 = next_inst_offset;
 
   struct annotation *annotation;
   int num_annotations;
-- 
1.8.3.2
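
Editorial sketch (not part of the patch): once some instructions are
compacted to 8 bytes, an instruction count times 16 no longer equals the
byte position in the store, so the 64-byte padding loop has to work from a
byte offset. The helper below is a minimal illustration under stated
assumptions: the name sketch_nops_to_align and the constant
SKETCH_INSN_SIZE are hypothetical, and the offset is assumed to already be
16-byte aligned when the padding starts.

```c
#include <assert.h>

/* Size in bytes of one uncompacted instruction, e.g. a padding NOP. */
#define SKETCH_INSN_SIZE 16

/* Return how many full-size NOPs must be emitted so that the next
 * instruction starts on an `alignment`-byte boundary.
 *
 * Assumption (hypothetical helper): the current offset is a multiple of
 * the NOP size; otherwise stepping by 16 could never reach the boundary.
 */
static int
sketch_nops_to_align(int next_insn_offset, int alignment)
{
   assert(alignment > 0);
   assert(alignment % SKETCH_INSN_SIZE == 0);
   assert(next_insn_offset % SKETCH_INSN_SIZE == 0);

   int nops = 0;
   while ((next_insn_offset + nops * SKETCH_INSN_SIZE) % alignment)
      nops++;
   return nops;
}
```

For example, three instructions of which two are compacted occupy
16 + 8 + 8 = 32 bytes. Working from the byte offset gives two NOPs to
reach 64, whereas three instructions times 16 bytes would suggest only
one.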


[Mesa-dev] [PATCH 11/23] i965: Rename next_ip() -> next_offset().

2014-05-19 Thread Matt Turner

That we were comparing its return value with offsets should have been a
clue. :)

Make it take a void *store in preparation for making the function useful
elsewhere.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 63 +
 1 file changed, 33 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 1ebd7a9..a357d5d 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2383,31 +2383,32 @@ void brw_urb_WRITE(struct brw_compile *p,
 }
 
 static int
-next_ip(struct brw_compile *p, int ip)
+next_offset(void *store, int offset)
 {
-   struct brw_instruction *insn = (void *)p->store + ip;
+   struct brw_instruction *insn = (void *)store + offset;
 
if (insn->header.cmpt_control)
-  return ip + 8;
+  return offset + 8;
else
-  return ip + 16;
+  return offset + 16;
 }
 
 static int
-brw_find_next_block_end(struct brw_compile *p, int start)
+brw_find_next_block_end(struct brw_compile *p, int start_offset)
 {
-   int ip;
+   int offset;
void *store = p->store;
 
-   for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) {
-  struct brw_instruction *insn = store + ip;
+   for (offset = next_offset(store, start_offset); offset < p->next_insn_offset;
+offset = next_offset(store, offset)) {
+  struct brw_instruction *insn = store + offset;
 
   switch (insn->header.opcode) {
   case BRW_OPCODE_ENDIF:
   case BRW_OPCODE_ELSE:
   case BRW_OPCODE_WHILE:
   case BRW_OPCODE_HALT:
-return ip;
+return offset;
   }
}
 
@@ -2419,28 +2420,29 @@ brw_find_next_block_end(struct brw_compile *p, int start)
  * instruction.
  */
 static int
-brw_find_loop_end(struct brw_compile *p, int start)
+brw_find_loop_end(struct brw_compile *p, int start_offset)
 {
struct brw_context *brw = p->brw;
-   int ip;
+   int offset;
int scale = 8;
void *store = p->store;
 
/* Always start after the instruction (such as a WHILE) we're trying to fix
 * up.
 */
-   for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) {
-  struct brw_instruction *insn = store + ip;
+   for (offset = next_offset(store, start_offset); offset < p->next_insn_offset;
+offset = next_offset(store, offset)) {
+  struct brw_instruction *insn = store + offset;
 
   if (insn->header.opcode == BRW_OPCODE_WHILE) {
 int jip = brw->gen == 6 ? insn->bits1.branch_gen6.jump_count
   : insn->bits3.break_cont.jip;
-if (ip + jip * scale <= start)
-   return ip;
+if (offset + jip * scale <= start_offset)
+   return offset;
   }
}
assert(!"not reached");
-   return start;
+   return start_offset;
 }
 
 /* After program generation, go back and update the UIP and JIP of
@@ -2450,15 +2452,16 @@ void
 brw_set_uip_jip(struct brw_compile *p)
 {
struct brw_context *brw = p->brw;
-   int ip;
+   int offset;
int scale = 8;
void *store = p->store;
 
if (brw->gen < 6)
   return;
 
-   for (ip = 0; ip < p->next_insn_offset; ip = next_ip(p, ip)) {
-  struct brw_instruction *insn = store + ip;
+   for (offset = 0; offset < p->next_insn_offset;
+offset = next_offset(store, offset)) {
+  struct brw_instruction *insn = store + offset;
 
   if (insn->header.cmpt_control) {
 /* Fixups for compacted BREAK/CONTINUE not supported yet. */
@@ -2468,31 +2471,31 @@ brw_set_uip_jip(struct brw_compile *p)
 continue;
   }
 
-  int block_end_ip = brw_find_next_block_end(p, ip);
+  int block_end_offset = brw_find_next_block_end(p, offset);
   switch (insn->header.opcode) {
   case BRW_OPCODE_BREAK:
- assert(block_end_ip != 0);
-insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+ assert(block_end_offset != 0);
+insn->bits3.break_cont.jip = (block_end_offset - offset) / scale;
 /* Gen7 UIP points to WHILE; Gen6 points just after it */
 insn->bits3.break_cont.uip =
-   (brw_find_loop_end(p, ip) - ip +
+   (brw_find_loop_end(p, offset) - offset +
  (brw->gen == 6 ? 16 : 0)) / scale;
 break;
   case BRW_OPCODE_CONTINUE:
- assert(block_end_ip != 0);
-insn->bits3.break_cont.jip = (block_end_ip - ip) / scale;
+ assert(block_end_offset != 0);
+insn->bits3.break_cont.jip = (block_end_offset - offset) / scale;
 insn->bits3.break_cont.uip =
-(brw_find_loop_end(p, ip) - ip) / scale;
+(brw_find_loop_end(p, offset) - offset) / scale;
 
 assert(insn->bits3.break_cont.uip != 0);
 assert(insn->bits3.break_cont.jip != 0);
 break;
 
   case BRW_OPCODE_ENDIF:
- if (block_end_ip == 0)
+ if (block_end_offset == 0)
 insn->bits3.break_cont.jip = 2;

[Mesa-dev] [PATCH 22/23] i965: Emit compaction stats without walking the assembly.

2014-05-19 Thread Matt Turner

The instruction count does not include padding NOPs, but the compaction
stats do.
---
 src/mesa/drivers/dri/i965/brw_eu_compact.c   | 19 ---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  8 
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  7 +++
 3 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index f40ba04..0560367 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -841,23 +841,4 @@ brw_compact_instructions(struct brw_compile *p, int start_offset,
 
   annotation[num_annotations].offset = p->next_insn_offset;
}
-
-   if (0) {
-  fprintf(stderr, "dumping compacted program\n");
-  brw_disassemble(brw, store, 0, p->next_insn_offset - start_offset, stderr);
-
-  int cmp = 0;
-  for (offset = 0; offset < p->next_insn_offset - start_offset;) {
- struct brw_instruction *insn = store + offset;
-
- if (insn->header.cmpt_control) {
-offset += 8;
-cmp++;
- } else {
-offset += 16;
- }
-  }
-  fprintf(stderr, "%db/%db saved (%d%%)\n", cmp * 8, offset + cmp * 8,
-  cmp * 8 * 100 / (offset + cmp * 8));
-   }
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index f70e7b2..4b2245b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1827,11 +1827,15 @@ fs_generator::generate_assembly(exec_list *simd8_instructions,
 
  struct annotation *annotation;
  int num_annotations;
+ int start_offset = p->next_insn_offset;
 
  dispatch_width = (i + 1) * 8;
  generate_code(instructions[i], &num_annotations, &annotation);
+
+ int before_size = p->next_insn_offset - start_offset;
  brw_compact_instructions(p, prog_data->prog_offset_16,
   num_annotations, annotation);
+ int after_size = p->next_insn_offset - start_offset;
 
  if (unlikely(debug_flag)) {
 if (this->prog) {
@@ -1847,6 +1851,10 @@ fs_generator::generate_assembly(exec_list *simd8_instructions,
fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
dispatch_width);
 }
+fprintf(stderr, "SIMD%d shader: %d instructions. Compacted %d to %d"
+" bytes (%.0f%%)\n",
+dispatch_width, before_size / 16, before_size, after_size,
+100.0f * (before_size - after_size) / before_size);
 dump_assembly(p->store, num_annotations, annotation, brw, prog,
   brw_disassemble);
 ralloc_free(annotation);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 819ed10..affcc90 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1365,7 +1365,10 @@ vec4_generator::generate_assembly(exec_list *instructions,
 
brw_set_access_mode(p, BRW_ALIGN_16);
generate_code(instructions, &num_annotations, &annotation);
+
+   int before_size = p->next_insn_offset;
brw_compact_instructions(p, 0, num_annotations, annotation);
+   int after_size = p->next_insn_offset;
 
if (unlikely(debug_flag)) {
   if (shader_prog) {
@@ -1375,6 +1378,10 @@ vec4_generator::generate_assembly(exec_list *instructions,
   } else {
  fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
   }
+  fprintf(stderr, "vec4 shader: %d instructions. Compacted %d to %d"
+  " bytes (%.0f%%)\n",
+  before_size / 16, before_size, after_size,
+  100.0f * (before_size - after_size) / before_size);
   dump_assembly(p->store, num_annotations, annotation, brw, prog,
 brw_disassemble);
   ralloc_free(annotation);
-- 
1.8.3.2
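
Editorial sketch (not part of the patch): the statistics line can be
derived from two byte sizes taken before and after compaction, with no
second walk over the assembly. The helpers below are hypothetical names
used for illustration only. The zero-size guard and the assertion that
compaction never grows the program are assumptions added here; the patch
itself divides by before_size directly.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage of bytes saved by compaction.
 * Assumption of this sketch: an empty program reports 0% rather than
 * dividing by zero.
 */
static float
sketch_percent_saved(int before_size, int after_size)
{
   assert(after_size <= before_size);
   if (before_size == 0)
      return 0.0f;
   return 100.0f * (before_size - after_size) / before_size;
}

/* Format a stats line similar in spirit to the one the patch prints.
 * before_size / 16 is the instruction count prior to compaction, since
 * every uncompacted instruction is 16 bytes.
 */
static int
sketch_format_stats(char *buf, size_t len, int before_size, int after_size)
{
   return snprintf(buf, len,
                   "%d instructions. Compacted %d to %d bytes (%.0f%%)",
                   before_size / 16, before_size, after_size,
                   sketch_percent_saved(before_size, after_size));
}
```

With before_size = 160 and after_size = 120, the sketch reports 10
instructions and a 25% saving.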


[Mesa-dev] [PATCH 15/23] i965: Emit ARF:UD for non-present src1 on Gen6+.

2014-05-19 Thread Matt Turner

Enables the next commits to compact more instructions.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 38d327a..d8efa01 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -329,10 +329,34 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
if (reg.file == BRW_IMMEDIATE_VALUE) {
   insn->bits3.ud = reg.dw1.ud;
 
-  /* Required to set some fields in src1 as well:
+  /* The Bspec's section titled "Non-present Operands" claims that if src0
+   * is an immediate, src1's type must be the same as that of src0.
+   *
+   * The SNB+ DataTypeIndex instruction compaction tables contain mappings
+   * that do not follow this rule. E.g., from the IVB/HSW table:
+   *
+   *  DataTypeIndex   18-Bit Mapping   Mapped Meaning
+   *3 0010101101   r:f | i:vf | a:ud | <1> | dir |
+   *
+   * And from the SNB table:
+   *
+   *  DataTypeIndex   18-Bit Mapping   Mapped Meaning
+   *8 0010001100   a:w | i:w | a:ud | <1> | dir |
+   *
+   * Neither of these cause warnings from the simulator when used,
+   * compacted or otherwise. In fact, all compaction mappings that have an
+   * immediate in src0 use a:ud for src1.
+   *
+   * The GM45 instruction compaction tables do not contain mapped meanings
+   * so it's not clear whether it has the restriction. We'll assume it was
+   * lifted on SNB. (FINISHME: decode the GM45 tables and check.)
*/
   insn->bits1.da1.src1_reg_file = 0; /* arf */
-  insn->bits1.da1.src1_reg_type = insn->bits1.da1.src0_reg_type;
+  if (brw->gen < 6) {
+ insn->bits1.da1.src1_reg_type = insn->bits1.da1.src0_reg_type;
+  } else {
+ insn->bits1.da1.src1_reg_type = BRW_HW_REG_TYPE_UD;
+  }
}
else
{
-- 
1.8.3.2


[Mesa-dev] [PATCH 18/23] i965: Print IR annotations only with INTEL_DEBUG=annotation.

2014-05-19 Thread Matt Turner

Running shader-db without INTEL_DEBUG=annotation reduces the runtime
from ~90 to ~80 seconds on my machine. It also reduces the disk space
consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on
disk).
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 6 --
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 6 --
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 6 --
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 6 --
 src/mesa/drivers/dri/i965/intel_debug.c   | 1 +
 src/mesa/drivers/dri/i965/intel_debug.h   | 1 +
 6 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index b0b3b56..872b5a4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1363,8 +1363,10 @@ fs_generator::generate_code(exec_list *instructions, int *num_annotations,
  }
 
  ann[ann_num].offset = p->next_insn_offset;
- ann[ann_num].ir = inst->ir;
- ann[ann_num].annotation = inst->annotation;
+ if (INTEL_DEBUG & DEBUG_ANNOTATION) {
+ann[ann_num].ir = inst->ir;
+ann[ann_num].annotation = inst->annotation;
+ }
 
  if (cfg->blocks[block_num]->start == inst) {
 ann[ann_num].block_start = cfg->blocks[block_num];
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 2176de4..5980aad 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1296,8 +1296,10 @@ vec4_generator::generate_code(exec_list *instructions, int *num_annotations,
  }
 
  ann[ann_num].offset = p->next_insn_offset;
- ann[ann_num].ir = inst->ir;
- ann[ann_num].annotation = inst->annotation;
+ if (INTEL_DEBUG & DEBUG_ANNOTATION) {
+ann[ann_num].ir = inst->ir;
+ann[ann_num].annotation = inst->annotation;
+ }
 
  if (cfg->blocks[block_num]->start == inst) {
 ann[ann_num].block_start = cfg->blocks[block_num];
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 7e90ee6..9011bff 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -924,8 +924,10 @@ gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations,
  }
 
  ann[ann_num].offset = next_inst_offset;
- ann[ann_num].ir = ir->ir;
- ann[ann_num].annotation = ir->annotation;
+ if (INTEL_DEBUG & DEBUG_ANNOTATION) {
+ann[ann_num].ir = ir->ir;
+ann[ann_num].annotation = ir->annotation;
+ }
 
  if (cfg->blocks[block_num]->start == ir) {
 ann[ann_num].block_start = cfg->blocks[block_num];
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index 5470f87..4aeaf89 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -878,8 +878,10 @@ gen8_vec4_generator::generate_code(exec_list *instructions,
  }
 
  ann[ann_num].offset = next_inst_offset;
- ann[ann_num].ir = ir->ir;
- ann[ann_num].annotation = ir->annotation;
+ if (INTEL_DEBUG & DEBUG_ANNOTATION) {
+ann[ann_num].ir = ir->ir;
+ann[ann_num].annotation = ir->annotation;
+ }
 
  if (cfg->blocks[block_num]->start == ir) {
 ann[ann_num].block_start = cfg->blocks[block_num];
diff --git a/src/mesa/drivers/dri/i965/intel_debug.c b/src/mesa/drivers/dri/i965/intel_debug.c
index 621a571..64d2c61 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.c
+++ b/src/mesa/drivers/dri/i965/intel_debug.c
@@ -64,6 +64,7 @@ static const struct dri_debug_control debug_control[] = {
{ "no16",  DEBUG_NO16 },
{ "blorp", DEBUG_BLORP },
{ "nodualobj", DEBUG_NO_DUAL_OBJECT_GS },
+   { "annotation", DEBUG_ANNOTATION },
{ NULL,0 }
 };
 
diff --git a/src/mesa/drivers/dri/i965/intel_debug.h b/src/mesa/drivers/dri/i965/intel_debug.h
index 6402cec..49cc584 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -60,6 +60,7 @@ extern uint64_t INTEL_DEBUG;
 #define DEBUG_NO16                0x20000000
 #define DEBUG_VUE                 0x40000000
 #define DEBUG_NO_DUAL_OBJECT_GS   0x80000000
+#define DEBUG_ANNOTATION          0x100000000
 
 #ifdef HAVE_ANDROID_PLATFORM
 #define LOG_TAG "INTEL-MESA"
-- 
1.8.3.2
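
Editorial sketch (not part of the patch): the new "annotation" keyword
follows the usual pattern of a name-to-bit table consulted while parsing a
comma-separated debug string. The code below is a minimal stand-alone
illustration of that pattern, not the driver's actual parser. Every
identifier and both bit positions are hypothetical. One flag is placed
above bit 31 on purpose, to show why the mask needs a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values chosen for this sketch only. */
#define SKETCH_DEBUG_NO16        (UINT64_C(1) << 29)
#define SKETCH_DEBUG_ANNOTATION  (UINT64_C(1) << 32)

struct sketch_debug_control {
   const char *name;
   uint64_t flag;
};

/* NULL-terminated table, mirroring the shape of a debug_control[] array. */
static const struct sketch_debug_control sketch_controls[] = {
   { "no16",       SKETCH_DEBUG_NO16 },
   { "annotation", SKETCH_DEBUG_ANNOTATION },
   { NULL,         0 }
};

/* Parse a comma-separated list such as "no16,annotation" into a bitmask.
 * Unknown or empty tokens are ignored.
 */
static uint64_t
sketch_parse_debug(const char *env)
{
   assert(env != NULL);

   uint64_t mask = 0;
   while (*env) {
      size_t len = strcspn(env, ",");   /* length of the current token */
      const struct sketch_debug_control *c;

      for (c = sketch_controls; c->name; c++) {
         if (strlen(c->name) == len && strncmp(env, c->name, len) == 0)
            mask |= c->flag;
      }

      env += len;
      if (*env == ',')
         env++;                         /* skip the separator */
   }
   return mask;
}
```

A caller would then gate the extra work on the bit, in the same way the
patch tests INTEL_DEBUG & DEBUG_ANNOTATION. Note that truncating the mask
to 32 bits would silently drop a flag defined above bit 31.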


[Mesa-dev] [PATCH 12/23] i965: Move next_offset() to brw_eu.h for use elsewhere.

2014-05-19 Thread Matt Turner

Also perform arithmetic on char* rather than void* since the latter is a
GNU C extension not available in C++.
---
 src/mesa/drivers/dri/i965/brw_eu.h  | 12 
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 11 ---
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
index 8ce31a1..3c89365 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -424,6 +424,18 @@ void brw_debug_compact_uncompact(struct brw_context *brw,
 struct brw_instruction *orig,
 struct brw_instruction *uncompacted);
 
+static inline int
+next_offset(void *store, int offset)
+{
+   struct brw_instruction *insn =
+  (struct brw_instruction *)((char *)store + offset);
+
+   if (insn->header.cmpt_control)
+  return offset + 8;
+   else
+  return offset + 16;
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index a357d5d..38d327a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2383,17 +2383,6 @@ void brw_urb_WRITE(struct brw_compile *p,
 }
 
 static int
-next_offset(void *store, int offset)
-{
-   struct brw_instruction *insn = (void *)store + offset;
-
-   if (insn->header.cmpt_control)
-  return offset + 8;
-   else
-  return offset + 16;
-}
-
-static int
 brw_find_next_block_end(struct brw_compile *p, int start_offset)
 {
int offset;
-- 
1.8.3.2
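
Editorial sketch (not part of the patch): next_offset() lets any caller
step through a store that mixes 8-byte compacted and 16-byte full
instructions. The stand-alone version below illustrates the idea with a
deliberately simplified encoding. The SKETCH_COMPACTED bit in the first
byte is an assumption made for this example and is not the hardware's
cmpt_control layout. The cast to a byte pointer before the addition
reflects the commit message: arithmetic on void* is a GNU C extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding for this sketch only: bit 0 of an instruction's
 * first byte marks it as compacted (8 bytes) instead of full (16 bytes).
 */
#define SKETCH_COMPACTED 0x1

/* Return the byte offset of the instruction following the one at `offset`. */
static int
sketch_next_offset(const void *store, int offset)
{
   assert(offset >= 0);

   /* Cast to a byte pointer first; adding to a void* is not standard C. */
   const uint8_t *insn = (const uint8_t *)store + offset;

   return offset + ((insn[0] & SKETCH_COMPACTED) ? 8 : 16);
}

/* Count the instructions in [0, end_offset) by walking the store. */
static int
sketch_count_instructions(const void *store, int end_offset)
{
   int count = 0;

   for (int offset = 0; offset < end_offset;
        offset = sketch_next_offset(store, offset))
      count++;

   return count;
}
```

For a 40-byte store holding a full, a compacted and another full
instruction, the walk visits offsets 0, 16 and 24.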


[Mesa-dev] [PATCH 10/23] i965/gen8/vec4: Print disassembly after compaction.

2014-05-19 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_vec4.h  |   3 +-
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 103 +-
 2 files changed, 63 insertions(+), 43 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 3a1eb12..a3fa42f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -753,7 +753,8 @@ public:
const unsigned *generate_assembly(exec_list *insts, unsigned *asm_size);
 
 private:
-   void generate_code(exec_list *instructions);
+   void generate_code(exec_list *instructions, int *num_annotations,
+  struct annotation **annotation);
void generate_vec4_instruction(vec4_instruction *inst,
   struct brw_reg dst,
   struct brw_reg *src);
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index e53fd35..5470f87 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -22,6 +22,7 @@
  */
 
 #include "brw_vec4.h"
+#include "brw_cfg.h"
 
 extern "C" {
 #include "brw_eu.h"
@@ -841,12 +842,10 @@ gen8_vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
 }
 
 void
-gen8_vec4_generator::generate_code(exec_list *instructions)
+gen8_vec4_generator::generate_code(exec_list *instructions,
+   int *num_annotations,
+   struct annotation **annotation)
 {
-   int last_native_inst_offset = 0;
-   const char *last_annotation_string = NULL;
-   const void *last_annotation_ir = NULL;
-
if (unlikely(debug_flag)) {
   if (shader_prog) {
  fprintf(stderr, "Native code for %s vertex shader %d:\n",
@@ -857,32 +856,52 @@ gen8_vec4_generator::generate_code(exec_list *instructions)
   }
}
 
+   int block_num = 0;
+   int ann_num = 0;
+   int ann_size = 1024;
+   cfg_t *cfg = NULL;
+   struct annotation *ann = NULL;
+
+   if (unlikely(debug_flag)) {
+  cfg = new(mem_ctx) cfg_t(instructions);
+  ann = rzalloc_array(NULL, struct annotation, ann_size);
+   }
+
foreach_list(node, instructions) {
   vec4_instruction *ir = (vec4_instruction *) node;
   struct brw_reg src[3], dst;
 
   if (unlikely(debug_flag)) {
- if (last_annotation_ir != ir->ir) {
-last_annotation_ir = ir->ir;
-if (last_annotation_ir) {
-   fprintf(stderr, "   ");
-   if (shader_prog) {
-  ((ir_instruction *) last_annotation_ir)->fprint(stderr);
-   } else {
-  const prog_instruction *vpi;
-  vpi = (const prog_instruction *) ir->ir;
-  fprintf(stderr, "%d: ", (int)(vpi - prog->Instructions));
-  _mesa_fprint_instruction_opt(stderr, vpi, 0,
-   PROG_PRINT_DEBUG, NULL);
-   }
-   fprintf(stderr, "\n");
-}
+ if (ann_num == ann_size) {
+ann_size *= 2;
+ann = reralloc(NULL, ann, struct annotation, ann_size);
+ }
+
+ ann[ann_num].offset = next_inst_offset;
+ ann[ann_num].ir = ir->ir;
+ ann[ann_num].annotation = ir->annotation;
+
+ if (cfg->blocks[block_num]->start == ir) {
+ann[ann_num].block_start = cfg->blocks[block_num];
  }
- if (last_annotation_string != ir->annotation) {
-last_annotation_string = ir->annotation;
-if (last_annotation_string)
-   fprintf(stderr, "   %s\n", last_annotation_string);
+
+ /* There is no hardware DO instruction on Gen6+, so since DO always
+  * starts a basic block, we need to set the .block_start of the next
+  * instruction's annotation with a pointer to the bblock started by
+  * the DO.
+  *
+  * There's also only complication from emitting an annotation without
+  * a corresponding hardware instruction to disassemble.
+  */
+ if (brw->gen >= 6 && ir->opcode == BRW_OPCODE_DO) {
+ann_num--;
  }
+
+ if (cfg->blocks[block_num]->end == ir) {
+ann[ann_num].block_end = cfg->blocks[block_num];
+block_num++;
+ }
+ ann_num++;
   }
 
   for (unsigned int i = 0; i < 3; i++) {
@@ -908,37 +927,37 @@ gen8_vec4_generator::generate_code(exec_list *instructions)
  gen8_set_no_dd_clear(last, ir->no_dd_clear);
  gen8_set_no_dd_check(last, ir->no_dd_check);
   }
-
-  if (unlikely(debug_flag)) {
- gen8_disassemble(brw, store, last_native_inst_offset, next_inst_offset, stderr);
-  }
-
-  last_native_inst_offset = next_inst_offset;
-   }
-
-   if (unlikely(debug_flag)) {
-  fprintf(stderr, "\n");
}
 
patch_jump_targets();
 
-   /* OK

[Mesa-dev] [PATCH 17/23] i965: Switch types D->UD when possible to allow compaction.

2014-05-19 Thread Matt Turner

Number of compacted instructions: 827404 -> 833045 (0.68%)
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 1810233..ab00d7c 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -295,6 +295,16 @@ validate_reg(struct brw_instruction *insn, struct brw_reg reg)
/* 10. Check destination issues. */
 }
 
+static bool
+is_compactable_immediate(unsigned imm)
+{
+   /* We get the low 12 bits as-is. */
+   imm &= ~0xfff;
+
+   /* We get one bit replicated through the top 20 bits. */
+   return imm == 0 || imm == 0xfffff000;
+}
+
 void
 brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
 struct brw_reg reg)
@@ -373,6 +383,16 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
   insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_F) {
  insn->bits1.da1.src0_reg_type = BRW_HW_REG_IMM_TYPE_VF;
   }
+
+  /* There are no mappings for dst:d | i:d, so if the immediate is suitable
+   * set the types to :UD so the instruction can be compacted.
+   */
+  if (is_compactable_immediate(insn->bits3.ud) &&
+  insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_D &&
+  insn->bits1.da1.dest_reg_type == BRW_HW_REG_TYPE_D) {
+ insn->bits1.da1.src0_reg_type = BRW_HW_REG_TYPE_UD;
+ insn->bits1.da1.dest_reg_type = BRW_HW_REG_TYPE_UD;
+  }
}
else
{
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 14/23] i965: Support compacted instructions with immediate sources.

2014-05-19 Thread Matt Turner

Note the weirdness with src1 subregs. The compacted immediate fields are
uncompacted to bits [127:96], and the high five bits of the subreg
mapping map to bits [100:96].

Number of compacted instructions: 790085 -> 817752 (3.50%)
---
 src/mesa/drivers/dri/i965/brw_eu_compact.c | 83 +++---
 1 file changed, 63 insertions(+), 20 deletions(-)
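
The 13-bit immediate encoding can be sketched as a round trip. This is a minimal sketch under stated assumptions: struct compact_imm and the two helper names are hypothetical, the field split (low 8 bits in src1_reg_nr, bits [12:8] in src1_index) follows the diff below, and the replicated pattern is assumed to be 0xfffff000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two compact fields that hold an immediate. */
struct compact_imm {
   uint8_t src1_reg_nr; /* immediate bits [7:0] */
   uint8_t src1_index;  /* immediate bits [12:8]; bit 12 is the replicated bit */
};

/* Bits [31:12] must all be equal for the immediate to survive compaction. */
static bool
is_compactable_immediate(uint32_t imm)
{
   imm &= ~0xfffu;
   return imm == 0 || imm == 0xfffff000u;
}

static struct compact_imm
compact_immediate(uint32_t imm)
{
   struct compact_imm c;
   c.src1_reg_nr = imm & 0xff;
   c.src1_index = (imm >> 8) & 0x1f;
   return c;
}

static uint32_t
uncompact_immediate(struct compact_imm c)
{
   uint32_t imm = ((uint32_t)c.src1_index << 8) | c.src1_reg_nr;

   /* Replicate bit 12 (the top bit of src1_index) through bits [31:13]. */
   if (c.src1_index & 0x10)
      imm |= 0xffffe000u;
   return imm;
}
```

Any value accepted by is_compactable_immediate() should come back unchanged from uncompact_immediate(compact_immediate(imm)); a value such as 0x1000 is rejected because bit 12 is set while bits [31:13] are clear.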

diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c 
b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index f6f055f..f40ba04 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -373,13 +373,16 @@ set_datatype_index(struct brw_compact_instruction *dst,
 
 static bool
 set_subreg_index(struct brw_compact_instruction *dst,
- struct brw_instruction *src)
+ struct brw_instruction *src,
+ bool is_immediate)
 {
uint16_t uncompacted = 0;
 
uncompacted |= src->bits1.da1.dest_subreg_nr << 0;
uncompacted |= src->bits2.da1.src0_subreg_nr << 5;
-   uncompacted |= src->bits3.da1.src1_subreg_nr << 10;
+
+   if (!is_immediate)
+  uncompacted |= src->bits3.da1.src1_subreg_nr << 10;
 
for (int i = 0; i < 32; i++) {
   if (subreg_table[i] == uncompacted) {
@@ -424,20 +427,40 @@ set_src0_index(struct brw_compact_instruction *dst,
 
 static bool
 set_src1_index(struct brw_compact_instruction *dst,
-   struct brw_instruction *src)
+   struct brw_instruction *src, bool is_immediate)
 {
-   uint16_t compacted, uncompacted = 0;
+   if (is_immediate) {
+  dst->dw1.src1_index = (src->bits3.ud >> 8) & 0x1f;
+   } else {
+  uint16_t compacted, uncompacted;
 
-   uncompacted |= (src->bits3.ud >> 13) & 0xfff;
+  uncompacted = (src->bits3.ud >> 13) & 0xfff;
 
-   if (!get_src_index(uncompacted, &compacted))
-  return false;
+  if (!get_src_index(uncompacted, &compacted))
+ return false;
 
-   dst->dw1.src1_index = compacted;
+  dst->dw1.src1_index = compacted;
+   }
 
return true;
 }
 
+/* Compacted instructions have 12-bits for immediate sources, and a 13th bit
+ * that's replicated through the high 20 bits.
+ *
+ * Effectively this means we get 12-bit integers, 0.0f, and some limited uses
+ * of packed vectors as compactable immediates.
+ */
+static bool
+is_compactable_immediate(unsigned imm)
+{
+   /* We get the low 12 bits as-is. */
+   imm &= ~0xfff;
+
+   /* We get one bit replicated through the top 20 bits. */
+   return imm == 0 || imm == 0xfffff000;
+}
+
 /**
  * Tries to compact instruction src into dst.
  *
@@ -464,10 +487,11 @@ brw_try_compact_instruction(struct brw_compile *p,
   return false;
}
 
-   /* FINISHME: immediates */
-   if (src->bits1.da1.src0_reg_file == BRW_IMMEDIATE_VALUE ||
-   src->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE)
+   bool is_immediate = src->bits1.da1.src0_reg_file == BRW_IMMEDIATE_VALUE ||
+   src->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE;
+   if (is_immediate && !is_compactable_immediate(src->bits3.ud)) {
   return false;
+   }
 
memset(&temp, 0, sizeof(temp));
 
@@ -477,7 +501,7 @@ brw_try_compact_instruction(struct brw_compile *p,
   return false;
if (!set_datatype_index(&temp, src))
   return false;
-   if (!set_subreg_index(&temp, src))
+   if (!set_subreg_index(&temp, src, is_immediate))
   return false;
temp.dw0.acc_wr_control = src->header.acc_wr_control;
temp.dw0.conditionalmod = src->header.destreg__conditionalmod;
@@ -486,11 +510,15 @@ brw_try_compact_instruction(struct brw_compile *p,
temp.dw0.cmpt_ctrl = 1;
if (!set_src0_index(&temp, src))
   return false;
-   if (!set_src1_index(&temp, src))
+   if (!set_src1_index(&temp, src, is_immediate))
   return false;
temp.dw1.dst_reg_nr = src->bits1.da1.dest_reg_nr;
temp.dw1.src0_reg_nr = src->bits2.da1.src0_reg_nr;
-   temp.dw1.src1_reg_nr = src->bits3.da1.src1_reg_nr;
+   if (is_immediate) {
+  temp.dw1.src1_reg_nr = src->bits3.ud & 0xff;
+   } else {
+  temp.dw1.src1_reg_nr = src->bits3.da1.src1_reg_nr;
+   }
 
*dst = temp;
 
@@ -547,11 +575,17 @@ set_uncompacted_src0(struct brw_instruction *dst,
 
 static void
 set_uncompacted_src1(struct brw_instruction *dst,
- struct brw_compact_instruction *src)
+ struct brw_compact_instruction *src, bool is_immediate)
 {
-   uint16_t uncompacted = src_index_table[src->dw1.src1_index];
-
-   dst->bits3.ud |= uncompacted << 13;
+   if (is_immediate) {
+  signed high5 = src->dw1.src1_index;
+  /* Replicate top bit of src1_index into high 20 bits of the immediate. */
+  dst->bits3.ud = (high5 << 27) >> 19;
+   } else {
+  uint16_t uncompacted = src_index_table[src->dw1.src1_index];
+
+  dst->bits3.ud |= uncompacted << 13;
+   }
 }
 
 void
@@ -566,16 +600,25 @@ brw_uncompact_instruction(struct brw_context *brw,
 
set_uncompacted_control(brw, dst, src);
set_uncompacted_datatype(dst, src

[Mesa-dev] [PATCH 06/23] i965/fs: Make patch_discard_jumps_to_fb_writes return bool.

2014-05-19 Thread Matt Turner

... to tell us whether it emitted any code. Will be used to determine
whether we need to skip an annotation for it.
---
 src/mesa/drivers/dri/i965/brw_fs.h  | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 5 +++--
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 5 +++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 8acad2f..111e994 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -696,7 +696,7 @@ private:
   struct brw_reg dst,
   struct brw_reg surf_index);
 
-   void patch_discard_jumps_to_fb_writes();
+   bool patch_discard_jumps_to_fb_writes();
 
struct brw_context *brw;
struct gl_context *ctx;
@@ -788,7 +788,7 @@ private:
   struct brw_reg surf_index);
void generate_discard_jump(fs_inst *ir);
 
-   void patch_discard_jumps_to_fb_writes();
+   bool patch_discard_jumps_to_fb_writes();
 
const struct brw_wm_prog_key *const key;
struct brw_wm_prog_data *prog_data;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 0fcf527..132d5cd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -59,11 +59,11 @@ fs_generator::~fs_generator()
 {
 }
 
-void
+bool
 fs_generator::patch_discard_jumps_to_fb_writes()
 {
if (brw->gen < 6 || this->discard_halt_patches.is_empty())
-  return;
+  return false;
 
/* There is a somewhat strange undocumented requirement of using
 * HALT, according to the simulator.  If some channel has HALTed to
@@ -92,6 +92,7 @@ fs_generator::patch_discard_jumps_to_fb_writes()
}
 
this->discard_halt_patches.make_empty();
+   return true;
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 294ce46..9df5b73 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -639,11 +639,11 @@ gen8_fs_generator::generate_discard_jump(fs_inst *ir)
HALT();
 }
 
-void
+bool
 gen8_fs_generator::patch_discard_jumps_to_fb_writes()
 {
if (discard_halt_patches.is_empty())
-  return;
+  return false;
 
/* There is a somewhat strange undocumented requirement of using
 * HALT, according to the simulator.  If some channel has HALTed to
@@ -672,6 +672,7 @@ gen8_fs_generator::patch_discard_jumps_to_fb_writes()
}
 
this->discard_halt_patches.make_empty();
+   return true;
 }
 
 /**
-- 
1.8.3.2


[Mesa-dev] [PATCH 03/23] i965/fs: Don't hardcode DEBUG_WM in generic fs code.

2014-05-19 Thread Matt Turner

Similar to Paul's commit e9fa3a944 except brw_fs_generator's debug_flag
is for DEBUG_WM and DEBUG_BLORP.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp| 13 +++--
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 17 -
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp|  3 ++-
 src/mesa/drivers/dri/i965/brw_fs.h  |  4 +++-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 16 +---
 6 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index fe75100..3da6388 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -517,7 +517,7 @@ class brw_blorp_blit_program : public brw_blorp_eu_emitter
 {
 public:
brw_blorp_blit_program(struct brw_context *brw,
-  const brw_blorp_blit_prog_key *key);
+  const brw_blorp_blit_prog_key *key, bool debug_flag);
 
const GLuint *compile(struct brw_context *brw, GLuint *program_size,
  FILE *dump_file = stderr);
@@ -624,8 +624,9 @@ private:
 
 brw_blorp_blit_program::brw_blorp_blit_program(
   struct brw_context *brw,
-  const brw_blorp_blit_prog_key *key)
-   : brw_blorp_eu_emitter(brw),
+  const brw_blorp_blit_prog_key *key,
+  bool debug_flag)
+   : brw_blorp_eu_emitter(brw, debug_flag),
  brw(brw),
  key(key)
 {
@@ -2142,7 +2143,8 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context 
*brw,
if (!brw_search_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
  &this->wm_prog_key, sizeof(this->wm_prog_key),
  &prog_offset, prog_data)) {
-  brw_blorp_blit_program prog(brw, &this->wm_prog_key);
+  brw_blorp_blit_program prog(brw, &this->wm_prog_key,
+  INTEL_DEBUG & DEBUG_BLORP);
   GLuint program_size;
   const GLuint *program = prog.compile(brw, &program_size, stderr);
   brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
@@ -2160,7 +2162,6 @@ brw_blorp_blit_test_compile(struct brw_context *brw,
 FILE *out)
 {
GLuint program_size;
-   brw_blorp_blit_program prog(brw, key);
-   INTEL_DEBUG |= DEBUG_BLORP;
+   brw_blorp_blit_program prog(brw, key, true /* debug_flag */);
prog.compile(brw, &program_size, out);
 }
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
index 3549173..4910b6c 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
@@ -25,12 +25,13 @@
 #include "brw_blorp_blit_eu.h"
 #include "brw_blorp.h"
 
-brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw)
+brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw,
+   bool debug_flag)
: mem_ctx(ralloc_context(NULL)),
  generator(brw, mem_ctx,
rzalloc(mem_ctx, struct brw_wm_prog_key),
rzalloc(mem_ctx, struct brw_wm_prog_data),
-   NULL, NULL, false)
+   NULL, NULL, false, debug_flag)
 {
 }
 
@@ -42,17 +43,7 @@ brw_blorp_eu_emitter::~brw_blorp_eu_emitter()
 const unsigned *
 brw_blorp_eu_emitter::get_program(unsigned *program_size, FILE *dump_file)
 {
-   const unsigned *res;
-
-   if (unlikely(INTEL_DEBUG & DEBUG_BLORP)) {
-  fprintf(stderr, "Native code for BLORP blit:\n");
-  res = generator.generate_assembly(NULL, &insts, program_size, dump_file);
-  fprintf(stderr, "\n");
-   } else {
-  res = generator.generate_assembly(NULL, &insts, program_size);
-   }
-
-   return res;
+   return generator.generate_assembly(NULL, &insts, program_size, dump_file);
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h 
b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
index e68f925..8a93f05 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h
@@ -30,7 +30,7 @@
 class brw_blorp_eu_emitter
 {
 protected:
-   explicit brw_blorp_eu_emitter(struct brw_context *brw);
+   explicit brw_blorp_eu_emitter(struct brw_context *brw, bool debug_flag);
~brw_blorp_eu_emitter();
 
const unsigned *get_program(unsigned *program_size, FILE *dump_file);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 606a160..0c9aeeb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3166,7 +3166,8 @@ brw_wm_fs_emit(struct brw_context *brw,
   assembly = g.generate_assembly(&v.instructions, simd16_instructions,
  final_assembly_size);
} else {
-  fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src);
+  fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src,
+

[Mesa-dev] [PATCH 01/23] i965/cfg: Make DO instruction begin a basic block.

2014-05-19 Thread Matt Turner

The DO instruction doesn't exist on Gen6+. Before this commit, DO always
ended a basic block, so if it also happened to start one (e.g., a while
loop inside an if statement), the block containing only the DO would
contain no hardware instructions.

Pre-Gen6's WHILE instruction jumps to the instruction following the DO,
so strictly speaking we won't be modeling that properly, but I claim
there is actually no functional difference.

This will simplify an upcoming change where we want to mark the first
hardware instruction in the loop as beginning a block, and the last
instruction before the loop as ending one.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)
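
The block boundaries described above can be sketched as a leader-marking pass. This is a minimal sketch, not the cfg_t code: the opcode enum and mark_block_leaders() are hypothetical, and only IF, ENDIF, DO, and WHILE are modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced opcode set. */
enum opcode_sketch { OP_ALU, OP_IF, OP_ENDIF, OP_DO, OP_WHILE };

/* Sets leaders[i] to true when instruction i starts a basic block. */
static void
mark_block_leaders(const enum opcode_sketch *ops, size_t n, bool *leaders)
{
   for (size_t i = 0; i < n; i++)
      leaders[i] = false;
   if (n > 0)
      leaders[0] = true;

   for (size_t i = 0; i < n; i++) {
      switch (ops[i]) {
      case OP_DO:
      case OP_ENDIF:
         /* DO and ENDIF begin a block. */
         leaders[i] = true;
         break;
      case OP_IF:
      case OP_WHILE:
         /* IF and WHILE end a block, so the next instruction begins one. */
         if (i + 1 < n)
            leaders[i + 1] = true;
         break;
      default:
         break;
      }
   }
}
```

For the stream IF, DO, ALU, WHILE, ENDIF this marks instructions 0, 1, and 4 as leaders. The DO shares its block with the loop body, so no block holding only the DO is created.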

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index a806714..6bf99f1 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -98,7 +98,7 @@ cfg_t::cfg_t(exec_list *instructions)
bblock_t *cur_if = NULL;/**< BB ending with IF. */
bblock_t *cur_else = NULL;  /**< BB ending with ELSE. */
bblock_t *cur_endif = NULL; /**< BB starting with ENDIF. */
-   bblock_t *cur_do = NULL;/**< BB ending with DO. */
+   bblock_t *cur_do = NULL;/**< BB starting with DO. */
bblock_t *cur_while = NULL; /**< BB immediately following WHILE. */
exec_list if_stack, else_stack, do_stack, while_stack;
bblock_t *next;
@@ -205,15 +205,18 @@ cfg_t::cfg_t(exec_list *instructions)
  */
 cur_while = new_block();
 
-/* Set up our immediately following block, full of "then"
- * instructions.
- */
-next = new_block();
-next->start = (backend_instruction *)inst->next;
-cur->add_successor(mem_ctx, next);
-cur_do = next;
+ if (cur->start == inst) {
+/* New block was just created; use it. */
+cur_do = cur;
+ } else {
+cur_do = new_block();
+cur_do->start = inst;
 
-set_next_block(&cur, next, ip);
+cur->end = (backend_instruction *)inst->prev;
+cur->add_successor(mem_ctx, cur_do);
+
+set_next_block(&cur, cur_do, ip - 1);
+ }
 break;
 
   case BRW_OPCODE_CONTINUE:
-- 
1.8.3.2


[Mesa-dev] [PATCH 02/23] i965: Pass in start_offset to brw_compact_instructions().

2014-05-19 Thread Matt Turner

Lets us avoid recompacting the SIMD8 instructions when we compact the
SIMD16 program.
---
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp|  2 +-
 src/mesa/drivers/dri/i965/brw_clip.c |  2 +-
 src/mesa/drivers/dri/i965/brw_eu.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_eu_compact.c   | 18 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_gs.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_sf.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  2 +-
 8 files changed, 17 insertions(+), 17 deletions(-)
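
The effect of start_offset can be sketched on a plain byte buffer. This is a simplified sketch, not brw_compact_instructions(): compact_from() is a hypothetical helper, an instruction is treated as compactable when its first byte is 1, and alignment padding and jump fixups are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Compacts 16-byte instructions in [start_offset, end_offset) in place.
 * Bytes before start_offset (e.g. an already compacted SIMD8 program)
 * are never read or written. Returns the new end offset. */
static int
compact_from(uint8_t *store, int start_offset, int end_offset)
{
   uint8_t *base = store + start_offset;
   int len = end_offset - start_offset;
   int dst = 0;

   for (int src = 0; src < len; src += 16) {
      /* Assumption for this sketch: first byte == 1 means "compactable". */
      int size = base[src] == 1 ? 8 : 16;

      /* dst never exceeds src, so later instructions are not clobbered. */
      memmove(base + dst, base + src, size);
      dst += size;
   }

   return start_offset + dst;
}
```

Calling compact_from(store, 16, 48) on three instructions leaves the first one untouched and only walks the last two, which is the saving the patch is after when the SIMD16 program follows the SIMD8 one in the same store.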

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index 28c01c4..4b2c667 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -490,7 +490,7 @@ brw_blorp_const_color_program::compile(struct brw_context 
*brw,
   fprintf(stderr, "\n");
}
 
-   brw_compact_instructions(&func);
+   brw_compact_instructions(&func, 0);
return brw_get_program(&func, program_size);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index 11f0b69..57c49f0 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -110,7 +110,7 @@ static void compile_clip_prog( struct brw_context *brw,
   return;
}
 
-   brw_compact_instructions(&c.func);
+   brw_compact_instructions(&c.func, 0);
 
/* get the program
 */
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index 51d5214..65008a0 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -410,7 +410,7 @@ uint32_t brw_swap_cmod(uint32_t cmod);
 
 /* brw_eu_compact.c */
 void brw_init_compaction_tables(struct brw_context *brw);
-void brw_compact_instructions(struct brw_compile *p);
+void brw_compact_instructions(struct brw_compile *p, int start_offset);
 void brw_uncompact_instruction(struct brw_context *brw,
   struct brw_instruction *dst,
   struct brw_compact_instruction *src);
diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c 
b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index c85bc89..c3a2ec3 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -661,18 +661,18 @@ brw_init_compaction_tables(struct brw_context *brw)
 }
 
 void
-brw_compact_instructions(struct brw_compile *p)
+brw_compact_instructions(struct brw_compile *p, int start_offset)
 {
struct brw_context *brw = p->brw;
-   void *store = p->store;
+   void *store = p->store + start_offset / 16;
/* For an instruction at byte offset 8*i before compaction, this is the 
number
 * of compacted instructions that preceded it.
 */
-   int compacted_counts[p->next_insn_offset / 8];
+   int compacted_counts[(p->next_insn_offset - start_offset) / 8];
/* For an instruction at byte offset 8*i after compaction, this is the
 * 8-byte offset it was at before compaction.
 */
-   int old_ip[p->next_insn_offset / 8];
+   int old_ip[(p->next_insn_offset - start_offset) / 8];
 
if (brw->gen < 6)
   return;
@@ -680,7 +680,7 @@ brw_compact_instructions(struct brw_compile *p)
int src_offset;
int offset = 0;
int compacted_count = 0;
-   for (src_offset = 0; src_offset < p->nr_insn * 16;) {
+   for (src_offset = 0; src_offset < p->next_insn_offset - start_offset;) {
   struct brw_instruction *src = store + src_offset;
   void *dst = store + offset;
 
@@ -734,8 +734,8 @@ brw_compact_instructions(struct brw_compile *p)
}
 
/* Fix up control flow offsets. */
-   p->next_insn_offset = offset;
-   for (offset = 0; offset < p->next_insn_offset;) {
+   p->next_insn_offset = start_offset + offset;
+   for (offset = 0; offset < p->next_insn_offset - start_offset;) {
   struct brw_instruction *insn = store + offset;
   int this_old_ip = old_ip[offset / 8];
   int this_compacted_count = compacted_counts[this_old_ip];
@@ -786,10 +786,10 @@ brw_compact_instructions(struct brw_compile *p)
 
if (0) {
   fprintf(stderr, "dumping compacted program\n");
-  brw_disassemble(brw, p->store, 0, p->next_insn_offset, stderr);
+  brw_disassemble(brw, store, 0, p->next_insn_offset - start_offset, 
stderr);
 
   int cmp = 0;
-  for (offset = 0; offset < p->next_insn_offset;) {
+  for (offset = 0; offset < p->next_insn_offset - start_offset;) {
  struct brw_instruction *insn = store + offset;
 
  if (insn->header.cmpt_control) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index c61cc5c..9518e72 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1852,7 +1852,7 @@ fs_generator::generate_assembly(exec_list 
*simd8_inst

[Mesa-dev] [PATCH 05/23] i965: Add annotation data structure and support code.

2014-05-19 Thread Matt Turner

Will be used to print disassembly after jump targets are set and
instructions are compacted, while still retaining higher-level IR
annotations and basic block information.

An array of 'struct annotation' will live alongside the generated
assembly. The generators will populate the array with their IR
annotations, and with basic block pointers if the instruction began or
ended a basic block.

We'll then update the instruction offsets when we compact instructions
and use the annotations to print the disassembly.
---
 src/mesa/drivers/dri/i965/Makefile.sources   |  1 +
 src/mesa/drivers/dri/i965/brw_blorp_clear.cpp|  2 +-
 src/mesa/drivers/dri/i965/brw_clip.c |  2 +-
 src/mesa/drivers/dri/i965/brw_eu.h   |  4 +-
 src/mesa/drivers/dri/i965/brw_eu_compact.c   | 31 -
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 +-
 src/mesa/drivers/dri/i965/brw_gs.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_sf.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  2 +-
 src/mesa/drivers/dri/i965/intel_asm_printer.c| 89 
 src/mesa/drivers/dri/i965/intel_asm_printer.h| 53 ++
 11 files changed, 183 insertions(+), 9 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/intel_asm_printer.c
 create mode 100644 src/mesa/drivers/dri/i965/intel_asm_printer.h
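
The offset update described above can be sketched as a single forward walk. This is a minimal sketch under stated assumptions: struct annotation_sketch, the parallel old_offset/new_offset arrays, and update_annotation_offsets() are hypothetical, and annotations are assumed to be sorted by offset and to point at instruction boundaries.

```c
/* Hypothetical reduced annotation: only the byte offset is modeled. */
struct annotation_sketch {
   int offset;
};

/* old_offset[i] and new_offset[i] are the byte offsets of instruction i
 * before and after compaction. Each annotation is moved from its old
 * offset to the matching new one. */
static void
update_annotation_offsets(struct annotation_sketch *ann, int num_ann,
                          const int *old_offset, const int *new_offset,
                          int num_insn)
{
   int insn = 0;

   for (int i = 0; i < num_ann; i++) {
      /* Advance to the instruction this annotation was attached to. */
      while (insn < num_insn && old_offset[insn] != ann[i].offset)
         insn++;

      if (insn == num_insn)
         return; /* no matching instruction; leave the rest untouched */

      ann[i].offset = new_offset[insn];
   }
}
```

Because both the annotations and the instructions are visited in increasing offset order, the walk is linear in the number of instructions.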

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 5fc90b5..2570059 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -3,6 +3,7 @@ i965_INCLUDES = \
$(MESA_TOP)/src/mesa/drivers/dri/intel
 
 i965_FILES = \
+   intel_asm_printer.c \
intel_batchbuffer.c \
intel_blit.c \
intel_buffer_objects.c \
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
index 4b2c667..ea0065a 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
@@ -490,7 +490,7 @@ brw_blorp_const_color_program::compile(struct brw_context 
*brw,
   fprintf(stderr, "\n");
}
 
-   brw_compact_instructions(&func, 0);
+   brw_compact_instructions(&func, 0, 0, NULL);
return brw_get_program(&func, program_size);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index 57c49f0..536c085 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -110,7 +110,7 @@ static void compile_clip_prog( struct brw_context *brw,
   return;
}
 
-   brw_compact_instructions(&c.func, 0);
+   brw_compact_instructions(&c.func, 0, 0, NULL);
 
/* get the program
 */
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index 65008a0..8ce31a1 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -37,6 +37,7 @@
 #include "brw_structs.h"
 #include "brw_defines.h"
 #include "brw_reg.h"
+#include "intel_asm_printer.h"
 #include "program/prog_instruction.h"
 
 #ifdef __cplusplus
@@ -410,7 +411,8 @@ uint32_t brw_swap_cmod(uint32_t cmod);
 
 /* brw_eu_compact.c */
 void brw_init_compaction_tables(struct brw_context *brw);
-void brw_compact_instructions(struct brw_compile *p, int start_offset);
+void brw_compact_instructions(struct brw_compile *p, int start_offset,
+  int num_annotations, struct annotation 
*annotation);
 void brw_uncompact_instruction(struct brw_context *brw,
   struct brw_instruction *dst,
   struct brw_compact_instruction *src);
diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c 
b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index c3a2ec3..40d1fc2 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -39,6 +39,7 @@
 
 #include "brw_context.h"
 #include "brw_eu.h"
+#include "intel_asm_printer.h"
 
 static const uint32_t gen6_control_index_table[32] = {
0b0,
@@ -661,7 +662,8 @@ brw_init_compaction_tables(struct brw_context *brw)
 }
 
 void
-brw_compact_instructions(struct brw_compile *p, int start_offset)
+brw_compact_instructions(struct brw_compile *p, int start_offset,
+ int num_annotations, struct annotation *annotation)
 {
struct brw_context *brw = p->brw;
void *store = p->store + start_offset / 16;
@@ -784,6 +786,33 @@ brw_compact_instructions(struct brw_compile *p, int 
start_offset)
}
p->nr_insn = p->next_insn_offset / 16;
 
+   /* Update the instruction offsets for each annotation. */
+   if (annotation) {
+  for (int offset = 0, i = 0; i < num_annotations; i++) {
+ while (start_offset + old_ip[offset / 8] * 8 != annotation[i].offset) 
{
+assert(start_offset + old_ip[offset / 8] * 8 <
+   annotation[i].offset);
+

[Mesa-dev] [PATCH 00/23] i965: Instruction compaction improvements.

2014-05-19 Thread Matt Turner

Available from

   git://people.freedesktop.org/~mattst88/mesa compaction

Highlights

   - Print disassembly after instruction compaction, while still
 having control-flow graph information and higher-level IR
 annotations.

   - Three improvements to instruction compaction increase the number
 of compacted instructions by 5.4%.

   - Adds INTEL_DEBUG=annotation to control printing of higher-level
 IR annotations. Significantly reduces the time and space
 requirements of shader-db.

   - Print instruction counts and compaction stats with each shader.
 Will let shader-db parse this directly, rather than counting
 instructions.

Re: [Mesa-dev] [PATCH] docs: update the prerequisites section

2014-05-19 Thread Ian Romanick

LGTM

Reviewed-by: Ian Romanick 

On 05/19/2014 07:17 AM, Brian Paul wrote:
> SCons is required for Windows.  Add links to flex/bison for Windows.
> Reorder items and improve formatting.
> ---
>  docs/install.html |   15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/install.html b/docs/install.html
> index 5061ede..f12425f 100644
> --- a/docs/install.html
> +++ b/docs/install.html
> @@ -34,16 +34,25 @@
>  
>  1.1 General
>  
> +http://www.python.org/";>Python - Python is required.
> +Version 2.6.4 or later should work.
> +
> +
> +http://www.scons.org/";>SCons is required for building on
> +Windows and optional for Linux (it's an alternative to autoconf/automake.)
> +
> +
>  lex / yacc - for building the GLSL compiler.
> +
> +
>  On Linux systems, flex and bison are used.
>  Versions 2.5.35 and 2.4.1, respectively, (or later) should work.
>  
>  
>  On Windows with MinGW, install flex and bison with:
>  mingw-get install msys-flex msys-bison
> -
> -python - Python is needed for building the Gallium components.
> -Version 2.6.4 or later should work.
> +For MSVC on Windows, you can find flex/bison programs on the
> +ftp://ftp.freedesktop.org/pub/mesa/windows-utils/";>Mesa ftp 
> site.
>  
>  
>  


Re: [Mesa-dev] [PATCH 1/4] Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage."

2014-05-19 Thread Ian Romanick

Thanks for the quick fix. :) Series is

Reviewed-by: Ian Romanick 

On 05/18/2014 11:12 PM, Kenneth Graunke wrote:
> This reverts commit bd44ac8b5ca08016bb064b37edaec95eccfdbcd5.
> 
> Fixes:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843
> 
> Re-breaks:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705
> but that will be fixed properly in a few commits.
> 
> Cc: "10.2" 
> ---
>  src/mesa/drivers/common/meta_blit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/common/meta_blit.c 
> b/src/mesa/drivers/common/meta_blit.c
> index e5a0a9a..beb1ea5 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -732,7 +732,7 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
> _mesa_meta_end(ctx);
>  
>  fallback:
> -   if (mask && !ctx->Meta->Blit.no_ctsi_fallback) {
> +   if (mask) {
>_swrast_BlitFramebuffer(ctx, srcX0, srcY0, srcX1, srcY1,
>dstX0, dstY0, dstX1, dstY1, mask, filter);
> }
> 


Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.

2014-05-19 Thread Ian Romanick

On 05/19/2014 06:39 AM, Marek Olšák wrote:
> You are complicating it. If we followed the specification to the
> letter, the driver would have to advertise OpenGL 1.1 instead of 2.1.
> 
> The fact r300 cannot filter floating-point textures is documented by
> the vendor and game developers (especially those who targeted D3D9)
> knew about it.
> 
> For OpenGL ES, I propose a simpler solution:
> - don't touch ARB_texture_float at all
> - add OES_texture_float to gl_extensions
> - add OES_texture_float_linear to gl_extensions
> - define OES_texture_half_float as o(OES_texture_float)
> - define OES_texture_half_float_linear as o(OES_texture_float_linear)
> 
> Then, drivers can enable the extensions as they see fit.

That sounds like a happy medium.  It seems like we could use
ARB_texture_float as the enable for OES_texture_float, but I'm not
crying over one extra flag.

It will mean that a bunch of extension checks in the code will need to
be expanded.

We'll probably also want a negative test that verifies an error is
generated for glTexParameteri(..., GL_LINEAR_MIPMAP_LINEAR) when
OES_texture_float_linear (or OES_texture_half_float_linear) is not
supported.
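
A minimal sketch of how the proposed table entries could share two flags. It assumes a reduced stand-in for struct gl_extensions; struct ext_flags, struct ext_entry, and ext_enabled() are hypothetical names, and the API and year columns of the real table are omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced stand-in for struct gl_extensions. */
struct ext_flags {
   bool OES_texture_float;
   bool OES_texture_float_linear;
};

#define o(x) offsetof(struct ext_flags, x)

struct ext_entry {
   const char *name;
   size_t offset; /* offset of the controlling flag in struct ext_flags */
};

/* The half-float entries reuse the flags of their full-float counterparts. */
static const struct ext_entry ext_table[] = {
   { "GL_OES_texture_float",             o(OES_texture_float) },
   { "GL_OES_texture_half_float",        o(OES_texture_float) },
   { "GL_OES_texture_float_linear",      o(OES_texture_float_linear) },
   { "GL_OES_texture_half_float_linear", o(OES_texture_float_linear) },
};

static bool
ext_enabled(const struct ext_flags *flags, const char *name)
{
   const unsigned char *base = (const unsigned char *)flags;

   for (size_t i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
      if (strcmp(ext_table[i].name, name) == 0)
         return *(const bool *)(base + ext_table[i].offset);
   }
   return false;
}
```

A driver that cannot filter floating-point textures would set only OES_texture_float, which advertises both non-linear extensions and neither of the linear ones.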

> Marek
> 
> On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin  
> wrote:
>> Hi,
>>
>>   Each of the four extensions is currently set to be advertised if and only
>> if a GL context would advertise GL_ARB_texture_float:
>>
>> { "GL_OES_texture_float",             o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float",        o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_float_linear",      o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float_linear", o(ARB_texture_float), ES2, 2005 },
>>
>> From my interpretation of ARB_texture_float, that extension requires both
>> 16-bit and 32-bit textures and the ability to filter such textures linearly.
>> Did I misunderstand the specification? If I got the specification correct,
>> then the r300 should not be advertising any of the extensions, for otherwise
>> it would be advertising GL_ARB_texture_float.
>>
>> However, the r300 does give an example of the ability to support some of the
>> OES extensions but not all. Previously Matt asked if there was an example or
>> need and I thought not. It turns out I was wrong and there is a need, at
>> least for the r300. Supporting that granularity is going to be a bigger patch
>> since it would require changing the data structure struct gl_extensions to
>> have four entries and in turn additional logic to combine them into
>> GL_ARB_texture_float. The correct, more laborious way to do it would be to
>> remove ARB_texture_float from gl_extensions, add a GLboolean for each of the
>> 4 OES extensions, change each driver to correctly fill them, and then add
>> logic when creating the extension string(s) to advertise
>> GL_ARB_texture_float only if each of the 4 OES extensions is TRUE. We could
>> also instead just add the 4 OES booleans and have additional logic in
>> mesa/main to set each of them to TRUE if ARB_texture_float is true. The
>> latter solution, though easier, is less clean and begging for trouble later.
>> Regardless, let's first get this patch as-is into Mesa, then do the "right"
>> thing to allow a backend to support a subset of the OES extensions without
>> needing to support the ARB extension.
>>
>> -Kevin
>>
>>
>>
>> 
>> From: Marek Olšák [mar...@gmail.com]
>> Sent: Friday, May 16, 2014 4:33 PM
>> To: Rogovin, Kevin
>> Cc: mesa-dev@lists.freedesktop.org
>> Subject: Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and 
>> GL_OES_texture_half_float.
>>
>> Sorry, I meant the linear filtering extensions.
>>
>> Marek
>>
>> On Fri, May 16, 2014 at 3:31 PM, Marek Olšák  wrote:
>>> Hi Kevin,
>>>
>>> r300g doesn't support filtering of floating-point textures, so the
>>> extension shouldn't be advertised there.
>>>
>>> Marek
>>>
>>> On Wed, May 7, 2014 at 1:18 PM, Kevin Rogovin  
>>> wrote:
  Add support for GLES2 extensions for floating point and half
  floating point textures (GL_OES_texture_float, GL_OES_texture_half_float,
  GL_OES_texture_float_linear and GL_OES_texture_half_float_linear).

 ---
  src/mesa/main/extensions.c | 12 +-
  src/mesa/main/glformats.c  | 25 
  src/mesa/main/pack.c   | 17 +
  src/mesa/main/teximage.c   | 59 
 ++
  4 files changed, 112 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
 index c2ff7e3..e39f65e 100644
 --- a/src/mesa/main/extensions.c
 +++ b/src/mesa/main/extensions.c
 @@ -301,7 +301,17 @@ static const struct extension extension_table[] = {
 { "GL_OES_t

[Mesa-dev] glsl: ideas how to improve dead code elimination?

2014-05-19 Thread Aras Pranckevicius

Hi,

When Mesa's GLSL compiler is faced with code like this:

// vec4 packednormal exists
vec3 normal;
normal.xy = packednormal.wy * 2.0 - 1.0;
normal.z = sqrt(1.0 - dot(normal.xy, normal.xy));
// now do not use the "normal" at all anywhere

Then the dead code elimination pass will not be able to eliminate the
"normal" variable, nor anything that led to it (possibly sampling textures
into packednormal, etc.).

This is because the variable refcounting visitor sees "normal" as having four
references but only two assignments, so it cannot consider the variable dead,
even though the two extra references come from the assignment to normal.z,
where both LHS & RHS reference the same variable.

Any ideas on how to improve this?


If the original code was doing something like this, then dead code
elimination is able to "properly" eliminate this whole thing:

// vec4 packednormal exists
vec3 normal;
vec2 nxy = packednormal.wy * 2.0 - 1.0;
float nz = sqrt(1.0 - dot(nxy, nxy));
normal.xy = nxy;
normal.z = nz;
// now do not use the "normal" at all anywhere
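
One possible direction, sketched on a toy model rather than on Mesa's IR: treat a variable as dead when every read of it happens inside a statement that also writes it. The types and names below are hypothetical, and the check ignores shader outputs, control flow and other side effects, so it only illustrates the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_READS 4

/* Toy assignment: one written variable plus the variables its RHS reads.
 * This is a hypothetical model, not Mesa's ir_assignment. */
struct toy_assign {
   int lhs;                 /* id of the variable being written */
   int reads[MAX_READS];    /* ids of the variables read by the RHS */
   size_t num_reads;
};

/* A variable is treated as dead when every read of it occurs in a
 * statement that also writes it, so nothing else observes its value. */
static bool
toy_var_is_dead(const struct toy_assign *stmts, size_t count, int var)
{
   for (size_t i = 0; i < count; i++) {
      if (stmts[i].lhs == var)
         continue;   /* reads here only feed the variable itself */
      assert(stmts[i].num_reads <= MAX_READS);
      for (size_t j = 0; j < stmts[i].num_reads; j++) {
         if (stmts[i].reads[j] == var)
            return false;   /* some other variable depends on it */
      }
   }
   return true;
}
```

For the first snippet, both statements write "normal", so its reads in the normal.z assignment are skipped and it is reported dead, while packednormal is still seen as read by another variable until "normal" itself has been removed.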



-- 
Aras Pranckevičius
work: http://unity3d.com
home: http://aras-p.info
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS

2014-05-19 Thread Ian Romanick

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Patch adds new implementation dependent value required by the
> GL_ARB_explicit_uniform_location extension. Default value for user
> assignable locations is calculated as sum of MaxUniformComponents
> for each stage.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/context.c  | 10 +-
>  src/mesa/main/get.c  |  1 +
>  src/mesa/main/get_hash_params.py |  1 +
>  src/mesa/main/mtypes.h   |  5 +
>  src/mesa/main/tests/enum_strings.cpp |  1 +
>  5 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index 860ae86..8b77df1 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx)
> ctx->Const.MaxUniformBlockSize = 16384;
> ctx->Const.UniformBufferOffsetAlignment = 1;
>  
> -   for (i = 0; i < MESA_SHADER_STAGES; i++)
> +   /* GL_ARB_explicit_uniform_location, initial value calculated
> +* as sum of MaxUniformComponents for each stage.
> +*/
> +   ctx->Const.MaxUserAssignableUniformLocations = 0;
> +
> +   for (i = 0; i < MESA_SHADER_STAGES; i++) {
>init_program_limits(ctx, i, &ctx->Const.Program[i]);
> +  ctx->Const.MaxUserAssignableUniformLocations +=
> + ctx->Const.Program[i].MaxUniformComponents;
> +   }

This is just going to set ctx->Const.MaxUserAssignableUniformLocations
to 4 * 4 * MAX_UNIFORMS, and that's probably not what we want.  Maybe
just set 4 * MAX_UNIFORMS with a comment saying it's, "MAX_UNIFORMS for
each possible shader stage."
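
To make the arithmetic concrete, here is a small sketch under stated assumptions: four shader stages, a per-stage component limit of 4 * MAX_UNIFORMS (as the figure above implies), and an illustrative MAX_UNIFORMS of 4096. All TOY_ names are hypothetical.

```c
#include <assert.h>

#define TOY_MAX_UNIFORMS  4096  /* assumed value, for illustration only */
#define TOY_SHADER_STAGES 4     /* assumed stage count */

/* Per-stage component limit implied by the review: 4 * MAX_UNIFORMS. */
static unsigned
toy_max_uniform_components(void)
{
   return 4 * TOY_MAX_UNIFORMS;
}

/* What the patch computes: the sum of the component limits of all stages. */
static unsigned
toy_locations_summed(void)
{
   unsigned total = 0;
   for (unsigned i = 0; i < TOY_SHADER_STAGES; i++)
      total += toy_max_uniform_components();
   /* Matches the 4 * 4 * MAX_UNIFORMS figure from the review. */
   assert(total == 4u * 4u * TOY_MAX_UNIFORMS);
   return total;
}

/* What the review suggests: MAX_UNIFORMS for each possible shader stage. */
static unsigned
toy_locations_suggested(void)
{
   return TOY_SHADER_STAGES * TOY_MAX_UNIFORMS;
}
```

With these assumed numbers the summed value is 65536 and the suggested one is 16384.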

> ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES;
> ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH;
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 6d95790..8b50441 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array);
>  EXTRA_EXT(ARB_compute_shader);
>  EXTRA_EXT(ARB_gpu_shader5);
>  EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
> +EXTRA_EXT(ARB_explicit_uniform_location);
>  
>  static const int
>  extra_ARB_color_buffer_float_or_glcore[] = {
> diff --git a/src/mesa/main/get_hash_params.py 
> b/src/mesa/main/get_hash_params.py
> index 06d0bba..5709d42 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -474,6 +474,7 @@ descriptor=[
>[ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING), NO_EXTRA" ],
>[ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA" ],
>[ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA" ],
> +  [ "MAX_UNIFORM_LOCATIONS", 
> "CONTEXT_INT(Const.MaxUserAssignableUniformLocations), NO_EXTRA" ],

Ditto on Petri's comment.

>[ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth), NO_EXTRA" ],
>[ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA" ],
>[ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA" ],
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 7ac6bbe..fefbe06 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3311,6 +3311,11 @@ struct gl_constants
> GLuint UniformBufferOffsetAlignment;
> /** @} */
>  
> +   /**
> +* GL_ARB_explicit_uniform_location
> +*/
> +   GLuint MaxUserAssignableUniformLocations;
> +
> /** GL_ARB_geometry_shader4 */
> GLuint MaxGeometryOutputVertices;
> GLuint MaxGeometryTotalOutputComponents;
> diff --git a/src/mesa/main/tests/enum_strings.cpp 
> b/src/mesa/main/tests/enum_strings.cpp
> index 3795700..298ff6a 100644
> --- a/src/mesa/main/tests/enum_strings.cpp
> +++ b/src/mesa/main/tests/enum_strings.cpp
> @@ -787,6 +787,7 @@ const struct enum_info everything[] = {
> { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" },
> { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" },
> { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" },
> +   { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" },
> { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" },
> { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" },
> { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" },
> 


Re: [Mesa-dev] [PATCH 00/10] GL_ARB_explicit_uniform_location v2

2014-05-19 Thread Ian Romanick

Patches 1, 2, and 7 are

Reviewed-by: Ian Romanick 

I sent out comments for the rest.

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Hi;
> 
> Patches implement the extension, no Piglit regressions and all the tests
> for the extension pass. Location initialization and assignment is done
> like Ian suggested, this removed quite a bit of code since now there is
> no need to store inactive uniforms temporarily.
> 
> Here's a branch with the patches:
> http://cgit.freedesktop.org/~tpalli/mesa/log/?h=exp_uniform_loc_v2
> 
> // Tapani
> 
> 
> Tapani Pälli (10):
>   glapi: add GL_ARB_explicit_uniform_location
>   mesa: add enable bit for ARB_explicit_uniform_location
>   mesa: add new enum MAX_UNIFORM_LOCATIONS
>   glsl/linker: initialize explicit uniform locations
>   glsl/linker: assign explicit uniform locations
>   mesa: support inactive uniforms in glUniform* functions
>   glsl: add enable bit for ARB_explicit_uniform_location
>   glsl: parser changes for GL_ARB_explicit_uniform_location
>   Enable GL_ARB_explicit_uniform_location in the drivers.
>   docs: update ARB_explicit_uniform_location status
> 
>  docs/GL3.txt |  2 +-
>  src/glsl/ast_to_hir.cpp  | 37 +++
>  src/glsl/glcpp/glcpp-parse.y |  3 +
>  src/glsl/glsl_lexer.ll   |  1 +
>  src/glsl/glsl_parser_extras.cpp  |  1 +
>  src/glsl/glsl_parser_extras.h| 16 +
>  src/glsl/ir_uniform.h|  5 +-
>  src/glsl/link_uniforms.cpp   | 56 ++--
>  src/glsl/linker.cpp  | 99 
> 
>  src/mapi/glapi/gen/gl_API.xml|  6 ++
>  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
>  src/mesa/main/context.c  | 10 ++-
>  src/mesa/main/extensions.c   |  1 +
>  src/mesa/main/get.c  |  1 +
>  src/mesa/main/get_hash_params.py |  1 +
>  src/mesa/main/mtypes.h   |  6 ++
>  src/mesa/main/tests/enum_strings.cpp |  1 +
>  src/mesa/main/uniform_query.cpp  | 15 +
>  src/mesa/state_tracker/st_extensions.c   |  1 +
>  19 files changed, 254 insertions(+), 9 deletions(-)
> 


Re: [Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.

2014-05-19 Thread Ian Romanick

Either this patch should:

 - Delete the extension enable flag
 - Change the table in extensions.c to use dummy_true

or

The next patch needs to not say "all drivers that support GLSL".

I think we should just enable it everywhere.

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>  src/mesa/state_tracker/st_extensions.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 15fcd30..f8abf98 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.ARB_draw_instanced = true;
> ctx->Extensions.ARB_ES2_compatibility = true;
> ctx->Extensions.ARB_explicit_attrib_location = true;
> +   ctx->Extensions.ARB_explicit_uniform_location = true;
> ctx->Extensions.ARB_fragment_coord_conventions = true;
> ctx->Extensions.ARB_fragment_program = true;
> ctx->Extensions.ARB_fragment_program_shadow = true;
> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index 3e1e45d..5b11e7b 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
> ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
> ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
> ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
> +   ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
> ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
> ctx->Extensions.ARB_fragment_program = GL_TRUE;
> ctx->Extensions.ARB_fragment_shader = GL_TRUE;
> 


Re: [Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location

2014-05-19 Thread Ian Romanick

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Patch adds a preprocessor define for the extension and stores explicit
> location data for uniforms during AST->HIR conversion. It also sets
> layout token to be available when having the extension in place.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/ast_to_hir.cpp   | 37 +
>  src/glsl/glcpp/glcpp-parse.y  |  3 +++
>  src/glsl/glsl_lexer.ll|  1 +
>  src/glsl/glsl_parser_extras.h | 14 ++
>  4 files changed, 55 insertions(+)
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 8d55ee3..7431ad7 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -2170,6 +2170,43 @@ validate_explicit_location(const struct 
> ast_type_qualifier *qual,
>  {
> bool fail = false;
>  
> +   /* Checks for GL_ARB_explicit_uniform_location. */
> +   if (qual->flags.q.uniform) {
> +

Extra blank line.

> +  if (!state->check_explicit_uniform_location_allowed(loc, var))
> + return;
> +
> +  const struct gl_context *const ctx = state->ctx;
> +  unsigned max_loc = qual->location + var->type->component_slots() - 1;

I think that over counts for this purpose, and we can blame confusing
nomenclature.  component_slots for a mat4 is 4, so a mat4 uniform counts
4*4 against the GL_MAX_VERTEX_UNIFORM_COMPONENTS limit.  However, it
only has one "location" (as returned by glGetUniformLocation), so it
only counts 1 against the GL_MAX_UNIFORM_LOCATIONS limit.
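
The distinction can be sketched with a toy type model. This is not Mesa's glsl_type; the struct and helpers are hypothetical and ignore structs, which get one location per member.

```c
#include <assert.h>

/* Toy uniform type: a hypothetical stand-in, not Mesa's glsl_type. */
struct toy_type {
   unsigned rows;          /* vector elements per column, 1..4 */
   unsigned columns;       /* 1 for scalars and vectors, 2..4 for matrices */
   unsigned array_length;  /* 0 when the uniform is not an array */
};

static unsigned
toy_element_count(const struct toy_type *t)
{
   return t->array_length ? t->array_length : 1;
}

/* Components count against limits such as GL_MAX_VERTEX_UNIFORM_COMPONENTS. */
static unsigned
toy_component_count(const struct toy_type *t)
{
   assert(t->rows >= 1 && t->columns >= 1);
   return t->rows * t->columns * toy_element_count(t);
}

/* Locations count against GL_MAX_UNIFORM_LOCATIONS: one per array element,
 * regardless of how many components each element has. */
static unsigned
toy_location_count(const struct toy_type *t)
{
   return toy_element_count(t);
}
```

In this model a mat4 uses 16 components but 1 location, and a vec3[5] uses 15 components but 5 locations, so the range check should be based on the location count.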

> +
> +  /* ARB_explicit_uniform_location specification states:
> +   *
> +   * "The explicitly defined locations and the generated locations
> +   * must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one."
> +   *
> +   * "Valid locations for default-block uniform variable locations
> +   * are in the range of 0 to the implementation-defined maximum
> +   * number of uniform locations."
> +   */
> +  if (qual->location < 0) {
> + _mesa_glsl_error(loc, state,
> +  "explicit location < 0 for uniform %s", var->name);
> + return;
> +  }
> +
> +  if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) {
> + _mesa_glsl_error(loc, state, "location qualifier for uniform %s "
> +  ">= MAX_UNIFORM_LOCATIONS (%u)",
> +  var->name,
> +  ctx->Const.MaxUserAssignableUniformLocations);
> + return;
> +  }
> +
> +  var->data.explicit_location = true;
> +  var->data.location = qual->location;
> +  return;
> +   }
> +
> /* Between GL_ARB_explicit_attrib_location an
>  * GL_ARB_separate_shader_objects, the inputs and outputs of any shader
>  * stage can be assigned explicit locations.  The checking here associates
> diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
> index f28d853..6d42138 100644
> --- a/src/glsl/glcpp/glcpp-parse.y
> +++ b/src/glsl/glcpp/glcpp-parse.y
> @@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
> *parser, intmax_t versio
> if (extensions->ARB_explicit_attrib_location)
>add_builtin_define(parser, "GL_ARB_explicit_attrib_location", 
> 1);
>  
> +   if (extensions->ARB_explicit_uniform_location)
> +  add_builtin_define(parser, "GL_ARB_explicit_uniform_location", 
> 1);
> +
> if (extensions->ARB_shader_texture_lod)
>add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1);
>  
> diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> index 7602351..83f0b6d 100644
> --- a/src/glsl/glsl_lexer.ll
> +++ b/src/glsl/glsl_lexer.ll
> @@ -393,6 +393,7 @@ layout{
> || yyextra->AMD_conservative_depth_enable
> || yyextra->ARB_conservative_depth_enable
> || yyextra->ARB_explicit_attrib_location_enable
> +   || yyextra->ARB_explicit_uniform_location_enable
>|| yyextra->has_separate_shader_objects()
> || yyextra->ARB_uniform_buffer_object_enable
> || yyextra->ARB_fragment_coord_conventions_enable
> diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
> index c53c583..20879a0 100644
> --- a/src/glsl/glsl_parser_extras.h
> +++ b/src/glsl/glsl_parser_extras.h
> @@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state {
>return true;
> }
>  
> +   bool check_explicit_uniform_location_allowed(YYLTYPE *locp,
> +const ir_variable *var)
> +   {
> +  /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */
> +  if (ctx->Version < 33 && 
> !ctx->Extensions.ARB_explicit_attrib_location) {
> + _mesa_glsl_error(locp, this, "%s explicit location requires "
> +  "GL_ARB_explicit_attrib_location extension "
>

Re: [Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations

2014-05-19 Thread Ian Romanick

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Patch refactors the existing uniform processing so explicit locations
> are taken in to account during variable processing. These locations
> are temporarily stored in gl_uniform_storage before actual locations
> are set.
> 
> The 'remap_location' variable in gl_uniform_storage is changed to be
> signed so that we can use 0 as a valid explicit location and '-1' as
> identifier that no explicit location has been defined.
> 
> When locations are set, UniformRemapTable is first populated with
> uniforms that have explicit location set (inactive and actives ones),
> rest are put after explicit location slots.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/ir_uniform.h  |  5 +++--
>  src/glsl/link_uniforms.cpp | 56 
> +-
>  2 files changed, 54 insertions(+), 7 deletions(-)
> 
> diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
> index 3508509..9dc4a8e 100644
> --- a/src/glsl/ir_uniform.h
> +++ b/src/glsl/ir_uniform.h
> @@ -181,9 +181,10 @@ struct gl_uniform_storage {
>  
> /**
>  * The 'base location' for this uniform in the uniform remap table. For
> -* arrays this is the first element in the array.
> +* arrays this is the first element in the array. It needs to be signed
> +* so that we can use 0 as valid location and -1 as initial value
>  */
> -   unsigned remap_location;
> +   int remap_location;

You could use ~0u instead of -1, right?  A #define for the magic value
will also help.

>  };
>  
>  #ifdef __cplusplus
> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
> index 29dc0b1..0f99082 100644
> --- a/src/glsl/link_uniforms.cpp
> +++ b/src/glsl/link_uniforms.cpp
> @@ -387,6 +387,9 @@ public:
> void set_and_process(struct gl_shader_program *prog,
>   ir_variable *var)
> {
> +  current_var = var;
> +  field_counter = 0;
> +
>ubo_block_index = -1;
>if (var->is_in_uniform_block()) {
>   if (var->is_interface_instance() && var->type->is_array()) {
> @@ -543,6 +546,22 @@ private:
>   return;
>}
>  
> +  /* Assign explicit locations. */
> +  if (current_var->data.explicit_location) {
> + /* Set sequential locations for struct fields. */
> + if (current_var->type->is_record()) {

I think you can accomplish the same thing with record_type != NULL.

> +const unsigned entries = MAX2(1, 
> this->uniforms[id].array_elements);
> +this->uniforms[id].remap_location =
> +   current_var->data.location + field_counter;
> +   field_counter += entries;

Weird indentation.

> + } else {
> +this->uniforms[id].remap_location = current_var->data.location;
> + }
> +  } else {
> + /* Initialize to -1 to indicate that no explicit location is set */
> + this->uniforms[id].remap_location = -1;
> +  }
> +
>this->uniforms[id].name = ralloc_strdup(this->uniforms, name);
>this->uniforms[id].type = base_type;
>this->uniforms[id].initialized = 0;
> @@ -598,6 +617,17 @@ public:
> gl_texture_index targets[MAX_SAMPLERS];
>  
> /**
> +* Current variable being processed.
> +*/
> +   ir_variable *current_var;
> +
> +   /**
> +* Field counter is used to take care that uniform structures
> +* with explicit locations get sequential locations.
> +*/
> +   unsigned field_counter;
> +
> +   /**
>  * Mask of samplers used by the current shader stage.
>  */
> unsigned shader_samplers_used;
> @@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program 
> *prog)
> prog->UniformStorage = NULL;
> prog->NumUserUniformStorage = 0;
>  
> -   ralloc_free(prog->UniformRemapTable);
> -   prog->UniformRemapTable = NULL;
> -   prog->NumUniformRemapTable = 0;
> -
> if (prog->UniformHash != NULL) {
>prog->UniformHash->clear();
> } else {
> @@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program 
> *prog)
>   sizeof(prog->_LinkedShaders[i]->SamplerTargets));
> }
>  
> -   /* Build the uniform remap table that is used to set/get uniform 
> locations */
> +   /* Reserve all the explicit locations of the active uniforms. */
> +   for (unsigned i = 0; i < num_user_uniforms; i++) {
> +  if (uniforms[i].remap_location != -1) {
> + /* How many new entries for this uniform? */
> + const unsigned entries = MAX2(1, uniforms[i].array_elements);
> +
> + /* Set remap table entries point to correct gl_uniform_storage. */
> + for (unsigned j = 0; j < entries; j++) {
> +unsigned element_loc = uniforms[i].remap_location + j;
> +assert(prog->UniformRemapTable[element_loc] ==
> +   (gl_uniform_storage *) -1);
> +prog->UniformRemapTable[element_loc] = &uniforms[i];
> + }
> +  }
> +   }
> +
> +   /* Reserve

Re: [Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions

2014-05-19 Thread Ian Romanick

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Support inactive uniforms that have explicit location set in
> glUniform* functions.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/uniform_query.cpp | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
> index 5f1af08..e33800a 100644
> --- a/src/mesa/main/uniform_query.cpp
> +++ b/src/mesa/main/uniform_query.cpp
> @@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx,
>return false;
> }
>  
> +   /* If the driver storage pointer in remap table is -1, we ignore silently.
> +*
> +* GL_ARB_explicit_uniform_location spec says:
> +* "What happens if Uniform* is called with an explicitly defined
> +* uniform location, but that uniform is deemed inactive by the
> +* linker?
> +*
> +* RESOLVED: The call is ignored for inactive uniform variables and
> +* no error is generated."
> +*
> +*/
> +   if (ctx->Extensions.ARB_explicit_uniform_location &&
> +  shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1)
> +  return false;
> +

Do we actually need to check
ctx->Extensions.ARB_explicit_uniform_location?  It seems like
UniformRemapTable will only have -1 in it for that case, right?

> _mesa_uniform_split_location_offset(shProg, location, loc, array_index);
>  
> if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) {
> 


Re: [Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations

2014-05-19 Thread Ian Romanick

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Patch initializes the UniformRemapTable for explicit locations. This
> needs to happen before optimizations to make sure all inactive uniforms
> get their explicit locations correctly.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/linker.cpp | 99 
> +
>  1 file changed, 99 insertions(+)
> 
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index 7c194a2..1b4cb63 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -74,6 +74,7 @@
>  #include "link_varyings.h"
>  #include "ir_optimization.h"
>  #include "ir_rvalue_visitor.h"
> +#include "ir_uniform.h"
>  
>  extern "C" {
>  #include "main/shaderobj.h"
> @@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct 
> gl_shader_program *prog)
>linker_error(prog, "Too many combined image uniforms and fragment 
> outputs");
>  }
>  
> +
> +/**
> + * Initializes explicit location slots point to -1 for a variable,
> + * checks for overlaps between other uniforms using explicit locations.
> + */
> +static bool
> +reserve_explicit_locations(struct gl_shader_program *prog,
> +   string_to_uint_map *map, ir_variable *var)
> +{
> +   unsigned max_loc = var->data.location + var->type->component_slots() - 1;
> +
> +   /* Resize remap table if locations do not fit in the current one. */
> +   if (max_loc + 1 > prog->NumUniformRemapTable) {
> +  prog->UniformRemapTable =
> + reralloc(prog, prog->UniformRemapTable,
> +  gl_uniform_storage *,
> +  max_loc + 1);
> +  prog->NumUniformRemapTable = max_loc + 1;
> +   }
> +
> +   for (unsigned i = 0; i < var->type->component_slots(); i++) {

You should check the code that gets generated for this.  I suspect this
will translate to a call to component_slots per iteration of the loop.
Maybe just call it once above (since it is also used to calculate max_loc).
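
Roughly like this, using a toy stand-in for component_slots() that counts its calls. Whether the compiler would hoist the real call is not something this sketch shows; it only illustrates calling it once and reusing the result. All names are hypothetical.

```c
#include <assert.h>

static unsigned toy_slot_queries;   /* counts calls, for illustration */

/* Hypothetical stand-in for var->type->component_slots(). */
static unsigned
toy_component_slots(unsigned type_size)
{
   toy_slot_queries++;
   return type_size;
}

/* Marks slots [base, base + slots) as reserved and returns the highest
 * location touched; the slot count is queried once and reused. */
static unsigned
toy_reserve(unsigned char *table, unsigned table_size,
            unsigned base, unsigned type_size)
{
   const unsigned slots = toy_component_slots(type_size);
   const unsigned max_loc = base + slots - 1;

   assert(slots > 0 && max_loc < table_size);
   for (unsigned i = 0; i < slots; i++)
      table[base + i] = 1;
   return max_loc;
}
```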

> +  unsigned loc = var->data.location + i;
> +
> +  /* Check if location is already used. */
> +  if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) {

So... -1 means that an inactive uniform has that location explicitly
assigned?  I'm inferring that from comments in the next patch. Maybe we
should have a descriptive #define

#define INACTIVE_UNIFORM_EXPLICIT_LOCATION ((gl_uniform_storage *) -1)
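
A sketch of how such a define might be used at the comparison sites, on a toy remap table. The TOY_ macro, the struct and the helper are hypothetical, and the cast goes through intptr_t to keep the conversion explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for gl_uniform_storage. */
struct toy_uniform_storage {
   const char *name;
};

/* Sentinel pointer: location reserved explicitly, uniform not active. */
#define TOY_INACTIVE_UNIFORM_EXPLICIT_LOCATION \
   ((struct toy_uniform_storage *) (intptr_t) -1)

/* True when the slot is reserved by an explicit location but has no
 * storage behind it. */
static bool
toy_location_is_inactive(struct toy_uniform_storage *const *table,
                         size_t size, size_t loc)
{
   assert(loc < size);
   return table[loc] == TOY_INACTIVE_UNIFORM_EXPLICIT_LOCATION;
}
```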

> +
> + /* Possibly same uniform from a different stage, this is ok. */
> + unsigned hash_loc;
> + if (map->get(hash_loc, var->name) && hash_loc == loc - i)
> +   continue;
> +
> + /* ARB_explicit_uniform_location specification states:
> +  *
> +  * "No two default-block uniform variables in the program can 
> have
> +  * the same location, even if they are unused, otherwise a 
> compiler
> +  * or linker error will be generated."
> +  */
> + linker_error(prog, "location qualifier "
> +  "for uniform %s "
> +  "overlaps previously used location",
> +  var->name);

Minor nit (which you can take or leave).  I usually like to have fewer
breaks in strings.  I would have split this up as:

 linker_error(prog,
  "location qualifier for uniform %s overlaps "
  "previously used location",
  var->name);


> + return false;
> +  }
> +
> +  prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1;
> +   }
> +
> +   /* Note, base location used for arrays. */
> +   map->put(var->data.location, var->name);
> +
> +   return true;
> +}
> +
> +/**
> + * Check and reserve all explicit uniform locations, called before
> + * any optimizations happen to handle also inactive uniforms and
> + * inactive array elements that may get trimmed away.
> + */
> +static void
> +check_explicit_uniform_locations(struct gl_context *ctx,
> + struct gl_shader_program *prog)
> +{
> +   if (!ctx->Extensions.ARB_explicit_uniform_location)
> +  return;
> +
> +   /* This map is used to detect if overlapping explicit locations
> +* occur with the same uniform (from different stage) or a different one.
> +*/
> +   string_to_uint_map *uniform_map = new string_to_uint_map;
> +
> +   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
> +  struct gl_shader *sh = prog->_LinkedShaders[i];
> +
> +  if (!sh)
> + continue;
> +
> +  foreach_list(node, sh->ir) {
> + ir_variable *var = ((ir_instruction *)node)->as_variable();
> + if ((var && var->data.mode == ir_var_uniform) &&
> + var->data.explicit_location) {
> +if (!reserve_explicit_locations(prog, uniform_map, var))
> +   return;
> +
> +/* Initialize locations that were allocated but left unused. */
> +for (unsigned i = 0; i < prog->NumUniformRemapTable; i++)

[Mesa-dev] [PATCH 3/7] r600g/compute: Add more NULL checks

2014-05-19 Thread Bruno Jiménez

In this case, NULL checks are added to compute_memory_grow_pool,
so it returns -1 when it fails. This makes it necessary
to handle such failures in compute_memory_finalize_pending
when it needs to grow the pool.
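
The error propagation can be sketched on a toy pool as follows. The names are hypothetical, and the sketch varies one detail: it stores realloc's result in a temporary first, so the old buffer stays owned by the pool if the allocation fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy pool: a hypothetical stand-in for compute_memory_pool. */
struct toy_pool {
   uint32_t *shadow;     /* host-side copy of the pool contents */
   size_t size_in_dw;    /* current size in 32-bit words */
};

/* Grows the host copy; returns 0 on success and -1 on failure.
 * On failure the pool keeps its old buffer and size. */
static int
toy_pool_grow(struct toy_pool *pool, size_t new_size_in_dw)
{
   assert(pool != NULL);
   if (new_size_in_dw == 0 || new_size_in_dw > SIZE_MAX / sizeof(uint32_t))
      return -1;

   uint32_t *grown = realloc(pool->shadow, new_size_in_dw * sizeof(uint32_t));
   if (grown == NULL)
      return -1;

   pool->shadow = grown;
   pool->size_in_dw = new_size_in_dw;
   return 0;
}

/* Caller side: grows only when needed and passes the failure up. */
static int
toy_pool_finalize(struct toy_pool *pool, size_t needed_in_dw)
{
   if (pool->size_in_dw < needed_in_dw) {
      if (toy_pool_grow(pool, needed_in_dw) == -1)
         return -1;
   }
   return 0;
}
```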
---
 src/gallium/drivers/r600/compute_memory_pool.c | 30 --
 src/gallium/drivers/r600/compute_memory_pool.h |  6 --
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index 7143545..e959a6d 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -160,9 +160,10 @@ struct compute_memory_item* compute_memory_postalloc_chunk(
 }
 
 /**
- * Reallocates pool, conserves data
+ * Reallocates pool, conserves data.
+ * @returns -1 if it fails, 0 otherwise
  */
-void compute_memory_grow_pool(struct compute_memory_pool* pool,
+int compute_memory_grow_pool(struct compute_memory_pool* pool,
struct pipe_context * pipe, int new_size_in_dw)
 {
COMPUTE_DBG(pool->screen, "* compute_memory_grow_pool() "
@@ -173,6 +174,8 @@ void compute_memory_grow_pool(struct compute_memory_pool* 
pool,
 
if (!pool->bo) {
compute_memory_pool_init(pool, MAX2(new_size_in_dw, 1024 * 16));
+   if (pool->shadow == NULL)
+   return -1;
} else {
new_size_in_dw += 1024 - (new_size_in_dw % 1024);
 
@@ -181,6 +184,9 @@ void compute_memory_grow_pool(struct compute_memory_pool* 
pool,
 
compute_memory_shadow(pool, pipe, 1);
pool->shadow = realloc(pool->shadow, new_size_in_dw*4);
+   if (pool->shadow == NULL)
+   return -1;
+
pool->size_in_dw = new_size_in_dw;
pool->screen->b.b.resource_destroy(
(struct pipe_screen *)pool->screen,
@@ -190,6 +196,8 @@ void compute_memory_grow_pool(struct compute_memory_pool* 
pool,
pool->size_in_dw * 4);
compute_memory_shadow(pool, pipe, 0);
}
+
+   return 0;
 }
 
 /**
@@ -213,8 +221,9 @@ void compute_memory_shadow(struct compute_memory_pool* pool,
 
 /**
  * Allocates pending allocations in the pool
+ * @returns -1 if it fails, 0 otherwise
  */
-void compute_memory_finalize_pending(struct compute_memory_pool* pool,
+int compute_memory_finalize_pending(struct compute_memory_pool* pool,
struct pipe_context * pipe)
 {
struct compute_memory_item *pending_list = NULL, *end_p = NULL;
@@ -225,6 +234,8 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
 
int64_t start_in_dw = 0;
 
+   int err = 0;
+
COMPUTE_DBG(pool->screen, "* compute_memory_finalize_pending()\n");
 
for (item = pool->item_list; item; item = item->next) {
@@ -292,7 +303,9 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
 * they aren't contiguous, so it will be impossible to allocate Item D.
 */
if (pool->size_in_dw < allocated+unallocated) {
-   compute_memory_grow_pool(pool, pipe, allocated+unallocated);
+   err = compute_memory_grow_pool(pool, pipe, 
allocated+unallocated);
+   if (err == -1)
+   return -1;
}
 
/* Loop through all the pending items, allocate space for them and
@@ -309,17 +322,20 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
need += 1024 - (need % 1024);
 
if (need > 0) {
-   compute_memory_grow_pool(pool,
+   err = compute_memory_grow_pool(pool,
pipe,
pool->size_in_dw + need);
}
else {
need = pool->size_in_dw / 10;
need += 1024 - (need % 1024);
-   compute_memory_grow_pool(pool,
+   err = compute_memory_grow_pool(pool,
pipe,
pool->size_in_dw + need);
}
+
+   if (err == -1)
+   return -1;
}
COMPUTE_DBG(pool->screen, "  + Found space for Item %p id = %u "
"start_in_dw = %u (%u bytes) size_in_dw = %u (%u 
bytes)\n",
@@ -355,6 +371,8 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
 
allocated += item->size_in_dw;
}
+
+   return 0;
 }
 
 
diff --git a/src/gallium/drivers/r600/compute_memory_pool.h 
b/src/gallium/drivers/r600/compute_memory_pool.h
index 3777e3f..e61c003 100644
--- a/src/gallium/drivers/r600

[Mesa-dev] [PATCH 7/7] r600g/compute: Use %u as the unsigned format

2014-05-19 Thread Bruno Jiménez

This fixes an issue when running cl-program-bitcoin-phatk
piglit test where some of the inputs have negative values
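
A small sketch of the effect, assuming a 32-bit unsigned int; the helper name is hypothetical and only mirrors the format string of the debug line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats one input word the way the debug line would, using %u so that
 * words with the top bit set are shown as large unsigned values. */
static int
toy_format_input(char *buf, size_t len, int index, unsigned value)
{
   assert(buf != NULL && len > 0);
   return snprintf(buf, len, "input %i : %u", index, value);
}
```

With %u, a word such as 0xFFFFFFFF is shown as 4294967295 rather than as a negative number.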
---
 src/gallium/drivers/r600/evergreen_compute.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 701bb5c..a2abf15 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -323,7 +323,7 @@ void evergreen_compute_upload_input(
memcpy(kernel_parameters_start, input, shader->input_size);
 
for (i = 0; i < (input_size / 4); i++) {
-   COMPUTE_DBG(ctx->screen, "input %i : %i\n", i,
+   COMPUTE_DBG(ctx->screen, "input %i : %u\n", i,
((unsigned*)num_work_groups_start)[i]);
}
 
-- 
1.9.3


[Mesa-dev] [PATCH 5/7] r600g/compute: Cleanup of compute_memory_pool.h

2014-05-19 Thread Bruno Jiménez

Removed compute_memory_defrag declaration because it seems
to be unimplemented.

I think that this function would have been the one that solves
the problem with fragmentation that compute_memory_finalize_pending has.

Also removed comments that are already at compute_memory_pool.c
---
 src/gallium/drivers/r600/compute_memory_pool.h | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/src/gallium/drivers/r600/compute_memory_pool.h 
b/src/gallium/drivers/r600/compute_memory_pool.h
index e61c003..c711c59 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.h
+++ b/src/gallium/drivers/r600/compute_memory_pool.h
@@ -64,32 +64,17 @@ int64_t compute_memory_prealloc_chunk(struct 
compute_memory_pool* pool, int64_t
 
 struct compute_memory_item* compute_memory_postalloc_chunk(struct 
compute_memory_pool* pool, int64_t start_in_dw); ///search for the chunk where 
we can link our new chunk after it
 
-/** 
- * reallocates pool, conserves data
- * @returns -1 if it fails, 0 otherwise
- */
 int compute_memory_grow_pool(struct compute_memory_pool* pool, struct 
pipe_context * pipe,
int new_size_in_dw);
 
-/**
- * Copy pool from device to host, or host to device
- */
 void compute_memory_shadow(struct compute_memory_pool* pool,
struct pipe_context * pipe, int device_to_host);
 
-/**
- * Allocates pending allocations in the pool
- * @returns -1 if it fails, 0 otherwise
- */
 int compute_memory_finalize_pending(struct compute_memory_pool* pool,
struct pipe_context * pipe);
-void compute_memory_defrag(struct compute_memory_pool* pool); ///Defragment 
the memory pool, always heavy memory usage
 void compute_memory_free(struct compute_memory_pool* pool, int64_t id);
 struct compute_memory_item* compute_memory_alloc(struct compute_memory_pool* 
pool, int64_t size_in_dw); ///Creates pending allocations
 
-/**
- * Transfer data host<->device, offset and size is in bytes
- */
 void compute_memory_transfer(struct compute_memory_pool* pool,
struct pipe_context * pipe, int device_to_host,
struct compute_memory_item* chunk, void* data,
-- 
1.9.3


[Mesa-dev] [PATCH 6/7] r600g/compute: align items correctly

2014-05-19 Thread Bruno Jiménez

Now, items whose size is a multiple of 1024 dw won't leave
1024 dw between themselves and the following item.

The rest of the cases are left as they were.
---
 src/gallium/drivers/r600/compute_memory_pool.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)
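
The difference between the two roundings can be sketched in isolation. This is
a minimal standalone sketch, not the driver code: old_round_up() and align_up()
are hypothetical names, and align_up() is only assumed to behave like the
align() helper pulled in from util/u_math.h (round up to a power-of-two
multiple, leaving exact multiples unchanged).

```c
#include <assert.h>
#include <stdint.h>

#define ITEM_ALIGNMENT 1024

/* Old expression from the patch: for non-negative input it always adds
 * between 1 and 1024, so an exact multiple gains a full extra 1024. */
static int64_t old_round_up(int64_t value)
{
   return value + (ITEM_ALIGNMENT - (value % ITEM_ALIGNMENT));
}

/* Hypothetical stand-in for align(): round up to a multiple of
 * 'alignment', which must be a power of two; exact multiples are kept. */
static int64_t align_up(int64_t value, int64_t alignment)
{
   assert(value >= 0);
   assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For a 1000 dw item both helpers give 1024; for a 1024 dw item the old
expression gives 2048 while align_up() gives 1024, which is the gap the
commit message describes.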

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index 01851ad..2050f28 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -30,6 +30,7 @@
 #include "util/u_transfer.h"
 #include "util/u_surface.h"
 #include "util/u_pack_color.h"
+#include "util/u_math.h"
 #include "util/u_memory.h"
 #include "util/u_inlines.h"
 #include "util/u_framebuffer.h"
@@ -41,6 +42,7 @@
 #include "evergreen_compute_internal.h"
 #include 
 
+#define ITEM_ALIGNMENT 1024
 /**
  * Creates a new pool
  */
@@ -112,8 +114,7 @@ int64_t compute_memory_prealloc_chunk(
return last_end;
}
 
-   last_end = item->start_in_dw + item->size_in_dw;
-   last_end += (1024 - last_end % 1024);
+   last_end = item->start_in_dw + align(item->size_in_dw, 
ITEM_ALIGNMENT);
}
}
 
@@ -177,7 +178,7 @@ int compute_memory_grow_pool(struct compute_memory_pool* 
pool,
if (pool->shadow == NULL)
return -1;
} else {
-   new_size_in_dw += 1024 - (new_size_in_dw % 1024);
+   new_size_in_dw = align(new_size_in_dw, ITEM_ALIGNMENT);
 
COMPUTE_DBG(pool->screen, "  Aligned size = %d (%d bytes)\n",
new_size_in_dw, new_size_in_dw * 4);
@@ -323,7 +324,7 @@ int compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
need = pool->size_in_dw / 10;
}
 
-   need += 1024 - (need % 1024);
+   need = align(need, ITEM_ALIGNMENT);
 
err = compute_memory_grow_pool(pool,
pipe,
-- 
1.9.3


[Mesa-dev] [PATCH 4/7] r600g/compute: Tidy a bit compute_memory_finalize_pending

2014-05-19 Thread Bruno Jiménez

Explanation of the changes, as requested by Tom Stellard:

Let's take need right after it is calculated as
item->size_in_dw+2048 - (pool->size_in_dw - allocated)

BEFORE:
If need is positive or 0:
we calculate need += 1024 - (need % 1024), which is like
rounding up to the next multiple of 1024; for example,
0 goes to 1024, 512 goes to 1024 as well, 1025 goes
to 2048 and so on. So now need is always positive,
we do compute_memory_grow_pool, check its output
and continue.

If need is negative:
we calculate need += 1024 - (need % 1024); in this case
we will have negative numbers, or 0 if need is in
[-1024:-1], so now we take the else, recalculate
need as need = pool->size_in_dw / 10 and
need += 1024 - (need % 1024), we do
compute_memory_grow_pool, check its output and continue.

AFTER:
If need is positive or 0:
we skip the if, calculate need += 1024 - (need % 1024),
do compute_memory_grow_pool, check its output and continue.

If need is negative:
we enter the if, and need is now pool->size_in_dw / 10.
Now we calculate need += 1024 - (need % 1024),
do compute_memory_grow_pool, check its output and continue.
---
 src/gallium/drivers/r600/compute_memory_pool.c | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)
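
The two control flows can be compared side by side in a small standalone
sketch. The function names (round_need, growth_before, growth_after) are
hypothetical and only model the amount that would be added to
pool->size_in_dw; the actual call to compute_memory_grow_pool is left out.

```c
#include <assert.h>
#include <stdint.h>

/* The rounding expression used by both versions of the function.
 * C99 '%' truncates toward zero, so for a negative 'need' the
 * remainder is zero or negative. */
static int64_t round_need(int64_t need)
{
   return need + (1024 - (need % 1024));
}

/* Control flow before the patch: round first, then branch. */
static int64_t growth_before(int64_t need, int64_t pool_size_in_dw)
{
   assert(pool_size_in_dw >= 0);
   need = round_need(need);
   if (need > 0)
      return need;
   return round_need(pool_size_in_dw / 10);
}

/* Control flow after the patch: pick the fallback first, round once. */
static int64_t growth_after(int64_t need, int64_t pool_size_in_dw)
{
   assert(pool_size_in_dw >= 0);
   if (need < 0)
      need = pool_size_in_dw / 10;
   return round_need(need);
}
```

For need values of 0, 512 and 1025 both versions return 1024, 1024 and 2048,
matching the examples in the explanation; for need = -4096 with a 20480 dw
pool both fall back to the pool-based amount (3072). Small negative values
(between -1023 and -1) depend on how '%' treats negative operands and are
worth checking separately.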

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index e959a6d..01851ad 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -319,21 +319,16 @@ int compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
int64_t need = item->size_in_dw+2048 -
(pool->size_in_dw - allocated);
 
-   need += 1024 - (need % 1024);
-
-   if (need > 0) {
-   err = compute_memory_grow_pool(pool,
-   pipe,
-   pool->size_in_dw + need);
-   }
-   else {
+   if (need < 0) {
need = pool->size_in_dw / 10;
-   need += 1024 - (need % 1024);
-   err = compute_memory_grow_pool(pool,
-   pipe,
-   pool->size_in_dw + need);
}
 
+   need += 1024 - (need % 1024);
+
+   err = compute_memory_grow_pool(pool,
+   pipe,
+   pool->size_in_dw + need);
+
if (err == -1)
return -1;
}
-- 
1.9.3


[Mesa-dev] [PATCH 2/7] r600g/compute: Adding checks for NULL after CALLOC

2014-05-19 Thread Bruno Jiménez

---
 src/gallium/drivers/r600/compute_memory_pool.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index ccbb211..7143545 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -49,6 +49,8 @@ struct compute_memory_pool* compute_memory_pool_new(
 {
struct compute_memory_pool* pool = (struct compute_memory_pool*)
CALLOC(sizeof(struct compute_memory_pool), 1);
+   if (pool == NULL)
+   return NULL;
 
COMPUTE_DBG(rscreen, "* compute_memory_pool_new()\n");
 
@@ -64,6 +66,9 @@ static void compute_memory_pool_init(struct 
compute_memory_pool * pool,
initial_size_in_dw);
 
pool->shadow = (uint32_t*)CALLOC(initial_size_in_dw, 4);
+   if (pool->shadow == NULL)
+   return;
+
pool->next_id = 1;
pool->size_in_dw = initial_size_in_dw;
pool->bo = (struct 
r600_resource*)r600_compute_buffer_alloc_vram(pool->screen,
@@ -400,6 +405,9 @@ struct compute_memory_item* compute_memory_alloc(
 
new_item = (struct compute_memory_item *)
CALLOC(sizeof(struct compute_memory_item), 1);
+   if (new_item == NULL)
+   return NULL;
+
new_item->size_in_dw = size_in_dw;
new_item->start_in_dw = -1; /* mark pending */
new_item->id = pool->next_id++;
-- 
1.9.3


[Mesa-dev] [PATCH 0/7] r600g/compute: Some cleanup patches

2014-05-19 Thread Bruno Jiménez

Hi,

First, let me introduce myself (at least more formally than just
sending some patches). My name is Bruno Jiménez, I'm studying
physics at the University of Zaragoza (Spain) and I am participating in
this year's Google Summer of Code, where I will try to improve
the compute_memory_pool, solve an annoying bug related to mappings
and do anything else that I can.

These patches are a small first cleanup. I sent the first five
some time ago, but they weren't pushed. The sixth fixes the alignment
for items whose size is a multiple of 1024 dw and the last one
corrects the format specifier used for an unsigned value.

Thanks!
Bruno

Bruno Jiménez (7):
  r600g/compute: Fixing a typo and some indentation
  r600g/compute: Adding checks for NULL after CALLOC
  r600g/compute: Add more NULL checks
  r600g/compute: Tidy a bit compute_memory_finalize_pending
  r600g/compute: Cleanup of compute_memory_pool.h
  r600g/compute: align items correctly
  r600g/compute: Use %u as the unsigned format

 src/gallium/drivers/r600/compute_memory_pool.c | 64 +-
 src/gallium/drivers/r600/compute_memory_pool.h | 17 +--
 src/gallium/drivers/r600/evergreen_compute.c   |  2 +-
 3 files changed, 46 insertions(+), 37 deletions(-)

-- 
1.9.3


[Mesa-dev] [PATCH 1/7] r600g/compute: Fixing a typo and some indentation

2014-05-19 Thread Bruno Jiménez

---
 src/gallium/drivers/r600/compute_memory_pool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index 2f0d4c8..ccbb211 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -263,7 +263,7 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
unallocated += item->size_in_dw+1024;
}
else {
-   /* The item is not pendng, so update the amount of space
+   /* The item is not pending, so update the amount of 
space
 * that has already been allocated. */
allocated += item->size_in_dw;
}
@@ -451,7 +451,7 @@ void compute_memory_transfer(
map = pipe->transfer_map(pipe, gart, 0, PIPE_TRANSFER_READ,
&(struct pipe_box) { .width = aligned_size * 4,
.height = 1, .depth = 1 }, &xfer);
-assert(xfer);
+   assert(xfer);
assert(map);
memcpy(data, map + internal_offset, size);
pipe->transfer_unmap(pipe, xfer);
-- 
1.9.3


Re: [Mesa-dev] [PATCH] llvmpipe: do IR counting for shader cache management after optimization.

2014-05-19 Thread Jose Fonseca

Looks good to me.

Jose

- Original Message -
> From: Roland Scheidegger 
> 
> 2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting
> now being done after IR optimization instead of before. Some quick analysis
> shows that there's roughly 1.5 times more IR instructions before optimization
> than after, hence the effective shader cache size got quite a bit smaller.
> Could counter this with an increase of the instruction limit but it probably
> makes more sense to count them after optimizations, so move that code.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_type.c | 20 +++-
>  src/gallium/auxiliary/gallivm/lp_bld_type.h |  2 +-
>  src/gallium/drivers/llvmpipe/lp_state_fs.c  |  4 ++--
>  3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_type.c
> b/src/gallium/auxiliary/gallivm/lp_bld_type.c
> index 9b25e15..5a80199 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_type.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_type.c
> @@ -394,7 +394,7 @@ lp_build_context_init(struct lp_build_context *bld,
>  /**
>   * Count the number of instructions in a function.
>   */
> -unsigned
> +static unsigned
>  lp_build_count_instructions(LLVMValueRef function)
>  {
> unsigned num_instrs = 0;
> @@ -414,3 +414,21 @@ lp_build_count_instructions(LLVMValueRef function)
>  
> return num_instrs;
>  }
> +
> +
> +/**
> + * Count the number of instructions in a module.
> + */
> +unsigned
> +lp_build_count_ir_module(LLVMModuleRef module)
> +{
> +   LLVMValueRef func;
> +   unsigned num_instrs = 0;
> +
> +   func = LLVMGetFirstFunction(module);
> +   while (func) {
> +  num_instrs += lp_build_count_instructions(func);
> +  func = LLVMGetNextFunction(func);
> +   }
> +   return num_instrs;
> +}
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_type.h
> b/src/gallium/auxiliary/gallivm/lp_bld_type.h
> index d0b490b..191cf92 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_type.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_type.h
> @@ -447,7 +447,7 @@ lp_build_context_init(struct lp_build_context *bld,
>  
>  
>  unsigned
> -lp_build_count_instructions(LLVMValueRef function);
> +lp_build_count_ir_module(LLVMModuleRef module);
>  
>  
>  #endif /* !LP_BLD_TYPE_H */
> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
> b/src/gallium/drivers/llvmpipe/lp_state_fs.c
> index 4872e0d..0b74d15 100644
> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
> @@ -2438,8 +2438,6 @@ generate_fragment(struct llvmpipe_context *lp,
> LLVMBuildRetVoid(builder);
>  
> gallivm_verify_function(gallivm, function);
> -
> -   variant->nr_instrs += lp_build_count_instructions(function);
>  }
>  
>  
> @@ -2629,6 +2627,8 @@ generate_variant(struct llvmpipe_context *lp,
>  
> gallivm_compile_module(variant->gallivm);
>  
> +   variant->nr_instrs += lp_build_count_ir_module(variant->gallivm->module);
> +
> if (variant->function[RAST_EDGE_TEST]) {
>variant->jit_function[RAST_EDGE_TEST] = (lp_jit_frag_func)
>  gallivm_jit_function(variant->gallivm,
> --
> 1.9.1
> 

Re: [Mesa-dev] [PATCH] targets/opencl: Fix (static) linking with LLVM

2014-05-19 Thread Kai Wasserbäch

Michel Dänzer wrote on 19.05.2014 04:12:
> On 18.05.2014 18:37, Kai Wasserbäch wrote:
>>
>> And instead of just not starting, my X starts crashing, whenever
>> libGL fails to load a (32 bit) driver.
> 
> FWIW, some potential alternatives for avoiding the X crashes:
> 
> With current xserver Git master, you can pass the -iglx parameter to
> Xorg to prohibit GLX indirect rendering.
> 
> Or just make sure the 32-bit swrast_dri.so works.

Thanks a lot for those pointers. I think my swrast failed because it had picked
up some newer SO_VERSION as well. Which would bring me back to static linking.

Kind regards,
Kai Wasserbäch




[Mesa-dev] [PATCH] targets/opencl: Fix (static) linking with LLVM (v2)

2014-05-19 Thread Kai Wasserbäch

Without this, I get linking failures (static linking).

The static linking is sort of required for me, because otherwise Steam and
applications using the Steam runtime regularly fail because my LLVM was
compiled and linked against a newer libgcc_s, libstdc++, etc. and uses
features from those newer versions. And instead of Steam just not
starting, my X starts crashing, whenever libGL fails to load a (32 bit)
driver.

Since I hate crashes of X and I don't think Valve/Steam will behave like
a proper distribution soon (rebuilds versus current Debian Testing, since
they base their Steam OS off that), I need a radeonsi which carries its
own LLVM within and doesn't care about what the runtime sets. This means
linking Mesa statically.

v1 → v2: Move logic to configure.ac

Signed-off-by: Kai Wasserbäch 
---

Dear Emil,
I hope this is the right place for adding the two additional modules.

If you accept this patch, please push it for me, I don't have commit access.

Cheers,
Kai



 configure.ac | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/configure.ac b/configure.ac
index 4e4d761..b4920ba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1658,6 +1658,13 @@ if test "x$enable_gallium_llvm" = xyes; then
 if $LLVM_CONFIG --components | grep -qw 'option'; then
 LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
 fi
+# Current OpenCL/Clover and LLVM 3.5 require ObjCARCOpts and 
ProfileData
+if $LLVM_CONFIG --components | grep -qw 'objcarcopts'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} objcarcopts"
+fi
+if $LLVM_CONFIG --components | grep -qw 'profiledata'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} profiledata"
+fi
 fi
 DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT 
-DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
 MESA_LLVM=1
-- 
2.0.0.rc2


[Mesa-dev] [Bug 78914] New: Front/Backfaces do not cover the same pixels when rasterized

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78914

  Priority: medium
Bug ID: 78914
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Front/Backfaces do not cover the same pixels when
rasterized
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: florianl...@gmail.com
  Hardware: Other
Status: NEW
   Version: 10.1
 Component: Mesa core
   Product: Mesa

When trying to run my GLSL raycaster with Mesa/llvmpipe, I noticed artifacts
caused by wrong ray start/end positions. I render the ray start/end positions as
boxes where the front face is the start position and the back face is the end
position. The problem seems to be that the front/back faces do not cover the
exact same pixels in the framebuffer, even when they share the same edges.
On NVidia/ATI cards with the native drivers, the back faces cover the same pixels
as the front faces.

This can be reproduced by rendering a triangle with culling turned off and
blending turned on. When rotating the triangle one can see the artifacts on the
borders of the triangle.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] docs: update the prerequisites section

2014-05-19 Thread Brian Paul

SCons is required for Windows.  Add links to flex/bison for Windows.
Reorder items and improve formatting.
---
 docs/install.html |   15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/docs/install.html b/docs/install.html
index 5061ede..f12425f 100644
--- a/docs/install.html
+++ b/docs/install.html
@@ -34,16 +34,25 @@
 
 1.1 General
 
+http://www.python.org/";>Python - Python is required.
+Version 2.6.4 or later should work.
+
+
+http://www.scons.org/";>SCons is required for building on
+Windows and optional for Linux (it's an alternative to autoconf/automake.)
+
+
 lex / yacc - for building the GLSL compiler.
+
+
 On Linux systems, flex and bison are used.
 Versions 2.5.35 and 2.4.1, respectively, (or later) should work.
 
 
 On Windows with MinGW, install flex and bison with:
 mingw-get install msys-flex msys-bison
-
-python - Python is needed for building the Gallium components.
-Version 2.6.4 or later should work.
+For MSVC on Windows, you can find flex/bison programs on the
+ftp://ftp.freedesktop.org/pub/mesa/windows-utils/";>Mesa ftp site.
 
 
 
-- 
1.7.10.4


Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.

2014-05-19 Thread Marek Olšák

You are complicating it. If we followed the specification to the
letter, the driver would have to advertise OpenGL 1.1 instead of 2.1.

The fact that r300 cannot filter floating-point textures is documented by
the vendor, and game developers (especially those who targeted D3D9)
knew about it.

For OpenGL ES, I propose a simpler solution:
- don't touch ARB_texture_float at all
- add OES_texture_float to gl_extensions
- add OES_texture_float_linear to gl_extensions
- define OES_texture_half_float as o(OES_texture_float)
- define OES_texture_half_float_linear as o(OES_texture_float_linear)

Then, drivers can enable the extensions as they see fit.
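
A rough sketch of how such a table could look, assuming a trimmed-down
stand-in for struct gl_extensions. The names sketch_extensions, sketch_entry,
sketch_table and sketch_is_advertised are hypothetical and exist only for
illustration; the real table in extensions.c carries more columns (API mask,
year).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for Mesa's struct gl_extensions. */
struct sketch_extensions {
   bool ARB_texture_float;          /* left untouched by the proposal */
   bool OES_texture_float;          /* new flag: float and half-float */
   bool OES_texture_float_linear;   /* new flag: linear filtering     */
};

#define o(x) offsetof(struct sketch_extensions, x)

struct sketch_entry {
   const char *name;
   size_t offset;   /* offset of the enabling flag in the struct */
};

/* The half-float entries reuse the flags of the float entries. */
static const struct sketch_entry sketch_table[] = {
   { "GL_OES_texture_float",             o(OES_texture_float) },
   { "GL_OES_texture_half_float",        o(OES_texture_float) },
   { "GL_OES_texture_float_linear",      o(OES_texture_float_linear) },
   { "GL_OES_texture_half_float_linear", o(OES_texture_float_linear) },
};

/* Look up a name in the table and read the flag it points at. */
static bool sketch_is_advertised(const struct sketch_extensions *ext,
                                 const char *name)
{
   size_t i;

   assert(ext != NULL && name != NULL);
   for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
      if (strcmp(sketch_table[i].name, name) == 0)
         return *(const bool *)((const char *)ext + sketch_table[i].offset);
   }
   return false;
}
```

With this layout a driver that cannot filter float textures would set only
OES_texture_float, and both linear entries would stay hidden.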

Marek

On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin  wrote:
> Hi,
>
>   Each of the four extensions are right now set to be advertised if and only 
> if a GL context would advertise GL_ARB_texture_float:
>
> { "GL_OES_texture_float",   o(ARB_texture_float), 
>   ES2,2005 },
> { "GL_OES_texture_half_float",  o(ARB_texture_float), 
>   ES2,2005 },
> { "GL_OES_texture_float_linear",o(ARB_texture_float), 
>   ES2,2005 },
> { "GL_OES_texture_half_float_linear",   o(ARB_texture_float), 
>   ES2,2005 },
>
> From my interpretation of ARB_texture_float, that extension requires both 
> 16-bit and 32-bit textures and ability to filter linearly such textures. Did 
> I misunderstand the specification? If I got the specification correct, then 
> the r300 should not be advertising any of the extensions for otherwise it 
> would be advertising GL_ARB_texture_float.
>
> However, the r300 does give an example of ability to support some of the OES 
> extensions but not all. Previously Matt asked if there is an example or need and 
> I thought not. It turns out I was wrong and there is a need at least for the 
> r300. Supporting that granularity is going to be a bigger patch since it 
> would require changing the data structure struct gl_extensions to have four 
> entries and in turn additional logic to combine them to GL_ARB_texture_float. 
> The correct and more work way to do it would be to remove ARB_texture_float 
> from gl_extension, add a GLboolean for each of the 4 OES extensions, change 
> each driver to correctly fill them and then additional logic in creating 
> extension string(s) to check if each of the 4 OES extensions are TRUE then to 
> advertise GL_ARB_texture_float; we could also instead just add the 4 OES 
> booleans and have additional logic in mesa/main to set them each to TRUE if 
> ARB_texture_float is true. The latter solution though easier is less clean 
> and begging for trouble later. Regardless, lets first get this patch as-is 
> into Mesa, then do the "right" thing to allow a backend to support a subset 
> of the OES extensions without needing to support the ARB extension.
>
> -Kevin
>
>
>
> 
> From: Marek Olšák [mar...@gmail.com]
> Sent: Friday, May 16, 2014 4:33 PM
> To: Rogovin, Kevin
> Cc: mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and 
> GL_OES_texture_half_float.
>
> Sorry, I meant the linear filtering extensions.
>
> Marek
>
> On Fri, May 16, 2014 at 3:31 PM, Marek Olšák  wrote:
>> Hi Kevin,
>>
>> r300g doesn't support filtering of floating-point textures, so the
>> extension shouldn't be advertised there.
>>
>> Marek
>>
>> On Wed, May 7, 2014 at 1:18 PM, Kevin Rogovin  
>> wrote:
>>>  Add support for GLES2 extensions for floating point and half
>>>  floating point textures (GL_OES_texture_float, GL_OES_texture_half_float,
>>>  GL_OES_texture_float_linear and GL_OES_texture_half_float_linear).
>>>
>>> ---
>>>  src/mesa/main/extensions.c | 12 +-
>>>  src/mesa/main/glformats.c  | 25 
>>>  src/mesa/main/pack.c   | 17 +
>>>  src/mesa/main/teximage.c   | 59 
>>> ++
>>>  4 files changed, 112 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
>>> index c2ff7e3..e39f65e 100644
>>> --- a/src/mesa/main/extensions.c
>>> +++ b/src/mesa/main/extensions.c
>>> @@ -301,7 +301,17 @@ static const struct extension extension_table[] = {
>>> { "GL_OES_texture_mirrored_repeat", o(dummy_true),  
>>>  ES1,   2005 },
>>> { "GL_OES_texture_npot",
>>> o(ARB_texture_non_power_of_two), ES1 | ES2, 2005 },
>>> { "GL_OES_vertex_array_object", o(dummy_true),  
>>>  ES1 | ES2, 2010 },
>>> -
>>> +   /*
>>> +* TODO:
>>> +*  - rather than have an all or nothing approach for floating point 
>>> textures,
>>> +*allow for driver to specify what parts of floating point texture 
>>> functionality
>>> +*is suppo

[Mesa-dev] [Bug 78892] New: configure: error: Could not find clang internal header stddef.h in /usr/lib64/llvm/clang/3.4 Use --with-clang-libdir to specify the correct path to the clang libraries.

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78892

  Priority: medium
Bug ID: 78892
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: configure: error: Could not find clang internal header
stddef.h in /usr/lib64/llvm/clang/3.4 Use
--with-clang-libdir to specify the correct path to the
clang libraries.
  Severity: blocker
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

mesa: 9e74de884a0595e577ebdfb7c7c13f4fd4d4dff5 (master 10.3.0-devel)

configure error on Fedora 21. stddef.h is located at
/usr/lib/clang/3.4/include/stddef.h.

$ ./autogen.sh --enable-opencl
[...]
checking for llvm-config... /usr/bin/llvm-config
configure: error: Could not find clang internal header stddef.h in
/usr/lib64/llvm/clang/3.4 Use --with-clang-libdir to specify the correct path
to the clang libraries.


[Mesa-dev] [PATCH] define GL_OES_standard_derivatives if extension is supported

2014-05-19 Thread kevin . rogovin

From: Kevin Rogovin 

Define the macro GL_OES_standard_derivatives as 1 if the extension
GL_OES_standard_derivatives is supported.

---
 src/glsl/glcpp/glcpp-parse.y | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 9887583..83f6f46 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2067,6 +2067,8 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
   if (extensions != NULL) {
  if (extensions->OES_EGL_image_external)
 add_builtin_define(parser, "GL_OES_EGL_image_external", 1);
+  if (extensions->OES_standard_derivatives) 
+ add_builtin_define(parser, "GL_OES_standard_derivatives", 1);
   }
} else {
   add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
-- 
1.8.1.2


[Mesa-dev] [Bug 78888] test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=7

Vinson Lee  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Vinson Lee  ---
commit 9e74de884a0595e577ebdfb7c7c13f4fd4d4dff5
Author: Vinson Lee 
Date:   Mon May 19 00:39:12 2014 -0700

i965: Rename brw_disasm to brw_disassemble_inst.

Fixes build error introduced with commit
4b04152db055babb8b06929a0c9ebea5c7f4fb92.

  CC   test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function
‘brw_disasm’ [-Werror=implicit-function-declaration]
   brw_disasm(stderr, &src, brw->gen, false);
   ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=7
Signed-off-by: Vinson Lee 


Re: [Mesa-dev] [PATCH] i965: rename brw_disasm to brw_disassemble_inst in test_eu_compact

2014-05-19 Thread Pohjolainen, Topi

On Mon, May 19, 2014 at 10:31:56AM +0300, Tapani Pälli wrote:
> (forgotten from commit 4b04152d)
> 
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=7
> ---
>  src/mesa/drivers/dri/i965/test_eu_compact.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/test_eu_compact.c 
> b/src/mesa/drivers/dri/i965/test_eu_compact.c
> index 8713918..231487d 100644
> --- a/src/mesa/drivers/dri/i965/test_eu_compact.c
> +++ b/src/mesa/drivers/dri/i965/test_eu_compact.c
> @@ -51,7 +51,7 @@ test_compact_instruction(struct brw_compile *p, struct 
> brw_instruction src)
>if (memcmp(&unchanged, &dst, sizeof(dst))) {
>fprintf(stderr, "Failed to compact, but dst changed\n");
>fprintf(stderr, "  Instruction: ");
> -  brw_disasm(stderr, &src, brw->gen, false);
> +  brw_disassemble_inst(stderr, &src, brw->gen, false);
>return false;
>}
> }
> -- 
> 1.8.3.1
> 

[Mesa-dev] [PATCH] i965: rename brw_disasm to brw_disassemble_inst in test_eu_compact

2014-05-19 Thread Tapani Pälli

(forgotten from commit 4b04152d)

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=7
---
 src/mesa/drivers/dri/i965/test_eu_compact.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/test_eu_compact.c 
b/src/mesa/drivers/dri/i965/test_eu_compact.c
index 8713918..231487d 100644
--- a/src/mesa/drivers/dri/i965/test_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/test_eu_compact.c
@@ -51,7 +51,7 @@ test_compact_instruction(struct brw_compile *p, struct 
brw_instruction src)
   if (memcmp(&unchanged, &dst, sizeof(dst))) {
 fprintf(stderr, "Failed to compact, but dst changed\n");
 fprintf(stderr, "  Instruction: ");
-brw_disasm(stderr, &src, brw->gen, false);
+brw_disassemble_inst(stderr, &src, brw->gen, false);
 return false;
   }
}
-- 
1.8.3.1


Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.

2014-05-19 Thread Rogovin, Kevin

Hi

> It should be possible to adapt some of the existing float texture tests
> to run on ES mode without too much effort.

Oh dear, the test makes the GL API convert between 16 and 32 bit float formats. 
It also does not appear to test filtering.

Would it be prudent to make 4 tests, one for each extension, or to fold them 
together? 

-Kevin

[Mesa-dev] [Bug 78888] New: test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=7

  Priority: medium
Bug ID: 7
  Keywords: regression
CC: kenn...@whitecape.org
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: test_eu_compact.c:54:3: error: implicit declaration of
function ‘brw_disasm’
[-Werror=implicit-function-declaration]
  Severity: blocker
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

mesa: 13edd5f6160fce73369afbbf937b5e7ef646a4cc (master 10.3.0-devel)

$ make check
[...]
  CC   test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’
[-Werror=implicit-function-declaration]
   brw_disasm(stderr, &src, brw->gen, false);
   ^


[Mesa-dev] [PATCH] i965/fbo: Only try stencil meta blits on gen >= 8

2014-05-19 Thread Topi Pohjolainen

I don't have an ILK at hand but the fix should be trivial.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872
Cc: "10.2" 
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_fbo.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 5ff4263..6c99de9 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -865,6 +865,8 @@ intel_blit_framebuffer(struct gl_context *ctx,
GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
GLbitfield mask, GLenum filter)
 {
+   struct brw_context *brw = brw_context(ctx);
+
/* Page 679 of OpenGL 4.4 spec says:
 *"Added BlitFramebuffer to commands affected by conditional rendering 
in
 * section 10.10 (Bug 9562)."
@@ -872,14 +874,14 @@ intel_blit_framebuffer(struct gl_context *ctx,
if (!_mesa_check_conditional_render(ctx))
   return;
 
-   mask = brw_blorp_framebuffer(brw_context(ctx),
+   mask = brw_blorp_framebuffer(brw,
 srcX0, srcY0, srcX1, srcY1,
 dstX0, dstY0, dstX1, dstY1,
 mask, filter);
if (mask == 0x0)
   return;
 
-   if (mask & GL_STENCIL_BUFFER_BIT) {
+   if (brw->gen >= 8 && (mask & GL_STENCIL_BUFFER_BIT)) {
   brw_meta_fbo_stencil_blit(brw_context(ctx),
 srcX0, srcY0, srcX1, srcY1,
 dstX0, dstY0, dstX1, dstY1);
-- 
1.8.3.1


[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78773

--- Comment #4 from Tapani Pälli  ---
Jan, could you please add full log output from doom3 when you run it.


[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start

2014-05-19 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=78773

Tapani Pälli  changed:

   What|Removed |Added

 CC||lem...@gmail.com

--- Comment #3 from Tapani Pälli  ---
Is there anything different in the BFG edition other than more content, i.e. is
the engine the same? I've just tested regular Doom3 with today's Mesa master on
Intel IVB and everything works fine; the log also says "...using GL_ARB_multitexture".


Re: [Mesa-dev] [PATCH 05/12] gallivm: Use LLVM global context.

2014-05-19 Thread Michel Dänzer

On 19.05.2014 15:03, Mathias Fröhlich wrote:
> 
> I tried to get my local llvm install again to a point where I can see
> backtrace information, but still failed to get valgrind/massif to print
> these nice backtraces. All of the llvm addresses are not resolved so far.

You may want to try some or all of these parameters for LLVM's configure:

'--enable-optimized' '--with-optimize-option=-fno-omit-frame-pointer -O2
[...]' '--enable-assertions' '--enable-debug-runtime'
'--enable-debug-symbols' 'CC=gcc' 'CXX=g++'


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer