date:20130812

[Mesa-dev] [PATCH] tgsi: finish declaration parsing for arrays.

2013-08-12 Thread Dave Airlie

From: Dave Airlie 

I previously fixed this partly in 9e8400f4c95bde1f955c7977066583b507159a10,
but I didn't test it thoroughly enough. With this patch, when I parse a TGSI
shader with arrays in it, my iterator sees the ArrayID set to the proper value.

Signed-off-by: Dave Airlie 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 523430b..fa18462 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -124,6 +124,7 @@ tgsi_build_declaration(
unsigned semantic,
unsigned invariant,
unsigned local,
+   unsigned array,
struct tgsi_header *header )
 {
struct tgsi_declaration declaration;
@@ -139,7 +140,7 @@ tgsi_build_declaration(
declaration.Semantic = semantic;
declaration.Invariant = invariant;
declaration.Local = local;
-
+   declaration.Array = array;
header_bodysize_grow( header );
 
return declaration;
@@ -339,6 +340,21 @@ tgsi_default_declaration_array( void )
return a;
 }
 
+static struct tgsi_declaration_array
+tgsi_build_declaration_array(unsigned arrayid,
+ struct tgsi_declaration *declaration,
+ struct tgsi_header *header)
+{
+   struct tgsi_declaration_array da;
+
+   da = tgsi_default_declaration_array();
+   da.ArrayID = arrayid;
+
+   declaration_grow(declaration, header);
+
+   return da;
+}
+
 struct tgsi_full_declaration
 tgsi_default_full_declaration( void )
 {
@@ -379,6 +395,7 @@ tgsi_build_full_declaration(
   full_decl->Declaration.Semantic,
   full_decl->Declaration.Invariant,
   full_decl->Declaration.Local,
+  full_decl->Declaration.Array,
   header );
 
if (maxsize <= size)
@@ -472,6 +489,19 @@ tgsi_build_full_declaration(
  header);
}
 
+   if (full_decl->Declaration.Array) {
+  struct tgsi_declaration_array *da;
+
+  if (maxsize <= size) {
+ return 0;
+  }
+  da = (struct tgsi_declaration_array *)&tokens[size];
+  size++;
+  *da = tgsi_build_declaration_array(
+ full_decl->Array.ArrayID,
+ declaration,
+ header);
+   }
return size;
 }
 
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
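
For readers following along, the token-append step in the patch above can be
sketched in isolation. This is a simplified illustration, not Mesa code: the
names decl_sketch and append_array_token are hypothetical, and the array token
is modelled as a plain 32-bit integer rather than the real
struct tgsi_declaration_array. It only shows the bounds check that precedes
writing an optional trailing token.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a declaration carrying an "Array" flag. */
struct decl_sketch {
   unsigned array;     /* nonzero when an array token must follow */
   unsigned array_id;  /* value to store in that token */
};

/*
 * Append the optional array token after the `size` tokens already written.
 * Returns the new token count, or 0 when the buffer is full, mirroring the
 * "if (maxsize <= size) return 0" check in the patch.
 */
static unsigned
append_array_token(uint32_t *tokens, unsigned size, unsigned maxsize,
                   const struct decl_sketch *decl)
{
   assert(tokens != NULL && decl != NULL);

   if (decl->array) {
      if (maxsize <= size)
         return 0;                 /* no room left for the extra token */
      tokens[size] = decl->array_id;
      size++;
   }
   return size;
}
```

A declaration without the flag leaves the count unchanged, so callers can run
the same code path for array and non-array declarations.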

Re: [Mesa-dev] [PATCH 12/20] radeonsi: implement MSAA colorbuffer compression for rendering

2013-08-12 Thread Michel Dänzer

On Sat, 2013-08-10 at 18:31 +0200, Marek Olšák wrote:
> v2: simplify flushing in si_context_flush

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


Re: [Mesa-dev] [PATCH] radeonsi: scanout buffers cannot be a destination of MSAA resolve

2013-08-12 Thread Michel Dänzer

On Sat, 2013-08-10 at 18:53 +0200, Marek Olšák wrote:
> Resolving to scanout buffers just doesn't work.

Reviewed-by: Michel Dänzer 



[Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-12 Thread ville . syrjala

From: Ville Syrjälä 

IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.

pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most
other things show less gains/no regressions, except furmark which
loses some 10 points.

I didn't have a BYT at hand for testing.

Signed-off-by: Ville Syrjälä 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 2 +-
 src/mesa/drivers/dri/i965/brw_misc_state.c| 2 +-
 src/mesa/drivers/dri/i965/gen6_blorp.cpp  | 4 ++--
 src/mesa/drivers/dri/i965/gen7_blorp.cpp  | 6 +++---
 src/mesa/drivers/dri/i965/gen7_misc_state.c   | 2 +-
 src/mesa/drivers/dri/i965/gen7_vs_state.c | 2 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 4 ++--
 8 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 897e733..fe840d7 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -658,7 +658,7 @@ static void brw_emit_vertices(struct brw_context *brw)
 if (brw->gen >= 7)
dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
 
-if (brw->is_haswell)
+if (brw->gen == 7)
dw0 |= GEN7_MOCS_L3 << 16;
 
 OUT_BATCH(dw0 | (buffer->stride << BRW_VB0_PITCH_SHIFT));
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 5927b9b..3884f86 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -1038,7 +1038,7 @@ static void upload_state_base_address( struct brw_context *brw )
 */
 
if (brw->gen >= 6) {
-  uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+  uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
 
   if (brw->gen == 6)
 intel_emit_post_sync_nonzero_flush(brw);
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index af0f6fc..3c06a3f 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -74,7 +74,7 @@ void
 gen6_blorp_emit_state_base_address(struct brw_context *brw,
const brw_blorp_params *params)
 {
-   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
 
BEGIN_BATCH(10);
OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
@@ -165,7 +165,7 @@ gen6_blorp_emit_vertices(struct brw_context *brw,
   if (brw->gen >= 7)
  dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
 
-  if (brw->is_haswell)
+  if (brw->gen == 7)
  dw0 |= GEN7_MOCS_L3 << 16;
 
   BEGIN_BATCH(batch_length);
diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index 518d7f5..a9d6198 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -143,7 +143,7 @@ gen7_blorp_emit_surface_state(struct brw_context *brw,
 */
struct intel_region *region = surface->mt->region;
uint32_t tile_x, tile_y;
-   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
 
uint32_t tiling = surface->map_stencil_as_y_tiled
   ? I915_TILING_Y : region->tiling;
@@ -616,7 +616,7 @@ gen7_blorp_emit_constant_ps(struct brw_context *brw,
 const brw_blorp_params *params,
 uint32_t wm_push_const_offset)
 {
-   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
 
/* Make sure the push constants fill an exact integer number of
 * registers.
@@ -658,7 +658,7 @@ static void
 gen7_blorp_emit_depth_stencil_config(struct brw_context *brw,
  const brw_blorp_params *params)
 {
-   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
uint32_t surfwidth, surfheight;
uint32_t surftype;
unsigned int depth = MAX2(params->depth.mt->logical_depth0, 1);
diff --git a/src/mesa/drivers/dri/i965/gen7_misc_state.c b/src/mesa/drivers/dri/i965/gen7_misc_state.c
index 51067b3..10619c1 100644
--- a/src/mesa/drivers/dri/i965/gen7_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_misc_state.c
@@ -41,7 +41,7 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
 uint32_t tile_x, uint32_t tile_y)
 {
struct gl_context *ctx = &brw->ctx;
-   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
struct gl_framebuffer *fb = ctx->DrawBuffer;
uint32_t surftype;
unsigned int depth = 1;
diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c b/src/mesa/drivers/dri/i965/gen7_vs_state.c
index 0340da4..20f3f58 100644
--- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
+

[Mesa-dev] [PATCH 1/3] i965/hsw: Populate MOCS for STATE_BASE_ADDRESS

2013-08-12 Thread ville . syrjala

From: Ville Syrjälä 

Just spotted these unpopulated MOCS fields when comparing the code
against BSpec. No idea if this makes any difference anywhere, or if
it even makes any sense.

Signed-off-by: Ville Syrjälä 
---
 src/mesa/drivers/dri/i965/brw_misc_state.c | 4 +++-
 src/mesa/drivers/dri/i965/gen6_blorp.cpp   | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 3bf37b9..5927b9b 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -1038,13 +1038,15 @@ static void upload_state_base_address( struct brw_context *brw )
 */
 
if (brw->gen >= 6) {
+  uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
   if (brw->gen == 6)
 intel_emit_post_sync_nonzero_flush(brw);
 
BEGIN_BATCH(10);
OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
/* General state base address: stateless DP read/write requests */
-   OUT_BATCH(1);
+   OUT_BATCH(1 | (mocs << 8) | (mocs << 4));
/* Surface state base address:
* BINDING_TABLE_STATE
* SURFACE_STATE
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index a4a9081..af0f6fc 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -74,9 +74,11 @@ void
 gen6_blorp_emit_state_base_address(struct brw_context *brw,
const brw_blorp_params *params)
 {
+   uint8_t mocs = brw->is_haswell ? GEN7_MOCS_L3 : 0;
+
BEGIN_BATCH(10);
OUT_BATCH(CMD_STATE_BASE_ADDRESS << 16 | (10 - 2));
-   OUT_BATCH(1); /* GeneralStateBaseAddressModifyEnable */
+   OUT_BATCH(1 | (mocs << 8) | (mocs << 4)); /* GeneralStateBaseAddressModifyEnable */
/* SurfaceStateBaseAddress */
OUT_RELOC(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0, 1);
/* DynamicStateBaseAddress */
-- 
1.8.1.5

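
The dword packing that patch 1/3 introduces can be sketched on its own. This is
an illustration only: SKETCH_MOCS_L3 is an assumed placeholder value (the real
GEN7_MOCS_L3 is defined in the i965 driver headers), general_state_dword is a
hypothetical helper, and the shifts of 8 and 4 are copied from the patch
without restating which BSpec fields they address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, for illustration only; not taken from the driver headers. */
#define SKETCH_MOCS_L3 1u

/*
 * Build the general-state dword the way the patch does: bit 0 is the
 * modify-enable bit, and the same MOCS value is OR-ed in at shifts 8 and 4.
 * On non-Haswell parts mocs is 0, so the dword stays at plain 1 as before.
 */
static uint32_t
general_state_dword(int is_haswell)
{
   uint32_t mocs = is_haswell ? SKETCH_MOCS_L3 : 0u;
   uint32_t dw = 1u | (mocs << 8) | (mocs << 4);

   assert(dw & 1u);   /* the modify-enable bit is always set */
   return dw;
}
```

With the assumed value of 1, the Haswell case yields 0x111 and every other
case yields 0x1, which matches the pre-patch OUT_BATCH(1).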

[Mesa-dev] [PATCH 3/3] i965/gen7: Don't use L3$ for render targets

2013-08-12 Thread ville . syrjala

From: Ville Syrjälä 

According to HSW Bspec L3$ evictions may land in LLC regardless of
LLC MOCS/PTE settings. That means we shouldn't set scanout buffers
as L3 cacheable when writing to them.

So far I've been unable to observe this phenomenon on my IVB, but
better safe than sorry. Especially since this doesn't appear to
hurt performance.

Ideally this should be limited to scanout buffers, but that information
is not available to Mesa. Limiting it to winsys buffers might be a
reasonable compromise, but MOCS setup appears to be done at a
lower layer where that information is already lost, and I was too
lazy to start passing that information down.

Signed-off-by: Ville Syrjälä 
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp  | 3 ++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index a9d6198..6f34c8d 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -143,7 +143,8 @@ gen7_blorp_emit_surface_state(struct brw_context *brw,
 */
struct intel_region *region = surface->mt->region;
uint32_t tile_x, tile_y;
-   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
+   /* FIXME use L3$ for non-scanout render targets */
+   uint8_t mocs = !is_render_target && brw->gen == 7 ? GEN7_MOCS_L3 : 0;
 
uint32_t tiling = surface->map_stencil_as_y_tiled
   ? I915_TILING_Y : region->tiling;
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index cd83daf..f7447cb 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -514,7 +514,8 @@ gen7_update_renderbuffer_surface(struct brw_context *brw,
bool is_array = false;
int depth = MAX2(rb->Depth, 1);
int min_array_element;
-   uint8_t mocs = brw->gen == 7 ? GEN7_MOCS_L3 : 0;
+   /* FIXME use L3$ for non-scanout renderbuffers */
+   uint8_t mocs = 0;
GLenum gl_target = rb->TexImage ?
  rb->TexImage->TexObject->Target : GL_TEXTURE_2D;
 
-- 
1.8.1.5


[Mesa-dev] [PATCH 1/3] glsl/ast: Check that geometry shader interface block inputs are arrays.

2013-08-12 Thread Paul Berry

---
 src/glsl/ast_to_hir.cpp | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index c06051f..e91e260 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4495,6 +4495,19 @@ ast_interface_block::hir(exec_list *instructions,
 */
assert(declared_variables.is_empty());
 
+   /* From section 4.3.4 (Inputs) of the GLSL 1.50 spec:
+*
+* Geometry shader input variables get the per-vertex values written
+* out by vertex shader output variables of the same names. Since a
+* geometry shader operates on a set of vertices, each input varying
+* variable (or input block, see interface blocks below) needs to be
+* declared as an array.
+*/
+   if (state->target == geometry_shader && !this->is_array &&
+   var_mode == ir_var_shader_in) {
+  _mesa_glsl_error(&loc, state, "geometry shader inputs must be arrays");
+   }
+
/* Page 39 (page 45 of the PDF) of section 4.3.7 in the GLSL ES 3.00 spec
 * says:
 *
-- 
1.8.3.4


[Mesa-dev] [PATCH 3/3] glsl/ast: Don't perform GS input array checks on non-inputs.

2013-08-12 Thread Paul Berry

Previously, we were accidentally calling
handle_geometry_shader_input_decl() on non-input interface block
declarations, resulting in bogus error checking.
---
 src/glsl/ast_to_hir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 2e97f3b..1b8aca2 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4567,7 +4567,7 @@ ast_interface_block::hir(exec_list *instructions,
   }
 
   var->interface_type = block_type;
-  if (state->target == geometry_shader)
+  if (state->target == geometry_shader && var_mode == ir_var_shader_in)
  handle_geometry_shader_input_decl(state, loc, var);
   state->symbols->add_variable(var);
   instructions->push_tail(var);
-- 
1.8.3.4


[Mesa-dev] [PATCH 2/3] glsl/ast: Fix assertion failure when GS input declared as non-array.

2013-08-12 Thread Paul Berry

Previously, if a geometry shader input was declared as a non-array, we
would flag the proper compiler error, but before we got a chance to
report it to the client, handle_geometry_shader_input_decl() would
fail an assertion.

With this patch, handle_geometry_shader_input_decl() ignores
non-arrays.
---
 src/glsl/ast_to_hir.cpp | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index e91e260..2e97f3b 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2552,9 +2552,8 @@ process_initializer(ir_variable *var, ast_declaration *decl,
 
 
 /**
- * Do additional processing necessary for geometry shader input array
- * declarations (this covers both interface blocks arrays and input variable
- * arrays).
+ * Do additional processing necessary for geometry shader input declarations
+ * (this covers both interface blocks arrays and bare input variables).
  */
 static void
 handle_geometry_shader_input_decl(struct _mesa_glsl_parse_state *state,
@@ -2565,7 +2564,16 @@ handle_geometry_shader_input_decl(struct _mesa_glsl_parse_state *state,
   num_vertices = vertices_per_prim(state->gs_input_prim_type);
}
 
-   assert(var->type->is_array());
+   /* Geometry shader input variables must be arrays.  Caller should have
+* reported an error for this.
+*/
+   if (!var->type->is_array()) {
+  assert(state->error);
+
+  /* To avoid cascading failures, short circuit the checks below. */
+  return;
+   }
+
if (var->type->length == 0) {
   /* Section 4.3.8.1 (Input Layout Qualifiers) of the GLSL 1.50 spec says:
*
-- 
1.8.3.4

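
The guard that patch 2/3 adds follows a common pattern: the caller reports the
error, and the helper then returns early instead of asserting on the bad
input. A minimal sketch of that pattern, assuming hypothetical stand-in names
(parse_state_sketch, check_gs_input) rather than the real GLSL compiler types:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parser state; only the error flag matters. */
struct parse_state_sketch {
   bool error;
};

/*
 * Returns true when the array checks ran, false when they were skipped.
 * A non-array input is tolerated here only because the caller is expected
 * to have flagged the error already; the assert documents that contract.
 */
static bool
check_gs_input(const struct parse_state_sketch *state, bool is_array)
{
   if (!is_array) {
      assert(state->error);
      /* Short-circuit so one bad declaration does not cascade. */
      return false;
   }

   /* ... array size checks would run here ... */
   return true;
}
```

The assert still catches the case the original code cared about (a non-array
reaching the helper without any error reported), but it no longer fires on
input that has already been diagnosed.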

Re: [Mesa-dev] [PATCH] tgsi: finish declaration parsing for arrays.

2013-08-12 Thread Brian Paul


On 08/12/2013 01:38 AM, Dave Airlie wrote:

From: Dave Airlie 

I previously fixed this partly in 9e8400f4c95bde1f955c7977066583b507159a10,
but I didn't test it thoroughly enough. With this patch, when I parse a TGSI
shader with arrays in it, my iterator sees the ArrayID set to the proper value.

Signed-off-by: Dave Airlie 



Reviewed-by: Brian Paul 


Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-12 Thread Brian Paul


On 08/09/2013 01:50 PM, Kevin H. Hobbs wrote:

On 08/09/2013 01:59 PM, Brian Paul wrote:


That's probably not it, given the above.  Can you try setting a
breakpoint on pstip_destroy() and see if that's getting called before
the segfault?  If so, things are getting freed in the wrong order.



No, it is not called before the segfault.

We do seem to enter pstip_bind_sampler_states many times before the
segfault. I do not remember this from before I had CFLAGS="-g -O0"...

The last time through :

(gdb) print pstip
$1 = (struct pstip_stage *) 0xff66331aff66331a

I don't think my actual RAM goes that high.


That looks suspect since the low and high halves of the address are the 
same.





(gdb) print pstip->state
Cannot access memory at address 0xff66331aff66339a

I should think not...

(gdb) print pipe
$2 = (struct pipe_context *) 0x13d6ec0

What does pstip_stage_from_pipe do?

(gdb) print pipe->draw
$3 = (void *) 0x137a090

(gdb) print ((struct draw_context *)(pipe->draw))->pipeline
$6 = {first = 0xffda006dffdc006e, validate = 0xffd40069ffd9006c,
flatshade = 0xffe6007affe7007d, clip =
 0xffe10070ffe30072, cull = 0xffdf006fffe00070, twoside =
0xffdd006effde006f, offset =
 0xffda006dffdc006d, unfilled = 0xffd8006bffd9006c, stipple =
0xffd5006affd6006b, aapoint =
 0xffd00067ffd30069, aaline = 0xff66331aff66331a, pstipple =
0xff66331aff66331a, wide_line =
 0xff66331aff66331a, wide_point = 0xff66331aff66331a, rasterize =
0xff66331aff66331a,
   wide_point_threshold = -3.05987774e+38, wide_line_threshold =
-3.05987774e+38,
   wide_point_sprites = 26 '\032', line_stipple = 51 '3', point_sprite =
102 'f', verts =
 0xff66331aff66331a ,
vertex_stride = 4284887834,
   vertex_count = 4284887834}

Which looks like a whole lot of uninitialized..



Can you run with valgrind?  That should give us some useful info if 
there's a use-after-free.


Otherwise, if you can send me an executable, I could try it here.

-Brian


Re: [Mesa-dev] [PATCH 1/2] st/xorg: set the SCANOUT flag for all pixmaps

2013-08-12 Thread Michel Dänzer

On Sat, 2013-08-10 at 00:23 +0200, Marek Olšák wrote:
> Any pixmap can potentially end up as a scanout buffer, right?
> 
> This fixes a whole-screen corruption with radeonsi, which needs a different
> texture layout for scanout textures.
> ---
>  src/gallium/state_trackers/xorg/xorg_exa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/xorg/xorg_exa.c b/src/gallium/state_trackers/xorg/xorg_exa.c
> index 3e764f8..0302a8b 100644
> --- a/src/gallium/state_trackers/xorg/xorg_exa.c
> +++ b/src/gallium/state_trackers/xorg/xorg_exa.c
> @@ -875,7 +875,7 @@ ExaModifyPixmapHeader(PixmapPtr pPixmap, int width, int height,
>   template.depth0 = 1;
>   template.array_size = 1;
>   template.last_level = 0;
> - template.bind = PIPE_BIND_RENDER_TARGET | priv->flags;
> + template.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SCANOUT | priv->flags;
>   priv->tex_flags = priv->flags;
>   texture = exa->scrn->resource_create(exa->scrn, &template);
>  

As st/xorg doesn't support page flipping, I think this should only be
really necessary for the screen pixmap (pPixmap ==
pScreen->GetScreenPixmap(pScreen)).



Re: [Mesa-dev] [PATCH 2/2] radeonsi: don't make scanout resources linear except for cursors

2013-08-12 Thread Michel Dänzer

On Sat, 2013-08-10 at 00:23 +0200, Marek Olšák wrote:
> The surface allocator understands the scanout flag just fine.
> 
> This seems to improve performance for Ubuntu Unity on top of st/xorg
> and it fixes the cursor.
> ---
>  src/gallium/drivers/radeonsi/r600_texture.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/r600_texture.c b/src/gallium/drivers/radeonsi/r600_texture.c
> index ee8ee14..8e01b14 100644
> --- a/src/gallium/drivers/radeonsi/r600_texture.c
> +++ b/src/gallium/drivers/radeonsi/r600_texture.c
> @@ -515,7 +515,7 @@ struct pipe_resource *si_texture_create(struct pipe_screen *screen,
>   int r;
>  
>   if (!(templ->flags & R600_RESOURCE_FLAG_TRANSFER) &&
> - !(templ->bind & PIPE_BIND_SCANOUT)) {
> + !(templ->bind & PIPE_BIND_CURSOR)) {
>   if (templ->flags & R600_RESOURCE_FLAG_FORCE_TILING ||
>   templ->nr_samples > 1) {
>   array_mode = V_009910_ARRAY_2D_TILED_THIN1;

Reviewed-by: Michel Dänzer 



[Mesa-dev] [Bug 67962] undefined reference to `wayland_drm_buffer_get'

2013-08-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=67962

--- Comment #4 from U. Artie Eoff  ---
(In reply to comment #3)
> (In reply to comment #2)
> > I've sent a fix today
> > 
> > http://lists.freedesktop.org/archives/mesa-dev/2013-August/043097.html
> 
> Discard this, I've sent a v2 patch today which really fixes the build.

Patch v2 worked for me: 
http://lists.freedesktop.org/archives/mesa-dev/2013-August/043119.html

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH 1/2] gallivm: simplify geometry shader mask handling a bit

2013-08-12 Thread sroland

From: Roland Scheidegger 

Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the unsigned comparisons the cpu can't do).
Saves some ~15% of all instructions in some test geometry shader here.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   62 ++-
 1 file changed, 26 insertions(+), 36 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index affe059..d23a977 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -2592,19 +2592,15 @@ sviewinfo_emit(
 }
 
 static LLVMValueRef
-mask_to_one_vec(struct lp_build_tgsi_context *bld_base)
+mask_vec(struct lp_build_tgsi_context *bld_base)
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
-   LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
-   LLVMValueRef one_vec = bld_base->int_bld.one;
struct lp_exec_mask *exec_mask = &bld->exec_mask;
 
if (exec_mask->has_mask) {
-  one_vec = LLVMBuildAnd(builder, one_vec, exec_mask->exec_mask, "");
+  return exec_mask->exec_mask;
}
-   one_vec = LLVMBuildAnd(builder, one_vec,
-  lp_build_mask_value(bld->mask), "");
-   return one_vec;
+   return lp_build_mask_value(bld->mask);
 }
 
 static void
@@ -2613,11 +2609,10 @@ increment_vec_ptr_by_mask(struct lp_build_tgsi_context * bld_base,
   LLVMValueRef mask)
 {
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
-
LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
-   
-   current_vec = LLVMBuildAdd(builder, current_vec, mask, "");
-   
+
+   current_vec = LLVMBuildSub(builder, current_vec, mask, "");
+
LLVMBuildStore(builder, current_vec, ptr);
 }
 
@@ -2627,18 +2622,13 @@ clear_uint_vec_ptr_from_mask(struct lp_build_tgsi_context * bld_base,
  LLVMValueRef mask)
 {
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
-
LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
-   LLVMValueRef full_mask = lp_build_cmp(&bld_base->uint_bld,
- PIPE_FUNC_NOTEQUAL,
- mask,
- bld_base->uint_bld.zero);
 
current_vec = lp_build_select(&bld_base->uint_bld,
- full_mask,
+ mask,
  bld_base->uint_bld.zero,
  current_vec);
-   
+
LLVMBuildStore(builder, current_vec, ptr);
 }
 
@@ -2648,8 +2638,8 @@ clamp_mask_to_max_output_vertices(struct lp_build_tgsi_soa_context * bld,
   LLVMValueRef total_emitted_vertices_vec)
 {
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
-   struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
-   LLVMValueRef max_mask = lp_build_cmp(uint_bld, PIPE_FUNC_LESS,
+   struct lp_build_context *int_bld = &bld->bld_base.int_bld;
+   LLVMValueRef max_mask = lp_build_cmp(int_bld, PIPE_FUNC_LESS,
 total_emitted_vertices_vec,
 bld->max_output_vertices_vec);
 
@@ -2666,23 +2656,23 @@ emit_vertex(
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
 
if (bld->gs_iface->emit_vertex) {
-  LLVMValueRef masked_ones = mask_to_one_vec(bld_base);
+  LLVMValueRef mask = mask_vec(bld_base);
   LLVMValueRef total_emitted_vertices_vec =
  LLVMBuildLoad(builder, bld->total_emitted_vertices_vec_ptr, "");
-  masked_ones = clamp_mask_to_max_output_vertices(bld, masked_ones,
-  total_emitted_vertices_vec);
+  mask = clamp_mask_to_max_output_vertices(bld, mask,
+   total_emitted_vertices_vec);
   gather_outputs(bld);
   bld->gs_iface->emit_vertex(bld->gs_iface, &bld->bld_base,
  bld->outputs,
  total_emitted_vertices_vec);
   increment_vec_ptr_by_mask(bld_base, bld->emitted_vertices_vec_ptr,
-masked_ones);
+mask);
   increment_vec_ptr_by_mask(bld_base, bld->total_emitted_vertices_vec_ptr,
-masked_ones);
+mask);
 #if DUMP_GS_EMITS
   lp_build_print_value(bld->bld_base.base.gallivm,
" +++ emit vertex masked ones = ",
-   masked_ones);
+   mask);
   lp_build_print_value(bld->bld_base.base.gallivm,
" +++ emit vertex emitted = ",
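
The switch from LLVMBuildAdd to LLVMBuildSub in the patch above relies on the
mask lanes being 0 or -1 (all bits set): subtracting -1 adds one. A scalar
sketch of that arithmetic, assuming a hypothetical fixed lane count and helper
names (increment_by_mask, clear_from_mask) that do not exist in gallivm:

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* hypothetical vector width for this sketch */

/*
 * counter[i] -= mask[i]. With mask lanes restricted to 0 or -1, active
 * lanes gain exactly one and inactive lanes are left alone, so no separate
 * "reduce the mask to 0/1" step is needed.
 */
static void
increment_by_mask(int32_t *counter, const int32_t *mask)
{
   for (int i = 0; i < LANES; i++) {
      assert(mask[i] == 0 || mask[i] == -1);
      counter[i] -= mask[i];
   }
}

/*
 * Zero the lanes selected by the mask and keep the rest. With full 0/-1
 * masks this is a plain bitwise AND with the inverted mask.
 */
static void
clear_from_mask(int32_t *vec, const int32_t *mask)
{
   for (int i = 0; i < LANES; i++) {
      assert(mask[i] == 0 || mask[i] == -1);
      vec[i] &= ~mask[i];
   }
}
```

The same 0/-1 convention is what lets the select in
clear_uint_vec_ptr_from_mask consume the mask directly without the extra
NOTEQUAL comparison the patch removes.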

[Mesa-dev] [PATCH 2/2] draw: simplify prim mask construction

2013-08-12 Thread sroland

From: Roland Scheidegger 

The code was quite weird: the second comparison was in fact a complete no-op,
and we can also do the comparison with the vector directly instead of scalar,
which should not only be faster but also makes it much more obvious what that
mask is actually going to look like.
(Not sure how many instructions that saves, as it turned out the mask wasn't
used at all in the test geometry shader I used...)
---
 src/gallium/auxiliary/draw/draw_llvm.c |   32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c b/src/gallium/auxiliary/draw/draw_llvm.c
index 68f6369..84e3392 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -2040,31 +2040,19 @@ generate_mask_value(struct draw_gs_llvm_variant *variant,
 {
struct gallivm_state *gallivm = variant->gallivm;
LLVMBuilderRef builder = gallivm->builder;
-   LLVMValueRef bits[16];
-   struct lp_type  mask_type = lp_int_type(gs_type);
-   struct lp_type mask_elem_type = lp_elem_type(mask_type);
-   LLVMValueRef mask_val = lp_build_const_vec(gallivm,
-  mask_type,
-  0);
+   struct lp_type mask_type = lp_int_type(gs_type);
+   LLVMValueRef num_prims;
+   LLVMValueRef mask_val = lp_build_const_vec(gallivm, mask_type, 0);
unsigned i;
 
-   assert(gs_type.length <= Elements(bits));
-
-   for (i = gs_type.length; i >= 1; --i) {
-  int idx = i - 1;
-  LLVMValueRef ind = lp_build_const_int32(gallivm, i);
-  bits[idx] = lp_build_compare(gallivm,
-   mask_elem_type, PIPE_FUNC_GEQUAL,
-   variant->num_prims, ind);
-   }
-   for (i = 0; i < gs_type.length; ++i) {
-  LLVMValueRef ind = lp_build_const_int32(gallivm, i);
-  mask_val = LLVMBuildInsertElement(builder, mask_val, bits[i], ind, "");
+   num_prims = lp_build_broadcast(gallivm, lp_build_vec_type(gallivm, mask_type),
+  variant->num_prims);
+   for (i = 0; i < gs_type.length; i++) {
+  LLVMValueRef idx = lp_build_const_int32(gallivm, i);
+  mask_val = LLVMBuildInsertElement(builder, mask_val, idx, idx, "");
}
-   mask_val = lp_build_compare(gallivm,
-   mask_type, PIPE_FUNC_NOTEQUAL,
-   mask_val,
-   lp_build_const_int_vec(gallivm, mask_type, 0));
+   mask_val = lp_build_compare(gallivm, mask_type,
+   PIPE_FUNC_GREATER, num_prims, mask_val);
 
return mask_val;
 }
-- 
1.7.9.5
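
The simplified mask construction amounts to comparing a broadcast num_prims
against the lane indices 0, 1, 2, ... with a signed greater-than. A scalar
sketch of the resulting mask, assuming a hypothetical helper name (prim_mask)
and a caller-supplied lane count in place of gs_type.length:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lane i is active (-1) when num_prims > i, otherwise inactive (0). This is
 * the per-lane result of PIPE_FUNC_GREATER applied to the broadcast
 * num_prims and the index vector {0, 1, ..., num_lanes - 1}.
 */
static void
prim_mask(int32_t *mask, int num_lanes, int32_t num_prims)
{
   assert(num_lanes >= 0);

   for (int i = 0; i < num_lanes; i++)
      mask[i] = (num_prims > i) ? -1 : 0;
}
```

For num_prims = 2 and four lanes this gives the first two lanes set and the
last two clear, which is the shape the commit message calls "more obvious".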

[Mesa-dev] [PATCH] gallivm: simplify geometry shader mask handling a bit

2013-08-12 Thread sroland

From: Roland Scheidegger 

Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the unsigned comparisons the cpu can't do).
Saves a couple of instructions in some test geometry shader here.

v2: that was a bit too much optimization, don't skip combining the masks...
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   64 ++-
 1 file changed, 28 insertions(+), 36 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index affe059..589ea4f 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -2592,19 +2592,17 @@ sviewinfo_emit(
 }
 
 static LLVMValueRef
-mask_to_one_vec(struct lp_build_tgsi_context *bld_base)
+mask_vec(struct lp_build_tgsi_context *bld_base)
 {
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
-   LLVMValueRef one_vec = bld_base->int_bld.one;
struct lp_exec_mask *exec_mask = &bld->exec_mask;
 
-   if (exec_mask->has_mask) {
-  one_vec = LLVMBuildAnd(builder, one_vec, exec_mask->exec_mask, "");
+   if (!exec_mask->has_mask) {
+  return lp_build_mask_value(bld->mask);
}
-   one_vec = LLVMBuildAnd(builder, one_vec,
-  lp_build_mask_value(bld->mask), "");
-   return one_vec;
+   return LLVMBuildAnd(builder, lp_build_mask_value(bld->mask),
+   exec_mask->exec_mask, "");
 }
 
 static void
@@ -2613,11 +2611,10 @@ increment_vec_ptr_by_mask(struct lp_build_tgsi_context * bld_base,
   LLVMValueRef mask)
 {
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
-
LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
-   
-   current_vec = LLVMBuildAdd(builder, current_vec, mask, "");
-   
+
+   current_vec = LLVMBuildSub(builder, current_vec, mask, "");
+
LLVMBuildStore(builder, current_vec, ptr);
 }
 
@@ -2627,18 +2624,13 @@ clear_uint_vec_ptr_from_mask(struct lp_build_tgsi_context * bld_base,
  LLVMValueRef mask)
 {
LLVMBuilderRef builder = bld_base->base.gallivm->builder;
-
LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
-   LLVMValueRef full_mask = lp_build_cmp(&bld_base->uint_bld,
- PIPE_FUNC_NOTEQUAL,
- mask,
- bld_base->uint_bld.zero);
 
current_vec = lp_build_select(&bld_base->uint_bld,
- full_mask,
+ mask,
  bld_base->uint_bld.zero,
  current_vec);
-   
+
LLVMBuildStore(builder, current_vec, ptr);
 }
 
@@ -2648,8 +2640,8 @@ clamp_mask_to_max_output_vertices(struct 
lp_build_tgsi_soa_context * bld,
   LLVMValueRef total_emitted_vertices_vec)
 {
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
-   struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
-   LLVMValueRef max_mask = lp_build_cmp(uint_bld, PIPE_FUNC_LESS,
+   struct lp_build_context *int_bld = &bld->bld_base.int_bld;
+   LLVMValueRef max_mask = lp_build_cmp(int_bld, PIPE_FUNC_LESS,
 total_emitted_vertices_vec,
 bld->max_output_vertices_vec);
 
@@ -2666,23 +2658,23 @@ emit_vertex(
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
 
if (bld->gs_iface->emit_vertex) {
-  LLVMValueRef masked_ones = mask_to_one_vec(bld_base);
+  LLVMValueRef mask = mask_vec(bld_base);
   LLVMValueRef total_emitted_vertices_vec =
  LLVMBuildLoad(builder, bld->total_emitted_vertices_vec_ptr, "");
-  masked_ones = clamp_mask_to_max_output_vertices(bld, masked_ones,
-  
total_emitted_vertices_vec);
+  mask = clamp_mask_to_max_output_vertices(bld, mask,
+   total_emitted_vertices_vec);
   gather_outputs(bld);
   bld->gs_iface->emit_vertex(bld->gs_iface, &bld->bld_base,
  bld->outputs,
  total_emitted_vertices_vec);
   increment_vec_ptr_by_mask(bld_base, bld->emitted_vertices_vec_ptr,
-masked_ones);
+mask);
   increment_vec_ptr_by_mask(bld_base, bld->total_emitted_vertices_vec_ptr,
-masked_ones);
+mask);
 #if DUMP_GS_EMITS
   lp_build_print_value(bld->bld_base.base.gallivm,
" +++ emit vertex masked ones = ",
-

Re: [Mesa-dev] [PATCH mesa 3/3] egl: Update to Wayland 1.2 server API

2013-08-12 Thread Kristian Høgsberg

On Wed, Aug 7, 2013 at 1:27 PM, Ian Romanick  wrote:
> On 08/07/2013 10:39 AM, Kristian Høgsberg wrote:
>>
>> On Thu, Jul 18, 2013 at 03:11:25PM +0300, Ander Conselvan de Oliveira
>> wrote:
>>>
>>> Since Wayland 1.2, struct wl_buffer and a few functions are deprecated.
>>>
>>> References to wl_buffer are replaced with wl_resource and some getter
>>> functions and calls to deprecated functions are replaced with the proper
>>> new API. The latter changes are related to resource versioning.
>>>
>>> Signed-off-by: Ander Conselvan de Oliveira
>>> 
>>
>>
>> Thanks Ander, this and the two previous patches pushed.
>
>
> If any patches need to be picked back to the 9.2 branch (or 9.1 branch),
> please send them to mesa-stable.  I don't follow the Wayland work very
> closely, so I defer to Kristian's guidance.

Thanks Ian - no, that shouldn't be necessary.  The 1.2 release didn't
break API or ABI used by mesa, but we did deprecate certain parts.

Kristian

>> Kristian
>>
>>> ---
>>>   docs/specs/WL_bind_wayland_display.spec|  8 ++-
>>>   include/EGL/eglmesaext.h   |  6 +-
>>>   src/egl/drivers/dri2/egl_dri2.c| 28 +
>>>   src/egl/drivers/dri2/egl_dri2.h|  1 -
>>>   src/egl/main/eglapi.c  |  2 +-
>>>   src/egl/main/eglapi.h  |  2 +-
>>>   src/egl/wayland/wayland-drm/wayland-drm.c  | 66
>>> +-
>>>   src/egl/wayland/wayland-drm/wayland-drm.h  | 13 +++--
>>>   .../state_trackers/egl/common/egl_g3d_api.c|  2 +-
>>>   .../state_trackers/egl/common/egl_g3d_image.c  |  4 +-
>>>   .../egl/common/native_wayland_bufmgr.h |  6 +-
>>>   .../egl/common/native_wayland_drm_bufmgr.c | 25 +---
>>>   src/gbm/backends/dri/gbm_dri.c |  5 +-
>>>   13 files changed, 99 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/docs/specs/WL_bind_wayland_display.spec
>>> b/docs/specs/WL_bind_wayland_display.spec
>>> index 02bd6ea..8f0083c 100644
>>> --- a/docs/specs/WL_bind_wayland_display.spec
>>> +++ b/docs/specs/WL_bind_wayland_display.spec
>>> @@ -17,7 +17,7 @@ Status
>>>
>>>   Version
>>>
>>> -Version 1, March 1, 2011
>>> +Version 5, July 16, 2013
>>>
>>>   Number
>>>
>>> @@ -57,7 +57,7 @@ New Procedures and Functions
>>>struct wl_display *display);
>>>
>>>   EGLBoolean eglQueryWaylandBufferWL(EGLDisplay dpy,
>>> -   struct wl_buffer *buffer,
>>> +   struct wl_resource *buffer,
>>>  EGLint attribute, EGLint
>>> *value);
>>>
>>>   New Tokens
>>> @@ -173,3 +173,7 @@ Revision History
>>>   Use EGL_TEXTURE_FORMAT, EGL_TEXTURE_RGB, and EGL_TEXTURE_RGBA,
>>>   and just define the new YUV texture formats.  Add support for
>>>   EGL_WIDTH and EGL_HEIGHT in the query attributes (Kristian
>>> Høgsberg)
>>> +Version 5, July 16, 2013
>>> +Change eglQueryWaylandBufferWL to take a resource pointer to the
>>> +buffer instead of a pointer to a struct wl_buffer, as the latter
>>> has
>>> +been deprecated. (Ander Conselvan de Oliveira)
>>> diff --git a/include/EGL/eglmesaext.h b/include/EGL/eglmesaext.h
>>> index d476d18..e0eae28 100644
>>> --- a/include/EGL/eglmesaext.h
>>> +++ b/include/EGL/eglmesaext.h
>>> @@ -120,15 +120,15 @@ typedef EGLDisplay (EGLAPIENTRYP
>>> PFNEGLGETDRMDISPLAYMESA) (int fd);
>>>   #define EGL_TEXTURE_Y_XUXV_WL   0x31D9
>>>
>>>   struct wl_display;
>>> -struct wl_buffer;
>>> +struct wl_resource;
>>>   #ifdef EGL_EGLEXT_PROTOTYPES
>>>   EGLAPI EGLBoolean EGLAPIENTRY eglBindWaylandDisplayWL(EGLDisplay dpy,
>>> struct wl_display *display);
>>>   EGLAPI EGLBoolean EGLAPIENTRY eglUnbindWaylandDisplayWL(EGLDisplay dpy,
>>> struct wl_display *display);
>>> -EGLAPI EGLBoolean EGLAPIENTRY eglQueryWaylandBufferWL(EGLDisplay dpy,
>>> struct wl_buffer *buffer, EGLint attribute, EGLint *value);
>>> +EGLAPI EGLBoolean EGLAPIENTRY eglQueryWaylandBufferWL(EGLDisplay dpy,
>>> struct wl_resource *buffer, EGLint attribute, EGLint *value);
>>>   #endif
>>>   typedef EGLBoolean (EGLAPIENTRYP PFNEGLBINDWAYLANDDISPLAYWL)
>>> (EGLDisplay dpy, struct wl_display *display);
>>>   typedef EGLBoolean (EGLAPIENTRYP PFNEGLUNBINDWAYLANDDISPLAYWL)
>>> (EGLDisplay dpy, struct wl_display *display);
>>> -typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYWAYLANDBUFFERWL) (EGLDisplay
>>> dpy, struct wl_buffer *buffer, EGLint attribute, EGLint *value);
>>> +typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYWAYLANDBUFFERWL) (EGLDisplay
>>> dpy, struct wl_resource *buffer, EGLint attribute, EGLint *value);
>>>
>>>   #endif
>>>
>>> diff --git a/src/egl/drivers/dri2/egl_dri2.c
>>> b/src/egl/drivers/dri2/egl_dri2.c
>>> index 1bce314..44fd8a8 100644
>>> --- a/src/egl/drivers/dri2/egl_dri2.c
>>> +++ b/src/egl/drivers/dri2/egl_dri2.c
>>> @@

Re: [Mesa-dev] [PATCH mesa 3/3] egl: Update to Wayland 1.2 server API

2013-08-12 Thread Kristian Høgsberg

On Fri, Aug 9, 2013 at 10:44 AM, Mike Lothian  wrote:
> Hi
>
> I seem to have a missing symbol wayland_drm_buffer_get in libgbm.so.1
>
> I'm pretty sure it's related

No, it's a different issue: https://bugs.freedesktop.org/show_bug.cgi?id=67962

Kristian

> Regards
>
> Mike
>
> On 18 Jul 2013 13:15, "Ander Conselvan de Oliveira"
>  wrote:
>>
>> Since Wayland 1.2, struct wl_buffer and a few functions are deprecated.
>>
>> References to wl_buffer are replaced with wl_resource and some getter
>> functions and calls to deprecated functions are replaced with the proper
>> new API. The latter changes are related to resource versioning.
>>
>> Signed-off-by: Ander Conselvan de Oliveira
>> 
>> ---
>>  docs/specs/WL_bind_wayland_display.spec|  8 ++-
>>  include/EGL/eglmesaext.h   |  6 +-
>>  src/egl/drivers/dri2/egl_dri2.c| 28 +
>>  src/egl/drivers/dri2/egl_dri2.h|  1 -
>>  src/egl/main/eglapi.c  |  2 +-
>>  src/egl/main/eglapi.h  |  2 +-
>>  src/egl/wayland/wayland-drm/wayland-drm.c  | 66
>> +-
>>  src/egl/wayland/wayland-drm/wayland-drm.h  | 13 +++--
>>  .../state_trackers/egl/common/egl_g3d_api.c|  2 +-
>>  .../state_trackers/egl/common/egl_g3d_image.c  |  4 +-
>>  .../egl/common/native_wayland_bufmgr.h |  6 +-
>>  .../egl/common/native_wayland_drm_bufmgr.c | 25 +---
>>  src/gbm/backends/dri/gbm_dri.c |  5 +-
>>  13 files changed, 99 insertions(+), 69 deletions(-)
>>
>> diff --git a/docs/specs/WL_bind_wayland_display.spec
>> b/docs/specs/WL_bind_wayland_display.spec
>> index 02bd6ea..8f0083c 100644
>> --- a/docs/specs/WL_bind_wayland_display.spec
>> +++ b/docs/specs/WL_bind_wayland_display.spec
>> @@ -17,7 +17,7 @@ Status
>>
>>  Version
>>
>> -Version 1, March 1, 2011
>> +Version 5, July 16, 2013
>>
>>  Number
>>
>> @@ -57,7 +57,7 @@ New Procedures and Functions
>>   struct wl_display *display);
>>
>>  EGLBoolean eglQueryWaylandBufferWL(EGLDisplay dpy,
>> -   struct wl_buffer *buffer,
>> +   struct wl_resource *buffer,
>> EGLint attribute, EGLint *value);
>>
>>  New Tokens
>> @@ -173,3 +173,7 @@ Revision History
>>  Use EGL_TEXTURE_FORMAT, EGL_TEXTURE_RGB, and EGL_TEXTURE_RGBA,
>>  and just define the new YUV texture formats.  Add support for
>>  EGL_WIDTH and EGL_HEIGHT in the query attributes (Kristian
>> Høgsberg)
>> +Version 5, July 16, 2013
>> +Change eglQueryWaylandBufferWL to take a resource pointer to the
>> +buffer instead of a pointer to a struct wl_buffer, as the latter
>> has
>> +been deprecated. (Ander Conselvan de Oliveira)
>> diff --git a/include/EGL/eglmesaext.h b/include/EGL/eglmesaext.h
>> index d476d18..e0eae28 100644
>> --- a/include/EGL/eglmesaext.h
>> +++ b/include/EGL/eglmesaext.h
>> @@ -120,15 +120,15 @@ typedef EGLDisplay (EGLAPIENTRYP
>> PFNEGLGETDRMDISPLAYMESA) (int fd);
>>  #define EGL_TEXTURE_Y_XUXV_WL   0x31D9
>>
>>  struct wl_display;
>> -struct wl_buffer;
>> +struct wl_resource;
>>  #ifdef EGL_EGLEXT_PROTOTYPES
>>  EGLAPI EGLBoolean EGLAPIENTRY eglBindWaylandDisplayWL(EGLDisplay dpy,
>> struct wl_display *display);
>>  EGLAPI EGLBoolean EGLAPIENTRY eglUnbindWaylandDisplayWL(EGLDisplay dpy,
>> struct wl_display *display);
>> -EGLAPI EGLBoolean EGLAPIENTRY eglQueryWaylandBufferWL(EGLDisplay dpy,
>> struct wl_buffer *buffer, EGLint attribute, EGLint *value);
>> +EGLAPI EGLBoolean EGLAPIENTRY eglQueryWaylandBufferWL(EGLDisplay dpy,
>> struct wl_resource *buffer, EGLint attribute, EGLint *value);
>>  #endif
>>  typedef EGLBoolean (EGLAPIENTRYP PFNEGLBINDWAYLANDDISPLAYWL) (EGLDisplay
>> dpy, struct wl_display *display);
>>  typedef EGLBoolean (EGLAPIENTRYP PFNEGLUNBINDWAYLANDDISPLAYWL)
>> (EGLDisplay dpy, struct wl_display *display);
>> -typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYWAYLANDBUFFERWL) (EGLDisplay
>> dpy, struct wl_buffer *buffer, EGLint attribute, EGLint *value);
>> +typedef EGLBoolean (EGLAPIENTRYP PFNEGLQUERYWAYLANDBUFFERWL) (EGLDisplay
>> dpy, struct wl_resource *buffer, EGLint attribute, EGLint *value);
>>
>>  #endif
>>
>> diff --git a/src/egl/drivers/dri2/egl_dri2.c
>> b/src/egl/drivers/dri2/egl_dri2.c
>> index 1bce314..44fd8a8 100644
>> --- a/src/egl/drivers/dri2/egl_dri2.c
>> +++ b/src/egl/drivers/dri2/egl_dri2.c
>> @@ -41,6 +41,10 @@
>>
>>  #include "egl_dri2.h"
>>
>> +#ifdef HAVE_WAYLAND_PLATFORM
>> +#include "wayland-drm.h"
>> +#endif
>> +
>>  const __DRIuseInvalidateExtension use_invalidate = {
>> { __DRI_USE_INVALIDATE, 1 }
>>  };
>> @@ -1195,7 +1199,7 @@ dri2_create_image_wayland_wl_buffer(_EGLDisplay
>> *disp, _EGLContext *ctx,
>> EGLClientBuffer _buffer,
>>

Re: [Mesa-dev] [PATCH] meta: Fix blitting a framebuffer with renderbuffer attachment

2013-08-12 Thread Anuj Phogat

On Mon, Aug 5, 2013 at 3:00 PM, Anuj Phogat  wrote:
>
> This patch fixes a case of framebuffer blitting with renderbuffer
> as color attachment and GL_LINEAR filter. Meta implementation of
> glBlitFramebuffer() converts source color buffer to a texture and
> uses it to do the scaled blitting into destination buffer. Using
> the exact source rectangle to create the texture does incorrect
> linear filtering along the edges. This patch makes the changes to
> extend the texture edges by one pixel in x, y directions. This
> ensures correct linear filtering.
>
> It fixes failing piglit fbo-linear-blit test. Patch for the testcase
> is on piglit mailing list.
Updated piglit test's name is: fbo-attachments-blit-scaled-linear.
Please take a look if the changes in this patch look good to you.
>
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/common/meta.c | 25 +++--
>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index c62927c..155c4fa 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -1879,19 +1879,24 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
>const GLenum rb_base_format =
>   _mesa_base_tex_format(ctx, colorReadRb->InternalFormat);
>
> -  newTex = alloc_texture(tex, srcW, srcH, rb_base_format);
> -  setup_copypix_texture(ctx, tex, newTex, srcX, srcY, srcW, srcH,
> +  /* Using  the exact source rectangle to create the texture does 
> incorrect
> +   * linear filtering along the edges. So, allocate the texture extended 
> along
> +   * edges by one pixel in x, y directions.
> +   */
> +  newTex = alloc_texture(tex, srcW + 2, srcH + 2, rb_base_format);
> +  setup_copypix_texture(ctx, tex, newTex,
> +srcX - 1, srcY - 1, srcW + 2, srcH + 2,
>  rb_base_format, filter);
>/* texcoords (after texture allocation!) */
>{
> - verts[0].s = 0.0F;
> - verts[0].t = 0.0F;
> - verts[1].s = tex->Sright;
> - verts[1].t = 0.0F;
> - verts[2].s = tex->Sright;
> - verts[2].t = tex->Ttop;
> - verts[3].s = 0.0F;
> - verts[3].t = tex->Ttop;
> + verts[0].s = 1.0F;
> + verts[0].t = 1.0F;
> + verts[1].s = tex->Sright - 1.0F;
> + verts[1].t = 1.0F;
> + verts[2].s = tex->Sright - 1.0F;
> + verts[2].t = tex->Ttop - 1.0F;
> + verts[3].s = 1.0F;
> + verts[3].t = tex->Ttop - 1.0F;
>
>   /* upload new vertex data */
>   _mesa_BufferSubData(GL_ARRAY_BUFFER_ARB, 0, sizeof(verts), verts);
> --
> 1.8.1.4
>
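A minimal sketch of the border arithmetic described in the patch above, written
as standalone C rather than Mesa code. The struct and function names are
hypothetical, and it assumes unnormalized (texel) texture coordinates, which is
what the 1.0F offsets in the patch suggest. It only shows how the copied
rectangle grows by one pixel per side and how the texcoords then skip that
border again.

```c
#include <assert.h>

/* Hypothetical helper types for illustration; not part of Mesa. */
struct rect { int x, y, w, h; };
struct texcoords { float s0, t0, s1, t1; };

/* Grow the source rectangle by `border` pixels on every side, so that
 * linear filtering at the edges samples real neighbouring pixels. */
static struct rect extend_rect(struct rect src, int border)
{
   struct rect r;

   assert(border >= 0);
   r.x = src.x - border;
   r.y = src.y - border;
   r.w = src.w + 2 * border;
   r.h = src.h + 2 * border;
   return r;
}

/* Texel-space coordinates that select the original region inside the
 * extended texture (assumes unnormalized coordinates). */
static struct texcoords inner_texcoords(struct rect extended, int border)
{
   struct texcoords tc;

   tc.s0 = (float) border;
   tc.t0 = (float) border;
   tc.s1 = (float) (extended.w - border);
   tc.t1 = (float) (extended.h - border);
   return tc;
}
```

With border = 1 this corresponds to the srcX - 1, srcY - 1, srcW + 2, srcH + 2
arguments and the 1.0F / Sright - 1.0F texcoords in the diff.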
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] segfault in pstip_bind_sampler_states

2013-08-12 Thread Kevin H. Hobbs

On 08/12/2013 10:29 AM, Brian Paul wrote:
> Can you run with valgrind?  That should give us some useful info if 
> there's a use-after-free.

Sure,

$ valgrind /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkpython
"--enable-bt"
"/home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py/rtImageTest.py"
"/home/kevin/kitware/VTK/Filters/Hybrid/Testing/Python/largeImageOffset.py"
"-D" "/home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Testing" "-T"
"/home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary" "-V"
"/home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png"
"-A" "/home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py" >
/tmp/osmesa_valgrind.txt 2>&1
==30166== Memcheck, a memory error detector
==30166== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==30166== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==30166== Command: /home/kevin/kitware/VTK_OSMesa_Build/bin/vtkpython 
--enable-bt 
/home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py/rtImageTest.py 
/home/kevin/kitware/VTK/Filters/Hybrid/Testing/Python/largeImageOffset.py -D 
/home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Testing -T 
/home/kevin/kitware/VTK_OSMesa_Build/Testing/Temporary -V 
/home/kevin/kitware/VTK_OSMesa_Build/ExternalData/Filters/Hybrid/Testing/Data/Baseline/largeImageOffset.png
 -A /home/kevin/kitware/VTK_OSMesa_Build/Utilities/vtkTclTest2Py
==30166== 
vtk version 6.1.0
0Standard29.2597
29.19
==30166== Invalid write of size 8
==30166==at 0x4C2C29B: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:882)
==30166==by 0x1DB7DF37: osmesa_st_framebuffer_flush_front (osmesa.c:329)
==30166==by 0x1D957491: st_manager_flush_frontbuffer (st_manager.c:770)
==30166==by 0x1D92C802: display_front_buffer (st_cb_flush.c:73)
==30166==by 0x1D92C9C9: st_glFlush (st_cb_flush.c:124)
==30166==by 0x1D7CD172: _mesa_flush (context.c:1643)
==30166==by 0x1D7CCD4B: _mesa_make_current (context.c:1455)
==30166==by 0x1D95733B: st_api_make_current (st_manager.c:722)
==30166==by 0x1DB7E5F8: OSMesaMakeCurrent (osmesa.c:653)
==30166==by 0x2131C322: vtkOSOpenGLRenderWindow::MakeCurrent() 
(vtkOSOpenGLRenderWindow.cxx:344)
==30166==by 0x2131BC50: vtkOSOpenGLRenderWindow::DestroyWindow() 
(vtkOSOpenGLRenderWindow.cxx:147)
==30166==by 0x2131C128: vtkOSOpenGLRenderWindow::Finalize() 
(vtkOSOpenGLRenderWindow.cxx:281)
==30166==  Address 0x91fdd18 is not stack'd, malloc'd or (recently) free'd
==30166== 
--30166-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - 
exiting
--30166-- si_code=80;  Faulting address: 0x0;  sp: 0x4030fdd50

valgrind: the 'impossible' happened:
   Killed by fatal signal
==30166==at 0x3806236E: mkFreeBlock (m_mallocfree.c:290)
==30166==by 0x38064436: vgPlain_arena_free (m_mallocfree.c:1846)
==30166==by 0x38029725: create_MC_Chunk (mc_malloc_wrappers.c:165)
==30166==by 0x38029944: vgMemCheck_new_block (mc_malloc_wrappers.c:283)
==30166==by 0x38029B9D: vgMemCheck___builtin_new (mc_malloc_wrappers.c:311)
==30166==by 0x3809E490: vgPlain_scheduler (scheduler.c:1667)
==30166==by 0x380AD6F9: run_a_thread_NORETURN (syswrap-linux.c:103)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==30166==at 0x4C2A361: operator new(unsigned long) (vg_replace_malloc.c:298)
==30166==by 0x1DE5891F: llvm::JIT::removeModule(llvm::Module*) (in 
/home/kevin/mesa_nightly/lib/libOSMesa.so.8.0.0)
==30166==by 0x1DEA8149: LLVMRemoveModule (in 
/home/kevin/mesa_nightly/lib/libOSMesa.so.8.0.0)
==30166==by 0x1DB2297F: free_gallivm_state (lp_bld_init.c:200)
==30166==by 0x1DB22D08: gallivm_destroy (lp_bld_init.c:554)
==30166==by 0x1DB4A006: draw_llvm_destroy_variant (draw_llvm.c:1998)
==30166==by 0x1DB4B6E0: vs_llvm_delete (draw_vs_llvm.c:73)
==30166==by 0x1DA251DD: draw_delete_vertex_shader (draw_vs.c:142)
==30166==by 0x1DBAC1CD: llvmpipe_delete_vs_state (lp_state_vs.c:105)
==30166==by 0x1D9FAB73: cso_delete_vertex_shader (cso_context.c:608)
==30166==by 0x1D95BE3E: delete_vp_variant (st_program.c:66)
==30166==by 0x1D95BEDA: st_release_vp_variants (st_program.c:90)
==30166==by 0x1D934F9F: st_delete_program (st_cb_program.c:131)
==30166==by 0x1D9ED388: _mesa_reference_program_ (program.c:421)
==30166==by 0x1D9D8AF7: _mesa_reference_program (program.h:102)
==30166==by 0x1D9D8D22: clear_cache (prog_cache.c:126)
==30166==by 0x1D9D8E14: _mesa_delete_program_cache (prog_cache.c:159)
==30166==by 0x1D9ECC90: _mesa_free_program_data (program.c:119)
==30166==by 0x1D7CC11C: _mesa_free_context_data (context.c:1166)
==30166==by 0x1D93C691: st_destroy_context (st_context.c:310)
==30166==by 0x1D956E2D: st_context_destroy (st_manager.c:578)
==30166==by 0x1DB7E4AD: OSMesaDestroyContext (osmesa.c:583)
==30166==by 0x2131BF2E: vtkOSOpenGLRenderWindow::DestroyOffScreenWindow() 
(vtkOSOpenGLRenderWindow.cxx:226)

[Mesa-dev] [PATCH] r600g, radeonsi: set/get the scanout flag using the set/get_tiling ioctls

2013-08-12 Thread Marek Olšák

---
 src/gallium/drivers/r300/r300_state.c |  2 +-
 src/gallium/drivers/r300/r300_texture.c   |  4 ++--
 src/gallium/drivers/r600/r600_texture.c   | 13 -
 src/gallium/drivers/radeonsi/r600_texture.c   | 13 -
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 12 ++--
 src/gallium/winsys/radeon/drm/radeon_winsys.h |  4 +++-
 6 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_state.c 
b/src/gallium/drivers/r300/r300_state.c
index e69a605..dad3dc5 100644
--- a/src/gallium/drivers/r300/r300_state.c
+++ b/src/gallium/drivers/r300/r300_state.c
@@ -843,7 +843,7 @@ static void r300_tex_set_tiling_flags(struct r300_context 
*r300,
 tex->tex.macrotile[level]) {
 r300->rws->buffer_set_tiling(tex->buf, r300->cs,
 tex->tex.microtile, tex->tex.macrotile[level],
-0, 0, 0, 0, 0,
+0, 0, 0, 0, 0, FALSE,
 tex->tex.stride_in_bytes[0]);
 
 tex->surface_level = level;
diff --git a/src/gallium/drivers/r300/r300_texture.c 
b/src/gallium/drivers/r300/r300_texture.c
index 13e9bc3..4b58f06 100644
--- a/src/gallium/drivers/r300/r300_texture.c
+++ b/src/gallium/drivers/r300/r300_texture.c
@@ -1059,7 +1059,7 @@ r300_texture_create_object(struct r300_screen *rscreen,
 
 rws->buffer_set_tiling(tex->buf, NULL,
 tex->tex.microtile, tex->tex.macrotile[0],
-0, 0, 0, 0, 0,
+0, 0, 0, 0, 0, FALSE,
 tex->tex.stride_in_bytes[0]);
 
 return tex;
@@ -1115,7 +1115,7 @@ struct pipe_resource *r300_texture_from_handle(struct 
pipe_screen *screen,
 if (!buffer)
 return NULL;
 
-rws->buffer_get_tiling(buffer, &microtile, &macrotile, NULL, NULL, NULL, 
NULL, NULL);
+rws->buffer_get_tiling(buffer, &microtile, &macrotile, NULL, NULL, NULL, 
NULL, NULL, NULL);
 
 /* Enforce a microtiled zbuffer. */
 if (util_format_is_depth_or_stencil(base->format) &&
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 742e982..f27d0dc 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -131,7 +131,7 @@ static int r600_init_surface(struct r600_screen *rscreen,
 struct radeon_surface *surface,
 const struct pipe_resource *ptex,
 unsigned array_mode,
-bool is_flushed_depth)
+bool is_flushed_depth, bool is_scanout)
 {
const struct util_format_description *desc =
util_format_description(ptex->format);
@@ -205,7 +205,7 @@ static int r600_init_surface(struct r600_screen *rscreen,
default:
return -EINVAL;
}
-   if (ptex->bind & PIPE_BIND_SCANOUT) {
+   if (is_scanout) {
surface->flags |= RADEON_SURF_SCANOUT;
}
 
@@ -285,6 +285,7 @@ static boolean r600_texture_get_handle(struct pipe_screen* 
screen,
   surface->tile_split,
   surface->stencil_tile_split,
   surface->mtilea,
+  (surface->flags & RADEON_SURF_SCANOUT) 
!= 0,
   rtex->surface.level[0].pitch_bytes);
 
return rscreen->ws->buffer_get_handle(resource->buf,
@@ -624,7 +625,8 @@ struct pipe_resource *r600_texture_create(struct 
pipe_screen *screen,
}
 
r = r600_init_surface(rscreen, &surface, templ, array_mode,
- templ->flags & R600_RESOURCE_FLAG_FLUSHED_DEPTH);
+ (templ->flags & R600_RESOURCE_FLAG_FLUSHED_DEPTH) 
!= 0,
+ (templ->bind & PIPE_BIND_SCANOUT) != 0);
if (r) {
return NULL;
}
@@ -689,6 +691,7 @@ struct pipe_resource *r600_texture_from_handle(struct 
pipe_screen *screen,
unsigned array_mode = 0;
enum radeon_bo_layout micro, macro;
struct radeon_surface surface;
+   boolean scanout;
int r;
 
/* Support only 2D textures without mipmaps */
@@ -704,7 +707,7 @@ struct pipe_resource *r600_texture_from_handle(struct 
pipe_screen *screen,
   &surface.bankw, &surface.bankh,
   &surface.tile_split,
   &surface.stencil_tile_split,
-  &surface.mtilea);
+  &surface.mtilea, &scanout);
 
if (macro == RADEON_LAYOUT_TILED)
array_mode = V_0280A0_ARRAY_2D_TILED_THIN1;
@@ -713,7 +716,7 @@ struct pipe_resource *r600_texture_from_handle(struct 
pipe_screen *screen,
else
array_mode = V_038000_ARRAY_LINEAR_ALIGNED;
 
-   r = r600_init_surface(rscreen, &surface, templ, array_mode, false);
+   r

Re: [Mesa-dev] [PATCH 4/6] i965/fs: Optimize IF/MOV/ELSE/MOV/ENDIF to SEL when possible.

2013-08-12 Thread Kenneth Graunke


On 08/07/2013 03:45 PM, Ian Romanick wrote:

> On 08/05/2013 06:28 PM, Kenneth Graunke wrote:
>
> [snip]
>
>> + *(+f0) IF
>> + *MOV dst src0
>> + *ELSE
>> + *MOV dst src1
>> + *ENDIF
>
> Do we see many cases of
>
>  foo = batman;
>  if (condition)
>  foo = robin;

I haven't seen many cases of that, no.

>> + *
>> + * which can be easily translated into:
>> + *
>> + *(+f0) SEL dst src0 src1
>> + *
>> + * If src0 is an immediate value, we promote it to a temporary GRF.
>> + */
>> +void
>> +fs_visitor::try_replace_with_sel()
>> +{
>> +   fs_inst *endif_inst = (fs_inst *) instructions.get_tail();
>> +   assert(endif_inst->opcode == BRW_OPCODE_ENDIF);
>> +
>> +   /* Pattern match in reverse: IF, MOV, ELSE, MOV, ENDIF. */
>
> Just curious about the decision to match in reverse...

We do normal code generation for an ir_if, then, after emitting the 
closing ENDIF, check if it fits the pattern.  Since the end of the list 
is just after the ENDIF, and I don't know how many instructions may have 
been generated, it makes sense to do it in reverse.
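
To illustrate the reverse match, here is a small standalone C sketch. It is not
the i965 code: the opcode enum and the function name are made up for the
example, and a plain array stands in for the instruction list. The point is
only that walking back from the tail needs no knowledge of how many
instructions were emitted before the IF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; not the BRW_OPCODE_* values. */
enum op { OP_IF, OP_MOV, OP_ELSE, OP_ENDIF, OP_ADD };

/* Returns 1 if the last five instructions are IF, MOV, ELSE, MOV, ENDIF.
 * The comparison starts at the tail and walks backwards. */
static int tail_matches_sel_pattern(const enum op *insts, size_t count)
{
   static const enum op pattern_reversed[] = {
      OP_ENDIF, OP_MOV, OP_ELSE, OP_MOV, OP_IF
   };
   const size_t n = sizeof(pattern_reversed) / sizeof(pattern_reversed[0]);
   size_t i;

   if (count < n)
      return 0;

   /* The caller is expected to run this right after emitting ENDIF. */
   assert(insts[count - 1] == OP_ENDIF);

   for (i = 0; i < n; i++) {
      if (insts[count - 1 - i] != pattern_reversed[i])
         return 0;
   }
   return 1;
}
```

Any extra instruction inside either branch makes the match fail, in which case
the IF/ELSE/ENDIF sequence would simply be left as generated.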


Re: [Mesa-dev] [PATCH] gallivm: simplify geometry shader mask handling a bit

2013-08-12 Thread Zack Rusin

> From: Roland Scheidegger 
> 
> Instead of reducing masks to 0/1 simply use the mask directly as -1.
> Also use some signed comparison instead of unsigned (as far as I understand
> these values have to be (very) small and signed means llvm doesn't have to
> apply additional logic to do the unsigned comparisons the cpu can't do).
> Saves a couple of instructions in some test geometry shader here.
> 
> v2: that was a bit too much optimization, don't skip combining the masks...

k, I think that one looks good. 

Reviewed-by: Zack Rusin 
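
For readers following along, a scalar sketch of why the -1 mask can be used
directly: subtracting an all-ones lane mask has the same effect as reducing it
to 0/1 and adding. This is plain C over an assumed 4-lane array, with made-up
function names; it stands in for the LLVMBuildAnd/LLVMBuildAdd versus
LLVMBuildSub sequences and is not the gallivm code itself.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* assumed vector width for the example */

/* Old scheme: reduce the all-ones mask to 0/1, then add it. */
static void increment_by_one_mask(int32_t *counts, const int32_t *mask)
{
   for (int i = 0; i < LANES; i++) {
      assert(mask[i] == 0 || mask[i] == -1);
      counts[i] += (mask[i] & 1);
   }
}

/* New scheme: subtract the mask directly, since x - (-1) == x + 1. */
static void increment_by_sub_mask(int32_t *counts, const int32_t *mask)
{
   for (int i = 0; i < LANES; i++) {
      assert(mask[i] == 0 || mask[i] == -1);
      counts[i] -= mask[i];
   }
}
```

Both functions leave inactive lanes (mask 0) untouched and bump active lanes
(mask -1) by one; the second just skips the extra AND.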

[Mesa-dev] [PATCH] gallivm: fix exec_mask interaction with geometry shader after end of main

2013-08-12 Thread sroland

From: Roland Scheidegger 

Because we must maintain an exec_mask even if there's currently nothing
on the mask stack, we can still have an exec_mask at the end of the program.
Effectively, this mask should be set back to default when returning from main.
Without relying on END/RET opcode (I think it's valid to have neither) it is
actually difficult to do this, as there doesn't seem to be any reasonable place to
do it, so instead let's just say the exec_mask is invalid outside main (which
it really is effectively).
The problem is that geometry shader called end_primitive outside the shader
(in the epilogue), and as a result used a bogus mask, leading to bugs if we
had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid
the mask combining function when called from outside the shader.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi.c |2 +-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   28 +++
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
index 495940c..5a9e8d0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
@@ -466,7 +466,7 @@ lp_build_tgsi_llvm(
 
while (bld_base->pc != -1) {
   struct tgsi_full_instruction *instr = bld_base->instructions +
-   bld_base->pc;
+   bld_base->pc;
   const struct tgsi_opcode_info *opcode_info =
  tgsi_get_opcode_info(instr->Instruction.Opcode);
   if (!lp_build_tgsi_inst_llvm(bld_base, instr)) {
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 589ea4f..db8e997 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -2691,11 +2691,21 @@ end_primitive_masked(struct lp_build_tgsi_context * 
bld_base,
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
 
if (bld->gs_iface->end_primitive) {
+  struct lp_build_context *uint_bld = &bld_base->uint_bld;
   LLVMValueRef emitted_vertices_vec =
  LLVMBuildLoad(builder, bld->emitted_vertices_vec_ptr, "");
   LLVMValueRef emitted_prims_vec =
  LLVMBuildLoad(builder, bld->emitted_prims_vec_ptr, "");
 
+  LLVMValueRef emitted_mask = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
+   emitted_vertices_vec,
+   uint_bld->zero);
+  /* We need to combine the current execution mask with the mask
+ telling us which, if any, execution slots actually have
+ unemitted primitives, this way we make sure that end_primitives
+ executes only on the paths that have unflushed vertices */
+  mask = LLVMBuildAnd(builder, mask, emitted_mask, "");
+
   bld->gs_iface->end_primitive(bld->gs_iface, &bld->bld_base,
emitted_vertices_vec,
emitted_prims_vec);
@@ -2735,20 +2745,7 @@ end_primitive(
struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
 
if (bld->gs_iface->end_primitive) {
-  LLVMBuilderRef builder = bld_base->base.gallivm->builder;
   LLVMValueRef mask = mask_vec(bld_base);
-  struct lp_build_context *uint_bld = &bld_base->uint_bld;
-  LLVMValueRef emitted_verts = LLVMBuildLoad(
- builder, bld->emitted_vertices_vec_ptr, "");
-  LLVMValueRef emitted_mask = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
-   emitted_verts,
-   uint_bld->zero);
-  /* We need to combine the current execution mask with the mask
- telling us which, if any, execution slots actually have
- unemitted primitives, this way we make sure that end_primitives
- executes only on the paths that have unflushed vertices */
-  mask = LLVMBuildAnd(builder, mask, emitted_mask, "");
-
   end_primitive_masked(bld_base, mask);
}
 }
@@ -3148,8 +3145,9 @@ static void emit_epilogue(struct lp_build_tgsi_context * 
bld_base)
   LLVMValueRef total_emitted_vertices_vec;
   LLVMValueRef emitted_prims_vec;
   /* implicit end_primitives, needed in case there are any unflushed
- vertices in the cache */
-  end_primitive(NULL, bld_base, NULL);
+ vertices in the cache. Note must not call end_primitive here
+ since the exec_mask is not valid at this point. */
+  end_primitive_masked(bld_base, lp_build_mask_value(bld->mask));
   
   total_emitted_vertices_vec =
  LLVMBuildLoad(builder, bld->total_emitted_vertices_vec_ptr, "");
-- 
1.7.9.5
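
A per-lane sketch of the mask combination that the patch above moves into
end_primitive_masked, in plain C. The 4-lane width and the function name are
assumptions for the example, and arrays stand in for the LLVM vectors. It shows
why the caller's mask matters: a lane is kept only if it is active in the mask
passed in and has unflushed vertices.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* assumed vector width for the example */

/* Keep only lanes that are active in `mask` AND have emitted vertices
 * that were not yet flushed as a primitive. */
static void combine_with_emitted(int32_t *out, const int32_t *mask,
                                 const uint32_t *emitted_vertices)
{
   for (int i = 0; i < LANES; i++) {
      int32_t emitted_mask;

      assert(mask[i] == 0 || mask[i] == -1);
      emitted_mask = (emitted_vertices[i] != 0) ? -1 : 0;
      out[i] = mask[i] & emitted_mask;
   }
}
```

If the epilogue passed in a mask that still carried leftover exec_mask bits, a
lane with pending vertices could be dropped; passing only the base mask, as
emit_epilogue does after this patch, avoids that.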

Re: [Mesa-dev] [PATCH 2/2] draw: simplify prim mask construction

2013-08-12 Thread Zack Rusin

Looks good.

Reviewed-by: Zack Rusin 

- Original Message -
> From: Roland Scheidegger 
> 
> The code was quite weird, the second comparison was in fact a complete no-op
> and we can also do the comparison with the vector directly instead of scalar,
> which should not only be faster but also makes it way more obvious what that
> mask is actually going to look like.
> (Not sure how many instructions that saves as it turned out the mask wasn't
> used in the test geometry shader I used at all after all...)
> ---
>  src/gallium/auxiliary/draw/draw_llvm.c |   32
>  ++--
>  1 file changed, 10 insertions(+), 22 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 68f6369..84e3392 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -2040,31 +2040,19 @@ generate_mask_value(struct draw_gs_llvm_variant
> *variant,
>  {
> struct gallivm_state *gallivm = variant->gallivm;
> LLVMBuilderRef builder = gallivm->builder;
> -   LLVMValueRef bits[16];
> -   struct lp_type  mask_type = lp_int_type(gs_type);
> -   struct lp_type mask_elem_type = lp_elem_type(mask_type);
> -   LLVMValueRef mask_val = lp_build_const_vec(gallivm,
> -  mask_type,
> -  0);
> +   struct lp_type mask_type = lp_int_type(gs_type);
> +   LLVMValueRef num_prims;
> +   LLVMValueRef mask_val = lp_build_const_vec(gallivm, mask_type, 0);
> unsigned i;
>  
> -   assert(gs_type.length <= Elements(bits));
> -
> -   for (i = gs_type.length; i >= 1; --i) {
> -  int idx = i - 1;
> -  LLVMValueRef ind = lp_build_const_int32(gallivm, i);
> -  bits[idx] = lp_build_compare(gallivm,
> -   mask_elem_type, PIPE_FUNC_GEQUAL,
> -   variant->num_prims, ind);
> -   }
> -   for (i = 0; i < gs_type.length; ++i) {
> -  LLVMValueRef ind = lp_build_const_int32(gallivm, i);
> -  mask_val = LLVMBuildInsertElement(builder, mask_val, bits[i], ind,
> "");
> +   num_prims = lp_build_broadcast(gallivm, lp_build_vec_type(gallivm,
> mask_type),
> +  variant->num_prims);
> +   for (i = 0; i <= gs_type.length; i++) {
> +  LLVMValueRef idx = lp_build_const_int32(gallivm, i);
> +  mask_val = LLVMBuildInsertElement(builder, mask_val, idx, idx, "");
> }
> -   mask_val = lp_build_compare(gallivm,
> -   mask_type, PIPE_FUNC_NOTEQUAL,
> -   mask_val,
> -   lp_build_const_int_vec(gallivm, mask_type,
> 0));
> +   mask_val = lp_build_compare(gallivm, mask_type,
> +   PIPE_FUNC_GREATER, num_prims, mask_val);
>  
> return mask_val;
>  }
> --
> 1.7.9.5
> 
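
As a scalar illustration of the mask described in the quoted commit message:
lane i is enabled exactly when num_prims is greater than i, which is what a
vector compare of the broadcast num_prims against the index vector 0, 1, 2, ...
yields. The function name and the array representation are assumptions for the
example; this is not the draw_llvm.c code.

```c
#include <assert.h>
#include <stdint.h>

/* mask[i] is all ones (-1) when lane i holds a valid primitive,
 * i.e. when num_prims > i, and 0 otherwise. */
static void build_prim_mask(int32_t *mask, int length, int32_t num_prims)
{
   assert(length > 0);
   for (int i = 0; i < length; i++)
      mask[i] = (num_prims > i) ? -1 : 0;
}
```

For a 4-wide vector and num_prims = 2, only the first two lanes end up enabled.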

Re: [Mesa-dev] [PATCH] gallivm: fix exec_mask interaction with geometry shader after end of main

2013-08-12 Thread Zack Rusin

Ah, that looks like a great catch.

Reviewed-by: Zack Rusin 

- Original Message -
> From: Roland Scheidegger 
> 
> Because we must maintain an exec_mask even if there's currently nothing
> on the mask stack, we can still have an exec_mask at the end of the program.
> Effectively, this mask should be set back to default when returning from
> main.
> Without relying on an END/RET opcode (I think it's valid to have neither) it
> is actually difficult to do this, as there doesn't seem to be any reasonable
> place to do it, so instead let's just say the exec_mask is invalid outside
> main (which it effectively is).
> The problem is that the geometry shader called end_primitive outside the
> shader (in the epilogue), and as a result used a bogus mask, leading to bugs
> if we had to set the (somewhat misnamed) ret_in_main bit anywhere. So just
> avoid the mask combining function when called from outside the shader.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.c |2 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |   28
>  +++
>  2 files changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> index 495940c..5a9e8d0 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.c
> @@ -466,7 +466,7 @@ lp_build_tgsi_llvm(
>  
> while (bld_base->pc != -1) {
>struct tgsi_full_instruction *instr = bld_base->instructions +
> - bld_base->pc;
> +   bld_base->pc;
>const struct tgsi_opcode_info *opcode_info =
>   tgsi_get_opcode_info(instr->Instruction.Opcode);
>if (!lp_build_tgsi_inst_llvm(bld_base, instr)) {
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 589ea4f..db8e997 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -2691,11 +2691,21 @@ end_primitive_masked(struct lp_build_tgsi_context *
> bld_base,
> LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
>  
> if (bld->gs_iface->end_primitive) {
> +  struct lp_build_context *uint_bld = &bld_base->uint_bld;
>LLVMValueRef emitted_vertices_vec =
>   LLVMBuildLoad(builder, bld->emitted_vertices_vec_ptr, "");
>LLVMValueRef emitted_prims_vec =
>   LLVMBuildLoad(builder, bld->emitted_prims_vec_ptr, "");
>  
> +  LLVMValueRef emitted_mask = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
> +   emitted_vertices_vec,
> +   uint_bld->zero);
> +  /* We need to combine the current execution mask with the mask
> + telling us which, if any, execution slots actually have
> + unemitted primitives, this way we make sure that end_primitives
> + executes only on the paths that have unflushed vertices */
> +  mask = LLVMBuildAnd(builder, mask, emitted_mask, "");
> +
>bld->gs_iface->end_primitive(bld->gs_iface, &bld->bld_base,
> emitted_vertices_vec,
> emitted_prims_vec);
> @@ -2735,20 +2745,7 @@ end_primitive(
> struct lp_build_tgsi_soa_context * bld = lp_soa_context(bld_base);
>  
> if (bld->gs_iface->end_primitive) {
> -  LLVMBuilderRef builder = bld_base->base.gallivm->builder;
>LLVMValueRef mask = mask_vec(bld_base);
> -  struct lp_build_context *uint_bld = &bld_base->uint_bld;
> -  LLVMValueRef emitted_verts = LLVMBuildLoad(
> - builder, bld->emitted_vertices_vec_ptr, "");
> -  LLVMValueRef emitted_mask = lp_build_cmp(uint_bld, PIPE_FUNC_NOTEQUAL,
> -   emitted_verts,
> -   uint_bld->zero);
> -  /* We need to combine the current execution mask with the mask
> - telling us which, if any, execution slots actually have
> - unemitted primitives, this way we make sure that end_primitives
> - executes only on the paths that have unflushed vertices */
> -  mask = LLVMBuildAnd(builder, mask, emitted_mask, "");
> -
>end_primitive_masked(bld_base, mask);
> }
>  }
> @@ -3148,8 +3145,9 @@ static void emit_epilogue(struct lp_build_tgsi_context
> * bld_base)
>LLVMValueRef total_emitted_vertices_vec;
>LLVMValueRef emitted_prims_vec;
>/* implicit end_primitives, needed in case there are any unflushed
> - vertices in the cache */
> -  end_primitive(NULL, bld_base, NULL);
> + vertices in the cache. Note must not call end_primitive here
> + since the exec_mask is not valid at this point. */
> +  end_primitive_masked(bld_base, lp_build_mask_value(bld->mask));
>
>

Re: [Mesa-dev] [PATCH 1/6] i965/fs: Log a performance warning if skipping 16-wide due to pulls.

2013-08-12 Thread Matt Turner

On Mon, Aug 5, 2013 at 6:28 PM, Kenneth Graunke  wrote:
> Usually, the driver creates both 8-wide and 16-wide variants of every
> fragment shader.  When 16-wide compilation fails, it logs a performance
> warning explaining why only an 8-wide program exists.
>
> However, when there are pull parameters, the driver won't even bother
> trying the 16-wide compile (since it would fail).  In this case, it
> failed to emit a performance warning, leaving no explanation for the
> missing 16-wide program.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index a81e97f..a953310 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3058,14 +3058,18 @@ brw_wm_fs_emit(struct brw_context *brw, struct 
> brw_wm_compile *c,
>
> exec_list *simd16_instructions = NULL;
> fs_visitor v2(brw, c, prog, fp, 16);
> -   bool no16 = INTEL_DEBUG & DEBUG_NO16;
> -   if (brw->gen >= 5 && c->prog_data.nr_pull_params == 0 && likely(!no16)) {
> -  v2.import_uniforms(&v);
> -  if (!v2.run()) {
> - perf_debug("16-wide shader failed to compile, falling back to "
> -"8-wide at a 10-20%% performance cost: %s", v2.fail_msg);
> +   if (brw->gen >= 5 && likely(!(INTEL_DEBUG & DEBUG_NO16))) {
> +  if (c->prog_data.nr_pull_params == 0) {
> + /* Try a 16-wide compile */
> + v2.import_uniforms(&v);
> + if (!v2.run()) {
> +perf_debug("16-wide shader failed to compile, falling back to "
> +   "8-wide at a 10-20%% performance cost: %s", 
> v2.fail_msg);
> + } else {
> +simd16_instructions = &v2.instructions;
> + }
>} else {
> - simd16_instructions = &v2.instructions;
> + perf_debug("Skipping 16-wide due to pull parameters.\n");
>}
> }
>
> --
> 1.8.3.4

Series is
Reviewed-by: Matt Turner 
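
The control flow after the quoted patch can be summarized with a small
standalone sketch. decide_simd16(), the Simd16 enum and the compile_ok flag
are hypothetical stand-ins for the logic in brw_wm_fs_emit(); only the two
warning strings are taken from the diff (the first one is shortened).

```cpp
#include <cassert>
#include <string>

// Hypothetical outcome type; not part of the i965 driver.
enum class Simd16 { Compiled, Failed, SkippedPulls, Disabled };

// Mirrors the nested conditions of the patched code: the hardware/debug
// check comes first, then the pull-parameter check, then the compile.
Simd16 decide_simd16(int gen, bool no16, unsigned nr_pull_params,
                     bool compile_ok, std::string &perf_msg)
{
   perf_msg.clear();
   if (gen < 5 || no16)
      return Simd16::Disabled;           // 16-wide is not attempted at all
   if (nr_pull_params != 0) {
      perf_msg = "Skipping 16-wide due to pull parameters.";
      return Simd16::SkippedPulls;       // the case that used to be silent
   }
   if (!compile_ok) {
      perf_msg = "16-wide shader failed to compile, falling back to 8-wide";
      return Simd16::Failed;
   }
   return Simd16::Compiled;
}
```

Every path that leaves the program without a 16-wide variant, other than
the Disabled one, now produces a message.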

[Mesa-dev] [PATCH 01/15] i965/fs: Skip global copy propagation step.

2013-08-12 Thread Kenneth Graunke

The dataflow analysis used for global copy propagation is severely
broken, and I believe it doesn't actually do anything.  Fixing it will
require a lot of changes, each of which might break things.

Once all the fixes land, we can re-enable this.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 234f8bd..d8d1546 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -445,6 +445,7 @@ fs_visitor::opt_copy_propagate()
   out_acp[b]) || progress;
}
 
+   #if 0
/* Do dataflow analysis for those available copies. */
fs_copy_prop_dataflow dataflow(mem_ctx, &cfg, out_acp);
 
@@ -464,6 +465,7 @@ fs_visitor::opt_copy_propagate()
 
   progress = opt_copy_propagate_local(mem_ctx, block, in_acp) || progress;
}
+   #endif
 
for (int i = 0; i < cfg.num_blocks; i++)
   delete [] out_acp[i];
-- 
1.8.3.4


[Mesa-dev] [PATCH 02/15] i965/fs: Don't kill ACP entries due to their generating instruction.

2013-08-12 Thread Kenneth Graunke

fs_copy_prop_dataflow::setup_kills() walks through each basic block's
instructions, looking for instructions which overwrite registers used
in ACP entries.

This would be fine, except that it didn't exclude the instructions
which generated the ACP entries in the first place.  This meant that
every copy was killed by its own generating instruction, making the
dataflow analysis useless.

To fix this, this patch records the generating instruction in the
ACP entry.  It then skips the kill checks when processing that
instruction.

Using a void pointer ensures it will only be used for comparisons,
and not dereferenced.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index d8d1546..379605c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -42,6 +42,9 @@ namespace { /* avoid conflict with 
opt_copy_propagation_elements */
 struct acp_entry : public exec_node {
fs_reg dst;
fs_reg src;
+
+   /** MOV instruction which generated this copy/ACP entry. */
+   void *inst;
 };
 
 struct block_data {
@@ -146,8 +149,9 @@ fs_copy_prop_dataflow::setup_kills()
 continue;
 
  for (int i = 0; i < num_acp; i++) {
-if (inst->overwrites_reg(acp[i]->dst) ||
-inst->overwrites_reg(acp[i]->src)) {
+if (inst != acp[i]->inst &&
+(inst->overwrites_reg(acp[i]->dst) ||
+ inst->overwrites_reg(acp[i]->src))) {
BITSET_SET(bd[b].kill, i);
 }
  }
@@ -418,6 +422,7 @@ fs_visitor::opt_copy_propagate_local(void *mem_ctx, 
bblock_t *block,
 acp_entry *entry = ralloc(mem_ctx, acp_entry);
 entry->dst = inst->dst;
 entry->src = inst->src[0];
+ entry->inst = inst;
 acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry);
   }
}
-- 
1.8.3.4
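
A toy model of the problem described above, assuming a much simplified
instruction representation: Inst, AcpEntry and compute_kills() are
illustrative names, and "overwrites" is reduced to comparing register
numbers. With the generator check disabled, the MOV that creates a copy
also kills it; with the check enabled, it does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Inst { int dst; int src; };                 // toy "dst = src" MOV
struct AcpEntry { int dst; int src; const void *inst; };

// Marks ACP entries killed by the block. When skip_generator is true, the
// instruction that produced an entry is not allowed to kill it.
std::vector<bool> compute_kills(const std::vector<Inst> &block,
                                const std::vector<AcpEntry> &acp,
                                bool skip_generator)
{
   std::vector<bool> kill(acp.size(), false);
   for (const Inst &inst : block) {
      for (std::size_t i = 0; i < acp.size(); i++) {
         if (skip_generator && &inst == acp[i].inst)
            continue;                              // pointer compare only
         if (inst.dst == acp[i].dst || inst.dst == acp[i].src)
            kill[i] = true;
      }
   }
   return kill;
}
```

As in the patch, the stored pointer is only compared, never dereferenced.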


[Mesa-dev] [PATCH 03/15] i965/fs: Switch to a do-while loop in copy propagation dataflow.

2013-08-12 Thread Kenneth Graunke

The fixed-point algorithm needs to run at least once, so a do-while loop
is more natural.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 379605c..5efa812 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -166,9 +166,9 @@ fs_copy_prop_dataflow::setup_kills()
 void
 fs_copy_prop_dataflow::run()
 {
-   bool cont = true;
+   bool cont;
 
-   while (cont) {
+   do {
   cont = false;
 
   for (int b = 0; b < cfg->num_blocks; b++) {
@@ -198,7 +198,7 @@ fs_copy_prop_dataflow::run()
 }
  }
   }
-   }
+   } while (cont);
 }
 
 bool
-- 
1.8.3.4


[Mesa-dev] [PATCH 04/15] i965/fs: Rename "cont" to "progress" in dataflow algorithm.

2013-08-12 Thread Kenneth Graunke

This variable indicates that the fixed-point algorithm made changes to
the data at this step, so it needs to run for another iteration.

"progress" seems a nicer name for that.
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 5efa812..9204f0e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -166,10 +166,10 @@ fs_copy_prop_dataflow::setup_kills()
 void
 fs_copy_prop_dataflow::run()
 {
-   bool cont;
+   bool progress;
 
do {
-  cont = false;
+  progress = false;
 
   for (int b = 0; b < cfg->num_blocks; b++) {
  for (int i = 0; i < bitset_words; i++) {
@@ -178,7 +178,7 @@ fs_copy_prop_dataflow::run()
~bd[b].liveout[i]);
 if (new_liveout) {
bd[b].liveout[i] |= new_liveout;
-   cont = true;
+   progress = true;
 }
 
 /* Update livein: if it's live at the end of all parents, it's
@@ -194,11 +194,11 @@ fs_copy_prop_dataflow::run()
 }
 if (new_livein) {
bd[b].livein[i] |= new_livein;
-   cont = true;
+   progress = true;
 }
  }
   }
-   } while (cont);
+   } while (progress);
 }
 
 bool
-- 
1.8.3.4


[Mesa-dev] [PATCH 05/15] i965/fs: Separate the updating of liveout/livein.

2013-08-12 Thread Kenneth Graunke

To compute the actual liveout/livein data flow values, we start with
some initial values and apply a fixed-point algorithm until they settle.

Previously, we iterated through all blocks, updating both liveout and
livein together in one pass.  This is awkward, since computing livein
for a block requires knowing liveout for all parent blocks.  Not all
of those parent blocks may have been processed yet.

This patch separates the two.  First, we update liveout for all blocks.
At iteration N of the fixed-point algorithm, this uses livein values
from iteration N-1.  Secondly, we update livein for all blocks.  At
step N, this uses the liveout information we just computed (in step N).

This ensures each computation has a consistent picture of the data,
rather than seeing a random mix of data from steps N-1 and N depending
on the order of the blocks in the CFG data structure.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 9204f0e..a00dde4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -171,6 +171,7 @@ fs_copy_prop_dataflow::run()
do {
   progress = false;
 
+  /* Update liveout for all blocks. */
   for (int b = 0; b < cfg->num_blocks; b++) {
  for (int i = 0; i < bitset_words; i++) {
 BITSET_WORD new_liveout = (bd[b].livein[i] &
@@ -180,10 +181,14 @@ fs_copy_prop_dataflow::run()
bd[b].liveout[i] |= new_liveout;
progress = true;
 }
+ }
+  }
 
-/* Update livein: if it's live at the end of all parents, it's
- * live at our start.
- */
+  /* Update livein for all blocks.  If a copy is live out of all parent
+   * blocks, it's live coming in to this block.
+   */
+  for (int b = 0; b < cfg->num_blocks; b++) {
+ for (int i = 0; i < bitset_words; i++) {
 BITSET_WORD new_livein = ~bd[b].livein[i];
 foreach_list(block_node, &cfg->blocks[b]->parents) {
bblock_link *link = (bblock_link *)block_node;
-- 
1.8.3.4
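
A minimal sketch of the two-phase step, assuming one 32-bit word per set and
the OR-based update rules the code used at this point in the series. Block
and iterate_once() are made-up names for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Block {
   std::vector<int> parents;      // indices of predecessor blocks
   uint32_t livein = 0, liveout = 0, kill = 0;
};

// One iteration of the fixed-point loop. Returns true if anything changed.
bool iterate_once(std::vector<Block> &bd)
{
   bool progress = false;

   // Phase 1: liveout for all blocks, using livein from the previous step.
   for (Block &b : bd) {
      const uint32_t out = b.liveout | (b.livein & ~b.kill);
      if (out != b.liveout) { b.liveout = out; progress = true; }
   }

   // Phase 2: livein for all blocks, using the liveout just computed.
   for (Block &b : bd) {
      if (b.parents.empty())
         continue;
      uint32_t in = ~0u;
      for (int p : b.parents)
         in &= bd[p].liveout;
      in |= b.livein;
      if (in != b.livein) { b.livein = in; progress = true; }
   }
   return progress;
}
```

For a two-block chain where block 0 starts with bit 0 in livein, the first
call moves the bit to block 0's liveout and block 1's livein, the second
call moves it to block 1's liveout, and the third reports no progress. The
result no longer depends on the order in which blocks are stored.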


[Mesa-dev] i965 global copy propagation fixes

2013-08-12 Thread Kenneth Graunke

Hello,

This surprisingly large series fixes bugs in the data flow algorithm
for copy propagation.  As far as I can tell, there were a myriad of
issues.  The updated algorithm seems to follow the textbook and
online materials much more closely.

No Piglit regressions on Ivybridge.

shader-db statistics are more or less a wash:

total instructions in shared programs: 1569184 -> 1569423 (0.02%)
instructions in affected programs: 44961 -> 45200 (0.53%)

A bunch of things are helped by a small margin, while other things are hurt
by a small margin (a couple of extra MOVs).  The few hurt shaders I looked
into appeared to suffer from compute-to-MRF not handling control flow.
There may be other reasons as well.

I'd still like to land it, though, as the old algorithm seems to be
pretty broken.


--Ken


[Mesa-dev] [PATCH 06/15] i965/fs: Rename setup_kills() to setup_initial_values().

2013-08-12 Thread Kenneth Graunke

Although this function currently only initializes the KILL set, it will
soon initialize other data flow sets as well.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index a00dde4..7aff36b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -76,7 +76,7 @@ public:
fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg,
  exec_list *out_acp[ACP_HASH_SIZE]);
 
-   void setup_kills();
+   void setup_initial_values();
void run();
 
void *mem_ctx;
@@ -128,16 +128,16 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void 
*mem_ctx, cfg_t *cfg,
 
assert(next_acp == num_acp);
 
-   setup_kills();
+   setup_initial_values();
run();
 }
 
 /**
- * Walk the set of instructions in the block, marking which entries in the acp
- * are killed by the block.
+ * Set up initial values for each of the data flow sets, prior to running
+ * the fixed-point algorithm.
  */
 void
-fs_copy_prop_dataflow::setup_kills()
+fs_copy_prop_dataflow::setup_initial_values()
 {
for (int b = 0; b < cfg->num_blocks; b++) {
   bblock_t *block = cfg->blocks[b];
@@ -148,6 +148,7 @@ fs_copy_prop_dataflow::setup_kills()
  if (inst->dst.file != GRF)
 continue;
 
+ /* Mark ACP entries which are killed by this instruction. */
  for (int i = 0; i < num_acp; i++) {
 if (inst != acp[i]->inst &&
 (inst->overwrites_reg(acp[i]->dst) ||
-- 
1.8.3.4


[Mesa-dev] [PATCH 07/15] i965/fs: Create the COPY() set for use in copy propagation dataflow.

2013-08-12 Thread Kenneth Graunke

This is the "COPY" set from Muchnick's textbook, which is necessary
to do the dataflow algorithm correctly.

Signed-off-by: Kenneth Graunke 
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 27 ++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 7aff36b..2970af6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -64,6 +64,13 @@ struct block_data {
BITSET_WORD *liveout;
 
/**
+* Which entries in the fs_copy_prop_dataflow acp table are generated by
+* instructions in this block which reach the end of the block without
+* being killed.
+*/
+   BITSET_WORD *copy;
+
+   /**
 * Which entries in the fs_copy_prop_dataflow acp table are killed over the
 * course of this block.
 */
@@ -113,6 +120,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, 
cfg_t *cfg,
for (int b = 0; b < cfg->num_blocks; b++) {
   bd[b].livein = rzalloc_array(bd, BITSET_WORD, bitset_words);
   bd[b].liveout = rzalloc_array(bd, BITSET_WORD, bitset_words);
+  bd[b].copy = rzalloc_array(bd, BITSET_WORD, bitset_words);
   bd[b].kill = rzalloc_array(bd, BITSET_WORD, bitset_words);
 
   for (int i = 0; i < ACP_HASH_SIZE; i++) {
@@ -148,15 +156,26 @@ fs_copy_prop_dataflow::setup_initial_values()
  if (inst->dst.file != GRF)
 continue;
 
- /* Mark ACP entries which are killed by this instruction. */
  for (int i = 0; i < num_acp; i++) {
-if (inst != acp[i]->inst &&
-(inst->overwrites_reg(acp[i]->dst) ||
- inst->overwrites_reg(acp[i]->src))) {
+if (inst == acp[i]->inst) {
+   /* Add this entry to the COPY set. */
+   BITSET_SET(bd[b].copy, i);
+} else if (inst->overwrites_reg(acp[i]->dst) ||
+   inst->overwrites_reg(acp[i]->src)) {
+   /* The current instruction kills this copy.  Add the entry to
+* the KILL set.
+*/
BITSET_SET(bd[b].kill, i);
 }
  }
   }
+
+  /* Anything killed did not make it to the end of the block, so it
+   * shouldn't be in COPY.
+   */
+  for (int i = 0; i < bitset_words; i++) {
+ bd[b].copy[i] &= ~bd[b].kill[i];
+  }
}
 }
 
-- 
1.8.3.4
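
To make the COPY/KILL bookkeeping concrete, here is a small sketch for a
single block, assuming at most 32 ACP entries and a toy instruction form.
Inst, AcpEntry, Sets and compute_copy_kill() are illustrative names; the
structure follows the loop in the diff, including the final
"copy &= ~kill" step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Inst { int dst; int src; };                  // toy "dst = src" MOV
struct AcpEntry { int dst; int src; int inst; };    // inst: generator index
struct Sets { uint32_t copy; uint32_t kill; };

Sets compute_copy_kill(const std::vector<Inst> &block,
                       const std::vector<AcpEntry> &acp)
{
   assert(acp.size() <= 32);
   Sets s = {0u, 0u};
   for (std::size_t n = 0; n < block.size(); n++) {
      for (std::size_t i = 0; i < acp.size(); i++) {
         if (static_cast<int>(n) == acp[i].inst) {
            s.copy |= 1u << i;                      // generated here
         } else if (block[n].dst == acp[i].dst ||
                    block[n].dst == acp[i].src) {
            s.kill |= 1u << i;                      // overwritten here
         }
      }
   }
   s.copy &= ~s.kill;   // killed copies do not reach the end of the block
   return s;
}
```

For the block "r1 = r0; r0 = r2", entry 0 (r1 = r0) is killed by the second
MOV because its source is overwritten, so only entry 1 remains in COPY.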


[Mesa-dev] [PATCH 08/15] i965/fs: Simplify liveout calculation.

2013-08-12 Thread Kenneth Graunke

Excluding the existing liveout bits is a deviation from the textbook
algorithm.  The reason for doing so was to determine if the value
changed, which means the fixed-point algorithm needs to run for another
iteration.

The simpler way to do that is to save the value from step (N-1) and
compare it to the new value at step N.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 2970af6..2ab7734 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -194,13 +194,12 @@ fs_copy_prop_dataflow::run()
   /* Update liveout for all blocks. */
   for (int b = 0; b < cfg->num_blocks; b++) {
  for (int i = 0; i < bitset_words; i++) {
-BITSET_WORD new_liveout = (bd[b].livein[i] &
-   ~bd[b].kill[i] &
-   ~bd[b].liveout[i]);
-if (new_liveout) {
-   bd[b].liveout[i] |= new_liveout;
+const BITSET_WORD old_liveout = bd[b].liveout[i];
+
+bd[b].liveout[i] |= bd[b].livein[i] & ~bd[b].kill[i];
+
+if (old_liveout != bd[b].liveout[i])
progress = true;
-}
  }
   }
 
-- 
1.8.3.4


[Mesa-dev] [PATCH 09/15] i965/fs: Use the COPY set in the calculation for liveout.

2013-08-12 Thread Kenneth Graunke

According to page 360 of the textbook, the proper formula for liveout
is:

CPout(i) = COPY(i) union (CPin(i) - KILL(i))

Previously, we omitted COPY.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 2ab7734..fd726e0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -196,7 +196,8 @@ fs_copy_prop_dataflow::run()
  for (int i = 0; i < bitset_words; i++) {
 const BITSET_WORD old_liveout = bd[b].liveout[i];
 
-bd[b].liveout[i] |= bd[b].livein[i] & ~bd[b].kill[i];
+bd[b].liveout[i] |=
+   bd[b].copy[i] | (bd[b].livein[i] & ~bd[b].kill[i]);
 
 if (old_liveout != bd[b].liveout[i])
progress = true;
-- 
1.8.3.4
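
On bitsets the formula is a single expression: set union becomes OR and set
difference becomes AND with the complement. A short worked example, assuming
one 32-bit word per set (cp_out() is an illustrative helper, not a function
in the driver):

```cpp
#include <cassert>
#include <cstdint>

// CPout = COPY union (CPin - KILL), expressed on one bitset word.
uint32_t cp_out(uint32_t copy, uint32_t cp_in, uint32_t kill)
{
   return copy | (cp_in & ~kill);
}
```

With COPY = 0100, CPin = 0011 and KILL = 0001 (binary), entry 0 is killed,
entry 1 passes through and entry 2 is generated locally, so CPout = 0110.
Without the COPY term the result would be 0010 and the locally generated
copy would never reach any successor block.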


[Mesa-dev] [PATCH 10/15] i965/fs: Properly initialize the livein/liveout sets.

2013-08-12 Thread Kenneth Graunke

Previously, livein was initialized to 0 for all blocks.  According to
the textbook, it should be the universal set (~0) for all blocks except
the one representing the start of the program (which should be 0).

liveout also needs to be initialized to COPY for the initial block.

Signed-off-by: Kenneth Graunke 
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp| 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index fd726e0..3344f89 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -147,6 +147,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, 
cfg_t *cfg,
 void
 fs_copy_prop_dataflow::setup_initial_values()
 {
+   /* Initialize the COPY and KILL sets. */
for (int b = 0; b < cfg->num_blocks; b++) {
   bblock_t *block = cfg->blocks[b];
 
@@ -177,6 +178,26 @@ fs_copy_prop_dataflow::setup_initial_values()
  bd[b].copy[i] &= ~bd[b].kill[i];
   }
}
+
+   /* Populate the initial values for the livein and liveout sets.  For the
+* block at the start of the program, livein = 0 and liveout = copy.
+* For the others, set liveout to 0 (the empty set) and livein to ~0
+* (the universal set).
+*/
+   for (int b = 0; b < cfg->num_blocks; b++) {
+  bblock_t *block = cfg->blocks[b];
+  if (block->parents.is_empty()) {
+ for (int i = 0; i < bitset_words; i++) {
+bd[b].livein[i] = 0u;
+bd[b].liveout[i] = bd[b].copy[i];
+ }
+  } else {
+ for (int i = 0; i < bitset_words; i++) {
+bd[b].liveout[i] = 0u;
+bd[b].livein[i] = ~0u;
+ }
+  }
+   }
 }
 
 /**
-- 
1.8.3.4


[Mesa-dev] [PATCH 11/15] i965/fs: Drop unnecessary and incorrect liveout initialization.

2013-08-12 Thread Kenneth Graunke

The previous commit properly initialized liveout.  The older (and
incorrect) initialization is no longer necessary.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 3344f89..bd73666 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -128,7 +128,6 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, 
cfg_t *cfg,
 acp_entry *entry = (acp_entry *)entry_node;
 
 acp[next_acp] = entry;
-BITSET_SET(bd[b].liveout, next_acp);
 next_acp++;
  }
   }
-- 
1.8.3.4


[Mesa-dev] [PATCH 12/15] i965/fs: Skip the initial block when updating livein/liveout.

2013-08-12 Thread Kenneth Graunke

The starting block always has livein = 0 and liveout = copy.  Since we
start with real data, not estimates, there's no need to refine it with
the fixed point algorithm.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index bd73666..f5c8e4a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -213,6 +213,9 @@ fs_copy_prop_dataflow::run()
 
   /* Update liveout for all blocks. */
   for (int b = 0; b < cfg->num_blocks; b++) {
+ if (cfg->blocks[b]->parents.is_empty())
+continue;
+
  for (int i = 0; i < bitset_words; i++) {
 const BITSET_WORD old_liveout = bd[b].liveout[i];
 
@@ -228,6 +231,9 @@ fs_copy_prop_dataflow::run()
* blocks, it's live coming in to this block.
*/
   for (int b = 0; b < cfg->num_blocks; b++) {
+ if (cfg->blocks[b]->parents.is_empty())
+continue;
+
  for (int i = 0; i < bitset_words; i++) {
 BITSET_WORD new_livein = ~bd[b].livein[i];
 foreach_list(block_node, &cfg->blocks[b]->parents) {
-- 
1.8.3.4


[Mesa-dev] [PATCH 13/15] i965/fs: Fully recompute liveout at each step.

2013-08-12 Thread Kenneth Graunke

Since we start with an overestimation of livein (~0, the universal set),
successive steps may actually take away values.  This means we can't simply
OR in new liveout values; we need to recompute it from scratch at each
iteration of the fixed-point algorithm.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index f5c8e4a..9522649 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -219,7 +219,7 @@ fs_copy_prop_dataflow::run()
  for (int i = 0; i < bitset_words; i++) {
 const BITSET_WORD old_liveout = bd[b].liveout[i];
 
-bd[b].liveout[i] |=
+bd[b].liveout[i] =
bd[b].copy[i] | (bd[b].livein[i] & ~bd[b].kill[i]);
 
 if (old_liveout != bd[b].liveout[i])
-- 
1.8.3.4


[Mesa-dev] [PATCH 14/15] i965/fs: Fix computation of livein.

2013-08-12 Thread Kenneth Graunke

Since the initial value for livein is an overestimation (~0, the universal
set), it's extremely likely that it will shrink, which means we can't simply
OR in new bits - we need to fully recompute it based on the current
liveout values.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 9522649..dc9f2f8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -235,18 +235,17 @@ fs_copy_prop_dataflow::run()
 continue;
 
  for (int i = 0; i < bitset_words; i++) {
-BITSET_WORD new_livein = ~bd[b].livein[i];
+const BITSET_WORD old_livein = bd[b].livein[i];
+
+bd[b].livein[i] = ~0u;
 foreach_list(block_node, &cfg->blocks[b]->parents) {
bblock_link *link = (bblock_link *)block_node;
bblock_t *block = link->block;
-   new_livein &= bd[block->block_num].liveout[i];
-   if (!new_livein)
-  break;
+   bd[b].livein[i] &= bd[block->block_num].liveout[i];
 }
-if (new_livein) {
-   bd[b].livein[i] |= new_livein;
+
+if (old_livein != bd[b].livein[i])
progress = true;
-}
  }
   }
} while (progress);
-- 
1.8.3.4
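
Putting the pieces of the series together, the following is a minimal
sketch of the whole analysis on a toy CFG, assuming one 32-bit word per set.
Block and solve() are illustrative names; the initialization, the skipping
of the entry block and the full recomputation of both sets follow the
descriptions in the commit messages rather than the exact driver code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Block {
   std::vector<int> parents;
   uint32_t copy = 0, kill = 0, livein = 0, liveout = 0;
};

void solve(std::vector<Block> &bd)
{
   // Entry blocks hold exact values; the rest start from the universal set.
   for (Block &b : bd) {
      if (b.parents.empty()) { b.livein = 0u;  b.liveout = b.copy; }
      else                   { b.livein = ~0u; b.liveout = 0u; }
   }

   bool progress;
   do {
      progress = false;
      for (Block &b : bd) {               // recompute liveout from scratch
         if (b.parents.empty()) continue;
         const uint32_t old = b.liveout;
         b.liveout = b.copy | (b.livein & ~b.kill);
         if (old != b.liveout) progress = true;
      }
      for (Block &b : bd) {               // recompute livein from scratch
         if (b.parents.empty()) continue;
         const uint32_t old = b.livein;
         b.livein = ~0u;
         for (int p : b.parents)
            b.livein &= bd[p].liveout;    // intersection over all parents
         if (old != b.livein) progress = true;
      }
   } while (progress);
}
```

On a diamond CFG where block 0 generates copies 0 and 1, block 2 kills
copy 0, and block 3 joins blocks 1 and 2, the solver ends with only copy 1
live into block 3. An OR-based update could never clear the bits that the
~0 starting value put there, which is why both sets are assigned rather
than accumulated.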


[Mesa-dev] [PATCH 15/15] i965/fs: Re-enable global copy propagation.

2013-08-12 Thread Kenneth Graunke

I believe the data flow analysis actually works now, and it should be
safe to re-enable global copy propagation.  It even does things now.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index dc9f2f8..6fdd420 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -500,7 +500,6 @@ fs_visitor::opt_copy_propagate()
   out_acp[b]) || progress;
}
 
-   #if 0
/* Do dataflow analysis for those available copies. */
fs_copy_prop_dataflow dataflow(mem_ctx, &cfg, out_acp);
 
@@ -520,7 +519,6 @@ fs_visitor::opt_copy_propagate()
 
   progress = opt_copy_propagate_local(mem_ctx, block, in_acp) || progress;
}
-   #endif
 
for (int i = 0; i < cfg.num_blocks; i++)
   delete [] out_acp[i];
-- 
1.8.3.4


[Mesa-dev] [PATCH 2/2] i965: Don't copy propagate bitcasts with source modifiers.

2013-08-12 Thread Matt Turner

---
 src/mesa/drivers/dri/i965/brw_fs.cpp| 13 +
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   |  3 +++
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 10 ++
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a81e97f..f104f8c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2118,6 +2118,19 @@ fs_visitor::register_coalesce()
}
 }
 
+ if (has_source_modifiers) {
+for (int i = 0; i < 3; i++) {
+   if (scan_inst->src[i].file == GRF &&
+   scan_inst->src[i].reg == inst->dst.reg &&
+   scan_inst->src[i].reg_offset == inst->dst.reg_offset &&
+   inst->dst.type != scan_inst->src[i].type)
+   {
+ return false;
+   }
+}
+ }
+
+
 /* The gen6 MATH instruction can't handle source modifiers or
  * unusual register regions, so avoid coalescing those for
  * now.  We should do something more specific.
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 234f8bd..a5cd858 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -221,6 +221,9 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
 entry->src.smear != -1) && !can_do_source_mods(inst))
   return false;
 
+   if (has_source_modifiers && (entry->dst.type != inst->src[arg].type))
+  return false;
+
inst->src[arg].file = entry->src.file;
inst->src[arg].reg = entry->src.reg;
inst->src[arg].reg_offset = entry->src.reg_offset;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index c28d0de..2a2d403 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -206,14 +206,16 @@ vec4_visitor::try_copy_propagation(vec4_instruction 
*inst, int arg,
if (inst->src[arg].negate)
   value.negate = !value.negate;
 
-   bool has_source_modifiers = (value.negate || value.abs ||
-value.swizzle != BRW_SWIZZLE_XYZW ||
-value.file == UNIFORM);
+   bool has_source_modifiers = value.negate || value.abs;
 
/* gen6 math and gen7+ SENDs from GRFs ignore source modifiers on
 * instructions.
 */
-   if (has_source_modifiers && !can_do_source_mods(inst))
+   if ((has_source_modifiers || value.file == UNIFORM ||
+value.swizzle != BRW_SWIZZLE_XYZW) && !can_do_source_mods(inst))
+  return false;
+
+   if (has_source_modifiers && (value.type != inst->src[arg].type))
   return false;
 
bool is_3src_inst = (inst->opcode == BRW_OPCODE_LRP ||
-- 
1.8.3.2

[Mesa-dev] [PATCH 1/4] glsl: Add bitcast_i2f() to ir_builder.

2013-08-12 Thread Matt Turner

---
 src/glsl/ir_builder.cpp | 24 
 src/glsl/ir_builder.h   |  4 
 2 files changed, 28 insertions(+)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index 8fb30a0..5e1da17 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -304,12 +304,24 @@ f2i(operand a)
 }
 
 ir_expression*
+bitcast_f2i(operand a)
+{
+   return expr(ir_unop_bitcast_f2i, a);
+}
+
+ir_expression*
 i2f(operand a)
 {
return expr(ir_unop_i2f, a);
 }
 
 ir_expression*
+bitcast_i2f(operand a)
+{
+   return expr(ir_unop_bitcast_i2f, a);
+}
+
+ir_expression*
 i2u(operand a)
 {
return expr(ir_unop_i2u, a);
@@ -328,11 +340,23 @@ f2u(operand a)
 }
 
 ir_expression*
+bitcast_f2u(operand a)
+{
+   return expr(ir_unop_bitcast_f2u, a);
+}
+
+ir_expression*
 u2f(operand a)
 {
return expr(ir_unop_u2f, a);
 }
 
+ir_expression*
+bitcast_u2f(operand a)
+{
+   return expr(ir_unop_bitcast_u2f, a);
+}
+
 ir_if*
 if_tree(operand condition,
 ir_instruction *then_branch)
diff --git a/src/glsl/ir_builder.h b/src/glsl/ir_builder.h
index 690ac74..59985be 100644
--- a/src/glsl/ir_builder.h
+++ b/src/glsl/ir_builder.h
@@ -151,9 +151,13 @@ ir_expression *lshift(operand a, operand b);
 ir_expression *rshift(operand a, operand b);
 
 ir_expression *f2i(operand a);
+ir_expression *bitcast_f2i(operand a);
 ir_expression *i2f(operand a);
+ir_expression *bitcast_i2f(operand a);
 ir_expression *f2u(operand a);
+ir_expression *bitcast_f2u(operand a);
 ir_expression *u2f(operand a);
+ir_expression *bitcast_u2f(operand a);
 ir_expression *i2u(operand a);
 ir_expression *u2i(operand a);
 
-- 
1.8.3.2

[Mesa-dev] [PATCH 2/4] glsl: Add abs() to ir_builder.

2013-08-12 Thread Matt Turner

---
 src/glsl/ir_builder.cpp | 6 ++
 src/glsl/ir_builder.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index 5e1da17..06b6a8c 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -219,6 +219,12 @@ saturate(operand a)
   new(mem_ctx) ir_constant(0.0f));
 }
 
+ir_expression *
+abs(operand a)
+{
+   return expr(ir_unop_abs, a);
+}
+
 ir_expression*
 equal(operand a, operand b)
 {
diff --git a/src/glsl/ir_builder.h b/src/glsl/ir_builder.h
index 59985be..49c2a73 100644
--- a/src/glsl/ir_builder.h
+++ b/src/glsl/ir_builder.h
@@ -133,6 +133,7 @@ ir_expression *round_even(operand a);
 ir_expression *dot(operand a, operand b);
 ir_expression *clamp(operand a, operand b, operand c);
 ir_expression *saturate(operand a);
+ir_expression *abs(operand a);
 
 ir_expression *equal(operand a, operand b);
 ir_expression *less(operand a, operand b);
-- 
1.8.3.2

[Mesa-dev] [PATCH 3/4] glsl: Add nequal() to ir_builder.

2013-08-12 Thread Matt Turner

---
 src/glsl/ir_builder.cpp | 6 ++
 src/glsl/ir_builder.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index 06b6a8c..b47d131 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -232,6 +232,12 @@ equal(operand a, operand b)
 }
 
 ir_expression*
+nequal(operand a, operand b)
+{
+   return expr(ir_binop_nequal, a, b);
+}
+
+ir_expression*
 less(operand a, operand b)
 {
return expr(ir_binop_less, a, b);
diff --git a/src/glsl/ir_builder.h b/src/glsl/ir_builder.h
index 49c2a73..267b673 100644
--- a/src/glsl/ir_builder.h
+++ b/src/glsl/ir_builder.h
@@ -136,6 +136,7 @@ ir_expression *saturate(operand a);
 ir_expression *abs(operand a);
 
 ir_expression *equal(operand a, operand b);
+ir_expression *nequal(operand a, operand b);
 ir_expression *less(operand a, operand b);
 ir_expression *greater(operand a, operand b);
 ir_expression *lequal(operand a, operand b);
-- 
1.8.3.2

[Mesa-dev] [PATCH 4/4] glsl: Add i2b() and b2i() to ir_builder.

2013-08-12 Thread Matt Turner

---
 src/glsl/ir_builder.cpp | 12 
 src/glsl/ir_builder.h   |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index b47d131..7d9cf5e 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -369,6 +369,18 @@ bitcast_u2f(operand a)
return expr(ir_unop_bitcast_u2f, a);
 }
 
+ir_expression*
+i2b(operand a)
+{
+   return expr(ir_unop_i2b, a);
+}
+
+ir_expression*
+b2i(operand a)
+{
+   return expr(ir_unop_b2i, a);
+}
+
 ir_if*
 if_tree(operand condition,
 ir_instruction *then_branch)
diff --git a/src/glsl/ir_builder.h b/src/glsl/ir_builder.h
index 267b673..7049476 100644
--- a/src/glsl/ir_builder.h
+++ b/src/glsl/ir_builder.h
@@ -162,6 +162,8 @@ ir_expression *u2f(operand a);
 ir_expression *bitcast_u2f(operand a);
 ir_expression *i2u(operand a);
 ir_expression *u2i(operand a);
+ir_expression *b2i(operand a);
+ir_expression *i2b(operand a);
 
 /**
  * Swizzle away later components, but preserve the ordering.
-- 
1.8.3.2

[Mesa-dev] [PATCH 1/2] i965: Emit MOVs for neg/abs.

2013-08-12 Thread Matt Turner

Necessary to avoid combining a bitcast and a modifier into a single
operation. Otherwise if safe, the MOV should be removed by
copy-propagation or register coalescing.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index ee7728c..fa4554b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -361,12 +361,12 @@ fs_visitor::visit(ir_expression *ir)
   break;
case ir_unop_neg:
   op[0].negate = !op[0].negate;
-  this->result = op[0];
+  emit(MOV(this->result, op[0]));
   break;
case ir_unop_abs:
   op[0].abs = true;
   op[0].negate = false;
-  this->result = op[0];
+  emit(MOV(this->result, op[0]));
   break;
case ir_unop_sign:
   temp = fs_reg(this, ir->type);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 8d4a5d4..05c0091 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1391,12 +1391,12 @@ vec4_visitor::visit(ir_expression *ir)
   break;
case ir_unop_neg:
   op[0].negate = !op[0].negate;
-  this->result = op[0];
+  emit(MOV(result_dst, op[0]));
   break;
case ir_unop_abs:
   op[0].abs = true;
   op[0].negate = false;
-  this->result = op[0];
+  emit(MOV(result_dst, op[0]));
   break;
 
case ir_unop_sign:
-- 
1.8.3.2
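
For readers following along, here is a minimal sketch of why a negate modifier and a bitcast should not end up folded into one operation. It assumes IEEE-754 binary32 floats and that a source modifier is interpreted according to the type of the register it is attached to. The helper names are made up for illustration and are not i965 code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: reinterpret the bits of a float as an unsigned integer.
static std::uint32_t bits_of(float f)
{
   std::uint32_t u;
   std::memcpy(&u, &f, sizeof u);
   return u;
}

// Intended result: negate as a float (a MOV with a negate modifier),
// then reinterpret the result.
std::uint32_t negate_then_bitcast(float f)
{
   return bits_of(-f);
}

// What folding the two together would compute if the negate were applied
// with the integer type: a two's-complement negate of the raw bits.
std::uint32_t bitcast_then_negate(float f)
{
   return 0u - bits_of(f);   // unsigned arithmetic, so no overflow issue
}
```

For 1.0f (bits 0x3F800000) the first function flips only the sign bit, while the second negates the whole integer, so the two results differ.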

Re: [Mesa-dev] [PATCH 3/3] glsl/ast: Don't perform GS input array checks on non-inputs.

2013-08-12 Thread Kenneth Graunke


On 08/12/2013 07:14 AM, Paul Berry wrote:

Previously, we were accidentally calling
handle_geometry_shader_input_decl() on non-input interface block
declarations, resulting in bogus error checking.
---
  src/glsl/ast_to_hir.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 2e97f3b..1b8aca2 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4567,7 +4567,7 @@ ast_interface_block::hir(exec_list *instructions,
}

var->interface_type = block_type;
-  if (state->target == geometry_shader)
+  if (state->target == geometry_shader && var_mode == ir_var_shader_in)
   handle_geometry_shader_input_decl(state, loc, var);
state->symbols->add_variable(var);
instructions->push_tail(var);



This series is:
Reviewed-by: Kenneth Graunke 

Re: [Mesa-dev] [PATCH] mesa: Enable LTO by default on i965/libdricore release builds.

2013-08-12 Thread Chad Versace


On 08/08/2013 01:43 PM, Eric Anholt wrote:

We can't just smash it on globally due to (probably resolvable) issues
with the asm in glapi.  And we don't want to penalize developers with
longer build times for their normal debug environment.

Due to libdricore making almost all of our symbols public, the effect is
very small -- cairo-gl with INTEL_NO_HW=1 shows -0.798709% +/- 0.333703%
change in runtime (n=30).
---

If we were to avoid dricore, there's an additional 5% improvement available
(see the "megadriver" branch of my tree).

  configure.ac  | 25 +
  src/mesa/Makefile.am  |  4 ++--
  src/mesa/drivers/dri/i965/Makefile.am |  1 +
  src/mesa/libdricore/Makefile.am   |  8 +++-
  src/mesa/program/Makefile.am  |  4 ++--
  5 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 62d06e0..26c230d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -314,6 +314,7 @@ AC_ARG_ENABLE([debug],
  [enable_debug="$enableval"],
  [enable_debug=no]
  )
+enable_lto=yes
  if test "x$enable_debug" = xyes; then
  DEFINES_FOR_BUILD="$DEFINES_FOR_BUILD -DDEBUG"
  if test "x$GCC_FOR_BUILD" = xyes; then
@@ -330,7 +331,31 @@ if test "x$enable_debug" = xyes; then
  if test "x$GXX" = xyes; then
  CXXFLAGS="$CXXFLAGS -g -O0"
  fi
+
+# Disable LTO by default on debug builds, since it's so expensive at
+# compile time.
+enable_lto=no
+fi


I'd like to emit a configuration error if someone tries to enable LTO
in a debug build. According to the gcc-4.8 manpage, that's a really bad
idea.

   Link-time optimization does not work well with generation of
   debugging information.  Combining -flto with -g is currently
   experimental and expected to produce wrong results.

Other than that, the patch looks good to me.

Re: [Mesa-dev] [PATCH 2/4] glsl: Add abs() to ir_builder.

2013-08-12 Thread Kenneth Graunke


On 08/12/2013 01:25 PM, Matt Turner wrote:

---
  src/glsl/ir_builder.cpp | 6 ++
  src/glsl/ir_builder.h   | 1 +
  2 files changed, 7 insertions(+)

diff --git a/src/glsl/ir_builder.cpp b/src/glsl/ir_builder.cpp
index 5e1da17..06b6a8c 100644
--- a/src/glsl/ir_builder.cpp
+++ b/src/glsl/ir_builder.cpp
@@ -219,6 +219,12 @@ saturate(operand a)
   new(mem_ctx) ir_constant(0.0f));
  }

+ir_expression *
+abs(operand a)
+{
+   return expr(ir_unop_abs, a);
+}
+
  ir_expression*
  equal(operand a, operand b)
  {


I was concerned that this might conflict with abs(x) from stdlib.h, so I 
did some experimenting:


Inside a namespace ir_builder { ... } block, it does appear to hide the 
"real" abs.  You have to call it via ::abs(x).


However, in anything that *uses* ir_builder, i.e.

using namespace ir_builder;

... abs(3) ... abs(some_ir_expression)

You can use both functions and they both work fine - it's just an extra 
overload.  This means you can write natural code without worrying about it.
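
A small standalone sketch of the behaviour described above, for anyone who wants to try it. The namespace and types here are stand-ins invented for the example (the real ir_builder operand type is different), so treat it as an illustration of the lookup rules rather than Mesa code.

```cpp
#include <cassert>
#include <stdlib.h>   // declares ::abs(int) in the global namespace

namespace demo_builder {          // stand-in for ir_builder, not the real thing

struct operand { int raw; };      // hypothetical; no implicit conversion from int
struct expression { int op; };
const int unop_abs = 1;           // made-up opcode value

// Once this is declared, unqualified lookup of "abs" inside the namespace
// stops here and no longer reaches the C library function.
expression abs(operand a)
{
   (void) a;
   return expression{ unop_abs };
}

// Unqualified abs(x) would not compile here (no int -> operand conversion),
// so the C library function has to be named as ::abs.
int libc_abs_from_inside(int x)
{
   return ::abs(x);
}

} // namespace demo_builder

// A user of the namespace sees both functions as one overload set.
bool user_sees_both()
{
   using namespace demo_builder;
   operand some_operand = { -3 };
   return abs(-3) == 3 && abs(some_operand).op == unop_abs;
}
```

The using-directive makes demo_builder::abs visible alongside ::abs, and overload resolution picks between them by argument type.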


So I'm fine with this patch.  Paul and Chad both voiced in person that 
they approved as well.


For the series:
Reviewed-by: Kenneth Graunke 

Re: [Mesa-dev] OpenGL ES only configuration (without "desktop" OpenGL support)

2013-08-12 Thread Chad Versace


On 08/06/2013 09:44 PM, Siarhei Siamashka wrote:

On Tue, 6 Aug 2013 15:54:57 -0700
Matt Turner  wrote:


On Tue, Aug 6, 2013 at 2:13 PM, Siarhei Siamashka
 wrote:



But if upstream Mesa treats this configuration as unsupported, then I
also don't see it progressing anywhere in Gentoo. So could you please
re-consider this decision?


As far as I'm aware, ES without Desktop GL is disallowed only because
it was discovered to be broken, which is because no one working on Mesa
appears to test it.


I have not done any really serious testing. I'm just playing around [...]




If you can test it (and provide patches when you notice that it's
broken) I don't have a problem with allowing ES-only builds.


I agree. If you can fix Mesa to support ES-only builds and do *serious*
testing with Piglit and some real ES applications to prove that it works,
then I'm not opposed to supporting that configuration.

Re: [Mesa-dev] [PATCH 2/3] mesa: Make detach_renderbuffer available outside fbobject.c

2013-08-12 Thread Chad Versace


On 08/08/2013 04:23 PM, Ian Romanick wrote:

From: Ian Romanick 

Also add a return value indicating whether any work was done.

This will be used by the next patch.

Signed-off-by: Ian Romanick 
Cc: "9.2" mesa-sta...@lists.freedesktop.org
---
  src/mesa/main/fbobject.c | 42 +-
  src/mesa/main/fbobject.h |  6 ++
  2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index 74f294c..d121167 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1227,19 +1227,43 @@ _mesa_BindRenderbufferEXT(GLenum target, GLuint 
renderbuffer)
   * the renderbuffer.
   * This is used when a renderbuffer object is deleted.
   * The spec calls for unbinding.
+ *
+ * \returns
+ * \c true if the renderbuffer was detached from an attachment point.  \c
+ * false otherwise.
   */
-static void
-detach_renderbuffer(struct gl_context *ctx,
-struct gl_framebuffer *fb,
-struct gl_renderbuffer *rb)
+bool
+_mesa_detach_renderbuffer(struct gl_context *ctx,
+  struct gl_framebuffer *fb,
+  const void *att)
  {
-   GLuint i;
+   unsigned i;
+   bool progress = false;
+
 for (i = 0; i < BUFFER_COUNT; i++) {
-  if (fb->Attachment[i].Renderbuffer == rb) {
+  if (fb->Attachment[i].Texture == att
+  || fb->Attachment[i].Renderbuffer == att) {
   _mesa_remove_attachment(ctx, &fb->Attachment[i]);
+ progress = true;
}


This patch has an easter egg. It does more than make detach_renderbuffer
public. I think this hunk regarding textures should be folded into patch 3.
It definitely doesn't belong in this patch.

Re: [Mesa-dev] [PATCH 3/3] mesa: Use _mesa_detach_renderbuffer when deleting a texture

2013-08-12 Thread Chad Versace


On 08/08/2013 04:23 PM, Ian Romanick wrote:

From: Ian Romanick 

The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.

The remaining changes make the texture delete path look more similar to
the renderbuffer delete path.  This includes adding relevant spec
quotations to justify the behavior.

Fixes piglit fbo-incomplete "delete texture of bound FBO" test.

Signed-off-by: Ian Romanick 
Cc: "9.2" mesa-sta...@lists.freedesktop.org
---
  src/mesa/main/fbobject.c | 23 +++
  src/mesa/main/texobj.c   | 42 +++---
  2 files changed, 46 insertions(+), 19 deletions(-)


Other than the misplaced hunk in patch 2, this series looks good to me. Move
that hunk and the series is
Reviewed-by: Chad Versace 
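
As a rough sketch of the pattern this series describes (detach an object from every attachment point, report whether anything changed, and invalidate only in that case). The structures and names below are simplified stand-ins invented for the example, not the Mesa definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the Mesa structures; only the shape matters.
struct attachment {
   const void *texture;
   const void *renderbuffer;
};

const std::size_t buffer_count = 4;   // made-up size, not Mesa's BUFFER_COUNT

struct framebuffer {
   attachment att[buffer_count];
   bool invalidated;
};

// Detach obj from every attachment point and report whether anything changed.
bool detach(framebuffer &fb, const void *obj)
{
   bool progress = false;

   for (std::size_t i = 0; i < buffer_count; i++) {
      if (fb.att[i].texture == obj || fb.att[i].renderbuffer == obj) {
         fb.att[i] = attachment{ nullptr, nullptr };
         progress = true;
      }
   }
   return progress;
}

// Caller side: invalidate the framebuffer only when something was detached.
void delete_object(framebuffer &fb, const void *obj)
{
   if (detach(fb, obj))
      fb.invalidated = true;
}
```

Returning the progress flag lets the texture and renderbuffer delete paths share one helper while each decides what to do afterwards.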

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Ian Romanick


On 08/11/2013 02:32 AM, davya...@free.fr wrote:

Hello,

To enable using prime on Wayland, we need a way to create shareable textures.
That's the purpose of __DRI_IMAGE_USE_SHARE and PIPE_BIND_SHARED, but these
flags don't disable tiling. This patch changes the behaviour of these flags so
that they disable tiling. Disabling tiling is necessary since GPUs don't share
tiling modes.


Please send patches using 'git send-email'.  I can't reply to the 
attachment with review comments.
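
For context, the behaviour proposed in the quoted message could be sketched roughly as below: a resource created with a "shared" bind flag gets a linear layout, everything else may be tiled. The flag values and names are hypothetical and chosen for the example; they are not the real PIPE_BIND_* definitions or any driver's layout logic.

```cpp
#include <cassert>

// Hypothetical flag values for illustration; the real PIPE_BIND_* values differ.
enum bind_flags : unsigned {
   BIND_SCANOUT = 1u << 0,
   BIND_SHARED  = 1u << 1,
};

enum class layout { linear, tiled };

// A shared resource may be read by another GPU that does not understand
// this GPU's tiling modes, so fall back to a linear layout for it.
layout choose_layout(unsigned bind)
{
   if (bind & BIND_SHARED)
      return layout::linear;
   return layout::tiled;
}
```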


Re: [Mesa-dev] [PATCH 2/3] mesa: Make detach_renderbuffer available outside fbobject.c

2013-08-12 Thread Ian Romanick


On 08/12/2013 02:15 PM, Chad Versace wrote:

On 08/08/2013 04:23 PM, Ian Romanick wrote:

From: Ian Romanick 

Also add a return value indicating whether any work was done.

This will be used by the next patch.

Signed-off-by: Ian Romanick 
Cc: "9.2" mesa-sta...@lists.freedesktop.org
---
  src/mesa/main/fbobject.c | 42
+-
  src/mesa/main/fbobject.h |  6 ++
  2 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index 74f294c..d121167 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1227,19 +1227,43 @@ _mesa_BindRenderbufferEXT(GLenum target,
GLuint renderbuffer)
   * the renderbuffer.
   * This is used when a renderbuffer object is deleted.
   * The spec calls for unbinding.
+ *
+ * \returns
+ * \c true if the renderbuffer was detached from an attachment
point.  \c
+ * false otherwise.
   */
-static void
-detach_renderbuffer(struct gl_context *ctx,
-struct gl_framebuffer *fb,
-struct gl_renderbuffer *rb)
+bool
+_mesa_detach_renderbuffer(struct gl_context *ctx,
+  struct gl_framebuffer *fb,
+  const void *att)
  {
-   GLuint i;
+   unsigned i;
+   bool progress = false;
+
 for (i = 0; i < BUFFER_COUNT; i++) {
-  if (fb->Attachment[i].Renderbuffer == rb) {
+  if (fb->Attachment[i].Texture == att
+  || fb->Attachment[i].Renderbuffer == att) {
   _mesa_remove_attachment(ctx, &fb->Attachment[i]);
+ progress = true;
}


This patch has an easter egg. It does more than make detach_renderbuffer
public. I think this hunk regarding textures should be folded into patch 3.
It definitely doesn't belong in this patch.


Yes, you are correct.  Patch-split fail. :(

Re: [Mesa-dev] [PATCH] glx: Generate GLXBadDrawable when drawable is zero

2013-08-12 Thread Chad Versace

On 08/09/2013 03:42 PM, Ian Romanick wrote:

From: Ian Romanick 

Fixes piglit glx-query-drawable-GLXBadDrawable.

Signed-off-by: Ian Romanick 
Cc: "9.2" 
---
  src/glx/glx_pbuffer.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

Woot! You fixed my Piglit test from 2011!

Reviewed-by: Chad Versace 

[Mesa-dev] [Bug 67962] undefined reference to `wayland_drm_buffer_get'

2013-08-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=67962

Chad Versace  changed:

   What |Removed |Added
   CC   |        |chad.vers...@linux.intel.com

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Ian Romanick


On 08/11/2013 03:50 AM, davya...@free.fr wrote:

This looks good to me, but commit messages should have a prefix
indicating which component is changed. In your case it's "gallium:"
and "intel:", respectively.

Marek



Ok, I've suppressed some trailing spaces and changed the commit message.


And smashed two separate patches into one.  Was that intentional?


 From f48fdb44d638ae850c7f3df36211b33788088927 Mon Sep 17 00:00:00 2001
From: axeldavy 
Date: Sun, 11 Aug 2013 12:36:33 +0200
Subject: [PATCH 1/1] gallium: Implements the new meaning of PIPE_BIND_SHARED
  (there should be no tiling) for i915, ilo, nv50, nvc0, r300, r600, radeonsi.
  intel: _DRI_IMAGE_USE_SHARE is equivalent to PIPE_BIND_SHARED. No tiling
  either for the dri drivers i915 and i965.


Signed-off-by: axeldavy 


Real name in S-o-b, please.  Axel Davy?


---
  src/gallium/drivers/i915/i915_resource.c| 7 ++-
  src/gallium/drivers/ilo/ilo_resource.c  | 2 +-
  src/gallium/drivers/nv50/nv50_miptree.c | 3 +++
  src/gallium/drivers/nvc0/nvc0_miptree.c | 3 +++
  src/gallium/drivers/r300/r300_texture.c | 2 +-
  src/gallium/drivers/r600/r600_texture.c | 3 ++-
  src/gallium/drivers/radeonsi/r600_texture.c | 2 +-
  src/gallium/include/pipe/p_defines.h| 5 ++---
  src/mesa/drivers/dri/i915/intel_screen.c| 2 ++
  src/mesa/drivers/dri/i965/intel_screen.c| 3 ++-
  10 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/i915/i915_resource.c 
b/src/gallium/drivers/i915/i915_resource.c
index 314ebe9..563bf83 100644
--- a/src/gallium/drivers/i915/i915_resource.c
+++ b/src/gallium/drivers/i915/i915_resource.c
@@ -12,7 +12,12 @@ i915_resource_create(struct pipe_screen *screen,
 if (template->target == PIPE_BUFFER)
return i915_buffer_create(screen, template);
 else
-  return i915_texture_create(screen, template, FALSE);
+   {
+  if (!(template->bind & PIPE_BIND_SHARED))
+ return i915_texture_create(screen, template, FALSE);
+  else
+ return i915_texture_create(screen, template, TRUE);
+   }

  }

diff --git a/src/gallium/drivers/ilo/ilo_resource.c 
b/src/gallium/drivers/ilo/ilo_resource.c
index 5061f69..3d5874f 100644
--- a/src/gallium/drivers/ilo/ilo_resource.c
+++ b/src/gallium/drivers/ilo/ilo_resource.c
@@ -473,7 +473,7 @@ tex_layout_init_tiling(struct tex_layout *layout)
  * "The cursor surface address must be 4K byte aligned. The cursor must
  *  be in linear memory, it cannot be tiled."
  */
-   if (unlikely(templ->bind & PIPE_BIND_CURSOR))
+   if (unlikely(templ->bind & (PIPE_BIND_CURSOR | PIPE_BIND_SHARED)))
valid_tilings &= tile_none;

 /*
diff --git a/src/gallium/drivers/nv50/nv50_miptree.c 
b/src/gallium/drivers/nv50/nv50_miptree.c
index 28be768..a4f15fe 100644
--- a/src/gallium/drivers/nv50/nv50_miptree.c
+++ b/src/gallium/drivers/nv50/nv50_miptree.c
@@ -326,6 +326,9 @@ nv50_miptree_create(struct pipe_screen *pscreen,
 pipe_reference_init(&pt->reference, 1);
 pt->screen = pscreen;

+   if (pt->bind & PIPE_BIND_SHARED)
+  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
+
 bo_config.nv50.memtype = nv50_mt_choose_storage_type(mt, TRUE);

 if (!nv50_miptree_init_ms_mode(mt)) {
diff --git a/src/gallium/drivers/nvc0/nvc0_miptree.c 
b/src/gallium/drivers/nvc0/nvc0_miptree.c
index 9e57d74..e645553 100644
--- a/src/gallium/drivers/nvc0/nvc0_miptree.c
+++ b/src/gallium/drivers/nvc0/nvc0_miptree.c
@@ -274,6 +274,9 @@ nvc0_miptree_create(struct pipe_screen *pscreen,
}
 }

+   if (pt->bind & PIPE_BIND_SHARED)
+  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
+
 bo_config.nvc0.memtype = nvc0_mt_choose_storage_type(mt, compressed);

 if (!nvc0_miptree_init_ms_mode(mt)) {
diff --git a/src/gallium/drivers/r300/r300_texture.c 
b/src/gallium/drivers/r300/r300_texture.c
index 13e9bc3..3672d7b 100644
--- a/src/gallium/drivers/r300/r300_texture.c
+++ b/src/gallium/drivers/r300/r300_texture.c
@@ -1079,7 +1079,7 @@ struct pipe_resource *r300_texture_create(struct 
pipe_screen *screen,
  enum radeon_bo_layout microtile, macrotile;

  if ((base->flags & R300_RESOURCE_FLAG_TRANSFER) ||
-(base->bind & PIPE_BIND_SCANOUT)) {
+(base->bind & (PIPE_BIND_SCANOUT | PIPE_BIND_SHARED))) {
  microtile = RADEON_LAYOUT_LINEAR;
  macrotile = RADEON_LAYOUT_LINEAR;
  } else {
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index 36cca17..60c050b 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -609,7 +609,8 @@ struct pipe_resource *r600_texture_create(struct 
pipe_screen *screen,
 * because 422 formats are used for videos, which prefer linear buffers
 * for fast uploads anyway. */
if (!(templ->flags & R600_RESOURCE_FLAG_TRANSFER) &&
-   desc->layout != UTIL_FORMAT_LAYOUT_SUBSAMPLED) {
+   (desc->layout != UTIL_FORMAT_LAYOUT_SUBS

[Mesa-dev] [Bug 67962] undefined reference to `wayland_drm_buffer_get'

2013-08-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=67962

--- Comment #5 from Chad Versace  ---
Artie, I wrote the guilty commit. Next time I'm the guilty one, please add me
to bugs CC list.

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Stéphane Marchesin

On Mon, Aug 12, 2013 at 2:29 PM, Ian Romanick  wrote:
> On 08/11/2013 03:50 AM, davya...@free.fr wrote:
>>>
>>> This looks good to me, but commit messages should have a prefix
>>> indicating which component is changed. In your case it's "gallium:"
>>> and "intel:", respectively.
>>>
>>> Marek
>>
>>
>>
>> Ok, I've suppressed some trailing spaces and changed the commit message.
>
>
> And smashed two separate patches into one.  Was that intentional?
>
>
>>  From f48fdb44d638ae850c7f3df36211b33788088927 Mon Sep 17 00:00:00 2001
>> From: axeldavy 
>> Date: Sun, 11 Aug 2013 12:36:33 +0200
>> Subject: [PATCH 1/1] gallium: Implements the new meaning of
>> PIPE_BIND_SHARED
>>   (there should be no tiling) for i915, ilo, nv50, nvc0, r300, r600,
>> radeonsi.
>>   intel: _DRI_IMAGE_USE_SHARE is equivalent to PIPE_BIND_SHARED. No tiling
>>   either for the dri drivers i915 and i965.
>>
>>
>> Signed-off-by: axeldavy 
>
>
> Real name in S-o-b, please.  Axel Davy?
>
>
>> ---
>>   src/gallium/drivers/i915/i915_resource.c| 7 ++-
>>   src/gallium/drivers/ilo/ilo_resource.c  | 2 +-
>>   src/gallium/drivers/nv50/nv50_miptree.c | 3 +++
>>   src/gallium/drivers/nvc0/nvc0_miptree.c | 3 +++
>>   src/gallium/drivers/r300/r300_texture.c | 2 +-
>>   src/gallium/drivers/r600/r600_texture.c | 3 ++-
>>   src/gallium/drivers/radeonsi/r600_texture.c | 2 +-
>>   src/gallium/include/pipe/p_defines.h| 5 ++---
>>   src/mesa/drivers/dri/i915/intel_screen.c| 2 ++
>>   src/mesa/drivers/dri/i965/intel_screen.c| 3 ++-
>>   10 files changed, 23 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/gallium/drivers/i915/i915_resource.c
>> b/src/gallium/drivers/i915/i915_resource.c
>> index 314ebe9..563bf83 100644
>> --- a/src/gallium/drivers/i915/i915_resource.c
>> +++ b/src/gallium/drivers/i915/i915_resource.c
>> @@ -12,7 +12,12 @@ i915_resource_create(struct pipe_screen *screen,
>>  if (template->target == PIPE_BUFFER)
>> return i915_buffer_create(screen, template);
>>  else
>> -  return i915_texture_create(screen, template, FALSE);
>> +   {
>> +  if (!(template->bind & PIPE_BIND_SHARED))
>> + return i915_texture_create(screen, template, FALSE);
>> +  else
>> + return i915_texture_create(screen, template, TRUE);
>> +   }
>>
>>   }
>>
>> diff --git a/src/gallium/drivers/ilo/ilo_resource.c
>> b/src/gallium/drivers/ilo/ilo_resource.c
>> index 5061f69..3d5874f 100644
>> --- a/src/gallium/drivers/ilo/ilo_resource.c
>> +++ b/src/gallium/drivers/ilo/ilo_resource.c
>> @@ -473,7 +473,7 @@ tex_layout_init_tiling(struct tex_layout *layout)
>>   * "The cursor surface address must be 4K byte aligned. The
>> cursor must
>>   *  be in linear memory, it cannot be tiled."
>>   */
>> -   if (unlikely(templ->bind & PIPE_BIND_CURSOR))
>> +   if (unlikely(templ->bind & (PIPE_BIND_CURSOR | PIPE_BIND_SHARED)))
>> valid_tilings &= tile_none;
>>
>>  /*
>> diff --git a/src/gallium/drivers/nv50/nv50_miptree.c
>> b/src/gallium/drivers/nv50/nv50_miptree.c
>> index 28be768..a4f15fe 100644
>> --- a/src/gallium/drivers/nv50/nv50_miptree.c
>> +++ b/src/gallium/drivers/nv50/nv50_miptree.c
>> @@ -326,6 +326,9 @@ nv50_miptree_create(struct pipe_screen *pscreen,
>>  pipe_reference_init(&pt->reference, 1);
>>  pt->screen = pscreen;
>>
>> +   if (pt->bind & PIPE_BIND_SHARED)
>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>> +
>>  bo_config.nv50.memtype = nv50_mt_choose_storage_type(mt, TRUE);
>>
>>  if (!nv50_miptree_init_ms_mode(mt)) {
>> diff --git a/src/gallium/drivers/nvc0/nvc0_miptree.c
>> b/src/gallium/drivers/nvc0/nvc0_miptree.c
>> index 9e57d74..e645553 100644
>> --- a/src/gallium/drivers/nvc0/nvc0_miptree.c
>> +++ b/src/gallium/drivers/nvc0/nvc0_miptree.c
>> @@ -274,6 +274,9 @@ nvc0_miptree_create(struct pipe_screen *pscreen,
>> }
>>  }
>>
>> +   if (pt->bind & PIPE_BIND_SHARED)
>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>> +
>>  bo_config.nvc0.memtype = nvc0_mt_choose_storage_type(mt, compressed);
>>
>>  if (!nvc0_miptree_init_ms_mode(mt)) {
>> diff --git a/src/gallium/drivers/r300/r300_texture.c
>> b/src/gallium/drivers/r300/r300_texture.c
>> index 13e9bc3..3672d7b 100644
>> --- a/src/gallium/drivers/r300/r300_texture.c
>> +++ b/src/gallium/drivers/r300/r300_texture.c
>> @@ -1079,7 +1079,7 @@ struct pipe_resource *r300_texture_create(struct
>> pipe_screen *screen,
>>   enum radeon_bo_layout microtile, macrotile;
>>
>>   if ((base->flags & R300_RESOURCE_FLAG_TRANSFER) ||
>> -(base->bind & PIPE_BIND_SCANOUT)) {
>> +(base->bind & (PIPE_BIND_SCANOUT | PIPE_BIND_SHARED))) {
>>   microtile = RADEON_LAYOUT_LINEAR;
>>   macrotile = RADEON_LAYOUT_LINEAR;
>>   } else {
>> diff --git a/src/gallium/drivers/r600/r600_texture.c
>> b/src/gallium/drivers/r600/r600_texture.c
>> index 36cca17..60c050b 100644
>> --- a/src/gallium/drivers/r600/r600_texture.c
>> +++ b/src/gallium/drive

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread davyaxel

> On 08/11/2013 03:50 AM, davya...@free.fr wrote:
>>> This looks good to me, but commit messages should have a prefix
>>> indicating which component is changed. In your case it's "gallium:"
>>> and "intel:", respectively.
>>>
>>> Marek 
>>
>>
>> Ok, I've suppressed some trailing spaces and changed the commit message. 
>
> And smashed two separate patches into one.  Was that intentional? 
I thought it would be simpler if it were only one patch.
>>  From f48fdb44d638ae850c7f3df36211b33788088927 Mon Sep 17 00:00:00 2001
>> From: axeldavy 
>> Date: Sun, 11 Aug 2013 12:36:33 +0200
>> Subject: [PATCH 1/1] gallium: Implements the new meaning of PIPE_BIND_SHARED
>>   (there should be no tiling) for i915, ilo, nv50, nvc0, r300, r600, 
>> radeonsi.
>>   intel: _DRI_IMAGE_USE_SHARE is equivalent to PIPE_BIND_SHARED. No tiling
>>   either for the dri drivers i915 and i965.
>>
>>
>> Signed-off-by: axeldavy  
>
> Real name in S-o-b, please.  Axel Davy?
Yes, my name is Axel Davy.
>> ---
>>   src/gallium/drivers/i915/i915_resource.c| 7 ++-
>>   src/gallium/drivers/ilo/ilo_resource.c  | 2 +-
>>   src/gallium/drivers/nv50/nv50_miptree.c | 3 +++
>>   src/gallium/drivers/nvc0/nvc0_miptree.c | 3 +++
>>   src/gallium/drivers/r300/r300_texture.c | 2 +-
>>   src/gallium/drivers/r600/r600_texture.c | 3 ++-
>>   src/gallium/drivers/radeonsi/r600_texture.c | 2 +-
>>   src/gallium/include/pipe/p_defines.h| 5 ++---
>>   src/mesa/drivers/dri/i915/intel_screen.c| 2 ++
>>   src/mesa/drivers/dri/i965/intel_screen.c| 3 ++-
>>   10 files changed, 23 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/gallium/drivers/i915/i915_resource.c 
>> b/src/gallium/drivers/i915/i915_resource.c
>> index 314ebe9..563bf83 100644
>> --- a/src/gallium/drivers/i915/i915_resource.c
>> +++ b/src/gallium/drivers/i915/i915_resource.c
>> @@ -12,7 +12,12 @@ i915_resource_create(struct pipe_screen *screen,
>>  if (template->target == PIPE_BUFFER)
>> return i915_buffer_create(screen, template);
>>  else
>> -  return i915_texture_create(screen, template, FALSE);
>> +   {
>> +  if (!(template->bind & PIPE_BIND_SHARED))
>> + return i915_texture_create(screen, template, FALSE);
>> +  else
>> + return i915_texture_create(screen, template, TRUE);
>> +   }
>>
>>   }
>>
>> diff --git a/src/gallium/drivers/ilo/ilo_resource.c 
>> b/src/gallium/drivers/ilo/ilo_resource.c
>> index 5061f69..3d5874f 100644
>> --- a/src/gallium/drivers/ilo/ilo_resource.c
>> +++ b/src/gallium/drivers/ilo/ilo_resource.c
>> @@ -473,7 +473,7 @@ tex_layout_init_tiling(struct tex_layout *layout)
>>   * "The cursor surface address must be 4K byte aligned. The cursor 
>> must
>>   *  be in linear memory, it cannot be tiled."
>>   */
>> -   if (unlikely(templ->bind & PIPE_BIND_CURSOR))
>> +   if (unlikely(templ->bind & (PIPE_BIND_CURSOR | PIPE_BIND_SHARED)))
>> valid_tilings &= tile_none;
>>
>>  /*
>> diff --git a/src/gallium/drivers/nv50/nv50_miptree.c 
>> b/src/gallium/drivers/nv50/nv50_miptree.c
>> index 28be768..a4f15fe 100644
>> --- a/src/gallium/drivers/nv50/nv50_miptree.c
>> +++ b/src/gallium/drivers/nv50/nv50_miptree.c
>> @@ -326,6 +326,9 @@ nv50_miptree_create(struct pipe_screen *pscreen,
>>  pipe_reference_init(&pt->reference, 1);
>>  pt->screen = pscreen;
>>
>> +   if (pt->bind & PIPE_BIND_SHARED)
>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>> +
>>  bo_config.nv50.memtype = nv50_mt_choose_storage_type(mt, TRUE);
>>
>>  if (!nv50_miptree_init_ms_mode(mt)) {
>> diff --git a/src/gallium/drivers/nvc0/nvc0_miptree.c 
>> b/src/gallium/drivers/nvc0/nvc0_miptree.c
>> index 9e57d74..e645553 100644
>> --- a/src/gallium/drivers/nvc0/nvc0_miptree.c
>> +++ b/src/gallium/drivers/nvc0/nvc0_miptree.c
>> @@ -274,6 +274,9 @@ nvc0_miptree_create(struct pipe_screen *pscreen,
>> }
>>  }
>>
>> +   if (pt->bind & PIPE_BIND_SHARED)
>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>> +
>>  bo_config.nvc0.memtype = nvc0_mt_choose_storage_type(mt, compressed);
>>
>>  if (!nvc0_miptree_init_ms_mode(mt)) {
>> diff --git a/src/gallium/drivers/r300/r300_texture.c 
>> b/src/gallium/drivers/r300/r300_texture.c
>> index 13e9bc3..3672d7b 100644
>> --- a/src/gallium/drivers/r300/r300_texture.c
>> +++ b/src/gallium/drivers/r300/r300_texture.c
>> @@ -1079,7 +1079,7 @@ struct pipe_resource *r300_texture_create(struct 
>> pipe_screen *screen,
>>   enum radeon_bo_layout microtile, macrotile;
>>
>>   if ((base->flags & R300_RESOURCE_FLAG_TRANSFER) ||
>> -(base->bind & PIPE_BIND_SCANOUT)) {
>> +(base->bind & (PIPE_BIND_SCANOUT | PIPE_BIND_SHARED))) {
>>   microtile = RADEON_LAYOUT_LINEAR;
>>   macrotile = RADEON_LAYOUT_LINEAR;
>>   } else {
>> diff --git a/src/gallium/drivers/r600/r600_texture.c 
>> b/src/gallium/drivers/r600/r600_texture.c
>> index 36cca17..60c050b 100644
>> --- a/src/gallium/drivers/r600/r600_texture.c
>> +

Re: [Mesa-dev] [PATCH v2] gbm: fix linking

2013-08-12 Thread Chad Versace


On 08/11/2013 08:27 AM, Armin K wrote:

Link to internal libwayland-drm library if Wayland
EGL platform is enabled. The library needs to be
built before gbm.

Link to libdrm if gbm_dri is enabled, otherwise
library will report undefined references to
libdrm symbols when using libgbm without
libdrm.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67962
---
  src/Makefile.am | 4 
  src/egl/Makefile.am | 4 
  src/gbm/Makefile.am | 3 ++-
  3 files changed, 6 insertions(+), 5 deletions(-)


This looks right to me,
Reviewed-by: Chad Versace 

But I'm no Wayland/GBM expert. Before committing, I'd like
to wait a few days for a Wayland developer to comment.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Dave Airlie

On Tue, Aug 13, 2013 at 7:36 AM, Stéphane Marchesin
 wrote:
> On Mon, Aug 12, 2013 at 2:29 PM, Ian Romanick  wrote:
>> On 08/11/2013 03:50 AM, davya...@free.fr wrote:

>>>> This looks good to me, but commit messages should have a prefix
>>>> indicating which component is changed. In your case it's "gallium:"
>>>> and "intel:", respectively.
>>>>
>>>> Marek
>>>
>>>
>>>
>>> Ok, I've removed some trailing spaces and changed the commit message.
>>
>>
>> And smashed two separate patches into one.  Was that intentional?
>>
>>
>>>  From f48fdb44d638ae850c7f3df36211b33788088927 Mon Sep 17 00:00:00 2001
>>> From: axeldavy 
>>> Date: Sun, 11 Aug 2013 12:36:33 +0200
>>> Subject: [PATCH 1/1] gallium: Implements the new meaning of
>>> PIPE_BIND_SHARED
>>>   (there should be no tiling) for i915, ilo, nv50, nvc0, r300, r600,
>>> radeonsi.
>>>   intel: _DRI_IMAGE_USE_SHARE is equivalent to PIPE_BIND_SHARED. No tiling
>>>   either for the dri drivers i915 and i965.
>>>
>>>
>>> Signed-off-by: axeldavy 
>>
>>
>> Real name in S-o-b, please.  Axel Davy?
>>
>>
>>> ---
>>>   src/gallium/drivers/i915/i915_resource.c| 7 ++-
>>>   src/gallium/drivers/ilo/ilo_resource.c  | 2 +-
>>>   src/gallium/drivers/nv50/nv50_miptree.c | 3 +++
>>>   src/gallium/drivers/nvc0/nvc0_miptree.c | 3 +++
>>>   src/gallium/drivers/r300/r300_texture.c | 2 +-
>>>   src/gallium/drivers/r600/r600_texture.c | 3 ++-
>>>   src/gallium/drivers/radeonsi/r600_texture.c | 2 +-
>>>   src/gallium/include/pipe/p_defines.h| 5 ++---
>>>   src/mesa/drivers/dri/i915/intel_screen.c| 2 ++
>>>   src/mesa/drivers/dri/i965/intel_screen.c| 3 ++-
>>>   10 files changed, 23 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/i915/i915_resource.c
>>> b/src/gallium/drivers/i915/i915_resource.c
>>> index 314ebe9..563bf83 100644
>>> --- a/src/gallium/drivers/i915/i915_resource.c
>>> +++ b/src/gallium/drivers/i915/i915_resource.c
>>> @@ -12,7 +12,12 @@ i915_resource_create(struct pipe_screen *screen,
>>>  if (template->target == PIPE_BUFFER)
>>> return i915_buffer_create(screen, template);
>>>  else
>>> -  return i915_texture_create(screen, template, FALSE);
>>> +   {
>>> +  if (!(template->bind & PIPE_BIND_SHARED))
>>> + return i915_texture_create(screen, template, FALSE);
>>> +  else
>>> + return i915_texture_create(screen, template, TRUE);
>>> +   }
>>>
>>>   }
>>>
>>> diff --git a/src/gallium/drivers/ilo/ilo_resource.c
>>> b/src/gallium/drivers/ilo/ilo_resource.c
>>> index 5061f69..3d5874f 100644
>>> --- a/src/gallium/drivers/ilo/ilo_resource.c
>>> +++ b/src/gallium/drivers/ilo/ilo_resource.c
>>> @@ -473,7 +473,7 @@ tex_layout_init_tiling(struct tex_layout *layout)
>>>   * "The cursor surface address must be 4K byte aligned. The
>>> cursor must
>>>   *  be in linear memory, it cannot be tiled."
>>>   */
>>> -   if (unlikely(templ->bind & PIPE_BIND_CURSOR))
>>> +   if (unlikely(templ->bind & (PIPE_BIND_CURSOR | PIPE_BIND_SHARED)))
>>> valid_tilings &= tile_none;
>>>
>>>  /*
>>> diff --git a/src/gallium/drivers/nv50/nv50_miptree.c
>>> b/src/gallium/drivers/nv50/nv50_miptree.c
>>> index 28be768..a4f15fe 100644
>>> --- a/src/gallium/drivers/nv50/nv50_miptree.c
>>> +++ b/src/gallium/drivers/nv50/nv50_miptree.c
>>> @@ -326,6 +326,9 @@ nv50_miptree_create(struct pipe_screen *pscreen,
>>>  pipe_reference_init(&pt->reference, 1);
>>>  pt->screen = pscreen;
>>>
>>> +   if (pt->bind & PIPE_BIND_SHARED)
>>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>>> +
>>>  bo_config.nv50.memtype = nv50_mt_choose_storage_type(mt, TRUE);
>>>
>>>  if (!nv50_miptree_init_ms_mode(mt)) {
>>> diff --git a/src/gallium/drivers/nvc0/nvc0_miptree.c
>>> b/src/gallium/drivers/nvc0/nvc0_miptree.c
>>> index 9e57d74..e645553 100644
>>> --- a/src/gallium/drivers/nvc0/nvc0_miptree.c
>>> +++ b/src/gallium/drivers/nvc0/nvc0_miptree.c
>>> @@ -274,6 +274,9 @@ nvc0_miptree_create(struct pipe_screen *pscreen,
>>> }
>>>  }
>>>
>>> +   if (pt->bind & PIPE_BIND_SHARED)
>>> +  pt->flags |= NOUVEAU_RESOURCE_FLAG_LINEAR;
>>> +
>>>  bo_config.nvc0.memtype = nvc0_mt_choose_storage_type(mt, compressed);
>>>
>>>  if (!nvc0_miptree_init_ms_mode(mt)) {
>>> diff --git a/src/gallium/drivers/r300/r300_texture.c
>>> b/src/gallium/drivers/r300/r300_texture.c
>>> index 13e9bc3..3672d7b 100644
>>> --- a/src/gallium/drivers/r300/r300_texture.c
>>> +++ b/src/gallium/drivers/r300/r300_texture.c
>>> @@ -1079,7 +1079,7 @@ struct pipe_resource *r300_texture_create(struct
>>> pipe_screen *screen,
>>>   enum radeon_bo_layout microtile, macrotile;
>>>
>>>   if ((base->flags & R300_RESOURCE_FLAG_TRANSFER) ||
>>> -(base->bind & PIPE_BIND_SCANOUT)) {
>>> +(base->bind & (PIPE_BIND_SCANOUT | PIPE_BIND_SHARED))) {
>>>   microtile = RADEON_LAYOUT_LINEAR;
>>>   macrotile = RADEON_LAYOUT_LINEAR;
>>>   } else {
>>> diff --git a/src/g

Re: [Mesa-dev] [PATCH 4/4] glsl: Add i2b() and b2i() to ir_builder.

2013-08-12 Thread Chad Versace


This series, including the abs overload, is
Reviewed-by: Chad Versace 


Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Marek Olšák

On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin
 wrote:
>> Other than hybrid systems (of which
>> there are none with i915 graphics), is there any case where
>> __DRI_IMAGE_USE_SHARE can occur?
>
> You could do interesting things like cross-process sharing with it. I
> think it's worth doing it, no matter what. It's easy to pick up now,
> and hard to fix up later.

Cross-process sharing is mandatory already and exposed via
resource_from_handle and resource_get_handle. I don't think this is
useful for cross-process sharing anyway, because it disables tiling.

Marek

Re: [Mesa-dev] [Patch] Sharing flags should disable tiling

2013-08-12 Thread Stéphane Marchesin

On Mon, Aug 12, 2013 at 3:05 PM, Marek Olšák  wrote:
> On Mon, Aug 12, 2013 at 11:36 PM, Stéphane Marchesin
>  wrote:
>>> Other than hybrid systems (of which
>>> there are none with i915 graphics), is there any case where
>>> __DRI_IMAGE_USE_SHARE can occur?
>>
>> You could do interesting things like cross-process sharing with it. I
>> think it's worth doing it, no matter what. It's easy to pick up now,
>> and hard to fix up later.
>
> Cross-process sharing is mandatory already and exposed via
> resource_from_handle and resource_get_handle. I don't think this is
> useful for cross-process sharing anyway, because it disables tiling.
>

Well, for Chrome we're thinking of using it. If one end can map linear
memory and write texture data to it from the CPU, and the other end
can use it as a GL texture, then we have a zero copy cross-process
texture upload. I realize it's not your normal use case, but... :)
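
To make the linear-versus-tiled point concrete, here is a minimal sketch in C
of why a CPU writer on the other end wants linear memory. With a linear
surface the writer only needs the pitch; with tiling it would have to know the
exact layout the GPU uses. The tile geometry (TILE_W, TILE_H) and the function
names are made up for illustration; they are not any real hardware layout or
Mesa API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile geometry, chosen only for illustration. */
#define TILE_W 4u   /* pixels per tile row */
#define TILE_H 4u   /* rows per tile */

/* Byte offset of pixel (x, y) in a linear surface: rows are contiguous,
 * so the pitch (bytes per row) and bytes per pixel (cpp) are enough. */
size_t linear_offset(unsigned x, unsigned y, size_t pitch, unsigned cpp)
{
   return (size_t)y * pitch + (size_t)x * cpp;
}

/* Byte offset of the same pixel in a made-up tiled layout: tiles are
 * stored one after another, and pixels are row-major inside each tile.
 * The surface width (in pixels) must be a multiple of TILE_W. */
size_t tiled_offset(unsigned x, unsigned y, unsigned width, unsigned cpp)
{
   unsigned tiles_per_row, tile_index, in_tile;

   assert(width % TILE_W == 0);
   tiles_per_row = width / TILE_W;
   tile_index = (y / TILE_H) * tiles_per_row + (x / TILE_W);
   in_tile = (y % TILE_H) * TILE_W + (x % TILE_W);
   return ((size_t)tile_index * TILE_W * TILE_H + in_tile) * cpp;
}
```

For an 8-pixel-wide RGBA surface (pitch 32, cpp 4), pixel (1, 1) sits at byte
36 in the linear layout and at byte 20 in this tiled one, so a writer that
assumes the wrong layout scribbles over the wrong pixels.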

Stéphane

[Mesa-dev] Patches: R600: Merge R600 and SI vector op expansions

2013-08-12 Thread Tom Stellard

Hi,

The attached patches expand a few more vector operations and also move the
expansion code into AMDGPUISelLowering.cpp so it can be shared between R600 and 
SI.
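
For anyone not familiar with the legalizer, here is a rough sketch in C of
what marking a vector float op as Expand amounts to: the vector operation is
broken up into one scalar operation per lane. This only illustrates the
effect; the struct and function names are hypothetical and this is not how
LLVM itself is written.

```c
/* Four-wide float vector, standing in for LLVM's v4f32 (illustration only). */
struct vec4f {
   float v[4];
};

/* Effect of expanding FADD on v4f32: one scalar add per lane, which is
 * what the backend then selects (e.g. one scalar ADD per component). */
struct vec4f fadd_v4f32_expanded(struct vec4f a, struct vec4f b)
{
   struct vec4f r;
   int i;

   for (i = 0; i < 4; ++i)
      r.v[i] = a.v[i] + b.v[i];
   return r;
}
```

The same shape applies to FSUB, FMUL and FDIV, and to the two-wide v2f32
case with two lanes instead of four.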

-Tom
From a519e387c262ecc0282eb8cb1e2c8802725591b4 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 2 Aug 2013 11:30:01 -0700
Subject: [PATCH 1/3] R600: Expand vector float operations for both SI and
 R600

---
 lib/Target/R600/AMDGPUISelLowering.cpp | 22 +---
 lib/Target/R600/R600ISelLowering.cpp   |  9 ---
 test/CodeGen/R600/fadd.ll  | 48 ++
 test/CodeGen/R600/fdiv.ll  | 46 
 test/CodeGen/R600/fmul.ll  | 46 ++--
 test/CodeGen/R600/fsub.ll  | 45 ++-
 6 files changed, 128 insertions(+), 88 deletions(-)

diff --git a/lib/Target/R600/AMDGPUISelLowering.cpp 
b/lib/Target/R600/AMDGPUISelLowering.cpp
index 746c479..25b1e54 100644
--- a/lib/Target/R600/AMDGPUISelLowering.cpp
+++ b/lib/Target/R600/AMDGPUISelLowering.cpp
@@ -115,14 +115,14 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine 
&TM) :
   setOperationAction(ISD::VSELECT, MVT::v2f32, Expand);
   setOperationAction(ISD::VSELECT, MVT::v4f32, Expand);
 
-  static const int types[] = {
+  static const int IntTypes[] = {
 (int)MVT::v2i32,
 (int)MVT::v4i32
   };
-  const size_t NumTypes = array_lengthof(types);
+  const size_t NumIntTypes = array_lengthof(IntTypes);
 
-  for (unsigned int x  = 0; x < NumTypes; ++x) {
-MVT::SimpleValueType VT = (MVT::SimpleValueType)types[x];
+  for (unsigned int x  = 0; x < NumIntTypes; ++x) {
+MVT::SimpleValueType VT = (MVT::SimpleValueType)IntTypes[x];
 //Expand the following operations for the current type by default
 setOperationAction(ISD::ADD,  VT, Expand);
 setOperationAction(ISD::AND,  VT, Expand);
@@ -141,6 +141,20 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine 
&TM) :
 setOperationAction(ISD::VSELECT, VT, Expand);
 setOperationAction(ISD::XOR,  VT, Expand);
   }
+
+  static const int FloatTypes[] = {
+(int)MVT::v2f32,
+(int)MVT::v4f32
+  };
+  const size_t NumFloatTypes = array_lengthof(FloatTypes);
+
+  for (unsigned int x = 0; x < NumFloatTypes; ++x) {
+MVT::SimpleValueType VT = (MVT::SimpleValueType)FloatTypes[x];
+setOperationAction(ISD::FADD, VT, Expand);
+setOperationAction(ISD::FDIV, VT, Expand);
+setOperationAction(ISD::FMUL, VT, Expand);
+setOperationAction(ISD::FSUB, VT, Expand);
+  }
 }
 
 
//===--===//
diff --git a/lib/Target/R600/R600ISelLowering.cpp 
b/lib/Target/R600/R600ISelLowering.cpp
index e10af2b..b822431 100644
--- a/lib/Target/R600/R600ISelLowering.cpp
+++ b/lib/Target/R600/R600ISelLowering.cpp
@@ -38,15 +38,6 @@ R600TargetLowering::R600TargetLowering(TargetMachine &TM) :
 
   computeRegisterProperties();
 
-  setOperationAction(ISD::FADD, MVT::v4f32, Expand);
-  setOperationAction(ISD::FADD, MVT::v2f32, Expand);
-  setOperationAction(ISD::FMUL, MVT::v4f32, Expand);
-  setOperationAction(ISD::FMUL, MVT::v2f32, Expand);
-  setOperationAction(ISD::FDIV, MVT::v4f32, Expand);
-  setOperationAction(ISD::FDIV, MVT::v2f32, Expand);
-  setOperationAction(ISD::FSUB, MVT::v4f32, Expand);
-  setOperationAction(ISD::FSUB, MVT::v2f32, Expand);
-
   setOperationAction(ISD::FCOS, MVT::f32, Custom);
   setOperationAction(ISD::FSIN, MVT::f32, Custom);
 
diff --git a/test/CodeGen/R600/fadd.ll b/test/CodeGen/R600/fadd.ll
index 2716958..6d45967 100644
--- a/test/CodeGen/R600/fadd.ll
+++ b/test/CodeGen/R600/fadd.ll
@@ -1,23 +1,23 @@
-; RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s
+; RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s --check-prefix=R600-CHECK
+; RUN: llc < %s -march=r600 -mcpu=SI | FileCheck %s --check-prefix=SI-CHECK
 
-; CHECK: @fadd_f32
-; CHECK: ADD * T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
-
-define void @fadd_f32() {
-   %r0 = call float @llvm.R600.load.input(i32 0)
-   %r1 = call float @llvm.R600.load.input(i32 1)
-   %r2 = fadd float %r0, %r1
-   call void @llvm.AMDGPU.store.output(float %r2, i32 0)
+; R600-CHECK: @fadd_f32
+; R600-CHECK: ADD * T{{[0-9]+\.[XYZW]}}, KC0[2].Z, KC0[2].W
+; SI-CHECK: @fadd_f32
+; SI-CHECK: V_ADD_F32
+define void @fadd_f32(float addrspace(1)* %out, float %a, float %b) {
+entry:
+   %0 = fadd float %a, %b
+   store float %0, float addrspace(1)* %out
ret void
 }
 
-declare float @llvm.R600.load.input(i32) readnone
-
-declare void @llvm.AMDGPU.store.output(float, i32)
-
-; CHECK: @fadd_v2f32
-; CHECK-DAG: ADD * T{{[0-9]\.[XYZW]}}, KC0[3].X, KC0[3].Z
-; CHECK-DAG: ADD * T{{[0-9]\.[XYZW]}}, KC0[2].W, KC0[3].Y
+; R600-CHECK: @fadd_v2f32
+; R600-CHECK-DAG: ADD * T{{[0-9]\.[XYZW]}}, KC0[3].X, KC0[3].Z
+; R600-CHECK-DAG: ADD * T{{[0-9]\.[XYZW]}}, KC0[2].W, KC0[3].Y
+; SI-CHECK: @fadd_v2f32
+; SI-CHECK: V_ADD_F32
+; SI-CHECK: V_ADD_F32
 d

Re: [Mesa-dev] [PATCH 3/9] glsl: Emit errors for things that look like default precision statements

2013-08-12 Thread Kenneth Graunke


On 08/09/2013 04:38 PM, Ian Romanick wrote:

From: Ian Romanick 

Previously we would emit a warning for empty declarations like

float;

We would also emit the same warning for things like

highp float;

However, this second case is most likely the application trying to set the
default precision.  We should instead generate an error.

Fixes piglit precision-05.vert.

Signed-off-by: Ian Romanick 
Cc: "9.2" 
---
  src/glsl/ast_to_hir.cpp | 15 ---
  1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 49804b7..9d2 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2697,6 +2697,10 @@ ast_declarator_list::hir(exec_list *instructions,
 *   name of a known structure type.  This is both invalid and weird.
 *   Emit an error.
 *
+   * - The program text contained something like 'mediump float;'
+   *   when the programmer probably meant 'precision mediump
+   *   float;' Emit an error.
+   *
 * Note that if decl_type is NULL and there is a structure involved,
 * there must have been some sort of error with the structure.  In this
 * case we assume that an error was already generated on this line of
@@ -2705,7 +2709,10 @@ ast_declarator_list::hir(exec_list *instructions,
 */
assert(this->type->specifier->structure == NULL || decl_type != NULL
 || state->error);
-  if (this->type->specifier->structure == NULL) {
+  if (this->type->qualifier.precision != ast_precision_none) {
+ _mesa_glsl_error(&loc, state,
+  "set default precision using `precision' keyword");


I like how this message suggests what to do.  I don't like that it 
doesn't describe the actual problem.


Perhaps something like:
"empty declaration; perhaps you meant `precision highp %s'?", 
this->type->specifier->type_name


or:
"empty declaration; to set the default precision, use `precision highp 
%s;'", this->type->specifier->type_name



+  } else if (this->type->specifier->structure == NULL) {
 if (decl_type != NULL) {
_mesa_glsl_warning(&loc, state, "empty declaration");
 } else {
@@ -2714,12 +2721,6 @@ ast_declarator_list::hir(exec_list *instructions,
 type_name);
 }
}
-
-  if (this->type->qualifier.precision != ast_precision_none &&
-  this->type->specifier->structure != NULL) {
- _mesa_glsl_error(&loc, state, "precision qualifiers can't be applied "
-  "to structures");
-  }
 }

 foreach_list_typed (ast_declaration, decl, link, &this->declarations) {


Although stupid, declarations like "highp float;" are actually permitted 
by the letter of the spec.  nVidia's driver also accepts them.  So I'm a 
bit uneasy about disallowing them.


We should check AMD.  If it disallows them, I'm fine with disallowing 
them as well.  If it accepts them, I think we should too (sadly).


I definitely approve of generating a better message, though.
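
As a rough illustration of the second wording, here is a small standalone
sketch in C that builds the message with the precision and type name filled
in. The helper name and its signature are hypothetical; in the patch itself
the string would simply be passed to _mesa_glsl_error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats the proposed diagnostic into buf.
 * Returns the length snprintf would have written (excluding the NUL). */
int format_empty_decl_error(char *buf, size_t size,
                            const char *precision, const char *type_name)
{
   assert(buf != NULL && precision != NULL && type_name != NULL);
   return snprintf(buf, size,
                   "empty declaration; to set the default precision, "
                   "use `precision %s %s;'",
                   precision, type_name);
}
```

With "highp" and "float" this yields: empty declaration; to set the default
precision, use `precision highp float;'. That states the problem first and
the likely fix second.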

Re: [Mesa-dev] [PATCH 9/9] glsl: Track existence of default float precision in GLSL ES fragment shaders

2013-08-12 Thread Kenneth Graunke


On 08/09/2013 04:38 PM, Ian Romanick wrote:

From: Ian Romanick 

This is required by the spec, and it's a bit tricky because the default
precision is scoped.  As a result, I'm slightly abusing the symbol
table.

Fixes piglit no-default-float-precision.frag tests and the piglit
default-precision-nested-scope-0[1234].frag tests that are currently on
the piglit mailing list for review.

Signed-off-by: Ian Romanick 
Cc: "9.2" 
---
  src/glsl/ast.h  |  4 +++
  src/glsl/ast_to_hir.cpp | 68 ++---
  2 files changed, 68 insertions(+), 4 deletions(-)


Ugggh.  Scoped default precision is really ugly.  But what can you do?

Other than my comments on patch 3, this series is:
Reviewed-by: Kenneth Graunke 
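
For readers wondering what "scoped" means in practice, here is a minimal
sketch in C. Note that it uses an explicit stack of scopes instead of the
symbol table the patch uses, and every name in it is hypothetical. A lookup
walks from the innermost scope outwards, and PREC_NONE at the end means no
default float precision is in effect, which is the case the patch has to
detect in GLSL ES fragment shaders.

```c
#include <assert.h>

enum precision { PREC_NONE = 0, PREC_LOW, PREC_MEDIUM, PREC_HIGH };

#define MAX_SCOPES 16

/* One default-float-precision slot per nested scope (illustration only). */
struct precision_scopes {
   enum precision float_default[MAX_SCOPES];
   int depth;   /* index of the innermost scope */
};

void scopes_init(struct precision_scopes *s)
{
   s->depth = 0;
   s->float_default[0] = PREC_NONE;   /* global scope: no default yet */
}

void scopes_push(struct precision_scopes *s)
{
   assert(s->depth + 1 < MAX_SCOPES);
   s->depth++;
   s->float_default[s->depth] = PREC_NONE;
}

void scopes_pop(struct precision_scopes *s)
{
   assert(s->depth > 0);
   s->depth--;
}

/* Models a "precision <p> float;" statement in the current scope. */
void set_default_float_precision(struct precision_scopes *s, enum precision p)
{
   s->float_default[s->depth] = p;
}

/* Innermost scope wins; PREC_NONE means no default is in effect. */
enum precision lookup_default_float_precision(const struct precision_scopes *s)
{
   int i;

   for (i = s->depth; i >= 0; --i) {
      if (s->float_default[i] != PREC_NONE)
         return s->float_default[i];
   }
   return PREC_NONE;
}
```

A default set inside a nested scope disappears again on scopes_pop(), which
is the behaviour the default-precision-nested-scope tests are after.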


Re: [Mesa-dev] Patches: R600: Improve load / store support for 8-bit and 16-bit types

2013-08-12 Thread Aaron Watry

It'll take me a while to attempt to parse everything that's going on
in these patches (and your resource descriptor types series that this
depends on), but I have sent it all through a piglit run on Evergreen
(Cedar).  Everything was latest Mesa/LLVM/libclc upstream code as of
today.

Baseline: 567/855 tests passed
Descriptors series: 575/855 tests passed (the main differences here
were some int3 load/store issues that were exposed only recently
and are fixed by this series)
Descriptors + char/short load/store series: 880/1119 tests passed
(most of the additional tests and passes were char/short tests that no
longer crash).

Specifically, I've double-checked the char/short/uchar/ushort built-in
functions, as well as the char/short arithmetic tests, and things are
looking good so far.  I'll try to test on Cayman/SI later.

--Aaron

On Mon, Aug 12, 2013 at 2:56 PM, Tom Stellard  wrote:
> Hi,
>
> The attached patches improve support for i8 and i16 loads and stores for
> Evergreen and newer GPUs.  This means that byte-addressable stores are
> now supported.
>
> Please review/test.
>
> -Tom
>

[Mesa-dev] [PATCH 1/3] gallium: add new float comparison instructions returning integer masks

2013-08-12 Thread sroland

From: Roland Scheidegger 

Newer graphics languages don't want messy float results (1.0/0.0) from float
comparisons but true "boolean" integer masks (~0/0) instead. Otherwise the
float results just have to be converted back to integers. The old opcodes have
to stay, however, both because the legacy APIs (GL and d3d9) need them and
because older hardware can't really deal with integers. The new
FSEQ/FSGE/FSLT/FSNE opcodes are part of the integer API and hence must be
supported if a driver claims to support GLSL 1.30 (or
PIPE_SHADER_CAP_INTEGERS).
---
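
To illustrate the difference between the two result conventions, here is a
small standalone sketch in C (not part of the patch; all function names are
hypothetical). The old opcodes produce a float 1.0/0.0, the new ones an
all-ones or all-zeros integer mask, which can feed a bitwise select directly
without any conversion.

```c
#include <stdint.h>

/* SLT-style result: a float, 1.0f when the comparison holds, else 0.0f. */
float slt_float(float a, float b)
{
   return (a < b) ? 1.0f : 0.0f;
}

/* FSLT-style result: an integer mask, all bits set or all bits clear. */
uint32_t fslt_mask(float a, float b)
{
   return (a < b) ? UINT32_MAX : 0;
}

/* Bitwise select driven by such a mask: picks if_true where the mask
 * bits are set and if_false where they are clear. */
uint32_t select_bits(uint32_t mask, uint32_t if_true, uint32_t if_false)
{
   return (mask & if_true) | (~mask & if_false);
}
```

With the float convention, the 1.0/0.0 result would first have to be turned
back into an integer condition before it could be used this way.
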
 src/gallium/docs/source/tgsi.rst   |   92 +++-
 src/gallium/include/pipe/p_shader_tokens.h |7 ++-
 2 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 949ad89..41f2798 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -512,13 +512,13 @@ This instruction replicates its result.
 
 .. math::
 
-  dst.x = (src0.x == src1.x) ? 1 : 0
+  dst.x = (src0.x == src1.x) ? 1.0F : 0.0F
 
-  dst.y = (src0.y == src1.y) ? 1 : 0
+  dst.y = (src0.y == src1.y) ? 1.0F : 0.0F
 
-  dst.z = (src0.z == src1.z) ? 1 : 0
+  dst.z = (src0.z == src1.z) ? 1.0F : 0.0F
 
-  dst.w = (src0.w == src1.w) ? 1 : 0
+  dst.w = (src0.w == src1.w) ? 1.0F : 0.0F
 
 
 .. opcode:: SFL - Set On False
@@ -538,13 +538,13 @@ This instruction replicates its result.
 
 .. math::
 
-  dst.x = (src0.x > src1.x) ? 1 : 0
+  dst.x = (src0.x > src1.x) ? 1.0F : 0.0F
 
-  dst.y = (src0.y > src1.y) ? 1 : 0
+  dst.y = (src0.y > src1.y) ? 1.0F : 0.0F
 
-  dst.z = (src0.z > src1.z) ? 1 : 0
+  dst.z = (src0.z > src1.z) ? 1.0F : 0.0F
 
-  dst.w = (src0.w > src1.w) ? 1 : 0
+  dst.w = (src0.w > src1.w) ? 1.0F : 0.0F
 
 
 .. opcode:: SIN - Sine
@@ -560,26 +560,26 @@ This instruction replicates its result.
 
 .. math::
 
-  dst.x = (src0.x <= src1.x) ? 1 : 0
+  dst.x = (src0.x <= src1.x) ? 1.0F : 0.0F
 
-  dst.y = (src0.y <= src1.y) ? 1 : 0
+  dst.y = (src0.y <= src1.y) ? 1.0F : 0.0F
 
-  dst.z = (src0.z <= src1.z) ? 1 : 0
+  dst.z = (src0.z <= src1.z) ? 1.0F : 0.0F
 
-  dst.w = (src0.w <= src1.w) ? 1 : 0
+  dst.w = (src0.w <= src1.w) ? 1.0F : 0.0F
 
 
 .. opcode:: SNE - Set On Not Equal
 
 .. math::
 
-  dst.x = (src0.x != src1.x) ? 1 : 0
+  dst.x = (src0.x != src1.x) ? 1.0F : 0.0F
 
-  dst.y = (src0.y != src1.y) ? 1 : 0
+  dst.y = (src0.y != src1.y) ? 1.0F : 0.0F
 
-  dst.z = (src0.z != src1.z) ? 1 : 0
+  dst.z = (src0.z != src1.z) ? 1.0F : 0.0F
 
-  dst.w = (src0.w != src1.w) ? 1 : 0
+  dst.w = (src0.w != src1.w) ? 1.0F : 0.0F
 
 
 .. opcode:: STR - Set On True
@@ -1325,6 +1325,21 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)
 
 
 
+.. opcode:: FSLT - Float Set On Less Than (ordered)
+
+   Same comparison as SLT but returns integer instead of 1.0/0.0 float
+
+.. math::
+
+  dst.x = (src0.x < src1.x) ? ~0 : 0
+
+  dst.y = (src0.y < src1.y) ? ~0 : 0
+
+  dst.z = (src0.z < src1.z) ? ~0 : 0
+
+  dst.w = (src0.w < src1.w) ? ~0 : 0
+
+
 .. opcode:: ISLT - Signed Integer Set On Less Than
 
 .. math::
@@ -1351,6 +1366,21 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)
   dst.w = (src0.w < src1.w) ? ~0 : 0
 
 
+.. opcode:: FSGE - Float Set On Greater Equal Than (ordered)
+
+   Same comparison as SGE but returns integer instead of 1.0/0.0 float
+
+.. math::
+
+  dst.x = (src0.x >= src1.x) ? ~0 : 0
+
+  dst.y = (src0.y >= src1.y) ? ~0 : 0
+
+  dst.z = (src0.z >= src1.z) ? ~0 : 0
+
+  dst.w = (src0.w >= src1.w) ? ~0 : 0
+
+
 .. opcode:: ISGE - Signed Integer Set On Greater Equal Than
 
 .. math::
@@ -1377,6 +1407,21 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)
   dst.w = (src0.w >= src1.w) ? ~0 : 0
 
 
+.. opcode:: FSEQ - Float Set On Equal (ordered)
+
+   Same comparison as SEQ but returns integer instead of 1.0/0.0 float
+
+.. math::
+
+  dst.x = (src0.x == src1.x) ? ~0 : 0
+
+  dst.y = (src0.y == src1.y) ? ~0 : 0
+
+  dst.z = (src0.z == src1.z) ? ~0 : 0
+
+  dst.w = (src0.w == src1.w) ? ~0 : 0
+
+
 .. opcode:: USEQ - Integer Set On Equal
 
 .. math::
@@ -1390,6 +1435,21 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)
   dst.w = (src0.w == src1.w) ? ~0 : 0
 
 
+.. opcode:: FSNE - Float Set On Not Equal (unordered)
+
+   Same comparison as SNE but returns integer instead of 1.0/0.0 float
+
+.. math::
+
+  dst.x = (src0.x != src1.x) ? ~0 : 0
+
+  dst.y = (src0.y != src1.y) ? ~0 : 0
+
+  dst.z = (src0.z != src1.z) ? ~0 : 0
+
+  dst.w = (src0.w != src1.w) ? ~0 : 0
+
+
 .. opcode:: USNE - Integer Set On Not Equal
 
 .. math::
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 9aaf687..872dfe9 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -367,7 +367,12 @@ struct tgsi_property_data {
 #define TGSI_OPCODE_TXQ_LZ  103 /* TXQ for mipmap level 0 */

[Mesa-dev] [PATCH 2/3] tgsi: implement new float comparison instructions returning integer masks

2013-08-12 Thread sroland

From: Roland Scheidegger 

Also while here add a bunch of other forgotten (integer) instructions to
tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing
away unused input components), though it may still be incomplete.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c   |   60 ++
 src/gallium/auxiliary/tgsi/tgsi_info.c   |   16 +--
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |4 ++
 src/gallium/auxiliary/tgsi/tgsi_util.c   |   26 +++
 4 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index d991d4b..1ffd9e9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -3264,6 +3264,50 @@ micro_f2i(union tgsi_exec_channel *dst,
 }
 
 static void
+micro_fseq(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->u[0] = src0->f[0] == src1->f[0] ? ~0 : 0;
+   dst->u[1] = src0->f[1] == src1->f[1] ? ~0 : 0;
+   dst->u[2] = src0->f[2] == src1->f[2] ? ~0 : 0;
+   dst->u[3] = src0->f[3] == src1->f[3] ? ~0 : 0;
+}
+
+static void
+micro_fsge(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->u[0] = src0->f[0] >= src1->f[0] ? ~0 : 0;
+   dst->u[1] = src0->f[1] >= src1->f[1] ? ~0 : 0;
+   dst->u[2] = src0->f[2] >= src1->f[2] ? ~0 : 0;
+   dst->u[3] = src0->f[3] >= src1->f[3] ? ~0 : 0;
+}
+
+static void
+micro_fslt(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->u[0] = src0->f[0] < src1->f[0] ? ~0 : 0;
+   dst->u[1] = src0->f[1] < src1->f[1] ? ~0 : 0;
+   dst->u[2] = src0->f[2] < src1->f[2] ? ~0 : 0;
+   dst->u[3] = src0->f[3] < src1->f[3] ? ~0 : 0;
+}
+
+static void
+micro_fsne(union tgsi_exec_channel *dst,
+   const union tgsi_exec_channel *src0,
+   const union tgsi_exec_channel *src1)
+{
+   dst->u[0] = src0->f[0] != src1->f[0] ? ~0 : 0;
+   dst->u[1] = src0->f[1] != src1->f[1] ? ~0 : 0;
+   dst->u[2] = src0->f[2] != src1->f[2] ? ~0 : 0;
+   dst->u[3] = src0->f[3] != src1->f[3] ? ~0 : 0;
+}
+
+static void
 micro_idiv(union tgsi_exec_channel *dst,
const union tgsi_exec_channel *src0,
const union tgsi_exec_channel *src1)
@@ -4152,6 +4196,22 @@ exec_instruction(
   exec_vector_unary(mach, inst, micro_f2i, TGSI_EXEC_DATA_INT, 
TGSI_EXEC_DATA_FLOAT);
   break;
 
+   case TGSI_OPCODE_FSEQ:
+  exec_vector_binary(mach, inst, micro_fseq, TGSI_EXEC_DATA_UINT, 
TGSI_EXEC_DATA_FLOAT);
+  break;
+
+   case TGSI_OPCODE_FSGE:
+  exec_vector_binary(mach, inst, micro_fsge, TGSI_EXEC_DATA_UINT, 
TGSI_EXEC_DATA_FLOAT);
+  break;
+
+   case TGSI_OPCODE_FSLT:
+  exec_vector_binary(mach, inst, micro_fslt, TGSI_EXEC_DATA_UINT, 
TGSI_EXEC_DATA_FLOAT);
+  break;
+
+   case TGSI_OPCODE_FSNE:
+  exec_vector_binary(mach, inst, micro_fsne, TGSI_EXEC_DATA_UINT, 
TGSI_EXEC_DATA_FLOAT);
+  break;
+
case TGSI_OPCODE_IDIV:
   exec_vector_binary(mach, inst, micro_idiv, TGSI_EXEC_DATA_INT, 
TGSI_EXEC_DATA_INT);
   break;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 7e93028..7a5d18f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -145,10 +145,10 @@ static const struct tgsi_opcode_info 
opcode_info[TGSI_OPCODE_LAST] =
{ 0, 0, 0, 0, 0, 0, NONE, "", 105 }, /* removed */
{ 0, 0, 0, 0, 0, 0, NONE, "", 106 }, /* removed */
{ 0, 0, 0, 0, 0, 0, NONE, "NOP", TGSI_OPCODE_NOP },
-   { 0, 0, 0, 0, 0, 0, NONE, "", 108 }, /* removed */
-   { 0, 0, 0, 0, 0, 0, NONE, "", 109 }, /* removed */
-   { 0, 0, 0, 0, 0, 0, NONE, "", 110 }, /* removed */
-   { 0, 0, 0, 0, 0, 0, NONE, "", 111 }, /* removed */
+   { 1, 2, 0, 0, 0, 0, COMP, "FSEQ", TGSI_OPCODE_FSEQ },
+   { 1, 2, 0, 0, 0, 0, COMP, "FSGE", TGSI_OPCODE_FSGE },
+   { 1, 2, 0, 0, 0, 0, COMP, "FSLT", TGSI_OPCODE_FSLT },
+   { 1, 2, 0, 0, 0, 0, COMP, "FSNE", TGSI_OPCODE_FSNE },
{ 1, 1, 0, 0, 0, 0, REPL, "NRM4", TGSI_OPCODE_NRM4 },
{ 0, 1, 0, 0, 0, 0, NONE, "CALLNZ", TGSI_OPCODE_CALLNZ },
{ 0, 1, 0, 0, 0, 0, NONE, "", 114 }, /* removed */
@@ -302,6 +302,10 @@ tgsi_opcode_infer_type( uint opcode )
case TGSI_OPCODE_ARR:
case TGSI_OPCODE_MOD:
case TGSI_OPCODE_F2I:
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
case TGSI_OPCODE_IDIV:
case TGSI_OPCODE_IMAX:
case TGSI_OPCODE_IMIN:
@@ -343,6 +347,10 @@ tgsi_opcode_infer_src_type( uint opcode )
case TGSI_OPCODE_TXQ_LZ:
case TGSI_OPCODE_F2I:
case TGSI_OPCODE_F2U:
+   case TGSI_OPCODE_FSEQ:
+   case TGSI_OPCODE_FSGE:
+   case TGSI_OPCODE_FSLT:
+   case TGSI_OPCODE_FSNE:
cas

[Mesa-dev] [PATCH 3/3] gallivm: implement new float comparison instructions returning integer masks

2013-08-12 Thread sroland

From: Roland Scheidegger 

FSEQ/FSGE/FSLT/FSNE work just like SEQ/SGE/SLT/SNE, except that they skip
the select and return the comparison mask directly.
For consistency, the old opcodes now use the same ordered/unordered
comparisons as well.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c |   81 +++-
 1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index f461661..86c3249 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -1094,6 +1094,70 @@ f2u_emit_cpu(
 emit_data->args[0]);
 }
 
+/* TGSI_OPCODE_FSET Helper (CPU Only) */
+static void
+fset_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data,
+   unsigned pipe_func)
+{
+   LLVMValueRef cond;
+
+   if (pipe_func != PIPE_FUNC_NOTEQUAL) {
+  cond = lp_build_cmp_ordered(&bld_base->base, pipe_func,
+  emit_data->args[0], emit_data->args[1]);
+   }
+   else {
+  cond = lp_build_cmp(&bld_base->base, pipe_func,
+  emit_data->args[0], emit_data->args[1]);
+
+   }
+   emit_data->output[emit_data->chan] = cond;
+}
+
+
+/* TGSI_OPCODE_FSEQ (CPU Only) */
+static void
+fseq_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   fset_emit_cpu(action, bld_base, emit_data, PIPE_FUNC_EQUAL);
+}
+
+/* TGSI_OPCODE_FSGE (CPU Only) */
+static void
+fsge_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   fset_emit_cpu(action, bld_base, emit_data, PIPE_FUNC_GEQUAL);
+}
+
+/* TGSI_OPCODE_FSLT (CPU Only) */
+static void
+fslt_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   fset_emit_cpu(action, bld_base, emit_data, PIPE_FUNC_LESS);
+}
+
+/* TGSI_OPCODE_FSNE (CPU Only) */
+
+static void
+fsne_emit_cpu(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   fset_emit_cpu(action, bld_base, emit_data, PIPE_FUNC_NOTEQUAL);
+}
+
 /* TGSI_OPCODE_FLR (CPU Only) */
 
 static void
@@ -1396,8 +1460,17 @@ set_emit_cpu(
struct lp_build_emit_data * emit_data,
unsigned pipe_func)
 {
-   LLVMValueRef cond = lp_build_cmp(&bld_base->base, pipe_func,
-emit_data->args[0], emit_data->args[1]);
+   LLVMValueRef cond;
+
+   if (pipe_func != PIPE_FUNC_NOTEQUAL) {
+  cond = lp_build_cmp_ordered(&bld_base->base, pipe_func,
+  emit_data->args[0], emit_data->args[1]);
+   }
+   else {
+  cond = lp_build_cmp(&bld_base->base, pipe_func,
+  emit_data->args[0], emit_data->args[1]);
+
+   }
emit_data->output[emit_data->chan] = lp_build_select(&bld_base->base,
   cond,
   bld_base->base.one,
@@ -1716,6 +1789,10 @@ lp_set_default_actions_cpu(
bld_base->op_actions[TGSI_OPCODE_F2I].emit = f2i_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_F2U].emit = f2u_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_FLR].emit = flr_emit_cpu;
+   bld_base->op_actions[TGSI_OPCODE_FSEQ].emit = fseq_emit_cpu;
+   bld_base->op_actions[TGSI_OPCODE_FSGE].emit = fsge_emit_cpu;
+   bld_base->op_actions[TGSI_OPCODE_FSLT].emit = fslt_emit_cpu;
+   bld_base->op_actions[TGSI_OPCODE_FSNE].emit = fsne_emit_cpu;
 
bld_base->op_actions[TGSI_OPCODE_I2F].emit = i2f_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_IABS].emit = iabs_emit_cpu;
-- 
1.7.9.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
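The difference between the old and new comparison opcodes described in this patch can be illustrated per scalar channel. This is only a sketch, not the gallivm code: the function names are hypothetical, and the all-ones mask value is an assumption based on the patch title ("returning integer masks"). C's `==`, `>=` and `<` are ordered comparisons (false when an operand is NaN) and `!=` is unordered (true when an operand is NaN), which happens to match the ordered/unordered split the patch makes.

```c
#include <math.h>
#include <stdint.h>

/* SEQ-style: compare, then select between the float constants 1.0 and 0.0. */
static float seq_scalar(float a, float b)
{
   return (a == b) ? 1.0f : 0.0f;
}

/* FSEQ-style: same ordered compare, but the select is skipped and the
 * integer mask is returned (all ones assumed for "true"). */
static uint32_t fseq_scalar(float a, float b)
{
   return (a == b) ? UINT32_MAX : 0u;
}

/* FSNE-style: unordered compare, so a NaN operand yields "true". */
static uint32_t fsne_scalar(float a, float b)
{
   return (a != b) ? UINT32_MAX : 0u;
}
```

With NaN inputs, `fseq_scalar(NAN, NAN)` gives an all-zero mask while `fsne_scalar(NAN, 2.0f)` gives an all-ones mask, which is the behaviour the ordered/unordered distinction is meant to preserve.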

Re: [Mesa-dev] [PATCH 3/3] gallivm: implement new float comparison instructions returning integer masks

2013-08-12 Thread Zack Rusin

Nice. The entire series looks good.

Reviewed-by: Zack Rusin 


[Mesa-dev] [PATCH] tgsi_build: fix order of arguments for ind register build

2013-08-12 Thread Dave Airlie

From: Dave Airlie 

This was broken when arrayid was added.

Signed-off-by: Dave Airlie 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c |  2 +-
 src/gallium/renderer/virgl_hw.h | 39 +
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index 626faad..9c00cb6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -875,8 +875,8 @@ static struct tgsi_ind_register
 tgsi_build_ind_register(
unsigned file,
unsigned swizzle,
-   unsigned arrayid,
int index,
+   unsigned arrayid,
struct tgsi_instruction *instruction,
struct tgsi_header *header )
 {
diff --git a/src/gallium/renderer/virgl_hw.h b/src/gallium/renderer/virgl_hw.h
index 2a8be61..71989cc 100644
--- a/src/gallium/renderer/virgl_hw.h
+++ b/src/gallium/renderer/virgl_hw.h
@@ -276,4 +276,43 @@ enum virgl_formats {
VIRGL_FORMAT_MAX,
 };
 
+struct virgl_caps_bool_set1 {
+unsigned indep_blend_enable:1;
+unsigned indep_blend_func:1;
+unsigned cube_map_array:1;
+unsigned shader_stencil_export:1;
+unsigned conditional_render:1;
+unsigned start_instance:1;
+unsigned primitive_restart:1;
+unsigned blend_eq_sep:1;
+unsigned instanceid:1;
+unsigned vertex_element_instance_divisor:1;
+unsigned seamless_cube_map:1;
+unsigned occlusion_query:1;
+unsigned timer_query:1;
+unsigned streamout_pause_resume:1;
+};
+
+/* endless expansion capabilities - current gallium has 252 formats */
+struct virgl_supported_format_mask {
+uint32_t bitmask[16];
+};
+/* capabilities set 2 - version 1 - 32-bit and float values */
+struct virgl_caps_v1 {
+struct virgl_caps_bool_set1 bset;
+uint32_t glsl_level;
+uint32_t max_texture_array_layers;
+uint32_t max_streamout_buffers;
+uint32_t max_dual_source_render_targets;
+uint32_t max_render_targets;
+struct virgl_supported_format_mask sampler;
+struct virgl_supported_format_mask fb;
+struct virgl_supported_format_mask depthstencil;
+struct virgl_supported_format_mask vertexbuffer;
+};
+
+union virgl_caps {
+uint32_t max_version;
+struct virgl_caps_v1 v1;
+};
 #endif
-- 
1.8.3.1
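Why a swapped parameter pair like this goes unnoticed by the compiler can be sketched as follows. The struct and function names are hypothetical stand-ins, not the tgsi_build.c code, and the caller's argument order (index before arrayid) is an assumption inferred from the direction of the fix. Because `int` and `unsigned` convert implicitly, both orderings compile and the values simply land in the wrong fields.

```c
/* Hypothetical stand-in for the indirect register token. */
struct ind_register {
   unsigned file;
   unsigned swizzle;
   int index;
   unsigned array_id;
};

/* Parameter order after the fix: index first, then arrayid. */
static struct ind_register
build_ind_register(unsigned file, unsigned swizzle,
                   int index, unsigned arrayid)
{
   struct ind_register r;
   r.file = file;
   r.swizzle = swizzle;
   r.index = index;
   r.array_id = arrayid;
   return r;
}

/* Order before the fix, kept only for comparison. */
static struct ind_register
build_ind_register_swapped(unsigned file, unsigned swizzle,
                           unsigned arrayid, int index)
{
   struct ind_register r;
   r.file = file;
   r.swizzle = swizzle;
   r.index = index;
   r.array_id = arrayid;
   return r;
}
```

A caller passing `(file, swizzle, 5, 2)` and meaning index 5, array 2 gets index 2 and array 5 from the swapped variant, with no diagnostic at default warning levels.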


[Mesa-dev] [PATCH] draw: make sure that the stages setup outputs

2013-08-12 Thread Zack Rusin

Calling prepare outputs cleans up the slot assignments for
outputs. Unfortunately, aapoint and aaline didn't have code
to reset their slots after the initial setup, and this was
messing up our slot assignments. The unfilled stage was
simply missing the initial assignment of the face slot.
This fixes all of the reported piglit failures.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_context.c   |2 +
 src/gallium/auxiliary/draw/draw_pipe.h  |5 +-
 src/gallium/auxiliary/draw/draw_pipe_aaline.c   |   27 ---
 src/gallium/auxiliary/draw/draw_pipe_aapoint.c  |   56 ++-
 src/gallium/auxiliary/draw/draw_pipe_unfilled.c |2 +
 5 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 2d4843e..d1fac0c 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -564,6 +564,8 @@ draw_prepare_shader_outputs(struct draw_context *draw)
draw_remove_extra_vertex_attribs(draw);
draw_prim_assembler_prepare_outputs(draw->ia);
draw_unfilled_prepare_outputs(draw, draw->pipeline.unfilled);
+   draw_aapoint_prepare_outputs(draw, draw->pipeline.aapoint);
+   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
 }
 
 /**
diff --git a/src/gallium/auxiliary/draw/draw_pipe.h 
b/src/gallium/auxiliary/draw/draw_pipe.h
index 7c9ed6c..ad3165f 100644
--- a/src/gallium/auxiliary/draw/draw_pipe.h
+++ b/src/gallium/auxiliary/draw/draw_pipe.h
@@ -101,7 +101,10 @@ void draw_pipe_passthrough_tri(struct draw_stage *stage, 
struct prim_header *hea
 void draw_pipe_passthrough_line(struct draw_stage *stage, struct prim_header 
*header);
 void draw_pipe_passthrough_point(struct draw_stage *stage, struct prim_header 
*header);
 
-
+void draw_aapoint_prepare_outputs(struct draw_context *context,
+  struct draw_stage *stage);
+void draw_aaline_prepare_outputs(struct draw_context *context,
+ struct draw_stage *stage);
 void draw_unfilled_prepare_outputs(struct draw_context *context,
struct draw_stage *stage);
 
diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c 
b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
index aa88459..c44c236 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
@@ -692,13 +692,7 @@ aaline_first_line(struct draw_stage *stage, struct 
prim_header *header)
   return;
}
 
-   /* update vertex attrib info */
-   aaline->pos_slot = draw_current_shader_position_output(draw);;
-
-   /* allocate the extra post-transformed vertex attribute */
-   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
- TGSI_SEMANTIC_GENERIC,
- 
aaline->fs->generic_attrib);
+   draw_aaline_prepare_outputs(draw, draw->pipeline.aaline);
 
/* how many samplers? */
/* we'll use sampler/texture[pstip->sampler_unit] for the stipple */
@@ -953,6 +947,25 @@ aaline_set_sampler_views(struct pipe_context *pipe,
 }
 
 
+void
+draw_aaline_prepare_outputs(struct draw_context *draw,
+struct draw_stage *stage)
+{
+   struct aaline_stage *aaline = aaline_stage(stage);
+   const struct pipe_rasterizer_state *rast = draw->rasterizer;
+
+   /* update vertex attrib info */
+   aaline->pos_slot = draw_current_shader_position_output(draw);
+
+   if (!rast->line_smooth)
+  return;
+
+   /* allocate the extra post-transformed vertex attribute */
+   aaline->tex_slot = draw_alloc_extra_vertex_attrib(draw,
+ TGSI_SEMANTIC_GENERIC,
+ 
aaline->fs->generic_attrib);
+}
+
 /**
  * Called by drivers that want to install this AA line prim stage
  * into the draw module's pipeline.  This will not be used if the
diff --git a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c 
b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
index 0d7b88e..7ae1ddd 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_aapoint.c
@@ -696,28 +696,7 @@ aapoint_first_point(struct draw_stage *stage, struct 
prim_header *header)
 */
bind_aapoint_fragment_shader(aapoint);
 
-   /* update vertex attrib info */
-   aapoint->pos_slot = draw_current_shader_position_output(draw);
-
-   /* allocate the extra post-transformed vertex attribute */
-   aapoint->tex_slot = draw_alloc_extra_vertex_attrib(draw,
-  TGSI_SEMANTIC_GENERIC,
-  
aapoint->fs->generic_attrib);
-   assert(aapoint->tex_slot > 0); /* output[0] is vertex pos */
-
-   /* find psize slot in post-transform vertex */
-   aapoint->psize_slot = -1;
-   if (draw->rasterizer-

Re: [Mesa-dev] [PATCH 3/4] R600/SI: Allow conversion between v32i8 and v8i32

2013-08-12 Thread Tom Stellard

On Sat, Aug 10, 2013 at 08:50:31PM +0200, Marek Olšák wrote:
> Signed-off-by: Marek Olšák 

You will need to add a test case to this commit, but otherwise the whole
series is:

Reviewed-by: Tom Stellard 

Do you have commit access yet?

-Tom

> ---
>  lib/Target/R600/SIInstructions.td | 5 +
>  lib/Target/R600/SIRegisterInfo.td | 4 ++--
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/Target/R600/SIInstructions.td 
> b/lib/Target/R600/SIInstructions.td
> index d941035..be2e290 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1506,6 +1506,11 @@ def : BitConvert ;
>  def : BitConvert ;
>  def : BitConvert ;
>  
> +def : BitConvert ;
> +def : BitConvert ;
> +def : BitConvert ;
> +def : BitConvert ;
> +
>  /** === **/
>  /** Src & Dst modifiers **/
>  /** === **/
> diff --git a/lib/Target/R600/SIRegisterInfo.td 
> b/lib/Target/R600/SIRegisterInfo.td
> index 292b9d2..82d1e71 100644
> --- a/lib/Target/R600/SIRegisterInfo.td
> +++ b/lib/Target/R600/SIRegisterInfo.td
> @@ -159,7 +159,7 @@ def SReg_64 : RegisterClass<"AMDGPU", [v2i32, i64, i1], 
> 64,
>  
>  def SReg_128 : RegisterClass<"AMDGPU", [v16i8, i128], 128, (add SGPR_128)>;
>  
> -def SReg_256 : RegisterClass<"AMDGPU", [v32i8], 256, (add SGPR_256)>;
> +def SReg_256 : RegisterClass<"AMDGPU", [v32i8, v8i32, v8f32], 256, (add 
> SGPR_256)>;
>  
>  def SReg_512 : RegisterClass<"AMDGPU", [v64i8], 512, (add SGPR_512)>;
>  
> @@ -174,7 +174,7 @@ def VReg_96 : RegisterClass<"AMDGPU", [untyped], 96, (add 
> VGPR_96)> {
>  
>  def VReg_128 : RegisterClass<"AMDGPU", [v4i32, v4f32], 128, (add VGPR_128)>;
>  
> -def VReg_256 : RegisterClass<"AMDGPU", [v8i32, v8f32], 256, (add VGPR_256)>;
> +def VReg_256 : RegisterClass<"AMDGPU", [v32i8, v8i32, v8f32], 256, (add 
> VGPR_256)>;
>  
>  def VReg_512 : RegisterClass<"AMDGPU", [v16i32, v16f32], 512, (add 
> VGPR_512)>;
>  
> -- 
> 1.8.1.2
> 

[Mesa-dev] [Bug 67740] render2.c: In function '__indirect_glMap1d'

2013-08-12 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=67740

--- Comment #2 from Henry  ---
from the tarball.

./configure --prefix=/local PYTHON2=/local/bin/python2.7

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] update wayland requirement

2013-08-12 Thread Fabio Pedretti

Since 8d29b52 wayland 1.2.0 is required.

--- configure.ac 2013-08-12 13:17:10.0 +0200
+++ configure.ac 2013-08-12 13:19:15.0 +0200
@@ -1433,7 +1433,7 @@ egl_platforms=`IFS=', '; echo $with_egl_
 for plat in $egl_platforms; do
case "$plat" in
wayland)
-   PKG_CHECK_MODULES([WAYLAND], [wayland-client >= 1.0.2 
wayland-server >= 1.0.2])
+   PKG_CHECK_MODULES([WAYLAND], [wayland-client >= 1.2.0 
wayland-server >= 1.2.0])
GALLIUM_WINSYS_DIRS="$GALLIUM_WINSYS_DIRS sw/wayland"
 
 WAYLAND_PREFIX=`$PKG_CONFIG --variable=prefix wayland-client`

[Mesa-dev] [PATCH] llvmpipe: fix pipeline statistics with a null ps

2013-08-12 Thread Zack Rusin

If the fragment shader is null, then pixel shader invocations have
to be equal to zero. If we're running a null ps, then clipper
invocations and primitives should also be equal to zero, but only
if both stencil and depth testing are disabled.

Signed-off-by: Zack Rusin 
---
 src/gallium/drivers/llvmpipe/lp_query.c |   30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index cea2d07..fb24c36 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -32,6 +32,7 @@
 
 #include "draw/draw_context.h"
 #include "pipe/p_defines.h"
+#include "tgsi/tgsi_scan.h"
 #include "util/u_memory.h"
 #include "os/os_time.h"
 #include "lp_context.h"
@@ -95,6 +96,7 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
   union pipe_query_result *vresult)
 {
struct llvmpipe_screen *screen = llvmpipe_screen(pipe->screen);
+   struct llvmpipe_context *llvmpipe = llvmpipe_context(pipe);
unsigned num_threads = MAX2(1, screen->num_threads);
struct llvmpipe_query *pq = llvmpipe_query(q);
uint64_t *result = (uint64_t *)vresult;
@@ -166,11 +168,31 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
case PIPE_QUERY_PIPELINE_STATISTICS: {
   struct pipe_query_data_pipeline_statistics *stats =
  (struct pipe_query_data_pipeline_statistics *)vresult;
-  /* only ps_invocations come from binned query */
-  for (i = 0; i < num_threads; i++) {
- pq->stats.ps_invocations += pq->end[i];
+  /* If we're running on what's considered a null fragment
+   * shader, i.e. fragment shader consisting of a single
+   * END opcode or if the fragment shader is null then
+   * the number of ps_invocations should be zero */
+  if (llvmpipe->fs && llvmpipe->fs->info.base.num_tokens > 1) {
+ /* only ps_invocations come from binned query */
+ for (i = 0; i < num_threads; i++) {
+pq->stats.ps_invocations += pq->end[i];
+ }
+ pq->stats.ps_invocations *=
+LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
+  } else {
+ /* 
+  * Clipper primitives and invocations are equal to zero
+  * if we're running a null fragment shader but only
+  * if both stencil and depth testing are disabled.
+  */
+ if (!llvmpipe->depth_stencil->depth.enabled &&
+ !llvmpipe->depth_stencil->stencil[0].enabled &&
+ !llvmpipe->depth_stencil->stencil[1].enabled) {
+pq->stats.c_primitives = 0;
+pq->stats.c_invocations = 0;
+ }
+ pq->stats.ps_invocations = 0;
   }
-  pq->stats.ps_invocations *= LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
   *stats = pq->stats;
}
   break;
-- 
1.7.10.4
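The decision logic of this patch can be restated as a small standalone sketch. The types, field names and the block-size value below are hypothetical placeholders, not llvmpipe's definitions; only the branching mirrors the patch: a shader with more than one token counts as a real fragment shader, otherwise ps_invocations is zeroed, and the clipper counters are zeroed only when depth and both stencil faces are disabled.

```c
#include <stdbool.h>
#include <stdint.h>

#define RASTER_BLOCK_SIZE 4 /* placeholder for LP_RASTER_BLOCK_SIZE */

struct pipeline_stats {
   uint64_t c_invocations;
   uint64_t c_primitives;
   uint64_t ps_invocations;
};

/* Hypothetical flattened view of the state the patch inspects. */
struct toy_state {
   bool has_fs;
   unsigned fs_num_tokens;
   bool depth_enabled;
   bool stencil_front_enabled;
   bool stencil_back_enabled;
};

static void
finalize_stats(const struct toy_state *st,
               const uint64_t *per_thread_blocks, unsigned num_threads,
               struct pipeline_stats *stats)
{
   unsigned i;

   if (st->has_fs && st->fs_num_tokens > 1) {
      /* Real fragment shader: sum the per-thread block counts and
       * scale by the number of pixels in a block. */
      for (i = 0; i < num_threads; i++)
         stats->ps_invocations += per_thread_blocks[i];
      stats->ps_invocations *= RASTER_BLOCK_SIZE * RASTER_BLOCK_SIZE;
   } else {
      /* Null fragment shader: clipper counters are zero only if
       * depth and stencil testing are all disabled. */
      if (!st->depth_enabled &&
          !st->stencil_front_enabled &&
          !st->stencil_back_enabled) {
         stats->c_primitives = 0;
         stats->c_invocations = 0;
      }
      stats->ps_invocations = 0;
   }
}
```

Note that a null fragment shader with depth testing enabled still reports zero ps_invocations but keeps its clipper counters.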

Re: [Mesa-dev] Patches: R600: Merge R600 and SI vector op expansions

2013-08-12 Thread Michel Dänzer

On Mon, 2013-08-12 at 15:25 -0700, Tom Stellard wrote:
> 
> The attached patches expand a few more vector operations and also move
> the expansion code into AMDGPUISelLowering.cpp so it can be shared
> between R600 and SI.

This series is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer

