[Mesa-dev] Pull request for 1.50 GS layout qualifiers

2013-06-13 Thread Eric Anholt
Hey Paul!  I got the layout qualifiers working.  It's unblocked things
so I can finish off a bunch of testcases I've been working on, so I'd
like to get it in your gs branch so we can all enjoy testcases together.

There's one not-for-upstream commit in here, and do note the TODO in the
last commit (and we need tests for this feature still).  Oh, and there's
one little prep commit for UBOs, too.
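
For context on that last commit, here is a rough standalone sketch of what cross-validating GS layout qualifiers across the shaders of one stage could look like. It assumes the GLSL 1.50 rule that every geometry shader that declares an input primitive type, output primitive type, or max_vertices must agree with the others. The struct and function names are hypothetical and are not the ones used in the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-shader record of declared GS layout qualifiers. */
struct gs_layout {
   bool has_input_type;   /* e.g. layout(triangles) in; */
   int input_type;
   bool has_output_type;  /* e.g. layout(triangle_strip) out; */
   int output_type;
   int max_vertices;      /* -1 when not declared */
};

/* Merge one declared value into the stage-wide value; false on conflict. */
static bool
merge_value(bool declared, int value, bool *have, int *merged)
{
   if (!declared)
      return true;
   if (*have && *merged != value)
      return false;
   *have = true;
   *merged = value;
   return true;
}

/* Combine the qualifiers of all shaders in the stage into *linked.
 * Returns false as soon as two shaders disagree. */
static bool
cross_validate_gs_layouts(const struct gs_layout *shaders, size_t count,
                          struct gs_layout *linked)
{
   bool have_max = false;
   size_t i;

   assert(shaders && linked);
   linked->has_input_type = false;
   linked->input_type = 0;
   linked->has_output_type = false;
   linked->output_type = 0;
   linked->max_vertices = -1;

   for (i = 0; i < count; i++) {
      const struct gs_layout *s = &shaders[i];

      if (!merge_value(s->has_input_type, s->input_type,
                       &linked->has_input_type, &linked->input_type) ||
          !merge_value(s->has_output_type, s->output_type,
                       &linked->has_output_type, &linked->output_type) ||
          !merge_value(s->max_vertices >= 0, s->max_vertices,
                       &have_max, &linked->max_vertices))
         return false;
   }
   return true;
}
```

A shader that declares nothing simply contributes nothing, so one compilation unit can carry the input layout and another the output layout.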

I'm planning on sending out these commits:
  glsl: Make _mesa_print_ir() available from anything including ir.h.
  glsl: Remove ir_print_visitor.h includes and usage
  mesa: Use shared code for converting shader targets to short strings.
  mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.

for review, plus a port of your "Make files buildable from C" to the
list, since they seem like a good cleanup, together.

The following changes since commit 4e6d6dbfab79d9e7aff5d26c585d6e77b36db0f2:

  !UPSTREAM: Handle GS_OPCODE_THREAD_END in implied_mrf_writes() (2013-06-12 11:09:01 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~anholt/mesa gs-qualifiers

for you to fetch changes up to dbe3e86de06813ea0619dd9035f328372c9caab2:

  glsl: Cross-validate GS layout qualifiers while intrastage linking. (2013-06-13 18:04:29 -0700)


Eric Anholt (11):
  mesa: Expose uniform buffers in geometry shaders.
  glsl: Make _mesa_print_ir() available from anything including ir.h.
  glsl: Remove ir_print_visitor.h includes and usage
  mesa: Use shared code for converting shader targets to short strings.
  mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
  glsl: Include EmitVertex() and EndPrimitive() prototypes for GLSL 1.50 GS.
  glsl: !UPSTREAM: Spam in builtin 1.30 variables for 1.50 GSes.
  glsl: Make sure that we don't put too many bitfields in ast_type_qualifier.
  glsl: Parse the GLSL 1.50 GS layout qualifiers.
  glsl: Export the compiler's GS layout qualifiers to the gl_shader.
  glsl: Cross-validate GS layout qualifiers while intrastage linking.

 src/glsl/ast.h |  12 ++
 src/glsl/ast_to_hir.cpp|   2 +
 src/glsl/ast_type.cpp  |  23 
 src/glsl/builtin_variables.cpp |   6 +
 src/glsl/builtins/profiles/150.geom|   3 +
 src/glsl/glsl_parser.yy|  69 +-
 src/glsl/glsl_parser_extras.cpp| 153 -
 src/glsl/glsl_parser_extras.h  |  11 ++
 src/glsl/ir.h  |   8 ++
 src/glsl/ir_print_visitor.cpp  |   3 +
 src/glsl/ir_print_visitor.h|   3 -
 src/glsl/ir_rvalue_visitor.cpp |   1 -
 src/glsl/link_varyings.cpp |  12 +-
 src/glsl/linker.cpp| 115 +---
 src/glsl/linker.h  |   3 -
 src/glsl/main.cpp  |  60 +---
 src/glsl/opt_array_splitting.cpp   |   1 -
 src/glsl/opt_noop_swizzle.cpp  |   1 -
 src/glsl/opt_structure_splitting.cpp   |   1 -
 src/glsl/program.h |  16 ++-
 src/glsl/test_optpass.cpp  |   1 -
 src/mesa/drivers/dri/i965/brw_fs.cpp   |   1 -
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp  |   1 -
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   1 -
 .../drivers/dri/i965/brw_fs_vector_splitting.cpp   |   1 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |   5 +-
 .../drivers/dri/i965/brw_schedule_instructions.cpp |   1 -
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  10 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |   1 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   1 -
 .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |   1 -
 src/mesa/main/ff_fragment_shader.cpp   |   1 -
 src/mesa/main/mtypes.h |  18 +++
 src/mesa/main/shaderapi.c  |  74 ++
 src/mesa/main/uniform_query.cpp|   9 +-
 src/mesa/program/ir_to_mesa.cpp| 102 +-
 src/mesa/program/ir_to_mesa.h  |   1 -
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  12 +-
 38 files changed, 491 insertions(+), 253 deletions(-)
 create mode 100644 src/glsl/builtins/profiles/150.geom


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-13 Thread Zack Rusin
> > Though I find stream output very confusing...
> 
> I agree. I was digging a bit more and I think I was correct the first time.
> The D3D spec is very clear that "a buffer cannot be bound as both an input
> and an output at the same time", so I think the current behavior is correct,
> or at least one of the correct options given that the behavior is simply
> undefined. So I think I'm going to skip this patch, especially since it is
> subtly wrong (because it will clear so target buffers on each invocation of
> the stream output stage which isn't correct behavior since the buffers
> should only be cleared when new so targets are set).

Actually I'd just like to commit the attached patch. All it does is move
the clearing of the so targets from the drivers to the draw module. It fixes
a bug in softpipe, because softpipe would never clear the buffers and would
always append.
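
The reset that the attached patch moves into the draw module can be sketched in isolation roughly as follows. This is a simplified standalone illustration, not the draw code itself: so_target, so_state and MAX_SO_BUFFERS are stand-ins for draw_so_target, the draw context's so state and PIPE_MAX_SO_BUFFERS.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SO_BUFFERS 4   /* stand-in for PIPE_MAX_SO_BUFFERS */

/* Simplified stand-in for struct draw_so_target. */
struct so_target {
   unsigned internal_offset;
   unsigned emitted_vertices;
};

/* Simplified stand-in for the so state kept in the draw context. */
struct so_state {
   struct so_target *targets[MAX_SO_BUFFERS];
   int num_targets;
};

/* Bind the targets; every target whose append bit is clear starts over. */
static void
set_so_targets(struct so_state *so, int num_targets,
               struct so_target *targets[MAX_SO_BUFFERS],
               unsigned append_bitmask)
{
   int i;

   assert(num_targets >= 0 && num_targets <= MAX_SO_BUFFERS);

   for (i = 0; i < num_targets; i++) {
      so->targets[i] = targets[i];
      /* Not appending: reset the bookkeeping of this target. */
      if (!(append_bitmask & (1u << i)) && so->targets[i]) {
         so->targets[i]->internal_offset = 0;
         so->targets[i]->emitted_vertices = 0;
      }
   }
   for (i = num_targets; i < MAX_SO_BUFFERS; i++)
      so->targets[i] = NULL;

   so->num_targets = num_targets;
}
```

Binding two targets with only bit 1 set in the bitmask resets target 0 and leaves the offset and vertex count of target 1 untouched. A driver that never performs the reset behaves as if every bit were set, which is the always-append behaviour described above for softpipe.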

z

From a4a89e8f39a127474c668cb72a7db24038396731 Mon Sep 17 00:00:00 2001
From: Zack Rusin 
Date: Thu, 13 Jun 2013 17:57:47 -0400
Subject: [PATCH] draw: clear the draw buffers in draw

Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_context.c |   12 ++--
 src/gallium/auxiliary/draw/draw_context.h |3 ++-
 src/gallium/drivers/llvmpipe/lp_context.h |1 +
 src/gallium/drivers/llvmpipe/lp_draw_arrays.c |4 ++--
 src/gallium/drivers/llvmpipe/lp_state_so.c|8 ++--
 src/gallium/drivers/softpipe/sp_context.h |1 +
 src/gallium/drivers/softpipe/sp_draw_arrays.c |4 ++--
 src/gallium/drivers/softpipe/sp_state_so.c|1 +
 8 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index 22c0e9b..53f515e 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -809,12 +809,20 @@ draw_get_rasterizer_no_cull( struct draw_context *draw,
 void
 draw_set_mapped_so_targets(struct draw_context *draw,
int num_targets,
-   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS])
+   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
+   unsigned append_bitmask)
 {
int i;
 
-   for (i = 0; i < num_targets; i++)
+   for (i = 0; i < num_targets; i++) {
   draw->so.targets[i] = targets[i];
+  /* if we're not appending then lets reset the internal
+ data of our so target */
+  if (!(append_bitmask & (1 << i)) && draw->so.targets[i]) {
+ draw->so.targets[i]->internal_offset = 0;
+ draw->so.targets[i]->emitted_vertices = 0;
+  }
+   }
for (i = num_targets; i < PIPE_MAX_SO_BUFFERS; i++)
   draw->so.targets[i] = NULL;
 
diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
index 4a1b27e..ae63068 100644
--- a/src/gallium/auxiliary/draw/draw_context.h
+++ b/src/gallium/auxiliary/draw/draw_context.h
@@ -231,7 +231,8 @@ draw_set_mapped_constant_buffer(struct draw_context *draw,
 void
 draw_set_mapped_so_targets(struct draw_context *draw,
int num_targets,
-   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]);
+   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
+   unsigned append_bitmask);
 
 
 /***
diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h
index abfe852..0515968 100644
--- a/src/gallium/drivers/llvmpipe/lp_context.h
+++ b/src/gallium/drivers/llvmpipe/lp_context.h
@@ -91,6 +91,7 @@ struct llvmpipe_context {
 
struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS];
int num_so_targets;
+   unsigned so_append_bitmask;
struct pipe_query_data_so_statistics so_stats;
unsigned num_primitives_generated;
 
diff --git a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
index 4e23904..11b665a 100644
--- a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
+++ b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
@@ -104,7 +104,7 @@ llvmpipe_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
   }
}
draw_set_mapped_so_targets(draw, lp->num_so_targets,
-  lp->so_targets);
+  lp->so_targets, lp->so_append_bitmask);
 
llvmpipe_prepare_vertex_sampling(lp,
 lp->num_sampler_views[PIPE_SHADER_VERTEX],
@@ -134,7 +134,7 @@ llvmpipe_draw_vbo(struct pipe_context *pipe, const struct pipe_draw_info *info)
if (mapped_indices) {
   draw_set_indexes(draw, NULL, 0, 0);
}
-   draw_set_mapped_so_targets(draw, 0, NULL);
+   draw_set_mapped_so_targets(draw, 0, N

Re: [Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-13 Thread Zack Rusin
> Though I find stream output very confusing...

I agree. I was digging a bit more and I think I was correct the first time. The 
D3D spec is very clear that "a buffer cannot be bound as both an input and an 
output at the same time", so I think the current behavior is correct, or at 
least one of the correct options given that the behavior is simply undefined. 
So I think I'm going to skip this patch, especially since it is subtly wrong 
(because it will clear so target buffers on each invocation of the stream 
output stage which isn't correct behavior since the buffers should only be 
cleared when new so targets are set).
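
To make the objection concrete, here is a small standalone sketch that compares resetting on every stream output invocation with resetting only when a target is set. It is a toy model with made-up names (so_target, bind_target, stream_out, two_draws), not the draw module code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct draw_so_target. */
struct so_target {
   unsigned emitted_vertices;
};

/* Called when a new target is set: reset unless the caller asked to append. */
static void
bind_target(struct so_target *t, bool append)
{
   assert(t);
   if (!append)
      t->emitted_vertices = 0;
}

/* One stream output invocation. reset_first models the rejected
 * behaviour of clearing the target on every invocation. */
static void
stream_out(struct so_target *t, unsigned num_vertices, bool reset_first)
{
   assert(t);
   if (reset_first)
      t->emitted_vertices = 0;
   t->emitted_vertices += num_vertices;
}

/* Bind once, then draw twice; returns the vertex count seen afterwards. */
static unsigned
two_draws(bool reset_on_every_invocation)
{
   struct so_target t = { 99 };   /* stale count from an earlier use */

   bind_target(&t, false);
   stream_out(&t, 3, reset_on_every_invocation);
   stream_out(&t, 3, reset_on_every_invocation);
   return t.emitted_vertices;
}
```

With these toy numbers, two_draws(true) returns 3 because the second draw wipes out the first, while two_draws(false) returns 6, which is the count a later draw auto should see.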

z


Re: [Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-13 Thread Roland Scheidegger
Am 14.06.2013 00:04, schrieb Zack Rusin:
> Since draw auto fetches the count from the buffers, we can't
> just clear them on bind; we need to wait until the actual
> stream out is performed. Otherwise the count for draw auto
> will be zero. Plus it is cleaner to have draw do it rather
> than drivers having to mess with draw's internals.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/draw/draw_context.c |4 +++-
>  src/gallium/auxiliary/draw/draw_context.h |3 ++-
>  src/gallium/auxiliary/draw/draw_private.h |1 +
>  src/gallium/auxiliary/draw/draw_pt_so_emit.c  |   20 
>  src/gallium/drivers/llvmpipe/lp_context.h |1 +
>  src/gallium/drivers/llvmpipe/lp_draw_arrays.c |4 ++--
>  src/gallium/drivers/llvmpipe/lp_state_so.c|8 ++--
>  src/gallium/drivers/softpipe/sp_context.h |1 +
>  src/gallium/drivers/softpipe/sp_draw_arrays.c |4 ++--
>  src/gallium/drivers/softpipe/sp_state_so.c|1 +
>  10 files changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
> index 4a08765..f463739 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -810,7 +810,8 @@ draw_get_rasterizer_no_cull( struct draw_context *draw,
>  void
>  draw_set_mapped_so_targets(struct draw_context *draw,
> int num_targets,
> -   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS])
> +   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
> +   unsigned append_bitmask)
>  {
> int i;
>  
> @@ -820,6 +821,7 @@ draw_set_mapped_so_targets(struct draw_context *draw,
>draw->so.targets[i] = NULL;
>  
> draw->so.num_targets = num_targets;
> +   draw->so.append_bitmask = append_bitmask;
>  }
>  
>  void
> diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
> index 4a1b27e..ae63068 100644
> --- a/src/gallium/auxiliary/draw/draw_context.h
> +++ b/src/gallium/auxiliary/draw/draw_context.h
> @@ -231,7 +231,8 @@ draw_set_mapped_constant_buffer(struct draw_context *draw,
>  void
>  draw_set_mapped_so_targets(struct draw_context *draw,
> int num_targets,
> -   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]);
> +   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
> +   unsigned append_bitmask);
>  
>  
>  /***
> diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h
> index fd52c2d..4dda90e 100644
> --- a/src/gallium/auxiliary/draw/draw_private.h
> +++ b/src/gallium/auxiliary/draw/draw_private.h
> @@ -290,6 +290,7 @@ struct draw_context
> struct {
>struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS];
>uint num_targets;
> +  uint append_bitmask;
> } so;
>  
> /* Clip derived state:
> diff --git a/src/gallium/auxiliary/draw/draw_pt_so_emit.c b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> index d624a99..785aa34 100644
> --- a/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> +++ b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> @@ -77,6 +77,24 @@ draw_has_so(const struct draw_context *draw)
> return FALSE;
>  }
>  
> +static void
> +clean_so_buffers(struct pt_so_emit *emit)
> +{
> +   struct draw_context *draw = emit->draw;
> +   unsigned i;
> +
> +   debug_assert(emit->has_so);
> +
> +   for (i = 0; i < draw->so.num_targets; i++) {
> +  /* if we're not appending then lets reset the internal
> + data of our so target */
> +  if (!(draw->so.append_bitmask & (1 << i)) && draw->so.targets[i]) {
> + draw->so.targets[i]->internal_offset = 0;
> + draw->so.targets[i]->emitted_vertices = 0;
> +  }
> +   }
> +}
> +
>  void draw_pt_so_emit_prepare(struct pt_so_emit *emit, boolean use_pre_clip_pos)
>  {
> struct draw_context *draw = emit->draw;
> @@ -257,6 +275,8 @@ void draw_pt_so_emit( struct pt_so_emit *emit,
> if (!draw->so.num_targets)
>return;
>  
> +   clean_so_buffers(emit);
> +
> emit->emitted_vertices = 0;
> emit->emitted_primitives = 0;
> emit->generated_primitives = 0;
> diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h
> index abfe852..0515968 100644
> --- a/src/gallium/drivers/llvmpipe/lp_context.h
> +++ b/src/gallium/drivers/llvmpipe/lp_context.h
> @@ -91,6 +91,7 @@ struct llvmpipe_context {
>  
> struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS];
> int num_so_targets;
> +   unsigned so_append_bitmask;
> struct pipe_query_data_so_statistics so_stats;
> unsigned num_primitives_generated;
>  
> diff --git a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c 
> 

Re: [Mesa-dev] [PATCH 2/2] r600g/compute: Accept LDS size from the LLVM backend

2013-06-13 Thread Aaron Watry
For both patches in this series, the original files use tabs for
indentation, not the spaces that the patches introduce. Might want to
fix that for consistency.

I'm not familiar enough with the register poking to give a qualified
review, but everything else looks reasonable to me.
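
For anyone else following the register poking, the dispatch-time arithmetic in the quoted patch below can be sketched roughly as follows. This is a standalone illustration and not the driver code: the helper names are made up, is_cayman stands in for the chip_class check, and the limits and the shift by 14 are simply the values that appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wavefronts per thread block: ceiling division, mirroring the patch. */
static unsigned
waves_per_block(const unsigned block_layout[3], unsigned num_pipes)
{
   unsigned wave_divisor = 16 * num_pipes;
   unsigned threads = block_layout[0] * block_layout[1] * block_layout[2];

   assert(num_pipes > 0);
   return (threads + wave_divisor - 1) / wave_divisor;
}

/* Largest LDS allocation, in dwords, that the patch asserts on. */
static unsigned
max_lds_dwords(bool is_cayman)
{
   return is_cayman ? 8160 : 8192;
}

/* Value written to SQ_LDS_ALLOC in the patch: the LDS size in the low
 * bits and the wavefront count starting at bit 14. */
static uint32_t
pack_lds_alloc(unsigned lds_dwords, unsigned num_waves, bool is_cayman)
{
   assert(lds_dwords <= max_lds_dwords(is_cayman));
   return lds_dwords | ((uint32_t)num_waves << 14);
}
```

For example, an 8x8x1 block on a part with 2 pipes gives a divisor of 32 and therefore 2 wavefronts, and 1024 dwords of LDS with 2 wavefronts packs to 1024 | (2 << 14).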

Tested-by: Aaron Watry 

On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard  wrote:
> From: Tom Stellard 
>
> And allocate the correct amount before dispatching the kernel.
> ---
>  src/gallium/drivers/r600/evergreen_compute.c   | 53 +++---
>  .../drivers/r600/evergreen_compute_internal.h  |  1 +
>  src/gallium/drivers/r600/evergreen_state.c |  6 +--
>  src/gallium/drivers/r600/r600_asm.h|  1 +
>  src/gallium/drivers/r600/r600_llvm.c   |  3 ++
>  5 files changed, 44 insertions(+), 20 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
> index b16c9d9..226933b 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -211,7 +211,8 @@ void *evergreen_create_compute_state(
>  #endif
>
> shader->ctx = (struct r600_context*)ctx;
> -   shader->local_size = cso->req_local_mem; ///TODO: assert it
> +   /* XXX: We ignore cso->req_local_mem, because we compute this value
> +* ourselves on a per-kernel basis. */
> shader->private_size = cso->req_private_mem;
> shader->input_size = cso->req_input_mem;
>
> @@ -327,13 +328,13 @@ static void evergreen_emit_direct_dispatch(
>  {
> int i;
> struct radeon_winsys_cs *cs = rctx->rings.gfx.cs;
> +   struct r600_pipe_compute *shader = rctx->cs_shader_state.shader;
> unsigned num_waves;
> unsigned num_pipes = rctx->screen->info.r600_max_pipes;
> unsigned wave_divisor = (16 * num_pipes);
> int group_size = 1;
> int grid_size = 1;
> -   /* XXX: Enable lds and get size from cs_shader_state */
> -   unsigned lds_size = 0;
> +   unsigned lds_size = shader->active_kernel->bc.nlds_dw;
>
> /* Calculate group_size/grid_size */
> for (i = 0; i < 3; i++) {
> @@ -348,16 +349,10 @@ static void evergreen_emit_direct_dispatch(
> num_waves = (block_layout[0] * block_layout[1] * block_layout[2] +
> wave_divisor - 1) / wave_divisor;
>
> -   COMPUTE_DBG(rctx->screen, "Using %u pipes, there are %u wavefronts 
> per thread block\n",
> -   num_pipes, num_waves);
> -
> -   /* XXX: Partition the LDS between PS/CS.  By default half (4096 dwords
> -* on Evergreen) oes to Pixel Shaders and half goes to Compute 
> Shaders.
> -* We may need to allocat the entire LDS space for Compute Shaders.
> -*
> -* EG: R_008E2C_SQ_LDS_RESOURCE_MGMT := 
> S_008E2C_NUM_LS_LDS(lds_dwords)
> -* CM: CM_R_0286FC_SPI_LDS_MGMT :=  S_0286FC_NUM_LS_LDS(lds_dwords)
> -*/
> +   COMPUTE_DBG(rctx->screen, "Using %u pipes, "
> +   "%u wavefronts per thread block, "
> +   "allocating %u dwords lds.\n",
> +   num_pipes, num_waves, lds_size);
>
> r600_write_config_reg(cs, R_008970_VGT_NUM_INDICES, group_size);
>
> @@ -374,6 +369,14 @@ static void evergreen_emit_direct_dispatch(
> r600_write_value(cs, block_layout[1]); /* R_0286F0_SPI_COMPUTE_NUM_THREAD_Y */
> r600_write_value(cs, block_layout[2]); /* R_0286F4_SPI_COMPUTE_NUM_THREAD_Z */
>
> +   if (rctx->chip_class < CAYMAN) {
> +   assert(lds_size <= 8192);
> +   } else {
> +   /* Cayman appears to have a slightly smaller limit, see the
> +* value of CM_R_0286FC_SPI_LDS_MGMT.NUM_LS_LDS */
> +   assert(lds_size <= 8160);
> +   }
> +
> r600_write_compute_context_reg(cs, CM_R_0288E8_SQ_LDS_ALLOC,
> lds_size | (num_waves << 14));
>
> @@ -517,12 +520,14 @@ static void evergreen_launch_grid(
> struct r600_context *ctx = (struct r600_context *)ctx_;
>
>  #ifdef HAVE_OPENCL
> -   COMPUTE_DBG(ctx->screen, "*** evergreen_launch_grid: pc = %u\n", pc);
>
> struct r600_pipe_compute *shader = ctx->cs_shader_state.shader;
> -   if (!shader->kernels[pc].code_bo) {
> +   struct r600_kernel *kernel = &shader->kernels[pc];
> +
> +   COMPUTE_DBG(ctx->screen, "*** evergreen_launch_grid: pc = %u\n", pc);
> +
> +   if (!kernel->code_bo) {
> void *p;
> -   struct r600_kernel *kernel = &shader->kernels[pc];
> struct r600_bytecode *bc = &kernel->bc;
> LLVMModuleRef mod = kernel->llvm_module;
> boolean use_kill = false;
> @@ -551,7 +556,7 @@ static void evergreen_launch_grid(
> ctx->ws->buffer_unmap(kernel->code_bo->cs_buf);
>   

Re: [Mesa-dev] [PATCH 1/2] r600g/compute: Move compute_shader_create() function into evergreen_compute.c

2013-06-13 Thread Tom Stellard
On Thu, Jun 13, 2013 at 05:51:49PM -0500, Aaron Watry wrote:
> On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard  wrote:
> > From: Tom Stellard 
> >
> > ---
> >  src/gallium/drivers/r600/evergreen_compute.c | 23 +++-
> >  src/gallium/drivers/r600/r600_shader.c   | 32 
> > 
> >  2 files changed, 22 insertions(+), 33 deletions(-)
> >
> > diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> > b/src/gallium/drivers/r600/evergreen_compute.c
> > index c993c09..b16c9d9 100644
> > --- a/src/gallium/drivers/r600/evergreen_compute.c
> > +++ b/src/gallium/drivers/r600/evergreen_compute.c
> > @@ -46,6 +46,7 @@
> >  #include "evergreen_compute.h"
> >  #include "evergreen_compute_internal.h"
> >  #include "compute_memory_pool.h"
> > +#include "sb/sb_public.h"
> >  #ifdef HAVE_OPENCL
> >  #include "radeon_llvm_util.h"
> >  #endif
> > @@ -522,7 +523,27 @@ static void evergreen_launch_grid(
> > if (!shader->kernels[pc].code_bo) {
> > void *p;
> > struct r600_kernel *kernel = &shader->kernels[pc];
> > -   r600_compute_shader_create(ctx_, kernel->llvm_module, 
> > &kernel->bc);
> > +   struct r600_bytecode *bc = &kernel->bc;
> > +   LLVMModuleRef mod = kernel->llvm_module;
> > +   boolean use_kill = false;
> > +   bool dump = (ctx->screen->debug_flags & DBG_CS) != 0;
> > +   unsigned use_sb = ctx->screen->debug_flags & DBG_SB_CS;
> > +   unsigned sb_disasm = use_sb ||
> > +   (ctx->screen->debug_flags & DBG_SB_DISASM);
> > +
> > +   r600_bytecode_init(bc, ctx->chip_class, ctx->family,
> > +  ctx->screen->has_compressed_msaa_texturing);
> > +   bc->type = TGSI_PROCESSOR_COMPUTE;
> > +   bc->isa = ctx->isa;
> > +   r600_llvm_compile(mod, ctx->family, bc, &use_kill, dump);
> > +
> > +   if (dump && !sb_disasm) {
> > +   r600_bytecode_disasm(bc);
> > +   } else if ((dump && sb_disasm) || use_sb) {
> > +   if (r600_sb_bytecode_process(ctx, bc, NULL, dump, 
> > use_sb))
> > +   R600_ERR("r600_sb_bytecode_process 
> > failed!\n");
> > +   }
> > +
> > kernel->code_bo = 
> > r600_compute_buffer_alloc_vram(ctx->screen,
> > kernel->bc.ndw * 4);
> > p = r600_buffer_mmap_sync_with_rings(ctx, kernel->code_bo, 
> > PIPE_TRANSFER_WRITE);
> > diff --git a/src/gallium/drivers/r600/r600_shader.c 
> > b/src/gallium/drivers/r600/r600_shader.c
> > index 81ed3ce..97c625c 100644
> > --- a/src/gallium/drivers/r600/r600_shader.c
> > +++ b/src/gallium/drivers/r600/r600_shader.c
> > @@ -291,38 +291,6 @@ static int tgsi_bgnloop(struct r600_shader_ctx *ctx);
> >  static int tgsi_endloop(struct r600_shader_ctx *ctx);
> >  static int tgsi_loop_brk_cont(struct r600_shader_ctx *ctx);
> >
> > -#ifdef HAVE_OPENCL
> > -int r600_compute_shader_create(struct pipe_context * ctx,
> > -   LLVMModuleRef mod,  struct r600_bytecode * bytecode)
> > -{
> 
> There's an associated declaration of this function in r600_pipe.h that
> is now unused... should this be removed? Otherwise, this looks good to
> me.
>

Yes, that should be removed.  I'll take care of that before I push.
 
> FYI: Tested on CEDAR (HD5400).
>

Great, thanks.

-Tom
> 
> 
> > -   struct r600_context *r600_ctx = (struct r600_context *)ctx;
> > -   struct r600_shader_ctx shader_ctx;
> > -   boolean use_kill = false;
> > -   bool dump = (r600_ctx->screen->debug_flags & DBG_CS) != 0;
> > -   unsigned use_sb = r600_ctx->screen->debug_flags & DBG_SB_CS;
> > -   unsigned sb_disasm = use_sb ||
> > -   (r600_ctx->screen->debug_flags & DBG_SB_DISASM);
> > -
> > -   shader_ctx.bc = bytecode;
> > -   r600_bytecode_init(shader_ctx.bc, r600_ctx->chip_class, 
> > r600_ctx->family,
> > -  r600_ctx->screen->has_compressed_msaa_texturing);
> > -   shader_ctx.bc->type = TGSI_PROCESSOR_COMPUTE;
> > -   shader_ctx.bc->isa = r600_ctx->isa;
> > -   r600_llvm_compile(mod, r600_ctx->family,
> > -   shader_ctx.bc, &use_kill, dump);
> > -
> > -   if (dump && !sb_disasm) {
> > -   r600_bytecode_disasm(shader_ctx.bc);
> > -   } else if ((dump && sb_disasm) || use_sb) {
> > -   if (r600_sb_bytecode_process(r600_ctx, shader_ctx.bc, NULL, 
> > dump, use_sb))
> > -   R600_ERR("r600_sb_bytecode_process failed!\n");
> > -   }
> > -
> > -   return 1;
> > -}
> > -
> > -#endif /* HAVE_OPENCL */
> > -
> >  static int tgsi_is_supported(struct r600_shader_ctx *ctx)
> >  {
> > struct tgsi_full_instruction *i = 
> > &ctx->parse.FullToken.FullInstruction;
> > --
> > 1.7.11.4
> >

Re: [Mesa-dev] [PATCH 1/2] r600g/compute: Move compute_shader_create() function into evergreen_compute.c

2013-06-13 Thread Aaron Watry
On Wed, Jun 12, 2013 at 7:34 PM, Tom Stellard  wrote:
> From: Tom Stellard 
>
> ---
>  src/gallium/drivers/r600/evergreen_compute.c | 23 +++-
>  src/gallium/drivers/r600/r600_shader.c   | 32 
> 
>  2 files changed, 22 insertions(+), 33 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index c993c09..b16c9d9 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -46,6 +46,7 @@
>  #include "evergreen_compute.h"
>  #include "evergreen_compute_internal.h"
>  #include "compute_memory_pool.h"
> +#include "sb/sb_public.h"
>  #ifdef HAVE_OPENCL
>  #include "radeon_llvm_util.h"
>  #endif
> @@ -522,7 +523,27 @@ static void evergreen_launch_grid(
> if (!shader->kernels[pc].code_bo) {
> void *p;
> struct r600_kernel *kernel = &shader->kernels[pc];
> -   r600_compute_shader_create(ctx_, kernel->llvm_module, 
> &kernel->bc);
> +   struct r600_bytecode *bc = &kernel->bc;
> +   LLVMModuleRef mod = kernel->llvm_module;
> +   boolean use_kill = false;
> +   bool dump = (ctx->screen->debug_flags & DBG_CS) != 0;
> +   unsigned use_sb = ctx->screen->debug_flags & DBG_SB_CS;
> +   unsigned sb_disasm = use_sb ||
> +   (ctx->screen->debug_flags & DBG_SB_DISASM);
> +
> +   r600_bytecode_init(bc, ctx->chip_class, ctx->family,
> +  ctx->screen->has_compressed_msaa_texturing);
> +   bc->type = TGSI_PROCESSOR_COMPUTE;
> +   bc->isa = ctx->isa;
> +   r600_llvm_compile(mod, ctx->family, bc, &use_kill, dump);
> +
> +   if (dump && !sb_disasm) {
> +   r600_bytecode_disasm(bc);
> +   } else if ((dump && sb_disasm) || use_sb) {
> +   if (r600_sb_bytecode_process(ctx, bc, NULL, dump, 
> use_sb))
> +   R600_ERR("r600_sb_bytecode_process 
> failed!\n");
> +   }
> +
> kernel->code_bo = r600_compute_buffer_alloc_vram(ctx->screen,
> kernel->bc.ndw * 4);
> p = r600_buffer_mmap_sync_with_rings(ctx, kernel->code_bo, 
> PIPE_TRANSFER_WRITE);
> diff --git a/src/gallium/drivers/r600/r600_shader.c 
> b/src/gallium/drivers/r600/r600_shader.c
> index 81ed3ce..97c625c 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -291,38 +291,6 @@ static int tgsi_bgnloop(struct r600_shader_ctx *ctx);
>  static int tgsi_endloop(struct r600_shader_ctx *ctx);
>  static int tgsi_loop_brk_cont(struct r600_shader_ctx *ctx);
>
> -#ifdef HAVE_OPENCL
> -int r600_compute_shader_create(struct pipe_context * ctx,
> -   LLVMModuleRef mod,  struct r600_bytecode * bytecode)
> -{

There's an associated declaration of this function in r600_pipe.h that
is now unused... should this be removed? Otherwise, this looks good to
me.

FYI: Tested on CEDAR (HD5400).

--Aaron


> -   struct r600_context *r600_ctx = (struct r600_context *)ctx;
> -   struct r600_shader_ctx shader_ctx;
> -   boolean use_kill = false;
> -   bool dump = (r600_ctx->screen->debug_flags & DBG_CS) != 0;
> -   unsigned use_sb = r600_ctx->screen->debug_flags & DBG_SB_CS;
> -   unsigned sb_disasm = use_sb ||
> -   (r600_ctx->screen->debug_flags & DBG_SB_DISASM);
> -
> -   shader_ctx.bc = bytecode;
> -   r600_bytecode_init(shader_ctx.bc, r600_ctx->chip_class, 
> r600_ctx->family,
> -  r600_ctx->screen->has_compressed_msaa_texturing);
> -   shader_ctx.bc->type = TGSI_PROCESSOR_COMPUTE;
> -   shader_ctx.bc->isa = r600_ctx->isa;
> -   r600_llvm_compile(mod, r600_ctx->family,
> -   shader_ctx.bc, &use_kill, dump);
> -
> -   if (dump && !sb_disasm) {
> -   r600_bytecode_disasm(shader_ctx.bc);
> -   } else if ((dump && sb_disasm) || use_sb) {
> -   if (r600_sb_bytecode_process(r600_ctx, shader_ctx.bc, NULL, 
> dump, use_sb))
> -   R600_ERR("r600_sb_bytecode_process failed!\n");
> -   }
> -
> -   return 1;
> -}
> -
> -#endif /* HAVE_OPENCL */
> -
>  static int tgsi_is_supported(struct r600_shader_ctx *ctx)
>  {
> struct tgsi_full_instruction *i = 
> &ctx->parse.FullToken.FullInstruction;
> --
> 1.7.11.4
>


Re: [Mesa-dev] [PATCH libclc] Implement barrier() builtin

2013-06-13 Thread Aaron Watry
FYI: I've applied your related piglit test and R600 back-end patches
and tested this on a CEDAR (HD5400).

Note: I had some trouble applying patches 4 and 5 of the R600 patches
but after chopping out the unit tests and creating those files by hand
(and using --ignore-whitespace), everything is there and functioning.

For the libclc change:
Reviewed-by: Aaron Watry 

On Wed, Jun 12, 2013 at 7:31 PM, Tom Stellard  wrote:
> From: Tom Stellard 
>
> ---
>  r600/lib/SOURCES |  2 ++
>  r600/lib/synchronization/barrier.cl  | 15 +++
>  r600/lib/synchronization/barrier_impl.ll | 12 
>  3 files changed, 29 insertions(+)
>  create mode 100644 r600/lib/synchronization/barrier.cl
>  create mode 100644 r600/lib/synchronization/barrier_impl.ll
>
> diff --git a/r600/lib/SOURCES b/r600/lib/SOURCES
> index af8c8c8..16ef3ac 100644
> --- a/r600/lib/SOURCES
> +++ b/r600/lib/SOURCES
> @@ -2,3 +2,5 @@ workitem/get_group_id.ll
>  workitem/get_local_size.ll
>  workitem/get_local_id.ll
>  workitem/get_global_size.ll
> +synchronization/barrier.cl
> +synchronization/barrier_impl.ll
> diff --git a/r600/lib/synchronization/barrier.cl 
> b/r600/lib/synchronization/barrier.cl
> new file mode 100644
> index 000..ac0b4b3
> --- /dev/null
> +++ b/r600/lib/synchronization/barrier.cl
> @@ -0,0 +1,15 @@
> +
> +#include 
> +
> +void barrier_local(void);
> +void barrier_global(void);
> +
> +void barrier(cl_mem_fence_flags flags) {
> +  if (flags & CLK_LOCAL_MEM_FENCE) {
> +barrier_local();
> +  }
> +
> +  if (flags & CLK_GLOBAL_MEM_FENCE) {
> +barrier_global();
> +  }
> +}
> diff --git a/r600/lib/synchronization/barrier_impl.ll 
> b/r600/lib/synchronization/barrier_impl.ll
> new file mode 100644
> index 000..99ac018
> --- /dev/null
> +++ b/r600/lib/synchronization/barrier_impl.ll
> @@ -0,0 +1,12 @@
> +declare void @llvm.AMDGPU.barrier.local() nounwind
> +declare void @llvm.AMDGPU.barrier.global() nounwind
> +
> +define void @barrier_local() nounwind alwaysinline {
> +  call void @llvm.AMDGPU.barrier.local()
> +  ret void
> +}
> +
> +define void @barrier_global() nounwind alwaysinline {
> +  call void @llvm.AMDGPU.barrier.global()
> +  ret void
> +}
> --
> 1.7.11.4
>


[Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-13 Thread Zack Rusin
Since draw auto fetches the count from the buffers, we can't
just clear them on bind; we need to wait until the actual
stream out is performed. Otherwise the count for draw auto
will be zero. Plus it is cleaner to have draw do it rather
than drivers having to mess with draw's internals.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_context.c |4 +++-
 src/gallium/auxiliary/draw/draw_context.h |3 ++-
 src/gallium/auxiliary/draw/draw_private.h |1 +
 src/gallium/auxiliary/draw/draw_pt_so_emit.c  |   20 
 src/gallium/drivers/llvmpipe/lp_context.h |1 +
 src/gallium/drivers/llvmpipe/lp_draw_arrays.c |4 ++--
 src/gallium/drivers/llvmpipe/lp_state_so.c|8 ++--
 src/gallium/drivers/softpipe/sp_context.h |1 +
 src/gallium/drivers/softpipe/sp_draw_arrays.c |4 ++--
 src/gallium/drivers/softpipe/sp_state_so.c|1 +
 10 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index 4a08765..f463739 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -810,7 +810,8 @@ draw_get_rasterizer_no_cull( struct draw_context *draw,
 void
 draw_set_mapped_so_targets(struct draw_context *draw,
int num_targets,
-   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS])
+   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
+   unsigned append_bitmask)
 {
int i;
 
@@ -820,6 +821,7 @@ draw_set_mapped_so_targets(struct draw_context *draw,
   draw->so.targets[i] = NULL;
 
draw->so.num_targets = num_targets;
+   draw->so.append_bitmask = append_bitmask;
 }
 
 void
diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
index 4a1b27e..ae63068 100644
--- a/src/gallium/auxiliary/draw/draw_context.h
+++ b/src/gallium/auxiliary/draw/draw_context.h
@@ -231,7 +231,8 @@ draw_set_mapped_constant_buffer(struct draw_context *draw,
 void
 draw_set_mapped_so_targets(struct draw_context *draw,
int num_targets,
-   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]);
+   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
+   unsigned append_bitmask);
 
 
 /***
diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h
index fd52c2d..4dda90e 100644
--- a/src/gallium/auxiliary/draw/draw_private.h
+++ b/src/gallium/auxiliary/draw/draw_private.h
@@ -290,6 +290,7 @@ struct draw_context
struct {
   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS];
   uint num_targets;
+  uint append_bitmask;
} so;
 
/* Clip derived state:
diff --git a/src/gallium/auxiliary/draw/draw_pt_so_emit.c 
b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
index d624a99..785aa34 100644
--- a/src/gallium/auxiliary/draw/draw_pt_so_emit.c
+++ b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
@@ -77,6 +77,24 @@ draw_has_so(const struct draw_context *draw)
return FALSE;
 }
 
+static void
+clean_so_buffers(struct pt_so_emit *emit)
+{
+   struct draw_context *draw = emit->draw;
+   unsigned i;
+
+   debug_assert(emit->has_so);
+
+   for (i = 0; i < draw->so.num_targets; i++) {
+  /* if we're not appending then lets reset the internal
+ data of our so target */
+  if (!(draw->so.append_bitmask & (1 << i)) && draw->so.targets[i]) {
+ draw->so.targets[i]->internal_offset = 0;
+ draw->so.targets[i]->emitted_vertices = 0;
+  }
+   }
+}
+
 void draw_pt_so_emit_prepare(struct pt_so_emit *emit, boolean use_pre_clip_pos)
 {
struct draw_context *draw = emit->draw;
@@ -257,6 +275,8 @@ void draw_pt_so_emit( struct pt_so_emit *emit,
if (!draw->so.num_targets)
   return;
 
+   clean_so_buffers(emit);
+
emit->emitted_vertices = 0;
emit->emitted_primitives = 0;
emit->generated_primitives = 0;
diff --git a/src/gallium/drivers/llvmpipe/lp_context.h 
b/src/gallium/drivers/llvmpipe/lp_context.h
index abfe852..0515968 100644
--- a/src/gallium/drivers/llvmpipe/lp_context.h
+++ b/src/gallium/drivers/llvmpipe/lp_context.h
@@ -91,6 +91,7 @@ struct llvmpipe_context {
 
struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS];
int num_so_targets;
+   unsigned so_append_bitmask;
struct pipe_query_data_so_statistics so_stats;
unsigned num_primitives_generated;
 
diff --git a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c 
b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
index 4e23904..11b665a 100644
--- a/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
+++ b/src/gallium/drivers/llvmpipe/lp_draw_arrays.c
@@ -104,7 +104,7 @@ llvmpipe_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
   }
}
dra

[Mesa-dev] [Bug 47824] osmesa using --enable-shared-glapi depends on libgl

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=47824

Anssi Hannula  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |an...@mageia.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: Remove broken source type assertions from brw_alu3().

2013-06-13 Thread Kenneth Graunke
Commit 526ffdfc033ab01cf133cb7e8290c65d12ccc9be attempted to generalize
the source register type assertions to allow D and UD.  However, the
src1 and src2 assertions actually checked src0.type against D and UD due
to a copy and paste bug.

It also began setting the source and destination register types based on
dest.type, ignoring src0/src1/src2.type completely.  BFE and BFI2 may
actually pass mixed D/UD types and expect them to be ignored, which is
arguably a bit sloppy, but not too crazy either.

This patch simply removes the source register assertions as those values
aren't used anyway.  It also clarifies the comment above the block that
sets the register types.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 3d0db1b..f2cacd1 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -811,9 +811,6 @@ static struct brw_instruction *brw_alu3(struct brw_compile 
*p,
assert(src0.file == BRW_GENERAL_REGISTER_FILE);
assert(src0.address_mode == BRW_ADDRESS_DIRECT);
assert(src0.nr < 128);
-   assert(src0.type == BRW_REGISTER_TYPE_F ||
-  src0.type == BRW_REGISTER_TYPE_D ||
-  src0.type == BRW_REGISTER_TYPE_UD);
insn->bits2.da3src.src0_swizzle = src0.dw1.bits.swizzle;
insn->bits2.da3src.src0_subreg_nr = get_3src_subreg_nr(src0);
insn->bits2.da3src.src0_reg_nr = src0.nr;
@@ -824,9 +821,6 @@ static struct brw_instruction *brw_alu3(struct brw_compile 
*p,
assert(src1.file == BRW_GENERAL_REGISTER_FILE);
assert(src1.address_mode == BRW_ADDRESS_DIRECT);
assert(src1.nr < 128);
-   assert(src1.type == BRW_REGISTER_TYPE_F ||
-  src0.type == BRW_REGISTER_TYPE_D ||
-  src0.type == BRW_REGISTER_TYPE_UD);
insn->bits2.da3src.src1_swizzle = src1.dw1.bits.swizzle;
insn->bits2.da3src.src1_subreg_nr_low = get_3src_subreg_nr(src1) & 0x3;
insn->bits3.da3src.src1_subreg_nr_high = get_3src_subreg_nr(src1) >> 2;
@@ -838,9 +832,6 @@ static struct brw_instruction *brw_alu3(struct brw_compile 
*p,
assert(src2.file == BRW_GENERAL_REGISTER_FILE);
assert(src2.address_mode == BRW_ADDRESS_DIRECT);
assert(src2.nr < 128);
-   assert(src2.type == BRW_REGISTER_TYPE_F ||
-  src0.type == BRW_REGISTER_TYPE_D ||
-  src0.type == BRW_REGISTER_TYPE_UD);
insn->bits3.da3src.src2_swizzle = src2.dw1.bits.swizzle;
insn->bits3.da3src.src2_subreg_nr = get_3src_subreg_nr(src2);
insn->bits3.da3src.src2_rep_ctrl = src2.vstride == BRW_VERTICAL_STRIDE_0;
@@ -849,12 +840,12 @@ static struct brw_instruction *brw_alu3(struct 
brw_compile *p,
insn->bits1.da3src.src2_negate = src2.negate;
 
if (intel->gen >= 7) {
-  /* For MAD and LRP, all incoming src types are float, but for BFE and
-   * BFI2, the three source types might not all be the same. src2, the
-   * primary argument, should match the type of the destination.
+  /* Set both the source and destination types based on dest.type,
+   * ignoring the source register types.  The MAD and LRP emitters ensure
+   * that all four types are float.  The BFE and BFI2 emitters, however,
+   * may send us mixed D and UD types and want us to ignore that and use
+   * the destination type.
*/
-  assert(dest.type == src2.type);
-
   switch (dest.type) {
   case BRW_REGISTER_TYPE_F:
  insn->bits1.da3src.src_type = BRW_3SRC_TYPE_F;
-- 
1.8.3



[Mesa-dev] [PATCH 1/2] i965: Add back strict type assertions for MAD and LRP.

2013-06-13 Thread Kenneth Graunke
Commit 526ffdfc033ab01cf133cb7e8290c65d12ccc9be relaxed the type
assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2).
This lost us the strict type checking for MAD and LRP, which require
all four types to be float.

This patch adds a new ALU3F wrapper which checks these once again.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 31d97ca..3d0db1b 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -905,6 +905,20 @@ struct brw_instruction *brw_##OP(struct brw_compile *p,
\
return brw_alu3(p, BRW_OPCODE_##OP, dest, src0, src1, src2);\
 }
 
+#define ALU3F(OP)   \
+struct brw_instruction *brw_##OP(struct brw_compile *p, \
+ struct brw_reg dest,   \
+ struct brw_reg src0,   \
+ struct brw_reg src1,   \
+ struct brw_reg src2)   \
+{   \
+   assert(dest.type == BRW_REGISTER_TYPE_F);\
+   assert(src0.type == BRW_REGISTER_TYPE_F);\
+   assert(src1.type == BRW_REGISTER_TYPE_F);\
+   assert(src2.type == BRW_REGISTER_TYPE_F);\
+   return brw_alu3(p, BRW_OPCODE_##OP, dest, src0, src1, src2); \
+}
+
 /* Rounding operations (other than RNDD) require two instructions - the first
  * stores a rounded value (possibly the wrong way) in the dest register, but
  * also sets a per-channel "increment bit" in the flag register.  A predicated
@@ -955,8 +969,8 @@ ALU2(DP3)
 ALU2(DP2)
 ALU2(LINE)
 ALU2(PLN)
-ALU3(MAD)
-ALU3(LRP)
+ALU3F(MAD)
+ALU3F(LRP)
 ALU1(BFREV)
 ALU3(BFE)
 ALU2(BFI1)
-- 
1.8.3



[Mesa-dev] [Bug 52167] llvmpipe test programs link fails when ld --as-needed option is used

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=52167

Olivier Blin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Olivier Blin  ---
I cannot reproduce the build issue anymore, so this patch is not needed.
This has likely been fixed by the automake conversion.



[Mesa-dev] XDC2013 - Call for Proposals

2013-06-13 Thread Keith Packard

# Call For Proposals
**2013 X.Org Developers Conference (XDC 2013)**
**23-25 September 2013**
**Portland, Oregon USA**

The [2013 X.Org Developers Conference]
(http://www.x.org/wiki/Events/XDC2013) is the annual
technical meeting for [X Window System](http://x.org) and
[Free Desktop](http://freedesktop.org) developers. The
attendees will gather to discuss outstanding technical
issues related to X and to plan the direction of the X
Window System and its software ecosystem. The event is free
of charge and open to the general public.

The XDC 2013 Technical Program Committee (TPC) is requesting
proposals for papers and presentations at XDC 2013. While
any serious proposal will be gratefully considered, topics of
interest to X.org and FreeDesktop.org developers are encouraged.
There are three particular types of proposal the TPC is seeking:

 1. Technical talk abstracts: 250-1000 words describing a
 presentation to be made at XDC 2013. This can be
 anything: a work-in-progress talk, a proposal for
 change, analysis of trends, etc.

 2. Informal white papers: 1000+ words on something of
 interest or relevance to X.org developers, FreeDesktop.org
 developers or the X community at large. These papers will
 appear in the online conference proceedings of XDC 2013,
 and are unrefereed (beyond basic checks for legitimacy and
 relevance). Papers can be refereed if requested in advance.

 3. Technical research papers: 2500+ words in a format
 and style suitable for refereed technical publication.
 Papers that are judged acceptable by the TPC and its
 referees will be published in the printed conference
 proceedings of XDC 2013, available on a print-on-demand
 basis online.

XDC 2013 technical presenters will be chosen from the
authors of any of these submissions (as well as other
presenters invited by the TPC).

Normally, there is time for everyone who wants to present to
do so, but one can never tell. As much as possible,
presenters will be selected from those who submit before the
deadline. We also may be able to offer financial assistance
for travel for presenters who could not otherwise afford to
attend and who submit before the deadline.  Please do submit
your proposal in a timely fashion.

**Proposals due:** Thursday 1 August 2013 17:00 UTC
**Accepted formats:** PDF and ASCII text.
**Notification of acceptance:** Thursday 8 August 2013
**E-mail:** bo...@foundation.x.org

-- 
keith.pack...@intel.com




[Mesa-dev] XDC2013 - Announcement

2013-06-13 Thread Keith Packard

Now that we have everything in place we can finally make it official
and announce it:

XDC2013 will take place from September 23rd to September 25th in
Portland, Oregon at the University Place Hotel and Conference Center.
Ian Romanick, Bart Massey, and I will be organizing this event.

The initial wiki page for this event has been put in place at:

http://wiki.x.org/wiki/Events/XDC2013

This page will get updated regularly. Also we will keep you up-to-date
on the X.Org events mailing list http://lists.x.org/mailman/listinfo/events
so if you plan to come and are not subscribed there already, please consider
doing so!

For registration please add yourself to the attendees page
http://wiki.x.org/wiki/Events/XDC2013/Attendees.

If you would like to give a talk during the event, please add it to the
program page http://wiki.x.org/wiki/Events/XDC2013/Program.

We are looking forward to seeing you in Portland. So if you are corporate,
please talk to your managers about funding your trip. If you aren't but
you have something to present, please contact the X.Org Foundation Board
of Directors at bo...@foundation.x.org for travel funding.

We have negotiated a special conference rate at the conference hotel.
Please check the Wiki page for more information.

-- 
keith.pack...@intel.com




Re: [Mesa-dev] [PATCH] mesa: per-texture locking

2013-06-13 Thread Ian Romanick

On 06/12/2013 04:08 PM, Dave Airlie wrote:

On Thu, Jun 13, 2013 at 3:33 AM, Eric Anholt  wrote:

Frank Henigman  writes:


On Tue, Jun 11, 2013 at 1:10 PM, Eric Anholt  wrote:


Frank Henigman  writes:


Replace the one texture lock with a lock per texture.  This allows
uploading textures from one thread concurrently with drawing in another
thread.  _mesa_lock_context_textures() was used to check for texture
updates from other contexts and also to acquire the texture lock.
It's been replaced with _mesa_check_context_textures() which only does
the checking.  Code sections that were between
_mesa_lock_context_textures() and _mesa_unlock_context_textures()
have been updated to lock individual textures as needed.


When someone's doing something like glCopyTexSubImage() from an FBO
backed by a texture to another texture, how is the locking supposed to
work?  How about copies from one texture to the same texture?


Right now glCopyTexSubImage locks the destination texture before copying
to it, but doesn't lock the source texture.  This was safe because locking
any texture effectively locked them all.  With my change that's no longer
true so now we're copying from an unlocked texture.  Is that your concern?
So we just need to have it lock the source texture too?
We'll have to check for source == destination so we don't try to lock twice.


That's an example of my concern.


I'm pretty sure that our current locking doesn't cover nearly as much as
it needs to if one wants to make thread-per-context shareCtx support
actually work, so I'm really concerned that this change may make the
locking unfixable.


I agree there probably are problems with locking currently, and there seems
to be zero coverage for context sharing in piglit.  But I don't understand
how my change makes anything unfixable.  Can you elaborate?


Basic ABBA locking problems.  Someone does a copyteximage from texture A
to fbo-wrapped texture B, at the same time someone does copytexsubimage
from texture B to fbo-wrapped texture A.

The timeline for ABBA failure is:

thread 1:                  thread 2:
lock texture A
                           lock texture B
block locking texture B
                           block locking texture A


I suspect you'd need a reservation type scheme like the one Maarten is
writing for the kernel, and based on the one TTM uses.

Where you get a list of objects you want to lock and back off all locks when
you hit a contended point.

Dave.
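
As an illustration of the back-off idea described above (take a list of
objects, and release every lock already held as soon as one of them is
contended), here is a minimal sketch in C using POSIX threads.  It is not
Mesa code: struct tex_obj, lock_all_with_backoff() and unlock_all() are
hypothetical names, and it assumes the caller never passes the same object
twice.  Unlike a real reservation scheme it has no ticket ordering, so it
only avoids the ABBA deadlock and makes no fairness guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a texture object; not Mesa's real struct. */
struct tex_obj {
   pthread_mutex_t mutex;
   int data;
};

/*
 * Lock every object in objs[], or none of them.
 *
 * On the first contended mutex, drop all locks taken so far, yield and
 * retry.  A thread therefore never sleeps while holding a lock, which is
 * what rules out the "A then B" versus "B then A" deadlock.
 *
 * Assumption: objs[] contains no duplicates (source == destination must
 * be filtered out by the caller).
 */
void
lock_all_with_backoff(struct tex_obj **objs, size_t count)
{
   assert(objs != NULL || count == 0);

   for (;;) {
      size_t i;

      for (i = 0; i < count; i++) {
         if (pthread_mutex_trylock(&objs[i]->mutex) != 0)
            break;               /* contended: stop acquiring */
      }

      if (i == count)
         return;                 /* got every lock */

      /* Back off: release what we hold, in reverse order. */
      while (i > 0)
         pthread_mutex_unlock(&objs[--i]->mutex);

      sched_yield();
   }
}

/* Release all locks taken by lock_all_with_backoff(). */
void
unlock_all(struct tex_obj **objs, size_t count)
{
   while (count > 0)
      pthread_mutex_unlock(&objs[--count]->mutex);
}
```

A copy between two textures would build a two-entry array (or a one-entry
array when source and destination are the same object), call
lock_all_with_backoff(), do the copy, then call unlock_all().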


Other OpenGL drivers solve this a different way, and it has been 
something on my todo list since forever.  Basically, you have N+1 sets 
of state per object: the global state and a mirror per-context.


Once a texture is bound, all reads access the local mirror.  Writes hit
the local mirror and the global state.  When glBindTexture is called,
the mirror synchronizes from the global state.  Accesses to the
per-context state never need a lock (since they're implicitly locked
when the thread calls MakeCurrent).


This eliminates all of the ABBA problems I'm aware of.

 - From Eric's example, thread 1 never needs to lock texture A, and 
thread 2 never needs to lock texture B.


 - Other cases where multiple textures are involved together (e.g., MRT 
rendering) are only modifying texture contents, not texture state.  No 
lock is needed in these cases.


The other follow-on is that we need to separately reference count the 
storage for the image data.  There are probably other issues.  Clients 
that have the texture bound will continue to access the same images even 
if another client has called glTexImage or glCopyTexImage.  Once the 
last client unbinds (or rebinds) the texture the old images are freed.


It's a big pile of work that will touch things all over the place in 
Mesa.  That, alas, is why I've never gotten around to doing it.


I think adding the ref counting to the images and adding the per-context 
mirror state (with the single big lock) would be good first steps.
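
The per-context mirror scheme described above could look roughly like the
following C sketch.  All names here (struct tex_state, struct
shared_texture, struct context_binding and the three functions) are
hypothetical and greatly simplified; image storage and its reference
counting are left out entirely.  The point is only that a thread holds at
most one lock at a time, and reads take none.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, heavily simplified texture state. */
struct tex_state {
   int min_filter;
   int mag_filter;
};

/* Global (shared) copy of the state, visible to all contexts. */
struct shared_texture {
   pthread_mutex_t mutex;        /* protects 'global' only */
   struct tex_state global;
};

/* Per-context binding; only the thread owning the context touches it. */
struct context_binding {
   struct shared_texture *tex;
   struct tex_state mirror;
};

/* Bind: synchronize the local mirror from the global state. */
void
bind_texture(struct context_binding *b, struct shared_texture *tex)
{
   assert(b != NULL && tex != NULL);

   pthread_mutex_lock(&tex->mutex);
   b->mirror = tex->global;
   pthread_mutex_unlock(&tex->mutex);
   b->tex = tex;
}

/* Read: mirror only, so no lock is needed. */
int
read_min_filter(const struct context_binding *b)
{
   return b->mirror.min_filter;
}

/* Write: update the mirror, then the global state under its lock. */
void
write_min_filter(struct context_binding *b, int value)
{
   b->mirror.min_filter = value;

   pthread_mutex_lock(&b->tex->mutex);
   b->tex->global.min_filter = value;
   pthread_mutex_unlock(&b->tex->mutex);
}
```

In this sketch a second context keeps seeing its old mirror until it
rebinds the texture, which matches the "synchronize on glBindTexture"
behaviour described in the message.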




Re: [Mesa-dev] [PATCH 7/8] util: Expand the comment above the channel[] array

2013-06-13 Thread Will Schmidt
On Thu, 2013-06-13 at 14:50 +0100, Richard Sandiford wrote:
>

The entirety of the comment looks pretty good to me.  :-) One
question, and this is mostly curiosity on my part, I'm not specifically
asking for another revision. 

> * (This is the same as C bitfield layout on most ABIs.)

Do we have a handle on what 'most ABIs' are?   I.e. would this include
X86* and PPC* ABIs as we know them today, or do we already clearly
understand which ones would not match?

Thanks, 
-Will




[Mesa-dev] [PATCH] i965/vs: Combine code generation's inst->opcode switch statements.

2013-06-13 Thread Kenneth Graunke
vec4_visitor::generate_code() switches on vec4_instruction::opcode and
calls into the brw_eu_emit.c layer to generate code for some of them.
It then has a default case which calls generate_vec4_instruction() to
handle the rest...which switches on opcode and handles the rest of the
cases.

The split apparently is that generate_code() handles the actual hardware
opcodes (BRW_OPCODE_*) while generate_vec4_instruction() handles the
virtual opcodes (SHADER_OPCODE_* and VS_OPCODE_*).  But this looks
fairly arbitrary, and it makes more sense to combine the two switches.

This patch moves the cases from generate_code() into the helper function
so that generate_code() isn't as large.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp | 329 ++--
 1 file changed, 166 insertions(+), 163 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index fbb93db..f15759f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -621,14 +621,178 @@ 
vec4_generator::generate_pull_constant_load_gen7(vec4_instruction *inst,
0);
 }
 
+/**
+ * Generate assembly for a Vec4 IR instruction.
+ *
+ * \param instruction The Vec4 IR instruction to generate code for.
+ * \param dst The destination register.
+ * \param src An array of up to three source registers.
+ */
 void
 vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
   struct brw_reg dst,
   struct brw_reg *src)
 {
-   vec4_instruction *inst = (vec4_instruction *)instruction;
+   vec4_instruction *inst = (vec4_instruction *) instruction;
 
switch (inst->opcode) {
+   case BRW_OPCODE_MOV:
+  brw_MOV(p, dst, src[0]);
+  break;
+   case BRW_OPCODE_ADD:
+  brw_ADD(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_MUL:
+  brw_MUL(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_MACH:
+  brw_set_acc_write_control(p, 1);
+  brw_MACH(p, dst, src[0], src[1]);
+  brw_set_acc_write_control(p, 0);
+  break;
+
+   case BRW_OPCODE_MAD:
+  brw_MAD(p, dst, src[0], src[1], src[2]);
+  break;
+
+   case BRW_OPCODE_FRC:
+  brw_FRC(p, dst, src[0]);
+  break;
+   case BRW_OPCODE_RNDD:
+  brw_RNDD(p, dst, src[0]);
+  break;
+   case BRW_OPCODE_RNDE:
+  brw_RNDE(p, dst, src[0]);
+  break;
+   case BRW_OPCODE_RNDZ:
+  brw_RNDZ(p, dst, src[0]);
+  break;
+
+   case BRW_OPCODE_AND:
+  brw_AND(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_OR:
+  brw_OR(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_XOR:
+  brw_XOR(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_NOT:
+  brw_NOT(p, dst, src[0]);
+  break;
+   case BRW_OPCODE_ASR:
+  brw_ASR(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_SHR:
+  brw_SHR(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_SHL:
+  brw_SHL(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_CMP:
+  brw_CMP(p, dst, inst->conditional_mod, src[0], src[1]);
+  break;
+   case BRW_OPCODE_SEL:
+  brw_SEL(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_DPH:
+  brw_DPH(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_DP4:
+  brw_DP4(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_DP3:
+  brw_DP3(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_DP2:
+  brw_DP2(p, dst, src[0], src[1]);
+  break;
+
+   case BRW_OPCODE_F32TO16:
+  brw_F32TO16(p, dst, src[0]);
+  break;
+
+   case BRW_OPCODE_F16TO32:
+  brw_F16TO32(p, dst, src[0]);
+  break;
+
+   case BRW_OPCODE_LRP:
+  brw_LRP(p, dst, src[0], src[1], src[2]);
+  break;
+
+   case BRW_OPCODE_BFREV:
+  /* BFREV only supports UD type for src and dst. */
+  brw_BFREV(p, retype(dst, BRW_REGISTER_TYPE_UD),
+   retype(src[0], BRW_REGISTER_TYPE_UD));
+  break;
+   case BRW_OPCODE_FBH:
+  /* FBH only supports UD type for dst. */
+  brw_FBH(p, retype(dst, BRW_REGISTER_TYPE_UD), src[0]);
+  break;
+   case BRW_OPCODE_FBL:
+  /* FBL only supports UD type for dst. */
+  brw_FBL(p, retype(dst, BRW_REGISTER_TYPE_UD), src[0]);
+  break;
+   case BRW_OPCODE_CBIT:
+  /* CBIT only supports UD type for dst. */
+  brw_CBIT(p, retype(dst, BRW_REGISTER_TYPE_UD), src[0]);
+  break;
+
+   case BRW_OPCODE_BFE:
+  brw_BFE(p, dst, src[0], src[1], src[2]);
+  break;
+
+   case BRW_OPCODE_BFI1:
+  brw_BFI1(p, dst, src[0], src[1]);
+  break;
+   case BRW_OPCODE_BFI2:
+  brw_BFI2(p, dst, src[0], src[1], src[2]);
+  break;
+
+   case BRW_OPCODE_IF:
+  if (inst->src[0].file != BAD_FILE) {
+ /* The instruction has an embedded compare (only allowed on gen6) */
+ assert

Re: [Mesa-dev] [PATCH] draw: fix a regression in computing max elt

2013-06-13 Thread Jose Fonseca
Sounds good. Thanks for tracking this down!

Jose

- Original Message -
> GL can use elts without setting indices, in which case
> our eltMax was set to 0 and always invoked the overflow
> condition. So by default set eltMax to the maximum; it will
> be curbed by draw_set_indexes (if it ever comes) and if
> not then it will let GL's glVertexPointer/glDrawArrays
> work correctly. Fixes piglit's
> triangle-rasterization-overdraw test.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/draw/draw_context.c |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_context.c
> b/src/gallium/auxiliary/draw/draw_context.c
> index 22c0e9b..4a08765 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -138,6 +138,7 @@ boolean draw_init(struct draw_context *draw)
> draw->clip_z = TRUE;
>  
> draw->pt.user.planes = (float (*) [DRAW_TOTAL_CLIP_PLANES][4])
> &(draw->plane[0]);
> +   draw->pt.user.eltMax = ~0;
>  
> if (!draw_pipeline_init( draw ))
>return FALSE;
> --
> 1.7.10.4
> 


[Mesa-dev] [PATCH] draw: fix a regression in computing max elt

2013-06-13 Thread Zack Rusin
GL can use elts without setting indices, in which case
our eltMax was set to 0 and always invoked the overflow
condition. So by default set eltMax to the maximum; it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let GL's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_context.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 22c0e9b..4a08765 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -138,6 +138,7 @@ boolean draw_init(struct draw_context *draw)
draw->clip_z = TRUE;
 
draw->pt.user.planes = (float (*) [DRAW_TOTAL_CLIP_PLANES][4]) 
&(draw->plane[0]);
+   draw->pt.user.eltMax = ~0;
 
if (!draw_pipeline_init( draw ))
   return FALSE;
-- 
1.7.10.4


Re: [Mesa-dev] [PATCH 1/6] mesa, gallium: renumber shader indices according to their placement in pipeline

2013-06-13 Thread Brian Paul

On 06/13/2013 06:25 AM, Marek Olšák wrote:

See my explanation in mtypes.h.
---
  src/gallium/include/pipe/p_defines.h   |7 ---
  src/glsl/linker.cpp|   16 
  src/mesa/drivers/dri/i965/brw_shader.cpp   |8 ++--
  src/mesa/main/mtypes.h |8 ++--
  src/mesa/main/shaderobj.h  |4 ++--
  src/mesa/main/uniform_query.cpp|2 +-
  src/mesa/program/ir_to_mesa.cpp|   10 +++---
  src/mesa/program/program.h |2 +-
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   10 +++---
  9 files changed, 30 insertions(+), 37 deletions(-)


Reviewed-by: Brian Paul 

However, a change for the VMware svga driver is also needed:

diff --git a/src/gallium/drivers/svga/svga_state_constants.c 
b/src/gallium/drive

index 759c6c6..c03f38c 100644
--- a/src/gallium/drivers/svga/svga_state_constants.c
+++ b/src/gallium/drivers/svga/svga_state_constants.c
@@ -58,10 +58,15 @@
 static int
 svga_shader_type(unsigned shader)
 {
-   assert(PIPE_SHADER_VERTEX + 1 == SVGA3D_SHADERTYPE_VS);
-   assert(PIPE_SHADER_FRAGMENT + 1 == SVGA3D_SHADERTYPE_PS);
-   assert(shader <= PIPE_SHADER_FRAGMENT);
-   return shader + 1;
+   switch (shader) {
+   case PIPE_SHADER_VERTEX:
+  return SVGA3D_SHADERTYPE_VS;
+   case PIPE_SHADER_FRAGMENT:
+  return SVGA3D_SHADERTYPE_PS;
+   default:
+  assert(!"Unexpected PIPE_SHADER_ type in svga_shader_type()");
+  return SVGA3D_SHADERTYPE_VS;
+   }
 }





[Mesa-dev] [Bug 65714] Champions of Regnum dont show characters!

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65714

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|mesa-dev@lists.freedesktop. |dri-devel@lists.freedesktop
                   |org                         |.org
            Version|9.1                         |git
          Component|Mesa core                   |Drivers/Gallium/r600



[Mesa-dev] [Bug 65714] Champions of Regnum dont show characters!

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65714

--- Comment #1 from Alex Deucher  ---
Does the ppa enable LLVM?  If so does setting env var R600_LLVM=0 help?



[Mesa-dev] [Bug 65714] Champions of Regnum dont show characters!

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65714

Fabio Pedretti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #80780|text/plain                  |image/png
          mime type|                            |



Re: [Mesa-dev] R600 Patches: Add support for the local address space

2013-06-13 Thread Tom Stellard
On Wed, Jun 12, 2013 at 06:37:39PM -0700, Matt Arsenault wrote:
> On 06/12/2013 05:42 PM, Tom Stellard wrote:
> >Hi,
> >
> >The attached patches add support for local address space on
> >Evergreen / Northern Islands GPUs.
> >
> >Please Review.
> >
> >-Tom
> > +  def int_AMDGPU_barrier_local  : Intrinsic<[], [], []>;
> You probably want to mark this as IntrReadMem to try to avoid
> reordering stores around the barrier
>

I don't think the intrinsic as defined will have stores reordered around
it.  From include/llvm/IR/Intrinsics.td:

// Intr*Mem - Memory properties.  An intrinsic is allowed to have at most one of
// these properties set.  They are listed from the most aggressive (best to use
// if correct) to the least aggressive.  If no property is set, the worst case
// is assumed (it may read and write any memory it can get access to and
// it may have other side effects).

-Tom


[Mesa-dev] [Bug 65714] New: Champions of Regnum dont show characters!

2013-06-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65714

  Priority: medium
Bug ID: 65714
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Champions of Regnum dont show characters!
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: wolfmen...@hotmail.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: 9.1
 Component: Mesa core
   Product: Mesa

Created attachment 80780
  --> https://bugs.freedesktop.org/attachment.cgi?id=80780&action=edit
main character selection screen with no Character

I am having some issues with the Champions of Regnum game.

I am using Linux Mint 15 and out of the box everything was working fine, but
as a Linux user I wanted to have the latest drivers.

So I first installed the latest ATI graphics drivers from the xorg-edgers PPA.
Every game, plus PlayOnLinux, runs OK except Champions of Regnum, which
doesn't display any characters in game; they are rendered invisible.

I have removed the xorg-edgers PPA and installed oibaf's PPA, but the result
is the same as with xorg-edgers.
I have an ATI HIS 5970x2 2GB DDR5

Any solution to this?



Re: [Mesa-dev] [PATCH 7/8] util: Expand the comment above the channel[] array

2013-06-13 Thread Richard Sandiford
Michel Dänzer  writes:
> On Die, 2013-06-11 at 16:26 +0100, Richard Sandiford wrote:
>> Signed-off-by: Richard Sandiford 
>> ---
>>  src/gallium/auxiliary/util/u_format.h | 42 
>> ++-
>>  1 file changed, 41 insertions(+), 1 deletion(-)
>> 
>> diff --git a/src/gallium/auxiliary/util/u_format.h 
>> b/src/gallium/auxiliary/util/u_format.h
>> index e4b9c36..db6c290 100644
>> --- a/src/gallium/auxiliary/util/u_format.h
>> +++ b/src/gallium/auxiliary/util/u_format.h
>> @@ -178,9 +178,49 @@ struct util_format_description
>> unsigned is_mixed:1;
>>  
>> /**
>> -* Input channel description.
>> +* Input channel description, in the order XYZW.
>>  *
>>  * Only valid for UTIL_FORMAT_LAYOUT_PLAIN formats.
>> +*
>> +* The general rule is that the order and layout of the channels is the
>> +* same as they would be in a C struct:
>> +*
>> +* struct {
>> +*...X...;
>> +*...Y...;
>> +*...Z...;
>> +*...W...;
>> +* };
>> +*
>> +* with bitfields being used for all integer channels.
>
> I'd advise against using the term 'bitfield', as the semantics of C
> bitfields are mostly up to the specific C implementation, and it will
> lure people into implicitly thinking of the semantics of bitfields in
> the C implementation they're using.

I got the impression that the bit order was fairly consistent in practice,
since there's usually a strong expectation that the first structure member
should be in the first byte.  But you're right of course.  How does this
look instead:

   /**
* Input channel description, in the order XYZW.
*
* Only valid for UTIL_FORMAT_LAYOUT_PLAIN formats.
*
* Suppose the pixel value is treated as a single integer P.
* The order of the channels within P depends on endianness:
*
* - On big-endian targets, the channels are ordered from the most
*   significant end to the least significant end.  The most significant
*   bit of P is the most significant bit of the first channel.  The least
*   significant bit of P is the least significant bit of the last channel.
*
* - On little-endian targets, the channels are ordered from the least
*   significant end to the most significant end.  The least significant
*   bit of P is the least significant bit of the first channel.  The most
*   significant bit of P is the most significant bit of the last channel.
*
* (This is the same as C bitfield layout on most ABIs.)
*
* This means that if some channels can be accessed as individual N-byte
* values, the order of those channels in this array matches their order
* in memory.  Each N-byte value has native endianness.
*
* If instead a group of channels is accessed as a single N-byte value,
* the order of the channels within that value depends on endianness.
* For big-endian targets, the first channel in the group will be
* the most significant, otherwise it will be the least significant.
*
* For example, if X, Y, Z and W are all 8 bits, the memory order is:
*
*                 0  1  2  3
*                 X  Y  Z  W
*
* regardless of endianness.  If instead the channels are 16 bits,
* the memory order is:
*
*                 0  1  2  3  4  5  6  7
*  little-endian: Xl Xu Yl Yu Zl Zu Wl Wu   (l = lower, u = upper)
*  big-endian:    Xu Xl Yu Yl Zu Zl Wu Wl
*
* If X is 8 bits and Y is 24 bits, the memory order is:
*
*                 0  1  2  3
*  little-endian: X  Yl Ym Yu    (l = lower, m = middle, u = upper)
*  big-endian:    X  Yu Ym Yl
*
* If X is 5 bits, Y is 5 bits, Z is 5 bits and W is 1 bit, the layout is:
*
*                        0        1
*                 msb  lsb msb  lsb
*  little-endian: YYYXXXXX WZZZZZYY
*  big-endian:    XXXXXYYY YYZZZZZW
*/

(Each version grows a new example :-))
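
To make the 5/5/5/1 case concrete, here is a minimal sketch of the packing rule
as described in the comment. pack_5551() is a hypothetical helper written only
for this illustration; it is not part of u_format.h, and the endianness is
passed in as a flag rather than detected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of u_format.h: packs X (5 bits), Y (5 bits),
 * Z (5 bits) and W (1 bit) into a 16-bit pixel P using the channel order
 * described for the given endianness, then stores P in that byte order. */
static void
pack_5551(unsigned x, unsigned y, unsigned z, unsigned w,
          int big_endian, uint8_t out[2])
{
   uint16_t p;

   assert(x < 32 && y < 32 && z < 32 && w < 2);

   if (big_endian) {
      /* First channel sits at the most significant end of P. */
      p = (uint16_t)((x << 11) | (y << 6) | (z << 1) | w);
      out[0] = (uint8_t)(p >> 8);    /* most significant byte first */
      out[1] = (uint8_t)(p & 0xff);
   } else {
      /* First channel sits at the least significant end of P. */
      p = (uint16_t)(x | (y << 5) | (z << 10) | (w << 15));
      out[0] = (uint8_t)(p & 0xff);  /* least significant byte first */
      out[1] = (uint8_t)(p >> 8);
   }
}
```

Setting only X to all ones should light up the low five bits of byte 0 on
little-endian and the high five bits of byte 0 on big-endian, which matches the
last table in the comment.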

Thanks,
Richard



[Mesa-dev] [PATCH 6/6] mesa, glsl, gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies

2013-06-13 Thread Marek Olšák
Not needed with do_dead_builtin_varyings.
---
 src/gallium/drivers/freedreno/freedreno_screen.c |1 -
 src/gallium/drivers/i915/i915_screen.c   |1 -
 src/gallium/drivers/ilo/ilo_screen.c |1 -
 src/gallium/drivers/llvmpipe/lp_screen.c |1 -
 src/gallium/drivers/nv30/nv30_screen.c   |1 -
 src/gallium/drivers/nv50/nv50_screen.c   |1 -
 src/gallium/drivers/nvc0/nvc0_screen.c   |1 -
 src/gallium/drivers/r300/r300_screen.c   |1 -
 src/gallium/drivers/r600/r600_pipe.c |1 -
 src/gallium/drivers/radeonsi/radeonsi_pipe.c |1 -
 src/gallium/drivers/softpipe/sp_screen.c |1 -
 src/gallium/drivers/svga/svga_screen.c   |1 -
 src/gallium/include/pipe/p_defines.h |1 -
 src/glsl/link_varyings.cpp   |   32 ++
 src/mesa/main/mtypes.h   |5 ++--
 src/mesa/state_tracker/st_extensions.c   |3 --
 16 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index f88fa08..ff45b3e 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -185,7 +185,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
case PIPE_CAP_SCALED_RESOLVE:
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index 2d0cc78..3c751c5 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -204,7 +204,6 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap)
case PIPE_CAP_MIXED_COLORBUFFER_FORMATS:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_TEXTURE_BARRIER:
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index 9daf01e..7a4443e 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -372,7 +372,6 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param)
  return is->dev.has_gen7_sol_reset;
   else
  return false; /* TODO */
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
   return false;
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 562fb51..1fed537 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -192,7 +192,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
   return 16*4;
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
   return 1;
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
   return 0;
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
diff --git a/src/gallium/drivers/nv30/nv30_screen.c b/src/gallium/drivers/nv30/nv30_screen.c
index c9943e0..07ffc80 100644
--- a/src/gallium/drivers/nv30/nv30_screen.c
+++ b/src/gallium/drivers/nv30/nv30_screen.c
@@ -109,7 +109,6 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_MAX_TEXEL_OFFSET:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
diff --git a/src/gallium/drivers/nv50/nv50_screen.c b/src/gallium/drivers/nv50/nv50_screen.c
index b6da303..5c57aa2 100644
--- a/src/gallium/drivers/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nv50/nv50_screen.c
@@ -165,7 +165,6 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_START_INSTANCE:
   return 1;
-   case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
   return 0; /* state trackers will know better */
case PIPE_CAP_USER_CONSTANT_BUFFERS:
diff --git a/src/gallium/drivers/nvc0/nvc0_screen.c b/src/gallium/drivers/nvc0/nvc0_screen.c
index 97ce82c..027fc11 100644
--- a/src/gallium/drivers/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nvc0/nvc0_screen.c
@@ -157,7 +157,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
case PIPE_CAP_QUAD

[Mesa-dev] [PATCH 5/6] st/mesa: disable EXT_separate_shader_objects

2013-06-13 Thread Marek Olšák
The extension disallows elimination of set-but-unused varyings.
---
 docs/relnotes/9.2.html |3 +++
 src/mesa/state_tracker/st_extensions.c |9 -
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/relnotes/9.2.html b/docs/relnotes/9.2.html
index 0dcc960..99f6374 100644
--- a/docs/relnotes/9.2.html
+++ b/docs/relnotes/9.2.html
@@ -63,6 +63,9 @@ Note: some of the new features are only available with certain drivers.
 
 
 Removed d3d1x state tracker (unused, unmaintained and broken)
+GL_EXT_separate_shader_objects has been removed from all Gallium drivers,
+because it disallows critical GLSL shader optimizations.
+GL_ARB_separate_shader_objects doesn't have this issue.
 
 
 
diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 966722c..43111d6 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -559,7 +559,14 @@ void st_init_extensions(struct st_context *st)
ctx->Extensions.EXT_point_parameters = GL_TRUE;
ctx->Extensions.EXT_provoking_vertex = GL_TRUE;
ctx->Extensions.EXT_secondary_color = GL_TRUE;
-   ctx->Extensions.EXT_separate_shader_objects = GL_TRUE;
+
+   /* IMPORTANT:
+*Don't enable EXT_separate_shader_objects. It disallows certain
+*optimizations in the GLSL compiler and therefore is considered
+*harmful.
+*/
+   ctx->Extensions.EXT_separate_shader_objects = GL_FALSE;
+
ctx->Extensions.EXT_texture_env_dot3 = GL_TRUE;
ctx->Extensions.EXT_vertex_array_bgra = GL_TRUE;
 
-- 
1.7.10.4



[Mesa-dev] [PATCH 4/6] glsl/linker: eliminate unused and set-but-unused built-in varyings

2013-06-13 Thread Marek Olšák
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.
---
 src/glsl/Makefile.sources  |1 +
 src/glsl/ir_optimization.h |4 +
 src/glsl/link_varyings.h   |4 +
 src/glsl/linker.cpp|   13 +-
 src/glsl/opt_dead_builtin_varyings.cpp |  468 
 5 files changed, 488 insertions(+), 2 deletions(-)
 create mode 100644 src/glsl/opt_dead_builtin_varyings.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 50bad85..cb17cf8 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -81,6 +81,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_constant_variable.cpp \
$(GLSL_SRCDIR)/opt_copy_propagation.cpp \
$(GLSL_SRCDIR)/opt_copy_propagation_elements.cpp \
+   $(GLSL_SRCDIR)/opt_dead_builtin_varyings.cpp \
$(GLSL_SRCDIR)/opt_dead_code.cpp \
$(GLSL_SRCDIR)/opt_dead_code_local.cpp \
$(GLSL_SRCDIR)/opt_dead_functions.cpp \
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index d38d5e3..fad6f1b 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -76,6 +76,10 @@ bool do_constant_variable_unlinked(exec_list *instructions);
 bool do_copy_propagation(exec_list *instructions);
 bool do_copy_propagation_elements(exec_list *instructions);
 bool do_constant_propagation(exec_list *instructions);
+void do_dead_builtin_varyings(struct gl_context *ctx,
+  exec_list *producer, exec_list *consumer,
+  unsigned num_tfeedback_decls,
+  class tfeedback_decl *tfeedback_decls);
 bool do_dead_code(exec_list *instructions, bool uniform_locations_assigned);
 bool do_dead_code_local(exec_list *instructions);
 bool do_dead_code_unlinked(exec_list *instructions);
diff --git a/src/glsl/link_varyings.h b/src/glsl/link_varyings.h
index daa9d79..7f7be35 100644
--- a/src/glsl/link_varyings.h
+++ b/src/glsl/link_varyings.h
@@ -125,6 +125,10 @@ public:
  return this->vector_elements * this->matrix_columns * this->size;
}
 
+   unsigned get_location() const {
+  return this->location;
+   }
+
 private:
/**
 * The name that was supplied to glTransformFeedbackVaryings.  Used for
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index a8537cf..129b665 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1887,6 +1887,9 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 goto done;
   }
 
+  do_dead_builtin_varyings(ctx, sh->ir, NULL,
+   num_tfeedback_decls, tfeedback_decls);
+
   demote_shader_inputs_and_outputs(sh, ir_var_shader_out);
 
   /* Eliminate code that is now dead due to unused outputs being demoted.
@@ -1895,11 +1898,13 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
  ;
}
else if (first == MESA_SHADER_FRAGMENT) {
-  /* If the program only contains a fragment shader, just demote
-   * user-defined varyings.
+  /* If the program only contains a fragment shader...
*/
   gl_shader *const sh = prog->_LinkedShaders[first];
 
+  do_dead_builtin_varyings(ctx, NULL, sh->ir,
+   num_tfeedback_decls, tfeedback_decls);
+
   demote_shader_inputs_and_outputs(sh, ir_var_shader_in);
 
   while (do_dead_code(sh->ir, false))
@@ -1919,6 +1924,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 tfeedback_decls))
  goto done;
 
+  do_dead_builtin_varyings(ctx, sh_i->ir, sh_next->ir,
+next == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
+tfeedback_decls);
+
   demote_shader_inputs_and_outputs(sh_i, ir_var_shader_out);
   demote_shader_inputs_and_outputs(sh_next, ir_var_shader_in);
 
diff --git a/src/glsl/opt_dead_builtin_varyings.cpp b/src/glsl/opt_dead_builtin_varyings.cpp
new file mode 100644
index 0000000..eb99d1e
--- /dev/null
+++ b/src/glsl/opt_dead_builtin_varyings.cpp
@@ -0,0 +1,468 @@
+/*
+ * Copyright © 2013 Marek Olšák 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Softwar

[Mesa-dev] [PATCH 3/6] glsl/linker: check against varying limit after unused varyings are eliminated

2013-06-13 Thread Marek Olšák
We counted even the varyings which were later eliminated, which was
suboptimal.
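
A minimal sketch of the counting step, assuming a toy variable type in place of
ir_variable (toy_var, toy_mode and count_live_varying_slots() are made up for
this illustration and do not exist in Mesa). The point is only that variables
already demoted by dead-varying elimination no longer count against the limit:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for ir_variable; the names are illustrative only. */
enum toy_mode { TOY_MODE_AUTO, TOY_MODE_SHADER_IN };

struct toy_var {
   enum toy_mode mode;   /* demoted varyings become TOY_MODE_AUTO */
   unsigned slots;       /* vec4 slots the variable occupies */
};

/* Count the slots of variables that are still shader inputs.  When this
 * runs after dead-varying elimination, demoted variables do not contribute. */
static unsigned
count_live_varying_slots(const struct toy_var *vars, size_t count)
{
   unsigned total = 0;

   assert(vars != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (vars[i].mode == TOY_MODE_SHADER_IN)
         total += vars[i].slots;
   }
   return total;
}
```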
---
 src/glsl/link_varyings.cpp |   35 ---
 src/glsl/link_varyings.h   |5 +
 src/glsl/linker.cpp|4 
 3 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 34e3440..25f27f0 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -1111,16 +1111,12 @@ assign_varying_locations(struct gl_context *ctx,
   }
}
 
-   unsigned varying_vectors = 0;
-
if (consumer) {
   foreach_list(node, consumer->ir) {
  ir_variable *const var = ((ir_instruction *) node)->as_variable();
 
- if ((var == NULL) || (var->mode != ir_var_shader_in))
-continue;
-
- if (var->is_unmatched_generic_inout) {
+ if (var && var->mode == ir_var_shader_in &&
+ var->is_unmatched_generic_inout) {
 if (prog->Version <= 120) {
/* On page 25 (page 31 of the PDF) of the GLSL 1.20 spec:
 *
@@ -1143,15 +1139,32 @@ assign_varying_locations(struct gl_context *ctx,
  * value is written by the previous stage.
  */
 var->mode = ir_var_auto;
- } else if (is_varying_var(consumer->Type, var)) {
-/* The packing rules are used for vertex shader inputs are also
- * used for fragment shader inputs.
- */
-varying_vectors += count_attribute_slots(var->type);
  }
   }
}
 
+   return true;
+}
+
+bool
+check_against_varying_limit(struct gl_context *ctx,
+struct gl_shader_program *prog,
+gl_shader *consumer)
+{
+   unsigned varying_vectors = 0;
+
+   foreach_list(node, consumer->ir) {
+  ir_variable *const var = ((ir_instruction *) node)->as_variable();
+
+  if (var && var->mode == ir_var_shader_in &&
+  is_varying_var(consumer->Type, var)) {
+ /* The packing rules used for vertex shader inputs are also
+  * used for fragment shader inputs.
+  */
+ varying_vectors += count_attribute_slots(var->type);
+  }
+   }
+
if (ctx->API == API_OPENGLES2 || prog->IsES) {
   if (varying_vectors > ctx->Const.MaxVarying) {
  if (ctx->Const.GLSLSkipStrictMaxVaryingLimitCheck) {
diff --git a/src/glsl/link_varyings.h b/src/glsl/link_varyings.h
index ee1010a..daa9d79 100644
--- a/src/glsl/link_varyings.h
+++ b/src/glsl/link_varyings.h
@@ -232,4 +232,9 @@ assign_varying_locations(struct gl_context *ctx,
  unsigned num_tfeedback_decls,
  tfeedback_decl *tfeedback_decls);
 
+bool
+check_against_varying_limit(struct gl_context *ctx,
+struct gl_shader_program *prog,
+gl_shader *consumer);
+
 #endif /* GLSL_LINK_VARYINGS_H */
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 9ef9cc7..a8537cf 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1929,6 +1929,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   while (do_dead_code(sh_next->ir, false))
  ;
 
+  /* This must be done after all dead varyings are eliminated. */
+  if (!check_against_varying_limit(ctx, prog, sh_next))
+ goto done;
+
   next = i;
}
 
-- 
1.7.10.4



[Mesa-dev] [PATCH 2/6] glsl/linker: link shaders in the opposite order (from fragment to vertex)

2013-06-13 Thread Marek Olšák
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.

(It was too late when I realized this hadn't been needed with only 2 shader
 stages. This can be considered a cleanup for now and hopefully geometry
 shaders are close to completion.)
---
 src/glsl/linker.cpp |  108 +++
 1 file changed, 58 insertions(+), 50 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 0f167e6..9ef9cc7 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1836,9 +1836,9 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   goto done;
}
 
-   unsigned prev;
-   for (prev = 0; prev < MESA_SHADER_TYPES; prev++) {
-  if (prog->_LinkedShaders[prev] != NULL)
+   unsigned first;
+   for (first = 0; first < MESA_SHADER_TYPES; first++) {
+  if (prog->_LinkedShaders[first] != NULL)
 break;
}
 
@@ -1850,7 +1850,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
* non-zero, but the program object has no vertex or geometry
* shader;
*/
-  if (prev >= MESA_SHADER_FRAGMENT) {
+  if (first >= MESA_SHADER_FRAGMENT) {
  linker_error(prog, "Transform feedback varyings specified, but "
   "no vertex or geometry shader is present.");
  goto done;
@@ -1864,69 +1864,77 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
  goto done;
}
 
-   for (unsigned i = prev + 1; i < MESA_SHADER_TYPES; i++) {
-  if (prog->_LinkedShaders[i] == NULL)
-continue;
-
-  if (!assign_varying_locations(
-   ctx, mem_ctx, prog, prog->_LinkedShaders[prev], prog->_LinkedShaders[i],
- i == MESA_SHADER_FRAGMENT ? num_tfeedback_decls : 0,
- tfeedback_decls))
-goto done;
-
-  prev = i;
+   /* Linking the stages in the opposite order (from fragment to vertex)
+* ensures that inter-shader outputs written to in an earlier stage are
+* eliminated if they are (transitively) not used in a later stage.
+*/
+   int last, next;
+   for (last = MESA_SHADER_TYPES-1; last >= 0; last--) {
+  if (prog->_LinkedShaders[last] != NULL)
+ break;
}
 
-   if (prev != MESA_SHADER_FRAGMENT && num_tfeedback_decls != 0) {
-  /* There was no fragment shader, but we still have to assign varying
-   * locations for use by transform feedback.
-   */
-  if (!assign_varying_locations(
-   ctx, mem_ctx, prog, prog->_LinkedShaders[prev], NULL, num_tfeedback_decls,
- tfeedback_decls))
- goto done;
-   }
+   if (last >= 0 && last < MESA_SHADER_FRAGMENT) {
+  gl_shader *const sh = prog->_LinkedShaders[last];
 
-   if (!store_tfeedback_info(ctx, prog, num_tfeedback_decls, tfeedback_decls))
-  goto done;
+  if (num_tfeedback_decls != 0) {
+ /* There was no fragment shader, but we still have to assign varying
+  * locations for use by transform feedback.
+  */
+ if (!assign_varying_locations(ctx, mem_ctx, prog,
+   sh, NULL,
+   num_tfeedback_decls, tfeedback_decls))
+goto done;
+  }
 
-   if (prog->_LinkedShaders[MESA_SHADER_VERTEX] != NULL) {
-  demote_shader_inputs_and_outputs(prog->_LinkedShaders[MESA_SHADER_VERTEX],
-  ir_var_shader_out);
+  demote_shader_inputs_and_outputs(sh, ir_var_shader_out);
 
-  /* Eliminate code that is now dead due to unused vertex outputs being
-   * demoted.
+  /* Eliminate code that is now dead due to unused outputs being demoted.
*/
-  while (do_dead_code(prog->_LinkedShaders[MESA_SHADER_VERTEX]->ir, false))
-;
+  while (do_dead_code(sh->ir, false))
+ ;
}
-
-   if (prog->_LinkedShaders[MESA_SHADER_GEOMETRY] != NULL) {
-  gl_shader *const sh = prog->_LinkedShaders[MESA_SHADER_GEOMETRY];
+   else if (first == MESA_SHADER_FRAGMENT) {
+  /* If the program only contains a fragment shader, just demote
+   * user-defined varyings.
+   */
+  gl_shader *const sh = prog->_LinkedShaders[first];
 
   demote_shader_inputs_and_outputs(sh, ir_var_shader_in);
-  demote_shader_inputs_and_outputs(sh, ir_var_shader_out);
 
-  /* Eliminate code that is now dead due to unused geometry outputs being
-   * demoted.
-   */
-  while (do_dead_code(prog->_LinkedShaders[MESA_SHADER_GEOMETRY]->ir, 

[Mesa-dev] [PATCH 1/6] mesa, gallium: renumber shader indices according to their placement in pipeline

2013-06-13 Thread Marek Olšák
See my explanation in mtypes.h.
---
 src/gallium/include/pipe/p_defines.h   |7 ---
 src/glsl/linker.cpp|   16 
 src/mesa/drivers/dri/i965/brw_shader.cpp   |8 ++--
 src/mesa/main/mtypes.h |8 ++--
 src/mesa/main/shaderobj.h  |4 ++--
 src/mesa/main/uniform_query.cpp|2 +-
 src/mesa/program/ir_to_mesa.cpp|   10 +++---
 src/mesa/program/program.h |2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   10 +++---
 9 files changed, 30 insertions(+), 37 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 8af1a84..216cd2f 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -352,11 +352,12 @@ enum pipe_flush_flags {
 
 
 /**
- * Shaders
+ * Shaders.
+ * These must have the same values as Mesa's MESA_SHADER_*.
  */
 #define PIPE_SHADER_VERTEX   0
-#define PIPE_SHADER_FRAGMENT 1
-#define PIPE_SHADER_GEOMETRY 2
+#define PIPE_SHADER_GEOMETRY 1
+#define PIPE_SHADER_FRAGMENT 2
 #define PIPE_SHADER_COMPUTE  3
 #define PIPE_SHADER_TYPES4
 
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index cd8d680..0f167e6 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1514,31 +1514,31 @@ static bool
 check_resources(struct gl_context *ctx, struct gl_shader_program *prog)
 {
static const char *const shader_names[MESA_SHADER_TYPES] = {
-  "vertex", "fragment", "geometry"
+  "vertex", "geometry", "fragment"
};
 
const unsigned max_samplers[MESA_SHADER_TYPES] = {
   ctx->Const.VertexProgram.MaxTextureImageUnits,
-  ctx->Const.FragmentProgram.MaxTextureImageUnits,
-  ctx->Const.GeometryProgram.MaxTextureImageUnits
+  ctx->Const.GeometryProgram.MaxTextureImageUnits,
+  ctx->Const.FragmentProgram.MaxTextureImageUnits
};
 
const unsigned max_default_uniform_components[MESA_SHADER_TYPES] = {
   ctx->Const.VertexProgram.MaxUniformComponents,
-  ctx->Const.FragmentProgram.MaxUniformComponents,
-  ctx->Const.GeometryProgram.MaxUniformComponents
+  ctx->Const.GeometryProgram.MaxUniformComponents,
+  ctx->Const.FragmentProgram.MaxUniformComponents
};
 
const unsigned max_combined_uniform_components[MESA_SHADER_TYPES] = {
   ctx->Const.VertexProgram.MaxCombinedUniformComponents,
-  ctx->Const.FragmentProgram.MaxCombinedUniformComponents,
-  ctx->Const.GeometryProgram.MaxCombinedUniformComponents
+  ctx->Const.GeometryProgram.MaxCombinedUniformComponents,
+  ctx->Const.FragmentProgram.MaxCombinedUniformComponents
};
 
const unsigned max_uniform_blocks[MESA_SHADER_TYPES] = {
   ctx->Const.VertexProgram.MaxUniformBlocks,
-  ctx->Const.FragmentProgram.MaxUniformBlocks,
   ctx->Const.GeometryProgram.MaxUniformBlocks,
+  ctx->Const.FragmentProgram.MaxUniformBlocks
};
 
for (unsigned i = 0; i < MESA_SHADER_TYPES; i++) {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 03e4329..5c8f449 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -119,17 +119,13 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
   struct brw_shader *shader =
 (struct brw_shader *)shProg->_LinkedShaders[stage];
-  static const GLenum targets[] = {
-GL_VERTEX_PROGRAM_ARB,
-GL_FRAGMENT_PROGRAM_ARB,
-GL_GEOMETRY_PROGRAM_NV
-  };
 
   if (!shader)
 continue;
 
   struct gl_program *prog =
-ctx->Driver.NewProgram(ctx, targets[stage], shader->base.Name);
+ctx->Driver.NewProgram(ctx, _mesa_program_index_to_target(stage),
+shader->base.Name);
   if (!prog)
return false;
   prog->Parameters = _mesa_new_parameter_list();
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cd8650c..750e333 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2175,12 +2175,16 @@ struct gl_shader
 /**
  * Shader stages. Note that these will become 5 with tessellation.
  * These MUST have the same values as gallium's PIPE_SHADER_*
+ *
+ * The order must match how shaders are ordered in the pipeline.
+ * The GLSL linker assumes that if i= MESA_SHADER_TYPES)
   return 0;
diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 296f80f..be2f0e4 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -435,7 +435,7 @@ static void
 log_program_parameters(const struct gl_shader_program *shProg)
 {
static const char *stages[] = {
-  "vertex", "fragment", "geometry"
+  "vertex", "geometry", "fragment"
};
 
assert(Elements(stages) == MESA_SHADER_TYPES);
diff --git a/src/mesa/prog

[Mesa-dev] [PATCH 0/6] Eliminating unused built-in varyings

2013-06-13 Thread Marek Olšák
Hi everyone,

this series adds a new GLSL compiler optimization pass which eliminates unused 
and set-but-unused built-in varyings and adds a few improvements to the GLSL 
linker in the process.

Before I show you how it works, I wanna say that there are patches which are 
related to and will most probably conflict with the geometry shader work, but 
they are necessary because the linkage of varyings is largely suboptimal.

Also, the GL_EXT_separate_shader_objects extension must be disabled for this 
optimization to be enabled. The reason is that a program object with both a VS 
and FS can be bound partially, e.g. by glUseShaderProgramEXT(GL_VERTEX_SHADER, 
prog), so the extension makes every program object just a set of "separate 
shaders". The extension is not important anyway.

Now, to illustrate how the optimization works, consider these 2 shader IR dumps:


Vertex shader (8 varyings):
...
(declare (shader_out ) vec4 gl_FrontColor)
(declare (shader_out ) vec4 gl_FrontSecondaryColor)
(declare (shader_out ) (array vec4 6) gl_TexCoord)
(function main
  (signature void
(parameters
)
(
  ...
  (assign  (xyzw) (var_ref gl_FrontColor)  (var_ref gl_Color) ) 
  (assign  (xyzw) (var_ref gl_FrontSecondaryColor)  (var_ref gl_SecondaryColor) )
  (assign  (xyzw) (array_ref (var_ref gl_TexCoord) (constant int (1)) )  (var_ref gl_MultiTexCoord1) )
  (assign  (xyzw) (array_ref (var_ref gl_TexCoord) (constant int (4)) )  (var_ref gl_MultiTexCoord4) )
  (assign  (xyzw) (array_ref (var_ref gl_TexCoord) (constant int (5)) )  (var_ref gl_MultiTexCoord5) )
))
)

Fragment shader (6 varyings):
...
(declare (shader_in ) vec4 gl_SecondaryColor)
(declare (shader_in ) (array vec4 5) gl_TexCoord)
(function main
  (signature void
(parameters
)
(
  (declare () vec4 r)
  (assign  (xyzw) (var_ref r)  ... (var_ref gl_SecondaryColor) ) ) 
  (assign  (xyzw) (var_ref r)  ... (array_ref (var_ref gl_TexCoord) (constant int (1)) ) ) ) )
  (assign  (xyzw) (var_ref r)  ... (array_ref (var_ref gl_TexCoord) (constant int (2)) ) ) ) )
  (assign  (xyzw) (var_ref r)  ... (array_ref (var_ref gl_TexCoord) (constant int (3)) ) ) ) )
  (declare (temporary ) vec4 assignment_tmp)
  (assign  (xyzw) (var_ref assignment_tmp)  ... (array_ref (var_ref gl_TexCoord) (constant int (4)) ) ) ) )
  ...
))
)


Note that only gl_TexCoord[1], gl_TexCoord[4], and gl_SecondaryColor are used 
by both shaders. The optimization replaces all occurrences of varyings which are 
unused by the other stage with temporary variables. It also breaks down the 
gl_TexCoord array into separate vec4 variables if needed. Here's the result:


Vertex shader (3 varyings instead of 8):
...
(declare (shader_out ) vec4 gl_out_TexCoord1)
(declare (shader_out ) vec4 gl_out_TexCoord4)
(declare (temporary ) vec4 gl_out_TexCoord5_dummy)
(declare (temporary ) vec4 gl_out_FrontColor0_dummy)
(declare (shader_out ) vec4 gl_FrontSecondaryColor)
(function main
  (signature void
(parameters
)
(
  ...
  (assign  (xyzw) (var_ref gl_out_FrontColor0_dummy)  (var_ref gl_Color) ) 
  (assign  (xyzw) (var_ref gl_FrontSecondaryColor)  (var_ref gl_SecondaryColor) )
  (assign  (xyzw) (var_ref gl_out_TexCoord1)  (var_ref gl_MultiTexCoord1) )
  (assign  (xyzw) (var_ref gl_out_TexCoord4)  (var_ref gl_MultiTexCoord4) )
  (assign  (xyzw) (var_ref gl_out_TexCoord5_dummy)  (var_ref gl_MultiTexCoord5) )
))
)

Fragment shader (3 varyings instead of 6):
...
(declare (shader_in ) vec4 gl_in_TexCoord1)
(declare (temporary ) vec4 gl_in_TexCoord2_dummy)
(declare (temporary ) vec4 gl_in_TexCoord3_dummy)
(declare (shader_in ) vec4 gl_in_TexCoord4)
(declare (shader_in ) vec4 gl_SecondaryColor)
(function main
  (signature void
(parameters
)
(
  (declare () vec4 r)
  (assign  (xyzw) (var_ref r)  ... (var_ref gl_SecondaryColor) ) ) 
  (assign  (xyzw) (var_ref r)  ... (var_ref gl_in_TexCoord1) ) ) ) 
  (assign  (xyzw) (var_ref r)  ... (var_ref gl_in_TexCoord2_dummy) ) ) ) 
  (assign  (xyzw) (var_ref r)  ... (var_ref gl_in_TexCoord3_dummy) ) ) ) 
  (declare (temporary ) vec4 assignment_tmp)
  (assign  (xyzw) (var_ref assignment_tmp)  ... (var_ref gl_in_TexCoord4) ) ) )
  ...
))
)

The locations of varyings remain the same. That's all. Please review.
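
For the gl_TexCoord part, the decision boils down to set arithmetic on "written
by the producer" and "read by the consumer". The sketch below assumes those two
sets are available as bitmasks; toy_texcoord_split, toy_split_texcoords() and
TOY_MAX_TEXCOORDS are hypothetical names for this illustration, not what the
pass actually uses.

```c
#include <assert.h>

#define TOY_MAX_TEXCOORDS 8  /* illustrative; the real limit comes from the context */

struct toy_texcoord_split {
   unsigned kept;            /* elements that stay varyings in both stages */
   unsigned producer_dummy;  /* written but never read: temporaries in the producer */
   unsigned consumer_dummy;  /* read but never written: temporaries in the consumer */
};

/* Bit i of each mask stands for gl_TexCoord[i]. */
static struct toy_texcoord_split
toy_split_texcoords(unsigned written, unsigned read)
{
   struct toy_texcoord_split s;

   assert((written >> TOY_MAX_TEXCOORDS) == 0);
   assert((read >> TOY_MAX_TEXCOORDS) == 0);

   s.kept = written & read;
   s.producer_dummy = written & ~read;
   s.consumer_dummy = read & ~written;
   return s;
}
```

With the dumps above, the VS writes elements 1, 4 and 5 and the FS reads
elements 1 to 4, so elements 1 and 4 are kept, element 5 becomes a dummy in the
VS, and elements 2 and 3 become dummies in the FS.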

Marek


Re: [Mesa-dev] [PATCH] st/mesa: fix temp texture bindings in st_CopyPixels()

2013-06-13 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Jun 13, 2013 at 11:11 AM, Chia-I Wu  wrote:
> The temporary texture should have either PIPE_BIND_RENDER_TARGET or
> PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW.
>
> Signed-off-by: Chia-I Wu 
> ---
>  src/mesa/state_tracker/st_cb_drawpixels.c |   30 +
>  1 file changed, 13 insertions(+), 17 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 1c26315..0200a62 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -460,12 +460,12 @@ internal_format(struct gl_context *ctx, GLenum format, GLenum type)
>   */
>  static struct pipe_resource *
>  alloc_texture(struct st_context *st, GLsizei width, GLsizei height,
> -  enum pipe_format texFormat)
> +  enum pipe_format texFormat, unsigned bind)
>  {
> struct pipe_resource *pt;
>
> pt = st_texture_create(st, st->internal_target, texFormat, 0,
> -  width, height, 1, 1, 0, PIPE_BIND_SAMPLER_VIEW);
> +  width, height, 1, 1, 0, bind);
>
> return pt;
>  }
> @@ -515,7 +515,7 @@ make_texture(struct st_context *st,
>return NULL;
>
> /* alloc temporary texture */
> -   pt = alloc_texture(st, width, height, pipeFormat);
> +   pt = alloc_texture(st, width, height, pipeFormat, PIPE_BIND_SAMPLER_VIEW);
> if (!pt) {
>_mesa_unmap_pbo_source(ctx, unpack);
>return NULL;
> @@ -1475,6 +1475,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
> int num_sampler_view = 1;
> GLfloat *color;
> enum pipe_format srcFormat;
> +   unsigned srcBind;
> GLboolean invertTex = GL_FALSE;
> GLint readX, readY, readW, readH;
> struct gl_pixelstore_attrib pack = ctx->DefaultPacking;
> @@ -1540,16 +1541,15 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
>
> /* Choose the format for the temporary texture. */
> srcFormat = rbRead->texture->format;
> +   srcBind = PIPE_BIND_SAMPLER_VIEW |
> +  (type == GL_COLOR ? PIPE_BIND_RENDER_TARGET : PIPE_BIND_DEPTH_STENCIL);
>
> if (!screen->is_format_supported(screen, srcFormat, st->internal_target, 0,
> -PIPE_BIND_SAMPLER_VIEW |
> -(type == GL_COLOR ? PIPE_BIND_RENDER_TARGET
> - : PIPE_BIND_DEPTH_STENCIL))) {
> +srcBind)) {
>if (type == GL_DEPTH) {
>   srcFormat = st_choose_format(st, GL_DEPTH_COMPONENT, GL_NONE,
>GL_NONE, st->internal_target, 0,
> -  PIPE_BIND_SAMPLER_VIEW |
> -  PIPE_BIND_DEPTH_STENCIL, FALSE);
> +  srcBind, FALSE);
>}
>else {
>   assert(type == GL_COLOR);
> @@ -1557,26 +1557,22 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
>   if (util_format_is_float(srcFormat)) {
>  srcFormat = st_choose_format(st, GL_RGBA32F, GL_NONE,
>   GL_NONE, st->internal_target, 0,
> - PIPE_BIND_SAMPLER_VIEW |
> - PIPE_BIND_RENDER_TARGET, FALSE);
> + srcBind, FALSE);
>   }
>   else if (util_format_is_pure_sint(srcFormat)) {
>  srcFormat = st_choose_format(st, GL_RGBA32I, GL_NONE,
>   GL_NONE, st->internal_target, 0,
> - PIPE_BIND_SAMPLER_VIEW |
> - PIPE_BIND_RENDER_TARGET, FALSE);
> + srcBind, FALSE);
>   }
>   else if (util_format_is_pure_uint(srcFormat)) {
>  srcFormat = st_choose_format(st, GL_RGBA32UI, GL_NONE,
>   GL_NONE, st->internal_target, 0,
> - PIPE_BIND_SAMPLER_VIEW |
> - PIPE_BIND_RENDER_TARGET, FALSE);
> + srcBind, FALSE);
>   }
>   else {
>  srcFormat = st_choose_format(st, GL_RGBA, GL_NONE,
>   GL_NONE, st->internal_target, 0,
> - PIPE_BIND_SAMPLER_VIEW |
> - PIPE_BIND_RENDER_TARGET, FALSE);
> + srcBind, FALSE);
>   }
>}
>
> @@ -1615,7 +1611,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
> readH = MAX2(0, readH);
>
> /* Allocate the temporary texture. */
> -   pt = alloc_texture(st, width, height, srcFormat);
>

[Mesa-dev] [PATCH] st/mesa: fix temp texture bindings in st_CopyPixels()

2013-06-13 Thread Chia-I Wu
The temporary texture should have either PIPE_BIND_RENDER_TARGET or
PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW.

Signed-off-by: Chia-I Wu 
---
 src/mesa/state_tracker/st_cb_drawpixels.c |   30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c b/src/mesa/state_tracker/st_cb_drawpixels.c
index 1c26315..0200a62 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -460,12 +460,12 @@ internal_format(struct gl_context *ctx, GLenum format, GLenum type)
  */
 static struct pipe_resource *
 alloc_texture(struct st_context *st, GLsizei width, GLsizei height,
-  enum pipe_format texFormat)
+  enum pipe_format texFormat, unsigned bind)
 {
struct pipe_resource *pt;
 
pt = st_texture_create(st, st->internal_target, texFormat, 0,
-  width, height, 1, 1, 0, PIPE_BIND_SAMPLER_VIEW);
+  width, height, 1, 1, 0, bind);
 
return pt;
 }
@@ -515,7 +515,7 @@ make_texture(struct st_context *st,
   return NULL;
 
/* alloc temporary texture */
-   pt = alloc_texture(st, width, height, pipeFormat);
+   pt = alloc_texture(st, width, height, pipeFormat, PIPE_BIND_SAMPLER_VIEW);
if (!pt) {
   _mesa_unmap_pbo_source(ctx, unpack);
   return NULL;
@@ -1475,6 +1475,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
int num_sampler_view = 1;
GLfloat *color;
enum pipe_format srcFormat;
+   unsigned srcBind;
GLboolean invertTex = GL_FALSE;
GLint readX, readY, readW, readH;
struct gl_pixelstore_attrib pack = ctx->DefaultPacking;
@@ -1540,16 +1541,15 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
 
/* Choose the format for the temporary texture. */
srcFormat = rbRead->texture->format;
+   srcBind = PIPE_BIND_SAMPLER_VIEW |
+  (type == GL_COLOR ? PIPE_BIND_RENDER_TARGET : PIPE_BIND_DEPTH_STENCIL);
 
if (!screen->is_format_supported(screen, srcFormat, st->internal_target, 0,
-PIPE_BIND_SAMPLER_VIEW |
-(type == GL_COLOR ? PIPE_BIND_RENDER_TARGET
- : PIPE_BIND_DEPTH_STENCIL))) {
+srcBind)) {
   if (type == GL_DEPTH) {
  srcFormat = st_choose_format(st, GL_DEPTH_COMPONENT, GL_NONE,
   GL_NONE, st->internal_target, 0,
-  PIPE_BIND_SAMPLER_VIEW |
-  PIPE_BIND_DEPTH_STENCIL, FALSE);
+  srcBind, FALSE);
   }
   else {
  assert(type == GL_COLOR);
@@ -1557,26 +1557,22 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
  if (util_format_is_float(srcFormat)) {
 srcFormat = st_choose_format(st, GL_RGBA32F, GL_NONE,
  GL_NONE, st->internal_target, 0,
- PIPE_BIND_SAMPLER_VIEW |
- PIPE_BIND_RENDER_TARGET, FALSE);
+ srcBind, FALSE);
  }
  else if (util_format_is_pure_sint(srcFormat)) {
 srcFormat = st_choose_format(st, GL_RGBA32I, GL_NONE,
  GL_NONE, st->internal_target, 0,
- PIPE_BIND_SAMPLER_VIEW |
- PIPE_BIND_RENDER_TARGET, FALSE);
+ srcBind, FALSE);
  }
  else if (util_format_is_pure_uint(srcFormat)) {
 srcFormat = st_choose_format(st, GL_RGBA32UI, GL_NONE,
  GL_NONE, st->internal_target, 0,
- PIPE_BIND_SAMPLER_VIEW |
- PIPE_BIND_RENDER_TARGET, FALSE);
+ srcBind, FALSE);
  }
  else {
 srcFormat = st_choose_format(st, GL_RGBA, GL_NONE,
  GL_NONE, st->internal_target, 0,
- PIPE_BIND_SAMPLER_VIEW |
- PIPE_BIND_RENDER_TARGET, FALSE);
+ srcBind, FALSE);
  }
   }
 
@@ -1615,7 +1611,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint srcy,
readH = MAX2(0, readH);
 
/* Allocate the temporary texture. */
-   pt = alloc_texture(st, width, height, srcFormat);
+   pt = alloc_texture(st, width, height, srcFormat, srcBind);
if (!pt)
   return;
 
-- 
1.7.10.4
