[Mesa-dev] [PATCH] glsl: Fix gl_shader_program::UniformLocationBaseScale assert.

2013-06-24 Thread Vinson Lee
commit 26d86d26f9f972b19c7040bdb1b1daf48537ef3e added
gl_shader_program::UniformLocationBaseScale. According to the code
comments in that commit, UniformLocationBaseScale "must be >=1".

UniformLocationBaseScale is of type unsigned, so the old ">= 0" assertion
was always true and could never fire. Coverity reported a "Macro compares
unsigned to 0" defect as well.

Signed-off-by: Vinson Lee 
---
 src/mesa/main/uniforms.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/uniforms.h b/src/mesa/main/uniforms.h
index 14fe26d..9223917 100644
--- a/src/mesa/main/uniforms.h
+++ b/src/mesa/main/uniforms.h
@@ -272,7 +272,7 @@ static inline GLint
 _mesa_uniform_merge_location_offset(const struct gl_shader_program *prog,
 unsigned base_location, unsigned offset)
 {
-   assert(prog->UniformLocationBaseScale >= 0);
+   assert(prog->UniformLocationBaseScale >= 1);
assert(offset < prog->UniformLocationBaseScale);
return (base_location * prog->UniformLocationBaseScale) + offset;
 }
-- 
1.8.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
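
The point of the patch above can be illustrated with a minimal sketch. The struct and the demo_* names below are hypothetical stand-ins, not Mesa code; only the field name and the "must be >= 1" rule come from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant field of gl_shader_program. */
struct demo_program {
   unsigned UniformLocationBaseScale;   /* documented as "must be >= 1" */
};

/* Returns 1 when the scale satisfies the documented invariant.
 * A test of the form "scale >= 0" is always true for an unsigned value,
 * so it could never reject scale == 0; "scale >= 1" can.
 */
static int
demo_scale_is_valid(const struct demo_program *prog)
{
   return prog->UniformLocationBaseScale >= 1;
}

/* Mirrors the shape of _mesa_uniform_merge_location_offset(). */
static int
demo_merge_location_offset(const struct demo_program *prog,
                           unsigned base_location, unsigned offset)
{
   assert(demo_scale_is_valid(prog));
   assert(offset < prog->UniformLocationBaseScale);
   return (int)((base_location * prog->UniformLocationBaseScale) + offset);
}
```

With a scale of 4, base location 3 and offset 1 merge to 13; a scale of 0 is rejected by the validity check.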


[Mesa-dev] Mesa for Nios ii system

2013-06-24 Thread ZhiLi
Hi there,
I am new to both OpenGL and the Nios II soft processor, and I am currently
working on a project that involves both.
For this project, I will implement a Nios II soft-core processor on an Altera
DE2 (an FPGA board). I will then compile some OpenGL applications with the
Nios II IDE for Eclipse and download them onto the embedded board. The problem
is that there is no OpenGL library for this system, so I have to modify the
Mesa library to make this work.
Specifically, the Nios II processor is a general-purpose RISC processor with
the following features:
- Full 32-bit instruction set, data path, and address space
- 32 general-purpose registers
- Optional shadow register sets
- 32 interrupt sources
- External interrupt controller interface for more interrupt sources
- Single-instruction 32 × 32 multiply and divide producing a 32-bit result
- Dedicated instructions for computing 64-bit and 128-bit products of
  multiplication
- Floating-point instructions for single-precision floating-point operations
- Single-instruction barrel shifter
- Access to a variety of on-chip peripherals, and interfaces to off-chip
  memories and peripherals
- Hardware-assisted debug module enabling processor start, stop, step, and
  trace under control of the Nios II software development tools
- Optional memory management unit (MMU) to support operating systems that
  require MMUs
- Optional memory protection unit (MPU)
- Software development environment based on the GNU C/C++ tool chain and the
  Nios II Software Build Tools (SBT) for Eclipse
- Integration with Altera's SignalTap® II Embedded Logic Analyzer, enabling
  real-time analysis of instructions and data along with other signals in the
  FPGA design
- Instruction set architecture (ISA) compatible across all Nios II processor
  systems
- Performance up to 250 DMIPS
Can anyone suggest where to start with this project? Thanks.


Re: [Mesa-dev] [PATCH 4/4] glsl opt_flip_matrices: Silence unused variable warning in the release build

2013-06-24 Thread Kenneth Graunke

On 06/24/2013 01:22 PM, Ian Romanick wrote:

On 06/22/2013 08:43 AM, Emil Velikov wrote:

Resolves the following gcc warning

  opt_flip_matrices.cpp:84:32: warning: unused variable 'deref'

Signed-off-by: Emil Velikov 
---
  src/glsl/opt_flip_matrices.cpp | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/opt_flip_matrices.cpp
b/src/glsl/opt_flip_matrices.cpp
index 497513f..be3ccf8 100644
--- a/src/glsl/opt_flip_matrices.cpp
+++ b/src/glsl/opt_flip_matrices.cpp
@@ -81,8 +81,8 @@ matrix_flipper::visit_enter(ir_expression *ir)

 if (mvp_transpose &&
 strcmp(mat_var->name, "gl_ModelViewProjectionMatrix") == 0) {
-  ir_dereference_variable *deref =
ir->operands[0]->as_dereference_variable();
-  assert(deref && deref->var == mat_var);
+  assert(ir->operands[0]->as_dereference_variable() &&
+ ir->operands[0]->as_dereference_variable()->var ==
mat_var);


Rather than dipping into as_dereference_variable() twice, I'd surround
both lines with a '#ifndef NDEBUG' block.


Or just (void) deref;

I don't have a preference...any of those three approaches would get:
Reviewed-by: Kenneth Graunke 
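
Two of the approaches discussed above (the '(void)' cast and the '#ifndef NDEBUG' block) can be sketched in plain C as follows. The demo_* types and functions are hypothetical; they only imitate the pattern of a local variable that is used solely inside assert().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ir_variable / ir_dereference_variable. */
struct demo_var { int id; };
struct demo_deref { const struct demo_var *var; };

/* Approach A: keep the local and mark it as used with a (void) cast,
 * so a release build (NDEBUG defined) does not warn about it.
 */
static int
demo_check_void_cast(const struct demo_deref *d,
                     const struct demo_var *expected)
{
   const struct demo_deref *deref = d;
   assert(deref && deref->var == expected);
   (void) deref;
   (void) expected;
   return 1;
}

/* Approach B: compile the local and the assert only in debug builds. */
static int
demo_check_ifndef(const struct demo_deref *d,
                  const struct demo_var *expected)
{
#ifndef NDEBUG
   const struct demo_deref *deref = d;
   assert(deref && deref->var == expected);
#else
   (void) d;
   (void) expected;
#endif
   return 1;
}
```

Both variants evaluate the accessor once; the variant in the patch avoids the local entirely at the cost of calling the accessor twice inside the assert.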



Re: [Mesa-dev] [PATCH 1/2] R600: Add support for i32 loads from the constant address space on Cayman

2013-06-24 Thread Aaron Watry
Tested-By: Aaron Watry 

Tested on an A6-3500 (SUMO)

On Tue, Jun 18, 2013 at 11:54 AM, Tom Stellard  wrote:
> From: Tom Stellard 
>
> ---
>  lib/Target/R600/R600Instructions.td | 9 +
>  test/CodeGen/R600/load.ll   | 1 +
>  2 files changed, 10 insertions(+)
>
> diff --git a/lib/Target/R600/R600Instructions.td 
> b/lib/Target/R600/R600Instructions.td
> index 83d735f..803f597 100644
> --- a/lib/Target/R600/R600Instructions.td
> +++ b/lib/Target/R600/R600Instructions.td
> @@ -1755,6 +1755,15 @@ def VTX_READ_GLOBAL_128_cm : VTX_READ_128_cm <1,
>[(set v4i32:$dst_gpr, (global_load ADDRVTX_READ:$src_gpr))]
>  >;
>
> +//===--===//
> +// Constant Loads
> +// XXX: We are currently storing all constants in the global address space.
> +//===--===//
> +
> +def CONSTANT_LOAD_cm : VTX_READ_32_cm <1,
> +  [(set i32:$dst_gpr, (constant_load ADDRVTX_READ:$src_gpr))]
> +>;
> +
>  } // End isCayman
>
>  
> //===--===//
> diff --git a/test/CodeGen/R600/load.ll b/test/CodeGen/R600/load.ll
> index ff774ec..d1ebaa3 100644
> --- a/test/CodeGen/R600/load.ll
> +++ b/test/CodeGen/R600/load.ll
> @@ -1,4 +1,5 @@
>  ; RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck 
> --check-prefix=R600-CHECK %s
> +; RUN: llc < %s -march=r600 -mcpu=cayman | FileCheck 
> --check-prefix=R600-CHECK %s
>  ; RUN: llc < %s -march=r600 -mcpu=SI | FileCheck --check-prefix=SI-CHECK  %s
>
>  ; Load an i8 value from the global address space.
> --
> 1.7.11.4
>


[Mesa-dev] [PATCH] llvmpipe: rework query logic

2013-06-24 Thread sroland
From: Roland Scheidegger 

Previously lp_rast_begin_query commands were always inserted into each bin,
and re-issued if the scene was restarted, while lp_rast_end_query commands
were executed for each still active query at the end of tile rasterization.
Also, the ps_invocations and vis_counter were set to zero when the respective
command was encountered.
This, however, cannot work for multiple queries of the same type (note that
occlusion counter and occlusion predicate, while different types, were also
affected).
So, change the logic to always set the ps_invocations and vis_counter to zero
at the start of tile rasterization, and then use "start" and "end" per-thread
query values when encountering the begin/end query commands instead, which
should work for multiple queries of the same type. This also means queries do
not have to be reissued in a new scene, however they still need to be finished
at end of tile rasterization, so a list of queries still active at the end of
a scene needs to be maintained.
Also while here don't bin the queries which don't do anything in rasterization.
(This change does not actually handle multiple queries of the same type yet,
as the list of active queries is just a simple fixed array and setup can still
only have one query active per type.)
---
 src/gallium/drivers/llvmpipe/lp_query.c |   13 +++--
 src/gallium/drivers/llvmpipe/lp_query.h |3 +-
 src/gallium/drivers/llvmpipe/lp_rast.c  |   56 ++
 src/gallium/drivers/llvmpipe/lp_rast_priv.h |9 ++-
 src/gallium/drivers/llvmpipe/lp_scene.h |4 ++
 src/gallium/drivers/llvmpipe/lp_setup.c |   81 +--
 src/gallium/drivers/llvmpipe/lp_setup_tri.c |5 ++
 7 files changed, 92 insertions(+), 79 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index 1d3edff..49abed0 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -120,19 +120,19 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
switch (pq->type) {
case PIPE_QUERY_OCCLUSION_COUNTER:
   for (i = 0; i < num_threads; i++) {
- *result += pq->count[i];
+ *result += pq->end[i];
   }
   break;
case PIPE_QUERY_OCCLUSION_PREDICATE:
   for (i = 0; i < num_threads; i++) {
  /* safer (still not guaranteed) when there's an overflow */
- vresult->b = vresult->b || pq->count[i];
+ vresult->b = vresult->b || pq->end[i];
   }
   break;
case PIPE_QUERY_TIMESTAMP:
   for (i = 0; i < num_threads; i++) {
- if (pq->count[i] > *result) {
-*result = pq->count[i];
+ if (pq->end[i] > *result) {
+*result = pq->end[i];
  }
  if (*result == 0)
 *result = os_time_get_nano();
@@ -170,7 +170,7 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
  (struct pipe_query_data_pipeline_statistics *)vresult;
   /* only ps_invocations come from binned query */
   for (i = 0; i < num_threads; i++) {
- pq->stats.ps_invocations += pq->count[i];
+ pq->stats.ps_invocations += pq->end[i];
   }
   pq->stats.ps_invocations *= LP_RASTER_BLOCK_SIZE * LP_RASTER_BLOCK_SIZE;
   *stats = pq->stats;
@@ -200,7 +200,8 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct 
pipe_query *q)
}
 
 
-   memset(pq->count, 0, sizeof(pq->count));
+   memset(pq->start, 0, sizeof(pq->start));
+   memset(pq->end, 0, sizeof(pq->end));
lp_setup_begin_query(llvmpipe->setup, pq);
 
switch (pq->type) {
diff --git a/src/gallium/drivers/llvmpipe/lp_query.h 
b/src/gallium/drivers/llvmpipe/lp_query.h
index e29022a..62ad5fd 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.h
+++ b/src/gallium/drivers/llvmpipe/lp_query.h
@@ -42,7 +42,8 @@ struct llvmpipe_context;
 
 
 struct llvmpipe_query {
-   uint64_t count[LP_MAX_THREADS];  /* a counter for each thread */
+   uint64_t start[LP_MAX_THREADS];  /* start count value for each thread */
+   uint64_t end[LP_MAX_THREADS];/* end count value for each thread */
struct lp_fence *fence;  /* fence from last scene this was binned 
in */
unsigned type;   /* PIPE_QUERY_* */
unsigned num_primitives_generated;
diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c 
b/src/gallium/drivers/llvmpipe/lp_rast.c
index 62a82e3..871cc50 100644
--- a/src/gallium/drivers/llvmpipe/lp_rast.c
+++ b/src/gallium/drivers/llvmpipe/lp_rast.c
@@ -61,7 +61,6 @@ static void
 lp_rast_begin( struct lp_rasterizer *rast,
struct lp_scene *scene )
 {
-
rast->curr_scene = scene;
 
LP_DBG(DEBUG_RAST, "%s\n", __FUNCTION__);
@@ -100,6 +99,9 @@ lp_rast_tile_begin(struct lp_rasterizer_task *task,
task->height = TILE_SIZE + y * TILE_SIZE > task->scene->fb.height ?
 task->scene->fb.height - y * TILE_SIZE : TILE_SIZE;
 
+   task->thread_data.vis_counter = 0;
+   task->ps_invocations = 0;
+
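
The start/end scheme described in the commit message can be sketched for a single thread as below. This is an assumption about how the begin/end commands use the new fields, since the relevant hunks are cut off in this archive; the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread state: one running counter per tile. */
struct demo_task { uint64_t vis_counter; };

/* Hypothetical per-thread slice of a query (one entry of start[]/end[]). */
struct demo_query { uint64_t start; uint64_t end; };

/* The counter is reset once per tile, not once per query. */
static void
demo_tile_begin(struct demo_task *t)
{
   t->vis_counter = 0;
}

/* Begin: remember the counter value at the point the query starts. */
static void
demo_begin_query(const struct demo_task *t, struct demo_query *q)
{
   q->start = t->vis_counter;
}

/* End: accumulate only what was counted while this query was active. */
static void
demo_end_query(const struct demo_task *t, struct demo_query *q)
{
   assert(t->vis_counter >= q->start);
   q->end += t->vis_counter - q->start;
   q->start = 0;
}
```

Because each query only records a delta of the shared counter, two overlapping queries of the same type no longer reset each other's count.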

[Mesa-dev] [PATCH 2/2] draw: allow overflows in the llvm paths

2013-06-24 Thread Zack Rusin
Because our code couldn't handle it, we were skipping rendering
if we detected overflows. According to the spec we should
still render, but with all-zero vertices, which is what the llvm
code already does. So for the llvm paths, let's enable processing
even if an overflow condition has been detected.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_pt.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pt.c 
b/src/gallium/auxiliary/draw/draw_pt.c
index 720d7b1..e0b8007 100644
--- a/src/gallium/auxiliary/draw/draw_pt.c
+++ b/src/gallium/auxiliary/draw/draw_pt.c
@@ -508,11 +508,15 @@ draw_vbo(struct draw_context *draw,
  draw->pt.vertex_element,
  draw->pt.nr_vertex_elements,
  info);
-
-   if (index_limit == 0) {
+#if HAVE_LLVM
+   if (!draw->llvm)
+#endif
+   {
+  if (index_limit == 0) {
   /* one of the buffers is too small to do any valid drawing */
-  debug_warning("draw: VBO too small to draw anything\n");
-  return;
+ debug_warning("draw: VBO too small to draw anything\n");
+ return;
+  }
}
 
/* If we're collecting stats then make sure we start from scratch */
-- 
1.7.10.4


[Mesa-dev] [PATCH 1/2] draw: avoid overflows in the llvm draw loop

2013-06-24 Thread Zack Rusin
Previously we could easily overflow if start + count exceeded the maximum
integer. To avoid that, we can just iterate over the count. This makes
sure that we never crash, since most of the overflow conditions
are already handled.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_llvm.c |   14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
b/src/gallium/auxiliary/draw/draw_llvm.c
index 6733e08..5373d1a 100644
--- a/src/gallium/auxiliary/draw/draw_llvm.c
+++ b/src/gallium/auxiliary/draw/draw_llvm.c
@@ -720,7 +720,7 @@ generate_fetch(struct gallivm_state *gallivm,
  stride, buffer_size,
  "buffer_overflowed");
/*
-   lp_build_printf(gallivm, "vbuf index = %d, stride is %d\n", indices, 
stride);
+   lp_build_printf(gallivm, "vbuf index = %u, stride is %u\n", index, stride);
lp_build_print_value(gallivm, "   buffer size = ", buffer_size);
lp_build_print_value(gallivm, "   buffer overflowed = ", buffer_overflowed);
*/
@@ -1595,6 +1595,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
if (elts) {
   start = zero;
   end = fetch_count;
+  count = fetch_count;
}
else {
   end = lp_build_add(&bld, start, count);
@@ -1604,7 +1605,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
 
fetch_max = LLVMBuildSub(builder, end, one, "fetch_max");
 
-   lp_build_loop_begin(&lp_loop, gallivm, start);
+   lp_build_loop_begin(&lp_loop, gallivm, zero);
{
   LLVMValueRef inputs[PIPE_MAX_SHADER_INPUTS][TGSI_NUM_CHANNELS];
   LLVMValueRef aos_attribs[PIPE_MAX_SHADER_INPUTS][LP_MAX_VECTOR_WIDTH / 
32] = { { 0 } };
@@ -1612,10 +1613,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
   LLVMValueRef clipmask;   /* holds the clipmask value */
   const LLVMValueRef (*ptr_aos)[TGSI_NUM_CHANNELS];
 
-  if (elts)
- io_itr = lp_loop.counter;
-  else
- io_itr = LLVMBuildSub(builder, lp_loop.counter, start, "");
+  io_itr = lp_loop.counter;
 
   io = LLVMBuildGEP(builder, io_ptr, &io_itr, 1, "");
 #if DEBUG_STORE
@@ -1628,6 +1626,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
 LLVMBuildAdd(builder,
  lp_loop.counter,
  lp_build_const_int32(gallivm, i), "");
+ true_index = LLVMBuildAdd(builder, start, true_index, "");
 
  /* make sure we're not out of bounds which can happen
   * if fetch_count % 4 != 0, because on the last iteration
@@ -1744,8 +1743,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
draw_llvm_variant *variant,
  vs_info->num_outputs, vs_type,
  have_clipdist);
}
-
-   lp_build_loop_end_cond(&lp_loop, end, step, LLVMIntUGE);
+   lp_build_loop_end_cond(&lp_loop, count, step, LLVMIntUGE);
 
sampler->destroy(sampler);
 
-- 
1.7.10.4
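
The difference between looping on an absolute index and looping over the count can be sketched in plain C. This is only an illustration with unsigned arithmetic and hypothetical demo_* helpers; the real code builds the loop in LLVM IR.

```c
#include <assert.h>
#include <limits.h>

/* Old shape: loop from start until the index reaches start + count.
 * If start + count wraps around, "end" becomes a small number and the
 * loop exits after the first iteration.
 */
static unsigned
demo_iterations_absolute(unsigned start, unsigned count, unsigned step)
{
   unsigned end = start + count;   /* wraps if the sum exceeds UINT_MAX */
   unsigned i = start;
   unsigned n = 0;

   do {
      n++;
      i += step;
   } while (i < end);
   return n;
}

/* New shape: loop from zero over the count; the real vertex index
 * (start + i) is formed inside the body, where bounds checks apply.
 */
static unsigned
demo_iterations_relative(unsigned count, unsigned step)
{
   unsigned i = 0;
   unsigned n = 0;

   do {
      n++;
      i += step;
   } while (i < count);
   return n;
}
```

For start = UINT_MAX - 9, count = 16 and step = 4, the absolute form runs once while the relative form runs the expected four times.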


Re: [Mesa-dev] Mesa driver for VirtualBox

2013-06-24 Thread Dave Airlie
>
>
> Thank you, that looks very interesting.  I will need a bit of time to get
> into the code, but for a start the shader conversion code looks very
> understandably written.  If I turn it into a driver like I described, is it
> something you would be interested in interfacing to? By the way, is the code
> under the usual Mesa licence?

Yes, it's all under the standard license. I suspect the shader convertor
is simple because it's incredibly naive. I've only just got things to a
stage where piglit actually finishes without nuking my VMs, and I suspect
a lot of the failures are due to the shader convertor! I'm pushing
changes to the shader convertor quite often, though. At the moment I'm
sending the TGSI text across the wire, though I'm not 100% sure that is
the best plan going forward, as TGSI text isn't currently versioned and
we do add features to it.

So far I have no real official plans for this code. Red Hat is
currently just letting me do the research on what a possible 3D virt
solution for qemu might look like, and I'll probably make some sort of
announcement in a few weeks if I get it to run gnome-shell smoothly,
and not upside down. So you can use the code, and if you want to make
some of it shared I'm happy to contribute to something common. I've
pushed a few fixes to the shader code over the last few days now that
I have some piglit coverage.

> We already have code to push OpenGL calls to the graphics card on the host
> system, so this would just be an additional layer of indirection. Regarding
> texture to pixmap, I was thinking of making our DDX use this interface too
> for creating and manipulating pixmaps.  All buffers created by a VirtualBox
> guest with accelerated 3D are just buffers created on the host by the
> VirtualBox process, and DDX pixmaps would then also be host buffers, so
> interchangeable with textures.  (Hope that makes sense.  It is getting
> rather late here...)

I suspect the main overhead will then be that pushing data transfers
through the shared memory adds an extra copy. This will hurt most for
vertex upload, which, at least in my testing with openarena, is where
most things slow down.


Re: [Mesa-dev] [PATCH 3/4] glsl ast_to_hir: Silence uninitialized variable warnings in the release build

2013-06-24 Thread Ian Romanick

On 06/22/2013 08:43 AM, Emil Velikov wrote:

Resolves the following gcc warnings

  warning: 'iface_type_name' may be used uninitialized in this function
  warning: 'var_mode' may be used uninitialized in this function

Note: The variables are initialised to UNKNOWN and ir_var_auto

Signed-off-by: Emil Velikov 


Reviewed-by: Ian Romanick 


---
  src/glsl/ast_to_hir.cpp | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index e918ade..a3d820f 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4174,6 +4174,8 @@ ast_interface_block::hir(exec_list *instructions,
var_mode = ir_var_uniform;
iface_type_name = "uniform";
 } else {
+  var_mode = ir_var_auto;
+  iface_type_name = "UNKNOWN";
assert(!"interface block layout qualifier not found!");
 }






Re: [Mesa-dev] [PATCH 4/4] glsl opt_flip_matrices: Silence unused variable warning in the release build

2013-06-24 Thread Ian Romanick

On 06/22/2013 08:43 AM, Emil Velikov wrote:

Resolves the following gcc warning

  opt_flip_matrices.cpp:84:32: warning: unused variable 'deref'

Signed-off-by: Emil Velikov 
---
  src/glsl/opt_flip_matrices.cpp | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/opt_flip_matrices.cpp b/src/glsl/opt_flip_matrices.cpp
index 497513f..be3ccf8 100644
--- a/src/glsl/opt_flip_matrices.cpp
+++ b/src/glsl/opt_flip_matrices.cpp
@@ -81,8 +81,8 @@ matrix_flipper::visit_enter(ir_expression *ir)

 if (mvp_transpose &&
 strcmp(mat_var->name, "gl_ModelViewProjectionMatrix") == 0) {
-  ir_dereference_variable *deref = 
ir->operands[0]->as_dereference_variable();
-  assert(deref && deref->var == mat_var);
+  assert(ir->operands[0]->as_dereference_variable() &&
+ ir->operands[0]->as_dereference_variable()->var == mat_var);


Rather than dipping into as_dereference_variable() twice, I'd surround 
both lines with a '#ifndef NDEBUG' block.




void *mem_ctx = ralloc_parent(ir);






Re: [Mesa-dev] [PATCH] gallium/hud: do not use free() for the free_query_data hook

2013-06-24 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Mon, Jun 24, 2013 at 6:44 PM, Brian Paul  wrote:
> That confuses Gallium's memory debugging code where CALLOC/MALLOC
> must be matched with FREE, not free().
> ---
>  src/gallium/auxiliary/hud/hud_cpu.c |   12 +++-
>  src/gallium/auxiliary/hud/hud_fps.c |   12 +++-
>  src/gallium/auxiliary/hud/hud_private.h |2 +-
>  3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_cpu.c 
> b/src/gallium/auxiliary/hud/hud_cpu.c
> index ce98115..cd20dee 100644
> --- a/src/gallium/auxiliary/hud/hud_cpu.c
> +++ b/src/gallium/auxiliary/hud/hud_cpu.c
> @@ -116,6 +116,12 @@ query_cpu_load(struct hud_graph *gr)
> }
>  }
>
> +static void
> +free_query_data(void *p)
> +{
> +   FREE(p);
> +}
> +
>  void
>  hud_cpu_graph_install(struct hud_pane *pane, unsigned cpu_index)
>  {
> @@ -144,7 +150,11 @@ hud_cpu_graph_install(struct hud_pane *pane, unsigned 
> cpu_index)
> }
>
> gr->query_new_value = query_cpu_load;
> -   gr->free_query_data = free;
> +
> +   /* Don't use free() as our callback as that messes up Gallium's
> +* memory debugger.  Use simple free_query_data() wrapper.
> +*/
> +   gr->free_query_data = free_query_data;
>
> info = gr->query_data;
> info->cpu_index = cpu_index;
> diff --git a/src/gallium/auxiliary/hud/hud_fps.c 
> b/src/gallium/auxiliary/hud/hud_fps.c
> index 80381f5..6e9be71 100644
> --- a/src/gallium/auxiliary/hud/hud_fps.c
> +++ b/src/gallium/auxiliary/hud/hud_fps.c
> @@ -60,6 +60,12 @@ query_fps(struct hud_graph *gr)
> }
>  }
>
> +static void
> +free_query_data(void *p)
> +{
> +   FREE(p);
> +}
> +
>  void
>  hud_fps_graph_install(struct hud_pane *pane)
>  {
> @@ -76,7 +82,11 @@ hud_fps_graph_install(struct hud_pane *pane)
> }
>
> gr->query_new_value = query_fps;
> -   gr->free_query_data = free;
> +
> +   /* Don't use free() as our callback as that messes up Gallium's
> +* memory debugger.  Use simple free_query_data() wrapper.
> +*/
> +   gr->free_query_data = free_query_data;
>
> hud_pane_add_graph(pane, gr);
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_private.h 
> b/src/gallium/auxiliary/hud/hud_private.h
> index 2b7d56b..1606ada 100644
> --- a/src/gallium/auxiliary/hud/hud_private.h
> +++ b/src/gallium/auxiliary/hud/hud_private.h
> @@ -42,7 +42,7 @@ struct hud_graph {
> char name[128];
> void *query_data;
> void (*query_new_value)(struct hud_graph *gr);
> -   void (*free_query_data)(void *ptr);
> +   void (*free_query_data)(void *ptr); /**< do not use ordinary free() */
>
> /* mutable variables */
> unsigned num_vertices;
> --
> 1.7.10.4
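
The reason for the small wrapper in the quoted patch can be sketched as follows. The DEMO_* macros and the counter are hypothetical stand-ins for a tracking allocator; they are not Gallium's actual CALLOC/FREE implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tracking allocator: counts live allocations so that a
 * mismatched plain free() would leave the count wrong.
 */
static int demo_live_allocations = 0;

static void *
demo_calloc(size_t n, size_t size)
{
   void *p = calloc(n, size);
   if (p)
      demo_live_allocations++;
   return p;
}

static void
demo_free(void *p)
{
   if (p) {
      demo_live_allocations--;
      free(p);
   }
}

#define DEMO_CALLOC(n, size) demo_calloc((n), (size))
#define DEMO_FREE(p)         demo_free(p)

/* Hypothetical cut-down version of struct hud_graph. */
struct demo_graph {
   void *query_data;
   void (*free_query_data)(void *ptr);
};

/* The callback must go through the tracking FREE, so it is wrapped in
 * a real function whose address can be stored in the struct.
 */
static void
demo_free_query_data(void *p)
{
   DEMO_FREE(p);
}
```

Storing plain free() in the callback would release the memory but bypass the bookkeeping, which is the mismatch the patch describes.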
>


Re: [Mesa-dev] [PATCH] r600g/compute: disable unused colorbuffer slots

2013-06-24 Thread Marek Olšák
Does setting R600_CONTEXT_FLUSH_AND_INV after
evergreen_emit_direct_dispatch help? I think we should always flush
the write caches at the end of CS and invalidate the read caches at
the beginning, not the other way around.

Marek

On Mon, Jun 24, 2013 at 7:25 PM, Tom Stellard  wrote:
> On Mon, Jun 24, 2013 at 03:31:50AM +0200, Marek Olšák wrote:
>> This might fix the lockups caused by colorbuffer flushes and it's generally
>> the right thing to do. Untested.
>
> Unfortunately, this doesn't fix the lockups on Cayman with VM enabled,
> but there are no regressions with it, so go ahead and commit.
>
> Tested-by: Tom Stellard 
>
>> ---
>>  src/gallium/drivers/r600/evergreen_compute.c | 13 -
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
>> b/src/gallium/drivers/r600/evergreen_compute.c
>> index c993c09..b4cb939 100644
>> --- a/src/gallium/drivers/r600/evergreen_compute.c
>> +++ b/src/gallium/drivers/r600/evergreen_compute.c
>> @@ -408,7 +408,8 @@ static void compute_emit_cs(struct r600_context *ctx, 
>> const uint *block_layout,
>>   r600_flush_emit(ctx);
>>
>>   /* Emit colorbuffers. */
>> - for (i = 0; i < ctx->framebuffer.state.nr_cbufs; i++) {
>> + /* XXX support more than 8 colorbuffers (the offsets are not a 
>> multiple of 0x3C for CB8-11) */
>> + for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
>>   struct r600_surface *cb = (struct 
>> r600_surface*)ctx->framebuffer.state.cbufs[i];
>>   unsigned reloc = r600_context_bo_reloc(ctx, &ctx->rings.gfx,
>>  (struct 
>> r600_resource*)cb->base.texture,
>> @@ -434,6 +435,16 @@ static void compute_emit_cs(struct r600_context *ctx, 
>> const uint *block_layout,
>>   r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); /* 
>> R_028C74_CB_COLOR0_ATTRIB */
>>   r600_write_value(cs, reloc);
>>   }
>> + if (ctx->keep_tiling_flags) {
>> + for (; i < 8 ; i++) {
>> + r600_write_compute_context_reg(cs, 
>> R_028C70_CB_COLOR0_INFO + i * 0x3C,
>> +
>> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
>> + }
>> + for (; i < 12; i++) {
>> + r600_write_compute_context_reg(cs, 
>> R_028E50_CB_COLOR8_INFO + (i - 8) * 0x1C,
>> +
>> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
>> + }
>> + }
>>
>>   /* Set CB_TARGET_MASK  XXX: Use cb_misc_state */
>>   r600_write_compute_context_reg(cs, R_028238_CB_TARGET_MASK,
>> --
>> 1.8.1.2
>>


[Mesa-dev] [PATCH 1/2] i965/blorp: Add bilinear filtering of samples for multisample scaled blits

2013-06-24 Thread Anuj Phogat
The current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Nearest
filtering produces blocky artifacts and negates the benefits of MSAA,
which is why the extension was not enabled on i965.

This patch implements bilinear filtering of samples in the blorp engine.
Images generated with this patch are free from blocky artifacts and show
a big improvement in visual quality.

Observed no piglit and gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

Signed-off-by: Anuj Phogat 
---
  Made a small fix in setting up the interpolator value on top and bottom
  edges in Y direction.

 src/mesa/drivers/dri/i965/brw_blorp.h|  11 ++
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 257 +--
 2 files changed, 258 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index ffc27cc..0a15b89 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -319,6 +319,17 @@ struct brw_blorp_blit_prog_key
 * than one sample per pixel.
 */
bool persample_msaa_dispatch;
+
+   /* True for scaled blitting. */
+   bool blit_scaled;
+
+   /* Source rectangle dimensions. Used to test boundary conditions in shader
+* program.
+*/
+   float src_x0;
+   float src_y0;
+   float src_x1;
+   float src_y1;
 };
 
 class brw_blorp_blit_params : public brw_blorp_params
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 8694128..d816efc 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -622,7 +622,8 @@ private:
void kill_if_outside_dst_rect();
void translate_dst_to_src();
void single_to_blend();
-   void manual_blend(unsigned num_samples);
+   void manual_blend_average(unsigned num_samples);
+   void manual_blend_linear(unsigned num_samples);
void sample(struct brw_reg dst);
void texel_fetch(struct brw_reg dst);
void mcs_fetch();
@@ -676,6 +677,16 @@ private:
 */
struct brw_reg y_coords[2];
 
+   /* X, Y coordinates of the pixel from which we need to fetch the specific
+*  sample. These are used for multisample scaled blitting.
+*/
+   struct brw_reg x_sample_coords;
+   struct brw_reg y_sample_coords;
+
+   /* Store the values to use to interpolate in x and y directions */
+   struct brw_reg x_lerp;
+   struct brw_reg y_lerp;
+
/* Which element of x_coords and y_coords is currently in use.
 */
int xy_coord_index;
@@ -814,15 +825,17 @@ brw_blorp_blit_program::compile(struct brw_context *brw,
 * that we want to texture from.  Exception: if we are blending, then S is
 * irrelevant, because we are going to fetch all samples.
 */
-   if (key->blend) {
+   if (key->blend && !key->blit_scaled) {
   if (brw->intel.gen == 6) {
  /* Gen6 hardware an automatically blend using the SAMPLE message */
  single_to_blend();
  sample(texture_data[0]);
   } else {
  /* Gen7+ hardware doesn't automaticaly blend. */
- manual_blend(key->src_samples);
+ manual_blend_average(key->src_samples);
   }
+   } else if(key->blend && key->blit_scaled) {
+  manual_blend_linear(key->src_samples);
} else {
   /* We aren't blending, which means we just want to fetch a single sample
* from the source surface.  The address that we want to fetch from is
@@ -913,6 +926,18 @@ brw_blorp_blit_program::alloc_regs()
  = retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD);
   reg += 2;
}
+
+   if (key->blit_scaled && key->blend) {
+  this->x_sample_coords = brw_vec8_grf(reg, 0);
+  reg += 2;
+  this->y_sample_coords = brw_vec8_grf(reg, 0);
+  reg += 2;
+  this->x_lerp = brw_vec8_grf(reg, 0);
+  reg += 2;
+  this->y_lerp = brw_vec8_grf(reg, 0);
+  reg += 2;
+   }
+
this->xy_coord_index = 0;
this->sample_index
   = retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD);
@@ -1368,11 +1393,82 @@ brw_blorp_blit_program::translate_dst_to_src()
brw_MUL(&func, Y_f, Yp_f, y_transform.multiplier);
brw_ADD(&func, X_f, X_f, x_transform.offset);
brw_ADD(&func, Y_f, Y_f, y_transform.offset);
-   /* Round the float coordinates down to nearest integer by moving to
-* UD registers.
-*/
-   brw_MOV(&func, Xp, X_f);
-   brw_MOV(&func, Yp, Y_f);
+   if (key->blit_scaled && key->blend) {
+  float x_scale = 2.0;
+  float y_scale = key->src_samples / 2.0;
+  /* Translate coordinates to lay out the samples in a rectangular  grid
+   * roughly corresponding to sample locations.
+   */
+  brw_ADD(&func, X_f, X_f, brw_imm_f(-0.25));
+  brw_ADD(&func, Y_f, Y_f, brw_imm_f(-1.0 / ke

[Mesa-dev] [PATCH] i965: Be more careful with the interleaved user array upload optimization

2013-06-24 Thread Ian Romanick
From: Ian Romanick 

The checks to determine when the data can be uploaded in an interleaved
fashion can be tricked by certain data layouts.  For example,

float data[...];

glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[0]);
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[4]);
glDrawArrays(GL_POINTS, 0, 1);

will hit the interleaved path with an incorrect size (16 bytes instead
of 32 bytes).  As a result, the data for attribute 1 never gets
uploaded.  The single element draw case is the only sensible case I can
think of for non-interleaved-that-looks-like-interleaved data, but there
may be others as well.

To fix this, make sure that the end of the element in the array being
checked is within the stride "window."  Previously the code would check
that the beginning of the element was within the window.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick 
Cc: Kenneth Graunke 
Cc: Eric Anholt 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 2ded14b..d19250b 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -506,8 +506,22 @@ static void brw_prepare_vertices(struct brw_context *brw)
ptr = glarray->Ptr;
 }
 else if (interleaved != glarray->StrideB ||
- (uintptr_t)(glarray->Ptr - ptr) > interleaved)
+  (uintptr_t)(glarray->Ptr - ptr) + glarray->_ElementSize > interleaved)
 {
+/* If the stride is different or if the stride doesn't cover the
+ * entire range of the data element, disable the interleaved
+ * upload optimization.  The second case can most commonly occur
+ * in cases where there is a single vertex and, for example, the
+ * data is stored on the application's stack.
+ *
+ * NOTE: This will also disable the optimization in cases where
+ * the data is in a different order than the array indices.
+ * Something like:
+ *
+ * float data[...];
+ * glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[4]);
+ * glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[0]);
+ */
interleaved = 0;
 }
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vl/mpeg12: implement inverse scan/quantization steps

2013-06-24 Thread Christian König

On 24.06.2013 18:39, Ilia Mirkin wrote:
> On Mon, Jun 24, 2013 at 4:48 AM, Christian König
>  wrote:
>> On 23.06.2013 18:59, Ilia Mirkin wrote:
>>
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>
>>> These changes make MPEG2 I-frames generate the correct macroblock data (as
>>> compared to mplayer via xvmc). Other MPEG2 frames are still misparsed, and
>>> MPEG1 I-frames have some errors (but largely match up).
>>
>>
>> NAK, zscan and mismatch handling are handled in vl/vl_zscan.c.
>>
>> Please use/fix that one instead of adding another implementation.
>
> Yes, I noticed these after Andy pointed out that my patch broke things
> for him. Here's my situation, perhaps you can advise on how to
> proceed:
>
> NVIDIA VP2 hardware (NV84-NV96, NVA0) doesn't do bitstream parsing,
> but it can take the macroblocks and render them. When I use my
> implementation with xvmc, everything works fine. If I try to use vdpau
> by using vl_mpeg12_bitstream to parse the bitstream, the data comes
> out all wrong. It appears that decode_macroblock is called with data
> before inverse z-scan and quantization, while mplayer pushes data to
> xvmc after those steps. So should I basically have a bit of logic in
> my decode_macroblock impl that says "if using mpeg12_bitstream then do
> some more work on this data"? Or what data should decode_macroblock
> expect to receive?


Yes exactly, for the bitstream case decode_macroblock gets the blocks in 
original zscan order without mismatch correction or inverse quantization.


You can either do the missing steps on the GPU with shaders or on the 
CPU while uploading the data, and use the entrypoint member on the 
decoder to distinguish between the different use cases.
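
A minimal sketch of the CPU-side option, assuming a decoder that knows its entrypoint. The enum, function names and block layout here are hypothetical and are not the vl or nouveau API; only the inverse zigzag scan is shown, while inverse quantization and mismatch correction are left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the decoder's entrypoint member. */
enum sketch_entrypoint {
   SKETCH_ENTRYPOINT_BITSTREAM, /* blocks arrive in zigzag scan order */
   SKETCH_ENTRYPOINT_IDCT       /* blocks arrive already reordered */
};

/* Build the classic 8x8 zigzag table: scan[i] is the raster position
 * (row * 8 + column) of the i-th coefficient in scan order.
 */
static void
build_zigzag(uint8_t scan[64])
{
   int x = 0, y = 0, i;

   for (i = 0; i < 64; i++) {
      scan[i] = (uint8_t)(y * 8 + x);
      if ((x + y) % 2 == 0) {
         /* walking up and to the right */
         if (x == 7)
            y++;
         else if (y == 0)
            x++;
         else {
            x++;
            y--;
         }
      } else {
         /* walking down and to the left */
         if (y == 7)
            x++;
         else if (x == 0)
            y++;
         else {
            x--;
            y++;
         }
      }
   }
}

/* Reorder one block only when it came from the bitstream parser. */
static void
prepare_block(enum sketch_entrypoint entrypoint,
              const int16_t in[64], int16_t out[64])
{
   uint8_t scan[64];
   int i;

   if (entrypoint != SKETCH_ENTRYPOINT_BITSTREAM) {
      memcpy(out, in, 64 * sizeof(int16_t));
      return;
   }

   build_zigzag(scan);
   for (i = 0; i < 64; i++)
      out[scan[i]] = in[i];
}
```

The point of the sketch is only the dispatch: the xvmc path passes data through unchanged, while the bitstream path does the extra work before upload.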


Christian.


Re: [Mesa-dev] [PATCH] r600g: Include SH and SMX when invalidating read caches

2013-06-24 Thread Tom Stellard
On Sun, Jun 23, 2013 at 05:56:15PM -0400, Alex Deucher wrote:
> On Sun, Jun 23, 2013 at 2:24 PM, Marek Olšák  wrote:
> > Hi Alex,
> >
> > rctx->framebuffer.state.nr_cbufs might not contain what you think it
> > does, because the framebuffer that needs flushing may have been
> > replaced by a new framebuffer and the cache flushing of the old
> > framebuffer usually takes place before the first draw to the new
> > framebuffer. To solve this, we can either set all CB bits, or move
> > setting CP_COHER_CNTL outside of r600_flush_emit.
> 
> I think it should be ok to just set all the CB bits.  My only concern
> was whether there would be a problem with flushing unbound CBs.  I
> think it should be ok.  Martin, can you try the attached patch?
> 

I have tested this patch as well as Martin's patch and neither patch
introduces any regressions for compute on Cayman.

-Tom

> Alex
> 
> >
> > Marek
> >
> > On Sun, Jun 23, 2013 at 7:41 PM, Alex Deucher  wrote:
> >> On Sat, Jun 22, 2013 at 11:53 AM, Martin Andersson  
> >> wrote:
> >>> On Sat, Jun 22, 2013 at 12:22 PM, Marek Olšák  wrote:
>  Reviewed-by: Marek Olšák 
> 
>  BTW, SMX is a write cache, so maybe it shouldn't be part of this patch.
> >>>
> >>> I made a little experiment where i ran
> >>> "ext_framebuffer_multisample-unaligned-blit 4 color downsample -auto"
> >>> 1 times and found that without SMX the test failed 177 times and
> >>> with SMX it didn't fail at all. So I do think the SMX cache should be
> >>> invalidated somewhere.
> >>>
> >>> Before 
> >>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=4539f8e20af286d1f521eb016c89c6d9af0b801c
> >>> it was under R600_CONTEXT_FLUSH_AND_INV, is that a better place?
> >>>
> >>> If that is the proper place for SMX should SH also be there, since it
> >>> was also there before the patch, or do you have any other suggestions?
> >>
> >> Does something like this help?  You might play with some variants of
> >> this patch.  This might break compute however as Tom had some problems
> >> with CB flushes on cayman class hw in the past.
> >>
> >> Alex
> >>
> >>
> >>>
>  Marek
> 
>  On Sun, Jun 16, 2013 at 1:27 PM, Martin Andersson  
>  wrote:
> > Not including the SH and SMX caches when invalidating read caches causes
> > random failures on some piglit tests when VA is enabled.
> >
> > Since the failures are random, and there are other problems also causing
> > random failures, it's hard to know exactly which tests were affected, but
> > these tests now consistently pass:
> >
> > fast_color_clear/all-colors
> > fast_color_clear/redundant-clear
> > spec/!OpenGL 1.1/draw-pixels samples={2,4,6,8}
> > spec/!OpenGL 1.1/drawbuffer-modes
> > ---
> >  src/gallium/drivers/r600/r600_hw_context.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/src/gallium/drivers/r600/r600_hw_context.c 
> > b/src/gallium/drivers/r600/r600_hw_context.c
> > index 944b666..df20e56 100644
> > --- a/src/gallium/drivers/r600/r600_hw_context.c
> > +++ b/src/gallium/drivers/r600/r600_hw_context.c
> > @@ -231,6 +231,8 @@ void r600_flush_emit(struct r600_context *rctx)
> > if (rctx->flags & R600_CONTEXT_INVAL_READ_CACHES) {
> > cp_coher_cntl |= S_0085F0_VC_ACTION_ENA(1) |
> > S_0085F0_TC_ACTION_ENA(1) |
> > +   S_0085F0_SH_ACTION_ENA(1) |
> > +   S_0085F0_SMX_ACTION_ENA(1) |
> > S_0085F0_FULL_CACHE_ENA(1);
> > emit_flush = 1;
> > }
> > --
> > 1.8.3
> >

> From 040284d2d3d01b3a2181a024afab8cfa1a5143d2 Mon Sep 17 00:00:00 2001
> From: Alex Deucher 
> Date: Sun, 23 Jun 2013 13:36:42 -0400
> Subject: [PATCH] r600g: adjust flush flags (v2)
> 
> 1. flush SH with read caches
> 2. add flag for DB flushes
> 3. add flag for CB flushes
> 
> v2: flush all CBs, remove redundant emit_state variable.
> 
> Signed-off-by: Alex Deucher 
> ---
>  src/gallium/drivers/r600/evergreen_state.c |2 +
>  src/gallium/drivers/r600/r600_hw_context.c |   30 ---
>  src/gallium/drivers/r600/r600_pipe.h   |2 +
>  src/gallium/drivers/r600/r600_state.c  |2 +
>  4 files changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
> b/src/gallium/drivers/r600/evergreen_state.c
> index 3ebb157..59fa412 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.

Re: [Mesa-dev] [PATCH] r600g/compute: disable unused colorbuffer slots

2013-06-24 Thread Tom Stellard
On Mon, Jun 24, 2013 at 03:31:50AM +0200, Marek Olšák wrote:
> This might fix the lockups caused by colorbuffer flushes and it's generally
> the right thing to do. Untested.

Unfortunately, this doesn't fix the lockups on Cayman with VM enabled,
but there are no regressions with it, so go ahead and commit.

Tested-by: Tom Stellard 
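
For reference, the register arithmetic used by the quoted patch can be sketched as below. This assumes that the numeric value of each register define equals the hex offset embedded in its name, as the naming convention suggests; cb_color_info_reg() is a hypothetical helper and not part of the driver.

```c
/* Assumed values: the hex offset embedded in each register name. */
#define R_028C70_CB_COLOR0_INFO 0x028C70
#define R_028E50_CB_COLOR8_INFO 0x028E50

/* Return the CB_COLORn_INFO register offset for colorbuffer slot i (0..11).
 * Slots 0-7 are spaced 0x3C apart, slots 8-11 are spaced 0x1C apart, which
 * is why a single "base + i * 0x3C" expression cannot cover all slots.
 */
static unsigned
cb_color_info_reg(unsigned i)
{
   if (i < 8)
      return R_028C70_CB_COLOR0_INFO + i * 0x3C;

   return R_028E50_CB_COLOR8_INFO + (i - 8) * 0x1C;
}
```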

> ---
>  src/gallium/drivers/r600/evergreen_compute.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index c993c09..b4cb939 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -408,7 +408,8 @@ static void compute_emit_cs(struct r600_context *ctx, 
> const uint *block_layout,
>   r600_flush_emit(ctx);
>  
>   /* Emit colorbuffers. */
> - for (i = 0; i < ctx->framebuffer.state.nr_cbufs; i++) {
> + /* XXX support more than 8 colorbuffers (the offsets are not a multiple 
> of 0x3C for CB8-11) */
> + for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
>   struct r600_surface *cb = (struct 
> r600_surface*)ctx->framebuffer.state.cbufs[i];
>   unsigned reloc = r600_context_bo_reloc(ctx, &ctx->rings.gfx,
>  (struct 
> r600_resource*)cb->base.texture,
> @@ -434,6 +435,16 @@ static void compute_emit_cs(struct r600_context *ctx, 
> const uint *block_layout,
>   r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); /* 
> R_028C74_CB_COLOR0_ATTRIB */
>   r600_write_value(cs, reloc);
>   }
> + if (ctx->keep_tiling_flags) {
> + for (; i < 8 ; i++) {
> + r600_write_compute_context_reg(cs, 
> R_028C70_CB_COLOR0_INFO + i * 0x3C,
> +
> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
> + }
> + for (; i < 12; i++) {
> + r600_write_compute_context_reg(cs, 
> R_028E50_CB_COLOR8_INFO + (i - 8) * 0x1C,
> +
> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
> + }
> + }
>  
>   /* Set CB_TARGET_MASK  XXX: Use cb_misc_state */
>   r600_write_compute_context_reg(cs, R_028238_CB_TARGET_MASK,
> -- 
> 1.8.1.2
> 


[Mesa-dev] [PATCH] gallium/hud: do not use free() for the free_query_data hook

2013-06-24 Thread Brian Paul
That confuses Gallium's memory debugging code where CALLOC/MALLOC
must be matched with FREE, not free().
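
To illustrate why the callback matters, here is a small standalone sketch. The allocation counter and the CALLOC/FREE definitions are simplified, hypothetical stand-ins for Gallium's memory debugging macros, and struct hud_graph_sketch only mimics the relevant members of struct hud_graph.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a debugging allocator: it only
 * counts live allocations.
 */
static long live_allocations;

static void *
debug_calloc(size_t count, size_t size)
{
   void *p = calloc(count, size);
   if (p)
      live_allocations++;
   return p;
}

static void
debug_free(void *p)
{
   if (p) {
      live_allocations--;
      free(p);
   }
}

#define CALLOC(count, size) debug_calloc(count, size)
#define FREE(p) debug_free(p)

/* Mimics the two hud_graph members touched by the patch. */
struct hud_graph_sketch {
   void *query_data;
   void (*free_query_data)(void *ptr);
};

/* Wrapper with a plain function signature, so that it can be stored in a
 * function pointer while still going through FREE().
 */
static void
free_query_data(void *p)
{
   FREE(p);
}
```

If the callback pointed at plain free() instead, the memory would still be released, but the debugging allocator would never see the matching FREE and would keep counting the block as live.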
---
 src/gallium/auxiliary/hud/hud_cpu.c |   12 +++-
 src/gallium/auxiliary/hud/hud_fps.c |   12 +++-
 src/gallium/auxiliary/hud/hud_private.h |2 +-
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_cpu.c b/src/gallium/auxiliary/hud/hud_cpu.c
index ce98115..cd20dee 100644
--- a/src/gallium/auxiliary/hud/hud_cpu.c
+++ b/src/gallium/auxiliary/hud/hud_cpu.c
@@ -116,6 +116,12 @@ query_cpu_load(struct hud_graph *gr)
}
 }
 
+static void
+free_query_data(void *p)
+{
+   FREE(p);
+}
+
 void
 hud_cpu_graph_install(struct hud_pane *pane, unsigned cpu_index)
 {
@@ -144,7 +150,11 @@ hud_cpu_graph_install(struct hud_pane *pane, unsigned cpu_index)
}
 
gr->query_new_value = query_cpu_load;
-   gr->free_query_data = free;
+
+   /* Don't use free() as our callback as that messes up Gallium's
+* memory debugger.  Use simple free_query_data() wrapper.
+*/
+   gr->free_query_data = free_query_data;
 
info = gr->query_data;
info->cpu_index = cpu_index;
diff --git a/src/gallium/auxiliary/hud/hud_fps.c b/src/gallium/auxiliary/hud/hud_fps.c
index 80381f5..6e9be71 100644
--- a/src/gallium/auxiliary/hud/hud_fps.c
+++ b/src/gallium/auxiliary/hud/hud_fps.c
@@ -60,6 +60,12 @@ query_fps(struct hud_graph *gr)
}
 }
 
+static void
+free_query_data(void *p)
+{
+   FREE(p);
+}
+
 void
 hud_fps_graph_install(struct hud_pane *pane)
 {
@@ -76,7 +82,11 @@ hud_fps_graph_install(struct hud_pane *pane)
}
 
gr->query_new_value = query_fps;
-   gr->free_query_data = free;
+
+   /* Don't use free() as our callback as that messes up Gallium's
+* memory debugger.  Use simple free_query_data() wrapper.
+*/
+   gr->free_query_data = free_query_data;
 
hud_pane_add_graph(pane, gr);
 }
diff --git a/src/gallium/auxiliary/hud/hud_private.h b/src/gallium/auxiliary/hud/hud_private.h
index 2b7d56b..1606ada 100644
--- a/src/gallium/auxiliary/hud/hud_private.h
+++ b/src/gallium/auxiliary/hud/hud_private.h
@@ -42,7 +42,7 @@ struct hud_graph {
char name[128];
void *query_data;
void (*query_new_value)(struct hud_graph *gr);
-   void (*free_query_data)(void *ptr);
+   void (*free_query_data)(void *ptr); /**< do not use ordinary free() */
 
/* mutable variables */
unsigned num_vertices;
-- 
1.7.10.4



Re: [Mesa-dev] [PATCH] vl/mpeg12: implement inverse scan/quantization steps

2013-06-24 Thread Ilia Mirkin
On Mon, Jun 24, 2013 at 4:48 AM, Christian König
 wrote:
> On 23.06.2013 18:59, Ilia Mirkin wrote:
>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> These changes make MPEG2 I-frames generate the correct macroblock data (as
>> compared to mplayer via xvmc). Other MPEG2 frames are still misparsed, and
>> MPEG1 I-frames have some errors (but largely match up).
>
>
> NAK, zscan and mismatch handling are handled in vl/vl_zscan.c.
>
> Please use/fix that one instead of adding another implementation.

Yes, I noticed these after Andy pointed out that my patch broke things
for him. Here's my situation, perhaps you can advise on how to
proceed:

NVIDIA VP2 hardware (NV84-NV96, NVA0) doesn't do bitstream parsing,
but it can take the macroblocks and render them. When I use my
implementation with xvmc, everything works fine. If I try to use vdpau
by using vl_mpeg12_bitstream to parse the bitstream, the data comes
out all wrong. It appears that decode_macroblock is called with data
before inverse z-scan and quantization, while mplayer pushes data to
xvmc after those steps. So should I basically have a bit of logic in
my decode_macroblock impl that says "if using mpeg12_bitstream then do
some more work on this data"? Or what data should decode_macroblock
expect to receive?


Re: [Mesa-dev] [PATCH 2/2] util/debug: Cleanup/improve debug_symbol_name_dbghelp.

2013-06-24 Thread Brian Paul

On 06/24/2013 06:44 AM, jfons...@vmware.com wrote:
> From: José Fonseca 
>
> - use mgwhelp -- the successor for bfdhelp which does not have a hard
>   dependency on BFD, and works on 64bits.
> - use a macro instead of hand-typing to dispatch DbgHelp functions
> - dump line numbers
> - dump module names when symbols are not available
> - support 64bits.
> - add comments
> ---
>  src/gallium/auxiliary/util/u_debug_symbol.c | 239 +++-
>  1 file changed, 161 insertions(+), 78 deletions(-)


For both:

Reviewed-by: Brian Paul 


[Mesa-dev] [PATCH 2/2] util/debug: Cleanup/improve debug_symbol_name_dbghelp.

2013-06-24 Thread jfonseca
From: José Fonseca 

- use mgwhelp -- the successor to bfdhelp, which does not have a hard
  dependency on BFD and works on 64-bit builds.
- use a macro instead of hand-typing to dispatch DbgHelp functions
- dump line numbers
- dump module names when symbols are not available
- support 64-bit builds.
- add comments
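
The macro-based dispatch mentioned in the list above can be sketched portably as follows. This is not the code from the patch: lookup_symbol() is a hypothetical stand-in for getDbgHelpProcAddress(), add_impl() stands in for a function exported by the loaded library, and, unlike the posted macro, the function pointer typedef here uses _ret_type rather than a fixed BOOL.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Stand-in for a function exported by the dynamically loaded library. */
static int
add_impl(int a, int b)
{
   return a + b;
}

/* Hypothetical stand-in for getDbgHelpProcAddress(): resolve by name. */
static generic_fn
lookup_symbol(const char *name)
{
   if (strcmp(name, "add") == 0)
      return (generic_fn) add_impl;
   return NULL;
}

/* Generate a wrapper j_<name> that resolves the function on first use and
 * returns _ret_default when the symbol is not available.
 */
#define LAZY_DISPATCH(_name, _ret_type, _ret_default, _arg_types, _arg_names) \
   static _ret_type \
   j_##_name _arg_types \
   { \
      typedef _ret_type (*PFN) _arg_types; \
      static PFN pfn = NULL; \
      if (!pfn) { \
         pfn = (PFN) lookup_symbol(#_name); \
         if (!pfn) { \
            return _ret_default; \
         } \
      } \
      return pfn _arg_names; \
   }

LAZY_DISPATCH(add, int, -1, (int a, int b), (a, b))
LAZY_DISPATCH(missing, int, -1, (int a), (a))
```

Each wrapper costs one lookup on first use and a plain indirect call afterwards; a missing symbol degrades to the default return value instead of crashing.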
---
 src/gallium/auxiliary/util/u_debug_symbol.c | 239 +++-
 1 file changed, 161 insertions(+), 78 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug_symbol.c b/src/gallium/auxiliary/util/u_debug_symbol.c
index 0ef111c..307bccb 100644
--- a/src/gallium/auxiliary/util/u_debug_symbol.c
+++ b/src/gallium/auxiliary/util/u_debug_symbol.c
@@ -40,7 +40,8 @@
 #include "u_debug_symbol.h"
 #include "u_hash_table.h"
 
-#if defined(PIPE_OS_WINDOWS) && defined(PIPE_ARCH_X86)
+
+#if defined(PIPE_OS_WINDOWS)

 #include 
 #include 
@@ -48,139 +49,221 @@
 #include "dbghelp.h"
 
 
-static BOOL bSymInitialized = FALSE;
-
-static HMODULE hModule_Dbghelp = NULL;
+/**
+ * SymInitialize() must be called once for each process (in this case, the
+ * current process), before any of the other functions can be called.
+ */
+static BOOL g_bSymInitialized = FALSE;
 
 
-static
-FARPROC WINAPI __GetProcAddress(LPCSTR lpProcName)
+/**
+ * Lookup the address of a DbgHelp function.
+ */
+static FARPROC WINAPI
+getDbgHelpProcAddress(LPCSTR lpProcName)
 {
+   static HMODULE hModule = NULL;
+
+   if (!hModule) {
+  static boolean bail = FALSE;
+
+  if (bail) {
+ return NULL;
+  }
+
 #ifdef PIPE_CC_GCC
-   if (!hModule_Dbghelp) {
   /*
-   * bfdhelp.dll is a dbghelp.dll look-alike replacement, which is able to
-   * understand MinGW symbols using BFD library.  It is available from
+   * DbgHelp does not understand the debug information generated by MinGW toolchain.
+   *
+   * mgwhelp.dll is a dbghelp.dll look-alike replacement, which is able to
+   * understand MinGW symbols, including on 64-bit builds.
+   */
+  if (!hModule) {
+ hModule = LoadLibraryA("mgwhelp.dll");
+ if (!hModule) {
+_debug_printf("warning: mgwhelp.dll not found: symbol names will not be resolved\n"
+  "warning: download it from http://code.google.com/p/jrfonseca/wiki/DrMingw#MgwHelp\n");
+ }
+  }
+
+  /*
+   * bfdhelp.dll was the predecessor of mgwhelp.dll.  It is available from
* http://people.freedesktop.org/~jrfonseca/bfdhelp/ for now.
*/
-  hModule_Dbghelp = LoadLibraryA("bfdhelp.dll");
-   }
-#endif
+  if (!hModule) {
+ hModule = LoadLibraryA("bfdhelp.dll");
+  }
+   #endif
 
-   if (!hModule_Dbghelp) {
-  hModule_Dbghelp = LoadLibraryA("dbghelp.dll");
-  if (!hModule_Dbghelp) {
+  /*
+   * Fallback to the real DbgHelp.
+   */
+  if (!hModule) {
+ hModule = LoadLibraryA("dbghelp.dll");
+  }
+
+  if (!hModule) {
+ bail = TRUE;
  return NULL;
   }
}
 
-   return GetProcAddress(hModule_Dbghelp, lpProcName);
+   return GetProcAddress(hModule, lpProcName);
 }
 
 
-typedef BOOL (WINAPI *PFNSYMINITIALIZE)(HANDLE, LPSTR, BOOL);
-static PFNSYMINITIALIZE pfnSymInitialize = NULL;
+/**
+ * Generic macro to dispatch a DbgHelp function.
+ */
+#define DBGHELP_DISPATCH(_name, _ret_type, _ret_default, _arg_types, _arg_names) \
+   static _ret_type WINAPI \
+   j_##_name _arg_types \
+   { \
+  typedef BOOL (WINAPI *PFN) _arg_types; \
+  static PFN pfn = NULL; \
+  if (!pfn) { \
+ pfn = (PFN) getDbgHelpProcAddress(#_name); \
+ if (!pfn) { \
+return _ret_default; \
+ } \
+  } \
+  return pfn _arg_names; \
+   }
 
-static
-BOOL WINAPI j_SymInitialize(HANDLE hProcess, PSTR UserSearchPath, BOOL fInvadeProcess)
-{
-   if(
-  (pfnSymInitialize || (pfnSymInitialize = (PFNSYMINITIALIZE) __GetProcAddress("SymInitialize")))
-   )
-  return pfnSymInitialize(hProcess, UserSearchPath, fInvadeProcess);
-   else
-  return FALSE;
-}
+DBGHELP_DISPATCH(SymInitialize,
+ BOOL, 0,
+ (HANDLE hProcess, PSTR UserSearchPath, BOOL fInvadeProcess),
+ (hProcess, UserSearchPath, fInvadeProcess))
 
-typedef DWORD (WINAPI *PFNSYMSETOPTIONS)(DWORD);
-static PFNSYMSETOPTIONS pfnSymSetOptions = NULL;
+DBGHELP_DISPATCH(SymSetOptions,
+ DWORD, FALSE,
+ (DWORD SymOptions),
+ (SymOptions))
 
-static
-DWORD WINAPI j_SymSetOptions(DWORD SymOptions)
-{
-   if(
-  (pfnSymSetOptions || (pfnSymSetOptions = (PFNSYMSETOPTIONS) __GetProcAddress("SymSetOptions")))
-   )
-  return pfnSymSetOptions(SymOptions);
-   else
-  return FALSE;
-}
+DBGHELP_DISPATCH(SymFromAddr,
+ BOOL, FALSE,
+ (HANDLE hProcess, DWORD64 Address, PDWORD64 Displacement, PSYMBOL_INFO Symbol),
+ (hProcess, Address, Displacement, Symbol))
 
-typedef BOOL (WINAPI *PFNSYMGET

[Mesa-dev] [PATCH 1/2] util/debug: Make debug_backtrace_capture work for 64bit windows.

2013-06-24 Thread jfonseca
From: José Fonseca 

Rely on Windows' CaptureStackBackTrace to do the grunt work.
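
The Windows path in this patch hands the backtrace array directly to the capture function as an array of pointers, which only works while struct debug_stack_frame is nothing more than a single pointer. A portable sketch of that assumption follows; fake_capture() is a hypothetical stand-in for CaptureStackBackTrace that pretends only two frames are available, and the cast relies on the same layout assumption as the patch rather than on anything the C standard guarantees.

```c
#include <stddef.h>

struct debug_stack_frame {
   const void *function;
};

/* The capture call below treats the array of frames as an array of
 * pointers, so the struct must not grow.
 */
_Static_assert(sizeof(struct debug_stack_frame) == sizeof(void *),
               "debug_stack_frame must be layout-compatible with void *");

/* Hypothetical stand-in for CaptureStackBackTrace: writes up to `count`
 * addresses into `out` and returns how many were written.
 */
static unsigned short
fake_capture(unsigned long skip, unsigned long count, void **out)
{
   static int dummy[2];
   unsigned long available = 2;
   unsigned long n = 0;

   (void) skip;
   while (n < count && n < available) {
      out[n] = &dummy[n];
      n++;
   }
   return (unsigned short) n;
}

static void
capture_sketch(struct debug_stack_frame *backtrace, unsigned nr_frames)
{
   unsigned i = fake_capture(1, nr_frames, (void **) &backtrace->function);

   /* Pad the frames that could not be captured with NULL. */
   while (i < nr_frames) {
      backtrace[i++].function = NULL;
   }
}
```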
---
 src/gallium/auxiliary/util/u_debug_stack.c | 56 --
 src/gallium/auxiliary/util/u_debug_stack.h |  7 
 2 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug_stack.c b/src/gallium/auxiliary/util/u_debug_stack.c
index 50a248a..68961d3 100644
--- a/src/gallium/auxiliary/util/u_debug_stack.c
+++ b/src/gallium/auxiliary/util/u_debug_stack.c
@@ -36,7 +36,17 @@
 #include "u_debug_symbol.h"
 #include "u_debug_stack.h"
 
+#if defined(PIPE_OS_WINDOWS)
+#include 
+#endif
+
 
+/**
+ * Capture stack backtrace.
+ *
+ * NOTE: The implementation of this function is quite big, but it is important not to
+ * break it down in smaller functions to avoid adding new frames to the calling stack.
+ */
 void
 debug_backtrace_capture(struct debug_stack_frame *backtrace,
 unsigned start_frame, 
@@ -45,8 +55,50 @@ debug_backtrace_capture(struct debug_stack_frame *backtrace,
const void **frame_pointer = NULL;
unsigned i = 0;
 
-   if(!nr_frames)
+   if (!nr_frames) {
   return;
+   }
+
+   /*
+* On Windows try obtaining the stack backtrace via CaptureStackBackTrace.
+*
+* It works reliably for both x86 and x86_64.
+*/
+#if defined(PIPE_OS_WINDOWS)
+   {
+  typedef USHORT (WINAPI *PFNCAPTURESTACKBACKTRACE)(ULONG, ULONG, PVOID *, PULONG);
+  static PFNCAPTURESTACKBACKTRACE pfnCaptureStackBackTrace = NULL;
+
+  if (!pfnCaptureStackBackTrace) {
+ static HMODULE hModule = NULL;
+ if (!hModule) {
+hModule = LoadLibraryA("kernel32");
+assert(hModule);
+ }
+ if (hModule) {
+pfnCaptureStackBackTrace = (PFNCAPTURESTACKBACKTRACE)GetProcAddress(hModule,
+   "RtlCaptureStackBackTrace");
+ }
+  }
+  if (pfnCaptureStackBackTrace) {
+ /*
+  * Skip this (debug_backtrace_capture) function's frame.
+  */
+
+ start_frame += 1;
+
+ assert(start_frame + nr_frames < 63);
+ i = pfnCaptureStackBackTrace(start_frame, nr_frames, (PVOID *) &backtrace->function, NULL);
+
+ /* Pad remaining requested frames with NULL */
+ while (i < nr_frames) {
+backtrace[i++].function = NULL;
+ }
+
+ return;
+  }
+   }
+#endif
 
 #if defined(PIPE_CC_GCC)
frame_pointer = ((const void **)__builtin_frame_address(1));
@@ -86,7 +138,7 @@ debug_backtrace_capture(struct debug_stack_frame *backtrace,
 #else
(void) frame_pointer;
 #endif
-   
+
while(nr_frames) {
   backtrace[i++].function = NULL;
   --nr_frames;
diff --git a/src/gallium/auxiliary/util/u_debug_stack.h b/src/gallium/auxiliary/util/u_debug_stack.h
index f50f04e..b1848dd 100644
--- a/src/gallium/auxiliary/util/u_debug_stack.h
+++ b/src/gallium/auxiliary/util/u_debug_stack.h
@@ -42,6 +42,13 @@ extern "C" {
 #endif
 
 
+/**
+ * Represent a frame from a stack backtrace.
+ *
+ * XXX: Do not change this.
+ *
+ * TODO: This should be refactored as a void * typedef.
+ */
 struct debug_stack_frame 
 {
const void *function;
-- 
1.8.1.2



Re: [Mesa-dev] [PATCH] r600g/compute: disable unused colorbuffer slots

2013-06-24 Thread Alex Deucher
On Sun, Jun 23, 2013 at 9:31 PM, Marek Olšák  wrote:
> This might fix the lockups caused by colorbuffer flushes and it's generally
> the right thing to do. Untested.

Looks good to me.

Reviewed-by: Alex Deucher 


> ---
>  src/gallium/drivers/r600/evergreen_compute.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index c993c09..b4cb939 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -408,7 +408,8 @@ static void compute_emit_cs(struct r600_context *ctx, 
> const uint *block_layout,
> r600_flush_emit(ctx);
>
> /* Emit colorbuffers. */
> -   for (i = 0; i < ctx->framebuffer.state.nr_cbufs; i++) {
> +   /* XXX support more than 8 colorbuffers (the offsets are not a 
> multiple of 0x3C for CB8-11) */
> +   for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
> struct r600_surface *cb = (struct 
> r600_surface*)ctx->framebuffer.state.cbufs[i];
> unsigned reloc = r600_context_bo_reloc(ctx, &ctx->rings.gfx,
>(struct 
> r600_resource*)cb->base.texture,
> @@ -434,6 +435,16 @@ static void compute_emit_cs(struct r600_context *ctx, 
> const uint *block_layout,
> r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); /* 
> R_028C74_CB_COLOR0_ATTRIB */
> r600_write_value(cs, reloc);
> }
> +   if (ctx->keep_tiling_flags) {
> +   for (; i < 8 ; i++) {
> +   r600_write_compute_context_reg(cs, 
> R_028C70_CB_COLOR0_INFO + i * 0x3C,
> +  
> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
> +   }
> +   for (; i < 12; i++) {
> +   r600_write_compute_context_reg(cs, 
> R_028E50_CB_COLOR8_INFO + (i - 8) * 0x1C,
> +  
> S_028C70_FORMAT(V_028C70_COLOR_INVALID));
> +   }
> +   }
>
> /* Set CB_TARGET_MASK  XXX: Use cb_misc_state */
> r600_write_compute_context_reg(cs, R_028238_CB_TARGET_MASK,
> --
> 1.8.1.2
>


Re: [Mesa-dev] [PATCH] st/mesa: handle SNORM formats in generic CopyPixels path

2013-06-24 Thread Jose Fonseca
Yes, that's a very good idea.  The implementation would be fast and its body 
small, so it could be an inline function in u_format.h, which would also let 
the compiler coalesce multiple calls.

It implies that the u_format_table.py script needs to parse 
src/gallium/include/pipe/p_format.h to obtain the actual values of PIPE_FORMAT 
enums, so it knows which format goes on which entry of the table, but that's 
certainly doable.

Fair enough. Ignore my comment about util_format_description() for now, and 
we'll do this in a follow-on change.

Jose

- Original Message -
> Wouldn't it be easier to just make util_format_description table-based
> and inline instead of switch-based? Something like:
> 
> return format < PIPE_FORMAT_COUNT ? table[format] : NULL;
> 
> Marek
> 
> On Mon, Jun 24, 2013 at 10:12 AM, Jose Fonseca  wrote:
> >
> >
> > - Original Message -
> >> ---
> >>  src/gallium/auxiliary/util/u_format.c | 14 ++
> >>  src/gallium/auxiliary/util/u_format.h |  3 +++
> >>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++
> >>  3 files changed, 23 insertions(+)
> >>
> >> diff --git a/src/gallium/auxiliary/util/u_format.c
> >> b/src/gallium/auxiliary/util/u_format.c
> >> index 9bdc2ea..b70c108 100644
> >> --- a/src/gallium/auxiliary/util/u_format.c
> >> +++ b/src/gallium/auxiliary/util/u_format.c
> >> @@ -131,6 +131,20 @@ util_format_is_pure_uint(enum pipe_format format)
> >> return (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED &&
> >> desc->channel[i].pure_integer) ? TRUE : FALSE;
> >>  }
> >>
> >> +boolean
> >> +util_format_is_snorm(enum pipe_format format)
> >> +{
> >> +   const struct util_format_description *desc =
> >> util_format_description(format);
> >> +   int i;
> >> +
> >> +   i = util_format_get_first_non_void_channel(format);
> >> +   if (i == -1)
> >> +  return FALSE;
> >> +
> >> +   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
> >> +  !desc->channel[i].pure_integer &&
> >> +  desc->channel[i].normalized;
> >> +}
> >
> > This will give wrong results for mixed formats -- containing a mixture of
> > SIGNED and UNSIGNED normalized types. You can avoid that by adding this
> > statement
> >
> >   if (desc->is_mixed) return FALSE at the start.
> >
> > I think a comment would be good too -- something like "Returns true if all
> > non-void channels are normalized signed."
> >
> > This is not your fault, but it is disappointing to see the proliferation of
> > redundant util_format_description() calls in these helpers.
> > util_format_description is not cost free, and st_CopyPixels and friends
> > keep calling it over and over again. I don't think it is hard to call
> > util_format_description() once, and then pass the format_desc pointers
> > around, but once we start on the other route is hard to back out. But as
> > they say -- when you're in a hole, the first thing to do is stop digging
> > --, so I think we should just stop adding new helper functions to util
> > u_format that take enum format instead of util_format_description.
> >
> > Jose
> >
> >>
> >>  boolean
> >>  util_format_is_luminance_alpha(enum pipe_format format)
> >> diff --git a/src/gallium/auxiliary/util/u_format.h
> >> b/src/gallium/auxiliary/util/u_format.h
> >> index 4cace6a..ccb7f92 100644
> >> --- a/src/gallium/auxiliary/util/u_format.h
> >> +++ b/src/gallium/auxiliary/util/u_format.h
> >> @@ -622,6 +622,9 @@ util_format_is_pure_sint(enum pipe_format format);
> >>  boolean
> >>  util_format_is_pure_uint(enum pipe_format format);
> >>
> >> +boolean
> >> +util_format_is_snorm(enum pipe_format format);
> >> +
> >>  /**
> >>   * Check if the src format can be blitted to the destination format with
> >>   * a simple memcpy.  For example, blitting from RGBA to RGBx is OK, but
> >>   not
> >> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
> >> b/src/mesa/state_tracker/st_cb_drawpixels.c
> >> index 38563d5..0b5198a 100644
> >> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> >> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> >> @@ -1549,6 +1549,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx,
> >> GLint
> >> srcy,
> >>
> >> if (!screen->is_format_supported(screen, srcFormat,
> >> st->internal_target,
> >> 0,
> >>  srcBind)) {
> >> +  /* srcFormat is non-renderable. Find a compatible renderable
> >> format.
> >> */
> >>if (type == GL_DEPTH) {
> >>   srcFormat = st_choose_format(st, GL_DEPTH_COMPONENT, GL_NONE,
> >>GL_NONE, st->internal_target, 0,
> >> @@ -1572,6 +1573,11 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx,
> >> GLint srcy,
> >>   GL_NONE, st->internal_target, 0,
> >>   srcBind, FALSE);
> >>   }
> >> + else if (util_format_is_snorm(srcFormat)) {
> >> +srcFormat = st_c

Re: [Mesa-dev] [PATCH] st/mesa: handle SNORM formats in generic CopyPixels path

2013-06-24 Thread Marek Olšák
Wouldn't it be easier to just make util_format_description table-based
and inline instead of switch-based? Something like:

return format < PIPE_FORMAT_COUNT ? table[format] : NULL;
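
A slightly fuller sketch of the idea, using a made-up three-entry format enum instead of the real PIPE_FORMAT list. All names are hypothetical; in the real tree the table would be generated from the format descriptions.

```c
#include <stddef.h>

/* Hypothetical miniature format enum; the real one is enum pipe_format. */
enum sketch_format {
   SKETCH_FORMAT_NONE,
   SKETCH_FORMAT_R8_UNORM,
   SKETCH_FORMAT_R8G8B8A8_SNORM,
   SKETCH_FORMAT_COUNT
};

struct sketch_format_description {
   const char *name;
   unsigned block_bits;
};

/* One entry per enum value, indexed directly by the format. */
static const struct sketch_format_description
sketch_format_table[SKETCH_FORMAT_COUNT] = {
   [SKETCH_FORMAT_NONE]           = { "none", 0 },
   [SKETCH_FORMAT_R8_UNORM]       = { "r8_unorm", 8 },
   [SKETCH_FORMAT_R8G8B8A8_SNORM] = { "r8g8b8a8_snorm", 32 },
};

/* Table lookup instead of a switch: a bounds check plus an index. */
static inline const struct sketch_format_description *
sketch_format_description(enum sketch_format format)
{
   return (unsigned) format < SKETCH_FORMAT_COUNT
             ? &sketch_format_table[format]
             : NULL;
}
```

Because the lookup is a bounds check and an array index, repeated calls with the same format are cheap and easy for the compiler to merge once the function is inlined.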

Marek

On Mon, Jun 24, 2013 at 10:12 AM, Jose Fonseca  wrote:
>
>
> - Original Message -
>> ---
>>  src/gallium/auxiliary/util/u_format.c | 14 ++
>>  src/gallium/auxiliary/util/u_format.h |  3 +++
>>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++
>>  3 files changed, 23 insertions(+)
>>
>> diff --git a/src/gallium/auxiliary/util/u_format.c
>> b/src/gallium/auxiliary/util/u_format.c
>> index 9bdc2ea..b70c108 100644
>> --- a/src/gallium/auxiliary/util/u_format.c
>> +++ b/src/gallium/auxiliary/util/u_format.c
>> @@ -131,6 +131,20 @@ util_format_is_pure_uint(enum pipe_format format)
>> return (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED &&
>> desc->channel[i].pure_integer) ? TRUE : FALSE;
>>  }
>>
>> +boolean
>> +util_format_is_snorm(enum pipe_format format)
>> +{
>> +   const struct util_format_description *desc =
>> util_format_description(format);
>> +   int i;
>> +
>> +   i = util_format_get_first_non_void_channel(format);
>> +   if (i == -1)
>> +  return FALSE;
>> +
>> +   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
>> +  !desc->channel[i].pure_integer &&
>> +  desc->channel[i].normalized;
>> +}
>
> This will give wrong results for mixed formats -- containing a mixture of 
> SIGNED and UNSIGNED normalized types. You can avoid that by adding this 
> statement
>
>   if (desc->is_mixed) return FALSE at the start.
>
> I think a comment would be good too -- something like "Returns true if all 
> non-void channels are normalized signed."
>
> This is not your fault, but it is disappointing to see the proliferation of 
> redundant util_format_description() calls in these helpers. 
> util_format_description is not cost free, and st_CopyPixels and friends keep 
> calling it over and over again. I don't think it is hard to call 
> util_format_description() once, and then pass the format_desc pointers 
> around, but once we start on the other route is hard to back out. But as they 
> say -- when you're in a hole, the first thing to do is stop digging --, so I 
> think we should just stop adding new helper functions to util u_format that 
> take enum format instead of util_format_description.
>
> Jose
>
>>
>>  boolean
>>  util_format_is_luminance_alpha(enum pipe_format format)
>> diff --git a/src/gallium/auxiliary/util/u_format.h
>> b/src/gallium/auxiliary/util/u_format.h
>> index 4cace6a..ccb7f92 100644
>> --- a/src/gallium/auxiliary/util/u_format.h
>> +++ b/src/gallium/auxiliary/util/u_format.h
>> @@ -622,6 +622,9 @@ util_format_is_pure_sint(enum pipe_format format);
>>  boolean
>>  util_format_is_pure_uint(enum pipe_format format);
>>
>> +boolean
>> +util_format_is_snorm(enum pipe_format format);
>> +
>>  /**
>>   * Check if the src format can be blitted to the destination format with
>>   * a simple memcpy.  For example, blitting from RGBA to RGBx is OK, but not
>> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
>> b/src/mesa/state_tracker/st_cb_drawpixels.c
>> index 38563d5..0b5198a 100644
>> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
>> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
>> @@ -1549,6 +1549,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint
>> srcy,
>>
>> if (!screen->is_format_supported(screen, srcFormat, st->internal_target,
>> 0,
>>  srcBind)) {
>> +  /* srcFormat is non-renderable. Find a compatible renderable format.
>> */
>>if (type == GL_DEPTH) {
>>   srcFormat = st_choose_format(st, GL_DEPTH_COMPONENT, GL_NONE,
>>GL_NONE, st->internal_target, 0,
>> @@ -1572,6 +1573,11 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx,
>> GLint srcy,
>>   GL_NONE, st->internal_target, 0,
>>   srcBind, FALSE);
>>   }
>> + else if (util_format_is_snorm(srcFormat)) {
>> +srcFormat = st_choose_format(st, GL_RGBA16_SNORM, GL_NONE,
>> + GL_NONE, st->internal_target, 0,
>> + srcBind, FALSE);
>> + }
>>   else {
>>  srcFormat = st_choose_format(st, GL_RGBA, GL_NONE,
>>   GL_NONE, st->internal_target, 0,


Re: [Mesa-dev] [PATCH] st/mesa: handle SNORM formats in generic CopyPixels path

2013-06-24 Thread Jose Fonseca


- Original Message -
> On Mon, Jun 24, 2013 at 6:12 PM, Jose Fonseca  wrote:
> >
> >
> > - Original Message -
> >> ---
> >>  src/gallium/auxiliary/util/u_format.c     | 14 ++
> >>  src/gallium/auxiliary/util/u_format.h     |  3 +++
> >>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++
> >>  3 files changed, 23 insertions(+)
> >>
> >> diff --git a/src/gallium/auxiliary/util/u_format.c
> >> b/src/gallium/auxiliary/util/u_format.c
> >> index 9bdc2ea..b70c108 100644
> >> --- a/src/gallium/auxiliary/util/u_format.c
> >> +++ b/src/gallium/auxiliary/util/u_format.c
> >> @@ -131,6 +131,20 @@ util_format_is_pure_uint(enum pipe_format format)
> >>     return (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED &&
> >>     desc->channel[i].pure_integer) ? TRUE : FALSE;
> >>  }
> >>
> >> +boolean
> >> +util_format_is_snorm(enum pipe_format format)
> >> +{
> >> +   const struct util_format_description *desc =
> >> util_format_description(format);
> >> +   int i;
> >> +
> >> +   i = util_format_get_first_non_void_channel(format);
> >> +   if (i == -1)
> >> +      return FALSE;
> >> +
> >> +   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
> >> +          !desc->channel[i].pure_integer &&
> >> +          desc->channel[i].normalized;
> >> +}
> >
> > This will give wrong results for mixed formats -- containing a mixture of
> > SIGNED and UNSIGNED normalized types. You can avoid that by adding this
> > statement
> >
> >   if (desc->is_mixed) return FALSE at the start.
> >
> > I think a comment would be good too -- something like "Returns true if all
> > non-void channels are normalized signed."
> >
> > This is not your fault, but it is disappointing to see the proliferation of
> > redundant util_format_description() calls in these helpers.
> > util_format_description is not cost free, and st_CopyPixels and friends
> > keep calling it over and over again. I don't think it is hard to call
> > util_format_description() once, and then pass the format_desc pointers
> > around, but once we start on the other route it is hard to back out. But as
> > they say -- when you're in a hole, the first thing to do is stop digging
> > --, so I think we should just stop adding new helper functions to util
> > u_format that take enum format instead of util_format_description.
> 
> > I hesitate to say this as no doubt I'll be wrong, but if they are all
> inline helpers, the compiler should figure it out and collapse the
> lookups. Of course I believe compilers should do lots of things...

What you say is not totally wrong.

Ordinarily, a C compiler has no way to know that each call to 
util_format_description() with the same arguments returns the same result.

That is, it has no way to know that util_format_description() doesn't read or 
modify other global state, like:

  const struct util_format_description *
  util_format_description(enum pipe_format format) {
    static const struct util_format_description *last_format_description =
      &my_table;
    return ++last_format_description;
  }

Unless:

 - the compiler is using whole-program optimization / link-time optimization;

 - the implementation of util_format_description() is in a header -- but that 
will make compilation slow; or

 - there is some attribute to tell the compiler that a function has no side 
effects.

And I checked http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html and 
apparently there are actually some gcc attributes that could help here:

-- __attribute__((const))

-- __attribute__((pure))

But support in other compilers varies -- 
http://stackoverflow.com/questions/2798188/pure-const-function-attributes-in-different-compilers
 . __declspec(noalias) comes close, 
http://msdn.microsoft.com/en-us/library/k649tyc7.aspx , but I am not sure if it 
works. It might be worth a try though.
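
A minimal sketch of the attribute idea, not Mesa code: ATTRIBUTE_CONST, 
struct desc, desc_table, lookup_desc() and total_bits() are all names made up 
for this example. It only shows how a lookup whose result depends on nothing 
but its argument could be annotated; whether a given compiler then actually 
merges the repeated calls is not something the example can demonstrate.

```c
#include <assert.h>
#include <stddef.h>

/* GCC and Clang accept __attribute__((const)); for other compilers the
 * macro expands to nothing. ATTRIBUTE_CONST is a hypothetical name. */
#if defined(__GNUC__)
#define ATTRIBUTE_CONST __attribute__((const))
#else
#define ATTRIBUTE_CONST
#endif

/* Hypothetical stand-in for a format description. */
struct desc {
   int bits;
};

static const struct desc desc_table[2] = { { 8 }, { 16 } };

/* The result depends only on the argument and the function modifies no
 * state, so the compiler is allowed to reuse the result of an earlier
 * call made with the same argument. */
ATTRIBUTE_CONST const struct desc *
lookup_desc(unsigned format);

const struct desc *
lookup_desc(unsigned format)
{
   /* Reads only an immutable table; the mask keeps the index in range. */
   return &desc_table[format & 1];
}

int
total_bits(unsigned format)
{
   /* Two calls with the same argument: a candidate for being collapsed. */
   const struct desc *a = lookup_desc(format);
   const struct desc *b = lookup_desc(format);

   assert(a == b);   /* same argument, same result */
   return a->bits + b->bits;
}
```

The attribute goes on the declaration, so callers in other translation units 
would see it without the function body having to live in a header.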


Jose


Re: [Mesa-dev] [PATCH] vl/mpeg12: implement inverse scan/quantization steps

2013-06-24 Thread Christian König

On 23.06.2013 18:59, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---

These changes make MPEG2 I-frames generate the correct macroblock data (as
compared to mplayer via xvmc). Other MPEG2 frames are still misparsed, and
MPEG1 I-frames have some errors (but largely match up).


NAK, zscan and mismatch handling are handled in vl/vl_zscan.c.

Please use/fix that one instead of adding another implementation.

Christian.
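
For readers unfamiliar with the two steps named above, here is a plain-C 
sketch of an inverse zigzag scan and of MPEG-2 mismatch control. It is an 
illustration only and is not the vl/vl_zscan.c implementation; 
inverse_scan() and mismatch_control() are hypothetical names, and the table 
holds the same values as scans[0] in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* zigzag[n] is the raster position (row * 8 + column) of the n-th
 * coefficient in bitstream order. */
static const uint8_t zigzag[64] = {
    0,  1,  8, 16,  9,  2,  3, 10,
   17, 24, 32, 25, 18, 11,  4,  5,
   12, 19, 26, 33, 40, 48, 41, 34,
   27, 20, 13,  6,  7, 14, 21, 28,
   35, 42, 49, 56, 57, 50, 43, 36,
   29, 22, 15, 23, 30, 37, 44, 51,
   58, 59, 52, 45, 38, 31, 39, 46,
   53, 60, 61, 54, 47, 55, 62, 63
};

/* Scatter coefficients from bitstream order into raster order. */
static void
inverse_scan(const int16_t in[64], int16_t out[64])
{
   for (int n = 0; n < 64; n++) {
      assert(zigzag[n] < 64);
      out[zigzag[n]] = in[n];
   }
}

/* Mismatch control: if the coefficients of a block sum to an even value,
 * toggle the least significant bit of the last coefficient. */
static void
mismatch_control(int16_t block[64])
{
   int sum = 0;

   for (int n = 0; n < 64; n++)
      sum += block[n];

   if ((sum & 1) == 0) {
      if (block[63] & 1)
         block[63] -= 1;
      else
         block[63] += 1;
   }
}
```

Running mismatch_control() a second time on the same block leaves it 
unchanged, because the first pass makes the sum odd.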


  src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c | 84 ++
  1 file changed, 73 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c 
b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
index b0fb1bb..cd3647d 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_bitstream.c
@@ -520,6 +520,30 @@ static const unsigned quant_scale[2][32] = {
  28, 32, 36, 40, 44, 48, 52, 56, 64, 72, 80, 88, 96, 104, 112 }
  };
  
+/* Inverses of Figures 7-2 and 7-3 */
+static const uint8_t scans[2][64] = {
+   {
+   0, 1, 8,16, 9, 2, 3,10,
+  17,24,32,25,18,11, 4, 5,
+  12,19,26,33,40,48,41,34,
+  27,20,13, 6, 7,14,21,28,
+  35,42,49,56,57,50,43,36,
+  29,22,15,23,30,37,44,51,
+  58,59,52,45,38,31,39,46,
+  53,60,61,54,47,55,62,63
+   },
+   {
+   0, 8,16,24, 1, 9, 2,10,
+  17,25,32,40,48,56,57,49,
+  41,33,26,18, 3,11, 4,12,
+  19,27,34,42,50,58,35,43,
+  51,59,20,28, 5,13, 6,14,
+  21,29,36,44,52,60,37,45,
+  53,61,22,30, 7,15,23,31,
+  38,46,54,62,39,47,55,63
+   }
+};
+
  static struct vl_vlc_entry tbl_B1[1 << 11];
  static struct vl_vlc_entry tbl_B2[1 << 2];
  static struct vl_vlc_entry tbl_B3[1 << 6];
@@ -706,6 +730,13 @@ reset_predictor(struct vl_mpg12_bs *bs) {
 bs->pred_dc[0] = bs->pred_dc[1] = bs->pred_dc[2] = 0;
  }
  
+static INLINE int16_t
+sign(int16_t val)
+{
+   if (!val) return 0;
+   return (val < 0) ? -1 : 1;
+}
+
  static INLINE void
  decode_dct(struct vl_mpg12_bs *bs, struct pipe_mpeg12_macroblock *mb, int 
scale)
  {
@@ -717,8 +748,11 @@ decode_dct(struct vl_mpg12_bs *bs, struct 
pipe_mpeg12_macroblock *mb, int scale)
 bool intra = mb->macroblock_type & PIPE_MPEG12_MB_TYPE_INTRA;
 const struct dct_coeff *table = intra ? bs->intra_dct_tbl : tbl_B14_AC;
 const struct dct_coeff *entry;
-   int i, cbp, blk = 0;
+   int i, j, cbp, blk = 0;
 short *dst = mb->blocks;
+   const uint8_t *scan = &scans[bs->desc->alternate_scan][0];
+   const uint8_t *quant_matrix = intra ? bs->desc->intra_matrix : 
bs->desc->non_intra_matrix;
+   int sums[6] = {0};
  
 vl_vlc_fillbits(&bs->vlc);
 mb->coded_block_pattern = cbp = intra ? 0x3F : vl_vlc_get_vlclbf(&bs->vlc, 
tbl_B9, 9);
@@ -757,8 +791,9 @@ entry:
 bs->pred_dc[cc] += dct_diff;
  }
  
-dst[0] = bs->pred_dc[cc];
  i = 0;
+j = scan[i];
+dst[j] = bs->pred_dc[cc];
  
   } else {
  entry = tbl_B14_DC + vl_vlc_peekbits(&bs->vlc, 17);
@@ -771,32 +806,59 @@ entry:
   i += vl_vlc_get_uimsbf(&bs->vlc, 6) + 1;
   if (i > 64)
  break;
+ j = scan[i];
  
- dst[i] = vl_vlc_get_simsbf(&bs->vlc, 8);
- if (dst[i] == -128)
-dst[i] = vl_vlc_get_uimsbf(&bs->vlc, 8) - 256;
- else if (dst[i] == 0)
-dst[i] = vl_vlc_get_uimsbf(&bs->vlc, 8);
-
- dst[i] *= scale;
+ dst[j] = vl_vlc_get_simsbf(&bs->vlc, 8);
+ if (dst[j] == -128)
+dst[j] = vl_vlc_get_uimsbf(&bs->vlc, 8) - 256;
+ else if (dst[j] == 0)
+dst[j] = vl_vlc_get_uimsbf(&bs->vlc, 8);
} else if (entry->run == dct_Escape) {
   i += vl_vlc_get_uimsbf(&bs->vlc, 6) + 1;
   if (i > 64)
  break;
  
- dst[i] = vl_vlc_get_simsbf(&bs->vlc, 12) * scale;
+ j = scan[i];
+ dst[j] = vl_vlc_get_simsbf(&bs->vlc, 12);
  
} else {
   i += entry->run;
   if (i > 64)
  break;
  
- dst[i] = entry->level * scale;
+ j = scan[i];
+ dst[j] = entry->level;
+  }
+
+  if (intra && !j) {
+ dst[j] = dst[j] << (3 - bs->desc->intra_dc_precision);
+  } else {
+ dst[j] = (2 * dst[j] + (intra ? 0 : sign(dst[j]))) * quant_matrix[j] 
* scale / 32;
+ if (bs->decoder->profile == PIPE_VIDEO_PROFILE_MPEG1 && dst[j])
+dst[j] = (dst[j] - 1) | 1;
}
+  if (dst[j] > 2047)
+ dst[j] = 2047;
+  else if (dst[j] < -2048)
+ dst[j] = -2048;
+
+  sums[blk] += dst[j];
  
vl_vlc_fillbits(&bs->vlc);
entry = table + vl_vlc_peekbits(&bs->vlc, 17);
 }
+
+   if (bs->decoder->profile != PIPE_VIDEO_PROFILE_MPEG1) {
+  dst = mb->blocks;
+  for (i = 0; i < blk; i++, dst += 64) {
+ if ((sums[i] & 1) == 0) {
+if (dst[63] & 1)
+   dst[63] -= 1;
+else
+   dst[63] += 1

Re: [Mesa-dev] [PATCH] st/mesa: handle SNORM formats in generic CopyPixels path

2013-06-24 Thread Dave Airlie
On Mon, Jun 24, 2013 at 6:12 PM, Jose Fonseca  wrote:
>
>
> - Original Message -
>> ---
>>  src/gallium/auxiliary/util/u_format.c | 14 ++
>>  src/gallium/auxiliary/util/u_format.h |  3 +++
>>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++
>>  3 files changed, 23 insertions(+)
>>
>> diff --git a/src/gallium/auxiliary/util/u_format.c
>> b/src/gallium/auxiliary/util/u_format.c
>> index 9bdc2ea..b70c108 100644
>> --- a/src/gallium/auxiliary/util/u_format.c
>> +++ b/src/gallium/auxiliary/util/u_format.c
>> @@ -131,6 +131,20 @@ util_format_is_pure_uint(enum pipe_format format)
>> return (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED &&
>> desc->channel[i].pure_integer) ? TRUE : FALSE;
>>  }
>>
>> +boolean
>> +util_format_is_snorm(enum pipe_format format)
>> +{
>> +   const struct util_format_description *desc =
>> util_format_description(format);
>> +   int i;
>> +
>> +   i = util_format_get_first_non_void_channel(format);
>> +   if (i == -1)
>> +  return FALSE;
>> +
>> +   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
>> +  !desc->channel[i].pure_integer &&
>> +  desc->channel[i].normalized;
>> +}
>
> This will give wrong results for mixed formats -- containing a mixture of 
> SIGNED and UNSIGNED normalized types. You can avoid that by adding this 
> statement
>
>   if (desc->is_mixed) return FALSE at the start.
>
> I think a comment would be good too -- something like "Returns true if all 
> non-void channels are normalized signed."
>
> This is not your fault, but it is disappointing to see the proliferation of 
> redundant util_format_description() calls in these helpers. 
> util_format_description is not cost free, and st_CopyPixels and friends keep 
> calling it over and over again. I don't think it is hard to call 
> util_format_description() once, and then pass the format_desc pointers 
> around, but once we start on the other route it is hard to back out. But as they 
> say -- when you're in a hole, the first thing to do is stop digging --, so I 
> think we should just stop adding new helper functions to util u_format that 
> take enum format instead of util_format_description.

I hesitate to say this as no doubt I'll be wrong, but if they are all
inline helpers, the compiler should figure it out and collapse the
lookups. Of course I believe compilers should do lots of things...

Dave.


Re: [Mesa-dev] [PATCH] st/mesa: handle SNORM formats in generic CopyPixels path

2013-06-24 Thread Jose Fonseca


- Original Message -
> ---
>  src/gallium/auxiliary/util/u_format.c | 14 ++
>  src/gallium/auxiliary/util/u_format.h |  3 +++
>  src/mesa/state_tracker/st_cb_drawpixels.c |  6 ++
>  3 files changed, 23 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/util/u_format.c
> b/src/gallium/auxiliary/util/u_format.c
> index 9bdc2ea..b70c108 100644
> --- a/src/gallium/auxiliary/util/u_format.c
> +++ b/src/gallium/auxiliary/util/u_format.c
> @@ -131,6 +131,20 @@ util_format_is_pure_uint(enum pipe_format format)
> return (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED &&
> desc->channel[i].pure_integer) ? TRUE : FALSE;
>  }
>  
> +boolean
> +util_format_is_snorm(enum pipe_format format)
> +{
> +   const struct util_format_description *desc =
> util_format_description(format);
> +   int i;
> +
> +   i = util_format_get_first_non_void_channel(format);
> +   if (i == -1)
> +  return FALSE;
> +
> +   return desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED &&
> +  !desc->channel[i].pure_integer &&
> +  desc->channel[i].normalized;
> +}

This will give wrong results for mixed formats -- containing a mixture of 
SIGNED and UNSIGNED normalized types. You can avoid that by adding this 
statement

  if (desc->is_mixed) return FALSE at the start.

I think a comment would be good too -- something like "Returns true if all 
non-void channels are normalized signed."
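
A minimal sketch of the two suggestions above (the is_mixed early-out and 
the descriptive comment), written against simplified stand-in types so that 
it compiles on its own. enum chan_type, struct chan_desc, struct format_desc, 
first_non_void_channel() and format_is_snorm() are hypothetical names for 
this example; the real definitions live in u_format.h and carry many more 
fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the Gallium format types. */
enum chan_type { CHAN_VOID, CHAN_UNSIGNED, CHAN_SIGNED };

struct chan_desc {
   enum chan_type type;
   bool normalized;
   bool pure_integer;
};

struct format_desc {
   struct chan_desc channel[4];
   bool is_mixed;   /* non-void channels do not all share one type */
};

/* Index of the first non-void channel, or -1 if there is none. */
static int
first_non_void_channel(const struct format_desc *desc)
{
   for (int i = 0; i < 4; i++) {
      if (desc->channel[i].type != CHAN_VOID)
         return i;
   }
   return -1;
}

/* Returns true if all non-void channels are normalized signed. */
static bool
format_is_snorm(const struct format_desc *desc)
{
   assert(desc != NULL);

   /* Mixed formats are rejected up front, so inspecting a single
    * channel below is enough. */
   if (desc->is_mixed)
      return false;

   int i = first_non_void_channel(desc);
   if (i == -1)
      return false;

   return desc->channel[i].type == CHAN_SIGNED &&
          !desc->channel[i].pure_integer &&
          desc->channel[i].normalized;
}
```

Without the is_mixed test, a format whose first channel is signed normalized 
but whose later channels are unsigned would be reported as SNORM.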

This is not your fault, but it is disappointing to see the proliferation of 
redundant util_format_description() calls in these helpers. 
util_format_description is not cost free, and st_CopyPixels and friends keep 
calling it over and over again. I don't think it is hard to call 
util_format_description() once, and then pass the format_desc pointers around, 
but once we start on the other route it is hard to back out. But as they say -- 
when you're in a hole, the first thing to do is stop digging --, so I think we 
should just stop adding new helper functions to util u_format that take enum 
format instead of util_format_description.
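
To make the cost argument concrete, here is a small model of the two helper 
styles. Everything in it is hypothetical (enum fmt, struct fmt_desc, 
fmt_description() and the lookup_count instrumentation); it is not the 
u_format API, only a sketch of how enum-taking helpers repeat the lookup 
while description-taking helpers let the caller do it once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down model: two formats and a lookup counter. */
enum fmt { FMT_R8_UNORM, FMT_R8_SNORM, FMT_COUNT };

struct fmt_desc {
   bool is_signed;
   bool normalized;
};

static const struct fmt_desc fmt_table[FMT_COUNT] = {
   [FMT_R8_UNORM] = { false, true },
   [FMT_R8_SNORM] = { true,  true },
};

static unsigned lookup_count;   /* instrumentation for this example only */

static const struct fmt_desc *
fmt_description(enum fmt format)
{
   assert(format < FMT_COUNT);
   lookup_count++;
   return &fmt_table[format];
}

/* Enum-taking helpers: each one repeats the lookup. */
static bool
fmt_is_signed(enum fmt f)
{
   return fmt_description(f)->is_signed;
}

static bool
fmt_is_normalized(enum fmt f)
{
   return fmt_description(f)->normalized;
}

/* Description-taking helpers: no lookup of their own. */
static bool
desc_is_signed(const struct fmt_desc *d)
{
   return d->is_signed;
}

static bool
desc_is_normalized(const struct fmt_desc *d)
{
   return d->normalized;
}

static bool
is_snorm_by_enum(enum fmt f)
{
   /* Up to two lookups for one question. */
   return fmt_is_signed(f) && fmt_is_normalized(f);
}

static bool
is_snorm_by_desc(enum fmt f)
{
   /* One lookup, then the pointer is passed around. */
   const struct fmt_desc *d = fmt_description(f);
   return desc_is_signed(d) && desc_is_normalized(d);
}
```

In this model the difference is one extra table access; the point of the 
sketch is only that the count grows with every enum-taking helper a caller 
chains together.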

Jose

>  
>  boolean
>  util_format_is_luminance_alpha(enum pipe_format format)
> diff --git a/src/gallium/auxiliary/util/u_format.h
> b/src/gallium/auxiliary/util/u_format.h
> index 4cace6a..ccb7f92 100644
> --- a/src/gallium/auxiliary/util/u_format.h
> +++ b/src/gallium/auxiliary/util/u_format.h
> @@ -622,6 +622,9 @@ util_format_is_pure_sint(enum pipe_format format);
>  boolean
>  util_format_is_pure_uint(enum pipe_format format);
>  
> +boolean
> +util_format_is_snorm(enum pipe_format format);
> +
>  /**
>   * Check if the src format can be blitted to the destination format with
>   * a simple memcpy.  For example, blitting from RGBA to RGBx is OK, but not
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 38563d5..0b5198a 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -1549,6 +1549,7 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx, GLint
> srcy,
>  
> if (!screen->is_format_supported(screen, srcFormat, st->internal_target,
> 0,
>  srcBind)) {
> +  /* srcFormat is non-renderable. Find a compatible renderable format.
> */
>if (type == GL_DEPTH) {
>   srcFormat = st_choose_format(st, GL_DEPTH_COMPONENT, GL_NONE,
>GL_NONE, st->internal_target, 0,
> @@ -1572,6 +1573,11 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx,
> GLint srcy,
>   GL_NONE, st->internal_target, 0,
>   srcBind, FALSE);
>   }
> + else if (util_format_is_snorm(srcFormat)) {
> +srcFormat = st_choose_format(st, GL_RGBA16_SNORM, GL_NONE,
> + GL_NONE, st->internal_target, 0,
> + srcBind, FALSE);
> + }
>   else {
>  srcFormat = st_choose_format(st, GL_RGBA, GL_NONE,
>   GL_NONE, st->internal_target, 0,