Re: [Mesa-dev] [PATCH] GL3: remove radeonsi occurrences in GL 4.2, already specified as "all DONE"

2016-04-30 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-04-30 01:48, Fabio Pedretti wrote:

---
 docs/GL3.txt | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index bb2bb6e..5a6be41 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -148,17 +148,17 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi


 GL 4.2, GLSL 4.20 -- all DONE: radeonsi

-  GL_ARB_texture_compression_bptc           DONE (i965, nvc0, r600, radeonsi)
+  GL_ARB_texture_compression_bptc           DONE (i965, nvc0, r600)
   GL_ARB_compressed_texture_pixel_storage   DONE (all drivers)
-  GL_ARB_shader_atomic_counters             DONE (i965, nvc0, radeonsi, softpipe)
+  GL_ARB_shader_atomic_counters             DONE (i965, nvc0, softpipe)
   GL_ARB_texture_storage                    DONE (all drivers)
-  GL_ARB_transform_feedback_instanced       DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
-  GL_ARB_base_instance                      DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
-  GL_ARB_shader_image_load_store            DONE (i965, radeonsi, softpipe)
+  GL_ARB_transform_feedback_instanced       DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
+  GL_ARB_base_instance                      DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
+  GL_ARB_shader_image_load_store            DONE (i965, softpipe)
   GL_ARB_conservative_depth                 DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack           DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing           DONE (all drivers)
-  GL_ARB_internalformat_query               DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
+  GL_ARB_internalformat_query               DONE (i965, nv50, nvc0, r600, llvmpipe, softpipe)
   GL_ARB_map_buffer_alignment               DONE (all drivers)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 22/27] glsl: add helper for comparing arrays in varying packing pass

2016-04-21 Thread eocallaghan


On 2016-03-31 21:57, Timothy Arceri wrote:

---
 src/compiler/glsl/lower_packed_varyings.cpp | 25 
+

 1 file changed, 25 insertions(+)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp
b/src/compiler/glsl/lower_packed_varyings.cpp
index ad766bb..6e7a289 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -152,6 +152,31 @@

 using namespace ir_builder;

+/**
+ * If the var is an array check if it matches the array attributes of the
+ * packed var.
+ */
+static bool
+check_for_matching_arrays(ir_variable *packed_var, ir_variable *var)
+{
+   const glsl_type *pt = packed_var->type;
+   const glsl_type *vt = var->type;


I suppose it's OK to assume the call site always does the right thing with
this helper? Either way,

Reviewed-by: Edward O'Callaghan 


+   bool array_match = true;
+
+   while (pt->is_array() || vt->is_array()) {
+  if (pt->is_array() != vt->is_array() ||
+  pt->length != vt->length) {
+ array_match = false;
+ break;
+  } else {
+ pt = pt->fields.array;
+ vt = vt->fields.array;
+  }
+   }
+
+   return array_match;
+}
+
 static bool
 needs_lowering(ir_variable *var, bool has_enhanced_layouts,
bool disable_varying_packing)
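
As a rough illustration of what the helper checks, here is a self-contained
C sketch. The toy_type struct is hypothetical and only stands in for
glsl_type; the loop walks both types in lockstep and reports a mismatch as
soon as the array-ness or the length differs at any nesting level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for glsl_type: element == NULL means "not an array". */
struct toy_type {
   unsigned length;                 /* number of elements if this is an array */
   const struct toy_type *element;  /* element type, or NULL for non-arrays */
};

static bool toy_is_array(const struct toy_type *t)
{
   return t->element != NULL;
}

/* Mirrors the loop in the patch: peel one array level off both types at a time. */
static bool toy_arrays_match(const struct toy_type *pt, const struct toy_type *vt)
{
   assert(pt != NULL && vt != NULL);

   while (toy_is_array(pt) || toy_is_array(vt)) {
      if (toy_is_array(pt) != toy_is_array(vt) || pt->length != vt->length)
         return false;
      pt = pt->element;
      vt = vt->element;
   }
   return true;
}
```

Two non-array types match trivially, which is why the call site still has
to decide whether packing the two variables together makes sense at all.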




Re: [Mesa-dev] [PATCH 19/27] glsl: skip location and component packing validation on patch out

2016-04-21 Thread eocallaghan

Acked-by: Edward O'Callaghan 

On 2016-03-31 21:57, Timothy Arceri wrote:

These outputs have a separate location domain from per-vertex outputs
and need to be handled separately. For now just skip validation so we
don't invalidate valid shaders.
---
 src/compiler/glsl/link_varyings.cpp | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp
b/src/compiler/glsl/link_varyings.cpp
index d125a9f..4f57fb2 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -348,8 +348,12 @@ cross_validate_outputs_to_inputs(struct
gl_shader_program *prog,
foreach_in_list(ir_instruction, node, producer->ir) {
   ir_variable *const var = node->as_variable();

-  if ((var == NULL) || (var->data.mode != ir_var_shader_out))
-continue;
+  /* FIXME: We should also validate per patch outputs too rather than just
+   * skipping over them here.
+   */
+  if ((var == NULL) || var->data.patch ||
+  (var->data.mode != ir_var_shader_out))
+ continue;

   if (!var->data.explicit_location
   || var->data.location < VARYING_SLOT_VAR0)
@@ -432,8 +436,12 @@ cross_validate_outputs_to_inputs(struct
gl_shader_program *prog,
foreach_in_list(ir_instruction, node, consumer->ir) {
   ir_variable *const input = node->as_variable();

-  if ((input == NULL) || (input->data.mode != ir_var_shader_in))
-continue;
+  /* FIXME: We should also validate per patch outputs too rather than just
+   * skipping over them here.
+   */
+  if ((input == NULL) || input->data.patch ||
+  (input->data.mode != ir_var_shader_in))
+ continue;

   if (strcmp(input->name, "gl_Color") == 0 && input->data.used) {
  const ir_variable *const front_color =
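
A minimal sketch of the skip logic, using a hypothetical toy_var struct in
place of ir_variable: per-patch outputs are filtered out before any location
validation runs, in the same test that already drops NULL and non-output
nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_mode { TOY_SHADER_IN, TOY_SHADER_OUT, TOY_TEMPORARY };

/* Hypothetical stand-in for the few ir_variable fields the check reads. */
struct toy_var {
   enum toy_mode mode;
   bool patch;            /* true for per-patch (tessellation) varyings */
};

/* Returns true when the producer-side loop should validate this variable. */
static bool toy_should_validate_output(const struct toy_var *var)
{
   if (var == NULL || var->patch || var->mode != TOY_SHADER_OUT)
      return false;
   return true;
}

/* Counts how many variables in a list would reach the validation code. */
static size_t toy_count_validated(const struct toy_var *const *vars, size_t n)
{
   size_t count = 0;

   assert(vars != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (toy_should_validate_output(vars[i]))
         count++;
   }
   return count;
}
```

The consumer-side loop in the patch is the mirror image, testing for the
shader-input mode instead.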




Re: [Mesa-dev] [PATCH 27/27] docs: mark ARB_enhanced_layouts as DONE

2016-04-21 Thread eocallaghan

Acked-by: Edward O'Callaghan 

On 2016-03-31 21:58, Timothy Arceri wrote:

---
 docs/GL3.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index f6248da..ede8cf5 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE   DONE (all 
drivers)

   GL_ARB_buffer_storage DONE (i965,
nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture  DONE (i965, 
nv50, nvc0)
-  GL_ARB_enhanced_layouts   in progress 
(Timothy)
+  GL_ARB_enhanced_layouts   DONE (all 
drivers)

   - compile-time constant expressions   DONE
   - explicit byte offsets for blocksDONE
   - forced alignment within blocks  DONE
-  - specified vec4-slot component numbers   in progress
+  - specified vec4-slot component numbers   DONE
   - specified transform/feedback layout DONE
   - input/output block locationsDONE
   GL_ARB_multi_bind DONE (all 
drivers)




Re: [Mesa-dev] [PATCH 02/27] glsl: allow component qualifier on varying inputs

2016-04-21 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-31 21:57, Timothy Arceri wrote:

---
 src/compiler/glsl/ast_type.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/ast_type.cpp 
b/src/compiler/glsl/ast_type.cpp

index 30c9eff..de3fdcc 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -146,6 +146,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
input_layout_mask.flags.q.centroid = 1;
/* Function params can have constant */
input_layout_mask.flags.q.constant = 1;
+   input_layout_mask.flags.q.explicit_component = 1;
input_layout_mask.flags.q.explicit_location = 1;
input_layout_mask.flags.q.flat = 1;
input_layout_mask.flags.q.in = 1;




Re: [Mesa-dev] [PATCH 4/4] radeonsi: Print a message when scratch allocation fails.

2016-04-19 Thread eocallaghan

On 2016-04-20 11:46, Nicolai Hähnle wrote:

On 19.04.2016 17:50, Bas Nieuwenhuizen wrote:

Signed-off-by: Bas Nieuwenhuizen 
---
  src/gallium/drivers/radeonsi/si_compute.c   | 5 -
  src/gallium/drivers/radeonsi/si_state_shaders.c | 5 -
  2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c

index b46a2fe..7d91ac6 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -215,8 +215,11 @@ static bool 
si_setup_compute_scratch_buffer(struct si_context *sctx,

scratch_needed, 256, false, RADEON_DOMAIN_VRAM,
RADEON_FLAG_NO_CPU_ACCESS);

-   if (!sctx->compute_scratch_buffer)
+   if (!sctx->compute_scratch_buffer) {
+   fprintf(stderr, "Warning: Failed to allocate the "
+   "scratch buffer\n");
return false;
+   }


Here and below, please change the "Warning" into "radeonsi" so
unsuspecting users will be more likely to understand what's going on.
With that changed, the patch is

Reviewed-by: Nicolai Hähnle 


Wait, why not use the standard R600_ERR() macro that wraps fprintf() calls?




}

  	if (sctx->compute_scratch_buffer != shader->scratch_bo && 
scratch_needed) {
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c

index fef676b..2396b8e 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1692,8 +1692,11 @@ static bool si_update_spi_tmpring_size(struct 
si_context *sctx)

scratch_needed_size, 256, 
false,
RADEON_DOMAIN_VRAM,
RADEON_FLAG_NO_CPU_ACCESS);
-   if (!sctx->scratch_buffer)
+   if (!sctx->scratch_buffer) {
+   fprintf(stderr, "Warning: Failed to allocate the "
+   "scratch buffer\n");
return false;
+   }
sctx->emit_scratch_reloc = true;
}
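
The two review suggestions (a driver prefix, and a macro rather than a bare
fprintf()) could be combined roughly as follows. TOY_ERR is a hypothetical
macro written for this sketch; it is not the actual R600_ERR() definition,
and the layout of its prefix is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical error macro: prints the driver name and the calling function
 * before the message. It takes a FILE* so the sketch is easy to redirect. */
#define TOY_ERR(out, ...) \
   do { \
      fprintf((out), "radeonsi: %s: ", __func__); \
      fprintf((out), __VA_ARGS__); \
   } while (0)

/* Simulated allocation check: reports the failure and returns false when
 * the buffer pointer is NULL, as the patched code paths do. */
static bool toy_setup_scratch(FILE *out, const void *buffer)
{
   assert(out != NULL);

   if (!buffer) {
      TOY_ERR(out, "Failed to allocate the scratch buffer\n");
      return false;
   }
   return true;
}
```

Calling toy_setup_scratch(stderr, NULL) prints one line that names both the
driver and the failing function.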





Re: [Mesa-dev] [PATCH 1/4] radeonsi: use CE suballocator for CP DMA realignment.

2016-04-19 Thread eocallaghan

On 2016-04-20 09:29, Bas Nieuwenhuizen wrote:

I retract patch 1 and 2. Large scratch buffers are nice, but the
hardware only supports a 32-bit offset into it.

- Bas

On Wed, Apr 20, 2016 at 12:50 AM, Bas Nieuwenhuizen
 wrote:

Use the CE suballocator instead of the normal one as the usage
is most similar to the CE, i.e. only read and written on GPU
and not mapped to CPU.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/gallium/drivers/radeonsi/si_cp_dma.c | 27 
++-

 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c

index 38e0ee6..264789d 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -222,31 +222,24 @@ static void si_clear_buffer(struct pipe_context 
*ctx, struct pipe_resource *dst,

  */
 static void si_cp_dma_realign_engine(struct si_context *sctx, 
unsigned size)

 {
+


trivial spurious '\n'


uint64_t va;
unsigned dma_flags = 0;
unsigned scratch_size = CP_DMA_ALIGNMENT * 2;
+   unsigned offset;
+   struct r600_resource *tmp_buf;

assert(size < CP_DMA_ALIGNMENT);

-   /* Use the scratch buffer as the dummy buffer. The 3D engine should be
-* idle at this point.
-*/
-   if (!sctx->scratch_buffer ||
-   sctx->scratch_buffer->b.b.width0 < scratch_size) {
-   r600_resource_reference(&sctx->scratch_buffer, NULL);
-   sctx->scratch_buffer =
-   si_resource_create_custom(&sctx->screen->b.b,
- PIPE_USAGE_DEFAULT,
- scratch_size);
-   if (!sctx->scratch_buffer)
-   return;
-   sctx->emit_scratch_reloc = true;
-   }
+   u_suballocator_alloc(sctx->ce_suballocator, scratch_size, &offset,
+(struct pipe_resource**)&tmp_buf);
+   if (!tmp_buf)
+   return;

-   si_cp_dma_prepare(sctx, &sctx->scratch_buffer->b.b,
- &sctx->scratch_buffer->b.b, size, size, &dma_flags);
+   si_cp_dma_prepare(sctx, &tmp_buf->b.b,
+ &tmp_buf->b.b, size, size, &dma_flags);

-   va = sctx->scratch_buffer->gpu_address;
+   va = tmp_buf->gpu_address + offset;
si_emit_cp_dma_copy_buffer(sctx, va, va + CP_DMA_ALIGNMENT, size,
   dma_flags);
 }
--
2.8.0
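
The address arithmetic in the new code (gpu_address + offset) follows from
how a suballocator hands out ranges inside one larger buffer. The bump
allocator below is a hypothetical, much simplified stand-in for
u_suballocator; its names and behaviour are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parent buffer with a fixed GPU base address. */
struct toy_suballocator {
   uint64_t gpu_address;   /* base address of the backing buffer */
   unsigned size;          /* total bytes in the backing buffer */
   unsigned next;          /* first free byte */
   unsigned alignment;     /* power-of-two alignment of every allocation */
};

/* Carves `size` bytes out of the backing buffer; writes the offset on success. */
static bool toy_suballoc(struct toy_suballocator *sa, unsigned size,
                         unsigned *offset)
{
   unsigned start;

   assert(sa->alignment != 0 && (sa->alignment & (sa->alignment - 1)) == 0);
   start = (sa->next + sa->alignment - 1) & ~(sa->alignment - 1);
   if (start > sa->size || size > sa->size - start)
      return false;

   *offset = start;
   sa->next = start + size;
   return true;
}

/* Every user of a suballocation must add its offset to the buffer address. */
static uint64_t toy_gpu_va(const struct toy_suballocator *sa, unsigned offset)
{
   return sa->gpu_address + offset;
}
```

Forgetting the offset would make every user of the shared buffer write to
its first bytes, which is why the patch changes the computation of va.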




Re: [Mesa-dev] [PATCH 2/2] radeonsi: Implement ddx/ddy on VI using ds_bpermute

2016-04-16 Thread eocallaghan

On 2016-04-16 20:20, Marek Olšák wrote:
On Sat, Apr 16, 2016 at 8:04 AM, Michel Dänzer  
wrote:

On 16.04.2016 14:51, Michel Dänzer wrote:

On 16.04.2016 11:39, Tom Stellard wrote:

The ds_bpermute instruction allows threads to transfer data directly
to or from the vgprs of other threads.  These instructions use the lds
hardware to transfer data, but do not read or write lds memory.

DDX BEFORE:                         |  DDX AFTER:
                                    |
v_mbcnt_lo_u32_b32_e64 v2, -1, 0    |  v_mbcnt_lo_u32_b32_e64 v2, -1, 0
v_mbcnt_hi_u32_b32_e64 v2, -1, v2   |  v_mbcnt_hi_u32_b32_e64 v2, -1, v2
v_lshlrev_b32_e32 v4, 2, v2         |  v_and_b32_e32 v2, 0x3ffc, v2
v_and_b32_e32 v2, -4, v2            |  v_lshlrev_b32_e32 v2, 2, v2
v_lshlrev_b32_e32 v3, 2, v2         |  ds_bpermute_b32 v3, v2, v0
s_mov_b32 m0, -1                    |  ds_bpermute_b32 v0, v2, v0 offset:4
ds_write_b32 v4, v0                 |  s_waitcnt lgkmcnt(0)
s_waitcnt lgkmcnt(0)                |
v_or_b32_e32 v0, 1, v2              |
v_lshlrev_b32_e32 v0, 2, v0         |
ds_read_b32 v1, v3                  |
ds_read_b32 v0, v0                  |
s_waitcnt lgkmcnt(0)                |
                                    |
LDS: 1 blocks                       |  LDS: 0 blocks
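
To make the data flow concrete, the following C sketch emulates the AFTER
sequence on the CPU for one 64-lane wavefront. It assumes ds_bpermute_b32
behaves as a per-lane gather (each lane reads the source register of the
lane selected by its byte address divided by 4) and that DDX is lane 1 minus
lane 0 of each group of four lanes. Both are assumptions for illustration,
not statements taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_WAVE_SIZE 64

/* Emulated gather: dst[lane] = src[((addr[lane] + offset) / 4) % wave size]. */
static void toy_bpermute(float dst[TOY_WAVE_SIZE],
                         const uint32_t addr[TOY_WAVE_SIZE],
                         const float src[TOY_WAVE_SIZE],
                         uint32_t offset)
{
   float tmp[TOY_WAVE_SIZE];

   assert(offset % 4 == 0);   /* lanes are addressed in dword steps */
   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      tmp[lane] = src[((addr[lane] + offset) / 4) % TOY_WAVE_SIZE];
   /* Copy afterwards so dst may alias src, as v0 does in the listing. */
   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      dst[lane] = tmp[lane];
}

/* Coarse DDX: every lane of a quad gets (lane 1 of the quad - lane 0). */
static void toy_ddx(float ddx[TOY_WAVE_SIZE], const float value[TOY_WAVE_SIZE])
{
   uint32_t addr[TOY_WAVE_SIZE];
   float left[TOY_WAVE_SIZE], right[TOY_WAVE_SIZE];

   /* First lane of the quad, converted to a byte address (lane * 4). */
   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      addr[lane] = (lane & ~3u) << 2;

   toy_bpermute(left, addr, value, 0);    /* ds_bpermute_b32 v3, v2, v0 */
   toy_bpermute(right, addr, value, 4);   /* ... with offset:4 */

   for (unsigned lane = 0; lane < TOY_WAVE_SIZE; lane++)
      ddx[lane] = right[lane] - left[lane];
}
```

No shared array is written at any point, which corresponds to the drop from
one LDS block to zero in the listing.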


Nice.


Were these intrinsics already available in LLVM 3.6? If not, the old
code needs to be kept for backwards compatibility.


I can see now that you're taking care of this for the bpermute
intrinsic, but AFAICT the mbcnt intrinsics were only added in LLVM 3.8.


How do you feel about increasing the requirement to LLVM 3.8 for Mesa git?


+1 from me. Supporting more than two generations of LLVM is a bit much
to carry, IMHO.




Marek


Re: [Mesa-dev] [PATCH 1/6] gallium: use enums in p_defines.h

2016-04-16 Thread eocallaghan

Patches 1 & 2 are,

Reviewed-by: Edward O'Callaghan 

I'll have to spend some time looking at the others tomorrow.

On 2016-04-16 22:50, Marek Olšák wrote:

From: Marek Olšák 

and remove number assignments which are consecutive
---
 src/gallium/include/pipe/p_defines.h | 378 
+++

 1 file changed, 205 insertions(+), 173 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h
b/src/gallium/include/pipe/p_defines.h
index 1aef21d..6bb180d 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -51,49 +51,56 @@ enum pipe_error
/* TODO */
 };

+enum {
+   PIPE_BLENDFACTOR_ONE = 1,
+   PIPE_BLENDFACTOR_SRC_COLOR,
+   PIPE_BLENDFACTOR_SRC_ALPHA,
+   PIPE_BLENDFACTOR_DST_ALPHA,
+   PIPE_BLENDFACTOR_DST_COLOR,
+   PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE,
+   PIPE_BLENDFACTOR_CONST_COLOR,
+   PIPE_BLENDFACTOR_CONST_ALPHA,
+   PIPE_BLENDFACTOR_SRC1_COLOR,
+   PIPE_BLENDFACTOR_SRC1_ALPHA,
+
+   PIPE_BLENDFACTOR_ZERO = 0x11,
+   PIPE_BLENDFACTOR_INV_SRC_COLOR,
+   PIPE_BLENDFACTOR_INV_SRC_ALPHA,
+   PIPE_BLENDFACTOR_INV_DST_ALPHA,
+   PIPE_BLENDFACTOR_INV_DST_COLOR,
+
+   PIPE_BLENDFACTOR_INV_CONST_COLOR = 0x17,
+   PIPE_BLENDFACTOR_INV_CONST_ALPHA,
+   PIPE_BLENDFACTOR_INV_SRC1_COLOR,
+   PIPE_BLENDFACTOR_INV_SRC1_ALPHA,
+};
+
+enum {
+   PIPE_BLEND_ADD,
+   PIPE_BLEND_SUBTRACT,
+   PIPE_BLEND_REVERSE_SUBTRACT,
+   PIPE_BLEND_MIN,
+   PIPE_BLEND_MAX,
+};

-#define PIPE_BLENDFACTOR_ONE 0x1
-#define PIPE_BLENDFACTOR_SRC_COLOR   0x2
-#define PIPE_BLENDFACTOR_SRC_ALPHA   0x3
-#define PIPE_BLENDFACTOR_DST_ALPHA   0x4
-#define PIPE_BLENDFACTOR_DST_COLOR   0x5
-#define PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE  0x6
-#define PIPE_BLENDFACTOR_CONST_COLOR 0x7
-#define PIPE_BLENDFACTOR_CONST_ALPHA 0x8
-#define PIPE_BLENDFACTOR_SRC1_COLOR  0x9
-#define PIPE_BLENDFACTOR_SRC1_ALPHA  0x0A
-#define PIPE_BLENDFACTOR_ZERO0x11
-#define PIPE_BLENDFACTOR_INV_SRC_COLOR   0x12
-#define PIPE_BLENDFACTOR_INV_SRC_ALPHA   0x13
-#define PIPE_BLENDFACTOR_INV_DST_ALPHA   0x14
-#define PIPE_BLENDFACTOR_INV_DST_COLOR   0x15
-#define PIPE_BLENDFACTOR_INV_CONST_COLOR 0x17
-#define PIPE_BLENDFACTOR_INV_CONST_ALPHA 0x18
-#define PIPE_BLENDFACTOR_INV_SRC1_COLOR  0x19
-#define PIPE_BLENDFACTOR_INV_SRC1_ALPHA  0x1A
-
-#define PIPE_BLEND_ADD   0
-#define PIPE_BLEND_SUBTRACT  1
-#define PIPE_BLEND_REVERSE_SUBTRACT  2
-#define PIPE_BLEND_MIN   3
-#define PIPE_BLEND_MAX   4
-
-#define PIPE_LOGICOP_CLEAR0
-#define PIPE_LOGICOP_NOR  1
-#define PIPE_LOGICOP_AND_INVERTED 2
-#define PIPE_LOGICOP_COPY_INVERTED3
-#define PIPE_LOGICOP_AND_REVERSE  4
-#define PIPE_LOGICOP_INVERT   5
-#define PIPE_LOGICOP_XOR  6
-#define PIPE_LOGICOP_NAND 7
-#define PIPE_LOGICOP_AND  8
-#define PIPE_LOGICOP_EQUIV9
-#define PIPE_LOGICOP_NOOP 10
-#define PIPE_LOGICOP_OR_INVERTED  11
-#define PIPE_LOGICOP_COPY 12
-#define PIPE_LOGICOP_OR_REVERSE   13
-#define PIPE_LOGICOP_OR   14
-#define PIPE_LOGICOP_SET  15
+enum {
+   PIPE_LOGICOP_CLEAR,
+   PIPE_LOGICOP_NOR,
+   PIPE_LOGICOP_AND_INVERTED,
+   PIPE_LOGICOP_COPY_INVERTED,
+   PIPE_LOGICOP_AND_REVERSE,
+   PIPE_LOGICOP_INVERT,
+   PIPE_LOGICOP_XOR,
+   PIPE_LOGICOP_NAND,
+   PIPE_LOGICOP_AND,
+   PIPE_LOGICOP_EQUIV,
+   PIPE_LOGICOP_NOOP,
+   PIPE_LOGICOP_OR_INVERTED,
+   PIPE_LOGICOP_COPY,
+   PIPE_LOGICOP_OR_REVERSE,
+   PIPE_LOGICOP_OR,
+   PIPE_LOGICOP_SET,
+};

 #define PIPE_MASK_R  0x1
 #define PIPE_MASK_G  0x2
@@ -110,19 +117,23 @@ enum pipe_error
  * Inequality functions.  Used for depth test, stencil compare, alpha
  * test, shadow compare, etc.
  */
-#define PIPE_FUNC_NEVER0
-#define PIPE_FUNC_LESS 1
-#define PIPE_FUNC_EQUAL2
-#define PIPE_FUNC_LEQUAL   3
-#define PIPE_FUNC_GREATER  4
-#define PIPE_FUNC_NOTEQUAL 5
-#define PIPE_FUNC_GEQUAL   6
-#define PIPE_FUNC_ALWAYS   7
+enum {
+   PIPE_FUNC_NEVER,
+   PIPE_FUNC_LESS,
+   PIPE_FUNC_EQUAL,
+   PIPE_FUNC_LEQUAL,
+   PIPE_FUNC_GREATER,
+   PIPE_FUNC_NOTEQUAL,
+   PIPE_FUNC_GEQUAL,
+   PIPE_FUNC_ALWAYS,
+};

 /** Polygon fill mode */
-#define PIPE_POLYGON_MODE_FILL  0
-#define PIPE_POLYGON_MODE_LINE  1
-#define PIPE_POLYGON_MODE_POINT 2
+enum {
+   PIPE_POLYGON_MODE_FILL,
+   PIPE_POLYGON_MODE_LINE,
+   PIPE_POLYGON_MODE_POINT,
+};

 /** Polygon face specification, eg for culling */
 #define PIPE_FACE_NONE   0
@@ -131,60 +142,72 @@ enum pipe_error
 #define PIPE_FACE_FRONT_AND_BACK (PIPE_FACE_FRONT | PIPE_FACE_BACK)

 /** Stencil ops */
-#define PIPE_STENCIL_OP_KEEP   0
-#define PIPE_STENCIL_OP_ZERO   1
-#define PIPE_STENCIL_OP_REPLACE2
-#define PIPE_STENCIL_OP_INCR   3
-#define PIPE_STENCIL_OP_DECR   4
-#define PIPE_STENCIL_OP_INCR_WRAP  5
-#define PI

Re: [Mesa-dev] [PATCH 1/2] gallium/radeon: use enums in r600_query.h

2016-04-16 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-04-16 22:50, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_query.h | 43 
++---

 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.h
b/src/gallium/drivers/radeon/r600_query.h
index 9f3a917..6bb9374 100644
--- a/src/gallium/drivers/radeon/r600_query.h
+++ b/src/gallium/drivers/radeon/r600_query.h
@@ -40,26 +40,29 @@ struct r600_query;
 struct r600_query_hw;
 struct r600_resource;

-#define R600_QUERY_DRAW_CALLS           (PIPE_QUERY_DRIVER_SPECIFIC + 0)
-#define R600_QUERY_REQUESTED_VRAM       (PIPE_QUERY_DRIVER_SPECIFIC + 1)
-#define R600_QUERY_REQUESTED_GTT        (PIPE_QUERY_DRIVER_SPECIFIC + 2)
-#define R600_QUERY_BUFFER_WAIT_TIME     (PIPE_QUERY_DRIVER_SPECIFIC + 3)
-#define R600_QUERY_NUM_CS_FLUSHES       (PIPE_QUERY_DRIVER_SPECIFIC + 4)
-#define R600_QUERY_NUM_BYTES_MOVED      (PIPE_QUERY_DRIVER_SPECIFIC + 5)
-#define R600_QUERY_VRAM_USAGE           (PIPE_QUERY_DRIVER_SPECIFIC + 6)
-#define R600_QUERY_GTT_USAGE            (PIPE_QUERY_DRIVER_SPECIFIC + 7)
-#define R600_QUERY_GPU_TEMPERATURE      (PIPE_QUERY_DRIVER_SPECIFIC + 8)
-#define R600_QUERY_CURRENT_GPU_SCLK     (PIPE_QUERY_DRIVER_SPECIFIC + 9)
-#define R600_QUERY_CURRENT_GPU_MCLK     (PIPE_QUERY_DRIVER_SPECIFIC + 10)
-#define R600_QUERY_GPU_LOAD             (PIPE_QUERY_DRIVER_SPECIFIC + 11)
-#define R600_QUERY_NUM_COMPILATIONS     (PIPE_QUERY_DRIVER_SPECIFIC + 12)
-#define R600_QUERY_NUM_SHADERS_CREATED  (PIPE_QUERY_DRIVER_SPECIFIC + 13)
-#define R600_QUERY_GPIN_ASIC_ID         (PIPE_QUERY_DRIVER_SPECIFIC + 14)
-#define R600_QUERY_GPIN_NUM_SIMD        (PIPE_QUERY_DRIVER_SPECIFIC + 15)
-#define R600_QUERY_GPIN_NUM_RB          (PIPE_QUERY_DRIVER_SPECIFIC + 16)
-#define R600_QUERY_GPIN_NUM_SPI         (PIPE_QUERY_DRIVER_SPECIFIC + 17)
-#define R600_QUERY_GPIN_NUM_SE          (PIPE_QUERY_DRIVER_SPECIFIC + 18)
-#define R600_QUERY_FIRST_PERFCOUNTER    (PIPE_QUERY_DRIVER_SPECIFIC + 100)

+enum {
+   R600_QUERY_DRAW_CALLS = PIPE_QUERY_DRIVER_SPECIFIC,
+   R600_QUERY_REQUESTED_VRAM,
+   R600_QUERY_REQUESTED_GTT,
+   R600_QUERY_BUFFER_WAIT_TIME,
+   R600_QUERY_NUM_CS_FLUSHES,
+   R600_QUERY_NUM_BYTES_MOVED,
+   R600_QUERY_VRAM_USAGE,
+   R600_QUERY_GTT_USAGE,
+   R600_QUERY_GPU_TEMPERATURE,
+   R600_QUERY_CURRENT_GPU_SCLK,
+   R600_QUERY_CURRENT_GPU_MCLK,
+   R600_QUERY_GPU_LOAD,
+   R600_QUERY_NUM_COMPILATIONS,
+   R600_QUERY_NUM_SHADERS_CREATED,
+   R600_QUERY_GPIN_ASIC_ID,
+   R600_QUERY_GPIN_NUM_SIMD,
+   R600_QUERY_GPIN_NUM_RB,
+   R600_QUERY_GPIN_NUM_SPI,
+   R600_QUERY_GPIN_NUM_SE,
+
+   R600_QUERY_FIRST_PERFCOUNTER = PIPE_QUERY_DRIVER_SPECIFIC + 100,
+};

 enum {
R600_QUERY_GROUP_GPIN = 0,
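
The conversion relies on the C rule that an enumerator without an
initializer is the previous one plus one, so only the first value and any
gap need to be spelled out. A small sketch with hypothetical TOY_ names and
an assumed base value of 256 (the real PIPE_QUERY_DRIVER_SPECIFIC value is
not shown in the patch):

```c
#include <assert.h>

#define TOY_QUERY_DRIVER_SPECIFIC 256   /* assumed base, for illustration only */

enum {
   TOY_QUERY_DRAW_CALLS = TOY_QUERY_DRIVER_SPECIFIC,  /* base + 0 */
   TOY_QUERY_REQUESTED_VRAM,                          /* base + 1 */
   TOY_QUERY_REQUESTED_GTT,                           /* base + 2 */

   /* An explicit initializer restarts the count, preserving the old gap. */
   TOY_QUERY_FIRST_PERFCOUNTER = TOY_QUERY_DRIVER_SPECIFIC + 100,
   TOY_QUERY_SECOND_PERFCOUNTER,                      /* base + 101 */
};

/* Returns the distance of a query id from the driver-specific base,
 * i.e. the number the old #define added explicitly. */
static int toy_query_offset(int query)
{
   assert(query >= TOY_QUERY_DRIVER_SPECIFIC);
   return query - TOY_QUERY_DRIVER_SPECIFIC;
}
```

The blank line before the perfcounter entry in the patch marks the same kind
of jump as the explicit initializer here.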




Re: [Mesa-dev] [PATCH 0/4] Implement ARB_clear_texture for radeon drivers

2016-04-15 Thread eocallaghan

Hi Jakob,

Unfortunately ARB_clear_texture is not as straightforward as taking the
implementation from nouveau. You will notice I sent an essentially
identical series to the ML last year as an RFC. You can find that work
sitting on my GitHub here:

 https://github.com/victoredwardocallaghan/mesa-GLwork

Also, I am somewhat surprised that the arb_clear_texture-float piglit
passed for you. What hardware did you test it on exactly?

In any case, as Ilia correctly pointed out, this implementation relies
specifically on nouveau's somewhat special local versions of the usual
gallium helpers, and thus it isn't correct under the usual conditions.
More precisely, changing the surface condition after the create_surface
callback is considered illegal under the usual gallium helpers and
framework.

Unfortunately this series is Nacked-by: Edward O'Callaghan 



Kind Regards,
Edward.

On 2016-04-16 02:33, Jakob Sinclair wrote:
This series of patches implements ARB_clear_texture for r600 and radeonsi.
I only tested this with the radeonsi driver and just assumed it would work
on the r600 driver. If someone could test this with the r600 driver it would
be wonderful. This implementation was mostly based on the nouveau
implementation of the same function. I don't have push access so someone
reviewing this can push it.

Regards
Jakob Sinclair

Jakob Sinclair (4):
  gallium/radeon: add clear_texture function
  gallium/radeonsi: enable ARB_clear_texture
  gallium/r600: enable ARB_clear_texture
  docs/GL3.txt: mark ARB_clear_texture as done for r600 and radeonsi

 docs/GL3.txt  |  2 +-
 src/gallium/drivers/r600/r600_pipe.c  |  2 +-
 src/gallium/drivers/radeon/r600_texture.c | 72 
+++

 src/gallium/drivers/radeonsi/si_pipe.c|  2 +-
 4 files changed, 75 insertions(+), 3 deletions(-)




Re: [Mesa-dev] [PATCH v2 01/20] radeonsi: lower compute shader arguments

2016-04-13 Thread eocallaghan

Patches 2-4, 7-8, 12-14 & 17 are all:

 Reviewed-by: Edward O'Callaghan 

The series was:

 Tested-by: Edward O'Callaghan 

On 2016-04-14 05:29, Bas Nieuwenhuizen wrote:

Signed-off-by: Bas Nieuwenhuizen 
Reviewed-by: Marek Olšák 
Reviewed-by: Nicolai Hähnle 
---
 src/gallium/drivers/radeonsi/si_shader.c | 41 


 src/gallium/drivers/radeonsi/si_shader.h |  7 ++
 2 files changed, 48 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index c58467d..1ccdcac 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1282,6 +1282,36 @@ static void declare_system_value(
value = get_primitive_id(&radeon_bld->soa.bld_base, 0);
break;

+   case TGSI_SEMANTIC_GRID_SIZE:
+   value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_GRID_SIZE);
+   break;
+
+   case TGSI_SEMANTIC_BLOCK_SIZE:
+   {
+   LLVMValueRef values[3];
+   unsigned i;
+   unsigned *properties = ctx->shader->selector->info.properties;
+   unsigned sizes[3] = {
+   properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH],
+   properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT],
+   properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH]
+   };
+
+   for (i = 0; i < 3; ++i)
+   values[i] = lp_build_const_int32(gallivm, sizes[i]);
+
+   value = lp_build_gather_values(gallivm, values, 3);
+   break;
+   }
+
+   case TGSI_SEMANTIC_BLOCK_ID:
+   value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_BLOCK_ID);
+   break;
+
+   case TGSI_SEMANTIC_THREAD_ID:
+   value = LLVMGetParam(radeon_bld->main_fn, SI_PARAM_THREAD_ID);
+   break;
+
default:
assert(!"unknown system value");
return;
@@ -4823,6 +4853,14 @@ static void create_function(struct
si_shader_context *ctx)
}
break;

+   case TGSI_PROCESSOR_COMPUTE:
+   params[SI_PARAM_GRID_SIZE] = v3i32;
+   params[SI_PARAM_BLOCK_ID] = v3i32;
+   last_sgpr = SI_PARAM_BLOCK_ID;
+
+   params[SI_PARAM_THREAD_ID] = v3i32;
+   num_params = SI_PARAM_THREAD_ID + 1;
+   break;
default:
assert(0 && "unimplemented shader");
return;
@@ -5600,6 +5638,7 @@ void si_dump_shader_key(unsigned shader, union
si_shader_key *key, FILE *f)
break;

case PIPE_SHADER_GEOMETRY:
+   case PIPE_SHADER_COMPUTE:
break;

case PIPE_SHADER_FRAGMENT:
@@ -5784,6 +5823,8 @@ int si_compile_tgsi_shader(struct si_screen 
*sscreen,

else
bld_base->emit_epilogue = si_llvm_return_fs_outputs;
break;
+   case TGSI_PROCESSOR_COMPUTE:
+   break;
default:
assert(!"Unsupported shader type");
return -1;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h
b/src/gallium/drivers/radeonsi/si_shader.h
index 013c8a2..5043d43 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -91,6 +91,7 @@ struct radeon_shader_reloc;
 #define SI_SGPR_TCS_OUT_LAYOUT 11 /* TCS & TES only */
 #define SI_SGPR_TCS_IN_LAYOUT  12 /* TCS only */
 #define SI_SGPR_ALPHA_REF  10 /* PS only */
+#define SI_SGPR_GRID_SIZE  10 /* CS only */

 #define SI_VS_NUM_USER_SGPR15 /* API VS */
 #define SI_ES_NUM_USER_SGPR14 /* API VS */
@@ -100,6 +101,7 @@ struct radeon_shader_reloc;
 #define SI_GS_NUM_USER_SGPR10
 #define SI_GSCOPY_NUM_USER_SGPR4
 #define SI_PS_NUM_USER_SGPR11
+#define SI_CS_NUM_USER_SGPR13

 /* LLVM function parameter indices */
 #define SI_PARAM_RW_BUFFERS0
@@ -173,6 +175,11 @@ struct radeon_shader_reloc;
 #define SI_PARAM_SAMPLE_COVERAGE   21
 #define SI_PARAM_POS_FIXED_PT  22

+/* CS only parameters */
+#define SI_PARAM_GRID_SIZE 5
+#define SI_PARAM_BLOCK_ID  6
+#define SI_PARAM_THREAD_ID 7
+
 #define SI_NUM_PARAMS (SI_PARAM_POS_FIXED_PT + 9) /* +8 for COLOR[0..1] */

 struct si_shader;




Re: [Mesa-dev] [PATCH 1/3] glsl: removing double semi-colons

2016-04-13 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-04-14 02:43, Jakob Sinclair wrote:

Trivial change. Removing unnecessary semi-colons from the code.
I don't have push access so someone reviewing this can push it.

Signed-off-by: Jakob Sinclair 
---
 src/compiler/glsl/ast_function.cpp  | 2 +-
 src/compiler/glsl/ir_rvalue_visitor.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ast_function.cpp
b/src/compiler/glsl/ast_function.cpp
index db68d5d..f50c7bf 100644
--- a/src/compiler/glsl/ast_function.cpp
+++ b/src/compiler/glsl/ast_function.cpp
@@ -1690,7 +1690,7 @@ process_record_constructor(exec_list 
*instructions,

   constructor_type->fields.structure[i].name,
   ir->type->name,
   
constructor_type->fields.structure[i].type->name);

- return ir_rvalue::error_value(ctx);;
+ return ir_rvalue::error_value(ctx);
   }

   node = node->next;
diff --git a/src/compiler/glsl/ir_rvalue_visitor.cpp
b/src/compiler/glsl/ir_rvalue_visitor.cpp
index 6ab6cf0..addcc68 100644
--- a/src/compiler/glsl/ir_rvalue_visitor.cpp
+++ b/src/compiler/glsl/ir_rvalue_visitor.cpp
@@ -146,7 +146,7 @@ ir_rvalue_base_visitor::rvalue_visit(ir_discard 
*ir)

 ir_visitor_status
 ir_rvalue_base_visitor::rvalue_visit(ir_return *ir)
 {
-   handle_rvalue(&ir->value);;
+   handle_rvalue(&ir->value);
return visit_continue;
 }




Re: [Mesa-dev] ARB_framebuffer_no_attachments for llvm and soft pipes

2016-04-12 Thread eocallaghan

On 2016-04-11 22:27, Roland Scheidegger wrote:

Am 10.04.2016 um 09:41 schrieb Edward O'Callaghan:

All the piglits pass for these two as-is. However, some of the piglits
require SSBO support to run. I can't see why anything would actually
fail, but I thought I would make a note of it in case someone feels this
patch should be held back until SSBO support is in both pipe drivers.
If not, we should be golden to flick these on too.

Edward O'Callaghan (1):
  llvmpipe,softpipe: Enable ARB_framebuffer_no_attachments



I'm not sure this is really a good idea (at least for llvmpipe). At
least the number of layers used internally is wrong (fb_max_layer is
going to be ~0 as it is always derived from the attached buffers). I
suppose though it might not actually matter. But I'm not sure if there's
really any point in this extension without images - while I initially
thought it wouldn't be able to do anything this isn't quite true as
there is indeed one side effect of fs possible even without images,
which is query results (such as occlusion queries, which is what piglit
uses).
I think though just about all practical use cases really would require
image support, so I'm not convinced exposing this just because we can is
worth it. But OTOH why not...


That was essentially how I rationalized it also.



Reviewed-by: Roland Scheidegger 


I haven't applied for a commit bit yet, so if you still want it you'll
have to merge it.


Kind Regards,
Edward.
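
Roland's remark about fb_max_layer can be pictured with the sketch below.
It assumes the maximum layer is computed as a minimum over the attached
buffers, starting from ~0, which is how the message describes it. The
function and parameter names are hypothetical and this is not the llvmpipe
code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum of (layer count - 1) over all attachments; ~0u when there are none. */
static unsigned toy_fb_max_layer(const unsigned *attachment_layers, size_t count)
{
   unsigned max_layer = ~0u;

   assert(attachment_layers != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      unsigned last;

      assert(attachment_layers[i] > 0);
      last = attachment_layers[i] - 1;
      if (last < max_layer)
         max_layer = last;
   }
   return max_layer;
}
```

With no attachments the loop body never runs, so the initial ~0 is returned
unchanged, which is the case the review is concerned about.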


Re: [Mesa-dev] [PATCH 1/2] tgsi: fix buffer overflow

2016-04-12 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-04-13 11:06, Thomas Hindoe Paaboel Andersen wrote:

Increase r to four channels as rgba is written to it
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index fb51051..41dd0f0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -4011,7 +4011,7 @@ static void
 exec_atomop_buf(struct tgsi_exec_machine *mach,
 const struct tgsi_full_instruction *inst)
 {
-   union tgsi_exec_channel r[3];
+   union tgsi_exec_channel r[4];
union tgsi_exec_channel value[4], value2[4];
float rgba[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
float rgba2[TGSI_NUM_CHANNELS][TGSI_QUAD_SIZE];
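
The overflow comes from writing four channels into a three-element array.
One way to make that class of bug harder to write, sketched here with
hypothetical TOY_ names rather than the real TGSI types, is to size the
array from the same constant the loop uses:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_CHANNELS 4   /* r, g, b, a */
#define TOY_QUAD_SIZE 4

union toy_channel {
   float f[TOY_QUAD_SIZE];
};

/* Copies rgba into r[]; the array bound and the loop share TOY_NUM_CHANNELS,
 * so the destination cannot be one channel short. */
static void toy_store_rgba(union toy_channel r[TOY_NUM_CHANNELS],
                           float rgba[TOY_NUM_CHANNELS][TOY_QUAD_SIZE])
{
   assert(r != NULL && rgba != NULL);

   for (int chan = 0; chan < TOY_NUM_CHANNELS; chan++) {
      for (int j = 0; j < TOY_QUAD_SIZE; j++)
         r[chan].f[j] = rgba[chan][j];
   }
}
```

The patch itself takes the smaller step of changing the literal 3 to 4,
which is enough to fix the overflow.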




Re: [Mesa-dev] [PATCH] util: Fix race condition on libgcrypt initialization

2016-04-12 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-04-13 08:10, Mark Janes wrote:

Fixes intermittent Vulkan CTS failures within the test groups:
dEQP-VK.api.object_management.multithreaded_per_thread_device
dEQP-VK.api.object_management.multithreaded_per_thread_resources
dEQP-VK.api.object_management.multithreaded_shared_resources

Signed-off-by: Mark Janes 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94904
---
 src/util/mesa-sha1.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/util/mesa-sha1.c b/src/util/mesa-sha1.c
index faa1c87..ca6b89b 100644
--- a/src/util/mesa-sha1.c
+++ b/src/util/mesa-sha1.c
@@ -175,21 +175,24 @@ _mesa_sha1_final(struct mesa_sha1 *ctx, unsigned
char result[20])
 #elif defined(HAVE_SHA1_IN_LIBGCRYPT)   /* Use libgcrypt for SHA1 */

 #include 
+#include "c11/threads.h"
+
+static void _mesa_libgcrypt_init(void)
+{
+   if (!gcry_check_version(NULL))
+  return;
+   gcry_control(GCRYCTL_DISABLE_SECMEM, 0);
+   gcry_control(GCRYCTL_INITIALIZATION_FINISHED, 0);
+}

 struct mesa_sha1 *
 _mesa_sha1_init(void)
 {
-   static int init;
+   static once_flag flag = ONCE_FLAG_INIT;
gcry_md_hd_t h;
gcry_error_t err;

-   if (!init) {
-  if (!gcry_check_version(NULL))
- return NULL;
-  gcry_control(GCRYCTL_DISABLE_SECMEM, 0);
-  gcry_control(GCRYCTL_INITIALIZATION_FINISHED, 0);
-  init = 1;
-   }
+   call_once(&flag, _mesa_libgcrypt_init);

err = gcry_md_open(&h, GCRY_MD_SHA1, 0);
if (err)
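
The run-once pattern this patch relies on can be sketched as below. This is an illustration only: it swaps in POSIX pthread_once for the C11 call_once used in the patch (the two have the same "run the initializer exactly once" contract), and library_init, worker and run_workers are hypothetical names, with a counter standing in for the libgcrypt setup calls.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_once_t init_once = PTHREAD_ONCE_INIT;
static int init_calls;  /* how many times the initializer body ran */

/* Hypothetical stand-in for _mesa_libgcrypt_init(). */
static void library_init(void)
{
    init_calls++;
}

/* Every thread asks for initialization; only one actually performs it. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_once(&init_once, library_init);
    return NULL;
}

/* Starts up to 16 racing threads and returns how many times the
 * initializer ran, or -1 if a thread could not be created. */
static int run_workers(int n)
{
    pthread_t threads[16];
    int created = 0;

    if (n > 16)
        n = 16;
    for (; created < n; created++) {
        if (pthread_create(&threads[created], NULL, worker, NULL) != 0)
            break;
    }
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    return created == n ? init_calls : -1;
}
```

With a plain `static int init` flag, two threads can both observe 0 and both run the setup; the once-flag closes that window without an explicit mutex.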


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: enable GLSL 4.20 and therefore OpenGL 4.2

2016-04-12 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-04-13 07:42, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

This is the last necessary bit for OpenGL 4.2 support. All driver-specific
functionality has already been implemented as part of extensions.
---
 docs/relnotes/11.3.0.html  | 7 ---
 src/gallium/drivers/radeonsi/si_pipe.c | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/relnotes/11.3.0.html b/docs/relnotes/11.3.0.html
index 9860ab0..1815cfc 100644
--- a/docs/relnotes/11.3.0.html
+++ b/docs/relnotes/11.3.0.html
@@ -22,11 +22,11 @@ People who are concerned with stability and
reliability should stick
 with a previous release or wait for Mesa 11.3.1.
 
 
-Mesa 11.3.0 implements the OpenGL 4.1 API, but the version reported by
+Mesa 11.3.0 implements the OpenGL 4.2 API, but the version reported by
 glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
 glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
-Some drivers don't support all the features required in OpenGL 4.1.  OpenGL
-4.1 is only available if requested at context creation
+Some drivers don't support all the features required in OpenGL 4.2.  OpenGL
+4.2 is only available if requested at context creation
 because compatibility contexts are not supported.
 

@@ -44,6 +44,7 @@ Note: some of the new features are only available
with certain drivers.
 

 
+OpenGL 4.2 on radeonsi
 GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi
 GL_ARB_internalformat_query2 on all drivers
 GL_ARB_robust_buffer_access_behavior on radeonsi
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index 6b4b3d2..1dd7338 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -336,7 +336,8 @@ static int si_get_param(struct pipe_screen*
pscreen, enum pipe_cap param)
return 4;

case PIPE_CAP_GLSL_FEATURE_LEVEL:
-   return HAVE_LLVM >= 0x0307 ? 410 : 330;
+   return HAVE_LLVM >= 0x0309 ? 420 :
+  HAVE_LLVM >= 0x0307 ? 410 : 330;

case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->b.info.vram_size, 0x);
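
The chained conditional in the si_get_param hunk can be read as the following standalone function. This is a sketch for illustration: glsl_feature_level is a hypothetical name, and the version encoding (major in the high byte, minor in the low byte, so 0x0309 means LLVM 3.9) is an assumption inferred from the constants in the patch.

```c
/* Maps an LLVM version, assumed to be encoded as (major << 8) | minor,
 * to the GLSL feature level the patch reports for it. */
static int glsl_feature_level(unsigned llvm_version)
{
    if (llvm_version >= 0x0309)
        return 420;   /* new in this patch */
    if (llvm_version >= 0x0307)
        return 410;
    return 330;
}
```

Ordering the checks from newest to oldest keeps each branch a simple lower bound, which is what the nested ternary in the patch does as well.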


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] R600-GCN: Improving performance on APUs & IGPs

2016-04-12 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

Thanks for working on this,
Edward.

On 2016-04-12 05:02, Marek Olšák wrote:

Hi,

This disables buffer moves between VRAM and GTT by setting both of
them as preferred heaps for APUs and IGPs.

Allocations go to VRAM if there is free space. If not, they go to GTT.
If a buffer is evicted from VRAM to GTT, it will stay there. If it's
evicted from GTT to swap, it can later be moved to either VRAM or GTT
(whichever has free space).
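
As a rough illustration of the placement rule just described, here is a toy model. It is not the kernel memory manager or the winsys code; enum heap, struct heap_state and place_buffer are hypothetical names, and free space is reduced to two counters.

```c
#include <stdint.h>

enum heap { HEAP_VRAM, HEAP_GTT, HEAP_NONE };

/* Toy view of the two heaps: only the free byte counts matter here. */
struct heap_state {
    uint64_t vram_free;
    uint64_t gtt_free;
};

/* With both heaps preferred, try VRAM first and fall back to GTT.
 * HEAP_NONE means something would have to be evicted first. */
static enum heap place_buffer(const struct heap_state *s, uint64_t size)
{
    if (size <= s->vram_free)
        return HEAP_VRAM;
    if (size <= s->gtt_free)
        return HEAP_GTT;
    return HEAP_NONE;
}
```

Because neither heap is treated as a "wrong" location, a buffer that lands in GTT has no reason to be migrated back later, which is where the saved buffer moves come from.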

Tonga with 128 MB VRAM (decreased for testing) is 2x faster in Heaven
(1600x900 Ultra noAA) if I force has_dedicated_vram=false in Mesa,
leading to literally 0 buffer evictions.

Please review.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/5] R600, GCN: Guard Band support

2016-04-10 Thread eocallaghan

I didn't see anything obviously wrong so,

Reviewed-by: Edward O'Callaghan 

But I have some general questions about guard band. I'm not sure if this is
the right place, but I'll just ask anyway:
From my somewhat naive understanding, guard band can lead to small gaps
between polygons. In this implementation, what exactly happens at the
screen bounds, where a clipped triangle could wind up with different edge
slopes? Does this clip to the guard band edge or to the screen edge, and
if so, what is the pixel cost on SI with that?


On 2016-04-11 08:34, Marek Olšák wrote:

Hi,

This patch series adds Guard Band support into r600g and radeonsi.

It first implements the Guard Band in radeonsi, then it moves all
radeonsi scissor & viewport code into gallium/radeon, and then r600g
is switched to it and its original scissor & viewport code is deleted.

The differences between the R600 and GCN code are almost none.

This should improve performance if clipping is the bottleneck.

Grigori Goronzy started this originally.

Please review.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix typo in r600 register definitions

2016-04-09 Thread eocallaghan

Acked-by: Edward O'Callaghan 

On 2016-04-09 09:12, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/r600/r600d.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600d.h
b/src/gallium/drivers/r600/r600d.h
index 3d223ed..ef99573 100644
--- a/src/gallium/drivers/r600/r600d.h
+++ b/src/gallium/drivers/r600/r600d.h
@@ -780,7 +780,7 @@
 #define   S_028D0C_STENCIL_COMPRESS_DISABLE(x)   (((x) & 0x1) << 5)
 #define   S_028D0C_DEPTH_COMPRESS_DISABLE(x)     (((x) & 0x1) << 6)
 #define   S_028D0C_COPY_CENTROID(x)              (((x) & 0x1) << 7)
-#define   S_028D0C_COPY_SAMPLE(x)                (((x) & 0x1) << 8)
+#define   S_028D0C_COPY_SAMPLE(x)                (((x) & 0x03) << 8)
 #define   S_028D0C_R700_PERFECT_ZPASS_COUNTS(x)  (((x) & 0x1) << 15)
 #define   S_028D0C_CONSERVATIVE_Z_EXPORT(x)      (((x) & 0x03) << 13)
 #define   G_028D0C_CONSERVATIVE_Z_EXPORT(x)      (((x) >> 13) & 0x03)
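
To see why the mask width matters in these setter macros, here is a small sketch with made-up field names. It assumes the field is two bits wide, which is what the new 0x03 mask suggests; S_FIELD_1BIT, S_FIELD_2BIT and G_FIELD_2BIT are hypothetical and only mirror the S_/G_ naming pattern of r600d.h.

```c
#include <stdint.h>

/* Hypothetical two-bit register field starting at bit 8. */
#define S_FIELD_1BIT(x)  (((uint32_t)(x) & 0x1) << 8)  /* mask too narrow */
#define S_FIELD_2BIT(x)  (((uint32_t)(x) & 0x3) << 8)  /* mask matches width */
#define G_FIELD_2BIT(x)  (((uint32_t)(x) >> 8) & 0x3)  /* read the field back */
```

With the one-bit mask, a value of 2 is silently truncated to 0 before it reaches the register word; with the two-bit mask it is stored intact and the getter returns it unchanged.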


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] gallium/radeon: move pipeline stat context flags to common code

2016-04-09 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

This definitely makes for a good cleanup, I was wondering about all the 
manual stuff myself..


On 2016-04-09 09:12, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.h | 5 -
 src/gallium/drivers/radeonsi/si_pipe.h| 3 ---
 src/gallium/drivers/radeonsi/si_state.c   | 8 
 src/gallium/drivers/radeonsi/si_state_draw.c  | 4 ++--
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 7da7736..57af0ff 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -50,7 +50,10 @@
 #define R600_RESOURCE_FLAG_FORCE_TILING        (PIPE_RESOURCE_FLAG_DRV_PRIV << 2)

 #define R600_CONTEXT_STREAMOUT_FLUSH   (1u << 0)
-#define R600_CONTEXT_PRIVATE_FLAG  (1u << 1)
+/* Pipeline & streamout query controls. */
+#define R600_CONTEXT_START_PIPELINE_STATS  (1u << 1)
+#define R600_CONTEXT_STOP_PIPELINE_STATS   (1u << 2)
+#define R600_CONTEXT_PRIVATE_FLAG  (1u << 3)

 /* special primitive types */
 #define R600_PRIM_RECTANGLE_LIST   PIPE_PRIM_MAX
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
b/src/gallium/drivers/radeonsi/si_pipe.h
index 8fcfcd2..f665c81 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -66,9 +66,6 @@
 /* Compute only. */
 #define SI_CONTEXT_FLUSH_WITH_INV_L2   (R600_CONTEXT_PRIVATE_FLAG << 13) /* TODO: merge with TC? */
 #define SI_CONTEXT_FLAG_COMPUTE        (R600_CONTEXT_PRIVATE_FLAG << 14)
-/* Pipeline & streamout query controls. */
-#define SI_CONTEXT_START_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 15)
-#define SI_CONTEXT_STOP_PIPELINE_STATS  (R600_CONTEXT_PRIVATE_FLAG << 16)

 #define SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER (SI_CONTEXT_FLUSH_AND_INV_CB | \
                                               SI_CONTEXT_FLUSH_AND_INV_CB_META | \
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 0c46425..94130a9 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1358,11 +1358,11 @@ static void si_set_active_query_state(struct
pipe_context *ctx, boolean enable)

/* Pipeline stat & streamout queries. */
if (enable) {
-   sctx->b.flags &= ~SI_CONTEXT_STOP_PIPELINE_STATS;
-   sctx->b.flags |= SI_CONTEXT_START_PIPELINE_STATS;
+   sctx->b.flags &= ~R600_CONTEXT_STOP_PIPELINE_STATS;
+   sctx->b.flags |= R600_CONTEXT_START_PIPELINE_STATS;
} else {
-   sctx->b.flags &= ~SI_CONTEXT_START_PIPELINE_STATS;
-   sctx->b.flags |= SI_CONTEXT_STOP_PIPELINE_STATS;
+   sctx->b.flags &= ~R600_CONTEXT_START_PIPELINE_STATS;
+   sctx->b.flags |= R600_CONTEXT_STOP_PIPELINE_STATS;
}

/* Occlusion queries. */
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 105c5fb..40cad50 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -722,11 +722,11 @@ void si_emit_cache_flush(struct si_context
*si_ctx, struct r600_atom *atom)
}
}

-   if (sctx->flags & SI_CONTEXT_START_PIPELINE_STATS) {
+   if (sctx->flags & R600_CONTEXT_START_PIPELINE_STATS) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_PIPELINESTAT_START) |
EVENT_INDEX(0));
-   } else if (sctx->flags & SI_CONTEXT_STOP_PIPELINE_STATS) {
+   } else if (sctx->flags & R600_CONTEXT_STOP_PIPELINE_STATS) {
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cs, EVENT_TYPE(V_028A90_PIPELINESTAT_STOP) |
EVENT_INDEX(0));
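
The start/stop flag handling moved by this patch can be sketched as a pure function. This is illustrative only: the CTX_ names and set_active_query_state_flags are hypothetical, and the bit positions are copied from the new R600_CONTEXT_* definitions in the hunk above.

```c
#include <stdbool.h>

/* Bit positions taken from the patch; the names are placeholders. */
#define CTX_START_PIPELINE_STATS (1u << 1)
#define CTX_STOP_PIPELINE_STATS  (1u << 2)

/* Returns the updated flag word. The two requests are mutually
 * exclusive, so setting one always clears the other first. */
static unsigned set_active_query_state_flags(unsigned flags, bool enable)
{
    if (enable) {
        flags &= ~CTX_STOP_PIPELINE_STATS;
        flags |= CTX_START_PIPELINE_STATS;
    } else {
        flags &= ~CTX_START_PIPELINE_STATS;
        flags |= CTX_STOP_PIPELINE_STATS;
    }
    return flags;
}
```

Clearing the opposite bit means a quick disable/enable sequence before the next flush leaves only the most recent request pending, and unrelated bits in the word are untouched.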


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: implement and rely on set_active_query_state

2016-04-08 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-04-08 18:58, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_blit.c   |  3 ---
 src/gallium/drivers/radeonsi/si_pipe.h   |  4 
 src/gallium/drivers/radeonsi/si_state.c  | 32 +++-
 src/gallium/drivers/radeonsi/si_state_draw.c | 10 +
 4 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c
b/src/gallium/drivers/radeonsi/si_blit.c
index c5ea8b1..aed783f 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -52,8 +52,6 @@ static void si_blitter_begin(struct pipe_context
*ctx, enum si_blitter_op op)
 {
struct si_context *sctx = (struct si_context *)ctx;

-   r600_suspend_nontimer_queries(&sctx->b);
-
 	util_blitter_save_vertex_buffer_slot(sctx->blitter, sctx->vertex_buffer);
 	util_blitter_save_vertex_elements(sctx->blitter, sctx->vertex_elements);
util_blitter_save_vertex_shader(sctx->blitter, sctx->vs_shader.cso);
@@ -95,7 +93,6 @@ static void si_blitter_end(struct pipe_context *ctx)
struct si_context *sctx = (struct si_context *)ctx;

sctx->b.render_cond_force_off = false;
-   r600_resume_nontimer_queries(&sctx->b);
 }

 static unsigned u_max_sample(struct pipe_resource *r)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
b/src/gallium/drivers/radeonsi/si_pipe.h
index 4158fc5..8fcfcd2 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -66,6 +66,9 @@
 /* Compute only. */
 #define SI_CONTEXT_FLUSH_WITH_INV_L2   (R600_CONTEXT_PRIVATE_FLAG << 13) /* TODO: merge with TC? */
 #define SI_CONTEXT_FLAG_COMPUTE        (R600_CONTEXT_PRIVATE_FLAG << 14)
+/* Pipeline & streamout query controls. */
+#define SI_CONTEXT_START_PIPELINE_STATS (R600_CONTEXT_PRIVATE_FLAG << 15)
+#define SI_CONTEXT_STOP_PIPELINE_STATS  (R600_CONTEXT_PRIVATE_FLAG << 16)

 #define SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER (SI_CONTEXT_FLUSH_AND_INV_CB | \
                                               SI_CONTEXT_FLUSH_AND_INV_CB_META | \
@@ -289,6 +292,7 @@ struct si_context {
booldb_stencil_clear;
booldb_stencil_disable_expclear;
unsignedps_db_shader_control;
+   boolocclusion_queries_disabled;

/* Emitted draw state. */
int last_base_vertex;
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index a66bd30..6fbbb68 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1352,6 +1352,26 @@ static void *si_create_db_flush_dsa(struct
si_context *sctx)

 /* DB RENDER STATE */

+static void si_set_active_query_state(struct pipe_context *ctx, boolean enable)
+{
+   struct si_context *sctx = (struct si_context*)ctx;
+
+   /* Pipeline stat & streamout queries. */
+   if (enable) {
+   sctx->b.flags &= ~SI_CONTEXT_STOP_PIPELINE_STATS;
+   sctx->b.flags |= SI_CONTEXT_START_PIPELINE_STATS;
+   } else {
+   sctx->b.flags &= ~SI_CONTEXT_START_PIPELINE_STATS;
+   sctx->b.flags |= SI_CONTEXT_STOP_PIPELINE_STATS;
+   }
+
+   /* Occlusion queries. */
+   if (sctx->occlusion_queries_disabled != !enable) {
+   sctx->occlusion_queries_disabled = !enable;
+   si_mark_atom_dirty(sctx, &sctx->db_render_state);
+   }
+}
+
 static void si_set_occlusion_query_state(struct pipe_context *ctx, bool enable)
 {
struct si_context *sctx = (struct si_context*)ctx;
@@ -1386,7 +1406,8 @@ static void si_emit_db_render_state(struct
si_context *sctx, struct r600_atom *s
}

/* DB_COUNT_CONTROL (occlusion queries) */
-   if (sctx->b.num_occlusion_queries > 0) {
+   if (sctx->b.num_occlusion_queries > 0 &&
+   !sctx->occlusion_queries_disabled) {
bool perfect = sctx->b.num_perfect_occlusion_queries > 0;

if (sctx->b.chip_class >= CIK) {
@@ -3765,6 +3786,7 @@ void si_init_state_functions(struct si_context *sctx)
sctx->b.b.set_min_samples = si_set_min_samples;
sctx->b.b.set_tess_state = si_set_tess_state;

+   sctx->b.b.set_active_query_state = si_set_active_query_state;
sctx->b.set_occlusion_query_state = si_set_occlusion_query_state;
sctx->b.need_gfx_cs_space = si_need_gfx_cs_space;

@@ -3995,6 +4017,14 @@ static void si_init_config(struct si_context *sctx)
si_pm4_cmd_add(pm4, 0x8000);
si_pm4_cmd_end(pm4, false);

+   /* This enables pipeline stat & streamout queries.
+* They are only disabled by blits.
+*/
+   si_pm4_cmd_begin(pm4, PKT3_EVENT_WRITE);
+   si_pm4_cmd_add(pm4, EVENT_TYPE(V_028A90_PIPELINESTAT_START) |
+   EVENT_INDEX(0));
+   si_pm4_cmd_end(pm4,

Re: [Mesa-dev] [PATCH] radeonsi: fix mask checking when emitting scissors and viewports

2016-04-08 Thread eocallaghan

On 2016-04-08 19:00, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 8087d23..3894e1d 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -912,8 +912,10 @@ static void si_emit_scissors(struct si_context
*sctx, struct r600_atom *atom)
bool scissor_enable = sctx->queued.named.rasterizer->scissor_enable;

/* The simple case: Only 1 viewport is active. */
-   if (mask & 1 &&
-   !si_get_vs_info(sctx)->writes_viewport_index) {
+   if (!si_get_vs_info(sctx)->writes_viewport_index) {
+   if (!(mask & 1))


This seems a bit tentative. Did you want 1u here?


+   return;
+
 		radeon_set_context_reg_seq(cs, R_028250_PA_SC_VPORT_SCISSOR_0_TL, 2);
si_emit_one_scissor(cs, &sctx->viewports.states[0],
scissor_enable ? &states[0] : NULL);
@@ -960,8 +962,10 @@ static void si_emit_viewports(struct si_context
*sctx, struct r600_atom *atom)
unsigned mask = sctx->viewports.dirty_mask;

/* The simple case: Only 1 viewport is active. */
-   if (mask & 1 &&
-   !si_get_vs_info(sctx)->writes_viewport_index) {
+   if (!si_get_vs_info(sctx)->writes_viewport_index) {
+   if (!(mask & 1))
+   return;
+
radeon_set_context_reg_seq(cs, R_02843C_PA_CL_VPORT_XSCALE, 6);
radeon_emit(cs, fui(states[0].scale[0]));
radeon_emit(cs, fui(states[0].translate[0]));
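
The control-flow change in both hunks can be sketched as follows. This is a model, not the driver code: it assumes that the code after the "simple case" block emits every dirty slot, which the diff context does not show, and the names emit_path, old_path and new_path are hypothetical.

```c
#include <stdbool.h>

enum emit_path { EMIT_NOTHING, EMIT_SINGLE, EMIT_ALL_DIRTY };

/* Before the patch: both conditions guard the single-viewport path, so a
 * clean slot 0 falls through to the general path. */
static enum emit_path old_path(unsigned mask, bool writes_viewport_index)
{
    if ((mask & 1) && !writes_viewport_index)
        return EMIT_SINGLE;
    return EMIT_ALL_DIRTY;
}

/* After the patch: the shader check selects the path, and the mask only
 * decides whether slot 0 needs to be emitted at all. */
static enum emit_path new_path(unsigned mask, bool writes_viewport_index)
{
    if (!writes_viewport_index) {
        if (!(mask & 1))
            return EMIT_NOTHING;
        return EMIT_SINGLE;
    }
    return EMIT_ALL_DIRTY;
}
```

On the 1u question: when mask is declared unsigned, as it is in si_emit_viewports, the literal 1 is converted to unsigned before the AND, so `mask & 1` and `mask & 1u` give the same result.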


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/radeon: unify checking streamout enable state

2016-04-08 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-04-08 19:00, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/r600/r600_state_common.c  | 5 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h | 6 ++
 src/gallium/drivers/radeon/r600_streamout.c   | 6 --
 src/gallium/drivers/radeonsi/si_state_draw.c  | 3 +--
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state_common.c
b/src/gallium/drivers/r600/r600_state_common.c
index df41d3f..82babeb 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1841,8 +1841,7 @@ static void r600_draw_vbo(struct pipe_context
*ctx, const struct pipe_draw_info
ia_switch_on_eop = true;
}

-   if (rctx->b.streamout.streamout_enabled ||
-   rctx->b.streamout.prims_gen_query_enabled)
+   if (r600_get_strmout_en(&rctx->b))
partial_vs_wave = true;

radeon_set_context_reg(cs, CM_R_028AA8_IA_MULTI_VGT_PARAM,
@@ -2018,7 +2017,7 @@ static void r600_draw_vbo(struct pipe_context
*ctx, const struct pipe_draw_info
rctx->b.family == CHIP_RV635) {
/* if we have gs shader or streamout
   we need to do a wait idle after every draw */
-   if (rctx->gs_shader || rctx->b.streamout.streamout_enabled) {
+   if (rctx->gs_shader || r600_get_strmout_en(&rctx->b)) {
 			radeon_set_config_reg(cs, R_008040_WAIT_UNTIL, S_008040_WAIT_3D_IDLE(1));
}
}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 062c319..7da7736 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -639,6 +639,12 @@ r600_resource_reference(struct r600_resource
**ptr, struct r600_resource *res)
(struct pipe_resource *)res);
 }

+static inline bool r600_get_strmout_en(struct r600_common_context *rctx)
+{
+   return rctx->streamout.streamout_enabled ||
+  rctx->streamout.prims_gen_query_enabled;
+}
+
 static inline unsigned r600_tex_aniso_filter(unsigned filter)
 {
if (filter <= 1)   return 0;
diff --git a/src/gallium/drivers/radeon/r600_streamout.c
b/src/gallium/drivers/radeon/r600_streamout.c
index e977ed9..fc9ec48 100644
--- a/src/gallium/drivers/radeon/r600_streamout.c
+++ b/src/gallium/drivers/radeon/r600_streamout.c
@@ -311,12 +311,6 @@ void r600_emit_streamout_end(struct
r600_common_context *rctx)
  * are no buffers bound.
  */

-static bool r600_get_strmout_en(struct r600_common_context *rctx)
-{
-   return rctx->streamout.streamout_enabled ||
-  rctx->streamout.prims_gen_query_enabled;
-}
-
 static void r600_emit_streamout_enable(struct r600_common_context *rctx,
   struct r600_atom *atom)
 {
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 84b850a..3863e59 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -882,8 +882,7 @@ void si_draw_vbo(struct pipe_context *ctx, const
struct pipe_draw_info *info)
if ((sctx->b.family == CHIP_HAWAII ||
 sctx->b.family == CHIP_TONGA ||
 sctx->b.family == CHIP_FIJI) &&
-   (sctx->b.streamout.streamout_enabled ||
-sctx->b.streamout.prims_gen_query_enabled)) {
+   r600_get_strmout_en(&sctx->b)) {
sctx->b.flags |= SI_CONTEXT_VGT_STREAMOUT_SYNC;
}


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] svga: add some trivial null pointer checks

2016-04-06 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

+1 for defensive programming.

On 2016-04-07 06:00, Brian Paul wrote:

These small mallocs will probably never fail, but static analysis tools
may complain about the missing checks.
---
 src/gallium/drivers/svga/svga_pipe_blend.c| 3 +++
 src/gallium/drivers/svga/svga_pipe_depthstencil.c | 3 +++
 src/gallium/drivers/svga/svga_pipe_rasterizer.c   | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/src/gallium/drivers/svga/svga_pipe_blend.c
b/src/gallium/drivers/svga/svga_pipe_blend.c
index 0af80cd..0ba9313 100644
--- a/src/gallium/drivers/svga/svga_pipe_blend.c
+++ b/src/gallium/drivers/svga/svga_pipe_blend.c
@@ -142,6 +142,9 @@ svga_create_blend_state(struct pipe_context *pipe,
struct svga_blend_state *blend = CALLOC_STRUCT( svga_blend_state );
unsigned i;

+   if (!blend)
+  return NULL;
+
/* Fill in the per-rendertarget blend state.  We currently only
  * support independent blend enable and colormask per render target.
 */
diff --git a/src/gallium/drivers/svga/svga_pipe_depthstencil.c
b/src/gallium/drivers/svga/svga_pipe_depthstencil.c
index d84ed1d..83fcdc3 100644
--- a/src/gallium/drivers/svga/svga_pipe_depthstencil.c
+++ b/src/gallium/drivers/svga/svga_pipe_depthstencil.c
@@ -134,6 +134,9 @@ svga_create_depth_stencil_state(struct pipe_context *pipe,
struct svga_context *svga = svga_context(pipe);
struct svga_depth_stencil_state *ds = CALLOC_STRUCT( svga_depth_stencil_state );

+   if (!ds)
+  return NULL;
+
/* Don't try to figure out CW/CCW correspondence with
 * stencil[0]/[1] at this point.  Presumably this can change as
 * back/front face are modified.
diff --git a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
index 8e0db53..d397c95 100644
--- a/src/gallium/drivers/svga/svga_pipe_rasterizer.c
+++ b/src/gallium/drivers/svga/svga_pipe_rasterizer.c
@@ -161,6 +161,9 @@ svga_create_rasterizer_state(struct pipe_context *pipe,
struct svga_rasterizer_state *rast = CALLOC_STRUCT( svga_rasterizer_state );
struct svga_screen *screen = svga_screen(pipe->screen);

+   if (!rast)
+  return NULL;
+
/* need this for draw module. */
rast->templ = *templ;
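
The pattern these three hunks add can be shown in isolation. This is a generic sketch using plain calloc rather than Mesa's CALLOC_STRUCT macro; struct blend_state and create_blend_state are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical state object; the real svga structs carry more fields. */
struct blend_state {
    unsigned rt_count;
};

/* Allocates a zeroed state object and fills it in. Returns NULL on
 * allocation failure so the caller can report the error instead of
 * dereferencing a null pointer. */
static struct blend_state *create_blend_state(unsigned rt_count)
{
    struct blend_state *blend = calloc(1, sizeof(*blend));

    if (!blend)
        return NULL;

    blend->rt_count = rt_count;
    return blend;
}
```

The check sits immediately after the allocation and before the first field access, which is the placement static analysis tools look for.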


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] r600/compute: cleanup evergreen_compute.c

2016-04-06 Thread eocallaghan

Nice cleanup. This series is,

Reviewed-by: Edward O'Callaghan 


On 2016-04-07 07:40, Dave Airlie wrote:

This probably should have been cleaned up before merging, but we
were a bit lax with it. This is a bunch of cleanups and changes,
that make adding ARB_compute_support less of a task.

Dave.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/4] ARB_robust_buffer_access_behavior for radeonsi

2016-04-04 Thread eocallaghan

I had a hacked up version of this last week which was very similar.
This is much cleaner, hence this series is,

Reviewed-by: Edward O'Callaghan 

On 2016-04-04 21:41, Bas Nieuwenhuizen wrote:

This series implements ARB_robust_buffer_access_behavior for the radeonsi
driver.

There are some tests at:

https://github.com/BNieuwenhuizen/piglit

These have not been sent yet as they depend on robust access context
support in waffle.

Bas Nieuwenhuizen (4):
  radeonsi: use bounded indexing for constant buffers
  radeonsi: use bounded indexing for samplers
  expose ARB_robust_buffer_access_behavior
  radeonsi: mark ARB_robust_buffer_access_behavior as supported

 docs/GL3.txt |  2 +-
 docs/relnotes/11.3.0.html|  1 +
 src/gallium/docs/source/screen.rst   |  4 +++-
 src/gallium/drivers/freedreno/freedreno_screen.c |  1 +
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/ilo/ilo_screen.c |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  1 +
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
 src/gallium/drivers/radeonsi/si_shader.c | 10 +++---
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/drivers/swr/swr_screen.cpp   |  1 +
 src/gallium/drivers/vc4/vc4_screen.c |  1 +
 src/gallium/drivers/virgl/virgl_screen.c |  1 +
 src/gallium/include/pipe/p_defines.h |  1 +
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/mtypes.h   |  1 +
 src/mesa/main/version.c  |  2 +-
 src/mesa/state_tracker/st_extensions.c   |  1 +
 24 files changed, 32 insertions(+), 6 deletions(-)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] report ARB_cull_distance

2016-04-03 Thread eocallaghan

Patches 1-5, 8 & 10 are,

Reviewed-by: Edward O'Callaghan 

On 2016-04-04 12:15, Dave Airlie wrote:

Okay, I've taken Tobias' last work in progress, cleaned it up a bit,
moved the rename out into a separate patch, and reordered things slightly.

I've dropped the separate passes, I think nearly all hw operates the
same, I do wonder why we even have this as an option, since at least
965/gallium always require the compiler to lower, we don't really
have other consumers that would care about this I don't think.

Dave.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] GL compute shaders for radeonsi

2016-04-03 Thread eocallaghan

This series is,

Tested-By: Edward O'Callaghan 

I didn't pick up anything majorly wrong with it, but with others'
minor suggestions addressed this series is also,

Reviewed-By: Edward O'Callaghan 

Kind Regards,

On 2016-04-03 00:10, Bas Nieuwenhuizen wrote:

This series implements OpenGL compute shader for radeonsi. It
is based off master + Nicolai Hähnle's SSBO patches.

It depends on two patches for LLVM that have not
been committed yet:
  - D18340
  - D18559

The series is also available as the si-compute-shader branches of
 - https://github.com/BNieuwenhuizen/llvm
 - https://github.com/BNieuwenhuizen/mesa

Bas Nieuwenhuizen (20):
  radeonsi: set shader calling conventions
  radeonsi: lower compute shader arguments
  radeonsi: add shared memory
  radeonsi: implement shared memory load/store
  radeonsi: implement shared atomics
  radeonsi: set maximum work group size based on block size
  radeonsi: update shader count for compute shaders
  radeonsi: implement TGSI compute shader creation
  radeonsi: split input upload off from si_launch_grid
  radeonsi: don't pass scratch buffer to user SGPRs
  radeonsi: do per cs setup for compute shaders once per cs
  radeonsi: rework compute scratch buffer
  radeonsi: only emit compute shader state when switching shaders
  radeonsi: implement TGSI compute dispatch
  radeonsi: split texture decompression for compute shaders
  radeonsi: split setting graphics and compute descriptors
  radeonsi: do not do two full flushes on every compute dispatch
  radeonsi: clean up compute flush
  mesa/st: enable compute shaders if images are also supported
  radeonsi: enable TGSI support cap for compute shaders

 docs/GL3.txt   |   4 +-
 docs/relnotes/11.3.0.html  |   1 +
 src/gallium/drivers/radeon/r600_pipe_common.c  |  21 +-
 src/gallium/drivers/radeon/radeon_llvm.h   |   3 +
 src/gallium/drivers/radeon/radeon_llvm_emit.c  |  17 +-
 .../drivers/radeon/radeon_setup_tgsi_llvm.c|   4 +
 src/gallium/drivers/radeonsi/si_blit.c |  13 +-
 src/gallium/drivers/radeonsi/si_compute.c  | 557 -
 src/gallium/drivers/radeonsi/si_descriptors.c  |  60 ++-
 src/gallium/drivers/radeonsi/si_hw_context.c   |   2 +
 src/gallium/drivers/radeonsi/si_pipe.c |   4 +-
 src/gallium/drivers/radeonsi/si_pipe.h |  11 +-
 src/gallium/drivers/radeonsi/si_shader.c   | 252 +-
 src/gallium/drivers/radeonsi/si_shader.h   |  10 +
 src/gallium/drivers/radeonsi/si_state.c|   6 +-
 src/gallium/drivers/radeonsi/si_state.h|  10 +-
 src/gallium/drivers/radeonsi/si_state_draw.c   |  31 +-
 src/mesa/state_tracker/st_extensions.c |   6 +-
 18 files changed, 708 insertions(+), 304 deletions(-)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Split buffer block array into UBO and SSBO arrays

2016-04-03 Thread eocallaghan

This series is,

Acked-by: Edward O'Callaghan 

On 2016-04-03 21:16, Timothy Arceri wrote:

This is the final clean-up of the buffer block structures. With this
series we just create two arrays to begin with and drop the combined
array.

Note: to avoid code churn and regressions I intend to squash patches
1-4 before pushing. I've just sent them split up to make reviewing
easier.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] radeonsi, r600g ARB_framebuffer_no_attachments rebased

2016-04-03 Thread eocallaghan

On 2016-04-03 14:55, Ilia Mirkin wrote:

On Sat, Apr 2, 2016 at 10:54 PM, Edward O'Callaghan
 wrote:

This series implements ARB_framebuffer_no_attachments for radeonsi &
r600g. It is a rebase of the previous patch series with the
respective Rb's added.

I have given back my R9 hw today, so this was the last time I ran
piglit to confirm everything passes and that all our local applications
that need the support are happy and work as expected.


Hi Edward,

Could you confirm whether you ran piglit on the
ARB_framebuffer_no_attachments-specific tests only, or whether you did
a full run and compared it to a baseline run?

Thanks,

  -ilia


Hi Ilia,

I ran both. The full run:

 $ piglit run tests/gpu results/gpu-reference
 $ piglit summary html --overwrite summary/gpu results/gpu-reference
 $ firefox summary/gpu/index.html

and to confirm the extension's functionality (plus some local proprietary
medical applications):

 $ piglit run tests/all -t framebuffer_no_attachments results/framebuffer_no_attachments
 $ piglit summary html --overwrite summary/framebuffer_no_attachments results/framebuffer_no_attachments
 $ firefox summary/framebuffer_no_attachments/index.html

Kind Regards,
Edward.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/30] r600: refactor binding code for attach buffer to CB.

2016-03-31 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-31 18:03, Dave Airlie wrote:

From: Dave Airlie 

This refactors out the code and fixes it up to be used
for images later. It uses the code in the current RAT binding
for compute.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 111 -
 1 file changed, 78 insertions(+), 33 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c
b/src/gallium/drivers/r600/evergreen_state.c
index c151a1b..f7a2a8f 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1056,6 +1056,71 @@ struct r600_tex_color_info {
boolean export_16bpc;
 };

+static void evergreen_set_color_surface_buffer(struct r600_context *rctx,
+  struct r600_resource *res,
+  enum pipe_format pformat,
+  unsigned first_element,
+  unsigned last_element,
+  struct r600_tex_color_info *color)
+{
+   unsigned format, swap, ntype, endian;
+   const struct util_format_description *desc;
+   unsigned block_size = align(util_format_get_blocksize(res->b.b.format), 4);
+   unsigned pitch_alignment =
+   MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size);
+   unsigned pitch = align(res->b.b.width0, pitch_alignment);
+   int i;
+   unsigned width_elements;
+
+   width_elements = last_element - first_element + 1;
+
+   format = r600_translate_colorformat(rctx->b.chip_class, pformat);
+   swap = r600_translate_colorswap(pformat);
+
+   endian = r600_colorformat_endian_swap(format);
+
+   desc = util_format_description(pformat);
+   for (i = 0; i < 4; i++) {
+   if (desc->channel[i].type != UTIL_FORMAT_TYPE_VOID) {
+   break;
+   }
+   }
+   ntype = V_028C70_NUMBER_UNORM;
+   if (desc->colorspace == UTIL_FORMAT_COLORSPACE_SRGB)
+   ntype = V_028C70_NUMBER_SRGB;
+   else if (desc->channel[i].type == UTIL_FORMAT_TYPE_SIGNED) {
+   if (desc->channel[i].normalized)
+   ntype = V_028C70_NUMBER_SNORM;
+   else if (desc->channel[i].pure_integer)
+   ntype = V_028C70_NUMBER_SINT;
+   } else if (desc->channel[i].type == UTIL_FORMAT_TYPE_UNSIGNED) {
+   if (desc->channel[i].normalized)
+   ntype = V_028C70_NUMBER_UNORM;
+   else if (desc->channel[i].pure_integer)
+   ntype = V_028C70_NUMBER_UINT;
+   }
+   pitch = (pitch / 8) - 1;
+   color->pitch = S_028C64_PITCH_TILE_MAX(pitch);
+
+   color->info = S_028C70_ARRAY_MODE(V_028C70_ARRAY_LINEAR_ALIGNED);
+   color->info |= S_028C70_FORMAT(format) |
+  S_028C70_COMP_SWAP(swap) |
+  S_028C70_BLEND_CLAMP(0) |
+  S_028C70_BLEND_BYPASS(1) |
+  S_028C70_NUMBER_TYPE(ntype) |
+  S_028C70_ENDIAN(endian);
+   color->attrib = S_028C74_NON_DISP_TILING_ORDER(1);
+   color->ntype = ntype;
+   color->export_16bpc = false;
+   color->dim = width_elements - 1;
+   color->slice = (width_elements / 64) - 1;
+   color->view = 0;
+   color->offset = res->gpu_address >> 8;
+
+   color->fmask = color->offset;
+   color->fmask_slice = 0;
+}
+
 static void evergreen_set_color_surface_common(struct r600_context *rctx,
   struct r600_texture *rtex,
   unsigned level,
@@ -1239,47 +1304,27 @@ void evergreen_init_color_surface_rat(struct r600_context *rctx,
struct r600_surface *surf)
 {
struct pipe_resource *pipe_buffer = surf->base.texture;
-   unsigned format = r600_translate_colorformat(rctx->b.chip_class,
-surf->base.format);
-   unsigned endian = r600_colorformat_endian_swap(format);
-   unsigned swap = r600_translate_colorswap(surf->base.format);
-   unsigned block_size =
-   align(util_format_get_blocksize(pipe_buffer->format), 4);
-   unsigned pitch_alignment =
-   MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size);
-   unsigned pitch = align(pipe_buffer->width0, pitch_alignment);
-
-   surf->cb_color_base = r600_resource(pipe_buffer)->gpu_address >> 8;
+   struct r600_tex_color_info color;

-   surf->cb_color_pitch = (pitch / 8) - 1;
+   evergreen_set_color_surface_buffer(rctx, (struct r600_resource *)surf->base.texture,
+  surf->base.format, 0, pipe_buffer->width0,
+  &color
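
The pitch arithmetic in evergreen_set_color_surface_buffer() above can be tried in isolation. The following is only a sketch: demo_align(), demo_max2() and demo_pitch_tile_max() are hypothetical names, the inputs are plain numbers rather than driver structures, and demo_align() uses a generic round-up that may differ from Mesa's own align() helper.

```c
#include <assert.h>

/* Round value up to a multiple of alignment (alignment must be non-zero). */
static unsigned
demo_align(unsigned value, unsigned alignment)
{
   return ((value + alignment - 1) / alignment) * alignment;
}

static unsigned
demo_max2(unsigned a, unsigned b)
{
   return a > b ? a : b;
}

/*
 * Same steps as the patch: align the block size to 4 bytes, derive the
 * pitch alignment, align the width, then convert to "tiles of 8, minus 1".
 * The result is the number the patch feeds to S_028C64_PITCH_TILE_MAX().
 */
static unsigned
demo_pitch_tile_max(unsigned format_blocksize,
                    unsigned pipe_interleave_bytes,
                    unsigned width0)
{
   unsigned block_size = demo_align(format_blocksize, 4);
   unsigned pitch_alignment = demo_max2(64, pipe_interleave_bytes / block_size);
   unsigned pitch = demo_align(width0, pitch_alignment);

   return (pitch / 8) - 1;
}
```

For a 4-byte format, a 256-byte pipe interleave and a width of 100 elements, the alignment is 64, the aligned pitch is 128, and the function returns 15.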

Re: [Mesa-dev] [PATCH 04/30] r600: refactor out CB setup.

2016-03-31 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-31 18:03, Dave Airlie wrote:

From: Dave Airlie 

This moves the code to create CB info out into
a separate function so it can be reused in images
code to create RATs.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 257 
+

 1 file changed, 147 insertions(+), 110 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 356e708..c151a1b 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1042,105 +1042,72 @@ static void evergreen_emit_scissor_state(struct r600_context *rctx, struct r600_
rstate->atom.num_dw = 0;
 }

-/**
- * This function intializes the CB* register values for RATs.  It is meant
- * to be used for 1D aligned buffers that do not have an associated
- * radeon_surf.
- */
-void evergreen_init_color_surface_rat(struct r600_context *rctx,
-   struct r600_surface *surf)
-{
-   struct pipe_resource *pipe_buffer = surf->base.texture;
-   unsigned format = r600_translate_colorformat(rctx->b.chip_class,
-surf->base.format);
-   unsigned endian = r600_colorformat_endian_swap(format);
-   unsigned swap = r600_translate_colorswap(surf->base.format);
-   unsigned block_size =
-   align(util_format_get_blocksize(pipe_buffer->format), 4);
-   unsigned pitch_alignment =
-   MAX2(64, rctx->screen->b.info.pipe_interleave_bytes / block_size);
-   unsigned pitch = align(pipe_buffer->width0, pitch_alignment);
-
-   surf->cb_color_base = r600_resource(pipe_buffer)->gpu_address >> 8;
-
-   surf->cb_color_pitch = (pitch / 8) - 1;
-
-   surf->cb_color_slice = 0;
-
-   surf->cb_color_view = 0;
-
-   surf->cb_color_info =
- S_028C70_ENDIAN(endian)
-   | S_028C70_FORMAT(format)
-   | S_028C70_ARRAY_MODE(V_028C70_ARRAY_LINEAR_ALIGNED)
-   | S_028C70_NUMBER_TYPE(V_028C70_NUMBER_UINT)
-   | S_028C70_COMP_SWAP(swap)
-   | S_028C70_BLEND_BYPASS(1) /* We must set this bit because we
-   * are using NUMBER_UINT */
-   | S_028C70_RAT(1)
-   ;
-
-   surf->cb_color_attrib = S_028C74_NON_DISP_TILING_ORDER(1);
-
-   /* For buffers, CB_COLOR0_DIM needs to be set to the number of
-* elements. */
-   surf->cb_color_dim = pipe_buffer->width0;
-
-   /* Set the buffer range the GPU will have access to: */
-   util_range_add(&r600_resource(pipe_buffer)->valid_buffer_range,
-  0, pipe_buffer->width0);
-
-   surf->cb_color_fmask = surf->cb_color_base;
-   surf->cb_color_fmask_slice = 0;
-}
+struct r600_tex_color_info {
+   unsigned info;
+   unsigned view;
+   unsigned dim;
+   unsigned pitch;
+   unsigned slice;
+   unsigned attrib;
+   unsigned ntype;
+   unsigned fmask;
+   unsigned fmask_slice;
+   uint64_t offset;
+   boolean export_16bpc;
+};

-void evergreen_init_color_surface(struct r600_context *rctx,
- struct r600_surface *surf)
+static void evergreen_set_color_surface_common(struct r600_context *rctx,
+  struct r600_texture *rtex,
+  unsigned level,
+  unsigned first_layer,
+  unsigned last_layer,
+  enum pipe_format pformat,
+  struct r600_tex_color_info *color)
 {
struct r600_screen *rscreen = rctx->screen;
-   struct r600_texture *rtex = (struct r600_texture*)surf->base.texture;
-   unsigned level = surf->base.u.tex.level;
unsigned pitch, slice;
-   unsigned color_info, color_attrib, color_dim = 0, color_view;
-   unsigned format, swap, ntype, endian;
-   uint64_t offset, base_offset;
unsigned non_disp_tiling, macro_aspect, tile_split, bankh, bankw,
fmask_bankh, nbanks;
+   unsigned format, swap, ntype, endian;
const struct util_format_description *desc;
-   int i;
bool blend_clamp = 0, blend_bypass = 0;
+   int i;

-   offset = rtex->surface.level[level].offset;
+   color->offset = rtex->surface.level[level].offset;
if (rtex->surface.level[level].mode == RADEON_SURF_MODE_LINEAR) {
-   assert(surf->base.u.tex.first_layer == surf->base.u.tex.last_layer);
-   offset += rtex->surface.level[level].slice_size *
- surf->base.u.tex.first_layer;
-   color_view = 0;
+   color->offset += (rtex->surface.level[level].slice_size *
+

Re: [Mesa-dev] [PATCH 02/30] r600: factor out the code to initialise a buffer resource.

2016-03-31 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-31 18:03, Dave Airlie wrote:

From: Dave Airlie 

This takes the code required to initialise a buffer resource
out of the texture buffer code, into its own function.

This is going to be used for the image support later.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 81 
+++---

 1 file changed, 53 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c
b/src/gallium/drivers/r600/evergreen_state.c
index fa3d0b3..b68faed 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -596,56 +596,81 @@ static void *evergreen_create_sampler_state(struct pipe_context *ctx,
return ss;
 }

-static struct pipe_sampler_view *
-texture_buffer_sampler_view(struct r600_context *rctx,
-   struct r600_pipe_sampler_view *view,
-   unsigned width0, unsigned height0)
-   
+struct eg_buf_res_params {
+   enum pipe_format pipe_format;
+   unsigned first_element;
+   unsigned last_element;
+   unsigned char swizzle[4];
+   bool uncached;
+};
+
+static void evergreen_fill_buffer_resource_words(struct r600_context *rctx,
+struct pipe_resource *buffer,
+struct eg_buf_res_params *params,
+bool *skip_mip_address_reloc,
+unsigned tex_resource_words[8])
 {
-   struct r600_texture *tmp = (struct r600_texture*)view->base.texture;
+   struct r600_texture *tmp = (struct r600_texture*)buffer;
uint64_t va;
-   int stride = util_format_get_blocksize(view->base.format);
+   int stride = util_format_get_blocksize(params->pipe_format);
unsigned format, num_format, format_comp, endian;
unsigned swizzle_res;
-   unsigned char swizzle[4];
const struct util_format_description *desc;
-   unsigned offset = view->base.u.buf.first_element * stride;
-   unsigned size = (view->base.u.buf.last_element - view->base.u.buf.first_element + 1) * stride;
+   unsigned offset = params->first_element * stride;
+   unsigned num_elements = (params->last_element - params->first_element + 1);
+   unsigned size = num_elements * stride;

-   swizzle[0] = view->base.swizzle_r;
-   swizzle[1] = view->base.swizzle_g;
-   swizzle[2] = view->base.swizzle_b;
-   swizzle[3] = view->base.swizzle_a;
-
-   r600_vertex_data_type(view->base.format,
+   r600_vertex_data_type(params->pipe_format,
  &format, &num_format, &format_comp,
  &endian);

-   desc = util_format_description(view->base.format);
+   desc = util_format_description(params->pipe_format);

-   swizzle_res = r600_get_swizzle_combined(desc->swizzle, swizzle, TRUE);
+   swizzle_res = r600_get_swizzle_combined(desc->swizzle, params->swizzle, TRUE);

va = tmp->resource.gpu_address + offset;
-   view->tex_resource = &tmp->resource;
-
-   view->skip_mip_address_reloc = true;
-   view->tex_resource_words[0] = va;
-   view->tex_resource_words[1] = size - 1;
-   view->tex_resource_words[2] = S_030008_BASE_ADDRESS_HI(va >> 32UL) |
+   *skip_mip_address_reloc = true;
+   tex_resource_words[0] = va;
+   tex_resource_words[1] = size - 1;
+   tex_resource_words[2] = S_030008_BASE_ADDRESS_HI(va >> 32UL) |
S_030008_STRIDE(stride) |
S_030008_DATA_FORMAT(format) |
S_030008_NUM_FORMAT_ALL(num_format) |
S_030008_FORMAT_COMP_ALL(format_comp) |
S_030008_ENDIAN_SWAP(endian);
-   view->tex_resource_words[3] = swizzle_res;
+   tex_resource_words[3] = swizzle_res | S_03000C_UNCACHED(params->uncached);

/*
 * in theory dword 4 is for number of elements, for use with resinfo,
 * but it seems to utterly fail to work, the amd gpu shader analyser
 * uses a const buffer to store the element sizes for buffer txq
 */
-   view->tex_resource_words[4] = 0;
-   view->tex_resource_words[5] = view->tex_resource_words[6] = 0;
-   view->tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER);
+   tex_resource_words[4] = num_elements;
+   tex_resource_words[5] = tex_resource_words[6] = 0;
+   tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER);
+}
+
+static struct pipe_sampler_view *
+texture_buffer_sampler_view(struct r600_context *rctx,
+   struct r600_pipe_sampler_view *view,
+   unsigned width0, unsigned height0)
+{
+   struct r600_texture *tmp = (struct r600_texture*)view->base.texture;
+   struct eg_buf_res_params params;
+
+   memset(&params, 0, sizeof(params));
+
+
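
The offset, element count and size computed at the top of evergreen_fill_buffer_resource_words() depend only on the inclusive element range and the format stride. A minimal sketch of that arithmetic follows; struct demo_buf_range and demo_compute_range() are hypothetical names used only for illustration and are not part of the patch.

```c
#include <assert.h>

/* Hypothetical result type; the patch writes these values into resource words. */
struct demo_buf_range {
   unsigned offset;        /* byte offset of the first element */
   unsigned num_elements;  /* element count, both ends inclusive */
   unsigned size;          /* byte size of the whole range */
};

/* first_element and last_element are inclusive, as in eg_buf_res_params. */
static struct demo_buf_range
demo_compute_range(unsigned first_element, unsigned last_element,
                   unsigned stride)
{
   struct demo_buf_range r;

   r.offset = first_element * stride;
   r.num_elements = last_element - first_element + 1;
   r.size = r.num_elements * stride;
   return r;
}
```

With elements 2 through 5 of a 16-byte format, this yields an offset of 32 bytes, 4 elements and a size of 64 bytes. The patch then stores size - 1 in word 1 and num_elements in word 4.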

Re: [Mesa-dev] [PATCH 03/30] r600: refactor texture resource words setup code.

2016-03-31 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-31 18:03, Dave Airlie wrote:

From: Dave Airlie 

This refactors out the code to setup a texture resource
so we can reuse it later from the images code.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/evergreen_state.c | 274 
+

 1 file changed, 158 insertions(+), 116 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c
b/src/gallium/drivers/r600/evergreen_state.c
index b68faed..356e708 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -649,93 +649,55 @@ static void evergreen_fill_buffer_resource_words(struct r600_context *rctx,
    tex_resource_words[7] = S_03001C_TYPE(V_03001C_SQ_TEX_VTX_VALID_BUFFER);
 }

-static struct pipe_sampler_view *
-texture_buffer_sampler_view(struct r600_context *rctx,
-   struct r600_pipe_sampler_view *view,
-   unsigned width0, unsigned height0)
-{
-   struct r600_texture *tmp = (struct r600_texture*)view->base.texture;
-   struct eg_buf_res_params params;
-
-   memset(&params, 0, sizeof(params));
-
-   params.pipe_format = view->base.format;
-   params.first_element = view->base.u.buf.first_element;
-   params.last_element = view->base.u.buf.last_element;
-   params.swizzle[0] = view->base.swizzle_r;
-   params.swizzle[1] = view->base.swizzle_g;
-   params.swizzle[2] = view->base.swizzle_b;
-   params.swizzle[3] = view->base.swizzle_a;
-
-   evergreen_fill_buffer_resource_words(rctx, view->base.texture,
-&params, &view->skip_mip_address_reloc,
-view->tex_resource_words);
-   view->tex_resource = &tmp->resource;
-
-   if (tmp->resource.gpu_address)
-   LIST_ADDTAIL(&view->list, &rctx->b.texture_buffers);
-   return &view->base;
-}
+struct eg_tex_res_params {
+   enum pipe_format pipe_format;
+   int force_level;
+   unsigned width0;
+   unsigned height0;
+   unsigned first_level;
+   unsigned last_level;
+   unsigned first_layer;
+   unsigned last_layer;
+   unsigned target;
+   unsigned char swizzle[4];
+};

-struct pipe_sampler_view *
-evergreen_create_sampler_view_custom(struct pipe_context *ctx,
-struct pipe_resource *texture,
-const struct pipe_sampler_view *state,
-unsigned width0, unsigned height0,
-unsigned force_level)
+static int evergreen_fill_tex_resource_words(struct r600_context *rctx,
+struct pipe_resource *texture,
+struct eg_tex_res_params *params,
+bool *skip_mip_address_reloc,
+unsigned tex_resource_words[8])
 {
-   struct r600_context *rctx = (struct r600_context*)ctx;
-   struct r600_screen *rscreen = (struct r600_screen*)ctx->screen;
-   struct r600_pipe_sampler_view *view = CALLOC_STRUCT(r600_pipe_sampler_view);
+   struct r600_screen *rscreen = (struct r600_screen*)rctx->b.b.screen;
struct r600_texture *tmp = (struct r600_texture*)texture;
unsigned format, endian;
uint32_t word4 = 0, yuv_format = 0, pitch = 0;
-   unsigned char swizzle[4], array_mode = 0, non_disp_tiling = 0;
+   unsigned char array_mode = 0, non_disp_tiling = 0;
unsigned height, depth, width;
unsigned macro_aspect, tile_split, bankh, bankw, nbanks, fmask_bankh;
-   enum pipe_format pipe_format = state->format;
struct radeon_surf_level *surflevel;
unsigned base_level, first_level, last_level;
unsigned dim, last_layer;
uint64_t va;

-   if (!view)
-   return NULL;
-
-   /* initialize base object */
-   view->base = *state;
-   view->base.texture = NULL;
-   pipe_reference(NULL, &texture->reference);
-   view->base.texture = texture;
-   view->base.reference.count = 1;
-   view->base.context = ctx;
-
-   if (state->target == PIPE_BUFFER)
-   return texture_buffer_sampler_view(rctx, view, width0, height0);
-
-   swizzle[0] = state->swizzle_r;
-   swizzle[1] = state->swizzle_g;
-   swizzle[2] = state->swizzle_b;
-   swizzle[3] = state->swizzle_a;
-
tile_split = tmp->surface.tile_split;
surflevel = tmp->surface.level;

/* Texturing with separate depth and stencil. */
if (tmp->is_depth && !tmp->is_flushing_texture) {
-   switch (pipe_format) {
+   switch (params->pipe_format) {
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-   pipe_format = PIPE_FORMAT_Z32_FLOAT;
+   params->pipe_format = PIPE_FORMAT_Z32_FLOAT;
   

Re: [Mesa-dev] [PATCH] mesa/st: Avoid a NULL-ptr dereference on possible missing callback

2016-03-28 Thread eocallaghan

Determinism is always better, regardless of how you get there. A
null pointer dereference `on purpose` is a really poor idea in C.
I am somewhat surprised you are asking whether it's really better to
rely on undefined behavior than on an assert.

I thought about using an if branch and just returning, but I think
it is better to crash deterministically with a reason than to
perhaps fail silently. I do see that an if branch and a return are
used in similar situations elsewhere, so I would be willing to
do the same if I must.
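
The two options weighed above can be sketched side by side. This is an illustration only: struct demo_pipe and the store_result_* functions are hypothetical stand-ins, not the state tracker's real types or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver vtable with an optional hook. */
struct demo_pipe {
   int (*get_result)(struct demo_pipe *pipe, int index);
};

/* A trivial hook implementation used to exercise both variants. */
static int
demo_get_result(struct demo_pipe *pipe, int index)
{
   (void)pipe;
   return index * 2;
}

/* Option 1: fail loudly, with a reason, if the hook is missing. */
static int
store_result_assert(struct demo_pipe *pipe, int index)
{
   assert(pipe->get_result);
   return pipe->get_result(pipe, index);
}

/* Option 2: skip the call and tell the caller the hook was missing. */
static bool
store_result_checked(struct demo_pipe *pipe, int index, int *out)
{
   if (!pipe->get_result)
      return false;

   *out = pipe->get_result(pipe, index);
   return true;
}
```

Note that assert() expands to nothing when NDEBUG is defined, so option 1 only gives the deterministic failure in builds that keep assertions enabled. Option 2 behaves the same in every build, but it relies on callers checking the return value.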

On 2016-03-28 16:08, Ilia Mirkin wrote:

When would that happen? When a user force-enables
ARB_query_buffer_object for a driver that's not ready for it? Is
hitting a deterministic assert in that case any better than hitting a
null deref?

On Sun, Mar 27, 2016 at 11:52 PM, Edward O'Callaghan
 wrote:

Don't dereference invalid memory just because a gallium driver
callback is missing.

Signed-off-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_cb_queryobj.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_cb_queryobj.c b/src/mesa/state_tracker/st_cb_queryobj.c
index cdb9efc..e9abc38 100644
--- a/src/mesa/state_tracker/st_cb_queryobj.c
+++ b/src/mesa/state_tracker/st_cb_queryobj.c
@@ -402,6 +402,7 @@ st_StoreQueryResult(struct gl_context *ctx, struct gl_query_object *q,
       index = 0;
    }

+   assert(pipe->get_query_result_resource);
    pipe->get_query_result_resource(pipe, stq->pq, wait, result_type, index,
                                    stObj->buffer, offset);
 }
--
2.5.5



Re: [Mesa-dev] [PATCH] mesa: remove initialized field from uniform storage

2016-03-26 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-27 14:51, Timothy Arceri wrote:

The only place this was used was in a gallium debug function that
had to be manually enabled.
---
 src/compiler/glsl/ir_uniform.h  |  5 
 src/compiler/glsl/link_uniform_initializers.cpp |  4 ---
 src/compiler/glsl/link_uniforms.cpp |  1 -
 src/mesa/main/shaderapi.c   |  3 +-
 src/mesa/main/uniform_query.cpp |  4 ---
 src/mesa/state_tracker/st_draw.c| 37 
-

 6 files changed, 1 insertion(+), 53 deletions(-)

diff --git a/src/compiler/glsl/ir_uniform.h b/src/compiler/glsl/ir_uniform.h
index 1854279..e72e7b4 100644
--- a/src/compiler/glsl/ir_uniform.h
+++ b/src/compiler/glsl/ir_uniform.h
@@ -105,11 +105,6 @@ struct gl_uniform_storage {
 */
unsigned array_elements;

-   /**
-* Has this uniform ever been set?
-*/
-   bool initialized;
-
struct gl_opaque_uniform_index opaque[MESA_SHADER_STAGES];

/**
diff --git a/src/compiler/glsl/link_uniform_initializers.cpp b/src/compiler/glsl/link_uniform_initializers.cpp
index 7d280cc..870bc5b 100644
--- a/src/compiler/glsl/link_uniform_initializers.cpp
+++ b/src/compiler/glsl/link_uniform_initializers.cpp
@@ -162,8 +162,6 @@ set_opaque_binding(void *mem_ctx, gl_shader_program *prog,
 }
  }
   }
-
-  storage->initialized = true;
}
 }

@@ -267,8 +265,6 @@ set_uniform_initializer(void *mem_ctx, gl_shader_program *prog,
  }
   }
}
-
-   storage->initialized = true;
 }
 }

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 09322c5..a16b34a 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -799,7 +799,6 @@ private:

   this->uniforms[id].name = ralloc_strdup(this->uniforms, name);
   this->uniforms[id].type = base_type;
-  this->uniforms[id].initialized = 0;
   this->uniforms[id].num_driver_storage = 0;
   this->uniforms[id].driver_storage = NULL;
   this->uniforms[id].atomic_buffer_index = -1;
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 5b882d6..92302f6 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -2568,7 +2568,6 @@ _mesa_UniformSubroutinesuiv(GLenum shadertype, GLsizei count,
   memcpy(&uni->storage[0], &indices[i],
  sizeof(GLuint) * uni_count);

-  uni->initialized = true;
   _mesa_propagate_uniforms_to_driver_storage(uni, 0, uni_count);
   i += uni_count;
} while(i < count);
@@ -2742,7 +2741,7 @@ _mesa_shader_init_subroutine_defaults(struct gl_shader *sh)
   for (j = 0; j < uni_count; j++)
  memcpy(&uni->storage[j], &val, sizeof(int));
-  uni->initialized = true;
+
   _mesa_propagate_uniforms_to_driver_storage(uni, 0, uni_count);
}
 }
diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 2ced201..ab5c3cd 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -815,8 +815,6 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
   }
}

-   uni->initialized = true;
-
_mesa_propagate_uniforms_to_driver_storage(uni, offset, count);

    /* If the uniform is a sampler, do the extra magic necessary to propagate
@@ -1030,8 +1028,6 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg,
   }
}

-   uni->initialized = true;
-
_mesa_propagate_uniforms_to_driver_storage(uni, offset, count);
 }

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index fdd59a3..3db5749 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -127,35 +127,6 @@ setup_index_buffer(struct st_context *st,


 /**
- * Prior to drawing, check that any uniforms referenced by the
- * current shader have been set.  If a uniform has not been set,
- * issue a warning.
- */
-static void
-check_uniforms(struct gl_context *ctx)
-{
-   struct gl_shader_program **shProg = ctx->_Shader->CurrentProgram;
-   unsigned j;
-
-   for (j = 0; j < 3; j++) {
-  unsigned i;
-
-  if (shProg[j] == NULL || !shProg[j]->LinkStatus)
-continue;
-
-  for (i = 0; i < shProg[j]->NumUniformStorage; i++) {
-         const struct gl_uniform_storage *u = &shProg[j]->UniformStorage[i];
-         if (!u->initialized) {
-            _mesa_warning(ctx,
-                          "Using shader with uninitialized uniform: %s",
-                          u->name);
- }
-  }
-   }
-}
-
-
-/**
  * Translate OpenGL primtive type (GL_POINTS, GL_TRIANGLE_STRIP, etc) to
  * the corresponding Gallium type.
  */
@@ -203,14 +174,6 @@ st_draw_vbo(struct gl_context *ctx,
/* Validate state. */
if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
   st_validate_state(st, ST_PIPELINE_RENDER);
-
-#if 0
-  if (MESA_VERBOSE & VERBOSE_GL

Re: [Mesa-dev] [PATCH 03/17] mesa/st: Set _NumSamples in update_framebuffer_state()

2016-03-25 Thread eocallaghan

On 2016-03-25 22:20, Ilia Mirkin wrote:

On Mar 25, 2016 4:43 AM,  wrote:


On 2016-03-25 14:02, Ilia Mirkin wrote:


On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan
 wrote:


Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported
with a framebuffer using no attachment.

Signed-off-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 51



 1 file changed, 51 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c
index ae883a2..07854ca 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -64,6 +64,44 @@ update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
    framebuffer->height = MIN2(framebuffer->height, surface->height);
 }

+/**
+ * Round up the requested multisample count to the next supported sample size.
+ */
+static unsigned
+framebuffer_quantize_num_samples(struct pipe_screen *screen, unsigned num_samples)
+{
+   int quantized_samples = 0;
+   bool msaa_mode_supported;
+
+   if (!num_samples)
+      return 0;
+
+   assert(screen);
+
+   /* Assumes the highest supported MSAA is x32 on any hardware */
+   for (unsigned msaa_mode = 32; msaa_mode >= 1; msaa_mode = msaa_mode/2) {



This should probably start at MaxFramebufferSamples right? Also
msaa_mode >= num_samples? [then you can get rid of the if below]



I did it in this manner because I don't trust all C compilers to warn
sufficiently on `num_samples' overflows turning this into an infinite loop,
even though it is an unsigned type. The micro-optimization serves no purpose
because the optimizer will trivially reduce the loop down, not that it has
that many iterations anyway. The loop as-is is both well bounded and
deterministic, nice qualities to have.

"Premature optimization is the root of all evil" ~ Donald Knuth.




I was going for clarity and simplicity, not runtime efficiency. Fewer
lines of code to read, fewer conditions. For loop semantics are fairly
well defined, compilers tend to get those things right.


I was not referring to loop semantics or to whether a compiler can understand
how to lower a loop correctly; you didn't really read what I said. The point
is that parameterizing the loop with a function argument to save one line of
code, while losing some safety and determinism, is hardly an argument for
changing this, imho. I much prefer how it is now: a simple, constant,
deterministic loop, very clear.
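
For readers following the disagreement, the fixed-bound variant can be sketched outside of Mesa. This is an illustration under assumptions: msaa_supported_fn, quantize_num_samples() and demo_supported() are hypothetical names, and the callback stands in for the driver's format-support query.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability query standing in for the driver callback. */
typedef bool (*msaa_supported_fn)(unsigned samples);

static unsigned
quantize_num_samples(msaa_supported_fn is_supported, unsigned num_samples)
{
   unsigned quantized = 0;

   if (!num_samples)
      return 0;

   /* Fixed bound: always walks 32, 16, 8, 4, 2, 1, whatever the request. */
   for (unsigned mode = 32; mode >= 1; mode /= 2) {
      if (mode >= num_samples && is_supported(mode))
         quantized = mode;   /* ends up as the smallest supported match */
   }

   return quantized;
}

/* Example device: supports 1x, 2x, 4x and 8x only. */
static bool
demo_supported(unsigned samples)
{
   return samples == 1 || samples == 2 || samples == 4 || samples == 8;
}
```

On this example device a request for 3 samples is rounded up to 4, a request for 5 to 8, and a request for 9 returns 0 because no supported mode is large enough. The loop always runs six iterations, which is the property argued for above; the alternative suggested in review would stop once the mode drops below num_samples.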





And lastly I don't know if it's a valid assumption that we can always
just divide by 2. That said, I don't know of any hw that actually
supports non-power-of-two MSAA levels, so perhaps it's OK.



You are way over-engineering here; it is a totally reasonable assumption,
and if such hardware does exist which we would support (I can't see any in
the tree currently, as far as I am aware) then they can provide follow-up
fixes.


I'm flexible on this one... If no one else cares, I don't care either.








+      assert(!(msaa_mode > 32 || msaa_mode == 0)); /* be safe from int overflows */
+      if (msaa_mode >= num_samples) {
+         /**
+          * For ARB_framebuffer_no_attachment, A format of
+          * PIPE_FORMAT_NONE implies what number of samples is
+          * supported for a framebuffer with no attachment. Thus the
+          * drivers callback must be adjusted for this.
+          */
+         msaa_mode_supported = screen->is_format_supported(screen,
+                                     PIPE_FORMAT_NONE, PIPE_TEXTURE_2D,
+                                     msaa_mode, PIPE_BIND_RENDER_TARGET);
+         /**
+          * Check if the MSAA mode that is higher than the requested
+          * num_samples is supported, and if so returning it.
+          */
+         if (msaa_mode_supported)
+            quantized_samples = msaa_mode;
+  }
+   }
+
+   return quantized_samples;
+}

 /**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
@@ -72,6 +110,8 @@ static void
 update_framebuffer_state( struct st_context *st )
 {
    struct pipe_framebuffer_state *framebuffer = &st->state.framebuffer;

+   struct pipe_context *pipe = st->pipe;
+   struct pipe_screen *screen = pipe->screen;
struct gl_framebuffer *fb = st->ctx->DrawBuffer;
struct st_renderbuffer *strb;
GLuint i;
@@ -82,6 +122,17 @@ update_framebuffer_state( struct st_context *st )
framebuffer->width  = UINT_MAX;
framebuffer->height = UINT_MAX;

+   /**
+* Quantize the derived default number of samples:
+*
+* A query to the driver of supported MSAA values the
+* hardware supports is done as to legalize the number
+* of application requested samples, NumSamples.
+* See commit eb9cf3c for more information.
+*/
+   fb->DefaultGeometry._NumSamples =
+      framebuffer_quantize_num_samples(screen, fb->DefaultGeometry.NumSamples);
+
/*printf("

Re: [Mesa-dev] [PATCH 15/17] nvc0: handle the case where there are no framebuffer attachments

2016-03-25 Thread eocallaghan

On 2016-03-25 14:29, Ilia Mirkin wrote:

Please leave this and the next patch out of your series. I'm going to
need to retest everything carefully once the core support is in (and I
get a bit of time).


Done. Be sure to remove my Rb if you made changes and I'll review it
again.




Thanks,

  -ilia

On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan
 wrote:

From: Ilia Mirkin 

Signed-off-by: Ilia Mirkin 
Reviewed-by: Edward O'Callaghan 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c|  7 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 16 
++--

 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c|  4 
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index b7c6faf..add9a79 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -456,6 +456,13 @@ nvc0_fp_gen_header(struct nvc0_program *fp, struct nv50_ir_prog_info *info)
  fp->hdr[18] |= 0xf << info->out[i].slot[0];
}

+   /* TODO: figure out proper condition, but this makes things work when there
+    * are no "regular" outputs in the frag shader, used when there are no
+* attachments.
+*/
+   if (info->numOutputs == 0)
+  fp->hdr[18] |= 0xf;
+
fp->fp.early_z = info->prop.fp.earlyFragTests;

return 0;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
index 9c64482..53f574b 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
@@ -75,12 +75,11 @@ nvc0_validate_fb(struct nvc0_context *nvc0)
 struct nvc0_screen *screen = nvc0->screen;
 unsigned i, ms;
 unsigned ms_mode = NVC0_3D_MULTISAMPLE_MODE_MS1;
+unsigned nr_cbufs = fb->nr_cbufs;
 bool serialize = false;

 nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_3D_FB);

-BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1);
-PUSH_DATA (push, (076543210 << 4) | fb->nr_cbufs);
 BEGIN_NVC0(push, NVC0_3D(SCREEN_SCISSOR_HORIZ), 2);
 PUSH_DATA (push, fb->width << 16);
 PUSH_DATA (push, fb->height << 16);
@@ -179,6 +178,18 @@ nvc0_validate_fb(struct nvc0_context *nvc0)
 PUSH_DATA (push, 0);
 }

+if (nr_cbufs == 0 && !fb->zsbuf) {
+   unsigned samples = util_next_power_of_two(fb->samples);
+
+   nvc0_fb_set_null_rt(push, 0);
+
+   assert(samples <= 8);
+   ms_mode = ffs(samples) - 1;
+   nr_cbufs = 1;
+}
+
+BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1);
+PUSH_DATA (push, (076543210 << 4) | nr_cbufs);
 IMMED_NVC0(push, NVC0_3D(MULTISAMPLE_MODE), ms_mode);

 ms = 1 << ms_mode;
@@ -592,6 +603,7 @@ nvc0_validate_derived_2(struct nvc0_context *nvc0)
struct nouveau_pushbuf *push = nvc0->base.pushbuf;

if (nvc0->zsa && nvc0->zsa->pipe.alpha.enabled &&
+   nvc0->framebuffer.zsbuf &&
nvc0->framebuffer.nr_cbufs == 0) {
   nvc0_fb_set_null_rt(push, 0);
   BEGIN_NVC0(push, NVC0_3D(RT_CONTROL), 1);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index e8b3a4d..d546957 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -1043,6 +1043,8 @@ nvc0_blitctx_pre_blit(struct nvc0_blitctx *ctx)

ctx->saved.fb.width = nvc0->framebuffer.width;
ctx->saved.fb.height = nvc0->framebuffer.height;
+   ctx->saved.fb.samples = nvc0->framebuffer.samples;
+   ctx->saved.fb.layers = nvc0->framebuffer.layers;
ctx->saved.fb.nr_cbufs = nvc0->framebuffer.nr_cbufs;
ctx->saved.fb.cbufs[0] = nvc0->framebuffer.cbufs[0];
ctx->saved.fb.zsbuf = nvc0->framebuffer.zsbuf;
@@ -1110,6 +1112,8 @@ nvc0_blitctx_post_blit(struct nvc0_blitctx *blit)

nvc0->framebuffer.width = blit->saved.fb.width;
nvc0->framebuffer.height = blit->saved.fb.height;
+   nvc0->framebuffer.samples = blit->saved.fb.samples;
+   nvc0->framebuffer.layers = blit->saved.fb.layers;
nvc0->framebuffer.nr_cbufs = blit->saved.fb.nr_cbufs;
nvc0->framebuffer.cbufs[0] = blit->saved.fb.cbufs[0];
nvc0->framebuffer.zsbuf = blit->saved.fb.zsbuf;
--
2.5.5
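
The no-attachment branch in nvc0_validate_fb() derives ms_mode from fb->samples by rounding up to a power of two and taking its log2. A standalone sketch of that derivation follows; demo_next_pow2() and demo_ms_mode() are hypothetical helpers, and mapping 0 to 1 in the rounding step is an assumption made here about how the util helper behaves.

```c
#include <assert.h>

/* Round up to a power of two; 0 and 1 both map to 1 (assumed behaviour). */
static unsigned
demo_next_pow2(unsigned x)
{
   unsigned p = 1;

   /* The second condition keeps the shift from overflowing. */
   while (p < x && p < 0x80000000u)
      p <<= 1;
   return p;
}

/* log2 of the rounded sample count, equivalent to ffs(samples) - 1. */
static unsigned
demo_ms_mode(unsigned fb_samples)
{
   unsigned samples = demo_next_pow2(fb_samples);
   unsigned mode = 0;

   while ((samples >> mode) > 1)
      mode++;
   return mode;
}
```

Under these assumptions 0 or 1 sample gives mode 0, 3 or 4 samples give mode 2, and 8 samples give mode 3, which is the largest value the patch's assert(samples <= 8) allows.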

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/17] gallium/util: Ensure util_framebuffer_get_num_samples() is valid

2016-03-25 Thread eocallaghan

On 2016-03-25 14:20, Ilia Mirkin wrote:

Instead of introducing buggy code in patch 6/17 and then fixing it up
here, you need to fold this with patch 6 so that it's all done at the
same time.


Yea, can do. Cheers,



On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan
 wrote:

Upon context creation, internal driver structures are malloc()'ed
and memset() to zero. This results in an invalid number of
samples 'by default'. Handle this in the simplest way to avoid
elaborate and probably equally sub-optimal solutions.

V2: Minor, use "NOTE:" instead of "N.B." in comment.

Signed-off-by: Edward O'Callaghan 
Reviewed-by: Marek Olšák 
---
 src/gallium/auxiliary/util/u_framebuffer.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_framebuffer.c b/src/gallium/auxiliary/util/u_framebuffer.c
index 775f050..b020f27 100644
--- a/src/gallium/auxiliary/util/u_framebuffer.c
+++ b/src/gallium/auxiliary/util/u_framebuffer.c
@@ -204,9 +204,15 @@ util_framebuffer_get_num_samples(const struct pipe_framebuffer_state *fb)
 * In the case of ARB_framebuffer_no_attachment
 * we obtain the number of samples directly from
 * the framebuffer state.
+*
+* NOTE: fb->samples may wind up as zero due to memset()'s on internal
+*   driver structures on their initialization and so we take the
+*   MAX here to ensure we have a valid number of samples. However,
+*   if samples is legitimately not getting set somewhere
+*   multi-sampling will evidently break.
 */
if (!(fb->nr_cbufs || fb->zsbuf))
-  return fb->samples;
+  return MAX2(fb->samples, 1);

for (i = 0; i < fb->nr_cbufs; i++) {
   if (fb->cbufs[i]) {
--
2.5.5
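
To make the failure mode described in the commit message concrete, here is a small sketch of the clamp. The struct is a hypothetical, trimmed-down stand-in for the real framebuffer state, and fb_no_attachment_samples() is an illustrative name, not a function from the tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the framebuffer state. */
struct fb_state {
   unsigned nr_cbufs;   /* number of colour attachments */
   const void *zsbuf;   /* depth/stencil attachment, or NULL */
   unsigned samples;    /* only consulted when nothing is attached */
};

/* Sample count for a framebuffer with no attachments, never less than 1. */
static unsigned fb_no_attachment_samples(const struct fb_state *fb)
{
   /* A freshly zeroed state reports 0 samples; treat that as 1. */
   return fb->samples > 1 ? fb->samples : 1;
}
```

The clamp hides the zero left behind by memset() but, as the patch comment notes, it cannot help if a real sample count is never written into the state.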

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/17] mesa/st: Set _NumSamples in update_framebuffer_state()

2016-03-25 Thread eocallaghan

On 2016-03-25 14:02, Ilia Mirkin wrote:

On Thu, Mar 24, 2016 at 8:11 PM, Edward O'Callaghan
 wrote:

Using PIPE_FORMAT_NONE to indicate what MSAA modes are supported
with a framebuffer using no attachment.

Signed-off-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 51
 1 file changed, 51 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c b/src/mesa/state_tracker/st_atom_framebuffer.c
index ae883a2..07854ca 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -64,6 +64,44 @@ update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
framebuffer->height = MIN2(framebuffer->height, surface->height);
 }

+/**
+ * Round up the requested multisample count to the next supported sample size.
+ */
+static unsigned
+framebuffer_quantize_num_samples(struct pipe_screen *screen, unsigned num_samples)
+{
+   int quantized_samples = 0;
+   bool msaa_mode_supported;
+
+   if (!num_samples)
+  return 0;
+
+   assert(screen);
+
+   /* Assumes the highest supported MSAA is x32 on any hardware */
+   for (unsigned msaa_mode = 32; msaa_mode >= 1; msaa_mode = msaa_mode/2) {

This should probably start at MaxFramebufferSamples right? Also
msaa_mode >= num_samples? [then you can get rid of the if below]


I did it in this manner because I don't trust all C compilers to warn
sufficiently on `num_samples' overflows turning this into an infinite loop,
even though it is an unsigned type. The micro-optimization serves no purpose
because the optimizer will trivially reduce the loop down, not that it has
that many iterations anyway. The loop as-is is both well bounded and
deterministic, nice qualities to have.

"Premature optimization is the root of all evil" ~ Donald Knuth.



And lastly I don't know if it's a valid assumption that we can always
just divide by 2. That said, I don't know of any hw that actually
supports non-power-of-two MSAA levels, so perhaps it's OK.


You are way over-engineering here; it is a totally reasonable assumption,
and if such hardware does exist which we would support (I can't see any in
the tree currently, as far as I am aware) then they can provide follow-up
fixes.



+  assert(!(msaa_mode > 32 || msaa_mode == 0)); /* be safe from int overflows */
+  if (msaa_mode >= num_samples) {
+ /**
+  * For ARB_framebuffer_no_attachment, A format of
+  * PIPE_FORMAT_NONE implies what number of samples is
+  * supported for a framebuffer with no attachment. Thus the
+  * drivers callback must be adjusted for this.
+  */
+ msaa_mode_supported = screen->is_format_supported(screen,
+ PIPE_FORMAT_NONE, PIPE_TEXTURE_2D,
+ msaa_mode, PIPE_BIND_RENDER_TARGET);
+ /**
+  * Check if the MSAA mode that is higher than the requested
+  * num_samples is supported, and if so returning it.
+  */
+ if (msaa_mode_supported)
+quantized_samples = msaa_mode;
+  }
+   }
+
+   return quantized_samples;
+}

 /**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
@@ -72,6 +110,8 @@ static void
 update_framebuffer_state( struct st_context *st )
 {
struct pipe_framebuffer_state *framebuffer = &st->state.framebuffer;
+   struct pipe_context *pipe = st->pipe;
+   struct pipe_screen *screen = pipe->screen;
struct gl_framebuffer *fb = st->ctx->DrawBuffer;
struct st_renderbuffer *strb;
GLuint i;
@@ -82,6 +122,17 @@ update_framebuffer_state( struct st_context *st )
framebuffer->width  = UINT_MAX;
framebuffer->height = UINT_MAX;

+   /**
+* Quantize the derived default number of samples:
+*
+* A query to the driver of supported MSAA values the
+* hardware supports is done as to legalize the number
+* of application requested samples, NumSamples.
+* See commit eb9cf3c for more information.
+*/
+   fb->DefaultGeometry._NumSamples =
+  framebuffer_quantize_num_samples(screen, fb->DefaultGeometry.NumSamples);
+
/*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/

/* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
--
2.5.5
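
The quantization loop discussed in this thread can be tried outside the driver. The sketch below keeps the descending 32, 16, ..., 1 walk from the patch but replaces the screen callback with a hypothetical function pointer; msaa_supported_fn, quantize_num_samples() and example_supported() are names invented for this illustration, and the "1x/2x/4x/8x only" hardware is an assumption.

```c
#include <stdbool.h>

/* Hypothetical callback standing in for the driver's format-support query. */
typedef bool (*msaa_supported_fn)(unsigned sample_count);

/* Round num_samples up to the smallest supported power-of-two mode,
 * or return 0 if no supported mode is large enough. */
static unsigned quantize_num_samples(msaa_supported_fn supported,
                                     unsigned num_samples)
{
   unsigned quantized = 0;

   if (num_samples == 0)
      return 0;

   /* Walk 32, 16, ..., 1. Each supported mode >= num_samples overwrites
    * the previous candidate, so the smallest such mode is the one kept. */
   for (unsigned mode = 32; mode >= 1; mode /= 2) {
      if (mode >= num_samples && supported(mode))
         quantized = mode;
   }

   return quantized;
}

/* Example hardware for the sketch: supports 1x, 2x, 4x and 8x only. */
static bool example_supported(unsigned n)
{
   return n == 1 || n == 2 || n == 4 || n == 8;
}
```

On this example hardware a request for 3 samples quantizes to 4, a request for 5 quantizes to 8, and a request for 16 yields 0 because nothing large enough is supported. The loop terminates because integer division of 1 by 2 gives 0, which fails the `mode >= 1` test.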

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: fix 2D array MSAA failures since image support landed

2016-03-22 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-23 04:27, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index b9bdd47..b8fde00 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2993,7 +2993,8 @@ si_make_texture_descriptor(struct si_screen *screen,
if (type == V_008F1C_SQ_RSRC_IMG_1D_ARRAY) {
height = 1;
depth = res->array_size;
-   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY) {
+   } else if (type == V_008F1C_SQ_RSRC_IMG_2D_ARRAY ||
+  type == V_008F1C_SQ_RSRC_IMG_2D_MSAA_ARRAY) {
if (sampler || res->target != PIPE_TEXTURE_3D)
depth = res->array_size;
} else if (type == V_008F1C_SQ_RSRC_IMG_CUBE)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/17] gallium/aux: Fix u_blitter.c for layers/samples

2016-03-22 Thread eocallaghan
Ah, you are correct; this is no longer needed in the push branch. We can
drop this one from the series as it's a no-op. Please ignore, and thanks
for spotting it.


On 2016-03-22 02:43, Marek Olšák wrote:

Does this fix anything even? The blitter always binds something, thus
this should have no effect.

Marek

On Sat, Mar 19, 2016 at 7:41 AM, Edward O'Callaghan
 wrote:

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/auxiliary/util/u_blitter.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index 43fbd8e..c4a32e8 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1566,11 +1566,13 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
/* Initialize framebuffer state. */
fb_state.width = dst->width;
fb_state.height = dst->height;
-   fb_state.nr_cbufs = blit_depth || blit_stencil ? 0 : 1;
fb_state.cbufs[0] = NULL;
fb_state.zsbuf = NULL;

if (blit_depth || blit_stencil) {
+  fb_state.nr_cbufs = 0;
+  fb_state.layers = 0;
+  fb_state.samples = 1;
   pipe->bind_blend_state(pipe, ctx->blend[0][0]);

   if (blit_depth && blit_stencil) {
@@ -1594,6 +1596,7 @@ void util_blitter_blit_generic(struct blitter_context *blitter,
   }

} else {
+  fb_state.nr_cbufs = 1;
   unsigned colormask = mask & PIPE_MASK_RGBA;

   pipe->bind_blend_state(pipe, ctx->blend[colormask][alpha_blend]);
--
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/11] radeonsi: shader buffer support (atomic counters, ssbo)

2016-03-21 Thread eocallaghan

Hi Nicolai,

Thanks for taking over this work and going the whole nine yards with it!

This series is, Reviewed-by: Edward O'Callaghan 



Thanks again,
Edward.

On 2016-03-22 10:21, Nicolai Hähnle wrote:

Hi,

since shader images have laid most of the foundation, here are shader buffers
now. This is the last extension missing for OpenGL 4.2 (we still need to turn
on GLSL 4.2, but I think that only involves flipping a bit).

As with shader images, this extension needs bleeding edge LLVM - this time,
important patches have not landed upstream yet, and if you want to try this
code you'll need my LLVM branch at
https://cgit.freedesktop.org/~nh/llvm/log/?h=images

(For those following along at home, the necessary LLVM patches for shader
images have already landed upstream.)

In principle, there are two alternative implementations for shader buffers:
using LLVM IR pointers with LLVM-native load/store instructions directly, or
using intrinsics that operate on GCN buffer descriptors. This implementation
uses the second approach. A brief comparison between the two approaches:

1. The pointer approach would use FLAT memory instructions on CI+, which
   operate on 64 bit pointers rather than 128 bit buffer descriptors. This
   would reduce SGPR memory pressure slightly.

2. LLVM understands pointers for alias analysis, so it's possible that it
   would generate somewhat better code if we were to use pointers in the
   IR.

3. The buffer load/store instructions have built-in bounds checks. Bounds
   checks are required for an honest implementation of the ARB_robustness
   extension, which we claim to support.

The last point makes it obvious that the implementation really needs to use
buffer intrinsics, but it'd be interesting to know how big the difference
in code quality is versus something that uses pointers. To get the best of
both worlds, we should really find a way to teach LLVM's alias analysis
about what those buffer descriptors mean. For now, this current approach is
the right way to do it.

Please review!

Thanks,
Nicolai
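
As a rough CPU-side analogy for point 3 of the comparison above, the sketch below models a buffer descriptor as a base pointer plus a record count and checks every access against it. struct buffer_desc, buffer_load_dword() and buffer_store_dword() are hypothetical names for this illustration only; returning zero for out-of-range loads and silently dropping out-of-range stores are assumptions of the sketch, not a statement about what the hardware or the series does.

```c
#include <stdint.h>

/* Hypothetical descriptor: base address plus number of dword records. */
struct buffer_desc {
   uint32_t *base;
   uint32_t num_records;
};

/* Bounds-checked load: out-of-range reads yield 0 in this sketch. */
static uint32_t buffer_load_dword(const struct buffer_desc *d, uint32_t index)
{
   if (index >= d->num_records)
      return 0;
   return d->base[index];
}

/* Bounds-checked store: out-of-range writes are dropped in this sketch. */
static void buffer_store_dword(const struct buffer_desc *d, uint32_t index,
                               uint32_t value)
{
   if (index < d->num_records)
      d->base[index] = value;
}
```

A raw pointer carries no size, so the equivalent check would have to be emitted separately by the compiler; with a descriptor the range travels with the buffer.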

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: replace gl_context->Multisample._Enabled with _mesa_is_multisample_enabled.

2016-03-21 Thread eocallaghan

Too quick, very nice cleanup, thanks.

Reviewed-by: Edward O'Callaghan 

On 2016-03-22 12:58, Bas Nieuwenhuizen wrote:

This removes any dependency on driver validation of the number of
framebuffer samples.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/mesa/drivers/dri/i965/brw_util.h               |  5 +++--
 src/mesa/drivers/dri/i965/gen6_cc.c                |  6 +++---
 src/mesa/drivers/dri/i965/gen6_multisample_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_blend_state.c       |  6 +++---
 src/mesa/drivers/dri/i965/gen8_depth_state.c       |  3 ++-
 src/mesa/drivers/dri/i965/gen8_sf_state.c          |  4 ++--
 src/mesa/main/framebuffer.c                        | 19 +++
 src/mesa/main/framebuffer.h                        |  3 +++
 src/mesa/main/mtypes.h                             |  1 -
 src/mesa/main/state.c                              | 17 -
 src/mesa/program/prog_statevars.c                  |  2 +-
 src/mesa/state_tracker/st_atom_rasterizer.c        |  4 ++--
 src/mesa/state_tracker/st_atom_shader.c            |  2 +-
 src/mesa/swrast/s_points.c                         |  4 ++--
 14 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_util.h b/src/mesa/drivers/dri/i965/brw_util.h
index 1f27e98..3e9a6ee 100644
--- a/src/mesa/drivers/dri/i965/brw_util.h
+++ b/src/mesa/drivers/dri/i965/brw_util.h
@@ -34,6 +34,7 @@
 #define BRW_UTIL_H

 #include "brw_context.h"
+#include "main/framebuffer.h"

 extern GLuint brw_translate_blend_factor( GLenum factor );
 extern GLuint brw_translate_blend_equation( GLenum mode );
@@ -49,13 +50,13 @@ brw_get_line_width(struct brw_context *brw)
 * implementation-dependent maximum non-antialiased line width."
 */
float line_width =
-  CLAMP(!brw->ctx.Multisample._Enabled && !brw->ctx.Line.SmoothFlag
+  CLAMP(!_mesa_is_multisample_enabled(&brw->ctx) && !brw->ctx.Line.SmoothFlag
 ? roundf(brw->ctx.Line.Width) : brw->ctx.Line.Width,
 0.0f, brw->ctx.Const.MaxLineWidth);
uint32_t line_width_u3_7 = U_FIXED(line_width, 7);

/* Line width of 0 is not allowed when MSAA enabled */
-   if (brw->ctx.Multisample._Enabled) {
+   if (_mesa_is_multisample_enabled(&brw->ctx)) {
   if (line_width_u3_7 == 0)
  line_width_u3_7 = 1;
} else if (brw->ctx.Line.SmoothFlag && line_width < 1.5f) {
diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c b/src/mesa/drivers/dri/i965/gen6_cc.c
index cee139b..f5a7d4d 100644
--- a/src/mesa/drivers/dri/i965/gen6_cc.c
+++ b/src/mesa/drivers/dri/i965/gen6_cc.c
@@ -198,14 +198,14 @@ gen6_upload_blend_state(struct brw_context *brw)
   if(!is_buffer_zero_integer_format) {
  /* _NEW_MULTISAMPLE */
  blend[b].blend1.alpha_to_coverage =
-ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToCoverage;
+_mesa_is_multisample_enabled(ctx) && ctx->Multisample.SampleAlphaToCoverage;

/* From SandyBridge PRM, volume 2 Part 1, section 8.2.3, BLEND_STATE:
 * DWord 1, Bit 30 (AlphaToOne Enable):
 * "If Dual Source Blending is enabled, this bit must be disabled"
 */
  WARN_ONCE(ctx->Color.Blend[b]._UsesDualSrc &&
-   ctx->Multisample._Enabled &&
+   _mesa_is_multisample_enabled(ctx) &&
ctx->Multisample.SampleAlphaToOne,
"HW workaround: disabling alpha to one with dual src "
"blending\n");
@@ -213,7 +213,7 @@ gen6_upload_blend_state(struct brw_context *brw)
 blend[b].blend1.alpha_to_one = false;
 else
blend[b].blend1.alpha_to_one =
-	   ctx->Multisample._Enabled && ctx->Multisample.SampleAlphaToOne;
+	   _mesa_is_multisample_enabled(ctx) && ctx->Multisample.SampleAlphaToOne;

  blend[b].blend1.alpha_to_coverage_dither = (brw->gen >= 7);
   }
diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
index 8eb620d..fcd313a 100644
--- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c
@@ -171,7 +171,7 @@ gen6_determine_sample_mask(struct brw_context *brw)
/* BRW_NEW_NUM_SAMPLES */
unsigned num_samples = brw->num_samples;

-   if (ctx->Multisample._Enabled) {
+   if (_mesa_is_multisample_enabled(ctx)) {
   if (ctx->Multisample.SampleCoverage) {
  coverage = ctx->Multisample.SampleCoverageValue;
  coverage_invert = ctx->Multisample.SampleCoverageInvert;
diff --git a/src/mesa/drivers/dri/i965/gen8_blend_state.c b/src/mesa/drivers/dri/i965/gen8_blend_state.c
index 786c79a..63186bd 100644
--- a/src/mesa/drivers/dri/i965/gen8_blend_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_blend_state.c
@@ -65,7 +65,7 @@ gen8_upload_blend_state(struct brw_context *brw)

if (rb_zero_type != GL_INT && rb_zero_type != GL_UNSIGNED_INT) {
   /* _NEW_MULTISAMPLE */
-  if (ctx->Mul

Re: [Mesa-dev] [PATCH] tgsi: drop unused set_exec/kill_mask interfaces.

2016-03-21 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-22 11:29, Dave Airlie wrote:

From: Dave Airlie 

These don't get used and haven't been in git history from what I can
see, so drop them.

Signed-off-by: Dave Airlie 
---
 src/gallium/auxiliary/draw/draw_gs.c      |  6 --
 src/gallium/auxiliary/draw/draw_vs_exec.c |  6 --
 src/gallium/auxiliary/tgsi/tgsi_exec.h    | 25 -
 3 files changed, 37 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c
index 6b33341..fcef31b 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -206,12 +206,6 @@ static unsigned tgsi_gs_run(struct draw_geometry_shader *shader,
 {
struct tgsi_exec_machine *machine = shader->machine;

-   tgsi_set_exec_mask(machine,
-  1,
-  input_primitives > 1,
-  input_primitives > 2,
-  input_primitives > 3);
-
/* run interpreter */
tgsi_exec_machine_run(machine);

diff --git a/src/gallium/auxiliary/draw/draw_vs_exec.c b/src/gallium/auxiliary/draw/draw_vs_exec.c
index abd64f5..3fd8ef3 100644
--- a/src/gallium/auxiliary/draw/draw_vs_exec.c
+++ b/src/gallium/auxiliary/draw/draw_vs_exec.c
@@ -159,12 +159,6 @@ vs_exec_run_linear( struct draw_vertex_shader *shader,
  input = (const float (*)[4])((const char *)input + input_stride);
   }

-  tgsi_set_exec_mask(machine,
- 1,
- max_vertices > 1,
- max_vertices > 2,
- max_vertices > 3);
-
   /* run interpreter */
   tgsi_exec_machine_run( machine );

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h b/src/gallium/auxiliary/tgsi/tgsi_exec.h
index 12a6875..991c3bf 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
@@ -196,10 +196,6 @@ struct tgsi_sampler
 #define TGSI_EXEC_TEMP_HALF_I   (TGSI_EXEC_NUM_TEMPS + 3)
 #define TGSI_EXEC_TEMP_HALF_C   0

-/* execution mask, each value is either 0 or ~0 */
-#define TGSI_EXEC_MASK_I            (TGSI_EXEC_NUM_TEMPS + 3)
-#define TGSI_EXEC_MASK_C            1
-
 /* 4 register buffer for various purposes */
 #define TGSI_EXEC_TEMP_R0           (TGSI_EXEC_NUM_TEMPS + 4)
 #define TGSI_EXEC_NUM_TEMP_R        4
@@ -397,27 +393,6 @@ boolean
 tgsi_check_soa_dependencies(const struct tgsi_full_instruction *inst);


-static inline void
-tgsi_set_kill_mask(struct tgsi_exec_machine *mach, unsigned mask)
-{
-   mach->Temps[TGSI_EXEC_TEMP_KILMASK_I].xyzw[TGSI_EXEC_TEMP_KILMASK_C].u[0] =
-  mask;
-}
-
-
-/** Set execution mask values prior to executing the shader */
-static inline void
-tgsi_set_exec_mask(struct tgsi_exec_machine *mach,
-   boolean ch0, boolean ch1, boolean ch2, boolean ch3)
-{
-   int *mask = mach->Temps[TGSI_EXEC_MASK_I].xyzw[TGSI_EXEC_MASK_C].i;
-   mask[0] = ch0 ? ~0 : 0;
-   mask[1] = ch1 ? ~0 : 0;
-   mask[2] = ch2 ? ~0 : 0;
-   mask[3] = ch3 ? ~0 : 0;
-}
-
-
 extern void
 tgsi_exec_set_constant_buffers(struct tgsi_exec_machine *mach,
unsigned num_bufs,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: fix out-of-bounds indexing of shader images

2016-03-21 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-22 07:41, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

Results are undefined but may not crash. Without this change, out-of-bounds
indexing can lead to VM faults and GPU hangs.

Constant buffers, samplers, and possibly others will eventually need similar
treatment to support GL_ARB_robust_buffer_access_behavior.
---
 src/gallium/drivers/radeonsi/si_shader.c | 44 +++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 9ad2290..1e4bf82 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -532,6 +532,37 @@ static LLVMValueRef get_indirect_index(struct si_shader_context *ctx,
 }

 /**
+ * Like get_indirect_index, but restricts the return value to a (possibly
+ * undefined) value inside [0..num).
+ */
+static LLVMValueRef get_bounded_indirect_index(struct si_shader_context *ctx,
+  const struct tgsi_ind_register *ind,
+  int rel_index, unsigned num)
+{
+   struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm;
+   LLVMBuilderRef builder = gallivm->builder;
+   LLVMValueRef result = get_indirect_index(ctx, ind, rel_index);
+   LLVMValueRef c_max = LLVMConstInt(ctx->i32, num - 1, 0);
+   LLVMValueRef cc;
+
+   if (util_is_power_of_two(num)) {
+   result = LLVMBuildAnd(builder, result, c_max, "");
+   } else {
+   /* In theory, this MAX pattern should result in code that is
+* as good as the bit-wise AND above.
+*
+* In practice, LLVM generates worse code (at the time of
+* writing), because its value tracking is not strong enough.
+*/
+   cc = LLVMBuildICmp(builder, LLVMIntULE, result, c_max, "");
+   result = LLVMBuildSelect(builder, cc, result, c_max, "");
+   }
+
+   return result;
+}
+
+
+/**
  * Calculate a dword address given an input or output register and a stride.
  */
 static LLVMValueRef get_dw_address(struct si_shader_context *ctx,
@@ -2814,7 +2845,18 @@ image_fetch_rsrc(
LLVMValueRef rsrc_ptr;
LLVMValueRef tmp;

-		ind_index = get_indirect_index(ctx, &image->Indirect, image->Register.Index);
+   /* From the GL_ARB_shader_image_load_store extension spec:
+*
+*If a shader performs an image load, store, or atomic
+*operation using an image variable declared as an array,
+*and if the index used to select an individual element is
+*negative or greater than or equal to the size of the
+*array, the results of the operation are undefined but may
+*not lead to termination.
+*/
+   ind_index = get_bounded_indirect_index(ctx, &image->Indirect,
+  image->Register.Index,
+  SI_NUM_IMAGES);

rsrc_ptr = LLVMGetParam(ctx->radeon_bld.main_fn, SI_PARAM_IMAGES);
tmp = build_indexed_load_const(ctx, rsrc_ptr, ind_index);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/17] gallium: Add PIPE_CAP_MSAA_MODES

2016-03-21 Thread eocallaghan

On 2016-03-21 21:06, Marek Olšák wrote:
On Sat, Mar 19, 2016 at 5:09 PM, Ilia Mirkin wrote:

On Sat, Mar 19, 2016 at 12:02 PM, Bas Nieuwenhuizen wrote:

On Sat, Mar 19, 2016 at 4:25 PM, Ilia Mirkin wrote:

On Sat, Mar 19, 2016 at 11:14 AM, Bas Nieuwenhuizen wrote:

That would limit us to supporting sample counts for which we have
texture formats.

As far as I understand with radeonsi we can support 16 samples without
any attachments, but all formats are limited to <= 8 samples.


So you're going to end up with a situation where GL_MAX_SAMPLES is
less than GL_MAX_FRAMEBUFFER_SAMPLES? I don't know that that's a
useful thing to have. This implementation still has the problem of
only supporting POT MSAA levels (although tbh I'm not 100% sure
there's hw out there that supports NPOT MSAA levels). If people really
want this, I think the way to go would be to make
is_format_supported() work with PIPE_FORMAT_NONE and do it that way.

Also, are you *sure* that's the case on radeonsi? I find it very odd
that the rasterizer would support a higher MSAA level than the highest
attachment would...


I am pretty confident that this is the case. I just tested 16 samples
(although this series seems to miss changing MaxFramebufferSamples),
and the driver disallows any texture format with > 8 samples [1].
Furthermore the proprietary driver on Windows seems to have
GL_MAX_SAMPLES=8 and GL_MAX_FRAMEBUFFER_SAMPLES=16 [2].


OK. I still think it's crazy, but it is what it is :)


It's called EQAA (similar to CSAA). The hardware can do 16 unique
depth samples, but only 8 unique color samples can be stored. Other
than that, the rasterization hw supports 16x MSAA fully.





Using PIPE_FORMAT_NONE to query the driver would probably be a bit
less error prone than the current code that sets the masks, so that
would be fine with me.


Actually my earlier criticism about it only doing POT levels is a bit
off -- after reading some more of the code, I just think that the
settings to drivers were off - it should have been (1 << 8) | (1 <<
4), etc. This works up to 32x MSAA, which is not supported by anyone
(for real, although NVIDIA blob drivers do fake it).


R300 can do 6x MSAA, but granted it won't support this extension.



I do still prefer to avoid having separate places where this info is
encoded... so I maintain my vote for using PIPE_FORMAT_NONE in
is_format_supported.


Same here.


Same. I fixed this in the up-coming series.



Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 17/17] GL3.txt: Mark ARB_framebuffer_no_attachments as done

2016-03-19 Thread eocallaghan

On 2016-03-19 21:08, Kai Wasserbäch wrote:

Edward O'Callaghan wrote on 19.03.2016 07:41:

Signed-off-by: Edward O'Callaghan 
---
 docs/GL3.txt  | 2 +-
 docs/relnotes/11.3.0.html | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 3058996..b9fc86b 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -172,7 +172,7 @@ GL 4.3, GLSL 4.30:
   GL_KHR_debug                          DONE (all drivers)
   GL_ARB_explicit_uniform_location      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
-  GL_ARB_framebuffer_no_attachments     DONE (i965)
+  GL_ARB_framebuffer_no_attachments     DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_internalformat_query2          DONE (i965)
   GL_ARB_invalidate_subdata             DONE (all drivers)
   GL_ARB_multi_draw_indirect            DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)


Should this also update the GL_ARB_framebuffer_no_attachments line in the
OpenGL ES 3.1 section? Or is more work needed for that? In the latter case a
small comment in the commit message might be nice.

Cheers,
Kai


I am not working on ES and don't really know much about it, so I'll leave
that one to the experts.

My focus here is just the usual GL.

Thanks,
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] st/mesa, radeonsi: some MemoryBarrier fixes

2016-03-19 Thread eocallaghan

Hi Nicolai,

This series is,
Reviewed-by: Edward O'Callaghan 

Thanks,

On 2016-03-19 14:37, Nicolai Hähnle wrote:

Hi,

these patches apply on top of my ARB_shader_image_load_store series. Together,
they fix a few remaining fails with piglit's
arb_shader_image_load_store-host-mem-barrier.
You can see them in context at
https://cgit.freedesktop.org/~nh/mesa/log/?h=ssbo

The basic assumption for how barrier bits are translated is that each Gallium
object / binding point has its own PIPE_BARRIER_* bit, but the driver will
automatically do the necessary invalidations/flushes for transfers and
blit-type operations, as well as when the framebuffer state is changed.

This is still very tricky stuff to get right, but at least I think it's
shaping up nicely for radeonsi, as evidenced by the fact that the
host-mem-barrier test passes (and the control subtests also show that
what we're doing here isn't just a no-op). Please review!

Thanks,
Nicolai
---
 src/gallium/drivers/radeonsi/si_state.c  |  7 +++--
 src/gallium/include/pipe/p_defines.h |  1 +
 .../state_tracker/st_cb_texturebarrier.c | 25 +-
 3 files changed, 30 insertions(+), 3 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac require libdrm 2.4.65 for amdgpu because of drmGetDevice

2016-03-13 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-03-14 03:46, Marek Olšák wrote:

From: Marek Olšák 

---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 49be147..d768db6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -70,7 +70,7 @@ AC_SUBST([OPENCL_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_RADEON_REQUIRED=2.4.56
-LIBDRM_AMDGPU_REQUIRED=2.4.63
+LIBDRM_AMDGPU_REQUIRED=2.4.65
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] mesa: gl_NumSamples should always be at least one

2016-02-17 Thread eocallaghan

Reviewed-by: Edward O'Callaghan

I had the same issue also.

On 2016-02-16 17:31, Ilia Mirkin wrote:

From ARB_sample_shading:

"gl_NumSamples is the total number of samples in the framebuffer,
 or one if rendering to a non-multisample framebuffer"

So make sure to always pass in at least 1.

Signed-off-by: Ilia Mirkin 
---
 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c b/src/mesa/program/prog_statevars.c
index eed2412..489f75f 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -353,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const gl_state_index state[],
   }
   return;
case STATE_NUM_SAMPLES:
-  ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer);
+  ((int *)value)[0] = MAX2(1, _mesa_geometric_samples(ctx->DrawBuffer));
   return;
case STATE_DEPTH_RANGE:
   value[0] = ctx->ViewportArray[0].Near;                /* near */


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/dri/r200: Refrain from using symbol links in repo

2016-02-17 Thread eocallaghan
Disregard; apparently this breaks out-of-tree builds. There is perhaps no
good solution here, so I'll refrain from opening this can of worms for now.


On 2016-02-18 15:46, Edward O'Callaghan wrote:

Just use the relative path in the Makefile.sources over
symbolic links that are not necessarily portable.

Untested as I don't have this old hardware.

Signed-off-by: Edward O'Callaghan 
---
 src/mesa/drivers/dri/r200/Makefile.sources        | 58 +++
 src/mesa/drivers/dri/r200/radeon_buffer_objects.c |  1 -
 src/mesa/drivers/dri/r200/radeon_buffer_objects.h |  1 -
 src/mesa/drivers/dri/r200/radeon_chipset.h|  1 -
 src/mesa/drivers/dri/r200/radeon_cmdbuf.h |  1 -
 src/mesa/drivers/dri/r200/radeon_common.c |  1 -
 src/mesa/drivers/dri/r200/radeon_common.h |  1 -
 src/mesa/drivers/dri/r200/radeon_common_context.c |  1 -
 src/mesa/drivers/dri/r200/radeon_common_context.h |  1 -
 src/mesa/drivers/dri/r200/radeon_debug.c  |  1 -
 src/mesa/drivers/dri/r200/radeon_debug.h  |  1 -
 src/mesa/drivers/dri/r200/radeon_dma.c|  1 -
 src/mesa/drivers/dri/r200/radeon_dma.h|  1 -
 src/mesa/drivers/dri/r200/radeon_fbo.c|  1 -
 src/mesa/drivers/dri/r200/radeon_fog.c|  1 -
 src/mesa/drivers/dri/r200/radeon_fog.h|  1 -
 src/mesa/drivers/dri/r200/radeon_mipmap_tree.c|  1 -
 src/mesa/drivers/dri/r200/radeon_mipmap_tree.h|  1 -
 src/mesa/drivers/dri/r200/radeon_pixel_read.c |  1 -
 src/mesa/drivers/dri/r200/radeon_queryobj.c   |  1 -
 src/mesa/drivers/dri/r200/radeon_queryobj.h   |  1 -
 src/mesa/drivers/dri/r200/radeon_screen.c |  1 -
 src/mesa/drivers/dri/r200/radeon_screen.h |  1 -
 src/mesa/drivers/dri/r200/radeon_span.c   |  1 -
 src/mesa/drivers/dri/r200/radeon_span.h   |  1 -
 src/mesa/drivers/dri/r200/radeon_tex_copy.c   |  1 -
 src/mesa/drivers/dri/r200/radeon_texture.c|  1 -
 src/mesa/drivers/dri/r200/radeon_texture.h|  1 -
 src/mesa/drivers/dri/r200/radeon_tile.c   |  1 -
 src/mesa/drivers/dri/r200/radeon_tile.h   |  1 -
 30 files changed, 29 insertions(+), 58 deletions(-)
 delete mode 12 src/mesa/drivers/dri/r200/radeon_buffer_objects.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_buffer_objects.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_chipset.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_cmdbuf.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_common.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_common.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_common_context.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_common_context.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_debug.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_debug.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_dma.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_dma.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_fbo.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_fog.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_fog.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_mipmap_tree.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_mipmap_tree.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_pixel_read.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_queryobj.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_queryobj.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_screen.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_screen.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_span.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_span.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_tex_copy.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_texture.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_texture.h
 delete mode 12 src/mesa/drivers/dri/r200/radeon_tile.c
 delete mode 12 src/mesa/drivers/dri/r200/radeon_tile.h

diff --git a/src/mesa/drivers/dri/r200/Makefile.sources
b/src/mesa/drivers/dri/r200/Makefile.sources
index dbcb9af..ef2e7be 100644
--- a/src/mesa/drivers/dri/r200/Makefile.sources
+++ b/src/mesa/drivers/dri/r200/Makefile.sources
@@ -1,30 +1,30 @@
 R200_COMMON_FILES = \
-   radeon_buffer_objects.c \
-   radeon_buffer_objects.h \
-   radeon_cmdbuf.h \
-   radeon_common.c \
-   radeon_common.h \
-   radeon_common_context.c \
-   radeon_common_context.h \
-   radeon_debug.c \
-   radeon_debug.h \
-   radeon_dma.c \
-   radeon_dma.h \
-   radeon_fbo.c \
-   radeon_fog.c \
-   radeon_fog.h \
-   radeon_mipmap_tree.c \
-   radeon_mipmap_tree.h \
-   radeon_pixel_read.c \
-   radeon_queryobj.c \
-   radeon_queryobj.h \
-   radeon_span.c \
-   radeon_span.h \
-   radeon_tex_copy.c \
-   radeon_texture.c \
-   radeon_texture.h \
-   radeon_tile.c \
-   radeon_

Re: [Mesa-dev] [PATCH v3] clover: fix build failure since bfd695e

2016-02-13 Thread eocallaghan

Thanks kindly.

Reviewed-by: Edward O'Callaghan 

On 2016-02-14 09:39, Serge Martin wrote:

---
 src/gallium/state_trackers/clover/core/kernel.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp
b/src/gallium/state_trackers/clover/core/kernel.cpp
index 41b3852..8396be9 100644
--- a/src/gallium/state_trackers/clover/core/kernel.cpp
+++ b/src/gallium/state_trackers/clover/core/kernel.cpp
@@ -76,9 +76,9 @@ kernel::launch(command_queue &q,
   exec.g_buffers.data(), g_handles.data());


// Fill information for the launch_grid() call.
-   info.block = pad_vector(q, block_size, 1).data(),
-   info.grid = pad_vector(q, reduced_grid_size, 1).data(),
-   info.pc = find(name_equals(_name), m.sysm).offset;
+   copy(pad_vector(q, block_size, 1), info.block);
+   copy(pad_vector(q, reduced_grid_size, 1), info.grid);
+   info.pc = find(name_equals(_name), m.syms).offset;
info.input = exec.input.data();

q.pipe->launch_grid(q.pipe, &info);




Re: [Mesa-dev] [PATCH 1/4] tgsi: break gigantic tgsi_scan_shader() function into pieces

2016-02-05 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-02-06 11:56, Brian Paul wrote:

New functions for examining instructions, declarations, etc.
---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 739 +
 1 file changed, 375 insertions(+), 364 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index 687fb54..4199dbe 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -44,6 +44,375 @@



+static void
+scan_instruction(struct tgsi_shader_info *info,
+ const struct tgsi_full_instruction *fullinst,
+ unsigned *current_depth)
+{
+   unsigned i;
+
+   assert(fullinst->Instruction.Opcode < TGSI_OPCODE_LAST);
+   info->opcode_count[fullinst->Instruction.Opcode]++;
+
+   switch (fullinst->Instruction.Opcode) {
+   case TGSI_OPCODE_IF:
+   case TGSI_OPCODE_UIF:
+   case TGSI_OPCODE_BGNLOOP:
+  (*current_depth)++;
+  info->max_depth = MAX2(info->max_depth, *current_depth);
+  break;
+   case TGSI_OPCODE_ENDIF:
+   case TGSI_OPCODE_ENDLOOP:
+  (*current_depth)--;
+  break;
+   default:
+  break;
+   }
+
+   if (fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_CENTROID ||
+   fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_OFFSET ||
+   fullinst->Instruction.Opcode == TGSI_OPCODE_INTERP_SAMPLE) {
+  const struct tgsi_full_src_register *src0 = &fullinst->Src[0];
+  unsigned input;
+
+  if (src0->Register.Indirect && src0->Indirect.ArrayID)
+ input = info->input_array_first[src0->Indirect.ArrayID];
+  else
+ input = src0->Register.Index;
+
+  /* For the INTERP opcodes, the interpolation is always
+   * PERSPECTIVE unless LINEAR is specified.
+   */
+  switch (info->input_interpolate[input]) {
+  case TGSI_INTERPOLATE_COLOR:
+  case TGSI_INTERPOLATE_CONSTANT:
+  case TGSI_INTERPOLATE_PERSPECTIVE:
+ switch (fullinst->Instruction.Opcode) {
+ case TGSI_OPCODE_INTERP_CENTROID:
+info->uses_persp_opcode_interp_centroid = true;
+break;
+ case TGSI_OPCODE_INTERP_OFFSET:
+info->uses_persp_opcode_interp_offset = true;
+break;
+ case TGSI_OPCODE_INTERP_SAMPLE:
+info->uses_persp_opcode_interp_sample = true;
+break;
+ }
+ break;
+
+  case TGSI_INTERPOLATE_LINEAR:
+ switch (fullinst->Instruction.Opcode) {
+ case TGSI_OPCODE_INTERP_CENTROID:
+info->uses_linear_opcode_interp_centroid = true;
+break;
+ case TGSI_OPCODE_INTERP_OFFSET:
+info->uses_linear_opcode_interp_offset = true;
+break;
+ case TGSI_OPCODE_INTERP_SAMPLE:
+info->uses_linear_opcode_interp_sample = true;
+break;
+ }
+ break;
+  }
+   }
+
+   if (fullinst->Instruction.Opcode >= TGSI_OPCODE_F2D &&
+   fullinst->Instruction.Opcode <= TGSI_OPCODE_DSSG)
+  info->uses_doubles = true;
+
+   for (i = 0; i < fullinst->Instruction.NumSrcRegs; i++) {
+  const struct tgsi_full_src_register *src = &fullinst->Src[i];
+  int ind = src->Register.Index;
+
+  /* Mark which inputs are effectively used */
+  if (src->Register.File == TGSI_FILE_INPUT) {
+ unsigned usage_mask;
+ usage_mask = tgsi_util_get_inst_usage_mask(fullinst, i);
+ if (src->Register.Indirect) {
+for (ind = 0; ind < info->num_inputs; ++ind) {
+   info->input_usage_mask[ind] |= usage_mask;
+}
+ } else {
+assert(ind >= 0);
+assert(ind < PIPE_MAX_SHADER_INPUTS);
+info->input_usage_mask[ind] |= usage_mask;
+ }
+
+ if (info->processor == TGSI_PROCESSOR_FRAGMENT &&
+ !src->Register.Indirect) {
+unsigned name =
+   info->input_semantic_name[src->Register.Index];
+unsigned index =
+   info->input_semantic_index[src->Register.Index];
+
+if (name == TGSI_SEMANTIC_POSITION &&
+(src->Register.SwizzleX == TGSI_SWIZZLE_Z ||
+ src->Register.SwizzleY == TGSI_SWIZZLE_Z ||
+ src->Register.SwizzleZ == TGSI_SWIZZLE_Z ||
+ src->Register.SwizzleW == TGSI_SWIZZLE_Z))
+   info->reads_z = TRUE;
+
+if (name == TGSI_SEMANTIC_COLOR) {
+   unsigned mask =
+  (1 << src->Register.SwizzleX) |
+  (1 << src->Register.SwizzleY) |
+  (1 << src->Register.SwizzleZ) |
+  (1 << src->Register.SwizzleW);
+
+   info->colors_read |= mask << (index * 4);
+}
+ }
+  }
+
+  /* check for indirect register reads */
+  if (src->Register.Indirect) {
+ info->indirect_files |= (1 << src->Register.File);
+ info->indirect_files_read |= (1 << 
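The nesting-depth bookkeeping at the top of scan_instruction() can be sketched in isolation. This is a minimal illustration only: the sketch_opcode enum and sketch_max_depth() are hypothetical stand-ins for the TGSI_OPCODE_* values and the info->max_depth field, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode set standing in for the TGSI_OPCODE_* values. */
enum sketch_opcode {
   SKETCH_OP_MOV,
   SKETCH_OP_IF,
   SKETCH_OP_BGNLOOP,
   SKETCH_OP_ENDIF,
   SKETCH_OP_ENDLOOP
};

/* Return the deepest control-flow nesting reached by an opcode stream. */
unsigned
sketch_max_depth(const enum sketch_opcode *ops, size_t count)
{
   unsigned current_depth = 0, max_depth = 0;
   size_t i;

   for (i = 0; i < count; i++) {
      switch (ops[i]) {
      case SKETCH_OP_IF:
      case SKETCH_OP_BGNLOOP:
         /* Entering a block: bump the depth and remember the peak. */
         current_depth++;
         if (current_depth > max_depth)
            max_depth = current_depth;
         break;
      case SKETCH_OP_ENDIF:
      case SKETCH_OP_ENDLOOP:
         /* Leaving a block: a well-formed stream never underflows. */
         assert(current_depth > 0);
         current_depth--;
         break;
      default:
         break;
      }
   }
   return max_depth;
}
```

For the stream IF, BGNLOOP, MOV, ENDLOOP, ENDIF the peak depth is 2, since the loop sits inside the conditional.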

Re: [Mesa-dev] [PATCH] radeonsi: Dump LLVM IR before optimization passes

2016-02-03 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-02-04 13:28, Michel Dänzer wrote:

From: Michel Dänzer 

Otherwise it's not possible to diagnose problems caused by optimization
passes.

Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeonsi/si_shader.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index 2192b21..d6c719f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4089,13 +4089,10 @@ int si_compile_llvm(struct si_screen *sscreen,
int r = 0;
unsigned count = p_atomic_inc_return(&sscreen->b.num_compilations);

-   if (r600_can_dump_shader(&sscreen->b, processor)) {
+   if (!(sscreen->b.debug_flags & DBG_NO_IR) &&
+   r600_can_dump_shader(&sscreen->b, processor))
fprintf(stderr, "radeonsi: Compiling shader %d\n", count);

-   if (!(sscreen->b.debug_flags & DBG_NO_IR))
-   LLVMDumpModule(mod);
-   }
-
if (!si_replace_shader(count, binary)) {
r = radeon_llvm_compile(mod, binary,
r600_get_llvm_processor_name(sscreen->b.family), tm,
@@ -4177,6 +4174,11 @@ static int si_generate_gs_copy_shader(struct
si_screen *sscreen,

si_llvm_export_vs(bld_base, outputs, gsinfo->num_outputs);

+   /* Dump LLVM IR before any optimization passes */
+   if (!(sscreen->b.debug_flags & DBG_NO_IR) &&
+   r600_can_dump_shader(&sscreen->b, TGSI_PROCESSOR_GEOMETRY))
+   LLVMDumpModule(bld_base->base.gallivm->module);
+
radeon_llvm_finalize_module(&si_shader_ctx->radeon_bld);

if (dump)
@@ -4383,9 +4385,15 @@ int si_shader_create(struct si_screen *sscreen,
LLVMTargetMachineRef tm,
goto out;
}

+   mod = bld_base->base.gallivm->module;
+
+   /* Dump LLVM IR before any optimization passes */
+   if (!(sscreen->b.debug_flags & DBG_NO_IR) &&
+   r600_can_dump_shader(&sscreen->b, si_shader_ctx.type))
+   LLVMDumpModule(mod);
+
radeon_llvm_finalize_module(&si_shader_ctx.radeon_bld);

-   mod = bld_base->base.gallivm->module;
r = si_compile_llvm(sscreen, &shader->binary, &shader->config, tm,
mod, debug, si_shader_ctx.type);
if (r) {




Re: [Mesa-dev] [PATCH 0/4] radeonsi: experimental support for GPUPerfStudio

2016-02-03 Thread eocallaghan
Can't see any serious issues here, and a non-verified `working' instance of
GPUPerfStudio is surely better than a crashing one!

This series is,

Reviewed-by: Edward O'Callaghan 


On 2016-02-04 00:52, Nicolai Hähnle wrote:

Hi,

this bunch of patches meets GPUPerfStudio half-way in supporting the timing
features on CI+ hardware. The latest version of GPUPerfStudio is required.

With these patches, GPUPerfStudio should recognize our driver as supported
and offer its frame profiling features without crashing. It should also
report reasonable numbers in the profile. However, I haven't fully
validated the reported numbers, so while I'd like to get this merged now,
it should still be considered as somewhat experimental. Please review.

Thanks,
Nicolai
--
 .../drivers/radeon/r600_perfcounter.c|  38 +++---
 src/gallium/drivers/radeon/r600_query.c  |  80 ++-
 src/gallium/drivers/radeon/r600_query.h  |  32 ++---
 .../drivers/radeonsi/si_perfcounter.c| 121 +
 4 files changed, 201 insertions(+), 70 deletions(-)



Re: [Mesa-dev] [PATCH 1/4] st/mesa: unify variants and delete functions for TCS, TES, GS

2016-01-30 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-31 02:50, Marek Olšák wrote:

From: Marek Olšák 

no difference between those
---
 src/mesa/state_tracker/st_atom_shader.c |   6 +-
 src/mesa/state_tracker/st_cb_program.c  |  18 ++-
 src/mesa/state_tracker/st_context.h |   6 +-
 src/mesa/state_tracker/st_program.c | 204
 src/mesa/state_tracker/st_program.h |  88 +++---
 5 files changed, 108 insertions(+), 214 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_shader.c
b/src/mesa/state_tracker/st_atom_shader.c
index 0f9ea10..2d8a3c3 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -163,7 +163,7 @@ static void
 update_gp( struct st_context *st )
 {
struct st_geometry_program *stgp;
-   struct st_gp_variant_key key;
+   struct st_basic_variant_key key;

if (!st->ctx->GeometryProgram._Current) {
   cso_set_geometry_shader_handle(st->cso_context, NULL);
@@ -199,7 +199,7 @@ static void
 update_tcp( struct st_context *st )
 {
struct st_tessctrl_program *sttcp;
-   struct st_tcp_variant_key key;
+   struct st_basic_variant_key key;

if (!st->ctx->TessCtrlProgram._Current) {
   cso_set_tessctrl_shader_handle(st->cso_context, NULL);
@@ -235,7 +235,7 @@ static void
 update_tep( struct st_context *st )
 {
struct st_tesseval_program *sttep;
-   struct st_tep_variant_key key;
+   struct st_basic_variant_key key;

if (!st->ctx->TessEvalProgram._Current) {
   cso_set_tesseval_shader_handle(st->cso_context, NULL);
diff --git a/src/mesa/state_tracker/st_cb_program.c
b/src/mesa/state_tracker/st_cb_program.c
index 2c4eccf..6f9c53e 100644
--- a/src/mesa/state_tracker/st_cb_program.c
+++ b/src/mesa/state_tracker/st_cb_program.c
@@ -153,7 +153,8 @@ st_delete_program(struct gl_context *ctx, struct
gl_program *prog)
  struct st_geometry_program *stgp =
 (struct st_geometry_program *) prog;

- st_release_gp_variants(st, stgp);
+ st_release_basic_variants(st, stgp->Base.Base.Target,
+   &stgp->variants, &stgp->tgsi);

  if (stgp->glsl_to_tgsi)
 free_glsl_to_tgsi_visitor(stgp->glsl_to_tgsi);
@@ -175,7 +176,8 @@ st_delete_program(struct gl_context *ctx, struct
gl_program *prog)
  struct st_tessctrl_program *sttcp =
 (struct st_tessctrl_program *) prog;

- st_release_tcp_variants(st, sttcp);
+ st_release_basic_variants(st, sttcp->Base.Base.Target,
+   &sttcp->variants, &sttcp->tgsi);

  if (sttcp->glsl_to_tgsi)
 free_glsl_to_tgsi_visitor(sttcp->glsl_to_tgsi);
@@ -186,7 +188,8 @@ st_delete_program(struct gl_context *ctx, struct
gl_program *prog)
  struct st_tesseval_program *sttep =
 (struct st_tesseval_program *) prog;

- st_release_tep_variants(st, sttep);
+ st_release_basic_variants(st, sttep->Base.Base.Target,
+   &sttep->variants, &sttep->tgsi);

  if (sttep->glsl_to_tgsi)
 free_glsl_to_tgsi_visitor(sttep->glsl_to_tgsi);
@@ -239,7 +242,8 @@ st_program_string_notify( struct gl_context *ctx,
else if (target == GL_GEOMETRY_PROGRAM_NV) {
   struct st_geometry_program *stgp = (struct st_geometry_program *) prog;

-  st_release_gp_variants(st, stgp);
+  st_release_basic_variants(st, stgp->Base.Base.Target,
+&stgp->variants, &stgp->tgsi);
   if (!st_translate_geometry_program(st, stgp))
  return false;

@@ -260,7 +264,8 @@ st_program_string_notify( struct gl_context *ctx,
   struct st_tessctrl_program *sttcp =
  (struct st_tessctrl_program *) prog;

-  st_release_tcp_variants(st, sttcp);
+  st_release_basic_variants(st, sttcp->Base.Base.Target,
+&sttcp->variants, &sttcp->tgsi);
   if (!st_translate_tessctrl_program(st, sttcp))
  return false;

@@ -271,7 +276,8 @@ st_program_string_notify( struct gl_context *ctx,
   struct st_tesseval_program *sttep =
  (struct st_tesseval_program *) prog;

-  st_release_tep_variants(st, sttep);
+  st_release_basic_variants(st, sttep->Base.Base.Target,
+&sttep->variants, &sttep->tgsi);
   if (!st_translate_tesseval_program(st, sttep))
  return false;

diff --git a/src/mesa/state_tracker/st_context.h
b/src/mesa/state_tracker/st_context.h
index 9db5f11..2883edf 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -166,9 +166,9 @@ struct st_context

struct st_vp_variant *vp_variant;
struct st_fp_variant *fp_variant;
-   struct st_gp_variant *gp_variant;
-   struct st_tcp_variant *tcp_variant;
-   struct st_tep_variant *tep_variant;
+   struct st_basic_variant *gp_variant;
+   struct st_basic_variant *tcp_variant;
+   struct st_basic_variant *tep_variant;

Re: [Mesa-dev] [PATCH 2/3] mesa: use geometric helper for computing min samples

2016-01-30 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-31 16:58, Ilia Mirkin wrote:

In case we have a draw buffer without attachments, we should be looking
at the default number of samples.

Signed-off-by: Ilia Mirkin 
---

Still doesn't work properly on nvc0, but at least the right number of
min samples gets passed along.

 src/mesa/program/program.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index 0e78e6a..27867c4 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -31,6 +31,7 @@

 #include "main/glheader.h"
 #include "main/context.h"
+#include "main/framebuffer.h"
 #include "main/hash.h"
 #include "main/macros.h"
 #include "program.h"
@@ -534,14 +535,14 @@ _mesa_get_min_invocations_per_fragment(struct
gl_context *ctx,
*  forces per-sample shading"
*/
   if (prog->IsSample && !ignore_sample_qualifier)
- return MAX2(ctx->DrawBuffer->Visual.samples, 1);
+ return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1);

   if (prog->Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
  SYSTEM_BIT_SAMPLE_POS))
- return MAX2(ctx->DrawBuffer->Visual.samples, 1);
+ return MAX2(_mesa_geometric_samples(ctx->DrawBuffer), 1);
   else if (ctx->Multisample.SampleShading)
  return MAX2(ceil(ctx->Multisample.MinSampleShadingValue *
-  ctx->DrawBuffer->Visual.samples), 1);
+  _mesa_geometric_samples(ctx->DrawBuffer)), 1);
   else
  return 1;
}
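The commit message's point (fall back to the default sample count when the draw buffer has no attachments) can be sketched as follows. The struct and both functions are hypothetical simplifications for illustration; they are not Mesa's struct gl_framebuffer or the actual _mesa_geometric_samples() implementation, and only the sample-shading branch of the patched function is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down framebuffer; not Mesa's struct gl_framebuffer. */
struct sketch_framebuffer {
   bool has_attachments;
   unsigned visual_samples;   /* sample count taken from the attachments */
   unsigned default_samples;  /* default used when nothing is attached */
};

/* Pick the sample count that actually applies to rasterization. */
unsigned
sketch_geometric_samples(const struct sketch_framebuffer *fb)
{
   return fb->has_attachments ? fb->visual_samples : fb->default_samples;
}

/* MAX2(ceil(min_sample_shading * samples), 1), written without libm. */
unsigned
sketch_min_invocations(const struct sketch_framebuffer *fb,
                       float min_sample_shading)
{
   float scaled = min_sample_shading * (float)sketch_geometric_samples(fb);
   unsigned n = (unsigned)scaled;

   if ((float)n < scaled)
      n++;                    /* round up, as ceil() would */
   return n > 1 ? n : 1;
}
```

With no attachments and a default of 4 samples, a minimum sample shading value of 0.5 gives 2 invocations per fragment, whereas reading the attachment sample count alone would give 1.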




Re: [Mesa-dev] [PATCH 1/2] winsys/amdgpu: Handle RADEON_FLAG_NO_CPU_ACCESS

2016-01-25 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

Good job working out where this issue was.

On 2016-01-26 18:40, Michel Dänzer wrote:

From: Michel Dänzer 

Failing to do this was resulting in the kernel driver unnecessarily
leaving open the possibility of CPU access to tiled BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93862

(This change shouldn't be backported to stable branches, because
released versions of xf86-video-amdgpu unnecessarily try to map the
front buffer)

Signed-off-by: Michel Dänzer 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 30a1aa8..1e997d9 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -292,6 +292,8 @@ static struct amdgpu_winsys_bo
*amdgpu_create_bo(struct amdgpu_winsys *ws,
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_VRAM;
   if (flags & RADEON_FLAG_CPU_ACCESS)
  request.flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+  if (flags & RADEON_FLAG_NO_CPU_ACCESS)
+ request.flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
}
if (initial_domain & RADEON_DOMAIN_GTT) {
   request.preferred_heap |= AMDGPU_GEM_DOMAIN_GTT;
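The shape of the fix is a plain flag translation: each winsys-level flag has to be forwarded to the kernel request explicitly, or the kernel never learns about it. The sketch below uses made-up bit values and names (SKETCH_*); the real RADEON_FLAG_* and AMDGPU_GEM_CREATE_* constants are defined in the winsys and kernel headers and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_FLAG_CPU_ACCESS             (1u << 0)
#define SKETCH_FLAG_NO_CPU_ACCESS          (1u << 1)

#define SKETCH_CREATE_CPU_ACCESS_REQUIRED  (1ull << 0)
#define SKETCH_CREATE_NO_CPU_ACCESS        (1ull << 1)

/* Translate winsys buffer flags into kernel allocation flags for VRAM. */
uint64_t
sketch_translate_vram_flags(unsigned flags)
{
   uint64_t request_flags = 0;

   if (flags & SKETCH_FLAG_CPU_ACCESS)
      request_flags |= SKETCH_CREATE_CPU_ACCESS_REQUIRED;
   /* The branch the patch adds: without it the hint is silently dropped. */
   if (flags & SKETCH_FLAG_NO_CPU_ACCESS)
      request_flags |= SKETCH_CREATE_NO_CPU_ACCESS;

   return request_flags;
}
```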




Re: [Mesa-dev] [PATCH v4 00/11] st/mesa: add shader buffer support

2016-01-24 Thread eocallaghan

This whole series is now,

Reviewed-by: Edward O'Callaghan 

On 2016-01-25 05:59, Ilia Mirkin wrote:

I believe I've addressed the various feedback people had. There's the
outstanding point of how to expose the atomic buffer bindings, but
this is a larger issue and largely tangential to the actual code
changed in this series.

No one has commented on my glsl_to_tgsi bits, which I sort of
expected. Unless I hear outcry to the contrary, I'm just going to push
those unreviewed once everything else is good -- nobody knows that
code particularly well, and I've run the dEQP tests, which leads me to
believe it's at least mostly good.

Ilia Mirkin (11):
  tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics
  glsl: always initialize image_* fields, copy them on interface init
  glsl: keep track of ssbo variable being accessed, add access params
  mesa: add PROGRAM_IMMEDIATE, PROGRAM_BUFFER
  st/mesa: add atomic counter support
  st/mesa: add support for SSBO binding and GLSL intrinsics
  st/mesa: use RESQ to find buffer size
  st/mesa: add support for memory barrier intrinsics
  st/mesa: add shader buffer barrier bit
  st/mesa: enable ARB_shader_storage_buffer_object when supported
  trace: add support for set_shader_buffers

 src/gallium/auxiliary/tgsi/tgsi_info.c|   2 +-
 src/gallium/docs/source/tgsi.rst  |  17 ++
 src/gallium/drivers/trace/tr_context.c|  40 +++
 src/gallium/drivers/trace/tr_dump_state.c |  18 ++
 src/gallium/drivers/trace/tr_dump_state.h |   2 +
 src/gallium/include/pipe/p_defines.h  |   1 +
 src/gallium/include/pipe/p_shader_tokens.h|   7 +-
 src/glsl/builtin_variables.cpp|   5 +
 src/glsl/lower_buffer_access.cpp  |   6 +-
 src/glsl/lower_buffer_access.h|   1 +
 src/glsl/lower_shared_reference.cpp   |   6 +-
 src/glsl/lower_ubo_reference.cpp  |  40 ++-
 src/glsl/nir/glsl_types.cpp   |   5 +
 src/glsl/nir/glsl_types.h |   3 +-
 src/glsl/nir/shader_enums.h   |  10 +
 src/mesa/Makefile.sources |   2 +
 src/mesa/main/mtypes.h|   2 +
 src/mesa/program/ir_to_mesa.cpp   |   4 +
 src/mesa/state_tracker/st_atom.c  |  10 +
 src/mesa/state_tracker/st_atom.h  |  10 +
 src/mesa/state_tracker/st_atom_atomicbuf.c| 158 +++
 src/mesa/state_tracker/st_atom_storagebuf.c   | 194 +
 src/mesa/state_tracker/st_cb_bufferobjects.c  |   4 +
 src/mesa/state_tracker/st_cb_texturebarrier.c |   4 +
 src/mesa/state_tracker/st_context.c   |   2 +
 src/mesa/state_tracker/st_context.h   |   2 +
 src/mesa/state_tracker/st_extensions.c|  30 ++
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 392 --
 28 files changed, 949 insertions(+), 28 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_atom_atomicbuf.c
 create mode 100644 src/mesa/state_tracker/st_atom_storagebuf.c




Re: [Mesa-dev] [PATCH] radeonsi: add DCC buffer for sampler views on new CS

2016-01-24 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-25 03:40, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

This fixes a VM fault and possible lockup in high memory pressure situations.

Cc: "11.0 11.1" 
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 33 +++
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
index aad836d..6c79673 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -138,6 +138,22 @@ static void si_release_sampler_views(struct
si_sampler_views *views)
si_release_descriptors(&views->desc);
 }

+static void si_sampler_view_add_buffers(struct si_context *sctx,
+   struct si_sampler_view *rview)
+{
+   if (rview->resource) {
+   radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
+   rview->resource, RADEON_USAGE_READ,
+   r600_get_sampler_view_priority(rview->resource));
+   }
+
+   if (rview->dcc_buffer && rview->dcc_buffer != rview->resource) {
+   radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
+   rview->dcc_buffer, RADEON_USAGE_READ,
+   RADEON_PRIO_DCC);
+   }
+}
+
 static void si_sampler_views_begin_new_cs(struct si_context *sctx,
  struct si_sampler_views *views)
 {
@@ -149,12 +165,7 @@ static void si_sampler_views_begin_new_cs(struct
si_context *sctx,
struct si_sampler_view *rview =
(struct si_sampler_view*)views->views[i];

-   if (!rview->resource)
-   continue;
-
-   radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
- rview->resource, RADEON_USAGE_READ,
-   r600_get_sampler_view_priority(rview->resource));
+   si_sampler_view_add_buffers(sctx, rview);
}

if (!views->desc.buffer)
@@ -176,15 +187,7 @@ static void si_set_sampler_view(struct si_context
*sctx, unsigned shader,
struct si_sampler_view *rview =
(struct si_sampler_view*)view;

-   if (rview->resource)
-   radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
-   rview->resource, RADEON_USAGE_READ,
-   r600_get_sampler_view_priority(rview->resource));
-
-   if (rview->dcc_buffer && rview->dcc_buffer != rview->resource)
-   radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
-   rview->dcc_buffer, RADEON_USAGE_READ,
-   RADEON_PRIO_DCC);
+   si_sampler_view_add_buffers(sctx, rview);

pipe_sampler_view_reference(&views->views[slot], view);
memcpy(views->desc.list + slot*8, view_desc, 8*4);




Re: [Mesa-dev] [PATCH 1/4] radeonsi: allow using all CUs for tessellation and on-chip GS (v2)

2016-01-22 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-23 01:18, Marek Olšák wrote:

From: Marek Olšák 

v2: After more discussion with hw teams, the kernel already contains the
optimal settings allowing us to use all CUs.
---
 src/gallium/drivers/radeonsi/si_state.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index a3ddee8..67b2835 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3701,9 +3701,9 @@ static void si_init_config(struct si_context 
*sctx)

si_pm4_set_reg(pm4, R_028408_VGT_INDX_OFFSET, 0);

if (sctx->b.chip_class >= CIK) {
-   si_pm4_set_reg(pm4, R_00B51C_SPI_SHADER_PGM_RSRC3_LS,
S_00B51C_CU_EN(0xfffc));
+   si_pm4_set_reg(pm4, R_00B51C_SPI_SHADER_PGM_RSRC3_LS,
S_00B51C_CU_EN(0x));
si_pm4_set_reg(pm4, R_00B41C_SPI_SHADER_PGM_RSRC3_HS, 0);
-   si_pm4_set_reg(pm4, R_00B31C_SPI_SHADER_PGM_RSRC3_ES,
S_00B31C_CU_EN(0xfffe));
+   si_pm4_set_reg(pm4, R_00B31C_SPI_SHADER_PGM_RSRC3_ES,
S_00B31C_CU_EN(0x));
si_pm4_set_reg(pm4, R_00B21C_SPI_SHADER_PGM_RSRC3_GS,
S_00B21C_CU_EN(0x));
si_pm4_set_reg(pm4, R_00B118_SPI_SHADER_PGM_RSRC3_VS,
S_00B118_CU_EN(0x));
 		si_pm4_set_reg(pm4, R_00B11C_SPI_SHADER_LATE_ALLOC_VS, S_00B11C_LIMIT(0));




Re: [Mesa-dev] [PATCH 0/7] radeonsi: geometry shader bug fix and cleanup

2016-01-22 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-23 10:59, Nicolai Hähnle wrote:

Hi,

this series was prompted by a rendering bug reported for Dolphin. The bug is
fixed in the first two patches, and the remainder is assorted cleanups that
I noticed while working on the fix. Please review.

Thanks,
Nicolai
--
 .../drivers/radeonsi/si_descriptors.c|  8 +-
 src/gallium/drivers/radeonsi/si_shader.c | 14 ++--
 src/gallium/drivers/radeonsi/si_shader.h |  1 -
 .../drivers/radeonsi/si_state_shaders.c  | 78 +++---
 4 files changed, 62 insertions(+), 39 deletions(-)



Re: [Mesa-dev] [PATCH v2 0/9] st/mesa: accelerate texture uploads from PBOs

2016-01-21 Thread eocallaghan

To the best of my understanding, this series is now:

Reviewed-by: Edward O'Callaghan 

On 2016-01-22 06:37, Nicolai Hähnle wrote:

Hi everybody,

here's an updated version of the series.

I decided to keep BUFFER_SAMPLER_VIEW_RGBA_ONLY as is, following Fredrik's
point that it affects not only the sampler swizzle but also the texture
format itself.

The major functionality changes are that we now try to fulfill larger
alignments by adjusting the buf_offset appropriately (this is not needed
for radeonsi, but I did some basic tests to make sure this works) and
we don't use a geometry shader if the driver can handle layer writes
in the VS.

Please review.

Thanks,
Nicolai
--
 src/gallium/docs/source/screen.rst   |   11 +
 .../drivers/freedreno/freedreno_screen.c |3 +
 src/gallium/drivers/i915/i915_screen.c   |2 +
 src/gallium/drivers/ilo/ilo_screen.c |3 +
 src/gallium/drivers/llvmpipe/lp_screen.c |2 +
 .../drivers/nouveau/nv30/nv30_screen.c   |2 +
 .../drivers/nouveau/nv50/nv50_screen.c   |2 +
 .../drivers/nouveau/nvc0/nvc0_screen.c   |2 +
 src/gallium/drivers/r300/r300_screen.c   |2 +
 src/gallium/drivers/r600/r600_pipe.c |4 +
 src/gallium/drivers/radeon/r600_texture.c|   26 +-
 src/gallium/drivers/radeonsi/si_pipe.c   |4 +
 src/gallium/drivers/softpipe/sp_screen.c |3 +
 src/gallium/drivers/svga/svga_screen.c   |2 +
 src/gallium/drivers/vc4/vc4_screen.c |2 +
 src/gallium/drivers/virgl/virgl_screen.c |3 +
 src/gallium/include/pipe/p_defines.h |2 +
 src/mesa/state_tracker/st_cb_texture.c   | 1178 +++-
 src/mesa/state_tracker/st_cb_texture.h   |5 +
 src/mesa/state_tracker/st_context.c  |2 +
 src/mesa/state_tracker/st_context.h  |   13 +
 21 files changed, 1254 insertions(+), 19 deletions(-)



Re: [Mesa-dev] [PATCH] radeonsi: Add option for SI scheduler

2016-01-21 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-22 04:35, Axel Davy wrote:

Add a debug option to select the LLVM SI Machine Scheduler.
R600_DEBUG=sisched

Signed-off-by: Axel Davy 
---
The corresponding llvm patch is on llvm master,
and should land soon for 3.8 branch
 src/gallium/drivers/radeon/r600_pipe_common.c | 1 +
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c| 6 +-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index e926f56..a9ce7b1 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -389,6 +389,7 @@ static const struct debug_named_value
common_debug_options[] = {
{ "nodcc", DBG_NO_DCC, "Disable DCC." },
{ "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." },
{ "norbplus", DBG_NO_RB_PLUS, "Disable RB+ on Stoney." },
+	{ "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction Scheduler." },

DEBUG_NAMED_VALUE_END /* must be last */
 };
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 27f6e98..3020421 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -87,6 +87,7 @@
 #define DBG_NO_DCC (1llu << 43)
 #define DBG_NO_DCC_CLEAR   (1llu << 44)
 #define DBG_NO_RB_PLUS (1llu << 45)
+#define DBG_SI_SCHED   (1llu << 46)

 #define R600_MAP_BUFFER_ALIGNMENT 64

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index f6ff4a8..51bcba7 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -215,7 +215,11 @@ static struct pipe_context
*si_create_context(struct pipe_screen *screen,
r600_target = radeon_llvm_get_r600_target(triple);
sctx->tm = LLVMCreateTargetMachine(r600_target, triple,
   
   r600_get_llvm_processor_name(sscreen->b.family),
-  "+DumpCode,+vgpr-spilling",
+#if HAVE_LLVM >= 0x0308
+  sscreen->b.debug_flags & DBG_SI_SCHED ?
+   "+DumpCode,+vgpr-spilling,+si-scheduler" :
+#endif
+   "+DumpCode,+vgpr-spilling",
   LLVMCodeGenLevelDefault,
   LLVMRelocDefault,
   LLVMCodeModelDefault);
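
For readers following along, here is a minimal sketch of how a
comma-separated debug string such as R600_DEBUG=sisched could be turned
into a flag word and then into the LLVM feature string chosen by the
patch. The table, the parse_debug_flags() and llvm_features() helpers
and the struct name are assumptions made for illustration; Mesa's real
parser and table (common_debug_options[]) are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit positions copied from the patch; everything else is hypothetical. */
#define DBG_NO_DCC   (1ull << 43)
#define DBG_SI_SCHED (1ull << 46)

struct named_flag {
    const char *name;
    uint64_t value;
};

/* Illustrative stand-in for a debug option table. */
static const struct named_flag flags[] = {
    { "nodcc",   DBG_NO_DCC },
    { "sisched", DBG_SI_SCHED },
};

/* Parse a comma-separated option string such as "nodcc,sisched". */
static uint64_t parse_debug_flags(const char *str)
{
    uint64_t result = 0;

    assert(str != NULL);
    while (*str) {
        size_t len = strcspn(str, ",");   /* length of the current token */

        for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
            if (strlen(flags[i].name) == len &&
                strncmp(flags[i].name, str, len) == 0)
                result |= flags[i].value;
        }
        str += len;
        if (*str == ',')
            str++;                        /* skip the separator */
    }
    return result;
}

/* Select the feature string the same way the patched call does. */
static const char *llvm_features(uint64_t debug_flags)
{
    return (debug_flags & DBG_SI_SCHED) ?
        "+DumpCode,+vgpr-spilling,+si-scheduler" :
        "+DumpCode,+vgpr-spilling";
}
```

Unknown tokens are silently ignored in this sketch, so a misspelled
option simply leaves the default feature string in place.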


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radeonsi: add max waves / CU to shader stats

2016-01-19 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-20 12:39, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 33 +---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index 0c5fd32..5c536f8 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3994,12 +3994,39 @@ static void si_shader_dump_stats(struct
si_screen *sscreen,
 struct pipe_debug_callback *debug,
 unsigned processor)
 {
+   /* Compute the maximum number of waves.
+* The pixel shader additionally allocates 1 - 48 blocks of LDS
+* depending on non-compile times parameters.
+*/
+   unsigned ps_lds_size = processor == TGSI_PROCESSOR_FRAGMENT ? 1 : 0;
+   unsigned lds_size = ps_lds_size + conf->lds_size;
+   unsigned max_waves = 10;
+
+   if (conf->num_sgprs) {
+   if (sscreen->b.chip_class >= VI)
+   max_waves = MIN2(max_waves, 800 / conf->num_sgprs);
+   else
+   max_waves = MIN2(max_waves, 512 / conf->num_sgprs);
+   }
+
+   if (conf->num_vgprs)
+   max_waves = MIN2(max_waves, 256 / conf->num_vgprs);
+
+   if (lds_size)
+   max_waves = MIN2(max_waves, 128 / lds_size);
+
if (r600_can_dump_shader(&sscreen->b, processor)) {
fprintf(stderr, "*** SHADER STATS ***\n"
-   "SGPRS: %d\nVGPRS: %d\nCode Size: %d bytes\nLDS: %d blocks\n"
-   "Scratch: %d bytes per wave\n\n",
+   "SGPRS: %d\n"
+   "VGPRS: %d\n"
+   "Code Size: %d bytes\n"
+   "LDS: %d blocks\n"
+   "Scratch: %d bytes per wave\n"
+   "Max waves / CU: %d\n"
+   "\n",
conf->num_sgprs, conf->num_vgprs, code_size,
-   conf->lds_size, conf->scratch_bytes_per_wave);
+   conf->lds_size, conf->scratch_bytes_per_wave,
+   max_waves);
}

pipe_debug_message(debug, SHADER_INFO,
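
The limit computation in this patch can be pulled out into a small
standalone function, which makes it easy to try a few register counts
by hand. This is only a sketch mirroring the arithmetic in the hunk
above; the function name and the is_vi parameter are assumptions, and
lds_blocks is expected to already include the extra pixel-shader block.

```c
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/*
 * Mirror of the patch's arithmetic: start from 10 waves and clamp by
 * each resource that is actually used. A count of zero means the
 * resource imposes no limit.
 */
static unsigned max_waves_per_cu(unsigned num_sgprs, unsigned num_vgprs,
                                 unsigned lds_blocks, int is_vi)
{
    unsigned max_waves = 10;

    if (num_sgprs)
        max_waves = MIN2(max_waves, (is_vi ? 800u : 512u) / num_sgprs);

    if (num_vgprs)
        max_waves = MIN2(max_waves, 256u / num_vgprs);

    if (lds_blocks)
        max_waves = MIN2(max_waves, 128u / lds_blocks);

    return max_waves;
}
```

For example, a shader using 48 SGPRs and 64 VGPRs on a pre-VI chip is
limited by its VGPRs: 256 / 64 gives 4 waves.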




Re: [Mesa-dev] [PATCH 06/10] st/mesa: use RESQ to find buffer size

2016-01-18 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-18 16:51, Ilia Mirkin wrote:

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0aaa175..602e689 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -569,6 +569,7 @@ static bool
 is_resource_instruction(unsigned opcode)
 {
switch (opcode) {
+   case TGSI_OPCODE_RESQ:
case TGSI_OPCODE_LOAD:
case TGSI_OPCODE_ATOMUADD:
case TGSI_OPCODE_ATOMXCHG:
@@ -,6 +2223,22 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
   emit_asm(ir, TGSI_OPCODE_UP2H, result_dst, op[0]);
   break;

+   case ir_unop_get_buffer_size: {
+  ir_constant *const_offset = ir->operands[0]->as_constant();
+  st_src_reg buffer(
+PROGRAM_BUFFER,
+ctx->Const.Program[shader->Stage].MaxAtomicBuffers +
+(const_offset ? const_offset->value.u[0] : 0),
+GLSL_TYPE_UINT);
+  if (!const_offset) {
+ buffer.reladdr = ralloc(mem_ctx, st_src_reg);
+ memcpy(buffer.reladdr, &sampler_reladdr, sizeof(sampler_reladdr));
+ emit_arl(ir, sampler_reladdr, op[0]);
+  }
+  emit_asm(ir, TGSI_OPCODE_RESQ, result_dst)->buffer = buffer;
+  break;
+   }
+
case ir_unop_pack_snorm_2x16:
case ir_unop_pack_unorm_2x16:
case ir_unop_pack_snorm_4x8:
@@ -2245,10 +2262,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
*/
   assert(!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()");
   break;
-
-   case ir_unop_get_buffer_size:
-  assert(!"Not implemented yet");
-  break;
}

this->result = result_src;
@@ -5133,6 +5146,7 @@ compile_tgsi_instruction(struct st_translate *t,
 src, num_src);
   return;

+   case TGSI_OPCODE_RESQ:
case TGSI_OPCODE_LOAD:
case TGSI_OPCODE_ATOMUADD:
case TGSI_OPCODE_ATOMXCHG:




Re: [Mesa-dev] [PATCH] tgsi: initialize Atomic field in tgsi_default_declaration

2016-01-17 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-17 19:46, Ilia Mirkin wrote:

Spotted by Coverity.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index ea20746..83f5062 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -110,6 +110,7 @@ tgsi_default_declaration( void )
declaration.Invariant = 0;
declaration.Local = 0;
declaration.Array = 0;
+   declaration.Atomic = 0;
declaration.Padding = 0;

return declaration;
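
As a minimal sketch of why Coverity flags this: a struct built field by
field on the stack and returned by value carries indeterminate bits for
any bitfield that was never assigned. The cut-down struct below is
hypothetical (the real struct tgsi_declaration has more fields); it only
shows the pattern of assigning every field, including the newly added one.

```c
/* Hypothetical, reduced declaration token for illustration only. */
struct decl_token {
    unsigned Local   : 1;
    unsigned Array   : 1;
    unsigned Atomic  : 1;   /* the field the patch starts initializing */
    unsigned Padding : 29;
};

/* Assign every bitfield so no indeterminate bits are returned. */
static struct decl_token default_decl(void)
{
    struct decl_token d;

    d.Local = 0;
    d.Array = 0;
    d.Atomic = 0;
    d.Padding = 0;

    return d;
}
```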




Re: [Mesa-dev] [PATCH 1/2] radeonsi: Print "LLVM emitted unknown config register" warning only once

2016-01-15 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-15 14:23, Michel Dänzer wrote:

From: Michel Dänzer 

Say "LLVM" instead of "Compiler" for clarity.

Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeonsi/si_shader.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index cc9718e..3ab054c 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3735,8 +3735,15 @@ void si_shader_binary_read_config(struct
radeon_shader_binary *binary,
G_00B860_WAVESIZE(value) * 256 * 4 * 1;
break;
default:
-   fprintf(stderr, "Warning: Compiler emitted unknown "
-   "config register: 0x%x\n", reg);
+   {
+   static bool printed;
+
+   if (!printed) {
+   fprintf(stderr, "Warning: LLVM emitted unknown "
+   "config register: 0x%x\n", reg);
+   printed = true;
+   }
+   }
break;
}
}
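
The print-once pattern used here can be sketched on its own as below.
The helper name and its FILE * parameter are assumptions for the sake
of a testable example; the patch itself keeps the static flag inline in
the switch's default case.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Print the warning only on the first call. Returns true if this call
 * printed it. The static flag persists across calls.
 */
static bool warn_unknown_reg_once(FILE *out, unsigned reg)
{
    static bool printed;

    if (printed)
        return false;

    fprintf(out, "Warning: LLVM emitted unknown "
            "config register: 0x%x\n", reg);
    printed = true;
    return true;
}
```

Two consequences of this design are worth keeping in mind: only the
first unknown register value is ever reported, and a plain static flag
is not synchronized, so concurrent callers may occasionally print twice.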




Re: [Mesa-dev] [PATCH 1/7] glsl: enable offset layout qualifier for ARB_enhanced_layouts

2016-01-11 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-11 14:13, Timothy Arceri wrote:

---
 src/glsl/glsl_parser.yy | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 6b634f2..b2b94f4 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1505,7 +1505,8 @@ layout_qualifier_id:
  $$.binding = $3;
   }

-  if (state->has_atomic_counters() &&
+  if ((state->has_atomic_counters() ||
+   state->has_enhanced_layouts()) &&
   match_layout_qualifier("offset", $1, state) == 0) {
  $$.flags.q.explicit_offset = 1;
  $$.offset = $3;




Re: [Mesa-dev] [PATCH 1/3] st/mesa: remove dead code from mesa_to_tgsi

2016-01-07 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-08 12:12, Marek Olšák wrote:

From: Marek Olšák 

These aren't part of ARB_fragment_program.
---
 src/mesa/state_tracker/st_mesa_to_tgsi.c | 51
 1 file changed, 51 deletions(-)

diff --git a/src/mesa/state_tracker/st_mesa_to_tgsi.c
b/src/mesa/state_tracker/st_mesa_to_tgsi.c
index 4b9dc99..d8f7b6c 100644
--- a/src/mesa/state_tracker/st_mesa_to_tgsi.c
+++ b/src/mesa/state_tracker/st_mesa_to_tgsi.c
@@ -475,24 +475,6 @@ static void emit_swz( struct st_translate *t,
 }


-/**
- * Negate the value of DDY to match GL semantics where (0,0) is the
- * lower-left corner of the window.
- * Note that the GL_ARB_fragment_coord_conventions extension will
- * effect this someday.
- */
-static void emit_ddy( struct st_translate *t,
-  struct ureg_dst dst,
-  const struct prog_src_register *SrcReg )
-{
-   struct ureg_program *ureg = t->ureg;
-   struct ureg_src src = translate_src( t, SrcReg );
-   src = ureg_negate( src );
-   ureg_DDY( ureg, dst, src );
-}
-
-
-
 static unsigned
 translate_opcode( unsigned op )
 {
@@ -714,10 +696,6 @@ compile_instruction(
*/
   ureg_MOV( ureg, dst[0], ureg_imm1f(ureg, 0.5) );
   break;
-   
-   case OPCODE_DDY:
-  emit_ddy( t, dst[0], &inst->SrcReg[0] );
-  break;

case OPCODE_RSQ:
   ureg_RSQ( ureg, dst[0], ureg_abs(src[0]) );
@@ -926,31 +904,6 @@ emit_wpos(struct st_context *st,


 /**
- * OpenGL's fragment gl_FrontFace input is 1 for front-facing, 0 for back.
- * TGSI uses +1 for front, -1 for back.
- * This function converts the TGSI value to the GL value.  Simply clamping/
- * saturating the value to [0,1] does the job.
- */
-static void
-emit_face_var( struct st_translate *t,
-   const struct gl_program *program )
-{
-   struct ureg_program *ureg = t->ureg;
-   struct ureg_dst face_temp = ureg_DECL_temporary( ureg );
-   struct ureg_src face_input = t->inputs[t->inputMapping[VARYING_SLOT_FACE]];
-
-   /* MOV_SAT face_temp, input[face]
-*/
-   face_temp = ureg_saturate( face_temp );
-   ureg_MOV( ureg, face_temp, face_input );
-
-   /* Use face_temp as face input from here on:
-*/
-   t->inputs[t->inputMapping[VARYING_SLOT_FACE]] = ureg_src(face_temp);
-}
-
-
-/**
  * Translate Mesa program to TGSI format.
  * \param program  the program to translate
  * \param numInputs  number of input registers used
@@ -1020,10 +973,6 @@ st_translate_mesa_program(
  emit_wpos(st_context(ctx), t, program, ureg);
   }

-  if (program->InputsRead & VARYING_BIT_FACE) {
- emit_face_var( t, program );
-  }
-
   /*
* Declare output attributes.
*/




Re: [Mesa-dev] [PATCH 1/2] radeonsi: simplify gl_FragCoord behavior

2016-01-07 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-08 12:30, Marek Olšák wrote:

From: Marek Olšák 

It will become a system value, not an input.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 45 -
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 64adf69..460dda5 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -399,30 +399,29 @@ static void si_shader_ps(struct si_shader *shader)
if (!pm4)
return;

-   for (i = 0; i < info->num_inputs; i++) {
-   switch (info->input_semantic_name[i]) {
-   case TGSI_SEMANTIC_POSITION:
-   /* SPI_BARYC_CNTL.POS_FLOAT_LOCATION
-* Possible vaules:
-* 0 -> Position = pixel center (default)
-* 1 -> Position = pixel centroid
-* 2 -> Position = at sample position
-*/
-   switch (info->input_interpolate_loc[i]) {
-   case TGSI_INTERPOLATE_LOC_CENTROID:
-   spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(1);
-   break;
-   case TGSI_INTERPOLATE_LOC_SAMPLE:
-   spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);
-   break;
-   }
+   /* SPI_BARYC_CNTL.POS_FLOAT_LOCATION
+* Possible vaules:
+* 0 -> Position = pixel center
+* 1 -> Position = pixel centroid
+* 2 -> Position = at sample position
+*
+* From GLSL 4.5 specification, section 7.1:
+	 *   "The variable gl_FragCoord is available as an input variable from
+	 *within fragment shaders and it holds the window relative coordinates
+*(x, y, z, 1/w) values for the fragment. If multi-sampling, this
+*value can be for any location within the pixel, or one of the
+*fragment samples. The use of centroid does not further restrict
+*this value to be inside the current primitive."
+*
+	 * Meaning that centroid has no effect and we can return anything within
+	 * the pixel. Thus, return the value at sample position, because that's
+* the most accurate one shaders can get.
+*/
+   spi_baryc_cntl |= S_0286E0_POS_FLOAT_LOCATION(2);

-   if (info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER] ==
-   TGSI_FS_COORD_PIXEL_CENTER_INTEGER)
-   spi_baryc_cntl |= S_0286E0_POS_FLOAT_ULC(1);
-   break;
-   }
-   }
+   if (info->properties[TGSI_PROPERTY_FS_COORD_PIXEL_CENTER] ==
+   TGSI_FS_COORD_PIXEL_CENTER_INTEGER)
+   spi_baryc_cntl |= S_0286E0_POS_FLOAT_ULC(1);

 	/* Find out what SPI_SHADER_COL_FORMAT and CB_SHADER_MASK should be. */
colors_written = info->colors_written;




Re: [Mesa-dev] [PATCH 0/8] gl_FragCoord and gl_FrontFacing as system values

2016-01-07 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-08 12:29, Marek Olšák wrote:

Hi,

This series adds the possibility for drivers to get gl_FragCoord and
gl_FrontFacing as system values. When FACE is a system value, it also
changes its type to integer from floating-point.

Each variable has its own Const flag / Gallium CAP, so drivers can
choose whether they want this for each variable.

This simplifies input handling in the radeonsi driver. With this, TGSI
INPUT[i] becomes fragment shader input[i] in the hardware, so the
driver doesn't have to do any translation of locations.

Please review.

Marek




Re: [Mesa-dev] [PATCH 1/2] glsl: combine if blocks

2016-01-07 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-08 15:25, Timothy Arceri wrote:

---
 src/glsl/link_uniforms.cpp | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 47bb771..33b2d4c 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -532,6 +532,8 @@ public:
   */
  if (var->is_interface_instance()) {
 ubo_byte_offset = 0;
+process(var->get_interface_type(),
+var->get_interface_type()->name);
  } else {
 const struct gl_uniform_block *const block =
&prog->BufferInterfaceBlocks[ubo_block_index];
@@ -542,13 +544,8 @@ public:
&block->Uniforms[var->data.location];

 ubo_byte_offset = ubo_var->Offset;
- }
-
- if (var->is_interface_instance())
-process(var->get_interface_type(),
-var->get_interface_type()->name);
- else
 process(var);
+ }
   } else {
  /* Store any explicit location and reset data location so we can
   * reuse this variable for storing the uniform slot number.




Re: [Mesa-dev] [PATCH 1/3] tgsi/scan: set if a fragment shader writes sample mask

2016-01-05 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-06 12:46, Marek Olšák wrote:

From: Marek Olšák 

This will be used by radeonsi.
---
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 2 ++
 src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index e04f407..e3feed9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -392,6 +392,8 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
  }
  else if (semName == TGSI_SEMANTIC_STENCIL) {
 info->writes_stencil = TRUE;
+ } else if (semName == TGSI_SEMANTIC_SAMPLEMASK) {
+info->writes_samplemask = TRUE;
  }
   }

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h
b/src/gallium/auxiliary/tgsi/tgsi_scan.h
index 7e9a559..a3e4378 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
@@ -82,6 +82,7 @@ struct tgsi_shader_info
boolean reads_z; /**< does fragment shader read depth? */
boolean writes_z;  /**< does fragment shader write Z value? */
boolean writes_stencil; /**< does fragment shader write stencil value? */
+   boolean writes_samplemask; /**< does fragment shader write sample mask? */
boolean writes_edgeflag; /**< vertex shader outputs edgeflag */
boolean uses_kill;  /**< KILL or KILL_IF instruction used? */
boolean uses_persp_center;




Re: [Mesa-dev] [PATCH 1/6] gallium: Remove unnecessary semicolons

2016-01-05 Thread eocallaghan

On 2016-01-06 10:30, Brian Paul wrote:

Series looks OK to me.  Reviewed-by: Brian Paul 

Do you need someone to commit/push for you?


I do yes, thank you kindly.

Edward.



-Brian

On 01/05/2016 03:07 AM, Edward O'Callaghan wrote:

Fix silly issue with MSVC case fall-through support needing
an extra 'break;'

Found-by: Coccinelle
Signed-off-by: Edward O'Callaghan 
Reviewed-by: Brian Paul 
---
  src/gallium/auxiliary/draw/draw_pipe_aaline.c  | 2 +-
  src/gallium/auxiliary/gallivm/lp_bld_swizzle.c | 2 +-
  src/gallium/auxiliary/nir/tgsi_to_nir.c| 2 +-
  src/gallium/auxiliary/util/u_surface.c | 3 ++-
  src/gallium/auxiliary/vl/vl_mpeg12_decoder.c   | 2 +-
  src/gallium/state_trackers/nine/swapchain9.c   | 2 +-
  src/gallium/state_trackers/omx/entrypoint.c| 2 +-
  src/gallium/state_trackers/vdpau/mixer.c   | 2 +-
  8 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_aaline.c 
b/src/gallium/auxiliary/draw/draw_pipe_aaline.c

index 3ce550a..e85ae16 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_aaline.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_aaline.c
@@ -938,7 +938,7 @@ draw_aaline_prepare_outputs(struct draw_context *draw,
 const struct pipe_rasterizer_state *rast = draw->rasterizer;

 /* update vertex attrib info */
-   aaline->pos_slot = draw_current_shader_position_output(draw);;
+   aaline->pos_slot = draw_current_shader_position_output(draw);

 if (!rast->line_smooth)
return;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c 
b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c

index b1aef71..f571838 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_swizzle.c
@@ -720,7 +720,7 @@ lp_build_transpose_aos_n(struct gallivm_state *gallivm,

default:
   assert(0);
-   };
+   }
  }


diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c

index 94d992b..7c57759 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -1950,7 +1950,7 @@ tgsi_processor_to_shader_stage(unsigned processor)
 case TGSI_PROCESSOR_COMPUTE:   return MESA_SHADER_COMPUTE;
 default:
unreachable("invalid TGSI processor");
-   };
+   }
  }

  struct nir_shader *
diff --git a/src/gallium/auxiliary/util/u_surface.c 
b/src/gallium/auxiliary/util/u_surface.c

index 6aa44f9..c150d92 100644
--- a/src/gallium/auxiliary/util/u_surface.c
+++ b/src/gallium/auxiliary/util/u_surface.c
@@ -600,7 +600,8 @@ is_box_inside_resource(const struct pipe_resource *res,
depth = res->array_size;
assert(res->array_size % 6 == 0);
break;
-   case PIPE_MAX_TEXTURE_TYPES:;
+   case PIPE_MAX_TEXTURE_TYPES:
+  break;
 }

 return box->x >= 0 &&
diff --git a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c 
b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c

index f5bb3a0..b5c7045 100644
--- a/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
+++ b/src/gallium/auxiliary/vl/vl_mpeg12_decoder.c
@@ -792,7 +792,7 @@ vl_mpeg12_end_frame(struct pipe_video_codec *decoder,
for (j = 0; j < VL_MAX_REF_FRAMES; ++j) {
   if (!ref_frames[j] || !ref_frames[j][i]) continue;

- vb[2] = vl_vb_get_mv(&buf->vertex_stream, j);;
+ vb[2] = vl_vb_get_mv(&buf->vertex_stream, j);
   dec->context->set_vertex_buffers(dec->context, 0, 3, vb);

   vl_mc_render_ref(i ? &dec->mc_c : &dec->mc_y, &buf->mc[i], ref_frames[j][i]);
diff --git a/src/gallium/state_trackers/nine/swapchain9.c 
b/src/gallium/state_trackers/nine/swapchain9.c

index 3f5be26..3b1a7a4 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -790,7 +790,7 @@ NineSwapChain9_Present( struct NineSwapChain9 *This,
  case D3DSWAPEFFECT_FLIP:
  UNTESTED(4);
  case D3DSWAPEFFECT_DISCARD:
-/* rotate the queue */;
+/* rotate the queue */
  pipe_resource_reference(&res, This->buffers[0]->base.resource);
  for (i = 1; i <= This->params.BackBufferCount; i++) {
  NineSurface9_SetResourceResize(This->buffers[i - 1],
diff --git a/src/gallium/state_trackers/omx/entrypoint.c 
b/src/gallium/state_trackers/omx/entrypoint.c

index da9ca10..afcbd97 100644
--- a/src/gallium/state_trackers/omx/entrypoint.c
+++ b/src/gallium/state_trackers/omx/entrypoint.c
@@ -137,7 +137,7 @@ OMX_ERRORTYPE omx_workaround_Destructor(OMX_COMPONENTTYPE *comp)
 priv->state = OMX_StateInvalid;
 tsem_up(priv->messageSem);

-   /* wait for thread to exit */;
+   /* wait for thread to exit */
 pthread_join(priv->messageHandlerThread, NULL);

 return omx_base_component_Destructor(comp);
diff --git a/src/gallium/state_trackers/vdpau/mixer.c 
b/src/gallium/state_trackers/vdpau/mixer.c

index c0b1ecc..dec79ff 100644
--- a/src/gallium/state_trackers/vdpau/mixer.c
+++ b/sr

Re: [Mesa-dev] [PATCH] nir: few missing struct names

2016-01-04 Thread eocallaghan

Reviewed-by: Edward O'Callaghan 

On 2016-01-05 05:27, Rob Clark wrote:

From: Rob Clark 

nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs
'typedef struct nir_foo {} nir_foo'.  But a missing struct name tag is
inconvenient when you need a forward declaration without pulling in all
of nir.

So add missing struct name tag for nir_variable, and a couple other
spots where it would likely be useful.

Signed-off-by: Rob Clark 
---
 src/glsl/nir/nir.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 4286738..bedcc0d 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -139,7 +139,7 @@ typedef enum {
  * ir_variable - it should be easy to translate between the two.
  */

-typedef struct {
+typedef struct nir_variable {
struct exec_node node;

/**
@@ -349,7 +349,7 @@ typedef struct {
 #define nir_foreach_variable(var, var_list) \
foreach_list_typed(nir_variable, var, node, var_list)

-typedef struct {
+typedef struct nir_register {
struct exec_node node;

unsigned num_components; /** < number of vector components */
@@ -443,7 +443,7 @@ nir_instr_is_last(nir_instr *instr)
   return exec_node_is_tail_sentinel(exec_node_get_next(&instr->node));
 }

-typedef struct {
+typedef struct nir_ssa_def {
/** for debugging only, can be NULL */
const char* name;
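
To illustrate the motivation given in the commit message, here is a
small sketch of what a struct tag buys you: a consumer can write
'struct nir_variable;' and pass pointers around without including
nir.h, which is not possible with an anonymous 'typedef struct {}'.
The single field and the count_components() function are hypothetical
and stand in for the real, much larger definition.

```c
/* --- consumer side: forward declaration only, no nir.h needed --- */
struct nir_variable;
unsigned count_components(const struct nir_variable *var);

/* --- what the header provides once the tag exists ---
 * The field below is made up for illustration; it is not the real
 * layout of nir_variable. */
typedef struct nir_variable {
    unsigned num_components;
} nir_variable;

/* --- implementation side: sees the full definition --- */
unsigned count_components(const struct nir_variable *var)
{
    return var->num_components;
}
```

With an untagged typedef the first two lines would have no way to name
the type, so every user would need the full header.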




Re: [Mesa-dev] [PATCH 0/8] gallium: add shader buffer support

2016-01-02 Thread eocallaghan

In this series patches 2-8 are:

Reviewed-by: Edward O'Callaghan 

with some commentary on patch 1.

Kind Regards,

On 2016-01-03 15:37, Ilia Mirkin wrote:

This provides enough support in TGSI to support shader buffers. I do
away with the defunct TGSI_FILE_RESOURCE (renaming it into
TGSI_FILE_IMAGE to work with pipe_image_view), and add a brand new
TGSI_FILE_BUFFER. At the declaration level, this can have an ATOMIC
qualifier (and later a SHARED qualifier for compute shaders).

I also add memory qualifiers to LOAD/STORE opcodes, which can convey
the coherent/volatile/restrict flags as specified in the GLSL. I also
modified all of the formerly resource opcodes to work on both buffers
and images. For images they will derive the format from the IMAGE
declaration, while buffers are format-less by definition.

This is still missing a way to implement memory barriers, that will
come soon, and is not going to affect anything else I do in this
series.

For the full series I'm working on, you can look at

https://github.com/imirkin/mesa/commits/atomic3

which exposes ARB_shader_atomic_counters and
ARB_shader_storage_buffer_objects on nvc0+ (but it won't work on
maxwell -- need to add emission of atomic ops and cache control).

However this is a nice self-contained chunk to start with.

Ilia Mirkin (8):
  tgsi: add ureg support for image decls
  ureg: add buffer support to ureg
  tgsi: provide a way to encode memory qualifiers for SSBO
  tgsi: add a is_store property
  tgsi: update atomic op docs
  gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS
  gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT
  gallium: add a RESQ opcode to query info about a resource

 src/gallium/auxiliary/gallivm/lp_bld_limits.h  |   1 +
 src/gallium/auxiliary/tgsi/tgsi_build.c| 112 --
 src/gallium/auxiliary/tgsi/tgsi_dump.c |  25 +-
 src/gallium/auxiliary/tgsi/tgsi_exec.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_info.c | 446 ++---
 src/gallium/auxiliary/tgsi/tgsi_info.h |   1 +
 src/gallium/auxiliary/tgsi/tgsi_parse.c|   8 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|   3 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  12 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.h  |   2 +
 src/gallium/auxiliary/tgsi/tgsi_text.c |  42 +-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 182 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  23 ++
 src/gallium/docs/source/screen.rst |   8 +
 src/gallium/docs/source/tgsi.rst   | 105 ++---
 src/gallium/drivers/freedreno/freedreno_screen.c   |   3 +
 src/gallium/drivers/i915/i915_screen.c |   1 +
 src/gallium/drivers/ilo/ilo_screen.c   |   1 +
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |   8 +-
 src/gallium/drivers/llvmpipe/lp_screen.c   |   1 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  12 +-
 src/gallium/drivers/nouveau/nv30/nv30_screen.c |   3 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   2 +
 src/gallium/drivers/r300/r300_screen.c |   3 +
 src/gallium/drivers/r600/r600_pipe.c   |   2 +
 src/gallium/drivers/radeonsi/si_pipe.c |   3 +
 src/gallium/drivers/softpipe/sp_screen.c   |   1 +
 src/gallium/drivers/svga/svga_screen.c |   4 +
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|   2 +
 src/gallium/drivers/vc4/vc4_screen.c   |   3 +
 src/gallium/drivers/virgl/virgl_screen.c   |   1 +
 src/gallium/include/pipe/p_defines.h   |   2 +
 src/gallium/include/pipe/p_shader_tokens.h |  28 +-
 34 files changed, 729 insertions(+), 324 deletions(-)




Re: [Mesa-dev] [PATCH 1/8] tgsi: add ureg support for image decls

2016-01-02 Thread eocallaghan
There is quite a bit of rename churn happening here at the same time as
the bring-up of ureg support for image declarations.
Would it be possible to split the rename churn out from the actual
behavioral changes, please?


On 2016-01-03 15:37, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_build.c| 62 +
 src/gallium/auxiliary/tgsi/tgsi_dump.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_parse.c|  4 +-
 src/gallium/auxiliary/tgsi/tgsi_parse.h|  2 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  4 +-
 src/gallium/auxiliary/tgsi/tgsi_text.c | 10 +--
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 77 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  7 ++
 src/gallium/drivers/ilo/shader/toy_tgsi.c  |  8 +--
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 12 +++-
 src/gallium/drivers/svga/svga_tgsi_vgpu10.c|  2 +
 src/gallium/include/pipe/p_shader_tokens.h |  7 +-
 12 files changed, 153 insertions(+), 52 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index fdb7feb..bb9d0cb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -259,36 +259,39 @@ tgsi_build_declaration_semantic(
return ds;
 }

-static struct tgsi_declaration_resource
-tgsi_default_declaration_resource(void)
+static struct tgsi_declaration_image
+tgsi_default_declaration_image(void)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;

-   dr.Resource = TGSI_TEXTURE_BUFFER;
-   dr.Raw = 0;
-   dr.Writable = 0;
-   dr.Padding = 0;
+   di.Resource = TGSI_TEXTURE_BUFFER;
+   di.Raw = 0;
+   di.Writable = 0;
+   di.Format = 0;
+   di.Padding = 0;

-   return dr;
+   return di;
 }

-static struct tgsi_declaration_resource
-tgsi_build_declaration_resource(unsigned texture,
-unsigned raw,
-unsigned writable,
-struct tgsi_declaration *declaration,
-struct tgsi_header *header)
+static struct tgsi_declaration_image
+tgsi_build_declaration_image(unsigned texture,
+ unsigned format,
+ unsigned raw,
+ unsigned writable,
+ struct tgsi_declaration *declaration,
+ struct tgsi_header *header)
 {
-   struct tgsi_declaration_resource dr;
+   struct tgsi_declaration_image di;

-   dr = tgsi_default_declaration_resource();
-   dr.Resource = texture;
-   dr.Raw = raw;
-   dr.Writable = writable;
+   di = tgsi_default_declaration_image();
+   di.Resource = texture;
+   di.Format = format;
+   di.Raw = raw;
+   di.Writable = writable;

declaration_grow(declaration, header);

-   return dr;
+   return di;
 }

 static struct tgsi_declaration_sampler_view
@@ -364,7 +367,7 @@ tgsi_default_full_declaration( void )
full_declaration.Range = tgsi_default_declaration_range();
full_declaration.Semantic = tgsi_default_declaration_semantic();
full_declaration.Interp = tgsi_default_declaration_interp();
-   full_declaration.Resource = tgsi_default_declaration_resource();
+   full_declaration.Image = tgsi_default_declaration_image();
full_declaration.SamplerView = tgsi_default_declaration_sampler_view();
full_declaration.Array = tgsi_default_declaration_array();

@@ -454,20 +457,21 @@ tgsi_build_full_declaration(
  header );
}

-   if (full_decl->Declaration.File == TGSI_FILE_RESOURCE) {
-  struct tgsi_declaration_resource *dr;
+   if (full_decl->Declaration.File == TGSI_FILE_IMAGE) {
+  struct tgsi_declaration_image *di;

   if (maxsize <= size) {
  return  0;
   }
-  dr = (struct tgsi_declaration_resource *)&tokens[size];
+  di = (struct tgsi_declaration_image *)&tokens[size];
   size++;

-  *dr = tgsi_build_declaration_resource(full_decl->Resource.Resource,
-full_decl->Resource.Raw,
-full_decl->Resource.Writable,
-declaration,
-header);
+  *di = tgsi_build_declaration_image(full_decl->Image.Resource,
+ full_decl->Image.Format,
+ full_decl->Image.Raw,
+ full_decl->Image.Writable,
+ declaration,
+ header);
}

if (full_decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index e29ffb3..dad3839 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
++

Re: [Mesa-dev] [PATCH 0/5] Add ARB_indirect_parameters support

2016-01-02 Thread eocallaghan

In this series patches 1-4 are:

Reviewed-by: Edward O'Callaghan 

No idea what is happening in patch 5 to say anything either way.

On 2016-01-03 07:38, Ilia Mirkin wrote:

The nvc0 patch applies on top of some unpublished patches, see

https://github.com/imirkin/mesa/commits/tmp4

for the full thing. The whole series applies on top of the
ARB_multi_draw_indirect patches I sent earlier (with potential minor
modifications). There is some type confusion between the
ARB_indirect_parameters spec and the Khronos gl.xml/glcorearb.h files,
I went with the latter's definitions.

This passes the relatively simple piglit test I sent.

Ilia Mirkin (5):
  glapi: add ARB_indirect_parameters definitions
  mesa: add parameter buffer, used for ARB_indirect_parameters
  mesa: add support for ARB_indirect_parameters draw functions
  st/mesa: expose ARB_indirect_parameters when the backend driver allows
  nvc0: add ARB_indirect_parameters support

 docs/relnotes/11.2.0.html  |   1 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme   | 157 +
 src/gallium/drivers/nouveau/nvc0/mme/com9097.mme.h | 125
 src/gallium/drivers/nouveau/nvc0/nvc0_macros.h |   4 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |   4 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c|  29 +++-
 src/mapi/glapi/gen/ARB_indirect_parameters.xml |  30 
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   6 +-
 src/mesa/main/api_validate.c   | 115 
+++

 src/mesa/main/api_validate.h   |  16 +++
 src/mesa/main/bufferobj.c  |  15 ++
 src/mesa/main/extensions_table.h   |   1 +
 src/mesa/main/get.c|   5 +
 src/mesa/main/get_hash_params.py   |   4 +
 src/mesa/main/mtypes.h |   2 +
 src/mesa/main/tests/dispatch_sanity.cpp|   4 +
 src/mesa/state_tracker/st_cb_bufferobjects.c   |   1 +
 src/mesa/state_tracker/st_extensions.c |   1 +
 src/mesa/vbo/vbo_exec_array.c  | 124 


 20 files changed, 638 insertions(+), 7 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_indirect_parameters.xml


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallium: document PK2H/UP2H

2016-01-02 Thread eocallaghan

This series is:

Reviewed-by: Edward O'Callaghan 

On 2016-01-03 11:37, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
 src/gallium/docs/source/tgsi.rst | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst 
b/src/gallium/docs/source/tgsi.rst

index 955ece8..f69998f 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -458,7 +458,9 @@ while DDY is allowed to be the same for the entire 
2x2 quad.


 .. opcode:: PK2H - Pack Two 16-bit Floats

-  TBD
+.. math::
+
+  dst.x = f32\_to\_f16(src.x) | f32\_to\_f16(src.y) << 16


 .. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
@@ -615,7 +617,11 @@ This instruction replicates its result.

 .. opcode:: UP2H - Unpack Two 16-Bit Floats

-  TBD
+.. math::
+
+  dst.x = f16\_to\_f32(src0.x \& 0xffff)
+
+  dst.y = f16\_to\_f32(src0.x >> 16)

 .. note::


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
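
As a rough illustration of the bit layout that the PK2H/UP2H documentation above describes, here is a minimal C sketch. It only covers the packing and unpacking of two 16-bit values into one 32-bit word; the float-to-half conversions (f32_to_f16 / f16_to_f32 in the patch) are assumed to happen elsewhere. The helper names pack_2x16 and unpack_2x16 are made up for this example and are not Mesa functions.

```c
#include <stdint.h>

/* Pack two 16-bit values into one 32-bit word: lo in bits 0..15,
 * hi in bits 16..31.  This mirrors "x | y << 16" from the PK2H text
 * (in C, << binds tighter than |). */
static uint32_t pack_2x16(uint16_t lo, uint16_t hi)
{
    return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Inverse operation, mirroring UP2H: the low half comes from masking
 * with 0xffff, the high half from shifting right by 16. */
static void unpack_2x16(uint32_t packed, uint16_t *lo, uint16_t *hi)
{
    *lo = (uint16_t)(packed & 0xffff);
    *hi = (uint16_t)(packed >> 16);
}
```

With half-float bit patterns as inputs (for example 0x3c00 for 1.0 and 0x4000 for 2.0), pack_2x16 yields the value PK2H would write to dst.x, and unpack_2x16 recovers the two halves that UP2H would then convert back to 32-bit floats.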


Re: [Mesa-dev] [PATCH v3] gallium/tests: fix build with clang compiler

2016-01-02 Thread eocallaghan

omg, I don't know why folks insist on using GNU C nested functions; they
are insane.

Thanks for working through this one!

Reviewed-by: Edward O'Callaghan 

On 2016-01-03 04:20, Samuel Pitoiset wrote:

Nested functions are supported as an extension in GNU C, but Clang
doesn't support them.

This fixes compilation errors when (manually) building compute.c,
or by setting --enable-gallium-tests to the configure script.
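
The kind of rewrite described here can be sketched in portable C as follows: a callback that would have been written as a GNU C nested function becomes a file-scope static function and is passed by pointer. This is only an illustration under assumed names; fill_buffer and fill_deadbeef are hypothetical and do not appear in compute.c, although the callback signature follows the one used in the patch.

```c
#include <stdint.h>

/* Callback type with the same shape as the init/expect helpers in the
 * patch: pointer to the element, element size, and x/y coordinates. */
typedef void (*init_fn)(void *p, int s, int x, int y);

/* File-scope replacement for what would otherwise be a nested function.
 * It needs no access to the caller's locals, so hoisting it is safe. */
static void fill_deadbeef(void *p, int s, int x, int y)
{
    (void)s; (void)x; (void)y;
    *(uint32_t *)p = 0xdeadbeef;
}

/* Hypothetical consumer: calls the callback once per element. */
static void fill_buffer(uint32_t *buf, int width, int height, init_fn init)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            init(&buf[y * width + x], (int)sizeof(uint32_t), x, y);
}
```

Because the hoisted function is plain ISO C, it builds with both GCC and Clang; the cost is that any state a nested function would have captured from its enclosing scope has to be passed explicitly instead.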

Changes from v3:
 - refactor by introducing test_default_init()

Changes from v2:
 - fix typo

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/tests/trivial/compute.c | 603 


 1 file changed, 330 insertions(+), 273 deletions(-)

diff --git a/src/gallium/tests/trivial/compute.c
b/src/gallium/tests/trivial/compute.c
index bcdfb11..5ce12ab 100644
--- a/src/gallium/tests/trivial/compute.c
+++ b/src/gallium/tests/trivial/compute.c
@@ -428,6 +428,35 @@ static void launch_grid(struct context *ctx,
const uint *block_layout,
 pipe->launch_grid(pipe, block_layout, grid_layout, pc, input);
 }

+static void test_default_init(void *p, int s, int x, int y)
+{
+*(uint32_t *)p = 0xdeadbeef;
+}
+
+/* test_system_values */
+static void test_system_values_expect(void *p, int s, int x, int y)
+{
+int id = x / 16, sv = (x % 16) / 4, c = x % 4;
+int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
+int bsz[] = { 4, 3, 5, 1};
+int gsz[] = { 5, 4, 1, 1};
+
+switch (sv) {
+case 0:
+*(uint32_t *)p = tid[c] / bsz[c];
+break;
+case 1:
+*(uint32_t *)p = bsz[c];
+break;
+case 2:
+*(uint32_t *)p = gsz[c];
+break;
+case 3:
+*(uint32_t *)p = tid[c] % bsz[c];
+break;
+}
+}
+
 static void test_system_values(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -461,44 +490,31 @@ static void test_system_values(struct context 
*ctx)

 "  STORE RES[0].xyzw, TEMP[0], SV[3]\n"
 "  RET\n"
 "ENDSUB\n";
-void init(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-int id = x / 16, sv = (x % 16) / 4, c = x % 4;
-int tid[] = { id % 20, (id % 240) / 20, id / 240, 0 };
-int bsz[] = { 4, 3, 5, 1};
-int gsz[] = { 5, 4, 1, 1};
-
-switch (sv) {
-case 0:
-*(uint32_t *)p = tid[c] / bsz[c];
-break;
-case 1:
-*(uint32_t *)p = bsz[c];
-break;
-case 2:
-*(uint32_t *)p = gsz[c];
-break;
-case 3:
-*(uint32_t *)p = tid[c] % bsz[c];
-break;
-}
-}

 printf("- %s\n", __func__);

 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 76800, 0, init);
+ 76800, 0, test_default_init);
 init_compute_resources(ctx, (int []) { 0, -1 });
 launch_grid(ctx, (uint []){4, 3, 5}, (uint []){5, 4, 1}, 0, 
NULL);

-check_tex(ctx, 0, expect, NULL);
+check_tex(ctx, 0, test_system_values_expect, NULL);
 destroy_compute_resources(ctx);
 destroy_tex(ctx);
 destroy_prog(ctx);
 }

+/* test_resource_access */
+static void test_resource_access_init0(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)x;
+}
+
+static void test_resource_access_expect(void *p, int s, int x, int y)
+{
+*(float *)p = 8.0 - (float)((x + 4 * y) & 0x3f);
+}
+
 static void test_resource_access(struct context *ctx)
 {
 const char *src = "COMP\n"
@@ -519,31 +535,33 @@ static void test_resource_access(struct context 
*ctx)

 "   STORE RES[1].xyzw, TEMP[1], TEMP[0]\n"
 "   RET\n"
 "ENDSUB\n";
-void init0(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)x;
-}
-void init1(void *p, int s, int x, int y) {
-*(uint32_t *)p = 0xdeadbeef;
-}
-void expect(void *p, int s, int x, int y) {
-*(float *)p = 8.0 - (float)((x + 4*y) & 0x3f);
-}

 printf("- %s\n", __func__);

 init_prog(ctx, 0, 0, 0, src, NULL);
 init_tex(ctx, 0, PIPE_BUFFER, true, PIPE_FORMAT_R32_FLOAT,
- 256, 0, init0);
+ 256, 0, test_resource_access_init0);
 init_tex(ctx, 1, PIPE_TEXTURE_2D, true, PIPE_FORMAT_R32_FLOAT,
- 60, 12, init1);
+ 60, 12, test_default_init);
 init_compute_r

Re: [Mesa-dev] [PATCH 0/9] RadeonSI: Some shaders cleanups

2016-01-01 Thread eocallaghan

Well, I can't disagree with anything in this series, and it certainly makes
`si_shader.c' a tiny bit easier for me to understand.
Hopefully there won't be too much fallout for the debug callback patch
series, as it will need to be rebased on top of this :|

Thus this series is,

Reviewed-by: Edward O'Callaghan 

On 2016-01-02 01:13, Marek Olšák wrote:

Hi,

These are shader cleanups mostly around si_compile_llvm.

You may wonder why the "move si_shader_binary_upload out of xxx"
patches. They are part of my one-variant-per-shader rework, which
needs a lot of restructuring.

Besides this, I have 2 more series of cleanup patches, which I will
send when this lands.

Please review.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 1/6] gallium/radeon: implement set_debug_callback

2015-12-30 Thread eocallaghan

Fantastic thanks!

This series is,

Reviewed-by: Edward O'Callaghan 

On 2015-12-31 13:30, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/r600_pipe_common.c | 12 
 src/gallium/drivers/radeon/r600_pipe_common.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 9a5e987..41c7aa5 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -227,6 +227,17 @@ static enum pipe_reset_status
r600_get_reset_status(struct pipe_context *ctx)
return PIPE_UNKNOWN_CONTEXT_RESET;
 }

+static void r600_set_debug_callback(struct pipe_context *ctx,
+   const struct pipe_debug_callback *cb)
+{
+   struct r600_common_context *rctx = (struct r600_common_context *)ctx;
+
+   if (cb)
+   rctx->debug = *cb;
+   else
+   memset(&rctx->debug, 0, sizeof(rctx->debug));
+}
+
 bool r600_common_context_init(struct r600_common_context *rctx,
  struct r600_common_screen *rscreen)
 {
@@ -252,6 +263,7 @@ bool r600_common_context_init(struct
r600_common_context *rctx,
rctx->b.transfer_inline_write = u_default_transfer_inline_write;
 rctx->b.memory_barrier = r600_memory_barrier;
rctx->b.flush = r600_flush_from_st;
+   rctx->b.set_debug_callback = r600_set_debug_callback;

if (rscreen->info.drm_major == 2 && rscreen->info.drm_minor >= 43) {
rctx->b.get_device_reset_status = r600_get_reset_status;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h
b/src/gallium/drivers/radeon/r600_pipe_common.h
index c3933b1d..a69e627 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -440,6 +440,8 @@ struct r600_common_context {
 * the GPU addresses are updated. */
struct list_headtexture_buffers;

+   struct pipe_debug_callback  debug;
+
/* Copy one resource to another using async DMA. */
void (*dma_copy)(struct pipe_context *ctx,
 struct pipe_resource *dst,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] glsl: annotate ast_process_struct_or_iface_block_members() as static

2015-12-29 Thread eocallaghan

This series is,

Reviewed-by: Edward O'Callaghan 

On 2015-12-29 21:02, Timothy Arceri wrote:

From: Emil Velikov 

Reviewed-by: Timothy Arceri 
---
 src/glsl/ast_to_hir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index bb35d72..d51f095 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -6201,7 +6201,7 @@ ast_type_specifier::hir(exec_list *instructions,
  * The number of fields processed.  A pointer to the array structure 
fields is

  * stored in \c *fields_ret.
  */
-unsigned
+static unsigned
 ast_process_struct_or_iface_block_members(exec_list *instructions,
   struct 
_mesa_glsl_parse_state *state,

   exec_list *declarations,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] V2 ARB_enhanced_layouts component qualifier support

2015-12-28 Thread eocallaghan

On 2015-12-29 16:00, Timothy Arceri wrote:
This series adds support for the component layout qualifier by enhancing
the varying packing pass at the GLSL IR level. The advantage of this
approach is that it's fairly simple and will work for all drivers; the
disadvantage is that it relies on optimisation passes to clean up the
mess.


I'm personally a fan of this approach; I think abstract IR passes are
the way to go over hand-holding bulky driver backends. The main issue I
see is the interaction with tessellation. Admittedly, I don't fully
understand the implications of patch 16, although patch 26 helped clarify
the issue a little. It does sound to me as though the duplicated
dimensionality from AoA (arrays of arrays) and so on is solvable at the IR
level in any case. Thus, this series is:

Reviewed-by: Edward O'Callaghan 



 [PATCH 01/28] glsl: only add outward facing varyings to resource list

 Patch 1: Bugfix for SSO but also required later in the series to add
 packed vertex inputs to the resource list.

 [PATCH 02/28] glsl: move lowering after matching validation
 [PATCH 03/28] glsl: don't change the varying type in validation code
 [PATCH 04/28] glsl: remove unused varyings before packing them

 Patches 2-4: Required to allow unused varyings with explicit locations
 to be removed before packing.

 [PATCH 05/28] glsl: create helper to remove outer vertex index array
 [PATCH 06/28] glsl: fix overlapping of varying locations for arrays
 [PATCH 07/28] glsl: don't try adding built-ins to explicit locations

 Patches 5-7: SSO bugfixes.

 [PATCH 08/28] glsl: parse component layout qualifier
 [PATCH 09/28] glsl: validate and store component layout qualifier in
 [PATCH 10/28] glsl: fix cross validation for explicit locations on
 [PATCH 11/28] glsl: cross validate varyings with a component
 [PATCH 12/28] glsl: update explicit location matching to support
 [PATCH 13/28] glsl: include varyings with explicit locations in slot
 [PATCH 14/28] glsl: pass disable_varying_packing bool to the lowering
 [PATCH 15/28] glsl: add support for packing varyings with explicit
 [PATCH 16/28] glsl: don't pack tessellation stages like we do other
 [PATCH 17/28] glsl: enable lowering of varyings with explicit
 [PATCH 18/28] glsl: validate linking of intrastage component
 [PATCH 19/28] glsl: add support for explicit components to frag
 [PATCH 20/28] glsl: pack vertex attributes with component layout
 [PATCH 21/28] glsl: pack fragment shader outputs with component
 [PATCH 22/28] glsl: get geometry shader vertex count from type when
 [PATCH 23/28] glsl: add pack varying to resource list for vertex
 [PATCH 24/28] glsl: make needs_lowering() a shared packing helper
 [PATCH 25/28] glsl: move packed varying creation code to a helper
 [PATCH 26/28] glsl: lower tessellation varyings packed with component
 [PATCH 27/28] mesa: add LOCATION_COMPONENT support to

 [PATCH 28/28] docs: mark component layout qualifiers as DONE

 Patches 8-28: add the component layout qualifier support.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 16/28] glsl: don't pack tessellation stages like we do other stages

2015-12-28 Thread eocallaghan

On 2015-12-29 16:00, Timothy Arceri wrote:

Tessellation shaders treat varyings as shared memory and invocations
can access each other's varyings, therefore we can't use the existing
method to lower them.

This adds a check for these stages, as following patches will
allow explicit locations to be lowered even when the driver and existing
tessellation checks ask for it to be disabled. We do this to enable support
for the component layout qualifier.


I find this a little hard to read and understand. Could you brush it up a
bit, please, if that's OK?


---
 src/glsl/lower_packed_varyings.cpp | 62 
+-

 1 file changed, 34 insertions(+), 28 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp
b/src/glsl/lower_packed_varyings.cpp
index 2899846..e4e9a35 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -737,40 +737,46 @@ lower_packed_varyings(void *mem_ctx, unsigned
locations_used,
   ir_variable_mode mode, unsigned 
gs_input_vertices,

   gl_shader *shader, bool disable_varying_packing)
 {
-   exec_list *instructions = shader->ir;
ir_function *main_func = shader->symbols->get_function("main");
exec_list void_parameters;
ir_function_signature *main_func_sig
   = main_func->matching_signature(NULL, &void_parameters, false);
-   exec_list new_instructions, new_variables;
-   lower_packed_varyings_visitor visitor(mem_ctx, locations_used, 
mode,

- gs_input_vertices,
- &new_instructions,
- &new_variables,
- disable_varying_packing);
-   visitor.run(shader);
-   if (mode == ir_var_shader_out) {
-  if (shader->Stage == MESA_SHADER_GEOMETRY) {
- /* For geometry shaders, outputs need to be lowered before 
each call

-  * to EmitVertex()
-  */
- lower_packed_varyings_gs_splicer splicer(mem_ctx, 
&new_instructions);

-
- /* Add all the variables in first. */
- main_func_sig->body.head->insert_before(&new_variables);

- /* Now update all the EmitVertex instances */
- splicer.run(instructions);
+   if (!(shader->Stage == MESA_SHADER_TESS_CTRL ||
+ shader->Stage == MESA_SHADER_TESS_EVAL)) {
+  exec_list *instructions = shader->ir;
+  exec_list new_instructions, new_variables;
+
+  lower_packed_varyings_visitor visitor(mem_ctx, locations_used, 
mode,

+gs_input_vertices,
+&new_instructions,
+&new_variables,
+disable_varying_packing);
+  visitor.run(shader);
+  if (mode == ir_var_shader_out) {
+ if (shader->Stage == MESA_SHADER_GEOMETRY) {
+/* For geometry shaders, outputs need to be lowered before 
each

+ * call to EmitVertex()
+ */
+lower_packed_varyings_gs_splicer splicer(mem_ctx,
+ 
&new_instructions);

+
+/* Add all the variables in first. */
+main_func_sig->body.head->insert_before(&new_variables);
+
+/* Now update all the EmitVertex instances */
+splicer.run(instructions);
+ } else {
+/* For other shader types, outputs need to be lowered at 
the end

+ * of main()
+ */
+main_func_sig->body.append_list(&new_variables);
+main_func_sig->body.append_list(&new_instructions);
+ }
   } else {
- /* For other shader types, outputs need to be lowered at the 
end of

-  * main()
-  */
- main_func_sig->body.append_list(&new_variables);
- main_func_sig->body.append_list(&new_instructions);
+ /* Shader inputs need to be lowered at the beginning of 
main() */

+ main_func_sig->body.head->insert_before(&new_instructions);
+ main_func_sig->body.head->insert_before(&new_variables);
   }
-   } else {
-  /* Shader inputs need to be lowered at the beginning of main() 
*/

-  main_func_sig->body.head->insert_before(&new_instructions);
-  main_func_sig->body.head->insert_before(&new_variables);
}
 }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 26/28] glsl: lower tessellation varyings packed with component layout qualifier

2015-12-28 Thread eocallaghan

On 2015-12-29 16:00, Timothy Arceri wrote:

For tessellation shaders we cannot just copy everything to the packed
varyings like we do in other stages, as tessellation uses shared memory for
varyings; therefore it is only safe to copy array elements that the shader
actually uses.

This class searches the IR for uses of varyings and then creates
instructions that copy those vars to a packed varying. This means it is
easy to end up with duplicate copies if the varying is used more than once;
arrays of arrays also create a duplicate copy for each dimension that
exists. These issues are not easily resolved without breaking various
corner cases, so we leave it to a later IR stage to clean up the mess.

Note that neither GLSL IR nor NIR can currently can't clean up the


s/can\'t//


duplicates when an indirect is used as an array index. This patch
assumes that NIR will eventually be able to clean this up.
---
 src/glsl/lower_packed_varyings.cpp | 421 
+

 1 file changed, 421 insertions(+)

diff --git a/src/glsl/lower_packed_varyings.cpp
b/src/glsl/lower_packed_varyings.cpp
index b606cc8..9522969 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -148,10 +148,28 @@
 #include "ir.h"
 #include "ir_builder.h"
 #include "ir_optimization.h"
+#include "ir_rvalue_visitor.h"
 #include "program/prog_instruction.h"
+#include "util/hash_table.h"

 using namespace ir_builder;

+/**
+ * Creates a new type for an array when the base type changes.
+ */
+static const glsl_type *
+update_packed_array_type(const glsl_type *type, const glsl_type 
*packed_type)

+{
+   const glsl_type *element_type = type->fields.array;
+   if (element_type->is_array()) {
+ const glsl_type *new_array_type =
+update_packed_array_type(element_type, packed_type);
+  return glsl_type::get_array_instance(new_array_type, 
type->length);

+   } else {
+  return glsl_type::get_array_instance(packed_type, type->length);
+   }
+}
+
 static bool
 needs_lowering(ir_variable *var, bool has_enhanced_layouts,
bool disable_varying_packing)
@@ -205,6 +223,51 @@ create_packed_var(void * const mem_ctx, const
char *packed_name,
return packed_var;
 }

+/**
+ * Creates a packed varying for the tessellation packing.
+ */
+static ir_variable *
+create_tess_packed_var(void *mem_ctx, ir_variable *unpacked_var)
+{
+   /* create packed varying name using location */
+   char location_str[11];
+   snprintf(location_str, 11, "%d", unpacked_var->data.location);
+   char *packed_name;
+   if ((ir_variable_mode) unpacked_var->data.mode == 
ir_var_shader_out)
+  packed_name = ralloc_asprintf(mem_ctx, "packed_out:%s", 
location_str);

+   else
+  packed_name = ralloc_asprintf(mem_ctx, "packed_in:%s", 
location_str);

+
+   const glsl_type *packed_type;
+   switch (unpacked_var->type->without_array()->base_type) {
+   case GLSL_TYPE_UINT:
+  packed_type = glsl_type::uvec4_type;
+  break;
+   case GLSL_TYPE_INT:
+  packed_type = glsl_type::ivec4_type;
+  break;
+   case GLSL_TYPE_FLOAT:
+  packed_type = glsl_type::vec4_type;
+  break;
+   case GLSL_TYPE_DOUBLE:
+  packed_type = glsl_type::dvec4_type;
+  break;
+   default:
+  assert(!"Unexpected type in tess varying packing");
+  return NULL;
+   }
+
+   /* Create new array type */
+   if (unpacked_var->type->is_array()) {
+  packed_type = update_packed_array_type(unpacked_var->type, 
packed_type);

+   }
+
+   return create_packed_var(mem_ctx, packed_name, packed_type, 
unpacked_var,
+(ir_variable_mode) 
unpacked_var->data.mode,

+unpacked_var->data.location,
+unpacked_var->type->is_array());
+}
+
 namespace {

 /**
@@ -763,6 +826,296 @@
lower_packed_varyings_gs_splicer::visit_leave(ir_emit_vertex *ev)
 }


+/**
+ * For tessellation shaders we cannot just copy everything to the 
packed
+ * varyings like we do in other stages as tessellation uses shared 
memory for
+ * varyings, therefore it is only safe to copy array elements that the 
shader

+ * actually uses.
+ *
+ * This class searches the IR for uses of varyings and then creates
+ * instructions that copy those vars to a packed varying. This means 
it is
+ * easy to end up with duplicate copies if the varying is used more 
than once,
+ * also arrays of arrays create a duplicate copy for each dimension 
that
+ * exists. These issues are not easily resolved without breaking 
various
+ * corner cases so we leave it to a later IR stage to clean up the 
mess.

+ */
+class lower_packed_varyings_tess_visitor : public ir_rvalue_visitor
+{
+public:
+   lower_packed_varyings_tess_visitor(void *mem_ctx, hash_table 
*varyings,

+  ir_variable_mode mode)
+   : mem_ctx(mem_ctx), varyings(varyings), mode(mode)
+   {
+   }
+
+   virtual ~lower_packed_varyings_tess_visitor()
+   {
+   }
+
+   virtual ir

Re: [Mesa-dev] [PATCH v5] Add .mailmap

2015-12-28 Thread eocallaghan

Should I be expecting to see myself on here?

On 2015-12-28 20:50, Giuseppe Bilotta wrote:

This adds a first tentative .mailmap file, to canonicize contributor
name/emails in shortlogs and other statistical endeavours.

Signed-off-by: Giuseppe Bilotta 
---
Hopefully the last time I need to submit this …

 .mailmap | 460 
+++

 1 file changed, 460 insertions(+)
 create mode 100644 .mailmap

diff --git a/.mailmap b/.mailmap
new file mode 100644
index 000..10811c0
--- /dev/null
+++ b/.mailmap
@@ -0,0 +1,460 @@
+Aapo Tahkola  
+
+Adam Jackson  
+Adam Jackson  
+
+Adrian Marius Negreanu  Adrian Negreanu

+Adrian Marius Negreanu  Negreanu Marius
Adrian 
+
+Dave Airlie  
+Dave Airlie  airlied 


+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+Dave Airlie  
+
+Alan Coopersmith  


+
+Alan Hourihane  
+Alan Hourihane  
+Alan Hourihane  
+
+Alexander Monakov  
+
+Alexander von Gluck IV  Alexander von Gluck

+
+Alex Corscadden  


+Alex Corscadden  
+
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+Alex Deucher  
+
+Andreas Fänger  
+
+Andreas Hartmetz  
+
+Andre Heider 
+Andreas Heider 
+
+Andreas Pokorny 

+
+Andrew Randrianasulu  
+Andrew Randrianasulu  
+
+Arthur Huillet  Arthur HUILLET 


+
+Benjamin Franzke  ben

+
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+Ben Skeggs  
+
+Ben Widawsky  Ben Widawsky 


+
+Blair Sadewitz  Blair Sadewitz

+
+Boris Peterbarg  reist 
+
+Brian Paul  Brian 
+Brian Paul  
+Brian Paul  
+Brian Paul  
+Brian Paul  brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  Brian 
+Brian Paul  root 
+Brian Paul  root 
+Brian Paul  root 
+Brian Paul  root 
+
+Bruce Merry  
+
+Carl-Philip Hänsch  Carl-Philip Haensch

+Carl-Philip Hänsch  Carl-Philip Haensch

+Carl-Philip Hänsch  Carl-Philip Haensch

+
+Chad Versace  
+Chad Versace  c...@chad-versace.us>

+Chad Versace  
+
+Chia-I Wu  
+Chia-I Wu  Chia-Wu 
+
+Chih-Wei Huang  Chih-Wei Huang 


+
+Christian König  Christian Koenig

+Christian König  Christian König

+Christian König  Christian König

+
+Christoph Brill  Christoph Bill 
+Christoph Brill  
+
+Christoph Bumiller 

+
+Christopher James Halse Rogers
 Christopher James Halse
Rogers 
+
+Claudio Ciccani  
+Claudio Ciccani  
+
+Connor Abbott  
+Connor Abbott  
+
+Corbin Simpson  
+Corbin Simpson  
+
+Courtney Goeltzenleuchter  
+
+Daniel Skinner  sio 
+
+Daniel Stone  
+
+David Miller  David S. Miller 


+David Miller  Dave Miller 
+David Miller  davem69 
+
+David Heidelberger  David Heidelberg

+David Heidelberger  
+
+David Reveman  
+
+Dieter Nützel  Dieter Nützel 


+
+Dmitry Cherkassov  Dmitry Cherkasov

+
+Dylan Baker  
+
+Emeric Grange  Emeric 


+
+Emil Velikov  
+
+Eric Anholt  Eric Anholt 
+
+Eugeni Dodonov  
+
+Fabian Bieler  
+Fabian Bieler  <>
+
+Feng, Haitao  Haitao Feng 


+
+Frank Henigman  
+
+George Sapountzis  George Sapountzis 


+
+Gwenole Beauchesne  
+
+Hamish Marson  hmarson 
+
+Hans de Goede  Hans de Goede 


+
+Homer Hsing  
+
+Hui Qi Tay  
+
+Ian Romanick  
+Ian Romanick  
+
+Jakob Bornecrantz  
+Jakob Bornecrantz  
+Jakob Bornecrantz  
+Jakob Bornecrantz  
+Jakob Bornecrantz  
+Jakob Bornecrantz  
+
+Jakub Bogusz  
+
+James Legg  
+
+Jan Vesely  Jan Vesely 
+
+Jason Ekstrand  
+
+Jeremy Huddleston  
+Jeremy Huddleston  
+Jeremy Huddleston  
+Jeremy Huddleston  
+Jeremy Huddleston  Jeremy Huddleston Sequoia

+
+Jeremy Kolb  
+
+Jerome Glisse  
+Jerome Glisse  
+Jerome Glisse  John Doe 
+Jerome Glisse  John Doe 


+
+Jesse Barnes  
+Jesse Barnes  
+Jesse Barnes  


+Jesse Barnes  
+Jesse Barnes  
+
+Joakim Sindholt  
+Joakim Sindholt  
+
+Jochen Gerlach  jtg 
+
+Joel Bosveld  
+
+Jonathan Adamczewski  
+
+Jon Turney  Jon TURNEY

+
+José Fonseca  Jose Fonseca 
+José Fonseca  Jose Fonseca

+José Fonseca  
+José Fonseca  
+José Fonseca  
+José Fonseca  
+José Fonseca  
+
+Jouk Jansen  Jouk Jansen

+Jouk Jansen  Jouk Jansen

+Jouk Jansen  joukj 


+Jouk Jansen  Jouk

+Jouk Jansen  Jouk 


+Jouk Jansen  J.Jansen

+
+Juan Zhao  
+
+Julien Cristau  
+
+Julien Isorce  
+
+Kalyan Kondapally 

+
+Karl Schultz  Karl Schultze 

+Karl Schultz  unknown 


+Karl Schultz  
+Karl Schultz  
+Karl Schultz  
+
+Keith Harrison  sio2 
+
+Keith Packard  
+Keith Packard  
+
+Keith Whitwell  
+Keith Whitwell  keithw 


+
+Kristian Høgsberg  
+Kristian Høgsberg  
+Kristian Høgsberg  
+Kristian Høgsberg  
+Kristian Høgsberg  
+
+Krzesimir Nowak  
+
+Li Peng  
+
+Lucas Stach  
+
+Maarten Lankhorst  

+Maarten Lankhorst  


+
+Maciej Cencora  
+
+Marc-André Lureau  Marc-Andre Lureau

+
+Marc Dietrich  Marc 
+Marc Dietrich  marvin24 
+
+Marcin Ślusarz  Marcin Slusarz

+
+Marek Olšák  
+
+Mario Kleiner  kleinerm

+Mario Kleiner  


+
+Mark Mueller  
+
+Marta Lofstedt  


+
+Martin Peres  
+
+Mathias Fröhlich  Mathias Froehlich

+Mat

Re: [Mesa-dev] [PATCH 0/10] Tessellation shaders for Gen7/7.5.

2015-12-24 Thread eocallaghan


Reviewed-by: Edward O'Callaghan 

Congrats on getting this working, also thanks!

On 2015-12-25 12:34, Kenneth Graunke wrote:
This morning, I woke up and somehow "knew" what was causing my HS GPU hangs
on Gen7/7.5.  It turns out I was (completely) wrong, but through some
miraculous series of illogical leaps, I arrived at a solution anyway.

I don't honestly know how I got it working on Christmas Eve after
failing to figure it out for months on end.  After exhausting every bit
of documentation and every tool available, and finding zero information,
somehow randomly flailing in the dark resulted in a solution, today of
all days.  Honestly, I had pretty much no hope for figuring this out,
so I'm relieved to have it working at last...

It turns out that setting interleave on the EOT URB write does bad things.
Fixing this fixed all the GPU hangs when releasing inputs one at a time.
I then added back the ability to release inputs in pairs, which caused
more GPU hangs.  It turned out I needed to be more careful and enable
both halves.

Everything seems to be working just fine now, so let's turn it on.

--Ken

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH 02/13] i965: Only call brw_upload_tcs/tes_prog when using tessellation.

2015-12-22 Thread eocallaghan

On 2015-12-22 21:20, Kenneth Graunke wrote:

If there's no evaluation shader, tessellation is disabled.  The upload
functions would just bail.  Instead, don't bother calling them.

This will simplify the optional-TCS case a bit, as brw_upload_tcs can
assume that we're doing tessellation.
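
To make the control flow concrete, here is a small C sketch of the pattern: the caller checks for an evaluation shader once, and the upload helpers are then free to assume (and assert) that tessellation is active. All of the types and names below (struct ctx, upload_tcs, upload_tes, upload_programs) are simplified stand-ins invented for this illustration; they are not the real i965 structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified driver state. */
struct stage_state { const void *prog_data; };
struct ctx {
    const void *tess_eval_program;  /* NULL when tessellation is disabled */
    struct stage_state tcs, tes;
    int uploads;                    /* counts helper calls, for the example */
};

/* The helpers no longer need their own "no program" bail-out path. */
static void upload_tcs(struct ctx *c)
{
    assert(c->tess_eval_program);
    c->tcs.prog_data = c->tess_eval_program;
    c->uploads++;
}

static void upload_tes(struct ctx *c)
{
    assert(c->tess_eval_program);
    c->tes.prog_data = c->tess_eval_program;
    c->uploads++;
}

/* The caller decides once whether tessellation is in use. */
static void upload_programs(struct ctx *c)
{
    if (c->tess_eval_program) {
        upload_tcs(c);
        upload_tes(c);
    } else {
        /* Nothing to upload; make sure no stale prog_data is visible. */
        c->tcs.prog_data = NULL;
        c->tes.prog_data = NULL;
    }
}
```

The design trade-off is that the NULL handling lives in one place (the caller) instead of being repeated in each helper, at the cost of the helpers no longer being safe to call unconditionally.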

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_state_upload.c | 11 +--
 src/mesa/drivers/dri/i965/brw_tcs.c  | 17 -
 src/mesa/drivers/dri/i965/brw_tes.c  |  9 -
 3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 56962d5..af9fb5b 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -678,8 +678,15 @@ brw_upload_programs(struct brw_context *brw,
 {
if (pipeline == BRW_RENDER_PIPELINE) {
   brw_upload_vs_prog(brw);
-  brw_upload_tcs_prog(brw);
-  brw_upload_tes_prog(brw);
+  if (brw->tess_eval_program) {
+ brw_upload_tcs_prog(brw);
+ brw_upload_tes_prog(brw);
+  } else {
+ brw->tcs.prog_data = NULL;
+ brw->tcs.base.prog_data = NULL;
+ brw->tes.prog_data = NULL;
+ brw->tes.base.prog_data = NULL;
+  }

   if (brw->gen < 6)
  brw_upload_ff_gs_prog(brw);
diff --git a/src/mesa/drivers/dri/i965/brw_tcs.c
b/src/mesa/drivers/dri/i965/brw_tcs.c
index b5eb4cd..037a2da 100644
--- a/src/mesa/drivers/dri/i965/brw_tcs.c
+++ b/src/mesa/drivers/dri/i965/brw_tcs.c
@@ -187,6 +187,10 @@ brw_upload_tcs_prog(struct brw_context *brw)
/* BRW_NEW_TESS_CTRL_PROGRAM */
struct brw_tess_ctrl_program *tcp =
   (struct brw_tess_ctrl_program *) brw->tess_ctrl_program;
+   /* BRW_NEW_TESS_EVAL_PROGRAM */
+   struct brw_tess_eval_program *tep =
+  (struct brw_tess_eval_program *) brw->tess_eval_program;
+   assert(tcp && tep);

if (!brw_state_dirty(brw,
 _NEW_TEXTURE,
@@ -195,15 +199,6 @@ brw_upload_tcs_prog(struct brw_context *brw)
 BRW_NEW_TESS_EVAL_PROGRAM))
   return;

-   if (tcp == NULL) {
-  /* Other state atoms had better not try to access prog_data, 
since

-   * there's no HS program.
-   */
-  brw->tcs.prog_data = NULL;
-  brw->tcs.base.prog_data = NULL;
-  return;
-   }
-
struct gl_program *prog = &tcp->program.Base;

memset(&key, 0, sizeof(key));
@@ -216,13 +211,9 @@ brw_upload_tcs_prog(struct brw_context *brw)
brw_populate_sampler_prog_key_data(ctx, prog, 
stage_state->sampler_count,

   &key.tex);

-   /* BRW_NEW_TESS_EVAL_PROGRAM */
/* We need to specialize our code generation for tessellation 
levels

 * based on the domain the DS is expecting to tessellate.
 */
-   struct brw_tess_eval_program *tep =
-  (struct brw_tess_eval_program *) brw->tess_eval_program;
-   assert(tep);
key.tes_primitive_mode = tep->program.PrimitiveMode;


Does this compile? You've killed off *tep yet we still dereference it.



if (!brw_search_cache(&brw->cache, BRW_CACHE_TCS_PROG,
diff --git a/src/mesa/drivers/dri/i965/brw_tes.c
b/src/mesa/drivers/dri/i965/brw_tes.c
index 3c12706..4b2bf8c 100644
--- a/src/mesa/drivers/dri/i965/brw_tes.c
+++ b/src/mesa/drivers/dri/i965/brw_tes.c
@@ -241,15 +241,6 @@ brw_upload_tes_prog(struct brw_context *brw)
 BRW_NEW_TESS_EVAL_PROGRAM))
   return;

-   if (tep == NULL) {
-  /* Other state atoms had better not try to access prog_data, since
-   * there's no TES program.
-   */
-  brw->tes.prog_data = NULL;
-  brw->tes.base.prog_data = NULL;
-  return;
-   }
-
struct gl_program *prog = &tep->program.Base;

memset(&key, 0, sizeof(key));


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] draw: rework handling of non-existing outputs in emit code

2015-12-21 Thread eocallaghan


Thanks for the very comprehensive cleanup, Roland, and for fixing that
minor regression we discussed. Happy holidays.

Reviewed-by: Edward O'Callaghan 

On 2015-12-22 14:00, srol...@vmware.com wrote:

From: Roland Scheidegger 

Previously the code would just redirect requests for attributes which
don't exist to use output 0. Rework this to output all zeros instead, which
seems more useful - in particular some extensions like
ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by
previous stages. That way, drivers don't have to special case this depending
on whether the vs/gs outputs some attribute or not.
---
 src/gallium/auxiliary/draw/draw_pipe_vbuf.c | 52 +
 src/gallium/auxiliary/draw/draw_pt_emit.c   | 12 +++
 src/gallium/auxiliary/draw/draw_vertex.h|  4 +--
 3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_vbuf.c
b/src/gallium/auxiliary/draw/draw_pipe_vbuf.c
index f36706c..81c4fed 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_vbuf.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_vbuf.c
@@ -74,9 +74,10 @@ struct vbuf_stage {
unsigned max_indices;
unsigned nr_indices;

-   /* Cache point size somewhere it's address won't change:
+   /* Cache point size somewhere its address won't change:
 */
float point_size;
+   float zero4[4];

struct translate_cache *cache;
 };
@@ -205,6 +206,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim )
struct translate_key hw_key;
unsigned dst_offset;
unsigned i;
+   const struct vertex_info *vinfo;

vbuf->render->set_primitive(vbuf->render, prim);

@@ -215,27 +217,33 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim )
 * state change.
 */
vbuf->vinfo = vbuf->render->get_vertex_info(vbuf->render);
-   vbuf->vertex_size = vbuf->vinfo->size * sizeof(float);
+   vinfo = vbuf->vinfo;
+   vbuf->vertex_size = vinfo->size * sizeof(float);

/* Translate from pipeline vertices to hw vertices.
 */
dst_offset = 0;

-   for (i = 0; i < vbuf->vinfo->num_attribs; i++) {
+   for (i = 0; i < vinfo->num_attribs; i++) {
   unsigned emit_sz = 0;
   unsigned src_buffer = 0;
   enum pipe_format output_format;
-  unsigned src_offset = (vbuf->vinfo->attrib[i].src_index * 4 * sizeof(float) );
+  unsigned src_offset = (vinfo->attrib[i].src_index * 4 * sizeof(float) );

-  output_format = draw_translate_vinfo_format(vbuf->vinfo->attrib[i].emit);
-  emit_sz = draw_translate_vinfo_size(vbuf->vinfo->attrib[i].emit);
+  output_format = draw_translate_vinfo_format(vinfo->attrib[i].emit);
+  emit_sz = draw_translate_vinfo_size(vinfo->attrib[i].emit);

   /* doesn't handle EMIT_OMIT */
   assert(emit_sz != 0);

-  if (vbuf->vinfo->attrib[i].emit == EMIT_1F_PSIZE) {
-src_buffer = 1;
-src_offset = 0;
+  if (vinfo->attrib[i].emit == EMIT_1F_PSIZE) {
+ src_buffer = 1;
+ src_offset = 0;
+  }
+  else if (vinfo->attrib[i].src_index == 255) {
+ /* elements which don't exist will get assigned zeros */
+ src_buffer = 2;
+ src_offset = 0;
   }

   hw_key.element[i].type = TRANSLATE_ELEMENT_NORMAL;
@@ -249,7 +257,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim )
   dst_offset += emit_sz;
}

-   hw_key.nr_elements = vbuf->vinfo->num_attribs;
+   hw_key.nr_elements = vinfo->num_attribs;
hw_key.output_stride = vbuf->vertex_size;

/* Don't bother with caching at this stage:
@@ -261,6 +269,7 @@ vbuf_start_prim( struct vbuf_stage *vbuf, uint prim )
   vbuf->translate = translate_cache_find(vbuf->cache, &hw_key);

   vbuf->translate->set_buffer(vbuf->translate, 1, &vbuf->point_size, 0, ~0);
+  vbuf->translate->set_buffer(vbuf->translate, 2, &vbuf->zero4[0], 0, ~0);
}

vbuf->point_size = vbuf->stage.draw->rasterizer->point_size;
@@ -428,7 +437,7 @@ struct draw_stage *draw_vbuf_stage( struct draw_context *draw,
struct vbuf_stage *vbuf = CALLOC_STRUCT(vbuf_stage);
if (!vbuf)
   goto fail;
-
+
vbuf->stage.draw = draw;
vbuf->stage.name = "vbuf";
vbuf->stage.point = vbuf_first_point;
@@ -437,29 +446,30 @@ struct draw_stage *draw_vbuf_stage( struct draw_context *draw,
vbuf->stage.flush = vbuf_flush;
vbuf->stage.reset_stipple_counter = vbuf_reset_stipple_counter;
vbuf->stage.destroy = vbuf_destroy;
-
+
vbuf->render = render;
vbuf->max_indices = MIN2(render->max_indices, UNDEFINED_VERTEX_ID-1);

-   vbuf->indices = (ushort *) align_malloc( vbuf->max_indices *
-   sizeof(vbuf->indices[0]),
-   16 );
+   vbuf->indices = (ushort *) align_malloc(vbuf->max_indices *
+sizeof(vbuf->indices[0]),
+16);
if (!vbuf->indices)
   goto fail;

vbuf->cache = translate_cache_create();
-   if (!vbuf->cache)
+   if (!vbuf->ca

Re: [Mesa-dev] [PATCH] nir/builder: fix C90 build errors

2015-12-19 Thread eocallaghan

On 2015-12-20 09:39, Rob Clark wrote:

From: Rob Clark 

We are going to start using nir_builder.h from some gallium code, which
is currently only C90.  Which results in:

   In file included from nir/nir_emulate.c:26:0:
   ../../../src/glsl/nir/nir_builder.h: In function ‘nir_build_alu’:
   ../../../src/glsl/nir/nir_builder.h:132:4: error: ISO C90 forbids
mixed declarations and code [-Werror=declaration-after-statement]
   unsigned num_components = op_info->output_size;
   ^
   In file included from nir/nir_emulate.c:26:0:
   ../../../src/glsl/nir/nir_builder.h: In function ‘nir_ssa_for_src’:
   ../../../src/glsl/nir/nir_builder.h:271:4: error: ISO C90 forbids
mixed declarations and code [-Werror=declaration-after-statement]
   nir_alu_src alu = { NIR_SRC_INIT };
   ^
   cc1: some warnings being treated as errors

Signed-off-by: Rob Clark 
---
Not sure if I should just go ahead and push this sort of thing.  Or
if we can start requiring C99 for gallium?

 src/glsl/nir/nir_builder.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
index 332bb02..6f30306 100644
--- a/src/glsl/nir/nir_builder.h
+++ b/src/glsl/nir/nir_builder.h
@@ -115,6 +115,8 @@ nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0,
 {
const nir_op_info *op_info = &nir_op_infos[op];
nir_alu_instr *instr = nir_alu_instr_create(build->shader, op);
+   unsigned num_components;
+
if (!instr)
   return NULL;

@@ -129,7 +131,7 @@ nir_build_alu(nir_builder *build, nir_op op, nir_ssa_def *src0,
/* Guess the number of components the destination temporary should have
 * based on our input sizes, if it's not fixed for the op.
 */
-   unsigned num_components = op_info->output_size;
+   num_components = op_info->output_size;
if (num_components == 0) {
   for (unsigned i = 0; i < op_info->num_inputs; i++) {
  if (op_info->input_sizes[i] == 0)
@@ -265,10 +267,11 @@ nir_channel(nir_builder *b, nir_ssa_def *def, unsigned c)
 static inline nir_ssa_def *
 nir_ssa_for_src(nir_builder *build, nir_src src, int num_components)
 {
+   nir_alu_src alu = { NIR_SRC_INIT };
+
if (src.is_ssa && src.ssa->num_components == num_components)
   return src.ssa;

-   nir_alu_src alu = { NIR_SRC_INIT };
alu.src = src;
for (int j = 0; j < 4; j++)
   alu.swizzle[j] = j;


Reviewed-by: Edward O'Callaghan 
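
For anyone reading along who has not hit this warning before, the pattern
the patch applies can be sketched in isolation as below. This is only an
illustration: struct op_info and guess_num_components() are hypothetical
stand-ins, not Mesa's nir_op_info or nir_build_alu(), and the "largest
matching source" rule is an assumption made to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-opcode metadata; not Mesa's nir_op_info. */
struct op_info {
   unsigned output_size;      /* 0 means "derive the size from the inputs" */
   unsigned num_inputs;
   unsigned input_sizes[4];   /* 0 means "this input is not fixed-size" */
};

/* C90-clean: every declaration sits at the top of the block, before the
 * first statement, so -Werror=declaration-after-statement stays quiet.
 */
static unsigned
guess_num_components(const struct op_info *info, const unsigned *src_sizes)
{
   unsigned num_components;   /* declared up front, assigned later */
   unsigned i;                /* strict C90 has no for-loop declarations */

   if (info == NULL)
      return 0;

   num_components = info->output_size;
   if (num_components == 0) {
      for (i = 0; i < info->num_inputs; i++) {
         if (info->input_sizes[i] == 0 && src_sizes[i] > num_components)
            num_components = src_sizes[i];
      }
   }
   return num_components;
}
```

The only change from the C99 style is where the declarations live; the
early return still comes before the first assignment, as in the patch.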


Re: [Mesa-dev] [PATCH 3/3] radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts

2015-12-14 Thread eocallaghan

On 2015-12-15 04:06, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

The incorrectly computed register count caused lockups.
---
 src/gallium/drivers/radeonsi/si_perfcounter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_perfcounter.c
b/src/gallium/drivers/radeonsi/si_perfcounter.c
index a0ddff6..7ee1dae 100644
--- a/src/gallium/drivers/radeonsi/si_perfcounter.c
+++ b/src/gallium/drivers/radeonsi/si_perfcounter.c
@@ -436,7 +436,7 @@ static void si_pc_emit_select(struct r600_common_context *ctx,

dw = count + regs->num_prelude;
if (count >= regs->num_multi)
-   count += regs->num_multi;
+   dw += regs->num_multi;
radeon_set_uconfig_reg_seq(cs, regs->select0, dw);
for (idx = 0; idx < regs->num_prelude; ++idx)
radeon_emit(cs, 0);


This series is,

Reviewed-by: Edward O'Callaghan 
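
The effect of the one-line fix is perhaps easiest to see with the
arithmetic pulled out on its own. The helper below is a hypothetical
sketch, not driver code; its parameters merely mirror count,
regs->num_prelude and regs->num_multi from the hunk above.

```c
#include <assert.h>

/* Hypothetical helper: number of dwords to announce for the select
 * register sequence. In the old code the extra num_multi was added to
 * count after dw had already been computed, so dw came out too small.
 */
static unsigned
select_dword_count(unsigned count, unsigned num_prelude, unsigned num_multi)
{
   unsigned dw = count + num_prelude;

   if (count >= num_multi)
      dw += num_multi;   /* the fix: grow dw, not count */

   return dw;
}
```

With count = 4, num_prelude = 1 and num_multi = 2 this yields 7, whereas
the pre-patch logic would have passed 5 to the register sequence.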


Re: [Mesa-dev] [PATCH 1/3] glsl: simplify interface matching

2015-12-12 Thread eocallaghan

On 2015-12-13 16:25, Timothy Arceri wrote:

This makes the code easier to follow, should be more efficient,
and will make it easier to add matching via explicit locations
in the following patch.

This patch also replaces the hash table with the newer
resizable hash table, which should be more suitable as the table
is likely to contain only a small number of entries.
---
 src/glsl/link_interface_blocks.cpp | 154 +++--
 1 file changed, 46 insertions(+), 108 deletions(-)

diff --git a/src/glsl/link_interface_blocks.cpp
b/src/glsl/link_interface_blocks.cpp
index 936e2e0..61ba078 100644
--- a/src/glsl/link_interface_blocks.cpp
+++ b/src/glsl/link_interface_blocks.cpp
@@ -30,100 +30,52 @@
 #include "glsl_symbol_table.h"
 #include "linker.h"
 #include "main/macros.h"
-#include "program/hash_table.h"
+#include "util/hash_table.h"


 namespace {

 /**
- * Information about a single interface block definition that we need to keep
- * track of in order to check linkage rules.
- *
- * Note: this class is expected to be short lived, so it doesn't make copies
- * of the strings it references; it simply borrows the pointers from the
- * ir_variable class.
- */
-struct interface_block_definition
-{
-   /**
-* Extract an interface block definition from an ir_variable that
-* represents either the interface instance (for named interfaces), or a
-* member of the interface (for unnamed interfaces).
-*/
-   explicit interface_block_definition(ir_variable *var)
-  : var(var),
-type(var->get_interface_type()),
-instance_name(NULL)
-   {
-  if (var->is_interface_instance()) {
- instance_name = var->name;
-  }
-  explicitly_declared = (var->data.how_declared != ir_var_declared_implicitly);
-   }
-   /**
-* Interface block ir_variable
-*/
-   ir_variable *var;
-
-   /**
-* Interface block type
-*/
-   const glsl_type *type;
-
-   /**
-* For a named interface block, the instance name.  Otherwise NULL.
-*/
-   const char *instance_name;
-
-   /**
-* True if this interface block was explicitly declared in the shader;
-* false if it was an implicitly declared built-in interface block.
-*/
-   bool explicitly_declared;
-};
-
-
-/**
  * Check if two interfaces match, according to intrastage interface matching
  * rules.  If they do, and the first interface uses an unsized array, it will
  * be updated to reflect the array size declared in the second interface.
  */
 bool
-intrastage_match(interface_block_definition *a,
- const interface_block_definition *b,
- ir_variable_mode mode,
+intrastage_match(ir_variable *a,
+ ir_variable *b,
  struct gl_shader_program *prog)
 {
/* Types must match. */
-   if (a->type != b->type) {
+   if (a->get_interface_type() != b->get_interface_type()) {
   /* Exception: if both the interface blocks are implicitly declared,
* don't force their types to match.  They might mismatch due to the two
* shaders using different GLSL versions, and that's ok.
*/
-  if (a->explicitly_declared || b->explicitly_declared)
+  if (a->data.how_declared != ir_var_declared_implicitly ||
+  b->data.how_declared != ir_var_declared_implicitly)
  return false;
}

/* Presence/absence of interface names must match. */
-   if ((a->instance_name == NULL) != (b->instance_name == NULL))
+   if (a->is_interface_instance() != b->is_interface_instance())
   return false;

/* For uniforms, instance names need not match.  For shader ins/outs,
 * it's not clear from the spec whether they need to match, but
 * Mesa's implementation relies on them matching.
 */
-   if (a->instance_name != NULL &&
-   mode != ir_var_uniform && mode != ir_var_shader_storage &&
-   strcmp(a->instance_name, b->instance_name) != 0) {
+   if (a->is_interface_instance() && b->data.mode != ir_var_uniform &&
+   b->data.mode != ir_var_shader_storage &&
+   strcmp(a->name, b->name) != 0) {
   return false;
}

/* If a block is an array then it must match across the shader.
 * Unsized arrays are also processed and matched agaist sized arrays.
 */
-   if (b->var->type != a->var->type &&
-   (b->instance_name != NULL || a->instance_name != NULL) &&
-   !validate_intrastage_arrays(prog, b->var, a->var))
+   if (b->type != a->type &&
+   (b->is_interface_instance() || a->is_interface_instance()) &&
+   !validate_intrastage_arrays(prog, b, a))
   return false;

return true;
@@ -139,43 +91,44 @@ intrastage_match(interface_block_definition *a,
  * This is used for tessellation control and geometry shader consumers.
  */
 bool
-interstage_match(const interface_block_definition *producer,
- const interface_block_definition *consumer,
+interstage_match(ir_variable *producer,
+ ir_variable *consumer,

Re: [Mesa-dev] [PATCH] radeonsi: also print hexadecimal values for register fields in the IB parser

2015-12-10 Thread eocallaghan

On 2015-12-10 09:54, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_debug.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c
b/src/gallium/drivers/radeonsi/si_debug.c
index cce665e..034acf5 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -61,13 +61,16 @@ static void print_spaces(FILE *f, unsigned num)
 static void print_value(FILE *file, uint32_t value, int bits)
 {
/* Guess if it's int or float */
-   if (value <= (1 << 15))
-   fprintf(file, "%u\n", value);
-   else {
+   if (value <= (1 << 15)) {
+   if (value <= 9)
+   fprintf(file, "%u\n", value);
+   else
+   fprintf(file, "%u (0x%0*x)\n", value, bits / 4, value);
+   } else {
float f = uif(value);

if (fabs(f) < 10 && f*10 == floor(f*10))
-   fprintf(file, "%.1ff\n", f);
+   fprintf(file, "%.1ff (0x%0*x)\n", f, bits / 4, value);
else
/* Don't print more leading zeros than there are bits. */
fprintf(file, "0x%0*x\n", bits / 4, value);


Reviewed-by: Edward O'Callaghan 
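
As a small illustration of the "%0*x" width trick used above, here is a
sketch of just the integer branch, writing into a buffer so the result can
be inspected. format_field() is a hypothetical name and the float branch
is deliberately left out; only the format strings are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper covering only the "looks like an integer" case.
 * The '*' in "%0*x" takes the field width from the argument list, so a
 * field of `bits` bits is zero-padded to bits / 4 hex digits.
 */
static int
format_field(char *buf, size_t len, uint32_t value, int bits)
{
   assert(bits > 0 && bits <= 32);

   /* Single decimal digits read the same in hex, so skip the suffix. */
   if (value <= 9)
      return snprintf(buf, len, "%u", (unsigned)value);

   return snprintf(buf, len, "%u (0x%0*x)",
                   (unsigned)value, bits / 4, (unsigned)value);
}
```

For a 16-bit field holding 255 this produces "255 (0x00ff)", and for an
8-bit field "255 (0xff)". Note that bits / 4 rounds down, so a 6-bit field
is padded to a single hex digit only.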


Re: [Mesa-dev] [PATCH] radeonsi: don't call u_prims_for_vertices for patches and rectangles

2015-12-10 Thread eocallaghan

On 2015-12-10 22:15, Marek Olšák wrote:
On Thu, Dec 10, 2015 at 4:01 AM, Michel Dänzer wrote:

On 10.12.2015 06:58, Marek Olšák wrote:

From: Marek Olšák 

Both caused a crash due to a division by zero in that function.
This is an alternative fix.

Cc: 11.0 11.1 
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..e550011 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -216,6 +216,18 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
  radeon_emit(cs, tcs_out_layout | (num_tcs_output_cp << 26));
 }

+static unsigned si_num_prims_for_vertices(const struct pipe_draw_info *info)
+{
+ switch (info->mode) {
+ case PIPE_PRIM_PATCHES:
+ return info->count / info->vertices_per_patch;
+ case R600_PRIM_RECTANGLE_LIST:
+ return info->count / 3;
+ default:
+ return u_prims_for_vertices(info->mode, info->count);
+ }
+}


I don't suppose it makes sense to handle PIPE_PRIM_PATCHES in
u_prims_for_vertices? Either way,


u_prims_for_vertices has an assertion that fails if mode == PATCHES.
That's sufficient.

Marek


I prefer this combined solution now. Many thanks,

Reviewed-by: Edward O'Callaghan 
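
For reference, the shape of the driver-side wrapper can be sketched as
follows. Everything here is a simplified, hypothetical stand-in: the enum,
struct draw_info and triangles_for_vertices() are not the gallium types,
and the assert on vertices_per_patch is an extra guard that the patch
itself does not contain.

```c
#include <assert.h>

/* Hypothetical, reduced set of draw modes for illustration only. */
enum draw_mode { MODE_TRIANGLES, MODE_PATCHES, MODE_RECTANGLE_LIST };

struct draw_info {
   enum draw_mode mode;
   unsigned count;               /* number of vertices in the draw */
   unsigned vertices_per_patch;  /* only meaningful for MODE_PATCHES */
};

/* Stand-in for the generic helper: whole triangles in `count` vertices. */
static unsigned
triangles_for_vertices(unsigned count)
{
   return count < 3 ? 0 : 1 + (count - 3) / 3;
}

/* Handle the modes the generic helper cannot, then fall through to it. */
static unsigned
num_prims_for_vertices(const struct draw_info *info)
{
   switch (info->mode) {
   case MODE_PATCHES:
      assert(info->vertices_per_patch != 0);  /* avoid dividing by zero */
      return info->count / info->vertices_per_patch;
   case MODE_RECTANGLE_LIST:
      return info->count / 3;
   default:
      return triangles_for_vertices(info->count);
   }
}
```

Keeping the special cases in the caller leaves the shared helper and its
assertion untouched, which is the trade-off discussed in this thread.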


Re: [Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash

2015-12-09 Thread eocallaghan

On 2015-12-10 01:47, Marek Olšák wrote:

From: Marek Olšák 

I guess the crash was because of division by zero.

Cc: 11.0 11.1 
---
 src/gallium/auxiliary/util/u_prim.h  | 17 +
 src/gallium/drivers/radeonsi/si_state_draw.c |  3 ++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_prim.h
b/src/gallium/auxiliary/util/u_prim.h
index 3668015..4926af6 100644
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim)
  * For polygons, return the number of triangles.
  */
 static inline unsigned
-u_prims_for_vertices(unsigned prim, unsigned num)
+u_prims_for_vertices(unsigned prim, unsigned num, unsigned vertices_per_patch)
 {
-   const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
+   struct u_prim_vertex_count info;

-   if (num < info->min)
+   if (prim == PIPE_PRIM_PATCHES)
+  info.min = info.incr = vertices_per_patch;
+   else if (prim < PIPE_PRIM_MAX)


We already do this check in u_prim_vertex_count(), which returns NULL when
prim is out of bounds. Perhaps it would be better to avoid this extra else-if
branch here and, in the else branch, just make the call and then assert that
the result is not NULL.



+  info = *u_prim_vertex_count(prim);
+   else {
+  assert(!"invalid prim type");
+  return 0;
+   }
+
+   if (num < info.min)
   return 0;


Well, combining this with my previous patch,
http://lists.freedesktop.org/archives/mesa-dev/2015-December/102729.html
I think we should still have an  assert(info.incr != 0);  here.



-   return 1 + ((num - info->min) / info->incr);
+   return 1 + ((num - info.min) / info.incr);
 }

 static inline boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr )
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..4ac9d0a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
(info->indirect ||
 (info->instance_count > 1 &&
- u_prims_for_vertices(info->mode, info->count) <= 1)))
+ u_prims_for_vertices(info->mode, info->count,
+  info->vertices_per_patch) <= 1)))
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;

return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
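
To make the two review comments above concrete, a rough sketch of what I
have in mind follows: let the lookup return NULL for an out-of-range prim,
assert on that, and also assert that incr is non-zero before dividing. The
table, enum and function names are hypothetical and much smaller than the
real u_prim.h; treat this as an illustration, not a replacement patch.

```c
#include <assert.h>
#include <stddef.h>

struct vertex_count {
   unsigned min;   /* vertices needed for the first primitive */
   unsigned incr;  /* vertices needed for each further primitive */
};

/* Hypothetical, reduced primitive list for illustration only. */
enum { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES, PRIM_MAX };

/* Returns NULL for an unknown primitive, so the bounds check lives here. */
static const struct vertex_count *
vertex_count_for(unsigned prim)
{
   static const struct vertex_count table[PRIM_MAX] = {
      { 1, 1 },  /* PRIM_POINTS */
      { 2, 2 },  /* PRIM_LINES */
      { 3, 3 },  /* PRIM_TRIANGLES */
   };

   return prim < PRIM_MAX ? &table[prim] : NULL;
}

static unsigned
prims_for_vertices(unsigned prim, unsigned num)
{
   const struct vertex_count *info = vertex_count_for(prim);

   /* Loud in debug builds ... */
   assert(info != NULL);
   assert(info->incr != 0);

   /* ... and still safe when asserts are compiled out. */
   if (info == NULL || info->incr == 0 || num < info->min)
      return 0;

   return 1 + (num - info->min) / info->incr;
}
```

The runtime check after the asserts is there so that a release build
returns 0 instead of dividing by zero.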




Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-09 Thread eocallaghan

Roland,

I could not, due to the ml size limit or something; it just bounces, hence
the pull request.


Cheers,
Edward.

On 2015-12-10 02:38, Roland Scheidegger wrote:

Am 09.12.2015 um 05:16 schrieb Edward O'Callaghan:

This fixes my initial attempt so that piglit now passes 14/14. Thanks
to a couple of tips from Roland in the previous patch I was able to
fix the remaining issue. This should be golden now.



Great that you got it working!
Please send the patches to the ml.

Roland



