Re: [Mesa-dev] [PATCH 2/4] i965: Implement the WaPreventHSTessLevelsInterference workaround.

2016-08-18 Thread Alejandro Piñeiro


On 17/08/16 16:15, Kenneth Graunke wrote:
> Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases:
> - vertex_spacing
> - tessellation_shader_point_mode.points_verification
> - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/brw_compiler.h   |   2 +
>  src/mesa/drivers/dri/i965/brw_nir.h|   2 +
>  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 152 
> +
>  src/mesa/drivers/dri/i965/brw_tcs.c|  18 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp |   3 +
>  6 files changed, 175 insertions(+), 3 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index df6b5dd..2492a31 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -48,6 +48,7 @@ i965_compiler_FILES = \
>   brw_nir_attribute_workarounds.c \
>   brw_nir_intrinsics.c \
>   brw_nir_opt_peephole_ffma.c \
> + brw_nir_tcs_workarounds.c \
>   brw_packed_float.c \
>   brw_predicated_break.cpp \
>   brw_reg.h \
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 10e9f47..7d15c28 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -220,6 +220,8 @@ struct brw_tcs_prog_key
> /** A bitfield of per-vertex outputs written. */
> uint64_t outputs_written;
>  
> +   bool quads_workaround;
> +
> struct brw_sampler_prog_key_data tex;
>  };
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h 
> b/src/mesa/drivers/dri/i965/brw_nir.h
> index 74c354f..6185310 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.h
> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
> @@ -117,6 +117,8 @@ bool brw_nir_apply_attribute_workarounds(nir_shader *nir,
>  
>  bool brw_nir_apply_trig_workarounds(nir_shader *nir);
>  
> +void brw_nir_apply_tcs_quads_workaround(nir_shader *nir);
> +
>  nir_shader *brw_nir_apply_sampler_key(nir_shader *nir,
>const struct brw_device_info *devinfo,
>const struct brw_sampler_prog_key_data 
> *key,
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c 
> b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> new file mode 100644
> index 000..0626981
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> @@ -0,0 +1,152 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "compiler/nir/nir_builder.h"
> +#include "brw_nir.h"
> +
> +/**
> + * Implements the WaPreventHSTessLevelsInterference workaround (for Gen7-8).
> + *
> + * From the Broadwell PRM, Volume 7 (3D-Media-GPGPU), Page 494 (below the
> + * definition of the patch header layouts):
> + *
> + *"HW Bug: The Tessellation stage will incorrectly add domain points
> + * along patch edges under the following conditions, which may result
> + * in conformance failures and/or cracking artifacts:
> + *
> + *   * QUAD domain
> + *   * INTEGER partitioning
> + *   * All three TessFactors in a given U or V direction (e.g., V
> + * direction: UEQ0, InsideV, UEQ1) are all exactly 1.0
> + *   * All three TessFactors in the other direction are > 1.0 and all
> + * round up to the same integer value (e.g, U direction:
> + * VEQ0 = 3.1, InsideU = 3.7, VEQ1 = 3.4)
> + *
> + * The suggested workaround (to be implemented as part of the postamble
> + *  

Re: [Mesa-dev] [PATCH 33/95] i965/vec4: implement d2b

2016-08-18 Thread Iago Toral
On Wed, 2016-08-17 at 15:15 -0700, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > 
> > On Tue, 2016-08-02 at 18:27 -0700, Francisco Jerez wrote:
> > > 
> > > Iago Toral Quiroga  writes:
> > > 
> > > > 
> > > > 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18
> > > > ++
> > > >  1 file changed, 18 insertions(+)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > index 1525a3d..4014020 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > @@ -1497,6 +1497,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr
> > > > *instr)
> > > >    emit(CMP(dst, op[0], brw_imm_f(0.0f),
> > > > BRW_CONDITIONAL_NZ));
> > > >    break;
> > > >  
> > > > +   case nir_op_d2b: {
> > > > +  /* two-argument instructions can't take 64-bit
> > > > immediates */
> > > > +  dst_reg zero = dst_reg(this, glsl_type::dvec4_type);
> > > > +  emit(MOV(zero, brw_imm_df(0.0)));
> > > > +
> > > > +  dst_reg tmp = dst_reg(this, glsl_type::dvec4_type);
> > > > +  emit(CMP(tmp, op[0], src_reg(zero),
> > > > BRW_CONDITIONAL_NZ));
> > > > +
> > > > +  /* Convert the double CMP result to a single boolean
> > > > result.
> > > > For that
> > > > +   * we take the low 32-bit chunk of each DF component in
> > > > the
> > > > result.
> > > > +   * and do a final MOV to honor the original writemask
> > > > +   */
> > > > +  dst_reg result = dst_reg(this, glsl_type::bvec4_type);
> > > > +  emit(VEC4_OPCODE_PICK_LOW_32BIT, result, src_reg(tmp));
> > > > +  emit(MOV(dst, src_reg(result)));
> > > Couldn't you just do a single CMP instruction of the double-
> > > precision
> > > argument and a single-precision 0.0 immediate? 
> > It never occurred to me that we could do that but it seems to work
> > just
> > fine.
> > 
> > > 
> > >  I think you could
> > > potentially also use a 32bit destination type on the CMP
> > > instruction
> > > so
> > > you don't need to emit the PICK_LOW+MOV instructions afterwards.
> > This does not seem to work though.
> > 
> > > 
> > >   It may
> > > hit an instruction decompression bug but it could be cleaned up
> > > by
> > > the
> > > SIMD lowering pass afterwards, AFAICT the result would be two
> > > uncompressed instructions instead of the two uncompressed plus
> > > two
> > > compressed instructions above.
> > I tried splitting the instruction just in case. Notice that doing
> > this
> > requires that we improve the splitting pass to support splitting
> > instructions that write only one register (the original series did
> > not
> > implement support for that because so far the only cared to split
> > DF
> > instructions that wrote more than one register). I implemented a
> > couple
> > of quick hacks to get this done so I could test this but the result
> > is
> > still incorrect.
> > 
> > For reference, if we don't split, the resulting CMP looks like this
> > (after full scalarization):
> > 
> > cmp.nz.f0(8)  g8<1>.xD  g6<2,2,1>.xyxyDF 0F     { align16 1Q };
> > cmp.nz.f0(8)  g8<1>.yD  g6<2,2,1>.zwzwDF 0F     { align16 1Q };
> > cmp.nz.f0(8)  g8<1>.zD  g6.2<0,2,1>.xyxyDF 0F   { align16 1Q };
> > cmp.nz.f0(8)  g8<1>.wD  g6.2<0,2,1>.zwzwDF 0F   { align16 1Q };
> > 
> > And if we split the instruction in two we produce this:
> > 
> > cmp.nz.f0(4)  g6<1>.xD     g1<0,2,1>.xyxyDF 0F    { align16 1N };
> > cmp.nz.f0(4)  g6<1>.yD     g1<0,2,1>.zwzwDF 0F    { align16 1N };
> > cmp.nz.f0(4)  g6<1>.zD     g1.2<0,2,1>.xyxyDF 0F  { align16 1N };
> > cmp.nz.f0(4)  g6<1>.wD     g1.2<0,2,1>.zwzwDF 0F  { align16 1N };
> > cmp.nz.f0(4)  g6.4<1>.xD   g1<0,2,1>.xyxyDF 0F    { align16 2N };
> > cmp.nz.f0(4)  g6.4<1>.yD   g1<0,2,1>.zwzwDF 0F    { align16 2N };
> > cmp.nz.f0(4)  g6.4<1>.zD   g1.2<0,2,1>.xyxyDF 0F  { align16 2N };
> > cmp.nz.f0(4)  g6.4<1>.wD   g1.2<0,2,1>.zwzwDF 0F  { align16 2N };
> > 
> > Both sequences of instructions look correct to me as far as the
> > splitting and DF regioning go, so I guess the 32-bit CMP result is
> > not
> > doing what we want.
> > 
> > I am not too surprised about this. Even if we use a 32b dst type I
> > suppose the instruction execution type is still 64b so I guess
> > there
> > should be a 64b to 32b conversion involved in writing to a 32b
> > result.
> > Seeing how 64b to 32b conversions require that we emit specific
> > code
> > using ALIGN1 mode to deal with alignment requirements I guess it is
> > not
> > too surprising that this does not work, right?
> > 
> Yeah, that's not too surprising, don't worry about it.  Another idea
> would be to do something along the lines of 'MOV.nz null,  goes here>' to check whether the source is non-zero (so you avoid
> loading the immediate which is a non-trivial operation on Gen7), and
> then use a single-precision SEL instruction (or two predicated MOV
> instructions) to write true or false to the destination (so yo

Re: [Mesa-dev] [PATCH] vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords

2016-08-18 Thread Gustaw Smolarczyk
Hi,

Will this help Minecraft? I believe it currently uses VBOs (if they
are enabled) and display lists. I might test it when I'm home next
week.

Regards,
Gustaw Smolarczyk

2016-08-18 7:26 GMT+02:00 Mathias Fröhlich :
> Hi,
>
> On Wednesday, 17 August 2016 11:00:48 CEST Tim Rowley wrote:
>> Increases the performance of legacy geometry-heavy apps
>> still using display lists.
> That is my observation too.
> +1 from my side.
>
> If you need that from me this gets:
> Reviewed-by: Mathias Fröhlich 
>
> Mathias
>
>> ---
>>  src/mesa/vbo/vbo_save.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
>> index 2843b3c..d1d7fb0 100644
>> --- a/src/mesa/vbo/vbo_save.h
>> +++ b/src/mesa/vbo/vbo_save.h
>> @@ -96,7 +96,7 @@ struct vbo_save_vertex_list {
>>   * likelyhood as it occurs.  No reason we couldn't change usage
>>   * internally even though this probably isn't allowed for client VBOs?
>>   */
>> -#define VBO_SAVE_BUFFER_SIZE (8*1024) /* dwords */
>> +#define VBO_SAVE_BUFFER_SIZE (256*1024) /* dwords */
>>  #define VBO_SAVE_PRIM_SIZE   128
>>  #define VBO_SAVE_PRIM_MODE_MASK 0x3f
>>  #define VBO_SAVE_PRIM_WEAK  0x40
>>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] nir: Rely on the fact that bcsel takes a well formed boolean.

2016-08-18 Thread Jason Ekstrand
On Wed, Aug 17, 2016 at 3:03 PM, Kenneth Graunke 
wrote:

> According to Connor, it's safe to assume that the first operand of
> bcsel, as well as the operand of b2f and b2i, must be well formed
> booleans.
>

Connor is *probably* right...  I'm personally still a bit skeptical that
this won't bite us in practice since we have no way to validate it.

Patches 1-3 are definitely

Reviewed-by: Jason Ekstrand 

Go ahead and put my R-B on this patch if you want.  We can revert it if it
causes trouble. :)


> https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html
>
> With the previous improvements to a@bool handling, this now has no
> change in shader-db instruction counts on Broadwell.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py
> b/src/compiler/nir/nir_opt_algebraic.py
> index 0f0896b..ceb8730 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -144,7 +144,7 @@ optimizations = [
> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> -   (('bcsel', ('inot', 'a@bool'), b, c), ('bcsel', a, c, b)),
> +   (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
> (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
> (('bcsel', a, True, 'b@bool'), ('ior', a, b)),
> (('fmin', a, a), a),
> @@ -248,8 +248,8 @@ optimizations = [
> (('ine', 'a@bool', True), ('inot', a)),
> (('ine', 'a@bool', False), a),
> (('ieq', 'a@bool', False), ('inot', 'a')),
> -   (('bcsel', a, True, False), ('ine', a, 0)),
> -   (('bcsel', a, False, True), ('ieq', a, 0)),
> +   (('bcsel', a, True, False), a),
> +   (('bcsel', a, False, True), ('inot', a)),
> (('bcsel', True, b, c), b),
> (('bcsel', False, b, c), c),
> # The result of this should be hit by constant propagation and, in the
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Tapani Pälli
Implementation previously used case value itself as the key, however
afterhash implementation change by ee02a5e we cannot use 0 as key.
Patch uses _mesa_hash_data to formulate a suitable key for this hash.

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
---
 src/compiler/glsl/ast_to_hir.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index e03a6e3..53fc4d6 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
  /* Stuff a dummy value in to allow processing to continue. */
  label_const = new(ctx) ir_constant(0);
   } else {
+ unsigned hash_key = _mesa_hash_data(&label_const->value.u[0],
+ sizeof(unsigned));
  ast_expression *previous_label = (ast_expression *)
  hash_table_find(state->switch_state.labels_ht,
- (void *)(uintptr_t)label_const->value.u[0]);
+ (void *)(uintptr_t)hash_key);
 
  if (previous_label) {
 YYLTYPE loc = this->test_value->get_location();
@@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
  } else {
 hash_table_insert(state->switch_state.labels_ht,
   this->test_value,
-  (void *)(uintptr_t)label_const->value.u[0]);
+  (void *)(uintptr_t)hash_key);
  }
   }
 
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] st/mesa: add extra braces on nir_lower_wpos_ytransform_options initializer

2016-08-18 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Wed, Aug 17, 2016 at 4:39 PM, Brian Paul  wrote:

> Since the type is an array inside a union.  Silences MinGW warning.
> ---
>  src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp
> b/src/mesa/state_tracker/st_glsl_to_nir.cpp
> index 73a692a..6470fae 100644
> --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
> @@ -234,7 +234,7 @@ st_glsl_to_nir(struct st_context *st, struct
> gl_program *prog,
>static const gl_state_index wposTransformState[STATE_LENGTH] = {
>   STATE_INTERNAL, STATE_FB_WPOS_Y_TRANSFORM
>};
> -  nir_lower_wpos_ytransform_options wpos_options = {0};
> +  nir_lower_wpos_ytransform_options wpos_options = {{0}};
>struct pipe_screen *pscreen = st->pipe->screen;
>
>memcpy(wpos_options.state_tokens, wposTransformState,
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords

2016-08-18 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Aug 17, 2016 at 6:00 PM, Tim Rowley 
wrote:

> Increases the performance of legacy geometry-heavy apps
> still using display lists.
> ---
>  src/mesa/vbo/vbo_save.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
> index 2843b3c..d1d7fb0 100644
> --- a/src/mesa/vbo/vbo_save.h
> +++ b/src/mesa/vbo/vbo_save.h
> @@ -96,7 +96,7 @@ struct vbo_save_vertex_list {
>   * likelyhood as it occurs.  No reason we couldn't change usage
>   * internally even though this probably isn't allowed for client VBOs?
>   */
> -#define VBO_SAVE_BUFFER_SIZE (8*1024) /* dwords */
> +#define VBO_SAVE_BUFFER_SIZE (256*1024) /* dwords */
>  #define VBO_SAVE_PRIM_SIZE   128
>  #define VBO_SAVE_PRIM_MODE_MASK 0x3f
>  #define VBO_SAVE_PRIM_WEAK  0x40
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] vbo: remove unnecessary max_basevertex computation

2016-08-18 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Wed, Aug 17, 2016 at 11:12 PM, Ilia Mirkin  wrote:

> The max basevertex is already computed and added into max_index by the
> caller, _tnl_draw_prims.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/vbo/vbo_split.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/src/mesa/vbo/vbo_split.c b/src/mesa/vbo/vbo_split.c
> index 2f95746..79d7dd4 100644
> --- a/src/mesa/vbo/vbo_split.c
> +++ b/src/mesa/vbo/vbo_split.c
> @@ -108,14 +108,6 @@ void vbo_split_prims( struct gl_context *ctx,
>   vbo_draw_func draw,
>   const struct split_limits *limits )
>  {
> -   GLint max_basevertex = prim->basevertex;
> -   GLuint i;
> -
> -   for (i = 1; i < nr_prims; i++)
> -  max_basevertex = MAX2(max_basevertex, prim[i].basevertex);
> -
> -   /* XXX max_basevertex is computed but not used, why? */
> -
> if (ib) {
>if (limits->max_indices == 0) {
>  /* Could traverse the indices, re-emitting vertices in turn.
> --
> 2.7.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] st: guard against NULL pipe_surface

2016-08-18 Thread Marek Olšák
Hi,

Do you have a test case?

Marek

On Thu, Aug 18, 2016 at 1:44 AM, Tobias Klausmann <
tobias.johannes.klausm...@mni.thm.de> wrote:

> OpenMW tries to upload a new surface (mouse pointer) which fails in the now
> guarded update_framebuffer_size() as the surface is NULL.
>
> This is not inteded as a real "fix", as it would just hide the immediate
> crash.
>
> So if somebody could take a look at this...
>
> Reported-by: 
> Signed-off-by: Tobias Klausmann 
> ---
>  src/mesa/state_tracker/st_atom_framebuffer.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c
> b/src/mesa/state_tracker/st_atom_framebuffer.c
> index ea41d9d..3ee4ea5 100644
> --- a/src/mesa/state_tracker/st_atom_framebuffer.c
> +++ b/src/mesa/state_tracker/st_atom_framebuffer.c
> @@ -177,8 +177,10 @@ update_framebuffer_state( struct st_context *st )
>   /* rendering to a GL texture, may have to update surface */
>   st_update_renderbuffer_surface(st, strb);
>}
> -  pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
> -  update_framebuffer_size(framebuffer, strb->surface);
> +  if (strb->surface) {
> + pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
> + update_framebuffer_size(framebuffer, strb->surface);
> +  }
> }
> else {
>strb = st_renderbuffer(fb->Attachment[BUFFER_STENCIL].
> Renderbuffer);
> --
> 2.9.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Iago Toral
On Thu, 2016-08-18 at 11:57 +0300, Tapani Pälli wrote:
> Implementation previously used case value itself as the key, however
> afterhash implementation change by ee02a5e we cannot use 0 as key.
> Patch uses _mesa_hash_data to formulate a suitable key for this hash.
> 
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> index e03a6e3..53fc4d6 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
>   /* Stuff a dummy value in to allow processing to continue.
> */
>   label_const = new(ctx) ir_constant(0);
>    } else {
> + unsigned hash_key = _mesa_hash_data(&label_const-
> >value.u[0],
> + sizeof(unsigned));

Shouldn't this take (void *) label_const->value[0] instead? It is the
value of the label we want to use as key, not the memory address where
it is stored, right?

>   ast_expression *previous_label = (ast_expression *)
>   hash_table_find(state->switch_state.labels_ht,
> - (void *)(uintptr_t)label_const-
> >value.u[0]);
> + (void *)(uintptr_t)hash_key);

Actually, you use the value here, so I guess the above was a mistake?

>   if (previous_label) {
>  YYLTYPE loc = this->test_value->get_location();
> @@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
>   } else {
>  hash_table_insert(state->switch_state.labels_ht,
>    this->test_value,
> -  (void *)(uintptr_t)label_const-
> >value.u[0]);
> +  (void *)(uintptr_t)hash_key);
>   }
>    }
>  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965/cfg: Factor common code out of switch statement.

2016-08-18 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2016-08-17 at 11:54 -0700, Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_cfg.cpp | 10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> index 5d46615..53b32be 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> @@ -177,9 +177,10 @@ cfg_t::cfg_t(exec_list *instructions)
>    /* set_next_block wants the post-incremented ip */
>    ip++;
>  
> +  inst->exec_node::remove();
> +
>    switch (inst->opcode) {
>    case BRW_OPCODE_IF:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>    /* Push our information onto a stack so we can recover from
> @@ -202,7 +203,6 @@ cfg_t::cfg_t(exec_list *instructions)
>    break;
>  
>    case BRW_OPCODE_ELSE:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>   cur_else = cur;
> @@ -226,7 +226,6 @@ cfg_t::cfg_t(exec_list *instructions)
>  set_next_block(&cur, cur_endif, ip - 1);
>   }
>  
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>   if (cur_else) {
> @@ -267,12 +266,10 @@ cfg_t::cfg_t(exec_list *instructions)
>  set_next_block(&cur, cur_do, ip - 1);
>   }
>  
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>    break;
>  
>    case BRW_OPCODE_CONTINUE:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>   assert(cur_do != NULL);
> @@ -286,7 +283,6 @@ cfg_t::cfg_t(exec_list *instructions)
>    break;
>  
>    case BRW_OPCODE_BREAK:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>   assert(cur_while != NULL);
> @@ -300,7 +296,6 @@ cfg_t::cfg_t(exec_list *instructions)
>    break;
>  
>    case BRW_OPCODE_WHILE:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
>   assert(cur_do != NULL && cur_while != NULL);
> @@ -317,7 +312,6 @@ cfg_t::cfg_t(exec_list *instructions)
>    break;
>  
>    default:
> - inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>    break;
>    }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] egl/x11_dri3: disable WL_bind_wayland_display for devices without render nodes

2016-08-18 Thread Frank Binns

On 04/08/16 15:02, Frank Binns wrote:

On 20/06/16 12:05, Axel Davy wrote:

Hi,

The three patches make sense to me.

Reviewed-by: Axel Davy 


Can someone push these patches for me?

Thanks
Frank


Ping



On 17/06/2016 19:41, Frank Binns wrote :
Up until now, DRI3 was only used for devices that have render nodes, 
unless
overridden via an environment variable, with it falling back to DRI2 
otherwise.
This limitation was there in order to support 
WL_bind_wayland_display as it
requires client opened device node fds to be authenticated, which 
isn't possible
when using DRI3. This is an unfortunate compromise as DRI3 provides 
security

benefits over DRI2.

Instead, allow DRI3 to be used for devices without render nodes but 
don't
advertise WL_bind_wayland_display in this case. Applications that 
need this
extension can still be run by disabling DRI3 support via the 
LIBGL_DRI3_DISABLE

environment variable.

Signed-off-by: Frank Binns 
---
  src/egl/drivers/dri2/platform_x11.c  |  3 ++-
  src/egl/drivers/dri2/platform_x11_dri3.c | 33 
+---

  2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c

index c0a4005..1b50889 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -1335,7 +1335,8 @@ dri2_initialize_x11_dri3(_EGLDriver *drv, 
_EGLDisplay *disp)

 disp->Extensions.EXT_buffer_age = EGL_TRUE;
#ifdef HAVE_WAYLAND_PLATFORM
-   disp->Extensions.WL_bind_wayland_display = EGL_TRUE;
+   if (dri2_dpy->device_name)
+  disp->Extensions.WL_bind_wayland_display = EGL_TRUE;
  #endif
   if (dri2_dpy->conn) {
diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c

index 9363a8a..b781987 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -437,29 +437,6 @@ struct dri2_egl_display_vtbl 
dri3_x11_display_vtbl = {

 .get_dri_drawable = dri3_get_dri_drawable,
  };
  -static char *
-dri3_get_device_name(int fd)
-{
-   char *ret = NULL;
-
-   ret = drmGetRenderDeviceNameFromFd(fd);
-   if (ret)
-  return ret;
-
-   /* For dri3, render node support is required for 
WL_bind_wayland_display.
-* In order not to regress on older systems without kernel or 
libdrm
-* support, fall back to dri2. User can override it with 
environment

-* variable if they don't need to use that extension.
-*/
-   if (getenv("EGL_FORCE_DRI3") == NULL) {
-  _eglLog(_EGL_WARNING, "Render node support not available, 
falling back to dri2");
-  _eglLog(_EGL_WARNING, "If you want to force dri3, set 
EGL_FORCE_DRI3 environment variable");

-   } else
-  ret = loader_get_device_name_for_fd(fd);
-
-   return ret;
-}
-
  EGLBoolean
  dri3_x11_connect(struct dri2_egl_display *dri2_dpy)
  {
@@ -539,11 +516,11 @@ dri3_x11_connect(struct dri2_egl_display 
*dri2_dpy)

return EGL_FALSE;
 }
  -   dri2_dpy->device_name = dri3_get_device_name(dri2_dpy->fd);
-   if (!dri2_dpy->device_name) {
-  close(dri2_dpy->fd);
-  return EGL_FALSE;
-   }
+   /* Only try to get a render device name since it's only needed for
+* WL_bind_wayland_display and dri3 doesn't provide a mechanism for
+* authenticating client opened device node fds. If this fails then
+* don't advertise the extension. */
+   dri2_dpy->device_name = drmGetRenderDeviceNameFromFd(dri2_dpy->fd);
   return EGL_TRUE;
  }






___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] glsl: fix MinGW initializer warning in ast_declarator_list::hir()

2016-08-18 Thread Emil Velikov
On 17 August 2016 at 15:39, Brian Paul  wrote:
> We need an extra set of braces for an array inside a struct.
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index e03a6e3..abab932 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -4731,7 +4731,7 @@ ast_declarator_list::hir(exec_list *instructions,
>if ((var->data.mode == ir_var_auto || var->data.mode == 
> ir_var_temporary)
>&& (var->type->is_numeric() || var->type->is_boolean())
>&& state->zero_init) {
> - const ir_constant_data data = {0};
> + const ir_constant_data data = {{0}};
Different versions of GCC trigger warning on each one of these. The
former is the correct one according to the spec, yet memset is the
most robust (warnings free) option.

Humble suggestion: opt to the latter or expect someone to unknowingly
revert this.
-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords

2016-08-18 Thread Ian Romanick
On 08/17/2016 05:00 PM, Tim Rowley wrote:
> Increases the performance of legacy geometry-heavy apps
> still using display lists.

Do you have any data or at least names of some affected apps?  This is
valuable for the next person that has to change this code... imagine
someone who needs to change this value to reduce memory usage.

> ---
>  src/mesa/vbo/vbo_save.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
> index 2843b3c..d1d7fb0 100644
> --- a/src/mesa/vbo/vbo_save.h
> +++ b/src/mesa/vbo/vbo_save.h
> @@ -96,7 +96,7 @@ struct vbo_save_vertex_list {
>   * likelyhood as it occurs.  No reason we couldn't change usage
>   * internally even though this probably isn't allowed for client VBOs?
>   */
> -#define VBO_SAVE_BUFFER_SIZE (8*1024) /* dwords */
> +#define VBO_SAVE_BUFFER_SIZE (256*1024) /* dwords */
>  #define VBO_SAVE_PRIM_SIZE   128
>  #define VBO_SAVE_PRIM_MODE_MASK 0x3f
>  #define VBO_SAVE_PRIM_WEAK  0x40
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965/cfg: Ignore non-CF instructions in unreachable blocks.

2016-08-18 Thread Iago Toral
On Wed, 2016-08-17 at 11:54 -0700, Matt Turner wrote:
> The basic block following a control flow structure like an infinite
> loop
> will be unreachable. Ignore any non-control-flow instructions in it
> since they can have no effect on the program.

If the block is unreachable control-flow instructions inside the block
are also irrelevant, is there any reason why you don't skip CF
instructions too?

> Avoids a segmentation fault in cfg_t::intersect(a, b) when called on
> an
> unreachable block. By avoiding ever putting code in an unreachable
> block, we never attempt to optimize code in an unreachable block.

Can't the problem persist if the unreachable block has any control-flow 
instructions?

> ---
>  src/mesa/drivers/dri/i965/brw_cfg.cpp | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> index 53b32be..2fa4e3d 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> @@ -312,6 +312,11 @@ cfg_t::cfg_t(exec_list *instructions)
>    break;
>  
>    default:
> + if (cur->num != 0 && cur->parents.is_empty()) {
> +ip--;
> +continue;
> + }
> +
>   cur->instructions.push_tail(inst);
>    break;
>    }
> @@ -319,6 +324,12 @@ cfg_t::cfg_t(exec_list *instructions)
>  
> cur->end_ip = ip - 1;
>  
> +   /* If the last block was empty (because it was unreachable) drop
> it. */
> +   if (cur->instructions.is_empty()) {
> +  cur->link.remove();
> +  num_blocks--;
> +   }
> +
> make_block_array();
>  }
>  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 43/95] i965/disasm: print NibCtrl for instructions with execsize 4

2016-08-18 Thread Iago Toral
On Mon, 2016-08-08 at 15:58 -0700, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > 
> > ---
> >  src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
> > b/src/mesa/drivers/dri/i965/brw_disasm.c
> > index c8bdeab..d5e9916 100644
> > --- a/src/mesa/drivers/dri/i965/brw_disasm.c
> > +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
> > @@ -1169,7 +1169,13 @@ qtr_ctrl(FILE *file, const struct
> > brw_device_info *devinfo, brw_inst *inst)
> > int qtr_ctl = brw_inst_qtr_control(devinfo, inst);
> > int exec_size = 1 << brw_inst_exec_size(devinfo, inst);
> >  
> > -   if (exec_size == 8) {
> > +   if (exec_size == 4) {
> I guess it wouldn't hurt to show this for exec_size < 4 too even
> though
> it will typically be 1N.

The PRMs say:

"NibCtrl is only allowed for SIMD4 instructions with a DF (Double
Float) source or destination type"

I don't know if the "only allowed" in that sentence means that it is
incorrect to set it to to anything else in instructions with an exec
size < 4, in which case showing it in the disassembly would help find
bugs although maybe we should just validate for that when we generate
code, or if it means that the hardware will ignore it completely, in
which case showing its value does not seem useful for anything.

What do you think?

> > 
> > +  int nib_ctl = brw_inst_nib_control(devinfo, inst);
> This may cause an assertion failure on SNB and earlier because the
> NibCtrl field doesn't exist.  You could do something along the lines
> of:
> 
> > 
> > const unsigned nib_ctl = devinfo->gen < 7 ? 0 :
> >  brw_inst_nib_control(devinfo, inst);

Ah, good to know, thanks!

> > +  if (nib_ctl == 0)
> > + string(file, " 1N");
> > +  else
> > + string(file, " 2N");
> The usual qtr_ctl field is still taken into account by the hardware
> regardless of whether you use NibCtrl, this should probably be:
> 
> > 
> > format(file, " %dN", qtr_ctl * 2 + nib_ctl + 1);

Yeah, good point.

> > +   } else if (exec_size == 8) {
> >    switch (qtr_ctl) {
> >    case 0:
> >   string(file, " 1Q");
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 42/95] i965/vec4: dump NibCtrl for instructions with execsize 4

2016-08-18 Thread Iago Toral
On Mon, 2016-08-08 at 15:30 -0700, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > 
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > index 829b7d3..88bf895 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > @@ -1580,6 +1580,9 @@
> > vec4_visitor::dump_instruction(backend_instruction *be_inst, FILE
> > *file)
> > if (inst->force_writemask_all)
> >    fprintf(file, " NoMask");
> >  
> > +   if (inst->exec_size == 4)
> > +  fprintf(file, "%s", inst->group == 0 ? " 1N" : " 2N");
> > +
> In the FS back-end we do:
> 
> > 
> >   if (inst->exec_size != dispatch_width)
> > fprintf(file, "group%d ", inst->group);
> Would it make sense to have the vec4 back-end behave the same way for
> consistency? (with dispatch_width equal to 8)

Yeah, I had noticed the difference I intended to fix it, but I forgot
about it, thanks for reminding me, we should really try to have
consistent outputs.

I have the same doubt I mentioned in the same patch for the disassembly
(patch 43) regarding whether we should show this for any exec_size != 4
though.

Iago

> > 
> > fprintf(file, "\n");
> >  }
> >  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Tapani Pälli

On 08/18/2016 01:08 PM, Iago Toral wrote:

On Thu, 2016-08-18 at 11:57 +0300, Tapani Pälli wrote:

Implementation previously used case value itself as the key, however
afterhash implementation change by ee02a5e we cannot use 0 as key.
Patch uses _mesa_hash_data to formulate a suitable key for this hash.

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
---
  src/compiler/glsl/ast_to_hir.cpp | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp
b/src/compiler/glsl/ast_to_hir.cpp
index e03a6e3..53fc4d6 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
   /* Stuff a dummy value in to allow processing to continue.
*/
   label_const = new(ctx) ir_constant(0);
} else {
+ unsigned hash_key = _mesa_hash_data(&label_const-

value.u[0],

+ sizeof(unsigned));

Shouldn't this take (void *) label_const->value[0] instead? It is the
value of the label we want to use as key, not the memory address where
it is stored, right?


This works ok because of the way _mesa_hash_data is implemented. It 
takes address of some data and then for contents found from that address 
it calculates a hash, so it'll do that for the value of the label (see 
_mesa_fnv32_1a_accumulate_block).



   ast_expression *previous_label = (ast_expression *)
   hash_table_find(state->switch_state.labels_ht,
- (void *)(uintptr_t)label_const-

value.u[0]);

+ (void *)(uintptr_t)hash_key);

Actually, you use the value here, so I guess the above was a mistake?


   if (previous_label) {
  YYLTYPE loc = this->test_value->get_location();
@@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
   } else {
  hash_table_insert(state->switch_state.labels_ht,
this->test_value,
-  (void *)(uintptr_t)label_const-

value.u[0]);

+  (void *)(uintptr_t)hash_key);
   }
}
  



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Iago Toral
On Thu, 2016-08-18 at 14:25 +0300, Tapani Pälli wrote:
> On 08/18/2016 01:08 PM, Iago Toral wrote:
> > 
> > On Thu, 2016-08-18 at 11:57 +0300, Tapani Pälli wrote:
> > > 
> > > Implementation previously used case value itself as the key,
> > > however
> > > afterhash implementation change by ee02a5e we cannot use 0 as
> > > key.
> > > Patch uses _mesa_hash_data to formulate a suitable key for this
> > > hash.
> > > 
> > > Signed-off-by: Tapani Pälli 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
> > > ---
> > >   src/compiler/glsl/ast_to_hir.cpp | 6 --
> > >   1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > > b/src/compiler/glsl/ast_to_hir.cpp
> > > index e03a6e3..53fc4d6 100644
> > > --- a/src/compiler/glsl/ast_to_hir.cpp
> > > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > > @@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list
> > > *instructions,
> > >    /* Stuff a dummy value in to allow processing to
> > > continue.
> > > */
> > >    label_const = new(ctx) ir_constant(0);
> > > } else {
> > > + unsigned hash_key = _mesa_hash_data(&label_const-
> > > > 
> > > > value.u[0],
> > > + sizeof(unsigned));
> > Shouldn't this take (void *) label_const->value[0] instead? It is
> > the
> > value of the label we want to use as key, not the memory address
> > where
> > it is stored, right?
> This works ok because of the way _mesa_hash_data is implemented. It 
> takes address of some data and then for contents found from that
> address 
> it calculates a hash, so it'll do that for the value of the label
> (see 
> _mesa_fnv32_1a_accumulate_block).

Ah right, I did not notice that. Looks good to me then:

Reviewed-by: Iago Toral Quiroga 

Iago

> > 
> > > 
> > >    ast_expression *previous_label = (ast_expression *)
> > >    hash_table_find(state->switch_state.labels_ht,
> > > - (void *)(uintptr_t)label_const-
> > > > 
> > > > value.u[0]);
> > > + (void *)(uintptr_t)hash_key);
> > Actually, you use the value here, so I guess the above was a
> > mistake?
> > 
> > > 
> > >    if (previous_label) {
> > >   YYLTYPE loc = this->test_value->get_location();
> > > @@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list
> > > *instructions,
> > >    } else {
> > >   hash_table_insert(state->switch_state.labels_ht,
> > > this->test_value,
> > > -  (void *)(uintptr_t)label_const-
> > > > 
> > > > value.u[0]);
> > > +  (void *)(uintptr_t)hash_key);
> > >    }
> > > }
> > >   
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 41/95] i965/vec4: make the generator set correct NibCtrl for SIMD4 DF instructions

2016-08-18 Thread Iago Toral
On Mon, 2016-08-08 at 15:26 -0700, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > 
> > From the HSW PRM, Command Reference, QtrCtrl:
> > 
> >    "NibCtrl is only allowed for SIMD4 instructions with a DF
> > (Double Float)
> > source or destination type."
> > 
> > v2 (Samuel): Assert that the type is DF.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > index c6e040e..dc4f6db 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > @@ -1494,6 +1494,7 @@ generate_code(struct brw_codegen *p,
> >    brw_set_default_saturate(p, inst->saturate);
> >    brw_set_default_mask_control(p, inst->force_writemask_all);
> >    brw_set_default_acc_write_control(p, inst-
> > >writes_accumulator);
> > +  brw_set_default_group(p, 0);
> > 
> >    assert(inst->base_mrf + inst->mlen <= BRW_MAX_MRF(devinfo-
> > >gen));
> >    assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
> > @@ -1529,6 +1530,14 @@ generate_code(struct brw_codegen *p,
> >    } else {
> >   assert(inst->exec_size == 8 || inst->exec_size == 4);
> >   brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
> > + if (inst->exec_size == 4 && !inst->force_writemask_all) {
> > +assert(inst->dst.type == BRW_REGISTER_TYPE_DF ||
> > +   inst->src[0].type == BRW_REGISTER_TYPE_DF ||
> > +   inst->src[1].type == BRW_REGISTER_TYPE_DF ||
> > +   inst->src[2].type == BRW_REGISTER_TYPE_DF);
> > +assert(inst->group == 0 || inst->group == 4);
> > +brw_inst_set_group(p->devinfo, p->current, inst-
> > >group);
> I think this should be unconditional (e.g. if somebody tries to set
> group == 4 on an instruction with exec_size == 8 it would be nice to
> get
> an assertion failure instead of the group value being silently
> ignored).  How about you replace the 'brw_set_default_group(p, 0)'
> line
> above with:
> 
> > 
> > 
> >   assert(inst->group % inst->exec_size == 0);
> >   assert(inst->group % 8 == 0 ||
> >  inst->dst.type == BRW_REGISTER_TYPE_DF ||
> >  inst->src[0].type == BRW_REGISTER_TYPE_DF ||
> >  inst->src[1].type == BRW_REGISTER_TYPE_DF ||
> >  inst->src[2].type == BRW_REGISTER_TYPE_DF);
> >   brw_set_default_group(p, inst->group);
> and then drop this hunk?

Yes, that looks like better. Thanks!

I suppose that setting the group won't have any effect  if
force_writemask_all is set as too so there is no need to void setting
it in that case, right?

> > 
> >    }
> >  
> >    switch (inst->opcode) {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Possible memory leak with `glxMakeCurrent`?

2016-08-18 Thread Miklós Máté

On 08/18/2016 05:08 AM, Michel Dänzer wrote:

On 18/08/16 05:39 AM, Itai wrote:

(Posted initially in mesa-users, but got no reply - the list seems dead.
Couldn't find any bug report, and sadly not well versed enough in mesa
to file one myself).

FWIW, there's no need to be versed in Mesa to file a bug report. :)



Following an investigation of a memory leak with JavaFX on some Linux
configuration, it looks like there is a possible memory leak when using
`glxMakeCurrent`.
Sadly, I myself don't know enough about OpenGL/Mesa to describe it
fully, but I'm hoping someone here can understand it well enough to make
a proper bug report.

Here is a link to the discussion on the openjfx-dex list:
http://mail.openjdk.java.net/pipermail/openjfx-dev/2016-August/019577.html

Here is a forum post describing a non-Java way to reproduce this same
issue:
http://www.gamedev.net/topic/679705-glxmakecurrent-slowly-leaks-memory/


This was possibly not an issue in older versions of Mesa, as the bug
does not appear on older Linux installations (I'm using Mesa 11.2.2,
where the bug is present)

Does it still happen with current Git master? There have been some fixes
in this area recently.

If it still happens, the output of running an affected application in
valgrind --leak-check=full (with debugging symbols available for at
least all Mesa binaries) would be useful.


I can confirm that there is a per-makecurrent leak in 11.2, but it seems 
to be fixed in 12.0.


==28934== 1,673,408 bytes in 9,508 blocks are definitely lost in loss 
record 692 of 692

==28934==at 0x402F218: calloc (vg_replace_malloc.c:711)
==28934==by 0x525B981: r600_create_surface_custom (r600_texture.c:1202)
==28934==by 0x525BC6B: r600_create_surface (r600_texture.c:1248)
==28934==by 0x4ED2A37: st_framebuffer_validate (st_manager.c:235)
==28934==by 0x4ED3F1D: st_api_make_current (st_manager.c:783)
==28934==by 0x4FB38EA: dri_make_current (dri_context.c:245)
==28934==by 0x4FB268E: driBindContext (dri_util.c:553)
==28934==by 0x4078ABD: dri2_bind_context (dri2_glx.c:160)
==28934==by 0x4050955: MakeContextCurrent (glxcurrent.c:228)
==28934==by 0x4050AF5: glXMakeCurrent (glxcurrent.c:262)
==28934==by 0x8049210: display (makecurrent.c:92)
==28934==by 0x418451B: ??? (in /usr/lib/i386-linux-gnu/libglut.so.3.9.0)

MM

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Ilia Mirkin
Can the hash value not be 0?

On Aug 18, 2016 4:57 AM, "Tapani Pälli"  wrote:

> Implementation previously used case value itself as the key, however
> afterhash implementation change by ee02a5e we cannot use 0 as key.
> Patch uses _mesa_hash_data to formulate a suitable key for this hash.
>
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_
> hir.cpp
> index e03a6e3..53fc4d6 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
>   /* Stuff a dummy value in to allow processing to continue. */
>   label_const = new(ctx) ir_constant(0);
>} else {
> + unsigned hash_key = _mesa_hash_data(&label_const->value.u[0],
> + sizeof(unsigned));
>   ast_expression *previous_label = (ast_expression *)
>   hash_table_find(state->switch_state.labels_ht,
> - (void *)(uintptr_t)label_const->value.u[0]);
> + (void *)(uintptr_t)hash_key);
>
>   if (previous_label) {
>  YYLTYPE loc = this->test_value->get_location();
> @@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
>   } else {
>  hash_table_insert(state->switch_state.labels_ht,
>this->test_value,
> -  (void *)(uintptr_t)label_const->
> value.u[0]);
> +  (void *)(uintptr_t)hash_key);
>   }
>}
>
> --
> 2.5.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix empty program log length

2016-08-18 Thread Iago Toral
On Wed, 2016-08-17 at 11:18 +0300, Tapani Pälli wrote:
> In case we have empty log (""), we should return 0. This fixes
> Khronos WebGL conformance test 'program-infolog'.
> 
> From OpenGL ES 3.1 (and OpenGL 4.5 Core) spec:
>    "If pname is INFO_LOG_LENGTH , the length of the info log,
> including
> a null terminator, is returned. If there is no info log, zero is
> returned."

Reviewed-by: Iago Toral Quiroga 

> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97321
> ---
>  src/mesa/main/shaderapi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index 4f29cd9..b2b3b0a 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -654,7 +654,8 @@ get_programiv(struct gl_context *ctx, GLuint
> program, GLenum pname,
>    *params = shProg->Validated;
>    return;
> case GL_INFO_LOG_LENGTH:
> -  *params = shProg->InfoLog ? strlen(shProg->InfoLog) + 1 : 0;
> +  *params = (shProg->InfoLog && shProg->InfoLog[0] != 0) ?
> + strlen(shProg->InfoLog) + 1 : 0;
>    return;
> case GL_ATTACHED_SHADERS:
>    *params = shProg->NumShaders;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix empty program log length

2016-08-18 Thread Tapani Pälli

On 08/18/2016 03:11 PM, Iago Toral wrote:

On Wed, 2016-08-17 at 11:18 +0300, Tapani Pälli wrote:

In case we have empty log (""), we should return 0. This fixes
Khronos WebGL conformance test 'program-infolog'.

 From OpenGL ES 3.1 (and OpenGL 4.5 Core) spec:
"If pname is INFO_LOG_LENGTH , the length of the info log,
including
 a null terminator, is returned. If there is no info log, zero is
 returned."

Reviewed-by: Iago Toral Quiroga 


Thanks for the review Iago, there's some discussion still in the bug if 
this is the right way to go so I'll be waiting for conclusion on that.



Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97321
---
  src/mesa/main/shaderapi.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 4f29cd9..b2b3b0a 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -654,7 +654,8 @@ get_programiv(struct gl_context *ctx, GLuint
program, GLenum pname,
*params = shProg->Validated;
return;
 case GL_INFO_LOG_LENGTH:
-  *params = shProg->InfoLog ? strlen(shProg->InfoLog) + 1 : 0;
+  *params = (shProg->InfoLog && shProg->InfoLog[0] != 0) ?
+ strlen(shProg->InfoLog) + 1 : 0;
return;
 case GL_ATTACHED_SHADERS:
*params = shProg->NumShaders;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] st: guard against NULL pipe_surface

2016-08-18 Thread Tobias Klausmann

I have no test case per se, but orbea (in CC) noted in IRC:

Start OpenMW with DRI3 -> Crash [1]

Start OpenMW with DRI2 -> no crash,

So i fear it is somewhere in our DRI3 path. Anyway a guard seems 
reasonable to harden release builds against this. Maybe not at this 
place but central in update_framebuffer_size().


Greetings,

Tobias


[1] https://homepages.thm.de/~tjkl80/openmw_backtrace.txt


On 18.08.2016 11:55, Marek Olšák wrote:

Hi,

Do you have a test case?

Marek

On Thu, Aug 18, 2016 at 1:44 AM, Tobias Klausmann 
> wrote:


OpenMW tries to upload a new surface (mouse pointer) which fails
in the now
guarded update_framebuffer_size() as the surface is NULL.

This is not inteded as a real "fix", as it would just hide the
immediate crash.

So if somebody could take a look at this...

Reported-by: mailto:ovarieg...@yahoo.com>>
Signed-off-by: Tobias Klausmann
mailto:tobias.johannes.klausm...@mni.thm.de>>
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c
b/src/mesa/state_tracker/st_atom_framebuffer.c
index ea41d9d..3ee4ea5 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -177,8 +177,10 @@ update_framebuffer_state( struct st_context *st )
  /* rendering to a GL texture, may have to update surface */
  st_update_renderbuffer_surface(st, strb);
   }
-  pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
-  update_framebuffer_size(framebuffer, strb->surface);
+  if (strb->surface) {
+ pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+ update_framebuffer_size(framebuffer, strb->surface);
+  }
}
else {
   strb =
st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
--
2.9.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/mesa-dev





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] gbm: wire up fence extension

2016-08-18 Thread Emil Velikov
On 16 August 2016 at 18:01, Rob Clark  wrote:
> I noticed that fence extension wasn't exposed for drm/gbm, which sort
> of defeats the purpose of using fence fd's for synchronizing rendering
> and atomic pageflip.
>
Afaict this is a slightly more complete version of Philipp's patch.

While my earlier suggestion still stands, I should have mentioned that
it's not a show stopper for this patch.
Namely: we really want a way to check and bail out on ABI mismatch
between libEGL and libgbm.

> I suppose that somewhere or other, there needs to be a similar change
> to .../state_trackers/dri/dri2.c to avoid breaking things on i965.
Kind of ... see below.


> ---
>  src/egl/drivers/dri2/platform_drm.c   | 1 +
>  src/gallium/state_trackers/dri/dri2.c | 1 +
>  src/gbm/backends/dri/gbm_dri.c| 1 +
>  src/gbm/backends/dri/gbm_driint.h | 1 +
>  4 files changed, 4 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/platform_drm.c 
> b/src/egl/drivers/dri2/platform_drm.c
> index 1ce282f..7bea8e1 100644
> --- a/src/egl/drivers/dri2/platform_drm.c
> +++ b/src/egl/drivers/dri2/platform_drm.c
> @@ -645,6 +645,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
> dri2_dpy->dri_screen = dri2_dpy->gbm_dri->screen;
> dri2_dpy->core = dri2_dpy->gbm_dri->core;
> dri2_dpy->dri2 = dri2_dpy->gbm_dri->dri2;
> +   dri2_dpy->fence = dri2_dpy->gbm_dri->fence;
Nice catch. Philipp's patch was missing this hunk, thus one could not
have really used the fence extension from EGL.

> dri2_dpy->image = dri2_dpy->gbm_dri->image;
> dri2_dpy->flush = dri2_dpy->gbm_dri->flush;
> dri2_dpy->swrast = dri2_dpy->gbm_dri->swrast;
> diff --git a/src/gallium/state_trackers/dri/dri2.c 
> b/src/gallium/state_trackers/dri/dri2.c
> index 75d740b..8ad8701 100644
> --- a/src/gallium/state_trackers/dri/dri2.c
> +++ b/src/gallium/state_trackers/dri/dri2.c
> @@ -2042,6 +2042,7 @@ const __DRIextension *galliumdrm_driver_extensions[] = {
>  &driImageDriverExtension.base,
>  &driDRI2Extension.base,
>  &gallium_config_options.base,
> +&dri2FenceExtension.base,
While things are a bit confusing I think this shouldn't be here, and
thus there's no 'missing' i965 hunk.

galliumdrm_driver_extensions are the "core" extensions which among
other are responsible for:
 - setting up the screen (which itself sets up the screen extensions)
 - query screen extensions

As a simple test - both gallium and i965 already expose the fence
extension and it seems to be working fine afaict.

Regards,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] gbm: wire up fence extension

2016-08-18 Thread Rob Clark
On Thu, Aug 18, 2016 at 8:45 AM, Emil Velikov  wrote:
> On 16 August 2016 at 18:01, Rob Clark  wrote:
>> I noticed that fence extension wasn't exposed for drm/gbm, which sort
>> of defeats the purpose of using fence fd's for synchronizing rendering
>> and atomic pageflip.
>>
> Afaict this is a slightly more complete version of Philipp's patch.
>
> While my earlier suggestion still stands, I should have mentioned that
> it's not a show stopper for this patch.
> Namely: we really want a way to check and bail out on ABI mismatch
> between libEGL and libgbm.
>
>> I suppose that somewhere or other, there needs to be a similar change
>> to .../state_trackers/dri/dri2.c to avoid breaking things on i965.
> Kind of ... see below.
>
>
>> ---
>>  src/egl/drivers/dri2/platform_drm.c   | 1 +
>>  src/gallium/state_trackers/dri/dri2.c | 1 +
>>  src/gbm/backends/dri/gbm_dri.c| 1 +
>>  src/gbm/backends/dri/gbm_driint.h | 1 +
>>  4 files changed, 4 insertions(+)
>>
>> diff --git a/src/egl/drivers/dri2/platform_drm.c 
>> b/src/egl/drivers/dri2/platform_drm.c
>> index 1ce282f..7bea8e1 100644
>> --- a/src/egl/drivers/dri2/platform_drm.c
>> +++ b/src/egl/drivers/dri2/platform_drm.c
>> @@ -645,6 +645,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
>> dri2_dpy->dri_screen = dri2_dpy->gbm_dri->screen;
>> dri2_dpy->core = dri2_dpy->gbm_dri->core;
>> dri2_dpy->dri2 = dri2_dpy->gbm_dri->dri2;
>> +   dri2_dpy->fence = dri2_dpy->gbm_dri->fence;
> Nice catch. Philipp's patch was missing this hunk, thus one could not
> have really used the fence extension from EGL.
>
>> dri2_dpy->image = dri2_dpy->gbm_dri->image;
>> dri2_dpy->flush = dri2_dpy->gbm_dri->flush;
>> dri2_dpy->swrast = dri2_dpy->gbm_dri->swrast;
>> diff --git a/src/gallium/state_trackers/dri/dri2.c 
>> b/src/gallium/state_trackers/dri/dri2.c
>> index 75d740b..8ad8701 100644
>> --- a/src/gallium/state_trackers/dri/dri2.c
>> +++ b/src/gallium/state_trackers/dri/dri2.c
>> @@ -2042,6 +2042,7 @@ const __DRIextension *galliumdrm_driver_extensions[] = 
>> {
>>  &driImageDriverExtension.base,
>>  &driDRI2Extension.base,
>>  &gallium_config_options.base,
>> +&dri2FenceExtension.base,
> While things are a bit confusing I think this shouldn't be here, and
> thus there's no 'missing' i965 hunk.
>
> galliumdrm_driver_extensions are the "core" extensions which among
> other are responsible for:
>  - setting up the screen (which itself sets up the screen extensions)
>  - query screen extensions
>
> As a simple test - both gallium and i965 already expose the fence
> extension and it seems to be working fine afaict.

Ok, in that case I think we need this hunk squashed in (but untested so far):


diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index d4602d1..3cd8783 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -244,13 +244,13 @@ struct dri_extension_match {
 static struct dri_extension_match dri_core_extensions[] = {
{ __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
{ __DRI_IMAGE, 1, offsetof(struct gbm_dri_device, image) },
+   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
{ NULL, 0, 0 }
 };

 static struct dri_extension_match gbm_dri_device_extensions[] = {
{ __DRI_CORE, 1, offsetof(struct gbm_dri_device, core) },
{ __DRI_DRI2, 1, offsetof(struct gbm_dri_device, dri2) },
-   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
{ NULL, 0, 0 }
 };
 

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] glsl: fix MinGW initializer warning in ast_declarator_list::hir()

2016-08-18 Thread Brian Paul

On 08/18/2016 04:25 AM, Emil Velikov wrote:

On 17 August 2016 at 15:39, Brian Paul  wrote:

We need an extra set of braces for an array inside a struct.
---
  src/compiler/glsl/ast_to_hir.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index e03a6e3..abab932 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -4731,7 +4731,7 @@ ast_declarator_list::hir(exec_list *instructions,
if ((var->data.mode == ir_var_auto || var->data.mode == 
ir_var_temporary)
&& (var->type->is_numeric() || var->type->is_boolean())
&& state->zero_init) {
- const ir_constant_data data = {0};
+ const ir_constant_data data = {{0}};

Different versions of GCC trigger warning on each one of these. The
former is the correct one according to the spec, yet memset is the
most robust (warnings free) option.

Humble suggestion: opt to the latter or expect someone to unknowingly
revert this.


I'll just let it go then.

-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 24/32] i965/blorp/gen6: Use genxml packing structs for state setup

2016-08-18 Thread Pohjolainen, Topi
On Thu, Aug 11, 2016 at 02:15:21PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/Makefile.am  |   1 +
>  src/mesa/drivers/dri/i965/gen6_blorp.c | 685 
> -
>  2 files changed, 256 insertions(+), 430 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.am 
> b/src/mesa/drivers/dri/i965/Makefile.am
> index 0a5222e..77ad1e8 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.am
> +++ b/src/mesa/drivers/dri/i965/Makefile.am
> @@ -36,6 +36,7 @@ AM_CFLAGS = \
>   -I$(top_srcdir)/src/compiler/nir \
>   -I$(top_srcdir)/src/intel \
>   -I$(top_builddir)/src/compiler/nir \
> + -I$(top_builddir)/src/intel \
>   -I$(top_builddir)/src/mesa/drivers/dri/common \
>   $(DEFINES) \
>   $(VISIBILITY_CFLAGS) \
> diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c 
> b/src/mesa/drivers/dri/i965/gen6_blorp.c
> index 78e9472..875fdc9 100644
> --- a/src/mesa/drivers/dri/i965/gen6_blorp.c
> +++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
> @@ -27,469 +27,245 @@
>  #include "intel_mipmap_tree.h"
>  
>  #include "brw_context.h"
> -#include "brw_defines.h"
>  #include "brw_state.h"
>  
>  #include "blorp_priv.h"
> -#include "vbo/vbo.h"
> -#include "brw_draw.h"
>  
> -/* 3DSTATE_URB
> - *
> - * Assign the entire URB to the VS. Even though the VS disabled, URB space
> - * is still needed because the clipper loads the VUE's from the URB. From
> - * the Sandybridge PRM, Volume 2, Part 1, Section 3DSTATE,
> - * Dword 1.15:0 "VS Number of URB Entries":
> - * This field is always used (even if VS Function Enable is DISABLED).
> - *
> - * The warning below appears in the PRM (Section 3DSTATE_URB), but we can
> - * safely ignore it because this batch contains only one draw call.
> - * Because of URB corruption caused by allocating a previous GS unit
> - * URB entry to the VS unit, software is required to send a ???GS NULL
> - * Fence??? (Send URB fence with VS URB size == 1 and GS URB size == 0)
> - * plus a dummy DRAW call before any case where VS will be taking over
> - * GS URB space.
> - */

In addition to changing the actual patch formatting you are dropping almost
all the comments giving the rational for the actual values. For example, here.
I at least found them quite valuable when I was learning blorp.

Just a few similar comments and tiny nits. I tried to carefully compare the
original logic side-by-side with the new and couldn't see anything dropped or
added:

Reviewed-by: Topi Pohjolainen 

> -static void
> -gen6_blorp_emit_urb_config(struct brw_context *brw,
> -   const struct brw_blorp_params *params)
> -{
> -   BEGIN_BATCH(3);
> -   OUT_BATCH(_3DSTATE_URB << 16 | (3 - 2));
> -   OUT_BATCH(brw->urb.max_vs_entries << GEN6_URB_VS_ENTRIES_SHIFT);
> -   OUT_BATCH(0);
> -   ADVANCE_BATCH();
> -}
> +#define GEN_VERSIONx10 60
> +#include "genxml/gen_macros.h"
>  
> -
> -/* 3DSTATE_CC_STATE_POINTERS
> - *
> - * The pointer offsets are relative to
> - * CMD_STATE_BASE_ADDRESS.DynamicStateBaseAddress.
> - *
> - * The HiZ op doesn't use BLEND_STATE or COLOR_CALC_STATE.

And here.

> - */
> -static void
> -gen6_blorp_emit_cc_state_pointers(struct brw_context *brw,
> -  const struct brw_blorp_params *params,
> -  uint32_t cc_blend_state_offset,
> -  uint32_t depthstencil_offset,
> -  uint32_t cc_state_offset)
> +static void *
> +blorp_emit_dwords(struct brw_context *brw, unsigned n)
>  {
> -   BEGIN_BATCH(4);
> -   OUT_BATCH(_3DSTATE_CC_STATE_POINTERS << 16 | (4 - 2));
> -   OUT_BATCH(cc_blend_state_offset | 1); /* BLEND_STATE offset */
> -   OUT_BATCH(depthstencil_offset | 1); /* DEPTH_STENCIL_STATE offset */
> -   OUT_BATCH(cc_state_offset | 1); /* COLOR_CALC_STATE offset */
> -   ADVANCE_BATCH();
> +   intel_batchbuffer_begin(brw, n, RENDER_RING);
> +   uint32_t *map = brw->batch.map_next;
> +   brw->batch.map_next += n;
> +   intel_batchbuffer_advance(brw);
> +   return map;
>  }
>  
> +struct blorp_address {
> +   drm_intel_bo *buffer;
> +   uint32_t read_domains;
> +   uint32_t write_domain;
> +   uint32_t offset;
> +};
>  
> -/**
> - * 3DSTATE_SAMPLER_STATE_POINTERS.  See upload_sampler_state_pointers().
> - */
> -static void
> -gen6_blorp_emit_sampler_state_pointers(struct brw_context *brw,
> -   uint32_t sampler_offset)
> +static uint64_t
> +blorp_emit_reloc(struct brw_context *brw, void *location,
> + struct blorp_address address, uint32_t delta)
>  {
> -   BEGIN_BATCH(4);
> -   OUT_BATCH(_3DSTATE_SAMPLER_STATE_POINTERS << 16 |
> - VS_SAMPLER_STATE_CHANGE |
> - GS_SAMPLER_STATE_CHANGE |
> - PS_SAMPLER_STATE_CHANGE |
> - (4 - 2));
> -   OUT_BATCH(0); /* VS */
> -   OUT_BATCH(0); /* GS */
> -   OUT_BATCH(sampler_offset);
> -   ADVANCE_BATCH();
> +   uint32_t offset = (char *)locat

Re: [Mesa-dev] [RFC] gbm: wire up fence extension

2016-08-18 Thread Emil Velikov
On 18 August 2016 at 14:12, Rob Clark  wrote:
> On Thu, Aug 18, 2016 at 8:45 AM, Emil Velikov  
> wrote:
>> On 16 August 2016 at 18:01, Rob Clark  wrote:
>>> I noticed that fence extension wasn't exposed for drm/gbm, which sort
>>> of defeats the purpose of using fence fd's for synchronizing rendering
>>> and atomic pageflip.
>>>
>> Afaict this is a slightly more complete version of Philipp's patch.
>>
>> While my earlier suggestion still stands, I should have mentioned that
>> it's not a show stopper for this patch.
>> Namely: we really want a way to check and bail out on ABI mismatch
>> between libEGL and libgbm.
>>
>>> I suppose that somewhere or other, there needs to be a similar change
>>> to .../state_trackers/dri/dri2.c to avoid breaking things on i965.
>> Kind of ... see below.
>>
>>
>>> ---
>>>  src/egl/drivers/dri2/platform_drm.c   | 1 +
>>>  src/gallium/state_trackers/dri/dri2.c | 1 +
>>>  src/gbm/backends/dri/gbm_dri.c| 1 +
>>>  src/gbm/backends/dri/gbm_driint.h | 1 +
>>>  4 files changed, 4 insertions(+)
>>>
>>> diff --git a/src/egl/drivers/dri2/platform_drm.c 
>>> b/src/egl/drivers/dri2/platform_drm.c
>>> index 1ce282f..7bea8e1 100644
>>> --- a/src/egl/drivers/dri2/platform_drm.c
>>> +++ b/src/egl/drivers/dri2/platform_drm.c
>>> @@ -645,6 +645,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
>>> dri2_dpy->dri_screen = dri2_dpy->gbm_dri->screen;
>>> dri2_dpy->core = dri2_dpy->gbm_dri->core;
>>> dri2_dpy->dri2 = dri2_dpy->gbm_dri->dri2;
>>> +   dri2_dpy->fence = dri2_dpy->gbm_dri->fence;
>> Nice catch. Philipp's patch was missing this hunk, thus one could not
>> have really used the fence extension from EGL.
>>
>>> dri2_dpy->image = dri2_dpy->gbm_dri->image;
>>> dri2_dpy->flush = dri2_dpy->gbm_dri->flush;
>>> dri2_dpy->swrast = dri2_dpy->gbm_dri->swrast;
>>> diff --git a/src/gallium/state_trackers/dri/dri2.c 
>>> b/src/gallium/state_trackers/dri/dri2.c
>>> index 75d740b..8ad8701 100644
>>> --- a/src/gallium/state_trackers/dri/dri2.c
>>> +++ b/src/gallium/state_trackers/dri/dri2.c
>>> @@ -2042,6 +2042,7 @@ const __DRIextension *galliumdrm_driver_extensions[] 
>>> = {
>>>  &driImageDriverExtension.base,
>>>  &driDRI2Extension.base,
>>>  &gallium_config_options.base,
>>> +&dri2FenceExtension.base,
>> While things are a bit confusing I think this shouldn't be here, and
>> thus there's no 'missing' i965 hunk.
>>
>> galliumdrm_driver_extensions are the "core" extensions which among
>> other are responsible for:
>>  - setting up the screen (which itself sets up the screen extensions)
>>  - query screen extensions
>>
>> As a simple test - both gallium and i965 already expose the fence
>> extension and it seems to be working fine afaict.
>
> Ok, in that case I think we need this hunk squashed in (but untested so far):
>
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index d4602d1..3cd8783 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -244,13 +244,13 @@ struct dri_extension_match {
>  static struct dri_extension_match dri_core_extensions[] = {
> { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
> { __DRI_IMAGE, 1, offsetof(struct gbm_dri_device, image) },
> +   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
> { NULL, 0, 0 }
>  };
>
>  static struct dri_extension_match gbm_dri_device_extensions[] = {
> { __DRI_CORE, 1, offsetof(struct gbm_dri_device, core) },
> { __DRI_DRI2, 1, offsetof(struct gbm_dri_device, dri2) },
> -   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
> { NULL, 0, 0 }
>  };
>  
>
Yes you're correct - we'll need to expose it (why did I read
DRI2_FLUSH as DRI2_FENCE?).

Adding it to the array makes it compulsory, due to the nature of
dri_bind_extensions(). We could do that but it'll break all the
classic drivers but i965.

Something like the egl handling would be better:

   for (i = 0; extensions[i]; i++) {
  if (strcmp(extensions[i]->name, __DRI2_FENCE) == 0)
 dri2_dpy->fence = (__DRI2fenceExtension *) extensions[i];
   }


-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Ilia Mirkin
Since it *is* a hash, you could just use a much simpler hash function like

int hash(int val) {
  int ret = val + 1000;
  if (ret == 0) ret = 1;
  return ret;
}

That way everything will map to a unique integer, except for -1000
which will map to the same thing as -999. I posit that this should be
rare, and hash collisions are perfectly fine in any case (it's not
like we have 1<<32 buckets anyways). This also avoids the zero hash
value case.

  -ilia


On Thu, Aug 18, 2016 at 7:52 AM, Ilia Mirkin  wrote:
> Can the hash value not be 0?
>
>
> On Aug 18, 2016 4:57 AM, "Tapani Pälli"  wrote:
>>
>> Implementation previously used case value itself as the key, however
>> afterhash implementation change by ee02a5e we cannot use 0 as key.
>> Patch uses _mesa_hash_data to formulate a suitable key for this hash.
>>
>> Signed-off-by: Tapani Pälli 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
>> ---
>>  src/compiler/glsl/ast_to_hir.cpp | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/glsl/ast_to_hir.cpp
>> b/src/compiler/glsl/ast_to_hir.cpp
>> index e03a6e3..53fc4d6 100644
>> --- a/src/compiler/glsl/ast_to_hir.cpp
>> +++ b/src/compiler/glsl/ast_to_hir.cpp
>> @@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
>>   /* Stuff a dummy value in to allow processing to continue. */
>>   label_const = new(ctx) ir_constant(0);
>>} else {
>> + unsigned hash_key = _mesa_hash_data(&label_const->value.u[0],
>> + sizeof(unsigned));
>>   ast_expression *previous_label = (ast_expression *)
>>   hash_table_find(state->switch_state.labels_ht,
>> - (void *)(uintptr_t)label_const->value.u[0]);
>> + (void *)(uintptr_t)hash_key);
>>
>>   if (previous_label) {
>>  YYLTYPE loc = this->test_value->get_location();
>> @@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
>>   } else {
>>  hash_table_insert(state->switch_state.labels_ht,
>>this->test_value,
>> -  (void
>> *)(uintptr_t)label_const->value.u[0]);
>> +  (void *)(uintptr_t)hash_key);
>>   }
>>}
>>
>> --
>> 2.5.5
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] gbm: wire up fence extension

2016-08-18 Thread Rob Clark
On Thu, Aug 18, 2016 at 9:28 AM, Emil Velikov  wrote:
> On 18 August 2016 at 14:12, Rob Clark  wrote:
>> On Thu, Aug 18, 2016 at 8:45 AM, Emil Velikov  
>> wrote:
>>> On 16 August 2016 at 18:01, Rob Clark  wrote:
 I noticed that fence extension wasn't exposed for drm/gbm, which sort
 of defeats the purpose of using fence fd's for synchronizing rendering
 and atomic pageflip.

>>> Afaict this is a slightly more complete version of Philipp's patch.
>>>
>>> While my earlier suggestion still stands, I should have mentioned that
>>> it's not a show stopper for this patch.
>>> Namely: we really want a way to check and bail out on ABI mismatch
>>> between libEGL and libgbm.
>>>
 I suppose that somewhere or other, there needs to be a similar change
 to .../state_trackers/dri/dri2.c to avoid breaking things on i965.
>>> Kind of ... see below.
>>>
>>>
 ---
  src/egl/drivers/dri2/platform_drm.c   | 1 +
  src/gallium/state_trackers/dri/dri2.c | 1 +
  src/gbm/backends/dri/gbm_dri.c| 1 +
  src/gbm/backends/dri/gbm_driint.h | 1 +
  4 files changed, 4 insertions(+)

 diff --git a/src/egl/drivers/dri2/platform_drm.c 
 b/src/egl/drivers/dri2/platform_drm.c
 index 1ce282f..7bea8e1 100644
 --- a/src/egl/drivers/dri2/platform_drm.c
 +++ b/src/egl/drivers/dri2/platform_drm.c
 @@ -645,6 +645,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
 dri2_dpy->dri_screen = dri2_dpy->gbm_dri->screen;
 dri2_dpy->core = dri2_dpy->gbm_dri->core;
 dri2_dpy->dri2 = dri2_dpy->gbm_dri->dri2;
 +   dri2_dpy->fence = dri2_dpy->gbm_dri->fence;
>>> Nice catch. Philipp's patch was missing this hunk, thus one could not
>>> have really used the fence extension from EGL.
>>>
 dri2_dpy->image = dri2_dpy->gbm_dri->image;
 dri2_dpy->flush = dri2_dpy->gbm_dri->flush;
 dri2_dpy->swrast = dri2_dpy->gbm_dri->swrast;
 diff --git a/src/gallium/state_trackers/dri/dri2.c 
 b/src/gallium/state_trackers/dri/dri2.c
 index 75d740b..8ad8701 100644
 --- a/src/gallium/state_trackers/dri/dri2.c
 +++ b/src/gallium/state_trackers/dri/dri2.c
 @@ -2042,6 +2042,7 @@ const __DRIextension *galliumdrm_driver_extensions[] 
 = {
  &driImageDriverExtension.base,
  &driDRI2Extension.base,
  &gallium_config_options.base,
 +&dri2FenceExtension.base,
>>> While things are a bit confusing I think this shouldn't be here, and
>>> thus there's no 'missing' i965 hunk.
>>>
>>> galliumdrm_driver_extensions are the "core" extensions which among
>>> other are responsible for:
>>>  - setting up the screen (which itself sets up the screen extensions)
>>>  - query screen extensions
>>>
>>> As a simple test - both gallium and i965 already expose the fence
>>> extension and it seems to be working fine afaict.
>>
>> Ok, in that case I think we need this hunk squashed in (but untested so far):
>>
>> 
>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>> index d4602d1..3cd8783 100644
>> --- a/src/gbm/backends/dri/gbm_dri.c
>> +++ b/src/gbm/backends/dri/gbm_dri.c
>> @@ -244,13 +244,13 @@ struct dri_extension_match {
>>  static struct dri_extension_match dri_core_extensions[] = {
>> { __DRI2_FLUSH, 1, offsetof(struct gbm_dri_device, flush) },
>> { __DRI_IMAGE, 1, offsetof(struct gbm_dri_device, image) },
>> +   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
>> { NULL, 0, 0 }
>>  };
>>
>>  static struct dri_extension_match gbm_dri_device_extensions[] = {
>> { __DRI_CORE, 1, offsetof(struct gbm_dri_device, core) },
>> { __DRI_DRI2, 1, offsetof(struct gbm_dri_device, dri2) },
>> -   { __DRI2_FENCE, 2, offsetof(struct gbm_dri_device, fence) },
>> { NULL, 0, 0 }
>>  };
>>  
>>
> Yes you're correct - we'll need to expose it (why did I read
> DRI2_FLUSH as DRI2_FENCE?).
>
> Adding it to the array makes it compulsory, due to the nature of
> dri_bind_extensions(). We could do that but it'll break all the
> classic drivers but i965.
>
> Something like the egl handling would be better:
>
>for (i = 0; extensions[i]; i++) {
>   if (strcmp(extensions[i]->name, __DRI2_FENCE) == 0)
>  dri2_dpy->fence = (__DRI2fenceExtension *) extensions[i];
>}

or maybe just add a 'bool optional' flag in dri_extension_match table?

BR,
-R

>
> -Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 26/32] i965/blorp: Add genxml-based dynamic state emit functions

2016-08-18 Thread Pohjolainen, Topi
On Thu, Aug 11, 2016 at 02:15:23PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 82 
> +
>  1 file changed, 73 insertions(+), 9 deletions(-)

Compared to existing and didn't see anything added or removed:

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 0652930..e512b95 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -240,6 +240,70 @@ blorp_emit_depth_stencil_config(struct brw_context *brw,
> blorp_emit(brw, GENX(3DSTATE_STENCIL_BUFFER), sb);
>  }
>  
> +static uint32_t
> +blorp_emit_blend_state(struct brw_context *brw,
> +   const struct brw_blorp_params *params)
> +{
> +   struct GENX(BLEND_STATE) blend;
> +   memset(&blend, 0, sizeof(blend));
> +
> +   for (unsigned i = 0; i < params->num_draw_buffers; ++i) {
> +  blend.Entry[i].PreBlendColorClampEnable = true;
> +  blend.Entry[i].PostBlendColorClampEnable = true;
> +  blend.Entry[i].ColorClampRange = COLORCLAMP_RTFORMAT;
> +
> +  blend.Entry[i].WriteDisableRed = params->color_write_disable[0];
> +  blend.Entry[i].WriteDisableGreen = params->color_write_disable[1];
> +  blend.Entry[i].WriteDisableBlue = params->color_write_disable[2];
> +  blend.Entry[i].WriteDisableAlpha = params->color_write_disable[3];
> +   }
> +
> +   uint32_t offset;
> +   void *state = brw_state_batch(brw, AUB_TRACE_BLEND_STATE,
> + GENX(BLEND_STATE_length) * 4, 64, &offset);
> +   GENX(BLEND_STATE_pack)(NULL, state, &blend);
> +
> +   return offset;
> +}
> +
> +static uint32_t
> +blorp_emit_color_calc_state(struct brw_context *brw,
> +const struct brw_blorp_params *params)
> +{
> +   uint32_t offset;
> +   void *state = brw_state_batch(brw, AUB_TRACE_CC_STATE,
> + GENX(COLOR_CALC_STATE_length) * 4, 64, 
> &offset);
> +   memset(state, 0, GENX(COLOR_CALC_STATE_length) * 4);
> +
> +   return offset;
> +}
> +
> +static uint32_t
> +blorp_emit_depth_stencil_state(struct brw_context *brw,
> +   const struct brw_blorp_params *params)
> +{
> +   /* See the following sections of the Sandy Bridge PRM, Volume 1, Part2:
> +*   - 7.5.3.1 Depth Buffer Clear
> +*   - 7.5.3.2 Depth Buffer Resolve
> +*   - 7.5.3.3 Hierarchical Depth Buffer Resolve
> +*/
> +   struct GENX(DEPTH_STENCIL_STATE) ds = {
> +  .DepthBufferWriteEnable = true,
> +   };
> +
> +   if (params->hiz_op == GEN6_HIZ_OP_DEPTH_RESOLVE) {
> +  ds.DepthTestEnable = true;
> +  ds.DepthTestFunction = COMPAREFUNCTION_NEVER;
> +   }
> +
> +   uint32_t offset;
> +   void *state = brw_state_batch(brw, AUB_TRACE_DEPTH_STENCIL_STATE,
> + GENX(DEPTH_STENCIL_STATE_length) * 4, 64,
> + &offset);
> +   GENX(DEPTH_STENCIL_STATE_pack)(NULL, state, &ds);
> +
> +   return offset;
> +}
>  
>  /* 3DSTATE_VIEWPORT_STATE_POINTERS */
>  static void
> @@ -278,9 +342,9 @@ void
>  genX(blorp_exec)(struct brw_context *brw,
>   const struct brw_blorp_params *params)
>  {
> -   uint32_t cc_blend_state_offset = 0;
> -   uint32_t cc_state_offset = 0;
> -   uint32_t depthstencil_offset;
> +   uint32_t blend_state_offset = 0;
> +   uint32_t color_calc_state_offset = 0;
> +   uint32_t depth_stencil_state_offset;
> uint32_t wm_bind_bo_offset = 0;
>  
> /* Emit workaround flushes when we switch from drawing to blorping. */
> @@ -295,18 +359,18 @@ genX(blorp_exec)(struct brw_context *brw,
> }
>  
> if (params->wm_prog_data) {
> -  cc_blend_state_offset = gen6_blorp_emit_blend_state(brw, params);
> -  cc_state_offset = gen6_blorp_emit_cc_state(brw);
> +  blend_state_offset = blorp_emit_blend_state(brw, params);
> +  color_calc_state_offset = blorp_emit_color_calc_state(brw, params);
> }
> -   depthstencil_offset = gen6_blorp_emit_depth_stencil_state(brw, params);
> +   depth_stencil_state_offset = blorp_emit_depth_stencil_state(brw, params);
>  
> blorp_emit(brw, GENX(3DSTATE_CC_STATE_POINTERS), cc) {
>cc.BLEND_STATEChange = true;
>cc.COLOR_CALC_STATEChange = true;
>cc.DEPTH_STENCIL_STATEChange = true;
> -  cc.PointertoBLEND_STATE = cc_blend_state_offset;
> -  cc.PointertoCOLOR_CALC_STATE = cc_state_offset;
> -  cc.PointertoDEPTH_STENCIL_STATE = depthstencil_offset;
> +  cc.PointertoBLEND_STATE = blend_state_offset;
> +  cc.PointertoCOLOR_CALC_STATE = color_calc_state_offset;
> +  cc.PointertoDEPTH_STENCIL_STATE = depth_stencil_state_offset;
> }
>  
> blorp_emit(brw, GENX(3DSTATE_CONSTANT_VS), vs);
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/m

[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

Bug ID: 97393
   Summary: Wrong result of S3TC DXT3 texture rendering
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/swr
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: b7.10110...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 125875
  --> https://bugs.freedesktop.org/attachment.cgi?id=125875&action=edit
Test case

The attached test case works correctly when used with e.g. i965 driver, but
gives wrong result with LIBGL_ALWAYS_SOFTWARE=1, regardless of what swrast
driver is in use: llvmpipe, softpipe and non-gallium SW renderers all behave
the same wrong way.

I've tried installing libtxc_dxtn, but it didn't change anything.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #1 from Ruslan Kabatsayev  ---
Created attachment 125876
  --> https://bugs.freedesktop.org/attachment.cgi?id=125876&action=edit
Screenshot

On this screenshot LHS is the swrast result, and RHS is i965. The former is
wrong, the latter is correct.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] anv: remove dummy VK_DEBUG_MARKER_EXT entry points

2016-08-18 Thread Emil Velikov
On 27 July 2016 at 16:19, Jason Ekstrand  wrote:
> Thanks. We only had that because at one point we weren't using the loader
> and dota2 needed it. No longer an issue...
>
Damien, are you still keeping track of oddly parsed emails by patchwork ?
Jason's reply seems to have confused patchwork to change the author
[1]. It also detected 5 revisions of this series, although I could be
the one to "blame" there ;-)

Related: all the /series pages seem to be unhappy if you're navigating
through history (go back/forward). Is that intentional - can I suggest
opting for the /patches approach ?

Thanks
Emil

[1] https://patchwork.freedesktop.org/patch/101314/
[2] https://patchwork.freedesktop.org/series/10320/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

Ilia Mirkin  changed:

   What|Removed |Added

  Component|Drivers/Gallium/swr |Mesa core

--- Comment #2 from Ilia Mirkin  ---
[swr is for the literal 'swr' driver, not all swrast, reassigning to core]

Sounds like your test-case is doing linear interp, and that's not working in
swrast for some reason? Haven't looked at the actual test-case yet.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 27/32] i965/blorp: Add genxml-based sampler state emit function

2016-08-18 Thread Pohjolainen, Topi
On Thu, Aug 11, 2016 at 02:15:24PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 49 
> ++---
>  1 file changed, 38 insertions(+), 11 deletions(-)

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index e512b95..156d8ac 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -305,6 +305,42 @@ blorp_emit_depth_stencil_state(struct brw_context *brw,
> return offset;
>  }
>  
> +static void
> +blorp_emit_sampler_state(struct brw_context *brw,
> + const struct brw_blorp_params *params)
> +{
> +   struct GENX(SAMPLER_STATE) sampler = {
> +  .MipModeFilter = MIPFILTER_NONE,
> +  .MagModeFilter = MAPFILTER_LINEAR,
> +  .MinModeFilter = MAPFILTER_LINEAR,
> +  .MinLOD = 0,
> +  .MaxLOD = 0,
> +  .TCXAddressControlMode = TCM_CLAMP,
> +  .TCYAddressControlMode = TCM_CLAMP,
> +  .TCZAddressControlMode = TCM_CLAMP,
> +  .MaximumAnisotropy = RATIO21,
> +  .RAddressMinFilterRoundingEnable = true,
> +  .RAddressMagFilterRoundingEnable = true,
> +  .VAddressMinFilterRoundingEnable = true,
> +  .VAddressMagFilterRoundingEnable = true,
> +  .UAddressMinFilterRoundingEnable = true,
> +  .UAddressMagFilterRoundingEnable = true,
> +  .NonnormalizedCoordinateEnable = true,
> +   };
> +
> +   uint32_t offset;
> +   void *state = brw_state_batch(brw, AUB_TRACE_SAMPLER_STATE,
> + GENX(SAMPLER_STATE_length) * 4, 32, 
> &offset);
> +   GENX(SAMPLER_STATE_pack)(NULL, state, &sampler);
> +
> +   blorp_emit(brw, GENX(3DSTATE_SAMPLER_STATE_POINTERS), ssp) {
> +  ssp.VSSamplerStateChange = true;
> +  ssp.GSSamplerStateChange = true;
> +  ssp.PSSamplerStateChange = true;
> +  ssp.PointertoPSSamplerState = offset;
> +   }
> +}
> +
>  /* 3DSTATE_VIEWPORT_STATE_POINTERS */
>  static void
>  blorp_emit_viewport_state(struct brw_context *brw,
> @@ -401,17 +437,8 @@ genX(blorp_exec)(struct brw_context *brw,
>}
> }
>  
> -   if (params->src.bo) {
> -  const uint32_t sampler_offset =
> - gen6_blorp_emit_sampler_state(brw, MAPFILTER_LINEAR, 0, true);
> -
> -  blorp_emit(brw, GENX(3DSTATE_SAMPLER_STATE_POINTERS), ssp) {
> - ssp.VSSamplerStateChange = true;
> - ssp.GSSamplerStateChange = true;
> - ssp.PSSamplerStateChange = true;
> - ssp.PointertoPSSamplerState = sampler_offset;
> -  }
> -   }
> +   if (params->src.bo)
> +  blorp_emit_sampler_state(brw, params);
>  
> gen6_emit_3dstate_multisample(brw, params->dst.surf.samples);
>  
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 28/32] i965/blorp: Add a helper for emitting surface states

2016-08-18 Thread Pohjolainen, Topi
On Thu, Aug 11, 2016 at 02:15:25PM -0700, Jason Ekstrand wrote:
> The new helper emits surface states and the binding table in one go.  It's
> nice to have it pulled out of the main blorp_exec function.
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 52 
> -

Reviewed-by: Topi Pohjolainen 

>  1 file changed, 28 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 156d8ac..922da45 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -306,6 +306,32 @@ blorp_emit_depth_stencil_state(struct brw_context *brw,
>  }
>  
>  static void
> +blorp_emit_surface_states(struct brw_context *brw,
> +  const struct brw_blorp_params *params)
> +{
> +   uint32_t bind_offset;
> +   uint32_t *bind =
> +  brw_state_batch(brw, AUB_TRACE_BINDING_TABLE,
> +  sizeof(uint32_t) * BRW_BLORP_NUM_BINDING_TABLE_ENTRIES,
> +  32, /* alignment */ &bind_offset);
> +
> +   bind[BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX] =
> +  brw_blorp_emit_surface_state(brw, ¶ms->dst,
> +   I915_GEM_DOMAIN_RENDER,
> +   I915_GEM_DOMAIN_RENDER, true);
> +   if (params->src.bo) {
> +  bind[BRW_BLORP_TEXTURE_BINDING_TABLE_INDEX] =
> + brw_blorp_emit_surface_state(brw, ¶ms->src,
> +  I915_GEM_DOMAIN_SAMPLER, 0, false);
> +   }
> +
> +   blorp_emit(brw, GENX(3DSTATE_BINDING_TABLE_POINTERS), bt) {
> +  bt.PSBindingTableChange = true;
> +  bt.PointertoPSBindingTable = bind_offset;
> +   }
> +}
> +
> +static void
>  blorp_emit_sampler_state(struct brw_context *brw,
>   const struct brw_blorp_params *params)
>  {
> @@ -381,7 +407,6 @@ genX(blorp_exec)(struct brw_context *brw,
> uint32_t blend_state_offset = 0;
> uint32_t color_calc_state_offset = 0;
> uint32_t depth_stencil_state_offset;
> -   uint32_t wm_bind_bo_offset = 0;
>  
> /* Emit workaround flushes when we switch from drawing to blorping. */
> brw_emit_post_sync_nonzero_flush(brw);
> @@ -413,29 +438,8 @@ genX(blorp_exec)(struct brw_context *brw,
> blorp_emit(brw, GENX(3DSTATE_CONSTANT_GS), gs);
> blorp_emit(brw, GENX(3DSTATE_CONSTANT_PS), ps);
>  
> -   if (params->wm_prog_data) {
> -  uint32_t wm_surf_offset_renderbuffer;
> -  uint32_t wm_surf_offset_texture = 0;
> -
> -  wm_surf_offset_renderbuffer =
> - brw_blorp_emit_surface_state(brw, ¶ms->dst,
> -  I915_GEM_DOMAIN_RENDER,
> -  I915_GEM_DOMAIN_RENDER, true);
> -  if (params->src.bo) {
> - wm_surf_offset_texture =
> -brw_blorp_emit_surface_state(brw, ¶ms->src,
> - I915_GEM_DOMAIN_SAMPLER, 0, false);
> -  }
> -  wm_bind_bo_offset =
> - gen6_blorp_emit_binding_table(brw,
> -   wm_surf_offset_renderbuffer,
> -   wm_surf_offset_texture);
> -
> -  blorp_emit(brw, GENX(3DSTATE_BINDING_TABLE_POINTERS), bt) {
> - bt.PSBindingTableChange = true;
> - bt.PointertoPSBindingTable = wm_bind_bo_offset;
> -  }
> -   }
> +   if (params->wm_prog_data)
> +  blorp_emit_surface_states(brw, params);
>  
> if (params->src.bo)
>blorp_emit_sampler_state(brw, params);
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #3 from Ruslan Kabatsayev  ---
(In reply to Ilia Mirkin from comment #2)
> Sounds like your test-case is doing linear interp, and that's not working in
> swrast for some reason? Haven't looked at the actual test-case yet.

No, it uses nearest-neighbor filtering so that the image pixels are easier to
see.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: initialize and finalize the LLVM function pass manager

2016-08-18 Thread Marek Olšák
Ping.

This was pointed out by Matt Arsenault. gallivm_compile_module calls these
too.

Marek

On Fri, Aug 12, 2016 at 1:26 AM, Marek Olšák  wrote:

> From: Marek Olšák 
>
> we should do that allegedly
> ---
>  src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index d75311e..e04e26a 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -1918,21 +1918,23 @@ void radeon_llvm_finalize_module(struct
> radeon_llvm_context *ctx)
> LLVMAddPromoteMemoryToRegisterPass(gallivm->passmgr);
>
> /* Add some optimization passes */
> LLVMAddScalarReplAggregatesPass(gallivm->passmgr);
> LLVMAddLICMPass(gallivm->passmgr);
> LLVMAddAggressiveDCEPass(gallivm->passmgr);
> LLVMAddCFGSimplificationPass(gallivm->passmgr);
> LLVMAddInstructionCombiningPass(gallivm->passmgr);
>
> /* Run the pass */
> +   LLVMInitializeFunctionPassManager(gallivm->passmgr);
> LLVMRunFunctionPassManager(gallivm->passmgr, ctx->main_fn);
> +   LLVMFinalizeFunctionPassManager(gallivm->passmgr);
>
> LLVMDisposeBuilder(gallivm->builder);
> LLVMDisposePassManager(gallivm->passmgr);
> gallivm_dispose_target_library_info(target_library_info);
>  }
>
>  void radeon_llvm_dispose(struct radeon_llvm_context *ctx)
>  {
> LLVMDisposeModule(ctx->soa.bld_base.base.gallivm->module);
> LLVMContextDispose(ctx->soa.bld_base.base.gallivm->context);
> --
> 2.7.4
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/32] i965/blorp: Add genxml-based vertex setup helpers

2016-08-18 Thread Pohjolainen, Topi
On Thu, Aug 11, 2016 at 02:15:26PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 213 
> +++-
>  1 file changed, 212 insertions(+), 1 deletion(-)

I can't see anything amiss here either:

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
> b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> index 922da45..f345454 100644
> --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> @@ -85,6 +85,7 @@ __gen_combine_address(struct brw_context *brw, void 
> *location,
>  #include "genxml/genX_pack.h"
>  
>  #define _blorp_cmd_length(cmd) cmd ## _length
> +#define _blorp_cmd_length_bias(cmd) cmd ## _length_bias
>  #define _blorp_cmd_header(cmd) cmd ## _header
>  #define _blorp_cmd_pack(cmd) cmd ## _pack
>  
> @@ -95,6 +96,215 @@ __gen_combine_address(struct brw_context *brw, void 
> *location,
>  _blorp_cmd_pack(cmd)(brw, (void *)_dst, &name),   \
>  _dst = NULL)
>  
> +#define blorp_emitn(batch, cmd, n) ({\
> +  uint32_t *_dw = blorp_emit_dwords(batch, n);   \
> +  struct cmd template = {\
> + _blorp_cmd_header(cmd), \
> + .DWordLength = n - _blorp_cmd_length_bias(cmd), \
> +  }; \
> +  _blorp_cmd_pack(cmd)(batch, _dw, &template);   \
> +  _dw + 1; /* Array starts at dw[1] */   \
> +   })
> +
> +static void
> +blorp_emit_vertex_data(struct brw_context *brw,
> +   const struct brw_blorp_params *params,
> +   struct blorp_address *addr,
> +   uint32_t *size)
> +{
> +   const float vertices[] = {
> +  /* v0 */ (float)params->x0, (float)params->y1,
> +  /* v1 */ (float)params->x1, (float)params->y1,
> +  /* v2 */ (float)params->x0, (float)params->y0,
> +   };
> +
> +   uint32_t offset;
> +   void *data = brw_state_batch(brw, AUB_TRACE_VERTEX_BUFFER,
> +sizeof(vertices), 32, &offset);
> +   memcpy(data, vertices, sizeof(vertices));
> +
> +   *addr = (struct blorp_address) {
> +  .buffer = brw->batch.bo,
> +  .read_domains = I915_GEM_DOMAIN_VERTEX,
> +  .write_domain = 0,
> +  .offset = offset,
> +   };
> +   *size = sizeof(vertices);
> +}
> +
> +static void
> +blorp_emit_input_varying_data(struct brw_context *brw,
> +  const struct brw_blorp_params *params,
> +  struct blorp_address *addr,
> +  uint32_t *size)
> +{
> +   const unsigned vec4_size_in_bytes = 4 * sizeof(float);
> +   const unsigned max_num_varyings =
> +  DIV_ROUND_UP(sizeof(params->wm_inputs), vec4_size_in_bytes);
> +   const unsigned num_varyings = params->wm_prog_data->num_varying_inputs;
> +
> +   *size = num_varyings * vec4_size_in_bytes;
> +
> +   const float *const inputs_src = (const float *)¶ms->wm_inputs;
> +   uint32_t offset;
> +   float *inputs = brw_state_batch(brw, AUB_TRACE_VERTEX_BUFFER,
> +   *size, 32, &offset);
> +
> +   /* Walk over the attribute slots, determine if the attribute is used by
> +* the program and when necessary copy the values from the input storage 
> to
> +* the vertex data buffer.
> +*/
> +   for (unsigned i = 0; i < max_num_varyings; i++) {
> +  const gl_varying_slot attr = VARYING_SLOT_VAR0 + i;
> +
> +  if (!(params->wm_prog_data->inputs_read & BITFIELD64_BIT(attr)))
> + continue;
> +
> +  memcpy(inputs, inputs_src + i * 4, vec4_size_in_bytes);
> +
> +  inputs += 4;
> +   }
> +
> +   *addr = (struct blorp_address) {
> +  .buffer = brw->batch.bo,
> +  .read_domains = I915_GEM_DOMAIN_VERTEX,
> +  .write_domain = 0,
> +  .offset = offset,
> +   };
> +}
> +
> +static void
> +blorp_emit_vertex_buffers(struct brw_context *brw,
> +  const struct brw_blorp_params *params)
> +{
> +   struct GENX(VERTEX_BUFFER_STATE) vb[2];
> +   memset(vb, 0, sizeof(vb));
> +
> +   unsigned num_buffers = 1;
> +
> +   uint32_t size;
> +   blorp_emit_vertex_data(brw, params, &vb[0].BufferStartingAddress, &size);
> +   vb[0].VertexBufferIndex = 0;
> +   vb[0].BufferPitch = 2 * sizeof(float);
> +   vb[0].BufferAccessType = VERTEXDATA;
> +   vb[0].EndAddress = vb[0].BufferStartingAddress;
> +   vb[0].EndAddress.offset += size - 1;
> +
> +   if (params->wm_prog_data && params->wm_prog_data->num_varying_inputs) {
> +  blorp_emit_input_varying_data(brw, params,
> +&vb[1].BufferStartingAddress, &size);
> +  vb[1].VertexBufferIndex = 1;
> +  vb[1].BufferPitch = 0;
> +  vb[1].BufferAccessType = INSTANCEDATA;
> +  vb[1].EndAddress = vb[1].BufferStartingAddress;
> +  vb[1].EndAddress.offset += size;
> +  num_buffers++;
> +   }
> +
> +   

[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #4 from almos  ---
The test renders fine for me with radeonsi, llvmpipe, softpipe, swrast. Mesa
11.2.2 and 12.1-dev, llvm 3.8, libtxc-dxtn0 1.0.1.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: initialize and finalize the LLVM function pass manager

2016-08-18 Thread Tom Stellard
On Fri, Aug 12, 2016 at 01:26:08AM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 
> we should do that allegedly

Reviewed-by: Tom Stellard 
> ---
>  src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index d75311e..e04e26a 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -1918,21 +1918,23 @@ void radeon_llvm_finalize_module(struct 
> radeon_llvm_context *ctx)
>   LLVMAddPromoteMemoryToRegisterPass(gallivm->passmgr);
>  
>   /* Add some optimization passes */
>   LLVMAddScalarReplAggregatesPass(gallivm->passmgr);
>   LLVMAddLICMPass(gallivm->passmgr);
>   LLVMAddAggressiveDCEPass(gallivm->passmgr);
>   LLVMAddCFGSimplificationPass(gallivm->passmgr);
>   LLVMAddInstructionCombiningPass(gallivm->passmgr);
>  
>   /* Run the pass */
> + LLVMInitializeFunctionPassManager(gallivm->passmgr);
>   LLVMRunFunctionPassManager(gallivm->passmgr, ctx->main_fn);
> + LLVMFinalizeFunctionPassManager(gallivm->passmgr);
>  
>   LLVMDisposeBuilder(gallivm->builder);
>   LLVMDisposePassManager(gallivm->passmgr);
>   gallivm_dispose_target_library_info(target_library_info);
>  }
>  
>  void radeon_llvm_dispose(struct radeon_llvm_context *ctx)
>  {
>   LLVMDisposeModule(ctx->soa.bld_base.base.gallivm->module);
>   LLVMContextDispose(ctx->soa.bld_base.base.gallivm->context);
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #5 from Ruslan Kabatsayev  ---
(In reply to almos from comment #4)
> The test renders fine for me with radeonsi, llvmpipe, softpipe, swrast. Mesa
> 11.2.2 and 12.1-dev, llvm 3.8, libtxc-dxtn0 1.0.1.

That's strange... I also tested on Mesa 10.3.2, 10.5.9 and 12.1-dev, but on
Intel-based hardware (i915 and i965). Does software renderer somehow use X's HW
acceleration for some actions? Might be broken display driver...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #6 from Roland Scheidegger  ---
Without having tested it, the image looks to me like it might be using s2tc
decompression (which is what distros install due to patent issues). Which
cheats a bit (no interpolation between the two base colors in a block).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #7 from Ruslan Kabatsayev  ---
(In reply to Roland Scheidegger from comment #6)
> Without having tested it, the image looks to me like it might be using s2tc
> decompression (which is what distros install due to patent issues). Which
> cheats a bit (no interpolation between the two base colors in a block).

OK, removing libtxc-dxtn-s2tc0 indeed makes it render correctly (when true s3tc
library is in LD_LIBRARY_PATH). But if I don't add the correct s3tc lib to
LD_LIBRARY_PATH, i965 appears broken too, although its results were correct
with s2tc. How can this be explained?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #8 from Ruslan Kabatsayev  ---
Hmm, I seem to have expressed my question badly in comment 7. I mean: 

1. with s2tc library and no s3tc, i965 result is correct, and swrast result is
wrong
2. without s2tc but with s3tc library both are correct (as expected)
3. with both s2tc and s3tc removed both swrast and i965 are broken completely

So the question is now: why is i965 broken without s2tc if it seemed to decode
the texture without shortcomings of s2tc when s2tc was installed? How is swrast
different from it in this regard?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

--- Comment #9 from Ilia Mirkin  ---
s3tc is completely disabled if libtxc_dxtn.so isn't found.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] i965: Implement the WaPreventHSTessLevelsInterference workaround.

2016-08-18 Thread Connor Abbott
On Wed, Aug 17, 2016 at 10:15 AM, Kenneth Graunke  wrote:
> Fixes several GL44-CTS.tessellation_shader (and GL45 and ES31) subcases:
> - vertex_spacing
> - tessellation_shader_point_mode.points_verification
> - tessellation_shader_quads_tessellation.inner_tessellation_level_rounding
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/brw_compiler.h   |   2 +
>  src/mesa/drivers/dri/i965/brw_nir.h|   2 +
>  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 152 
> +
>  src/mesa/drivers/dri/i965/brw_tcs.c|  18 ++-
>  src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp |   3 +
>  6 files changed, 175 insertions(+), 3 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index df6b5dd..2492a31 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -48,6 +48,7 @@ i965_compiler_FILES = \
> brw_nir_attribute_workarounds.c \
> brw_nir_intrinsics.c \
> brw_nir_opt_peephole_ffma.c \
> +   brw_nir_tcs_workarounds.c \
> brw_packed_float.c \
> brw_predicated_break.cpp \
> brw_reg.h \
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
> b/src/mesa/drivers/dri/i965/brw_compiler.h
> index 10e9f47..7d15c28 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> @@ -220,6 +220,8 @@ struct brw_tcs_prog_key
> /** A bitfield of per-vertex outputs written. */
> uint64_t outputs_written;
>
> +   bool quads_workaround;
> +
> struct brw_sampler_prog_key_data tex;
>  };
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h 
> b/src/mesa/drivers/dri/i965/brw_nir.h
> index 74c354f..6185310 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.h
> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
> @@ -117,6 +117,8 @@ bool brw_nir_apply_attribute_workarounds(nir_shader *nir,
>
>  bool brw_nir_apply_trig_workarounds(nir_shader *nir);
>
> +void brw_nir_apply_tcs_quads_workaround(nir_shader *nir);
> +
>  nir_shader *brw_nir_apply_sampler_key(nir_shader *nir,
>const struct brw_device_info *devinfo,
>const struct brw_sampler_prog_key_data 
> *key,
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c 
> b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> new file mode 100644
> index 000..0626981
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> @@ -0,0 +1,152 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "compiler/nir/nir_builder.h"
> +#include "brw_nir.h"
> +
> +/**
> + * Implements the WaPreventHSTessLevelsInterference workaround (for Gen7-8).
> + *
> + * From the Broadwell PRM, Volume 7 (3D-Media-GPGPU), Page 494 (below the
> + * definition of the patch header layouts):
> + *
> + *"HW Bug: The Tessellation stage will incorrectly add domain points
> + * along patch edges under the following conditions, which may result
> + * in conformance failures and/or cracking artifacts:
> + *
> + *   * QUAD domain
> + *   * INTEGER partitioning
> + *   * All three TessFactors in a given U or V direction (e.g., V
> + * direction: UEQ0, InsideV, UEQ1) are all exactly 1.0
> + *   * All three TessFactors in the other direction are > 1.0 and all
> + * round up to the same integer value (e.g, U direction:
> + * VEQ0 = 3.1, InsideU = 3.7, VEQ1 = 3.4)
> + *
> + * The suggested workaround (to be implemented as part of the

Re: [Mesa-dev] [PATCH] vbo: increase VBO_SAVE_BUFFER_SIZE from 8k to 256k dwords

2016-08-18 Thread Rowley, Timothy O
Don’t think it’ll help Minecraft.  Looking at an apitrace of running around in 
the demo world for a little bit, it only created 66 display lists all of which 
were quite small (less than 100 api entries).

-Tim

> On Aug 18, 2016, at 3:10 AM, Gustaw Smolarczyk  wrote:
> 
> Hi,
> 
> Will this help Minecraft? I believe it currently uses VBOs (if they
> are enabled) and display lists. I might test it when I'm home next
> week.
> 
> Regards,
> Gustaw Smolarczyk
> 
> 2016-08-18 7:26 GMT+02:00 Mathias Fröhlich :
>> Hi,
>> 
>> On Wednesday, 17 August 2016 11:00:48 CEST Tim Rowley wrote:
>>> Increases the performance of legacy geometry-heavy apps
>>> still using display lists.
>> That is my observation too.
>> +1 from my side.
>> 
>> If you need that from me this gets:
>> Reviewed-by: Mathias Fröhlich 
>> 
>> Mathias
>> 
>>> ---
>>> src/mesa/vbo/vbo_save.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
>>> index 2843b3c..d1d7fb0 100644
>>> --- a/src/mesa/vbo/vbo_save.h
>>> +++ b/src/mesa/vbo/vbo_save.h
>>> @@ -96,7 +96,7 @@ struct vbo_save_vertex_list {
>>>  * likelyhood as it occurs.  No reason we couldn't change usage
>>>  * internally even though this probably isn't allowed for client VBOs?
>>>  */
>>> -#define VBO_SAVE_BUFFER_SIZE (8*1024) /* dwords */
>>> +#define VBO_SAVE_BUFFER_SIZE (256*1024) /* dwords */
>>> #define VBO_SAVE_PRIM_SIZE   128
>>> #define VBO_SAVE_PRIM_MODE_MASK 0x3f
>>> #define VBO_SAVE_PRIM_WEAK  0x40
>>> 
>> 
>> 
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 97393] Wrong result of S3TC DXT3 texture rendering

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97393

Ruslan Kabatsayev  changed:

   What|Removed |Added

 Resolution|--- |NOTABUG
 Status|NEW |RESOLVED

--- Comment #10 from Ruslan Kabatsayev  ---
OK, so this seems to be a misconfiguration, not a bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] st/vdpau: change the order in which filters are applied(v3)

2016-08-18 Thread Nayan Deshmukh
Hi Christian,

I tried using 3 instances of the filter with original height & width,
height/2 & width and height/2 and width/2.
I am applying the 3 filters by checking the width and height of
surfaces respectively.

The chroma artifacts are still present, different from the initial
ones. Any suggestion for the fixup?
I tried debugging, but I think I am missing something.

Regards,
Nayan.

On Wed, Aug 17, 2016 at 2:59 PM, Andy Furniss  wrote:
> Christian König wrote:
>>
>> Top and Bottom field are separated in this representation.
>>
>> So you got a maximum of 3 planes multiplied by two fields.
>
>
> Ahh, thanks.
>
>
>>
>> Regards,
>> Christian.
>>
>> Am 17.08.2016 um 11:11 schrieb Andy Furniss:
>>>
>>> Nayan Deshmukh wrote:
>>>
 Sorry for the misleading language. What I meant was that the filter
 should only
 be applied to first resource i.e. use only first sampler_view and
 surface. As you
 replaced i by 0 the filter gets applied multiple times. I tried doing
 that and I am
 experiencing same problems i.e. artifacts with positive sharpen.

 So can you please it try it again by replacing VL_MAX_SURFACES with 1
 in the
 for loop. It should probably fix the problems.
>>>
>>>
>>> Possibly a stupid question, because I don't know how this stuff works,
>>> but why does it loop 6 times per frame rather than 3?
>>>
>>> I mean as seen if I put a printf as below.
>>>
> +   for(i = 0; i < VL_MAX_SURFACES; ++i) {
> +  if(sampler_views[i] != NULL && surfaces[i] != NULL) {
>>>
>>>  fprintf(stderr,"ADF: i = %d\n", i);
>
> + if (vmixer->noise_reduction.filter)
> + vl_median_filter_render(vmixer->noise_reduction.filter,
> +sampler_views[i],
> surfaces[i]);
> +
> + if (vmixer->sharpness.filter)
> + vl_matrix_filter_render(vmixer->sharpness.filter,
> +sampler_views[i],
> surfaces[i]);
>>>
>>>
>>
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] nir/search: Extend 'a@bool' to handle a couple of system values.

2016-08-18 Thread Matt Turner
On Wed, Aug 17, 2016 at 3:03 PM, Kenneth Graunke  wrote:
> load_front_face and load_helper_invocation produce booleans.
>
> On Broadwell:
>
> total instructions in shared programs: 11638956 -> 11638011 (-0.01%)
> instructions in affected programs: 115093 -> 114148 (-0.82%)
> helped: 628
> HURT: 14

I looked at one of the regressions. Looks like noise caused by hitting
the fs_visitor::optimize_frontfacing_ternary() path when the
frontfacing value is already available.

The series is

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965/cfg: Ignore non-CF instructions in unreachable blocks.

2016-08-18 Thread Matt Turner
On Thu, Aug 18, 2016 at 3:43 AM, Iago Toral  wrote:
> On Wed, 2016-08-17 at 11:54 -0700, Matt Turner wrote:
>> The basic block following a control flow structure like an infinite
>> loop
>> will be unreachable. Ignore any non-control-flow instructions in it
>> since they can have no effect on the program.
>
> If the block is unreachable control-flow instructions inside the block
> are also irrelevant, is there any reason why you don't skip CF
> instructions too?

I think that could lead to further problems. For instance, if we had

START B47
  do
  break
END B47
START B48
  while
END B48

B48 would be unreachable, but I think emitting a "do" instruction
without a "while" might cause problems in the generator
(brw_find_loop_end, brw_find_next_block_end).

>> Avoids a segmentation fault in cfg_t::intersect(a, b) when called on
>> an
>> unreachable block. By avoiding ever putting code in an unreachable
>> block, we never attempt to optimize code in an unreachable block.
>
> Can't the problem persist if the unreachable block has any control-flow
> instructions?

I don't think so. The problem involved finding a constant
(fs_visitor::opt_combine_constants) that was in both a reachable and
an unreachable block. When a constant is in two different blocks, the
code finds the common ancestor of both blocks (cfg_t::intersect) and
places the constant in it. If one of the blocks is unreachable, it
will not have an immediate dominator, leading to cfg_t::intersect
segfaulting. Since control flow instructions do not take regular
sources (and not immediates), they should pose no problem.

The background is that Jason noticed a segfault when enabling GCM/GVN
in NIR in a single piglit test
(tests/glslparsertest/shaders/CorrectFull.frag). The test is a silly
parser test that contains an infinite loop. After enabling GVN, a
constant was used in both a reachable and unreachable block, leading
to the segfault.

My fix is to not put instructions into unreachable basic blocks. :)

See the wip/nir-gvn branch of my tree.

Thanks for reviewing!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] gbm: wire up fence extension

2016-08-18 Thread Rob Clark
On Thu, Aug 18, 2016 at 8:45 AM, Emil Velikov  wrote:
> On 16 August 2016 at 18:01, Rob Clark  wrote:
>> I noticed that fence extension wasn't exposed for drm/gbm, which sort
>> of defeats the purpose of using fence fd's for synchronizing rendering
>> and atomic pageflip.
>>
> Afaict this is a slightly more complete version of Philipp's patch.
>
> While my earlier suggestion still stands, I should have mentioned that
> it's not a show stopper for this patch.
> Namely: we really want a way to check and bail out on ABI mismatch
> between libEGL and libgbm.

btw, I dug up Philipp's original patch/thread..  I didn't realize that
someone had already done this (would have saved me some time ;-))

As far as ABI mismatch, we don't currently ship libEGL/libgbm
separately.  And there doesn't appear to be any versioning scheme so
far.  (Although if we do split out libgbm from mesa, that would be a
good thing to add.)

So I'm inclined to punt on the version check (and actually not
entirely sure it is even possible).

BR,
-R

>> I suppose that somewhere or other, there needs to be a similar change
>> to .../state_trackers/dri/dri2.c to avoid breaking things on i965.
> Kind of ... see below.
>
>
>> ---
>>  src/egl/drivers/dri2/platform_drm.c   | 1 +
>>  src/gallium/state_trackers/dri/dri2.c | 1 +
>>  src/gbm/backends/dri/gbm_dri.c| 1 +
>>  src/gbm/backends/dri/gbm_driint.h | 1 +
>>  4 files changed, 4 insertions(+)
>>
>> diff --git a/src/egl/drivers/dri2/platform_drm.c 
>> b/src/egl/drivers/dri2/platform_drm.c
>> index 1ce282f..7bea8e1 100644
>> --- a/src/egl/drivers/dri2/platform_drm.c
>> +++ b/src/egl/drivers/dri2/platform_drm.c
>> @@ -645,6 +645,7 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
>> dri2_dpy->dri_screen = dri2_dpy->gbm_dri->screen;
>> dri2_dpy->core = dri2_dpy->gbm_dri->core;
>> dri2_dpy->dri2 = dri2_dpy->gbm_dri->dri2;
>> +   dri2_dpy->fence = dri2_dpy->gbm_dri->fence;
> Nice catch. Philipp's patch was missing this hunk, thus one could not
> have really used the fence extension from EGL.
>
>> dri2_dpy->image = dri2_dpy->gbm_dri->image;
>> dri2_dpy->flush = dri2_dpy->gbm_dri->flush;
>> dri2_dpy->swrast = dri2_dpy->gbm_dri->swrast;
>> diff --git a/src/gallium/state_trackers/dri/dri2.c 
>> b/src/gallium/state_trackers/dri/dri2.c
>> index 75d740b..8ad8701 100644
>> --- a/src/gallium/state_trackers/dri/dri2.c
>> +++ b/src/gallium/state_trackers/dri/dri2.c
>> @@ -2042,6 +2042,7 @@ const __DRIextension *galliumdrm_driver_extensions[] = 
>> {
>>  &driImageDriverExtension.base,
>>  &driDRI2Extension.base,
>>  &gallium_config_options.base,
>> +&dri2FenceExtension.base,
> While things are a bit confusing I think this shouldn't be here, and
> thus there's no 'missing' i965 hunk.
>
> galliumdrm_driver_extensions are the "core" extensions which among
> other are responsible for:
>  - setting up the screen (which itself sets up the screen extensions)
>  - query screen extensions
>
> As a simple test - both gallium and i965 already expose the fence
> extension and it seems to be working fine afaict.
>
> Regards,
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] vbo: Correctly handle attribute offsets in dlist draw.

2016-08-18 Thread Mathias . Froehlich
From: Mathias Fröhlich 

Hi,

I found the below while fixing a similar problem lately in
the immediate mode glBegin/glEnd code path.

Please review
Thanks

Mathias



When executing a display list draw with a shader program
using the generic0 attribute the offset computation
may get out of sync. To fix precompute the offsets
on the full attribute list and store the offsets in
the display list node.

Signed-off-by: Mathias Fröhlich 
---
 src/mesa/vbo/vbo_save.h  |  1 +
 src/mesa/vbo/vbo_save_api.c  |  6 ++
 src/mesa/vbo/vbo_save_draw.c | 35 +--
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 2843b3c..a61973f 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -64,6 +64,7 @@ struct vbo_save_vertex_list {
GLbitfield64 enabled; /**< mask of enabled vbo arrays. */
GLubyte attrsz[VBO_ATTRIB_MAX];
GLenum attrtype[VBO_ATTRIB_MAX];
+   GLushort offsets[VBO_ATTRIB_MAX];
GLuint vertex_size;  /**< size in GLfloats */
 
/* Copy of the final vertex from node->vertex_store->bufferobj.
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index f648ccc..4473fef 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -436,6 +436,12 @@ _save_compile_vertex_list(struct gl_context *ctx)
node->vertex_size = save->vertex_size;
node->buffer_offset =
   (save->buffer - save->vertex_store->buffer) * sizeof(GLfloat);
+   GLushort offset = 0;
+   int i;
+   for (i = 0; i < VBO_ATTRIB_MAX; ++i) {
+  node->offsets[i] = offset;
+  offset += node->attrsz[i] * sizeof(GLfloat);
+   }
node->count = save->vert_count;
node->wrap_count = save->copied.nr;
node->dangling_attr_ref = save->dangling_attr_ref;
diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
index 507ab82..e69c108 100644
--- a/src/mesa/vbo/vbo_save_draw.c
+++ b/src/mesa/vbo/vbo_save_draw.c
@@ -26,6 +26,7 @@
  *Keith Whitwell 
  */
 
+#include 
 #include "main/glheader.h"
 #include "main/bufferobj.h"
 #include "main/context.h"
@@ -136,15 +137,10 @@ static void vbo_bind_vertex_list(struct gl_context *ctx,
struct vbo_context *vbo = vbo_context(ctx);
struct vbo_save_context *save = &vbo->save;
struct gl_client_array *arrays = save->arrays;
-   GLuint buffer_offset = node->buffer_offset;
const GLuint *map;
GLuint attr;
-   GLubyte node_attrsz[VBO_ATTRIB_MAX];  /* copy of node->attrsz[] */
-   GLenum node_attrtype[VBO_ATTRIB_MAX];  /* copy of node->attrtype[] */
GLbitfield64 varying_inputs = 0x0;
-
-   memcpy(node_attrsz, node->attrsz, sizeof(node->attrsz));
-   memcpy(node_attrtype, node->attrtype, sizeof(node->attrtype));
+   bool generic_from_pos = false;
 
/* Install the default (ie Current) attributes first, then overlay
 * all active ones.
@@ -176,10 +172,7 @@ static void vbo_bind_vertex_list(struct gl_context *ctx,
*/
   if ((ctx->VertexProgram._Current->Base.InputsRead & VERT_BIT_POS) == 0 &&
   (ctx->VertexProgram._Current->Base.InputsRead & VERT_BIT_GENERIC0)) {
- save->inputs[VERT_ATTRIB_GENERIC0] = save->inputs[0];
- node_attrsz[VERT_ATTRIB_GENERIC0] = node_attrsz[0];
- node_attrtype[VERT_ATTRIB_GENERIC0] = node_attrtype[0];
- node_attrsz[0] = 0;
+ generic_from_pos = true;
   }
   break;
default:
@@ -188,30 +181,36 @@ static void vbo_bind_vertex_list(struct gl_context *ctx,
 
for (attr = 0; attr < VERT_ATTRIB_MAX; attr++) {
   const GLuint src = map[attr];
+  const GLubyte size = node->attrsz[src];
 
-  if (node_attrsz[src]) {
+  if (size) {
  /* override the default array set above */
  save->inputs[attr] = &arrays[attr];
 
-arrays[attr].Ptr = (const GLubyte *) NULL + buffer_offset;
-arrays[attr].Size = node_attrsz[src];
+ const uintptr_t buffer_offset = node->buffer_offset;
+ arrays[attr].Ptr = ADD_POINTERS(buffer_offset, node->offsets[src]);
+ arrays[attr].Size = size;
 arrays[attr].StrideB = node->vertex_size * sizeof(GLfloat);
- arrays[attr].Type = node_attrtype[src];
- arrays[attr].Integer =
-   vbo_attrtype_to_integer_flag(node_attrtype[src]);
+ const GLenum type = node->attrtype[src];
+ arrays[attr].Type = type;
+ arrays[attr].Integer = vbo_attrtype_to_integer_flag(type);
  arrays[attr].Format = GL_RGBA;
- arrays[attr]._ElementSize = arrays[attr].Size * sizeof(GLfloat);
+ arrays[attr]._ElementSize = size * sizeof(GLfloat);
  _mesa_reference_buffer_object(ctx,
&arrays[attr].BufferObj,
node->vertex_store->bufferobj);
 
 assert(arrays[attr].BufferObj->Name);
 
-buffer_offset += node_attrsz[src] * sizeof(GLfloat);
  varying_inputs |= VERT_BIT(attr);

Re: [Mesa-dev] [PATCH] glsl: fix key used for hashing switch statement cases

2016-08-18 Thread Eric Anholt
Tapani Pälli  writes:

> Implementation previously used case value itself as the key, however
> afterhash implementation change by ee02a5e we cannot use 0 as key.
> Patch uses _mesa_hash_data to formulate a suitable key for this hash.
>
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97309
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index e03a6e3..53fc4d6 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -6180,9 +6180,11 @@ ast_case_label::hir(exec_list *instructions,
>   /* Stuff a dummy value in to allow processing to continue. */
>   label_const = new(ctx) ir_constant(0);
>} else {
> + unsigned hash_key = _mesa_hash_data(&label_const->value.u[0],
> + sizeof(unsigned));
>   ast_expression *previous_label = (ast_expression *)
>   hash_table_find(state->switch_state.labels_ht,
> - (void *)(uintptr_t)label_const->value.u[0]);
> + (void *)(uintptr_t)hash_key);
>  
>   if (previous_label) {
>  YYLTYPE loc = this->test_value->get_location();
> @@ -6193,7 +6195,7 @@ ast_case_label::hir(exec_list *instructions,
>   } else {
>  hash_table_insert(state->switch_state.labels_ht,
>this->test_value,
> -  (void *)(uintptr_t)label_const->value.u[0]);
> +  (void *)(uintptr_t)hash_key);

_mesa_hash_data(...) could *also* be zero.  It'll only probably be that
for about one possible input value, but still.

Instead, we should change from _mesa_pointer_hash/compare to a thing
using _mesa_hash_data in HT setup, and pass pointer to the value.u[0]
here.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 93551] Divinity: Original Sin Enhanced Edition(Native) crash on start

2016-08-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93551

--- Comment #42 from Alex  ---
(In reply to Thomas J. Moore from comment #32)
> Created attachment 125302 [details]
> Simple LD_PRELOAD shim to apply necessary patches for divos
> 
> Game works great for me with the above patches (thanks to those who figured
> this out!).  However, since they are not likely to be incorporated into
> Mesa, and patching my system Mesa just for one poorly written game is a bad
> idea, I think one of two alternate solutions needs to be provided.  My
> preference would be to patch the game binaries.  I don't really want to mess
> with that right now, though (especially since I can't easily locate where it
> does the vendor check).  The other would be to provide the patches in the
> form of an LD_PRELOAD shim.  I have attached the source code for one that
> seems to work for me.

Would it be possible to add another function to that shim to override the
output of sysconf (_SC_NPROCESSORS_ONLN)? I spoke to the developer and
apparently the game will only use as many "worker" threads (indicated by "[N]
WT" in the app title bar) as your number of cores minus the main thread, but if
you're in an environment where your single-threaded performance is heavily
constrained (e.g. a Y series Intel notebook chip like mine, when using the iGPU
throttles the CPU speeds) but you have multiple cores available, this isn't
ideal. From looking at my CPU usage under Divinity I'd really like to be able
to hack it to give me 2-3 worker threads instead of 1 to see if that helps
performance at all... I don't have much C experience though.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Use nir_shader_get_entrypoint in TCS quad workaround code.

2016-08-18 Thread Kenneth Graunke
We want to insert the code at the end of the program.  Looping over
all the functions (of which there was only one) was the old way of doing
this, but now we have nir_shader_get_entrypoint(), so let's use it.

Suggested by Connor Abbott.

Signed-off-by: Kenneth Graunke 
---
 .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 23 +++---
 1 file changed, 11 insertions(+), 12 deletions(-)

Sorry, I pushed the patch rather quickly...here's a follow-on to use
nir_shader_get_entrypoint().

Shouldn't be also be using it in nir_lower_gs_intrinsics.c?

diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c 
b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
index 0626981..ac4f9e0 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
+++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
@@ -134,19 +134,18 @@ brw_nir_apply_tcs_quads_workaround(nir_shader *nir)
 {
assert(nir->stage == MESA_SHADER_TESS_CTRL);
 
-   nir_foreach_function(func, nir) {
-  if (!func->impl)
- continue;
+   nir_function *func = nir_shader_get_entrypoint(nir);
+   if (!func->impl)
+  return;
 
-  nir_builder b;
-  nir_builder_init(&b, func->impl);
+   nir_builder b;
+   nir_builder_init(&b, func->impl);
 
-  struct set_entry *entry;
-  set_foreach(func->impl->end_block->predecessors, entry) {
- nir_block *pred = (nir_block *) entry->key;
- emit_quads_workaround(&b, pred);
-  }
-
-  nir_metadata_preserve(func->impl, 0);
+   struct set_entry *entry;
+   set_foreach(func->impl->end_block->predecessors, entry) {
+  nir_block *pred = (nir_block *) entry->key;
+  emit_quads_workaround(&b, pred);
}
+
+   nir_metadata_preserve(func->impl, 0);
 }
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 33/95] i965/vec4: implement d2b

2016-08-18 Thread Francisco Jerez
Iago Toral  writes:

> On Wed, 2016-08-17 at 15:15 -0700, Francisco Jerez wrote:
>> Iago Toral  writes:
>> 
>> > 
>> > On Tue, 2016-08-02 at 18:27 -0700, Francisco Jerez wrote:
>> > > 
>> > > Iago Toral Quiroga  writes:
>> > > 
>> > > > 
>> > > > 
>> > > > ---
>> > > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18
>> > > > ++
>> > > >  1 file changed, 18 insertions(+)
>> > > > 
>> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > > index 1525a3d..4014020 100644
>> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > > @@ -1497,6 +1497,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr
>> > > > *instr)
>> > > >    emit(CMP(dst, op[0], brw_imm_f(0.0f),
>> > > > BRW_CONDITIONAL_NZ));
>> > > >    break;
>> > > >  
>> > > > +   case nir_op_d2b: {
>> > > > +  /* two-argument instructions can't take 64-bit
>> > > > immediates */
>> > > > +  dst_reg zero = dst_reg(this, glsl_type::dvec4_type);
>> > > > +  emit(MOV(zero, brw_imm_df(0.0)));
>> > > > +
>> > > > +  dst_reg tmp = dst_reg(this, glsl_type::dvec4_type);
>> > > > +  emit(CMP(tmp, op[0], src_reg(zero),
>> > > > BRW_CONDITIONAL_NZ));
>> > > > +
>> > > > +  /* Convert the double CMP result to a single boolean
>> > > > result.
>> > > > For that
>> > > > +   * we take the low 32-bit chunk of each DF component in
>> > > > the
>> > > > result.
>> > > > +   * and do a final MOV to honor the original writemask
>> > > > +   */
>> > > > +  dst_reg result = dst_reg(this, glsl_type::bvec4_type);
>> > > > +  emit(VEC4_OPCODE_PICK_LOW_32BIT, result, src_reg(tmp));
>> > > > +  emit(MOV(dst, src_reg(result)));
>> > > Couldn't you just do a single CMP instruction of the double-
>> > > precision
>> > > argument and a single-precision 0.0 immediate? 
>> > It never occurred to me that we could do that but it seems to work
>> > just
>> > fine.
>> > 
>> > > 
>> > >  I think you could
>> > > potentially also use a 32bit destination type on the CMP
>> > > instruction
>> > > so
>> > > you don't need to emit the PICK_LOW+MOV instructions afterwards.
>> > This does not seem to work though.
>> > 
>> > > 
>> > >   It may
>> > > hit an instruction decompression bug but it could be cleaned up
>> > > by
>> > > the
>> > > SIMD lowering pass afterwards, AFAICT the result would be two
>> > > uncompressed instructions instead of the two uncompressed plus
>> > > two
>> > > compressed instructions above.
>> > I tried splitting the instruction just in case. Notice that doing
>> > this
>> > requires that we improve the splitting pass to support splitting
>> > instructions that write only one register (the original series did
>> > not
>> > implement support for that because so far the only cared to split
>> > DF
>> > instructions that wrote more than one register). I implemented a
>> > couple
>> > of quick hacks to get this done so I could test this but the result
>> > is
>> > still incorrect.
>> > 
>> > For reference, if we don't split, the resulting CMP looks like this
>> > (after full scalarization):
>> > 
>> > cmp.nz.f0(8)  g8<1>.xD  g6<2,2,1>.xyxyDF 0F     { align16 1Q };
>> > cmp.nz.f0(8)  g8<1>.yD  g6<2,2,1>.zwzwDF 0F     { align16 1Q };
>> > cmp.nz.f0(8)  g8<1>.zD  g6.2<0,2,1>.xyxyDF 0F   { align16 1Q };
>> > cmp.nz.f0(8)  g8<1>.wD  g6.2<0,2,1>.zwzwDF 0F   { align16 1Q };
>> > 
>> > And if we split the instruction in two we produce this:
>> > 
>> > cmp.nz.f0(4)  g6<1>.xD     g1<0,2,1>.xyxyDF 0F    { align16 1N };
>> > cmp.nz.f0(4)  g6<1>.yD     g1<0,2,1>.zwzwDF 0F    { align16 1N };
>> > cmp.nz.f0(4)  g6<1>.zD     g1.2<0,2,1>.xyxyDF 0F  { align16 1N };
>> > cmp.nz.f0(4)  g6<1>.wD     g1.2<0,2,1>.zwzwDF 0F  { align16 1N };
>> > cmp.nz.f0(4)  g6.4<1>.xD   g1<0,2,1>.xyxyDF 0F    { align16 2N };
>> > cmp.nz.f0(4)  g6.4<1>.yD   g1<0,2,1>.zwzwDF 0F    { align16 2N };
>> > cmp.nz.f0(4)  g6.4<1>.zD   g1.2<0,2,1>.xyxyDF 0F  { align16 2N };
>> > cmp.nz.f0(4)  g6.4<1>.wD   g1.2<0,2,1>.zwzwDF 0F  { align16 2N };
>> > 
>> > Both sequences of instructions look correct to me as far as the
>> > splitting and DF regioning go, so I guess the 32-bit CMP result is
>> > not
>> > doing what we want.
>> > 
>> > I am not too surprised about this. Even if we use a 32b dst type I
>> > suppose the instruction execution type is still 64b so I guess
>> > there
>> > should be a 64b to 32b conversion involved in writing to a 32b
>> > result.
>> > Seeing how 64b to 32b conversions require that we emit specific
>> > code
>> > using ALIGN1 mode to deal with alignment requirements I guess it is
>> > not
>> > too surprising that this does not work, right?
>> > 
>> Yeah, that's not too surprising, don't worry about it.  Another idea
>> would be to do something along the lines of 'MOV.nz null, > goes here>' to check whether the source is non-zero (so you avoid
>> loading the immediate which is a non-trivial operation on

Re: [Mesa-dev] [PATCH 41/95] i965/vec4: make the generator set correct NibCtrl for SIMD4 DF instructions

2016-08-18 Thread Francisco Jerez
Iago Toral  writes:

> On Mon, 2016-08-08 at 15:26 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga  writes:
>> 
>> > 
>> > From the HSW PRM, Command Reference, QtrCtrl:
>> > 
>> >    "NibCtrl is only allowed for SIMD4 instructions with a DF
>> > (Double Float)
>> > source or destination type."
>> > 
>> > v2 (Samuel): Assert that the type is DF.
>> > 
>> > Signed-off-by: Samuel Iglesias Gonsálvez 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +
>> >  1 file changed, 9 insertions(+)
>> > 
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > index c6e040e..dc4f6db 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > @@ -1494,6 +1494,7 @@ generate_code(struct brw_codegen *p,
>> >    brw_set_default_saturate(p, inst->saturate);
>> >    brw_set_default_mask_control(p, inst->force_writemask_all);
>> >    brw_set_default_acc_write_control(p, inst-
>> > >writes_accumulator);
>> > +  brw_set_default_group(p, 0);
>> > 
>> >    assert(inst->base_mrf + inst->mlen <= BRW_MAX_MRF(devinfo-
>> > >gen));
>> >    assert(inst->mlen <= BRW_MAX_MSG_LENGTH);
>> > @@ -1529,6 +1530,14 @@ generate_code(struct brw_codegen *p,
>> >    } else {
>> >   assert(inst->exec_size == 8 || inst->exec_size == 4);
>> >   brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
>> > + if (inst->exec_size == 4 && !inst->force_writemask_all) {
>> > +assert(inst->dst.type == BRW_REGISTER_TYPE_DF ||
>> > +   inst->src[0].type == BRW_REGISTER_TYPE_DF ||
>> > +   inst->src[1].type == BRW_REGISTER_TYPE_DF ||
>> > +   inst->src[2].type == BRW_REGISTER_TYPE_DF);
>> > +assert(inst->group == 0 || inst->group == 4);
>> > +brw_inst_set_group(p->devinfo, p->current, inst-
>> > >group);
>> I think this should be unconditional (e.g. if somebody tries to set
>> group == 4 on an instruction with exec_size == 8 it would be nice to
>> get
>> an assertion failure instead of the group value being silently
>> ignored).  How about you replace the 'brw_set_default_group(p, 0)'
>> line
>> above with:
>> 
>> > 
>> > 
>> >   assert(inst->group % inst->exec_size == 0);
>> >   assert(inst->group % 8 == 0 ||
>> >  inst->dst.type == BRW_REGISTER_TYPE_DF ||
>> >  inst->src[0].type == BRW_REGISTER_TYPE_DF ||
>> >  inst->src[1].type == BRW_REGISTER_TYPE_DF ||
>> >  inst->src[2].type == BRW_REGISTER_TYPE_DF);
>> >   brw_set_default_group(p, inst->group);
>> and then drop this hunk?
>
> Yes, that looks like better. Thanks!
>
> I suppose that setting the group won't have any effect  if
> force_writemask_all is set as too so there is no need to void setting
> it in that case, right?
>
Yeah, it shouldn't make much of a difference, I wouldn't bother not to
set it.

>> > 
>> >    }
>> >  
>> >    switch (inst->opcode) {


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/8] radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

SDMA is much faster for tiled->linear blits from VRAM to GTT.
I have Bonaire in my second PCIe slot.

$ glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD TONGA ...

$ DRI_PRIME=1 glxinfo | grep OpenGL.renderer
OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ...

Without SDMA:
$ DRI_PRIME=1 glxgears
8796 frames in 5.0 seconds = 1759.074 FPS
8899 frames in 5.0 seconds = 1779.672 FPS

With SDMA:
$ DRI_PRIME=1 glxgears
12765 frames in 5.0 seconds = 2552.788 FPS
12888 frames in 5.0 seconds = 2577.495 FPS

The 1st GPU is irrelevant. The improvement should be much lower at 60 fps,
but definitely measurable.

SI will get this once we add SDMA blit support for it.
---
 src/gallium/drivers/radeonsi/si_blit.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 3cfd011..1147b5b 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -1089,25 +1089,44 @@ resolve_to_temp:
si_blitter_end(ctx);
 
pipe_resource_reference(&tmp, NULL);
return true;
 }
 
 static void si_blit(struct pipe_context *ctx,
const struct pipe_blit_info *info)
 {
struct si_context *sctx = (struct si_context*)ctx;
+   struct r600_texture *rdst = (struct r600_texture *)info->dst.resource;
 
if (do_hardware_msaa_resolve(ctx, info)) {
return;
}
 
+   /* Using SDMA for copying to a linear texture in GTT is much faster.
+* This improves DRI PRIME performance.
+*
+* resource_copy_region can't do this yet, because dma_copy calls it
+* on failure (recursion).
+*/
+   if (rdst->surface.level[info->dst.level].mode ==
+   RADEON_SURF_MODE_LINEAR_ALIGNED &&
+   sctx->b.dma_copy &&
+   util_can_blit_via_copy_region(info, false)) {
+   sctx->b.dma_copy(ctx, info->dst.resource, info->dst.level,
+info->dst.box.x, info->dst.box.y,
+info->dst.box.z,
+info->src.resource, info->src.level,
+&info->src.box);
+   return;
+   }
+
assert(util_blitter_is_blit_supported(sctx->blitter, info));
 
/* The driver doesn't decompress resources automatically while
 * u_blitter is rendering. */
si_decompress_subresource(ctx, info->src.resource, info->mask,
  info->src.level,
  info->src.box.z,
  info->src.box.z + info->src.box.depth - 1);
 
if (sctx->screen->b.debug_flags & DBG_FORCE_DMA &&
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/8] gallium/radeon: derive buffer placement and flags only once per buffer

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

Invalidated buffers don't have to do this.
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 147 +---
 src/gallium/drivers/radeon/r600_pipe_common.h   |   2 +
 2 files changed, 80 insertions(+), 69 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 4480293..113a7dc 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -98,91 +98,108 @@ void *r600_buffer_map_sync_with_rings(struct 
r600_common_context *ctx,
/* Setting the CS to NULL will prevent doing checks we have done 
already. */
return ctx->ws->buffer_map(resource->buf, NULL, usage);
 }
 
 bool r600_init_resource(struct r600_common_screen *rscreen,
struct r600_resource *res,
uint64_t size, unsigned alignment)
 {
struct r600_texture *rtex = (struct r600_texture*)res;
struct pb_buffer *old_buf, *new_buf;
-   enum radeon_bo_flag flags = 0;
-
-   switch (res->b.b.usage) {
-   case PIPE_USAGE_STREAM:
-   flags = RADEON_FLAG_GTT_WC;
-   /* fall through */
-   case PIPE_USAGE_STAGING:
-   /* Transfers are likely to occur more often with these 
resources. */
-   res->domains = RADEON_DOMAIN_GTT;
-   break;
-   case PIPE_USAGE_DYNAMIC:
-   /* Older kernels didn't always flush the HDP cache before
-* CS execution
-*/
-   if (rscreen->info.drm_major == 2 &&
-   rscreen->info.drm_minor < 40) {
+
+   if (!res->initialized) {
+   res->flags = 0;
+
+   switch (res->b.b.usage) {
+   case PIPE_USAGE_STREAM:
+   res->flags = RADEON_FLAG_GTT_WC;
+   /* fall through */
+   case PIPE_USAGE_STAGING:
+   /* Transfers are likely to occur more often with these
+* resources. */
res->domains = RADEON_DOMAIN_GTT;
-   flags |= RADEON_FLAG_GTT_WC;
+   break;
+   case PIPE_USAGE_DYNAMIC:
+   /* Older kernels didn't always flush the HDP cache 
before
+* CS execution
+*/
+   if (rscreen->info.drm_major == 2 &&
+   rscreen->info.drm_minor < 40) {
+   res->domains = RADEON_DOMAIN_GTT;
+   res->flags |= RADEON_FLAG_GTT_WC;
+   break;
+   }
+   res->flags |= RADEON_FLAG_CPU_ACCESS;
+   /* fall through */
+   case PIPE_USAGE_DEFAULT:
+   case PIPE_USAGE_IMMUTABLE:
+   default:
+   /* Not listing GTT here improves performance in some
+* apps. */
+   res->domains = RADEON_DOMAIN_VRAM;
+   res->flags |= RADEON_FLAG_GTT_WC;
break;
}
-   flags |= RADEON_FLAG_CPU_ACCESS;
-   /* fall through */
-   case PIPE_USAGE_DEFAULT:
-   case PIPE_USAGE_IMMUTABLE:
-   default:
-   /* Not listing GTT here improves performance in some apps. */
-   res->domains = RADEON_DOMAIN_VRAM;
-   flags |= RADEON_FLAG_GTT_WC;
-   break;
-   }
 
-   if (res->b.b.target == PIPE_BUFFER &&
-   res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
- PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
-   /* Use GTT for all persistent mappings with older kernels,
-* because they didn't always flush the HDP cache before CS
-* execution.
-*
-* Write-combined CPU mappings are fine, the kernel ensures all 
CPU
-* writes finish before the GPU executes a command stream.
+   if (res->b.b.target == PIPE_BUFFER &&
+   res->b.b.flags & (PIPE_RESOURCE_FLAG_MAP_PERSISTENT |
+ PIPE_RESOURCE_FLAG_MAP_COHERENT)) {
+   /* Use GTT for all persistent mappings with older
+* kernels, because they didn't always flush the HDP
+* cache before CS execution.
+*
+* Write-combined CPU mappings are fine, the kernel
+* ensures all CPU writes finish before the GPU
+* executes a command stream.
+*/
+   if (rscreen->info.drm_major == 2 &&
+   rscreen->info.drm_minor < 40)
+   res->domains = RADEON_DOMAIN_GTT;
+

[Mesa-dev] [PATCH 5/8] radeonsi: fix printing shaders and states on a VM fault

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

This was missed while rewriting the PIPE_DUMP flags.
---
 src/gallium/drivers/radeonsi/si_debug.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index 7e75d3f..8ce9caf 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -820,21 +820,23 @@ void si_check_vm_faults(struct r600_common_context *ctx,
fprintf(f, "Device vendor: %s\n", screen->get_device_vendor(screen));
fprintf(f, "Device name: %s\n\n", screen->get_name(screen));
fprintf(f, "Failing VM page: 0x%08x\n\n", addr);
 
if (sctx->apitrace_call_number)
fprintf(f, "Last apitrace call: %u\n\n",
sctx->apitrace_call_number);
 
switch (ring) {
case RING_GFX:
-   si_dump_debug_state(&sctx->b.b, f, 0);
+   si_dump_debug_state(&sctx->b.b, f,
+   PIPE_DUMP_CURRENT_STATES |
+   PIPE_DUMP_CURRENT_SHADERS);
break;
 
case RING_DMA:
si_dump_dma(sctx, saved, f);
break;
 
default:
break;
}
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/8] radeonsi: enable SDMA on CIK

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

It passes R600_DEBUG=testdma on Bonaire/radeon.
---
 src/gallium/drivers/radeonsi/cik_sdma.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/cik_sdma.c 
b/src/gallium/drivers/radeonsi/cik_sdma.c
index a36bbce..2eade50 100644
--- a/src/gallium/drivers/radeonsi/cik_sdma.c
+++ b/src/gallium/drivers/radeonsi/cik_sdma.c
@@ -158,24 +158,20 @@ static bool cik_sdma_copy_texture(struct si_context *sctx,
 
assert(src_level <= src->last_level);
assert(dst_level <= dst->last_level);
assert(rdst->surface.level[dst_level].offset +
   dst_slice_pitch * bpp * (dstz + src_box->depth) <=
   rdst->resource.buf->size);
assert(rsrc->surface.level[src_level].offset +
   src_slice_pitch * bpp * (srcz + src_box->depth) <=
   rsrc->resource.buf->size);
 
-   /* Test CIK with radeon and amdgpu before enabling this. */
-   if (sctx->b.chip_class == CIK)
-   return false;
-
if (!r600_prepare_for_dma_blit(&sctx->b, rdst, dst_level, dstx, dsty,
dstz, rsrc, src_level, src_box))
return false;
 
dstx /= rdst->surface.blk_w;
dsty /= rdst->surface.blk_h;
 
if (srcx >= (1 << 14) ||
srcy >= (1 << 14) ||
srcz >= (1 << 11) ||
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/8] radeonsi: fix VM faults due NULL internal const buffers on CIK

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

They are harmless, but the interrupts do decrease performance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97039

Cc: 12.0 
---
 src/gallium/drivers/radeonsi/si_pipe.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 62b62db..2669c7b 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -244,36 +244,45 @@ static struct pipe_context *si_create_context(struct 
pipe_screen *screen,
if (sctx->blitter == NULL)
goto fail;
sctx->blitter->draw_rectangle = r600_draw_rectangle;
 
sctx->sample_mask.sample_mask = 0x;
 
/* these must be last */
si_begin_new_cs(sctx);
r600_query_init_backend_mask(&sctx->b); /* this emits commands and must 
be last */
 
-   /* CIK cannot unbind a constant buffer (S_BUFFER_LOAD is buggy
-* with a NULL buffer). We need to use a dummy buffer instead. */
+   /* CIK cannot unbind a constant buffer (S_BUFFER_LOAD doesn't skip loads
+* if NUM_RECORDS == 0). We need to use a dummy buffer instead. */
if (sctx->b.chip_class == CIK) {
sctx->null_const_buf.buffer = pipe_buffer_create(screen, 
PIPE_BIND_CONSTANT_BUFFER,
 
PIPE_USAGE_DEFAULT, 16);
if (!sctx->null_const_buf.buffer)
goto fail;
sctx->null_const_buf.buffer_size = 
sctx->null_const_buf.buffer->width0;
 
for (shader = 0; shader < SI_NUM_SHADERS; shader++) {
for (i = 0; i < SI_NUM_CONST_BUFFERS; i++) {
sctx->b.b.set_constant_buffer(&sctx->b.b, 
shader, i,
  
&sctx->null_const_buf);
}
}
 
+   si_set_rw_buffer(sctx, SI_HS_CONST_DEFAULT_TESS_LEVELS,
+&sctx->null_const_buf);
+   si_set_rw_buffer(sctx, SI_VS_CONST_CLIP_PLANES,
+&sctx->null_const_buf);
+   si_set_rw_buffer(sctx, SI_PS_CONST_POLY_STIPPLE,
+&sctx->null_const_buf);
+   si_set_rw_buffer(sctx, SI_PS_CONST_SAMPLE_POSITIONS,
+&sctx->null_const_buf);
+
/* Clear the NULL constant buffer, because loads should return 
zeros. */
sctx->b.clear_buffer(&sctx->b.b, sctx->null_const_buf.buffer, 0,
 sctx->null_const_buf.buffer->width0, 0,
 R600_COHERENCY_SHADER);
}
 
uint64_t max_threads_per_block;
screen->get_compute_param(screen, PIPE_SHADER_IR_TGSI,
  PIPE_COMPUTE_CAP_MAX_THREADS_PER_BLOCK,
  &max_threads_per_block);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/8] gallium/radeon: merge USER_SHADER and INTERNAL_SHADER priority flags

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

there's no reason to separate these
---
 src/gallium/drivers/r600/evergreen_compute.c|  2 +-
 src/gallium/drivers/r600/evergreen_state.c  |  2 +-
 src/gallium/drivers/r600/r600_state.c   |  2 +-
 src/gallium/drivers/r600/r600_state_common.c|  2 +-
 src/gallium/drivers/radeon/radeon_winsys.h  |  3 +--
 src/gallium/drivers/radeonsi/si_compute.c   |  2 +-
 src/gallium/drivers/radeonsi/si_debug.c |  3 +--
 src/gallium/drivers/radeonsi/si_state_shaders.c | 12 ++--
 8 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index 292b5e3..fe43f37 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -577,21 +577,21 @@ void evergreen_emit_cs_shader(struct r600_context *rctx,
radeon_compute_set_context_reg_seq(cs, R_0288D0_SQ_PGM_START_LS, 3);
radeon_emit(cs, va >> 8); /* R_0288D0_SQ_PGM_START_LS */
radeon_emit(cs,   /* R_0288D4_SQ_PGM_RESOURCES_LS */
S_0288D4_NUM_GPRS(ngpr)
| S_0288D4_STACK_SIZE(nstack));
radeon_emit(cs, 0); /* R_0288D8_SQ_PGM_RESOURCES_LS_2 */
 
radeon_emit(cs, PKT3C(PKT3_NOP, 0, 0));
radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx,
  code_bo, RADEON_USAGE_READ,
- RADEON_PRIO_USER_SHADER));
+ RADEON_PRIO_SHADER_BINARY));
 }
 
 static void evergreen_launch_grid(struct pipe_context *ctx,
  const struct pipe_grid_info *info)
 {
struct r600_context *rctx = (struct r600_context *)ctx;
 #ifdef HAVE_OPENCL
struct r600_pipe_compute *shader = rctx->cs_shader_state.shader;
boolean use_kill;
 
diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index 3d1a19d..11c8161 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -2149,21 +2149,21 @@ static void evergreen_emit_vertex_fetch_shader(struct 
r600_context *rctx, struct
 {
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_cso_state *state = (struct r600_cso_state*)a;
struct r600_fetch_shader *shader = (struct 
r600_fetch_shader*)state->cso;
 
radeon_set_context_reg(cs, R_0288A4_SQ_PGM_START_FS,
   (shader->buffer->gpu_address + shader->offset) 
>> 8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx, 
shader->buffer,
   RADEON_USAGE_READ,
-  
RADEON_PRIO_INTERNAL_SHADER));
+  RADEON_PRIO_SHADER_BINARY));
 }
 
 static void evergreen_emit_shader_stages(struct r600_context *rctx, struct 
r600_atom *a)
 {
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_shader_stages_state *state = (struct 
r600_shader_stages_state*)a;
 
uint32_t v = 0, v2 = 0, primid = 0, tf_param = 0;
 
if (rctx->vs_shader->current->shader.vs_as_gs_a) {
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 62b1c2c..fb2861a 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -1902,21 +1902,21 @@ static void r600_emit_sample_mask(struct r600_context 
*rctx, struct r600_atom *a
 static void r600_emit_vertex_fetch_shader(struct r600_context *rctx, struct 
r600_atom *a)
 {
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_cso_state *state = (struct r600_cso_state*)a;
struct r600_fetch_shader *shader = (struct 
r600_fetch_shader*)state->cso;
 
radeon_set_context_reg(cs, R_028894_SQ_PGM_START_FS, shader->offset >> 
8);
radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
radeon_emit(cs, radeon_add_to_buffer_list(&rctx->b, &rctx->b.gfx, 
shader->buffer,
   RADEON_USAGE_READ,
-  
RADEON_PRIO_INTERNAL_SHADER));
+  RADEON_PRIO_SHADER_BINARY));
 }
 
 static void r600_emit_shader_stages(struct r600_context *rctx, struct 
r600_atom *a)
 {
struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
struct r600_shader_stages_state *state = (struct 
r600_shader_stages_state*)a;
 
uint32_t v2 = 0, primid = 0;
 
if (rctx->vs_shader->current->shader.vs_as_gs_a) {
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 9008a4a..a5341c3 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r

[Mesa-dev] [PATCH 7/8] gallium/radeon: add a driver query for AMDGPU_INFO_NUM_EVICTIONS

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

If the kernel driver doesn't support it, it returns 0.
---
 src/gallium/drivers/radeon/r600_query.c   | 8 ++--
 src/gallium/drivers/radeon/r600_query.h   | 1 +
 src/gallium/drivers/radeon/radeon_winsys.h| 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 7 +++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 ++
 5 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 5825e8b..bd0a906 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -49,20 +49,21 @@ static void r600_query_sw_destroy(struct 
r600_common_context *rctx,
 static enum radeon_value_id winsys_id_from_type(unsigned type)
 {
switch (type) {
case R600_QUERY_REQUESTED_VRAM: return RADEON_REQUESTED_VRAM_MEMORY;
case R600_QUERY_REQUESTED_GTT: return RADEON_REQUESTED_GTT_MEMORY;
case R600_QUERY_MAPPED_VRAM: return RADEON_MAPPED_VRAM;
case R600_QUERY_MAPPED_GTT: return RADEON_MAPPED_GTT;
case R600_QUERY_BUFFER_WAIT_TIME: return RADEON_BUFFER_WAIT_TIME_NS;
case R600_QUERY_NUM_CS_FLUSHES: return RADEON_NUM_CS_FLUSHES;
case R600_QUERY_NUM_BYTES_MOVED: return RADEON_NUM_BYTES_MOVED;
+   case R600_QUERY_NUM_EVICTIONS: return RADEON_NUM_EVICTIONS;
case R600_QUERY_VRAM_USAGE: return RADEON_VRAM_USAGE;
case R600_QUERY_GTT_USAGE: return RADEON_GTT_USAGE;
case R600_QUERY_GPU_TEMPERATURE: return RADEON_GPU_TEMPERATURE;
case R600_QUERY_CURRENT_GPU_SCLK: return RADEON_CURRENT_SCLK;
case R600_QUERY_CURRENT_GPU_MCLK: return RADEON_CURRENT_MCLK;
default: unreachable("query type does not correspond to winsys id");
}
 }
 
 static bool r600_query_sw_begin(struct r600_common_context *rctx,
@@ -96,21 +97,22 @@ static bool r600_query_sw_begin(struct r600_common_context 
*rctx,
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BACK_BUFFER_PS_DRAW_RATIO:
query->begin_result = 0;
break;
case R600_QUERY_BUFFER_WAIT_TIME:
case R600_QUERY_NUM_CS_FLUSHES:
-   case R600_QUERY_NUM_BYTES_MOVED: {
+   case R600_QUERY_NUM_BYTES_MOVED:
+   case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->begin_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->begin_result = r600_gpu_load_begin(rctx->screen);
break;
case R600_QUERY_NUM_COMPILATIONS:
query->begin_result = 
p_atomic_read(&rctx->screen->num_compilations);
break;
@@ -160,21 +162,22 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
case R600_QUERY_REQUESTED_GTT:
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
case R600_QUERY_CURRENT_GPU_MCLK:
case R600_QUERY_BUFFER_WAIT_TIME:
case R600_QUERY_NUM_CS_FLUSHES:
-   case R600_QUERY_NUM_BYTES_MOVED: {
+   case R600_QUERY_NUM_BYTES_MOVED:
+   case R600_QUERY_NUM_EVICTIONS: {
enum radeon_value_id ws_id = winsys_id_from_type(query->b.type);
query->end_result = rctx->ws->query_value(rctx->ws, ws_id);
break;
}
case R600_QUERY_GPU_LOAD:
query->end_result = r600_gpu_load_end(rctx->screen,
  query->begin_result);
query->begin_result = 0;
break;
case R600_QUERY_NUM_COMPILATIONS:
@@ -1179,20 +1182,21 @@ static struct pipe_driver_query_info 
r600_driver_query_list[] = {
X("compute-calls",  COMPUTE_CALLS,  UINT64, 
AVERAGE),
X("spill-compute-calls",SPILL_COMPUTE_CALLS,UINT64, 
AVERAGE),
X("dma-calls",  DMA_CALLS,  UINT64, 
AVERAGE),
X("requested-VRAM", REQUESTED_VRAM, BYTES, AVERAGE),
X("requested-GTT",  REQUESTED_GTT,  BYTES, AVERAGE),
X("mapped-VRAM",MAPPED_VRAM,BYTES, AVERAGE),
X("mapped-GTT", MAPPED_GTT, BYTES, AVERAGE),
X("buffer-wait-time",   BUFFER_WAIT_TIME,   MICROSECONDS, 
CUMULATIVE),
X("num-cs-flushes", NUM_CS_FLUSHES, UINT64, 
AVERAGE),
X("num-bytes-moved",NUM_BYTES_MOVED,BYTES, 
CUMULATIVE),
+   X("num-evictions",

[Mesa-dev] [PATCH 2/8] gallium/radeon: increase priority for shader binaries

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/radeon_winsys.h | 2 +-
 src/gallium/drivers/radeonsi/si_debug.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index cbab406..f4e3773 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -179,21 +179,20 @@ enum radeon_bo_priority {
 RADEON_PRIO_IB2, /* IB executed with INDIRECT_BUFFER */
 RADEON_PRIO_DRAW_INDIRECT,
 RADEON_PRIO_INDEX_BUFFER,
 
 RADEON_PRIO_VCE = 8,
 RADEON_PRIO_UVD,
 RADEON_PRIO_SDMA_BUFFER,
 RADEON_PRIO_SDMA_TEXTURE,
 
 RADEON_PRIO_CP_DMA = 12,
-RADEON_PRIO_SHADER_BINARY,
 
 RADEON_PRIO_CONST_BUFFER = 16,
 RADEON_PRIO_DESCRIPTORS,
 RADEON_PRIO_BORDER_COLORS,
 
 RADEON_PRIO_SAMPLER_BUFFER = 20,
 RADEON_PRIO_VERTEX_BUFFER,
 
 RADEON_PRIO_SHADER_RW_BUFFER = 24,
 RADEON_PRIO_COMPUTE_GLOBAL,
@@ -207,20 +206,21 @@ enum radeon_bo_priority {
 
 RADEON_PRIO_DEPTH_BUFFER = 40,
 
 RADEON_PRIO_COLOR_BUFFER_MSAA = 44,
 
 RADEON_PRIO_DEPTH_BUFFER_MSAA = 48,
 
 RADEON_PRIO_CMASK = 52,
 RADEON_PRIO_DCC,
 RADEON_PRIO_HTILE,
+RADEON_PRIO_SHADER_BINARY, /* the hw can't hide instruction cache misses */
 
 RADEON_PRIO_SHADER_RINGS = 56,
 
 RADEON_PRIO_SCRATCH_BUFFER = 60,
 /* 63 is the maximum value */
 };
 
 struct winsys_handle;
 struct radeon_winsys_ctx;
 
diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index 4b500cf..7e75d3f 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -546,38 +546,38 @@ static const char *priority_to_string(enum 
radeon_bo_priority priority)
ITEM(QUERY),
ITEM(IB1),
ITEM(IB2),
ITEM(DRAW_INDIRECT),
ITEM(INDEX_BUFFER),
ITEM(VCE),
ITEM(UVD),
ITEM(SDMA_BUFFER),
ITEM(SDMA_TEXTURE),
ITEM(CP_DMA),
-   ITEM(SHADER_BINARY),
ITEM(CONST_BUFFER),
ITEM(DESCRIPTORS),
ITEM(BORDER_COLORS),
ITEM(SAMPLER_BUFFER),
ITEM(VERTEX_BUFFER),
ITEM(SHADER_RW_BUFFER),
ITEM(COMPUTE_GLOBAL),
ITEM(SAMPLER_TEXTURE),
ITEM(SHADER_RW_IMAGE),
ITEM(SAMPLER_TEXTURE_MSAA),
ITEM(COLOR_BUFFER),
ITEM(DEPTH_BUFFER),
ITEM(COLOR_BUFFER_MSAA),
ITEM(DEPTH_BUFFER_MSAA),
ITEM(CMASK),
ITEM(DCC),
ITEM(HTILE),
+   ITEM(SHADER_BINARY),
ITEM(SHADER_RINGS),
ITEM(SCRATCH_BUFFER),
};
 #undef ITEM
 
assert(priority < ARRAY_SIZE(table));
return table[priority];
 }
 
 static int bo_list_compare_va(const struct radeon_bo_list_item *a,
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/23 V2] mesa: Convert string_to_uint_map to the util hash table

2016-08-18 Thread Thomas Helland
And remove the now unused hash_table_replace.

V2: Actually do the equivalent thing, and don't leak memory

Signed-off-by: Thomas Helland 
---
 src/mesa/program/hash_table.h | 66 +++
 1 file changed, 22 insertions(+), 44 deletions(-)

diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
index 421d0e9..bccea65 100644
--- a/src/mesa/program/hash_table.h
+++ b/src/mesa/program/hash_table.h
@@ -111,8 +111,6 @@ static inline void *hash_table_find(struct hash_table *ht, 
const void *key)
  * \warning
  * The value passed by \c key is kept in the hash table and is used by later
  * calls to \c hash_table_find.
- *
- * \sa hash_table_replace
  */
 static inline void hash_table_insert(struct hash_table *ht, void *data,
  const void *key)
@@ -121,33 +119,6 @@ static inline void hash_table_insert(struct hash_table 
*ht, void *data,
 }
 
 /**
- * Add an element to a hash table with replacement
- *
- * \return
- * 1 if it did replace the value (in which case the old key is kept), 0 if it
- * did not replace the value (in which case the new key is kept).
- *
- * \warning
- * If \c key is already in the hash table, \c data will \b replace the most
- * recently inserted \c data (see the warning in \c hash_table_insert) for
- * that key.
- *
- * \sa hash_table_insert
- */
-static inline bool hash_table_replace(struct hash_table *ht, void *data,
-  const void *key)
-{
-   struct hash_entry *entry = _mesa_hash_table_search(ht, key);
-   if (entry) {
-  entry->data = data;
-  return true;
-   } else {
-  _mesa_hash_table_insert(ht, key, data);
-  return false;
-   }
-}
-
-/**
  * Remove a specific element from a hash table.
  */
 static inline void hash_table_remove(struct hash_table *ht, const void *key)
@@ -240,14 +211,14 @@ struct string_to_uint_map {
 public:
string_to_uint_map()
{
-  this->ht = hash_table_ctor(0, hash_table_string_hash,
-hash_table_string_compare);
+  this->ht = _mesa_hash_table_create(NULL, _mesa_key_hash_string,
+ _mesa_key_string_equal);
}
 
~string_to_uint_map()
{
   hash_table_call_foreach(this->ht, delete_key, NULL);
-  hash_table_dtor(this->ht);
+  _mesa_hash_table_destroy(this->ht, NULL);
}
 
/**
@@ -256,7 +227,7 @@ public:
void clear()
{
   hash_table_call_foreach(this->ht, delete_key, NULL);
-  hash_table_clear(this->ht);
+  _mesa_hash_table_clear(this->ht, NULL);
}
 
/**
@@ -290,12 +261,13 @@ public:
 */
bool get(unsigned &value, const char *key)
{
-  const intptr_t v =
-(intptr_t) hash_table_find(this->ht, (const void *) key);
+  hash_entry *entry = _mesa_hash_table_search(this->ht,
+  (const void *) key);
 
-  if (v == 0)
-return false;
+  if (!entry)
+ return false;
 
+  const intptr_t v = (intptr_t) entry->data;
   value = (unsigned)(v - 1);
   return true;
}
@@ -307,19 +279,25 @@ public:
* valid value in the table.  Bias the value by +1 so that a
* user-specified zero is stored as 1.  This enables ::get to tell the
* difference between a user-specified zero (returned as 1 by
-   * hash_table_find) and the key not in the table (returned as 0 by
-   * hash_table_find).
+   * _mesa_hash_table_search) and the key not in the table (returned as 0 
by
+   * _mesa_hash_table_find).
*
* The net effect is that we can't store UINT_MAX in the table.  This is
* because UINT_MAX+1 = 0.
*/
   assert(value != UINT_MAX);
   char *dup_key = strdup(key);
-  bool result = hash_table_replace(this->ht,
-  (void *) (intptr_t) (value + 1),
-  dup_key);
-  if (result)
-free(dup_key);
+
+  struct hash_entry *entry = _mesa_hash_table_search(this->ht, dup_key);
+  if (entry) {
+ entry->data = (void *) (intptr_t) (value + 1);
+  } else {
+ _mesa_hash_table_insert(this->ht, dup_key,
+ (void *) (intptr_t) (value + 1));
+  }
+
+  if (entry)
+ free(dup_key);
}
 
 private:
-- 
2.9.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] gallium/hud: don't enable blending for all objects

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 7870da5..0b292bc 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -57,21 +57,21 @@ static boolean huds_visible = TRUE;
 
 struct hud_context {
struct pipe_context *pipe;
struct cso_context *cso;
struct u_upload_mgr *uploader;
 
struct hud_batch_query_context *batch_query;
struct list_head pane_list;
 
/* states */
-   struct pipe_blend_state alpha_blend;
+   struct pipe_blend_state no_blend, alpha_blend;
struct pipe_depth_stencil_alpha_state dsa;
void *fs_color, *fs_text;
struct pipe_rasterizer_state rasterizer;
void *vs;
struct pipe_vertex_element velems[2];
 
/* font */
struct util_font font;
struct pipe_sampler_view *font_sampler_view;
struct pipe_sampler_state font_sampler_state;
@@ -496,21 +496,20 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
viewport.scale[0] = 0.5f * hud->fb_width;
viewport.scale[1] = 0.5f * hud->fb_height;
viewport.scale[2] = 1.0f;
viewport.translate[0] = 0.5f * hud->fb_width;
viewport.translate[1] = 0.5f * hud->fb_height;
viewport.translate[2] = 0.0f;
 
cso_set_framebuffer(cso, &fb);
cso_set_sample_mask(cso, ~0);
cso_set_min_samples(cso, 1);
-   cso_set_blend(cso, &hud->alpha_blend);
cso_set_depth_stencil_alpha(cso, &hud->dsa);
cso_set_rasterizer(cso, &hud->rasterizer);
cso_set_viewport(cso, &viewport);
cso_set_stream_outputs(cso, 0, NULL, NULL);
cso_set_tessctrl_shader_handle(cso, NULL);
cso_set_tesseval_shader_handle(cso, NULL);
cso_set_geometry_shader_handle(cso, NULL);
cso_set_vertex_shader_handle(cso, hud->vs);
cso_set_vertex_elements(cso, 2, hud->velems);
cso_set_render_condition(cso, NULL, FALSE, 0);
@@ -532,59 +531,63 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
  gr->query_new_value(gr);
   }
 
   hud_pane_accumulate_vertices(hud, pane);
}
 
/* unmap the uploader's vertex buffer before drawing */
u_upload_unmap(hud->uploader);
 
/* draw accumulated vertices for background quads */
+   cso_set_blend(cso, &hud->alpha_blend);
cso_set_fragment_shader_handle(hud->cso, hud->fs_color);
 
if (hud->bg.num_vertices) {
   hud->constants.color[0] = 0;
   hud->constants.color[1] = 0;
   hud->constants.color[2] = 0;
   hud->constants.color[3] = 0.666f;
   hud->constants.translate[0] = 0;
   hud->constants.translate[1] = 0;
   hud->constants.scale[0] = 1;
   hud->constants.scale[1] = 1;
 
   cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
   cso_set_vertex_buffers(cso, cso_get_aux_vertex_buffer_slot(cso), 1,
  &hud->bg.vbuf);
   cso_draw_arrays(cso, PIPE_PRIM_QUADS, 0, hud->bg.num_vertices);
}
pipe_resource_reference(&hud->bg.vbuf.buffer, NULL);
 
/* draw accumulated vertices for white lines */
+   cso_set_blend(cso, &hud->no_blend);
+
hud->constants.color[0] = 1;
hud->constants.color[1] = 1;
hud->constants.color[2] = 1;
hud->constants.color[3] = 1;
hud->constants.translate[0] = 0;
hud->constants.translate[1] = 0;
hud->constants.scale[0] = 1;
hud->constants.scale[1] = 1;
cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
 
if (hud->whitelines.num_vertices) {
   cso_set_vertex_buffers(cso, cso_get_aux_vertex_buffer_slot(cso), 1,
  &hud->whitelines.vbuf);
   cso_set_fragment_shader_handle(hud->cso, hud->fs_color);
   cso_draw_arrays(cso, PIPE_PRIM_LINES, 0, hud->whitelines.num_vertices);
}
pipe_resource_reference(&hud->whitelines.vbuf.buffer, NULL);
 
/* draw accumulated vertices for text */
+   cso_set_blend(cso, &hud->alpha_blend);
if (hud->text.num_vertices) {
   cso_set_vertex_buffers(cso, cso_get_aux_vertex_buffer_slot(cso), 1,
  &hud->text.vbuf);
   cso_set_fragment_shader_handle(hud->cso, hud->fs_text);
   cso_draw_arrays(cso, PIPE_PRIM_QUADS, 0, hud->text.num_vertices);
}
pipe_resource_reference(&hud->text.vbuf.buffer, NULL);
 
/* draw the rest */
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
@@ -1163,20 +1166,22 @@ hud_create(struct pipe_context *pipe, struct 
cso_context *cso)
PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
 
/* font */
if (!util_font_create(pipe, UTIL_FONT_FIXED_8X13, &hud->font)) {
   u_upload_destroy(hud->uploader);
   FREE(hud);
   return NULL;
}
 
/* blend state */
+   hud->no_blend.rt[0].colormask = PIPE_MASK_RGBA;
+
hud->alpha_blend.rt[0].colormask = PIPE_MASK_RGBA;
hud->alpha_blend.rt[0].blend_enable = 1;
hud->alpha_bl

[Mesa-dev] [PATCH 5/6] gallium/hud: generalize code for drawing numbers next to graphs

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index f7baccd..ac971d8 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -337,33 +337,35 @@ hud_draw_graph_line_strip(struct hud_context *hud, const 
struct hud_graph *gr,
 }
 
 static void
 hud_pane_accumulate_vertices(struct hud_context *hud,
  const struct hud_pane *pane)
 {
struct hud_graph *gr;
float *line_verts = hud->whitelines.vertices + 
hud->whitelines.num_vertices*2;
unsigned i, num = 0;
char str[32];
+   const unsigned last_line = 5;
 
/* draw background */
hud_draw_background_quad(hud,
 pane->x1, pane->y1,
 pane->x2, pane->y2);
 
/* draw numbers on the right-hand side */
-   for (i = 0; i < 6; i++) {
+   for (i = 0; i <= last_line; i++) {
   unsigned x = pane->x2 + 2;
-  unsigned y = pane->inner_y1 + pane->inner_height * (5 - i) / 5 -
+  unsigned y = pane->inner_y1 +
+   pane->inner_height * (last_line - i) / last_line -
hud->font.glyph_height / 2;
 
-  number_to_human_readable(pane->max_value * i / 5, pane->max_value,
+  number_to_human_readable(pane->max_value * i / last_line, 
pane->max_value,
pane->type, str);
   hud_draw_string(hud, x, y, "%s", str);
}
 
/* draw info below the pane */
i = 0;
LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
   unsigned x = pane->x1 + 2;
   unsigned y = pane->y2 + 2 + i*hud->font.glyph_height;
 
@@ -389,22 +391,23 @@ hud_pane_accumulate_vertices(struct hud_context *hud,
line_verts[num++] = (float) pane->y2;
line_verts[num++] = (float) pane->x2;
line_verts[num++] = (float) pane->y2;
 
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y1;
line_verts[num++] = (float) pane->x1;
line_verts[num++] = (float) pane->y2;
 
/* draw horizontal lines inside the graph */
-   for (i = 0; i <= 5; i++) {
-  float y = round((pane->max_value * i / 5.0) * pane->yscale + 
pane->inner_y2);
+   for (i = 0; i <= last_line; i++) {
+  float y = round((pane->max_value * i / (double)last_line) *
+  pane->yscale + pane->inner_y2);
 
   assert(hud->whitelines.num_vertices + num/2 + 2 <= 
hud->whitelines.max_num_vertices);
   line_verts[num++] = pane->x1;
   line_verts[num++] = y;
   line_verts[num++] = pane->x2;
   line_verts[num++] = y;
}
 
hud->whitelines.num_vertices += num/2;
 }
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] gallium/hud: use sRGB for nicer AA lines

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index fb9c8c6..8ab998e 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -35,20 +35,21 @@
 
 #include 
 #include 
 
 #include "hud/hud_context.h"
 #include "hud/hud_private.h"
 #include "hud/font.h"
 
 #include "cso_cache/cso_context.h"
 #include "util/u_draw_quad.h"
+#include "util/u_format.h"
 #include "util/u_inlines.h"
 #include "util/u_memory.h"
 #include "util/u_math.h"
 #include "util/u_sampler.h"
 #include "util/u_simple_shaders.h"
 #include "util/u_string.h"
 #include "util/u_upload_mgr.h"
 #include "tgsi/tgsi_text.h"
 #include "tgsi/tgsi_dump.h"
 
@@ -90,20 +91,22 @@ struct hud_context {
unsigned fb_width, fb_height;
 
/* vertices for text and background drawing are accumulated here and then
 * drawn all at once */
struct vertex_queue {
   float *vertices;
   struct pipe_vertex_buffer vbuf;
   unsigned max_num_vertices;
   unsigned num_vertices;
} text, bg, whitelines;
+
+   bool has_srgb;
 };
 
 #ifdef PIPE_OS_UNIX
 static void
 signal_visible_handler(int sig, siginfo_t *siginfo, void *context)
 {
huds_visible = !huds_visible;
 }
 #endif
 
@@ -477,20 +480,32 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
 CSO_BIT_VERTEX_SHADER |
 CSO_BIT_VERTEX_ELEMENTS |
 CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
 CSO_BIT_PAUSE_QUERIES |
 CSO_BIT_RENDER_CONDITION));
cso_save_constant_buffer_slot0(cso, PIPE_SHADER_VERTEX);
 
/* set states */
memset(&surf_templ, 0, sizeof(surf_templ));
surf_templ.format = tex->format;
+
+   /* Without this, AA lines look thinner if they are between 2 pixels
+* because the alpha is 0.5 on both pixels. (it's ugly)
+*
+* sRGB makes the width of all AA lines look the same.
+*/
+   if (hud->has_srgb) {
+  enum pipe_format srgb_format = util_format_srgb(tex->format);
+
+  if (srgb_format != PIPE_FORMAT_NONE)
+ surf_templ.format = srgb_format;
+   }
surf = pipe->create_surface(pipe, tex, &surf_templ);
 
memset(&fb, 0, sizeof(fb));
fb.nr_cbufs = 1;
fb.cbufs[0] = surf;
fb.zsbuf = NULL;
fb.width = hud->fb_width;
fb.height = hud->fb_height;
 
viewport.scale[0] = 0.5f * hud->fb_width;
@@ -1131,20 +1146,21 @@ print_help(struct pipe_screen *screen)
   }
}
 
puts("");
fflush(stdout);
 }
 
 struct hud_context *
 hud_create(struct pipe_context *pipe, struct cso_context *cso)
 {
+   struct pipe_screen *screen = pipe->screen;
struct hud_context *hud;
struct pipe_sampler_view view_templ;
unsigned i;
const char *env = debug_get_option("GALLIUM_HUD", NULL);
unsigned signo = debug_get_num_option("GALLIUM_HUD_TOGGLE_SIGNAL", 0);
 #ifdef PIPE_OS_UNIX
static boolean sig_handled = FALSE;
struct sigaction action = {};
 #endif
huds_visible = debug_get_bool_option("GALLIUM_HUD_VISIBLE", TRUE);
@@ -1166,20 +1182,25 @@ hud_create(struct pipe_context *pipe, struct 
cso_context *cso)
hud->uploader = u_upload_create(pipe, 256 * 1024,
PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
 
/* font */
if (!util_font_create(pipe, UTIL_FONT_FIXED_8X13, &hud->font)) {
   u_upload_destroy(hud->uploader);
   FREE(hud);
   return NULL;
}
 
+   hud->has_srgb = screen->is_format_supported(screen,
+   PIPE_FORMAT_B8G8R8A8_SRGB,
+   PIPE_TEXTURE_2D, 0,
+   PIPE_BIND_RENDER_TARGET) != 0;
+
/* blend state */
hud->no_blend.rt[0].colormask = PIPE_MASK_RGBA;
 
hud->alpha_blend.rt[0].colormask = PIPE_MASK_RGBA;
hud->alpha_blend.rt[0].blend_enable = 1;
hud->alpha_blend.rt[0].rgb_func = PIPE_BLEND_ADD;
hud->alpha_blend.rt[0].rgb_src_factor = PIPE_BLENDFACTOR_SRC_ALPHA;
hud->alpha_blend.rt[0].rgb_dst_factor = PIPE_BLENDFACTOR_INV_SRC_ALPHA;
hud->alpha_blend.rt[0].alpha_func = PIPE_BLEND_ADD;
hud->alpha_blend.rt[0].alpha_src_factor = PIPE_BLENDFACTOR_ZERO;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] gallium/hud: round max_value to print nicely rounded numbers next to graphs

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

This improves readability a lot.
---
 src/gallium/auxiliary/hud/hud_context.c  | 80 ++--
 src/gallium/auxiliary/hud/hud_driver_query.c |  3 +-
 src/gallium/auxiliary/hud/hud_private.h  |  1 +
 3 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index ac971d8..3453bda 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -337,21 +337,21 @@ hud_draw_graph_line_strip(struct hud_context *hud, const 
struct hud_graph *gr,
 }
 
 static void
 hud_pane_accumulate_vertices(struct hud_context *hud,
  const struct hud_pane *pane)
 {
struct hud_graph *gr;
float *line_verts = hud->whitelines.vertices + 
hud->whitelines.num_vertices*2;
unsigned i, num = 0;
char str[32];
-   const unsigned last_line = 5;
+   const unsigned last_line = pane->last_line;
 
/* draw background */
hud_draw_background_quad(hud,
 pane->x1, pane->y1,
 pane->x2, pane->y2);
 
/* draw numbers on the right-hand side */
for (i = 0; i <= last_line; i++) {
   unsigned x = pane->x2 + 2;
   unsigned y = pane->inner_y1 +
@@ -537,21 +537,21 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
cso_set_geometry_shader_handle(cso, NULL);
cso_set_vertex_shader_handle(cso, hud->vs);
cso_set_vertex_elements(cso, 2, hud->velems);
cso_set_render_condition(cso, NULL, FALSE, 0);
cso_set_sampler_views(cso, PIPE_SHADER_FRAGMENT, 1,
  &hud->font_sampler_view);
cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, 1, sampler_states);
cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
 
/* prepare vertex buffers */
-   hud_alloc_vertices(hud, &hud->bg, 4 * 128, 2 * sizeof(float));
+   hud_alloc_vertices(hud, &hud->bg, 4 * 256, 2 * sizeof(float));
hud_alloc_vertices(hud, &hud->whitelines, 4 * 256, 2 * sizeof(float));
hud_alloc_vertices(hud, &hud->text, 4 * 1024, 4 * sizeof(float));
 
/* prepare all graphs */
hud_batch_query_update(hud->batch_query);
 
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
  gr->query_new_value(gr);
   }
@@ -620,28 +620,102 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
   if (pane)
  hud_pane_draw_colored_objects(hud, pane);
}
 
cso_restore_state(cso);
cso_restore_constant_buffer_slot0(cso, PIPE_SHADER_VERTEX);
 
pipe_surface_reference(&surf, NULL);
 }
 
+static void
+fixup_bytes(enum pipe_driver_query_type type, int position, uint64_t *exp10)
+{
+   if (type == PIPE_DRIVER_QUERY_TYPE_BYTES && position % 3 == 0)
+  *exp10 = (*exp10 / 1000) * 1024;
+}
+
 /**
  * Set the maximum value for the Y axis of the graph.
  * This scales the graph accordingly.
  */
 void
 hud_pane_set_max_value(struct hud_pane *pane, uint64_t value)
 {
-   pane->max_value = value;
+   double leftmost_digit;
+   uint64_t exp10;
+   int i;
+
+   /* The following code determines the max_value in the graph as well as
+* how many describing lines are drawn. The max_value is rounded up,
+* so that all drawn numbers are rounded for readability.
+* We want to print multiples of a simple number instead of multiples of
+* hard-to-read numbers like 1.753.
+*/
+
+   /* Find the left-most digit. */
+   exp10 = 1;
+   for (i = 0; value > 9 * exp10; i++) {
+  exp10 *= 10;
+  fixup_bytes(pane->type, i + 1, &exp10);
+   }
+
+   leftmost_digit = DIV_ROUND_UP(value, exp10);
+
+   /* Round 9 to 10. */
+   if (leftmost_digit == 9) {
+  leftmost_digit = 1;
+  exp10 *= 10;
+  fixup_bytes(pane->type, i + 1, &exp10);
+   }
+
+   switch ((unsigned)leftmost_digit) {
+   case 1:
+  pane->last_line = 5; /* lines in +1/5 increments */
+  break;
+   case 2:
+  pane->last_line = 8; /* lines in +1/4 increments. */
+  break;
+   case 3:
+   case 4:
+  pane->last_line = leftmost_digit * 2; /* lines in +1/2 increments */
+  break;
+   case 5:
+   case 6:
+   case 7:
+   case 8:
+  pane->last_line = leftmost_digit; /* lines in +1 increments */
+  break;
+   default:
+  assert(0);
+   }
+
+   /* Truncate {3,4} to {2.5, 3.5} if possible. */
+   for (i = 3; i <= 4; i++) {
+  if (leftmost_digit == i && value <= (i - 0.5) * exp10) {
+ leftmost_digit = i - 0.5;
+ pane->last_line = leftmost_digit * 2; /* lines in +1/2 increments. */
+  }
+   }
+
+   /* Truncate 2 to a multiple of 0.2 in (1, 1.6] if possible. */
+   if (leftmost_digit == 2) {
+  for (i = 1; i <= 3; i++) {
+ if (value <= (1 + i*0.2) * exp10) {
+leftmost_digit = 1 + i*0.2;
+pane->last_line = 5 + i; /* lines in +1/5 increments. */
+break;
+ }
+  }
+   }
+
+   pane->max_

[Mesa-dev] [PATCH 2/6] gallium/hud: use AA lines for graphs

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

this looks a lot better (with the next patch)
---
 src/gallium/auxiliary/hud/hud_context.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 0b292bc..fb9c8c6 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -60,21 +60,21 @@ struct hud_context {
struct cso_context *cso;
struct u_upload_mgr *uploader;
 
struct hud_batch_query_context *batch_query;
struct list_head pane_list;
 
/* states */
struct pipe_blend_state no_blend, alpha_blend;
struct pipe_depth_stencil_alpha_state dsa;
void *fs_color, *fs_text;
-   struct pipe_rasterizer_state rasterizer;
+   struct pipe_rasterizer_state rasterizer, rasterizer_aa_lines;
void *vs;
struct pipe_vertex_element velems[2];
 
/* font */
struct util_font font;
struct pipe_sampler_view *font_sampler_view;
struct pipe_sampler_state font_sampler_state;
 
/* VS constant buffer */
struct {
@@ -583,20 +583,21 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
cso_set_blend(cso, &hud->alpha_blend);
if (hud->text.num_vertices) {
   cso_set_vertex_buffers(cso, cso_get_aux_vertex_buffer_slot(cso), 1,
  &hud->text.vbuf);
   cso_set_fragment_shader_handle(hud->cso, hud->fs_text);
   cso_draw_arrays(cso, PIPE_PRIM_QUADS, 0, hud->text.num_vertices);
}
pipe_resource_reference(&hud->text.vbuf.buffer, NULL);
 
/* draw the rest */
+   cso_set_rasterizer(cso, &hud->rasterizer_aa_lines);
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   if (pane)
  hud_pane_draw_colored_objects(hud, pane);
}
 
cso_restore_state(cso);
cso_restore_constant_buffer_slot0(cso, PIPE_SHADER_VERTEX);
 
pipe_surface_reference(&surf, NULL);
 }
@@ -1220,20 +1221,23 @@ hud_create(struct pipe_context *pipe, struct 
cso_context *cso)
   hud->fs_text = pipe->create_fs_state(pipe, &state);
}
 
/* rasterizer */
hud->rasterizer.half_pixel_center = 1;
hud->rasterizer.bottom_edge_rule = 1;
hud->rasterizer.depth_clip = 1;
hud->rasterizer.line_width = 1;
hud->rasterizer.line_last_pixel = 1;
 
+   hud->rasterizer_aa_lines = hud->rasterizer;
+   hud->rasterizer_aa_lines.line_smooth = 1;
+
/* vertex shader */
{
   static const char *vertex_shader_text = {
  "VERT\n"
  "DCL IN[0..1]\n"
  "DCL OUT[0], POSITION\n"
  "DCL OUT[1], COLOR[0]\n" /* color */
  "DCL OUT[2], GENERIC[0]\n" /* texcoord */
  /* [0] = color,
   * [1] = (2/fb_width, 2/fb_height, xoffset, yoffset)
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] gallium/hud: draw numbers with 3 decimal places if those aren't 0

2016-08-18 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/hud/hud_context.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 8ab998e..f7baccd 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -289,26 +289,33 @@ number_to_human_readable(uint64_t num, uint64_t max_value,
  max_unit = ARRAY_SIZE(metric_units)-1;
  units = metric_units;
   }
}
 
while (d > divisor && unit < max_unit) {
   d /= divisor;
   unit++;
}
 
-   if (d >= 100 || d == (int)d)
+   /* Round to 3 decimal places so as not to print trailing zeros. */
+   if (d*1000 != (int)(d*1000))
+  d = round(d * 1000) / 1000;
+
+   /* Show at least 4 digits with at most 3 decimal places, but not zeros. */
+   if (d >= 1000 || d == (int)d)
   sprintf(out, "%.0f%s", d, units[unit]);
-   else if (d >= 10 || d*10 == (int)(d*10))
+   else if (d >= 100 || d*10 == (int)(d*10))
   sprintf(out, "%.1f%s", d, units[unit]);
-   else
+   else if (d >= 10 || d*100 == (int)(d*100))
   sprintf(out, "%.2f%s", d, units[unit]);
+   else
+  sprintf(out, "%.3f%s", d, units[unit]);
 }
 
 static void
 hud_draw_graph_line_strip(struct hud_context *hud, const struct hud_graph *gr,
   unsigned xoffset, unsigned yoffset, float yscale)
 {
if (gr->num_vertices <= 1)
   return;
 
assert(gr->index <= gr->num_vertices);
@@ -529,21 +536,21 @@ hud_draw(struct hud_context *hud, struct pipe_resource 
*tex)
cso_set_vertex_elements(cso, 2, hud->velems);
cso_set_render_condition(cso, NULL, FALSE, 0);
cso_set_sampler_views(cso, PIPE_SHADER_FRAGMENT, 1,
  &hud->font_sampler_view);
cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, 1, sampler_states);
cso_set_constant_buffer(cso, PIPE_SHADER_VERTEX, 0, &hud->constbuf);
 
/* prepare vertex buffers */
hud_alloc_vertices(hud, &hud->bg, 4 * 128, 2 * sizeof(float));
hud_alloc_vertices(hud, &hud->whitelines, 4 * 256, 2 * sizeof(float));
-   hud_alloc_vertices(hud, &hud->text, 4 * 512, 4 * sizeof(float));
+   hud_alloc_vertices(hud, &hud->text, 4 * 1024, 4 * sizeof(float));
 
/* prepare all graphs */
hud_batch_query_update(hud->batch_query);
 
LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
   LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
  gr->query_new_value(gr);
   }
 
   hud_pane_accumulate_vertices(hud, pane);
@@ -1019,21 +1026,21 @@ hud_parse_env_var(struct hud_context *hud, const char 
*env)
 
  if (pane && pane->num_graphs) {
 LIST_ADDTAIL(&pane->head, &hud->pane_list);
 pane = NULL;
  }
  break;
 
   case ';':
  env++;
  y = 10;
- x += column_width + hud->font.glyph_width * 7;
+ x += column_width + hud->font.glyph_width * 9;
  height = 100;
 
  if (pane && pane->num_graphs) {
 LIST_ADDTAIL(&pane->head, &hud->pane_list);
 pane = NULL;
  }
 
  /* Starting a new column; reset column width. */
  column_width = 251;
  break;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 22/23 V2] glsl: Convert glcpp-parse to the util hash table

2016-08-18 Thread Thomas Helland
And change the include in glcpp.h accordingly.

V2: Whitespace fix

Signed-off-by: Thomas Helland 
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 54 ++-
 src/compiler/glsl/glcpp/glcpp.h   |  2 +-
 2 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index eff53be..4fd1448 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -278,6 +278,7 @@ control_line_success:
HASH_TOKEN DEFINE_TOKEN define
 |  HASH_TOKEN UNDEF IDENTIFIER NEWLINE {
macro_t *macro;
+   struct hash_entry *entry;
 
 /* Section 3.4 (Preprocessor) of the GLSL ES 3.00 spec says:
  *
@@ -309,9 +310,10 @@ control_line_success:
glcpp_error(& @1, parser, "Built-in (pre-defined)"
" macro names cannot be undefined.");
 
-   macro = hash_table_find (parser->defines, $3);
-   if (macro) {
-   hash_table_remove (parser->defines, $3);
+   entry = _mesa_hash_table_search (parser->defines, $3);
+   if (entry) {
+   macro = entry->data;
+   _mesa_hash_table_remove (parser->defines, entry);
ralloc_free (macro);
}
ralloc_free ($3);
@@ -348,13 +350,16 @@ control_line_success:
_glcpp_parser_skip_stack_push_if (parser, & @1, 0);
}
 |  HASH_TOKEN IFDEF IDENTIFIER junk NEWLINE {
-   macro_t *macro = hash_table_find (parser->defines, $3);
+   struct hash_entry *entry =
+   _mesa_hash_table_search(parser->defines, $3);
+   macro_t *macro = entry ? entry->data : NULL;
ralloc_free ($3);
_glcpp_parser_skip_stack_push_if (parser, & @1, macro != NULL);
}
 |  HASH_TOKEN IFNDEF IDENTIFIER junk NEWLINE {
-   macro_t *macro = hash_table_find (parser->defines, $3);
-   ralloc_free ($3);
+   struct hash_entry *entry =
+   _mesa_hash_table_search(parser->defines, $3);
+   macro_t *macro = entry ? entry->data : NULL;
_glcpp_parser_skip_stack_push_if (parser, & @3, macro == NULL);
}
 |  HASH_TOKEN ELIF pp_tokens NEWLINE {
@@ -1342,8 +1347,8 @@ glcpp_parser_create(glcpp_extension_iterator extensions, 
void *state, gl_api api
parser = ralloc (NULL, glcpp_parser_t);
 
glcpp_lex_init_extra (parser, &parser->scanner);
-   parser->defines = hash_table_ctor(32, hash_table_string_hash,
- hash_table_string_compare);
+   parser->defines = _mesa_hash_table_create(NULL, _mesa_key_hash_string,
+ _mesa_key_string_equal);
parser->active = NULL;
parser->lexing_directive = 0;
parser->space_tokens = 1;
@@ -1384,7 +1389,7 @@ void
 glcpp_parser_destroy(glcpp_parser_t *parser)
 {
glcpp_lex_destroy (parser->scanner);
-   hash_table_dtor (parser->defines);
+   _mesa_hash_table_destroy(parser->defines, NULL);
ralloc_free (parser);
 }
 
@@ -1563,8 +1568,8 @@ _glcpp_parser_evaluate_defined(glcpp_parser_t *parser, 
token_node_t *node,
 
*last = node;
 
-   return hash_table_find(parser->defines,
-  argument->token->value.str) ? 1 : 0;
+   return _mesa_hash_table_search(parser->defines,
+  argument->token->value.str) ? 1 : 0;
 
 FAIL:
glcpp_error (&defined->token->location, parser,
@@ -1705,6 +1710,7 @@ static token_list_t *
 _glcpp_parser_expand_function(glcpp_parser_t *parser, token_node_t *node,
   token_node_t **last, expansion_mode_t mode)
 {
+   struct hash_entry *entry;
macro_t *macro;
const char *identifier;
argument_list_t *arguments;
@@ -1714,7 +1720,8 @@ _glcpp_parser_expand_function(glcpp_parser_t *parser, 
token_node_t *node,
 
identifier = node->token->value.str;
 
-   macro = hash_table_find(parser->defines, identifier);
+   entry = _mesa_hash_table_search(parser->defines, identifier);
+   macro = entry ? entry->data : NULL;
 
assert(macro->is_function);
 
@@ -1811,6 +1818,7 @@ _glcpp_parser_expand_node(glcpp_parser_t *parser, 
token_node_t *node,
 {
token_t *token = node->token;
const char *identifier;
+   struct hash_entry *entry;
macro_t *macro;
 
/* We only expand identifiers */
@@ -1830,7 +1838,8 @@ _glcpp_parser_expand_node(glcpp_parser_t *parser, 
token_node_t *node,
   return _token_list_create_with_one_integer(parser, 
node->token->location.source);
 
/* Look up this identifier in the hash table. */
-   macro = hash_table_find(parser->defines, identifier);
+   entry = _mesa_hash_table_search(parser->defines, identifier);
+   macro = entry ? entry->dat

Re: [Mesa-dev] [RFC] st: guard against NULL pipe_surface

2016-08-18 Thread Marek Olšák
On Thu, Aug 18, 2016 at 2:41 PM, Tobias Klausmann
 wrote:
> I have no test case per se, but orbea (in CC) noted in IRC:
>
> Start OpenMW with DRI3 -> Crash [1]
>
> Start OpenMW with DRI2 -> no crash,
>
> So i fear it is somewhere in our DRI3 path. Anyway a guard seems reasonable
> to harden release builds against this. Maybe not at this place but central
> in update_framebuffer_size().

Yeah, it would be very useful if we could reproduce this somehow.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/23] mesa: Replace hashing functions of prog_hash_table

2016-08-18 Thread Thomas Helland
2016-08-17 3:02 GMT+02:00 Timothy Arceri :
> On Tue, 2016-08-16 at 22:10 +0200, Thomas Helland wrote:
>> This will make it functionally equivalent to the one in util.
>> This enables us to do a step-by-step replacement of the table.
>>
>> Signed-off-by: Thomas Helland 
>
> You should really just move these to the header since you are moving
> them there in the next patch anyway. Otherwise this is just code churn.
>

An argument can definitely be made for that. I didn't really like the code
churn myself, either, but ended with the conclusion that it is easier to
squash than to split. So whoever commits the series when it's reviewed
can squash the patches before pushing upstream.

>> ---
>>  src/mesa/program/prog_hash_table.c | 22 --
>>  1 file changed, 8 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/program/prog_hash_table.c
>> b/src/mesa/program/prog_hash_table.c
>> index cbea74c..a712175 100644
>> --- a/src/mesa/program/prog_hash_table.c
>> +++ b/src/mesa/program/prog_hash_table.c
>> @@ -35,33 +35,27 @@
>>  unsigned
>>  hash_table_string_hash(const void *key)
>>  {
>> -const char *str = (const char *) key;
>> -unsigned hash = 5381;
>> -
>> -
>> -while (*str != '\0') {
>> -hash = (hash * 33) + *str;
>> -str++;
>> -}
>> -
>> -return hash;
>> +   const char *str = (const char *) key;
>> +   uint32_t hash = _mesa_hash_string(str);
>> +   return hash;
>>  }
>>
>> -bool hash_table_string_compare(const void *a, const void *b)
>> +bool
>> +hash_table_string_compare(const void *a, const void *b)
>>  {
>> -   return strcmp(a, b) == 0;
>> +   return _mesa_key_string_equal(a, b);
>>  }
>>
>>
>>  unsigned
>>  hash_table_pointer_hash(const void *key)
>>  {
>> -   return (unsigned)((uintptr_t) key / sizeof(void *));
>> +   return _mesa_hash_pointer(key);
>>  }
>>
>>
>>  bool
>>  hash_table_pointer_compare(const void *key1, const void *key2)
>>  {
>> -   return key1 == key2;
>> +   return _mesa_key_pointer_equal(key1, key2);
>>  }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Use nir_shader_get_entrypoint in TCS quad workaround code.

2016-08-18 Thread Jason Ekstrand
Thanks.  Rb

On Aug 18, 2016 19:46, "Kenneth Graunke"  wrote:

> We want to insert the code at the end of the program.  Looping over
> all the functions (of which there was only one) was the old way of doing
> this, but now we have nir_shader_get_entrypoint(), so let's use it.
>
> Suggested by Connor Abbott.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 23
> +++---
>  1 file changed, 11 insertions(+), 12 deletions(-)
>
> Sorry, I pushed the patch rather quickly...here's a follow-on to use
> nir_shader_get_entrypoint().
>
> Shouldn't be also be using it in nir_lower_gs_intrinsics.c?
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> index 0626981..ac4f9e0 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> @@ -134,19 +134,18 @@ brw_nir_apply_tcs_quads_workaround(nir_shader *nir)
>  {
> assert(nir->stage == MESA_SHADER_TESS_CTRL);
>
> -   nir_foreach_function(func, nir) {
> -  if (!func->impl)
> - continue;
> +   nir_function *func = nir_shader_get_entrypoint(nir);
> +   if (!func->impl)
> +  return;
>
> -  nir_builder b;
> -  nir_builder_init(&b, func->impl);
> +   nir_builder b;
> +   nir_builder_init(&b, func->impl);
>
> -  struct set_entry *entry;
> -  set_foreach(func->impl->end_block->predecessors, entry) {
> - nir_block *pred = (nir_block *) entry->key;
> - emit_quads_workaround(&b, pred);
> -  }
> -
> -  nir_metadata_preserve(func->impl, 0);
> +   struct set_entry *entry;
> +   set_foreach(func->impl->end_block->predecessors, entry) {
> +  nir_block *pred = (nir_block *) entry->key;
> +  emit_quads_workaround(&b, pred);
> }
> +
> +   nir_metadata_preserve(func->impl, 0);
>  }
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Use nir_shader_get_entrypoint in TCS quad workaround code.

2016-08-18 Thread Connor Abbott
On Thu, Aug 18, 2016 at 2:46 PM, Kenneth Graunke  wrote:
> We want to insert the code at the end of the program.  Looping over
> all the functions (of which there was only one) was the old way of doing
> this, but now we have nir_shader_get_entrypoint(), so let's use it.
>
> Suggested by Connor Abbott.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 23 
> +++---
>  1 file changed, 11 insertions(+), 12 deletions(-)
>
> Sorry, I pushed the patch rather quickly...here's a follow-on to use
> nir_shader_get_entrypoint().
>
> Shouldn't be also be using it in nir_lower_gs_intrinsics.c?
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c 
> b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> index 0626981..ac4f9e0 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> @@ -134,19 +134,18 @@ brw_nir_apply_tcs_quads_workaround(nir_shader *nir)
>  {
> assert(nir->stage == MESA_SHADER_TESS_CTRL);
>
> -   nir_foreach_function(func, nir) {
> -  if (!func->impl)
> - continue;
> +   nir_function *func = nir_shader_get_entrypoint(nir);
> +   if (!func->impl)
> +  return;

Minor nit: I think you can drop the if, since it doesn't make sense to
have an entrypoint with only a declaration and no implementation.
Other than that,

Reviewed-by: Connor Abbott 

What about my other comment though?

>
> -  nir_builder b;
> -  nir_builder_init(&b, func->impl);
> +   nir_builder b;
> +   nir_builder_init(&b, func->impl);
>
> -  struct set_entry *entry;
> -  set_foreach(func->impl->end_block->predecessors, entry) {
> - nir_block *pred = (nir_block *) entry->key;
> - emit_quads_workaround(&b, pred);
> -  }
> -
> -  nir_metadata_preserve(func->impl, 0);
> +   struct set_entry *entry;
> +   set_foreach(func->impl->end_block->predecessors, entry) {
> +  nir_block *pred = (nir_block *) entry->key;
> +  emit_quads_workaround(&b, pred);
> }
> +
> +   nir_metadata_preserve(func->impl, 0);
>  }
> --
> 2.9.3
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Use nir_shader_get_entrypoint in TCS quad workaround code.

2016-08-18 Thread Jason Ekstrand
On Aug 18, 2016 23:02, "Connor Abbott"  wrote:
>
> On Thu, Aug 18, 2016 at 2:46 PM, Kenneth Graunke 
wrote:
> > We want to insert the code at the end of the program.  Looping over
> > all the functions (of which there was only one) was the old way of doing
> > this, but now we have nir_shader_get_entrypoint(), so let's use it.
> >
> > Suggested by Connor Abbott.
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 23
+++---
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> >
> > Sorry, I pushed the patch rather quickly...here's a follow-on to use
> > nir_shader_get_entrypoint().
> >
> > Shouldn't be also be using it in nir_lower_gs_intrinsics.c?
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> > index 0626981..ac4f9e0 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
> > @@ -134,19 +134,18 @@ brw_nir_apply_tcs_quads_workaround(nir_shader
*nir)
> >  {
> > assert(nir->stage == MESA_SHADER_TESS_CTRL);
> >
> > -   nir_foreach_function(func, nir) {
> > -  if (!func->impl)
> > - continue;
> > +   nir_function *func = nir_shader_get_entrypoint(nir);
> > +   if (!func->impl)
> > +  return;
>
> Minor nit: I think you can drop the if, since it doesn't make sense to
> have an entrypoint with only a declaration and no implementation.
> Other than that,

Mind if we split the difference and make it an assert?  We can probably add
said assert too the get_entrypoint helper.

> Reviewed-by: Connor Abbott 
>
> What about my other comment though?
>
> >
> > -  nir_builder b;
> > -  nir_builder_init(&b, func->impl);
> > +   nir_builder b;
> > +   nir_builder_init(&b, func->impl);
> >
> > -  struct set_entry *entry;
> > -  set_foreach(func->impl->end_block->predecessors, entry) {
> > - nir_block *pred = (nir_block *) entry->key;
> > - emit_quads_workaround(&b, pred);
> > -  }
> > -
> > -  nir_metadata_preserve(func->impl, 0);
> > +   struct set_entry *entry;
> > +   set_foreach(func->impl->end_block->predecessors, entry) {
> > +  nir_block *pred = (nir_block *) entry->key;
> > +  emit_quads_workaround(&b, pred);
> > }
> > +
> > +   nir_metadata_preserve(func->impl, 0);
> >  }
> > --
> > 2.9.3
> >
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Use nir_shader_get_entrypoint in TCS quad workaround code.

2016-08-18 Thread Connor Abbott
On Thu, Aug 18, 2016 at 6:12 PM, Jason Ekstrand  wrote:
> On Aug 18, 2016 23:02, "Connor Abbott"  wrote:
>>
>> On Thu, Aug 18, 2016 at 2:46 PM, Kenneth Graunke 
>> wrote:
>> > We want to insert the code at the end of the program.  Looping over
>> > all the functions (of which there was only one) was the old way of doing
>> > this, but now we have nir_shader_get_entrypoint(), so let's use it.
>> >
>> > Suggested by Connor Abbott.
>> >
>> > Signed-off-by: Kenneth Graunke 
>> > ---
>> >  .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 23
>> > +++---
>> >  1 file changed, 11 insertions(+), 12 deletions(-)
>> >
>> > Sorry, I pushed the patch rather quickly...here's a follow-on to use
>> > nir_shader_get_entrypoint().
>> >
>> > Shouldn't be also be using it in nir_lower_gs_intrinsics.c?
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
>> > b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
>> > index 0626981..ac4f9e0 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_nir_tcs_workarounds.c
>> > @@ -134,19 +134,18 @@ brw_nir_apply_tcs_quads_workaround(nir_shader
>> > *nir)
>> >  {
>> > assert(nir->stage == MESA_SHADER_TESS_CTRL);
>> >
>> > -   nir_foreach_function(func, nir) {
>> > -  if (!func->impl)
>> > - continue;
>> > +   nir_function *func = nir_shader_get_entrypoint(nir);
>> > +   if (!func->impl)
>> > +  return;
>>
>> Minor nit: I think you can drop the if, since it doesn't make sense to
>> have an entrypoint with only a declaration and no implementation.
>> Other than that,
>
> Mind if we split the difference and make it an assert?  We can probably add
> said assert too the get_entrypoint helper.

Sure, sounds good.

>
>> Reviewed-by: Connor Abbott 
>>
>> What about my other comment though?
>>
>> >
>> > -  nir_builder b;
>> > -  nir_builder_init(&b, func->impl);
>> > +   nir_builder b;
>> > +   nir_builder_init(&b, func->impl);
>> >
>> > -  struct set_entry *entry;
>> > -  set_foreach(func->impl->end_block->predecessors, entry) {
>> > - nir_block *pred = (nir_block *) entry->key;
>> > - emit_quads_workaround(&b, pred);
>> > -  }
>> > -
>> > -  nir_metadata_preserve(func->impl, 0);
>> > +   struct set_entry *entry;
>> > +   set_foreach(func->impl->end_block->predecessors, entry) {
>> > +  nir_block *pred = (nir_block *) entry->key;
>> > +  emit_quads_workaround(&b, pred);
>> > }
>> > +
>> > +   nir_metadata_preserve(func->impl, 0);
>> >  }
>> > --
>> > 2.9.3
>> >
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/16] glsl: Add a lowering pass to handle advanced blending modes.

2016-08-18 Thread Francisco Jerez
Kenneth Graunke  writes:

> Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
> need to emulate it in the pixel shader.  This lowering pass implements
> all the necessary math for advanced blending.  It fetches the existing
> framebuffer value using the MESA_shader_framebuffer_fetch built-in
> variables, and the previous commit's state var uniform to select
> which equation to use.
>
> This is done at the GLSL IR level to make it easy for all drivers to
> implement the GL_KHR_blend_equation_advanced extension and share code.
>
> Drivers need to hook up MESA_shader_framebuffer_fetch functionality:
> 1. Hook up the fb_fetch_output variable
> 2. Implement BlendBarrier()
>
> Then to get KHR_blend_equation_advanced, they simply need to:
> 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled
> 4. Call this lowering pass.
>
> Very little driver specific code should be required.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/Makefile.sources  |   1 +
>  src/compiler/glsl/ir_optimization.h|   1 +
>  .../glsl/lower_blend_equation_advanced.cpp | 523 
> +
>  3 files changed, 525 insertions(+)
>  create mode 100644 src/compiler/glsl/lower_blend_equation_advanced.cpp
>
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 0ff9b23..fe26132 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -77,6 +77,7 @@ LIBGLSL_FILES = \
>   glsl/loop_analysis.h \
>   glsl/loop_controls.cpp \
>   glsl/loop_unroll.cpp \
> + glsl/lower_blend_equation_advanced.cpp \
>   glsl/lower_buffer_access.cpp \
>   glsl/lower_buffer_access.h \
>   glsl/lower_const_arrays_to_uniforms.cpp \
> diff --git a/src/compiler/glsl/ir_optimization.h 
> b/src/compiler/glsl/ir_optimization.h
> index c29260a..3bd6928 100644
> --- a/src/compiler/glsl/ir_optimization.h
> +++ b/src/compiler/glsl/ir_optimization.h
> @@ -151,6 +151,7 @@ void optimize_dead_builtin_variables(exec_list 
> *instructions,
>  bool lower_tess_level(gl_linked_shader *shader);
>  
>  bool lower_vertex_id(gl_linked_shader *shader);
> +bool lower_blend_equation_advanced(gl_linked_shader *shader);
>  
>  bool lower_subroutine(exec_list *instructions, struct _mesa_glsl_parse_state 
> *state);
>  void propagate_invariance(exec_list *instructions);
> diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp 
> b/src/compiler/glsl/lower_blend_equation_advanced.cpp
> new file mode 100644
> index 000..91a11f3
> --- /dev/null
> +++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp
> @@ -0,0 +1,523 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "ir.h"
> +#include "ir_builder.h"
> +#include "ir_optimization.h"
> +#include "ir_hierarchical_visitor.h"
> +#include "program/prog_instruction.h"
> +#include "program/prog_statevars.h"
> +#include "util/bitscan.h"
> +
> +namespace {
> +
> +using namespace ir_builder;
> +
> +class lower_blend_visitor : public ir_hierarchical_visitor {
> +public:
> +   explicit lower_blend_visitor(gl_linked_shader *sh)
> +  : sh(sh), mem_ctx(ralloc_parent(sh->ir)), progress(false)
> +   {
> +  fb = new(mem_ctx) ir_variable(glsl_type::vec4_type, "__blend_fb_fetch",
> +ir_var_shader_out);
> +  fb->data.location = FRAG_RESULT_DATA0;
> +  fb->data.read_only = 1;
> +  fb->data.fb_fetch_output = 1;
> +

Does this variable need to be marked ir_var_hidden too?

> +  mode = new(mem_ctx) ir_variable(glsl_type::uint_type,
> +  "gl_AdvancedBlendModeMESA",
> +  ir_var_uniform);
> +  mode->data.how_declared = ir_var_hidden;
> +  mode->allocate_state_slots(1);
> +  ir

[Mesa-dev] [PATCH 2/4] travis: Update to the Ubuntu Trusty image.

2016-08-18 Thread Eric Anholt
This will hopefully fix wget from x.org (no real reason explained in
Travis CI bug reports), and may also mean that we can enable LLVM driver
builds.
---
 .travis.yml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 843a9bcdc2ee..6a22e9595fe0 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,6 +1,7 @@
 language: c
 
-sudo: false
+sudo: true
+dist: trusty
 
 cache:
   directories:
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.

2016-08-18 Thread Eric Anholt
---
 .travis.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 6a22e9595fe0..e086173c529f 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -89,7 +89,7 @@ install:
 
   - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
   - tar -jxvf $LIBDRM_VERSION.tar.bz2
-  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)
+  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 && 
make install)
 
   - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
   - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.

2016-08-18 Thread Eric Anholt
---
 .travis.yml | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index e086173c529f..5f489a47fb79 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -16,7 +16,11 @@ addons:
   - libexpat1-dev
   - libxcb-dri2-0-dev
   - libx11-xcb-dev
-  - llvm-3.4-dev
+  - llvm-3.5-dev
+  # llvm-config is not in the dev package?
+  - llvm-3.5
+  # LLVM packaging is broken and misses this dep.
+  - libedit-dev
   - scons
 
 env:
@@ -95,16 +99,13 @@ install:
   - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
   - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make 
install)
 
-# Disabled LLVM (and therefore r300 and r600) because the build fails
-# with "undefined reference to `clock_gettime'" and "undefined
-# reference to `setupterm'" in llvmpipe.
 script:
   - if test "x$BUILD" = xmake; then
   ./autogen.sh --enable-debug
---disable-gallium-llvm
 --with-egl-platforms=x11,drm
 --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau
---with-gallium-drivers=svga,swrast,vc4,virgl
+--with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600
+--disable-llvm-shared-libs
 ;
   make && make check;
 elif test x$BUILD = xscons; then
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] travis: Parse configure.ac to pick an updated LIBDRM_VERSION.

2016-08-18 Thread Eric Anholt
Travis has been broken a couple of times by configure.ac updates.  To make
it useful, auto-update the version necessary.

This could potentially be used for other dependencies, too, but those get
bumped less frequently.
---
 .travis.yml | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index da1d81e1b462..843a9bcdc2ee 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -41,6 +41,16 @@ install:
   - export PATH="/usr/lib/ccache:$PATH"
   - pip install --user mako
 
+  # Since libdrm gets updated in configure.ac regularly, try to pick up the
+  # latest version from there.
+  - for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do
+  old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;
+  new_ver=`echo $line | sed 's/.*REQUIRED=//'`;
+  if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then
+export LIBDRM_VERSION="libdrm-$new_ver";
+  fi;
+done
+
   # Install dependencies where we require specific versions (or where
   # disallowed by Travis CI's package whitelisting).
 
-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] Travis CI build fixes

2016-08-18 Thread Eric Anholt
Here are a few fixes to get Travis CI builds on github working again.

As a reminder: If your personal branches of Mesa are on github, you
can enable it in your github account, then go to travis-ci.com, login,
go to your profile settings there, and check the Mesa repository.  Any
further git pushes will get a make check and a scons bulid run on
them, across a variety of drivers.

Eric Anholt (4):
  travis: Parse configure.ac to pick an updated LIBDRM_VERSION.
  travis: Update to the Ubuntu Trusty image.
  travis: Enable vc4 in libdrm to satisfy vc4 test build dependency.
  travis: Upgrade LLVM dependency to 3.5 and enable LLVM drivers.

 .travis.yml | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

-- 
2.8.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/16] mesa: Allow advanced blending enums in glBlendEquation[i].

2016-08-18 Thread Francisco Jerez
One comment not strictly related to this patch (or any other in
particular): Apparently the spec requires some amount of draw-time error
checking you don't seem to have implemented anywhere in this series,
e.g.:

|If any non-NONE draw buffer uses a blend equation found in table X.1
|or X.2, the error INVALID_OPERATION is generated by Begin or any
|operation that implicitly calls Begin (such as DrawElements) if:
|
|  * the draw buffer for color output zero selects multiple color
|buffers (e.g., FRONT_AND_BACK in the default framebuffer); or
|
|  * the draw buffer for any other color output is not NONE.

and:

|If the current blend equation is found in table X.1 or X.2, and the
|active fragment shader does not include the layout qualifier
|matching the blend equation or "blend_support_all_equations", the
|error INVALID_OPERATION is generated by [...]

Francisco Jerez  writes:

> Kenneth Graunke  writes:
>
>> Don't allow them in glBlendEquationSeparate[i], though, as required
>> by the spec.
>>
>> Signed-off-by: Kenneth Graunke 
>
> Reviewed-by: Francisco Jerez 
>
>> ---
>>  src/mesa/main/blend.c | 64 
>> +++
>>  1 file changed, 54 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/main/blend.c b/src/mesa/main/blend.c
>> index 2ae22e9..fe83e59 100644
>> --- a/src/mesa/main/blend.c
>> +++ b/src/mesa/main/blend.c
>> @@ -336,11 +336,11 @@ _mesa_BlendFuncSeparateiARB(GLuint buf, GLenum 
>> sfactorRGB, GLenum dfactorRGB,
>>  
>>  
>>  /**
>> - * Check if given blend equation is legal.
>> - * \return GL_TRUE if legal, GL_FALSE otherwise.
>> + * Return true if \p mode is a legal blending equation, excluding
>> + * GL_KHR_blend_equation_advanced modes.
>>   */
>> -static GLboolean
>> -legal_blend_equation(const struct gl_context *ctx, GLenum mode)
>> +static bool
>> +legal_simple_blend_equation(const struct gl_context *ctx, GLenum mode)
>>  {
>> switch (mode) {
>> case GL_FUNC_ADD:
>> @@ -356,6 +356,36 @@ legal_blend_equation(const struct gl_context *ctx, 
>> GLenum mode)
>>  }
>>  
>>  
>> +/**
>> + * Return true if \p mode is one of the advanced blending equations
>> + * defined by GL_KHR_blend_equation_advanced.
>> + */
>> +static bool
>> +legal_advanced_blend_equation(const struct gl_context *ctx, GLenum mode)
>> +{
>> +   switch (mode) {
>> +   case GL_MULTIPLY_KHR:
>> +   case GL_SCREEN_KHR:
>> +   case GL_OVERLAY_KHR:
>> +   case GL_DARKEN_KHR:
>> +   case GL_LIGHTEN_KHR:
>> +   case GL_COLORDODGE_KHR:
>> +   case GL_COLORBURN_KHR:
>> +   case GL_HARDLIGHT_KHR:
>> +   case GL_SOFTLIGHT_KHR:
>> +   case GL_DIFFERENCE_KHR:
>> +   case GL_EXCLUSION_KHR:
>> +   case GL_HSL_HUE_KHR:
>> +   case GL_HSL_SATURATION_KHR:
>> +   case GL_HSL_COLOR_KHR:
>> +   case GL_HSL_LUMINOSITY_KHR:
>> +  return _mesa_has_KHR_blend_equation_advanced(ctx);
>> +   default:
>> +  return false;
>> +   }
>> +}
>> +
>> +
>>  /* This is really an extension function! */
>>  void GLAPIENTRY
>>  _mesa_BlendEquation( GLenum mode )
>> @@ -390,7 +420,8 @@ _mesa_BlendEquation( GLenum mode )
>> if (!changed)
>>return;
>>  
>> -   if (!legal_blend_equation(ctx, mode)) {
>> +   if (!legal_simple_blend_equation(ctx, mode) &&
>> +   !legal_advanced_blend_equation(ctx, mode)) {
>>_mesa_error(ctx, GL_INVALID_ENUM, "glBlendEquation");
>>return;
>> }
>> @@ -426,7 +457,8 @@ _mesa_BlendEquationiARB(GLuint buf, GLenum mode)
>>return;
>> }
>>  
>> -   if (!legal_blend_equation(ctx, mode)) {
>> +   if (!legal_simple_blend_equation(ctx, mode) &&
>> +   !legal_advanced_blend_equation(ctx, mode)) {
>>_mesa_error(ctx, GL_INVALID_ENUM, "glBlendEquationi");
>>return;
>> }
>> @@ -482,12 +514,18 @@ _mesa_BlendEquationSeparate( GLenum modeRGB, GLenum 
>> modeA )
>>return;
>> }
>>  
>> -   if (!legal_blend_equation(ctx, modeRGB)) {
>> +   /* Only allow simple blending equations.
>> +* The GL_KHR_blend_equation_advanced spec says:
>> +*
>> +*"NOTE: These enums are not accepted by the  or 
>> +* parameters of BlendEquationSeparate or BlendEquationSeparatei."
>> +*/
>> +   if (!legal_simple_blend_equation(ctx, modeRGB)) {
>>_mesa_error(ctx, GL_INVALID_ENUM, 
>> "glBlendEquationSeparateEXT(modeRGB)");
>>return;
>> }
>>  
>> -   if (!legal_blend_equation(ctx, modeA)) {
>> +   if (!legal_simple_blend_equation(ctx, modeA)) {
>>_mesa_error(ctx, GL_INVALID_ENUM, 
>> "glBlendEquationSeparateEXT(modeA)");
>>return;
>> }
>> @@ -524,12 +562,18 @@ _mesa_BlendEquationSeparateiARB(GLuint buf, GLenum 
>> modeRGB, GLenum modeA)
>>return;
>> }
>>  
>> -   if (!legal_blend_equation(ctx, modeRGB)) {
>> +   /* Only allow simple blending equations.
>> +* The GL_KHR_blend_equation_advanced spec says:
>> +*
>> +*"NOTE: These enums are not accepted by the  or 
>> +* parameters of BlendEquationSepar

Re: [Mesa-dev] [PATCH 6/6] gallium/hud: round max_value to print nicely rounded numbers next to graphs

2016-08-18 Thread Brian Paul

On 08/18/2016 01:56 PM, Marek Olšák wrote:

From: Marek Olšák 

This improves readability a lot.


Series LGTM.

Reviewed-by: Brian Paul 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965/cfg: Ignore non-CF instructions in unreachable blocks.

2016-08-18 Thread Matt Turner
On Thu, Aug 18, 2016 at 10:21 AM, Matt Turner  wrote:
> On Thu, Aug 18, 2016 at 3:43 AM, Iago Toral  wrote:
>> On Wed, 2016-08-17 at 11:54 -0700, Matt Turner wrote:
>>> The basic block following a control flow structure like an infinite
>>> loop
>>> will be unreachable. Ignore any non-control-flow instructions in it
>>> since they can have no effect on the program.
>>
>> If the block is unreachable control-flow instructions inside the block
>> are also irrelevant, is there any reason why you don't skip CF
>> instructions too?
>
> I think that could lead to further problems. For instance, if we had
>
> START B47
>   do
>   break
> END B47
> START B48
>   while
> END B48
>
> B48 would be unreachable, but I think emitting a "do" instruction
> without a "while" might cause problems in the generator
> (brw_find_loop_end, brw_find_next_block_end).
>
>>> Avoids a segmentation fault in cfg_t::intersect(a, b) when called on
>>> an
>>> unreachable block. By avoiding ever putting code in an unreachable
>>> block, we never attempt to optimize code in an unreachable block.
>>
>> Can't the problem persist if the unreachable block has any control-flow
>> instructions?
>
> I don't think so. The problem involved finding a constant
> (fs_visitor::opt_combine_constants) that was in both a reachable and
> an unreachable block. When a constant is in two different blocks, the
> code finds the common ancestor of both blocks (cfg_t::intersect) and
> places the constant in it. If one of the blocks is unreachable, it
> will not have an immediate dominator, leading to cfg_t::intersect
> segfaulting. Since control flow instructions do not take regular
> sources (and not immediates), they should pose no problem.
>
> The background is that Jason noticed a segfault when enabling GCM/GVN
> in NIR in a single piglit test
> (tests/glslparsertest/shaders/CorrectFull.frag). The test is a silly
> parser test that contains an infinite loop. After enabling GVN, a
> constant was used in both a reachable and unreachable block, leading
> to the segfault.
>
> My fix is to not put instructions into unreachable basic blocks. :)

I talked with Curro. He's right that this isn't sufficient (one could
have an unreachable block with many unreachable descendants full of
instructions).

I'll drop this patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 43/95] i965/disasm: print NibCtrl for instructions with execsize 4

2016-08-18 Thread Francisco Jerez
Iago Toral  writes:

> On Mon, 2016-08-08 at 15:58 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga  writes:
>> 
>> > 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++-
>> >  1 file changed, 7 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
>> > b/src/mesa/drivers/dri/i965/brw_disasm.c
>> > index c8bdeab..d5e9916 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_disasm.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
>> > @@ -1169,7 +1169,13 @@ qtr_ctrl(FILE *file, const struct
>> > brw_device_info *devinfo, brw_inst *inst)
>> > int qtr_ctl = brw_inst_qtr_control(devinfo, inst);
>> > int exec_size = 1 << brw_inst_exec_size(devinfo, inst);
>> >  
>> > -   if (exec_size == 8) {
>> > +   if (exec_size == 4) {
>> I guess it wouldn't hurt to show this for exec_size < 4 too even
>> though
>> it will typically be 1N.
>
> The PRMs say:
>
> "NibCtrl is only allowed for SIMD4 instructions with a DF (Double
> Float) source or destination type"
>
> I don't know if the "only allowed" in that sentence means that it is
> incorrect to set it to to anything else in instructions with an exec
> size < 4, in which case showing it in the disassembly would help find
> bugs although maybe we should just validate for that when we generate
> code, or if it means that the hardware will ignore it completely, in
> which case showing its value does not seem useful for anything.
>
> What do you think?

It's not terribly important, but the main reason I suggested changing
the condition to do the same thing for exec_size < 4 is that it
currently shows no group control information whatsoever for such
instructions.  Regardless of whether the hardware actually does the
right thing or not (the hardware spec is rather ambiguous about it) it
wouldn't hurt to print the information, this is just a disassembler, not
a validator, right?

>
>> > 
>> > +  int nib_ctl = brw_inst_nib_control(devinfo, inst);
>> This may cause an assertion failure on SNB and earlier because the
>> NibCtrl field doesn't exist.  You could do something along the lines
>> of:
>> 
>> > 
>> > const unsigned nib_ctl = devinfo->gen < 7 ? 0 :
>> >  brw_inst_nib_control(devinfo, inst);
>
> Ah, good to know, thanks!
>
>> > +  if (nib_ctl == 0)
>> > + string(file, " 1N");
>> > +  else
>> > + string(file, " 2N");
>> The usual qtr_ctl field is still taken into account by the hardware
>> regardless of whether you use NibCtrl, this should probably be:
>> 
>> > 
>> > format(file, " %dN", qtr_ctl * 2 + nib_ctl + 1);
>
> Yeah, good point.
>
>> > +   } else if (exec_size == 8) {
>> >    switch (qtr_ctl) {
>> >    case 0:
>> >   string(file, " 1Q");


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] Converting TTN to using frontface intrinsic.

2016-08-18 Thread Kenneth Graunke
On Friday, August 5, 2016 4:22:16 PM PDT Eric Anholt wrote:
> Here's a little miniseries from my trying to convert from using TTN to
> usually using GTN in vc4.  No change on shader-db, but seems like a
> good idea.
> 
> Eric Anholt (3):
>   nir: Tell opt_algebraic that load_front_face is a boolean.
>   nir: Use the system-value front face for twoside lowering.
>   ttn: Use nir_load_front_face instead of the TGSI-style input.
> 
>  src/compiler/nir/nir_lower_two_sided_color.c   | 23 +++--
>  src/compiler/nir/nir_search.c  | 60 
> --
>  src/gallium/auxiliary/nir/tgsi_to_nir.c| 30 +--
>  .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 46 -
>  src/gallium/drivers/vc4/vc4_nir_lower_io.c | 15 +-
>  5 files changed, 80 insertions(+), 94 deletions(-)
> 
> 

Patches 2-3 are:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir: Move the undef of nir_intrinsics.h macros to the .h.

2016-08-18 Thread Kenneth Graunke
On Saturday, August 6, 2016 12:28:36 AM PDT Eric Anholt wrote:
> I wanted to include this from nir_builder as well, so it also needed the
> undefs.
> ---
>  src/compiler/nir/nir.h| 3 ---
>  src/compiler/nir/nir_intrinsics.h | 3 +++
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index 379a5f8e022e..5e527d8add12 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -868,9 +868,6 @@ typedef enum {
> nir_num_intrinsics = nir_last_intrinsic + 1
>  } nir_intrinsic_op;
>  
> -#undef INTRINSIC
> -#undef LAST_INTRINSIC
> -
>  #define NIR_INTRINSIC_MAX_CONST_INDEX 3
>  
>  /** Represents an intrinsic
> diff --git a/src/compiler/nir/nir_intrinsics.h 
> b/src/compiler/nir/nir_intrinsics.h
> index 42c770f15942..d0f7f5db8337 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -405,3 +405,6 @@ STORE(ssbo, 3, 1, WRMASK, xx, xx, 0)
>  STORE(shared, 2, 2, BASE, WRMASK, xx, 0)
>  
>  LAST_INTRINSIC(store_shared)
> +
> +#undef INTRINSIC
> +#undef LAST_INTRINSIC
> 

Patches 1-2 are:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/sched: Simplify work done by add_barrier_deps().

2016-08-18 Thread Matt Turner
Scheduling barriers are implemented by placing a dependence on every
node before and after the barrier. This is unnecessary as we can limit
the number of nodes we place dependencies on to those between us and the
next barrier in each direction.

Runtime of dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
is reduced from ~25 minutes to a little more than three.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94681
---
Would people rather I make is_scheduling_barrier() a virtual member of
backend_instruction and move the implementations into vec4_instruction
and fs_inst?

 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 8afdc25..2e53f9d 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -86,6 +86,8 @@ public:
 * its children, or just the issue_time if it's a leaf node.
 */
int delay;
+
+   bool is_barrier;
 };
 
 void
@@ -772,6 +774,7 @@ schedule_node::schedule_node(backend_instruction *inst,
this->unblocked_time = 0;
this->cand_generation = 0;
this->delay = 0;
+   this->is_barrier = false;
 
/* We can't measure Gen6 timings directly but expect them to be much
 * closer to Gen7 than Gen4.
@@ -872,9 +875,13 @@ instruction_scheduler::add_barrier_deps(schedule_node *n)
schedule_node *prev = (schedule_node *)n->prev;
schedule_node *next = (schedule_node *)n->next;
 
+   n->is_barrier = true;
+
if (prev) {
   while (!prev->is_head_sentinel()) {
  add_dep(prev, n, 0);
+ if (prev->is_barrier)
+break;
  prev = (schedule_node *)prev->prev;
   }
}
@@ -882,6 +889,8 @@ instruction_scheduler::add_barrier_deps(schedule_node *n)
if (next) {
   while (!next->is_tail_sentinel()) {
  add_dep(n, next, 0);
+ if (next->is_barrier)
+break;
  next = (schedule_node *)next->next;
   }
}
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] a4xx: only disable depth clipping, not all clipping, when requested

2016-08-18 Thread Ilia Mirkin
The previous bit disables the whole clipper, including the regular
viewport-related clipping that would go on. The two new bits disable
near and far clipping (separately, as verified with the
depth-clamp-range piglit).

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/freedreno/a4xx/a4xx.xml.h   | 2 ++
 src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h 
b/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
index d9a7bb5..a90a4ce 100644
--- a/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
+++ b/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
@@ -3145,6 +3145,8 @@ static inline uint32_t A4XX_TPL1_TP_TEX_COUNT_GS(uint32_t 
val)
 
 #define REG_A4XX_GRAS_CL_CLIP_CNTL 0x2000
 #define A4XX_GRAS_CL_CLIP_CNTL_CLIP_DISABLE0x8000
+#define A4XX_GRAS_CL_CLIP_CNTL_ZNEAR_CLIP_DISABLE  0x0001
+#define A4XX_GRAS_CL_CLIP_CNTL_ZFAR_CLIP_DISABLE   0x0002
 #define A4XX_GRAS_CL_CLIP_CNTL_ZERO_GB_SCALE_Z 0x0040
 
 #define REG_A4XX_GRAS_CLEAR_CNTL   0x2003
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c
index 7456c63..b3a4292 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_rasterizer.c
@@ -98,7 +98,8 @@ fd4_rasterizer_state_create(struct pipe_context *pctx,
so->gras_su_mode_control |= 
A4XX_GRAS_SU_MODE_CONTROL_POLY_OFFSET;
 
if (!cso->depth_clip)
-   so->gras_cl_clip_cntl |= A4XX_GRAS_CL_CLIP_CNTL_CLIP_DISABLE;
+   so->gras_cl_clip_cntl |= 
A4XX_GRAS_CL_CLIP_CNTL_ZNEAR_CLIP_DISABLE |
+   A4XX_GRAS_CL_CLIP_CNTL_ZFAR_CLIP_DISABLE;
if (cso->clip_halfz)
so->gras_cl_clip_cntl |= A4XX_GRAS_CL_CLIP_CNTL_ZERO_GB_SCALE_Z;
 
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] a4xx: make sure to actually clamp depth as requested

2016-08-18 Thread Ilia Mirkin
We were previously ... not clamping. I guess this meant that everything
got clamped to 1/0, which was enough to pass the existing tests. Or
perhaps the clamping would only happen to the rasterized depth value and
not the frag shader's output depth value.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/freedreno/a4xx/a4xx.xml.h |  2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c | 29 ++-
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h 
b/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
index a90a4ce..aeb61e7 100644
--- a/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
+++ b/src/gallium/drivers/freedreno/a4xx/a4xx.xml.h
@@ -1376,7 +1376,7 @@ static inline uint32_t A4XX_RB_DEPTH_CONTROL_ZFUNC(enum 
adreno_compare_func val)
 {
return ((val) << A4XX_RB_DEPTH_CONTROL_ZFUNC__SHIFT) & 
A4XX_RB_DEPTH_CONTROL_ZFUNC__MASK;
 }
-#define A4XX_RB_DEPTH_CONTROL_BF_ENABLE
0x0080
+#define A4XX_RB_DEPTH_CONTROL_Z_CLAMP_ENABLE   0x0080
 #define A4XX_RB_DEPTH_CONTROL_EARLY_Z_DISABLE  0x0001
 #define A4XX_RB_DEPTH_CONTROL_FORCE_FRAGZ_TO_FS
0x0002
 #define A4XX_RB_DEPTH_CONTROL_Z_TEST_ENABLE0x8000
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
index e0f413f..fc0e4d1 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
@@ -31,6 +31,7 @@
 #include "util/u_memory.h"
 #include "util/u_helpers.h"
 #include "util/u_format.h"
+#include "util/u_viewport.h"
 
 #include "freedreno_resource.h"
 #include "freedreno_query_hw.h"
@@ -550,12 +551,14 @@ fd4_emit_state(struct fd_context *ctx, struct 
fd_ringbuffer *ring,

A4XX_RB_STENCILREFMASK_BF_STENCILREF(sr->ref_value[1]));
}
 
-   if (dirty & (FD_DIRTY_ZSA | FD_DIRTY_PROG)) {
+   if (dirty & (FD_DIRTY_ZSA | FD_DIRTY_RASTERIZER | FD_DIRTY_PROG)) {
struct fd4_zsa_stateobj *zsa = fd4_zsa_stateobj(ctx->zsa);
bool fragz = fp->has_kill | fp->writes_pos;
+   bool clamp = !ctx->rasterizer->depth_clip;
 
OUT_PKT0(ring, REG_A4XX_RB_DEPTH_CONTROL, 1);
OUT_RING(ring, zsa->rb_depth_control |
+   COND(clamp, 
A4XX_RB_DEPTH_CONTROL_Z_CLAMP_ENABLE) |
COND(fragz, 
A4XX_RB_DEPTH_CONTROL_EARLY_Z_DISABLE) |
COND(fragz && fp->frag_coord, 
A4XX_RB_DEPTH_CONTROL_FORCE_FRAGZ_TO_FS));
 
@@ -642,6 +645,30 @@ fd4_emit_state(struct fd_context *ctx, struct 
fd_ringbuffer *ring,
OUT_RING(ring, 
A4XX_GRAS_CL_VPORT_ZSCALE_0(ctx->viewport.scale[2]));
}
 
+   if (dirty & (FD_DIRTY_VIEWPORT | FD_DIRTY_RASTERIZER | 
FD_DIRTY_FRAMEBUFFER)) {
+   float zmin, zmax;
+   int depth = 24;
+   if (ctx->batch->framebuffer.zsbuf) {
+   depth = util_format_get_component_bits(
+   
pipe_surface_format(ctx->batch->framebuffer.zsbuf),
+   UTIL_FORMAT_COLORSPACE_ZS, 0);
+   }
+   util_viewport_zmin_zmax(&ctx->viewport, 
ctx->rasterizer->clip_halfz,
+   &zmin, &zmax);
+
+   OUT_PKT0(ring, REG_A4XX_RB_VPORT_Z_CLAMP(0), 2);
+   if (depth == 32) {
+   OUT_RING(ring, fui(zmin));
+   OUT_RING(ring, fui(zmax));
+   } else if (depth == 16) {
+   OUT_RING(ring, (uint32_t)(zmin * 0x));
+   OUT_RING(ring, (uint32_t)(zmax * 0x));
+   } else {
+   OUT_RING(ring, (uint32_t)(zmin * 0xff));
+   OUT_RING(ring, (uint32_t)(zmax * 0xff));
+   }
+   }
+
if (dirty & (FD_DIRTY_PROG | FD_DIRTY_FRAMEBUFFER)) {
struct pipe_framebuffer_state *pfb = &ctx->batch->framebuffer;
unsigned n = pfb->nr_cbufs;
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] st: guard against NULL pipe_surface

2016-08-18 Thread Hunter S
Unfortunately openmw is only a game engine and the game data or assets are 
not included which are required for using it. This could make testing it 
morecomplicated, however I can build mesa myself and test any patches if that 
helps.
If the game data is provided this is relatively easy to reproduce at least with 
nouveauand DRI3 while LIBGL_DRI3_DISABLE=1is able to launch and play the game 
successfully.

The previous patch avoids the crash by blocking the problematic custom mouse 
cursor 
used by openmw for the menus which is then reduced to a tiny ".." and is hard 
to spot.

Hunter
  From: Marek Olšák 
 To: Tobias Klausmann  
Cc: "mesa-dev@lists.freedesktop.org" ; 
ovarieg...@yahoo.com
 Sent: Thursday, August 18, 2016 1:40 PM
 Subject: Re: [Mesa-dev] [RFC] st: guard against NULL pipe_surface
   
On Thu, Aug 18, 2016 at 2:41 PM, Tobias Klausmann
 wrote:
> I have no test case per se, but orbea (in CC) noted in IRC:
>
> Start OpenMW with DRI3 -> Crash [1]
>
> Start OpenMW with DRI2 -> no crash,
>
> So i fear it is somewhere in our DRI3 path. Anyway a guard seems reasonable
> to harden release builds against this. Maybe not at this place but central
> in update_framebuffer_size().

Yeah, it would be very useful if we could reproduce this somehow.

Marek


  ___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >