Re: [Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program

2015-10-02 Thread Iago Toral
On Thu, 2015-10-01 at 14:01 -0400, Ilia Mirkin wrote:
> On Thu, Oct 1, 2015 at 7:09 AM, Iago Toral Quiroga  wrote:
> > These arrays provide backends with separate index spaces for UBOS and SSBOs.
> > ---
> >  src/glsl/linker.cpp | 35 
> > +++
> >  src/glsl/standalone_scaffolding.cpp |  9 +
> >  src/mesa/main/mtypes.h  |  6 ++
> >  3 files changed, 50 insertions(+)
> >
> > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> > index e6eba94..3da773d 100644
> > --- a/src/glsl/linker.cpp
> > +++ b/src/glsl/linker.cpp
> > @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct 
> > gl_shader_program *prog)
> >}
> > }
> >
> > +   /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and
> > +* prog->ShaderStorageBlocks, so that drivers that need separate index
> > +* spaces for each set can have that.
> > +*/
> > +   unsigned num_ubo_blocks;
> > +   unsigned num_ssbo_blocks;
> > +   num_ubo_blocks = 0;
> > +   num_ssbo_blocks = 0;
> > +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> > +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> > + num_ssbo_blocks++;
> > +  else
> > + num_ubo_blocks++;
> > +   }
> > +
> > +   prog->UniformBlocks =
> > +  ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks);
> > +   prog->NumUniformBlocks = 0;
> > +
> > +   prog->ShaderStorageBlocks =
> > +  ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks);
> > +   prog->NumShaderStorageBlocks = 0;
> > +
> > +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> > +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> > + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] =
> > +>BufferInterfaceBlocks[i];
> > +  else
> > + prog->UniformBlocks[prog->NumUniformBlocks++] =
> > +>BufferInterfaceBlocks[i];
> > +   }
> 
> Shouldn't this go through and also adjust the indices of the linked
> programs? Or... something along those lines? With this, I still need a
> remapping table to go from the index passed to a load_ssbo/load_ubo
> instruction to a ssbo/ubo index.

Maybe I am missing something, but in the case of i965 for example, we
use a lowering pass (lower_ubo_reference) to compute the block indices
(see src/glsl/lower_ubo_reference.cpp). If you look at
lower_ubo_reference_visitor::setup_for_load_or_store, there is the code
we use to compute the block index. This happens in the backend already,
so I was thinking that you would do something similar, but instead of
using BufferInterfaceBlocks you would use these two arrays instead.
Until this point there are no references to any blocks in the shaders.

> Perhaps a few well-placed helper functions can alleviate this? Also
> this should erase the need for some of the O(n) iterations that have
> sprung up as a result of this combined list.
> 
> IMHO ideally the BufferInterfaceBlocks list would get freed at the end
> of this function. But I understand that this will require work, and
> the onus is probably on me (or anyone wanting to add ssbo support to
> other drivers) to do it, or work around it.

Yeah, it could go away I guess, but as you say that would require some
extra work (and also rewrite the i965 driver first to use separate index
spaces). In any case, notice that UniformBlocks and ShaderStorageBlocks
only contain pointers into BufferInterfaceBlocks, so we would only be
saving a few bytes.

Iago

>   -ilia
> 
> > +
> > +   assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks ==
> > +  prog->NumBufferInterfaceBlocks);
> > +
> > /* FINISHME: Assign fragment shader output locations. */
> >
> >  done:
> > diff --git a/src/glsl/standalone_scaffolding.cpp 
> > b/src/glsl/standalone_scaffolding.cpp
> > index 0c53589..658245f 100644
> > --- a/src/glsl/standalone_scaffolding.cpp
> > +++ b/src/glsl/standalone_scaffolding.cpp
> > @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct 
> > gl_shader_program *shProg)
> > ralloc_free(shProg->BufferInterfaceBlocks);
> > shProg->BufferInterfaceBlocks = NULL;
> > shProg->NumBufferInterfaceBlocks = 0;
> > +
> > +   ralloc_free(shProg->UniformBlocks);
> > +   shProg->UniformBlocks = NULL;
> > +   shProg->NumUniformBlocks = 0;
> > +
> > +   ralloc_free(shProg->ShaderStorageBlocks);
> > +   shProg->ShaderStorageBlocks = NULL;
> > +   shProg->NumShaderStorageBlocks = 0;
> > +
> > for (i = 0; i < MESA_SHADER_STAGES; i++) {
> >ralloc_free(shProg->UniformBlockStageIndex[i]);
> >shProg->UniformBlockStageIndex[i] = NULL;
> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> > index 347da14..2362f54 100644
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -2692,6 +2692,12 @@ struct gl_shader_program
> > unsigned NumBufferInterfaceBlocks;
> > struct gl_uniform_block 

Re: [Mesa-dev] [PATCH] mesa: avoid leaking closure when iterating over a string_to_uint_map

2015-10-02 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Thu, 2015-10-01 at 20:19 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/program/hash_table.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
> index e85a836..d0a2abf 100644
> --- a/src/mesa/program/hash_table.h
> +++ b/src/mesa/program/hash_table.h
> @@ -249,6 +249,7 @@ public:
>wrapper->closure = closure;
>  
>hash_table_call_foreach(this->ht, subtract_one_wrapper, wrapper);
> +  free(wrapper);
> }
>  
> /**


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms

2015-10-02 Thread Iago Toral
On Thu, 2015-10-01 at 20:22 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/glsl/link_uniforms.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
> index 47d49c8..740b0a4 100644
> --- a/src/glsl/link_uniforms.cpp
> +++ b/src/glsl/link_uniforms.cpp
> @@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct 
> gl_shader_program *prog,
> const unsigned num_data_slots = uniform_size.num_values;
> const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms;
>  
> +   /* assign hidden uniforms a slot id */
> +   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, _size);
> +   delete hiddenUniforms;
> +
> /* On the outside chance that there were no uniforms, bail out.
>  */
> if (num_uniforms == 0)
>return;
>  
> -   /* assign hidden uniforms a slot id */
> -   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, _size);
> -   delete hiddenUniforms;
> -

I suppose there is no much gain in simply adding the delete statement
right before the return...

Reviewed-by: Iago Toral Quiroga 


> struct gl_uniform_storage *uniforms =
>rzalloc_array(prog, struct gl_uniform_storage, num_uniforms);
> union gl_constant_value *data =


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: don't forget to free image_param on prog_data free

2015-10-02 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Thu, 2015-10-01 at 20:27 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/drivers/dri/i965/brw_program.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
> b/src/mesa/drivers/dri/i965/brw_program.c
> index fee96a8..0e4b823 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -551,6 +551,7 @@ brw_stage_prog_data_free(const void *p)
>  
> ralloc_free(prog_data->param);
> ralloc_free(prog_data->pull_param);
> +   ralloc_free(prog_data->image_param);
>  }
>  
>  void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Francisco Jerez
Neil Roberts  writes:

> Francisco Jerez  writes:
>
>> Sigh, it's really awful that our hardware only supports a single sample
>> index for the whole SIMD thread...  I was thinking though that there
>> might be a better alternative to running the sample-index interpolator
>> query in a loop: The "Per Slot Offset" interpolator query does allow
>> independent offsets for each channel, couldn't we pass the hard-coded
>> table of sample offsets (see brw_multisample_state.h) to the shader
>> (e.g. as push constants or hard-coded as vector immediates), use VxH
>> indirect indexing to fetch the right offset for each channel and then
>> lower it into a non-dynamically-uniform interpolateAtOffset, which the
>> hardware can handle natively?
>
> Oh, that sounds like a neat trick! I think if we did that we'd have to
> use push constants rather than vector immediates because otherwise we'd
> have to recompile the program if it is used with a different sample
> count, right? That would be a shame.
>
I guess you could still get around with immediates if you hard-coded the
offsets for *all* MS modes inside a single table (How much data is it in
total?  4 dwords?) and use one set of values or other depending on the
num samples parameter (e.g. by indexing by 'num_samples + sample_idx' so
you know that each sample index gets a unique location in the table for
each multisample mode).  Although yeah passing the right table as push
constants might be easier.

> On the other hand I don't think we know any actual use cases of calling
> interpolateAtSample yet so it might be better to just pick whichever
> implementation works with the least hassle and then we can optimise it
> later if it is shown to be important for some application. Having to
> upload uniforms depending on the sample count could be a bit fiddly to
> implement.
>
Sounds OK to me, I'm all for KISS for now until we find some important
use-case for this.

> Regards,
> - Neil


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/11] i965/nir: Simplify uniform setup

2015-10-02 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 19 ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 -
>  2 files changed, 16 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 829c663..eb0fe7b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -185,20 +185,17 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
> if (shader_prog) {
>brw_nir_setup_glsl_uniforms(shader, shader_prog, prog,
>stage_prog_data, true);
> -
> -  foreach_list_typed(nir_variable, var, node, >uniforms) {
> - /* UBO's and atomics don't take up space in the uniform file */
> - if (var->interface_type != NULL || var->type->contains_atomic())
> -continue;
> -
> - if(type_size_scalar(var->type) > 0)
> -param_size[var->data.driver_location] = 
> type_size_scalar(var->type);
> -  }
> } else {
>brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
> +   }
> +
> +   foreach_list_typed(nir_variable, var, node, >uniforms) {
> +  /* UBO's and atomics don't take up space in the uniform file */
> +  if (var->interface_type != NULL || var->type->contains_atomic())
> + continue;
>  
> -  if(prog->Parameters->NumParameters > 0)
> - param_size[0] = prog->Parameters->NumParameters * 4;
> +  if (type_size_scalar(var->type) > 0)
> + param_size[var->data.driver_location] = type_size_scalar(var->type);
> }
>  }
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 36bb35f..8274d48 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -139,22 +139,17 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
> if (shader_prog) {
>brw_nir_setup_glsl_uniforms(shader, shader_prog, prog,
>stage_prog_data, false);
> -
> -  foreach_list_typed(nir_variable, var, node, >uniforms) {
> - /* UBO's, atomics and samplers don't take up space in the
> -uniform file */
> - if (var->interface_type != NULL || var->type->contains_atomic() ||
> - type_size_vec4(var->type) == 0) {
> -continue;
> - }
> -
> - uniform_size[var->data.driver_location] = type_size_vec4(var->type);
> -  }
> } else {
>brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
> +   }
> +
> +   foreach_list_typed(nir_variable, var, node, >uniforms) {
> +  /* UBO's and atomics don't take up space in the uniform file */
> +  if (var->interface_type != NULL || var->type->contains_atomic())
> + continue;
>  
> -  if(prog->Parameters->NumParameters > 0)
> - uniform_size[0] = prog->Parameters->NumParameters;
> +  if (type_size_vec4(var->type) > 0)
> + uniform_size[var->data.driver_location] = type_size_vec4(var->type);
> }
>  }
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glx: Fix build errors with --enable-mangling (v2)

2015-10-02 Thread Emil Velikov
On 1 October 2015 at 09:32, Emil Velikov  wrote:
> On 28 September 2015 at 18:59, Kyle Brenneman  wrote:
>> Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
>> the renames from glx_mangle.h.
>>
>> Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
>> defined.
>>
>> v2: Add a comment clarifying why GLX_ALIAS needs two macros.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=2
>> Signed-off-by: Kyle Brenneman 
>> Cc: "10.6 11.0" 
>> ---
>>  src/glx/glxcmds.c   |  4 
>>  src/glx/glxextensions.h | 10 --
>>  2 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c
>> index 26ff804..93e8db5 100644
>> --- a/src/glx/glxcmds.c
>> +++ b/src/glx/glxcmds.c
>> @@ -2646,7 +2646,11 @@ _X_EXPORT void (*glXGetProcAddressARB(const GLubyte * 
>> procName)) (void)
>>   */
>>  _X_EXPORT void (*glXGetProcAddress(const GLubyte * procName)) (void)
>>  #if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
>> +# if defined(USE_MGL_NAMESPACE)
>> +   __attribute__ ((alias("mglXGetProcAddressARB")));
>> +# else
>> __attribute__ ((alias("glXGetProcAddressARB")));
>> +# endif
>>  #else
>>  {
>> return glXGetProcAddressARB(procName);
>> diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h
>> index 90b173f..3a9bc82 100644
>> --- a/src/glx/glxextensions.h
>> +++ b/src/glx/glxextensions.h
>> @@ -281,11 +281,17 @@ typedef void (*PFNGLXDISABLEEXTENSIONPROC) (const char 
>> *name);
>>  # define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func)
>>  #else
>>  # if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
>> -#  define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) 
>> \
>> +/* GLX_ALIAS and GLX_ALIAS_VOID both expand to the macro GLX_ALIAS2. Using 
>> the
>> + * extra expansion means that the name mangling macros in glx_mangle.h will
>> + * apply before stringification, so the alias attribute will have a string 
>> like
>> + * "mglXFoo" instead of "glXFoo". */
> Hmm things are happening in quite Interesting order. On the good side,
> we've been using the two layer macros in our dispatch (glapi) for a
> while so things should be fully fixed now.
>
> Reviewed-by: Emil Velikov 
>
> Thank you for the fixes Kyle. I'll push the three patches tomorrow,
> barring any objections of course.
>
All three are in master now. Thanks again Kyle.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Francisco Jerez
Matt Turner  writes:

> On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
>> If a non-const sample number is given to interpolateAtSample it will
>> now generate an indirect send message with the sample ID similar to
>> how non-const sampler array indexing works. Previously non-const
>> values were ignored and instead it ended up using a constant 0 value.
>>
>> The generator will try to determine if the sample ID is dynamically
>> uniform via nir_src_is_dynamically_uniform. If not it will query the
>> pixel interpolator in a loop, once for each possible sample number.
>> This is necessary because the indirect send message doesn't seem to
>> have a way to specify a different value for each fragment.
>
> Heh, I think this was the solution Curro had in mind a couple of years ago. :)
>
> I've Cc'd him. It'd be good for him to review this.
>
Sigh, it's really awful that our hardware only supports a single sample
index for the whole SIMD thread...  I was thinking though that there
might be a better alternative to running the sample-index interpolator
query in a loop: The "Per Slot Offset" interpolator query does allow
independent offsets for each channel, couldn't we pass the hard-coded
table of sample offsets (see brw_multisample_state.h) to the shader
(e.g. as push constants or hard-coded as vector immediates), use VxH
indirect indexing to fetch the right offset for each channel and then
lower it into a non-dynamically-uniform interpolateAtOffset, which the
hardware can handle natively?

>> The range of possible sample numbers is determined using
>> STATE_NUM_SAMPLES. When linking the shader it will now add a reference
>> to this state if any dynamically non-uniform calls to
>> interpolateAtSample are found.
>>
>> This fixes the following two Piglit tests:
>>
>> arb_gpu_shader5-interpolateAtSample-nonconst
>> arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform
>>
>> v2: Handle dynamically non-uniform sample ids.
>> ---
>>  src/mesa/drivers/dri/i965/brw_eu.h |   2 +-
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c|  34 ---
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |   5 +-
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 119 
>> +
>>  src/mesa/drivers/dri/i965/brw_program.c|  54 +++
>>  src/mesa/drivers/dri/i965/brw_program.h|   1 +
>>  src/mesa/drivers/dri/i965/brw_shader.cpp   |   2 +
>>  7 files changed, 185 insertions(+), 32 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
>> b/src/mesa/drivers/dri/i965/brw_eu.h
>> index 761aa0e..0ac1ad9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu.h
>> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
>> @@ -461,7 +461,7 @@ brw_pixel_interpolator_query(struct brw_codegen *p,
>>   struct brw_reg mrf,
>>   bool noperspective,
>>   unsigned mode,
>> - unsigned data,
>> + struct brw_reg data,
>>   unsigned msg_length,
>>   unsigned response_length);
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
>> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> index 4d39762..25524d4 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> @@ -3192,26 +3192,38 @@ brw_pixel_interpolator_query(struct brw_codegen *p,
>>   struct brw_reg mrf,
>>   bool noperspective,
>>   unsigned mode,
>> - unsigned data,
>> + struct brw_reg data,
>>   unsigned msg_length,
>>   unsigned response_length)
>>  {
>> const struct brw_device_info *devinfo = p->devinfo;
>> -   struct brw_inst *insn = next_insn(p, BRW_OPCODE_SEND);
>> +   struct brw_inst *insn;
>> +   uint16_t exec_size;
>>
>> -   brw_set_dest(p, insn, dest);
>> -   brw_set_src0(p, insn, mrf);
>> -   brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR,
>> -  msg_length, response_length,
>> -  false /* header is never present for PI */,
>> -  false);
>> +   if (data.file == BRW_IMMEDIATE_VALUE) {
>> +  insn = next_insn(p, BRW_OPCODE_SEND);
>> +  brw_set_dest(p, insn, dest);
>> +  brw_set_src0(p, insn, mrf);
>> +  brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR,
>> + msg_length, response_length,
>> + false /* header is never present for PI */,
>> + false);
>> +  brw_inst_set_pi_message_data(devinfo, insn, data.dw1.ud);
>> +   } else {
>> +  insn = brw_send_indirect_message(p,
>> +   

Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Neil Roberts
Francisco Jerez  writes:

> Sigh, it's really awful that our hardware only supports a single sample
> index for the whole SIMD thread...  I was thinking though that there
> might be a better alternative to running the sample-index interpolator
> query in a loop: The "Per Slot Offset" interpolator query does allow
> independent offsets for each channel, couldn't we pass the hard-coded
> table of sample offsets (see brw_multisample_state.h) to the shader
> (e.g. as push constants or hard-coded as vector immediates), use VxH
> indirect indexing to fetch the right offset for each channel and then
> lower it into a non-dynamically-uniform interpolateAtOffset, which the
> hardware can handle natively?

Oh, that sounds like a neat trick! I think if we did that we'd have to
use push constants rather than vector immediates because otherwise we'd
have to recompile the program if it is used with a different sample
count, right? That would be a shame.

On the other hand I don't think we know any actual use cases of calling
interpolateAtSample yet so it might be better to just pick whichever
implementation works with the least hassle and then we can optimise it
later if it is shown to be important for some application. Having to
upload uniforms depending on the sample count could be a bit fiddly to
implement.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-02 Thread Neil Roberts
Matt Turner  writes:

>> +static fs_reg
>> +get_num_samples_reg(fs_visitor *v)
>> +{
>> +   struct gl_program_parameter_list *params = v->prog->Parameters;
>> +   static gl_state_index tokens[STATE_LENGTH] = {
>
> I suspect this isn't thread-safe.

Do you mean because the tokens array is static? I don't think this is
written to so I was assuming it would be ok. I should have made it
const. Or did you mean something else?

>> +   bld.CMP(bld.null_reg_ud(),
>> +   sample_src, i_reg,
>> +   BRW_CONDITIONAL_EQ);
>
> I think you might be able to put all of this on one line.

Sadly it doesn't fit. I could probably make it two lines though.

> If you want, you could reverse the order of the loop (that is, iterate
> from num_samples_reg down to 0) and then save the
> opt_cmod_propagation() pass some work and just emit an ADD with a a
> conditional_mod in order to get rid of the CMP instruction.
>
>> +   inst = bld.emit(BRW_OPCODE_BREAK);
>> +   inst->predicate = BRW_PREDICATE_NORMAL;
>> +   bld.emit(BRW_OPCODE_WHILE);
>
> You can actually get rid of the BREAK by predicating the WHILE (and
> setting predicate_inverse on it). There's also a nice
> set_predicate(predicate, inst) function you should use.
>
> If you want to do that, I think it'll take a small amount of work in
> the BRW_OPCODE_WHILE case in brw_cfg.cpp to understand that a
> predicated WHILE has two successors.

That sounds like a good idea. I still have more to learn about the ISA
so I guess this is a good way to do it :)

>> +static bool
>> +find_interpolate_at_sample_in_block(nir_block *block,
>> +void *data)
>> +{
>> +   nir_foreach_instr_safe(block, instr) {
>
> I don't think you need _safe here. At least I can't see why it's
> necessary.

Good point, I don't know why I put _safe there.

Thanks for looking at this.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-02 Thread Roland Scheidegger
Am 02.10.2015 um 12:19 schrieb Marek Olšák:
> On Fri, Oct 2, 2015 at 3:13 AM, Roland Scheidegger  wrote:
>> Can't say I'm a big fan of doing essentially the same thing with
>> different methods, but well it's coming from GL, and if it helps some
>> drivers...
>>
>> Acked-by: Roland Scheidegger 
> 
> If all drivers start supporting the CAP, we could remove the st/mesa
> emulation. Just as we could remove the color clamping emulation.
> 
> Marek
> 

Yes, but that just moves the logic to the drivers. It's still (mostly)
the same thing handling per-sample inputs, or forcing them to per sample
via ARB_sample_shading.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92236] [ColorTiling2D] R600 BARTS radeon: pipe_context->clear with color tiling enabled doesn't immediately affect color buffer

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92236

Axel Davy  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|NOTOURBUG   |---

--- Comment #2 from Axel Davy  ---
(In reply to Michel Dänzer from comment #1)
> That's correct. nine needs to resolve the clear using the
> pipe->flush_resource hook.
> 

Nine does call it before sending the buffer to the server.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/11] i965/nir: Pull common ARB program uniform handling into a common function

2015-10-02 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> The way we deal with ARB program uniforms is basically the same in both the
> vec4 and the fs backend.  This commit takes the best parts of both
> implementations and pulls the common code into a shared helper function.
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |  1 +
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 10 +
>  src/mesa/drivers/dri/i965/brw_nir.h|  9 
>  src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 61 
> ++
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 29 ++--
>  5 files changed, 76 insertions(+), 34 deletions(-)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
> b/src/mesa/drivers/dri/i965/Makefile.sources
> index 540e3df..eb8196d 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -86,6 +86,7 @@ i965_FILES = \
>   brw_nir.h \
>   brw_nir.c \
>   brw_nir_analyze_boolean_resolves.c \
> + brw_nir_uniforms.cpp \
>   brw_object_purgeable.c \
>   brw_packed_float.c \
>   brw_performance_monitor.c \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index d33e452..3ba6a67 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -196,14 +196,8 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
>  param_size[var->data.driver_location] = 
> type_size_scalar(var->type);
>}
> } else {
> -  /* prog_to_nir only creates a single giant uniform variable so we can
> -   * just set param up directly. */
> -  for (unsigned p = 0; p < prog->Parameters->NumParameters; p++) {
> - for (unsigned int i = 0; i < 4; i++) {
> -stage_prog_data->param[4 * p + i] =
> -   >Parameters->ParameterValues[p][i];
> - }
> -  }
> +  brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
> +
>if(prog->Parameters->NumParameters > 0)
>   param_size[0] = prog->Parameters->NumParameters * 4;
> }
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h 
> b/src/mesa/drivers/dri/i965/brw_nir.h
> index ad71293..b4a6dc0 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.h
> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
> @@ -85,6 +85,15 @@ enum brw_reg_type brw_type_for_nir_type(nir_alu_type type);
>  
>  enum glsl_base_type brw_glsl_base_type_for_nir_type(nir_alu_type type);
>  
> +void brw_nir_setup_glsl_uniforms(nir_shader *shader,
> + struct gl_shader_program *shader_prog,
> + const struct gl_program *prog,
> + struct brw_stage_prog_data *stage_prog_data,
> + bool is_scalar);

brw_nir_setup_glsl_uniforms should be declared in the next patch.

> +void brw_nir_setup_arb_uniforms(nir_shader *shader, struct gl_program *prog,
> +struct brw_stage_prog_data *stage_prog_data);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
> b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> new file mode 100644
> index 000..ede4e91
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> @@ -0,0 +1,61 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_shader.h"
> +#include "brw_nir.h"
> +
> +void
> +brw_nir_setup_arb_uniforms(nir_shader *shader, struct gl_program *prog,
> +   struct brw_stage_prog_data *stage_prog_data)
> +{
> +   struct gl_program_parameter_list *plist = prog->Parameters;
> +
> +#ifndef NDEBUG
> +   if 

[Mesa-dev] [Bug 92236] [ColorTiling2D] R600 BARTS radeon: pipe_context->clear with color tiling enabled doesn't immediately affect color buffer

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92236

Michel Dänzer  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
  Component|Drivers/Gallium/r600|Mesa core
 Resolution|--- |NOTOURBUG
   Assignee|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
   |.org|org
 QA Contact|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
   |.org|org

--- Comment #1 from Michel Dänzer  ---
(In reply to Patrick Rudolph from comment #0)
> There's an open Gallium nine bug: https://github.com/iXit/Mesa-3D/issues/138
> 
> It looks like pipe_context->clear called with PIPE_CLEAR_COLOR isn't applied
> immediately to the buffer.

That's correct. nine needs to resolve the clear using the pipe->flush_resource
hook.


> The problem doesn't exist with R600_DEBUG=notiling

Fast clear only works with tiled surfaces.


> and it doesn't exists when using pipe_context->clear_render_target instead.

IIRC that's currently a software fallback.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/11] i965/shader: Pull setup_image_uniform_values out of backend_shader

2015-10-02 Thread Iago Toral
On Fri, 2015-10-02 at 10:29 +0200, Iago Toral wrote:
> On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> > I tried to do this once before but Curro pointed out that having it in
> > backend_shader meant it could use the setup_vec4_uniform_values helper
> > which did different things in vec4 and fs.  Now the setup_uniform_values
> > function differs only by an assert in the two backends so there's no real
> > good reason to be using it anymore.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  3 +-
> >  src/mesa/drivers/dri/i965/brw_shader.cpp | 52 
> > +---
> >  src/mesa/drivers/dri/i965/brw_shader.h   |  7 +++--
> >  3 files changed, 42 insertions(+), 20 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index 7a965cd..d33e452 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -236,7 +236,8 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
> >}
> >  
> >if (storage->type->is_image()) {
> > - setup_image_uniform_values(index, storage);
> > + brw_setup_image_uniform_values(stage, stage_prog_data,
> > +index, storage);
> >} else {
> >   unsigned slots = storage->type->component_slots();
> >   if (storage->array_elements)
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> > b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > index 72388ce..1d184a7 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > @@ -1419,32 +1419,50 @@ 
> > backend_shader::assign_common_binding_table_offsets(uint32_t 
> > next_binding_table_
> > /* prog_data->base.binding_table.size will be set by 
> > brw_mark_surface_used. */
> >  }
> >  
> > +static void
> > +setup_vec4_uniform_value(const gl_constant_value **params,
> > + const gl_constant_value *values,
> > + unsigned n)
> > +{
> > +   static const gl_constant_value zero = { 0 };
> > +
> > +   for (unsigned i = 0; i < n; ++i)
> > +  params[i] = [i];
> > +
> > +   for (unsigned i = n; i < 4; ++i)
> > +  params[i] = 
> > +}
> 
> I actually liked the version that received an offset into params better,
> since that allows us to assert that we are not messing our writes to
> params. 

Oh, but that was a requirement exclusive to the vec4 backend, so I guess
this is okay...

> Not a big deal though, so either way:
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> >  void
> > -backend_shader::setup_image_uniform_values(unsigned param_offset,
> > -   const gl_uniform_storage 
> > *storage)
> > +brw_setup_image_uniform_values(gl_shader_stage stage,
> > +   struct brw_stage_prog_data *stage_prog_data,
> > +   unsigned param_start_index,
> > +   const gl_uniform_storage *storage)
> >  {
> > -   const unsigned stage = _mesa_program_enum_to_shader_stage(prog->Target);
> > +   const gl_constant_value **param =
> > +  _prog_data->param[param_start_index];
> >  
> > for (unsigned i = 0; i < MAX2(storage->array_elements, 1); i++) {
> >const unsigned image_idx = storage->image[stage].index + i;
> > -  const brw_image_param *param = 
> > _prog_data->image_param[image_idx];
> > +  const brw_image_param *image_param =
> > + _prog_data->image_param[image_idx];
> >  
> >/* Upload the brw_image_param structure.  The order is expected to 
> > match
> > * the BRW_IMAGE_PARAM_*_OFFSET defines.
> > */
> > -  setup_vec4_uniform_value(param_offset + 
> > BRW_IMAGE_PARAM_SURFACE_IDX_OFFSET,
> > - (const gl_constant_value *)>surface_idx, 1);
> > -  setup_vec4_uniform_value(param_offset + 
> > BRW_IMAGE_PARAM_OFFSET_OFFSET,
> > - (const gl_constant_value *)param->offset, 2);
> > -  setup_vec4_uniform_value(param_offset + BRW_IMAGE_PARAM_SIZE_OFFSET,
> > - (const gl_constant_value *)param->size, 3);
> > -  setup_vec4_uniform_value(param_offset + 
> > BRW_IMAGE_PARAM_STRIDE_OFFSET,
> > - (const gl_constant_value *)param->stride, 4);
> > -  setup_vec4_uniform_value(param_offset + 
> > BRW_IMAGE_PARAM_TILING_OFFSET,
> > - (const gl_constant_value *)param->tiling, 3);
> > -  setup_vec4_uniform_value(param_offset + 
> > BRW_IMAGE_PARAM_SWIZZLING_OFFSET,
> > - (const gl_constant_value *)param->swizzling, 2);
> > -  param_offset += BRW_IMAGE_PARAM_SIZE;
> > +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_SURFACE_IDX_OFFSET,
> > + (const gl_constant_value *)_param->surface_idx, 1);
> > +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_OFFSET_OFFSET,
> > + (const gl_constant_value *)image_param->offset, 2);
> > +  

Re: [Mesa-dev] [PATCH 08/11] i965/vec4: Use the uniform count from nir_assign_var_locations

2015-10-02 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> Previously, we were counting up uniforms as we set them up.  However, this
> count should be exactly identical to shader->num_uniforms provided by
> nir_assign_var_locations.  (If it's not, we're in trouble anyway because
> that means that locations don't match up.)  This matches what the fs
> backend is already doing.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 
> ++
>  1 file changed, 11 insertions(+), 21 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index b0abfc1..ee94e58 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -134,7 +134,7 @@ vec4_visitor::nir_setup_inputs(nir_shader *shader)
>  void
>  vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>  {
> -   uniforms = 0;
> +   uniforms = shader->num_uniforms;
>  
> if (shader_prog) {
>foreach_list_typed(nir_variable, var, node, >uniforms) {
> @@ -145,8 +145,7 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>  continue;
>   }
>  
> - assert(uniforms < uniform_array_size);
> - uniform_size[uniforms] = type_size_vec4(var->type);
> + uniform_size[var->data.driver_location] = type_size_vec4(var->type);
>  
>   if (strncmp(var->name, "gl_", 3) == 0)
>  nir_setup_builtin_uniform(var);
> @@ -161,8 +160,8 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>assert(shader->uniforms.length() == 1 &&
>   strcmp(var->name, "parameters") == 0);
>  
> -  assert(uniforms < uniform_array_size);
> -  uniform_size[uniforms] = type_size_vec4(var->type);
> +  assert(var->data.driver_location == 0);
> +  uniform_size[0] = type_size_vec4(var->type);
>  
>struct gl_program_parameter_list *plist = prog->Parameters;
>for (unsigned p = 0; p < plist->NumParameters; p++) {
> @@ -174,14 +173,12 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>  
>   unsigned i;
>   for (i = 0; i < plist->Parameters[p].Size; i++) {
> -stage_prog_data->param[uniforms * 4 + i] = 
> >ParameterValues[p][i];
> +stage_prog_data->param[p * 4 + i] = 
> >ParameterValues[p][i];
>   }
>   for (; i < 4; i++) {
>  static const gl_constant_value zero = { 0.0 };
> -stage_prog_data->param[uniforms * 4 + i] = 
> +stage_prog_data->param[p * 4 + i] = 
>   }
> -
> - uniforms++;
>}
> }
>  }
> @@ -197,6 +194,7 @@ vec4_visitor::nir_setup_uniform(nir_variable *var)
>  * order we'd walk the type, so walk the list of storage and find anything
>  * with our name, or the prefix of a component that starts with our name.
>  */
> +unsigned index = var->data.driver_location * 4;

Maybe call this uniform_index for consistency with
nir_setup_builtin_uniform below.

Reviewed-by: Iago Toral Quiroga 

>  for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> struct gl_uniform_storage *storage = _prog->UniformStorage[u];
>  
> @@ -215,19 +213,14 @@ vec4_visitor::nir_setup_uniform(nir_variable *var)
>  storage->type->matrix_columns);
>  
> for (unsigned s = 0; s < vector_count; s++) {
> -  assert(uniforms < uniform_array_size);
> -
>int i;
>for (i = 0; i < storage->type->vector_elements; i++) {
> - stage_prog_data->param[uniforms * 4 + i] = components;
> - components++;
> + stage_prog_data->param[index++] = components++;
>}
>for (; i < 4; i++) {
>   static const gl_constant_value zero = { 0.0 };
> - stage_prog_data->param[uniforms * 4 + i] = 
> + stage_prog_data->param[index++] = 
>}
> -
> -  uniforms++;
> }
>  }
>  }
> @@ -238,6 +231,7 @@ vec4_visitor::nir_setup_builtin_uniform(nir_variable *var)
> const nir_state_slot *const slots = var->state_slots;
> assert(var->state_slots != NULL);
>  
> +   unsigned uniform_index = var->data.driver_location * 4;
> for (unsigned int i = 0; i < var->num_state_slots; i++) {
>/* This state reference has already been setup by ir_to_mesa,
> * but we'll get the same index back here.  We can reference
> @@ -249,13 +243,9 @@ vec4_visitor::nir_setup_builtin_uniform(nir_variable 
> *var)
>gl_constant_value *values =
>   >Parameters->ParameterValues[index][0];
>  
> -  assert(uniforms < uniform_array_size);
> -
>for (unsigned j = 0; j < 4; j++)
> - stage_prog_data->param[uniforms * 4 + j] =
> + stage_prog_data->param[uniform_index++] =
>  [GET_SWZ(slots[i].swizzle, j)];
> -
> -  uniforms++;
> }
>  }
>  


___
mesa-dev 

Re: [Mesa-dev] [PATCH 07/11] i965/shader: Get rid of the setup_vec4_uniform_value helper

2015-10-02 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> It's not used by anything anymore
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 14 --
>  src/mesa/drivers/dri/i965/brw_fs.h |  4 
>  src/mesa/drivers/dri/i965/brw_shader.h |  4 
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  3 ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 16 
>  5 files changed, 41 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 64215ae..b062219 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -948,20 +948,6 @@ fs_visitor::import_uniforms(fs_visitor *v)
> this->param_size = v->param_size;
>  }
>  
> -void
> -fs_visitor::setup_vec4_uniform_value(unsigned param_offset,
> - const gl_constant_value *values,
> - unsigned n)
> -{
> -   static const gl_constant_value zero = { 0 };
> -
> -   for (unsigned i = 0; i < n; ++i)
> -  stage_prog_data->param[param_offset + i] = [i];
> -
> -   for (unsigned i = n; i < 4; ++i)
> -  stage_prog_data->param[param_offset + i] = 
> -}
> -
>  fs_reg *
>  fs_visitor::emit_fragcoord_interpolation(bool pixel_center_integer,
>   bool origin_upper_left)
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index a8b6726..6231652 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -294,10 +294,6 @@ public:
>  
> struct brw_reg interp_reg(int location, int channel);
>  
> -   virtual void setup_vec4_uniform_value(unsigned param_offset,
> - const gl_constant_value *values,
> - unsigned n);
> -
> int implied_mrf_writes(fs_inst *inst);
>  
> virtual void dump_instructions();
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> b/src/mesa/drivers/dri/i965/brw_shader.h
> index ee6afce..eeb3306 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -269,10 +269,6 @@ public:
> void assign_common_binding_table_offsets(uint32_t 
> next_binding_table_offset);
>  
> virtual void invalidate_live_intervals() = 0;
> -
> -   virtual void setup_vec4_uniform_value(unsigned param_offset,
> - const gl_constant_value *values,
> - unsigned n) = 0;
>  };
>  
>  uint32_t brw_texture_offset(int *offsets, unsigned num_components);
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 76b13a6..b3a62d2 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -132,9 +132,6 @@ public:
> bool run();
> void fail(const char *msg, ...);
>  
> -   virtual void setup_vec4_uniform_value(unsigned param_offset,
> - const gl_constant_value *values,
> - unsigned n);
> int setup_uniforms(int payload_reg);
>  
> bool reg_allocate_trivial();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index af01d8e..bc9d9a0 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -688,22 +688,6 @@ dst_reg::dst_reg(class vec4_visitor *v, const struct 
> glsl_type *type)
> this->type = brw_type_for_base_type(type);
>  }
>  
> -void
> -vec4_visitor::setup_vec4_uniform_value(unsigned param_offset,
> -   const gl_constant_value *values,
> -   unsigned n)
> -{
> -   static const gl_constant_value zero = { 0 };
> -
> -   assert(param_offset % 4 == 0);
> -
> -   for (unsigned i = 0; i < n; ++i)
> -  stage_prog_data->param[param_offset + i] = [i];
> -
> -   for (unsigned i = n; i < 4; ++i)
> -  stage_prog_data->param[param_offset + i] = 
> -}
> -
>  vec4_instruction *
>  vec4_visitor::emit_minmax(enum brw_conditional_mod conditionalmod, dst_reg 
> dst,
>src_reg src0, src_reg src1)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/11] i965/nir: Pull GLSL uniform handling into a common function

2015-10-02 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> The way we deal with GLSL uniforms and builtins is basically the same in
> both the vec4 and the fs backend.  This commit takes the best parts of both
> implementations and pulls the common code into a shared helper function.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h |   2 -
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  79 +---
>  src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 120 
> +
>  src/mesa/drivers/dri/i965/brw_vec4.h   |   2 -
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  74 +--
>  5 files changed, 126 insertions(+), 151 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 6231652..0bc639d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -241,8 +241,6 @@ public:
> void nir_setup_inputs(nir_shader *shader);
> void nir_setup_outputs(nir_shader *shader);
> void nir_setup_uniforms(nir_shader *shader);
> -   void nir_setup_uniform(nir_variable *var);
> -   void nir_setup_builtin_uniform(nir_variable *var);
> void nir_emit_system_values(nir_shader *shader);
> void nir_emit_impl(nir_function_impl *impl);
> void nir_emit_cf_list(exec_list *list);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 3ba6a67..829c663 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -183,15 +183,14 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
> uniforms = shader->num_uniforms;
>  
> if (shader_prog) {
> +  brw_nir_setup_glsl_uniforms(shader, shader_prog, prog,
> +  stage_prog_data, true);
> +
>foreach_list_typed(nir_variable, var, node, >uniforms) {
>   /* UBO's and atomics don't take up space in the uniform file */
>   if (var->interface_type != NULL || var->type->contains_atomic())
>  continue;
>  
> - if (strncmp(var->name, "gl_", 3) == 0)
> -nir_setup_builtin_uniform(var);
> - else
> -nir_setup_uniform(var);
>   if(type_size_scalar(var->type) > 0)
>  param_size[var->data.driver_location] = 
> type_size_scalar(var->type);
>}
> @@ -203,78 +202,6 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
> }
>  }
>  
> -void
> -fs_visitor::nir_setup_uniform(nir_variable *var)
> -{
> -   int namelen = strlen(var->name);
> -
> -   /* The data for our (non-builtin) uniforms is stored in a series of
> -  * gl_uniform_driver_storage structs for each subcomponent that
> -  * glGetUniformLocation() could name.  We know it's been set up in the
> -  * same order we'd walk the type, so walk the list of storage and find
> -  * anything with our name, or the prefix of a component that starts with
> -  * our name.
> -  */
> -   unsigned index = var->data.driver_location;
> -   for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> -  struct gl_uniform_storage *storage = _prog->UniformStorage[u];
> -
> -  if (storage->builtin)
> -  continue;
> -
> -  if (strncmp(var->name, storage->name, namelen) != 0 ||
> - (storage->name[namelen] != 0 &&
> - storage->name[namelen] != '.' &&
> - storage->name[namelen] != '[')) {
> - continue;
> -  }
> -
> -  if (storage->type->is_image()) {
> - brw_setup_image_uniform_values(stage, stage_prog_data,
> -index, storage);
> -  } else {
> - unsigned slots = storage->type->component_slots();
> - if (storage->array_elements)
> -slots *= storage->array_elements;
> -
> - for (unsigned i = 0; i < slots; i++) {
> -stage_prog_data->param[index++] = >storage[i];
> - }
> -  }
> -   }
> -}
> -
> -void
> -fs_visitor::nir_setup_builtin_uniform(nir_variable *var)
> -{
> -   const nir_state_slot *const slots = var->state_slots;
> -   assert(var->state_slots != NULL);
> -
> -   unsigned uniform_index = var->data.driver_location;
> -   for (unsigned int i = 0; i < var->num_state_slots; i++) {
> -  /* This state reference has already been setup by ir_to_mesa, but we'll
> -   * get the same index back here.
> -   */
> -  int index = _mesa_add_state_reference(this->prog->Parameters,
> -(gl_state_index 
> *)slots[i].tokens);
> -
> -  /* Add each of the unique swizzles of the element as a parameter.
> -   * This'll end up matching the expected layout of the
> -   * array/matrix/structure we're trying to fill in.
> -   */
> -  int last_swiz = -1;
> -  for (unsigned int j = 0; j < 4; j++) {
> - int swiz = GET_SWZ(slots[i].swizzle, j);
> - if (swiz == last_swiz)
> -break;

Re: [Mesa-dev] [PATCH 06/11] i965/shader: Pull setup_image_uniform_values out of backend_shader

2015-10-02 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> I tried to do this once before but Curro pointed out that having it in
> backend_shader meant it could use the setup_vec4_uniform_values helper
> which did different things in vec4 and fs.  Now the setup_uniform_values
> function differs only by an assert in the two backends so there's no real
> good reason to be using it anymore.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  3 +-
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 52 
> +---
>  src/mesa/drivers/dri/i965/brw_shader.h   |  7 +++--
>  3 files changed, 42 insertions(+), 20 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 7a965cd..d33e452 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -236,7 +236,8 @@ fs_visitor::nir_setup_uniform(nir_variable *var)
>}
>  
>if (storage->type->is_image()) {
> - setup_image_uniform_values(index, storage);
> + brw_setup_image_uniform_values(stage, stage_prog_data,
> +index, storage);
>} else {
>   unsigned slots = storage->type->component_slots();
>   if (storage->array_elements)
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 72388ce..1d184a7 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -1419,32 +1419,50 @@ 
> backend_shader::assign_common_binding_table_offsets(uint32_t 
> next_binding_table_
> /* prog_data->base.binding_table.size will be set by 
> brw_mark_surface_used. */
>  }
>  
> +static void
> +setup_vec4_uniform_value(const gl_constant_value **params,
> + const gl_constant_value *values,
> + unsigned n)
> +{
> +   static const gl_constant_value zero = { 0 };
> +
> +   for (unsigned i = 0; i < n; ++i)
> +  params[i] = [i];
> +
> +   for (unsigned i = n; i < 4; ++i)
> +  params[i] = 
> +}

I actually liked the version that received an offset into params better,
since that allows us to assert that we are not messing our writes to
params. Not a big deal though, so either way:

Reviewed-by: Iago Toral Quiroga 

>  void
> -backend_shader::setup_image_uniform_values(unsigned param_offset,
> -   const gl_uniform_storage *storage)
> +brw_setup_image_uniform_values(gl_shader_stage stage,
> +   struct brw_stage_prog_data *stage_prog_data,
> +   unsigned param_start_index,
> +   const gl_uniform_storage *storage)
>  {
> -   const unsigned stage = _mesa_program_enum_to_shader_stage(prog->Target);
> +   const gl_constant_value **param =
> +  _prog_data->param[param_start_index];
>  
> for (unsigned i = 0; i < MAX2(storage->array_elements, 1); i++) {
>const unsigned image_idx = storage->image[stage].index + i;
> -  const brw_image_param *param = 
> _prog_data->image_param[image_idx];
> +  const brw_image_param *image_param =
> + _prog_data->image_param[image_idx];
>  
>/* Upload the brw_image_param structure.  The order is expected to 
> match
> * the BRW_IMAGE_PARAM_*_OFFSET defines.
> */
> -  setup_vec4_uniform_value(param_offset + 
> BRW_IMAGE_PARAM_SURFACE_IDX_OFFSET,
> - (const gl_constant_value *)>surface_idx, 1);
> -  setup_vec4_uniform_value(param_offset + BRW_IMAGE_PARAM_OFFSET_OFFSET,
> - (const gl_constant_value *)param->offset, 2);
> -  setup_vec4_uniform_value(param_offset + BRW_IMAGE_PARAM_SIZE_OFFSET,
> - (const gl_constant_value *)param->size, 3);
> -  setup_vec4_uniform_value(param_offset + BRW_IMAGE_PARAM_STRIDE_OFFSET,
> - (const gl_constant_value *)param->stride, 4);
> -  setup_vec4_uniform_value(param_offset + BRW_IMAGE_PARAM_TILING_OFFSET,
> - (const gl_constant_value *)param->tiling, 3);
> -  setup_vec4_uniform_value(param_offset + 
> BRW_IMAGE_PARAM_SWIZZLING_OFFSET,
> - (const gl_constant_value *)param->swizzling, 2);
> -  param_offset += BRW_IMAGE_PARAM_SIZE;
> +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_SURFACE_IDX_OFFSET,
> + (const gl_constant_value *)_param->surface_idx, 1);
> +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_OFFSET_OFFSET,
> + (const gl_constant_value *)image_param->offset, 2);
> +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_SIZE_OFFSET,
> + (const gl_constant_value *)image_param->size, 3);
> +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_STRIDE_OFFSET,
> + (const gl_constant_value *)image_param->stride, 4);
> +  setup_vec4_uniform_value(param + BRW_IMAGE_PARAM_TILING_OFFSET,
> + (const gl_constant_value 

Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform

2015-10-02 Thread Francisco Jerez
Jason Ekstrand  writes:

> On Oct 1, 2015 12:18 PM, "Matt Turner"  wrote:
>>
>> On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
>> > Adds nir_src_is_dynamically_uniform which returns true if the source
>> > is known to be dynamically uniform. This will be used in a later patch
>> > to add a workaround for cases that only work with dynamically uniform
>> > sources. Note that the function is not definitive, it can return false
>> > negatives (but not false positives). Currently it only detects
>> > constants and uniform accesses. It could easily be extended to include
>> > more cases.
>> > ---
>> >  src/glsl/nir/nir.c | 29 +
>> >  src/glsl/nir/nir.h |  1 +
>> >  2 files changed, 30 insertions(+)
>> >
>> > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> > index 78ff886..242f0b4 100644
>> > --- a/src/glsl/nir/nir.c
>> > +++ b/src/glsl/nir/nir.c
>> > @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src)
>> > return >value;
>> >  }
>> >
>> > +/**
>> > + * Returns true if the source is known to be dynamically uniform.
> Otherwise it
>> > + * returns false which means it may or may not be dynamically uniform
> but it
>> > + * can't be determined.
>> > + */
>> > +bool
>> > +nir_src_is_dynamically_uniform(nir_src src)
>> > +{
>> > +   if (!src.is_ssa)
>> > +  return false;
>> > +
>> > +   /* Constants are trivially dynamically uniform */
>> > +   if (src.ssa->parent_instr->type == nir_instr_type_load_const)
>> > +  return true;
>> > +
>> > +   /* As are uniform variables */
>> > +   if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) {
>> > +  nir_intrinsic_instr *intr =
> nir_instr_as_intrinsic(src.ssa->parent_instr);
>> > +
>> > +  if (intr->intrinsic == nir_intrinsic_load_uniform)
>> > + return true;
>> > +   }
>> > +
>> > +   /* XXX: this could have many more tests, such as when a sampler
> function is
>> > +* called with dynamically uniform arguments.
>> > +*/
>> > +   return false;
>> > +}
>>
>> This functions seems correct as-is, so it gets a
>>
>> Reviewed-by: Matt Turner 
>>
>> On top of being useful for fixing the nonconst/nonuniform piglit
>> tests, knowing which values are uniform can allow better optimization
>> and knowing which branches are uniform would allow us to enable Single
>> Program Flow.
>>
>> Cc'ing Jason for a discussion about how we'd like to handle this kind
>> of information in NIR. Add a bool is_uniform or something to
>> nir_ssa_def and hook into the metadata system?
>
> That was more-or-less my plan. We should probably have a proper pass that
> handles things like phi nodes and ALU operations.
>

Heh, I hadn't noticed this series until now and happen to have started
to sketch such an analysis pass just this week, with a completely
unrelated purpose in mind (enable a restricted form of GCM for texturing
ops with implicit derivatives).  Any volunteers to write it (Neil?) or
shall I go ahead with it?

> --Jason
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/9] nir: Report progress from nir_lower_system_values().

2015-10-02 Thread Chris Wilson
On Fri, Sep 18, 2015 at 08:37:13AM -0700, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke 

Causes

==823== Conditional jump or move depends on uninitialised value(s)
==823==at 0xB09020C: convert_block (nir_lower_system_values.c:68)
==823==by 0xB079FB8: foreach_cf_node (nir.c:1310)
==823==by 0xB07A0AF: nir_foreach_block (nir.c:1336)
==823==by 0xB09026B: convert_impl (nir_lower_system_values.c:79)
==823==by 0xB0902CE: nir_lower_system_values (nir_lower_system_values.c:92)
==823==by 0xB15C174: brw_create_nir (brw_nir.c:155)
==823==by 0xB16E27B: brw_link_shader (brw_shader.cpp:395)
==823==by 0xAF5380E: _mesa_glsl_link_shader (ir_to_mesa.cpp:2983)
==823==by 0xAD04535: create_new_program(gl_context*, state_key*) 
(ff_fragment_shader.cpp:1268)
==823==by 0xAD04605: _mesa_get_fixed_func_fragment_program 
(ff_fragment_shader.cpp:1298)
==823==by 0xADF8080: update_program (state.c:157)
==823==by 0xADF89DC: _mesa_update_state_locked (state.c:469)
==823==  Uninitialised value was created by a stack allocation
==823==at 0xB090249: convert_impl (nir_lower_system_values.c:76)
==823== 

Trivially fixed by

@@ -74,7 +74,7 @@ convert_block(nir_block *block, void *state)
 static bool
 convert_impl(nir_function_impl *impl)
 {
-   bool progress;
+   bool progress = false;
 
nir_foreach_block(impl, convert_block, );
nir_metadata_preserve(impl, nir_metadata_block_index |

Though you may want to also convert

diff --git a/src/glsl/nir/nir_lower_system_values.c b/src/glsl/nir/nir_lower_sys
tem_values.c
index d77bb2f..07371e4 100644
--- a/src/glsl/nir/nir_lower_system_values.c
+++ b/src/glsl/nir/nir_lower_system_values.c
@@ -65,7 +65,7 @@ convert_block(nir_block *block, void *state)
 
nir_foreach_instr_safe(block, instr) {
   if (instr->type == nir_instr_type_intrinsic)
- *progress = convert_instr(nir_instr_as_intrinsic(instr)) || *progress;
+ *progress |= convert_instr(nir_instr_as_intrinsic(instr));
}
 
return true;

throughout as well.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl/opt_array_splitting: Fix crash when doing array indexing into other arrays

2015-10-02 Thread Iago Toral
I think this never got a review, anyone willing to take it?

Iago

On Mon, 2015-09-14 at 13:49 +0200, Iago Toral Quiroga wrote:
> When we find indirect indexing into an array, the current implementation
> of the array spliiting optimization pass does not look further into the
> expression tree. However, if the variable expression involves variable
> indexing into other arrays, we can miss that these other arrays also have
> variable indexing. If that happens, the pass will crash later on after
> hitting an assertion put there to ensure that split arrays are in fact
> always indexed via constants:
> 
> shader_runner: opt_array_splitting.cpp:296:
> void ir_array_splitting_visitor::split_deref(ir_dereference**): Assertion 
> `constant' failed.
> 
> This patch fixes the problem by letting the pass step into the variable
> index expression to identify these cases properly.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89607
> ---
>  src/glsl/opt_array_splitting.cpp | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/src/glsl/opt_array_splitting.cpp 
> b/src/glsl/opt_array_splitting.cpp
> index 9e73f3c..1fdd013 100644
> --- a/src/glsl/opt_array_splitting.cpp
> +++ b/src/glsl/opt_array_splitting.cpp
> @@ -185,8 +185,18 @@ 
> ir_array_reference_visitor::visit_enter(ir_dereference_array *ir)
> /* If the access to the array has a variable index, we wouldn't
>  * know which split variable this dereference should go to.
>  */
> -   if (entry && !ir->array_index->as_constant())
> -  entry->split = false;
> +   if (!ir->array_index->as_constant()) {
> +  if (entry)
> + entry->split = false;
> +  /* This variable indexing could come from a different array dereference
> +   * that also has variable indexing, that is, something like 
> a[b[a[b[0.
> +   * If we return visit_continue_with_parent here for the first 
> appearence
> +   * of a, then we can miss that b also has indirect indexing (if this is
> +   * the only place in the program where such indirect indexing into b
> +   * happens), so keep going.
> +   */
> +  return visit_continue;
> +   }
>  
> return visit_continue_with_parent;
>  }


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92236] [ColorTiling2D] R600 BARTS radeon: pipe_context->clear with color tiling enabled doesn't immediately affect color buffer

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92236

Axel Davy  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |NOTOURBUG

--- Comment #3 from Axel Davy  ---
Ok, after discussion with Michel we realized the problem is that flush_resource
should be called before flushing, and not after.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-02 Thread Marek Olšák
On Fri, Oct 2, 2015 at 3:13 AM, Roland Scheidegger  wrote:
> Can't say I'm a big fan of doing essentially the same thing with
> different methods, but well it's coming from GL, and if it helps some
> drivers...
>
> Acked-by: Roland Scheidegger 

If all drivers start supporting the CAP, we could remove the st/mesa
emulation. Just as we could remove the color clamping emulation.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Rob Clark
On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  wrote:
> On 1 October 2015 at 20:44, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Signed-off-by: Rob Clark 
>> ---
>> Drop the idea of more ambitious re-arrangement if libs, but keep the
>> pipe-loader refactoring.  With this at least drm_gralloc could still
>> dlopen() gallium_dri.so and then use the pipe-loader API to figure
>> out which pipe driver to load and hand back a screen.
>>
>> The nine st is not updated.. but I don't claim to understand the whole
>> screen + sw-screen thing.  So I figured I'd let someone who knew what
>> they were doing update nine.  Once that happens, we should change to
>> not expose the dd_xyz fxns outside of pipe-loader, imho..
>>
> If the intent here is to consolidate/abstract things, this patch isn't
> the way I'm afraid.
>
> Namely, it drops support for software only pipe-driver. It also
> removes pipe-driver support for dri, xa and vl-based modules. All of
> which used to work fine last time I've tested them (admittedly ~6
> months ago).

I assume you mean !GALLIUM_STATIC_TARGETS support?

I think we just need to build pipe-driver twice, once in each mode,
and link either the static-pipe or dynamic-pipe version per state
tracker depending on how you want things..  but wasn't sure the best
way to go about that..


> The idea that I have in mind is roughly as:
> 1) abstract most of the 'target-helpers' as 'static' pipe-loader,
> providing pipe-loader like interface.
> 2) drop the ifdefs in the former, add dummy
> nouveau_drm_screen_create implementations.
> 3) remove all the ifdefs in the state-trackers.
> 4) add a configure switch that allows one to toggle/choose which how
> the modules will be build. default to 'mega' (static).

hmm, that sounds harder than just building it twice..

> If you want to hack on that be my guest, but be aware that the
> pipe-loader interface has some non-obvious fd ownership patterns ;-)
>
> Atm, I'm having fun with drm_gralloc and mesa. Expect patches on that
> one later on today/tomorrow.

finally moving drm_gralloc into mesa?  That would be helpful, I think
that should happen before we try to cleanup and merge the dmabuf
parts..

BR,
-R

> Cheers,
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Emil Velikov
On 1 October 2015 at 20:44, Rob Clark  wrote:
> From: Rob Clark 
>
> Signed-off-by: Rob Clark 
> ---
> Drop the idea of more ambitious re-arrangement if libs, but keep the
> pipe-loader refactoring.  With this at least drm_gralloc could still
> dlopen() gallium_dri.so and then use the pipe-loader API to figure
> out which pipe driver to load and hand back a screen.
>
> The nine st is not updated.. but I don't claim to understand the whole
> screen + sw-screen thing.  So I figured I'd let someone who knew what
> they were doing update nine.  Once that happens, we should change to
> not expose the dd_xyz fxns outside of pipe-loader, imho..
>
If the intent here is to consolidate/abstract things, this patch isn't
the way I'm afraid.

Namely, it drops support for software only pipe-driver. It also
removes pipe-driver support for dri, xa and vl-based modules. All of
which used to work fine last time I've tested them (admittedly ~6
months ago).

The idea that I have in mind is roughly as:
1) abstract most of the 'target-helpers' as 'static' pipe-loader,
providing pipe-loader like interface.
2) drop the ifdefs in the former, add dummy
nouveau_drm_screen_create implementations.
3) remove all the ifdefs in the state-trackers.
4) add a configure switch that allows one to toggle/choose which how
the modules will be build. default to 'mega' (static).

If you want to hack on that be my guest, but be aware that the
pipe-loader interface has some non-obvious fd ownership patterns ;-)

Atm, I'm having fun with drm_gralloc and mesa. Expect patches on that
one later on today/tomorrow.

Cheers,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform

2015-10-02 Thread Francisco Jerez
Matt Turner  writes:

> On Fri, Oct 2, 2015 at 2:52 AM, Francisco Jerez  wrote:
>> Jason Ekstrand  writes:
>>
>>> On Oct 1, 2015 12:18 PM, "Matt Turner"  wrote:

 On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
 > Adds nir_src_is_dynamically_uniform which returns true if the source
 > is known to be dynamically uniform. This will be used in a later patch
 > to add a workaround for cases that only work with dynamically uniform
 > sources. Note that the function is not definitive, it can return false
 > negatives (but not false positives). Currently it only detects
 > constants and uniform accesses. It could easily be extended to include
 > more cases.
 > ---
 >  src/glsl/nir/nir.c | 29 +
 >  src/glsl/nir/nir.h |  1 +
 >  2 files changed, 30 insertions(+)
 >
 > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
 > index 78ff886..242f0b4 100644
 > --- a/src/glsl/nir/nir.c
 > +++ b/src/glsl/nir/nir.c
 > @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src)
 > return >value;
 >  }
 >
 > +/**
 > + * Returns true if the source is known to be dynamically uniform.
>>> Otherwise it
 > + * returns false which means it may or may not be dynamically uniform
>>> but it
 > + * can't be determined.
 > + */
 > +bool
 > +nir_src_is_dynamically_uniform(nir_src src)
 > +{
 > +   if (!src.is_ssa)
 > +  return false;
 > +
 > +   /* Constants are trivially dynamically uniform */
 > +   if (src.ssa->parent_instr->type == nir_instr_type_load_const)
 > +  return true;
 > +
 > +   /* As are uniform variables */
 > +   if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) {
 > +  nir_intrinsic_instr *intr =
>>> nir_instr_as_intrinsic(src.ssa->parent_instr);
 > +
 > +  if (intr->intrinsic == nir_intrinsic_load_uniform)
 > + return true;
 > +   }
 > +
 > +   /* XXX: this could have many more tests, such as when a sampler
>>> function is
 > +* called with dynamically uniform arguments.
 > +*/
 > +   return false;
 > +}

 This functions seems correct as-is, so it gets a

 Reviewed-by: Matt Turner 

 On top of being useful for fixing the nonconst/nonuniform piglit
 tests, knowing which values are uniform can allow better optimization
 and knowing which branches are uniform would allow us to enable Single
 Program Flow.

 Cc'ing Jason for a discussion about how we'd like to handle this kind
 of information in NIR. Add a bool is_uniform or something to
 nir_ssa_def and hook into the metadata system?
>>>
>>> That was more-or-less my plan. We should probably have a proper pass that
>>> handles things like phi nodes and ALU operations.
>>>
>>
>> Heh, I hadn't noticed this series until now and happen to have started
>> to sketch such an analysis pass just this week, with a completely
>> unrelated purpose in mind (enable a restricted form of GCM for texturing
>> ops with implicit derivatives).  Any volunteers to write it (Neil?) or
>> shall I go ahead with it?
>
> Interesting. In case you haven't seen it, Connor added nir_opt_gcm()
> some time ago, but it's not enabled because it hurts many things.
> Presumably you can reuse its code to implement your idea.
>
Yeah, I had seen Connor's pass and it seems like a good starting point,
thanks.

> I suspect you should go ahead and write the uniform analysis pass.
>
> Thanks!


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/11] i965/nir: Pull common ARB program uniform handling into a common function

2015-10-02 Thread Jason Ekstrand
On Fri, Oct 2, 2015 at 3:25 AM, Iago Toral  wrote:
> On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
>> The way we deal with ARB program uniforms is basically the same in both the
>> vec4 and the fs backend.  This commit takes the best parts of both
>> implementations and pulls the common code into a shared helper function.
>> ---
>>  src/mesa/drivers/dri/i965/Makefile.sources |  1 +
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 10 +
>>  src/mesa/drivers/dri/i965/brw_nir.h|  9 
>>  src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 61 
>> ++
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 29 ++--
>>  5 files changed, 76 insertions(+), 34 deletions(-)
>>  create mode 100644 src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>>
>> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
>> b/src/mesa/drivers/dri/i965/Makefile.sources
>> index 540e3df..eb8196d 100644
>> --- a/src/mesa/drivers/dri/i965/Makefile.sources
>> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
>> @@ -86,6 +86,7 @@ i965_FILES = \
>>   brw_nir.h \
>>   brw_nir.c \
>>   brw_nir_analyze_boolean_resolves.c \
>> + brw_nir_uniforms.cpp \
>>   brw_object_purgeable.c \
>>   brw_packed_float.c \
>>   brw_performance_monitor.c \
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index d33e452..3ba6a67 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -196,14 +196,8 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
>>  param_size[var->data.driver_location] = 
>> type_size_scalar(var->type);
>>}
>> } else {
>> -  /* prog_to_nir only creates a single giant uniform variable so we can
>> -   * just set param up directly. */
>> -  for (unsigned p = 0; p < prog->Parameters->NumParameters; p++) {
>> - for (unsigned int i = 0; i < 4; i++) {
>> -stage_prog_data->param[4 * p + i] =
>> -   >Parameters->ParameterValues[p][i];
>> - }
>> -  }
>> +  brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
>> +
>>if(prog->Parameters->NumParameters > 0)
>>   param_size[0] = prog->Parameters->NumParameters * 4;
>> }
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h 
>> b/src/mesa/drivers/dri/i965/brw_nir.h
>> index ad71293..b4a6dc0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.h
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
>> @@ -85,6 +85,15 @@ enum brw_reg_type brw_type_for_nir_type(nir_alu_type 
>> type);
>>
>>  enum glsl_base_type brw_glsl_base_type_for_nir_type(nir_alu_type type);
>>
>> +void brw_nir_setup_glsl_uniforms(nir_shader *shader,
>> + struct gl_shader_program *shader_prog,
>> + const struct gl_program *prog,
>> + struct brw_stage_prog_data 
>> *stage_prog_data,
>> + bool is_scalar);
>
> brw_nir_setup_glsl_uniforms should be declared in the next patch.

Done

>> +void brw_nir_setup_arb_uniforms(nir_shader *shader, struct gl_program *prog,
>> +struct brw_stage_prog_data 
>> *stage_prog_data);
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
>> b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>> new file mode 100644
>> index 000..ede4e91
>> --- /dev/null
>> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
>> @@ -0,0 +1,61 @@
>> +/*
>> + * Copyright © 2015 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + */
>> +
>> +#include "brw_shader.h"
>> +#include "brw_nir.h"
>> +
>> +void
>> +brw_nir_setup_arb_uniforms(nir_shader *shader, 

Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-02 Thread Marek Olšák
On Fri, Oct 2, 2015 at 7:50 PM, Ilia Mirkin  wrote:
> On Fri, Oct 2, 2015 at 1:48 PM, Marek Olšák  wrote:
>> On Fri, Oct 2, 2015 at 4:12 PM, Roland Scheidegger  
>> wrote:
>>> Am 02.10.2015 um 12:19 schrieb Marek Olšák:
 On Fri, Oct 2, 2015 at 3:13 AM, Roland Scheidegger  
 wrote:
> Can't say I'm a big fan of doing essentially the same thing with
> different methods, but well it's coming from GL, and if it helps some
> drivers...
>
> Acked-by: Roland Scheidegger 

 If all drivers start supporting the CAP, we could remove the st/mesa
 emulation. Just as we could remove the color clamping emulation.

 Marek

>>>
>>> Yes, but that just moves the logic to the drivers. It's still (mostly)
>>> the same thing handling per-sample inputs, or forcing them to per sample
>>> via ARB_sample_shading.
>>
>> Yes, that's the main point. For performance, we should do the
>> translation to TGSI at link time, so that the GLSL->TGSI overhead can
>> be removed from draw calls. Because of that, all shader state
>> dependencies should be moved to and implemented/emulated in drivers.
>> This will be difficult to do because of the lack of interest from
>> other driver maintainers, therefore we need CAPs until they realize
>> it's good for them too.
>
> In case that was directed at me, I realize it's good for me too. I
> just don't know if I can do it on your schedule and I'd rather nouveau
> not be broken in the meanwhile :)

No, that wasn't directed at you. The "lack of interest" can also be
translated as the lack of manpower.

I'll try to do my best to ensure that no driver will be broken.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix GetProgramiv/GetActiveAttrib regression

2015-10-02 Thread Ilia Mirkin
On Mon, Sep 28, 2015 at 5:12 AM, Tapani Pälli  wrote:
> Commit 4639cea2921669527eb43dcb49724c05afb27e8e added packed varyings
> as part of program resource list. We need to take this to account with
> all functions dealing with active input attributes.
>
> Patch fixes the issue by adding additional check to is_active_attrib
> and iterating active attribs explicitly in GetActiveAttrib function.
>
> I've tested that fix works for Assault Android Cactus (demo version)
> and does not cause Piglit or CTS regressions in glGetProgramiv tests.
>
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122
> ---
>  src/mesa/main/shader_query.cpp | 44 
> +-
>  1 file changed, 35 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> index 99d9e10..d023504 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -101,15 +101,34 @@ _mesa_BindAttribLocation(GLhandleARB program, GLuint 
> index,
>  */
>  }
>
> +/* Function checks if given input variable is packed varying by checking
> + * if there also exists output variable with the same name.
> + */
> +static bool
> +is_packed_varying(struct gl_shader_program *shProg, const ir_variable *input)
> +{
> +   assert(input->data.mode == ir_var_shader_in);

No statements before variable decls in core code.

> +   unsigned array_index = 0;
> +   struct gl_program_resource *out =
> +  _mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, 
> input->name,
> +   _index);

This is slow, right? I.e. it's a loop?

> +   if (out)
> +  return true;
> +   return false;
> +}
> +
>  static bool
> -is_active_attrib(const ir_variable *var)
> +is_active_attrib(struct gl_shader_program *shProg,
> + const ir_variable *var)
>  {
> if (!var)
>return false;
>
> switch (var->data.mode) {
> case ir_var_shader_in:
> -  return var->data.location != -1;
> +  return (var->data.location != -1 &&
> +  var->data.location_frac == 0 &&
> +  !is_packed_varying(shProg, var));
>
> case ir_var_system_value:
>/* From GL 4.3 core spec, section 11.1.1 (Vertex Attributes):
> @@ -154,19 +173,25 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint 
> desired_index,
>return;
> }
>
> -   struct gl_program_resource *res =
> -  _mesa_program_resource_find_index(shProg, GL_PROGRAM_INPUT,
> -desired_index);

You appear to replace this function, which is O(n), with a loop around
what is essentially this same function, which is O(n^2). Could you
write a more targeted helper which achieves what you want with a
single scan of the program resource list?

> +   struct gl_program_resource *res = shProg->ProgramResourceList;
> +   unsigned index = -1;
> +   for (unsigned i = 0; i < shProg->NumProgramResourceList; i++, res++) {
> +  if (res->Type != GL_PROGRAM_INPUT ||
> +  !is_active_attrib(shProg, RESOURCE_VAR(res)))
> + continue;
> +  if (++index == desired_index)
> + break;
> +   }
>
> /* User asked for index that does not exist. */
> -   if (!res) {
> +   if (!res || index != desired_index) {
>_mesa_error(ctx, GL_INVALID_VALUE, "glGetActiveAttrib(index)");
>return;
> }
>
> const ir_variable *const var = RESOURCE_VAR(res);
>
> -   if (!is_active_attrib(var))
> +   if (!is_active_attrib(shProg, var))
>return;
>
> const char *var_name = var->name;
> @@ -252,7 +277,7 @@ _mesa_count_active_attribs(struct gl_shader_program 
> *shProg)
> for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
>if (res->Type == GL_PROGRAM_INPUT &&
>res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
> -  is_active_attrib(RESOURCE_VAR(res)))
> +  is_active_attrib(shProg, RESOURCE_VAR(res)))
>   count++;
> }
> return count;
> @@ -271,7 +296,8 @@ _mesa_longest_attribute_name_length(struct 
> gl_shader_program *shProg)
> size_t longest = 0;
> for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
>if (res->Type == GL_PROGRAM_INPUT &&
> -  res->StageReferences & (1 << MESA_SHADER_VERTEX)) {
> +  res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
> +  is_active_attrib(shProg, RESOURCE_VAR(res))) {
>
>const size_t length = strlen(RESOURCE_VAR(res)->name);
>if (length >= longest)
> --
> 2.4.3
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir/remove_phis: handle trivial back-edges

2015-10-02 Thread Matt Turner
On Fri, Oct 2, 2015 at 9:28 AM, Connor Abbott  wrote:
> Some loops may have phi nodes that look like:
>
> foo = ...
> loop {
> bar = phi(foo, bar)
> ...
> }
>
> in which case we can remove the phi node and replace all uses of 'bar'
> with 'foo'. In particular, there are some L4D2 vertex shaders with loops
> that, after optimization, look like:
>
> /* succs: block_1 */
> loop {
> block block_1:
> /* preds: block_0 block_4 */
> vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994
> vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321
> vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324
> vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327
> vec1 ssa_8139 = intrinsic load_uniform () () (232)
> vec1 ssa_588 = ige ssa_2195, ssa_8139
> /* succs: block_2 block_3 */
> if ssa_588 {
> block block_2:
> /* preds: block_1 */
> break
> /* succs: block_5 */
> } else {
> block block_3:
> /* preds: block_1 */
> /* succs: block_4 */
> }
> block block_4:
> /* preds: block_3 */
> vec1 ssa_994 = iadd ssa_2195, ssa_2150
> /* succs: block_1 */
> }
>
> where after removing the second, third, and fourth phi nodes, the loop becomes
> entirely dead, and this patch will cause the loop to be deleted entirely.
>
> No piglit regressions.
>
> Shader-db results on bdw:
>
> instructions in affected programs: 5824 -> 5664 (-2.75%)
> helped:32
> HURT:  0

I added loop-count support to report.py recently.

instructions in affected programs: 5824 -> 5664 (-2.75%)
total loops in shared programs:2234 -> 2202 (-1.43%)
helped:0

(report.py doesn't count programs in which the number of loops changed
as helped/hurt because sometimes a loop is unrolled and the
instruction count triples)

I'd feel fine about changing helped: 0 to helped: 32 like you have.

>
> Signed-off-by: Connor Abbott 
> ---
> Seems like this one fell through the cracks...
>
>  src/glsl/nir/nir_opt_remove_phis.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/src/glsl/nir/nir_opt_remove_phis.c 
> b/src/glsl/nir/nir_opt_remove_phis.c
> index bf4a67e..0a5793a 100644
> --- a/src/glsl/nir/nir_opt_remove_phis.c
> +++ b/src/glsl/nir/nir_opt_remove_phis.c
> @@ -58,6 +58,21 @@ remove_phis_block(nir_block *block, void *state)
>
>nir_foreach_phi_src(phi, src) {
>   assert(src->src.is_ssa);
> +
> + /* For phi nodes at the beginning of loops, we may encounter some
> +  * sources from backedges that point back to the destination of the
> +  * same phi, i.e. something like:
> +  *
> +  * a = phi(a, b, ...)
> +  *
> +  * We can safely ignore these sources, since if all of the normal
> +  * sources point to the same definition, then that definition must
> +  * still dominate the phi node, and the phi will still always take
> +  * the value of that definition.
> +  */
> +

I'd remove the newline here.

> + if (src->src.ssa == >dest.ssa)
> +continue;
>
>   if (def == NULL) {
>  def  = src->src.ssa;
> @@ -72,6 +87,11 @@ remove_phis_block(nir_block *block, void *state)
>if (!srcs_same)
>   continue;
>
> +  /* We must have found at least one definition, since there must be at
> +   * least one forward edge.
> +   */
> +  assert(def != NULL);
> +
>assert(phi->dest.is_ssa);
>nir_ssa_def_rewrite_uses(>dest.ssa, nir_src_for_ssa(def));
>nir_instr_remove(instr);
> --
> 2.1.0

Look good to me. Thanks for remembering this.

Reviewed-by: Matt Turner 

One strange thing I noticed... this doesn't affect any programs on
Haswell. All the shaders are vertex shaders, so they're vec4 on
Haswell but scalar on Broadwell, so it's probably somehow related to
that.

Want to take a look at what's going on with Haswell? Seems like if the
loop is dead... it should be dead.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir/remove_phis: handle trivial back-edges

2015-10-02 Thread Connor Abbott
On Fri, Oct 2, 2015 at 12:56 PM, Matt Turner  wrote:
> On Fri, Oct 2, 2015 at 9:28 AM, Connor Abbott  wrote:
>> Some loops may have phi nodes that look like:
>>
>> foo = ...
>> loop {
>> bar = phi(foo, bar)
>> ...
>> }
>>
>> in which case we can remove the phi node and replace all uses of 'bar'
>> with 'foo'. In particular, there are some L4D2 vertex shaders with loops
>> that, after optimization, look like:
>>
>> /* succs: block_1 */
>> loop {
>> block block_1:
>> /* preds: block_0 block_4 */
>> vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994
>> vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321
>> vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324
>> vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327
>> vec1 ssa_8139 = intrinsic load_uniform () () (232)
>> vec1 ssa_588 = ige ssa_2195, ssa_8139
>> /* succs: block_2 block_3 */
>> if ssa_588 {
>> block block_2:
>> /* preds: block_1 */
>> break
>> /* succs: block_5 */
>> } else {
>> block block_3:
>> /* preds: block_1 */
>> /* succs: block_4 */
>> }
>> block block_4:
>> /* preds: block_3 */
>> vec1 ssa_994 = iadd ssa_2195, ssa_2150
>> /* succs: block_1 */
>> }
>>
>> where after removing the second, third, and fourth phi nodes, the loop 
>> becomes
>> entirely dead, and this patch will cause the loop to be deleted entirely.
>>
>> No piglit regressions.
>>
>> Shader-db results on bdw:
>>
>> instructions in affected programs: 5824 -> 5664 (-2.75%)
>> helped:32
>> HURT:  0
>
> I added loop-count support to report.py recently.
>
> instructions in affected programs: 5824 -> 5664 (-2.75%)
> total loops in shared programs:2234 -> 2202 (-1.43%)
> helped:0
>
> (report.py doesn't count programs in which the number of loops changed
> as helped/hurt because sometimes a loop is unrolled and the
> instruction count triples)
>
> I'd feel fine about changing helped: 0 to helped: 32 like you have.
>
>>
>> Signed-off-by: Connor Abbott 
>> ---
>> Seems like this one fell through the cracks...
>>
>>  src/glsl/nir/nir_opt_remove_phis.c | 20 
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_opt_remove_phis.c 
>> b/src/glsl/nir/nir_opt_remove_phis.c
>> index bf4a67e..0a5793a 100644
>> --- a/src/glsl/nir/nir_opt_remove_phis.c
>> +++ b/src/glsl/nir/nir_opt_remove_phis.c
>> @@ -58,6 +58,21 @@ remove_phis_block(nir_block *block, void *state)
>>
>>nir_foreach_phi_src(phi, src) {
>>   assert(src->src.is_ssa);
>> +
>> + /* For phi nodes at the beginning of loops, we may encounter some
>> +  * sources from backedges that point back to the destination of the
>> +  * same phi, i.e. something like:
>> +  *
>> +  * a = phi(a, b, ...)
>> +  *
>> +  * We can safely ignore these sources, since if all of the normal
>> +  * sources point to the same definition, then that definition must
>> +  * still dominate the phi node, and the phi will still always take
>> +  * the value of that definition.
>> +  */
>> +
>
> I'd remove the newline here.
>
>> + if (src->src.ssa == >dest.ssa)
>> +continue;
>>
>>   if (def == NULL) {
>>  def  = src->src.ssa;
>> @@ -72,6 +87,11 @@ remove_phis_block(nir_block *block, void *state)
>>if (!srcs_same)
>>   continue;
>>
>> +  /* We must have found at least one definition, since there must be at
>> +   * least one forward edge.
>> +   */
>> +  assert(def != NULL);
>> +
>>assert(phi->dest.is_ssa);
>>nir_ssa_def_rewrite_uses(>dest.ssa, nir_src_for_ssa(def));
>>nir_instr_remove(instr);
>> --
>> 2.1.0
>
> Look good to me. Thanks for remembering this.
>
> Reviewed-by: Matt Turner 

Ok, I fixed the newline and the results and pushed it.

>
> One strange thing I noticed... this doesn't affect any programs on
> Haswell. All the shaders are vertex shaders, so they're vec4 on
> Haswell but scalar on Broadwell, so it's probably somehow related to
> that.
>
> Want to take a look at what's going on with Haswell? Seems like if the
> loop is dead... it should be dead.

Yeah, I noticed that too. I'll look into it... it could just be that
the loops were getting deleted before this patch anyways without the
scalarizer.
___
mesa-dev mailing 

Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-02 Thread Ilia Mirkin
On Fri, Oct 2, 2015 at 1:48 PM, Marek Olšák  wrote:
> On Fri, Oct 2, 2015 at 4:12 PM, Roland Scheidegger  wrote:
>> Am 02.10.2015 um 12:19 schrieb Marek Olšák:
>>> On Fri, Oct 2, 2015 at 3:13 AM, Roland Scheidegger  
>>> wrote:
 Can't say I'm a big fan of doing essentially the same thing with
 different methods, but well it's coming from GL, and if it helps some
 drivers...

 Acked-by: Roland Scheidegger 
>>>
>>> If all drivers start supporting the CAP, we could remove the st/mesa
>>> emulation. Just as we could remove the color clamping emulation.
>>>
>>> Marek
>>>
>>
>> Yes, but that just moves the logic to the drivers. It's still (mostly)
>> the same thing handling per-sample inputs, or forcing them to per sample
>> via ARB_sample_shading.
>
> Yes, that's the main point. For performance, we should do the
> translation to TGSI at link time, so that the GLSL->TGSI overhead can
> be removed from draw calls. Because of that, all shader state
> dependencies should be moved to and implemented/emulated in drivers.
> This will be difficult to do because of the lack of interest from
> other driver maintainers, therefore we need CAPs until they realize
> it's good for them too.

In case that was directed at me, I realize it's good for me too. I
just don't know if I can do it on your schedule and I'd rather nouveau
not be broken in the meanwhile :)

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/11] i965/vec4: Use the uniform count from nir_assign_var_locations

2015-10-02 Thread Jason Ekstrand
On Fri, Oct 2, 2015 at 2:09 AM, Iago Toral  wrote:
> On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
>> Previously, we were counting up uniforms as we set them up.  However, this
>> count should be exactly identical to shader->num_uniforms provided by
>> nir_assign_var_locations.  (If it's not, we're in trouble anyway because
>> that means that locations don't match up.)  This matches what the fs
>> backend is already doing.
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 
>> ++
>>  1 file changed, 11 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index b0abfc1..ee94e58 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -134,7 +134,7 @@ vec4_visitor::nir_setup_inputs(nir_shader *shader)
>>  void
>>  vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>>  {
>> -   uniforms = 0;
>> +   uniforms = shader->num_uniforms;
>>
>> if (shader_prog) {
>>foreach_list_typed(nir_variable, var, node, >uniforms) {
>> @@ -145,8 +145,7 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>>  continue;
>>   }
>>
>> - assert(uniforms < uniform_array_size);
>> - uniform_size[uniforms] = type_size_vec4(var->type);
>> + uniform_size[var->data.driver_location] = 
>> type_size_vec4(var->type);
>>
>>   if (strncmp(var->name, "gl_", 3) == 0)
>>  nir_setup_builtin_uniform(var);
>> @@ -161,8 +160,8 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>>assert(shader->uniforms.length() == 1 &&
>>   strcmp(var->name, "parameters") == 0);
>>
>> -  assert(uniforms < uniform_array_size);
>> -  uniform_size[uniforms] = type_size_vec4(var->type);
>> +  assert(var->data.driver_location == 0);
>> +  uniform_size[0] = type_size_vec4(var->type);
>>
>>struct gl_program_parameter_list *plist = prog->Parameters;
>>for (unsigned p = 0; p < plist->NumParameters; p++) {
>> @@ -174,14 +173,12 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>>
>>   unsigned i;
>>   for (i = 0; i < plist->Parameters[p].Size; i++) {
>> -stage_prog_data->param[uniforms * 4 + i] = 
>> >ParameterValues[p][i];
>> +stage_prog_data->param[p * 4 + i] = 
>> >ParameterValues[p][i];
>>   }
>>   for (; i < 4; i++) {
>>  static const gl_constant_value zero = { 0.0 };
>> -stage_prog_data->param[uniforms * 4 + i] = 
>> +stage_prog_data->param[p * 4 + i] = 
>>   }
>> -
>> - uniforms++;
>>}
>> }
>>  }
>> @@ -197,6 +194,7 @@ vec4_visitor::nir_setup_uniform(nir_variable *var)
>>  * order we'd walk the type, so walk the list of storage and find 
>> anything
>>  * with our name, or the prefix of a component that starts with our name.
>>  */
>> +unsigned index = var->data.driver_location * 4;
>
> Maybe call this uniform_index for consistency with
> nir_setup_builtin_uniform below.

I made the change to the patch that splits out GLSL uniform handling.

Thanks for reviewing!

> Reviewed-by: Iago Toral Quiroga 
>
>>  for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
>> struct gl_uniform_storage *storage = _prog->UniformStorage[u];
>>
>> @@ -215,19 +213,14 @@ vec4_visitor::nir_setup_uniform(nir_variable *var)
>>  storage->type->matrix_columns);
>>
>> for (unsigned s = 0; s < vector_count; s++) {
>> -  assert(uniforms < uniform_array_size);
>> -
>>int i;
>>for (i = 0; i < storage->type->vector_elements; i++) {
>> - stage_prog_data->param[uniforms * 4 + i] = components;
>> - components++;
>> + stage_prog_data->param[index++] = components++;
>>}
>>for (; i < 4; i++) {
>>   static const gl_constant_value zero = { 0.0 };
>> - stage_prog_data->param[uniforms * 4 + i] = 
>> + stage_prog_data->param[index++] = 
>>}
>> -
>> -  uniforms++;
>> }
>>  }
>>  }
>> @@ -238,6 +231,7 @@ vec4_visitor::nir_setup_builtin_uniform(nir_variable 
>> *var)
>> const nir_state_slot *const slots = var->state_slots;
>> assert(var->state_slots != NULL);
>>
>> +   unsigned uniform_index = var->data.driver_location * 4;
>> for (unsigned int i = 0; i < var->num_state_slots; i++) {
>>/* This state reference has already been setup by ir_to_mesa,
>> * but we'll get the same index back here.  We can reference
>> @@ -249,13 +243,9 @@ vec4_visitor::nir_setup_builtin_uniform(nir_variable 
>> *var)
>>gl_constant_value *values =
>>   >Parameters->ParameterValues[index][0];
>>
>> -  assert(uniforms < uniform_array_size);
>> -
>>for (unsigned j 

[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #11 from Chih-Wei Huang  ---
(In reply to Emil Velikov from comment #10)
> In the few moments that android-x86 was up I've noticed a very interesting
> commit [1]. In case you don't have a local copy, it enables a workaround
> #10194508, as seen here [2].
> 
> Not sure if the android-x86 version has the extra qcom 'hacks' so it might
> be worth experimenting on the topic - adding/reverting the qcom commit,
> reverting the original patch [1], all that whist toggling on/off the
> extension ;-)
> 
> 
> [1]
> http://git.android-x86.org/?p=platform/frameworks/native.git;a=commit;
> h=39aa8695464e496ec4f6d22e9af54457bcfdf4ae
> [2]
> https://github.com/omnirom/android_frameworks_native/blob/HEAD/opengl/libs/
> EGL/eglApi.cpp#L431

OK, you pointed out an old workaround we applied long time ago.
Without it, we just got black screen on i915/i965 GPU and
garbled screen on radeon.
Credit to Chia-I Wu who found the workaround.
Otherwise android-x86 has already dead.

Actually I reverted that workaround each time I tried a new Mesa version.
However, it's no help. We still need the workaround.

I just tried to revert the workaround with the extension enabled/disabled.
Nothing changed. We still need the workaround.

Mesa still doesn't support PIXEL_FORMAT_RGBA_?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms

2015-10-02 Thread Ilia Mirkin
On Fri, Oct 2, 2015 at 2:51 AM, Iago Toral  wrote:
> On Thu, 2015-10-01 at 20:22 -0400, Ilia Mirkin wrote:
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/glsl/link_uniforms.cpp | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
>> index 47d49c8..740b0a4 100644
>> --- a/src/glsl/link_uniforms.cpp
>> +++ b/src/glsl/link_uniforms.cpp
>> @@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct 
>> gl_shader_program *prog,
>> const unsigned num_data_slots = uniform_size.num_values;
>> const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms;
>>
>> +   /* assign hidden uniforms a slot id */
>> +   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, _size);
>> +   delete hiddenUniforms;
>> +
>> /* On the outside chance that there were no uniforms, bail out.
>>  */
>> if (num_uniforms == 0)
>>return;
>>
>> -   /* assign hidden uniforms a slot id */
>> -   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, _size);
>> -   delete hiddenUniforms;
>> -
>
> I suppose there is no much gain in simply adding the delete statement
> right before the return...

That was my feeling as well -- the simplicity of having a single
delete seems to outweigh the modest overhead of iterating over an
empty hashtable.

Ideally we'd just use a scoped_ptr (RAII object that deletes ptr on
delete), but it's been too long for me to remember what is in what
version of what standard and I didn't want to get into that game.

>
> Reviewed-by: Iago Toral Quiroga 
>
>
>> struct gl_uniform_storage *uniforms =
>>rzalloc_array(prog, struct gl_uniform_storage, num_uniforms);
>> union gl_constant_value *data =
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir/remove_phis: handle trivial back-edges

2015-10-02 Thread Connor Abbott
On Fri, Oct 2, 2015 at 1:26 PM, Connor Abbott  wrote:
> On Fri, Oct 2, 2015 at 12:56 PM, Matt Turner  wrote:
>> On Fri, Oct 2, 2015 at 9:28 AM, Connor Abbott  wrote:
>>> Some loops may have phi nodes that look like:
>>>
>>> foo = ...
>>> loop {
>>> bar = phi(foo, bar)
>>> ...
>>> }
>>>
>>> in which case we can remove the phi node and replace all uses of 'bar'
>>> with 'foo'. In particular, there are some L4D2 vertex shaders with loops
>>> that, after optimization, look like:
>>>
>>> /* succs: block_1 */
>>> loop {
>>> block block_1:
>>> /* preds: block_0 block_4 */
>>> vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994
>>> vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321
>>> vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324
>>> vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327
>>> vec1 ssa_8139 = intrinsic load_uniform () () (232)
>>> vec1 ssa_588 = ige ssa_2195, ssa_8139
>>> /* succs: block_2 block_3 */
>>> if ssa_588 {
>>> block block_2:
>>> /* preds: block_1 */
>>> break
>>> /* succs: block_5 */
>>> } else {
>>> block block_3:
>>> /* preds: block_1 */
>>> /* succs: block_4 */
>>> }
>>> block block_4:
>>> /* preds: block_3 */
>>> vec1 ssa_994 = iadd ssa_2195, ssa_2150
>>> /* succs: block_1 */
>>> }
>>>
>>> where after removing the second, third, and fourth phi nodes, the loop 
>>> becomes
>>> entirely dead, and this patch will cause the loop to be deleted entirely.
>>>
>>> No piglit regressions.
>>>
>>> Shader-db results on bdw:
>>>
>>> instructions in affected programs: 5824 -> 5664 (-2.75%)
>>> helped:32
>>> HURT:  0
>>
>> I added loop-count support to report.py recently.
>>
>> instructions in affected programs: 5824 -> 5664 (-2.75%)
>> total loops in shared programs:2234 -> 2202 (-1.43%)
>> helped:0
>>
>> (report.py doesn't count programs in which the number of loops changed
>> as helped/hurt because sometimes a loop is unrolled and the
>> instruction count triples)
>>
>> I'd feel fine about changing helped: 0 to helped: 32 like you have.
>>
>>>
>>> Signed-off-by: Connor Abbott 
>>> ---
>>> Seems like this one fell through the cracks...
>>>
>>>  src/glsl/nir/nir_opt_remove_phis.c | 20 
>>>  1 file changed, 20 insertions(+)
>>>
>>> diff --git a/src/glsl/nir/nir_opt_remove_phis.c 
>>> b/src/glsl/nir/nir_opt_remove_phis.c
>>> index bf4a67e..0a5793a 100644
>>> --- a/src/glsl/nir/nir_opt_remove_phis.c
>>> +++ b/src/glsl/nir/nir_opt_remove_phis.c
>>> @@ -58,6 +58,21 @@ remove_phis_block(nir_block *block, void *state)
>>>
>>>nir_foreach_phi_src(phi, src) {
>>>   assert(src->src.is_ssa);
>>> +
>>> + /* For phi nodes at the beginning of loops, we may encounter some
>>> +  * sources from backedges that point back to the destination of 
>>> the
>>> +  * same phi, i.e. something like:
>>> +  *
>>> +  * a = phi(a, b, ...)
>>> +  *
>>> +  * We can safely ignore these sources, since if all of the normal
>>> +  * sources point to the same definition, then that definition must
>>> +  * still dominate the phi node, and the phi will still always take
>>> +  * the value of that definition.
>>> +  */
>>> +
>>
>> I'd remove the newline here.
>>
>>> + if (src->src.ssa == >dest.ssa)
>>> +continue;
>>>
>>>   if (def == NULL) {
>>>  def  = src->src.ssa;
>>> @@ -72,6 +87,11 @@ remove_phis_block(nir_block *block, void *state)
>>>if (!srcs_same)
>>>   continue;
>>>
>>> +  /* We must have found at least one definition, since there must be at
>>> +   * least one forward edge.
>>> +   */
>>> +  assert(def != NULL);
>>> +
>>>assert(phi->dest.is_ssa);
>>>nir_ssa_def_rewrite_uses(>dest.ssa, nir_src_for_ssa(def));
>>>nir_instr_remove(instr);
>>> --
>>> 2.1.0
>>
>> Look good to me. Thanks for remembering this.
>>
>> Reviewed-by: Matt Turner 
>
> Ok, I fixed the newline and the results and pushed it.
>
>>
>> One strange thing I noticed... this doesn't affect any programs on
>> Haswell. All the shaders are vertex shaders, so they're vec4 on
>> Haswell but scalar on Broadwell, so it's probably somehow related to
>> that.
>>
>> Want to take a look at what's going on with Haswell? Seems like if the
>> loop is dead... it should be dead.
>
> 

Re: [Mesa-dev] [PATCH 04/11] i965/vec4: Use the actual channels used in pack_uniform_registers

2015-10-02 Thread Jason Ekstrand
On Fri, Oct 2, 2015 at 12:32 AM, Iago Toral  wrote:
> On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
>> Previously, pack_uniform_registers worked based on the size of the uniform
>> as given to us when we initially set up the uniforms.  However, we have to
>> walk through the uniforms and figure out liveness anyway, so we migh as
>> well record the number of channels used as we go.  This may also allow us
>> to pack things tighter in a few cases.
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 52 
>> +-
>>  1 file changed, 38 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 407698f..f0fa07e 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -518,11 +518,11 @@ vec4_visitor::split_uniform_registers()
>>  void
>>  vec4_visitor::pack_uniform_registers()
>>  {
>> -   bool uniform_used[this->uniforms];
>> +   uint8_t chans_used[this->uniforms];
>> int new_loc[this->uniforms];
>> int new_chan[this->uniforms];
>>
>> -   memset(uniform_used, 0, sizeof(uniform_used));
>> +   memset(chans_used, 0, sizeof(chans_used));
>> memset(new_loc, 0, sizeof(new_loc));
>> memset(new_chan, 0, sizeof(new_chan));
>>
>> @@ -532,10 +532,36 @@ vec4_visitor::pack_uniform_registers()
>>  */
>> foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
>>for (int i = 0 ; i < 3; i++) {
>> -  if (inst->src[i].file != UNIFORM)
>> - continue;
>> + if (inst->src[i].file != UNIFORM)
>> +continue;
>
> The switch statement below should go before the loop.

Will do.

>> + unsigned readmask;
>> + switch (inst->opcode) {
>> + case VEC4_OPCODE_PACK_BYTES:
>> + case BRW_OPCODE_DP4:
>> + case BRW_OPCODE_DPH:
>> +readmask = 0xf;
>> +break;
>> + case BRW_OPCODE_DP3:
>> +readmask = 0x7;
>> +break;
>> + case BRW_OPCODE_DP2:
>> +readmask = 0x3;
>> +break;
>
> I don't know if there are other opcodes that could also be
> special-handled like these, but if there are we are only missing the
> opportunity to do tighter packing for them (which we are not doing now
> anyway)...
>
> Reviewed-by: Iago Toral Quiroga 
>
>> + default:
>> +readmask = inst->dst.writemask;
>> +break;
>> + }
>> +
>> + int reg = inst->src[i].reg;
>> +
>> + for (int c = 0; c < 4; c++) {
>> +if (!(readmask & (1 << c)))
>> +   continue;
>>
>> -  uniform_used[inst->src[i].reg] = true;
>> +chans_used[reg] = MAX2(chans_used[reg],
>> +   BRW_GET_SWZ(inst->src[i].swizzle, c) + 
>> 1);
>> + }
>>}
>> }
>>
>> @@ -546,17 +572,15 @@ vec4_visitor::pack_uniform_registers()
>>  */
>> for (int src = 0; src < uniforms; src++) {
>>assert(src < uniform_array_size);
>> -  int size = this->uniform_vector_size[src];
>> +  int size = chans_used[src];
>>
>> -  if (!uniform_used[src]) {
>> -  this->uniform_vector_size[src] = 0;
>> -  continue;
>> -  }
>> +  if (size == 0)
>> + continue;
>>
>>int dst;
>>/* Find the lowest place we can slot this uniform in. */
>>for (dst = 0; dst < src; dst++) {
>> -  if (this->uniform_vector_size[dst] + size <= 4)
>> +  if (chans_used[dst] + size <= 4)
>>   break;
>>}
>>
>> @@ -565,7 +589,7 @@ vec4_visitor::pack_uniform_registers()
>>new_chan[src] = 0;
>>} else {
>>new_loc[src] = dst;
>> -  new_chan[src] = this->uniform_vector_size[dst];
>> +  new_chan[src] = chans_used[dst];
>>
>>/* Move the references to the data */
>>for (int j = 0; j < size; j++) {
>> @@ -573,8 +597,8 @@ vec4_visitor::pack_uniform_registers()
>>  stage_prog_data->param[src * 4 + j];
>>}
>>
>> -  this->uniform_vector_size[dst] += size;
>> -  this->uniform_vector_size[src] = 0;
>> +  chans_used[dst] += size;
>> +  chans_used[src] = 0;
>>}
>>
>>new_uniform_count = MAX2(new_uniform_count, dst + 1);
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] glx-tls: Visibility hidden attribute and fix x86/x86_64 tls/tdsentry points

2015-10-02 Thread Emil Velikov
Hi Marc,

On 29 September 2015 at 10:31, Marc Dietrich  wrote:
> As expressed before, using hidden attribute only hides some hack on how to
> find the head of the dispatch entry table afaict.
>
> I just replaced it with shared_dispatch_stub_0() and this works for shared-
> glapi case, but not without shared-api (the first function is named
> differently). IMHO, a better fix would be to mark the head of the dispatch
> table for all cases and use this mark instead of the asm hack.
>
Can you please prepare a patch for this and send it over ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Rob Clark
On Fri, Oct 2, 2015 at 1:46 PM, Emil Velikov  wrote:
> On 2 October 2015 at 17:11, Rob Clark  wrote:
>> On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  
>> wrote:
>>> On 1 October 2015 at 20:44, Rob Clark  wrote:
 From: Rob Clark 

 Signed-off-by: Rob Clark 
 ---
 Drop the idea of more ambitious re-arrangement if libs, but keep the
 pipe-loader refactoring.  With this at least drm_gralloc could still
 dlopen() gallium_dri.so and then use the pipe-loader API to figure
 out which pipe driver to load and hand back a screen.

 The nine st is not updated.. but I don't claim to understand the whole
 screen + sw-screen thing.  So I figured I'd let someone who knew what
 they were doing update nine.  Once that happens, we should change to
 not expose the dd_xyz fxns outside of pipe-loader, imho..

>>> If the intent here is to consolidate/abstract things, this patch isn't
>>> the way I'm afraid.
>>>
>>> Namely, it drops support for software only pipe-driver. It also
>>> removes pipe-driver support for dri, xa and vl-based modules. All of
>>> which used to work fine last time I've tested them (admittedly ~6
>>> months ago).
>>
>> I assume you mean !GALLIUM_STATIC_TARGETS support?
>>
> Precisely.
>
>> I think we just need to build pipe-driver twice, once in each mode,
>> and link either the static-pipe or dynamic-pipe version per state
>> tracker depending on how you want things..  but wasn't sure the best
>> way to go about that..
>>
> I believe that won't work as the pipe-loader itself dlopens the
> pipe-driver (pipe_foo.so).

I could be being dense, but why wouldn't that work properly if we
built pipe-loader twice?  Ie. the non-GALLIUM_STATIC_TARGETS build
should be built with the right CFLAGS so the code path that tries to
open pipe_foo.so ends up in the binary

(I do wonder a bit why the search-path gets passed in externally from
pipe-loader, since all the state trackers using the
non-GALLIUM_STATIC_TARGETS would presumably want to look in the same
place for pipe_foo.so, but maybe that is a separate clean-up..)

>>
>>> The idea that I have in mind is roughly as:
>>> 1) abstract most of the 'target-helpers' as 'static' pipe-loader,
>>> providing pipe-loader like interface.
>>> 2) drop the ifdefs in the former, add dummy
>>> nouveau_drm_screen_create implementations.
>>> 3) remove all the ifdefs in the state-trackers.
>>> 4) add a configure switch that allows one to toggle/choose which how
>>> the modules will be build. default to 'mega' (static).
>>
>> hmm, that sounds harder than just building it twice..
>>
> Indeed it's quite hairy, so if you can come with a better idea I'm all ears.
>
>>> If you want to hack on that be my guest, but be aware that the
>>> pipe-loader interface has some non-obvious fd ownership patterns ;-)
>>>
>>> Atm, I'm having fun with drm_gralloc and mesa. Expect patches on that
>>> one later on today/tomorrow.
>>
>> finally moving drm_gralloc into mesa?  That would be helpful, I think
>> that should happen before we try to cleanup and merge the dmabuf
>> parts..
>>
> Yea... that one is rather fun, as I would like to preserve as much of
> the history as possible, while avoiding commits from people named
> "Somebody" and alike. I think I got it, will need to see how fast my
> laptop is going to build it.

cool!

BR,
-R

> Thanks
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-02 Thread Marek Olšák
On Fri, Oct 2, 2015 at 4:12 PM, Roland Scheidegger  wrote:
> Am 02.10.2015 um 12:19 schrieb Marek Olšák:
>> On Fri, Oct 2, 2015 at 3:13 AM, Roland Scheidegger  
>> wrote:
>>> Can't say I'm a big fan of doing essentially the same thing with
>>> different methods, but well it's coming from GL, and if it helps some
>>> drivers...
>>>
>>> Acked-by: Roland Scheidegger 
>>
>> If all drivers start supporting the CAP, we could remove the st/mesa
>> emulation. Just as we could remove the color clamping emulation.
>>
>> Marek
>>
>
> Yes, but that just moves the logic to the drivers. It's still (mostly)
> the same thing handling per-sample inputs, or forcing them to per sample
> via ARB_sample_shading.

Yes, that's the main point. For performance, we should do the
translation to TGSI at link time, so that the GLSL->TGSI overhead can
be removed from draw calls. Because of that, all shader state
dependencies should be moved to and implemented/emulated in drivers.
This will be difficult to do because of the lack of interest from
other driver maintainers, therefore we need CAPs until they realize
it's good for them too.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Emil Velikov
On 2 October 2015 at 17:11, Rob Clark  wrote:
> On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  
> wrote:
>> On 1 October 2015 at 20:44, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> Drop the idea of more ambitious re-arrangement if libs, but keep the
>>> pipe-loader refactoring.  With this at least drm_gralloc could still
>>> dlopen() gallium_dri.so and then use the pipe-loader API to figure
>>> out which pipe driver to load and hand back a screen.
>>>
>>> The nine st is not updated.. but I don't claim to understand the whole
>>> screen + sw-screen thing.  So I figured I'd let someone who knew what
>>> they were doing update nine.  Once that happens, we should change to
>>> not expose the dd_xyz fxns outside of pipe-loader, imho..
>>>
>> If the intent here is to consolidate/abstract things, this patch isn't
>> the way I'm afraid.
>>
>> Namely, it drops support for software only pipe-driver. It also
>> removes pipe-driver support for dri, xa and vl-based modules. All of
>> which used to work fine last time I've tested them (admittedly ~6
>> months ago).
>
> I assume you mean !GALLIUM_STATIC_TARGETS support?
>
Precisely.

> I think we just need to build pipe-driver twice, once in each mode,
> and link either the static-pipe or dynamic-pipe version per state
> tracker depending on how you want things..  but wasn't sure the best
> way to go about that..
>
I believe that won't work as the pipe-loader itself dlopens the
pipe-driver (pipe_foo.so).

>
>> The idea that I have in mind is roughly as:
>> 1) abstract most of the 'target-helpers' as 'static' pipe-loader,
>> providing pipe-loader like interface.
>> 2) drop the ifdefs in the former, add dummy
>> nouveau_drm_screen_create implementations.
>> 3) remove all the ifdefs in the state-trackers.
>> 4) add a configure switch that allows one to toggle/choose which how
>> the modules will be build. default to 'mega' (static).
>
> hmm, that sounds harder than just building it twice..
>
Indeed it's quite hairy, so if you can come with a better idea I'm all ears.

>> If you want to hack on that be my guest, but be aware that the
>> pipe-loader interface has some non-obvious fd ownership patterns ;-)
>>
>> Atm, I'm having fun with drm_gralloc and mesa. Expect patches on that
>> one later on today/tomorrow.
>
> finally moving drm_gralloc into mesa?  That would be helpful, I think
> that should happen before we try to cleanup and merge the dmabuf
> parts..
>
Yea... that one is rather fun, as I would like to preserve as much of
the history as possible, while avoiding commits from people named
"Somebody" and alike. I think I got it, will need to see how fast my
laptop is going to build it.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix GetProgramiv/GetActiveAttrib regression

2015-10-02 Thread Tapani Pälli

On 10/02/2015 07:46 PM, Ilia Mirkin wrote:

On Mon, Sep 28, 2015 at 5:12 AM, Tapani Pälli  wrote:

Commit 4639cea2921669527eb43dcb49724c05afb27e8e added packed varyings
as part of program resource list. We need to take this to account with
all functions dealing with active input attributes.

Patch fixes the issue by adding additional check to is_active_attrib
and iterating active attribs explicitly in GetActiveAttrib function.

I've tested that fix works for Assault Android Cactus (demo version)
and does not cause Piglit or CTS regressions in glGetProgramiv tests.

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122
---
  src/mesa/main/shader_query.cpp | 44 +-
  1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 99d9e10..d023504 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -101,15 +101,34 @@ _mesa_BindAttribLocation(GLhandleARB program, GLuint 
index,
  */
  }

+/* Function checks if given input variable is packed varying by checking
+ * if there also exists output variable with the same name.
+ */
+static bool
+is_packed_varying(struct gl_shader_program *shProg, const ir_variable *input)
+{
+   assert(input->data.mode == ir_var_shader_in);

No statements before variable decls in core code.

ok

+   unsigned array_index = 0;
+   struct gl_program_resource *out =
+  _mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, input->name,
+   _index);

This is slow, right? I.e. it's a loop?


It is agreeably slow, alternatively we could add new flag to ir_variable 
to make things quick.



+   if (out)
+  return true;
+   return false;
+}
+
  static bool
-is_active_attrib(const ir_variable *var)
+is_active_attrib(struct gl_shader_program *shProg,
+ const ir_variable *var)
  {
 if (!var)
return false;

 switch (var->data.mode) {
 case ir_var_shader_in:
-  return var->data.location != -1;
+  return (var->data.location != -1 &&
+  var->data.location_frac == 0 &&
+  !is_packed_varying(shProg, var));

 case ir_var_system_value:
/* From GL 4.3 core spec, section 11.1.1 (Vertex Attributes):
@@ -154,19 +173,25 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint 
desired_index,
return;
 }

-   struct gl_program_resource *res =
-  _mesa_program_resource_find_index(shProg, GL_PROGRAM_INPUT,
-desired_index);

You appear to replace this function, which is O(n), with a loop around
what is essentially this same function, which is O(n^2). Could you
write a more targeted helper which achieves what you want with a
single scan of the program resource list?


OK, I can simplify this. At this point I think it would be fine to 
revert the patches adding packed varyings to resource list. I can then 
send these changes together when adding them back.



+   struct gl_program_resource *res = shProg->ProgramResourceList;
+   unsigned index = -1;
+   for (unsigned i = 0; i < shProg->NumProgramResourceList; i++, res++) {
+  if (res->Type != GL_PROGRAM_INPUT ||
+  !is_active_attrib(shProg, RESOURCE_VAR(res)))
+ continue;
+  if (++index == desired_index)
+ break;
+   }

 /* User asked for index that does not exist. */
-   if (!res) {
+   if (!res || index != desired_index) {
_mesa_error(ctx, GL_INVALID_VALUE, "glGetActiveAttrib(index)");
return;
 }

 const ir_variable *const var = RESOURCE_VAR(res);

-   if (!is_active_attrib(var))
+   if (!is_active_attrib(shProg, var))
return;

 const char *var_name = var->name;
@@ -252,7 +277,7 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg)
 for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
if (res->Type == GL_PROGRAM_INPUT &&
res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
-  is_active_attrib(RESOURCE_VAR(res)))
+  is_active_attrib(shProg, RESOURCE_VAR(res)))
   count++;
 }
 return count;
@@ -271,7 +296,8 @@ _mesa_longest_attribute_name_length(struct 
gl_shader_program *shProg)
 size_t longest = 0;
 for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
if (res->Type == GL_PROGRAM_INPUT &&
-  res->StageReferences & (1 << MESA_SHADER_VERTEX)) {
+  res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
+  is_active_attrib(shProg, RESOURCE_VAR(res))) {

const size_t length = strlen(RESOURCE_VAR(res)->name);
if (length >= longest)
--
2.4.3



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

2015-10-02 Thread Jason Ekstrand
On Thu, Oct 1, 2015 at 11:13 AM, Kenneth Graunke  wrote:
> Geometry and tessellation shaders process multiple vertices; their
> inputs are arrays indexed by the vertex number.  While GLSL makes
> this look like a normal array, it can be very different behind the
> scenes.
>
> On Intel hardware, all inputs for a particular vertex are stored
> together - as if they were grouped into a single struct.  This means
> that consecutive elements of these top-level arrays are not contiguous.
> In fact, they may sometimes be in completely disjoint memory segments.
>
> NIR's existing load_input intrinsics are awkward for this case, as they
> distill everything down to a single offset.  We'd much rather keep the
> vertex ID separate, but build up an offset as normal beyond that.
>
> This patch introduces new nir_intrinsic_load_per_vertex_input
> intrinsics to handle this case.  They work like ordinary load_input
> intrinsics, but have an extra source (src[0]) which represents the
> outermost array index.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/nir_intrinsics.h |  1 +
>  src/glsl/nir/nir_lower_io.c   | 54 ++---
>  src/glsl/nir/nir_print.c  |  2 +
>  src/mesa/drivers/dri/i965/brw_nir.c   | 13 +-
>  src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 58 
> +++
>  5 files changed, 86 insertions(+), 42 deletions(-)
>
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index ac4c2ba..263d8c1 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -228,6 +228,7 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
>  LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>  LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>  LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> +LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
>  LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
>
>  /*
> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> index 80967fc..d066ca6 100644
> --- a/src/glsl/nir/nir_lower_io.c
> +++ b/src/glsl/nir/nir_lower_io.c
> @@ -63,8 +63,20 @@ nir_assign_var_locations(struct exec_list *var_list, 
> unsigned *size,
> *size = location;
>  }
>
> +/**
> + * Returns true if we're processing a stage whose inputs are arrays indexed
> + * by a vertex number (such as geometry shader inputs).
> + */
> +static bool
> +stage_uses_per_vertex_inputs(struct lower_io_state *state)
> +{
> +   gl_shader_stage stage = state->builder.shader->stage;
> +   return stage == MESA_SHADER_GEOMETRY;
> +}
> +
>  static unsigned
>  get_io_offset(nir_deref_var *deref, nir_instr *instr,
> +  nir_src *vertex_index,
>nir_src *indirect, bool *has_indirect,
>struct lower_io_state *state)
>  {
> @@ -75,6 +87,22 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
> b->cursor = nir_before_instr(instr);
>
> nir_deref *tail = >deref;
> +
> +   /* For per-vertex input arrays (i.e. geometry shader inputs), keep the
> +* outermost array index separate.  Process the rest normally.
> +*/
> +   if (vertex_index != NULL) {
> +  tail = tail->child;
> +  assert(tail->deref_type == nir_deref_type_array);
> +  nir_deref_array *deref_array = nir_deref_as_array(tail);
> +
> +  nir_ssa_def *vtx = nir_imm_int(b, deref_array->base_offset);
> +  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
> + vtx = nir_iadd(b, vtx, nir_ssa_for_src(b, deref_array->indirect, 
> 1));
> +  }
> +  *vertex_index = nir_src_for_ssa(vtx);
> +   }
> +
> while (tail->child != NULL) {
>const struct glsl_type *parent_type = tail->type;
>tail = tail->child;
> @@ -114,13 +142,19 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
>  }
>
>  static nir_intrinsic_op
> -load_op(nir_variable_mode mode, bool has_indirect)
> +load_op(struct lower_io_state *state,
> +nir_variable_mode mode, bool per_vertex, bool has_indirect)
>  {
> nir_intrinsic_op op;
> switch (mode) {
> case nir_var_shader_in:
> -  op = has_indirect ? nir_intrinsic_load_input_indirect :
> -  nir_intrinsic_load_input;
> +  if (per_vertex) {
> + op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
> + nir_intrinsic_load_per_vertex_input;
> +  } else {
> + op = has_indirect ? nir_intrinsic_load_input_indirect :
> + nir_intrinsic_load_input;
> +  }
>break;
> case nir_var_uniform:
>op = has_indirect ? nir_intrinsic_load_uniform_indirect :
> @@ -157,15 +191,21 @@ nir_lower_io_block(nir_block *block, void *void_state)
>   if (mode != nir_var_shader_in && mode != nir_var_uniform)
>   

Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Emil Velikov
On 2 October 2015 at 19:02, Rob Clark  wrote:
> On Fri, Oct 2, 2015 at 1:46 PM, Emil Velikov  wrote:
>> On 2 October 2015 at 17:11, Rob Clark  wrote:
>>> On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  
>>> wrote:
 On 1 October 2015 at 20:44, Rob Clark  wrote:
> From: Rob Clark 
>
> Signed-off-by: Rob Clark 
> ---
> Drop the idea of more ambitious re-arrangement if libs, but keep the
> pipe-loader refactoring.  With this at least drm_gralloc could still
> dlopen() gallium_dri.so and then use the pipe-loader API to figure
> out which pipe driver to load and hand back a screen.
>
> The nine st is not updated.. but I don't claim to understand the whole
> screen + sw-screen thing.  So I figured I'd let someone who knew what
> they were doing update nine.  Once that happens, we should change to
> not expose the dd_xyz fxns outside of pipe-loader, imho..
>
 If the intent here is to consolidate/abstract things, this patch isn't
 the way I'm afraid.

 Namely, it drops support for software only pipe-driver. It also
 removes pipe-driver support for dri, xa and vl-based modules. All of
 which used to work fine last time I've tested them (admittedly ~6
 months ago).
>>>
>>> I assume you mean !GALLIUM_STATIC_TARGETS support?
>>>
>> Precisely.
>>
>>> I think we just need to build pipe-driver twice, once in each mode,
>>> and link either the static-pipe or dynamic-pipe version per state
>>> tracker depending on how you want things..  but wasn't sure the best
>>> way to go about that..
>>>
>> I believe that won't work as the pipe-loader itself dlopens the
>> pipe-driver (pipe_foo.so).
>
> I could be being dense, but why wouldn't that work properly if we
> built pipe-loader twice?  Ie. the non-GALLIUM_STATIC_TARGETS build
> should be built with the right CFLAGS so the code path that tries to
> open pipe_foo.so ends up in the binary
>
> (I do wonder a bit why the search-path gets passed in externally from
> pipe-loader, since all the state trackers using the
> non-GALLIUM_STATIC_TARGETS would presumably want to look in the same
> place for pipe_foo.so, but maybe that is a separate clean-up..)
>
I wouldn't say that you're dense - could be tired and/or my
explanation leaves something to be desired. Let me try it from another
angle.

If we build pipe-loader as is, it is part of the 'API' library - dri,
vdpau To work, it uses the separate/standalone pipe-drivers.

Even if we 'build pipe-loader twice' this won't bring us anything as
the pipe-driver(s) will still be external module. And as soon as you
start hacking around dlopen you effectively rewrite the
target-helpers.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] i915: Pile of fixes and cleanups

2015-10-02 Thread Ian Romanick
On 10/02/2015 01:30 PM, Ville Syrjälä wrote:
> On Sat, Oct 03, 2015 at 02:55:57AM +0800, Chih-Wei Huang wrote:
>> 2015-05-22 1:21 GMT+08:00 Ville Syrjälä :
>>> On Sat, May 16, 2015 at 01:36:44AM +0800, Chih-Wei Huang wrote:
 2015-04-13 19:00 GMT+08:00 Ville Syrjälä :

 Glad to see the patches to improve i915 driver.
 I thought Intel has given it up.
 Thank you!

 I'm happy to test them on my Atom N450 tablet
 once they are merged.
>>>
>>> Available at
>>> git://github.com/vsyrjala/mesa.git i915_fixes_7
>>> if you can't wait :P
>>
>> Hello, how about these i915 patches?
>> Seems they are not merged.
>> Are they still needed?
>>
>> I've tried to apply to mesa 11.0.2
>> but there are conflicts I can't resolve.
> 
> I've not managed to push them. I was planning to push them but then my summer
> vacation was coming up, and I figured it's a bit rude to push and disappear
> in case there are any issues. After coming back I naturally forgot all about
> it, and when I did eventually rememer 11.0 was just about to be released,
> so I thought it's better to push after the release, and then I've just been
> too busy with other crap to do anything with these.
> 
> But I think Ian recently picked up at least some of them (thanks for that
> BTW). I've not found the time to really good look at what patches he had
> taken and which were still missing. Ian?

All except patch 10 have either already been pushed, have been re-sent
to the list, or are irrelevant (because they change code that I deleted).

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Rob Clark
On Fri, Oct 2, 2015 at 2:41 PM, Emil Velikov  wrote:
> On 2 October 2015 at 19:02, Rob Clark  wrote:
>> On Fri, Oct 2, 2015 at 1:46 PM, Emil Velikov  
>> wrote:
>>> On 2 October 2015 at 17:11, Rob Clark  wrote:
 On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  
 wrote:
> On 1 October 2015 at 20:44, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Signed-off-by: Rob Clark 
>> ---
>> Drop the idea of more ambitious re-arrangement if libs, but keep the
>> pipe-loader refactoring.  With this at least drm_gralloc could still
>> dlopen() gallium_dri.so and then use the pipe-loader API to figure
>> out which pipe driver to load and hand back a screen.
>>
>> The nine st is not updated.. but I don't claim to understand the whole
>> screen + sw-screen thing.  So I figured I'd let someone who knew what
>> they were doing update nine.  Once that happens, we should change to
>> not expose the dd_xyz fxns outside of pipe-loader, imho..
>>
> If the intent here is to consolidate/abstract things, this patch isn't
> the way I'm afraid.
>
> Namely, it drops support for software only pipe-driver. It also
> removes pipe-driver support for dri, xa and vl-based modules. All of
> which used to work fine last time I've tested them (admittedly ~6
> months ago).

 I assume you mean !GALLIUM_STATIC_TARGETS support?

>>> Precisely.
>>>
 I think we just need to build pipe-driver twice, once in each mode,
 and link either the static-pipe or dynamic-pipe version per state
 tracker depending on how you want things..  but wasn't sure the best
 way to go about that..

>>> I believe that won't work as the pipe-loader itself dlopens the
>>> pipe-driver (pipe_foo.so).
>>
>> I could be being dense, but why wouldn't that work properly if we
>> built pipe-loader twice?  Ie. the non-GALLIUM_STATIC_TARGETS build
>> should be built with the right CFLAGS so the code path that tries to
>> open pipe_foo.so ends up in the binary
>>
>> (I do wonder a bit why the search-path gets passed in externally from
>> pipe-loader, since all the state trackers using the
>> non-GALLIUM_STATIC_TARGETS would presumably want to look in the same
>> place for pipe_foo.so, but maybe that is a separate clean-up..)
>>
> I wouldn't say that you're dense - could be tired and/or my
> explanation leaves something to be desired. Let me try it from another
> angle.

probably explanations are not clear on both sides, although probably
mostly my fault for being a bit new/naive to the way the build is laid
out..

> If we build pipe-loader as is, it is part of the 'API' library - dri,
> vdpau To work, it uses the separate/standalone pipe-drivers.
>
> Even if we 'build pipe-loader twice' this won't bring us anything as
> the pipe-driver(s) will still be external module. And as soon as you
> start hacking around dlopen you effectively rewrite the
> target-helpers.

if I'm understanding things properly, src/gallium/targets/* links
directly against libpipe_loader.la currently..  what I'm proposing (I
think) is we split libpipe_loader.la into libpipe_loader_static.la and
libpipe_loader_dynamic.la (or, well, maybe those aren't the best names
but I'll use them for now), which is what I meant by compile
pipe-loader twice, and depending on whether st wants "mega" or not it
pulls in either libpipe_loader_dynamic.la or
libpipe_loader_static.la..

I mean, that doesn't completely get rid of the per-target 'if
HAVE_GALLIUM_STATIC_TARGETS' bit which statically pulls in the
individual pipe drivers via $TARGET_LIB_DEPS..  (although maybe there
is even a clever way to pull those in indirectly via
libpipe_loader_static.la?)

BR,
-R

> -Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/13] i965: Pull shader, prog, and shader_prog out of the

2015-10-02 Thread Kenneth Graunke
On Thursday, October 01, 2015 06:50:26 PM Jason Ekstrand wrote:
> This series does a bunch of code-shuffling with the end objective of
> getting the gl_* data structures out of the backend compiler.  Now that we
> have NIR, we would like the compiler to be a NIR -> binary translator and
> not be loaded full of mesa and GL-specific stuff.  This series gets us 95%
> of the way there.  Unfortunately, we still have to pass gl_program into
> geometry shaders for transform-fedback and, for fragment shaders, we need
> to pass in gl_shader_program so that we can fix up TEXTURE_RECTANGLE on
> older hardware (pre-Broadwell).
> 
> Incidentally, according to our CI system, this series fixes 2 piglit tests
> on Iron Lake and GM45.  I have no idea why.
> 
> Jason Ekstrand (13):
>   i965/shader: Pull assign_common_binding_table_offsets out of
> backend_shader
>   i965: Move binding table setup to codegen time.
>   i965: Move prog_data uniform setup to the codegen level
>   nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader
>   nir: Add a a nir_shader_info struct
>   nir: Move GS data to nir_shader_info
>   i965/backend_shader: Add a field to store the NIR shader
>   i965/cs: Remove the prog argument from local_id_payload_dwords
>   i965/fs: Move sampler unit lookup into rescale_texcoord
>   i965/fs: Use the nir info instead of pulling things out of
> [shader_]prog
>   i965/vec4: Use nir info instead of pulling things out of [shader_]prog
>   i965/fs,vec4: Get rid of the sanity_param_count
>   i965/shader: Get rid of the shader, prog, and shader_prog fields

Series is:
Reviewed-by: Kenneth Graunke 

I used this to read patch 13, as it never seemed to hit the mailing list
for some reason or other:
http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/compiler-separation


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #12 from Ilia Mirkin  ---
Looking over that patch, I noticed this hunk:

   memcpy(>base, , sizeof base);
   if (double_buffer) {
- conf->dri_double_config = dri_config;
- conf->dri_single_config = NULL;
+ if (srgb)
+conf->dri_srgb_double_config = dri_config;
+ else
+conf->dri_double_config = dri_config;
   } else {
- conf->dri_single_config = dri_config;
- conf->dri_double_config = NULL;
+ if (srgb)
+conf->dri_srgb_single_config = dri_config;
+ else
+conf->dri_single_config = dri_config;
   }

Previously the "other" configs would get cleared out whereas they aren't
anymore.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #13 from Ilia Mirkin  ---
(In reply to Ilia Mirkin from comment #12)
> Looking over that patch, I noticed this hunk:
> 
>memcpy(>base, , sizeof base);
>if (double_buffer) {
> - conf->dri_double_config = dri_config;
> - conf->dri_single_config = NULL;
> + if (srgb)
> +conf->dri_srgb_double_config = dri_config;
> + else
> +conf->dri_double_config = dri_config;
>} else {
> - conf->dri_single_config = dri_config;
> - conf->dri_double_config = NULL;
> + if (srgb)
> +conf->dri_srgb_single_config = dri_config;
> + else
> +conf->dri_single_config = dri_config;
>}
> 
> Previously the "other" configs would get cleared out whereas they aren't
> anymore.

Er nevermind. Before it was malloc'd but now it's calloc'd.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] i915: Pile of fixes and cleanups

2015-10-02 Thread Ville Syrjälä
On Sat, Oct 03, 2015 at 02:55:57AM +0800, Chih-Wei Huang wrote:
> 2015-05-22 1:21 GMT+08:00 Ville Syrjälä :
> > On Sat, May 16, 2015 at 01:36:44AM +0800, Chih-Wei Huang wrote:
> >> 2015-04-13 19:00 GMT+08:00 Ville Syrjälä :
> >>
> >> Glad to see the patches to improve i915 driver.
> >> I thought Intel has given it up.
> >> Thank you!
> >>
> >> I'm happy to test them on my Atom N450 tablet
> >> once they are merged.
> >
> > Available at
> > git://github.com/vsyrjala/mesa.git i915_fixes_7
> > if you can't wait :P
> 
> Hello, how about these i915 patches?
> Seems they are not merged.
> Are they still needed?
> 
> I've tried to apply to mesa 11.0.2
> but there are conflicts I can't resolve.

I've not managed to push them. I was planning to push them but then my summer
vacation was coming up, and I figured it's a bit rude to push and disappear
in case there are any issues. After coming back I naturally forgot all about
it, and when I did eventually rememer 11.0 was just about to be released,
so I thought it's better to push after the release, and then I've just been
too busy with other crap to do anything with these.

But I think Ian recently picked up at least some of them (thanks for that
BTW). I've not found the time to really good look at what patches he had
taken and which were still missing. Ian?

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] nir/lower_io: Return has_indirect flag from get_io_offset().

2015-10-02 Thread Jason Ekstrand
On Thu, Oct 1, 2015 at 11:13 AM, Kenneth Graunke  wrote:
> get_io_offset() already walks the dereference chain and discovers
> whether or not we have an indirect; we can just return that rather than
> computing it a second time.  This means moving the call a bit earlier.
>
> More importantly, I'm about to introduce special handling for the
> outermost array index, which would require changing the dereference
> we pass to deref_has_indirect().  Or we can just eliminate that.

I think I actually like what we did in nir_lower_samplers a bit
better.  There we just made indirect a nir_ssa_def* and NULL means no
indirect.  Since these passes all happen while we're in SSA form, that
works pretty well.  Thoughts?

> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/nir_lower_io.c | 41 ++---
>  1 file changed, 14 insertions(+), 27 deletions(-)
>
> This replaces patches 3-5 of my recent 23 patch series:
>
> [PATCH 03/23] nir/lower_io: Switch on shader stage for inputs in load_op().
> [PATCH 04/23] nir/lower_io: Make get_io_offset take a nir_deref, not a 
> nir_deref_var.
> [PATCH 05/23] nir: Introduce new nir_intrinsic_load_per_vertex_input 
> intrinsics.
>
> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> index f32c09d..80967fc 100644
> --- a/src/glsl/nir/nir_lower_io.c
> +++ b/src/glsl/nir/nir_lower_io.c
> @@ -63,22 +63,9 @@ nir_assign_var_locations(struct exec_list *var_list, 
> unsigned *size,
> *size = location;
>  }
>
> -static bool
> -deref_has_indirect(nir_deref_var *deref)
> -{
> -   for (nir_deref *tail = deref->deref.child; tail; tail = tail->child) {
> -  if (tail->deref_type == nir_deref_type_array) {
> - nir_deref_array *arr = nir_deref_as_array(tail);
> - if (arr->deref_array_type == nir_deref_array_type_indirect)
> -return true;
> -  }
> -   }
> -
> -   return false;
> -}
> -
>  static unsigned
> -get_io_offset(nir_deref_var *deref, nir_instr *instr, nir_src *indirect,
> +get_io_offset(nir_deref_var *deref, nir_instr *instr,
> +  nir_src *indirect, bool *has_indirect,
>struct lower_io_state *state)
>  {
> bool found_indirect = false;
> @@ -122,6 +109,7 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, 
> nir_src *indirect,
>}
> }
>
> +   *has_indirect = found_indirect;
> return base_offset;
>  }
>
> @@ -169,17 +157,17 @@ nir_lower_io_block(nir_block *block, void *void_state)
>   if (mode != nir_var_shader_in && mode != nir_var_uniform)
>  continue;
>
> - bool has_indirect = deref_has_indirect(intrin->variables[0]);
> + nir_src indirect;
> + bool has_indirect;
> +
> + unsigned offset = get_io_offset(intrin->variables[0], 
> >instr,
> + , _indirect, state);
>
>   nir_intrinsic_instr *load =
>  nir_intrinsic_instr_create(state->mem_ctx,
> load_op(mode, has_indirect));
>   load->num_components = intrin->num_components;
>
> - nir_src indirect;
> - unsigned offset = get_io_offset(intrin->variables[0],
> - >instr, , state);
> -
>   unsigned location = intrin->variables[0]->var->data.driver_location;
>   if (mode == nir_var_uniform) {
>  load->const_index[0] = location;
> @@ -209,7 +197,12 @@ nir_lower_io_block(nir_block *block, void *void_state)
>   if (intrin->variables[0]->var->data.mode != nir_var_shader_out)
>  continue;
>
> - bool has_indirect = deref_has_indirect(intrin->variables[0]);
> + nir_src indirect;
> + bool has_indirect;
> +
> + unsigned offset = get_io_offset(intrin->variables[0], 
> >instr,
> + , _indirect, state);
> + offset += intrin->variables[0]->var->data.driver_location;
>
>   nir_intrinsic_op store_op;
>   if (has_indirect) {
> @@ -221,12 +214,6 @@ nir_lower_io_block(nir_block *block, void *void_state)
>   nir_intrinsic_instr *store = 
> nir_intrinsic_instr_create(state->mem_ctx,
>   store_op);
>   store->num_components = intrin->num_components;
> -
> - nir_src indirect;
> - unsigned offset = get_io_offset(intrin->variables[0],
> - >instr, , state);
> - offset += intrin->variables[0]->var->data.driver_location;
> -
>   store->const_index[0] = offset;
>
>   nir_src_copy(>src[0], >src[0], store);
> --
> 2.5.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH 0/4] Reserve binding table space for SSBOs

2015-10-02 Thread Kenneth Graunke
On Wednesday, September 30, 2015 12:06:38 PM Iago Toral Quiroga wrote:
> This fixes a bug that Curro pointed out: we only allocate entries for
> UBOs at the moment. This a problem when a shader defines more
> than 12 combined UBO and SSBO surfaces. This is done with patch 3.
> 
> patch 4 simply adds an assertion to make sure that we never try to exceed
> that limit for some reason. It should be impossible since the compiler
> checks the individual limits, but if that actually were to happen because
> of some bug in the future I guess we want to detect that with the assert
> better.
> 
> I also replace all the hardcoded limits for UBO and SSBOS, that follows
> what we have done for other things like images or ABOs and it is just
> better. This is done with patches 1-2.
> 
> Although we are discussing that we probably want to have separated index
> spaces for UBOs and SSBOs I think we should probably land at least patches
> 1-3 since these would not be affected by that change.
> 
> I also think that we want to land [1], even if that one would be affected,
> since that fixes a current problem in master.
> 
> [1] https://patchwork.freedesktop.org/patch/60654/
> 
> Iago Toral Quiroga (4):
>   i965: Define BRW_MAX_UBO
>   i965: Define BRW_MAX_SSBO
>   i965: Reserve binding table space for SSBO surfaces
>   i965: Assert on the number of combined UBO and SSBO binding table
> entries
> 
>  src/mesa/drivers/dri/i965/brw_context.c  | 18 +-
>  src/mesa/drivers/dri/i965/brw_context.h  | 12 +++-
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  1 +
>  3 files changed, 21 insertions(+), 10 deletions(-)
> 
> 

Series is:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] nir/lower_io: Return has_indirect flag from get_io_offset().

2015-10-02 Thread Kenneth Graunke
On Friday, October 02, 2015 12:08:07 PM Jason Ekstrand wrote:
> On Thu, Oct 1, 2015 at 11:13 AM, Kenneth Graunke  
> wrote:
> > get_io_offset() already walks the dereference chain and discovers
> > whether or not we have an indirect; we can just return that rather than
> > computing it a second time.  This means moving the call a bit earlier.
> >
> > More importantly, I'm about to introduce special handling for the
> > outermost array index, which would require changing the dereference
> > we pass to deref_has_indirect().  Or we can just eliminate that.
> 
> I think I actually like what we did in nir_lower_samplers a bit
> better.  There we just made indirect a nir_ssa_def* and NULL means no
> indirect.  Since these passes all happen while we're in SSA form, that
> works pretty well.  Thoughts?

Yeah, I like that better.  I'll resend.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-02 Thread Emil Velikov
On 2 October 2015 at 19:59, Rob Clark  wrote:
> On Fri, Oct 2, 2015 at 2:41 PM, Emil Velikov  wrote:
>> On 2 October 2015 at 19:02, Rob Clark  wrote:
>>> On Fri, Oct 2, 2015 at 1:46 PM, Emil Velikov  
>>> wrote:
 On 2 October 2015 at 17:11, Rob Clark  wrote:
> On Fri, Oct 2, 2015 at 11:49 AM, Emil Velikov  
> wrote:
>> On 1 October 2015 at 20:44, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> Drop the idea of more ambitious re-arrangement if libs, but keep the
>>> pipe-loader refactoring.  With this at least drm_gralloc could still
>>> dlopen() gallium_dri.so and then use the pipe-loader API to figure
>>> out which pipe driver to load and hand back a screen.
>>>
>>> The nine st is not updated.. but I don't claim to understand the whole
>>> screen + sw-screen thing.  So I figured I'd let someone who knew what
>>> they were doing update nine.  Once that happens, we should change to
>>> not expose the dd_xyz fxns outside of pipe-loader, imho..
>>>
>> If the intent here is to consolidate/abstract things, this patch isn't
>> the way I'm afraid.
>>
>> Namely, it drops support for software only pipe-driver. It also
>> removes pipe-driver support for dri, xa and vl-based modules. All of
>> which used to work fine last time I've tested them (admittedly ~6
>> months ago).
>
> I assume you mean !GALLIUM_STATIC_TARGETS support?
>
 Precisely.

> I think we just need to build pipe-driver twice, once in each mode,
> and link either the static-pipe or dynamic-pipe version per state
> tracker depending on how you want things..  but wasn't sure the best
> way to go about that..
>
 I believe that won't work as the pipe-loader itself dlopens the
 pipe-driver (pipe_foo.so).
>>>
>>> I could be being dense, but why wouldn't that work properly if we
>>> built pipe-loader twice?  Ie. the non-GALLIUM_STATIC_TARGETS build
>>> should be built with the right CFLAGS so the code path that tries to
>>> open pipe_foo.so ends up in the binary
>>>
>>> (I do wonder a bit why the search-path gets passed in externally from
>>> pipe-loader, since all the state trackers using the
>>> non-GALLIUM_STATIC_TARGETS would presumably want to look in the same
>>> place for pipe_foo.so, but maybe that is a separate clean-up..)
>>>
>> I wouldn't say that you're dense - could be tired and/or my
>> explanation leaves something to be desired. Let me try it from another
>> angle.
>
> probably explanations are not clear on both sides, although probably
> mostly my fault for being a bit new/naive to the way the build is laid
> out..
>
>> If we build pipe-loader as is, it is part of the 'API' library - dri,
>> vdpau To work, it uses the separate/standalone pipe-drivers.
>>
>> Even if we 'build pipe-loader twice' this won't bring us anything as
>> the pipe-driver(s) will still be external module. And as soon as you
>> start hacking around dlopen you effectively rewrite the
>> target-helpers.
>
> if I'm understanding things properly, src/gallium/targets/* links
> directly against libpipe_loader.la currently..  what I'm proposing (I
> think) is we split libpipe_loader.la into libpipe_loader_static.la and
> libpipe_loader_dynamic.la (or, well, maybe those aren't the best names
> but I'll use them for now), which is what I meant by compile
> pipe-loader twice, and depending on whether st wants "mega" or not it
> pulls in either libpipe_loader_dynamic.la or
> libpipe_loader_static.la..
>
> I mean, that doesn't completely get rid of the per-target 'if
> HAVE_GALLIUM_STATIC_TARGETS' bit which statically pulls in the
> individual pipe drivers via $TARGET_LIB_DEPS..  (although maybe there
> is even a clever way to pull those in indirectly via
> libpipe_loader_static.la?)
>
I think I see what's happening here. I can bet that the
targets/pipe-loader naming has gotten you confused.

So let me try from (yet) another angle. Here is what we have today in
the default case:
DRI (static targets)
 - targets/dri, does _not_ link libpipe_loader, but libfoo.a
.- libfoo.a 'opens' the statically linked drivers/$bar.

Clover ('dynamic'/pipe-loader/pipe-driver targets)
 - targets/opencl, links against libpipe_loader.a (i.e. static)
 - libpipe_loader.a dlopens the separate/dynamic/external modules.
 - the latter are generated in targets/pipe-loader (note name should
really be pipe-drivers)

We cannot split libpipe_loader, as it only handles the 'dynamic'. If
you look at the makefile you'll see

if HAVE_GALLIUM_STATIC_TARGETS

x_SOURCES += target.c // aka libfoo.a above

else # HAVE_GALLIUM_STATIC_TARGETS

x_LIBADD += libpipe_loader.la

endif # HAVE_GALLIUM_STATIC_TARGETS


Hopefully this time 

[Mesa-dev] [PATCH v3 2/2] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

2015-10-02 Thread Kenneth Graunke
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number.  While GLSL makes
this look like a normal array, it can be very different behind the
scenes.

On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct.  This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.

NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset.  We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.

This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case.  They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.

v2: Rebase on earlier refactors.
v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_intrinsics.h |  1 +
 src/glsl/nir/nir_lower_io.c   | 55 ++---
 src/glsl/nir/nir_print.c  |  2 +
 src/mesa/drivers/dri/i965/brw_nir.c   | 13 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 58 +++
 5 files changed, 86 insertions(+), 43 deletions(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index ac4c2ba..263d8c1 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -228,6 +228,7 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
 LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
 LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
 
 /*
diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index 51a953a..a6bae94 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -63,8 +63,20 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
*size = location;
 }
 
+/**
+ * Returns true if we're processing a stage whose inputs are arrays indexed
+ * by a vertex number (such as geometry shader inputs).
+ */
+static bool
+stage_uses_per_vertex_inputs(struct lower_io_state *state)
+{
+   gl_shader_stage stage = state->builder.shader->stage;
+   return stage == MESA_SHADER_GEOMETRY;
+}
+
 static unsigned
 get_io_offset(nir_deref_var *deref, nir_instr *instr,
+  nir_ssa_def **vertex_index,
   nir_ssa_def **out_indirect,
   struct lower_io_state *state)
 {
@@ -75,6 +87,22 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
b->cursor = nir_before_instr(instr);
 
nir_deref *tail = >deref;
+
+   /* For per-vertex input arrays (i.e. geometry shader inputs), keep the
+* outermost array index separate.  Process the rest normally.
+*/
+   if (vertex_index != NULL) {
+  tail = tail->child;
+  assert(tail->deref_type == nir_deref_type_array);
+  nir_deref_array *deref_array = nir_deref_as_array(tail);
+
+  nir_ssa_def *vtx = nir_imm_int(b, deref_array->base_offset);
+  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
+ vtx = nir_iadd(b, vtx, nir_ssa_for_src(b, deref_array->indirect, 1));
+  }
+  *vertex_index = vtx;
+   }
+
while (tail->child != NULL) {
   const struct glsl_type *parent_type = tail->type;
   tail = tail->child;
@@ -107,13 +135,19 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
 }
 
 static nir_intrinsic_op
-load_op(nir_variable_mode mode, bool has_indirect)
+load_op(struct lower_io_state *state,
+nir_variable_mode mode, bool per_vertex, bool has_indirect)
 {
nir_intrinsic_op op;
switch (mode) {
case nir_var_shader_in:
-  op = has_indirect ? nir_intrinsic_load_input_indirect :
-  nir_intrinsic_load_input;
+  if (per_vertex) {
+ op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
+ nir_intrinsic_load_per_vertex_input;
+  } else {
+ op = has_indirect ? nir_intrinsic_load_input_indirect :
+ nir_intrinsic_load_input;
+  }
   break;
case nir_var_uniform:
   op = has_indirect ? nir_intrinsic_load_uniform_indirect :
@@ -150,14 +184,20 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (mode != nir_var_shader_in && mode != nir_var_uniform)
 continue;
 
+ bool per_vertex = stage_uses_per_vertex_inputs(state) &&
+   mode == nir_var_shader_in;
+
  nir_ssa_def *indirect;
+ nir_ssa_def *vertex_index;
 
  

[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #14 from Emil Velikov  ---
(In reply to Ilia Mirkin from comment #13)
> Er nevermind. Before it was malloc'd but now it's calloc'd.

The most creepy part is that with the extension disabled, we can never get to
the srgb configs, as GLColorspace will be LINEAR. Even if the user requests
one, we'll error out.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/3] i965/vec4: Always use NIR

2015-10-02 Thread Matt Turner
The series is

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] i965: always run the post-RA scheduler

2015-10-02 Thread Connor Abbott
Before, we would only do scheduling after register allocation if we
spilled, despite the fact that the pre-RA scheduler was only supposed to
be for register pressure and set the latencies of every instruction to
1. This meant that unless we spilled, which we rarely do, then we never
considered instruction latencies at all, and we usually never bothered
to try and hide texture fetch latency. Although a later commit removes
the setting the latency to 1 part, we still want to always run the
post-RA scheduler since it's able to take the false dependencies that
the register allocator creates into account, and it can be more
aggressive than the pre-RA scheduler since it doesn't have to worry
about register pressure at all.

XXX perf data

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b269ade..14a9fdf 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4981,8 +4981,7 @@ fs_visitor::allocate_registers()
if (failed)
   return;
 
-   if (!allocated_without_spills)
-  schedule_instructions(SCHEDULE_POST);
+   schedule_instructions(SCHEDULE_POST);
 
if (last_scratch > 0)
   prog_data->total_scratch = brw_get_scratch_size(last_scratch);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/6] i965: dump scheduling cycle estimates

2015-10-02 Thread Connor Abbott
On Fri, Oct 2, 2015 at 5:37 PM, Connor Abbott  wrote:
> The heuristic we're using is rather lame, since it assumes everything is
> non-uniform and loops execute 50 times, but it should be enough for
> measuring improvements in the scheduler that don't result in a change in
> the number of instructions.
>
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_cfg.h  |  4 
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 11 ++-
>  .../drivers/dri/i965/brw_schedule_instructions.cpp   | 20 
> 
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  9 +
>  4 files changed, 35 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
> b/src/mesa/drivers/dri/i965/brw_cfg.h
> index a094917..d0bdb00 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.h
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.h
> @@ -90,6 +90,8 @@ struct bblock_t {
> struct exec_list parents;
> struct exec_list children;
> int num;
> +
> +   unsigned cycle_count;
>  };
>
>  static inline struct backend_instruction *
> @@ -285,6 +287,8 @@ struct cfg_t {
> int num_blocks;
>
> bool idom_dirty;
> +
> +   unsigned cycle_count;
>  };
>
>  /* Note that this is implemented with a double for loop -- break will
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index 6f8b75e..9540012 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -2181,9 +2181,9 @@ fs_generator::generate_code(const cfg_t *cfg, int 
> dispatch_width)
>
> if (unlikely(debug_flag)) {
>fprintf(stderr, "Native code for %s\n"
> -  "SIMD%d shader: %d instructions. %d loops. %d:%d spills:fills. 
> Promoted %u constants. Compacted %d to %d"
> +  "SIMD%d shader: %d instructions. %u cycles. %d loops. %d:%d 
> spills:fills. Promoted %u constants. Compacted %d to %d"
>" bytes (%.0f%%)\n",
> -  shader_name, dispatch_width, before_size / 16, loop_count,
> +  shader_name, dispatch_width, before_size / 16, 
> cfg->cycle_count, loop_count,
>spill_count, fill_count, promoted_constants, before_size, 
> after_size,
>100.0f * (before_size - after_size) / before_size);
>
> @@ -2193,12 +2193,13 @@ fs_generator::generate_code(const cfg_t *cfg, int 
> dispatch_width)
> }
>
> compiler->shader_debug_log(log_data,
> -  "%s SIMD%d shader: %d inst, %d loops, "
> +  "%s SIMD%d shader: %d inst, %u cycles, %d 
> loops, "
>"%d:%d spills:fills, Promoted %u constants, "
>"compacted %d to %d bytes.\n",
>stage_abbrev, dispatch_width, before_size / 16,
> -  loop_count, spill_count, fill_count,
> -  promoted_constants, before_size, after_size);
> +  cfg->cycle_count, loop_count, spill_count,
> +  fill_count, promoted_constants, before_size,
> +  after_size);
>
> return start_offset;
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> index 1652261..22a493f 100644
> --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
> @@ -1467,6 +1467,24 @@ instruction_scheduler::schedule_instructions(bblock_t 
> *block)
> if (block->end()->opcode == BRW_OPCODE_NOP)
>block->end()->remove(block);
> assert(instructions_to_schedule == 0);
> +
> +   block->cycle_count = time;
> +}
> +
> +static unsigned get_cycle_count(cfg_t *cfg)
> +{
> +   unsigned count = 0, multiplier = 1;
> +   foreach_block(block, cfg) {
> +  if (block->start()->opcode == BRW_OPCODE_DO)
> + multiplier *= 50; /* assume that loops have ~50 instructions */

Whoops, this should say "assume that loops are run ~50 times"...

> +
> +  count += block->cycle_count * multiplier;
> +
> +  if (block->end()->opcode == BRW_OPCODE_WHILE)
> + multiplier /= 50;
> +   }
> +
> +   return count;
>  }
>
>  void
> @@ -1507,6 +1525,8 @@ instruction_scheduler::run(cfg_t *cfg)
>post_reg_alloc);
>bs->dump_instructions();
> }
> +
> +   cfg->cycle_count = get_cycle_count(cfg);
>  }
>
>  void
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index dcacc90..3010352 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1656,10 +1656,10 @@ vec4_generator::generate_code(const cfg_t *cfg)
>   fprintf(stderr, "Native code for 

[Mesa-dev] [PATCH v3 1/2] nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects.

2015-10-02 Thread Kenneth Graunke
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time via deref_has_indirect().  This means moving
the call a bit earlier.

By returning a nir_ssa_def *, we can pass back both an existence flag
(via NULL checking the pointer) and the value in one parameter.  It
also simplifies the code somewhat.  nir_lower_samplers works in a
similar fashion.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_lower_io.c | 62 +++--
 1 file changed, 20 insertions(+), 42 deletions(-)

this pach replaces 

nir/lower_io: Return has_indirect flag from get_io_offset().

diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index f32c09d..51a953a 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -63,25 +63,12 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
*size = location;
 }
 
-static bool
-deref_has_indirect(nir_deref_var *deref)
-{
-   for (nir_deref *tail = deref->deref.child; tail; tail = tail->child) {
-  if (tail->deref_type == nir_deref_type_array) {
- nir_deref_array *arr = nir_deref_as_array(tail);
- if (arr->deref_array_type == nir_deref_array_type_indirect)
-return true;
-  }
-   }
-
-   return false;
-}
-
 static unsigned
-get_io_offset(nir_deref_var *deref, nir_instr *instr, nir_src *indirect,
+get_io_offset(nir_deref_var *deref, nir_instr *instr,
+  nir_ssa_def **out_indirect,
   struct lower_io_state *state)
 {
-   bool found_indirect = false;
+   nir_ssa_def *indirect = NULL;
unsigned base_offset = 0;
 
nir_builder *b = >builder;
@@ -103,14 +90,7 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, 
nir_src *indirect,
nir_imul(b, nir_imm_int(b, size),
 nir_ssa_for_src(b, deref_array->indirect, 1));
 
-if (found_indirect) {
-   indirect->ssa =
-  nir_iadd(b, nir_ssa_for_src(b, *indirect, 1), mul);
-} else {
-   indirect->ssa = mul;
-}
-indirect->is_ssa = true;
-found_indirect = true;
+indirect = indirect ? nir_iadd(b, indirect, mul) : mul;
  }
   } else if (tail->deref_type == nir_deref_type_struct) {
  nir_deref_struct *deref_struct = nir_deref_as_struct(tail);
@@ -122,6 +102,7 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, 
nir_src *indirect,
   }
}
 
+   *out_indirect = indirect;
return base_offset;
 }
 
@@ -169,17 +150,16 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (mode != nir_var_shader_in && mode != nir_var_uniform)
 continue;
 
- bool has_indirect = deref_has_indirect(intrin->variables[0]);
+ nir_ssa_def *indirect;
+
+ unsigned offset = get_io_offset(intrin->variables[0], >instr,
+ , state);
 
  nir_intrinsic_instr *load =
 nir_intrinsic_instr_create(state->mem_ctx,
-   load_op(mode, has_indirect));
+   load_op(mode, indirect));
  load->num_components = intrin->num_components;
 
- nir_src indirect;
- unsigned offset = get_io_offset(intrin->variables[0],
- >instr, , state);
-
  unsigned location = intrin->variables[0]->var->data.driver_location;
  if (mode == nir_var_uniform) {
 load->const_index[0] = location;
@@ -188,8 +168,8 @@ nir_lower_io_block(nir_block *block, void *void_state)
 load->const_index[0] = location + offset;
  }
 
- if (has_indirect)
-load->src[0] = indirect;
+ if (indirect)
+load->src[0] = nir_src_for_ssa(indirect);
 
  if (intrin->dest.is_ssa) {
 nir_ssa_dest_init(>instr, >dest,
@@ -209,10 +189,14 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (intrin->variables[0]->var->data.mode != nir_var_shader_out)
 continue;
 
- bool has_indirect = deref_has_indirect(intrin->variables[0]);
+ nir_ssa_def *indirect;
+
+ unsigned offset = get_io_offset(intrin->variables[0], >instr,
+ , state);
+ offset += intrin->variables[0]->var->data.driver_location;
 
  nir_intrinsic_op store_op;
- if (has_indirect) {
+ if (indirect) {
 store_op = nir_intrinsic_store_output_indirect;
  } else {
 store_op = nir_intrinsic_store_output;
@@ -221,18 +205,12 @@ nir_lower_io_block(nir_block *block, void *void_state)
  nir_intrinsic_instr *store = 
nir_intrinsic_instr_create(state->mem_ctx,
  store_op);
  

[Mesa-dev] [PATCH 1/6] i965: fix cycle estimates when there's a pipeline stall

2015-10-02 Thread Connor Abbott
The issue time for an instruction is how many cycles it takes to
actually put it into the pipeline. If there's a pipeline stall that
causes the instruction to be delayed, we should first take that into
account to figure out when the instruction would start executing and
*then* add the issue time. The old code had it backwards, and so we
would underestimate the total time whenever we thought there would be a
pipeline stall by up to the issue time of the instruction.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 4e43e5c..76d58e2 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -1405,18 +1405,19 @@ instruction_scheduler::schedule_instructions(bblock_t 
*block)
   instructions_to_schedule--;
   update_register_pressure(chosen->inst);
 
+  /* If we expected a delay for scheduling, then bump the clock to reflect
+   * that.  In reality, the hardware will switch to another hyperthread
+   * and may not return to dispatching our thread for a while even after
+   * we're unblocked.  After this, we have the time when the chosen
+   * instruction will start executing.
+   */
+  time = MAX2(time, chosen->unblocked_time);
+
   /* Update the clock for how soon an instruction could start after the
* chosen one.
*/
   time += issue_time(chosen->inst);
 
-  /* If we expected a delay for scheduling, then bump the clock to reflect
-   * that as well.  In reality, the hardware will switch to another
-   * hyperthread and may not return to dispatching our thread for a while
-   * even after we're unblocked.
-   */
-  time = MAX2(time, chosen->unblocked_time);
-
   if (debug) {
  fprintf(stderr, "clock %4d, scheduled: ", time);
  bs->dump_instruction(chosen->inst);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] i965: dump scheduling cycle estimates

2015-10-02 Thread Connor Abbott
The heuristic we're using is rather lame, since it assumes everything is
non-uniform and loops execute 50 times, but it should be enough for
measuring improvements in the scheduler that don't result in a change in
the number of instructions.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_cfg.h  |  4 
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 11 ++-
 .../drivers/dri/i965/brw_schedule_instructions.cpp   | 20 
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  9 +
 4 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h 
b/src/mesa/drivers/dri/i965/brw_cfg.h
index a094917..d0bdb00 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -90,6 +90,8 @@ struct bblock_t {
struct exec_list parents;
struct exec_list children;
int num;
+
+   unsigned cycle_count;
 };
 
 static inline struct backend_instruction *
@@ -285,6 +287,8 @@ struct cfg_t {
int num_blocks;
 
bool idom_dirty;
+
+   unsigned cycle_count;
 };
 
 /* Note that this is implemented with a double for loop -- break will
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 6f8b75e..9540012 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -2181,9 +2181,9 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 
if (unlikely(debug_flag)) {
   fprintf(stderr, "Native code for %s\n"
-  "SIMD%d shader: %d instructions. %d loops. %d:%d spills:fills. 
Promoted %u constants. Compacted %d to %d"
+  "SIMD%d shader: %d instructions. %u cycles. %d loops. %d:%d 
spills:fills. Promoted %u constants. Compacted %d to %d"
   " bytes (%.0f%%)\n",
-  shader_name, dispatch_width, before_size / 16, loop_count,
+  shader_name, dispatch_width, before_size / 16, cfg->cycle_count, 
loop_count,
   spill_count, fill_count, promoted_constants, before_size, 
after_size,
   100.0f * (before_size - after_size) / before_size);
 
@@ -2193,12 +2193,13 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
}
 
compiler->shader_debug_log(log_data,
-  "%s SIMD%d shader: %d inst, %d loops, "
+  "%s SIMD%d shader: %d inst, %u cycles, %d loops, 
"
   "%d:%d spills:fills, Promoted %u constants, "
   "compacted %d to %d bytes.\n",
   stage_abbrev, dispatch_width, before_size / 16,
-  loop_count, spill_count, fill_count,
-  promoted_constants, before_size, after_size);
+  cfg->cycle_count, loop_count, spill_count,
+  fill_count, promoted_constants, before_size,
+  after_size);
 
return start_offset;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 1652261..22a493f 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -1467,6 +1467,24 @@ instruction_scheduler::schedule_instructions(bblock_t 
*block)
if (block->end()->opcode == BRW_OPCODE_NOP)
   block->end()->remove(block);
assert(instructions_to_schedule == 0);
+
+   block->cycle_count = time;
+}
+
+static unsigned get_cycle_count(cfg_t *cfg)
+{
+   unsigned count = 0, multiplier = 1;
+   foreach_block(block, cfg) {
+  if (block->start()->opcode == BRW_OPCODE_DO)
+ multiplier *= 50; /* assume that loops have ~50 instructions */
+
+  count += block->cycle_count * multiplier;
+
+  if (block->end()->opcode == BRW_OPCODE_WHILE)
+ multiplier /= 50;
+   }
+
+   return count;
 }
 
 void
@@ -1507,6 +1525,8 @@ instruction_scheduler::run(cfg_t *cfg)
   post_reg_alloc);
   bs->dump_instructions();
}
+
+   cfg->cycle_count = get_cycle_count(cfg);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index dcacc90..3010352 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1656,10 +1656,10 @@ vec4_generator::generate_code(const cfg_t *cfg)
  fprintf(stderr, "Native code for %s program %d:\n", stage_name,
  prog->Id);
   }
-  fprintf(stderr, "%s vec4 shader: %d instructions. %d loops. Compacted %d 
to %d"
+  fprintf(stderr, "%s vec4 shader: %d instructions. %u cycles. %d loops. 
Compacted %d to %d"
   " bytes (%.0f%%)\n",
   stage_abbrev,
-  before_size / 16, loop_count, 

[Mesa-dev] [PATCH 2/6] i965/sched: write-after-read dependencies are free

2015-10-02 Thread Connor Abbott
Although write-after-write dependencies have the same latency as
read-after-write dependencies due to how the register scoreboard works,
write-after-read dependencies aren't checked by the EU at all, so
they're purely a constraint on how the scheduler can order the
instructions.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 76d58e2..1652261 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -927,10 +927,10 @@ fs_instruction_scheduler::calculate_deps()
  if (inst->src[i].file == GRF) {
 if (post_reg_alloc) {
for (int r = 0; r < inst->regs_read(i); r++)
-  add_dep(n, last_grf_write[inst->src[i].reg + r]);
+  add_dep(n, last_grf_write[inst->src[i].reg + r], 0);
 } else {
for (int r = 0; r < inst->regs_read(i); r++) {
-  add_dep(n, last_grf_write[inst->src[i].reg * 16 + 
inst->src[i].reg_offset + r]);
+  add_dep(n, last_grf_write[inst->src[i].reg * 16 + 
inst->src[i].reg_offset + r], 0);
}
 }
  } else if (inst->src[i].file == HW_REG &&
@@ -941,9 +941,9 @@ fs_instruction_scheduler::calculate_deps()
if (inst->src[i].fixed_hw_reg.vstride == BRW_VERTICAL_STRIDE_0)
   size = 1;
for (int r = 0; r < size; r++)
-  add_dep(n, last_grf_write[inst->src[i].fixed_hw_reg.nr + r]);
+  add_dep(n, last_grf_write[inst->src[i].fixed_hw_reg.nr + r], 
0);
 } else {
-   add_dep(n, last_fixed_grf_write);
+   add_dep(n, last_fixed_grf_write, 0);
 }
  } else if (inst->src[i].is_accumulator()) {
 add_dep(n, last_accumulator_write);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] i965/sched: use liveness analysis for computing register pressure

2015-10-02 Thread Connor Abbott
Previously, we were using some heuristics to try and detect when a write
was about to begin a live range, or when a read was about to end a live
range. We never used the liveness analysis information used by the
register allocator, though, which meant that the scheduler's and the
allocator's ideas of when a live range began and ended were different.
Not only did this make our estimate of the register pressure benefit of
scheduling an instruction wrong in some cases, but it was preventing us
from knowing the actual register pressure when scheduling each
instruction, which we want to have in order to switch to register
pressure scheduling only when the register pressure is too high.

This commit rewrites the register pressure tracking code to use the same
model as our register allocator currently uses. We use the results of
liveness analysis, as well as the compute_payload_ranges() function that
we split out in the last commit. This means that we compute live ranges
twice on each round through the register allocator, although we could
speed it up by only recomputing the ranges and not the live in/live out
sets after scheduling, since we only shuffle around instructions within
a single basic block when we schedule.

Shader-db results on bdw:

total instructions in shared programs: 7130187 -> 7129880 (-0.00%)
instructions in affected programs: 1744 -> 1437 (-17.60%)
helped: 1
HURT: 1

total cycles in shared programs: 172535126 -> 172473226 (-0.04%)
cycles in affected programs: 11338636 -> 11276736 (-0.55%)
helped: 876
HURT: 873

LOST:   8
GAINED: 0
Signed-off-by: Connor Abbott 
---
The results are a wash, but this is needed for a lot of the more
experimental things I want to do. I can drop this if there are any
objections.

 .../drivers/dri/i965/brw_schedule_instructions.cpp | 300 +
 1 file changed, 244 insertions(+), 56 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp 
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 22a493f..6b8792b 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -26,6 +26,7 @@
  */
 
 #include "brw_fs.h"
+#include "brw_fs_live_variables.h"
 #include "brw_vec4.h"
 #include "brw_cfg.h"
 #include "brw_shader.h"
@@ -400,22 +401,49 @@ schedule_node::set_latency_gen7(bool is_haswell)
 class instruction_scheduler {
 public:
instruction_scheduler(backend_shader *s, int grf_count,
+ int hw_reg_count, int block_count,
  instruction_scheduler_mode mode)
{
   this->bs = s;
   this->mem_ctx = ralloc_context(NULL);
   this->grf_count = grf_count;
+  this->hw_reg_count = hw_reg_count;
   this->instructions.make_empty();
   this->instructions_to_schedule = 0;
   this->post_reg_alloc = (mode == SCHEDULE_POST);
   this->mode = mode;
   this->time = 0;
   if (!post_reg_alloc) {
- this->remaining_grf_uses = rzalloc_array(mem_ctx, int, grf_count);
- this->grf_active = rzalloc_array(mem_ctx, bool, grf_count);
+ this->reg_pressure_in = rzalloc_array(mem_ctx, int, block_count);
+
+ this->livein = ralloc_array(mem_ctx, BITSET_WORD *, block_count);
+ for (int i = 0; i < block_count; i++)
+this->livein[i] = rzalloc_array(mem_ctx, BITSET_WORD,
+BITSET_WORDS(grf_count));
+
+ this->liveout = ralloc_array(mem_ctx, BITSET_WORD *, block_count);
+ for (int i = 0; i < block_count; i++)
+this->liveout[i] = rzalloc_array(mem_ctx, BITSET_WORD,
+ BITSET_WORDS(grf_count));
+
+ this->hw_liveout = ralloc_array(mem_ctx, BITSET_WORD *, block_count);
+ for (int i = 0; i < block_count; i++)
+this->hw_liveout[i] = rzalloc_array(mem_ctx, BITSET_WORD,
+BITSET_WORDS(hw_reg_count));
+
+ this->written = rzalloc_array(mem_ctx, bool, grf_count);
+
+ this->reads_remaining = rzalloc_array(mem_ctx, int, grf_count);
+
+ this->hw_reads_remaining = rzalloc_array(mem_ctx, int, hw_reg_count);
   } else {
- this->remaining_grf_uses = NULL;
- this->grf_active = NULL;
+ this->reg_pressure_in = NULL;
+ this->livein = NULL;
+ this->liveout = NULL;
+ this->hw_liveout = NULL;
+ this->written = NULL;
+ this->reads_remaining = NULL;
+ this->hw_reads_remaining = NULL;
   }
}
 
@@ -442,7 +470,8 @@ public:
 */
virtual int issue_time(backend_instruction *inst) = 0;
 
-   virtual void count_remaining_grf_uses(backend_instruction *inst) = 0;
+   virtual void count_reads_remaining(backend_instruction *inst) = 0;
+   virtual void setup_liveness(cfg_t *cfg) = 0;
virtual void update_register_pressure(backend_instruction *inst) = 0;
virtual 

[Mesa-dev] [PATCH shader-db 1/2] report.py: rework and update for cycle info

2015-10-02 Thread Connor Abbott
Now that we have three separate things we want to measure (instructions,
cycles, and loops), it's impractical to keep adding special code for
changes in each thing. Instead, for each program in before and after we
store a table of measurement -> value, and when reporting we loop over
each measurement and report helped/hurt before reporting the gained/lost
programs.

Signed-off-by: Connor Abbott 
---
 report.py | 140 ++
 1 file changed, 67 insertions(+), 73 deletions(-)

diff --git a/report.py b/report.py
index 4c06714..bc3a640 100755
--- a/report.py
+++ b/report.py
@@ -10,17 +10,22 @@ def get_results(filename):
 
 results = {}
 
-re_match = re.compile(r"(\S+) - (.S \S+) shader: (\S*) inst, (\S*) loops")
+re_match = re.compile(r"(\S+) - (.S \S+) shader: (\S*) inst, (\S*) cycles, 
(\S*) loops")
 for line in lines:
 match = re.search(re_match, line)
 if match is None:
 continue
 
 groups = match.groups()
-count = int(groups[2])
-loop = int(groups[3])
-if count != 0:
-results[(groups[0], groups[1])] = count, loop
+inst_count = int(groups[2])
+   cycle_count = int(groups[3])
+loop_count = int(groups[4])
+if inst_count != 0:
+results[(groups[0], groups[1])] = {
+"instructions": inst_count,
+"cycles": cycle_count,
+   "loops": loop_count
+}
 
 return results
 
@@ -50,76 +55,77 @@ def main():
 parser.add_argument("after", type=get_results, help="the output of the new 
code")
 args = parser.parse_args()
 
-total_before = 0
-total_after = 0
-total_before_loop = 0
-total_after_loop = 0
-affected_before = 0
-affected_after = 0
+for measurement in ["instructions", "cycles", "loops"]:
+total_before = 0
+total_after = 0
+affected_before = 0
+affected_after = 0
 
-helped = []
-hurt = []
-lost = []
-gained = []
-loop_change = []
-for p in args.before:
-(name, type) = p
-namestr = name + " " + type
-before_count = args.before[p][0]
-before_loop = args.before[p][1]
+helped = []
+hurt = []
+for p in args.before:
+before_count = args.before[p][measurement]
+
+if args.after.get(p) is None:
+continue
 
-if args.after.get(p) is not None:
-after_count = args.after[p][0]
-after_loop = args.after[p][1]
+# If the number of loops changed, then we may have unrolled some
+# loops, in which case other measurements will be misleading.
+if measurement != "loops" and args.before[p]["loops"] != 
args.after[p]["loops"]:
+continue
 
-total_before_loop += before_loop
-total_after_loop += after_loop
+after_count = args.after[p][measurement]
 
-if before_loop == after_loop:
-total_before += before_count
-total_after += after_count
+total_before += before_count
+total_after += after_count
 
 if before_count != after_count:
 affected_before += before_count
 affected_after += after_count
 
-if after_loop != before_loop:
-loop_change.append(p);
-elif after_count > before_count:
+if after_count > before_count:
 hurt.append(p)
 else:
 helped.append(p)
-else:
-lost.append(namestr)
 
-for p in args.after:
-if args.before.get(p) is None:
-gained.append(p[0] + " " + p[1])
+helped.sort(
+key=lambda k: float(args.before[k][measurement] - 
args.after[k][measurement]) / args.before[k][measurement])
+for p in helped:
+namestr = p[0] + " " + p[1]
+print(measurement + " helped:   " + get_result_string(
+namestr, args.before[p][measurement], 
args.after[p][measurement]))
+if len(helped) > 0:
+print("")
+
+hurt.sort(
+key=lambda k: float(args.after[k][measurement] - 
args.before[k][measurement]) / args.before[k][measurement])
+for p in hurt:
+namestr = p[0] + " " + p[1]
+print(measurement + " HURT:   " + get_result_string(
+namestr, args.before[p][measurement], 
args.after[p][measurement]))
+if len(hurt) > 0:
+print("")
+
+print("total {0} in shared programs: {1}\n"
+ "{0} in affected programs: {2}\n"
+ "helped: {3}\n"
+ "HURT: {4}\n".format(
+ measurement,
+ change(total_before, total_after),
+ change(affected_before, affected_after),
+ len(helped),
+ 

[Mesa-dev] [PATCH 0/6] i965 scheduling improvements

2015-10-02 Thread Connor Abbott
Here's a series to fix some things I found with i965's instruction
scheduler. It also makes us print out an estimate of the cycle count,
which the shader-db patches then make use of to enable us to better
measure the impact of scheduler changes. Perhaps the most important
patch is patch 3, which fixes an embarrasing mistake which prevented us
from actually trying to hide latency most of the time. Patches 1-2 fix
some minor issues with keeping track of cycle counts, and patches 5-6
are a little more experimental.

Performance numbers for patch 3 are needed. Unfortunately, patch 3 also
fixes the cycle counts that patch 4 outputs, so we can't get a good
sense of how it's changing things without actually running benchmarks.

The series is also available at:

git://people.freedesktop.org/~cwabbott0/mesa i965-sched-conservative

There's also an i965-sched branch with some even-more-experimental patches,
which range from "this is a good idea but it hurts some stuff" to "I'm not
sure about this at all." I have some ideas about how to improve them so that
they don't regress as much, but I want to get this series out of the way
in the meantime.

Connor Abbott (6):
  i965: fix cycle estimates when there's a pipeline stall
  i965/sched: write-after-read dependencies are free
  i965: always run the post-RA scheduler
  i965: dump scheduling cycle estimates
  i965/fs: split out calculation of payload live ranges
  i965/sched: use liveness analysis for computing register pressure

 src/mesa/drivers/dri/i965/brw_cfg.h|   4 +
 src/mesa/drivers/dri/i965/brw_fs.cpp   |   3 +-
 src/mesa/drivers/dri/i965/brw_fs.h |   2 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  11 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  51 +--
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 343 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |   9 +-
 7 files changed, 323 insertions(+), 100 deletions(-)

-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] i965/fs: split out calculation of payload live ranges

2015-10-02 Thread Connor Abbott
We'll need this for the scheduler too, since it wants to know when the
live ranges of payload registers end in order to model them in our
register pressure calculations.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs.h|  2 +
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 51 +--
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index a8b6726..160b09b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -141,6 +141,8 @@ public:
void assign_vs_urb_setup();
bool assign_regs(bool allow_spilling);
void assign_regs_trivial();
+   void calculate_payload_ranges(int payload_node_count,
+ int *payload_last_use_ip);
void setup_payload_interference(struct ra_graph *g, int payload_reg_count,
int first_payload_node);
int choose_spill_reg(struct ra_graph *g);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index 6900cee..3f00479 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -332,32 +332,12 @@ count_to_loop_end(const bblock_t *block)
unreachable("not reached");
 }
 
-/**
- * Sets up interference between thread payload registers and the virtual GRFs
- * to be allocated for program temporaries.
- *
- * We want to be able to reallocate the payload for our virtual GRFs, notably
- * because the setup coefficients for a full set of 16 FS inputs takes up 8 of
- * our 128 registers.
- *
- * The layout of the payload registers is:
- *
- * 0..payload.num_regs-1: fixed function setup (including bary coordinates).
- * payload.num_regs..payload.num_regs+curb_read_lengh-1: uniform data
- * payload.num_regs+curb_read_lengh..first_non_payload_grf-1: setup 
coefficients.
- *
- * And we have payload_node_count nodes covering these registers in order
- * (note that in SIMD16, a node is two registers).
- */
-void
-fs_visitor::setup_payload_interference(struct ra_graph *g,
-   int payload_node_count,
-   int first_payload_node)
+void fs_visitor::calculate_payload_ranges(int payload_node_count,
+  int *payload_last_use_ip)
 {
int loop_depth = 0;
int loop_end_ip = 0;
 
-   int payload_last_use_ip[payload_node_count];
for (int i = 0; i < payload_node_count; i++)
   payload_last_use_ip[i] = -1;
 
@@ -428,6 +408,33 @@ fs_visitor::setup_payload_interference(struct ra_graph *g,
 
   ip++;
}
+}
+
+
+/**
+ * Sets up interference between thread payload registers and the virtual GRFs
+ * to be allocated for program temporaries.
+ *
+ * We want to be able to reallocate the payload for our virtual GRFs, notably
+ * because the setup coefficients for a full set of 16 FS inputs takes up 8 of
+ * our 128 registers.
+ *
+ * The layout of the payload registers is:
+ *
+ * 0..payload.num_regs-1: fixed function setup (including bary coordinates).
+ * payload.num_regs..payload.num_regs+curb_read_lengh-1: uniform data
+ * payload.num_regs+curb_read_lengh..first_non_payload_grf-1: setup 
coefficients.
+ *
+ * And we have payload_node_count nodes covering these registers in order
+ * (note that in SIMD16, a node is two registers).
+ */
+void
+fs_visitor::setup_payload_interference(struct ra_graph *g,
+   int payload_node_count,
+   int first_payload_node)
+{
+   int payload_last_use_ip[payload_node_count];
+   calculate_payload_ranges(payload_node_count, payload_last_use_ip);
 
for (int i = 0; i < payload_node_count; i++) {
   if (payload_last_use_ip[i] == -1)
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH shader-db 2/2] report.py: add an option for which measurements to report

2015-10-02 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 report.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/report.py b/report.py
index bc3a640..f9445dd 100755
--- a/report.py
+++ b/report.py
@@ -48,14 +48,19 @@ def get_result_string(p, b, a):
 p = p + ' '
 return p + change(b, a)
 
+def split_list(string):
+return string.split(",")
 
 def main():
 parser = argparse.ArgumentParser()
+parser.add_argument("--measurements", "-m", type=split_list,
+default=["instructions", "cycles", "loops"],
+help="comma-separated list of measurements to report")
 parser.add_argument("before", type=get_results, help="the output of the 
original code")
 parser.add_argument("after", type=get_results, help="the output of the new 
code")
 args = parser.parse_args()
 
-for measurement in ["instructions", "cycles", "loops"]:
+for measurement in args.measurements:
 total_before = 0
 total_after = 0
 affected_before = 0
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #15 from Emil Velikov  ---
(In reply to Chih-Wei Huang from comment #11)
> Mesa still doesn't support PIXEL_FORMAT_RGBA_?

It should, depending on how exactly you define the formats. I've looked a while
back for some android documentation but I came short. Have you seen any ?

Upon closer look the mappings in drm_format_from_hal and get_pipe_format look
very funny.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-02 Thread Ilia Mirkin
On Fri, Oct 2, 2015 at 4:47 PM, Ian Romanick  wrote:
> On 10/01/2015 12:15 PM, Ilia Mirkin wrote:
>> On Thu, Oct 1, 2015 at 3:12 PM, Ian Romanick  wrote:
>>> I'm just
>>> wondering because Mesa doesn't support that extension.  How is this even
>>> being hit?
>>
>> See 81d2fd91a90 (mesa: add NV_read_{depth,stencil,depth_stencil} extensions)
>
> Okay, that's weird.  I must have had some old branch checked out at the
> time.  I did 'grep -r NV_read_depth src/', and it came back empty.  Now
> I just get to be irritated that we enabled THREE extensions for which we
> have ZERO tests... and least one is clearly completely broken. :(
>
> I guess now I at least have something concrete to point to then next
> time I object to enabling an ES extension that "just" allows some
> desktop functionality. ;)

I believe Rob tested at least some of it with qapitrace[1], as
otherwise there was no way to get access to the data in a
renderbuffer, which can be quite useful for debugging. Not all of us
have your level of familiarity with how GL works, but how will we
learn without making some mistakes? :)

No matter how many tests we might have, they'll always leave
*something* out. The fact that there are no tests at all for this ext
isn't great, of course. But there are also no functional tests for
{ARB,AMD}_conservative_depth and probably a number of others.

  -ilia

1. 
https://github.com/apitrace/apitrace/commit/7ad1d34c97cb0d2f07a381d7664f1ba0f1be410b.patch
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 2/2] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

2015-10-02 Thread Jason Ekstrand
I only scanned through then briefly, but they look good to me.
Assuming Jenkins agrees,

Reviewed-by: Jason Ekstrand 

Let's actually push it this time. :-)

On Fri, Oct 2, 2015 at 2:52 PM, Kenneth Graunke  wrote:
> Geometry and tessellation shaders process multiple vertices; their
> inputs are arrays indexed by the vertex number.  While GLSL makes
> this look like a normal array, it can be very different behind the
> scenes.
>
> On Intel hardware, all inputs for a particular vertex are stored
> together - as if they were grouped into a single struct.  This means
> that consecutive elements of these top-level arrays are not contiguous.
> In fact, they may sometimes be in completely disjoint memory segments.
>
> NIR's existing load_input intrinsics are awkward for this case, as they
> distill everything down to a single offset.  We'd much rather keep the
> vertex ID separate, but build up an offset as normal beyond that.
>
> This patch introduces new nir_intrinsic_load_per_vertex_input
> intrinsics to handle this case.  They work like ordinary load_input
> intrinsics, but have an extra source (src[0]) which represents the
> outermost array index.
>
> v2: Rebase on earlier refactors.
> v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/nir/nir_intrinsics.h |  1 +
>  src/glsl/nir/nir_lower_io.c   | 55 ++---
>  src/glsl/nir/nir_print.c  |  2 +
>  src/mesa/drivers/dri/i965/brw_nir.c   | 13 +-
>  src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 58 
> +++
>  5 files changed, 86 insertions(+), 43 deletions(-)
>
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index ac4c2ba..263d8c1 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -228,6 +228,7 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
>  LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>  LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
>  LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> +LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
> NIR_INTRINSIC_CAN_REORDER)
>  LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
>
>  /*
> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> index 51a953a..a6bae94 100644
> --- a/src/glsl/nir/nir_lower_io.c
> +++ b/src/glsl/nir/nir_lower_io.c
> @@ -63,8 +63,20 @@ nir_assign_var_locations(struct exec_list *var_list, 
> unsigned *size,
> *size = location;
>  }
>
> +/**
> + * Returns true if we're processing a stage whose inputs are arrays indexed
> + * by a vertex number (such as geometry shader inputs).
> + */
> +static bool
> +stage_uses_per_vertex_inputs(struct lower_io_state *state)
> +{
> +   gl_shader_stage stage = state->builder.shader->stage;
> +   return stage == MESA_SHADER_GEOMETRY;
> +}
> +
>  static unsigned
>  get_io_offset(nir_deref_var *deref, nir_instr *instr,
> +  nir_ssa_def **vertex_index,
>nir_ssa_def **out_indirect,
>struct lower_io_state *state)
>  {
> @@ -75,6 +87,22 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
> b->cursor = nir_before_instr(instr);
>
> nir_deref *tail = >deref;
> +
> +   /* For per-vertex input arrays (i.e. geometry shader inputs), keep the
> +* outermost array index separate.  Process the rest normally.
> +*/
> +   if (vertex_index != NULL) {
> +  tail = tail->child;
> +  assert(tail->deref_type == nir_deref_type_array);
> +  nir_deref_array *deref_array = nir_deref_as_array(tail);
> +
> +  nir_ssa_def *vtx = nir_imm_int(b, deref_array->base_offset);
> +  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
> + vtx = nir_iadd(b, vtx, nir_ssa_for_src(b, deref_array->indirect, 
> 1));
> +  }
> +  *vertex_index = vtx;
> +   }
> +
> while (tail->child != NULL) {
>const struct glsl_type *parent_type = tail->type;
>tail = tail->child;
> @@ -107,13 +135,19 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
>  }
>
>  static nir_intrinsic_op
> -load_op(nir_variable_mode mode, bool has_indirect)
> +load_op(struct lower_io_state *state,
> +nir_variable_mode mode, bool per_vertex, bool has_indirect)
>  {
> nir_intrinsic_op op;
> switch (mode) {
> case nir_var_shader_in:
> -  op = has_indirect ? nir_intrinsic_load_input_indirect :
> -  nir_intrinsic_load_input;
> +  if (per_vertex) {
> + op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
> + nir_intrinsic_load_per_vertex_input;
> +  } else {
> + op = has_indirect ? nir_intrinsic_load_input_indirect :
> + nir_intrinsic_load_input;
> +  }
> 

[Mesa-dev] [PATCH 3/3] i965: Remove shader_prog from vec4_gs_visitor.

2015-10-02 Thread Kenneth Graunke
Unfortunately it has to stay in gen6_gs_visitor.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 ++
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   | 3 ---
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   | 9 +++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 74ef728..f6967a7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -37,7 +37,6 @@ namespace brw {
 vec4_gs_visitor::vec4_gs_visitor(const struct brw_compiler *compiler,
  void *log_data,
  struct brw_gs_compile *c,
- struct gl_shader_program *prog,
  nir_shader *shader,
  void *mem_ctx,
  bool no_spills,
@@ -45,7 +44,6 @@ vec4_gs_visitor::vec4_gs_visitor(const struct brw_compiler 
*compiler,
: vec4_visitor(compiler, log_data, >key.tex,
   >prog_data.base, shader,  mem_ctx,
   no_spills, shader_time_index),
- shader_prog(prog),
  c(c)
 {
 }
@@ -641,7 +639,7 @@ brw_gs_emit(struct brw_context *brw,
  c->prog_data.base.dispatch_mode = DISPATCH_MODE_4X2_DUAL_OBJECT;
 
  vec4_gs_visitor v(brw->intelScreen->compiler, brw,
-   c, prog, shader->Program->nir,
+   c, shader->Program->nir,
mem_ctx, true /* no_spills */, st_index);
  if (v.run()) {
 return generate_assembly(brw, prog, >gp->program.Base,
@@ -684,7 +682,7 @@ brw_gs_emit(struct brw_context *brw,
 
if (brw->gen >= 7)
   gs = new vec4_gs_visitor(brw->intelScreen->compiler, brw,
-   c, prog, shader->Program->nir,
+   c, shader->Program->nir,
mem_ctx, false /* no_spills */,
st_index);
else
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
index 85d80b8..da93f0d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
@@ -70,7 +70,6 @@ public:
vec4_gs_visitor(const struct brw_compiler *compiler,
void *log_data,
struct brw_gs_compile *c,
-   struct gl_shader_program *prog,
nir_shader *shader,
void *mem_ctx,
bool no_spills,
@@ -97,8 +96,6 @@ protected:
void emit_control_data_bits();
void set_stream_control_data_bits(unsigned stream_id);
 
-   struct gl_shader_program *shader_prog;
-
src_reg vertex_count;
src_reg control_data_bits;
const struct brw_gs_compile * const c;
diff --git a/src/mesa/drivers/dri/i965/gen6_gs_visitor.h 
b/src/mesa/drivers/dri/i965/gen6_gs_visitor.h
index 41c6d18..e75d6aa 100644
--- a/src/mesa/drivers/dri/i965/gen6_gs_visitor.h
+++ b/src/mesa/drivers/dri/i965/gen6_gs_visitor.h
@@ -43,8 +43,11 @@ public:
void *mem_ctx,
bool no_spills,
int shader_time_index) :
-  vec4_gs_visitor(comp, log_data, c, prog, shader, mem_ctx, no_spills,
-  shader_time_index) {}
+  vec4_gs_visitor(comp, log_data, c, shader, mem_ctx, no_spills,
+  shader_time_index),
+  shader_prog(prog)
+  {
+  }
 
 protected:
virtual void emit_prolog();
@@ -64,6 +67,8 @@ private:
void xfb_setup();
int get_vertex_output_offset_for_varying(int vertex, int varying);
 
+   const struct gl_shader_program *shader_prog;
+
src_reg vertex_output;
src_reg vertex_output_offset;
src_reg temp;
-- 
2.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] nir: Add a nir_shader_info::has_transform_feedback_varyings flag.

2015-10-02 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/glsl_to_nir.cpp | 2 ++
 src/glsl/nir/nir.h   | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 4dd6287..efaa73e 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -164,6 +164,8 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
shader->info.separate_shader = shader_prog->SeparateShader;
shader->info.gs.vertices_out = sh->Geom.VerticesOut;
shader->info.gs.invocations = sh->Geom.Invocations;
+   shader->info.has_transform_feedback_varyings =
+  shader_prog->TransformFeedback.NumVarying > 0;
 
return shader;
 }
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index c83ef50..a829e2c 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1484,6 +1484,9 @@ typedef struct nir_shader_info {
/* Whether or not separate shader objects were used */
bool separate_shader;
 
+   /** Was this shader linked with any transform feedback varyings? */
+   bool has_transform_feedback_varyings;
+
struct {
   /** The maximum number of vertices the geometry shader might write. */
   unsigned vertices_out;
-- 
2.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] i965: Use nir->has_transform_feedback_varyings to avoid shader_prog.

2015-10-02 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index c673ccd..74ef728 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -473,7 +473,7 @@ vec4_gs_visitor::gs_emit_vertex(int stream_id)
 * be recorded by transform feedback, we can simply discard all geometry
 * bound to these streams when transform feedback is disabled.
 */
-   if (stream_id > 0 && shader_prog->TransformFeedback.NumVarying == 0)
+   if (stream_id > 0 && !nir->info.has_transform_feedback_varyings)
   return;
 
/* If we're outputting 32 control data bits or less, then we can wait
-- 
2.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] egl/dri2: expose srgb configs when KHR_gl_colorspace is available

2015-10-02 Thread Emil Velikov
Otherwise the user has no way of using it, and we'll try to access the
linear one.

Cc: Mauro Rossi 
Cc: Chih-Wei Huang 
Cc: "11.0" 
Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/egl_dri2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 1740ee3..049bebb 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -236,7 +236,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
  break;
 
   case __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE:
- srgb = value != 0;
+ if (dpy->Extensions.KHR_gl_colorspace)
+srgb = value != 0;
  break;
 
   default:
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #16 from Emil Velikov  ---
(In reply to Emil Velikov from comment #14)
> (In reply to Ilia Mirkin from comment #13)
> > Er nevermind. Before it was malloc'd but now it's calloc'd.
> 
> The most creepy part is that with the extension disabled, we can never get
> to the srgb configs, as GLColorspace will be LINEAR. Even if the user
> requests one, we'll error out.

Actually I stand corrected - if one disables the extension, we'll still
expose/list the srgb configs. With this patch [1] + the one in comment 8 things
should be back to normal. Now we need to figure out why things go crazy in the
first place (hint check the format mappings in comment 15).

[1] http://patchwork.freedesktop.org/patch/60961/

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] main: consider that unsized arrays have at least one active element

2015-10-02 Thread Samuel Iglesias Gonsalvez
From ARB_shader_storage_buffer_object:

"When using the ARB_program_interface_query extension to enumerate the
 set of active buffer variables, only the first element of arrays (sized
 or unsized) will be enumerated"

_mesa_program_resource_array_size() is used when getting the name (and
name length) of the active variables. When it is an unsized array,
we want to indicate it has one active element so the returned name
would have "[0]" at the end.

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/mesa/main/shader_query.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index be4c918..a8f7755 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -485,8 +485,12 @@ _mesa_program_resource_array_size(struct 
gl_program_resource *res)
case GL_COMPUTE_SUBROUTINE_UNIFORM:
case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
-   case GL_BUFFER_VARIABLE:
   return RESOURCE_UNI(res)->array_elements;
+   case GL_BUFFER_VARIABLE:
+  if (RESOURCE_UNI(res)->is_unsized_array)
+ return 1;
+  else
+ return RESOURCE_UNI(res)->array_elements;
case GL_VERTEX_SUBROUTINE:
case GL_GEOMETRY_SUBROUTINE:
case GL_FRAGMENT_SUBROUTINE:
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] main: fix goto in program_resource_top_level_array_stride

2015-10-02 Thread Samuel Iglesias Gonsalvez
Use found_top_level_array_stride instead of found_top_level_array_size.

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/mesa/main/shader_query.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index e6d93c3..be4c918 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1006,7 +1006,7 @@ program_resource_top_level_array_stride(struct 
gl_shader_program *shProg,
 */
if (strcmp(name, var_name) == 0) {
   array_stride = 0;
-  goto found_top_level_array_size;
+  goto found_top_level_array_stride;
}
 
if (interface->interface_packing != 
GLSL_INTERFACE_PACKING_STD430) {
@@ -1024,11 +1024,11 @@ program_resource_top_level_array_stride(struct 
gl_shader_program *shProg,
 } else {
array_stride = 0;
 }
-goto found_top_level_array_size;
+goto found_top_level_array_stride;
  }
   }
}
-found_top_level_array_size:
+found_top_level_array_stride:
free(interface_name);
free(var_name);
return array_stride;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] main: fix TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE

2015-10-02 Thread Samuel Iglesias Gonsalvez
When the active variable is an array which is already a top-level
shader storage block member, don't return its array size and stride
when querying TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE
respectively.

Fixes the following 12 dEQP-GLES31 tests:

dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.column_major_mat3x4

Signed-off-by: Samuel Iglesias Gonsalvez 
---

I am not very sure about this change, ARB_program_interface query
doesn't clarify this case:

"For the property TOP_LEVEL_ARRAY_SIZE, a single integer identifying
 the number of active array elements of the top-level shader storage
 block member containing to the active variable is written to "

"For the property TOP_LEVEL_ARRAY_STRIDE, a single integer identifying
 the stride between array elements of the top-level shader storage
 block member containing the active variable is written to ."

Ian, What do you think?

 src/mesa/main/shader_query.cpp | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 7189676..e6d93c3 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -921,12 +921,18 @@ program_resource_top_level_array_size(struct 
gl_shader_program *shProg,
  * the top-level block member is an array with no declared size,
  * the value zero is written to .
  */
-if (field->type->is_unsized_array())
+/* If the given variable is already a top-level shader storage
+ * block member, then return array_size = 1.
+ */
+if (strcmp(name, var_name) == 0)
+   array_size = 1;
+else if (field->type->is_unsized_array())
array_size = 0;
 else if (field->type->is_array())
array_size = field->type->length;
 else
array_size = 1;
+
 goto found_top_level_array_size;
  }
   }
@@ -995,6 +1001,14 @@ program_resource_top_level_array_stride(struct 
gl_shader_program *shProg,
bool row_major = matrix_layout == GLSL_MATRIX_LAYOUT_ROW_MAJOR;
const glsl_type *array_type = field->type->fields.array;
 
+   /* If the given variable is already a top-level shader storage
+* block member, then return array_stride = 0.
+*/
+   if (strcmp(name, var_name) == 0) {
+  array_stride = 0;
+  goto found_top_level_array_size;
+   }
+
if (interface->interface_packing != 
GLSL_INTERFACE_PACKING_STD430) {
   if (array_type->is_record()) {
  array_stride = array_type->std140_size(row_major);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] glsl: fix matrix stride calculation for std430's row_major matrices with two columns

2015-10-02 Thread Samuel Iglesias Gonsalvez
It doesn't round up to vec4 size.

Fixes 15 dEQP tests:

dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/glsl/lower_ubo_reference.cpp | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp
index 247620e..183435e 100644
--- a/src/glsl/lower_ubo_reference.cpp
+++ b/src/glsl/lower_ubo_reference.cpp
@@ -744,7 +744,14 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
* or 32 depending on the number of columns.
*/
   assert(matrix_columns <= 4);
-  unsigned matrix_stride = glsl_align(matrix_columns * N, 16);
+  unsigned matrix_stride = 0;
+  /* matrix stride for std430 mat2xY matrices are not rounded up to
+   * vec4 size.
+   */
+  if (packing == GLSL_INTERFACE_PACKING_STD430 && matrix_columns == 2)
+ matrix_stride = 2 * N;
+  else
+ matrix_stride = glsl_align(matrix_columns * N, 16);
 
   const glsl_type *deref_type = deref->type->base_type == GLSL_TYPE_FLOAT ?
  glsl_type::float_type : glsl_type::double_type;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] main: buffer array variables can have array size of 0 if they are unsized

2015-10-02 Thread Samuel Iglesias Gonsalvez
From ARB_program_query_interface:

  For the property ARRAY_SIZE, a single integer identifying the number of
  active array elements of an active variable is written to . The
  array size returned is in units of the type associated with the property
  TYPE. For active variables not corresponding to an array of basic types,
  the value one is written to . If the variable is a shader
  storage block member in an array with no declared size, the value zero
  is written to .

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/mesa/main/shader_query.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index a8f7755..a1c80b1 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1269,8 +1269,12 @@ _mesa_program_resource_prop(struct gl_shader_program 
*shProg,
   switch (res->Type) {
   case GL_UNIFORM:
   case GL_BUFFER_VARIABLE:
+ if (RESOURCE_UNI(res)->is_shader_storage &&
+ RESOURCE_UNI(res)->is_unsized_array)
+*val = 0;
+ else
 *val = MAX2(RESOURCE_UNI(res)->array_elements, 1);
-return 1;
+ return 1;
   case GL_PROGRAM_INPUT:
   case GL_PROGRAM_OUTPUT:
  *val = MAX2(_mesa_program_resource_array_size(res), 1);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] glsl: Add is_unsized_array flag to gl_uniform_storage

2015-10-02 Thread Samuel Iglesias Gonsalvez
---
 src/glsl/ir_uniform.h  | 5 +
 src/glsl/link_uniforms.cpp | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
index 858a7da..6954759 100644
--- a/src/glsl/ir_uniform.h
+++ b/src/glsl/ir_uniform.h
@@ -199,6 +199,11 @@ struct gl_uniform_storage {
 * This is a shader storage buffer variable, not an uniform.
 */
bool is_shader_storage;
+
+   /**
+* Flag to indicate the variable is an unsized array
+*/
+   bool is_unsized_array;
 };
 
 #ifdef __cplusplus
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 47d49c8..c471d98 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -792,6 +792,8 @@ private:
 
   this->uniforms[id].is_shader_storage =
  current_var->is_in_shader_storage_block();
+  this->uniforms[id].is_unsized_array =
+ current_var->type->is_unsized_array();
 
   if (this->ubo_block_index != -1) {
  this->uniforms[id].block_index = this->ubo_block_index;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/6] glsl: fix matrix stride calculation for std430's row_major matrices with two columns

2015-10-02 Thread Ilia Mirkin
I'm a little concerned that your random ssbo packing tests didn't pick
this up... can you double-check your script?

On Fri, Oct 2, 2015 at 3:13 AM, Samuel Iglesias Gonsalvez
 wrote:
> It doesn't round up to vec4 size.
>
> Fixes 15 dEQP tests:
>
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4
> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4
> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2
> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3
> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4
> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2
> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3
> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4
>
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/lower_ubo_reference.cpp | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index 247620e..183435e 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -744,7 +744,14 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
> * or 32 depending on the number of columns.
> */
>assert(matrix_columns <= 4);
> -  unsigned matrix_stride = glsl_align(matrix_columns * N, 16);
> +  unsigned matrix_stride = 0;
> +  /* matrix stride for std430 mat2xY matrices are not rounded up to
> +   * vec4 size.
> +   */
> +  if (packing == GLSL_INTERFACE_PACKING_STD430 && matrix_columns == 2)
> + matrix_stride = 2 * N;
> +  else
> + matrix_stride = glsl_align(matrix_columns * N, 16);
>
>const glsl_type *deref_type = deref->type->base_type == 
> GLSL_TYPE_FLOAT ?
>   glsl_type::float_type : glsl_type::double_type;
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: validate binding qualifier on block members

2015-10-02 Thread Tapani Pälli
Fixes following Piglit test:
member-invalid-binding-qualifier.frag

Signed-off-by: Tapani Pälli 
---
 src/glsl/ast_to_hir.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 6899a55..7bcf182 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -5765,6 +5765,10 @@ ast_process_structure_or_interface_block(exec_list 
*instructions,
 
  const struct ast_type_qualifier *const qual =
 & decl_list->type->qualifier;
+
+ if (qual->flags.q.explicit_binding)
+validate_binding_qualifier(state, , decl_type, qual);
+
  if (qual->flags.q.std140 ||
  qual->flags.q.std430 ||
  qual->flags.q.packed ||
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] main: fix TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE

2015-10-02 Thread Tapani Pälli

These patches fix ES31-CTS.shader_storage_buffer_object.*matrixOp*
(from 6/14 to 14/14 passing)

Tested-by: Tapani Pälli 

On 10/02/2015 10:13 AM, Samuel Iglesias Gonsalvez wrote:

When the active variable is an array which is already a top-level
shader storage block member, don't return its array size and stride
when querying TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE
respectively.

Fixes the following 12 dEQP-GLES31 tests:

dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.column_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat3x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.column_major_mat3x4

Signed-off-by: Samuel Iglesias Gonsalvez 
---

I am not very sure about this change, ARB_program_interface query
doesn't clarify this case:

"For the property TOP_LEVEL_ARRAY_SIZE, a single integer identifying
  the number of active array elements of the top-level shader storage
  block member containing to the active variable is written to "

"For the property TOP_LEVEL_ARRAY_STRIDE, a single integer identifying
  the stride between array elements of the top-level shader storage
  block member containing the active variable is written to ."

Ian, What do you think?

  src/mesa/main/shader_query.cpp | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 7189676..e6d93c3 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -921,12 +921,18 @@ program_resource_top_level_array_size(struct 
gl_shader_program *shProg,
   * the top-level block member is an array with no declared size,
   * the value zero is written to .
   */
-if (field->type->is_unsized_array())
+/* If the given variable is already a top-level shader storage
+ * block member, then return array_size = 1.
+ */
+if (strcmp(name, var_name) == 0)
+   array_size = 1;
+else if (field->type->is_unsized_array())
 array_size = 0;
  else if (field->type->is_array())
 array_size = field->type->length;
  else
 array_size = 1;
+
  goto found_top_level_array_size;
   }
}
@@ -995,6 +1001,14 @@ program_resource_top_level_array_stride(struct 
gl_shader_program *shProg,
 bool row_major = matrix_layout == GLSL_MATRIX_LAYOUT_ROW_MAJOR;
 const glsl_type *array_type = field->type->fields.array;

+   /* If the given variable is already a top-level shader storage
+* block member, then return array_stride = 0.
+*/
+   if (strcmp(name, var_name) == 0) {
+  array_stride = 0;
+  goto found_top_level_array_size;
+   }
+
 if (interface->interface_packing != 
GLSL_INTERFACE_PACKING_STD430) {
if (array_type->is_record()) {
   array_stride = array_type->std140_size(row_major);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/11] i965/vec4: Use the actual channels used in pack_uniform_registers

2015-10-02 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> Previously, pack_uniform_registers worked based on the size of the uniform
> as given to us when we initially set up the uniforms.  However, we have to
> walk through the uniforms and figure out liveness anyway, so we migh as
> well record the number of channels used as we go.  This may also allow us
> to pack things tighter in a few cases.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 52 
> +-
>  1 file changed, 38 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 407698f..f0fa07e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -518,11 +518,11 @@ vec4_visitor::split_uniform_registers()
>  void
>  vec4_visitor::pack_uniform_registers()
>  {
> -   bool uniform_used[this->uniforms];
> +   uint8_t chans_used[this->uniforms];
> int new_loc[this->uniforms];
> int new_chan[this->uniforms];
>  
> -   memset(uniform_used, 0, sizeof(uniform_used));
> +   memset(chans_used, 0, sizeof(chans_used));
> memset(new_loc, 0, sizeof(new_loc));
> memset(new_chan, 0, sizeof(new_chan));
>  
> @@ -532,10 +532,36 @@ vec4_visitor::pack_uniform_registers()
>  */
> foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
>for (int i = 0 ; i < 3; i++) {
> -  if (inst->src[i].file != UNIFORM)
> - continue;
> + if (inst->src[i].file != UNIFORM)
> +continue;

The switch statement below should go before the loop.

> + unsigned readmask;
> + switch (inst->opcode) {
> + case VEC4_OPCODE_PACK_BYTES:
> + case BRW_OPCODE_DP4:
> + case BRW_OPCODE_DPH:
> +readmask = 0xf;
> +break;
> + case BRW_OPCODE_DP3:
> +readmask = 0x7;
> +break;
> + case BRW_OPCODE_DP2:
> +readmask = 0x3;
> +break;

I don't know if there are other opcodes that could also be
special-handled like these, but if there are we are only missing the
opportunity to do tighter packing for them (which we are not doing now
anyway)...

Reviewed-by: Iago Toral Quiroga 

> + default:
> +readmask = inst->dst.writemask;
> +break;
> + }
> +
> + int reg = inst->src[i].reg;
> +
> + for (int c = 0; c < 4; c++) {
> +if (!(readmask & (1 << c)))
> +   continue;
>  
> -  uniform_used[inst->src[i].reg] = true;
> +chans_used[reg] = MAX2(chans_used[reg],
> +   BRW_GET_SWZ(inst->src[i].swizzle, c) + 1);
> + }
>}
> }
>  
> @@ -546,17 +572,15 @@ vec4_visitor::pack_uniform_registers()
>  */
> for (int src = 0; src < uniforms; src++) {
>assert(src < uniform_array_size);
> -  int size = this->uniform_vector_size[src];
> +  int size = chans_used[src];
>  
> -  if (!uniform_used[src]) {
> -  this->uniform_vector_size[src] = 0;
> -  continue;
> -  }
> +  if (size == 0)
> + continue;
>  
>int dst;
>/* Find the lowest place we can slot this uniform in. */
>for (dst = 0; dst < src; dst++) {
> -  if (this->uniform_vector_size[dst] + size <= 4)
> +  if (chans_used[dst] + size <= 4)
>   break;
>}
>  
> @@ -565,7 +589,7 @@ vec4_visitor::pack_uniform_registers()
>new_chan[src] = 0;
>} else {
>new_loc[src] = dst;
> -  new_chan[src] = this->uniform_vector_size[dst];
> +  new_chan[src] = chans_used[dst];
>  
>/* Move the references to the data */
>for (int j = 0; j < size; j++) {
> @@ -573,8 +597,8 @@ vec4_visitor::pack_uniform_registers()
>  stage_prog_data->param[src * 4 + j];
>}
>  
> -  this->uniform_vector_size[dst] += size;
> -  this->uniform_vector_size[src] = 0;
> +  chans_used[dst] += size;
> +  chans_used[src] = 0;
>}
>  
>new_uniform_count = MAX2(new_uniform_count, dst + 1);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/6] glsl: fix matrix stride calculation for std430's row_major matrices with two columns

2015-10-02 Thread Samuel Iglesias Gonsálvez


On 02/10/15 09:20, Ilia Mirkin wrote:
> I'm a little concerned that your random ssbo packing tests didn't pick
> this up... can you double-check your script?
> 

The script generates shader_runner tests that query GL_MATRIX_STRIDE
using ARB_program_interface_query's calls but it has not support for
checking that it is writing/reading to the proper places yet.

That queries are fine. This patch is to fix the SSBO load/stores
operations with std430 and row_major mat2xY matrices.

Sam

> On Fri, Oct 2, 2015 at 3:13 AM, Samuel Iglesias Gonsalvez
>  wrote:
>> It doesn't round up to vec4 size.
>>
>> Fixes 15 dEQP tests:
>>
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4
>> dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4
>> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2
>> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3
>> dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4
>> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2
>> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3
>> dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> ---
>>  src/glsl/lower_ubo_reference.cpp | 9 -
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/lower_ubo_reference.cpp 
>> b/src/glsl/lower_ubo_reference.cpp
>> index 247620e..183435e 100644
>> --- a/src/glsl/lower_ubo_reference.cpp
>> +++ b/src/glsl/lower_ubo_reference.cpp
>> @@ -744,7 +744,14 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
>> * or 32 depending on the number of columns.
>> */
>>assert(matrix_columns <= 4);
>> -  unsigned matrix_stride = glsl_align(matrix_columns * N, 16);
>> +  unsigned matrix_stride = 0;
>> +  /* matrix stride for std430 mat2xY matrices are not rounded up to
>> +   * vec4 size.
>> +   */
>> +  if (packing == GLSL_INTERFACE_PACKING_STD430 && matrix_columns == 2)
>> + matrix_stride = 2 * N;
>> +  else
>> + matrix_stride = glsl_align(matrix_columns * N, 16);
>>
>>const glsl_type *deref_type = deref->type->base_type == 
>> GLSL_TYPE_FLOAT ?
>>   glsl_type::float_type : glsl_type::double_type;
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: validate binding qualifier on block members

2015-10-02 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On 02/10/15 09:23, Tapani Pälli wrote:
> Fixes following Piglit test:
>   member-invalid-binding-qualifier.frag
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/ast_to_hir.cpp | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 6899a55..7bcf182 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -5765,6 +5765,10 @@ ast_process_structure_or_interface_block(exec_list 
> *instructions,
>  
>   const struct ast_type_qualifier *const qual =
>  & decl_list->type->qualifier;
> +
> + if (qual->flags.q.explicit_binding)
> +validate_binding_qualifier(state, , decl_type, qual);
> +
>   if (qual->flags.q.std140 ||
>   qual->flags.q.std430 ||
>   qual->flags.q.packed ||
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/6] glsl: fix matrix stride calculation for std430's row_major matrices with two columns

2015-10-02 Thread Ilia Mirkin
On Fri, Oct 2, 2015 at 3:34 AM, Samuel Iglesias Gonsálvez
 wrote:
> On 02/10/15 09:20, Ilia Mirkin wrote:
>> I'm a little concerned that your random ssbo packing tests didn't pick
>> this up... can you double-check your script?
>>
>
> The script generates shader_runner tests that query GL_MATRIX_STRIDE
> using ARB_program_interface_query's calls but it has not support for
> checking that it is writing/reading to the proper places yet.

Oh. That explains it. Might I suggest implementing such support? :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/6] glsl: fix matrix stride calculation for std430's row_major matrices with two columns

2015-10-02 Thread Samuel Iglesias Gonsálvez


On 02/10/15 09:39, Ilia Mirkin wrote:
> On Fri, Oct 2, 2015 at 3:34 AM, Samuel Iglesias Gonsálvez
>  wrote:
>> On 02/10/15 09:20, Ilia Mirkin wrote:
>>> I'm a little concerned that your random ssbo packing tests didn't pick
>>> this up... can you double-check your script?
>>>
>>
>> The script generates shader_runner tests that query GL_MATRIX_STRIDE
>> using ARB_program_interface_query's calls but it has not support for
>> checking that it is writing/reading to the proper places yet.
> 
> Oh. That explains it. Might I suggest implementing such support? :)
> 

Yeah, sure. I add it to my todo list :-)

Sam
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/11] i965/vec4: Get rid of the uniform_vector_size array

2015-10-02 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> The uniform_vector_size array was only ever used by pack_uniform_registers
> which no longer needs it.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|  1 -
>  src/mesa/drivers/dri/i965/brw_vec4.h  |  3 +--
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 15 ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  3 ---
>  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  1 -
>  5 files changed, 5 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index f0fa07e..4e9f3f7 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1667,7 +1667,6 @@ vec4_visitor::setup_uniforms(int reg)
>  */
> if (devinfo->gen < 6 && this->uniforms == 0) {
>assert(this->uniforms < this->uniform_array_size);
> -  this->uniform_vector_size[this->uniforms] = 1;
>  
>stage_prog_data->param =
>   reralloc(NULL, stage_prog_data->param, const gl_constant_value *, 
> 4);
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 341897e..76b13a6 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -124,8 +124,7 @@ public:
> dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
> const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
> int *uniform_size;
> -   int *uniform_vector_size;
> -   int uniform_array_size; /*< Size of uniform_[vector_]size arrays */
> +   int uniform_array_size; /*< Size of the uniform_size array */
> int uniforms;
>  
> src_reg shader_start_time;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 2555038..b0abfc1 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -166,16 +166,14 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
>  
>struct gl_program_parameter_list *plist = prog->Parameters;
>for (unsigned p = 0; p < plist->NumParameters; p++) {
> - uniform_vector_size[uniforms] = plist->Parameters[p].Size;
> -
>   /* Parameters should be either vec4 uniforms or single component
>* constants; matrices and other larger types should have been 
> broken
>* down earlier.
>*/
> - assert(uniform_vector_size[uniforms] <= 4);
> + assert(plist->Parameters[p].Size <= 4);
>  
> - int i;
> - for (i = 0; i < uniform_vector_size[uniforms]; i++) {
> + unsigned i;
> + for (i = 0; i < plist->Parameters[p].Size; i++) {
>  stage_prog_data->param[uniforms * 4 + i] = 
> >ParameterValues[p][i];
>   }
>   for (; i < 4; i++) {
> @@ -218,10 +216,9 @@ vec4_visitor::nir_setup_uniform(nir_variable *var)
>  
> for (unsigned s = 0; s < vector_count; s++) {
>assert(uniforms < uniform_array_size);
> -  uniform_vector_size[uniforms] = storage->type->vector_elements;
>  
>int i;
> -  for (i = 0; i < uniform_vector_size[uniforms]; i++) {
> +  for (i = 0; i < storage->type->vector_elements; i++) {
>   stage_prog_data->param[uniforms * 4 + i] = components;
>   components++;
>}
> @@ -258,10 +255,6 @@ vec4_visitor::nir_setup_builtin_uniform(nir_variable 
> *var)
>   stage_prog_data->param[uniforms * 4 + j] =
>  [GET_SWZ(slots[i].swizzle, j)];
>  
> -  uniform_vector_size[uniforms] =
> - (var->type->is_scalar() || var->type->is_vector() ||
> -  var->type->is_matrix() ? var->type->vector_elements : 4);
> -
>uniforms++;
> }
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index f303080..af01d8e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -702,8 +702,6 @@ vec4_visitor::setup_vec4_uniform_value(unsigned 
> param_offset,
>  
> for (unsigned i = n; i < 4; ++i)
>stage_prog_data->param[param_offset + i] = 
> -
> -   uniform_vector_size[param_offset / 4] = n;
>  }
>  
>  vec4_instruction *
> @@ -1886,7 +1884,6 @@ vec4_visitor::vec4_visitor(const struct brw_compiler 
> *compiler,
> }
>  
> this->uniform_size = rzalloc_array(mem_ctx, int, 
> this->uniform_array_size);
> -   this->uniform_vector_size = rzalloc_array(mem_ctx, int, 
> this->uniform_array_size);
>  }
>  
>  vec4_visitor::~vec4_visitor()
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
> index f4b50ba..3a267d8 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
> +++ 

Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-02 Thread Kenneth Graunke
On Friday, October 02, 2015 05:04:21 PM Ilia Mirkin wrote:
> On Fri, Oct 2, 2015 at 4:47 PM, Ian Romanick  wrote:
> > On 10/01/2015 12:15 PM, Ilia Mirkin wrote:
> >> On Thu, Oct 1, 2015 at 3:12 PM, Ian Romanick  wrote:
> >>> I'm just
> >>> wondering because Mesa doesn't support that extension.  How is this even
> >>> being hit?
> >>
> >> See 81d2fd91a90 (mesa: add NV_read_{depth,stencil,depth_stencil} 
> >> extensions)
> >
> > Okay, that's weird.  I must have had some old branch checked out at the
> > time.  I did 'grep -r NV_read_depth src/', and it came back empty.  Now
> > I just get to be irritated that we enabled THREE extensions for which we
> > have ZERO tests... and least one is clearly completely broken. :(
> >
> > I guess now I at least have something concrete to point to then next
> > time I object to enabling an ES extension that "just" allows some
> > desktop functionality. ;)
> 
> I believe Rob tested at least some of it with qapitrace[1], as
> otherwise there was no way to get access to the data in a
> renderbuffer, which can be quite useful for debugging. Not all of us
> have your level of familiarity with how GL works, but how will we
> learn without making some mistakes? :)
> 
> No matter how many tests we might have, they'll always leave
> *something* out. The fact that there are no tests at all for this ext
> isn't great, of course. But there are also no functional tests for
> {ARB,AMD}_conservative_depth and probably a number of others.
> 
>   -ilia

I think we can all agree that enabling a feature with 0 tests is bad
practice.  Just because people have done it before doesn't make it wise.
Or in a positive spin: our commitment to Piglit is one of the reasons
we have such high quality drivers.  It's worth it, even if it can be a
bunch of unglamorous and annoying work.

I'm OK with not adding comprehensive tests for ES extensions that port
over GL functionality, but at least touch testing the feature is
valuable.

Also...GL_ARB_conservative_depth is a bit different.  It provides a hint
which the driver can use to speed up some operations - but there's no
new feature, or observable behavior.  Drivers can completely ignore it
and still have a valid, correct implementation of the extension.  So
it's a bit tricky to test (but we could at least look for breakage...)

In contrast, GL_NV_read_depth_stencil turns a GL call that used to be an
error into a functioning read of depth data, which seems straightforward
to test.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] i965/nir: Remove the prog parameter from brw_nir_lower_inputs

2015-10-02 Thread Jason Ekstrand
On Oct 2, 2015 8:51 PM, "Kenneth Graunke"  wrote:
>
> On Friday, October 02, 2015 07:48:47 PM Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/brw_nir.c | 6 ++
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
b/src/mesa/drivers/dri/i965/brw_nir.c
> > index cc9430c..4f19655 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -28,9 +28,7 @@
> >  #include "program/prog_to_nir.h"
> >
> >  static void
> > -brw_nir_lower_inputs(nir_shader *nir,
> > - const struct gl_program *prog,
> > - bool is_scalar)
> > +brw_nir_lower_inputs(nir_shader *nir, bool is_scalar)
> >  {
> > nir_assign_var_locations(>inputs, >num_inputs,
> >  is_scalar ? type_size_scalar :
type_size_vec4);
> > @@ -141,7 +139,7 @@ brw_create_nir(struct brw_context *brw,
> > /* Get rid of split copies */
> > nir_optimize(nir, is_scalar);
> >
> > -   brw_nir_lower_inputs(nir, prog, is_scalar);
> > +   brw_nir_lower_inputs(nir, is_scalar);
> > brw_nir_lower_outputs(nir, is_scalar);
> > nir_assign_var_locations(>uniforms,
> >  >num_uniforms,
> >
>
> This is actually used in by pending patch series.  But it looks like
> everything I need is in nir_shader_info, so I should be able to rebase
> onto this without much trouble.
>
> Series is:
> Reviewed-by: Kenneth Graunke 

Thanks!  One more step away from tyranny. :-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-02 Thread Ilia Mirkin
On Sat, Oct 3, 2015 at 12:06 AM, Kenneth Graunke  wrote:
> On Friday, October 02, 2015 05:04:21 PM Ilia Mirkin wrote:
>> On Fri, Oct 2, 2015 at 4:47 PM, Ian Romanick  wrote:
>> > On 10/01/2015 12:15 PM, Ilia Mirkin wrote:
>> >> On Thu, Oct 1, 2015 at 3:12 PM, Ian Romanick  wrote:
>> >>> I'm just
>> >>> wondering because Mesa doesn't support that extension.  How is this even
>> >>> being hit?
>> >>
>> >> See 81d2fd91a90 (mesa: add NV_read_{depth,stencil,depth_stencil} 
>> >> extensions)
>> >
>> > Okay, that's weird.  I must have had some old branch checked out at the
>> > time.  I did 'grep -r NV_read_depth src/', and it came back empty.  Now
>> > I just get to be irritated that we enabled THREE extensions for which we
>> > have ZERO tests... and least one is clearly completely broken. :(
>> >
>> > I guess now I at least have something concrete to point to then next
>> > time I object to enabling an ES extension that "just" allows some
>> > desktop functionality. ;)
>>
>> I believe Rob tested at least some of it with qapitrace[1], as
>> otherwise there was no way to get access to the data in a
>> renderbuffer, which can be quite useful for debugging. Not all of us
>> have your level of familiarity with how GL works, but how will we
>> learn without making some mistakes? :)
>>
>> No matter how many tests we might have, they'll always leave
>> *something* out. The fact that there are no tests at all for this ext
>> isn't great, of course. But there are also no functional tests for
>> {ARB,AMD}_conservative_depth and probably a number of others.
>>
>>   -ilia
>
> I think we can all agree that enabling a feature with 0 tests is bad
> practice.  Just because people have done it before doesn't make it wise.
> Or in a positive spin: our commitment to Piglit is one of the reasons
> we have such high quality drivers.  It's worth it, even if it can be a
> bunch of unglamorous and annoying work.
>
> I'm OK with not adding comprehensive tests for ES extensions that port
> over GL functionality, but at least touch testing the feature is
> valuable.
>
> Also...GL_ARB_conservative_depth is a bit different.  It provides a hint
> which the driver can use to speed up some operations - but there's no
> new feature, or observable behavior.  Drivers can completely ignore it
> and still have a valid, correct implementation of the extension.  So
> it's a bit tricky to test (but we could at least look for breakage...)

Ah yes, but drivers that don't ignore it can definitely mess it up. A
proper test would do a layout(depth) + modify depth in a permitted
fashion and make sure that it still works. I have a pending
implementation for nvc0 that's waiting on such tests to make sure that
I didn't get it backwards... with depth I can never tell what's up and
what's down :)

>
> In contrast, GL_NV_read_depth_stencil turns a GL call that used to be an
> error into a functioning read of depth data, which seems straightforward
> to test.

For the record, I largely agree with everything you say about the
value/importance of having tests. There's a non-trivial problem with
the same person writing the tests and implementation though -- if I
think an implementation should work a particular way, I'm going to
write a test that conforms to that line of thought. This is a large
part of the reason I tend to stay away from test writing for
extensions I don't fully understand -- it's dangerous for me to write
tests if I don't fully grasp all the functionality. For example take
the ARB_conservative_depth thing -- I could write tests that check the
depth stuff in the way I think it should work, but I'm not
sufficiently confident in my understanding of that stuff. The only
reason I was able to write tests for EXT_polygon_offset_clamp was that
I was 100% confident of my impl and just jiggered the test until it
passed.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: don't set DATA_FORMAT if ADD_TID_ENABLE is set on VI (v2)

2015-10-02 Thread Marek Olšák
After some testing, I realized this patch causes problems. DATA_FORMAT
with ADD_TID=1 means STRIDE[14:17] only if the instruction is untyped
MUBUF, but Mesa only uses ADD_TID with typed MUBUF (tbuffer in
particular).

Marek

On Thu, Oct 1, 2015 at 10:27 AM, Christian König
 wrote:
> On 01.10.2015 05:44, Michel Dänzer wrote:
>>
>> On 01.10.2015 04:11, Marek Olšák wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> This can cause incorrect address calculations and hangs.
>>>
>>> v2: do it properly
>>>
>>> Cc: mesa-sta...@lists.freedesktop.org
>>> Tested-and-Reviewed-by: Christian König 
>>> ---
>>>   src/gallium/drivers/radeonsi/si_descriptors.c | 9 -
>>>   1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
>>> b/src/gallium/drivers/radeonsi/si_descriptors.c
>>> index b07ab3b..b56219a 100644
>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>> @@ -619,11 +619,18 @@ void si_set_ring_buffer(struct pipe_context *ctx,
>>> uint shader, uint slot,
>>>   S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
>>>   S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
>>>
>>> S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
>>> -
>>> S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32) |
>>>   S_008F0C_ELEMENT_SIZE(element_size) |
>>>   S_008F0C_INDEX_STRIDE(index_stride) |
>>>   S_008F0C_ADD_TID_ENABLE(add_tid);
>>>   + /* If ADD_TID_ENABLE is set on VI, DATA_FORMAT specifies
>>> +* STRIDE bits [14:17]
>>> +*/
>>> +   if (sctx->b.chip_class >= VI && add_tid)
>>> +   desc[3] |= S_008F0C_DATA_FORMAT(stride >> 14);
>>> +   else
>>> +   desc[3] |=
>>> S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);
>>
>> The beginning of the function has:
>>
>> /* The stride field in the resource descriptor has 14 bits */
>> assert(stride < (1 << 14));
>>
>> So the if-branch is dead code in a non-release build. Would be nice if
>> that could be reconciled somehow, but I'm fine with doing it in a
>> follow-up change. Either way, this change is
>>
>> Reviewed-by: Michel Dänzer 
>
>
> Yeah, agree. Might be nice if someone can come up with a test for this, but
> I don't think this is absolutely necessary.
>
> For now the patch is Reviewed-by: Christian König 
>
> Regards,
> Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92263] [swrast] piglit gl-2.1-pbo regression

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92263

Bug ID: 92263
   Summary: [swrast] piglit gl-2.1-pbo regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org

mesa: bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 (master 11.1.0-devel)

$ ./bin/gl-2.1-pbo -auto -fbo
[...]
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glBegin(incomplete
framebuffer)
Mesa: User error: GL_INVALID_OPERATION in glEnd
Mesa: User error: GL_INVALID_FRAMEBUFFER_OPERATION in glReadPixels(incomplete
framebuffer)
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 0.80 0.80 0.80
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 0.80 0.80 0.80
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 0.80 0.80 0.80
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 1.00 1.00 0.00
  Observed: 1.00 0.00 0.00
Probe at (0,0)
  Expected: 0.80 0.80 0.80
  Observed: 1.00 0.00 0.00
PIGLIT: {"subtest": {"test_tex_image" : "fail"}}


c277fa394087272c79d65fc308d268fc768b91e5 is the first bad commit
commit c277fa394087272c79d65fc308d268fc768b91e5
Author: Brian Paul 
Date:   Wed Sep 30 11:29:13 2015 -0600

mesa: consolidate texture binding code

Before, we were doing the actual _mesa_reference_texobj() call and
ctx->Driver.BindTexture() and misc housekeeping in three different
places.  This consolidates the common code in a new bind_texture()
function.

Reviewed-by: Tapani Pälli 

:04 04 01cfb3f9437ac679e95f9507e0b47516ecec347e
7d921525c01aef94abfb26cc232d225ebe5f50af Msrc
bisect run success

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] nir: Add a nir_shader_info::has_transform_feedback_varyings flag.

2015-10-02 Thread Matt Turner
The series is

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92264] [softpipe] piglit arb_direct_state_access-texunits regression

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92264

Bug ID: 92264
   Summary: [softpipe] piglit arb_direct_state_access-texunits
regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: bri...@vmware.com, lem...@gmail.com

mesa: bf7b6fd3fd6d98305d64ee6224ca9f9e7ba48444 (master 11.1.0-devel)

$ ./bin/arb_direct_state_access-texunits -auto 
GL_RENDERER = Gallium 0.4 on softpipe
GL_MAX_TEXTURE_COORDS = 8
GL_MAX_TEXTURE_IMAGE_UNITS = 18
GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS = 18
GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS = 54
Mesa: User error: GL_INVALID_VALUE in glBindTextureUnit(unit=54)
Unexpected GL error: GL_INVALID_VALUE 0x501
(Error at piglit/tests/spec/arb_direct_state_access/texunits.c:124)
Expected GL error: GL_INVALID_OPERATION 0x502
PIGLIT: {"result": "fail" }


a9408f3ca14f2fb6286bd66bad06ee1bde0d8697 is the first bad commit
commit a9408f3ca14f2fb6286bd66bad06ee1bde0d8697
Author: Brian Paul 
Date:   Wed Sep 30 11:37:16 2015 -0600

mesa: remove _mesa_get_tex_unit_err() and fix error handling

This helper was only called from _mesa_BindTextureUnit().  It's simpler
to just inline it.

The error check / code / message in the helper was incorrect.  It was
written for glBindTextures(), not glBindTextureUnit().  The correct
error for a bad texture unit number is GL_INVALID_VALUE.  The error
message now reports the unit number rather than a GL_TEXTUREi enum.

Fixes a failure in piglit's arb_direct_state_access-bind-texture-unit test.

Reviewed-by: Tapani Pälli 

:04 04 7d921525c01aef94abfb26cc232d225ebe5f50af
27e5bf59206f1906b3654aa6da59d70968234f6c Msrc
bisect run success

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92264] [softpipe] piglit arb_direct_state_access-texunits regression

2015-10-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92264

Tapani Pälli  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Tapani Pälli  ---


*** This bug has been marked as a duplicate of bug 92215 ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >