Re: [Mesa-dev] [PATCH 8/8] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> A common use case for transform feedback is to perform one draw
> operation that writes transform feedback output to a buffer, followed
> by a second draw operation that consumes that buffer as vertex input.
> Since vertex input is consumed at an earlier pipeline stage than
> writing transform feedback output, we need to flush the pipeline to
> ensure that the transform feedback output is completely written before
> the data is consumed.
> 
> In an ideal world, we would do some dependency tracking, so that we
> would only flush the pipeline if the next draw call was about to
> consume data generated by a previous draw call in the same batch.
> However, since we don't have that sort of dependency tracking
> infrastructure right now, we just unconditionally flush the buffer
> every time glEndTransformFeedback() is called.  This will cause a
> performance hit compared to the ideal case (since we will sometimes
> flush the pipeline unnecessarily), but fortunately the performance hit
> will be confined to circumstances where transform feedback is in use.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c |8 ++--
>  src/mesa/drivers/dri/i965/brw_context.h |5 +
>  src/mesa/drivers/dri/i965/gen6_sol.c|   16 
>  3 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index eb68bf4..f7b88c3 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -52,6 +52,7 @@
>  
>  #include "tnl/t_pipeline.h"
>  #include "glsl/ralloc.h"
> +#include "intel_batchbuffer.h"
>  
>  /***
>   * Mesa's Driver Functions
> @@ -109,7 +110,8 @@ static void brwPrepareExecBegin(struct gl_context *ctx)
> }
>  }
>  
> -static void brwInitDriverFunctions( struct dd_function_table *functions )
> +static void brwInitDriverFunctions(struct intel_context *intel,
> +   struct dd_function_table *functions)
>  {
> intelInitDriverFunctions( functions );
>  
> @@ -117,6 +119,8 @@ static void brwInitDriverFunctions( struct 
> dd_function_table *functions )
> brw_init_queryobj_functions(functions);
>  
> functions->PrepareExecBegin = brwPrepareExecBegin;
> +   if (intel->gen == 6)
> +  functions->EndTransformFeedback = gen6_end_transform_feedback;

Unfortunately, this doesn't work: brwInitDriverFunctions is called
before intelInitContext, so intel->gen isn't set yet.  So this function
pointer never gets set, and gen6_end_transform_feedback never happens.

I blithely tried moving the brwInitDriverFunctions call after
intelInitContext, but that crashes.  Also, setting anything in
"functions" after intelInitContext simply doesn't take; something
must've copied the table (remapping code most likely).  In other words,
a generation check just isn't going to happen.

I think the best solution is to set the function pointer unconditionally
and add

   if (intel->gen < 6)
  return;

to the top of gen6_end_transform_feedback so it becomes a no-op on
earlier platforms.

Sorry for suggesting the generation check earlier; I didn't realize it
was impossible.

>  }
>  
>  bool
> @@ -136,7 +140,7 @@ brwCreateContext(int api,
>return false;
> }
>  
> -   brwInitDriverFunctions( &functions );
> +   brwInitDriverFunctions(intel, &functions);
>  
> if (!intelInitContext( intel, api, mesaVis, driContextPriv,
> sharedContextPrivate, &functions )) {
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index da1de02..fa6c883 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1071,6 +1071,11 @@ void brw_init_surface_formats(struct brw_context *brw);
>  bool
>  brw_fprog_uses_noperspective(const struct gl_fragment_program *fprog);
>  
> +/* gen6_sol.c */
> +void
> +gen6_end_transform_feedback(struct gl_context *ctx,
> +struct gl_transform_feedback_object *obj);
> +
>  
>  
>  /*==
> diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
> b/src/mesa/drivers/dri/i965/gen6_sol.c
> index af372c1..53e3325 100644
> --- a/src/mesa/drivers/dri/i965/gen6_sol.c
> +++ b/src/mesa/drivers/dri/i965/gen6_sol.c
> @@ -28,6 +28,7 @@
>  
>  #include "brw_context.h"
>  #include "intel_buffer_objects.h"
> +#include "intel_batchbuffer.h"
>  #include "brw_defines.h"
>  
>  static void
> @@ -100,3 +101,18 @@ const struct brw_tracked_state gen6_sol_surface = {
> },
> .emit = brw_update_sol_surfaces,
>  };
> +
> +void
> +gen6_end_transform_feedback(struct gl_context *ctx,
> +struct gl_transform_feedback_object *obj)
> +{
> +   /* After EndTransformFeedback, it's likely that the client program will 
> try
> +* to draw us

Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback stride in linker output.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> This patch adds the field gl_transform_feedback_info::BufferStride,
> which records the total number of components (per vertex) that
> transform feedback is being instructed to store in each buffer.  The
> i965 gen6 back-end needs this information in order to set up binding
> tables, and it seems better to have the linker provide this
> information rather than force the back-end to recompute it.
> ---
>  src/glsl/linker.cpp|4 +++-
>  src/mesa/main/mtypes.h |5 +
>  2 files changed, 8 insertions(+), 1 deletions(-)
> 
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index b8a7126..5eb2a20 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -1599,6 +1599,7 @@ tfeedback_decl::store(struct gl_shader_program *prog,
>info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
>info->Outputs[info->NumOutputs].OutputBuffer = buffer;
>++info->NumOutputs;
> +  info->BufferStride[buffer] += this->vector_elements;
> }
> return true;
>  }
> @@ -1863,7 +1864,8 @@ store_tfeedback_info(struct gl_context *ctx, struct 
> gl_shader_program *prog,
>   tfeedback_decl *tfeedback_decls)
>  {
> unsigned total_tfeedback_components = 0;
> -   prog->LinkedTransformFeedback.NumOutputs = 0;
> +   memset(&prog->LinkedTransformFeedback, 0,
> +  sizeof(prog->LinkedTransformFeedback));
> for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
>unsigned buffer =
>   prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS ? i : 0;
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 1934349..d4c600a 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1822,6 +1822,11 @@ struct gl_transform_feedback_info {
>unsigned OutputBuffer;
>unsigned NumComponents;
> } Outputs[MAX_PROGRAM_OUTPUTS];
> +
> +   /**
> +* Number of components stored in each buffer.
> +*/
> +   unsigned BufferStride[MAX_FEEDBACK_ATTRIBS];
>  };
>  
>  /**

I must confess that, looking at this structure alone, I wasn't sure what
the difference between NumComponents (above) and BufferStride was.

But this is definitely useful.  With or without improved comments:
Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/8] mesa: Fix off-by-one error in transform feedback size check.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> In _mesa_BindBufferRange(), we need to verify that the offset and size
> specified by the client do not exceed the size of the underlying
> buffer.  We were accidentally doing this check using ">=" rather than
> ">", so we were generating a bogus error if the client specified an
> offset and size that fit exactly in the underlying buffer.
> ---
>  src/mesa/main/transformfeedback.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/src/mesa/main/transformfeedback.c 
> b/src/mesa/main/transformfeedback.c
> index 799245d..78ca64d 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -486,7 +486,7 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
>return;
> }
>  
> -   if (offset + size >= bufObj->Size) {
> +   if (offset + size > bufObj->Size) {
>_mesa_error(ctx, GL_INVALID_VALUE,
>"glBindBufferRange(offset + size %d > buffer size %d)",
> (int) (offset + size), (int) (bufObj->Size));

Whoops.  Good catch.

Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/8] i965 gen6+: Use 1-wide null operands for IF instructions

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> The Sandy Bridge PRM, volume 4, part 2, section 5.3.10 ("5.3.10
> Register Region Restrictions") contains the following restriction on
> the execution size and operand width of instructions:
> 
>"3. ExecSize must be equal to or greater than Width."
> 
> When emitting an IF instruction in single program flow mode on Gen6+,
> we use an ExecSize of 1, therefore the Width of each operand must also
> be 1.
> 
> The operands of an IF instruction are not actually used for their
> normal purpose on Gen6+ (which is probably the reason this wasn't
> causing a GPU hang), but nonetheless it seems prudent to follow this
> rule.

I'm not actually sure what you're trying to say here.  The IF statement
is different on Gen4-5, Gen6, and Gen7, so I'm unclear what "normal" is.

I think the point is that the ARF null register doesn't have a value, so
null<0,1,0> is just as good as null<4,4,1> or null<8,8,1>.  Either way
you get nothing and write nothing.  So width is irrelevant other than
dodging this assertion.

If we weren't guessing execution size based on the register widths, I'd
prefer to just change the null reg to have width 1.  But since we do,
that would be a royal pain, and this is nice and simple.  I'm definitely
in favor of this patch.

Reviewed-by: Kenneth Graunke 

> This patch unconditionally uses 1-wide null operands for Gen6+
> IF instructions, rather than the standard null operand, which is 8
> components wide.
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c |8 
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index a46a81b..d48753c 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -941,11 +941,11 @@ brw_IF(struct brw_compile *p, GLuint execute_size)
> } else if (intel->gen == 6) {
>brw_set_dest(p, insn, brw_imm_w(0));
>insn->bits1.branch_gen6.jump_count = 0;
> -  brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
> -  brw_set_src1(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
> +  brw_set_src0(p, insn, vec1(retype(brw_null_reg(), 
> BRW_REGISTER_TYPE_D)));
> +  brw_set_src1(p, insn, vec1(retype(brw_null_reg(), 
> BRW_REGISTER_TYPE_D)));
> } else {
> -  brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
> -  brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
> +  brw_set_dest(p, insn, vec1(retype(brw_null_reg(), 
> BRW_REGISTER_TYPE_D)));
> +  brw_set_src0(p, insn, vec1(retype(brw_null_reg(), 
> BRW_REGISTER_TYPE_D)));
>brw_set_src1(p, insn, brw_imm_ud(0));
>insn->bits3.break_cont.jip = 0;
>insn->bits3.break_cont.uip = 0;

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/8] i965 gs: Move vue_map to brw_gs_compile.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> This patch stores the geometry shader VUE map from a local variable in
> compile_gs_prog() to a field in the brw_gs_compile struct, so that it
> will be available while compiling the geometry shader.  This is
> necessary in order to support transform feedback on Gen6, because the
> Gen6 geometry shader code that supports transform feedback needs to be
> able to inspect the VUE map in order to find the correct vertex data
> to output.
> ---
>  src/mesa/drivers/dri/i965/brw_gs.c |5 ++---
>  src/mesa/drivers/dri/i965/brw_gs.h |2 ++
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index 69ffa19..f5d5898 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -57,9 +57,8 @@ static void compile_gs_prog( struct brw_context *brw,
> 
> c.key = *key;
> /* The geometry shader needs to access the entire VUE. */
> -   struct brw_vue_map vue_map;
> -   brw_compute_vue_map(&vue_map, intel, c.key.userclip_active, c.key.attrs);
> -   c.nr_regs = (vue_map.num_slots + 1)/2;
> +   brw_compute_vue_map(&c.vue_map, intel, c.key.userclip_active, 
> c.key.attrs);
> +   c.nr_regs = (c.vue_map.num_slots + 1)/2;
>  
> mem_ctx = NULL;
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.h 
> b/src/mesa/drivers/dri/i965/brw_gs.h
> index abcb0b2..ecab3ef 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.h
> +++ b/src/mesa/drivers/dri/i965/brw_gs.h
> @@ -66,6 +66,8 @@ struct brw_gs_compile {
>  
> /* Number of registers used to store vertex data */
> GLuint nr_regs;
> +
> +   struct brw_vue_map vue_map;
>  };
>  
>  #define ATTR_SIZE  (4*4)

Simple enough.  I noticed that you only use this in gen6_sol_program(),
so you could actually just pass it as an additional function parameter.
 That said, I imagine it will be useful elsewhere in the future, so this
seems sensible.

Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] i965 gen6: Turn on transform feedback extension.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> This patch advertises support for EXT_transform_feedback on Intel Gen6
> and higher.
> 
> Since transform feedback support is not completely finished yet, for
> now we only advertise support for it when MESA_GL_VERSION_OVERRIDE is
> 3.0 or greater (since transform feedback is required by GL version
> 3.0).
> ---
>  src/mesa/drivers/dri/intel/intel_extensions.c |7 +--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c 
> b/src/mesa/drivers/dri/intel/intel_extensions.c
> index 681f5f2..41ae29e 100644
> --- a/src/mesa/drivers/dri/intel/intel_extensions.c
> +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
> @@ -98,10 +98,13 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.OES_EGL_image = true;
>  #endif
>  
> -   if (intel->gen >= 6)
> +   if (intel->gen >= 6) {
>ctx->Const.GLSLVersion = 130;
> -   else
> +  if (override_version >= 30)
> + ctx->Extensions.EXT_transform_feedback = true;
> +   } else {
>ctx->Const.GLSLVersion = 120;
> +   }
> _mesa_override_glsl_version(ctx);
>  
> if (intel->gen >= 5)

I'd go ahead and add a new (intel->gen >= 6) block after
_mesa_override_glsl_version.  That way this block can be solely for
determining the GLSL version.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] i965 gen6+: Make intel_batchbuffer_emit_mi_flush() actually flush.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 03:35 PM, Paul Berry wrote:
> Previous to this patch, the function intel_batchbuffer_emit_mi_flush()
> was a bit of a misnomer.  On Gen4+, when not using the blit engine, it
> didn't actually flush the pipeline--it simply generated a
> _3DSTATE_PIPE_CONTROL command with the necessary bits set to flush GPU

It's actually just called "PIPE_CONTROL", never 3DSTATE_PIPE_CONTROL.

> caches.  This was usually sufficient, since in most situations where
> intel_batchbuffer_emit_mi_flush() wass called, all we really care

"... was called, all we really cared" (typo, tense)

This makes a lot of sense.  At some point, I think we ought to convert
some of the callers of this function to emit the proper PIPE_CONTROL
directly (with only the necessary bits and workarounds).  But this is
the right way to start.  Nice work figuring this out.

Reviewed-by: Kenneth Graunke 

> about was ensuring cache coherency.
> 
> However, with the advent of OpenGL 3.0, there are two cases in which
> data output by one stage of the pipeline might be consumed, in a later
> draw operation, by an earlier stage of the pipeline:
> 
> (a) When using textures in the vertex shader.
> 
> (b) When using drawing with a vertex buffer that was previously
> generated using transform feedback.
> 
> This patch addresses case (a) by changing
> intel_batchbuffer_emit_mi_flush() so that on Gen6+, it sets the
> PIPE_CONTROL_CS_STALL bit (this forces the pipeline to actually
> flush).  (Case (b) will be addressed by the next patch in the series).
> 
> This is not an ideal solution--in a perfect world, the driver would
> have some buffer dependency tracking so that we would only have to
> flush the pipeline in the two cases above.  Until that dependency
> tracking is implemented, however, it seems prudent to have
> intel_batchbuffer_emit_mi_flush() actually flush the pipeline, so that
> we get correct rendering, at the expense of a (hopefully small)
> performance hit.
> 
> The change is only applied to Gen6+, since at the moment only Gen6+
> supports the OpenGL 3.0 features that make a full pipeline flush
> necessary.
> ---
>  src/mesa/drivers/dri/intel/intel_batchbuffer.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c 
> b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> index 6991db8..4ff098a 100644
> --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> @@ -461,7 +461,8 @@ intel_batchbuffer_emit_mi_flush(struct intel_context 
> *intel)
>  PIPE_CONTROL_WRITE_FLUSH |
>  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>  PIPE_CONTROL_TC_FLUSH |
> -PIPE_CONTROL_NO_WRITE);
> +PIPE_CONTROL_NO_WRITE |
> +   PIPE_CONTROL_CS_STALL);
>OUT_BATCH(0); /* write address */
>OUT_BATCH(0); /* write data */
>ADVANCE_BATCH();

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] shaders for g3dvl compositor (was vdpau: Handle destination rectangles correctly)

2011-12-14 Thread Christian König

Hi Andy,

On 12.12.2011 19:15, Andy Furniss wrote:

Christian König wrote:


@Andy: Could you please confirm that they are also working? (They should
apply ontop of current master).


Yes - they are working OK for me.

Ok, so I committed them just a couple of minutes ago.



One thing I noticed while messing with mplayer subs/OSD is that vdpau 
in full screen positions them relative to the screen rather than the 
active picture, so they can appear over the black bars. This is the 
same with the previous patches, and also -vo gl does this.


Xvmc, xv and x11 all keep them in the active picture area - I don't 
know which, if any, is considered right or wrong behavior.


For XvMC that is a limitation of the API, since the XvMCBlendSubpicture 
and XvMCBlendSubpicture2 blend the subs/OSD into a surface and not into 
the target drawable you can't specify any position outside of the active 
video area (ok you can do that since the x/y positions are signed, but I 
won't rely on that this is working correctly).


I'm not sure about Xv and X11 output, but it at least looks like VDPAU 
and OpenGL is more advanced here and that this behavior should be preferred.


Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback stride in linker output.

2011-12-14 Thread Marek Olšák
The strides are useful, although they can be computed by summing up
all the NumComponents fields for each buffer. Note that I also take
into account ARB_transform_feedback3, which allows having interleaved
attribs in more than one buffer. For the EXT_transform_feedback case
and separate attribs, BufferStride is really equal to NumComponents.

I'd like to ask about something here.

There is yet another property which can be computed from the other
variables: the destination offsets. Assuming the outputs are supposed
to be interleaved in one buffer, the offset is always 0 for the first
output. The second output has an offset equal to
Outputs[0].NumComponents. The third output has an offset equal to
Outputs[0].NumComponents + Outputs[1].NumComponents, and so on. This
will become more important when we start supporting
ARB_transform_feedback3, which allows holes between attribs via
gl_SkipComponents{1,2,3,4}. It would be useful for Radeons too,
because r600g must already compute the offsets from NumComponents. So
the structure for ARB_transform_feedback3 would look like:

   unsigned NumComponents;
+  unsigned DstOffset;
} Outputs[MAX_PROGRAM_OUTPUTS];

Are you okay with this change as a way to express gl_SkipComponents?

Marek


On Wed, Dec 14, 2011 at 11:20 AM, Kenneth Graunke  wrote:
> On 12/13/2011 03:35 PM, Paul Berry wrote:
>> This patch adds the field gl_transform_feedback_info::BufferStride,
>> which records the total number of components (per vertex) that
>> transform feedback is being instructed to store in each buffer.  The
>> i965 gen6 back-end needs this information in order to set up binding
>> tables, and it seems better to have the linker provide this
>> information rather than force the back-end to recompute it.
>> ---
>>  src/glsl/linker.cpp    |    4 +++-
>>  src/mesa/main/mtypes.h |    5 +
>>  2 files changed, 8 insertions(+), 1 deletions(-)
>>
>> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
>> index b8a7126..5eb2a20 100644
>> --- a/src/glsl/linker.cpp
>> +++ b/src/glsl/linker.cpp
>> @@ -1599,6 +1599,7 @@ tfeedback_decl::store(struct gl_shader_program *prog,
>>        info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
>>        info->Outputs[info->NumOutputs].OutputBuffer = buffer;
>>        ++info->NumOutputs;
>> +      info->BufferStride[buffer] += this->vector_elements;
>>     }
>>     return true;
>>  }
>> @@ -1863,7 +1864,8 @@ store_tfeedback_info(struct gl_context *ctx, struct 
>> gl_shader_program *prog,
>>                       tfeedback_decl *tfeedback_decls)
>>  {
>>     unsigned total_tfeedback_components = 0;
>> -   prog->LinkedTransformFeedback.NumOutputs = 0;
>> +   memset(&prog->LinkedTransformFeedback, 0,
>> +          sizeof(prog->LinkedTransformFeedback));
>>     for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
>>        unsigned buffer =
>>           prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS ? i : 0;
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index 1934349..d4c600a 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -1822,6 +1822,11 @@ struct gl_transform_feedback_info {
>>        unsigned OutputBuffer;
>>        unsigned NumComponents;
>>     } Outputs[MAX_PROGRAM_OUTPUTS];
>> +
>> +   /**
>> +    * Number of components stored in each buffer.
>> +    */
>> +   unsigned BufferStride[MAX_FEEDBACK_ATTRIBS];
>>  };
>>
>>  /**
>
> I must confess that, looking at this structure alone, I wasn't sure what
> the difference between NumComponents (above) and BufferStride was.
>
> But this is definitely useful.  With or without improved comments:
> Reviewed-by: Kenneth Graunke 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] st/egl: Add support for EGL_NOK_swap_region

2011-12-14 Thread Fredrik Höglund
On Tuesday 13 December 2011, Chia-I Wu wrote:
> On Fri, Dec 9, 2011 at 11:36 PM, Fredrik Höglund  wrote:
> > Backends indicate that they support this extension by returning
> > EGL_TRUE when native_display::get_param() is called with
> > NATIVE_PARAM_PRESENT_REGION.
> >
> > native_present_control is extended to include the region that should
> > be presented. When the whole surface is to be presented, this region
> > will be a single rect containing the dimensions of the surface.
> >
> > Signed-off-by: Fredrik Höglund 
> > ---
> >  src/gallium/state_trackers/egl/common/egl_g3d.c|5 +++
> >  .../state_trackers/egl/common/egl_g3d_api.c|   31 
> > ++-
> >  src/gallium/state_trackers/egl/common/native.h |   12 +++-
> >  3 files changed, 45 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/egl/common/egl_g3d.c 
> > b/src/gallium/state_trackers/egl/common/egl_g3d.c
> > index 182ce68..feebfaf 100644
> > --- a/src/gallium/state_trackers/egl/common/egl_g3d.c
> > +++ b/src/gallium/state_trackers/egl/common/egl_g3d.c
> > @@ -606,6 +606,11 @@ egl_g3d_initialize(_EGLDriver *drv, _EGLDisplay *dpy)
> >   dpy->Extensions.WL_bind_wayland_display = EGL_TRUE;
> >  #endif
> >
> > +#ifdef EGL_NOK_swap_region
> > +   if (gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESENT_REGION))
> > +  dpy->Extensions.NOK_swap_region = EGL_TRUE;
> > +#endif
> > +
> Does EGL_NOK_swap_region require the contents of the back buffer to be
> preserved?  If so, NATIVE_PARAM_PRESERVE_BUFFER needs to be checked
> too.

Unfortunately the specification isn't available, or at least not anywhere
where google can find it. I think the safe thing to do for now is to assume
that the back buffer must be preserved. EGL_NV_post_sub_buffer also
requires this.

>>  static EGLBoolean
>> +egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
>> +{
>> +   struct egl_g3d_surface *gsurf = egl_g3d_surface(surf);
>> +   const EGLint rect[4] = { 0, 0, gsurf->base.Width, gsurf->base.Height };
>> +
>> +   return swap_buffers(drv, dpy, surf, 1, rect,
> "return swap_buffers(drv, dpy, surf, 0, NULL, ...);" seems simpler here.

It moves this logic into the backends, but on the other hand it makes
it easier for them to detect when the whole surface should be presented.
I'll update the patch to do this.

Regards,
Fredrik

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Paul Berry
On 14 December 2011 01:44, Kenneth Graunke  wrote:

> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > A common use case for transform feedback is to perform one draw
> > operation that writes transform feedback output to a buffer, followed
> > by a second draw operation that consumes that buffer as vertex input.
> > Since vertex input is consumed at an earlier pipeline stage than
> > writing transform feedback output, we need to flush the pipeline to
> > ensure that the transform feedback output is completely written before
> > the data is consumed.
> >
> > In an ideal world, we would do some dependency tracking, so that we
> > would only flush the pipeline if the next draw call was about to
> > consume data generated by a previous draw call in the same batch.
> > However, since we don't have that sort of dependency tracking
> > infrastructure right now, we just unconditionally flush the buffer
> > every time glEndTransformFeedback() is called.  This will cause a
> > performance hit compared to the ideal case (since we will sometimes
> > flush the pipeline unnecessarily), but fortunately the performance hit
> > will be confined to circumstances where transform feedback is in use.
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c |8 ++--
> >  src/mesa/drivers/dri/i965/brw_context.h |5 +
> >  src/mesa/drivers/dri/i965/gen6_sol.c|   16 
> >  3 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
> > index eb68bf4..f7b88c3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -52,6 +52,7 @@
> >
> >  #include "tnl/t_pipeline.h"
> >  #include "glsl/ralloc.h"
> > +#include "intel_batchbuffer.h"
> >
> >  /***
> >   * Mesa's Driver Functions
> > @@ -109,7 +110,8 @@ static void brwPrepareExecBegin(struct gl_context
> *ctx)
> > }
> >  }
> >
> > -static void brwInitDriverFunctions( struct dd_function_table *functions
> )
> > +static void brwInitDriverFunctions(struct intel_context *intel,
> > +   struct dd_function_table *functions)
> >  {
> > intelInitDriverFunctions( functions );
> >
> > @@ -117,6 +119,8 @@ static void brwInitDriverFunctions( struct
> dd_function_table *functions )
> > brw_init_queryobj_functions(functions);
> >
> > functions->PrepareExecBegin = brwPrepareExecBegin;
> > +   if (intel->gen == 6)
> > +  functions->EndTransformFeedback = gen6_end_transform_feedback;
>
> Unfortunately, this doesn't work: brwInitDriverFunctions is called
> before intelInitContext, so intel->gen isn't set yet.  So this function
> pointer never gets set, and gen6_end_transform_feedback never happens.
>
> I blithely tried moving the brwInitDriverFunctions call after
> intelInitContext, but that crashes.  Also, setting anything in
> "functions" after intelInitContext simply doesn't take; something
> must've copied the table (remapping code most likely).  In other words,
> a generation check just isn't going to happen.
>
> I think the best solution is to set the function pointer unconditionally
> and add
>
>   if (intel->gen < 6)
>  return;
>
> to the top of gen6_end_transform_feedback so it becomes a no-op on
> earlier platforms.
>
> Sorry for suggesting the generation check earlier; I didn't realize it
> was impossible.
>

Thanks for catching this. I blithely re-ran my tests after you made your
suggestion yesterday, and since there were no additional failures I figured
everything was ok.  It turns out that an additional flush was happening
that was masking the problem (the flush we do when reassigning GS URB slots
to the VS).  But that flush won't always happen.  I'm planning to do
another update to the transform feedback tests soon, so I'll try to catch
that case when I do.

On further reflection, adding "if (intel->gen < 6) return;" to the top of
gen6_end_transform_feedback() is unnecessary.  At the moment we don't
support transform feedback when gen < 6, so the function will never be
called on earlier platforms.  And if we do add transform feedback support
to gen < 6 in the future, we will need to do this flush.  We'll need to do
the flush on gen7 too.  So I'm just going to set the function pointer
unconditionally and rename gen6_end_transform_feedback() to
brw_end_transform_feedback().

Revised patch to follow.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Paul Berry
A common use case for transform feedback is to perform one draw
operation that writes transform feedback output to a buffer, followed
by a second draw operation that consumes that buffer as vertex input.
Since vertex input is consumed at an earlier pipeline stage than
writing transform feedback output, we need to flush the pipeline to
ensure that the transform feedback output is completely written before
the data is consumed.

In an ideal world, we would do some dependency tracking, so that we
would only flush the pipeline if the next draw call was about to
consume data generated by a previous draw call in the same batch.
However, since we don't have that sort of dependency tracking
infrastructure right now, we just unconditionally flush the buffer
every time glEndTransformFeedback() is called.  This will cause a
performance hit compared to the ideal case (since we will sometimes
flush the pipeline unnecessarily), but fortunately the performance hit
will be confined to circumstances where transform feedback is in use.
---
Revised based on Kenneth Graunke's comments from 14 December 2011 01:44.

 src/mesa/drivers/dri/i965/brw_context.c |7 +--
 src/mesa/drivers/dri/i965/brw_context.h |5 +
 src/mesa/drivers/dri/i965/gen6_sol.c|   16 
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index eb68bf4..080fd80 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -52,6 +52,7 @@
 
 #include "tnl/t_pipeline.h"
 #include "glsl/ralloc.h"
+#include "intel_batchbuffer.h"
 
 /***
  * Mesa's Driver Functions
@@ -109,7 +110,8 @@ static void brwPrepareExecBegin(struct gl_context *ctx)
}
 }
 
-static void brwInitDriverFunctions( struct dd_function_table *functions )
+static void brwInitDriverFunctions(struct intel_context *intel,
+   struct dd_function_table *functions)
 {
intelInitDriverFunctions( functions );
 
@@ -117,6 +119,7 @@ static void brwInitDriverFunctions( struct 
dd_function_table *functions )
brw_init_queryobj_functions(functions);
 
functions->PrepareExecBegin = brwPrepareExecBegin;
+   functions->EndTransformFeedback = brw_end_transform_feedback;
 }
 
 bool
@@ -136,7 +139,7 @@ brwCreateContext(int api,
   return false;
}
 
-   brwInitDriverFunctions( &functions );
+   brwInitDriverFunctions(intel, &functions);
 
if (!intelInitContext( intel, api, mesaVis, driContextPriv,
  sharedContextPrivate, &functions )) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index da1de02..8e52488 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1071,6 +1071,11 @@ void brw_init_surface_formats(struct brw_context *brw);
 bool
 brw_fprog_uses_noperspective(const struct gl_fragment_program *fprog);
 
+/* gen6_sol.c */
+void
+brw_end_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj);
+
 
 
 /*==
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index af372c1..3a64270 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -28,6 +28,7 @@
 
 #include "brw_context.h"
 #include "intel_buffer_objects.h"
+#include "intel_batchbuffer.h"
 #include "brw_defines.h"
 
 static void
@@ -100,3 +101,18 @@ const struct brw_tracked_state gen6_sol_surface = {
},
.emit = brw_update_sol_surfaces,
 };
+
+void
+brw_end_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj)
+{
+   /* After EndTransformFeedback, it's likely that the client program will try
+* to draw using the contents of the transform feedback buffer as vertex
+* input.  In order for this to work, we need to flush the data through at
+* least the GS stage of the pipeline, and flush out the render cache.  For
+* simplicity, just do a full flush.
+*/
+   struct brw_context *brw = brw_context(ctx);
+   struct intel_context *intel = &brw->intel;
+   intel_batchbuffer_emit_mi_flush(intel);
+}
-- 
1.7.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/8] i965 gen6+: Use 1-wide null operands for IF instructions

2011-12-14 Thread Paul Berry
On 14 December 2011 02:33, Kenneth Graunke  wrote:

> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > The Sandy Bridge PRM, volume 4, part 2, section 5.3.10 ("5.3.10
> > Register Region Restrictions") contains the following restriction on
> > the execution size and operand width of instructions:
> >
> >"3. ExecSize must be equal to or greater than Width."
> >
> > When emitting an IF instruction in single program flow mode on Gen6+,
> > we use an ExecSize of 1, therefore the Width of each operand must also
> > be 1.
> >
> > The operands of an IF instruction are not actually used for their
> > normal purpose on Gen6+ (which is probably the reason this wasn't
> > causing a GPU hang), but nonetheless it seems prudent to follow this
> > rule.
>
> I'm not actually sure what you're trying to say here.  The IF statement
> is different on Gen4-5, Gen6, and Gen7, so I'm unclear what "normal" is.
>

Tell you what, I'll just delete that last paragraph out of the commit
message.  It wasn't really adding much content anyway.


>
> I think the point is that the ARF null register doesn't have a value, so
> null<0,1,0> is just as good as null<4,4,1> or null<8,8,1>.  Either way
> you get nothing and write nothing.  So width is irrelevant other than
> dodging this assertion.
>
> If we weren't guessing execution size based on the register widths, I'd
> prefer to just change the null reg to have width 1.  But since we do,
> that would be a royal pain, and this is nice and simple.  I'm definitely
> in favor of this patch.
>

Agreed.  The whole "guessing execution size" thing has always troubled me,
but I haven't thought of a good alternative.


>
> Reviewed-by: Kenneth Graunke 
>
> > This patch unconditionally uses 1-wide null operands for Gen6+
> > IF instructions, rather than the standard null operand, which is 8
> > components wide.
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c |8 
> >  1 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index a46a81b..d48753c 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -941,11 +941,11 @@ brw_IF(struct brw_compile *p, GLuint execute_size)
> > } else if (intel->gen == 6) {
> >brw_set_dest(p, insn, brw_imm_w(0));
> >insn->bits1.branch_gen6.jump_count = 0;
> > -  brw_set_src0(p, insn, retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D));
> > -  brw_set_src1(p, insn, retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D));
> > +  brw_set_src0(p, insn, vec1(retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D)));
> > +  brw_set_src1(p, insn, vec1(retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D)));
> > } else {
> > -  brw_set_dest(p, insn, retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D));
> > -  brw_set_src0(p, insn, retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D));
> > +  brw_set_dest(p, insn, vec1(retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D)));
> > +  brw_set_src0(p, insn, vec1(retype(brw_null_reg(),
> BRW_REGISTER_TYPE_D)));
> >brw_set_src1(p, insn, brw_imm_ud(0));
> >insn->bits3.break_cont.jip = 0;
> >insn->bits3.break_cont.uip = 0;
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] i965 gen6: Turn on transform feedback extension.

2011-12-14 Thread Paul Berry
On 14 December 2011 02:51, Kenneth Graunke  wrote:

> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > This patch advertises support for EXT_transform_feedback on Intel Gen6
> > and higher.
> >
> > Since transform feedback support is not completely finished yet, for
> > now we only advertise support for it when MESA_GL_VERSION_OVERRIDE is
> > 3.0 or greater (since transform feedback is required by GL version
> > 3.0).
> > ---
> >  src/mesa/drivers/dri/intel/intel_extensions.c |7 +--
> >  1 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c
> b/src/mesa/drivers/dri/intel/intel_extensions.c
> > index 681f5f2..41ae29e 100644
> > --- a/src/mesa/drivers/dri/intel/intel_extensions.c
> > +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
> > @@ -98,10 +98,13 @@ intelInitExtensions(struct gl_context *ctx)
> > ctx->Extensions.OES_EGL_image = true;
> >  #endif
> >
> > -   if (intel->gen >= 6)
> > +   if (intel->gen >= 6) {
> >ctx->Const.GLSLVersion = 130;
> > -   else
> > +  if (override_version >= 30)
> > + ctx->Extensions.EXT_transform_feedback = true;
> > +   } else {
> >ctx->Const.GLSLVersion = 120;
> > +   }
> > _mesa_override_glsl_version(ctx);
> >
> > if (intel->gen >= 5)
>
> I'd go ahead and add a new (intel->gen >= 6) block after
> _mesa_override_glsl_version.  That way this block can be solely for
> determining the GLSL version.
>

That seems sensible.  In fact, since we haven't started transform feedback
work for gen7, I'll condition the new block on (intel->gen == 6 &&
override_version >= 30) for now, so that those of us with gen7 prototypes
get less confusing output from our piglit runs.

With that change, can I take this as a "reviewed-by"?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] i965 gen6+: Make intel_batchbuffer_emit_mi_flush() actually flush.

2011-12-14 Thread Paul Berry
On 14 December 2011 02:59, Kenneth Graunke  wrote:

> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > Previous to this patch, the function intel_batchbuffer_emit_mi_flush()
> > was a bit of a misnomer.  On Gen4+, when not using the blit engine, it
> > didn't actually flush the pipeline--it simply generated a
> > _3DSTATE_PIPE_CONTROL command with the necessary bits set to flush GPU
>
> It's actually just called "PIPE_CONTROL", never 3DSTATE_PIPE_CONTROL.
>

(Checks the docs).  Hmm, you're right.  For some reason we call it
_3DSTATE_PIPE_CONTROL in our #defines (see intel_reg.h).  Still, it seems
better for the commit message to match the documentation.  I'll change the
commit message.


>
> > caches.  This was usually sufficient, since in most situations where
> > intel_batchbuffer_emit_mi_flush() wass called, all we really care
>
> "... was called, all we really cared" (typo, tense)
>

Heh, oops.


>
> This makes a lot of sense.  At some point, I think we ought to convert
> some of the callers of this function to emit the proper PIPE_CONTROL
> directly (with only the necessary bits and workarounds).  But this is
> the right way to start.  Nice work figuring this out.
>

Agreed.  As my CS professor Bob Keller once said, "get it right first, then
make it fast."


>
> Reviewed-by: Kenneth Graunke 
>
> > about was ensuring cache coherency.
> >
> > However, with the advent of OpenGL 3.0, there are two cases in which
> > data output by one stage of the pipeline might be consumed, in a later
> > draw operation, by an earlier stage of the pipeline:
> >
> > (a) When using textures in the vertex shader.
> >
> > (b) When using drawing with a vertex buffer that was previously
> > generated using transform feedback.
> >
> > This patch addresses case (a) by changing
> > intel_batchbuffer_emit_mi_flush() so that on Gen6+, it sets the
> > PIPE_CONTROL_CS_STALL bit (this forces the pipeline to actually
> > flush).  (Case (b) will be addressed by the next patch in the series).
> >
> > This is not an ideal solution--in a perfect world, the driver would
> > have some buffer dependency tracking so that we would only have to
> > flush the pipeline in the two cases above.  Until that dependency
> > tracking is implemented, however, it seems prudent to have
> > intel_batchbuffer_emit_mi_flush() actually flush the pipeline, so that
> > we get correct rendering, at the expense of a (hopefully small)
> > performance hit.
> >
> > The change is only applied to Gen6+, since at the moment only Gen6+
> > supports the OpenGL 3.0 features that make a full pipeline flush
> > necessary.
> > ---
> >  src/mesa/drivers/dri/intel/intel_batchbuffer.c |3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> > index 6991db8..4ff098a 100644
> > --- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> > +++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
> > @@ -461,7 +461,8 @@ intel_batchbuffer_emit_mi_flush(struct intel_context
> *intel)
> >  PIPE_CONTROL_WRITE_FLUSH |
> >  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
> >  PIPE_CONTROL_TC_FLUSH |
> > -PIPE_CONTROL_NO_WRITE);
> > +PIPE_CONTROL_NO_WRITE |
> > +   PIPE_CONTROL_CS_STALL);
> >OUT_BATCH(0); /* write address */
> >OUT_BATCH(0); /* write data */
> >ADVANCE_BATCH();
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback stride in linker output.

2011-12-14 Thread Paul Berry
On 13 December 2011 15:53, Brian Paul  wrote:

> On Tue, Dec 13, 2011 at 4:35 PM, Paul Berry 
> wrote:
> > This patch adds the field gl_transform_feedback_info::BufferStride,
> > which records the total number of components (per vertex) that
> > transform feedback is being instructed to store in each buffer.  The
> > i965 gen6 back-end needs this information in order to set up binding
> > tables, and it seems better to have the linker provide this
> > information rather than force the back-end to recompute it.
> > ---
> >  src/glsl/linker.cpp|4 +++-
> >  src/mesa/main/mtypes.h |5 +
> >  2 files changed, 8 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> > index b8a7126..5eb2a20 100644
> > --- a/src/glsl/linker.cpp
> > +++ b/src/glsl/linker.cpp
> > @@ -1599,6 +1599,7 @@ tfeedback_decl::store(struct gl_shader_program
> *prog,
> >   info->Outputs[info->NumOutputs].NumComponents =
> this->vector_elements;
> >   info->Outputs[info->NumOutputs].OutputBuffer = buffer;
> >   ++info->NumOutputs;
> > +  info->BufferStride[buffer] += this->vector_elements;
> >}
> >return true;
> >  }
> > @@ -1863,7 +1864,8 @@ store_tfeedback_info(struct gl_context *ctx,
> struct gl_shader_program *prog,
> >  tfeedback_decl *tfeedback_decls)
> >  {
> >unsigned total_tfeedback_components = 0;
> > -   prog->LinkedTransformFeedback.NumOutputs = 0;
> > +   memset(&prog->LinkedTransformFeedback, 0,
> > +  sizeof(prog->LinkedTransformFeedback));
> >for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
> >   unsigned buffer =
> >  prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS ? i :
> 0;
> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> > index 1934349..d4c600a 100644
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -1822,6 +1822,11 @@ struct gl_transform_feedback_info {
> >   unsigned OutputBuffer;
> >   unsigned NumComponents;
> >} Outputs[MAX_PROGRAM_OUTPUTS];
> > +
> > +   /**
> > +* Number of components stored in each buffer.
> > +*/
> > +   unsigned BufferStride[MAX_FEEDBACK_ATTRIBS];
> >  };
>
> Minor nit: I'd use a single line comment there: /** Number of
> components... */ to be consistent with other comments.
>

Good point.  Considering Ken's confusion, though, I think I'll expand the
comment to make it clearer what's going on.


>
> The rest looks OK to me otherwise.  Reviewed-by: Brian Paul
>   But if you have any doubts, another one of the
> GLSL people should probably review the rest too.
>
> -Brian
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback stride in linker output.

2011-12-14 Thread Paul Berry
On 14 December 2011 05:48, Marek Olšák  wrote:

> The strides are useful, although they can be computed by summing up
> all the NumComponents fields for each buffer. Note that I also take
> into account ARB_transform_feedback3, which allows having interleaved
> attribs in more than one buffer. For the EXT_transform_feedback case
> and separate attribs, BufferStride is really equal to NumComponents.
>
> I'd like to ask about something here.
>
> There is yet another property which can be computed from the other
> variables: the destination offsets. Assuming the outputs are supposed
> to be interleaved in one buffer, the offset is always 0 for the first
> output. The second output has an offset equal to
> Outputs[0].NumComponents. The third output has an offset equal to
> Outputs[0].NumComponents + Outputs[1].NumComponents, and so on. This
> will become more important when we start supporting
> ARB_transform_feedback3, which allows holes between attribs via
> gl_SkipComponents{1,2,3,4}. It would be useful for Radeons too,
> because r600g must already compute the offsets from NumComponents. So
> the structure for ARB_transform_feedback3 would look like:
>
>   unsigned NumComponents;
> +  unsigned DstOffset;
>} Outputs[MAX_PROGRAM_OUTPUTS];
>
> Are you okay with this change as a way to express gl_SkipComponents?
>

Yes, I like this idea very much.  I'll work on it this morning and post a
follow-up patch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback strides/offsets in linker output.

2011-12-14 Thread Paul Berry
This patch adds two new fields to the gl_transform_feedback_info
struct:

- BufferStride records the total number of components (per vertex)
  that transform feedback is being instructed to store in each buffer.

- Outputs[i].DstOffset records the offset within the interleaved
  structure of each transform feedback output.

These values are needed by the i965 gen6 and r600g back-ends, so it
seems better to have the linker provide them rather than force each
back-end to compute them independently.

Also, DstOffset helps pave the way for supporting
ARB_transform_feedback3, which allows the transform feedback output to
contain holes between attributes by specifying
gl_SkipComponents{1,2,3,4} as the varying name.
---
 src/glsl/linker.cpp|5 -
 src/mesa/main/mtypes.h |   10 ++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index b8a7126..6587008 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1598,7 +1598,9 @@ tfeedback_decl::store(struct gl_shader_program *prog,
   info->Outputs[info->NumOutputs].OutputRegister = this->location + v;
   info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
   info->Outputs[info->NumOutputs].OutputBuffer = buffer;
+  info->Outputs[info->NumOutputs].DstOffset = info->BufferStride[buffer];
   ++info->NumOutputs;
+  info->BufferStride[buffer] += this->vector_elements;
}
return true;
 }
@@ -1863,7 +1865,8 @@ store_tfeedback_info(struct gl_context *ctx, struct 
gl_shader_program *prog,
  tfeedback_decl *tfeedback_decls)
 {
unsigned total_tfeedback_components = 0;
-   prog->LinkedTransformFeedback.NumOutputs = 0;
+   memset(&prog->LinkedTransformFeedback, 0,
+  sizeof(prog->LinkedTransformFeedback));
for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
   unsigned buffer =
  prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS ? i : 0;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1934349..e18c5f8 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1821,7 +1821,17 @@ struct gl_transform_feedback_info {
   unsigned OutputRegister;
   unsigned OutputBuffer;
   unsigned NumComponents;
+
+  /** offset (in DWORDs) of this output within the interleaved structure */
+  unsigned DstOffset;
} Outputs[MAX_PROGRAM_OUTPUTS];
+
+   /**
+* Total number of components stored in each buffer.  This may be used by
+* hardware back-ends to determine the correct stride when interleaving
+* multiple transform feedback outputs in the same buffer.
+*/
+   unsigned BufferStride[MAX_FEEDBACK_ATTRIBS];
 };
 
 /**
-- 
1.7.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa, gallium: add a cap for GPUs without unified color+generic varying slots

2011-12-14 Thread Ian Romanick

On 12/13/2011 04:32 PM, Marek Olšák wrote:

---
  src/gallium/drivers/r300/r300_screen.c |3 ++-
  src/gallium/include/pipe/p_defines.h   |3 ++-
  src/glsl/linker.cpp|6 ++
  src/mesa/main/mtypes.h |9 +
  src/mesa/state_tracker/st_extensions.c |3 +++
  5 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_screen.c 
b/src/gallium/drivers/r300/r300_screen.c
index 0bae065..a761939 100644
--- a/src/gallium/drivers/r300/r300_screen.c
+++ b/src/gallium/drivers/r300/r300_screen.c
@@ -102,6 +102,7 @@ static int r300_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
  case PIPE_CAP_TEXTURE_BARRIER:
  case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
  case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
+case PIPE_CAP_SEPARATE_COLOR_VARYINGS:
  return 1;

  /* r300 cannot do swizzling of compressed textures. Supported 
otherwise. */
@@ -182,7 +183,7 @@ static int r300_get_shader_param(struct pipe_screen 
*pscreen, unsigned shader, e
   * R500 has the ability to turn 3rd and 4th color into
   * additional texcoords but there is no two-sided color
   * selection then. However the facing bit can be used instead. */
-return 10;
+return 8;


Why are you trying to make r300g behave differently the every other 
driver that has existed for that same hardware?  This doesn't make any 
sense.  This is not Apple's driver works for r300 hardware, and it's not 
how AMD's Windows driver works (worked, anyway) for r300 hardware.


This feels like an unnecessary hack.


  case PIPE_SHADER_CAP_MAX_CONSTS:
  return is_r500 ? 256 : 32;
  case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index 30f1d7f..5229c5f 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -467,7 +467,8 @@ enum pipe_cap {
 PIPE_CAP_CONDITIONAL_RENDER = 52,
 PIPE_CAP_TEXTURE_BARRIER = 53,
 PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS = 54, /* temporary */
-   PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS = 55 /* temporary */
+   PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS = 55, /* temporary */
+   PIPE_CAP_SEPARATE_COLOR_VARYINGS = 56
  };

  /**
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index b8a7126..e9298bb 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1805,6 +1805,12 @@ assign_varying_locations(struct gl_context *ctx,
   */
  var->mode = ir_var_auto;
   } else {
+if (ctx->Const.GLSLSeparateColorVaryings&&
+(var->location == FRAG_ATTRIB_COL0 ||
+ var->location == FRAG_ATTRIB_COL1)) {
+   continue;
+}
+
  /* The packing rules are used for vertex shader inputs are also
   * used for fragment shader inputs.
   */
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1934349..9e9ad83 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2839,6 +2839,15 @@ struct gl_constants
  */
 GLboolean GLSLSkipStrictMaxVaryingLimitCheck;
 GLboolean GLSLSkipStrictMaxUniformLimitCheck;
+
+   /**
+* Whether the color varyings do not share varying slots with generic
+* varyings. In such a case, the driver must not include the color
+* varyings in the maximum number of varyings limit. In return,
+* the GLSL linker will not count the color varyings to the number of
+* used varying components.
+*/
+   GLboolean GLSLSeparateColorVaryings;
  };


diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 457d5d6..7a7919f 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -228,6 +228,9 @@ void st_init_limits(struct st_context *st)

 c->GLSLSkipStrictMaxVaryingLimitCheck =
screen->get_param(screen, PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS);
+
+   c->GLSLSeparateColorVaryings =
+  screen->get_param(screen, PIPE_CAP_SEPARATE_COLOR_VARYINGS);
  }



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/8] i965 gen6+: Use 1-wide null operands for IF instructions

2011-12-14 Thread Eric Anholt
On Wed, 14 Dec 2011 08:06:20 -0800, Paul Berry  wrote:
> On 14 December 2011 02:33, Kenneth Graunke  wrote:
> > I think the point is that the ARF null register doesn't have a value, so
> > null<0,1,0> is just as good as null<4,4,1> or null<8,8,1>.  Either way
> > you get nothing and write nothing.  So width is irrelevant other than
> > dodging this assertion.
> >
> > If we weren't guessing execution size based on the register widths, I'd
> > prefer to just change the null reg to have width 1.  But since we do,
> > that would be a royal pain, and this is nice and simple.  I'm definitely
> > in favor of this patch.
> >
> 
> Agreed.  The whole "guessing execution size" thing has always troubled me,
> but I haven't thought of a good alternative.

Me too.


pgpgwjuzrT6uI.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/8] i965 gen6+: Make intel_batchbuffer_emit_mi_flush() actually flush.

2011-12-14 Thread Kenneth Graunke
On 12/14/2011 08:28 AM, Paul Berry wrote:
> On 14 December 2011 02:59, Kenneth Graunke  > wrote:
> 
> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > Previous to this patch, the function intel_batchbuffer_emit_mi_flush()
> > was a bit of a misnomer.  On Gen4+, when not using the blit engine, it
> > didn't actually flush the pipeline--it simply generated a
> > _3DSTATE_PIPE_CONTROL command with the necessary bits set to flush GPU
> 
> It's actually just called "PIPE_CONTROL", never 3DSTATE_PIPE_CONTROL.
> 
> 
> (Checks the docs).  Hmm, you're right.  For some reason we call it
> _3DSTATE_PIPE_CONTROL in our #defines (see intel_reg.h).  Still, it
> seems better for the commit message to match the documentation.  I'll
> change the commit message.

We also call it CMD_PIPE_CONTROL (see brw_defines.h).  Nice, huh?

I'm always hugely skeptical of intel_reg.h because it contains a lot of
i8xx and i915 defines that sometimes have similar names to i965 stuff
but isn't the same.

In this case, _3DSTATE_PIPE_CONTROL == CMD_PIPE_CONTROL | 2 (i.e. with
length included).  Not a huge fan of including the length in the #define
either.

But I'm digressing :)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] i965 gen6: Turn on transform feedback extension.

2011-12-14 Thread Kenneth Graunke
On 12/14/2011 08:20 AM, Paul Berry wrote:
> On 14 December 2011 02:51, Kenneth Graunke  > wrote:
> 
> On 12/13/2011 03:35 PM, Paul Berry wrote:
> > This patch advertises support for EXT_transform_feedback on Intel Gen6
> > and higher.
> >
> > Since transform feedback support is not completely finished yet, for
> > now we only advertise support for it when MESA_GL_VERSION_OVERRIDE is
> > 3.0 or greater (since transform feedback is required by GL version
> > 3.0).
> > ---
> >  src/mesa/drivers/dri/intel/intel_extensions.c |7 +--
> >  1 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c
> b/src/mesa/drivers/dri/intel/intel_extensions.c
> > index 681f5f2..41ae29e 100644
> > --- a/src/mesa/drivers/dri/intel/intel_extensions.c
> > +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
> > @@ -98,10 +98,13 @@ intelInitExtensions(struct gl_context *ctx)
> > ctx->Extensions.OES_EGL_image = true;
> >  #endif
> >
> > -   if (intel->gen >= 6)
> > +   if (intel->gen >= 6) {
> >ctx->Const.GLSLVersion = 130;
> > -   else
> > +  if (override_version >= 30)
> > + ctx->Extensions.EXT_transform_feedback = true;
> > +   } else {
> >ctx->Const.GLSLVersion = 120;
> > +   }
> > _mesa_override_glsl_version(ctx);
> >
> > if (intel->gen >= 5)
> 
> I'd go ahead and add a new (intel->gen >= 6) block after
> _mesa_override_glsl_version.  That way this block can be solely for
> determining the GLSL version.
> 
> 
> That seems sensible.  In fact, since we haven't started transform
> feedback work for gen7, I'll condition the new block on (intel->gen == 6
> && override_version >= 30) for now, so that those of us with gen7
> prototypes get less confusing output from our piglit runs.
> 
> With that change, can I take this as a "reviewed-by"?

Yes, of course.  Thanks Paul.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Kenneth Graunke
From: Paul Berry 

A common use case for transform feedback is to perform one draw
operation that writes transform feedback output to a buffer, followed
by a second draw operation that consumes that buffer as vertex input.
Since vertex input is consumed at an earlier pipeline stage than
writing transform feedback output, we need to flush the pipeline to
ensure that the transform feedback output is completely written before
the data is consumed.

In an ideal world, we would do some dependency tracking, so that we
would only flush the pipeline if the next draw call was about to
consume data generated by a previous draw call in the same batch.
However, since we don't have that sort of dependency tracking
infrastructure right now, we just unconditionally flush the buffer
every time glEndTransformFeedback() is called.  This will cause a
performance hit compared to the ideal case (since we will sometimes
flush the pipeline unnecessarily), but fortunately the performance hit
will be confined to circumstances where transform feedback is in use.
---
 src/mesa/drivers/dri/i965/brw_context.c |2 ++
 src/mesa/drivers/dri/i965/brw_context.h |5 +
 src/mesa/drivers/dri/i965/gen6_sol.c|   19 +++
 3 files changed, 26 insertions(+), 0 deletions(-)

I was thinking more like this.

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index eb68bf4..be230de 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -52,6 +52,7 @@
 
 #include "tnl/t_pipeline.h"
 #include "glsl/ralloc.h"
+#include "intel_batchbuffer.h"
 
 /***
  * Mesa's Driver Functions
@@ -117,6 +118,7 @@ static void brwInitDriverFunctions( struct 
dd_function_table *functions )
brw_init_queryobj_functions(functions);
 
functions->PrepareExecBegin = brwPrepareExecBegin;
+   functions->EndTransformFeedback = gen6_end_transform_feedback;
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index da1de02..fa6c883 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1071,6 +1071,11 @@ void brw_init_surface_formats(struct brw_context *brw);
 bool
 brw_fprog_uses_noperspective(const struct gl_fragment_program *fprog);
 
+/* gen6_sol.c */
+void
+gen6_end_transform_feedback(struct gl_context *ctx,
+struct gl_transform_feedback_object *obj);
+
 
 
 /*==
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index af372c1..1f06fa2 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -28,6 +28,7 @@
 
 #include "brw_context.h"
 #include "intel_buffer_objects.h"
+#include "intel_batchbuffer.h"
 #include "brw_defines.h"
 
 static void
@@ -100,3 +101,21 @@ const struct brw_tracked_state gen6_sol_surface = {
},
.emit = brw_update_sol_surfaces,
 };
+
+void
+gen6_end_transform_feedback(struct gl_context *ctx,
+struct gl_transform_feedback_object *obj)
+{
+   struct intel_context *intel = intel_context(ctx);
+
+   if (intel->gen < 6)
+  return;
+
+   /* After EndTransformFeedback, it's likely that the client program will try
+* to draw using the contents of the transform feedback buffer as vertex
+* input.  In order for this to work, we need to flush the data through at
+* least the GS stage of the pipeline, and flush out the render cache.  For
+* simplicity, just do a full flush.
+*/
+   intel_batchbuffer_emit_mi_flush(intel);
+}
-- 
1.7.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback stride in linker output.

2011-12-14 Thread Kenneth Graunke
On 12/14/2011 05:48 AM, Marek Olšák wrote:
> The strides are useful, although they can be computed by summing up
> all the NumComponents fields for each buffer. Note that I also take
> into account ARB_transform_feedback3, which allows having interleaved
> attribs in more than one buffer. For the EXT_transform_feedback case
> and separate attribs, BufferStride is really equal to NumComponents.
> 
> I'd like to ask about something here.
> 
> There is yet another property which can be computed from the other
> variables: the destination offsets. Assuming the outputs are supposed
> to be interleaved in one buffer, the offset is always 0 for the first
> output. The second output has an offset equal to
> Outputs[0].NumComponents. The third output has an offset equal to
> Outputs[0].NumComponents + Outputs[1].NumComponents, and so on. This
> will become more important when we start supporting
> ARB_transform_feedback3, which allows holes between attribs via
> gl_SkipComponents{1,2,3,4}. It would be useful for Radeons too,
> because r600g must already compute the offsets from NumComponents. So
> the structure for ARB_transform_feedback3 would look like:
> 
>unsigned NumComponents;
> +  unsigned DstOffset;
> } Outputs[MAX_PROGRAM_OUTPUTS];
> 
> Are you okay with this change as a way to express gl_SkipComponents?
> 
> Marek

That sounds good to me.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Paul Berry
Previously, we were setting the _NEW_TRANSFORM bit when enabling or
disabling RASTERIZER_DISCARD.  This is incorrect, since _NEW_TRANSFORM
flags changes to ctx->Transform, but the rasterizer discard flag is in
ctx->TransformFeedback.  This patch sets the correct bit,
_NEW_TRANSFORM_FEEDBACK.
---
 src/mesa/main/enable.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index 6461ac1..15a2305 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -890,7 +890,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
GLboolean state)
   case GL_RASTERIZER_DISCARD:
 CHECK_EXTENSION(EXT_transform_feedback, cap);
  if (ctx->TransformFeedback.RasterDiscard != state) {
-FLUSH_VERTICES(ctx, _NEW_TRANSFORM);
+FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
 ctx->TransformFeedback.RasterDiscard = state;
  }
  break;
-- 
1.7.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965 gen6: Implement rasterizer discard.

2011-12-14 Thread Paul Berry
This patch enables rasterizer discard functionality (a part of
transform feedback) in Gen6, by generating an alternate GS program
when rasterizer discard is active.  Instead of forwarding vertices
down the pipeline, the alternate GS program uses a URB Write message
to deallocate the URB entry that was allocated by FF sync and
terminate the thread.

Note: parts of the Sandy Bridge PRM seem to imply that we could do
this more efficiently, by clearing the GEN6_GS_RENDERING_ENABLE bit,
and not allocating a URB entry at all.  However, it's not clear how we
are supposed to terminate the thread if we do that.  Volume 2 part 1,
section 4.5.4, says "GS threads must terminate by sending a URB_WRITE
message with the EOT and Complete bits set.", and my experiments so
far confirm that.
---

This patch needs to be applied on top of the series "[PATCH 0/8] i965
gen6: Initial implementation of transform feedback.", which is still
under review on the mailing list.

 src/mesa/drivers/dri/i965/brw_gs.c  |6 ++
 src/mesa/drivers/dri/i965/brw_gs.h  |1 +
 src/mesa/drivers/dri/i965/brw_gs_emit.c |   30 ++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index c323a73..2b51a4f 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -208,6 +208,12 @@ static void populate_key( struct brw_context *brw,
linked_xfb_info->Outputs[i].OutputRegister;
  }
   }
+  /* On Gen6, GS is also used for rasterizer discard. */
+  /* _NEW_TRANSFORM_FEEDBACK */
+  if (ctx->TransformFeedback.RasterDiscard) {
+ key->need_gs_prog = true;
+ key->rasterizer_discard = true;
+  }
} else {
   /* Pre-gen6, GS is used to transform QUADLIST, QUADSTRIP, and LINELOOP
* into simpler primitives.
diff --git a/src/mesa/drivers/dri/i965/brw_gs.h 
b/src/mesa/drivers/dri/i965/brw_gs.h
index 33d8d7a..7bf2248 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.h
+++ b/src/mesa/drivers/dri/i965/brw_gs.h
@@ -50,6 +50,7 @@ struct brw_gs_prog_key {
GLuint pv_first:1;
GLuint need_gs_prog:1;
GLuint userclip_active:1;
+   GLuint rasterizer_discard:1;
 
/**
 * Number of varyings that are output to transform feedback.
diff --git a/src/mesa/drivers/dri/i965/brw_gs_emit.c 
b/src/mesa/drivers/dri/i965/brw_gs_emit.c
index 72d4eca..586563a 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_emit.c
@@ -208,6 +208,28 @@ static void brw_gs_emit_vue(struct brw_gs_compile *c,
 }
 
 /**
+ * De-allocate the URB entry that was previously allocated to this thread
+ * (without writing any vertex data to it), and terminate the thread.  This is
+ * used to implement RASTERIZER_DISCARD functionality.
+ */
+static void brw_gs_terminate(struct brw_gs_compile *c)
+{
+   struct brw_compile *p = &c->func;
+   brw_urb_WRITE(p,
+ retype(brw_null_reg(), BRW_REGISTER_TYPE_UD), /* dest */
+ 0, /* msg_reg_nr */
+ c->reg.header, /* src0 */
+ false, /* allocate */
+ false, /* used */
+ 1, /* msg_length */
+ 0, /* response_length */
+ true, /* eot */
+ true, /* writes_complete */
+ 0, /* offset */
+ BRW_URB_SWIZZLE_NONE);
+}
+
+/**
  * Send an FF_SYNC message to ensure that all previously spawned GS threads
  * have finished sending primitives down the pipeline, and to allocate a URB
  * entry for the first output vertex.  Only needed when intel->needs_ff_sync
@@ -414,6 +436,14 @@ gen6_sol_program(struct brw_gs_compile *c, struct 
brw_gs_prog_key *key,
 
brw_gs_ff_sync(c, 1);
 
+   /* If RASTERIZER_DISCARD is enabled, we have nothing further to do, so
+* release the URB that was just allocated, and terminate the thread.
+*/
+   if (key->rasterizer_discard) {
+  brw_gs_terminate(c);
+  return;
+   }
+
brw_gs_overwrite_header_dw2_from_r0(c, num_verts);
switch (num_verts) {
case 1:
-- 
1.7.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] st/egl: Add support for EGL_NOK_swap_region

2011-12-14 Thread Fredrik Höglund
Backends indicate that they support this extension by returning
EGL_TRUE when native_display::get_param() is called with
NATIVE_PARAM_PRESENT_REGION and NATIVE_PARAM_PRESERVE_BUFFER.

native_present_control is extended to include the region that should
be presented. When native_present_control::num_rects is zero,
the whole surface is to be presented.

Signed-off-by: Fredrik Höglund 
---
 src/gallium/state_trackers/egl/common/egl_g3d.c|6 
 .../state_trackers/egl/common/egl_g3d_api.c|   30 ++-
 src/gallium/state_trackers/egl/common/native.h |   13 -
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/egl/common/egl_g3d.c 
b/src/gallium/state_trackers/egl/common/egl_g3d.c
index 182ce68..53811b8 100644
--- a/src/gallium/state_trackers/egl/common/egl_g3d.c
+++ b/src/gallium/state_trackers/egl/common/egl_g3d.c
@@ -606,6 +606,12 @@ egl_g3d_initialize(_EGLDriver *drv, _EGLDisplay *dpy)
   dpy->Extensions.WL_bind_wayland_display = EGL_TRUE;
 #endif
 
+#ifdef EGL_NOK_swap_region
+   if (gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESENT_REGION) &&
+   gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESERVE_BUFFER))
+  dpy->Extensions.NOK_swap_region = EGL_TRUE;
+#endif
+
if (egl_g3d_add_configs(drv, dpy, 1) == 1) {
   _eglError(EGL_NOT_INITIALIZED, "eglInitialize(unable to add configs)");
   goto fail;
diff --git a/src/gallium/state_trackers/egl/common/egl_g3d_api.c 
b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
index 911540e..2e3ead6 100644
--- a/src/gallium/state_trackers/egl/common/egl_g3d_api.c
+++ b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
@@ -546,7 +546,8 @@ egl_g3d_make_current(_EGLDriver *drv, _EGLDisplay *dpy,
 }
 
 static EGLBoolean
-egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
+swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
+ EGLint num_rects, const EGLint *rects, EGLBoolean preserve)
 {
struct egl_g3d_surface *gsurf = egl_g3d_surface(surf);
_EGLContext *ctx = _eglGetCurrentContext();
@@ -572,14 +573,35 @@ egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSurface *surf)
 
memset(&ctrl, 0, sizeof(ctrl));
ctrl.natt = NATIVE_ATTACHMENT_BACK_LEFT;
-   ctrl.preserve = (gsurf->base.SwapBehavior == EGL_BUFFER_PRESERVED);
+   ctrl.preserve = preserve;
ctrl.swap_interval = gsurf->base.SwapInterval;
ctrl.premultiplied_alpha = (gsurf->base.VGAlphaFormat == 
EGL_VG_ALPHA_FORMAT_PRE);
+   ctrl.num_rects = num_rects;
+   ctrl.rects = rects;
 
return gsurf->native->present(gsurf->native, &ctrl);
 }
 
 static EGLBoolean
+egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
+{
+   struct egl_g3d_surface *gsurf = egl_g3d_surface(surf);
+
+   return swap_buffers(drv, dpy, surf, 0, NULL,
+   (gsurf->base.SwapBehavior == EGL_BUFFER_PRESERVED));
+}
+
+#ifdef EGL_NOK_swap_region
+static EGLBoolean
+egl_g3d_swap_buffers_region(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface 
*surf,
+EGLint num_rects, const EGLint *rects)
+{
+   /* Note: y=0=top */
+   return swap_buffers(drv, dpy, surf, num_rects, rects, EGL_TRUE);
+}
+#endif /* EGL_NOK_swap_region */
+
+static EGLBoolean
 egl_g3d_copy_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
  EGLNativePixmapType target)
 {
@@ -867,4 +889,8 @@ egl_g3d_init_driver_api(_EGLDriver *drv)
drv->API.CreateScreenSurfaceMESA = egl_g3d_create_screen_surface;
drv->API.ShowScreenSurfaceMESA = egl_g3d_show_screen_surface;
 #endif
+
+#ifdef EGL_NOK_swap_region
+   drv->API.SwapBuffersRegionNOK = egl_g3d_swap_buffers_region;
+#endif
 }
diff --git a/src/gallium/state_trackers/egl/common/native.h 
b/src/gallium/state_trackers/egl/common/native.h
index ee24c22..312b079 100644
--- a/src/gallium/state_trackers/egl/common/native.h
+++ b/src/gallium/state_trackers/egl/common/native.h
@@ -81,7 +81,13 @@ enum native_param_type {
 * EGL_ALPHA_SIZE.  EGL_VG_ALPHA_FORMAT attribute of a surface will affect
 * how the surface is presented.
 */
-   NATIVE_PARAM_PREMULTIPLIED_ALPHA
+   NATIVE_PARAM_PREMULTIPLIED_ALPHA,
+
+   /**
+* Return TRUE if native_surface::present supports presenting a partial
+* surface.
+*/
+   NATIVE_PARAM_PRESENT_REGION
 };
 
 /**
@@ -99,6 +105,11 @@ struct native_present_control {
 
/**< pixels use premultiplied alpha */
boolean premultiplied_alpha;
+
+   /**< The region to present. y=0=top.
+If num_rects is 0, the whole surface is to be presented */
+   int num_rects;
+   const int *rects; /* x, y, width, height */
 };
 
 struct native_surface {
-- 
1.7.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/egl: Implement EGL_NOK_swap_region for x11

2011-12-14 Thread Fredrik Höglund
v2: inline x11_drawable_copy_buffers().

Signed-off-by: Fredrik Höglund 
---
 src/gallium/state_trackers/egl/x11/native_dri2.c |   36 --
 src/gallium/state_trackers/egl/x11/x11_screen.c  |   21 +++-
 src/gallium/state_trackers/egl/x11/x11_screen.h  |   14 -
 3 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/egl/x11/native_dri2.c 
b/src/gallium/state_trackers/egl/x11/native_dri2.c
index 4754744..9ea3b80 100644
--- a/src/gallium/state_trackers/egl/x11/native_dri2.c
+++ b/src/gallium/state_trackers/egl/x11/native_dri2.c
@@ -313,22 +313,35 @@ dri2_surface_flush_frontbuffer(struct native_surface 
*nsurf)
 }
 
 static boolean
-dri2_surface_swap_buffers(struct native_surface *nsurf)
+dri2_surface_swap_buffers(struct native_surface *nsurf, int num_rects,
+  const int *rects)
 {
struct dri2_surface *dri2surf = dri2_surface(nsurf);
struct dri2_display *dri2dpy = dri2surf->dri2dpy;
 
/* copy to front buffer */
-   if (dri2surf->have_back)
-  x11_drawable_copy_buffers(dri2dpy->xscr, dri2surf->drawable,
-0, 0, dri2surf->width, dri2surf->height,
-DRI2BufferBackLeft, DRI2BufferFrontLeft);
+   if (dri2surf->have_back) {
+  if (num_rects > 0)
+ x11_drawable_copy_buffers_region(dri2dpy->xscr, dri2surf->drawable,
+   num_rects, rects,
+   DRI2BufferBackLeft, DRI2BufferFrontLeft);
+  else
+ x11_drawable_copy_buffers(dri2dpy->xscr, dri2surf->drawable,
+   0, 0, dri2surf->width, dri2surf->height,
+   DRI2BufferBackLeft, DRI2BufferFrontLeft);
+   }
 
/* and update fake front buffer */
-   if (dri2surf->have_fake)
-  x11_drawable_copy_buffers(dri2dpy->xscr, dri2surf->drawable,
-0, 0, dri2surf->width, dri2surf->height,
-DRI2BufferFrontLeft, DRI2BufferFakeFrontLeft);
+   if (dri2surf->have_fake) {
+  if (num_rects > 0)
+ x11_drawable_copy_buffers_region(dri2dpy->xscr, dri2surf->drawable,
+   num_rects, rects,
+   DRI2BufferFrontLeft, DRI2BufferFakeFrontLeft);
+  else
+ x11_drawable_copy_buffers(dri2dpy->xscr, dri2surf->drawable,
+   0, 0, dri2surf->width, dri2surf->height,
+   DRI2BufferFrontLeft, DRI2BufferFakeFrontLeft);
+   }
 
/* force buffers to be updated in next validation call */
if (!dri2_surface_receive_events(&dri2surf->base)) {
@@ -354,7 +367,7 @@ dri2_surface_present(struct native_surface *nsurf,
   ret = dri2_surface_flush_frontbuffer(nsurf);
   break;
case NATIVE_ATTACHMENT_BACK_LEFT:
-  ret = dri2_surface_swap_buffers(nsurf);
+  ret = dri2_surface_swap_buffers(nsurf, ctrl->num_rects, ctrl->rects);
   break;
default:
   ret = FALSE;
@@ -722,6 +735,9 @@ dri2_display_get_param(struct native_display *ndpy,
   /* DRI2CopyRegion is used */
   val = TRUE;
   break;
+   case NATIVE_PARAM_PRESENT_REGION:
+  val = TRUE;
+  break;
case NATIVE_PARAM_MAX_SWAP_INTERVAL:
default:
   val = 0;
diff --git a/src/gallium/state_trackers/egl/x11/x11_screen.c 
b/src/gallium/state_trackers/egl/x11/x11_screen.c
index 6155b4d..1e20f94 100644
--- a/src/gallium/state_trackers/egl/x11/x11_screen.c
+++ b/src/gallium/state_trackers/egl/x11/x11_screen.c
@@ -341,21 +341,24 @@ x11_drawable_enable_dri2(struct x11_screen *xscr,
  * Copy between buffers of the DRI2 drawable.
  */
 void
-x11_drawable_copy_buffers(struct x11_screen *xscr, Drawable drawable,
-  int x, int y, int width, int height,
-  int src_buf, int dst_buf)
+x11_drawable_copy_buffers_region(struct x11_screen *xscr, Drawable drawable,
+ int num_rects, const int *rects,
+ int src_buf, int dst_buf)
 {
-   XRectangle rect;
XserverRegion region;
+   XRectangle *rectangles = CALLOC(num_rects, sizeof(XRectangle));
 
-   rect.x = x;
-   rect.y = y;
-   rect.width = width;
-   rect.height = height;
+   for (int i = 0; i < num_rects; i++) {
+  rectangles[i].x = rects[i * 4 + 0];
+  rectangles[i].y = rects[i * 4 + 1];
+  rectangles[i].width = rects[i * 4 + 2];
+  rectangles[i].height = rects[i * 4 + 3];
+   }
 
-   region = XFixesCreateRegion(xscr->dpy, &rect, 1);
+   region = XFixesCreateRegion(xscr->dpy, rectangles, num_rects);
DRI2CopyRegion(xscr->dpy, drawable, region, dst_buf, src_buf);
XFixesDestroyRegion(xscr->dpy, region);
+   FREE(rectangles);
 }
 
 /**
diff --git a/src/gallium/state_trackers/egl/x11/x11_screen.h 
b/src/gallium/state_trackers/egl/x11/x11_screen.h
index acf1300..db1b218 100644
--- a/src/gallium/state_trackers/egl/x11/x11_screen.h
+++ b/src/gallium/state_trackers/egl/x11/x11_screen.h
@@ -108,9 +108,21 @@ x11_drawable_enable_dri2(struct x11_screen *xscr,
  Drawable drawable, boolean on);
 
 void
+x11_drawable_copy_buffers_region(st

Re: [Mesa-dev] [PATCH 1/8] intel: Drop check for wrapped_depth in RB mapping.

2011-12-14 Thread Kenneth Graunke
On 12/08/2011 01:39 PM, Eric Anholt wrote:
> This used to be needed because irb->mt would be unset for fake packed
> depth/stencil, but no longer.
> ---
>  src/mesa/drivers/dri/intel/intel_fbo.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
> b/src/mesa/drivers/dri/intel/intel_fbo.c
> index 4fcde23..1980355 100644
> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
> @@ -122,7 +122,7 @@ intel_map_renderbuffer(struct gl_context *ctx,
> int stride;
>  
> /* We sometimes get called with this by our intel_span.c usage. */
> -   if (!irb->mt && !irb->wrapped_depth) {
> +   if (!irb->mt) {
>*out_map = NULL;
>*out_stride = 0;
>return;

This series looks like a nice cleanup.  I'm not terribly confident in my
understanding of this code, but it looks okay to me.

Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] st/egl: Add support for EGL_NV_post_sub_buffer

2011-12-14 Thread Fredrik Höglund
Signed-off-by: Fredrik Höglund 
---
 src/gallium/state_trackers/egl/common/egl_g3d.c|8 -
 .../state_trackers/egl/common/egl_g3d_api.c|   26 
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/egl/common/egl_g3d.c 
b/src/gallium/state_trackers/egl/common/egl_g3d.c
index 53811b8..f870b18 100644
--- a/src/gallium/state_trackers/egl/common/egl_g3d.c
+++ b/src/gallium/state_trackers/egl/common/egl_g3d.c
@@ -606,11 +606,15 @@ egl_g3d_initialize(_EGLDriver *drv, _EGLDisplay *dpy)
   dpy->Extensions.WL_bind_wayland_display = EGL_TRUE;
 #endif
 
-#ifdef EGL_NOK_swap_region
if (gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESENT_REGION) &&
-   gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESERVE_BUFFER))
+   gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESERVE_BUFFER)) {
+#ifdef EGL_NOK_swap_region
   dpy->Extensions.NOK_swap_region = EGL_TRUE;
 #endif
+#ifdef EGL_NV_post_sub_buffer
+  dpy->Extensions.NV_post_sub_buffer = EGL_TRUE;
+#endif
+   }
 
if (egl_g3d_add_configs(drv, dpy, 1) == 1) {
   _eglError(EGL_NOT_INITIALIZED, "eglInitialize(unable to add configs)");
diff --git a/src/gallium/state_trackers/egl/common/egl_g3d_api.c 
b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
index 2e3ead6..e16955c 100644
--- a/src/gallium/state_trackers/egl/common/egl_g3d_api.c
+++ b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
@@ -296,6 +296,16 @@ egl_g3d_create_surface(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLConfig *conf,
gconf->stvis.buffer_mask & ST_ATTACHMENT_FRONT_LEFT_MASK)
   gsurf->stvis.render_buffer = ST_ATTACHMENT_FRONT_LEFT;
 
+#ifdef EGL_NV_post_sub_buffer
+   if (dpy->Extensions.NV_post_sub_buffer) {
+  if (gsurf->base.Type == EGL_WINDOW_BIT &&
+  gsurf->base.RenderBuffer == EGL_BACK_BUFFER)
+ gsurf->base.PostSubBufferSupportedNV = EGL_TRUE;
+  else
+ gsurf->base.PostSubBufferSupportedNV = EGL_FALSE;
+   }
+#endif
+
gsurf->stfbi = egl_g3d_create_st_framebuffer(&gsurf->base);
if (!gsurf->stfbi) {
   nsurf->destroy(nsurf);
@@ -601,6 +611,18 @@ egl_g3d_swap_buffers_region(_EGLDriver *drv, _EGLDisplay 
*dpy, _EGLSurface *surf
 }
 #endif /* EGL_NOK_swap_region */
 
+#ifdef EGL_NV_post_sub_buffer
+static EGLBoolean
+egl_g3d_post_sub_buffer(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
+EGLint x, EGLint y, EGLint width, EGLint height)
+{
+   /* Note: y=0=bottom */
+   const EGLint rect[4] = { x, surf->Height - y - height, width, height };
+
+   return swap_buffers(drv, dpy, surf, 1, rect, EGL_TRUE);
+}
+#endif /* EGL_NV_post_sub_buffer */
+
 static EGLBoolean
 egl_g3d_copy_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
  EGLNativePixmapType target)
@@ -893,4 +915,8 @@ egl_g3d_init_driver_api(_EGLDriver *drv)
 #ifdef EGL_NOK_swap_region
drv->API.SwapBuffersRegionNOK = egl_g3d_swap_buffers_region;
 #endif
+
+#ifdef EGL_NV_post_sub_buffer
+   drv->API.PostSubBufferNV = egl_g3d_post_sub_buffer;
+#endif
 }
-- 
1.7.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] egl_dri2/x11: Add support for EGL_NV_post_sub_buffer

2011-12-14 Thread Fredrik Höglund
Signed-off-by: Fredrik Höglund 
---
 src/egl/drivers/dri2/platform_x11.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 8dd231a..08a2c8d 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -248,6 +248,12 @@ dri2_create_surface(_EGLDriver *drv, _EGLDisplay *disp, 
EGLint type,
   free(reply);
}
 
+   if (dri2_dpy->dri2 && type == EGL_WINDOW_BIT &&
+   dri2_surf->base.RenderBuffer == EGL_BACK_BUFFER)
+  dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;
+   else
+  dri2_surf->base.PostSubBufferSupportedNV = EGL_FALSE;
+
return &dri2_surf->base;
 
  cleanup_dri_drawable:
@@ -749,6 +755,15 @@ dri2_swap_buffers_region(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *draw,
 }
 
 static EGLBoolean
+dri2_post_sub_buffer(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw,
+EGLint x, EGLint y, EGLint width, EGLint height)
+{
+   const EGLint rect[4] = { x, draw->Height - y - height, width, height };
+
+   return dri2_swap_buffers_region(drv, disp, draw, 1, rect);
+}
+
+static EGLBoolean
 dri2_copy_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *surf,
  EGLNativePixmapType target)
 {
@@ -971,6 +986,7 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay *disp)
drv->API.CopyBuffers = dri2_copy_buffers;
drv->API.CreateImageKHR = dri2_x11_create_image_khr;
drv->API.SwapBuffersRegionNOK = dri2_swap_buffers_region;
+   drv->API.PostSubBufferNV = dri2_post_sub_buffer;
 
dri2_dpy = malloc(sizeof *dri2_dpy);
if (!dri2_dpy)
@@ -1041,6 +1057,7 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay 
*disp)
disp->Extensions.KHR_image_pixmap = EGL_TRUE;
disp->Extensions.NOK_swap_region = EGL_TRUE;
disp->Extensions.NOK_texture_from_pixmap = EGL_TRUE;
+   disp->Extensions.NV_post_sub_buffer = EGL_TRUE;
 
 #ifdef HAVE_WAYLAND_PLATFORM
disp->Extensions.WL_bind_wayland_display = EGL_TRUE;
-- 
1.7.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] egl: add EGL_NV_post_sub_buffer

2011-12-14 Thread Fredrik Höglund
v2: Handle EGL_POST_SUB_BUFFER_SUPPORTED_NV in
_eglParseSurfaceAttribList()

Signed-off-by: Fredrik Höglund 
---
 include/EGL/eglext.h  |9 +
 src/egl/main/eglapi.c |   24 
 src/egl/main/eglapi.h |8 
 src/egl/main/egldisplay.h |2 ++
 src/egl/main/eglmisc.c|2 ++
 src/egl/main/eglsurface.c |   23 +++
 src/egl/main/eglsurface.h |4 
 7 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/include/EGL/eglext.h b/include/EGL/eglext.h
index 9484b83..d03a24d 100644
--- a/include/EGL/eglext.h
+++ b/include/EGL/eglext.h
@@ -144,6 +144,15 @@ typedef EGLImageKHR (EGLAPIENTRYP 
PFNEGLCREATEDRMIMAGEMESA) (EGLDisplay dpy, con
 typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESA) (EGLDisplay dpy, 
EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);
 #endif
 
+#ifndef EGL_NV_post_sub_buffer
+#define EGL_NV_post_sub_buffer 1
+#define EGL_POST_SUB_BUFFER_SUPPORTED_NV   0x30BE
+#ifdef EGL_EGLEXT_PROTOTYPES
+EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV(EGLDisplay dpy, EGLSurface 
surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif /* EGL_EGLEXT_PROTOTYPES */
+typedef EGLBoolean (EGLAPIENTRYP PFNEGLPOSTSUBBUFFERNVPROC) (EGLDisplay dpy, 
EGLSurface surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif
+
 #ifndef EGL_WL_bind_wayland_display
 #define EGL_WL_bind_wayland_display 1
 
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3cb1a5b..ff15476 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -951,6 +951,9 @@ eglGetProcAddress(const char *procname)
 #ifdef EGL_ANDROID_swap_rectangle
   { "eglSetSwapRectangleANDROID", (_EGLProc) eglSetSwapRectangleANDROID },
 #endif
+#ifdef EGL_NV_post_sub_buffer
+  { "eglPostSubBufferNV", (_EGLProc) eglPostSubBufferNV },
+#endif
   { NULL, NULL }
};
EGLint i;
@@ -1590,3 +1593,24 @@ eglSetSwapRectangleANDROID(EGLDisplay dpy, EGLSurface 
draw,
RETURN_EGL_EVAL(disp, ret);
 }
 #endif
+
+#ifdef EGL_NV_post_sub_buffer
+EGLBoolean EGLAPIENTRY
+eglPostSubBufferNV(EGLDisplay dpy, EGLSurface surface,
+   EGLint x, EGLint y, EGLint width, EGLint height)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSurface *surf = _eglLookupSurface(surface, disp);
+   _EGLDriver *drv;
+   EGLBoolean ret;
+
+   _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv);
+
+   if (!disp->Extensions.NV_post_sub_buffer)
+  RETURN_EGL_EVAL(disp, EGL_FALSE);
+
+   ret = drv->API.PostSubBufferNV(drv, disp, surf, x, y, width, height);
+
+   RETURN_EGL_EVAL(disp, ret);
+}
+#endif
diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
index 1e0aef6..f374273 100644
--- a/src/egl/main/eglapi.h
+++ b/src/egl/main/eglapi.h
@@ -135,6 +135,10 @@ typedef EGLBoolean (*UnbindWaylandDisplayWL_t)(_EGLDriver 
*drv, _EGLDisplay *dis
 typedef EGLBoolean (*SetSwapRectangleANDROID_t)(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *draw, EGLint left, EGLint top, EGLint width, EGLint height);
 #endif
 
+#ifdef EGL_NV_post_sub_buffer
+typedef EGLBoolean (*PostSubBufferNV_t)(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif
+
 /**
  * The API dispatcher jumps through these functions
  */
@@ -218,6 +222,10 @@ struct _egl_api
 #ifdef EGL_ANDROID_swap_rectangle
SetSwapRectangleANDROID_t SetSwapRectangleANDROID;
 #endif
+
+#ifdef EGL_NV_post_sub_buffer
+   PostSubBufferNV_t PostSubBufferNV;
+#endif
 };
 
 #endif /* EGLAPI_INCLUDED */
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index 67a2e24..8f737df 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -112,6 +112,8 @@ struct _egl_extensions
 
EGLBoolean ANDROID_image_native_buffer;
EGLBoolean ANDROID_swap_rectangle;
+
+   EGLBoolean NV_post_sub_buffer;
 };
 
 
diff --git a/src/egl/main/eglmisc.c b/src/egl/main/eglmisc.c
index ab48bc6..30cd04e 100644
--- a/src/egl/main/eglmisc.c
+++ b/src/egl/main/eglmisc.c
@@ -116,6 +116,8 @@ _eglUpdateExtensionsString(_EGLDisplay *dpy)
 
_EGL_CHECK_EXTENSION(ANDROID_image_native_buffer);
_EGL_CHECK_EXTENSION(ANDROID_swap_rectangle);
+
+   _EGL_CHECK_EXTENSION(NV_post_sub_buffer);
 #undef _EGL_CHECK_EXTENSION
 }
 
diff --git a/src/egl/main/eglsurface.c b/src/egl/main/eglsurface.c
index 3564ecd..2db801a 100644
--- a/src/egl/main/eglsurface.c
+++ b/src/egl/main/eglsurface.c
@@ -170,6 +170,20 @@ _eglParseSurfaceAttribList(_EGLSurface *surf, const EGLint 
*attrib_list)
  }
  surf->RenderBuffer = val;
  break;
+#ifdef EGL_NV_post_sub_buffer
+  case EGL_POST_SUB_BUFFER_SUPPORTED_NV:
+ if (!dpy->Extensions.NV_post_sub_buffer ||
+ type != EGL_WINDOW_BIT) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+ if (val != EGL_TRUE && val != EGL_FALSE) {
+err = EGL_BAD_PARAMETER;
+break;
+ }
+ surf-

Re: [Mesa-dev] [PATCH 1/8] mesa: Record transform feedback strides/offsets in linker output.

2011-12-14 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Dec 14, 2011 at 6:00 PM, Paul Berry  wrote:
> This patch adds two new fields to the gl_transform_feedback_info
> struct:
>
> - BufferStride records the total number of components (per vertex)
>  that transform feedback is being instructed to store in each buffer.
>
> - Outputs[i].DstOffset records the offset within the interleaved
>  structure of each transform feedback output.
>
> These values are needed by the i965 gen6 and r600g back-ends, so it
> seems better to have the linker provide them rather than force each
> back-end to compute them independently.
>
> Also, DstOffset helps pave the way for supporting
> ARB_transform_feedback3, which allows the transform feedback output to
> contain holes between attributes by specifying
> gl_SkipComponents{1,2,3,4} as the varying name.
> ---
>  src/glsl/linker.cpp    |    5 -
>  src/mesa/main/mtypes.h |   10 ++
>  2 files changed, 14 insertions(+), 1 deletions(-)
>
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index b8a7126..6587008 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -1598,7 +1598,9 @@ tfeedback_decl::store(struct gl_shader_program *prog,
>       info->Outputs[info->NumOutputs].OutputRegister = this->location + v;
>       info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
>       info->Outputs[info->NumOutputs].OutputBuffer = buffer;
> +      info->Outputs[info->NumOutputs].DstOffset = info->BufferStride[buffer];
>       ++info->NumOutputs;
> +      info->BufferStride[buffer] += this->vector_elements;
>    }
>    return true;
>  }
> @@ -1863,7 +1865,8 @@ store_tfeedback_info(struct gl_context *ctx, struct 
> gl_shader_program *prog,
>                      tfeedback_decl *tfeedback_decls)
>  {
>    unsigned total_tfeedback_components = 0;
> -   prog->LinkedTransformFeedback.NumOutputs = 0;
> +   memset(&prog->LinkedTransformFeedback, 0,
> +          sizeof(prog->LinkedTransformFeedback));
>    for (unsigned i = 0; i < num_tfeedback_decls; ++i) {
>       unsigned buffer =
>          prog->TransformFeedback.BufferMode == GL_SEPARATE_ATTRIBS ? i : 0;
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 1934349..e18c5f8 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1821,7 +1821,17 @@ struct gl_transform_feedback_info {
>       unsigned OutputRegister;
>       unsigned OutputBuffer;
>       unsigned NumComponents;
> +
> +      /** offset (in DWORDs) of this output within the interleaved structure 
> */
> +      unsigned DstOffset;
>    } Outputs[MAX_PROGRAM_OUTPUTS];
> +
> +   /**
> +    * Total number of components stored in each buffer.  This may be used by
> +    * hardware back-ends to determine the correct stride when interleaving
> +    * multiple transform feedback outputs in the same buffer.
> +    */
> +   unsigned BufferStride[MAX_FEEDBACK_ATTRIBS];
>  };
>
>  /**
> --
> 1.7.6.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] meta: rework dest image allocation in mipmap generation code

2011-12-14 Thread Eric Anholt
On Tue, 13 Dec 2011 18:07:05 -0700, Brian Paul  wrote:
> From: Brian Paul 
> 
> This fixes two things:
> 1. If the texture object was created with glTexStorage2D, the call
>to _mesa_TexImage2D() would generate INVALID_OPERATION since the
>texture is marked as immutable.
> 2. _mesa_TexImage2D() always frees any existing texture image memory
>before allocating new memory.  That's inefficient since the existing
>image is usually the right size already.  Now we only make the call
>when necessary.
> 
> v2: use _mesa_TexImage() in prepare_dest_image() to make sure side-effects
> of changing a texture image are observed (like FBO completeness).

Should this live in main/mipmap.c?  It looks like
generate_mipmap_uncompressed() needs it.


pgpmzMmLxeKbp.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] egl: add EGL_NV_post_sub_buffer

2011-12-14 Thread Ian Romanick

On 12/14/2011 12:24 PM, Fredrik Höglund wrote:

v2: Handle EGL_POST_SUB_BUFFER_SUPPORTED_NV in
 _eglParseSurfaceAttribList()

Signed-off-by: Fredrik Höglund
---
  include/EGL/eglext.h  |9 +
  src/egl/main/eglapi.c |   24 
  src/egl/main/eglapi.h |8 
  src/egl/main/egldisplay.h |2 ++
  src/egl/main/eglmisc.c|2 ++
  src/egl/main/eglsurface.c |   23 +++
  src/egl/main/eglsurface.h |4 
  7 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/include/EGL/eglext.h b/include/EGL/eglext.h
index 9484b83..d03a24d 100644
--- a/include/EGL/eglext.h
+++ b/include/EGL/eglext.h
@@ -144,6 +144,15 @@ typedef EGLImageKHR (EGLAPIENTRYP 
PFNEGLCREATEDRMIMAGEMESA) (EGLDisplay dpy, con
  typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESA) (EGLDisplay dpy, 
EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);
  #endif

+#ifndef EGL_NV_post_sub_buffer
+#define EGL_NV_post_sub_buffer 1
+#define EGL_POST_SUB_BUFFER_SUPPORTED_NV   0x30BE
+#ifdef EGL_EGLEXT_PROTOTYPES
+EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV(EGLDisplay dpy, EGLSurface 
surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif /* EGL_EGLEXT_PROTOTYPES */
+typedef EGLBoolean (EGLAPIENTRYP PFNEGLPOSTSUBBUFFERNVPROC) (EGLDisplay dpy, 
EGLSurface surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif
+
  #ifndef EGL_WL_bind_wayland_display
  #define EGL_WL_bind_wayland_display 1



I think we may be treating this file incorrectly.  It's my understanding 
that eglext.h comes from Khronos, and we don't get to modify it.  Our 
current eglext.h is based on version 7 (current version is 10), and it 
has diverged quite a bit.  I want to ping folks at Khronos and figure 
out how we're supposed to deal with this file.



diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3cb1a5b..ff15476 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -951,6 +951,9 @@ eglGetProcAddress(const char *procname)
  #ifdef EGL_ANDROID_swap_rectangle
{ "eglSetSwapRectangleANDROID", (_EGLProc) eglSetSwapRectangleANDROID },
  #endif
+#ifdef EGL_NV_post_sub_buffer


It seems weird that we define this in our own header file, then check to 
see if it's defined.  I understand that the new code is just following 
the pattern of the existing code.  Perhaps Kristian or Chia-I can shed 
some light on this convention.



+  { "eglPostSubBufferNV", (_EGLProc) eglPostSubBufferNV },
+#endif
{ NULL, NULL }
 };
 EGLint i;
@@ -1590,3 +1593,24 @@ eglSetSwapRectangleANDROID(EGLDisplay dpy, EGLSurface 
draw,
 RETURN_EGL_EVAL(disp, ret);
  }
  #endif
+
+#ifdef EGL_NV_post_sub_buffer
+EGLBoolean EGLAPIENTRY
+eglPostSubBufferNV(EGLDisplay dpy, EGLSurface surface,
+   EGLint x, EGLint y, EGLint width, EGLint height)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSurface *surf = _eglLookupSurface(surface, disp);
+   _EGLDriver *drv;
+   EGLBoolean ret;
+
+   _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv);
+
+   if (!disp->Extensions.NV_post_sub_buffer)
+  RETURN_EGL_EVAL(disp, EGL_FALSE);
+
+   ret = drv->API.PostSubBufferNV(drv, disp, surf, x, y, width, height);
+
+   RETURN_EGL_EVAL(disp, ret);
+}
+#endif
diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
index 1e0aef6..f374273 100644
--- a/src/egl/main/eglapi.h
+++ b/src/egl/main/eglapi.h
@@ -135,6 +135,10 @@ typedef EGLBoolean (*UnbindWaylandDisplayWL_t)(_EGLDriver 
*drv, _EGLDisplay *dis
  typedef EGLBoolean (*SetSwapRectangleANDROID_t)(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *draw, EGLint left, EGLint top, EGLint width, EGLint height);
  #endif

+#ifdef EGL_NV_post_sub_buffer
+typedef EGLBoolean (*PostSubBufferNV_t)(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surface, EGLint x, EGLint y, EGLint width, EGLint height);
+#endif
+
  /**
   * The API dispatcher jumps through these functions
   */
@@ -218,6 +222,10 @@ struct _egl_api
  #ifdef EGL_ANDROID_swap_rectangle
 SetSwapRectangleANDROID_t SetSwapRectangleANDROID;
  #endif
+
+#ifdef EGL_NV_post_sub_buffer
+   PostSubBufferNV_t PostSubBufferNV;
+#endif
  };

  #endif /* EGLAPI_INCLUDED */
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index 67a2e24..8f737df 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -112,6 +112,8 @@ struct _egl_extensions

 EGLBoolean ANDROID_image_native_buffer;
 EGLBoolean ANDROID_swap_rectangle;
+
+   EGLBoolean NV_post_sub_buffer;
  };


diff --git a/src/egl/main/eglmisc.c b/src/egl/main/eglmisc.c
index ab48bc6..30cd04e 100644
--- a/src/egl/main/eglmisc.c
+++ b/src/egl/main/eglmisc.c
@@ -116,6 +116,8 @@ _eglUpdateExtensionsString(_EGLDisplay *dpy)

 _EGL_CHECK_EXTENSION(ANDROID_image_native_buffer);
 _EGL_CHECK_EXTENSION(ANDROID_swap_rectangle);
+
+   _EGL_CHECK_EXTENSION(NV_post_sub_buffer);
  #undef _EGL_CHECK_EXTENSION
  }

diff --git a/src/egl/main/e

Re: [Mesa-dev] [PATCH] meta: rework dest image allocation in mipmap generation code

2011-12-14 Thread Brian Paul

On 12/14/2011 02:13 PM, Eric Anholt wrote:

On Tue, 13 Dec 2011 18:07:05 -0700, Brian Paul  wrote:

From: Brian Paul

This fixes two things:
1. If the texture object was created with glTexStorage2D, the call
to _mesa_TexImage2D() would generate INVALID_OPERATION since the
texture is marked as immutable.
2. _mesa_TexImage2D() always frees any existing texture image memory
before allocating new memory.  That's inefficient since the existing
image is usually the right size already.  Now we only make the call
when necessary.

v2: use _mesa_TexImage() in prepare_dest_image() to make sure side-effects
of changing a texture image are observed (like FBO completeness).


Should this live in main/mipmap.c?  It looks like
generate_mipmap_uncompressed() needs it.


I think the broader question is are we updating fbo completeness when 
we change any texture image by mipmap generation, right?


We're probably missing that in a few places.

How about I look into that after this meta fix?  R-b?

-Brian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Marek Olšák
I think RASTERIZER_DISCARD has nothing to do with transform feedback.
I think it's part of the same spec because it's not useful without it.
As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
feedback buffer bindings are changed or just enabled/disabled. On the
other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
it should fall into the same category as face culling for example. (I
even implemented it using face culling on r600)

Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.

Marek

On Wed, Dec 14, 2011 at 8:59 PM, Paul Berry  wrote:
> Previously, we were setting the _NEW_TRANSFORM bit when enabling or
> disabling RASTERIZER_DISCARD.  This is incorrect, since _NEW_TRANSFORM
> flags changes to ctx->Transform, but the rasterizer discard flag is in
> ctx->TransformFeedback.  This patch sets the correct bit,
> _NEW_TRANSFORM_FEEDBACK.
> ---
>  src/mesa/main/enable.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
> index 6461ac1..15a2305 100644
> --- a/src/mesa/main/enable.c
> +++ b/src/mesa/main/enable.c
> @@ -890,7 +890,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
> GLboolean state)
>       case GL_RASTERIZER_DISCARD:
>         CHECK_EXTENSION(EXT_transform_feedback, cap);
>          if (ctx->TransformFeedback.RasterDiscard != state) {
> -            FLUSH_VERTICES(ctx, _NEW_TRANSFORM);
> +            FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>             ctx->TransformFeedback.RasterDiscard = state;
>          }
>          break;
> --
> 1.7.6.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa, gallium: add a cap for GPUs without unified color+generic varying slots

2011-12-14 Thread Marek Olšák
On Wed, Dec 14, 2011 at 6:30 PM, Ian Romanick  wrote:
> On 12/13/2011 04:32 PM, Marek Olšák wrote:
>>
>> ---
>>  src/gallium/drivers/r300/r300_screen.c |    3 ++-
>>  src/gallium/include/pipe/p_defines.h   |    3 ++-
>>  src/glsl/linker.cpp                    |    6 ++
>>  src/mesa/main/mtypes.h                 |    9 +
>>  src/mesa/state_tracker/st_extensions.c |    3 +++
>>  5 files changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r300/r300_screen.c
>> b/src/gallium/drivers/r300/r300_screen.c
>> index 0bae065..a761939 100644
>> --- a/src/gallium/drivers/r300/r300_screen.c
>> +++ b/src/gallium/drivers/r300/r300_screen.c
>> @@ -102,6 +102,7 @@ static int r300_get_param(struct pipe_screen* pscreen,
>> enum pipe_cap param)
>>          case PIPE_CAP_TEXTURE_BARRIER:
>>          case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
>>          case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
>> +        case PIPE_CAP_SEPARATE_COLOR_VARYINGS:
>>              return 1;
>>
>>          /* r300 cannot do swizzling of compressed textures. Supported
>> otherwise. */
>> @@ -182,7 +183,7 @@ static int r300_get_shader_param(struct pipe_screen
>> *pscreen, unsigned shader, e
>>               * R500 has the ability to turn 3rd and 4th color into
>>               * additional texcoords but there is no two-sided color
>>               * selection then. However the facing bit can be used
>> instead. */
>> -            return 10;
>> +            return 8;
>
>
> Why are you trying to make r300g behave differently the every other driver
> that has existed for that same hardware?  This doesn't make any sense.  This
> is not Apple's driver works for r300 hardware, and it's not how AMD's
> Windows driver works (worked, anyway) for r300 hardware.
>
> This feels like an unnecessary hack.

Hi Ian,

Simple. I cannot pass the test glsl-max-varyings if I report 40
varying components, because I have only 32 texcoord components + 8
color components. I could re-use the color varyings, but it's not full
single-precision. r300 uses a fixed-point type S3.12 for color
interpolators, which can represent values in the range [-7.,
7.]. That's unusable for used-defined varyings. r500 uses a 20-bit
float, which is better, but I am not sure if it's good enough.

Also, r500 actually has 10 texcoord interpolators, but we don't use
the last two yet (it's non-trivial, there is a special PS3 mode for
it). Whether color interpolators can be enabled in that mode is
undocumented, though I am almost sure the back colors cannot be. (the
DX9 ps_3_0 shader profile doesn't have color inputs at all)

Yes, the patch is a hack. However modifying glsl-max-varyings to only
test MAX_VARYING_FLOATS - 8 doesn't feel right either. Do you have a
better idea?

What about the other patch?

Thanks,
Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/8] mesa: Track changes to transform feedback state.

2011-12-14 Thread Marek Olšák
On Wed, Dec 7, 2011 at 8:09 PM, Paul Berry  wrote:
> This patch adds a new bit to the ctx->NewState bitfield,
> _NEW_TRANSFORM_FEEDBACK, to track state changes that affect
> ctx->TransformFeedback.  This bit can be used by driver back-ends to
> avoid expensive recomputations when transform feedback state has not
> been modified.
>
> Reviewed-by: Kenneth Graunke 
> ---
>  src/mesa/main/mtypes.h            |    1 +
>  src/mesa/main/transformfeedback.c |    9 -
>  2 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 33b00c6..fc494f7 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3051,6 +3051,7 @@ struct gl_matrix_stack
>  #define _NEW_PROGRAM_CONSTANTS (1 << 27)
>  #define _NEW_BUFFER_OBJECT     (1 << 28)
>  #define _NEW_FRAG_CLAMP        (1 << 29)
> +#define _NEW_TRANSFORM_FEEDBACK (1 << 30) /**< gl_context::TransformFeedback 
> */
>  #define _NEW_ALL ~0
>  /*@}*/
>
> diff --git a/src/mesa/main/transformfeedback.c 
> b/src/mesa/main/transformfeedback.c
> index 11abd03..799245d 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -376,6 +376,7 @@ _mesa_BeginTransformFeedback(GLenum mode)
>       return;
>    }
>
> +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>    obj->Active = GL_TRUE;
>    ctx->TransformFeedback.Mode = mode;
>
> @@ -398,6 +399,7 @@ _mesa_EndTransformFeedback(void)
>       return;
>    }
>
> +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>    ctx->TransformFeedback.CurrentObject->Active = GL_FALSE;
>
>    assert(ctx->Driver.EndTransformFeedback);
> @@ -415,6 +417,7 @@ bind_buffer_range(struct gl_context *ctx, GLuint index,
>  {
>    struct gl_transform_feedback_object *obj =
>       ctx->TransformFeedback.CurrentObject;
> +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
>
>    /* The general binding point */
>    _mesa_reference_buffer_object(ctx,

Sorry for the late reply.

There's no need to set _NEW_TRANSFORM_FEEDBACK in bind_buffer_range,
because it's not allowed to change buffer bindings for an active TFB
object. Buffer bindings can only be changed for inactive TFB objects.
And because they're inactive, the drivers don't need to know about it.
The TFB buffers are effectively bound through BeginTransformFeedback
and unbound through EndTransformFeedback (likewise for Resume and
Pause).

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Paul Berry
On 14 December 2011 13:42, Marek Olšák  wrote:

> I think RASTERIZER_DISCARD has nothing to do with transform feedback.
> I think it's part of the same spec because it's not useful without it.
> As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
> feedback buffer bindings are changed or just enabled/disabled. On the
> other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
> it should fall into the same category as face culling for example. (I
> even implemented it using face culling on r600)
>
> Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
> changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.
>
> Marek
>

I see where you are coming from--I could have implemented rasterizer
discard on i965 gen6 in the same way.

However, I think there are three compelling reasons to consider rasterizer
discard to be related to transform feedback:

(1) from a user perspective, it really only makes sense to use rasterizer
discard when transform feedback is active.  Thus, it's highly likely that
when the rasterizer discard state is changed, transform feedback settings
will be changed too.

(2) rasterizer discard functionality is described by the same set of
extensions that enable transform feedback (e.g. EXT_transform_feedback), so
presumably the inventors of these two features thought they were closely
related.

(3) the enable bit that Mesa uses to keep track of the state of rasterizer
discard is in gl_context::TransformFeedback, not gl_context::Transform.

Item (3) is the most important to me.  One of the scarier things about i965
driver development is that we have to manually keep track of which hardware
configuration commands correspond to which dirty bits.  If we miss a
dependency, that causes a subtle bug in which a state change might not
cause the hardware state to be updated properly.  These kinds of bugs are
not well exposed by Piglit tests, because most Piglit tests make a large
number of state changes followed by a draw operation, so a missing
dependency might easily go undetected.  The thing that allows me to sleep
at night is that there is a nice one-to-one correspondence between almost
all of the dirty bits and corresponding substructures of gl_context.  For
example, the _NEW_MODELVIEW dirty bit corresponds to gl_context::ModelView,
_NEW_PROJECTION corresponds to gl_context::Projection, and so on.  That
means any time I am worried that I'm not handling dirty bits correctly, I
can visually inspect the code and make sure that the dirty bits I'm
consulting match up with which elements of gl_context I'm accessing.  If we
leave the code as is, then there's a big undocumented exception to that
one-to-one correspondence, wherein
gl_context::TransformFeedback.RasterDiscard is covered by _NEW_TRANSFORM,
not _NEW_TRANSFORM_FEEDBACK, as one would logically guess based on where
it's located within gl_context.  I'm not confident that I (or other
developers) will remember this exception once we're no longer in the thick
of implementing transform feedback and rasterizer discard.

It seems to me that we have three possible approaches to choose from here:

(a) Go ahead with this patch, and modify r600 code to recompute the face
culling mode when the _NEW_TRANSFORM_FEEDBACK dirty bit is set.

(b) Don't apply this patch, and instead move RasterDiscard from
gl_context::TransformFeedback to gl_context::Transform, so that we restore
the one-to-one correspondence between dirty bits and substructures of
gl_context.

(c) Do nothing, and rely on programmers to remember that RasterDiscard is
an exception to the usual correspondence between dirty bits and
substructures of gl_context.

I'm really not comfortable with (c) because of the risk of future bugs.  I
suppose I could be talked into (b) if there's popular support for it, but
it's not my favourite, because as I said earlier, I think there are
actually a lot of good reasons to think of rasterizer discard as related to
transform feedback.  My preference is to do (a).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Paul Berry
A common use case for transform feedback is to perform one draw
operation that writes transform feedback output to a buffer, followed
by a second draw operation that consumes that buffer as vertex input.
Since vertex input is consumed at an earlier pipeline stage than
writing transform feedback output, we need to flush the pipeline to
ensure that the transform feedback output is completely written before
the data is consumed.

In an ideal world, we would do some dependency tracking, so that we
would only flush the pipeline if the next draw call was about to
consume data generated by a previous draw call in the same batch.
However, since we don't have that sort of dependency tracking
infrastructure right now, we just unconditionally flush the buffer
every time glEndTransformFeedback() is called.  This will cause a
performance hit compared to the ideal case (since we will sometimes
flush the pipeline unnecessarily), but fortunately the performance hit
will be confined to circumstances where transform feedback is in use.
---

Ken pointed out to me on IRC that in my last attempt at this patch, I
neglected to back out bogus the extra argument I added to
brwInitDriverFunctions().  This fixes that.  I now believe the only
differences between my proposal and Ken's are:

- I'm not doing "if (intel->gen < 6) return;" because I believe that
  we will need to do this flush on every platform that supports
  transform feedback, not just Gen6.

- I'm calling the function that does the flush
  "brw_end_transform_feedback" instead of
  "gen6_end_transform_feedback", because it's no longer a
  gen6-specific function.

Hopefully I haven't made any other gaffes this time.

 src/mesa/drivers/dri/i965/brw_context.c |1 +
 src/mesa/drivers/dri/i965/brw_context.h |5 +
 src/mesa/drivers/dri/i965/gen6_sol.c|   16 
 3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index eb68bf4..7d360b0 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -117,6 +117,7 @@ static void brwInitDriverFunctions( struct 
dd_function_table *functions )
brw_init_queryobj_functions(functions);
 
functions->PrepareExecBegin = brwPrepareExecBegin;
+   functions->EndTransformFeedback = brw_end_transform_feedback;
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index da1de02..8e52488 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1071,6 +1071,11 @@ void brw_init_surface_formats(struct brw_context *brw);
 bool
 brw_fprog_uses_noperspective(const struct gl_fragment_program *fprog);
 
+/* gen6_sol.c */
+void
+brw_end_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj);
+
 
 
 /*==
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index 48f0bfe..b11bce2 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -28,6 +28,7 @@
 
 #include "brw_context.h"
 #include "intel_buffer_objects.h"
+#include "intel_batchbuffer.h"
 #include "brw_defines.h"
 
 static void
@@ -98,3 +99,18 @@ const struct brw_tracked_state gen6_sol_surface = {
},
.emit = brw_update_sol_surfaces,
 };
+
+void
+brw_end_transform_feedback(struct gl_context *ctx,
+   struct gl_transform_feedback_object *obj)
+{
+   /* After EndTransformFeedback, it's likely that the client program will try
+* to draw using the contents of the transform feedback buffer as vertex
+* input.  In order for this to work, we need to flush the data through at
+* least the GS stage of the pipeline, and flush out the render cache.  For
+* simplicity, just do a full flush.
+*/
+   struct brw_context *brw = brw_context(ctx);
+   struct intel_context *intel = &brw->intel;
+   intel_batchbuffer_emit_mi_flush(intel);
+}
-- 
1.7.6.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Flush pipeline on EndTransformFeedback.

2011-12-14 Thread Kenneth Graunke
On 12/14/2011 02:54 PM, Paul Berry wrote:
> A common use case for transform feedback is to perform one draw
> operation that writes transform feedback output to a buffer, followed
> by a second draw operation that consumes that buffer as vertex input.
> Since vertex input is consumed at an earlier pipeline stage than
> writing transform feedback output, we need to flush the pipeline to
> ensure that the transform feedback output is completely written before
> the data is consumed.
> 
> In an ideal world, we would do some dependency tracking, so that we
> would only flush the pipeline if the next draw call was about to
> consume data generated by a previous draw call in the same batch.
> However, since we don't have that sort of dependency tracking
> infrastructure right now, we just unconditionally flush the buffer
> every time glEndTransformFeedback() is called.  This will cause a
> performance hit compared to the ideal case (since we will sometimes
> flush the pipeline unnecessarily), but fortunately the performance hit
> will be confined to circumstances where transform feedback is in use.
> ---
> 
> Ken pointed out to me on IRC that in my last attempt at this patch, I
> neglected to back out bogus the extra argument I added to
> brwInitDriverFunctions().  This fixes that.  I now believe the only
> differences between my proposal and Ken's are:
> 
> - I'm not doing "if (intel->gen < 6) return;" because I believe that
>   we will need to do this flush on every platform that supports
>   transform feedback, not just Gen6.
> 
> - I'm calling the function that does the flush
>   "brw_end_transform_feedback" instead of
>   "gen6_end_transform_feedback", because it's no longer a
>   gen6-specific function.
> 
> Hopefully I haven't made any other gaffes this time.

Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Marek Olšák
On Wed, Dec 14, 2011 at 11:25 PM, Paul Berry  wrote:
> On 14 December 2011 13:42, Marek Olšák  wrote:
>>
>> I think RASTERIZER_DISCARD has nothing to do with transform feedback.
>> I think it's part of the same spec because it's not useful without it.
>> As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
>> feedback buffer bindings are changed or just enabled/disabled. On the
>> other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
>> it should fall into the same category as face culling for example. (I
>> even implemented it using face culling on r600)
>>
>> Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
>> changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.
>>
>> Marek
>
>
> I see where you are coming from--I could have implemented rasterizer discard
> on i965 gen6 in the same way.
>
> However, I think there are three compelling reasons to consider rasterizer
> discard to be related to transform feedback:
>
> (1) from a user perspective, it really only makes sense to use rasterizer
> discard when transform feedback is active.  Thus, it's highly likely that
> when the rasterizer discard state is changed, transform feedback settings
> will be changed too.
>
> (2) rasterizer discard functionality is described by the same set of
> extensions that enable transform feedback (e.g. EXT_transform_feedback), so
> presumably the inventors of these two features thought they were closely
> related.
>
> (3) the enable bit that Mesa uses to keep track of the state of rasterizer
> discard is in gl_context::TransformFeedback, not gl_context::Transform.
>
> Item (3) is the most important to me.  One of the scarier things about i965
> driver development is that we have to manually keep track of which hardware
> configuration commands correspond to which dirty bits.  If we miss a
> dependency, that causes a subtle bug in which a state change might not cause
> the hardware state to be updated properly.  These kinds of bugs are not well
> exposed by Piglit tests, because most Piglit tests make a large number of
> state changes followed by a draw operation, so a missing dependency might
> easily go undetected.  The thing that allows me to sleep at night is that
> there is a nice one-to-one correspondence between almost all of the dirty
> bits and corresponding substructures of gl_context.  For example, the
> _NEW_MODELVIEW dirty bit corresponds to gl_context::ModelView,
> _NEW_PROJECTION corresponds to gl_context::Projection, and so on.  That
> means any time I am worried that I'm not handling dirty bits correctly, I
> can visually inspect the code and make sure that the dirty bits I'm
> consulting match up with which elements of gl_context I'm accessing.  If we
> leave the code as is, then there's a big undocumented exception to that
> one-to-one correspondence, wherein
> gl_context::TransformFeedback.RasterDiscard is covered by _NEW_TRANSFORM,
> not _NEW_TRANSFORM_FEEDBACK, as one would logically guess based on where
> it's located within gl_context.  I'm not confident that I (or other
> developers) will remember this exception once we're no longer in the thick
> of implementing transform feedback and rasterizer discard.
>
> It seems to me that we have three possible approaches to choose from here:
>
> (a) Go ahead with this patch, and modify r600 code to recompute the face
> culling mode when the _NEW_TRANSFORM_FEEDBACK dirty bit is set.
>
> (b) Don't apply this patch, and instead move RasterDiscard from
> gl_context::TransformFeedback to gl_context::Transform, so that we restore
> the one-to-one correspondence between dirty bits and substructures of
> gl_context.
>
> (c) Do nothing, and rely on programmers to remember that RasterDiscard is an
> exception to the usual correspondence between dirty bits and substructures
> of gl_context.
>
> I'm really not comfortable with (c) because of the risk of future bugs.  I
> suppose I could be talked into (b) if there's popular support for it, but
> it's not my favourite, because as I said earlier, I think there are actually
> a lot of good reasons to think of rasterizer discard as related to transform
> feedback.  My preference is to do (a).

(d) Rework the _NEW_* flags such that they roughly match hardware
state groups, not OpenGL state groups. Direct3D 11 and Gallium are two
examples of how it could be done.

I am for (b) or (d). I would have nothing against (a) if TFB buffer
bindings were not covered by the same flag. It's mainly about the
overhead of state changes, although I admitted there are r600-related
reasons too. Also, Gallium will have rasterizer_discard in the rasterizer
state (once the patches hit master) - that can be changed though.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa, gallium: add a cap for GPUs without unified color+generic varying slots

2011-12-14 Thread Ian Romanick

On 12/14/2011 01:47 PM, Marek Olšák wrote:

On Wed, Dec 14, 2011 at 6:30 PM, Ian Romanick  wrote:

On 12/13/2011 04:32 PM, Marek Olšák wrote:


---
  src/gallium/drivers/r300/r300_screen.c |3 ++-
  src/gallium/include/pipe/p_defines.h   |3 ++-
  src/glsl/linker.cpp|6 ++
  src/mesa/main/mtypes.h |9 +
  src/mesa/state_tracker/st_extensions.c |3 +++
  5 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_screen.c
b/src/gallium/drivers/r300/r300_screen.c
index 0bae065..a761939 100644
--- a/src/gallium/drivers/r300/r300_screen.c
+++ b/src/gallium/drivers/r300/r300_screen.c
@@ -102,6 +102,7 @@ static int r300_get_param(struct pipe_screen* pscreen,
enum pipe_cap param)
  case PIPE_CAP_TEXTURE_BARRIER:
  case PIPE_CAP_TGSI_CAN_COMPACT_VARYINGS:
  case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
+case PIPE_CAP_SEPARATE_COLOR_VARYINGS:
  return 1;

  /* r300 cannot do swizzling of compressed textures. Supported
otherwise. */
@@ -182,7 +183,7 @@ static int r300_get_shader_param(struct pipe_screen
*pscreen, unsigned shader, e
   * R500 has the ability to turn 3rd and 4th color into
   * additional texcoords but there is no two-sided color
   * selection then. However the facing bit can be used
instead. */
-return 10;
+return 8;



Why are you trying to make r300g behave differently the every other driver
that has existed for that same hardware?  This doesn't make any sense.  This
is not Apple's driver works for r300 hardware, and it's not how AMD's
Windows driver works (worked, anyway) for r300 hardware.

This feels like an unnecessary hack.


Hi Ian,

Simple. I cannot pass the test glsl-max-varyings if I report 40
varying components, because I have only 32 texcoord components + 8
color components. I could re-use the color varyings, but it's not full
single-precision. r300 uses a fixed-point type S3.12 for color
interpolators, which can represent values in the range [-7.,
7.]. That's unusable for used-defined varyings. r500 uses a 20-bit
float, which is better, but I am not sure if it's good enough.


The desktop GL spec isn't very specific about range or precision before 
either a late 3.x version or one of the 4.x versions (I don't recall 
which).  However, I'm pretty sure ES2 requires 24-bit float.  Is 20-bits 
even a real computer number? :)


This also means that r300 and r500 can't handle unclamped colors. 
Right?  I'm thinking of the scenario:


 - The driver advertises 32 varying components.

 - The application calls glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB, 
GL_FALSE).


 - The shader uses gl_Color and 32 components worth of other varyings.

What happens?  Are the colors partially clamped or what?  Since color 
clamping is set independent of compiling or linking, you don't have an 
opportunity to generate any errors up front.


It sounds like the colors have to go through different interpolators 
anyway if glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB, GL_FALSE) is used. 
 Is that handled correctly?


Now I'm really curious to see glsl-max-varyings run on an Apple system 
with an r300.  Scouting around, that looks like it would have to be a G5 
iMac or PowerMac.  Hrm...



Also, r500 actually has 10 texcoord interpolators, but we don't use
the last two yet (it's non-trivial, there is a special PS3 mode for
it). Whether color interpolators can be enabled in that mode is
undocumented, though I am almost sure the back colors cannot be. (the
DX9 ps_3_0 shader profile doesn't have color inputs at all)


If colors are counted, you don't need to worry about that.  40 varyings 
means 40 varyings.  Clamped colors would go to the usual slots (leaving 
2 texcoords unused), and clamped colors would go in two of the texcoord 
slots.



Yes, the patch is a hack. However modifying glsl-max-varyings to only
test MAX_VARYING_FLOATS - 8 doesn't feel right either. Do you have a
better idea?

What about the other patch?


I need to do some research about that.  I'm pretty sure i915 needs a 
slot for WPOS.  I want to collect some more data and see if there's a 
coherent architecture that works for the various platforms.  We're 
starting to get a big pile of band-aids, and that can't hold up in the 
long run.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/8] mesa: Track changes to transform feedback state.

2011-12-14 Thread Paul Berry
On 14 December 2011 13:54, Marek Olšák  wrote:

> On Wed, Dec 7, 2011 at 8:09 PM, Paul Berry 
> wrote:
> > This patch adds a new bit to the ctx->NewState bitfield,
> > _NEW_TRANSFORM_FEEDBACK, to track state changes that affect
> > ctx->TransformFeedback.  This bit can be used by driver back-ends to
> > avoid expensive recomputations when transform feedback state has not
> > been modified.
> >
> > Reviewed-by: Kenneth Graunke 
> > ---
> >  src/mesa/main/mtypes.h|1 +
> >  src/mesa/main/transformfeedback.c |9 -
> >  2 files changed, 9 insertions(+), 1 deletions(-)
> >
> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> > index 33b00c6..fc494f7 100644
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -3051,6 +3051,7 @@ struct gl_matrix_stack
> >  #define _NEW_PROGRAM_CONSTANTS (1 << 27)
> >  #define _NEW_BUFFER_OBJECT (1 << 28)
> >  #define _NEW_FRAG_CLAMP(1 << 29)
> > +#define _NEW_TRANSFORM_FEEDBACK (1 << 30) /**<
> gl_context::TransformFeedback */
> >  #define _NEW_ALL ~0
> >  /*@}*/
> >
> > diff --git a/src/mesa/main/transformfeedback.c
> b/src/mesa/main/transformfeedback.c
> > index 11abd03..799245d 100644
> > --- a/src/mesa/main/transformfeedback.c
> > +++ b/src/mesa/main/transformfeedback.c
> > @@ -376,6 +376,7 @@ _mesa_BeginTransformFeedback(GLenum mode)
> >   return;
> >}
> >
> > +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
> >obj->Active = GL_TRUE;
> >ctx->TransformFeedback.Mode = mode;
> >
> > @@ -398,6 +399,7 @@ _mesa_EndTransformFeedback(void)
> >   return;
> >}
> >
> > +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
> >ctx->TransformFeedback.CurrentObject->Active = GL_FALSE;
> >
> >assert(ctx->Driver.EndTransformFeedback);
> > @@ -415,6 +417,7 @@ bind_buffer_range(struct gl_context *ctx, GLuint
> index,
> >  {
> >struct gl_transform_feedback_object *obj =
> >   ctx->TransformFeedback.CurrentObject;
> > +   FLUSH_VERTICES(ctx, _NEW_TRANSFORM_FEEDBACK);
> >
> >/* The general binding point */
> >_mesa_reference_buffer_object(ctx,
>
> Sorry for the late reply.
>
> There's no need to set _NEW_TRANSFORM_FEEDBACK in bind_buffer_range,
> because it's not allowed to change buffer bindings for an active TFB
> object. Buffer bindings can only be changed for inactive TFB objects.
> And because they're inactive, the drivers don't need to know about it.
> The TFB buffers are effectively bound through BeginTransformFeedback
> and unbound through EndTransformFeedback (likewise for Resume and
> Pause).
>
> Marek
>

Ah, you're right.  I missed this subtlety.  I've already landed this patch,
so I'll submit a follow-up patch that replaces the unnecessary calls to
FLUSH_VERTICES() with comments explaining why a flush is unnecessary.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/14] i965/vs: Implement EXT_texture_swizzle support for VS texturing.

2011-12-14 Thread Kenneth Graunke
On 12/13/2011 01:19 PM, Eric Anholt wrote:
> On Thu,  8 Dec 2011 17:08:04 -0800, Kenneth Graunke  
> wrote:
>> Signed-off-by: Kenneth Graunke 
>> +   if (one_mask) {
>> +  swizzled_result.writemask = one_mask;
>> +  emit(MOV(swizzled_result, src_reg(1.0f)));
>> +   }
>>  }
> 
> I think this would be wrong for SWIZZLE_ONE of integer textures.

My tests indicate that it works.  swizzled_result gets the appropriate
register type, so this ends up being something like:

mov(8)   m4<1>.ywUD1F{ align16 WE_normal 1Q };

and the MOV instruction does the float->int conversion for us.

Is that okay?  If not, what would you prefer?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Paul Berry
On 14 December 2011 15:00, Marek Olšák  wrote:

> On Wed, Dec 14, 2011 at 11:25 PM, Paul Berry 
> wrote:
> > On 14 December 2011 13:42, Marek Olšák  wrote:
> >>
> >> I think RASTERIZER_DISCARD has nothing to do with transform feedback.
> >> I think it's part of the same spec because it's not useful without it.
> >> As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
> >> feedback buffer bindings are changed or just enabled/disabled. On the
> >> other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
> >> it should fall into the same category as face culling for example. (I
> >> even implemented it using face culling on r600)
> >>
> >> Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
> >> changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.
> >>
> >> Marek
> >
> >
> > I see where you are coming from--I could have implemented rasterizer
> discard
> > on i965 gen6 in the same way.
> >
> > However, I think there are three compelling reasons to consider
> rasterizer
> > discard to be related to transform feedback:
> >
> > (1) from a user perspective, it really only makes sense to use rasterizer
> > discard when transform feedback is active.  Thus, it's highly likely that
> > when the rasterizer discard state is changed, transform feedback settings
> > will be changed too.
> >
> > (2) rasterizer discard functionality is described by the same set of
> > extensions that enable transform feedback (e.g. EXT_transform_feedback),
> so
> > presumably the inventors of these two features thought they were closely
> > related.
> >
> > (3) the enable bit that Mesa uses to keep track of the state of
> rasterizer
> > discard is in gl_context::TransformFeedback, not gl_context::Transform.
> >
> > Item (3) is the most important to me.  One of the scarier things about
> i965
> > driver development is that we have to manually keep track of which
> hardware
> > configuration commands correspond to which dirty bits.  If we miss a
> > dependency, that causes a subtle bug in which a state change might not
> cause
> > the hardware state to be updated properly.  These kinds of bugs are not
> well
> > exposed by Piglit tests, because most Piglit tests make a large number of
> > state changes followed by a draw operation, so a missing dependency might
> > easily go undetected.  The thing that allows me to sleep at night is that
> > there is a nice one-to-one correspondence between almost all of the dirty
> > bits and corresponding substructures of gl_context.  For example, the
> > _NEW_MODELVIEW dirty bit corresponds to gl_context::ModelView,
> > _NEW_PROJECTION corresponds to gl_context::Projection, and so on.  That
> > means any time I am worried that I'm not handling dirty bits correctly, I
> > can visually inspect the code and make sure that the dirty bits I'm
> > consulting match up with which elements of gl_context I'm accessing.  If
> we
> > leave the code as is, then there's a big undocumented exception to that
> > one-to-one correspondence, wherein
> > gl_context::TransformFeedback.RasterDiscard is covered by _NEW_TRANSFORM,
> > not _NEW_TRANSFORM_FEEDBACK, as one would logically guess based on where
> > it's located within gl_context.  I'm not confident that I (or other
> > developers) will remember this exception once we're no longer in the
> thick
> > of implementing transform feedback and rasterizer discard.
> >
> > It seems to me that we have three possible approaches to choose from
> here:
> >
> > (a) Go ahead with this patch, and modify r600 code to recompute the face
> > culling mode when the _NEW_TRANSFORM_FEEDBACK dirty bit is set.
> >
> > (b) Don't apply this patch, and instead move RasterDiscard from
> > gl_context::TransformFeedback to gl_context::Transform, so that we
> restore
> > the one-to-one correspondence between dirty bits and substructures of
> > gl_context.
> >
> > (c) Do nothing, and rely on programmers to remember that RasterDiscard
> is an
> > exception to the usual correspondence between dirty bits and
> substructures
> > of gl_context.
> >
> > I'm really not comfortable with (c) because of the risk of future bugs.
> I
> > suppose I could be talked into (b) if there's popular support for it, but
> > it's not my favourite, because as I said earlier, I think there are
> actually
> > a lot of good reasons to think of rasterizer discard as related to
> transform
> > feedback.  My preference is to do (a).
>
> (d) Rework the _NEW_* flags such that they roughly match hardware
> state groups, not OpenGL state groups. Direct3D 11 and Gallium are two
> examples of how it could be done.
>
> I am for (b) or (d). I would have nothing against (a) if TFB buffer
> bindings were not covered by the same flag. It's mainly about the
> overhead of state changes, although I admitted there are r600-related
> reasons too. Also, Gallium will have rasterizer_discard in the rasterizer
> state (once the patches hit master) - that can be changed though.
>
> Marek
>

Re: [Mesa-dev] [PATCH 1/2] mesa, gallium: add a cap for GPUs without unified color+generic varying slots

2011-12-14 Thread Marek Olšák
On Thu, Dec 15, 2011 at 12:01 AM, Ian Romanick  wrote:
>> Simple. I cannot pass the test glsl-max-varyings if I report 40
>> varying components, because I have only 32 texcoord components + 8
>> color components. I could re-use the color varyings, but it's not full
>> single-precision. r300 uses a fixed-point type S3.12 for color
>> interpolators, which can represent values in the range [-7.,
>> 7.]. That's unusable for used-defined varyings. r500 uses a 20-bit
>> float, which is better, but I am not sure if it's good enough.
>
>
> The desktop GL spec isn't very specific about range or precision before
> either a late 3.x version or one of the 4.x versions (I don't recall which).
>  However, I'm pretty sure ES2 requires 24-bit float.  Is 20-bits even a real
> computer number? :)

I did not design the hardware. The 20-bit float interpolation might
have been faster than 32-bit or even 24-bit, who knows.

>
> This also means that r300 and r500 can't handle unclamped colors. Right?

r300 cannot handle unclamped colors outside of the range [-8, 8]. For
that reason, ARB_color_buffer_float is not exposed on r300. r500 can
handle unclamped colors, but only with 20 bits of precision (probably
m14e6 or something like that).

>  I'm thinking of the scenario:
>
>  - The driver advertises 32 varying components.
>
>  - The application calls glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB,
> GL_FALSE).
>
>  - The shader uses gl_Color and 32 components worth of other varyings.
>
> What happens?  Are the colors partially clamped or what?  Since color
> clamping is set independent of compiling or linking, you don't have an
> opportunity to generate any errors up front.
>
> It sounds like the colors have to go through different interpolators anyway
> if glClampColorARB(GL_CLAMP_VERTEX_COLOR_ARB, GL_FALSE) is used.  Is that
> handled correctly?

Like I said, ARB_color_buffer_float is not exposed on r300-r400, so
the colors are always clamped.

>
> Now I'm really curious to see glsl-max-varyings run on an Apple system with
> an r300.  Scouting around, that looks like it would have to be a G5 iMac or
> PowerMac.  Hrm...
>
>
>> Also, r500 actually has 10 texcoord interpolators, but we don't use
>> the last two yet (it's non-trivial, there is a special PS3 mode for
>> it). Whether color interpolators can be enabled in that mode is
>> undocumented, though I am almost sure the back colors cannot be. (the
>> DX9 ps_3_0 shader profile doesn't have color inputs at all)
>
>
> If colors are counted, you don't need to worry about that.  40 varyings
> means 40 varyings.  Clamped colors would go to the usual slots (leaving 2
> texcoords unused), and clamped colors would go in two of the texcoord slots.
>
>
>> Yes, the patch is a hack. However modifying glsl-max-varyings to only
>> test MAX_VARYING_FLOATS - 8 doesn't feel right either. Do you have a
>> better idea?
>>
>> What about the other patch?
>
>
> I need to do some research about that.  I'm pretty sure i915 needs a slot
> for WPOS.  I want to collect some more data and see if there's a coherent
> architecture that works for the various platforms.  We're starting to get a
> big pile of band-aids, and that can't hold up in the long run.

Right. I used the ps_3_0 spec, which requires 10 varying vectors +
face + wpos.xy (yes two channels only). That matches r500, which has
dedicated shader inputs for face + wpos.xy. However, OpenGL also
requires wpos.zw, so r300g emulates wpos through another texcoord
anyway, but SM4 and later hardware should not have that limitation.

The point of these patches is that if we want the linker to fail when
it should, we better calculate occupied resources correctly, otherwise
we may regret this one day.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Marek Olšák
On Thu, Dec 15, 2011 at 12:18 AM, Paul Berry  wrote:
> On 14 December 2011 15:00, Marek Olšák  wrote:
>>
>> On Wed, Dec 14, 2011 at 11:25 PM, Paul Berry 
>> wrote:
>> > On 14 December 2011 13:42, Marek Olšák  wrote:
>> >>
>> >> I think RASTERIZER_DISCARD has nothing to do with transform feedback.
>> >> I think it's part of the same spec because it's not useful without it.
>> >> As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
>> >> feedback buffer bindings are changed or just enabled/disabled. On the
>> >> other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
>> >> it should fall into the same category as face culling for example. (I
>> >> even implemented it using face culling on r600)
>> >>
>> >> Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
>> >> changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.
>> >>
>> >> Marek
>> >
>> >
>> > I see where you are coming from--I could have implemented rasterizer
>> > discard
>> > on i965 gen6 in the same way.
>> >
>> > However, I think there are three compelling reasons to consider
>> > rasterizer
>> > discard to be related to transform feedback:
>> >
>> > (1) from a user perspective, it really only makes sense to use
>> > rasterizer
>> > discard when transform feedback is active.  Thus, it's highly likely
>> > that
>> > when the rasterizer discard state is changed, transform feedback
>> > settings
>> > will be changed too.
>> >
>> > (2) rasterizer discard functionality is described by the same set of
>> > extensions that enable transform feedback (e.g. EXT_transform_feedback),
>> > so
>> > presumably the inventors of these two features thought they were closely
>> > related.
>> >
>> > (3) the enable bit that Mesa uses to keep track of the state of
>> > rasterizer
>> > discard is in gl_context::TransformFeedback, not gl_context::Transform.
>> >
>> > Item (3) is the most important to me.  One of the scarier things about
>> > i965
>> > driver development is that we have to manually keep track of which
>> > hardware
>> > configuration commands correspond to which dirty bits.  If we miss a
>> > dependency, that causes a subtle bug in which a state change might not
>> > cause
>> > the hardware state to be updated properly.  These kinds of bugs are not
>> > well
>> > exposed by Piglit tests, because most Piglit tests make a large number
>> > of
>> > state changes followed by a draw operation, so a missing dependency
>> > might
>> > easily go undetected.  The thing that allows me to sleep at night is
>> > that
>> > there is a nice one-to-one correspondence between almost all of the
>> > dirty
>> > bits and corresponding substructures of gl_context.  For example, the
>> > _NEW_MODELVIEW dirty bit corresponds to gl_context::ModelView,
>> > _NEW_PROJECTION corresponds to gl_context::Projection, and so on.  That
>> > means any time I am worried that I'm not handling dirty bits correctly,
>> > I
>> > can visually inspect the code and make sure that the dirty bits I'm
>> > consulting match up with which elements of gl_context I'm accessing.  If
>> > we
>> > leave the code as is, then there's a big undocumented exception to that
>> > one-to-one correspondence, wherein
>> > gl_context::TransformFeedback.RasterDiscard is covered by
>> > _NEW_TRANSFORM,
>> > not _NEW_TRANSFORM_FEEDBACK, as one would logically guess based on where
>> > it's located within gl_context.  I'm not confident that I (or other
>> > developers) will remember this exception once we're no longer in the
>> > thick
>> > of implementing transform feedback and rasterizer discard.
>> >
>> > It seems to me that we have three possible approaches to choose from
>> > here:
>> >
>> > (a) Go ahead with this patch, and modify r600 code to recompute the face
>> > culling mode when the _NEW_TRANSFORM_FEEDBACK dirty bit is set.
>> >
>> > (b) Don't apply this patch, and instead move RasterDiscard from
>> > gl_context::TransformFeedback to gl_context::Transform, so that we
>> > restore
>> > the one-to-one correspondence between dirty bits and substructures of
>> > gl_context.
>> >
>> > (c) Do nothing, and rely on programmers to remember that RasterDiscard
>> > is an
>> > exception to the usual correspondence between dirty bits and
>> > substructures
>> > of gl_context.
>> >
>> > I'm really not comfortable with (c) because of the risk of future bugs.
>> > I
>> > suppose I could be talked into (b) if there's popular support for it,
>> > but
>> > it's not my favourite, because as I said earlier, I think there are
>> > actually
>> > a lot of good reasons to think of rasterizer discard as related to
>> > transform
>> > feedback.  My preference is to do (a).
>>
>> (d) Rework the _NEW_* flags such that they roughly match hardware
>> state groups, not OpenGL state groups. Direct3D 11 and Gallium are two
>> examples of how it could be done.
>>
>> I am for (b) or (d). I would have nothing against (a) if TFB buffer
>> bindings were not covered by the same flag. It's 

Re: [Mesa-dev] [PATCH 1/2] mesa: Set the correct ctx->NewState bitfield for rasterizer discard.

2011-12-14 Thread Kenneth Graunke
On 12/14/2011 04:10 PM, Marek Olšák wrote:
> On Thu, Dec 15, 2011 at 12:18 AM, Paul Berry  wrote:
>> On 14 December 2011 15:00, Marek Olšák  wrote:
>>>
>>> On Wed, Dec 14, 2011 at 11:25 PM, Paul Berry 
>>> wrote:
 On 14 December 2011 13:42, Marek Olšák  wrote:
>
> I think RASTERIZER_DISCARD has nothing to do with transform feedback.
> I think it's part of the same spec because it's not useful without it.
> As I understand it, _NEW_TRANSFORM_FEEDBACK is dirty when transform
> feedback buffer bindings are changed or just enabled/disabled. On the
> other hand, RASTERIZER_DISCARD enables or disables the rasterizer, so
> it should fall into the same category as face culling for example. (I
> even implemented it using face culling on r600)
>
> Also there would be no way to know whether _NEW_TRANSFORM_FEEDBACK
> changes just TFB buffer bindings, or just RASTERIZER_DISCARD, or both.
>
> Marek


 I see where you are coming from--I could have implemented rasterizer
 discard
 on i965 gen6 in the same way.

 However, I think there are three compelling reasons to consider
 rasterizer
 discard to be related to transform feedback:

 (1) from a user perspective, it really only makes sense to use
 rasterizer
 discard when transform feedback is active.  Thus, it's highly likely
 that
 when the rasterizer discard state is changed, transform feedback
 settings
 will be changed too.

 (2) rasterizer discard functionality is described by the same set of
 extensions that enable transform feedback (e.g. EXT_transform_feedback),
 so
 presumably the inventors of these two features thought they were closely
 related.

 (3) the enable bit that Mesa uses to keep track of the state of
 rasterizer
 discard is in gl_context::TransformFeedback, not gl_context::Transform.

 Item (3) is the most important to me.  One of the scarier things about
 i965
 driver development is that we have to manually keep track of which
 hardware
 configuration commands correspond to which dirty bits.  If we miss a
 dependency, that causes a subtle bug in which a state change might not
 cause
 the hardware state to be updated properly.  These kinds of bugs are not
 well
 exposed by Piglit tests, because most Piglit tests make a large number
 of
 state changes followed by a draw operation, so a missing dependency
 might
 easily go undetected.  The thing that allows me to sleep at night is
 that
 there is a nice one-to-one correspondence between almost all of the
 dirty
 bits and corresponding substructures of gl_context.  For example, the
 _NEW_MODELVIEW dirty bit corresponds to gl_context::ModelView,
 _NEW_PROJECTION corresponds to gl_context::Projection, and so on.  That
 means any time I am worried that I'm not handling dirty bits correctly,
 I
 can visually inspect the code and make sure that the dirty bits I'm
 consulting match up with which elements of gl_context I'm accessing.  If
 we
 leave the code as is, then there's a big undocumented exception to that
 one-to-one correspondence, wherein
 gl_context::TransformFeedback.RasterDiscard is covered by
 _NEW_TRANSFORM,
 not _NEW_TRANSFORM_FEEDBACK, as one would logically guess based on where
 it's located within gl_context.  I'm not confident that I (or other
 developers) will remember this exception once we're no longer in the
 thick
 of implementing transform feedback and rasterizer discard.

 It seems to me that we have three possible approaches to choose from
 here:

 (a) Go ahead with this patch, and modify r600 code to recompute the face
 culling mode when the _NEW_TRANSFORM_FEEDBACK dirty bit is set.

 (b) Don't apply this patch, and instead move RasterDiscard from
 gl_context::TransformFeedback to gl_context::Transform, so that we
 restore
 the one-to-one correspondence between dirty bits and substructures of
 gl_context.

 (c) Do nothing, and rely on programmers to remember that RasterDiscard
 is an
 exception to the usual correspondence between dirty bits and
 substructures
 of gl_context.

 I'm really not comfortable with (c) because of the risk of future bugs.
 I
 suppose I could be talked into (b) if there's popular support for it,
 but
 it's not my favourite, because as I said earlier, I think there are
 actually
 a lot of good reasons to think of rasterizer discard as related to
 transform
 feedback.  My preference is to do (a).
>>>
>>> (d) Rework the _NEW_* flags such that they roughly match hardware
>>> state groups, not OpenGL state groups. Direct3D 11 and Gallium are two
>>> examples of how it could be done.
>>>
>>> I am for (b) or (d). I would have nothing against (a) if TFB buff

Re: [Mesa-dev] [PATCH 1/2] st/egl: Add support for EGL_NOK_swap_region

2011-12-14 Thread Chia-I Wu
On Thu, Dec 15, 2011 at 4:06 AM, Fredrik Höglund  wrote:
> Backends indicate that they support this extension by returning
> EGL_TRUE when native_display::get_param() is called with
> NATIVE_PARAM_PRESENT_REGION and NATIVE_PARAM_PRESERVE_BUFFER.
>
> native_present_control is extended to include the region that should
> be presented. When native_present_control::num_rects is zero,
> the whole surface is to be presented.
>
> Signed-off-by: Fredrik Höglund 
Committed.  Thanks.
> ---
>  src/gallium/state_trackers/egl/common/egl_g3d.c    |    6 
>  .../state_trackers/egl/common/egl_g3d_api.c        |   30 ++-
>  src/gallium/state_trackers/egl/common/native.h     |   13 -
>  3 files changed, 46 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/state_trackers/egl/common/egl_g3d.c 
> b/src/gallium/state_trackers/egl/common/egl_g3d.c
> index 182ce68..53811b8 100644
> --- a/src/gallium/state_trackers/egl/common/egl_g3d.c
> +++ b/src/gallium/state_trackers/egl/common/egl_g3d.c
> @@ -606,6 +606,12 @@ egl_g3d_initialize(_EGLDriver *drv, _EGLDisplay *dpy)
>       dpy->Extensions.WL_bind_wayland_display = EGL_TRUE;
>  #endif
>
> +#ifdef EGL_NOK_swap_region
> +   if (gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESENT_REGION) &&
> +       gdpy->native->get_param(gdpy->native, NATIVE_PARAM_PRESERVE_BUFFER))
> +      dpy->Extensions.NOK_swap_region = EGL_TRUE;
> +#endif
> +
>    if (egl_g3d_add_configs(drv, dpy, 1) == 1) {
>       _eglError(EGL_NOT_INITIALIZED, "eglInitialize(unable to add configs)");
>       goto fail;
> diff --git a/src/gallium/state_trackers/egl/common/egl_g3d_api.c 
> b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
> index 911540e..2e3ead6 100644
> --- a/src/gallium/state_trackers/egl/common/egl_g3d_api.c
> +++ b/src/gallium/state_trackers/egl/common/egl_g3d_api.c
> @@ -546,7 +546,8 @@ egl_g3d_make_current(_EGLDriver *drv, _EGLDisplay *dpy,
>  }
>
>  static EGLBoolean
> -egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
> +swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
> +             EGLint num_rects, const EGLint *rects, EGLBoolean preserve)
>  {
>    struct egl_g3d_surface *gsurf = egl_g3d_surface(surf);
>    _EGLContext *ctx = _eglGetCurrentContext();
> @@ -572,14 +573,35 @@ egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, 
> _EGLSurface *surf)
>
>    memset(&ctrl, 0, sizeof(ctrl));
>    ctrl.natt = NATIVE_ATTACHMENT_BACK_LEFT;
> -   ctrl.preserve = (gsurf->base.SwapBehavior == EGL_BUFFER_PRESERVED);
> +   ctrl.preserve = preserve;
>    ctrl.swap_interval = gsurf->base.SwapInterval;
>    ctrl.premultiplied_alpha = (gsurf->base.VGAlphaFormat == 
> EGL_VG_ALPHA_FORMAT_PRE);
> +   ctrl.num_rects = num_rects;
> +   ctrl.rects = rects;
>
>    return gsurf->native->present(gsurf->native, &ctrl);
>  }
>
>  static EGLBoolean
> +egl_g3d_swap_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf)
> +{
> +   struct egl_g3d_surface *gsurf = egl_g3d_surface(surf);
> +
> +   return swap_buffers(drv, dpy, surf, 0, NULL,
> +                       (gsurf->base.SwapBehavior == EGL_BUFFER_PRESERVED));
> +}
> +
> +#ifdef EGL_NOK_swap_region
> +static EGLBoolean
> +egl_g3d_swap_buffers_region(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface 
> *surf,
> +                            EGLint num_rects, const EGLint *rects)
> +{
> +   /* Note: y=0=top */
> +   return swap_buffers(drv, dpy, surf, num_rects, rects, EGL_TRUE);
> +}
> +#endif /* EGL_NOK_swap_region */
> +
> +static EGLBoolean
>  egl_g3d_copy_buffers(_EGLDriver *drv, _EGLDisplay *dpy, _EGLSurface *surf,
>                      EGLNativePixmapType target)
>  {
> @@ -867,4 +889,8 @@ egl_g3d_init_driver_api(_EGLDriver *drv)
>    drv->API.CreateScreenSurfaceMESA = egl_g3d_create_screen_surface;
>    drv->API.ShowScreenSurfaceMESA = egl_g3d_show_screen_surface;
>  #endif
> +
> +#ifdef EGL_NOK_swap_region
> +   drv->API.SwapBuffersRegionNOK = egl_g3d_swap_buffers_region;
> +#endif
>  }
> diff --git a/src/gallium/state_trackers/egl/common/native.h 
> b/src/gallium/state_trackers/egl/common/native.h
> index ee24c22..312b079 100644
> --- a/src/gallium/state_trackers/egl/common/native.h
> +++ b/src/gallium/state_trackers/egl/common/native.h
> @@ -81,7 +81,13 @@ enum native_param_type {
>     * EGL_ALPHA_SIZE.  EGL_VG_ALPHA_FORMAT attribute of a surface will affect
>     * how the surface is presented.
>     */
> -   NATIVE_PARAM_PREMULTIPLIED_ALPHA
> +   NATIVE_PARAM_PREMULTIPLIED_ALPHA,
> +
> +   /**
> +    * Return TRUE if native_surface::present supports presenting a partial
> +    * surface.
> +    */
> +   NATIVE_PARAM_PRESENT_REGION
>  };
>
>  /**
> @@ -99,6 +105,11 @@ struct native_present_control {
>
>    /**< pixels use premultiplied alpha */
>    boolean premultiplied_alpha;
> +
> +   /**< The region to present. y=0=top.
> +        If num_rects is 0, the whole surface is to be presented */
> +   int num_rects;
> +   const int *rects; /* x, y, width, height */

Re: [Mesa-dev] [PATCH 1/3] egl: add EGL_NV_post_sub_buffer

2011-12-14 Thread Chia-I Wu
On Thu, Dec 15, 2011 at 5:21 AM, Ian Romanick  wrote:
> On 12/14/2011 12:24 PM, Fredrik Höglund wrote:
>>
>> v2: Handle EGL_POST_SUB_BUFFER_SUPPORTED_NV in
>>     _eglParseSurfaceAttribList()
>>
>> Signed-off-by: Fredrik Höglund
>> ---
>>  include/EGL/eglext.h      |    9 +
>>  src/egl/main/eglapi.c     |   24 
>>  src/egl/main/eglapi.h     |    8 
>>  src/egl/main/egldisplay.h |    2 ++
>>  src/egl/main/eglmisc.c    |    2 ++
>>  src/egl/main/eglsurface.c |   23 +++
>>  src/egl/main/eglsurface.h |    4 
>>  7 files changed, 72 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/EGL/eglext.h b/include/EGL/eglext.h
>> index 9484b83..d03a24d 100644
>> --- a/include/EGL/eglext.h
>> +++ b/include/EGL/eglext.h
>> @@ -144,6 +144,15 @@ typedef EGLImageKHR (EGLAPIENTRYP
>> PFNEGLCREATEDRMIMAGEMESA) (EGLDisplay dpy, con
>>  typedef EGLBoolean (EGLAPIENTRYP PFNEGLEXPORTDRMIMAGEMESA) (EGLDisplay
>> dpy, EGLImageKHR image, EGLint *name, EGLint *handle, EGLint *stride);
>>  #endif
>>
>> +#ifndef EGL_NV_post_sub_buffer
>> +#define EGL_NV_post_sub_buffer 1
>> +#define EGL_POST_SUB_BUFFER_SUPPORTED_NV       0x30BE
>> +#ifdef EGL_EGLEXT_PROTOTYPES
>> +EGLAPI EGLBoolean EGLAPIENTRY eglPostSubBufferNV(EGLDisplay dpy,
>> EGLSurface surface, EGLint x, EGLint y, EGLint width, EGLint height);
>> +#endif /* EGL_EGLEXT_PROTOTYPES */
>> +typedef EGLBoolean (EGLAPIENTRYP PFNEGLPOSTSUBBUFFERNVPROC) (EGLDisplay
>> dpy, EGLSurface surface, EGLint x, EGLint y, EGLint width, EGLint height);
>> +#endif
>> +
>>  #ifndef EGL_WL_bind_wayland_display
>>  #define EGL_WL_bind_wayland_display 1
>>
>
> I think we may be treating this file incorrectly.  It's my understanding
> that eglext.h comes from Khronos, and we don't get to modify it.  Our
> current eglext.h is based on version 7 (current version is 10), and it has
> diverged quite a bit.  I want to ping folks at Khronos and figure out how
> we're supposed to deal with this file.
That is great.  I always download the latest eglext.h and add our
changes to it when I need to update the file.

>
>> diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
>> index 3cb1a5b..ff15476 100644
>> --- a/src/egl/main/eglapi.c
>> +++ b/src/egl/main/eglapi.c
>> @@ -951,6 +951,9 @@ eglGetProcAddress(const char *procname)
>>  #ifdef EGL_ANDROID_swap_rectangle
>>        { "eglSetSwapRectangleANDROID", (_EGLProc)
>> eglSetSwapRectangleANDROID },
>>  #endif
>> +#ifdef EGL_NV_post_sub_buffer
>
>
> It seems weird that we define this in our own header file, then check to see
> if it's defined.  I understand that the new code is just following the
> pattern of the existing code.  Perhaps Kristian or Chia-I can shed some
> light on this convention.
The convention was established before I started working on EGL.  I
agree with you that those checks are unnecessary (and ugly..).  I will
remove them in a separate patch, if no one objects.

>> +      { "eglPostSubBufferNV", (_EGLProc) eglPostSubBufferNV },
>> +#endif
>>        { NULL, NULL }
>>     };
>>     EGLint i;
>> @@ -1590,3 +1593,24 @@ eglSetSwapRectangleANDROID(EGLDisplay dpy,
>> EGLSurface draw,
>>     RETURN_EGL_EVAL(disp, ret);
>>  }
>>  #endif
>> +
>> +#ifdef EGL_NV_post_sub_buffer
>> +EGLBoolean EGLAPIENTRY
>> +eglPostSubBufferNV(EGLDisplay dpy, EGLSurface surface,
>> +                   EGLint x, EGLint y, EGLint width, EGLint height)
>> +{
>> +   _EGLDisplay *disp = _eglLockDisplay(dpy);
>> +   _EGLSurface *surf = _eglLookupSurface(surface, disp);
>> +   _EGLDriver *drv;
>> +   EGLBoolean ret;
>> +
>> +   _EGL_CHECK_SURFACE(disp, surf, EGL_FALSE, drv);
>> +
>> +   if (!disp->Extensions.NV_post_sub_buffer)
>> +      RETURN_EGL_EVAL(disp, EGL_FALSE);
>> +
>> +   ret = drv->API.PostSubBufferNV(drv, disp, surf, x, y, width, height);
>> +
>> +   RETURN_EGL_EVAL(disp, ret);
>> +}
>> +#endif
>> diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
>> index 1e0aef6..f374273 100644
>> --- a/src/egl/main/eglapi.h
>> +++ b/src/egl/main/eglapi.h
>> @@ -135,6 +135,10 @@ typedef EGLBoolean
>> (*UnbindWaylandDisplayWL_t)(_EGLDriver *drv, _EGLDisplay *dis
>>  typedef EGLBoolean (*SetSwapRectangleANDROID_t)(_EGLDriver *drv,
>> _EGLDisplay *disp, _EGLSurface *draw, EGLint left, EGLint top, EGLint width,
>> EGLint height);
>>  #endif
>>
>> +#ifdef EGL_NV_post_sub_buffer
>> +typedef EGLBoolean (*PostSubBufferNV_t)(_EGLDriver *drv, _EGLDisplay
>> *disp, _EGLSurface *surface, EGLint x, EGLint y, EGLint width, EGLint
>> height);
>> +#endif
>> +
>>  /**
>>   * The API dispatcher jumps through these functions
>>   */
>> @@ -218,6 +222,10 @@ struct _egl_api
>>  #ifdef EGL_ANDROID_swap_rectangle
>>     SetSwapRectangleANDROID_t SetSwapRectangleANDROID;
>>  #endif
>> +
>> +#ifdef EGL_NV_post_sub_buffer
>> +   PostSubBufferNV_t PostSubBufferNV;
>> +#endif
>>  };
>>
>>  #endif /* EGLAPI_INCLUDED */
>> diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
>> index 67a2e24..8f737df 100644
>> --- a/

[Mesa-dev] [PATCH 1/2] linker: fix strdup memory leak

2011-12-14 Thread Pekka Paalanen
string_to_uint_map::put() already does a strdup() for the key argument,
so we leak the memory allocated by strdup() in link_uniforms.cpp.

Remove the extra strdup(), fixes a few Valgrind detected leaks.

Signed-off-by: Pekka Paalanen 
---
 src/glsl/link_uniforms.cpp |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index c7de480..f6094d7 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -174,8 +174,7 @@ private:
   if (this->map->get(id, name))
 return;
 
-  char *key = strdup(name);
-  this->map->put(this->num_active_uniforms, key);
+  this->map->put(this->num_active_uniforms, name);
 
   /* Each leaf uniform occupies one entry in the list of active
* uniforms.
-- 
1.7.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] mesa: free gl_uniform_storage::name

2011-12-14 Thread Pekka Paalanen
parcel_out_uniform_storage::visit_field() assigns a strdup()'d string
into gl_uniform_storage::name, but it is never freed.

Free gl_uniform_storage::name, fixes some Valgrind reported memory
leaks.

Signed-off-by: Pekka Paalanen 
---
 src/mesa/main/shaderobj.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index 454007f..2275430 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -39,6 +39,7 @@
 #include "program/prog_parameter.h"
 #include "program/hash_table.h"
 #include "ralloc.h"
+#include "../glsl/ir_uniform.h"
 
 /**/
 /*** Shader object functions***/
@@ -276,6 +277,9 @@ _mesa_clear_shader_program_data(struct gl_context *ctx,
 struct gl_shader_program *shProg)
 {
if (shProg->UniformStorage) {
+  unsigned i;
+  for (i = 0; i < shProg->NumUserUniformStorage; ++i)
+ free(shProg->UniformStorage[i].name);
   ralloc_free(shProg->UniformStorage);
   shProg->NumUserUniformStorage = 0;
   shProg->UniformStorage = NULL;
-- 
1.7.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev