date:20160908

Re: [Mesa-dev] [PATCH 2/3] egl: return corresponding offset of EGLImage instead of 0.

2016-09-08 Thread Weng, Chuanbo

From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of 
Axel Davy
Sent: Friday, September 9, 2016 2:12 PM
To: Weng, Chuanbo ; mesa-dev@lists.freedesktop.org; 
emil.l.veli...@gmail.com
Subject: Re: [Mesa-dev] [PATCH 2/3] egl: return corresponding offset of 
EGLImage instead of 0.

That doesn't seem good to me.

With that patch, that means that since no one is implementing

__DRI_IMAGE_ATTRIB_OFFSET

(yes I know in a later patch you implement it for i965),

then what used to work will stop working (as the queryImage will return false).

You need to introduce some interface version implementation check.

[Chuanbo] Maybe I can add a comment to the git log (such as "This patch
just implements the EGL loader side; the driver-side implementation is
also needed for the corresponding platform"), so users are aware of this.

Introducing an interface version check will make the Mesa code more
complex, because we would also need to add the related check to other
dri2 functions(dri2_).

Another solution is combining the three patches into one patch, as I did before:

https://lists.freedesktop.org/archives/mesa-dev/2016-August/126945.html

This is not as easy for reviewers as this version, but it is clearer for users.

Emil, what do you think?

No, that's not ok.

First, i965 isn't the only one to implement the DRI image interface (see the
gallium one); second, a new implementer doesn't have to start from the most
recent version and can choose to implement an older version, which wouldn't
implement your new functionality.

[Chuanbo] OK. I’ll send out a new version based on your comments. 
Thanks.

The code has to be something like:

if (offsets) {
   offsets[0] = 0;
   if (dri2_dpy->image->base.version >= 13)
      dri2_dpy->image->queryImage(dri2_img->dri_image,
                                  __DRI_IMAGE_ATTRIB_OFFSET, offsets);
}

Axel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
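
The version check Axel asks for in the message above can be sketched in isolation. This is only a minimal illustration, not Mesa code: every sketch_* name, the attribute value, and the struct layouts are hypothetical stand-ins for the DRI image extension. The only detail taken from the thread is that the offset query is gated on interface version 13 and that the offset defaults to 0 otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRI image extension; the names and the
 * attribute value are illustrative, not Mesa's real definitions. */
#define SKETCH_ATTRIB_OFFSET 1
#define SKETCH_MIN_VERSION_FOR_OFFSET 13

struct sketch_image {
   int offset;
};

struct sketch_image_extension {
   int version;
   bool (*query_image)(const struct sketch_image *img, int attrib,
                       int *value);
};

/* A driver-side query that knows about the offset attribute. */
static bool
sketch_query_image(const struct sketch_image *img, int attrib, int *value)
{
   if (attrib != SKETCH_ATTRIB_OFFSET)
      return false;
   *value = img->offset;
   return true;
}

/* Loader side: default to 0 and only ask drivers that are new enough, so
 * older implementations keep the behaviour they had before. */
static void
sketch_export_offset(const struct sketch_image_extension *ext,
                     const struct sketch_image *img, int *offsets)
{
   assert(ext != NULL && img != NULL);

   if (!offsets)
      return;

   offsets[0] = 0;
   if (ext->version >= SKETCH_MIN_VERSION_FOR_OFFSET)
      ext->query_image(img, SKETCH_ATTRIB_OFFSET, offsets);
}
```

With an extension reporting version 12 the caller still gets offset 0, which is the pre-patch behaviour; only a version 13 or newer extension is queried.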

Re: [Mesa-dev] [PATCH 07/57] i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes.

2016-09-08 Thread Iago Toral

On Thu, 2016-09-08 at 18:44 -0700, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > 
> > On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> > (...)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > index 12ab7b3..a678351 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> > > @@ -363,7 +363,7 @@ fs_generator::generate_fb_read(fs_inst *inst,
> > > struct brw_reg dst,
> > >    prog_data->binding_table.render_target_start + inst-
> > > >target;
> > >  
> > > gen9_fb_READ(p, dst, payload, surf_index,
> > > -inst->header_size, inst->regs_written,
> > > +inst->header_size, inst->size_written /
> > > REG_SIZE,
> > DIV_ROUND_UP?
> > 
> > > 
> > >  prog_data->persample_dispatch);
> > >  
> > > brw_mark_surface_used(&prog_data->base, surf_index);
> > > @@ -467,7 +467,7 @@ fs_generator::generate_urb_read(fs_inst
> > > *inst,
> > >    brw_inst_set_urb_per_slot_offset(p->devinfo, send, true);
> > >  
> > > brw_inst_set_mlen(p->devinfo, send, inst->mlen);
> > > -   brw_inst_set_rlen(p->devinfo, send, inst->regs_written);
> > > +   brw_inst_set_rlen(p->devinfo, send, inst->size_written /
> > > REG_SIZE);
> > DIV_ROUND_UP?
> > 
> > > 
> > > brw_inst_set_header_present(p->devinfo, send, true);
> > > > brw_inst_set_urb_global_offset(p->devinfo, send, inst->offset);
> > >  }
> > > @@ -895,7 +895,7 @@ fs_generator::generate_tex(fs_inst *inst,
> > > struct
> > > brw_reg dst, struct brw_reg src
> > >   surface + base_binding_table_index,
> > >   sampler % 16,
> > >   msg_type,
> > > - inst->regs_written,
> > > + inst->size_written / REG_SIZE,
> > DIV_ROUND_UP?
> > 
> > > 
> > >   inst->mlen,
> > >   inst->header_size != 0,
> > >   simd_mode,
> > > @@ -932,7 +932,7 @@ fs_generator::generate_tex(fs_inst *inst,
> > > struct
> > > brw_reg dst, struct brw_reg src
> > >    0 /* surface */,
> > >    0 /* sampler */,
> > >    msg_type,
> > > -  inst->regs_written,
> > > +  inst->size_written / REG_SIZE,
> > DIV_ROUND_UP?
> > 
> > > 
> > >    inst->mlen /* mlen */,
> > >    inst->header_size != 0 /* header
> > > */,
> > >    simd_mode,
> > > @@ -1263,7 +1263,7 @@
> > > fs_generator::generate_varying_pull_constant_load_gen4(fs_inst
> > > *inst,
> > > */
> > >    msg_type = BRW_SAMPLER_MESSAGE_SIMD16_LD;
> > >    assert(inst->mlen == 3);
> > > -  assert(inst->regs_written == 8);
> > > +  assert(inst->size_written == 8 * REG_SIZE);
> > >    rlen = 8;
> > >    simd_mode = BRW_SAMPLER_SIMD_MODE_SIMD16;
> > > }
> > > @@ -1408,7 +1408,7 @@
> > > fs_generator::generate_pixel_interpolator_query(fs_inst *inst,
> > >   msg_type,
> > >   msg_data,
> > >   inst->mlen,
> > > - inst->regs_written);
> > > + inst->size_written / REG_SIZE);
> > DIV_ROUND_UP?
> In all cases above you have the requirement that the amount of data
> written is an exact multiple of REG_SIZE, because SEND messages can
> only
> represent return payload sizes as an integer in GRF units, so if
> fs_inst::size_written ends up not being a multiple of REG_SIZE in any
> of
> these cases something has gone seriously wrong along the way.  Would
> you
> like me to sprinkle in some assertions to verify that?

Yeah, I guess adding a few assertions could make sense, thanks!

Iago
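
The kind of assertion discussed above could look roughly like the following. This is a sketch under stated assumptions rather than the code that ended up in the series: REG_SIZE is assumed to be 32 bytes, DIV_ROUND_UP is written out locally, and sketch_rlen_from_size_written is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed for illustration: a 32-byte GRF. */
#define REG_SIZE 32
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: convert a byte count into the response length of a
 * SEND message.  The return payload can only be expressed in whole GRFs,
 * so anything that is not a multiple of REG_SIZE indicates a bug upstream
 * and plain division is exact once the assertion holds. */
static unsigned
sketch_rlen_from_size_written(unsigned size_written)
{
   assert(size_written % REG_SIZE == 0);
   return size_written / REG_SIZE;
}
```

When the assertion holds, size_written / REG_SIZE and DIV_ROUND_UP(size_written, REG_SIZE) agree, which is why rounding up is not needed at these call sites.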

Re: [Mesa-dev] [v3 4/6] i965/rbc: Consult rb settings for texture surface setup

2016-09-08 Thread Pohjolainen, Topi

On Thu, Sep 08, 2016 at 08:49:56AM -0700, Jason Ekstrand wrote:
>On Sep 7, 2016 9:30 PM, "Pohjolainen, Topi"
><[1]topi.pohjolai...@gmail.com> wrote:
>>
>> On Wed, Sep 07, 2016 at 03:25:30PM -0700, Jason Ekstrand wrote:
>> >On Sep 7, 2016 10:24 AM, "Topi Pohjolainen"
>> ><[1][2]topi.pohjolai...@gmail.com> wrote:
>> >>
>> >> Once mcs buffer gets allocated without delay for lossless
>> >> compression (same as we do for msaa), one gets regression in:
>> >>
>> >> GL45-CTS.texture_barrier_ARB.same-texel-rw
>> >>
>> >> Setting the auxiliary surface for both sampling engine and
>data
>> >> port seems to fix this. I haven't found any hardware
>documentation
>> >> backing this though.
>> >>
>> >> v2 (Jason): Prepare also for the case where surface is sampled
>with
>> >> non-compressible format forcing also rendering
>without
>> >> compression.
>> >> v3: Split asserts and decision making.
>> >>
>> >> Signed-off-by: Topi Pohjolainen
><[2][3]topi.pohjolai...@intel.com>
>> >> ---
>> >>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 63
>> >+---
>> >>  1 file changed, 56 insertions(+), 7 deletions(-)
>> >>
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> >b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> >> index c1273c5..054c5c8 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> >> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> >> @@ -140,9 +140,7 @@ brw_emit_surface_state(struct brw_context
>*brw,
>> >> struct isl_surf *aux_surf = NULL, aux_surf_s;
>> >> uint64_t aux_offset = 0;
>> >> enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
>> >> -   if (mt->mcs_mt &&
>> >> -   ((view.usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
>> >> -mt->fast_clear_state !=
>INTEL_FAST_CLEAR_STATE_RESOLVED)) {
>> >> +   if (mt->mcs_mt && !(flags & INTEL_AUX_BUFFER_DISABLED)) {
>> >>intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s,
>> >&aux_usage);
>> >>aux_surf = &aux_surf_s;
>> >>assert(mt->mcs_mt->offset == 0);
>> >> @@ -425,6 +423,54 @@ swizzle_to_scs(GLenum swizzle, bool
>> >need_green_to_blue)
>> >> return (need_green_to_blue && scs == HSW_SCS_GREEN) ?
>> >HSW_SCS_BLUE : scs;
>> >>  }
>> >>
>> >> +static unsigned
>> >> +brw_find_matching_rb(const struct gl_framebuffer *fb,
>> >> + const struct intel_mipmap_tree *mt)
>> >> +{
>> >> +   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
>> >> +  const struct intel_renderbuffer *irb =
>> >> + intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>> >> +
>> >> +  if (irb->mt == mt)
>> >> + return i;
>> >> +   }
>> >> +
>> >> +   return fb->_NumColorDrawBuffers;
>> >> +}
>> >> +
>> >> +static bool
>> >> +brw_disable_aux_surface(const struct brw_context *brw,
>> >> +const struct intel_mipmap_tree *mt)
>> >> +{
>> >> +   /* Nothing to disable. */
>> >> +   if (!mt->mcs_mt)
>> >> +  return false;
>> >> +
>> >> +   /* There are special cases only for lossless compression.
>*/
>> >> +   if (!intel_miptree_is_lossless_compressed(brw, mt))
>> >> +  return mt->fast_clear_state ==
>> >INTEL_FAST_CLEAR_STATE_RESOLVED;
>> >> +
>> >> +   const struct gl_framebuffer *fb = brw->ctx.DrawBuffer;
>> >> +   const unsigned rb_index = brw_find_matching_rb(fb, mt);
>> >> +
>> >> +   /* In practise it looks that setting the same lossless
>compressed
>> >surface
>> >> +* to be sampled without auxiliary surface and to be
>written with
>> >auxiliary
>> >> +* surface confuses the hardware. Therefore sampler engine
>must
>> >be provided
>> >> +* with auxiliary buffer regardless of the fast clear
>state if
>> >the same
>> >> +* surface is also going to be written during the same
>rendering
>> >pass with
>> >> +* auxiliary buffer enabled.
>> >> +*/
>> >> +   if (rb_index < fb->_NumColorDrawBuffers) {
>> >> +  if (brw->draw_aux_buffer_disabled[rb_index]) {
>> >> + assert(mt->fast_clear_state ==
>> >INTEL_FAST_CLEAR_STATE_RESOLVED);
>> >> +  }
>> >> +
>> >> +  return brw->draw_aux_buffer_disabled[rb_index];
>> >
>> >

Re: [Mesa-dev] [PATCH 06/57] i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written.

2016-09-08 Thread Iago Toral

On Thu, 2016-09-08 at 18:32 -0700, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > 
> > On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> > > 
> > > This is in preparation for dropping vec4_instruction::regs_read
> > > and
> > > ::regs_written in favor of more accurate alternatives expressed
> > > in
> > > byte units.  The main reason these wrappers are useful is that a
> > > number of optimization passes implement dataflow analysis with
> > > register granularity, so these helpers will come in handy once
> > > we've
> > > switched register offsets and sizes to the byte
> > > representation.  The
> > > wrapper functions will also make sure that GRF misalignment
> > > (currently
> > > neglected by most of the back-end) is taken into account
> > > correctly in
> > > the calculation of regs_read and regs_written.
> > 
> > > This does not seem to replace all uses of regs_written and
> > > inst->regs_read() with these helpers. I am not sure if this was by
> > > design or by mistake but the consequence is that later patches
> > > still do a lot of things like:
> > > 
> > > - scan_inst->dst.offset / REG_SIZE + scan_inst->regs_written >
> > > + scan_inst->dst.offset / REG_SIZE +
> > >   DIV_ROUND_UP(scan_inst->size_written, REG_SIZE)
> > (this hunk is from the next patch in fs_visitor::compute_to_mrf(),
> > but
> > there are plenty more like this in that same patch)
> > 
> > which would have not been necessary if we just used the
> > regs_written()
> > helper here.
> > 
> The reason for the apparent inconsistency you've noticed here is that
> regs_written(inst) and DIV_ROUND_UP(inst.size_written, REG_SIZE),
> even
> though they look like synonyms at this point of the series, are
> intended
> to do different things (they don't yet, but they will once several
> fixes
> are applied to regs_written() after PATCH 16).  From the doxygen
> comment
> of regs_written():
> 
> > 
> > Return the number of dataflow registers written by the instruction
> > (either fully or partially) counted from 'floor(reg_offset(inst-
> > >dst)
> > / register_size)'.  The somewhat arbitrary register size unit is
> > 16B
> > for the UNIFORM and IMM files and 32B for all other files.
> IOW, regs_written() is expected to behave as if it partitioned the
> register file of inst->dst into 32B chunks starting from
> reg_offset(r)
> == 0, and returned how many of those chunks overlap the destination
> region of the instruction, which is not necessarily equivalent to the
> amount of data written by the instruction in register units (if e.g.
> the
> instruction writes exactly REG_SIZE bytes but the destination region
> starts mid-GRF, regs_written(inst) would be expected to return two,
> but
> DIV_ROUND_UP(inst.size_written, REG_SIZE) would return one).
> 
> The same goes for regs_read() vs DIV_ROUND_UP(size_read(), REG_SIZE).

Ah right, that makes sense.

> That said, you could argue that in the example you pasted above
> regs_written() would have been the more correct thing to do of the
> two.
> That's definitely the case, but I didn't bother to change it because
> I
> removed the whole condition anyway during the clean-up part of this
> series, since it was just a rather hairy open-coded version of
> region_contained_in().

I see, that's fine by me.
Thanks for the explanation!
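
The difference Francisco describes between counting overlapped registers and rounding up the byte count can be checked with a small standalone sketch. It assumes a 32-byte register and takes a plain byte offset and size instead of a vec4_instruction; sketch_regs_written is a hypothetical name and not the helper from the patch.

```c
#include <assert.h>

/* Assumed for illustration: a 32-byte GRF. */
#define REG_SIZE 32
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: number of REG_SIZE-sized chunks that a write of
 * `size` bytes starting at byte `offset` overlaps, fully or partially. */
static unsigned
sketch_regs_written(unsigned offset, unsigned size)
{
   if (size == 0)
      return 0;

   const unsigned first = offset / REG_SIZE;
   const unsigned last = (offset + size - 1) / REG_SIZE;
   return last - first + 1;
}
```

For a write of exactly REG_SIZE bytes starting mid-register (offset 16), sketch_regs_written(16, 32) counts two registers, whereas DIV_ROUND_UP(32, REG_SIZE) is one, matching the example in the quoted explanation.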

> > 
> > > 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_ir_vec4.h| 26
> > > ++
> > >  .../drivers/dri/i965/brw_schedule_instructions.cpp |  8 +++
> > >  src/mesa/drivers/dri/i965/brw_vec4.cpp |  4 ++--
> > >  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  6 ++---
> > >  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  6 ++---
> > >  .../drivers/dri/i965/brw_vec4_live_variables.cpp   |  8 +++
> > >  6 files changed, 42 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > > index 4f49428..a1a201b 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > > @@ -254,6 +254,32 @@ set_saturate(bool saturate, vec4_instruction
> > > *inst)
> > > return inst;
> > >  }
> > >  
> > > +/**
> > > + * Return the number of dataflow registers written by the
> > > instruction (either
> > > + * fully or partially) counted from 'floor(reg_offset(inst->dst) 
> > > /
> > > + * register_size)'.  The somewhat arbitrary register size unit
> > > is
> > > 16B for the
> > > + * UNIFORM and IMM files and 32B for all other files.
> > > + */
> > > +inline unsigned
> > > +regs_written(const vec4_instruction *inst)
> > > +{
> > > +   /* XXX - Take into account register-misaligned offsets
> > > correctly.
> > > */
> > > +   return inst->regs_written;
> > > +}
> > > +
> > > +/**
> > > + * Return the number of dataflow registers read by the
> > > instruction
> > > (either
> > > + * fully or partially) counted from 'floor(reg_offset(inst-
> > > >src[i])
> > > /
> > > +

Re: [Mesa-dev] [PATCH 2/3] egl: return corresponding offset of EGLImage instead of 0.

2016-09-08 Thread Axel Davy




That doesn't seem good to me.

With that patch, that means that since no one is implementing

__DRI_IMAGE_ATTRIB_OFFSET

(yes I know in a later patch you implement it for i965),

then what used to work will stop working (as the queryImage will return false).

You need to introduce some interface version implementation check.
[Chuanbo] Maybe I can add a comment to the git log (such as "This patch
just implements the EGL loader side; the driver-side implementation is
also needed for the corresponding platform"), so users are aware of this.
Introducing an interface version check will make the Mesa code more
complex, because we would also need to add the related check to other
dri2 functions(dri2_).
Another solution is combining the three patches into one patch, as I did before:
https://lists.freedesktop.org/archives/mesa-dev/2016-August/126945.html
This is not as easy for reviewers as this version, but it is clearer for users.
Emil, what do you think?


No, that's not ok.

First, i965 isn't the only one to implement the DRI image interface (see
the gallium one); second, a new implementer doesn't have to start from
the most recent version and can choose to implement an older version,
which wouldn't implement your new functionality.



The code has to be something like:

if (offsets) {
   offsets[0] = 0;
   if (dri2_dpy->image->base.version >= 13)
      dri2_dpy->image->queryImage(dri2_img->dri_image,
                                  __DRI_IMAGE_ATTRIB_OFFSET, offsets);
}

Axel


Re: [Mesa-dev] [PATCH] vbo: Correctly handle attribute offsets in dlist draw.

2016-09-08 Thread Mathias Fröhlich

Hi,

On Wednesday, 24 August 2016 08:32:10 CEST mathias.froehl...@gmx.net wrote:
> From: Mathias Fröhlich 
> 
> Hi all,
> 
> kind of a ping with a rephrased commit message
> and the style change.
> Please review.

Ping

Thanks!

Mathias

> 
> Thanks!
> 
> Mathias
> 
> 
> When executing a display list draw, for the offset
> list to be correct, the offset computation needs to
> accumulate all attribute size values in order.
> Specifically, if we are shuffling around the position
> and generic0 attributes, we may violate the order or
> if we do not walk the generic vbo attributes we may
> skip some of the attributes.
> Even if this is an unlikely usecase we can fix this
> by precomputing the offsets on the full attribute list
> and store the full offset list in the display list node.
> 
> v2: Formatting fix
> 
> Signed-off-by: Mathias Fröhlich 
> ---
>  src/mesa/vbo/vbo_save.h  |  1 +
>  src/mesa/vbo/vbo_save_api.c  |  5 +
>  src/mesa/vbo/vbo_save_draw.c | 35 +--
>  3 files changed, 23 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
> index 2843b3c..a61973f 100644
> --- a/src/mesa/vbo/vbo_save.h
> +++ b/src/mesa/vbo/vbo_save.h
> @@ -64,6 +64,7 @@ struct vbo_save_vertex_list {
> GLbitfield64 enabled; /**< mask of enabled vbo arrays. */
> GLubyte attrsz[VBO_ATTRIB_MAX];
> GLenum attrtype[VBO_ATTRIB_MAX];
> +   GLushort offsets[VBO_ATTRIB_MAX];
> GLuint vertex_size;  /**< size in GLfloats */
>  
> /* Copy of the final vertex from node->vertex_store->bufferobj.
> diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
> index f648ccc..36af426 100644
> --- a/src/mesa/vbo/vbo_save_api.c
> +++ b/src/mesa/vbo/vbo_save_api.c
> @@ -415,6 +415,7 @@ _save_compile_vertex_list(struct gl_context *ctx)
>  {
> struct vbo_save_context *save = &vbo_context(ctx)->save;
> struct vbo_save_vertex_list *node;
> +   GLushort offset = 0;
>  
> /* Allocate space for this structure in the display list currently
>  * being compiled.
> @@ -436,6 +437,10 @@ _save_compile_vertex_list(struct gl_context *ctx)
> node->vertex_size = save->vertex_size;
> node->buffer_offset =
>(save->buffer - save->vertex_store->buffer) * sizeof(GLfloat);
> +   for (unsigned i = 0; i < VBO_ATTRIB_MAX; ++i) {
> +  node->offsets[i] = offset;
> +  offset += node->attrsz[i] * sizeof(GLfloat);
> +   }
> node->count = save->vert_count;
> node->wrap_count = save->copied.nr;
> node->dangling_attr_ref = save->dangling_attr_ref;
> diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
> index 507ab82..e69c108 100644
> --- a/src/mesa/vbo/vbo_save_draw.c
> +++ b/src/mesa/vbo/vbo_save_draw.c
> @@ -26,6 +26,7 @@
>   *Keith Whitwell 
>   */
>  
> +#include 
>  #include "main/glheader.h"
>  #include "main/bufferobj.h"
>  #include "main/context.h"
> @@ -136,15 +137,10 @@ static void vbo_bind_vertex_list(struct gl_context 
*ctx,
> struct vbo_context *vbo = vbo_context(ctx);
> struct vbo_save_context *save = &vbo->save;
> struct gl_client_array *arrays = save->arrays;
> -   GLuint buffer_offset = node->buffer_offset;
> const GLuint *map;
> GLuint attr;
> -   GLubyte node_attrsz[VBO_ATTRIB_MAX];  /* copy of node->attrsz[] */
> -   GLenum node_attrtype[VBO_ATTRIB_MAX];  /* copy of node->attrtype[] */
> GLbitfield64 varying_inputs = 0x0;
> -
> -   memcpy(node_attrsz, node->attrsz, sizeof(node->attrsz));
> -   memcpy(node_attrtype, node->attrtype, sizeof(node->attrtype));
> +   bool generic_from_pos = false;
>  
> /* Install the default (ie Current) attributes first, then overlay
>  * all active ones.
> @@ -176,10 +172,7 @@ static void vbo_bind_vertex_list(struct gl_context 
*ctx,
> */
>if ((ctx->VertexProgram._Current->Base.InputsRead & VERT_BIT_POS) == 
0 &&
>(ctx->VertexProgram._Current->Base.InputsRead & 
VERT_BIT_GENERIC0)) {
> - save->inputs[VERT_ATTRIB_GENERIC0] = save->inputs[0];
> - node_attrsz[VERT_ATTRIB_GENERIC0] = node_attrsz[0];
> - node_attrtype[VERT_ATTRIB_GENERIC0] = node_attrtype[0];
> - node_attrsz[0] = 0;
> + generic_from_pos = true;
>}
>break;
> default:
> @@ -188,30 +181,36 @@ static void vbo_bind_vertex_list(struct gl_context 
*ctx,
>  
> for (attr = 0; attr < VERT_ATTRIB_MAX; attr++) {
>const GLuint src = map[attr];
> +  const GLubyte size = node->attrsz[src];
>  
> -  if (node_attrsz[src]) {
> +  if (size) {
>   /* override the default array set above */
>   save->inputs[attr] = &arrays[attr];
>  
> -  arrays[attr].Ptr = (const GLubyte *) NULL + buffer_offset;
> -  arrays[attr].Size = node_attrsz[src];
> + const uintptr_t buffer_offset = node->buffer_offset;
> + arrays[attr].Ptr = ADD_POINTERS(buffer_offset, node->offsets[src]);
> + arrays[attr].Size = size;
>arrays[
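
The offset precomputation described in the commit message above amounts to a running sum over the per-attribute sizes, taken in attribute order. The following is a minimal sketch of that idea only: SKETCH_ATTRIB_MAX and sketch_compute_offsets are hypothetical names, plain C types stand in for the GL typedefs, and sizes are assumed to be counted in floats as in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical; stands in for VBO_ATTRIB_MAX. */
#define SKETCH_ATTRIB_MAX 4

/* Walk every attribute slot in order and record the byte offset at which
 * it starts.  Disabled attributes (size 0) still get an entry, so the
 * result stays valid however the attributes are later remapped. */
static void
sketch_compute_offsets(const unsigned char attrsz[SKETCH_ATTRIB_MAX],
                       unsigned short offsets[SKETCH_ATTRIB_MAX])
{
   unsigned short offset = 0;

   for (size_t i = 0; i < SKETCH_ATTRIB_MAX; ++i) {
      offsets[i] = offset;
      offset = (unsigned short)(offset + attrsz[i] * sizeof(float));
   }
}
```

Because the table is indexed by the source attribute, a lookup such as offsets[src] does not depend on the order in which the draw code visits the attributes.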

Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-08 Thread Pohjolainen, Topi

On Thu, Sep 08, 2016 at 11:01:45PM -0700, Jason Ekstrand wrote:
>On Sep 8, 2016 10:47 PM, "Pohjolainen, Topi"
><[1]topi.pohjolai...@gmail.com> wrote:
>>
>> On Thu, Sep 08, 2016 at 10:58:09AM -0700, Jason Ekstrand wrote:
>> >On Wed, Sep 7, 2016 at 1:16 PM, Jason Ekstrand
>> ><[1][2]ja...@jlekstrand.net> wrote:
>> >
>> >On Sep 7, 2016 10:45 AM, "Nanley Chery"
><[2][3]nanleych...@gmail.com>
>> >wrote:
>> >>
>> >> On Wed, Sep 07, 2016 at 10:26:25AM -0700, Jason Ekstrand
>wrote:
>> >> > On Wed, Sep 7, 2016 at 9:50 AM, Jason Ekstrand
>> ><[3][4]ja...@jlekstrand.net> wrote:
>> >> >
>> >> > > On Wed, Sep 7, 2016 at 9:36 AM, Nanley Chery
>> ><[4][5]nanleych...@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > >> On Tue, Sep 06, 2016 at 05:02:55PM -0700, Jason Ekstrand
>wrote:
>> >> > >> > On Tue, Sep 6, 2016 at 4:12 PM, Nanley Chery
>> ><[5][6]nanleych...@gmail.com>
>> >> > >> wrote:
>> >> > >> >
>> >> > >> > > On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason
>Ekstrand
>> >wrote:
>> >> > >> > > > ---
>> >> > >> > > >  src/intel/blorp/blorp.h  |  10 
>> >> > >> > > >  src/intel/blorp/blorp_blit.c | 133
>> >++
>> >> > >> > > +
>> >> > >> > > >  2 files changed, 143 insertions(+)
>> >> > >> > > >
>> >> > >> > > > diff --git a/src/intel/blorp/blorp.h
>> >b/src/intel/blorp/blorp.h
>> >> > >> > > > index c1e93fd..6574124 100644
>> >> > >> > > > --- a/src/intel/blorp/blorp.h
>> >> > >> > > > +++ b/src/intel/blorp/blorp.h
>> >> > >> > > > @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch
>*batch,
>> >> > >> > > > uint32_t filter, bool mirror_x, bool
>> >mirror_y);
>> >> > >> > > >
>> >> > >> > > >  void
>> >> > >> > > > +blorp_copy(struct blorp_batch *batch,
>> >> > >> > > > +   const struct blorp_surf *src_surf,
>> >> > >> > > > +   unsigned src_level, unsigned src_layer,
>> >> > >> > > > +   const struct blorp_surf *dst_surf,
>> >> > >> > > > +   unsigned dst_level, unsigned dst_layer,
>> >> > >> > > > +   uint32_t src_x, uint32_t src_y,
>> >> > >> > > > +   uint32_t dst_x, uint32_t dst_y,
>> >> > >> > > > +   uint32_t src_width, uint32_t
>src_height);
>> >> > >> > > > +
>> >> > >> > > > +void
>> >> > >> > > >  blorp_fast_clear(struct blorp_batch *batch,
>> >> > >> > > >   const struct blorp_surf *surf,
>> >> > >> > > >   uint32_t level, uint32_t layer,
>enum
>> >isl_format
>> >> > >> format,
>> >> > >> > > > diff --git a/src/intel/blorp/blorp_blit.c
>> >> > >> b/src/intel/blorp/blorp_blit.c
>> >> > >> > > > index 3ab39a3..42a502c 100644
>> >> > >> > > > --- a/src/intel/blorp/blorp_blit.c
>> >> > >> > > > +++ b/src/intel/blorp/blorp_blit.c
>> >> > >> > > > @@ -1685,3 +1685,136 @@ blorp_blit(struct
>blorp_batch
>> >*batch,
>> >> > >> > > >   dst_x0, dst_y0, dst_x1, dst_y1,
>> >> > >> > > >   mirror_x, mirror_y);
>> >> > >> > > >  }
>> >> > >> > > > +
>> >> > >> > > > +static enum isl_format
>> >> > >> > > > +get_copy_format_for_bpb(unsigned bpb)
>> >> > >> > > > +{
>> >> > >> > > > +   /* The choice of UNORM and UINT formats is very
>> >intentional
>> >> > >> here.
>> >> > >> > > Most of
>> >> > >> > > > +* the time, we want to use a UINT format to
>avoid any
>> >rounding
>> >> > >> > > error in
>> >> > >> > > > +* the blit.  For stencil blits, R8_UINT is
>required
>> >by the
>> >> > >> hardware.
>> >> > >> > > > +* (It's the only format allowed in conjunction
>with
>> >W-tiling.)
>> >> > >> > > Also we
>> >> > >> > > > +* intentionally use the 4-channel formats
>whenever we
>> >can.
>> >> > >> This is
>> >> > >> > > so
>> >> > >> > > > +* that, when we do a RGB <-> RGBX copy, the
>two
>> >formats will
>> >> > >> line
>> >> > >> > > up even
>> >> > >> > > > +* though one of them is 3/4 the size of the
>other.
>> >The choice
>> >> > >> of
>> >> > >> > > UNORM
>> >> > >> > > > +* vs. UINT is also very intentional because
>Haswell
>> >doesn't
>> >> > >> handle
>> >> > >> > > 8 or
>> >> > >> > > > +* 16-bit RGB UINT formats at all so we have to
>use
>> >UNORM there.
>> >

Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-08 Thread Jason Ekstrand

On Sep 8, 2016 10:47 PM, "Pohjolainen, Topi" 
wrote:
>
> On Thu, Sep 08, 2016 at 10:58:09AM -0700, Jason Ekstrand wrote:
> >On Wed, Sep 7, 2016 at 1:16 PM, Jason Ekstrand
> ><[1]ja...@jlekstrand.net> wrote:
> >
> >On Sep 7, 2016 10:45 AM, "Nanley Chery" <[2]nanleych...@gmail.com>
> >wrote:
> >>
> >> On Wed, Sep 07, 2016 at 10:26:25AM -0700, Jason Ekstrand wrote:
> >> > On Wed, Sep 7, 2016 at 9:50 AM, Jason Ekstrand
> ><[3]ja...@jlekstrand.net> wrote:
> >> >
> >> > > On Wed, Sep 7, 2016 at 9:36 AM, Nanley Chery
> ><[4]nanleych...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> On Tue, Sep 06, 2016 at 05:02:55PM -0700, Jason Ekstrand
wrote:
> >> > >> > On Tue, Sep 6, 2016 at 4:12 PM, Nanley Chery
> ><[5]nanleych...@gmail.com>
> >> > >> wrote:
> >> > >> >
> >> > >> > > On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason Ekstrand
> >wrote:
> >> > >> > > > ---
> >> > >> > > >  src/intel/blorp/blorp.h  |  10 
> >> > >> > > >  src/intel/blorp/blorp_blit.c | 133
> >++
> >> > >> > > +
> >> > >> > > >  2 files changed, 143 insertions(+)
> >> > >> > > >
> >> > >> > > > diff --git a/src/intel/blorp/blorp.h
> >b/src/intel/blorp/blorp.h
> >> > >> > > > index c1e93fd..6574124 100644
> >> > >> > > > --- a/src/intel/blorp/blorp.h
> >> > >> > > > +++ b/src/intel/blorp/blorp.h
> >> > >> > > > @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch
*batch,
> >> > >> > > > uint32_t filter, bool mirror_x, bool
> >mirror_y);
> >> > >> > > >
> >> > >> > > >  void
> >> > >> > > > +blorp_copy(struct blorp_batch *batch,
> >> > >> > > > +   const struct blorp_surf *src_surf,
> >> > >> > > > +   unsigned src_level, unsigned src_layer,
> >> > >> > > > +   const struct blorp_surf *dst_surf,
> >> > >> > > > +   unsigned dst_level, unsigned dst_layer,
> >> > >> > > > +   uint32_t src_x, uint32_t src_y,
> >> > >> > > > +   uint32_t dst_x, uint32_t dst_y,
> >> > >> > > > +   uint32_t src_width, uint32_t src_height);
> >> > >> > > > +
> >> > >> > > > +void
> >> > >> > > >  blorp_fast_clear(struct blorp_batch *batch,
> >> > >> > > >   const struct blorp_surf *surf,
> >> > >> > > >   uint32_t level, uint32_t layer, enum
> >isl_format
> >> > >> format,
> >> > >> > > > diff --git a/src/intel/blorp/blorp_blit.c
> >> > >> b/src/intel/blorp/blorp_blit.c
> >> > >> > > > index 3ab39a3..42a502c 100644
> >> > >> > > > --- a/src/intel/blorp/blorp_blit.c
> >> > >> > > > +++ b/src/intel/blorp/blorp_blit.c
> >> > >> > > > @@ -1685,3 +1685,136 @@ blorp_blit(struct blorp_batch
> >*batch,
> >> > >> > > >   dst_x0, dst_y0, dst_x1, dst_y1,
> >> > >> > > >   mirror_x, mirror_y);
> >> > >> > > >  }
> >> > >> > > > +
> >> > >> > > > +static enum isl_format
> >> > >> > > > +get_copy_format_for_bpb(unsigned bpb)
> >> > >> > > > +{
> >> > >> > > > +   /* The choice of UNORM and UINT formats is very
> >intentional
> >> > >> here.
> >> > >> > > Most of
> >> > >> > > > +* the time, we want to use a UINT format to avoid
any
> >rounding
> >> > >> > > error in
> >> > >> > > > +* the blit.  For stencil blits, R8_UINT is required
> >by the
> >> > >> hardware.
> >> > >> > > > +* (It's the only format allowed in conjunction with
> >W-tiling.)
> >> > >> > > Also we
> >> > >> > > > +* intentionally use the 4-channel formats whenever
we
> >can.
> >> > >> This is
> >> > >> > > so
> >> > >> > > > +* that, when we do a RGB <-> RGBX copy, the two
> >formats will
> >> > >> line
> >> > >> > > up even
> >> > >> > > > +* though one of them is 3/4 the size of the other.
> >The choice
> >> > >> of
> >> > >> > > UNORM
> >> > >> > > > +* vs. UINT is also very intentional because Haswell
> >doesn't
> >> > >> handle
> >> > >> > > 8 or
> >> > >> > > > +* 16-bit RGB UINT formats at all so we have to use
> >UNORM there.
> >> > >> > > > +* Fortunately, the only time we should ever use two
> >different
> >> > >> > > formats in
> >> > >> > > > +* the table below is for RGB -> RGBA blits and so
we
> >will never
> >> > >> > > have any
> >> > >> > > > +* UNORM/UINT mismatch.
> >> > >> > > > +*/
> >> > >> > > > +   switch (bpb) {
> >> > >> > > > +   case 8:  return ISL_FORMAT_R8_UINT;
> >> > >> > > > +   case 16: return ISL_FORMAT_R8G8_UINT;
> >> > >> > > > +   case 24: return ISL_FORMAT_R8G8B8_UNORM;
> >> > >> > > > +   case 32: return ISL_FORMAT_R8G8B8A8_UNORM;
> >> > >> > > > +   case 48: return ISL_FORMAT_R16G16B16_UNORM;
> >> > >> > > > +   case

Re: [Mesa-dev] [PATCH] dir-locals.el: show-trailing-whitespace and whitespace support

2016-09-08 Thread Andres Gomez

On Thu, 2016-09-08 at 12:04 -0400, Ilia Mirkin wrote:
> On Thu, Sep 8, 2016 at 12:01 PM, Andres Gomez  wrote:

> > It will highlight malformed indentation
> 
> Malformed meaning what?

Usage of tabs, instead of spaces.

> > and change color of the
> > characters exceeding the line length limit (>78 chars).
> 
> Ah. I don't think that'll go over too well. I know I'd hate it at
> least. My window width is 80 chars anyways, so that's a much more
> obvious indicator of when to stop typing. Some lines do go over 80
> though and are fine. No need to assault someone reading code...

The change of color is not so annoying, IMHO, but if this is going to
cause trouble to other developers, I suppose, I will drop it.

-- 

Br,

Andres

Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-08 Thread Pohjolainen, Topi

On Thu, Sep 08, 2016 at 10:58:09AM -0700, Jason Ekstrand wrote:
>On Wed, Sep 7, 2016 at 1:16 PM, Jason Ekstrand
><[1]ja...@jlekstrand.net> wrote:
> 
>On Sep 7, 2016 10:45 AM, "Nanley Chery" <[2]nanleych...@gmail.com>
>wrote:
>>
>> On Wed, Sep 07, 2016 at 10:26:25AM -0700, Jason Ekstrand wrote:
>> > On Wed, Sep 7, 2016 at 9:50 AM, Jason Ekstrand
><[3]ja...@jlekstrand.net> wrote:
>> >
>> > > On Wed, Sep 7, 2016 at 9:36 AM, Nanley Chery
><[4]nanleych...@gmail.com>
>> > > wrote:
>> > >
>> > >> On Tue, Sep 06, 2016 at 05:02:55PM -0700, Jason Ekstrand wrote:
>> > >> > On Tue, Sep 6, 2016 at 4:12 PM, Nanley Chery
><[5]nanleych...@gmail.com>
>> > >> wrote:
>> > >> >
>> > >> > > On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason Ekstrand
>wrote:
>> > >> > > > ---
>> > >> > > >  src/intel/blorp/blorp.h  |  10 
>> > >> > > >  src/intel/blorp/blorp_blit.c | 133
>++
>> > >> > > +
>> > >> > > >  2 files changed, 143 insertions(+)
>> > >> > > >
>> > >> > > > diff --git a/src/intel/blorp/blorp.h
>b/src/intel/blorp/blorp.h
>> > >> > > > index c1e93fd..6574124 100644
>> > >> > > > --- a/src/intel/blorp/blorp.h
>> > >> > > > +++ b/src/intel/blorp/blorp.h
>> > >> > > > @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch *batch,
>> > >> > > > uint32_t filter, bool mirror_x, bool
>mirror_y);
>> > >> > > >
>> > >> > > >  void
>> > >> > > > +blorp_copy(struct blorp_batch *batch,
>> > >> > > > +   const struct blorp_surf *src_surf,
>> > >> > > > +   unsigned src_level, unsigned src_layer,
>> > >> > > > +   const struct blorp_surf *dst_surf,
>> > >> > > > +   unsigned dst_level, unsigned dst_layer,
>> > >> > > > +   uint32_t src_x, uint32_t src_y,
>> > >> > > > +   uint32_t dst_x, uint32_t dst_y,
>> > >> > > > +   uint32_t src_width, uint32_t src_height);
>> > >> > > > +
>> > >> > > > +void
>> > >> > > >  blorp_fast_clear(struct blorp_batch *batch,
>> > >> > > >   const struct blorp_surf *surf,
>> > >> > > >   uint32_t level, uint32_t layer, enum
>isl_format
>> > >> format,
>> > >> > > > diff --git a/src/intel/blorp/blorp_blit.c
>> > >> b/src/intel/blorp/blorp_blit.c
>> > >> > > > index 3ab39a3..42a502c 100644
>> > >> > > > --- a/src/intel/blorp/blorp_blit.c
>> > >> > > > +++ b/src/intel/blorp/blorp_blit.c
>> > >> > > > @@ -1685,3 +1685,136 @@ blorp_blit(struct blorp_batch
>*batch,
>> > >> > > >   dst_x0, dst_y0, dst_x1, dst_y1,
>> > >> > > >   mirror_x, mirror_y);
>> > >> > > >  }
>> > >> > > > +
>> > >> > > > +static enum isl_format
>> > >> > > > +get_copy_format_for_bpb(unsigned bpb)
>> > >> > > > +{
>> > >> > > > +   /* The choice of UNORM and UINT formats is very
>intentional
>> > >> here.
>> > >> > > Most of
>> > >> > > > +* the time, we want to use a UINT format to avoid any
>rounding
>> > >> > > error in
>> > >> > > > +* the blit.  For stencil blits, R8_UINT is required
>by the
>> > >> hardware.
>> > >> > > > +* (It's the only format allowed in conjunction with
>W-tiling.)
>> > >> > > Also we
>> > >> > > > +* intentionally use the 4-channel formats whenever we
>can.
>> > >> This is
>> > >> > > so
>> > >> > > > +* that, when we do a RGB <-> RGBX copy, the two
>formats will
>> > >> line
>> > >> > > up even
>> > >> > > > +* though one of them is 3/4 the size of the other.
>The choice
>> > >> of
>> > >> > > UNORM
>> > >> > > > +* vs. UINT is also very intentional because Haswell
>doesn't
>> > >> handle
>> > >> > > 8 or
>> > >> > > > +* 16-bit RGB UINT formats at all so we have to use
>UNORM there.
>> > >> > > > +* Fortunately, the only time we should ever use two
>different
>> > >> > > formats in
>> > >> > > > +* the table below is for RGB -> RGBA blits and so we
>will never
>> > >> > > have any
>> > >> > > > +* UNORM/UINT mismatch.
>> > >> > > > +*/
>> > >> > > > +   switch (bpb) {
>> > >> > > > +   case 8:  return ISL_FORMAT_R8_UINT;
>> > >> > > > +   case 16: return ISL_FORMAT_R8G8_UINT;
>> > >> > > > +   case 24: return ISL_FORMAT_R8G8B8_UNORM;
>> > >> > > > +   case 32: return ISL_FORMAT_R8G8B8A8_UNORM;
>> > >> > > > +   case 48: return ISL_FORMAT_R16G16B16_UNORM;
>> > >> > > > +   case 64: return ISL_FORMAT_R16G16B16A16_UNORM;
>> > >> > > > +   case 96: return ISL_FORMAT_R32G32B32_UINT;
>> > >> > > > +   case 128:return ISL_FORMAT_R32G32B32A32_UINT;
>> > >> > > > +   default:
>> > >> > > > +  unreachable("Unknown format bpb");
>> > >> > >

Re: [Mesa-dev] [PATCH] glsl: Make blend_colordodge compare against 1.0 - FLT_EPSILON.

2016-09-08 Thread Francisco Jerez

Alejandro Piñeiro  writes:

> On 02/09/16 23:13, Kenneth Graunke wrote:
>> On Friday, August 26, 2016 10:49:18 PM PDT Kenneth Graunke wrote:
>>> This fixes a numerical precision issue that was causing two CTS
>>> failures:
>>>
>>> ES31-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR
>>> ES31-CTS.blend_equation_advanced.blend_all.GL_COLORBURN_KHR_all_qualifier
>
> FWIW: it fixes the equivalent GL44 tests.
>
>>>
>>> When blending with GL_COLORDODGE_KHR and these colors:
>> 
>> This should be GL_COLORBURN_KHR and the subject should be
>> blend_colorburn.  Fixed locally.
>
> Gentle reminder to use the fixed version.
>
>>>
>>>dst = <0.372549027, 0.372549027, 0.372549027, 0.372549027>
>>>src = <0.09375, 0.046875, 0.0, 0.375>
>>>
>>> the normalized dst value became 0.9994 (due to imprecisions in the
>>> alpha scaling, presumably), which failed the dst >= 1.0 comparison.
>>> The blue channel would then fall through to the dst < 1.0 && src >= 0
>>> comparison, which was true, since src.b == 0.  This produced a factor
>>> of 0.0 instead of 1.0.
>>>
>>> To work around this, compare with 1.0 - FLT_EPSILON.
>
> Makes sense:
> Reviewed-by: Alejandro Piñeiro 
>
> Having said so, Kenneth mentioned on IRC that Francisco has some doubts
> on this patch. CCing him just in case he wants to mention something.
>
Heh, right, my concern was that this smells strongly like a test relying
on not terribly well-defined behavior...  AFAICT the problem addressed
here is ultimately caused by the discontinuity that the COLORBURN
blending equation has at the point Cd = 1, Cs = 0, and the test authors
had the awesome idea [not necessarily being sarcastic here ;)] of
testing the blending function at precisely that point, even though the
function is guaranteed to be numerically unstable and vary wildly given
the slightest rounding error.

Does the extension impose any requirements on the precision of the
division by alpha operation done on pre-multiplied color components?
The test case may be valid assuming that IEEE precision rules apply, but
AFAIK GLSL has considerably looser requirements on the division
operation, and the KHR_blend_equation_advanced lowering code is
implemented in terms of GLSL division so the result could potentially be
farther off than 1 - epsilon (though AFAICT this change would be correct
assuming the result of GLSL division is guaranteed to be within ~1.5 ULP
of the exact value, which I don't think is the case).
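
To make the numerical concern concrete, here is a minimal scalar sketch in C of
the COLORBURN selection logic quoted in the patch comment. It is not the GLSL
lowering pass: colorburn(), its "relaxed" flag and steps_below_one() are
hypothetical names, and the inputs are picked by hand to sit a few float steps
below 1.0.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical scalar model of the COLORBURN factor:
 *   1                        if Cd >= threshold
 *   0                        if Cd <  threshold and Cs <= 0
 *   1 - min(1, (1 - Cd)/Cs)  otherwise
 * "relaxed" selects the 1.0 - FLT_EPSILON threshold from the patch,
 * otherwise the original threshold of exactly 1.0 is used.
 */
static float
colorburn(float src, float dst, bool relaxed)
{
   const float threshold = relaxed ? 1.0f - FLT_EPSILON : 1.0f;

   assert(src >= 0.0f && dst >= 0.0f);

   if (dst >= threshold)
      return 1.0f;
   if (src <= 0.0f)
      return 0.0f;

   const float ratio = (1.0f - dst) / src;
   return 1.0f - (ratio < 1.0f ? ratio : 1.0f);
}

/* Floats just below 1.0 are spaced FLT_EPSILON / 2 apart, so this
 * returns the value that many representable steps below 1.0.
 */
static float
steps_below_one(int steps)
{
   return 1.0f - (float)steps * (FLT_EPSILON / 2.0f);
}
```

With src == 0, colorburn(0, steps_below_one(1), false) gives 0 while the
relaxed variant gives 1, which is the behaviour the patch is after. A dst
three steps below 1.0 still gives 0 with the relaxed threshold, so a division
that is less precise than assumed would hit the same discontinuity again.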

>>>
>>> Signed-off-by: Kenneth Graunke 
>>> ---
>>>  src/compiler/glsl/lower_blend_equation_advanced.cpp | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp 
>>> b/src/compiler/glsl/lower_blend_equation_advanced.cpp
>>> index a998df1..f8b0261 100644
>>> --- a/src/compiler/glsl/lower_blend_equation_advanced.cpp
>>> +++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp
>>> @@ -28,6 +28,7 @@
>>>  #include "program/prog_instruction.h"
>>>  #include "program/prog_statevars.h"
>>>  #include "util/bitscan.h"
>>> +#include 
>>>  
>>>  using namespace ir_builder;
>>>  
>>> @@ -101,7 +102,7 @@ blend_colorburn(ir_variable *src, ir_variable *dst)
>>>  *   1 - min(1,(1-Cd)/Cs), if Cd < 1 and Cs > 0
>>>  *   0, if Cd < 1 and Cs <= 0
>>>  */
>>> -   return csel(gequal(dst, imm3(1)), imm3(1),
>>> +   return csel(gequal(dst, imm3(1 - FLT_EPSILON)), imm3(1),
>>> csel(lequal(src, imm3(0)), imm3(0),
>>>  sub(imm3(1), min2(imm3(1), div(sub(imm3(1), dst), 
>>> src)))));
>>>  }
>>>
>> 
>> 
>> 
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> 
>
> -- 
> Alejandro Piñeiro 



[Mesa-dev] [PATCH] intel/aubinator: Properly handle batch buffer chaining

2016-09-08 Thread Jason Ekstrand

The original aubinator that Kristian wrote had a bug in the handling of
MI_BATCH_BUFFER_START that propagated into the version in upstream mesa.
Say you have two batch buffers A and B where A calls MI_BATCH_BUFFER_START
to jump to B.  Now suppose that A and B are placed consecutively in the
address space with A before B.  What can happen is that aubinator will
process A, and start processing B when it should.  When it gets done with
B, it returns and continues to process A.  Because A doesn't have any more
data after the MI_BATCH_BUFFER_START, it will just process a bunch of NOPs
until it gets to the next buffer in memory which is B again.  In this
scenario B gets processed twice which can be very confusing.  If you place
things in memory just right, you can also end up with infinite loops which
are all sorts of fun.

The root problem here is that it continues to process commands even after
an MI_BATCH_BUFFER_START.  By simply checking the 2nd level we can detect
whether or not the command buffer we are jumping to will return here and
stop processing commands if it won't.

Signed-off-by: Jason Ekstrand 
---
 src/intel/tools/aubinator.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index fe1f369..73a7f21 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -766,6 +766,13 @@ parse_commands(struct gen_spec *spec, uint32_t *cmds, int 
size, int engine)
 start = p[1];
 
  parse_commands(spec, gtt + start, 1 << 20, engine);
+
+ /* MI_BATCH_BUFFER_START with "2nd Level Batch Buffer" unset acts
+  * like a goto.  No commands after such a MI_BATCH_BUFFER_START will
+  * get processed so we should bail as well.
+  */
+ if ((p[0] & (1 << 22)) == 0)
+break;
   } else if ((p[0] & 0x) == AUB_MI_BATCH_BUFFER_END) {
  break;
   }
-- 
2.5.0.400.gff86faf
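
The scenario in the commit message can be reproduced with a toy command
walker. This is only a sketch under assumed encodings: the CMD_* opcodes, the
two-dword jump layout and walk() are hypothetical stand-ins, not aubinator's
real parser. Only the use of bit 22 as the "2nd Level Batch Buffer" flag
follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, stored in the low byte of the header dword. */
enum { CMD_NOP = 0, CMD_DRAW = 1, CMD_BB_START = 2, CMD_BB_END = 3 };

#define OPCODE(dw)    ((dw) & 0xffu)
#define SECOND_LEVEL  (1u << 22)   /* "2nd Level Batch Buffer" bit */

/* Walk commands from gtt[start] and return how many CMD_DRAW commands
 * were processed.  A CMD_BB_START takes two dwords: the header and the
 * index of the target buffer.  When bail_on_goto is set, a jump with
 * the second-level bit clear ends processing of the current buffer.
 */
static int
walk(const uint32_t *gtt, size_t size, size_t start, bool bail_on_goto)
{
   int draws = 0;

   for (size_t i = start; i < size;) {
      const uint32_t dw = gtt[i];

      switch (OPCODE(dw)) {
      case CMD_BB_START:
         assert(i + 1 < size);
         draws += walk(gtt, size, gtt[i + 1], bail_on_goto);
         /* The parentheses around the mask matter: == binds tighter
          * than &, so "dw & SECOND_LEVEL == 0" would test dw & 0.
          */
         if (bail_on_goto && (dw & SECOND_LEVEL) == 0)
            return draws;
         i += 2;
         break;
      case CMD_BB_END:
         return draws;
      case CMD_DRAW:
         draws++;
         i++;
         break;
      default:
         i++;
         break;
      }
   }

   return draws;
}
```

With buffer A = { CMD_BB_START, 4, CMD_NOP, CMD_NOP } placed directly before
buffer B = { CMD_DRAW, CMD_BB_END }, walking from index 0 without the bail
counts the draw in B twice, and with the bail counts it once. A jump that has
SECOND_LEVEL set still returns to the caller and keeps processing.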


Re: [Mesa-dev] [PATCH] r300g: Set R300_VAP_CNTL on RSxxx to avoid triangle flickering

2016-09-08 Thread Dave Airlie

On 8 September 2016 at 09:53, Max Staudt  wrote:
> On the RSxxx chip series, HW TCL is missing and r300_emit_vs_state()
> is never called.
>
> However, if R300_VAP_CNTL is never set, the hardware (at least the
> RS690 I tested this on) comes up with rendering artifacts, and
> parts that are uploaded before this "fix" remain broken in VRAM.
> This causes artifacts as in fdo#69076 ("triangle flickering").
>
> It seems like this setup needs to happen at least once after power on
> for 3D rendering to work properly. In the DDX with EXA, this happens in
> RADEON_SWITCH_TO_3D() when processing an XRENDER Composite or an
> Xv request. So playing back a video or starting a GTK+2 application
> fixes 3D rendering for the rest of the session. However, this auto-fix
> doesn't happen when EXA is not used, such as with GLAMOR or Wayland.
>
> This patch ensures the register is configured even in absence of
> the DDX's EXA module.
>
> The register setting is taken from:
>   xf86-video-ati  --  RADEONInit3DEngineInternal()
>   mesa/src/mesa/drivers/dri/r300  --  r300EmitClearState()
>
> Tested on RS690.
>
> Signed-off-by: Max Staudt 

I've applied and pushed this.

Thanks,
Dave.

Re: [Mesa-dev] [PATCH 2/3] egl: return corresponding offset of EGLImage instead of 0.

2016-09-08 Thread Weng, Chuanbo

Hi Axel and Emil,
Thanks Axel for your review. Please see my comments below.

-Original Message-
From: Axel Davy [mailto:axel.d...@ens.fr] 
Sent: Friday, September 9, 2016 1:25 AM
To: Weng, Chuanbo ; mesa-dev@lists.freedesktop.org; 
emil.l.veli...@gmail.com
Subject: Re: [Mesa-dev] [PATCH 2/3] egl: return corresponding offset of 
EGLImage instead of 0.

Hi,

That doesn't seem good to me.

With that patch, that means that since no one is implementing

__DRI_IMAGE_ATTRIB_OFFSET

(yes I know in a later patch you implement it for i965),

then what used to work will stop working (as the queryImage will return false).

You need to introduce some interface version implementation check.
[Chuanbo] Maybe I can add more explanation to the git log (such as "This patch 
just implements the egl loader side; the driver-side implementation is also 
needed for the corresponding platform"), so users are aware of this.
Introducing an interface version check will make the mesa code more complex, 
because we would also have to add the related check to other dri2 
functions (dri2_).
Another solution is to combine the three patches into one, as I did before:
https://lists.freedesktop.org/archives/mesa-dev/2016-August/126945.html
That is not as easy for reviewers as this version, but it is clearer for users.
Emil, what do you think?

(another nitpick: queryImage returns true or false, not EGL_TRUE/EGL_FALSE. It's 
better to convert than to assume they're the same)
[Chuanbo] Agree.

Yours,

Axel Davy

On 06/09/2016 11:06, Chuanbo Weng wrote:
> The offset should not always be 0. For example, if EGLImage is created 
> from a 2D texture with EGL_GL_TEXTURE_LEVEL=1, then the offset should 
> be the actual start of miplevel 1 in bo.
>
> Signed-off-by: Chuanbo Weng 
> ---
>   src/egl/drivers/dri2/egl_dri2.c | 12 +---
>   1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c 
> b/src/egl/drivers/dri2/egl_dri2.c index 859612f..8ef0acd 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -2249,6 +2249,8 @@ dri2_export_dma_buf_image_mesa(_EGLDriver *drv, 
> _EGLDisplay *disp, _EGLImage *im
>  struct dri2_egl_image *dri2_img = dri2_egl_image(img);
>   
>  (void) drv;
> +   EGLBoolean ret = EGL_TRUE;
> +   EGLint img_offset = 0;
>   
>  /* rework later to provide multiple fds/strides/offsets */
>  if (fds)
> @@ -2259,10 +2261,14 @@ dri2_export_dma_buf_image_mesa(_EGLDriver *drv, 
> _EGLDisplay *disp, _EGLImage *im
> dri2_dpy->image->queryImage(dri2_img->dri_image,
> __DRI_IMAGE_ATTRIB_STRIDE, strides);
>   
> -   if (offsets)
> -  offsets[0] = 0;
> +   if (offsets){
> +  ret = dri2_dpy->image->queryImage(dri2_img->dri_image,
> + __DRI_IMAGE_ATTRIB_OFFSET, &img_offset);
> +  if(ret == EGL_TRUE)
> +offsets[0] = img_offset;
> +   }
>   
> -   return EGL_TRUE;
> +   return ret;
>   }
>   
>   #endif
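
For reference, the two review points (gate the query on the interface version,
and convert the driver's bool to EGLBoolean explicitly) could be combined
roughly as below. This is a self-contained sketch with mock types, not the
real dri2 code: struct mock_image_ext, MOCK_ATTRIB_OFFSET, mock_query_image()
and export_offset() are hypothetical, and the version number 13 is the one
suggested elsewhere in this thread, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the EGL types; the real ones come from <EGL/egl.h>. */
typedef unsigned int EGLBoolean;
typedef int EGLint;
#define EGL_FALSE 0
#define EGL_TRUE  1

/* Hypothetical attribute token and a minimal mock of the image
 * extension: just a version number and a query hook.
 */
enum { MOCK_ATTRIB_OFFSET = 1 };

struct mock_image_ext {
   int version;
   bool (*queryImage)(void *image, int attrib, int *value);
};

/* Mock driver hook that reports a fixed offset of 4096 bytes. */
static bool
mock_query_image(void *image, int attrib, int *value)
{
   (void) image;
   if (attrib != MOCK_ATTRIB_OFFSET)
      return false;
   *value = 4096;
   return true;
}

/* Keep the old behaviour (offset 0) for drivers that predate the
 * attribute, and only query the driver when its version is new enough.
 */
static EGLBoolean
export_offset(const struct mock_image_ext *ext, void *image, EGLint *offsets)
{
   int value = 0;

   assert(ext != NULL);

   if (!offsets)
      return EGL_TRUE;

   offsets[0] = 0;
   if (ext->version < 13)
      return EGL_TRUE;

   /* Convert the driver's bool explicitly instead of returning it. */
   if (!ext->queryImage(image, MOCK_ATTRIB_OFFSET, &value))
      return EGL_FALSE;

   offsets[0] = value;
   return EGL_TRUE;
}
```

Whether a failed query on a new-enough driver should return EGL_FALSE or
silently keep offset 0 is a design choice; the sketch picks the former.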


Re: [Mesa-dev] [RFT PATCH 2/2] nv20: enable ARB_texture_border_clamp support

2016-09-08 Thread Francisco Jerez

Ilia Mirkin  writes:

> Signed-off-by: Ilia Mirkin 
> ---
>
> This was tested on a NV25-on-NV34 situation. Should be tested on real hardware
> since my test environment relies on accurate emulation in the hw.
>
Looks okay to me. As long as you get some testing coverage on real
hardware, the patch is:

Acked-by: Francisco Jerez 

>  src/mesa/drivers/dri/nouveau/nv20_context.c   |  1 +
>  src/mesa/drivers/dri/nouveau/nv20_state_tex.c | 29 
> ++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/nouveau/nv20_context.c 
> b/src/mesa/drivers/dri/nouveau/nv20_context.c
> index ec638c0..6940b4d 100644
> --- a/src/mesa/drivers/dri/nouveau/nv20_context.c
> +++ b/src/mesa/drivers/dri/nouveau/nv20_context.c
> @@ -456,6 +456,7 @@ nv20_context_create(struct nouveau_screen *screen, gl_api 
> api,
>   if (!nouveau_context_init(ctx, api, screen, visual, share_ctx))
>   goto fail;
>  
> + ctx->Extensions.ARB_texture_border_clamp = true;
>   ctx->Extensions.ARB_texture_env_crossbar = true;
>   ctx->Extensions.ARB_texture_env_combine = true;
>   ctx->Extensions.ARB_texture_env_dot3 = true;
> diff --git a/src/mesa/drivers/dri/nouveau/nv20_state_tex.c 
> b/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
> index b0a4c9f..ef1799a 100644
> --- a/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
> +++ b/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
> @@ -165,7 +165,8 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
>   struct nouveau_surface *s;
>   struct gl_texture_image *ti;
>   const struct gl_sampler_object *sa;
> - uint32_t tx_format, tx_filter, tx_wrap, tx_enable;
> + uint8_t r, g, b, a;
> + uint32_t tx_format, tx_filter, tx_wrap, tx_bcolor, tx_enable;
>  
>   PUSH_RESET(push, BUFCTX_TEX(i));
>  
> @@ -201,6 +202,29 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
>   | nvgl_filter_mode(sa->MinFilter) << 16
>   | 2 << 12;
>  
> + r = FLOAT_TO_UBYTE(sa->BorderColor.f[0]);
> + g = FLOAT_TO_UBYTE(sa->BorderColor.f[1]);
> + b = FLOAT_TO_UBYTE(sa->BorderColor.f[2]);
> + a = FLOAT_TO_UBYTE(sa->BorderColor.f[3]);
> + switch (ti->_BaseFormat) {
> + case GL_LUMINANCE:
> + a = 0xff;
> + /* fallthrough */
> + case GL_LUMINANCE_ALPHA:
> + g = b = r;
> + break;
> + case GL_RGB:
> + a = 0xff;
> + break;
> + case GL_INTENSITY:
> + g = b = a = r;
> + break;
> + case GL_ALPHA:
> + r = g = b = 0;
> + break;
> + }
> + tx_bcolor = b << 0 | g << 8 | r << 16 | a << 24;
> +
>   tx_enable = NV20_3D_TEX_ENABLE_ENABLE
>   | log2i(sa->MaxAnisotropy) << 4;
>  
> @@ -249,6 +273,9 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
>   BEGIN_NV04(push, NV20_3D(TEX_FILTER(i)), 1);
>   PUSH_DATA (push, tx_filter);
>  
> + BEGIN_NV04(push, NV20_3D(TEX_BORDER_COLOR(i)), 1);
> + PUSH_DATA (push, tx_bcolor);
> +
>   BEGIN_NV04(push, NV20_3D(TEX_ENABLE(i)), 1);
>   PUSH_DATA (push, tx_enable);
>  
> -- 
> 2.7.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
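
For anyone testing this on real hardware, the expected border color values can
be computed by hand with a small stand-alone model of the packing logic in the
patch. This is a sketch only: enum base_format, float_to_ubyte() and
pack_border_color() are hypothetical names, and float_to_ubyte() is a
simplified stand-in for Mesa's FLOAT_TO_UBYTE that may round differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL base formats handled in the patch. */
enum base_format {
   FMT_RGBA,
   FMT_RGB,
   FMT_LUMINANCE,
   FMT_LUMINANCE_ALPHA,
   FMT_INTENSITY,
   FMT_ALPHA,
};

/* Simplified float -> ubyte conversion: clamp to [0, 1], then round. */
static uint8_t
float_to_ubyte(float f)
{
   if (!(f > 0.0f))   /* also catches NaN */
      return 0;
   if (f >= 1.0f)
      return 255;
   return (uint8_t)(f * 255.0f + 0.5f);
}

/* Pack a border color as B | G << 8 | R << 16 | A << 24 after applying
 * the per-format channel replication used in the patch.
 */
static uint32_t
pack_border_color(enum base_format fmt, const float color[4])
{
   assert(color);

   uint8_t r = float_to_ubyte(color[0]);
   uint8_t g = float_to_ubyte(color[1]);
   uint8_t b = float_to_ubyte(color[2]);
   uint8_t a = float_to_ubyte(color[3]);

   switch (fmt) {
   case FMT_LUMINANCE:
      a = 0xff;
      /* fallthrough */
   case FMT_LUMINANCE_ALPHA:
      g = b = r;
      break;
   case FMT_RGB:
      a = 0xff;
      break;
   case FMT_INTENSITY:
      g = b = a = r;
      break;
   case FMT_ALPHA:
      r = g = b = 0;
      break;
   default:
      break;
   }

   /* Widen before shifting so that a << 24 never shifts into the sign
    * bit of a promoted int.
    */
   return (uint32_t)b | (uint32_t)g << 8 | (uint32_t)r << 16 |
          (uint32_t)a << 24;
}
```

For a border color of (1.0, 0.5, 0.0, 0.25) this gives 0x40ff8000 for an RGBA
texture, 0xffff8000 for RGB and 0x40000000 for ALPHA.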



Re: [Mesa-dev] [PATCH 00/57] i965/ir: Switch representation of register offsets and sizes to byte units.

2016-09-08 Thread Francisco Jerez

Iago Toral  writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
>> This series reworks the representation of register region offsets in
>> the i965 IR to be universally byte-based instead of the rather
>> awkward
>> split between reg_offset and subreg_offset we have in the FS back-end
>> right now, or the reg_offset field currently used in the VEC4 IR
>> which
>> doesn't allow better granularity than 32B.  The most immediate
>> motivation is to enable sub-GRF offsets in the VEC4 back-end, which
>> will be useful for various kinds of lowering and instruction
>> splitting
>> required for FP64 support on VEC4 platforms.
>
> Thanks a lot for taking care of this!
>
>> Patches 01-11 take care of scaling the regs_written and regs_read
>> instruction methods on both back-ends and the reg_offset register
>> field of VEC4 IR registers.  The fs_reg::reg_offset and
>> ::subreg_offset fields are also unified into a single register field.
>> Because this part of the series is rather bulky I've tried to keep
>> the
>> changes as obvious and functionally equivalent as possible at the
>> cost
>> of introducing not particularly clever code in some cases that could
>> be simplified with some knowledge of the context.  Patches 31-46 make
>> a second pass through the code touched in the first part of the
>> series
>> in order to get rid of an amount of cruft.
>
> I have reviewed this part and intend to continue reviewing the rest in
> the following days. Might be a good idea to have another set of eyes
> reviewing the series or at least skimming through it just in case.
>
> You might want to look at least at the second comment I made to patch 2
> and the comment to patch 7, since these might be actual problems. All
> other comments are minor things or small clarifications and you can
> ignore them if you want.
>
> With the issues I point out in patches 2 and 7 addressed (or
> confirmation from your side that these are not real problems), patches
> 1-11 are:
>
> Reviewed-by: Iago Toral Quiroga 
>
Thanks!

>> Patches 12-30 address an amount of bugs that became obvious during
>> the
>> conversion to byte units, some of them seem worrying enough that it
>> might make sense to back-port them to stable releases.
>> 
>> Patches 47-57 go through the VEC4 back-end and address a number of
>> issues that would arise in existing optimization passes with
>> non-GRF-aligned regions, which will be useful for FP64 support.  It's
>> likely not complete and the handling of sub-GRF offsets doesn't
>> attempt to be nearly as clever as the FS back-end, but they should
>> make a substantial improvement over the current situation.
>> 
>> [PATCH 01/57] i965/fs: Replace fs_reg::reg_offset with fs_reg::offset
>> expressed in bytes.
>> [PATCH 02/57] i965/vec4: Replace dst/src_reg::reg_offset with
>> dst/src_reg::offset expressed in bytes.
>> [PATCH 03/57] i965/ir: Remove backend_reg::reg_offset.
>> [PATCH 04/57] i965/fs: Replace fs_reg::subreg_offset with
>> fs_reg::offset expressed in bytes.
>> [PATCH 05/57] i965/fs: Add wrapper functions for fs_inst::regs_read
>> and ::regs_written.
>> [PATCH 06/57] i965/vec4: Add wrapper functions for
>> vec4_instruction::regs_read and ::regs_written.
>> [PATCH 07/57] i965/fs: Replace fs_inst::regs_written with
>> ::size_written field in bytes.
>> [PATCH 08/57] i965/vec4: Replace vec4_instruction::regs_written with
>> ::size_written field in bytes.
>> [PATCH 09/57] i965/ir: Drop backend_instruction::regs_written field.
>> [PATCH 10/57] i965/fs: Replace fs_inst::regs_read with ::size_read
>> using byte units.
>> [PATCH 11/57] i965/vec4: Replace vec4_instruction::regs_read with
>> ::size_read using byte units.
>> [PATCH 12/57] i965/fs: Return more accurate read size from
>> fs_inst::size_read for IMM and UNIFORM files.
>> [PATCH 13/57] i965/fs: Return more accurate read size for LINTERP
>> from fs_inst::size_read.
>> [PATCH 14/57] i965/fs: Handle arbitrary offsets in
>> brw_reg_from_fs_reg for MRF/VGRF registers.
>> [PATCH 15/57] i965/fs: Handle fixed HW GRF subnr in reg_offset().
>> [PATCH 16/57] i965/fs: Take into account trailing padding in
>> regs_written() and regs_read().
>> [PATCH 17/57] i965/fs: Take into account misalignment in
>> regs_written() and regs_read().
>> [PATCH 18/57] i965/vec4: Take into account misalignment in
>> regs_written() and regs_read().
>> [PATCH 19/57] i965/fs: Don't consider LOAD_PAYLOAD with sub-GRF
>> offset to behave like a raw copy.
>> [PATCH 20/57] i965/fs: Don't consider LOAD_PAYLOAD with stride > 1
>> source to behave like a raw copy.
>> [PATCH 21/57] i965/fs: Compare full register offsets in cmod
>> propagation pass.
>> [PATCH 22/57] i965/fs: Fix can_propagate_from() source/destination
>> overlap check.
>> [PATCH 23/57] i965/fs: Fix LOAD_PAYLOAD handling in register coalesce
>> is_nop_mov().
>> [PATCH 24/57] i965/fs: Drop fs_inst::overwrites_reg() in favor of
>> regions_overlap().
>> [PATCH 25/57] i965/fs: Stop using fs_reg::in_range() in favor of
>> regions_over

Re: [Mesa-dev] [PATCH 0/5] * Aubinator code simplification *

2016-09-08 Thread Kenneth Graunke

On Thursday, September 8, 2016 4:15:17 PM PDT Sirisha Gandikota wrote:
> From: Sirisha Gandikota 
> 
> This patch set simplifies parts of code in the aubinator tool
> as per review comments from Ken (Wed Aug 24 04:51:47 UTC 2016)
> 
> Sirisha Gandikota (5):
>   aubinator: Fix compiler warning
>   aubinator: Simplify gen_disasm_create()'s devinfo handling
>   aubinator: Simplify print_dword_val() method
>   aubinator: Make gen_disasm_disassemble handle split sends
>   aubinator: Remove bogus "end" parameter in gen_disasm_disassemble()
> 
>  src/intel/tools/aubinator.c  | 20 
>  src/intel/tools/decoder.h|  2 +-
>  src/intel/tools/disasm.c | 38 ++
>  src/intel/tools/gen_disasm.h |  2 +-
>  4 files changed, 32 insertions(+), 30 deletions(-)
> 
> 

Patches 1, 2, and 5 are:
Reviewed-by: Kenneth Graunke 



Re: [Mesa-dev] [PATCH 3/5] aubinator: Simplify print_dword_val() method

2016-09-08 Thread Kenneth Graunke

On Thursday, September 8, 2016 4:15:20 PM PDT Sirisha Gandikota wrote:
> From: Sirisha Gandikota 
> 
> Remove the float/dword union and use the iter->p[f->start / 32]
> directly as printf formatter %08x expects uint32_t (Ken)
> 
> Signed-off-by: Sirisha Gandikota 
> ---
>  src/intel/tools/aubinator.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 811f707..d147225 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -92,17 +92,13 @@ print_dword_val(struct gen_field_iterator *iter, uint64_t 
> offset,
>  int *dword_num)
>  {
> struct gen_field *f;
> -   union {
> -  uint32_t dw;
> -  float f;
> -   } v;
>  
> f = iter->group->fields[iter->i - 1];
> -   v.dw = iter->p[f->start / 32];
>  
> if (*dword_num != (f->start / 32)) {
>printf("0x%08lx:  0x%08x : Dword %d\n",
> - offset + 4 * (f->start / 32), v.dw, f->start / 32);
> + offset + 4 * (f->start / 32), iter->p[f->start / 32], f->start 
> / 
> +32);
>*dword_num = (f->start / 32);
> }
>  }
> 

Perhaps do:

   const int dword = f->start / 32;

and then use that everywhere instead of repeating f->start / 32 in four
places?  Seems a little tidier.
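
Something along these lines, as a stand-alone sketch. struct mock_field and
the reduced parameter list are hypothetical simplifications of the real
gen_field_iterator code, the return value exists only so the behaviour is easy
to check, and PRIx64 is used for the uint64_t offset since %lx is not portable
for that type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct gen_field: only the bit offset of
 * the field within its group is needed here.
 */
struct mock_field {
   int start;
};

/* Print the dword containing the field if it differs from the last one
 * printed.  Returns 1 when a line was printed, 0 otherwise.
 */
static int
print_dword_val(const uint32_t *p, const struct mock_field *f,
                uint64_t offset, int *dword_num)
{
   assert(p && f && dword_num);

   /* Computed once instead of repeating f->start / 32 everywhere. */
   const int dword = f->start / 32;

   if (*dword_num == dword)
      return 0;

   printf("0x%08" PRIx64 ":  0x%08" PRIx32 " : Dword %d\n",
          offset + 4 * dword, p[dword], dword);
   *dword_num = dword;
   return 1;
}
```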



Re: [Mesa-dev] [PATCH 11/57] i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units.

2016-09-08 Thread Francisco Jerez

Iago Toral  writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
>> The previous regs_read value can be recovered by rewriting each
>> reference of regs_read() like 'x = i.regs_read(j)' to 'x =
>> DIV_ROUND_UP(i.size_read(j), reg_unit)'.
>> 
>> For the same reason as in the previous patches, this doesn't attempt
>> to be particularly clever about simplifying the result in the
>> interest
>> of keeping the rather lengthy patch as obvious as possible.  I'll
>> come
>> back later to clean up any ugliness introduced here.
>> ---
>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  6 +++--
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 30
>> ++
>>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  2 +-
>>  3 files changed, 25 insertions(+), 13 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> index 5a79062..2fd5441 100644
>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> @@ -167,7 +167,7 @@ public:
>> unsigned sol_vertex; /**< gen6: used for setting dst index in SVB
>> header */
>>  
>> bool is_send_from_grf();
>> -   unsigned regs_read(unsigned arg) const;
>> +   unsigned size_read(unsigned arg) const;
>> bool can_reswizzle(const struct gen_device_info *devinfo, int
>> dst_writemask,
>>    int swizzle, int swizzle_mask);
>> void reswizzle(int dst_writemask, int swizzle);
>> @@ -278,7 +278,9 @@ inline unsigned
>>  regs_read(const vec4_instruction *inst, unsigned i)
>>  {
>> /* XXX - Take into account register-misaligned offsets correctly.
>> */
>> -   return inst->regs_read(i);
>> +   const unsigned reg_size =
>> +  inst->src[i].file == UNIFORM || inst->src[i].file == IMM ? 16
>> : REG_SIZE;
>> +   return DIV_ROUND_UP(inst->size_read(i), reg_size);
>>  }
>>  
>>  } /* namespace brw */
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index bdd6e59..561170c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -199,11 +199,8 @@
>> vec4_instruction::has_source_and_destination_hazard() const
>>  }
>>  
>>  unsigned
>> -vec4_instruction::regs_read(unsigned arg) const
>> +vec4_instruction::size_read(unsigned arg) const
>>  {
>> -   if (src[arg].file == BAD_FILE)
>> -  return 0;
>> -
>> switch (opcode) {
>> case SHADER_OPCODE_SHADER_TIME_ADD:
>> case SHADER_OPCODE_UNTYPED_ATOMIC:
>> @@ -213,13 +210,26 @@ vec4_instruction::regs_read(unsigned arg) const
>> case SHADER_OPCODE_TYPED_SURFACE_READ:
>> case SHADER_OPCODE_TYPED_SURFACE_WRITE:
>> case TCS_OPCODE_URB_WRITE:
>> -  return arg == 0 ? mlen : 1;
>> -
>> +  if (arg == 0)
>> + return mlen * REG_SIZE;
>> +  break;
>> case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
>> -  return arg == 1 ? mlen : 1;
>> +  if (arg == 1)
>> + return mlen * REG_SIZE;
>> +  break;
>> +   default:
>> +  break;
>> +   }
>>  
>> +   switch (src[arg].file) {
>> +   case BAD_FILE:
>> +  return 0;
>> +   case IMM:
>> +   case UNIFORM:
>> +  return 4 * type_sz(src[arg].type);
>> default:
>> -  return 1;
>> +  /* XXX - Represent actual execution size and vertical stride.
>> */
>> +  return 8 * type_sz(src[arg].type);
>> }
>>  }
>>  
>> @@ -1188,7 +1198,7 @@ vec4_visitor::opt_register_coalesce()
>>   bool interfered = false;
>>   for (int i = 0; i < 3; i++) {
>>  if (inst->src[0].in_range(scan_inst->src[i],
>> -  scan_inst->regs_read(i)))
>> +  DIV_ROUND_UP(scan_inst-
>> >size_read(i), REG_SIZE)))
>
> Why not just swap scan_inst->regs_read(i) with regs_read(scan_inst, i)
> instead?
>
>>     interfered = true;
>>   }
>>   if (interfered)
>> @@ -1214,7 +1224,7 @@ vec4_visitor::opt_register_coalesce()
>>   } else {
>>  for (int i = 0; i < 3; i++) {
>> if (inst->dst.in_range(scan_inst->src[i],
>> -  scan_inst->regs_read(i)))
>> +  DIV_ROUND_UP(scan_inst-
>> >size_read(i), REG_SIZE)))
>
> Same here.
>

These two are removed later on when backend_reg::in_range goes away in
PATCH 27, so the DIV_ROUND_UP() call also goes away and size_read() ends
up being exactly what we want.
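
As a small illustration of the bytes-to-registers conversion discussed here,
the sketch below contrasts rounding up with plain division. REG_SIZE is the
32-byte GRF size; regs_from_bytes() is a hypothetical helper, and the 16-byte
unit mirrors the UNIFORM/IMM case in the regs_read() wrapper of this patch.

```c
#include <assert.h>

#define REG_SIZE 32u   /* bytes per GRF register */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of whole register units touched by a region of size_bytes
 * bytes that starts at a unit boundary.
 */
static unsigned
regs_from_bytes(unsigned size_bytes, unsigned reg_unit)
{
   assert(reg_unit > 0);
   return DIV_ROUND_UP(size_bytes, reg_unit);
}
```

A 40-byte read covers regs_from_bytes(40, REG_SIZE) == 2 registers, whereas
40 / REG_SIZE truncates to 1. The two only agree when the size is a multiple
of the unit.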

>>    interfered = true;
>>  }
>>  if (interfered)
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> index f98c7ac..777d252 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> @@ -437,7 +437,7 @@ vec4_visitor::opt_copy_propagation(bool
>> do_constant_prop)
>>  continue;
>>  
>>

Re: [Mesa-dev] [PATCH 4/5] aubinator: Make gen_disasm_disassemble handle split sends

2016-09-08 Thread Kenneth Graunke

On Thursday, September 8, 2016 4:15:21 PM PDT Sirisha Gandikota wrote:
> From: Sirisha Gandikota 
> 
> Skylake adds new SENDS and SENDSC opcodes, which should be
> handled in the send-with-EOT check. Make an is_send() helper
> that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken)
> 
> Signed-off-by: Sirisha Gandikota 
> ---
>  src/intel/tools/disasm.c | 22 --
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
> index 7e5a7cb..13e4ce2 100644
> --- a/src/intel/tools/disasm.c
> +++ b/src/intel/tools/disasm.c
> @@ -35,12 +35,25 @@ struct gen_disasm {
>  struct gen_device_info devinfo;
>  };
>  
> +
> +static bool
> +is_send(uint32_t opcode)
> +{
> +   if (opcode == BRW_OPCODE_SEND || opcode == BRW_OPCODE_SENDC ||
> +   opcode == BRW_OPCODE_SENDS || opcode == BRW_OPCODE_SENDSC ) {
> +  return true;
> +   } else {
> +  return false;
> +   }

No need for if/else...you can just do

   return opcode == BRW_OPCODE_SEND ||
  opcode == BRW_OPCODE_SENDC ||
  opcode == BRW_OPCODE_SENDS ||
  opcode == BRW_OPCODE_SENDSC;

> +}
> +
>  void
>  gen_disasm_disassemble(struct gen_disasm *disasm, void *assembly, int start,
> int end, FILE *out)
>  {
> struct gen_device_info *devinfo = &disasm->devinfo;
> bool dump_hex = false;
> +   uint32_t opcode = 0;
>  
> for (int offset = start; offset < end;) {
>brw_inst *insn = assembly + offset;
> @@ -74,14 +87,11 @@ gen_disasm_disassemble(struct gen_disasm *disasm, void 
> *assembly, int start,
>brw_disassemble_inst(out, devinfo, insn, compacted);
>  
>/* Simplistic, but efficient way to terminate disasm */
> -  if (brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SEND ||
> -  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC) {
> - if (brw_inst_eot(devinfo, insn))
> -break;
> +  opcode = brw_inst_opcode(devinfo, insn);

We're allowed to mix declarations and code - please do

  const uint32_t opcode = brw_inst_opcode(devinfo, insn);

> +  if (opcode == 0 || (is_send(opcode) && brw_inst_eot(devinfo, insn))) {
> + break;
>}
>  
> -  if (brw_inst_opcode(devinfo, insn) == 0)
> - break;
> }
>  }
>  
> 
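
Putting both suggestions together, the helper and the termination check could
look roughly like this. It is a stand-alone sketch: the MOCK_OPCODE_* values
are placeholders rather than the real BRW_OPCODE_* encodings, and struct
mock_inst with ends_disassembly() stands in for the brw_inst accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder opcode values for illustration only. */
enum mock_opcode {
   MOCK_OPCODE_ILLEGAL = 0,
   MOCK_OPCODE_MOV     = 1,
   MOCK_OPCODE_SEND    = 49,
   MOCK_OPCODE_SENDC   = 50,
   MOCK_OPCODE_SENDS   = 51,
   MOCK_OPCODE_SENDSC  = 52,
};

/* Hypothetical decoded instruction: opcode plus end-of-thread flag. */
struct mock_inst {
   uint32_t opcode;
   bool eot;
};

/* Return the comparison directly, no if/else needed. */
static bool
is_send(uint32_t opcode)
{
   return opcode == MOCK_OPCODE_SEND ||
          opcode == MOCK_OPCODE_SENDC ||
          opcode == MOCK_OPCODE_SENDS ||
          opcode == MOCK_OPCODE_SENDSC;
}

/* Disassembly stops at opcode 0 or at any send variant with EOT set. */
static bool
ends_disassembly(const struct mock_inst *insn)
{
   assert(insn);

   /* Declared at the point of use, as suggested in the review. */
   const uint32_t opcode = insn->opcode;

   return opcode == 0 || (is_send(opcode) && insn->eot);
}
```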




Re: [Mesa-dev] [PATCH 07/57] i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes.

2016-09-08 Thread Francisco Jerez

Iago Toral  writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> (...)
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index 12ab7b3..a678351 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -363,7 +363,7 @@ fs_generator::generate_fb_read(fs_inst *inst,
>> struct brw_reg dst,
>>    prog_data->binding_table.render_target_start + inst->target;
>>  
>> gen9_fb_READ(p, dst, payload, surf_index,
>> -inst->header_size, inst->regs_written,
>> +inst->header_size, inst->size_written / REG_SIZE,
>
> DIV_ROUND_UP?
>
>>  prog_data->persample_dispatch);
>>  
>> brw_mark_surface_used(&prog_data->base, surf_index);
>> @@ -467,7 +467,7 @@ fs_generator::generate_urb_read(fs_inst *inst,
>>    brw_inst_set_urb_per_slot_offset(p->devinfo, send, true);
>>  
>> brw_inst_set_mlen(p->devinfo, send, inst->mlen);
>> -   brw_inst_set_rlen(p->devinfo, send, inst->regs_written);
>> +   brw_inst_set_rlen(p->devinfo, send, inst->size_written /
>> REG_SIZE);
>
> DIV_ROUND_UP?
>
>> brw_inst_set_header_present(p->devinfo, send, true);
>> brw_inst_set_urb_global_offset(p->devinfo, send, inst->offset);
>>  }
>> @@ -895,7 +895,7 @@ fs_generator::generate_tex(fs_inst *inst, struct
>> brw_reg dst, struct brw_reg src
>>   surface + base_binding_table_index,
>>   sampler % 16,
>>   msg_type,
>> - inst->regs_written,
>> + inst->size_written / REG_SIZE,
>
> DIV_ROUND_UP?
>
>>   inst->mlen,
>>   inst->header_size != 0,
>>   simd_mode,
>> @@ -932,7 +932,7 @@ fs_generator::generate_tex(fs_inst *inst, struct
>> brw_reg dst, struct brw_reg src
>>    0 /* surface */,
>>    0 /* sampler */,
>>    msg_type,
>> -  inst->regs_written,
>> +  inst->size_written / REG_SIZE,
>
> DIV_ROUND_UP?
>
>>    inst->mlen /* mlen */,
>>    inst->header_size != 0 /* header */,
>>    simd_mode,
>> @@ -1263,7 +1263,7 @@
>> fs_generator::generate_varying_pull_constant_load_gen4(fs_inst *inst,
>> */
>>    msg_type = BRW_SAMPLER_MESSAGE_SIMD16_LD;
>>    assert(inst->mlen == 3);
>> -  assert(inst->regs_written == 8);
>> +  assert(inst->size_written == 8 * REG_SIZE);
>>    rlen = 8;
>>    simd_mode = BRW_SAMPLER_SIMD_MODE_SIMD16;
>> }
>> @@ -1408,7 +1408,7 @@
>> fs_generator::generate_pixel_interpolator_query(fs_inst *inst,
>>   msg_type,
>>   msg_data,
>>   inst->mlen,
>> - inst->regs_written);
>> + inst->size_written / REG_SIZE);
>
> DIV_ROUND_UP?

In all cases above you have the requirement that the amount of data
written is an exact multiple of REG_SIZE, because SEND messages can only
represent return payload sizes as an integer in GRF units, so if
fs_inst::size_written ends up not being a multiple of REG_SIZE in any of
these cases something has gone seriously wrong along the way.  Would you
like me to sprinkle in some assertions to verify that?
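
A minimal sketch of what such an assertion could look like, assuming a 32-byte GRF. The helper name bytes_to_regs_exact() is made up for illustration and is not part of the series:

```c
#include <assert.h>

/* Assumption: one GRF is 32 bytes, as REG_SIZE is in the i965 back-end. */
#define REG_SIZE 32u

/* Hypothetical helper: convert a byte count into whole GRFs and assert
 * that the plain division does not silently drop a partial register. */
static unsigned
bytes_to_regs_exact(unsigned size_written)
{
   assert(size_written % REG_SIZE == 0);
   return size_written / REG_SIZE;
}
```

With a helper along these lines, a call such as brw_inst_set_rlen(p->devinfo, send, bytes_to_regs_exact(inst->size_written)) would catch a misaligned size_written in debug builds rather than rounding it either way.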



Re: [Mesa-dev] [PATCH] vl/dri3: handle the case of different GPU

2016-09-08 Thread Michel Dänzer

On 08/09/16 05:59 PM, Christian König wrote:
> Am 08.09.2016 um 10:42 schrieb Michel Dänzer:
>> On 08/09/16 05:05 PM, Christian König wrote:
>>> Am 08.09.2016 um 08:23 schrieb Michel Dänzer:
>>>> On 08/09/16 01:13 PM, Nayan Deshmukh wrote:
>>>>> On Thu, Sep 8, 2016 at 9:03 AM, Michel Dänzer wrote:
>>>>>> On 08/09/16 02:48 AM, Nayan Deshmukh wrote:
>>>>>>> use a linear buffer in case of back buffer
>>>>>>>
>>>>>>> Signed-off-by: Nayan Deshmukh
>>>>>>
>>>>>> However, as we discussed before, for various reasons it would
>>>>>> probably be better to create separate linear buffers instead of making
>>>>>> all buffers linear.
>>>>>
>>>>> So should I maintain a single linear buffer and copy the back
>>>>> buffer to it before sending it via the present extension?
>>>> It's better to create one linear buffer corresponding to each
>>>> non-linear buffer with contents to be presented. Otherwise the
>>>> rendering GPU may overwrite the linear buffer contents while the
>>>> presentation GPU is still reading from it, resulting in tearing-like
>>>> artifacts.
>>> That approach isn't necessary. VDPAU has functions to query if an output
>>> surface is still displayed or not.
>>>
>>> If the application starts to render into a buffer while it is still
>>> being displayed tearing-like artifacts are the expected result.
>> You're talking about the buffers exposed to applications via VDPAU. I
>> was talking about using a single separate linear buffer which would be
>> used for presentation of all VDPAU buffers. There's no way for the
>> application to know when that's idle.
> 
> Ok, yes that makes more sense.
> 
>>
>>> In addition to that, I made the VDPAU output surfaces linear a while
>>> ago anyway, because it turned out that tiling wasn't actually beneficial
>>> in this use case (a single quad rendered over the whole texture).
>> That's fine as long as the buffers are in VRAM, but when they're pinned
>> to GTT for sharing between GPUs, rendering to them with the 3D engine
>> results in bad PCIe bandwidth utilization, as Marek explained recently.
>> So even if the original buffers are already linear, it's better to keep
>> those in VRAM and use separate buffers for sharing between GPUs.
>>
> Mhm at least for VDPAU most compositions should happen on temporary
> buffers anyway when there are any filters enabled.

In that case, do the contents get into the final buffer via a blit or
some kind of triangle / quad draw operation?


> Anyway, I would strongly suggest handling that in the VDPAU state tracker
> and not in the DRI3 code, because the handling needed seems to be
> different for VA-API, and I would really like to avoid any additional
> copy for 4k playback.

The thing is, with a discrete GPU, having separate buffers for sharing
between GPUs and transferring the final contents to be presented to
those buffers using a blit might be faster than having any of the
previous steps render to the shared buffer in GTT directly. Only the
DRI3 specific code knows about this.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH 06/57] i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written.

2016-09-08 Thread Francisco Jerez

Iago Toral writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
>> This is in preparation for dropping vec4_instruction::regs_read and
>> ::regs_written in favor of more accurate alternatives expressed in
>> byte units.  The main reason these wrappers are useful is that a
>> number of optimization passes implement dataflow analysis with
>> register granularity, so these helpers will come in handy once we've
>> switched register offsets and sizes to the byte representation.  The
>> wrapper functions will also make sure that GRF misalignment
>> (currently
>> neglected by most of the back-end) is taken into account correctly in
>> the calculation of regs_read and regs_written.
>
>
> This does not seem to replace all uses of regs_written and
> inst->regs_read() with these helpers. I am not sure if this was by design
> or by mistake, but the consequence is that later patches still do a lot of
> things like:
>
> - scan_inst->dst.offset / REG_SIZE + scan_inst->regs_written >
> + scan_inst->dst.offset / REG_SIZE + DIV_ROUND_UP(scan_inst->size_written, REG_SIZE)
>
> (this hunk is from the next patch in fs_visitor::compute_to_mrf(), but
> there are plenty more like this in that same patch)
>
> which would not have been necessary if we just used the regs_written()
> helper here.
>

The reason for the apparent inconsistency you've noticed here is that
regs_written(inst) and DIV_ROUND_UP(inst.size_written, REG_SIZE), even
though they look like synonyms at this point of the series, are intended
to do different things (they don't yet, but they will once several fixes
are applied to regs_written() after PATCH 16).  From the doxygen comment
of regs_written():

| Return the number of dataflow registers written by the instruction
| (either fully or partially) counted from 'floor(reg_offset(inst->dst)
| / register_size)'.  The somewhat arbitrary register size unit is 16B
| for the UNIFORM and IMM files and 32B for all other files.

IOW, regs_written() is expected to behave as if it partitioned the
register file of inst->dst into 32B chunks starting from reg_offset(r)
== 0, and returned how many of those chunks overlap the destination
region of the instruction, which is not necessarily equivalent to the
amount of data written by the instruction in register units (if e.g. the
instruction writes exactly REG_SIZE bytes but the destination region
starts mid-GRF, regs_written(inst) would be expected to return two, but
DIV_ROUND_UP(inst.size_written, REG_SIZE) would return one).

The same goes for regs_read() vs DIV_ROUND_UP(size_read(), REG_SIZE).

That said, you could argue that in the example you pasted above
regs_written() would have been the more correct thing to do of the two.
That's definitely the case, but I didn't bother to change it because I
removed the whole condition anyway during the clean-up part of this
series, since it was just a rather hairy open-coded version of
region_contained_in().
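
To make the distinction concrete, here is a small sketch of the two quantities, assuming 32-byte registers. regs_overlapped() and regs_of_data() are illustrative names and only model the behaviour described in the doxygen comment; they are not the helpers from the series:

```c
#include <assert.h>

#define REG_SIZE 32u                              /* assumed GRF size in bytes */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of REG_SIZE chunks overlapped by the byte range
 * [offset, offset + size), counting from floor(offset / REG_SIZE). */
static unsigned
regs_overlapped(unsigned offset, unsigned size)
{
   return DIV_ROUND_UP(offset % REG_SIZE + size, REG_SIZE);
}

/* Amount of data written, expressed in register units. */
static unsigned
regs_of_data(unsigned size)
{
   return DIV_ROUND_UP(size, REG_SIZE);
}
```

For a write of exactly REG_SIZE bytes starting 16 bytes into a GRF, regs_overlapped(16, 32) gives two while regs_of_data(32) gives one, which is the mid-GRF case described above.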

>> ---
>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h| 26
>> ++
>>  .../drivers/dri/i965/brw_schedule_instructions.cpp |  8 +++
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp |  4 ++--
>>  src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  6 ++---
>>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp  |  6 ++---
>>  .../drivers/dri/i965/brw_vec4_live_variables.cpp   |  8 +++
>>  6 files changed, 42 insertions(+), 16 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> index 4f49428..a1a201b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> @@ -254,6 +254,32 @@ set_saturate(bool saturate, vec4_instruction
>> *inst)
>> return inst;
>>  }
>>  
>> +/**
>> + * Return the number of dataflow registers written by the
>> instruction (either
>> + * fully or partially) counted from 'floor(reg_offset(inst->dst) /
>> + * register_size)'.  The somewhat arbitrary register size unit is
>> 16B for the
>> + * UNIFORM and IMM files and 32B for all other files.
>> + */
>> +inline unsigned
>> +regs_written(const vec4_instruction *inst)
>> +{
>> +   /* XXX - Take into account register-misaligned offsets correctly.
>> */
>> +   return inst->regs_written;
>> +}
>> +
>> +/**
>> + * Return the number of dataflow registers read by the instruction
>> (either
>> + * fully or partially) counted from 'floor(reg_offset(inst->src[i])
>> /
>> + * register_size)'.  The somewhat arbitrary register size unit is
>> 16B for the
>> + * UNIFORM and IMM files and 32B for all other files.
>> + */
>> +inline unsigned
>> +regs_read(const vec4_instruction *inst, unsigned i)
>> +{
>> +   /* XXX - Take into account register-misaligned offsets correctly.
>> */
>> +   return inst->regs_read(i);
>> +}
>> +
>>  } /* namespace brw */
>>  
>>  #endif
>> diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
>> b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
>> index 0d3

Re: [Mesa-dev] [PATCH 05/13] nouveau: Enable EXT_texture_env_dot3 on NV10 and NV20

2016-09-08 Thread Ian Romanick

On 08/28/2016 06:17 PM, Ilia Mirkin wrote:
> On Sun, Aug 28, 2016 at 9:05 PM, Ian Romanick  wrote:
>> On 08/28/2016 08:56 AM, Ilia Mirkin wrote:
>>> FWIW this fails for GL_DOT3_RGBA_EXT but works for GL_DOT3_RGB_EXT
>>> [according to glean's texCombine test]. (I suspect the existing
>>
>> Looking at the test results... any idea what it's actually doing?
>> Ignoring alpha and using 1.0?  Using garbage?  Other?
> 
> Well... it's unclear that the results can be trusted. There's an
> error in setting the RC_OUT_ALPHA value which means that the hardware
> is in some pseudo-inconsistent state, potentially. (The error is
> thrown when the graph engine processes the register write request from
> the command FIFO, in the form of an interrupt.) For a while I was
> getting 0,0,0,0.25, and then I started getting 1,1,1,0.25. The "real"
> answer was supposed to be 1,1,1,1:
> 
> $ NOUVEAU_VIEUX=1 bin/glean -o -v -v -v -t +texCombine  --quick
> texCombine:  FAIL rgba8, db, z24, s8, win+pmap, id 33
> expected 1, 1, 1, 1, got 0, 0, 0, 0.247059 in Single Texture Test
> Current combine state:
> Incoming Fragment RGBA = 0, 0.25, 0.5, 0.75
> Texture Unit 0:
>   GL_COMBINE_RGB_EXT = GL_DOT3_RGBA_EXT
>   GL_COMBINE_ALPHA_EXT = GL_MODULATE
>   GL_SOURCE0_RGB_EXT = GL_TEXTURE
>   GL_SOURCE1_RGB_EXT = GL_TEXTURE
>   GL_SOURCE2_RGB_EXT = GL_CONSTANT_EXT
>   GL_SOURCE0_ALPHA_EXT = GL_TEXTURE
>   GL_SOURCE1_ALPHA_EXT = GL_TEXTURE
>   GL_SOURCE2_ALPHA_EXT = GL_CONSTANT_EXT
>   GL_OPERAND0_RGB_EXT = GL_SRC_COLOR
>   GL_OPERAND1_RGB_EXT = GL_SRC_COLOR
>   GL_OPERAND2_RGB_EXT = GL_SRC_ALPHA
>   GL_OPERAND0_ALPHA_EXT = GL_SRC_ALPHA
>   GL_OPERAND1_ALPHA_EXT = GL_SRC_ALPHA
>   GL_OPERAND2_ALPHA_EXT = GL_SRC_ALPHA
>   GL_RGB_SCALE_EXT = 1
>   GL_ALPHA_SCALE = 1
>   Tex Env RGBA = 0.25, 0.5, 0.75, 1
>   Texture RGBA = 1, 0, 0.25, 0.5
> 
> To be super-clear - this is not your fault - it was already like that
> for the non-EXT version. But I'm hoping you could provide some hints
> as to why it's happening and/or how I could fix it.
> 
> And I'm pretty sure the RGB_EXT thing works, because the texcombine
> test runs that first and there are no errors from it.

While I was waiting for my GF3 system to finish installing... I dug
through the GL_NV_register_combiners documentation.  That extension is a
pretty thin shim on top of what the hardware does.  As far as I can
tell, there is no way to output the same data to RGB and A in a single
combiner stage.  I believe you have to use a second register combiner
stage to copy the blue component from the DOT3 operation to the alpha
component.

I'm not sure how to accomplish that in the current architecture.

>   -ilia





Re: [Mesa-dev] [PATCH 02/57] i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes.

2016-09-08 Thread Francisco Jerez

Iago Toral writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> (...)
>>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  4 +-
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 61
>> --
>>  .../drivers/dri/i965/brw_vec4_cmod_propagation.cpp |  2 +-
>>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  4 +-
>>  .../drivers/dri/i965/brw_vec4_live_variables.h |  8 +--
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  2 +-
>>  .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |  4 +-
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 14 ++---
>>  8 files changed, 51 insertions(+), 48 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> index 3813bb8..4f49428 100644
>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
>> @@ -65,7 +65,7 @@ offset(src_reg reg, unsigned delta)
>>  {
>> assert(delta == 0 ||
>>    (reg.file != ARF && reg.file != FIXED_GRF && reg.file !=
>> IMM));
>> -   reg.reg_offset += delta;
>> +   reg.offset += delta * (reg.file == UNIFORM ? 16 : REG_SIZE);
>> return reg;
>>  }
>>  
>> @@ -134,7 +134,7 @@ offset(dst_reg reg, unsigned delta)
>>  {
>> assert(delta == 0 ||
>>    (reg.file != ARF && reg.file != FIXED_GRF && reg.file !=
>> IMM));
>> -   reg.reg_offset += delta;
>> +   reg.offset += delta * (reg.file == UNIFORM ? 16 : REG_SIZE);
>> return reg;
>>  }
>>  
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index d52fdc0..dd058db 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -68,7 +68,7 @@ src_reg::src_reg()
>>  src_reg::src_reg(struct ::brw_reg reg) :
>> backend_reg(reg)
>>  {
>> -   this->reg_offset = 0;
>> +   this->offset = 0;
>> this->reladdr = NULL;
>>  }
>>  
>> @@ -125,7 +125,7 @@ dst_reg::dst_reg(enum brw_reg_file file, int nr,
>> brw_reg_type type,
>>  dst_reg::dst_reg(struct ::brw_reg reg) :
>> backend_reg(reg)
>>  {
>> -   this->reg_offset = 0;
>> +   this->offset = 0;
>> this->reladdr = NULL;
>>  }
>>  
>> @@ -395,7 +395,7 @@ vec4_visitor::opt_vector_float()
>>    * sequence.  Combine anything we've accumulated so far.
>>    */
>>   if (last_reg != inst->dst.nr ||
>> - last_reg_offset != inst->dst.reg_offset ||
>> + last_reg_offset != inst->dst.offset / REG_SIZE ||
>>   last_reg_file != inst->dst.file ||
>>   (vf > 0 && dest_type != need_type)) {
>>  
>> @@ -439,7 +439,7 @@ vec4_visitor::opt_vector_float()
>>  imm_inst[inst_count++] = inst;
>>  
>>  last_reg = inst->dst.nr;
>> -last_reg_offset = inst->dst.reg_offset;
>> +last_reg_offset = inst->dst.offset / REG_SIZE;
>>  last_reg_file = inst->dst.file;
>>  if (vf > 0)
>> dest_type = need_type;
>> @@ -539,8 +539,8 @@ vec4_visitor::split_uniform_registers()
>>  
>>   assert(!inst->src[i].reladdr);
>>  
>> - inst->src[i].nr += inst->src[i].reg_offset;
>> - inst->src[i].reg_offset = 0;
>> + inst->src[i].nr += inst->src[i].offset / 16;
>> + inst->src[i].offset %= 16;
>>    }
>> }
>>  }
>> @@ -857,7 +857,7 @@
>> vec4_visitor::move_push_constants_to_pull_constants()
>>  
>>   inst->src[i].file = temp.file;
>>   inst->src[i].nr = temp.nr;
>> - inst->src[i].reg_offset = temp.reg_offset;
>> + inst->src[i].offset %= 16;
>
> So it seems that temp.offset is going to be 0 here and that's why
> you're making this change. Looks good to me, just making sure that this
> is not something unintended since it is not quite following the pattern
> of directly translating the original code.
>
Yeah, that's right, I should probably have split this simplification
into a separate patch since it's apparently not fully obvious.
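
As a rough illustration of the unit change in this patch, the sketch below models a register reference whose offset is kept in bytes. struct reg_ref and its is_uniform flag are simplified stand-ins for src_reg/dst_reg and the register file, and REG_SIZE is assumed to be 32 bytes:

```c
#include <assert.h>

#define REG_SIZE     32u   /* bytes per GRF (assumed) */
#define UNIFORM_SIZE 16u   /* bytes per vec4 uniform slot, as in the patch */

/* Hypothetical stand-in for src_reg/dst_reg. */
struct reg_ref {
   unsigned nr;
   unsigned offset;        /* in bytes */
   int is_uniform;
};

/* Mirrors offset(reg, delta): delta is still counted in registers,
 * but the stored offset advances in bytes. */
static struct reg_ref
offset_by(struct reg_ref reg, unsigned delta)
{
   reg.offset += delta * (reg.is_uniform ? UNIFORM_SIZE : REG_SIZE);
   return reg;
}

/* Mirrors split_uniform_registers(): fold whole 16-byte slots into nr
 * and keep only the remainder within the slot. */
static struct reg_ref
split_uniform(struct reg_ref reg)
{
   reg.nr += reg.offset / UNIFORM_SIZE;
   reg.offset %= UNIFORM_SIZE;
   return reg;
}
```

A uniform advanced by two slots ends up with nr increased by two and a zero byte offset after splitting, which matches the old reg_offset behaviour for slot-aligned accesses.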

>>   inst->src[i].reladdr = NULL;
>>    }
>> }
>
> (...)
>
>> @@ -1831,7 +1834,7 @@ vec4_visitor::convert_to_hw_regs()
>>   struct brw_reg reg;
>>   switch (src.file) {
>>   case VGRF:
>> -reg = brw_vec8_grf(src.nr + src.reg_offset, 0);
>> +reg = brw_vec8_grf(src.nr + src.offset / REG_SIZE, 0);
>>  reg.type = src.type;
>>  reg.swizzle = src.swizzle;
>>  reg.abs = src.abs;
>> @@ -1840,8 +1843,8 @@ vec4_visitor::convert_to_hw_regs()
>>  
>>   case UNIFORM:
>>  reg = stride(brw_vec4_grf(prog_data->base.dispatch_grf_start_reg +
>> -  (src.nr + src.reg_offset) / 2,
>> -  ((src.nr + src.reg_offset) % 2) * 4),
>> +  (src.nr + src.offset / 4) / 2,
>> +  ((src.nr + src.offset / 4) % 
>
> Shouldn't we divide by 16 instead

[Mesa-dev] [RFT PATCH 1/2] nouveau: fix GL_CLAMP

2016-09-08 Thread Ilia Mirkin

My earlier observations were that this didn't actually do anything
useful on NV10. However, it seems to help with NV25-on-NV30. This should
be tested on "real" hw.
---

Note - this needs to be redone so as to avoid returning 0x5 on hw that
doesn't support it. According to rnndb, it doesn't exist on NV4/NV5.

However the immediate desire is to double-check that it doesn't break
anything on pre-NV20 hw.
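
One possible shape for the redone version, as a sketch only: the enum, the function name and the hw_has_clamp flag are all hypothetical, and only the three hardware values visible in the diff below (0x2, 0x3, 0x5) are used. Falling back to 0x3 reproduces the pre-patch behaviour, where GL_CLAMP fell through to GL_CLAMP_TO_EDGE:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL wrap enums used by nvgl_wrap_mode(). */
enum wrap_mode {
   WRAP_MIRRORED_REPEAT,
   WRAP_CLAMP,
   WRAP_CLAMP_TO_EDGE,
};

/* hw_has_clamp is an assumed capability flag: false on hardware where
 * the 0x5 wrap mode does not exist. */
static unsigned
wrap_mode_for_hw(enum wrap_mode wrap, bool hw_has_clamp)
{
   switch (wrap) {
   case WRAP_MIRRORED_REPEAT:
      return 0x2;
   case WRAP_CLAMP:
      /* Old behaviour as the fallback: treat GL_CLAMP as clamp-to-edge. */
      return hw_has_clamp ? 0x5 : 0x3;
   case WRAP_CLAMP_TO_EDGE:
      return 0x3;
   }
   return 0x3;
}
```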

 src/mesa/drivers/dri/nouveau/nouveau_gldefs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/nouveau/nouveau_gldefs.h 
b/src/mesa/drivers/dri/nouveau/nouveau_gldefs.h
index 46ec14e..ad698b1 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_gldefs.h
+++ b/src/mesa/drivers/dri/nouveau/nouveau_gldefs.h
@@ -229,6 +229,7 @@ nvgl_wrap_mode(unsigned wrap)
case GL_MIRRORED_REPEAT:
return 0x2;
case GL_CLAMP:
+   return 0x5;
case GL_CLAMP_TO_EDGE:
return 0x3;
case GL_CLAMP_TO_BORDER:
-- 
2.7.3


[Mesa-dev] [RFT PATCH 2/2] nv20: enable ARB_texture_border_clamp support

2016-09-08 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---

This was tested on a NV25-on-NV34 situation. Should be tested on real hardware
since my test environment relies on accurate emulation in the hw.

 src/mesa/drivers/dri/nouveau/nv20_context.c   |  1 +
 src/mesa/drivers/dri/nouveau/nv20_state_tex.c | 29 ++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/nouveau/nv20_context.c 
b/src/mesa/drivers/dri/nouveau/nv20_context.c
index ec638c0..6940b4d 100644
--- a/src/mesa/drivers/dri/nouveau/nv20_context.c
+++ b/src/mesa/drivers/dri/nouveau/nv20_context.c
@@ -456,6 +456,7 @@ nv20_context_create(struct nouveau_screen *screen, gl_api 
api,
if (!nouveau_context_init(ctx, api, screen, visual, share_ctx))
goto fail;
 
+   ctx->Extensions.ARB_texture_border_clamp = true;
ctx->Extensions.ARB_texture_env_crossbar = true;
ctx->Extensions.ARB_texture_env_combine = true;
ctx->Extensions.ARB_texture_env_dot3 = true;
diff --git a/src/mesa/drivers/dri/nouveau/nv20_state_tex.c 
b/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
index b0a4c9f..ef1799a 100644
--- a/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
+++ b/src/mesa/drivers/dri/nouveau/nv20_state_tex.c
@@ -165,7 +165,8 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
struct nouveau_surface *s;
struct gl_texture_image *ti;
const struct gl_sampler_object *sa;
-   uint32_t tx_format, tx_filter, tx_wrap, tx_enable;
+   uint8_t r, g, b, a;
+   uint32_t tx_format, tx_filter, tx_wrap, tx_bcolor, tx_enable;
 
PUSH_RESET(push, BUFCTX_TEX(i));
 
@@ -201,6 +202,29 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
| nvgl_filter_mode(sa->MinFilter) << 16
| 2 << 12;
 
+   r = FLOAT_TO_UBYTE(sa->BorderColor.f[0]);
+   g = FLOAT_TO_UBYTE(sa->BorderColor.f[1]);
+   b = FLOAT_TO_UBYTE(sa->BorderColor.f[2]);
+   a = FLOAT_TO_UBYTE(sa->BorderColor.f[3]);
+   switch (ti->_BaseFormat) {
+   case GL_LUMINANCE:
+   a = 0xff;
+   /* fallthrough */
+   case GL_LUMINANCE_ALPHA:
+   g = b = r;
+   break;
+   case GL_RGB:
+   a = 0xff;
+   break;
+   case GL_INTENSITY:
+   g = b = a = r;
+   break;
+   case GL_ALPHA:
+   r = g = b = 0;
+   break;
+   }
+   tx_bcolor = b << 0 | g << 8 | r << 16 | a << 24;
+
tx_enable = NV20_3D_TEX_ENABLE_ENABLE
| log2i(sa->MaxAnisotropy) << 4;
 
@@ -249,6 +273,9 @@ nv20_emit_tex_obj(struct gl_context *ctx, int emit)
BEGIN_NV04(push, NV20_3D(TEX_FILTER(i)), 1);
PUSH_DATA (push, tx_filter);
 
+   BEGIN_NV04(push, NV20_3D(TEX_BORDER_COLOR(i)), 1);
+   PUSH_DATA (push, tx_bcolor);
+
BEGIN_NV04(push, NV20_3D(TEX_ENABLE(i)), 1);
PUSH_DATA (push, tx_enable);
 
-- 
2.7.3
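
The border colour handling in this patch can be exercised in isolation with a sketch like the following. The enum is a made-up stand-in for the GL base formats, and pack_border_color() is an illustrative name. Unlike the patch, each channel is cast to uint32_t before shifting, so the alpha shift by 24 never happens on a promoted signed int:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL base formats handled by the patch. */
enum base_format {
   FMT_RGBA,
   FMT_RGB,
   FMT_LUMINANCE,
   FMT_LUMINANCE_ALPHA,
   FMT_INTENSITY,
   FMT_ALPHA,
};

/* Apply the per-format channel replication from the patch, then pack
 * the result with blue in the low byte and alpha in the high byte. */
static uint32_t
pack_border_color(enum base_format fmt,
                  uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   switch (fmt) {
   case FMT_LUMINANCE:
      a = 0xff;
      /* fallthrough */
   case FMT_LUMINANCE_ALPHA:
      g = b = r;
      break;
   case FMT_RGB:
      a = 0xff;
      break;
   case FMT_INTENSITY:
      g = b = a = r;
      break;
   case FMT_ALPHA:
      r = g = b = 0;
      break;
   case FMT_RGBA:
      break;
   }

   return (uint32_t)b | (uint32_t)g << 8 |
          (uint32_t)r << 16 | (uint32_t)a << 24;
}
```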


Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Rob Clark

On Thu, Sep 8, 2016 at 8:28 PM, Roland Scheidegger wrote:
> Am 09.09.2016 um 02:19 schrieb Rob Clark:
>> On Thu, Sep 8, 2016 at 7:54 PM, Rob Clark  wrote:
>>> On Thu, Sep 8, 2016 at 6:41 PM, Roland Scheidegger  
>>> wrote:
 Am 08.09.2016 um 23:43 schrieb Rob Clark:
> On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  
> wrote:
>> Am 08.09.2016 um 22:30 schrieb Rob Clark:
>>> Support multi-planar YUV for external EGLImage's (currently just in the
>>> dma-buf import path) by lowering to multiple texture fetch's for each
>>> plane and CSC in shader.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>>>  src/gallium/include/pipe/p_state.h  |   9 +++
>>>  src/gallium/include/state_tracker/st_api.h  |   3 +
>>>  src/gallium/state_trackers/dri/dri2.c   | 119 
>>> +++-
>>>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>>>  src/mesa/main/mtypes.h  |  16 
>>>  src/mesa/program/ir_to_mesa.cpp |   1 +
>>>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>>>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>>>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>>>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>>>  src/mesa/state_tracker/st_context.c |   7 +-
>>>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>>>  src/mesa/state_tracker/st_manager.c |   1 +
>>>  src/mesa/state_tracker/st_program.c |  35 
>>>  src/mesa/state_tracker/st_program.h |  37 +
>>>  src/mesa/state_tracker/st_texture.h |  21 +
>>>  18 files changed, 362 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
>>> b/src/gallium/auxiliary/util/u_inlines.h
>>> index c2a0b08..b7b8313 100644
>>> --- a/src/gallium/auxiliary/util/u_inlines.h
>>> +++ b/src/gallium/auxiliary/util/u_inlines.h
>>> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource 
>>> **ptr, struct pipe_resource *tex)
>>> struct pipe_resource *old_tex = *ptr;
>>>
>>> if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
>>> -
>>> (debug_reference_descriptor)debug_describe_resource))
>>> +
>>> (debug_reference_descriptor)debug_describe_resource)) {
>>> +  pipe_resource_reference(&old_tex->next, NULL);
>>>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
>>> +   }
>>> *ptr = tex;
>>>  }
>>>
>>> diff --git a/src/gallium/include/pipe/p_state.h 
>>> b/src/gallium/include/pipe/p_state.h
>>> index ebd0337..4a88da6 100644
>>> --- a/src/gallium/include/pipe/p_state.h
>>> +++ b/src/gallium/include/pipe/p_state.h
>>> @@ -498,6 +498,15 @@ struct pipe_resource
>>>
>>> unsigned bind;/**< bitmask of PIPE_BIND_x */
>>> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
>>> +
>>> +   /**
>>> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
>>> +* next plane.
>>> +*
>>> +* TODO might be useful for dealing w/ z32s8 too, since at least a
>>> +* couple drivers split these out into separate buffers internally.
>>> +*/
>>> +   struct pipe_resource *next;
>> Would it be possible to stuff the multiple resources somewhere else
>> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
>> other resources that way.
>> (Also, it's odd since things are mostly lowered really outside of
>> gallium so it's odd that some of the yuv state still sneaks in there.)
>
> I did originally start down the path of making __DRIImage have
> multiple pipe_resource's.. I'm not really sure that would end up
> better, and it certainly would be more invasive.
>
> Maybe we should just make that something like 'void *stpriv' to let st
> stick whatever it wants in there.  That seems more sane than making
> the st use a hashtable to map the rsc back to something else.

 Can't you just put 3 resources in somewhere without pointers?
 I just think it really should be outside gallium interfaces. The
 lowering is all done by the state tracker, hence having those bits there
 referencing other resources in gallium looks wrong to me.

>>>
>>> It would require a *lot* of changes to change st_texture_object::pt
>>> into an array in mesa/st, I think.. plus a bunch of re-working the
>>> egl-img code in mesa/st.. that sounds like a much worse option to me.
>>>
>>> Having a 'void *' opaque pointer in the resource (rather than a
>>> 'struct pipe_resource *next') for the st

Re: [Mesa-dev] [PATCH 01/57] i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes.

2016-09-08 Thread Francisco Jerez

Iago Toral writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> (...)
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index ea39252..29435f6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -672,7 +672,7 @@ backend_shader::backend_shader(const struct
>> brw_compiler *compiler,
>>  bool
>>  backend_reg::equals(const backend_reg &r) const
>>  {
>> -   return brw_regs_equal(this, &r) && reg_offset == r.reg_offset;
>> +   return brw_regs_equal(this, &r) && offset == r.offset;
>>  }
>>  
>>  bool
>> @@ -750,7 +750,9 @@ backend_reg::in_range(const backend_reg &r,
>> unsigned n) const
>> return (file == r.file &&
>> nr == r.nr &&
>> reg_offset >= r.reg_offset &&
>> -   reg_offset < r.reg_offset + n);
>> +   reg_offset < r.reg_offset + n &&
>
> Are you keeping the checks with reg_offset here for a reason or is this
> just an omission? I would expect that these would be replaced with the
> checks below like we do everywhere else in this patch.
>
Yeah, the reason is that this code is shared between both back-ends, and
the VEC4 back-end is still using the old reg_offset field until PATCH 2,
so I left these lying around for the moment until PATCH 3 in order to
avoid a temporary regression.
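
Once only the byte offset remains, the range check would presumably reduce to something like the sketch below. struct reg_ref is a simplified stand-in for backend_reg (the file field is omitted) and REG_SIZE is assumed to be 32 bytes:

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32u   /* assumed GRF size in bytes */

/* Hypothetical, reduced stand-in for backend_reg. */
struct reg_ref {
   unsigned nr;
   unsigned offset;    /* in bytes */
};

/* True if 'reg' starts inside the n registers beginning at 'r',
 * using byte offsets only. */
static bool
in_range_bytes(struct reg_ref reg, struct reg_ref r, unsigned n)
{
   return reg.nr == r.nr &&
          reg.offset >= r.offset &&
          reg.offset < r.offset + n * REG_SIZE;
}
```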

>> +   offset >= r.offset &&
>> +   offset < r.offset + n * REG_SIZE);
>>  }
>>  
>>  bool
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h
>> b/src/mesa/drivers/dri/i965/brw_shader.h
>> index 0102098..72b94b6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.h
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
>> @@ -44,14 +44,14 @@ struct backend_reg : private brw_reg
>> const brw_reg &as_brw_reg() const
>> {
>>    assert(file == ARF || file == FIXED_GRF || file == MRF || file
>> == IMM);
>> -  assert(reg_offset == 0);
>> +  assert(reg_offset == 0 && offset == 0);
>
> Same here.
>
>>    return static_cast<const brw_reg &>(*this);
>> }
>>  
>> brw_reg &as_brw_reg()
>> {
>>    assert(file == ARF || file == FIXED_GRF || file == MRF || file
>> == IMM);
>> -  assert(reg_offset == 0);
>> +  assert(reg_offset == 0 && offset == 0);
>
> And here.
>
>
>>    return static_cast<brw_reg &>(*this);
>> }
>>  
>> @@ -75,6 +75,9 @@ struct backend_reg : private brw_reg
>>  */
>> uint16_t reg_offset;
>>  
>> +   /** Offset from the start of the (virtual) register in bytes. */
>> +   uint16_t offset;
>> +
>> using brw_reg::type;
>> using brw_reg::file;
>> using brw_reg::negate;



[Mesa-dev] [Bug 97643] Shader crashes radeon driver and brings the whole system down

2016-09-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97643

Michel Dänzer changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
          Component|Mesa core                      |Drivers/Gallium/radeonsi
         QA Contact|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org
           Assignee|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Roland Scheidegger

Am 09.09.2016 um 02:19 schrieb Rob Clark:
> On Thu, Sep 8, 2016 at 7:54 PM, Rob Clark  wrote:
>> On Thu, Sep 8, 2016 at 6:41 PM, Roland Scheidegger  
>> wrote:
>>> Am 08.09.2016 um 23:43 schrieb Rob Clark:
 On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  
 wrote:
> Am 08.09.2016 um 22:30 schrieb Rob Clark:
>> Support multi-planar YUV for external EGLImage's (currently just in the
>> dma-buf import path) by lowering to multiple texture fetch's for each
>> plane and CSC in shader.
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>>  src/gallium/include/pipe/p_state.h  |   9 +++
>>  src/gallium/include/state_tracker/st_api.h  |   3 +
>>  src/gallium/state_trackers/dri/dri2.c   | 119 
>> +++-
>>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>>  src/mesa/main/mtypes.h  |  16 
>>  src/mesa/program/ir_to_mesa.cpp |   1 +
>>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>>  src/mesa/state_tracker/st_context.c |   7 +-
>>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>>  src/mesa/state_tracker/st_manager.c |   1 +
>>  src/mesa/state_tracker/st_program.c |  35 
>>  src/mesa/state_tracker/st_program.h |  37 +
>>  src/mesa/state_tracker/st_texture.h |  21 +
>>  18 files changed, 362 insertions(+), 27 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
>> b/src/gallium/auxiliary/util/u_inlines.h
>> index c2a0b08..b7b8313 100644
>> --- a/src/gallium/auxiliary/util/u_inlines.h
>> +++ b/src/gallium/auxiliary/util/u_inlines.h
>> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
>> struct pipe_resource *tex)
>> struct pipe_resource *old_tex = *ptr;
>>
>> if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
>> -
>> (debug_reference_descriptor)debug_describe_resource))
>> +
>> (debug_reference_descriptor)debug_describe_resource)) {
>> +  pipe_resource_reference(&old_tex->next, NULL);
>>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
>> +   }
>> *ptr = tex;
>>  }
>>
>> diff --git a/src/gallium/include/pipe/p_state.h 
>> b/src/gallium/include/pipe/p_state.h
>> index ebd0337..4a88da6 100644
>> --- a/src/gallium/include/pipe/p_state.h
>> +++ b/src/gallium/include/pipe/p_state.h
>> @@ -498,6 +498,15 @@ struct pipe_resource
>>
>> unsigned bind;/**< bitmask of PIPE_BIND_x */
>> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
>> +
>> +   /**
>> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
>> +* next plane.
>> +*
>> +* TODO might be useful for dealing w/ z32s8 too, since at least a
>> +* couple drivers split these out into separate buffers internally.
>> +*/
>> +   struct pipe_resource *next;
> Would it be possible to stuff the multiple resources somewhere else
> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
> other resources that way.
> (Also, it's odd since things are mostly lowered really outside of
> gallium so it's odd that some of the yuv state still sneaks in there.)

>>>> I did originally start down the path of making __DRIImage have
>>>> multiple pipe_resource's.. I'm not really sure that would end up
>>>> better, and it certainly would be more invasive.
>>>>
>>>> Maybe we should just make that something like 'void *stpriv' to let st
>>>> stick whatever it wants in there.  That seems more sane than making
>>>> the st use a hashtable to map the rsc back to something else.
>>>
>>> Can't you just put 3 resources in somewhere without pointers?
>>> I just think it really should be outside gallium interfaces. The
>>> lowering is all done by the state tracker, hence having those bits there
>>> referencing other resources in gallium looks wrong to me.
>>>
>>
>> It would require a *lot* of changes to change st_texture_object::pt
>> into an array in mesa/st, I think.. plus a bunch of re-working the
>> egl-img code in mesa/st.. that sounds like a much worse option to me.
>>
>> Having a 'void *' opaque pointer in the resource (rather than a
>> 'struct pipe_resource *next') for the st to do whatever it wants with
>> seems semi-sane to me.  And plausibly useful to other st's as well.
>>
>> *however* making it an opaque ptr (or even handling

Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Rob Clark

On Thu, Sep 8, 2016 at 7:54 PM, Rob Clark  wrote:
> On Thu, Sep 8, 2016 at 6:41 PM, Roland Scheidegger  wrote:
>> Am 08.09.2016 um 23:43 schrieb Rob Clark:
>>> On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  
>>> wrote:
>>>> Am 08.09.2016 um 22:30 schrieb Rob Clark:
> Support multi-planar YUV for external EGLImage's (currently just in the
> dma-buf import path) by lowering to multiple texture fetch's for each
> plane and CSC in shader.
>
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>  src/gallium/include/pipe/p_state.h  |   9 +++
>  src/gallium/include/state_tracker/st_api.h  |   3 +
>  src/gallium/state_trackers/dri/dri2.c   | 119 
> +++-
>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>  src/mesa/main/mtypes.h  |  16 
>  src/mesa/program/ir_to_mesa.cpp |   1 +
>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>  src/mesa/state_tracker/st_context.c |   7 +-
>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>  src/mesa/state_tracker/st_manager.c |   1 +
>  src/mesa/state_tracker/st_program.c |  35 
>  src/mesa/state_tracker/st_program.h |  37 +
>  src/mesa/state_tracker/st_texture.h |  21 +
>  18 files changed, 362 insertions(+), 27 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
> b/src/gallium/auxiliary/util/u_inlines.h
> index c2a0b08..b7b8313 100644
> --- a/src/gallium/auxiliary/util/u_inlines.h
> +++ b/src/gallium/auxiliary/util/u_inlines.h
> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
> struct pipe_resource *tex)
> struct pipe_resource *old_tex = *ptr;
>
> if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
> -
> (debug_reference_descriptor)debug_describe_resource))
> +
> (debug_reference_descriptor)debug_describe_resource)) {
> +  pipe_resource_reference(&old_tex->next, NULL);
>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
> +   }
> *ptr = tex;
>  }
>
> diff --git a/src/gallium/include/pipe/p_state.h 
> b/src/gallium/include/pipe/p_state.h
> index ebd0337..4a88da6 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -498,6 +498,15 @@ struct pipe_resource
>
> unsigned bind;/**< bitmask of PIPE_BIND_x */
> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
> +
> +   /**
> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
> +* next plane.
> +*
> +* TODO might be useful for dealing w/ z32s8 too, since at least a
> +* couple drivers split these out into separate buffers internally.
> +*/
> +   struct pipe_resource *next;
>>>> Would it be possible to stuff the multiple resources somewhere else
>>>> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
>>>> other resources that way.
>>>> (Also, it's odd since things are mostly lowered really outside of
>>>> gallium so it's odd that some of the yuv state still sneaks in there.)
>>>
>>> I did originally start down the path of making __DRIImage have
>>> multiple pipe_resource's.. I'm not really sure that would end up
>>> better, and it certainly would be more invasive.
>>>
>>> Maybe we should just make that something like 'void *stpriv' to let st
>>> stick whatever it wants in there.  That seems more sane than making
>>> the st use a hashtable to map the rsc back to something else.
>>
>> Can't you just put 3 resources in somewhere without pointers?
>> I just think it really should be outside gallium interfaces. The
>> lowering is all done by the state tracker, hence having those bits there
>> referencing other resources in gallium looks wrong to me.
>>
>
> It would require a *lot* of changes to change st_texture_object::pt
> into an array in mesa/st, I think.. plus a bunch of re-working the
> egl-img code in mesa/st.. that sounds like a much worse option to me.
>
> Having a 'void *' opaque pointer in the resource (rather than a
> 'struct pipe_resource *next') for the st to do whatever it wants with
> seems semi-sane to me.  And plausibly useful to other st's as well.
>
> *however* making it an opaque ptr (or even handling it purely in
> mesa/st) seems like a slight disadvantage compared to current patch..
> unref'ing rsc->next in pipe_resource_reference() is a nice b

Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Rob Clark

On Thu, Sep 8, 2016 at 6:41 PM, Roland Scheidegger  wrote:
> Am 08.09.2016 um 23:43 schrieb Rob Clark:
>> On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  
>> wrote:
>>> Am 08.09.2016 um 22:30 schrieb Rob Clark:
 Support multi-planar YUV for external EGLImage's (currently just in the
 dma-buf import path) by lowering to multiple texture fetch's for each
 plane and CSC in shader.

 Signed-off-by: Rob Clark 
 ---
  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
  src/gallium/include/pipe/p_state.h  |   9 +++
  src/gallium/include/state_tracker/st_api.h  |   3 +
  src/gallium/state_trackers/dri/dri2.c   | 119 
 +++-
  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
  src/mesa/main/mtypes.h  |  16 
  src/mesa/program/ir_to_mesa.cpp |   1 +
  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
  src/mesa/state_tracker/st_atom_shader.c |   3 +
  src/mesa/state_tracker/st_atom_texture.c|  58 ++
  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
  src/mesa/state_tracker/st_context.c |   7 +-
  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
  src/mesa/state_tracker/st_manager.c |   1 +
  src/mesa/state_tracker/st_program.c |  35 
  src/mesa/state_tracker/st_program.h |  37 +
  src/mesa/state_tracker/st_texture.h |  21 +
  18 files changed, 362 insertions(+), 27 deletions(-)

 diff --git a/src/gallium/auxiliary/util/u_inlines.h 
 b/src/gallium/auxiliary/util/u_inlines.h
 index c2a0b08..b7b8313 100644
 --- a/src/gallium/auxiliary/util/u_inlines.h
 +++ b/src/gallium/auxiliary/util/u_inlines.h
 @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
 struct pipe_resource *tex)
 struct pipe_resource *old_tex = *ptr;

 if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
 -
 (debug_reference_descriptor)debug_describe_resource))
 +
 (debug_reference_descriptor)debug_describe_resource)) {
 +  pipe_resource_reference(&old_tex->next, NULL);
old_tex->screen->resource_destroy(old_tex->screen, old_tex);
 +   }
 *ptr = tex;
  }

 diff --git a/src/gallium/include/pipe/p_state.h 
 b/src/gallium/include/pipe/p_state.h
 index ebd0337..4a88da6 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -498,6 +498,15 @@ struct pipe_resource

 unsigned bind;/**< bitmask of PIPE_BIND_x */
 unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
 +
 +   /**
 +* For planar images, ie. YUV EGLImage external, etc, pointer to the
 +* next plane.
 +*
 +* TODO might be useful for dealing w/ z32s8 too, since at least a
 +* couple drivers split these out into separate buffers internally.
 +*/
 +   struct pipe_resource *next;
>>> Would it be possible to stuff the multiple resources somewhere else
>>> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
>>> other resources that way.
>>> (Also, it's odd since things are mostly lowered really outside of
>>> gallium so it's odd that some of the yuv state still sneaks in there.)
>>
>> I did originally start down the path of making __DRIImage have
>> multiple pipe_resource's.. I'm not really sure that would end up
>> better, and it certainly would be more invasive.
>>
>> Maybe we should just make that something like 'void *stpriv' to let st
>> stick whatever it wants in there.  That seems more sane than making
>> the st use a hashtable to map the rsc back to something else.
>
> Can't you just put 3 resources in somewhere without pointers?
> I just think it really should be outside gallium interfaces. The
> lowering is all done by the state tracker, hence having those bits there
> referencing other resources in gallium looks wrong to me.
>

It would require a *lot* of changes to change st_texture_object::pt
into an array in mesa/st, I think.. plus a bunch of re-working the
egl-img code in mesa/st.. that sounds like a much worse option to me.

Having a 'void *' opaque pointer in the resource (rather than a
'struct pipe_resource *next') for the st to do whatever it wants with
seems semi-sane to me.  And plausibly useful to other st's as well.

*however* making it an opaque ptr (or even handling it purely in
mesa/st) seems like a slight disadvantage compared to current patch..
unref'ing rsc->next in pipe_resource_reference() is a nice benefit of
the current approach..
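
To make the benefit mentioned above concrete, here is a minimal sketch of a refcounted chain in which destroying the head also drops its reference on the next plane. The struct and function names are hypothetical stand-ins for illustration, not the real struct pipe_resource or pipe_resource_reference(), and the destroy counter exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted resource with a "next plane"
 * link. This is not struct pipe_resource; names are illustrative only. */
struct plane {
   int refcount;
   struct plane *next;
};

static int planes_destroyed; /* counts frees, for demonstration only */

static struct plane *
plane_create(struct plane *next)
{
   struct plane *p = calloc(1, sizeof *p);
   assert(p != NULL);
   p->refcount = 1;
   p->next = next; /* takes over the caller's reference to next */
   return p;
}

/* Point *ptr at p, adjusting refcounts. Destroying a plane also drops
 * its reference on the next plane, so the chain unwinds automatically. */
static void
plane_reference(struct plane **ptr, struct plane *p)
{
   struct plane *old = *ptr;

   if (p)
      p->refcount++;
   if (old && --old->refcount == 0) {
      plane_reference(&old->next, NULL);
      free(old);
      planes_destroyed++;
   }
   *ptr = p;
}
```

With an opaque 'void *' in place of the typed link, the generic reference helper could not do this, and whoever owns the pointer would have to unwind the chain explicitly.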

At the end of the day, I'm less a fan of making this all much harder
for the st only for the benefit of some hypothetical API "

[Mesa-dev] [PATCH 0/5] * Aubinator code simplification *

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

This patch set simplifies parts of the code in the aubinator tool,
as per review comments from Ken (Wed Aug 24 04:51:47 UTC 2016)

Sirisha Gandikota (5):
  aubinator: Fix compiler warning
  aubinator: Simplify gen_disasm_create()'s devinfo handling
  aubinator: Simplify print_dword_val() method
  aubinator: Make gen_disasm_disassemble handle split sends
  aubinator: Remove bogus "end" parameter in gen_disasm_disassemble()

 src/intel/tools/aubinator.c  | 20 
 src/intel/tools/decoder.h|  2 +-
 src/intel/tools/disasm.c | 38 ++
 src/intel/tools/gen_disasm.h |  2 +-
 4 files changed, 32 insertions(+), 30 deletions(-)

-- 
2.7.4


[Mesa-dev] [PATCH 2/5] aubinator: Simplify gen_disasm_create()'s devinfo handling

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

Copy the whole devinfo structure instead of just a few fields (Ken)

Earlier, only a couple of fields were copied, which took more code. So,
simplify the code by copying the whole structure.

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/disasm.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
index ddbfa9f..7e5a7cb 100644
--- a/src/intel/tools/disasm.c
+++ b/src/intel/tools/disasm.c
@@ -89,18 +89,12 @@ struct gen_disasm *
 gen_disasm_create(int pciid)
 {
struct gen_disasm *gd;
-   const struct gen_device_info *dev_info = NULL;
 
gd = malloc(sizeof *gd);
if (gd == NULL)
   return NULL;
 
-   dev_info = gen_get_device_info(pciid);
-
-   gd->devinfo.gen = dev_info->gen;
-   gd->devinfo.is_cherryview = dev_info->is_cherryview;
-   gd->devinfo.is_g4x = dev_info->is_g4x;
-
+   gd->devinfo = *gen_get_device_info(pciid);
brw_init_compaction_tables(&gd->devinfo);
 
return gd;
-- 
2.7.4
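
As an illustration of the change in this patch, the following is a minimal sketch of copying a whole structure by assignment. All type and function names are hypothetical stand-ins, not the real gen_device_info or gen_get_device_info(), and the NULL check assumes the lookup can fail for an unknown PCI ID, which the patch itself does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real gen_device_info or lookup function. */
struct device_info {
   int gen;
   bool is_cherryview;
   bool is_g4x;
   int num_slices; /* a field that a field-by-field copy could easily miss */
};

struct disasm {
   struct device_info devinfo;
};

static const struct device_info *
lookup_device_info(int pciid)
{
   static const struct device_info known = { 9, false, false, 3 };
   return pciid == 0x1234 ? &known : NULL;
}

/* Copies the whole structure by assignment. Returns false when the
 * device is unknown, instead of dereferencing a NULL pointer. */
static bool
disasm_init(struct disasm *d, int pciid)
{
   const struct device_info *info = lookup_device_info(pciid);

   assert(d != NULL);
   if (info == NULL)
      return false;

   d->devinfo = *info;
   return true;
}
```

Structure assignment picks up every field, including ones added later, which is the maintenance benefit of this approach.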


[Mesa-dev] [PATCH 4/5] aubinator: Make gen_disasm_disassemble handle split sends

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

Skylake adds new SENDS and SENDSC opcodes, which should be
handled in the send-with-EOT check. Make an is_send() helper
that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken)

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/disasm.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
index 7e5a7cb..13e4ce2 100644
--- a/src/intel/tools/disasm.c
+++ b/src/intel/tools/disasm.c
@@ -35,12 +35,25 @@ struct gen_disasm {
 struct gen_device_info devinfo;
 };
 
+
+static bool
+is_send(uint32_t opcode)
+{
+   if (opcode == BRW_OPCODE_SEND || opcode == BRW_OPCODE_SENDC ||
+   opcode == BRW_OPCODE_SENDS || opcode == BRW_OPCODE_SENDSC ) {
+  return true;
+   } else {
+  return false;
+   }
+}
+
 void
 gen_disasm_disassemble(struct gen_disasm *disasm, void *assembly, int start,
int end, FILE *out)
 {
struct gen_device_info *devinfo = &disasm->devinfo;
bool dump_hex = false;
+   uint32_t opcode = 0;
 
for (int offset = start; offset < end;) {
   brw_inst *insn = assembly + offset;
@@ -74,14 +87,11 @@ gen_disasm_disassemble(struct gen_disasm *disasm, void *assembly, int start,
   brw_disassemble_inst(out, devinfo, insn, compacted);
 
   /* Simplistic, but efficient way to terminate disasm */
-  if (brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SEND ||
-  brw_inst_opcode(devinfo, insn) == BRW_OPCODE_SENDC) {
- if (brw_inst_eot(devinfo, insn))
-break;
+  opcode = brw_inst_opcode(devinfo, insn);
+  if (opcode == 0 || (is_send(opcode) && brw_inst_eot(devinfo, insn))) {
+ break;
   }
 
-  if (brw_inst_opcode(devinfo, insn) == 0)
- break;
}
 }
 
-- 
2.7.4
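
The termination check in this patch can be sketched in isolation as follows. The opcode constants are hypothetical placeholders for the BRW_OPCODE_* definitions, and ends_program() is a name invented here; the sketch only shows that the if/else in is_send() collapses to a single boolean expression and how it combines with the EOT test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opcode values for illustration; the real ones come from
 * the BRW_OPCODE_* definitions in the i965 compiler headers. */
enum {
   OP_ILLEGAL = 0,
   OP_MOV     = 1,
   OP_SEND    = 49,
   OP_SENDC   = 50,
   OP_SENDS   = 51,
   OP_SENDSC  = 52,
};

/* The if/else returning true/false collapses to a single expression. */
static bool
is_send(uint32_t opcode)
{
   return opcode == OP_SEND || opcode == OP_SENDC ||
          opcode == OP_SENDS || opcode == OP_SENDSC;
}

/* Stop after a send with EOT set, or on opcode 0. */
static bool
ends_program(uint32_t opcode, bool eot)
{
   return opcode == OP_ILLEGAL || (is_send(opcode) && eot);
}
```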


[Mesa-dev] [PATCH 1/5] aubinator: Fix compiler warning

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

Add 'const' qualifier to gen_field_iterator::p pointer (Ken)

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/decoder.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/tools/decoder.h b/src/intel/tools/decoder.h
index b46e451..4ab0765 100644
--- a/src/intel/tools/decoder.h
+++ b/src/intel/tools/decoder.h
@@ -47,7 +47,7 @@ struct gen_field_iterator {
struct gen_group *group;
const char *name;
char value[128];
-   uint32_t *p;
+   const uint32_t *p;
int i;
 };
 
-- 
2.7.4


[Mesa-dev] [PATCH 3/5] aubinator: Simplify print_dword_val() method

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

Remove the float/dword union and use iter->p[f->start / 32] directly,
since the printf format %08x expects a uint32_t (Ken)

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 811f707..d147225 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -92,17 +92,13 @@ print_dword_val(struct gen_field_iterator *iter, uint64_t offset,
 int *dword_num)
 {
struct gen_field *f;
-   union {
-  uint32_t dw;
-  float f;
-   } v;
 
f = iter->group->fields[iter->i - 1];
-   v.dw = iter->p[f->start / 32];
 
if (*dword_num != (f->start / 32)) {
   printf("0x%08lx:  0x%08x : Dword %d\n",
- offset + 4 * (f->start / 32), v.dw, f->start / 32);
+ offset + 4 * (f->start / 32), iter->p[f->start / 32], f->start / 
+32);
   *dword_num = (f->start / 32);
}
 }
-- 
2.7.4
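
A related point on format specifiers: the printf() call above prints a uint64_t offset with %08lx, which only matches where long is 64 bits wide. A minimal sketch of the same line using the <inttypes.h> macros follows; format_dword() is a hypothetical helper written for this note, not part of aubinator.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats one dword line into buf, similar in spirit to print_dword_val().
 * PRIx64 matches uint64_t on every platform, whereas %lx only matches
 * where long is 64 bits wide. Returns the snprintf() result. */
static int
format_dword(char *buf, size_t size, uint64_t offset,
             const uint32_t *p, unsigned dword)
{
   return snprintf(buf, size, "0x%08" PRIx64 ":  0x%08" PRIx32 " : Dword %u",
                   offset + 4 * (uint64_t)dword, p[dword], dword);
}
```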


[Mesa-dev] [PATCH 5/5] aubinator: Remove bogus "end" parameter in gen_disasm_disassemble()

2016-09-08 Thread Sirisha Gandikota

From: Sirisha Gandikota 

Previously, the loop pretended to iterate over instructions from "start" to
"end", but the callers always passed 8192 for end, which is some huge bogus
value. The real loop termination condition is a send with EOT or opcode 0. (Ken)

Signed-off-by: Sirisha Gandikota 
---
 src/intel/tools/aubinator.c  | 12 ++--
 src/intel/tools/disasm.c |  8 +---
 src/intel/tools/gen_disasm.h |  2 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index d147225..fffb1b6 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -304,7 +304,7 @@ handle_media_interface_descriptor_load(struct gen_spec *spec, uint32_t *p)
   }
 
   insns = (struct brw_instruction *) (gtt + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
 
   dump_samplers(spec, descriptors[3] & ~0x1f);
   dump_binding_table(spec, descriptors[4] & ~0x1f);
@@ -402,7 +402,7 @@ handle_3dstate_vs(struct gen_spec *spec, uint32_t *p)
  instruction_base, start);
 
   insns = (struct brw_instruction *) (gtt + instruction_base + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
}
 }
 
@@ -426,7 +426,7 @@ handle_3dstate_hs(struct gen_spec *spec, uint32_t *p)
  instruction_base, start);
 
   insns = (struct brw_instruction *) (gtt + instruction_base + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
}
 }
 
@@ -520,21 +520,21 @@ handle_3dstate_ps(struct gen_spec *spec, uint32_t *p)
printf("  Kernel[0] %s\n", k0);
if (k0 != unused) {
   insns = (struct brw_instruction *) (gtt + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
}
 
start = instruction_base + (p[k1_offset] & mask);
printf("  Kernel[1] %s\n", k1);
if (k1 != unused) {
   insns = (struct brw_instruction *) (gtt + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
}
 
start = instruction_base + (p[k2_offset] & mask);
printf("  Kernel[2] %s\n", k2);
if (k2 != unused) {
   insns = (struct brw_instruction *) (gtt + start);
-  gen_disasm_disassemble(disasm, insns, 0, 8192, stdout);
+  gen_disasm_disassemble(disasm, insns, 0, stdout);
}
 }
 
diff --git a/src/intel/tools/disasm.c b/src/intel/tools/disasm.c
index 13e4ce2..7b8bf69 100644
--- a/src/intel/tools/disasm.c
+++ b/src/intel/tools/disasm.c
@@ -48,14 +48,16 @@ is_send(uint32_t opcode)
 }
 
 void
-gen_disasm_disassemble(struct gen_disasm *disasm, void *assembly, int start,
-   int end, FILE *out)
+gen_disasm_disassemble(struct gen_disasm *disasm, void *assembly,
+   int start, FILE *out)
 {
struct gen_device_info *devinfo = &disasm->devinfo;
bool dump_hex = false;
uint32_t opcode = 0;
+   int offset = start;
 
-   for (int offset = start; offset < end;) {
+   /* This loop exits when send-with-EOT or when opcode is 0 */
+   while (true) {
   brw_inst *insn = assembly + offset;
   brw_inst uncompacted;
   bool compacted = brw_inst_cmpt_control(devinfo, insn);
diff --git a/src/intel/tools/gen_disasm.h b/src/intel/tools/gen_disasm.h
index af6654f..24b56c9 100644
--- a/src/intel/tools/gen_disasm.h
+++ b/src/intel/tools/gen_disasm.h
@@ -28,7 +28,7 @@ struct gen_disasm;
 
 struct gen_disasm *gen_disasm_create(int pciid);
 void gen_disasm_disassemble(struct gen_disasm *disasm,
-void *assembly, int start, int end, FILE *out);
+void *assembly, int start, FILE *out);
 
 void gen_disasm_destroy(struct gen_disasm *disasm);
 
-- 
2.7.4
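
The loop shape this patch ends up with, running until a terminator is seen, can be sketched on a toy instruction stream. The struct, the function name and in particular the max_insts bound are assumptions made for this sketch; the patch itself drops the bound entirely and relies on the terminator alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction record; real EU instructions are packed bitfields. */
struct toy_inst {
   uint32_t opcode; /* 0 is treated as "no instruction" */
   bool is_send;
   bool eot;
};

/* Walks insts until a send with EOT or opcode 0 is seen and returns how
 * many instructions were visited, terminator included. max_insts is a
 * hypothetical safety bound for streams with no terminator. */
static size_t
walk_program(const struct toy_inst *insts, size_t max_insts)
{
   size_t n = 0;

   while (n < max_insts) {
      const struct toy_inst *inst = &insts[n++];

      if (inst->opcode == 0 || (inst->is_send && inst->eot))
         break;
   }
   return n;
}
```

A bound of this kind only makes sense if the caller actually knows the size of the mapped buffer, which is a different thing from the constant 8192 that was removed.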


Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Roland Scheidegger

Am 08.09.2016 um 23:43 schrieb Rob Clark:
> On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  wrote:
>> Am 08.09.2016 um 22:30 schrieb Rob Clark:
>>> Support multi-planar YUV for external EGLImage's (currently just in the
>>> dma-buf import path) by lowering to multiple texture fetch's for each
>>> plane and CSC in shader.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>>>  src/gallium/include/pipe/p_state.h  |   9 +++
>>>  src/gallium/include/state_tracker/st_api.h  |   3 +
>>>  src/gallium/state_trackers/dri/dri2.c   | 119 
>>> +++-
>>>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>>>  src/mesa/main/mtypes.h  |  16 
>>>  src/mesa/program/ir_to_mesa.cpp |   1 +
>>>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>>>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>>>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>>>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>>>  src/mesa/state_tracker/st_context.c |   7 +-
>>>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>>>  src/mesa/state_tracker/st_manager.c |   1 +
>>>  src/mesa/state_tracker/st_program.c |  35 
>>>  src/mesa/state_tracker/st_program.h |  37 +
>>>  src/mesa/state_tracker/st_texture.h |  21 +
>>>  18 files changed, 362 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
>>> b/src/gallium/auxiliary/util/u_inlines.h
>>> index c2a0b08..b7b8313 100644
>>> --- a/src/gallium/auxiliary/util/u_inlines.h
>>> +++ b/src/gallium/auxiliary/util/u_inlines.h
>>> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
>>> struct pipe_resource *tex)
>>> struct pipe_resource *old_tex = *ptr;
>>>
>>> if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
>>> -
>>> (debug_reference_descriptor)debug_describe_resource))
>>> +
>>> (debug_reference_descriptor)debug_describe_resource)) {
>>> +  pipe_resource_reference(&old_tex->next, NULL);
>>>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
>>> +   }
>>> *ptr = tex;
>>>  }
>>>
>>> diff --git a/src/gallium/include/pipe/p_state.h 
>>> b/src/gallium/include/pipe/p_state.h
>>> index ebd0337..4a88da6 100644
>>> --- a/src/gallium/include/pipe/p_state.h
>>> +++ b/src/gallium/include/pipe/p_state.h
>>> @@ -498,6 +498,15 @@ struct pipe_resource
>>>
>>> unsigned bind;/**< bitmask of PIPE_BIND_x */
>>> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
>>> +
>>> +   /**
>>> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
>>> +* next plane.
>>> +*
>>> +* TODO might be useful for dealing w/ z32s8 too, since at least a
>>> +* couple drivers split these out into separate buffers internally.
>>> +*/
>>> +   struct pipe_resource *next;
>> Would it be possible to stuff the multiple resources somewhere else
>> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
>> other resources that way.
>> (Also, it's odd since things are mostly lowered really outside of
>> gallium so it's odd that some of the yuv state still sneaks in there.)
> 
> I did originally start down the path of making __DRIImage have
> multiple pipe_resource's.. I'm not really sure that would end up
> better, and it certainly would be more invasive.
> 
> Maybe we should just make that something like 'void *stpriv' to let st
> stick whatever it wants in there.  That seems more sane than making
> the st use a hashtable to map the rsc back to something else.
Can't you just put 3 resources in somewhere without pointers?
I just think it really should be outside gallium interfaces. The
lowering is all done by the state tracker, hence having those bits there
referencing other resources in gallium looks wrong to me.


> 
> One note I would make, is that I think at least both radeon and
> freedreno (and maybe others) already do similar things (in driver
> backend) for formats like z32_x24s8, since from hw PoV, they are
> actually two separate buffers, while from GL and gallium API they are
> conceptually a single buffer.  Having a chain of resources, like I did
> for planar YUV, seems like a reasonable approach if we ever wanted to
> refactor some of that duplicated logic into mesa/st.  (Not that it is
> high on my todo list.. just pointing out there are other cases where
> we want to treat multiple buffers as one logical buffer.)
Honestly, I don't think that's a good idea. Unless I'm mistaken even for
things like z24s8 some hw has separate buffers, and drivers should (and
do) handle that internally. This is just something which can't be
abstracted away at the interface level reasonably, and drivers just need
to deal wit

Re: [Mesa-dev] [PATCH 1/3] nir/gcm: Call nir_metadata_preserve

2016-09-08 Thread Kenneth Graunke

On Tuesday, September 6, 2016 11:08:56 AM PDT Jason Ekstrand wrote:
> ---
>  src/compiler/nir/nir_opt_gcm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/compiler/nir/nir_opt_gcm.c b/src/compiler/nir/nir_opt_gcm.c
> index 84e32ef..02a9348 100644
> --- a/src/compiler/nir/nir_opt_gcm.c
> +++ b/src/compiler/nir/nir_opt_gcm.c
> @@ -483,6 +483,9 @@ opt_gcm_impl(nir_function_impl *impl)
> }
>  
> ralloc_free(state.blocks);
> +
> +   nir_metadata_preserve(impl, nir_metadata_block_index |
> +   nir_metadata_dominance);
>  }
>  
>  void
> 

Patches 1-2 are:
Reviewed-by: Kenneth Graunke 



Re: [Mesa-dev] [Bug 97261] vaapi u/v wrong order since vl/util: add copy func for yv12image to nv12surface

2016-09-08 Thread Zhang, Boyuan

Hi Andy,

I verified the bug. You are correct: the u and v are reversed. I checked your 
patch and confirmed it fixes the issue. Patch is Reviewed-by: Boyuan Zhang 

Thanks a lot for the help!

Regards,
Boyuan

From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of 
bugzilla-dae...@freedesktop.org
Sent: August-09-16 9:52 AM
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] [Bug 97261] vaapi u/v wrong order since vl/util: add copy 
func for yv12image to nv12surface

Bug ID: 97261
Summary: vaapi u/v wrong order since vl/util: add copy func for yv12image to nv12surface
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Mesa core
Assignee: mesa-dev@lists.freedesktop.org
Reporter: adf.li...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 125638
small test vid

As noted at the time, though Boyuan said he couldn't reproduce it, for me

vl/util: add copy func for yv12image to nv12surface

gets u and v reversed for both yv12 and I420 inputs, whether encoding or
playing.

Both gstreamer and mpv are affected.

Testing playback with the attached small test vid, which instantly shows the
issue, using either

VAAPI_DISABLE_INTERLACE=true mpv --vo=vaapi uvtest.mkv

or

gst-launch-1.0 filesrc location=uvtest.mkv ! matroskademux ! avdec_h264 ! vaapisink

Of course any test that outputs nv12 works OK, as it avoids the conversion.

It seems that the new util function expects input to be yuv, but it actually
gets yvu.

I sent a patch to the list for this -

https://lists.freedesktop.org/archives/mesa-dev/2016-July/124695.html

Filing bug/test to see if anyone else can reproduce it.
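
For readers unfamiliar with the plane orders involved, here is a minimal sketch of the chroma step of a YV12 to NV12 conversion. It is not the Mesa vl/util function; the function name is made up, and it assumes tightly packed planes with no stride padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleaves the chroma planes of a YV12 image into an NV12 UV plane.
 *
 * YV12 stores planes in the order Y, V, U; NV12 stores Y followed by one
 * plane of interleaved U, V pairs. Reading the YV12 chroma planes in
 * U, V order is exactly the kind of mix-up that swaps the two channels.
 *
 * chroma points at the first byte after the Y plane; chroma_size is the
 * number of samples in one chroma plane, (width / 2) * (height / 2). */
static void
yv12_chroma_to_nv12(const uint8_t *chroma, size_t chroma_size, uint8_t *uv)
{
   const uint8_t *v = chroma;               /* V plane comes first */
   const uint8_t *u = chroma + chroma_size; /* U plane comes second */

   for (size_t i = 0; i < chroma_size; i++) {
      uv[2 * i + 0] = u[i];
      uv[2 * i + 1] = v[i];
   }
}
```

I420 differs from YV12 only in that the U plane precedes the V plane, so a routine written for one order and fed the other produces swapped colours of the kind described in this report.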


You are receiving this mail because:

  *   You are the QA Contact for the bug.
  *   You are the assignee for the bug.

[Mesa-dev] [Bug 97643] Shader crashes radeon driver and brings the whole system down

2016-09-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97643

--- Comment #4 from Cris  ---
I don't know then. Is there a way for me to diagnose it myself?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 09/23] glsl: Convert constant_expression to the util hash table

2016-09-08 Thread Thomas Helland

2016-08-16 22:10 GMT+02:00 Thomas Helland :
> Signed-off-by: Thomas Helland 
> ---
>  src/compiler/glsl/ir_constant_expression.cpp | 24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
> b/src/compiler/glsl/ir_constant_expression.cpp
> index 6329acd..16c8fac 100644
> --- a/src/compiler/glsl/ir_constant_expression.cpp
> +++ b/src/compiler/glsl/ir_constant_expression.cpp
> @@ -39,7 +39,7 @@
>  #include "util/half_float.h"
>  #include "ir.h"
>  #include "compiler/glsl_types.h"
> -#include "program/hash_table.h"
> +#include "util/hash_table.h"
>
>  static float
>  dot_f(ir_constant *op0, ir_constant *op1)
> @@ -457,7 +457,8 @@ constant_referenced(const ir_dereference *deref,
>const ir_dereference_variable *const dv =
>   (const ir_dereference_variable *) deref;
>
> -  store = (ir_constant *) hash_table_find(variable_context, dv->var);
> +  hash_entry *entry = _mesa_hash_table_search(variable_context, dv->var);
> +  store = (ir_constant *) entry->data;
>break;
> }
>
> @@ -1806,9 +1807,10 @@ 
> ir_dereference_variable::constant_expression_value(struct hash_table 
> *variable_c
>
> /* Give priority to the context hashtable, if it exists */
> if (variable_context) {
> -  ir_constant *value = (ir_constant *)hash_table_find(variable_context, 
> var);
> -  if(value)
> - return value;
> +  hash_entry *entry = _mesa_hash_table_search(variable_context, var);
> +
> +  if(entry)
> + return (ir_constant *) entry->data;
> }
>
> /* The constant_value of a uniform variable is its initializer,
> @@ -1926,7 +1928,7 @@ bool 
> ir_function_signature::constant_expression_evaluate_expression_list(const s
>   /* (declare () type symbol) */
>case ir_type_variable: {
>   ir_variable *var = inst->as_variable();
> - hash_table_insert(variable_context, ir_constant::zero(this, 
> var->type), var);
> + _mesa_hash_table_insert(variable_context, var, 
> ir_constant::zero(this, var->type));
>   break;
>}
>
> @@ -2050,8 +2052,8 @@ 
> ir_function_signature::constant_expression_value(exec_list 
> *actual_parameters, s
>  * We expect the correctness of the number of parameters to have
>  * been checked earlier.
>  */
> -   hash_table *deref_hash = hash_table_ctor(8, hash_table_pointer_hash,
> -hash_table_pointer_compare);
> +   hash_table *deref_hash = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
> +_mesa_key_pointer_equal);
>
> /* If "origin" is non-NULL, then the function body is there.  So we
>  * have to use the variable objects from the object with the body,
> @@ -2062,13 +2064,13 @@ 
> ir_function_signature::constant_expression_value(exec_list 
> *actual_parameters, s
> foreach_in_list(ir_rvalue, n, actual_parameters) {
>ir_constant *constant = n->constant_expression_value(variable_context);
>if (constant == NULL) {
> - hash_table_dtor(deref_hash);
> + _mesa_hash_table_destroy(deref_hash, NULL);
>   return NULL;
>}
>
>
>ir_variable *var = (ir_variable *)parameter_info;
> -  hash_table_insert(deref_hash, constant, var);
> +  _mesa_hash_table_insert(deref_hash, constant, var);

This is the cause of the regressions.
The key/data argument order is inverted between the two hash table
implementations (hash_table_insert() takes data, then key, while
_mesa_hash_table_insert() takes key, then data), but the arguments
here were not swapped. No wonder weird things happen.
I will do a complete piglit run (except deqp, etc.) and send
an updated patch to the list, likely sometime tomorrow.
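
To make the failure mode concrete, here is a minimal sketch using a toy
pointer-keyed table. The toy_* names are hypothetical stand-ins, not
Mesa's API; only the argument order mirrors the two insert functions
in the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy pointer-keyed table; a stand-in, not Mesa's implementation. */
#define TOY_CAPACITY 8

struct toy_table {
   const void *keys[TOY_CAPACITY];
   void *data[TOY_CAPACITY];
   size_t count;
};

/* New-style order, as in _mesa_hash_table_insert(): key, then data. */
void
toy_insert(struct toy_table *ht, const void *key, void *data)
{
   assert(ht->count < TOY_CAPACITY);
   ht->keys[ht->count] = key;
   ht->data[ht->count] = data;
   ht->count++;
}

/* Old-style order, as in hash_table_insert(): data, then key. */
void
toy_insert_old_order(struct toy_table *ht, void *data, const void *key)
{
   toy_insert(ht, key, data);
}

/* Returns the stored data, or NULL when the key is absent. */
void *
toy_search(const struct toy_table *ht, const void *key)
{
   for (size_t i = 0; i < ht->count; i++) {
      if (ht->keys[i] == key)
         return ht->data[i];
   }
   return NULL;
}
```

Calling toy_insert(ht, constant, var) with the arguments left in the
old order stores the constant as the key, so a later lookup by
variable misses. Swapping the two arguments restores the old behavior.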

>
>parameter_info = parameter_info->next;
> }
> @@ -2081,7 +2083,7 @@ 
> ir_function_signature::constant_expression_value(exec_list 
> *actual_parameters, s
> if (constant_expression_evaluate_expression_list(origin ? origin->body : 
> body, deref_hash, &result) && result)
>result = result->clone(ralloc_parent(this), NULL);
>
> -   hash_table_dtor(deref_hash);
> +   _mesa_hash_table_destroy(deref_hash, NULL);
>
> return result;
>  }
> --
> 2.9.2
>

Re: [Mesa-dev] [PATCH] i965/fs: Fail the shader compile instead of asserting when we can't spill

2016-09-08 Thread Francisco Jerez

Jason Ekstrand  writes:

> On Thu, Sep 8, 2016 at 2:39 PM, Francisco Jerez 
> wrote:
>
>> Jason Ekstrand  writes:
>>
>> > Blorp doesn't handle spilling so we set allow_spilling to false in that
>> > case.  The blorp 16x MSAA resolve shader spills in 16-wide but not
>> 8-wide.
>> > This commit makes it so that we fail the 16-wide compile and successfully
>> > fall back to 8-wide instead of just assert-failing when trying to compile
>> > the 16-wide shader.
>> >
>> > Signed-off-by: Jason Ekstrand 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +--
>> >  1 file changed, 5 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > index d0b55ae..73aa5d2 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > @@ -5906,6 +5906,11 @@ fs_visitor::allocate_registers(bool
>> allow_spilling)
>> > }
>> >
>> > if (!allocated_without_spills) {
>> > +  if (!allow_spilling) {
>> > + assert(dispatch_width > 8);
>>
>> Not sure it makes sense for the register allocator to set different
>> requirements based on the build dispatch width, wouldn't it make sense
>> to just fail() here consistently if allow_spilling is false regardless
>> of the dispatch width, and let blorp assert fail if none of the dispatch
>> widths it tries out compiles successfully?
>>
>
> Agreed.  I'll drop the assert.
>

Thanks!  With the assert dropped, the patch is:

Reviewed-by: Francisco Jerez 

>
>> > + fail("Failure to register allocate and spilling is not
>> allowed.");
>> > +  }
>> > +
>> >/* We assume that any spilling is worse than just dropping back to
>> > * SIMD8.  There's probably actually some intermediate point where
>> > * SIMD16 with a couple of spills is still better.
>> > @@ -5930,8 +5935,6 @@ fs_visitor::allocate_registers(bool
>> allow_spilling)
>> >}
>> > }
>> >
>> > -   assert(last_scratch == 0 || allow_spilling);
>> > -
>> > /* This must come after all optimization and register allocation,
>> since
>> >  * it inserts dead code that happens to have side effects, and it
>> does
>> >  * so based on the actual physical registers in use.
>> > --
>> > 2.5.0.400.gff86faf
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>



Re: [Mesa-dev] [PATCH] i965/fs: Fail the shader compile instead of asserting when we can't spill

2016-09-08 Thread Jason Ekstrand

On Thu, Sep 8, 2016 at 2:39 PM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > Blorp doesn't handle spilling so we set allow_spilling to false in that
> > case.  The blorp 16x MSAA resolve shader spills in 16-wide but not
> 8-wide.
> > This commit makes it so that we fail the 16-wide compile and successfully
> > fall back to 8-wide instead of just assert-failing when trying to compile
> > the 16-wide shader.
> >
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index d0b55ae..73aa5d2 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -5906,6 +5906,11 @@ fs_visitor::allocate_registers(bool
> allow_spilling)
> > }
> >
> > if (!allocated_without_spills) {
> > +  if (!allow_spilling) {
> > + assert(dispatch_width > 8);
>
> Not sure it makes sense for the register allocator to set different
> requirements based on the build dispatch width, wouldn't it make sense
> to just fail() here consistently if allow_spilling is false regardless
> of the dispatch width, and let blorp assert fail if none of the dispatch
> widths it tries out compiles successfully?
>

Agreed.  I'll drop the assert.


> > + fail("Failure to register allocate and spilling is not
> allowed.");
> > +  }
> > +
> >/* We assume that any spilling is worse than just dropping back to
> > * SIMD8.  There's probably actually some intermediate point where
> > * SIMD16 with a couple of spills is still better.
> > @@ -5930,8 +5935,6 @@ fs_visitor::allocate_registers(bool
> allow_spilling)
> >}
> > }
> >
> > -   assert(last_scratch == 0 || allow_spilling);
> > -
> > /* This must come after all optimization and register allocation,
> since
> >  * it inserts dead code that happens to have side effects, and it
> does
> >  * so based on the actual physical registers in use.
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Re: [Mesa-dev] [PATCH] i965/fs: Fail the shader compile instead of asserting when we can't spill

2016-09-08 Thread Francisco Jerez

Jason Ekstrand  writes:

> Blorp doesn't handle spilling so we set allow_spilling to false in that
> case.  The blorp 16x MSAA resolve shader spills in 16-wide but not 8-wide.
> This commit makes it so that we fail the 16-wide compile and successfully
> fall back to 8-wide instead of just assert-failing when trying to compile
> the 16-wide shader.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index d0b55ae..73aa5d2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5906,6 +5906,11 @@ fs_visitor::allocate_registers(bool allow_spilling)
> }
>  
> if (!allocated_without_spills) {
> +  if (!allow_spilling) {
> + assert(dispatch_width > 8);

I'm not sure it makes sense for the register allocator to set different
requirements based on the dispatch width.  Wouldn't it make sense to
just fail() here consistently whenever allow_spilling is false,
regardless of the dispatch width, and let blorp assert-fail if none of
the dispatch widths it tries compiles successfully?
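
Roughly what I have in mind, as a self-contained sketch. The toy_*
names and the struct are hypothetical stand-ins for the visitor and
its caller; only the fail() message is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one compile attempt at a given width. */
struct toy_compile {
   unsigned dispatch_width;
   bool needs_spill;       /* would register allocation have to spill? */
   bool failed;
   const char *fail_msg;
};

/* Record a failure instead of aborting; keeps the first message. */
void
toy_fail(struct toy_compile *c, const char *msg)
{
   if (!c->failed) {
      c->failed = true;
      c->fail_msg = msg;
   }
}

/* The same rule applies at every dispatch width. */
bool
toy_allocate_registers(struct toy_compile *c, bool allow_spilling)
{
   if (c->needs_spill && !allow_spilling)
      toy_fail(c, "Failure to register allocate and spilling is not allowed.");
   return !c->failed;
}

/* The caller tries the widest attempt first and returns the width that
 * compiled, or 0 if none did.  A caller such as blorp would assert on 0.
 */
unsigned
toy_compile_widest(struct toy_compile *attempts, size_t n,
                   bool allow_spilling)
{
   for (size_t i = 0; i < n; i++) {
      if (toy_allocate_registers(&attempts[i], allow_spilling))
         return attempts[i].dispatch_width;
   }
   return 0;
}
```

The point of the split is that the allocator only reports failure and
the decision about whether that is fatal stays with the caller.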

> + fail("Failure to register allocate and spilling is not allowed.");
> +  }
> +
>/* We assume that any spilling is worse than just dropping back to
> * SIMD8.  There's probably actually some intermediate point where
> * SIMD16 with a couple of spills is still better.
> @@ -5930,8 +5935,6 @@ fs_visitor::allocate_registers(bool allow_spilling)
>}
> }
>  
> -   assert(last_scratch == 0 || allow_spilling);
> -
> /* This must come after all optimization and register allocation, since
>  * it inserts dead code that happens to have side effects, and it does
>  * so based on the actual physical registers in use.
> -- 
> 2.5.0.400.gff86faf
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Rob Clark

On Thu, Sep 8, 2016 at 5:11 PM, Roland Scheidegger  wrote:
> Am 08.09.2016 um 22:30 schrieb Rob Clark:
>> Support multi-planar YUV for external EGLImage's (currently just in the
>> dma-buf import path) by lowering to multiple texture fetch's for each
>> plane and CSC in shader.
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>>  src/gallium/include/pipe/p_state.h  |   9 +++
>>  src/gallium/include/state_tracker/st_api.h  |   3 +
>>  src/gallium/state_trackers/dri/dri2.c   | 119 
>> +++-
>>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>>  src/mesa/main/mtypes.h  |  16 
>>  src/mesa/program/ir_to_mesa.cpp |   1 +
>>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>>  src/mesa/state_tracker/st_context.c |   7 +-
>>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>>  src/mesa/state_tracker/st_manager.c |   1 +
>>  src/mesa/state_tracker/st_program.c |  35 
>>  src/mesa/state_tracker/st_program.h |  37 +
>>  src/mesa/state_tracker/st_texture.h |  21 +
>>  18 files changed, 362 insertions(+), 27 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
>> b/src/gallium/auxiliary/util/u_inlines.h
>> index c2a0b08..b7b8313 100644
>> --- a/src/gallium/auxiliary/util/u_inlines.h
>> +++ b/src/gallium/auxiliary/util/u_inlines.h
>> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
>> struct pipe_resource *tex)
>> struct pipe_resource *old_tex = *ptr;
>>
>> if (pipe_reference_described(&(*ptr)->reference, &tex->reference,
>> -
>> (debug_reference_descriptor)debug_describe_resource))
>> +
>> (debug_reference_descriptor)debug_describe_resource)) {
>> +  pipe_resource_reference(&old_tex->next, NULL);
>>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
>> +   }
>> *ptr = tex;
>>  }
>>
>> diff --git a/src/gallium/include/pipe/p_state.h 
>> b/src/gallium/include/pipe/p_state.h
>> index ebd0337..4a88da6 100644
>> --- a/src/gallium/include/pipe/p_state.h
>> +++ b/src/gallium/include/pipe/p_state.h
>> @@ -498,6 +498,15 @@ struct pipe_resource
>>
>> unsigned bind;/**< bitmask of PIPE_BIND_x */
>> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
>> +
>> +   /**
>> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
>> +* next plane.
>> +*
>> +* TODO might be useful for dealing w/ z32s8 too, since at least a
>> +* couple drivers split these out into separate buffers internally.
>> +*/
>> +   struct pipe_resource *next;
> Would it be possible to stuff the multiple resources somewhere else
> (__DRIImage ?)? Seems a bit of a hack to have resources referencing
> other resources that way.
> (Also, it's odd since things are mostly lowered really outside of
> gallium so it's odd that some of the yuv state still sneaks in there.)

I did originally start down the path of making __DRIImage have
multiple pipe_resources.  I'm not really sure that would end up
better, and it certainly would be more invasive.

Maybe we should just make that something like 'void *stpriv' to let st
stick whatever it wants in there.  That seems more sane than making
the st use a hashtable to map the rsc back to something else.

One note: I think at least both radeon and freedreno (and maybe
others) already do similar things (in the driver backend) for formats
like z32_x24s8, since from the hw PoV they are actually two separate
buffers, while from the GL and gallium API they are conceptually a
single buffer.  Having a chain of resources, like I did for planar
YUV, seems like a reasonable approach if we ever wanted to refactor
some of that duplicated logic into mesa/st.  (Not that it is high on
my todo list; I'm just pointing out there are other cases where we
want to treat multiple buffers as one logical buffer.)
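
For illustration, the chaining idea boils down to something like the
sketch below.  It is a toy with hypothetical toy_* names and a plain
integer refcount, not the gallium code; the only part mirrored from
the patch is that dropping the last reference to a resource also
drops its reference to the next plane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy refcounted resource with an optional link to the next plane. */
struct toy_resource {
   int refcount;
   struct toy_resource *next;   /* next plane, or NULL */
};

/* Number of resources currently allocated, for demonstration only. */
int toy_live_resources = 0;

/* Create a resource holding one reference.  It takes over the caller's
 * reference to 'next'.
 */
struct toy_resource *
toy_resource_create(struct toy_resource *next)
{
   struct toy_resource *res = calloc(1, sizeof(*res));
   if (!res)
      return NULL;
   res->refcount = 1;
   res->next = next;
   toy_live_resources++;
   return res;
}

/* Point *ptr at res, adjusting both reference counts.  When the old
 * resource loses its last reference, release its next plane first.
 */
void
toy_resource_reference(struct toy_resource **ptr, struct toy_resource *res)
{
   struct toy_resource *old = *ptr;

   if (res)
      res->refcount++;
   if (old && --old->refcount == 0) {
      toy_resource_reference(&old->next, NULL);
      free(old);
      toy_live_resources--;
   }
   *ptr = res;
}
```

With this, releasing the first plane of a Y/UV pair tears down the
whole chain, so the owner only ever tracks one pointer.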

BR,
-R

Re: [Mesa-dev] [PATCH] nir/lower_tex: fix typo with sample_dim

2016-09-08 Thread Jason Ekstrand

On Thu, Sep 8, 2016 at 12:53 PM, Rob Clark  wrote:

> Numeric 2 is actually GLSL_SAMPLER_DIM_3D, which I don't think is what
> was intended.
>
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/nir_lower_tex.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_lower_tex.c
> b/src/compiler/nir/nir_lower_tex.c
> index a405758..0efd443 100644
> --- a/src/compiler/nir/nir_lower_tex.c
> +++ b/src/compiler/nir/nir_lower_tex.c
> @@ -211,7 +211,7 @@ sample_plane(nir_builder *b, nir_tex_instr *tex, int
> plane)
> plane_tex->src[1].src = nir_src_for_ssa(nir_imm_int(b, plane));
> plane_tex->src[1].src_type = nir_tex_src_plane;
> plane_tex->op = nir_texop_tex;
> -   plane_tex->sampler_dim = 2;
> +   plane_tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
>

Ugh... 2 doesn't even map to 2D. :(

Reviewed-by: Jason Ekstrand 
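
A quick way to see the mismatch, using a stand-in enum.  This assumes
the enumerators start at 0 in the order 1D, 2D, 3D, which is
consistent with the commit message; the TOY_* names are hypothetical.

```c
#include <assert.h>

/* Stand-in for the first entries of the sampler dimension enum. */
enum toy_sampler_dim {
   TOY_SAMPLER_DIM_1D = 0,
   TOY_SAMPLER_DIM_2D,
   TOY_SAMPLER_DIM_3D,
};

/* The literal 2 selects the 3D entry, not the 2D one. */
static_assert(TOY_SAMPLER_DIM_3D == 2, "numeric 2 is 3D, not 2D");
static_assert(TOY_SAMPLER_DIM_2D == 1, "2D is numeric 1");

/* Map a raw value to a printable name. */
const char *
toy_sampler_dim_name(int dim)
{
   switch (dim) {
   case TOY_SAMPLER_DIM_1D: return "1D";
   case TOY_SAMPLER_DIM_2D: return "2D";
   case TOY_SAMPLER_DIM_3D: return "3D";
   default:                 return "unknown";
   }
}
```

Using the enumerator by name, as the patch does, avoids this class of
off-by-one entirely.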


> plane_tex->dest_type = nir_type_float;
> plane_tex->coord_components = 2;
>
> --
> 2.7.4
>
>

[Mesa-dev] [PATCH] i965/fs: Fail the shader compile instead of asserting when we can't spill

2016-09-08 Thread Jason Ekstrand

Blorp doesn't handle spilling so we set allow_spilling to false in that
case.  The blorp 16x MSAA resolve shader spills in 16-wide but not 8-wide.
This commit makes it so that we fail the 16-wide compile and successfully
fall back to 8-wide instead of just assert-failing when trying to compile
the 16-wide shader.

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d0b55ae..73aa5d2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5906,6 +5906,11 @@ fs_visitor::allocate_registers(bool allow_spilling)
}
 
if (!allocated_without_spills) {
+  if (!allow_spilling) {
+ assert(dispatch_width > 8);
+ fail("Failure to register allocate and spilling is not allowed.");
+  }
+
   /* We assume that any spilling is worse than just dropping back to
* SIMD8.  There's probably actually some intermediate point where
* SIMD16 with a couple of spills is still better.
@@ -5930,8 +5935,6 @@ fs_visitor::allocate_registers(bool allow_spilling)
   }
}
 
-   assert(last_scratch == 0 || allow_spilling);
-
/* This must come after all optimization and register allocation, since
 * it inserts dead code that happens to have side effects, and it does
 * so based on the actual physical registers in use.
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Roland Scheidegger

Am 08.09.2016 um 22:30 schrieb Rob Clark:
> Support multi-planar YUV for external EGLImage's (currently just in the
> dma-buf import path) by lowering to multiple texture fetch's for each
> plane and CSC in shader.
> 
> Signed-off-by: Rob Clark 
> ---
>  src/gallium/auxiliary/util/u_inlines.h  |   4 +-
>  src/gallium/include/pipe/p_state.h  |   9 +++
>  src/gallium/include/state_tracker/st_api.h  |   3 +
>  src/gallium/state_trackers/dri/dri2.c   | 119 
> +++-
>  src/gallium/state_trackers/dri/dri_screen.c |  11 +++
>  src/mesa/main/mtypes.h  |  16 
>  src/mesa/program/ir_to_mesa.cpp |   1 +
>  src/mesa/state_tracker/st_atom_sampler.c|  41 +-
>  src/mesa/state_tracker/st_atom_shader.c |   3 +
>  src/mesa/state_tracker/st_atom_texture.c|  58 ++
>  src/mesa/state_tracker/st_cb_eglimage.c |  18 +
>  src/mesa/state_tracker/st_context.c |   7 +-
>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
>  src/mesa/state_tracker/st_manager.c |   1 +
>  src/mesa/state_tracker/st_program.c |  35 
>  src/mesa/state_tracker/st_program.h |  37 +
>  src/mesa/state_tracker/st_texture.h |  21 +
>  18 files changed, 362 insertions(+), 27 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_inlines.h 
> b/src/gallium/auxiliary/util/u_inlines.h
> index c2a0b08..b7b8313 100644
> --- a/src/gallium/auxiliary/util/u_inlines.h
> +++ b/src/gallium/auxiliary/util/u_inlines.h
> @@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, 
> struct pipe_resource *tex)
> struct pipe_resource *old_tex = *ptr;
>  
> if (pipe_reference_described(&(*ptr)->reference, &tex->reference, 
> -
> (debug_reference_descriptor)debug_describe_resource))
> +
> (debug_reference_descriptor)debug_describe_resource)) {
> +  pipe_resource_reference(&old_tex->next, NULL);
>old_tex->screen->resource_destroy(old_tex->screen, old_tex);
> +   }
> *ptr = tex;
>  }
>  
> diff --git a/src/gallium/include/pipe/p_state.h 
> b/src/gallium/include/pipe/p_state.h
> index ebd0337..4a88da6 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -498,6 +498,15 @@ struct pipe_resource
>  
> unsigned bind;/**< bitmask of PIPE_BIND_x */
> unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
> +
> +   /**
> +* For planar images, ie. YUV EGLImage external, etc, pointer to the
> +* next plane.
> +*
> +* TODO might be useful for dealing w/ z32s8 too, since at least a
> +* couple drivers split these out into separate buffers internally.
> +*/
> +   struct pipe_resource *next;
Would it be possible to stuff the multiple resources somewhere else
(__DRIImage?)? Seems a bit of a hack to have resources referencing
other resources that way.
(Also, since things are mostly lowered outside of gallium, it's odd
that some of the YUV state still sneaks in there.)

Roland


Re: [Mesa-dev] [PATCH 02/11] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-08 Thread Ian Romanick

On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/mesa/main/api_validate.c | 94 
> 
>  src/mesa/main/api_validate.h |  4 ++
>  src/mesa/main/compute.c  | 17 
>  src/mesa/main/context.c  |  6 +++
>  src/mesa/main/dd.h   |  9 
>  src/mesa/main/extensions_table.h |  1 +
>  src/mesa/main/get.c  | 12 +
>  src/mesa/main/get_hash_params.py |  3 ++
>  src/mesa/main/mtypes.h   | 23 +-
>  src/mesa/main/shaderapi.c|  1 +
>  src/mesa/main/shaderobj.c|  2 +
>  11 files changed, 171 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index b35751e..9379015 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -1096,6 +1096,7 @@ GLboolean
>  _mesa_validate_DispatchCompute(struct gl_context *ctx,
> const GLuint *num_groups)
>  {
> +   struct gl_shader_program *prog;
> int i;
> FLUSH_CURRENT(ctx, 0);
>  
> @@ -1128,6 +1129,86 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
>}
> }
>  
> +   /* From the ARB_compute_variable_group_size specification:
> +*
> +* "An INVALID_OPERATION error is generated by DispatchCompute if the 
> active
> +* program for the compute shader stage has a variable work group size."
> +*/

There has been a lot of debate about formatting spec quotations.  The
one thing where I think everyone agrees is formatting the first line.
Please use

The ARB_compute_variable_group_size spec says:

That makes it much easier to grep the code for spec quotations.

This comment applies to the spec quotations below too.

> +   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
> +   if (prog->Comp.LocalSizeVariable) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  "glDispatchCompute(variable work group size forbidden)");
> +  return GL_FALSE;
> +   }
> +
> +   return GL_TRUE;
> +}
> +
> +GLboolean
> +_mesa_validate_DispatchComputeGroupSizeARB(struct gl_context *ctx,
> +   const GLuint *num_groups,
> +   const GLuint *group_size)
> +{
> +   struct gl_shader_program *prog;
> +   GLuint64 total_invocations = 1;
> +   int i;
> +
> +   FLUSH_CURRENT(ctx, 0);
> +
> +   if (!check_valid_to_compute(ctx, "glDispatchComputeGroupSizeARB"))
> +  return GL_FALSE;
> +
> +   for (i = 0; i < 3; i++) {
> +  /* From the ARB_compute_variable_group_size specification:
> +   *
> +   * "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB 
> if
> +   * any of <group_size_x>, <group_size_y>, or <group_size_z> is less than
> +   * or equal to zero or greater than the maximum local work group size 
> for
> +   * compute shaders with variable group size
> +   * (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in the corresponding 
> dimension."
> +   *
> +   * However, the "less than" is a spec bug because they are declared as
> +   * unsigned integers.
> +   */
> +  if (group_size[i] == 0 ||
> +  group_size[i] > ctx->Const.MaxComputeVariableGroupSize[i]) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "glDispatchComputeGroupSizeARB(group_size_%c)", 'x' + 
> i);
> + return GL_FALSE;
> +  }
> +
> +  /* From the ARB_compute_variable_group_size specification:
> +   *
> +   * "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB 
> if
> +   * the product of <group_size_x>, <group_size_y>, and <group_size_z>
> +   * exceeds the implementation-dependent maximum local work group
> +   * invocation count for compute shaders with variable group size
> +   * (MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB)."
> +   */
> +  total_invocations *= group_size[i];
> +  if (total_invocations > ctx->Const.MaxComputeVariableGroupInvocations) 
> {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "glDispatchComputeGroupSizeARB(product of local_sizes "
> + "exceeds MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB 
> (%d))",
> + ctx->Const.MaxComputeVariableGroupInvocations);
> + return GL_FALSE;
> +  }

This check should happen after the loop, and you should also log the
value of total_invocations.  Something like:

   _mesa_error(ctx, GL_INVALID_VALUE,
               "glDispatchComputeGroupSizeARB(product of local_sizes "
               "exceeds MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB "
               "(%" PRIu64 " > %u))",
               total_invocations,
               ctx->Const.MaxComputeVariableGroupInvocations);
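
Put together, the shape I am suggesting looks roughly like this
standalone sketch.  The toy_* names, the result enum, and the limit
parameters are hypothetical and stand in for the context constants.
The sketch also assumes the per-dimension limits are small enough
that the product of three sizes fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_result {
   TOY_OK,
   TOY_BAD_GROUP_SIZE,
   TOY_TOO_MANY_INVOCATIONS,
};

/* Validate each dimension inside the loop, then check the product
 * once after the loop.  The product is written to *total_out (if
 * non-NULL) whenever all three dimensions are valid, so the caller
 * can log it.
 */
enum toy_result
toy_validate_group_size(const uint32_t group_size[3],
                        const uint32_t max_group_size[3],
                        uint32_t max_invocations,
                        uint64_t *total_out)
{
   uint64_t total = 1;

   assert(group_size != NULL && max_group_size != NULL);

   for (int i = 0; i < 3; i++) {
      if (group_size[i] == 0 || group_size[i] > max_group_size[i])
         return TOY_BAD_GROUP_SIZE;
      total *= group_size[i];
   }

   if (total_out)
      *total_out = total;

   if (total > max_invocations)
      return TOY_TOO_MANY_INVOCATIONS;

   return TOY_OK;
}
```

Checking once after the loop keeps the two error conditions separate
and makes the final product available for the error message.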

> +
> +  /* From the ARB_compute_variable_group_size specification:
> +   *
> +   * "An INVALID_OPERATION error is generated by
> +   * DispatchComputeGroupSizeARB if the active program for the compute
> +   * shader stage has a fixed work group size."
> +

Re: [Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-08 Thread Ian Romanick

On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  .../glapi/gen/ARB_compute_variable_group_size.xml  | 25 
> ++
>  src/mapi/glapi/gen/Makefile.am |  1 +
>  src/mapi/glapi/gen/gl_API.xml  |  2 ++
>  src/mesa/main/compute.c|  8 +++
>  src/mesa/main/compute.h|  5 +
>  src/mesa/main/tests/dispatch_sanity.cpp|  3 +++
>  6 files changed, 44 insertions(+)
>  create mode 100644 src/mapi/glapi/gen/ARB_compute_variable_group_size.xml
> 
> diff --git a/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml 
> b/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml
> new file mode 100644
> index 000..b21c52f
> --- /dev/null
> +++ b/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml
> @@ -0,0 +1,25 @@
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +  
> +  
> +  
> +  
> +
> +  
> +
> +
> +
> +
> +
> +
> +  
> +
> +
> +
> diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
> index 0d7c338..49fdfe3 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -117,6 +117,7 @@ API_XML = \
>   ARB_color_buffer_float.xml \
>   ARB_compressed_texture_pixel_storage.xml \
>   ARB_compute_shader.xml \
> + ARB_compute_variable_group_size.xml \
>   ARB_copy_buffer.xml \
>   ARB_copy_image.xml \
>   ARB_debug_output.xml \
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index c39aa22..9ad3b60 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -8258,6 +8258,8 @@
>  
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
>  
> + xmlns:xi="http://www.w3.org/2001/XInclude"/>
> +

This (extension #153) should go before ARB_indirect_parameters
(extension #154), and the "ARB extensions 149 - 153" should be changed
to "ARB extensions 149 - 152".

With that fixed, this patch is

Reviewed-by: Ian Romanick 

>  
>  
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
> diff --git a/src/mesa/main/compute.c b/src/mesa/main/compute.c
> index b71430f..b052bae 100644
> --- a/src/mesa/main/compute.c
> +++ b/src/mesa/main/compute.c
> @@ -60,3 +60,11 @@ _mesa_DispatchComputeIndirect(GLintptr indirect)
>  
> ctx->Driver.DispatchComputeIndirect(ctx, indirect);
>  }
> +
> +void GLAPIENTRY
> +_mesa_DispatchComputeGroupSizeARB(GLuint num_groups_x, GLuint num_groups_y,
> +  GLuint num_groups_z, GLuint group_size_x,
> +  GLuint group_size_y, GLuint group_size_z)
> +{
> +
> +}
> diff --git a/src/mesa/main/compute.h b/src/mesa/main/compute.h
> index 0cc034f..8018bbb 100644
> --- a/src/mesa/main/compute.h
> +++ b/src/mesa/main/compute.h
> @@ -35,4 +35,9 @@ _mesa_DispatchCompute(GLuint num_groups_x,
>  extern void GLAPIENTRY
>  _mesa_DispatchComputeIndirect(GLintptr indirect);
>  
> +extern void GLAPIENTRY
> +_mesa_DispatchComputeGroupSizeARB(GLuint num_groups_x, GLuint num_groups_y,
> +  GLuint num_groups_z, GLuint group_size_x,
> +  GLuint group_size_y, GLuint group_size_z);
> +
>  #endif
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 42fe61a..7faeabe 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -942,6 +942,9 @@ const struct function common_desktop_functions_possible[] 
> = {
> { "glDispatchCompute", 43, -1 },
> { "glDispatchComputeIndirect", 43, -1 },
>  
> +   /* GL_ARB_compute_variable_group_size */
> +   { "glDispatchComputeGroupSizeARB", 43, -1 },
> +
> /* GL_EXT_polygon_offset_clamp */
> { "glPolygonOffsetClampEXT", 11, -1 },
>  
> 


Re: [Mesa-dev] [PATCH] st/va: also honors interlaced preference when providing a video format

2016-09-08 Thread Zhang, Boyuan

Hi Leo, Christian and Julien,

I tested the patch with VA-API encoding and transcoding, and it seems to be
working fine. We are using the "VAAPI_DISABLE_INTERLACE" environment
variable, so interlacing is always disabled.
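
For reference, my reading of the selection logic in the patch is
roughly the sketch below.  The struct, the integer format codes, and
the disable_interlace parameter are hypothetical; the last one only
stands in for the effect of the environment variable, since how the
driver reads it is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy surface template with only the two fields under discussion. */
struct toy_template {
   int buffer_format;
   bool interlaced;
};

/* Start from the driver's preferred format and interlacing.  If the
 * caller expects a specific format, keep the interlaced preference
 * only when that format matches the preferred one and no memory
 * attribute was supplied.
 */
struct toy_template
toy_pick_template(int preferred_format, bool prefers_interlaced,
                  bool have_expected, int expected_format,
                  bool has_memory_attribute, bool disable_interlace)
{
   struct toy_template t;

   t.buffer_format = preferred_format;
   t.interlaced = prefers_interlaced;

   if (have_expected) {
      if (expected_format != t.buffer_format || has_memory_attribute)
         t.interlaced = false;
      t.buffer_format = expected_format;
   }

   /* Assumed override: an explicit request to disable interlacing wins. */
   if (disable_interlace)
      t.interlaced = false;

   return t;
}
```

With the override set, the result is never interlaced, which matches
what we see in the encode and transcode tests.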

Regards,
Boyuan

-Original Message-
From: Liu, Leo 
Sent: September-08-16 9:50 AM
To: Koenig, Christian; Julien Isorce; mesa-dev@lists.freedesktop.org
Cc: mesa-sta...@lists.freedesktop.org; Zhang, Boyuan; Julien Isorce
Subject: Re: [PATCH] st/va: also honors interlaced preference when providing a 
video format



On 09/08/2016 03:50 AM, Christian König wrote:
> Am 08.09.2016 um 09:34 schrieb Julien Isorce:
>> This fixes a crash when using the prefered video format with 
>> vaapisink on Nvidia hardwares.
>> Also caught by the following assert:
>>nouveau_vp3_video.c:91: Assertion `templat->interlaced' failed.
>>
>> TEST= gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12 ! 
>> vaapisink
>>
>> Signed-off-by: Julien Isorce 
>> Tested-by: Víctor Manuel Jáquez Leal 
>
> Reviewed-by: Christian König .
>
> But somebody should double check if that doesn't break transcoding for 
> AMD GPUs.
>
> We had some problems with that in the past.

VA-API encode uses the "VAAPI_DISABLE_INTERLACE" env to make sure it is not
interlaced, but it is better to double check.

Boyuan, can you test this patch?

Regards,
Leo

>
> Regards,
> Christian.
>
>> ---
>>   src/gallium/state_trackers/va/surface.c | 36
>> +
>>   1 file changed, 19 insertions(+), 17 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/va/surface.c
>> b/src/gallium/state_trackers/va/surface.c
>> index 3ee1cdd..00df69d 100644
>> --- a/src/gallium/state_trackers/va/surface.c
>> +++ b/src/gallium/state_trackers/va/surface.c
>> @@ -632,24 +632,26 @@ vlVaCreateSurfaces2(VADriverContextP ctx, 
>> unsigned int format,
>>memset(&templat, 0, sizeof(templat));
>>   +   templat.buffer_format = pscreen->get_video_param(
>> +  pscreen,
>> +  PIPE_VIDEO_PROFILE_UNKNOWN,
>> +  PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
>> +  PIPE_VIDEO_CAP_PREFERED_FORMAT
>> +   );
>> +   templat.interlaced = pscreen->get_video_param(
>> +  pscreen,
>> +  PIPE_VIDEO_PROFILE_UNKNOWN,
>> +  PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
>> +  PIPE_VIDEO_CAP_PREFERS_INTERLACED
>> +   );
>> +
>>  if (expected_fourcc) {
>> -  templat.buffer_format = VaFourccToPipeFormat(expected_fourcc);
>> -  templat.interlaced = 0;
>> -   } else {
>> -  templat.buffer_format = pscreen->get_video_param
>> -(
>> -   pscreen,
>> -   PIPE_VIDEO_PROFILE_UNKNOWN,
>> -   PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
>> -   PIPE_VIDEO_CAP_PREFERED_FORMAT
>> -   );
>> -  templat.interlaced = pscreen->get_video_param
>> -(
>> -   pscreen,
>> -   PIPE_VIDEO_PROFILE_UNKNOWN,
>> -   PIPE_VIDEO_ENTRYPOINT_BITSTREAM,
>> -   PIPE_VIDEO_CAP_PREFERS_INTERLACED
>> -   );
>> +  enum pipe_format expected_format =
>> VaFourccToPipeFormat(expected_fourcc);
>> +
>> +  if (expected_format != templat.buffer_format || memory_attibute)
>> +templat.interlaced = 0;
>> +
>> +  templat.buffer_format = expected_format;
>>  }
>>templat.chroma_format = ChromaToPipe(format);
>
>


[Mesa-dev] [PATCH 02/11] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-08 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/api_validate.c | 94 
 src/mesa/main/api_validate.h |  4 ++
 src/mesa/main/compute.c  | 17 
 src/mesa/main/context.c  |  6 +++
 src/mesa/main/dd.h   |  9 
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/get.c  | 12 +
 src/mesa/main/get_hash_params.py |  3 ++
 src/mesa/main/mtypes.h   | 23 +-
 src/mesa/main/shaderapi.c|  1 +
 src/mesa/main/shaderobj.c|  2 +
 11 files changed, 171 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index b35751e..9379015 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -1096,6 +1096,7 @@ GLboolean
 _mesa_validate_DispatchCompute(struct gl_context *ctx,
const GLuint *num_groups)
 {
+   struct gl_shader_program *prog;
int i;
FLUSH_CURRENT(ctx, 0);
 
@@ -1128,6 +1129,86 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
   }
}
 
+   /* From the ARB_compute_variable_group_size specification:
+*
+* "An INVALID_OPERATION error is generated by DispatchCompute if the active
+* program for the compute shader stage has a variable work group size."
+*/
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+   if (prog->Comp.LocalSizeVariable) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "glDispatchCompute(variable work group size forbidden)");
+  return GL_FALSE;
+   }
+
+   return GL_TRUE;
+}
+
+GLboolean
+_mesa_validate_DispatchComputeGroupSizeARB(struct gl_context *ctx,
+   const GLuint *num_groups,
+   const GLuint *group_size)
+{
+   struct gl_shader_program *prog;
+   GLuint64 total_invocations = 1;
+   int i;
+
+   FLUSH_CURRENT(ctx, 0);
+
+   if (!check_valid_to_compute(ctx, "glDispatchComputeGroupSizeARB"))
+  return GL_FALSE;
+
+   for (i = 0; i < 3; i++) {
+  /* From the ARB_compute_variable_group_size specification:
+   *
+   * "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if
+   * any of <group_size_x>, <group_size_y>, or <group_size_z> is less than
+   * or equal to zero or greater than the maximum local work group size for
+   * compute shaders with variable group size
+   * (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in the corresponding dimension."
+   *
+   * However, the "less than" is a spec bug because they are declared as
+   * unsigned integers.
+   */
+  if (group_size[i] == 0 ||
+  group_size[i] > ctx->Const.MaxComputeVariableGroupSize[i]) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glDispatchComputeGroupSizeARB(group_size_%c)", 'x' + i);
+ return GL_FALSE;
+  }
+
+  /* From the ARB_compute_variable_group_size specification:
+   *
+   * "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if
+   * the product of <group_size_x>, <group_size_y>, and <group_size_z>
+   * exceeds the implementation-dependent maximum local work group
+   * invocation count for compute shaders with variable group size
+   * (MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB)."
+   */
+  total_invocations *= group_size[i];
+  if (total_invocations > ctx->Const.MaxComputeVariableGroupInvocations) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glDispatchComputeGroupSizeARB(product of local_sizes "
+ "exceeds MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB (%d))",
+ ctx->Const.MaxComputeVariableGroupInvocations);
+ return GL_FALSE;
+  }
+
+  /* From the ARB_compute_variable_group_size specification:
+   *
+   * "An INVALID_OPERATION error is generated by
+   * DispatchComputeGroupSizeARB if the active program for the compute
+   * shader stage has a fixed work group size."
+   */
+  prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+  if (prog->Comp.LocalSize[i] != 0) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glDispatchComputeGroupSizeARB(fixed work group size "
+ "forbidden)");
+ return GL_FALSE;
+  }
+   }
+
return GL_TRUE;
 }
 
@@ -1137,6 +1218,7 @@ valid_dispatch_indirect(struct gl_context *ctx,
 GLsizei size, const char *name)
 {
const uint64_t end = (uint64_t) indirect + size;
+   struct gl_shader_program *prog;
 
if (!check_valid_to_compute(ctx, name))
   return GL_FALSE;
@@ -1182,6 +1264,18 @@ valid_dispatch_indirect(struct gl_context *ctx,
   return GL_FALSE;
}
 
+   /* From the ARB_compute_variable_group_size specification:
+*
+* "An INVALID_OPERATION error is generated if the active program for the
+* compute shader stage has a variable work group size."
+*/
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMP
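
The checks this patch adds for DispatchComputeGroupSizeARB can be sketched in isolation. This is only an illustration, not the Mesa code: struct cs_limits, enum cs_error and validate_group_size() are hypothetical names, and unlike the hunk above the fixed-size check is hoisted out of the per-dimension loop. The running product is kept in 64 bits and tested after each multiplication, so it cannot overflow before the limit check fires.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits; in Mesa these live in ctx->Const. */
struct cs_limits {
   uint32_t max_group_size[3];   /* per-dimension maximum */
   uint32_t max_invocations;     /* maximum product of the three sizes */
};

enum cs_error { CS_OK, CS_INVALID_VALUE, CS_INVALID_OPERATION };

/* Validate a variable group size against the limits.
 * fixed_local_size holds the sizes declared in the shader; all zeros
 * means that the shader did not declare a fixed size. */
static enum cs_error
validate_group_size(const struct cs_limits *limits,
                    const uint32_t group_size[3],
                    const uint32_t fixed_local_size[3])
{
   uint64_t total = 1;

   for (int i = 0; i < 3; i++) {
      /* The sizes are unsigned, so only "equal to zero" can happen. */
      if (group_size[i] == 0 || group_size[i] > limits->max_group_size[i])
         return CS_INVALID_VALUE;

      /* total never exceeds max_invocations (32 bits) before this
       * multiplication, so the 64-bit product cannot wrap. */
      total *= group_size[i];
      if (total > limits->max_invocations)
         return CS_INVALID_VALUE;
   }

   /* A program with a fixed work group size must not be dispatched
    * with an explicit group size. */
   for (int i = 0; i < 3; i++) {
      if (fixed_local_size[i] != 0)
         return CS_INVALID_OPERATION;
   }

   return CS_OK;
}
```

Because the fixed-size check runs last here, a call that is wrong in both ways reports INVALID_VALUE; the interleaved order in the patch can report a different error for the same input.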

[Mesa-dev] [PATCH 11/11] nv50/ir: use 1024 threads/block for variable local size

2016-09-08 Thread Samuel Pitoiset

When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
index 4a701f7..0bb14ec 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
@@ -174,7 +174,8 @@ public:
virtual void getBuiltinCode(const uint32_t **code, uint32_t *size) const = 0;
 
virtual void parseDriverInfo(const struct nv50_ir_prog_info *info) {
-  threads = info->prop.cp.numThreads;
+  threads =
+ info->prop.cp.numThreads == 0 ? 1024 : info->prop.cp.numThreads;
}
 
virtual bool runLegalizePass(Program *, CGStage stage) const = 0;
-- 
2.9.3
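
For readers who want to see the failure mode in isolation, here is a minimal sketch of the fallback. It assumes that the register budget is obtained by dividing a register file size by the thread count; that formula, and the names effective_threads() and max_regs_per_thread(), are illustrative assumptions and not the nv50/ir code. Only the 1024 fallback is taken from the patch.

```c
#include <stdint.h>

/* Fallback used by the patch when the local size is variable. */
#define FALLBACK_THREADS 1024u

/* A variable local size is reported as 0 threads; substitute the
 * fallback so that later arithmetic never divides by zero. */
static uint32_t
effective_threads(uint32_t num_threads)
{
   return num_threads == 0 ? FALLBACK_THREADS : num_threads;
}

/* Hypothetical register budget: a register file of file_size entries
 * shared evenly between all threads of a block. */
static uint32_t
max_regs_per_thread(uint32_t file_size, uint32_t num_threads)
{
   return file_size / effective_threads(num_threads);
}
```

Without the guard, max_regs_per_thread(65536, 0) would perform an integer division by zero, which is what raises SIGFPE on most platforms.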


[Mesa-dev] [PATCH 03/11] glsl: add enable flags for ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset

This also initializes the default values for the standalone compiler.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/glsl_parser_extras.cpp | 1 +
 src/compiler/glsl/glsl_parser_extras.h   | 2 ++
 src/compiler/glsl/standalone.cpp | 4 
 src/compiler/glsl/standalone_scaffolding.cpp | 5 +
 4 files changed, 12 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 436ddd0..bcbe623 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -580,6 +580,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
EXT(ARB_ES3_2_compatibility),
EXT(ARB_arrays_of_arrays),
EXT(ARB_compute_shader),
+   EXT(ARB_compute_variable_group_size),
EXT(ARB_conservative_depth),
EXT(ARB_cull_distance),
EXT(ARB_derivative_control),
diff --git a/src/compiler/glsl/glsl_parser_extras.h b/src/compiler/glsl/glsl_parser_extras.h
index e146fe1..8e0dafe 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -576,6 +576,8 @@ struct _mesa_glsl_parse_state {
bool ARB_arrays_of_arrays_warn;
bool ARB_compute_shader_enable;
bool ARB_compute_shader_warn;
+   bool ARB_compute_variable_group_size_enable;
+   bool ARB_compute_variable_group_size_warn;
bool ARB_conservative_depth_enable;
bool ARB_conservative_depth_warn;
bool ARB_cull_distance_enable;
diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 88fe5fd..cb2da03 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -58,6 +58,10 @@ initialize_context(struct gl_context *ctx, gl_api api)
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
ctx->Const.MaxComputeWorkGroupInvocations = 1024;
ctx->Const.MaxComputeSharedMemorySize = 32768;
+   ctx->Const.MaxComputeVariableGroupSize[0] = 512;
+   ctx->Const.MaxComputeVariableGroupSize[1] = 512;
+   ctx->Const.MaxComputeVariableGroupSize[2] = 64;
+   ctx->Const.MaxComputeVariableGroupInvocations = 512;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxCombinedUniformComponents = 1024;
diff --git a/src/compiler/glsl/standalone_scaffolding.cpp b/src/compiler/glsl/standalone_scaffolding.cpp
index b0fb4b7..decff5f 100644
--- a/src/compiler/glsl/standalone_scaffolding.cpp
+++ b/src/compiler/glsl/standalone_scaffolding.cpp
@@ -144,6 +144,7 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api)
ctx->Extensions.dummy_false = false;
ctx->Extensions.dummy_true = true;
ctx->Extensions.ARB_compute_shader = true;
+   ctx->Extensions.ARB_compute_variable_group_size = true;
ctx->Extensions.ARB_conservative_depth = true;
ctx->Extensions.ARB_draw_instanced = true;
ctx->Extensions.ARB_ES2_compatibility = true;
@@ -207,6 +208,10 @@ void initialize_context_to_defaults(struct gl_context *ctx, gl_api api)
ctx->Const.MaxComputeWorkGroupSize[1] = 1024;
ctx->Const.MaxComputeWorkGroupSize[2] = 64;
ctx->Const.MaxComputeWorkGroupInvocations = 1024;
+   ctx->Const.MaxComputeVariableGroupSize[0] = 512;
+   ctx->Const.MaxComputeVariableGroupSize[1] = 512;
+   ctx->Const.MaxComputeVariableGroupSize[2] = 64;
+   ctx->Const.MaxComputeVariableGroupInvocations = 512;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits = 16;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxUniformComponents = 1024;
ctx->Const.Program[MESA_SHADER_COMPUTE].MaxInputComponents = 0; /* not used */
-- 
2.9.3


[Mesa-dev] [PATCH 09/11] st/mesa: add support for dispatching a variable local size

2016-09-08 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/state_tracker/st_cb_compute.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_compute.c b/src/mesa/state_tracker/st_cb_compute.c
index 88c1ee2..ccc5dc2 100644
--- a/src/mesa/state_tracker/st_cb_compute.c
+++ b/src/mesa/state_tracker/st_cb_compute.c
@@ -36,6 +36,7 @@
 
 static void st_dispatch_compute_common(struct gl_context *ctx,
const GLuint *num_groups,
+   const GLuint *group_size,
struct pipe_resource *indirect,
GLintptr indirect_offset)
 {
@@ -56,7 +57,7 @@ static void st_dispatch_compute_common(struct gl_context *ctx,
   st_validate_state(st, ST_PIPELINE_COMPUTE);
 
for (unsigned i = 0; i < 3; i++) {
-  info.block[i] = prog->Comp.LocalSize[i];
+  info.block[i] = group_size ? group_size[i] : prog->Comp.LocalSize[i];
   info.grid[i]  = num_groups ? num_groups[i] : 0;
}
 
@@ -71,7 +72,7 @@ static void st_dispatch_compute_common(struct gl_context *ctx,
 static void st_dispatch_compute(struct gl_context *ctx,
 const GLuint *num_groups)
 {
-   st_dispatch_compute_common(ctx, num_groups, NULL, 0);
+   st_dispatch_compute_common(ctx, num_groups, NULL, NULL, 0);
 }
 
 static void st_dispatch_compute_indirect(struct gl_context *ctx,
@@ -80,11 +81,19 @@ static void st_dispatch_compute_indirect(struct gl_context *ctx,
struct gl_buffer_object *indirect_buffer = ctx->DispatchIndirectBuffer;
struct pipe_resource *indirect = st_buffer_object(indirect_buffer)->buffer;
 
-   st_dispatch_compute_common(ctx, NULL, indirect, indirect_offset);
+   st_dispatch_compute_common(ctx, NULL, NULL, indirect, indirect_offset);
+}
+
+static void st_dispatch_compute_group_size(struct gl_context *ctx,
+   const GLuint *num_groups,
+   const GLuint *group_size)
+{
+   st_dispatch_compute_common(ctx, num_groups, group_size, NULL, 0);
 }
 
 void st_init_compute_functions(struct dd_function_table *functions)
 {
functions->DispatchCompute = st_dispatch_compute;
functions->DispatchComputeIndirect = st_dispatch_compute_indirect;
+   functions->DispatchComputeGroupSize = st_dispatch_compute_group_size;
 }
-- 
2.9.3
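
The selection logic in st_dispatch_compute_common() is small enough to show on its own. The sketch below is a standalone illustration under assumed names: struct dispatch_info and fill_dispatch_info() are hypothetical stand-ins and not the gallium pipe_grid_info interface. A NULL group_size means "use the fixed size from the shader", and a NULL num_groups (the indirect case) leaves the grid at zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the block and grid fields of a dispatch. */
struct dispatch_info {
   uint32_t block[3];   /* threads per work group */
   uint32_t grid[3];    /* number of work groups */
};

/* Fill in block and grid sizes.
 * group_size: explicit size from the variable-size entry point, or NULL.
 * local_size: fixed size declared by the shader (zeros if variable).
 * num_groups: grid size, or NULL for an indirect dispatch. */
static void
fill_dispatch_info(struct dispatch_info *info,
                   const uint32_t *num_groups,
                   const uint32_t *group_size,
                   const uint32_t local_size[3])
{
   for (unsigned i = 0; i < 3; i++) {
      info->block[i] = group_size ? group_size[i] : local_size[i];
      info->grid[i] = num_groups ? num_groups[i] : 0;
   }
}
```

Passing the group size as an optional pointer keeps a single common path for the three entry points, which is the same design choice the patch makes.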


[Mesa-dev] [PATCH 5/7] mesa/st: pass st_compute_program to st_get_cp_variant

2016-09-08 Thread Rob Clark

Makes it more consistent with vp/fp variants, and will be needed in a
following patch.

Signed-off-by: Rob Clark 
---
 src/mesa/state_tracker/st_atom_shader.c |  2 +-
 src/mesa/state_tracker/st_program.c | 12 ++--
 src/mesa/state_tracker/st_program.h |  3 +--
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c
index c2e4fc8..3cf8992 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -322,7 +322,7 @@ update_cp( struct st_context *st )
assert(stcp->Base.Base.Target == GL_COMPUTE_PROGRAM_NV);
 
key = st_get_basic_variant_key(st, &stcp->Base.Base);
-   st->cp_variant = st_get_cp_variant(st, &stcp->tgsi, &stcp->variants, &key);
+   st->cp_variant = st_get_cp_variant(st, stcp, &key);
 
st_reference_compprog(st, &st->cp, stcp);
 
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index 284cc22..41ccc20 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -1685,15 +1685,15 @@ st_translate_compute_program(struct st_context *st,
  */
 struct st_basic_variant *
 st_get_cp_variant(struct st_context *st,
-  struct pipe_compute_state *tgsi,
-  struct st_basic_variant **variants,
+  struct st_compute_program *stcp,
   const struct st_basic_variant_key *key)
 {
struct pipe_context *pipe = st->pipe;
+   struct pipe_compute_state *tgsi = &stcp->tgsi;
struct st_basic_variant *v;
 
/* Search for existing variant */
-   for (v = *variants; v; v = v->next) {
+   for (v = stcp->variants; v; v = v->next) {
   if (memcmp(&v->key, key, sizeof(*key)) == 0) {
  break;
   }
@@ -1708,8 +1708,8 @@ st_get_cp_variant(struct st_context *st,
  v->key = *key;
 
  /* insert into list */
- v->next = *variants;
- *variants = v;
+ v->next = stcp->variants;
+ stcp->variants = v;
   }
}
 
@@ -1955,7 +1955,7 @@ st_precompile_shader_variant(struct st_context *st,
case GL_COMPUTE_PROGRAM_NV: {
   struct st_compute_program *p = (struct st_compute_program *)prog;
   struct st_basic_variant_key key = st_get_basic_variant_key(st, prog);
-  st_get_cp_variant(st, &p->tgsi, &p->variants, &key);
+  st_get_cp_variant(st, p, &key);
   break;
}
 
diff --git a/src/mesa/state_tracker/st_program.h b/src/mesa/state_tracker/st_program.h
index f4e572a..dd5a89b 100644
--- a/src/mesa/state_tracker/st_program.h
+++ b/src/mesa/state_tracker/st_program.h
@@ -447,8 +447,7 @@ st_get_fp_variant(struct st_context *st,
 
 extern struct st_basic_variant *
 st_get_cp_variant(struct st_context *st,
-  struct pipe_compute_state *tgsi,
-  struct st_basic_variant **variants,
+  struct st_compute_program *p,
   const struct st_basic_variant_key *key);
 
 extern struct st_basic_variant *
-- 
2.7.4
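
The "search for an existing variant, otherwise create one and insert it at the head of the list" pattern used by st_get_cp_variant() can be sketched independently of the state tracker. All names below (struct variant_key, struct variant, struct program, get_variant()) are hypothetical, and the integer handle merely stands in for the driver shader object that the real code creates through the pipe context.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key; keys are compared bytewise, so a key struct with
 * padding must be zeroed with memset() before it is filled in. */
struct variant_key {
   const void *owner;
};

struct variant {
   struct variant_key key;
   int driver_shader;          /* stand-in for the driver object */
   struct variant *next;
};

struct program {
   struct variant *variants;   /* singly linked list, newest first */
   int next_handle;
};

/* Return the variant matching key, creating it on first use.
 * Returns NULL only if the allocation fails. */
static struct variant *
get_variant(struct program *prog, const struct variant_key *key)
{
   struct variant *v;

   /* Search for an existing variant. */
   for (v = prog->variants; v; v = v->next) {
      if (memcmp(&v->key, key, sizeof(*key)) == 0)
         return v;
   }

   /* Create a new variant and insert it at the head of the list. */
   v = calloc(1, sizeof(*v));
   if (!v)
      return NULL;

   v->key = *key;
   v->driver_shader = prog->next_handle++;
   v->next = prog->variants;
   prog->variants = v;
   return v;
}
```

Taking the program rather than separate tgsi and variants pointers, as this patch does, means the helper can reach anything else stored on the program later without another signature change.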


[Mesa-dev] [PATCH 05/11] glsl: reject compute shaders with fixed and variable local size

2016-09-08 Thread Samuel Pitoiset

The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/ast_to_hir.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 4fc4c5c..a53a82e 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -8013,6 +8013,20 @@ ast_cs_input_layout::hir(exec_list *instructions,
   }
}
 
+   /* From the ARB_compute_variable_group_size specification:
+*
+* If a compute shader including a *local_size_variable* qualifier also
+* declares a fixed local group size using the *local_size_x*,
+* *local_size_y*, or *local_size_z* qualifiers, a compile-time error
+* results
+*/
+   if (state->cs_input_local_size_variable_specified) {
+  _mesa_glsl_error(&loc, state,
+   "compute shader can't include both a variable and a "
+   "fixed local group size");
+  return NULL;
+   }
+
state->cs_input_local_size_specified = true;
for (int i = 0; i < 3; i++)
   state->cs_input_local_size[i] = qual_local_size[i];
-- 
2.9.3


[Mesa-dev] [PATCH 10/11] st/mesa: expose ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset

This extension is only exposed if the underlying driver supports
ARB_compute_shader.

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/state_tracker/st_extensions.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
index 807fbfb..dc2e60a 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1196,6 +1196,19 @@ void st_init_extensions(struct pipe_screen *screen,
  extensions->ARB_compute_shader =
   extensions->ARB_shader_image_load_store &&
   extensions->ARB_shader_atomic_counters;
+
+ if (extensions->ARB_compute_shader) {
+/* Because the minimum values required by
+ * ARB_compute_variable_group_size are less than or equal to the
+ * ones defined by ARB_compute_shader, we can re-use them. */
+for (i = 0; i < 3; i++) {
+   consts->MaxComputeVariableGroupSize[i] =
+  consts->MaxComputeWorkGroupSize[i];
+}
+consts->MaxComputeVariableGroupInvocations =
+   consts->MaxComputeWorkGroupInvocations;
+extensions->ARB_compute_variable_group_size = true;
+ }
   }
}
 
-- 
2.9.3


[Mesa-dev] [PATCH 07/11] glsl: add gl_LocalGroupSizeARB as a system value

2016-09-08 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/builtin_variables.cpp | 2 ++
 src/compiler/shader_enums.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/src/compiler/glsl/builtin_variables.cpp b/src/compiler/glsl/builtin_variables.cpp
index f47daab..a1768fc 100644
--- a/src/compiler/glsl/builtin_variables.cpp
+++ b/src/compiler/glsl/builtin_variables.cpp
@@ -1236,6 +1236,8 @@ builtin_variable_generator::generate_cs_special_vars()
 "gl_LocalInvocationID");
add_system_value(SYSTEM_VALUE_WORK_GROUP_ID, uvec3_t, "gl_WorkGroupID");
add_system_value(SYSTEM_VALUE_NUM_WORK_GROUPS, uvec3_t, "gl_NumWorkGroups");
+   add_system_value(SYSTEM_VALUE_LOCAL_GROUP_SIZE,
+uvec3_t, "gl_LocalGroupSizeARB");
if (state->ctx->Const.LowerCsDerivedVariables) {
   add_variable("gl_GlobalInvocationID", uvec3_t, ir_var_auto, 0);
   add_variable("gl_LocalInvocationIndex", uint_t, ir_var_auto, 0);
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index c3a62e0..b6e048e 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -472,6 +472,7 @@ typedef enum
SYSTEM_VALUE_GLOBAL_INVOCATION_ID,
SYSTEM_VALUE_WORK_GROUP_ID,
SYSTEM_VALUE_NUM_WORK_GROUPS,
+   SYSTEM_VALUE_LOCAL_GROUP_SIZE,
/*@}*/
 
/**
-- 
2.9.3


[Mesa-dev] [PATCH 06/11] glsl/linker: handle errors when a variable local size is used

2016-09-08 Thread Samuel Pitoiset

Compute shaders can now include a fixed local size as defined by
ARB_compute_shader or a variable size as defined by
ARB_compute_variable_group_size.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/linker.cpp | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index c95edf3..e909455 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2074,6 +2074,7 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
 {
for (int i = 0; i < 3; i++)
   linked_shader->info.Comp.LocalSize[i] = 0;
+   linked_shader->info.Comp.LocalSizeVariable = false;
 
/* This function is called for all shader stages, but it only has an effect
 * for compute shaders.
@@ -2109,6 +2110,20 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
 linked_shader->info.Comp.LocalSize[i] =
shader->info.Comp.LocalSize[i];
  }
+  } else if (shader->info.Comp.LocalSizeVariable) {
+ if (linked_shader->info.Comp.LocalSize[0] != 0) {
+/* From the ARB_compute_variable_group_size spec:
+ *
+ * If one compute shader attached to a program declares a
+ * variable local group size and a second compute shader
+ * attached to the same program declares a fixed local group
+ * size, a link-time error results.
+ */
+linker_error(prog, "compute shader defined with both fixed and "
+ "variable local group size\n");
+return;
+ }
+ linked_shader->info.Comp.LocalSizeVariable = true;
   }
}
 
@@ -2116,12 +2131,16 @@ link_cs_input_layout_qualifiers(struct gl_shader_program *prog,
 * since we already know we're in the right type of shader program
 * for doing it.
 */
-   if (linked_shader->info.Comp.LocalSize[0] == 0) {
-  linker_error(prog, "compute shader didn't declare local size\n");
+   if (linked_shader->info.Comp.LocalSize[0] == 0 &&
+   !linked_shader->info.Comp.LocalSizeVariable) {
+  linker_error(prog, "compute shader must contain a fixed or a variable "
+ "local group size\n");
   return;
}
for (int i = 0; i < 3; i++)
   prog->Comp.LocalSize[i] = linked_shader->info.Comp.LocalSize[i];
+   prog->Comp.LocalSizeVariable =
+  linked_shader->info.Comp.LocalSizeVariable;
 }
 
 
-- 
2.9.3
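
The link-time rules can be sketched as a small standalone merge function. This is a sketch under stated assumptions, not the linker code: struct cs_layout, enum link_result and link_cs_layouts() are hypothetical names. Note also one deliberate difference: the sketch rejects a fixed size that follows a variable one as well as the reverse, whereas the hunk above only shows the check in the variable branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-shader layout information. */
struct cs_layout {
   unsigned local_size[3];      /* all zeros if no fixed size declared */
   bool local_size_variable;    /* true if local_size_variable was used */
};

enum link_result {
   LINK_OK,
   LINK_ERR_CONFLICT,   /* two different fixed sizes */
   LINK_ERR_MIXED,      /* both a fixed and a variable size */
   LINK_ERR_MISSING     /* neither a fixed nor a variable size */
};

/* Merge the layouts of all compute shaders attached to one program. */
static enum link_result
link_cs_layouts(const struct cs_layout *shaders, size_t count,
                struct cs_layout *out)
{
   for (int i = 0; i < 3; i++)
      out->local_size[i] = 0;
   out->local_size_variable = false;

   for (size_t s = 0; s < count; s++) {
      const struct cs_layout *sh = &shaders[s];

      if (sh->local_size[0] != 0) {
         if (out->local_size_variable)
            return LINK_ERR_MIXED;
         if (out->local_size[0] != 0) {
            for (int i = 0; i < 3; i++) {
               if (out->local_size[i] != sh->local_size[i])
                  return LINK_ERR_CONFLICT;
            }
         }
         for (int i = 0; i < 3; i++)
            out->local_size[i] = sh->local_size[i];
      } else if (sh->local_size_variable) {
         if (out->local_size[0] != 0)
            return LINK_ERR_MIXED;
         out->local_size_variable = true;
      }
   }

   /* A compute program needs exactly one of the two kinds of size. */
   if (out->local_size[0] == 0 && !out->local_size_variable)
      return LINK_ERR_MISSING;

   return LINK_OK;
}
```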


[Mesa-dev] [PATCH 08/11] st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE

2016-09-08 Thread Samuel Pitoiset

gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 507a782..429f4b0 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5235,6 +5235,8 @@ _mesa_sysval_to_semantic(unsigned sysval)
   return TGSI_SEMANTIC_BLOCK_ID;
case SYSTEM_VALUE_NUM_WORK_GROUPS:
   return TGSI_SEMANTIC_GRID_SIZE;
+   case SYSTEM_VALUE_LOCAL_GROUP_SIZE:
+  return TGSI_SEMANTIC_BLOCK_SIZE;
 
/* Unhandled */
case SYSTEM_VALUE_LOCAL_INVOCATION_INDEX:
-- 
2.9.3


[Mesa-dev] [PATCH 4/7] mesa/st: a bit of basic_variant refactoring

2016-09-08 Thread Rob Clark

Add a helper to initialize the key, and pass the key into the helper
that iterates the variants, similar to how it works for vp/fp variants.

The 'prog' arg to the helper gets used in a following patch, and is the
reason to pass the key into st_get_basic_variant().

Signed-off-by: Rob Clark 
---
 src/mesa/state_tracker/st_atom_shader.c | 16 
 src/mesa/state_tracker/st_program.c | 34 -
 src/mesa/state_tracker/st_program.h | 16 ++--
 3 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c
index 2f700a2..c2e4fc8 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -217,6 +217,7 @@ static void
 update_gp( struct st_context *st )
 {
struct st_geometry_program *stgp;
+   struct st_basic_variant_key key;
 
if (!st->ctx->GeometryProgram._Current) {
   cso_set_geometry_shader_handle(st->cso_context, NULL);
@@ -227,8 +228,9 @@ update_gp( struct st_context *st )
stgp = st_geometry_program(st->ctx->GeometryProgram._Current);
assert(stgp->Base.Base.Target == GL_GEOMETRY_PROGRAM_NV);
 
+   key = st_get_basic_variant_key(st, &stgp->Base.Base);
st->gp_variant = st_get_basic_variant(st, PIPE_SHADER_GEOMETRY,
- &stgp->tgsi, &stgp->variants);
+ &stgp->tgsi, &stgp->variants, &key);
 
st_reference_geomprog(st, &st->gp, stgp);
 
@@ -246,6 +248,7 @@ static void
 update_tcp( struct st_context *st )
 {
struct st_tessctrl_program *sttcp;
+   struct st_basic_variant_key key;
 
if (!st->ctx->TessCtrlProgram._Current) {
   cso_set_tessctrl_shader_handle(st->cso_context, NULL);
@@ -256,8 +259,9 @@ update_tcp( struct st_context *st )
sttcp = st_tessctrl_program(st->ctx->TessCtrlProgram._Current);
assert(sttcp->Base.Base.Target == GL_TESS_CONTROL_PROGRAM_NV);
 
+   key = st_get_basic_variant_key(st, &sttcp->Base.Base);
st->tcp_variant = st_get_basic_variant(st, PIPE_SHADER_TESS_CTRL,
-  &sttcp->tgsi, &sttcp->variants);
+  &sttcp->tgsi, &sttcp->variants, &key);
 
st_reference_tesscprog(st, &st->tcp, sttcp);
 
@@ -275,6 +279,7 @@ static void
 update_tep( struct st_context *st )
 {
struct st_tesseval_program *sttep;
+   struct st_basic_variant_key key;
 
if (!st->ctx->TessEvalProgram._Current) {
   cso_set_tesseval_shader_handle(st->cso_context, NULL);
@@ -285,8 +290,9 @@ update_tep( struct st_context *st )
sttep = st_tesseval_program(st->ctx->TessEvalProgram._Current);
assert(sttep->Base.Base.Target == GL_TESS_EVALUATION_PROGRAM_NV);
 
+   key = st_get_basic_variant_key(st, &sttep->Base.Base);
st->tep_variant = st_get_basic_variant(st, PIPE_SHADER_TESS_EVAL,
-  &sttep->tgsi, &sttep->variants);
+  &sttep->tgsi, &sttep->variants, &key);
 
st_reference_tesseprog(st, &st->tep, sttep);
 
@@ -304,6 +310,7 @@ static void
 update_cp( struct st_context *st )
 {
struct st_compute_program *stcp;
+   struct st_basic_variant_key key;
 
if (!st->ctx->ComputeProgram._Current) {
   cso_set_compute_shader_handle(st->cso_context, NULL);
@@ -314,7 +321,8 @@ update_cp( struct st_context *st )
stcp = st_compute_program(st->ctx->ComputeProgram._Current);
assert(stcp->Base.Base.Target == GL_COMPUTE_PROGRAM_NV);
 
-   st->cp_variant = st_get_cp_variant(st, &stcp->tgsi, &stcp->variants);
+   key = st_get_basic_variant_key(st, &stcp->Base.Base);
+   st->cp_variant = st_get_cp_variant(st, &stcp->tgsi, &stcp->variants, &key);
 
st_reference_compprog(st, &st->cp, stcp);
 
diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c
index 91887dc..284cc22 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -1533,18 +1533,15 @@ struct st_basic_variant *
 st_get_basic_variant(struct st_context *st,
  unsigned pipe_shader,
  struct pipe_shader_state *tgsi,
- struct st_basic_variant **variants)
+ struct st_basic_variant **variants,
+ const struct st_basic_variant_key *key)
 {
struct pipe_context *pipe = st->pipe;
struct st_basic_variant *v;
-   struct st_basic_variant_key key;
-
-   memset(&key, 0, sizeof(key));
-   key.st = st->has_shareable_shaders ? NULL : st;
 
/* Search for existing variant */
for (v = *variants; v; v = v->next) {
-  if (memcmp(&v->key, &key, sizeof(key)) == 0) {
+  if (memcmp(&v->key, key, sizeof(*key)) == 0) {
  break;
   }
}
@@ -1570,7 +1567,7 @@ st_get_basic_variant(struct st_context *st,
 return NULL;
  }
 
- v->key = key;
+ v->key = *key;
 
  /* ins

[Mesa-dev] [PATCH 04/11] glsl: process local_size_variable input qualifier

2016-09-08 Thread Samuel Pitoiset

This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.

Signed-off-by: Samuel Pitoiset 
---
 src/compiler/glsl/ast.h  |  5 +
 src/compiler/glsl/ast_type.cpp   |  6 ++
 src/compiler/glsl/glsl_parser.yy | 13 +
 src/compiler/glsl/glsl_parser_extras.cpp |  5 +
 src/compiler/glsl/glsl_parser_extras.h   |  6 ++
 5 files changed, 35 insertions(+)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 4c648d0..55f009a 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -553,6 +553,11 @@ struct ast_type_qualifier {
   */
  unsigned local_size:3;
 
+/** \name Layout qualifiers for ARB_compute_variable_group_size. */
+/** \{ */
+unsigned local_size_variable:1;
+/** \} */
+
 /** \name Layout and memory qualifiers for ARB_shader_image_load_store. */
 /** \{ */
 unsigned early_fragment_tests:1;
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index f3f6b29..3f19f1f 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -503,6 +503,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
  state->in_qualifier->flags.q.local_size == 0;
 
   valid_in_mask.flags.q.local_size = 7;
+  valid_in_mask.flags.q.local_size_variable = 1;
   break;
default:
   _mesa_glsl_error(loc, state,
@@ -580,6 +581,10 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
   this->point_mode = q.point_mode;
}
 
+   if (q.flags.q.local_size_variable) {
+  state->cs_input_local_size_variable_specified = true;
+   }
+
if (create_node) {
   if (create_gs_ast) {
  node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);
@@ -653,6 +658,7 @@ ast_type_qualifier::validate_flags(YYLTYPE *loc,
 bad.flags.q.prim_type ? " prim_type" : "",
 bad.flags.q.max_vertices ? " max_vertices" : "",
 bad.flags.q.local_size ? " local_size" : "",
+bad.flags.q.local_size_variable ? " local_size_variable" : "",
 bad.flags.q.early_fragment_tests ? " early_fragment_tests" : "",
 bad.flags.q.explicit_image_format ? " image_format" : "",
 bad.flags.q.coherent ? " coherent" : "",
diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 9e1fd9e..38cbd3f 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -1491,6 +1491,19 @@ layout_qualifier_id:
  }
   }
 
+  /* Layout qualifiers for ARB_compute_variable_group_size. */
+  if (!$$.flags.i) {
+ if (match_layout_qualifier($1, "local_size_variable", state) == 0) {
+$$.flags.q.local_size_variable = 1;
+ }
+
+ if ($$.flags.i && !state->ARB_compute_variable_group_size_enable) {
+_mesa_glsl_error(& @1, state,
+ "qualifier `local_size_variable` requires "
+ "ARB_compute_variable_group_size");
+ }
+  }
+
   if (!$$.flags.i) {
  _mesa_glsl_error(& @1, state, "unrecognized layout identifier "
   "`%s'", $1);
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index bcbe623..5a7153f 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -297,6 +297,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
   sizeof(this->atomic_counter_offsets));
this->allow_extension_directive_midshader =
   ctx->Const.AllowGLSLExtensionDirectiveMidShader;
+
+   this->cs_input_local_size_variable_specified = false;
 }
 
 /**
@@ -1648,6 +1650,7 @@ set_shader_inout_layout(struct gl_shader *shader,
if (shader->Stage != MESA_SHADER_COMPUTE) {
   /* Should have been prevented by the parser. */
   assert(!state->cs_input_local_size_specified);
+  assert(!state->cs_input_local_size_variable_specified);
}
 
if (shader->Stage != MESA_SHADER_FRAGMENT) {
@@ -1763,6 +1766,8 @@ set_shader_inout_layout(struct gl_shader *shader,
  for (int i = 0; i < 3; i++)
 shader->info.Comp.LocalSize[i] = 0;
   }
+  shader->info.Comp.LocalSizeVariable =
+ state->cs_input_local_size_variable_specified;
   break;
 
case MESA_SHADER_FRAGMENT:
diff --git a/src/compiler/glsl/glsl_parser_extras.h b/src/compiler/glsl/glsl_parser_extras.h
index 8e0dafe..c1cf789 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -405,6 +405,12 @@ struct _mesa_glsl_parse_state {
unsigned cs_input_local_size[3];
 
/**
+* True if a compute shader input local variable size was specified using
+* a layout directive as specified by ARB_c

[Mesa-dev] [PATCH 6/7] mesa/st: pass prog to st_get_basic_variant()

2016-09-08 Thread Rob Clark

Needed in a following patch.

Signed-off-by: Rob Clark 
---
 src/mesa/state_tracker/st_atom_shader.c |  6 +++---
 src/mesa/state_tracker/st_program.c | 16 
 src/mesa/state_tracker/st_program.h |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c
index 3cf8992..f0970ae 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -229,7 +229,7 @@ update_gp( struct st_context *st )
assert(stgp->Base.Base.Target == GL_GEOMETRY_PROGRAM_NV);
 
key = st_get_basic_variant_key(st, &stgp->Base.Base);
-   st->gp_variant = st_get_basic_variant(st, PIPE_SHADER_GEOMETRY,
+   st->gp_variant = st_get_basic_variant(st, &stgp->Base.Base,
  &stgp->tgsi, &stgp->variants, &key);
 
st_reference_geomprog(st, &st->gp, stgp);
@@ -260,7 +260,7 @@ update_tcp( struct st_context *st )
assert(sttcp->Base.Base.Target == GL_TESS_CONTROL_PROGRAM_NV);
 
key = st_get_basic_variant_key(st, &sttcp->Base.Base);
-   st->tcp_variant = st_get_basic_variant(st, PIPE_SHADER_TESS_CTRL,
+   st->tcp_variant = st_get_basic_variant(st, &sttcp->Base.Base,
   &sttcp->tgsi, &sttcp->variants, 
&key);
 
st_reference_tesscprog(st, &st->tcp, sttcp);
@@ -291,7 +291,7 @@ update_tep( struct st_context *st )
assert(sttep->Base.Base.Target == GL_TESS_EVALUATION_PROGRAM_NV);
 
key = st_get_basic_variant_key(st, &sttep->Base.Base);
-   st->tep_variant = st_get_basic_variant(st, PIPE_SHADER_TESS_EVAL,
+   st->tep_variant = st_get_basic_variant(st, &sttep->Base.Base,
   &sttep->tgsi, &sttep->variants, 
&key);
 
st_reference_tesseprog(st, &st->tep, sttep);
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index 41ccc20..f8be835 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -1531,7 +1531,7 @@ st_translate_geometry_program(struct st_context *st,
  */
 struct st_basic_variant *
 st_get_basic_variant(struct st_context *st,
- unsigned pipe_shader,
+ struct gl_program *prog,
  struct pipe_shader_state *tgsi,
  struct st_basic_variant **variants,
  const struct st_basic_variant_key *key)
@@ -1551,14 +1551,14 @@ st_get_basic_variant(struct st_context *st,
   v = CALLOC_STRUCT(st_basic_variant);
   if (v) {
  /* fill in new variant */
- switch (pipe_shader) {
- case PIPE_SHADER_TESS_CTRL:
+ switch (prog->Target) {
+ case GL_TESS_CONTROL_PROGRAM_NV:
 v->driver_shader = pipe->create_tcs_state(pipe, tgsi);
 break;
- case PIPE_SHADER_TESS_EVAL:
+ case GL_TESS_EVALUATION_PROGRAM_NV:
 v->driver_shader = pipe->create_tes_state(pipe, tgsi);
 break;
- case PIPE_SHADER_GEOMETRY:
+ case GL_GEOMETRY_PROGRAM_NV:
 v->driver_shader = pipe->create_gs_state(pipe, tgsi);
 break;
  default:
@@ -1924,21 +1924,21 @@ st_precompile_shader_variant(struct st_context *st,
case GL_TESS_CONTROL_PROGRAM_NV: {
   struct st_tessctrl_program *p = (struct st_tessctrl_program *)prog;
   struct st_basic_variant_key key = st_get_basic_variant_key(st, prog);
-  st_get_basic_variant(st, PIPE_SHADER_TESS_CTRL, &p->tgsi, &p->variants, 
&key);
+  st_get_basic_variant(st, prog, &p->tgsi, &p->variants, &key);
   break;
}
 
case GL_TESS_EVALUATION_PROGRAM_NV: {
   struct st_tesseval_program *p = (struct st_tesseval_program *)prog;
   struct st_basic_variant_key key = st_get_basic_variant_key(st, prog);
-  st_get_basic_variant(st, PIPE_SHADER_TESS_EVAL, &p->tgsi, &p->variants, 
&key);
+  st_get_basic_variant(st, prog, &p->tgsi, &p->variants, &key);
   break;
}
 
case GL_GEOMETRY_PROGRAM_NV: {
   struct st_geometry_program *p = (struct st_geometry_program *)prog;
   struct st_basic_variant_key key = st_get_basic_variant_key(st, prog);
-  st_get_basic_variant(st, PIPE_SHADER_GEOMETRY, &p->tgsi, &p->variants, 
&key);
+  st_get_basic_variant(st, prog, &p->tgsi, &p->variants, &key);
   break;
}
 
diff --git a/src/mesa/state_tracker/st_program.h 
b/src/mesa/state_tracker/st_program.h
index dd5a89b..9cf492c 100644
--- a/src/mesa/state_tracker/st_program.h
+++ b/src/mesa/state_tracker/st_program.h
@@ -452,7 +452,7 @@ st_get_cp_variant(struct st_context *st,
 
 extern struct st_basic_variant *
 st_get_basic_variant(struct st_context *st,
- unsigned pipe_shader,
+ struct gl_program *prog,
  struct pipe_shader_state *tgsi,
  struct st_basic_variant **variants,
  const struct st_basic_variant_key

[Mesa-dev] [PATCH 3/7] mesa/st: support lowering multi-planar YUV

2016-09-08 Thread Rob Clark

Support multi-planar YUV for external EGLImages (currently just in the
dma-buf import path) by lowering to one texture fetch per plane, with CSC
done in the shader.

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/util/u_inlines.h  |   4 +-
 src/gallium/include/pipe/p_state.h  |   9 +++
 src/gallium/include/state_tracker/st_api.h  |   3 +
 src/gallium/state_trackers/dri/dri2.c   | 119 +++-
 src/gallium/state_trackers/dri/dri_screen.c |  11 +++
 src/mesa/main/mtypes.h  |  16 
 src/mesa/program/ir_to_mesa.cpp |   1 +
 src/mesa/state_tracker/st_atom_sampler.c|  41 +-
 src/mesa/state_tracker/st_atom_shader.c |   3 +
 src/mesa/state_tracker/st_atom_texture.c|  58 ++
 src/mesa/state_tracker/st_cb_eglimage.c |  18 +
 src/mesa/state_tracker/st_context.c |   7 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp   |   1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp  |   4 +
 src/mesa/state_tracker/st_manager.c |   1 +
 src/mesa/state_tracker/st_program.c |  35 
 src/mesa/state_tracker/st_program.h |  37 +
 src/mesa/state_tracker/st_texture.h |  21 +
 18 files changed, 362 insertions(+), 27 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_inlines.h 
b/src/gallium/auxiliary/util/u_inlines.h
index c2a0b08..b7b8313 100644
--- a/src/gallium/auxiliary/util/u_inlines.h
+++ b/src/gallium/auxiliary/util/u_inlines.h
@@ -136,8 +136,10 @@ pipe_resource_reference(struct pipe_resource **ptr, struct 
pipe_resource *tex)
struct pipe_resource *old_tex = *ptr;
 
if (pipe_reference_described(&(*ptr)->reference, &tex->reference, 
-
(debug_reference_descriptor)debug_describe_resource))
+
(debug_reference_descriptor)debug_describe_resource)) {
+  pipe_resource_reference(&old_tex->next, NULL);
   old_tex->screen->resource_destroy(old_tex->screen, old_tex);
+   }
*ptr = tex;
 }
 
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index ebd0337..4a88da6 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -498,6 +498,15 @@ struct pipe_resource
 
unsigned bind;/**< bitmask of PIPE_BIND_x */
unsigned flags;   /**< bitmask of PIPE_RESOURCE_FLAG_x */
+
+   /**
+* For planar images, ie. YUV EGLImage external, etc, pointer to the
+* next plane.
+*
+* TODO might be useful for dealing w/ z32s8 too, since at least a
+* couple drivers split these out into separate buffers internally.
+*/
+   struct pipe_resource *next;
 };
 
 
diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index 21d5177..06abfc5 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -200,6 +200,9 @@ struct st_egl_image
/* this is owned by the caller */
struct pipe_resource *texture;
 
+   /* format only differs from texture->format for multi-planar (YUV): */
+   enum pipe_format format;
+
unsigned level;
unsigned layer;
 };
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 43a5df1..a22e7ee 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -83,6 +83,21 @@ static int convert_fourcc(int format, int *dri_components_p)
   format = __DRI_IMAGE_FORMAT_GR88;
   dri_components = __DRI_IMAGE_COMPONENTS_RG;
   break;
+   /*
+* For multi-planar YUV formats, we return the format of the first
+* plane only.  Since there is only one caller which supports multi-
+* planar YUV it gets to figure out the remaining planes on its
+* own.
+*/
+   case __DRI_IMAGE_FOURCC_YUV420:
+   case __DRI_IMAGE_FOURCC_YVU420:
+  format = __DRI_IMAGE_FORMAT_R8;
+  dri_components = __DRI_IMAGE_COMPONENTS_Y_U_V;
+  break;
+   case __DRI_IMAGE_FOURCC_NV12:
+  format = __DRI_IMAGE_FORMAT_R8;
+  dri_components = __DRI_IMAGE_COMPONENTS_Y_UV;
+  break;
default:
   return -1;
}
@@ -90,6 +105,11 @@ static int convert_fourcc(int format, int *dri_components_p)
return format;
 }
 
+/* NOTE this probably isn't going to do the right thing for YUV images
+ * (but I think the same can be said for intel_query_image()).  I think
+ * only needed for exporting dmabuf's, so I think I won't lose much
+ * sleep over it.
+ */
 static int convert_to_fourcc(int format)
 {
switch(format) {
@@ -762,14 +782,16 @@ dri2_lookup_egl_image(struct dri_screen *screen, void 
*handle)
 static __DRIimage *
 dri2_create_image_from_winsys(__DRIscreen *_screen,
   int width, int height, int format,
-  struct winsys_handle *whandle,
+  int num_handles, struct winsys_handle *whandle,

[Mesa-dev] [PATCH 7/7] mesa/st: support for YUV in VS/VS/GS/TCS/TEC..

2016-09-08 Thread Rob Clark

maybe we don't keep these bits?

Signed-off-by: Rob Clark 
---
 src/mesa/state_tracker/st_atom_shader.c |  2 ++
 src/mesa/state_tracker/st_context.c | 20 +++
 src/mesa/state_tracker/st_program.c | 63 ++---
 src/mesa/state_tracker/st_program.h |  4 +++
 4 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_shader.c 
b/src/mesa/state_tracker/st_atom_shader.c
index f0970ae..b340609 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -196,6 +196,8 @@ update_vp( struct st_context *st )
VARYING_SLOT_BFC0 |
VARYING_SLOT_BFC1));
 
+   key.external = st_get_external_sampler_key(st, &stvp->Base.Base);
+
st->vp_variant = st_get_vp_variant(st, stvp, &key);
 
st_reference_vertprog(st, &st->vp, stvp);
diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 2571fae..9d62cf7 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -259,10 +259,30 @@ void st_invalidate_state(struct gl_context * ctx, 
GLbitfield new_state)
(ST_NEW_SAMPLER_VIEWS |
 ST_NEW_SAMPLERS |
 ST_NEW_IMAGE_UNITS);
+  if (ctx->VertexProgram._Current &&
+  ctx->VertexProgram._Current->Base.ExternalSamplersUsed) {
+ st->dirty |= ST_NEW_VS_STATE;
+  }
   if (ctx->FragmentProgram._Current &&
   ctx->FragmentProgram._Current->Base.ExternalSamplersUsed) {
  st->dirty |= ST_NEW_FS_STATE;
   }
+  if (ctx->GeometryProgram._Current &&
+  ctx->GeometryProgram._Current->Base.ExternalSamplersUsed) {
+ st->dirty |= ST_NEW_GS_STATE;
+  }
+  if (ctx->ComputeProgram._Current &&
+  ctx->ComputeProgram._Current->Base.ExternalSamplersUsed) {
+ st->dirty |= ST_NEW_CS_STATE;
+  }
+  if (ctx->TessCtrlProgram._Current &&
+  ctx->TessCtrlProgram._Current->Base.ExternalSamplersUsed) {
+ st->dirty |= ST_NEW_TCS_STATE;
+  }
+  if (ctx->TessEvalProgram._Current &&
+  ctx->TessEvalProgram._Current->Base.ExternalSamplersUsed) {
+ st->dirty |= ST_NEW_TES_STATE;
+  }
}
 
if (new_state & _NEW_PROGRAM_CONSTANTS)
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index f8be835..1c8cf89 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -489,8 +489,23 @@ st_create_vp_variant(struct st_context *st,
   if (key->passthrough_edgeflags)
  NIR_PASS_V(vpv->tgsi.ir.nir, nir_lower_passthrough_edgeflags);
 
+  if (unlikely(key->external.lower_nv12 || key->external.lower_iyuv)) {
+ nir_lower_tex_options options = {0};
+ options.lower_y_uv_external = key->external.lower_nv12;
+ options.lower_y_u_v_external = key->external.lower_iyuv;
+ NIR_PASS_V(vpv->tgsi.ir.nir, nir_lower_tex, &options);
+  }
+
   st_finalize_nir(st, &stvp->Base.Base, vpv->tgsi.ir.nir);
 
+  if (unlikely(key->external.lower_nv12 || key->external.lower_iyuv)) {
+ /* This pass needs to happen *after* nir_lower_sampler */
+ NIR_PASS_V(vpv->tgsi.ir.nir, st_nir_lower_tex_src_plane,
+~stvp->Base.Base.SamplersUsed,
+key->external.lower_nv12,
+key->external.lower_iyuv);
+  }
+
   vpv->driver_shader = pipe->create_vs_state(pipe, &vpv->tgsi);
   /* driver takes ownership of IR: */
   vpv->tgsi.ir.nir = NULL;
@@ -518,6 +533,21 @@ st_create_vp_variant(struct st_context *st,
  fprintf(stderr, "mesa: cannot emulate deprecated features\n");
}
 
+   if (unlikely(key->external.lower_nv12 || key->external.lower_iyuv)) {
+  const struct tgsi_token *tokens;
+
+  tokens = st_tgsi_lower_yuv(vpv->tgsi.tokens,
+ ~stvp->Base.Base.SamplersUsed,
+ key->external.lower_nv12,
+ key->external.lower_iyuv);
+  if (tokens) {
+ tgsi_free_tokens(vpv->tgsi.tokens);
+ vpv->tgsi.tokens = tokens;
+  } else {
+ fprintf(stderr, "mesa: cannot create a shader for 
samplerExternalOES\n");
+  }
+   }
+
if (ST_DEBUG & DEBUG_TGSI) {
   tgsi_dump(vpv->tgsi.tokens, 0);
   debug_printf("\n");
@@ -1550,16 +1580,25 @@ st_get_basic_variant(struct st_context *st,
   /* create new */
   v = CALLOC_STRUCT(st_basic_variant);
   if (v) {
+ struct pipe_shader_state cso = *tgsi;
+
+ if (unlikely(key->external.lower_nv12 || key->external.lower_iyuv)) {
+assert(cso.type == PIPE_SHADER_IR_TGSI);
+cso.tokens = st_tgsi_lower_yuv(cso.tokens, ~prog->SamplersUsed,
+   key->external.lower_nv12,
+   key->extern

[Mesa-dev] [PATCH 00/11] add support for ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset

Hi,

This series implements ARB_compute_variable_group_size written against GL 4.3.
This extension allows dispatching compute work with a variable work group size
via a new function called glDispatchComputeGroupSizeARB().

Because this extension is pretty similar to ARB_compute_shader, all Gallium
drivers which already support compute shaders will expose
ARB_compute_variable_group_size with this series.
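
As a rough illustration of what a variable group size dispatch has to check
before it reaches the driver, here is a minimal sketch. It is not the code
from this series: the struct and function names are made up for the example,
and the limits stand in for what GL_MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB and
GL_MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB would report. The rules assumed
here (each dimension non-zero and within its per-axis limit, product within
the invocation limit) are my reading of the extension spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two implementation limits; a real
 * implementation would query them from the context/driver. */
struct variable_group_limits {
   uint32_t max_size[3];       /* per-axis limit */
   uint32_t max_invocations;   /* limit on x * y * z */
};

/* Returns true if the requested local size looks dispatchable. */
static bool
variable_group_size_is_valid(const struct variable_group_limits *limits,
                             const uint32_t group_size[3])
{
   uint64_t total = 1;

   for (int i = 0; i < 3; i++) {
      if (group_size[i] == 0 || group_size[i] > limits->max_size[i])
         return false;

      /* total fits in 32 bits here, so the 64-bit product cannot wrap. */
      total *= group_size[i];
      if (total > limits->max_invocations)
         return false;
   }

   return true;
}
```

With example limits of 512x512x64 and 512 invocations, an 8x8x8 group would
pass, while 512x2x1 would be rejected for exceeding the invocation limit.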

I did write a bunch of piglit tests, have a look here if you want:
https://lists.freedesktop.org/archives/piglit/2016-September/020755.html

All tests pass on Fermi (GF119) as well as all previous compute shader tests.

Marek, Nicolai and other AMD folks, I don't know if radeonsi will need a fix
somewhere for handling a variable work group size, but as I don't have the
hardware, I can't test. Let me know if something needs to be slightly updated.

Please review,
Thanks!

Samuel Pitoiset (11):
  glapi: add entry points for GL_ARB_compute_variable_group_size
  mesa/main: add support for ARB_compute_variable_groups_size
  glsl: add enable flags for ARB_compute_variable_group_size
  glsl: process local_size_variable input qualifier
  glsl: reject compute shaders with fixed and variable local size
  glsl/linker: handle errors when a variable local size is used
  glsl: add gl_LocalGroupSizeARB as a system value
  st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE
  st/mesa: add support for dispatching a variable local size
  st/mesa: expose ARB_compute_variable_group_size
  nv50/ir: use 1024 threads/block for variable local size

 src/compiler/glsl/ast.h|  5 ++
 src/compiler/glsl/ast_to_hir.cpp   | 14 
 src/compiler/glsl/ast_type.cpp |  6 ++
 src/compiler/glsl/builtin_variables.cpp|  2 +
 src/compiler/glsl/glsl_parser.yy   | 13 +++
 src/compiler/glsl/glsl_parser_extras.cpp   |  6 ++
 src/compiler/glsl/glsl_parser_extras.h |  8 ++
 src/compiler/glsl/linker.cpp   | 23 +-
 src/compiler/glsl/standalone.cpp   |  4 +
 src/compiler/glsl/standalone_scaffolding.cpp   |  5 ++
 src/compiler/shader_enums.h|  1 +
 .../drivers/nouveau/codegen/nv50_ir_target.h   |  3 +-
 .../glapi/gen/ARB_compute_variable_group_size.xml  | 25 ++
 src/mapi/glapi/gen/Makefile.am |  1 +
 src/mapi/glapi/gen/gl_API.xml  |  2 +
 src/mesa/main/api_validate.c   | 94 ++
 src/mesa/main/api_validate.h   |  4 +
 src/mesa/main/compute.c| 25 ++
 src/mesa/main/compute.h|  5 ++
 src/mesa/main/context.c|  6 ++
 src/mesa/main/dd.h |  9 +++
 src/mesa/main/extensions_table.h   |  1 +
 src/mesa/main/get.c| 12 +++
 src/mesa/main/get_hash_params.py   |  3 +
 src/mesa/main/mtypes.h | 23 +-
 src/mesa/main/shaderapi.c  |  1 +
 src/mesa/main/shaderobj.c  |  2 +
 src/mesa/main/tests/dispatch_sanity.cpp|  3 +
 src/mesa/state_tracker/st_cb_compute.c | 15 +++-
 src/mesa/state_tracker/st_extensions.c | 13 +++
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
 31 files changed, 329 insertions(+), 7 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_compute_variable_group_size.xml

-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 .../glapi/gen/ARB_compute_variable_group_size.xml  | 25 ++
 src/mapi/glapi/gen/Makefile.am |  1 +
 src/mapi/glapi/gen/gl_API.xml  |  2 ++
 src/mesa/main/compute.c|  8 +++
 src/mesa/main/compute.h|  5 +
 src/mesa/main/tests/dispatch_sanity.cpp|  3 +++
 6 files changed, 44 insertions(+)
 create mode 100644 src/mapi/glapi/gen/ARB_compute_variable_group_size.xml

diff --git a/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml 
b/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml
new file mode 100644
index 000..b21c52f
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_compute_variable_group_size.xml
@@ -0,0 +1,25 @@
+
+
+
+
+
+
+
+
+
+  
+  
+  
+  
+
+  
+
+
+
+
+
+
+  
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 0d7c338..49fdfe3 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -117,6 +117,7 @@ API_XML = \
ARB_color_buffer_float.xml \
ARB_compressed_texture_pixel_storage.xml \
ARB_compute_shader.xml \
+   ARB_compute_variable_group_size.xml \
ARB_copy_buffer.xml \
ARB_copy_image.xml \
ARB_debug_output.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index c39aa22..9ad3b60 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8258,6 +8258,8 @@
 
 http://www.w3.org/2001/XInclude"/>
 
+http://www.w3.org/2001/XInclude"/>
+
 
 
 http://www.w3.org/2001/XInclude"/>
diff --git a/src/mesa/main/compute.c b/src/mesa/main/compute.c
index b71430f..b052bae 100644
--- a/src/mesa/main/compute.c
+++ b/src/mesa/main/compute.c
@@ -60,3 +60,11 @@ _mesa_DispatchComputeIndirect(GLintptr indirect)
 
ctx->Driver.DispatchComputeIndirect(ctx, indirect);
 }
+
+void GLAPIENTRY
+_mesa_DispatchComputeGroupSizeARB(GLuint num_groups_x, GLuint num_groups_y,
+  GLuint num_groups_z, GLuint group_size_x,
+  GLuint group_size_y, GLuint group_size_z)
+{
+
+}
diff --git a/src/mesa/main/compute.h b/src/mesa/main/compute.h
index 0cc034f..8018bbb 100644
--- a/src/mesa/main/compute.h
+++ b/src/mesa/main/compute.h
@@ -35,4 +35,9 @@ _mesa_DispatchCompute(GLuint num_groups_x,
 extern void GLAPIENTRY
 _mesa_DispatchComputeIndirect(GLintptr indirect);
 
+extern void GLAPIENTRY
+_mesa_DispatchComputeGroupSizeARB(GLuint num_groups_x, GLuint num_groups_y,
+  GLuint num_groups_z, GLuint group_size_x,
+  GLuint group_size_y, GLuint group_size_z);
+
 #endif
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 42fe61a..7faeabe 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -942,6 +942,9 @@ const struct function common_desktop_functions_possible[] = 
{
{ "glDispatchCompute", 43, -1 },
{ "glDispatchComputeIndirect", 43, -1 },
 
+   /* GL_ARB_compute_variable_group_size */
+   { "glDispatchComputeGroupSizeARB", 43, -1 },
+
/* GL_EXT_polygon_offset_clamp */
{ "glPolygonOffsetClampEXT", 11, -1 },
 
-- 
2.9.3


[Mesa-dev] [PATCH 2/7] mesa/st: add nir pass to lower tex_src_plane

2016-09-08 Thread Rob Clark

Signed-off-by: Rob Clark 
---
Note: the alternative is to fold this logic into nir_lower_tex (ie.
make nir_lower_tex support either multiple samplers or the extra
nir_tex_src_plane arg).  That probably means changing around the
order so that nir_lower_tex runs after nir_lower_samplers, but I
guess that is not a big deal.

But the nice thing about having a separate mesa/st specific pass
for this is it keeps the logic about which samplers to use for
the extra planes (which must be aligned with mesa/st's sampler
state/view management) inside mesa/st.
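
To make the "which samplers to use for the extra planes" part concrete, here
is a standalone sketch of the assignment, assuming extra planes simply take
the lowest unused sampler slots. It mirrors the idea of the pass rather than
its exact code: bit_scan() is a local stand-in for Mesa's u_bit_scan(), and
MAX_SAMPLERS is a placeholder for PIPE_MAX_SAMPLERS.

```c
#include <assert.h>

#define MAX_SAMPLERS 32   /* placeholder for PIPE_MAX_SAMPLERS */

/* Return the index of the lowest set bit and clear it.
 * The mask must be non-zero. */
static unsigned
bit_scan(unsigned *mask)
{
   unsigned i = 0;

   assert(*mask);
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* For each sampler that needs lowering, pick free slots for its extra
 * planes: map[y][0] is the U (or UV) plane, map[y][1] the V plane. */
static void
assign_extra_samplers(unsigned char map[MAX_SAMPLERS][2],
                      unsigned lower_2plane, unsigned lower_3plane,
                      unsigned free_slots)
{
   unsigned mask = lower_2plane | lower_3plane;

   while (mask) {
      unsigned y_samp = bit_scan(&mask);

      map[y_samp][0] = (unsigned char)bit_scan(&free_slots);
      if (lower_3plane & (1u << y_samp))
         map[y_samp][1] = (unsigned char)bit_scan(&free_slots);
   }
}
```

For example, if samplers 0 and 2 are in use, with 0 being 2-plane and 2 being
3-plane, then passing free_slots = ~0x5 puts the UV plane of sampler 0 in
slot 1 and the U and V planes of sampler 2 in slots 3 and 4.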

 src/mesa/Makefile.sources  |   1 +
 src/mesa/state_tracker/st_nir.h|   3 +
 .../state_tracker/st_nir_lower_tex_src_plane.c | 120 +
 3 files changed, 124 insertions(+)
 create mode 100644 src/mesa/state_tracker/st_nir_lower_tex_src_plane.c

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 653d615..611062f 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -505,6 +505,7 @@ STATETRACKER_FILES = \
state_tracker/st_mesa_to_tgsi.h \
state_tracker/st_nir.h \
state_tracker/st_nir_lower_builtin.c \
+   state_tracker/st_nir_lower_tex_src_plane.c \
state_tracker/st_pbo.c \
state_tracker/st_pbo.h \
state_tracker/st_program.c \
diff --git a/src/mesa/state_tracker/st_nir.h b/src/mesa/state_tracker/st_nir.h
index 523a274..28d375c 100644
--- a/src/mesa/state_tracker/st_nir.h
+++ b/src/mesa/state_tracker/st_nir.h
@@ -34,6 +34,9 @@ extern "C" {
 struct nir_shader;
 
 void st_nir_lower_builtin(struct nir_shader *shader);
+void st_nir_lower_tex_src_plane(struct nir_shader *shader, unsigned free_slots,
+unsigned lower_2plane, unsigned lower_3plane);
+
 struct nir_shader * st_glsl_to_nir(struct st_context *st, struct gl_program 
*prog,
struct gl_shader_program *shader_program,
gl_shader_stage stage);
diff --git a/src/mesa/state_tracker/st_nir_lower_tex_src_plane.c 
b/src/mesa/state_tracker/st_nir_lower_tex_src_plane.c
new file mode 100644
index 000..1dcaa31
--- /dev/null
+++ b/src/mesa/state_tracker/st_nir_lower_tex_src_plane.c
@@ -0,0 +1,120 @@
+/*
+ * Copyright © 2016 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
THE
+ * SOFTWARE.
+ */
+
+/* Lowers the additional tex_src_plane src, generated by nir_lower_tex
+ * for planar YUV textures, into separate samplers, matching the logic
+ * that mesa/st uses to insert additional sampler view/state (since both
+ * sides need to agree).
+ *
+ * This should run after nir_lower_samplers.
+ */
+
+#include "compiler/nir/nir.h"
+#include "st_nir.h"
+
+typedef struct {
+   unsigned lower_2plane;
+   unsigned lower_3plane;
+
+   /* Maps a primary sampler (used for Y) to the U or UV sampler.  In
+* case of 3-plane YUV format, the V plane is next sampler after U.
+*/
+   unsigned char sampler_map[PIPE_MAX_SAMPLERS][2];
+} lower_tex_src_state;
+
+static void
+assign_extra_samplers(lower_tex_src_state *state, unsigned free_slots)
+{
+   unsigned mask = state->lower_2plane | state->lower_3plane;
+
+   while (mask) {
+  unsigned extra, y_samp = u_bit_scan(&mask);
+
+  extra = u_bit_scan(&free_slots);
+  state->sampler_map[y_samp][0] = extra;
+
+  if (state->lower_3plane & (1 << y_samp)) {
+ extra = u_bit_scan(&free_slots);
+ state->sampler_map[y_samp][1] = extra;
+  }
+   }
+}
+
+static void
+lower_tex_src_plane_block(lower_tex_src_state *state, nir_block *block)
+{
+   nir_foreach_instr(instr, block) {
+  if (instr->type != nir_instr_type_tex)
+ continue;
+
+  nir_tex_instr *tex = nir_instr_as_tex(instr);
+  int plane_index = nir_tex_instr_src_index(tex, nir_tex_src_plane);
+
+  if (plane_index < 0)
+ continue;
+
+  nir_const_value *plane = 
nir_src_a

[Mesa-dev] [PATCH 1/7] mesa/st: add lowering pass for YUV samplers

2016-09-08 Thread Rob Clark

Signed-off-by: Rob Clark 
---
 src/mesa/Makefile.sources  |   2 +
 src/mesa/state_tracker/st_tgsi_lower_yuv.c | 447 +
 src/mesa/state_tracker/st_tgsi_lower_yuv.h |  34 +++
 3 files changed, 483 insertions(+)
 create mode 100644 src/mesa/state_tracker/st_tgsi_lower_yuv.c
 create mode 100644 src/mesa/state_tracker/st_tgsi_lower_yuv.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 363b133..653d615 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -513,6 +513,8 @@ STATETRACKER_FILES = \
state_tracker/st_scissor.h \
state_tracker/st_texture.c \
state_tracker/st_texture.h \
+   state_tracker/st_tgsi_lower_yuv.c \
+   state_tracker/st_tgsi_lower_yuv.h \
state_tracker/st_vdpau.c \
state_tracker/st_vdpau.h
 
diff --git a/src/mesa/state_tracker/st_tgsi_lower_yuv.c 
b/src/mesa/state_tracker/st_tgsi_lower_yuv.c
new file mode 100644
index 000..e346b97
--- /dev/null
+++ b/src/mesa/state_tracker/st_tgsi_lower_yuv.c
@@ -0,0 +1,447 @@
+/*
+ * Copyright © 2016 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
THE
+ * SOFTWARE.
+ */
+
+#include 
+
+#include "st_tgsi_lower_yuv.h"
+#include "tgsi/tgsi_transform.h"
+#include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_dump.h"
+#include "util/u_debug.h"
+
+#include "util/bitscan.h"
+
+struct tgsi_yuv_transform {
+   struct tgsi_transform_context base;
+   struct tgsi_shader_info info;
+   struct tgsi_full_src_register imm[4];
+   struct {
+  struct tgsi_full_src_register src;
+  struct tgsi_full_dst_register dst;
+   } tmp[2];
+#define A 0
+#define B 1
+
+   /* Maps a primary sampler (used for Y) to the U or UV sampler.  In
+* case of 3-plane YUV format, the V plane is next sampler after U.
+*/
+   unsigned char sampler_map[PIPE_MAX_SAMPLERS][2];
+
+   bool first_instruction_emitted;
+   unsigned free_slots;
+   unsigned lower_nv12;
+   unsigned lower_iyuv;
+};
+
+static inline struct tgsi_yuv_transform *
+tgsi_yuv_transform(struct tgsi_transform_context *tctx)
+{
+   return (struct tgsi_yuv_transform *)tctx;
+}
+
+static void
+reg_dst(struct tgsi_full_dst_register *dst,
+const struct tgsi_full_dst_register *orig_dst, unsigned wrmask)
+{
+   *dst = *orig_dst;
+   dst->Register.WriteMask &= wrmask;
+   assert(dst->Register.WriteMask);
+}
+
+static inline void
+get_swiz(unsigned *swiz, const struct tgsi_src_register *src)
+{
+   swiz[0] = src->SwizzleX;
+   swiz[1] = src->SwizzleY;
+   swiz[2] = src->SwizzleZ;
+   swiz[3] = src->SwizzleW;
+}
+
+static void
+reg_src(struct tgsi_full_src_register *src,
+const struct tgsi_full_src_register *orig_src,
+unsigned sx, unsigned sy, unsigned sz, unsigned sw)
+{
+   unsigned swiz[4];
+   get_swiz(swiz, &orig_src->Register);
+   *src = *orig_src;
+   src->Register.SwizzleX = swiz[sx];
+   src->Register.SwizzleY = swiz[sy];
+   src->Register.SwizzleZ = swiz[sz];
+   src->Register.SwizzleW = swiz[sw];
+}
+
+#define TGSI_SWIZZLE__ TGSI_SWIZZLE_X  /* don't-care value! */
+#define SWIZ(x,y,z,w) TGSI_SWIZZLE_ ## x, TGSI_SWIZZLE_ ## y,   \
+  TGSI_SWIZZLE_ ## z, TGSI_SWIZZLE_ ## w
+
+static inline struct tgsi_full_instruction
+tex_instruction(unsigned samp)
+{
+   struct tgsi_full_instruction inst;
+
+   inst = tgsi_default_full_instruction();
+   inst.Instruction.Opcode = TGSI_OPCODE_TEX;
+   inst.Instruction.Texture = 1;
+   inst.Texture.Texture = TGSI_TEXTURE_2D;
+   inst.Instruction.NumDstRegs = 1;
+   inst.Instruction.NumSrcRegs = 2;
+   inst.Src[1].Register.File  = TGSI_FILE_SAMPLER;
+   inst.Src[1].Register.Index = samp;
+
+   return inst;
+}
+
+static inline struct tgsi_full_instruction
+mov_instruction(void)
+{
+   struct tgsi_full_instruction inst;
+
+   inst = tgsi_default_full_instruction();
+   inst.Instruction.Opcode = TGSI_OPCODE_MOV;
+   inst.Instruction.Saturate = 0;
+   inst.Instruction.NumDstRegs

[Mesa-dev] [PATCH 0/7] mesa/st: support for YUV EGLImages

2016-09-08 Thread Rob Clark

Since original RFC:
 + fixed up TODOs, review comments, and added support for NIR path as
   well as TGSI
 + Change how we pick which samplers to use for the U/V plane(s), to
   fill in any holes in SamplersUsed
 + changed CSC constants to match ITU-R BT.601 conversion (to align
   with what nir_lower_tex does and piglit expects)
 + added support for YVU420 (simply swaps 2nd and 3rd plane and treats
   as YUV420), since android wants this format for sw decoders
 + Added support for the rest of the shader stages, although as far
   as I can tell, android only needs it in FS (and the use-case for
   using a YUV external EGLImage in, for example, a VS is rather
   dubious).  Maybe we want one or two of the prep/cleanup patches
   anyways, but I'm fine with stashing the last patches on a branch
   somewhere until someone comes up with an actual use-case.

From original cover-letter:

So, android and blob GLES drivers were left unchecked for too long, and
now we are stuck with this annoying OES_EGL_image_external extension and
the expectation that the driver can import multi-planar YUV buffers (via,
for example, EGL_EXT_image_dma_buf_import), despite the fact that nearly
all hardware out there needs this lowered to multiple samplers (one per
plane) and colorspace conversion in the shader.  It would be nice to
ignore this (mis)feature, except that by now it is required by android.

This patchset adds a TGSI lowering pass to handle 2 or 3 planar YUV.
And associated logic in mesa/st to append additional sampler view/state
to handle the additional planes.  There is no change needed in the
individual gallium drivers (provided you support R8 and R8G8).  The
extra logic and shader variants only kick in when the shader uses
samplerExternalOES.
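
For reference, the colorspace conversion that the lowered shader performs
after fetching each plane amounts to something like the following CPU-side
sketch. The coefficients are the commonly quoted limited-range ITU-R BT.601
values; I am assuming they are close to what the lowering pass uses, since
the exact constants are not shown in this cover letter. The function names
are made up for the example.

```c
#include <stdint.h>

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t
clamp_u8(float v)
{
   if (v < 0.0f)
      return 0;
   if (v > 255.0f)
      return 255;
   return (uint8_t)(v + 0.5f);
}

/* Convert one limited-range BT.601 Y'CbCr sample to 8-bit RGB.
 * y comes from the first plane; u and v come from the second plane
 * (NV12) or from the second and third planes (YUV420). */
static void
yuv_to_rgb_bt601(uint8_t y, uint8_t u, uint8_t v, uint8_t rgb[3])
{
   const float yf = 1.164f * ((float)y - 16.0f);
   const float uf = (float)u - 128.0f;
   const float vf = (float)v - 128.0f;

   rgb[0] = clamp_u8(yf + 1.596f * vf);
   rgb[1] = clamp_u8(yf - 0.392f * uf - 0.813f * vf);
   rgb[2] = clamp_u8(yf + 2.017f * uf);
}
```

In the shader the same arithmetic runs on normalized floats, with the Y
sample fetched from the original sampler and the chroma samples from the
extra samplers added for the other plane(s).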

I've got some simple test code, which uses gbm to create dmabuf's
so it should run on any driver:

 https://github.com/robclark/kmscube/commits/yuv-cube

You can find the latest versions of the patches here:

 https://github.com/freedreno/mesa/commits/wip-yuv


Rob Clark (7):
  mesa/st: add lowering pass for YUV samplers
  mesa/st: add nir pass to lower tex_src_plane
  mesa/st: support lowering multi-planar YUV
  mesa/st: a bit of basic_variant refactoring
  mesa/st: pass st_compute_program to st_get_cp_variant
  mesa/st: pass prog to st_get_basic_variant()
  mesa/st: support for YUV in VS/VS/GS/TCS/TEC..

 src/gallium/auxiliary/util/u_inlines.h |   4 +-
 src/gallium/include/pipe/p_state.h |   9 +
 src/gallium/include/state_tracker/st_api.h |   3 +
 src/gallium/state_trackers/dri/dri2.c  | 119 +-
 src/gallium/state_trackers/dri/dri_screen.c|  11 +
 src/mesa/Makefile.sources  |   3 +
 src/mesa/main/mtypes.h |  16 +
 src/mesa/program/ir_to_mesa.cpp|   1 +
 src/mesa/state_tracker/st_atom_sampler.c   |  41 +-
 src/mesa/state_tracker/st_atom_shader.c|  27 +-
 src/mesa/state_tracker/st_atom_texture.c   |  58 +++
 src/mesa/state_tracker/st_cb_eglimage.c|  18 +
 src/mesa/state_tracker/st_context.c|  27 +-
 src/mesa/state_tracker/st_glsl_to_nir.cpp  |   1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   4 +
 src/mesa/state_tracker/st_manager.c|   1 +
 src/mesa/state_tracker/st_nir.h|   3 +
 .../state_tracker/st_nir_lower_tex_src_plane.c | 120 ++
 src/mesa/state_tracker/st_program.c| 150 +--
 src/mesa/state_tracker/st_program.h|  60 ++-
 src/mesa/state_tracker/st_texture.h|  21 +
 src/mesa/state_tracker/st_tgsi_lower_yuv.c | 447 +
 src/mesa/state_tracker/st_tgsi_lower_yuv.h |  34 ++
 23 files changed, 1109 insertions(+), 69 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_nir_lower_tex_src_plane.c
 create mode 100644 src/mesa/state_tracker/st_tgsi_lower_yuv.c
 create mode 100644 src/mesa/state_tracker/st_tgsi_lower_yuv.h

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 00/23] Remove the hash table in mesa/program

2016-09-08 Thread Thomas Helland

2016-08-23 1:38 GMT+02:00 Timothy Arceri :
> Did you get my previous reply about this series breaking a whole bunch
> of tests in piglit/cts? It looked like an issue with builtin
> functions.
>
> I attached the results file from jenkins which was rather large so it
> didn't get to the list but I also sent it to you directly.

Oh, wow. My GMail is really messing with these replies to threads.
I've gotta do some research on that. No, I did not see that reply until now.
I'll have a look at it and see if I can find the issue at stake here.
Thanks for looking at the series!

Regards,
Thomas

[Mesa-dev] [PATCH 2/2] i915g: add dma-buf support to i915_drm_buffer_get_handle

2016-09-08 Thread Nicholas Bishop

The implementation of i915_drm_buffer_get_handle now handles
DRM_API_HANDLE_TYPE_FD by calling drm_intel_bo_gem_export_to_prime,
the export counterpart of the drm_intel_bo_gem_create_from_prime call
that intel_winsys_import_handle uses.

Tested by successfully running Chrome's ozone_demo [1] with the
ozone-gbm backend on an Intel Pineview M machine. Without this change
it fails while trying to create a DMA-BUF.

[1] https://chromium.googlesource.com/chromium/src.git/+/master/ui/ozone/demo/ozone_demo.cc

Signed-off-by: Nicholas Bishop 
---
 src/gallium/winsys/i915/drm/i915_drm_buffer.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/winsys/i915/drm/i915_drm_buffer.c 
b/src/gallium/winsys/i915/drm/i915_drm_buffer.c
index ba454ec..4080a08 100644
--- a/src/gallium/winsys/i915/drm/i915_drm_buffer.c
+++ b/src/gallium/winsys/i915/drm/i915_drm_buffer.c
@@ -153,6 +153,14 @@ i915_drm_buffer_get_handle(struct i915_winsys *iws,
   whandle->handle = buf->flink;
} else if (whandle->type == DRM_API_HANDLE_TYPE_KMS) {
   whandle->handle = buf->bo->handle;
+   } else if (whandle->type == DRM_API_HANDLE_TYPE_FD) {
+  int fd;
+  int err;
+
+  err = drm_intel_bo_gem_export_to_prime(buf->bo, &fd);
+  if (err)
+ return FALSE;
+  whandle->handle = fd;
} else {
   assert(!"unknown usage");
   return FALSE;
-- 
2.7.4


[Mesa-dev] [PATCH 1/2] gbm/dri2: propagate errors when creating a DMA-BUF fd

2016-09-08 Thread Nicholas Bishop

Changed dri2_query_image to check the return value of
resource_get_handle and return GL_FALSE if an error occurs. Similarly
changed gbm_dri_bo_get_fd to check the return value of queryImage and
return -1 (an invalid file descriptor) if an error occurs.

Updated the comment for gbm_bo_get_fd to say that -1 is returned if
an error occurs.

For reference this is an example callstack that should propagate the
error back to the user:

i915_drm_buffer_get_handle
i915_texture_get_handle
u_resource_get_handle_vtbl
dri2_query_image
gbm_dri_bo_get_fd
gbm_bo_get_fd

Signed-off-by: Nicholas Bishop 
---
 src/gallium/state_trackers/dri/dri2.c | 11 +++
 src/gbm/backends/dri/gbm_dri.c|  8 +---
 src/gbm/main/gbm.c|  3 ++-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 28f8078..c6260ba 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -979,10 +979,13 @@ dri2_query_image(__DRIimage *image, int attrib, int 
*value)
   return GL_TRUE;
case __DRI_IMAGE_ATTRIB_FD:
   whandle.type= DRM_API_HANDLE_TYPE_FD;
-  image->texture->screen->resource_get_handle(image->texture->screen,
- image->texture, &whandle, usage);
-  *value = whandle.handle;
-  return GL_TRUE;
+  if (image->texture->screen->resource_get_handle(image->texture->screen,
+ image->texture, &whandle, usage)) {
+ *value = whandle.handle;
+ return GL_TRUE;
+  } else {
+ return GL_FALSE;
+  }
case __DRI_IMAGE_ATTRIB_FORMAT:
   *value = image->dri_format;
   return GL_TRUE;
diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index c3626e3..54b293a 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -589,9 +589,11 @@ gbm_dri_bo_get_fd(struct gbm_bo *_bo)
if (bo->image == NULL)
   return -1;
 
-   dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_FD, &fd);
-
-   return fd;
+   if (dri->image->queryImage(bo->image, __DRI_IMAGE_ATTRIB_FD, &fd)) {
+  return fd;
+   } else {
+  return -1;
+   }
 }
 
 static void
diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c
index 95b4c2c..c3a2ec33 100644
--- a/src/gbm/main/gbm.c
+++ b/src/gbm/main/gbm.c
@@ -242,7 +242,8 @@ gbm_bo_get_handle(struct gbm_bo *bo)
  * descriptor.
 
  * \param bo The buffer object
- * \return Returns a file descriptor referring  to the underlying buffer
+ * \return Returns a file descriptor referring to the underlying buffer or -1
+ * if an error occurs.
  */
 GBM_EXPORT int
 gbm_bo_get_fd(struct gbm_bo *bo)
-- 
2.7.4
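
For illustration, the error propagation described in the commit message can be sketched with stand-in types. Everything named fake_* below is hypothetical and only mirrors the shape of the real call chain (resource_get_handle, dri2_query_image, gbm_bo_get_fd); it is not the Mesa code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; `can_export` models whether
 * the winsys is able to create a DMA-BUF fd for it. */
struct fake_bo {
   bool can_export;
   int  prime_fd;
};

/* Lowest layer: reports success or failure instead of leaving the
 * handle uninitialised. */
static bool
fake_resource_get_handle(const struct fake_bo *bo, int *handle)
{
   if (!bo->can_export)
      return false;
   *handle = bo->prime_fd;
   return true;
}

/* Middle layer, analogous to dri2_query_image(): writes *value only
 * when the layer below succeeded. */
static bool
fake_query_image_fd(const struct fake_bo *bo, int *value)
{
   int handle;

   if (!fake_resource_get_handle(bo, &handle))
      return false;
   *value = handle;
   return true;
}

/* Top layer, analogous to gbm_bo_get_fd(): -1 signals an error. */
static int
fake_bo_get_fd(const struct fake_bo *bo)
{
   int fd;

   if (!fake_query_image_fd(bo, &fd))
      return -1;
   return fd;
}
```

A caller would then treat a negative return value as "no DMA-BUF available" rather than passing an uninitialised fd further along.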


[Mesa-dev] [PATCH] nir/lower_tex: fix typo with sample_dim

2016-09-08 Thread Rob Clark

Numeric 2 is actually GLSL_SAMPLER_DIM_3D, which I don't think is what
was intended.

Signed-off-by: Rob Clark 
---
 src/compiler/nir/nir_lower_tex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index a405758..0efd443 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -211,7 +211,7 @@ sample_plane(nir_builder *b, nir_tex_instr *tex, int plane)
plane_tex->src[1].src = nir_src_for_ssa(nir_imm_int(b, plane));
plane_tex->src[1].src_type = nir_tex_src_plane;
plane_tex->op = nir_texop_tex;
-   plane_tex->sampler_dim = 2;
+   plane_tex->sampler_dim = GLSL_SAMPLER_DIM_2D;
plane_tex->dest_type = nir_type_float;
plane_tex->coord_components = 2;
 
-- 
2.7.4
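
To make the off-by-one visible, here is a small sketch using a local enum that is assumed to follow the same ordering as Mesa's glsl_sampler_dim (1D first, then 2D, then 3D). The SKETCH_* names and sampler_dim_name() are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Local copy of the first three sampler dimensionalities; the ordering is
 * an assumption modelled on the enum referenced in the commit message. */
enum sampler_dim_sketch {
   SKETCH_SAMPLER_DIM_1D = 0,
   SKETCH_SAMPLER_DIM_2D,
   SKETCH_SAMPLER_DIM_3D,
};

/* Map a raw integer to a readable name, as a debugging aid. */
static const char *
sampler_dim_name(int dim)
{
   switch (dim) {
   case SKETCH_SAMPLER_DIM_1D: return "1D";
   case SKETCH_SAMPLER_DIM_2D: return "2D";
   case SKETCH_SAMPLER_DIM_3D: return "3D";
   default:                    return "unknown";
   }
}
```

With this ordering a literal 2 names the 3D entry, which is why spelling out the enumerator is safer than a bare number.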


[Mesa-dev] [PATCH] doc: document GALLIUM_DRIVER

2016-09-08 Thread Christoph Haag

v2: Add dot at end of sentence
---
 docs/envvars.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 789f5e9..cf57ca5 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -217,6 +217,8 @@ Mesa EGL supports different sets of environment variables.  
See the
 disable for unencumbered viewing the rest of the time. For example, set
 GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
 Use kill -10  to toggle the hud as desired.
+GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for
+choosing one of the software renderers "softpipe", "llvmpipe" or "swr".
 GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
 rather than stderr.
 GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment
-- 
2.9.3


[Mesa-dev] [PATCH] doc: document GALLIUM_DRIVER

2016-09-08 Thread Christoph Haag

---
 docs/envvars.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 789f5e9..24f4d0d 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -217,6 +217,8 @@ Mesa EGL supports different sets of environment variables.  
See the
 disable for unencumbered viewing the rest of the time. For example, set
 GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).
 Use kill -10  to toggle the hud as desired.
+GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for
+choosing one of the software renderers "softpipe", "llvmpipe" or "swr"
 GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
 rather than stderr.
 GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment
-- 
2.9.3


Re: [Mesa-dev] [PATCH] Disable the code that allocates W|X memory on OpenBSD

2016-09-08 Thread Mark Kettenis

> From: Emil Velikov 
> Date: Thu, 8 Sep 2016 18:57:44 +0100
> 
> On 1 September 2016 at 18:23, Jonathan Gray  wrote:
> > OpenBSD now has strict W^X enforcement.  Processes that violate
> > the policy get killed by the kernel.  Don't attempt to use
> > executable memory on OpenBSD to avoid this.
> >
> > Patch from Mark Kettenis.

Jonathan, Emil,

That diff should probably not land "upstream".  We're still tweaking
the W^X policy, trying to find the code in ports that violates W^X.
We recently changed things around such that

mmap(..., PROT_WRITE | PROT_EXEC)

will simply fail instead of aborting the process.  Since Mesa has
fallback code for that case, the changes I made (at least most of
them) are no longer necessary.  But I didn't revert them yet in order
to reduce the amount of false positives.
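
A minimal sketch of the fallback pattern referred to above, assuming a POSIX system with anonymous mappings: request a writable and executable mapping, and take the slow path if the kernel refuses. try_alloc_wx(), choose_codegen_path() and the PATH_* names are hypothetical and not taken from Mesa.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Try to obtain a region that is both writable and executable.
 * Returns NULL when the system refuses such a mapping. */
static void *
try_alloc_wx(size_t size)
{
   void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

   return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical dispatcher: use runtime-generated code only if executable
 * memory was granted, otherwise fall back to the generic C path. */
enum codegen_path { PATH_GENERIC_C, PATH_RUNTIME_ASM };

static enum codegen_path
choose_codegen_path(size_t size, void **exec_mem)
{
   *exec_mem = try_alloc_wx(size);
   return *exec_mem ? PATH_RUNTIME_ASM : PATH_GENERIC_C;
}
```

Either outcome is valid; the point is only that a failed mmap selects the generic path instead of being treated as fatal.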

> > --- a/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> > +++ b/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> > @@ -69,6 +69,16 @@ static struct mem_block *exec_heap = NULL;
> >  static unsigned char *exec_mem = NULL;
> >
> >
> > +#ifdef __OpenBSD__
> > +
> > +static int
> > +init_heap(void)
> > +{
> > +   return 0;
> > +}
> Afaict this is equivalent to using the #else path in translate_sse.c.
> In general I'm wondering if we can/should not have a configure toggle
> for this. Then again please look below.

Right.  We basically prefer the slow code paths over the insecure
code paths.  There are ways around W^X enforcement though.  One
possibility is to generate the code in pages that are writable, but
not executable, and then flip the permissions to make the page
read-only but executable.  Other techniques include using aliased
pages where one mapping is writable and the other mapping is
executable.  Although we don't really want to encourage using aliasing
since it is a bit of a cheat.
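
The first technique (write into a non-executable page, then flip the permissions) might look roughly like the sketch below. It assumes POSIX mmap/mprotect; emit_code_wxorx() is a hypothetical helper and the sketch does not execute the copied bytes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Copy `len` bytes of generated code into a fresh mapping of `size` bytes.
 * The mapping is writable while it is filled and read-only plus executable
 * afterwards; it is never writable and executable at the same time.
 * Returns NULL on failure.  On architectures with separate instruction
 * caches a cache flush would also be needed before jumping to the code;
 * that step is omitted here. */
static void *
emit_code_wxorx(const void *code, size_t len, size_t size)
{
   void *p;

   if (len > size)
      return NULL;

   p = mmap(NULL, size, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   if (p == MAP_FAILED)
      return NULL;

   memcpy(p, code, len);

   /* Flip the page from writable to executable. */
   if (mprotect(p, size, PROT_READ | PROT_EXEC) != 0) {
      munmap(p, size);
      return NULL;
   }
   return p;
}
```

The cost of this approach is an extra mprotect call per batch of generated code, and the code can no longer be patched in place without flipping the permissions back.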

> > --- a/src/mapi/u_execmem.c
> > +++ b/src/mapi/u_execmem.c
> > @@ -45,8 +45,15 @@ static unsigned int head = 0;
> >
> >  static unsigned char *exec_mem = (unsigned char *)0;
> >
> > +#if defined(__OpenBSD__)
> >
> > -#if defined(__linux__) || defined(__OpenBSD__) || defined(_NetBSD__) || 
> > defined(__sun) || defined(__HAIKU__)
> > +static int
> > +init_map(void)
> > +{
> > +  return 0;
> > +}
> > +
> And this one to --disable-glx-tls and/or --disable-asm. Which reminds
> me of - have you guys tried enabling either/both of them. Has there
> been (m)any issues ?

Philip Guenther is working on enabling TLS support in OpenBSD.  He is
getting close, but he was grumbling that Mesa was uncovering some
toolchain bugs.

> For a long while the intent has been to use --enable-glx-tls by
> default and kill off the other codepaths. But with the write xor
> execute policy, it's going to be (close to) impossible.

Not entirely sure what TLS has to do with W^X enforcement.

> Have you guys considered a way to disable the restriction for usecases
> that need the behaviour ?

"no comment", erh, yes, there is a way.  But not for what we consider
the base of our OS.  Which includes X.  Mapping pages to be both
writable and executable is really bad from a security perspective.
And the goal is to eliminate that practice completely from the
ecosystem.  Please note that we're not alone here.  iOS for example
does not allow you to create mappings like that either.

Cheers,

Mark

Re: [Mesa-dev] swr: RDTSC StopCapture hangs

2016-09-08 Thread Rowley, Timothy O

I’ve seen the bucket hang on stop before, but thought it had been fixed.  
Is there a particular workload that was making this easy to 
trigger?

What workload of glmark2 were you looking at?  Watching it run, most of the 
scenes appear very light in geometry, which would cause more contention for 
work by the BE workers.

Thanks.

-Tim

> On Sep 8, 2016, at 6:55 AM, Victor Moya del Barrio 
>  wrote:
> 
> 
> Again playing with OpenSWR running on a many-core system (what OpenSWR was 
> originally intended for).
> 
> When RDTSC Buckets profiling is enabled, OpenSWR sometimes gets stuck in 
> StopCapture.
> 
> StopCapture waits for all the threads to close all the pending buckets 
> (it expects threads to be at bucket level 0), but the problem seems to be that 
> some threads get stuck at WorkerWaitForThreadEvent (level 1), and given that 
> StopCapture is called from SwrEndFrame (in the API thread), they are probably 
> never going to be awakened.
> 
> I don't have a clean solution here because I didn't study in detail how the 
> thread wait/sleep mechanism works (the real problem could be why some 
> threads are sleeping and others are not), so for now I just commented out the 
> code in StopCapture that expects all threads to be at level 0.
> 
> BTW, based on the RDTSC Buckets I see very poor utilization of the 
> threads in this system on glmark2.  The BE threads seem to spend most of their 
> cycles in a spin loop looking for work through draw contexts and tiles inside 
> draw contexts, rather than, say, sleeping until there is real work to be done 
> (though there probably should be work, as we want higher FPS); i.e. the 
> BE thread gets stuck between the WorkerWorkOnFifoBE and WorkerFoundWork buckets.
> 
> Thread 39 (WORKER)
>  %Tot   %Par   Cycles     CPE     NumEvent  CPE2  NumEvent2  Bucket
>  85.57  85.57  171485899  2243    76423     0     0          WorkerWorkOnFifoBE
>  24.45  28.58  49002850   125648  390       0     0          |-> WorkerFoundWork
> 
> 
> Victor


Re: [Mesa-dev] [PATCH v4] i965: Fix calculation of the image height at start level

2016-09-08 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

Sorry for ignoring you. :(

On Fri, Sep 2, 2016 at 6:04 PM, Antia Puentes  wrote:

> - Fixes CTS tests:
>
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-float
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-int
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint
>
> v1: (written by Dave Airlie) Always shift height images for levels.
> Fixed the CTS test.
>
> v2: Only shift height if the texture is not a 1D_ARRAY;
> this fixes an assertion in GL44-CTS.texture_view.gettexparameter
> caused by the original patch (Antia).
>
> v3: Remove the loop. Do not shift height either for 1D textures.
> Use an explicit switch and add an assertion (levels == 0) for
> multisampled textures (Jason).
>
> v4: Rectangle textures can not have levels either (Ilia Mirkin).
>
> Signed-off-by: Dave Airlie 
> Signed-off-by: Antia Puentes 
> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 27
> +--
>  1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 7affe08..65962eb 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct brw_context
> *brw,
> DBG("%s\n", __func__);
>
> /* Figure out image dimensions at start level. */
> -   for (i = intelImage->base.Base.Level; i > 0; i--) {
> -  width <<= 1;
> -  if (height != 1)
> - height <<= 1;
> -  if (intelObj->base.Target == GL_TEXTURE_3D)
> - depth <<= 1;
> +   switch(intelObj->base.Target) {
> +   case GL_TEXTURE_2D_MULTISAMPLE:
> +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
> +   case GL_TEXTURE_RECTANGLE:
> +  assert(intelImage->base.Base.Level == 0);
> +  break;
> +   case GL_TEXTURE_3D:
> +  depth <<= intelImage->base.Base.Level;
> +  /* Fall through */
> +   case GL_TEXTURE_2D:
> +   case GL_TEXTURE_2D_ARRAY:
> +   case GL_TEXTURE_CUBE_MAP:
> +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> +  height <<= intelImage->base.Base.Level;
> +  /* Fall through */
> +   case GL_TEXTURE_1D:
> +   case GL_TEXTURE_1D_ARRAY:
> +  width <<= intelImage->base.Base.Level;
> +  break;
> +   default:
> +  unreachable("Unexpected target");
> }
>
> /* Guess a reasonable value for lastLevel.  This is probably going
> --
> 2.7.4
>
>
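
As a standalone illustration of the fall-through switch in the quoted patch, the sketch below scales the dimensions of a given mip level back up to level 0. The enum, struct and function names are invented for the example, and the note about 1D arrays keeping their layer count in the height field is an assumption drawn from the v2/v3 changelog entries.

```c
#include <assert.h>

/* Hypothetical, reduced set of texture targets for the sketch. */
enum tex_target_sketch {
   TEX_1D, TEX_1D_ARRAY, TEX_2D, TEX_2D_ARRAY, TEX_CUBE, TEX_3D, TEX_RECT
};

struct dims {
   unsigned width, height, depth;
};

/* Given the dimensions of mip `level`, return the level-0 dimensions.
 * Each case falls through to the next so that 3D scales depth, height and
 * width, 2D-like targets scale height and width, and 1D-like targets scale
 * width only (for 1D arrays the height is assumed to hold the layer count). */
static struct dims
dims_at_base_level(enum tex_target_sketch target, struct dims d, unsigned level)
{
   switch (target) {
   case TEX_RECT:
      /* Rectangle textures have no mip levels. */
      assert(level == 0);
      break;
   case TEX_3D:
      d.depth <<= level;
      /* fall through */
   case TEX_2D:
   case TEX_2D_ARRAY:
   case TEX_CUBE:
      d.height <<= level;
      /* fall through */
   case TEX_1D:
   case TEX_1D_ARRAY:
      d.width <<= level;
      break;
   }
   return d;
}
```

Replacing the per-level loop with a single shift works because doubling a value `level` times is the same as shifting it left by `level`.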

[Mesa-dev] [Bug 97643] Shader crashes radeon driver and brings the whole system down

2016-09-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97643

--- Comment #3 from Iaroslav Andrusyak  ---
still works


vo=opengl-hq:user-shaders="/home/pont/CrossBilateral.glsl":cscale=bilinear:dscale=mitchell:tscale=sinc:tscale-radius=2:interpolation-threshold=0.01:scale-radius=3:dither-depth=8:deband-iterations=2:deband-range=12
video-sync=display-resample
hwdec=no
framedrop=vo
cache=262144
stop-screensaver=yes

vd-lavc-threads=4

http://pastebin.com/neW3rkHA

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 2/2] nir: move tex_instr_remove_src

2016-09-08 Thread Jason Ekstrand

On Thu, Sep 8, 2016 at 11:21 AM, Jason Ekstrand 
wrote:

>
>
> On Thu, Sep 8, 2016 at 11:14 AM, Rob Clark  wrote:
>
>> I want to re-use this in a different pass, so move to nir.h
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/compiler/nir/nir.h   | 16 
>>  src/compiler/nir/nir_lower_tex.c | 20 ++--
>>  2 files changed, 18 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> index c1cf940..e907bc9 100644
>> --- a/src/compiler/nir/nir.h
>> +++ b/src/compiler/nir/nir.h
>> @@ -2297,6 +2297,22 @@ unsigned nir_index_instrs(nir_function_impl
>> *impl);
>>
>>  void nir_index_blocks(nir_function_impl *impl);
>>
>> +static inline void
>> +nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
>>
>
> This one is complex enough, I think I'd rather have it in nir.c.  That
> also allows us to put the declaration up with the other nir_tex_instr
> helpers which would be nice.
>

With that fixed, both are

Reviewed-by: Jason Ekstrand 


> +{
>> +   assert(src_idx < tex->num_srcs);
>> +
>> +   /* First rewrite the source to NIR_SRC_INIT */
>> +   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src,
>> NIR_SRC_INIT);
>> +
>> +   /* Now, move all of the other sources down */
>> +   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
>> +  tex->src[i-1].src_type = tex->src[i].src_type;
>> +  nir_instr_move_src(&tex->instr, &tex->src[i-1].src,
>> &tex->src[i].src);
>> +   }
>> +   tex->num_srcs--;
>> +}
>> +
>>  void nir_print_shader(nir_shader *shader, FILE *fp);
>>  void nir_print_shader_annotated(nir_shader *shader, FILE *fp, struct
>> hash_table *errors);
>>  void nir_print_instr(const nir_instr *instr, FILE *fp);
>> diff --git a/src/compiler/nir/nir_lower_tex.c
>> b/src/compiler/nir/nir_lower_tex.c
>> index b570598..a405758 100644
>> --- a/src/compiler/nir/nir_lower_tex.c
>> +++ b/src/compiler/nir/nir_lower_tex.c
>> @@ -39,22 +39,6 @@
>>  #include "nir_builder.h"
>>
>>  static void
>> -tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
>> -{
>> -   assert(src_idx < tex->num_srcs);
>> -
>> -   /* First rewrite the source to NIR_SRC_INIT */
>> -   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src,
>> NIR_SRC_INIT);
>> -
>> -   /* Now, move all of the other sources down */
>> -   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
>> -  tex->src[i-1].src_type = tex->src[i].src_type;
>> -  nir_instr_move_src(&tex->instr, &tex->src[i-1].src,
>> &tex->src[i].src);
>> -   }
>> -   tex->num_srcs--;
>> -}
>> -
>> -static void
>>  project_src(nir_builder *b, nir_tex_instr *tex)
>>  {
>> /* Find the projector in the srcs list, if present. */
>> @@ -114,7 +98,7 @@ project_src(nir_builder *b, nir_tex_instr *tex)
>>  nir_src_for_ssa(projected));
>> }
>>
>> -   tex_instr_remove_src(tex, proj_index);
>> +   nir_tex_instr_remove_src(tex, proj_index);
>>  }
>>
>>  static bool
>> @@ -159,7 +143,7 @@ lower_offset(nir_builder *b, nir_tex_instr *tex)
>> nir_instr_rewrite_src(&tex->instr, &tex->src[coord_index].src,
>>   nir_src_for_ssa(offset_coord));
>>
>> -   tex_instr_remove_src(tex, offset_index);
>> +   nir_tex_instr_remove_src(tex, offset_index);
>>
>> return true;
>>  }
>> --
>> 2.7.4
>>
>>
>

Re: [Mesa-dev] [PATCH 2/2] nir: move tex_instr_remove_src

2016-09-08 Thread Jason Ekstrand

On Thu, Sep 8, 2016 at 11:14 AM, Rob Clark  wrote:

> I want to re-use this in a different pass, so move to nir.h
>
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/nir.h   | 16 
>  src/compiler/nir/nir_lower_tex.c | 20 ++--
>  2 files changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index c1cf940..e907bc9 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2297,6 +2297,22 @@ unsigned nir_index_instrs(nir_function_impl *impl);
>
>  void nir_index_blocks(nir_function_impl *impl);
>
> +static inline void
> +nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
>

This one is complex enough, I think I'd rather have it in nir.c.  That also
allows us to put the declaration up with the other nir_tex_instr helpers
which would be nice.


> +{
> +   assert(src_idx < tex->num_srcs);
> +
> +   /* First rewrite the source to NIR_SRC_INIT */
> +   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src,
> NIR_SRC_INIT);
> +
> +   /* Now, move all of the other sources down */
> +   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
> +  tex->src[i-1].src_type = tex->src[i].src_type;
> +  nir_instr_move_src(&tex->instr, &tex->src[i-1].src,
> &tex->src[i].src);
> +   }
> +   tex->num_srcs--;
> +}
> +
>  void nir_print_shader(nir_shader *shader, FILE *fp);
>  void nir_print_shader_annotated(nir_shader *shader, FILE *fp, struct
> hash_table *errors);
>  void nir_print_instr(const nir_instr *instr, FILE *fp);
> diff --git a/src/compiler/nir/nir_lower_tex.c
> b/src/compiler/nir/nir_lower_tex.c
> index b570598..a405758 100644
> --- a/src/compiler/nir/nir_lower_tex.c
> +++ b/src/compiler/nir/nir_lower_tex.c
> @@ -39,22 +39,6 @@
>  #include "nir_builder.h"
>
>  static void
> -tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
> -{
> -   assert(src_idx < tex->num_srcs);
> -
> -   /* First rewrite the source to NIR_SRC_INIT */
> -   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src,
> NIR_SRC_INIT);
> -
> -   /* Now, move all of the other sources down */
> -   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
> -  tex->src[i-1].src_type = tex->src[i].src_type;
> -  nir_instr_move_src(&tex->instr, &tex->src[i-1].src,
> &tex->src[i].src);
> -   }
> -   tex->num_srcs--;
> -}
> -
> -static void
>  project_src(nir_builder *b, nir_tex_instr *tex)
>  {
> /* Find the projector in the srcs list, if present. */
> @@ -114,7 +98,7 @@ project_src(nir_builder *b, nir_tex_instr *tex)
>  nir_src_for_ssa(projected));
> }
>
> -   tex_instr_remove_src(tex, proj_index);
> +   nir_tex_instr_remove_src(tex, proj_index);
>  }
>
>  static bool
> @@ -159,7 +143,7 @@ lower_offset(nir_builder *b, nir_tex_instr *tex)
> nir_instr_rewrite_src(&tex->instr, &tex->src[coord_index].src,
>   nir_src_for_ssa(offset_coord));
>
> -   tex_instr_remove_src(tex, offset_index);
> +   nir_tex_instr_remove_src(tex, offset_index);
>
> return true;
>  }
> --
> 2.7.4
>
>

[Mesa-dev] [PATCH] EGL: Fix some errors in eglDebugMessageControlKHR.

2016-09-08 Thread Kyle Brenneman

Check if the attribute list is valid even if the callback is NULL.

Set the callback pointer whether or not the attribute list is NULL.
---
 src/egl/main/eglapi.c | 45 +++--
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3bbf3de..6f30ed6 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -2042,36 +2042,37 @@ eglLabelObjectKHR(
 static EGLint
 eglDebugMessageControlKHR(EGLDEBUGPROCKHR callback, const EGLAttrib 
*attrib_list)
 {
+   unsigned int newEnabled;
+
_EGL_FUNC_START(NULL, EGL_NONE, NULL, EGL_BAD_ALLOC);
 
mtx_lock(_eglGlobal.Mutex);
 
-   if (callback != NULL) {
-  if (attrib_list != NULL) {
- unsigned int newEnabled = _eglGlobal.debugTypesEnabled;
- int i;
-
- for (i = 0; attrib_list[i] != EGL_NONE; i += 2) {
-if (attrib_list[i] >= EGL_DEBUG_MSG_CRITICAL_KHR &&
-  attrib_list[i] <= EGL_DEBUG_MSG_INFO_KHR) {
-   if (attrib_list[i + 1]) {
-  newEnabled |= DebugBitFromType(attrib_list[i]);
-   } else {
-  newEnabled &= ~DebugBitFromType(attrib_list[i]);
-   }
+   newEnabled = _eglGlobal.debugTypesEnabled;
+   if (attrib_list != NULL) {
+  int i;
+  for (i = 0; attrib_list[i] != EGL_NONE; i += 2) {
+ if (attrib_list[i] >= EGL_DEBUG_MSG_CRITICAL_KHR &&
+   attrib_list[i] <= EGL_DEBUG_MSG_INFO_KHR) {
+if (attrib_list[i + 1]) {
+   newEnabled |= DebugBitFromType(attrib_list[i]);
 } else {
-   // On error, set the last error code, call the current
-   // debug callback, and return the error code.
-   mtx_unlock(_eglGlobal.Mutex);
-   _eglReportError(EGL_BAD_ATTRIBUTE, NULL,
- "Invalid attribute 0x%04lx", (unsigned long) 
attrib_list[i]);
-   return EGL_BAD_ATTRIBUTE;
+   newEnabled &= ~DebugBitFromType(attrib_list[i]);
 }
+ } else {
+// On error, set the last error code, call the current
+// debug callback, and return the error code.
+mtx_unlock(_eglGlobal.Mutex);
+_eglReportError(EGL_BAD_ATTRIBUTE, NULL,
+  "Invalid attribute 0x%04lx", (unsigned long) attrib_list[i]);
+return EGL_BAD_ATTRIBUTE;
  }
-
- _eglGlobal.debugCallback = callback;
- _eglGlobal.debugTypesEnabled = newEnabled;
   }
+   }
+
+   if (callback != NULL) {
+  _eglGlobal.debugCallback = callback;
+  _eglGlobal.debugTypesEnabled = newEnabled;
} else {
   _eglGlobal.debugCallback = NULL;
   _eglGlobal.debugTypesEnabled = _EGL_DEBUG_BIT_CRITICAL | 
_EGL_DEBUG_BIT_ERROR;
-- 
2.7.4
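
The intended behaviour (validate the attribute list regardless of the callback, then always update the callback) can be sketched without the EGL headers. All SK_* and sk_* names are local stand-ins; the numeric values are believed to match the corresponding EGL and EGL_KHR_debug tokens but should be treated as assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the EGL tokens used by the patch. */
#define SK_SUCCESS            0x3000
#define SK_BAD_ATTRIBUTE      0x3004
#define SK_NONE               0x3038
#define SK_DEBUG_MSG_CRITICAL 0x33B9
#define SK_DEBUG_MSG_ERROR    0x33BA
#define SK_DEBUG_MSG_WARN     0x33BB
#define SK_DEBUG_MSG_INFO     0x33BC

typedef void (*sk_debug_proc)(const char *msg);

struct sk_debug_state {
   sk_debug_proc callback;
   unsigned enabled;          /* one bit per message type */
};

static unsigned
sk_bit(long type)
{
   return 1u << (type - SK_DEBUG_MSG_CRITICAL);
}

/* No-op callback, only so that callers have something to register. */
static void
sk_noop_callback(const char *msg)
{
   (void) msg;
}

static int
sk_debug_message_control(struct sk_debug_state *st, sk_debug_proc cb,
                         const long *attribs)
{
   unsigned new_enabled = st->enabled;

   /* Validate the list first, whether or not a callback was given.
    * On error the stored state is left untouched. */
   if (attribs != NULL) {
      for (size_t i = 0; attribs[i] != SK_NONE; i += 2) {
         if (attribs[i] < SK_DEBUG_MSG_CRITICAL ||
             attribs[i] > SK_DEBUG_MSG_INFO)
            return SK_BAD_ATTRIBUTE;
         if (attribs[i + 1])
            new_enabled |= sk_bit(attribs[i]);
         else
            new_enabled &= ~sk_bit(attribs[i]);
      }
   }

   /* The callback is updated whether or not a list was supplied. */
   if (cb != NULL) {
      st->callback = cb;
      st->enabled = new_enabled;
   } else {
      st->callback = NULL;
      st->enabled = sk_bit(SK_DEBUG_MSG_CRITICAL) | sk_bit(SK_DEBUG_MSG_ERROR);
   }
   return SK_SUCCESS;
}
```

Building the new mask in a local variable and committing it only at the end keeps the global state consistent when a bad attribute is found halfway through the list.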


[Mesa-dev] [PATCH 2/2] nir: move tex_instr_remove_src

2016-09-08 Thread Rob Clark

I want to re-use this in a different pass, so move to nir.h

Signed-off-by: Rob Clark 
---
 src/compiler/nir/nir.h   | 16 
 src/compiler/nir/nir_lower_tex.c | 20 ++--
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index c1cf940..e907bc9 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2297,6 +2297,22 @@ unsigned nir_index_instrs(nir_function_impl *impl);
 
 void nir_index_blocks(nir_function_impl *impl);
 
+static inline void
+nir_tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
+{
+   assert(src_idx < tex->num_srcs);
+
+   /* First rewrite the source to NIR_SRC_INIT */
+   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src, NIR_SRC_INIT);
+
+   /* Now, move all of the other sources down */
+   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
+  tex->src[i-1].src_type = tex->src[i].src_type;
+  nir_instr_move_src(&tex->instr, &tex->src[i-1].src, &tex->src[i].src);
+   }
+   tex->num_srcs--;
+}
+
 void nir_print_shader(nir_shader *shader, FILE *fp);
 void nir_print_shader_annotated(nir_shader *shader, FILE *fp, struct 
hash_table *errors);
 void nir_print_instr(const nir_instr *instr, FILE *fp);
diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index b570598..a405758 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -39,22 +39,6 @@
 #include "nir_builder.h"
 
 static void
-tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
-{
-   assert(src_idx < tex->num_srcs);
-
-   /* First rewrite the source to NIR_SRC_INIT */
-   nir_instr_rewrite_src(&tex->instr, &tex->src[src_idx].src, NIR_SRC_INIT);
-
-   /* Now, move all of the other sources down */
-   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++) {
-  tex->src[i-1].src_type = tex->src[i].src_type;
-  nir_instr_move_src(&tex->instr, &tex->src[i-1].src, &tex->src[i].src);
-   }
-   tex->num_srcs--;
-}
-
-static void
 project_src(nir_builder *b, nir_tex_instr *tex)
 {
/* Find the projector in the srcs list, if present. */
@@ -114,7 +98,7 @@ project_src(nir_builder *b, nir_tex_instr *tex)
 nir_src_for_ssa(projected));
}
 
-   tex_instr_remove_src(tex, proj_index);
+   nir_tex_instr_remove_src(tex, proj_index);
 }
 
 static bool
@@ -159,7 +143,7 @@ lower_offset(nir_builder *b, nir_tex_instr *tex)
nir_instr_rewrite_src(&tex->instr, &tex->src[coord_index].src,
  nir_src_for_ssa(offset_coord));
 
-   tex_instr_remove_src(tex, offset_index);
+   nir_tex_instr_remove_src(tex, offset_index);
 
return true;
 }
-- 
2.7.4
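
For readers unfamiliar with the helper being moved, its core is "remove one array element by shifting the tail down". A reduced sketch with plain structs follows; tex_sketch and src_sketch are invented types, and the NIR use-def bookkeeping done by nir_instr_rewrite_src and nir_instr_move_src is deliberately left out.

```c
#include <assert.h>

/* Hypothetical, simplified source entry and instruction. */
struct src_sketch {
   int src_type;
   int value;
};

struct tex_sketch {
   unsigned num_srcs;
   struct src_sketch src[8];
};

/* Remove the source at src_idx and close the gap by moving every later
 * source down by one slot. */
static void
tex_sketch_remove_src(struct tex_sketch *tex, unsigned src_idx)
{
   assert(src_idx < tex->num_srcs);

   for (unsigned i = src_idx + 1; i < tex->num_srcs; i++)
      tex->src[i - 1] = tex->src[i];

   tex->num_srcs--;
}
```

In the real helper each move also has to update the use lists of the SSA values involved, which is part of why the review asks for it to live in nir.c rather than as an inline in the header.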


[Mesa-dev] [PATCH 1/2] nir/lower_tex: remove tex_instr_find_src()

2016-09-08 Thread Rob Clark

Turns out it already exists.. so don't duplicate it.

Signed-off-by: Rob Clark 
---
 src/compiler/nir/nir_lower_tex.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/src/compiler/nir/nir_lower_tex.c b/src/compiler/nir/nir_lower_tex.c
index 93637a3..b570598 100644
--- a/src/compiler/nir/nir_lower_tex.c
+++ b/src/compiler/nir/nir_lower_tex.c
@@ -38,17 +38,6 @@
 #include "nir.h"
 #include "nir_builder.h"
 
-static int
-tex_instr_find_src(nir_tex_instr *tex, nir_tex_src_type src_type)
-{
-   for (unsigned i = 0; i < tex->num_srcs; i++) {
-  if (tex->src[i].src_type == src_type)
- return i;
-   }
-
-   return -1;
-}
-
 static void
 tex_instr_remove_src(nir_tex_instr *tex, unsigned src_idx)
 {
@@ -69,7 +58,7 @@ static void
 project_src(nir_builder *b, nir_tex_instr *tex)
 {
/* Find the projector in the srcs list, if present. */
-   int proj_index = tex_instr_find_src(tex, nir_tex_src_projector);
+   int proj_index = nir_tex_instr_src_index(tex, nir_tex_src_projector);
if (proj_index < 0)
   return;
 
@@ -131,11 +120,11 @@ project_src(nir_builder *b, nir_tex_instr *tex)
 static bool
 lower_offset(nir_builder *b, nir_tex_instr *tex)
 {
-   int offset_index = tex_instr_find_src(tex, nir_tex_src_offset);
+   int offset_index = nir_tex_instr_src_index(tex, nir_tex_src_offset);
if (offset_index < 0)
   return false;
 
-   int coord_index = tex_instr_find_src(tex, nir_tex_src_coord);
+   int coord_index = nir_tex_instr_src_index(tex, nir_tex_src_coord);
assert(coord_index >= 0);
 
assert(tex->src[offset_index].src.is_ssa);
-- 
2.7.4


Re: [Mesa-dev] [PATCH] Disable the code that allocates W|X memory on OpenBSD

2016-09-08 Thread Jonathan Gray

On Thu, Sep 08, 2016 at 06:57:44PM +0100, Emil Velikov wrote:
> On 1 September 2016 at 18:23, Jonathan Gray  wrote:
> > OpenBSD now has strict W^X enforcement.  Processes that violate
> > the policy get killed by the kernel.  Don't attempt to use
> > executable memory on OpenBSD to avoid this.
> >
> > Patch from Mark Kettenis.
> >
> 
> > --- a/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> > +++ b/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> > @@ -69,6 +69,16 @@ static struct mem_block *exec_heap = NULL;
> >  static unsigned char *exec_mem = NULL;
> >
> >
> > +#ifdef __OpenBSD__
> > +
> > +static int
> > +init_heap(void)
> > +{
> > +   return 0;
> > +}
> Afaict this is equivalent to using the #else path in translate_sse.c.
> In general I'm wondering if we can/should not have a configure toggle
> for this. Then again please look below.
> 
> 
> > --- a/src/mapi/u_execmem.c
> > +++ b/src/mapi/u_execmem.c
> > @@ -45,8 +45,15 @@ static unsigned int head = 0;
> >
> >  static unsigned char *exec_mem = (unsigned char *)0;
> >
> > +#if defined(__OpenBSD__)
> >
> > -#if defined(__linux__) || defined(__OpenBSD__) || defined(_NetBSD__) || defined(__sun) || defined(__HAIKU__)
> > +static int
> > +init_map(void)
> > +{
> > +  return 0;
> > +}
> > +
> > And this one to --disable-glx-tls and/or --disable-asm. Which reminds
> > me: have you guys tried enabling either/both of them? Have there
> > been (m)any issues?
> 
> For a long while the intent has been to use --enable-glx-tls by
> default and kill off the other codepaths. But with the write xor
> execute policy, it's going to be (close to) impossible.

Full tls support is not in the OpenBSD tree currently, though the
remaining parts were being looked at including enabling tls with Mesa
last week.  I'm not sure what state that work is in currently.

> Have you guys considered a way to disable the restriction for use cases
> that need the behaviour?

The limited exceptions involve flagging binaries and having to mount
the filesystem containing them with a flag.  This is mostly a temporary
measure as I understand it and libraries especially should not be creating
W|X mappings.

Re: [Mesa-dev] [PATCH 3/7] EGL: Implement remaining functions from EGL_KHR_debug

2016-09-08 Thread Adam Jackson

On Thu, 2016-09-08 at 11:57 -0600, Kyle Brenneman wrote:
> This one has a bug in it where it doesn't set the callback if 
> (attrib_list == NULL), plus the more minor bug where it doesn't check 
> for invalid attributes if (callback == NULL). The first one is the same 
> bug you noticed in libglvnd, which got copied over when I adapted it for 
> Mesa. I can fix that and send out an updated patch if you like, or if 
> it's easier, I can add a commit to the end of this list.
> 

Hah, indeed. Patch at the end of this series is probably easiest.

- ajax

Re: [Mesa-dev] [PATCH 3/7] EGL: Implement remaining functions from EGL_KHR_debug

2016-09-08 Thread Kyle Brenneman

This one has a bug in it where it doesn't set the callback if 
(attrib_list == NULL), plus the more minor bug where it doesn't check 
for invalid attributes if (callback == NULL). The first one is the same 
bug you noticed in libglvnd, which got copied over when I adapted it for 
Mesa. I can fix that and send out an updated patch if you like, or if 
it's easier, I can add a commit to the end of this list.


-Kyle

On 09/08/2016 11:46 AM, Adam Jackson wrote:

From: Kyle Brenneman 

Implemented eglDebugMessageControlKHR and eglQueryDebugKHR. Added
entries in _egl_global to hold the debug callback and the set of enabled
message types.

Added a _eglDebugReport function to report a debug message, plus some
macros for each of the message types.

Still to do is to replace existing calls to _eglError with
_eglDebugReport.

Reviewed-by: Adam Jackson 
---
  src/egl/main/eglapi.c | 64 +++
  src/egl/main/eglcurrent.c | 37 +--
  src/egl/main/eglcurrent.h | 15 +++
  src/egl/main/eglglobals.c |  5 +++-
  src/egl/main/eglglobals.h | 15 +++
  5 files changed, 133 insertions(+), 3 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 31b842f..e5b098e 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1852,6 +1852,68 @@ eglLabelObjectKHR(
 }
  }
  
+static EGLint
+eglDebugMessageControlKHR(EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list)
+{
+   mtx_lock(_eglGlobal.Mutex);
+
+   if (callback != NULL) {
+      if (attrib_list != NULL) {
+         unsigned int newEnabled = _eglGlobal.debugTypesEnabled;
+         int i;
+
+         for (i = 0; attrib_list[i] != EGL_NONE; i += 2) {
+            if (attrib_list[i] >= EGL_DEBUG_MSG_CRITICAL_KHR &&
+                  attrib_list[i] <= EGL_DEBUG_MSG_INFO_KHR) {
+               if (attrib_list[i + 1]) {
+                  newEnabled |= DebugBitFromType(attrib_list[i]);
+               } else {
+                  newEnabled &= ~DebugBitFromType(attrib_list[i]);
+               }
+            } else {
+               // On error, set the last error code, call the current
+               // debug callback, and return the error code.
+               mtx_unlock(_eglGlobal.Mutex);
+               _eglReportError(EGL_BAD_ATTRIBUTE, "eglDebugMessageControlKHR", NULL,
+                     "Invalid attribute 0x%04lx", (unsigned long) attrib_list[i]);
+               return EGL_BAD_ATTRIBUTE;
+            }
+         }
+
+         _eglGlobal.debugCallback = callback;
+         _eglGlobal.debugTypesEnabled = newEnabled;
+      }
+   } else {
+      _eglGlobal.debugCallback = NULL;
+      _eglGlobal.debugTypesEnabled = _EGL_DEBUG_BIT_CRITICAL | _EGL_DEBUG_BIT_ERROR;
+   }
+
+   mtx_unlock(_eglGlobal.Mutex);
+   return EGL_SUCCESS;
+}
+
+static EGLBoolean
+eglQueryDebugKHR(EGLint attribute, EGLAttrib *value)
+{
+   mtx_lock(_eglGlobal.Mutex);
+   if (attribute >= EGL_DEBUG_MSG_CRITICAL_KHR &&
+         attribute <= EGL_DEBUG_MSG_INFO_KHR) {
+      if (_eglGlobal.debugTypesEnabled & DebugBitFromType(attribute)) {
+         *value = EGL_TRUE;
+      } else {
+         *value = EGL_FALSE;
+      }
+   } else if (attribute == EGL_DEBUG_CALLBACK_KHR) {
+      *value = (EGLAttrib) _eglGlobal.debugCallback;
+   } else {
+      mtx_unlock(_eglGlobal.Mutex);
+      _eglReportError(EGL_BAD_ATTRIBUTE, "eglQueryDebugKHR", NULL,
+            "Invalid attribute 0x%04lx", (unsigned long) attribute);
+      return EGL_FALSE;
+   }
+   mtx_unlock(_eglGlobal.Mutex);
+   return EGL_TRUE;
+}
 
 __eglMustCastToProperFunctionPointerType EGLAPIENTRY
 eglGetProcAddress(const char *procname)
@@ -1933,6 +1995,8 @@ eglGetProcAddress(const char *procname)
       { "eglExportDMABUFImageQueryMESA", (_EGLProc) eglExportDMABUFImageQueryMESA },
       { "eglExportDMABUFImageMESA", (_EGLProc) eglExportDMABUFImageMESA },
       { "eglLabelObjectKHR", (_EGLProc) eglLabelObjectKHR },
+      { "eglDebugMessageControlKHR", (_EGLProc) eglDebugMessageControlKHR },
+      { "eglQueryDebugKHR", (_EGLProc) eglQueryDebugKHR },
       { NULL, NULL }
    };
    EGLint i;
diff --git a/src/egl/main/eglcurrent.c b/src/egl/main/eglcurrent.c
index 6dd6f4c..83db229 100644
--- a/src/egl/main/eglcurrent.c
+++ b/src/egl/main/eglcurrent.c
@@ -26,8 +26,10 @@
   **/
  
  
+#include 

  #include 
  #include 
+#include 
  #include "c99_compat.h"
  #include "c11/threads.h"
  
@@ -35,7 +37,6 @@

  #include "eglcurrent.h"
  #include "eglglobals.h"
  
-

  /* This should be kept in sync with _eglInitThreadInfo() */
  #define _EGL_THREAD_INFO_INITIALIZER \
 { EGL_SUCCESS, { NULL }, 0 }
@@ -283,8 +284,40 @@ _eglError(EGLint errCode, const char *msg)
  /**
   * Returns the label set for the current thread.
   */
-EGLLabelKHR _eglGetThreadLabel(void)
+EGLLabelKHR
+_eglGetThreadLabel(void)
  {
 _EGLThreadInfo *t = _eglGetCurrentThread(

Re: [Mesa-dev] [PATCH 14/33] intel/blorp: Add an entrypoint for doing bit-for-bit copies

2016-09-08 Thread Jason Ekstrand

On Wed, Sep 7, 2016 at 1:16 PM, Jason Ekstrand  wrote:

> On Sep 7, 2016 10:45 AM, "Nanley Chery"  wrote:
> >
> > On Wed, Sep 07, 2016 at 10:26:25AM -0700, Jason Ekstrand wrote:
> > > On Wed, Sep 7, 2016 at 9:50 AM, Jason Ekstrand wrote:
> > >
> > > > On Wed, Sep 7, 2016 at 9:36 AM, Nanley Chery wrote:
> > > >
> > > >> On Tue, Sep 06, 2016 at 05:02:55PM -0700, Jason Ekstrand wrote:
> > > >> > On Tue, Sep 6, 2016 at 4:12 PM, Nanley Chery <nanleych...@gmail.com> wrote:
> > > >> >
> > > >> > > On Wed, Aug 31, 2016 at 02:22:33PM -0700, Jason Ekstrand wrote:
> > > >> > > > ---
> > > >> > > >  src/intel/blorp/blorp.h  |  10 
> > > >> > > >  src/intel/blorp/blorp_blit.c | 133
> ++
> > > >> > > +
> > > >> > > >  2 files changed, 143 insertions(+)
> > > >> > > >
> > > >> > > > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> > > >> > > > index c1e93fd..6574124 100644
> > > >> > > > --- a/src/intel/blorp/blorp.h
> > > >> > > > +++ b/src/intel/blorp/blorp.h
> > > >> > > > @@ -109,6 +109,16 @@ blorp_blit(struct blorp_batch *batch,
> > > >> > > > uint32_t filter, bool mirror_x, bool mirror_y);
> > > >> > > >
> > > >> > > >  void
> > > >> > > > +blorp_copy(struct blorp_batch *batch,
> > > >> > > > +   const struct blorp_surf *src_surf,
> > > >> > > > +   unsigned src_level, unsigned src_layer,
> > > >> > > > +   const struct blorp_surf *dst_surf,
> > > >> > > > +   unsigned dst_level, unsigned dst_layer,
> > > >> > > > +   uint32_t src_x, uint32_t src_y,
> > > >> > > > +   uint32_t dst_x, uint32_t dst_y,
> > > >> > > > +   uint32_t src_width, uint32_t src_height);
> > > >> > > > +
> > > >> > > > +void
> > > >> > > >  blorp_fast_clear(struct blorp_batch *batch,
> > > >> > > >   const struct blorp_surf *surf,
> > > >> > > >   uint32_t level, uint32_t layer, enum isl_format format,
> > > >> > > > diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
> > > >> > > > index 3ab39a3..42a502c 100644
> > > >> > > > --- a/src/intel/blorp/blorp_blit.c
> > > >> > > > +++ b/src/intel/blorp/blorp_blit.c
> > > >> > > > @@ -1685,3 +1685,136 @@ blorp_blit(struct blorp_batch *batch,
> > > >> > > >   dst_x0, dst_y0, dst_x1, dst_y1,
> > > >> > > >   mirror_x, mirror_y);
> > > >> > > >  }
> > > >> > > > +
> > > >> > > > +static enum isl_format
> > > >> > > > +get_copy_format_for_bpb(unsigned bpb)
> > > >> > > > +{
> > > >> > > > +   /* The choice of UNORM and UINT formats is very intentional here.  Most of
> > > >> > > > +    * the time, we want to use a UINT format to avoid any rounding error in
> > > >> > > > +    * the blit.  For stencil blits, R8_UINT is required by the hardware.
> > > >> > > > +    * (It's the only format allowed in conjunction with W-tiling.)  Also we
> > > >> > > > +    * intentionally use the 4-channel formats whenever we can.  This is so
> > > >> > > > +    * that, when we do a RGB <-> RGBX copy, the two formats will line up even
> > > >> > > > +    * though one of them is 3/4 the size of the other.  The choice of UNORM
> > > >> > > > +    * vs. UINT is also very intentional because Haswell doesn't handle 8 or
> > > >> > > > +    * 16-bit RGB UINT formats at all so we have to use UNORM there.
> > > >> > > > +    * Fortunately, the only time we should ever use two different formats in
> > > >> > > > +    * the table below is for RGB -> RGBA blits and so we will never have any
> > > >> > > > +    * UNORM/UINT mismatch.
> > > >> > > > +    */
> > > >> > > > +   switch (bpb) {
> > > >> > > > +   case 8:  return ISL_FORMAT_R8_UINT;
> > > >> > > > +   case 16: return ISL_FORMAT_R8G8_UINT;
> > > >> > > > +   case 24: return ISL_FORMAT_R8G8B8_UNORM;
> > > >> > > > +   case 32: return ISL_FORMAT_R8G8B8A8_UNORM;
> > > >> > > > +   case 48: return ISL_FORMAT_R16G16B16_UNORM;
> > > >> > > > +   case 64: return ISL_FORMAT_R16G16B16A16_UNORM;
> > > >> > > > +   case 96: return ISL_FORMAT_R32G32B32_UINT;
> > > >> > > > +   case 128:return ISL_FORMAT_R32G32B32A32_UINT;
> > > >> > > > +   default:
> > > >> > > > +  unreachable("Unknown format bpb");
> > > >> > > > +   }
> > > >> > > > +}
> > > >> > > > +
> > > >> > > > +static void
> > > >> > > > +surf_convert_to_uncompressed(const struct isl_device *isl_dev,
> > > >> > > > +                             struct brw_blorp_surface_info *info,
> > > >> > > > +                             uint32_t *x, uint32_t *y,
> > > >> > > > +                             uint32_t *width, uint32_t *height)
> > > >> > > > +{
> > > >> > > > +   const struct isl_format_layout *fmtl =
> > > >> > > > +      isl_format_get_layout(info->surf.format);
> > > >

Re: [Mesa-dev] [PATCH] Disable the code that allocates W|X memory on OpenBSD

2016-09-08 Thread Emil Velikov

On 1 September 2016 at 18:23, Jonathan Gray  wrote:
> OpenBSD now has strict W^X enforcement.  Processes that violate
> the policy get killed by the kernel.  Don't attempt to use
> executable memory on OpenBSD to avoid this.
>
> Patch from Mark Kettenis.
>

> --- a/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> +++ b/src/gallium/auxiliary/rtasm/rtasm_execmem.c
> @@ -69,6 +69,16 @@ static struct mem_block *exec_heap = NULL;
>  static unsigned char *exec_mem = NULL;
>
>
> +#ifdef __OpenBSD__
> +
> +static int
> +init_heap(void)
> +{
> +   return 0;
> +}
Afaict this is equivalent to using the #else path in translate_sse.c.
In general I'm wondering if we can/should not have a configure toggle
for this. Then again please look below.


> --- a/src/mapi/u_execmem.c
> +++ b/src/mapi/u_execmem.c
> @@ -45,8 +45,15 @@ static unsigned int head = 0;
>
>  static unsigned char *exec_mem = (unsigned char *)0;
>
> +#if defined(__OpenBSD__)
>
> -#if defined(__linux__) || defined(__OpenBSD__) || defined(_NetBSD__) || defined(__sun) || defined(__HAIKU__)
> +static int
> +init_map(void)
> +{
> +  return 0;
> +}
> +
And this one to --disable-glx-tls and/or --disable-asm. Which reminds
me: have you guys tried enabling either/both of them? Have there
been (m)any issues?

For a long while the intent has been to use --enable-glx-tls by
default and kill off the other codepaths. But with the write xor
execute policy, it's going to be (close to) impossible.

Have you guys considered a way to disable the restriction for use cases
that need the behaviour?


Thanks
Emil

[Mesa-dev] [Bug 97643] Shader crashes radeon driver and brings the whole system down

2016-09-08 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97643

--- Comment #2 from Cris  ---
(In reply to Iaroslav Andrusyak from comment #1)
> works fine on 7970 and mesa-git,llvm-git
> vo=opengl-hq:user-shaders="/home/pont/CrossBilateral.glsl"  
> 
> http://pastebin.com/6TNTa6ry

I forgot to mention that for crossbilateral to work, cscale must be set to
bilinear.

My mpv.conf:

profile=opengl-hq
scale=ewa_lanczossharp
cscale=bilinear
opengl-shaders="~~/shaders/crossbilateral.glsl"
dscale=mitchell
tscale=sinc
interpolation
tscale-radius=2
interpolation-threshold=0.01
scale-radius=3
temporal-dither
dither-depth=8
deband-iterations=2
deband-range=12
correct-downscaling
blend-subtitles
video-sync=display-resample
hwdec=no
framedrop=vo
cache=262144
stop-screensaver=yes

vd-lavc-threads=4

llvm is 3.8.0-r2

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH 2/2] intel/isl: Divide QPitch by 2 for 3-D stencil textures on SKL+

2016-09-08 Thread Jason Ekstrand

---
 src/intel/isl/isl_surface_state.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index f8ea122..22fef3d 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -173,7 +173,20 @@ get_qpitch(const struct isl_surf *surf)
       unreachable("Bad isl_surf_dim");
    case ISL_DIM_LAYOUT_GEN4_2D:
       if (GEN_GEN >= 9) {
-         return isl_surf_get_array_pitch_el_rows(surf);
+         if (surf->dim == ISL_SURF_DIM_3D && surf->tiling == ISL_TILING_W) {
+            /* This is rather annoying and completely undocumented.  It
+             * appears that the hardware has a bug (or undocumented feature)
+             * regarding stencil buffers most likely related to the way
+             * W-tiling is handled as modified Y-tiling.  If you bind a 3-D or
+             * 2-D array stencil buffer normally, and use texelFetch on it,
+             * the z or array index will get implicitly multiplied by 2 for no
+             * obvious reason.  The fix appears to be to divide qpitch by 2
+             * for W-tiled surfaces.
+             */
+            return isl_surf_get_array_pitch_el_rows(surf) / 2;
+         } else {
+            return isl_surf_get_array_pitch_el_rows(surf);
+         }
       } else {
          /* From the Broadwell PRM for RENDER_SURFACE_STATE.QPitch
           *
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 1/2] isl/state: Don't set QPitch for GEN4_3D surfaces

2016-09-08 Thread Jason Ekstrand

---
 src/intel/isl/isl_surface_state.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index 979e140..f8ea122 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -172,7 +172,6 @@ get_qpitch(const struct isl_surf *surf)
    default:
       unreachable("Bad isl_surf_dim");
    case ISL_DIM_LAYOUT_GEN4_2D:
-   case ISL_DIM_LAYOUT_GEN4_3D:
       if (GEN_GEN >= 9) {
          return isl_surf_get_array_pitch_el_rows(surf);
       } else {
@@ -199,6 +198,22 @@ get_qpitch(const struct isl_surf *surf)
        *    slices.
        */
       return isl_surf_get_array_pitch_el(surf);
+   case ISL_DIM_LAYOUT_GEN4_3D:
+      /* QPitch doesn't make sense for ISL_DIM_LAYOUT_GEN4_3D since it uses a
+       * different pitch at each LOD.  Also, the QPitch field is ignored for
+       * these surfaces.  From the Broadwell PRM documentation for QPitch:
+       *
+       *    This field specifies the distance in rows between array slices. It
+       *    is used only in the following cases:
+       *     - Surface Array is enabled OR
+       *     - Number of Mulitsamples is not NUMSAMPLES_1 and Multisampled
+       *       Surface Storage Format set to MSFMT_MSS OR
+       *     - Surface Type is SURFTYPE_CUBE
+       *
+       * None of the three conditions above can possibly apply to a 3D surface
+       * so it is safe to just set QPitch to 0.
+       */
+      return 0;
    }
 }
 #endif /* GEN_GEN >= 8 */
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 6/7] EGL: Fix some command names for EGL_KHR_debug

2016-09-08 Thread Adam Jackson

From: Kyle Brenneman 

Change a few EGL entrypoints to call a common internal function instead
of forwarding to another entrypoint.

If one EGL entrypoint calls another, then the second entrypoint would
overwrite the current function name in the _EGLThreadInfo struct. That
would cause it to pass the wrong function name to the EGL_KHR_debug
callback.
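
To illustrate the problem described above, here is a self-contained sketch that does not use the real EGL types. The variable current_func stands in for the function name stored in _EGLThreadInfo (a plain static here, per-thread in the real code), and every function name in the block is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the "current function name" that the real code keeps in
 * _EGLThreadInfo.  All names in this sketch are hypothetical. */
static const char *current_func = NULL;

/* Roughly what the start-of-entrypoint bookkeeping records. */
static void func_start(const char *name) { current_func = name; }

/* The name a debug callback would be handed when an error is reported. */
static const char *report_error(void) { return current_func; }

/* Common internal function: does the work, never touches current_func.
 * A negative platform is treated as an error for the sake of the demo. */
static const char *get_display_common(int platform)
{
   return platform < 0 ? report_error() : NULL;
}

/* Entrypoint A. */
static const char *get_platform_display_ext(int platform)
{
   func_start("eglGetPlatformDisplayEXT");
   return get_display_common(platform);
}

/* Entrypoint B as it looked before: it forwards to entrypoint A, which
 * overwrites the recorded name, so errors are blamed on the wrong call. */
static const char *get_platform_display_forwarding(int platform)
{
   func_start("eglGetPlatformDisplay");
   return get_platform_display_ext(platform);
}

/* Entrypoint B after the change: it calls the common function directly,
 * so the recorded name stays its own. */
static const char *get_platform_display(int platform)
{
   func_start("eglGetPlatformDisplay");
   return get_display_common(platform);
}
```

With the forwarding variant an error is attributed to eglGetPlatformDisplayEXT even though the application called eglGetPlatformDisplay. With the common helper the attribution matches the call the application actually made.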

[ajax: Fixed up eglWaitClient]

Reviewed-by: Adam Jackson 
---
 src/egl/main/eglapi.c | 214 +-
 1 file changed, 125 insertions(+), 89 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index a684b43..3bbf3de 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -326,7 +326,7 @@ eglGetDisplay(EGLNativeDisplayType nativeDisplay)
    _EGLDisplay *dpy;
    void *native_display_ptr;
 
-   _EGL_FUNC_START(NULL, EGL_NONE, NULL, EGL_NO_DISPLAY);
+   _EGL_FUNC_START(NULL, EGL_OBJECT_THREAD_KHR, NULL, EGL_NO_DISPLAY);
 
    STATIC_ASSERT(sizeof(void*) == sizeof(nativeDisplay));
    native_display_ptr = (void*) nativeDisplay;
@@ -336,14 +336,12 @@ eglGetDisplay(EGLNativeDisplayType nativeDisplay)
    return _eglGetDisplayHandle(dpy);
 }
 
-static EGLDisplay EGLAPIENTRY
-eglGetPlatformDisplayEXT(EGLenum platform, void *native_display,
+static EGLDisplay
+_eglGetPlatformDisplayCommon(EGLenum platform, void *native_display,
                          const EGLint *attrib_list)
 {
    _EGLDisplay *dpy;
 
-   _EGL_FUNC_START(NULL, EGL_NONE, NULL, EGL_NO_DISPLAY);
-
    switch (platform) {
 #ifdef HAVE_X11_PLATFORM
    case EGL_PLATFORM_X11_EXT:
@@ -369,6 +367,14 @@ eglGetPlatformDisplayEXT(EGLenum platform, void *native_display,
    return _eglGetDisplayHandle(dpy);
 }
 
+static EGLDisplay EGLAPIENTRY
+eglGetPlatformDisplayEXT(EGLenum platform, void *native_display,
+                         const EGLint *attrib_list)
+{
+   _EGL_FUNC_START(NULL, EGL_OBJECT_THREAD_KHR, NULL, EGL_NO_DISPLAY);
+   return _eglGetPlatformDisplayCommon(platform, native_display, attrib_list);
+}
+
 EGLDisplay EGLAPIENTRY
 eglGetPlatformDisplay(EGLenum platform, void *native_display,
                       const EGLAttrib *attrib_list)
@@ -376,13 +382,13 @@ eglGetPlatformDisplay(EGLenum platform, void *native_display,
    EGLDisplay display;
    EGLint *int_attribs;
 
-   _EGL_FUNC_START(NULL, EGL_NONE, NULL, EGL_NO_DISPLAY);
+   _EGL_FUNC_START(NULL, EGL_OBJECT_THREAD_KHR, NULL, EGL_NO_DISPLAY);
 
    int_attribs = _eglConvertAttribsToInt(attrib_list);
    if (attrib_list && !int_attribs)
       RETURN_EGL_ERROR(NULL, EGL_BAD_ALLOC, NULL);
 
-   display = eglGetPlatformDisplayEXT(platform, native_display, int_attribs);
+   display = _eglGetPlatformDisplayCommon(platform, native_display, int_attribs);
    free(int_attribs);
    return display;
 }
@@ -788,7 +794,8 @@ eglQueryContext(EGLDisplay dpy, EGLContext ctx,
 
 static EGLSurface
 _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
-                              void *native_window, const EGLint *attrib_list)
+                              void *native_window, const EGLint *attrib_list,
+                              EGLBoolean fromPlatform)
 {
    _EGLConfig *conf = _eglLookupConfig(config, disp);
    _EGLDriver *drv;
@@ -797,6 +804,19 @@ _eglCreateWindowSurfaceCommon(_EGLDisplay *disp, EGLConfig config,
 
    _EGL_CHECK_CONFIG(disp, conf, EGL_NO_SURFACE, drv);
 
+#ifdef HAVE_X11_PLATFORM
+   if (fromPlatform && disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
+      /* The `native_window` parameter for the X11 platform differs between
+       * eglCreateWindowSurface() and eglCreatePlatformPixmapSurfaceEXT(). In
+       * eglCreateWindowSurface(), the type of `native_window` is an Xlib
+       * `Window`. In eglCreatePlatformWindowSurfaceEXT(), the type is
+       * `Window*`.  Convert `Window*` to `Window` because that's what
+       * dri2_x11_create_window_surface() expects.
+       */
+      native_window = (void*) (* (Window*) native_window);
+   }
+#endif
+
    if (native_window == NULL)
       RETURN_EGL_ERROR(disp, EGL_BAD_NATIVE_WINDOW, EGL_NO_SURFACE);
 
@@ -816,7 +836,7 @@ eglCreateWindowSurface(EGLDisplay dpy, EGLConfig config,
    _EGL_FUNC_START(disp, EGL_OBJECT_DISPLAY_KHR, NULL, EGL_NO_SURFACE);
    STATIC_ASSERT(sizeof(void*) == sizeof(window));
    return _eglCreateWindowSurfaceCommon(disp, config, (void*) window,
-                                        attrib_list);
+                                        attrib_list, EGL_FALSE);
 }
 
 
@@ -827,22 +847,8 @@ eglCreatePlatformWindowSurfaceEXT(EGLDisplay dpy, EGLConfig config,
 {
    _EGLDisplay *disp = _eglLockDisplay(dpy);
    _EGL_FUNC_START(disp, EGL_OBJECT_DISPLAY_KHR, NULL, EGL_NO_SURFACE);
-
-#ifdef HAVE_X11_PLATFORM
-   if (disp->Platform == _EGL_PLATFORM_X11 && native_window != NULL) {
-      /* The `native_window` parameter for the X11 platform differs between
-       * eglCreateWindowSurface() and eglCreatePlatformPixmapSurfaceEXT(). In
-       * egl

[Mesa-dev] [PATCH 7/7] egl: Fix DebugMessageControl(callback, NULL)

2016-09-08 Thread Adam Jackson

Treat a null attribute list as meaning "don't change attributes". This
is semantically equivalent to a list consisting of just EGL_NONE.

Signed-off-by: Adam Jackson 
---
 src/egl/main/eglapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 3bbf3de..0034f1e 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -2069,9 +2069,9 @@ eglDebugMessageControlKHR(EGLDEBUGPROCKHR callback, const EGLAttrib *attrib_list
             }
          }
 
-         _eglGlobal.debugCallback = callback;
          _eglGlobal.debugTypesEnabled = newEnabled;
       }
+      _eglGlobal.debugCallback = callback;
    } else {
       _eglGlobal.debugCallback = NULL;
       _eglGlobal.debugTypesEnabled = _EGL_DEBUG_BIT_CRITICAL | _EGL_DEBUG_BIT_ERROR;
-- 
2.9.3
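
As a rough model of the behaviour after this fix, the sketch below uses stand-in types and placeholder token values instead of the real EGL headers. The names debug_state, debug_message_control, bit_from_type and the enum values are all hypothetical. It shows the callback being stored whether or not an attribute list is supplied, while a NULL list leaves the enabled message types unchanged. It does not attempt to address the other issue mentioned in the thread (attributes not being validated when the callback is NULL).

```c
#include <stddef.h>

/* Stand-ins for EGLAttrib and EGLDEBUGPROCKHR; hypothetical types. */
typedef long attrib_t;
typedef void (*debug_proc_t)(void);

/* Placeholder tokens, NOT the real EGL enum values. */
enum { ATTR_NONE = 0, MSG_CRITICAL = 1, MSG_ERROR, MSG_WARN, MSG_INFO };
enum { DBG_OK = 0, DBG_BAD_ATTRIBUTE = 1 };

struct debug_state {
   debug_proc_t callback;
   unsigned types_enabled;
};

/* One bit per message type, lowest bit for MSG_CRITICAL. */
static unsigned bit_from_type(attrib_t type)
{
   return 1u << (type - MSG_CRITICAL);
}

/* Empty callback, only here so the sketch has something to register. */
static void example_callback(void) { }

static int
debug_message_control(struct debug_state *st, debug_proc_t callback,
                      const attrib_t *attrib_list)
{
   if (callback == NULL) {
      /* Unregister and fall back to the default set of message types. */
      st->callback = NULL;
      st->types_enabled = bit_from_type(MSG_CRITICAL) | bit_from_type(MSG_ERROR);
      return DBG_OK;
   }

   unsigned new_enabled = st->types_enabled;
   if (attrib_list != NULL) {
      for (int i = 0; attrib_list[i] != ATTR_NONE; i += 2) {
         if (attrib_list[i] < MSG_CRITICAL || attrib_list[i] > MSG_INFO)
            return DBG_BAD_ATTRIBUTE;   /* state is left untouched */
         if (attrib_list[i + 1])
            new_enabled |= bit_from_type(attrib_list[i]);
         else
            new_enabled &= ~bit_from_type(attrib_list[i]);
      }
   }

   /* Stored even when attrib_list is NULL, which is the point of the fix. */
   st->callback = callback;
   st->types_enabled = new_enabled;
   return DBG_OK;
}
```

In this model a NULL list and a list holding only the terminator take the same path through the function, which matches the "semantically equivalent" wording in the commit message.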


[Mesa-dev] [PATCH 2/7] EGL: Implement eglLabelObjectKHR

2016-09-08 Thread Adam Jackson

From: Kyle Brenneman 

Added a label to the _EGLThreadInfo, _EGLDisplay, and EGLResource
structs. Implemented the function eglLabelObjectKHR.

Reviewed-by: Adam Jackson 
---
 src/egl/main/eglapi.c | 63 +++
 src/egl/main/eglcurrent.c |  9 +++
 src/egl/main/eglcurrent.h |  4 +++
 src/egl/main/egldisplay.h |  4 +++
 4 files changed, 80 insertions(+)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index df2dcd6..31b842f 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1791,6 +1791,68 @@ eglExportDMABUFImageMESA(EGLDisplay dpy, EGLImage image,
RETURN_EGL_EVAL(disp, ret);
 }
 
+static EGLint EGLAPIENTRY
+eglLabelObjectKHR(
+      EGLDisplay dpy,
+      EGLenum objectType,
+      EGLObjectKHR object,
+      EGLLabelKHR label)
+{
+   if (objectType == EGL_OBJECT_THREAD_KHR) {
+      _EGLThreadInfo *t = _eglGetCurrentThread();
+      if (!_eglIsCurrentThreadDummy()) {
+         t->Label = label;
+      }
+      return EGL_SUCCESS;
+   } else {
+      _EGLDisplay *disp = _eglLookupDisplay(dpy);
+      if (disp == NULL) {
+         _eglError(EGL_BAD_DISPLAY, "eglLabelObjectKHR");
+         return EGL_BAD_DISPLAY;
+      }
+
+      if (objectType == EGL_OBJECT_DISPLAY_KHR) {
+         if (dpy != (EGLDisplay) object) {
+            _eglError(EGL_BAD_PARAMETER, "eglLabelObjectKHR");
+            return EGL_BAD_PARAMETER;
+         }
+         disp->Label = label;
+         return EGL_SUCCESS;
+      } else {
+         _EGLResourceType type;
+         switch (objectType)
+         {
+            case EGL_OBJECT_CONTEXT_KHR:
+               type = _EGL_RESOURCE_CONTEXT;
+               break;
+            case EGL_OBJECT_SURFACE_KHR:
+               type = _EGL_RESOURCE_SURFACE;
+               break;
+            case EGL_OBJECT_IMAGE_KHR:
+               type = _EGL_RESOURCE_IMAGE;
+               break;
+            case EGL_OBJECT_SYNC_KHR:
+               type = _EGL_RESOURCE_SYNC;
+               break;
+            case EGL_OBJECT_STREAM_KHR:
+            default:
+               _eglError(EGL_BAD_PARAMETER, "eglLabelObjectKHR");
+               return EGL_BAD_PARAMETER;
+         }
+
+         if (_eglCheckResource(object, type, disp)) {
+            _EGLResource *res = (_EGLResource *) object;
+            res->Label = label;
+            return EGL_SUCCESS;
+         } else {
+            _eglError(EGL_BAD_PARAMETER, "eglLabelObjectKHR");
+            return EGL_BAD_PARAMETER;
+         }
+      }
+   }
+}
+
+
 __eglMustCastToProperFunctionPointerType EGLAPIENTRY
 eglGetProcAddress(const char *procname)
 {
@@ -1870,6 +1932,7 @@ eglGetProcAddress(const char *procname)
       { "eglGetSyncValuesCHROMIUM", (_EGLProc) eglGetSyncValuesCHROMIUM },
       { "eglExportDMABUFImageQueryMESA", (_EGLProc) eglExportDMABUFImageQueryMESA },
       { "eglExportDMABUFImageMESA", (_EGLProc) eglExportDMABUFImageMESA },
+      { "eglLabelObjectKHR", (_EGLProc) eglLabelObjectKHR },
       { NULL, NULL }
    };
    EGLint i;
diff --git a/src/egl/main/eglcurrent.c b/src/egl/main/eglcurrent.c
index 345f4cc..6dd6f4c 100644
--- a/src/egl/main/eglcurrent.c
+++ b/src/egl/main/eglcurrent.c
@@ -279,3 +279,12 @@ _eglError(EGLint errCode, const char *msg)
 
return EGL_FALSE;
 }
+
+/**
+ * Returns the label set for the current thread.
+ */
+EGLLabelKHR _eglGetThreadLabel(void)
+{
+   _EGLThreadInfo *t = _eglGetCurrentThread();
+   return t->Label;
+}
diff --git a/src/egl/main/eglcurrent.h b/src/egl/main/eglcurrent.h
index b922435..e139271 100644
--- a/src/egl/main/eglcurrent.h
+++ b/src/egl/main/eglcurrent.h
@@ -54,6 +54,7 @@ struct _egl_thread_info
    EGLint LastError;
    _EGLContext *CurrentContext;
    EGLenum CurrentAPI;
+   EGLLabelKHR Label;
 };
 
 
@@ -91,6 +92,9 @@ _eglGetCurrentContext(void);
 extern EGLBoolean
 _eglError(EGLint errCode, const char *msg);
 
+extern EGLLabelKHR
+_eglGetThreadLabel(void);
+
 
 #ifdef __cplusplus
 }
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index 6bfc858..d27f63a 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -79,6 +79,8 @@ struct _egl_resource
    EGLBoolean IsLinked;
    EGLint RefCount;
 
+   EGLLabelKHR Label;
+
    /* used to link resources of the same type */
    _EGLResource *Next;
 };
@@ -165,6 +167,8 @@ struct _egl_display
 
    /* lists of resources */
    _EGLResource *ResourceLists[_EGL_NUM_RESOURCES];
+
+   EGLLabelKHR Label;
 };
 
 
-- 
2.9.3

