Re: [Mesa-dev] [PATCH] nvc0: bind a fake tess control program when there isn't one available

2015-07-26 Thread Samuel Pitoiset



On 07/26/2015 06:56 AM, Ilia Mirkin wrote:

Apparently this is necessary in order for tess factors to work in a tess
eval program without a tess control program bound. Probably because it
uses the fake program's shader header to work out the number of patch
constants.

Fixes vs-tes-tessinner-tessouter-inputs

Signed-off-by: Ilia Mirkin 
---
  src/gallium/drivers/nouveau/nvc0/nvc0_context.c  |  5 +
  src/gallium/drivers/nouveau/nvc0/nvc0_context.h  |  3 +++
  src/gallium/drivers/nouveau/nvc0/nvc0_program.c  | 17 +
  src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c |  6 +-
  4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
index 84f8db6..46970db 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
@@ -132,6 +132,9 @@ nvc0_context_unreference_resources(struct nvc0_context *nvc0)
pipe_resource_reference(res, NULL);
 }
 util_dynarray_fini(&nvc0->global_residents);
+
+   if (nvc0->tcp_empty)
+  nvc0->base.pipe.delete_tcs_state(&nvc0->base.pipe, nvc0->tcp_empty);
  }
  
  static void

@@ -326,6 +329,8 @@ nvc0_create(struct pipe_screen *pscreen, void *priv)
  
 /* shader builtin library is per-screen, but we need a context for m2mf */

 nvc0_program_library_upload(nvc0);
+   nvc0_program_init_tcp_empty(nvc0);
+   nvc0->dirty |= NVC0_NEW_TCTLPROG;
  
 /* add permanently resident buffers to bufctxts */
  
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h

index f449942..df1a891 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -128,6 +128,8 @@ struct nvc0_context {
 struct nvc0_program *fragprog;
 struct nvc0_program *compprog;
  
+   struct nvc0_program *tcp_empty;

+
 struct nvc0_constbuf constbuf[6][NVC0_MAX_PIPE_CONSTBUFS];
 uint16_t constbuf_dirty[6];
 uint16_t constbuf_valid[6];
@@ -227,6 +229,7 @@ void nvc0_program_destroy(struct nvc0_context *, struct nvc0_program *);
  void nvc0_program_library_upload(struct nvc0_context *);
  uint32_t nvc0_program_symbol_offset(const struct nvc0_program *,
  uint32_t label);
+void nvc0_program_init_tcp_empty(struct nvc0_context *);
  
  /* nvc0_query.c */

  void nvc0_init_query_functions(struct nvc0_context *);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 4941831..e9975ce 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -22,6 +22,8 @@
  
  #include "pipe/p_defines.h"
  
+#include "tgsi/tgsi_ureg.h"

+
  #include "nvc0/nvc0_context.h"
  
  #include "codegen/nv50_ir_driver.h"

@@ -803,3 +805,18 @@ nvc0_program_symbol_offset(const struct nvc0_program *prog, uint32_t label)
   return prog->code_base + base + syms[i].offset;
 return prog->code_base; /* no symbols or symbol not found */
  }
+
+void
+nvc0_program_init_tcp_empty(struct nvc0_context *nvc0)
+{
+   struct ureg_program *ureg;
+
+   ureg = ureg_create(TGSI_PROCESSOR_TESS_CTRL);
+   if (!ureg)
+  return;
+
+   ureg_property(ureg, TGSI_PROPERTY_TCS_VERTICES_OUT, 1);
+   ureg_END(ureg);
+
+   nvc0->tcp_empty = ureg_create_shader_and_destroy(ureg, &nvc0->base.pipe);
+}
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c
index 8aa127a..e21515f 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_shader_state.c
@@ -148,8 +148,12 @@ nvc0_tctlprog_validate(struct nvc0_context *nvc0)
BEGIN_NVC0(push, NVC0_3D(SP_GPR_ALLOC(2)), 1);
PUSH_DATA (push, tp->num_gprs);
 } else {
-  BEGIN_NVC0(push, NVC0_3D(SP_SELECT(2)), 1);
+  tp = nvc0->tcp_empty;
+  if (!nvc0_program_validate(nvc0, tp))
+ assert(!"unable to validate empty tcp");
+  BEGIN_NVC0(push, NVC0_3D(SP_SELECT(2)), 2);
PUSH_DATA (push, 0x20);
+  PUSH_DATA (push, tp->code_base);
 }


It would be good to check that tp is not NULL before trying to validate
the program.
And if the program can't be validated, I don't think we want to push
tp->code_base, do we?



 nvc0_program_update_context_state(nvc0, tp, 1);
  }
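
The two points raised in the review comment above (check for a NULL program, and skip pushing code_base when validation fails) can be sketched as below. This is only an illustration under stated assumptions: `Program`, `emit_empty_tcp` and the `valid` flag are hypothetical stand-ins, not the nvc0 driver's types or functions, and the push buffer is modelled as a plain vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for illustration; this is NOT struct nvc0_program.
struct Program {
   bool valid;          // whether validation would succeed
   uint32_t code_base;  // offset of the uploaded code
};

// Append the two data words for the fallback program to `push`.
// Returns false, appending nothing, when there is no usable program.
bool emit_empty_tcp(const Program *tp, std::vector<uint32_t> &push)
{
   if (!tp)          // creation of the fallback program may have failed
      return false;
   if (!tp->valid)   // never push the code_base of an unvalidated program
      return false;
   push.push_back(0x20);
   push.push_back(tp->code_base);
   return true;
}

// Usage: a missing program leaves the push buffer untouched.
void example_missing_program()
{
   std::vector<uint32_t> push;
   assert(!emit_empty_tcp(nullptr, push));
   assert(push.empty());
}
```

Returning early keeps both failure cases out of the command stream; what the driver should bind instead in those cases is left open here.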


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V3] glsl: fix atomic buffer index for bindings other than 0

2015-07-26 Thread Timothy Arceri
Since commit c0cd5b, var->data.binding has been used as a replacement
for the atomic buffer index, but they don't have to be the same value; they
just happen to end up the same when binding is 0.

Now we store the atomic buffer index in the unused var->data.index
to avoid the extra memory of putting back the atomic buffer index field.

V3: Don't make unrelated changes to the uniform storage handling.
Cc the correct stable branch.

V2: store buffer index in var->data.index and uniform slot in
var->data.location to avoid issues when linking more than 2 shaders.
Also some small tidy ups.

Cc: Alejandro Piñeiro 
Cc: Ian Romanick 
Cc: 10.6 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
---
 src/glsl/ir.h  | 2 ++
 src/glsl/link_atomics.cpp  | 2 ++
 src/glsl/nir/glsl_to_nir.cpp   | 2 --
 src/glsl/nir/nir.h | 5 +++--
 src/glsl/nir/nir_lower_atomics.c   | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
 6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ede8caa..f77534c 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -757,6 +757,8 @@ public:
* \note
* The GLSL spec only allows the values 0 or 1 for the index in \b dual
* source blending.
+   *
+   * For atomic counters this stores the atomic buffer index.
*/
   unsigned index:1;
 
diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
index 100d03c..5120564 100644
--- a/src/glsl/link_atomics.cpp
+++ b/src/glsl/link_atomics.cpp
@@ -204,6 +204,8 @@ link_assign_atomic_counter_resources(struct gl_context *ctx,
  if (!var->data.explicit_binding)
 var->data.binding = i;
 
+ var->data.index = i;
+
  storage->atomic_buffer_index = i;
  storage->offset = var->data.atomic.offset;
  storage->array_stride = (var->type->is_array() ?
diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 77327b6..6cb23c0 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -326,8 +326,6 @@ nir_visitor::visit(ir_variable *ir)
 
var->data.index = ir->data.index;
var->data.binding = ir->data.binding;
-   /* XXX Get rid of buffer_index */
-   var->data.atomic.buffer_index = ir->data.binding;
var->data.atomic.offset = ir->data.atomic.offset;
var->data.image.read_only = ir->data.image_read_only;
var->data.image.write_only = ir->data.image_write_only;
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 62cdbd4..837d197 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -292,7 +292,9 @@ typedef struct {
   unsigned int driver_location;
 
   /**
-   * output index for dual source blending.
+   * Output index for dual source blending.
+   *
+   * For atomic counters this stores the atomic buffer index.
*/
   int index;
 
@@ -307,7 +309,6 @@ typedef struct {
* Location an atomic counter is stored at.
*/
   struct {
- unsigned buffer_index;
  unsigned offset;
   } atomic;
 
diff --git a/src/glsl/nir/nir_lower_atomics.c b/src/glsl/nir/nir_lower_atomics.c
index ce3615a..6119f62 100644
--- a/src/glsl/nir/nir_lower_atomics.c
+++ b/src/glsl/nir/nir_lower_atomics.c
@@ -63,7 +63,7 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl *impl)
 
nir_intrinsic_instr *new_instr = nir_intrinsic_instr_create(mem_ctx, op);
new_instr->const_index[0] =
-  (int) instr->variables[0]->var->data.atomic.buffer_index;
+  (int) instr->variables[0]->var->data.index;
 
nir_load_const_instr *offset_const = nir_load_const_instr_create(mem_ctx, 1);
offset_const->value.u[0] = instr->variables[0]->var->data.atomic.offset;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 6fee798..5e0bf1b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2407,7 +2407,7 @@ vec4_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   ir->actual_parameters.get_head());
ir_variable *location = deref->variable_referenced();
unsigned surf_index = (prog_data->base.binding_table.abo_start +
-  location->data.binding);
+  location->data.index);
 
/* Calculate the surface offset */
src_reg offset(this, glsl_type::uint_type);
-- 
2.4.3
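
To illustrate why the binding and the atomic buffer index only coincide by accident, here is a minimal sketch. It assumes active buffers are numbered densely in ascending binding order; `assign_buffer_indices` is a hypothetical helper and not the linker's code.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: map each binding point in use to a dense buffer
// index, numbering the distinct bindings in ascending order.
std::map<unsigned, unsigned>
assign_buffer_indices(const std::vector<unsigned> &bindings)
{
   std::map<unsigned, unsigned> index_of;
   for (unsigned b : bindings)
      index_of.emplace(b, 0);   // collect the distinct bindings
   unsigned next = 0;
   for (auto &entry : index_of)
      entry.second = next++;    // dense index, in binding order
   return index_of;
}

// Usage: counters bound at 3 and 7 get buffer indices 0 and 1.
void example_sparse_bindings()
{
   auto index_of = assign_buffer_indices({3, 7, 3});
   assert(index_of.at(3) == 0);  // binding 3 is the first active buffer
   assert(index_of.at(7) == 1);  // binding 7 is the second
}
```

With a single counter at binding 0 the two values are equal, which is the case the commit message describes as working by coincidence.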



Re: [Mesa-dev] [PATCH V2] glsl: fix atomic buffer index for bindings other than 0

2015-07-26 Thread Timothy Arceri

On Sat, 2015-07-25 at 18:11 +0200, Alejandro Piñeiro wrote:
> Hi Timothy,
> 
> thanks for CCing me. Just to say that it looks good to me. And FWIW,
> with this patch, the piglit subtest included on the second version of my
> patch (second version after the first review of Ian Romanick) here:
> 
> http://lists.freedesktop.org/archives/piglit/2015-May/015979.html

Thanks for testing. I tested with your piglit patch too. I'll try to give your
second version a review soon :)

I've also sent a V3 of my patch as I realised just after sending V2 that I
shouldn't have needed to be messing with the uniform location.

> 
> pass properly.
> 
> Best regards
> 
> On 25/07/15 16:24, Timothy Arceri wrote:
> > Since commit c0cd5b var->data.binding was being used as a replacement
> > for atomic buffer index, but they don't have to be the same value they
> > just happen to end up the same when binding is 0.
> > 
> > Now we store atomic buffer index in the unused var->data.index
> > to avoid the extra memory of putting back the atomic buffer index field.
> > 
> > V2: store buffer index in var->data.index and uniform slot in
> > var->data.location to avoid issues when linking more than 2 shaders.
> > Also some small tidy ups.
> > 
> > Cc: Alejandro Piñeiro 
> > Cc: Ian Romanick 
> > Cc: 10.4, 10.5 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> > ---
> >  src/glsl/ir.h  |  3 +++
> >  src/glsl/link_atomics.cpp  | 18 +++---
> >  src/glsl/link_uniforms.cpp |  4 
> >  src/glsl/nir/glsl_to_nir.cpp   |  2 --
> >  src/glsl/nir/nir.h |  6 --
> >  src/glsl/nir/nir_lower_atomics.c   |  2 +-
> >  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  2 +-
> >  7 files changed, 20 insertions(+), 17 deletions(-)
> > 
> > diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> > index ede8caa..e76b0ec 100644
> > --- a/src/glsl/ir.h
> > +++ b/src/glsl/ir.h
> > @@ -757,6 +757,8 @@ public:
> > * \note
> > * The GLSL spec only allows the values 0 or 1 for the index in \b 
> > dual
> > * source blending.
> > +   *
> > +   * For atomic counters this stores the atomic buffer index.
> > */
> >unsigned index:1;
> >  
> > @@ -819,6 +821,7 @@ public:
> > *   - Fragment shader output: one of the values from \c 
> > gl_frag_result.
> > *   - Uniforms: Per-stage uniform slot number for default uniform 
> > block.
> > *   - Uniforms: Index within the uniform block definition for UBO 
> > members.
> > +   *   - Atomic Counter: Uniform slot number.
> > *   - Other: This field is not currently used.
> > *
> > * If the variable is a uniform, shader input, or shader output, 
> > and the
> > diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
> > index 100d03c..5d3c40f 100644
> > --- a/src/glsl/link_atomics.cpp
> > +++ b/src/glsl/link_atomics.cpp
> > @@ -33,7 +33,6 @@ namespace {
> >  * Atomic counter as seen by the program.
> >  */
> > struct active_atomic_counter {
> > -  unsigned id;
> >ir_variable *var;
> > };
> >  
> > @@ -52,7 +51,7 @@ namespace {
> >   free(counters);
> >}
> >  
> > -  void push_back(unsigned id, ir_variable *var)
> > +  void push_back(ir_variable *var)
> >{
> >   active_atomic_counter *new_counters;
> >  
> > @@ -66,7 +65,6 @@ namespace {
> >   }
> >  
> >   counters = new_counters;
> > - counters[num_counters].id = id;
> >   counters[num_counters].var = var;
> >   num_counters++;
> >}
> > @@ -114,10 +112,6 @@ namespace {
> >  ir_variable *var = node->as_variable();
> >  
> >  if (var && var->type->contains_atomic()) {
> > -   unsigned id = 0;
> > -   bool found = prog->UniformHash->get(id, var->name);
> > -   assert(found);
> > -   (void) found;
> > active_atomic_buffer *buf = &buffers[var->data.binding];
> >  
> > /* If this is the first time the buffer is used, increment
> > @@ -126,7 +120,7 @@ namespace {
> > if (buf->size == 0)
> >(*num_buffers)++;
> >  
> > -   buf->push_back(id, var);
> > +   buf->push_back(var);
> >  
> > buf->stage_references[i]++;
> > buf->size = MAX2(buf->size, var->data.atomic.offset +
> > @@ -197,13 +191,15 @@ link_assign_atomic_counter_resources(struct 
> > gl_context *ctx,
> >/* Assign counter-specific fields. */
> >for (unsigned j = 0; j < ab.num_counters; j++) {
> >   ir_variable *const var = ab.counters[j].var;
> > - const unsigned id = ab.counters[j].id;
> > - gl_uniform_storage *const storage = &prog->UniformStorage[id];
> > + gl_uniform_storage *const storage =
> > +&prog->U

Re: [Mesa-dev] [PATCH V2] glsl: fix atomic buffer index for bindings other than 0

2015-07-26 Thread Timothy Arceri
On Sun, 2015-07-26 at 03:04 +0200, Erik Faye-Lund wrote:
> On Sat, Jul 25, 2015 at 4:24 PM, Timothy Arceri  
> wrote:
> > Since commit c0cd5b var->data.binding was being used as a replacement
> > for atomic buffer index, but they don't have to be the same value they
> > just happen to end up the same when binding is 0.
> > 
> > Now we store atomic buffer index in the unused var->data.index
> > to avoid the extra memory of putting back the atomic buffer index field.
> > 
> > V2: store buffer index in var->data.index and uniform slot in
> > var->data.location to avoid issues when linking more than 2 shaders.
> > Also some small tidy ups.
> > 
> > Cc: Alejandro Piñeiro 
> > Cc: Ian Romanick 
> > Cc: 10.4, 10.5 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> > ---
> >  src/glsl/ir.h  |  3 +++
> >  src/glsl/link_atomics.cpp  | 18 +++---
> >  src/glsl/link_uniforms.cpp |  4 
> >  src/glsl/nir/glsl_to_nir.cpp   |  2 --
> >  src/glsl/nir/nir.h |  6 --
> >  src/glsl/nir/nir_lower_atomics.c   |  2 +-
> >  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  2 +-
> >  7 files changed, 20 insertions(+), 17 deletions(-)
> > 
> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> > index 62cdbd4..d97db68 100644
> > --- a/src/glsl/nir/nir.h
> > +++ b/src/glsl/nir/nir.h
> > @@ -307,7 +310,6 @@ typedef struct {
> > * Location an atomic counter is stored at.
> > */
> >struct {
> > - unsigned buffer_index;
> >   unsigned offset;
> >} atomic;
> 
> This smells a bit like the struct should be nuked altogether...

Yep, but it should be in a separate patch as this is aimed at stable. I don't
have time for that clean-up right now but feel free to take it on. The struct
in ir.h should go too.


[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes

2015-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91020

Pali Rohár  changed:

   What|Removed |Added

 CC||pali.ro...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 3/3 v4.1] clover: add clLinkProgramm (CL 1.2)

2015-07-26 Thread Francisco Jerez
EdB  writes:

> ---
>  src/gallium/state_trackers/clover/api/program.cpp | 33 
> +++
>  1 file changed, 33 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/api/program.cpp 
> b/src/gallium/state_trackers/clover/api/program.cpp
> index 553bc83..086f952 100644
> --- a/src/gallium/state_trackers/clover/api/program.cpp
> +++ b/src/gallium/state_trackers/clover/api/program.cpp
> @@ -238,6 +238,39 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
> return e.get();
>  }
>  
> +CLOVER_API cl_program
> +clLinkProgram(cl_context d_ctx, cl_uint num_devs, const cl_device_id *d_devs,
> +  const char *p_opts, cl_uint num_progs, const cl_program 
> *d_progs,
> +  void (*pfn_notify) (cl_program, void *), void *user_data,
> +  cl_int *r_errcode) try {
> +   auto &ctx = obj(d_ctx);
> +   auto devs = (d_devs ? objs(d_devs, num_devs) :
> +ref_vector(ctx.devices()));
> +   auto opts = (p_opts ? p_opts : "");
> +   auto progs = objs(d_progs, num_progs);
> +
> +   if (!pfn_notify && user_data)
> + throw error(CL_INVALID_VALUE);
> +
> +   if (any_of([&](const device &dev) {
> +return !count(dev, ctx.devices());
> + }, objs(d_devs, num_devs)))
> +  throw error(CL_INVALID_DEVICE);
> +
> +   auto prog = intrusive_ref(*(new program(ctx, {}, {})));

Ah, of course, the empty initializers now made it impossible for the
compiler to deduce the correct argument types for create() -- You could
give it some help though like 'create(ctx,
ref_vector(), std::vector())', or feel free to add
default "= {}" initializers to the last two constructor arguments so you
don't need to pass them at all.
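
The deduction problem and the default-initializer workaround can be sketched as follows. `Program` and `make` are hypothetical stand-ins written for this illustration; they are not the clover `program` class or its `create()` helper.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class whose last two constructor arguments
// are containers; not the real clover::program.
struct Program {
   Program(int ctx,
           const std::vector<int> &devs = {},
           const std::vector<std::string> &binaries = {})
      : ctx(ctx), num_devs(devs.size()), num_binaries(binaries.size()) {}

   int ctx;
   std::size_t num_devs;
   std::size_t num_binaries;
};

// Forwarding factory in the spirit of a create() helper. A bare `{}` has
// no type from which Args could be deduced, so make<Program>(1, {}, {})
// would not compile; omitting the defaulted arguments does.
template<typename T, typename... Args>
std::unique_ptr<T> make(Args &&... args)
{
   return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

// Usage: only the context is passed, the containers default to empty.
void example_defaults()
{
   auto p = make<Program>(7);
   assert(p->num_devs == 0 && p->num_binaries == 0);
}
```

The other option mentioned above, spelling out the empty containers with their full types, also works with such a factory because each argument then has a deducible type.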

> +   try {
> +  prog().link(devs, opts, progs);
> +  ret_error(r_errcode, CL_SUCCESS);;

Double semicolon.  With these fixed this patch is:

Reviewed-by: Francisco Jerez 

> +   } catch (link_error &e) {
> +  ret_error(r_errcode, CL_LINK_PROGRAM_FAILURE);
> +   }
> +
> +   return ret_object(prog);
> +} catch (error &e) {
> +   ret_error(r_errcode, e);
> +   return NULL;
> +}
> +
>  CLOVER_API cl_int
>  clUnloadCompiler() {
> return CL_SUCCESS;
> -- 
> 2.5.0.rc2




Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-26 Thread Zoltán Gilián
> auto img = dynamic_cast(**(explicit_arg - 1))

Ok, so it should be (explicit_arg - 2) for the image format, I
presume. This will be incorrect, however, if the targets that need
implicit argument for format metadata are indeed a strict superset of
the ones that need dimension, as you mentioned before. The targets
that only need format will break this code. Should I swap the order of
the format and dimension implicit args to make this approach work
under the aforementioned assumption?

> It also seems like you've got rid of the static casts you had
> previously?

That is a mistake, I'll fix it.

> My expectation here was that the compiler would be able to hard-code
> sampler indices in the shader without the API passing any actual data in
> the input buffer.  Doesn't that work for you?

Yes, that's correct, but this is also the case with images. If image
index is uploaded explicitly, I don't see why it can't be done with
sampler indices. But probably it's a better idea to send the sampler
value rather than the index, in case the kernel needs it (e.g. the
normalized or non-normalized nature of texture coordinates may have to
be specified in the fetch instruction itself, and not by hardware
registers), so I'll definitely change this.
But the bigger problem is that the byte offsets of the kernel
arguments are computed considering the sampler argument too, so the
binary expects it to be present in the input vector. Furthermore I
can't erase the sampler argument from the IR, because it is needed to
make it possible for get_kernel_args to detect the sampler. But if
sampler_argument::bind doesn't append 4 bytes (clang compiles
sampler_t to i32) to the input vector, the binary will try to load the
following arguments from wrong locations.
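
The offset mismatch described above can be sketched with a simplified layout model. The assumptions are mine: arguments are packed back to back with no alignment padding, a pointer takes 8 bytes, and `arg_offsets` is a hypothetical helper rather than anything in clover. The 4-byte sampler size follows the i32 remark above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: each kernel argument occupies `size` bytes in the
// input vector, laid out back to back (alignment ignored for brevity).
std::vector<std::size_t> arg_offsets(const std::vector<std::size_t> &sizes)
{
   std::vector<std::size_t> offsets;
   std::size_t cursor = 0;
   for (std::size_t size : sizes) {
      offsets.push_back(cursor);
      cursor += size;
   }
   return offsets;
}

// Usage: compare the compiled layout with one that omits the sampler.
void example_missing_sampler()
{
   // Layout the binary was compiled for: pointer, sampler (i32), int.
   auto expected = arg_offsets({8, 4, 4});
   // Layout if the sampler contributes no bytes at bind time.
   auto actual = arg_offsets({8, 4});
   assert(expected[2] == 12);  // where the binary loads the int from
   assert(actual[1] == 8);     // where the int actually ends up
}
```

In this model every argument after the sampler shifts down by 4 bytes, which is the mismatch between the binary and the input vector.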

> I don't think it's a good idea to use such obvious names for the
> implicit argument types. Couldn't they collide with some type declared
> by the user that happens to have the same name?  How about
> "__clover_image_size" or something similar?

Indeed, that's a good point. I'd go with something like
"__opencl_image_*" or "__llvm_image_*", because these strings will be
added to llvm, and non-clover code may depend on them in the future.

> (Identifiers starting with
> double underscore are reserved for the implementation at least on C and
> GLSL, not sure about OpenCL-C)

I believe this is true for OpenCL C too, since it is an extension to
C99. Identifiers starting with double underscores are reserved in C99.

On Sat, Jul 25, 2015 at 1:06 PM, Francisco Jerez  wrote:
> Zoltan Gilian  writes:
>
>> Read-only and write-only image arguments are recognized and
>> distinguished.
>> Attributes of the image arguments are passed to the kernel as implicit
>> arguments.
>> ---
>>  src/gallium/state_trackers/clover/core/kernel.cpp  |  46 ++-
>>  src/gallium/state_trackers/clover/core/kernel.hpp  |  13 +-
>>  src/gallium/state_trackers/clover/core/memory.cpp  |   2 +-
>>  src/gallium/state_trackers/clover/core/module.hpp  |   4 +-
>>  .../state_trackers/clover/llvm/invocation.cpp  | 147 
>> -
>>  5 files changed, 198 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp 
>> b/src/gallium/state_trackers/clover/core/kernel.cpp
>> index 0756f06..1a6c28f 100644
>> --- a/src/gallium/state_trackers/clover/core/kernel.cpp
>> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp
>> @@ -158,13 +158,18 @@ 
>> kernel::exec_context::bind(intrusive_ptr _q,
>> auto margs = find(name_equals(kern.name()), m.syms).args;
>> auto msec = find(type_equals(module::section::text), m.secs);
>> auto explicit_arg = kern._args.begin();
>> +   image_argument *last_image_arg = nullptr;
>>
>> for (auto &marg : margs) {
>>switch (marg.semantic) {
>> -  case module::argument::general:
>> +  case module::argument::general: {
>> + auto image_arg = 
>> dynamic_cast(explicit_arg->get());
>> + if (image_arg) {
>> +last_image_arg = image_arg;
>> + }
>>   (*(explicit_arg++))->bind(*this, marg);
>>   break;
>> -
>> +  }
>>case module::argument::grid_dimension: {
>>   const cl_uint dimension = grid_offset.size();
>>   auto arg = argument::create(marg);
>> @@ -182,6 +187,36 @@ kernel::exec_context::bind(intrusive_ptr 
>> _q,
>>   }
>>   break;
>>}
>> +  case module::argument::image_size: {
>> + assert(last_image_arg);
>> + auto img = last_image_arg->get();
>
> Instead of carrying around an extra variable during the loop, you could
> achieve the same effect more locally by doing:
>
> |auto img = dynamic_cast(**(explicit_arg - 1))
> |   .get();
>
> The cast to reference would also make sure that the argument is of the
> specified type or otherwise throw std::bad_cast which is as good as an
> assertion failure.
>
>> + std::vector image_size({
>> +   img->widt

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-26 Thread Francisco Jerez
Zoltán Gilián  writes:

>> auto img = dynamic_cast(**(explicit_arg - 1))
>
> Ok, so it should be (explicit_arg - 2) for the image format, I
> presume.

Why?  Your module::argument::image_size and ::image_format cases don't
touch the explicit_arg iterator at all AFAICT, so it will be left
pointing one past the last general semantic argument.
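
A minimal sketch of that "one past the last bound argument" reasoning, combined with the reference cast suggested earlier in the thread. The class hierarchy and `last_image_width` are hypothetical and only mimic the shape of the clover code.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>
#include <vector>

// Hypothetical argument hierarchy, not the real clover classes.
struct Argument { virtual ~Argument() = default; };
struct ScalarArgument : Argument {};
struct ImageArgument : Argument { int width = 64; };

using ArgList = std::vector<std::unique_ptr<Argument>>;

// `next` points one past the most recently bound explicit argument, so
// *(next - 1) is that argument. The reference cast throws std::bad_cast
// if it is not an ImageArgument.
int last_image_width(ArgList::const_iterator next)
{
   return dynamic_cast<const ImageArgument &>(**(next - 1)).width;
}

// Usage: after binding a scalar and then an image, the iterator sits at
// the end and the previous element is the image.
void example_last_image()
{
   ArgList args;
   args.push_back(std::unique_ptr<Argument>(new ScalarArgument));
   args.push_back(std::unique_ptr<Argument>(new ImageArgument));
   assert(last_image_width(args.cend()) == 64);
}
```

Because the implicit-argument cases never advance the iterator, the same `next - 1` lookup would serve both the size and the format case in this model.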

> This will be incorrect, however, if the targets that need implicit
> argument for format metadata are indeed a strict superset of the ones
> that need dimension, as you mentioned before. The targets that only
> need format will break this code. Should I swap the order of the
> format and dimension implicit args to make this approach work under
> the aforementioned assumption?
>
Why would it be incorrect?  The kernel::_args vector that explicit_arg
iterates over contains explicit arguments only, so it shouldn't make a
difference in which order you emit them as long as you don't interleave
other explicit arguments in between.

>> It also seems like you've got rid of the static casts you had
>> previously?
>
> That is a mistake, I'll fix it.
>
>> My expectation here was that the compiler would be able to hard-code
>> sampler indices in the shader without the API passing any actual data in
>> the input buffer.  Doesn't that work for you?
>
> Yes, that's correct, but this is also the case with images. If image
> index is uploaded explicitly, I don't see why it can't be done with
> sampler indices. But probably it's a better idea to send the sampler
> value rather than the index, in case the kernel needs it (e.g. the
> normalized or non-normalized nature of texture coordinates may have to
> be specified in the fetch instruction itself, and not by hardware
> registers), so I'll definitely change this.

Some hardware (e.g. Intel's) will need an index (which can probably be
hardcoded in the shader for now) into a table of sampler configurations
which can be set-up later on by the driver at state-emit time.  In any
case we can always add a new argument semantic enum later on for the
target to select sampler config vs index if different drivers turn out
to need different things.

> But the bigger problem is, that the byte offsets of the kernel
> arguments are computed considering the sampler argument too, so the
> binary expects it to be present in the input vector. Furthermore I
> can't erase the sampler argument from the IR, because it is needed to
> make it possible for get_kernel_args to detect the sampler. But if
> sampler_argument::bind doesn't append 4 bytes (clang compiles
> sampler_t to i32) to the input vector, the binary will try to load the
> following arguments from wrong locations.

Hmmm...  So you only need it as padding?  Wouldn't it be possible for
you to declare samplers to be 0 bytes?

>
>> I don't think it's a good idea to use such obvious names for the
>> implicit argument types. Couldn't they collide with some type declared
>> by the user that happens to have the same name?  How about
>> "__clover_image_size" or something similar?
>
> Indeed, that's a good point. I'd go with something like
> "__opencl_image_*" or "__llvm_image_*", because these strings will be
> added to llvm, and non-clover code may depend on them in the future.
>
Sounds good to me.

>> (Identifiers starting with
>> double underscore are reserved for the implementation at least on C and
>> GLSL, not sure about OpenCL-C)
>
> I believe this is true for OpenCL C too, since it is an extension to
> C99. Identifiers starting with double underscores are reserved in C99.
>
IIRC identifiers starting with double underscore and single underscore
followed by some upper-case letter were reserved even in the original
ANSI C spec.

> On Sat, Jul 25, 2015 at 1:06 PM, Francisco Jerez  
> wrote:
>> Zoltan Gilian  writes:
>>
>>> Read-only and write-only image arguments are recognized and
>>> distinguished.
>>> Attributes of the image arguments are passed to the kernel as implicit
>>> arguments.
>>> ---
>>>  src/gallium/state_trackers/clover/core/kernel.cpp  |  46 ++-
>>>  src/gallium/state_trackers/clover/core/kernel.hpp  |  13 +-
>>>  src/gallium/state_trackers/clover/core/memory.cpp  |   2 +-
>>>  src/gallium/state_trackers/clover/core/module.hpp  |   4 +-
>>>  .../state_trackers/clover/llvm/invocation.cpp  | 147 
>>> -
>>>  5 files changed, 198 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp 
>>> b/src/gallium/state_trackers/clover/core/kernel.cpp
>>> index 0756f06..1a6c28f 100644
>>> --- a/src/gallium/state_trackers/clover/core/kernel.cpp
>>> +++ b/src/gallium/state_trackers/clover/core/kernel.cpp
>>> @@ -158,13 +158,18 @@ 
>>> kernel::exec_context::bind(intrusive_ptr _q,
>>> auto margs = find(name_equals(kern.name()), m.syms).args;
>>> auto msec = find(type_equals(module::section::text), m.secs);
>>> auto explicit_arg = kern._args.begin();
>>> +   image_argument *last_image_arg = nullptr;
>>>
>>> for (auto &marg : ma

[Mesa-dev] [PATCH 3/3 v4.2] clover: add clLinkProgramm (CL 1.2)

2015-07-26 Thread EdB
---
 src/gallium/state_trackers/clover/api/program.cpp  | 33 ++
 src/gallium/state_trackers/clover/core/program.hpp |  4 +--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
index 553bc83..4176562 100644
--- a/src/gallium/state_trackers/clover/api/program.cpp
+++ b/src/gallium/state_trackers/clover/api/program.cpp
@@ -238,6 +238,39 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
return e.get();
 }
 
+CLOVER_API cl_program
+clLinkProgram(cl_context d_ctx, cl_uint num_devs, const cl_device_id *d_devs,
+  const char *p_opts, cl_uint num_progs, const cl_program *d_progs,
+  void (*pfn_notify) (cl_program, void *), void *user_data,
+  cl_int *r_errcode) try {
+   auto &ctx = obj(d_ctx);
+   auto devs = (d_devs ? objs(d_devs, num_devs) :
+ref_vector(ctx.devices()));
+   auto opts = (p_opts ? p_opts : "");
+   auto progs = objs(d_progs, num_progs);
+
+   if (!pfn_notify && user_data)
+ throw error(CL_INVALID_VALUE);
+
+   if (any_of([&](const device &dev) {
+return !count(dev, ctx.devices());
+ }, objs(d_devs, num_devs)))
+  throw error(CL_INVALID_DEVICE);
+
+   auto prog = create(ctx);
+   try {
+  prog().link(devs, opts, progs);
+  ret_error(r_errcode, CL_SUCCESS);
+   } catch (link_error &e) {
+  ret_error(r_errcode, CL_LINK_PROGRAM_FAILURE);
+   }
+
+   return ret_object(prog);
+} catch (error &e) {
+   ret_error(r_errcode, e);
+   return NULL;
+}
+
 CLOVER_API cl_int
 clUnloadCompiler() {
return CL_SUCCESS;
diff --git a/src/gallium/state_trackers/clover/core/program.hpp b/src/gallium/state_trackers/clover/core/program.hpp
index 7d86018..a70ed08 100644
--- a/src/gallium/state_trackers/clover/core/program.hpp
+++ b/src/gallium/state_trackers/clover/core/program.hpp
@@ -40,8 +40,8 @@ namespace clover {
   program(clover::context &ctx,
   const std::string &source);
   program(clover::context &ctx,
-  const ref_vector &devs,
-  const std::vector &binaries);
+  const ref_vector &devs = {},
+  const std::vector &binaries = {});
 
   program(const program &prog) = delete;
   program &
-- 
2.5.0.rc2



[Mesa-dev] Mesa 10.6.3

2015-07-26 Thread Emil Velikov
Mesa 10.6.3 is now available.

This release mostly consists of nouveau bugfixes, although we do
have some patches for the VL module (affecting VDPAU/VAAPI/OMX),
XA (memory leak) and osmesa (long standing typo in
OSMesaGetProcAddress/OSMesaPixelsStore).

Brian Paul (1):
  osmesa: fix OSMesaPixelsStore typo

Chad Versace (1):
  mesa: Fix generation of git_sha1.h.tmp for gitlinks

Christian König (2):
  vl: cleanup video buffer private when the decoder is destroyed
  st/vdpau: fix mixer size checks

Emil Velikov (4):
  docs: Add sha256 checksums for the 10.6.2 release
  auxiliary/vl: use the correct screen index
  Update version to 10.6.3
  Add release notes for 10.6.3

Francisco Jerez (1):
  i965/gen9: Use custom MOCS entries set up by the kernel.

Ilia Mirkin (5):
  nv50, nvc0: enable at least one color RT if alphatest is enabled
  nvc0/ir: fix txq on indirect samplers
  nvc0/ir: don't worry about sampler in txq handling
  gm107/ir: fix indirect txq emission
  nv50: fix max level clamping on G80

Kenneth Graunke (1):
  program: Allow redundant OPTION ARB_fog_* directives.

Rob Clark (1):
  xa: don't leak fences


git tag: mesa-10.6.3

ftp://ftp.freedesktop.org/pub/mesa/10.6.3/mesa-10.6.3.tar.gz
MD5: b47aa2a6a60860de1481507085568f31  mesa-10.6.3.tar.gz
SHA1: b51f1b31aae71101a3c8ec4b43fb871d63fe877c  mesa-10.6.3.tar.gz
SHA256: c27e1e33798e69a6d2d2425aee8ac7b4c0b243066a65dd76cbb182ea31b1c7f2  mesa-10.6.3.tar.gz
PGP: ftp://ftp.freedesktop.org/pub/mesa/10.6.3/mesa-10.6.3.tar.gz.sig

ftp://ftp.freedesktop.org/pub/mesa/10.6.3/mesa-10.6.3.tar.xz
MD5: 553e525d2f20ed48fca8f1ec3176fd83  mesa-10.6.3.tar.xz
SHA1: 31dcbd4d932e74e522fd484e07d7258fdb5fb6b6  mesa-10.6.3.tar.xz
SHA256: 58592e07c350cd2e8969b73fa83048c657a39fe2f13f3b88f5e5818fe2e4676d  mesa-10.6.3.tar.xz
PGP: ftp://ftp.freedesktop.org/pub/mesa/10.6.3/mesa-10.6.3.tar.xz.sig

--
-Emil





[Mesa-dev] [Bug 91468] LLVM 3.8(svn): llvm changes llvm-config output again?

2015-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91468

Bug ID: 91468
   Summary: LLVM 3.8(svn): llvm changes llvm-config output again?
   Product: Mesa
   Version: git
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: lowest
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: sob...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 117384
  --> https://bugs.freedesktop.org/attachment.cgi?id=117384&action=edit
A patch?

LLVM 3.8(svn): the llvm-config-3.8 output for "--version" has "svn" in it, but
the shared library filename doesn't. For reasons?

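A minimal sketch of one way the mismatch described above could be handled,
assuming the only difference is a trailing "svn" tag on the version string.
This is not taken from the attached patch; strip_svn_suffix is a hypothetical
helper name used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the attached patch): drop a trailing
// "svn" tag from an llvm-config style version string so it can be
// compared against a library name that lacks the tag.
std::string strip_svn_suffix(const std::string &version)
{
   assert(!version.empty());  // assumption: callers pass a real version

   const std::string tag = "svn";
   if (version.size() >= tag.size() &&
       version.compare(version.size() - tag.size(), tag.size(), tag) == 0)
      return version.substr(0, version.size() - tag.size());

   return version;  // no tag present, leave the string untouched
}
```

A build script would apply the same idea to the "--version" output before
constructing the shared library name.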
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 91468] LLVM 3.8(svn): llvm changes llvm-config output again?

2015-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91468

Lorenzo Bona  changed:

What       |Removed    |Added
-----------+-----------+----------------------
CC         |           |lorenz.b...@gmail.com

--- Comment #2 from Lorenzo Bona  ---
*** Bug 91456 has been marked as a duplicate of this bug. ***



[Mesa-dev] [Bug 91456] Mesa won't compile with llvm 3.8

2015-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91456

Lorenzo Bona  changed:

What       |Removed    |Added
-----------+-----------+----------
Status     |NEW        |RESOLVED
Resolution |---        |DUPLICATE

--- Comment #1 from Lorenzo Bona  ---


*** This bug has been marked as a duplicate of bug 91468 ***



[Mesa-dev] [Bug 91468] LLVM 3.8(svn): llvm changes llvm-config output again?

2015-07-26 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91468

--- Comment #1 from Lorenzo Bona  ---
(In reply to Krzysztof A. Sobiecki from comment #0)
> Created attachment 117384 [details] [review]
> A patch?
> 
> LLVM 3.8(svn): the llvm-config-3.8 output for "--version" has "svn" in it, but
> the shared library filename doesn't. For reasons?

Thank you, with your patch I can build mesa again with llvm 3.8.

https://bugs.freedesktop.org/show_bug.cgi?id=91456



Re: [Mesa-dev] [PATCH v2 02/78] i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor code-paths

2015-07-26 Thread Eduardo Lima Mitev
On 07/23/2015 11:25 PM, Jason Ekstrand wrote:
> On Thu, Jul 23, 2015 at 3:16 AM, Eduardo Lima Mitev  wrote:
>> The NIR->vec4 pass will be activated if both the following conditions are met:
>>
>> * INTEL_USE_NIR environment variable is defined and is positive (1 or true)
>> * The stage is vertex shader (support for geometry shaders and
>>   ARB_vertex_program will be added later).
>> ---
>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 --
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 18 ++
>>  2 files changed, 22 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index 9d60543..cb04d8a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -122,12 +122,14 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
>> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = true;
>> compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = true;
>>
>> -   if (compiler->scalar_vs) {
>> -  /* If we're using the scalar backend for vertex shaders, we need to
>> -   * configure these accordingly.
>> -   */
>> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
>> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
>> +   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", false)) {
>> +  if (compiler->scalar_vs) {
>> + /* If we're using the scalar backend for vertex shaders, we need to
>> +  * configure these accordingly.
>> +  */
>> + compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
>> + compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
>> +  }
>>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = false;
>>
>>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions = nir_options;
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 53270fb..ce04f1b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1709,6 +1709,9 @@ vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
>>  bool
>>  vec4_visitor::run(gl_clip_plane *clip_planes)
>>  {
>> +   bool use_vec4_nir =
>> +  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions != NULL;
>> +
>> sanity_param_count = prog->Parameters->NumParameters;
>>
>> if (shader_time_index >= 0)
>> @@ -1718,11 +1721,18 @@ vec4_visitor::run(gl_clip_plane *clip_planes)
>>
>> emit_prolog();
>>
>> -   /* Generate VS IR for main().  (the visitor only descends into
>> -* functions called "main").
>> -*/
>> if (shader) {
>> -  visit_instructions(shader->base.ir);
>> +  if (use_vec4_nir) {
> 
> We could put the compiler_options[].NirOptions check here.  I don't
> care too much though.
> 

Yes, the extra variable is not necessary at this point, but later on in
the series we have the patch
http://lists.freedesktop.org/archives/mesa-dev/2015-July/089653.html
which uses the bool variable again in the same method. Hence, I think we
can leave it there if you don't feel strongly about it. Otherwise, we
can introduce the variable in the later commit where the condition is
needed again.

>> + assert(prog->nir != NULL);
>> + emit_nir_code();
>> + if (failed)
>> +return false;
>> +  } else {
>> + /* Generate VS IR for main().  (the visitor only descends into
>> +  * functions called "main").
>> +  */
>> + visit_instructions(shader->base.ir);
>> +  }
>> } else {
>>emit_program_code();
>> }
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

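The selection logic discussed in this patch can be sketched roughly as
follows. This is a simplified model, not the driver code: parse_boolean is a
hypothetical stand-in for brw_env_var_as_boolean (only "1"/"true" and
"0"/"false" are assumed to be recognised here), and vs_nir_enabled is an
illustrative name for the condition that decides whether NirOptions is set
for the vertex stage.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for brw_env_var_as_boolean(): interpret the
// value of an environment variable, falling back to a default when it
// is unset or not recognised.
bool parse_boolean(const char *str, bool default_value)
{
   if (str == nullptr)
      return default_value;

   const std::string s(str);
   if (s == "1" || s == "true")
      return true;
   if (s == "0" || s == "false")
      return false;

   return default_value;
}

// Mirrors the condition in the patch: NIR is enabled for the vertex
// stage if the scalar VS backend is in use or INTEL_USE_NIR is set.
// In vec4_visitor::run() the patch then derives use_vec4_nir from
// whether NirOptions is non-NULL.
bool vs_nir_enabled(bool scalar_vs, const char *intel_use_nir)
{
   return scalar_vs || parse_boolean(intel_use_nir, false);
}
```

A caller would pass std::getenv("INTEL_USE_NIR") as the second argument.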


Re: [Mesa-dev] [PATCH v2 09/78] i965/nir: Pass a is_scalar boolean to brw_create_nir()

2015-07-26 Thread Eduardo Lima Mitev
On 07/24/2015 12:31 AM, Jason Ekstrand wrote:
> On Thu, Jul 23, 2015 at 3:16 AM, Eduardo Lima Mitev  wrote:
>> The upcoming introduction of the NIR->vec4 pass will require that some NIR
>> lowering passes are enabled/disabled depending on the type of shader
>> (scalar vs. vector).
>>
>> With this patch we pass an 'is_scalar' variable to the process of
>> constructing the NIR, to let an external context decide how the shader
>> should be handled.
>> ---
>>  src/mesa/drivers/dri/i965/brw_nir.c  | 3 ++-
>>  src/mesa/drivers/dri/i965/brw_nir.h  | 3 ++-
>>  src/mesa/drivers/dri/i965/brw_program.c  | 6 --
>>  src/mesa/drivers/dri/i965/brw_shader.cpp | 6 --
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 2 +-
>>  5 files changed, 13 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
>> index 3e154c1..4aa893a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -61,7 +61,8 @@ nir_shader *
>>  brw_create_nir(struct brw_context *brw,
>> const struct gl_shader_program *shader_prog,
>> const struct gl_program *prog,
>> -   gl_shader_stage stage)
>> +   gl_shader_stage stage,
>> +   bool is_scalar)
>>  {
>> struct gl_context *ctx = &brw->ctx;
>> const nir_shader_compiler_options *options =
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.h b/src/mesa/drivers/dri/i965/brw_nir.h
>> index 3131109..c76defd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.h
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.h
>> @@ -77,7 +77,8 @@ void brw_nir_analyze_boolean_resolves(nir_shader *nir);
>>  nir_shader *brw_create_nir(struct brw_context *brw,
>> const struct gl_shader_program *shader_prog,
>> const struct gl_program *prog,
>> -   gl_shader_stage stage);
>> +   gl_shader_stage stage,
>> +   bool is_scalar);
>>
>>  #ifdef __cplusplus
>>  }
>> diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
>> index 85e271d..b913f27 100644
>> --- a/src/mesa/drivers/dri/i965/brw_program.c
>> +++ b/src/mesa/drivers/dri/i965/brw_program.c
>> @@ -143,7 +143,8 @@ brwProgramStringNotify(struct gl_context *ctx,
>>brw_add_texrect_params(prog);
>>
>>if (ctx->Const.ShaderCompilerOptions[MESA_SHADER_FRAGMENT].NirOptions) {
>> - prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_FRAGMENT);
>> + prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_FRAGMENT,
>> +true);
> 
> I don't think this needs to be on its own line.
> 

Ok, in general I have been strict with not going over 80 cols, but agree
it doesn't look good here.

>>}
>>
>>brw_fs_precompile(ctx, NULL, prog);
>> @@ -169,7 +170,8 @@ brwProgramStringNotify(struct gl_context *ctx,
>>brw_add_texrect_params(prog);
>>
>>if (ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions) {
>> - prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_VERTEX);
>> + prog->nir = brw_create_nir(brw, NULL, prog, MESA_SHADER_VERTEX,
>> +false);
> 
> Here too.
> 

Ok.

>>}
>>
>>brw_vs_precompile(ctx, NULL, prog);
>> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> index cb04d8a..34b040d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
>> @@ -397,8 +397,10 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
>>
>>brw_add_texrect_params(prog);
>>
>> -  if (options->NirOptions)
>> - prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage);
>> +  if (options->NirOptions) {
>> + prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage,
>> +is_scalar_shader_stage(brw, stage));
>> +  }
>>
>>_mesa_reference_program(ctx, &prog, NULL);
>> }
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index ce04f1b..8f29e50 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1919,7 +1919,7 @@ brw_vs_emit(struct brw_context *brw,
>>*/
>>   assert(vp->Base.Id == 0 && prog == NULL);
>>   vp->Base.nir =
>> -brw_create_nir(brw, NULL, &vp->Base, MESA_SHADER_VERTEX);
>> +brw_create_nir(brw, NULL, &vp->Base, MESA_SHADER_VERTEX, false);
>>}
>>
>>prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.

Re: [Mesa-dev] [PATCH v2 02/78] i965/nir/vec4: Select between new nir_vec4 or current vec4_visitor code-paths

2015-07-26 Thread Jason Ekstrand
On Jul 26, 2015 1:40 PM, "Eduardo Lima Mitev"  wrote:
>
> On 07/23/2015 11:25 PM, Jason Ekstrand wrote:
> > On Thu, Jul 23, 2015 at 3:16 AM, Eduardo Lima Mitev wrote:
> >> The NIR->vec4 pass will be activated if both the following conditions are met:
> >>
> >> * INTEL_USE_NIR environment variable is defined and is positive (1 or true)
> >> * The stage is vertex shader (support for geometry shaders and
> >>   ARB_vertex_program will be added later).
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 --
> >>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 18 ++
> >>  2 files changed, 22 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> index 9d60543..cb04d8a 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> @@ -122,12 +122,14 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
> >> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = true;
> >> compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = true;
> >>
> >> -   if (compiler->scalar_vs) {
> >> -  /* If we're using the scalar backend for vertex shaders, we need to
> >> -   * configure these accordingly.
> >> -   */
> >> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
> >> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
> >> +   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", false)) {
> >> +  if (compiler->scalar_vs) {
> >> + /* If we're using the scalar backend for vertex shaders, we need to
> >> +  * configure these accordingly.
> >> +  */
> >> + compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
> >> + compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
> >> +  }
> >>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = false;
> >>
> >>compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions = nir_options;
> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> index 53270fb..ce04f1b 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> >> @@ -1709,6 +1709,9 @@ vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
> >>  bool
> >>  vec4_visitor::run(gl_clip_plane *clip_planes)
> >>  {
> >> +   bool use_vec4_nir =
> >> +  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions != NULL;
> >> +
> >> sanity_param_count = prog->Parameters->NumParameters;
> >>
> >> if (shader_time_index >= 0)
> >> @@ -1718,11 +1721,18 @@ vec4_visitor::run(gl_clip_plane *clip_planes)
> >>
> >> emit_prolog();
> >>
> >> -   /* Generate VS IR for main().  (the visitor only descends into
> >> -* functions called "main").
> >> -*/
> >> if (shader) {
> >> -  visit_instructions(shader->base.ir);
> >> +  if (use_vec4_nir) {
> >
> > We could put the compiler_options[].NirOptions check here.  I don't
> > care too much though.
> >
>
> Yes, the extra variable is not necessary at this point, but later on in
> the series we have the patch
> http://lists.freedesktop.org/archives/mesa-dev/2015-July/089653.html
> which uses the bool variable again in the same method. Hence, I think we
> can leave it there if you don't feel strongly about it. Otherwise, we
> can introduce the variable in the later commit where the condition is
> needed again.

That's fine. Go ahead and leave it.

> >> + assert(prog->nir != NULL);
> >> + emit_nir_code();
> >> + if (failed)
> >> +return false;
> >> +  } else {
> >> + /* Generate VS IR for main().  (the visitor only descends into
> >> +  * functions called "main").
> >> +  */
> >> + visit_instructions(shader->base.ir);
> >> +  }
> >> } else {
> >>emit_program_code();
> >> }
> >> --
> >> 2.1.4
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>


Re: [Mesa-dev] [PATCH v2 20/78] nir-lower_io: Store data.location instead, in const_index[0] of store_output

2015-07-26 Thread Eduardo Lima Mitev
On 07/25/2015 12:04 AM, Jason Ekstrand wrote:
> I think we already agreed to just copy data.location into
> data.driver_location and we don't need this special-casing.
> 
> Just making a note of it as I review.
> --Jason
> 

Yes, I have completely removed this patch from the series. Now all the
agreed changes live in the patch that implements store_output intrinsic,
which now looks like this:

https://github.com/Igalia/mesa/commit/c7bfd3e8f0fa5540b4728f712e8441c254d029ba


> On Thu, Jul 23, 2015 at 3:17 AM, Eduardo Lima Mitev  wrote:
>> Non-scalar backends like i965's NIR-vec4 need the original variable's varying
>> value instead of the driver_location (due to the way URB file emission is
>> implemented). This patch stores the variable's location in const_index[0]
>> instead of the current driver_location value, which is not needed.
>> ---
>>  src/glsl/nir/nir_lower_io.c | 12 ++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
>> index ccc832b..71a925c 100644
>> --- a/src/glsl/nir/nir_lower_io.c
>> +++ b/src/glsl/nir/nir_lower_io.c
>> @@ -378,9 +378,17 @@ nir_lower_io_block(nir_block *block, void *void_state)
>>   nir_src indirect;
>>   unsigned offset = get_io_offset(intrin->variables[0],
>>   &intrin->instr, &indirect, state);
>> - offset += intrin->variables[0]->var->data.driver_location;
>>
>> - store->const_index[0] = offset;
>> + /* Some non-scalar backends (like i965's NIR-vec4) need the original
>> +  * variable's varying value instead of the driver_location.
>> +  */
>> + if (!state->is_scalar) {
>> +store->const_index[0] =
>> +   intrin->variables[0]->var->data.location + offset;
>> + } else {
>> +store->const_index[0] =
>> +   intrin->variables[0]->var->data.driver_location + offset;
>> + }
>>
>>   nir_src_copy(&store->src[0], &intrin->src[0], state->mem_ctx);
>>
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

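The difference between the special-casing in this (now dropped) patch and the
agreed approach of copying data.location into data.driver_location can be
sketched as below. This is a simplified model under stated assumptions:
var_data, assign_driver_locations and store_output_index are hypothetical
names, and the scalar case assumes one slot per variable, which the real
location-assignment pass does not.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for the relevant fields of a shader variable.
struct var_data {
   int location;              // varying slot chosen by the front end
   unsigned driver_location;  // slot the backend actually indexes with
};

// Agreed approach: for non-scalar backends, driver_location is simply
// a copy of location, so later passes need no special case.
// Assumption: scalar backends get densely packed slots, one each.
void assign_driver_locations(std::vector<var_data> &vars, bool is_scalar)
{
   unsigned next = 0;
   for (auto &v : vars) {
      if (is_scalar) {
         v.driver_location = next++;
      } else {
         assert(v.location >= 0);
         v.driver_location = static_cast<unsigned>(v.location);
      }
   }
}

// With the copy in place, the store_output index is computed the same
// way for both kinds of backend.
unsigned store_output_index(const var_data &v, unsigned offset)
{
   return v.driver_location + offset;
}
```

The design point is that the scalar/vec4 distinction is made once, when
locations are assigned, rather than in every consumer of the index.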


Re: [Mesa-dev] [PATCH v2 20/78] nir-lower_io: Store data.location instead, in const_index[0] of store_output

2015-07-26 Thread Jason Ekstrand
On Jul 26, 2015 2:09 PM, "Eduardo Lima Mitev"  wrote:
>
> On 07/25/2015 12:04 AM, Jason Ekstrand wrote:
> > I think we already agreed to just copy data.location into
> > data.driver_location and we don't need this special-casing.
> >
> > Just making a note of it as I review.
> > --Jason
> >
>
> Yes, I have completely removed this patch from the series. Now all the
> agreed changes live in the patch that implements store_output intrinsic,
> which now looks like this:
>
> https://github.com/Igalia/mesa/commit/c7bfd3e8f0fa5540b4728f712e8441c254d029ba

Those changes look fine.
>
> > On Thu, Jul 23, 2015 at 3:17 AM, Eduardo Lima Mitev wrote:
> >> Non-scalar backends like i965's NIR-vec4 need the original variable's varying
> >> value instead of the driver_location (due to the way URB file emission is
> >> implemented). This patch stores the variable's location in const_index[0]
> >> instead of the current driver_location value, which is not needed.
> >> ---
> >>  src/glsl/nir/nir_lower_io.c | 12 ++--
> >>  1 file changed, 10 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> >> index ccc832b..71a925c 100644
> >> --- a/src/glsl/nir/nir_lower_io.c
> >> +++ b/src/glsl/nir/nir_lower_io.c
> >> @@ -378,9 +378,17 @@ nir_lower_io_block(nir_block *block, void *void_state)
> >>   nir_src indirect;
> >>   unsigned offset = get_io_offset(intrin->variables[0],
> >>   &intrin->instr, &indirect, state);
> >> - offset += intrin->variables[0]->var->data.driver_location;
> >>
> >> - store->const_index[0] = offset;
> >> + /* Some non-scalar backends (like i965's NIR-vec4) need the original
> >> +  * variable's varying value instead of the driver_location.
> >> +  */
> >> + if (!state->is_scalar) {
> >> +store->const_index[0] =
> >> +   intrin->variables[0]->var->data.location + offset;
> >> + } else {
> >> +store->const_index[0] =
> >> +   intrin->variables[0]->var->data.driver_location + offset;
> >> + }
> >>
> >>   nir_src_copy(&store->src[0], &intrin->src[0], state->mem_ctx);
> >>
> >> --
> >> 2.1.4
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>


Re: [Mesa-dev] [PATCH v2 21/78] i965/nir/vec4: Implement store_output intrinsic

2015-07-26 Thread Eduardo Lima Mitev
On 07/25/2015 12:08 AM, Jason Ekstrand wrote:
> On Thu, Jul 23, 2015 at 3:17 AM, Eduardo Lima Mitev  wrote:
>> The destination register from the instruction is stored in the output_reg
>> variable at its original varying value. From there, vec4_visitor's
>> emit_urb_slot() will pick it up and continue the URB setup code, so that
>> part is shared.
>>
>> This implementation expects that const_index[0] of the intrinsic instruction
>> stores the shader variable location (not the calculated driver_location).
>>
>> The driver_location is not used at all so this patch also disables the
>> nir_assign_var_locations pass on non-scalar shaders.
> 
> These last two paragraphs are now stale.  Otherwise,
> 

Yes, I updated the commit log together with the patch.

> Reviewed-by: Jason Ekstrand 
> 

Thanks!

>> ---
>>  src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 +++--
>>  2 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
>> index b241121..1164984 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -106,13 +106,13 @@ brw_create_nir(struct brw_context *brw,
>>  &nir->num_direct_uniforms,
>>  &nir->num_uniforms,
>>  is_scalar);
>> +  nir_assign_var_locations(&nir->outputs, &nir->num_outputs, is_scalar);
>> } else {
>>nir_assign_var_locations(&nir->uniforms,
>> &nir->num_uniforms,
>> is_scalar);
>> }
>> nir_assign_var_locations(&nir->inputs, &nir->num_inputs, is_scalar);
>> -   nir_assign_var_locations(&nir->outputs, &nir->num_outputs, is_scalar);
>>
>> nir_lower_io(nir, is_scalar);
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index 4dd2194..740cc61 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -474,10 +474,27 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>> }
>>
>> case nir_intrinsic_store_output_indirect:
>> +  has_indirect = true;
>>/* fallthrough */
>> -   case nir_intrinsic_store_output:
>> -  /* @TODO: Not yet implemented */
>> +   case nir_intrinsic_store_output: {
>> +  /* Here we need the original variable's varying value, which is stored
>> +   * by nir_lower_io in const_index[0] of the store_output intrinsic
>> +   * instruction for non-scalar backends like we are.
>> +   */
>> +  int varying = instr->const_index[0];
>> +
>> +  src = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_F,
>> +instr->num_components);
>> +  dest = dst_reg(src);
>> +
>> +  if (has_indirect) {
>> + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr->src[1],
>> + BRW_REGISTER_TYPE_D,
>> + 1));
>> +  }
>> +  output_reg[varying] = dest;
>>break;
>> +   }
>>
>> case nir_intrinsic_load_vertex_id:
>>unreachable("should be lowered by lower_vertex_id()");
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

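The hand-off described in the commit message, where store_output records a
register under its varying slot for the URB code to pick up later, can be
modelled roughly as below. This is only an illustrative sketch: reg,
visitor_model, slot_written and the slot count of 64 are assumptions, not the
driver's actual types or limits.

```cpp
#include <array>
#include <cassert>

// Hypothetical, heavily reduced register description.
struct reg {
   int nr = -1;               // -1 means "nothing stored yet"
   bool has_reladdr = false;  // set for the indirect variant
};

constexpr int kNumVaryings = 64;  // assumption, for illustration only

struct visitor_model {
   std::array<reg, kNumVaryings> output_reg{};

   // Models the store_output case: remember which register holds the
   // value for this varying so a later URB-emission step can read it.
   void store_output(int varying, int src_nr, bool has_indirect)
   {
      assert(varying >= 0 && varying < kNumVaryings);
      reg dest;
      dest.nr = src_nr;
      dest.has_reladdr = has_indirect;
      output_reg[varying] = dest;
   }

   // What a later consumer would check before emitting a slot.
   bool slot_written(int varying) const
   {
      assert(varying >= 0 && varying < kNumVaryings);
      return output_reg[varying].nr >= 0;
   }
};
```

Indexing by the varying slot is what lets the existing URB setup code be
shared between the NIR path and the older visitor path.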


Re: [Mesa-dev] [PATCH v2 15/78] i965/nir/vec4: Add get_nir_dst() and get_nir_src() methods

2015-07-26 Thread Eduardo Lima Mitev
On 07/24/2015 11:53 PM, Jason Ekstrand wrote:
> On Fri, Jul 24, 2015 at 12:19 PM, Jason Ekstrand  wrote:
>> On Thu, Jul 23, 2015 at 3:16 AM, Eduardo Lima Mitev  wrote:
>>> These methods are essential for the implementation of the NIR->vec4 pass. They
>>> work similarly to their fs_nir counterparts.
>>>
>>> When processing instructions, these methods are invoked to resolve the
>>> brw registers (source or destination) corresponding to the NIR sources
>>> or destination. They use the map of NIR register index to brw register for
>>> all registers locally allocated in a block.
>>>
>>> Signed-off-by: Samuel Iglesias Gonsalvez 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4.h   | 10 
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 74 ++
>>>  2 files changed, 84 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
>>> index 83ac4c4..eb83dfc 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>>> @@ -408,6 +408,16 @@ public:
>>> virtual void nir_emit_jump(nir_jump_instr *instr);
>>> virtual void nir_emit_texture(nir_tex_instr *instr);
>>>
>>> +   dst_reg get_nir_dest(nir_dest dest, enum brw_reg_type type);
>>> +   dst_reg get_nir_dest(nir_dest dest, nir_alu_type type);
>>> +   dst_reg get_nir_dest(nir_dest dest);
>>> +   src_reg get_nir_src(nir_src src, enum brw_reg_type type,
>>> +   unsigned num_components = 4);
>>> +   src_reg get_nir_src(nir_src src, nir_alu_type type,
>>> +   unsigned num_components = 4);
>>> +   src_reg get_nir_src(nir_src src,
>>> +   unsigned num_components = 4);
>>> +
>>> virtual dst_reg *make_reg_for_system_value(int location,
>>>const glsl_type *type) = 0;
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 4733b60..3259290 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -331,6 +331,80 @@ vec4_visitor::nir_emit_instr(nir_instr *instr)
>>> }
>>>  }
>>>
>>> +static dst_reg
>>> +dst_reg_for_nir_reg(vec4_visitor *v, nir_register *nir_reg,
>>> +unsigned base_offset, nir_src *indirect)
>>> +{
>>> +   dst_reg reg;
>>> +
>>> +   reg = v->nir_locals[nir_reg->index];
>>> +   reg = offset(reg, base_offset);
>>> +   if (indirect) {
>>> +  reg.reladdr =
>>> + new(v->mem_ctx) src_reg(v->get_nir_src(*indirect,
>>> +BRW_REGISTER_TYPE_D,
>>> +1));
>>> +   }
>>> +   return reg;
>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest)
>>> +{
> 
> Also, you should add "assert(!dest.is_ssa);" here.
> 

Ok.

>>> +   return dst_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
>>> +  dest.reg.indirect);
>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest, enum brw_reg_type type)
>>> +{
>>> +   dst_reg reg = get_nir_dest(dest);
>>> +   return retype(reg, type);
>>
>> This could be one line.
>>

Ok.

>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest, nir_alu_type type)
>>> +{
>>> +   dst_reg reg = get_nir_dest(dest);
>>> +   return retype(reg, brw_type_for_nir_type(type));
>>
>> This could just call get_nir_dest(nir_dest, brw_reg_type)
>>

Oh yeah, that slipped.

>>> +}
>>> +
>>> +src_reg
>>> +vec4_visitor::get_nir_src(nir_src src, enum brw_reg_type type,
>>> +  unsigned num_components)
>>> +{
>>> +   dst_reg reg;
>>> +
>>> +   if (src.is_ssa) {
>>> +  assert(src.ssa != NULL);
>>> +  reg = nir_ssa_values[src.ssa->index];
>>> +   }
>>> +   else {
>>> + reg = dst_reg_for_nir_reg(this, src.reg.reg, src.reg.base_offset,
>>> +   src.reg.indirect);
>>> +   }
>>> +
>>> +   reg = retype(reg, type);
>>> +
>>> +   src_reg reg_as_src = src_reg(reg);
>>> +   reg_as_src.swizzle = brw_swizzle_for_size(num_components);
>>> +   return reg_as_src;
>>> +}
>>> +
>>> +src_reg
>>> +vec4_visitor::get_nir_src(nir_src src, nir_alu_type type,
>>> +  unsigned num_components)
>>> +{
>>> +   return get_nir_src(src, brw_type_for_nir_type(type), num_components);
>>> +}
>>> +
>>> +src_reg
>>> +vec4_visitor::get_nir_src(nir_src src, unsigned num_components)
>>> +{
>>> +   /* if type is not specified, default to signed int */
>>> +   return get_nir_src(src, nir_type_int, num_components);
>>> +}
>>> +
>>>  void
>>>  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>>>  {
>>> --
>>> 2.1.4
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-26 Thread Zoltán Gilián
> Why?  Your module::argument::image_size and ::image_format cases don't
> touch the explicit_arg iterator at all AFAICT, so it will be left
> pointing one past the last general semantic argument

Ok, my mistake, I didn't think this through.

> Hmmm...  So you only need it as padding?  Wouldn't it be possible for
> you to declare samplers to be 0 bytes?

Maybe it can be done by changing the type of the sampler arg from i32
to an empty struct. I'll have to try this, I don't know if it will
work.
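To illustrate the padding question, here is a rough sketch of how argument
byte offsets shift with the declared size of the sampler. It is a toy model
only: arg, arg_offsets and offset_of are hypothetical names, the sizes are
made up for the example, and alignment is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one kernel argument in the input buffer.
struct arg {
   std::string name;
   std::size_t size;  // bytes the binary expects for this argument
};

// Offsets are a running sum of the preceding sizes (no alignment here,
// which is a simplifying assumption).
std::vector<std::size_t> arg_offsets(const std::vector<arg> &args)
{
   std::vector<std::size_t> offsets;
   std::size_t cur = 0;
   for (const auto &a : args) {
      offsets.push_back(cur);
      cur += a.size;
   }
   return offsets;
}

std::size_t offset_of(const std::vector<arg> &args, std::size_t index)
{
   assert(index < args.size());
   return arg_offsets(args)[index];
}
```

With a 4-byte sampler between two 4-byte arguments, the third argument sits
at offset 8, so the API has to append 4 bytes for the sampler; if the sampler
were declared as 0 bytes, the third argument would sit at offset 4 and no
padding would be needed.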

On Sun, Jul 26, 2015 at 2:40 PM, Francisco Jerez  wrote:
> Zoltán Gilián  writes:
>
>>> auto img = dynamic_cast(**(explicit_arg - 1))
>>
>> Ok, so it should be (explicit_arg - 2) for the image format, I
>> presume.
>
> Why?  Your module::argument::image_size and ::image_format cases don't
> touch the explicit_arg iterator at all AFAICT, so it will be left
> pointing one past the last general semantic argument
>
>> This will be incorrect, however, if the targets that need implicit
>> argument for format metadata are indeed a strict superset of the ones
>> that need dimension, as you mentioned before. The targets that only
>> need format will break this code. Should I swap the order of the
>> format and dimension implicit args to make this approach work under
>> the aforementioned assumption?
>>
> Why would it be incorrect?  The kernel::_args vector explicit_arg
> iterates in contains explicit arguments only so it shouldn't make a
> difference in which order you emit them as long as you don't interleave
> other explicit arguments in between.
>
>>> It also seems like you've got rid of the static casts you had
>>> previously?
>>
>> That is a mistake, I'll fix it.
>>
>>> My expectation here was that the compiler would be able to hard-code
>>> sampler indices in the shader without the API passing any actual data in
>>> the input buffer.  Doesn't that work for you?
>>
>> Yes, that's correct, but this is also the case with images. If image
>> index is uploaded explicitly, I don't see why it can't be done with
>> sampler indices. But probably it's a better idea to send the sampler
>> value rather than the index, in case the kernel needs it (e.g. the
>> normalized or non-normalized nature of texture coordinates may have to
>> be specified in the fetch instruction itself, and not by hardware
>> registers), so I'll definitely change this.
>
> Some hardware (e.g. Intel's) will need an index (which can probably be
> hardcoded in the shader for now) into a table of sampler configurations
> which can be set-up later on by the driver at state-emit time.  In any
> case we can always add a new argument semantic enum later on for the
> target to select sampler config vs index if different drivers turn out
> to need different things.
>
>> But the bigger problem is, that the byte offsets of the kernel
>> arguments are computed considering the sampler argument too, so the
>> binary expects it to be present in the input vector. Furthermore I
>> can't erase the sampler argument from the IR, because it is needed to
>> make it possible for get_kernel_args to detect the sampler. But if
>> sampler_argument::bind doesn't append 4 bytes (clang compiles
>> sampler_t to i32) to the input vector, the binary will try to load the
>> following arguments from wrong locations.
>
> Hmmm...  So you only need it as padding?  Wouldn't it be possible for
> you to declare samplers to be 0 bytes?
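
To illustrate the offset problem discussed above, here is a minimal sketch in C. It is not clover code: `arg_desc`, `align_up` and `layout_args` are hypothetical names, and the packing rule (each argument aligned to its own size) is an assumption made only for the example. The sampler is given 4 bytes because, as noted above, clang compiles sampler_t to i32.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one kernel argument: only its size matters here. */
struct arg_desc {
   size_t size; /* bytes in the input buffer; a power of two, or 0 if nothing is uploaded */
};

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (offset + align - 1) & ~(align - 1);
}

/* Assumed packing rule: each argument is aligned to its own size.
 * Writes the byte offset of every argument and returns the total size. */
static size_t layout_args(const struct arg_desc *args, size_t n, size_t *offsets)
{
   size_t offset = 0;

   for (size_t i = 0; i < n; i++) {
      if (args[i].size != 0)
         offset = align_up(offset, args[i].size);
      offsets[i] = offset;
      offset += args[i].size;
   }
   return offset;
}
```

With sizes {4, 4, 4} (int, sampler, int) the last argument lands at byte 8; with {4, 0, 4} it lands at byte 4. The binary and the API side therefore have to agree on whether the sampler occupies space in the input vector.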
>
>>
>>> I don't think it's a good idea to use such obvious names for the
>>> implicit argument types. Couldn't they collide with some type declared
>>> by the user that happens to have the same name?  How about
>>> "__clover_image_size" or something similar?
>>
>> Indeed, that's a good point. I'd go with something like
>> "__opencl_image_*" or "__llvm_image_*", because these strings will be
>> added to llvm, and non-clover code may depend on them in the future.
>>
> Sounds good to me.
>
>>> (Identifiers starting with
>>> double underscore are reserved for the implementation at least on C and
>>> GLSL, not sure about OpenCL-C)
>>
>> I believe this is true for OpenCL C too, since it is an extension to
>> C99. Identifiers starting with double underscores are reserved in C99.
>>
> IIRC identifiers starting with double underscore and single underscore
> followed by some upper-case letter were reserved even in the original
> ANSI C spec.
>
>> On Sat, Jul 25, 2015 at 1:06 PM, Francisco Jerez  
>> wrote:
>>> Zoltan Gilian  writes:
>>>
 Read-only and write-only image arguments are recognized and
 distinguished.
 Attributes of the image arguments are passed to the kernel as implicit
 arguments.
 ---
  src/gallium/state_trackers/clover/core/kernel.cpp  |  46 ++-
  src/gallium/state_trackers/clover/core/kernel.hpp  |  13 +-
  src/gallium/state_trackers/clover/core/memory.cpp  |   2 +-
  src/gallium/state_trackers/clover/core/module.hpp  |   4 +-
  .../state_trackers/clover/llvm/invocation.cpp  | 147 
 -
  

[Mesa-dev] [PATCH 5/6] radeonsi: add GS multiple streams support

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This is the final piece for ARB_gpu_shader5.

The code is based on the r600 code from Glenn Kennard
and myself.

While developing this, I'm not 100% sure of all the calculations
made in the GS registers, which is why max_stream is worked
out there and used to limit the changes in registers. Otherwise
my initial attempts regressed either the GS texelFetch tests
or primitive-id-restart. The current code has no regressions
in piglit.

This commit doesn't enable ARB_gpu_shader5, since that just
bumps the glsl level to 4.00, so I'll just do a separate patch
for 4.10.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/radeonsi/si_descriptors.c   |  4 +-
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 +-
 src/gallium/drivers/radeonsi/si_shader.c| 59 ---
 src/gallium/drivers/radeonsi/si_state.c |  4 --
 src/gallium/drivers/radeonsi/si_state.h |  8 ++-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 75 +++--
 6 files changed, 120 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 2e2a35b..14bb6e1 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -724,7 +724,7 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint 
shader, uint slot,
struct pipe_resource *buffer,
unsigned stride, unsigned num_records,
bool add_tid, bool swizzle,
-   unsigned element_size, unsigned index_stride)
+   unsigned element_size, unsigned index_stride, uint64_t 
offset)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct si_buffer_resources *buffers = &sctx->rw_buffers[shader];
@@ -741,7 +741,7 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint 
shader, uint slot,
if (buffer) {
uint64_t va;
 
-   va = r600_resource(buffer)->gpu_address;
+   va = r600_resource(buffer)->gpu_address + offset;
 
switch (element_size) {
default:
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index ebe1f5a..e84fe7a 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -316,7 +316,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
return 4095;
case PIPE_CAP_MAX_VERTEX_STREAMS:
-   return 1;
+   return 4;
 
case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE:
return 2048;
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index fa31f73..b472fa6 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -31,6 +31,7 @@
 #include "gallivm/lp_bld_intr.h"
 #include "gallivm/lp_bld_logic.h"
 #include "gallivm/lp_bld_arit.h"
+#include "gallivm/lp_bld_bitarit.h"
 #include "gallivm/lp_bld_flow.h"
 #include "radeon/r600_cs.h"
 #include "radeon/radeon_llvm.h"
@@ -1576,6 +1577,8 @@ static void si_llvm_emit_streamout(struct 
si_shader_context *shader,
LLVMValueRef can_emit =
LLVMBuildICmp(builder, LLVMIntULT, tid, so_vtx_count, "");
 
+   LLVMValueRef stream_id =
+ unpack_param(shader, shader->param_streamout_config, 24, 2);
/* Emit the streamout code conditionally. This actually avoids
 * out-of-bounds buffer access. The hw tells us via the SGPR
 * (so_vtx_count) which threads are allowed to emit streamout data. */
@@ -1615,8 +1618,9 @@ static void si_llvm_emit_streamout(struct 
si_shader_context *shader,
unsigned reg = so->output[i].register_index;
unsigned start = so->output[i].start_component;
unsigned num_comps = so->output[i].num_components;
+   unsigned stream = so->output[i].stream;
LLVMValueRef out[4];
-
+   struct lp_build_if_state if_ctx_stream;
assert(num_comps && num_comps <= 4);
if (!num_comps || num_comps > 4)
continue;
@@ -1649,11 +1653,15 @@ static void si_llvm_emit_streamout(struct 
si_shader_context *shader,
break;
}
 
+   LLVMValueRef can_emit_stream =
+ LLVMBuildICmp(builder, LLVMIntEQ, stream_id, 
lp_build_const_int32(gallivm, stream), "");
+   lp_build_if(&if_ctx_stream, gallivm, can_emit_stream);
build_tbuffer_store_dwords(shader, 
shader->so_buffers[buf_idx],
   vdata, num_comps,
   

[Mesa-dev] [PATCH 4/6] radeon: add support for streams to the common streamout code.

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This adds support for multiple streams to the common
radeon streamout code.

It updates radeonsi/r600 to set up the enabled mask.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/r600/r600_shader.c  |  7 +++
 src/gallium/drivers/r600/r600_shader.h  |  1 +
 src/gallium/drivers/r600/r600_state_common.c|  2 ++
 src/gallium/drivers/radeon/r600_pipe_common.h   |  1 +
 src/gallium/drivers/radeon/r600_streamout.c | 25 ++---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 17 ++---
 6 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 1a72bf6..dda38f6 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -310,6 +310,7 @@ struct r600_shader_ctx {
int gs_next_vertex;
struct r600_shader  *gs_for_vs;
int gs_export_gpr_treg;
+   unsignedenabled_stream_buffers_mask;
 };
 
 struct r600_shader_tgsi_instruction {
@@ -1402,6 +1403,9 @@ static int emit_streamout(struct r600_shader_ctx *ctx, 
struct pipe_stream_output
 * with MEM_STREAM instructions */
output.array_size = 0xFFF;
output.comp_mask = ((1 << so->output[i].num_components) - 1) << 
so->output[i].start_component;
+
+   ctx->enabled_stream_buffers_mask |= (1 << 
so->output[i].output_buffer);
+
if (ctx->bc->chip_class >= EVERGREEN) {
switch (so->output[i].output_buffer) {
case 0:
@@ -1718,6 +1722,8 @@ static int generate_gs_copy_shader(struct r600_context 
*rctx,
gs->gs_copy_shader = cshader;
 
ctx.bc->nstack = 1;
+
+   cshader->enabled_stream_buffers_mask = ctx.enabled_stream_buffers_mask;
cshader->shader.ring_item_size = ocnt * 16;
 
return r600_bytecode_build(ctx.bc);
@@ -2261,6 +2267,7 @@ static int r600_shader_from_tgsi(struct r600_context 
*rctx,
so.num_outputs && !use_llvm)
emit_streamout(&ctx, &so);
 
+   pipeshader->enabled_stream_buffers_mask = 
ctx.enabled_stream_buffers_mask;
convert_edgeflag_to_int(&ctx);
 
if (ring_outputs) {
diff --git a/src/gallium/drivers/r600/r600_shader.h 
b/src/gallium/drivers/r600/r600_shader.h
index dd359d7..5d05c81 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -125,6 +125,7 @@ struct r600_pipe_shader {
struct r600_shader_key  key;
unsigneddb_shader_control;
unsignedps_depth_export;
+   unsignedenabled_stream_buffers_mask;
 };
 
 /* return the table index 0-5 for TGSI_INTERPOLATE_LINEAR/PERSPECTIVE and
diff --git a/src/gallium/drivers/r600/r600_state_common.c 
b/src/gallium/drivers/r600/r600_state_common.c
index 0c78b50..455e59a 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1208,6 +1208,7 @@ static bool r600_update_derived_state(struct r600_context 
*rctx)
rctx->clip_misc_state.clip_disable = 
rctx->gs_shader->current->shader.vs_position_window_space;
rctx->clip_misc_state.atom.dirty = true;
}
+   rctx->b.streamout.enabled_stream_buffers_mask = 
rctx->gs_shader->current->gs_copy_shader->enabled_stream_buffers_mask;
}
 
r600_shader_select(ctx, rctx->vs_shader, &vs_dirty);
@@ -1242,6 +1243,7 @@ static bool r600_update_derived_state(struct r600_context 
*rctx)
rctx->clip_misc_state.clip_disable = 
rctx->vs_shader->current->shader.vs_position_window_space;
rctx->clip_misc_state.atom.dirty = true;
}
+   rctx->b.streamout.enabled_stream_buffers_mask = 
rctx->vs_shader->current->enabled_stream_buffers_mask;
}
}
 
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index d225f25..16613af 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -328,6 +328,7 @@ struct r600_streamout {
/* External state which comes from the vertex shader,
 * it must be set explicitly when binding a shader. */
unsigned*stride_in_dw;
+   unsignedenabled_stream_buffers_mask; /* stream0 
buffers0-3 in 4 LSB */
 
/* The state of VGT_STRMOUT_(CONFIG|EN). */
struct r600_atomenable_atom;
diff --git a/src/gallium/drivers/radeon/r600_streamout.c 
b/src/gallium/drivers/radeon/r600_streamout.c
index 0688397..520c71e 100644
--

[Mesa-dev] [PATCH 2/6] radeon: add streamout status 1-3 queries.

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This adds support for queries against the non-0 vertex streams.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/radeon/r600_query.c   | 18 --
 src/gallium/drivers/radeon/r600d_common.h |  3 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index a1d8241..f8072ea 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -54,6 +54,8 @@ struct r600_query {
uint64_t end_result;
/* Fence for GPU_FINISHED. */
struct pipe_fence_handle *fence;
+   /* For transform feedback: which stream the query is for */
+   unsigned stream;
 };
 
 
@@ -155,6 +157,17 @@ static void r600_update_occlusion_query_state(struct 
r600_common_context *rctx,
}
 }
 
+static unsigned event_type_for_stream(struct r600_query *query)
+{
+   switch (query->stream) {
+   default:
+   case 0: return EVENT_TYPE_SAMPLE_STREAMOUTSTATS;
+   case 1: return EVENT_TYPE_SAMPLE_STREAMOUTSTATS1;
+   case 2: return EVENT_TYPE_SAMPLE_STREAMOUTSTATS2;
+   case 3: return EVENT_TYPE_SAMPLE_STREAMOUTSTATS3;
+   }
+}
+
 static void r600_emit_query_begin(struct r600_common_context *ctx, struct 
r600_query *query)
 {
struct radeon_winsys_cs *cs = ctx->rings.gfx.cs;
@@ -189,7 +202,7 @@ static void r600_emit_query_begin(struct 
r600_common_context *ctx, struct r600_q
case PIPE_QUERY_SO_STATISTICS:
case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
-   radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_SAMPLE_STREAMOUTSTATS) | 
EVENT_INDEX(3));
+   radeon_emit(cs, EVENT_TYPE(event_type_for_stream(query)) | 
EVENT_INDEX(3));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
break;
@@ -246,7 +259,7 @@ static void r600_emit_query_end(struct r600_common_context 
*ctx, struct r600_que
case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
va += query->buffer.results_end + query->result_size/2;
radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 2, 0));
-   radeon_emit(cs, EVENT_TYPE(EVENT_TYPE_SAMPLE_STREAMOUTSTATS) | 
EVENT_INDEX(3));
+   radeon_emit(cs, EVENT_TYPE(event_type_for_stream(query)) | 
EVENT_INDEX(3));
radeon_emit(cs, va);
radeon_emit(cs, (va >> 32UL) & 0xFF);
break;
@@ -367,6 +380,7 @@ static struct pipe_query *r600_create_query(struct 
pipe_context *ctx, unsigned q
/* NumPrimitivesWritten, PrimitiveStorageNeeded. */
query->result_size = 32;
query->num_cs_dw = 6;
+   query->stream = index;
break;
case PIPE_QUERY_PIPELINE_STATISTICS:
/* 11 values on EG, 8 on R600. */
diff --git a/src/gallium/drivers/radeon/r600d_common.h 
b/src/gallium/drivers/radeon/r600d_common.h
index 74c8d87..5a56a54 100644
--- a/src/gallium/drivers/radeon/r600d_common.h
+++ b/src/gallium/drivers/radeon/r600d_common.h
@@ -66,6 +66,9 @@
 #define PKT3_SET_SH_REG0x76 /* SI and later */
 #define PKT3_SET_UCONFIG_REG   0x79 /* CIK and later */
 
+#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS1  0x1 /* EG and later */
+#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS2  0x2 /* EG and later */
+#define EVENT_TYPE_SAMPLE_STREAMOUTSTATS3  0x3 /* EG and later */
 #define EVENT_TYPE_PS_PARTIAL_FLUSH0x10
 #define EVENT_TYPE_CACHE_FLUSH_AND_INV_TS_EVENT 0x14
 #define EVENT_TYPE_ZPASS_DONE  0x15
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] radeonsi: add support for interpolateAt functions (v2)

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This is part of ARB_gpu_shader5, and this passes
all the piglit tests currently available.

v2: use macros from the fine derivs commit.
add comments.
Signed-off-by: Dave Airlie 
---
 docs/GL3.txt |   2 +-
 src/gallium/drivers/radeonsi/si_shader.c | 241 ++-
 2 files changed, 241 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 15bb57f..258a6fb 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -107,7 +107,7 @@ GL 4.0, GLSL 4.00:
   - Geometry shader instancing DONE (r600, radeonsi, 
llvmpipe, softpipe)
   - Geometry shader multiple streams   DONE ()
   - Enhanced per-sample shadingDONE (r600, radeonsi)
-  - Interpolation functionsDONE (r600)
+  - Interpolation functionsDONE (r600, radeonsi)
   - New overload resolution rules  DONE
   GL_ARB_gpu_shader_fp64   DONE (nvc0, radeonsi, 
llvmpipe, softpipe)
   GL_ARB_sample_shadingDONE (i965, nv50, nvc0, 
r600, radeonsi)
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 27b3c72..fa31f73 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2960,6 +2960,234 @@ static void si_llvm_emit_ddxy(
emit_data->output[0] = lp_build_gather_values(gallivm, result, 4);
 }
 
+/*
+ * this takes an I,J coordinate pair,
+ * and works out the X and Y derivatives.
+ * it returns DDX(I), DDX(J), DDY(I), DDY(J).
+ */
+static LLVMValueRef si_llvm_emit_ddxy_interp(
+   struct lp_build_tgsi_context *bld_base,
+   LLVMValueRef interp_ij)
+{
+   struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
+   struct gallivm_state *gallivm = bld_base->base.gallivm;
+   struct lp_build_context *base = &bld_base->base;
+   LLVMValueRef indices[2];
+   LLVMValueRef store_ptr, load_ptr_x, load_ptr_y, load_ptr_ddx, 
load_ptr_ddy, temp, temp2;
+   LLVMValueRef tl, tr, bl, result[4];
+   LLVMTypeRef i32;
+   unsigned c;
+
+   i32 = LLVMInt32TypeInContext(gallivm->context);
+
+   indices[0] = bld_base->uint_bld.zero;
+   indices[1] = build_intrinsic(gallivm->builder, "llvm.SI.tid", i32,
+NULL, 0, LLVMReadNoneAttribute);
+   store_ptr = LLVMBuildGEP(gallivm->builder, si_shader_ctx->lds,
+indices, 2, "");
+
+   temp = LLVMBuildAnd(gallivm->builder, indices[1],
+   lp_build_const_int32(gallivm, TID_MASK_LEFT), "");
+
+   temp2 = LLVMBuildAnd(gallivm->builder, indices[1],
+lp_build_const_int32(gallivm, TID_MASK_TOP), "");
+
+   indices[1] = temp;
+   load_ptr_x = LLVMBuildGEP(gallivm->builder, si_shader_ctx->lds,
+ indices, 2, "");
+
+   indices[1] = temp2;
+   load_ptr_y = LLVMBuildGEP(gallivm->builder, si_shader_ctx->lds,
+ indices, 2, "");
+
+   indices[1] = LLVMBuildAdd(gallivm->builder, temp,
+ lp_build_const_int32(gallivm, 1), "");
+   load_ptr_ddx = LLVMBuildGEP(gallivm->builder, si_shader_ctx->lds,
+  indices, 2, "");
+
+   indices[1] = LLVMBuildAdd(gallivm->builder, temp2,
+ lp_build_const_int32(gallivm, 2), "");
+   load_ptr_ddy = LLVMBuildGEP(gallivm->builder, si_shader_ctx->lds,
+  indices, 2, "");
+
+   for (c = 0; c < 2; ++c) {
+   LLVMValueRef store_val;
+   LLVMValueRef c_ll = lp_build_const_int32(gallivm, c);
+
+   store_val = LLVMBuildExtractElement(gallivm->builder,
+   interp_ij, c_ll, "");
+   LLVMBuildStore(gallivm->builder,
+  store_val,
+  store_ptr);
+
+   tl = LLVMBuildLoad(gallivm->builder, load_ptr_x, "");
+   tl = LLVMBuildBitCast(gallivm->builder, tl, base->elem_type, 
"");
+
+   tr = LLVMBuildLoad(gallivm->builder, load_ptr_ddx, "");
+   tr = LLVMBuildBitCast(gallivm->builder, tr, base->elem_type, 
"");
+
+   result[c] = LLVMBuildFSub(gallivm->builder, tr, tl, "");
+
+   tl = LLVMBuildLoad(gallivm->builder, load_ptr_y, "");
+   tl = LLVMBuildBitCast(gallivm->builder, tl, base->elem_type, 
"");
+
+   bl = LLVMBuildLoad(gallivm->builder, load_ptr_ddy, "");
+   bl = LLVMBuildBitCast(gallivm->builder, bl, base->elem_type, 
"");
+
+   result[c + 2] = LLVMBuildFSub(gallivm->builder, bl, tl, "");
+   }
+
+   return lp_build_gather_values(gallivm, result, 4);
+}
+
+static void interp_f

[Mesa-dev] radeonsi, last bits of ARB_gpu_shader5 and GL 4.1 enable

2015-07-26 Thread Dave Airlie
Pretty much what it says, the multiple stream work and repost
of interpolateAt bits, with final patch to turn stuff on.

Dave.



[Mesa-dev] [PATCH 6/6] radeonsi: enable GL4.1 and update documentation

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This enables GL4.1 for radeonsi, and updates the
docs in the correct places.

Signed-off-by: Dave Airlie 
---
 docs/GL3.txt   | 16 
 docs/relnotes/10.7.0.html  |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c |  2 +-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 258a6fb..eb9ed18 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -96,18 +96,18 @@ GL 4.0, GLSL 4.00:
 
   GL_ARB_draw_buffers_blendDONE (i965, nv50, nvc0, 
r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_draw_indirect DONE (i965, nvc0, r600, 
radeonsi, llvmpipe, softpipe)
-  GL_ARB_gpu_shader5   DONE (i965, nvc0)
+  GL_ARB_gpu_shader5   DONE (i965, nvc0, 
radeonsi)
   - 'precise' qualifierDONE
-  - Dynamically uniform sampler array indices  DONE (r600, radeonsi, 
softpipe)
-  - Dynamically uniform UBO array indices  DONE (r600, radeonsi)
+  - Dynamically uniform sampler array indices  DONE (r600, softpipe)
+  - Dynamically uniform UBO array indices  DONE (r600)
   - Implicit signed -> unsigned conversionsDONE
   - Fused multiply-add DONE ()
-  - Packing/bitfield/conversion functions  DONE (r600, radeonsi, 
softpipe)
-  - Enhanced textureGather DONE (r600, radeonsi, 
softpipe)
-  - Geometry shader instancing DONE (r600, radeonsi, 
llvmpipe, softpipe)
+  - Packing/bitfield/conversion functions  DONE (r600, softpipe)
+  - Enhanced textureGather DONE (r600, softpipe)
+  - Geometry shader instancing DONE (r600, llvmpipe, 
softpipe)
   - Geometry shader multiple streams   DONE ()
-  - Enhanced per-sample shadingDONE (r600, radeonsi)
-  - Interpolation functionsDONE (r600, radeonsi)
+  - Enhanced per-sample shadingDONE (r600)
+  - Interpolation functionsDONE (r600)
   - New overload resolution rules  DONE
   GL_ARB_gpu_shader_fp64   DONE (nvc0, radeonsi, 
llvmpipe, softpipe)
   GL_ARB_sample_shadingDONE (i965, nv50, nvc0, 
r600, radeonsi)
diff --git a/docs/relnotes/10.7.0.html b/docs/relnotes/10.7.0.html
index afef525..04d26a4 100644
--- a/docs/relnotes/10.7.0.html
+++ b/docs/relnotes/10.7.0.html
@@ -49,6 +49,7 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_fragment_layer_viewport on radeonsi
 GL_ARB_framebuffer_no_attachments on i965
 GL_ARB_get_texture_sub_image for all drivers
+GL_ARB_gpu_shader5 on radeonsi
 GL_ARB_gpu_shader_fp64 on llvmpipe, radeonsi
 GL_ARB_shader_stencil_export on llvmpipe
 GL_ARB_shader_subroutine on core profile all drivers
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index e84fe7a..82e2abe 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -271,7 +271,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
return 4;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
-   return 330;
+   return 410;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->b.info.vram_size, 0x);
-- 
2.4.3



[Mesa-dev] [PATCH 3/6] radeon: move streamout buffer config to streamout enable function.

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This will be used here later.

Signed-off-by: Dave Airlie 
---
 src/gallium/drivers/radeon/r600_streamout.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_streamout.c 
b/src/gallium/drivers/radeon/r600_streamout.c
index bc8bf97..0688397 100644
--- a/src/gallium/drivers/radeon/r600_streamout.c
+++ b/src/gallium/drivers/radeon/r600_streamout.c
@@ -192,11 +192,6 @@ static void r600_emit_streamout_begin(struct 
r600_common_context *rctx, struct r
 
r600_flush_vgt_streamout(rctx);
 
-   r600_write_context_reg(cs, rctx->chip_class >= EVERGREEN ?
-  R_028B98_VGT_STRMOUT_BUFFER_CONFIG :
-  R_028B20_VGT_STRMOUT_BUFFER_EN,
-  rctx->streamout.enabled_mask);
-
for (i = 0; i < rctx->streamout.num_targets; i++) {
if (!t[i])
continue;
@@ -326,6 +321,11 @@ static bool r600_get_strmout_en(struct r600_common_context 
*rctx)
 static void r600_emit_streamout_enable(struct r600_common_context *rctx,
   struct r600_atom *atom)
 {
+   r600_write_context_reg(rctx->rings.gfx.cs, rctx->chip_class >= 
EVERGREEN ?
+  R_028B98_VGT_STRMOUT_BUFFER_CONFIG :
+  R_028B20_VGT_STRMOUT_BUFFER_EN,
+  rctx->streamout.enabled_mask);
+
r600_write_context_reg(rctx->rings.gfx.cs,
   rctx->chip_class >= EVERGREEN ?
   R_028B94_VGT_STRMOUT_CONFIG :
-- 
2.4.3



Re: [Mesa-dev] [PATCH 6/6] radeonsi: enable GL4.1 and update documentation

2015-07-26 Thread Dave Airlie
On 27 July 2015 at 11:50, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This enables GL4.1 for radeonsi, and updates the
> docs in the correct places.
>
Self-review suggests this should probably be gated on LLVM 3.7.

Dave.
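
One way such gating could look, as a minimal sketch in C rather than the actual follow-up patch: `LLVM_VERSION_HEX` and `glsl_feature_level` are hypothetical names, and the version encoding (major in the high byte, minor in the low byte) is an assumption made for the example.

```c
/* Hypothetical encoding: major in the high byte, minor in the low byte,
 * so LLVM 3.7 becomes 0x0307. */
#define LLVM_VERSION_HEX(major, minor) (((major) << 8) | (minor))

/* Report GLSL 4.10 only when the LLVM version is at least 3.7;
 * otherwise keep the previous level of 3.30. */
static int glsl_feature_level(unsigned llvm_version)
{
   return llvm_version >= LLVM_VERSION_HEX(3, 7) ? 410 : 330;
}
```

In a driver the version would come from the build configuration, not from a function argument; the argument is used here only to keep the sketch self-contained.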


[Mesa-dev] [PATCH] mesa/arb_gpu_shader_fp64: add support for glGetUniformdv

2015-07-26 Thread Dave Airlie
From: Dave Airlie 

This was missed when I did fp64. I've also sent a piglit test to cover
the case.

Signed-off-by: Dave Airlie 
---
 src/mesa/main/uniform_query.cpp | 15 +++
 src/mesa/main/uniforms.c|  9 -
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index b5a94e9..5fb903a 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -319,24 +319,31 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, 
GLint location,
 
   return;
}
+   if ((uni->type->base_type == GLSL_TYPE_DOUBLE &&
+returnType != GLSL_TYPE_DOUBLE) ||
+   (uni->type->base_type != GLSL_TYPE_DOUBLE &&
+returnType == GLSL_TYPE_DOUBLE)) {
+_mesa_error( ctx, GL_INVALID_OPERATION,
+"glGetnUniform*vARB(incompatible uniform types)");
+   return;
+   }
 
{
   unsigned elements = (uni->type->is_sampler())
 ? 1 : uni->type->components();
+  const int dmul = uni->type->base_type == GLSL_TYPE_DOUBLE ? 2 : 1;
 
   /* Calculate the source base address *BEFORE* modifying elements to
* account for the size of the user's buffer.
*/
   const union gl_constant_value *const src =
-&uni->storage[offset * elements];
+&uni->storage[offset * elements * dmul];
 
-  assert(returnType == GLSL_TYPE_FLOAT || returnType == GLSL_TYPE_INT ||
- returnType == GLSL_TYPE_UINT);
   /* The three (currently) supported types all have the same size,
* which is of course the same as their union. That'll change
* with glGetUniformdv()...
*/
-  unsigned bytes = sizeof(src[0]) * elements;
+  unsigned bytes = sizeof(src[0]) * elements * dmul;
   if (bufSize < 0 || bytes > (unsigned) bufSize) {
 _mesa_error( ctx, GL_INVALID_OPERATION,
 "glGetnUniform*vARB(out of bounds: bufSize is %d,"
diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index 6ba746e..13bec88 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -888,16 +888,7 @@ _mesa_GetnUniformdvARB(GLuint program, GLint location,
 {
GET_CURRENT_CONTEXT(ctx);
 
-   (void) program;
-   (void) location;
-   (void) bufSize;
-   (void) params;
-
-   /*
_mesa_get_uniform(ctx, program, location, bufSize, GLSL_TYPE_DOUBLE, 
params);
-   */
-   _mesa_error(ctx, GL_INVALID_OPERATION, "glGetUniformdvARB"
-   "(GL_ARB_gpu_shader_fp64 not implemented)");
 }
 
 void GLAPIENTRY
-- 
2.4.3
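
The sizing change in the patch above can be sketched in isolation. This is a simplified illustration in C, not Mesa code: `slot32`, `uniform_bytes` and `buffer_is_large_enough` are hypothetical names, and `slot32` only stands in for the 32-bit storage slot that the patch indexes through `uni->storage`. The point is that a double occupies two slots, so the byte count doubles.

```c
#include <stddef.h>

/* Stand-in for one 32-bit uniform storage slot. */
union slot32 {
   float f;
   int i;
   unsigned u;
};

/* Bytes needed to return a uniform: one slot per component,
 * two slots per component for double-precision types. */
static size_t uniform_bytes(unsigned components, int is_double)
{
   const unsigned dmul = is_double ? 2 : 1;

   return sizeof(union slot32) * components * dmul;
}

/* Same shape as the bounds check in the patch: a negative or
 * too-small user buffer is rejected. */
static int buffer_is_large_enough(int buf_size, unsigned components, int is_double)
{
   return buf_size >= 0 &&
          uniform_bytes(components, is_double) <= (size_t)buf_size;
}
```

Under these assumptions a dvec4 needs twice the buffer of a vec4, which is why both the source offset and the byte count in the patch are multiplied by dmul.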



Re: [Mesa-dev] [PATCH v2] glsl: enable conservative depth, ssbo based on GLSL version

2015-07-26 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

Sam

On 25/07/15 07:06, Ilia Mirkin wrote:
> Add in missed version checks in the GLSL parser
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> v1 -> v2: drop AoA hunks to avoid conflicting with Timothy's changes
> 
>  src/glsl/glsl_parser.yy | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index 4cce5b8..2b0c8bd 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -1166,7 +1166,8 @@ layout_qualifier_id:
>/* Layout qualifiers for AMD/ARB_conservative_depth. */
>if (!$$.flags.i &&
>(state->AMD_conservative_depth_enable ||
> -   state->ARB_conservative_depth_enable)) {
> +   state->ARB_conservative_depth_enable ||
> +   state->is_version(420, 0))) {
>   if (match_layout_qualifier($1, "depth_any", state) == 0) {
>  $$.flags.q.depth_any = 1;
>   } else if (match_layout_qualifier($1, "depth_greater", state) == 0) 
> {
> @@ -1460,7 +1461,7 @@ layout_qualifier_id:
>  
>if ((state->has_420pack() ||
> state->has_atomic_counters() ||
> -   state->ARB_shader_storage_buffer_object_enable) &&
> +   state->has_shader_storage_buffer_objects()) &&
>match_layout_qualifier("binding", $1, state) == 0) {
>   $$.flags.q.explicit_binding = 1;
>   $$.binding = $3;
> 


Re: [Mesa-dev] [PATCH 4/5] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-26 Thread Iago Toral
On Fri, 2015-07-24 at 16:18 +0300, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > When we have code such as this:
> >
> > mov vgrf1.0.x:F, vgrf2.:F
> > mov vgrf3.0.x:F, vgrf1.:F
> > ...
> > mov vgrf3.0.x:F, vgrf1.:F
> >
> > And vgrf1 is chosen for spilling, we can emit this:
> >
> > mov vgrf1.0.x:F, vgrf2.:F
> > gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D
> > mov vgrf3.0.x:F, vgrf1.:F
> > ...
> > gen4_scratch_read vgrf4.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf4.:F
> >
> > Instead of this:
> >
> > mov vgrf1.0.x:F, vgrf2.:F
> > gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D
> > gen4_scratch_read vgrf4.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf4.:F
> > ...
> > gen4_scratch_read vgrf5.0.x:F, 22D
> > mov vgrf3.0.x:F, vgrf5.:F
> >
> > And save one scratch read while still preserving the benefits of
> > spilling the register.
> 
> This sounds reasonable to me in principle.  I guess that there is in
> general a trade-off between the number of spills/fills you omit and the
> number of interference edges you eliminate.  It may also be worth
> checking whether you can extend the same principle to cache the value of
> the variable in a GRF until the next instruction regardless of whether
> it was written or read (e.g. so you don't unspill the same register in
> two adjacent instructions).

That makes sense, I'll send a v2 with that change.

> In either case it seems like the overall cost of spilling a register
> would be decreased in cases where this heuristic can be applied, would
> it make sense to update the cost metric accordingly?

Yeah, I guess so. I'll do that too.

> One more comment inline.
> 
> > ---
> >  .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 39 
> > +-
> >  1 file changed, 38 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > index 80ab813..5fed2f9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > @@ -334,6 +334,18 @@ vec4_visitor::choose_spill_reg(struct ra_graph *g)
> > return ra_get_best_spill_node(g);
> >  }
> >  
> > +static bool
> > +writemask_matches_swizzle(unsigned writemask, unsigned swizzle)
> > +{
> > +   for (int i = 0; i < 4; i++) {
> > +  unsigned channel = 1 << BRW_GET_SWZ(swizzle, i);
> > +  if (!(writemask & channel))
> > + return false;
> > +   }
> > +
> > +   return true;
> > +}
> > +
> >  void
> >  vec4_visitor::spill_reg(int spill_reg_nr)
> >  {
> > @@ -341,11 +353,33 @@ vec4_visitor::spill_reg(int spill_reg_nr)
> > unsigned int spill_offset = last_scratch++;
> >  
> > /* Generate spill/unspill instructions for the objects being spilled. */
> > +   vec4_instruction *spill_write_inst = NULL;
> > foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
> > +  /* We don't spill registers used for scratch */
> > +  if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ ||
> > +  inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE)
> > + continue;
> > +
> >int scratch_reg = -1;
> >for (unsigned int i = 0; i < 3; i++) {
> >   if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) 
> > {
> > -if (scratch_reg == -1) {
> > +/* If we are reading the spilled register right after writing
> > + * to it we can skip the scratch read and use directly the
> > + * register we used as source for the scratch write. For this
> > + * to work we must check that:
> > + *
> > + * 1) The write is inconditional, that is, it is not 
> > predicated or
> > +  it is a SEL.
> > + * 2) All the channels that we read have been written in that
> > + *last write instruction.
> > + */
> > +if (spill_write_inst &&
> > +(!spill_write_inst->predicate ||
> > + spill_write_inst->opcode == BRW_OPCODE_SEL) &&
> > +writemask_matches_swizzle(spill_write_inst->dst.writemask,
> > +  inst->src[i].swizzle)) {
> 
> brw_mask_for_swizzle() returns the mask of components accessed by a
> swizzle, you could just AND it with ~spill_write_inst->dst.writemask to
> find out whether it's contained in the destination of the previous
> instruction.

Ah nice, thanks for the tip!

Iago
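
The mask-based check suggested above can be sketched as follows. This is a self-contained illustration in C, not the i965 implementation: `SWZ`, `GET_SWZ`, `mask_for_swizzle` and `swizzle_within_writemask` are hypothetical stand-ins, and the swizzle encoding (four 2-bit channel selectors, lowest bits first) is an assumption made for the example.

```c
#include <assert.h>

/* Assumed encoding: a swizzle packs four 2-bit channel selectors
 * (0 = x, 1 = y, 2 = z, 3 = w), lowest bits first. */
#define SWZ(a, b, c, d) ((a) | ((b) << 2) | ((c) << 4) | ((d) << 6))
#define GET_SWZ(swz, i) (((swz) >> ((i) * 2)) & 3)

/* Mask of the channels a swizzle reads (bit 0 = x ... bit 3 = w). */
static unsigned mask_for_swizzle(unsigned swizzle)
{
   unsigned mask = 0;

   for (int i = 0; i < 4; i++)
      mask |= 1u << GET_SWZ(swizzle, i);
   return mask;
}

/* True when every channel read through the swizzle was written by the
 * previous instruction, i.e. no read channel lies outside the writemask. */
static int swizzle_within_writemask(unsigned writemask, unsigned swizzle)
{
   assert(writemask <= 0xf);
   return (mask_for_swizzle(swizzle) & ~writemask) == 0;
}
```

The per-channel loop in writemask_matches_swizzle() and this single AND give the same answer; the mask form just separates "which channels are read" from "are they all covered".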

> > +   scratch_reg = spill_write_inst->dst.reg;
> > +} else if (scratch_reg == -1) {
> > scratch_reg = alloc.allocate(1);
> > src_reg temp = inst->src[i];
> > temp.reg = scratch_reg;
> > @@ -358,6 +392,9 @@ vec4_visitor::spill_reg(int spill_reg_nr)
> >  
> >if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) {
> >   emit_scratch_write(block, inst, spill_offset);
> > + sp

Re: [Mesa-dev] [PATCH 3/5] i965/vec4: Register spilling should never see registers with size != 1

2015-07-26 Thread Iago Toral
On Fri, 2015-07-24 at 16:20 +0300, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > Larger registers should have been moved to scratch (like GRF array access)
> > or split to size 1 by the split_virtual_grfs pass.
> 
> Not necessarily.  split_virtual_grfs() won't be able to split stuff
> which is read or written at once by the same instruction -- E.g. by
> send-from-GRF instructions as used for surface messages on e.g.  your
> SSBO implementation.  :)
> 
> We should probably eventually migrate other messages too like the ones
> used for texturing and framebuffer writes to use proper sends from
> GRF...

Okay, in that case I'll include patches to add support for spilling
registers with size > 1 as well.

Thanks,
Iago

> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > index cff5406..80ab813 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
> > @@ -271,7 +271,8 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, 
> > bool *no_spill)
> >  
> > for (unsigned i = 0; i < this->alloc.count; i++) {
> >spill_costs[i] = 0.0;
> > -  no_spill[i] = alloc.sizes[i] != 1;
> > +  no_spill[i] = false;
> > +  assert(this->alloc.sizes[i] == 1);
> > }
> >  
> > /* Calculate costs for spilling nodes.  Call it a cost of 1 per
> > -- 
> > 1.9.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH V5 5/7] mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1

2015-07-26 Thread Samuel Iglesias Gonsálvez


On 23/07/15 16:38, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/get.c  |  6 ++
>  src/mesa/main/get_hash_params.py | 10 +-
>  2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 60c1b1b..a443493 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -385,6 +385,12 @@ static const int extra_ARB_texture_multisample_es31[] = {
> EXTRA_END
>  };
>  
> +static const int extra_ARB_texture_gather_es31[] = {
> +   EXT(ARB_texture_gather),
> +   EXTRA_API_ES31,
> +   EXTRA_END
> +};
> +
>  EXTRA_EXT(ARB_texture_cube_map);
>  EXTRA_EXT(EXT_texture_array);
>  EXTRA_EXT(NV_fog_distance);
> diff --git a/src/mesa/main/get_hash_params.py 
> b/src/mesa/main/get_hash_params.py
> index fbf2abf..d7e7a7a 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -436,6 +436,11 @@ descriptor=[
>[ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
> extra_ARB_texture_multisample_es31" ],
>[ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
> extra_ARB_texture_multisample_es31" ],
>[ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample_es31" 
> ],
> +
> +# GL_ARB_texture_gather / GLES 3.1
> +  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
> "CONTEXT_INT(Const.MinProgramTextureGatherOffset), 
> extra_ARB_texture_gather_es31"],
> +  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
> "CONTEXT_INT(Const.MaxProgramTextureGatherOffset), 
> extra_ARB_texture_gather_es31"],
> +  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
> "CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
> extra_ARB_texture_gather_es31"],

I could not find MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB in the GLES 3.1
spec. Can you confirm that it is defined there?

Sam

>  ]},
>  
>  # Enums in OpenGL Core profile and ES 3.1
> @@ -775,11 +780,6 @@ descriptor=[
>  # GL_ARB_texture_cube_map_array
>[ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
> TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
>  
> -# GL_ARB_texture_gather
> -  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
> "CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather"],
> -  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
> "CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather"],
> -  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
> "CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
> extra_ARB_texture_gather"],
> -
>  # GL_ARB_separate_shader_objects
>[ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, 
> GL_PROGRAM_PIPELINE_BINDING, NO_EXTRA" ],
>  
> 

Re: [Mesa-dev] [PATCH V5 2/7] mesa/es3.1: enable GL_ARB_shader_image_load_store for GLES 3.1

2015-07-26 Thread Samuel Iglesias Gonsálvez


On 23/07/15 16:38, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/get.c  |  6 ++
>  src/mesa/main/get_hash_params.py | 17 +++--
>  2 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index ec7eb71..dc04930 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -367,6 +367,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
> EXTRA_END
>  };
>  
> +static const int extra_ARB_shader_image_load_store_es31[] = {
> +   EXT(ARB_shader_image_load_store),
> +   EXTRA_API_ES31,
> +   EXTRA_END
> +};
> +
>  EXTRA_EXT(ARB_texture_cube_map);
>  EXTRA_EXT(EXT_texture_array);
>  EXTRA_EXT(NV_fog_distance);
> diff --git a/src/mesa/main/get_hash_params.py 
> b/src/mesa/main/get_hash_params.py
> index 4137e7f..85df077 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -407,6 +407,17 @@ descriptor=[
>[ "TEXTURE_EXTERNAL_OES", "LOC_CUSTOM, TYPE_BOOLEAN, 0, 
> extra_OES_EGL_image_external" ],
>  ]},
>  
> +# Enums in OpenGL and ES 3.1
> +{ "apis": ["GL", "GL_CORE", "GLES31"], "params": [
> +# GL_ARB_shader_image_load_store / GLES 3.1
> +  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
> extra_ARB_shader_image_load_store_es31" ],
> +  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
> "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
> extra_ARB_shader_image_load_store_es31" ],

I could not find MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS in the GLES
3.1 spec, only MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is missing from this
patch.

Can you verify it?

> +  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
> extra_ARB_shader_image_load_store_es31" ],

I could not find MAX_IMAGE_SAMPLES in the GLES 3.1 spec. Please check it.

Sam
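
A quick way to list which enums a descriptor group exposes to a given API
is sketched below. It is a rough illustration, not part of the patch: the
first group is a trimmed copy of the quoted hunk (the value strings are
shortened to "..."), the second group's "apis" list is an assumption made
up for the example, and enums_for_api() is a hypothetical helper. The
flagged set contains only the two names questioned in this review.

```python
# Illustrative sketch only. Group 1 mirrors the quoted hunk with values
# trimmed; group 2's "apis" list is an assumption for the example.
descriptor = [
    {"apis": ["GL", "GL_CORE", "GLES31"], "params": [
        ["MAX_IMAGE_UNITS", "..."],
        ["MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "..."],
        ["MAX_IMAGE_SAMPLES", "..."],
    ]},
    {"apis": ["GL", "GL_CORE"], "params": [
        ["MAX_GEOMETRY_IMAGE_UNIFORMS", "..."],
    ]},
]

def enums_for_api(groups, api):
    """Hypothetical helper: enum names in every group that lists `api`."""
    names = []
    for group in groups:
        if api in group["apis"]:
            names.extend(param[0] for param in group["params"])
    return names

# Names questioned in this review (not a statement about the spec itself).
flagged = {
    "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS",
    "MAX_IMAGE_SAMPLES",
}

# Enums that would become visible to ES 3.1 and still need checking.
to_check = [n for n in enums_for_api(descriptor, "GLES31") if n in flagged]
```

Moving an entry into a group whose "apis" list includes "GLES31" is what
makes it queryable from an ES 3.1 context, so each name in such a group
is worth checking against the ES 3.1 spec.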

> +  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
> extra_ARB_shader_image_load_store_es31" ],
> +  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
> extra_ARB_shader_image_load_store_es31" ],
> +  [ "MAX_COMBINED_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.MaxCombinedImageUniforms), 
> extra_ARB_shader_image_load_store_es31" ],
> +]},
> +
>  # Enums in OpenGL Core profile and ES 3.1
>  { "apis": ["GL_CORE", "GLES3"], "params": [
>  # GL_ARB_draw_indirect / GLES 3.1
> @@ -779,13 +790,7 @@ descriptor=[
>[ "MAX_VERTEX_ATTRIB_BINDINGS", 
> "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
>  
>  # GL_ARB_shader_image_load_store
> -  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
> "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
> extra_ARB_shader_image_load_store"],
>[ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
> extra_ARB_shader_image_load_store_and_geometry_shader"],
> -  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_COMBINED_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.MaxCombinedImageUniforms), 
> extra_ARB_shader_image_load_store"],
>  
>  # GL_ARB_compute_shader
>[ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
> "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" 
> ],
> 

Re: [Mesa-dev] [PATCH V5 4/7] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1

2015-07-26 Thread Samuel Iglesias Gonsálvez


On 23/07/15 16:38, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/get.c  |  6 ++
>  src/mesa/main/get_hash_params.py | 18 +-
>  2 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 39fe725..60c1b1b 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -379,6 +379,12 @@ static const int extra_ARB_shader_atomic_counters_es31[] 
> = {
> EXTRA_END
>  };
>  
> +static const int extra_ARB_texture_multisample_es31[] = {
> +   EXT(ARB_texture_multisample),
> +   EXTRA_API_ES31,
> +   EXTRA_END
> +};
> +
>  EXTRA_EXT(ARB_texture_cube_map);
>  EXTRA_EXT(EXT_texture_array);
>  EXTRA_EXT(NV_fog_distance);
> diff --git a/src/mesa/main/get_hash_params.py 
> b/src/mesa/main/get_hash_params.py
> index b1a5dd5..fbf2abf 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -427,6 +427,15 @@ descriptor=[
>[ "MAX_FRAGMENT_ATOMIC_COUNTERS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters), 
> extra_ARB_shader_atomic_counters_es31" ],
>[ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", 
> "CONTEXT_INT(Const.MaxCombinedAtomicBuffers), 
> extra_ARB_shader_atomic_counters_es31" ],
>[ "MAX_COMBINED_ATOMIC_COUNTERS", 
> "CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
> extra_ARB_shader_atomic_counters_es31" ],
> +
> +# GL_ARB_texture_multisample / GLES 3.1
> +  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, 
> TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample_es31" ],
> +  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, 
> TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample_es31" ],

I could not find TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY in the GLES 3.1 spec.
Can you verify it?

Sam

> +  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples), 
> extra_ARB_texture_multisample_es31" ],
> +  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples), 
> extra_ARB_texture_multisample_es31" ],
> +  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
> extra_ARB_texture_multisample_es31" ],
> +  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
> extra_ARB_texture_multisample_es31" ],
> +  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample_es31" 
> ],
>  ]},
>  
>  # Enums in OpenGL Core profile and ES 3.1
> @@ -718,15 +727,6 @@ descriptor=[
>[ "TEXTURE_BUFFER_FORMAT_ARB", "LOC_CUSTOM, TYPE_INT, 0, 
> extra_texture_buffer_object" ],
>[ "TEXTURE_BUFFER_ARB", "LOC_CUSTOM, TYPE_INT, 0, 
> extra_texture_buffer_object" ],
>  
> -# GL_ARB_texture_multisample / GL 3.2
> -  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, 
> TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample" ],
> -  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, 
> TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample" ],
> -  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples), 
> extra_ARB_texture_multisample" ],
> -  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples), 
> extra_ARB_texture_multisample" ],
> -  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
> extra_ARB_texture_multisample" ],
> -  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
> extra_ARB_texture_multisample" ],
> -  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample" ],
> -
>  # GL 3.0
>[ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), extra_version_30" ],
>  
> 

Re: [Mesa-dev] [PATCH V5 0/7] Enabling extension enums for OpenGL ES 3.1

2015-07-26 Thread Samuel Iglesias Gonsálvez
Hello Marta,

I quickly searched the OpenGL ES 3.1 spec for these constants but could
not find some of them. Since my search may have been wrong, please check
whether they are defined.

Thanks,

Sam

On 23/07/15 16:38, Marta Lofstedt wrote:
> This is V5 of my patch-set for enabling extension enums for
> OpenGL ES 3.1. 
> 
> This update addresses comments from Ilia Mirkin and adds a new
> GLES31 label.
> 
> I have my current GLES 3.1 work on GitHub:
> https://github.com/MartaLo/mesa
> For these patches, see the gles31_resent_patches branch.
> 
> Marta Lofstedt (7):
>   mesa/es3.1: Add ES 3.1 handling to get.c and get_hash_generator.py
>   mesa/es3.1: enable GL_ARB_shader_image_load_store for GLES 3.1
>   mesa/es3.1: enable GL_ARB_shader_atomic_counters for GLES 3.1
>   mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
>   mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1
>   mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1
>   mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1
> 
>  src/mesa/main/get.c | 41 -
>  src/mesa/main/get_hash_generator.py | 12 +++--
>  src/mesa/main/get_hash_params.py| 89 -
>  3 files changed, 97 insertions(+), 45 deletions(-)
> 