Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-10 Thread Samuel Iglesias Gonsálvez

On 11/03/16 01:08, Jason Ekstrand wrote:
> On Thu, Mar 10, 2016 at 4:00 PM, Connor Abbott
>  wrote:
> 
>> On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez 
>>  wrote:
>>> From: Jason Ekstrand 
>>> 
>>> v2: Fix size/type mask to properly handle 8-bit types.
>>> 
>>> Signed-off-by: Juan A. Suarez Romero
>>> ---
>>>  src/compiler/nir/nir.h | 17 ++++++++++++++++-
>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>> index cccb3a4..659e98c 100644
>>> --- a/src/compiler/nir/nir.h
>>> +++ b/src/compiler/nir/nir.h
>>> @@ -605,9 +605,24 @@ typedef enum {
>>>     nir_type_float,
>>>     nir_type_int,
>>>     nir_type_uint,
>>> -   nir_type_bool
>>> +   nir_type_bool,
>>> +   nir_type_bool32 =    32 | nir_type_bool,
>>> +   nir_type_int8 =      8  | nir_type_int,
>>> +   nir_type_int16 =     16 | nir_type_int,
>>> +   nir_type_int32 =     32 | nir_type_int,
>>> +   nir_type_int64 =     64 | nir_type_int,
>>> +   nir_type_uint8 =     8  | nir_type_uint,
>>> +   nir_type_uint16 =    16 | nir_type_uint,
>>> +   nir_type_uint32 =    32 | nir_type_uint,
>>> +   nir_type_uint64 =    64 | nir_type_uint,
>>> +   nir_type_float16 =   16 | nir_type_float,
>>> +   nir_type_float32 =   32 | nir_type_float,
>>> +   nir_type_float64 =   64 | nir_type_float,
>>>  } nir_alu_type;
>>>
>>> +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
>>> +#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007
>> 
>> So I'm not really the one to be reviewing this series (after all,
>> I wrote most of it :) ) but one thing that I never quite liked,
>> and didn't get around to fixing, is how we use these raw
>> constants all over the place. Perhaps we could make things more
>> readable by adding nir_get_sized_type(), nir_get_unsized_type(),
>> and nir_type_size() helpers and then use those instead of
>> or-ing/and-ing things together everywhere.
>> 
> 
> Agreed.
> 
> 

Agreed. We saw it too but, as this is used a lot in the fp64 patches,
we were thinking of applying one patch at the end of the fp64 series that
adds those helper functions (maybe just macros like NIR_GET_UNSIZED_TYPE
and NIR_GET_TYPE_SIZE) and adapts the users of the mask.

However, we can add them here and modify the rest of fp64 patches if
you prefer it.

Sam
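
For illustration, the helpers suggested above could look roughly like the
following sketch. The ex_* names, the mask values and the assumption that the
first enumerator is 0 are all made up for this example; it is not the code
that was eventually merged.

```c
#include <assert.h>

/* Illustrative stand-in for nir_alu_type (hypothetical names): the low
 * three bits hold the base type, the remaining bits hold the bit size. */
typedef enum {
   ex_type_invalid = 0,
   ex_type_float,
   ex_type_int,
   ex_type_uint,
   ex_type_bool,
   ex_type_int8    = 8  | ex_type_int,
   ex_type_float32 = 32 | ex_type_float,
   ex_type_uint64  = 64 | ex_type_uint,
} ex_alu_type;

/* Assumed masks for this sketch only. */
#define EX_BASE_TYPE_MASK 0x7u
#define EX_SIZE_MASK      (~EX_BASE_TYPE_MASK)

/* Bit size encoded in the type, or 0 for an unsized type. */
static inline unsigned
ex_type_size(ex_alu_type type)
{
   return (unsigned)type & EX_SIZE_MASK;
}

/* Strip the size bits, leaving only the base type. */
static inline ex_alu_type
ex_get_unsized_type(ex_alu_type type)
{
   return (ex_alu_type)((unsigned)type & EX_BASE_TYPE_MASK);
}

/* Combine an unsized base type with a bit size of 8, 16, 32 or 64. */
static inline ex_alu_type
ex_get_sized_type(ex_alu_type base, unsigned bit_size)
{
   assert(ex_type_size(base) == 0);
   assert(bit_size == 8 || bit_size == 16 ||
          bit_size == 32 || bit_size == 64);
   return (ex_alu_type)((unsigned)base | bit_size);
}
```

Callers would then write ex_type_size(t) instead of and-ing with a raw mask,
which is the readability gain being asked for.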

>> 
>>> +
>>>  typedef enum {
>>>     NIR_OP_IS_COMMUTATIVE = (1 << 0),
>>>     NIR_OP_IS_ASSOCIATIVE = (1 << 1),
>>> --
>>> 2.7.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/10] i965: Silence loop counter overflow warning

2016-03-10 Thread Jason Ekstrand
On Mar 10, 2016 1:21 PM, "Eirik Byrkjeflot Anonsen" 
wrote:
>
> Ian Romanick  writes:
>
> > From: Ian Romanick 
> >
> > I don't understand why the old code was bad, but the new code is fine.
>
> Probably because the *loop counter* can no longer overflow. Thus the
> loop can be optimized. The fact that "i" might overflow has become
> irrelevant to the warning.

Right.  Since i is incremented by 4 each time, it could, in theory, skip
right over size/4 and overflow.  However, this can never happen: size is
32 bits and is divided by 4, so the bound has a maximum value of 2^30-1.
Apparently, GCC isn't quite that smart. :-)
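
The arithmetic can be checked with a small sketch. The loop shape below is
only assumed from the description above (the patch itself is not quoted in
this message), and the ex_* names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Largest possible loop bound when a 32-bit size is divided by 4. */
static uint32_t
ex_max_bound(void)
{
   return UINT32_MAX / 4;   /* 0x3fffffff, i.e. 2^30 - 1 */
}

/* Count the iterations of a loop whose counter advances by 4 until it
 * reaches or passes size / 4 (assumed shape, not the driver's code). */
static uint32_t
ex_count_steps(uint32_t size)
{
   uint32_t bound = size / 4;
   uint32_t steps = 0;

   for (uint32_t i = 0; i < bound; i += 4) {
      /* i < bound <= 2^30 - 1, so i + 4 can never wrap around. */
      assert(i <= UINT32_MAX - 4);
      steps++;
   }
   return steps;
}
```

Since the counter is always below 2^30 when it is incremented, adding 4 stays
far away from the 32-bit limit, which is the reasoning the compiler fails to
reproduce.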

> (And from that perspective, it isn't equivalent. If "i" overflows in the
> original code, you would get an infinite loop.)
>
> eirik


[Mesa-dev] [Bug 94481] softpipe - access violation in img_filter_2d_nearest

2016-03-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94481

--- Comment #2 from Roland Scheidegger  ---
The initialization should be pretty easy to fix, I think; if no one beats me to
it I'll see what I can do...



Re: [Mesa-dev] LLVMInitializeAMDGPU* undefined?

2016-03-10 Thread Jan Vesely
On Fri, 2016-03-11 at 10:09 +0800, Chih-Wei Huang wrote:
> On 10 March 2016 at 6:47 PM, "Marek Olšák" wrote:
> > 
> > 
> > Those functions are only supported by LLVM 3.7 and later, and if
> > you
> > have such a version, the AMDGPU backend must be enabled in your
> > LLVM
> > build.
> Yes, I knew that.
> But it seems you misunderstood my question.
> 
> It's a compile-time error rather than a link-time error:
> 
> external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c: In
> function 'init_r600_target':
> external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:106:2:
> error: implicit declaration of function
> 'LLVMInitializeAMDGPUTargetInfo'
> [-Werror=implicit-function-declaration]
>   LLVMInitializeAMDGPUTargetInfo();
>   ^
> external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:107:2:
> error: implicit declaration of function 'LLVMInitializeAMDGPUTarget'
> [-Werror=implicit-function-declaration]
>   LLVMInitializeAMDGPUTarget();
>   ^
> external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:108:2:
> error: implicit declaration of function
> 'LLVMInitializeAMDGPUTargetMC'
> [-Werror=implicit-function-declaration]
>   LLVMInitializeAMDGPUTargetMC();
>   ^
> external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:109:2:
> error: implicit declaration of function
> 'LLVMInitializeAMDGPUAsmPrinter'
> [-Werror=implicit-function-declaration]
>   LLVMInitializeAMDGPUAsmPrinter();
>   ^
> 
> cc1: some warnings being treated as errors
> 
> 
> Some proposed patches on the mesa side:
> 
> https://sourceforge.net/p/android-x86/external_mesa/ci/f6611f58cf89a4
> 0e013b20180604f65707b6e73e/
> (add a header to declare the functions)
> 
> https://github.com/maurossi/mesa/commit/deb3a6ebb7fdba688b0331bd0e4b2
> 7acfc9d869f
> (disable implicit declaration warnings)
> 
> But I'm still not sure whether it should be fixed on the mesa
> side.
> 
> So my question is, what kind of fix do we want (i.e., acceptable by
> the upstream)?

This is the result of using LLVM without the AMDGPU backend. The header
"llvm-c/Target.h" includes "llvm/Config/Targets.def" and declares
functions based on the backends listed there (selected at LLVM build time).

Your "llvm/Config/Targets.def" needs to include this line:
LLVM_TARGET(AMDGPU)

Otherwise the functions are not declared.
Probably the only patch needed on the mesa side is to detect this at
configure time.

regards,
Jan
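
The mechanism described above is the usual "X-macro" pattern. A minimal
self-contained sketch, with a made-up target list and made-up ex_* function
names standing in for the real LLVM headers, might look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for llvm/Config/Targets.def: one entry per
 * backend that was enabled when the library was built. */
#define EX_TARGETS(X) \
   X(X86)             \
   X(AMDGPU)

/* Declare one initializer per listed backend.  A backend missing from
 * the list gets no prototype, so a call to its initializer becomes an
 * implicit declaration, which is the error reported in this thread. */
#define EX_DECLARE(name) const char *ex_initialize_##name##_target(void);
EX_TARGETS(EX_DECLARE)
#undef EX_DECLARE

/* Matching definitions generated from the same list (illustration only;
 * in LLVM the definitions live in the backend libraries). */
#define EX_DEFINE(name) \
   const char *ex_initialize_##name##_target(void) { return #name; }
EX_TARGETS(EX_DEFINE)
#undef EX_DEFINE
```

Removing the AMDGPU entry from the list makes any call to
ex_initialize_AMDGPU_target() fail at compile time rather than at link time,
which matches the symptom reported above.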


> 
> > 
> > On Thu, Mar 10, 2016 at 10:04 AM, Chih-Wei Huang
> >  wrote:
> > > 
> > > Hi devs,
> > > On building 64-bit Android-x86 mesa with amdgpu support,
> > > we got some errors saying that the LLVMInitializeAMDGPU* symbols
> > > are not defined (missing prototypes).
> > > 
> > > It's easy to fix the errors by adding the
> > > function prototypes.
> > > However, I'm curious on which side it should be fixed:
> > > mesa or llvm?
> > > 
> > > Did llvm forget to provide a header for these functions?
> > > Or are the functions internal APIs of llvm that should not be
> > > used
> > > by mesa directly?
> > > 
> > > Any comment?


Re: [Mesa-dev] [PATCH 5/5] i965: Stop XY clipping point and line primitives.

2016-03-10 Thread Roland Scheidegger
Am 11.03.2016 um 03:50 schrieb Kenneth Graunke:
> On Friday, March 11, 2016 2:50:46 AM PST Roland Scheidegger wrote:
>> Technically, this is still wrong for rendering traditional GL points,
>> which indeed require that points be drawn either in full (even the parts
>> outside the viewport, if the center is inside the viewport) or not at all
>> (if the center is outside the viewport). Although the GLES language may be
>> different (and it looks like it is even getting cleared up), the rules for
>> GL are still the same.
>> However, just about everybody seems to hate the traditional point
>> clipping, and some vendors never implemented it that way anyway (or did
>> so on a case-by-case basis, depending on whatever...). (The d3d9 rules
>> required the same behavior as what you're doing in this change, fwiw, so
>> the same as GLES, whereas d3d10 "fixed" this problem by not actually
>> supporting large points at all...)
>>
>> Roland
> 
> Exactly.  I originally thought the test was broken.  I think ES
> clarifies the language, but it doesn't actually specify *this*
> behavior yet.
> 
> It seems like most vendors implement this behavior, and it's the
> behavior that people actually seem to want.  Based on the bug entry,
> it seems people were generally in favor of trying to write new language
> to support this behavior.  But it stalled in 2014.
> 
> https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10698
> 
> It's kind of sketchy.
> 

Oh, I skimmed through the bug and thought the proposal to fix the spec
language was only for ES (which is already different from GL here), not
GL. But that doesn't look to be entirely the case, although it's not quite
clear (the part about changing GL seems to be more of an informal
request)... And I just assumed newer GLES versions would already have
the new language, as the bug is quite old and there seemed to be some
agreement to change it. Strangely enough, nothing happened, so GLES
3.1 and 3.2 do indeed still have the same wording...

Roland



[Mesa-dev] [PATCH] mesa: Disallow GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME on winsys FBO.

2016-03-10 Thread Kenneth Graunke
Fixes:
dEQP-GLES3.functional.negative_api.state.get_framebuffer_attachment_parameteriv

Apparently, GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is not allowed when
GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is GL_FRAMEBUFFER_DEFAULT, and
is expected to result in a GL_INVALID_ENUM error.

No GL specification actually defines what GL_FRAMEBUFFER_DEFAULT means.
It probably means the window system FBO.  It also doesn't mention the
behavior of any queries for that type.  Various ARB folks seem fairly
confused about it too.  For now, just do something vaguely like what
dEQP expects.

I think we probably need to check the visual bits against 0 for the
attachment, but we haven't been doing that thus far, and given how
confusingly this is specified, I can't imagine anyone relying on it.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/fbobject.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index 0eec9d9..b52a71b 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -3625,6 +3625,17 @@ _mesa_get_framebuffer_attachment_parameter(struct gl_context *ctx,
       }
       /* the default / window-system FBO */
       att = _mesa_get_fb0_attachment(ctx, buffer, attachment);
+
+      /* No credible spec text to cite, but see
+       * https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12928#c1
+       * and https://bugs.freedesktop.org/show_bug.cgi?id=31947
+       */
+      if (pname == GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME) {
+         _mesa_error(ctx, GL_INVALID_ENUM,
+                     "%s(requesting GL_FRAMEBUFFER_ATTACHMENT_OBJECT_NAME "
+                     "when GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is "
+                     "GL_FRAMEBUFFER_DEFAULT is not allowed)", caller);
+      }
    }
    else {
       /* user-created framebuffer FBO */
-- 
2.7.2



Re: [Mesa-dev] [PATCH 5/5] i965: Stop XY clipping point and line primitives.

2016-03-10 Thread Kenneth Graunke
On Friday, March 11, 2016 2:50:46 AM PST Roland Scheidegger wrote:
> Technically, this is still wrong for rendering traditional GL points,
> which indeed require that points be drawn either in full (even the parts
> outside the viewport, if the center is inside the viewport) or not at all
> (if the center is outside the viewport). Although the GLES language may be
> different (and it looks like it is even getting cleared up), the rules for
> GL are still the same.
> However, just about everybody seems to hate the traditional point
> clipping, and some vendors never implemented it that way anyway (or did
> so on a case-by-case basis, depending on whatever...). (The d3d9 rules
> required the same behavior as what you're doing in this change, fwiw, so
> the same as GLES, whereas d3d10 "fixed" this problem by not actually
> supporting large points at all...)
> 
> Roland

Exactly.  I originally thought the test was broken.  I think ES
clarifies the language, but it doesn't actually specify *this*
behavior yet.

It seems like most vendors implement this behavior, and it's the
behavior that people actually seem to want.  Based on the bug entry,
it seems people were generally in favor of trying to write new language
to support this behavior.  But it stalled in 2014.

https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10698

It's kind of sketchy.




Re: [Mesa-dev] [PATCH] squash: Fix up VPM read optimization.

2016-03-10 Thread Rhys Kidd
On 10 March 2016 at 15:21, Eric Anholt  wrote:

> - There's no reason there would be only 64 operations that read from the
>   output of a mov from VPM, so we might smash the stack (fixes etqw trace)
>
> - Fixes segfault where we assumed that a single-use temp had a def (fixes
>   2 piglit tests)
>
> - We need to only mark progress when we actually did the optimization, or
>   we'll infinite loop (0ad trace).
>
> - Misc style fixes.
>
> - No reordering sampler instructions (fixes a glean test)
>
> shader-db results:
> total instructions in shared programs: 78513 -> 78071 (-0.56%)
> instructions in affected programs: 10406 -> 9964 (-4.25%)
> total estimated cycles in shared programs: 234674 -> 234274 (-0.17%)
> estimated cycles in affected programs: 35188 -> 34788 (-1.14%)
> ---
>
> Varad, here's what I came up with trying to test your patch.  If these
> changes look good to you, I can squash them in and push.
>
>  src/gallium/drivers/vc4/vc4_opt_vpm.c | 45
> +--
>  1 file changed, 22 insertions(+), 23 deletions(-)
>
> diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> index 277b345..a4ee6af 100644
> --- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
> +++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
> @@ -40,10 +40,8 @@ qir_opt_vpm(struct vc4_compile *c)
>
>  bool progress = false;
>  struct qinst *vpm_writes[64] = { 0 };
> -struct qinst *vpm_reads[64] = { 0 };
>  uint32_t use_count[c->num_temps];
>  uint32_t vpm_write_count = 0;
> -uint32_t vpm_read_count = 0;
>  memset(&use_count, 0, sizeof(use_count));
>
>  list_for_each_entry(struct qinst, inst, &c->instructions, link) {
> @@ -59,24 +57,14 @@ qir_opt_vpm(struct vc4_compile *c)
>  if (inst->src[i].file == QFILE_TEMP) {
>  uint32_t temp = inst->src[i].index;
>  use_count[temp]++;
> -
> -struct qinst *mov = c->defs[temp];
> -if (!mov ||
> -(mov->op != QOP_MOV &&
> -mov->op != QOP_FMOV &&
> -mov->op != QOP_MMOV)) {
> -continue;
> -}
> -
> -if (mov->src[0].file == QFILE_VPM)
> -vpm_reads[vpm_read_count++] =
> inst;
>  }
>  }
>  }
>
> -for (int i = 0; i < vpm_read_count; i++) {
> -struct qinst *inst = vpm_reads[i];
> -
> +/* For instructions reading from a temporary that contains a VPM
> read
> + * result, try to move the instruction up in place of the VPM
> read.
> + */
> +list_for_each_entry(struct qinst, inst, &c->instructions, link) {
>  if (!inst || qir_is_multi_instruction(inst))
>  continue;
>
> @@ -84,21 +72,32 @@ qir_opt_vpm(struct vc4_compile *c)
>  continue;
>
>  if (qir_has_side_effects(c, inst) ||
> -qir_has_side_effect_reads(c, inst))
> +qir_has_side_effect_reads(c, inst) ||
> +qir_is_tex(inst))
>  continue;
>
>  for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
> -if(inst->src[j].file != QFILE_TEMP)
> +if (inst->src[j].file != QFILE_TEMP)
>  continue;
>
>  uint32_t temp = inst->src[j].index;
> +
> +/* Since VPM reads pull from a FIFO, we only get
> to
> + * read each VPM entry once (unless we reset the
> read
> + * pointer).  That means we can't copy-propagate
> a VPM
> + * read to multiple locations.
> + */
>  if (use_count[temp] != 1)
>  continue;
>
>  struct qinst *mov = c->defs[temp];
> -
> -if (mov->src[0].file != QFILE_VPM)
> +if (!mov ||
> +(mov->op != QOP_MOV &&
> + mov->op != QOP_FMOV &&
> + mov->op != QOP_MMOV) ||
> +mov->src[0].file != QFILE_VPM) {
>  continue;
> +}
>
>  uint32_t temps = 0;
>  for (int k = 0; k < qir_get_op_nsrc(inst->op);
> k++) {
> @@ -109,15 +108,15 @@ qir_opt_vpm(struct vc4_compile *c)
>  /* The instruction is safe to reorder if its other
>   * sources are independent of previous
> 

Re: [Mesa-dev] LLVMInitializeAMDGPU* undefined?

2016-03-10 Thread Chih-Wei Huang
On 10 March 2016 at 6:47 PM, "Marek Olšák" wrote:
>
> Those functions are only supported by LLVM 3.7 and later, and if you
> have such a version, the AMDGPU backend must be enabled in your LLVM
> build.

Yes, I knew that.
But it seems you misunderstood my question.

It's a compile-time error rather than a link-time error:

external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c: In
function 'init_r600_target':
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:106:2:
error: implicit declaration of function
'LLVMInitializeAMDGPUTargetInfo'
[-Werror=implicit-function-declaration]
  LLVMInitializeAMDGPUTargetInfo();
  ^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:107:2:
error: implicit declaration of function 'LLVMInitializeAMDGPUTarget'
[-Werror=implicit-function-declaration]
  LLVMInitializeAMDGPUTarget();
  ^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:108:2:
error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC'
[-Werror=implicit-function-declaration]
  LLVMInitializeAMDGPUTargetMC();
  ^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:109:2:
error: implicit declaration of function
'LLVMInitializeAMDGPUAsmPrinter'
[-Werror=implicit-function-declaration]
  LLVMInitializeAMDGPUAsmPrinter();
  ^

cc1: some warnings being treated as errors


Some proposed patches on the mesa side:

https://sourceforge.net/p/android-x86/external_mesa/ci/f6611f58cf89a40e013b20180604f65707b6e73e/
(add a header to declare the functions)

https://github.com/maurossi/mesa/commit/deb3a6ebb7fdba688b0331bd0e4b27acfc9d869f
(disable implicit declaration warnings)

But I'm still not sure whether it should be fixed on the mesa side.

So my question is, what kind of fix do we want (i.e., acceptable by
the upstream)?

> On Thu, Mar 10, 2016 at 10:04 AM, Chih-Wei Huang
>  wrote:
> > Hi devs,
> > On building 64-bit Android-x86 mesa with amdgpu support,
> > we got some errors saying that the LLVMInitializeAMDGPU* symbols
> > are not defined (missing prototypes).
> >
> > It's easy to fix the errors by adding the
> > function prototypes.
> > However, I'm curious on which side it should be fixed:
> > mesa or llvm?
> >
> > Did llvm forget to provide a header for these functions?
> > Or are the functions internal APIs of llvm that should not be used
> > by mesa directly?
> >
> > Any comment?


Re: [Mesa-dev] Mesa include guard style. (Was: [PATCH] i965/cfg: Remove redundant #pragma once.)

2016-03-10 Thread Francisco Jerez
Iago Toral  writes:

> On Wed, 2016-03-09 at 19:04 -0800, Francisco Jerez wrote:
>> Matt Turner  writes:
>> 
>> > On Wed, Mar 9, 2016 at 1:37 PM, Francisco Jerez  
>> > wrote:
>> >> Iago Toral  writes:
>> >>
>> >>> On Tue, 2016-03-08 at 17:42 -0800, Francisco Jerez wrote:
>> >>>> brw_cfg.h already has include guards, remove the "#pragma once" which
>> >>>> is redundant and non-standard.
>> >>>
>> >>> FWIW, I think using both #pragma once and include guards is a way to
>> >>> keep portability while still getting the performance advantage of
>> >>> #pragma once where it is supported.
>> >>>
>> >> It's highly unlikely to make any significant difference on any
>> >> reasonably modern compiler.  I cannot measure any change in compilation
>> >> time locally from my cleanup.
>> >>
>> >>> Also it seems that we do the same thing in many other files...
>> >>>
>> >> Really?  I'm not aware of any other file where we use both.
>> >
>> > There are quite a few in glsl/
>> 
>> Heh, apparently you're right.  Anyway it seems rather pointless to use
>> '#pragma once' in a bunch of scattered header files with the expectation
>> to gain some speed, the improvement from a single header file is so
>> minuscule (if it will make any difference at all on a modern compiler
>> and compilation workload, which I doubt) that we would have to use it
>> universally in order to have the chance to measure any improvement.
>> 
>> Can we please just decide for one of the include guard styles and use it
>> consistently?  Given that the majority of header files in the Mesa
>> codebase use old-school define guards, that it's the only standard
>> option, that it has well-defined semantics in presence of file copies
>> and hardlinks, and that the performance argument against it is rather
>> dubious (although I definitely find '#pragma once' prettier and more
>> concise), I'd vote for using preprocessor define guards universally.
>> 
>> What do other people think?
>
> I think we necessarily have to use define guards, since #pragma once is
> not standard even if it has wide support. So the question is whether we
> want to use only define guards or define guards plus #pragma once. I am
> fine with doing only define guards as you propose.
>

*Shrug* I have the impression that the only real advantage of '#pragma
once' is that you no longer need to do the ifndef/define dance, so I
don't think I can see much benefit in doing both.

> Iago
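
For reference, the "ifndef/define dance" works as in the sketch below. It
pastes the text of a hypothetical header, ex_widget.h, twice into one file to
imitate a double #include; the names are made up for illustration.

```c
#include <assert.h>

/* ---- first inclusion of the hypothetical header ex_widget.h ---- */
#ifndef EX_WIDGET_H
#define EX_WIDGET_H

struct ex_widget { int id; };

static inline int
ex_widget_id(const struct ex_widget *w)
{
   assert(w);
   return w->id;
}

#endif /* EX_WIDGET_H */

/* ---- second inclusion of the same text ---- */
#ifndef EX_WIDGET_H   /* already defined, so everything up to #endif is skipped */
#define EX_WIDGET_H

struct ex_widget { int id; };   /* would be a redefinition without the guard */

#endif /* EX_WIDGET_H */
```

With '#pragma once' the header would carry a single line instead of the
three guard lines, which is the conciseness argument made above; the guard
form is the one that relies only on standard preprocessor behaviour.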




Re: [Mesa-dev] [PATCH 5/5] i965: Stop XY clipping point and line primitives.

2016-03-10 Thread Roland Scheidegger
Technically, this is still wrong for rendering traditional GL points,
which indeed require that points be drawn either in full (even the parts
outside the viewport, if the center is inside the viewport) or not at all (if
the center is outside the viewport). Although the GLES language may be
different (and it looks like it is even getting cleared up), the rules for GL
are still the same.
However, just about everybody seems to hate the traditional point
clipping, and some vendors never implemented it that way anyway (or did
so on a case-by-case basis, depending on whatever...). (The d3d9 rules
required the same behavior as what you're doing in this change, fwiw, so
the same as GLES, whereas d3d10 "fixed" this problem by not actually
supporting large points at all...)

Roland
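
To make the two rules concrete, here is a toy model that counts how many
pixels of a square point end up on screen under each rule. It only
illustrates the wording above; the ex_* names are made up and it says nothing
about how the hardware or the driver actually rasterizes points.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open pixel rectangle: [x0, x1) x [y0, y1). */
struct ex_rect { int x0, y0, x1, y1; };

static int ex_min(int a, int b) { return a < b ? a : b; }
static int ex_max(int a, int b) { return a > b ? a : b; }

/* Square of size x size pixels around the point center. */
static struct ex_rect
ex_point_rect(int cx, int cy, int size)
{
   struct ex_rect r;
   assert(size > 0);
   r.x0 = cx - size / 2;
   r.y0 = cy - size / 2;
   r.x1 = r.x0 + size;
   r.y1 = r.y0 + size;
   return r;
}

/* Traditional rule as described above: the whole point is drawn if its
 * center is inside the viewport, otherwise nothing is drawn. */
static int
ex_pixels_all_or_nothing(int cx, int cy, int size, struct ex_rect vp)
{
   bool center_inside = cx >= vp.x0 && cx < vp.x1 &&
                        cy >= vp.y0 && cy < vp.y1;
   return center_inside ? size * size : 0;
}

/* Rule implemented by the patch: the point is always rasterized and the
 * fragments outside the viewport are discarded. */
static int
ex_pixels_discard_outside(int cx, int cy, int size, struct ex_rect vp)
{
   struct ex_rect p = ex_point_rect(cx, cy, size);
   int w = ex_min(p.x1, vp.x1) - ex_max(p.x0, vp.x0);
   int h = ex_min(p.y1, vp.y1) - ex_max(p.y0, vp.y0);
   return (w > 0 && h > 0) ? w * h : 0;
}
```

For an 8-pixel point whose center sits two pixels outside the left edge of a
100x100 viewport, the first rule draws nothing while the second still shows
a 2x8 sliver, which is the "no popping" behaviour people seem to want.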


Am 11.03.2016 um 01:59 schrieb Kenneth Graunke:
> Wide points and lines are not supposed to be clipped by the viewport.
> Rather, they should be rendered, and any fragments outside of the
> viewport should be discarded.
> 
> The traditional use case for this behavior is rendering moving wide
> point particles.  When the center of the point approaches the viewport
> edge, clipping would make it pop out of view early.
> 
> Fixes:
> - dEQP-GLES2.functional.clipping.point.wide_point_clip
> - dEQP-GLES3.functional.clipping.point.wide_point_clip
> - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
> - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
> - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center
> - dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
> Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10698
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen6_clip_state.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c 
> b/src/mesa/drivers/dri/i965/gen6_clip_state.c
> index 9a29366..004eceb 100644
> --- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
> @@ -174,12 +174,14 @@ upload_clip_state(struct brw_context *brw)
> else
>enable = GEN6_CLIP_ENABLE;
>  
> +   if (!is_drawing_points(brw) && !is_drawing_lines(brw))
> +  dw2 |= GEN6_CLIP_XY_TEST;
> +
> BEGIN_BATCH(4);
> OUT_BATCH(_3DSTATE_CLIP << 16 | (4 - 2));
> OUT_BATCH(dw1);
> OUT_BATCH(enable |
>GEN6_CLIP_MODE_NORMAL |
> -  GEN6_CLIP_XY_TEST |
>dw2);
> OUT_BATCH(U_FIXED(0.125, 3) << GEN6_CLIP_MIN_POINT_WIDTH_SHIFT |
>   U_FIXED(255.875, 3) << GEN6_CLIP_MAX_POINT_WIDTH_SHIFT |
> @@ -195,7 +197,9 @@ const struct brw_tracked_state gen6_clip_state = {
> _NEW_TRANSFORM,
>.brw   = BRW_NEW_CONTEXT |
> BRW_NEW_FS_PROG_DATA |
> +   BRW_NEW_GEOMETRY_PROGRAM |
> BRW_NEW_META_IN_PROGRESS |
> +   BRW_NEW_PRIMITIVE |
> BRW_NEW_RASTERIZER_DISCARD,
> },
> .emit = upload_clip_state,
> @@ -209,7 +213,9 @@ const struct brw_tracked_state gen7_clip_state = {
> _NEW_TRANSFORM,
>.brw   = BRW_NEW_CONTEXT |
> BRW_NEW_FS_PROG_DATA |
> +   BRW_NEW_GEOMETRY_PROGRAM |
> BRW_NEW_META_IN_PROGRESS |
> +   BRW_NEW_PRIMITIVE |
> BRW_NEW_RASTERIZER_DISCARD,
> },
> .emit = upload_clip_state,
> 



Re: [Mesa-dev] [PATCH 3/5] i965: Include the viewport in the scissor rectangle.

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 8:05 PM, Ilia Mirkin  wrote:
> I thought there was some deal where viewport was *not* supposed to act
> as a scissor... perhaps I misunderstood though.

It is, btw, entirely conceivable that I remember that precisely in the
context of wide points/lines...

>
> On Thu, Mar 10, 2016 at 7:59 PM, Kenneth Graunke  
> wrote:
>> We'll need to use scissoring to restrict fragments to the viewport
>> soon.  It seems harmless to include it generally, so let's do that.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/gen6_scissor_state.c | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_scissor_state.c 
>> b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
>> index 17b4a7f..a206732 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_scissor_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
>> @@ -58,10 +58,10 @@ gen6_upload_scissor_state(struct brw_context *brw)
>> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
>>int bbox[4];
>>
>> -  bbox[0] = 0;
>> -  bbox[1] = fb_width;
>> -  bbox[2] = 0;
>> -  bbox[3] = fb_height;
>> +  bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
>> +  bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
>> +  bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
>> +  bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
>>_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>>
>>if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
>> --
>> 2.7.2
>>


Re: [Mesa-dev] [PATCH 1/5] i965: Move is_drawing_points to brw_state.h.

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 7:59 PM, Kenneth Graunke  wrote:
> I need to use this in multiple source files.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_state.h | 20 
>  src/mesa/drivers/dri/i965/gen6_sf_state.c | 20 
>  2 files changed, 20 insertions(+), 20 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> b/src/mesa/drivers/dri/i965/brw_state.h
> index 6b85eac..9a17b73 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -34,6 +34,7 @@
>  #define BRW_STATE_H
>
>  #include "brw_context.h"
> +#include "brw_defines.h"
>
>  #ifdef __cplusplus
>  extern "C" {
> @@ -406,6 +407,25 @@ void gen7_reset_hw_bt_pool_offsets(struct brw_context 
> *brw);
>  void
>  gen7_restore_default_l3_config(struct brw_context *brw);
>
> +static inline bool
> +is_drawing_points(const struct brw_context *brw)
> +{
> +   /* Determine if the primitives *reaching the SF* are points */
> +   /* _NEW_POLYGON */
> +   if (brw->ctx.Polygon.FrontMode == GL_POINT ||
> +   brw->ctx.Polygon.BackMode == GL_POINT) {
> +  return true;
> +   }
> +
> +   if (brw->geometry_program) {
> +  /* BRW_NEW_GEOMETRY_PROGRAM */
> +  return brw->geometry_program->OutputType == GL_POINTS;

What about TES point_mode? [I know it already didn't account for that...]

Similarly I think that the lines change needs to account for isoline
tessellation output.

  -ilia
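
A rough sketch of what accounting for tessellation could look like, using a
simplified, entirely hypothetical state struct (none of these fields or ex_*
names are the driver's real ones, and the fallback to the draw primitive is
an assumption, since the quoted function is cut off above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ex_prim { EX_POINTS, EX_LINES, EX_TRIANGLES };
enum ex_polygon_mode { EX_FILL, EX_LINE, EX_POINT };

/* Hypothetical, simplified pipeline state. */
struct ex_gs  { enum ex_prim output_type; };
struct ex_tes { bool point_mode; bool isolines; };
struct ex_state {
   enum ex_polygon_mode front_mode, back_mode;
   const struct ex_gs *gs;     /* NULL when no geometry shader is bound */
   const struct ex_tes *tes;   /* NULL when tessellation is not in use */
   enum ex_prim draw_prim;     /* primitive type of the draw call */
};

/* Do the primitives reaching the rasterizer look like points?  The last
 * enabled stage wins: geometry shader, then tessellation, then the draw. */
static bool
ex_is_drawing_points(const struct ex_state *s)
{
   assert(s);
   if (s->front_mode == EX_POINT || s->back_mode == EX_POINT)
      return true;
   if (s->gs)
      return s->gs->output_type == EX_POINTS;
   if (s->tes)
      return s->tes->point_mode;
   return s->draw_prim == EX_POINTS;
}

/* Same idea for lines; point_mode overrides isoline output. */
static bool
ex_is_drawing_lines(const struct ex_state *s)
{
   assert(s);
   if (s->front_mode == EX_LINE || s->back_mode == EX_LINE)
      return true;
   if (s->gs)
      return s->gs->output_type == EX_LINES;
   if (s->tes)
      return s->tes->isolines && !s->tes->point_mode;
   return s->draw_prim == EX_LINES;
}
```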


Re: [Mesa-dev] [PATCH 3/5] i965: Include the viewport in the scissor rectangle.

2016-03-10 Thread Ilia Mirkin
I thought there was some deal where viewport was *not* supposed to act
as a scissor... perhaps I misunderstood though.
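
For reference, the bounding-box arithmetic of the patch quoted below can be
modelled as follows. This is a simplified sketch with integer viewport
fields and made-up ex_* names, not the driver code.

```c
#include <assert.h>

#define EX_MAX2(a, b) ((a) > (b) ? (a) : (b))
#define EX_MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Simplified viewport; integer fields are an assumption of this sketch. */
struct ex_viewport { int x, y, width, height; };

/* Fill bbox as { xmin, xmax, ymin, ymax }, mirroring the quoted patch:
 * clamp the origin to 0, then clamp origin + extent to the framebuffer. */
static void
ex_viewport_bbox(const struct ex_viewport *vp, int fb_width, int fb_height,
                 int bbox[4])
{
   assert(vp && bbox);
   bbox[0] = EX_MAX2(vp->x, 0);
   bbox[1] = EX_MIN2(bbox[0] + vp->width, fb_width);
   bbox[2] = EX_MAX2(vp->y, 0);
   bbox[3] = EX_MIN2(bbox[2] + vp->height, fb_height);
}
```

One thing this model shows: because the right edge is derived from the
already clamped left edge, a viewport with a negative X gets xmax = Width
rather than X + Width. Whether that matters for the real driver is not
something this sketch can settle.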

On Thu, Mar 10, 2016 at 7:59 PM, Kenneth Graunke  wrote:
> We'll need to use scissoring to restrict fragments to the viewport
> soon.  It seems harmless to include it generally, so let's do that.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen6_scissor_state.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen6_scissor_state.c 
> b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
> index 17b4a7f..a206732 100644
> --- a/src/mesa/drivers/dri/i965/gen6_scissor_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
> @@ -58,10 +58,10 @@ gen6_upload_scissor_state(struct brw_context *brw)
> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
>int bbox[4];
>
> -  bbox[0] = 0;
> -  bbox[1] = fb_width;
> -  bbox[2] = 0;
> -  bbox[3] = fb_height;
> +  bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> +  bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> +  bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> +  bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
>_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>
>if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
> --
> 2.7.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965: Stop XY clipping point and line primitives.

2016-03-10 Thread Kenneth Graunke
Wide points and lines are not supposed to be clipped by the viewport.
Rather, they should be rendered, and any fragments outside of the
viewport should be discarded.

The traditional use case for this behavior is rendering moving wide
point particles.  When the center of the point approaches the viewport
edge, clipping would make it pop out of view early.

Fixes:
- dEQP-GLES2.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
- dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center
- dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10698
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_clip_state.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c 
b/src/mesa/drivers/dri/i965/gen6_clip_state.c
index 9a29366..004eceb 100644
--- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
@@ -174,12 +174,14 @@ upload_clip_state(struct brw_context *brw)
else
   enable = GEN6_CLIP_ENABLE;
 
+   if (!is_drawing_points(brw) && !is_drawing_lines(brw))
+  dw2 |= GEN6_CLIP_XY_TEST;
+
BEGIN_BATCH(4);
OUT_BATCH(_3DSTATE_CLIP << 16 | (4 - 2));
OUT_BATCH(dw1);
OUT_BATCH(enable |
 GEN6_CLIP_MODE_NORMAL |
-GEN6_CLIP_XY_TEST |
 dw2);
OUT_BATCH(U_FIXED(0.125, 3) << GEN6_CLIP_MIN_POINT_WIDTH_SHIFT |
  U_FIXED(255.875, 3) << GEN6_CLIP_MAX_POINT_WIDTH_SHIFT |
@@ -195,7 +197,9 @@ const struct brw_tracked_state gen6_clip_state = {
_NEW_TRANSFORM,
   .brw   = BRW_NEW_CONTEXT |
BRW_NEW_FS_PROG_DATA |
+   BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_META_IN_PROGRESS |
+   BRW_NEW_PRIMITIVE |
BRW_NEW_RASTERIZER_DISCARD,
},
.emit = upload_clip_state,
@@ -209,7 +213,9 @@ const struct brw_tracked_state gen7_clip_state = {
_NEW_TRANSFORM,
   .brw   = BRW_NEW_CONTEXT |
BRW_NEW_FS_PROG_DATA |
+   BRW_NEW_GEOMETRY_PROGRAM |
BRW_NEW_META_IN_PROGRESS |
+   BRW_NEW_PRIMITIVE |
BRW_NEW_RASTERIZER_DISCARD,
},
.emit = upload_clip_state,
-- 
2.7.2
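
To illustrate the wide-point case from the commit message above, here is a
minimal sketch (not driver code) of why a test on the point center alone
drops points that still cover visible pixels. All names are hypothetical,
and the point is modeled as an axis-aligned square of side `size`, which is
an assumption of this sketch rather than something the patch states.

```c
#include <stdbool.h>

/* Hypothetical viewport rectangle in window coordinates. */
struct sketch_viewport {
   float x, y, width, height;
};

/* What clipping on the vertex position alone effectively tests. */
static bool
center_in_viewport(struct sketch_viewport vp, float cx, float cy)
{
   return cx >= vp.x && cx < vp.x + vp.width &&
          cy >= vp.y && cy < vp.y + vp.height;
}

/* Does the square covered by a wide point overlap the viewport at all? */
static bool
point_touches_viewport(struct sketch_viewport vp,
                       float cx, float cy, float size)
{
   const float half = size * 0.5f;

   return cx + half > vp.x && cx - half < vp.x + vp.width &&
          cy + half > vp.y && cy - half < vp.y + vp.height;
}
```

A 10-pixel point centered 2 pixels left of the viewport fails the first
test but passes the second, which is the "pops out of view early" case.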



[Mesa-dev] [PATCH 2/5] i965: Introduce an is_drawing_lines() helper.

2016-03-10 Thread Kenneth Graunke
Similar to is_drawing_points().

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_state.h | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 9a17b73..ba448b3 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -426,6 +426,32 @@ is_drawing_points(const struct brw_context *brw)
}
 }
 
+static inline bool
+is_drawing_lines(const struct brw_context *brw)
+{
+   /* Determine if the primitives *reaching the SF* are lines */
+   /* _NEW_POLYGON */
+   if (brw->ctx.Polygon.FrontMode == GL_LINE ||
+   brw->ctx.Polygon.BackMode == GL_LINE) {
+  return true;
+   }
+
+   if (brw->geometry_program) {
+  /* BRW_NEW_GEOMETRY_PROGRAM */
+  return brw->geometry_program->OutputType == GL_LINE_STRIP;
+   } else {
+  /* BRW_NEW_PRIMITIVE */
+  switch (brw->primitive) {
+  case _3DPRIM_LINELIST:
+  case _3DPRIM_LINESTRIP:
+  case _3DPRIM_LINELOOP:
+ return true;
+  }
+   }
+   return false;
+}
+
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.7.2



[Mesa-dev] [PATCH 4/5] i965: Scissor to the viewport when rendering points/lines.

2016-03-10 Thread Kenneth Graunke
We're about to start allowing wide points/lines whose vertices are
outside the viewport past the clipper.  This scissoring hack ensures
that any fragments generated are still restricted to the viewport.

It is not necessary on Gen8+ as those platforms already discard
fragments which are outside the viewport.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 5 +++--
 src/mesa/drivers/dri/i965/gen7_sf_state.c | 8 +---
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 685fbdc..3626686 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -349,8 +349,9 @@ upload_sf_state(struct brw_context *brw)
unreachable("not reached");
}
 
-   /* _NEW_SCISSOR */
-   if (ctx->Scissor.EnableFlags)
+   /* _NEW_SCISSOR _NEW_POLYGON BRW_NEW_GEOMETRY_PROGRAM BRW_NEW_PRIMITIVE */
+   if (ctx->Scissor.EnableFlags ||
+   is_drawing_points(brw) || is_drawing_lines(brw))
   dw3 |= GEN6_SF_SCISSOR_ENABLE;
 
/* _NEW_POLYGON */
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c 
b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index b1f13ac..7c98c73 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -188,8 +188,9 @@ upload_sf_state(struct brw_context *brw)
   dw2 |= GEN6_SF_CULL_NONE;
}
 
-   /* _NEW_SCISSOR */
-   if (ctx->Scissor.EnableFlags)
+   /* _NEW_SCISSOR _NEW_POLYGON BRW_NEW_GEOMETRY_PROGRAM BRW_NEW_PRIMITIVE */
+   if (ctx->Scissor.EnableFlags ||
+   is_drawing_points(brw) || is_drawing_lines(brw))
   dw2 |= GEN6_SF_SCISSOR_ENABLE;
 
/* _NEW_LINE */
@@ -254,7 +255,8 @@ const struct brw_tracked_state gen7_sf_state = {
_NEW_POLYGON |
_NEW_PROGRAM |
_NEW_SCISSOR,
-  .brw   = BRW_NEW_CONTEXT,
+  .brw   = BRW_NEW_CONTEXT |
+   BRW_NEW_PRIMITIVE,
},
.emit = upload_sf_state,
 };
-- 
2.7.2



[Mesa-dev] [PATCH 3/5] i965: Include the viewport in the scissor rectangle.

2016-03-10 Thread Kenneth Graunke
We'll need to use scissoring to restrict fragments to the viewport
soon.  It seems harmless to include it generally, so let's do that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94453
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94454
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_scissor_state.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_scissor_state.c 
b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
index 17b4a7f..a206732 100644
--- a/src/mesa/drivers/dri/i965/gen6_scissor_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_scissor_state.c
@@ -58,10 +58,10 @@ gen6_upload_scissor_state(struct brw_context *brw)
for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
   int bbox[4];
 
-  bbox[0] = 0;
-  bbox[1] = fb_width;
-  bbox[2] = 0;
-  bbox[3] = fb_height;
+  bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
+  bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
+  bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
+  bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
   _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
   if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
-- 
2.7.2
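
The rectangle computed in the hunk above is the intersection of the
viewport with the framebuffer. A standalone sketch of that intersection
follows; the struct and function names are hypothetical, and the viewport
is taken as integers for simplicity. Note one deliberate difference: the
sketch derives the right and top edges from the unclamped origin, whereas
the patch adds Width/Height to the already clamped bbox[0]/bbox[2], so the
two disagree when X or Y is negative.

```c
#include <stdbool.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Same ordering as bbox[] in the patch: xmin, xmax, ymin, ymax. */
struct sketch_bbox {
   int xmin, xmax, ymin, ymax;
};

/* Intersect the viewport rectangle with [0, fb_w) x [0, fb_h). */
static struct sketch_bbox
viewport_bbox(int vp_x, int vp_y, int vp_w, int vp_h, int fb_w, int fb_h)
{
   struct sketch_bbox b;

   b.xmin = MAX2(vp_x, 0);
   b.xmax = MIN2(vp_x + vp_w, fb_w);
   b.ymin = MAX2(vp_y, 0);
   b.ymax = MIN2(vp_y + vp_h, fb_h);
   return b;
}

/* A viewport entirely outside the framebuffer yields min >= max. */
static bool
bbox_is_empty(struct sketch_bbox b)
{
   return b.xmin >= b.xmax || b.ymin >= b.ymax;
}
```

With a viewport at X = -10 and Width = 20, the sketch gives [0, 10),
while clamping first and adding the width afterwards would give [0, 20).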



[Mesa-dev] [PATCH 1/5] i965: Move is_drawing_points to brw_state.h.

2016-03-10 Thread Kenneth Graunke
I need to use this in multiple source files.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_state.h | 20 
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 20 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 6b85eac..9a17b73 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -34,6 +34,7 @@
 #define BRW_STATE_H
 
 #include "brw_context.h"
+#include "brw_defines.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -406,6 +407,25 @@ void gen7_reset_hw_bt_pool_offsets(struct brw_context 
*brw);
 void
 gen7_restore_default_l3_config(struct brw_context *brw);
 
+static inline bool
+is_drawing_points(const struct brw_context *brw)
+{
+   /* Determine if the primitives *reaching the SF* are points */
+   /* _NEW_POLYGON */
+   if (brw->ctx.Polygon.FrontMode == GL_POINT ||
+   brw->ctx.Polygon.BackMode == GL_POINT) {
+  return true;
+   }
+
+   if (brw->geometry_program) {
+  /* BRW_NEW_GEOMETRY_PROGRAM */
+  return brw->geometry_program->OutputType == GL_POINTS;
+   } else {
+  /* BRW_NEW_PRIMITIVE */
+  return brw->primitive == _3DPRIM_POINTLIST;
+   }
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 2634e6b..685fbdc 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -147,26 +147,6 @@ get_attr_override(const struct brw_vue_map *vue_map, int 
urb_entry_read_offset,
 }
 
 
-static bool
-is_drawing_points(const struct brw_context *brw)
-{
-   /* Determine if the primitives *reaching the SF* are points */
-   /* _NEW_POLYGON */
-   if (brw->ctx.Polygon.FrontMode == GL_POINT ||
-   brw->ctx.Polygon.BackMode == GL_POINT) {
-  return true;
-   }
-
-   if (brw->geometry_program) {
-  /* BRW_NEW_GEOMETRY_PROGRAM */
-  return brw->geometry_program->OutputType == GL_POINTS;
-   } else {
-  /* BRW_NEW_PRIMITIVE */
-  return brw->primitive == _3DPRIM_POINTLIST;
-   }
-}
-
-
 /**
  * Create the mapping from the FS inputs we produce to the previous pipeline
  * stage (GS or VS) outputs they source from.
-- 
2.7.2



Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-10 Thread Jason Ekstrand
On Thu, Mar 10, 2016 at 4:00 PM, Connor Abbott  wrote:

> On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > From: Jason Ekstrand 
> >
> > v2: Fix size/type mask to properly handle 8-bit types.
> >
> > Signed-off-by: Juan A. Suarez Romero 
> > ---
> >  src/compiler/nir/nir.h | 17 -
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> > index cccb3a4..659e98c 100644
> > --- a/src/compiler/nir/nir.h
> > +++ b/src/compiler/nir/nir.h
> > @@ -605,9 +605,24 @@ typedef enum {
> > nir_type_float,
> > nir_type_int,
> > nir_type_uint,
> > -   nir_type_bool
> > +   nir_type_bool,
> > +   nir_type_bool32 =32 | nir_type_bool,
> > +   nir_type_int8 =  8  | nir_type_int,
> > +   nir_type_int16 = 16 | nir_type_int,
> > +   nir_type_int32 = 32 | nir_type_int,
> > +   nir_type_int64 = 64 | nir_type_int,
> > +   nir_type_uint8 = 8  | nir_type_uint,
> > +   nir_type_uint16 =16 | nir_type_uint,
> > +   nir_type_uint32 =32 | nir_type_uint,
> > +   nir_type_uint64 =64 | nir_type_uint,
> > +   nir_type_float16 =   16 | nir_type_float,
> > +   nir_type_float32 =   32 | nir_type_float,
> > +   nir_type_float64 =   64 | nir_type_float,
> >  } nir_alu_type;
> >
> > +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
> > +#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007
>
> So I'm not really the one to be reviewing this series (after all, I
> wrote most of it :) ) but one thing that I never quite liked, and
> didn't get around to fixing, is how we use these raw constants all
> over the place. Perhaps we could make things more readable by adding
> nir_get_sized_type(), nir_get_unsized_type(), and nir_type_size()
> helpers and then use those instead of or-ing/and-ing things together
> everywhere.
>

Agreed.


>
> > +
> >  typedef enum {
> > NIR_OP_IS_COMMUTATIVE = (1 << 0),
> > NIR_OP_IS_ASSOCIATIVE = (1 << 1),
> > --
> > 2.7.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 01/14] nir: Add explicitly sized types

2016-03-10 Thread Connor Abbott
On Mon, Mar 7, 2016 at 3:45 AM, Samuel Iglesias Gonsálvez
 wrote:
> From: Jason Ekstrand 
>
> v2: Fix size/type mask to properly handle 8-bit types.
>
> Signed-off-by: Juan A. Suarez Romero 
> ---
>  src/compiler/nir/nir.h | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index cccb3a4..659e98c 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -605,9 +605,24 @@ typedef enum {
> nir_type_float,
> nir_type_int,
> nir_type_uint,
> -   nir_type_bool
> +   nir_type_bool,
> +   nir_type_bool32 =32 | nir_type_bool,
> +   nir_type_int8 =  8  | nir_type_int,
> +   nir_type_int16 = 16 | nir_type_int,
> +   nir_type_int32 = 32 | nir_type_int,
> +   nir_type_int64 = 64 | nir_type_int,
> +   nir_type_uint8 = 8  | nir_type_uint,
> +   nir_type_uint16 =16 | nir_type_uint,
> +   nir_type_uint32 =32 | nir_type_uint,
> +   nir_type_uint64 =64 | nir_type_uint,
> +   nir_type_float16 =   16 | nir_type_float,
> +   nir_type_float32 =   32 | nir_type_float,
> +   nir_type_float64 =   64 | nir_type_float,
>  } nir_alu_type;
>
> +#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
> +#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007

So I'm not really the one to be reviewing this series (after all, I
wrote most of it :) ) but one thing that I never quite liked, and
didn't get around to fixing, is how we use these raw constants all
over the place. Perhaps we could make things more readable by adding
nir_get_sized_type(), nir_get_unsized_type(), and nir_type_size()
helpers and then use those instead of or-ing/and-ing things together
everywhere.
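
For concreteness, the helpers suggested above might look roughly like the
sketch below. It is standalone: the enum is a local stand-in for the one in
the patch (the explicit `nir_type_invalid = 0` entry and the resulting base
values are assumptions, and only a few sized types are listed), and the
helper bodies are one possible reading of the suggestion, not code from the
series.

```c
#include <assert.h>

/* Local stand-in for the enum in the patch; base values are assumed. */
typedef enum {
   nir_type_invalid = 0,
   nir_type_float,
   nir_type_int,
   nir_type_uint,
   nir_type_bool,
   nir_type_bool32  = 32 | nir_type_bool,
   nir_type_int8    = 8  | nir_type_int,
   nir_type_uint16  = 16 | nir_type_uint,
   nir_type_float64 = 64 | nir_type_float,
} nir_alu_type;

#define NIR_ALU_TYPE_SIZE_MASK 0xfff8
#define NIR_ALU_TYPE_BASE_TYPE_MASK 0x0007

/* Bit size encoded in the type, or 0 for an unsized type. */
static inline unsigned
nir_type_size(nir_alu_type type)
{
   return type & NIR_ALU_TYPE_SIZE_MASK;
}

/* Strip the size bits, leaving float/int/uint/bool. */
static inline nir_alu_type
nir_get_unsized_type(nir_alu_type type)
{
   return (nir_alu_type)(type & NIR_ALU_TYPE_BASE_TYPE_MASK);
}

/* Combine an unsized base type with a bit size. */
static inline nir_alu_type
nir_get_sized_type(nir_alu_type base, unsigned bit_size)
{
   assert((base & NIR_ALU_TYPE_SIZE_MASK) == 0);
   assert(bit_size == 8 || bit_size == 16 ||
          bit_size == 32 || bit_size == 64);
   return (nir_alu_type)(base | bit_size);
}
```

Callers would then write nir_get_sized_type(nir_type_float, 64) rather
than or-ing the constants by hand.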

> +
>  typedef enum {
> NIR_OP_IS_COMMUTATIVE = (1 << 0),
> NIR_OP_IS_ASSOCIATIVE = (1 << 1),
> --
> 2.7.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-10 Thread Connor Abbott
On Thu, Mar 10, 2016 at 9:30 AM, tournier.elie  wrote:
> First, thank you all for your answers.
>
> So if I summarize what was said, we need
> Ian:
>  - add
>  - negate
>  - absolute value
>  - multiply
>  - reciprocal
>  - convert to single precision
>  - convert from single precision
> Roland:
>  - sqrt
>  - comparaison (< / == / >)
>  - floor/ceil

One thing to note is that since Intel hw doesn't support reciprocal,
sqrt, rsqrt, or floor/ceil/truncate for doubles, our fp64
implementation (not merged yet, but soon will be) already includes
routines for emulating those things:
https://github.com/Igalia/mesa/blob/i965-fp64/src/compiler/nir/nir_lower_double_ops.c
and I suspect that since we're using the GLSL precision rules and
relying on the presence of 32-bit floating point operations (which
basically all GPUs support), it's a lot simpler than most other
softfloat libraries. To use this, you'd have to port it from NIR to
GLSL, but that shouldn't be too difficult. After that, the only thing
you'd need to implement would be add, multiply, absolute value,
negate, convert to/from single precision, and comparison.
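
As a rough idea of how small the sign-only operations are, here is a sketch
in C rather than GLSL. The struct stands in for a uvec2; putting the high
32 bits (sign, exponent, top of the fraction) in `.y` is an assumption of
this sketch, as are the function names, and the pack/unpack helpers assume
the host `double` is IEEE 754 binary64.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a GLSL uvec2: x = low 32 bits, y = high 32 bits (assumed). */
struct sketch_uvec2 {
   uint32_t x, y;
};

/* Host-side helper for testing: split a double into two 32-bit words. */
static struct sketch_uvec2
pack_double(double d)
{
   uint64_t bits;
   struct sketch_uvec2 v;

   memcpy(&bits, &d, sizeof(bits));
   v.x = (uint32_t)bits;
   v.y = (uint32_t)(bits >> 32);
   return v;
}

/* Host-side helper for testing: rebuild the double from the two words. */
static double
unpack_double(struct sketch_uvec2 v)
{
   uint64_t bits = ((uint64_t)v.y << 32) | v.x;
   double d;

   memcpy(&d, &bits, sizeof(d));
   return d;
}

/* Negate: flip the sign bit, which is the top bit of the high word. */
static struct sketch_uvec2
negDouble(struct sketch_uvec2 a)
{
   a.y ^= 0x80000000u;
   return a;
}

/* Absolute value: clear the sign bit. */
static struct sketch_uvec2
absDouble(struct sketch_uvec2 a)
{
   a.y &= 0x7fffffffu;
   return a;
}
```

Add, multiply, conversion, and comparison need real exponent and fraction
handling and are where most of the work would be.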

> I will contact Pat Brown (his name appears in the contact field in [1]) to
> find out whether we need the functions below to implement gpu_shader_fp64.
>  - pow
>  - exp
>  - log
>
> About the license
>
> Like I mentioned in the project description, there are quite a few
> existing C implementations of these functions.  Finding one of those
> that you can understand and that has a compatible license is probably
> the best place to start.
>
> Main Mesa code is under MIT license.
> If I choose to use a GNU GPL-licensed file, such as the one from the Linux
> kernel [3], my code must be under the GNU GPL, and probably the whole
> project too. Am I right?
>
> [1] https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt
> [2] http://www.mesa3d.org/license.html
> [3]
> https://github.com/torvalds/linux/blob/097f70b3c4d84ffccca15195bdfde3a37c0a7c0f/arch/arm/nwfpe/softfloat.c
>
> 2016-03-10 2:18 GMT+01:00 Roland Scheidegger :
>>
>> Am 09.03.2016 um 23:51 schrieb Ian Romanick:
>> > On 03/09/2016 02:25 AM, tournier.elie wrote:
>> >> Hi everyone.
>> >>
>> >> My name is Elie TOURNIER, I am enrolled in a French Engineering school
>> >> (Telecom Physique Strasbourg) specialized in Medical ICT.
>> >> I'm interested in implementing "Soft" double precision floating point
>> >> support [1].
>> >> Taking on this subject seems to be a good way to get my feet wet in the
>> >> Mesa code and discover how some of its components work.
>> >>
>> >> I am writing to introduce myself, and also to gather valuable
>> >> information for the success of this project.
>> >>
>> >> I would like to know more about the following things to understand your
>> >> requirements:
>> >> 1- "/Each double precision value would be stored in a uvec2/" The IEEE
>> >> double precision floating point standard representation requires 64
>> >> bits: 1 for sign, 11 for exponent and the others for fraction [2].
>> >> -> How must a double precision value be stored?
>> >
>> > As Emil mentioned, on GLSL 1.30, a uvec2 consists of two, 32-bit
>> > unsigned integers.  Each double precision value would be stored in a
>> > uvec2.
>> >
>> >> 2- Where can I find GL_ARB_gpu_shader_fp64 documentation?
>> >>
>> >>
>> >> This is my first exposure to Mesa. Please excuse me if I am asking
>> >> basic
>> >> questions.
>> >
>> > For this particular project, you wouldn't need Mesa at all for quite
>> > some time.  All of the initial project should be done in "raw" GLSL
>> > 1.30, and any OpenGL implementation capable of GLSL 1.30 can be used.
>> > You would implement (and test!) a library of functions like 'uvec2
>> > addDouble(uvec2 a, uvec2 b)' that would provide all of the required
>> > double precision operations.
>> >
>> > The set of required functions should be pretty small.  I think:
>> >
>> >  - add
>> >  - negate
>> >  - absolute value
>> >  - multiply
>> >  - reciprocal
>> >  - convert to single precision
>> >  - convert from single precision
>> >  - pow (maybe?)
>> >  - exp (maybe?)
>> >  - log (maybe?)
>>
>> I don't think you need exp/log. At least GLSL doesn't require it, though
>> the project isn't clear about it.
>> (All hw I know of, with exactly one exception (that would be Intel
>> graphics...), implements pow as log2/mul/exp2 even for f32 anyway.)
>> I think though you need sqrt (or rsqrt). And some functions for
>> rounding, plus comparison operations. Maybe min/max too (albeit if you
>> have comparisons you can emulate them of course).
>>
>> Roland
>>
>>
>> >
>> > I think everything else could be implemented using those functions.
>> >
>> > Like I mentioned in the project description, there are quite a few
>> > existing C implementations of these functions.  Finding one of those
>> > that you can understand and that has a compatible license is probably
>> > the best place to start.
>> >
>> >> Please point me to the right 

Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Nicolai Hähnle

On 10.03.2016 12:50, Glenn Kennard wrote:
> On Thu, 10 Mar 2016 18:13:03 +0100, Ilia Mirkin wrote:
>> On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard wrote:
>>> On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin wrote:
>>>> On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle wrote:
>>>>>> -   if (c->MaxCombinedAtomicBuffers > 0)
>>>>>> +   if (c->MaxCombinedAtomicBuffers > 0) {
>>>>>>        extensions->ARB_shader_atomic_counters = GL_TRUE;
>>>>>> +      extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
>>>>>> +   }
>>>>>
>>>>> I believe there's pre-GCN AMD hardware which can support atomic
>>>>> counters but not atomic_counter_ops (at least according to what the
>>>>> closed driver exposes, I haven't actually checked the docs), so there
>>>>> should probably be a capability flag here.
>>>>
>>>> I assumed this was due to laziness... seems odd if the SSBO atomic ops
>>>> can be supported, but those same ops can't be supported on atomic
>>>> buffers. Glenn / Dave - do you guys happen to know what the pre-GCN hw
>>>> is capable of?
>>>>
>>>>   -ilia
>>>
>>> AFAIK Cayman supports atomic counter ops on SSBOs, evergreen only on
>>> counter buffers, and earlier hardware does neither.
>>
>> To phrase this a different way, my patch is fine? :) If you support
>> atomic counters, you support all the various ops in
>> ARB_shader_atomic_counter_ops (which are basically all the SSBO ops,
>> but on atomic counters)?
>
> I think so, though the closed driver only exposes
> ARB_shader_atomic_counter_ops on Cayman only which may be a hint to
> something. Cross that bridge when we get there...

Fair enough. Ilia, I did take a look at the other parts of patch #2, so
feel free to add my R-b there as well.

Cheers,
Nicolai

> /Glenn


[Mesa-dev] [PATCH] st/mesa: check that the image unit is valid in st_bind_images

2016-03-10 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/mesa/state_tracker/st_atom_image.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_atom_image.c 
b/src/mesa/state_tracker/st_atom_image.c
index d0f0c42..bf7486b 100644
--- a/src/mesa/state_tracker/st_atom_image.c
+++ b/src/mesa/state_tracker/st_atom_image.c
@@ -25,6 +25,7 @@
  **/
 
 #include "main/imports.h"
+#include "main/shaderimage.h"
 #include "program/prog_parameter.h"
 #include "program/prog_print.h"
 #include "compiler/glsl/ir_uniform.h"
@@ -60,7 +61,7 @@ st_bind_images(struct st_context *st, struct gl_shader 
*shader,
   struct st_texture_object *stObj = st_texture_object(u->TexObj);
   struct pipe_image_view *img = &images[i];
 
-  if (!stObj ||
+  if (!_mesa_is_image_unit_valid(st->ctx, u) ||
   !st_finalize_texture(st->ctx, st->pipe, u->TexObj) ||
   !stObj->pt) {
  memset(img, 0, sizeof(*img));
-- 
2.5.0



[Mesa-dev] [PATCH] get: reconcile aliasing enums for MaxCombinedShaderOutputResources

2016-03-10 Thread Nicolai Hähnle
From: Nicolai Hähnle 

The enums MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS and
MAX_COMBINED_SHADER_OUTPUT_RESOURCES are equal and should therefore only
appear once.

Noticed while implementing ARB_shader_image_load_store without previously
implementing SSBO.
---
 src/mesa/main/get.c  | 7 +++
 src/mesa/main/get_hash_params.py | 6 --
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 67c4f99..b0fadc9 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -384,6 +384,13 @@ static const int extra_ARB_shader_storage_buffer_object_and_geometry_shader[] =
    EXTRA_END
 };
 
+static const int extra_ARB_shader_image_load_store_shader_storage_buffer_object_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXT(ARB_shader_storage_buffer_object),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 static const int extra_ARB_framebuffer_no_attachments_and_geometry_shader[] = {
EXTRA_EXT_FB_NO_ATTACH_GS,
EXTRA_END
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index f9d66f8..12c2189 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -495,9 +495,12 @@ descriptor=[
   [ "MAX_COMBINED_SHADER_STORAGE_BLOCKS", "CONTEXT_INT(Const.MaxCombinedShaderStorageBlocks), extra_ARB_shader_storage_buffer_object_es31" ],
   [ "MAX_SHADER_STORAGE_BLOCK_SIZE", "CONTEXT_INT(Const.MaxShaderStorageBlockSize), extra_ARB_shader_storage_buffer_object_es31" ],
   [ "MAX_SHADER_STORAGE_BUFFER_BINDINGS", "CONTEXT_INT(Const.MaxShaderStorageBufferBindings), extra_ARB_shader_storage_buffer_object_es31" ],
-  [ "MAX_COMBINED_SHADER_OUTPUT_RESOURCES", "CONTEXT_INT(Const.MaxCombinedShaderOutputResources), extra_ARB_shader_storage_buffer_object_es31" ],
   [ "SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT", "CONTEXT_INT(Const.ShaderStorageBufferOffsetAlignment), extra_ARB_shader_storage_buffer_object_es31" ],
   [ "SHADER_STORAGE_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_shader_storage_buffer_object_es31" ],
+
+  # GL_ARB_shader_image_load_store / GL_ARB_shader_storage_buffer_object / GLES 3.1
+  # (MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS in GL_ARB_shader_image_load_store)
+  [ "MAX_COMBINED_SHADER_OUTPUT_RESOURCES", "CONTEXT_INT(Const.MaxCombinedShaderOutputResources), extra_ARB_shader_image_load_store_shader_storage_buffer_object_es31" ],
 ]},
 
 # Enums in OpenGL Core profile and ES 3.1
@@ -841,7 +844,6 @@ descriptor=[
   [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", "CONTEXT_INT(Const.MaxProgramTextureGatherComponents), extra_ARB_texture_gather"],
 
 # GL_ARB_shader_image_load_store
-  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedShaderOutputResources), extra_ARB_shader_image_load_store" ],
   [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store" ],
 
 # GL_EXT_polygon_offset_clamp
-- 
2.5.0



Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-10 Thread Dylan Baker
Quoting Marek Olšák (2016-03-10 06:57:57)
> On Thu, Mar 10, 2016 at 3:30 PM, tournier.elie  
> wrote:
> > First, thank you all for your answers.
> >
> > So if I summarize what was said, we need
> > Ian:
> >  - add
> >  - negate
> >  - absolute value
> >  - multiply
> >  - reciprocal
> >  - convert to single precision
> >  - convert from single precision
> > Roland:
> >  - sqrt
> >  - comparison (< / == / >)
> >  - floor/ceil
> > I will contact Pat Brown (his name appears in the contact field in [1]) to
> > find out whether we need the functions below to implement gpu_shader_fp64.
> >  - pow
> >  - exp
> >  - log
> >
> > About the license
> >
> > Like I mentioned in the project description, there are quite a few
> > existing C implementations of these functions.  Finding one of those
> > that you can understand and that has a compatible license is probably
> > the best place to start.
> >
> > Main Mesa code is under MIT license.
> > If I choose to use a GNU GPL-licensed file, such as the one from the Linux
> > kernel [3], my code must be under the GNU GPL, and probably the whole
> > project too. Am I right?
> >
> > [1] https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt
> > [2] http://www.mesa3d.org/license.html
> > [3]
> > https://github.com/torvalds/linux/blob/097f70b3c4d84ffccca15195bdfde3a37c0a7c0f/arch/arm/nwfpe/softfloat.c
> 
> You can't use GNU GPL for this project.
> 
> The kernel as a whole is licensed under GNU GPL, but some source files
> aren't. The file you linked doesn't mention GNU GPL. Somebody needs to
> verify that the file you linked can be legally re-licensed under the
> MIT license. If not, I think you have to forget the contents of the
> file immediately, but I'm not a lawyer.
> 
> Marek
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Most BSD style licenses are legally compatible, as long as none of the
developers object. One of the BSD kernels should have a softfloat
implementation that would be license compatible.




Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 3:08 PM, Patrick Baggett
 wrote:
> On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
>> From: Ian Romanick 
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (-1.06%)
>> helped: 6
>> HURT: 0
>>
>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>> helped: 12
>> HURT: 0
>>
>> Broadwell / Skylake
>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>> instructions in affected programs: 626 -> 619 (-1.12%)
>> helped: 7
>> HURT: 0
>>
>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>> helped: 12
>> HURT: 0
>>
>> G45 and Ironlake showed no change.
>>
>> Signed-off-by: Ian Romanick 
>> ---
>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index 4db8f84..1442ce8 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -108,6 +108,11 @@ optimizations = [
>> # inot(a)
>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>
>> +   # 0.0 < fabs(a)
>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
> I think this is wrong. Because >= 0.0 can mean that fabs(a) == 0.0 for
> some a, you can't say then fabs(a) != 0.0.
>
> Then, the counter-example is when a = 0.0
>
> 1) 0.0 != fabs(0.0)
> 2) 0.0 != 0.0
>
Rather, I mean the comment is wrong, but the conclusion that:
0 < fabs(a) <-> a != 0.0
is correct. You can just build a truth table or just observe that when
a == 0, 0 < 0 is false, and
when a != 0.0, fabs(a) will be > 0, so 0 < fabs(a) will be always true.
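
The truth-table argument can also be checked mechanically. The sketch below
does so with plain C doubles; the function names are hypothetical. It
treats NaN separately, because in C `0.0 < fabs(NaN)` is false while
`NaN != 0.0` is true. Whether that matters for the NIR rule depends on how
the NIR comparison opcodes treat NaN, which this thread does not state.

```c
#include <math.h>
#include <stdbool.h>

/* Left-hand side of the rewrite: 0.0 < fabs(a). */
static bool
flt_zero_fabs(double a)
{
   return 0.0 < fabs(a);
}

/* Right-hand side of the rewrite: a != 0.0. */
static bool
fne_zero(double a)
{
   return a != 0.0;
}

/* True when both sides of the rewrite give the same answer for a. */
static bool
rewrite_agrees(double a)
{
   return flt_zero_fabs(a) == fne_zero(a);
}
```

Both sides agree for +0.0, -0.0, finite non-zero values, and infinities;
with C comparison semantics they disagree only for NaN.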



>> +   # 0.0 != a
>
>
>
>
>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>> +
>> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
>> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>> --
>> 2.5.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/10] i965: Silence loop counter overflow warning

2016-03-10 Thread Eirik Byrkjeflot Anonsen
Ian Romanick  writes:

> From: Ian Romanick 
>
> I don't understand why the old code was bad, but the new code is fine.

Probably because the *loop counter* can no longer overflow. Thus the
loop can be optimized. The fact that "i" might overflow has become
irrelevant to the warning.

(And from that perspective, it isn't equivalent. If "i" overflows in the
original code, you would get an infinite loop.)

eirik


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Patrick Baggett
On Thu, Mar 10, 2016 at 12:25 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
> cycles in affected programs: 9768 -> 9582 (-1.90%)
> helped: 12
> HURT: 0
>
> Broadwell / Skylake
> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
> instructions in affected programs: 626 -> 619 (-1.12%)
> helped: 7
> HURT: 0
>
> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
> cycles in affected programs: 9378 -> 9192 (-1.98%)
> helped: 12
> HURT: 0
>
> G45 and Ironlake showed no change.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 4db8f84..1442ce8 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -108,6 +108,11 @@ optimizations = [
> # inot(a)
> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>
> +   # 0.0 < fabs(a)
> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
I think this comment is wrong. fabs(a) >= 0.0 still allows fabs(a) == 0.0
for some a, so you can't conclude that fabs(a) != 0.0.

The counter-example is a = 0.0:

1) 0.0 != fabs(0.0)
2) 0.0 != 0.0, which is false

> +   # 0.0 != a




> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
> +
> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> --
> 2.5.0
>


[Mesa-dev] [PATCH] squash: Fix up VPM read optimization.

2016-03-10 Thread Eric Anholt
- There's no reason there would be only 64 operations that read from the
  output of a mov from VPM, so we might smash the stack (fixes etqw trace)

- Fixes segfault where we assumed that a single-use temp had a def (fixes
  2 piglit tests)

- We need to only mark progress when we actually did the optimization, or
  we'll infinite loop (0ad trace).

- Misc style fixes.

- No reordering sampler instructions (fixes a glean test)

shader-db results:
total instructions in shared programs: 78513 -> 78071 (-0.56%)
instructions in affected programs: 10406 -> 9964 (-4.25%)
total estimated cycles in shared programs: 234674 -> 234274 (-0.17%)
estimated cycles in affected programs: 35188 -> 34788 (-1.14%)
---

Varad, here's what I came up with trying to test your patch.  If these
changes look good to you, I can squash them in and push.
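
The point in the list above about only marking progress when the optimization actually fired can be sketched with a toy pass. This is not QIR or vc4 code; the pass, the driver loop and the pass limit are all made up for illustration.

```python
def fold_double_negation(values):
    """Toy pass: strip one leading '--' pair from each string."""
    progress = False
    for idx, v in enumerate(values):
        if v.startswith("--"):
            values[idx] = v[2:]
            progress = True  # set only when something was rewritten
    return progress

def run_to_fixed_point(values, max_passes=100):
    """Repeat the toy pass until it reports no progress."""
    passes = 0
    progress = True
    while progress:
        passes += 1
        if passes > max_passes:
            # A pass that reports progress without changing anything
            # would end up here instead of converging.
            raise RuntimeError("pass keeps reporting progress")
        progress = fold_double_negation(values)
    return passes

exprs = ["----x", "-y", "z"]
print(run_to_fixed_point(exprs), exprs)
```

If the pass returned True unconditionally, the driver would never reach a fixed point, which is the infinite loop described above.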

 src/gallium/drivers/vc4/vc4_opt_vpm.c | 45 +--
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_opt_vpm.c 
b/src/gallium/drivers/vc4/vc4_opt_vpm.c
index 277b345..a4ee6af 100644
--- a/src/gallium/drivers/vc4/vc4_opt_vpm.c
+++ b/src/gallium/drivers/vc4/vc4_opt_vpm.c
@@ -40,10 +40,8 @@ qir_opt_vpm(struct vc4_compile *c)
 
 bool progress = false;
 struct qinst *vpm_writes[64] = { 0 };
-struct qinst *vpm_reads[64] = { 0 };
 uint32_t use_count[c->num_temps];
 uint32_t vpm_write_count = 0;
-uint32_t vpm_read_count = 0;
 memset(&use_count, 0, sizeof(use_count));
 
 list_for_each_entry(struct qinst, inst, &c->instructions, link) {
@@ -59,24 +57,14 @@ qir_opt_vpm(struct vc4_compile *c)
 if (inst->src[i].file == QFILE_TEMP) {
 uint32_t temp = inst->src[i].index;
 use_count[temp]++;
-
-struct qinst *mov = c->defs[temp];
-if (!mov ||
-(mov->op != QOP_MOV &&
-mov->op != QOP_FMOV &&
-mov->op != QOP_MMOV)) {
-continue;
-}
-
-if (mov->src[0].file == QFILE_VPM)
-vpm_reads[vpm_read_count++] = inst;
 }
 }
 }
 
-for (int i = 0; i < vpm_read_count; i++) {
-struct qinst *inst = vpm_reads[i];
-
+/* For instructions reading from a temporary that contains a VPM read
+ * result, try to move the instruction up in place of the VPM read.
+ */
+list_for_each_entry(struct qinst, inst, &c->instructions, link) {
 if (!inst || qir_is_multi_instruction(inst))
 continue;
 
@@ -84,21 +72,32 @@ qir_opt_vpm(struct vc4_compile *c)
 continue;
 
 if (qir_has_side_effects(c, inst) ||
-qir_has_side_effect_reads(c, inst))
+qir_has_side_effect_reads(c, inst) ||
+qir_is_tex(inst))
 continue;
 
 for (int j = 0; j < qir_get_op_nsrc(inst->op); j++) {
-if(inst->src[j].file != QFILE_TEMP)
+if (inst->src[j].file != QFILE_TEMP)
 continue;
 
 uint32_t temp = inst->src[j].index;
+
+/* Since VPM reads pull from a FIFO, we only get to
+ * read each VPM entry once (unless we reset the read
+ * pointer).  That means we can't copy-propagate a VPM
+ * read to multiple locations.
+ */
 if (use_count[temp] != 1)
 continue;
 
 struct qinst *mov = c->defs[temp];
-
-if (mov->src[0].file != QFILE_VPM)
+if (!mov ||
+(mov->op != QOP_MOV &&
+ mov->op != QOP_FMOV &&
+ mov->op != QOP_MMOV) ||
+mov->src[0].file != QFILE_VPM) {
 continue;
+}
 
 uint32_t temps = 0;
 for (int k = 0; k < qir_get_op_nsrc(inst->op); k++) {
@@ -109,15 +108,15 @@ qir_opt_vpm(struct vc4_compile *c)
 /* The instruction is safe to reorder if its other
  * sources are independent of previous instructions
  */
-if (temps == 1 ) {
+if (temps == 1) {
 list_del(&inst->link);
 inst->src[j] = mov->src[0];
 

Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 3:24 PM, Ilia Mirkin  wrote:
> On Thu, Mar 10, 2016 at 1:25 PM, Ian Romanick  wrote:
>> From: Ian Romanick 
>>
>> Sandy Bridge / Ivy Bridge / Haswell
>> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
>> instructions in affected programs: 564 -> 558 (-1.06%)
>> helped: 6
>> HURT: 0
>>
>> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
>> cycles in affected programs: 9768 -> 9582 (-1.90%)
>> helped: 12
>> HURT: 0
>>
>> Broadwell / Skylake
>> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
>> instructions in affected programs: 626 -> 619 (-1.12%)
>> helped: 7
>> HURT: 0
>>
>> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
>> cycles in affected programs: 9378 -> 9192 (-1.98%)
>> helped: 12
>> HURT: 0
>>
>> G45 and Ironlake showed no change.
>>
>> Signed-off-by: Ian Romanick 
>> ---
>>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index 4db8f84..1442ce8 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -108,6 +108,11 @@ optimizations = [
>> # inot(a)
>> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>>
>> +   # 0.0 < fabs(a)
>> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
>> +   # 0.0 != a
>> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
>
> Can you walk me through the logic? You're saying that
>
> 0 < |a| <-> a != 0
>
> If a == 0, 0 < |a| would still be false, no? I think that
>
> 0 < |a| <-> false
> 0 <= |a| <-> a != 0
>
> But I could just be missing something obvious...

Yeah. Never mind. I'm tired. Ignore this.

  -ilia


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 1:25 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
> cycles in affected programs: 9768 -> 9582 (-1.90%)
> helped: 12
> HURT: 0
>
> Broadwell / Skylake
> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
> instructions in affected programs: 626 -> 619 (-1.12%)
> helped: 7
> HURT: 0
>
> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
> cycles in affected programs: 9378 -> 9192 (-1.98%)
> helped: 12
> HURT: 0
>
> G45 and Ironlake showed no change.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 4db8f84..1442ce8 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -108,6 +108,11 @@ optimizations = [
> # inot(a)
> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>
> +   # 0.0 < fabs(a)
> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
> +   # 0.0 != a
> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),

Can you walk me through the logic? You're saying that

0 < |a| <-> a != 0

If a == 0, 0 < |a| would still be false, no? I think that

0 < |a| <-> false
0 <= |a| <-> a != 0

But I could just be missing something obvious...

  -ilia

> +
> (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
> (('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
> (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> --
> 2.5.0
>


Re: [Mesa-dev] [PATCH 1/4] glcpp: Implicitly resolve version after the first non-space/hash token.

2016-03-10 Thread Kenneth Graunke
On Wednesday, March 9, 2016 3:18:50 PM PST Jon Turney wrote:
> On 05/03/2016 03:33, Kenneth Graunke wrote:
> > We resolved the implicit version directive when processing control lines,
> > such as #ifdef, to ensure any built-in macros exist.  However, we failed
> > to resolve it when handling ordinary text.
> [...]
> > diff --git a/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> > new file mode 100644
> > index 000..2872090
> > --- /dev/null
> > +++ b/src/compiler/glsl/glcpp/tests/146-version-first-hash.c.expected
> > @@ -0,0 +1,3 @@
> > +0:1(3): preprocessor error: #version must appear on the first line
> > +
> > +
> 
> This last test fails in glcpp-test-cr-lf for me (See attached).
> 
> Can you just confirm that it passes for you, before I start looking into 
> why it might fail just for me...?

Sorry about that.  I had just run glcpp-test, but not glcpp-test-cr-lf.

It turns out that our handling of hash followed by newline was not
counting lines correctly, so it was returning either line 3 or line 4
based on the line terminator characters.  0:1(3) in the test was wrong;
it should have actually been 0:2(1).
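
As a rough illustration of why the reported line can depend on the terminator style, here is a sketch of terminator-independent line counting. It is not glcpp's lexer; it assumes that LF, CR, CR-LF and LF-CR should each count as a single line break, and the function name is made up.

```python
def count_line_breaks(text):
    """Count line terminators, treating CR-LF and LF-CR pairs as one break."""
    count = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in "\r\n":
            count += 1
            # Swallow the second half of a two-character terminator.
            if i + 1 < len(text) and text[i + 1] in "\r\n" and text[i + 1] != ch:
                i += 1
        i += 1
    return count

for src in ("#\nfoo\n", "#\r\nfoo\r\n", "#\rfoo\r", "#\n\rfoo\n\r"):
    print(repr(src), count_line_breaks(src))
```

A counter that treated CR and LF as separate breaks would report twice as many lines for the CR-LF input as for the LF input.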

Iago just reviewed my patch to fix this, so I've pushed it.  Hopefully
master should work for you now.




Re: [Mesa-dev] [GSoC2016] Interested in implementing "Soft" double precision floating point support

2016-03-10 Thread Ian Romanick
On 03/10/2016 06:30 AM, tournier.elie wrote:
> First, thank you all for your answers.
> 
> So if I summarize what was said, we need
> Ian:
>  - add
>  - negate
>  - absolute value
>  - multiply
>  - reciprocal
>  - convert to single precision
>  - convert from single precision
> Roland:
>  - sqrt

Reciprocal sqrt (rsqrt) is probably more useful.  You can then get sqrt
using reciprocal and rsqrt.
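
A minimal sketch of that composition, using ordinary Python floats as stand-ins for the soft-double primitives. The names rsqrt, rcp and sqrt_from_rsqrt are illustrative only, and x = 0 would need separate handling:

```python
import math

def rsqrt(x):
    # Stand-in for a reciprocal square root primitive.
    return 1.0 / math.sqrt(x)

def rcp(x):
    # Stand-in for a reciprocal primitive.
    return 1.0 / x

def sqrt_from_rsqrt(x):
    # sqrt(x) = 1 / (1 / sqrt(x)); this sketch only handles x > 0.
    return rcp(rsqrt(x))

print(sqrt_from_rsqrt(2.0), math.sqrt(2.0))
```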

>  - comparison (< / == / >)
>  - floor/ceil
> I will contact Pat Brown (his name appears in the contact field in [1])
> to know if we need the functions below to implement gpu_shader_fp64.

Don't contact Pat.  He's a busy guy, and he probably won't respond. :)

We have to have some sort of implementation of every function in the
extension.  That's not the part that's under debate.  The part that is
under debate is whether specific operations are needed to implement the
functions.  For example, as Roland mentioned, you can implement pow
using exp and log.
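
For reference, the identity behind that remark is x^y = exp(y * log(x)) for x > 0. A small sketch with Python floats (the function name is made up, and non-positive x is simply rejected here):

```python
import math

def pow_via_exp_log(x, y):
    # x ** y == exp(y * log(x)), valid for x > 0.
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")
    return math.exp(y * math.log(x))

print(pow_via_exp_log(2.0, 10.0), math.pow(2.0, 10.0))
```

The result matches math.pow only up to rounding, since the intermediate log and multiply each round once.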

>  - pow
>  - exp
>  - log

I looked back at the spec, and there are no pow, exponent, or logarithm
related functions.  I guess we get off easy there.  It may still be
interesting to eventually implement these, but that would be a much,
much, much later step.

> About the license
> 
> /Like I mentioned in the project description, there are quite a few
> existing C implementations of these functions.  Finding one of those
> that you can understand and that has a compatible license is probably
> the best place to start./
> 
> Main Mesa code is under MIT license.
> If I choose to use a GNU GPL licensed file like the Linux kernel's [3], my
> code must be under GNU GPL and probably all the project too. Am I right?
> 
> [1] https://www.opengl.org/registry/specs/ARB/gpu_shader_fp64.txt
> [2] http://www.mesa3d.org/license.html
> [3]
> https://github.com/torvalds/linux/blob/097f70b3c4d84ffccca15195bdfde3a37c0a7c0f/arch/arm/nwfpe/softfloat.c
> 
> 2016-03-10 2:18 GMT+01:00 Roland Scheidegger:
> 
> Am 09.03.2016 um 23:51 schrieb Ian Romanick:
> > On 03/09/2016 02:25 AM, tournier.elie wrote:
> >> Hi everyone.
> >>
> >> My name is Elie TOURNIER, I am enrolled in a French Engineering school
> >> (Telecom Physique Strasbourg) specialized in Medical ICT.
> >> I'm interested in implementing "Soft" double precision floating point
> >> support [1].
> >> Taking this subject seems to be a good way to get my feet wet in the Mesa
> >> code and discover how some of its components work.
> >>
> >> I come to you in order to introduce myself but also to retrieve valuable
> >> information for the success of this project.
> >>
> >> I would like to know more about the following things to understand your
> >> requirements:
> >> 1- "/Each double precision value would be stored in a uvec2/" The IEEE
> >> double precision floating point standard representation requires 64
> >> bits: 1 for sign, 11 for exponent and the others for fraction [2].
> >> -> How must a double precision value be stored?
> >
> > As Emil mentioned, in GLSL 1.30, a uvec2 consists of two 32-bit
> > unsigned integers.  Each double precision value would be stored in
> > a uvec2.
> >
> >> 2- Where can I find GL_ARB_gpu_shader_fp64 documentation?
> >>
> >> This is my first exposure to Mesa. Please excuse me if I am asking basic
> >> questions.
> >
> > For this particular project, you wouldn't need Mesa at all for quite
> > some time.  All of the initial project should be done in "raw" GLSL
> > 1.30, and any OpenGL implementation capable of GLSL 1.30 can be used.
> > You would implement (and test!) a library of functions like 'uvec2
> > addDouble(uvec2 a, uvec2 b)' that would provide all of the required
> > double precision operations.
> >
> > The set of required functions should be pretty small.  I think:
> >
> >  - add
> >  - negate
> >  - absolute value
> >  - multiply
> >  - reciprocal
> >  - convert to single precision
> >  - convert from single precision
> >  - pow (maybe?)
> >  - exp (maybe?)
> >  - log (maybe?)
> 
> I don't think you need exp/log. At least glsl doesn't require it, though
> the project isn't clear about it.
> (All hw I know of, with exactly one exception (that would be intel
> graphics...), implements pow as log2/mul/exp2 even for f32 anyway.)
> I think though you need sqrt (or rsqrt). And some functions for
> rounding, plus comparison operations. Maybe min/max too (albeit if you
> have comparisons you can emulate them of course).
> 
> Roland
> 
> 
> >
> > I think everything else could be implemented using those functions.
> >
> > Like I mentioned in the project description, there are quite a few
> > existing C implementations of these functions.  Finding one 

[Mesa-dev] [Bug 94481] softpipe - access violation in img_filter_2d_nearest

2016-03-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94481

--- Comment #1 from Greg  ---
This problem also exists in mip_filter_linear_aniso(...)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] i965/nir: Use uniform index instead of lookup by name

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:13 AM, Topi Pohjolainen
 wrote:
> Uniform linking in (see link_assign_uniform_locations()) already
> stores the index to the storage in ir_variable which is further
> stored into nir_variable (see nir_visitor::visit(ir_variable *)).
>
> Instead of doing uniform_num^2 string comparisons one can recur
> over the uniform type the same way uniform linking does.
>
> Unfortunately I didn't see any improvement in performance tests,
> at least on BDW. Only the fps numbers in a few synthetic
> benchmarks started to vary more than before between two subsequent
> runs.
>
> CC: Kenneth Graunke 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 113 
> +++--
>  1 file changed, 67 insertions(+), 46 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
> b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> index f3361d6..f8ee0af 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
> @@ -67,62 +67,83 @@ brw_nir_setup_glsl_builtin_uniform(nir_variable *var,
>  }
>
>  static void
> -brw_nir_setup_glsl_uniform(gl_shader_stage stage, nir_variable *var,
> -   struct gl_shader_program *shader_prog,
> +brw_nir_setup_glsl_uniform(gl_shader_stage stage, bool is_scalar,
> struct brw_stage_prog_data *stage_prog_data,
> -   bool is_scalar)
> +   const struct gl_uniform_storage *storage,
> +   unsigned *uniform_index)
>  {
> -   int namelen = strlen(var->name);
> -
> -   /* The data for our (non-builtin) uniforms is stored in a series of
> -* gl_uniform_driver_storage structs for each subcomponent that
> -* glGetUniformLocation() could name.  We know it's been set up in the same
> -* order we'd walk the type, so walk the list of storage and find anything
> -* with our name, or the prefix of a component that starts with our name.
> -*/
> -   unsigned uniform_index = var->data.driver_location / 4;
> -   for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
> -  struct gl_uniform_storage *storage = &shader_prog->UniformStorage[u];
> -
> -  if (storage->builtin)
> - continue;
> +   if (storage->type->is_image()) {
> +  brw_setup_image_uniform_values(stage, stage_prog_data,
> + *uniform_index, storage);
> +  *uniform_index +=
> + BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
> +   } else {
> +  gl_constant_value *components = storage->storage;
> +  unsigned vector_count = (MAX2(storage->array_elements, 1) *
> +   storage->type->matrix_columns);
> +  unsigned vector_size = storage->type->vector_elements;
> +
> +  for (unsigned s = 0; s < vector_count; s++) {
> + unsigned i;
> + for (i = 0; i < vector_size; i++) {
> +stage_prog_data->param[(*uniform_index)++] = components++;
> + }
>
> -  if (strncmp(var->name, storage->name, namelen) != 0 ||
> -  (storage->name[namelen] != 0 &&
> -   storage->name[namelen] != '.' &&
> -   storage->name[namelen] != '[')) {
> - continue;
> + if (!is_scalar) {
> +/* Pad out with zeros if needed (only needed for vec4) */
> +for (; i < 4; i++) {
> +   static const gl_constant_value zero = { 0.0 };
> +   stage_prog_data->param[(*uniform_index)++] = &zero;
> +}
> + }
>}
> +   }
> +}
>
> -  if (storage->type->is_image()) {
> - brw_setup_image_uniform_values(stage, stage_prog_data,
> -uniform_index, storage);
> - uniform_index +=
> -BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
> -  } else {
> - gl_constant_value *components = storage->storage;
> - unsigned vector_count = (MAX2(storage->array_elements, 1) *
> -  storage->type->matrix_columns);
> - unsigned vector_size = storage->type->vector_elements;
> -
> - for (unsigned s = 0; s < vector_count; s++) {
> -unsigned i;
> -for (i = 0; i < vector_size; i++) {
> -   stage_prog_data->param[uniform_index++] = components++;
> -}
> +/* This mirrors the breakdown of complex uniforms in link_uniforms.cpp */
> +static void
> +brw_nir_recur_to_glsl_uniform(gl_shader_stage stage, bool is_scalar,
> +  struct brw_stage_prog_data *stage_prog_data,
> +  const struct gl_uniform_storage **storage,
> +  unsigned *uniform_index, const glsl_type *t)
> +{
> +   assert(!t->is_interface() && !t->without_array()->is_interface());
>
> -   

Re: [Mesa-dev] [PATCH 10/10] nir: Don't abs slt and friends

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> No shader-db changes, but this is symmetric with the previous commit.

Right, i965 doesn't use these operations.

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 05/10] nir: Lower flrp with Boolean interpolator to bcsel

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> On Intel platforms that don't set lower_flrp, using bcsel instead of
> flrp seems to be a small amount worse.

Yep, that's my experience too. It's because bcsel turns into CMP+SEL,
and because of the flag register we can't schedule instructions well.

> On those platforms, the use of
> flrp, bcsel, and multiply of b2f is still an active area of research.
>
> shader-db results:
>
> G4X / Ironlake
> total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
> instructions in affected programs: 161556 -> 157297 (-2.64%)
> helped: 1077
> HURT: 1
>
> total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
> cycles in affected programs: 4174570 -> 4162136 (-0.30%)
> helped: 926
> HURT: 53
>
> Unsurprisingly, no changes on later platforms.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 39be85f..8a44a7a 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -83,10 +83,13 @@ optimizations = [
> (('flrp', a, b, 1.0), b),
> (('flrp', a, a, b), a),
> (('flrp', 0.0, a, b), ('fmul', a, b)),
> +   (('flrp', a, b, ('b2f', c)), ('bcsel', c, b, a), 'options->lower_flrp'),
> (('flrp', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a), 
> 'options->lower_flrp'),
> (('ffract', a), ('fsub', a, ('ffloor', a)), 'options->lower_ffract'),
> -   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b, c)), 
> ('flrp', a, b, c), '!options->lower_flrp'),
> -   (('fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), 
> '!options->lower_flrp'),
> +   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', ('b2f', c)))), ('fmul', b, 
> ('b2f', c))), ('bcsel', c, b, a), 'options->lower_flrp'),
> +   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c ))), ('fmul', b,   
>   c )), ('flrp', a, b, c), '!options->lower_flrp'),
> +   (('fadd', a, ('fmul', ('b2f', c), ('fadd', b, ('fneg', a)))), ('bcsel', 
> c, b, a), 'options->lower_flrp'),
> +   (('fadd', a, ('fmul', c , ('fadd', b, ('fneg', a)))), ('flrp', a, 
> b, c), '!options->lower_flrp'),

Reviewed-by: Matt Turner 
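
The first added rule in the quoted patch rewrites flrp(a, b, b2f(c)) to bcsel(c, b, a). A small sketch of why that holds, with Python stand-ins for the three opcodes. The helper definitions are illustrative rather than NIR's own, flrp is modelled as a*(1-t) + b*t, and only finite inputs are checked, since a*0 is NaN for infinite a:

```python
import itertools

def b2f(c):
    # Boolean to float: true -> 1.0, false -> 0.0
    return 1.0 if c else 0.0

def flrp(a, b, t):
    # Linear interpolation between a and b
    return a * (1.0 - t) + b * t

def bcsel(c, x, y):
    # Select x when c is true, otherwise y
    return x if c else y

values = [0.0, 1.0, -2.5, 1e30, -1e-30]
mismatches = [
    (a, b, c)
    for a, b, c in itertools.product(values, values, (False, True))
    if flrp(a, b, b2f(c)) != bcsel(c, b, a)
]
print("mismatches:", mismatches)
```

With the interpolator restricted to exactly 0.0 or 1.0, one product term vanishes and the other is the selected operand, so no mismatches are expected for finite inputs.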


Re: [Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Sandy Bridge / Ivy Bridge / Haswell
> total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
> instructions in affected programs: 564 -> 558 (-1.06%)
> helped: 6
> HURT: 0
>
> total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
> cycles in affected programs: 9768 -> 9582 (-1.90%)
> helped: 12
> HURT: 0
>
> Broadwell / Skylake
> total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
> instructions in affected programs: 626 -> 619 (-1.12%)
> helped: 7
> HURT: 0
>
> total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
> cycles in affected programs: 9378 -> 9192 (-1.98%)
> helped: 12
> HURT: 0
>
> G45 and Ironlake showed no change.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 4db8f84..1442ce8 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -108,6 +108,11 @@ optimizations = [
> # inot(a)
> (('fge', 0.0, ('b2f', a)), ('inot', a)),
>
> +   # 0.0 < fabs(a)
> +   # 0.0 != fabs(a)  because fabs(a) must be >= 0
> +   # 0.0 != a
> +   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
> +

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 07/10] nir: Simplify 0 >= b2f(a)

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> This also prevented some regressions with other patches in my local
> tree.
>
> Broadwell / Skylake
> total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
> instructions in affected programs: 45 -> 43 (-4.44%)
> helped: 1
> HURT: 0
>
> total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
> cycles in affected programs: 122 -> 118 (-3.28%)
> helped: 1
> HURT: 0
>
> No changes on earlier platforms.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py 
> b/src/compiler/nir/nir_opt_algebraic.py
> index 5b3694e..4db8f84 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -101,6 +101,13 @@ optimizations = [
> (('inot', ('ige', a, b)), ('ilt', a, b)),
> (('inot', ('ieq', a, b)), ('ine', a, b)),
> (('inot', ('ine', a, b)), ('ieq', a, b)),
> +
> +   # 0.0 >= b2f(a)
> +   # 0.0 == b2f(a) because b2f(a) can only be 0 or 1
> +   # b2f(a) == 0.0
> +   # inot(a)
> +   (('fge', 0.0, ('b2f', a)), ('inot', a)),
> +

Reviewed-by: Matt Turner 
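
The comment chain in the quoted patch (0.0 >= b2f(a), hence b2f(a) == 0.0, hence inot(a)) has only two cases to check. A quick sketch with a Python stand-in for b2f, which is an illustrative helper and not NIR's own:

```python
def b2f(a):
    # Boolean to float: true -> 1.0, false -> 0.0
    return 1.0 if a else 0.0

# Map each boolean input to (matched pattern, replacement).
results = {a: (0.0 >= b2f(a), not a) for a in (False, True)}
for a, (pattern, replacement) in results.items():
    print(f"a={a!s:<5}  0.0>=b2f(a)={pattern!s:<5}  inot(a)={replacement}")
```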


Re: [Mesa-dev] [PATCH 06/10] nir: Simplify i2b with negated or abs operand

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> This enables removing ssa_201 and ssa_202 in sequences like:
>
>  vec1 ssa_200 = flt ssa_199, ssa_194
>  vec1 ssa_201 = b2i ssa_200
>  vec1 ssa_202 = i2b -ssa_201

Reviewed-by: Matt Turner 
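
The reasoning behind the simplification named in the subject is that negation and absolute value never change whether an integer is zero, so i2b of the negated or abs'd value equals i2b of the original. A sketch with Python stand-ins; the helpers are illustrative, and Python's unbounded integers are used instead of 32-bit wrapping ones:

```python
def b2i(b):
    # Boolean to integer: true -> 1, false -> 0
    return 1 if b else 0

def i2b(i):
    # Integer to boolean: any nonzero value is true
    return i != 0

ints = range(-4, 5)
neg_ok = all(i2b(-i) == i2b(i) for i in ints)
abs_ok = all(i2b(abs(i)) == i2b(i) for i in ints)
# The sequence from the commit message: i2b(-b2i(x)) collapses back to x.
roundtrip_ok = all(i2b(-b2i(x)) == x for x in (False, True))
print(neg_ok, abs_ok, roundtrip_ok)
```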


Re: [Mesa-dev] [PATCH mesa 2/3] tgsi: Add support for global / local / input MEMORY

2016-03-10 Thread Hans de Goede

Hi,

On 10-03-16 16:35, Aaron Watry wrote:

On Thu, Mar 10, 2016 at 9:14 AM, Hans de Goede  wrote:


Extend the MEMORY file support to differentiate between global, local
and shared memory, as well as "input" memory.

"MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
special memory type is added for this, since the actual storage of these
(e.g. UBO-s) may differ per implementation. The uploading of kernel
parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers
to use an access mechanism for parameter reads which matches with the
upload method.
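
As a rough sketch of the declaration syntax described above, the following parses a simplified "MEMORY[x], TYPE" string with GLOBAL as the default when no type is given. It is not the tgsi_text.c parser; the MemType enum is a hypothetical stand-in for the TGSI_MEMORY_TYPE_* values in the patch, and the accepted input format is an assumption.

```python
from enum import Enum

class MemType(Enum):
    # Hypothetical stand-ins for the TGSI_MEMORY_TYPE_* values.
    GLOBAL = "GLOBAL"
    LOCAL = "LOCAL"
    SHARED = "SHARED"
    INPUT = "INPUT"

def parse_memory_decl(text):
    """Parse 'MEMORY[x]' or 'MEMORY[x], TYPE' into (index, MemType)."""
    head, _, suffix = text.partition(",")
    head = head.strip()
    if not (head.upper().startswith("MEMORY[") and head.endswith("]")):
        raise ValueError(f"not a MEMORY declaration: {text!r}")
    index = int(head[len("MEMORY["):-1])
    suffix = suffix.strip().upper()
    # No suffix means global memory, matching the default in the patch.
    mem_type = MemType[suffix] if suffix else MemType.GLOBAL
    return index, mem_type

print(parse_memory_decl("MEMORY[0], INPUT"))
print(parse_memory_decl("MEMORY[3]"))
```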

Signed-off-by: Hans de Goede 
---
  src/gallium/auxiliary/tgsi/tgsi_build.c|  8 +++
  src/gallium/auxiliary/tgsi/tgsi_dump.c |  9 ++--
  src/gallium/auxiliary/tgsi/tgsi_text.c | 14 ++--
  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 25
--
  src/gallium/auxiliary/tgsi/tgsi_ureg.h |  2 +-
  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  7 +++---
  src/gallium/include/pipe/p_shader_tokens.h | 10 +++--
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
  8 files changed, 51 insertions(+), 26 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index c420ae1..b108ade 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -111,7 +111,7 @@ tgsi_default_declaration( void )
 declaration.Local = 0;
 declaration.Array = 0;
 declaration.Atomic = 0;
-   declaration.Shared = 0;
+   declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
 declaration.Padding = 0;

 return declaration;
@@ -128,7 +128,7 @@ tgsi_build_declaration(
 unsigned local,
 unsigned array,
 unsigned atomic,
-   unsigned shared,
+   unsigned mem_type,
 struct tgsi_header *header )
  {
 struct tgsi_declaration declaration;
@@ -146,7 +146,7 @@ tgsi_build_declaration(
 declaration.Local = local;
 declaration.Array = array;
 declaration.Atomic = atomic;
-   declaration.Shared = shared;
+   declaration.MemType = mem_type;
 header_bodysize_grow( header );

 return declaration;
@@ -406,7 +406,7 @@ tgsi_build_full_declaration(
full_decl->Declaration.Local,
full_decl->Declaration.Array,
full_decl->Declaration.Atomic,
-  full_decl->Declaration.Shared,
+  full_decl->Declaration.MemType,
header );

 if (maxsize <= size)
diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c
b/src/gallium/auxiliary/tgsi/tgsi_dump.c
index f232f38..273f0ae 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
@@ -365,8 +365,13 @@ iter_declaration(
 }

 if (decl->Declaration.File == TGSI_FILE_MEMORY) {
-  if (decl->Declaration.Shared)
- TXT(", SHARED");
+  switch (decl->Declaration.MemType) {
+  /* Note: ,GLOBAL is optional / the default */
+  case TGSI_MEMORY_TYPE_GLOBAL: TXT(", GLOBAL"); break;
+  case TGSI_MEMORY_TYPE_LOCAL:  TXT(", LOCAL");  break;
+  case TGSI_MEMORY_TYPE_SHARED: TXT(", SHARED"); break;
+  case TGSI_MEMORY_TYPE_INPUT:  TXT(", INPUT");  break;
+  }
 }

 if (decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 77598d2..9438e3b 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -1390,8 +1390,18 @@ static boolean parse_declaration( struct
translate_ctx *ctx )
  ctx->cur = cur;
   }
} else if (file == TGSI_FILE_MEMORY) {
- if (str_match_nocase_whole(&cur, "SHARED")) {
-decl.Declaration.Shared = 1;
+ if (str_match_nocase_whole(&cur, "GLOBAL")) {
+/* Note this is a no-op global is the default */
+decl.Declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
+ctx->cur = cur;
+ } else if (str_match_nocase_whole(&cur, "LOCAL")) {
+decl.Declaration.MemType = TGSI_MEMORY_TYPE_LOCAL;
+ctx->cur = cur;
+ } else if (str_match_nocase_whole(&cur, "SHARED")) {
+decl.Declaration.MemType = TGSI_MEMORY_TYPE_SHARED;
+ctx->cur = cur;
+ } else if (str_match_nocase_whole(&cur, "INPUT")) {
+decl.Declaration.MemType = TGSI_MEMORY_TYPE_INPUT;
  ctx->cur = cur;
   }
} else {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index e1a7278..9e10044 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -190,7 +190,7 @@ struct ureg_program

 struct ureg_tokens domain[2];

-   bool use_shared_memory;
+   bool use_memory[TGSI_MEMORY_TYPE_COUNT];
  };

  static union tgsi_any_token error_tokens[32];
@@ -729,13 +729,14 @@ struct ureg_src ureg_DECL_buffer(struct 

Re: [Mesa-dev] [PATCH 04/10] i965: Have NIR lower flrp on pre-GEN6 vec4 backend

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
> By doing it in NIR, we have the opportunity for NIR to do additional
> optimization of the expanded code.
>
> This also enables optimizations added by the next commit.
>
> shader-db results:
>
> G4X / Ironlake
> total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
> instructions in affected programs: 447686 -> 439823 (-1.76%)
> helped: 2623
> HURT: 0
>
> total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
> cycles in affected programs: 16964960 -> 16917410 (-0.28%)
> helped: 2556
> HURT: 41
>
> Unsurprisingly, no changes on later platforms.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c 
> b/src/mesa/drivers/dri/i965/brw_compiler.c
> index 2f05a26..6f67b5c 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> @@ -107,6 +107,26 @@ static const struct nir_shader_compiler_options 
> vector_nir_options = {
>  */
> .fdot_replicates = true,
>
> +   /* Prior to GEN6, there are no three source operations for SIMD4x2. */

Gen's not an acronym, so we don't write it in all-caps.

> +   .lower_flrp = true,
> +
> +   .lower_pack_snorm_2x16 = true,
> +   .lower_pack_unorm_2x16 = true,
> +   .lower_unpack_snorm_2x16 = true,
> +   .lower_unpack_unorm_2x16 = true,
> +   .lower_extract_byte = true,
> +   .lower_extract_word = true,
> +};
> +
> +static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
> +   COMMON_OPTIONS,
> +
> +   /* In the vec4 backend, our dpN instruction replicates its result to all 
> the
> +* components of a vec4.  We would like NIR to give us replicated fdot
> +* instructions because it can optimize better for us.
> +*/
> +   .fdot_replicates = true,
> +
> .lower_pack_snorm_2x16 = true,
> .lower_pack_unorm_2x16 = true,
> .lower_unpack_snorm_2x16 = true,
> @@ -159,8 +179,11 @@ brw_compiler_create(void *mem_ctx, const struct 
> brw_device_info *devinfo)
>if (devinfo->gen < 7)
>   compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
>
> -  compiler->glsl_compiler_options[i].NirOptions =
> - is_scalar ? &scalar_nir_options : &vector_nir_options;
> +  if (is_scalar)
> + compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
> +  else
> + compiler->glsl_compiler_options[i].NirOptions =
> +devinfo->gen < 6 ? &vector_nir_options : 
> &vector_nir_options_gen6;

Braces since this statement is multiline (and braces around the if
since the else will have them).

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/10] First round of Boolean math optimizations

2016-03-10 Thread Ian Romanick
This is the first round of patches to improve the way we deal with
floating point values used to represent Booleans.  I have a bunch more,
but this is the set that has been stable and shows almost exclusively
improvement.

I'm not married to the first 3 patches in the series.  I was trying to
silence a GCC warning, but I don't really like the way it turned out.
Maybe someone can explain to me why the code after patch 2 is still
"bad," but the code after patch 3 is "good."  It seems like GCC ought to
be able to see that they're the same.



[Mesa-dev] [PATCH 05/10] nir: Lower flrp with Boolean interpolator to bcsel

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be slightly worse.  On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
instructions in affected programs: 161556 -> 157297 (-2.64%)
helped: 1077
HURT: 1

total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
cycles in affected programs: 4174570 -> 4162136 (-0.30%)
helped: 926
HURT: 53

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 39be85f..8a44a7a 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -83,10 +83,13 @@ optimizations = [
(('flrp', a, b, 1.0), b),
(('flrp', a, a, b), a),
(('flrp', 0.0, a, b), ('fmul', a, b)),
+   (('flrp', a, b, ('b2f', c)), ('bcsel', c, b, a), 'options->lower_flrp'),
(('flrp', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a), 
'options->lower_flrp'),
(('ffract', a), ('fsub', a, ('ffloor', a)), 'options->lower_ffract'),
-   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b, c)), 
('flrp', a, b, c), '!options->lower_flrp'),
-   (('fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), 
'!options->lower_flrp'),
+   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', ('b2f', c)))), ('fmul', b, 
('b2f', c))), ('bcsel', c, b, a), 'options->lower_flrp'),
+   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c ))), ('fmul', b, 
c )), ('flrp', a, b, c), '!options->lower_flrp'),
+   (('fadd', a, ('fmul', ('b2f', c), ('fadd', b, ('fneg', a)))), ('bcsel', c, 
b, a), 'options->lower_flrp'),
+   (('fadd', a, ('fmul', c , ('fadd', b, ('fneg', a)))), ('flrp', a, 
b, c), '!options->lower_flrp'),
(('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
(('fadd', ('fmul', a, b), c), ('ffma', a, b, c), '!options->lower_ffma'),
# Comparison simplifications
-- 
2.5.0
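
For what it's worth, the new flrp-to-bcsel rule can be sanity-checked numerically. The sketch below is plain Python rather than NIR: `b2f`, `flrp` and `bcsel` are hypothetical stand-ins that follow the operand order used in the patterns above, and the sample values are arbitrary finite numbers (an infinite operand would turn the multiply-by-zero into NaN, so those are deliberately left out).

```python
def b2f(c):
    """Boolean to float: False -> 0.0, True -> 1.0."""
    return 1.0 if c else 0.0


def flrp(a, b, t):
    """Linear interpolation a*(1 - t) + b*t, the form matched in the patterns."""
    return a * (1.0 - t) + b * t


def bcsel(c, x, y):
    """Select x when c is true, otherwise y."""
    return x if c else y


# Arbitrary finite sample pairs; assumption: no infinities or NaNs.
samples = [(-2.5, 7.0), (0.0, 1.0), (3.0, 3.0), (1e10, -1e-10)]

# flrp(a, b, b2f(c)) should pick a for c == False and b for c == True.
agree = all(
    flrp(a, b, b2f(c)) == bcsel(c, b, a)
    for a, b in samples
    for c in (False, True)
)
print(agree)
```

With a Boolean interpolator one of the two products is always a multiply by zero, which is why a plain select gives the same answer for finite inputs.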



[Mesa-dev] [PATCH 03/10] i965: Silence loop counter overflow warning

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

I don't understand why the old code was bad, but the new code is fine.

brw_state_dump.c: In function ‘brw_debug_batch’:
brw_state_dump.c:677:4: warning: cannot optimize loop, the loop counter may 
overflow [-Wunsafe-loop-optimizations]
for (i = 0; i < size / 4; i += 4) {
^
brw_state_dump.c:693:4: warning: cannot optimize loop, the loop counter may 
overflow [-Wunsafe-loop-optimizations]
for (i = 0; i < size / 4; i += 4) {
^

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_state_dump.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_dump.c 
b/src/mesa/drivers/dri/i965/brw_state_dump.c
index 943b2a9..ba09537 100644
--- a/src/mesa/drivers/dri/i965/brw_state_dump.c
+++ b/src/mesa/drivers/dri/i965/brw_state_dump.c
@@ -674,7 +674,9 @@ dump_vs_constants(struct brw_context *brw, uint32_t offset, 
uint32_t size)
uint32_t *as_uint = brw->batch.bo->virtual + offset;
float *as_float = brw->batch.bo->virtual + offset;
 
-   for (unsigned i = 0; i < size / 4; i += 4) {
+   for (unsigned j = 0; j < size / 16; j++) {
+  const unsigned i = j * 4;
+
   batch_out(brw, name, offset, i, "%3d: (% f % f % f % f) (0x%08x 0x%08x 
0x%08x 0x%08x)\n",
i / 4,
as_float[i], as_float[i + 1], as_float[i + 2], as_float[i + 3],
@@ -689,7 +691,9 @@ dump_wm_constants(struct brw_context *brw, uint32_t offset, 
uint32_t size)
uint32_t *as_uint = brw->batch.bo->virtual + offset;
float *as_float = brw->batch.bo->virtual + offset;
 
-   for (unsigned i = 0; i < size / 4; i += 4) {
+   for (unsigned j = 0; j < size / 16; j++) {
+  const unsigned i = j * 4;
+
   batch_out(brw, name, offset, i, "%3d: (% f % f % f % f) (0x%08x 0x%08x 
0x%08x 0x%08x)\n",
i / 4,
as_float[i], as_float[i + 1], as_float[i + 2], as_float[i + 3],
-- 
2.5.0
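
A small model of the two loop shapes may help when comparing them. This is a Python sketch with 32-bit wraparound emulated by masking; the function names are made up for illustration. It lists the indices each loop visits, and separately checks whether a counter with a given step could wrap before the exit test fails. The last part is only a guess at what GCC is unable to rule out, since size / 4 cannot actually get near 2^32 for a 32-bit size.

```python
UINT32_MAX = 0xFFFFFFFF


def old_indices(size):
    """Indices visited by `for (i = 0; i < size / 4; i += 4)`."""
    indices, i = [], 0
    while i < size // 4:
        indices.append(i)
        i = (i + 4) & UINT32_MAX  # emulate uint32 addition
    return indices


def new_indices(size):
    """Indices visited by `for (j = 0; j < size / 16; j++) { i = j * 4; }`."""
    return [(j * 4) & UINT32_MAX for j in range(size // 16)]


def wraps_before_exit(bound, step):
    """True if a uint32 counter starting at 0 wraps before `counter < bound` fails."""
    if bound == 0:
        return False
    last = ((bound - 1) // step) * step  # largest value that passes the test
    return last + step > UINT32_MAX


print(old_indices(64), new_indices(64))   # same indices for a multiple of 16
print(old_indices(20), new_indices(20))   # differ when size % 16 != 0
print(wraps_before_exit(0xFFFFFFFE, 4))   # step 4 can wrap near 2**32
print(wraps_before_exit(0xFFFFFFFF, 1))   # step 1 cannot
```

In this model the two loops agree whenever size is a multiple of 16; for other sizes the old form can visit one extra, partially filled group of four.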



[Mesa-dev] [PATCH 10/10] nir: Don't abs slt and friends

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

No shader-db changes, but this is symmetric with the previous commit.

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 1a0bdd0..0c64b09 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -133,6 +133,10 @@ optimizations = [
(('ior', ('flt', a, c), ('flt', b, c)), ('flt', ('fmin', a, b), c)),
(('ior', ('fge', a, b), ('fge', a, c)), ('fge', a, ('fmin', b, c))),
(('ior', ('fge', a, c), ('fge', b, c)), ('fge', ('fmax', a, b), c)),
+   (('fabs', ('slt', a, b)), ('slt', a, b)),
+   (('fabs', ('sge', a, b)), ('sge', a, b)),
+   (('fabs', ('seq', a, b)), ('seq', a, b)),
+   (('fabs', ('sne', a, b)), ('sne', a, b)),
(('slt', a, b), ('b2f', ('flt', a, b)), 'options->lower_scmp'),
(('sge', a, b), ('b2f', ('fge', a, b)), 'options->lower_scmp'),
(('seq', a, b), ('b2f', ('feq', a, b)), 'options->lower_scmp'),
-- 
2.5.0



[Mesa-dev] [PATCH 04/10] i965: Have NIR lower flrp on pre-GEN6 vec4 backend

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.

This also enables optimizations added by the next commit.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
instructions in affected programs: 447686 -> 439823 (-1.76%)
helped: 2623
HURT: 0

total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
cycles in affected programs: 16964960 -> 16917410 (-0.28%)
helped: 2556
HURT: 41

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_compiler.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c 
b/src/mesa/drivers/dri/i965/brw_compiler.c
index 2f05a26..6f67b5c 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.c
+++ b/src/mesa/drivers/dri/i965/brw_compiler.c
@@ -107,6 +107,26 @@ static const struct nir_shader_compiler_options 
vector_nir_options = {
 */
.fdot_replicates = true,
 
+   /* Prior to GEN6, there are no three source operations for SIMD4x2. */
+   .lower_flrp = true,
+
+   .lower_pack_snorm_2x16 = true,
+   .lower_pack_unorm_2x16 = true,
+   .lower_unpack_snorm_2x16 = true,
+   .lower_unpack_unorm_2x16 = true,
+   .lower_extract_byte = true,
+   .lower_extract_word = true,
+};
+
+static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
+   COMMON_OPTIONS,
+
+   /* In the vec4 backend, our dpN instruction replicates its result to all the
+* components of a vec4.  We would like NIR to give us replicated fdot
+* instructions because it can optimize better for us.
+*/
+   .fdot_replicates = true,
+
.lower_pack_snorm_2x16 = true,
.lower_pack_unorm_2x16 = true,
.lower_unpack_snorm_2x16 = true,
@@ -159,8 +179,11 @@ brw_compiler_create(void *mem_ctx, const struct 
brw_device_info *devinfo)
   if (devinfo->gen < 7)
  compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
 
-  compiler->glsl_compiler_options[i].NirOptions =
- is_scalar ? &scalar_nir_options : &vector_nir_options;
+  if (is_scalar)
+ compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
+  else
+ compiler->glsl_compiler_options[i].NirOptions =
+devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;
 
   compiler->glsl_compiler_options[i].LowerBufferInterfaceBlocks = true;
}
-- 
2.5.0
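
To illustrate what setting lower_flrp buys on hardware without three-source operations: the lower_flrp rule in nir_opt_algebraic.py (it shows up as context in patch 05 of this series) rewrites flrp into an fsub, an fmul and an fadd. The Python sketch below only compares that expansion against the textbook a*(1 - c) + b*c form; the helper names are hypothetical and the inputs were chosen so both forms round identically.

```python
import math


def flrp_reference(a, b, c):
    """Textbook form: a*(1 - c) + b*c."""
    return a * (1.0 - c) + b * c


def flrp_lowered(a, b, c):
    """Lowered form: ('fadd', ('fmul', c, ('fsub', b, a)), a)."""
    return c * (b - a) + a


# Arbitrary test points (a, b, c); assumption: finite inputs only.
cases = [(1.0, 3.0, 0.25), (-4.0, 4.0, 0.5), (10.0, 20.0, 0.0), (10.0, 20.0, 1.0)]

close = all(
    math.isclose(flrp_reference(a, b, c), flrp_lowered(a, b, c), abs_tol=1e-12)
    for a, b, c in cases
)
print(close)
```

For general inputs the two forms can differ in the last bits because they round at different points, so the comparison uses a tolerance rather than exact equality.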



[Mesa-dev] [PATCH 08/10] nir: Simplify 0 < fabs(a)

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

Sandy Bridge / Ivy Bridge / Haswell
total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
instructions in affected programs: 564 -> 558 (-1.06%)
helped: 6
HURT: 0

total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
cycles in affected programs: 9768 -> 9582 (-1.90%)
helped: 12
HURT: 0

Broadwell / Skylake
total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
instructions in affected programs: 626 -> 619 (-1.12%)
helped: 7
HURT: 0

total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
cycles in affected programs: 9378 -> 9192 (-1.98%)
helped: 12
HURT: 0

G45 and Ironlake showed no change.

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 4db8f84..1442ce8 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -108,6 +108,11 @@ optimizations = [
# inot(a)
(('fge', 0.0, ('b2f', a)), ('inot', a)),
 
+   # 0.0 < fabs(a)
+   # 0.0 != fabs(a)  because fabs(a) must be >= 0
+   # 0.0 != a
+   (('flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
+
(('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
(('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
(('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
-- 
2.5.0
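
The chain of reasoning in the comment (0.0 < fabs(a), hence 0.0 != fabs(a), hence 0.0 != a) can be checked with a few values. This is a Python sketch using ordinary IEEE 754 doubles; `flt` and `fne` are hypothetical stand-ins for the NIR opcodes. Under plain IEEE comparison semantics the two sides differ for NaN, and whether that matters for NIR is outside what the sketch tries to show.

```python
import math


def flt(x, y):
    """Ordered less-than."""
    return x < y


def fne(x, y):
    """Not-equal comparison."""
    return x != y


# Finite values, signed zeros and infinities; NaN is handled separately below.
values = [-3.5, -0.0, 0.0, 1e-300, 2.0, math.inf, -math.inf]

agree = all(flt(0.0, abs(a)) == fne(a, 0.0) for a in values)

# NaN: 0.0 < |NaN| is false, while NaN != 0.0 is true.
nan_differs = flt(0.0, abs(math.nan)) != fne(math.nan, 0.0)

print(agree, nan_differs)
```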



[Mesa-dev] [PATCH 06/10] nir: Simplify i2b with negated or abs operand

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

This enables removing ssa_201 and ssa_202 in sequences like:

 vec1 ssa_200 = flt ssa_199, ssa_194
 vec1 ssa_201 = b2i ssa_200
 vec1 ssa_202 = i2b -ssa_201

shader-db results:

Sandy Bridge
total instructions in shared programs: 8462257 -> 8462180 (-0.00%)
instructions in affected programs: 3846 -> 3769 (-2.00%)
helped: 35
HURT: 0

total cycles in shared programs: 117542934 -> 117542462 (-0.00%)
cycles in affected programs: 20072 -> 19600 (-2.35%)
helped: 20
HURT: 1

Ivy Bridge
total instructions in shared programs: 7775252 -> 7775137 (-0.00%)
instructions in affected programs: 3645 -> 3530 (-3.16%)
helped: 35
HURT: 0

total cycles in shared programs: 65760522 -> 65760068 (-0.00%)
cycles in affected programs: 21082 -> 20628 (-2.15%)
helped: 25
HURT: 2

Haswell
total instructions in shared programs: 7108666 -> 7108589 (-0.00%)
instructions in affected programs: 3253 -> 3176 (-2.37%)
helped: 35
HURT: 0

total cycles in shared programs: 64675726 -> 64675272 (-0.00%)
cycles in affected programs: 21034 -> 20580 (-2.16%)
helped: 26
HURT: 1

Broadwell / Skylake
total instructions in shared programs: 8980912 -> 8980835 (-0.00%)
instructions in affected programs: 3223 -> 3146 (-2.39%)
helped: 35
HURT: 0

total cycles in shared programs: 70077926 -> 70077904 (-0.00%)
cycles in affected programs: 21886 -> 21864 (-0.10%)
helped: 21
HURT: 6

G45 and Ironlake showed no change.

Signed-off-by: Ian Romanick 
Suggested-by: Jason Ekstrand 
---
 src/compiler/nir/nir_opt_algebraic.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 8a44a7a..5b3694e 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -217,6 +217,8 @@ optimizations = [
(('i2b', ('b2i', a)), a),
(('f2i', ('ftrunc', a)), ('f2i', a)),
(('f2u', ('ftrunc', a)), ('f2u', a)),
+   (('i2b', ('ineg', a)), ('i2b', a)),
+   (('i2b', ('iabs', a)), ('i2b', a)),
 
# Byte extraction
(('ushr', a, 24), ('extract_u8', a, 3), '!options->lower_extract_byte'),
-- 
2.5.0
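
The two new rules rely on negation and absolute value never turning a nonzero integer into zero (or the reverse), even at the wraparound point. A Python sketch with 32-bit two's complement emulated by hand; all helper names are hypothetical, and `b2i` is assumed to produce 0 or 1:

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def ineg(a):
    return to_int32(-a)


def iabs(a):
    return to_int32(abs(a))


def i2b(a):
    return a != 0


def b2i(c):
    return 1 if c else 0


INT32_MIN = -2**31
values = [0, 1, -1, 12345, INT32_MIN, 2**31 - 1]

neg_ok = all(i2b(ineg(a)) == i2b(a) for a in values)
abs_ok = all(i2b(iabs(a)) == i2b(a) for a in values)

# The sequence from the commit message: flt -> b2i -> i2b of the negation.
chain_ok = all(i2b(ineg(b2i(c))) == c for c in (False, True))

print(neg_ok, abs_ok, chain_ok)
```

Combined with the existing i2b(b2i(a)) rule, this is how ssa_201 and ssa_202 in the example sequence can collapse back to the flt result.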



[Mesa-dev] [PATCH 07/10] nir: Simplify 0 >= b2f(a)

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

This also prevented some regressions with other patches in my local
tree.

Broadwell / Skylake
total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
instructions in affected programs: 45 -> 43 (-4.44%)
helped: 1
HURT: 0

total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
cycles in affected programs: 122 -> 118 (-3.28%)
helped: 1
HURT: 0

No changes on earlier platforms.

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 5b3694e..4db8f84 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -101,6 +101,13 @@ optimizations = [
(('inot', ('ige', a, b)), ('ilt', a, b)),
(('inot', ('ieq', a, b)), ('ine', a, b)),
(('inot', ('ine', a, b)), ('ieq', a, b)),
+
+   # 0.0 >= b2f(a)
+   # 0.0 == b2f(a) because b2f(a) can only be 0 or 1
+   # b2f(a) == 0.0
+   # inot(a)
+   (('fge', 0.0, ('b2f', a)), ('inot', a)),
+
(('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
(('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
(('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
-- 
2.5.0
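
Since b2f(a) has only two possible results, the rule can be checked exhaustively. A minimal Python sketch, with hypothetical stand-ins for the NIR opcodes:

```python
def b2f(a):
    """Boolean to float: only 0.0 or 1.0 can come out."""
    return 1.0 if a else 0.0


def fge(x, y):
    return x >= y


def inot(a):
    return not a


# Truth table: input -> (0.0 >= b2f(a), inot(a))
table = {a: (fge(0.0, b2f(a)), inot(a)) for a in (False, True)}
print(table)
```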



[Mesa-dev] [PATCH 09/10] nir: Don't abs the result of b2f or b2i

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

In the results below, 2 SIMD16 shaders in Trine are lost.

G4X
total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
instructions in affected programs: 116776 -> 115605 (-1.00%)
helped: 339
HURT: 0

total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
cycles in affected programs: 1767232 -> 1764954 (-0.13%)
helped: 274
HURT: 81

Ironlake
total instructions in shared programs: 6399073 -> 6396998 (-0.03%)
instructions in affected programs: 218050 -> 215975 (-0.95%)
helped: 600
HURT: 0

total cycles in shared programs: 128892088 -> 128888810 (-0.00%)
cycles in affected programs: 2867452 -> 2864174 (-0.11%)
helped: 422
HURT: 137

Sandy Bridge
total instructions in shared programs: 8462174 -> 8460759 (-0.02%)
instructions in affected programs: 178529 -> 177114 (-0.79%)
helped: 596
HURT: 0

total cycles in shared programs: 117542276 -> 117534098 (-0.01%)
cycles in affected programs: 1239166 -> 1230988 (-0.66%)
helped: 369
HURT: 150

Ivy Bridge
total instructions in shared programs: 7775131 -> 7773410 (-0.02%)
instructions in affected programs: 162903 -> 161182 (-1.06%)
helped: 590
HURT: 0

total cycles in shared programs: 65759882 -> 65747268 (-0.02%)
cycles in affected programs: 1004354 -> 991740 (-1.26%)
helped: 467
HURT: 141

Haswell
total instructions in shared programs: 7107786 -> 7106327 (-0.02%)
instructions in affected programs: 140954 -> 139495 (-1.04%)
helped: 590
HURT: 0

total cycles in shared programs: 64668028 -> 64655322 (-0.02%)
cycles in affected programs: 967080 -> 954374 (-1.31%)
helped: 452
HURT: 149

LOST:   2
GAINED: 0

Broadwell
total instructions in shared programs: 8980029 -> 8978287 (-0.02%)
instructions in affected programs: 197232 -> 195490 (-0.88%)
helped: 715
HURT: 0

total cycles in shared programs: 70070448 -> 70055970 (-0.02%)
cycles in affected programs: 975724 -> 961246 (-1.48%)
helped: 471
HURT: 111

LOST:   2
GAINED: 0

Skylake
total instructions in shared programs: 9115178 -> 9113436 (-0.02%)
instructions in affected programs: 203012 -> 201270 (-0.86%)
helped: 715
HURT: 0

total cycles in shared programs: 68848660 -> 68834004 (-0.02%)
cycles in affected programs: 993888 -> 979232 (-1.47%)
helped: 473
HURT: 116

LOST:   2
GAINED: 0

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 1442ce8..1a0bdd0 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -231,6 +231,8 @@ optimizations = [
(('f2u', ('ftrunc', a)), ('f2u', a)),
(('i2b', ('ineg', a)), ('i2b', a)),
(('i2b', ('iabs', a)), ('i2b', a)),
+   (('fabs', ('b2f', a)), ('b2f', a)),
+   (('iabs', ('b2i', a)), ('b2i', a)),
 
# Byte extraction
(('ushr', a, 24), ('extract_u8', a, 3), '!options->lower_extract_byte'),
-- 
2.5.0
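
For readers less familiar with how entries in the optimizations list are shaped, here is a toy search-and-replace over tuple expressions. It is a deliberately simplified Python sketch: `match`, `substitute` and `rewrite` are hypothetical helpers written for this example, not Mesa code, and the real pass generated from nir_opt_algebraic.py does considerably more (conditions, constants, bit sizes).

```python
def match(pattern, expr, env):
    """Match expr against pattern, binding variable names (strings) in env."""
    if isinstance(pattern, str):  # a variable such as 'a'
        if pattern in env:
            return env[pattern] == expr
        env[pattern] = expr
        return True
    if not isinstance(pattern, tuple):  # a literal constant
        return pattern == expr
    if (not isinstance(expr, tuple) or len(expr) != len(pattern)
            or expr[0] != pattern[0]):
        return False
    return all(match(p, e, env) for p, e in zip(pattern[1:], expr[1:]))


def substitute(replacement, env):
    """Build the replacement expression from the bound variables."""
    if isinstance(replacement, str):
        return env[replacement]
    if isinstance(replacement, tuple):
        return (replacement[0],) + tuple(substitute(r, env) for r in replacement[1:])
    return replacement


def rewrite(expr, rules):
    """Rewrite operands first, then try each rule on the node itself."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(e, rules) for e in expr[1:])
    for pattern, replacement in rules:
        env = {}
        if match(pattern, expr, env):
            return rewrite(substitute(replacement, env), rules)
    return expr


a = 'a'
rules = [
    (('fabs', ('b2f', a)), ('b2f', a)),
    (('iabs', ('b2i', a)), ('b2i', a)),
]

result = rewrite(('fmul', ('fabs', ('b2f', 'ssa_7')), 'ssa_9'), rules)
print(result)
```

The first tuple of each rule is the pattern to search for and the second is what replaces it; the fabs and iabs wrappers simply disappear because their operand can never be negative.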



[Mesa-dev] [PATCH 02/10] i965: Silence silly comparison between signed and unsigned integer warnings

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

brw_state_dump.c: In function ‘gen7_dump_sampler_state’:
brw_state_dump.c:405:22: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare]
for (int i = 0; i < size / 16; i++) {
  ^
brw_state_dump.c: In function ‘gen8_dump_blend_state’:
brw_state_dump.c:621:22: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare]
for (int i = 1; i < size / 4; i += 2) {
  ^
brw_state_dump.c: In function ‘dump_vs_constants’:
brw_state_dump.c:677:18: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare]
for (i = 0; i < size / 4; i += 4) {
  ^
brw_state_dump.c: In function ‘dump_wm_constants’:
brw_state_dump.c:693:18: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare]
for (i = 0; i < size / 4; i += 4) {
  ^
brw_state_dump.c: In function ‘dump_binding_table’:
brw_state_dump.c:708:18: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare]
for (i = 0; i < size / 4; i++) {
  ^

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_state_dump.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_dump.c 
b/src/mesa/drivers/dri/i965/brw_state_dump.c
index 6450f9b..943b2a9 100644
--- a/src/mesa/drivers/dri/i965/brw_state_dump.c
+++ b/src/mesa/drivers/dri/i965/brw_state_dump.c
@@ -403,7 +403,7 @@ static void gen7_dump_sampler_state(struct brw_context *brw,
const uint32_t *samp = brw->batch.bo->virtual + offset;
char name[20];
 
-   for (int i = 0; i < size / 16; i++) {
+   for (unsigned i = 0; i < size / 16; i++) {
   sprintf(name, "SAMPLER_STATE %d", i);
   batch_out(brw, name, offset, i,
 "Disabled = %s, Base Mip: %u.%u, Mip/Mag/Min Filter: %s/%s/%s, 
LOD Bias: %d.%d\n",
@@ -619,7 +619,7 @@ gen8_dump_blend_state(struct brw_context *brw, uint32_t 
offset, uint32_t size)
if (((size) % 2) != 0)
   fprintf(stderr, "Invalid blend state size %d\n", size);
 
-   for (int i = 1; i < size / 4; i += 2) {
+   for (unsigned i = 1; i < size / 4; i += 2) {
   char name[sizeof("BLEND_ENTRYXXX")];
   sprintf(name, "BLEND_ENTRY%02d", (i - 1) / 2);
   if (blend[i + 1] & GEN8_BLEND_LOGIC_OP_ENABLE) {
@@ -673,9 +673,8 @@ dump_vs_constants(struct brw_context *brw, uint32_t offset, 
uint32_t size)
const char *name = "VS_CONST";
uint32_t *as_uint = brw->batch.bo->virtual + offset;
float *as_float = brw->batch.bo->virtual + offset;
-   int i;
 
-   for (i = 0; i < size / 4; i += 4) {
+   for (unsigned i = 0; i < size / 4; i += 4) {
   batch_out(brw, name, offset, i, "%3d: (% f % f % f % f) (0x%08x 0x%08x 
0x%08x 0x%08x)\n",
i / 4,
as_float[i], as_float[i + 1], as_float[i + 2], as_float[i + 3],
@@ -689,9 +688,8 @@ dump_wm_constants(struct brw_context *brw, uint32_t offset, 
uint32_t size)
const char *name = "WM_CONST";
uint32_t *as_uint = brw->batch.bo->virtual + offset;
float *as_float = brw->batch.bo->virtual + offset;
-   int i;
 
-   for (i = 0; i < size / 4; i += 4) {
+   for (unsigned i = 0; i < size / 4; i += 4) {
   batch_out(brw, name, offset, i, "%3d: (% f % f % f % f) (0x%08x 0x%08x 
0x%08x 0x%08x)\n",
i / 4,
as_float[i], as_float[i + 1], as_float[i + 2], as_float[i + 3],
@@ -703,10 +701,9 @@ static void dump_binding_table(struct brw_context *brw, 
uint32_t offset,
   uint32_t size)
 {
char name[20];
-   int i;
uint32_t *data = brw->batch.bo->virtual + offset;
 
-   for (i = 0; i < size / 4; i++) {
+   for (unsigned i = 0; i < size / 4; i++) {
   if (data[i] == 0)
 continue;
 
-- 
2.5.0
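
As background on what -Wsign-compare guards against: when a signed int is compared with an unsigned value of the same width, C converts the signed operand to unsigned first. The Python sketch below emulates that conversion for 32-bit values; the helper names are made up, and a 32-bit int is an assumption. In the loops above the counter starts at 0 or 1 and only grows, so both comparisons would agree there.

```python
UINT32_MAX = 0xFFFFFFFF


def as_uint32(i):
    """Value of a 32-bit signed int after conversion to unsigned."""
    return i & UINT32_MAX


def c_style_less(i, u):
    """Model of `i < u` with int i and uint32_t u in C."""
    return as_uint32(i) < u


# Pairs where the mathematical comparison and the C comparison disagree.
mismatch = [
    (i, u)
    for i in (-1, 0, 5)
    for u in (0, 4, 16)
    if (i < u) != c_style_less(i, u)
]
print(mismatch)
```

Only the negative counter produces disagreements, which is consistent with the warning being harmless for these particular loops.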



[Mesa-dev] [PATCH 01/10] i965: Silence unused parameter warning

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

Remove the parameter.  Also, reformat the function definition to match
Mesa coding style.

brw_state_dump.c: In function ‘q_to_float’:
brw_state_dump.c:266:44: warning: unused parameter ‘integer_end’ 
[-Wunused-parameter]
 static float q_to_float(uint32_t data, int integer_end, int integer_start,
^

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_state_dump.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state_dump.c 
b/src/mesa/drivers/dri/i965/brw_state_dump.c
index 4666788..6450f9b 100644
--- a/src/mesa/drivers/dri/i965/brw_state_dump.c
+++ b/src/mesa/drivers/dri/i965/brw_state_dump.c
@@ -263,8 +263,9 @@ static void dump_gen7_surface_state(struct brw_context 
*brw, uint32_t offset)
batch_out(brw, name, offset, 7, "\n");
 }
 
-static float q_to_float(uint32_t data, int integer_end, int integer_start,
-int fractional_end, int fractional_start)
+static float
+q_to_float(uint32_t data, int integer_start,
+   int fractional_end, int fractional_start)
 {
/* Convert the number to floating point. */
float n = GET_BITS(data, integer_start, fractional_end);
@@ -305,7 +306,7 @@ dump_gen8_surface_state(struct brw_context *brw, uint32_t 
offset, int index)
  surface_tiling[GET_BITS(surf[0], 13, 12)]);
batch_out(brw, name, offset, 1, "MOCS: 0x%x Base MIP: %.1f (%u mips) 
Surface QPitch: %d\n",
  GET_FIELD(surf[1], GEN8_SURFACE_MOCS),
- q_to_float(surf[1], 23, 20, 19, 19),
+ q_to_float(surf[1], 20, 19, 19),
  surf[5] & INTEL_MASK(3, 0),
  GET_FIELD(surf[1], GEN8_SURFACE_QPITCH) << 2);
batch_out(brw, name, offset, 2, "%dx%d [%s]\n",
-- 
2.5.0



Re: [Mesa-dev] [PATCH] scons: build osmesa swrast and gallium

2016-03-10 Thread Roland Scheidegger
On 10.03.2016 at 08:47, Andreas Fänger wrote:
> 
>> -----Original Message----- From: Roland Scheidegger Sent:
>> Wednesday, 9 March 2016 17:31 Subject: Re: [Mesa-dev] [PATCH] scons:
>> build osmesa swrast and gallium
>> 
>> On 09.03.2016 at 08:41, Andreas Fänger wrote:
>>>> -----Original Message----- From: Roland Scheidegger
>>>> Sent: Tuesday, 8 March 2016 18:26 Subject: Re: [Mesa-dev]
>>>> [PATCH] scons: build osmesa swrast and gallium
>>>> 
>>>> Not that I really care what you can or can't build (and I
>>>> won't comment on build changes), what are those features
>>>> lacking in llvmpipe, aside from anisotropic filtering (which I
>>>> always considered essentially useless for a software renderer,
>>>> albeit interesting if you're curious about the math involved)?
>>>> Last time I checked llvmpipe/softpipe had a much more robust
>>>> feature set (especially when it comes to non-legacy GL
>>>> features), for starters I'll just mention working derivatives
>>>> which is usually the first thing people still using classic
>>>> swrast are hitting bugs on...
>>>> 
>>> 
>>> We are using osmesa for rendering single images on a server. No 
>>> shaders at the moment, only texturing (fixed function pipeline).
>>> For us the quality of the images is the most important criterion
>>> (and rendering speed, of course); therefore anisotropic filtering
>>> is absolutely necessary in order to achieve good looking images.
>>> The same with anti-aliasing: Currently we are using
>>> GL_POLYGON_SMOOTH but this is also missing in gallium. We need an
>>> antialiasing method that is not simply blurring the whole image
>>> or textures but affects only the edges of the polygons.
>> 
>> Ah ok. Gallium supports polygon smooth but the sw rasterization
>> drivers do not, it isn't supported by quite a lot of hw drivers
>> either (I wasn't even aware swrast did), albeit of course hw
>> drivers typically support msaa (chances of llvmpipe getting support
>> for msaa one day are probably a lot higher than for polygon_smooth,
>> albeit maybe the latter could use mostly the same code as the
>> former...). Anisotropic would be interesting to implement, but it
>> was just something easy to skip (since no apis really require it) -
>> there's just not many developers working on llvmpipe... Just don't
>> get your hopes up for better support of modern GL features for 
>> classic swrast.
>> 
> 
> We are interested in using llvmpipe; however, this is why we
> currently cannot do the switch and have to stick to classic mesa. It
> would be really great if msaa (or GL_POLYGON_SMOOTH) would be
> implemented in llvmpipe together with anisotropic filtering in the
> near future. There already is an implementation for anisotropic
> filtering for softpipe (and swrast), so maybe it's possible that one
> of the llvmpipe developers could port it? If people are concerned
> with speed, maybe make it optional and provide a compiler flag to
> turn it on or off?
> 

I'm afraid there aren't really many llvmpipe developers. Nowadays, that
would be mostly me - and I'm not currently working on it as I've got
other things to do...
I don't think aniso would really have to be disabled at compile time due
to performance concerns: with the jit code there'd only be a performance
loss if it's actually enabled, in which case that should be ok (although
if there are apps anyone is interested in running on llvmpipe which don't
let you disable it, we could still have an env var or whatever to disable
it).
Patches are welcome, though...

Roland



[Mesa-dev] [PATCH] i965/nir: Use uniform index instead of lookup by name

2016-03-10 Thread Topi Pohjolainen
Uniform linking (see link_assign_uniform_locations()) already
stores the index to the storage in ir_variable, which is further
stored into nir_variable (see nir_visitor::visit(ir_variable *)).

Instead of doing uniform_num^2 string comparisons one can recurse
over the uniform type the same way uniform linking does.

Unfortunately I didn't see any improvement in performance tests,
at least on BDW. Only the fps numbers in a few synthetic
benchmarks started to vary more than before between two subsequent
runs.

CC: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 113 +++--
 1 file changed, 67 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp 
b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
index f3361d6..f8ee0af 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
+++ b/src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp
@@ -67,62 +67,83 @@ brw_nir_setup_glsl_builtin_uniform(nir_variable *var,
 }
 
 static void
-brw_nir_setup_glsl_uniform(gl_shader_stage stage, nir_variable *var,
-   struct gl_shader_program *shader_prog,
+brw_nir_setup_glsl_uniform(gl_shader_stage stage, bool is_scalar,
struct brw_stage_prog_data *stage_prog_data,
-   bool is_scalar)
+   const struct gl_uniform_storage *storage,
+   unsigned *uniform_index)
 {
-   int namelen = strlen(var->name);
-
-   /* The data for our (non-builtin) uniforms is stored in a series of
-* gl_uniform_driver_storage structs for each subcomponent that
-* glGetUniformLocation() could name.  We know it's been set up in the same
-* order we'd walk the type, so walk the list of storage and find anything
-* with our name, or the prefix of a component that starts with our name.
-*/
-   unsigned uniform_index = var->data.driver_location / 4;
-   for (unsigned u = 0; u < shader_prog->NumUniformStorage; u++) {
-  struct gl_uniform_storage *storage = &shader_prog->UniformStorage[u];
-
-  if (storage->builtin)
- continue;
+   if (storage->type->is_image()) {
+  brw_setup_image_uniform_values(stage, stage_prog_data,
+ *uniform_index, storage);
+  *uniform_index +=
+ BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
+   } else {
+  gl_constant_value *components = storage->storage;
+  unsigned vector_count = (MAX2(storage->array_elements, 1) *
+   storage->type->matrix_columns);
+  unsigned vector_size = storage->type->vector_elements;
+
+  for (unsigned s = 0; s < vector_count; s++) {
+ unsigned i;
+ for (i = 0; i < vector_size; i++) {
+stage_prog_data->param[(*uniform_index)++] = components++;
+ }
 
-  if (strncmp(var->name, storage->name, namelen) != 0 ||
-  (storage->name[namelen] != 0 &&
-   storage->name[namelen] != '.' &&
-   storage->name[namelen] != '[')) {
- continue;
+ if (!is_scalar) {
+/* Pad out with zeros if needed (only needed for vec4) */
+for (; i < 4; i++) {
+   static const gl_constant_value zero = { 0.0 };
+   stage_prog_data->param[(*uniform_index)++] = &zero;
+}
+ }
   }
+   }
+}
 
-  if (storage->type->is_image()) {
- brw_setup_image_uniform_values(stage, stage_prog_data,
-uniform_index, storage);
- uniform_index +=
-BRW_IMAGE_PARAM_SIZE * MAX2(storage->array_elements, 1);
-  } else {
- gl_constant_value *components = storage->storage;
- unsigned vector_count = (MAX2(storage->array_elements, 1) *
-  storage->type->matrix_columns);
- unsigned vector_size = storage->type->vector_elements;
-
- for (unsigned s = 0; s < vector_count; s++) {
-unsigned i;
-for (i = 0; i < vector_size; i++) {
-   stage_prog_data->param[uniform_index++] = components++;
-}
+/* This mirrors the breakdown of complex uniforms in link_uniforms.cpp */
+static void
+brw_nir_recur_to_glsl_uniform(gl_shader_stage stage, bool is_scalar,
+  struct brw_stage_prog_data *stage_prog_data,
+  const struct gl_uniform_storage **storage,
+  unsigned *uniform_index, const glsl_type *t)
+{
+   assert(!t->is_interface() && !t->without_array()->is_interface());
 
-if (!is_scalar) {
-   /* Pad out with zeros if needed (only needed for vec4) */
-   for (; i < 4; i++) {
-  static const gl_constant_value zero = { 0.0 };
-  stage_prog_data->param[uniform_index++] = &zero;
-   }
-}
- 
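The slot accounting done by the refactored helper can be sketched on its own. This is a minimal standalone sketch, not the i965 code: `count_param_slots` and its parameters are hypothetical names, and it only models the non-image path from the diff, where each vector contributes `vector_size` components and the vec4 backend pads every vector out to four slots.

```c
#include <assert.h>

/* Hypothetical helper: number of param[] slots consumed by one non-image
 * uniform, following the loop structure in the patch above. */
static unsigned
count_param_slots(unsigned array_elements, unsigned matrix_columns,
                  unsigned vector_elements, int is_scalar)
{
   assert(vector_elements >= 1 && vector_elements <= 4);

   /* Same effect as MAX2(array_elements, 1) in the patch. */
   unsigned elems = array_elements > 1 ? array_elements : 1;
   unsigned vector_count = elems * matrix_columns;

   /* Scalar backend: tightly packed. vec4 backend: padded to 4. */
   unsigned per_vector = is_scalar ? vector_elements : 4;

   return vector_count * per_vector;
}
```

Under these assumptions a lone vec3 takes 3 slots on the scalar backend and 4 on the vec4 backend, which is why `uniform_index` has to be advanced by the helper rather than by the caller.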

Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-10 Thread Adam Jackson
On Thu, 2016-03-10 at 10:53 -0700, Kyle Brenneman wrote:
> On 03/10/2016 10:47 AM, Martin Peres wrote:
> > 
> > That could be a hacky way of handling the case where multiple 3D 
> > drivers could be used to drive the same GPU. This may be necessary in 
> > the future if two mesa drivers support the same GPU but one is 
> > considered better than the other. We can also imagine a case where a 
> > proprietary driver would need to be co-installable with an open source 
> > one and would still use the same DDX. Isn't that what AMD is going to 
> > do soon? Did anyone think about this case?
> That case is the reason for allowing multiple vendor names. For a case 
> like AMD's driver, it would hand back two names. The order would be up 
> to the driver implementation, but I would guess that it would list the 
> proprietary driver first and the open source driver second. If the 
> proprietary one is installed, then the client would use it, and if not, 
> the client would use the open source one.

Right. It's pretty straightforward (and I plan) to wire this logic
through to xorg.conf, so the admin can control the list at server
startup time. Historically it's been somewhat pointless to do so, since
all the proprietary drivers have also replaced the server's GLX module,
but this gives us a path towards not doing that anymore at least.

I'd also considered using this mechanism as a starting path towards
both implementing GLX+Xinerama at all for the open drivers, and
eventually to do that with heterogeneous open/closed GLX on the server
side. Those are both fairly long term prospects, but I imagine it'll be
easier to do if we can opt into different client-side logic.

- ajax
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 94481] softpipe - access violation in img_filter_2d_nearest

2016-03-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94481

Greg  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |greg.bea...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 94481] softpipe - access violation in img_filter_2d_nearest

2016-03-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94481

Greg  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|All                         |x86-64 (AMD64)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 94481] softpipe - access violation in img_filter_2d_nearest

2016-03-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94481

Bug ID: 94481
   Summary: softpipe - access violation in img_filter_2d_nearest
   Product: Mesa
   Version: 11.2
  Hardware: All
OS: Windows (All)
Status: NEW
  Severity: major
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: greg.bea...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

img_filter_2d_nearest and the other image filter functions read args->offset,
a pointer member of the img_filter structure, without checking whether it has
been set. img_filter_2d_ewa(...) creates a local copy of img_filter but never
initializes its offset member. When that local copy is passed to
min_filter(...), an access violation occurs when the offset pointer is read.
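
The failure mode described above, and one way to avoid it, can be sketched in isolation. This is only an illustration under stated assumptions: `struct filter_args`, `texel_x` and `sample_with_local_copy` are hypothetical stand-ins, not the softpipe definitions, and the fix shown (copying the caller's arguments and checking for NULL before the read) is one possible approach rather than the change that was applied to Mesa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the filter argument struct; only the fields
 * needed for the illustration are modelled. */
struct filter_args {
   float s, t;
   const int *offset;   /* optional texel offset, may be NULL */
};

/* Callee: read the offset only when one was supplied. */
static int
texel_x(const struct filter_args *args, int x)
{
   assert(args != NULL);
   return args->offset ? x + args->offset[0] : x;
}

/* Caller: build the local copy from the incoming arguments so that every
 * field, including offset, has a defined value. */
static int
sample_with_local_copy(const struct filter_args *in, int x)
{
   struct filter_args local = *in;   /* copies offset too */
   local.s *= 0.5f;                  /* adjust only what needs to change */
   return texel_x(&local, x);
}
```

Declaring `local` without an initializer and filling in only `s` and `t` would leave `offset` indeterminate, which matches the crash described in the report.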

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Hans de Goede

Hi,

On 10-03-16 17:03, Samuel Pitoiset wrote:

> Looks fine, except that you will need to lower FILE_SHADER_INPUT to
> FILE_MEMORY_SHARED for Tesla because input kernel parameters are located
> at s[0x10].

Ok, but should this be done in nv50_ir_from_tgsi.cpp? That feels like the
wrong place to handle this detail. Not sure where to do it otherwise though,
and doing this later may make the code more complicated.

> No need to do this for Fermi+ because it's already lowered to c0[]. Note
> that input kernel parameters will probably be placed in c7[] after my
> changes but that doesn't change anything for you.

Ack.

> I already have a patch for the nv50 bits btw, maybe it's the right time to
> send it?
>
> https://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=compute&id=640d68009bcf93c1814cee0b1a12939cb85e5895


Ah I see that answers my question.

Yes I guess this is the right time to send it, although I've not really looked
at opencl for nv50 yet.

Regards,

Hans




Reviewed-by: Samuel Pitoiset 

On 03/10/2016 04:43 PM, Ilia Mirkin wrote:

On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
 wrote:



On 03/10/2016 04:23 PM, Ilia Mirkin wrote:


On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede 
wrote:


Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede 
---
   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18
+++---
   1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index a8258af..de0c72b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int
idx, int c, uint32_t address)

  sym->reg.fileIndex = fileIdx;

-   if (tgsiFile == TGSI_FILE_MEMORY &&
-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
-  sym->setFile(FILE_MEMORY_SHARED);
+   if (tgsiFile == TGSI_FILE_MEMORY) {
+  switch (code->memoryFiles[fileIdx].mem_type) {
+  case TGSI_MEMORY_TYPE_SHARED:
+ sym->setFile(FILE_MEMORY_SHARED);
+ break;
+  case TGSI_MEMORY_TYPE_INPUT:
+ assert(prog->getType() == Program::TYPE_COMPUTE);
+ assert(idx == -1);
+ sym->setFile(FILE_SHADER_INPUT);
+ address += info->prop.cp.inputOffset;



What's the idea here? i.e. what is the inputOffset, how is it set, and
why?



I don't get the idea either, btw.

But prop.cp.inputOffset is only defined for compute on Kepler. It's the
offset of input parameters in the screen->parm BO but as you already know,
it is going to be removed because the idea is to use screen->uniform_bo
instead. I'll do this change *after* the compute shaders support on Kepler.


Actually looks like it's only set for nv50 that I can see, shifting
things over by 0x10. It used to be reflected by getResourceBase, but
we broke that abstraction... might be nice to get it back somehow,
perhaps by sending more arguments down to getResourceBase? Either way,
that can be done later. This patch is

Reviewed-by: Ilia Mirkin 






Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-10 Thread Kyle Brenneman


On 03/10/2016 10:47 AM, Martin Peres wrote:

That could be a hacky way of handling the case where multiple 3D 
drivers could be used to drive the same GPU. This may be necessary in 
the future if two mesa drivers support the same GPU but one is 
considered better than the other. We can also imagine a case where a 
proprietary driver would need to be co-installable with an open source 
one and would still use the same DDX. Isn't that what AMD is going to 
do soon? Did anyone think about this case?
That case is the reason for allowing multiple vendor names. For a case 
like AMD's driver, it would hand back two names. The order would be up 
to the driver implementation, but I would guess that it would list the 
proprietary driver first and the open source driver second. If the 
proprietary one is installed, then the client would use it, and if not, 
the client would 

Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Glenn Kennard

On Thu, 10 Mar 2016 18:13:03 +0100, Ilia Mirkin  wrote:


On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard  wrote:

On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin 
wrote:


On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle 
wrote:


-   if (c->MaxCombinedAtomicBuffers > 0)
+   if (c->MaxCombinedAtomicBuffers > 0) {
extensions->ARB_shader_atomic_counters = GL_TRUE;
+  extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
+   }




I believe there's pre-GCN AMD hardware which can support atomic counters
but
not atomic_counter_ops (at least according to what the closed driver
exposes, I haven't actually checked the docs), so there should probably
be a
capability flag here.



I assumed this was due to laziness... seems odd if the SSBO atomic ops
can be supported, but those same ops can't be supported on atomic
buffers. Glenn / Dave - do you guys happen to know what the pre-GCN hw
is capable of?

  -ilia



AFAIK Cayman supports atomic counter ops on SSBOs, evergreen only on counter
buffers, and earlier hardware does neither.


To phrase this a different way, my patch is fine? :) If you support
atomic counters, you support all the various ops in
ARB_shader_atomic_counter_ops (which are basically all the SSBO ops,
but on atomic counters)?



I think so, though the closed driver exposes ARB_shader_atomic_counter_ops
on Cayman only, which may be a hint of something. Cross that bridge when we
get there...

/Glenn


Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-10 Thread Martin Peres

On 09/03/16 20:15, Kyle Brenneman wrote:

The current implementation of libglvnd uses a new X extension called
x11glvnd to look up a vendor name for each screen and to find a screen
number for a GLXDrawable.

But, Adam Jackson pointed out that a GLX extension could do the same job
more cleanly: Looking up a vendor name is just querying a per-screen
string, which GLXQueryServerString does. Looking up a screen number for
a drawable could work by adding a GLX_SCREEN attribute to the
GLXGetDrawableAttributes reply.

Based on that idea, I've written up a rough draft of a GLX extension
spec. Any comments, questions, or suggestions are welcome, of course.

-Kyle


Name

 EXT_libglvnd

Name Strings

 GLX_EXT_libglvnd

Contact

 Kyle Brenneman, NVIDIA, kbrenneman at nvidia.com

Contributors

 Kyle Brenneman
 Adam Jackson

Status

 XXX - Not complete yet

Version

 Last Modified Date: March 8, 2016
 Revision: 1

Number

 ???

Dependencies

 GLX version 1.3 is required.

 This specification is written against the wording of the GLX 1.4
 Specification.

Overview

 This extension allows the vendor-neutral GLX client library, libglvnd, to
 determine which vendor-specific driver is needed to support a given GLX
 drawable or X11 screen.

 This GLX extension is not intended to be used directly by applications.
 Instead, it is intended to be used by the GLX client library.

IP Status

 No known IP claims.

New Procedures and Functions

 None

New Types

 None

New Tokens

 Accepted by the <name> parameter of glXQueryServerString:

 GLX_VENDOR_NAMES_EXT0x

Additions to Chapter 3 of the GLX 1.4 Specification
(Functions and Errors)

 [Modify Section 3.3.2, GLX Versioning]

 [Replace the 2nd sentence of the 5th paragraph with the following]

 "The possible values for <name> and the format of the strings is the same
 as for glXGetClientString. <name> may also be GLX_VENDOR_NAMES_EXT."

 [Add the following paragraph to the end of the section]

 "If <name> is GLX_VENDOR_NAMES_EXT, then the returned string is a
 space-separated sequence of vendor names. The names are in order of
 preference, with the most preferred vendor first."


 [Modify Section 3.3.6, Querying Attributes]

 [Replace the 2nd sentence of the 1st paragraph with the following]

 "<attribute> must be set to one of GLX_WIDTH, GLX_HEIGHT,
 GLX_PRESERVED_CONTENTS, GLX_LARGEST_PBUFFER, GLX_FBCONFIG_ID, or
 GLX_SCREEN"

 [Add the following paragraph just before the last of the section]

 "If <attribute> is GLX_SCREEN, then <value> will be the screen number that
 the drawable was created on."

GLX Protocol

 This extension does not add any new requests. The GLX_VENDOR_NAMES_EXT enum
 is used with the existing glXQueryServerString request, and GLX_SCREEN is
 added to the attributes in the glXGetDrawableAttributes reply.

Errors

 None

Issues

 1)  Should GLX_VENDOR_NAMES_EXT contain a single vendor name or a list of
     names?

     Allowing multiple names would allow for multiple client-side drivers
     that work with a single server-side driver. With only a single name,
     selecting between multiple client drivers would require some form of
     additional configuration.

 2)  How are vendor names defined and interpreted?

     The vendor names for a screen are defined based on the server's GLX
     implementation. Typically, a server will simply send the name of the
     driver that controls the screen.

     The GLX client library is responsible for translating the vendor name
     to a vendor library name. The details of the translation are part of
     the interface between the vendor library and the GLX client library,
     and so are not defined in this specification.

 3)  What order should the vendor names be returned in?

     The GLX client library will try to load and use each vendor name, in
     the order that the server lists them. It will stop when it finds the
     first vendor that works.


That could be a hacky way of handling the case where multiple 3D drivers 
could be used to drive the same GPU. This may be necessary in the future 
if two mesa drivers support the same GPU but one is considered better 
than the other. We can also imagine a case where a proprietary driver 
would need to be co-installable with an open source one and would still 
use the same DDX. Isn't that what AMD is going to do soon? Did anyone 
think about this case?
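
The "try each vendor in order, stop at the first that works" rule from the draft spec can be sketched as a small client-side routine. This is a minimal sketch under stated assumptions, not libglvnd code: `pick_vendor`, `vendor_try_fn` and `vendor_in_list` are hypothetical names, and the loader is reduced to a callback that reports whether a vendor library is available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the named vendor library
 * could be loaded. */
typedef int (*vendor_try_fn)(const char *name, void *data);

/* Example callback for the sketch: data is a NULL-terminated array of
 * "installed" vendor names. */
static int
vendor_in_list(const char *name, void *data)
{
   const char *const *installed = data;
   for (; *installed; installed++) {
      if (strcmp(*installed, name) == 0)
         return 1;
   }
   return 0;
}

/* Walk a space-separated vendor list in order and stop at the first
 * vendor that loads. Copies the chosen name into out and returns 1,
 * or returns 0 (with out set to "") if no vendor worked. */
static int
pick_vendor(const char *names, vendor_try_fn try_load, void *data,
            char *out, size_t out_size)
{
   assert(names && try_load && out && out_size > 0);

   const char *p = names;
   while (*p) {
      p += strspn(p, " ");            /* skip separators */
      size_t len = strcspn(p, " ");   /* length of this name */
      if (len == 0)
         break;
      if (len < out_size) {           /* names that do not fit are skipped */
         memcpy(out, p, len);
         out[len] = '\0';
         if (try_load(out, data))
            return 1;
      }
      p += len;
   }
   out[0] = '\0';
   return 0;
}
```

With a server string that lists a proprietary vendor first and an open source one second, a client that only has the second installed falls through to it, which is the co-installation case asked about above.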



[Mesa-dev] [PATCH 3/3] radeonsi: process TGSI property NEXT_SHADER

2016-03-10 Thread Marek Olšák
From: Marek Olšák 

This allows compiling the main shader part as ES or LS.

If we get the correct hint, non-separable GLSL shaders no longer have to be
compiled as VS first, followed by LS or ES compiled on demand.

The result is that fewer shaders are compiled by piglit, but it doesn't
improve piglit running time.
---
 src/gallium/drivers/radeonsi/si_shader.c|  9 ++---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 27 +
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 8c1151a..151615e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5897,12 +5897,15 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
struct si_shader *mainp = shader->selector->main_shader_part;
int r;
 
-   /* LS and ES are always compiled on demand. */
+   /* LS, ES, VS are compiled on demand if the main part hasn't been
+* compiled for that stage.
+*/
if (!mainp ||
(shader->selector->type == PIPE_SHADER_VERTEX &&
-(shader->key.vs.as_es || shader->key.vs.as_ls)) ||
+(shader->key.vs.as_es != mainp->key.vs.as_es ||
+ shader->key.vs.as_ls != mainp->key.vs.as_ls)) ||
(shader->selector->type == PIPE_SHADER_TESS_EVAL &&
-shader->key.tes.as_es)) {
+shader->key.tes.as_es != mainp->key.tes.as_es)) {
/* Monolithic shader (compiled as a whole, has many variants,
 * may take a long time to compile).
 */
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 321b87d..2378b44 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1042,6 +1042,31 @@ static int si_shader_select(struct pipe_context *ctx,
return si_shader_select_with_key(ctx, state, &key);
 }
 
+static void si_parse_next_shader_property(const struct tgsi_shader_info *info,
+ union si_shader_key *key)
+{
+   unsigned next_shader = info->properties[TGSI_PROPERTY_NEXT_SHADER];
+
+   switch (info->processor) {
+   case TGSI_PROCESSOR_VERTEX:
+   switch (next_shader) {
+   case TGSI_PROCESSOR_GEOMETRY:
+   key->vs.as_es = 1;
+   break;
+   case TGSI_PROCESSOR_TESS_CTRL:
+   case TGSI_PROCESSOR_TESS_EVAL:
+   key->vs.as_ls = 1;
+   break;
+   }
+   break;
+
+   case TGSI_PROCESSOR_TESS_EVAL:
+   if (next_shader == TGSI_PROCESSOR_GEOMETRY)
+   key->tes.as_es = 1;
+   break;
+   }
+}
+
 static void *si_create_shader_selector(struct pipe_context *ctx,
   const struct pipe_shader_state *state)
 {
@@ -1164,6 +1189,7 @@ static void *si_create_shader_selector(struct 
pipe_context *ctx,
goto error;
 
shader->selector = sel;
+   si_parse_next_shader_property(&sel->info, &shader->key);
 
tgsi_binary = si_get_tgsi_binary(sel);
 
@@ -1199,6 +1225,7 @@ static void *si_create_shader_selector(struct 
pipe_context *ctx,
union si_shader_key key;
 
memset(&key, 0, sizeof(key));
+   si_parse_next_shader_property(&sel->info, &key);
 
/* Set reasonable defaults, so that the shader key doesn't
 * cause any code to be eliminated.
-- 
2.5.0
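
The new condition in si_shader_create can be read as a predicate: compile on demand only when the requested variant differs from the precompiled main part. The sketch below restates that with hypothetical, trimmed-down types (`enum stage`, `struct variant_key`, `needs_on_demand_compile`); it simplifies the radeonsi shader key to the two flags touched by this patch and is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the radeonsi types. */
enum stage { STAGE_VERTEX, STAGE_TESS_EVAL, STAGE_OTHER };

struct variant_key {
   bool as_es;   /* stage feeds a geometry shader */
   bool as_ls;   /* stage feeds tessellation (VS only) */
};

/* Returns true when the requested variant cannot reuse the precompiled
 * main part and has to be compiled on demand. */
static bool
needs_on_demand_compile(enum stage stage,
                        const struct variant_key *main_part,
                        const struct variant_key *wanted)
{
   assert(wanted != NULL);

   if (!main_part)
      return true;   /* nothing was precompiled */

   switch (stage) {
   case STAGE_VERTEX:
      return wanted->as_es != main_part->as_es ||
             wanted->as_ls != main_part->as_ls;
   case STAGE_TESS_EVAL:
      return wanted->as_es != main_part->as_es;
   default:
      return false;
   }
}
```

If the NEXT_SHADER hint was right, the main part was already built as ES or LS and the comparison finds no mismatch, so the on-demand path is skipped.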



[Mesa-dev] [PATCH 1/3] gallium: add TGSI property NEXT_SHADER

2016-03-10 Thread Marek Olšák
From: Marek Olšák 

Radeonsi needs to know which shader stage will execute after a shader
in order to make the best decision about which shader variant to compile
first.

This is only set for VS and TES, because we don't need it elsewhere.

VS has 3 variants:
- next shader is FS
- next shader is GS
- next shader is TCS

TES has 2 variants:
- next shader is FS
- next shader is GS

Currently, radeonsi always assumes the next shader is FS, which is suboptimal,
since st/mesa always knows which shader is next if the GLSL program is not
a "separate shader".

By default, ureg always sets "next shader is FS".
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  1 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 19 +++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h |  2 ++
 src/gallium/docs/source/tgsi.rst   |  7 +++
 src/gallium/include/pipe/p_shader_tokens.h |  3 ++-
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index b15ae69..17c389f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -144,6 +144,7 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"TES_POINT_MODE",
"NUM_CLIPDIST_ENABLED",
"NUM_CULLDIST_ENABLED",
+   "NEXT_SHADER",
 };
 
 const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index e1a7278..b0147fb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -101,6 +101,7 @@ struct ureg_program
 {
unsigned processor;
bool supports_any_inout_decl_range;
+   int next_shader_processor;
 
struct {
   unsigned semantic_name;
@@ -1960,6 +1961,16 @@ const struct tgsi_token *ureg_finalize( struct 
ureg_program *ureg )
 {
const struct tgsi_token *tokens;
 
+   switch (ureg->processor) {
+   case TGSI_PROCESSOR_VERTEX:
+   case TGSI_PROCESSOR_TESS_EVAL:
+  ureg_property(ureg, TGSI_PROPERTY_NEXT_SHADER,
+ureg->next_shader_processor == -1 ?
+   TGSI_PROCESSOR_FRAGMENT :
+   ureg->next_shader_processor);
+  break;
+   }
+
emit_header( ureg );
emit_decls( ureg );
copy_instructions( ureg );
@@ -2073,6 +2084,7 @@ ureg_create_with_screen(unsigned processor, struct 
pipe_screen *screen)
   screen->get_shader_param(screen,
util_pipe_shader_from_tgsi_processor(processor),
PIPE_SHADER_CAP_TGSI_ANY_INOUT_DECL_RANGE) != 0;
+   ureg->next_shader_processor = -1;
 
for (i = 0; i < Elements(ureg->properties); i++)
   ureg->properties[i] = ~0;
@@ -2102,6 +2114,13 @@ no_ureg:
 }
 
 
+void
+ureg_set_next_shader_processor(struct ureg_program *ureg, unsigned processor)
+{
+   ureg->next_shader_processor = processor;
+}
+
+
 unsigned
 ureg_get_nr_outputs( const struct ureg_program *ureg )
 {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index 6a3b5dd..2e63c62 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -114,6 +114,8 @@ ureg_create_shader( struct ureg_program *,
 struct pipe_context *pipe,
const struct pipe_stream_output_info *so );
 
+void
+ureg_set_next_shader_processor(struct ureg_program *ureg, unsigned processor);
 
 /* Alternately, return the built token stream and hand ownership of
  * that memory to the caller:
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 489cbb0..1db88d7 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -3206,6 +3206,13 @@ NUM_CULLDIST_ENABLED
 
 How many cull distance scalar outputs are enabled.
 
+NEXT_SHADER
+"""""""""""
+
+Which shader stage will MOST LIKELY follow after this shader when the shader
+is bound. This is only a hint to the driver and doesn't have to be precise.
+Only set for VS and TES.
+
 
 Texture Sampling and Texture Formats
 
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 9d4a96a..54b6127 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -277,7 +277,8 @@ union tgsi_immediate_data
 #define TGSI_PROPERTY_TES_POINT_MODE 14
 #define TGSI_PROPERTY_NUM_CLIPDIST_ENABLED   15
 #define TGSI_PROPERTY_NUM_CULLDIST_ENABLED   16
-#define TGSI_PROPERTY_COUNT  17
+#define TGSI_PROPERTY_NEXT_SHADER17
+#define TGSI_PROPERTY_COUNT  18
 
 struct tgsi_property {
unsigned Type : 4;  /**< TGSI_TOKEN_TYPE_PROPERTY */
-- 
2.5.0
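
The defaulting rule in ureg_finalize ("FS unless told otherwise, and only for VS and TES") can be sketched as a pure function. The names here are hypothetical (`enum processor`, `NEXT_SHADER_UNSET`, `resolve_next_shader`) and the enum values are illustrative only; the sketch returns the hint instead of emitting a TGSI property.

```c
#include <assert.h>

/* Hypothetical processor IDs; values here are illustrative only. */
enum processor {
   PROC_FRAGMENT,
   PROC_VERTEX,
   PROC_GEOMETRY,
   PROC_TESS_CTRL,
   PROC_TESS_EVAL
};

#define NEXT_SHADER_UNSET (-1)

/* Returns the next-shader hint to record for the current stage, or
 * NEXT_SHADER_UNSET when the stage does not carry the property. */
static int
resolve_next_shader(enum processor current, int hint)
{
   assert(hint >= NEXT_SHADER_UNSET);

   switch (current) {
   case PROC_VERTEX:
   case PROC_TESS_EVAL:
      /* No explicit hint: fall back to "next shader is FS". */
      return hint == NEXT_SHADER_UNSET ? (int)PROC_FRAGMENT : hint;
   default:
      return NEXT_SHADER_UNSET;
   }
}
```

Keeping the sentinel separate from the real processor IDs is what lets existing ureg users, which never call the new setter, keep the old "next shader is FS" behaviour.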


[Mesa-dev] [PATCH 2/3] st/mesa: set TGSI property NEXT_SHADER

2016-03-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 35 ++
 1 file changed, 35 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 26e463e..27c8a47 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6306,6 +6306,41 @@ st_translate_program(
t->insn[t->labels[i].branch_target]);
}
 
+   /* Set the next shader stage hint for VS and TES. */
+   switch (procType) {
+   case TGSI_PROCESSOR_VERTEX:
+   case TGSI_PROCESSOR_TESS_EVAL:
+  if (program->shader_program->SeparateShader)
+ break;
+
+  for (i = program->shader->Stage+1; i <= MESA_SHADER_FRAGMENT; i++) {
+ if (program->shader_program->_LinkedShaders[i]) {
+unsigned next;
+
+switch (i) {
+case MESA_SHADER_TESS_CTRL:
+   next = TGSI_PROCESSOR_TESS_CTRL;
+   break;
+case MESA_SHADER_TESS_EVAL:
+   next = TGSI_PROCESSOR_TESS_EVAL;
+   break;
+case MESA_SHADER_GEOMETRY:
+   next = TGSI_PROCESSOR_GEOMETRY;
+   break;
+case MESA_SHADER_FRAGMENT:
+   next = TGSI_PROCESSOR_FRAGMENT;
+   break;
+default:
+   assert(0);
+}
+
+ureg_set_next_shader_processor(ureg, next);
+break;
+ }
+  }
+  break;
+   }
+
 out:
if (t) {
   free(t->arrays);
-- 
2.5.0



Re: [Mesa-dev] [PATCH] st/mesa: remove ST_NEW_MESA flag

2016-03-10 Thread Ilia Mirkin
v2 is Reviewed-by: Ilia Mirkin 

[in the future, I'd really appreciate inline patches... had to
"manually" de-base64 the attachment... gmail, in their infinite
wisdom, doesn't provide a way to view inline attachments]

On Thu, Mar 10, 2016 at 12:09 PM, Marek Olšák  wrote:
> Yes, please see the attached updated patch.
>
> Thanks,
> Marek
>
> On Thu, Mar 10, 2016 at 6:00 PM, Ilia Mirkin  wrote:
>> Do you also need to do this when validating the compute pipeline?
>>
>> On Thu, Mar 10, 2016 at 11:59 AM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> Only used indirectly when checking dirty.st != 0
>>> ---
>>>  src/mesa/state_tracker/st_context.c | 2 --
>>>  src/mesa/state_tracker/st_context.h | 2 +-
>>>  src/mesa/state_tracker/st_draw.c| 4 ++--
>>>  3 files changed, 3 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/src/mesa/state_tracker/st_context.c 
>>> b/src/mesa/state_tracker/st_context.c
>>> index e3ddee6..f5a6f85 100644
>>> --- a/src/mesa/state_tracker/st_context.c
>>> +++ b/src/mesa/state_tracker/st_context.c
>>> @@ -141,9 +141,7 @@ void st_invalidate_state(struct gl_context * ctx, 
>>> GLbitfield new_state)
>>>
>>> /* Invalidate render and compute pipelines. */
>>> st->dirty.mesa |= new_state;
>>> -   st->dirty.st |= ST_NEW_MESA;
>>> st->dirty_cp.mesa |= new_state;
>>> -   st->dirty_cp.st |= ST_NEW_MESA;
>>>
>>> /* This is the only core Mesa module we depend upon.
>>>  * No longer use swrast, swsetup, tnl.
>>> diff --git a/src/mesa/state_tracker/st_context.h 
>>> b/src/mesa/state_tracker/st_context.h
>>> index f960c64..ba51a9c 100644
>>> --- a/src/mesa/state_tracker/st_context.h
>>> +++ b/src/mesa/state_tracker/st_context.h
>>> @@ -50,7 +50,7 @@ struct st_perf_monitor_group;
>>>  struct u_upload_mgr;
>>>
>>>
>>> -#define ST_NEW_MESA(1 << 0) /* Mesa state has changed 
>>> */
>>> +/* gap  */
>>>  #define ST_NEW_FRAGMENT_PROGRAM(1 << 1)
>>>  #define ST_NEW_VERTEX_PROGRAM  (1 << 2)
>>>  #define ST_NEW_FRAMEBUFFER (1 << 3)
>>> diff --git a/src/mesa/state_tracker/st_draw.c 
>>> b/src/mesa/state_tracker/st_draw.c
>>> index 2de6620..fdd59a3 100644
>>> --- a/src/mesa/state_tracker/st_draw.c
>>> +++ b/src/mesa/state_tracker/st_draw.c
>>> @@ -201,7 +201,7 @@ st_draw_vbo(struct gl_context *ctx,
>>> st_flush_bitmap_cache(st);
>>>
>>> /* Validate state. */
>>> -   if (st->dirty.st || ctx->NewDriverState) {
>>> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>>>st_validate_state(st, ST_PIPELINE_RENDER);
>>>
>>>  #if 0
>>> @@ -314,7 +314,7 @@ st_indirect_draw_vbo(struct gl_context *ctx,
>>> assert(stride);
>>>
>>> /* Validate state. */
>>> -   if (st->dirty.st || ctx->NewDriverState) {
>>> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>>>st_validate_state(st, ST_PIPELINE_RENDER);
>>> }
>>>
>>> --
>>> 2.5.0
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 12:04 PM, Glenn Kennard  wrote:
> On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin 
> wrote:
>
>> On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle 
>> wrote:

 -   if (c->MaxCombinedAtomicBuffers > 0)
 +   if (c->MaxCombinedAtomicBuffers > 0) {
 extensions->ARB_shader_atomic_counters = GL_TRUE;
 +  extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
 +   }
>>>
>>>
>>>
>>> I believe there's pre-GCN AMD hardware which can support atomic counters
>>> but
>>> not atomic_counter_ops (at least according to what the closed driver
>>> exposes, I haven't actually checked the docs), so there should probably
>>> be a
>>> capability flag here.
>>
>>
>> I assumed this was due to laziness... seems odd if the SSBO atomic ops
>> can be supported, but those same ops can't be supported on atomic
>> buffers. Glenn / Dave - do you guys happen to know what the pre-GCN hw
>> is capable of?
>>
>>   -ilia
>>
>
> AFAIK Cayman supports atomic counter ops on SSBOs, evergreen only on counter
> buffers, and earlier hardware does neither.

To phrase this a different way, my patch is fine? :) If you support
atomic counters, you support all the various ops in
ARB_shader_atomic_counter_ops (which are basically all the SSBO ops,
but on atomic counters)?


Re: [Mesa-dev] [PATCH] st/mesa: remove ST_NEW_MESA flag

2016-03-10 Thread Marek Olšák
Yes, please see the attached updated patch.

Thanks,
Marek

On Thu, Mar 10, 2016 at 6:00 PM, Ilia Mirkin  wrote:
> Do you also need to do this when validating the compute pipeline?
>
> On Thu, Mar 10, 2016 at 11:59 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> Only used indirectly when checking dirty.st != 0
>> ---
>>  src/mesa/state_tracker/st_context.c | 2 --
>>  src/mesa/state_tracker/st_context.h | 2 +-
>>  src/mesa/state_tracker/st_draw.c| 4 ++--
>>  3 files changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_context.c 
>> b/src/mesa/state_tracker/st_context.c
>> index e3ddee6..f5a6f85 100644
>> --- a/src/mesa/state_tracker/st_context.c
>> +++ b/src/mesa/state_tracker/st_context.c
>> @@ -141,9 +141,7 @@ void st_invalidate_state(struct gl_context * ctx, 
>> GLbitfield new_state)
>>
>> /* Invalidate render and compute pipelines. */
>> st->dirty.mesa |= new_state;
>> -   st->dirty.st |= ST_NEW_MESA;
>> st->dirty_cp.mesa |= new_state;
>> -   st->dirty_cp.st |= ST_NEW_MESA;
>>
>> /* This is the only core Mesa module we depend upon.
>>  * No longer use swrast, swsetup, tnl.
>> diff --git a/src/mesa/state_tracker/st_context.h 
>> b/src/mesa/state_tracker/st_context.h
>> index f960c64..ba51a9c 100644
>> --- a/src/mesa/state_tracker/st_context.h
>> +++ b/src/mesa/state_tracker/st_context.h
>> @@ -50,7 +50,7 @@ struct st_perf_monitor_group;
>>  struct u_upload_mgr;
>>
>>
>> -#define ST_NEW_MESA (1 << 0) /* Mesa state has changed */
>> +/* gap  */
>>  #define ST_NEW_FRAGMENT_PROGRAM (1 << 1)
>>  #define ST_NEW_VERTEX_PROGRAM  (1 << 2)
>>  #define ST_NEW_FRAMEBUFFER (1 << 3)
>> diff --git a/src/mesa/state_tracker/st_draw.c 
>> b/src/mesa/state_tracker/st_draw.c
>> index 2de6620..fdd59a3 100644
>> --- a/src/mesa/state_tracker/st_draw.c
>> +++ b/src/mesa/state_tracker/st_draw.c
>> @@ -201,7 +201,7 @@ st_draw_vbo(struct gl_context *ctx,
>> st_flush_bitmap_cache(st);
>>
>> /* Validate state. */
>> -   if (st->dirty.st || ctx->NewDriverState) {
>> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>>st_validate_state(st, ST_PIPELINE_RENDER);
>>
>>  #if 0
>> @@ -314,7 +314,7 @@ st_indirect_draw_vbo(struct gl_context *ctx,
>> assert(stride);
>>
>> /* Validate state. */
>> -   if (st->dirty.st || ctx->NewDriverState) {
>> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>>st_validate_state(st, ST_PIPELINE_RENDER);
>> }
>>
>> --
>> 2.5.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
From 466d11ead34f40b68c2ce30e1a8e1a81d1213cb4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Ol=C5=A1=C3=A1k?= 
Date: Wed, 9 Mar 2016 17:03:12 +0100
Subject: [PATCH] st/mesa: remove ST_NEW_MESA flag (v2)

Only used indirectly when checking dirty.st != 0

v2: also update st_cb_compute.c
---
 src/mesa/state_tracker/st_cb_compute.c | 2 +-
 src/mesa/state_tracker/st_context.c| 2 --
 src/mesa/state_tracker/st_context.h| 2 +-
 src/mesa/state_tracker/st_draw.c   | 4 ++--
 4 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_compute.c b/src/mesa/state_tracker/st_cb_compute.c
index 364159d..bfc6d96 100644
--- a/src/mesa/state_tracker/st_cb_compute.c
+++ b/src/mesa/state_tracker/st_cb_compute.c
@@ -47,7 +47,7 @@ static void st_dispatch_compute_common(struct gl_context *ctx,
if (ctx->NewState)
   _mesa_update_state(ctx);
 
-   if (st->dirty_cp.st || ctx->NewDriverState)
+   if (st->dirty_cp.st || st->dirty_cp.mesa || ctx->NewDriverState)
   st_validate_state(st, ST_PIPELINE_COMPUTE);
 
for (unsigned i = 0; i < 3; i++) {
diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
index e3ddee6..f5a6f85 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -141,9 +141,7 @@ void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
 
/* Invalidate render and compute pipelines. */
st->dirty.mesa |= new_state;
-   st->dirty.st |= ST_NEW_MESA;
st->dirty_cp.mesa |= new_state;
-   st->dirty_cp.st |= ST_NEW_MESA;
 
/* This is the only core Mesa module we depend upon.
 * No longer use swrast, swsetup, tnl.
diff --git a/src/mesa/state_tracker/st_context.h b/src/mesa/state_tracker/st_context.h
index f960c64..ba51a9c 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -50,7 +50,7 @@ struct st_perf_monitor_group;
 struct u_upload_mgr;
 
 
-#define ST_NEW_MESA (1 << 0) /* Mesa state has changed */
+/* gap  */
 #define ST_NEW_FRAGMENT_PROGRAM (1 << 1)
 #define ST_NEW_VERTEX_PROGRAM  (1 << 2)
 #define ST_NEW_FRAMEBUFFER (1 << 3)
diff --git 

Re: [Mesa-dev] [PATCH] r600/sb: Do not distribute neg in expr_handler::fold_assoc() when folding multiplications.

2016-03-10 Thread Glenn Kennard

The patch makes a bit more sense to me after realizing a fallthrough was 
changed to a break, so the whole patch is

Reviewed-by: Glenn Kennard 


Re: [Mesa-dev] [PATCH 6/6] nvc0: expose SM35 perf counters to AMD_performance_monitor

2016-03-10 Thread Samuel Pitoiset



On 03/10/2016 01:28 AM, Ilia Mirkin wrote:

On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
 wrote:

Signed-off-by: Samuel Pitoiset 
---
  src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
index 6836432..5cbc66e 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query.c
@@ -204,7 +204,8 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen 
*pscreen,

 if (screen->base.drm->version >= 0x01000101) {
if (screen->compute) {
- if (screen->base.class_3d <= NVE4_3D_CLASS) {
+ if (screen->base.class_3d <= NVF0_3D_CLASS &&
+ screen->base.class_3d != NVEA_3D_CLASS) {
  count += 2;
   }
}
@@ -230,7 +231,8 @@ nvc0_screen_get_driver_query_group_info(struct pipe_screen 
*pscreen,
 } else
 if (id == NVC0_HW_METRIC_QUERY_GROUP) {
if (screen->compute) {
-  if (screen->base.class_3d <= NVE4_3D_CLASS) {
+  if (screen->base.class_3d <= NVF0_3D_CLASS &&
+  screen->base.class_3d != NVE4_3D_CLASS) {


4's do tend to look a lot like A's...


Really good catch. :-)
Thanks.



With the unnecessary attempt to filter out NVEA_3D_CLASS removed (it is
already filtered out because it's > NVF0_3D_CLASS), this whole series
is

Acked-by: Ilia Mirkin 


  info->name = "Performance metrics";
  info->max_active_queries = 1;
  info->num_queries = nvc0_hw_metric_get_num_queries(screen);
--
2.7.1



Re: [Mesa-dev] [PATCH 4/6] nvc0: add MP performance counters for SM35 (GK110)

2016-03-10 Thread Samuel Pitoiset



On 03/10/2016 01:23 AM, Ilia Mirkin wrote:

On Wed, Mar 9, 2016 at 6:23 PM, Samuel Pitoiset
 wrote:

+ if (screen->base.class_3d <= NVF0_3D_CLASS &&
+ screen->base.class_3d != NVEA_3D_CLASS) {


Why? NVEA should be the same as NVF0 I think... and actually
NVEA_3D_CLASS is 0xa297, while the NVF0 one is a197...


I doubt it, because NVEA is SM32; that's why I don't want to enable it for now.



   -ilia
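
Ilia's point about the class IDs can be checked with a few lines of C. This is only a sketch: the two macro values are the ones quoted above (0xa297 for NVEA, 0xa197 for NVF0), while the function names are made up for illustration and are not part of the nouveau driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Class IDs as quoted in the thread. */
#define NVF0_3D_CLASS 0xa197u
#define NVEA_3D_CLASS 0xa297u

/* Hypothetical helper: the check as written in the patch,
 * with the explicit NVEA exclusion. */
bool counters_enabled_explicit(uint32_t class_3d)
{
   return class_3d <= NVF0_3D_CLASS && class_3d != NVEA_3D_CLASS;
}

/* Hypothetical helper: the same check without the exclusion. */
bool counters_enabled_simple(uint32_t class_3d)
{
   return class_3d <= NVF0_3D_CLASS;
}

/* The exclusion is redundant because NVEA compares greater than NVF0. */
bool exclusion_is_redundant(void)
{
   assert(NVEA_3D_CLASS > NVF0_3D_CLASS);
   return counters_enabled_simple(NVEA_3D_CLASS) ==
          counters_enabled_explicit(NVEA_3D_CLASS);
}
```

With these values both helpers reject NVEA and accept NVF0, so the `!=` test changes nothing.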




Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Glenn Kennard

On Thu, 10 Mar 2016 17:02:15 +0100, Ilia Mirkin  wrote:


On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle  wrote:

-   if (c->MaxCombinedAtomicBuffers > 0)
+   if (c->MaxCombinedAtomicBuffers > 0) {
extensions->ARB_shader_atomic_counters = GL_TRUE;
+  extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
+   }



I believe there's pre-GCN AMD hardware which can support atomic counters but
not atomic_counter_ops (at least according to what the closed driver
exposes, I haven't actually checked the docs), so there should probably be a
capability flag here.


I assumed this was due to laziness... seems odd if the SSBO atomic ops
can be supported, but those same ops can't be supported on atomic
buffers. Glenn / Dave - do you guys happen to know what the pre-GCN hw
is capable of?

  -ilia



AFAIK Cayman supports atomic counter ops on SSBOs, evergreen only on counter
buffers, and earlier hardware does neither.


/Glenn
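
If a capability flag did turn out to be necessary, as Nicolai suggests, the gating could look roughly like the sketch below. The structures and the `has_atomic_counter_ops` field are hypothetical simplifications, not the real state tracker types or an existing gallium cap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver-reported limits. */
struct caps {
   unsigned max_combined_atomic_buffers;
   bool has_atomic_counter_ops;   /* assumed flag, not an existing cap */
};

/* Hypothetical, simplified stand-in for the extension table. */
struct exts {
   bool ARB_shader_atomic_counters;
   bool ARB_shader_atomic_counter_ops;
};

/* Expose the counter_ops extension only when the driver also claims it. */
void init_atomic_extensions(const struct caps *c, struct exts *e)
{
   assert(c && e);
   e->ARB_shader_atomic_counters = c->max_combined_atomic_buffers > 0;
   e->ARB_shader_atomic_counter_ops =
      e->ARB_shader_atomic_counters && c->has_atomic_counter_ops;
}
```

Hardware that supports atomic counters but not the extra ops would then leave the flag unset and get only ARB_shader_atomic_counters.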


Re: [Mesa-dev] [PATCH] st/mesa: remove ST_NEW_MESA flag

2016-03-10 Thread Ilia Mirkin
Do you also need to do this when validating the compute pipeline?

On Thu, Mar 10, 2016 at 11:59 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Only used indirectly when checking dirty.st != 0
> ---
>  src/mesa/state_tracker/st_context.c | 2 --
>  src/mesa/state_tracker/st_context.h | 2 +-
>  src/mesa/state_tracker/st_draw.c| 4 ++--
>  3 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_context.c 
> b/src/mesa/state_tracker/st_context.c
> index e3ddee6..f5a6f85 100644
> --- a/src/mesa/state_tracker/st_context.c
> +++ b/src/mesa/state_tracker/st_context.c
> @@ -141,9 +141,7 @@ void st_invalidate_state(struct gl_context * ctx, 
> GLbitfield new_state)
>
> /* Invalidate render and compute pipelines. */
> st->dirty.mesa |= new_state;
> -   st->dirty.st |= ST_NEW_MESA;
> st->dirty_cp.mesa |= new_state;
> -   st->dirty_cp.st |= ST_NEW_MESA;
>
> /* This is the only core Mesa module we depend upon.
>  * No longer use swrast, swsetup, tnl.
> diff --git a/src/mesa/state_tracker/st_context.h 
> b/src/mesa/state_tracker/st_context.h
> index f960c64..ba51a9c 100644
> --- a/src/mesa/state_tracker/st_context.h
> +++ b/src/mesa/state_tracker/st_context.h
> @@ -50,7 +50,7 @@ struct st_perf_monitor_group;
>  struct u_upload_mgr;
>
>
> -#define ST_NEW_MESA (1 << 0) /* Mesa state has changed */
> +/* gap  */
>  #define ST_NEW_FRAGMENT_PROGRAM (1 << 1)
>  #define ST_NEW_VERTEX_PROGRAM  (1 << 2)
>  #define ST_NEW_FRAMEBUFFER (1 << 3)
> diff --git a/src/mesa/state_tracker/st_draw.c 
> b/src/mesa/state_tracker/st_draw.c
> index 2de6620..fdd59a3 100644
> --- a/src/mesa/state_tracker/st_draw.c
> +++ b/src/mesa/state_tracker/st_draw.c
> @@ -201,7 +201,7 @@ st_draw_vbo(struct gl_context *ctx,
> st_flush_bitmap_cache(st);
>
> /* Validate state. */
> -   if (st->dirty.st || ctx->NewDriverState) {
> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>st_validate_state(st, ST_PIPELINE_RENDER);
>
>  #if 0
> @@ -314,7 +314,7 @@ st_indirect_draw_vbo(struct gl_context *ctx,
> assert(stride);
>
> /* Validate state. */
> -   if (st->dirty.st || ctx->NewDriverState) {
> +   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
>st_validate_state(st, ST_PIPELINE_RENDER);
> }
>
> --
> 2.5.0
>


[Mesa-dev] [PATCH] st/mesa: remove ST_NEW_MESA flag

2016-03-10 Thread Marek Olšák
From: Marek Olšák 

Only used indirectly when checking dirty.st != 0
---
 src/mesa/state_tracker/st_context.c | 2 --
 src/mesa/state_tracker/st_context.h | 2 +-
 src/mesa/state_tracker/st_draw.c| 4 ++--
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index e3ddee6..f5a6f85 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -141,9 +141,7 @@ void st_invalidate_state(struct gl_context * ctx, 
GLbitfield new_state)
 
/* Invalidate render and compute pipelines. */
st->dirty.mesa |= new_state;
-   st->dirty.st |= ST_NEW_MESA;
st->dirty_cp.mesa |= new_state;
-   st->dirty_cp.st |= ST_NEW_MESA;
 
/* This is the only core Mesa module we depend upon.
 * No longer use swrast, swsetup, tnl.
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index f960c64..ba51a9c 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -50,7 +50,7 @@ struct st_perf_monitor_group;
 struct u_upload_mgr;
 
 
-#define ST_NEW_MESA (1 << 0) /* Mesa state has changed */
+/* gap  */
 #define ST_NEW_FRAGMENT_PROGRAM (1 << 1)
 #define ST_NEW_VERTEX_PROGRAM  (1 << 2)
 #define ST_NEW_FRAMEBUFFER (1 << 3)
diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index 2de6620..fdd59a3 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -201,7 +201,7 @@ st_draw_vbo(struct gl_context *ctx,
st_flush_bitmap_cache(st);
 
/* Validate state. */
-   if (st->dirty.st || ctx->NewDriverState) {
+   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
   st_validate_state(st, ST_PIPELINE_RENDER);
 
 #if 0
@@ -314,7 +314,7 @@ st_indirect_draw_vbo(struct gl_context *ctx,
assert(stride);
 
/* Validate state. */
-   if (st->dirty.st || ctx->NewDriverState) {
+   if (st->dirty.st || st->dirty.mesa || ctx->NewDriverState) {
   st_validate_state(st, ST_PIPELINE_RENDER);
}
 
-- 
2.5.0
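
The effect of the patch can be sketched in isolation as follows. The struct and function names are illustrative simplifications (the real code also checks ctx->NewDriverState, which is left out here); only the `dirty.st` / `dirty.mesa` split is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the state tracker's dirty flags. */
struct dirty_flags {
   uint32_t mesa;   /* core Mesa state bits */
   uint32_t st;     /* state tracker bits */
};

/* Before the patch: invalidation also set a catch-all bit in 'st'. */
#define OLD_ST_NEW_MESA (1u << 0)

void invalidate_old(struct dirty_flags *d, uint32_t new_state)
{
   assert(d);
   d->mesa |= new_state;
   d->st |= OLD_ST_NEW_MESA;
}

bool needs_validate_old(const struct dirty_flags *d)
{
   return d->st != 0;
}

/* After the patch: no catch-all bit, the draw path tests 'mesa' directly. */
void invalidate_new(struct dirty_flags *d, uint32_t new_state)
{
   assert(d);
   d->mesa |= new_state;
}

bool needs_validate_new(const struct dirty_flags *d)
{
   return d->st != 0 || d->mesa != 0;
}
```

For any non-zero `new_state` both variants request validation; the new one simply does so without spending a bit of `st` on it.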



[Mesa-dev] [PATCH v2] radeonsi: Lazily re-set sampler views after disabling DCC

2016-03-10 Thread Bas Nieuwenhuizen
Clear DCC flags if necessary when binding a new sampler view.

v2: Do not reset DCC flags of bound sampler views.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/gallium/drivers/radeon/r600_texture.c |  2 --
 src/gallium/drivers/radeonsi/si_descriptors.c | 10 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 1a8822c..07118fc 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -310,8 +310,6 @@ static void r600_texture_disable_dcc(struct 
r600_common_screen *rscreen,
 
/* Notify all contexts about the change. */
r600_dirty_all_framebuffer_states(rscreen);
-
-   /* TODO: re-set all sampler views and images, but how? */
 }
 
 static boolean r600_texture_get_handle(struct pipe_screen* screen,
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 37b9d68..d996077 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -185,12 +185,16 @@ static void si_set_sampler_view(struct si_context *sctx,
struct si_sampler_views *views,
unsigned slot, struct pipe_sampler_view *view)
 {
-   if (views->views[slot] == view)
+   struct si_sampler_view *rview = (struct si_sampler_view*)view;
+
+   if (view && G_008F28_COMPRESSION_EN(rview->state[6]) &&
+   ((struct r600_texture*)rview->base.texture)->dcc_offset == 0) {
+   rview->state[6] &= C_008F28_COMPRESSION_EN &
+  C_008F28_ALPHA_IS_ON_MSB;
+   } else if (views->views[slot] == view)
return;
 
if (view) {
-   struct si_sampler_view *rview =
-   (struct si_sampler_view*)view;
struct r600_texture *rtex = (struct r600_texture 
*)view->texture;
 
si_sampler_view_add_buffer(sctx, view->texture);
-- 
2.7.2
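
For readers unfamiliar with the G_/C_ register macro convention used in the hunk above, here is a rough standalone sketch. G_ macros extract a field and C_ macros are masks with that field cleared, so and-ing with two C_ masks clears both fields at once. The bit positions below are assumptions for illustration only; the real values live in the driver's register header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define COMPRESSION_EN_SHIFT  25
#define ALPHA_IS_ON_MSB_SHIFT 26

/* G_: get the field; C_: mask with the field cleared. */
#define G_COMPRESSION_EN(x)  (((x) >> COMPRESSION_EN_SHIFT) & 0x1u)
#define C_COMPRESSION_EN     (~(UINT32_C(1) << COMPRESSION_EN_SHIFT))
#define C_ALPHA_IS_ON_MSB    (~(UINT32_C(1) << ALPHA_IS_ON_MSB_SHIFT))

/* Clear both DCC-related bits in a descriptor word, leaving the rest intact. */
uint32_t clear_dcc_bits(uint32_t word)
{
   uint32_t out = word & C_COMPRESSION_EN & C_ALPHA_IS_ON_MSB;
   assert(G_COMPRESSION_EN(out) == 0);
   return out;
}
```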



Re: [Mesa-dev] [PATCH 2/4] r600g: update compressed_colortex_masks when a cmask is created or disabled

2016-03-10 Thread Marek Olšák
On Thu, Mar 10, 2016 at 5:36 PM, Marek Olšák  wrote:
> On Thu, Mar 10, 2016 at 12:07 AM, Nicolai Hähnle  wrote:
>> From: Nicolai Hähnle 
>>
>> ---
>>  src/gallium/drivers/r600/r600_state_common.c | 30 
>> 
>>  1 file changed, 30 insertions(+)
>>
>> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
>> b/src/gallium/drivers/r600/r600_state_common.c
>> index e3314bb..40ceb8d 100644
>> --- a/src/gallium/drivers/r600/r600_state_common.c
>> +++ b/src/gallium/drivers/r600/r600_state_common.c
>> @@ -693,6 +693,26 @@ static void r600_set_sampler_views(struct pipe_context 
>> *pipe, unsigned shader,
>> }
>>  }
>>
>> +static void r600_update_compressed_colortex_mask(struct 
>> r600_samplerview_state *views)
>> +{
>> +   uint32_t mask = views->enabled_mask;
>> +
>> +   while (mask) {
>> +   unsigned i = u_bit_scan(&mask);
>> +   struct pipe_resource *res = views->views[i]->base.texture;
>> +
>> +   if (res && res->target != PIPE_BUFFER) {
>> +   struct r600_texture *rtex = (struct r600_texture 
>> *)res;
>> +
>> +   if (rtex->cmask.size) {
>> +   views->compressed_colortex_mask |= 1 << i;
>> +   } else {
>> +   views->compressed_colortex_mask &= ~(1 << i);
>> +   }
>
> r600_set_sampler_views contains the same code. This conditional should
> be moved to a separate function, so that both functions can use it.

Whether or not you decide to apply my suggestion above, the whole series is:

Reviewed-by: Marek Olšák 

Marek


Re: [Mesa-dev] [PATCH 2/4] r600g: update compressed_colortex_masks when a cmask is created or disabled

2016-03-10 Thread Marek Olšák
On Thu, Mar 10, 2016 at 12:07 AM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
>  src/gallium/drivers/r600/r600_state_common.c | 30 
> 
>  1 file changed, 30 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
> b/src/gallium/drivers/r600/r600_state_common.c
> index e3314bb..40ceb8d 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -693,6 +693,26 @@ static void r600_set_sampler_views(struct pipe_context 
> *pipe, unsigned shader,
> }
>  }
>
> +static void r600_update_compressed_colortex_mask(struct 
> r600_samplerview_state *views)
> +{
> +   uint32_t mask = views->enabled_mask;
> +
> +   while (mask) {
> +   unsigned i = u_bit_scan(&mask);
> +   struct pipe_resource *res = views->views[i]->base.texture;
> +
> +   if (res && res->target != PIPE_BUFFER) {
> +   struct r600_texture *rtex = (struct r600_texture 
> *)res;
> +
> +   if (rtex->cmask.size) {
> +   views->compressed_colortex_mask |= 1 << i;
> +   } else {
> +   views->compressed_colortex_mask &= ~(1 << i);
> +   }

r600_set_sampler_views contains the same code. This conditional should
be moved to a separate function, so that both functions can use it.

Marek
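
One way to read the suggestion is to pull the set-or-clear conditional into a small helper that both call sites share. The sketch below is a guess at the shape, with a hypothetical name and a plain `bool` standing in for the `rtex->cmask.size` test; it is not the code that was eventually committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared helper: set bit i of the compressed-colortex mask
 * when the texture has a CMASK, clear it otherwise. */
void update_compressed_colortex_bit(uint32_t *compressed_mask,
                                    unsigned i, bool has_cmask)
{
   assert(compressed_mask && i < 32);
   if (has_cmask)
      *compressed_mask |= UINT32_C(1) << i;
   else
      *compressed_mask &= ~(UINT32_C(1) << i);
}
```

Both r600_set_sampler_views and the new update function could then call it with `rtex->cmask.size != 0` as the last argument.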


Re: [Mesa-dev] [PATCH v2 0/8] Add GL ES per-sample shading support

2016-03-10 Thread Ilia Mirkin
ping?

I've already pushed patches 1 and 2, but the rest still require review.

On Sat, Feb 27, 2016 at 11:21 AM, Ilia Mirkin  wrote:
> GL ES adds several extensions that enable the full functionality. I
> sent many of these out before on a piecemeal basis, but this unifies
> everything in one series and includes various little fixes I made
> along the way.
>
> As part of this series, I also implement the clarification regarding
> per-sample shading vs per-sample interpolation.
>
> Ilia Mirkin (8):
>   st/mesa: don't force per-sample interp if only sampleid/pos are used
>   nv50/ir: using sampleid/pos shouldn't force per-sample interpolation
>   glsl: add gl_MaxSamples, new in GL 4.5 / GL ES 3.2
>   mesa: add OES_sample_variables to extension table, add enable bit
>   glsl: add GL_OES_sample_variables support
>   mesa: add GL_OES_sample_shading support
>   mesa: add GL_OES_shader_multisample_interpolation support
>   st/mesa: add ES sample-shading support
>
>  docs/GL3.txt  |  6 +++---
>  src/compiler/glsl/builtin_functions.cpp   | 12 +++-
>  src/compiler/glsl/builtin_variables.cpp   | 15 
> ---
>  src/compiler/glsl/glcpp/glcpp-parse.y |  4 
>  src/compiler/glsl/glsl_lexer.ll   |  2 +-
>  src/compiler/glsl/glsl_parser_extras.cpp  |  5 +
>  src/compiler/glsl/glsl_parser_extras.h|  7 +++
>  src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h  |  1 -
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp |  6 +-
>  src/mapi/glapi/gen/es_EXT.xml |  6 ++
>  src/mesa/main/enable.c|  4 ++--
>  src/mesa/main/extensions_table.h  |  3 +++
>  src/mesa/main/get.c   |  5 +
>  src/mesa/main/get_hash_params.py  | 14 --
>  src/mesa/main/mtypes.h|  1 +
>  src/mesa/main/multisample.c   |  3 ++-
>  src/mesa/main/tests/dispatch_sanity.cpp   |  3 +++
>  src/mesa/state_tracker/st_atom_shader.c   |  4 
>  src/mesa/state_tracker/st_extensions.c|  6 ++
>  src/mesa/state_tracker/st_program.c   |  4 
>  20 files changed, 76 insertions(+), 35 deletions(-)
>
> --
> 2.4.10
>


Re: [Mesa-dev] [PATCH] radeonsi: Lazily re-set sampler views after disabling DCC

2016-03-10 Thread Nicolai Hähnle

On 09.03.2016 16:12, Bas Nieuwenhuizen wrote:

Clear DCC flags if necessary when binding a new sampler_view. Also
rebind all sampler views so that the sampler views that were already
bound are also up to date.


Seems mostly reasonable to me and should cover all the cases.

I don't think rebinding the sampler views is really necessary: during 
DCC decompression, the DCC buffer should be cleared. This means that a 
currently bound sampler view will still result in correct data, just at 
a slightly increased bandwidth cost (because the DCC metadata is 
unnecessarily loaded). When the sampler view is bound the next time, 
that additional cost will go away. So you could simplify the patch a 
bit, which is probably beneficial in the long run. It is a minor quibble 
though.


One more comment below.



Signed-off-by: Bas Nieuwenhuizen 
---
  src/gallium/drivers/radeon/r600_texture.c |  2 --
  src/gallium/drivers/radeonsi/si_descriptors.c | 22 +++---
  src/gallium/drivers/radeonsi/si_state.h   |  1 +
  src/gallium/drivers/radeonsi/si_state_draw.c  |  1 +
  4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 1a8822c..07118fc 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -307,14 +307,12 @@ static void r600_texture_disable_dcc(struct 
r600_common_screen *rscreen,
/* Disable DCC. */
rtex->dcc_offset = 0;
rtex->cb_color_info &= ~VI_S_028C70_DCC_ENABLE(1);

/* Notify all contexts about the change. */
r600_dirty_all_framebuffer_states(rscreen);
-
-   /* TODO: re-set all sampler views and images, but how? */
  }

  static boolean r600_texture_get_handle(struct pipe_screen* screen,
   struct pipe_resource *resource,
   struct winsys_handle *whandle,
 unsigned usage)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 37b9d68..5838e24 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -182,18 +182,22 @@ static void si_sampler_views_begin_new_cs(struct 
si_context *sctx,
  }

  static void si_set_sampler_view(struct si_context *sctx,
struct si_sampler_views *views,
unsigned slot, struct pipe_sampler_view *view)
  {
-   if (views->views[slot] == view)
+   struct si_sampler_view *rview = (struct si_sampler_view*)view;
+
+   if (view && G_008F28_COMPRESSION_EN(rview->state[6]) &&
+   ((struct r600_texture*)rview->base.texture)->dcc_offset == 0) {
+   rview->state[6] &= C_008F28_COMPRESSION_EN &
+  C_008F28_ALPHA_IS_ON_MSB;
+   } else if (views->views[slot] == view)
return;

if (view) {
-   struct si_sampler_view *rview =
-   (struct si_sampler_view*)view;
struct r600_texture *rtex = (struct r600_texture 
*)view->texture;

si_sampler_view_add_buffer(sctx, view->texture);

pipe_sampler_view_reference(&views->views[slot], view);
memcpy(views->desc.list + slot * 16, rview->state, 8*4);
@@ -267,12 +271,24 @@ static void si_set_sampler_views(struct pipe_context *ctx,
samplers->depth_texture_mask &= ~(1 << slot);
samplers->compressed_colortex_mask &= ~(1 << slot);
}
}
  }

+void si_reset_sampler_views(struct si_context *sctx) {
+   unsigned shader, sampler;
+
+   for (shader = 0; shader < SI_NUM_SHADERS; ++shader) {
+   struct si_sampler_views *views = &sctx->samplers[shader].views;
+   for (sampler = 0; sampler < SI_NUM_SAMPLERS; ++sampler) {
+   si_set_sampler_view(sctx, views, sampler,
+   views->views[sampler]);
+   }
+   }
+}


Please use a u_bit_scan loop over the enabled_mask like in other places.

Cheers,
Nicolai
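
The loop pattern being asked for, visiting only the set bits of a mask rather than every slot, can be sketched as below. `bit_scan` is a local stand-in written for this example (Mesa has its own u_bit_scan), and `visit_enabled` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: return the index of the lowest set bit and clear it. */
int bit_scan(uint32_t *mask)
{
   int i = 0;

   assert(mask && *mask != 0);
   while (!((*mask >> i) & 1u))
      i++;
   *mask &= ~(UINT32_C(1) << i);
   return i;
}

/* Visit only the enabled slots; records their indices in out_slots
 * (which must hold at least 32 entries) and returns how many were visited. */
unsigned visit_enabled(uint32_t enabled_mask, int *out_slots)
{
   unsigned n = 0;
   uint32_t mask = enabled_mask;

   while (mask)
      out_slots[n++] = bit_scan(&mask);
   return n;
}
```

Compared with the nested `for` loops in the patch, this skips unbound slots entirely instead of calling the setter on NULL views.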


+
  /* SAMPLER STATES */

  static void si_bind_sampler_states(struct pipe_context *ctx, unsigned shader,
 unsigned start, unsigned count, void 
**states)
  {
struct si_context *sctx = (struct si_context *)ctx;
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index fb16d0f..dab94e5 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -243,12 +243,13 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint 
shader, uint slot,
bool add_tid, bool swizzle,
unsigned element_size, unsigned index_stride, uint64_t 
offset);
  void 

Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-10 Thread Adam Jackson
On Thu, 2016-03-10 at 08:32 -0700, Kyle Brenneman wrote:

> > That could work, although I would expect "vendor-specific info" to 
> > mean "random, arbitrary, and probably not machine-parsable". I'd be 
> > hesitant to try to impose a structure on something that's never had 
> > any structure before.

As far as I'm aware, there are zero servers that supply anything more
than the bare version number in this string. But a new token is
certainly the more conservative approach. I'll switch my branch back to
using that.

> > >  2) Do we want to add GLX_SCREEN to the list of fbconfig attributes
> > > as well?
> > > 
> > > UNRESOLVED.  glvnd does not need that information, but it would
> > > be a natural orthogonality, and GLX_SGIX_fbconfig mentions it
> > > though GLX 1.3 does not.
> > Possibly, but that wouldn't change the protocol at all. The screen 
> > number is included in the glXGetFBConfigs request, so it wouldn't make 
> > sense to add it to the reply as well. It would be up to the client to 
> > keep track of it instead.
> Oh, wait. Now that I think about it, GLX already provides a GLXFBConfig 
> to screen mapping in glXGetVisualFromFBConfig, and indirectly from 
> glXGetFBConfigs.
> 
> So, unless someone feels strongly otherwise, I think it would make the 
> most sense to leave glXGetFBConfigAttrib as it is.

Sounds fine to me.

- ajax


Re: [Mesa-dev] [Nouveau] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Pierre Moreau
On 11:05 AM - Mar 10 2016, Ilia Mirkin wrote:
> On Thu, Mar 10, 2016 at 11:03 AM, Pierre Moreau  wrote:
> > You might want to increment the address by at least
> > `info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,
> 
> There's a cp.sharedOffset just for that :) However it doesn't appear
> to get set anywhere...

Oh really?! I completely missed that one… Well, I have some changes to make on
my own code then! :-D Thanks for pointing that out!

Pierre




Re: [Mesa-dev] [Nouveau] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 11:05 AM, Samuel Pitoiset
 wrote:
>> If I understand correctly, the goal is to have user inputs in a
>> `screen->uniform_bo`, and so for all generations?
>
> Sure for Fermi, and probably for Tesla.

I think continuing to use the USER_PARAMS or whatever mechanism on
Tesla makes sense. That's why I agreed to keep the MEMORY, INPUT
concept.

  -ilia


Re: [Mesa-dev] [Nouveau] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Samuel Pitoiset



On 03/10/2016 05:03 PM, Pierre Moreau wrote:

On 04:27 PM - Mar 10 2016, Samuel Pitoiset wrote:



On 03/10/2016 04:23 PM, Ilia Mirkin wrote:

On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:

Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede 
---
  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18 +++---
  1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index a8258af..de0c72b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int idx, 
int c, uint32_t address)

 sym->reg.fileIndex = fileIdx;

-   if (tgsiFile == TGSI_FILE_MEMORY &&
-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
-  sym->setFile(FILE_MEMORY_SHARED);
+   if (tgsiFile == TGSI_FILE_MEMORY) {
+  switch (code->memoryFiles[fileIdx].mem_type) {
+  case TGSI_MEMORY_TYPE_SHARED:
+ sym->setFile(FILE_MEMORY_SHARED);


You might want to increment the address by at least
`info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,
then increment further by the input size. This input offset of 0x10 (or is it
0x20?) is due to the card sticking the size of a block and of the grid inside
`s[0x0..0x10]` (or maybe Nouveau is doing that, but I doubt it.). So even if
the user inputs end up somewhere else in memory, you most likely still don't
want to overwrite the grid information. This should be necessary only for Tesla
cards.


cf. my previous comment. :-)




+ break;
+  case TGSI_MEMORY_TYPE_INPUT:
+ assert(prog->getType() == Program::TYPE_COMPUTE);
+ assert(idx == -1);
+ sym->setFile(FILE_SHADER_INPUT);
+ address += info->prop.cp.inputOffset;


What's the idea here? i.e. what is the inputOffset, how is it set, and why?


I don't get the idea either, btw.

But prop.cp.inputOffset is only defined for compute on Kepler. It's the
offset of input parameters in the screen->parm BO but as you already know,
it is going to be removed because the idea is to use screen->uniform_bo
instead. I'll do this change *after* the compute shaders support on Kepler.


If I understand correctly, the goal is to have user inputs in a
`screen->uniform_bo`, and so for all generations?


Sure for Fermi, and probably for Tesla.



Pierre






   -ilia


+ break;
+  default:
+ assert(0); /* TODO: Add support for global and local memory */
+  }
+   }

 if (idx >= 0) {
if (sym->reg.file == FILE_SHADER_INPUT)
--
2.7.2



--
-Samuel
___
Nouveau mailing list
nouv...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


--
-Samuel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Nouveau] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 11:03 AM, Pierre Moreau  wrote:
> You might want to increment the address by at least
> `info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,

There's a cp.sharedOffset just for that :) However it doesn't appear
to get set anywhere...


Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Samuel Pitoiset
Looks fine, except that you will need to lower FILE_SHADER_INPUT to
FILE_MEMORY_SHARED for Tesla, because kernel input parameters are located
at s[0x10] there. No need to do this for Fermi+, where it's already lowered
to c0[]. Note that kernel input parameters will probably end up in c7[]
after my changes, but that doesn't change anything for you.
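
To make the lowering concrete, here is a minimal sketch. Gen, File, Sym and lowerKernelInput() are made-up stand-ins for illustration, not the real nv50_ir types. The 0x10 base is the s[0x10] location mentioned above, and the assumption that Fermi+ needs no extra base in c0[] is mine:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the nv50_ir notions of chipset generation,
// register file and symbol; the real types look different.
enum class Gen { Tesla, Fermi, Kepler };
enum class File { ShaderInput, MemoryShared, MemoryConst };

struct Sym {
   File file;
   uint32_t offset; // byte offset inside the file
};

// Base of the kernel parameters in shared memory on Tesla (s[0x10]).
static const uint32_t kTeslaParamBase = 0x10;

// Rewrite a kernel-parameter access into the file that backs it.
// `in.offset` is assumed to be relative to the start of the parameters.
static Sym lowerKernelInput(Gen gen, Sym in)
{
   assert(in.file == File::ShaderInput);

   if (gen == Gen::Tesla)
      return { File::MemoryShared, kTeslaParamBase + in.offset };

   // Fermi+: assumed to map 1:1 onto the constant buffer.
   return { File::MemoryConst, in.offset };
}
```

So the third 32-bit parameter (offset 8) would be read from s[0x18] on Tesla and from c0[0x8] on Fermi+ under these assumptions.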


I already have a patch for the nv50 bits btw, maybe it's the right time 
to send it?


https://cgit.freedesktop.org/~hakzsam/mesa/commit/?h=compute&id=640d68009bcf93c1814cee0b1a12939cb85e5895

Reviewed-by: Samuel Pitoiset 

On 03/10/2016 04:43 PM, Ilia Mirkin wrote:

On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
 wrote:



On 03/10/2016 04:23 PM, Ilia Mirkin wrote:


On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede 
wrote:


Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede 
---
   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18
+++---
   1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index a8258af..de0c72b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int
idx, int c, uint32_t address)

  sym->reg.fileIndex = fileIdx;

-   if (tgsiFile == TGSI_FILE_MEMORY &&
-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
-  sym->setFile(FILE_MEMORY_SHARED);
+   if (tgsiFile == TGSI_FILE_MEMORY) {
+  switch (code->memoryFiles[fileIdx].mem_type) {
+  case TGSI_MEMORY_TYPE_SHARED:
+ sym->setFile(FILE_MEMORY_SHARED);
+ break;
+  case TGSI_MEMORY_TYPE_INPUT:
+ assert(prog->getType() == Program::TYPE_COMPUTE);
+ assert(idx == -1);
+ sym->setFile(FILE_SHADER_INPUT);
+ address += info->prop.cp.inputOffset;



What's the idea here? i.e. what is the inputOffset, how is it set, and
why?



I don't get the idea either, btw.

But prop.cp.inputOffset is only defined for compute on Kepler. It's the
offset of input parameters in the screen->parm BO but as you already know,
it is going to be removed because the idea is to use screen->uniform_bo
instead. I'll do this change *after* the compute shaders support on Kepler.


Actually looks like it's only set for nv50 that I can see, shifting
things over by 0x10. It used to be reflected by getResourceBase, but
we broke that abstraction... might be nice to get it back somehow,
perhaps by sending more arguments down to getResourceBase? Either way,
that can be done later. This patch is

Reviewed-by: Ilia Mirkin 



--
-Samuel


Re: [Mesa-dev] [Nouveau] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Pierre Moreau
On 04:27 PM - Mar 10 2016, Samuel Pitoiset wrote:
> 
> 
> On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
> >On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:
> >>Add support for clover / OpenCL kernel input parameters.
> >>
> >>Signed-off-by: Hans de Goede 
> >>---
> >>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18 
> >> +++---
> >>  1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >>diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> >>b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> >>index a8258af..de0c72b 100644
> >>--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> >>+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> >>@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int 
> >>idx, int c, uint32_t address)
> >>
> >> sym->reg.fileIndex = fileIdx;
> >>
> >>-   if (tgsiFile == TGSI_FILE_MEMORY &&
> >>-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
> >>-  sym->setFile(FILE_MEMORY_SHARED);
> >>+   if (tgsiFile == TGSI_FILE_MEMORY) {
> >>+  switch (code->memoryFiles[fileIdx].mem_type) {
> >>+  case TGSI_MEMORY_TYPE_SHARED:
> >>+ sym->setFile(FILE_MEMORY_SHARED);

You might want to increment the address by at least
`info->prop.cp.inputOffset`, and if inputs still end up in shared on Tesla,
then increment further by the input size. This input offset of 0x10 (or is it
0x20?) is due to the card sticking the size of a block and of the grid inside
`s[0x0..0x10]` (or maybe Nouveau is doing that, but I doubt it.). So even if
the user inputs end up somewhere else in memory, you most likely still don't
want to overwrite the grid information. This should be necessary only for Tesla
cards.
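
A rough sketch of the adjustment I have in mind, assuming the grid information takes the first 0x10 bytes of shared memory. TeslaSharedLayout and userSharedAddress() are hypothetical names, nothing like this exists in the tree:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of how s[] is carved up on Tesla.
struct TeslaSharedLayout {
   uint32_t gridInfoSize;  // block/grid sizes at the start, assumed 0x10
   uint32_t inputSize;     // bytes of kernel parameters
   bool inputsInShared;    // true if the parameters still live in s[]
};

// Translate a user-visible shared-memory address into the real s[] address,
// skipping the grid information and, if present, the kernel parameters.
static uint32_t userSharedAddress(const TeslaSharedLayout &l, uint32_t address)
{
   uint32_t base = l.gridInfoSize;
   if (l.inputsInShared)
      base += l.inputSize;

   assert(address <= UINT32_MAX - base); // no wrap-around
   return base + address;
}
```

With 0x20 bytes of parameters kept in shared memory, user address 0x8 would land at s[0x38]; without them it would land at s[0x18].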

> >>+ break;
> >>+  case TGSI_MEMORY_TYPE_INPUT:
> >>+ assert(prog->getType() == Program::TYPE_COMPUTE);
> >>+ assert(idx == -1);
> >>+ sym->setFile(FILE_SHADER_INPUT);
> >>+ address += info->prop.cp.inputOffset;
> >
> >What's the idea here? i.e. what is the inputOffset, how is it set, and why?
> 
> I don't get the idea either, btw.
> 
> But prop.cp.inputOffset is only defined for compute on Kepler. It's the
> offset of input parameters in the screen->parm BO but as you already know,
> it is going to be removed because the idea is to use screen->uniform_bo
> instead. I'll do this change *after* the compute shaders support on Kepler.

If I understand correctly, the goal is to have user inputs in a
`screen->uniform_bo`, and so for all generations?

Pierre


> 
> >
> >   -ilia
> >
> >>+ break;
> >>+  default:
> >>+ assert(0); /* TODO: Add support for global and local memory */
> >>+  }
> >>+   }
> >>
> >> if (idx >= 0) {
> >>if (sym->reg.file == FILE_SHADER_INPUT)
> >>--
> >>2.7.2
> >>
> 
> -- 
> -Samuel
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau




Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 10:57 AM, Nicolai Hähnle  wrote:
>> -   if (c->MaxCombinedAtomicBuffers > 0)
>> +   if (c->MaxCombinedAtomicBuffers > 0) {
>> extensions->ARB_shader_atomic_counters = GL_TRUE;
>> +  extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
>> +   }
>
>
> I believe there's pre-GCN AMD hardware which can support atomic counters but
> not atomic_counter_ops (at least according to what the closed driver
> exposes, I haven't actually checked the docs), so there should probably be a
> capability flag here.

I assumed this was due to laziness... seems odd if the SSBO atomic ops
can be supported, but those same ops can't be supported on atomic
buffers. Glenn / Dave - do you guys happen to know what the pre-GCN hw
is capable of?

  -ilia


Re: [Mesa-dev] [PATCH 2/2] st/mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Nicolai Hähnle

On 20.02.2016 00:13, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
  src/mesa/state_tracker/st_extensions.c |  4 +-
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 60 +++---
  2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 94696ce..21e108d 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -368,8 +368,10 @@ void st_init_limits(struct pipe_screen *screen,
   c->Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers;
 assert(c->MaxCombinedAtomicBuffers <= MAX_COMBINED_ATOMIC_BUFFERS);

-   if (c->MaxCombinedAtomicBuffers > 0)
+   if (c->MaxCombinedAtomicBuffers > 0) {
extensions->ARB_shader_atomic_counters = GL_TRUE;
+  extensions->ARB_shader_atomic_counter_ops = GL_TRUE;
+   }


I believe there's pre-GCN AMD hardware which can support atomic counters 
but not atomic_counter_ops (at least according to what the closed driver 
exposes, I haven't actually checked the docs), so there should probably 
be a capability flag here.
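
Something along these lines, as a sketch only. DriverCaps, its atomicCounterOps field and initAtomicExtensions() are invented for the example; no such cap exists today and the real code would query the pipe_screen:

```cpp
#include <cassert>

// Hypothetical summary of what the driver reports.
struct DriverCaps {
   unsigned maxCombinedAtomicBuffers;
   bool atomicCounterOps; // made-up capability flag
};

struct Extensions {
   bool ARB_shader_atomic_counters = false;
   bool ARB_shader_atomic_counter_ops = false;
};

// Expose the counter ops only when the driver says it can do them,
// instead of tying them unconditionally to atomic counter support.
static Extensions initAtomicExtensions(const DriverCaps &caps)
{
   Extensions ext;
   if (caps.maxCombinedAtomicBuffers > 0) {
      ext.ARB_shader_atomic_counters = true;
      ext.ARB_shader_atomic_counter_ops = caps.atomicCounterOps;
   }

   // The ops extension never makes sense without the counters themselves.
   assert(!ext.ARB_shader_atomic_counter_ops || ext.ARB_shader_atomic_counters);
   return ext;
}
```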




 c->MaxCombinedShaderOutputResources = c->MaxDrawBuffers;
 c->ShaderStorageBufferOffsetAlignment =
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 943582d..fe6d58b 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3163,8 +3163,8 @@ void
  glsl_to_tgsi_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
  {
 const char *callee = ir->callee->function_name();
-   ir_dereference *deref = static_cast<ir_dereference *>(
-      ir->actual_parameters.get_head());
+   exec_node *param = ir->actual_parameters.get_head();
+   ir_dereference *deref = static_cast<ir_dereference *>(param);
 ir_variable *location = deref->variable_referenced();

 st_src_reg buffer(
@@ -3193,17 +3193,56 @@ 
glsl_to_tgsi_visitor::visit_atomic_counter_intrinsic(ir_call *ir)

 if (!strcmp("__intrinsic_atomic_read", callee)) {
inst = emit_asm(ir, TGSI_OPCODE_LOAD, dst, offset);
-  inst->buffer = buffer;
 } else if (!strcmp("__intrinsic_atomic_increment", callee)) {
inst = emit_asm(ir, TGSI_OPCODE_ATOMUADD, dst, offset,
st_src_reg_for_int(1));
-  inst->buffer = buffer;
 } else if (!strcmp("__intrinsic_atomic_predecrement", callee)) {
inst = emit_asm(ir, TGSI_OPCODE_ATOMUADD, dst, offset,
st_src_reg_for_int(-1));
-  inst->buffer = buffer;
emit_asm(ir, TGSI_OPCODE_ADD, dst, this->result, 
st_src_reg_for_int(-1));
+   } else {
+  param = param->get_next();
+  ir_rvalue *val = ((ir_instruction *)param)->as_rvalue();
+  val->accept(this);
+
+  st_src_reg data = this->result, data2 = undef_src;
+  unsigned opcode;
+  if (!strcmp("__intrinsic_atomic_add", callee))
+ opcode = TGSI_OPCODE_ATOMUADD;
+  else if (!strcmp("__intrinsic_atomic_min", callee))
+ opcode = TGSI_OPCODE_ATOMIMIN;
+  else if (!strcmp("__intrinsic_atomic_max", callee))
+ opcode = TGSI_OPCODE_ATOMIMAX;
+  else if (!strcmp("__intrinsic_atomic_and", callee))
+ opcode = TGSI_OPCODE_ATOMAND;
+  else if (!strcmp("__intrinsic_atomic_or", callee))
+ opcode = TGSI_OPCODE_ATOMOR;
+  else if (!strcmp("__intrinsic_atomic_xor", callee))
+ opcode = TGSI_OPCODE_ATOMXOR;
+  else if (!strcmp("__intrinsic_atomic_exchange", callee))
+ opcode = TGSI_OPCODE_ATOMXCHG;
+  else if (!strcmp("__intrinsic_atomic_comp_swap", callee)) {
+ opcode = TGSI_OPCODE_ATOMCAS;
+ param = param->get_next();
+ val = ((ir_instruction *)param)->as_rvalue();
+ val->accept(this);
+ data2 = this->result;
+  } else if (!strcmp("__intrinsic_atomic_sub", callee)) {
+ opcode = TGSI_OPCODE_ATOMUADD;
+ st_src_reg res = get_temp(glsl_type::uvec4_type);
+ st_dst_reg dstres = st_dst_reg(res);
+ dstres.writemask = dst.writemask;
+ emit_asm(ir, TGSI_OPCODE_INEG, dstres, data);
+ data = res;
+  } else {
+ assert(!"Unexpected intrinsic");
+ return;
+  }
+
+  inst = emit_asm(ir, opcode, dst, offset, data, data2);
 }
+
+   inst->buffer = buffer;


You could refactor this a bit further so that all intrinsics go through 
the same emit_asm call, but that's a minor point.
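
One possible shape for that, purely as an illustration: a lookup table from intrinsic name to opcode plus the bits that differ per intrinsic, so the visitor can do a single lookup followed by a single emit_asm. The Op enum and AtomicCounterOp struct are placeholders I made up; the opcode choices mirror the ones in the patch:

```cpp
#include <cassert>
#include <cstring>

// Placeholder opcodes mirroring the TGSI names used in the patch; the
// numeric values are arbitrary and only serve the illustration.
enum Op {
   OP_LOAD, OP_ATOMUADD, OP_ATOMIMIN, OP_ATOMIMAX, OP_ATOMAND,
   OP_ATOMOR, OP_ATOMXOR, OP_ATOMXCHG, OP_ATOMCAS
};

// One row per intrinsic (hypothetical, not part of the patch).
struct AtomicCounterOp {
   const char *callee;
   Op opcode;
   unsigned params; // data operands read from the call: 0, 1 or 2
   int imm;         // immediate data operand when params == 0 (0 = none)
   bool negate;     // negate the data operand first (sub as add of -x)
};

static const AtomicCounterOp atomic_counter_ops[] = {
   { "__intrinsic_atomic_read",         OP_LOAD,     0,  0, false },
   { "__intrinsic_atomic_increment",    OP_ATOMUADD, 0,  1, false },
   { "__intrinsic_atomic_predecrement", OP_ATOMUADD, 0, -1, false },
   { "__intrinsic_atomic_add",          OP_ATOMUADD, 1,  0, false },
   { "__intrinsic_atomic_sub",          OP_ATOMUADD, 1,  0, true  },
   { "__intrinsic_atomic_min",          OP_ATOMIMIN, 1,  0, false },
   { "__intrinsic_atomic_max",          OP_ATOMIMAX, 1,  0, false },
   { "__intrinsic_atomic_and",          OP_ATOMAND,  1,  0, false },
   { "__intrinsic_atomic_or",           OP_ATOMOR,   1,  0, false },
   { "__intrinsic_atomic_xor",          OP_ATOMXOR,  1,  0, false },
   { "__intrinsic_atomic_exchange",     OP_ATOMXCHG, 1,  0, false },
   { "__intrinsic_atomic_comp_swap",    OP_ATOMCAS,  2,  0, false },
};

// Returns the table row for `callee`, or nullptr for an unknown intrinsic.
static const AtomicCounterOp *lookup_atomic_counter_op(const char *callee)
{
   assert(callee != nullptr);
   for (const AtomicCounterOp &op : atomic_counter_ops) {
      if (!std::strcmp(op.callee, callee))
         return &op;
   }
   return nullptr;
}
```

The predecrement case would still need its trailing ADD after the common emit, so it is not completely uniform.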


Cheers,
Nicolai


  }

  void
@@ -3596,7 +3635,16 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
 /* Filter out intrinsics */
 if (!strcmp("__intrinsic_atomic_read", callee) ||
 !strcmp("__intrinsic_atomic_increment", callee) ||
-   !strcmp("__intrinsic_atomic_predecrement", callee)) {
+   !strcmp("__intrinsic_atomic_predecrement", callee) ||
+   !strcmp("__intrinsic_atomic_add", callee) ||
+   !strcmp("__intrinsic_atomic_sub", callee) ||
+   

Re: [Mesa-dev] [PATCH 1/2] mesa: add GL_ARB_shader_atomic_counter_ops support

2016-03-10 Thread Nicolai Hähnle

I'm not super familiar with this code, but it looks good to me, so:

Reviewed-by: Nicolai Hähnle 

On 20.02.2016 00:13, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 
---
  src/compiler/glsl/builtin_functions.cpp  | 110 +++
  src/compiler/glsl/glcpp/glcpp-parse.y|   3 +
  src/compiler/glsl/glsl_parser_extras.cpp |   1 +
  src/compiler/glsl/glsl_parser_extras.h   |   2 +
  src/mesa/main/extensions_table.h |   1 +
  src/mesa/main/mtypes.h   |   1 +
  6 files changed, 118 insertions(+)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index b862da0..d4dc271 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -439,6 +439,12 @@ shader_atomic_counters(const _mesa_glsl_parse_state *state)
  }

  static bool
+shader_atomic_counter_ops(const _mesa_glsl_parse_state *state)
+{
+   return state->ARB_shader_atomic_counter_ops_enable;
+}
+
+static bool
  shader_clock(const _mesa_glsl_parse_state *state)
  {
 return state->ARB_shader_clock_enable;
@@ -819,8 +825,14 @@ private:
 B1(interpolateAtSample)

 ir_function_signature 
*_atomic_counter_intrinsic(builtin_available_predicate avail);
+   ir_function_signature 
*_atomic_counter_intrinsic1(builtin_available_predicate avail);
+   ir_function_signature 
*_atomic_counter_intrinsic2(builtin_available_predicate avail);
 ir_function_signature *_atomic_counter_op(const char *intrinsic,
   builtin_available_predicate 
avail);
+   ir_function_signature *_atomic_counter_op1(const char *intrinsic,
+  builtin_available_predicate 
avail);
+   ir_function_signature *_atomic_counter_op2(const char *intrinsic,
+  builtin_available_predicate 
avail);

 ir_function_signature *_atomic_intrinsic2(builtin_available_predicate 
avail,
   const glsl_type *type);
@@ -983,48 +995,59 @@ builtin_builder::create_intrinsics()
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
+NULL);
+   add_function("__intrinsic_atomic_sub",
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_min",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_max",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_and",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_or",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_xor",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_exchange",
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::uint_type),
  _atomic_intrinsic2(buffer_atomics_supported,
 glsl_type::int_type),
+_atomic_counter_intrinsic1(shader_atomic_counter_ops),
  NULL);
 add_function("__intrinsic_atomic_comp_swap",
  

Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Samuel Pitoiset



On 03/10/2016 04:43 PM, Ilia Mirkin wrote:

On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
 wrote:



On 03/10/2016 04:23 PM, Ilia Mirkin wrote:


On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede 
wrote:


Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede 
---
   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18
+++---
   1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index a8258af..de0c72b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int
idx, int c, uint32_t address)

  sym->reg.fileIndex = fileIdx;

-   if (tgsiFile == TGSI_FILE_MEMORY &&
-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
-  sym->setFile(FILE_MEMORY_SHARED);
+   if (tgsiFile == TGSI_FILE_MEMORY) {
+  switch (code->memoryFiles[fileIdx].mem_type) {
+  case TGSI_MEMORY_TYPE_SHARED:
+ sym->setFile(FILE_MEMORY_SHARED);
+ break;
+  case TGSI_MEMORY_TYPE_INPUT:
+ assert(prog->getType() == Program::TYPE_COMPUTE);
+ assert(idx == -1);
+ sym->setFile(FILE_SHADER_INPUT);
+ address += info->prop.cp.inputOffset;



What's the idea here? i.e. what is the inputOffset, how is it set, and
why?



I don't get the idea either, btw.

But prop.cp.inputOffset is only defined for compute on Kepler. It's the
offset of input parameters in the screen->parm BO but as you already know,
it is going to be removed because the idea is to use screen->uniform_bo
instead. I'll do this change *after* the compute shaders support on Kepler.


Actually looks like it's only set for nv50 that I can see, shifting
things over by 0x10. It used to be reflected by getResourceBase, but
we broke that abstraction... might be nice to get it back somehow,
perhaps by sending more arguments down to getResourceBase? Either way,
that can be done later. This patch is


Oh yes, I was confused with prop.cp.gridInfoBase on Kepler...



Reviewed-by: Ilia Mirkin 



--
-Samuel


Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 10:27 AM, Samuel Pitoiset
 wrote:
>
>
> On 03/10/2016 04:23 PM, Ilia Mirkin wrote:
>>
>> On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede 
>> wrote:
>>>
>>> Add support for clover / OpenCL kernel input parameters.
>>>
>>> Signed-off-by: Hans de Goede 
>>> ---
>>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18
>>> +++---
>>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> index a8258af..de0c72b 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> @@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int
>>> idx, int c, uint32_t address)
>>>
>>>  sym->reg.fileIndex = fileIdx;
>>>
>>> -   if (tgsiFile == TGSI_FILE_MEMORY &&
>>> -   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
>>> -  sym->setFile(FILE_MEMORY_SHARED);
>>> +   if (tgsiFile == TGSI_FILE_MEMORY) {
>>> +  switch (code->memoryFiles[fileIdx].mem_type) {
>>> +  case TGSI_MEMORY_TYPE_SHARED:
>>> + sym->setFile(FILE_MEMORY_SHARED);
>>> + break;
>>> +  case TGSI_MEMORY_TYPE_INPUT:
>>> + assert(prog->getType() == Program::TYPE_COMPUTE);
>>> + assert(idx == -1);
>>> + sym->setFile(FILE_SHADER_INPUT);
>>> + address += info->prop.cp.inputOffset;
>>
>>
>> What's the idea here? i.e. what is the inputOffset, how is it set, and
>> why?
>
>
> I don't get the idea either, btw.
>
> But prop.cp.inputOffset is only defined for compute on Kepler. It's the
> offset of input parameters in the screen->parm BO but as you already know,
> it is going to be removed because the idea is to use screen->uniform_bo
> instead. I'll do this change *after* the compute shaders support on Kepler.

Actually looks like it's only set for nv50 that I can see, shifting
things over by 0x10. It used to be reflected by getResourceBase, but
we broke that abstraction... might be nice to get it back somehow,
perhaps by sending more arguments down to getResourceBase? Either way,
that can be done later. This patch is

Reviewed-by: Ilia Mirkin 


Re: [Mesa-dev] [PATCH mesa 2/3] tgsi: Add support for global / local / input MEMORY

2016-03-10 Thread Aaron Watry
On Thu, Mar 10, 2016 at 9:14 AM, Hans de Goede  wrote:

> Extend the MEMORY file support to differentiate between global, local
> and shared memory, as well as "input" memory.
>
> "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
> special memory type is added for this, since the actual storage of these
> (e.g. UBO-s) may differ per implementation. The uploading of kernel
> parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers
> to use an access mechanism for parameter reads which matches with the
> upload method.
>
> Signed-off-by: Hans de Goede 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c|  8 +++
>  src/gallium/auxiliary/tgsi/tgsi_dump.c |  9 ++--
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 14 ++--
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 25
> --
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h |  2 +-
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  7 +++---
>  src/gallium/include/pipe/p_shader_tokens.h | 10 +++--
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
>  8 files changed, 51 insertions(+), 26 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c
> b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index c420ae1..b108ade 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -111,7 +111,7 @@ tgsi_default_declaration( void )
> declaration.Local = 0;
> declaration.Array = 0;
> declaration.Atomic = 0;
> -   declaration.Shared = 0;
> +   declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
> declaration.Padding = 0;
>
> return declaration;
> @@ -128,7 +128,7 @@ tgsi_build_declaration(
> unsigned local,
> unsigned array,
> unsigned atomic,
> -   unsigned shared,
> +   unsigned mem_type,
> struct tgsi_header *header )
>  {
> struct tgsi_declaration declaration;
> @@ -146,7 +146,7 @@ tgsi_build_declaration(
> declaration.Local = local;
> declaration.Array = array;
> declaration.Atomic = atomic;
> -   declaration.Shared = shared;
> +   declaration.MemType = mem_type;
> header_bodysize_grow( header );
>
> return declaration;
> @@ -406,7 +406,7 @@ tgsi_build_full_declaration(
>full_decl->Declaration.Local,
>full_decl->Declaration.Array,
>full_decl->Declaration.Atomic,
> -  full_decl->Declaration.Shared,
> +  full_decl->Declaration.MemType,
>header );
>
> if (maxsize <= size)
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c
> b/src/gallium/auxiliary/tgsi/tgsi_dump.c
> index f232f38..273f0ae 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
> @@ -365,8 +365,13 @@ iter_declaration(
> }
>
> if (decl->Declaration.File == TGSI_FILE_MEMORY) {
> -  if (decl->Declaration.Shared)
> - TXT(", SHARED");
> +  switch (decl->Declaration.MemType) {
> +  /* Note: ,GLOBAL is optional / the default */
> +  case TGSI_MEMORY_TYPE_GLOBAL: TXT(", GLOBAL"); break;
> +  case TGSI_MEMORY_TYPE_LOCAL:  TXT(", LOCAL");  break;
> +  case TGSI_MEMORY_TYPE_SHARED: TXT(", SHARED"); break;
> +  case TGSI_MEMORY_TYPE_INPUT:  TXT(", INPUT");  break;
> +  }
> }
>
> if (decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c
> b/src/gallium/auxiliary/tgsi/tgsi_text.c
> index 77598d2..9438e3b 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_text.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
> @@ -1390,8 +1390,18 @@ static boolean parse_declaration( struct
> translate_ctx *ctx )
>  ctx->cur = cur;
>   }
>} else if (file == TGSI_FILE_MEMORY) {
> - if (str_match_nocase_whole(&cur, "SHARED")) {
> -decl.Declaration.Shared = 1;
> + if (str_match_nocase_whole(&cur, "GLOBAL")) {
> +/* Note this is a no-op global is the default */
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "LOCAL")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_LOCAL;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "SHARED")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_SHARED;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "INPUT")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_INPUT;
>  ctx->cur = cur;
>   }
>} else {
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> index e1a7278..9e10044 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> @@ -190,7 +190,7 @@ struct ureg_program
>
> struct ureg_tokens domain[2];
>
> -   bool use_shared_memory;
> +   bool 

Re: [Mesa-dev] GLX extension for vendor name lookup in libglvnd

2016-03-10 Thread Kyle Brenneman


On 03/09/2016 12:53 PM, Kyle Brenneman wrote:

On 03/09/2016 12:21 PM, Adam Jackson wrote:

On Wed, 2016-03-09 at 11:15 -0700, Kyle Brenneman wrote:

The current implementation of libglvnd uses a new X extension called
x11glvnd to look up a vendor name for each screen and to find a screen
number for a GLXDrawable.

But, Adam Jackson pointed out that a GLX extension could do the same job
more cleanly: Looking up a vendor name is just querying a per-screen
string, which GLXQueryServerString does. Looking up a screen number for
a drawable could work by adding a GLX_SCREEN attribute to the
GLXGetDrawableAttributes reply.

Based on that idea, I've written up a rough draft of a GLX extension
spec. Any comments, questions, or suggestions are welcome, of course.

Argh, you beat me to it, I'd written almost exactly the same thing. I
just pushed an update to my serverstring branch on github implementing what
I'd spec'd, details below...

Ah, sorry about that. I should have mentioned that I was working on it.

New Tokens

  Accepted by the <name> parameter of glXQueryServerString:

  GLX_VENDOR_NAMES_EXT0x

Perhaps easier than getting an enum allocated here, I'd appended this
string to the end of the response for GLX_VERSION, in the form

 glvnd:<list>

where list is comma-separated, since that part of the string is already
"vendor-specific info".

That could work, although I would expect "vendor-specific info" to mean
"random, arbitrary, and probably not machine-parsable". I'd be hesitant to
try to impose a structure on something that's never had any structure before.
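
For reference, parsing the "glvnd:" variant on the client side would look roughly like the sketch below. parseGlvndVendors() is a hypothetical helper, and it assumes the comma-separated list runs from the "glvnd:" tag to the next space or the end of the string, which nothing currently guarantees:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pull the vendor names out of a GLX_VERSION-style string that carries a
// "glvnd:" tag followed by a comma-separated list. Returns an empty vector
// if the tag is missing.
static std::vector<std::string> parseGlvndVendors(const std::string &version)
{
   std::vector<std::string> vendors;
   const std::string tag = "glvnd:";

   std::size_t pos = version.find(tag);
   if (pos == std::string::npos)
      return vendors;
   pos += tag.size();

   // Assumption: the list ends at the next space or at the end of the string.
   std::size_t end = version.find(' ', pos);
   std::string list = version.substr(
      pos, end == std::string::npos ? std::string::npos : end - pos);

   // Split on commas, skipping empty entries.
   std::size_t start = 0;
   while (start <= list.size()) {
      std::size_t comma = list.find(',', start);
      if (comma == std::string::npos)
         comma = list.size();
      if (comma > start)
         vendors.push_back(list.substr(start, comma - start));
      start = comma + 1;
   }
   return vendors;
}
```

Note that a plain find() would also match the tag inside some other vendor's free-form text, which is the kind of fragility I'm worried about.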


Agreed with your rationale in the Issues section. I'd also had:

 1) Do we need to define the interaction with GLX_SGIX_pbuffer?

UNRESOLVED.  Xorg uses the same code paths for the 1.3 and
pbuffer versions of GetDrawableAttributes, but extra attributes
are probably harmless.

We probably don't need to -- as you say, extra attributes are likely
harmless. I'd guess that any system that supports libglvnd is going to
support at least GLX 1.3, so using glXQueryDrawable to look up the
screen number seems reasonable.


 2) Do we want to add GLX_SCREEN to the list of fbconfig attributes
as well?

UNRESOLVED.  glvnd does not need that information, but it would
be a natural orthogonality, and GLX_SGIX_fbconfig mentions it
though GLX 1.3 does not.

Possibly, but that wouldn't change the protocol at all. The screen
number is included in the glXGetFBConfigs request, so it wouldn't make
sense to add it to the reply as well. It would be up to the client to
keep track of it instead.

Oh, wait. Now that I think about it, GLX already provides a GLXFBConfig
to screen mapping in glXGetVisualFromFBConfig, and indirectly from
glXGetFBConfigs.


So, unless someone feels strongly otherwise, I think it would make the 
most sense to leave glXGetFBConfigAttrib as it is.




- ajax




Re: [Mesa-dev] [PATCH mesa 2/3] tgsi: Add support for global / local / input MEMORY

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:
> Extend the MEMORY file support to differentiate between global, local
> and shared memory, as well as "input" memory.
>
> "MEMORY[x], INPUT" is intended to access OpenCL kernel parameters, a
> special memory type is added for this, since the actual storage of these
> (e.g. UBO-s) may differ per implementation. The uploading of kernel
> parameters is handled by launch_grid, "MEMORY[x], INPUT" allows drivers
> to use an access mechanism for parameter reads which matches with the
> upload method.
>
> Signed-off-by: Hans de Goede 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c|  8 +++
>  src/gallium/auxiliary/tgsi/tgsi_dump.c |  9 ++--
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 14 ++--
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 25 
> --
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h |  2 +-
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  |  7 +++---
>  src/gallium/include/pipe/p_shader_tokens.h | 10 +++--
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
>  8 files changed, 51 insertions(+), 26 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
> b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index c420ae1..b108ade 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -111,7 +111,7 @@ tgsi_default_declaration( void )
> declaration.Local = 0;
> declaration.Array = 0;
> declaration.Atomic = 0;
> -   declaration.Shared = 0;
> +   declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
> declaration.Padding = 0;
>
> return declaration;
> @@ -128,7 +128,7 @@ tgsi_build_declaration(
> unsigned local,
> unsigned array,
> unsigned atomic,
> -   unsigned shared,
> +   unsigned mem_type,
> struct tgsi_header *header )
>  {
> struct tgsi_declaration declaration;
> @@ -146,7 +146,7 @@ tgsi_build_declaration(
> declaration.Local = local;
> declaration.Array = array;
> declaration.Atomic = atomic;
> -   declaration.Shared = shared;
> +   declaration.MemType = mem_type;
> header_bodysize_grow( header );
>
> return declaration;
> @@ -406,7 +406,7 @@ tgsi_build_full_declaration(
>full_decl->Declaration.Local,
>full_decl->Declaration.Array,
>full_decl->Declaration.Atomic,
> -  full_decl->Declaration.Shared,
> +  full_decl->Declaration.MemType,
>header );
>
> if (maxsize <= size)
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c 
> b/src/gallium/auxiliary/tgsi/tgsi_dump.c
> index f232f38..273f0ae 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c
> @@ -365,8 +365,13 @@ iter_declaration(
> }
>
> if (decl->Declaration.File == TGSI_FILE_MEMORY) {
> -  if (decl->Declaration.Shared)
> - TXT(", SHARED");
> +  switch (decl->Declaration.MemType) {
> +  /* Note: ,GLOBAL is optional / the default */
> +  case TGSI_MEMORY_TYPE_GLOBAL: TXT(", GLOBAL"); break;
> +  case TGSI_MEMORY_TYPE_LOCAL:  TXT(", LOCAL");  break;
> +  case TGSI_MEMORY_TYPE_SHARED: TXT(", SHARED"); break;
> +  case TGSI_MEMORY_TYPE_INPUT:  TXT(", INPUT");  break;
> +  }
> }
>
> if (decl->Declaration.File == TGSI_FILE_SAMPLER_VIEW) {
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
> b/src/gallium/auxiliary/tgsi/tgsi_text.c
> index 77598d2..9438e3b 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_text.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
> @@ -1390,8 +1390,18 @@ static boolean parse_declaration( struct translate_ctx 
> *ctx )
>  ctx->cur = cur;
>   }
>} else if (file == TGSI_FILE_MEMORY) {
> - if (str_match_nocase_whole(&cur, "SHARED")) {
> -decl.Declaration.Shared = 1;
> + if (str_match_nocase_whole(&cur, "GLOBAL")) {
> +/* Note this is a no-op global is the default */
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_GLOBAL;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "LOCAL")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_LOCAL;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "SHARED")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_SHARED;
> +ctx->cur = cur;
> + } else if (str_match_nocase_whole(&cur, "INPUT")) {
> +decl.Declaration.MemType = TGSI_MEMORY_TYPE_INPUT;
>  ctx->cur = cur;
>   }
>} else {
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> index e1a7278..9e10044 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> @@ -190,7 +190,7 @@ struct ureg_program
>
> struct ureg_tokens domain[2];
>
> -   bool use_shared_memory;
> +   bool 
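
The dump and parse hunks quoted above are two halves of one mapping: each memory type has a text keyword, the dumper prints it, and the text parser matches it case-insensitively. A minimal sketch of that round trip follows. The enum, the name table, and all three helpers are hypothetical stand-ins written for illustration; they are not the definitions from p_shader_tokens.h or tgsi_text.c.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-ins for the TGSI_MEMORY_TYPE_* values. */
enum mem_type {
   MEM_TYPE_GLOBAL,   /* the default, so the keyword is optional in text */
   MEM_TYPE_LOCAL,
   MEM_TYPE_SHARED,
   MEM_TYPE_INPUT,
   MEM_TYPE_COUNT
};

/* One table drives both directions, so they cannot drift apart. */
static const char *const mem_type_names[MEM_TYPE_COUNT] = {
   "GLOBAL", "LOCAL", "SHARED", "INPUT"
};

/* Case-insensitive comparison of two whole strings. */
static int equal_nocase(const char *a, const char *b)
{
   while (*a && *b) {
      if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
         return 0;
      a++;
      b++;
   }
   return *a == '\0' && *b == '\0';
}

/* Dump direction: memory type to keyword. */
static const char *mem_type_to_text(enum mem_type type)
{
   assert(type < MEM_TYPE_COUNT);
   return mem_type_names[type];
}

/* Parse direction: keyword to memory type.
 * Returns 1 on a match; returns 0 and leaves *out untouched otherwise. */
static int mem_type_from_text(const char *word, enum mem_type *out)
{
   for (int i = 0; i < MEM_TYPE_COUNT; i++) {
      if (equal_nocase(word, mem_type_names[i])) {
         *out = (enum mem_type)i;
         return 1;
      }
   }
   return 0;
}
```

Driving both directions from a single table is a design choice of this sketch, not something the patch does; the patch keeps a switch in the dumper and an if/else chain in the parser.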

Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Samuel Pitoiset



On 03/10/2016 04:23 PM, Ilia Mirkin wrote:

On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:

Add support for clover / OpenCL kernel input parameters.

Signed-off-by: Hans de Goede 
---
  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18 +++---
  1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index a8258af..de0c72b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int idx, 
int c, uint32_t address)

 sym->reg.fileIndex = fileIdx;

-   if (tgsiFile == TGSI_FILE_MEMORY &&
-   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
-  sym->setFile(FILE_MEMORY_SHARED);
+   if (tgsiFile == TGSI_FILE_MEMORY) {
+  switch (code->memoryFiles[fileIdx].mem_type) {
+  case TGSI_MEMORY_TYPE_SHARED:
+ sym->setFile(FILE_MEMORY_SHARED);
+ break;
+  case TGSI_MEMORY_TYPE_INPUT:
+ assert(prog->getType() == Program::TYPE_COMPUTE);
+ assert(idx == -1);
+ sym->setFile(FILE_SHADER_INPUT);
+ address += info->prop.cp.inputOffset;


What's the idea here? i.e. what is the inputOffset, how is it set, and why?


I don't get the idea either, btw.

But prop.cp.inputOffset is only defined for compute on Kepler. It's the 
offset of the input parameters in the screen->parm BO but, as you already 
know, it is going to be removed because the idea is to use 
screen->uniform_bo instead. I'll make this change *after* compute 
shader support for Kepler has landed.




   -ilia


+ break;
+  default:
+ assert(0); /* TODO: Add support for global and local memory */
+  }
+   }

 if (idx >= 0) {
if (sym->reg.file == FILE_SHADER_INPUT)
--
2.7.2



--
-Samuel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
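
To make the question in the message above concrete: the switch in the quoted hunk picks a backend file per memory type and, for INPUT only, shifts the address by a base offset, which the reply describes as the offset of the input parameters inside a larger buffer. A minimal sketch of that logic, assuming nothing about the real nv50_ir classes; MemType, DataFile, Symbol and resolveMemory are hypothetical names, and inputBase merely models prop.cp.inputOffset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins, not the real TGSI or nv50_ir definitions.
enum class MemType { Global, Local, Shared, Input };
enum class DataFile { MemoryShared, ShaderInput };

struct Symbol {
   DataFile file;
   uint32_t address;
};

// Resolve a memory access to a backend file and address.
// inputBase is where kernel parameters start inside the parameter buffer.
// Returns false for types the quoted patch leaves as a TODO (global, local).
static bool resolveMemory(MemType type, uint32_t address, uint32_t inputBase,
                          Symbol &sym)
{
   switch (type) {
   case MemType::Shared:
      sym = { DataFile::MemoryShared, address };
      return true;
   case MemType::Input:
      // Parameter reads are relative to the start of the parameter area.
      sym = { DataFile::ShaderInput, address + inputBase };
      return true;
   default:
      return false;
   }
}
```

With an input base of 0x20, a parameter at offset 8 resolves to address 0x28 in the input file, while a shared access at offset 8 stays at 8.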


Re: [Mesa-dev] [PATCH mesa 1/3] tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text

2016-03-10 Thread Samuel Pitoiset

Reviewed-by: Samuel Pitoiset 


On 03/10/2016 04:14 PM, Hans de Goede wrote:

When support for decl.Atomic and .Shared was added, tgsi_build_declaration
was not updated to propagate these properly.

Signed-off-by: Hans de Goede 
---
  src/gallium/auxiliary/tgsi/tgsi_build.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
b/src/gallium/auxiliary/tgsi/tgsi_build.c
index cfe9b92..c420ae1 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -127,6 +127,8 @@ tgsi_build_declaration(
 unsigned invariant,
 unsigned local,
 unsigned array,
+   unsigned atomic,
+   unsigned shared,
 struct tgsi_header *header )
  {
 struct tgsi_declaration declaration;
@@ -143,6 +145,8 @@ tgsi_build_declaration(
 declaration.Invariant = invariant;
 declaration.Local = local;
 declaration.Array = array;
+   declaration.Atomic = atomic;
+   declaration.Shared = shared;
 header_bodysize_grow( header );

 return declaration;
@@ -401,6 +405,8 @@ tgsi_build_full_declaration(
full_decl->Declaration.Invariant,
full_decl->Declaration.Local,
full_decl->Declaration.Array,
+  full_decl->Declaration.Atomic,
+  full_decl->Declaration.Shared,
header );

 if (maxsize <= size)



--
-Samuel
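
The bug fixed by the patch quoted above follows a common pattern: a builder starts from zeroed defaults and copies only the fields it receives as parameters, so a field added to the struct later is silently dropped until the builder's signature is extended too. A minimal sketch of the before and after shapes, using a hypothetical trimmed-down struct; decl_token and the three functions are illustrative names, not the real tgsi_build.c code.

```c
#include <assert.h>

/* Hypothetical, trimmed-down declaration token. */
struct decl_token {
   unsigned Local  : 1;
   unsigned Array  : 1;
   unsigned Atomic : 1;
   unsigned Shared : 1;
};

/* All-zero defaults: any field nobody assigns afterwards stays 0. */
static struct decl_token default_decl(void)
{
   struct decl_token d = { 0 };
   return d;
}

/* Pre-fix shape: there are no atomic/shared parameters, so the caller
 * has no way to pass them and they keep their default of 0. */
static struct decl_token build_decl_old(unsigned local, unsigned array)
{
   struct decl_token d = default_decl();
   d.Local = local;
   d.Array = array;
   return d;
}

/* Post-fix shape: every field has a parameter and is copied across. */
static struct decl_token build_decl(unsigned local, unsigned array,
                                    unsigned atomic, unsigned shared)
{
   struct decl_token d = default_decl();
   assert(atomic <= 1 && shared <= 1);   /* both are 1-bit fields */
   d.Local = local;
   d.Array = array;
   d.Atomic = atomic;
   d.Shared = shared;
   return d;
}
```

Rebuilding a token that has Atomic and Shared set through build_decl_old yields a token with both cleared, which matches the symptom named in the subject line; build_decl preserves them.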


Re: [Mesa-dev] [PATCH mesa 1/3] tgsi: Fix decl.Atomic and .Shared not propagating when parsing tgsi text

2016-03-10 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:
> When support for decl.Atomic and .Shared was added, tgsi_build_declaration
> was not updated to propagate these properly.
>
> Signed-off-by: Hans de Goede 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c 
> b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index cfe9b92..c420ae1 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -127,6 +127,8 @@ tgsi_build_declaration(
> unsigned invariant,
> unsigned local,
> unsigned array,
> +   unsigned atomic,
> +   unsigned shared,
> struct tgsi_header *header )
>  {
> struct tgsi_declaration declaration;
> @@ -143,6 +145,8 @@ tgsi_build_declaration(
> declaration.Invariant = invariant;
> declaration.Local = local;
> declaration.Array = array;
> +   declaration.Atomic = atomic;
> +   declaration.Shared = shared;
> header_bodysize_grow( header );
>
> return declaration;
> @@ -401,6 +405,8 @@ tgsi_build_full_declaration(
>full_decl->Declaration.Invariant,
>full_decl->Declaration.Local,
>full_decl->Declaration.Array,
> +  full_decl->Declaration.Atomic,
> +  full_decl->Declaration.Shared,
>header );
>
> if (maxsize <= size)
> --
> 2.7.2
>


Re: [Mesa-dev] [PATCH mesa 3/3] nouveau: Add support for clover / OpenCL kernel input parameters

2016-03-10 Thread Ilia Mirkin
On Thu, Mar 10, 2016 at 10:14 AM, Hans de Goede  wrote:
> Add support for clover / OpenCL kernel input parameters.
>
> Signed-off-by: Hans de Goede 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 18 
> +++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index a8258af..de0c72b 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1523,9 +1523,21 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int 
> idx, int c, uint32_t address)
>
> sym->reg.fileIndex = fileIdx;
>
> -   if (tgsiFile == TGSI_FILE_MEMORY &&
> -   code->memoryFiles[fileIdx].mem_type == TGSI_MEMORY_TYPE_SHARED)
> -  sym->setFile(FILE_MEMORY_SHARED);
> +   if (tgsiFile == TGSI_FILE_MEMORY) {
> +  switch (code->memoryFiles[fileIdx].mem_type) {
> +  case TGSI_MEMORY_TYPE_SHARED:
> + sym->setFile(FILE_MEMORY_SHARED);
> + break;
> +  case TGSI_MEMORY_TYPE_INPUT:
> + assert(prog->getType() == Program::TYPE_COMPUTE);
> + assert(idx == -1);
> + sym->setFile(FILE_SHADER_INPUT);
> + address += info->prop.cp.inputOffset;

What's the idea here? i.e. what is the inputOffset, how is it set, and why?

  -ilia

> + break;
> +  default:
> + assert(0); /* TODO: Add support for global and local memory */
> +  }
> +   }
>
> if (idx >= 0) {
>if (sym->reg.file == FILE_SHADER_INPUT)
> --
> 2.7.2
>

