Re: [Mesa-dev] [PATCH] i965/cs: Initialize GPGPU Thread Count

2015-07-02 Thread Jordan Justen
On 2015-06-25 11:34:59, Ben Widawsky wrote:
> On Thu, Jun 11, 2015 at 09:04:45PM -0700, Jordan Justen wrote:
> > +   desc[dw++] = 0;
> > +   const uint32_t media_threads =
> > +  brw->gen >= 8 ?
> > +  SET_FIELD(threads, GEN8_MEDIA_GPGPU_THREAD_COUNT) :
> > +  SET_FIELD(threads, MEDIA_GPGPU_THREAD_COUNT);
> > +   desc[dw++] = media_threads;
> 
> What's the deal with, "The maximum value for global barriers is limited by the
> number of threads in the system, or by 511," Can we add an assert?

I guess we are using a local barrier, so "the maximum value is the
number of threads in a subslice for local barriers".

How about I add assert(threads <= brw->max_cs_threads)? Although, in
brw_compute:brw_emit_gpgpu_walker we have a similar assertion.

-Jordan

> >  
> > BEGIN_BATCH(4);
> > OUT_BATCH(MEDIA_INTERFACE_DESCRIPTOR_LOAD << 16 | (4 - 2));
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index f6da305..2a8f500 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -2495,6 +2495,11 @@ enum brw_wm_barycentric_interp_mode {
> >  # define MEDIA_VFE_STATE_CURBE_ALLOC_MASK   INTEL_MASK(15, 0)
> >  
> >  #define MEDIA_INTERFACE_DESCRIPTOR_LOAD 0x7002
> > +/* GEN7 DW5, GEN8+ DW6 */
> > +# define MEDIA_GPGPU_THREAD_COUNT_SHIFT 0
> > +# define MEDIA_GPGPU_THREAD_COUNT_MASK  INTEL_MASK(7, 0)
> > +# define GEN8_MEDIA_GPGPU_THREAD_COUNT_SHIFT0
> > +# define GEN8_MEDIA_GPGPU_THREAD_COUNT_MASK INTEL_MASK(9, 0)
> >  #define MEDIA_STATE_FLUSH   0x7004
> >  #define GPGPU_WALKER0x7105
> >  /* GEN8+ DW2 */
> > -- 
> > 2.1.4
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH mesa v2] i965/gen8+: bo in state base address must be in 32-bit address range

2015-07-02 Thread Michel Thierry

On 7/2/2015 8:21 AM, Chris Wilson wrote:

On Wed, Jul 01, 2015 at 04:28:10PM +0100, Michel Thierry wrote:

Gen8+ supports 48-bit virtual addresses, but some objects must always be
allocated inside the 32-bit address range.

 OUT_BATCH(0);
 OUT_BATCH(mocs_wb << 16);
 /* Surface state base address: */
-   OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
-   mocs_wb << 4 | 1);
+   OUT_RELOC(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
+ mocs_wb << 4 | 1);
+   OUT_BATCH(0);
 /* Dynamic state base address: */
-   OUT_RELOC64(brw->batch.bo,
-   I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
-   mocs_wb << 4 | 1);
+   OUT_RELOC(brw->batch.bo,
+ I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
+ mocs_wb << 4 | 1);
+   OUT_BATCH(0);


Note this is in general dangerous since it lies to the kernel about the
value written into the batch corresponding to the 64bit relocation.
(Aliasing a high object here isn't much of an issue since the relocation
will be forced by having to rebind the buffer low.)

Personally I would have stuck with the OUT_RELOC64 so that any future
cut'n'paste didn't have a subtle bug and that the wa was clearly
indicated by clearing the execobject flag for brw->batch.bo and
brw->cache.bo.


 /* Indirect object base address: MEDIA_OBJECT data */
 OUT_BATCH(mocs_wb << 4 | 1);
 OUT_BATCH(0);
 /* Instruction base address: shader kernels (incl. SIP) */
-   OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
-   mocs_wb << 4 | 1);
-
+   OUT_RELOC(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
+ mocs_wb << 4 | 1);
+   OUT_BATCH(0);
 /* General state buffer size */
 OUT_BATCH(0xf001);
 /* Dynamic state buffer size */
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 54081a1..220a35b 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -411,9 +411,9 @@ intel_batchbuffer_emit_reloc64(struct brw_context *brw,
 uint32_t read_domains, uint32_t write_domain,
   uint32_t delta)
  {
-   int ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
- buffer, delta,
- read_domains, write_domain);
+   int ret = drm_intel_bo_emit_reloc_48bit(brw->batch.bo, 4*brw->batch.used,
+   buffer, delta,
+   read_domains, write_domain);


I would have just exposed setting the flag on the execobject.  That way you
still have existing userspace safe by default, can set a
bufmgr-level flag to enable 48bit support by default and then
individually turn off 48bit support for the couple of buffers that
require it.

So, something more like v1? 
http://mid.gmane.org/1435062089-19877-2-git-send-email-michel.thie...@intel.com


(apart of the hacky macro name)


Just my 2c because I have seen what trouble lying to the kernel about
relocation values can cause and would rather not have that in the
interface by design.
-Chris


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nouveau: don't mark full range as used on unmap with explicit flush

2015-07-02 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/nouveau_buffer.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_buffer.c 
b/src/gallium/drivers/nouveau/nouveau_buffer.c
index 09cdbb5..c1702ad 100644
--- a/src/gallium/drivers/nouveau/nouveau_buffer.c
+++ b/src/gallium/drivers/nouveau/nouveau_buffer.c
@@ -537,8 +537,13 @@ nouveau_buffer_transfer_unmap(struct pipe_context *pipe,
struct nv04_resource *buf = nv04_resource(transfer->resource);
 
if (tx->base.usage & PIPE_TRANSFER_WRITE) {
-  if (!(tx->base.usage & PIPE_TRANSFER_FLUSH_EXPLICIT) && tx->map)
- nouveau_transfer_write(nv, tx, 0, tx->base.box.width);
+  if (!(tx->base.usage & PIPE_TRANSFER_FLUSH_EXPLICIT)) {
+ if (tx->map)
+nouveau_transfer_write(nv, tx, 0, tx->base.box.width);
+
+ util_range_add(&buf->valid_buffer_range,
+tx->base.box.x, tx->base.box.x + tx->base.box.width);
+  }
 
   if (likely(buf->domain)) {
  const uint8_t bind = buf->base.bind;
@@ -546,9 +551,6 @@ nouveau_buffer_transfer_unmap(struct pipe_context *pipe,
  if (bind & (PIPE_BIND_VERTEX_BUFFER | PIPE_BIND_INDEX_BUFFER))
 nv->vbo_dirty = TRUE;
   }
-
-  util_range_add(&buf->valid_buffer_range,
- tx->base.box.x, tx->base.box.x + tx->base.box.width);
}
 
if (!tx->bo && (tx->base.usage & PIPE_TRANSFER_WRITE))
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 6:23 PM, Erik Faye-Lund  wrote:
> On Thu, Jul 2, 2015 at 2:56 PM, Ilia Mirkin  wrote:
>> On Thu, Jul 2, 2015 at 5:54 PM, Matt Turner  wrote:
>>> On Thu, Jul 2, 2015 at 2:22 PM, Ilia Mirkin  wrote:
 Can this be done at dlopen/init time? For example what happens if you do

 static int foo = _mesa_locale_init()

 IIRC things like that are possible in C++, not sure about C.
>>>
>>> gcc has __attribute__((constructor)).
>>>
>>> But I don't think we really care... Erik's series converted the strtod
>>> code from C++ to C (including moving locale init from being a static
>>> constructor to being called in one_time_init) and fixing the memory
>>> leak.
>>
>> Well, this is just going to happen over and over again, I was hoping
>> there was an easy way to do static initializers in C. If not, then I
>> guess we're stuck with this.
>
> The good news is that the breakage was noticed real quick. I agree
> that it'd be awesome to have this happen automatically, but AFAIK
> there's no perfect solution for this:
>
> * C++ static object initializer leads to libc++ dependencies.
> * __attribute__((constructor)) is compiler specific.
> * Naively mutex-protecting initialization leads to overhead in the common 
> case.
> * Lighter-weight double checked locks are tricky to implement, and
>   also have some overhead
> * pthread_once() is not available on Windows, and have some overhead

Sounds good. Just wanted to make sure some alternatives were
considered before peppering main functions with _mesa_locale_init.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Erik Faye-Lund
On Thu, Jul 2, 2015 at 2:56 PM, Ilia Mirkin  wrote:
> On Thu, Jul 2, 2015 at 5:54 PM, Matt Turner  wrote:
>> On Thu, Jul 2, 2015 at 2:22 PM, Ilia Mirkin  wrote:
>>> Can this be done at dlopen/init time? For example what happens if you do
>>>
>>> static int foo = _mesa_locale_init()
>>>
>>> IIRC things like that are possible in C++, not sure about C.
>>
>> gcc has __attribute__((constructor)).
>>
>> But I don't think we really care... Erik's series converted the strtod
>> code from C++ to C (including moving locale init from being a static
>> constructor to being called in one_time_init) and fixing the memory
>> leak.
>
> Well, this is just going to happen over and over again, I was hoping
> there was an easy way to do static initializers in C. If not, then I
> guess we're stuck with this.

The good news is that the breakage was noticed real quick. I agree
that it'd be awesome to have this happen automatically, but AFAIK
there's no perfect solution for this:

* C++ static object initializer leads to libc++ dependencies.
* __attribute__((constructor)) is compiler specific.
* Naively mutex-protecting initialization leads to overhead in the common case.
* Lighter-weight double checked locks are tricky to implement, and
  also have some overhead
* pthread_once() is not available on Windows, and have some overhead

Generally, it leaves a bad taste in the mouth to have to know about
compiler-internals for being able to use the compiler. This could be
slightly mitigated by introducing something like _mesa_glsl_init(),
which means that call-sites only need to know about initializing the
compiler.

However, such a function would only call _mesa_locale_init() at this
point, so we need to know the same amount of things. I'd introduce
_mesa_glsl_init() only if it turns out we need to do more. Besides,
new call-sites for _mesa_strto{d,f} will crash visible and hard if
introduced without _mesa_locale_init.

So, while not being awesome, I think this is the lesser evil for now.

But maybe it would be more palatable to move this into
initialize_context_to_defaults() instead, which is a shared
initializer for stand-alone compilers? Perhaps it's more likely that
this function gets called by other non-driver call-sites... It would
also reduce the number of call-sites by one.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] egl: use unix defines on osx with clang

2015-07-02 Thread Julien Isorce
From: Julien Isorce 

I also created an bug in Khronos 's bugzilla as you suggested:
https://www.khronos.org/bugzilla/show_bug.cgi?id=1356
I'll let you know if I get feedback from this bug or else where.

Patch with updated error messages:

[PATCH] eglplatform: treat __APPLE__ the same way as __unix__ to handle X11 
types

  CC   eglapi.lo
./egldisplay.h:258:19: error: unknown type name 'Display'
_eglGetX11Display(Display *native_display, const EGLint *attrib_list);
eglapi.c:290:4: error: array size is negative
   STATIC_ASSERT(sizeof(void*) == sizeof(nativeDisplay));
eglapi.c:291:25: warning: cast to 'void *' from smaller integer type
   'EGLNativeDisplayType' (aka 'int') [-Wint-to-void-pointer-cast]
   native_display_ptr = (void*) nativeDisplay;
eglapi.c:307:32: error: use of undeclared identifier 'Display'
  dpy = _eglGetX11Display((Display*) native_display, attrib_list);
eglapi.c:776:35: error: use of undeclared identifier 'Window'
  native_window = (void*) (* (Window*) native_window);
eglapi.c:847:35: error: use of undeclared identifier 'Pixmap'
  native_pixmap = (void*) (* (Pixmap*) native_pixmap);

Bugzilla Mesa: https://bugs.freedesktop.org/show_bug.cgi?id=90249
Bugzilla Khronos: https://www.khronos.org/bugzilla/show_bug.cgi?id=1356
Signed-off-by: Julien Isorce 
---
 include/EGL/eglplatform.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h
index 7802542..b376e64 100644
--- a/include/EGL/eglplatform.h
+++ b/include/EGL/eglplatform.h
@@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType;
 typedef HBITMAP EGLNativePixmapType;
 typedef HWNDEGLNativeWindowType;
 
-#elif defined(__APPLE__) || defined(__WINSCW__) || defined(__SYMBIAN32__)  /* 
Symbian */
+#elif defined(__WINSCW__) || defined(__SYMBIAN32__)  /* Symbian */
 
 typedef int   EGLNativeDisplayType;
 typedef void *EGLNativeWindowType;
@@ -105,7 +105,7 @@ typedef struct ANativeWindow*   EGLNativeWindowType;
 typedef struct egl_native_pixmap_t* EGLNativePixmapType;
 typedef void*   EGLNativeDisplayType;
 
-#elif defined(__unix__)
+#elif defined(__unix__) || defined(__APPLE__)
 
 #if defined(MESA_EGL_NO_X11_HEADERS)
 
-- 
1.9.5 (Apple Git-50.3)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 5:54 PM, Matt Turner  wrote:
> On Thu, Jul 2, 2015 at 2:22 PM, Ilia Mirkin  wrote:
>> Can this be done at dlopen/init time? For example what happens if you do
>>
>> static int foo = _mesa_locale_init()
>>
>> IIRC things like that are possible in C++, not sure about C.
>
> gcc has __attribute__((constructor)).
>
> But I don't think we really care... Erik's series converted the strtod
> code from C++ to C (including moving locale init from being a static
> constructor to being called in one_time_init) and fixing the memory
> leak.

Well, this is just going to happen over and over again, I was hoping
there was an easy way to do static initializers in C. If not, then I
guess we're stuck with this.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Matt Turner
On Thu, Jul 2, 2015 at 2:18 PM, Erik Faye-Lund  wrote:
> After c61bc6e ("util: port _mesa_strto[df] to C"), "make check"
> fails due to a missing _mesa_locale_init. Fixup this oversight.
>
> Signed-off-by: Erik Faye-Lund 
> Tested-by: Vinson Lee 

Reviewed-by: Matt Turner 

I'll commit this soon.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Matt Turner
On Thu, Jul 2, 2015 at 2:22 PM, Ilia Mirkin  wrote:
> Can this be done at dlopen/init time? For example what happens if you do
>
> static int foo = _mesa_locale_init()
>
> IIRC things like that are possible in C++, not sure about C.

gcc has __attribute__((constructor)).

But I don't think we really care... Erik's series converted the strtod
code from C++ to C (including moving locale init from being a static
constructor to being called in one_time_init) and fixing the memory
leak.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-02 Thread Chris Forbes
Looks OK to me. I didn't think there was going to be much required to
make this work -- is nice that it turned out to be nothing.

Reviewed-by: Chris Forbes 

- Chris

On Fri, Jul 3, 2015 at 6:41 AM, Neil Roberts  wrote:
> There was a comment saying that in SIMD16 mode the pixel interpolator
> returns coords interleaved 8 channels at a time and that this requires
> extra work to support. However, this interleaved format is exactly
> what the PLN instruction requires so I don't think anything needs to
> be done to support it apart from removing the line to disable it and
> to ensure that the message lengths for the send message are correct.
>
> I am more convinced that this is correct because as it says in the
> comment this interleaved output is identical to what is given in the
> thread payload. The code generated to apply the plane equation to
> these coordinates is identical on SIMD16 and SIMD8 except that the
> dispatch width is larger which implies no special unmangling is
> needed.
>
> Perhaps the confusion stems from the fact that the description of the
> PLN instruction in the IVB PRM seems to imply that the src1 inputs are
> not interleaved so it wouldn't work. However, in the HSW and BDW PRMs,
> the pseudo-code is different and looks like it expects the interleaved
> format. Mesa doesn't seem to generate different code on IVB to
> uninterleave the payload registers and everything is working so I can
> only assume that the PRM is wrong.
>
> I tested the interpolateAt tests on HSW and did a full Piglit run on
> IVB on there were no regressions.
> ---
>
> I've CC'd Chris Forbes because according to git-annotate he wrote the
> original comment so he might know something I don't.
>
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 59081ea..717e597 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1461,12 +1461,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
> case nir_intrinsic_interp_var_at_centroid:
> case nir_intrinsic_interp_var_at_sample:
> case nir_intrinsic_interp_var_at_offset: {
> -  /* in SIMD16 mode, the pixel interpolator returns coords interleaved
> -   * 8 channels at a time, same as the barycentric coords presented in
> -   * the FS payload. this requires a bit of extra work to support.
> -   */
> -  no16("interpolate_at_* not yet supported in SIMD16 mode.");
> -
>fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
>
>/* For most messages, we need one reg of ignored data; the hardware
> @@ -1531,7 +1525,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
> bld.SEL(offset(src, i), itemp, fs_reg(7)));
>  }
>
> -mlen = 2;
> +mlen = 2 * dispatch_width / 8;
>  inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, 
> dst_xy, src,
>  fs_reg(0u));
>   }
> @@ -1543,7 +1537,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>}
>
>inst->mlen = mlen;
> -  inst->regs_written = 2; /* 2 floats per slot returned */
> +  /* 2 floats per slot returned */
> +  inst->regs_written = 2 * dispatch_width / 8;
>inst->pi_noperspective = instr->variables[0]->var->data.interpolation 
> ==
> INTERP_QUALIFIER_NOPERSPECTIVE;
>
> --
> 1.9.3
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Ilia Mirkin
Can this be done at dlopen/init time? For example what happens if you do

static int foo = _mesa_locale_init()

IIRC things like that are possible in C++, not sure about C.

On Thu, Jul 2, 2015 at 5:18 PM, Erik Faye-Lund  wrote:
> After c61bc6e ("util: port _mesa_strto[df] to C"), "make check"
> fails due to a missing _mesa_locale_init. Fixup this oversight.
>
> Signed-off-by: Erik Faye-Lund 
> Tested-by: Vinson Lee 
> ---
>  src/glsl/test.cpp | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/glsl/test.cpp b/src/glsl/test.cpp
> index b1ff92e..9dc7c47 100644
> --- a/src/glsl/test.cpp
> +++ b/src/glsl/test.cpp
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include "util/strtod.h"
>
>  #include "test_optpass.h"
>
> @@ -67,6 +68,9 @@ static const char *extract_command_from_argv(int *argc, 
> char **argv)
>  int main(int argc, char **argv)
>  {
> const char *command = extract_command_from_argv(&argc, argv);
> +
> +   _mesa_locale_init();
> +
> if (strcmp(command, "optpass") == 0) {
>return test_optpass(argc, argv);
> } else {
> --
> 2.1.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: add a missing call to _mesa_locale_init

2015-07-02 Thread Erik Faye-Lund
After c61bc6e ("util: port _mesa_strto[df] to C"), "make check"
fails due to a missing _mesa_locale_init. Fixup this oversight.

Signed-off-by: Erik Faye-Lund 
Tested-by: Vinson Lee 
---
 src/glsl/test.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/glsl/test.cpp b/src/glsl/test.cpp
index b1ff92e..9dc7c47 100644
--- a/src/glsl/test.cpp
+++ b/src/glsl/test.cpp
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include "util/strtod.h"
 
 #include "test_optpass.h"
 
@@ -67,6 +68,9 @@ static const char *extract_command_from_argv(int *argc, char 
**argv)
 int main(int argc, char **argv)
 {
const char *command = extract_command_from_argv(&argc, argv);
+
+   _mesa_locale_init();
+
if (strcmp(command, "optpass") == 0) {
   return test_optpass(argc, argv);
} else {
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 19:45, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:

On 02/07/15 17:49, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:


On 02/07/15 17:24, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
wrote:




Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...




I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.



It just won't have compiled the shaders, I think. I guess this could force
it.



AFAIK, tiles defer the _rendering_, not the compilation. At least llvmpipe
compiles everything at draw time.




That said, glretrace already does glReadPixels when dumping state, so one
way or the other, when inspecting state in qapitrace, everything will be
flushed and and synched.



But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?



Fair enough. It's the first thing after glDraw. Forget about glReadpixels.

I guess just still don't understand what's special about tilers.  But I
don't think it's pertinent now.


What's special about tilers is that they defer renders. Compiling the
program can similarly get deferred because they can. (And sometimes
entire renders get dropped due to clears, etc.) Should it get
deferred? Dunno. I don't even remember if freedreno defers
compilation, and never knew what vc4 did.





Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.




Doing this immediately after draw call is no problem at all. I don't think
it's worth complicating things by allowing a lag between draw and shader
extraction. It just makes things more unreliable which defeats the point.


Would it really complicate things though? Internally, it can never
drop the debug info since a program might later be reused wholesale
and there won't be another compilation, so it has to store the info on
the program object.


Program object might not exist (e.g. when debugging fixed-function).

And the concept of program object looses meaning in the downstream 
layers (e.g inside gallium pipe drivers, where TGSI can come from all 
sort of utility modules and not just GLSL).



I have little doubts: for this to be feasible, it's imperative this 
applies to the immediately validated state.  Our stack has too many 
layer to do anything else: it would be complex and buggy.



Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/chv|skl: Apply sampler bypass w/a

2015-07-02 Thread Matt Turner
On Thu, Jul 2, 2015 at 12:57 PM, Matt Turner  wrote:
> On Wed, Jul 1, 2015 at 4:03 PM, Ben Widawsky
>  wrote:
>> Certain compressed formats require this setting. The docs don't go into much
>> detail as to why it's needed exactly.
>>
>> This fixes 0 piglit failures with a GBM gpu piglit run.
>
> That's a really weird way of saying that.
>
>>
>> Signed-off-by: Ben Widawsky 
>> ---
>>
>> I had this one sitting around for almost 2 months. I'm not sure why I didn't
>> send it out sooner. It seems like it's needed.
>>
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h|  1 +
>>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 26 
>> ++
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index 66b9abc..f55fd49 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -276,6 +276,7 @@
>>  #define GEN8_SURFACE_TILING_W   (1 << 12)
>>  #define GEN8_SURFACE_TILING_X   (2 << 12)
>>  #define GEN8_SURFACE_TILING_Y   (3 << 12)
>> +#define GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE  (1 << 9)
>>  #define BRW_SURFACE_RC_READ_WRITE  (1 << 8)
>>  #define BRW_SURFACE_MIPLAYOUT_SHIFT10
>>  #define BRW_SURFACE_MIPMAPLAYOUT_BELOW   0
>> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
>> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> index ca5ed17..a245379 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> @@ -264,6 +264,19 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
>>surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
>> }
>>
>> +   if (brw->is_cherryview || brw->gen >= 9) {
>> +  /* "This bit must be set for the following surface types: BC2_UNORM
>> +   * BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM"
>
> Don't do naked quotes -- Use the normal style.
>
>> +   */
>> +  switch (format) {
>> +  case BRW_SURFACEFORMAT_BC2_UNORM:
>> +  case BRW_SURFACEFORMAT_BC3_UNORM:
>> +  case BRW_SURFACEFORMAT_BC5_SNORM:
>
> Missing BRW_SURFACEFORMAT_BC5_UNORM.
>
>> +  case BRW_SURFACEFORMAT_BC7_UNORM:
>> + surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
>
> It wouldn't surprise me if static analysis tools complain about the
> missing break.
>
> Add a break or make it an if statement (which would be two lines
> shorter). All together, how about
>
>/* From the CHV PRM, Volume 2d, page 321 (RENDER_SURFACE_STATE dword 0
> * bit 9 "Sampler L2 Bypass Mode Disable" Programming Notes):
> *
> *This bit must be set for the following surface types: BC2_UNORM
> *BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM
> */
>if (format == BRW_SURFACEFORMAT_BC2_UNORM ||
>format == BRW_SURFACEFORMAT_BC3_UNORM ||
>format == BRW_SURFACEFORMAT_BC5_SNORM ||
>format == BRW_SURFACEFORMAT_BC5_UNORM ||

Bah, would be nice to so BC5_UNORM and then BC5_SNORM to better match
the comment.

>format == BRW_SURFACEFORMAT_BC7_UNORM)
>   surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
>}
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/chv|skl: Apply sampler bypass w/a

2015-07-02 Thread Matt Turner
On Wed, Jul 1, 2015 at 4:03 PM, Ben Widawsky
 wrote:
> Certain compressed formats require this setting. The docs don't go into much
> detail as to why it's needed exactly.
>
> This fixes 0 piglit failures with a GBM gpu piglit run.

That's a really weird way of saying that.

>
> Signed-off-by: Ben Widawsky 
> ---
>
> I had this one sitting around for almost 2 months. I'm not sure why I didn't
> send it out sooner. It seems like it's needed.
>
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h|  1 +
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 26 
> ++
>  2 files changed, 27 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index 66b9abc..f55fd49 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -276,6 +276,7 @@
>  #define GEN8_SURFACE_TILING_W   (1 << 12)
>  #define GEN8_SURFACE_TILING_X   (2 << 12)
>  #define GEN8_SURFACE_TILING_Y   (3 << 12)
> +#define GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE  (1 << 9)
>  #define BRW_SURFACE_RC_READ_WRITE  (1 << 8)
>  #define BRW_SURFACE_MIPLAYOUT_SHIFT10
>  #define BRW_SURFACE_MIPMAPLAYOUT_BELOW   0
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index ca5ed17..a245379 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -264,6 +264,19 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
>surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
> }
>
> +   if (brw->is_cherryview || brw->gen >= 9) {
> +  /* "This bit must be set for the following surface types: BC2_UNORM
> +   * BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM"

Don't do naked quotes -- Use the normal style.

> +   */
> +  switch (format) {
> +  case BRW_SURFACEFORMAT_BC2_UNORM:
> +  case BRW_SURFACEFORMAT_BC3_UNORM:
> +  case BRW_SURFACEFORMAT_BC5_SNORM:

Missing BRW_SURFACEFORMAT_BC5_UNORM.

> +  case BRW_SURFACEFORMAT_BC7_UNORM:
> + surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;

It wouldn't surprise me if static analysis tools complain about the
missing break.

Add a break or make it an if statement (which would be two lines
shorter). All together, how about

   /* From the CHV PRM, Volume 2d, page 321 (RENDER_SURFACE_STATE dword 0
* bit 9 "Sampler L2 Bypass Mode Disable" Programming Notes):
*
*This bit must be set for the following surface types: BC2_UNORM
*BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM
*/
   if (format == BRW_SURFACEFORMAT_BC2_UNORM ||
   format == BRW_SURFACEFORMAT_BC3_UNORM ||
   format == BRW_SURFACEFORMAT_BC5_SNORM ||
   format == BRW_SURFACEFORMAT_BC5_UNORM ||
   format == BRW_SURFACEFORMAT_BC7_UNORM)
  surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
   }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] i965/vs: Fix matNxM vertex attributes where M != 4.

2015-07-02 Thread Mark Janes
This fixes 1256 tests on BDW.

Tested-by: Mark Janes 

Kenneth Graunke  writes:

> Matrix vertex attributes have their columns padded out to vec4s, which
> I was failing to account for.  Scalar NIR expects them to be packed,
> however.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> I still need to write proper Piglit tests for this.  We have basically a 
> single
> test for matrix vertex attributes, and that's a mat4 (which worked).
>
> But I figure we probably shouldn't hold up the bugfix on that.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index caf1300..37b1ed7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -91,12 +91,19 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
>* So, we need to copy from fs_reg(ATTR, var->location) to
>* offset(nir_inputs, var->data.driver_location).
>*/
> - unsigned components = var->type->without_array()->components();
> + const glsl_type *const t = var->type->without_array();
> + const unsigned components = t->components();
> + const unsigned cols = t->matrix_columns;
> + const unsigned elts = t->vector_elements;
>   unsigned array_length = var->type->is_array() ? var->type->length : 
> 1;
>   for (unsigned i = 0; i < array_length; i++) {
> -for (unsigned j = 0; j < components; j++) {
> -   bld.MOV(retype(offset(input, bld, components * i + j), type),
> -   offset(fs_reg(ATTR, var->data.location + i, type), 
> bld, j));
> +for (unsigned j = 0; j < cols; j++) {
> +   for (unsigned k = 0; k < elts; k++) {
> +  bld.MOV(offset(retype(input, type), bld,
> + components * i + elts * j + k),
> +  offset(fs_reg(ATTR, var->data.location + i, type),
> + bld, 4 * j + k));
> +   }
>  }
>   }
>   break;
> -- 
> 2.4.4
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-stable
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 2:59 PM, Eric Anholt  wrote:
> Ilia Mirkin  writes:
>
>> On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:
>>> On 02/07/15 17:49, Ilia Mirkin wrote:

 On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:
>>> Fair enough. It's the first thing after glDraw. Forget about glReadpixels.
>>>
>>> I guess just still don't understand what's special about tilers.  But I
>>> don't think it's pertinent now.
>>
>> What's special about tilers is that they defer renders. Compiling the
>> program can similarly get deferred because they can. (And sometimes
>> entire renders get dropped due to clears, etc.) Should it get
>> deferred? Dunno. I don't even remember if freedreno defers
>> compilation, and never knew what vc4 did.
>
> Wait, what?  I'm also writing a tiler driver, and I don't see how being
> a tiler makes *any* change to when compilation could happen.  I can't
> bin without my programs compiled, so I need my programs compiled at the
> point where I'm writing the command list, just like an immediate
> renderer.

I'm probably just confused then, so ignore me :) As Jose said, this
isn't actually relevant to the point at hand.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/11] mesa: implement _mesa_GetTextureSubImage()

2015-07-02 Thread Brian Paul

On 7/2/15 1:00 PM, Eric Anholt wrote:

Brian Paul  writes:


The main issue is I need to do additional work for cube and/or cube
array textures.  This is still on my to-do list.  Do you need this?


FWIW, I just wanted this extension in X yesterday.



I'm on vacation this week.  I could try to look at it next week if you 
really need it.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/11] mesa: implement _mesa_GetTextureSubImage()

2015-07-02 Thread Eric Anholt
Brian Paul  writes:

> The main issue is I need to do additional work for cube and/or cube 
> array textures.  This is still on my to-do list.  Do you need this?

FWIW, I just wanted this extension in X yesterday.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Eric Anholt
Ilia Mirkin  writes:

> On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:
>> On 02/07/15 17:49, Ilia Mirkin wrote:
>>>
>>> On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:

 On 02/07/15 17:24, Ilia Mirkin wrote:
>
>
> On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
> wrote:
>>
>>
> Ah OK. So I guess tilers will have to disable their render queues for
> this one. Which seems like a reasonable trade-off...



 I don't see why.

 This is a purely SW query. So I don't see why the HW needs to see any
 difference.
>>>
>>>
>>> It just won't have compiled the shaders, I think. I guess this could force
>>> it.
>>
>>
>> AFAIK, tiles defer the _rendering_, not the compilation. At least llvmpipe
>> compiles everything at draw time.
>>
>>

 That said, glretrace already does glReadPixels when dumping state, so one
 way or the other, when inspecting state in qapitrace, everything will be
 flushed and and synched.
>>>
>>>
>>> But that's too late -- you said the glGetActiveBla would go right
>>> after the draw call. Presumably if you did it right after glReadPixels
>>> it'd end up seeing the state left over from a blit or something?
>>
>>
>> Fair enough. It's the first thing after glDraw. Forget about glReadpixels.
>>
>> I guess just still don't understand what's special about tilers.  But I
>> don't think it's pertinent now.
>
> What's special about tilers is that they defer renders. Compiling the
> program can similarly get deferred because they can. (And sometimes
> entire renders get dropped due to clears, etc.) Should it get
> deferred? Dunno. I don't even remember if freedreno defers
> compilation, and never knew what vc4 did.

Wait, what?  I'm also writing a tiler driver, and I don't see how being
a tiler makes *any* change to when compilation could happen.  I can't
bin without my programs compiled, so I need my programs compiled at the
point where I'm writing the command list, just like an immediate
renderer.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:
> On 02/07/15 17:49, Ilia Mirkin wrote:
>>
>> On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:
>>>
>>> On 02/07/15 17:24, Ilia Mirkin wrote:


 On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
 wrote:
>
>
 Ah OK. So I guess tilers will have to disable their render queues for
 this one. Which seems like a reasonable trade-off...
>>>
>>>
>>>
>>> I don't see why.
>>>
>>> This is a purely SW query. So I don't see why the HW needs to see any
>>> difference.
>>
>>
>> It just won't have compiled the shaders, I think. I guess this could force
>> it.
>
>
> AFAIK, tiles defer the _rendering_, not the compilation. At least llvmpipe
> compiles everything at draw time.
>
>
>>>
>>> That said, glretrace already does glReadPixels when dumping state, so one
>>> way or the other, when inspecting state in qapitrace, everything will be
>>> flushed and and synched.
>>
>>
>> But that's too late -- you said the glGetActiveBla would go right
>> after the draw call. Presumably if you did it right after glReadPixels
>> it'd end up seeing the state left over from a blit or something?
>
>
> Fair enough. It's the first thing after glDraw. Forget about glReadpixels.
>
> I guess just still don't understand what's special about tilers.  But I
> don't think it's pertinent now.

What's special about tilers is that they defer renders. Compiling the
program can similarly get deferred because they can. (And sometimes
entire renders get dropped due to clears, etc.) Should it get
deferred? Dunno. I don't even remember if freedreno defers
compilation, and never knew what vc4 did.

>
>>
>> Perhaps the API should instead be
>>
>> glEnable(GL_PROGRAM_SAVE_DUMP)
>> glProgramDumpDebugInfo(progid, callback)
>>
>> which would then optionally dump any info associated with that
>> program. That way it doesn't even have to be internally active (due to
>> a subsequent blit or who-knows-what). But it would rely on that
>> program having been previously-drawn-with which would have generated
>> the relevant data.
>
>
>
> Doing this immediately after draw call is no problem at all. I don't think
> it's worth complicating things by allowing a lag between draw and shader
> extraction. It just makes things more unreliable which defeats the point.

Would it really complicate things though? Internally, it can never
drop the debug info since a program might later be reused wholesale
and there won't be another compilation, so it has to store the info on
the program object. The additionally nice thing is that apitrace would
know which program which debug info belongs to, rather than just
having it all be part of a single "linked" thing. Obviously some code
in one might affect the other (like clipping getting moved to fs, etc)
but on average it might be nice to keep them separate.

One counterpoint is that an driver for hw that has a different
programming model wouldn't work as well. For example how would intel
dump the programmable clipper code for gen4/5? Or something like PVR
which I understand is totally crazy? That makes more the argument for
an "all in one" type of thing. I guess I don't feel too strongly about
it either way though.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Don't disable SIMD16 when using the pixel interpolator

2015-07-02 Thread Neil Roberts
There was a comment saying that in SIMD16 mode the pixel interpolator
returns coords interleaved 8 channels at a time and that this requires
extra work to support. However, this interleaved format is exactly
what the PLN instruction requires so I don't think anything needs to
be done to support it apart from removing the line to disable it and
to ensure that the message lengths for the send message are correct.

I am more convinced that this is correct because as it says in the
comment this interleaved output is identical to what is given in the
thread payload. The code generated to apply the plane equation to
these coordinates is identical on SIMD16 and SIMD8 except that the
dispatch width is larger which implies no special unmangling is
needed.

Perhaps the confusion stems from the fact that the description of the
PLN instruction in the IVB PRM seems to imply that the src1 inputs are
not interleaved so it wouldn't work. However, in the HSW and BDW PRMs,
the pseudo-code is different and looks like it expects the interleaved
format. Mesa doesn't seem to generate different code on IVB to
uninterleave the payload registers and everything is working so I can
only assume that the PRM is wrong.

I tested the interpolateAt tests on HSW and did a full Piglit run on
IVB on there were no regressions.
---

I've CC'd Chris Forbes because according to git-annotate he wrote the
original comment so he might know something I don't.

 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 59081ea..717e597 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1461,12 +1461,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
case nir_intrinsic_interp_var_at_centroid:
case nir_intrinsic_interp_var_at_sample:
case nir_intrinsic_interp_var_at_offset: {
-  /* in SIMD16 mode, the pixel interpolator returns coords interleaved
-   * 8 channels at a time, same as the barycentric coords presented in
-   * the FS payload. this requires a bit of extra work to support.
-   */
-  no16("interpolate_at_* not yet supported in SIMD16 mode.");
-
   fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
 
   /* For most messages, we need one reg of ignored data; the hardware
@@ -1531,7 +1525,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
bld.SEL(offset(src, i), itemp, fs_reg(7)));
 }
 
-mlen = 2;
+mlen = 2 * dispatch_width / 8;
 inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, 
src,
 fs_reg(0u));
  }
@@ -1543,7 +1537,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   }
 
   inst->mlen = mlen;
-  inst->regs_written = 2; /* 2 floats per slot returned */
+  /* 2 floats per slot returned */
+  inst->regs_written = 2 * dispatch_width / 8;
   inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
INTERP_QUALIFIER_NOPERSPECTIVE;
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:49, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:

On 02/07/15 17:24, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:



Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...



I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.


It just won't have compiled the shaders, I think. I guess this could force it.


AFAIK, tiles defer the _rendering_, not the compilation. At least 
llvmpipe compiles everything at draw time.





That said, glretrace already does glReadPixels when dumping state, so one
way or the other, when inspecting state in qapitrace, everything will be
flushed and and synched.


But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?


Fair enough. It's the first thing after glDraw. Forget about glReadpixels.

I guess just still don't understand what's special about tilers.  But I 
don't think it's pertinent now.




Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.



Doing this immediately after draw call is no problem at all. I don't 
think it's worth complicating things by allowing a lag between draw and 
shader extraction. It just makes things more unreliable which defeats 
the point.


Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/11] mesa: implement _mesa_GetTextureSubImage()

2015-07-02 Thread Ilia Mirkin
Nope, just looking over the GL 4.5 extensions, was wondering what was
happening with this one.

On Thu, Jul 2, 2015 at 2:14 PM, Brian Paul  wrote:
> The main issue is I need to do additional work for cube and/or cube array
> textures.  This is still on my to-do list.  Do you need this?
>
> -Brian
>
>
>
> On 7/1/15 11:01 PM, Ilia Mirkin wrote:
>>
>> Brian, what happened to this series? Was there some conflict with DSA
>> and so you held off?
>>
>> On Sat, Dec 13, 2014 at 9:42 AM, Brian Paul  wrote:
>>>
>>> One of the two new functions in GL_ARB_get_texture_sub_image.
>>> ---
>>>   src/mesa/main/texgetimage.c | 305
>>> ++--
>>>   src/mesa/main/texgetimage.h |   8 ++
>>>   2 files changed, 277 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
>>> index 71c25bb..e1f238b 100644
>>> --- a/src/mesa/main/texgetimage.c
>>> +++ b/src/mesa/main/texgetimage.c
>>> @@ -44,6 +44,7 @@
>>>   #include "texcompress.h"
>>>   #include "texgetimage.h"
>>>   #include "teximage.h"
>>> +#include "texobj.h"
>>>   #include "texstore.h"
>>>
>>>
>>> @@ -803,41 +804,36 @@ legal_getteximage_target(struct gl_context *ctx,
>>> GLenum target)
>>>
>>>
>>>   /**
>>> - * Do error checking for a glGetTexImage() call.
>>> + * Do error checking for a glGetTexImage() or glGetTextureSubImage()
>>> call.
>>>* \return GL_TRUE if any error, GL_FALSE if no errors.
>>>*/
>>>   static GLboolean
>>> -getteximage_error_check(struct gl_context *ctx, GLenum target, GLint
>>> level,
>>> +getteximage_error_check(struct gl_context *ctx,
>>> +struct gl_texture_object *texObj, GLint level,
>>>   GLenum format, GLenum type, GLsizei
>>> clientMemSize,
>>> -GLvoid *pixels )
>>> +GLvoid *pixels, const char *caller)
>>>   {
>>> -   struct gl_texture_object *texObj;
>>> +   GLenum target = texObj->Target;
>>>  struct gl_texture_image *texImage;
>>>  const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
>>> -   const GLuint dimensions = (target == GL_TEXTURE_3D) ? 3 : 2;
>>>  GLenum baseFormat, err;
>>>
>>> -   if (!legal_getteximage_target(ctx, target)) {
>>> -  _mesa_error(ctx, GL_INVALID_ENUM, "glGetTexImage(target=0x%x)",
>>> target);
>>> -  return GL_TRUE;
>>> -   }
>>> -
>>>  assert(maxLevels != 0);
>>>  if (level < 0 || level >= maxLevels) {
>>> -  _mesa_error( ctx, GL_INVALID_VALUE, "glGetTexImage(level)" );
>>> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(level)", caller);
>>> return GL_TRUE;
>>>  }
>>>
>>>  err = _mesa_error_check_format_and_type(ctx, format, type);
>>>  if (err != GL_NO_ERROR) {
>>> -  _mesa_error(ctx, err, "glGetTexImage(format/type)");
>>> +  _mesa_error(ctx, err, "%s(format/type)", caller);
>>> return GL_TRUE;
>>>  }
>>>
>>>  texObj = _mesa_get_current_tex_object(ctx, target);
>>>
>>>  if (!texObj) {
>>> -  _mesa_error(ctx, GL_INVALID_ENUM, "glGetTexImage(target)");
>>> +  _mesa_error(ctx, GL_INVALID_ENUM, "%s(target)", caller);
>>> return GL_TRUE;
>>>  }
>>>
>>> @@ -854,64 +850,79 @@ getteximage_error_check(struct gl_context *ctx,
>>> GLenum target, GLint level,
>>>   */
>>>  if (_mesa_is_color_format(format)
>>>  && !_mesa_is_color_format(baseFormat)) {
>>> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format
>>> mismatch)");
>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)",
>>> caller);
>>> return GL_TRUE;
>>>  }
>>>  else if (_mesa_is_depth_format(format)
>>>   && !_mesa_is_depth_format(baseFormat)
>>>   && !_mesa_is_depthstencil_format(baseFormat)) {
>>> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format
>>> mismatch)");
>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)",
>>> caller);
>>> return GL_TRUE;
>>>  }
>>>  else if (_mesa_is_stencil_format(format)
>>>   && !ctx->Extensions.ARB_texture_stencil8) {
>>> -  _mesa_error(ctx, GL_INVALID_ENUM,
>>> "glGetTexImage(format=GL_STENCIL_INDEX)");
>>> +  _mesa_error(ctx, GL_INVALID_ENUM, "%s(format=GL_STENCIL_INDEX)",
>>> caller);
>>> return GL_TRUE;
>>>  }
>>>  else if (_mesa_is_ycbcr_format(format)
>>>   && !_mesa_is_ycbcr_format(baseFormat)) {
>>> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format
>>> mismatch)");
>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)",
>>> caller);
>>> return GL_TRUE;
>>>  }
>>>  else if (_mesa_is_depthstencil_format(format)
>>>   && !_mesa_is_depthstencil_format(baseFormat)) {
>>> -  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format
>>> mismatch)");
>>> +  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)",
>>> caller);
>>> return GL_TRUE;
>>>  }
>>> 

Re: [Mesa-dev] [PATCH 05/11] mesa: implement _mesa_GetTextureSubImage()

2015-07-02 Thread Brian Paul
The main issue is I need to do additional work for cube and/or cube 
array textures.  This is still on my to-do list.  Do you need this?


-Brian


On 7/1/15 11:01 PM, Ilia Mirkin wrote:

Brian, what happened to this series? Was there some conflict with DSA
and so you held off?

On Sat, Dec 13, 2014 at 9:42 AM, Brian Paul  wrote:

One of the two new functions in GL_ARB_get_texture_sub_image.
---
  src/mesa/main/texgetimage.c | 305 ++--
  src/mesa/main/texgetimage.h |   8 ++
  2 files changed, 277 insertions(+), 36 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 71c25bb..e1f238b 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -44,6 +44,7 @@
  #include "texcompress.h"
  #include "texgetimage.h"
  #include "teximage.h"
+#include "texobj.h"
  #include "texstore.h"


@@ -803,41 +804,36 @@ legal_getteximage_target(struct gl_context *ctx, GLenum 
target)


  /**
- * Do error checking for a glGetTexImage() call.
+ * Do error checking for a glGetTexImage() or glGetTextureSubImage() call.
   * \return GL_TRUE if any error, GL_FALSE if no errors.
   */
  static GLboolean
-getteximage_error_check(struct gl_context *ctx, GLenum target, GLint level,
+getteximage_error_check(struct gl_context *ctx,
+struct gl_texture_object *texObj, GLint level,
  GLenum format, GLenum type, GLsizei clientMemSize,
-GLvoid *pixels )
+GLvoid *pixels, const char *caller)
  {
-   struct gl_texture_object *texObj;
+   GLenum target = texObj->Target;
 struct gl_texture_image *texImage;
 const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
-   const GLuint dimensions = (target == GL_TEXTURE_3D) ? 3 : 2;
 GLenum baseFormat, err;

-   if (!legal_getteximage_target(ctx, target)) {
-  _mesa_error(ctx, GL_INVALID_ENUM, "glGetTexImage(target=0x%x)", target);
-  return GL_TRUE;
-   }
-
 assert(maxLevels != 0);
 if (level < 0 || level >= maxLevels) {
-  _mesa_error( ctx, GL_INVALID_VALUE, "glGetTexImage(level)" );
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(level)", caller);
return GL_TRUE;
 }

 err = _mesa_error_check_format_and_type(ctx, format, type);
 if (err != GL_NO_ERROR) {
-  _mesa_error(ctx, err, "glGetTexImage(format/type)");
+  _mesa_error(ctx, err, "%s(format/type)", caller);
return GL_TRUE;
 }

 texObj = _mesa_get_current_tex_object(ctx, target);

 if (!texObj) {
-  _mesa_error(ctx, GL_INVALID_ENUM, "glGetTexImage(target)");
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(target)", caller);
return GL_TRUE;
 }

@@ -854,64 +850,79 @@ getteximage_error_check(struct gl_context *ctx, GLenum 
target, GLint level,
  */
 if (_mesa_is_color_format(format)
 && !_mesa_is_color_format(baseFormat)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format mismatch)");
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)", caller);
return GL_TRUE;
 }
 else if (_mesa_is_depth_format(format)
  && !_mesa_is_depth_format(baseFormat)
  && !_mesa_is_depthstencil_format(baseFormat)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format mismatch)");
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)", caller);
return GL_TRUE;
 }
 else if (_mesa_is_stencil_format(format)
  && !ctx->Extensions.ARB_texture_stencil8) {
-  _mesa_error(ctx, GL_INVALID_ENUM, 
"glGetTexImage(format=GL_STENCIL_INDEX)");
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(format=GL_STENCIL_INDEX)", caller);
return GL_TRUE;
 }
 else if (_mesa_is_ycbcr_format(format)
  && !_mesa_is_ycbcr_format(baseFormat)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format mismatch)");
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)", caller);
return GL_TRUE;
 }
 else if (_mesa_is_depthstencil_format(format)
  && !_mesa_is_depthstencil_format(baseFormat)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format mismatch)");
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)", caller);
return GL_TRUE;
 }
 else if (_mesa_is_enum_format_integer(format) !=
  _mesa_is_format_integer(texImage->TexFormat)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "glGetTexImage(format mismatch)");
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(format mismatch)", caller);
return GL_TRUE;
 }

-   if (!_mesa_validate_pbo_access(dimensions, &ctx->Pack, texImage->Width,
-  texImage->Height, texImage->Depth,
-  format, type, clientMemSize, pixels)) {
+   return GL_FALSE;
+}
+
+
+/**
+ * Do error checking related to the PBO and image size.
+ */
+static bool
+

[Mesa-dev] [Bug 91173] Oddworld: Stranger's Wrath HD: disfigured models in wrong colors

2015-07-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91173

Ilia Mirkin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Ilia Mirkin  ---
Both of the patches are now upstream. They missed the last 10.5.x release, but
should be in 10.6.2.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glapi: fix argument parsing in glX_proto_recv.py

2015-07-02 Thread Dylan Baker
One of the plugins I use with vim "helpfully" added an underscore to the
front of mode for kicks.

Obviously this isn't a feature used very often because it's been broken
since d986cb7c70db (since May 20th), and no one has noticed.

Signed-off-by: Dylan Baker 
---
 src/mapi/glapi/gen/glX_proto_recv.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mapi/glapi/gen/glX_proto_recv.py 
b/src/mapi/glapi/gen/glX_proto_recv.py
index da468dc..5d95f27 100644
--- a/src/mapi/glapi/gen/glX_proto_recv.py
+++ b/src/mapi/glapi/gen/glX_proto_recv.py
@@ -549,9 +549,9 @@ def main():
 """Main function."""
 args = _parser()
 
-if args._mode == "dispatch_c":
+if args.mode == "dispatch_c":
 printer = PrintGlxDispatchFunctions(args.swap)
-elif args._mode == "dispatch_h":
+elif args.mode == "dispatch_h":
 printer = PrintGlxDispatch_h()
 
 api = gl_XML.parse_GL_API(
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/78] i965/nir/vec4: Implement conditional statements (nir_cf_node_if)

2015-07-02 Thread Jason Ekstrand
On Wed, Jul 1, 2015 at 11:44 PM, Iago Toral  wrote:
> On Tue, 2015-06-30 at 09:30 -0700, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>> > From: Iago Toral Quiroga 
>> >
>> > The same we do in the FS NIR backend, only that here we need to consider
>> > the number of components in the condition and adjust the swizzle
>> > accordingly.
>> >
>> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 23 ++-
>> >  1 file changed, 22 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > index 1ec75ee..d81b6a7 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > @@ -314,7 +314,28 @@ vec4_visitor::nir_emit_cf_list(exec_list *list)
>> >  void
>> >  vec4_visitor::nir_emit_if(nir_if *if_stmt)
>> >  {
>> > -   /* @TODO: Not yet implemented */
>> > +   /* First, put the condition in f0 */
>> > +   src_reg condition = get_nir_src(if_stmt->condition, 
>> > BRW_REGISTER_TYPE_D);
>> > +
>> > +   int num_components = if_stmt->condition.is_ssa ?
>> > +  if_stmt->condition.ssa->num_components :
>> > +  if_stmt->condition.reg.reg->num_components;
>> > +
>> > +   condition.swizzle = brw_swizzle_for_size(num_components);
>> > +
>> > +   vec4_instruction *inst = emit(MOV(dst_null_d(), condition));
>> > +   inst->conditional_mod = BRW_CONDITIONAL_NZ;
>>
>> NIR if statements read only one component by definition.  There's no
>> need to do this.
>
> I see, we still need to do this explicitly though:
>
> condition.swizzle = brw_swizzle_for_size(1);
>
> Maybe we should just make get_nir_src() set the swizzle based on the
> number of components instead so we don't have to do this kind of things
> after calling that, does that sound better?

Just pass the number of components into get_nir_src()?  That sounds fine to me.
--Jason

> Iago
>
>>
>> > +   emit(IF(BRW_PREDICATE_NORMAL));
>> > +
>> > +   nir_emit_cf_list(&if_stmt->then_list);
>> > +
>> > +   /* note: if the else is empty, dead CF elimination will remove it */
>> > +   emit(BRW_OPCODE_ELSE);
>> > +
>> > +   nir_emit_cf_list(&if_stmt->else_list);
>> > +
>> > +   emit(BRW_OPCODE_ENDIF);
>> >  }
>> >
>> >  void
>> > --
>> > 2.1.4
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] glsl: fix some strict aliasing issues in exec_list

2015-07-02 Thread Davin McCall

On 02/07/15 14:58, Neil Roberts wrote:

Davin McCall  writes:


I actually had thought about this, but technically, you can only use
unions for type aliasing if you perform all accesses (that are not to
the 'active' member) through the union. All the list processing code
that iterates through all the nodes including the tail sentinel would
*technically* still have an aliasing problem, because it doesn't go
through the exec_list structure at all (though in practice, I don't
think it would ever manifest in as an issue in the compiled code).

I don't think it matters that the list iterating code doesn't use
exec_list. If something modifies the list pointers through an exec_node
then the compiler will know that that potentially aliases the pointers
in an exec_list because the pointers in exec_list are also wrapped in an
exec_node.


This is why "in practice, I don't think it would ever manifest as an 
issue in the compiled code". But from a purely theoretical standpoint, 
the fact that the list iteration code will access both the head sentinel 
and tail sentinel struct exec_node without going through the union is an 
issue.



With your patch there is no type aliasing at all and the
union modification doesn't alter that.


If by 'type aliasing' you mean that an object is accessed through two 
different types (i.e. type punning), then:
The union modification *does* alter that - since there would be two 
anonymous structs which are punned via the union.


See the GCC documentation on the matter at: 
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning - 
"... type-punning is allowed, provided the memory is accessed through 
the union type".


In practice, as I say, it would almost certainly come out ok if you did 
what you've suggested.



Why must all accesses go through the union? Consider the case:

  int someMethod(float *f, int *i) {
  *f = 4.0;// LINE A
  int a = *i;   // LINE B
  return a;
  }

If I had some union 'u' with { float ff; int ii; } then I could call
the above method, even if it was in a different module, with:

  someMethod(&u.ff, &u.ii).

Now, that would mean that the compiler would not be allowed to
re-order LINE A and LINE B. But as I said, someMethod might be in a
different module, where the compiler does not know that 'i' and 'f'
point to members of the same union. In that case it assumes that 'i'
and 'f' don't alias. Compare that to:

  int someMethod(union u *a, union u *b)
  {
  u->ff = 4.0;
  int a = u->ii;
  return a;
  }

In this version, accesses are through the union, and the compiler
knows that they potentially alias.

I don't think this example is relevant in this case because all of the
relevant members of the union I suggested are the same type (struct
exec_node). There is no type aliasing.


Any reference to a member in one of the anonymous structs can in fact 
alias the other anonymous struct (which is a different type) and its 
members.



Maybe a hypothetical problem with this sort of use could be if you had a
function like this:

struct exec_node *
some_method(struct exec_node *a,
 struct exec_node *b)
{
a->prev = &something;
b->next = &something_else;
return a->prev;
}

If you called this with the head and tail sentinels then the compiler
won't know that a->prev and b->next alias each other so it might return
&something instead of &something_else. However in practice for this use
case the only part that is aliased is a NULL pointer that is never
written to so I don't think it would actually matter.


I agree that it is unlikely to matter in practice; I'm just not sure 
that dicing with the language rules is a good idea (I mean, is it really 
worth resolving the aliasing violation issue by introducing a different 
kind of aliasing violation which we think won't trip up the compiler?) - 
and it's not necessary, in this case, even if it was worthwhile to avoid 
increasing the size of the exec_list structure (see my V2 patch).


Davin

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:
> On 02/07/15 17:24, Ilia Mirkin wrote:
>>
>> On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:
>>>
>>> On 02/07/15 17:08, Ilia Mirkin wrote:


 On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca 
 wrote:
>
>
> On 02/07/15 16:34, Ilia Mirkin wrote:
>>
>>
>>
>> On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
>> wrote:
>>>
>>>
>>>
>>> On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
>>> 14
>>> 



 on bug 91173 
 from
 Ilia Mirkin  *

 Erm... ok...

 MOV R0.zw, c[A0.x + 9];
 MOV R1.x, c[0].w;
 ADD R0.x, c[A0.x + 9].y, R1;
 FLR R0.y, R0.x;

 vs

   0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
   3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
   7: FLR TEMP[0].y, CONST[0].

 Could be that I'm matching the wrong shaders. But this seems highly
 suspect.
 Need to see if there's a good way of dumping mesa ir... I wonder if
 it
 doesn't
 notice the write-mask on the MOV R0.zw and thinks that R0 contains
 the
 value it
 wants.
>>>
>>>
>>>
>>>
>>> Nice detective work on this bug, Ilia.
>>>
 Could be that I'm matching the wrong shaders.
>>>
>>>
>>>
>>>
>>> I think it could be quite useful if there was a
>>> "GL_MESAX_get_internal_representation" Mesa specific extension to
>>> extract
>>> a
>>> text representation of the current bound GLSL, TGSI, hardware
>>> speicfic,
>>> etc,
>>> exclusively for debugging purposes.
>>>
>>> It doesn't even need to be advertised on non-debug builds of Mesa.
>>> But
>>> merely being able to see next to each other all the IRs at a given
>>> call
>>> in a
>>> trace, will probably save some time / grief for us developers on
>>> similar
>>> situations.
>>>
>>>
>>> I did something akin to this for NVIDIA prioprietary drivers on
>>>
>>>
>>>
>>> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
>>> but I don't think GetProgramBinary is apropriate for Mesa (only one
>>> format.)
>>>
>>>
>>> Instead, for Mesa we could have something like
>>>
>>>   GLint n;
>>>   // this will trigget IRs being collected into an array
>>> internally
>>>   glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>>>
>>>   for (i=0; i < n; ++i) {
>>>   GLint nameLength;
>>>   char *name;
>>>   GLint sourceLength;
>>>   char *source;
>>>   glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
>>> NULL);
>>>   name = malloc(nameLength)
>>>   source = malloc(sourceLength)
>>>   glGetActiveInternalRepr(NULL, name, NULL, source);
>>>   }
>>>
>>> And this would need to be plumbed through all the way inside the
>>> drivers,
>>> each layer would  advertise additional IRs.
>>>
>>> And the information here would only be obtainable/valid immediately
>>> after
>>> a
>>> draw call.
>>>
>>>
>>> A completely different tack, is that apitrace's glretrace would
>>> advertise
>>> an
>>> unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
>>> drivers/layers would write shaders repres, and when they are
>>> bound/unbound/destroyed on  a preestablished format:
>>>
>>> CREATE "GLSL/123"
>>> ...
>>> EOF
>>>
>>> CREATE TGSI/456
>>> EOF
>>>
>>> BIND GLSL/123
>>> BIND TGSI/456
>>> BIND HW/789
>>>
>>> UNBIND GLSL/123
>>> UNBIND TGSI/456
>>> UNBIND HW/789
>>>
>>> DESTROY GLSL/123
>>> DESTROY TGSI/456
>>> DESTROY HW/789
>>>
>>>
>>> I don't feel strongly either way, but I suspect that having a proper
>>> extension, even if a little more work at start, will be more robust
>>> on
>>> the
>>> long term.  And less runtime overhead.  GL extensions also give a
>>> mechanism
>>> to revise/deprecate this functionality in the future.
>>
>>
>>
>>
>> This would still require fairly extensive changes as you'd have to
>> track all the bindings together.
>
>
>
>
> Really? I don't think so.  Which alternative are you referring to?



 The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
 TGSI/HW as it might change based on other states. By the time you get
 it sufficiently robust, you might as well do the GL extension.

>
> Yet 

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:39, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:24 PM, Ilia Mirkin  wrote:

On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:

On 02/07/15 17:08, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:


On 02/07/15 16:34, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
wrote:



On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
14




on bug 91173  from
Ilia Mirkin  *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
  7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.




Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.




I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to
extract
a
text representation of the current bound GLSL, TGSI, hardware speicfic,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on
similar
situations.


I did something akin to this for NVIDIA prioprietary drivers on


https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is apropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

  GLint n;
  // this will trigget IRs being collected into an array internally
  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

  for (i=0; i < n; ++i) {
  GLint nameLength;
  char *name;
  GLint sourceLength;
  char *source;
  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
NULL);
  name = malloc(nameLength)
  source = malloc(sourceLength)
  glGetActiveInternalRepr(NULL, name, NULL, source);
  }

And this would need to be plumbed through all the way inside the
drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately
after
a
draw call.


A completely different tack, is that apitrace's glretrace would
advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.




This would still require fairly extensive changes as you'd have to
track all the bindings together.




Really? I don't think so.  Which alternative are you referring to?



The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

typedef void (*GLircallbackMESA)(const char *name, const char *body);

void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.



What "name" would the driver supply here? And how would you link
things up together?



Giving llvmpipe example, which I'm more familiar,

  - src/mesa/state_tracKer would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
  - and invoke pipe_context::get_active_ir (callback) if the pipe driver
implements it
  - src/gallium/drivers/llvmpipe would invoke with
- "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker due
to draw module
- "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
- and maybe even "llvmpipe/x86/{vs,fs}

The idea is that this glGetActiveInternalReprMesa() call dumps what's active
_now_, which is only makes sense immediately after draw calls. So the only
thing the drivers need to do is dump what they see bound.


Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...

One issue is that this won't capture "internal" renders that happen on
blit/dra

Re: [Mesa-dev] [PATCH 12/78] i965/nir/vec4: Add nir_get_dst() and nir_get_src() methods

2015-07-02 Thread Jason Ekstrand
On Thu, Jul 2, 2015 at 2:54 AM, Eduardo Lima Mitev  wrote:
> On 06/30/2015 06:26 PM, Jason Ekstrand wrote:
>> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>>> From: Samuel Iglesias Gonsalvez 
>>>
>>> These methods are essential for the implementation of the NIR->vec4 pass. 
>>> They
>>> work similar to their fs_nir counter-parts.
>>>
>>> When processing instructions, these methods are invoked to resolve the
>>> brw registers (source or destination) corresponding to the NIR sources
>>> or destination. It builds a map of NIR register index to brw register for
>>> all registers locally allocated in a block.
>>>
>>> In the case of get_nir_src(), it also builds immediate registers on-the-fly
>>> when queried for a SSA source which at this point can only correspond to
>>> constant values.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4.h   |   7 ++
>>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 102 
>>> +
>>>  2 files changed, 109 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
>>> b/src/mesa/drivers/dri/i965/brw_vec4.h
>>> index d837d90..a0f5935 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>>> @@ -411,6 +411,13 @@ public:
>>> virtual void nir_emit_jump(nir_jump_instr *instr);
>>> virtual void nir_emit_texture(nir_tex_instr *instr);
>>>
>>> +   dst_reg get_nir_dest(nir_dest dest, enum brw_reg_type type);
>>> +   dst_reg get_nir_dest(nir_dest dest, nir_alu_type type);
>>> +   dst_reg get_nir_dest(nir_dest dest);
>>> +   src_reg get_nir_src(nir_src src, enum brw_reg_type type);
>>> +   src_reg get_nir_src(nir_src src, nir_alu_type type);
>>> +   src_reg get_nir_src(nir_src src);
>>> +
>>> virtual dst_reg *make_reg_for_system_value(int location,
>>>const glsl_type *type) = 0;
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> index 36c9dc0..1ec75ee 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>>> @@ -365,6 +365,108 @@ vec4_visitor::nir_emit_instr(nir_instr *instr)
>>> }
>>>  }
>>>
>>> +static dst_reg
>>> +dst_reg_for_nir_reg(vec4_visitor *v, nir_register *nir_reg,
>>> +unsigned base_offset, nir_src *indirect)
>>> +{
>>> +   dst_reg reg;
>>> +
>>> +   reg = v->nir_locals[nir_reg->index];
>>> +
>>> +   reg = offset(reg, base_offset * nir_reg->num_components);
>>> +   if (indirect) {
>>> +  int multiplier = nir_reg->num_components;
>>> +
>>> +  reg.reladdr = new(v->mem_ctx) src_reg(dst_reg(GRF, 
>>> v->alloc.allocate(1)));
>>> +  v->emit(v->MUL(dst_reg(*reg.reladdr),
>>> + retype(v->get_nir_src(*indirect), 
>>> BRW_REGISTER_TYPE_D),
>>> + src_reg(multiplier)));
>>
>> I'm not sure this is correct.  We need this in the FS backend because
>> each register is a scalar value.  In vec4, each array element should
>> correspond to a single vec4 register so you shouldn't need to
>> multiply.  Actually, I think it's currently dead code in the backend
>> because we lower it all away, so it may not be correct at all.
>>
>
> You are right, this multiply is not needed. I already removed it and
> verified it. This will be changed in v2.
>
>>> +   }
>>> +
>>> +   return reg;
>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest)
>>> +{
>>> +   return dst_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
>>> +  dest.reg.indirect);
>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest, enum brw_reg_type type)
>>> +{
>>> +   dst_reg reg = get_nir_dest(dest);
>>> +   return retype(reg, type);
>>> +}
>>> +
>>> +dst_reg
>>> +vec4_visitor::get_nir_dest(nir_dest dest, nir_alu_type type)
>>> +{
>>> +   dst_reg reg = get_nir_dest(dest);
>>> +   return retype(reg, brw_type_for_nir_type(type));
>>> +}
>>> +
>>> +src_reg
>>> +vec4_visitor::get_nir_src(nir_src src, enum brw_reg_type type)
>>> +{
>>> +   dst_reg reg;
>>> +
>>> +   if (src.is_ssa) {
>>> +  assert(src.ssa->parent_instr->type == nir_instr_type_load_const);
>>> +  nir_load_const_instr *load = 
>>> nir_instr_as_load_const(src.ssa->parent_instr);
>>> +
>>> +  reg = dst_reg(GRF, alloc.allocate(src.ssa->num_components));
>>
>> It's vec4, you only need one register, not num_components-many.
>>
>
> Right. This comes from our initial misconception of how the units were
> given in vec4 vs. scalar. I have changed that everywhere in the backend
> already, to allocate exactly 1 unit.
>
>>> +  reg = retype(reg, type);
>>> +
>>> +  /* @FIXME: while this is what fs_nir does, we can do this better in 
>>> the VS
>>> +   * stage because we can emit vector operations and save some MOVs in
>>> +   * cases where the constants are representable 

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:24, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:

On 02/07/15 17:08, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:


On 02/07/15 16:34, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
wrote:



On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
14




on bug 91173  from
Ilia Mirkin  *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
  7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.




Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.




I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to
extract
a
text representation of the current bound GLSL, TGSI, hardware speicfic,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on
similar
situations.


I did something akin to this for NVIDIA prioprietary drivers on


https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is apropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

  GLint n;
  // this will trigget IRs being collected into an array internally
  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

  for (i=0; i < n; ++i) {
  GLint nameLength;
  char *name;
  GLint sourceLength;
  char *source;
  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
NULL);
  name = malloc(nameLength)
  source = malloc(sourceLength)
  glGetActiveInternalRepr(NULL, name, NULL, source);
  }

And this would need to be plumbed through all the way inside the
drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately
after
a
draw call.


A completely different tack, is that apitrace's glretrace would
advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.




This would still require fairly extensive changes as you'd have to
track all the bindings together.




Really? I don't think so.  Which alternative are you referring to?



The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

typedef void (*GLircallbackMESA)(const char *name, const char *body);

void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.



What "name" would the driver supply here? And how would you link
things up together?



Giving llvmpipe example, which I'm more familiar,

  - src/mesa/state_tracKer would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
  - and invoke pipe_context::get_active_ir (callback) if the pipe driver
implements it
  - src/gallium/drivers/llvmpipe would invoke with
- "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker due
to draw module
- "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
- and maybe even "llvmpipe/x86/{vs,fs}

The idea is that this glGetActiveInternalReprMesa() call dumps what's active
_now_, which is only makes sense immediately after draw calls. So the only
thing the drivers need to do is dump what they see bound.


Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...


I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any 
difference.


That said, glretrace already 

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 12:24 PM, Ilia Mirkin  wrote:
> On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:
>> On 02/07/15 17:08, Ilia Mirkin wrote:
>>>
>>> On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:

 On 02/07/15 16:34, Ilia Mirkin wrote:
>
>
> On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
> wrote:
>>
>>
>> On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
>> 14
>> 
>>>
>>>
>>> on bug 91173  from
>>> Ilia Mirkin  *
>>>
>>> Erm... ok...
>>>
>>> MOV R0.zw, c[A0.x + 9];
>>> MOV R1.x, c[0].w;
>>> ADD R0.x, c[A0.x + 9].y, R1;
>>> FLR R0.y, R0.x;
>>>
>>> vs
>>>
>>>  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
>>>  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
>>>  7: FLR TEMP[0].y, CONST[0].
>>>
>>> Could be that I'm matching the wrong shaders. But this seems highly
>>> suspect.
>>> Need to see if there's a good way of dumping mesa ir... I wonder if it
>>> doesn't
>>> notice the write-mask on the MOV R0.zw and thinks that R0 contains the
>>> value it
>>> wants.
>>
>>
>>
>> Nice detective work on this bug, Ilia.
>>
>>> Could be that I'm matching the wrong shaders.
>>
>>
>>
>> I think it could be quite useful if there was a
>> "GL_MESAX_get_internal_representation" Mesa specific extension to
>> extract
>> a
>> text representation of the current bound GLSL, TGSI, hardware speicfic,
>> etc,
>> exclusively for debugging purposes.
>>
>> It doesn't even need to be advertised on non-debug builds of Mesa.  But
>> merely being able to see next to each other all the IRs at a given call
>> in a
>> trace, will probably save some time / grief for us developers on
>> similar
>> situations.
>>
>>
>> I did something akin to this for NVIDIA prioprietary drivers on
>>
>>
>> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
>> but I don't think GetProgramBinary is apropriate for Mesa (only one
>> format.)
>>
>>
>> Instead, for Mesa we could have something like
>>
>>  GLint n;
>>  // this will trigget IRs being collected into an array internally
>>  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>>
>>  for (i=0; i < n; ++i) {
>>  GLint nameLength;
>>  char *name;
>>  GLint sourceLength;
>>  char *source;
>>  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
>> NULL);
>>  name = malloc(nameLength)
>>  source = malloc(sourceLength)
>>  glGetActiveInternalRepr(NULL, name, NULL, source);
>>  }
>>
>> And this would need to be plumbed through all the way inside the
>> drivers,
>> each layer would  advertise additional IRs.
>>
>> And the information here would only be obtainable/valid immediately
>> after
>> a
>> draw call.
>>
>>
>> A completely different tack, is that apitrace's glretrace would
>> advertise
>> an
>> unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
>> drivers/layers would write shaders repres, and when they are
>> bound/unbound/destroyed on  a preestablished format:
>>
>> CREATE "GLSL/123"
>> ...
>> EOF
>>
>> CREATE TGSI/456
>> EOF
>>
>> BIND GLSL/123
>> BIND TGSI/456
>> BIND HW/789
>>
>> UNBIND GLSL/123
>> UNBIND TGSI/456
>> UNBIND HW/789
>>
>> DESTROY GLSL/123
>> DESTROY TGSI/456
>> DESTROY HW/789
>>
>>
>> I don't feel strongly either way, but I suspect that having a proper
>> extension, even if a little more work at start, will be more robust on
>> the
>> long term.  And less runtime overhead.  GL extensions also give a
>> mechanism
>> to revise/deprecate this functionality in the future.
>
>
>
> This would still require fairly extensive changes as you'd have to
> track all the bindings together.



 Really? I don't think so.  Which alternative are you referring to?
>>>
>>>
>>> The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
>>> TGSI/HW as it might change based on other states. By the time you get
>>> it sufficiently robust, you might as well do the GL extension.
>>>

 Yet another option would be to provide a callback

typedef void (*GLircallbackMESA)(const char *name, const char *body);

void glGetActiveInternalReprMesa(GLircallbackMESA callback);

 and basically each layer would dump the IRs, and invoke the downstream
 layers with the same callback.
>>>
>>>
>>> What "name" would the driver su

Re: [Mesa-dev] [PATCH 20/78] i965/nir/vec4: Implement load_uniform intrinsic

2015-07-02 Thread Jason Ekstrand
On Thu, Jul 2, 2015 at 3:53 AM, Iago Toral  wrote:
> On Thu, 2015-07-02 at 09:33 +0200, Iago Toral wrote:
>> On Tue, 2015-06-30 at 11:53 -0700, Jason Ekstrand wrote:
>> > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  
>> > wrote:
>> > > From: Iago Toral Quiroga 
>> > >
>> > > For the indirect case we need to take the index delivered by
>> > > NIR and compute the parent uniform that we are accessing (the one
>> > > that we uploaded to a surface) and the constant offset into that
>> > > surface.
>> > >
>> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>> > > ---
>> > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 27 
>> > > +--
>> > >  1 file changed, 25 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > index 3fa8ca8..9a0ae25 100644
>> > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> > > @@ -569,10 +569,33 @@ 
>> > > vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>> > > }
>> > >
>> > > case nir_intrinsic_load_uniform_indirect:
>> > > +  has_indirect = true;
>> > >/* fallthrough */
>> > > -   case nir_intrinsic_load_uniform:
>> > > -  /* @TODO: Not yet implemented */
>> > > +   case nir_intrinsic_load_uniform: {
>> > > +  int index = instr->const_index[0];
>> > > +  int uniform = nir_uniform_offset[index];
>> >
>> > Again, I don't know that this indirection is really needed.  We should
>> > make nir_lower_io just do the right thing for you here.
>>
>> Ok, I'll see what kind of changes we need to achieve that.
>>
>
> Initially, this was a consequence of the fact that we had not patched
> nir_lower_io to operate in units of vec4 for non-scalar shaders, so we
> needed a way to compute the vec4 offset from the scalar offset. With my
> patch to nir_lower_io that is not needed any more except for the fact
> that brw_nir.c calls nir_assign_var_locations_direct_first instead of
> nir_assign_var_locations for uniforms, which breaks the assumption that
> uniforms get locations in the same order as we find them in the shader.
> Because of that we still need a way to map what nir_lower_io delivers to
> a uniform number. Obviously, that function is very much tied to how the
> FS works (which splits uniforms in direct only and indirect) so I
> suppose it makes sense to call one or the other depending on whether
> this is a scalar shader or not.
>
> In summary: calling nir_assign_var_locations for uniforms in vec4
> shaders instead of nir_assign_var_locations_direct_first allows us to
> get rid of that nir_uniform_offset array mapping. Does this change look
> good to you?

That's fine to me.  I expected them to be different on different
backends anyway.

> Iago
>
>>
>> > > +  dest = get_nir_dest(instr->dest);
>> > > +
>> > > +  if (has_indirect) {
>> > > + /* Split addressing into uniform and offset */
>> > > + int offset = index - nir_uniform_driver_location[uniform];
>> > > + assert(offset >= 0);
>> > > +
>> > > + uniform -= offset;
>> > > + assert(uniform >= 0);
>> > > +
>> > > + src = src_reg(dst_reg(UNIFORM, uniform));
>> > > + src.reg_offset = offset;
>> > > + src_reg tmp = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_D);
>> > > + src.reladdr = new(mem_ctx) src_reg(tmp);
>> > > +  } else {
>> > > + src = src_reg(dst_reg(UNIFORM, uniform));
>> > > +  }
>> > > +
>> > > +  emit(MOV(dest, src));
>> > >break;
>> > > +   }
>> > >
>> > > case nir_intrinsic_atomic_counter_read:
>> > > case nir_intrinsic_atomic_counter_inc:
>> > > --
>> > > 2.1.4
>> > >
>> > > ___
>> > > mesa-dev mailing list
>> > > mesa-dev@lists.freedesktop.org
>> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/chv|skl: Apply sampler bypass w/a

2015-07-02 Thread Nanley Chery
Nice find. What do you think about putting this in a function to
reduce the code duplication?

- Nanley

On Wed, Jul 1, 2015 at 6:14 PM, Ben Widawsky  wrote:
> On Wed, Jul 01, 2015 at 04:03:53PM -0700, Ben Widawsky wrote:
>> Certain compressed formats require this setting. The docs don't go into much
>> detail as to why it's needed exactly.
>>
>> This fixes 0 piglit failures with a GBM gpu piglit run.
>
> I just ran this again in piglit since Chris asked me to clarify. There are 
> also
> no regressions (http://otc-mesa-ci.jf.intel.com/job/Leeroy/124112/) - ie. no
> changes.
>
> (It looks like BSW was removed from jenkins).
>
> [snip]
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:
> On 02/07/15 17:08, Ilia Mirkin wrote:
>>
>> On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:
>>>
>>> On 02/07/15 16:34, Ilia Mirkin wrote:


 On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
 wrote:
>
>
> On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
> 14
> 
>>
>>
>> on bug 91173  from
>> Ilia Mirkin  *
>>
>> Erm... ok...
>>
>> MOV R0.zw, c[A0.x + 9];
>> MOV R1.x, c[0].w;
>> ADD R0.x, c[A0.x + 9].y, R1;
>> FLR R0.y, R0.x;
>>
>> vs
>>
>>  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
>>  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
>>  7: FLR TEMP[0].y, CONST[0].
>>
>> Could be that I'm matching the wrong shaders. But this seems highly
>> suspect.
>> Need to see if there's a good way of dumping mesa ir... I wonder if it
>> doesn't
>> notice the write-mask on the MOV R0.zw and thinks that R0 contains the
>> value it
>> wants.
>
>
>
> Nice detective work on this bug, Ilia.
>
>> Could be that I'm matching the wrong shaders.
>
>
>
> I think it could be quite useful if there was a
> "GL_MESAX_get_internal_representation" Mesa specific extension to
> extract
> a
> text representation of the current bound GLSL, TGSI, hardware speicfic,
> etc,
> exclusively for debugging purposes.
>
> It doesn't even need to be advertised on non-debug builds of Mesa.  But
> merely being able to see next to each other all the IRs at a given call
> in a
> trace, will probably save some time / grief for us developers on
> similar
> situations.
>
>
> I did something akin to this for NVIDIA prioprietary drivers on
>
>
> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
> but I don't think GetProgramBinary is apropriate for Mesa (only one
> format.)
>
>
> Instead, for Mesa we could have something like
>
>  GLint n;
>  // this will trigget IRs being collected into an array internally
>  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>
>  for (i=0; i < n; ++i) {
>  GLint nameLength;
>  char *name;
>  GLint sourceLength;
>  char *source;
>  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
> NULL);
>  name = malloc(nameLength)
>  source = malloc(sourceLength)
>  glGetActiveInternalRepr(NULL, name, NULL, source);
>  }
>
> And this would need to be plumbed through all the way inside the
> drivers,
> each layer would  advertise additional IRs.
>
> And the information here would only be obtainable/valid immediately
> after
> a
> draw call.
>
>
> A completely different tack, is that apitrace's glretrace would
> advertise
> an
> unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
> drivers/layers would write shaders repres, and when they are
> bound/unbound/destroyed on  a preestablished format:
>
> CREATE "GLSL/123"
> ...
> EOF
>
> CREATE TGSI/456
> EOF
>
> BIND GLSL/123
> BIND TGSI/456
> BIND HW/789
>
> UNBIND GLSL/123
> UNBIND TGSI/456
> UNBIND HW/789
>
> DESTROY GLSL/123
> DESTROY TGSI/456
> DESTROY HW/789
>
>
> I don't feel strongly either way, but I suspect that having a proper
> extension, even if a little more work at start, will be more robust on
> the
> long term.  And less runtime overhead.  GL extensions also give a
> mechanism
> to revise/deprecate this functionality in the future.



 This would still require fairly extensive changes as you'd have to
 track all the bindings together.
>>>
>>>
>>>
>>> Really? I don't think so.  Which alternative are you referring to?
>>
>>
>> The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
>> TGSI/HW as it might change based on other states. By the time you get
>> it sufficiently robust, you might as well do the GL extension.
>>
>>>
>>> Yet another option would be to provide a callback
>>>
>>>typedef void (*GLircallbackMESA)(const char *name, const char *body);
>>>
>>>void glGetActiveInternalReprMesa(GLircallbackMESA callback);
>>>
>>> and basically each layer would dump the IRs, and invoke the downstream
>>> layers with the same callback.
>>
>>
>> What "name" would the driver supply here? And how would you link
>> things up together?
>
>
> Giving llvmpipe example, which I'm more familiar,
>
>  - src/mesa/state_tracKer would invoke with "state_tracker/tgsi/{vs,fs}"
> and "glsl-ir/{vs,fs}"
>

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:08, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:

On 02/07/15 16:34, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:


On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14



on bug 91173  from
Ilia Mirkin  *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

 0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
 3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
 7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.



Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.



I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to extract
a
text representation of the current bound GLSL, TGSI, hardware speicfic,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on similar
situations.


I did something akin to this for NVIDIA prioprietary drivers on

https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is apropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

 GLint n;
 // this will trigget IRs being collected into an array internally
 glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

 for (i=0; i < n; ++i) {
 GLint nameLength;
 char *name;
 GLint sourceLength;
 char *source;
 glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
 name = malloc(nameLength)
 source = malloc(sourceLength)
 glGetActiveInternalRepr(NULL, name, NULL, source);
 }

And this would need to be plumbed through all the way inside the drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately after
a
draw call.


A completely different tack, is that apitrace's glretrace would advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.



This would still require fairly extensive changes as you'd have to
track all the bindings together.



Really? I don't think so.  Which alternative are you referring to?


The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

   typedef void (*GLircallbackMESA)(const char *name, const char *body);

   void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.


What "name" would the driver supply here? And how would you link
things up together?


Giving llvmpipe example, which I'm more familiar,

 - src/mesa/state_tracKer would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
 - and invoke pipe_context::get_active_ir (callback) if the pipe driver 
implements it

 - src/gallium/drivers/llvmpipe would invoke with
   - "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker 
due to draw module

   - "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
   - and maybe even "llvmpipe/x86/{vs,fs}

The idea is that this glGetActiveInternalReprMesa() call dumps what's 
active _now_, which is only makes sense immediately after draw calls. So 
the only thing the drivers need to do is dump what they see bound.


That is, there's no need to "link things up together".

Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:
> On 02/07/15 16:34, Ilia Mirkin wrote:
>>
>> On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:
>>>
>>> On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14
>>> 

 on bug 91173  from
 Ilia Mirkin  *

 Erm... ok...

 MOV R0.zw, c[A0.x + 9];
 MOV R1.x, c[0].w;
 ADD R0.x, c[A0.x + 9].y, R1;
 FLR R0.y, R0.x;

 vs

 0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
 3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
 7: FLR TEMP[0].y, CONST[0].

 Could be that I'm matching the wrong shaders. But this seems highly
 suspect.
 Need to see if there's a good way of dumping mesa ir... I wonder if it
 doesn't
 notice the write-mask on the MOV R0.zw and thinks that R0 contains the
 value it
 wants.
>>>
>>>
>>> Nice detective work on this bug, Ilia.
>>>
 Could be that I'm matching the wrong shaders.
>>>
>>>
>>> I think it could be quite useful if there was a
>>> "GL_MESAX_get_internal_representation" Mesa specific extension to extract
>>> a
>>> text representation of the current bound GLSL, TGSI, hardware speicfic,
>>> etc,
>>> exclusively for debugging purposes.
>>>
>>> It doesn't even need to be advertised on non-debug builds of Mesa.  But
>>> merely being able to see next to each other all the IRs at a given call
>>> in a
>>> trace, will probably save some time / grief for us developers on similar
>>> situations.
>>>
>>>
>>> I did something akin to this for NVIDIA prioprietary drivers on
>>>
>>> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
>>> but I don't think GetProgramBinary is apropriate for Mesa (only one
>>> format.)
>>>
>>>
>>> Instead, for Mesa we could have something like
>>>
>>> GLint n;
>>> // this will trigget IRs being collected into an array internally
>>> glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>>>
>>> for (i=0; i < n; ++i) {
>>> GLint nameLength;
>>> char *name;
>>> GLint sourceLength;
>>> char *source;
>>> glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
>>> name = malloc(nameLength)
>>> source = malloc(sourceLength)
>>> glGetActiveInternalRepr(NULL, name, NULL, source);
>>> }
>>>
>>> And this would need to be plumbed through all the way inside the drivers,
>>> each layer would  advertise additional IRs.
>>>
>>> And the information here would only be obtainable/valid immediately after
>>> a
>>> draw call.
>>>
>>>
>>> A completely different tack, is that apitrace's glretrace would advertise
>>> an
>>> unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
>>> drivers/layers would write shaders repres, and when they are
>>> bound/unbound/destroyed on  a preestablished format:
>>>
>>> CREATE "GLSL/123"
>>> ...
>>> EOF
>>>
>>> CREATE TGSI/456
>>> EOF
>>>
>>> BIND GLSL/123
>>> BIND TGSI/456
>>> BIND HW/789
>>>
>>> UNBIND GLSL/123
>>> UNBIND TGSI/456
>>> UNBIND HW/789
>>>
>>> DESTROY GLSL/123
>>> DESTROY TGSI/456
>>> DESTROY HW/789
>>>
>>>
>>> I don't feel strongly either way, but I suspect that having a proper
>>> extension, even if a little more work at start, will be more robust on
>>> the
>>> long term.  And less runtime overhead.  GL extensions also give a
>>> mechanism
>>> to revise/deprecate this functionality in the future.
>>
>>
>> This would still require fairly extensive changes as you'd have to
>> track all the bindings together.
>
>
> Really? I don't think so.  Which alternative are you referring to?

The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.

>
> Yet another option would be to provide a callback
>
>   typedef void (*GLircallbackMESA)(const char *name, const char *body);
>
>   void glGetActiveInternalReprMesa(GLircallbackMESA callback);
>
> and basically each layer would dump the IRs, and invoke the downstream
> layers with the same callback.

What "name" would the driver supply here? And how would you link
things up together?

>
>
>> Anyways, *something* would be fantastic. It's also incredibly
>> difficult to tell what shader is being used for a particular draw...
>> I've resorted to taking mmt traces of nouveau to see what it's
>> actually drawing with. Sometimes qapitrace doesn't show a shader,
>> sometimes it's a fixed function shader, etc. Note that among other
>> things, this has to account for any shader keys that might exist at
>> any of the levels. And since bugs are often in optimization passes,
>> being able to see both pre- and post-opt shaders would be *really*
>> nice.
>
>
>
> That's doable. Each layer just needs to keep the intermediate versions
> around.  We 

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 16:34, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:

On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14


on bug 91173  from
Ilia Mirkin  *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.


Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.


I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to extract a
text representation of the current bound GLSL, TGSI, hardware speicfic, etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call in a
trace, will probably save some time / grief for us developers on similar
situations.


I did something akin to this for NVIDIA prioprietary drivers on
https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is apropriate for Mesa (only one format.)


Instead, for Mesa we could have something like

GLint n;
// this will trigget IRs being collected into an array internally
glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

for (i=0; i < n; ++i) {
GLint nameLength;
char *name;
GLint sourceLength;
char *source;
glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
name = malloc(nameLength)
source = malloc(sourceLength)
glGetActiveInternalRepr(NULL, name, NULL, source);
}

And this would need to be plumbed through all the way inside the drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately after a
draw call.


A completely different tack, is that apitrace's glretrace would advertise an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on the
long term.  And less runtime overhead.  GL extensions also give a mechanism
to revise/deprecate this functionality in the future.


This would still require fairly extensive changes as you'd have to
track all the bindings together.


Really? I don't think so.  Which alternative are you referring to?

Yet another option would be to provide a callback

  typedef void (*GLircallbackMESA)(const char *name, const char *body);

  void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream 
layers with the same callback.




Anyways, *something* would be fantastic. It's also incredibly
difficult to tell what shader is being used for a particular draw...
I've resorted to taking mmt traces of nouveau to see what it's
actually drawing with. Sometimes qapitrace doesn't show a shader,
sometimes it's a fixed function shader, etc. Note that among other
things, this has to account for any shader keys that might exist at
any of the levels. And since bugs are often in optimization passes,
being able to see both pre- and post-opt shaders would be *really*
nice.



That's doable. Each layer just needs to keep the intermediate versions 
around.  We could have glretrace invoke a 
glEnable(GL_KEEP_ALL_IR_MESAX), so that the driver would have not have 
to do this all the time.



Jose

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Ilia Mirkin
On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:
> On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14
> 
>> on bug 91173  from
>> Ilia Mirkin  *
>>
>> Erm... ok...
>>
>> MOV R0.zw, c[A0.x + 9];
>> MOV R1.x, c[0].w;
>> ADD R0.x, c[A0.x + 9].y, R1;
>> FLR R0.y, R0.x;
>>
>> vs
>>
>>0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
>>3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
>>7: FLR TEMP[0].y, CONST[0].
>>
>> Could be that I'm matching the wrong shaders. But this seems highly
>> suspect.
>> Need to see if there's a good way of dumping mesa ir... I wonder if it
>> doesn't
>> notice the write-mask on the MOV R0.zw and thinks that R0 contains the
>> value it
>> wants.
>
> Nice detective work on this bug, Ilia.
>
>> Could be that I'm matching the wrong shaders.
>
> I think it could be quite useful if there was a
> "GL_MESAX_get_internal_representation" Mesa specific extension to extract a
> text representation of the current bound GLSL, TGSI, hardware speicfic, etc,
> exclusively for debugging purposes.
>
> It doesn't even need to be advertised on non-debug builds of Mesa.  But
> merely being able to see next to each other all the IRs at a given call in a
> trace, will probably save some time / grief for us developers on similar
> situations.
>
>
> I did something akin to this for NVIDIA prioprietary drivers on
> https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
> but I don't think GetProgramBinary is apropriate for Mesa (only one format.)
>
>
> Instead, for Mesa we could have something like
>
>GLint n;
>// this will trigget IRs being collected into an array internally
>glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);
>
>for (i=0; i < n; ++i) {
>GLint nameLength;
>char *name;
>GLint sourceLength;
>char *source;
>glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
>name = malloc(nameLength)
>source = malloc(sourceLength)
>glGetActiveInternalRepr(NULL, name, NULL, source);
>}
>
> And this would need to be plumbed through all the way inside the drivers,
> each layer would  advertise additional IRs.
>
> And the information here would only be obtainable/valid immediately after a
> draw call.
>
>
> A completely different tack, is that apitrace's glretrace would advertise an
> unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
> drivers/layers would write shaders repres, and when they are
> bound/unbound/destroyed on  a preestablished format:
>
> CREATE "GLSL/123"
> ...
> EOF
>
> CREATE TGSI/456
> EOF
>
> BIND GLSL/123
> BIND TGSI/456
> BIND HW/789
>
> UNBIND GLSL/123
> UNBIND TGSI/456
> UNBIND HW/789
>
> DESTROY GLSL/123
> DESTROY TGSI/456
> DESTROY HW/789
>
>
> I don't feel strongly either way, but I suspect that having a proper
> extension, even if a little more work at start, will be more robust on the
> long term.  And less runtime overhead.  GL extensions also give a mechanism
> to revise/deprecate this functionality in the future.

This would still require fairly extensive changes as you'd have to
track all the bindings together.

Anyways, *something* would be fantastic. It's also incredibly
difficult to tell what shader is being used for a particular draw...
I've resorted to taking mmt traces of nouveau to see what it's
actually drawing with. Sometimes qapitrace doesn't show a shader,
sometimes it's a fixed function shader, etc. Note that among other
things, this has to account for any shader keys that might exist at
any of the levels. And since bugs are often in optimization passes,
being able to see both pre- and post-opt shaders would be *really*
nice.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers

2015-07-02 Thread Jon TURNEY

On 02/07/2015 13:42, Emil Velikov wrote:

On 27 June 2015 at 12:21, Jon TURNEY wrote:

This smells to me like there's something not quite right about the
partitioning of stuff between libmesa and libglapi, but I don't have any
insight into what exactly the problem is.


Quite likely. Although it would be better to resolve it, as (if my
understanding is correct) libGL should not know anything about mesa's
internals. It ought to interact with DRI and glapi alone.

Mildly related:
Jon what happened to the applegl-vtable branches ? I really liked the
direction that they were going.


I'd quite forgotten about those.

Currently, there is a build-time choice between using apple-dri, or
indirect and swrast. This is not ideal (I believe it makes remote 
clients under OSX useless), so they should all be runtime alternatives.


Those branches are really just experiments of mine to try to understand 
what that might look like before I did a similar implementation for Windows.


But I think I came to the conclusion that it was less work to use the 
existing __GLXDRIdisplay implementations as a model, rather than convert 
appple-dri to that interface and then use it as a model for my 
windows-dri implementation (see [1])


[1] 
http://cgit.freedesktop.org/~jturney/mesa/commit/?h=windows-dri&id=84b173c63727099ec74edc74a15c6ea953312360


I think it's perfectly possibly to put apple-dri into a similar form, 
it's just someone has to do it :)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-02 Thread Mike Stroyan
I had actually made the change to gen8_upload_blend_state, but after
reading through the gen8 PRM a few times I decided to back it out.
It does seem that the initial gen8 BLEND_STATE DWord can disable alpha test.
Of course, new hardware features may not always behave as described.
In that case allowances will need to be made.
Or, we could always fill in the extra eight bytes for the first BLEND_STATE
entry.
Given that the structure is 64 byte aligned, that 'extra' data is almost
always free.

On Thu, Jul 2, 2015 at 1:45 AM, Kenneth Graunke 
wrote:

> On Wednesday, July 01, 2015 10:16:28 AM Mike Stroyan wrote:
> > When there are no color buffer render targets, gen6 and gen7 still
> > use the first BLEND_STATE element to determine alpha test.
> > gen6_upload_blend_state was allocating zero elements when
> > ctx->Color.AlphaEnabled was false.
> > That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
> > pointing to random data from some previous brw_state_batch().
> > That sometimes suppressed depth rendering when those bits
> > happened to mean COMPAREFUNC_NEVER.
> > This produced flickering shadows for dota2 reborn.
> > ---
> >  src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c
> b/src/mesa/drivers/dri/i965/gen6_cc.c
> > index 2bfa271..2b76e24 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_cc.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
> > @@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
> >  * with render target 0, which will reference BLEND_STATE[0] for
> >  * alpha test enable.
> >  */
> > -   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
> > +   if (nr_draw_buffers == 0)
> >nr_draw_buffers = 1;
> >
> > size = sizeof(*blend) * nr_draw_buffers;
> >
>
> Great catch!
>
> Reviewed-by: Kenneth Graunke 
>
> And pushed:
>9d408a4..fe2b748  master -> master
>
> I think we ought to change gen8_blend_state.c as well, but I'm not quite
> sure what change to make.  Either we should make the same change you did
> here, or delete the whole "We need at least 1 BLEND_STATE written"
> block.
>
> On Gen8+, it looks like the alpha test and other functions that might
> discard pixels are all in the shared/common DWord, and the per-color
> target DWord pairs look relatively harmless.  I suppose the null RT
> would still refer to BLEND_STATE[0]...so it might still be worth
> emitting one.  Any thoughts?
>



-- 

 Mike Stroyan - Software Architect
 LunarG, Inc.  - The Graphics Experts
 Cell:  (970) 219-7905
 Email: m...@lunarg.com
 Website: http://www.lunarg.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] darwin: Suppress type conversion warnings for GLhandleARB

2015-07-02 Thread Jose Fonseca

On 01/07/15 00:33, Julien Isorce wrote:

darwin: silence GLhandleARB convertions from and to GLuint

This patch and its description are inspired from Jose Fonseca
explanations and suggestions.

With this patch the following logic applies and only if __APPLE__:

When building mesa, GLhandleARB is defined as unsigned long and
at some point casted to GLuint in gl fuction implementations.
These exact points are where these errors and warnings appear.

When building an application GLhandleARB is defined as void*.
Later when calling a gl function, for example glBindAttribLocationARB,
it will be dispatched to _mesa_BindAttribLocation. So internally
void* will be treated as unsigned long which has the same size.
So the same truncation happens when casting it to GLuint.

Same when GLhandleARB appears as return value.
For mesa it will be GLuint -> unsigned long.
For an application it will be GLuint -> unsigned long -> void*.
Note that the value will be preserved when casting back to GLuint.

When GLhandleARB appears as a pointer there are also separate
entry-points, i.e. _mesa_FuncNameARB. So the same logic can
be applied.

https://bugs.freedesktop.org/show_bug.cgi?id=66346
Signed-off-by: Julien Isorce 
---
  configure.ac   | 2 +-
  include/GL/glext.h | 5 +
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 7661bd9..1cd8e77 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1357,7 +1357,7 @@ if test "x$enable_dri" = xyes; then
  fi
  ;;
  darwin*)
-DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
+DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
  if test "x$with_dri_drivers" = "xyes"; then
  with_dri_drivers="swrast"
  fi
diff --git a/include/GL/glext.h b/include/GL/glext.h
index a3873a6..e5f1d89 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB (GLfloat 
value);
  #ifndef GL_ARB_shader_objects
  #define GL_ARB_shader_objects 1
  #ifdef __APPLE__
+#ifdef BUILDING_MESA
+/* Avoid uint <-> void* warnings */
+typedef unsigned long GLhandleARB;
+#else
  typedef void *GLhandleARB;
+#endif
  #else
  typedef unsigned int GLhandleARB;
  #endif



The only snafu here is that this change can get dropped when updating 
glext.h.  On the other hand, I believe that there's also precedent to 
modify Khronos headers in Mesa.


Reviewed-by: Jose Fonseca 

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] darwin: Suppress type conversion warnings for GLhandleARB

2015-07-02 Thread Jose Fonseca

On 02/07/15 13:16, Emil Velikov wrote:

On 1 July 2015 at 00:33, Julien Isorce  wrote:

darwin: silence GLhandleARB convertions from and to GLuint

This patch and its description are inspired from Jose Fonseca
explanations and suggestions.

With this patch the following logic applies and only if __APPLE__:

When building mesa, GLhandleARB is defined as unsigned long and
at some point casted to GLuint in gl fuction implementations.
These exact points are where these errors and warnings appear.

When building an application GLhandleARB is defined as void*.
Later when calling a gl function, for example glBindAttribLocationARB,
it will be dispatched to _mesa_BindAttribLocation. So internally
void* will be treated as unsigned long which has the same size.
So the same truncation happens when casting it to GLuint.

Same when GLhandleARB appears as return value.
For mesa it will be GLuint -> unsigned long.
For an application it will be GLuint -> unsigned long -> void*.
Note that the value will be preserved when casting back to GLuint.

When GLhandleARB appears as a pointer there are also separate
entry-points, i.e. _mesa_FuncNameARB. So the same logic can
be applied.

https://bugs.freedesktop.org/show_bug.cgi?id=66346
Signed-off-by: Julien Isorce 
---
  configure.ac   | 2 +-
  include/GL/glext.h | 5 +
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 7661bd9..1cd8e77 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1357,7 +1357,7 @@ if test "x$enable_dri" = xyes; then
  fi
  ;;
  darwin*)
-DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
+DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
  if test "x$with_dri_drivers" = "xyes"; then
  with_dri_drivers="swrast"
  fi
diff --git a/include/GL/glext.h b/include/GL/glext.h
index a3873a6..e5f1d89 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB (GLfloat 
value);
  #ifndef GL_ARB_shader_objects
  #define GL_ARB_shader_objects 1
  #ifdef __APPLE__
+#ifdef BUILDING_MESA
+/* Avoid uint <-> void* warnings */
+typedef unsigned long GLhandleARB;
+#else
  typedef void *GLhandleARB;
+#endif

Ideally we'll ship a header without this change, but that evolves
adding a hook at the build/install stage into the autotools build.


I don't think that complexity is justifiable.  There is even a 
precedent, e.g., from the top of mesa/include/GL/gl.h:


#  if (defined(_MSC_VER) || defined(__MINGW32__)) && defined(BUILD_GL32) 
/* tag specify we're building mesa as a DLL */


Many other open-sources projects have public headers that react 
different when building vs external use, and I have yet seen one that 
has install hooks to filter that stuff out.


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH mesa v2] i965/gen8+: bo in state base address must be in 32-bit address range

2015-07-02 Thread Chris Wilson
On Thu, Jul 02, 2015 at 02:53:45PM +0100, Michel Thierry wrote:
> >I would have just exposed setting the flag on the execobject.  That way you
> >still have existing userspace safe by default, can set a
> >bufmgr-level flag to enable 48bit support by default and then
> >individually turn off 48bit support for the couple of buffers that
> >require it.
> >
> So, something more like v1? 
> http://mid.gmane.org/1435062089-19877-2-git-send-email-michel.thie...@intel.com
> 
> (apart of the hacky macro name)

Yes. I value having those 2-dwords marked as being the relocation
address even if the top dword must be zero.

You can adopt the libdrm interface Ben suggested underneath the
macros, but having it marked as being both a 64-bit relocation and a w/a
in the source is pricless imo.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] glsl: fix some strict aliasing issues in exec_list

2015-07-02 Thread Neil Roberts
Davin McCall  writes:

> I actually had thought about this, but technically, you can only use
> unions for type aliasing if you perform all accesses (that are not to
> the 'active' member) through the union. All the list processing code
> that iterates through all the nodes including the tail sentinel would
> *technically* still have an aliasing problem, because it doesn't go
> through the exec_list structure at all (though in practice, I don't
> think it would ever manifest in as an issue in the compiled code).

I don't think it matters that the list iterating code doesn't use
exec_list. If something modifies the list pointers through an exec_node
then the compiler will know that that potentially aliases the pointers
in an exec_list because the pointers in exec_list are also wrapped in an
exec_node. With your patch there is no type aliasing at all and the
union modification doesn't alter that.

> Why must all accesses go through the union? Consider the case:
>
>  int someMethod(float *f, int *i) {
>  *f = 4.0;// LINE A
>  int a = *i;   // LINE B
>  return a;
>  }
>
> If I had some union 'u' with { float ff; int ii; } then I could call
> the above method, even if it was in a different module, with:
>
>  someMethod(&u.ff, &u.ii).
>
> Now, that would mean that the compiler would not be allowed to
> re-order LINE A and LINE B. But as I said, someMethod might be in a
> different module, where the compiler does not know that 'i' and 'f'
> point to members of the same union. In that case it assumes that 'i'
> and 'f' don't alias. Compare that to:
>
>  int someMethod(union u *a, union u *b)
>  {
>  u->ff = 4.0;
>  int a = u->ii;
>  return a;
>  }
>
> In this version, accesses are through the union, and the compiler
> knows that they potentially alias.

I don't think this example is relevant in this case because all of the
relevant members of the union I suggested are the same type (struct
exec_node). There is no type aliasing.

Maybe a hypothetical problem with this sort of use could be if you had a
function like this:

struct exec_node *
some_method(struct exec_node *a,
struct exec_node *b)
{
   a->prev = &something;
   b->next = &something_else;
   return a->prev;
}

If you called this with the head and tail sentinels then the compiler
won't know that a->prev and b->next alias each other so it might return
&something instead of &something_else. However in practice for this use
case the only part that is aliased is a NULL pointer that is never
written to so I don't think it would actually matter.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] applegl: Provide requirements of _SET_DrawBuffers

2015-07-02 Thread Emil Velikov
On 27 June 2015 at 12:21, Jon TURNEY  wrote:
> On 19/06/2015 12:51, Emil Velikov wrote:
>>
>> On 19 June 2015 at 00:05, Julien Isorce  wrote:
>>>
>>> Sorry for removing the XXX line. Original message is here:
>>> https://bugs.freedesktop.org/attachment.cgi?id=115539
>>>
>>> At the time src/glx/apple/apple_glapi.c has been developed this patch was
>>> not needed I guess so I wonder which change made the regression.
>
>
> I'm not sure bisecting is going to work too well, as you might find it's
> apple-glx isn't buildable for most of the range, and you will probably land
> on a commit which say "rewrite glapi (again)" :D
>
True :'-(
If only we had more people to report things earlier/more often as things broke.


>> That's one way to get to the bottom of it. Alternatively we can take a
>> look at how other glx (indirect, xlib) handle it.
>
>
>> As is, the patch gets a big fat NACK, sorry.
>
>
> This smells to me like there's something not quite right about the
> partitioning of stuff between libmesa and libglapi, but I don't have any
> insight into what exactly the problem is.
>
Quite likely. Although it would be better to resolve it, as (if my
understanding is correct) libGL should not know anything about mesa's
internals. It ought to interact with DRI and glapi alone.

Mildly related:
Jon what happened to the applegl-vtable branches ? I really liked the
direction that they were going.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] egl: use unix defines on osx with clang

2015-07-02 Thread Emil Velikov
On 19 June 2015 at 12:43, Emil Velikov  wrote:
> On 19 June 2015 at 00:23, Julien Isorce  wrote:
>> On 18 June 2015 at 19:33, Emil Velikov  wrote:
>>>
>>> On 18 June 2015 at 19:29, Emil Velikov  wrote:
>>> Sorry about that. Unintentionally hit send ;-\
>>>
>>> > On 18 June 2015 at 06:53, Julien Isorce  wrote:
>>> >> CC   egl_dri2.lo
>>> >> include/EGL/eglplatform.h:135:2:
>>> >>   error: "Platform not recognized"
>>> >> include/EGL/eglplatform.h:140:9:
>>> >>   error: unknown type name 'EGLNativeDisplayType'
>>> >> typedef EGLNativeDisplayType NativeDisplayType;
>>> >>
>>> You should not longer see this message. Did you try building things,
>>> with the updated eglplatform.h ?
>>
>>
>> You are right I forgot to update the new error message with current
>> eglplatform.h.
>>
>>>
>>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90249
>>> >> Signed-off-by: Julien Isorce 
>>> >> ---
>>> >>  include/EGL/eglplatform.h | 4 ++--
>>> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>>> >>
>>> >> diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h
>>> >> index 7802542..b376e64 100644
>>> >> --- a/include/EGL/eglplatform.h
>>> >> +++ b/include/EGL/eglplatform.h
>>> >> @@ -77,7 +77,7 @@ typedef HDC EGLNativeDisplayType;
>>> >>  typedef HBITMAP EGLNativePixmapType;
>>> >>  typedef HWNDEGLNativeWindowType;
>>> >>
>>> >> -#elif defined(__APPLE__) || defined(__WINSCW__) ||
>>> >> defined(__SYMBIAN32__)  /* Symbian */
>>> >> +#elif defined(__WINSCW__) || defined(__SYMBIAN32__)  /* Symbian */
>>> >>
>>> The above "defined(__APPLE__))" comes from Khronos, so if it's wrong
>>> perhaps it should be reported to them ? Afaict with current mesa this
>>> patch is not needed, and things will just work.
>>
>>
>> I remember that with the latest eglplatform.h I had to manually include
>> #include  / #include  where they are used  (because
>> not hitting their inclusion by eglplatform.h)
>> But then build was complaining about converting int to Display*, and void*
>> to Pixmap/Window. A naive cast could silenced the build error but then it
>> was crashing at runtime.
>> I'll have another try.
>>
> So it seems that we need something like MESA_EGL_NO_X11_HEADERS for
> __APPLE__. If anyone has personal experience with
> upstreaming/discussing such changes to Khronos, some input would be
> appreciated :-)
>
> I think that updating the commit message (and of course getting in
> touch with the Khronos guys). With Samsung being a "Promoter member"
> there should be a few people from Samsung that can help sort/speed
> things up.
>
In case ^^ was on the vague side:
Imho we should be safe to apply this, with updated commit message.
Although for changes involving the official headers, we should seek
feedback and try to fix/upstream them.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] darwin: Suppress type conversion warnings for GLhandleARB

2015-07-02 Thread Emil Velikov
On 1 July 2015 at 00:33, Julien Isorce  wrote:
> darwin: silence GLhandleARB convertions from and to GLuint
>
> This patch and its description are inspired from Jose Fonseca
> explanations and suggestions.
>
> With this patch the following logic applies and only if __APPLE__:
>
> When building mesa, GLhandleARB is defined as unsigned long and
> at some point casted to GLuint in gl fuction implementations.
> These exact points are where these errors and warnings appear.
>
> When building an application GLhandleARB is defined as void*.
> Later when calling a gl function, for example glBindAttribLocationARB,
> it will be dispatched to _mesa_BindAttribLocation. So internally
> void* will be treated as unsigned long which has the same size.
> So the same truncation happens when casting it to GLuint.
>
> Same when GLhandleARB appears as return value.
> For mesa it will be GLuint -> unsigned long.
> For an application it will be GLuint -> unsigned long -> void*.
> Note that the value will be preserved when casting back to GLuint.
>
> When GLhandleARB appears as a pointer there are also separate
> entry-points, i.e. _mesa_FuncNameARB. So the same logic can
> be applied.
>
> https://bugs.freedesktop.org/show_bug.cgi?id=66346
> Signed-off-by: Julien Isorce 
> ---
>  configure.ac   | 2 +-
>  include/GL/glext.h | 5 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 7661bd9..1cd8e77 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1357,7 +1357,7 @@ if test "x$enable_dri" = xyes; then
>  fi
>  ;;
>  darwin*)
> -DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
> +DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
>  if test "x$with_dri_drivers" = "xyes"; then
>  with_dri_drivers="swrast"
>  fi
> diff --git a/include/GL/glext.h b/include/GL/glext.h
> index a3873a6..e5f1d89 100644
> --- a/include/GL/glext.h
> +++ b/include/GL/glext.h
> @@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB (GLfloat 
> value);
>  #ifndef GL_ARB_shader_objects
>  #define GL_ARB_shader_objects 1
>  #ifdef __APPLE__
> +#ifdef BUILDING_MESA
> +/* Avoid uint <-> void* warnings */
> +typedef unsigned long GLhandleARB;
> +#else
>  typedef void *GLhandleARB;
> +#endif
Ideally we'll ship a header without this change, but that evolves
adding a hook at the build/install stage into the autotools build.
If you feel extra nice and do so that'll be greatly appreciated.

Either way
Reviewed-by: Emil Velikov 

Cheers,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] New stable-branch 10.5 candidate pushed

2015-07-02 Thread Emil Velikov
Hello list,

The candidate for the Mesa 10.5.9 is now available. Currently we have:
- 28 queued
- 8 nominated (outstanding)
- and 0 rejected (obsoleete) patches

The present queue consist of many driver fixes - nouveau, i915, i965,
 r600/radeonsi, alongside GLSL ones (amongst which a Dota2 Reborn
bugfix) and a few build related cleanups.

Take a look at section "Mesa stable queue" for more information.

Testing
---
The following results are against piglit 305ecc3ac89.


Changes - classic i965(snb)
---
None.


Changes - swrast classic

Fixes:
 - spec
+ !opengl 1.1
   + drawbuffer-modesfail > pass


Changes - gallium softpipe
--
None.


Changes - gallium llvmpipe (LLVM 3.6)
-
None.


Testing reports/general approval

Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.


Trivial merge conflicts
---
commit e3bcc8775598c36595d5044163e6a3b658fea2d8
Author: Chris Wilson 

i965: Export format comparison for blitting between miptrees

(cherry picked from commit 922c0c9fd526ce19b87bc74a3159dec7705c1de1)


commit 3c4570e3fc283f81b5cf4b3b05a9bb66cf47ae2f
Author: Michel Dänzer 

winsys/radeon: Unmap GPU VM address range when destroying BO

(cherry picked from commit 7796e8889a9a2cc1b454dc32d8da3d756404339a)


commit 5bec73364111907dc0a32c38b7d5f9b11c19626a
Author: Tapani Pälli 

glsl: Allow dynamic sampler array indexing with GLSL ES < 3.00

(cherry picked from commit edb8383c98ee23385731d0fc23a6b6673528a8ec)


commit f8ea1430ae0bbaf68f73aa3a706d9112402e396d
Author: Tapani Pälli 

i965: use EmitNoIndirectSampler for gen < 7

(cherry picked from commit 8852e26e93af1fc4b72bf9d57e847f53e1a1371b)


commit 81ac47605f887c9cbf5d2f3a77e7a71a28fbe37f
Author: Tapani Pälli 

glsl: validate sampler array indexing for 'constant-index-expression'

(cherry picked from commit 9350ea6979c48772e1fb55d4f1c7c5a3cfa987b0)



The plan is to have 10.5.9 this Friday (3rd of July).

If you have any questions or comments that you would like to share
before the release, please go ahead.


Cheers,
Emil


Mesa stable queue
-

Nominated (8)
==

Boyan Ding (1):
  i915: Add XRGB format to intel_screen_make_configs

Brian Paul (1):
  configure: don't try to build gallium DRI drivers if --disable-dri is set

Emil Velikov (1):
  bugzilla_mesa.sh: sort the bugs list by number

Ilia Mirkin (1):
  mesa: reset the source packing when creating temp transfer image

Mario Kleiner (1):
  radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

Tom Stellard (3):
  clover: Call clBuildProgram() notification function when build completes 
v2
  gallium/drivers: Add threadsafe wrappers for pipe_context and pipe_screen
  clover: Use threadsafe wrappers for pipe_screen and pipe_context


Queued (28)
===

Ben Widawsky (2):
  i965/gen9: Implement Push Constant Buffer workaround
  i965/skl: Use 1 register for uniform pull constant payload

Boyan Ding (1):
  egl/x11: Remove duplicate call to dri2_x11_add_configs_for_visuals

Chris Wilson (3):
  i965: Fix HW blitter pitch limits
  i915: Blit RGBX<->RGBA drawpixels
  i965: Export format comparison for blitting between miptrees

Emil Velikov (5):
  docs: Add sha256sums for the 10.5.8 release
  configure: warn about shared_glapi & xlib-glx only when both are set
  configure: error out when building backend-less libEGL
  configure: error out when building libEGL without shared-glapi
  gbm: do not (over)link against libglapi.so

Frank Henigman (1):
  gbm: dlopen libglapi so gbm_create_device works

Ilia Mirkin (8):
  glsl: add version checks to conditionals for builtin variable enablement
  mesa: add GL_PROGRAM_PIPELINE support in KHR_debug calls
  glsl: binding point is a texture unit, which is a combined space
  nvc0: always put all tfb bufs into bufctx
  nv50,nvc0: make sure to pushbuf_refn before putting bo into pushbuf_data
  nv50/ir: propagate modifier to right arg when const-folding mad
  nv50/ir: fix emission of address reg in 3rd source
  nv50/ir: copy joinAt when splitting both before and after

Mario Kleiner (1):
  nouveau: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads.

Michel Dänzer (1):
  winsys/radeon: Unmap GPU VM address range when destroying BO

Tapani Pälli (6):
  glsl: Allow dynamic sampler array indexing with GLSL ES < 3.00
  mesa/glsl: new compiler option EmitNoIndirectSampler
  i915: use EmitNoIndirectSampler
  mesa/st: use EmitNoIndirectSampler if !ARB_gpu_shader5
  i965: use EmitNoIndirectSampler for gen < 7
  glsl: validate sampler array indexing for 'constant-index-expression'
_

Re: [Mesa-dev] [PATCH 20/78] i965/nir/vec4: Implement load_uniform intrinsic

2015-07-02 Thread Iago Toral
On Thu, 2015-07-02 at 09:33 +0200, Iago Toral wrote:
> On Tue, 2015-06-30 at 11:53 -0700, Jason Ekstrand wrote:
> > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  
> > wrote:
> > > From: Iago Toral Quiroga 
> > >
> > > For the indirect case we need to take the index delivered by
> > > NIR and compute the parent uniform that we are accessing (the one
> > > that we uploaded to a surface) and the constant offset into that
> > > surface.
> > >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 27 
> > > +--
> > >  1 file changed, 25 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > index 3fa8ca8..9a0ae25 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > @@ -569,10 +569,33 @@ 
> > > vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
> > > }
> > >
> > > case nir_intrinsic_load_uniform_indirect:
> > > +  has_indirect = true;
> > >/* fallthrough */
> > > -   case nir_intrinsic_load_uniform:
> > > -  /* @TODO: Not yet implemented */
> > > +   case nir_intrinsic_load_uniform: {
> > > +  int index = instr->const_index[0];
> > > +  int uniform = nir_uniform_offset[index];
> > 
> > Again, I don't know that this indirection is really needed.  We should
> > make nir_lower_io just do the right thing for you here.
> 
> Ok, I'll see what kind of changes we need to achieve that.
> 

Initially, this was a consequence of the fact that we had not patched
nir_lower_io to operate in units of vec4 for non-scalar shaders, so we
needed a way to compute the vec4 offset from the scalar offset. With my
patch to nir_lower_io that is not needed any more except for the fact
that brw_nir.c calls nir_assign_var_locations_direct_first instead of
nir_assign_var_locations for uniforms, which breaks the assumption that
uniforms get locations in the same order as we find them in the shader.
Because of that we still need a way to map what nir_lower_io delivers to
a uniform number. Obviously, that function is very much tied to how the
FS works (which splits uniforms in direct only and indirect) so I
suppose it makes sense to call one or the other depending on whether
this is a scalar shader or not.

In summary: calling nir_assign_var_locations for uniforms in vec4
shaders instead of nir_assign_var_locations_direct_first allows us to
get rid of that nir_uniform_offset array mapping. Does this change look
good to you?

Iago

> 
> > > +  dest = get_nir_dest(instr->dest);
> > > +
> > > +  if (has_indirect) {
> > > + /* Split addressing into uniform and offset */
> > > + int offset = index - nir_uniform_driver_location[uniform];
> > > + assert(offset >= 0);
> > > +
> > > + uniform -= offset;
> > > + assert(uniform >= 0);
> > > +
> > > + src = src_reg(dst_reg(UNIFORM, uniform));
> > > + src.reg_offset = offset;
> > > + src_reg tmp = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_D);
> > > + src.reladdr = new(mem_ctx) src_reg(tmp);
> > > +  } else {
> > > + src = src_reg(dst_reg(UNIFORM, uniform));
> > > +  }
> > > +
> > > +  emit(MOV(dest, src));
> > >break;
> > > +   }
> > >
> > > case nir_intrinsic_atomic_counter_read:
> > > case nir_intrinsic_atomic_counter_inc:
> > > --
> > > 2.1.4
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/78] i965/vec4: Add auxiliary func to build a writemask from a component size

2015-07-02 Thread Eduardo Lima Mitev
On 06/30/2015 01:52 AM, Jason Ekstrand wrote:
> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>> New method brw_writemask_for_size() will return a writemask with the first 
>> 'size'
>> components activated.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>> ---
>>  src/mesa/drivers/dri/i965/brw_reg.h | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_reg.h 
>> b/src/mesa/drivers/dri/i965/brw_reg.h
>> index c8b1341..83fe488 100644
>> --- a/src/mesa/drivers/dri/i965/brw_reg.h
>> +++ b/src/mesa/drivers/dri/i965/brw_reg.h
>> @@ -950,6 +950,15 @@ brw_set_writemask(struct brw_reg reg, unsigned mask)
>> return reg;
>>  }
>>
>> +static inline unsigned
>> +brw_writemask_for_size(unsigned n)
>> +{
>> +  unsigned writemask = 0;
>> +  for (unsigned i = 0; i < n; i++)
>> + writemask |= 1 << i;
>> +  return writemask;
> 
> You don't need the loop; you can just do "return (1 << n) - 1;"
> 

Right, that's an embarrassing mistake :).

>> +}
>> +
>>  static inline struct brw_reg
>>  negate(struct brw_reg reg)
>>  {
>> --
>> 2.1.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/78] i965/nir/vec4: Add nir_get_dst() and nir_get_src() methods

2015-07-02 Thread Eduardo Lima Mitev
On 06/30/2015 06:26 PM, Jason Ekstrand wrote:
> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> These methods are essential for the implementation of the NIR->vec4 pass. 
>> They
>> work similar to their fs_nir counter-parts.
>>
>> When processing instructions, these methods are invoked to resolve the
>> brw registers (source or destination) corresponding to the NIR sources
>> or destination. It builds a map of NIR register index to brw register for
>> all registers locally allocated in a block.
>>
>> In the case of get_nir_src(), it also builds immediate registers on-the-fly
>> when queried for a SSA source which at this point can only correspond to
>> constant values.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.h   |   7 ++
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 102 
>> +
>>  2 files changed, 109 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
>> b/src/mesa/drivers/dri/i965/brw_vec4.h
>> index d837d90..a0f5935 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>> @@ -411,6 +411,13 @@ public:
>> virtual void nir_emit_jump(nir_jump_instr *instr);
>> virtual void nir_emit_texture(nir_tex_instr *instr);
>>
>> +   dst_reg get_nir_dest(nir_dest dest, enum brw_reg_type type);
>> +   dst_reg get_nir_dest(nir_dest dest, nir_alu_type type);
>> +   dst_reg get_nir_dest(nir_dest dest);
>> +   src_reg get_nir_src(nir_src src, enum brw_reg_type type);
>> +   src_reg get_nir_src(nir_src src, nir_alu_type type);
>> +   src_reg get_nir_src(nir_src src);
>> +
>> virtual dst_reg *make_reg_for_system_value(int location,
>>const glsl_type *type) = 0;
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> index 36c9dc0..1ec75ee 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
>> @@ -365,6 +365,108 @@ vec4_visitor::nir_emit_instr(nir_instr *instr)
>> }
>>  }
>>
>> +static dst_reg
>> +dst_reg_for_nir_reg(vec4_visitor *v, nir_register *nir_reg,
>> +unsigned base_offset, nir_src *indirect)
>> +{
>> +   dst_reg reg;
>> +
>> +   reg = v->nir_locals[nir_reg->index];
>> +
>> +   reg = offset(reg, base_offset * nir_reg->num_components);
>> +   if (indirect) {
>> +  int multiplier = nir_reg->num_components;
>> +
>> +  reg.reladdr = new(v->mem_ctx) src_reg(dst_reg(GRF, 
>> v->alloc.allocate(1)));
>> +  v->emit(v->MUL(dst_reg(*reg.reladdr),
>> + retype(v->get_nir_src(*indirect), BRW_REGISTER_TYPE_D),
>> + src_reg(multiplier)));
> 
> I'm not sure this is correct.  We need this in the FS backend because
> each register is a scalar value.  In vec4, each array element should
> correspond to a single vec4 register so you shouldn't need to
> multiply.  Actually, I think it's currently dead code in the backend
> because we lower it all away, so it may not be correct at all.
> 

You are right, this multiply is not needed. I already removed it and
verified it. This will be changed in v2.

>> +   }
>> +
>> +   return reg;
>> +}
>> +
>> +dst_reg
>> +vec4_visitor::get_nir_dest(nir_dest dest)
>> +{
>> +   return dst_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
>> +  dest.reg.indirect);
>> +}
>> +
>> +dst_reg
>> +vec4_visitor::get_nir_dest(nir_dest dest, enum brw_reg_type type)
>> +{
>> +   dst_reg reg = get_nir_dest(dest);
>> +   return retype(reg, type);
>> +}
>> +
>> +dst_reg
>> +vec4_visitor::get_nir_dest(nir_dest dest, nir_alu_type type)
>> +{
>> +   dst_reg reg = get_nir_dest(dest);
>> +   return retype(reg, brw_type_for_nir_type(type));
>> +}
>> +
>> +src_reg
>> +vec4_visitor::get_nir_src(nir_src src, enum brw_reg_type type)
>> +{
>> +   dst_reg reg;
>> +
>> +   if (src.is_ssa) {
>> +  assert(src.ssa->parent_instr->type == nir_instr_type_load_const);
>> +  nir_load_const_instr *load = 
>> nir_instr_as_load_const(src.ssa->parent_instr);
>> +
>> +  reg = dst_reg(GRF, alloc.allocate(src.ssa->num_components));
> 
> It's vec4, you only need one register, not num_components-many.
> 

Right. This comes from our initial misconception of how the units were
given in vec4 vs. scalar. I have changed that everywhere in the backend
already, to allocate exactly 1 unit.

>> +  reg = retype(reg, type);
>> +
>> +  /* @FIXME: while this is what fs_nir does, we can do this better in 
>> the VS
>> +   * stage because we can emit vector operations and save some MOVs in
>> +   * cases where the constants are representable in 8 bits.
>> +   * So by now, we emit a MOV for each component.
>> +   */
>> +  for (unsigned i = 0; i < src.ssa->num_components; ++i) {
>> + reg.writemask = 1 << i;
>> +
>> + 

[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2015-07-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Mike  changed:

   What|Removed |Added

 CC||bugs.freedesktop@pm.por
   ||t0.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91187] F.E.A.R 2 (in Wine) runs out of memory and crashes

2015-07-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91187

Bug ID: 91187
   Summary: F.E.A.R 2 (in Wine) runs out of memory and crashes
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: gyebr...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

With mesa/nouveau the game crashes when I start a new game or load a previously
saved game. Text on the loading screen becomes corrupted (blocky) and the game
crashes a few seconds later.
Looks like it happens when the graphical option  is
enabled and set to maximum. If I reduce every other graphical options to the
minimum but leave Texture LOD on maximum then the game crashes.

With Nvidia blob 340.76 the game runs stable for a long period of time even if
I set everything to the maximum including anti-aliasing..

The crash occurs with the software renderer too.

Wine terminal output+backtrace:
http://pastebin.com/0AhVxDMj

Trace (uncompressed 450 MB):
https://drive.google.com/open?id=0B-tTbLKBl-tOVExQSEFERjZLeUE

Fedora 22 32-bit
VGA compatible controller: NVIDIA Corporation G92 [GeForce GTS 250] (rev a2)
(prog-if 00 [VGA controller])
Mesa from git
Kernel 4.0.6-300.fc22.i686+PAE
Xorg 1.17.2
libdrm-2.4.61-3.fc22.i686

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/vs: Fix matNxM vertex attributes where M != 4.

2015-07-02 Thread Kenneth Graunke
Matrix vertex attributes have their columns padded out to vec4s, which
I was failing to account for.  Scalar NIR expects them to be packed,
however.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

I still need to write proper Piglit tests for this.  We have basically a single
test for matrix vertex attributes, and that's a mat4 (which worked).

But I figure we probably shouldn't hold up the bugfix on that.

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index caf1300..37b1ed7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -91,12 +91,19 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
   * So, we need to copy from fs_reg(ATTR, var->location) to
   * offset(nir_inputs, var->data.driver_location).
   */
- unsigned components = var->type->without_array()->components();
+ const glsl_type *const t = var->type->without_array();
+ const unsigned components = t->components();
+ const unsigned cols = t->matrix_columns;
+ const unsigned elts = t->vector_elements;
  unsigned array_length = var->type->is_array() ? var->type->length : 1;
  for (unsigned i = 0; i < array_length; i++) {
-for (unsigned j = 0; j < components; j++) {
-   bld.MOV(retype(offset(input, bld, components * i + j), type),
-   offset(fs_reg(ATTR, var->data.location + i, type), bld, 
j));
+for (unsigned j = 0; j < cols; j++) {
+   for (unsigned k = 0; k < elts; k++) {
+  bld.MOV(offset(retype(input, type), bld,
+ components * i + elts * j + k),
+  offset(fs_reg(ATTR, var->data.location + i, type),
+ bld, 4 * j + k));
+   }
 }
  }
  break;
-- 
2.4.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10.6] i965/vs: Fix matNxM vertex attributes where M != 4.

2015-07-02 Thread Kenneth Graunke
Matrix vertex attributes have their columns padded out to vec4s, which
I was failing to account for.  Scalar NIR expects them to be packed,
however.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

Here's a backport of the patch to 10.6, since the one on master won't apply
due to all the builder changes.

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 5dd8363..6d11c0d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -89,12 +89,19 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
   * So, we need to copy from fs_reg(ATTR, var->location) to
   * offset(nir_inputs, var->data.driver_location).
   */
- unsigned components = var->type->without_array()->components();
+ const glsl_type *const t = var->type->without_array();
+ const unsigned components = t->components();
+ const unsigned cols = t->matrix_columns;
+ const unsigned elts = t->vector_elements;
  unsigned array_length = var->type->is_array() ? var->type->length : 1;
  for (unsigned i = 0; i < array_length; i++) {
-for (unsigned j = 0; j < components; j++) {
-   emit(MOV(retype(offset(input, components * i + j), type),
-offset(fs_reg(ATTR, var->data.location + i, type), 
j)));
+for (unsigned j = 0; j < cols; j++) {
+   for (unsigned k = 0; k < elts; k++) {
+  emit(MOV(offset(retype(input, type),
+  components * i + elts * j + k),
+   offset(fs_reg(ATTR, var->data.location + i, type),
+  4 * j + k)));
+   }
 }
  }
  break;
-- 
2.4.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: allocate at least 1 BLEND_STATE element

2015-07-02 Thread Kenneth Graunke
On Wednesday, July 01, 2015 10:16:28 AM Mike Stroyan wrote:
> When there are no color buffer render targets, gen6 and gen7 still
> use the first BLEND_STATE element to determine alpha test.
> gen6_upload_blend_state was allocating zero elements when
> ctx->Color.AlphaEnabled was false.
> That left _3DSTATE_CC_STATE_POINTERS or _3DSTATE_BLEND_STATE_POINTERS
> pointing to random data from some previous brw_state_batch().
> That sometimes suppressed depth rendering when those bits
> happened to mean COMPAREFUNC_NEVER.
> This produced flickering shadows for dota2 reborn.
> ---
>  src/mesa/drivers/dri/i965/gen6_cc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_cc.c 
> b/src/mesa/drivers/dri/i965/gen6_cc.c
> index 2bfa271..2b76e24 100644
> --- a/src/mesa/drivers/dri/i965/gen6_cc.c
> +++ b/src/mesa/drivers/dri/i965/gen6_cc.c
> @@ -51,7 +51,7 @@ gen6_upload_blend_state(struct brw_context *brw)
>  * with render target 0, which will reference BLEND_STATE[0] for
>  * alpha test enable.
>  */
> -   if (nr_draw_buffers == 0 && ctx->Color.AlphaEnabled)
> +   if (nr_draw_buffers == 0)
>nr_draw_buffers = 1;
>  
> size = sizeof(*blend) * nr_draw_buffers;
> 

Great catch!

Reviewed-by: Kenneth Graunke 

And pushed:
   9d408a4..fe2b748  master -> master

I think we ought to change gen8_blend_state.c as well, but I'm not quite
sure what change to make.  Either we should make the same change you did
here, or delete the whole "We need at least 1 BLEND_STATE written"
block.

On Gen8+, it looks like the alpha test and other functions that might
discard pixels are all in the shared/common DWord, and the per-color
target DWord pairs look relatively harmless.  I suppose the null RT
would still refer to BLEND_STATE[0]...so it might still be worth
emitting one.  Any thoughts?


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 20/78] i965/nir/vec4: Implement load_uniform intrinsic

2015-07-02 Thread Iago Toral
On Tue, 2015-06-30 at 11:53 -0700, Jason Ekstrand wrote:
> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> > From: Iago Toral Quiroga 
> >
> > For the indirect case we need to take the index delivered by
> > NIR and compute the parent uniform that we are accessing (the one
> > that we uploaded to a surface) and the constant offset into that
> > surface.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 27 +--
> >  1 file changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > index 3fa8ca8..9a0ae25 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > @@ -569,10 +569,33 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> > *instr)
> > }
> >
> > case nir_intrinsic_load_uniform_indirect:
> > +  has_indirect = true;
> >/* fallthrough */
> > -   case nir_intrinsic_load_uniform:
> > -  /* @TODO: Not yet implemented */
> > +   case nir_intrinsic_load_uniform: {
> > +  int index = instr->const_index[0];
> > +  int uniform = nir_uniform_offset[index];
> 
> Again, I don't know that this indirection is really needed.  We should
> make nir_lower_io just do the right thing for you here.

Ok, I'll see what kind of changes we need to achieve that.

Iago

> > +  dest = get_nir_dest(instr->dest);
> > +
> > +  if (has_indirect) {
> > + /* Split addressing into uniform and offset */
> > + int offset = index - nir_uniform_driver_location[uniform];
> > + assert(offset >= 0);
> > +
> > + uniform -= offset;
> > + assert(uniform >= 0);
> > +
> > + src = src_reg(dst_reg(UNIFORM, uniform));
> > + src.reg_offset = offset;
> > + src_reg tmp = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_D);
> > + src.reladdr = new(mem_ctx) src_reg(tmp);
> > +  } else {
> > + src = src_reg(dst_reg(UNIFORM, uniform));
> > +  }
> > +
> > +  emit(MOV(dest, src));
> >break;
> > +   }
> >
> > case nir_intrinsic_atomic_counter_read:
> > case nir_intrinsic_atomic_counter_inc:
> > --
> > 2.1.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 19/78] nir/nir_lower_io: Add vec4 support

2015-07-02 Thread Iago Toral
On Tue, 2015-06-30 at 11:32 -0700, Jason Ekstrand wrote:
> I'm not sure what I think about adding an is_scalar flag vs. having
> _scalar and _vec4 versions of each function.  My feeling is that once
> we tweak assign_var_locations as I mentioned for vec4 outputs, we'll
> find that we want to have them separate.  The big thing here is that
> I'd rather have _vec4 and _scalar versions internally call the same
> function than have a general-looking function that switches inside on
> is_scalar and does two completely different things.  So if we can
> share code like this for all of them, is_scalar is totally fine.  If
> we start having divergence, different scalar/vec4 functions is
> probably better.
> --Jason

Ok, let's see how the changes to vec4 outputs affects this. If we end up
needing more changes than just the kind of type_size() function we need
to call then I'll look into having separate paths for scalar and vec4.

Iago

> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> > From: Iago Toral Quiroga 
> >
> > The current implementation operates in scalar mode only, so add a vec4
> > mode where types are padded to vec4 sizes.
> >
> > This will be useful in the i965 driver for its vec4 nir backend
> > (and possbly other drivers that have vec4-based shaders).
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580
> > ---
> >  src/glsl/nir/nir.h  | 18 
> >  src/glsl/nir/nir_lower_io.c | 87 
> > +
> >  src/mesa/drivers/dri/i965/brw_nir.c | 18 +---
> >  3 files changed, 90 insertions(+), 33 deletions(-)
> >
> > diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> > index 697d37e..334abbc 100644
> > --- a/src/glsl/nir/nir.h
> > +++ b/src/glsl/nir/nir.h
> > @@ -1640,14 +1640,16 @@ void nir_lower_global_vars_to_local(nir_shader 
> > *shader);
> >
> >  void nir_lower_locals_to_regs(nir_shader *shader);
> >
> > -void nir_assign_var_locations_scalar(struct exec_list *var_list,
> > - unsigned *size);
> > -void nir_assign_var_locations_scalar_direct_first(nir_shader *shader,
> > -  struct exec_list 
> > *var_list,
> > -  unsigned *direct_size,
> > -  unsigned *size);
> > -
> > -void nir_lower_io(nir_shader *shader);
> > +void nir_assign_var_locations(struct exec_list *var_list,
> > +  unsigned *size,
> > +  bool is_scalar);
> > +void nir_assign_var_locations_direct_first(nir_shader *shader,
> > +   struct exec_list *var_list,
> > +   unsigned *direct_size,
> > +   unsigned *size,
> > +   bool is_scalar);
> > +
> > +void nir_lower_io(nir_shader *shader, bool is_scalar);
> >
> >  void nir_lower_vars_to_ssa(nir_shader *shader);
> >
> > diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
> > index 6761d5b..3ecd386 100644
> > --- a/src/glsl/nir/nir_lower_io.c
> > +++ b/src/glsl/nir/nir_lower_io.c
> > @@ -29,19 +29,56 @@
> >  /*
> >   * This lowering pass converts references to input/output variables with
> >   * loads/stores to actual input/output intrinsics.
> > - *
> > - * NOTE: This pass really only works for scalar backends at the moment due
> > - * to the way it packes the input/output data.
> >   */
> >
> >  #include "nir.h"
> >
> >  struct lower_io_state {
> > void *mem_ctx;
> > +   bool is_scalar;
> >  };
> >
> > +static int
> > +type_size_vec4(const struct glsl_type *type)
> > +{
> > +   unsigned int i;
> > +   int size;
> > +
> > +   switch (glsl_get_base_type(type)) {
> > +   case GLSL_TYPE_UINT:
> > +   case GLSL_TYPE_INT:
> > +   case GLSL_TYPE_FLOAT:
> > +   case GLSL_TYPE_BOOL:
> > +  if (glsl_type_is_matrix(type)) {
> > + return glsl_get_matrix_columns(type);
> > +  } else {
> > + return 1;
> > +  }
> > +   case GLSL_TYPE_ARRAY:
> > +  return type_size_vec4(glsl_get_array_element(type)) * 
> > glsl_get_length(type);
> > +   case GLSL_TYPE_STRUCT:
> > +  size = 0;
> > +  for (i = 0; i <  glsl_get_length(type); i++) {
> > + size += type_size_vec4(glsl_get_struct_field(type, i));
> > +  }
> > +  return size;
> > +   case GLSL_TYPE_SAMPLER:
> > +  return 0;
> > +   case GLSL_TYPE_ATOMIC_UINT:
> > +  return 0;
> > +   case GLSL_TYPE_IMAGE:
> > +   case GLSL_TYPE_VOID:
> > +   case GLSL_TYPE_DOUBLE:
> > +   case GLSL_TYPE_ERROR:
> > +   case GLSL_TYPE_INTERFACE:
> > +  unreachable("not reached");
> > +   }
> > +
> > +   return 0;
> > +}
> > +
> >  static unsigned
> > -type_size(const struct glsl_type *type)
> > +type_size_scalar(const struct glsl_type *type)
> >  {
> > unsigned int size, i;
> >
> > @@ -52,11 +89,11 @@ type_size(cons

Re: [Mesa-dev] [Intel-gfx] [PATCH mesa v2] i965/gen8+: bo in state base address must be in 32-bit address range

2015-07-02 Thread Chris Wilson
On Wed, Jul 01, 2015 at 04:28:10PM +0100, Michel Thierry wrote:
> Gen8+ supports 48-bit virtual addresses, but some objects must always be
> allocated inside the 32-bit address range.
> 
> In specific, any resource used with flat/heapless (0x-0xf000)
> General State Heap or Intruction State Heap must be in a 32-bit range
> (GSH / ISH), because the General State Offset and Instruction State Offset
> are limited to 32-bits.
> 
> Use drm_intel_bo_emit_reloc_48bit when the 4GB limit is not necessary, and
> the bo can be in the full address space.
> 
> This commit introduces a dependency of libdrm 2.4.63, which introduces the
> drm_intel_bo_emit_reloc_48bit function.
> 
> v2: s/48baddress/48b_address/,
> Only use in OUT_RELOC64 cases, OUT_RELOC implies a 32-bit address offset
> is needed (Ben)
> 
> Cc: Ben Widawsky 
> Cc: mesa-dev@lists.freedesktop.org
> Signed-off-by: Michel Thierry 
> ---
>  configure.ac  |  2 +-
>  src/mesa/drivers/dri/i965/gen8_misc_state.c   | 23 +++
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c |  6 +++---
>  3 files changed, 19 insertions(+), 12 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index af61aa2..c92ca44 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -68,7 +68,7 @@ AC_SUBST([OSMESA_VERSION])
>  dnl Versions for external dependencies
>  LIBDRM_REQUIRED=2.4.38
>  LIBDRM_RADEON_REQUIRED=2.4.56
> -LIBDRM_INTEL_REQUIRED=2.4.60
> +LIBDRM_INTEL_REQUIRED=2.4.63
>  LIBDRM_NVVIEUX_REQUIRED=2.4.33
>  LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
>  LIBDRM_FREEDRENO_REQUIRED=2.4.57
> diff --git a/src/mesa/drivers/dri/i965/gen8_misc_state.c 
> b/src/mesa/drivers/dri/i965/gen8_misc_state.c
> index b20038e..5c8924d 100644
> --- a/src/mesa/drivers/dri/i965/gen8_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_misc_state.c
> @@ -28,6 +28,11 @@
>  
>  /**
>   * Define the base addresses which some state is referenced from.
> + *
> + * Use OUT_RELOC instead of OUT_RELOC64, because the General State
> + * Offset and Instruction State Offset are limited to 32-bits by
> + * hardware [and add OUT_BATCH(0) after each OUT_RELOC to complete
> + * the number of dwords needed for STATE_BASE_ADDRESS].
>   */
>  void gen8_upload_state_base_address(struct brw_context *brw)
>  {
> @@ -41,19 +46,21 @@ void gen8_upload_state_base_address(struct brw_context 
> *brw)
> OUT_BATCH(0);
> OUT_BATCH(mocs_wb << 16);
> /* Surface state base address: */
> -   OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> -   mocs_wb << 4 | 1);
> +   OUT_RELOC(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> + mocs_wb << 4 | 1);
> +   OUT_BATCH(0);
> /* Dynamic state base address: */
> -   OUT_RELOC64(brw->batch.bo,
> -   I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
> -   mocs_wb << 4 | 1);
> +   OUT_RELOC(brw->batch.bo,
> + I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
> + mocs_wb << 4 | 1);
> +   OUT_BATCH(0);

Note this is in general dangerous since it lies to the kernel about the
value written into the batch corresponding to the 64bit relocation.
(Aliasing a high object here isn't much of an issue since the relocation
will be forced by having to rebind the buffer low.)

Personally I would have stuck with the OUT_RELOC64 so that any future
cut'n'paste didn't have a subtle bug and that the wa was clearly
indicated by clearing the execobject flag for brw->batch.bo and
brw->cache.bo.

> /* Indirect object base address: MEDIA_OBJECT data */
> OUT_BATCH(mocs_wb << 4 | 1);
> OUT_BATCH(0);
> /* Instruction base address: shader kernels (incl. SIP) */
> -   OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
> -   mocs_wb << 4 | 1);
> -
> +   OUT_RELOC(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
> + mocs_wb << 4 | 1);
> +   OUT_BATCH(0);
> /* General state buffer size */
> OUT_BATCH(0xf001);
> /* Dynamic state buffer size */
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
> b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> index 54081a1..220a35b 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
> @@ -411,9 +411,9 @@ intel_batchbuffer_emit_reloc64(struct brw_context *brw,
> uint32_t read_domains, uint32_t write_domain,
>  uint32_t delta)
>  {
> -   int ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
> - buffer, delta,
> - read_domains, write_domain);
> +   int ret = drm_intel_bo_emit_reloc_48bit(brw->batch.bo, 4*brw->batch.used,
> +   buffer, delta,
> +   read_domains, write_domain);

I would have just exposed setting the flag on the execobject.  That way you
sti