Re: [Mesa-dev] [PATCH] libdrm/amdgpu: Use private fd for amdgpu_device and winsys hash table to fix ZaphodHeads.

2015-07-15 Thread Christian König

On 15.07.2015 07:15, Mario Kleiner wrote:

The amdgpu_device for a device node needs its own dup'ed fd, instead
of using the original fd passed in for a screen, to make multi-x-screen
ZaphodHeads configurations work on amdgpu.

The original fd's lifetime differs from that of the amdgpu_device, and from the
one stored in the hash. The hash key is the fd, and in order to compare hash
entries we fstat them, so the fd must be around for as long as the amdgpu_device
is.

This patch for libdrm/amdgpu is a translation of the radeon-winsys ZaphodHeads
fix for mesa's radeon-winsys, from mesa commit 
28dda47ae4d974e3e032d60e8e0965c8c068c6d8

"winsys/radeon: Use dup fd as key in drm-winsys hash table to fix ZaphodHeads."

Signed-off-by: Mario Kleiner 
Cc: Marek Olšák 


Mhm, we originally designed libdrm_amdgpu with the idea that we take 
ownership of the file descriptor.


That was obviously not implemented like this (we leak the descriptor on 
destruction), but IIRC there was a reason for doing it like this.


The patch is Reviewed-by: Christian König  for 
now, but I'm not 100% sure if that's the last word on the topic.


Regards,
Christian.


---
  amdgpu/amdgpu_device.c | 14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index b50ce26..1b0cd12 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -34,6 +34,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "xf86drm.h"

  #include "amdgpu_drm.h"
@@ -153,7 +154,7 @@ int amdgpu_device_initialize(int fd,
return r;
}
if ((flag_auth) && (!flag_authexist)) {
-   dev->flink_fd = fd;
+   dev->flink_fd = dup(fd);
}
*major_version = dev->major_version;
*minor_version = dev->minor_version;
@@ -183,8 +184,8 @@ int amdgpu_device_initialize(int fd,
goto cleanup;
}
  
-	dev->fd = fd;

-   dev->flink_fd = fd;
+   dev->fd = dup(fd);
+   dev->flink_fd = dev->fd;
dev->major_version = version->version_major;
dev->minor_version = version->version_minor;
drmFreeVersion(version);
@@ -213,12 +214,14 @@ int amdgpu_device_initialize(int fd,
*major_version = dev->major_version;
*minor_version = dev->minor_version;
*device_handle = dev;
-   util_hash_table_set(fd_tab, UINT_TO_PTR(fd), dev);
+   util_hash_table_set(fd_tab, UINT_TO_PTR(dev->fd), dev);
pthread_mutex_unlock(&fd_mutex);
  
  	return 0;
  
  cleanup:

+   if (dev->fd)
+   close(dev->fd);
free(dev);
pthread_mutex_unlock(&fd_mutex);
return r;
@@ -232,6 +235,9 @@ void amdgpu_device_free_internal(amdgpu_device_handle dev)
pthread_mutex_destroy(&dev->bo_table_mutex);
pthread_mutex_destroy(&(dev->vamgr.bo_va_mutex));
util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   close(dev->fd);
+   if (dev->fd != dev->flink_fd)
+   close(dev->flink_fd);
free(dev);
  }
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/10] i965: Reduce the scope of input in buffer tex setup

2015-07-15 Thread Pohjolainen, Topi
On Tue, Jul 14, 2015 at 04:48:19PM -0700, Ben Widawsky wrote:
> On Wed, Jul 01, 2015 at 02:46:32PM +0300, Topi Pohjolainen wrote:
> > Signed-off-by: Topi Pohjolainen 
> 
> I swear I am not trying to nitpick but I don't actually understand what your
> goal of the patch is. Could you maybe elaborate a bit on what "reduce the 
> scope
> of input in.."

That does reserve an explanation. Before this patch the call was given
the texture unit and the call resolved the texture object using it. Now,
the texture object is given directly and hence the scope of input arguments
can be thought to shrink as the call isn't aware of the set of texture
objects anymore.

I agree that the title is misleading and should be instead something on the
lines:

i965: Pass the tex object directly to surface setup

> 
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h   | 4 ++--
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 8 +++-
> >  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 +-
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c| 2 +-
> >  4 files changed, 7 insertions(+), 9 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> > b/src/mesa/drivers/dri/i965/brw_context.h
> > index ae29798..da018bf 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1710,8 +1710,8 @@ void brw_create_constant_surface(struct brw_context 
> > *brw,
> >   uint32_t size,
> >   uint32_t *out_offset,
> >   bool dword_pitch);
> > -void brw_update_buffer_texture_surface(struct gl_context *ctx,
> > -   unsigned unit,
> > +void brw_update_buffer_texture_surface(struct brw_context *brw,
> > +   struct gl_texture_object *tObj,
> > uint32_t *surf_offset);
> >  void
> >  brw_update_sol_surface(struct brw_context *brw,
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > index 72aad96..73aa719 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > @@ -276,12 +276,10 @@ gen4_emit_buffer_surface_state(struct brw_context 
> > *brw,
> >  }
> >  
> >  void
> > -brw_update_buffer_texture_surface(struct gl_context *ctx,
> > -  unsigned unit,
> > +brw_update_buffer_texture_surface(struct brw_context *brw,
> > +  struct gl_texture_object *tObj,
> >uint32_t *surf_offset)
> >  {
> > -   struct brw_context *brw = brw_context(ctx);
> > -   struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
> > struct intel_buffer_object *intel_obj =
> >intel_buffer_object(tObj->BufferObject);
> > uint32_t size = tObj->BufferSize;
> > @@ -323,7 +321,7 @@ brw_update_texture_surface(struct gl_context *ctx,
> >  
> > /* BRW_NEW_TEXTURE_BUFFER */
> > if (tObj->Target == GL_TEXTURE_BUFFER) {
> > -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> > +  brw_update_buffer_texture_surface(brw, tObj, surf_offset);
> >return;
> > }
> >  
> > diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
> > b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> > index 494bc22..6aa8299 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> > @@ -357,7 +357,7 @@ gen7_update_texture_surface(struct gl_context *ctx,
> > struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
> >  
> > if (obj->Target == GL_TEXTURE_BUFFER) {
> > -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> > +  brw_update_buffer_texture_surface(brw, obj, surf_offset);
> >  
> > } else {
> >struct intel_texture_object *intel_obj = intel_texture_object(obj);
> > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> > b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > index c595ec3..11defd1 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > @@ -308,7 +308,7 @@ gen8_update_texture_surface(struct gl_context *ctx,
> > struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
> >  
> > if (obj->Target == GL_TEXTURE_BUFFER) {
> > -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> > +  brw_update_buffer_texture_surface(brw, obj, surf_offset);
> >  
> > } else {
> >struct gl_texture_image *firstImage = obj->Image[0][obj->BaseLevel];
> > -- 
> > 1.9.3
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
_

Re: [Mesa-dev] [PATCH 01/13] mesa: Use floats for viewport bounds.

2015-07-15 Thread Mathias Fröhlich
Hi Matt,

On Monday, July 13, 2015 16:22:03 Matt Turner wrote:
> ARB_viewport_array specifies that DEPTH_RANGE consists of double-
> precision parameters (corresponding commit d4dc35987), and a preparatory
> commit (6340e609a) added _mesa_get_viewport_xform() which returned
> double-precision scale[3] and translate[3] vectors, even though X, Y,
> Width, and Height were still floats.
> 
> All users of _mesa_get_viewport_xform() immediately convert the double
> scale and translation vectors into floats (which were floats originally,
> but were converted to doubles in _mesa_get_viewport_xform(), sigh).
> 
> i965 at least cannot consume doubles (see SF_CLIP_VIEWPORT). If we want
> to pass doubles to hardware, we should have a different function that
> does that.
The aim was to route the precision provided with the depth range one stage
further down to the hardware. At the current time we do not support a single
piece of hardware that takes doubles at that point, right?

There is a comment inline below.

I am not sure if I am in the position to review, but at least you can get my
Acked-by: Mathias Froehlich 

Mathias


> ---
>  src/mesa/drivers/dri/i915/i915_state.c  |  2 +-
>  src/mesa/drivers/dri/i965/brw_sf_state.c|  2 +-
>  src/mesa/drivers/dri/i965/gen6_viewport_state.c |  2 +-
>  src/mesa/drivers/dri/i965/gen7_viewport_state.c |  2 +-
>  src/mesa/drivers/dri/i965/gen8_viewport_state.c |  2 +-
>  src/mesa/drivers/dri/r200/r200_state.c  |  2 +-
>  src/mesa/drivers/dri/radeon/radeon_state.c  |  2 +-
>  src/mesa/main/viewport.c| 14 +++---
>  src/mesa/main/viewport.h|  2 +-
>  src/mesa/math/m_matrix.c|  4 ++--
>  src/mesa/math/m_matrix.h|  4 ++--
>  src/mesa/state_tracker/st_atom_viewport.c   |  2 +-
>  src/mesa/tnl/t_context.c|  2 +-
>  src/mesa/tnl/t_rasterpos.c  |  2 +-
>  14 files changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i915/i915_state.c 
> b/src/mesa/drivers/dri/i915/i915_state.c
> index 5f10b84..4c83073 100644
> --- a/src/mesa/drivers/dri/i915/i915_state.c
> +++ b/src/mesa/drivers/dri/i915/i915_state.c
> @@ -402,7 +402,7 @@ void
>  intelCalcViewport(struct gl_context * ctx)
>  {
> struct intel_context *intel = intel_context(ctx);
> -   double scale[3], translate[3];
> +   float scale[3], translate[3];
>  
> _mesa_get_viewport_xform(ctx, 0, scale, translate);
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c 
> b/src/mesa/drivers/dri/i965/brw_sf_state.c
> index 5d98922..3be6e4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
> @@ -45,7 +45,7 @@ static void upload_sf_vp(struct brw_context *brw)
> struct gl_context *ctx = &brw->ctx;
> struct brw_sf_viewport *sfv;
> GLfloat y_scale, y_bias;
> -   double scale[3], translate[3];
> +   float scale[3], translate[3];
> const bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
>  
> sfv = brw_state_batch(brw, AUB_TRACE_SF_VP_STATE,
> diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c 
> b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> index 7c8d884..11b9a36 100644
> --- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> @@ -101,7 +101,7 @@ gen6_upload_sf_vp(struct brw_context *brw)
> }
>  
> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
> -  double scale[3], translate[3];
> +  float scale[3], translate[3];
>  
>/* _NEW_VIEWPORT */
>_mesa_get_viewport_xform(ctx, i, scale, translate);
> diff --git a/src/mesa/drivers/dri/i965/gen7_viewport_state.c 
> b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
> index b655205..c75dc99 100644
> --- a/src/mesa/drivers/dri/i965/gen7_viewport_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
> @@ -53,7 +53,7 @@ gen7_upload_sf_clip_viewport(struct brw_context *brw)
> }
>  
> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
> -  double scale[3], translate[3];
> +  float scale[3], translate[3];
>_mesa_get_viewport_xform(ctx, i, scale, translate);
>  
>/* According to the "Vertex X,Y Clamping and Quantization" section of
> diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c 
> b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
> index 2d8eeb1..2692ad5 100644
> --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
> @@ -53,7 +53,7 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
> }
>  
> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
> -  double scale[3], translate[3];
> +  float scale[3], translate[3];
>_mesa_get_viewport_xform(ctx, i, scale, translate);
>  
>/* _NEW_VIEWPORT: Viewport Matrix Elements */
> diff --git a/src/mesa/drivers/dri/r20

[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2015-07-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #36 from Michel Dänzer  ---
(In reply to Heiko from comment #35)
> Iirc from my last post, radeon_draw_buffer() does re-evaluate window sizes.
> That's probably why the reverted commit makes things work again.

Ken, do you think it's plausible that your commit causes problems with window
resizes vs. glViewport calls?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] libdrm/amdgpu: Use private fd for amdgpu_device and winsys hash table to fix ZaphodHeads.

2015-07-15 Thread Marek Olšák
Acked-by: Marek Olšák 

Marek

On Wed, Jul 15, 2015 at 7:15 AM, Mario Kleiner
 wrote:
> The amdgpu_device for a device node needs its own dup'ed fd, instead
> of using the original fd passed in for a screen, to make multi-x-screen
> ZaphodHeads configurations work on amdgpu.
>
> The original fd's lifetime differs from that of the amdgpu_device, and from 
> the
> one stored in the hash. The hash key is the fd, and in order to compare hash
> entries we fstat them, so the fd must be around for as long as the 
> amdgpu_device
> is.
>
> This patch for libdrm/amdgpu is a translation of the radeon-winsys ZaphodHeads
> fix for mesa's radeon-winsys, from mesa commit 
> 28dda47ae4d974e3e032d60e8e0965c8c068c6d8
>
> "winsys/radeon: Use dup fd as key in drm-winsys hash table to fix 
> ZaphodHeads."
>
> Signed-off-by: Mario Kleiner 
> Cc: Marek Olšák 
> ---
>  amdgpu/amdgpu_device.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
> index b50ce26..1b0cd12 100644
> --- a/amdgpu/amdgpu_device.c
> +++ b/amdgpu/amdgpu_device.c
> @@ -34,6 +34,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "xf86drm.h"
>  #include "amdgpu_drm.h"
> @@ -153,7 +154,7 @@ int amdgpu_device_initialize(int fd,
> return r;
> }
> if ((flag_auth) && (!flag_authexist)) {
> -   dev->flink_fd = fd;
> +   dev->flink_fd = dup(fd);
> }
> *major_version = dev->major_version;
> *minor_version = dev->minor_version;
> @@ -183,8 +184,8 @@ int amdgpu_device_initialize(int fd,
> goto cleanup;
> }
>
> -   dev->fd = fd;
> -   dev->flink_fd = fd;
> +   dev->fd = dup(fd);
> +   dev->flink_fd = dev->fd;
> dev->major_version = version->version_major;
> dev->minor_version = version->version_minor;
> drmFreeVersion(version);
> @@ -213,12 +214,14 @@ int amdgpu_device_initialize(int fd,
> *major_version = dev->major_version;
> *minor_version = dev->minor_version;
> *device_handle = dev;
> -   util_hash_table_set(fd_tab, UINT_TO_PTR(fd), dev);
> +   util_hash_table_set(fd_tab, UINT_TO_PTR(dev->fd), dev);
> pthread_mutex_unlock(&fd_mutex);
>
> return 0;
>
>  cleanup:
> +   if (dev->fd)
> +   close(dev->fd);
> free(dev);
> pthread_mutex_unlock(&fd_mutex);
> return r;
> @@ -232,6 +235,9 @@ void amdgpu_device_free_internal(amdgpu_device_handle dev)
> pthread_mutex_destroy(&dev->bo_table_mutex);
> pthread_mutex_destroy(&(dev->vamgr.bo_va_mutex));
> util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
> +   close(dev->fd);
> +   if (dev->fd != dev->flink_fd)
> +   close(dev->flink_fd);
> free(dev);
>  }
>
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] i965/fs: fix stride and type for hw_reg's in regs_read()

2015-07-15 Thread Francisco Jerez
Connor Abbott  writes:

> On Tue, Jul 14, 2015 at 6:02 AM, Francisco Jerez  
> wrote:
>> Connor Abbott  writes:
>>
>>> sources with file == HW_REG get all their information from the
>>> fixed_hw_reg field, so we need to get the stride and type from there
>>> when computing the size.
>>>
>>> Signed-off-by: Connor Abbott 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 24 ++--
>>>  1 file changed, 18 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> index 38b9095..64f093b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> @@ -696,24 +696,36 @@ fs_inst::regs_read(int arg) const
>>>break;
>>> }
>>>
>>> +   unsigned stride;
>>> +   enum brw_reg_type type;
>>> +
>>> switch (src[arg].file) {
>>> case BAD_FILE:
>>> case UNIFORM:
>>> case IMM:
>>>return 1;
>>> +
>>> case GRF:
>>> +  stride = src[arg].stride;
>>> +  type = src[arg].type;
>>> +  break;
>>> +
>>> case HW_REG:
>>> -  if (src[arg].stride == 0) {
>>> - return 1;
>>> -  } else {
>>> - int size = components * this->exec_size * type_sz(src[arg].type);
>>> - return DIV_ROUND_UP(size * src[arg].stride, 32);
>>> -  }
>>> +  stride = src[arg].fixed_hw_reg.hstride;
>>> +  type = src[arg].fixed_hw_reg.type;
>>> +  break;
>>> +
>>> case MRF:
>>>unreachable("MRF registers are not allowed as sources");
>>> default:
>>>unreachable("Invalid register file");
>>> }
>>> +
>>> +   if (stride == 0)
>>> +  return 1;
>>> +
>>> +   int size = components * this->exec_size * type_sz(type);
>>> +   return DIV_ROUND_UP(size * stride, 32);
>>
>> I don't think this will work unfortunately, brw_reg::hstride is the log2
>> of the actual stride, unlike fs_reg::stride.  Did I already mention I'm
>> appalled by the fact that fs_reg has a number of fields with overlapping
>> semantics but different representation, one or the other being
>> applicable depending on the occasion.  I guess it would be more or less
>> bearable if these data members were declared private and some reasonable
>> abstraction was provided to access them.
>
> I don't think anybody's happy with it, but refactoring that is it's
> own can of worms.
>

Sure, it would be a pile of work, but I think it should be quite
straightforward in principle.  We could just punt fixed_hw_reg and
replace it with an ARF file and a fixed-GRF file using the same fields
normal regististers use.  For immediates we'd have to add to add a union
with float/unsigned/int fields similar to brw_reg::dw1.

>>
>> How do you like the attached patch?  It doesn't solve the fundamental
>> problem but it seems to improve the situation slightly.
>
> It seems fine to me... I was more paranoid about getting the type from
> the fixed_hw_reg too, but brw_reg_from_fs_reg() in the generator we
> have:
>
> assert(reg->type == reg->fixed_hw_reg.type);
>
> so it seems my paranoia wasn't justified. I'd like someone else who's
> more experienced to take a look though. I suspect that others might
> want to bikeshed about the name, but I don't have a better suggestion.
>

Yeah, your paranoia was definitely justified, it's essentially the same
problem.  It actually led to actual bugs in the past which is why I
added that assertion and changed retype() to keep the type of the
fixed_hw_reg in sync with the normal type...

>>
>>>  }
>>>
>>>  bool
>>> --
>>> 2.4.3
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] osmesa: fix OSMesaPixelsStore typo

2015-07-15 Thread Brian Paul
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91337
Cc: 10.6 
---
 src/gallium/state_trackers/osmesa/osmesa.c | 2 +-
 src/mesa/drivers/osmesa/osmesa.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/osmesa/osmesa.c 
b/src/gallium/state_trackers/osmesa/osmesa.c
index 2d5d096..96b8358 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -886,7 +886,7 @@ static struct name_function functions[] = {
{ "OSMesaDestroyContext", (OSMESAproc) OSMesaDestroyContext },
{ "OSMesaMakeCurrent", (OSMESAproc) OSMesaMakeCurrent },
{ "OSMesaGetCurrentContext", (OSMESAproc) OSMesaGetCurrentContext },
-   { "OSMesaPixelsStore", (OSMESAproc) OSMesaPixelStore },
+   { "OSMesaPixelStore", (OSMESAproc) OSMesaPixelStore },
{ "OSMesaGetIntegerv", (OSMESAproc) OSMesaGetIntegerv },
{ "OSMesaGetDepthBuffer", (OSMESAproc) OSMesaGetDepthBuffer },
{ "OSMesaGetColorBuffer", (OSMESAproc) OSMesaGetColorBuffer },
diff --git a/src/mesa/drivers/osmesa/osmesa.c b/src/mesa/drivers/osmesa/osmesa.c
index 022523e..5c7dcac 100644
--- a/src/mesa/drivers/osmesa/osmesa.c
+++ b/src/mesa/drivers/osmesa/osmesa.c
@@ -1124,7 +1124,7 @@ static struct name_function functions[] = {
{ "OSMesaDestroyContext", (OSMESAproc) OSMesaDestroyContext },
{ "OSMesaMakeCurrent", (OSMESAproc) OSMesaMakeCurrent },
{ "OSMesaGetCurrentContext", (OSMESAproc) OSMesaGetCurrentContext },
-   { "OSMesaPixelsStore", (OSMESAproc) OSMesaPixelStore },
+   { "OSMesaPixelStore", (OSMESAproc) OSMesaPixelStore },
{ "OSMesaGetIntegerv", (OSMESAproc) OSMesaGetIntegerv },
{ "OSMesaGetDepthBuffer", (OSMESAproc) OSMesaGetDepthBuffer },
{ "OSMesaGetColorBuffer", (OSMESAproc) OSMesaGetColorBuffer },
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91337] OSMesaGetProcAdress("OSMesaPixelStore") returns nil

2015-07-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91337

--- Comment #1 from Brian Paul  ---
Thanks. Patch posted to mesa-dev for review.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Seeking advice speeding up glsl optimisation passes for AoA

2015-07-15 Thread Timothy Arceri
On Wed, 2015-07-15 at 16:42 +1000, Timothy Arceri wrote:
> Hi guys,
> 
> As I've mentioned a couple of times in previous patches some of the cts AoA
> tests are taking very long time to compile. This is due to excessive
> optimisation passes mainly in the glsl optimisations (there are some 
> slowdowns
> in the intel backend too but these seemed to go away when I tried the new 
> nir
> vec4 backend).
> 
> I fixed part of the problem with this patch to do the dead code elimination 
> in
> a single pass [1]. 
> These excessive passes exist in normal shaders but its generally not an 
> issue
> as the number of passes is generally quite low, and inexpensive. However 
> when
> you have an 8 dimensional array constantly walking this becomes quite
> expensive.
> 
> The remaining issue I'm seeking some advice for is with constant
> propagation/folding.
> 
> It seems for interators used in loops you can get into a situation where an
> optimisation pass is needed for each loop iteration in order to make all
> values of the iterator constant.
> 
> I didn't have look too find some real world examples of this in the public
> shader-db. For example here is it happening for a Unity shader:
> 
> -- Linker Pass 1 --
> opt_function_inlining progress
> opt_dead_functions progress
> opt_copy_propagation_elements progress
> opt_tree_grafting progress
> opt_constant_folding progress
> opt_algebraic progress
> -- Linker Pass 1 --
> constant prop : l_3, Value : 0
> constant prop : l_3, Value : 0
> constant prop : l_3, Value : 0
> constant prop : l_3, Value : 0
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 2 --
> constant prop : l_3, Value : 1
> constant prop : l_3, Value : 1
> constant prop : l_3, Value : 1
> constant prop : l_3, Value : 1
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 3 --
> constant prop : l_3, Value : 2
> constant prop : l_3, Value : 2
> constant prop : l_3, Value : 2
> constant prop : l_3, Value : 2
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 4 --
> constant prop : l_3, Value : 3
> constant prop : l_3, Value : 3
> constant prop : l_3, Value : 3
> constant prop : l_3, Value : 3
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 5 --
> constant prop : l_3, Value : 4
> constant prop : l_3, Value : 4
> constant prop : l_3, Value : 4
> constant prop : l_3, Value : 4
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 6 --
> constant prop : l_3, Value : 5
> constant prop : l_3, Value : 5
> constant prop : l_3, Value : 5
> constant prop : l_3, Value : 5
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 7 --
> constant prop : l_3, Value : 6
> constant prop : l_3, Value : 6
> constant prop : l_3, Value : 6
> constant prop : l_3, Value : 6
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 8 --
> constant prop : l_3, Value : 7
> constant prop : l_3, Value : 7
> constant prop : l_3, Value : 7
> constant prop : l_3, Value : 7
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 9 --
> constant prop : l_3, Value : 8
> constant prop : l_3, Value : 8
> constant prop : l_3, Value : 8
> constant prop : l_3, Value : 8
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 10 --
> constant prop : l_3, Value : 9
> constant prop : l_3, Value : 9
> constant prop : l_3, Value : 9
> constant prop : l_3, Value : 9
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 11 --
> constant prop : l_3, Value : 10
> constant prop : l_3, Value : 10
> constant prop : l_3, Value : 10
> constant prop : l_3, Value : 10
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 12 --
> constant prop : l_3, Value : 11
> constant prop : l_3, Value : 11
> constant prop : l_3, Value : 11
> constant prop : l_3, Value : 11
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 13 --
> constant prop : l_3, Value : 12
> constant prop : l_3, Value : 12
> constant prop : l_3, Value : 12
> constant prop : l_3, Value : 12
> opt_constant_propagation progress
> opt_constant_folding progress
> -- Linker Pass 14 --
> constant prop : l_3, Value : 13
> constant prop : l_3, Value : 

Re: [Mesa-dev] [Mesa-stable] [PATCH] osmesa: fix OSMesaPixelsStore typo

2015-07-15 Thread Emil Velikov
On 15 July 2015 at 13:23, Brian Paul  wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91337
> Cc: 10.6 
Seems like it was broken since day 1 with commit 01dc182ee86(added
OSMesaGetProcAddress())

Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] Match swrast modes more loosely.

2015-07-15 Thread Jose Fonseca
From: Tom Hughes 

https://bugs.freedesktop.org/show_bug.cgi?id=90817

Signed-off-by: Jose Fonseca 
---
 src/glx/dri_common.c | 59 +++-
 1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 63c8de3..1a62ee2 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -266,6 +266,36 @@ scalarEqual(struct glx_config *mode, unsigned int attrib, 
unsigned int value)
 }
 
 static int
+scalarGreaterEqual(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue >= value;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute greater than or 
equal to value? */
+}
+
+static int
+booleanSupported(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute supported? */
+}
+
+static int
 driConfigEqual(const __DRIcoreExtension *core,
struct glx_config *config, const __DRIconfig *driConfig)
 {
@@ -313,10 +343,37 @@ driConfigEqual(const __DRIcoreExtension *core,
  if (value & __DRI_ATTRIB_TEXTURE_RECTANGLE_BIT)
 glxValue |= GLX_TEXTURE_RECTANGLE_BIT_EXT;
  if (config->bindToTextureTargets != GLX_DONT_CARE &&
- glxValue != config->bindToTextureTargets)
+ glxValue != (config->bindToTextureTargets & glxValue))
+return GL_FALSE;
+ break;
+
+  case __DRI_ATTRIB_STENCIL_SIZE:
+  case __DRI_ATTRIB_ACCUM_RED_SIZE:
+  case __DRI_ATTRIB_ACCUM_GREEN_SIZE:
+  case __DRI_ATTRIB_ACCUM_BLUE_SIZE:
+  case __DRI_ATTRIB_ACCUM_ALPHA_SIZE:
+ if (value != 0 && !scalarEqual(config, attrib, value))
 return GL_FALSE;
  break;
 
+  case __DRI_ATTRIB_DOUBLE_BUFFER:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGB:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGBA:
+  case __DRI_ATTRIB_BIND_TO_MIPMAP_TEXTURE:
+  case __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE:
+  if (value && !booleanSupported(config, attrib, value))
+return GL_FALSE;
+  break;
+
+  case __DRI_ATTRIB_SAMPLE_BUFFERS:
+  case __DRI_ATTRIB_SAMPLES:
+  case __DRI_ATTRIB_AUX_BUFFERS:
+  case __DRI_ATTRIB_MAX_PBUFFER_WIDTH:
+  case __DRI_ATTRIB_MAX_PBUFFER_HEIGHT:
+  case __DRI_ATTRIB_MAX_PBUFFER_PIXELS:
+ if (!scalarGreaterEqual(config, attrib, value))
+return GL_FALSE;
+
   default:
  if (!scalarEqual(config, attrib, value))
 return GL_FALSE;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91337] OSMesaGetProcAdress("OSMesaPixelStore") returns nil

2015-07-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91337

Brian Paul  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Brian Paul  ---
Fixed with commit 141e1eb29fe80ad341e718147a1277cc3b1b9c11

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Register spilling issues in the NIR->vec4 backend

2015-07-15 Thread Iago Toral
Hi,

when we sent the patches for the new nir->vec4 backend we mentioned that
we had a few dEQP tests that would fail to link because of register
spilling. Now that we have added GS support we see a few instances of
this problem popping up in a few GS piglit tests too, for example this
one:

tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd.shader_test

I have been looking into what is going on with these tests and I came to
the conclusion that the problem is a consequence of various factors, but
probably the main thing contributing to it is the way our SSA pass
works. That said, I am not that experienced with NIR, so it could also
be that my analysis is missing something and I am just arriving to wrong
conclusions, so I'll explain my thoughts below and hopefully someone
else with more NIR experience can jump in and confirm or reject my
analysis.
 
The GS code in that test looks like this:

for (int p = 0; p < 3; p++) {
   color = ((index >= ins[p].m1.length() ?  
ins[p].m2[index-ins[p].m1.length()] :
ins[p].m1[index]) == expect) ?
   vec4(0.0, 1.0, 0.0, 1.0) : vec4(1.0, 0.0, 0.0, 1.0);
   gl_Position = gl_in[p].gl_Position;
   EmitVertex();
}

One thing that is immediately contributing to the register pressure is
some really awful code generated because of the indirect array indexing
on the inputs inside the loop. This is because of the
lower_variable_index_to_cond_assign lowering pass called from
brw_shader.cpp. This pass will convert that color assignment into a
bunch of nested if/else statements which makes the generated GLSL IR
code rather large, involving plenty of temporaries too. This is only
made worse by the fact that loop unrolling will replicate that 3 times.
The result is a huge pile of GLSL IR with a few dozens of nested if/else
statements and temporaries that looks like [1] (that is only a fragment
of the GLSL IR).

One thing that is particularly relevant in that code is that it has
multiple conditional assignments to the same variable
(dereference_array_value) as a consequence of this lowering pass.

That much, however, is common to the NIR and non-NIR paths. The problem
in the NIR case is that all these assignments generate new SSA values,
which then become new registers in the final NIR form. This leads to NIR
code like [2].  In contrast, the old vec4 visitor path, is able to have
writes to the same variable write to the same register.

As a result, if I print the code right before register allocation in the
NIR path [3] and I compare that to what we get with the old vec4 visitor
path at that same point [4], it is clearly visible that this difference
is allowing the vec4 visitor path to reduce register pressure (see how
in [4] we have multiple writes to vgrf5, while in [3] we always write to
a new vgrf every time).

So, am I missing something or is this kind of result expected with NIR
programs? Is there anything in the nir->vec4 pass that we can do to fix
this or does this need to be fixed when going out of SSA moe inside NIR?

Iago

[1] http://pastebin.com/5uA8ex2S
[2] http://pastebin.com/pqLfvAVN
[3] http://pastebin.com/64nSuUH8
[4] http://pastebin.com/WCrdYxzt

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: include stdarg.h for va_list

2015-07-15 Thread Jonathan Gray
Include stdarg.h for va_list.  Unbreaks the build on OpenBSD:

In file included from mesa/program/dummy_errors.c:24:
../src/mesa/main/errors.h:85: error: expected declaration specifiers or '...' be
fore 'va_list'

Signed-off-by: Jonathan Gray 
---
 src/mesa/main/errors.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/errors.h b/src/mesa/main/errors.h
index 24f234f..81e47a8 100644
--- a/src/mesa/main/errors.h
+++ b/src/mesa/main/errors.h
@@ -37,6 +37,7 @@
 
 
 #include 
+#include 
 #include "compiler.h"
 #include "glheader.h"
 #include "mtypes.h"
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: include stdarg.h for va_list

2015-07-15 Thread Matt Turner
Acked-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-15 Thread Matt Turner
On Tue, Jul 14, 2015 at 11:45 AM, Eric Anholt  wrote:
> These are really useful hints to the compiler in the absence of link-time
> optimization, and I'm going to use them in VC4.
>
> I've made the const attribute be ATTRIBUTE_CONST unlike other function
> attributes, because we have other things in the tree #defining CONST for
> their own unrelated purposes.
> ---
>  configure.ac  |  2 ++
>  src/util/macros.h | 20 
>  2 files changed, 22 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index bdfd134..38ad398 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -210,6 +210,8 @@ AX_GCC_FUNC_ATTRIBUTE([format])
>  AX_GCC_FUNC_ATTRIBUTE([malloc])
>  AX_GCC_FUNC_ATTRIBUTE([packed])
>  AX_GCC_FUNC_ATTRIBUTE([unused])
> +AX_GCC_FUNC_ATTRIBUTE([const])
> +AX_GCC_FUNC_ATTRIBUTE([pure])
>  AX_GCC_FUNC_ATTRIBUTE([warn_unused_result])
>
>  AM_CONDITIONAL([GEN_ASM_OFFSETS], test "x$GEN_ASM_OFFSETS" = xyes)
> diff --git a/src/util/macros.h b/src/util/macros.h
> index 66698e7..4d16183 100644
> --- a/src/util/macros.h
> +++ b/src/util/macros.h
> @@ -130,6 +130,26 @@ do {   \
>  #define PACKED
>  #endif
>
> +/* Attribute pure is used for functions that have no effects other than their
> + * return value.  As a result, calls to it can be dead code eliminated.
> + */
> +#ifdef HAVE_FUNC_ATTRIBUTE_PURE
> +#define PURE __attribute__((__pure__))
> +#else
> +#define PURE
> +#endif
> +
> +/* Attribute const is used for functions that have no effects other than 
> their
> + * return value, and only rely on the argument values to compute the return
> + * value.  As a result, calls to it can be CSEed.  Note that using memory
> + * pointed to by the arguments is not allowed for const functions.
> + */
> +#ifdef HAVE_FUNC_ATTRIBUTE_CONST
> +#define ATTRIBUTE_CONST __attribute__((__const__))
> +#else
> +#define ATTRIBUTE_CONST
> +#endif
> +
>  #ifdef __cplusplus
>  /**
>   * Macro function that evaluates to true if T is a trivially
> --
> 2.1.4

Both of these hunks are alphabetized.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] egl/dri2: Close file descriptor on error.

2015-07-15 Thread Matt Turner
---
 src/egl/drivers/dri2/platform_drm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index 0d1f4c6..a8c5401 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -619,18 +619,22 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp)
  fd = loader_open_device("/dev/dri/card0");
   dri2_dpy->own_device = 1;
   gbm = gbm_create_device(fd);
-  if (gbm == NULL)
+  if (gbm == NULL) {
+ close(fd);
  return EGL_FALSE;
+  }
}
 
if (strcmp(gbm_device_get_backend_name(gbm), "drm") != 0) {
   free(dri2_dpy);
+  close(fd);
   return EGL_FALSE;
}
 
dri2_dpy->gbm_dri = gbm_dri_device(gbm);
if (dri2_dpy->gbm_dri->base.type != GBM_DRM_DRIVER_TYPE_DRI) {
   free(dri2_dpy);
+  close(fd);
   return EGL_FALSE;
}
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] i965: Mark default case of type_sz() unreachable.

2015-07-15 Thread Matt Turner
Otherwise Coverity thinks we'll divide by zero.
---
 src/mesa/drivers/dri/i965/brw_reg.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_reg.h 
b/src/mesa/drivers/dri/i965/brw_reg.h
index c8b1341..f96b28d 100644
--- a/src/mesa/drivers/dri/i965/brw_reg.h
+++ b/src/mesa/drivers/dri/i965/brw_reg.h
@@ -283,7 +283,7 @@ type_sz(unsigned type)
case BRW_REGISTER_TYPE_B:
   return 1;
default:
-  return 0;
+  unreachable("not reached");
}
 }
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] i965/cfg: Assert that cur_do/while/if pointers are non-NULL.

2015-07-15 Thread Matt Turner
More.. like in commit 4d93a07c.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index f1f230e..91d53ef 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -208,6 +208,7 @@ cfg_t::cfg_t(exec_list *instructions)
  cur_else = cur;
 
 next = new_block();
+ assert(cur_if != NULL);
 cur_if->add_successor(mem_ctx, next);
 
 set_next_block(&cur, next, ip);
@@ -274,6 +275,7 @@ cfg_t::cfg_t(exec_list *instructions)
  inst->exec_node::remove();
  cur->instructions.push_tail(inst);
 
+ assert(cur_do != NULL);
 cur->add_successor(mem_ctx, cur_do);
 
 next = new_block();
@@ -287,6 +289,7 @@ cfg_t::cfg_t(exec_list *instructions)
  inst->exec_node::remove();
  cur->instructions.push_tail(inst);
 
+ assert(cur_while != NULL);
 cur->add_successor(mem_ctx, cur_while);
 
 next = new_block();
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Add a value range propagation pass

2015-07-15 Thread Matt Turner
On Tue, Jul 14, 2015 at 4:29 PM, Thomas Helland
 wrote:
> Signed-off-by: Thomas Helland 
> ---
>  src/glsl/Makefile.sources  |1 +
>  src/glsl/nir/nir.h |2 +
>  src/glsl/nir/nir_opt_value_range.c | 1330 
> 
>  3 files changed, 1333 insertions(+)
>  create mode 100644 src/glsl/nir/nir_opt_value_range.c
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index b938f1e..720ff70 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -56,6 +56,7 @@ NIR_FILES = \
> nir/nir_opt_peephole_ffma.c \
> nir/nir_opt_peephole_select.c \
> nir/nir_opt_remove_phis.c \
> +   nir/nir_opt_value_range.c \
> nir/nir_print.c \
> nir/nir_remove_dead_variables.c \
> nir/nir_search.c \
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 6efbfbd..44dd015 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1693,6 +1693,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);
>
>  bool nir_opt_remove_phis(nir_shader *shader);
>
> +bool nir_opt_value_range(nir_shader *shader);
> +
>  void nir_sweep(nir_shader *shader);
>
>  #ifdef __cplusplus
> diff --git a/src/glsl/nir/nir_opt_value_range.c 
> b/src/glsl/nir/nir_opt_value_range.c
> new file mode 100644
> index 000..1e6ff0e
> --- /dev/null
> +++ b/src/glsl/nir/nir_opt_value_range.c
> @@ -0,0 +1,1330 @@
> +/*
> + * Copyright © 2014 Thomas Helland

Presumably 2015.

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + *
> + *
> + */
> +
> +#include "nir.h"
> +#include "nir_ssa_def_worklist.h"
> +#include "nir_block_worklist.h"
> +#include "nir_constant_expressions.h"
> +
> +/* This pass implements an extension of
> + * "Constant Propagation with Conditional Branches" by Wegman and Zadeck
> + * that also handles value ranges. This is useful as a lot of shaders have
> + * min/max expressions that can be eliminated, or conditionals that we can
> + * prove to be false or true due to previously applied restrictions on range.
> + * Value range propagation is a superset of tracking constant values,
> + * and due to that this pass eliminates the need for a separate constant
> + * propagation pass. This pass is optimistic, meaning we assume all variables
> + * are constant (or have restricted range) and disprove it.
> + * A pessimistic algorithm would assume all values where undeterminable,
> + * and then propagate expressions we know to be constant through the program.
> + * An optimistic algorithm gets better results than a pessimistic, with the
> + * downside being that it can not be aborted midway through the pass as the
> + * results gathered may be wrong (based on wrong assumptions).

Line wrap this block better -- highlight in vim with shift+v, then gq

> + *
> + * The lattice types are:
> + * undefined: Variable may be constant or range-restricted (not yet 
> processed)

Does "not yet processed" mean "not yet implemented"?

I remember there were some concerns about what to do about undefined
values in the GLSL IR implementation of this pass.

> + * constant: Value is determined to be constant
> + * range: Value is determined to be range-restricted
> + * overdefined: We cannot guarantee the value is constant or range-restricted
> + *
> + * We extend the lattice so that constant entries are changed to inclusive
> + * ranges for each vector component. The join rules are:
> + *
> + * undefined join undefined = undefined
> + * undefined join overdefined = overdefined
> + * overdefined join overdefined = overdefined
> + * [low, high] join overdefined = overdefined
> + * [low, high] join undefined = [low, high]
> + * [low1, high1] join [low2, high2] = [min(low1, low2), max(high1, high2)]
> + *
> + * These rules are general pessimistic rules. There may situations where we
> + * can still determine parts of the

[Mesa-dev] [PATCH] i965: Fix comment about DRM_IOCTL_I915_GEM_WAIT.

2015-07-15 Thread Kenneth Graunke
From: Chris Wilson 

The kernel actually waits forever when supplied a timeout value < 0,
rather than returning immediately.  See i915_gem_wait_ioctl() in
i915_gem.c's call to __i915_wait_request().

(split by Ken from a large patch authored by Chris Wilson)

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_syncobj.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
b/src/mesa/drivers/dri/i965/intel_syncobj.c
index c44c4be..c2f4fa9 100644
--- a/src/mesa/drivers/dri/i965/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
@@ -105,9 +105,9 @@ brw_fence_client_wait(struct brw_context *brw, struct 
brw_fence *fence,
assert(fence->batch_bo);
 
/* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
-* immediately for timeouts <= 0.  The best we can do is to clamp the
-* timeout to INT64_MAX.  This limits the maximum timeout from 584 years to
-* 292 years - likely not a big deal.
+* immediately for timeout == 0, and indefinitely if timeout is negative.
+* The best we can do is to clamp the timeout to INT64_MAX.  This limits
+* the maximum timeout from 584 years to 292 years - likely not a big deal.
 */
if (timeout > INT64_MAX)
   timeout = INT64_MAX;
-- 
2.4.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] i965/fs: fix stride and type for hw_reg's in regs_read()

2015-07-15 Thread Jason Ekstrand
On Wed, Jul 15, 2015 at 4:31 AM, Francisco Jerez  wrote:
> Connor Abbott  writes:
>
>> On Tue, Jul 14, 2015 at 6:02 AM, Francisco Jerez  
>> wrote:
>>> Connor Abbott  writes:
>>>
 sources with file == HW_REG get all their information from the
 fixed_hw_reg field, so we need to get the stride and type from there
 when computing the size.

 Signed-off-by: Connor Abbott 
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 24 ++--
  1 file changed, 18 insertions(+), 6 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 38b9095..64f093b 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -696,24 +696,36 @@ fs_inst::regs_read(int arg) const
break;
 }

 +   unsigned stride;
 +   enum brw_reg_type type;
 +
 switch (src[arg].file) {
 case BAD_FILE:
 case UNIFORM:
 case IMM:
return 1;
 +
 case GRF:
 +  stride = src[arg].stride;
 +  type = src[arg].type;
 +  break;
 +
 case HW_REG:
 -  if (src[arg].stride == 0) {
 - return 1;
 -  } else {
 - int size = components * this->exec_size * type_sz(src[arg].type);
 - return DIV_ROUND_UP(size * src[arg].stride, 32);
 -  }
 +  stride = src[arg].fixed_hw_reg.hstride;
 +  type = src[arg].fixed_hw_reg.type;
 +  break;
 +
 case MRF:
unreachable("MRF registers are not allowed as sources");
 default:
unreachable("Invalid register file");
 }
 +
 +   if (stride == 0)
 +  return 1;
 +
 +   int size = components * this->exec_size * type_sz(type);
 +   return DIV_ROUND_UP(size * stride, 32);
>>>
>>> I don't think this will work unfortunately, brw_reg::hstride is the log2
>>> of the actual stride, unlike fs_reg::stride.  Did I already mention I'm
>>> appalled by the fact that fs_reg has a number of fields with overlapping
>>> semantics but different representation, one or the other being
>>> applicable depending on the occasion.  I guess it would be more or less
>>> bearable if these data members were declared private and some reasonable
>>> abstraction was provided to access them.
>>
>> I don't think anybody's happy with it, but refactoring that is it's
>> own can of worms.
>>
>
> Sure, it would be a pile of work, but I think it should be quite
> straightforward in principle.  We could just punt fixed_hw_reg and
> replace it with an ARF file and a fixed-GRF file using the same fields
> normal regististers use.  For immediates we'd have to add to add a union
> with float/unsigned/int fields similar to brw_reg::dw1.
>
>>>
>>> How do you like the attached patch?  It doesn't solve the fundamental
>>> problem but it seems to improve the situation slightly.
>>
>> It seems fine to me... I was more paranoid about getting the type from
>> the fixed_hw_reg too, but brw_reg_from_fs_reg() in the generator we
>> have:
>>
>> assert(reg->type == reg->fixed_hw_reg.type);
>>
>> so it seems my paranoia wasn't justified. I'd like someone else who's
>> more experienced to take a look though. I suspect that others might
>> want to bikeshed about the name, but I don't have a better suggestion.

Seems perfectly reasonable to me.  Consider the patch you attached

Reviewed-by: Jason Ekstrand 

There are several more places in the compiler where we completely fall
over on that calculation.  However, this seems like a good start.
--Jason

> Yeah, your paranoia was definitely justified, it's essentially the same
> problem.  It actually led to actual bugs in the past which is why I
> added that assertion and changed retype() to keep the type of the
> fixed_hw_reg in sync with the normal type...
>
>>>
  }

  bool
 --
 2.4.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Register spilling issues in the NIR->vec4 backend

2015-07-15 Thread Connor Abbott
On Wed, Jul 15, 2015 at 7:49 AM, Iago Toral  wrote:
> Hi,
>
> when we sent the patches for the new nir->vec4 backend we mentioned that
> we had a few dEQP tests that would fail to link because of register
> spilling. Now that we have added GS support we see a few instances of
> this problem popping up in a few GS piglit tests too, for example this
> one:
>
> tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd.shader_test
>
> I have been looking into what is going on with these tests and I came to
> the conclusion that the problem is a consequence of various factors, but
> probably the main thing contributing to it is the way our SSA pass
> works. That said, I am not that experienced with NIR, so it could also
> be that my analysis is missing something and I am just arriving to wrong
> conclusions, so I'll explain my thoughts below and hopefully someone
> else with more NIR experience can jump in and confirm or reject my
> analysis.
>
> The GS code in that test looks like this:
>
> for (int p = 0; p < 3; p++) {
>color = ((index >= ins[p].m1.length() ?
> ins[p].m2[index-ins[p].m1.length()] :
> ins[p].m1[index]) == expect) ?
>vec4(0.0, 1.0, 0.0, 1.0) : vec4(1.0, 0.0, 0.0, 1.0);
>gl_Position = gl_in[p].gl_Position;
>EmitVertex();
> }
>
> One thing that is immediately contributing to the register pressure is
> some really awful code generated because of the indirect array indexing
> on the inputs inside the loop. This is because of the
> lower_variable_index_to_cond_assign lowering pass called from
> brw_shader.cpp. This pass will convert that color assignment into a
> bunch of nested if/else statements which makes the generated GLSL IR
> code rather large, involving plenty of temporaries too. This is only
> made worse by the fact that loop unrolling will replicate that 3 times.
> The result is a huge pile of GLSL IR with a few dozens of nested if/else
> statements and temporaries that looks like [1] (that is only a fragment
> of the GLSL IR).
>
> One thing that is particularly relevant in that code is that it has
> multiple conditional assignments to the same variable
> (dereference_array_value) as a consequence of this lowering pass.
>
> That much, however, is common to the NIR and non-NIR paths. The problem
> in the NIR case is that all these assignments generate new SSA values,
> which then become new registers in the final NIR form. This leads to NIR
> code like [2].  In contrast, the old vec4 visitor path, is able to have
> writes to the same variable write to the same register.
>
> As a result, if I print the code right before register allocation in the
> NIR path [3] and I compare that to what we get with the old vec4 visitor
> path at that same point [4], it is clearly visible that this difference
> is allowing the vec4 visitor path to reduce register pressure (see how
> in [4] we have multiple writes to vgrf5, while in [3] we always write to
> a new vgrf every time).
>
> So, am I missing something or is this kind of result expected with NIR
> programs? Is there anything in the nir->vec4 pass that we can do to fix
> this or does this need to be fixed when going out of SSA moe inside NIR?
>
> Iago
>
> [1] http://pastebin.com/5uA8ex2S
> [2] http://pastebin.com/pqLfvAVN
> [3] http://pastebin.com/64nSuUH8
> [4] http://pastebin.com/WCrdYxzt
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Hi Iago,

Indeed, NIR does convert conditional writes to conditional selectss --
it's a required part of the conversion to SSA, and since our HW has a
conditional select instruction that's just as fast as doing a
conditional move, we haven't bothered much to try and change it back
during out-of-SSA. However, doing this shouldn't make things worse. In
your example, vgrf9, vgrf15, and vgrf17 all have very short live
intervals and don't interfere with vgrf11 (unless there's another use
of them somewhere after the snippet you pasted), which means that the
register allocator is free to allocate the destinations of all the
selects to the same register.

What's happening, though, is that you're running into our terrible
liveness analysis. After doing the proper liveness analysis, we figure
out the place each register first becomes live and last becomes dead,
and then we consider registers that have overlapping ranges to
interfere. So we consider vgrf11 to interfere with vgrf15 and vgrf17,
even though it really doesn't. The trouble with making it do the right
thing is that we may actually need to extend the live ranges of
registers when the exec masks don't match up, either because one uses
writemask_all or because they have incompatible exec masks due to
containing different datatypes (half-float vs. float, etc.). For
example, in your snippet, if vgrf15 were to be written with
force_writemask_all, then it actually would interfere with vg

[Mesa-dev] [PATCH] nir: Add a value range propagation pass

2015-07-15 Thread Thomas Helland
Hi Matt,

I've commented on some of your feedback down below.
The rest is taken note of and I'll be fixing it up later.

2015-07-15 19:18 GMT+02:00 Matt Turner :
> On Tue, Jul 14, 2015 at 4:29 PM, Thomas Helland
>  wrote:
>> Signed-off-by: Thomas Helland 
>> ---
>>  src/glsl/Makefile.sources  |1 +
>>  src/glsl/nir/nir.h |2 +
>>  src/glsl/nir/nir_opt_value_range.c | 1330 
>> 
>>  3 files changed, 1333 insertions(+)
>>  create mode 100644 src/glsl/nir/nir_opt_value_range.c
>>
>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>> index b938f1e..720ff70 100644
>> --- a/src/glsl/Makefile.sources
>> +++ b/src/glsl/Makefile.sources
>> @@ -56,6 +56,7 @@ NIR_FILES = \
>> nir/nir_opt_peephole_ffma.c \
>> nir/nir_opt_peephole_select.c \
>> nir/nir_opt_remove_phis.c \
>> +   nir/nir_opt_value_range.c \
>> nir/nir_print.c \
>> nir/nir_remove_dead_variables.c \
>> nir/nir_search.c \
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index 6efbfbd..44dd015 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -1693,6 +1693,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);
>>
>>  bool nir_opt_remove_phis(nir_shader *shader);
>>
>> +bool nir_opt_value_range(nir_shader *shader);
>> +
>>  void nir_sweep(nir_shader *shader);
>>
>>  #ifdef __cplusplus
>> diff --git a/src/glsl/nir/nir_opt_value_range.c 
>> b/src/glsl/nir/nir_opt_value_range.c
>> new file mode 100644
>> index 000..1e6ff0e
>> --- /dev/null
>> +++ b/src/glsl/nir/nir_opt_value_range.c
>> @@ -0,0 +1,1330 @@
>> +/*
>> + * Copyright © 2014 Thomas Helland
>
> Presumably 2015.
>
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + *
>> + */
>> +
>> +#include "nir.h"
>> +#include "nir_ssa_def_worklist.h"
>> +#include "nir_block_worklist.h"
>> +#include "nir_constant_expressions.h"
>> +
>> +/* This pass implements an extension of
>> + * "Constant Propagation with Conditional Branches" by Wegman and Zadeck
>> + * that also handles value ranges. This is useful as a lot of shaders have
>> + * min/max expressions that can be eliminated, or conditionals that we can
>> + * prove to be false or true due to previously applied restrictions on 
>> range.
>> + * Value range propagation is a superset of tracking constant values,
>> + * and due to that this pass eliminates the need for a separate constant
>> + * propagation pass. This pass is optimistic, meaning we assume all 
>> variables
>> + * are constant (or have restricted range) and disprove it.
>> + * A pessimistic algorithm would assume all values where undeterminable,
>> + * and then propagate expressions we know to be constant through the 
>> program.
>> + * An optimistic algorithm gets better results than a pessimistic, with the
>> + * downside being that it can not be aborted midway through the pass as the
>> + * results gathered may be wrong (based on wrong assumptions).
>
> Line wrap this block better -- highlight in vim with shift+v, then gq
>
>> + *
>> + * The lattice types are:
>> + * undefined: Variable may be constant or range-restricted (not yet 
>> processed)
>
> Does "not yet processed" mean "not yet implemented"?
>
> I remember there were some concerns about what to do about undefined
> values in the GLSL IR implementation of this pass.
>

It means that we have not yet found out anything about it.
This can be because we have not yet processed it, or that
it is in a block that we have not yet found reachable.
It can also be an instruction that we have not found any
need to process due to the result never getting used.

>> + * constant: Value is determined to be constant
>> + * range: Value is determined to be range-restricted
>> + * overdefined: We cannot guarantee the value is constant or 
>> range-restrict

Re: [Mesa-dev] Seeking advice speeding up glsl optimisation passes for AoA

2015-07-15 Thread Eric Anholt
Timothy Arceri  writes:

> Hi guys,
>
> As I've mentioned a couple of times in previous patches some of the cts AoA
> tests are taking very long time to compile. This is due to excessive
> optimisation passes mainly in the glsl optimisations (there are some slowdowns
> in the intel backend too but these seemed to go away when I tried the new nir
> vec4 backend).
>
> I fixed part of the problem with this patch to do the dead code elimination in
> a single pass [1]. 
> These excessive passes exist in normal shaders but its generally not an issue
> as the number of passes is generally quite low, and inexpensive. However when
> you have an 8 dimensional array constantly walking this becomes quite
> expensive.
>
> The remaining issue I'm seeking some advice for is with constant
> propagation/folding.
>
> It seems for interators used in loops you can get into a situation where an
> optimisation pass is needed for each loop iteration in order to make all
> values of the iterator constant.
>
> I didn't have look too find some real world examples of this in the public
> shader-db. For example here is it happening for a Unity shader:

How about if we just disable the GLSL IR constant prop pass when NIR is
enabled?


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Seeking advice speeding up glsl optimisation passes for AoA

2015-07-15 Thread Connor Abbott
On Wed, Jul 15, 2015 at 11:53 AM, Eric Anholt  wrote:
> Timothy Arceri  writes:
>
>> Hi guys,
>>
>> As I've mentioned a couple of times in previous patches some of the cts AoA
>> tests are taking very long time to compile. This is due to excessive
>> optimisation passes mainly in the glsl optimisations (there are some 
>> slowdowns
>> in the intel backend too but these seemed to go away when I tried the new nir
>> vec4 backend).
>>
>> I fixed part of the problem with this patch to do the dead code elimination 
>> in
>> a single pass [1].
>> These excessive passes exist in normal shaders but its generally not an issue
>> as the number of passes is generally quite low, and inexpensive. However when
>> you have an 8 dimensional array constantly walking this becomes quite
>> expensive.
>>
>> The remaining issue I'm seeking some advice for is with constant
>> propagation/folding.
>>
>> It seems for interators used in loops you can get into a situation where an
>> optimisation pass is needed for each loop iteration in order to make all
>> values of the iterator constant.
>>
>> I didn't have look too find some real world examples of this in the public
>> shader-db. For example here is it happening for a Unity shader:
>
> How about if we just disable the GLSL IR constant prop pass when NIR is
> enabled?

Unfortunately, we're not quite at the point where we can do that yet.
NIR doesn't have the ability to clean up stuff like if (true) { ... },
so turning off GLSL IR constant propagation will prevent that from
happening in a lot of cases when the condition isn't simple enough. I
have some patches on the list from a long time ago to do it, but Jason
never got around to reviewing them and we thought we might want to
re-work the approach -- there are a bunch of other places where we
want to modify control flow in NIR, but doing it while preserving the
structure and preserving SSA is a tricky problem that I only touched a
tiny part of in that series.

>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix comment about DRM_IOCTL_I915_GEM_WAIT.

2015-07-15 Thread Kristian Høgsberg
On Wed, Jul 15, 2015 at 10:22 AM, Kenneth Graunke  wrote:
> From: Chris Wilson 
>
> The kernel actually waits forever when supplied a timeout value < 0,
> rather than returning immediately.  See i915_gem_wait_ioctl() in
> i915_gem.c's call to __i915_wait_request().
>
> (split by Ken from a large patch authored by Chris Wilson)
>
> Reviewed-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/intel_syncobj.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
> b/src/mesa/drivers/dri/i965/intel_syncobj.c
> index c44c4be..c2f4fa9 100644
> --- a/src/mesa/drivers/dri/i965/intel_syncobj.c
> +++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
> @@ -105,9 +105,9 @@ brw_fence_client_wait(struct brw_context *brw, struct 
> brw_fence *fence,
> assert(fence->batch_bo);
>
> /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
> -* immediately for timeouts <= 0.  The best we can do is to clamp the
> -* timeout to INT64_MAX.  This limits the maximum timeout from 584 years 
> to
> -* 292 years - likely not a big deal.
> +* immediately for timeout == 0, and indefinitely if timeout is negative.
> +* The best we can do is to clamp the timeout to INT64_MAX.  This limits
> +* the maximum timeout from 584 years to 292 years - likely not a big 
> deal.

No, there are kernel versions in the wild that has this bug. The
comment after the patch says:

 /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
 * immediately for timeout == 0, and indefinitely if timeout is negative.
 * The best we can do is to clamp the timeout to INT64_MAX.  This limits
 * the maximum timeout from 584 years to 292 years - likely not a big deal.

which doesn't make sense. If we feel like we need to point out that
we've fixed the bug, that's fine, but keep the part about how some
kernels are broken so it's clear why the workaround is needed.

Kristian

>  */
> if (timeout > INT64_MAX)
>timeout = INT64_MAX;
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/13] mesa: Use floats for viewport bounds.

2015-07-15 Thread Matt Turner
On Wed, Jul 15, 2015 at 12:48 AM, Mathias Fröhlich
 wrote:
>
>
> Hi Matt,
>
>
>
> On Monday, July 13, 2015 16:22:03 Matt Turner wrote:
>
>> ARB_viewport_array specifies that DEPTH_RANGE consists of double-
>
>> precision parameters (corresponding commit d4dc35987), and a preparatory
>
>> commit (6340e609a) added _mesa_get_viewport_xform() which returned
>
>> double-precision scale[3] and translate[3] vectors, even though X, Y,
>
>> Width, and Height were still floats.
>
>>
>
>> All users of _mesa_get_viewport_xform() immediately convert the double
>
>> scale and translation vectors into floats (which were floats originally,
>
>> but were converted to doubles in _mesa_get_viewport_xform(), sigh).
>
>>
>
>> i965 at least cannot consume doubles (see SF_CLIP_VIEWPORT). If we want
>
>> to pass doubles to hardware, we should have a different function that
>
>> does that.
>
> The aim was to route the precision provided with the depth range one stage
>
> further down to the hardware. At the current time we do not support a single
>
> piece of hardware that takes doubles at that point, right?

I don't know about NVIDIA or AMD hardware, but Intel hardware cannot
take double-precision viewport bounds, and pipe_viewport_state (what
the code in st_atom_viewport.c copies the outputs from
_mesa_get_viewport_xform() into) defines scale and translate as
floats. Certainly there's no point in the conversion round trip for
hardware like i915, radeon, and r200.

My suggestion, if we want to pipe double-precision viewport bounds
from GL into capable hardware is to add a new function.

> There is a comment inline below.
>
>
>
> I am not sure if I am in the position to review, but at least you can get my
>
> Acked-by: Mathias Froehlich 

Thanks!

> Mathias
>
>
>
>
>
>> ---
>
>> src/mesa/drivers/dri/i915/i915_state.c | 2 +-
>
>> src/mesa/drivers/dri/i965/brw_sf_state.c | 2 +-
>
>> src/mesa/drivers/dri/i965/gen6_viewport_state.c | 2 +-
>
>> src/mesa/drivers/dri/i965/gen7_viewport_state.c | 2 +-
>
>> src/mesa/drivers/dri/i965/gen8_viewport_state.c | 2 +-
>
>> src/mesa/drivers/dri/r200/r200_state.c | 2 +-
>
>> src/mesa/drivers/dri/radeon/radeon_state.c | 2 +-
>
>> src/mesa/main/viewport.c | 14 +++---
>
>> src/mesa/main/viewport.h | 2 +-
>
>> src/mesa/math/m_matrix.c | 4 ++--
>
>> src/mesa/math/m_matrix.h | 4 ++--
>
>> src/mesa/state_tracker/st_atom_viewport.c | 2 +-
>
>> src/mesa/tnl/t_context.c | 2 +-
>
>> src/mesa/tnl/t_rasterpos.c | 2 +-
>
>> 14 files changed, 22 insertions(+), 22 deletions(-)
>
>>
>
>> diff --git a/src/mesa/drivers/dri/i915/i915_state.c
>> b/src/mesa/drivers/dri/i915/i915_state.c
>
>> index 5f10b84..4c83073 100644
>
>> --- a/src/mesa/drivers/dri/i915/i915_state.c
>
>> +++ b/src/mesa/drivers/dri/i915/i915_state.c
>
>> @@ -402,7 +402,7 @@ void
>
>> intelCalcViewport(struct gl_context * ctx)
>
>> {
>
>> struct intel_context *intel = intel_context(ctx);
>
>> - double scale[3], translate[3];
>
>> + float scale[3], translate[3];
>
>>
>
>> _mesa_get_viewport_xform(ctx, 0, scale, translate);
>
>>
>
>> diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c
>> b/src/mesa/drivers/dri/i965/brw_sf_state.c
>
>> index 5d98922..3be6e4a 100644
>
>> --- a/src/mesa/drivers/dri/i965/brw_sf_state.c
>
>> +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
>
>> @@ -45,7 +45,7 @@ static void upload_sf_vp(struct brw_context *brw)
>
>> struct gl_context *ctx = &brw->ctx;
>
>> struct brw_sf_viewport *sfv;
>
>> GLfloat y_scale, y_bias;
>
>> - double scale[3], translate[3];
>
>> + float scale[3], translate[3];
>
>> const bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
>
>>
>
>> sfv = brw_state_batch(brw, AUB_TRACE_SF_VP_STATE,
>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
>> b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
>
>> index 7c8d884..11b9a36 100644
>
>> --- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
>
>> +++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
>
>> @@ -101,7 +101,7 @@ gen6_upload_sf_vp(struct brw_context *brw)
>
>> }
>
>>
>
>> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
>
>> - double scale[3], translate[3];
>
>> + float scale[3], translate[3];
>
>>
>
>> /* _NEW_VIEWPORT */
>
>> _mesa_get_viewport_xform(ctx, i, scale, translate);
>
>> diff --git a/src/mesa/drivers/dri/i965/gen7_viewport_state.c
>> b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
>
>> index b655205..c75dc99 100644
>
>> --- a/src/mesa/drivers/dri/i965/gen7_viewport_state.c
>
>> +++ b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
>
>> @@ -53,7 +53,7 @@ gen7_upload_sf_clip_viewport(struct brw_context *brw)
>
>> }
>
>>
>
>> for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
>
>> - double scale[3], translate[3];
>
>> + float scale[3], translate[3];
>
>> _mesa_get_viewport_xform(ctx, i, scale, translate);
>
>>
>
>> /* According to the "Vertex X,Y Clamping and Quantization" section of
>
>> diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
>> b/src/mesa/drivers/dri/i965/gen8_viewport_state.c

Re: [Mesa-dev] [PATCH 00/13] Avoid double promotion

2015-07-15 Thread Matt Turner
On Tue, Jul 14, 2015 at 4:32 AM, Daniel Stone  wrote:
> Hi,
>
> On 14 July 2015 at 00:22, Matt Turner  wrote:
> but it's not really
>> useful in general because float arguments are always cast to double
>> when passed as arguments to varargs functions like printf (why?), and
>> it warns about that, generating a lot of noise.
>
> It might shock you to learn that the answer is awful historical
> reasons. Or maybe not.
>
> Some types undergo unconditional promotion, similar to pre-prototype
> functions, when used in vararg calls: float is promoted to double, and
> char/short are promoted to int.
>
> http://c-faq.com/~scs/cgi-bin/faqcat.cgi?sec=ansi#argpromos

Eesh! Thanks for the information.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix comment about DRM_IOCTL_I915_GEM_WAIT.

2015-07-15 Thread Chris Wilson
On Wed, Jul 15, 2015 at 12:20:15PM -0700, Kristian Høgsberg wrote:
> On Wed, Jul 15, 2015 at 10:22 AM, Kenneth Graunke  
> wrote:
> > From: Chris Wilson 
> >
> > The kernel actually waits forever when supplied a timeout value < 0,
> > rather than returning immediately.  See i915_gem_wait_ioctl() in
> > i915_gem.c's call to __i915_wait_request().
> >
> > (split by Ken from a large patch authored by Chris Wilson)
> >
> > Reviewed-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/intel_syncobj.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
> > b/src/mesa/drivers/dri/i965/intel_syncobj.c
> > index c44c4be..c2f4fa9 100644
> > --- a/src/mesa/drivers/dri/i965/intel_syncobj.c
> > +++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
> > @@ -105,9 +105,9 @@ brw_fence_client_wait(struct brw_context *brw, struct 
> > brw_fence *fence,
> > assert(fence->batch_bo);
> >
> > /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
> > -* immediately for timeouts <= 0.  The best we can do is to clamp the
> > -* timeout to INT64_MAX.  This limits the maximum timeout from 584 
> > years to
> > -* 292 years - likely not a big deal.
> > +* immediately for timeout == 0, and indefinitely if timeout is 
> > negative.
> > +* The best we can do is to clamp the timeout to INT64_MAX.  This limits
> > +* the maximum timeout from 584 years to 292 years - likely not a big 
> > deal.
> 
> No, there are kernel versions in the wild that has this bug. The
> comment after the patch says:
> 
>  /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
>  * immediately for timeout == 0, and indefinitely if timeout is negative.
>  * The best we can do is to clamp the timeout to INT64_MAX.  This limits
>  * the maximum timeout from 584 years to 292 years - likely not a big deal.
> 
> which doesn't make sense. If we feel like we need to point out that
> we've fixed the bug, that's fine, but keep the part about how some
> kernels are broken so it's clear why the workaround is needed.

This is not a workaround for the kernel bug but an impedence mismatch
for the uint64_t GL interface and int64_t kernel interface.

The text certainly didn't describe the kernel interface at the time it
was introduced nor at the present time. If you want to rewrite it to
mention the bug, then do so, but don't do that claiming to be a
description of the kernel interface.

If you really want to mention the inconsequential bug, then include:

 * except once upon a time, shortly from 3.17, a bug was introduced that
 * made the kernel return instantly for negative timeouts.

It doesn't change the code in the slightest since you still have to make
a judgment call about what to do with the timeout values of greater than
292 years. Looping at 292 years is closer than never.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-15 Thread Matt Turner
On Mon, Jul 13, 2015 at 5:13 PM, Roland Scheidegger  wrote:
> Did you replace 2 of them by exp2f but one by exp2f on purpose?
>
> I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
> (all of exp2, exp2f, powf are c99, powf is supported by older msvc but
> the others are not). I guess though could throw some wrappers into
> c99_math.h.

I think this series adds uses of sinf, cosf, exp2f, logf, and powf.

I see existing uses of sinf, cosf, logf, and powf in tgsi_exec.c.
Presumably that means they're safe for me to use elsewhere?

I also see that lp_test_arit.c uses exp2f. Presumably that means it's
also safe to use?

I don't see wrappers for any of these.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/13] tnl: Avoid double promotion.

2015-07-15 Thread Matt Turner
On Tue, Jul 14, 2015 at 3:53 AM, Iago Toral  wrote:
> On Mon, 2015-07-13 at 16:22 -0700, Matt Turner wrote:
>> There are a couple of unrelated changes in t_vb_lighttmp.h that I hope
>> you'll excuse -- there's a block of code that's duplicated modulo a few
>> trivial differences that I took the liberty of fixing.
>> ---
>>  src/mesa/tnl/t_draw.c   |  2 +-
>>  src/mesa/tnl/t_rasterpos.c  |  6 +++---
>>  src/mesa/tnl/t_vb_fog.c |  6 +++---
>>  src/mesa/tnl/t_vb_light.c   | 16 
>>  src/mesa/tnl/t_vb_lighttmp.h| 16 +++-
>>  src/mesa/tnl/t_vb_normals.c |  4 ++--
>>  src/mesa/tnl/t_vertex_generic.c |  2 +-
>>  7 files changed, 25 insertions(+), 27 deletions(-)
>>
>> diff --git a/src/mesa/tnl/t_draw.c b/src/mesa/tnl/t_draw.c
>> index 6adf1dc..713c6a9 100644
>> --- a/src/mesa/tnl/t_draw.c
>> +++ b/src/mesa/tnl/t_draw.c
>> @@ -257,7 +257,7 @@ static GLboolean *_tnl_import_edgeflag( struct 
>> gl_context *ctx,
>> GLuint i;
>>
>> for (i = 0; i < count; i++) {
>> -  *bptr++ = ((GLfloat *)ptr)[0] == 1.0;
>> +  *bptr++ = ((GLfloat *)ptr)[0] == 1.0F;
>>ptr += stride;
>> }
>>
>> diff --git a/src/mesa/tnl/t_rasterpos.c b/src/mesa/tnl/t_rasterpos.c
>> index 7ef50ea..4bd9ac8 100644
>> --- a/src/mesa/tnl/t_rasterpos.c
>> +++ b/src/mesa/tnl/t_rasterpos.c
>> @@ -148,7 +148,7 @@ shade_rastpos(struct gl_context *ctx,
>>SUB_3V(VP, light->_Position, vertex);
>>   /* d = length(VP) */
>>d = (GLfloat) LEN_3FV( VP );
>> -  if (d > 1.0e-6) {
>> +  if (d > 1.0e-6F) {
>>  /* normalize VP */
>>   GLfloat invd = 1.0F / d;
>>   SELF_SCALE_SCALAR_3V(VP, invd);
>> @@ -172,7 +172,7 @@ shade_rastpos(struct gl_context *ctx,
>>}
>>}
>>
>> -  if (attenuation < 1e-3)
>> +  if (attenuation < 1e-3F)
>>continue;
>>
>>n_dot_VP = DOT3( normal, VP );
>> @@ -219,7 +219,7 @@ shade_rastpos(struct gl_context *ctx,
>>   shine = ctx->Light.Material.Attrib[MAT_ATTRIB_FRONT_SHININESS][0];
>>   spec_coef = powf(n_dot_h, shine);
>>
>> - if (spec_coef > 1.0e-10) {
>> + if (spec_coef > 1.0e-10F) {
>> if 
>> (ctx->Light.Model.ColorControl==GL_SEPARATE_SPECULAR_COLOR) {
>>ACC_SCALE_SCALAR_3V( specularContrib, spec_coef,
>> light->_MatSpecular[0]);
>> diff --git a/src/mesa/tnl/t_vb_fog.c b/src/mesa/tnl/t_vb_fog.c
>> index 1ca72f8..5489ed6 100644
>> --- a/src/mesa/tnl/t_vb_fog.c
>> +++ b/src/mesa/tnl/t_vb_fog.c
>> @@ -45,8 +45,8 @@ struct fog_stage_data {
>>  #define FOG_STAGE_DATA(stage) ((struct fog_stage_data *)stage->privatePtr)
>>
>>  #define FOG_EXP_TABLE_SIZE 256
>> -#define FOG_MAX (10.0)
>> -#define EXP_FOG_MAX .0006595
>> +#define FOG_MAX (10.0F)
>> +#define EXP_FOG_MAX .0006595F
>>  #define FOG_INCR (FOG_MAX/FOG_EXP_TABLE_SIZE)
>>  static GLfloat exp_table[FOG_EXP_TABLE_SIZE];
>>  static GLfloat inited = 0;
>> @@ -54,7 +54,7 @@ static GLfloat inited = 0;
>>  #if 1
>>  #define NEG_EXP( result, narg ) 
>>  \
>>  do { \
>> -   GLfloat f = (GLfloat) (narg * (1.0/FOG_INCR));\
>> +   GLfloat f = (GLfloat) (narg * (1.0F / FOG_INCR)); \
>> GLint k = (GLint) f; 
>>  \
>> if (k > FOG_EXP_TABLE_SIZE-2) \
>>result = (GLfloat) EXP_FOG_MAX;   
>>  \
>> diff --git a/src/mesa/tnl/t_vb_light.c b/src/mesa/tnl/t_vb_light.c
>> index dbd57fa..df9073e 100644
>> --- a/src/mesa/tnl/t_vb_light.c
>> +++ b/src/mesa/tnl/t_vb_light.c
>> @@ -137,23 +137,23 @@ validate_shine_table( struct gl_context *ctx, GLuint 
>> side, GLfloat shininess )
>>   break;
>>
>>m = s->tab;
>> -  m[0] = 0.0;
>> -  if (shininess == 0.0) {
>> +  m[0] = 0.0F;
>> +  if (shininess == 0.0F) {
>>for (j = 1 ; j <= SHINE_TABLE_SIZE ; j++)
>> - m[j] = 1.0;
>> + m[j] = 1.0F;
>>}
>>else {
>>for (j = 1 ; j < SHINE_TABLE_SIZE ; j++) {
>>  GLdouble t, x = j / (GLfloat) (SHINE_TABLE_SIZE - 1);
>
> I think you want to declare x (and probably t as well) with type GLfloat
> here.
>
>> -if (x < 0.005) /* underflow check */
>> -   x = 0.005;
>> +if (x < 0.005F) /* underflow check */
>> +   x = 0.005F;
>>  t = pow(x, shininess);
>
> Since the code below casts t to float anyway, is there a reason why you
> did not use powf and declared t as float? Also, there is no point in
> making the change below if you don't do that, right?

Totally right. Thanks for the nice review!

I've made t and x GLfloat, s/pow/powf/, and removed the GLfloat cast from t.
___
mesa-dev mailing lis

Re: [Mesa-dev] [PATCH 13/13] mesa: Avoid double promotion.

2015-07-15 Thread Matt Turner
On Tue, Jul 14, 2015 at 4:45 AM, Iago Toral  wrote:
> On Mon, 2015-07-13 at 16:22 -0700, Matt Turner wrote:
>> case GL_READ_BUFFER:
>> diff --git a/src/mesa/main/light.c b/src/mesa/main/light.c
>> index 4021dbe..fe2ce8c 100644
>> --- a/src/mesa/main/light.c
>> +++ b/src/mesa/main/light.c
>> @@ -143,7 +143,7 @@ _mesa_light(struct gl_context *ctx, GLuint lnum, GLenum 
>> pname, const GLfloat *pa
>>COPY_3V(light->SpotDirection, params);
>>break;
>> case GL_SPOT_EXPONENT:
>> -  assert(params[0] >= 0.0);
>> +  assert(params[0] >= 0.0F);
>>assert(params[0] <= ctx->Const.MaxSpotExponent);
>>if (light->SpotExponent == params[0])
>>return;
>> @@ -151,12 +151,12 @@ _mesa_light(struct gl_context *ctx, GLuint lnum, 
>> GLenum pname, const GLfloat *pa
>>light->SpotExponent = params[0];
>>break;
>> case GL_SPOT_CUTOFF:
>> -  assert(params[0] == 180.0 || (params[0] >= 0.0 && params[0] <= 90.0));
>> +  assert(params[0] == 180.0F || (params[0] >= 0.0F && params[0] <= 
>> 90.0F));
>>if (light->SpotCutoff == params[0])
>>   return;
>>FLUSH_VERTICES(ctx, _NEW_LIGHT);
>>light->SpotCutoff = params[0];
>> -  light->_CosCutoff = (GLfloat) (cos(light->SpotCutoff * M_PI / 180.0));
>> +  light->_CosCutoff = (cosf(light->SpotCutoff * M_PI / 180.0));
>
> Same comment as in the previous patch: is there any gain here?

I think so -- I expect cosf() is faster than cos(). In essence, we're
moving the double -> float conversion from the function's result to
the function's argument, allowing us to call a cheaper function.

We additionally might want to cast (M_PI / 180.0) to float. I'm not
sure. I'll do some tests.

> Other than this:
> Reviewed-by: Iago Toral Quiroga 

Thanks.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-15 Thread Matt Turner
On Wed, Jul 15, 2015 at 12:44 PM, Matt Turner  wrote:
> On Mon, Jul 13, 2015 at 5:13 PM, Roland Scheidegger  
> wrote:
>> Did you replace 2 of them by exp2f but one by exp2f on purpose?
>>
>> I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
>> (all of exp2, exp2f, powf are c99, powf is supported by older msvc but
>> the others are not). I guess though could throw some wrappers into
>> c99_math.h.
>
> I think this series adds uses of sinf, cosf, exp2f, logf, and powf.
>
> I see existing uses of sinf, cosf, logf, and powf in tgsi_exec.c.
> Presumably that means they're safe for me to use elsewhere?
>
> I also see that lp_test_arit.c uses exp2f. Presumably that means it's
> also safe to use?
>
> I don't see wrappers for any of these.

Additionally, I've added calls to copysignf and ldexpf. I don't see
existing uses of either of these.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix comment about DRM_IOCTL_I915_GEM_WAIT.

2015-07-15 Thread Kristian Høgsberg
On Wed, Jul 15, 2015 at 12:41 PM, Chris Wilson  wrote:
> On Wed, Jul 15, 2015 at 12:20:15PM -0700, Kristian Høgsberg wrote:
>> On Wed, Jul 15, 2015 at 10:22 AM, Kenneth Graunke  
>> wrote:
>> > From: Chris Wilson 
>> >
>> > The kernel actually waits forever when supplied a timeout value < 0,
>> > rather than returning immediately.  See i915_gem_wait_ioctl() in
>> > i915_gem.c's call to __i915_wait_request().
>> >
>> > (split by Ken from a large patch authored by Chris Wilson)
>> >
>> > Reviewed-by: Kenneth Graunke 
>> > ---
>> >  src/mesa/drivers/dri/i965/intel_syncobj.c | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/intel_syncobj.c 
>> > b/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > index c44c4be..c2f4fa9 100644
>> > --- a/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > +++ b/src/mesa/drivers/dri/i965/intel_syncobj.c
>> > @@ -105,9 +105,9 @@ brw_fence_client_wait(struct brw_context *brw, struct 
>> > brw_fence *fence,
>> > assert(fence->batch_bo);
>> >
>> > /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
>> > -* immediately for timeouts <= 0.  The best we can do is to clamp the
>> > -* timeout to INT64_MAX.  This limits the maximum timeout from 584 
>> > years to
>> > -* 292 years - likely not a big deal.
>> > +* immediately for timeout == 0, and indefinitely if timeout is 
>> > negative.
>> > +* The best we can do is to clamp the timeout to INT64_MAX.  This 
>> > limits
>> > +* the maximum timeout from 584 years to 292 years - likely not a big 
>> > deal.
>>
>> No, there are kernel versions in the wild that has this bug. The
>> comment after the patch says:
>>
>>  /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
>>  * immediately for timeout == 0, and indefinitely if timeout is negative.
>>  * The best we can do is to clamp the timeout to INT64_MAX.  This limits
>>  * the maximum timeout from 584 years to 292 years - likely not a big deal.
>>
>> which doesn't make sense. If we feel like we need to point out that
>> we've fixed the bug, that's fine, but keep the part about how some
>> kernels are broken so it's clear why the workaround is needed.
>
> This is not a workaround for the kernel bug but an impedence mismatch
> for the uint64_t GL interface and int64_t kernel interface.

We used to pass the GL uint64_t  to the kernel, and before 3.17 that
looked like it did thee right thing. Of course, any value over
INT64_MAX would wait forever, but it's hard to tell the difference
between that and 292 years. When the kernel bug happened, the common
"wait for UINT64_MAX ns" case broke. Not sure if the old behavior was
intentional, but it did handle the impedance mismatch (perhaps
inadvenrtently) as well as the current code. When the kernel broke, I
changed the way we handle the mismatch -  not because the "wait 292
years" behaviour is better, but because it works around the kernel
bug. So I'd certainly call the current behavior a workaround as much
as it's handling impedance mismatch.

> The text certainly didn't describe the kernel interface at the time it
> was introduced nor at the present time. If you want to rewrite it to
> mention the bug, then do so, but don't do that claiming to be a
> description of the kernel interface.
>
> If you really want to mention the inconsequential bug, then include:
>
>  * except once upon a time, shortly from 3.17, a bug was introduced that
>  * made the kernel return instantly for negative timeouts.
>
> It doesn't change the code in the slightest since you still have to make
> a judgment call about what to do with the timeout values of greater than
> 292 years. Looping at 292 years is closer than never.

It's not about changing the code, it's about documenting why we're
doing it this way instead of the "round up to infinity" for the
timeout > INT64_MAX cases. It's not an inconsequential bug, it breaks
the main use case of GL_ARB_sync, so I think it's worth having a
comment there to avoid accidentally regressing that.

Kristian

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-15 Thread Roland Scheidegger
Am 15.07.2015 um 21:44 schrieb Matt Turner:
> On Mon, Jul 13, 2015 at 5:13 PM, Roland Scheidegger  
> wrote:
>> Did you replace 2 of them by exp2f but one by exp2f on purpose?
>>
>> I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
>> (all of exp2, exp2f, powf are c99, powf is supported by older msvc but
>> the others are not). I guess though could throw some wrappers into
>> c99_math.h.
> 
> I think this series adds uses of sinf, cosf, exp2f, logf, and powf.
> 
> I see existing uses of sinf, cosf, logf, and powf in tgsi_exec.c.
> Presumably that means they're safe for me to use elsewhere?
Yes. I think msvc generally has support for these float functions when
there's also double versions of the same function, c99 or not. And the
double version for all of these exist in c89 (in contrast to exp2).

> 
> I also see that lp_test_arit.c uses exp2f. Presumably that means it's
> also safe to use?
Nope. This one is explicitly excluded from being compiled with msvc...
(at least in scons, though I think the exp2f wasn't the only reason)

> 
> I don't see wrappers for any of these.
> 

Maybe we should add one for exp2/exp2f.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Register spilling issues in the NIR->vec4 backend

2015-07-15 Thread Ben Widawsky
On Wed, Jul 15, 2015 at 11:02:03AM -0700, Connor Abbott wrote:
> On Wed, Jul 15, 2015 at 7:49 AM, Iago Toral  wrote:
> > Hi,
> >
> > when we sent the patches for the new nir->vec4 backend we mentioned that
> > we had a few dEQP tests that would fail to link because of register
> > spilling. Now that we have added GS support we see a few instances of
> > this problem popping up in a few GS piglit tests too, for example this
> > one:
> >
> > tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd.shader_test
> >
> > I have been looking into what is going on with these tests and I came to
> > the conclusion that the problem is a consequence of various factors, but
> > probably the main thing contributing to it is the way our SSA pass
> > works. That said, I am not that experienced with NIR, so it could also
> > be that my analysis is missing something and I am just arriving to wrong
> > conclusions, so I'll explain my thoughts below and hopefully someone
> > else with more NIR experience can jump in and confirm or reject my
> > analysis.
> >
> > The GS code in that test looks like this:
> >
> > for (int p = 0; p < 3; p++) {
> >color = ((index >= ins[p].m1.length() ?
> > ins[p].m2[index-ins[p].m1.length()] :
> > ins[p].m1[index]) == expect) ?
> >vec4(0.0, 1.0, 0.0, 1.0) : vec4(1.0, 0.0, 0.0, 1.0);
> >gl_Position = gl_in[p].gl_Position;
> >EmitVertex();
> > }
> >
> > One thing that is immediately contributing to the register pressure is
> > some really awful code generated because of the indirect array indexing
> > on the inputs inside the loop. This is because of the
> > lower_variable_index_to_cond_assign lowering pass called from
> > brw_shader.cpp. This pass will convert that color assignment into a
> > bunch of nested if/else statements which makes the generated GLSL IR
> > code rather large, involving plenty of temporaries too. This is only
> > made worse by the fact that loop unrolling will replicate that 3 times.
> > The result is a huge pile of GLSL IR with a few dozens of nested if/else
> > statements and temporaries that looks like [1] (that is only a fragment
> > of the GLSL IR).
> >
> > One thing that is particularly relevant in that code is that it has
> > multiple conditional assignments to the same variable
> > (dereference_array_value) as a consequence of this lowering pass.
> >
> > That much, however, is common to the NIR and non-NIR paths. The problem
> > in the NIR case is that all these assignments generate new SSA values,
> > which then become new registers in the final NIR form. This leads to NIR
> > code like [2].  In contrast, the old vec4 visitor path, is able to have
> > writes to the same variable write to the same register.
> >
> > As a result, if I print the code right before register allocation in the
> > NIR path [3] and I compare that to what we get with the old vec4 visitor
> > path at that same point [4], it is clearly visible that this difference
> > is allowing the vec4 visitor path to reduce register pressure (see how
> > in [4] we have multiple writes to vgrf5, while in [3] we always write to
> > a new vgrf every time).
> >
> > So, am I missing something or is this kind of result expected with NIR
> > programs? Is there anything in the nir->vec4 pass that we can do to fix
> > this or does this need to be fixed when going out of SSA moe inside NIR?
> >
> > Iago
> >
> > [1] http://pastebin.com/5uA8ex2S
> > [2] http://pastebin.com/pqLfvAVN
> > [3] http://pastebin.com/64nSuUH8
> > [4] http://pastebin.com/WCrdYxzt
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> Hi Iago,
> 
> Indeed, NIR does convert conditional writes to conditional selectss --
> it's a required part of the conversion to SSA, and since our HW has a
> conditional select instruction that's just as fast as doing a
> conditional move, we haven't bothered much to try and change it back
> during out-of-SSA. However, doing this shouldn't make things worse. In
> your example, vgrf9, vgrf15, and vgrf17 all have very short live
> intervals and don't interfere with vgrf11 (unless there's another use
> of them somewhere after the snippet you pasted), which means that the
> register allocator is free to allocate the destinations of all the
> selects to the same register.
> 
> What's happening, though, is that you're running into our terrible
> liveness analysis. After doing the proper liveness analysis, we figure
> out the place each register first becomes live and last becomes dead,
> and then we consider registers that have overlapping ranges to
> interfere. So we consider vgrf11 to interfere with vgrf15 and vgrf17,
> even though it really doesn't. The trouble with making it do the right
> thing is that we may actually need to extend the live ranges of
> registers when the exec masks don't match up, either because one 

Re: [Mesa-dev] [PATCH 03/13] gallium/auxiliary: Use exp2(x) instead of pow(2.0, x).

2015-07-15 Thread Roland Scheidegger
Am 15.07.2015 um 21:58 schrieb Matt Turner:
> On Wed, Jul 15, 2015 at 12:44 PM, Matt Turner  wrote:
>> On Mon, Jul 13, 2015 at 5:13 PM, Roland Scheidegger  
>> wrote:
>>> Did you replace 2 of them by exp2f but one by exp2f on purpose?
>>>
>>> I don't think we can use exp2/exp2f in gallium. This requires msvc 2013
>>> (all of exp2, exp2f, powf are c99, powf is supported by older msvc but
>>> the others are not). I guess though could throw some wrappers into
>>> c99_math.h.
>>
>> I think this series adds uses of sinf, cosf, exp2f, logf, and powf.
>>
>> I see existing uses of sinf, cosf, logf, and powf in tgsi_exec.c.
>> Presumably that means they're safe for me to use elsewhere?
>>
>> I also see that lp_test_arit.c uses exp2f. Presumably that means it's
>> also safe to use?
>>
>> I don't see wrappers for any of these.
> 
> Additionally, I've added calls to copysignf and ldexpf. I don't see
> existing uses of either of these.
> 

Actually these two won't work with older msvc neither according to my
google skills. But outside of gallium code, this is just fine nowadays.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] i965: Push miptree tiling request into flags

2015-07-15 Thread Anuj Phogat
On Tue, Jul 14, 2015 at 9:56 AM, Ben Widawsky
 wrote:
> With the last few patches a way was provided to influence lower layer miptree
> layout and allocation decisions via flags (replacing bools). For simplicity, I
> chose not to touch the tiling requests because the change was slightly less
> mechanical than replacing the bools.
>
> The goal is to organize the code so we can continue to add new parameters and
> tiling types while minimizing risk to the existing code, and not having to
> constantly add new function parameters.
>
> v2: Rebased on Anuj's recent Yf/Ys changes
> Fix non-msrt MCS allocation (was only happening in gen8 case before)
>
> Cc: Anuj Phogat 
> Cc: Chad Versace 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 21 ++--
>  src/mesa/drivers/dri/i965/intel_fbo.c  |  6 ++--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  | 45 
> +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  | 15 -
>  src/mesa/drivers/dri/i965/intel_tex.c  |  2 +-
>  src/mesa/drivers/dri/i965/intel_tex_image.c|  3 +-
>  src/mesa/drivers/dri/i965/intel_tex_validate.c |  5 +--
>  7 files changed, 50 insertions(+), 47 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index 389834f..a12b4af 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -614,8 +614,8 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
>   */
>  static uint32_t
>  brw_miptree_choose_tiling(struct brw_context *brw,
> -  enum intel_miptree_tiling_mode requested,
> -  const struct intel_mipmap_tree *mt)
> +  const struct intel_mipmap_tree *mt,
> +  uint32_t layout_flags)
>  {
> if (mt->format == MESA_FORMAT_S_UINT8) {
>/* The stencil buffer is W tiled. However, we request from the kernel a
> @@ -624,15 +624,18 @@ brw_miptree_choose_tiling(struct brw_context *brw,
>return I915_TILING_NONE;
> }
>
> +   /* Do not support changing the tiling for miptrees with pre-allocated 
> BOs. */
> +   assert((layout_flags & MIPTREE_LAYOUT_FOR_BO) == 0);
> +
> /* Some usages may want only one type of tiling, like depth miptrees (Y
>  * tiled), or temporary BOs for uploading data once (linear).
>  */
> -   switch (requested) {
> -   case INTEL_MIPTREE_TILING_ANY:
> +   switch (layout_flags & MIPTREE_LAYOUT_ALLOC_ANY_TILED) {
> +   case MIPTREE_LAYOUT_ALLOC_ANY_TILED:
>break;
> -   case INTEL_MIPTREE_TILING_Y:
> +   case MIPTREE_LAYOUT_ALLOC_YTILED:
>return I915_TILING_Y;
> -   case INTEL_MIPTREE_TILING_NONE:
> +   case MIPTREE_LAYOUT_ALLOC_LINEAR:
>return I915_TILING_NONE;
> }
>
> @@ -835,7 +838,6 @@ intel_miptree_can_use_tr_mode(const struct 
> intel_mipmap_tree *mt)
>  void
>  brw_miptree_layout(struct brw_context *brw,
> struct intel_mipmap_tree *mt,
> -   enum intel_miptree_tiling_mode requested,
> uint32_t layout_flags)
>  {
> const unsigned bpp = mt->cpp * 8;
> @@ -852,8 +854,7 @@ brw_miptree_layout(struct brw_context *brw,
>!(layout_flags & MIPTREE_LAYOUT_FOR_BO) &&
>!mt->compressed &&
>_mesa_is_format_color_format(mt->format) &&
> -  (requested == INTEL_MIPTREE_TILING_Y ||
> -   requested == INTEL_MIPTREE_TILING_ANY) &&
> +  (layout_flags & MIPTREE_LAYOUT_ALLOC_YTILED) &&
>(bpp && is_power_of_two(bpp)) &&
>/* FIXME: To avoid piglit regressions keep the Yf/Ys tiling
> * disabled at the moment.
> @@ -897,7 +898,7 @@ brw_miptree_layout(struct brw_context *brw,
>if (layout_flags & MIPTREE_LAYOUT_FOR_BO)
>   break;
>
> -  mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
> +  mt->tiling = brw_miptree_choose_tiling(brw, mt, layout_flags);
>if (is_tr_mode_yf_ys_allowed) {
>   if (intel_miptree_can_use_tr_mode(mt))
>  break;
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index 05e3f8b..26f895b 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -1022,6 +1022,9 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw,
> struct intel_mipmap_tree *new_mt;
> int width, height, depth;
>
> +   uint32_t layout_flags = MIPTREE_LAYOUT_ACCELERATED_UPLOAD |
> +   MIPTREE_LAYOUT_ALLOC_ANY_TILED;
> +
> intel_miptree_get_dimensions_for_image(rb->TexImage, &width, &height, 
> &depth);
>
> new_mt = intel_miptree_create(brw, rb->TexImage->TexObject->Target,
> @@ -1030,8 +1033,7 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw,
>   intel_image->base.Base.Level,
>   width, height, depth,
> 

Re: [Mesa-dev] [PATCH 2/5] i965/miptree: Cleanup some of the miptree map logic

2015-07-15 Thread Anuj Phogat
On Tue, Jul 14, 2015 at 9:56 AM, Ben Widawsky
 wrote:
> At the crux of this change is moving whether or not we can even use the 
> hardware
> blitter into the can_blit_slice check. Fundamentally this makes sense as
> blitting a slice is a subset in functionality of being able to use the blitter
> at all.
>
> NOTE: I think it's bad practice to have the assert in a function that is
> determining whether or not we should use the blitter, but I tried the
> alternatives, and they look worse IMO.
>
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c| 13 +
>  src/mesa/drivers/dri/i965/intel_blit.h|  3 +++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 27 
> +--
>  3 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
> b/src/mesa/drivers/dri/i965/intel_blit.c
> index bc39053..c4701e3 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -241,6 +241,19 @@ intel_miptree_blit_compatible_formats(mesa_format src, 
> mesa_format dst)
> return false;
>  }
>
> +bool
> +intel_miptree_can_hw_blit(struct brw_context *brw, struct intel_mipmap_tree 
> *mt)
> +{
> +   if (mt->compressed)
> +  return false;
> +
> +   /* Prior to Sandybridge, the blitter can't handle Y tiling */
> +   if (brw->gen < 6 && mt->tiling == I915_TILING_Y)
> +  return false;
> +
> +   return true;
> +}
> +
>  /**
>   * Implements a rectangular block transfer (blit) of pixels between two
>   * miptrees.
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.h 
> b/src/mesa/drivers/dri/i965/intel_blit.h
> index c3d19a5..e60dd9b 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.h
> +++ b/src/mesa/drivers/dri/i965/intel_blit.h
> @@ -50,6 +50,9 @@ intelEmitCopyBlit(struct brw_context *brw,
>
>  bool intel_miptree_blit_compatible_formats(mesa_format src, mesa_format dst);
>
> +bool intel_miptree_can_hw_blit(struct brw_context *brw,
> +   struct intel_mipmap_tree *mt);
> +
>  bool intel_miptree_blit(struct brw_context *brw,
>  struct intel_mipmap_tree *src_mt,
>  int src_level, int src_slice,
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 72fba49..1330c2f 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -2600,9 +2600,14 @@ intel_miptree_release_map(struct intel_mipmap_tree *mt,
>  }
>
>  static bool
> -can_blit_slice(struct intel_mipmap_tree *mt,
> +can_blit_slice(struct brw_context *brw,
> +   struct intel_mipmap_tree *mt,
> unsigned int level, unsigned int slice)
>  {
> +
> +   if (!intel_miptree_can_hw_blit(brw, mt))
> +  return false;
> +
> uint32_t image_x;
> uint32_t image_y;
> intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
> @@ -2624,20 +2629,22 @@ use_intel_mipree_map_blit(struct brw_context *brw,
>unsigned int slice)
>  {
> if (brw->has_llc &&
> -  /* It's probably not worth swapping to the blit ring because of
> -   * all the overhead involved.
> -   */
> !(mode & GL_MAP_WRITE_BIT) &&
> -   !mt->compressed &&
> -   (mt->tiling == I915_TILING_X ||
> -/* Prior to Sandybridge, the blitter can't handle Y tiling */
> -(brw->gen >= 6 && mt->tiling == I915_TILING_Y)) &&
> -   can_blit_slice(mt, level, slice))
> +   can_blit_slice(brw, mt, level, slice))
>return true;
>
> if (mt->tiling != I915_TILING_NONE &&
> mt->bo->size >= brw->max_gtt_map_object_size) {
> -  assert(can_blit_slice(mt, level, slice));
> +  /* XXX: This assertion is actually the final condition for platforms
> +   * without SSE4.1.  Returning false is not the right thing to do with
> +   * the current code. On those platforms, the goal of this function is 
> to give
> +   * preference to the GTT, and at this point we've determined we cannot 
> use
> +   * the GTT, and we cannot blit, so we are out of options.
> +   *
> +   * NOTE: It should be possible to actually handle the case, but AFAIK, 
> we
> +   * never get this assertion.
> +   */
> +  assert(can_blit_slice(brw, mt, level, slice));
>return true;
> }
>
> --
> 2.4.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

This patch now allows using hw blitter with I915_TILING_NONE
which is not allowed for unexplained reason at present. So, we
have a bug fix in this patch. May be you should split this in to
two? Changes in the patch look fine to me.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/list

Re: [Mesa-dev] [PATCH 08/13] mesa: overhaul the glGetTexImage code

2015-07-15 Thread Brian Paul

On 07/14/2015 03:54 PM, Ilia Mirkin wrote:

On Mon, Jul 13, 2015 at 9:21 PM, Brian Paul  wrote:

1. Reorganize the error checking code.
2. Lay groundwork for getting sub images.
3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and
_mesa_GetTextureImage() all in terms of get_texture_image().
---
  src/mesa/main/texgetimage.c | 539 +---
  1 file changed, 352 insertions(+), 187 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index e180a4c..b74517a 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -49,6 +49,13 @@
  #include "format_utils.h"
  #include "pixeltransfer.h"

+
+/** Some magic numbers used when getting whole texture image */
+#define WHOLE_WIDTH -123
+#define WHOLE_HEIGHT -123
+#define WHOLE_DEPTH -123


IMHO this is incredibly dirty. There simplest way to avoid this is to
move the gl_texture_image retrieval logic off into a helper and pass
the gl_texture_image in directly into get_texture_image, along with
optionally, the full image width/height/depth.


I reworked the code to do that, but it's a bit more code overall.  It's 
kind of tricky.  To pass the texture dims to the error checking 
function, you need to lookup the texture image, but that involves some 
error checking too.






+
+
  /**
   * Can the given type represent negative values?
   */
@@ -891,27 +898,59 @@ legal_getteximage_target(struct gl_context *ctx, GLenum 
target, bool dsa)


  /**
- * Do error checking for a glGetTex(ture)Image() call.
- * \return GL_TRUE if any error, GL_FALSE if no errors.
+ * Do error checking for all (non-compressed) get-texture-image functions.
+ * \param target  user-provided texture target, or 0 if called from DSA 
function
+ * \return true if any error, false if no errors.
   */
-static GLboolean
+static bool
  getteximage_error_check(struct gl_context *ctx,
-struct gl_texture_image *texImage,
+struct gl_texture_object *texObj,
  GLenum target, GLint level,
  GLenum format, GLenum type, GLsizei clientMemSize,
-GLvoid *pixels, bool dsa)
+GLvoid *pixels, const char *caller)
  {
-   const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
-   const GLuint dimensions = (target == GL_TEXTURE_3D) ? 3 : 2;
-   GLenum baseFormat;
-   const char *suffix = dsa ? "ture" : "";
+   struct gl_texture_image *texImage;
+   GLenum baseFormat, err;
+   GLint maxLevels;
+   const bool dsa = (target == 0);


So now I can pass a GL_NONE target into these glGetTex* functions and
get something back? [Perhaps that's guarded elsehwere, but this diff
is hard to grok. Perhaps there's a tree I can look at somewhere?]


Sorry, the code is in my local tree.  I could send you the complete 
texgetimage.c file if that would help.







-   assert(texImage);
-   assert(maxLevels != 0);
+   if (!texObj || texObj->Target == 0) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
+  return true;
+   }
+
+   if (dsa) {
+  target = texObj->Target;


in the non-dsa case, can target ever != texObj->Target? i.e. can you
just use this unconditionally?


I've changed this too.





+   }
+
+   if (!legal_getteximage_target(ctx, target, dsa)) {
+  _mesa_error(ctx, GL_INVALID_ENUM, "%s(target=%s)", caller,
+  _mesa_lookup_enum_by_nr(texObj->Target));
+  return true;
+   }
+
+   err = _mesa_error_check_format_and_type(ctx, format, type);
+   if (err != GL_NO_ERROR) {
+  _mesa_error(ctx, err, "%s(format/type)", caller);
+  return true;
+   }
+
+   maxLevels = _mesa_max_texture_levels(ctx, target);
 if (level < 0 || level >= maxLevels) {
_mesa_error(ctx, GL_INVALID_VALUE,
-  "glGetTex%sImage(level out of range)", suffix);
-  return GL_TRUE;
+  "%s(level = %d)", caller, level);
+  return true;
+   }
+
+   if (target == GL_TEXTURE_CUBE_MAP) {
+  target = GL_TEXTURE_CUBE_MAP_POSITIVE_X;
+   }
+
+   texImage = _mesa_select_tex_image(texObj, target, level);
+   if (!texImage) {
+  /* non-existant texture image */
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
+  return true;
 }

 /*
@@ -927,247 +966,325 @@ getteximage_error_check(struct gl_context *ctx,
 if (_mesa_is_color_format(format)
 && !_mesa_is_color_format(baseFormat)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  "%s(format mismatch)", caller);
+  return true;
 }
 else if (_mesa_is_depth_format(format)
  && !_mesa_is_depth_format(baseFormat)
  && !_mesa_is_depthstencil_format(baseFormat)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetTex%sImage(format mismatch)", suffix);
-  return GL_TRUE;
+  

[Mesa-dev] [PATCH 05/14] mesa: make _mesa_get_[compressed_]texture_image() static

2015-07-15 Thread Brian Paul
These functions are only called from teximage.c

Reviewed-by: Ilia Mirkin 
---
 src/mesa/main/texgetimage.c | 24 
 src/mesa/main/texgetimage.h |  7 ---
 2 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index d17dd52..30a7d06 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1004,12 +1004,12 @@ getteximage_error_check(struct gl_context *ctx,
  * \param dsa True when the caller is an ARB_direct_state_access function,
  *false otherwise
  */
-void
-_mesa_get_texture_image(struct gl_context *ctx,
-struct gl_texture_object *texObj,
-struct gl_texture_image *texImage, GLenum target,
-GLint level, GLenum format, GLenum type,
-GLsizei bufSize, GLvoid *pixels, bool dsa)
+static void
+get_texture_image(struct gl_context *ctx,
+  struct gl_texture_object *texObj,
+  struct gl_texture_image *texImage, GLenum target,
+  GLint level, GLenum format, GLenum type,
+  GLsizei bufSize, GLvoid *pixels, bool dsa)
 {
assert(texObj);
assert(texImage);
@@ -1103,8 +1103,8 @@ _mesa_GetnTexImageARB(GLenum target, GLint level, GLenum 
format,
if (!texImage)
   return;
 
-   _mesa_get_texture_image(ctx, texObj, texImage, target, level, format, type,
-   bufSize, pixels, false);
+   get_texture_image(ctx, texObj, texImage, target, level, format, type,
+ bufSize, pixels, false);
 }
 
 
@@ -1179,8 +1179,8 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
  texImage = texObj->Image[i][level];
  assert(texImage);
 
- _mesa_get_texture_image(ctx, texObj, texImage, texObj->Target, level,
- format, type, bufSize, pixels, true);
+ get_texture_image(ctx, texObj, texImage, texObj->Target, level,
+   format, type, bufSize, pixels, true);
 
  image_stride = _mesa_image_image_stride(&ctx->Pack, texImage->Width,
  texImage->Height, format,
@@ -1194,8 +1194,8 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
   if (!texImage)
  return;
 
-  _mesa_get_texture_image(ctx, texObj, texImage, texObj->Target, level,
-  format, type, bufSize, pixels, true);
+  get_texture_image(ctx, texObj, texImage, texObj->Target, level,
+format, type, bufSize, pixels, true);
}
 }
 
diff --git a/src/mesa/main/texgetimage.h b/src/mesa/main/texgetimage.h
index 611b1bd..5124851 100644
--- a/src/mesa/main/texgetimage.h
+++ b/src/mesa/main/texgetimage.h
@@ -49,13 +49,6 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
GLvoid *data);
 
 extern void
-_mesa_get_texture_image(struct gl_context *ctx,
-struct gl_texture_object *texObj,
-struct gl_texture_image *texImage, GLenum target,
-GLint level, GLenum format, GLenum type,
-GLsizei bufSize, GLvoid *pixels, bool dsa);
-
-extern void
 _mesa_get_compressed_texture_image( struct gl_context *ctx,
 struct gl_texture_object *texObj,
 struct gl_texture_image *texImage,
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/14] mesa: add API dispatch for GL_ARB_get_texture_sub_image

2015-07-15 Thread Brian Paul
This adds the new glGetTextureSubImage() and
glGetCompressedTextureSubImage() functions.  Also update the
dispatch sanity test program.

v2: remove stray brace, move xi:include line in gl_API.xml, fix extension
number typo, s/program/texture/ in xml file.
---
 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml | 40 
 src/mapi/glapi/gen/Makefile.am   |  1 +
 src/mapi/glapi/gen/gl_API.xml|  4 ++-
 src/mesa/main/tests/dispatch_sanity.cpp  |  5 +++
 4 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml

diff --git a/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml 
b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
new file mode 100644
index 000..14e1c20
--- /dev/null
+++ b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
@@ -0,0 +1,40 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 5b163b0..170898c 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -135,6 +135,7 @@ API_XML = \
ARB_framebuffer_object.xml \
ARB_geometry_shader4.xml \
ARB_get_program_binary.xml \
+   ARB_get_texture_sub_image.xml \
ARB_gpu_shader_fp64.xml \
ARB_gpu_shader5.xml \
ARB_instanced_arrays.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 2f33075..eb8c72a 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -8253,7 +8253,9 @@
 
 http://www.w3.org/2001/XInclude"/>
 
-
+http://www.w3.org/2001/XInclude"/>
+
+
 
 http://www.w3.org/2001/XInclude"/>
 
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 800720b..cc89acb 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -930,6 +930,11 @@ const struct function common_desktop_functions_possible[] 
= {
 
/* GL_EXT_polygon_offset_clamp */
{ "glPolygonOffsetClampEXT", 11, -1 },
+
+   /* GL_ARB_get_texture_sub_image */
+   { "glGetTextureSubImage", 12, -1 },
+   { "glGetCompressedTextureSubImage", 12, -1 },
+
{ NULL, 0, -1 }
 };
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/14] mesa: replace Driver.GetCompressedTexImage() w/ GetCompressedTexSubImage()

2015-07-15 Thread Brian Paul
For now, pass offsets of zero and width/height/depth equal to the
whole image.

Reviewed-by: Ilia Mirkin 
---
 src/mesa/drivers/common/driverfuncs.c  |  2 +-
 src/mesa/main/dd.h |  9 ++---
 src/mesa/main/texgetimage.c| 28 
 src/mesa/main/texgetimage.h|  9 ++---
 src/mesa/state_tracker/st_cb_texture.c |  2 +-
 5 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c 
b/src/mesa/drivers/common/driverfuncs.c
index ce99620..6fe42b1 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -101,7 +101,7 @@ _mesa_init_driver_functions(struct dd_function_table 
*driver)
driver->TestProxyTexImage = _mesa_test_proxy_teximage;
driver->CompressedTexImage = _mesa_store_compressed_teximage;
driver->CompressedTexSubImage = _mesa_store_compressed_texsubimage;
-   driver->GetCompressedTexImage = _mesa_GetCompressedTexImage_sw;
+   driver->GetCompressedTexSubImage = _mesa_GetCompressedTexSubImage_sw;
driver->BindTexture = NULL;
driver->NewTextureObject = _mesa_new_texture_object;
driver->DeleteTexture = _mesa_delete_texture_object;
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 12e54de..ae01770 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -335,9 +335,12 @@ struct dd_function_table {
/**
 * Called by glGetCompressedTexImage.
 */
-   void (*GetCompressedTexImage)(struct gl_context *ctx,
- struct gl_texture_image *texImage,
- GLvoid *data);
+   void (*GetCompressedTexSubImage)(struct gl_context *ctx,
+struct gl_texture_image *texImage,
+GLint xoffset, GLint yoffset,
+GLint zoffset, GLsizei width,
+GLsizei height, GLsizei depth,
+GLvoid *data);
/*@}*/
 
/**
diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 30a7d06..fb3c2c8 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -772,13 +772,16 @@ _mesa_GetTexSubImage_sw(struct gl_context *ctx,
 
 
 /**
- * This is the software fallback for Driver.GetCompressedTexImage().
+ * This is the software fallback for Driver.GetCompressedTexSubImage().
  * All error checking will have been done before this routine is called.
  */
 void
-_mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
-   struct gl_texture_image *texImage,
-   GLvoid *img)
+_mesa_GetCompressedTexSubImage_sw(struct gl_context *ctx,
+  struct gl_texture_image *texImage,
+  GLint xoffset, GLint yoffset,
+  GLint zoffset, GLsizei width,
+  GLint height, GLint depth,
+  GLvoid *img)
 {
const GLuint dimensions =
   _mesa_get_texture_dimensions(texImage->TexObject->Target);
@@ -787,10 +790,8 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
GLubyte *dest;
 
_mesa_compute_compressed_pixelstore(dimensions, texImage->TexFormat,
-   texImage->Width, texImage->Height,
-   texImage->Depth,
-   &ctx->Pack,
-   &store);
+   width, height, depth,
+   &ctx->Pack, &store);
 
if (_mesa_is_bufferobj(ctx->Pack.BufferObj)) {
   /* pack texture image into a PBO */
@@ -816,8 +817,8 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
   GLubyte *src;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, slice,
-  0, 0, texImage->Width, texImage->Height,
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + slice,
+  xoffset, yoffset, width, height,
   GL_MAP_READ_BIT, &src, &srcRowStride);
 
   if (src) {
@@ -828,7 +829,7 @@ _mesa_GetCompressedTexImage_sw(struct gl_context *ctx,
 src += srcRowStride;
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, slice);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + slice);
 
  /* Advance to next slice */
  dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice - 
store.CopyRowsPerSlice);
@@ -1323,7 +1324,10 @@ _mesa_get_compressed_texture_image(struct gl_context 
*ctx,
 
_mesa_lock_texture(ctx, texObj);
{
-  ctx->Driver.GetCompressedTexImage(ctx, texImage, pixels);
+  ctx->Driver.GetCompressedTexSubImage(ctx, texImage,
+   0, 0, 0,
+

[Mesa-dev] [PATCH 14/14] mesa: s/GLint/GLsizei/ for consistency

2015-07-15 Thread Brian Paul
---
 src/mesa/main/dd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index ae01770..87eb63e 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -328,7 +328,7 @@ struct dd_function_table {
void (*CompressedTexSubImage)(struct gl_context *ctx, GLuint dims,
  struct gl_texture_image *texImage,
  GLint xoffset, GLint yoffset, GLint zoffset,
- GLsizei width, GLint height, GLint depth,
+ GLsizei width, GLsizei height, GLsizei depth,
  GLenum format,
  GLsizei imageSize, const GLvoid *data);
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/14] mesa: replace Driver.GetTexImage with GetTexSubImage()

2015-07-15 Thread Brian Paul
The new driver hook has x/y/zoffset and width/height/depth parameters
for the new glGetTextureSubImage() function.

The meta code and gallium state tracker are updated to handle the
new parameters.

Callers to Driver.GetTexSubImage() pass in offsets=0 and sizes equal
to the whole texture size.

v2: update i965 driver code, s/GLint/GLsizei/ in GetTexSubImage hook
---
 src/mesa/drivers/common/driverfuncs.c   |  2 +-
 src/mesa/drivers/common/meta.c  | 22 ++-
 src/mesa/drivers/common/meta.h  |  8 ---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 16 --
 src/mesa/main/dd.h  | 10 +
 src/mesa/main/debug.c   |  4 +++-
 src/mesa/main/mipmap.c  |  9 +---
 src/mesa/main/texgetimage.c | 15 -
 src/mesa/main/texgetimage.h |  9 
 src/mesa/state_tracker/st_cb_texture.c  | 33 +
 10 files changed, 78 insertions(+), 50 deletions(-)

diff --git a/src/mesa/drivers/common/driverfuncs.c 
b/src/mesa/drivers/common/driverfuncs.c
index 71c1a76..ce99620 100644
--- a/src/mesa/drivers/common/driverfuncs.c
+++ b/src/mesa/drivers/common/driverfuncs.c
@@ -94,7 +94,7 @@ _mesa_init_driver_functions(struct dd_function_table *driver)
driver->QuerySamplesForFormat = _mesa_query_samples_for_format;
driver->TexImage = _mesa_store_teximage;
driver->TexSubImage = _mesa_store_texsubimage;
-   driver->GetTexImage = _mesa_meta_GetTexImage;
+   driver->GetTexSubImage = _mesa_meta_GetTexSubImage;
driver->ClearTexSubImage = _mesa_meta_ClearTexSubImage;
driver->CopyTexSubImage = _mesa_meta_CopyTexSubImage;
driver->GenerateMipmap = _mesa_meta_GenerateMipmap;
diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 34a8e4b..12045eb 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -3196,15 +3196,17 @@ decompress_texture_image(struct gl_context *ctx,
  * from core Mesa.
  */
 void
-_mesa_meta_GetTexImage(struct gl_context *ctx,
-   GLenum format, GLenum type, GLvoid *pixels,
-   struct gl_texture_image *texImage)
+_mesa_meta_GetTexSubImage(struct gl_context *ctx,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLsizei depth,
+  GLenum format, GLenum type, GLvoid *pixels,
+  struct gl_texture_image *texImage)
 {
if (_mesa_is_format_compressed(texImage->TexFormat)) {
   GLuint slice;
   bool result = true;
 
-  for (slice = 0; slice < texImage->Depth; slice++) {
+  for (slice = 0; slice < depth; slice++) {
  void *dst;
  if (texImage->TexObject->Target == GL_TEXTURE_2D_ARRAY
  || texImage->TexObject->Target == GL_TEXTURE_CUBE_MAP_ARRAY) {
@@ -3216,15 +3218,14 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
 struct gl_pixelstore_attrib packing = ctx->Pack;
 packing.SkipPixels = 0;
 packing.SkipRows = 0;
-dst = _mesa_image_address3d(&packing, pixels, texImage->Width,
-texImage->Height, format, type,
-slice, 0, 0);
+dst = _mesa_image_address3d(&packing, pixels, width, height,
+format, type, slice, 0, 0);
  }
  else {
 dst = pixels;
  }
- result = decompress_texture_image(ctx, texImage, slice, 0, 0,
-   texImage->Width, texImage->Height,
+ result = decompress_texture_image(ctx, texImage, slice,
+   xoffset, yoffset, width, height,
format, type, dst);
  if (!result)
 break;
@@ -3234,7 +3235,8 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
  return;
}
 
-   _mesa_GetTexImage_sw(ctx, format, type, pixels, texImage);
+   _mesa_GetTexSubImage_sw(ctx, xoffset, yoffset, zoffset,
+   width, height, depth, format, type, pixels, 
texImage);
 }
 
 
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index f5b74c4..fe43915 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -560,9 +560,11 @@ _mesa_meta_ClearTexSubImage(struct gl_context *ctx,
 const GLvoid *clearValue);
 
 extern void
-_mesa_meta_GetTexImage(struct gl_context *ctx,
-   GLenum format, GLenum type, GLvoid *pixels,
-   struct gl_texture_image *texImage);
+_mesa_meta_GetTexSubImage(struct gl_context *ctx,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLsizei depth,
+   

[Mesa-dev] [PATCH 10/14] mesa: add new _mesa_Get[Compressed]TextureSubImage() functions

2015-07-15 Thread Brian Paul
Simple implementations in terms of get_[compressed_]texture_image().
---
 src/mesa/main/texgetimage.c | 62 -
 src/mesa/main/texgetimage.h | 15 +++
 2 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 6ac1779..ee086d1 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -893,7 +893,8 @@ legal_getteximage_target(struct gl_context *ctx, GLenum 
target, bool dsa)
 /**
  * Wrapper for _mesa_select_tex_image() which can handle target being
  * GL_TEXTURE_CUBE_MAP_ARB in which case we use zoffset to select a cube face.
- * This can happen for glGetTextureImage (DSA function).
+ * This can happen for glGetTextureImage and glGetTextureSubImage (DSA
+ * functions).
  */
 static struct gl_texture_image *
 select_tex_image(const struct gl_texture_object *texObj, GLenum target,
@@ -1437,6 +1438,35 @@ _mesa_GetTextureImage(GLuint texture, GLint level, 
GLenum format, GLenum type,
 }
 
 
+void GLAPIENTRY
+_mesa_GetTextureSubImage(GLuint texture, GLint level,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLsizei depth,
+ GLenum format, GLenum type, GLsizei bufSize,
+ void *pixels)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   static const char *caller = "glGetTextureSubImage";
+   struct gl_texture_object *texObj =
+  _mesa_lookup_texture_err(ctx, texture, caller);
+
+   if (!texObj) {
+  return;
+   }
+
+   if (getteximage_error_check(ctx, texObj, texObj->Target, level,
+   xoffset, yoffset, zoffset, width, height, depth,
+   format, type, bufSize, pixels, caller)) {
+  return;
+   }
+
+   get_texture_image(ctx, texObj, texObj->Target, level,
+ xoffset, yoffset, zoffset, width, height, depth,
+ format, type, bufSize, pixels, caller);
+}
+
+
+
 /**
  * Compute the number of bytes which will be written when retrieving
  * a sub-region of a compressed texture.
@@ -1720,3 +1750,33 @@ _mesa_GetCompressedTextureImage(GLuint texture, GLint 
level,
 0, 0, 0, width, height, depth,
 bufSize, pixels, caller);
 }
+
+
+void APIENTRY
+_mesa_GetCompressedTextureSubImage(GLuint texture, GLint level,
+   GLint xoffset, GLint yoffset,
+   GLint zoffset, GLsizei width,
+   GLsizei height, GLsizei depth,
+   GLsizei bufSize, void *pixels)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   static const char *caller = "glGetCompressedTextureImage";
+   struct gl_texture_object *texObj;
+
+   texObj = _mesa_lookup_texture_err(ctx, texture, caller);
+   if (!texObj) {
+  return;
+   }
+
+   if (getcompressedteximage_error_check(ctx, texObj, texObj->Target, level,
+ xoffset, yoffset, zoffset,
+ width, height, depth,
+ bufSize, pixels, caller)) {
+  return;
+   }
+
+   get_compressed_texture_image(ctx, texObj, texObj->Target, level,
+xoffset, yoffset, zoffset,
+width, height, depth,
+bufSize, pixels, caller);
+}
diff --git a/src/mesa/main/texgetimage.h b/src/mesa/main/texgetimage.h
index 040495a..63c75eb 100644
--- a/src/mesa/main/texgetimage.h
+++ b/src/mesa/main/texgetimage.h
@@ -71,6 +71,14 @@ _mesa_GetTextureImage(GLuint texture, GLint level, GLenum 
format,
   GLenum type, GLsizei bufSize, GLvoid *pixels);
 
 extern void GLAPIENTRY
+_mesa_GetTextureSubImage(GLuint texture, GLint level,
+ GLint xoffset, GLint yoffset, GLint zoffset,
+ GLsizei width, GLsizei height, GLsizei depth,
+ GLenum format, GLenum type, GLsizei bufSize,
+ void *pixels);
+
+
+extern void GLAPIENTRY
 _mesa_GetCompressedTexImage(GLenum target, GLint lod, GLvoid *img);
 
 extern void GLAPIENTRY
@@ -81,4 +89,11 @@ extern void GLAPIENTRY
 _mesa_GetCompressedTextureImage(GLuint texture, GLint level, GLsizei bufSize,
 GLvoid *pixels);
 
+extern void APIENTRY
+_mesa_GetCompressedTextureSubImage(GLuint texture, GLint level,
+   GLint xoffset, GLint yoffset,
+   GLint zoffset, GLsizei width,
+   GLsizei height, GLsizei depth,
+   GLsizei bufSize, void *pixels);
+
 #endif /* TEXGETIMAGE_H */
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listi

[Mesa-dev] [PATCH 08/14] mesa: overhaul the glGetTexImage code

2015-07-15 Thread Brian Paul
1. Reorganize the error checking code.
2. Lay groundwork for getting sub images by passing image offset and
   dimensions to the error checking code.
3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and
   _mesa_GetTextureImage() all in terms of get_texture_image().

v2: pass offset/width/height/depth arguments to the error checking
function, avoid using magic width/height/depth values.
---
 src/mesa/main/texgetimage.c | 624 ++--
 1 file changed, 429 insertions(+), 195 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index e180a4c..49239e7 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -891,29 +891,299 @@ legal_getteximage_target(struct gl_context *ctx, GLenum 
target, bool dsa)
 
 
 /**
- * Do error checking for a glGetTex(ture)Image() call.
- * \return GL_TRUE if any error, GL_FALSE if no errors.
+ * Wrapper for _mesa_select_tex_image() which can handle target being
+ * GL_TEXTURE_CUBE_MAP_ARB in which case we use zoffset to select a cube face.
+ * This can happen for glGetTextureImage (DSA function).
  */
-static GLboolean
+static struct gl_texture_image *
+select_tex_image(const struct gl_texture_object *texObj, GLenum target,
+ GLint level, GLint zoffset)
+{
+   assert(level >= 0);
+   assert(level < MAX_TEXTURE_LEVELS);
+   if (target == GL_TEXTURE_CUBE_MAP) {
+  assert(zoffset >= 0);
+  assert(zoffset < 6);
+  target = GL_TEXTURE_CUBE_MAP_POSITIVE_X + zoffset;
+   }
+   return _mesa_select_tex_image(texObj, target, level);
+}
+
+
+/**
+ * Error-check the offset and size arguments to
+ * glGet[Compressed]TextureSubImage().  Also checks if the specified
+ * texture image is missing.
+ * \return true if error, false if no error.
+ */
+static bool
+dimensions_error_check(struct gl_context *ctx,
+   struct gl_texture_object *texObj,
+   GLenum target, GLint level,
+   GLint xoffset, GLint yoffset, GLint zoffset,
+   GLsizei width, GLsizei height, GLsizei depth,
+   const char *caller)
+{
+   const struct gl_texture_image *texImage;
+   int i;
+
+   if (xoffset < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(xoffset = %d)", caller, xoffset);
+  return true;
+   }
+
+   if (yoffset < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(yoffset = %d)", caller, yoffset);
+  return true;
+   }
+
+   if (zoffset < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(zoffset = %d)", caller, zoffset);
+  return true;
+   }
+
+   if (width < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(width = %d)", caller, width);
+  return true;
+   }
+
+   if (height < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(height = %d)", caller, height);
+  return true;
+   }
+
+   if (depth < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "%s(depth = %d)", caller, depth);
+  return true;
+   }
+
+   /* do special per-target checks */
+   switch (target) {
+   case GL_TEXTURE_1D:
+  if (yoffset != 0) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "%s(1D, yoffset = %d)", caller, yoffset);
+ return true;
+  }
+  if (height != 1) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "%s(1D, height = %d)", caller, height);
+ return true;
+  }
+  /* fall-through */
+   case GL_TEXTURE_1D_ARRAY:
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_RECTANGLE:
+  if (zoffset != 0) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "%s(zoffset = %d)", caller, zoffset);
+ return true;
+  }
+  if (depth != 1) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "%s(depth = %d)", caller, depth);
+ return true;
+  }
+  break;
+   case GL_TEXTURE_CUBE_MAP:
+  /* Non-array cube maps are special because we have a gl_texture_image
+   * per face.
+   */
+  if (zoffset + depth > 6) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "%s(zoffset + depth = %d)", caller, zoffset + depth);
+ return true;
+  }
+  /* check that the range of faces exist */
+  for (i = 0; i < depth; i++) {
+ GLenum face = GL_TEXTURE_CUBE_MAP_POSITIVE_X + zoffset + i;
+ if (!_mesa_select_tex_image(texObj, face, level)) {
+/* non-existant face */
+_mesa_error(ctx, GL_INVALID_OPERATION,
+"%s(missing cube face)", caller);
+return true;
+ }
+  }
+  break;
+   default:
+  ; /* nothing */
+   }
+
+   texImage = select_tex_image(texObj, target, level, zoffset);
+   if (!texImage) {
+  /* missing texture image */
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(zoffset = %d)", caller, zoffset);
+  return true;
+   }
+
+   if (xoffset + width > texImage->Width) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  

[Mesa-dev] [PATCH 02/14] meta: add offset, width, height parameters to decompress_texture_image()

2015-07-15 Thread Brian Paul
In preparation for decompressing texture sub images.

Reviewed-by: Ilia Mirkin 
---
 src/mesa/drivers/common/meta.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 54c3d5a..34a8e4b 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -2966,15 +2966,14 @@ static bool
 decompress_texture_image(struct gl_context *ctx,
  struct gl_texture_image *texImage,
  GLuint slice,
+ GLint xoffset, GLint yoffset,
+ GLsizei width, GLsizei height,
  GLenum destFormat, GLenum destType,
  GLvoid *dest)
 {
struct decompress_state *decompress = &ctx->Meta->Decompress;
struct decompress_fbo_state *decompress_fbo;
struct gl_texture_object *texObj = texImage->TexObject;
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Height;
const GLenum target = texObj->Target;
GLenum rbFormat;
GLenum faceTarget;
@@ -3093,7 +3092,7 @@ decompress_texture_image(struct gl_context *ctx,
memset(verts, 0, sizeof(verts));
 
_mesa_meta_setup_texture_coords(faceTarget, slice,
-   0, 0, width, height,
+   xoffset, yoffset, width, height,
texImage->Width, texImage->Height,
texImage->Depth,
verts[0].tex,
@@ -3224,7 +3223,8 @@ _mesa_meta_GetTexImage(struct gl_context *ctx,
  else {
 dst = pixels;
  }
- result = decompress_texture_image(ctx, texImage, slice,
+ result = decompress_texture_image(ctx, texImage, slice, 0, 0,
+   texImage->Width, texImage->Height,
format, type, dst);
  if (!result)
 break;
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/14] mesa: overhaul the glGetCompressedTexImage code

2015-07-15 Thread Brian Paul
Same idea as the previous patch.
---
 src/mesa/main/texgetimage.c | 345 ++--
 1 file changed, 203 insertions(+), 142 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 49239e7..6ac1779 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -1438,224 +1438,285 @@ _mesa_GetTextureImage(GLuint texture, GLint level, 
GLenum format, GLenum type,
 
 
 /**
- * Do error checking for a glGetCompressedTexImage() call.
- * \return GL_TRUE if any error, GL_FALSE if no errors.
+ * Compute the number of bytes which will be written when retrieving
+ * a sub-region of a compressed texture.
  */
-static GLboolean
+static GLsizei
+packed_compressed_size(GLuint dimensions, mesa_format format,
+   GLsizei width, GLsizei height, GLsizei depth,
+   const struct gl_pixelstore_attrib *packing)
+{
+   struct compressed_pixelstore st;
+   GLsizei totalBytes;
+
+   _mesa_compute_compressed_pixelstore(dimensions, format,
+   width, height, depth,
+   packing, &st);
+   totalBytes =
+  (st.CopySlices - 1) * st.TotalRowsPerSlice * st.TotalBytesPerRow +
+  st.SkipBytes +
+  (st.CopyRowsPerSlice - 1) * st.TotalBytesPerRow +
+  st.CopyBytesPerRow;
+
+   return totalBytes;
+}
+
+
+/**
+ * Do error checking for getting compressed texture images.
+ * \return true if any error, false if no errors.
+ */
+static bool
 getcompressedteximage_error_check(struct gl_context *ctx,
-  struct gl_texture_image *texImage,
-  GLenum target,
-  GLint level, GLsizei clientMemSize,
-  GLvoid *img, bool dsa)
+  struct gl_texture_object *texObj,
+  GLenum target, GLint level,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLsizei depth,
+  GLsizei bufSize, GLvoid *pixels,
+  const char *caller)
 {
-   const GLint maxLevels = _mesa_max_texture_levels(ctx, target);
-   GLuint compressedSize, dimensions;
-   const char *suffix = dsa ? "ture" : "";
+   struct gl_texture_image *texImage;
+   GLint maxLevels;
+   GLsizei totalBytes;
+   GLuint dimensions;
 
-   assert(texImage);
+   assert(texObj);
 
-   if (!legal_getteximage_target(ctx, target, dsa)) {
-  _mesa_error(ctx, GL_INVALID_ENUM,
-  "glGetCompressedTex%sImage(target=%s)", suffix,
-  _mesa_lookup_enum_by_nr(target));
-  return GL_TRUE;
+   if (texObj->Target == 0) {
+  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(invalid texture)", caller);
+  return true;
}
 
-   assert(maxLevels != 0);
+   maxLevels = _mesa_max_texture_levels(ctx, target);
if (level < 0 || level >= maxLevels) {
   _mesa_error(ctx, GL_INVALID_VALUE,
-  "glGetCompressedTex%sImage(bad level = %d)", suffix, level);
-  return GL_TRUE;
+  "%s(bad level = %d)", caller, level);
+  return true;
}
 
+   if (dimensions_error_check(ctx, texObj, target, level,
+  xoffset, yoffset, zoffset,
+  width, height, depth, caller)) {
+  return true;
+   }
+
+   texImage = select_tex_image(texObj, target, level, zoffset);
+   assert(texImage);
+
if (!_mesa_is_format_compressed(texImage->TexFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glGetCompressedTex%sImage(texture is not compressed)",
-  suffix);
-  return GL_TRUE;
+  "%s(texture is not compressed)", caller);
+  return true;
}
 
-   compressedSize = _mesa_format_image_size(texImage->TexFormat,
-texImage->Width,
-texImage->Height,
-texImage->Depth);
-
/* Check for invalid pixel storage modes */
-   dimensions = _mesa_get_texture_dimensions(texImage->TexObject->Target);
+   dimensions = _mesa_get_texture_dimensions(texObj->Target);
if (!_mesa_compressed_pixel_storage_error_check(ctx, dimensions,
-  &ctx->Pack, dsa ?
-  "glGetCompressedTextureImage":
-  "glGetCompressedTexImage")) {
-  return GL_TRUE;
+   &ctx->Pack,
+   caller)) {
+  return true;
}
 
-   if (!_mesa_is_bufferobj(ctx->Pack.BufferObj)) {
-  /* do bounds checking on writing to client memory */
-  if (clientMemSize < (GLsizei) compressedSize) {
- 

[Mesa-dev] [PATCH 07/14] mesa: 80-column wrapping in texgetimage.c

2015-07-15 Thread Brian Paul
Reviewed-by: Ilia Mirkin 
---
 src/mesa/main/texgetimage.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index fb3c2c8..e180a4c 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -449,7 +449,8 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint 
dimensions,
   rebaseSwizzle[1] = MESA_FORMAT_SWIZZLE_ZERO;
   rebaseSwizzle[2] = MESA_FORMAT_SWIZZLE_ZERO;
   rebaseSwizzle[3] = MESA_FORMAT_SWIZZLE_W;
-} else if (texImage->_BaseFormat != 
_mesa_get_format_base_format(texFormat)) {
+} else if (texImage->_BaseFormat !=
+   _mesa_get_format_base_format(texFormat)) {
   needsRebase =
  _mesa_compute_rgba2base2rgba_component_mapping(texImage->_BaseFormat,
 rebaseSwizzle);
@@ -531,8 +532,8 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint 
dimensions,
  /* If we had to rebase, we have already handled that */
  needsRebase = false;
 
- /* If we were lucky and our RGBA conversion matches the dst format, 
then
-  * we are done.
+ /* If we were lucky and our RGBA conversion matches the dst format,
+  * then we are done.
   */
  if (!need_convert)
 goto do_swap;
@@ -832,7 +833,8 @@ _mesa_GetCompressedTexSubImage_sw(struct gl_context *ctx,
  ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + slice);
 
  /* Advance to next slice */
- dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice - 
store.CopyRowsPerSlice);
+ dest += store.TotalBytesPerRow * (store.TotalRowsPerSlice -
+   store.CopyRowsPerSlice);
 
   } else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetCompresssedTexImage");
@@ -953,7 +955,8 @@ getteximage_error_check(struct gl_context *ctx,
   "glGetTex%sImage(format mismatch)", suffix);
   return GL_TRUE;
}
-   else if (!_mesa_is_stencil_format(format) && 
_mesa_is_enum_format_integer(format) !=
+   else if (!_mesa_is_stencil_format(format) &&
+_mesa_is_enum_format_integer(format) !=
 _mesa_is_format_integer(texImage->TexFormat)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "glGetTex%sImage(format mismatch)", suffix);
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/14] docs: document that GL_ARB_get_texture_sub_image is completed

2015-07-15 Thread Brian Paul
---
 docs/GL3.txt  | 2 +-
 docs/relnotes/10.7.0.html | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 33a282e..2144257 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -202,7 +202,7 @@ GL 4.5, GLSL 4.50:
   - Sampler object DONE
   - Program Pipeline objectDONE
   - Query object   DONE (will require 
changes when GL_ARB_query_buffer_object lands)
-  GL_ARB_get_texture_sub_image started (Brian Paul)
+  GL_ARB_get_texture_sub_image DONE (all drivers)
   GL_ARB_shader_texture_image_samples  not started
   GL_ARB_texture_barrier   DONE (nv50, nvc0, r600, 
radeonsi)
   GL_KHR_context_flush_control DONE (all - but needs 
GLX/EXT extension to be useful)
diff --git a/docs/relnotes/10.7.0.html b/docs/relnotes/10.7.0.html
index 42ea807..4533e78 100644
--- a/docs/relnotes/10.7.0.html
+++ b/docs/relnotes/10.7.0.html
@@ -44,6 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+GL_ARB_get_texture_sub_image for all drivers
 GL_AMD_vertex_shader_viewport_index on radeonsi
 GL_ARB_fragment_layer_viewport on radeonsi
 GL_ARB_framebuffer_no_attachments on i965
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/14] mesa: enable GL_ARB_get_texture_sub_image for all drivers

2015-07-15 Thread Brian Paul
---
 src/mesa/main/extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index d753e5f..7deb823 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -121,6 +121,7 @@ static const struct extension extension_table[] = {
{ "GL_ARB_framebuffer_object",  o(ARB_framebuffer_object),  
GL, 2005 },
{ "GL_ARB_framebuffer_sRGB",o(EXT_framebuffer_sRGB),
GL, 1998 },
{ "GL_ARB_get_program_binary",  o(dummy_true),  
GL, 2010 },
+   { "GL_ARB_get_texture_sub_image",   o(dummy_true),  
GL, 2014 },
{ "GL_ARB_gpu_shader5", o(ARB_gpu_shader5), 
GLC,2010 },
{ "GL_ARB_gpu_shader_fp64", o(ARB_gpu_shader_fp64), 
GLC,2010 },
{ "GL_ARB_half_float_pixel",o(dummy_true),  
GL, 2003 },
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/14] meta: handle subimages in _mesa_meta_setup_texture_coords()

2015-07-15 Thread Brian Paul
v2: fix depth, total_depth mix-up in meta.h, per Laura Ekstrand.
---
 src/mesa/drivers/common/meta.c | 88 +-
 src/mesa/drivers/common/meta.h |  6 +-
 src/mesa/drivers/common/meta_generate_mipmap.c |  4 +-
 3 files changed, 65 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 9a75019..54c3d5a 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -2449,30 +2449,53 @@ _mesa_meta_Bitmap(struct gl_context *ctx,
 
 /**
  * Compute the texture coordinates for the four vertices of a quad for
- * drawing a 2D texture image or slice of a cube/3D texture.
+ * drawing a 2D texture image or slice of a cube/3D texture.  The offset
+ * and width, height specify a sub-region of the 2D image.
+ *
  * \param faceTarget  GL_TEXTURE_1D/2D/3D or cube face name
  * \param slice  slice of a 1D/2D array texture or 3D texture
- * \param width  width of the texture image
- * \param height  height of the texture image
+ * \param xoffset  X position of sub texture
+ * \param yoffset  Y position of sub texture
+ * \param width  width of the sub texture image
+ * \param height  height of the sub texture image
+ * \param total_width  total width of the texture image
+ * \param total_height  total height of the texture image
+ * \param total_depth  total depth of the texture image
  * \param coords0/1/2/3  returns the computed texcoords
  */
 void
 _mesa_meta_setup_texture_coords(GLenum faceTarget,
 GLint slice,
+GLint xoffset,
+GLint yoffset,
 GLint width,
 GLint height,
-GLint depth,
+GLint total_width,
+GLint total_height,
+GLint total_depth,
 GLfloat coords0[4],
 GLfloat coords1[4],
 GLfloat coords2[4],
 GLfloat coords3[4])
 {
-   static const GLfloat st[4][2] = {
-  {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
-   };
+   float st[4][2];
GLuint i;
+   const float s0 = (float) xoffset / (float) total_width;
+   const float s1 = (float) (xoffset + width) / (float) total_width;
+   const float t0 = (float) yoffset / (float) total_height;
+   const float t1 = (float) (yoffset + height) / (float) total_height;
GLfloat r;
 
+   /* setup the reference texcoords */
+   st[0][0] = s0;
+   st[0][1] = t0;
+   st[1][0] = s1;
+   st[1][1] = t0;
+   st[2][0] = s1;
+   st[2][1] = t1;
+   st[3][0] = s0;
+   st[3][1] = t1;
+
if (faceTarget == GL_TEXTURE_CUBE_MAP_ARRAY)
   faceTarget = GL_TEXTURE_CUBE_MAP_POSITIVE_X + slice % 6;
 
@@ -2489,52 +2512,52 @@ _mesa_meta_setup_texture_coords(GLenum faceTarget,
case GL_TEXTURE_3D:
case GL_TEXTURE_2D_ARRAY:
   if (faceTarget == GL_TEXTURE_3D) {
- assert(slice < depth);
- assert(depth >= 1);
- r = (slice + 0.5f) / depth;
+ assert(slice < total_depth);
+ assert(total_depth >= 1);
+ r = (slice + 0.5f) / total_depth;
   }
   else if (faceTarget == GL_TEXTURE_2D_ARRAY)
  r = (float) slice;
   else
  r = 0.0F;
-  coords0[0] = 0.0F; /* s */
-  coords0[1] = 0.0F; /* t */
+  coords0[0] = st[0][0]; /* s */
+  coords0[1] = st[0][1]; /* t */
   coords0[2] = r; /* r */
-  coords1[0] = 1.0F;
-  coords1[1] = 0.0F;
+  coords1[0] = st[1][0];
+  coords1[1] = st[1][1];
   coords1[2] = r;
-  coords2[0] = 1.0F;
-  coords2[1] = 1.0F;
+  coords2[0] = st[2][0];
+  coords2[1] = st[2][1];
   coords2[2] = r;
-  coords3[0] = 0.0F;
-  coords3[1] = 1.0F;
+  coords3[0] = st[3][0];
+  coords3[1] = st[3][1];
   coords3[2] = r;
   break;
case GL_TEXTURE_RECTANGLE_ARB:
-  coords0[0] = 0.0F; /* s */
-  coords0[1] = 0.0F; /* t */
+  coords0[0] = (float) xoffset; /* s */
+  coords0[1] = (float) yoffset; /* t */
   coords0[2] = 0.0F; /* r */
-  coords1[0] = (float) width;
-  coords1[1] = 0.0F;
+  coords1[0] = (float) (xoffset + width);
+  coords1[1] = (float) yoffset;
   coords1[2] = 0.0F;
-  coords2[0] = (float) width;
-  coords2[1] = (float) height;
+  coords2[0] = (float) (xoffset + width);
+  coords2[1] = (float) (yoffset + height);
   coords2[2] = 0.0F;
-  coords3[0] = 0.0F;
-  coords3[1] = (float) height;
+  coords3[0] = (float) xoffset;
+  coords3[1] = (float) (yoffset + height);
   coords3[2] = 0.0F;
   break;
case GL_TEXTURE_1D_ARRAY:
-  coords0[0] = 0.0F; /* s */
+  coords0[0] = st[0][0]; /* s */
   coords0[1] = (float) slice; /* t */
   coords0[2] = 0.0F; /* r */
-  coords

[Mesa-dev] [PATCH 04/14] mesa: plumb offset/size parameters through GetTexSubImage code

2015-07-15 Thread Brian Paul
Needed for GL_ARB_get_texture_sub_image.  But at this point, the
offsets are always zero and the sizes match the whole texture image.

v2: Fixes, suggestions from Laura Ekstrand:
* Fix calls to ctx->Driver.UnmapTextureImage() to pass the correct
  slice value.
* Added comments and assertions to check zoffset+depth<=tex->Depth before
  the 'img' loops.
* Added a new zoffset==0 assert in get_tex_memcpy().

Reviewed-by: Ilia Mirkin 
---
 src/mesa/main/texgetimage.c | 137 ++--
 1 file changed, 80 insertions(+), 57 deletions(-)

diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
index 01f1a15..d17dd52 100644
--- a/src/mesa/main/texgetimage.c
+++ b/src/mesa/main/texgetimage.c
@@ -75,12 +75,11 @@ type_needs_clamping(GLenum type)
  */
 static void
 get_tex_depth(struct gl_context *ctx, GLuint dimensions,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLint depth,
   GLenum format, GLenum type, GLvoid *pixels,
   struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   GLint height = texImage->Height;
-   GLint depth = texImage->Depth;
GLint img, row;
GLfloat *depthRow = malloc(width * sizeof(GLfloat));
 
@@ -94,14 +93,15 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
   height = 1;
}
 
+   assert(zoffset + depth <= texImage->Depth);
for (img = 0; img < depth; img++) {
   GLubyte *srcMap;
   GLint srcRowStride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
-  &srcMap, &srcRowStride);
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + img,
+  xoffset, yoffset, width, height,
+  GL_MAP_READ_BIT, &srcMap, &srcRowStride);
 
   if (srcMap) {
  for (row = 0; row < height; row++) {
@@ -113,7 +113,7 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
 _mesa_pack_depth_span(ctx, width, dest, type, depthRow, 
&ctx->Pack);
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, img);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + img);
   }
   else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage");
@@ -130,26 +130,26 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions,
  */
 static void
 get_tex_depth_stencil(struct gl_context *ctx, GLuint dimensions,
+  GLint xoffset, GLint yoffset, GLint zoffset,
+  GLsizei width, GLsizei height, GLint depth,
   GLenum format, GLenum type, GLvoid *pixels,
   struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Depth;
GLint img, row;
 
assert(format == GL_DEPTH_STENCIL);
assert(type == GL_UNSIGNED_INT_24_8 ||
   type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV);
 
+   assert(zoffset + depth <= texImage->Depth);
for (img = 0; img < depth; img++) {
   GLubyte *srcMap;
   GLint rowstride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
-  &srcMap, &rowstride);
+  ctx->Driver.MapTextureImage(ctx, texImage, zoffset + img,
+  xoffset, yoffset, width, height,
+  GL_MAP_READ_BIT, &srcMap, &rowstride);
 
   if (srcMap) {
  for (row = 0; row < height; row++) {
@@ -166,7 +166,7 @@ get_tex_depth_stencil(struct gl_context *ctx, GLuint 
dimensions,
 }
  }
 
- ctx->Driver.UnmapTextureImage(ctx, texImage, img);
+ ctx->Driver.UnmapTextureImage(ctx, texImage, zoffset + img);
   }
   else {
  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGetTexImage");
@@ -180,12 +180,11 @@ get_tex_depth_stencil(struct gl_context *ctx, GLuint 
dimensions,
  */
 static void
 get_tex_stencil(struct gl_context *ctx, GLuint dimensions,
+GLint xoffset, GLint yoffset, GLint zoffset,
+GLsizei width, GLsizei height, GLint depth,
 GLenum format, GLenum type, GLvoid *pixels,
 struct gl_texture_image *texImage)
 {
-   const GLint width = texImage->Width;
-   const GLint height = texImage->Height;
-   const GLint depth = texImage->Depth;
GLint img, row;
 
assert(format == GL_STENCIL_INDEX);
@@ -195,8 +194,9 @@ get_tex_stencil(struct gl_context *ctx, GLuint dimensions,
   GLint rowstride;
 
   /* map src texture buffer */
-  ctx->Driver.MapTextureImage(ctx, texImage, img,
-  0, 0, width, height, GL_MAP_READ_BIT,
+  ctx->Driver.MapTe

Re: [Mesa-dev] [PATCH 11/14] mesa: add API dispatch for GL_ARB_get_texture_sub_image

2015-07-15 Thread Ilia Mirkin
On Wed, Jul 15, 2015 at 7:40 PM, Brian Paul  wrote:
> This adds the new glGetTextureSubImage() and
> glGetCompressedTextureSubImage() functions.  Also update the
> dispatch sanity test program.
>
> v2: remove stray brace, move xi:include line in gl_API.xml, fix extension
> number typo, s/program/texture/ in xml file.
> ---
>  src/mapi/glapi/gen/ARB_get_texture_sub_image.xml | 40 
> 
>  src/mapi/glapi/gen/Makefile.am   |  1 +
>  src/mapi/glapi/gen/gl_API.xml|  4 ++-
>  src/mesa/main/tests/dispatch_sanity.cpp  |  5 +++
>  4 files changed, 49 insertions(+), 1 deletion(-)
>  create mode 100644 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
>
> diff --git a/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml 
> b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
> new file mode 100644
> index 000..14e1c20
> --- /dev/null
> +++ b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
> @@ -0,0 +1,40 @@
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
> index 5b163b0..170898c 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -135,6 +135,7 @@ API_XML = \
> ARB_framebuffer_object.xml \
> ARB_geometry_shader4.xml \
> ARB_get_program_binary.xml \
> +   ARB_get_texture_sub_image.xml \
> ARB_gpu_shader_fp64.xml \
> ARB_gpu_shader5.xml \
> ARB_instanced_arrays.xml \
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 2f33075..eb8c72a 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -8253,7 +8253,9 @@
>
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
>
> -
> + xmlns:xi="http://www.w3.org/2001/XInclude"/>
> +
> +
>
>   xmlns:xi="http://www.w3.org/2001/XInclude"/>
>
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 800720b..cc89acb 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -930,6 +930,11 @@ const struct function 
> common_desktop_functions_possible[] = {
>
> /* GL_EXT_polygon_offset_clamp */
> { "glPolygonOffsetClampEXT", 11, -1 },
> +
> +   /* GL_ARB_get_texture_sub_image */
> +   { "glGetTextureSubImage", 12, -1 },
> +   { "glGetCompressedTextureSubImage", 12, -1 },

That should be 20, not 12 -- the ext spec requires GL 2.0, and all
your piglits do too.

With that fixed, this patch is

Reviewed-by: Ilia Mirkin 

[of course this does mean that it'll get exposed on drivers that don't
expose GL 2.0 but... meh]

> +
> { NULL, 0, -1 }
>  };
>
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/14] mesa: overhaul the glGetTexImage code

2015-07-15 Thread Ilia Mirkin
The reason why you chose to call select_tex_image() several times
through a get_tex_image flow eludes me. Why not just get it once at
the beginning and pass it around? The old APIs also used a
gl_texture_image and not a gl_texture_object. (Obviously the object is
implicitly gettable from the image when necessary.)

On Wed, Jul 15, 2015 at 7:40 PM, Brian Paul  wrote:
> 1. Reorganize the error checking code.
> 2. Lay groundwork for getting sub images by passing image offset and
>dimensions to the error checking code.
> 3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and
>_mesa_GetTextureImage() all in terms of get_texture_image().
>
> v2: pass offset/width/height/depth arguments to the error checking
> function, avoid using magic width/height/depth values.
> ---
>  src/mesa/main/texgetimage.c | 624 
> ++--
>  1 file changed, 429 insertions(+), 195 deletions(-)
>
> diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c
> index e180a4c..49239e7 100644
> --- a/src/mesa/main/texgetimage.c
> +++ b/src/mesa/main/texgetimage.c
> @@ -891,29 +891,299 @@ legal_getteximage_target(struct gl_context *ctx, 
> GLenum target, bool dsa)
>
>
>  /**
> - * Do error checking for a glGetTex(ture)Image() call.
> - * \return GL_TRUE if any error, GL_FALSE if no errors.
> + * Wrapper for _mesa_select_tex_image() which can handle target being
> + * GL_TEXTURE_CUBE_MAP_ARB in which case we use zoffset to select a cube 
> face.
> + * This can happen for glGetTextureImage (DSA function).
>   */
> -static GLboolean
> +static struct gl_texture_image *
> +select_tex_image(const struct gl_texture_object *texObj, GLenum target,
> + GLint level, GLint zoffset)
> +{
> +   assert(level >= 0);
> +   assert(level < MAX_TEXTURE_LEVELS);
> +   if (target == GL_TEXTURE_CUBE_MAP) {
> +  assert(zoffset >= 0);
> +  assert(zoffset < 6);
> +  target = GL_TEXTURE_CUBE_MAP_POSITIVE_X + zoffset;
> +   }
> +   return _mesa_select_tex_image(texObj, target, level);
> +}
> +
> +
> +/**
> + * Error-check the offset and size arguments to
> + * glGet[Compressed]TextureSubImage().  Also checks if the specified
> + * texture image is missing.
> + * \return true if error, false if no error.
> + */
> +static bool
> +dimensions_error_check(struct gl_context *ctx,
> +   struct gl_texture_object *texObj,
> +   GLenum target, GLint level,
> +   GLint xoffset, GLint yoffset, GLint zoffset,
> +   GLsizei width, GLsizei height, GLsizei depth,
> +   const char *caller)
> +{
> +   const struct gl_texture_image *texImage;
> +   int i;
> +
> +   if (xoffset < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(xoffset = %d)", caller, 
> xoffset);
> +  return true;
> +   }
> +
> +   if (yoffset < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(yoffset = %d)", caller, 
> yoffset);
> +  return true;
> +   }
> +
> +   if (zoffset < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(zoffset = %d)", caller, 
> zoffset);
> +  return true;
> +   }
> +
> +   if (width < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(width = %d)", caller, width);
> +  return true;
> +   }
> +
> +   if (height < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(height = %d)", caller, height);
> +  return true;
> +   }
> +
> +   if (depth < 0) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(depth = %d)", caller, depth);
> +  return true;
> +   }
> +
> +   /* do special per-target checks */
> +   switch (target) {
> +   case GL_TEXTURE_1D:
> +  if (yoffset != 0) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "%s(1D, yoffset = %d)", caller, yoffset);
> + return true;
> +  }
> +  if (height != 1) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "%s(1D, height = %d)", caller, height);
> + return true;
> +  }
> +  /* fall-through */
> +   case GL_TEXTURE_1D_ARRAY:
> +   case GL_TEXTURE_2D:
> +   case GL_TEXTURE_RECTANGLE:
> +  if (zoffset != 0) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "%s(zoffset = %d)", caller, zoffset);
> + return true;
> +  }
> +  if (depth != 1) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "%s(depth = %d)", caller, depth);
> + return true;
> +  }
> +  break;
> +   case GL_TEXTURE_CUBE_MAP:
> +  /* Non-array cube maps are special because we have a gl_texture_image
> +   * per face.
> +   */
> +  if (zoffset + depth > 6) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "%s(zoffset + depth = %d)", caller, zoffset + depth);
> + return true;
> +  }
> +  /* check that the range of faces exist */
> +  for (i = 0; i < depth; i++) {
> + GLenum face = GL_TEXTURE_CUBE_MAP_POSITIVE_X + zoffset + i;
> + if (!_m

Re: [Mesa-dev] [PATCH 3/4] radeon: fix some potential big endian issues

2015-07-15 Thread Roland Scheidegger
Am 13.07.2015 um 17:58 schrieb Emil Velikov:
> On 11 July 2015 at 19:39,   wrote:
> 
>> @@ -186,17 +172,24 @@ static inline void emit_cb_setup(struct r100_context 
>> *r100,
>>
>>  /* XXX others?  BE/LE? */
> Drop the BE/LE part of the comment ?
> 
>>  switch (mesa_format) {
>> +/* le */
>>  case MESA_FORMAT_B8G8R8A8_UNORM:
>>  case MESA_FORMAT_B8G8R8X8_UNORM:
>> +/* be */
>> +case MESA_FORMAT_A8R8G8B8_UNORM:
>> +case MESA_FORMAT_X8R8G8B8_UNORM:
> Something like the following might be easier for you/others X months
> down the line.
> 
> +/* The former format of each pair is for LE while latter for BE systems. 
> */
>  switch (mesa_format) {
>  case MESA_FORMAT_B8G8R8A8_UNORM:
> +case MESA_FORMAT_A8R8G8B8_UNORM:
>  case MESA_FORMAT_B8G8R8X8_UNORM:
> +case MESA_FORMAT_X8R8G8B8_UNORM:
> 
> Same suggestion applies for 4/4.
> 
> -Emil
> 

Thanks for the feedback. All fixed (and added some more comments even).

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/14] mesa: add API dispatch for GL_ARB_get_texture_sub_image

2015-07-15 Thread Roland Scheidegger
Am 16.07.2015 um 02:18 schrieb Ilia Mirkin:
> On Wed, Jul 15, 2015 at 7:40 PM, Brian Paul  wrote:
>> This adds the new glGetTextureSubImage() and
>> glGetCompressedTextureSubImage() functions.  Also update the
>> dispatch sanity test program.
>>
>> v2: remove stray brace, move xi:include line in gl_API.xml, fix extension
>> number typo, s/program/texture/ in xml file.
>> ---
>>  src/mapi/glapi/gen/ARB_get_texture_sub_image.xml | 40 
>> 
>>  src/mapi/glapi/gen/Makefile.am   |  1 +
>>  src/mapi/glapi/gen/gl_API.xml|  4 ++-
>>  src/mesa/main/tests/dispatch_sanity.cpp  |  5 +++
>>  4 files changed, 49 insertions(+), 1 deletion(-)
>>  create mode 100644 src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
>>
>> diff --git a/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml 
>> b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
>> new file mode 100644
>> index 000..14e1c20
>> --- /dev/null
>> +++ b/src/mapi/glapi/gen/ARB_get_texture_sub_image.xml
>> @@ -0,0 +1,40 @@
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
>> index 5b163b0..170898c 100644
>> --- a/src/mapi/glapi/gen/Makefile.am
>> +++ b/src/mapi/glapi/gen/Makefile.am
>> @@ -135,6 +135,7 @@ API_XML = \
>> ARB_framebuffer_object.xml \
>> ARB_geometry_shader4.xml \
>> ARB_get_program_binary.xml \
>> +   ARB_get_texture_sub_image.xml \
>> ARB_gpu_shader_fp64.xml \
>> ARB_gpu_shader5.xml \
>> ARB_instanced_arrays.xml \
>> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
>> index 2f33075..eb8c72a 100644
>> --- a/src/mapi/glapi/gen/gl_API.xml
>> +++ b/src/mapi/glapi/gen/gl_API.xml
>> @@ -8253,7 +8253,9 @@
>>
>>  > xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>
>> -
>> +> xmlns:xi="http://www.w3.org/2001/XInclude"/>
>> +
>> +
>>
>>  > xmlns:xi="http://www.w3.org/2001/XInclude"/>
>>
>> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
>> b/src/mesa/main/tests/dispatch_sanity.cpp
>> index 800720b..cc89acb 100644
>> --- a/src/mesa/main/tests/dispatch_sanity.cpp
>> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
>> @@ -930,6 +930,11 @@ const struct function 
>> common_desktop_functions_possible[] = {
>>
>> /* GL_EXT_polygon_offset_clamp */
>> { "glPolygonOffsetClampEXT", 11, -1 },
>> +
>> +   /* GL_ARB_get_texture_sub_image */
>> +   { "glGetTextureSubImage", 12, -1 },
>> +   { "glGetCompressedTextureSubImage", 12, -1 },
> 
> That should be 20, not 12 -- the ext spec requires GL 2.0, and all
> your piglits do too.
> 
> With that fixed, this patch is
> 
> Reviewed-by: Ilia Mirkin 
> 
> [of course this does mean that it'll get exposed on drivers that don't
> expose GL 2.0 but... meh]

I don't actually see why it's a hard requirement (other than at this
point probably noone is interested in new extensions for pre-2.0
drivers). There's a couple of things in the extension which require at
least 1.3 (cube maps, compressed textures), but even those are waved
away by the dependency section (together with the array textures which
would require newer GL version). But I might be missing something why it
couldn't possibly work with older versions. But anyway, seems like a
pretty academical point indeed.

Roland



>> +
>> { NULL, 0, -1 }
>>  };
>>
>> --
>> 1.9.1
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=BQIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=Z4otm2Wbhybr5Zf7Qk4YsoT9tKDRKnsMQLxYYFLrAVI&s=8KIIR6NcN6WeDJXRhiwnvBOLEFesg1by-Oo6cHxhllg&e=
>>  
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=BQIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=Z4otm2Wbhybr5Zf7Qk4YsoT9tKDRKnsMQLxYYFLrAVI&s=8KIIR6NcN6WeDJXRhiwnvBOLEFesg1by-Oo6cHxhllg&e=
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Remove MSVC implementations of copysign and isnormal.

2015-07-15 Thread Matt Turner
Non-Gallium parts of Mesa require MSVC 2013 which provides these.
---
 src/glsl/ir_constant_expression.cpp  | 14 +-
 src/glsl/nir/nir_constant_expressions.py | 14 +-
 2 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/src/glsl/ir_constant_expression.cpp 
b/src/glsl/ir_constant_expression.cpp
index 53ba3bf..5cc62ee 100644
--- a/src/glsl/ir_constant_expression.cpp
+++ b/src/glsl/ir_constant_expression.cpp
@@ -40,12 +40,7 @@
 #include "glsl_types.h"
 #include "program/hash_table.h"
 
-#if defined(_MSC_VER) && (_MSC_VER < 1800)
-static int isnormal(double x)
-{
-   return _fpclass(x) == _FPCLASS_NN || _fpclass(x) == _FPCLASS_PN;
-}
-#elif defined(__SUNPRO_CC) && !defined(isnormal)
+#if defined(__SUNPRO_CC) && !defined(isnormal)
 #include 
 static int isnormal(double x)
 {
@@ -53,13 +48,6 @@ static int isnormal(double x)
 }
 #endif
 
-#if defined(_MSC_VER)
-static double copysign(double x, double y)
-{
-   return _copysign(x, y);
-}
-#endif
-
 static float
 dot_f(ir_constant *op0, ir_constant *op1)
 {
diff --git a/src/glsl/nir/nir_constant_expressions.py 
b/src/glsl/nir/nir_constant_expressions.py
index bf82fe5..8fd9b10 100644
--- a/src/glsl/nir/nir_constant_expressions.py
+++ b/src/glsl/nir/nir_constant_expressions.py
@@ -31,12 +31,7 @@ template = """\
 #include "util/rounding.h" /* for _mesa_roundeven */
 #include "nir_constant_expressions.h"
 
-#if defined(_MSC_VER) && (_MSC_VER < 1800)
-static int isnormal(double x)
-{
-   return _fpclass(x) == _FPCLASS_NN || _fpclass(x) == _FPCLASS_PN;
-}
-#elif defined(__SUNPRO_CC)
+#if defined(__SUNPRO_CC)
 #include 
 static int isnormal(double x)
 {
@@ -44,13 +39,6 @@ static int isnormal(double x)
 }
 #endif
 
-#if defined(_MSC_VER)
-static double copysign(double x, double y)
-{
-   return _copysign(x, y);
-}
-#endif
-
 /**
  * Evaluate one component of packSnorm4x8.
  */
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] c99_math: Implement exp2f for MSVC.

2015-07-15 Thread Matt Turner
---
This will go in before my double promotion series which uses exp2f.

 include/c99_math.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/c99_math.h b/include/c99_math.h
index 7ed7cc2..0ca5a73 100644
--- a/include/c99_math.h
+++ b/include/c99_math.h
@@ -140,6 +140,12 @@ llrintf(float f)
return rounded;
 }
 
+static inline float
+exp2f(float f)
+{
+   return powf(2.0f, f);
+}
+
 #endif /* C99 */
 
 
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Register spilling issues in the NIR->vec4 backend

2015-07-15 Thread Iago Toral
On Wed, 2015-07-15 at 11:02 -0700, Connor Abbott wrote:
> On Wed, Jul 15, 2015 at 7:49 AM, Iago Toral  wrote:
> > Hi,
> >
> > when we sent the patches for the new nir->vec4 backend we mentioned that
> > we had a few dEQP tests that would fail to link because of register
> > spilling. Now that we have added GS support we see a few instances of
> > this problem popping up in a few GS piglit tests too, for example this
> > one:
> >
> > tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd.shader_test
> >
> > I have been looking into what is going on with these tests and I came to
> > the conclusion that the problem is a consequence of various factors, but
> > probably the main thing contributing to it is the way our SSA pass
> > works. That said, I am not that experienced with NIR, so it could also
> > be that my analysis is missing something and I am just arriving to wrong
> > conclusions, so I'll explain my thoughts below and hopefully someone
> > else with more NIR experience can jump in and confirm or reject my
> > analysis.
> >
> > The GS code in that test looks like this:
> >
> > for (int p = 0; p < 3; p++) {
> >color = ((index >= ins[p].m1.length() ?
> > ins[p].m2[index-ins[p].m1.length()] :
> > ins[p].m1[index]) == expect) ?
> >vec4(0.0, 1.0, 0.0, 1.0) : vec4(1.0, 0.0, 0.0, 1.0);
> >gl_Position = gl_in[p].gl_Position;
> >EmitVertex();
> > }
> >
> > One thing that is immediately contributing to the register pressure is
> > some really awful code generated because of the indirect array indexing
> > on the inputs inside the loop. This is because of the
> > lower_variable_index_to_cond_assign lowering pass called from
> > brw_shader.cpp. This pass will convert that color assignment into a
> > bunch of nested if/else statements which makes the generated GLSL IR
> > code rather large, involving plenty of temporaries too. This is only
> > made worse by the fact that loop unrolling will replicate that 3 times.
> > The result is a huge pile of GLSL IR with a few dozens of nested if/else
> > statements and temporaries that looks like [1] (that is only a fragment
> > of the GLSL IR).
> >
> > One thing that is particularly relevant in that code is that it has
> > multiple conditional assignments to the same variable
> > (dereference_array_value) as a consequence of this lowering pass.
> >
> > That much, however, is common to the NIR and non-NIR paths. The problem
> > in the NIR case is that all these assignments generate new SSA values,
> > which then become new registers in the final NIR form. This leads to NIR
> > code like [2].  In contrast, the old vec4 visitor path, is able to have
> > writes to the same variable write to the same register.
> >
> > As a result, if I print the code right before register allocation in the
> > NIR path [3] and I compare that to what we get with the old vec4 visitor
> > path at that same point [4], it is clearly visible that this difference
> > is allowing the vec4 visitor path to reduce register pressure (see how
> > in [4] we have multiple writes to vgrf5, while in [3] we always write to
> > a new vgrf every time).
> >
> > So, am I missing something or is this kind of result expected with NIR
> > programs? Is there anything in the nir->vec4 pass that we can do to fix
> > this or does this need to be fixed when going out of SSA moe inside NIR?
> >
> > Iago
> >
> > [1] http://pastebin.com/5uA8ex2S
> > [2] http://pastebin.com/pqLfvAVN
> > [3] http://pastebin.com/64nSuUH8
> > [4] http://pastebin.com/WCrdYxzt
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> Hi Iago,
> 
> Indeed, NIR does convert conditional writes to conditional selectss --
> it's a required part of the conversion to SSA, and since our HW has a
> conditional select instruction that's just as fast as doing a
> conditional move, we haven't bothered much to try and change it back
> during out-of-SSA. However, doing this shouldn't make things worse. In
> your example, vgrf9, vgrf15, and vgrf17 all have very short live
> intervals and don't interfere with vgrf11 (unless there's another use
> of them somewhere after the snippet you pasted), which means that the
> register allocator is free to allocate the destinations of all the
> selects to the same register.
> 
> What's happening, though, is that you're running into our terrible
> liveness analysis. After doing the proper liveness analysis, we figure
> out the place each register first becomes live and last becomes dead,
> and then we consider registers that have overlapping ranges to
> interfere. So we consider vgrf11 to interfere with vgrf15 and vgrf17,
> even though it really doesn't. The trouble with making it do the right
> thing is that we may actually need to extend the live ranges of
> registers when the exec masks don't match up, either because one uses
> 

Re: [Mesa-dev] [PATCH v2] glsl: avoid compiler's segfault when processing operators with void arguments

2015-07-15 Thread Samuel Iglesias Gonsálvez


On 13/07/15 13:06, Samuel Iglesias Gonsálvez wrote:
> On 11/07/15 19:38, Renaud Gaubert wrote:
>> This is done by returning an rvalue of type void in the
>> ast_function_expression::hir function instead of a void expression.
>>
>> This produces (in the case of the ternary) an hir with a call
>> to the void returning function and an assignment of a void variable
>> which will be optimized out (the assignment) during the optimization
>> pass.
>>
>> This fix results in having a valid subexpression in the many
>> different cases where the subexpressions are functions whose
>> return values are void.
>>
>> Thus preventing to dereference NULL in the following cases:
>>   * binary operator
>>   * unary operators
>>   * ternary operator
>>   * comparison operators (except equal and nequal operator)
>>
>> Equal and nequal had to be handled as a special case because
>> instead of segfaulting on a forbidden syntax it was now accepting
>> expressions with a void return value on either (or both) side of
>> the expression.
>>
>> Signed-off-by: Renaud Gaubert 
>> Reviewed-by: Gabriel Laskar 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252
>>
>> ---
>> Piglit tests were sent to the Piglit mailing list:
>>   * glsl-1.10 Adds tests on how void functions are handled
>>
>>  src/glsl/ast_function.cpp | 9 -
>>  src/glsl/ast_to_hir.cpp   | 9 -
>>  2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
>> index 92e26bf..ac32723 100644
>> --- a/src/glsl/ast_function.cpp
>> +++ b/src/glsl/ast_function.cpp
>> @@ -1785,7 +1785,14 @@ ast_function_expression::hir(exec_list *instructions,
>>   /* an error has already been emitted */
>>   value = ir_rvalue::error_value(ctx);
>>} else {
>> - value = generate_call(instructions, sig, &actual_parameters, state);
>> +value = generate_call(instructions, sig, &actual_parameters, state);
>> +if (!value) {
>> +   ir_variable *const tmp = new(ctx) 
>> ir_variable(glsl_type::void_type,
>> + "void_var",
>> + ir_var_temporary);
>> +   instructions->push_tail(tmp);
>> +   value = new(ctx) ir_dereference_variable(tmp);
>> +}
> 
> Indention. You used two spaces indention instead of three.
> 
> With that fixed,
> 
>Reviewed-by: Samuel Iglesias Gonsálvez 
> 
> If you don't have commit rights, I can fix the indention, add the R-b
> and push the patch to master (no need of v3 patch). Just tell me to do
> so (*).
> 

Pushed! :)

Sam

> Thanks for your contribution!
> 
> Sam
> 
> (*) Also for your piglit patch. If nobody has pushed your piglit patch
> yet, I can do it.
> 
>>}
>>  
>>return value;
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 8cb46be..0d0ad2a 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -1270,7 +1270,14 @@ ast_expression::do_hir(exec_list *instructions,
>> *applied to one operand that can make them match, in which
>> *case this conversion is done."
>> */
>> -  if ((!apply_implicit_conversion(op[0]->type, op[1], state)
>> +
>> +  if (op[0]->type == glsl_type::void_type || op[1]->type == 
>> glsl_type::void_type) {
>> + _mesa_glsl_error(& loc, state, "`%s':  wrong operand types: "
>> + "no operation `%1$s' exists that takes a left-hand 
>> "
>> + "operand of type 'void' or a right operand of type 
>> "
>> + "'void'", (this->oper == ast_equal) ? "==" : "!=");
>> + error_emitted = true;
>> +  } else if ((!apply_implicit_conversion(op[0]->type, op[1], state)
>> && !apply_implicit_conversion(op[1]->type, op[0], state))
>>|| (op[0]->type != op[1]->type)) {
>>   _mesa_glsl_error(& loc, state, "operands of `%s' must have the 
>> same "
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev