[Mesa-dev] [Bug 95005] Unreal engine demos segfault after shader compilation error with OpenGL 4.3

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=95005

Michel Dänzer  changed:

   What|Removed |Added

     CC|        |mesa-dev@lists.freedesktop.org

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] loader: add a libdrm case for loader_get_device_name_for_fd

2016-04-18 Thread Jonathan Gray
Any objections to this?

On Mon, Dec 21, 2015 at 04:39:55PM +1100, Jonathan Gray wrote:
> Use dev_node_from_fd() with HAVE_LIBDRM to provide an implementation
> of loader_get_device_name_for_fd() for non-linux systems that
> use libdrm but don't have udev or sysfs.
> 
> Signed-off-by: Jonathan Gray 
> ---
>  src/loader/loader.c | 26 +-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/src/loader/loader.c b/src/loader/loader.c
> index 8634f45..522fba3 100644
> --- a/src/loader/loader.c
> +++ b/src/loader/loader.c
> @@ -445,7 +445,7 @@ int loader_get_user_preferred_fd(int default_fd, int 
> *different_device)
>  }
>  #endif
>  
> -#if defined(HAVE_SYSFS)
> +#if defined(HAVE_SYSFS) || defined(HAVE_LIBDRM)
>  static int
>  dev_node_from_fd(int fd, unsigned int *maj, unsigned int *min)
>  {
> @@ -466,7 +466,9 @@ dev_node_from_fd(int fd, unsigned int *maj, unsigned int 
> *min)
>  
> return 0;
>  }
> +#endif
>  
> +#if defined(HAVE_SYSFS)
>  static int
>  sysfs_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
>  {
> @@ -671,6 +673,24 @@ sysfs_get_device_name_for_fd(int fd)
>  }
>  #endif
>  
> +#if defined(HAVE_LIBDRM)
> +static char *
> +drm_get_device_name_for_fd(int fd)
> +{
> +   unsigned int maj, min;
> +   char buf[0x40];
> +   int n;
> +
> +   if (dev_node_from_fd(fd, &maj, &min) < 0)
> +  return NULL;
> +
> +   n = snprintf(buf, sizeof(buf), DRM_DEV_NAME, DRM_DIR_NAME, min);
> +   if (n == -1 || n >= sizeof(buf))
> +  return NULL;
> +
> +   return strdup(buf);
> +}
> +#endif
>  
>  char *
>  loader_get_device_name_for_fd(int fd)
> @@ -685,6 +705,10 @@ loader_get_device_name_for_fd(int fd)
> if ((result = sysfs_get_device_name_for_fd(fd)))
>return result;
>  #endif
> +#if HAVE_LIBDRM
> +   if ((result = drm_get_device_name_for_fd(fd)))
> +  return result;
> +#endif
> return result;
>  }
>  
> -- 
> 2.6.4
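
For reference, the truncation check in drm_get_device_name_for_fd() above can be tried out in isolation. The following is only a sketch under stated assumptions: example_device_name() and EXAMPLE_DEV_NAME are hypothetical stand-ins (the real patch uses libdrm's DRM_DEV_NAME and DRM_DIR_NAME, whose values depend on the platform), and the directory is passed in as a parameter so that the truncation path can be exercised.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for libdrm's DRM_DEV_NAME format string. */
#define EXAMPLE_DEV_NAME "%s/card%u"

/*
 * Build "<dir>/card<min>" in a fixed-size buffer and return a heap copy.
 * Returns NULL on a formatting error, on truncation, or if malloc fails.
 * The caller owns the returned string and must free() it.
 */
char *
example_device_name(const char *dir, unsigned int min)
{
   char buf[0x40];
   char *out;
   int n;

   assert(dir != NULL);

   n = snprintf(buf, sizeof(buf), EXAMPLE_DEV_NAME, dir, min);

   /* snprintf returns the length it wanted to write; a value that does
    * not fit in buf means the result was truncated. */
   if (n < 0 || (size_t)n >= sizeof(buf))
      return NULL;

   out = malloc((size_t)n + 1);
   if (out == NULL)
      return NULL;

   memcpy(out, buf, (size_t)n + 1);
   return out;
}
```

Calling example_device_name("/dev/dri", 0) yields a heap-allocated "/dev/dri/card0", while a directory name longer than the 64-byte buffer makes the function return NULL rather than a silently shortened path.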
> 


Re: [Mesa-dev] [PATCH] configure.ac: fix the --disable-llvm-shared-libs build

2016-04-18 Thread Jonathan Gray
This patch is still required for master.

On Sun, Feb 28, 2016 at 02:47:03PM +1100, Jonathan Gray wrote:
> When building with --disable-llvm-shared-libs use llvm-config --libfiles
> instead of --libs so the full path to the .a files is used instead of
> -lname.
> 
> Otherwise at install time gallium_dri.a is installed instead of gallium_dri.so
> and the hardlinking of gallium_dri.so to other names fails.
> 
> Cc: "11.2 11.1" 
> Signed-off-by: Jonathan Gray 
> ---
>  configure.ac | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 6f970d7..3e2923f 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2304,13 +2304,14 @@ if test "x$MESA_LLVM" != x0; then
>  if ! $LLVM_CONFIG --libs ${LLVM_COMPONENTS} >/dev/null; then
> AC_MSG_ERROR([Calling ${LLVM_CONFIG} failed])
>  fi
> -LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
>  
>  dnl llvm-config may not give the right answer when llvm is a built as a
>  dnl single shared library, so we must work the library name out for
>  dnl ourselves.
>  dnl (See https://llvm.org/bugs/show_bug.cgi?id=6823)
>  if test "x$enable_llvm_shared_libs" = xyes; then
> +LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
> +
>  dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
>  LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
>  AS_IF([test -f "$LLVM_LIBDIR/lib$LLVM_SO_NAME.$IMP_LIB_EXT"], 
> [llvm_have_one_so=yes])
> @@ -2337,6 +2338,8 @@ if test "x$MESA_LLVM" != x0; then
> dnl already added all of these objects to LLVM_LIBS.
>  fi
>  else
> +LLVM_LIBS="`$LLVM_CONFIG --libfiles ${LLVM_COMPONENTS}`"
> +
>  AC_MSG_WARN([Building mesa with statically linked LLVM may cause 
> compilation issues])
>  dnl We need to link to llvm system libs when using static libs
>  dnl However, only llvm 3.5+ provides --system-libs
> -- 
> 2.7.0
> 


Re: [Mesa-dev] [PATCH 05/18] anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER

2016-04-18 Thread Jason Ekstrand
On Mon, Apr 18, 2016 at 5:37 PM, Ian Romanick  wrote:

> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> > ---
> >  src/intel/vulkan/genX_cmd_buffer.c | 76
> +-
> >  1 file changed, 42 insertions(+), 34 deletions(-)
> >
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> > index 932ba65..713de82 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -924,28 +924,34 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> >
> > /* Emit 3DSTATE_DEPTH_BUFFER */
> > if (has_depth) {
> > > -  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
> > - .SurfaceType = SURFTYPE_2D,
> > - .DepthWriteEnable = true,
> > - .StencilWriteEnable = has_stencil,
> > - .HierarchicalDepthBufferEnable = false,
> > - .SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
> > -
> >depth_surface.isl),
> > - .SurfacePitch = image->depth_surface.isl.row_pitch - 1,
> > - .SurfaceBaseAddress = {
> > > +  anv_batch_emit_blk(&cmd_buffer->batch,
> GENX(3DSTATE_DEPTH_BUFFER), db) {
> > + db.SurfaceType   = SURFTYPE_2D;
> > + db.DepthWriteEnable  = true;
> > + db.StencilWriteEnable= has_stencil;
> > + db.HierarchicalDepthBufferEnable = false;
> > +
> > + db.SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
> > +
> >depth_surface.isl);
> > +
> > + db.SurfaceBaseAddress = (struct anv_address) {
> >  .bo = image->bo,
> >  .offset = image->offset + image->depth_surface.offset,
> > - },
> > - .Height = fb->height - 1,
> > - .Width = fb->width - 1,
> > - .LOD = 0,
> > - .Depth = 1 - 1,
> > - .MinimumArrayElement = 0,
> > - .DepthBufferObjectControlState = GENX(MOCS),
> > + };
> > + db.DepthBufferObjectControlState = GENX(MOCS),
> > +
> > + db.SurfacePitch = image->depth_surface.isl.row_pitch -
> 1;
> > + db.Height   = fb->height - 1;
> > + db.Width= fb->width - 1;
> > + db.LOD  = 0;
> > + db.Depth= 1 - 1;
> > + db.MinimumArrayElement  = 0;
> > +
> >  #if GEN_GEN >= 8
> > - .SurfaceQPitch =
> isl_surf_get_array_pitch_el_rows(>depth_surface.isl) >> 2,
> > + db.SurfaceQPitch =
> > +isl_surf_get_array_pitch_el_rows(>depth_surface.isl)
> >> 2,
> >  #endif
> > - .RenderTargetViewExtent = 1 - 1);
> > + db.RenderTargetViewExtent = 1 - 1;
> > +  }
> > } else {
> >/* Even when no depth buffer is present, the hardware requires
> that
> > * 3DSTATE_DEPTH_BUFFER be programmed correctly. The Broadwell
> PRM says:
> > @@ -965,45 +971,47 @@ cmd_buffer_emit_depth_stencil(struct
> anv_cmd_buffer *cmd_buffer)
> > * nor stencil buffer is present.  Also, D16_UNORM is not allowed
> to
> > * be combined with a stencil buffer so we use D32_FLOAT instead.
> > */
> > > -  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
> > - .SurfaceType = SURFTYPE_2D,
> > - .SurfaceFormat = D32_FLOAT,
> > - .Width = fb->width - 1,
> > - .Height = fb->height - 1,
> > - .StencilWriteEnable = has_stencil);
> > > +  anv_batch_emit_blk(&cmd_buffer->batch,
> GENX(3DSTATE_DEPTH_BUFFER), db) {
> > + db.SurfaceType  = SURFTYPE_2D;
> > + db.SurfaceFormat= D32_FLOAT;
> > + db.Width= fb->width - 1;
> > + db.Height   = fb->height - 1;
> > + db.StencilWriteEnable   = has_stencil;
> > +  }
> > }
> >
> > /* Emit 3DSTATE_STENCIL_BUFFER */
> > if (has_stencil) {
> > > -  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER),
> > > +  anv_batch_emit_blk(&cmd_buffer->batch,
> GENX(3DSTATE_STENCIL_BUFFER), sb) {
>
> I think all the code that follows violates the style guide.  This is
> part of the reason I think we may regret this style choice.  To be
> clear... I'll go with the crowd, but I just want to be sure the crowd
> fully has 2n eyes open.
>

Right.  This is a classic example of where the hard "vertical align" rule
breaks down.  The code below is so broken up by comments and #if's that
each of the 4 assignments is in its own visual section of the code and
aligning them doesn't make as much sense.

At the end of the day, a hard "align things" rule won't really work.  It
needs to be more of a "sometimes we align things to make them look nicer"
rule which, as has been pointed out, isn't something you can make indent,
vim, or emacs do for you.  Honestly, I'm not *that* much of a fan of aligning
things and, if it were only up to me, I probably wouldn't bother.


> >  #if GEN_GEN >= 8 || GEN_IS_HASWELL
> > - .StencilBufferEnable = 

[Mesa-dev] [PATCH] i965/tiled_memcpy: don't unconditionally use __builtin_bswap32

2016-04-18 Thread Jonathan Gray
Use the defines Mesa configure sets to indicate presence of the bswap32
builtins.  This lets i965 work on OpenBSD again after the changes that
were made in 0a5d8d9af42fd77fce1492d55f958da97816961a.

Signed-off-by: Jonathan Gray 
---
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c 
b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
index a549854..c888e46 100644
--- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
+++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
@@ -64,6 +64,19 @@ ror(uint32_t n, uint32_t d)
return (n >> d) | (n << (32 - d));
 }
 
+static inline uint32_t
+bswap32(uint32_t n)
+{
+#if defined(HAVE___BUILTIN_BSWAP32)
+   return __builtin_bswap32(n);
+#else
+   return (n >> 24) |
+          ((n >> 8) & 0x0000ff00) |
+          ((n << 8) & 0x00ff0000) |
+          (n << 24);
+#endif
+}
+
 /**
  * Copy RGBA to BGRA - swap R and B.
  */
@@ -76,7 +89,7 @@ rgba8_copy(void *dst, const void *src, size_t bytes)
assert(bytes % 4 == 0);
 
while (bytes >= 4) {
-  *d = ror(__builtin_bswap32(*s), 8);
+  *d = ror(bswap32(*s), 8);
   d += 1;
   s += 1;
   bytes -= 4;
-- 
2.8.1
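
As a quick sanity check of the fallback path, the byte swap and the rotate used by rgba8_copy() can be exercised on their own. This is a minimal sketch, not the driver code: the example_ prefixed names are hypothetical, and only the portable shift-and-mask branch is shown (no __builtin_bswap32).

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by d bits; d must be in the range 1..31. */
uint32_t
example_ror(uint32_t n, uint32_t d)
{
   assert(d > 0 && d < 32);
   return (n >> d) | (n << (32 - d));
}

/* Portable 32-bit byte swap using shifts and masks only. */
uint32_t
example_bswap32(uint32_t n)
{
   return (n >> 24) |
          ((n >> 8) & 0x0000ff00u) |
          ((n << 8) & 0x00ff0000u) |
          (n << 24);
}

/*
 * Swap the R and B channels of one pixel held in a 32-bit word.
 * For a word laid out as 0xAABBGGRR this returns 0xAARRGGBB:
 * the byte swap gives 0xRRGGBBAA and the rotate moves AA back on top.
 */
uint32_t
example_swap_rb(uint32_t pixel)
{
   return example_ror(example_bswap32(pixel), 8);
}
```

With these definitions example_bswap32(0x11223344) gives 0x44332211, and example_swap_rb(0x44332211) gives 0x44112233, which is the R/B exchange the copy loop relies on.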



Re: [Mesa-dev] [PATCH 07/18] anv/cmd_buffer: Use the new emit macro for compute shader dispatch

2016-04-18 Thread Jason Ekstrand
On Mon, Apr 18, 2016 at 5:30 PM, Ian Romanick  wrote:

> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> > ---
> >  src/intel/vulkan/genX_cmd_buffer.c | 116
> -
> >  1 file changed, 64 insertions(+), 52 deletions(-)
> >
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c
> b/src/intel/vulkan/genX_cmd_buffer.c
> > index 45b009b..4a75825 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -572,17 +572,19 @@ static void
> >  emit_lrm(struct anv_batch *batch,
> >   uint32_t reg, struct anv_bo *bo, uint32_t offset)
> >  {
> > -   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_MEM),
> > -  .RegisterAddress = reg,
> > -  .MemoryAddress = { bo, offset });
> > +   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
> > +  lrm.RegisterAddress  = reg;
> > +  lrm.MemoryAddress= (struct anv_address) { bo, offset };
> > +   }
> >  }
> >
> >  static void
> >  emit_lri(struct anv_batch *batch, uint32_t reg, uint32_t imm)
>
> In patch 8 the Gen8 emit_lri helper is removed, but this one stays.  It
> seems like either both should go or both should stay.  I also thought it
> was odd that the Gen8 emit_lri was a #define while this is a function.
> Perhaps there's something else happening here that I don't see.
>

Right.  I wanted to just move it out of genX to some place shared and use
it in gen7 and gen8.  Unfortunately, I'm not sure exactly where to do
that.  The reason I kept the genX one is because it's used some 8 times
while the gen7 and gen8 versions are used once or twice and they didn't
seem worth keeping.


> >  {
> > -   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_IMM),
> > -  .RegisterOffset = reg,
> > -  .DataDWord = imm);
> > +   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_IMM), lri) {
> > +  lri.RegisterOffset   = reg;
> > +  lri.DataDWord= imm;
> > +   }
> >  }
> >
> >  void genX(CmdDrawIndirect)(
> > @@ -695,18 +697,19 @@ void genX(CmdDispatch)(
> >
> > genX(cmd_buffer_flush_compute_state)(cmd_buffer);
> >
> > > -   anv_batch_emit(&cmd_buffer->batch, GENX(GPGPU_WALKER),
> > -  .SIMDSize = prog_data->simd_size / 16,
> > -  .ThreadDepthCounterMaximum = 0,
> > -  .ThreadHeightCounterMaximum = 0,
> > -  .ThreadWidthCounterMaximum =
> pipeline->cs_thread_width_max - 1,
> > -  .ThreadGroupIDXDimension = x,
> > -  .ThreadGroupIDYDimension = y,
> > -  .ThreadGroupIDZDimension = z,
> > -  .RightExecutionMask = pipeline->cs_right_mask,
> > -  .BottomExecutionMask = 0x);
> > -
> > > -   anv_batch_emit(&cmd_buffer->batch, GENX(MEDIA_STATE_FLUSH));
> > > +   anv_batch_emit_blk(&cmd_buffer->batch, GENX(GPGPU_WALKER), ggw) {
> > +  ggw.SIMDSize = prog_data->simd_size / 16;
> > +  ggw.ThreadDepthCounterMaximum= 0;
> > +  ggw.ThreadHeightCounterMaximum   = 0;
> > +  ggw.ThreadWidthCounterMaximum= pipeline->cs_thread_width_max
> - 1;
> > +  ggw.ThreadGroupIDXDimension  = x;
> > +  ggw.ThreadGroupIDYDimension  = y;
> > +  ggw.ThreadGroupIDZDimension  = z;
> > +  ggw.RightExecutionMask   = pipeline->cs_right_mask;
> > +  ggw.BottomExecutionMask  = 0x;
> > +   }
> > +
> > > +   anv_batch_emit_blk(&cmd_buffer->batch, GENX(MEDIA_STATE_FLUSH), msf);
> >  }
> >
> >  #define GPGPU_DISPATCHDIMX 0x2500
> > @@ -758,48 +761,53 @@ void genX(CmdDispatchIndirect)(
> > emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 0);
> >
> > /* predicate = (compute_dispatch_indirect_x_size == 0); */
> > -   anv_batch_emit(batch, GENX(MI_PREDICATE),
> > -  .LoadOperation = LOAD_LOAD,
> > -  .CombineOperation = COMBINE_SET,
> > -  .CompareOperation = COMPARE_SRCS_EQUAL);
> > +   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
> > +  mip.LoadOperation= LOAD_LOAD;
> > +  mip.CombineOperation = COMBINE_SET;
> > +  mip.CompareOperation = COMPARE_SRCS_EQUAL;
> > +   }
> >
> > /* Load compute_dispatch_indirect_y_size into SRC0 */
> > emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 4);
> >
> > /* predicate |= (compute_dispatch_indirect_y_size == 0); */
> > -   anv_batch_emit(batch, GENX(MI_PREDICATE),
> > -  .LoadOperation = LOAD_LOAD,
> > -  .CombineOperation = COMBINE_OR,
> > -  .CompareOperation = COMPARE_SRCS_EQUAL);
> > +   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
> > +  mip.LoadOperation= LOAD_LOAD;
> > +  mip.CombineOperation = COMBINE_OR;
> > +  mip.CompareOperation = COMPARE_SRCS_EQUAL;
> > +   }
> >
> > /* Load compute_dispatch_indirect_z_size into SRC0 */
> > emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 8);
> >
> > /* 

[Mesa-dev] [PATCH] egl/x11: authenticate before doing chipset id ioctls

2016-04-18 Thread Jonathan Gray
For systems without udev or sysfs that use drm ioctls in the loader,
drm authentication must take place earlier or the loader will fail with
"MESA-LOADER: failed to get param for i915".

Patch from Mark Kettenis.

Cc: "11.2 11.1" 
Signed-off-by: Mark Kettenis 
Signed-off-by: Jonathan Gray 
---
 src/egl/drivers/dri2/platform_x11.c | 110 ++--
 1 file changed, 56 insertions(+), 54 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index 3ab9188..43a0918 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -542,6 +542,55 @@ dri2_x11_flush_front_buffer(__DRIdrawable * driDrawable, 
void *loaderPrivate)
 #endif
 }
 
+static int
+dri2_x11_authenticate(struct dri2_egl_display *dri2_dpy, uint32_t id)
+{
+   xcb_dri2_authenticate_reply_t *authenticate;
+   xcb_dri2_authenticate_cookie_t authenticate_cookie;
+   xcb_screen_iterator_t s;
+   xcb_screen_t *screen;
+   int ret = 0;
+
+   s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
+
+   screen = get_xcb_screen(s, dri2_dpy->screen);
+   if (!screen) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to get xcb screen");
+  return -1;
+   }
+
+   authenticate_cookie =
+  xcb_dri2_authenticate_unchecked(dri2_dpy->conn, screen->root, id);
+   authenticate =
+  xcb_dri2_authenticate_reply(dri2_dpy->conn, authenticate_cookie, NULL);
+
+   if (authenticate == NULL || !authenticate->authenticated)
+  ret = -1;
+
+   free(authenticate);
+   
+   return ret;
+}
+
+static EGLBoolean
+dri2_x11_local_authenticate(struct dri2_egl_display *dri2_dpy)
+{
+#ifdef HAVE_LIBDRM
+   drm_magic_t magic;
+
+   if (drmGetMagic(dri2_dpy->fd, &magic)) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to get drm magic");
+  return EGL_FALSE;
+   }
+   
+   if (dri2_x11_authenticate(dri2_dpy, magic) < 0) {
+  _eglLog(_EGL_WARNING, "DRI2: failed to authenticate");
+  return EGL_FALSE;
+   }
+#endif
+   return EGL_TRUE;
+}
+
 static EGLBoolean
 dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
 {
@@ -630,6 +679,13 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
   return EGL_FALSE;
}
 
+   if (!dri2_x11_local_authenticate(dri2_dpy)) {
+  close(dri2_dpy->fd);
+  free(dri2_dpy->device_name);
+  free(connect);
+  return EGL_FALSE;
+   }
+
driver_name = xcb_dri2_connect_driver_name (connect);
 
/* If Mesa knows about the appropriate driver for this fd, then trust it.
@@ -656,57 +712,6 @@ dri2_x11_connect(struct dri2_egl_display *dri2_dpy)
return EGL_TRUE;
 }
 
-static int
-dri2_x11_authenticate(_EGLDisplay *disp, uint32_t id)
-{
-   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
-   xcb_dri2_authenticate_reply_t *authenticate;
-   xcb_dri2_authenticate_cookie_t authenticate_cookie;
-   xcb_screen_iterator_t s;
-   xcb_screen_t *screen;
-   int ret = 0;
-
-   s = xcb_setup_roots_iterator(xcb_get_setup(dri2_dpy->conn));
-
-   screen = get_xcb_screen(s, dri2_dpy->screen);
-   if (!screen) {
-  _eglLog(_EGL_WARNING, "DRI2: failed to get xcb screen");
-  return -1;
-   }
-
-   authenticate_cookie =
-  xcb_dri2_authenticate_unchecked(dri2_dpy->conn, screen->root, id);
-   authenticate =
-  xcb_dri2_authenticate_reply(dri2_dpy->conn, authenticate_cookie, NULL);
-
-   if (authenticate == NULL || !authenticate->authenticated)
-  ret = -1;
-
-   free(authenticate);
-   
-   return ret;
-}
-
-static EGLBoolean
-dri2_x11_local_authenticate(_EGLDisplay *disp)
-{
-#ifdef HAVE_LIBDRM
-   struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
-   drm_magic_t magic;
-
-   if (drmGetMagic(dri2_dpy->fd, &magic)) {
-  _eglLog(_EGL_WARNING, "DRI2: failed to get drm magic");
-  return EGL_FALSE;
-   }
-   
-   if (dri2_x11_authenticate(disp, magic) < 0) {
-  _eglLog(_EGL_WARNING, "DRI2: failed to authenticate");
-  return EGL_FALSE;
-   }
-#endif
-   return EGL_TRUE;
-}
-
 static EGLBoolean
 dri2_x11_add_configs_for_visuals(struct dri2_egl_display *dri2_dpy,
  _EGLDisplay *disp, bool supports_preserved)
@@ -1390,9 +1395,6 @@ dri2_initialize_x11_dri2(_EGLDriver *drv, _EGLDisplay 
*disp)
if (!dri2_x11_connect(dri2_dpy))
   goto cleanup_conn;
 
-   if (!dri2_x11_local_authenticate(disp))
-  goto cleanup_fd;
-
if (!dri2_load_driver(disp))
   goto cleanup_fd;
 
-- 
2.8.1



Re: [Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Jason Ekstrand
On Mon, Apr 18, 2016 at 5:35 PM, Ian Romanick  wrote:

> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> > ---
> >  src/intel/vulkan/STYLE | 67
> ++
> >  1 file changed, 67 insertions(+)
> >  create mode 100644 src/intel/vulkan/STYLE
> >
> > diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
> > new file mode 100644
> > index 000..4eb8f79
> > --- /dev/null
> > +++ b/src/intel/vulkan/STYLE
> > @@ -0,0 +1,67 @@
> > +The Intel Vulkan driver typically follows the mesa coding style with a
> few
> > +exceptions.  First is that structs declared in anv_private.h should be
> > +written as follows:
> > +
> > +struct anv_foo {
> > +   int  short_type;
> > +   struct anv_long_type long_type;
> > +   void *   ptr;
> > +};
> > +
> > +Where the * for pointers goes one space after the type and the names are
> > +vertically aligned.  The names should be tabbed over the minimum amount
> > +(still a multiple of 3 spaces) such that they can all be aligned.
> > +
> > +When the anv_batch_emit function is used, it should look as follows:
> > +
> > > +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
> > +   .VertexAccessType   = SEQUENTIAL,
> > +   .PrimitiveTopologyType  = pipeline->topology,
> > +   .VertexCountPerInstance = vertexCount,
> > +   .StartVertexLocation= firstVertex,
> > +   .InstanceCount  = instanceCount,
> > +   .StartInstanceLocation  = firstInstance,
> > +   .BaseVertexLocation = 0);
> > +
> > +The batch and struct name parameters should go on the same line with the
> > +anv_batch_emit call and each named parameter on its own line.  The
> > +alignment rules are the same as for structs where all of the "=" are
> > +vertically aligned at the minimum tabstop required to do so.
> > +
> > +Eventually, we would like to move to a block-based packing mechanism.
> In
> > +this case, the above packing macro would look like this:
> > +
> > > +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
> > +   prim.VertexAccessType = SEQUENTIAL;
> > +   prim.PrimitiveTopologyType= pipeline->topology;
> > +   prim.VertexCountPerInstance   = vertexCount;
> > +   prim.StartVertexLocation  = firstVertex;
> > +   prim.InstanceCount= instanceCount;
> > +   prim.StartInstanceLocation= firstInstance;
> > +   prim.BaseVertexLocation   = 0;
> > +}
>
> When I first read the style guide, I thought, "Man, that looks nice."
> Then I read all the patches, and I thought, "Man, that looks irritating
> to type."  I don't know a way to make any text editor (or indent) do
> this for me automatically, so it seems really irritating to write.
>

It's not that bad with block-edit in vim.  Sometimes it's nicer, sometimes
it's not.


> Are we sure we want to commit to doing this?  It would be easy enough to
> change now using sed on the patches, but changing it later will be much
> more painful.
>

Mind saying what "this" is?


> Also... it seems like patch 18 should update this. :)
>
> > +
> > > +With this new block mechanism, you may end up mixing code and
> declarations.
> > > +In this case, it's up to the discretion of the programmer exactly how
> to
> > > +tab things but trying to keep with the above rules is recommended.
> > +
> > +In meta code, we use a fair number of compound initializers for more
> easily
> > +calling Vulkan functions that require struct arguments.  These should be
> > +declared as follows:
> > +
> > +anv_CreateFramebuffer(anv_device_to_handle(device),
> > +   &(VkFramebufferCreateInfo) {
> > +  .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
> > +  .attachmentCount = 1,
> > +  .pAttachments = (VkImageView[]) {
> > + anv_image_view_to_handle(dest_iview),
> > +  },
> > +  .width = dest_iview->extent.width,
> > +  .height = dest_iview->extent.height,
> > +  .layers = 1
> > +   }, _buffer->pool->alloc, );
> > +
> > +The initial arguments go on the same line as the call and the primary
> > +struct being passed in is declared on its own line tabbed over 1 tab
> with
> > > +its contents on the following lines tabbed over an additional tab.
> > +Substructures get tabbed over by additional tabs as needed.
> >
>
>


[Mesa-dev] [PATCH 2/5] anv/apply_dynamic_offsets: Provide a range on the load_uniform

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_nir_apply_dynamic_offsets.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_nir_apply_dynamic_offsets.c 
b/src/intel/vulkan/anv_nir_apply_dynamic_offsets.c
index 234855c..06fe8aa 100644
--- a/src/intel/vulkan/anv_nir_apply_dynamic_offsets.c
+++ b/src/intel/vulkan/anv_nir_apply_dynamic_offsets.c
@@ -77,11 +77,13 @@ apply_dynamic_offsets_block(nir_block *block, void 
*void_state)
   /* First, we need to generate the uniform load for the buffer offset */
   uint32_t index = state->layout->set[set].dynamic_offset_start +
set_layout->binding[binding].dynamic_offset_index;
+  uint32_t array_size = set_layout->binding[binding].array_size;
 
   nir_intrinsic_instr *offset_load =
  nir_intrinsic_instr_create(state->shader, nir_intrinsic_load_uniform);
   offset_load->num_components = 2;
-  offset_load->const_index[0] = state->indices_start + index * 8;
+  nir_intrinsic_set_base(offset_load, state->indices_start + index * 8);
+  nir_intrinsic_set_range(offset_load, array_size * 8);
   offset_load->src[0] = nir_src_for_ssa(nir_imul(b, res_intrin->src[0].ssa,
  nir_imm_int(b, 8)));
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 3/5] i965/vec4: Use nir_intrinsic_base in the load_uniform implementation

2016-04-18 Thread Jason Ekstrand
We shouldn't be reading the const_index directly
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index e199d96..b5c23c9 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -691,7 +691,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
 
   dest = get_nir_dest(instr->dest);
 
-  src = src_reg(dst_reg(UNIFORM, instr->const_index[0] / 16));
+  src = src_reg(dst_reg(UNIFORM, nir_intrinsic_base(instr) / 16));
   src.type = dest.type;
 
   /* Uniforms don't actually have to be vec4 aligned.  In the case that
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 4/5] i965/vec4: Use the correct offset for the swizzle shift in push constants

2016-04-18 Thread Jason Ekstrand
This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index b5c23c9..aa3965a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -712,7 +712,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
 
  unsigned offset = const_offset->u32[0] + shift * 4;
  src.reg_offset = offset / 16;
- shift = (nir_intrinsic_base(instr) % 16) / 4;
+ shift = (offset % 16) / 4;
  src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
 
  emit(MOV(dest, src));
-- 
2.5.0.400.gff86faf
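
To make the arithmetic behind this fix concrete: for a dword-aligned byte offset into vec4 (16-byte) registers, the register index is offset / 16 and the component is (offset % 16) / 4. The sketch below is standalone arithmetic under that assumption and does not reproduce the surrounding backend code; example_vec4_loc and example_locate are hypothetical names.

```c
#include <assert.h>

/* Hypothetical result type: which vec4 register and which component. */
struct example_vec4_loc {
   unsigned reg_offset;   /* index of the 16-byte vec4 register */
   unsigned component;    /* 0..3, i.e. x, y, z or w */
};

/* Split a dword-aligned byte offset into register index and component. */
struct example_vec4_loc
example_locate(unsigned byte_offset)
{
   struct example_vec4_loc loc;

   assert(byte_offset % 4 == 0);

   loc.reg_offset = byte_offset / 16;
   loc.component  = (byte_offset % 16) / 4;
   return loc;
}
```

For example, with a base of 16 and a constant offset of 8 the total is 24, which lands in register 1, component 2 (z). Deriving the component from the base alone would give (16 % 16) / 4 = 0 and select x instead, which is the kind of mismatch the one-line change avoids.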



[Mesa-dev] [PATCH 5/5] i965/vec4: Always split uniforms in array_access_to_pull_constants

2016-04-18 Thread Jason Ekstrand
Normally, we split uniforms at the end but in Vulkan, we bail because we
don't want pull constants.  However, we still need them split because
pack_uniforms relies on it.

I really don't like this patch, not because it doesn't work (it does), but
because now that we're using MOV_INDIRECT, uniform numbers and sizes don't
really matter anymore.  In the FS backend, uniform splitting and packing is
handled all at once (actual re-assignment of locations happens later) and
we really should do it that way in vec4 eventually as well.
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 4b12a72..507f2ee 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1632,8 +1632,10 @@ 
vec4_visitor::move_uniform_array_access_to_pull_constants()
/* The vulkan dirver doesn't support pull constants other than UBOs so
 * everything has to be pushed regardless.
 */
-   if (stage_prog_data->pull_param == NULL)
+   if (stage_prog_data->pull_param == NULL) {
+  split_uniform_registers();
   return;
+   }
 
int pull_constant_loc[this->uniforms];
memset(pull_constant_loc, -1, sizeof(pull_constant_loc));
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 1/5] anv/lower_push_constants: Stop treating scalar specially

2016-04-18 Thread Jason Ekstrand
All of the code that did something special based on vec4 vs. scalar is
bogus.  In the backend, everything is now in units of bytes and the vec4
backend can handle full std140 packing so we don't need to do anything
special anymore.
---
 src/intel/vulkan/anv_nir.h  |  2 +-
 src/intel/vulkan/anv_nir_lower_push_constants.c | 25 ++---
 src/intel/vulkan/anv_pipeline.c |  5 +
 3 files changed, 4 insertions(+), 28 deletions(-)

diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
index 606fd1c..567de6c 100644
--- a/src/intel/vulkan/anv_nir.h
+++ b/src/intel/vulkan/anv_nir.h
@@ -30,7 +30,7 @@
 extern "C" {
 #endif
 
-void anv_nir_lower_push_constants(nir_shader *shader, bool is_scalar);
+void anv_nir_lower_push_constants(nir_shader *shader);
 
 void anv_nir_apply_dynamic_offsets(struct anv_pipeline *pipeline,
nir_shader *shader,
diff --git a/src/intel/vulkan/anv_nir_lower_push_constants.c 
b/src/intel/vulkan/anv_nir_lower_push_constants.c
index 53cd3d7..44a1a3f 100644
--- a/src/intel/vulkan/anv_nir_lower_push_constants.c
+++ b/src/intel/vulkan/anv_nir_lower_push_constants.c
@@ -23,16 +23,9 @@
 
 #include "anv_nir.h"
 
-struct lower_push_constants_state {
-   nir_shader *shader;
-   bool is_scalar;
-};
-
 static bool
 lower_push_constants_block(nir_block *block, void *void_state)
 {
-   struct lower_push_constants_state *state = void_state;
-
nir_foreach_instr(block, instr) {
   if (instr->type != nir_instr_type_intrinsic)
  continue;
@@ -43,9 +36,6 @@ lower_push_constants_block(nir_block *block, void *void_state)
   if (intrin->intrinsic != nir_intrinsic_load_push_constant)
  continue;
 
-  /* This wont work for vec4 stages. */
-  assert(state->is_scalar);
-
   assert(intrin->const_index[0] % 4 == 0);
   assert(intrin->const_index[1] == 128);
 
@@ -57,21 +47,10 @@ lower_push_constants_block(nir_block *block, void 
*void_state)
 }
 
 void
-anv_nir_lower_push_constants(nir_shader *shader, bool is_scalar)
+anv_nir_lower_push_constants(nir_shader *shader)
 {
-   struct lower_push_constants_state state = {
-  .shader = shader,
-  .is_scalar = is_scalar,
-   };
-
nir_foreach_function(shader, function) {
   if (function->impl)
- nir_foreach_block(function->impl, lower_push_constants_block, &state);
+ nir_foreach_block(function->impl, lower_push_constants_block, NULL);
}
-
-   assert(shader->num_uniforms % 4 == 0);
-   if (is_scalar)
-  shader->num_uniforms /= 4;
-   else
-  shader->num_uniforms = DIV_ROUND_UP(shader->num_uniforms, 16);
 }
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index a215a37..007c58b 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -313,16 +313,13 @@ anv_pipeline_compile(struct anv_pipeline *pipeline,
  struct brw_stage_prog_data *prog_data,
  struct anv_pipeline_bind_map *map)
 {
-   const struct brw_compiler *compiler =
-  pipeline->device->instance->physicalDevice.compiler;
-
nir_shader *nir = anv_shader_compile_to_nir(pipeline->device,
module, entrypoint, stage,
spec_info);
if (nir == NULL)
   return NULL;
 
-   anv_nir_lower_push_constants(nir, compiler->scalar_stage[stage]);
+   anv_nir_lower_push_constants(nir);
 
/* Figure out the number of parameters */
prog_data->nr_params = 0;
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:56 PM, Dave Airlie wrote:
> On 19 April 2016 at 10:35, Ian Romanick  wrote:
>> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
>>> ---
>>>  src/intel/vulkan/STYLE | 67 
>>> ++
>>>  1 file changed, 67 insertions(+)
>>>  create mode 100644 src/intel/vulkan/STYLE
>>>
>>> diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
>>> new file mode 100644
>>> index 000..4eb8f79
>>> --- /dev/null
>>> +++ b/src/intel/vulkan/STYLE
>>> @@ -0,0 +1,67 @@
>>> +The Intel Vulkan driver typically follows the mesa coding style with a few
>>> +exceptions.  First is that structs declared in anv_private.h should be
>>> +written as follows:
>>> +
>>> +struct anv_foo {
>>> +   int                  short_type;
>>> +   struct anv_long_type long_type;
>>> +   void *               ptr;
>>> +};
>>> +
>>> +Where the * for pointers goes one space after the type and the names are
>>> +vertically aligned.  The names should be tabbed over the minimum amount
>>> +(still a multiple of 3 spaces) such that they can all be aligned.
>>> +
>>> +When the anv_batch_emit function is used, it should look as follows:
>>> +
>>> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
>>> +   .VertexAccessType       = SEQUENTIAL,
>>> +   .PrimitiveTopologyType  = pipeline->topology,
>>> +   .VertexCountPerInstance = vertexCount,
>>> +   .StartVertexLocation    = firstVertex,
>>> +   .InstanceCount          = instanceCount,
>>> +   .StartInstanceLocation  = firstInstance,
>>> +   .BaseVertexLocation     = 0);
>>> +
>>> +The batch and struct name parameters should go on the same line with the
>>> +anv_batch_emit call and each named parameter on its own line.  The
>>> +alignment rules are the same as for structs where all of the "=" are
>>> +vertically aligned at the minimum tabstop required to do so.
>>> +
>>> +Eventually, we would like to move to a block-based packing mechanism.  In
>>> +this case, the above packing macro would look like this:
>>> +
>>> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
>>> +   prim.VertexAccessType         = SEQUENTIAL;
>>> +   prim.PrimitiveTopologyType    = pipeline->topology;
>>> +   prim.VertexCountPerInstance   = vertexCount;
>>> +   prim.StartVertexLocation      = firstVertex;
>>> +   prim.InstanceCount            = instanceCount;
>>> +   prim.StartInstanceLocation    = firstInstance;
>>> +   prim.BaseVertexLocation       = 0;
>>> +}
>>
>> When I first read the style guide, I thought, "Man, that looks nice."
>> Then I read all the patches, and I thought, "Man, that looks irritating
>> to type."  I don't know a way to make any text editor (or indent) do
>> this for me automatically, so it seems really irritating to write.
>>
>> Are we sure we want to commit to doing this?  It would be easy enough to
>> change now using sed on the patches, but changing it later will be much
>> more painful.
>>
>> Also... it seems like patch 18 should update this. :)
> 
> Really we should just pick kernel style or Mesa style. I'm not sure we
> need to invest

Well... I tend to agree, but we're doing things here that we haven't
previously done in Mesa, and I don't know that the kernel does (much?)
of this either.

> in a new style. If your style guide isn't an indent command line and
> .emacs and .vim snippets, then it isn't near as useful. Also as I said
> on irc,

It definitely decreases the probability that people will get it
consistently right... and increases the number of vN+1 patches.

> separating the * from the name is bad documentation practice.
> 
> I can spot the bug in both lines below a lot easier in the second. The
> * goes with the name not with the type.
> 
> uint32_t *  ptr1, ptr2;
> uint32_t*ptr1, ptr2;

Right.  That's part of the reason Mesa doesn't generally declare
multiple variables in a single line like that.

I have seen similar styles that do

  int   foo;
  int  *bar;
  int   asdf;

The names still line up, but the * is someplace sensible.
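
The pitfall behind the `uint32_t *ptr1, ptr2;` example can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler for `_Generic`; the macro and function names are made up for illustration and are not Mesa code. It shows that the `*` binds to the first name only, so the second variable is a plain integer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: evaluates to 1 if x has type uint32_t *, else 0. */
#define IS_U32_POINTER(x) _Generic((x), uint32_t *: 1, default: 0)

/* Declares two variables in one line, the way the quoted example does,
 * and reports whether the requested one ended up being a pointer.
 */
static int
declares_pointer(int which)
{
   uint32_t *ptr1 = NULL, ptr2 = 0;   /* only ptr1 is a pointer */

   return which == 1 ? IS_U32_POINTER(ptr1) : IS_U32_POINTER(ptr2);
}
```

Calling `declares_pointer(1)` yields 1 and `declares_pointer(2)` yields 0, no matter where the whitespace around the `*` is placed.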

> Dave.



Re: [Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Dave Airlie
On 19 April 2016 at 10:35, Ian Romanick  wrote:
> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
>> ---
>>  src/intel/vulkan/STYLE | 67 
>> ++
>>  1 file changed, 67 insertions(+)
>>  create mode 100644 src/intel/vulkan/STYLE
>>
>> diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
>> new file mode 100644
>> index 000..4eb8f79
>> --- /dev/null
>> +++ b/src/intel/vulkan/STYLE
>> @@ -0,0 +1,67 @@
>> +The Intel Vulkan driver typically follows the mesa coding style with a few
>> +exceptions.  First is that structs declared in anv_private.h should be
>> +written as follows:
>> +
>> +struct anv_foo {
>> +   int                  short_type;
>> +   struct anv_long_type long_type;
>> +   void *               ptr;
>> +};
>> +
>> +Where the * for pointers goes one space after the type and the names are
>> +vertically aligned.  The names should be tabbed over the minimum amount
>> +(still a multiple of 3 spaces) such that they can all be aligned.
>> +
>> +When the anv_batch_emit function is used, it should look as follows:
>> +
>> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
>> +   .VertexAccessType       = SEQUENTIAL,
>> +   .PrimitiveTopologyType  = pipeline->topology,
>> +   .VertexCountPerInstance = vertexCount,
>> +   .StartVertexLocation    = firstVertex,
>> +   .InstanceCount          = instanceCount,
>> +   .StartInstanceLocation  = firstInstance,
>> +   .BaseVertexLocation     = 0);
>> +
>> +The batch and struct name parameters should go on the same line with the
>> +anv_batch_emit call and each named parameter on its own line.  The
>> +alignment rules are the same as for structs where all of the "=" are
>> +vertically aligned at the minimum tabstop required to do so.
>> +
>> +Eventually, we would like to move to a block-based packing mechanism.  In
>> +this case, the above packing macro would look like this:
>> +
>> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
>> +   prim.VertexAccessType         = SEQUENTIAL;
>> +   prim.PrimitiveTopologyType    = pipeline->topology;
>> +   prim.VertexCountPerInstance   = vertexCount;
>> +   prim.StartVertexLocation      = firstVertex;
>> +   prim.InstanceCount            = instanceCount;
>> +   prim.StartInstanceLocation    = firstInstance;
>> +   prim.BaseVertexLocation       = 0;
>> +}
>
> When I first read the style guide, I thought, "Man, that looks nice."
> Then I read all the patches, and I thought, "Man, that looks irritating
> to type."  I don't know a way to make any text editor (or indent) do
> this for me automatically, so it seems really irritating to write.
>
> Are we sure we want to commit to doing this?  It would be easy enough to
> change now using sed on the patches, but changing it later will be much
> more painful.
>
> Also... it seems like patch 18 should update this. :)

Really we should just pick kernel style or Mesa style. I'm not sure we
need to invest in a new style. If your style guide isn't an indent
command line and .emacs and .vim snippets, then it isn't near as useful.
Also as I said on irc, separating the * from the name is bad
documentation practice.

I can spot the bug in both lines below a lot easier in the second. The
* goes with the name not with the type.

uint32_t *  ptr1, ptr2;
uint32_t*ptr1, ptr2;

Dave.


Re: [Mesa-dev] [PATCH 4/4] i965: Properly handle integer types in opt_vector_float().

2016-04-18 Thread Kenneth Graunke
On Monday, April 18, 2016 11:50:53 AM PDT Matt Turner wrote:
> On Sun, Apr 17, 2016 at 11:14 PM, Kenneth Graunke wrote:
> > Previously, opt_vector_float() always interpreted MOV sources as
> > floating point, and always created a MOV with a F-type destination.
> >
> > This meant that we could mess up sequences of integer loads, such as:
> >
> >mov vgrf6.0.x:D, 0D
> >mov vgrf6.0.y:D, 1D
> >mov vgrf6.0.z:D, 2D
> >mov vgrf6.0.w:D, 3D
> >
> > Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
> >
> >mov vgrf6.0:F, [0F, 0F, 0F, 0F]
> >
> > which is clearly wrong.  We can properly handle this by converting
> > integer values to float (rather than bitcasting), and emitting a type
> > converting MOV:
> >
> >mov vgrf6.0:D, [0F, 1F, 2F, 3F]
> >
> > To do this, we first see if the integer values (converted to float)
> > are representable.  If so, we use a D-type MOV.  If not, we then try
> > the floating point values and an F-type MOV.  We make zero not impose
> > type restrictions.  This is important because 0D would imply a D-type
> > MOV, but is often used in sequences such as MOV 0D, MOV 0x3f80D,
> > where we want to use an F-type MOV.
> >
> > Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
> > recently became visible due to changes in opt_vector_float() which
> > made it optimize more cases, but it was a pre-existing bug.
> >
> > Signed-off-by: Kenneth Graunke 
> 
> Hurts a single program in shader-db... for some reason related to
> seeing a zero first?
> 
> In toki-tori-2/1, we see
> 
> -mov(8)  g18<1>.zwF  [0F, 0F, 0F, 1F]VF
> +mov(8)  g18<1>.zUD  0xUD
> +mov(8)  g18<1>.wD   1065353216D
> 
> Ignore the UD type -- the generator changes D -> UD so it can compact
> the instruction. It's actually type-D when opt_vector_float is called.

Thanks...this was a bug.  The larger code sequence was:

mov vgrf13.0.x:D, 1D
mov vgrf5.0.z:D, 0D
mov vgrf5.0.w:D, 1065353216D

When we arrive at the second instruction, inst_count > 0 and dest_type
is D (from the first instruction).  We try to avoid adding requirements
to the type by setting need_type to dest_type, but that's actually the
left over type from the previous sequence.  We then flush, reset
dest_type to F.  We then record the second instruction, setting
dest_type to need_type, which was incorrectly D.  Processing the third
instruction sees that dest_type (D) isn't equal to need_type (F), so it
flushes, which just bails since there isn't anything to batch up.

Nasty.  At any rate, I think I have a +5 -8 LOC fix.  Testing it now.
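
For reference, the bitcast-versus-convert distinction from the commit message can be reproduced outside the compiler. This is a rough sketch, assuming a 32-bit IEEE 754 `float`; the helper names are hypothetical and none of this is i965 code. Reinterpreting the bits of a small integer gives a denormal that is approximately 0.0f, while 1065353216 (0x3f800000) reinterprets as exactly 1.0f.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an integer as a float (a bitcast). */
static float
bits_as_float(int32_t bits)
{
   float f;

   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Convert the integer value to the nearest float (a type conversion). */
static float
value_as_float(int32_t value)
{
   return (float)value;
}
```

With these, `bits_as_float(3)` is a tiny positive denormal whereas `value_as_float(3)` is 3.0f, which is why treating the sources of the 0D/1D/2D/3D sequence as floats loses the values.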




Re: [Mesa-dev] [PATCH] glsl: do not raise uninitialized variable warning with gl_GlobalInvocationID/gl_LocalInvocationIndex

2016-04-18 Thread Timothy Arceri
On Mon, 2016-04-18 at 17:12 -0700, Ian Romanick wrote:
> Since the whole gl_* namespace is reserved, it might be better / easier
> to strncmp(var->name, "gl_", 3) instead.

We even have a helper is_gl_identifier()
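
A prefix check of the kind suggested in the quoted text could look roughly like the sketch below. The function name `has_gl_prefix` is invented for illustration; it is not the is_gl_identifier() helper, whose actual implementation is not shown in this thread and may differ.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when `name` starts with the reserved "gl_" prefix.
 * Illustrative only; a NULL name is treated as not matching.
 */
static bool
has_gl_prefix(const char *name)
{
   return name != NULL && strncmp(name, "gl_", 3) == 0;
}
```

Compared with two strcmp() calls against specific names, this covers every built-in in the reserved namespace at once, which is the point of the suggestion.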


> 
> On 03/31/2016 02:48 AM, Alejandro Piñeiro wrote:
> > 
> > Most GLSL built-in variables are filtered out because they have
> > the mode ir_var_system_value, but those two do not. Those two are
> > specially handled as they can be inferred from other system values,
> > and were represented as a variable initialized with a value
> > based on those system values, instead of a lowering.
> > ---
> > 
> > As a reference, this would be the patch I'm proposing.
> > 
> >  src/compiler/glsl/ast_to_hir.cpp | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > b/src/compiler/glsl/ast_to_hir.cpp
> > index a031231..aec8e6f 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -1905,7 +1905,9 @@ ast_expression::do_hir(exec_list
> > *instructions,
> >  
> >   if ((var->data.mode == ir_var_auto || var->data.mode ==
> > ir_var_shader_out)
> >   && !this->is_lhs
> > - && result->variable_referenced()->data.assigned !=
> > true) {
> > + && result->variable_referenced()->data.assigned !=
> > true
> > + && strcmp(var->name, "gl_GlobalInvocationID") != 0
> > + && strcmp(var->name, "gl_LocalInvocationIndex") != 0)
> > {
> >  _mesa_glsl_warning(, state, "`%s' used
> > uninitialized",
> > this-
> > >primary_expression.identifier);
> >   }
> > 


Re: [Mesa-dev] [PATCH 05/18] anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 76 
> +-
>  1 file changed, 42 insertions(+), 34 deletions(-)
> 
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 932ba65..713de82 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -924,28 +924,34 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
> *cmd_buffer)
>  
> /* Emit 3DSTATE_DEPTH_BUFFER */
> if (has_depth) {
> -  anv_batch_emit(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
> - .SurfaceType = SURFTYPE_2D,
> - .DepthWriteEnable = true,
> - .StencilWriteEnable = has_stencil,
> - .HierarchicalDepthBufferEnable = false,
> - .SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
> -
> >depth_surface.isl),
> - .SurfacePitch = image->depth_surface.isl.row_pitch - 1,
> - .SurfaceBaseAddress = {
> +  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER), db) 
> {
> + db.SurfaceType   = SURFTYPE_2D;
> + db.DepthWriteEnable  = true;
> + db.StencilWriteEnable= has_stencil;
> + db.HierarchicalDepthBufferEnable = false;
> +
> + db.SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
> +  
> >depth_surface.isl);
> +
> + db.SurfaceBaseAddress = (struct anv_address) {
>  .bo = image->bo,
>  .offset = image->offset + image->depth_surface.offset,
> - },
> - .Height = fb->height - 1,
> - .Width = fb->width - 1,
> - .LOD = 0,
> - .Depth = 1 - 1,
> - .MinimumArrayElement = 0,
> - .DepthBufferObjectControlState = GENX(MOCS),
> + };
> + db.DepthBufferObjectControlState = GENX(MOCS),
> +
> + db.SurfacePitch = image->depth_surface.isl.row_pitch - 1;
> + db.Height   = fb->height - 1;
> + db.Width= fb->width - 1;
> + db.LOD  = 0;
> + db.Depth= 1 - 1;
> + db.MinimumArrayElement  = 0;
> +
>  #if GEN_GEN >= 8
> - .SurfaceQPitch = 
> isl_surf_get_array_pitch_el_rows(>depth_surface.isl) >> 2,
> + db.SurfaceQPitch =
> +isl_surf_get_array_pitch_el_rows(>depth_surface.isl) >> 2,
>  #endif
> - .RenderTargetViewExtent = 1 - 1);
> + db.RenderTargetViewExtent = 1 - 1;
> +  }
> } else {
>/* Even when no depth buffer is present, the hardware requires that
> * 3DSTATE_DEPTH_BUFFER be programmed correctly. The Broadwell PRM 
> says:
> @@ -965,45 +971,47 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
> *cmd_buffer)
> * nor stencil buffer is present.  Also, D16_UNORM is not allowed to
> * be combined with a stencil buffer so we use D32_FLOAT instead.
> */
> -  anv_batch_emit(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
> - .SurfaceType = SURFTYPE_2D,
> - .SurfaceFormat = D32_FLOAT,
> - .Width = fb->width - 1,
> - .Height = fb->height - 1,
> - .StencilWriteEnable = has_stencil);
> +  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER), db) 
> {
> + db.SurfaceType  = SURFTYPE_2D;
> + db.SurfaceFormat= D32_FLOAT;
> + db.Width= fb->width - 1;
> + db.Height   = fb->height - 1;
> + db.StencilWriteEnable   = has_stencil;
> +  }
> }
>  
> /* Emit 3DSTATE_STENCIL_BUFFER */
> if (has_stencil) {
> -  anv_batch_emit(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER),
> +  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), 
> sb) {

I think all the code that follows violates the style guide.  This is
part of the reason I think we may regret this style choice.  To be
clear... I'll go with the crowd, but I just want to be sure the crowd
fully has 2n eyes open.

>  #if GEN_GEN >= 8 || GEN_IS_HASWELL
> - .StencilBufferEnable = true,
> + sb.StencilBufferEnable = true,
>  #endif
> - .StencilBufferObjectControlState = GENX(MOCS),
> + sb.StencilBufferObjectControlState = GENX(MOCS),
>  
>   /* Stencil buffers have strange pitch. The PRM says:
>*
>*The pitch must be set to 2x the value computed based on width,
>*as the stencil buffer is stored with two rows interleaved.
>*/
> - .SurfacePitch = 2 * image->stencil_surface.isl.row_pitch - 1,
> + sb.SurfacePitch = 2 * image->stencil_surface.isl.row_pitch - 1,
>  
>  #if GEN_GEN >= 8
> - .SurfaceQPitch = 
> isl_surf_get_array_pitch_el_rows(>stencil_surface.isl) >> 2,
> + sb.SurfaceQPitch = 
> 

Re: [Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/STYLE | 67 
> ++
>  1 file changed, 67 insertions(+)
>  create mode 100644 src/intel/vulkan/STYLE
> 
> diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
> new file mode 100644
> index 000..4eb8f79
> --- /dev/null
> +++ b/src/intel/vulkan/STYLE
> @@ -0,0 +1,67 @@
> +The Intel Vulkan driver typically follows the mesa coding style with a few
> +exceptions.  First is that structs declared in anv_private.h should be
> +written as follows:
> +
> +struct anv_foo {
> +   int                  short_type;
> +   struct anv_long_type long_type;
> +   void *               ptr;
> +};
> +
> +Where the * for pointers goes one space after the type and the names are
> +vertically aligned.  The names should be tabbed over the minimum amount
> +(still a multiple of 3 spaces) such that they can all be aligned.
> +
> +When the anv_batch_emit function is used, it should look as follows:
> +
> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
> +   .VertexAccessType       = SEQUENTIAL,
> +   .PrimitiveTopologyType  = pipeline->topology,
> +   .VertexCountPerInstance = vertexCount,
> +   .StartVertexLocation    = firstVertex,
> +   .InstanceCount          = instanceCount,
> +   .StartInstanceLocation  = firstInstance,
> +   .BaseVertexLocation     = 0);
> +
> +The batch and struct name parameters should go on the same line with the
> +anv_batch_emit call and each named parameter on its own line.  The
> +alignment rules are the same as for structs where all of the "=" are
> +vertically aligned at the minimum tabstop required to do so.
> +
> +Eventually, we would like to move to a block-based packing mechanism.  In
> +this case, the above packing macro would look like this:
> +
> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
> +   prim.VertexAccessType         = SEQUENTIAL;
> +   prim.PrimitiveTopologyType    = pipeline->topology;
> +   prim.VertexCountPerInstance   = vertexCount;
> +   prim.StartVertexLocation      = firstVertex;
> +   prim.InstanceCount            = instanceCount;
> +   prim.StartInstanceLocation    = firstInstance;
> +   prim.BaseVertexLocation       = 0;
> +}

When I first read the style guide, I thought, "Man, that looks nice."
Then I read all the patches, and I thought, "Man, that looks irritating
to type."  I don't know a way to make any text editor (or indent) do
this for me automatically, so it seems really irritating to write.

Are we sure we want to commit to doing this?  It would be easy enough to
change now using sed on the patches, but changing it later will be much
more painful.

Also... it seems like patch 18 should update this. :)

> +
> +With this new block mechansim, you may end up mixing code and declarations.
> +In this case, it's up to the discression of the programmer exactly how to
> +tab things but trying to keep with the above rules is recommended.
> +
> +In meta code, we use a fair number of compound initializers for more easily
> +calling Vulkan functions that require struct arguments.  These should be
> +declared as follows:
> +
> +anv_CreateFramebuffer(anv_device_to_handle(device),
> +   &(VkFramebufferCreateInfo) {
> +  .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
> +  .attachmentCount = 1,
> +  .pAttachments = (VkImageView[]) {
> + anv_image_view_to_handle(dest_iview),
> +  },
> +  .width = dest_iview->extent.width,
> +  .height = dest_iview->extent.height,
> +  .layers = 1
> +   }, _buffer->pool->alloc, );
> +
> +The initial arguments go on the same line as the call and the primary
> +struct being passed in is declared on its own line tabbed over 1 tab with
> +it's contents on the following lines tabbed over an additional tab.
> +Substructures get tabbed over by additional tabs as needed.
> 



Re: [Mesa-dev] [PATCH 07/18] anv/cmd_buffer: Use the new emit macro for compute shader dispatch

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 116 
> -
>  1 file changed, 64 insertions(+), 52 deletions(-)
> 
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 45b009b..4a75825 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -572,17 +572,19 @@ static void
>  emit_lrm(struct anv_batch *batch,
>   uint32_t reg, struct anv_bo *bo, uint32_t offset)
>  {
> -   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_MEM),
> -  .RegisterAddress = reg,
> -  .MemoryAddress = { bo, offset });
> +   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
> +  lrm.RegisterAddress  = reg;
> +  lrm.MemoryAddress= (struct anv_address) { bo, offset };
> +   }
>  }
>  
>  static void
>  emit_lri(struct anv_batch *batch, uint32_t reg, uint32_t imm)

In patch 8 the Gen8 emit_lri helper is removed, but this one stays.  It
seems like either both should go or both should stay.  I also thought it
was odd that the Gen8 emit_lri was a #define while this is a function.
Perhaps there's something else happening here that I don't see.

>  {
> -   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_IMM),
> -  .RegisterOffset = reg,
> -  .DataDWord = imm);
> +   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_IMM), lri) {
> +  lri.RegisterOffset   = reg;
> +  lri.DataDWord= imm;
> +   }
>  }
>  
>  void genX(CmdDrawIndirect)(
> @@ -695,18 +697,19 @@ void genX(CmdDispatch)(
>  
> genX(cmd_buffer_flush_compute_state)(cmd_buffer);
>  
> -   anv_batch_emit(_buffer->batch, GENX(GPGPU_WALKER),
> -  .SIMDSize = prog_data->simd_size / 16,
> -  .ThreadDepthCounterMaximum = 0,
> -  .ThreadHeightCounterMaximum = 0,
> -  .ThreadWidthCounterMaximum = pipeline->cs_thread_width_max 
> - 1,
> -  .ThreadGroupIDXDimension = x,
> -  .ThreadGroupIDYDimension = y,
> -  .ThreadGroupIDZDimension = z,
> -  .RightExecutionMask = pipeline->cs_right_mask,
> -  .BottomExecutionMask = 0x);
> -
> -   anv_batch_emit(_buffer->batch, GENX(MEDIA_STATE_FLUSH));
> +   anv_batch_emit_blk(_buffer->batch, GENX(GPGPU_WALKER), ggw) {
> +  ggw.SIMDSize = prog_data->simd_size / 16;
> +  ggw.ThreadDepthCounterMaximum= 0;
> +  ggw.ThreadHeightCounterMaximum   = 0;
> +  ggw.ThreadWidthCounterMaximum= pipeline->cs_thread_width_max - 1;
> +  ggw.ThreadGroupIDXDimension  = x;
> +  ggw.ThreadGroupIDYDimension  = y;
> +  ggw.ThreadGroupIDZDimension  = z;
> +  ggw.RightExecutionMask   = pipeline->cs_right_mask;
> +  ggw.BottomExecutionMask  = 0x;
> +   }
> +
> +   anv_batch_emit_blk(_buffer->batch, GENX(MEDIA_STATE_FLUSH), msf);
>  }
>  
>  #define GPGPU_DISPATCHDIMX 0x2500
> @@ -758,48 +761,53 @@ void genX(CmdDispatchIndirect)(
> emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 0);
>  
> /* predicate = (compute_dispatch_indirect_x_size == 0); */
> -   anv_batch_emit(batch, GENX(MI_PREDICATE),
> -  .LoadOperation = LOAD_LOAD,
> -  .CombineOperation = COMBINE_SET,
> -  .CompareOperation = COMPARE_SRCS_EQUAL);
> +   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
> +  mip.LoadOperation= LOAD_LOAD;
> +  mip.CombineOperation = COMBINE_SET;
> +  mip.CompareOperation = COMPARE_SRCS_EQUAL;
> +   }
>  
> /* Load compute_dispatch_indirect_y_size into SRC0 */
> emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 4);
>  
> /* predicate |= (compute_dispatch_indirect_y_size == 0); */
> -   anv_batch_emit(batch, GENX(MI_PREDICATE),
> -  .LoadOperation = LOAD_LOAD,
> -  .CombineOperation = COMBINE_OR,
> -  .CompareOperation = COMPARE_SRCS_EQUAL);
> +   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
> +  mip.LoadOperation= LOAD_LOAD;
> +  mip.CombineOperation = COMBINE_OR;
> +  mip.CompareOperation = COMPARE_SRCS_EQUAL;
> +   }
>  
> /* Load compute_dispatch_indirect_z_size into SRC0 */
> emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 8);
>  
> /* predicate |= (compute_dispatch_indirect_z_size == 0); */
> -   anv_batch_emit(batch, GENX(MI_PREDICATE),
> -  .LoadOperation = LOAD_LOAD,
> -  .CombineOperation = COMBINE_OR,
> -  .CompareOperation = COMPARE_SRCS_EQUAL);
> +   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
> +  mip.LoadOperation= LOAD_LOAD;
> +  mip.CombineOperation = COMBINE_OR;
> +  mip.CompareOperation = COMPARE_SRCS_EQUAL;
> +   }
>  
> /* predicate = !predicate; */
>  #define COMPARE_FALSE   1
> -   

Re: [Mesa-dev] [PATCH 02/18] anv: Add a new block-based batch emit macro

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:19 PM, Ian Romanick wrote:
> On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
>> This new macro uses a for loop to create an actual code block in which to
>> place the macro setup code.  One advantage of this is that you syntatically
>  syntactically
> 
>> use braces instead of parentheses.  Another is that the code in the block
>> doesn't even get executed if anv_batch_emit_dwords fails.
> 
> Is the old anv_batch_emit eventually removed?

Yes, you fool.  That's patch 17.
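
The for-loop trick described in the quoted commit message can be illustrated in isolation. What follows is a simplified sketch in standard C, not the anv macro: `demo_cmd`, `demo_batch`, `demo_batch_reserve` and `demo_batch_emit` are all hypothetical stand-ins, and a plain struct copy replaces the GNU statement expression and the generated pack function. The loop body runs once when the reservation succeeds and not at all when it fails.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-field command; stands in for a generated GENX() struct. */
struct demo_cmd {
   uint32_t opcode;
   uint32_t value;
};

/* Hypothetical fixed-size batch; stands in for struct anv_batch. */
struct demo_batch {
   struct demo_cmd slots[4];
   size_t          used;
};

/* Reserve one slot, or return NULL when the batch is full. */
static struct demo_cmd *
demo_batch_reserve(struct demo_batch *batch)
{
   if (batch->used == 4)
      return NULL;
   return &batch->slots[batch->used++];
}

/* Declare `name`, run the block that follows exactly once, then store
 * `name` into the reserved slot.  A failed reservation skips the block.
 */
#define demo_batch_emit(batch, op, name)                 \
   for (struct demo_cmd name = { .opcode = (op) },       \
        *_dst = demo_batch_reserve(batch);               \
        _dst != NULL;                                    \
        *_dst = name, _dst = NULL)

/* Example caller: fills in one field inside the macro's block. */
static void
demo_emit_value(struct demo_batch *batch, uint32_t value)
{
   demo_batch_emit(batch, 0x42, cmd) {
      cmd.value = value;
   }
}
```

One consequence of this shape is that a `break` inside the block leaves the loop before the store in the third clause runs, so callers of a macro like this would need to avoid early exits from the block.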

>> ---
>>  src/intel/vulkan/anv_private.h | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
>> index ae2e08d..d59b7ed 100644
>> --- a/src/intel/vulkan/anv_private.h
>> +++ b/src/intel/vulkan/anv_private.h
>> @@ -861,6 +861,15 @@ __gen_combine_address(struct anv_batch *batch, void 
>> *location,
>>VG(VALGRIND_CHECK_MEM_IS_DEFINED(dw, ARRAY_SIZE(dwords0) * 4));\
>> } while (0)
>>  
>> +#define anv_batch_emit_blk(batch, cmd, name)\
>> +   for (struct cmd name = { __anv_cmd_header(cmd) },\
>> +*_dst = anv_batch_emit_dwords(batch, __anv_cmd_length(cmd));\
>> +__builtin_expect(_dst != NULL, 1);  \
>> +({ __anv_cmd_pack(cmd)(batch, _dst, ); \
>> +   VG(VALGRIND_CHECK_MEM_IS_DEFINED(_dst, __anv_cmd_length(cmd) * 
>> 4)); \
>> +   _dst = NULL; \
>> + }))
>> +
>>  #define anv_state_pool_emit(pool, cmd, align, ...) ({   \
>>const uint32_t __size = __anv_cmd_length(cmd) * 4;\
>>struct anv_state __state =\
>>
> 


[Mesa-dev] [PATCH] glsl: add forgotten textureOffset function for sampler2DArrayShadow

2016-04-18 Thread sroland
From: Roland Scheidegger 

This was part of EXT_gpu_shader4 - as such it should have been supported
by glsl 130.
It was however forgotten, and not added until glsl 430 - with the wrong
syntax no less (glsl 430 mentions it was overlooked).
glsl 440 (but revision 8 only) fixed this finally for good.
At least nvidia supports this with just glsl version 1.30 as well
(the spec doesn't explicitly say it should be supported retroactively),
so just add this to the other glsl 130 textureOffset functions.

Passes a revised piglit tex-miplevel-selection test (2DArrayShadow
textureOffset -auto) with llvmpipe.

v2: fix up comment (by Ian), add testing to commit message.
---
 src/compiler/glsl/builtin_functions.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 1f6fb22..25d914d 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -1756,6 +1756,13 @@ builtin_builder::create_builtins()
 _texture(ir_tex, v130, glsl_type::uvec4_type, 
glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
 
 _texture(ir_tex, v130, glsl_type::float_type, 
glsl_type::sampler1DArrayShadow_type, glsl_type::vec3_type, TEX_OFFSET),
+/* The next one was forgotten in GLSL 1.30 spec. It's from
+ * EXT_gpu_shader4 originally. It was added in 4.30 with the
+ * wrong syntax. This was corrected in 4.40. 4.30 indicates
+ * that it was intended to be included previously, so allow it
+ * in 1.30.
+ */
+_texture(ir_tex, v130, glsl_type::float_type, 
glsl_type::sampler2DArrayShadow_type, glsl_type::vec4_type, TEX_OFFSET),
 
 _texture(ir_txb, v130_fs_only, glsl_type::vec4_type,  
glsl_type::sampler1D_type,  glsl_type::float_type, TEX_OFFSET),
 _texture(ir_txb, v130_fs_only, glsl_type::ivec4_type, 
glsl_type::isampler1D_type, glsl_type::float_type, TEX_OFFSET),
-- 
2.1.4



Re: [Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Matt Turner
On Mon, Apr 18, 2016 at 5:10 PM, Jason Ekstrand  wrote:
> ---
>  src/intel/vulkan/STYLE | 67 
> ++
>  1 file changed, 67 insertions(+)
>  create mode 100644 src/intel/vulkan/STYLE
>
> diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
> new file mode 100644
> index 000..4eb8f79
> --- /dev/null
> +++ b/src/intel/vulkan/STYLE
> @@ -0,0 +1,67 @@
> +The Intel Vulkan driver typically follows the mesa coding style with a few
> +exceptions.  First is that structs declared in anv_private.h should be
> +written as follows:
> +
> +struct anv_foo {
> +   int                  short_type;
> +   struct anv_long_type long_type;
> +   void *               ptr;
> +};
> +
> +Where the * for pointers goes one space after the type and the names are
> +vertically aligned.  The names should be tabbed over the minimum amount
> +(still a multiple of 3 spaces) such that they can all be aligned.
> +
> +When the anv_batch_emit function is used, it should look as follows:
> +
> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
> +   .VertexAccessType       = SEQUENTIAL,
> +   .PrimitiveTopologyType  = pipeline->topology,
> +   .VertexCountPerInstance = vertexCount,
> +   .StartVertexLocation    = firstVertex,
> +   .InstanceCount          = instanceCount,
> +   .StartInstanceLocation  = firstInstance,
> +   .BaseVertexLocation     = 0);
> +
> +The batch and struct name parameters should go on the same line with the
> +anv_batch_emit call and each named parameter on its own line.  The
> +alignment rules are the same as for structs where all of the "=" are
> +vertically aligned at the minimum tabstop required to do so.
> +
> +Eventually, we would like to move to a block-based packing mechanism.  In
> +this case, the above packing macro would look like this:
> +
> +anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
> +   prim.VertexAccessType         = SEQUENTIAL;
> +   prim.PrimitiveTopologyType    = pipeline->topology;
> +   prim.VertexCountPerInstance   = vertexCount;
> +   prim.StartVertexLocation      = firstVertex;
> +   prim.InstanceCount            = instanceCount;
> +   prim.StartInstanceLocation    = firstInstance;
> +   prim.BaseVertexLocation       = 0;
> +}
> +
> +With this new block mechansim, you may end up mixing code and declarations.
> +In this case, it's up to the discression of the programmer exactly how to
> +tab things but trying to keep with the above rules is recommended.
> +
> +In meta code, we use a fair number of compound initializers for more easily
> +calling Vulkan functions that require struct arguments.  These should be
> +declared as follows:
> +
> +anv_CreateFramebuffer(anv_device_to_handle(device),
> +   &(VkFramebufferCreateInfo) {
> +      .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
> +      .attachmentCount = 1,
> +      .pAttachments = (VkImageView[]) {
> +         anv_image_view_to_handle(dest_iview),
> +      },
> +      .width = dest_iview->extent.width,
> +      .height = dest_iview->extent.height,
> +      .layers = 1
> +   }, _buffer->pool->alloc, );
> +
> +The initial arguments go on the same line as the call and the primary
> +struct being passed in is declared on its own line tabbed over 1 tab with
> +it's contents on the following lines tabbed over an additional tab.

Hopefully it's not the style to use the wrong "its" :P

s/it's/its/

You might also want to replace "tabbed over" with "indented" since
that is the actual word for it. :)

> +Substructures get tabbed over by additional tabs as needed.
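
For reference, the compound-initializer layout quoted above can be tried outside the driver. This is only a minimal sketch in plain C99: demo_create_info, demo_create and demo_call are made-up names for illustration and are not part of anv or Vulkan. It shows the struct passed by address as a compound literal on its own line, with a nested array literal indented one further level.

```c
#include <stdint.h>

/* Hypothetical create-info struct, standing in for a Vk*CreateInfo. */
struct demo_create_info {
   uint32_t    width;
   uint32_t    height;
   uint32_t    count;
   const int * items;
};

/* Hypothetical callee that takes its arguments through a struct pointer. */
static uint32_t
demo_create(const struct demo_create_info *info)
{
   uint32_t sum = 0;
   for (uint32_t i = 0; i < info->count; i++)
      sum += (uint32_t)info->items[i];
   return info->width * info->height + sum;
}

/* The call site: the struct is built in place and passed by address. */
static uint32_t
demo_call(void)
{
   return demo_create(
      &(struct demo_create_info) {
         .width = 4,
         .height = 2,
         .count = 2,
         .items = (const int[]) {
            1,
            2,
         },
      });
}
```

The compound literal lives until the end of the enclosing block, so taking its address for the duration of the call is fine.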


Re: [Mesa-dev] [PATCH 02/18] anv: Add a new block-based batch emit macro

2016-04-18 Thread Ian Romanick
On 04/18/2016 05:10 PM, Jason Ekstrand wrote:
> This new macro uses a for loop to create an actual code block in which to
> place the macro setup code.  One advantage of this is that you syntatically
 syntactically

> use braces instead of parentheses.  Another is that the code in the block
> doesn't even get executed if anv_batch_emit_dwords fails.

Is the old anv_batch_emit eventually removed?

> ---
>  src/intel/vulkan/anv_private.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index ae2e08d..d59b7ed 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -861,6 +861,15 @@ __gen_combine_address(struct anv_batch *batch, void 
> *location,
>VG(VALGRIND_CHECK_MEM_IS_DEFINED(dw, ARRAY_SIZE(dwords0) * 4));\
> } while (0)
>  
> +#define anv_batch_emit_blk(batch, cmd, name)                            \
> +   for (struct cmd name = { __anv_cmd_header(cmd) },                    \
> +        *_dst = anv_batch_emit_dwords(batch, __anv_cmd_length(cmd));    \
> +        __builtin_expect(_dst != NULL, 1);                              \
> +        ({ __anv_cmd_pack(cmd)(batch, _dst, &name);                     \
> +           VG(VALGRIND_CHECK_MEM_IS_DEFINED(_dst, __anv_cmd_length(cmd) * 4)); \
> +           _dst = NULL;                                                 \
> +         }))
> +
>  #define anv_state_pool_emit(pool, cmd, align, ...) ({   \
>const uint32_t __size = __anv_cmd_length(cmd) * 4;\
>struct anv_state __state =\
> 



Re: [Mesa-dev] [PATCH 1/2] i965/vec4: Lower integer multiplication after optimizations.

2016-04-18 Thread Matt Turner
On Mon, Apr 18, 2016 at 5:08 PM, Ian Romanick  wrote:
> On 04/18/2016 04:14 PM, Matt Turner wrote:
>> Analogous to commit 1e4e17fbd in the i965/fs backend.
>>
>> Because the copy propagation pass in the vec4 backend is strictly local,
>> we look at the immediate values coming from NIR and emit the multiplies
>> we need directly. If the copy propagation pass becomes smarter in the
>> future, we can reduce the nir_op_imul case in brw_vec4_nir.cpp to a
>> single multiply.
>>
>> total instructions in shared programs: 7082311 -> 7081953 (-0.01%)
>> instructions in affected programs: 59581 -> 59223 (-0.60%)
>> helped: 293
>>
>> total cycles in shared programs: 65765712 -> 65764796 (-0.00%)
>> cycles in affected programs: 854112 -> 853196 (-0.11%)
>> helped: 154
>> HURT: 73
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 67 
>> ++
>>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 48 +
>>  3 files changed, 88 insertions(+), 28 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index b9cf3f6..1644d4d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1671,6 +1671,71 @@ vec4_visitor::lower_minmax()
>> return progress;
>>  }
>>
>> +bool
>> +vec4_visitor::lower_integer_multiplication()
>> +{
>> +   bool progress = false;
>> +
>> +   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
>> +  const vec4_builder ibld(this, block, inst);
>> +
>> +  if (inst->opcode == BRW_OPCODE_MUL) {
>> + if (inst->dst.is_accumulator() ||
>> + (inst->src[1].type != BRW_REGISTER_TYPE_D &&
>> +  inst->src[1].type != BRW_REGISTER_TYPE_UD))
>> +continue;
>> +
>> + /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
>> +  * operation directly, but CHV/BXT cannot.
>> +  */
>> + if (devinfo->gen >= 8 &&
>> + !devinfo->is_cherryview && !devinfo->is_broxton)
>> +continue;
>
> Shouldn't this whole method just bail if we're Gen >= 8 and !CHV and
> !BXT?  Or does this structure simplify future changes?

Oh, I hadn't noticed.

The FS code was originally as you suggest, with the function returning
early under those conditions. Curro changed that in commit 2e731264382
in order to add lowering support for the multiply-high instruction on
all platforms. We may want to do that in the vec4 backend as well.

The other thing I need to fix is Cherryview multiplications, where we
need to change the type of src1 to UW. I'm not sure if it's better to
do that here, or at a lower level. Maybe in brw_MUL itself since
that's called in a few places...

Depending on whether people think that code should go here or
elsewhere, I'll move the block to the beginning of the function.
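
To make the lowering concrete, here is a minimal sketch of the arithmetic in plain C. It assumes a multiplier that takes a 32-bit src0 but only 16 bits of src1, which is what the UW remark above hints at but this thread does not spell out. mul_32x16 and lowered_mul_32x32 are illustrative names, not Mesa functions, and the instruction sequence the vec4 pass actually emits may differ.

```c
#include <stdint.h>

/* Stand-in for a multiplier that only consumes 16 bits of src1
 * (an assumption made for this sketch). */
static uint32_t
mul_32x16(uint32_t src0, uint16_t src1)
{
   return src0 * (uint32_t)src1;
}

/* Low 32 bits of a 32-bit x 32-bit product, built from two 32x16
 * multiplies, a shift and an add. */
static uint32_t
lowered_mul_32x32(uint32_t a, uint32_t b)
{
   uint32_t lo = mul_32x16(a, (uint16_t)(b & 0xffff));
   uint32_t hi = mul_32x16(a, (uint16_t)(b >> 16));

   /* a * b = a * b_lo + (a * b_hi) * 2^16, all modulo 2^32. */
   return lo + (hi << 16);
}
```

Because everything is computed modulo 2^32, the same sequence gives the right low dword for both signed (D) and unsigned (UD) operands.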


Re: [Mesa-dev] [PATCH] glsl: do not raise uninitialized variable warning with gl_GlobalInvocationID/gl_LocalInvocationIndex

2016-04-18 Thread Ian Romanick
Since the whole gl_* namespace is reserved, it might be better / easier
to strncmp(var->name, "gl_", 3) instead.
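
A minimal sketch of that check in plain C, for illustration only. is_reserved_gl_name is a hypothetical helper name, not an existing function in the GLSL compiler.

```c
#include <stdbool.h>
#include <string.h>

/* True if the name starts with the reserved "gl_" prefix. */
static bool
is_reserved_gl_name(const char *name)
{
   return strncmp(name, "gl_", 3) == 0;
}
```

Compared with listing the two variables by name, this would also cover any other gl_* built-in that ends up represented the same way.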

On 03/31/2016 02:48 AM, Alejandro Piñeiro wrote:
> Most GLSL built-ins variables are filtered out because they have
> the mode ir_var_system_value, but those two not. Those two are
> specially handled as they can be infered from other system values,
> and were represented as a variable initialized with a value
> based of those system values, instead of a lowering.
> ---
> 
> As a reference, this would be the patch I'm proposing.
> 
>  src/compiler/glsl/ast_to_hir.cpp | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index a031231..aec8e6f 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -1905,7 +1905,9 @@ ast_expression::do_hir(exec_list *instructions,
>  
>   if ((var->data.mode == ir_var_auto || var->data.mode == 
> ir_var_shader_out)
>   && !this->is_lhs
> - && result->variable_referenced()->data.assigned != true) {
> + && result->variable_referenced()->data.assigned != true
> + && strcmp(var->name, "gl_GlobalInvocationID") != 0
> + && strcmp(var->name, "gl_LocalInvocationIndex") != 0) {
>  _mesa_glsl_warning(, state, "`%s' used uninitialized",
> this->primary_expression.identifier);
>   }
> 



[Mesa-dev] [PATCH 05/18] anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 76 +-
 1 file changed, 42 insertions(+), 34 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 932ba65..713de82 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -924,28 +924,34 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
 
/* Emit 3DSTATE_DEPTH_BUFFER */
if (has_depth) {
-  anv_batch_emit(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
- .SurfaceType = SURFTYPE_2D,
- .DepthWriteEnable = true,
- .StencilWriteEnable = has_stencil,
- .HierarchicalDepthBufferEnable = false,
- .SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
->depth_surface.isl),
- .SurfacePitch = image->depth_surface.isl.row_pitch - 1,
- .SurfaceBaseAddress = {
+  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER), db) {
+ db.SurfaceType   = SURFTYPE_2D;
+ db.DepthWriteEnable  = true;
+ db.StencilWriteEnable= has_stencil;
+ db.HierarchicalDepthBufferEnable = false;
+
+ db.SurfaceFormat = isl_surf_get_depth_format(>isl_dev,
+  
>depth_surface.isl);
+
+ db.SurfaceBaseAddress = (struct anv_address) {
 .bo = image->bo,
 .offset = image->offset + image->depth_surface.offset,
- },
- .Height = fb->height - 1,
- .Width = fb->width - 1,
- .LOD = 0,
- .Depth = 1 - 1,
- .MinimumArrayElement = 0,
- .DepthBufferObjectControlState = GENX(MOCS),
+ };
+ db.DepthBufferObjectControlState = GENX(MOCS),
+
+ db.SurfacePitch = image->depth_surface.isl.row_pitch - 1;
+ db.Height   = fb->height - 1;
+ db.Width= fb->width - 1;
+ db.LOD  = 0;
+ db.Depth= 1 - 1;
+ db.MinimumArrayElement  = 0;
+
 #if GEN_GEN >= 8
- .SurfaceQPitch = 
isl_surf_get_array_pitch_el_rows(>depth_surface.isl) >> 2,
+ db.SurfaceQPitch =
+isl_surf_get_array_pitch_el_rows(>depth_surface.isl) >> 2,
 #endif
- .RenderTargetViewExtent = 1 - 1);
+ db.RenderTargetViewExtent = 1 - 1;
+  }
} else {
   /* Even when no depth buffer is present, the hardware requires that
* 3DSTATE_DEPTH_BUFFER be programmed correctly. The Broadwell PRM says:
@@ -965,45 +971,47 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
*cmd_buffer)
* nor stencil buffer is present.  Also, D16_UNORM is not allowed to
* be combined with a stencil buffer so we use D32_FLOAT instead.
*/
-  anv_batch_emit(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER),
- .SurfaceType = SURFTYPE_2D,
- .SurfaceFormat = D32_FLOAT,
- .Width = fb->width - 1,
- .Height = fb->height - 1,
- .StencilWriteEnable = has_stencil);
+  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_DEPTH_BUFFER), db) {
+ db.SurfaceType  = SURFTYPE_2D;
+ db.SurfaceFormat= D32_FLOAT;
+ db.Width= fb->width - 1;
+ db.Height   = fb->height - 1;
+ db.StencilWriteEnable   = has_stencil;
+  }
}
 
/* Emit 3DSTATE_STENCIL_BUFFER */
if (has_stencil) {
-  anv_batch_emit(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER),
+  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb) 
{
 #if GEN_GEN >= 8 || GEN_IS_HASWELL
- .StencilBufferEnable = true,
+ sb.StencilBufferEnable = true,
 #endif
- .StencilBufferObjectControlState = GENX(MOCS),
+ sb.StencilBufferObjectControlState = GENX(MOCS),
 
  /* Stencil buffers have strange pitch. The PRM says:
   *
   *The pitch must be set to 2x the value computed based on width,
   *as the stencil buffer is stored with two rows interleaved.
   */
- .SurfacePitch = 2 * image->stencil_surface.isl.row_pitch - 1,
+ sb.SurfacePitch = 2 * image->stencil_surface.isl.row_pitch - 1,
 
 #if GEN_GEN >= 8
- .SurfaceQPitch = 
isl_surf_get_array_pitch_el_rows(>stencil_surface.isl) >> 2,
+ sb.SurfaceQPitch = 
isl_surf_get_array_pitch_el_rows(>stencil_surface.isl) >> 2,
 #endif
- .SurfaceBaseAddress = {
+ sb.SurfaceBaseAddress = (struct anv_address) {
 .bo = image->bo,
 .offset = image->offset + image->stencil_surface.offset,
- });
+ };
+  }
} else {
-  anv_batch_emit(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER));
+  anv_batch_emit_blk(_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
}
 
/* Disable hierarchial depth buffers. */
-   anv_batch_emit(_buffer->batch, 

[Mesa-dev] [PATCH 02/18] anv: Add a new block-based batch emit macro

2016-04-18 Thread Jason Ekstrand
This new macro uses a for loop to create an actual code block in which to
place the macro setup code.  One advantage of this is that you syntactically
use braces instead of parentheses.  Another is that the code in the block
doesn't even get executed if anv_batch_emit_dwords fails.
---
 src/intel/vulkan/anv_private.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index ae2e08d..d59b7ed 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -861,6 +861,15 @@ __gen_combine_address(struct anv_batch *batch, void 
*location,
   VG(VALGRIND_CHECK_MEM_IS_DEFINED(dw, ARRAY_SIZE(dwords0) * 4));\
} while (0)
 
+#define anv_batch_emit_blk(batch, cmd, name)                            \
+   for (struct cmd name = { __anv_cmd_header(cmd) },                    \
+        *_dst = anv_batch_emit_dwords(batch, __anv_cmd_length(cmd));    \
+        __builtin_expect(_dst != NULL, 1);                              \
+        ({ __anv_cmd_pack(cmd)(batch, _dst, &name);                     \
+           VG(VALGRIND_CHECK_MEM_IS_DEFINED(_dst, __anv_cmd_length(cmd) * 4)); \
+           _dst = NULL;                                                 \
+         }))
+
 #define anv_state_pool_emit(pool, cmd, align, ...) ({   \
   const uint32_t __size = __anv_cmd_length(cmd) * 4;\
   struct anv_state __state =\
-- 
2.5.0.400.gff86faf
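
The for-loop trick described in the commit message can be tried outside the driver. The following is a minimal sketch under stated assumptions: demo_cmd, demo_batch, demo_batch_emit_dwords, demo_cmd_pack and demo_batch_emit_blk are all hypothetical stand-ins, not the anv definitions, and the sketch uses a plain function call in the loop's third clause instead of the GNU statement expression used in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BATCH_DWORDS 8

/* Hypothetical command layout, standing in for a generated struct. */
struct demo_cmd {
   uint32_t header;
   uint32_t value;
};

/* Hypothetical batch: a fixed array of dwords plus a write cursor. */
struct demo_batch {
   uint32_t dwords[DEMO_BATCH_DWORDS];
   size_t   next;
};

/* Reserve space in the batch; returns NULL when the batch is full. */
static void *
demo_batch_emit_dwords(struct demo_batch *batch, size_t count)
{
   if (batch->next + count > DEMO_BATCH_DWORDS)
      return NULL;
   void *dst = &batch->dwords[batch->next];
   batch->next += count;
   return dst;
}

/* Copy the filled-in struct into the reserved space.  Returns NULL so the
 * loop's third clause can clear the pointer and end the loop. */
static void *
demo_cmd_pack(void *dst, const struct demo_cmd *cmd)
{
   memcpy(dst, cmd, sizeof(*cmd));
   return NULL;
}

/* Declares the struct and the destination pointer in the init clause, runs
 * the body once if space was reserved, then packs in the third clause. */
#define demo_batch_emit_blk(batch, name)                                \
   for (struct demo_cmd name = { .header = 0x1234 },                    \
        *_dst = demo_batch_emit_dwords(batch, 2);                       \
        _dst != NULL;                                                   \
        _dst = demo_cmd_pack(_dst, &name))

/* Usage: returns 1 if the block ran, 0 if the batch was full. */
static int
demo_emit_value(struct demo_batch *batch, uint32_t value)
{
   int emitted = 0;
   demo_batch_emit_blk(batch, cmd) {
      cmd.value = value;
      emitted = 1;
   }
   return emitted;
}
```

Since the body is the loop body, it is skipped entirely when the reservation fails, and the packing runs exactly once after the closing brace.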



[Mesa-dev] [PATCH 14/18] anv/device: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c | 17 +
 src/intel/vulkan/anv_device.c  |  8 
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c 
b/src/intel/vulkan/anv_batch_chain.c
index 034f3fd..3bf0cd0 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -455,12 +455,13 @@ emit_batch_buffer_start(struct anv_cmd_buffer *cmd_buffer,
const uint32_t gen8_length =
   GEN8_MI_BATCH_BUFFER_START_length - 
GEN8_MI_BATCH_BUFFER_START_length_bias;
 
-   anv_batch_emit(_buffer->batch, GEN8_MI_BATCH_BUFFER_START,
-  .DWordLength = cmd_buffer->device->info.gen < 8 ?
- gen7_length : gen8_length,
-  ._2ndLevelBatchBuffer = _1stlevelbatch,
-  .AddressSpaceIndicator = ASI_PPGTT,
-  .BatchBufferStartAddress = { bo, offset });
+   anv_batch_emit_blk(_buffer->batch, GEN8_MI_BATCH_BUFFER_START, bbs) {
+  bbs.DWordLength   = cmd_buffer->device->info.gen < 8 ?
+  gen7_length : gen8_length;
+  bbs._2ndLevelBatchBuffer  = _1stlevelbatch;
+  bbs.AddressSpaceIndicator = ASI_PPGTT;
+  bbs.BatchBufferStartAddress   = (struct anv_address) { bo, offset };
+   }
 }
 
 static void
@@ -711,11 +712,11 @@ anv_cmd_buffer_end_batch_buffer(struct anv_cmd_buffer 
*cmd_buffer)
   cmd_buffer->batch.end += GEN8_MI_BATCH_BUFFER_START_length * 4;
   assert(cmd_buffer->batch.end == batch_bo->bo.map + batch_bo->bo.size);
 
-  anv_batch_emit(_buffer->batch, GEN7_MI_BATCH_BUFFER_END);
+  anv_batch_emit_blk(_buffer->batch, GEN7_MI_BATCH_BUFFER_END, bbe);
 
   /* Round batch up to an even number of dwords. */
   if ((cmd_buffer->batch.next - cmd_buffer->batch.start) & 4)
- anv_batch_emit(_buffer->batch, GEN7_MI_NOOP);
+ anv_batch_emit_blk(_buffer->batch, GEN7_MI_NOOP, noop);
 
   cmd_buffer->exec_mode = ANV_CMD_BUFFER_EXEC_MODE_PRIMARY;
}
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e477fe1..c2c2db8 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1081,8 +1081,8 @@ VkResult anv_DeviceWaitIdle(
batch.start = batch.next = cmds;
batch.end = (void *) cmds + sizeof(cmds);
 
-   anv_batch_emit(, GEN7_MI_BATCH_BUFFER_END);
-   anv_batch_emit(, GEN7_MI_NOOP);
+   anv_batch_emit_blk(, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit_blk(, GEN7_MI_NOOP, noop);
 
return anv_device_submit_simple_batch(device, );
 }
@@ -1423,8 +1423,8 @@ VkResult anv_CreateFence(
const uint32_t batch_offset = align_u32(sizeof(*fence), CACHELINE_SIZE);
batch.next = batch.start = fence->bo.map + batch_offset;
batch.end = fence->bo.map + fence->bo.size;
-   anv_batch_emit(, GEN7_MI_BATCH_BUFFER_END);
-   anv_batch_emit(, GEN7_MI_NOOP);
+   anv_batch_emit_blk(, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit_blk(, GEN7_MI_NOOP, noop);
 
if (!device->info.has_llc) {
   assert(((uintptr_t) batch.start & CACHELINE_MASK) == 0);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 15/18] anv/gen7_cmd_buffer: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/gen7_cmd_buffer.c | 122 +
 1 file changed, 70 insertions(+), 52 deletions(-)

diff --git a/src/intel/vulkan/gen7_cmd_buffer.c 
b/src/intel/vulkan/gen7_cmd_buffer.c
index 5130a40..88964da 100644
--- a/src/intel/vulkan/gen7_cmd_buffer.c
+++ b/src/intel/vulkan/gen7_cmd_buffer.c
@@ -57,18 +57,20 @@ gen7_cmd_buffer_emit_descriptor_pointers(struct 
anv_cmd_buffer *cmd_buffer,
 
anv_foreach_stage(s, stages) {
   if (cmd_buffer->state.samplers[s].alloc_size > 0) {
- anv_batch_emit(_buffer->batch,
-GENX(3DSTATE_SAMPLER_STATE_POINTERS_VS),
-._3DCommandSubOpcode  = sampler_state_opcodes[s],
-.PointertoVSSamplerState = 
cmd_buffer->state.samplers[s].offset);
+ anv_batch_emit_blk(_buffer->batch,
+GENX(3DSTATE_SAMPLER_STATE_POINTERS_VS), ssp) {
+ssp._3DCommandSubOpcode = sampler_state_opcodes[s];
+ssp.PointertoVSSamplerState = cmd_buffer->state.samplers[s].offset;
+ }
   }
 
   /* Always emit binding table pointers if we're asked to, since on SKL
* this is what flushes push constants. */
-  anv_batch_emit(_buffer->batch,
- GENX(3DSTATE_BINDING_TABLE_POINTERS_VS),
- ._3DCommandSubOpcode  = binding_table_opcodes[s],
- .PointertoVSBindingTable = 
cmd_buffer->state.binding_tables[s].offset);
+  anv_batch_emit_blk(_buffer->batch,
+ GENX(3DSTATE_BINDING_TABLE_POINTERS_VS), btp) {
+ btp._3DCommandSubOpcode = binding_table_opcodes[s];
+ btp.PointertoVSBindingTable = 
cmd_buffer->state.binding_tables[s].offset;
+  }
}
 }
 
@@ -173,8 +175,10 @@ gen7_cmd_buffer_emit_scissor(struct anv_cmd_buffer 
*cmd_buffer)
   }
}
 
-   anv_batch_emit(_buffer->batch, GEN7_3DSTATE_SCISSOR_STATE_POINTERS,
-  .ScissorRectPointer = scissor_state.offset);
+   anv_batch_emit_blk(_buffer->batch,
+  GEN7_3DSTATE_SCISSOR_STATE_POINTERS, ssp) {
+  ssp.ScissorRectPointer = scissor_state.offset;
+   }
 
if (!cmd_buffer->device->info.has_llc)
   anv_state_clflush(scissor_state);
@@ -237,9 +241,10 @@ flush_compute_descriptor_set(struct anv_cmd_buffer 
*cmd_buffer)
unsigned push_constant_regs = reg_aligned_constant_size / 32;
 
if (push_state.alloc_size) {
-  anv_batch_emit(_buffer->batch, GENX(MEDIA_CURBE_LOAD),
- .CURBETotalDataLength = push_state.alloc_size,
- .CURBEDataStartAddress = push_state.offset);
+  anv_batch_emit_blk(_buffer->batch, GENX(MEDIA_CURBE_LOAD), curbe) {
+ curbe.CURBETotalDataLength= push_state.alloc_size;
+ curbe.CURBEDataStartAddress   = push_state.offset;
+  }
}
 
assert(prog_data->total_shared <= 64 * 1024);
@@ -269,18 +274,15 @@ flush_compute_descriptor_set(struct anv_cmd_buffer 
*cmd_buffer)
  pipeline->cs_thread_width_max);
 
const uint32_t size = GENX(INTERFACE_DESCRIPTOR_DATA_length) * 
sizeof(uint32_t);
-   anv_batch_emit(_buffer->batch, GENX(MEDIA_INTERFACE_DESCRIPTOR_LOAD),
-  .InterfaceDescriptorTotalLength = size,
-  .InterfaceDescriptorDataStartAddress = state.offset);
+   anv_batch_emit_blk(_buffer->batch,
+  GENX(MEDIA_INTERFACE_DESCRIPTOR_LOAD), idl) {
+  idl.InterfaceDescriptorTotalLength= size;
+  idl.InterfaceDescriptorDataStartAddress = state.offset;
+   }
 
return VK_SUCCESS;
 }
 
-#define emit_lri(batch, reg, imm)   \
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_IMM),\
-  .RegisterOffset = __anv_reg_num(reg), \
-  .DataDWord = imm)
-
 void
 genX(cmd_buffer_config_l3)(struct anv_cmd_buffer *cmd_buffer, bool enable_slm)
 {
@@ -310,10 +312,11 @@ genX(cmd_buffer_config_l3)(struct anv_cmd_buffer 
*cmd_buffer, bool enable_slm)
* flushed, which involves a first PIPE_CONTROL flush which stalls the
* pipeline...
*/
-  anv_batch_emit(_buffer->batch, GENX(PIPE_CONTROL),
- .DCFlushEnable = true,
- .PostSyncOperation = NoWrite,
- .CommandStreamerStallEnable = true);
+  anv_batch_emit_blk(_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.DCFlushEnable  = true;
+ pc.CommandStreamerStallEnable = true;
+ pc.PostSyncOperation  = NoWrite;
+  }
 
   /* ...followed by a second pipelined PIPE_CONTROL that initiates
* invalidation of the relevant caches. Note that because RO
@@ -329,23 +332,28 @@ genX(cmd_buffer_config_l3)(struct anv_cmd_buffer 
*cmd_buffer, bool enable_slm)
* previous and subsequent PIPE_CONTROLs already guarantee that there is
* no concurrent GPGPU kernel execution (see SKL HSD 2132585).
*/
-  

[Mesa-dev] [PATCH 03/18] anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 52 --
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index d642832..b21ff97 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -513,14 +513,15 @@ void genX(CmdDraw)(
if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
 
-   anv_batch_emit(_buffer->batch, GENX(3DPRIMITIVE),
-  .VertexAccessType = SEQUENTIAL,
-  .PrimitiveTopologyType= pipeline->topology,
-  .VertexCountPerInstance   = vertexCount,
-  .StartVertexLocation  = firstVertex,
-  .InstanceCount= instanceCount,
-  .StartInstanceLocation= firstInstance,
-  .BaseVertexLocation   = 0);
+   anv_batch_emit_blk(_buffer->batch, GENX(3DPRIMITIVE), prim) {
+  prim.VertexAccessType = SEQUENTIAL;
+  prim.PrimitiveTopologyType= pipeline->topology;
+  prim.VertexCountPerInstance   = vertexCount;
+  prim.StartVertexLocation  = firstVertex;
+  prim.InstanceCount= instanceCount;
+  prim.StartInstanceLocation= firstInstance;
+  prim.BaseVertexLocation   = 0;
+   }
 }
 
 void genX(CmdDrawIndexed)(
@@ -540,14 +541,15 @@ void genX(CmdDrawIndexed)(
if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
 
-   anv_batch_emit(_buffer->batch, GENX(3DPRIMITIVE),
-  .VertexAccessType = RANDOM,
-  .PrimitiveTopologyType= pipeline->topology,
-  .VertexCountPerInstance   = indexCount,
-  .StartVertexLocation  = firstIndex,
-  .InstanceCount= instanceCount,
-  .StartInstanceLocation= firstInstance,
-  .BaseVertexLocation   = vertexOffset);
+   anv_batch_emit_blk(_buffer->batch, GENX(3DPRIMITIVE), prim) {
+  prim.VertexAccessType = RANDOM;
+  prim.PrimitiveTopologyType= pipeline->topology;
+  prim.VertexCountPerInstance   = indexCount;
+  prim.StartVertexLocation  = firstIndex;
+  prim.InstanceCount= instanceCount;
+  prim.StartInstanceLocation= firstInstance;
+  prim.BaseVertexLocation   = vertexOffset;
+   }
 }
 
 /* Auto-Draw / Indirect Registers */
@@ -600,10 +602,11 @@ void genX(CmdDrawIndirect)(
emit_lrm(_buffer->batch, GEN7_3DPRIM_START_INSTANCE, bo, bo_offset + 
12);
emit_lri(_buffer->batch, GEN7_3DPRIM_BASE_VERTEX, 0);
 
-   anv_batch_emit(_buffer->batch, GENX(3DPRIMITIVE),
-  .IndirectParameterEnable  = true,
-  .VertexAccessType = SEQUENTIAL,
-  .PrimitiveTopologyType= pipeline->topology);
+   anv_batch_emit_blk(_buffer->batch, GENX(3DPRIMITIVE), prim) {
+  prim.IndirectParameterEnable  = true;
+  prim.VertexAccessType = SEQUENTIAL;
+  prim.PrimitiveTopologyType= pipeline->topology;
+   }
 }
 
 void genX(CmdDrawIndexedIndirect)(
@@ -632,10 +635,11 @@ void genX(CmdDrawIndexedIndirect)(
emit_lrm(_buffer->batch, GEN7_3DPRIM_BASE_VERTEX, bo, bo_offset + 12);
emit_lrm(_buffer->batch, GEN7_3DPRIM_START_INSTANCE, bo, bo_offset + 
16);
 
-   anv_batch_emit(_buffer->batch, GENX(3DPRIMITIVE),
-  .IndirectParameterEnable  = true,
-  .VertexAccessType = RANDOM,
-  .PrimitiveTopologyType= pipeline->topology);
+   anv_batch_emit_blk(_buffer->batch, GENX(3DPRIMITIVE), prim) {
+  prim.IndirectParameterEnable  = true;
+  prim.VertexAccessType = RANDOM;
+  prim.PrimitiveTopologyType= pipeline->topology;
+   }
 }
 
 #if GEN_GEN == 7
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 01/18] anv: Add a short style guide

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/STYLE | 67 ++
 1 file changed, 67 insertions(+)
 create mode 100644 src/intel/vulkan/STYLE

diff --git a/src/intel/vulkan/STYLE b/src/intel/vulkan/STYLE
new file mode 100644
index 000..4eb8f79
--- /dev/null
+++ b/src/intel/vulkan/STYLE
@@ -0,0 +1,67 @@
+The Intel Vulkan driver typically follows the mesa coding style with a few
+exceptions.  First is that structs declared in anv_private.h should be
+written as follows:
+
+struct anv_foo {
+   int                  short_type;
+   struct anv_long_type long_type;
+   void *               ptr;
+};
+
+Where the * for pointers goes one space after the type and the names are
+vertically aligned.  The names should be tabbed over the minimum amount
+(still a multiple of 3 spaces) such that they can all be aligned.
+
+When the anv_batch_emit function is used, it should look as follows:
+
+anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE),
+   .VertexAccessType       = SEQUENTIAL,
+   .PrimitiveTopologyType  = pipeline->topology,
+   .VertexCountPerInstance = vertexCount,
+   .StartVertexLocation    = firstVertex,
+   .InstanceCount          = instanceCount,
+   .StartInstanceLocation  = firstInstance,
+   .BaseVertexLocation     = 0);
+
+The batch and struct name parameters should go on the same line with the
+anv_batch_emit call and each named parameter on its own line.  The
+alignment rules are the same as for structs where all of the "=" are
+vertically aligned at the minimum tabstop required to do so.
+
+Eventually, we would like to move to a block-based packing mechanism.  In
+this case, the above packing macro would look like this:
+
+anv_batch_emit(&cmd_buffer->batch, GENX(3DPRIMITIVE), prim) {
+   prim.VertexAccessType         = SEQUENTIAL;
+   prim.PrimitiveTopologyType    = pipeline->topology;
+   prim.VertexCountPerInstance   = vertexCount;
+   prim.StartVertexLocation      = firstVertex;
+   prim.InstanceCount            = instanceCount;
+   prim.StartInstanceLocation    = firstInstance;
+   prim.BaseVertexLocation       = 0;
+}
+
+With this new block mechanism, you may end up mixing code and declarations.
+In this case, it's up to the discretion of the programmer exactly how to
+tab things but trying to keep with the above rules is recommended.
+
+In meta code, we use a fair number of compound initializers for more easily
+calling Vulkan functions that require struct arguments.  These should be
+declared as follows:
+
+anv_CreateFramebuffer(anv_device_to_handle(device),
+   &(VkFramebufferCreateInfo) {
+      .sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
+      .attachmentCount = 1,
+      .pAttachments = (VkImageView[]) {
+         anv_image_view_to_handle(dest_iview),
+      },
+      .width = dest_iview->extent.width,
+      .height = dest_iview->extent.height,
+      .layers = 1
+   }, _buffer->pool->alloc, );
+
+The initial arguments go on the same line as the call and the primary
+struct being passed in is declared on its own line tabbed over 1 tab with
+its contents on the following lines tabbed over an additional tab.
+Substructures get tabbed over by additional tabs as needed.
-- 
2.5.0.400.gff86faf
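
One way to read the "minimum amount (still a multiple of 3 spaces)" rule from the guide is: pick the smallest column that is a multiple of 3 and still leaves at least one space after the longest item. That reading is an interpretation, not something the guide states outright. A small sketch of the arithmetic, with aligned_column as a hypothetical helper name:

```c
/* Column (0-based) at which aligned names or "=" signs would start,
 * given the indentation and the length of the longest item before them.
 * Assumes the rule means "smallest multiple of 3 that leaves at least
 * one space". */
static unsigned
aligned_column(unsigned indent, unsigned longest_len)
{
   unsigned end = indent + longest_len;   /* first column after the item */
   return (end / 3 + 1) * 3;              /* next multiple of 3 past it */
}
```

For example, with a 3-space indent, "struct anv_long_type" (20 characters) puts the member names at column 24, and "prim.VertexCountPerInstance" (27 characters) puts the "=" signs at column 33.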



[Mesa-dev] [PATCH 10/18] anv/gen8_cmd_buffer: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/gen8_cmd_buffer.c | 161 -
 1 file changed, 87 insertions(+), 74 deletions(-)

diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
b/src/intel/vulkan/gen8_cmd_buffer.c
index 3956a58..96ef846 100644
--- a/src/intel/vulkan/gen8_cmd_buffer.c
+++ b/src/intel/vulkan/gen8_cmd_buffer.c
@@ -80,20 +80,17 @@ gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer 
*cmd_buffer)
   anv_state_clflush(cc_state);
}
 
-   anv_batch_emit(_buffer->batch,
-  GENX(3DSTATE_VIEWPORT_STATE_POINTERS_CC),
-  .CCViewportPointer = cc_state.offset);
-   anv_batch_emit(_buffer->batch,
-  GENX(3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP),
-  .SFClipViewportPointer = sf_clip_state.offset);
+   anv_batch_emit_blk(_buffer->batch,
+  GENX(3DSTATE_VIEWPORT_STATE_POINTERS_CC), cc) {
+  cc.CCViewportPointer = cc_state.offset;
+   }
+   anv_batch_emit_blk(_buffer->batch,
+  GENX(3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP), clip) {
+  clip.SFClipViewportPointer = sf_clip_state.offset;
+   }
 }
 #endif
 
-#define emit_lri(batch, reg, imm)   \
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_IMM),\
-  .RegisterOffset = __anv_reg_num(reg), \
-  .DataDWord = imm)
-
 void
 genX(cmd_buffer_config_l3)(struct anv_cmd_buffer *cmd_buffer, bool enable_slm)
 {
@@ -120,10 +117,11 @@ genX(cmd_buffer_config_l3)(struct anv_cmd_buffer 
*cmd_buffer, bool enable_slm)
* flushed, which involves a first PIPE_CONTROL flush which stalls the
* pipeline...
*/
-  anv_batch_emit(_buffer->batch, GENX(PIPE_CONTROL),
- .DCFlushEnable = true,
- .PostSyncOperation = NoWrite,
- .CommandStreamerStallEnable = true);
+  anv_batch_emit_blk(_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.DCFlushEnable  = true;
+ pc.PostSyncOperation  = NoWrite;
+ pc.CommandStreamerStallEnable = true;
+  }
 
   /* ...followed by a second pipelined PIPE_CONTROL that initiates
* invalidation of the relevant caches. Note that because RO
@@ -139,22 +137,27 @@ genX(cmd_buffer_config_l3)(struct anv_cmd_buffer 
*cmd_buffer, bool enable_slm)
* previous and subsequent PIPE_CONTROLs already guarantee that there is
* no concurrent GPGPU kernel execution (see SKL HSD 2132585).
*/
-  anv_batch_emit(_buffer->batch, GENX(PIPE_CONTROL),
- .TextureCacheInvalidationEnable = true,
- .ConstantCacheInvalidationEnable = true,
- .InstructionCacheInvalidateEnable = true,
- .StateCacheInvalidationEnable = true,
- .PostSyncOperation = NoWrite);
+  anv_batch_emit_blk(_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.TextureCacheInvalidationEnable   = true,
+ pc.ConstantCacheInvalidationEnable  = true,
+ pc.InstructionCacheInvalidateEnable = true,
+ pc.StateCacheInvalidationEnable = true,
+ pc.PostSyncOperation= NoWrite;
+  }
 
   /* Now send a third stalling flush to make sure that invalidation is
* complete when the L3 configuration registers are modified.
*/
-  anv_batch_emit(_buffer->batch, GENX(PIPE_CONTROL),
- .DCFlushEnable = true,
- .PostSyncOperation = NoWrite,
- .CommandStreamerStallEnable = true);
-
-  emit_lri(_buffer->batch, GENX(L3CNTLREG), l3cr_val);
+  anv_batch_emit_blk(_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.DCFlushEnable  = true;
+ pc.PostSyncOperation  = NoWrite;
+ pc.CommandStreamerStallEnable = true;
+  }
+
+  anv_batch_emit_blk(_buffer->batch, GENX(MI_LOAD_REGISTER_IMM), lri) {
+ lri.RegisterOffset   = GENX(L3CNTLREG_num);
+ lri.DataDWord= l3cr_val;
+  }
   cmd_buffer->state.current_l3_config = l3cr_val;
}
 }
@@ -247,10 +250,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct 
anv_cmd_buffer *cmd_buffer)
   if (!cmd_buffer->device->info.has_llc)
  anv_state_clflush(cc_state);
 
-  anv_batch_emit(_buffer->batch,
- GENX(3DSTATE_CC_STATE_POINTERS),
- .ColorCalcStatePointer = cc_state.offset,
- .ColorCalcStatePointerValid = true);
+  anv_batch_emit_blk(_buffer->batch,
+ GENX(3DSTATE_CC_STATE_POINTERS), ccp) {
+ ccp.ColorCalcStatePointer= cc_state.offset;
+ ccp.ColorCalcStatePointerValid   = true;
+  }
}
 
if (cmd_buffer->state.dirty & (ANV_CMD_DIRTY_PIPELINE |
@@ -291,10 +295,11 @@ genX(cmd_buffer_flush_dynamic_state)(struct 
anv_cmd_buffer *cmd_buffer)
   if (!cmd_buffer->device->info.has_llc)
  anv_state_clflush(cc_state);
 

[Mesa-dev] [PATCH 12/18] anv/gen8_pipeline: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/gen8_pipeline.c | 368 ---
 1 file changed, 191 insertions(+), 177 deletions(-)

diff --git a/src/intel/vulkan/gen8_pipeline.c b/src/intel/vulkan/gen8_pipeline.c
index 6f6868e..f4224e0 100644
--- a/src/intel/vulkan/gen8_pipeline.c
+++ b/src/intel/vulkan/gen8_pipeline.c
@@ -39,8 +39,9 @@ emit_ia_state(struct anv_pipeline *pipeline,
   const VkPipelineInputAssemblyStateCreateInfo *info,
   const struct anv_graphics_pipeline_create_info *extra)
 {
-   anv_batch_emit(>batch, GENX(3DSTATE_VF_TOPOLOGY),
-  .PrimitiveTopologyType = pipeline->topology);
+   anv_batch_emit_blk(>batch, GENX(3DSTATE_VF_TOPOLOGY), vft) {
+  vft.PrimitiveTopologyType = pipeline->topology;
+   }
 }
 
 static void
@@ -191,26 +192,28 @@ emit_cb_state(struct anv_pipeline *pipeline,
 
struct GENX(BLEND_STATE_ENTRY) *bs0 = &blend_state.Entry[0];
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_PS_BLEND),
-  .AlphaToCoverageEnable = blend_state.AlphaToCoverageEnable,
-  .HasWriteableRT = has_writeable_rt,
-  .ColorBufferBlendEnable = bs0->ColorBufferBlendEnable,
-  .SourceAlphaBlendFactor = bs0->SourceAlphaBlendFactor,
-  .DestinationAlphaBlendFactor =
- bs0->DestinationAlphaBlendFactor,
-  .SourceBlendFactor = bs0->SourceBlendFactor,
-  .DestinationBlendFactor = bs0->DestinationBlendFactor,
-  .AlphaTestEnable = false,
-  .IndependentAlphaBlendEnable =
- blend_state.IndependentAlphaBlendEnable);
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_PS_BLEND), blend) {
+  blend.AlphaToCoverageEnable = blend_state.AlphaToCoverageEnable;
+  blend.HasWriteableRT= has_writeable_rt;
+  blend.ColorBufferBlendEnable= bs0->ColorBufferBlendEnable;
+  blend.SourceAlphaBlendFactor= bs0->SourceAlphaBlendFactor;
+  blend.DestinationAlphaBlendFactor   = bs0->DestinationAlphaBlendFactor;
+  blend.SourceBlendFactor = bs0->SourceBlendFactor;
+  blend.DestinationBlendFactor= bs0->DestinationBlendFactor;
+  blend.AlphaTestEnable   = false;
+  blend.IndependentAlphaBlendEnable   =
+ blend_state.IndependentAlphaBlendEnable;
+   }
 
GENX(BLEND_STATE_pack)(NULL, pipeline->blend_state.map, &blend_state);
if (!device->info.has_llc)
   anv_state_clflush(pipeline->blend_state);
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_BLEND_STATE_POINTERS),
-  .BlendStatePointer = pipeline->blend_state.offset,
-  .BlendStatePointerValid = true);
+   anv_batch_emit_blk(&pipeline->batch,
+  GENX(3DSTATE_BLEND_STATE_POINTERS), bsp) {
+  bsp.BlendStatePointer  = pipeline->blend_state.offset;
+  bsp.BlendStatePointerValid = true;
+   }
 }
 
 static void
@@ -288,20 +291,21 @@ emit_ms_state(struct anv_pipeline *pipeline,
if (info && info->sampleShadingEnable)
   anv_finishme("VkPipelineMultisampleStateCreateInfo::sampleShadingEnable");
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_MULTISAMPLE),
-
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_MULTISAMPLE), ms) {
   /* The PRM says that this bit is valid only for DX9:
*
*SW can choose to set this bit only for DX9 API. DX10/OGL API's
*should not have any effect by setting or not setting this bit.
*/
-  .PixelPositionOffsetEnable = false,
+  ms.PixelPositionOffsetEnable = false;
 
-  .PixelLocation = CENTER,
-  .NumberofMultisamples = log2_samples);
+  ms.PixelLocation = CENTER;
+  ms.NumberofMultisamples = log2_samples;
+   }
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_SAMPLE_MASK),
-  .SampleMask = sample_mask);
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
+  sm.SampleMask = sample_mask;
+   }
 }
 
 VkResult
@@ -347,85 +351,88 @@ genX(graphics_pipeline_create)(
emit_urb_setup(pipeline);
 
const struct brw_wm_prog_data *wm_prog_data = get_wm_prog_data(pipeline);
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_CLIP),
-  .ClipEnable = !(extra && extra->use_rectlist),
-  .EarlyCullEnable = true,
-  .APIMode = 1, /* D3D */
-  .ViewportXYClipTestEnable = true,
-
-  .ClipMode =
- pCreateInfo->pRasterizationState->rasterizerDiscardEnable ?
- REJECT_ALL : NORMAL,
-
-  .NonPerspectiveBarycentricEnable = wm_prog_data ?
- (wm_prog_data->barycentric_interp_modes & 0x38) != 0 : 0,
-
-  .TriangleStripListProvokingVertexSelect = 0,
-  .LineStripListProvokingVertexSelect = 0,
-  .TriangleFanProvokingVertexSelect = 1,
-
-  .MinimumPointWidth = 0.125,
-  .MaximumPointWidth = 255.875,
- 

[Mesa-dev] [PATCH 09/18] anv/cmd_buffer: Use the new emit macro for queries

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 56 ++
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index abf0961..5c00b1d 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1192,20 +1192,23 @@ void genX(CmdWriteTimestamp)(
 
switch (pipelineStage) {
case VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT:
-  anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM),
- .RegisterAddress = TIMESTAMP,
- .MemoryAddress = { >bo, offset });
-  anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM),
- .RegisterAddress = TIMESTAMP + 4,
- .MemoryAddress = { >bo, offset + 4 });
+  anv_batch_emit_blk(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
+ srm.RegisterAddress  = TIMESTAMP;
+ srm.MemoryAddress= (struct anv_address) { >bo, offset };
+  }
+  anv_batch_emit_blk(&cmd_buffer->batch, GENX(MI_STORE_REGISTER_MEM), srm) {
+ srm.RegisterAddress  = TIMESTAMP + 4;
+ srm.MemoryAddress= (struct anv_address) { >bo, offset + 4 };
+  }
   break;
 
default:
   /* Everything else is bottom-of-pipe */
-  anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL),
- .DestinationAddressType = DAT_PPGTT,
- .PostSyncOperation = WriteTimestamp,
- .Address = { >bo, offset });
+  anv_batch_emit_blk(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.DestinationAddressType  = DAT_PPGTT;
+ pc.PostSyncOperation   = WriteTimestamp;
+ pc.Address = (struct anv_address) { >bo, offset };
+  }
   break;
}
 
@@ -1250,26 +1253,31 @@ static void
 emit_load_alu_reg_u64(struct anv_batch *batch, uint32_t reg,
   struct anv_bo *bo, uint32_t offset)
 {
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_MEM),
-  .RegisterAddress = reg,
-  .MemoryAddress = { bo, offset });
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_MEM),
-  .RegisterAddress = reg + 4,
-  .MemoryAddress = { bo, offset + 4 });
+   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
+  lrm.RegisterAddress  = reg;
+  lrm.MemoryAddress= (struct anv_address) { bo, offset };
+   }
+   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
+  lrm.RegisterAddress  = reg + 4;
+  lrm.MemoryAddress= (struct anv_address) { bo, offset + 4 };
+   }
 }
 
 static void
 store_query_result(struct anv_batch *batch, uint32_t reg,
struct anv_bo *bo, uint32_t offset, VkQueryResultFlags flags)
 {
-  anv_batch_emit(batch, GENX(MI_STORE_REGISTER_MEM),
- .RegisterAddress = reg,
- .MemoryAddress = { bo, offset });
-
-  if (flags & VK_QUERY_RESULT_64_BIT)
- anv_batch_emit(batch, GENX(MI_STORE_REGISTER_MEM),
-.RegisterAddress = reg + 4,
-.MemoryAddress = { bo, offset + 4 });
+   anv_batch_emit_blk(batch, GENX(MI_STORE_REGISTER_MEM), srm) {
+  srm.RegisterAddress  = reg;
+  srm.MemoryAddress= (struct anv_address) { bo, offset };
+   }
+
+   if (flags & VK_QUERY_RESULT_64_BIT) {
+  anv_batch_emit_blk(batch, GENX(MI_STORE_REGISTER_MEM), srm) {
+ srm.RegisterAddress  = reg + 4;
+ srm.MemoryAddress= (struct anv_address) { bo, offset + 4 };
+  }
+   }
 }
 
 void genX(CmdCopyQueryPoolResults)(
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 16/18] anv/gen7_pipeline: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/gen7_pipeline.c | 237 ---
 1 file changed, 124 insertions(+), 113 deletions(-)

diff --git a/src/intel/vulkan/gen7_pipeline.c b/src/intel/vulkan/gen7_pipeline.c
index d6d5ce6..62e43ad 100644
--- a/src/intel/vulkan/gen7_pipeline.c
+++ b/src/intel/vulkan/gen7_pipeline.c
@@ -175,8 +175,10 @@ gen7_emit_cb_state(struct anv_pipeline *pipeline,
  anv_state_clflush(pipeline->blend_state);
 }
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_BLEND_STATE_POINTERS),
-  .BlendStatePointer = pipeline->blend_state.offset);
+   anv_batch_emit_blk(&pipeline->batch,
+  GENX(3DSTATE_BLEND_STATE_POINTERS), bsp) {
+  bsp.BlendStatePointer = pipeline->blend_state.offset;
+   }
 }
 
 VkResult
@@ -193,7 +195,7 @@ genX(graphics_pipeline_create)(
VkResult result;
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO);
-   
+
pipeline = anv_alloc2(>alloc, pAllocator, sizeof(*pipeline), 8,
  VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
if (pipeline == NULL)
@@ -222,19 +224,22 @@ genX(graphics_pipeline_create)(
const VkPipelineRasterizationStateCreateInfo *rs_info =
   pCreateInfo->pRasterizationState;
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_CLIP),
-  .FrontWinding = vk_to_gen_front_face[rs_info->frontFace],
-  .CullMode = vk_to_gen_cullmode[rs_info->cullMode],
-  .ClipEnable   = !(extra && extra->use_rectlist),
-  .APIMode  = APIMODE_OGL,
-  .ViewportXYClipTestEnable = true,
-  .ClipMode = CLIPMODE_NORMAL,
-  .TriangleStripListProvokingVertexSelect   = 0,
-  .LineStripListProvokingVertexSelect   = 0,
-  .TriangleFanProvokingVertexSelect = 1,
-  .MinimumPointWidth= 0.125,
-  .MaximumPointWidth= 255.875,
-  .MaximumVPIndex = pCreateInfo->pViewportState->viewportCount - 1);
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_CLIP), clip) {
+  clip.FrontWinding = vk_to_gen_front_face[rs_info->frontFace];
+  clip.CullMode = vk_to_gen_cullmode[rs_info->cullMode];
+  clip.ClipEnable   = !(extra && extra->use_rectlist);
+  clip.APIMode  = APIMODE_OGL;
+  clip.ViewportXYClipTestEnable = true;
+  clip.ClipMode = CLIPMODE_NORMAL;
+
+  clip.TriangleStripListProvokingVertexSelect   = 0;
+  clip.LineStripListProvokingVertexSelect   = 0;
+  clip.TriangleFanProvokingVertexSelect = 1;
+
+  clip.MinimumPointWidth= 0.125;
+  clip.MaximumPointWidth= 255.875;
+  clip.MaximumVPIndex = pCreateInfo->pViewportState->viewportCount - 1;
+   }
 
if (pCreateInfo->pMultisampleState &&
pCreateInfo->pMultisampleState->rasterizationSamples > 1)
@@ -243,12 +248,14 @@ genX(graphics_pipeline_create)(
uint32_t samples = 1;
uint32_t log2_samples = __builtin_ffs(samples) - 1;
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_MULTISAMPLE),
-  .PixelLocation= PIXLOC_CENTER,
-  .NumberofMultisamples = log2_samples);
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_MULTISAMPLE), ms) {
+  ms.PixelLocation= PIXLOC_CENTER;
+  ms.NumberofMultisamples = log2_samples;
+   }
 
-   anv_batch_emit(&pipeline->batch, GENX(3DSTATE_SAMPLE_MASK),
-  .SampleMask   = 0xff);
+   anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_SAMPLE_MASK), sm) {
+  sm.SampleMask = 0xff;
+   }
 
const struct brw_vs_prog_data *vs_prog_data = get_vs_prog_data(pipeline);
 
@@ -272,71 +279,76 @@ genX(graphics_pipeline_create)(
 #endif
 
if (pipeline->vs_vec4 == NO_KERNEL || (extra && extra->disable_vs))
-  anv_batch_emit(&pipeline->batch, GENX(3DSTATE_VS), .VSFunctionEnable = false);
+  anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_VS), vs);
else
-  anv_batch_emit(&pipeline->batch, GENX(3DSTATE_VS),
- .KernelStartPointer= pipeline->vs_vec4,
- .ScratchSpaceBaseOffset= pipeline->scratch_start[MESA_SHADER_VERTEX],
- .PerThreadScratchSpace = scratch_space(&vs_prog_data->base.base),
-
- .DispatchGRFStartRegisterforURBData=
-vs_prog_data->base.base.dispatch_grf_start_reg,
- .VertexURBEntryReadLength  = vs_prog_data->base.urb_read_length,
- .VertexURBEntryReadOffset  = 0,
-
- .MaximumNumberofThreads= device->info.max_vs_threads - 1,
- .StatisticsEnable  = true,
- .VSFunctionEnable  = true);
+  anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_VS), vs) {
+ vs.KernelStartPointer = pipeline->vs_vec4;
+ 

[Mesa-dev] [PATCH 17/18] anv: Remove the old emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_private.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index d59b7ed..a682587 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -829,16 +829,6 @@ __gen_combine_address(struct anv_batch *batch, void *location,
   VG(VALGRIND_CHECK_MEM_IS_DEFINED(dst, __anv_cmd_length(struc) * 4)); \
} while (0)
 
-#define anv_batch_emit(batch, cmd, ...) do {   \
-  void *__dst = anv_batch_emit_dwords(batch, __anv_cmd_length(cmd));   \
-  struct cmd __template = {\
- __anv_cmd_header(cmd),\
- __VA_ARGS__   \
-  };   \
-  __anv_cmd_pack(cmd)(batch, __dst, &__template);  \
-  VG(VALGRIND_CHECK_MEM_IS_DEFINED(__dst, __anv_cmd_length(cmd) * 4)); \
-   } while (0)
-
 #define anv_batch_emitn(batch, n, cmd, ...) ({  \
   void *__dst = anv_batch_emit_dwords(batch, n);\
   struct cmd __template = { \
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 07/18] anv/cmd_buffer: Use the new emit macro for compute shader dispatch

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 116 -
 1 file changed, 64 insertions(+), 52 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 45b009b..4a75825 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -572,17 +572,19 @@ static void
 emit_lrm(struct anv_batch *batch,
  uint32_t reg, struct anv_bo *bo, uint32_t offset)
 {
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_MEM),
-  .RegisterAddress = reg,
-  .MemoryAddress = { bo, offset });
+   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_MEM), lrm) {
+  lrm.RegisterAddress  = reg;
+  lrm.MemoryAddress= (struct anv_address) { bo, offset };
+   }
 }
 
 static void
 emit_lri(struct anv_batch *batch, uint32_t reg, uint32_t imm)
 {
-   anv_batch_emit(batch, GENX(MI_LOAD_REGISTER_IMM),
-  .RegisterOffset = reg,
-  .DataDWord = imm);
+   anv_batch_emit_blk(batch, GENX(MI_LOAD_REGISTER_IMM), lri) {
+  lri.RegisterOffset   = reg;
+  lri.DataDWord= imm;
+   }
 }
 
 void genX(CmdDrawIndirect)(
@@ -695,18 +697,19 @@ void genX(CmdDispatch)(
 
genX(cmd_buffer_flush_compute_state)(cmd_buffer);
 
-   anv_batch_emit(&cmd_buffer->batch, GENX(GPGPU_WALKER),
-  .SIMDSize = prog_data->simd_size / 16,
-  .ThreadDepthCounterMaximum = 0,
-  .ThreadHeightCounterMaximum = 0,
-  .ThreadWidthCounterMaximum = pipeline->cs_thread_width_max - 1,
-  .ThreadGroupIDXDimension = x,
-  .ThreadGroupIDYDimension = y,
-  .ThreadGroupIDZDimension = z,
-  .RightExecutionMask = pipeline->cs_right_mask,
-  .BottomExecutionMask = 0xffffffff);
-
-   anv_batch_emit(&cmd_buffer->batch, GENX(MEDIA_STATE_FLUSH));
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(GPGPU_WALKER), ggw) {
+  ggw.SIMDSize = prog_data->simd_size / 16;
+  ggw.ThreadDepthCounterMaximum= 0;
+  ggw.ThreadHeightCounterMaximum   = 0;
+  ggw.ThreadWidthCounterMaximum= pipeline->cs_thread_width_max - 1;
+  ggw.ThreadGroupIDXDimension  = x;
+  ggw.ThreadGroupIDYDimension  = y;
+  ggw.ThreadGroupIDZDimension  = z;
+  ggw.RightExecutionMask   = pipeline->cs_right_mask;
+  ggw.BottomExecutionMask  = 0xffffffff;
+   }
+
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(MEDIA_STATE_FLUSH), msf);
 }
 
 #define GPGPU_DISPATCHDIMX 0x2500
@@ -758,48 +761,53 @@ void genX(CmdDispatchIndirect)(
emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 0);
 
/* predicate = (compute_dispatch_indirect_x_size == 0); */
-   anv_batch_emit(batch, GENX(MI_PREDICATE),
-  .LoadOperation = LOAD_LOAD,
-  .CombineOperation = COMBINE_SET,
-  .CompareOperation = COMPARE_SRCS_EQUAL);
+   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation= LOAD_LOAD;
+  mip.CombineOperation = COMBINE_SET;
+  mip.CompareOperation = COMPARE_SRCS_EQUAL;
+   }
 
/* Load compute_dispatch_indirect_y_size into SRC0 */
emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 4);
 
/* predicate |= (compute_dispatch_indirect_y_size == 0); */
-   anv_batch_emit(batch, GENX(MI_PREDICATE),
-  .LoadOperation = LOAD_LOAD,
-  .CombineOperation = COMBINE_OR,
-  .CompareOperation = COMPARE_SRCS_EQUAL);
+   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation= LOAD_LOAD;
+  mip.CombineOperation = COMBINE_OR;
+  mip.CompareOperation = COMPARE_SRCS_EQUAL;
+   }
 
/* Load compute_dispatch_indirect_z_size into SRC0 */
emit_lrm(batch, MI_PREDICATE_SRC0, bo, bo_offset + 8);
 
/* predicate |= (compute_dispatch_indirect_z_size == 0); */
-   anv_batch_emit(batch, GENX(MI_PREDICATE),
-  .LoadOperation = LOAD_LOAD,
-  .CombineOperation = COMBINE_OR,
-  .CompareOperation = COMPARE_SRCS_EQUAL);
+   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation= LOAD_LOAD;
+  mip.CombineOperation = COMBINE_OR;
+  mip.CompareOperation = COMPARE_SRCS_EQUAL;
+   }
 
/* predicate = !predicate; */
 #define COMPARE_FALSE   1
-   anv_batch_emit(batch, GENX(MI_PREDICATE),
-  .LoadOperation = LOAD_LOADINV,
-  .CombineOperation = COMBINE_OR,
-  .CompareOperation = COMPARE_FALSE);
+   anv_batch_emit_blk(batch, GENX(MI_PREDICATE), mip) {
+  mip.LoadOperation= LOAD_LOADINV;
+  mip.CombineOperation = COMBINE_OR;
+  mip.CompareOperation = COMPARE_FALSE;
+   }
 #endif
 
-   anv_batch_emit(batch, GENX(GPGPU_WALKER),
-  .IndirectParameterEnable = true,
-  .PredicateEnable = GEN_GEN <= 7,
-

[Mesa-dev] [PATCH 04/18] anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESS

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 138 -
 1 file changed, 76 insertions(+), 62 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index b21ff97..932ba65 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -49,46 +49,50 @@ genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer *cmd_buffer)
 * this, we get GPU hangs when using multi-level command buffers which
 * clear depth, reset state base address, and then go render stuff.
 */
-   anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL),
-  .RenderTargetCacheFlushEnable = true);
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
+  pc.RenderTargetCacheFlushEnable = true;
+   }
 #endif
 
-   anv_batch_emit(&cmd_buffer->batch, GENX(STATE_BASE_ADDRESS),
-  .GeneralStateBaseAddress = { scratch_bo, 0 },
-  .GeneralStateMemoryObjectControlState = GENX(MOCS),
-  .GeneralStateBaseAddressModifyEnable = true,
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(STATE_BASE_ADDRESS), sba) {
+  sba.GeneralStateBaseAddress = (struct anv_address) { scratch_bo, 0 };
+  sba.GeneralStateMemoryObjectControlState = GENX(MOCS);
+  sba.GeneralStateBaseAddressModifyEnable = true;
 
-  .SurfaceStateBaseAddress = anv_cmd_buffer_surface_base_address(cmd_buffer),
-  .SurfaceStateMemoryObjectControlState = GENX(MOCS),
-  .SurfaceStateBaseAddressModifyEnable = true,
+  sba.SurfaceStateBaseAddress =
+ anv_cmd_buffer_surface_base_address(cmd_buffer);
+  sba.SurfaceStateMemoryObjectControlState = GENX(MOCS);
+  sba.SurfaceStateBaseAddressModifyEnable = true;
 
-  .DynamicStateBaseAddress = { >dynamic_state_block_pool.bo, 0 },
-  .DynamicStateMemoryObjectControlState = GENX(MOCS),
-  .DynamicStateBaseAddressModifyEnable = true,
+  sba.DynamicStateBaseAddress =
+ (struct anv_address) { >dynamic_state_block_pool.bo, 0 };
+  sba.DynamicStateMemoryObjectControlState = GENX(MOCS);
+  sba.DynamicStateBaseAddressModifyEnable = true;
 
-  .IndirectObjectBaseAddress = { NULL, 0 },
-  .IndirectObjectMemoryObjectControlState = GENX(MOCS),
-  .IndirectObjectBaseAddressModifyEnable = true,
+  sba.IndirectObjectBaseAddress = (struct anv_address) { NULL, 0 };
+  sba.IndirectObjectMemoryObjectControlState = GENX(MOCS);
+  sba.IndirectObjectBaseAddressModifyEnable = true;
 
-  .InstructionBaseAddress = { >instruction_block_pool.bo, 0 },
-  .InstructionMemoryObjectControlState = GENX(MOCS),
-  .InstructionBaseAddressModifyEnable = true,
+  sba.InstructionBaseAddress =
+ (struct anv_address) { >instruction_block_pool.bo, 0 };
+  sba.InstructionMemoryObjectControlState = GENX(MOCS);
+  sba.InstructionBaseAddressModifyEnable = true;
 
 #  if (GEN_GEN >= 8)
   /* Broadwell requires that we specify a buffer size for a bunch of
* these fields.  However, since we will be growing the BO's live, we
* just set them all to the maximum.
*/
-  .GeneralStateBufferSize = 0xf,
-  .GeneralStateBufferSizeModifyEnable = true,
-  .DynamicStateBufferSize = 0xf,
-  .DynamicStateBufferSizeModifyEnable = true,
-  .IndirectObjectBufferSize = 0xf,
-  .IndirectObjectBufferSizeModifyEnable = true,
-  .InstructionBufferSize = 0xf,
-  .InstructionBuffersizeModifyEnable = true,
+  sba.GeneralStateBufferSize= 0xf;
+  sba.GeneralStateBufferSizeModifyEnable= true;
+  sba.DynamicStateBufferSize= 0xf;
+  sba.DynamicStateBufferSizeModifyEnable= true;
+  sba.IndirectObjectBufferSize  = 0xf;
+  sba.IndirectObjectBufferSizeModifyEnable  = true;
+  sba.InstructionBufferSize = 0xf;
+  sba.InstructionBuffersizeModifyEnable = true;
 #  endif
-   );
+   }
 
/* After re-setting the surface state base address, we have to do some
 * cache flusing so that the sampler engine will pick up the new
@@ -127,8 +131,9 @@ genX(cmd_buffer_emit_state_base_address)(struct anv_cmd_buffer *cmd_buffer)
 * units cache the binding table in the texture cache.  However, we have
 * yet to be able to actually confirm this.
 */
-   anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL),
-  .TextureCacheInvalidationEnable = true);
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
+  pc.TextureCacheInvalidationEnable = true;
+   }
 }
 
 void genX(CmdPipelineBarrier)(
@@ -414,10 +419,12 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer *cmd_buffer)
*PIPE_CONTROL needs to be sent before any combination of VS
*associated 3DSTATE."
*/
-  anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL),
- .DepthStallEnable = true,
- .PostSyncOperation = 

[Mesa-dev] [PATCH 18/18] anv: s/anv_batch_emit_blk/anv_batch_emit/

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_batch_chain.c|  6 +--
 src/intel/vulkan/anv_device.c |  8 ++--
 src/intel/vulkan/anv_private.h|  2 +-
 src/intel/vulkan/gen7_cmd_buffer.c| 39 
 src/intel/vulkan/gen7_pipeline.c  | 27 ++-
 src/intel/vulkan/gen8_cmd_buffer.c| 38 
 src/intel/vulkan/gen8_pipeline.c  | 31 ++---
 src/intel/vulkan/genX_cmd_buffer.c| 86 +--
 src/intel/vulkan/genX_pipeline.c  |  2 +-
 src/intel/vulkan/genX_pipeline_util.h | 12 ++---
 src/intel/vulkan/genX_state.c | 20 
 11 files changed, 133 insertions(+), 138 deletions(-)

diff --git a/src/intel/vulkan/anv_batch_chain.c b/src/intel/vulkan/anv_batch_chain.c
index 3bf0cd0..36c9565 100644
--- a/src/intel/vulkan/anv_batch_chain.c
+++ b/src/intel/vulkan/anv_batch_chain.c
@@ -455,7 +455,7 @@ emit_batch_buffer_start(struct anv_cmd_buffer *cmd_buffer,
const uint32_t gen8_length =
   GEN8_MI_BATCH_BUFFER_START_length - GEN8_MI_BATCH_BUFFER_START_length_bias;
 
-   anv_batch_emit_blk(&cmd_buffer->batch, GEN8_MI_BATCH_BUFFER_START, bbs) {
+   anv_batch_emit(&cmd_buffer->batch, GEN8_MI_BATCH_BUFFER_START, bbs) {
   bbs.DWordLength   = cmd_buffer->device->info.gen < 8 ?
   gen7_length : gen8_length;
   bbs._2ndLevelBatchBuffer  = _1stlevelbatch;
@@ -712,11 +712,11 @@ anv_cmd_buffer_end_batch_buffer(struct anv_cmd_buffer *cmd_buffer)
   cmd_buffer->batch.end += GEN8_MI_BATCH_BUFFER_START_length * 4;
   assert(cmd_buffer->batch.end == batch_bo->bo.map + batch_bo->bo.size);
 
-  anv_batch_emit_blk(&cmd_buffer->batch, GEN7_MI_BATCH_BUFFER_END, bbe);
+  anv_batch_emit(&cmd_buffer->batch, GEN7_MI_BATCH_BUFFER_END, bbe);
 
   /* Round batch up to an even number of dwords. */
   if ((cmd_buffer->batch.next - cmd_buffer->batch.start) & 4)
- anv_batch_emit_blk(&cmd_buffer->batch, GEN7_MI_NOOP, noop);
+ anv_batch_emit(&cmd_buffer->batch, GEN7_MI_NOOP, noop);
 
   cmd_buffer->exec_mode = ANV_CMD_BUFFER_EXEC_MODE_PRIMARY;
}
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index c2c2db8..00edd95 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1081,8 +1081,8 @@ VkResult anv_DeviceWaitIdle(
batch.start = batch.next = cmds;
batch.end = (void *) cmds + sizeof(cmds);
 
-   anv_batch_emit_blk(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
-   anv_batch_emit_blk(&batch, GEN7_MI_NOOP, noop);
+   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
 
return anv_device_submit_simple_batch(device, &batch);
 }
@@ -1423,8 +1423,8 @@ VkResult anv_CreateFence(
const uint32_t batch_offset = align_u32(sizeof(*fence), CACHELINE_SIZE);
batch.next = batch.start = fence->bo.map + batch_offset;
batch.end = fence->bo.map + fence->bo.size;
-   anv_batch_emit_blk(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
-   anv_batch_emit_blk(&batch, GEN7_MI_NOOP, noop);
+   anv_batch_emit(&batch, GEN7_MI_BATCH_BUFFER_END, bbe);
+   anv_batch_emit(&batch, GEN7_MI_NOOP, noop);
 
if (!device->info.has_llc) {
   assert(((uintptr_t) batch.start & CACHELINE_MASK) == 0);
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index a682587..cbf9f96 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -851,7 +851,7 @@ __gen_combine_address(struct anv_batch *batch, void *location,
   VG(VALGRIND_CHECK_MEM_IS_DEFINED(dw, ARRAY_SIZE(dwords0) * 4));\
} while (0)
 
-#define anv_batch_emit_blk(batch, cmd, name)\
+#define anv_batch_emit(batch, cmd, name)\
for (struct cmd name = { __anv_cmd_header(cmd) },\
 *_dst = anv_batch_emit_dwords(batch, __anv_cmd_length(cmd));\
 __builtin_expect(_dst != NULL, 1);  \
diff --git a/src/intel/vulkan/gen7_cmd_buffer.c b/src/intel/vulkan/gen7_cmd_buffer.c
index 88964da..9bc949d 100644
--- a/src/intel/vulkan/gen7_cmd_buffer.c
+++ b/src/intel/vulkan/gen7_cmd_buffer.c
@@ -57,8 +57,8 @@ gen7_cmd_buffer_emit_descriptor_pointers(struct anv_cmd_buffer *cmd_buffer,
 
anv_foreach_stage(s, stages) {
   if (cmd_buffer->state.samplers[s].alloc_size > 0) {
- anv_batch_emit_blk(&cmd_buffer->batch,
-GENX(3DSTATE_SAMPLER_STATE_POINTERS_VS), ssp) {
+ anv_batch_emit(&cmd_buffer->batch,
+GENX(3DSTATE_SAMPLER_STATE_POINTERS_VS), ssp) {
 ssp._3DCommandSubOpcode = sampler_state_opcodes[s];
 ssp.PointertoVSSamplerState = cmd_buffer->state.samplers[s].offset;
  }
@@ -66,8 +66,8 @@ gen7_cmd_buffer_emit_descriptor_pointers(struct anv_cmd_buffer *cmd_buffer,
 
   /* Always emit binding table pointers if we're asked to, since on SKL
* this is what flushes push constants. */
-  anv_batch_emit_blk(&cmd_buffer->batch,
-  

[Mesa-dev] [PATCH 06/18] anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 713de82..45b009b 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -295,20 +295,21 @@ cmd_buffer_flush_push_constants(struct anv_cmd_buffer *cmd_buffer)
   struct anv_state state = anv_cmd_buffer_push_constants(cmd_buffer, stage);
 
   if (state.offset == 0) {
- anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS),
-._3DCommandSubOpcode = push_constant_opcodes[stage]);
+ anv_batch_emit_blk(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c)
+c._3DCommandSubOpcode = push_constant_opcodes[stage];
   } else {
- anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS),
-._3DCommandSubOpcode = push_constant_opcodes[stage],
-.ConstantBody = {
+ anv_batch_emit_blk(&cmd_buffer->batch, GENX(3DSTATE_CONSTANT_VS), c) {
+c._3DCommandSubOpcode = push_constant_opcodes[stage];
+c.ConstantBody = (struct GENX(3DSTATE_CONSTANT_BODY)) {
 #if GEN_GEN >= 9
-   .PointerToConstantBuffer2 = { &cmd_buffer->device->dynamic_state_block_pool.bo, state.offset },
-   .ConstantBuffer2ReadLength = DIV_ROUND_UP(state.alloc_size, 32),
+   .PointerToConstantBuffer2 = { &cmd_buffer->device->dynamic_state_block_pool.bo, state.offset },
+   .ConstantBuffer2ReadLength = DIV_ROUND_UP(state.alloc_size, 32),
 #else
-   .PointerToConstantBuffer0 = { .offset = state.offset },
-   .ConstantBuffer0ReadLength = DIV_ROUND_UP(state.alloc_size, 32),
+   .PointerToConstantBuffer0 = { .offset = state.offset },
+   .ConstantBuffer0ReadLength = DIV_ROUND_UP(state.alloc_size, 32),
 #endif
-});
+};
+ }
   }
 
   flushed |= mesa_to_vk_shader_stage(stage);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 08/18] anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_cmd_buffer.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 4a75825..abf0961 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -1058,15 +1058,16 @@ void genX(CmdBeginRenderPass)(
 
const VkRect2D *render_area = >renderArea;
 
-   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_DRAWING_RECTANGLE),
-  .ClippedDrawingRectangleYMin = MAX2(render_area->offset.y, 0),
-  .ClippedDrawingRectangleXMin = MAX2(render_area->offset.x, 0),
-  .ClippedDrawingRectangleYMax =
- render_area->offset.y + render_area->extent.height - 1,
-  .ClippedDrawingRectangleXMax =
- render_area->offset.x + render_area->extent.width - 1,
-  .DrawingRectangleOriginY = 0,
-  .DrawingRectangleOriginX = 0);
+   anv_batch_emit_blk(&cmd_buffer->batch, GENX(3DSTATE_DRAWING_RECTANGLE), r) {
+  r.ClippedDrawingRectangleYMin = MAX2(render_area->offset.y, 0);
+  r.ClippedDrawingRectangleXMin = MAX2(render_area->offset.x, 0);
+  r.ClippedDrawingRectangleYMax =
+ render_area->offset.y + render_area->extent.height - 1;
+  r.ClippedDrawingRectangleXMax =
+ render_area->offset.x + render_area->extent.width - 1;
+  r.DrawingRectangleOriginY = 0;
+  r.DrawingRectangleOriginX = 0;
+   }
 
genX(cmd_buffer_set_subpass)(cmd_buffer, pass->subpasses);
anv_cmd_buffer_clear_subpass(cmd_buffer);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 11/18] anv/genX_pipeline: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_pipeline.c  | 25 +++
 src/intel/vulkan/genX_pipeline_util.h | 58 +++
 2 files changed, 45 insertions(+), 38 deletions(-)

diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index cc8841e..776415a 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -105,23 +105,24 @@ genX(compute_pipeline_create)(
const uint32_t vfe_curbe_allocation =
   push_constant_regs * pipeline->cs_thread_width_max;
 
-   anv_batch_emit(&pipeline->batch, GENX(MEDIA_VFE_STATE),
-  .ScratchSpaceBasePointer = pipeline->scratch_start[MESA_SHADER_COMPUTE],
-  .PerThreadScratchSpace = ffs(cs_prog_data->base.total_scratch / 2048),
+   anv_batch_emit_blk(&pipeline->batch, GENX(MEDIA_VFE_STATE), vfe) {
+  vfe.ScratchSpaceBasePointer = pipeline->scratch_start[MESA_SHADER_COMPUTE];
+  vfe.PerThreadScratchSpace  = ffs(cs_prog_data->base.total_scratch / 2048);
 #if GEN_GEN > 7
-  .ScratchSpaceBasePointerHigh = 0,
-  .StackSize = 0,
+  vfe.ScratchSpaceBasePointerHigh = 0;
+  vfe.StackSize  = 0;
 #else
-  .GPGPUMode = true,
+  vfe.GPGPUMode  = true;
 #endif
-  .MaximumNumberofThreads = device->info.max_cs_threads - 1,
-  .NumberofURBEntries = GEN_GEN <= 7 ? 0 : 2,
-  .ResetGatewayTimer = true,
+  vfe.MaximumNumberofThreads = device->info.max_cs_threads - 1;
+  vfe.NumberofURBEntries = GEN_GEN <= 7 ? 0 : 2;
+  vfe.ResetGatewayTimer  = true;
 #if GEN_GEN <= 8
-  .BypassGatewayControl = true,
+  vfe.BypassGatewayControl   = true;
 #endif
-  .URBEntryAllocationSize = GEN_GEN <= 7 ? 0 : 2,
-  .CURBEAllocationSize = vfe_curbe_allocation);
+  vfe.URBEntryAllocationSize = GEN_GEN <= 7 ? 0 : 2;
+  vfe.CURBEAllocationSize= vfe_curbe_allocation;
+   }
 
*pPipeline = anv_pipeline_to_handle(pipeline);
 
diff --git a/src/intel/vulkan/genX_pipeline_util.h b/src/intel/vulkan/genX_pipeline_util.h
index 654d2e0..46be36d 100644
--- a/src/intel/vulkan/genX_pipeline_util.h
+++ b/src/intel/vulkan/genX_pipeline_util.h
@@ -130,12 +130,13 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* that controls instancing.  On Haswell and prior, that's part of
* VERTEX_BUFFER_STATE which we emit later.
*/
-  anv_batch_emit(&pipeline->batch, GENX(3DSTATE_VF_INSTANCING),
- .InstancingEnable = pipeline->instancing_enable[desc->binding],
- .VertexElementIndex = slot,
- /* Vulkan so far doesn't have an instance divisor, so
-  * this is always 1 (ignored if not instancing). */
- .InstanceDataStepRate = 1);
+  anv_batch_emit_blk(&pipeline->batch, GENX(3DSTATE_VF_INSTANCING), vfi) {
+ vfi.InstancingEnable = pipeline->instancing_enable[desc->binding];
+ vfi.VertexElementIndex = slot;
+ /* Vulkan so far doesn't have an instance divisor, so
+  * this is always 1 (ignored if not instancing). */
+ vfi.InstanceDataStepRate = 1;
+  }
 #endif
}
 
@@ -172,13 +173,14 @@ emit_vertex_input(struct anv_pipeline *pipeline,
}
 
 #if GEN_GEN >= 8
-   anv_batch_emit(>batch, GENX(3DSTATE_VF_SGVS),
-  .VertexIDEnable = vs_prog_data->uses_vertexid,
-  .VertexIDComponentNumber = 2,
-  .VertexIDElementOffset = id_slot,
-  .InstanceIDEnable = vs_prog_data->uses_instanceid,
-  .InstanceIDComponentNumber = 3,
-  .InstanceIDElementOffset = id_slot);
+   anv_batch_emit_blk(>batch, GENX(3DSTATE_VF_SGVS), sgvs) {
+  sgvs.VertexIDEnable  = vs_prog_data->uses_vertexid;
+  sgvs.VertexIDComponentNumber = 2;
+  sgvs.VertexIDElementOffset   = id_slot;
+  sgvs.InstanceIDEnable= vs_prog_data->uses_instanceid;
+  sgvs.InstanceIDComponentNumber   = 3;
+  sgvs.InstanceIDElementOffset = id_slot;
+   }
 #endif
 }
 
@@ -196,28 +198,32 @@ emit_urb_setup(struct anv_pipeline *pipeline)
 *3DSTATE_SAMPLER_STATE_POINTER_VS command.  Only one PIPE_CONTROL
 *needs to be sent before any combination of VS associated 3DSTATE."
 */
-   anv_batch_emit(>batch, GEN7_PIPE_CONTROL,
-  .DepthStallEnable = true,
-  .PostSyncOperation = WriteImmediateData,
-  .Address = { >workaround_bo, 0 });
+   anv_batch_emit_blk(>batch, GEN7_PIPE_CONTROL, pc) {
+  pc.DepthStallEnable  = true;
+  pc.PostSyncOperation = WriteImmediateData;
+  pc.Address   = (struct anv_address) { >workaround_bo, 0 
};
+   }
 #endif
 
unsigned push_start = 0;
for (int i = MESA_SHADER_VERTEX; i <= MESA_SHADER_FRAGMENT; i++) {
   unsigned push_size = pipeline->urb.push_size[i];
-  

[Mesa-dev] [PATCH 13/18] anv/state: Use the new emit macro

2016-04-18 Thread Jason Ekstrand
---
 src/intel/vulkan/genX_state.c | 155 +-
 1 file changed, 78 insertions(+), 77 deletions(-)

diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 900f6dc..b997e1b 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -44,97 +44,98 @@ genX(init_device_state)(struct anv_device *device)
batch.start = batch.next = cmds;
batch.end = (void *) cmds + sizeof(cmds);
 
-   anv_batch_emit(, GENX(PIPELINE_SELECT),
+   anv_batch_emit_blk(, GENX(PIPELINE_SELECT), ps) {
 #if GEN_GEN >= 9
-  .MaskBits = 3,
+  ps.MaskBits = 3;
 #endif
-  .PipelineSelection = _3D);
+  ps.PipelineSelection = _3D;
+   }
+
+   anv_batch_emit_blk(, GENX(3DSTATE_VF_STATISTICS), vfs)
+  vfs.StatisticsEnable = true;
 
-   anv_batch_emit(, GENX(3DSTATE_VF_STATISTICS),
-  .StatisticsEnable = true);
-   anv_batch_emit(, GENX(3DSTATE_HS));
-   anv_batch_emit(, GENX(3DSTATE_TE));
-   anv_batch_emit(, GENX(3DSTATE_DS));
+   anv_batch_emit_blk(, GENX(3DSTATE_HS), hs);
+   anv_batch_emit_blk(, GENX(3DSTATE_TE), ts);
+   anv_batch_emit_blk(, GENX(3DSTATE_DS), ds);
 
-   anv_batch_emit(, GENX(3DSTATE_STREAMOUT), .SOFunctionEnable = false);
-   anv_batch_emit(, GENX(3DSTATE_AA_LINE_PARAMETERS));
+   anv_batch_emit_blk(, GENX(3DSTATE_STREAMOUT), so);
+   anv_batch_emit_blk(, GENX(3DSTATE_AA_LINE_PARAMETERS), aa);
 
 #if GEN_GEN >= 8
-   anv_batch_emit(, GENX(3DSTATE_WM_CHROMAKEY),
-  .ChromaKeyKillEnable = false);
+   anv_batch_emit_blk(, GENX(3DSTATE_WM_CHROMAKEY), ck);
 
/* See the Vulkan 1.0 spec Table 24.1 "Standard sample locations" and
 * VkPhysicalDeviceFeatures::standardSampleLocations.
 */
-   anv_batch_emit(, GENX(3DSTATE_SAMPLE_PATTERN),
-  ._1xSample0XOffset  = 0.5,
-  ._1xSample0YOffset  = 0.5,
-  ._2xSample0XOffset  = 0.25,
-  ._2xSample0YOffset  = 0.25,
-  ._2xSample1XOffset  = 0.75,
-  ._2xSample1YOffset  = 0.75,
-  ._4xSample0XOffset  = 0.375,
-  ._4xSample0YOffset  = 0.125,
-  ._4xSample1XOffset  = 0.875,
-  ._4xSample1YOffset  = 0.375,
-  ._4xSample2XOffset  = 0.125,
-  ._4xSample2YOffset  = 0.625,
-  ._4xSample3XOffset  = 0.625,
-  ._4xSample3YOffset  = 0.875,
-  ._8xSample0XOffset  = 0.5625,
-  ._8xSample0YOffset  = 0.3125,
-  ._8xSample1XOffset  = 0.4375,
-  ._8xSample1YOffset  = 0.6875,
-  ._8xSample2XOffset  = 0.8125,
-  ._8xSample2YOffset  = 0.5625,
-  ._8xSample3XOffset  = 0.3125,
-  ._8xSample3YOffset  = 0.1875,
-  ._8xSample4XOffset  = 0.1875,
-  ._8xSample4YOffset  = 0.8125,
-  ._8xSample5XOffset  = 0.0625,
-  ._8xSample5YOffset  = 0.4375,
-  ._8xSample6XOffset  = 0.6875,
-  ._8xSample6YOffset  = 0.9375,
-  ._8xSample7XOffset  = 0.9375,
-  ._8xSample7YOffset  = 0.0625,
+   anv_batch_emit_blk(, GENX(3DSTATE_SAMPLE_PATTERN), sp) {
+  sp._1xSample0XOffset= 0.5;
+  sp._1xSample0YOffset= 0.5;
+  sp._2xSample0XOffset= 0.25;
+  sp._2xSample0YOffset= 0.25;
+  sp._2xSample1XOffset= 0.75;
+  sp._2xSample1YOffset= 0.75;
+  sp._4xSample0XOffset= 0.375;
+  sp._4xSample0YOffset= 0.125;
+  sp._4xSample1XOffset= 0.875;
+  sp._4xSample1YOffset= 0.375;
+  sp._4xSample2XOffset= 0.125;
+  sp._4xSample2YOffset= 0.625;
+  sp._4xSample3XOffset= 0.625;
+  sp._4xSample3YOffset= 0.875;
+  sp._8xSample0XOffset= 0.5625;
+  sp._8xSample0YOffset= 0.3125;
+  sp._8xSample1XOffset= 0.4375;
+  sp._8xSample1YOffset= 0.6875;
+  sp._8xSample2XOffset= 0.8125;
+  sp._8xSample2YOffset= 0.5625;
+  sp._8xSample3XOffset= 0.3125;
+  sp._8xSample3YOffset= 0.1875;
+  sp._8xSample4XOffset= 0.1875;
+  sp._8xSample4YOffset= 0.8125;
+  sp._8xSample5XOffset= 0.0625;
+  sp._8xSample5YOffset= 0.4375;
+  sp._8xSample6XOffset= 0.6875;
+  sp._8xSample6YOffset= 0.9375;
+  sp._8xSample7XOffset= 0.9375;
+  sp._8xSample7YOffset= 0.0625;
 #if GEN_GEN >= 9
-  ._16xSample0XOffset = 0.5625,
-  ._16xSample0YOffset = 0.5625,
-  ._16xSample1XOffset = 0.4375,
-  ._16xSample1YOffset = 0.3125,
-  ._16xSample2XOffset = 0.3125,
-  ._16xSample2YOffset = 0.6250,
-  ._16xSample3XOffset = 0.7500,
-  ._16xSample3YOffset = 0.4375,
-  ._16xSample4XOffset = 0.1875,
-  ._16xSample4YOffset = 0.3750,
-  ._16xSample5XOffset = 0.6250,
-  ._16xSample5YOffset = 0.8125,
-  ._16xSample6XOffset = 0.8125,
-  ._16xSample6YOffset = 0.6875,
-  ._16xSample7XOffset = 0.6875,
-  ._16xSample7YOffset = 0.1875,
-  ._16xSample8XOffset

[Mesa-dev] [PATCH 00/18] anv: Switch to a new emit macro

2016-04-18 Thread Jason Ekstrand
The first patch in this series adds a short style guide for the Vulkan
driver.  The rest adds a new emit macro and updates the entire driver to
use it and, while we're there, makes the style more consistent.
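
For readers who have not seen the pattern, here is a minimal sketch of how a
block-based emit macro of this kind can be built on a for loop. It is not the
macro from the series: toy_batch, toy_cmd, toy_pack and toy_emit_blk are
hypothetical names made up for illustration, and the real macro packs
generated command structs rather than two plain fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy batch: a fixed array of dwords plus a fill count (hypothetical). */
struct toy_batch {
   uint32_t dw[16];
   unsigned len;
};

/* Toy command with two fields (hypothetical stand-in for a real command). */
struct toy_cmd {
   uint32_t enable;
   uint32_t index;
};

/* Append the two fields of the command to the batch. */
static void
toy_pack(struct toy_batch *batch, const struct toy_cmd *cmd)
{
   assert(batch->len + 2 <= 16);
   batch->dw[batch->len++] = cmd->enable;
   batch->dw[batch->len++] = cmd->index;
}

/* Declare a zeroed `name`, run the statement that follows exactly once,
 * then pack `name` into the batch in the loop's increment expression.
 */
#define toy_emit_blk(batch, name)                                  \
   for (struct toy_cmd name = { 0 }, *toy_once_ = &name;           \
        toy_once_ != NULL;                                         \
        toy_pack((batch), &name), toy_once_ = NULL)

/* Usage: set fields inside the block; packing happens when it ends. */
void
toy_emit_example(struct toy_batch *batch)
{
   toy_emit_blk(batch, cmd) {
      cmd.enable = 1;
      cmd.index = 7;
   }
}
```

Because the macro expands to a for statement, it accepts a braced block, a
single statement, or an empty statement (a lone semicolon), which emits the
zero-initialized command. One caveat of this sketch: a `break` inside the
block skips the packing step.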

Jason Ekstrand (18):
  anv: Add a short style guide
  anv: Add a new block-based batch emit macro
  anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands
  anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and
STATE_BASE_ADDRESS
  anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER
  anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT
  anv/cmd_buffer: Use the new emit macro for compute shader dispatch
  anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE
  anv/cmd_buffer: Use the new emit macro for quaries
  anv/gen8_cmd_buffer: Use the new emit macro
  anv/genX_pipeline: Use the new emit macro
  anv/gen8_pipeline: Use the new emit macro
  anv/state: Use the new emit macro
  anv/device: Use the new emit macro
  anv/gen7_cmd_buffer: Use the new emit macro
  anv/gen7_pipeline: Use the new emit macro
  anv: Remove the old emit macro
  anv: s/anv_batch_emit_blk/anv_batch_emit/

 src/intel/vulkan/STYLE|  67 +
 src/intel/vulkan/anv_batch_chain.c|  17 +-
 src/intel/vulkan/anv_device.c |   8 +-
 src/intel/vulkan/anv_private.h|  19 +-
 src/intel/vulkan/gen7_cmd_buffer.c| 115 
 src/intel/vulkan/gen7_pipeline.c  | 236 +
 src/intel/vulkan/gen8_cmd_buffer.c| 155 ++-
 src/intel/vulkan/gen8_pipeline.c  | 367 +-
 src/intel/vulkan/genX_cmd_buffer.c| 480 +++---
 src/intel/vulkan/genX_pipeline.c  |  25 +-
 src/intel/vulkan/genX_pipeline_util.h |  58 ++--
 src/intel/vulkan/genX_state.c | 155 +--
 12 files changed, 938 insertions(+), 764 deletions(-)
 create mode 100644 src/intel/vulkan/STYLE

-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH 1/2] i965/vec4: Lower integer multiplication after optimizations.

2016-04-18 Thread Ian Romanick
On 04/18/2016 04:14 PM, Matt Turner wrote:
> Analogous to commit 1e4e17fbd in the i965/fs backend.
> 
> Because the copy propagation pass in the vec4 backend is strictly local,
> we look at the immediate values coming from NIR and emit the multiplies
> we need directly. If the copy propagation pass becomes smarter in the
> future, we can reduce the nir_op_imul case in brw_vec4_nir.cpp to a
> single multiply.
> 
> total instructions in shared programs: 7082311 -> 7081953 (-0.01%)
> instructions in affected programs: 59581 -> 59223 (-0.60%)
> helped: 293
> 
> total cycles in shared programs: 65765712 -> 65764796 (-0.00%)
> cycles in affected programs: 854112 -> 853196 (-0.11%)
> helped: 154
> HURT: 73
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 67 
> ++
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 48 +
>  3 files changed, 88 insertions(+), 28 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index b9cf3f6..1644d4d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1671,6 +1671,71 @@ vec4_visitor::lower_minmax()
> return progress;
>  }
>  
> +bool
> +vec4_visitor::lower_integer_multiplication()
> +{
> +   bool progress = false;
> +
> +   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
> +  const vec4_builder ibld(this, block, inst);
> +
> +  if (inst->opcode == BRW_OPCODE_MUL) {
> + if (inst->dst.is_accumulator() ||
> + (inst->src[1].type != BRW_REGISTER_TYPE_D &&
> +  inst->src[1].type != BRW_REGISTER_TYPE_UD))
> +continue;
> +
> + /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
> +  * operation directly, but CHV/BXT cannot.
> +  */
> + if (devinfo->gen >= 8 &&
> + !devinfo->is_cherryview && !devinfo->is_broxton)
> +continue;

Shouldn't this whole method just bail if we're Gen >= 8 and !CHV and
!BXT?  Or does this structure simplify future changes?

> +
> + if (inst->src[1].file == IMM &&
> + inst->src[1].ud < (1 << 16)) {
> +/* The MUL instruction isn't commutative. On Gen <= 6, only the 
> low
> + * 16-bits of src0 are read, and on Gen >= 7 only the low 
> 16-bits of
> + * src1 are used.
> + *
> + * If multiplying by an immediate value that fits in 16-bits, do 
> a
> + * single MUL instruction with that value in the proper location.
> + */
> +if (devinfo->gen < 7) {
> +   dst_reg imm(VGRF, alloc.allocate(1), inst->dst.type,
> +   inst->dst.writemask);
> +   ibld.MOV(imm, inst->src[1]);
> +   ibld.MUL(inst->dst, src_reg(imm), inst->src[0]);
> +} else {
> +   ibld.MUL(inst->dst, inst->src[0], inst->src[1]);
> +}
> + } else {
> +const dst_reg acc(brw_writemask(retype(brw_acc_reg(8),
> +   inst->dst.type),
> +inst->dst.writemask));
> +const dst_reg null(brw_writemask(retype(brw_null_reg(),
> +inst->dst.type),
> + inst->dst.writemask));
> +
> +ibld.MUL(acc, inst->src[0], inst->src[1]);
> +ibld.MACH(null, inst->src[0], inst->src[1]);
> +set_condmod(inst->conditional_mod,
> +ibld.MOV(inst->dst, src_reg(acc)));
> + }
> +  } else {
> + continue;
> +  }
> +
> +  inst->remove(block);
> +  progress = true;
> +   }
> +
> +   if (progress)
> +  invalidate_live_intervals();
> +
> +   return progress;
> +}
> +
>  src_reg
>  vec4_visitor::get_timestamp()
>  {
> @@ -1950,6 +2015,8 @@ vec4_visitor::run()
>OPT(dead_code_eliminate);
> }
>  
> +   OPT(lower_integer_multiplication);
> +
> if (failed)
>return false;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index d43a5a8..f6f8b12 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -301,6 +301,7 @@ public:
> void resolve_ud_negate(src_reg *reg);
>  
> bool lower_minmax();
> +   bool lower_integer_multiplication();
>  
> src_reg get_timestamp();
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index e4e8c38..10e2f54 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1039,35 +1039,27 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>break;
>  
> case nir_op_imul: {
> -  if (devinfo->gen < 8) {
> - 
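
The 16-bit restriction described in the quoted comment can be modelled in
plain C. This is only an arithmetic illustration with made-up helper names
(toy_mul_32x16 and friends); it does not model the accumulator-based MUL/MACH
sequence the patch emits, only why an immediate below 1 << 16 needs a single
multiply while a wider operand does not.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a multiply that reads only the low 16 bits of its second
 * source, as the comment describes for src1 on Gen >= 7.
 */
static uint32_t
toy_mul_32x16(uint32_t a, uint32_t b)
{
   return a * (b & 0xffffu);
}

/* An immediate below 1 << 16 loses nothing in the truncated multiply. */
static int
toy_fits_in_16_bits(uint32_t imm)
{
   return imm < (1u << 16);
}

/* Low 32 bits of a full 32x32 product, built from two truncated
 * multiplies: a * b == a * b_lo + ((a * b_hi) << 16)  (mod 2^32).
 */
static uint32_t
toy_mul_32x32_lo(uint32_t a, uint32_t b)
{
   uint32_t lo = toy_mul_32x16(a, b);
   uint32_t hi = toy_mul_32x16(a, b >> 16);

   return lo + (hi << 16);
}

/* Usage: small immediates take the one-multiply path. */
void
toy_mul_example(void)
{
   assert(toy_fits_in_16_bits(0xffffu));
   assert(!toy_fits_in_16_bits(0x10000u));
   assert(toy_mul_32x16(7u, 3u) == 21u);
   assert(toy_mul_32x32_lo(100000u, 100000u) == 100000u * 100000u);
}
```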

Re: [Mesa-dev] [PATCH] glsl: add forgotten textureOffset function for sampler2DArrayShadow

2016-04-18 Thread Roland Scheidegger
Am 19.04.2016 um 00:43 schrieb Ian Romanick:
> On 04/18/2016 02:05 PM, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> This was part of EXT_gpu_shader4 - as such it should have been supported
>> by glsl 130.
>> It was however forgotten, and not added until glsl 430 - with the wrong
>> syntax no less (glsl 430 mentions it was overlooked).
>> glsl 440 (but revision 8 only) fixed this finally for good.
>> It looks like most other implementations would support this with older
>> glsl versions as well, so just add this to the other glsl 130 textureOffset
> 
> Can you clarify this?  I believe that it would work on NVIDIA when the
> shader doesn't specify an explicit version.  Does it work on NVIDIA with
> an explicit #version 130?  How about AMD?
I haven't tried myself. Someone else however working on our gl backend
hit this issue on intel drivers (we just assumed the function was
available). With nvidia it works with #version 150 at least (and I can't
see why it would make sense to support it in 150 but not 130). Not sure
about AMD. (It actually seems problematic to detect if it's working or
not, since we don't usually require glsl 4.40, it was wrong in 4.30
even, and the spec doesn't really say it should be retroactively enabled
on older versions.)
I suppose I should add it to some piglit really...

> 
>> functions.
>>
>> (Completely untested...)
>> ---
>>  src/compiler/glsl/builtin_functions.cpp | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/compiler/glsl/builtin_functions.cpp 
>> b/src/compiler/glsl/builtin_functions.cpp
>> index f488434..004beb9 100644
>> --- a/src/compiler/glsl/builtin_functions.cpp
>> +++ b/src/compiler/glsl/builtin_functions.cpp
>> @@ -1707,6 +1707,9 @@ builtin_builder::create_builtins()
>>  _texture(ir_tex, v130, glsl_type::uvec4_type, 
>> glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
>>  
>>  _texture(ir_tex, v130, glsl_type::float_type, 
>> glsl_type::sampler1DArrayShadow_type, glsl_type::vec3_type, TEX_OFFSET),
>> +/* the next one was forgotten in glsl spec (it's from 
>> EXT_gpu_shader4 initially),
>> +   had wrong syntax in 4.30, correct only in 4.40 but allow 
>> it in 130 */
> 
> How about:
> 
> /* The next one was forgotten in GLSL 1.30 spec.  It's from
>  * EXT_gpu_shader4 originally.  It was added in 4.30 with the
>  * wrong syntax.  This was corrected in 4.40.  4.30 indicates
>  * that it was intended to be included previously, so allow it
>  * in 1.30.
>  */
Ahh yes that sounds great.

Roland


> 
>> +_texture(ir_tex, v130, glsl_type::float_type, 
>> glsl_type::sampler2DArrayShadow_type, glsl_type::vec4_type, TEX_OFFSET),
>>  
>>  _texture(ir_txb, v130_fs_only, glsl_type::vec4_type,  
>> glsl_type::sampler1D_type,  glsl_type::float_type, TEX_OFFSET),
>>  _texture(ir_txb, v130_fs_only, glsl_type::ivec4_type, 
>> glsl_type::isampler1D_type, glsl_type::float_type, TEX_OFFSET),
>>
> 



Re: [Mesa-dev] [PATCH 4/4] i965: Properly handle integer types in opt_vector_float().

2016-04-18 Thread Kenneth Graunke
On Monday, April 18, 2016 4:20:50 PM PDT Iago Toral wrote:
> On Sun, 2016-04-17 at 23:14 -0700, Kenneth Graunke wrote:
> > Previously, opt_vector_float() always interpreted MOV sources as
> > floating point, and always created a MOV with a F-type destination.
> > 
> > This meant that we could mess up sequences of integer loads, such as:
> > 
> >mov vgrf6.0.x:D, 0D
> >mov vgrf6.0.y:D, 1D
> >mov vgrf6.0.z:D, 2D
> >mov vgrf6.0.w:D, 3D
> > 
> > Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
> > 
> >mov vgrf6.0:F, [0F, 0F, 0F, 0F]
> > 
> > which is clearly wrong.  We can properly handle this by converting
> > integer values to float (rather than bitcasting), and emitting a type
> > converting MOV:
> > 
> >mov vgrf6.0:D, [0F, 1F, 2F, 3F]
> > 
> > To do this, we first see if the integer values (converted to float)
> > are representable.  If so, we use a D-type MOV.  If not, we then try
> > the floating point values and an F-type MOV.  We make zero not impose
> > type restrictions.  This is important because 0D would imply a D-type
> > MOV, but is often used in sequences such as MOV 0D, MOV 0x3f80D,
> > where we want to use an F-type MOV.
> > 
> > Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
> > recently became visible due to changes in opt_vector_float() which
> > made it optimize more cases, but it was a pre-existing bug.
> > 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp | 24 
> >  1 file changed, 20 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/
dri/i965/brw_vec4.cpp
> > index 12c3c66..2bdcf1f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > @@ -361,9 +361,11 @@ vec4_visitor::opt_vector_float()
> > int inst_count = 0;
> > vec4_instruction *imm_inst[4];
> > unsigned writemask = 0;
> > +   enum brw_reg_type dest_type = BRW_REGISTER_TYPE_F;
> >  
> > foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
> >int vf = -1;
> > +  enum brw_reg_type need_type;
> >  
> >/* Look for unconditional MOVs from an immediate with a partial
> > * writemask.  Skip type-conversion MOVs other than integer 0,
> > @@ -375,14 +377,26 @@ vec4_visitor::opt_vector_float()
> >inst->predicate == BRW_PREDICATE_NONE &&
> >inst->dst.writemask != WRITEMASK_XYZW &&
> >(inst->src[0].type == inst->dst.type || inst->src[0].d == 0)) {
> > - vf = brw_float_to_vf(inst->src[0].f);
> > + vf = brw_float_to_vf(inst->src[0].d);
> > + need_type = BRW_REGISTER_TYPE_D;
> > +
> > + if (vf == -1) {
> > +vf = brw_float_to_vf(inst->src[0].f);
> > +need_type = BRW_REGISTER_TYPE_F;
> > + }
> 
> If we are packing actual float values (not integers), doesn't this mean
> that we re-interpret them as integers and convert the re-interpreted
> integer value to float? If the result of that sequence of operations is
> representable it seems that we would just use a D-MOV from a float that
> no longer represents the original value, right?
> 
> Example:
> 
> .f = 5.27 (0x40a8a3d7)
> .d = 1084793815 (0x40a8a3d7)
> 
> so we would do brw_float_to_vf(1084793815.0) instead of
> brw_float_to_vf(5.27), which does not look right.

No, I believe this should work.

Patch 3 makes us stop considering type-converting MOVs.  So, whatever
MOVs we're looking at will be F -> F or D -> D.

If we had

   mov(8) dst<1>D 1084793815D

it would first try to do:

   vf = brw_float_to_vf(inst->src[0].d);

Since brw_float_to_vf takes a float parameter, this is actually:

   vf = brw_float_to_vf((float) inst->src[0].d);

   (this might have been unclear, sorry!)

So our example would try:

   vf = brw_float_to_vf(1084793815.0f);

If this were representable (it isn't), then we would generate:

   mov(8) dst<1>D [1084793815.0f, ...]VF

Because it's a type converting MOV, the 1084793815.0f source would be
converted to integer 1084793815 and stored.  Reading dst<8,8,1>F would
then read 5.27, as expected.
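
To make the distinction concrete, here is a small standalone C sketch of value
conversion versus bitcasting. The toy_* helpers are hypothetical and stand in
for any function with a float parameter; brw_float_to_vf() itself is not
reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a helper that takes a float parameter. */
static float
toy_takes_float(float x)
{
   return x;
}

/* Value conversion: passing an integer where a float is expected makes
 * the compiler convert the value, so 3 becomes 3.0f.
 */
static float
toy_convert_to_float(int32_t d)
{
   return toy_takes_float(d);
}

/* Bitcast: reinterpret the same 32 bits as a float, so 3 becomes a tiny
 * denormal that is approximately 0.0f.
 */
static float
toy_bitcast_to_float(int32_t d)
{
   float f;

   assert(sizeof(f) == sizeof(d));
   memcpy(&f, &d, sizeof(f));
   return f;
}

/* Usage: the two interpretations of the integer 3 differ. */
void
toy_convert_example(void)
{
   assert(toy_convert_to_float(3) == 3.0f);
   assert(toy_bitcast_to_float(3) != 3.0f);
}
```

The bitcast of 0x40a8a3d7 gives roughly 5.27, matching the example in the
quoted question, while the value conversion of the same integer gives a float
near 1084793815.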




[Mesa-dev] [PATCH v3 1/2] radeonsi: do not do two full flushes on every compute dispatch

2016-04-18 Thread Bas Nieuwenhuizen
v2: Add more CS_PARTIAL_FLUSH events.

Essentially every place with waits on finishing for pixel shaders
also has a write after read hazard with compute shaders.

Invalidating L2 waits implicitly on pixel and compute shaders,
so, we don't need a CS_PARTIAL_FLUSH for switching FBO.

v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2.

According to Marek the INV_GLOBAL_L2 events don't wait for compute
shaders to finish, so wait for them explicitly.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/gallium/drivers/radeonsi/si_compute.c | 17 ++---
 src/gallium/drivers/radeonsi/si_cp_dma.c  |  6 --
 src/gallium/drivers/radeonsi/si_descriptors.c |  3 ++-
 src/gallium/drivers/radeonsi/si_hw_context.c  |  1 +
 src/gallium/drivers/radeonsi/si_state.c   | 12 
 5 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 10b88b3..6803334 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -439,13 +439,8 @@ static void si_launch_grid(
if (!sctx->cs_shader_state.initialized)
si_initialize_compute(sctx);
 
-   sctx->b.flags |= SI_CONTEXT_INV_VMEM_L1 |
-SI_CONTEXT_INV_GLOBAL_L2 |
-SI_CONTEXT_INV_ICACHE |
-SI_CONTEXT_INV_SMEM_L1 |
-SI_CONTEXT_FLUSH_WITH_INV_L2 |
-SI_CONTEXT_FLAG_COMPUTE;
-   si_emit_cache_flush(sctx, NULL);
+   if (sctx->b.flags)
+   si_emit_cache_flush(sctx, NULL);
 
if (!si_switch_compute_shader(sctx, program, >shader, 
info->pc))
return;
@@ -478,14 +473,6 @@ static void si_launch_grid(
si_setup_tgsi_grid(sctx, info);
 
si_emit_dispatch_packets(sctx, info);
-
-   sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH |
-SI_CONTEXT_INV_VMEM_L1 |
-SI_CONTEXT_INV_GLOBAL_L2 |
-SI_CONTEXT_INV_ICACHE |
-SI_CONTEXT_INV_SMEM_L1 |
-SI_CONTEXT_FLAG_COMPUTE;
-   si_emit_cache_flush(sctx, NULL);
 }
 
 
diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c 
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 001ddd4..38e0ee6 100644
--- a/src/gallium/drivers/radeonsi/si_cp_dma.c
+++ b/src/gallium/drivers/radeonsi/si_cp_dma.c
@@ -190,7 +190,8 @@ static void si_clear_buffer(struct pipe_context *ctx, 
struct pipe_resource *dst,
uint64_t va = r600_resource(dst)->gpu_address + offset;
 
/* Flush the caches. */
-   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH | flush_flags;
+   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+SI_CONTEXT_CS_PARTIAL_FLUSH | flush_flags;
 
while (size) {
unsigned byte_count = MIN2(size, CP_DMA_MAX_BYTE_COUNT);
@@ -296,7 +297,8 @@ void si_copy_buffer(struct si_context *sctx,
}
 
/* Flush the caches. */
-   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH | flush_flags;
+   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+SI_CONTEXT_CS_PARTIAL_FLUSH | flush_flags;
 
/* This is the main part doing the copying. Src is always aligned. */
main_dst_offset = dst_offset + skipped_size;
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 5b65fae..98ad3a7 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -940,7 +940,8 @@ static void si_set_streamout_targets(struct pipe_context 
*ctx,
 * start writing to the targets.
 */
if (num_targets)
-   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH;
+   sctx->b.flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
+SI_CONTEXT_CS_PARTIAL_FLUSH;
 
/* Streamout buffers must be bound in 2 places:
 * 1) in VGT by setting the VGT_STRMOUT registers
diff --git a/src/gallium/drivers/radeonsi/si_hw_context.c 
b/src/gallium/drivers/radeonsi/si_hw_context.c
index 9862f07..b179092e 100644
--- a/src/gallium/drivers/radeonsi/si_hw_context.c
+++ b/src/gallium/drivers/radeonsi/si_hw_context.c
@@ -84,6 +84,7 @@ void si_context_gfx_flush(void *context, unsigned flags,
ctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER |
SI_CONTEXT_INV_VMEM_L1 |
SI_CONTEXT_INV_GLOBAL_L2 |
+   SI_CONTEXT_CS_PARTIAL_FLUSH |
/* this is probably not needed anymore */
SI_CONTEXT_PS_PARTIAL_FLUSH;
si_emit_cache_flush(ctx, NULL);
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index af9ffdd..305a70b 100644
--- 

Re: [Mesa-dev] [PATCH 3/4] [rfc] gallivm/llvmpipe dynamic samplers support.

2016-04-18 Thread Roland Scheidegger
Am 18.04.2016 um 23:10 schrieb Dave Airlie:
> On 19 April 2016 at 03:40, Roland Scheidegger  wrote:
>> Am 18.04.2016 um 04:49 schrieb Dave Airlie:
>>> From: Dave Airlie 
>>>
>>> This is a first attempt at adding support for dynamic indexing
>>> of samplers to llvmpipe. This is needed for ARB_gpu_shader5 support.
>>>
>>> This uses the sampler function generator to generate functions
>>> for all samplers, then uses if statements to pick which one to call.
>>>
>>> This passes all the tests in piglit except a couple of non-uniform
>>> ones which g
>> I can't quite parse this fully ;-).
>>
> Doh I wrote half a commit msg, got distracted by query size and forgot
> to go back.
> 
>>> +
>>> + for (i = 0; i < sampler->dynamic_state.num_samplers; i++) {
>> Indentation.
>> And I think this should really make an effort to only do this for
>> samplers which are actually dyanmically indexed, for the possible
>> range(s) (as they need to be declared as an array if I'm not mistaken).
>> I'm not entirely sure, but I believe this could even crash otherwise
>> potentially, due to trying to construct lookup functions with
>> incompatible combinations of parameters.
> 
> No unfortunately we don't declare samplers as an array, I think I should
> fix that at some point as I've come to this point a few times and would have
> really liked to have that information, at least for bounds checking.
Not having samplers as arrays is really _insane_ for this. You NEED them
for the code to make much sense (otherwise you might just construct huge
if ladders even if from your 32 samplers only 2 can be indexed
dynamically...)
That said, it's not THAT bad actually. You could (and should) sort of
"emulate" it. Because you know the lower bound from the base reg. And
for the upper bound, you can scan the sampler dcl targets (as they have
to match the base one). That way I'd be a bit more confident you don't
try to create functions which might be impossible and hit asserts or
crashes... Of course, that might very well (and likely) give you a too
pessimistic upper bound. But again, it would make more sense to have
arrays in the first place, so not requiring hacks.
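
A rough sketch of that emulation, under stated assumptions: the declared
sampler targets are available as a plain array, and the scan stops at the
first target that differs from the base one. The names (toy_target,
toy_sampler_range_end) are hypothetical and not gallivm or TGSI API.

```c
#include <assert.h>

/* Hypothetical texture target tags for the declared samplers. */
enum toy_target {
   TOY_TARGET_2D,
   TOY_TARGET_CUBE
};

/* Return the exclusive upper bound of the run of samplers that starts at
 * `base` and shares the base sampler's target. The result may be
 * pessimistic: unrelated neighbours with the same target get included.
 */
static unsigned
toy_sampler_range_end(const enum toy_target *targets,
                      unsigned num_samplers, unsigned base)
{
   unsigned end = base + 1;

   assert(base < num_samplers);
   while (end < num_samplers && targets[end] == targets[base])
      end++;

   return end;
}

/* Usage: three 2D samplers, then a cube sampler, then another 2D one. */
void
toy_sampler_example(void)
{
   const enum toy_target targets[] = {
      TOY_TARGET_2D, TOY_TARGET_2D, TOY_TARGET_2D,
      TOY_TARGET_CUBE, TOY_TARGET_2D
   };

   /* Indexing from sampler 0 can reach samplers 0..2 only. */
   assert(toy_sampler_range_end(targets, 5, 0) == 3);
}
```

Only the samplers in [base, end) would then need a generated sample function
and an if-ladder entry.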




>>
>>> + struct lp_sampler_params unit_params = *params;
>>> +
>>> + unit_params.texture_index = i;
>>> + unit_params.sampler_index = i;
>> Of course, the need to actually override texture and sampler index is
>> why I never even tried to implement it - for d3d10 sample opcodes, it
>> looked simply unfeasible as you'd have to generate the product of
>> num_samplers/num_textures. So I figured this needed way more work...
>> Though actually thinking about this, only need to be able to index into
>> resources, not samplers, there, so it should work as well...
>> I can't really make sense out of the msdn docs though.
>> But with d3d12 you actually could have legitimate per-pixel resource
>> indices (yuck!!!) - with the usual caveat about calculated lods.
>> But I can't actually find anything that says dynamic indexing into
>> resources even works at all for d3d11 now.
> 
> Yup I pretty much ignored SAMPLE* and D3D due to lack of any clue.
> 
> Though really generating all the functions in the world probably isn't
> going to be that brutal. But it would be good to know if you can indirect
> both.
Pretty sure you can't for samplers - so iff it's possible for resources,
it would look pretty much the same (just without overriding the sampler
index).

> 
>>
>>> + 
>>> lp_build_sample_soa(>dynamic_state.static_state[i].texture_state,
>>> + 
>>> >dynamic_state.static_state[i].sampler_state,
>>> + >dynamic_state.base,
>>> + gallivm, _params);
>> And I think really all of this should move to the actual sample code.
> 
> It's kinda messier to do that, since we're passing
> dynamic_state.static_state, and I don't think
> we can access it anywhere else except here.
Ah I see. I think the fact you needed to do it in both llvmpipe and draw
exactly the same is a pretty good indication it really isn't the right
place (swr certainly would need to do it too if it wanted to support that).
My guess is we could just pass the whole static_state array easily. It
is true that there's actually different definitions now
(draw_sampler_static_state, lp_sampler_static_state,
swr_sampler_static_state) but effectively they are all the same (all
copied from llvmpipe, including the comments) anyway. So could require
the drivers to use a common definition instead.
(And pass along the range information too!)


> 
>>> +  indirect_matches = LLVMBuildBitCast(builder, indirect_matches, 
>>> LLVMIntTypeInContext(gallivm->context, params->type.length), "");
>>> +  indirect_matches = LLVMBuildICmp(builder, LLVMIntNE,
>>> +   indirect_matches, 
>>> LLVMConstNull(LLVMIntTypeInContext(gallivm->context, params->type.length)), 
>>> 

[Mesa-dev] [PATCH v3 0/2] Remainder radeonsi compute patches.

2016-04-18 Thread Bas Nieuwenhuizen
I added some CS_PARTIAL_FLUSH events after Marek's response. I haven't been
able to detect anything wrong without them. However, at least theoretically,
some event has to wait on CS shaders at the new points (e.g. an FBO change
clearly has a potential write-after-read hazard otherwise).
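
The general shape of the change, sketched with made-up names (toy_ctx and the
TOY_* flags are hypothetical, not the radeonsi structures): state changes
accumulate flush flags, and the dispatch path emits a flush only if something
is pending instead of forcing a full flush before and after every dispatch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flush flags. */
enum {
   TOY_PS_PARTIAL_FLUSH = 1u << 0,
   TOY_CS_PARTIAL_FLUSH = 1u << 1,
   TOY_INV_GLOBAL_L2    = 1u << 2
};

/* Hypothetical context: pending flags plus counters for inspection. */
struct toy_ctx {
   uint32_t flags;
   uint32_t last_emitted;
   unsigned flushes_emitted;
};

/* Emit whatever is pending and clear the pending set. */
static void
toy_emit_cache_flush(struct toy_ctx *ctx)
{
   ctx->last_emitted = ctx->flags;
   ctx->flushes_emitted++;
   ctx->flags = 0;
}

/* A state change that later work must wait on: request both partial
 * flushes so that in-flight pixel and compute shaders are waited for.
 */
static void
toy_set_framebuffer(struct toy_ctx *ctx)
{
   ctx->flags |= TOY_PS_PARTIAL_FLUSH | TOY_CS_PARTIAL_FLUSH;
}

/* Dispatch: flush only when something is actually pending. */
static void
toy_launch_grid(struct toy_ctx *ctx)
{
   assert(ctx);
   if (ctx->flags)
      toy_emit_cache_flush(ctx);
   /* ... the dispatch packets would be emitted here ... */
}

/* Usage: back-to-back dispatches with no state change emit no flush. */
void
toy_flush_example(void)
{
   struct toy_ctx ctx = { 0, 0, 0 };

   toy_launch_grid(&ctx);
   toy_launch_grid(&ctx);
   assert(ctx.flushes_emitted == 0);

   toy_set_framebuffer(&ctx);
   toy_launch_grid(&ctx);
   assert(ctx.flushes_emitted == 1);
}
```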

I also updated the cap patch, as I discovered that the kernel's SI CS checker
disallowed writing the USER_DATA registers from a COPY_DATA packet.

Now that this has been fixed in the kernel, the new patch checks for the DRM
version that has the fix.

Bas Nieuwenhuizen (2):
  radeonsi: do not do two full flushes on every compute dispatch
  radeonsi: enable TGSI support cap for compute shaders

 docs/GL3.txt  |  4 ++--
 docs/relnotes/11.3.0.html |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.c | 21 -
 src/gallium/drivers/radeonsi/si_compute.c | 17 ++---
 src/gallium/drivers/radeonsi/si_cp_dma.c  |  6 --
 src/gallium/drivers/radeonsi/si_descriptors.c |  3 ++-
 src/gallium/drivers/radeonsi/si_hw_context.c  |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c| 15 +--
 src/gallium/drivers/radeonsi/si_state.c   | 12 
 9 files changed, 49 insertions(+), 31 deletions(-)

-- 
2.8.0



[Mesa-dev] [PATCH v3 2/2] radeonsi: enable TGSI support cap for compute shaders

2016-04-18 Thread Bas Nieuwenhuizen
v2: Use chip_class instead of family.

v3: Check kernel version for SI.

Signed-off-by: Bas Nieuwenhuizen 
---
 docs/GL3.txt  |  4 ++--
 docs/relnotes/11.3.0.html |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.c | 21 -
 src/gallium/drivers/radeonsi/si_pipe.c| 15 +--
 4 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 3febd6e..6214f8d 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -167,7 +167,7 @@ GL 4.3, GLSL 4.30:
   GL_ARB_arrays_of_arrays   DONE (all drivers that 
support GLSL 1.30)
   GL_ARB_ES3_compatibility  DONE (all drivers that 
support GLSL 3.30)
   GL_ARB_clear_buffer_objectDONE (all drivers)
-  GL_ARB_compute_shader DONE (i965)
+  GL_ARB_compute_shader DONE (i965, radeonsi)
   GL_ARB_copy_image DONE (i965, nv50, 
nvc0, r600, radeonsi)
   GL_KHR_debug  DONE (all drivers)
   GL_ARB_explicit_uniform_location  DONE (all drivers that 
support GLSL)
@@ -225,7 +225,7 @@ GL 4.5, GLSL 4.50:
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
   GL_ARB_arrays_of_arrays   DONE (all drivers that 
support GLSL 1.30)
-  GL_ARB_compute_shader DONE (i965)
+  GL_ARB_compute_shader DONE (i965, radeonsi)
   GL_ARB_draw_indirect  DONE (i965, nvc0, 
r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location  DONE (all drivers that 
support GLSL)
   GL_ARB_framebuffer_no_attachments DONE (i965, nvc0, 
r600, radeonsi, softpipe)
diff --git a/docs/relnotes/11.3.0.html b/docs/relnotes/11.3.0.html
index 0f9aed8..5a7083c 100644
--- a/docs/relnotes/11.3.0.html
+++ b/docs/relnotes/11.3.0.html
@@ -45,6 +45,7 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 OpenGL 4.2 on radeonsi
+GL_ARB_compute_shader on radeonsi
 GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe
 GL_ARB_internalformat_query2 on all drivers
 GL_ARB_robust_buffer_access_behavior on radeonsi
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index a7477ab..64da62f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -645,23 +645,34 @@ static int r600_get_compute_param(struct pipe_screen 
*screen,
uint64_t *grid_size = ret;
grid_size[0] = 65535;
grid_size[1] = 65535;
-   grid_size[2] = 1;
+   grid_size[2] = 65535;
}
return 3 * sizeof(uint64_t) ;
 
case PIPE_COMPUTE_CAP_MAX_BLOCK_SIZE:
if (ret) {
uint64_t *block_size = ret;
-   block_size[0] = 256;
-   block_size[1] = 256;
-   block_size[2] = 256;
+   if (rscreen->chip_class >= SI && HAVE_LLVM >= 0x309 &&
+   ir_type == PIPE_SHADER_IR_TGSI) {
+   block_size[0] = 2048;
+   block_size[1] = 2048;
+   block_size[2] = 2048;
+   } else {
+   block_size[0] = 256;
+   block_size[1] = 256;
+   block_size[2] = 256;
+   }
}
return 3 * sizeof(uint64_t);
 
case PIPE_COMPUTE_CAP_MAX_THREADS_PER_BLOCK:
if (ret) {
uint64_t *max_threads_per_block = ret;
-   *max_threads_per_block = 256;
+   if (rscreen->chip_class >= SI && HAVE_LLVM >= 0x309 &&
+   ir_type == PIPE_SHADER_IR_TGSI)
+   *max_threads_per_block = 2048;
+   else
+   *max_threads_per_block = 256;
}
return sizeof(uint64_t);
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index f22cd03..7501a8f 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -447,6 +447,8 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
 
 static int si_get_shader_param(struct pipe_screen* pscreen, unsigned shader, 
enum pipe_shader_cap param)
 {
+   struct si_screen *sscreen = (struct si_screen *)pscreen;
+
switch(shader)
{
case 

[Mesa-dev] [PATCH 1/2] i965/vec4: Lower integer multiplication after optimizations.

2016-04-18 Thread Matt Turner
Analogous to commit 1e4e17fbd in the i965/fs backend.

Because the copy propagation pass in the vec4 backend is strictly local,
we look at the immediate values coming from NIR and emit the multiplies
we need directly. If the copy propagation pass becomes smarter in the
future, we can reduce the nir_op_imul case in brw_vec4_nir.cpp to a
single multiply.
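
For readers unfamiliar with why an immediate that fits in 16 bits needs only one MUL, the arithmetic can be sketched in plain C. This is only an illustration, not the vec4 backend code: the helper names are made up, and the model assumes nothing beyond what the patch comment states, namely that a single MUL reads just the low 16 bits of one operand. Adding the shifted high partial product stands in for the MUL/MACH pair and does not model the accumulator.

```c
#include <stdint.h>

/* Hypothetical model of a hardware MUL that reads only the low 16 bits
 * of its second operand (the Gen >= 7 behaviour described in the patch).
 */
static uint32_t mul_32x16(uint32_t a, uint32_t b)
{
   return a * (b & 0xffffu);
}

/* Full 32 x 32 -> 32 multiply built from two 16-bit partial products.
 * The upper half of b contributes (a * hi16(b)) << 16; anything above
 * bit 31 is discarded by the unsigned wrap-around.
 */
static uint32_t mul_32x32(uint32_t a, uint32_t b)
{
   uint32_t lo = mul_32x16(a, b);
   uint32_t hi = mul_32x16(a, b >> 16) << 16;
   return lo + hi;
}

/* An operand that fits in 16 bits needs only the single MUL. */
static int fits_in_16_bits(uint32_t v)
{
   return v < (1u << 16);
}
```

When fits_in_16_bits() holds for the immediate, mul_32x16() alone already gives the full result, which is the case the new lowering pass handles with one instruction.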

total instructions in shared programs: 7082311 -> 7081953 (-0.01%)
instructions in affected programs: 59581 -> 59223 (-0.60%)
helped: 293

total cycles in shared programs: 65765712 -> 65764796 (-0.00%)
cycles in affected programs: 854112 -> 853196 (-0.11%)
helped: 154
HURT: 73
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 67 ++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 48 +
 3 files changed, 88 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index b9cf3f6..1644d4d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1671,6 +1671,71 @@ vec4_visitor::lower_minmax()
return progress;
 }
 
+bool
+vec4_visitor::lower_integer_multiplication()
+{
+   bool progress = false;
+
+   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
+  const vec4_builder ibld(this, block, inst);
+
+  if (inst->opcode == BRW_OPCODE_MUL) {
+ if (inst->dst.is_accumulator() ||
+ (inst->src[1].type != BRW_REGISTER_TYPE_D &&
+  inst->src[1].type != BRW_REGISTER_TYPE_UD))
+continue;
+
+ /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
+  * operation directly, but CHV/BXT cannot.
+  */
+ if (devinfo->gen >= 8 &&
+ !devinfo->is_cherryview && !devinfo->is_broxton)
+continue;
+
+ if (inst->src[1].file == IMM &&
+ inst->src[1].ud < (1 << 16)) {
+/* The MUL instruction isn't commutative. On Gen <= 6, only the low
+ * 16-bits of src0 are read, and on Gen >= 7 only the low 16-bits 
of
+ * src1 are used.
+ *
+ * If multiplying by an immediate value that fits in 16-bits, do a
+ * single MUL instruction with that value in the proper location.
+ */
+if (devinfo->gen < 7) {
+   dst_reg imm(VGRF, alloc.allocate(1), inst->dst.type,
+   inst->dst.writemask);
+   ibld.MOV(imm, inst->src[1]);
+   ibld.MUL(inst->dst, src_reg(imm), inst->src[0]);
+} else {
+   ibld.MUL(inst->dst, inst->src[0], inst->src[1]);
+}
+ } else {
+const dst_reg acc(brw_writemask(retype(brw_acc_reg(8),
+   inst->dst.type),
+inst->dst.writemask));
+const dst_reg null(brw_writemask(retype(brw_null_reg(),
+inst->dst.type),
+ inst->dst.writemask));
+
+ibld.MUL(acc, inst->src[0], inst->src[1]);
+ibld.MACH(null, inst->src[0], inst->src[1]);
+set_condmod(inst->conditional_mod,
+ibld.MOV(inst->dst, src_reg(acc)));
+ }
+  } else {
+ continue;
+  }
+
+  inst->remove(block);
+  progress = true;
+   }
+
+   if (progress)
+  invalidate_live_intervals();
+
+   return progress;
+}
+
 src_reg
 vec4_visitor::get_timestamp()
 {
@@ -1950,6 +2015,8 @@ vec4_visitor::run()
   OPT(dead_code_eliminate);
}
 
+   OPT(lower_integer_multiplication);
+
if (failed)
   return false;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index d43a5a8..f6f8b12 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -301,6 +301,7 @@ public:
void resolve_ud_negate(src_reg *reg);
 
bool lower_minmax();
+   bool lower_integer_multiplication();
 
src_reg get_timestamp();
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index e4e8c38..10e2f54 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -1039,35 +1039,27 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
   break;
 
case nir_op_imul: {
-  if (devinfo->gen < 8) {
- nir_const_value *value0 = nir_src_as_const_value(instr->src[0].src);
- nir_const_value *value1 = nir_src_as_const_value(instr->src[1].src);
-
- /* For integer multiplication, the MUL uses the low 16 bits of one of
-  * the operands (src0 through SNB, src1 on IVB and later). The MACH
-  * accumulates in the contribution of the upper 16 bits of that
-  * operand. If we can determine that one of the args 

[Mesa-dev] [PATCH] swr: fix resource backed constant buffers

2016-04-18 Thread Tim Rowley
Code was using an incorrect address for the base pointer.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979
---
 src/gallium/drivers/swr/swr_state.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index ded51a9..0e0979d 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1138,7 +1138,8 @@ swr_update_derived(struct pipe_context *pipe,
  pDC->num_constantsVS[i] = cb->buffer_size;
  if (cb->buffer)
 pDC->constantVS[i] =
-   (const float *)((const uint8_t *)cb->buffer + 
cb->buffer_offset);
+   (const float *)(swr_resource(cb->buffer)->swr.pBaseAddress +
+   cb->buffer_offset);
  else {
 /* Need to copy these constants to scratch space */
 if (cb->user_buffer && cb->buffer_size) {
@@ -1163,7 +1164,8 @@ swr_update_derived(struct pipe_context *pipe,
  pDC->num_constantsFS[i] = cb->buffer_size;
  if (cb->buffer)
 pDC->constantFS[i] =
-   (const float *)((const uint8_t *)cb->buffer + 
cb->buffer_offset);
+   (const float *)(swr_resource(cb->buffer)->swr.pBaseAddress +
+   cb->buffer_offset);
  else {
 /* Need to copy these constants to scratch space */
 if (cb->user_buffer && cb->buffer_size) {
-- 
1.9.1
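
The class of bug fixed above can be sketched in isolation. This is a minimal sketch with made-up types: struct fake_resource and both helpers are hypothetical and are not the swr data structures. It only shows the difference between offsetting from the resource handle and offsetting from the storage that handle points to.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver resource: the handle is a struct,
 * and the actual buffer contents live behind base_address.
 */
struct fake_resource {
   int      width;
   uint8_t *base_address;
};

/* Wrong: treats the handle itself as the start of the data. */
static const float *constants_from_handle(const struct fake_resource *res,
                                          size_t offset)
{
   return (const float *)((const uint8_t *)res + offset);
}

/* Right: offsets from the backing storage the handle points to. */
static const float *constants_from_storage(const struct fake_resource *res,
                                           size_t offset)
{
   return (const float *)(res->base_address + offset);
}
```

The first helper compiles without complaint, which is presumably why this kind of mistake is easy to miss: the cast hides the fact that the pointer refers to the bookkeeping struct rather than the constants.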



[Mesa-dev] [PATCH 2/2] i965/vec4: Unrestrict constant propagation into integer multiply.

2016-04-18 Thread Matt Turner
Analogous to commit 81deefc45b in the i965/fs backend.
---
No shader-db changes because the vec4 copy propagation pass is local-only.

 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 92423e1..7929dbc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -216,8 +216,16 @@ try_constant_propagate(const struct brw_device_info 
*devinfo,
   } else if (arg == 0 && inst->src[1].file != IMM) {
 /* Fit this constant in by commuting the operands.  Exception: we
  * can't do this for 32-bit integer MUL/MACH because it's asymmetric.
+  *
+  * The BSpec says for Broadwell that
+  *
+  *"When multiplying DW x DW, the dst cannot be accumulator."
+  *
+  * Integer MUL with a non-accumulator destination will be lowered
+  * by lower_integer_multiplication(), so don't restrict it.
  */
-if ((inst->opcode == BRW_OPCODE_MUL ||
+ if (((inst->opcode == BRW_OPCODE_MUL &&
+   inst->dst.is_accumulator()) ||
   inst->opcode == BRW_OPCODE_MACH) &&
 (inst->src[1].type == BRW_REGISTER_TYPE_D ||
  inst->src[1].type == BRW_REGISTER_TYPE_UD))
-- 
2.7.3
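
The condition changed by this patch can be restated as a small predicate. This is a simplified sketch, not the code in brw_vec4_copy_propagation.cpp: the enums, struct fake_inst and can_commute_for_constant() are hypothetical names, and the surrounding checks in try_constant_propagate() are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified opcodes and register types. */
enum opcode { OP_ADD, OP_MUL, OP_MACH };
enum reg_type { TYPE_F, TYPE_D, TYPE_UD };

struct fake_inst {
   enum opcode   opcode;
   enum reg_type src1_type;
   bool          dst_is_accumulator;
};

/* Returns true when a constant headed for src0 may instead be placed
 * in src1 by swapping the operands. After the patch, a 32-bit integer
 * MUL is only excluded when it writes the accumulator; MACH is always
 * excluded for 32-bit integer types.
 */
static bool can_commute_for_constant(const struct fake_inst *inst)
{
   bool is_32bit_int = inst->src1_type == TYPE_D ||
                       inst->src1_type == TYPE_UD;
   bool asymmetric = (inst->opcode == OP_MUL && inst->dst_is_accumulator) ||
                     inst->opcode == OP_MACH;

   return !(asymmetric && is_32bit_int);
}
```

A plain integer MUL with a normal destination now passes the check, on the assumption stated in the patch comment that lower_integer_multiplication() will later put the operands in the order the hardware needs.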



Re: [Mesa-dev] [PATCH] glsl: Checks for interpolation into its own function.

2016-04-18 Thread Timothy Arceri
On Mon, 2016-04-18 at 19:44 +0300, Andres Gomez wrote:
> Hi,
> 
> I would really appreciate if you could find some time to review this
> patch.

Is there a patch somewhere that makes use of this change? 

> 
> Thanks!
> 
> On Mon, 2016-04-04 at 19:50 +0300, Andres Gomez wrote:
> > 
> > This generalizes the validation also to be done for variables
> > inside
> > interface blocks, which, for some cases, was missing.
> > 
> > For a discussion about the additional validation cases included see
> > > https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html
> > and Khronos bug #15671.
> > 
> > Signed-off-by: Andres Gomez 
> > ---
> >  src/compiler/glsl/ast_to_hir.cpp | 316 +--
> > 
> >  1 file changed, 171 insertions(+), 145 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > b/src/compiler/glsl/ast_to_hir.cpp
> > index 7c9be81..e4ebc6b 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -2792,8 +2792,164 @@ apply_explicit_binding(struct
> > _mesa_glsl_parse_state *state,
> >  }
> >  
> >  
> > +static void
> > +validate_interpolation_qualifier(struct _mesa_glsl_parse_state
> > *state,
> > + YYLTYPE *loc,
> > + const glsl_interp_qualifier
> > interpolation,
> > + const struct ast_type_qualifier
> > *qual,
> > + const struct glsl_type *var_type,
> > + ir_variable_mode mode)
> > +{
> > +   /* Interpolation qualifiers can only apply to shader inputs or
> > outputs, but
> > +* not to vertex shader inputs nor fragment shader outputs.
> > +*
> > +* From section 4.3 ("Storage Qualifiers") of the GLSL 1.30
> > spec:
> > +*"Outputs from a vertex shader (out) and inputs to a
> > fragment
> > +*shader (in) can be further qualified with one or more of
> > these
> > +*interpolation qualifiers"
> > +*...
> > +*"These interpolation qualifiers may only precede the
> > qualifiers in,
> > +*centroid in, out, or centroid out in a declaration. They
> > do
> > not apply
> > +*to the deprecated storage qualifiers varying or centroid
> > +*varying. They also do not apply to inputs into a vertex
> > shader or
> > +*outputs from a fragment shader."
> > +*
> > +* From section 4.3 ("Storage Qualifiers") of the GLSL ES 3.00
> > spec:
> > +*"Outputs from a shader (out) and inputs to a shader (in)
> > can be
> > +*further qualified with one of these interpolation
> > qualifiers."
> > +*...
> > +*"These interpolation qualifiers may only precede the
> > qualifiers
> > +*in, centroid in, out, or centroid out in a declaration.
> > They do
> > +*not apply to inputs into a vertex shader or outputs from
> > a
> > +*fragment shader."
> > +*/
> > +   if (state->is_version(130, 300)
> > +   && interpolation != INTERP_QUALIFIER_NONE) {
> > +  const char *i = interpolation_string(interpolation);
> > +  if (mode != ir_var_shader_in && mode != ir_var_shader_out)
> > + _mesa_glsl_error(loc, state,
> > +  "interpolation qualifier `%s' can only
> > be
> > applied to "
> > +  "shader inputs or outputs.", i);
> > +
> > +  switch (state->stage) {
> > +  case MESA_SHADER_VERTEX:
> > + if (mode == ir_var_shader_in) {
> > +_mesa_glsl_error(loc, state,
> > + "interpolation qualifier '%s' cannot
> > be
> > applied to "
> > + "vertex shader inputs", i);
> > + }
> > + break;
> > +  case MESA_SHADER_FRAGMENT:
> > + if (mode == ir_var_shader_out) {
> > +_mesa_glsl_error(loc, state,
> > + "interpolation qualifier '%s' cannot
> > be
> > applied to "
> > + "fragment shader outputs", i);
> > + }
> > + break;
> > +  default:
> > + break;
> > +  }
> > +   }
> > +
> > +   /* Interpolation qualifiers cannot be applied to 'centroid' and
> > +* 'centroid varying'.
> > +*
> > +* From section 4.3 ("Storage Qualifiers") of the GLSL 1.30
> > spec:
> > +*"interpolation qualifiers may only precede the qualifiers
> > in,
> > +*centroid in, out, or centroid out in a declaration. They
> > do
> > not apply
> > +*to the deprecated storage qualifiers varying or centroid
> > varying."
> > +*
> > +* These deprecated storage qualifiers do not exist in GLSL ES
> > 3.00.
> > +*/
> > +   if (state->is_version(130, 0)
> > +   && interpolation != INTERP_QUALIFIER_NONE
> > +   && qual->flags.q.varying) {
> > +
> > +  const char *i = interpolation_string(interpolation);
> > +  const char *s;
> > +  if 

Re: [Mesa-dev] [PATCH] glsl: add forgotten textureOffset function for sampler2DArrayShadow

2016-04-18 Thread Ian Romanick
On 04/18/2016 02:05 PM, srol...@vmware.com wrote:
> From: Roland Scheidegger 
> 
> This was part of EXT_gpu_shader4 - as such it should have been supported
> by glsl 130.
> It was however forgotten, and not added until glsl 430 - with the wrong
> syntax no less (glsl 430 mentions it was overlooked).
> glsl 440 (but revision 8 only) fixed this finally for good.
> It looks like most other implementations would support this with older
> glsl versions as well, so just add this to the other glsl 130 textureOffset

Can you clarify this?  I believe that it would work on NVIDIA when the
shader doesn't specify an explicit version.  Does it work on NVIDIA with
an explicit #version 130?  How about AMD?

> functions.
> 
> (Completely untested...)
> ---
>  src/compiler/glsl/builtin_functions.cpp | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/compiler/glsl/builtin_functions.cpp 
> b/src/compiler/glsl/builtin_functions.cpp
> index f488434..004beb9 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -1707,6 +1707,9 @@ builtin_builder::create_builtins()
>  _texture(ir_tex, v130, glsl_type::uvec4_type, 
> glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
>  
>  _texture(ir_tex, v130, glsl_type::float_type, 
> glsl_type::sampler1DArrayShadow_type, glsl_type::vec3_type, TEX_OFFSET),
> +/* the next one was forgotten in glsl spec (it's from 
> EXT_gpu_shader4 initially),
> +   had wrong syntax in 4.30, correct only in 4.40 but allow 
> it in 130 */

How about:

/* The next one was forgotten in GLSL 1.30 spec.  It's from
 * EXT_gpu_shader4 originally.  It was added in 4.30 with the
 * wrong syntax.  This was corrected in 4.40.  4.30 indicates
 * that it was intended to be included previously, so allow it
 * in 1.30.
 */

> +_texture(ir_tex, v130, glsl_type::float_type, 
> glsl_type::sampler2DArrayShadow_type, glsl_type::vec4_type, TEX_OFFSET),
>  
>  _texture(ir_txb, v130_fs_only, glsl_type::vec4_type,  
> glsl_type::sampler1D_type,  glsl_type::float_type, TEX_OFFSET),
>  _texture(ir_txb, v130_fs_only, glsl_type::ivec4_type, 
> glsl_type::isampler1D_type, glsl_type::float_type, TEX_OFFSET),
> 



Re: [Mesa-dev] [PATCH] radeonsi: enable GLSL 4.30 and therefore OpenGL 4.3

2016-04-18 Thread Bas Nieuwenhuizen
On Mon, Apr 18, 2016 at 7:58 PM, Ian Romanick  wrote:
> On 04/15/2016 03:33 AM, Marek Olšák wrote:
>> The same thing Nicolai said: This can be committed before the UE4
>> compile failure is fixed.
>
> Is there a bug filed for that problem?  Has anyone diagnosed the issue?
>

I just filed a bug for this issue at
https://bugs.freedesktop.org/show_bug.cgi?id=95005

- Bas


[Mesa-dev] [PATCH] anv_device: Set the compressed texture feature flags correctly

2016-04-18 Thread Nanley Chery
From: Nanley Chery 

Sampling from an ETC2 texture is supported from Gen8 onwards.
While ASTC_LDR is supported on Gen9, the logic to handle such
formats has not yet been implemented in the driver.

Fixes dEQP-VK.api.info.format_properties.compressed_formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94896
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e477fe1..5433dd3 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -380,8 +380,8 @@ void anv_GetPhysicalDeviceFeatures(
   .alphaToOne   = true,
   .multiViewport= true,
   .samplerAnisotropy= false, /* FINISHME */
-  .textureCompressionETC2   = true,
-  .textureCompressionASTC_LDR   = true,
+  .textureCompressionETC2   = pdevice->info->gen >= 8,
+  .textureCompressionASTC_LDR   = false, /* FINISHME */
   .textureCompressionBC = true,
   .occlusionQueryPrecise= true,
   .pipelineStatisticsQuery  = false,
-- 
2.8.0
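
The rule from the commit message can be written down as a small function of the hardware generation. This is a sketch only: struct fake_features and features_for_gen() are hypothetical and do not correspond to VkPhysicalDeviceFeatures or the anv code beyond the three flags touched above.

```c
#include <stdbool.h>

/* Hypothetical subset of the feature flags touched by the patch. */
struct fake_features {
   bool texture_compression_etc2;
   bool texture_compression_astc_ldr;
   bool texture_compression_bc;
};

/* gen is the hardware generation number (7, 8, 9, ...). */
static struct fake_features features_for_gen(int gen)
{
   struct fake_features f;

   /* ETC2 sampling is supported from Gen8 onwards. */
   f.texture_compression_etc2 = gen >= 8;
   /* Gen9 hardware supports ASTC LDR, but the driver does not handle
    * those formats yet, so the flag stays off everywhere.
    */
   f.texture_compression_astc_ldr = false;
   f.texture_compression_bc = true;
   return f;
}
```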



Re: [Mesa-dev] [PATCH 3/4] [rfc] gallivm/llvmpipe dynamic samplers support.

2016-04-18 Thread Dave Airlie
On 19 April 2016 at 03:40, Roland Scheidegger  wrote:
> Am 18.04.2016 um 04:49 schrieb Dave Airlie:
>> From: Dave Airlie 
>>
>> This is a first attempt at adding support for dynamic indexing
>> of samplers to llvmpipe. This is needed for ARB_gpu_shader5 support.
>>
>> This uses the sampler function generator to generate functions
>> for all samplers, then uses if statements to pick which one to call.
>>
>> This passes all the tests in piglit except a couple of non-uniform
>> ones which g
> I can't quite parse this fully ;-).
>
Doh I wrote half a commit msg, got distracted by query size and forgot
to go back.

>> +
>> + for (i = 0; i < sampler->dynamic_state.num_samplers; i++) {
> Indentation.
> And I think this should really make an effort to only do this for
> samplers which are actually dynamically indexed, for the possible
> range(s) (as they need to be declared as an array if I'm not mistaken).
> I'm not entirely sure, but I believe this could even crash otherwise
> potentially, due to trying to construct lookup functions with
> incompatible combinations of parameters.

No unfortunately we don't declare samplers as an array, I think I should
fix that at some point as I've come to this point a few times and would have
really liked to have that information, at least for bounds checking.

>
>> + struct lp_sampler_params unit_params = *params;
>> +
>> + unit_params.texture_index = i;
>> + unit_params.sampler_index = i;
> Of course, the need to actually override texture and sampler index is
> why I never even tried to implement it - for d3d10 sample opcodes, it
> looked simply unfeasible as you'd have to generate the product of
> num_samplers/num_textures. So I figured this needed way more work...
> Though actually thinking about this, only need to be able to index into
> resources, not samplers, there, so it should work as well...
> I can't really make sense out of the msdn docs though.
> But with d3d12 you actually could have legitimate per-pixel resource
> indices (yuck!!!) - with the usual caveat about calculated lods.
> But I can't actually find anything that says dynamic indexing into
> resources even works at all for d3d11 now.

Yup I pretty much ignored SAMPLE* and D3D due to lack of any clue.

Though really generating all the functions in the world probably isn't
going to be that brutal. But it would be good to know if you can indirect
both.

>
>> + lp_build_sample_soa(&sampler->dynamic_state.static_state[i].texture_state,
>> + &sampler->dynamic_state.static_state[i].sampler_state,
>> + &sampler->dynamic_state.base,
>> + gallivm, &unit_params);
> And I think really all of this should move to the actual sample code.

It's kinda messier to do that, since we are passing
dynamic_state.static_state, and I don't think
we can access it anywhere else except here.

>> +  indirect_matches = LLVMBuildBitCast(builder, indirect_matches, 
>> LLVMIntTypeInContext(gallivm->context, params->type.length), "");
>> +  indirect_matches = LLVMBuildICmp(builder, LLVMIntNE,
>> +   indirect_matches, 
>> LLVMConstNull(LLVMIntTypeInContext(gallivm->context, params->type.length)), 
>> "");
>>
> This isn't quite right (I suppose that's what the commit message wanted
> to say but was cut short...).
> From what I understand, you really need to examine the exec mask (not
> sure if the alive pixel mask is also needed) and pick your (scalar)
> index from a element which is in the current exec mask. I guess that's
> another parameter for the sampler params...
> And fwiw I think doing that with a switch statement in the end instead
> of if ladders would make far more sense.

Ah yes I expect you might be right in that and that is why the two
tests are failing, I'll see if I can work it out.

A switch statement in the end means having to keep track of all the
function pointers, whereas doing it with
separate if statements makes it a lot easier to generate the code
without having to keep copies of all the individual
samplers to build the switch at the end. Though I suppose it might be
possible to wrap it at a higher level.

>> +   if (params.sampler_is_indirect)
>> +   params.indirect_index = get_indirect_index(bld, 
>> inst->Src[sampler_reg].Register.File, inst->Src[sampler_reg].Register.Index, 
>> &inst->Src[sampler_reg].Indirect);
> Does this ensure that the index doesn't exceed max sampler index?
> Hopefully yes, but if not you need to add some MIN here.

Ah I'll check that.

Thanks,
Dave.
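
For anyone following the thread, the three points discussed above (take the index from a lane that is actually live in the exec mask, clamp it, then dispatch to one pre-generated sampling function) can be sketched in scalar C. This is not gallivm code and generates no LLVM IR; NUM_LANES, the function names and the two dummy sampling functions are all hypothetical.

```c
#include <stdint.h>

#define NUM_LANES 4

/* Pick the sampler index from the first lane enabled in exec_mask
 * (bit n set means lane n is active). Returns 0 if no lane is active.
 */
static unsigned uniform_index_from_active_lane(const unsigned idx[NUM_LANES],
                                               uint32_t exec_mask)
{
   for (unsigned lane = 0; lane < NUM_LANES; lane++) {
      if (exec_mask & (1u << lane))
         return idx[lane];
   }
   return 0;
}

/* Clamp so an out-of-range index can never select a missing sampler.
 * Assumes num_samplers >= 1.
 */
static unsigned clamp_sampler_index(unsigned index, unsigned num_samplers)
{
   return index < num_samplers ? index : num_samplers - 1;
}

typedef float (*sample_fn)(float coord);

/* Dummy per-unit sampling functions standing in for generated code. */
static float sample_unit0(float c) { return c; }
static float sample_unit1(float c) { return c * 2.0f; }

/* Dispatch through a table, which plays the role of the switch. */
static float sample_dynamic(const unsigned idx[NUM_LANES],
                            uint32_t exec_mask, float coord)
{
   static const sample_fn units[] = { sample_unit0, sample_unit1 };
   unsigned num = (unsigned)(sizeof(units) / sizeof(units[0]));
   unsigned i = clamp_sampler_index(
      uniform_index_from_active_lane(idx, exec_mask), num);

   return units[i](coord);
}
```

The table needs every per-unit function to exist before the dispatch is built, which is the bookkeeping cost mentioned above as the argument for separate if statements.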


[Mesa-dev] [PATCH] glsl: add forgotten textureOffset function for sampler2DArrayShadow

2016-04-18 Thread sroland
From: Roland Scheidegger 

This was part of EXT_gpu_shader4 - as such it should have been supported
by glsl 130.
It was however forgotten, and not added until glsl 430 - with the wrong
syntax no less (glsl 430 mentions it was overlooked).
glsl 440 (but revision 8 only) fixed this finally for good.
It looks like most other implementations would support this with older
glsl versions as well, so just add this to the other glsl 130 textureOffset
functions.

(Completely untested...)
---
 src/compiler/glsl/builtin_functions.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index f488434..004beb9 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -1707,6 +1707,9 @@ builtin_builder::create_builtins()
 _texture(ir_tex, v130, glsl_type::uvec4_type, 
glsl_type::usampler2DArray_type, glsl_type::vec3_type, TEX_OFFSET),
 
 _texture(ir_tex, v130, glsl_type::float_type, 
glsl_type::sampler1DArrayShadow_type, glsl_type::vec3_type, TEX_OFFSET),
+/* the next one was forgotten in glsl spec (it's from 
EXT_gpu_shader4 initially),
+   had wrong syntax in 4.30, correct only in 4.40 but allow it 
in 130 */
+_texture(ir_tex, v130, glsl_type::float_type, 
glsl_type::sampler2DArrayShadow_type, glsl_type::vec4_type, TEX_OFFSET),
 
 _texture(ir_txb, v130_fs_only, glsl_type::vec4_type,  
glsl_type::sampler1D_type,  glsl_type::float_type, TEX_OFFSET),
 _texture(ir_txb, v130_fs_only, glsl_type::ivec4_type, 
glsl_type::isampler1D_type, glsl_type::float_type, TEX_OFFSET),
-- 
2.1.4



Re: [Mesa-dev] [PATCH] swr: dereference cbuf/zbuf/views on context destroy

2016-04-18 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

It has been pointed out to me on several occasions that the state
tracker should be unbinding all these, and I seem to recall finding
that code. But for some reason I've had to do the same thing to avoid
leaks.

On Mon, Apr 18, 2016 at 4:29 PM, Tim Rowley  wrote:
> Fixes resource memory leaks.
> ---
>  src/gallium/drivers/swr/swr_context.cpp | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/src/gallium/drivers/swr/swr_context.cpp 
> b/src/gallium/drivers/swr/swr_context.cpp
> index 46c79a1..5d311dd 100644
> --- a/src/gallium/drivers/swr/swr_context.cpp
> +++ b/src/gallium/drivers/swr/swr_context.cpp
> @@ -300,6 +300,21 @@ swr_destroy(struct pipe_context *pipe)
>
> /* Idle core before deleting context */
> SwrWaitForIdle(ctx->swrContext);
> +
> +   for (unsigned i = 0; i < PIPE_MAX_COLOR_BUFS; i++) {
> +      pipe_surface_reference(&ctx->framebuffer.cbufs[i], NULL);
> +   }
> +
> +   pipe_surface_reference(&ctx->framebuffer.zsbuf, NULL);
> +
> +   for (unsigned i = 0; i < Elements(ctx->sampler_views[0]); i++) {
> +      pipe_sampler_view_reference(&ctx->sampler_views[PIPE_SHADER_FRAGMENT][i], NULL);
> +   }
> +
> +   for (unsigned i = 0; i < Elements(ctx->sampler_views[0]); i++) {
> +      pipe_sampler_view_reference(&ctx->sampler_views[PIPE_SHADER_VERTEX][i], NULL);
> +   }
> +
> +
> if (ctx->swrContext)
>SwrDestroyContext(ctx->swrContext);
>
> --
> 2.5.0
>


[Mesa-dev] [PATCH] swr: dereference cbuf/zbuf/views on context destroy

2016-04-18 Thread Tim Rowley
Fixes resource memory leaks.
---
 src/gallium/drivers/swr/swr_context.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_context.cpp 
b/src/gallium/drivers/swr/swr_context.cpp
index 46c79a1..5d311dd 100644
--- a/src/gallium/drivers/swr/swr_context.cpp
+++ b/src/gallium/drivers/swr/swr_context.cpp
@@ -300,6 +300,21 @@ swr_destroy(struct pipe_context *pipe)
 
/* Idle core before deleting context */
SwrWaitForIdle(ctx->swrContext);
+
+   for (unsigned i = 0; i < PIPE_MAX_COLOR_BUFS; i++) {
+      pipe_surface_reference(&ctx->framebuffer.cbufs[i], NULL);
+   }
+
+   pipe_surface_reference(&ctx->framebuffer.zsbuf, NULL);
+
+   for (unsigned i = 0; i < Elements(ctx->sampler_views[0]); i++) {
+      pipe_sampler_view_reference(&ctx->sampler_views[PIPE_SHADER_FRAGMENT][i], NULL);
+   }
+
+   for (unsigned i = 0; i < Elements(ctx->sampler_views[0]); i++) {
+      pipe_sampler_view_reference(&ctx->sampler_views[PIPE_SHADER_VERTEX][i], NULL);
+   }
+
+
if (ctx->swrContext)
   SwrDestroyContext(ctx->swrContext);
 
-- 
2.5.0
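
Why pointing a binding at NULL releases the resource can be shown with a toy reference counter. This is a minimal sketch, assuming the usual "reference" helper semantics; struct fake_surface and fake_surface_reference() are hypothetical and are not the gallium pipe_surface_reference() implementation.

```c
#include <stddef.h>

struct fake_surface {
   int refcount;
   int destroyed;   /* set instead of freeing, to keep the sketch testable */
};

/* Point *ptr at new_surf, adjusting reference counts. Passing NULL
 * drops the reference held through *ptr.
 */
static void fake_surface_reference(struct fake_surface **ptr,
                                   struct fake_surface *new_surf)
{
   struct fake_surface *old = *ptr;

   if (old == new_surf)
      return;
   if (new_surf)
      new_surf->refcount++;
   if (old && --old->refcount == 0)
      old->destroyed = 1;
   *ptr = new_surf;
}
```

If a context is destroyed while its binding slots still point at surfaces, those counts never reach zero, which matches the leak described in the commit message.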



[Mesa-dev] [Bug 95003] [Clover / OpenCL] CL_DEVICE_WAVEFRONT_WIDTH_AMD - 0x4043 unimplemented

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=95003

Bug ID: 95003
   Summary: [Clover / OpenCL] CL_DEVICE_WAVEFRONT_WIDTH_AMD -
0x4043 unimplemented
   Product: Mesa
   Version: 11.2
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: ros...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

This is used to find out the core count of the GPU in oclHashcat. But it errors
before it can do so with CL_INVALID_VALUE from a clGetDeviceInfo call.

Specific GPU is a 6520G.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 94271] oclHashcat fails with ERROR: clGetDeviceInfo() : -30 : CL_INVALID_VALUE

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94271

ros...@gmail.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from ros...@gmail.com ---
found the issue

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2 2/2] glsl: raise warning when using uninitialized variables

2016-04-18 Thread Alejandro Piñeiro


On 18/04/16 16:11, Alejandro Piñeiro wrote:
> On 18/04/16 00:47, Ilia Mirkin wrote:
>> On Mon, Mar 28, 2016 at 2:50 PM, Alejandro Piñeiro  
>> wrote:
>>> v2:
>>>  * Take into account out varyings too (Timothy Arceri)
>>>  * Fix style (Timothy Arceri)
>>>  * Use a new ast_expression variable, instead of an
>>>ast_expression::hir new parameter (Timothy Arceri)
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94129
>>> ---
>>>  src/compiler/glsl/ast_to_hir.cpp | 7 +++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
>>> b/src/compiler/glsl/ast_to_hir.cpp
>>> index eb45f29..e38ab10 100644
>>> --- a/src/compiler/glsl/ast_to_hir.cpp
>>> +++ b/src/compiler/glsl/ast_to_hir.cpp
>>> @@ -1901,6 +1901,13 @@ ast_expression::do_hir(exec_list *instructions,
>>>if (var != NULL) {
>>>   var->data.used = true;
>>>   result = new(ctx) ir_dereference_variable(var);
>>> +
>>> + if ((var->data.mode == ir_var_auto || var->data.mode == 
>>> ir_var_shader_out)
>>> + && !this->is_lhs
>>> + && result->variable_referenced()->data.assigned != true) {
>>> +_mesa_glsl_warning(&loc, state, "`%s' used uninitialized",
>>> +   this->primary_expression.identifier);
>> This also appears to warn in the case of
>>
>> void foo(out float x) { x = 1.0; }
>>
>> float bar;
>> foo(bar);
>>
>> It thinks that bar is being used uninitialized. How do we fix this? It
>> happens a ton in Talos Principle.
> After checking a little: initially I thought it would be easy to solve
> because there is already some checks related to the parameters modes
> (in, out, inout) at verify_parameter_modes (ast_function.cpp). This
> method compares the formal parameter (x in your example) and the actual
> parameter (bar in your example), and checks that the modes are correct
> and other stuff, like for example setting assigned to true on the actual
> parameter. The problem is that this is done after the actual variable is
> processed, that is where the warning is raised. And in fact, in order to
> work, the actual parameter needs to be already processed, as the modes
> are part of the ir variable, not of the ast_node.

Correcting myself: although it is true that we need the modes, in
this case we need the modes from the formal parameter, which was
processed before. So for the one we are processing now (the actual one)
we can use the ast_node (that is what that method is using). So perhaps
it would be possible to add a step before processing (something
like "preverify_parameter_modes"), using only the ast nodes of the
actual variable. That would allow setting is_lhs there, and avoid having
two places raising the same warning. If not possible I will fallback on
what I mentioned on my previous email.
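
To make the idea concrete, the intended behaviour can be sketched outside the compiler. This is only an illustration: struct fake_var, enum param_mode and visit_call_argument() are hypothetical and do not exist in ast_to_hir.cpp or ast_function.cpp. The sketch assumes an actual parameter bound to an 'out' formal parameter is treated as a write target, so reading it "uninitialized" is not reported.

```c
#include <stdbool.h>

struct fake_var {
   bool assigned;
};

enum param_mode { PARAM_IN, PARAM_OUT, PARAM_INOUT };

/* A use of an unassigned variable warns unless it is a write target. */
static bool should_warn_uninitialized(const struct fake_var *var, bool is_lhs)
{
   return !is_lhs && !var->assigned;
}

/* Hypothetical pre-pass over one call argument. The mode comes from the
 * formal parameter, which is already known when the call is processed.
 * Returns true if a warning should be raised for the actual parameter.
 */
static bool visit_call_argument(struct fake_var *actual,
                                enum param_mode formal_mode)
{
   bool is_lhs = (formal_mode == PARAM_OUT);
   bool warn = should_warn_uninitialized(actual, is_lhs);

   /* out and inout parameters are written by the callee. */
   if (formal_mode != PARAM_IN)
      actual->assigned = true;
   return warn;
}
```

With this, the foo(bar) case from Ilia's mail produces no warning, while passing an unassigned variable to an 'in' or 'inout' parameter still does.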

>
> I think that a way to solve that would be to set is_lhs for any function
> parameter, and raise the uninitialized warning in
> verify_parameter_modes too. That would mean that we raise the warning in two
> different places, but I think that it would not be a big issue, because
> as mentioned, that code is already checking errors related to the
> actual and formal parameters and their modes.
>
> I will try to do this tomorrow.
>
> BR



Re: [Mesa-dev] [PATCH 00/13] i965/compiler: Use ISL for image_load_store format

2016-04-18 Thread Emil Velikov
On 17 April 2016 at 00:47, Jason Ekstrand  wrote:
> On Sat, Apr 16, 2016 at 4:40 PM, Emil Velikov 
> wrote:
>>
>> On 16 April 2016 at 20:45, Jason Ekstrand  wrote:
>> > This little series switches our back-end compiler to use libisl for the
>> > surface format introspection it needs for doing image_load_store shader
>> > work-arounds.  Format introspection is the one place where thet back-end
>> > compilers still have a dependency on libmesa.
>> >
>> > Once this dependency is removed, we can stop linking the Vulkan driver
>> > against libmesa and cut the size of libvulkan_intel.so down to about 2
>> > MB.
>> Nice one Jason. With vulkan landing in master I was about to ask you
>> guys about this (re: reusing isl getting rid of libmesa dependency).
>>
>> The size savings sound quite nice. Although it seems that isl might
>> need a bit more for earlier generations and/or msaa.
>>
>> Meanwhile I'll give src/intel, simplifying/folding things a bit.
>>
>> > Unfortunately, we're not *quite* ready for that yet.  The way that the
>> > different core compiler libraries are split up, libnir has a dependency
>> > on
>> > GLSL because glsl_to_nir is in libnir.  It'll take a bit of whack-a-mole
>> > with makefiles and linking to really get to that point.
>> >
>> Strange, I don't recall such an issue. Can you share a build log?
>
>
> If you want to experiment with it, go for it.
Some hacks that I've got so far are in for-jason/vulkan-without-libmesa
at https://github.com/evelikov/Mesa/

An alternative method (splitting glsl_to_nir out into a separate static
library) gets extremely hairy real quick. So be warned if you want to go
that route.

Then again, regardless of which method one goes with, we should really
start thinking about getting a common IR (and the infra around it)
factored out. If we discard GLSL, we also have mesa (prog) IR, which
will be extra tricky to get out unless i965_compiler drops its
dependency on prog_to_nir() and the like.

-Emil


Re: [Mesa-dev] [PATCH 4/4] i965: Properly handle integer types in opt_vector_float().

2016-04-18 Thread Matt Turner
On Sun, Apr 17, 2016 at 11:14 PM, Kenneth Graunke  wrote:
> Previously, opt_vector_float() always interpreted MOV sources as
> floating point, and always created a MOV with a F-type destination.
>
> This meant that we could mess up sequences of integer loads, such as:
>
>mov vgrf6.0.x:D, 0D
>mov vgrf6.0.y:D, 1D
>mov vgrf6.0.z:D, 2D
>mov vgrf6.0.w:D, 3D
>
> Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
>
>mov vgrf6.0:F, [0F, 0F, 0F, 0F]
>
> which is clearly wrong.  We can properly handle this by converting
> integer values to float (rather than bitcasting), and emitting a type
> converting MOV:
>
>mov vgrf6.0:D, [0F, 1F, 2F, 3F]
>
> To do this, first see if the integer values (converted to float)
> are representable.  If so, we use a D-type MOV.  If not, we then try
> the floating point values and an F-type MOV.  We make zero not impose
> type restrictions.  This is important because 0D would imply a D-type
> MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
> where we want to use an F-type MOV.
>
> Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
> recently became visible due to changes in opt_vector_float() which
> made it optimize more cases, but it was a pre-existing bug.
>
> Signed-off-by: Kenneth Graunke 

Hurts a single program in shader-db... for some reason related to
seeing a zero first?

In toki-tori-2/1, we see

-mov(8)  g18<1>.zwF  [0F, 0F, 0F, 1F]VF
+mov(8)  g18<1>.zUD  0x00000000UD
+mov(8)  g18<1>.wD   1065353216D

Ignore the UD type -- the generator changes D -> UD so it can compact
the instruction. It's actually type-D when opt_vector_float is called.


[Mesa-dev] [Bug 94955] Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94955

David Lonie  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #9 from David Lonie  ---
Reopening -- I may have found it. This other failing test looks better for
tracking this down. It segfaults reproducibly with this stack trace from
valgrind:

==19776==  Address 0x207abe3c is not stack'd, malloc'd or (recently) free'd
==19776== 
==19776== 
==19776== Process terminating with default action of signal 11 (SIGSEGV):
dumping core
==19776==  Access not within mapped region at address 0x207ABE3C
==19776==at 0x4039042: ???
==19776==by 0x113B377A: lp_rast_shade_quads_mask (lp_rast.c:457)
==19776==by 0x113B8E5A: do_block_4_2 (lp_rast_tri_tmp.h:67)
==19776==by 0x113B9194: do_block_16_2 (lp_rast_tri_tmp.h:152)
==19776==by 0x113B96DC: lp_rast_triangle_2 (lp_rast_tri_tmp.h:305)
==19776==by 0x113B3AC5: do_rasterize_bin (lp_rast.c:609)
==19776==by 0x113B3B33: rasterize_bin (lp_rast.c:628)
==19776==by 0x113B3C42: rasterize_scene (lp_rast.c:688)
==19776==by 0x113B3F27: thread_function (lp_rast.c:828)
==19776==by 0x113B1C8E: impl_thrd_routine (threads_posix.h:87)
==19776==by 0x11C7A423: start_thread (in /usr/lib/libpthread-2.23.so)
==19776==by 0x4F1DCBC: clone (in /usr/lib/libc-2.23.so)

This looks similar to the third trace in my original bug, but it's an access
violation instead of an uninitialized value, and some of the frames are
different.

The core dump valgrind generates is corrupted, unfortunately:

(gdb) bt
#0  0x04039042 in ?? ()
#1  0x3ebb67ae3ebb67ae in ?? ()
#2  0x3ebb67ae3ebb67ae in ?? ()
#3  0x3ebb67ae3ebb67ae in ?? ()
#4  0x3ebb67ae3ebb67ae in ?? ()
#5  0x in ?? ()

Running in gdb gives a similar backtrace. Tried getting an apitrace, but the
segfault is preventing that from producing anything meaningful, too.

Hopefully that stack will be enough...I wish I could get you guys a useful
apitrace or core dump to inspect, but this crash is doing a good job of
covering its tracks! Best I can do is provide instructions for running the VTK
test that reproduces it:

1) git clone https://gitlab.kitware.com/vtk/vtk.git
2) mkdir vtk-build
3) cd vtk-build
4) cmake ../vtk \
 -DOPENGL_INCLUDE_DIR=/path/to/mesa/install/prefix/include \
 -DOPENGL_gl_LIBRARY=/path/to/mesa/install/prefix/lib/libMesaGL.so \
 -DOPENGL_glu_LIBRARY=""
5) make

To run the test, either:

ctest -R TestTextureRGBADepthPeeling

or (to get around the ctest launcher for gdb/valgrind):

bin/vtkRenderingCoreCxxTests "TestTextureRGBADepthPeeling" "-D"
"ExternalData/Testing" "-T" "Testing/Temporary" "-V"
"ExternalData/Rendering/Core/Testing/Data/Baseline/TestTextureRGBADepthPeeling.png"



Re: [Mesa-dev] [PATCH 4/4] i965: Properly handle integer types in opt_vector_float().

2016-04-18 Thread Matt Turner
On Mon, Apr 18, 2016 at 7:20 AM, Iago Toral  wrote:
> On Sun, 2016-04-17 at 23:14 -0700, Kenneth Graunke wrote:
>> Previously, opt_vector_float() always interpreted MOV sources as
>> floating point, and always created a MOV with a F-type destination.
>>
>> This meant that we could mess up sequences of integer loads, such as:
>>
>>mov vgrf6.0.x:D, 0D
>>mov vgrf6.0.y:D, 1D
>>mov vgrf6.0.z:D, 2D
>>mov vgrf6.0.w:D, 3D
>>
>> Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
>>
>>mov vgrf6.0:F, [0F, 0F, 0F, 0F]
>>
>> which is clearly wrong.  We can properly handle this by converting
>> integer values to float (rather than bitcasting), and emitting a type
>> converting MOV:
>>
>>mov vgrf6.0:D, [0F, 1F, 2F, 3F]
>>
>> To do this, first see if the integer values (converted to float)
>> are representable.  If so, we use a D-type MOV.  If not, we then try
>> the floating point values and an F-type MOV.  We make zero not impose
>> type restrictions.  This is important because 0D would imply a D-type
>> MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
>> where we want to use an F-type MOV.
>>
>> Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
>> recently became visible due to changes in opt_vector_float() which
>> made it optimize more cases, but it was a pre-existing bug.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 24 
>>  1 file changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 12c3c66..2bdcf1f 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -361,9 +361,11 @@ vec4_visitor::opt_vector_float()
>> int inst_count = 0;
>> vec4_instruction *imm_inst[4];
>> unsigned writemask = 0;
>> +   enum brw_reg_type dest_type = BRW_REGISTER_TYPE_F;
>>
>> foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
>>int vf = -1;
>> +  enum brw_reg_type need_type;
>>
>>/* Look for unconditional MOVs from an immediate with a partial
>> * writemask.  Skip type-conversion MOVs other than integer 0,
>> @@ -375,14 +377,26 @@ vec4_visitor::opt_vector_float()
>>inst->predicate == BRW_PREDICATE_NONE &&
>>inst->dst.writemask != WRITEMASK_XYZW &&
>>(inst->src[0].type == inst->dst.type || inst->src[0].d == 0)) {
>> - vf = brw_float_to_vf(inst->src[0].f);
>> + vf = brw_float_to_vf(inst->src[0].d);
>> + need_type = BRW_REGISTER_TYPE_D;
>> +
>> + if (vf == -1) {
>> +vf = brw_float_to_vf(inst->src[0].f);
>> +need_type = BRW_REGISTER_TYPE_F;
>> + }
>
> If we are packing actual float values (not integers), doesn't this mean
> that we re-interpret them as integers and convert the re-interpreted
> integer value to float? If the result of that sequence of operations is
> representable it seems that we would just use a D-MOV from a float that
> no longer represents the original value, right?

I believe you're correct, but that there are no values for which this
could cause a problem.

Since the case we're trying to handle is loading, e.g., integer
<1,2,3,4> as a MOV dst:D, [1,2,3,4]VF it might be safer (if there are
in fact problematic values) and clearer to only attempt to reinterpret
the bits as an integer if the destination type is an integer type.


[Mesa-dev] [Bug 94955] Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94955

--- Comment #8 from Roland Scheidegger  ---
(In reply to David Lonie from comment #7)
> > Comment # 4 on bug 94955 from Roland Scheidegger
> > FWIW the trace has some issues, namely it requests a 4.5 context thus needs
> > overrides to run. Not sure if this actually causes problems.
> 
> The 4.5 override was needed for my apitrace replay context. The actual
> segfaults are happening on a machine using a different (I believe 3.2?)
> context.
> 
> Is there information around that details how to get a better apitrace for
> you folks? I have another segfaulting test I could capture for you.
> 
The problem is, if the trace requires a version override, it is difficult to
tell whether it is using any functions that are known not to really work.



Re: [Mesa-dev] [PATCH 1/4] i965: Rework opt_vector_float() control flow.

2016-04-18 Thread Matt Turner
On Sun, Apr 17, 2016 at 11:14 PM, Kenneth Graunke  wrote:
> This reworks opt_vector_float() so that there's only one place that
> flushes out any accumulated state and emits a VF.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 63 
> +++---
>  1 file changed, 35 insertions(+), 28 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 6433fc5..fa0d80d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -385,48 +385,55 @@ vec4_visitor::opt_vector_float()
> unsigned writemask = 0;
>
> foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
> -  if (last_reg != inst->dst.nr ||
> +  int vf = -1;
> +
> +  /* Look for unconditional MOVs from an immediate with a partial
> +   * writemask.  See if the immediate can be represented as a VF.
> +   */
> +  if (inst->opcode == BRW_OPCODE_MOV &&
> +  inst->src[0].file == IMM &&
> +  inst->predicate == BRW_PREDICATE_NONE &&
> +  inst->dst.writemask != WRITEMASK_XYZW) {
> + vf = brw_float_to_vf(inst->src[0].f);
> +  }
> +
> +  /* If this wasn't a MOV, or the value was non-representable, or
> +   * the destination register doesn't match, then this breaks our
> +   * sequence.  Combine anything we've accumulated so far.
> +   */
> +  if (vf == -1 ||

Is the value being non-representable an important point? Running
shader-db on the first three patches shows that something hurt 20
programs, and I think it's considering a non-representable value to
break the sequence.

For instance, in metro-last-light/2175 we see:

-mov(8)  g16<1>.yD   1033476506D
-mov(8)  g16<1>.xzF  [0.375F, 0F, 2.5F, 0F]VF
+mov(8)  g16<1>.xD   1052770304D
+mov(8)  g16<1>.yD   1033476506D
+mov(8)  g16<1>.zD   1075838976D

where the .y component isn't representable but x and z are. I suspect
the other 19 cases are the same problem.


Re: [Mesa-dev] [PATCH 0/4] Removing all double semi-colons

2016-04-18 Thread Ian Romanick
With the one typo in the commit message of patch 1 fixed, the series is

Reviewed-by: Ian Romanick 

If there are no objections or other comments, I'll go ahead and fix it
and push the series in the next day or so.

On 04/14/2016 09:07 AM, Jakob Sinclair wrote:
> This patch series removes all double semi-colons in the code that were left
> after my last patch series. These patches should not actually change anything.
> This is just cleanup work.
> 
> I don't have push access so someone reviewing will have to push this. My last
> series hasn't been pushed yet either.
> 
> Regards
> Jakob Sinclair
> 
> Jakob Sinclair (4):
>   egl: Remove every double semi-colon
>   gallium: Remove every double semi-colon
>   glx: Remove every double semi-colon
>   mesa: Remove every double semi-colon
> 
>  src/egl/drivers/dri2/platform_android.c   | 2 +-
>  src/egl/drivers/dri2/platform_surfaceless.c   | 2 +-
>  src/gallium/drivers/freedreno/a3xx/fd3_emit.c | 2 +-
>  src/gallium/drivers/ilo/ilo_resource.c| 2 +-
>  src/gallium/drivers/ilo/shader/ilo_shader_gs.c| 2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_ssa.cpp   | 2 +-
>  src/gallium/drivers/nouveau/nv30/nv30_vbo.c   | 2 +-
>  src/gallium/drivers/r600/r600_state_common.c  | 2 +-
>  src/gallium/drivers/swr/rasterizer/core/format_types.h| 8 
>  src/gallium/drivers/swr/rasterizer/memory/StoreTile.cpp   | 6 +++---
>  src/gallium/winsys/amdgpu/drm/addrlib/core/addrobject.cpp | 2 +-
>  src/gallium/winsys/amdgpu/drm/addrlib/r800/egbaddrlib.cpp | 2 +-
>  src/glx/dri2_glx.c| 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp| 2 +-
>  src/mesa/math/m_debug_norm.c  | 2 +-
>  15 files changed, 20 insertions(+), 20 deletions(-)
> 



Re: [Mesa-dev] [PATCH 1/4] egl: Remove every double semi-colon

2016-04-18 Thread Ian Romanick
On 04/14/2016 09:07 AM, Jakob Sinclair wrote:
> Removes all acidental semi-colons in egl.
  accidental

> 
> Signed-off-by: Jakob Sinclair 
> ---
>  src/egl/drivers/dri2/platform_android.c | 2 +-
>  src/egl/drivers/dri2/platform_surfaceless.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/egl/drivers/dri2/platform_android.c 
> b/src/egl/drivers/dri2/platform_android.c
> index 41840aa..c00b2c2 100644
> --- a/src/egl/drivers/dri2/platform_android.c
> +++ b/src/egl/drivers/dri2/platform_android.c
> @@ -514,7 +514,7 @@ droid_get_buffers_with_format(__DRIdrawable * driDrawable,
> if (height)
>*height = dri2_surf->base.Height;
>  
> -   *out_count = dri2_surf->buffer_count;;
> +   *out_count = dri2_surf->buffer_count;
>  
> return dri2_surf->buffers;
>  }
> diff --git a/src/egl/drivers/dri2/platform_surfaceless.c 
> b/src/egl/drivers/dri2/platform_surfaceless.c
> index 48f15df..e0ddc12 100644
> --- a/src/egl/drivers/dri2/platform_surfaceless.c
> +++ b/src/egl/drivers/dri2/platform_surfaceless.c
> @@ -68,7 +68,7 @@ surfaceless_get_buffers_with_format(__DRIdrawable * 
> driDrawable,
>*width = dri2_surf->base.Width;
> if (height)
>*height = dri2_surf->base.Height;
> -   *out_count = dri2_surf->buffer_count;;
> +   *out_count = dri2_surf->buffer_count;
> return dri2_surf->buffers;
>  }
>  
> 



Re: [Mesa-dev] [PATCH 1/7] util: add MAYBE_UNUSED for config dependent variables

2016-04-18 Thread Ian Romanick
On 04/18/2016 08:06 AM, Emil Velikov wrote:
> On 18 April 2016 at 04:43, Francisco Jerez  wrote:
>> Grazvydas Ignotas  writes:
>>
>>> On Sun, Apr 17, 2016 at 2:50 AM, Emil Velikov  
>>> wrote:
>>>> On 16 April 2016 at 02:00, Grazvydas Ignotas  wrote:
>>>>> This is mostly for variables that are only used in asserts and cause
>>>>> unused-but-set-variable warnings in release builds. Could just use
>>>>> UNUSED directly, but MAYBE_UNUSED should be less confusing and is
>>>>> similar to what the Linux kernel has.
>>>>>
>>>>> And yes __attribute__((unused)) can be used on variables on both GCC 4.2
>>>>> (oldest supported by mesa) and clang 3.0 (just some random old version,
>>>>> not sure what's the minimum for mesa).
>>>>>
>>>>> Signed-off-by: Grazvydas Ignotas 
>>>>> ---
>>>>> I have no commit access, if this patch is ok, please someone push.
>>>>>
>>>>>  src/util/macros.h | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/src/util/macros.h b/src/util/macros.h
>>>>> index 0c8958f..f081bb8 100644
>>>>> --- a/src/util/macros.h
>>>>> +++ b/src/util/macros.h
>>>>> @@ -204,6 +204,8 @@ do {   \
>>>>>  #define UNUSED
>>>>>  #endif
>>>>>
>>>>> +#define MAYBE_UNUSED UNUSED
>>>>> +
>>>> Hell yeah !
>>>>
>>>> A thing that comes to mind ... a while back we've been wondering about
>>>> (re)naming these just the way we do in the kernel. Namely
>>>> __maybe_unused in this case. Can you give that one a try and link a
>>>> branch.
>>>
>>> I hope you mean something like this?
>>> https://github.com/notaz/mesa/commits/warnings
>>>
>>
>> You guys know that in standard C any identifier (including a
>> preprocessor define) starting with double underscore is reserved for the
>> implementation and causes the program's behavior to be undefined?  The
>> kernel is kind of part of the implementation so they can do whatever
>> they want, but it seems dubious to do the same in a userspace library.
>> How about you use a single (or no) underscore if people prefer the
>> lowercase spelling?
>>
> I do recall this. As mentioned before (to Jose I believe) sadly the
> cat is out of the bag. Both within mesa itself and also in other
> projects.
> 
> IIRC the conclusion from last time was along the lines of "[I guess]
> it's ok if it does not break things". So far scons (linux + windows)
> look fine - autoconf's normal make looks ok as well (make check takes
> a while) ;-)
> 
> Personally I'd opt for consistency across projects. It gives us extra
> reassurance that things are unlikely to break under our feet.
> Although yes, it does suck (a bit) that things have turned out that way.

I agree with Curro.  I'd rather have consistency within Mesa (where
we've always used ALL_THE_UPPERCASE) than across projects.

> -Emil
> 



Re: [Mesa-dev] [PATCH] radeonsi: enable GLSL 4.30 and therefore OpenGL 4.3

2016-04-18 Thread Ian Romanick
On 04/15/2016 03:33 AM, Marek Olšák wrote:
> The same thing Nicolai said: This can be committed before the UE4
> compile failure is fixed.

Is there a bug filed for that problem?  Has anyone diagnosed the issue?

> Marek
> 
> On Fri, Apr 15, 2016 at 2:10 AM, Edward O'Callaghan
>  wrote:
>> This is the last necessary bit for OpenGL 4.3 support. All driver-specific
>> functionality has already been implemented as part of extensions.
>>
>> Signed-off-by: Edward O'Callaghan 
>> ---
>>  docs/GL3.txt   | 2 +-
>>  docs/relnotes/11.3.0.html  | 8 
>>  src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>> index 6b5e016..53aeb9e 100644
>> --- a/docs/GL3.txt
>> +++ b/docs/GL3.txt
>> @@ -162,7 +162,7 @@ GL 4.2, GLSL 4.20 -- all DONE: radeonsi
>>GL_ARB_map_buffer_alignment   DONE (all drivers)
>>
>>
>> -GL 4.3, GLSL 4.30:
>> +GL 4.3, GLSL 4.30 -- all DONE: radeonsi
>>
>>GL_ARB_arrays_of_arrays   DONE (all drivers that support GLSL 1.30)
>>GL_ARB_ES3_compatibility  DONE (all drivers that support GLSL 3.30)
>> diff --git a/docs/relnotes/11.3.0.html b/docs/relnotes/11.3.0.html
>> index 5a7083c..0bbd756 100644
>> --- a/docs/relnotes/11.3.0.html
>> +++ b/docs/relnotes/11.3.0.html
>> @@ -22,11 +22,11 @@ People who are concerned with stability and reliability 
>> should stick
>>  with a previous release or wait for Mesa 11.3.1.
>>  
>>  
>> -Mesa 11.3.0 implements the OpenGL 4.2 API, but the version reported by
>> +Mesa 11.3.0 implements the OpenGL 4.3 API, but the version reported by
>>  glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
>>  glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
>> -Some drivers don't support all the features required in OpenGL 4.2.  OpenGL
>> -4.2 is only available if requested at context creation
>> +Some drivers don't support all the features required in OpenGL 4.3.  OpenGL
>> +4.3 is only available if requested at context creation
>>  because compatibility contexts are not supported.
>>  
>>
>> @@ -44,7 +44,7 @@ Note: some of the new features are only available with 
>> certain drivers.
>>  
>>
>>  
>> -OpenGL 4.2 on radeonsi
>> +OpenGL 4.3 on radeonsi
>>  GL_ARB_compute_shader on radeonsi
>>  GL_ARB_framebuffer_no_attachments on nvc0, r600, radeonsi, softpipe
>>  GL_ARB_internalformat_query2 on all drivers
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index 94bd666..b789d4c 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -338,7 +338,7 @@ static int si_get_param(struct pipe_screen* pscreen, 
>> enum pipe_cap param)
>> return HAVE_LLVM >= 0x0309 ? 4 : 0;
>>
>> case PIPE_CAP_GLSL_FEATURE_LEVEL:
>> -   return HAVE_LLVM >= 0x0309 ? 420 :
>> +   return HAVE_LLVM >= 0x0309 ? 430 :
>>HAVE_LLVM >= 0x0307 ? 410 : 330;
>>
>> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
>> --
>> 2.5.5
>>
> 



[Mesa-dev] [Bug 94955] Uninitialized variables leads to random segfaults (valgrind log, apitrace attached)

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94955

--- Comment #7 from David Lonie  ---
> Comment # 1 on bug 94955 from Brian Paul
> (In reply to David Lonie from comment #0)
> > ==32054== Conditional jump or move depends on uninitialised value(s)
> > ==32054==    at 0x5367CF7: util_framebuffer_state_equal (u_framebuffer.c:58)
> > ==32054==    by 0x5444AFE: llvmpipe_set_framebuffer_state
> > (lp_state_surface.c:54)
> > ==32054==    by 0x53561DA: util_blitter_blit_generic (u_blitter.c:1694)
> > ==32054==    by 0x5356819: util_blitter_blit (u_blitter.c:1813)
> > ==32054==    by 0x544602C: lp_blit (lp_surface.c:117)
> > ==32054==    by 0x51705F7: st_CopyTexSubImage (st_cb_texture.c:2672)
> > ==32054==    by 0x50B2B03: copytexsubimage_by_slice (teximage.c:3459)
> > ==32054==    by 0x50B330D: copyteximage (teximage.c:3644)
> > ==32054==    by 0x50B3476: _mesa_CopyTexImage2D (teximage.c:3680)
> > ==32054==    by 0x4D340E: ??? (in /usr/bin/glretrace)
> > ==32054==    by 0x40: ??? (in /usr/bin/glretrace)
> > ==32054==    by 0x40D2A7: ??? (in /usr/bin/glretrace)
>
> This one looks easy to fix.  Though, I wasn't able to reproduce the valgrind
> warning here with piglit's copytexsubimage test which definitely hits the same
> code path.

I've poked at it, and this patch seems to do the trick for me:

--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1573,6 +1573,8 @@ void util_blitter_blit_generic(struct blitter_context
*blitter,
    fb_state.nr_cbufs = blit_depth || blit_stencil ? 0 : 1;
    fb_state.cbufs[0] = NULL;
    fb_state.zsbuf = NULL;
+   fb_state.samples = 0;
+   fb_state.layers = 0;

    if (blit_depth || blit_stencil) {
       pipe->bind_blend_state(pipe, ctx->blend[0][0]);

> Comment # 2 on bug 94955 from Roland Scheidegger
> (In reply to Brian Paul from comment #1)
> > (In reply to David Lonie from comment #0)
> But in any case, I HIGHLY doubt these two are the reason for any random
> segfaults, I certainly don't see any evidence here. So, a backtrace of the
> actual crash would probably be more useful.

Glad the other two seem harmless. If these aren't likely to cause a segfault
I'll keep poking around. The backtrace is difficult to obtain because by the
time it crashes the stack is corrupt and the resulting backtrace is
meaningless.

Since the issue is a stack corruption, it makes it difficult to step through,
since the crashes happen somewhat randomly, and when it does all debugger state
is lost since the stack is nonsense. Hence why I thought the memory errors
would be at fault, but it looks like this is going to be trickier and more
subtle than it seemed.

I'll keep looking and update when I find something else.

> Comment # 3 on bug 94955 from Emil Velikov
> (In reply to David Lonie from comment #0)
> > which confuses my linker/loader ;) Another bug?)
> > 
> Indeed it is - your colleague (?) Chuck Atkins is working on that one. See
> bug#94086.

So he is! Glad it's a known issue, that one took some work to track down ;)

> Comment # 4 on bug 94955 from Roland Scheidegger
> FWIW the trace has some issues, namely it requests a 4.5 context thus needs
> overrides to run. Not sure if this actually causes problems.

The 4.5 override was needed for my apitrace replay context. The actual
segfaults are happening on a machine using a different (I believe 3.2?)
context.

Is there information around that details how to get a better apitrace for you
folks? I have another segfaulting test I could capture for you.

> Comment # 6 on bug 94955 from Roland Scheidegger
> So, I'm going to mark this bug as fixed. Two minor issues have been addressed
> (well actually the fb one could have real consequences resulting in unnecessary
> state updates), and I'm not seeing any random segfaults in any case. Even if
> valgrind is right about the uninitialized values in the jit code that won't
> cause crashes (valgrind would say invalid read/write for anything which could
> crash). Feel free to open a new bug if you see crashes (preferably with a
> backtrace) or misrenderings.

Sounds good to me. Thanks so much for the fast replies and thorough checking of
the valgrind reports, even if they turned out to be bogus!



Re: [Mesa-dev] [PATCH 1/7] util: add MAYBE_UNUSED for config dependent variables

2016-04-18 Thread Francisco Jerez
Emil Velikov  writes:

> On 18 April 2016 at 04:43, Francisco Jerez  wrote:
>> Grazvydas Ignotas  writes:
>>
>>> On Sun, Apr 17, 2016 at 2:50 AM, Emil Velikov  
>>> wrote:
>>>> On 16 April 2016 at 02:00, Grazvydas Ignotas  wrote:
>>>>> This is mostly for variables that are only used in asserts and cause
>>>>> unused-but-set-variable warnings in release builds. Could just use
>>>>> UNUSED directly, but MAYBE_UNUSED should be less confusing and is
>>>>> similar to what the Linux kernel has.
>>>>>
>>>>> And yes __attribute__((unused)) can be used on variables on both GCC 4.2
>>>>> (oldest supported by mesa) and clang 3.0 (just some random old version,
>>>>> not sure what's the minimum for mesa).
>>>>>
>>>>> Signed-off-by: Grazvydas Ignotas 
>>>>> ---
>>>>> I have no commit access, if this patch is ok, please someone push.
>>>>>
>>>>>  src/util/macros.h | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/src/util/macros.h b/src/util/macros.h
>>>>> index 0c8958f..f081bb8 100644
>>>>> --- a/src/util/macros.h
>>>>> +++ b/src/util/macros.h
>>>>> @@ -204,6 +204,8 @@ do {   \
>>>>>  #define UNUSED
>>>>>  #endif
>>>>>
>>>>> +#define MAYBE_UNUSED UNUSED
>>>>> +
>>>> Hell yeah !
>>>>
>>>> A thing that comes to mind ... a while back we've been wondering about
>>>> (re)naming these just the way we do in the kernel. Namely
>>>> __maybe_unused in this case. Can you give that one a try and link a
>>>> branch.
>>>
>>> I hope you mean something like this?
>>> https://github.com/notaz/mesa/commits/warnings
>>>
>>
>> You guys know that in standard C any identifier (including a
>> preprocessor define) starting with double underscore is reserved for the
>> implementation and causes the program's behavior to be undefined?  The
>> kernel is kind of part of the implementation so they can do whatever
>> they want, but it seems dubious to do the same in a userspace library.
>> How about you use a single (or no) underscore if people prefer the
>> lowercase spelling?
>>
> I do recall this. As mentioned before (to Jose I believe) sadly the
> cat is out of the bag. Both within mesa itself and also in other
> projects.
>
> IIRC the conclusion from last time was along the lines of "[I guess]
> it's ok if it does not break things". So far scons (linux + windows)
> look fine - autoconf's normal make looks ok as well (make check takes
> a while) ;-)
>
> Personally I'd opt for consistency across projects. It gives us extra
> reassurance that things are unlikely to break under our feet.
> Although yes, it does suck (a bit) that things have turned out that way.
>
So you're suggesting that the precedent set by some other projects [that
happen to be mainly the kernel which has a wildly different set of
requirements than userspace code] should have more weight than
consistency with *Mesa's* own macro capitalization rules and the C
standard?

Sorry, but it seems rather pointless to me to ask Gražvydas to break the
rules knowingly.  His original patch has my vote and:

Reviewed-by: Francisco Jerez 

> -Emil
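
For readers following the naming argument, here is a minimal sketch of the uppercase spelling under discussion. It is a stand-alone illustration, not a copy of src/util/macros.h: the compiler check and the first_element() helper are assumptions made up for the example. The point is only that MAYBE_UNUSED does not begin with a double underscore, so it stays out of the identifier namespace the C standard reserves for the implementation.

```c
#include <assert.h>

/* Hypothetical stand-alone definitions; Mesa's real ones live in
 * src/util/macros.h and may differ in detail. */
#if defined(__GNUC__) || defined(__clang__)
#define UNUSED __attribute__((unused))
#else
#define UNUSED
#endif

/* No leading double underscore, so the name is not reserved. */
#define MAYBE_UNUSED UNUSED

/* 'n' is only read by assert(), so it becomes unused when NDEBUG is
 * defined; MAYBE_UNUSED keeps that build free of warnings. */
static int
first_element(const int *values, int count)
{
   MAYBE_UNUSED const int n = count;

   assert(n > 0);
   return values[0];
}
```

A lowercase maybe_unused or _maybe_unused spelling would work the same way; only the __maybe_unused form runs into the reserved-identifier rule.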




Re: [Mesa-dev] [PATCH] docs: correct name for GL_OES_primitive_bounding_box

2016-04-18 Thread Kenneth Graunke
On Monday, April 18, 2016 5:26:33 PM PDT Erik Faye-Lund wrote:
> When this extension was added, an underscore was mistakenly replaced
> by a space. Let's correct this, so it's a tad easier to grep for this
> extension.
> 
> Signed-off-by: Erik Faye-Lund 
> ---
> 
> Just a tiny nit I noticed while reading docs... 
> 
>  docs/GL3.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index dc75cf8..3febd6e 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -261,7 +261,7 @@ GLES3.2, GLSL ES 3.2
>    GL_OES_draw_elements_base_vertex                      DONE (all drivers)
>    GL_OES_geometry_shader                                started (Marta)
>    GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
> -  GL_OES_primitive_bounding box                         not started
> +  GL_OES_primitive_bounding_box                         not started
>    GL_OES_sample_shading                                 DONE (nvc0, r600, radeonsi)
>    GL_OES_sample_variables                               DONE (nvc0, r600, radeonsi)
>    GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
> 

Pushed, thanks!




Re: [Mesa-dev] [PATCH 3/4] [rfc] gallivm/llvmpipe dynamic samplers support.

2016-04-18 Thread Roland Scheidegger
Am 18.04.2016 um 04:49 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> This is a first attempt at adding support for dynamic indexing
> of samplers to llvmpipe. This is needed for ARB_gpu_shader5 support.
> 
> This uses the sampler function generator to generate functions
> for all samplers, then uses if statements to pick which one to call.
> 
> This passes all the tests in piglit except a couple of non-uniform
> ones which g
I can't quite parse this fully ;-).
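
As far as I can tell, the "generate a function per sampler, then pick one with if statements" idea from the quoted commit message looks roughly like the plain C analogue below. This is only a sketch: the real patch emits LLVM IR through gallivm, and NUM_SAMPLERS, sample0..sample2 and sample_dynamic() are hypothetical names invented for the illustration.

```c
#include <assert.h>

#define NUM_SAMPLERS 3   /* hypothetical sampler count */

/* Stand-ins for the per-sampler functions the generator would emit. */
static float sample0(float u) { return u; }
static float sample1(float u) { return u * 2.0f; }
static float sample2(float u) { return u * 4.0f; }

typedef float (*sample_fn)(float);

static const sample_fn samplers[NUM_SAMPLERS] = { sample0, sample1, sample2 };

/* Compare the runtime index against every known sampler and call the
 * one that matches, mirroring a chain of generated if statements. */
static float
sample_dynamic(unsigned index, float u)
{
   float texel = 0.0f;
   unsigned i;

   assert(index < NUM_SAMPLERS);
   for (i = 0; i < NUM_SAMPLERS; i++) {
      if (index == i)
         texel = samplers[i](u);
   }
   return texel;
}
```

The loop visits every sampler, so the cost grows with the sampler count rather than with the number of samplers a shader actually indexes dynamically.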



> ---
>  src/gallium/auxiliary/draw/draw_llvm.c|  4 +--
>  src/gallium/auxiliary/draw/draw_llvm.h|  2 +-
>  src/gallium/auxiliary/draw/draw_llvm_sample.c | 33 ++
>  src/gallium/auxiliary/gallivm/lp_bld_sample.h |  2 ++
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 25 +
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |  4 +++
>  src/gallium/drivers/llvmpipe/lp_state_fs.c|  2 +-
>  src/gallium/drivers/llvmpipe/lp_tex_sample.c  | 34 
> +++
>  src/gallium/drivers/llvmpipe/lp_tex_sample.h  |  3 +-
>  9 files changed, 88 insertions(+), 21 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
> b/src/gallium/auxiliary/draw/draw_llvm.c
> index 9c68d4f..d8418fc 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
> @@ -1678,7 +1678,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
> draw_llvm_variant *variant,
> LLVMBuildStore(builder, lp_build_zero(gallivm, lp_int_type(vs_type)), 
> clipmask_bool_ptr);
>  
> /* code generated texture sampling */
> -   sampler = 
> draw_llvm_sampler_soa_create(draw_llvm_variant_key_samplers(key));
> +   sampler = 
> draw_llvm_sampler_soa_create(draw_llvm_variant_key_samplers(key), 
> key->nr_samplers);
>  
> if (elts) {
>start = zero;
> @@ -2210,7 +2210,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
>draw_gs_jit_context_num_constants(variant->gallivm, context_ptr);
>  
> /* code generated texture sampling */
> -   sampler = draw_llvm_sampler_soa_create(variant->key.samplers);
> +   sampler = draw_llvm_sampler_soa_create(variant->key.samplers, 
> variant->key.nr_samplers);
>  
> mask_val = generate_mask_value(variant, gs_type);
> lp_build_mask_begin(, gallivm, gs_type, mask_val);
> diff --git a/src/gallium/auxiliary/draw/draw_llvm.h 
> b/src/gallium/auxiliary/draw/draw_llvm.h
> index 271433c..4231333 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm.h
> +++ b/src/gallium/auxiliary/draw/draw_llvm.h
> @@ -520,7 +520,7 @@ void
>  draw_gs_llvm_dump_variant_key(struct draw_gs_llvm_variant_key *key);
>  
>  struct lp_build_sampler_soa *
> -draw_llvm_sampler_soa_create(const struct draw_sampler_static_state 
> *static_state);
> +draw_llvm_sampler_soa_create(const struct draw_sampler_static_state 
> *static_state, int num_samplers);
>  
>  void
>  draw_llvm_set_sampler_state(struct draw_context *draw, unsigned 
> shader_stage);
> diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c 
> b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> index 7e25918..5a3f9e4 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> @@ -37,6 +37,7 @@
>  #include "gallivm/lp_bld_type.h"
>  #include "gallivm/lp_bld_sample.h"
>  #include "gallivm/lp_bld_tgsi.h"
> +#include "gallivm/lp_bld_flow.h"
>  
>  
>  #include "util/u_debug.h"
> @@ -58,6 +59,7 @@ struct draw_llvm_sampler_dynamic_state
>  {
> struct lp_sampler_dynamic_state base;
>  
> +   int num_samplers;
> const struct draw_sampler_static_state *static_state;
>  };
>  
> @@ -236,10 +238,29 @@ draw_llvm_sampler_soa_emit_fetch_texel(const struct 
> lp_build_sampler_soa *base,
> assert(texture_index < PIPE_MAX_SHADER_SAMPLER_VIEWS);
> assert(sampler_index < PIPE_MAX_SAMPLERS);
>  
> -   
> lp_build_sample_soa(>dynamic_state.static_state[texture_index].texture_state,
> -   
> >dynamic_state.static_state[sampler_index].sampler_state,
> -   >dynamic_state.base,
> -   gallivm, params);
> +   if (!params->sampler_is_indirect) {
> +  
> lp_build_sample_soa(>dynamic_state.static_state[texture_index].texture_state,
> +  
> >dynamic_state.static_state[sampler_index].sampler_state,
> +  >dynamic_state.base,
> +  gallivm, params);
> +   } else {
> +  int i;
> +  for (i = 0; i < 4; i++)
> + params->texel[i] = lp_build_alloca(gallivm, 
> lp_build_vec_type(gallivm, params->type), "tex_store");
> +
> + for (i = 0; i < sampler->dynamic_state.num_samplers; i++) {
Indentation.
And I think this should really make an effort to only do this for
samplers which are actually dynamically indexed, for the possible
range(s) (as they need to be declared as an array if I'm not mistaken).
I'm not entirely sure, but I believe this could even 

Re: [Mesa-dev] [PATCH] glsl: Checks for interpolation into its own function.

2016-04-18 Thread Andres Gomez
Hi,

I would really appreciate if you could find some time to review this
patch.

Thanks!

On Mon, 2016-04-04 at 19:50 +0300, Andres Gomez wrote:
> This generalizes the validation also to be done for variables inside
> interface blocks, which, for some cases, was missing.
> 
> For a discussion about the additional validation cases included see
> https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html
> and Khronos bug #15671.
> 
> Signed-off-by: Andres Gomez 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 316 +--
> 
>  1 file changed, 171 insertions(+), 145 deletions(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp
> b/src/compiler/glsl/ast_to_hir.cpp
> index 7c9be81..e4ebc6b 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -2792,8 +2792,164 @@ apply_explicit_binding(struct
> _mesa_glsl_parse_state *state,
>  }
>  
>  
> +static void
> +validate_interpolation_qualifier(struct _mesa_glsl_parse_state
> *state,
> + YYLTYPE *loc,
> + const glsl_interp_qualifier
> interpolation,
> + const struct ast_type_qualifier
> *qual,
> + const struct glsl_type *var_type,
> + ir_variable_mode mode)
> +{
> +   /* Interpolation qualifiers can only apply to shader inputs or
> outputs, but
> +* not to vertex shader inputs nor fragment shader outputs.
> +*
> +* From section 4.3 ("Storage Qualifiers") of the GLSL 1.30 spec:
> +*"Outputs from a vertex shader (out) and inputs to a
> fragment
> +*shader (in) can be further qualified with one or more of
> these
> +*interpolation qualifiers"
> +*...
> +*"These interpolation qualifiers may only precede the
> qualifiers in,
> +*centroid in, out, or centroid out in a declaration. They do
> not apply
> +*to the deprecated storage qualifiers varying or centroid
> +*varying. They also do not apply to inputs into a vertex
> shader or
> +*outputs from a fragment shader."
> +*
> +* From section 4.3 ("Storage Qualifiers") of the GLSL ES 3.00
> spec:
> +*"Outputs from a shader (out) and inputs to a shader (in)
> can be
> +*further qualified with one of these interpolation
> qualifiers."
> +*...
> +*"These interpolation qualifiers may only precede the
> qualifiers
> +*in, centroid in, out, or centroid out in a declaration.
> They do
> +*not apply to inputs into a vertex shader or outputs from a
> +*fragment shader."
> +*/
> +   if (state->is_version(130, 300)
> +   && interpolation != INTERP_QUALIFIER_NONE) {
> +  const char *i = interpolation_string(interpolation);
> +  if (mode != ir_var_shader_in && mode != ir_var_shader_out)
> + _mesa_glsl_error(loc, state,
> +  "interpolation qualifier `%s' can only be
> applied to "
> +  "shader inputs or outputs.", i);
> +
> +  switch (state->stage) {
> +  case MESA_SHADER_VERTEX:
> + if (mode == ir_var_shader_in) {
> +_mesa_glsl_error(loc, state,
> + "interpolation qualifier '%s' cannot be
> applied to "
> + "vertex shader inputs", i);
> + }
> + break;
> +  case MESA_SHADER_FRAGMENT:
> + if (mode == ir_var_shader_out) {
> +_mesa_glsl_error(loc, state,
> + "interpolation qualifier '%s' cannot be
> applied to "
> + "fragment shader outputs", i);
> + }
> + break;
> +  default:
> + break;
> +  }
> +   }
> +
> +   /* Interpolation qualifiers cannot be applied to 'centroid' and
> +* 'centroid varying'.
> +*
> +* From section 4.3 ("Storage Qualifiers") of the GLSL 1.30 spec:
> +*"interpolation qualifiers may only precede the qualifiers
> in,
> +*centroid in, out, or centroid out in a declaration. They do
> not apply
> +*to the deprecated storage qualifiers varying or centroid
> varying."
> +*
> +* These deprecated storage qualifiers do not exist in GLSL ES
> 3.00.
> +*/
> +   if (state->is_version(130, 0)
> +   && interpolation != INTERP_QUALIFIER_NONE
> +   && qual->flags.q.varying) {
> +
> +  const char *i = interpolation_string(interpolation);
> +  const char *s;
> +  if (qual->flags.q.centroid)
> + s = "centroid varying";
> +  else
> + s = "varying";
> +
> +  _mesa_glsl_error(loc, state,
> +   "qualifier '%s' cannot be applied to the "
> +   "deprecated storage qualifier '%s'", i, s);
> +   }
> +
> +   /* Integer fragment inputs must be qualified with 'flat'.  In
> GLSL ES,
> +* so must integer vertex outputs.
> +*
> +   

Re: [Mesa-dev] [PATCH 2/4] gallivm: prepare for dynamic texture sizes

2016-04-18 Thread Roland Scheidegger
Am 18.04.2016 um 04:49 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> Currently the texture member functions take a texture unit
> number. In order for TXQ to work we want to use a dynamic
> value here, so we need to pass in a value reference.
> 
> This means we can't do the assert or add the llvm name.
> 
> Most normal users still pass a constant value here;
> only the TXQ paths actually care that it gets the
> value.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/auxiliary/draw/draw_llvm_sample.c | 10 +++---
>  src/gallium/auxiliary/gallivm/lp_bld_sample.c | 13 ---
>  src/gallium/auxiliary/gallivm/lp_bld_sample.h | 18 +-
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 41 
> ---
>  src/gallium/drivers/llvmpipe/lp_tex_sample.c  | 10 +++---
>  5 files changed, 48 insertions(+), 44 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c 
> b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> index 1845c05..7e25918 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> @@ -85,7 +85,7 @@ static LLVMValueRef
>  draw_llvm_texture_member(const struct lp_sampler_dynamic_state *base,
>   struct gallivm_state *gallivm,
>   LLVMValueRef context_ptr,
> - unsigned texture_unit,
> + LLVMValueRef tex_unit_ref,
>   unsigned member_index,
>   const char *member_name,
>   boolean emit_load)
> @@ -95,14 +95,12 @@ draw_llvm_texture_member(const struct 
> lp_sampler_dynamic_state *base,
> LLVMValueRef ptr;
> LLVMValueRef res;
>  
> -   debug_assert(texture_unit < PIPE_MAX_SHADER_SAMPLER_VIEWS);
> -
> /* context[0] */
> indices[0] = lp_build_const_int32(gallivm, 0);
> /* context[0].textures */
> indices[1] = lp_build_const_int32(gallivm, DRAW_JIT_CTX_TEXTURES);
> /* context[0].textures[unit] */
> -   indices[2] = lp_build_const_int32(gallivm, texture_unit);
> +   indices[2] = tex_unit_ref;
> /* context[0].textures[unit].member */
> indices[3] = lp_build_const_int32(gallivm, member_index);
>  
> @@ -113,7 +111,7 @@ draw_llvm_texture_member(const struct 
> lp_sampler_dynamic_state *base,
> else
>res = ptr;
>  
> -   lp_build_name(res, "context.texture%u.%s", texture_unit, member_name);
> +//   lp_build_name(res, "context.texture%u.%s", texture_unit, member_name);
>  
> return res;
>  }
> @@ -179,7 +177,7 @@ draw_llvm_sampler_member(const struct 
> lp_sampler_dynamic_state *base,
> draw_llvm_texture_##_name( const struct lp_sampler_dynamic_state *base, \
>struct gallivm_state *gallivm,   \
>LLVMValueRef context_ptr,\
> -  unsigned texture_unit)   \
> +  LLVMValueRef texture_unit) 
>   \
> { \
>return draw_llvm_texture_member(base, gallivm, context_ptr, \
>texture_unit, _index, #_name, 
> _emit_load ); \
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> index 4befb3a..2f61cbf 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
> @@ -238,6 +238,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
> unsigned i;
> LLVMValueRef i32undef = 
> LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
> LLVMValueRef rho_xvec, rho_yvec;
> +   LLVMValueRef tunit = lp_build_const_int32(bld->gallivm, texture_unit);
>  
> /* Note that all simplified calculations will only work for isotropic 
> filtering */
>  
> @@ -247,7 +248,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
>  */
>  
> first_level = bld->dynamic_state->first_level(bld->dynamic_state, 
> bld->gallivm,
> - bld->context_ptr, 
> texture_unit);
> + bld->context_ptr, tunit);
> first_level_vec = lp_build_broadcast_scalar(int_size_bld, first_level);
> int_size = lp_build_minify(int_size_bld, bld->int_size, first_level_vec, 
> TRUE);
> float_size = lp_build_int_to_float(float_size_bld, int_size);
> @@ -904,11 +905,12 @@ lp_build_nearest_mip_level(struct 
> lp_build_sample_context *bld,
> struct lp_build_context *leveli_bld = >leveli_bld;
> struct lp_sampler_dynamic_state *dynamic_state = bld->dynamic_state;
> LLVMValueRef first_level, last_level, level;
> +   LLVMValueRef tunit = lp_build_const_int32(bld->gallivm, texture_unit);
>  
> first_level = dynamic_state->first_level(dynamic_state, bld->gallivm,
> -

Re: [Mesa-dev] i965: The future of blorp

2016-04-18 Thread Pohjolainen, Topi
On Sat, Apr 16, 2016 at 12:12:46PM -0700, Jason Ekstrand wrote:
>All,
>With Topi's gen8/9 blorp patches on the list, I wanted to start a brief
>discussion about the future of blorp in the hopes of us all being on
>the same page and not stepping on each other's toes.  I think everyone
>is now agreed that blorp is the future and GL meta should die.
>As we continue to work on the Vulkan driver, its need for blorp-like
>things increases.  We currently have all of the blits and resolves
>implemented in a Vulkan-based meta scheme.  While Vulkan doesn't have
>the same meta problems as GL (It's actually fairly clean), it still
>isn't a perfect fit.  The illusion of being able to use a
>hardware-agnostic API breaks down fairly quickly when you start doing
>hardware-specific things.  One example is HiZ resolves on gen7:  We
>can, in theory, add HiZ op bits to our side-band data structure that we
>pass to create_pipeline but a HiZ resolve pipeline isn't really that
>much like a normal pipeline.  You can make the argument that "we're
>re-using the normal pipeline creation code" or you can make the
>argument that "we're duplicating blorp".  In my brain, the second
>argument is starting to win.
>Where am I going with this?  What I think I'd eventually like to see is
>some sort of a unified blorp that can be used in both drivers.  (Note:
>That unified blorp might still end up being Vulkan meta; I'm still not
>sure.)  Sure, it may mean doing state setup 3 places instead of 2 but
>it also means getting the HiZ and fast-clear pipelines right 1 place
>instead of 2 and I feel that's a bit more important.
>As it currently stands, here's my plan:
> 1) Get Topi's gen8/9 blorp patches reviewed and merged.
> 2) Rework blorp to start using NIR shaders whenever possible.  I know
>Topi had a project at one point that tried to get us using GLSL shaders
>in blorp which ran into some problems.  With the compiler APIs
>refactored to accept NIR directly, turning a NIR shader into a binary
>is trivial.  We've had great success using NIR directly for building
>shaders for Vulkan meta and I think doing that in blorp would also be
>good.  This would also provide our first blorp code-sharing point
>between Vulkan and GL as a lot of the NIR code to build those shaders
>could be shared.

Sounds good to me. We are missing 16x msaa support in blorp blits. Ken and I
quickly discussed implementing that in glsl later (perhaps using the version
we have for meta). Using NIR would allow us to skip all the scanning and
parsing.
In general, moving from blorp's hand-crafted compiler to NIR (or glsl) is the
right direction in my opinion.

> 3) Start using the XML-generated packing structs for doing blorp state
>setup.  Again, we've had great success in the Vulkan driver with
>Kristian's XML-generated packing structs.  Using them in blorp where
>it's practical would substantially reduce the amount of code we have
>for blorp state setup and possibly let us unify gen7-9 (not sure how
>much we can unify, but certainly some).

I've been thinking of re-using the normal i965 upload logic. I've experimented
with that in the past - all that is needed is to make the interface more explicit
instead of just passing the current driver state to the emitters.
Having said that, I'm equally in favor of using something that worked for you guys
in Vulkan.

> 4) Figure out how to make blorp work in both drivers.  This is a bit
>open-ended as the two drivers have vastly different batch-submission
>and relocation-tracking models.  That said, I've been kicking this one
>around in my brain quite a bit lately and I think I'm starting to
>converge on something resembling a solution.
>I think each of the changes listed above has merit without the others,
>so if we decide to bail on the unified blorp plan at some point, we've
>still accomplished something. Thoughts?  Opinions?  Rotten tomatoes?
>As far as who does it, I'm more than willing to pick up the work and
>make it happen.  If others want to chip in, that's fine too.
>--Jason

I'm at least pretty open on how to proceed. Having more eyes on blorp and
especially on the compiler side certainly wouldn't hurt :)


[Mesa-dev] [Bug 94994] OSMesaGetProcAdress always fails on mangled OSMesa

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94994

Frederic Devernay  changed:

           What    |Removed                     |Added

         Depends on|                            |91724
                 CC|                            |frederic.dever...@m4x.org



[Mesa-dev] [Bug 91724] GL/gl_mangle.h misses symbols from GLES/gl.h

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91724

Frederic Devernay  changed:

           What    |Removed                     |Added

             Blocks|                            |94994



[Mesa-dev] [Bug 94994] OSMesaGetProcAdress always fails on mangled OSMesa

2016-04-18 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94994

Bug ID: 94994
   Summary: OSMesaGetProcAdress always fails on mangled OSMesa
   Product: Mesa
   Version: 11.2
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: frederic.dever...@m4x.org
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 123027
  --> https://bugs.freedesktop.org/attachment.cgi?id=123027&action=edit
patch

OSMesaGetProcAddress on mangled OSMesa checks that the first three characters
are "mgl", but the functions in the table are stored without the leading "m",
thus OSMesaGetProcAddress always returns a dummy function.

The proposed solution is similar to
https://lists.freedesktop.org/archives/mesa-dev/2015-September/095251.html :

If we are on mangled osmesa and the first character is "m", skip it and
proceed.

Thus OSMesaGetProcAddress("mglCreateShader") and
OSMesaGetProcAddress("glCreateShader") return the same thing on mangled OSMesa,
and the former doesn't work on non-mangled OSMesa.
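
A minimal sketch of the lookup behaviour described above, assuming a hypothetical helper called lookup_name() and a run-time mangled_build flag (the attached patch would more likely use a compile-time check, and this is not its code). It checks for the full "mgl" prefix before skipping the leading 'm', which is slightly stricter than testing the first character alone.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: on a mangled build, accept "mglFoo" by skipping
 * the leading 'm' so the name matches the table entry "glFoo". On a
 * non-mangled build the name is passed through untouched, so an
 * "mgl..." request will simply not be found in the table. */
static const char *
lookup_name(const char *name, int mangled_build)
{
   assert(name != NULL);
   if (mangled_build && strncmp(name, "mgl", 3) == 0)
      return name + 1;
   return name;
}
```

With this, both "mglCreateShader" and "glCreateShader" resolve to the same table key on a mangled build.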

This bug also depends on https://bugs.freedesktop.org/show_bug.cgi?id=91724
(which has a proposed solution)

Please note that mangled OSMesa is a very useful feature, and may even be
necessary in some situations, as I discussed in
https://bugs.freedesktop.org/show_bug.cgi?id=91724#c4

I provide scripts for building and installing various flavors of OSMesa, as
well as a simple test program that checks the return values of
OSMesaGetProcAddress at: https://github.com/devernay/osmesa-install

Fred (early Mesa contributor, see the Mesa 1.2.1 README!)



Re: [Mesa-dev] [PATCH v2 0/2] Fix Gallium RGB565 image support

2016-04-18 Thread Nicolas Dufresne
Le lundi 18 avril 2016 à 11:40 +0900, Michel Dänzer a écrit :
> On 17.04.2016 09:49, nico...@ndufresne.ca wrote:
> > 
> > From: Nicolas Dufresne 
> > 
> > Sorry for the long delay in breaking down this patch. I have now rebased
> > on top of a recent mesa tree. The first patch creates a new function to
> > convert DRI2 format into PIPE format (to avoid more copy paste). The
> > second fixes the wrong pitch to stride calculation, fixing RGB565
> > support. Note that, in that part of the code, pitch is considered to be
> > in pixels while stride is in bytes.
> > 
> > Nicolas Dufresne (2):
> >   gallium/dri2: Factor out DRI2 to PIPE_FORMAT conversion
> >   gallium/dri2: Fix RGB565 EGLImage creation
> > 
> >  src/gallium/state_trackers/dri/dri2.c | 105 +-
> > 
> >  1 file changed, 51 insertions(+), 54 deletions(-)
> The series is
> 
> Reviewed-by: Michel Dänzer 
> 
> Do you need somebody to push the patches for you?

Yes please, thanks for the review.
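
For anyone reading along, the pitch versus stride convention from the cover letter can be sketched as below. This only illustrates the unit conversion, it is not the code from the series, and stride_from_pitch() and its bytes_per_pixel parameter are hypothetical names.

```c
#include <assert.h>

/* Pitch counts pixels per row, stride counts bytes per row.
 * bytes_per_pixel is a hypothetical parameter: 2 for RGB565,
 * 4 for a 32-bit format such as ARGB8888. */
static unsigned
stride_from_pitch(unsigned pitch_pixels, unsigned bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return pitch_pixels * bytes_per_pixel;
}
```

A 640 pixel wide RGB565 row is therefore 1280 bytes, while the same row in a 32-bit format is 2560 bytes, so a conversion that assumes one fixed pixel size cannot be right for both.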



Re: [Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

2016-04-18 Thread Marek Olšák
On Mon, Apr 18, 2016 at 5:03 PM, Ilia Mirkin  wrote:
> On Mon, Apr 18, 2016 at 10:47 AM, Oded Gabbay  wrote:
>> On Thu, Apr 14, 2016 at 6:44 PM, Ilia Mirkin  wrote:
>>> On Thu, Apr 14, 2016 at 11:08 AM, Oded Gabbay  wrote:
> Wouldn't it make more sense to handle such issues in transfer_map?
> (i.e. create a staging memory area, and decode into it)? This assumes
> that the transfer_map() call has enough information to "do the right
> thing". I don't think it does today, but perhaps it could be taught?
 It doesn't have all the info today, that's for sure. I imagine though
 we can add parameters to it.

> That way everything that's in a pipe_resource is in some
> tightly-controlled format, and we specify the LE <-> BE parameters
> when converting between CPU-read/written and GPU-read/written data. I
> believe this is a better match for what's really happening, too. What
> do you think?
>
>   -ilia

 Unless I'm missing something, I think, at the end of the day, it will
 be the same issues as in my solution - per code path per format is a
 different case. That's because you will still need to "teach"
 transfer_map, per each transfer per format what to do. So one will
 need to go and debug every single code path there is in mesa for
 drawing/copying/reading/textures/etc., like what I did in the last 1.5
 months. It's a great learning experience but it won't give anything
 generic.

 Again, for example, in st_ReadPixels, I imagine you will need to give
 "different orders" to transfer_map for the two different scenarios -
 H/W blit and fallback. So what's the gain here ?

 If I'm missing something, please tell me.
>>>
>>> One of us is... let's figure out which one :)
>>>
>>> Here's my proposal:
>>>
>>> All data stored inside of resources is stored in a driver-happy
>>> format. The driver ensures that it's stored in proper endianness, etc.
>>> (Much like it does today wrt proper stride.)
>>>
>>> Blitting(/copying) between resources doesn't require any additional
>>> information, since you have the format(s) of the respective resources,
>>> and it's all inside the driver, so the driver does whatever it needs
>>> to do to make it all "work".
>>>
>>> *Accessing and modifying* resources (directly) from the CPU is what
>>> becomes tricky. The state tracker may have incorrect expectations of
>>> the actual backing data. There are a few different ways to resolve
>>> this. The one I'm proposing is that you only ever return a pointer to
>>> the directly underlying data if it matches the CPU's expectations
>>> (which will only be the case for byte-oriented array formats like
>>> PIPE_FORMAT_R8G8B8A8_* & co). Everything else, like e.g.
>>> PIPE_FORMAT_R5G6B5_UNORM and countless others, will have to go through
>>> a bounce buffer.
>>>
>>> At transfer map time, you convert the data from GPU-style to
>>> CPU-style, and copy back the relevant bits at unmap/flush time.
>>>
>>> This presents a nice clean boundary for this stuff. Instead of the
>>> state tracker trying to guess what the driver will do and feeding it
>>> endiannesses that it can't possibly guess properly, the tracking logic
>>> is relegated to the driver, and we extend the interfaces to allow the
>>> state tracker to access the data in a proper way.
>>>
>>> I believe the advantage of this scheme is that beyond adding format
>>> parameters to pipe_transfer_map() calls, there will not need to be any
>>> adjustments to the state trackers.
>>>
>>> One yet-to-be-resolved issue is what to do about glMapBuffer* - it
>>> maps a buffer, it's formatless (at map time), and yet the GPU will be
>>> required to interpret it correctly. We could decree that PIPE_BUFFER
>>> is just *always* an array of R8_UNORM and thus never needs any type of
>>> swapping. The driver needs to adjust accordingly to deal with accesses
>>> that don't fit that pattern (and where parameters can't be fed to the
>>> GPU to interpret it properly).
>>>
>>> I think something like the above will work. And I think it presents a
>>> cleaner barrier than your proposal, because none of the "this GPU can
>>> kinda-sorta understand BE, but not everywhere" details are ever
>>> exposed to the state tracker.
>>>
>>> Thoughts?
>>>
>>>   -ilia
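
The bounce-buffer step in the proposal above can be sketched in plain C as follows. This is only an illustration under the assumption of a 16-bit format such as PIPE_FORMAT_R5G6B5_UNORM whose GPU byte order is the opposite of the CPU's; swap16() and convert_to_bounce() are hypothetical helpers, not gallium functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit pixel. */
static uint16_t
swap16(uint16_t v)
{
   return (uint16_t)((v >> 8) | (v << 8));
}

/* Hypothetical map-time step: copy 16-bit pixels from the GPU-ordered
 * resource into a CPU-ordered bounce buffer. Running the same loop in
 * the other direction at unmap/flush time writes the changes back. */
static void
convert_to_bounce(uint16_t *bounce, const uint16_t *gpu, size_t count)
{
   size_t i;

   assert(bounce != NULL && gpu != NULL);
   for (i = 0; i < count; i++)
      bounce[i] = swap16(gpu[i]);
}
```

Byte-oriented array formats would skip this step entirely, which matches the case where the proposal hands out a direct pointer.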
>>
>> Ilia,
>>
>> To make the GPU do a conversion during blitting, I need to configure
>> registers. This is done in a couple of functions in the r600g driver
>> (r600_translate_texformat, r600_colorformat_endian_swap,
>> r600_translate_colorformat and r600_translate_colorswap).
>>
>> The problem is that transfer_map/unmap don't call directly to those
>> functions. They call other functions which eventually call those 4
>> functions. Among those "other" functions, there are several function
>> calls which are *not* in the r600g driver. i.e. we go back to generic
>> util 

Re: [Mesa-dev] [PATCH 8/9] llvmpipe: Test more vector lengths.

2016-04-18 Thread Tom Stellard
On Mon, Apr 18, 2016 at 10:14:35AM +0100, Jose Fonseca wrote:
> All powers of two up to the native vector length.
> 
> There is actually a bug in lp_build_round for v2, whereby it doesn't
> round to nearest.  Fixing is left to the future, but the test is now
> able to expect it to fail.
> ---
>  src/gallium/drivers/llvmpipe/lp_test_arit.c | 43 
> -
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_test_arit.c 
> b/src/gallium/drivers/llvmpipe/lp_test_arit.c
> index ba831f3..f3ba5a1 100644
> --- a/src/gallium/drivers/llvmpipe/lp_test_arit.c
> +++ b/src/gallium/drivers/llvmpipe/lp_test_arit.c
> @@ -297,14 +297,16 @@ unary_tests[] = {
>   */
>  static LLVMValueRef
>  build_unary_test_func(struct gallivm_state *gallivm,
> -  const struct unary_test_t *test)
> +  const struct unary_test_t *test,
> +  unsigned length,
> +  const char *test_name)
>  {
> -   struct lp_type type = lp_type_float_vec(32, lp_native_vector_width);
> +   struct lp_type type = lp_type_float_vec(32, length * 32);
> LLVMContextRef context = gallivm->context;
> LLVMModuleRef module = gallivm->module;
> LLVMTypeRef vf32t = lp_build_vec_type(gallivm, type);
> LLVMTypeRef args[2] = { LLVMPointerType(vf32t, 0), LLVMPointerType(vf32t, 
> 0) };
> -   LLVMValueRef func = LLVMAddFunction(module, test->name,
> +   LLVMValueRef func = LLVMAddFunction(module, test_name,
> 
> LLVMFunctionType(LLVMVoidTypeInContext(context),
>  args, 
> Elements(args), 0));
> LLVMValueRef arg0 = LLVMGetParam(func, 0);
> @@ -371,14 +373,15 @@ flush_denorm_to_zero(float val)
>   * Test one LLVM unary arithmetic builder function.
>   */
>  static boolean
> -test_unary(unsigned verbose, FILE *fp, const struct unary_test_t *test)
> +test_unary(unsigned verbose, FILE *fp, const struct unary_test_t *test, 
> unsigned length)
>  {
> +   char test_name[128];
> +   util_snprintf(test_name, sizeof test_name, "%s.v%u", test->name, length);
> struct gallivm_state *gallivm;
> LLVMValueRef test_func;
> unary_func_t test_func_jit;
> boolean success = TRUE;
> int i, j;
> -   int length = lp_native_vector_width / 32;
> float *in, *out;
>  
> in = align_malloc(length * 4, length * 4);
> @@ -391,7 +394,7 @@ test_unary(unsigned verbose, FILE *fp, const struct 
> unary_test_t *test)
>  
> gallivm = gallivm_create("test_module", LLVMGetGlobalContext());

This is not related to this patch, but the C++ equivalent of
LLVMGetGlobalContext() has been removed from LLVM, and I think the C
API may be removed at some point in the future, so these tests should
be migrated to use LLVMContextCreate().

-Tom

>  
> -   test_func = build_unary_test_func(gallivm, test);
> +   test_func = build_unary_test_func(gallivm, test, length, test_name);
>  
> gallivm_compile_module(gallivm);
>  
> @@ -411,6 +414,7 @@ test_unary(unsigned verbose, FILE *fp, const struct 
> unary_test_t *test)
>for (i = 0; i < num_vals; ++i) {
>   float testval, ref;
>   double error, precision;
> + boolean expected_pass = TRUE;
>   bool pass;
>  
>   testval = flush_denorm_to_zero(in[i]);
> @@ -429,14 +433,23 @@ test_unary(unsigned verbose, FILE *fp, const struct 
> unary_test_t *test)
>  continue;
>   }
>  
> - if (!pass || verbose) {
> -printf("%s(%.9g): ref = %.9g, out = %.9g, precision = %f bits, 
> %s\n",
> -  test->name, in[i], ref, out[i], precision,
> -  pass ? "PASS" : "FAIL");
> + if (test->ref ==  && length == 2 && 
> + ref != roundf(testval)) {
> +/* FIXME: The generic (non SSE) path in lp_build_iround, which is
> + * always taken for length==2 regardless of native round support,
> + * does not round to even. */
> +expected_pass = FALSE;
> + }
> +
> + if (pass != expected_pass || verbose) {
> +printf("%s(%.9g): ref = %.9g, out = %.9g, precision = %f bits, %s%s\n",
> +  test_name, in[i], ref, out[i], precision,
> +  pass ? "PASS" : "FAIL",
> +  !expected_pass ? (pass ? " (unexpected)" : " (expected)") : "");
>  fflush(stdout);
>   }
>  
> - if (!pass) {
> + if (pass != expected_pass) {
>  success = FALSE;
>   }
>}
> @@ -458,8 +471,12 @@ test_all(unsigned verbose, FILE *fp)
> int i;
>  
> for (i = 0; i < Elements(unary_tests); ++i) {
> -  if (!test_unary(verbose, fp, &unary_tests[i])) {
> - success = FALSE;
> +  unsigned max_length = lp_native_vector_width / 32;
> +  unsigned length;
> +  for (length = 1; length <= max_length; length *= 2) {
> + if 

[Mesa-dev] [PATCH] docs: correct name for GL_OES_primitive_bounding_box

2016-04-18 Thread Erik Faye-Lund
When this extension was added, an underscore was mistakenly replaced
by a space. Let's correct this, so it's a tad easier to grep for this
extension.

Signed-off-by: Erik Faye-Lund 
---

Just a tiny nit I noticed while reading docs... 

 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index dc75cf8..3febd6e 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -261,7 +261,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_draw_elements_base_vertex      DONE (all drivers)
   GL_OES_geometry_shader                started (Marta)
   GL_OES_gpu_shader5                    DONE (all drivers that support GL_ARB_gpu_shader5)
-  GL_OES_primitive_bounding box         not started
+  GL_OES_primitive_bounding_box         not started
   GL_OES_sample_shading                 DONE (nvc0, r600, radeonsi)
   GL_OES_sample_variables               DONE (nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic            DONE (all drivers that support GL_ARB_shader_image_load_store)
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] gallivm: convert size query to using a set of parameters.

2016-04-18 Thread Roland Scheidegger
Am 18.04.2016 um 04:49 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> This isn't currently that easy to expand, so fix it up
> before expanding it later to include dynamic samplers.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/gallium/auxiliary/draw/draw_llvm_sample.c | 22 ++-
>  src/gallium/auxiliary/gallivm/lp_bld_sample.h | 21 ---
>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 71 
> +++
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h   |  9 +--
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   | 18 +++---
>  src/gallium/drivers/llvmpipe/lp_tex_sample.c  | 22 ++-
>  src/gallium/drivers/swr/swr_tex_sample.cpp| 22 ++-
>  7 files changed, 71 insertions(+), 114 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_llvm_sample.c 
> b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> index cb31695..1845c05 100644
> --- a/src/gallium/auxiliary/draw/draw_llvm_sample.c
> +++ b/src/gallium/auxiliary/draw/draw_llvm_sample.c
> @@ -251,30 +251,16 @@ draw_llvm_sampler_soa_emit_fetch_texel(const struct 
> lp_build_sampler_soa *base,
>  static void
>  draw_llvm_sampler_soa_emit_size_query(const struct lp_build_sampler_soa 
> *base,
>struct gallivm_state *gallivm,
> -  struct lp_type type,
> -  unsigned texture_unit,
> -  unsigned target,
> -  LLVMValueRef context_ptr,
> -  boolean is_sviewinfo,
> -  enum lp_sampler_lod_property lod_property,
> -  LLVMValueRef explicit_lod, /* optional */
> -  LLVMValueRef *sizes_out)
> +  const struct lp_sampler_size_query_params *params)
>  {
> struct draw_llvm_sampler_soa *sampler = (struct draw_llvm_sampler_soa 
> *)base;
>  
> -   assert(texture_unit < PIPE_MAX_SHADER_SAMPLER_VIEWS);
> +   assert(params->texture_unit < PIPE_MAX_SHADER_SAMPLER_VIEWS);
>  
> lp_build_size_query_soa(gallivm,
> -   &sampler->dynamic_state.static_state[texture_unit].texture_state,
> +   &sampler->dynamic_state.static_state[params->texture_unit].texture_state,
>     &sampler->dynamic_state.base,
> -   type,
> -   texture_unit,
> -   target,
> -   context_ptr,
> -   is_sviewinfo,
> -   lod_property,
> -   explicit_lod,
> -   sizes_out);
> +   params);
>  }
>  
>  struct lp_build_sampler_soa *
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h 
> b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
> index 902ae41..9ec051a 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
> @@ -110,7 +110,17 @@ struct lp_sampler_params
> LLVMValueRef *texel;
>  };
>  
> -
> +struct lp_sampler_size_query_params
> +{
> +   struct lp_type int_type;
> +   unsigned texture_unit;
> +   unsigned target;
> +   LLVMValueRef context_ptr;
> +   boolean is_sviewinfo;
> +   enum lp_sampler_lod_property lod_property;
> +   LLVMValueRef explicit_lod;
> +   LLVMValueRef *sizes_out;
> +};
>  /**
>   * Texture static state.
>   *
> @@ -606,14 +616,7 @@ void
>  lp_build_size_query_soa(struct gallivm_state *gallivm,
>  const struct lp_static_texture_state *static_state,
>  struct lp_sampler_dynamic_state *dynamic_state,
> -struct lp_type int_type,
> -unsigned texture_unit,
> -unsigned target,
> -LLVMValueRef context_ptr,
> -boolean is_sviewinfo,
> -enum lp_sampler_lod_property lod_property,
> -LLVMValueRef explicit_lod,
> -LLVMValueRef *sizes_out);
> +const struct lp_sampler_size_query_params *params);
>  
>  void
>  lp_build_sample_nop(struct gallivm_state *gallivm, 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
> index 937948b..c16b1c9 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
> @@ -3439,14 +3439,7 @@ void
>  lp_build_size_query_soa(struct gallivm_state *gallivm,
>  const struct lp_static_texture_state *static_state,
>  struct lp_sampler_dynamic_state *dynamic_state,
> -struct lp_type int_type,
> -unsigned 

Re: [Mesa-dev] [PATCH 1/7] util: add MAYBE_UNUSED for config dependent variables

2016-04-18 Thread Emil Velikov
On 18 April 2016 at 04:43, Francisco Jerez  wrote:
> Grazvydas Ignotas  writes:
>
>> On Sun, Apr 17, 2016 at 2:50 AM, Emil Velikov  wrote:
>>> On 16 April 2016 at 02:00, Grazvydas Ignotas  wrote:
>>>> This is mostly for variables that are only used in asserts and cause
>>>> unused-but-set-variable warnings in release builds. Could just use
>>>> UNUSED directly, but MAYBE_UNUSED should be less confusing and is
>>>> similar to what the Linux kernel has.
>>>>
>>>> And yes __attribute__((unused)) can be used on variables on both GCC 4.2
>>>> (oldest supported by mesa) and clang 3.0 (just some random old version,
>>>> not sure what's the minimum for mesa).
>>>>
>>>> Signed-off-by: Grazvydas Ignotas 
>>>> ---
>>>> I have no commit access, if this patch is ok, please someone push.
>>>>
>>>>  src/util/macros.h | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/src/util/macros.h b/src/util/macros.h
>>>> index 0c8958f..f081bb8 100644
>>>> --- a/src/util/macros.h
>>>> +++ b/src/util/macros.h
>>>> @@ -204,6 +204,8 @@ do {   \
>>>>  #define UNUSED
>>>>  #endif
>>>>
>>>> +#define MAYBE_UNUSED UNUSED
>>>> +
 +
>>> Hell yeah !
>>>
>>> A thing that comes to mind ... a while back we've been wondering about
>>> (re)naming these just the way we do in the kernel. Namely
>>> __maybe_unused in this case. Can you give that one a try and link a
>>> branch.
>>
>> I hope you mean something like this?
>> https://github.com/notaz/mesa/commits/warnings
>>
>
> You guys know that in standard C any identifier (including a
> preprocessor define) starting with double underscore is reserved for the
> implementation and causes the program's behavior to be undefined?  The
> kernel is kind of part of the implementation so they can do whatever
> they want, but it seems dubious to do the same in a userspace library.
> How about you use a single (or no) underscore if people prefer the
> lowercase spelling?
>
I do recall this. As mentioned before (to Jose I believe) sadly the
cat is out of the bag. Both within mesa itself and also in other
projects.

IIRC the conclusion from last time was along the lines of "[I guess]
it's ok if it does not break things". So far scons (linux + windows)
look fine - autoconf's normal make looks ok as well (make check takes
a while) ;-)

Personally I'd opt for consistency across projects. It gives us extra
reassurance that things are unlikely to break under our feet.
Although yes, it does suck (a bit) that things have turned out that way.

-Emil
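
For readers following along, here is a minimal sketch of the use case the patch describes: a variable that is only read inside an assert(). The UNUSED definition below is a hypothetical stand-in for the one in src/util/macros.h, and count_bits() and checked_log2() are made-up helpers for illustration, not Mesa code.

```c
#include <assert.h>

/* Hypothetical stand-in for the UNUSED macro in src/util/macros.h. */
#if defined(__GNUC__) || defined(__clang__)
#define UNUSED __attribute__((unused))
#else
#define UNUSED
#endif

/* The spelling proposed in the patch. */
#define MAYBE_UNUSED UNUSED

/* Illustrative helper: number of set bits in v. */
static unsigned
count_bits(unsigned v)
{
   unsigned n = 0;
   for (; v != 0; v &= v - 1)
      n++;
   return n;
}

/* Illustrative helper: log2 of a power of two.
 * 'bits' is only consumed by the assert, so a release build (NDEBUG)
 * would otherwise warn about an unused-but-set variable.
 */
static int
checked_log2(unsigned v)
{
   MAYBE_UNUSED unsigned bits = count_bits(v);
   int log2 = 0;

   assert(bits == 1);
   while (v >>= 1)
      log2++;
   return log2;
}
```

Calling checked_log2(8) gives 3 in both debug and release builds; the attribute only silences the warning and does not change behaviour.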


Re: [Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

2016-04-18 Thread Ilia Mirkin
On Mon, Apr 18, 2016 at 10:47 AM, Oded Gabbay  wrote:
> On Thu, Apr 14, 2016 at 6:44 PM, Ilia Mirkin  wrote:
>> On Thu, Apr 14, 2016 at 11:08 AM, Oded Gabbay  wrote:
>>>> Wouldn't it make more sense to handle such issues in transfer_map?
>>>> (i.e. create a staging memory area, and decode into it)? This assumes
>>>> that the transfer_map() call has enough information to "do the right
>>>> thing". I don't think it does today, but perhaps it could be taught?
>>> It doesn't have all the info today, that's for sure. I imagine though
>>> we can add parameters to it.
>>>
>>>> That way everything that's in a pipe_resource is in some
>>>> tightly-controlled format, and we specify the LE <-> BE parameters
>>>> when converting between CPU-read/written and GPU-read/written data. I
>>>> believe this is a better match for what's really happening, too. What
>>>> do you think?
>>>>
>>>>   -ilia
>>>
>>> Unless I'm missing something, I think, at the end of the day, it will
>>> be the same issues as in my solution - per code path per format is a
>>> different case. That's because you will still need to "teach"
>>> transfer_map, per each transfer per format what to do. So one will
>>> need to go and debug every single code path there is in mesa for
>>> drawing/copying/reading/textures/etc., like what I did in the last 1.5
>>> months. It's a great learning experience but it won't give anything
>>> generic.
>>>
>>> Again, for example, in st_ReadPixels, I imagine you will need to give
>>> "different orders" to transfer_map for the two different scenarios -
>>> H/W blit and fallback. So what's the gain here ?
>>>
>>> If I'm missing something, please tell me.
>>
>> One of us is... let's figure out which one :)
>>
>> Here's my proposal:
>>
>> All data stored inside of resources is stored in a driver-happy
>> format. The driver ensures that it's stored in proper endianness, etc.
>> (Much like it does today wrt proper stride.)
>>
>> Blitting(/copying) between resources doesn't require any additional
>> information, since you have the format(s) of the respective resources,
>> and it's all inside the driver, so the driver does whatever it needs
>> to do to make it all "work".
>>
>> *Accessing and modifying* resources (directly) from the CPU is what
>> becomes tricky. The state tracker may have incorrect expectations of
>> the actual backing data. There are a few different ways to resolve
>> this. The one I'm proposing is that you only ever return a pointer to
>> the directly underlying data if it matches the CPU's expectations
>> (which will only be the case for byte-oriented array formats like
>> PIPE_FORMAT_R8G8B8A8_* & co). Everything else, like e.g.
>> PIPE_FORMAT_R5G6B5_UNORM and countless others, will have to go through
>> a bounce buffer.
>>
>> At transfer map time, you convert the data from GPU-style to
>> CPU-style, and copy back the relevant bits at unmap/flush time.
>>
>> This presents a nice clean boundary for this stuff. Instead of the
>> state tracker trying to guess what the driver will do and feeding it
>> endiannesses that it can't possibly guess properly, the tracking logic
>> is relegated to the driver, and we extend the interfaces to allow the
>> state tracker to access the data in a proper way.
>>
>> I believe the advantage of this scheme is that beyond adding format
>> parameters to pipe_transfer_map() calls, there will not need to be any
>> adjustments to the state trackers.
>>
>> One yet-to-be-resolved issue is what to do about glMapBuffer* - it
>> maps a buffer, it's formatless (at map time), and yet the GPU will be
>> required to interpret it correctly. We could decree that PIPE_BUFFER
>> is just *always* an array of R8_UNORM and thus never needs any type of
>> swapping. The driver needs to adjust accordingly to deal with accesses
>> that don't fit that pattern (and where parameters can't be fed to the
>> GPU to interpret it properly).
>>
>> I think something like the above will work. And I think it presents a
>> cleaner barrier than your proposal, because none of the "this GPU can
>> kinda-sorta understand BE, but not everywhere" details are ever
>> exposed to the state tracker.
>>
>> Thoughts?
>>
>>   -ilia
>
> Ilia,
>
> To make the GPU do a conversion during blitting, I need to configure
> registers. This is done in a couple of functions in the r600g driver
> (r600_translate_texformat, r600_colorformat_endian_swap,
> r600_translate_colorformat and r600_translate_colorswap).
>
> The problem is that transfer_map/unmap don't call directly to those
> functions. They call other functions which eventually call those 4
> functions. Among those "other" functions, there are several function
> calls which are *not* in the r600g driver. i.e. we go back to generic
> util functions. For example:
>
> #0  r600_translate_colorformat
> #1  evergreen_init_color_surface
> #2  evergreen_set_framebuffer_state
> #3  util_blitter_custom_depth_stencil
> #4  

Re: [Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

2016-04-18 Thread Rob Clark
On Mon, Apr 18, 2016 at 10:47 AM, Oded Gabbay  wrote:
> On Thu, Apr 14, 2016 at 6:44 PM, Ilia Mirkin  wrote:
>> On Thu, Apr 14, 2016 at 11:08 AM, Oded Gabbay  wrote:
>>>> Wouldn't it make more sense to handle such issues in transfer_map?
>>>> (i.e. create a staging memory area, and decode into it)? This assumes
>>>> that the transfer_map() call has enough information to "do the right
>>>> thing". I don't think it does today, but perhaps it could be taught?
>>> It doesn't have all the info today, that's for sure. I imagine though
>>> we can add parameters to it.
>>>
>>>> That way everything that's in a pipe_resource is in some
>>>> tightly-controlled format, and we specify the LE <-> BE parameters
>>>> when converting between CPU-read/written and GPU-read/written data. I
>>>> believe this is a better match for what's really happening, too. What
>>>> do you think?
>>>>
>>>>   -ilia
>>>
>>> Unless I'm missing something, I think, at the end of the day, it will
>>> be the same issues as in my solution - per code path per format is a
>>> different case. That's because you will still need to "teach"
>>> transfer_map, per each transfer per format what to do. So one will
>>> need to go and debug every single code path there is in mesa for
>>> drawing/copying/reading/textures/etc., like what I did in the last 1.5
>>> months. It's a great learning experience but it won't give anything
>>> generic.
>>>
>>> Again, for example, in st_ReadPixels, I imagine you will need to give
>>> "different orders" to transfer_map for the two different scenarios -
>>> H/W blit and fallback. So what's the gain here ?
>>>
>>> If I'm missing something, please tell me.
>>
>> One of us is... let's figure out which one :)
>>
>> Here's my proposal:
>>
>> All data stored inside of resources is stored in a driver-happy
>> format. The driver ensures that it's stored in proper endianness, etc.
>> (Much like it does today wrt proper stride.)
>>
>> Blitting(/copying) between resources doesn't require any additional
>> information, since you have the format(s) of the respective resources,
>> and it's all inside the driver, so the driver does whatever it needs
>> to do to make it all "work".
>>
>> *Accessing and modifying* resources (directly) from the CPU is what
>> becomes tricky. The state tracker may have incorrect expectations of
>> the actual backing data. There are a few different ways to resolve
>> this. The one I'm proposing is that you only ever return a pointer to
>> the directly underlying data if it matches the CPU's expectations
>> (which will only be the case for byte-oriented array formats like
>> PIPE_FORMAT_R8G8B8A8_* & co). Everything else, like e.g.
>> PIPE_FORMAT_R5G6B5_UNORM and countless others, will have to go through
>> a bounce buffer.
>>
>> At transfer map time, you convert the data from GPU-style to
>> CPU-style, and copy back the relevant bits at unmap/flush time.
>>
>> This presents a nice clean boundary for this stuff. Instead of the
>> state tracker trying to guess what the driver will do and feeding it
>> endiannesses that it can't possibly guess properly, the tracking logic
>> is relegated to the driver, and we extend the interfaces to allow the
>> state tracker to access the data in a proper way.
>>
>> I believe the advantage of this scheme is that beyond adding format
>> parameters to pipe_transfer_map() calls, there will not need to be any
>> adjustments to the state trackers.
>>
>> One yet-to-be-resolved issue is what to do about glMapBuffer* - it
>> maps a buffer, it's formatless (at map time), and yet the GPU will be
>> required to interpret it correctly. We could decree that PIPE_BUFFER
>> is just *always* an array of R8_UNORM and thus never needs any type of
>> swapping. The driver needs to adjust accordingly to deal with accesses
>> that don't fit that pattern (and where parameters can't be fed to the
>> GPU to interpret it properly).
>>
>> I think something like the above will work. And I think it presents a
>> cleaner barrier than your proposal, because none of the "this GPU can
>> kinda-sorta understand BE, but not everywhere" details are ever
>> exposed to the state tracker.
>>
>> Thoughts?
>>
>>   -ilia
>
> Ilia,
>
> To make the GPU do a conversion during blitting, I need to configure
> registers. This is done in a couple of functions in the r600g driver
> (r600_translate_texformat, r600_colorformat_endian_swap,
> r600_translate_colorformat and r600_translate_colorswap).
>
> The problem is that transfer_map/unmap don't call directly to those
> functions. They call other functions which eventually call those 4
> functions. Among those "other" functions, there are several function
> calls which are *not* in the r600g driver. i.e. we go back to generic
> util functions. For example:
>
> #0  r600_translate_colorformat
> #1  evergreen_init_color_surface
> #2  evergreen_set_framebuffer_state
> #3  util_blitter_custom_depth_stencil
> #4  

Re: [Mesa-dev] [PATCH 4/7] radeonsi: don't use ACQUIRE_MEM on the graphics ring

2016-04-18 Thread Alex Deucher
On Sun, Apr 17, 2016 at 12:11 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> It's only required on the compute ring. This matches the closed driver.
>
> The compute flag is removed to prevent confusion and Bas's compute shader
> patches remove it in the whole function.

FWIW, these are effectively the same packet.  Surface sync will
eventually be replaced by acquire mem since the only real difference
is the expansion to support a larger address space.

Alex

> ---
>  src/gallium/drivers/radeonsi/si_state_draw.c | 26 --
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
> b/src/gallium/drivers/radeonsi/si_state_draw.c
> index 8f4eba4..86fb443 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -699,26 +699,16 @@ void si_emit_cache_flush(struct si_context *si_ctx, 
> struct r600_atom *atom)
> radeon_emit(cs, EVENT_TYPE(V_028A90_VGT_STREAMOUT_SYNC) | 
> EVENT_INDEX(0));
> }
>
> -   /* SURFACE_SYNC must be emitted after partial flushes.
> -* It looks like SURFACE_SYNC flushes caches immediately and doesn't
> -* wait for any engines. This should be last.
> +   /* When one of the DEST_BASE flags is set, SURFACE_SYNC waits for idle.
> +* Therefore, it should be last.
>  */
> if (cp_coher_cntl) {
> -   if (sctx->chip_class >= CIK) {
> -   radeon_emit(cs, PKT3(PKT3_ACQUIRE_MEM, 5, 0) | 
> compute);
> -   radeon_emit(cs, cp_coher_cntl);   /* CP_COHER_CNTL */
> -   radeon_emit(cs, 0x);  /* CP_COHER_SIZE */
> -   radeon_emit(cs, 0xff);/* CP_COHER_SIZE_HI 
> */
> -   radeon_emit(cs, 0);   /* CP_COHER_BASE */
> -   radeon_emit(cs, 0);   /* CP_COHER_BASE_HI 
> */
> -   radeon_emit(cs, 0x000A);  /* POLL_INTERVAL */
> -   } else {
> -   radeon_emit(cs, PKT3(PKT3_SURFACE_SYNC, 3, 0) | 
> compute);
> -   radeon_emit(cs, cp_coher_cntl);   /* CP_COHER_CNTL */
> -   radeon_emit(cs, 0x);  /* CP_COHER_SIZE */
> -   radeon_emit(cs, 0);   /* CP_COHER_BASE */
> -   radeon_emit(cs, 0x000A);  /* POLL_INTERVAL */
> -   }
> +   /* ACQUIRE_MEM is only required on a compute ring. */
> +   radeon_emit(cs, PKT3(PKT3_SURFACE_SYNC, 3, 0));
> +   radeon_emit(cs, cp_coher_cntl);   /* CP_COHER_CNTL */
> +   radeon_emit(cs, 0x);  /* CP_COHER_SIZE */
> +   radeon_emit(cs, 0);   /* CP_COHER_BASE */
> +   radeon_emit(cs, 0x000A);  /* POLL_INTERVAL */
> }
>
> if (sctx->flags & R600_CONTEXT_START_PIPELINE_STATS) {
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 04/11] gallium: add endian_format field to struct pipe_resource

2016-04-18 Thread Oded Gabbay
On Thu, Apr 14, 2016 at 6:44 PM, Ilia Mirkin  wrote:
> On Thu, Apr 14, 2016 at 11:08 AM, Oded Gabbay  wrote:
>>> Wouldn't it make more sense to handle such issues in transfer_map?
>>> (i.e. create a staging memory area, and decode into it)? This assumes
>>> that the transfer_map() call has enough information to "do the right
>>> thing". I don't think it does today, but perhaps it could be taught?
>> It doesn't have all the info today, that's for sure. I imagine though
>> we can add parameters to it.
>>
>>> That way everything that's in a pipe_resource is in some
>>> tightly-controlled format, and we specify the LE <-> BE parameters
>>> when converting between CPU-read/written and GPU-read/written data. I
>>> believe this is a better match for what's really happening, too. What
>>> do you think?
>>>
>>>   -ilia
>>
>> Unless I'm missing something, I think, at the end of the day, it will
>> be the same issues as in my solution - per code path per format is a
>> different case. That's because you will still need to "teach"
>> transfer_map, per each transfer per format what to do. So one will
>> need to go and debug every single code path there is in mesa for
>> drawing/copying/reading/textures/etc., like what I did in the last 1.5
>> months. It's a great learning experience but it won't give anything
>> generic.
>>
>> Again, for example, in st_ReadPixels, I imagine you will need to give
>> "different orders" to transfer_map for the two different scenarios -
>> H/W blit and fallback. So what's the gain here ?
>>
>> If I'm missing something, please tell me.
>
> One of us is... let's figure out which one :)
>
> Here's my proposal:
>
> All data stored inside of resources is stored in a driver-happy
> format. The driver ensures that it's stored in proper endianness, etc.
> (Much like it does today wrt proper stride.)
>
> Blitting(/copying) between resources doesn't require any additional
> information, since you have the format(s) of the respective resources,
> and it's all inside the driver, so the driver does whatever it needs
> to do to make it all "work".
>
> *Accessing and modifying* resources (directly) from the CPU is what
> becomes tricky. The state tracker may have incorrect expectations of
> the actual backing data. There are a few different ways to resolve
> this. The one I'm proposing is that you only ever return a pointer to
> the directly underlying data if it matches the CPU's expectations
> (which will only be the case for byte-oriented array formats like
> PIPE_FORMAT_R8G8B8A8_* & co). Everything else, like e.g.
> PIPE_FORMAT_R5G6B5_UNORM and countless others, will have to go through
> a bounce buffer.
>
> At transfer map time, you convert the data from GPU-style to
> CPU-style, and copy back the relevant bits at unmap/flush time.
>
> This presents a nice clean boundary for this stuff. Instead of the
> state tracker trying to guess what the driver will do and feeding it
> endiannesses that it can't possibly guess properly, the tracking logic
> is relegated to the driver, and we extend the interfaces to allow the
> state tracker to access the data in a proper way.
>
> I believe the advantage of this scheme is that beyond adding format
> parameters to pipe_transfer_map() calls, there will not need to be any
> adjustments to the state trackers.
>
> One yet-to-be-resolved issue is what to do about glMapBuffer* - it
> maps a buffer, it's formatless (at map time), and yet the GPU will be
> required to interpret it correctly. We could decree that PIPE_BUFFER
> is just *always* an array of R8_UNORM and thus never needs any type of
> swapping. The driver needs to adjust accordingly to deal with accesses
> that don't fit that pattern (and where parameters can't be fed to the
> GPU to interpret it properly).
>
> I think something like the above will work. And I think it presents a
> cleaner barrier than your proposal, because none of the "this GPU can
> kinda-sorta understand BE, but not everywhere" details are ever
> exposed to the state tracker.
>
> Thoughts?
>
>   -ilia
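
To make the quoted bounce-buffer proposal concrete, here is a rough sketch of the map/unmap conversion for a 16-bit packed format such as PIPE_FORMAT_R5G6B5_UNORM. All names (swap16, bounce_map_16, bounce_unmap_16, the gpu_is_foreign_endian flag) are hypothetical and exist only for this illustration; none of them are gallium or r600g functions, and a real driver would likely let the GPU do the conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of one 16-bit texel. */
static uint16_t
swap16(uint16_t v)
{
   return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical "map" step: decode GPU-order texels into a CPU-order
 * staging (bounce) buffer. When the orders already match, as for
 * byte-oriented array formats, this degenerates to a plain copy.
 */
static void
bounce_map_16(uint16_t *staging, const uint16_t *gpu, size_t count,
              int gpu_is_foreign_endian)
{
   for (size_t i = 0; i < count; i++)
      staging[i] = gpu_is_foreign_endian ? swap16(gpu[i]) : gpu[i];
}

/* Hypothetical "unmap/flush" step: encode the staging buffer back into
 * GPU order. The byte swap is its own inverse, so the loop is symmetric.
 */
static void
bounce_unmap_16(uint16_t *gpu, const uint16_t *staging, size_t count,
                int gpu_is_foreign_endian)
{
   for (size_t i = 0; i < count; i++)
      gpu[i] = gpu_is_foreign_endian ? swap16(staging[i]) : staging[i];
}
```

The point of the sketch is only that the state tracker sees CPU-order data between map and unmap and never has to know which order the resource itself uses.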

Ilia,

To make the GPU do a conversion during blitting, I need to configure
registers. This is done in a couple of functions in the r600g driver
(r600_translate_texformat, r600_colorformat_endian_swap,
r600_translate_colorformat and r600_translate_colorswap).

The problem is that transfer_map/unmap don't call directly to those
functions. They call other functions which eventually call those 4
functions. Among those "other" functions, there are several function
calls which are *not* in the r600g driver. i.e. we go back to generic
util functions. For example:

#0  r600_translate_colorformat
#1  evergreen_init_color_surface
#2  evergreen_set_framebuffer_state
#3  util_blitter_custom_depth_stencil
#4  r600_blit_decompress_depth
#5  r600_texture_transfer_map

Am I allowed to now pass information from transfer_map/unmap all the
way down to the 4 functions I mentioned through all these layers as

Re: [Mesa-dev] [PATCH 1/4] i965: Rework opt_vector_float() control flow.

2016-04-18 Thread Iago Toral
Patches 1-3 are:

Reviewed-by: Iago Toral Quiroga 

On Sun, 2016-04-17 at 23:14 -0700, Kenneth Graunke wrote:
> This reworks opt_vector_float() so that there's only one place that
> flushes out any accumulated state and emits a VF.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 63 
> +++---
>  1 file changed, 35 insertions(+), 28 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 6433fc5..fa0d80d 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -385,48 +385,55 @@ vec4_visitor::opt_vector_float()
> unsigned writemask = 0;
>  
> foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
> -  if (last_reg != inst->dst.nr ||
> +  int vf = -1;
> +
> +  /* Look for unconditional MOVs from an immediate with a partial
> +   * writemask.  See if the immediate can be represented as a VF.
> +   */
> +  if (inst->opcode == BRW_OPCODE_MOV &&
> +  inst->src[0].file == IMM &&
> +  inst->predicate == BRW_PREDICATE_NONE &&
> +  inst->dst.writemask != WRITEMASK_XYZW) {
> + vf = brw_float_to_vf(inst->src[0].f);
> +  }
> +
> +  /* If this wasn't a MOV, or the value was non-representable, or
> +   * the destination register doesn't match, then this breaks our
> +   * sequence.  Combine anything we've accumulated so far.
> +   */
> +  if (vf == -1 ||
> +  last_reg != inst->dst.nr ||
>last_reg_offset != inst->dst.reg_offset ||
>last_reg_file != inst->dst.file) {
>   progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count,
> writemask);
>   inst_count = 0;
> + last_reg = -1;
>   writemask = 0;
> - last_reg = inst->dst.nr;
> - last_reg_offset = inst->dst.reg_offset;
> - last_reg_file = inst->dst.file;
>  
>   for (int i = 0; i < 4; i++) {
>  imm[i] = 0;
>   }
>}
>  
> -  if (inst->opcode != BRW_OPCODE_MOV ||
> -  inst->dst.writemask == WRITEMASK_XYZW ||
> -  inst->src[0].file != IMM ||
> -  inst->predicate != BRW_PREDICATE_NONE) {
> - progress |= vectorize_mov(block, inst, imm, imm_inst, inst_count,
> -   writemask);
> - inst_count = 0;
> - last_reg = -1;
> - continue;
> -  }
> +  /* Record this instruction's value (if it was representable). */
> +  if (vf != -1) {
> + if ((inst->dst.writemask & WRITEMASK_X) != 0)
> +imm[0] = vf;
> + if ((inst->dst.writemask & WRITEMASK_Y) != 0)
> +imm[1] = vf;
> + if ((inst->dst.writemask & WRITEMASK_Z) != 0)
> +imm[2] = vf;
> + if ((inst->dst.writemask & WRITEMASK_W) != 0)
> +imm[3] = vf;
>  
> -  int vf = brw_float_to_vf(inst->src[0].f);
> -  if (vf == -1)
> - continue;
> + writemask |= inst->dst.writemask;
> + imm_inst[inst_count++] = inst;
>  
> -  if ((inst->dst.writemask & WRITEMASK_X) != 0)
> - imm[0] = vf;
> -  if ((inst->dst.writemask & WRITEMASK_Y) != 0)
> - imm[1] = vf;
> -  if ((inst->dst.writemask & WRITEMASK_Z) != 0)
> - imm[2] = vf;
> -  if ((inst->dst.writemask & WRITEMASK_W) != 0)
> - imm[3] = vf;
> -
> -  writemask |= inst->dst.writemask;
> -  imm_inst[inst_count++] = inst;
> + last_reg = inst->dst.nr;
> + last_reg_offset = inst->dst.reg_offset;
> + last_reg_file = inst->dst.file;
> +  }
> }
>  
> if (progress)




Re: [Mesa-dev] [PATCH 4/4] i965: Properly handle integer types in opt_vector_float().

2016-04-18 Thread Iago Toral
On Sun, 2016-04-17 at 23:14 -0700, Kenneth Graunke wrote:
> Previously, opt_vector_float() always interpreted MOV sources as
> floating point, and always created a MOV with a F-type destination.
> 
> This meant that we could mess up sequences of integer loads, such as:
> 
>mov vgrf6.0.x:D, 0D
>mov vgrf6.0.y:D, 1D
>mov vgrf6.0.z:D, 2D
>mov vgrf6.0.w:D, 3D
> 
> Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
> 
>mov vgrf6.0:F, [0F, 0F, 0F, 0F]
> 
> which is clearly wrong.  We can properly handle this by converting
> integer values to float (rather than bitcasting), and emitting a type
> converting MOV:
> 
>mov vgrf6.0:D, [0F, 1F, 2F, 3F]
> 
> To do this, first see if the integer values (converted to float)
> are representable.  If so, we use a D-type MOV.  If not, we then try
> the floating point values and an F-type MOV.  We make zero not impose
> type restrictions.  This is important because 0D would imply a D-type
> MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
> where we want to use an F-type MOV.
> 
> Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend.  This
> recently became visible due to changes in opt_vector_float() which
> made it optimize more cases, but it was a pre-existing bug.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 24 
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 12c3c66..2bdcf1f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -361,9 +361,11 @@ vec4_visitor::opt_vector_float()
> int inst_count = 0;
> vec4_instruction *imm_inst[4];
> unsigned writemask = 0;
> +   enum brw_reg_type dest_type = BRW_REGISTER_TYPE_F;
>  
> foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
>int vf = -1;
> +  enum brw_reg_type need_type;
>  
>/* Look for unconditional MOVs from an immediate with a partial
> * writemask.  Skip type-conversion MOVs other than integer 0,
> @@ -375,14 +377,26 @@ vec4_visitor::opt_vector_float()
>inst->predicate == BRW_PREDICATE_NONE &&
>inst->dst.writemask != WRITEMASK_XYZW &&
>(inst->src[0].type == inst->dst.type || inst->src[0].d == 0)) {
> - vf = brw_float_to_vf(inst->src[0].f);
> + vf = brw_float_to_vf(inst->src[0].d);
> + need_type = BRW_REGISTER_TYPE_D;
> +
> + if (vf == -1) {
> +vf = brw_float_to_vf(inst->src[0].f);
> +need_type = BRW_REGISTER_TYPE_F;
> + }

If we are packing actual float values (not integers), doesn't this mean
that we re-interpret them as integers and convert the re-interpreted
integer value to float? If the result of that sequence of operations is
representable it seems that we would just use a D-MOV from a float that
no longer represents the original value, right?

Example:

.f = 5.27 (0x40a8a3d7)
.d = 1084793815 (0x40a8a3d7)

so we would do brw_float_to_vf(1084793815.0) instead of
brw_float_to_vf(5.27), which does not look right.
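
To make the distinction concrete, here is a minimal, self-contained sketch.
It is not Mesa code: the helper names bits_as_int(), int_converted_to_float()
and check_example() are made up for illustration, and it assumes a 32-bit
IEEE-754 float. It only reproduces the numbers in the example above:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 32-bit float as a signed integer. This models
// reading the .d member of an immediate that was stored through .f.
std::int32_t bits_as_int(float f)
{
   static_assert(sizeof(float) == sizeof(std::int32_t), "expects 32-bit float");
   std::int32_t d;
   std::memcpy(&d, &f, sizeof(d));
   return d;
}

// Passing an integer to a function that takes a float performs a value
// conversion, not a bitcast.
float int_converted_to_float(std::int32_t d)
{
   return static_cast<float>(d);
}

// The numbers from the example: 5.27f has the bit pattern 0x40a8a3d7, and
// converting that integer to float gives a value near 1.08e9, not 5.27.
void check_example()
{
   const std::int32_t d = bits_as_int(5.27f);
   assert(d == 0x40a8a3d7);
   assert(d == 1084793815);
   assert(int_converted_to_float(d) != 5.27f);
}
```

Calling check_example() passes on such a platform, which is the mismatch
between the two interpretations that the question above is about.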

> + /* Zero can be loaded as any type; don't impose a restriction. */
> + if (inst->src[0].d == 0)
> +need_type = dest_type;
>}
>  
>/* If this wasn't a MOV, or the value was non-representable, or
> -   * the destination register doesn't match, then this breaks our
> -   * sequence.  Combine anything we've accumulated so far.
> +   * the destination register doesn't match, or we have to switch
> +   * destination types, then this breaks our sequence.  Combine
> +   * anything we've accumulated so far.
> */
>if (vf == -1 ||
> +  dest_type != need_type ||
>last_reg != inst->dst.nr ||
>last_reg_offset != inst->dst.reg_offset ||
>last_reg_file != inst->dst.file) {
> @@ -391,7 +405,7 @@ vec4_visitor::opt_vector_float()
>  unsigned vf;
>  memcpy(&vf, imm, sizeof(vf));
>  vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf));
> -mov->dst.type = BRW_REGISTER_TYPE_F;
> +mov->dst.type = dest_type;
>  mov->dst.writemask = writemask;
>  inst->insert_before(block, mov);
>  
> @@ -405,6 +419,7 @@ vec4_visitor::opt_vector_float()
>   inst_count = 0;
>   last_reg = -1;
>   writemask = 0;
> + dest_type = BRW_REGISTER_TYPE_F;
>  
>   for (int i = 0; i < 4; i++) {
>  imm[i] = 0;
> @@ -428,6 +443,7 @@ vec4_visitor::opt_vector_float()
>   last_reg = inst->dst.nr;
>   last_reg_offset = inst->dst.reg_offset;
>   last_reg_file = inst->dst.file;
> + dest_type = need_type;
>}
> }
>  
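
For readers following the logic, the selection rule from the commit message
can be sketched in isolation. This is not the Mesa implementation:
to_vf_sketch() is a hypothetical stand-in for brw_float_to_vf(), written
under the assumption that a VF immediate is an 8-bit value with 1 sign bit,
3 exponent bits (bias 3) and 4 mantissa bits. MovType, MovChoice,
choose_mov() and check_examples() are likewise made-up names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical VF encoder (assumed layout: sign:1, exponent:3 bias 3,
// mantissa:4). Returns -1 when f cannot be encoded exactly.
int to_vf_sketch(float f)
{
   std::uint32_t bits;
   std::memcpy(&bits, &f, sizeof(bits));

   if (f == 0.0f)
      return static_cast<int>(bits >> 24);   // 0x00 for +0.0f, 0x80 for -0.0f

   const std::uint32_t sign = bits >> 31;
   const int exponent = static_cast<int>((bits >> 23) & 0xff) - 127 + 3;
   const std::uint32_t mantissa = bits & 0x7fffff;

   if ((mantissa & 0x7ffff) != 0)        // needs more than 4 mantissa bits
      return -1;
   if (exponent < 0 || exponent > 7)     // outside the 3-bit exponent range
      return -1;

   const std::uint32_t vf = (sign << 7) |
                            (static_cast<std::uint32_t>(exponent) << 4) |
                            (mantissa >> 19);
   if ((vf & 0x7f) == 0)                 // that encoding is reserved for zero
      return -1;
   return static_cast<int>(vf);
}

enum class MovType { F, D };

struct MovChoice {
   int vf;          // encoded byte, or -1 if not representable
   MovType type;    // destination type the MOV would need
};

// Same order as the quoted hunk: integer value first, then the float view.
MovChoice choose_mov(std::int32_t d, MovType current_type)
{
   float f;
   std::memcpy(&f, &d, sizeof(f));       // the same 32 bits viewed as a float

   int vf = to_vf_sketch(static_cast<float>(d));
   MovType need_type = MovType::D;

   if (vf == -1) {
      vf = to_vf_sketch(f);
      need_type = MovType::F;
   }

   // Zero can be loaded as any type, so it keeps the running type.
   if (d == 0)
      need_type = current_type;

   return MovChoice{vf, need_type};
}

void check_examples()
{
   assert(to_vf_sketch(1.0f) == 0x30);
   assert(choose_mov(3, MovType::F).type == MovType::D);
   assert(choose_mov(0x3f800000, MovType::F).type == MovType::F);
}
```

Under these assumptions, choose_mov(3, MovType::F) selects a D-type MOV,
while the bit pattern 0x3f800000 only encodes through the float path and
selects an F-type MOV.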



Re: [Mesa-dev] [PATCH 1/9] gallivm: Use LLVMPrintValueToString where available.

2016-04-18 Thread Roland Scheidegger
Series looks great to me. Thanks for the cleanup.

Reviewed-by: Roland Scheidegger 

Am 18.04.2016 um 11:14 schrieb Jose Fonseca:
> And llvm::raw_string_ostream where not (LLVM 3.3).
> 
> Thereby eliminating yet another dependency on unstable LLVM interfaces.
> 
> As a bonus this also gets LLVM IR on OutputDebugMessageA on MSVC (which
> was disabled, probably due to C++ issues.)
> 
> Tested `lp_test_arit -v -v` on LLVM 3.3, 3.4 and 3.8.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 45 
> ++
>  1 file changed, 10 insertions(+), 35 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp 
> b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
> index 11e9f92..a8c3899 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
> @@ -64,38 +64,6 @@ lp_check_alignment(const void *ptr, unsigned alignment)
> return ((uintptr_t)ptr & (alignment - 1)) == 0;
>  }
>  
> -#if (defined(PIPE_OS_WINDOWS) && !defined(PIPE_CC_MSVC)) || 
> defined(PIPE_OS_EMBEDDED)
> -
> -class raw_debug_ostream :
> -   public llvm::raw_ostream
> -{
> -private:
> -   uint64_t pos;
> -
> -public:
> -   raw_debug_ostream() : pos(0) { }
> -
> -   void write_impl(const char *Ptr, size_t Size);
> -
> -   uint64_t current_pos() const { return pos; }
> -   size_t preferred_buffer_size() const { return 512; }
> -};
> -
> -
> -void
> -raw_debug_ostream::write_impl(const char *Ptr, size_t Size)
> -{
> -   if (Size > 0) {
> -  char *lastPtr = (char *)&Ptr[Size];
> -  char last = *lastPtr;
> -  *lastPtr = 0;
> -  _debug_printf("%*s", Size, Ptr);
> -  *lastPtr = last;
> -  pos += Size;
> -   }
> -}
> -
> -#endif
>  
>  extern "C" const char *
>  lp_get_module_id(LLVMModuleRef module)
> @@ -110,10 +78,17 @@ lp_get_module_id(LLVMModuleRef module)
>  extern "C" void
>  lp_debug_dump_value(LLVMValueRef value)
>  {
> -#if (defined(PIPE_OS_WINDOWS) && !defined(PIPE_CC_MSVC)) || 
> defined(PIPE_OS_EMBEDDED)
> -   raw_debug_ostream os;
> +#if HAVE_LLVM >= 0x0304
> +   char *str = LLVMPrintValueToString(value);
> +   if (str) {
> +  os_log_message(str);
> +  LLVMDisposeMessage(str);
> +   }
> +#elif defined(PIPE_OS_WINDOWS) || defined(PIPE_OS_EMBEDDED)
> +   std::string str;
> +   llvm::raw_string_ostream os(str);
> llvm::unwrap(value)->print(os);
> -   os.flush();
> +   os_log_message(str.c_str());
>  #else
> LLVMDumpValue(value);
>  #endif
> 



Re: [Mesa-dev] [PATCH 2/9] gallivm: Use LLVMSetTarget.

2016-04-18 Thread Emil Velikov
On 18 April 2016 at 14:16, Jose Fonseca  wrote:
> On 18/04/16 13:27, Emil Velikov wrote:
>>
>> Hi Jose,
>>
>> On 18 April 2016 at 10:14, Jose Fonseca  wrote:
>>>
>>> Instead of LLVM C++ interfaces.
>>> ---
>>>   src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 8 +---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> index c1e262b..37e2f08 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
>>> @@ -519,9 +519,11 @@
>>> lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
>>>  /*
>>>   * MCJIT works on Windows, but currently only through ELF object
>>> format.
>>>   */
>>> -   std::string targetTriple = llvm::sys::getProcessTriple();
>>> -   targetTriple.append("-elf");
>>> -   unwrap(M)->setTargetTriple(targetTriple);
>>> +#  ifdef _WIN64
>>> +   LLVMSetTarget(M, "x86_64-pc-win32-elf");
>>> +#  else
>>> +   LLVMSetTarget(M, "i686-pc-win32-elf");
>>> +#  endif
>>
>>
>> I've noticed that you're using LLVM_HOST_TRIPLE in patch 7/9. Wouldn't
>> it be better to use it here as well ?
>>
>> +   LLVMSetTarget(M, LLVM_HOST_TRIPLE "-elf");
>
>
> Thanks for taking a look.
>
> It's a good remark.
>
> Surprisingly, LLVM uses different LLVM_HOST_TRIPLE values for MinGW and MSVC:
>
> $ grep LLVM_HOST_TRIPLE */llvm-*/include
> mingw32/llvm-3.3.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "i686-pc-mingw32"
> mingw32/llvm-3.4.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "i686-pc-mingw32"
> mingw64/llvm-3.3.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "x86_64-w64-mingw32"
> mingw64/llvm-3.4.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "x86_64-w64-mingw32"
> msvc32/llvm-3.3.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "i686-pc-win32"
> msvc32/llvm-3.4.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "i686-pc-win32"
> msvc64/llvm-3.3.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "x86_64-pc-win32"
> msvc64/llvm-3.4.1/include/llvm/Config/config.h:#define LLVM_HOST_TRIPLE
> "x86_64-pc-win32"
>
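
For reference, a small sketch of how the two approaches discussed above
differ. It is only an illustration: EXAMPLE_HOST_TRIPLE is a hypothetical
stand-in for LLVM_HOST_TRIPLE (set here to the 32-bit MinGW value from the
grep output), and hardcoded_triple() takes a bool where the patch uses
#ifdef _WIN64 so that both branches can be exercised.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for LLVM_HOST_TRIPLE; the value is copied from the MinGW 32-bit
// line of the grep output and is an assumption of this sketch.
#define EXAMPLE_HOST_TRIPLE "i686-pc-mingw32"

// Adjacent string literals are concatenated at compile time, which is what
// makes the LLVM_HOST_TRIPLE "-elf" form work.
const char *suggested_triple()
{
   return EXAMPLE_HOST_TRIPLE "-elf";
}

// The strings hard-coded in the patch, selected by word size.
const char *hardcoded_triple(bool is_win64)
{
   return is_win64 ? "x86_64-pc-win32-elf" : "i686-pc-win32-elf";
}

// On a MinGW build the two forms would not produce the same string.
void check_triples()
{
   assert(std::strcmp(suggested_triple(), "i686-pc-mingw32-elf") == 0);
   assert(std::strcmp(suggested_triple(), hardcoded_triple(false)) != 0);
}
```

Whether that difference matters to MCJIT is not stated in the quoted text;
the sketch only shows that the resulting strings differ.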

(Very mildly) related:
There's also the missing header for the 'other' word-size configs.
llvm/Config/llvm-config.h mentions both the -32 and -64 headers, but
only one is shipped. I'm guessing that, just like above, it's either an
intentional design decision or a bug ;-)


>> That aside I'm really glad to see mesa (modulo swr) no longer using
>> the unstable LLVM C++ API.
>
>
> Still not quite there, but yes that's indeed the hope.
>
>> Perhaps at some point we could port these
>>
>> to normal C and make gallivm 'C++ free'  ;-)
>
>
> There's not much use of C++ left indeed, and at some point we might as well
> remove it completely.
>
Glad we're on the same page.


> For my part, my personal goal is to eliminate src/gallium/auxiliary/rtasm in
> the medium term.  And I think we should remove src/mesa/x86/rtasm too.
>
Sounds like a plan. This might cause some minor perf degradation, as
we nuke the translate_sse2 path, but I guess one can benchmark that in
due time. On the other hand, if we mandate LLVM for those paths we
might end up 'forcing' it on drivers which don't use it yet - nouveau,
ilo, freedreno, vc4, virgl...

Hopefully I didn't open a (the) can of worms with that last statement.

-Emil

