[Mesa-dev] [PATCH] panfrost: Use tiler fast path (performance boost)

2019-02-20 Thread Alyssa Rosenzweig
For reasons that are still unclear (speculation included in the comment
added in this patch), the tiler? metadata has a fast path that we were
not enabling; there looks to be a possible time/memory tradeoff, but the
details remain unclear.

Regardless, this patch improves performance dramatically. Particular
wins are for geometry-heavy scenes. For instance, glmark2-es2's
Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped
from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious
when vsync is disabled, as in glmark2-es2-wayland.

With this patch, on GLES 2.0 samples not involving FBOs, it appears
performance is converging with (and sometimes surpassing) the blob.

Signed-off-by: Alyssa Rosenzweig 
---
 src/gallium/drivers/panfrost/pan_context.c | 42 +++---
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index 822b5a0dfef..2996a9c1e09 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -256,7 +256,28 @@ static struct bifrost_framebuffer
 panfrost_emit_mfbd(struct panfrost_context *ctx)
 {
 struct bifrost_framebuffer framebuffer = {
-.tiler_meta = 0xf0c600,
+/* It is not yet clear what tiler_meta means or how it's
+ * calculated, but we can tell the lower 32-bits are a
+ * (monotonically increasing?) function of tile count and
+ * geometry complexity; I suspect it defines a memory size of
+ * some kind? for the tiler. It's really unclear at the
+ * moment... but to add to the confusion, the hardware is happy
+ * enough to accept a zero in this field, so we don't even have
+ * to worry about it right now.
+ *
+ * The byte (just after the 32-bit mark) is much more
+ * interesting. The higher nibble I've only ever seen as 0xF,
+ * but the lower one I've seen as 0x0 or 0xF, and it's not
+ * obvious what the difference is. But what -is- obvious is
+ * that when the lower nibble is zero, performance is severely
+ * degraded compared to when the lower nibble is set.
+ * Evidently, that nibble enables some sort of fast path,
+ * perhaps relating to caching or tile flush? Regardless, at
+ * this point there's no clear reason not to set it, aside from
+ * substantially increased memory requirements (of the misc_0
+ * buffer) */
+
+.tiler_meta = ((uint64_t) 0xff << 32) | 0x0,
 
 .width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width),
 .height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height),
@@ -271,10 +292,23 @@ panfrost_emit_mfbd(struct panfrost_context *ctx)
 
 .unknown2 = 0x1f,
 
-/* Presumably corresponds to unknown_address_X of SFBD */
+/* Corresponds to unknown_address_X of SFBD */
 .scratchpad = ctx->scratchpad.gpu,
 .tiler_scratch_start  = ctx->misc_0.gpu,
-.tiler_scratch_middle = ctx->misc_0.gpu + 
/*ctx->misc_0.size*/40960, /* Size depends on the size of the framebuffer and 
the number of vertices */
+
+/* The constant added here is, like the lower word of
+ * tiler_meta, (loosely) another product of framebuffer size
+ * and geometry complexity. It must be sufficiently large for
+ * the tiler_meta fast path to work; if it's too small, there
+ * will be DATA_INVALID_FAULTs. Conversely, it must be less
+ * than the total size of misc_0, or else there's no room. It's
+ * possible this constant configures a partition between two
+ * parts of misc_0? We haven't investigated the functionality,
+ * as these buffers are internally used by the hardware
+ * (presumably by the tiler) but not seemingly touched by the 
driver
+ */
+
+.tiler_scratch_middle = ctx->misc_0.gpu + 0xf,
 
 .tiler_heap_start = ctx->tiler_heap.gpu,
 .tiler_heap_end = ctx->tiler_heap.gpu + ctx->tiler_heap.size,
@@ -2696,7 +2730,7 @@ panfrost_setup_hardware(struct panfrost_context *ctx)
 screen->driver->allocate_slab(screen, >varying_mem, 16384, false, 
0, 0, 0);
 screen->driver->allocate_slab(screen, >shaders, 4096, true, 
PAN_ALLOCATE_EXECUTE, 0, 0);
 screen->driver->allocate_slab(screen, >tiler_heap, 32768, false, 
PAN_ALLOCATE_GROWABLE, 1, 128);
-screen->driver->allocate_slab(screen, >misc_0, 128, false, 
PAN_ALLOCATE_GROWABLE, 1, 128);
+screen->driver->allocate_slab(screen, >misc_0, 128*128, 

Re: [Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics

2019-02-20 Thread Ilia Mirkin
I don't think that's right... this is used in e.g.

   case TGSI_OPCODE_RESQ:
  if (tgsi_is_bindless_image_file(fullinst->Src[0].Register.File))
 info->uses_bindless_images = true;
  break;

And if you call RESQ with a src0 that is not IMAGE or BUFFER, then
that's a bindless reference.

I think the problem is that this is interacting poorly with the load
constbuf stuff -- probably needs a bit of subtlety there.

  -ilia

On Thu, Feb 21, 2019 at 12:03 AM Marek Olšák  wrote:
>
> From: Marek Olšák 
>
> This might have decreased performance for radeonsi/tgsi, because most
> most shaders claimed they used bindless.
>
> Cc: 18.3 19.0 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h 
> b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> index 580c73a2814..51d85af4fcb 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> @@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens,
>  void
>  tgsi_scan_tess_ctrl(const struct tgsi_token *tokens,
>  const struct tgsi_shader_info *info,
>  struct tgsi_tessctrl_info *out);
>
>  static inline bool
>  tgsi_is_bindless_image_file(unsigned file)
>  {
> return file != TGSI_FILE_IMAGE &&
>file != TGSI_FILE_MEMORY &&
> -  file != TGSI_FILE_BUFFER;
> +  file != TGSI_FILE_BUFFER &&
> +  file != TGSI_FILE_CONSTBUF &&
> +  file != TGSI_FILE_HW_ATOMIC;
>  }
>
>  #ifdef __cplusplus
>  } // extern "C"
>  #endif
>
>  #endif /* TGSI_SCAN_H */
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] tgsi: don't set tgsi_info::uses_bindless_images for constbufs and hw atomics

2019-02-20 Thread Marek Olšák
From: Marek Olšák 

This might have decreased performance for radeonsi/tgsi, because most
most shaders claimed they used bindless.

Cc: 18.3 19.0 
---
 src/gallium/auxiliary/tgsi/tgsi_scan.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h 
b/src/gallium/auxiliary/tgsi/tgsi_scan.h
index 580c73a2814..51d85af4fcb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
@@ -214,18 +214,20 @@ tgsi_scan_arrays(const struct tgsi_token *tokens,
 void
 tgsi_scan_tess_ctrl(const struct tgsi_token *tokens,
 const struct tgsi_shader_info *info,
 struct tgsi_tessctrl_info *out);
 
 static inline bool
 tgsi_is_bindless_image_file(unsigned file)
 {
return file != TGSI_FILE_IMAGE &&
   file != TGSI_FILE_MEMORY &&
-  file != TGSI_FILE_BUFFER;
+  file != TGSI_FILE_BUFFER &&
+  file != TGSI_FILE_CONSTBUF &&
+  file != TGSI_FILE_HW_ATOMIC;
 }
 
 #ifdef __cplusplus
 } // extern "C"
 #endif
 
 #endif /* TGSI_SCAN_H */
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 104376] [ILK] Browser crashes while switching between fullscreen and windowed video on youtube.

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104376

Anjala  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |dan...@fooishbar.org
   |org |
Product|Mesa|Spam
 OS|Linux (All) |Windows (All)
URL||https://www.youtube.com/wat
   ||ch?v=2FOXR16mLow=PL2-d
   ||afEMk2A4ut2pyv0fSIXqOzXtBGk
   ||Lj
  Component|Other   |Two
Version|17.3|unspecified
 QA Contact|mesa-dev@lists.freedesktop. |
   |org |

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Nanley Chery
On Wed, Feb 20, 2019 at 03:08:29PM +0200, Eleni Maria Stea wrote:
> Calculating the scissor rectangle fields with the y flipped (0 on top)
> can generate negative values that will cause assertion failure later on
> as the scissor fields are all unsigned. We must clamp the bbox values
> again to make sure they don't exceed the fb_height. Also fixed a
> calculation error.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
> 
> v2:
>- I initially clamped the values inside the if (Y is flipped) case
>and I made a mistake in the calculation: the clamp of the bbox[2] should
>be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
>shouldn't have changed the ScissorRectangleYMax calculation. As the
>fixed code is equivalent with using CLAMP instead of MAX2 at the top of
>the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
>clear, I replaced it. (Nanley Chery)
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index dcdfb3c9292..dd695218fea 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2445,8 +2445,8 @@ set_scissor_bits(const struct gl_context *ctx, int i,
>  
> bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
> bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
> -   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
> -   bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
> +   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
> +   bbox[3] = CLAMP(bbox[2] + ctx->ViewportArray[i].Height, 0, fb_height);

The API guarantees that viewport height is positive, so we can leave the
calculation of bbox[3] unmodified.

> _mesa_intersect_scissor_bounding_box(ctx, i, bbox);
>  
> if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
> -- 
> 2.20.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba

2019-02-20 Thread Liu, Leo
Reviewed-by: Leo Liu 

On 2019-02-15 4:19 p.m., Zhu, James wrote:
> Fixes: 9364d66cb7f7 (Add video compositor compute shader render)
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
> Problem 1,4: they are caused by imcomplete blend comute shader
> implementation. So Reverts rgba back to frament shader.
>
> Signed-off-by: James Zhu 
> ---
>   src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
>   1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
> b/src/gallium/auxiliary/vl/vl_compositor.c
> index 8731ad9..a8f3620 100644
> --- a/src/gallium/auxiliary/vl/vl_compositor.c
> +++ b/src/gallium/auxiliary/vl/vl_compositor.c
> @@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
>debug_printf("Unable to create YCbCr-to-RGB weave fragment 
> shader.\n");
>return false;
> }
> +   }
>   
> -  c->fs_rgba = create_frag_shader_rgba(c);
> -  if (!c->fs_rgba) {
> - debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
> - return false;
> -  }
> +   c->fs_rgba = create_frag_shader_rgba(c);
> +   if (!c->fs_rgba) {
> +  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
> +  return false;
>  }
>   
>  return true;
> @@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
>  } else {
> c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
> c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
> -  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
>  }
> +   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
>   }
>   
>   static bool
> @@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state 
> *s,
>  assert(layer < VL_COMPOSITOR_MAX_LAYERS);
>   
>  s->used_layers |= 1 << layer;
> -   if (c->pipe_compute_supported)
> -  s->layers[layer].cs = c->cs_rgba;
> -   else
> -  s->layers[layer].fs = c->fs_rgba;
> +   s->layers[layer].fs = c->fs_rgba;
>  s->layers[layer].samplers[0] = c->sampler_linear;
>  s->layers[layer].samplers[1] = NULL;
>  s->layers[layer].samplers[2] = NULL;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 4/4] radeonsi: use SDMA for uploading data through const_uploader

2019-02-20 Thread Dieter Nützel

For the _rev2_ version (Patchwork Mesa) series is

Tested-by: Dieter Nützel 

on Polaris 20

UH+UV working flawlessly, now.
No 'measurable' speed decrease. - GREAT!
Blender, FreeCAD, glmark2 all fine.

But I had to have rebased part 4 (see attachment).

Dieter

Am 07.02.2019 02:22, schrieb Marek Olšák:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_buffer.c | 56 ++--
 src/gallium/drivers/radeonsi/si_dma_cs.c | 19 
 src/gallium/drivers/radeonsi/si_gfx_cs.c | 42 +++---
 src/gallium/drivers/radeonsi/si_pipe.c   | 23 ++
 src/gallium/drivers/radeonsi/si_pipe.h   | 17 +++
 5 files changed, 131 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_buffer.c
b/src/gallium/drivers/radeonsi/si_buffer.c
index c01118ce96a..3f8db7cf4f0 100644
--- a/src/gallium/drivers/radeonsi/si_buffer.c
+++ b/src/gallium/drivers/radeonsi/si_buffer.c
@@ -433,21 +433,29 @@ static void *si_buffer_transfer_map(struct
pipe_context *ctx,

if (si_invalidate_buffer(sctx, buf)) {
/* At this point, the buffer is always idle. */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
} else {
/* Fall back to a temporary buffer. */
usage |= PIPE_TRANSFER_DISCARD_RANGE;
}
}

-   if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
+   if (usage & PIPE_TRANSFER_FLUSH_EXPLICIT &&
+	buf->b.b.flags & SI_RESOURCE_FLAG_UPLOAD_FLUSH_EXPLICIT_VIA_SDMA) 
{

+   usage &= ~(PIPE_TRANSFER_UNSYNCHRONIZED |
+  PIPE_TRANSFER_PERSISTENT);
+   usage |= PIPE_TRANSFER_DISCARD_RANGE;
+   force_discard_range = true;
+   }
+
+   if (usage & PIPE_TRANSFER_DISCARD_RANGE &&
((!(usage & (PIPE_TRANSFER_UNSYNCHRONIZED |
 PIPE_TRANSFER_PERSISTENT))) ||
 (buf->flags & RADEON_FLAG_SPARSE))) {
assert(usage & PIPE_TRANSFER_WRITE);

/* Check if mapping this buffer would cause waiting for the GPU.
 */
if (buf->flags & RADEON_FLAG_SPARSE ||
force_discard_range ||
 		si_rings_is_buffer_referenced(sctx, buf->buf, 
RADEON_USAGE_READWRITE) ||

@@ -514,32 +522,72 @@ static void *si_buffer_transfer_map(struct
pipe_context *ctx,
data += box->x;

return si_buffer_get_transfer(ctx, resource, usage, box,
ptransfer, data, NULL, 0);
 }

 static void si_buffer_do_flush_region(struct pipe_context *ctx,
  struct pipe_transfer *transfer,
  const struct pipe_box *box)
 {
+   struct si_context *sctx = (struct si_context*)ctx;
struct si_transfer *stransfer = (struct si_transfer*)transfer;
struct si_resource *buf = si_resource(transfer->resource);

if (stransfer->staging) {
unsigned src_offset = stransfer->offset +
  transfer->box.x % SI_MAP_BUFFER_ALIGNMENT 
+
  (box->x - transfer->box.x);

+		if (buf->b.b.flags & 
SI_RESOURCE_FLAG_UPLOAD_FLUSH_EXPLICIT_VIA_SDMA) {

+   /* This should be true for all uploaders. */
+   assert(transfer->box.x == 0);
+
+   /* Find a previous upload and extend its range. The last
+* upload is likely to be at the end of the list.
+*/
+   for (int i = sctx->num_sdma_uploads - 1; i >= 0; i--) {
+   struct si_sdma_upload *up = 
>sdma_uploads[i];
+
+   if (up->dst != buf)
+   continue;
+
+   assert(up->src == stransfer->staging);
+   assert(box->x > up->dst_offset);
+   up->size = box->x + box->width - up->dst_offset;
+   return;
+   }
+
+   /* Enlarge the array if it's full. */
+   if (sctx->num_sdma_uploads == sctx->max_sdma_uploads) {
+   unsigned size;
+
+   sctx->max_sdma_uploads += 4;
+   size = sctx->max_sdma_uploads * 
sizeof(sctx->sdma_uploads[0]);
+   sctx->sdma_uploads = 
realloc(sctx->sdma_uploads, size);
+   }
+
+   /* Add a new upload. */
+   struct si_sdma_upload *up =
+   >sdma_uploads[sctx->num_sdma_uploads++];
+   up->dst = up->src = NULL;
+   si_resource_reference(>dst, buf);
+   si_resource_reference(>src, stransfer->staging);
+   

[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109698

Bug ID: 109698
   Summary: dri.pc contents invalid when built with meson
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: Linux (All)
Status: NEW
  Severity: critical
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: jpwhit...@kde.org
QA Contact: mesa-dev@lists.freedesktop.org

When debian packaging built mesa 18.3.2-1 with meson build system using the
following configuration parameters:

-Ddri-drivers-path=/usr/lib/$(DEB_HOST_MULTIARCH)/dri
-Ddri-search-path='/usr/lib/$(DEB_HOST_MULTIARCH)/dri:\{ORIGIN}/dri:/usr/lib/dri'

dri.pc got the following incorrect dridriverdir:

dridriverdir=/usr//usr/lib/x86_64-linux-gnu/dri

which caused xorg-server built against it to try to dlopen the glx libraries
from the wrong path since xorg-xserver's configure uses pkg-config
--variable=dridriverdir dri to get the path to dlopen glx libraries from.

Removing the two -Ddri-*-path configuration parameters from debian/rules seems
to have fixed the problem which has also been reported here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922807

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] A few NIR compile time optimisations

2019-02-20 Thread Timothy Arceri

On 21/2/19 2:32 am, Marek Olšák wrote:
On Wed, Feb 20, 2019 at 2:31 AM Connor Abbott > wrote:




On Wed, Feb 20, 2019 at 4:29 AM Marek Olšák mailto:mar...@gmail.com>> wrote:

On Tue, Feb 19, 2019 at 7:57 PM Rob Clark mailto:robdcl...@gmail.com>> wrote:

On Tue, Feb 19, 2019 at 6:49 PM Marek Olšák
mailto:mar...@gmail.com>> wrote:
 >
 > st_link_shader takes 55% of CPU time with NIR, and 9%
with TGSI.
 >
 > nir_validate_shader 49%
 >
 > nir_validate_shader is overused. It doesn't make sense
even in debug builds.

tbh, I like nir_validate enabled when I do piglit/deqp
runs.. and I
already do separate release vs debug builds (which meson kinda
encourages by disallowing in-tree builds in the first place,
but is
totally possible with autotools builds).. I kinda think
benchmarking
debug build in the first place is a flawed process.

So I wouldn't profile/benchmark nir vs tgsi (or really
anything) with
debug builds.. I could get behind a separate compiler-debug
option
that changed the default NIR_VALIDATE setting and maybe some
other nir
debug features to get some features of debug builds without
enabling
the heavier nir debug features.  But other than making debug
options a
bit more fine grained, I wouldn't change things in response to a
flawed benchmarking process..


That's a harsh reaction to a relatively good benchmarking setup.
I use debugoptimized with -DDEBUG. My performance is probably
more affected by -fno-omit-frame-pointer than -DDEBUG.


Why would you enable DEBUG in a profiling build? AFAIK it's supposed
to enable validation more expensive than simple asserts, which you
probably don't want in a benchmarking setup, although from grepping
the source it's not used much. It might be a good idea to move
running NIR validation behind DEBUG instead of !NDEBUG. In the
meantime, unless you really want to benchmark things with assertions
enabled in which case NIR_VALIDATE=0 works (but why would you?), you
can set -Db_ndebug=false in Meson. I just saw today that they're
enabled by default in debugoptimized builds (whoops, better go fix
that and re-profile...).


Assertions and other debug code really don't cost much, so I don't see a 
reason to undef DEBUG.


Besides what Eero already mentioned I'm not sure that is always a safe 
assumption. This series itself moved some expensive checks (a bunch of 
recursive loops) into an assert, and there are other examples of this 
elsewhere in the compiler.




Marek

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [MR] iris: Add a new experimental Gallium driver for Intel Gen8+ GPUs

2019-02-20 Thread Kenneth Graunke
Hello,

I believe it's finally time to merge my out-of-tree Iris driver into
master.  While it's not finished, we're up to nearly 800 patches, and
so maintaining it out-of-tree is becoming a bit unruly...and it's in
solid enough shape that it could benefit from being in-tree.

I've opened a merge request here:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/283

For now, Iris will build and install alongside i965 - with i965
remaining the default driver.  Iris is still considered experimental.

More details below...

--Ken

What is Iris?
-

The new "Iris" driver prototype is essentially our Mesa i965 driver
reimagined and rebuilt from the ground up.  Our main goal was to make
the driver as efficient as possible, aiming for very low CPU overhead.
We also decided to drop support for legacy hardware and outdated kernel
drivers, freeing us up to design for the future without worrying about
impacting the past.  This let us ask, "if we were doing it all
again...what would we want it to look like?"

While designing the new driver, we drew a lot of inspiration from our
newer Vulkan driver, as well as a few other Gallium drivers.

Key Changes
---

* All new state upload code is dramatically more efficient.

  In CPU overhead microbenchmarks, Iris can issue on average
  3 times as many draw calls per second as our i965 driver.

* Supports only Broadwell and newer (Gen8+); drops support for
  the older Gen4-7.5 GPUs, as well as Braswell/Cherrytrail.

* Updated memory management takes full advantage of modern PPGTT
  support, allocating VMA in userspace for greater control, and
  allowing us to pre-bake state containing addresses.

* Requires a modern Linux kernel (v4.16+, could be v4.5 if needed).

* Leverages Mesa's "Gallium" driver framework, allowing us to share
  more code and development effort with the rest of the community.

While many pieces are new, much of the driver remains the same.
We continue using these components of our existing stack:

* The Mesa OpenGL API front-end
* The Mesa shader compiler (GLSL, SPIR-V, NIR, i965 Gen-assembly backend)
* The Intel Surface Layout (ISL) library
* The Intel Blit-on-Render-Pipe (BLORP) library
* The i965 auxiliary surface resolve tracker
* The i965 buffer manager (extended, but largely reused)

Q
---

Q: How do I build and use Iris?

Iris only supports the Meson and Android build systems (sorry, no
Autotools or SCons support).  To build Iris, use -Dgallium-drivers=iris
when running Meson.  -Ddri-drivers=i965 will also build i965.

i965 remains the default driver (assuming it's built).  You can try
out Iris on a per-application basis by setting an environment variable:

   $ export MESA_LOADER_DRIVER_OVERRIDE=iris

Please note that Iris requires a modern kernel, such as 4.19+.

Q: Should I use Iris yet?

Iris is not considered stable or ready for production use - it's still
very experimental.  However, it is largely feature complete (a few less
commonly-used features are missing), and is passing roughly 99% of the
Piglit test suite and OpenGL CTS.  We have not done extensive testing
or benchmarking of games and applications yet.

If you're feeling adventurous, feel free to try it out and report
issues at https://bugs.freedesktop.org/ under product Mesa and the
Drivers/Gallium/Iris component.  Or ask us on IRC (irc.freenode.net
channel #intel-3d).

Q: Are there any known issues?

* Currently, Glamor rendering, and shaders using gl_VertexID, may not
  work correctly.  This will be fixed shortly after merging.

* GPU reset support is poor - if it hangs, it will likely hang
  repeatedly until the kernel bans it.  Chris Wilson and I are
  working on a solution for this.

* There is a known kernel memory leak in some 4.18 kernels, which
  Iris can provoke.  The fix has already been backported, but a few
  distros didn't pick it up before moving on to 4.19+.  Please use
  the latest available kernel.

* Performance is expected to be slower than i965 for the moment, in
  many cases, but this will improve in the days to come.  There are
  still a number of stall avoidance paths which have not been hooked
  up yet, which can lead to the GPU idling.  Fast clears support is
  missing, and end-to-end color compression is not working yet.

  We expect it to be faster in the long run, and it already is in
  some particular cases.  For one customer, Iris is 18% faster than
  i965.  Your mileage may vary until the driver matures a bit more.

Q: What's with the commit history?

The commit history in this merge request is largely the actual
non-idealized history for bringing up the driver, rather than the
tidy patch series we're used to seeing.  I decided that preserving
the history mostly as-is might be of interest to other people who
are considering bringing up a new driver, to see how I did it.
Also, after a year of work, I wanted to be able to fairly attribute
people's work, and squashing it would not have done that justice.

During development, I 

[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #33 from Ian Romanick  ---
(In reply to andrii simiklit from comment #32)
> (In reply to andrii simiklit from comment #31)
> > (In reply to Mark Janes from comment #30)
> > > (In reply to Mark Janes from comment #28)
> > > >   
> > > > https://android-review.googlesource.com/c/platform/external/deqp/+/901894
> > > 
> > > Mesa still asserts with this fix.  I also tested Andrii's mesa patch with
> > > the dEQP fix and the test fails.
> > Do you mean the Chris's dEQP fix here, yes?
> > But looks like the mentioned Chris's dEQP fix considers some GL limitations
> > and doesn't affect the expectations of binding points.
> > 
> > Also the assertion is a separate issue, I created the piglit test for that:
> > https://patchwork.freedesktop.org/patch/286287/
> > But yes, we unable to fix the test fail without assertion because of crash
> > :-)
> > 
> > > 
> > > Since non-mesa drivers have found issues with the original dEQP change, I
> > > suspect there are still deeper problems with the test.
> > Possible they have the same issue with binding points mismatch after
> > optimizations by glsl compiler. 
> > They could try this fix/hack for deqp which is already helped us:
> > https://github.com/asimiklit/deqp/commit/
> > 91cff8150944213f6da533e281ee76d95ca00f21
> > If it helps them we will know that it is a common issue and it could
> > expedite this:
> > https://github.com/KhronosGroup/OpenGL-API/issues/46
> 
> So we have an answer from Piers Daniell:
>   "I believe all buffer binding points should be consumed, regardless
> whether 
>the array elements are used or not. This would be the behavior of least 
>surprise to the developer. I didn't see any language that would indicate 
>that unused elements should not be counted when assigning the element to 
>the buffer binding point."

I think this basically agrees with my earlier sentiment that we shouldn't trim
elements from the beginning of the array.  It's generally ok (and in some cases
expected) to trim elements from the end.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109615] 19.0.0_rc4 fails u_format_test on ppc64

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109615

--- Comment #5 from Matt Turner  ---
Right, the tests were enabled by

commit 54abd2e0844edf005510cb50b001cde8cf0366b7
Author: Eric Anholt 
Date:   Thu Jan 24 09:36:56 2019 -0800

gallium: Enable unit tests as actual meson unit tests.

These tests don't need swrast, so we can always enable them when
build_tests is set.  Most of them run to successful completion quickly
(.9s on my SKL).

and git tag --contains=54abd2e0844edf005510cb50b001cde8cf0366b7

19.0-branchpoint
mesa-19.0-rc1
mesa-19.0.0-rc1
mesa-19.0.0-rc2
mesa-19.0.0-rc3
mesa-19.0.0-rc4

so they weren't run in 18.3.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: remove non-ssa support from nir_copy_prop()

2019-02-20 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Wed, Feb 20, 2019 at 12:30 AM Timothy Arceri 
wrote:

> Even in a very basic shader this reduces the time spent in
> nir_copy_prop() by ~17%.
>
> No shader-db changes for radeonsi NIR or i965.
> ---
>  src/compiler/nir/nir_opt_copy_propagate.c | 41 +++
>  1 file changed, 5 insertions(+), 36 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_copy_propagate.c
> b/src/compiler/nir/nir_opt_copy_propagate.c
> index 534f127c0d3..10b81c0a684 100644
> --- a/src/compiler/nir/nir_opt_copy_propagate.c
> +++ b/src/compiler/nir/nir_opt_copy_propagate.c
> @@ -34,6 +34,8 @@
>
>  static bool is_move(nir_alu_instr *instr)
>  {
> +   assert(instr->src[0].src.is_ssa);
> +
> if (instr->op != nir_op_fmov &&
> instr->op != nir_op_imov)
>return false;
> @@ -46,9 +48,6 @@ static bool is_move(nir_alu_instr *instr)
> if (instr->src[0].abs || instr->src[0].negate)
>return false;
>
> -   if (!instr->src[0].src.is_ssa)
> -  return false;
> -
> return true;
>
>  }
> @@ -56,8 +55,7 @@ static bool is_move(nir_alu_instr *instr)
>  static bool is_vec(nir_alu_instr *instr)
>  {
> for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
> -  if (!instr->src[i].src.is_ssa)
> - return false;
> +  assert(instr->src[i].src.is_ssa);
>
>/* we handle modifiers in a separate pass */
>if (instr->src[i].abs || instr->src[i].negate)
> @@ -102,11 +100,7 @@ static bool
>  copy_prop_src(nir_src *src, nir_instr *parent_instr, nir_if *parent_if,
>unsigned num_components)
>  {
> -   if (!src->is_ssa) {
> -  if (src->reg.indirect)
> - return copy_prop_src(src->reg.indirect, parent_instr, parent_if,
> 1);
> -  return false;
> -   }
> +   assert(src->is_ssa);
>
> nir_instr *src_instr = src->ssa->parent_instr;
> nir_ssa_def *copy_def;
> @@ -137,12 +131,7 @@ static bool
>  copy_prop_alu_src(nir_alu_instr *parent_alu_instr, unsigned index)
>  {
> nir_alu_src *src = _alu_instr->src[index];
> -   if (!src->src.is_ssa) {
> -  if (src->src.reg.indirect)
> - return copy_prop_src(src->src.reg.indirect,
> _alu_instr->instr,
> -  NULL, 1);
> -  return false;
> -   }
> +   assert(src->src.is_ssa);
>
> nir_instr *src_instr =  src->src.ssa->parent_instr;
> if (src_instr->type != nir_instr_type_alu)
> @@ -187,15 +176,6 @@ copy_prop_alu_src(nir_alu_instr *parent_alu_instr,
> unsigned index)
> return true;
>  }
>
> -static bool
> -copy_prop_dest(nir_dest *dest, nir_instr *instr)
> -{
> -   if (!dest->is_ssa && dest->reg.indirect)
> -  return copy_prop_src(dest->reg.indirect, instr, NULL, 1);
> -
> -   return false;
> -}
> -
>  static bool
>  copy_prop_instr(nir_instr *instr)
>  {
> @@ -208,9 +188,6 @@ copy_prop_instr(nir_instr *instr)
>   while (copy_prop_alu_src(alu_instr, i))
>  progress = true;
>
> -  while (copy_prop_dest(_instr->dest.dest, instr))
> - progress = true;
> -
>return progress;
> }
>
> @@ -241,9 +218,6 @@ copy_prop_instr(nir_instr *instr)
>  progress = true;
>}
>
> -  while (copy_prop_dest(>dest, instr))
> - progress = true;
> -
>return progress;
> }
>
> @@ -257,11 +231,6 @@ copy_prop_instr(nir_instr *instr)
>  progress = true;
>}
>
> -  if (nir_intrinsic_infos[intrin->intrinsic].has_dest) {
> - while (copy_prop_dest(>dest, instr))
> -progress = true;
> -  }
> -
>return progress;
> }
>
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] anv: advertise 8 subpixel precision bits

2019-02-20 Thread Jason Ekstrand
Could we please also add a line which explicitly sets the precision in
3DSTATE_SF?  That way it's clearer from the code what's going on.

On Wed, Feb 20, 2019 at 11:19 AM Juan A. Suarez Romero 
wrote:

> On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
> used to select between 8 bit subpixel precision (value 0) or 4 bit
> subpixel precision (value 1). As this value is not set, means it is
> taking the value 0, so 8 bit are used.
>
> On the other side, in the Vulkan CTS tests, if the reference rasterizer,
> which uses 8 bit precision, as it is used to check what should be the
> expected value for the tests, is changed to use 4 bit as ANV was
> advertising so far, some of the tests will fail.
>
> So it seems ANV is actually using 8 bits.
>
> CC: Jason Ekstrand 
> CC: Kenneth Graunke 
> ---
>  src/intel/vulkan/anv_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 3120865466a..95224407318 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties(
>   16 * devinfo->max_cs_threads,
>   16 * devinfo->max_cs_threads,
>},
> -  .subPixelPrecisionBits= 4 /* FIXME */,
> +  .subPixelPrecisionBits= 8,
>.subTexelPrecisionBits= 4 /* FIXME */,
>.mipmapPrecisionBits  = 4 /* FIXME */,
>.maxDrawIndexedIndexValue = UINT32_MAX,
> --
> 2.20.1
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] A few NIR compile time optimisations

2019-02-20 Thread Eero Tamminen

Hi,

On 20.2.2019 17.32, Marek Olšák wrote:
On Wed, Feb 20, 2019 at 2:31 AM Connor Abbott On Wed, Feb 20, 2019 at 4:29 AM Marek Olšák 
...

That's a harsh reaction to a relatively good benchmarking setup.
I use debugoptimized with -DDEBUG. My performance is probably
more affected by -fno-omit-frame-pointer than -DDEBUG.

Why would you enable DEBUG in a profiling build? AFAIK it's supposed
to enable validation more expensive than simple asserts, which you
probably don't want in a benchmarking setup, although from grepping
the source it's not used much. It might be a good idea to move
running NIR validation behind DEBUG instead of !NDEBUG. In the
meantime, unless you really want to benchmark things with assertions
enabled in which case NIR_VALIDATE=0 works (but why would you?), you
can set -Db_ndebug=false in Meson. I just saw today that they're
enabled by default in debugoptimized builds (whoops, better go fix
that and re-profile...).

Assertions and other debug code really don't cost much, so I don't see a 
reason to undef DEBUG.


At least on other projects I've seen cases where:
* assert or debug code has been added to places where it's expensive, or 
some of those places become expensive later
* one sees low cost code in profiles and assume it to be specific to 
DEBUG build, although somebody had forgotten to ifdef it properly
* debug build (or part of it) is compiled with other optimization flags 
than release build [1]
* added debug code causes important code not anymore to fit into smaller 
cache level (this is very specific to cache sizes on given machine and 
code GCC generates though)


[1] Unexpected things happen.  Recently I e.g. saw switching
between Meson & Autotools causing Mesa URB allocation changes:
https://bugs.freedesktop.org/show_bug.cgi?id=108787


-> It would help if you could verify what's the perf gap between release 
and debug builds, and provide numbers for both (preferably each time 
there are significant code changes).


Just using release builds should be simpler though (unless your 
intention actually is investigating why debug code is slow).



- Eero
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] anv: advertise 8 subpixel precision bits

2019-02-20 Thread Juan A. Suarez Romero
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, means it is
taking the value 0, so 8 bit are used.

On the other side, in the Vulkan CTS tests, if the reference rasterizer,
which uses 8 bit precision, as it is used to check what should be the
expected value for the tests, is changed to use 4 bit as ANV was
advertising so far, some of the tests will fail.

So it seems ANV is actually using 8 bits.

CC: Jason Ekstrand 
CC: Kenneth Graunke 
---
 src/intel/vulkan/anv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 3120865466a..95224407318 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1095,7 +1095,7 @@ void anv_GetPhysicalDeviceProperties(
  16 * devinfo->max_cs_threads,
  16 * devinfo->max_cs_threads,
   },
-  .subPixelPrecisionBits= 4 /* FIXME */,
+  .subPixelPrecisionBits= 8,
   .subTexelPrecisionBits= 4 /* FIXME */,
   .mipmapPrecisionBits  = 4 /* FIXME */,
   .maxDrawIndexedIndexValue = UINT32_MAX,
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba

2019-02-20 Thread Bruno Milreu
Tested-by: Bruno Filipe 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] spirv: handle function pointer returns.

2019-02-20 Thread Jason Ekstrand
This doesn't apply to master.  shader_info doesn't have ptr_size yet.

On Tue, Feb 19, 2019 at 7:38 PM Dave Airlie  wrote:

> From: Dave Airlie 
>
> This was hardcoded to 32, use the physical bit size we setup.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/compiler/spirv/vtn_cfg.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
> index c32d54e9006..2000c84832e 100644
> --- a/src/compiler/spirv/vtn_cfg.c
> +++ b/src/compiler/spirv/vtn_cfg.c
> @@ -276,7 +276,7 @@ vtn_cfg_handle_prepass_instruction(struct vtn_builder
> *b, SpvOp opcode,
>if (func_type->return_type->base_type != vtn_base_type_void) {
>   /* The return value is a regular pointer */
>   func->params[idx++] = (nir_parameter) {
> -.num_components = 1, .bit_size = 32,
> +.num_components = 1, .bit_size = b->shader->info.cs.ptr_size,
>   };
>}
>
> --
> 2.20.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Piglit] GitLab migration of Piglit

2019-02-20 Thread Den
Given the discussion below, I think we'll make piglit a sub-project of 
mesa.  Those who need commit access to piglit but not mesa can be 
added directly to the piglit project.


Hi list.

Since piglit was also moved to the gitlab, same with mesa, our team is 
interested in process workflow for contributing to it. Before (again, 
same with mesa) we created mailing threads and after reviewing test was 
pushed to master by somebody with access.
Now mesa got a new possibility for reviewing - merge requests, which 
doesn't exist in piglit. Also, according to Jason's conclusion, anybody 
can request commit access to piglit. But in this case how the review 
process will be done?


Thank you in advance.


On 6/7/18 10:18 AM, Daniel Stone wrote:

Hi Emil,

On 5 June 2018 at 18:21, Emil Velikov  wrote:

On 5 June 2018 at 18:02, Daniel Stone  wrote:

drm-gralloc.git

Empty - please nuke, alongside bugzilla & other infra.

Done.


drm.git

Out of curiosity - this and others (say igt) projects are accessible
as mesa/$foo and drm/$foo.
I'd image the same approach will be adopted in gitlab?

Unfortunately not, it provides a single canonical URL rather than
aliasing. But both old anongit URLs will continue to work.


llvm.git

Empty - nuke?

Done.


mesa-test.git

Plain copy/paste just like xserver-test. There were no extra
branches/patches wrt the usual repos.
Nuke?

Done.


libwsbm.git
linux-agp-compat.git
vmwgfx.git

Might want to check with the VMware people about these.
I'm suspecting that the developers are not keeping a close eye on mesa-dev@

I'm going to follow up with them separately about vmwgfx.git.

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] A few NIR compile time optimisations

2019-02-20 Thread Marek Olšák
On Wed, Feb 20, 2019 at 2:31 AM Connor Abbott  wrote:

>
>
> On Wed, Feb 20, 2019 at 4:29 AM Marek Olšák  wrote:
>
>> On Tue, Feb 19, 2019 at 7:57 PM Rob Clark  wrote:
>>
>>> On Tue, Feb 19, 2019 at 6:49 PM Marek Olšák  wrote:
>>> >
>>> > st_link_shader takes 55% of CPU time with NIR, and 9% with TGSI.
>>> >
>>> > nir_validate_shader 49%
>>> >
>>> > nir_validate_shader is overused. It doesn't make sense even in debug
>>> builds.
>>>
>>> tbh, I like nir_validate enabled when I do piglit/deqp runs.. and I
>>> already do separate release vs debug builds (which meson kinda
>>> encourages by disallowing in-tree builds in the first place, but is
>>> totally possible with autotools builds).. I kinda think benchmarking
>>> debug build in the first place is a flawed process.
>>>
>>> So I wouldn't profile/benchmark nir vs tgsi (or really anything) with
>>> debug builds.. I could get behind a separate compiler-debug option
>>> that changed the default NIR_VALIDATE setting and maybe some other nir
>>> debug features to get some features of debug builds without enabling
>>> the heavier nir debug features.  But other than making debug options a
>>> bit more fine grained, I wouldn't change things in response to a
>>> flawed benchmarking process..
>>>
>>
>> That's a harsh reaction to a relatively good benchmarking setup. I use
>> debugoptimized with -DDEBUG. My performance is probably more affected by
>> -fno-omit-frame-pointer than -DDEBUG.
>>
>
> Why would you enable DEBUG in a profiling build? AFAIK it's supposed to
> enable validation more expensive than simple asserts, which you probably
> don't want in a benchmarking setup, although from grepping the source it's
> not used much. It might be a good idea to move running NIR validation
> behind DEBUG instead of !NDEBUG. In the meantime, unless you really want to
> benchmark things with assertions enabled in which case NIR_VALIDATE=0 works
> (but why would you?), you can set -Db_ndebug=false in Meson. I just saw
> today that they're enabled by default in debugoptimized builds (whoops,
> better go fix that and re-profile...).
>

Assertions and other debug code really don't cost much, so I don't see a
reason to undef DEBUG.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3] gallium/auxiliary/vl: Fix transparent issue on compute shader with rgba

2019-02-20 Thread James Zhu
Hi Filipe,

Can I have your name in the Tested-by list for this patch?

Thanks!

James Zhu


On 2019-02-15 4:19 p.m., Zhu, James wrote:

Fixes: 9364d66cb7f7 (Add video compositor compute shader render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109646
Problem 1,4: they are caused by imcomplete blend comute shader
implementation. So Reverts rgba back to frament shader.

Signed-off-by: James Zhu 
---
 src/gallium/auxiliary/vl/vl_compositor.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index 8731ad9..a8f3620 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -100,12 +100,12 @@ init_shaders(struct vl_compositor *c)
  debug_printf("Unable to create YCbCr-to-RGB weave fragment 
shader.\n");
  return false;
   }
+   }

-  c->fs_rgba = create_frag_shader_rgba(c);
-  if (!c->fs_rgba) {
- debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
- return false;
-  }
+   c->fs_rgba = create_frag_shader_rgba(c);
+   if (!c->fs_rgba) {
+  debug_printf("Unable to create RGB-to-RGB fragment shader.\n");
+  return false;
}

return true;
@@ -132,8 +132,8 @@ static void cleanup_shaders(struct vl_compositor *c)
} else {
   c->pipe->delete_fs_state(c->pipe, c->fs_video_buffer);
   c->pipe->delete_fs_state(c->pipe, c->fs_weave_rgb);
-  c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
}
+   c->pipe->delete_fs_state(c->pipe, c->fs_rgba);
 }

 static bool
@@ -642,10 +642,7 @@ vl_compositor_set_rgba_layer(struct vl_compositor_state *s,
assert(layer < VL_COMPOSITOR_MAX_LAYERS);

s->used_layers |= 1 << layer;
-   if (c->pipe_compute_supported)
-  s->layers[layer].cs = c->cs_rgba;
-   else
-  s->layers[layer].fs = c->fs_rgba;
+   s->layers[layer].fs = c->fs_rgba;
s->layers[layer].samplers[0] = c->sampler_linear;
s->layers[layer].samplers[1] = NULL;
s->layers[layer].samplers[2] = NULL;

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Eleni Maria Stea
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and I made a mistake in the calculation: the clamp of the bbox[2] should
   be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent with using CLAMP instead of MAX2 at the top of
   the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
   clear, I replaced it. (Nanley Chery)
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index dcdfb3c9292..dd695218fea 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2445,8 +2445,8 @@ set_scissor_bits(const struct gl_context *ctx, int i,
 
bbox[0] = MAX2(ctx->ViewportArray[i].X, 0);
bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width);
-   bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0);
-   bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height);
+   bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height);
+   bbox[3] = CLAMP(bbox[2] + ctx->ViewportArray[i].Height, 0, fb_height);
_mesa_intersect_scissor_bounding_box(ctx, i, bbox);
 
if (bbox[0] == bbox[1] || bbox[2] == bbox[3]) {
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: fixed clamping in set_scissor_bits when the y is flipped

2019-02-20 Thread Eleni Maria Stea
On Tue, 19 Feb 2019 16:27:56 -0800
Nanley Chery  wrote:

> On Mon, Dec 10, 2018 at 12:42:40PM +0200, Eleni Maria Stea wrote:
> > Calculating the scissor rectangle fields with the y flipped (0 on
> > top) can generate negative values that will cause assertion failure
> > later on as the scissor fields are all unsigned. We must clamp the
> > bbox values again to make sure they don't exceed the fb_height.
> > Also fixed a calculation error.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999  
> 
> Good find. Could you send the test to the piglit list?
Sure, I will send it.


> 
> > ---
> >  src/mesa/drivers/dri/i965/genX_state_upload.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > b/src/mesa/drivers/dri/i965/genX_state_upload.c index
> > 8e3fcbf12e..5d8fc8214e 100644 ---
> > a/src/mesa/drivers/dri/i965/genX_state_upload.c +++
> > b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -2424,8 +2424,21
> > @@ set_scissor_bits(const struct gl_context *ctx, int i, /* memory:
> > Y=0=top */ sc->ScissorRectangleXMin = bbox[0];
> >sc->ScissorRectangleXMax = bbox[1] - 1;
> > +
> > +  /* Clamping to fb_height is necessary because otherwise the
> > +   * subtractions below would produce a negative result, which
> > would
> > +   * then be assigned to the unsigned YMin/YMax scissor fields,
> > +   * resulting in an assertion failure in
> > GENX(SCISSOR_RECT_pack)
> > +   */
> > +
> > +  if (bbox[3] > fb_height)
> > + bbox[3] = fb_height;
> > +
> > +  if (bbox[2] > fb_height)
> > + bbox[2] = fb_height;
> > +  
> 
> We should be able to fix this bug in a simpler manner by changing the
> MAX2 calls at the top of this function to CLAMP calls.
> 
> >sc->ScissorRectangleYMin = fb_height - bbox[3];
> > -  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;
> > +  sc->ScissorRectangleYMax = fb_height - (bbox[2] - 1);  
> 
> I don't think we want to start adding 1 instead of subtracting 1. The
> subtraction is there to satisfy the requirement for the HW packet.
> 
> -Nanley

Right! This code would be correct if I had done:

  if (bbox[2] >= fb_height)
 bbox[2] = fb_height - 1;

and then had left:
  sc->ScissorRectangleYMax = fb_height - bbox[2] - 1;

as it was. :)

I think I like your solution better because with the CLAMP at the top
what we do here is more clear. I am going to send a new patch soon.

Thank you!
Eleni
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109615] 19.0.0_rc4 fails u_format_test on ppc64

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109615

erhar...@mailbox.org changed:

   What|Removed |Added

 Blocks||109535


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109535
[Bug 109535] [Tracker] Mesa 19.0 release tracker
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker

2019-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109535

erhar...@mailbox.org changed:

   What|Removed |Added

 Depends on||109615


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109615
[Bug 109615] 19.0.0_rc4 fails u_format_test on ppc64
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.

2019-02-20 Thread Ville Syrjälä
On Tue, Feb 19, 2019 at 09:36:14AM -0800, Rodrigo Vivi wrote:
> On Mon, Feb 18, 2019 at 04:54:34PM +1100, Jonathan Gray wrote:
> > Compared to linux and libdrm Mesa is missing a VLV and ICL id.
> > 
> > 0x0f30
> > ff049b6ce21d2814451afd4a116d001712e0116b
> > drm/i915: bind driver to ValleyView chipsets
> > 
> > 0x8A70
> > d55cb4fa2cf0105bfb16b60a2846737b91fdc173
> > drm/i915/icl: Add the ICL PCI IDs
> > 
> > The Intel Media SDK describes these as
> > 
> > /* VLV */
> > { 0x0f30, MFX_HW_VLV, MFX_GT1 },   /* VLV mobile */
> 
> hmmm... no idea about this one...
> Ville?
> maybe we should just remove from kernel since it was never
> missed from Mesa?

Bspec says that is the infamous X0. Assuming it's not
lying to us it should be safe to remove.

> 
> > 
> > /* ICL LP */
> > { 0x8A70, MFX_HW_ICL_LP, MFX_GT1 }
> 
> This is pure display so no Mesa needed for this ID.
> 
> > 
> > and libdrm's intel_chipset.h describes the VLV id as
> > 
> > #define PCI_CHIP_VALLEYVIEW_PO  0x0f30 /* VLV PO board */
> > 
> > It isn't clear what the ICL configuration should be from public
> > information.
> > 
> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> > index b91abd7a3f9..3568007b1ef 100644
> > --- a/include/pci_ids/i965_pci_ids.h
> > +++ b/include/pci_ids/i965_pci_ids.h
> > @@ -86,6 +86,7 @@ CHIPSET(0x0D2B, hsw_gt3, "Intel(R) Haswell")
> >  CHIPSET(0x0D0E, hsw_gt1, "Intel(R) Haswell")
> >  CHIPSET(0x0D1E, hsw_gt2, "Intel(R) Haswell")
> >  CHIPSET(0x0D2E, hsw_gt3, "Intel(R) Haswell")
> > +CHIPSET(0x0F30, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F31, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F32, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F33, byt, "Intel(R) Bay Trail")
> > @@ -212,4 +213,5 @@ CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice 
> > Lake 6x8 GT1.5)")
> >  CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> >  CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
> >  CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> > +CHIPSET(0x8A70, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> >  CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")

-- 
Ville Syrjälä
Intel
-
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev