Re: [Mesa-dev] Time to update GSoC/EVoC ideas page

2017-01-23 Thread Daniel Vetter
On Fri, Jan 20, 2017 at 08:44:19AM -0800, Jason Ekstrand wrote:
> On Fri, Jan 20, 2017 at 2:15 AM, Nicolai Hähnle  wrote:
> 
> > Hi Rob,
> >
> > On 19.01.2017 23:32, Rob Clark wrote:
> >
> >> Just a friendly reminder that now would be a good time to update the
> >> wiki page for GSoC/EVoC ideas:
> >>
> >>   https://www.x.org/wiki/SummerOfCodeIdeas/
> >>
> >> There are currently still some stale ideas there (and probably plenty
> >> of missing good ideas).
> >>
> >> Also, I've added a "Potential Mentors" section.  Please add yourself
> >> if you are willing to be a mentor (and what sorts of projects you
> >> would be willing to mentor)
> >>
> >
> > I'd be happy to be listed as a possible mentor on the DriConf replacement
> > project (nha on IRC/freenode), but I either don't have a Wiki account or
> > forgot it a long time ago. Could you put my name down on the page?
> >
> 
> Pro tip: You can just git clone the wiki, change a markdown file, and git
> push it back up.  That's the way I always make edits.  Way nicer than a web
> GUI. :-)

And if you have a shell account, it's easy to resurrect your web wiki
account:

https://wiki.freedesktop.org/sitewranglers/wiki/401/

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99496] Dolphin emulator: Launching Mario Kart Wii results in a blank window/black screen and freeze.

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99496

Christian Lanig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Dolphin emulator displays   |Dolphin emulator: Launching
                   |message "Failed to load     |Mario Kart Wii results in a
                   |Vulkan library."            |blank window/black screen
                   |                            |and freeze.

--- Comment #2 from Christian Lanig  ---
I realized that libvulkan1 wasn't installed on the system; that's why I got
the messages.

Now the issue is the same as before: when I launch Mario Kart Wii, nothing
happens other than the system freezing...

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v2] nir/spirv/glsl450: rewrite atan2 to deal with infinities

2017-01-23 Thread Juan A. Suarez Romero
On Sun, 2017-01-22 at 00:20 -0800, Francisco Jerez wrote:
> "Juan A. Suarez Romero"  writes:
> 
> > Rewrite atan2(y,x) to cover (+/-)INF values.
> > 
> > This fixes several test cases in Vulkan CTS
> > (dEQP-VK.glsl.builtin.precision.atan2.*)
> > 
> > v2: do not flush denorms to 0 (jasuarez)
> > ---
> >  src/compiler/spirv/vtn_glsl450.c | 48 
> > +++-
> >  1 file changed, 42 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/compiler/spirv/vtn_glsl450.c 
> > b/src/compiler/spirv/vtn_glsl450.c
> > index 0d32fddbef..d52a22c0c3 100644
> > --- a/src/compiler/spirv/vtn_glsl450.c
> > +++ b/src/compiler/spirv/vtn_glsl450.c
> > @@ -299,18 +299,47 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
> > return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
> >  }
> >  
> > +/*
> > + * Computes atan2(y,x)
> > + */
> >  static nir_ssa_def *
> >  build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
> >  {
> > nir_ssa_def *zero = nir_imm_float(b, 0.0f);
> > -
> > -   /* If |x| >= 1.0e-8 * |y|: */
> > -   nir_ssa_def *condition =
> > -  nir_fge(b, nir_fabs(b, x),
> > -  nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
> > +   nir_ssa_def *inf = nir_imm_float(b, INFINITY);
> > +   nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
> > +   nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
> > +   nir_imm_float(b, M_PI_4f));
> > +
> > +   /* if y == +-INF */
> > +   nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
> > +
> > +   /* if x == +-INF */
> > +   nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
> > +
> > +   /* Case: y is +-INF */
> > +   nir_ssa_def *y_is_inf_then =
> > +  nir_fmul(b, nir_fsign(b, y),
> > +  nir_bcsel(b, nir_feq(b, x, inf),
> > +   nir_imm_float(b, M_PI_4f),
> > +   nir_bcsel(b, nir_feq(b, x, minus_inf),
> > +m_3_pi_4,
> > +nir_imm_float(b, M_PI_2f))));
> > +
> > +   /* Case: x is +-INF */
> > +   nir_ssa_def *x_is_inf_then =
> > +  nir_fmul(b, nir_fsign(b, y),
> > +  nir_bcsel(b, nir_feq(b, x, inf),
> > +   zero,
> > +   nir_imm_float(b, M_PIf)));
> > +
> 
> I don't think we need all these special cases.  The majority of the
> infinity/zero handling rules required by IEEE are fairly natural and
> would be taken care of without any additional effort by the
> floating-point division operation and single-argument atan function
> below if they propagated infinities and zeroes according to IEEE rules.
> 
> I had a look at the test results myself and noticed that the failures
> are for the most part due to a precision problem in the current
> implementation that doesn't only affect infinity -- Relative precision
> also explodes as x grows above certain point, infinities just make the
> problem catastrophic and cause it to return NaN instead of the
> expected finite value.  The reason for the precision problem is that
> fdiv is later on lowered into an fmul+frcp sequence, and the latter may
> flush the result to zero if the denominator was so huge that its
> reciprocal would be denormalized.  If the numerator happened to be
> infinite you may end up with ∞/huge = NaN for the same reason.
> 
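
The failure mode described above can be reproduced on a CPU with a small
model. This is only a sketch: rcp_flush_denorm() is a hypothetical stand-in
for a hardware reciprocal that flushes denormal results to zero, not a Mesa
or NIR function, and the operand values are chosen purely for illustration.

```c
#include <float.h>
#include <math.h>

/* Hypothetical model of a reciprocal that flushes denormal results to
 * zero (assumption: this mimics the hardware behaviour described above). */
static float rcp_flush_denorm(float d)
{
    float r = 1.0f / d;
    return (fabsf(r) < FLT_MIN) ? copysignf(0.0f, r) : r;
}

/* Division lowered to a multiply by the reciprocal (fmul + frcp). */
static float div_lowered(float n, float d)
{
    return n * rcp_flush_denorm(d);
}
```

With a denominator of 3.0e38f the reciprocal is denormal and gets flushed,
so div_lowered(1.0e30f, 3.0e38f) collapses to zero although the true
quotient is a perfectly normal number, and div_lowered(INFINITY, 3.0e38f)
becomes infinity times zero, which is NaN. A plain INFINITY / 3.0e38f stays
infinite.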

Right. For this case I submitted a patch to the test itself that, roughly
speaking, accepts any result if the denominator is big enough.

https://gerrit.khronos.org/#/c/524/


I understand that your alternative proposal would also handle this case
correctly, making the CTS change unnecessary, right?


> On top of that there seem to be other issues with the current atan2
> implementation:
> 
>  - It doesn't handle zeros correctly.  This may be related to your
>    observation that denorm arguments cause it to give bogus results, but
>    the problem doesn't seem to be related to denorms in particular, but
>    to the fact that denorms can get flushed to -0 which is in turn
>    handled incorrectly.  The reason is that the existing code uses 'y >=
>    0' to determine on which side of the branch cut we are, but that
>    causes the discontinuity to end up along the y=-epsilon line instead
>    of along the y=0 line as IEEE requires -- IOW, with the current
>    implementation very small negative y values behave as if they were
>    positive which causes the result to have a large absolute error of
>    2π.
> 
>  - It doesn't give IEEE-compliant results when both arguments are
>    simultaneously infinite.  This is not surprising given that IEEE
>    defining atan2(∞, ∞) = π/4 is fairly artificial (as are the other
>    rules for combinations of positive or negative infinity), strictly
>    speaking taking the limit along any direction other than the diagonal
>    would be as right or wrong.  To make the matter worse IEEE disagrees

Re: [Mesa-dev] [PATCH 0/3] i965: IVB/BYT fp64 fixes

2017-01-23 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-20 at 13:35 -0800, Matt Turner wrote:
> I committed my EU validator earlier today. It's caught three bugs in the
> IVB fp64 series. Patches 2 and 3 fix two of them. I'll respond directly to
> the patch in Igalia's series that introduces the other bug.
> 

OK, while waiting for Curro's answer for that patch, I am going to take
a look at it.

> These should go in with Igalia's series, before the extension is turned on.
> 
> 

Thanks! I am going to add them to our -rc3 branch.

Sam


Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-23 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
> Matt Turner  writes:
> 
> > In commit c35fa7a, we changed the "width" of DF source registers to 2,
> > which is conceptually fine. Unfortunately a VertStride of 2 is not
> > allowed by align16 instructions on IVB/BYT, and the regular VertStride
> > of 4 works fine in any case.
> > 
> 
> I'll try to throw some light onto why this works -- AFAIUI, in Align16
> mode the vertical stride control doesn't behave as you would expect -- A
> VertStride=0 does behave as expected and steps zero elements across rows
> (modulo instruction decompression bugs), but on Gen7 any other value
> simply behaves as "step by a fixed offset of 16B across rows".  On HSW
> they explicitly allowed VertStride=2, but I don't think the hardware
> became any smarter, it's still most likely a 16B fixed offset.  On IVB
> neither VertStride=2 nor VertStride=4 is supposed to work for our
> purposes (the former because it's supposedly not supported and the
> latter because one would expect it to step by 4 DF elements = 32B per
> 16B row), but they both likely work in practice.  Anyway let's stick to
> what the docs say is not illegal, a couple more comments below.
> 
> > See
> > generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
> > for example:
> > 
> > cmp.ge.f0(8)    g18<1>DF    g1<0>.xyxyDF    -g8<2>DF    { align16 1Q };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> > cmp.ge.f0(8)    g19<1>DF    g1<0>.xyxyDF    -g9<2>DF    { align16 2N };
> > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 888f95e..a01083f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> >      /* This is an oddity of the fact we're using the same
> >       * descriptions for registers in align_16 as align_1:
> >       */
> 
> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
> nobody gets confused and thinks that the code you added has to do with
> our representation of align_16 regions?
> 
> > -    if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> >  brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > -    else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > +reg.type == BRW_REGISTER_TYPE_DF &&
> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> 
> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF Align16
> regions with VertStride=4 really behave like VertStride=2.  If the
> caller expected anything else it's not going to get it.
> 
> Maybe copy-paste the relevant spec text here and below to explain why we
> only use BRW_VERTICAL_STRIDE_4?
> 
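
As a rough sketch of the narrower condition suggested above, the selection
logic could be written as a standalone function. Everything here is a
hypothetical stand-in: the enum, the struct and align16_vstride() do not
exist in the driver, and the real code works on brw_reg, devinfo and the
BRW_VERTICAL_STRIDE_* encodings.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types; the real encodings live
 * in the i965 headers and are not reproduced here. */
enum vstride { VSTRIDE_0, VSTRIDE_1, VSTRIDE_2, VSTRIDE_4, VSTRIDE_8 };

struct region {
    enum vstride vstride;
    bool is_df;             /* register type is double-float */
};

/* Returns the VertStride to encode for an Align16 source operand. */
static enum vstride
align16_vstride(bool is_ivb_or_byt, struct region reg)
{
    if (reg.vstride == VSTRIDE_8)
        return VSTRIDE_4;   /* existing remap from the patch context */
    if (is_ivb_or_byt && reg.is_df && reg.vstride == VSTRIDE_2)
        return VSTRIDE_4;   /* review suggestion: remap only VertStride=2 */
    return reg.vstride;
}
```

Compared with the `>= BRW_VERTICAL_STRIDE_1` test in the patch, this leaves
every stride other than 2 untouched on IVB/BYT, so a caller that asks for
something unusual gets exactly what it encoded.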

Matt, I can handle these changes... however, I have not found the
relevant spec quote. Can you provide it?

Sam

> > +brw_inst_set_src0_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> >  brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
> > + }
> >    }
> > }
> >  }
> > @@ -594,10 +599,15 @@ brw_set_src1(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg reg)
> >      /* This is an oddity of the fact we're using the same
> >       * descriptions for registers in align_16 as align_1:
> >       */
> > -    if (reg.vstride == BRW_VERTICAL_STRIDE_8)
> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
> >  brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > -    else
> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
> > +reg.type == BRW_REGISTER_TYPE_DF &&
> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
> > +brw_inst_set_src1_vstride(devinfo, inst,
> > BRW_VERTICAL_STRIDE_4);
> > + } else {
> >  brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
> > + }
> >    }
> > }
> >  }
> > -- 
> > 2.7.3


[Mesa-dev] [Bug 98833] [REGRESSION, bisected] Wayland revert commit breaks non-Vsync fullscreen frame updates

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98833

Eero Tamminen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[REGRESSION, bisected]      |[REGRESSION, bisected]
                   |Wayland revert commit       |Wayland revert commit
                   |breaks fullscreen frame     |breaks non-Vsync fullscreen
                   |updates                     |frame updates
             Blocks|                            |98471


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=98471
[Bug 98471] [TRACKER] Mesa 13.0 release tracker
-- 
You are receiving this mail because:
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98471] [TRACKER] Mesa 13.0 release tracker

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98471

Eero Tamminen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |98833


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=98833
[Bug 98833] [REGRESSION, bisected] Wayland revert commit breaks non-Vsync
fullscreen frame updates
-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 4/6] configure.ac: Set and use HAVE_GALLIUM_LLVM define

2017-01-23 Thread Jose Fonseca

On 20/01/17 02:48, Emil Velikov wrote:

On 19 January 2017 at 19:26, Tobias Droste  wrote:

On Wednesday, 18 January 2017, 18:45:04 CET, Emil Velikov wrote:

On 18 January 2017 at 18:12, Jose Fonseca  wrote:

In order to untangle things we want to have a distinction between the
gallium (gallivm afaict) and other users - RADV presently.
So how about we update the RADV instances and ensure that we set
the HAVE_{RADV,}_LLVM lot appropriately. The latter will be picky but
overall things should work w/o the annoyances that HAVE_GALLIUM_LLVM
brings?


I honestly don't even understand why we'd want to build parts of the tree
with LLVM while hiding LLVM from other components.  Why can't we just build
everything with LLVM and avoid this combinatorial explosion of weird
options that are nothing more than yet another way the build can break!?


Sadly the combinatorial explosion has been there for a while. Based on
how well my previous attempts to resolve similar issues went (see the
"platforms" topic), I doubt we'll even get to fix that.


But if a separate option is truly necessary, have the newcomer pick a
different name, or something.


That's pretty much what I suggested above. Tobias can you please give it a
try ?


I would rather "fix" the other build systems. (As in just define
HAVE_GALLIUM_LLVM if HAVE_LLVM is defined).
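
The mapping described here can be illustrated in C preprocessor form. This
is only a sketch of the idea: the proposal is about the build systems
themselves, the hand-written HAVE_LLVM define stands in for a -DHAVE_LLVM
flag, and gallium_llvm_enabled() is a hypothetical helper that exists only
to make the result observable.

```c
/* Assumption for the sketch: the build system passes -DHAVE_LLVM. It is
 * defined by hand here so the fragment is self-contained. */
#define HAVE_LLVM 1

/* Compatibility mapping: builds that only know HAVE_LLVM still enable
 * the code guarded by HAVE_GALLIUM_LLVM. */
#if defined(HAVE_LLVM) && !defined(HAVE_GALLIUM_LLVM)
#define HAVE_GALLIUM_LLVM 1
#endif

/* Hypothetical helper, only here to make the result observable. */
static int gallium_llvm_enabled(void)
{
#ifdef HAVE_GALLIUM_LLVM
    return 1;
#else
    return 0;
#endif
}
```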

I think there is still a misunderstanding on Jose's side about what this
really means. No file in gallivm or llvmpipe will be touched. It's really just
auxiliary/draw and there it's exactly 8 lines that will change.

That's it.

I really fail to see how this will break everything that is being worked on
and cause merge conflicts everywhere.

If you still want the other way, I can do that too, but this will of course
need the same fix in the other build system or we have the same situation we
have now, but with other drivers.


Afaict one point is that the use of HAVE_GALLIUM_LLVM vs HAVE_LLVM is
too subtle. Let's not forget that barring the WIP(?) branches, VMWare
has closed source components. Guess how much fun it will be as
suddenly things fail to build/work properly as they re-sync the code
base. No idea how likely the latter is, but considering Jose (and a
few other VMWare guys) wrote a sizeable chunk of that code (and Mesa as a
whole) I'd go with his instinct.

Emil




The HAVE_LLVM->HAVE_GALLIUM_LLVM rename is indeed not as invasive as I 
thought.


But I still don't understand why HAVE_LLVM->HAVE_GALLIUM_LLVM is 
necessary in draw and not on gallivm/llvmpipe.


People want to build draw with LLVM support, but without 
gallivm/llvmpipe? That's impossible.


Or is this because the draw files are the only .c files that are 
compiled even when HAVE_LLVM is undefined, so these are the only ones 
that get to receive the renaming treatment?  That's crazy confusing. 
There's no way I can accept that.



Let me make this crystal clear to avoid making this discussion even more 
protracted: I will not accept any HAVE_LLVM change in 
draw/gallivm/llvmpipe .C/.H source code.  Period.



HAVE_LLVM used to mean, "whole Mesa being built with LLVM".  Now people 
want to build something (no idea what yet to be honest) with LLVM, but 
not build draw/gallivm/llvmpipe.


If you want to build some other component with LLVM but not 
draw/gallivm/llvmpipe, then add a new HAVE_LLVM_FOR_FOOBAR define and 
use it where you need it.



And honestly, I think people should reconsider enabling building parts 
of Mesa with LLVM and others without it.  It sounds like an awful idea that 
will cause nothing more than grief.  But that's not for me to decide.



Jose


Re: [Mesa-dev] [PATCH] gbm-dri: Duplicate image after checking its format.

2017-01-23 Thread Eric Engestrom
On Wednesday, 2016-12-21 10:55:28 +0100, mateuszx.potr...@intel.com wrote:
> From: Mateusz Polrola 
> 
> If the image is duplicated before checking whether its format is supported,
> it may leak memory, as the duplicated image for unsupported formats is
> not destroyed.
> 
> Signed-off-by: Mateusz Polrola 

You are correct, so this patch is:
Reviewed-by: Eric Engestrom 

However, when I looked at the code, it looked like a lot more places leak 
`image`.
I'll send another patch in a minute (to be applied on top of yours).

Cheers,
  Eric
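
The ordering fix reviewed above follows a general pattern: validate first,
allocate afterwards. A minimal sketch, assuming made-up names throughout
(none of these are gbm or DRI functions, and the live_images counter exists
only to make leaks visible):

```c
#include <stdlib.h>

/* Hypothetical stand-ins, not the gbm or DRI API. */
enum fmt { FMT_XRGB, FMT_ARGB, FMT_UNKNOWN };

static int live_images;   /* counts images that have not been destroyed */

static int *dup_image(void)
{
    int *img = malloc(sizeof *img);
    if (img)
        live_images++;
    return img;
}

static void destroy_image(int *img)
{
    if (img) {
        free(img);
        live_images--;
    }
}

/* Checks the format first, so the early return cannot leak a duplicate. */
static int *import_image(enum fmt f)
{
    switch (f) {
    case FMT_XRGB:
    case FMT_ARGB:
        break;
    default:
        return NULL;    /* nothing has been allocated yet */
    }
    return dup_image();
}
```

If the duplication happened before the switch, the default branch would
have to destroy the image explicitly before returning.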

> ---
>  src/gbm/backends/dri/gbm_dri.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index 45cb42a..941d915 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -679,8 +679,6 @@ gbm_dri_bo_import(struct gbm_device *gbm,
>   return NULL;
>}
>  
> -  image = dri->image->dupImage(wb->driver_buffer, NULL);
> -
>switch (wb->format) {
>case WL_DRM_FORMAT_XRGB:
>   gbm_format = GBM_FORMAT_XRGB;
> @@ -697,6 +695,8 @@ gbm_dri_bo_import(struct gbm_device *gbm,
>default:
>   return NULL;
>}
> +
> +  image = dri->image->dupImage(wb->driver_buffer, NULL);
>break;
> }
>  #endif
> -- 
> 2.5.5
> 
> Intel Deutschland GmbH
> Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
> Tel: +49 89 99 8853-0, www.intel.de
> Managing Directors: Christin Eisenschmid, Christian Lamprechter
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Office: Munich
> Commercial Register: Amtsgericht Muenchen HRB 186928
> 


[Mesa-dev] [PATCH mesa] gbm/dri: fix memory leaks in error path

2017-01-23 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/gbm/backends/dri/gbm_dri.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index 45cb42a862..ef96185848 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -715,6 +715,7 @@ gbm_dri_bo_import(struct gbm_device *gbm,
   gbm_format = gbm_dri_to_gbm_format(dri_format);
   if (gbm_format == 0) {
  errno = EINVAL;
+ dri->image->destroyImage(image);
  return NULL;
   }
   break;
@@ -759,8 +760,10 @@ gbm_dri_bo_import(struct gbm_device *gbm,
 
 
bo = calloc(1, sizeof *bo);
-   if (bo == NULL)
+   if (bo == NULL) {
+  dri->image->destroyImage(image);
   return NULL;
+   }
 
bo->image = image;
 
@@ -771,6 +774,7 @@ gbm_dri_bo_import(struct gbm_device *gbm,
if (dri->image->base.version >= 2 &&
!dri->image->validateUsage(bo->image, dri_use)) {
   errno = EINVAL;
+  dri->image->destroyImage(bo->image);
   free(bo);
   return NULL;
}
-- 
Cheers,
  Eric



[Mesa-dev] [Bug 97967] glsl/tests/cache-test regression

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97967

--- Comment #8 from Tapani Pälli  ---
(In reply to Timothy Arceri from comment #7)
> The real issue is these MAX_SIZE limits are far too small. Arguably we
> should handle it anyway or maybe apply a minimum size.

Hmm, let me share my theory: the MAX_SIZE limit set by this test is small
intentionally, to force eviction to happen. The test intends to evict one
specific entry, BUT for some reason picking that exact entry fails on
CentOS. My theory is that in the CentOS environment the compiler generates
code that makes our integrated sha1 implementation produce different
results than on the system where the test was generated.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/4] i965: Combine the Gen6 SF and Clip viewport atoms.

2017-01-23 Thread Pohjolainen, Topi
On Sun, Jan 22, 2017 at 10:42:16PM -0800, Kenneth Graunke wrote:
> The next patch will make the guardband calculation dependent on the
> transformation matrix.  Instead of computing it in both atoms, just
> combine them into a single atom.
> 
> Signed-off-by: Kenneth Graunke 

Reviewed-by: Topi Pohjolainen 

> ---
>  src/mesa/drivers/dri/i965/brw_state.h   |  2 +-
>  src/mesa/drivers/dri/i965/brw_state_upload.c|  3 +-
>  src/mesa/drivers/dri/i965/gen6_viewport_state.c | 82 
> +
>  3 files changed, 30 insertions(+), 57 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> b/src/mesa/drivers/dri/i965/brw_state.h
> index f2349d8c037..ec6006c3fc6 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -110,7 +110,7 @@ extern const struct brw_tracked_state 
> gen7_cs_push_constants;
>  extern const struct brw_tracked_state gen6_binding_table_pointers;
>  extern const struct brw_tracked_state gen6_blend_state;
>  extern const struct brw_tracked_state gen6_clip_state;
> -extern const struct brw_tracked_state gen6_clip_vp;
> +extern const struct brw_tracked_state gen6_sf_and_clip_viewports;
>  extern const struct brw_tracked_state gen6_color_calc_state;
>  extern const struct brw_tracked_state gen6_depth_stencil_state;
>  extern const struct brw_tracked_state gen6_gs_state;
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index d0be6acaf0f..52b74a7c527 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -103,8 +103,7 @@ static const struct brw_tracked_state *gen4_atoms[] =
>  
>  static const struct brw_tracked_state *gen6_atoms[] =
>  {
> -   &gen6_clip_vp,
> -   &gen6_sf_vp,
> +   &gen6_sf_and_clip_viewports,
>  
> /* Command packets: */
>  
> diff --git a/src/mesa/drivers/dri/i965/gen6_viewport_state.c 
> b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> index ad1e72d0a50..2e08f1a1290 100644
> --- a/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_viewport_state.c
> @@ -33,61 +33,12 @@
>  #include "main/framebuffer.h"
>  #include "main/viewport.h"
>  
> -/* The clip VP defines the guardband region where expensive clipping is 
> skipped
> - * and fragments are allowed to be generated and clipped out cheaply by the 
> SF.
> - */
> -static void
> -gen6_upload_clip_vp(struct brw_context *brw)
> -{
> -   struct gl_context *ctx = &brw->ctx;
> -   struct brw_clipper_viewport *vp;
> -
> -   /* BRW_NEW_VIEWPORT_COUNT */
> -   const unsigned viewport_count = brw->clip.viewport_count;
> -
> -   vp = brw_state_batch(brw, AUB_TRACE_CLIP_VP_STATE,
> -sizeof(*vp) * viewport_count, 32, 
> &brw->clip.vp_offset);
> -
> -   for (unsigned i = 0; i < viewport_count; i++) {
> -  /* According to the "Vertex X,Y Clamping and Quantization" section of 
> the
> -   * Strips and Fans documentation, objects must not have a screen-space
> -   * extents of over 8192 pixels, or they may be mis-rasterized.  The 
> maximum
> -   * screen space coordinates of a small object may larger, but we have 
> no
> -   * way to enforce the object size other than through clipping.
> -   *
> -   * If you're surprised that we set clip to -gbx to +gbx and it seems 
> like
> -   * we'll end up with 16384 wide, note that for a 8192-wide render 
> target,
> -   * we'll end up with a normal (-1, 1) clip volume that just covers the
> -   * drawable.
> -   */
> -  const float maximum_post_clamp_delta = 8192;
> -  float gbx = maximum_post_clamp_delta / ctx->ViewportArray[i].Width;
> -  float gby = maximum_post_clamp_delta / ctx->ViewportArray[i].Height;
> -
> -  vp[i].xmin = -gbx;
> -  vp[i].xmax = gbx;
> -  vp[i].ymin = -gby;
> -  vp[i].ymax = gby;
> -   }
> -
> -   brw->ctx.NewDriverState |= BRW_NEW_CLIP_VP;
> -}
> -
> -const struct brw_tracked_state gen6_clip_vp = {
> -   .dirty = {
> -  .mesa = _NEW_VIEWPORT,
> -  .brw = BRW_NEW_BATCH |
> - BRW_NEW_BLORP |
> - BRW_NEW_VIEWPORT_COUNT,
> -   },
> -   .emit = gen6_upload_clip_vp,
> -};
> -
>  static void
> -gen6_upload_sf_vp(struct brw_context *brw)
> +gen6_upload_sf_and_clip_viewports(struct brw_context *brw)
>  {
> struct gl_context *ctx = &brw->ctx;
> struct gen6_sf_viewport *sfv;
> +   struct brw_clipper_viewport *clv;
> GLfloat y_scale, y_bias;
> const bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
>  
> @@ -99,6 +50,10 @@ gen6_upload_sf_vp(struct brw_context *brw)
>   32, &brw->sf.vp_offset);
> memset(sfv, 0, sizeof(*sfv) * viewport_count);
>  
> +   clv = brw_state_batch(brw, AUB_TRACE_CLIP_VP_STATE,
> + sizeof(*clv) * viewport_count,
> + 32, &brw->clip.vp_offset);
> +
> /* _NEW_BUFFERS */
> if (render_to

[Mesa-dev] [Bug 99496] Dolphin emulator: Launching Mario Kart Wii results in a blank window/black screen and freeze.

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99496

--- Comment #3 from Grazvydas Ignotas  ---
Same issue here on RX 470, seems to not depend on the game (or happens on
most).

There was some basic debugging done which showed that the command stream
doesn't even start to be executed (or the debug code fails to detect it), so
it's unclear what's going on.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 2/4] i965: Use a better guardband calculation.

2017-01-23 Thread Ilia Mirkin
On Mon, Jan 23, 2017 at 1:42 AM, Kenneth Graunke  wrote:
> +  *ymin = MIN2(ndc_gb_ymin, ndc_gb_ymax);

Should this be limited on the negative end of the range?

> +  *ymax = MIN2(MAX2(ndc_gb_ymin, ndc_gb_ymax), 16383);

And should this be a different value for gen6? Perhaps based on gb_size?


Re: [Mesa-dev] [PATCH mesa] drirc: remove spurious tabs

2017-01-23 Thread Eric Engestrom
On Friday, 2017-01-06 19:08:39 +1100, Edward O'Callaghan wrote:
> Reviewed-by: Edward O'Callaghan 

Thanks; can you push this for me please?

Cheers,
  Eric

> 
> On 01/06/2017 08:06 AM, Eric Engestrom wrote:
> > Signed-off-by: Eric Engestrom 
> > ---
> >  src/mesa/drivers/dri/common/drirc | 16 
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/common/drirc 
> > b/src/mesa/drivers/dri/common/drirc
> > index af84ee82e8..97297b7a1c 100644
> > --- a/src/mesa/drivers/dri/common/drirc
> > +++ b/src/mesa/drivers/dri/common/drirc
> > @@ -28,46 +28,46 @@ TODO: document the other workarounds.
> >  
> >  
> >  
> > -   
> > +
> >  
> >  
> >  
> >  
> > -   
> > +
> >  
> >   > executable="heaven_x86">
> >   > value="true" />
> >  
> >  
> > -   
> > +
> >  
> >   > executable="heaven_x64">
> >   > value="true" />
> >  
> >  
> > -   
> > +
> >  
> >   > executable="valley_x86">
> >   > value="true" />
> >  
> >  
> > -   
> > +
> >  
> >   > executable="valley_x64">
> >   > value="true" />
> >  
> >  
> > -   
> > +
> >  
> >   > executable="OilRush_x86">
> >  
> >   > value="true" />
> > -   
> > +
> >  
> >   > executable="OilRush_x64">
> >  
> >   > value="true" />
> > -   
> > +
> >  
> >  
> >  
> > 
> 


Re: [Mesa-dev] [PATCH] st/va: delay calling begin_frame until we have all parameters

2017-01-23 Thread Christian König
Ah, yes of course. If we delay creating the decoder we need to call 
begin_frame() again as well.
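
The flag logic can be sketched in isolation. This is a minimal model under
stated assumptions: only the needs_begin_frame name comes from the patch;
struct ctx, the call counters and both functions are hypothetical and stand
in for the VA state tracker's context and decoder callbacks.

```c
#include <stdbool.h>

/* Hypothetical miniature of the decoder context; names other than
 * needs_begin_frame are made up for the sketch. */
struct ctx {
    bool needs_begin_frame;
    int begin_frame_calls;
    int decode_calls;
};

/* Start of a picture, or decoder (re)creation: request a begin_frame. */
static void mark_frame_start(struct ctx *c)
{
    c->needs_begin_frame = true;
}

/* Each slice data buffer: call begin_frame lazily, at most once. */
static void submit_slice(struct ctx *c)
{
    if (c->needs_begin_frame) {
        c->begin_frame_calls++;        /* stands in for begin_frame() */
        c->needs_begin_frame = false;
    }
    c->decode_calls++;                 /* stands in for decode_bitstream() */
}
```

Three slices after one mark_frame_start() give one begin_frame call and
three decode calls. Setting the flag to false instead of true when the
decoder is created would skip begin_frame for that frame entirely.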


Please review and/or test the attached patch. Andy, did I understand you 
correctly that this already counts as a Tested-by from your side?



I am wondering if calling decode_bitstream one at a time for each
buffer is similar to
calling it with all buffers at once? 

Yes, that is correct. It's just not as efficient.

One problem with VA-API is that it doesn't seem to guarantee that 
buffers stay around after handing them off to the decoder (the ownership 
handling of buffers and surfaces is a total mess).


So we would need to make a copy of the buffer content to submit it again 
all at once.


Regards,
Christian.

On 21.01.2017 at 20:46, Andy Furniss wrote:

Nayan Deshmukh wrote:

Hi Christian,

The new patch leads to seg fault on my system. You forgot to set the
needs_begin_frame to true when the decoder is created. Here's diff for
reference:


Setting true there seems to fix, though only a quick test.

The patch below sets false :-)



diff --git a/src/gallium/state_trackers/va/picture.c
b/src/gallium/state_trackers/va/picture.c
index e75006d..a51e482 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -178,6 +178,8 @@ handlePictureParameterBuffer(vlVaDriver *drv,
vlVaContext *context, vlVaBuffer *

if (!context->decoder)
   return VA_STATUS_ERROR_ALLOCATION_FAILED;
+
+  context->needs_begin_frame = false;
 }

 return vaStatus;
--

I am wondering if calling decode_bitstream one at a time for each
buffer is similar to
calling it with all buffers at once?

Cheers,
Nayan

On Thu, Jan 19, 2017 at 8:36 PM, Andy Furniss wrote:

Andy Furniss wrote:


Christian König wrote:


Hi Andy,

On 19.01.2017 at 11:46, Andy Furniss wrote:


I think you are right about the slices, the failing vids are blu-ray/tv.



https://drive.google.com/file/d/0BxP5-S1t9VEEZlozcjVUZ1lDbWM/view?usp=sharing 





Thanks for the link, if you have time please give the attached patch a
try.

It should fix the issue, but I currently don't have a test system for
VAAPI ready so I can't confirm it offhand.



It doesn't fix properly. The vid will play normally after a second,
but during that second I get to see many flash frames of gpu mem.

Lucky I chose this sample out of several as the patch does seem to
fix a couple of other previously failing vids.



Though more testing shows it also regresses previously working vids, some as
above, but one never "starts" and is total junk.









From 6d0f9a351fa72d4be56d1e44ad7c347ca93d4420 Mon Sep 17 00:00:00 2001
From: Christian König
Date: Thu, 19 Jan 2017 13:44:34 +0100
Subject: [PATCH] st/va: make sure that we call begin_frame() only once v2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König 
---
 src/gallium/state_trackers/va/picture.c| 11 ---
 src/gallium/state_trackers/va/va_private.h |  1 +
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index dc7121c..82584ea 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -81,7 +81,7 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID context_id, VASurfaceID rende
}
 
if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
-  context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
+  context->needs_begin_frame = true;
 
return VA_STATUS_SUCCESS;
 }
@@ -178,6 +178,8 @@ handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *
 
   if (!context->decoder)
  return VA_STATUS_ERROR_ALLOCATION_FAILED;
+
+  context->needs_begin_frame = true;
}
 
return vaStatus;
@@ -308,8 +310,11 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
sizes[num_buffers] = buf->size;
++num_buffers;
 
-   context->decoder->begin_frame(context->decoder, context->target,
-  &context->desc.base);
+   if (context->needs_begin_frame) {
+  context->decoder->begin_frame(context->decoder, context->target,
+ &context->desc.base);
+  context->needs_begin_frame = false;
+   }
context->decoder->decode_bitstream(context->decoder, context->target, &context->desc.base,
   num_buffers, (const void * const*)buffers, sizes);
 }
diff --git a/src/gallium/state_trackers/va/va_private.h b/src/gallium/state_trackers/va/va_private.h
index 8faec10..0877236 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -261,6 +261,7 @@ typedef struct {
int target_id;
   

Re: [Mesa-dev] [PATCH] mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture

2017-01-23 Thread Manolova, Plamena
Thank you for reviewing guys! AFAIK this extension is a driver-side feature
and can be enabled for all drivers that support ETC1. I'll go ahead and
update my patch.

On Fri, Jan 20, 2017 at 6:25 PM, Jason Ekstrand 
wrote:

> On Fri, Jan 20, 2017 at 10:16 AM, Ilia Mirkin 
> wrote:
>
>> What level of support would a driver need to provide? Can this just be
>> enabled for all drivers? [This seems like largely a driver-side
>> feature rather than hardware-based.]
>>
>
> My understanding is that we should just expose this extension on all
> hardware that supports ETC1.  Obviously, if it doesn't support ETC1, you
> don't get this extension. :-)
>
>
>> On Fri, Jan 20, 2017 at 1:12 PM, Plamena Manolova
>>  wrote:
>> > Since we already have the functionality in place and games
>> > like Game of Thrones seem to depend on this extension, I
>> > think it makes sense to enable it by making it part of
>> > the extension string even though it's still a draft:
>> >
>> > https://www.khronos.org/registry/gles/extensions/EXT/EXT_
>> compressed_ETC1_RGB8_sub_texture.txt
>> >
>> > Note: OES_compressed_ETC1_RGB8_sub_texture seems to be listed
>> > in gl2ext.h, but there's no documentation for it in the KHR
>> > registry
>> >
>> > Signed-off-by: Plamena Manolova 
>> > ---
>> >  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>> >  src/mesa/main/extensions_table.h | 1 +
>> >  src/mesa/main/mtypes.h   | 1 +
>> >  src/mesa/main/teximage.c | 5 +
>> >  4 files changed, 8 insertions(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
>> b/src/mesa/drivers/dri/i965/intel_extensions.c
>> > index b674b2f..bdf2fa5 100644
>> > --- a/src/mesa/drivers/dri/i965/intel_extensions.c
>> > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
>> > @@ -93,6 +93,7 @@ intelInitExtensions(struct gl_context *ctx)
>> > ctx->Extensions.EXT_blend_equation_separate = true;
>> > ctx->Extensions.EXT_blend_func_separate = true;
>> > ctx->Extensions.EXT_blend_minmax = true;
>> > +   ctx->Extensions.EXT_compressed_ETC1_RGB8_sub_texture = true;
>> > ctx->Extensions.EXT_draw_buffers2 = true;
>> > ctx->Extensions.EXT_framebuffer_sRGB = true;
>> > ctx->Extensions.EXT_gpu_program_parameters = true;
>> > diff --git a/src/mesa/main/extensions_table.h
>> b/src/mesa/main/extensions_table.h
>> > index 2de3c59..8b52a97 100644
>> > --- a/src/mesa/main/extensions_table.h
>> > +++ b/src/mesa/main/extensions_table.h
>> > @@ -198,6 +198,7 @@ EXT(EXT_buffer_storage  ,
>> ARB_buffer_storage
>> >  EXT(EXT_clip_cull_distance  , ARB_cull_distance
>>   ,  x ,  x ,  x ,  30, 2016)
>> >  EXT(EXT_color_buffer_float  , dummy_true
>>,  x ,  x ,  x ,  30, 2013)
>> >  EXT(EXT_compiled_vertex_array   , dummy_true
>>, GLL,  x ,  x ,  x , 1996)
>> > +EXT(EXT_compressed_ETC1_RGB8_sub_texture,
>> EXT_compressed_ETC1_RGB8_sub_texture   ,  x ,  x , ES1, ES2, 2014)
>> >  EXT(EXT_copy_image  , OES_copy_image
>>,  x ,  x ,  x ,  30, 2014)
>> >  EXT(EXT_copy_texture, dummy_true
>>, GLL,  x ,  x ,  x , 1995)
>> >  EXT(EXT_depth_bounds_test   , EXT_depth_bounds_test
>>   , GLL, GLC,  x ,  x , 2002)
>> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> > index f04ec51..719e248 100644
>> > --- a/src/mesa/main/mtypes.h
>> > +++ b/src/mesa/main/mtypes.h
>> > @@ -3948,6 +3948,7 @@ struct gl_extensions
>> > GLboolean EXT_timer_query;
>> > GLboolean EXT_vertex_array_bgra;
>> > GLboolean EXT_window_rectangles;
>> > +   GLboolean EXT_compressed_ETC1_RGB8_sub_texture;
>> > GLboolean OES_copy_image;
>> > GLboolean OES_primitive_bounding_box;
>> > GLboolean OES_sample_variables;
>> > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>> > index bc3b76a..14fe4db 100644
>> > --- a/src/mesa/main/teximage.c
>> > +++ b/src/mesa/main/teximage.c
>> > @@ -1323,6 +1323,11 @@ compressedteximage_only_format(const struct
>> gl_context *ctx, GLenum format)
>> >  {
>> > switch (format) {
>> > case GL_ETC1_RGB8_OES:
>> > +  if (ctx->Extensions.EXT_compressed_ETC1_RGB8_sub_texture)
>> > + return false;
>> > +  else
>> > + return true;
>> > +  break;
>> > case GL_PALETTE4_RGB8_OES:
>> > case GL_PALETTE4_RGBA8_OES:
>> > case GL_PALETTE4_R5_G6_B5_OES:
>> > --
>> > 2.4.3
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop

Re: [Mesa-dev] [PATCH] st/va: delay calling begin_frame until we have all parameters

2017-01-23 Thread Nayan Deshmukh
The patch is Reviewed-by: Nayan Deshmukh 
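
On the decode_bitstream question in this thread: submitting every slice in a single call would require keeping private copies of the slice data, since VA-API does not seem to guarantee that the application's buffers stay around. A minimal sketch of that copy-then-submit idea follows. All names here (ex_slice_batch, ex_batch_add, ex_batch_flush, ex_sum_sizes) are hypothetical and not part of st/va or VA-API, and the submit callback only stands in for a decode_bitstream-style call.

```c
#include <stdlib.h>
#include <string.h>

#define EX_MAX_SLICES 16 /* arbitrary limit for the sketch */

/* Hypothetical container for private copies of slice buffers. */
struct ex_slice_batch {
   void *data[EX_MAX_SLICES];
   unsigned sizes[EX_MAX_SLICES];
   unsigned count;
};

/* Stand-in for a "decode all buffers at once" entry point. */
typedef void (*ex_submit_fn)(unsigned num, const void *const *bufs,
                             const unsigned *sizes, void *user);

/* Copy the caller's buffer so it may be freed or reused right away.
 * Returns 0 on success, -1 if the batch is full or allocation fails. */
static int ex_batch_add(struct ex_slice_batch *b, const void *buf, unsigned size)
{
   void *copy;

   if (b->count == EX_MAX_SLICES)
      return -1;
   copy = malloc(size ? size : 1);
   if (!copy)
      return -1;
   memcpy(copy, buf, size);
   b->data[b->count] = copy;
   b->sizes[b->count] = size;
   b->count++;
   return 0;
}

/* Hand every collected slice to the consumer in one call, then free
 * the copies and reset the batch. */
static void ex_batch_flush(struct ex_slice_batch *b, ex_submit_fn submit, void *user)
{
   unsigned i;

   if (b->count)
      submit(b->count, (const void *const *)b->data, b->sizes, user);
   for (i = 0; i < b->count; i++)
      free(b->data[i]);
   b->count = 0;
}

/* Example consumer: adds up the submitted sizes into *user. */
static void ex_sum_sizes(unsigned num, const void *const *bufs,
                         const unsigned *sizes, void *user)
{
   unsigned i, *total = user;

   (void)bufs;
   for (i = 0; i < num; i++)
      *total += sizes[i];
}
```

The extra malloc and memcpy per slice is the price of batching here. Calling the decoder once per buffer avoids the copy but issues more calls.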



On Mon, Jan 23, 2017 at 7:01 PM, Christian König
 wrote:
> Ah, yes of course. If we delay creating the decoder we need to call
> begin_frame() again as well.
>
> Please review and/or test the attached patch. Andy I did understand you
> right that this is already a Tested-by from your side, isn't it?
>
>> I am wondering if calling decode_bitstream one at a time for each
>> buffer is similar to
>> calling it with all buffers at once?
>
> Yes, that is correct. It's just not as efficient.
>
> One problem with VA-API is that it doesn't seem to guarantee that buffers
> stay around after handing them off to the decoder (the ownership handling of
> buffers and surfaces is a total mess).
>
> So we would need to make a copy of the buffer content to submit it again all
> at once.
>
> Regards,
> Christian.
>
>
> Am 21.01.2017 um 20:46 schrieb Andy Furniss:
>>
>> Nayan Deshmukh wrote:
>>>
>>> Hi Christian,
>>>
>>> The new patch leads to seg fault on my system. You forgot to set the
>>> needs_begin_frame to true when the decoder is created. Here's diff for
>>> reference:
>>
>>
>> Setting true there seems to fix, though only a quick test.
>>
>> The patch below sets false :-)
>>
>>> 
>>> diff --git a/src/gallium/state_trackers/va/picture.c
>>> b/src/gallium/state_trackers/va/picture.c
>>> index e75006d..a51e482 100644
>>> --- a/src/gallium/state_trackers/va/picture.c
>>> +++ b/src/gallium/state_trackers/va/picture.c
>>> @@ -178,6 +178,8 @@ handlePictureParameterBuffer(vlVaDriver *drv,
>>> vlVaContext *context, vlVaBuffer *
>>>
>>> if (!context->decoder)
>>>return VA_STATUS_ERROR_ALLOCATION_FAILED;
>>> +
>>> +  context->needs_begin_frame = false;
>>>  }
>>>
>>>  return vaStatus;
>>> --
>>>
>>> I am wondering if calling decode_bitstream one at a time for each
>>> buffer is similar to
>>> calling it with all buffers at once?
>>>
>>> Cheers,
>>> Nayan
>>>
>>> On Thu, Jan 19, 2017 at 8:36 PM, Andy Furniss 
>>> wrote:

 Andy Furniss wrote:
>
>
> Christian König wrote:
>>
>>
>> Hi Andy,
>>
>> Am 19.01.2017 um 11:46 schrieb Andy Furniss:
>>>
>>>
>>> I think you are right about the slices, the failing vids are
>>> blu-ray/tv.
>>>
>>>
>>>
>>> https://drive.google.com/file/d/0BxP5-S1t9VEEZlozcjVUZ1lDbWM/view?usp=sharing
>>>
>>>
>>>
>> Thanks for the link, if you have time please give the attached patch a
>> try.
>>
>> It should fix the issue, but I currently don't have a test system for
>> VAAPI ready so I can't confirm it offhand.
>
>
>
> It doesn't fix properly. The vid will play normally after a second,
> but during that second I get to see many flash frames of gpu mem.
>
> Lucky I chose this sample out of several as the patch does seem to
> fix a couple of other previously failing vids.



 Though more testing shows it also regresses previously working vids,
 some as
 above, but one never "starts" and is total junk.



>>>
>>
>


Re: [Mesa-dev] [PATCH] mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture

2017-01-23 Thread Ilia Mirkin
Yeah, sounds like you can reuse the OES_compressed_ETC1_RGB8_texture enable bit.
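
For illustration, the ETC1 check in the quoted patch can be written as a single boolean expression. The sketch below is self-contained and uses hypothetical stand-ins (ex_format, ex_extensions, ex_teximage_only_format) rather than Mesa's real types. It assumes that one enable bit gates sub-texture support, which is one reading of the suggestion above to reuse the existing ETC1 bit.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins for the GL enums and the
 * gl_extensions struct. */
enum ex_format { EX_ETC1_RGB8, EX_PALETTE4_RGB8, EX_OTHER };

struct ex_extensions {
   bool etc1_sub_texture; /* could be the existing ETC1 enable bit */
};

/* Returns true when a compressed format may only be specified as a
 * whole image, i.e. sub-image updates must be rejected. */
static bool ex_teximage_only_format(const struct ex_extensions *ext,
                                    enum ex_format format)
{
   switch (format) {
   case EX_ETC1_RGB8:
      /* Same result as the if/else in the patch, without the
       * unreachable break after the returns. */
      return !ext->etc1_sub_texture;
   case EX_PALETTE4_RGB8:
      return true;
   default:
      return false;
   }
}
```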

On Mon, Jan 23, 2017 at 9:31 AM, Manolova, Plamena
 wrote:
> Thank you for reviewing guys! AFAIK this extension is a driver-side feature
> and can be enabled for all drivers that support ETC1. I'll go ahead and
> update my patch.
>
> On Fri, Jan 20, 2017 at 6:25 PM, Jason Ekstrand 
> wrote:
>>
>> On Fri, Jan 20, 2017 at 10:16 AM, Ilia Mirkin 
>> wrote:
>>>
>>> What level of support would a driver need to provide? Can this just be
>>> enabled for all drivers? [This seems like largely a driver-side
>>> feature rather than hardware-based.]
>>
>>
>> My understanding is that we should just expose this extension on all
>> hardware that supports ETC1.  Obviously, if it doesn't support ETC1, you
>> don't get this extension. :-)
>>
>>>
>>> On Fri, Jan 20, 2017 at 1:12 PM, Plamena Manolova
>>>  wrote:
>>> > Since we already have the functionality in place and games
>>> > like Game of Thrones seem to depend on this extension, I
>>> > think it makes sense to enable it by making it part of
>>> > the extension string even though it's still a draft:
>>> >
>>> >
>>> > https://www.khronos.org/registry/gles/extensions/EXT/EXT_compressed_ETC1_RGB8_sub_texture.txt
>>> >
>>> > Note: OES_compressed_ETC1_RGB8_sub_texture seems to be listed
>>> > in gl2ext.h, but there's no documentation for it in the KHR
>>> > registry
>>> >
>>> > Signed-off-by: Plamena Manolova 
>>> > ---
>>> >  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>>> >  src/mesa/main/extensions_table.h | 1 +
>>> >  src/mesa/main/mtypes.h   | 1 +
>>> >  src/mesa/main/teximage.c | 5 +
>>> >  4 files changed, 8 insertions(+)
>>> >
>>> > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
>>> > b/src/mesa/drivers/dri/i965/intel_extensions.c
>>> > index b674b2f..bdf2fa5 100644
>>> > --- a/src/mesa/drivers/dri/i965/intel_extensions.c
>>> > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
>>> > @@ -93,6 +93,7 @@ intelInitExtensions(struct gl_context *ctx)
>>> > ctx->Extensions.EXT_blend_equation_separate = true;
>>> > ctx->Extensions.EXT_blend_func_separate = true;
>>> > ctx->Extensions.EXT_blend_minmax = true;
>>> > +   ctx->Extensions.EXT_compressed_ETC1_RGB8_sub_texture = true;
>>> > ctx->Extensions.EXT_draw_buffers2 = true;
>>> > ctx->Extensions.EXT_framebuffer_sRGB = true;
>>> > ctx->Extensions.EXT_gpu_program_parameters = true;
>>> > diff --git a/src/mesa/main/extensions_table.h
>>> > b/src/mesa/main/extensions_table.h
>>> > index 2de3c59..8b52a97 100644
>>> > --- a/src/mesa/main/extensions_table.h
>>> > +++ b/src/mesa/main/extensions_table.h
>>> > @@ -198,6 +198,7 @@ EXT(EXT_buffer_storage  ,
>>> > ARB_buffer_storage
>>> >  EXT(EXT_clip_cull_distance  , ARB_cull_distance
>>> > ,  x ,  x ,  x ,  30, 2016)
>>> >  EXT(EXT_color_buffer_float  , dummy_true
>>> > ,  x ,  x ,  x ,  30, 2013)
>>> >  EXT(EXT_compiled_vertex_array   , dummy_true
>>> > , GLL,  x ,  x ,  x , 1996)
>>> > +EXT(EXT_compressed_ETC1_RGB8_sub_texture,
>>> > EXT_compressed_ETC1_RGB8_sub_texture   ,  x ,  x , ES1, ES2, 2014)
>>> >  EXT(EXT_copy_image  , OES_copy_image
>>> > ,  x ,  x ,  x ,  30, 2014)
>>> >  EXT(EXT_copy_texture, dummy_true
>>> > , GLL,  x ,  x ,  x , 1995)
>>> >  EXT(EXT_depth_bounds_test   , EXT_depth_bounds_test
>>> > , GLL, GLC,  x ,  x , 2002)
>>> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>> > index f04ec51..719e248 100644
>>> > --- a/src/mesa/main/mtypes.h
>>> > +++ b/src/mesa/main/mtypes.h
>>> > @@ -3948,6 +3948,7 @@ struct gl_extensions
>>> > GLboolean EXT_timer_query;
>>> > GLboolean EXT_vertex_array_bgra;
>>> > GLboolean EXT_window_rectangles;
>>> > +   GLboolean EXT_compressed_ETC1_RGB8_sub_texture;
>>> > GLboolean OES_copy_image;
>>> > GLboolean OES_primitive_bounding_box;
>>> > GLboolean OES_sample_variables;
>>> > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>>> > index bc3b76a..14fe4db 100644
>>> > --- a/src/mesa/main/teximage.c
>>> > +++ b/src/mesa/main/teximage.c
>>> > @@ -1323,6 +1323,11 @@ compressedteximage_only_format(const struct
>>> > gl_context *ctx, GLenum format)
>>> >  {
>>> > switch (format) {
>>> > case GL_ETC1_RGB8_OES:
>>> > +  if (ctx->Extensions.EXT_compressed_ETC1_RGB8_sub_texture)
>>> > + return false;
>>> > +  else
>>> > + return true;
>>> > +  break;
>>> > case GL_PALETTE4_RGB8_OES:
>>> > case GL_PALETTE4_RGBA8_OES:
>>> > case GL_PALETTE4_R5_G6_B5_OES:
>>> > --
>>> > 2.4.3
>>> >
>>> > ___
>>> > mesa-dev mailing list
>>> > mesa-dev@lists.freedesktop.org
>>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>> ___
>>> mesa-dev mailing list
>>> m

Re: [Mesa-dev] [PATCH v2 09/20] i965/fs: indirect addressing with doubles is not supported in IVB/BYT

2017-01-23 Thread Samuel Iglesias Gonsálvez
On Fri, 2017-01-20 at 13:41 -0800, Matt Turner wrote:
> On Tue, Jan 17, 2017 at 1:49 AM, Samuel Iglesias Gonsálvez
>  wrote:
> > It is tested empirically that IVB/BYT don't support indirect
> > addressing
> > with doubles but it is not documented in the PRM.
> > 
> > This patch applies the same solution as for Cherryview/Broxton
> > and
> > takes into account that we cannot double the stride, since the
> > hardware will do it internally.
> > 
> > v2:
> > - Fix assert to take into account Indirect DF MOVs in IVB and HSW.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > ---
> 
> These two tests uncover a bug in this patch
> 
> spec/arb_gpu_shader_fp64/execution/fs-indirect-temp-double-const-src
> spec/arb_gpu_shader_fp64/uniform_buffers/fs-double-uniform-array-
> direct-indirect
> 
> They generate
> 
> add(8)  a0<1>UW g6<4,4,1>UW 0x0070UW{
> align1 1Q };
> mov(8)  g10<1>UDg[a0]UD{
> align1 1Q };
> add(8)  a0<1>UW g6.8<4,4,1>UW   0x0070UW{
> align1 2N };
> mov(8)  g7<1>UD g[a0]UD{
> align1 2N };
> add(8)  a0<1>UW g8<4,4,1>UW 0x0070UW{
> align1 1Q };
> mov(8)  g10.1<1>UD  g[a0]UD{
> align1 1Q };
> ERROR: Writes must be evenly split between the two
> destination registers
> add(8)  a0<1>UW g8.8<4,4,1>UW   0x0070UW{
> align1 2N };
> mov(8)  g7.1<1>UD   g[a0]UD{
> align1 2N };
> ERROR: Writes must be evenly split between the two
> destination registers
> 
> I think the mov(8)s from g[a0]UD are supposed to be mov(4).
> 

Right.

This bug is not present in our -rc3 branch. However I discovered that,
although piglit was saying the test passed with the -auto flag, this
was not true. It was checking the color of one pixel that turned out to
be green, but some of the others are yellow (which is the color set by
the test for failure).

I am taking a look at this to provide a proper patch.

Sam
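
The false pass described above comes from probing a single pixel. A check that walks the whole image would catch the yellow pixels. The helper below is only a sketch of that idea on a plain float RGBA array; all_pixels_match is a hypothetical name and not a piglit function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true only if every RGBA pixel of a width x height image is
 * within tol of the expected color on all four channels. */
static bool all_pixels_match(const float *rgba, size_t width, size_t height,
                             const float expected[4], float tol)
{
   size_t i, c;

   for (i = 0; i < width * height; i++) {
      for (c = 0; c < 4; c++) {
         float d = rgba[i * 4 + c] - expected[c];

         if (d < -tol || d > tol)
            return false;
      }
   }
   return true;
}
```

With a two-pixel image holding one green and one yellow pixel, checking only the first pixel passes while checking both fails, which matches the behavior reported above.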

> I would have thought that lower_simd_width was the appropriate place
> to split SHADER_OPCODE_MOV_INDIRECT, rather than directly in the
> translation from NIR (as in commit bdab572a8). Maybe if that were
> fixed the solution to this problem would be somewhat simpler?
> 
> Curro, what do you think?
> 


Re: [Mesa-dev] [PATCH] st/mesa: destroy pipe_context before destroying st_context

2017-01-23 Thread Nicolai Hähnle
Looks like there's a problem with the error path at the end of 
st_create_context_priv (line ~506): it calls st_destroy_context_priv 
which will now destroy the pipe, which then leads to a double-destroy by 
the caller.


Just setting st->pipe = NULL; would be enough.

Nicolai
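
To make the double-destroy concrete, here is a small self-contained sketch of the ownership hand-off. The mock_* types and functions are hypothetical stand-ins, not the real st/mesa code. The point is only that the failing creation path clears st->pipe before calling the shared teardown, so the caller remains the single owner of the pipe.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for pipe_context and st_context. */
struct mock_pipe { int destroy_calls; };
struct mock_st { struct mock_pipe *pipe; };

static void mock_pipe_destroy(struct mock_pipe *pipe)
{
   pipe->destroy_calls++;
}

/* Mirrors the patched teardown: destroy the pipe if st still owns
 * one, then free st itself. */
static void mock_destroy_priv(struct mock_st *st)
{
   if (st->pipe)
      mock_pipe_destroy(st->pipe);
   free(st);
}

/* Creation with a late step that can fail. On failure the caller
 * still owns the pipe and will destroy it, so ownership is dropped
 * before the shared teardown runs. */
static struct mock_st *mock_create_priv(struct mock_pipe *pipe, int fail_late)
{
   struct mock_st *st = calloc(1, sizeof(*st));

   if (!st)
      return NULL;
   st->pipe = pipe;

   if (fail_late) {
      st->pipe = NULL; /* avoid destroying the caller's pipe */
      mock_destroy_priv(st);
      return NULL;
   }
   return st;
}
```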

On 20.01.2017 20:00, Marek Olšák wrote:

From: Marek Olšák 

If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

Cc: 17.0 
---
 src/mesa/state_tracker/st_context.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 0eae971..7a99e82 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -307,20 +307,24 @@ st_destroy_context_priv(struct st_context *st)
}

/* free glDrawPixels cache data */
free(st->drawpix_cache.image);
pipe_resource_reference(&st->drawpix_cache.texture, NULL);

/* free glReadPixels cache data */
st_invalidate_readpix_cache(st);

cso_destroy_context(st->cso_context);
+
+   if (st->pipe)
+  st->pipe->destroy(st->pipe);
+
free( st );
 }


 static struct st_context *
 st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
const struct st_config_options *options)
 {
struct pipe_screen *screen = pipe->screen;
uint i;
@@ -572,21 +576,20 @@ static void
 destroy_tex_sampler_cb(GLuint id, void *data, void *userData)
 {
struct gl_texture_object *texObj = (struct gl_texture_object *) data;
struct st_context *st = (struct st_context *) userData;

st_texture_release_sampler_view(st, st_texture_object(texObj));
 }

 void st_destroy_context( struct st_context *st )
 {
-   struct pipe_context *pipe = st->pipe;
struct gl_context *ctx = st->ctx;
GLuint i;

_mesa_HashWalk(ctx->Shared->TexObjects, destroy_tex_sampler_cb, st);

st_reference_fragprog(st, &st->fp, NULL);
st_reference_geomprog(st, &st->gp, NULL);
st_reference_vertprog(st, &st->vp, NULL);
st_reference_tesscprog(st, &st->tcp, NULL);
st_reference_tesseprog(st, &st->tep, NULL);
@@ -604,22 +607,20 @@ void st_destroy_context( struct st_context *st )

st_destroy_program_variants(st);

_mesa_free_context_data(ctx);

/* This will free the st_context too, so 'st' must not be accessed
 * afterwards. */
st_destroy_context_priv(st);
st = NULL;

-   pipe->destroy( pipe );
-
free(ctx);
 }

 static void
 st_emit_string_marker(struct gl_context *ctx, const GLchar *string, GLsizei 
len)
 {
struct st_context *st = ctx->st;
st->pipe->emit_string_marker(st->pipe, string, len);
 }





Re: [Mesa-dev] [PATCH 6/7] radeonsi: always set the TCL1_ACTION_ENA when invalidating L2

2017-01-23 Thread Nicolai Hähnle

On 20.01.2017 20:07, Marek Olšák wrote:

From: Marek Olšák 

Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.


Have you actually seen this fix anything? Seems reasonable to me anyway, 
so patches 1-6:


Reviewed-by: Nicolai Hähnle 
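
As a rough illustration of what the patch changes, the flag word for an L2 invalidate can be built as below. The EX_* bit positions are placeholders chosen for this sketch and are not taken from the hardware headers; ex_chip_class and ex_l2_invalidate_flags are hypothetical names as well.

```c
#include <stdint.h>

/* Placeholder bit positions for the sketch only. */
#define EX_TC_ACTION_ENA    (UINT32_C(1) << 0) /* invalidate L2 */
#define EX_TCL1_ACTION_ENA  (UINT32_C(1) << 1) /* invalidate L1 */
#define EX_TC_WB_ACTION_ENA (UINT32_C(1) << 2) /* write back L2 */

enum ex_chip_class { EX_SI, EX_CIK, EX_VI };

/* Build the flags for "invalidate L1 and L2". The L1 bit is set
 * explicitly on every chip class instead of relying on it being
 * implied; write-back is added only on VI and later. */
static uint32_t ex_l2_invalidate_flags(uint32_t cp_coher_cntl,
                                       enum ex_chip_class chip)
{
   uint32_t flags = cp_coher_cntl | EX_TC_ACTION_ENA | EX_TCL1_ACTION_ENA;

   if (chip >= EX_VI)
      flags |= EX_TC_WB_ACTION_ENA;
   return flags;
}
```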



Cc: 17.0 13.0 
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 837c025..d296874 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -843,25 +843,26 @@ void si_emit_cache_flush(struct si_context *sctx)
 * in PFP.
 *
 * cp_coher_cntl should contain all necessary flags except TC flags
 * at this point.
 *
 * SI-CIK don't support L2 write-back.
 */
if (rctx->flags & SI_CONTEXT_INV_GLOBAL_L2 ||
(rctx->chip_class <= CIK &&
 (rctx->flags & SI_CONTEXT_WRITEBACK_GLOBAL_L2))) {
-   /* Invalidate L1 & L2. (L1 is always invalidated)
+   /* Invalidate L1 & L2. (L1 is always invalidated on SI)
 * WB must be set on VI+ when TC_ACTION is set.
 */
si_emit_surface_sync(rctx, cp_coher_cntl |
 S_0085F0_TC_ACTION_ENA(1) |
+S_0085F0_TCL1_ACTION_ENA(1) |
 S_0301F0_TC_WB_ACTION_ENA(rctx->chip_class 
>= VI));
cp_coher_cntl = 0;
sctx->b.num_L2_invalidates++;
} else {
/* L1 invalidation and L2 writeback must be done separately,
 * because both operations can't be done together.
 */
if (rctx->flags & SI_CONTEXT_WRITEBACK_GLOBAL_L2) {
/* WB = write-back
 * NC = apply to non-coherent MTYPEs




Re: [Mesa-dev] [PATCH 7/7] radeonsi: handle first_non_void correctly in si_create_vertex_elements

2017-01-23 Thread Nicolai Hähnle
When does it happen that first_non_void < 0? Doesn't this require 
PIPE_FORMAT_NONE, and why would that occur?


Nicolai
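
For reference, the guard in the patch follows the usual pattern for a lookup that can return a negative index. The sketch below uses hypothetical ex_* types instead of the real util_format structures. It assumes only that a format with no non-void channel yields -1, as the question above implies.

```c
#include <stddef.h>

enum ex_type { EX_TYPE_VOID, EX_TYPE_UNSIGNED, EX_TYPE_FIXED };

struct ex_channel { enum ex_type type; unsigned size; };

struct ex_format_desc {
   struct ex_channel channel[4];
   unsigned nr_channels;
};

/* Index of the first non-void channel, or -1 if there is none. */
static int ex_first_non_void(const struct ex_format_desc *desc)
{
   unsigned i;

   for (i = 0; i < desc->nr_channels; i++) {
      if (desc->channel[i].type != EX_TYPE_VOID)
         return (int)i;
   }
   return -1;
}

/* NULL instead of &desc->channel[-1] when nothing was found. */
static const struct ex_channel *ex_first_channel(const struct ex_format_desc *desc)
{
   int idx = ex_first_non_void(desc);

   return idx >= 0 ? &desc->channel[idx] : NULL;
}

/* Every later use has to check the pointer before dereferencing. */
static int ex_is_fixed(const struct ex_format_desc *desc)
{
   const struct ex_channel *ch = ex_first_channel(desc);

   return ch && ch->type == EX_TYPE_FIXED;
}
```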

On 20.01.2017 20:07, Marek Olšák wrote:

From: Marek Olšák 

Cc: 17.0 
---
 src/gallium/drivers/radeonsi/si_state.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 876cbf6..01edff9 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3361,21 +3361,21 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,

if (!used[vbo_index]) {
v->first_vb_use_mask |= 1 << i;
used[vbo_index] = true;
}

desc = util_format_description(elements[i].src_format);
first_non_void = 
util_format_get_first_non_void_channel(elements[i].src_format);
data_format = si_translate_buffer_dataformat(ctx->screen, desc, 
first_non_void);
num_format = si_translate_buffer_numformat(ctx->screen, desc, 
first_non_void);
-   channel = &desc->channel[first_non_void];
+   channel = first_non_void >= 0 ? &desc->channel[first_non_void] 
: NULL;

v->rsrc_word3[i] = 
S_008F0C_DST_SEL_X(si_map_swizzle(desc->swizzle[0])) |
   
S_008F0C_DST_SEL_Y(si_map_swizzle(desc->swizzle[1])) |
   
S_008F0C_DST_SEL_Z(si_map_swizzle(desc->swizzle[2])) |
   
S_008F0C_DST_SEL_W(si_map_swizzle(desc->swizzle[3])) |
   S_008F0C_NUM_FORMAT(num_format) |
   S_008F0C_DATA_FORMAT(data_format);
v->format_size[i] = desc->block.bits / 8;

/* The hardware always treats the 2-bit alpha channel as
@@ -3383,26 +3383,26 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
 */
if (data_format == V_008F0C_BUF_DATA_FORMAT_2_10_10_10) {
if (num_format == V_008F0C_BUF_NUM_FORMAT_SNORM) {
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_A2_SNORM 
<< (4 * i);
} else if (num_format == 
V_008F0C_BUF_NUM_FORMAT_SSCALED) {
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_A2_SSCALED 
<< (4 * i);
} else if (num_format == V_008F0C_BUF_NUM_FORMAT_SINT) {
/* This isn't actually used in OpenGL. */
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_A2_SINT 
<< (4 * i);
}
-   } else if (channel->type == UTIL_FORMAT_TYPE_FIXED) {
+   } else if (channel && channel->type == UTIL_FORMAT_TYPE_FIXED) {
if (desc->swizzle[3] == PIPE_SWIZZLE_1)
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_RGBX_32_FIXED 
<< (4 * i);
else
v->fix_fetch |= (uint64_t)SI_FIX_FETCH_RGBA_32_FIXED 
<< (4 * i);
-   } else if (channel->size == 32 && !channel->pure_integer) {
+   } else if (channel && channel->size == 32 && 
!channel->pure_integer) {
if (channel->type == UTIL_FORMAT_TYPE_SIGNED) {
if (channel->normalized) {
if (desc->swizzle[3] == PIPE_SWIZZLE_1)
v->fix_fetch |= 
(uint64_t)SI_FIX_FETCH_RGBX_32_SNORM << (4 * i);
else
v->fix_fetch |= 
(uint64_t)SI_FIX_FETCH_RGBA_32_SNORM << (4 * i);
} else {
v->fix_fetch |= 
(uint64_t)SI_FIX_FETCH_RGBA_32_SSCALED << (4 * i);
}
} else if (channel->type == UTIL_FORMAT_TYPE_UNSIGNED) {




Re: [Mesa-dev] [PATCH mesa] drirc: remove spurious tabs

2017-01-23 Thread Nicolai Hähnle

Pushed, thanks.

On 23.01.2017 14:29, Eric Engestrom wrote:

On Friday, 2017-01-06 19:08:39 +1100, Edward O'Callaghan wrote:

Reviewed-by: Edward O'Callaghan 


Thanks; can you push this for me please?

Cheers,
  Eric



On 01/06/2017 08:06 AM, Eric Engestrom wrote:

Signed-off-by: Eric Engestrom 
---
 src/mesa/drivers/dri/common/drirc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index af84ee82e8..97297b7a1c 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -28,46 +28,46 @@ TODO: document the other workarounds.
 
 
 
-   
+

 
 
 
-   
+

 
 
 
 
-   
+

 
 
 
 
-   
+

 
 
 
 
-   
+

 
 
 
 
-   
+

 
 
 
-   
+

 
 
 
-   
+

 
 






Re: [Mesa-dev] [PATCH] i965/blorp: Add also depth buffer to render cache

2017-01-23 Thread Pohjolainen, Topi
On Fri, Jan 20, 2017 at 08:40:50AM -0800, Jason Ekstrand wrote:
>On Thu, Jan 19, 2017 at 11:48 PM, Pohjolainen, Topi
><[1]topi.pohjolai...@gmail.com> wrote:
> 
>  On Thu, Jan 19, 2017 at 01:39:49PM -0800, Jason Ekstrand wrote:
>  >On Thu, Jan 19, 2017 at 12:40 PM, Francisco Jerez
>  ><[1][2]curroje...@riseup.net> wrote:
>  >
>  >  "Pohjolainen, Topi" <[2][3]topi.pohjolai...@gmail.com>
>  writes:
>  >  > On Thu, Jan 19, 2017 at 12:10:02PM -0800, Francisco Jerez
>  wrote:
>  >  >> Topi Pohjolainen <[3][4]topi.pohjolai...@gmail.com>
>  writes:
>  >  >>
>  >  >> > CC: Francisco Jerez <[4][5]curroje...@riseup.net>
>  >  >> > CC: Kenneth Graunke <[5][6]kenn...@whitecape.org>
>  >  >> > CC: Jason Ekstrand <[6][7]ja...@jlekstrand.net>
>  >  >> > Signed-off-by: Topi Pohjolainen
>  <[7][8]topi.pohjolai...@intel.com>
> 
>>  >> > ---
>>  >> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 3 +++
>>  >> >  1 file changed, 3 insertions(+)
>>  >> >
>>  >> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
>>  b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
>>  >> > index 647a362..594bd5a 100644
>>  >> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
>>  >> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
>>  >> > @@ -261,4 +261,7 @@ retry:
>>  >> >
>>  >> > if (params->dst.enabled)
>>  >> >brw_render_cache_set_add_bo(brw,
>>  params->dst.addr.buffer);
>>  >> > +
>>  >> > +   if (params->depth.enabled)
>>  >> > +  brw_render_cache_set_add_bo(brw,
>>  params->depth.addr.buffer);
>>  >>
>>  >> What about the stencil buffer?  Stencil texturing is likely
>to be
>>  >> unhappy unless you mark it as pending flush as well...
>>  >
>>  > As far as I know i965 only clears depth and color using blorp,
>>  stencil gets
>>  > cleared using meta. Blits in turn have it as destination.
>>  >
>>  That doesn't sound like a safe assumption to rely on looking
>forward
>>  if
>>  the blorp api already exposes support for stencil writes --
>Tracking
>>  down the ultimate cause of a memory coherency bugs can be really
>>  hard,
>>  why make our future lives more intentionally difficult by
>>  introducing
>>  buggy corner cases like this?  The extra check is not going to
>hurt
>>  performance or cause any other harmful side effects unless
>stencil
>>  writes are used...
>>
>>Agreed.  Let's stick it in there for stencil too.  I've got
>patches to
>>switch i965 over to blorp for depth/stencil blits.  I never landed
>them
>>because of what was most likely flushing bugs.  I'm hoping that
>you've
>>fixed those and I'll revive the patches.
>>Also, please make sure these fixes hit stable.
> 
>  This sits on top of the four earlier patches. Rebasing this alone
>  against stable
>  requires manual work but can be done. How do you want to handle
>  that?
> 
>Ken, Curro, and I had a little chat about this in the office.  I think
>the conclusion we came to was the following:
>1) The patches to add flushing around HiZ ops and fast clear ops should
>get back-ported all the way to 13.0.  They fix potentially serious bugs
>that could cause problems.

You mean these two? They apply on 13.0 without any tweaks:

i965/gen6: Issue direct depth stall and flush after depth clear
i965: Make depth clear flushing more explicit

>2) The patches that switch us over to the render cache should get
>backported to 17.0.  They aren't so much a bug fix as an enhancement
>but keeping 17.0 consistent with future will help in backporting other
>fixes.  For the record, this was me and Ken; Curro preferred to not
>backport these.

And all four apply cleanly on 17.0.
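
To illustrate the point made earlier in the thread about tracking depth and stencil alongside color, here is a self-contained sketch. The ex_* names are hypothetical, and the fixed-size array only stands in for whatever set structure the driver really uses.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_CACHE_MAX 8 /* arbitrary capacity for the sketch */

struct ex_bo { int id; };
struct ex_surf { bool enabled; struct ex_bo *buffer; };
struct ex_params { struct ex_surf dst, depth, stencil; };

struct ex_render_cache {
   const struct ex_bo *entries[EX_CACHE_MAX];
   size_t count;
};

/* Add a buffer to the set of buffers with pending render writes;
 * duplicates are ignored. */
static void ex_cache_add(struct ex_render_cache *cache, const struct ex_bo *bo)
{
   size_t i;

   for (i = 0; i < cache->count; i++) {
      if (cache->entries[i] == bo)
         return;
   }
   if (cache->count < EX_CACHE_MAX)
      cache->entries[cache->count++] = bo;
}

/* Record every surface the operation may have written, so that a
 * later texturing access knows a flush is needed. */
static void ex_mark_written(struct ex_render_cache *cache,
                            const struct ex_params *params)
{
   if (params->dst.enabled)
      ex_cache_add(cache, params->dst.buffer);
   if (params->depth.enabled)
      ex_cache_add(cache, params->depth.buffer);
   if (params->stencil.enabled)
      ex_cache_add(cache, params->stencil.buffer);
}
```

The stencil check costs nothing when stencil writes are not used, which is the argument made in the quoted discussion for including it.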

> 
> References
> 
>1. mailto:topi.pohjolai...@gmail.com
>2. mailto:curroje...@riseup.net
>3. mailto:topi.pohjolai...@gmail.com
>4. mailto:topi.pohjolai...@gmail.com
>5. mailto:curroje...@riseup.net
>6. mailto:kenn...@whitecape.org
>7. mailto:ja...@jlekstrand.net
>8. mailto:topi.pohjolai...@intel.com


Re: [Mesa-dev] [PATCH] st/va: delay calling begin_frame until we have all parameters

2017-01-23 Thread Andy Furniss

Christian König wrote:

Ah, yes of course. If we delay creating the decoder we need to call
begin_frame() again as well.

Please review and/or test the attached patch. Andy I did understand you
right that this is already a Tested-by from your side, isn't it?


Yes, this patch seems OK.
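
For readers following the thread, the flag logic of the v2 patch can be sketched in isolation. The mock_* names below are hypothetical stand-ins for the st/va context and decoder. The sketch shows only that begin_frame runs once per picture, and that a decoder created late must set the flag to true, which is the case the earlier diff got wrong by setting false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the video decoder and the VA context. */
struct mock_decoder {
   int begin_frame_calls;
   int decode_calls;
};

struct mock_context {
   struct mock_decoder *decoder;
   bool needs_begin_frame;
};

/* Start of a picture: only record that begin_frame is still owed. */
static void mock_begin_picture(struct mock_context *ctx)
{
   ctx->needs_begin_frame = true;
}

/* Picture parameters may create the decoder late. A fresh decoder
 * has not seen begin_frame yet, so the flag must be true here. */
static void mock_picture_params(struct mock_context *ctx,
                                struct mock_decoder *fresh)
{
   if (!ctx->decoder) {
      ctx->decoder = fresh;
      ctx->needs_begin_frame = true;
   }
}

/* Slice data: call begin_frame at most once per picture, then decode. */
static void mock_slice_data(struct mock_context *ctx)
{
   assert(ctx->decoder != NULL);

   if (ctx->needs_begin_frame) {
      ctx->decoder->begin_frame_calls++;
      ctx->needs_begin_frame = false;
   }
   ctx->decoder->decode_calls++;
}
```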



Re: [Mesa-dev] [PATCH] i965/blorp: Add also depth buffer to render cache

2017-01-23 Thread Jason Ekstrand
On Mon, Jan 23, 2017 at 7:47 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Fri, Jan 20, 2017 at 08:40:50AM -0800, Jason Ekstrand wrote:
> >On Thu, Jan 19, 2017 at 11:48 PM, Pohjolainen, Topi
> ><[1]topi.pohjolai...@gmail.com> wrote:
> >
> >  On Thu, Jan 19, 2017 at 01:39:49PM -0800, Jason Ekstrand wrote:
> >  >On Thu, Jan 19, 2017 at 12:40 PM, Francisco Jerez
> >  ><[1][2]curroje...@riseup.net> wrote:
> >  >
> >  >  "Pohjolainen, Topi" <[2][3]topi.pohjolai...@gmail.com>
> >  writes:
> >  >  > On Thu, Jan 19, 2017 at 12:10:02PM -0800, Francisco Jerez
> >  wrote:
> >  >  >> Topi Pohjolainen <[3][4]topi.pohjolai...@gmail.com>
> >  writes:
> >  >  >>
> >  >  >> > CC: Francisco Jerez <[4][5]curroje...@riseup.net>
> >  >  >> > CC: Kenneth Graunke <[5][6]kenn...@whitecape.org>
> >  >  >> > CC: Jason Ekstrand <[6][7]ja...@jlekstrand.net>
> >  >  >> > Signed-off-by: Topi Pohjolainen
> >  <[7][8]topi.pohjolai...@intel.com>
> >
> >>  >> > ---
> >>  >> >  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 3 +++
> >>  >> >  1 file changed, 3 insertions(+)
> >>  >> >
> >>  >> > diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> >>  b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> >>  >> > index 647a362..594bd5a 100644
> >>  >> > --- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> >>  >> > +++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
> >>  >> > @@ -261,4 +261,7 @@ retry:
> >>  >> >
> >>  >> > if (params->dst.enabled)
> >>  >> >brw_render_cache_set_add_bo(brw,
> >>  params->dst.addr.buffer);
> >>  >> > +
> >>  >> > +   if (params->depth.enabled)
> >>  >> > +  brw_render_cache_set_add_bo(brw,
> >>  params->depth.addr.buffer);
> >>  >>
> >>  >> What about the stencil buffer?  Stencil texturing is likely
> >to be
> >>  >> unhappy unless you mark it as pending flush as well...
> >>  >
> >>  > As far as I know i965 only clears depth and color using
> blorp,
> >>  stencil gets
> >>  > cleared using meta. Blits in turn have it as destination.
> >>  >
> >>  That doesn't sound like a safe assumption to rely on looking
> >forward
> >>  if
> >>  the blorp api already exposes support for stencil writes --
> >Tracking
> >>  down the ultimate cause of a memory coherency bug can be
> really
> >>  hard,
> >>  why make our future lives more intentionally difficult by
> >>  introducing
> >>  buggy corner cases like this?  The extra check is not going to
> >hurt
> >>  performance or cause any other harmful side effects unless
> >stencil
> >>  writes are used...
> >>
> >>Agreed.  Let's stick it in there for stencil too.  I've got
> >patches to
> >>switch i965 over to blorp for depth/stencil blits.  I never
> landed
> >them
> >>because of what was most likely flushing bugs.  I'm hoping that
> >you've
> >>fixed those and I'll revive the patches.
> >>Also, please make sure these fixes hit stable.
> >
> >  This sits on top of the four earlier patches. Rebasing this alone
> >  against stable
> >  requires manual work but can be done. How do you want to handle
> >  that?
> >
> >Ken, Curro, and I had a little chat about this in the office.  I think
> >the conclusion we came to was the following:
> >1) The patches to add flushing around HiZ ops and fast clear ops
> should
> >get back-ported all the way to 13.0.  They fix potentially serious
> bugs
> >that could cause problems.
>
> You mean these two? They apply on 13.0 without any tweaks:
>
> i965/gen6: Issue direct depth stall and flush after depth clear
> i965: Make depth clear flushing more explicit
>

Yes, those.


> >2) The patches that switch us over to the render cache should get
> >backported to 17.0.  They aren't so much a bug fix as an enhancement
> >but keeping 17.0 consistent with future will help in backporting other
> >fixes.  For the record, this was me and Ken; Curro preferred to not
> >backport these.
>
> And all the four apply clean on 17.0.
>

Cool


> >
> > References
> >
> >1. mailto:topi.pohjolai...@gmail.com
> >2. mailto:curroje...@riseup.net
> >3. mailto:topi.pohjolai...@gmail.com
> >4. mailto:topi.pohjolai...@gmail.com
> >5. mailto:curroje...@riseup.net
> >6. mailto:kenn...@whitecape.org
> >7. mailto:ja...@jlekstrand.net
> >8. mailto:topi.pohjolai...@intel.com
>
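
Editorial sketch of the point agreed on in this thread: every surface an
operation may write gets recorded, not only the color destination. The types
and helpers below (toy_surface, toy_params, toy_cache_set) are hypothetical
stand-ins and not the i965 API; the only assumption is that the render cache
behaves like a set of buffers with writes that have not been flushed yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BOS 8

/* Hypothetical stand-ins for a surface and the parameters of one operation. */
struct toy_surface {
   bool enabled;
   const void *buffer;
};

struct toy_params {
   struct toy_surface dst, depth, stencil;
};

/* Tiny set of buffers whose rendering has not been flushed yet. */
struct toy_cache_set {
   const void *bo[TOY_MAX_BOS];
   size_t count;
};

static void toy_cache_set_add(struct toy_cache_set *set, const void *bo)
{
   for (size_t i = 0; i < set->count; i++) {
      if (set->bo[i] == bo)
         return; /* already tracked */
   }
   assert(set->count < TOY_MAX_BOS);
   set->bo[set->count++] = bo;
}

/* Record every surface the operation may have written, not only color. */
static void toy_mark_written(struct toy_cache_set *set,
                             const struct toy_params *params)
{
   if (params->dst.enabled)
      toy_cache_set_add(set, params->dst.buffer);
   if (params->depth.enabled)
      toy_cache_set_add(set, params->depth.buffer);
   if (params->stencil.enabled)
      toy_cache_set_add(set, params->stencil.buffer);
}
```

When depth and stencil share one buffer, the set ends up with two entries
(color plus the shared buffer), so tracking stencil adds nothing in that case.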


Re: [Mesa-dev] [PATCH 2/4] i965: Use a better guardband calculation.

2017-01-23 Thread Jason Ekstrand
On Mon, Jan 23, 2017 at 4:59 AM, Ilia Mirkin  wrote:

> On Mon, Jan 23, 2017 at 1:42 AM, Kenneth Graunke 
> wrote:
> > +  *ymin = MIN2(ndc_gb_ymin, ndc_gb_ymax);
>
> Should this be limited on the negative end of the range?
>
> > +  *ymax = MIN2(MAX2(ndc_gb_ymin, ndc_gb_ymax), 16383);
>
> And should this be a different value for gen6? Perhaps based on gb_size?
>

I don't think so.  It's a limit on some hardware for unknown reasons.  That
said, Ken, it would be really nice to get an actual PRM citation because I
can't find it anywhere.
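
Editorial sketch of the two quoted lines, for readers following the question
about the negative end. MIN2 and MAX2 are defined locally here under the same
names as in the hunk, and 16383 is the bound quoted there. The symmetric
variant is hypothetical: it only shows what limiting the negative end would
look like, and whether any hardware needs it is not established in this thread.

```c
#include <assert.h>

/* Local helpers named after the macros used in the quoted hunk. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

#define GB_LIMIT 16383.0f /* upper bound taken from the quoted hunk */

/* Same shape as the quoted code: order the endpoints, cap the upper one. */
static void gb_range(float a, float b, float *lo, float *hi)
{
   assert(lo && hi);
   *lo = MIN2(a, b);
   *hi = MIN2(MAX2(a, b), GB_LIMIT);
}

/* Hypothetical variant that also caps the negative end. */
static void gb_range_symmetric(float a, float b, float *lo, float *hi)
{
   gb_range(a, b, lo, hi);
   *lo = MAX2(*lo, -GB_LIMIT);
}
```

With endpoints -20000 and 20000, gb_range leaves the lower end at -20000 and
caps the upper end at 16383, while the symmetric variant caps both.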


[Mesa-dev] [PATCH] st/mesa: destroy pipe_context before destroying st_context (v2)

2017-01-23 Thread Marek Olšák
From: Marek Olšák 

If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

v2: protect against a double destroy in st_create_context_priv and callers.

Cc: 17.0 
---
 src/mesa/state_tracker/st_context.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 0eae971..5523734 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -271,21 +271,21 @@ void st_invalidate_state(struct gl_context * ctx, 
GLbitfield new_state)
   st->dirty |= st->active_states & ST_NEW_CONSTANTS;
 
/* This is the only core Mesa module we depend upon.
 * No longer use swrast, swsetup, tnl.
 */
_vbo_InvalidateState(ctx, new_state);
 }
 
 
 static void
-st_destroy_context_priv(struct st_context *st)
+st_destroy_context_priv(struct st_context *st, bool destroy_pipe)
 {
uint shader, i;
 
st_destroy_atoms( st );
st_destroy_draw( st );
st_destroy_clear(st);
st_destroy_bitmap(st);
st_destroy_drawpix(st);
st_destroy_drawtex(st);
st_destroy_perfmon(st);
@@ -307,20 +307,24 @@ st_destroy_context_priv(struct st_context *st)
}
 
/* free glDrawPixels cache data */
free(st->drawpix_cache.image);
pipe_resource_reference(&st->drawpix_cache.texture, NULL);
 
/* free glReadPixels cache data */
st_invalidate_readpix_cache(st);
 
cso_destroy_context(st->cso_context);
+
+   if (st->pipe && destroy_pipe)
+  st->pipe->destroy(st->pipe);
+
free( st );
 }
 
 
 static struct st_context *
 st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe,
const struct st_config_options *options)
 {
struct pipe_screen *screen = pipe->screen;
uint i;
@@ -496,21 +500,21 @@ st_create_context_priv( struct gl_context *ctx, struct 
pipe_context *pipe,
st->shader_has_one_variant[MESA_SHADER_TESS_EVAL] = 
st->has_shareable_shaders;
st->shader_has_one_variant[MESA_SHADER_GEOMETRY] = 
st->has_shareable_shaders;
st->shader_has_one_variant[MESA_SHADER_COMPUTE] = st->has_shareable_shaders;
 
_mesa_compute_version(ctx);
 
if (ctx->Version == 0) {
   /* This can happen when a core profile was requested, but the driver
* does not support some features of GL 3.1 or later.
*/
-  st_destroy_context_priv(st);
+  st_destroy_context_priv(st, false);
   return NULL;
}
 
_mesa_initialize_dispatch_tables(ctx);
_mesa_initialize_vbo_vtxfmt(ctx);
 
return st;
 }
 
 static void st_init_driver_flags(struct gl_driver_flags *f)
@@ -572,21 +576,20 @@ static void
 destroy_tex_sampler_cb(GLuint id, void *data, void *userData)
 {
struct gl_texture_object *texObj = (struct gl_texture_object *) data;
struct st_context *st = (struct st_context *) userData;
 
st_texture_release_sampler_view(st, st_texture_object(texObj));
 }
  
 void st_destroy_context( struct st_context *st )
 {
-   struct pipe_context *pipe = st->pipe;
struct gl_context *ctx = st->ctx;
GLuint i;
 
_mesa_HashWalk(ctx->Shared->TexObjects, destroy_tex_sampler_cb, st);
 
st_reference_fragprog(st, &st->fp, NULL);
st_reference_geomprog(st, &st->gp, NULL);
st_reference_vertprog(st, &st->vp, NULL);
st_reference_tesscprog(st, &st->tcp, NULL);
st_reference_tesseprog(st, &st->tep, NULL);
@@ -601,25 +604,23 @@ void st_destroy_context( struct st_context *st )
pipe_resource_reference(&st->pixel_xfer.pixelmap_texture, NULL);
 
_vbo_DestroyContext(ctx);
 
st_destroy_program_variants(st);
 
_mesa_free_context_data(ctx);
 
/* This will free the st_context too, so 'st' must not be accessed
 * afterwards. */
-   st_destroy_context_priv(st);
+   st_destroy_context_priv(st, true);
st = NULL;
 
-   pipe->destroy( pipe );
-
free(ctx);
 }
 
 static void
 st_emit_string_marker(struct gl_context *ctx, const GLchar *string, GLsizei 
len)
 {
struct st_context *st = ctx->st;
st->pipe->emit_string_marker(st->pipe, string, len);
 }
 
-- 
2.7.4
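
Editorial sketch of the ordering issue described in the commit message. All
names here (toy_frontend, toy_backend, toy_destroy_all) are hypothetical
stand-ins, not Mesa's API. The assumption is only that the backend may still
report into frontend-owned state while it shuts down, so it has to be
destroyed before that state is freed. The destroy_backend flag plays the role
of destroy_pipe in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical frontend state owning a debug log (stands in for st_context). */
struct toy_frontend {
   int debug_messages;
   bool alive;
};

/* Hypothetical backend (stands in for pipe_context); it keeps a pointer
 * back into the frontend so it can report statistics. */
struct toy_backend {
   struct toy_frontend *debug_sink;
};

static struct toy_frontend *toy_frontend_create(void)
{
   struct toy_frontend *fe = calloc(1, sizeof(*fe));
   assert(fe);
   fe->alive = true;
   return fe;
}

static struct toy_backend *toy_backend_create(struct toy_frontend *fe)
{
   struct toy_backend *be = calloc(1, sizeof(*be));
   assert(be);
   be->debug_sink = fe;
   return be;
}

/* Shutting down the backend finishes pending work, which writes to the
 * sink. The sink must therefore still be alive at this point. */
static void toy_backend_destroy(struct toy_backend *be)
{
   assert(be->debug_sink->alive);
   be->debug_sink->debug_messages++;
   free(be);
}

/* Safe order: backend first, frontend last. Returns the messages logged.
 * With destroy_backend == false the caller keeps ownership of the backend. */
static int toy_destroy_all(struct toy_frontend *fe, struct toy_backend *be,
                           bool destroy_backend)
{
   int logged;

   if (be && destroy_backend)
      toy_backend_destroy(be);

   logged = fe->debug_messages;
   fe->alive = false;
   free(fe);
   return logged;
}
```

Swapping the two steps in toy_destroy_all would make the backend write into
freed memory, which is the kind of crash the commit message describes.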



Re: [Mesa-dev] [PATCH 6/7] radeonsi: always set the TCL1_ACTION_ENA when invalidating L2

2017-01-23 Thread Marek Olšák
On Mon, Jan 23, 2017 at 4:09 PM, Nicolai Hähnle  wrote:
> On 20.01.2017 20:07, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> Some CIK-VI docs say this is the default behavior on SI. That doesn't
>> answer whether it's also the default behavior on CIK-VI.
>
>
> Have you actually seen this fix anything? Seems reasonable to me anyway, so

Nope. I guess TC L1 hits between draw calls are very unlikely.

Marek


Re: [Mesa-dev] [PATCH 7/7] radeonsi: handle first_non_void correctly in si_create_vertex_elements

2017-01-23 Thread Marek Olšák
On Mon, Jan 23, 2017 at 4:10 PM, Nicolai Hähnle  wrote:
> When does it happen that first_non_void < 0? Doesn't this require
> PIPE_FORMAT_NONE, and why would that occur?

R11G11B10_FLOAT, because it's in the category of "OTHER", meaning that
it doesn't have any channel description.

Marek


Re: [Mesa-dev] [AMD] Screen flickering with 4K and RX 480, would be glad to help debugging

2017-01-23 Thread Romain Failliot
2017-01-23 0:06 GMT-05:00 Timothy Arceri :

> I can confirm a similar problem. I have the same card and also got a 4K
> monitor recently. For me I was running a game in windowed mode (F1 2015
> I think) for a very short amount of time and had some flickering, after
> I closed the game gnome continued flickering periodically until I
> rebooted.
>
> I'm also running Fedora 25 but was running stock Mesa (13.0) at the
> time and stock llvm (3.8) and 4.8 Kernel.
>
> I haven't bothered trying to reproduce at this stage.
>

Always glad to know you're not alone ;)

What is your workaround? For me it is to get back to 2516x1440, it seems to
solve the problem, or at least I've never seen a glitch when in this
resolution.

-- 
Romain "Creak" Failliot


Re: [Mesa-dev] [AMD] Screen flickering with 4K and RX 480, would be glad to help debugging

2017-01-23 Thread Romain Failliot
2017-01-23 12:43 GMT-05:00 Romain Failliot :

> What is your workaround? For me it is to get back to 2516x1440, it seems
> to solve the problem, or at least I've never seen a glitch when in this
> resolution.
>

s/2516/2560/

-- 
Romain "Creak" Failliot


[Mesa-dev] [PATCH 1/3] gallivm: (trivial) fix ddiv cpu implementation

2017-01-23 Thread sroland
From: Roland Scheidegger 

We can't use the cpu implementation of fdiv, as it uses a different
lp_build_context, which causes an assertion failure.
Just use the default fdiv action (there is no fast rcp for doubles that we
could use anyway).
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 937170f..e78cdb0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -2624,7 +2624,6 @@ lp_set_default_actions_cpu(
bld_base->op_actions[TGSI_OPCODE_DSLT].emit = dslt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSNE].emit = dsne_emit_cpu;
 
-   bld_base->op_actions[TGSI_OPCODE_DDIV].emit = div_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DRSQ].emit = drecip_sqrt_emit_cpu;
bld_base->op_actions[TGSI_OPCODE_DSQRT].emit = dsqrt_emit_cpu;
 
-- 
2.7.4



[Mesa-dev] [PATCH 3/3] tgsi: implement ddiv opcode

2017-01-23 Thread sroland
From: Roland Scheidegger 

softpipe (along with llvmpipe) claims to support ARB_gpu_shader_fp64,
so we really need to support that opcode.
---
 src/gallium/auxiliary/tgsi/tgsi_exec.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 915cd10..2135ad1 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -210,6 +210,16 @@ micro_dadd(union tgsi_double_channel *dst,
 }
 
 static void
+micro_ddiv(union tgsi_double_channel *dst,
+  const union tgsi_double_channel *src)
+{
+   dst->d[0] = src[0].d[0] / src[1].d[0];
+   dst->d[1] = src[0].d[1] / src[1].d[1];
+   dst->d[2] = src[0].d[2] / src[1].d[2];
+   dst->d[3] = src[0].d[3] / src[1].d[3];
+}
+
+static void
 micro_ddx(union tgsi_exec_channel *dst,
   const union tgsi_exec_channel *src)
 {
@@ -5995,6 +6005,10 @@ exec_instruction(
   exec_double_binary(mach, inst, micro_dadd, TGSI_EXEC_DATA_DOUBLE);
   break;
 
+   case TGSI_OPCODE_DDIV:
+  exec_double_binary(mach, inst, micro_ddiv, TGSI_EXEC_DATA_DOUBLE);
+  break;
+
case TGSI_OPCODE_DMUL:
   exec_double_binary(mach, inst, micro_dmul, TGSI_EXEC_DATA_DOUBLE);
   break;
-- 
2.7.4
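
Editorial sketch: a standalone version of the per-channel division, for trying
the behaviour outside Mesa. The union toy_double_channel and the function
toy_ddiv are hypothetical stand-ins for the types used in the patch; the
assumption is four double lanes per channel, with src[0] as the numerator and
src[1] as the denominator.

```c
#include <assert.h>

#define TOY_CHANNELS 4

/* Hypothetical stand-in for a four-wide double channel. */
union toy_double_channel {
   double d[TOY_CHANNELS];
};

/* dst = src[0] / src[1], one lane at a time. */
static void toy_ddiv(union toy_double_channel *dst,
                     const union toy_double_channel *src)
{
   assert(dst && src);
   for (int i = 0; i < TOY_CHANNELS; i++)
      dst->d[i] = src[0].d[i] / src[1].d[i];
}
```

The loop is equivalent to the four unrolled assignments in the patch; the
unrolled form matches the style of the neighbouring micro_* helpers.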



[Mesa-dev] [PATCH 2/3] gallivm: don't try to use fast rcp for fdiv

2017-01-23 Thread sroland
From: Roland Scheidegger 

The use of fast rcp instruction is disabled, and will always fall back
to use a division instead (1 / x). Hence, if we get a division opcode,
it doesn't make much sense trying to split that into rcp/mul.
---
 src/gallium/auxiliary/gallivm/lp_bld_arit.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c 
b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 5553cb1..04f86be 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1372,7 +1372,9 @@ lp_build_div(struct lp_build_context *bld,
  return LLVMConstUDiv(a, b);
}
 
-   if(((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
+   /* fast rcp is disabled (just uses div), so makes no sense to try that */
+   if(FALSE &&
+  ((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
(util_cpu_caps.has_avx && type.width == 32 && type.length == 8)) &&
   type.floating)
   return lp_build_mul(bld, a, lp_build_rcp(bld, b));
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/2] gallium/radeon: refactor the GRBM counters path

2017-01-23 Thread Marek Olšák
On Fri, Jan 20, 2017 at 8:19 PM, Samuel Pitoiset
 wrote:
> This will allow exposing more queries in order to know which
> blocks are busy/idle.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeon/r600_gpu_load.c| 55 
> ++-
>  src/gallium/drivers/radeon/r600_pipe_common.h | 18 +
>  src/gallium/drivers/radeon/r600_query.c   | 14 +++
>  3 files changed, 44 insertions(+), 43 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
> b/src/gallium/drivers/radeon/r600_gpu_load.c
> index e3488b3ac7..38ddda1652 100644
> --- a/src/gallium/drivers/radeon/r600_gpu_load.c
> +++ b/src/gallium/drivers/radeon/r600_gpu_load.c
> @@ -35,6 +35,7 @@
>   */
>
>  #include "r600_pipe_common.h"
> +#include "r600_query.h"
>  #include "os/os_time.h"
>
>  /* For good accuracy at 1000 fps or lower. This will be inaccurate for higher
> @@ -45,6 +46,12 @@
>  #define SPI_BUSY(x)(((x) >> 22) & 0x1)
>  #define GUI_ACTIVE(x)  (((x) >> 31) & 0x1)
>
> +#define UPDATE_COUNTER(field, mask)\
> +   if (mask(value))\
> +   p_atomic_inc(&counters->named.field.busy);  \
> +   else\
> +   p_atomic_inc(&counters->named.field.idle);
> +
>  static void r600_update_grbm_counters(struct r600_common_screen *rscreen,
>   union r600_grbm_counters *counters)
>  {
> @@ -52,15 +59,8 @@ static void r600_update_grbm_counters(struct 
> r600_common_screen *rscreen,
>
> rscreen->ws->read_registers(rscreen->ws, GRBM_STATUS, 1, &value);
>
> -   if (SPI_BUSY(value))
> -   p_atomic_inc(&counters->named.spi_busy);
> -   else
> -   p_atomic_inc(&counters->named.spi_idle);
> -
> -   if (GUI_ACTIVE(value))
> -   p_atomic_inc(&counters->named.gui_busy);
> -   else
> -   p_atomic_inc(&counters->named.gui_idle);
> +   UPDATE_COUNTER(spi, SPI_BUSY);
> +   UPDATE_COUNTER(gui, GUI_ACTIVE);
>  }
>
>  static PIPE_THREAD_ROUTINE(r600_gpu_load_thread, param)
> @@ -104,8 +104,8 @@ void r600_gpu_load_kill_thread(struct r600_common_screen 
> *rscreen)
> rscreen->gpu_load_thread = 0;
>  }
>
> -static uint64_t r600_read_counter(struct r600_common_screen *rscreen,
> - unsigned busy_index)
> +static uint64_t r600_read_grbm_counter(struct r600_common_screen *rscreen,
> +  unsigned busy_index)
>  {
> /* Start the thread if needed. */
> if (!rscreen->gpu_load_thread) {
> @@ -123,10 +123,10 @@ static uint64_t r600_read_counter(struct 
> r600_common_screen *rscreen,
> return busy | ((uint64_t)idle << 32);
>  }
>
> -static unsigned r600_end_counter(struct r600_common_screen *rscreen,
> -uint64_t begin, unsigned busy_index)
> +static unsigned r600_end_grbm_counter(struct r600_common_screen *rscreen,
> + uint64_t begin, unsigned busy_index)
>  {
> -   uint64_t end = r600_read_counter(rscreen, busy_index);
> +   uint64_t end = r600_read_grbm_counter(rscreen, busy_index);
> unsigned busy = (end & 0x) - (begin & 0x);
> unsigned idle = (end >> 32) - (begin >> 32);
>
> @@ -147,25 +147,28 @@ static unsigned r600_end_counter(struct 
> r600_common_screen *rscreen,
> }
>  }
>
> -#define BUSY_INDEX(rscreen, field) 
> (&rscreen->grbm_counters.named.field##_busy - \
> +#define BUSY_INDEX(rscreen, field) (&rscreen->grbm_counters.named.field.busy 
> - \
> rscreen->grbm_counters.array)
>
> -uint64_t r600_begin_counter_spi(struct r600_common_screen *rscreen)
> -{
> -   return r600_read_counter(rscreen, BUSY_INDEX(rscreen, spi));
> -}
> -
> -unsigned r600_end_counter_spi(struct r600_common_screen *rscreen, uint64_t 
> begin)
> +static unsigned busy_index_from_type(struct r600_common_screen *rscreen,
> +unsigned type)
>  {
> -   return r600_end_counter(rscreen, begin, BUSY_INDEX(rscreen, spi));
> +   switch (type) {
> +   case R600_QUERY_GPU_LOAD: return BUSY_INDEX(rscreen, gui);
> +   case R600_QUERY_GPU_SHADERS_BUSY: return BUSY_INDEX(rscreen, spi);
> +   default: unreachable("query type does not correspond to grbm id");

Always make a new line after ':'

Other than that:

Reviewed-by: Marek Olšák 

Marek
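
Editorial sketch of the snapshot scheme visible in the quoted hunks: the busy
and idle sample counts are packed into one 64-bit value (busy in the low half,
idle in the high half), and a query result is computed from the difference
between two snapshots. The names toy_pack and toy_load_percent are
hypothetical, and returning 0 when no samples were taken is an assumption of
this sketch rather than the driver's behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 32-bit sample counters into one 64-bit snapshot:
 * busy in the low half, idle in the high half. */
static uint64_t toy_pack(uint32_t busy, uint32_t idle)
{
   return (uint64_t)busy | ((uint64_t)idle << 32);
}

/* Percentage of busy samples between two snapshots. */
static unsigned toy_load_percent(uint64_t begin, uint64_t end)
{
   /* Unsigned 32-bit subtraction stays correct across counter wraparound. */
   uint32_t busy = (uint32_t)(end & 0xffffffff) - (uint32_t)(begin & 0xffffffff);
   uint32_t idle = (uint32_t)(end >> 32) - (uint32_t)(begin >> 32);
   uint64_t total = (uint64_t)busy + idle;
   unsigned percent;

   if (total == 0)
      return 0; /* assumption: no samples means no load */

   percent = (unsigned)((uint64_t)busy * 100 / total);
   assert(percent <= 100);
   return percent;
}
```

For example, going from (busy 10, idle 20) to (busy 40, idle 30) gives 30 busy
samples out of 40, so the sketch reports 75.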


Re: [Mesa-dev] [PATCH 2/3] gallivm: don't try to use fast rcp for fdiv

2017-01-23 Thread Jose Fonseca

On 23/01/17 17:47, srol...@vmware.com wrote:

From: Roland Scheidegger 

The use of fast rcp instruction is disabled, and will always fall back
to use a division instead (1 / x). Hence, if we get a division opcode,
it doesn't make much sense trying to split that into rcp/mul.
---
 src/gallium/auxiliary/gallivm/lp_bld_arit.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c 
b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 5553cb1..04f86be 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1372,7 +1372,9 @@ lp_build_div(struct lp_build_context *bld,
  return LLVMConstUDiv(a, b);
}

-   if(((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
+   /* fast rcp is disabled (just uses div), so makes no sense to try that */
+   if(FALSE &&
+  ((util_cpu_caps.has_sse && type.width == 32 && type.length == 4) ||
(util_cpu_caps.has_avx && type.width == 32 && type.length == 8)) &&
   type.floating)
   return lp_build_mul(bld, a, lp_build_rcp(bld, b));



Good catch.  Series looks good to me.

Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: add HUD queries for monitoring some hw blocks

2017-01-23 Thread Marek Olšák
On Fri, Jan 20, 2017 at 8:19 PM, Samuel Pitoiset
 wrote:
> It's also possible to monitor them via performance counters but
> the hardware can only use two counters simultaneously. It seems
> easier to re-use the existing code which reads from MMIO instead
> of writing a multi-pass approach.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeon/r600_gpu_load.c| 36 +
>  src/gallium/drivers/radeon/r600_pipe_common.h | 12 +
>  src/gallium/drivers/radeon/r600_query.c   | 39 
> ++-
>  src/gallium/drivers/radeon/r600_query.h   | 12 +
>  4 files changed, 98 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
> b/src/gallium/drivers/radeon/r600_gpu_load.c
> index 38ddda1652..8f5568c2dc 100644
> --- a/src/gallium/drivers/radeon/r600_gpu_load.c
> +++ b/src/gallium/drivers/radeon/r600_gpu_load.c
> @@ -43,7 +43,19 @@
>  #define SAMPLES_PER_SEC 1
>
>  #define GRBM_STATUS0x8010
> +#define TA_BUSY(x) (((x) >> 14) & 0x1)
> +#define GDS_BUSY(x)(((x) >> 15) & 0x1)
> +#define VGT_BUSY(x)(((x) >> 17) & 0x1)
> +#define IA_BUSY(x) (((x) >> 19) & 0x1)
> +#define SX_BUSY(x) (((x) >> 20) & 0x1)
> +#define WD_BUSY(x) (((x) >> 21) & 0x1)
>  #define SPI_BUSY(x)(((x) >> 22) & 0x1)
> +#define BCI_BUSY(x)(((x) >> 23) & 0x1)
> +#define SC_BUSY(x) (((x) >> 24) & 0x1)
> +#define PA_BUSY(x) (((x) >> 25) & 0x1)
> +#define DB_BUSY(x) (((x) >> 26) & 0x1)
> +#define CP_BUSY(x) (((x) >> 29) & 0x1)
> +#define CB_BUSY(x) (((x) >> 30) & 0x1)
>  #define GUI_ACTIVE(x)  (((x) >> 31) & 0x1)
>
>  #define UPDATE_COUNTER(field, mask)\
> @@ -59,7 +71,19 @@ static void r600_update_grbm_counters(struct 
> r600_common_screen *rscreen,
>
> rscreen->ws->read_registers(rscreen->ws, GRBM_STATUS, 1, &value);
>
> +   UPDATE_COUNTER(ta, TA_BUSY);
> +   UPDATE_COUNTER(gds, GDS_BUSY);
> +   UPDATE_COUNTER(vgt, VGT_BUSY);
> +   UPDATE_COUNTER(ia, IA_BUSY);
> +   UPDATE_COUNTER(sx, SX_BUSY);
> +   UPDATE_COUNTER(wd, WD_BUSY);
> UPDATE_COUNTER(spi, SPI_BUSY);
> +   UPDATE_COUNTER(bci, BCI_BUSY);
> +   UPDATE_COUNTER(sc, SC_BUSY);
> +   UPDATE_COUNTER(pa, PA_BUSY);
> +   UPDATE_COUNTER(db, DB_BUSY);
> +   UPDATE_COUNTER(cp, CP_BUSY);
> +   UPDATE_COUNTER(cb, CB_BUSY);
> UPDATE_COUNTER(gui, GUI_ACTIVE);
>  }
>
> @@ -156,6 +180,18 @@ static unsigned busy_index_from_type(struct 
> r600_common_screen *rscreen,
> switch (type) {
> case R600_QUERY_GPU_LOAD: return BUSY_INDEX(rscreen, gui);
> case R600_QUERY_GPU_SHADERS_BUSY: return BUSY_INDEX(rscreen, spi);
> +   case R600_QUERY_GPU_TA_BUSY: return BUSY_INDEX(rscreen, ta);
> +   case R600_QUERY_GPU_GDS_BUSY: return BUSY_INDEX(rscreen, gds);
> +   case R600_QUERY_GPU_VGT_BUSY: return BUSY_INDEX(rscreen, vgt);
> +   case R600_QUERY_GPU_IA_BUSY: return BUSY_INDEX(rscreen, ia);
> +   case R600_QUERY_GPU_SX_BUSY: return BUSY_INDEX(rscreen, sx);
> +   case R600_QUERY_GPU_WD_BUSY: return BUSY_INDEX(rscreen, wd);
> +   case R600_QUERY_GPU_BCI_BUSY: return BUSY_INDEX(rscreen, bci);
> +   case R600_QUERY_GPU_SC_BUSY: return BUSY_INDEX(rscreen, sc);
> +   case R600_QUERY_GPU_PA_BUSY: return BUSY_INDEX(rscreen, pa);
> +   case R600_QUERY_GPU_DB_BUSY: return BUSY_INDEX(rscreen, db);
> +   case R600_QUERY_GPU_CP_BUSY: return BUSY_INDEX(rscreen, cp);
> +   case R600_QUERY_GPU_CB_BUSY: return BUSY_INDEX(rscreen, cb);
> default: unreachable("query type does not correspond to grbm id");

Same as patch 1 - new line after ':'. Other than that:

Reviewed-by: Marek Olšák 

Marek
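
Editorial sketch of one sampling step over the status word. The bit positions
are copied from the *_BUSY macros in the quoted patch; everything else
(toy_block, toy_counter, toy_sample) is hypothetical. Unlike the patch, this
sketch uses plain increments instead of atomic ones, so it only illustrates
the decoding and is not safe for a sampling thread running next to readers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_block {
   TOY_TA, TOY_GDS, TOY_VGT, TOY_IA, TOY_SX, TOY_WD, TOY_SPI,
   TOY_BCI, TOY_SC, TOY_PA, TOY_DB, TOY_CP, TOY_CB, TOY_GUI,
   TOY_BLOCK_COUNT
};

/* Bit positions copied from the *_BUSY macros in the quoted patch. */
static const unsigned toy_status_bit[TOY_BLOCK_COUNT] = {
   [TOY_TA] = 14,  [TOY_GDS] = 15, [TOY_VGT] = 17, [TOY_IA] = 19,
   [TOY_SX] = 20,  [TOY_WD] = 21,  [TOY_SPI] = 22, [TOY_BCI] = 23,
   [TOY_SC] = 24,  [TOY_PA] = 25,  [TOY_DB] = 26,  [TOY_CP] = 29,
   [TOY_CB] = 30,  [TOY_GUI] = 31,
};

struct toy_counter {
   unsigned busy, idle;
};

/* One sampling step: classify every block as busy or idle. */
static void toy_sample(uint32_t status,
                       struct toy_counter counters[TOY_BLOCK_COUNT])
{
   assert(counters != NULL);
   for (size_t i = 0; i < TOY_BLOCK_COUNT; i++) {
      if ((status >> toy_status_bit[i]) & 0x1)
         counters[i].busy++;
      else
         counters[i].idle++;
   }
}
```

A table like this keeps the bit position and the counter for each block in one
place, which is one possible alternative to repeating a macro per block.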


Re: [Mesa-dev] [PATCH v2] nir/spirv/glsl450: rewrite atan2 to deal with infinities

2017-01-23 Thread Francisco Jerez
"Juan A. Suarez Romero"  writes:

> On Sun, 2017-01-22 at 00:20 -0800, Francisco Jerez wrote:
>> "Juan A. Suarez Romero"  writes:
>> 
>> > Rewrite atan2(y,x) to cover (+/-)INF values.
>> > 
>> > This fixes several test cases in Vulkan CTS
>> > (dEQP-VK.glsl.builtin.precision.atan2.*)
>> > 
>> > v2: do not flush denorms to 0 (jasuarez)
>> > ---
>> >  src/compiler/spirv/vtn_glsl450.c | 48 
>> > +++-
>> >  1 file changed, 42 insertions(+), 6 deletions(-)
>> > 
>> > diff --git a/src/compiler/spirv/vtn_glsl450.c 
>> > b/src/compiler/spirv/vtn_glsl450.c
>> > index 0d32fddbef..d52a22c0c3 100644
>> > --- a/src/compiler/spirv/vtn_glsl450.c
>> > +++ b/src/compiler/spirv/vtn_glsl450.c
>> > @@ -299,18 +299,47 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
>> > return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
>> >  }
>> >  
>> > +/*
>> > + * Computes atan2(y,x)
>> > + */
>> >  static nir_ssa_def *
>> >  build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
>> >  {
>> > nir_ssa_def *zero = nir_imm_float(b, 0.0f);
>> > -
>> > -   /* If |x| >= 1.0e-8 * |y|: */
>> > -   nir_ssa_def *condition =
>> > -  nir_fge(b, nir_fabs(b, x),
>> > -  nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
>> > +   nir_ssa_def *inf = nir_imm_float(b, INFINITY);
>> > +   nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
>> > +   nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
>> > +   nir_imm_float(b, M_PI_4f));
>> > +
>> > +   /* if y == +-INF */
>> > +   nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
>> > +
>> > +   /* if x == +-INF */
>> > +   nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
>> > +
>> > +   /* Case: y is +-INF */
>> > +   nir_ssa_def *y_is_inf_then =
>> > +  nir_fmul(b, nir_fsign(b, y),
>> > +  nir_bcsel(b, nir_feq(b, x, inf),
>> > +   nir_imm_float(b, M_PI_4f),
>> > +   nir_bcsel(b, nir_feq(b, x, minus_inf),
>> > +m_3_pi_4,
>> >nir_imm_float(b, M_PI_2f))));
>> > +
>> > +   /* Case: x is +-INF */
>> > +   nir_ssa_def *x_is_inf_then =
>> > +  nir_fmul(b, nir_fsign(b, y),
>> > +  nir_bcsel(b, nir_feq(b, x, inf),
>> > +   zero,
>> > +   nir_imm_float(b, M_PIf)));
>> > +
>> 
>> I don't think we need all these special cases.  The majority of the
>> infinity/zero handling rules required by IEEE are fairly natural and
>> would be taken care of without any additional effort by the
>> floating-point division operation and single-argument atan function
>> below if they propagated infinities and zeroes according to IEEE rules.
>> 
>> I had a look at the test results myself and noticed that the failures
>> are for the most part due to a precision problem in the current
>> implementation that doesn't only affect infinity -- Relative precision
>> also explodes as x grows above certain point, infinities just make the
>> problem catastrophic and cause it to return NaN instead of the
>> expected finite value.  The reason for the precision problem is that
>> fdiv is later on lowered into an fmul+frcp sequence, and the latter may
>> flush the result to zero if the denominator was so huge that its
>> reciprocal would be denormalized.  If the numerator happened to be
>> infinite you may end up with ∞/huge = NaN for the same reason.
>> 
>
> Right. For this case I'd submitted a patch to the test itself, that
> roughly speaking assumes any result as possible if denominator is big
> enough.
>
> https://gerrit.khronos.org/#/c/524/
>
>
> I understand with your alternative proposal you would also handle this
> case correctly, making the CTS change not required, right?
>
Yes, I think the CTS had found a legitimate bug in our atan2
implementation, patching it only conceals the problem -- Granted that
trigonometric functions have unspecified precision according to the GLSL
spec [so you could argue that the majority of these tests shouldn't even
exist in the first place ;)], but the result was over 8 million ULP off
for a range of inputs which seems a bit over the top.  With my atan2
implementation the related CTS tests pass without any changes.

>
>> On top of that there seem to be other issues with the current atan2
>> implementation:
>> 
>>  - It doesn't handle zeros correctly.  This may be related to your
>>observation that denorm arguments cause it to give bogus results, but
>>the problem doesn't seem to be related to denorms in particular, but
>>to the fact that denorms can get flushed to -0 which is in turn
>>handled incorrectly.  The reason is that the existing code uses 'y >=
>>0' to determine on which side of the branch cut we are, but that
>>causes the discontinuity to end up along the y=-epsilon line instead
>>of along the y=0 line a

[Mesa-dev] [PATCH] glsl: fix compile errors with mingw due to missing PRIx64 definitions

2017-01-23 Thread sroland
From: Roland Scheidegger 

define __STDC_FORMAT_MACROS and include <inttypes.h> (same as
ir_builder_print_visitor.cpp already does).

Otherwise, some mingw builds error out (since
8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and
bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with:
src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before 
‘PRIu64’
   case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break;

(Note even with that fix I get other format specifier warnings:
src/compiler/glsl/ir_print_visitor.cpp:473:47:
warning: unknown conversion type character ‘a’ in format [-Wformat=]
fprintf(f, "%a", ir->value.f[i]);
   ^
src/compiler/glsl/ir_print_visitor.cpp:473:47:
warning: too many arguments for format [-Wformat-extra-args]
but it still compiles at least)
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
 src/compiler/glsl/ir_print_visitor.cpp   | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index e888090..3d2fc14 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -20,6 +20,8 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  * DEALINGS IN THE SOFTWARE.
  */
+#define __STDC_FORMAT_MACROS 1
+#include <inttypes.h> /* for PRIx64 macro */
 #include 
 #include 
 #include 
diff --git a/src/compiler/glsl/ir_print_visitor.cpp 
b/src/compiler/glsl/ir_print_visitor.cpp
index 0763277..debbdad 100644
--- a/src/compiler/glsl/ir_print_visitor.cpp
+++ b/src/compiler/glsl/ir_print_visitor.cpp
@@ -21,6 +21,8 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#define __STDC_FORMAT_MACROS 1
+#include <inttypes.h> /* for PRIx64 macro */
 #include "ir_print_visitor.h"
 #include "compiler/glsl_types.h"
 #include "glsl_parser_extras.h"
-- 
2.7.4
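
Editorial sketch of the format macros the patch relies on, written in C rather
than C++. The helper names are hypothetical. Defining __STDC_FORMAT_MACROS
before including inttypes.h is what some C++ toolchains ask for; in C it is
assumed to be harmless, since the macros are provided unconditionally.

```c
#define __STDC_FORMAT_MACROS 1 /* asked for by some C++ toolchains */
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format a 64-bit unsigned value portably; returns the formatted length. */
static int toy_format_u64(char *buf, size_t size, uint64_t value)
{
   int n = snprintf(buf, size, "%" PRIu64, value);
   assert(n > 0 && (size_t)n < size); /* output must not be truncated */
   return n;
}

/* Same idea for hexadecimal output with PRIx64. */
static int toy_format_x64(char *buf, size_t size, uint64_t value)
{
   int n = snprintf(buf, size, "0x%" PRIx64, value);
   assert(n > 0 && (size_t)n < size);
   return n;
}

/* Convenience check: does value print as the expected decimal string? */
static int toy_u64_prints_as(uint64_t value, const char *expected)
{
   char buf[32];

   toy_format_u64(buf, sizeof(buf), value);
   return strcmp(buf, expected) == 0;
}
```

The macros expand to string literals that are concatenated with the
surrounding format string, which is why a missing definition shows up as the
"expected ')' before 'PRIu64'" error quoted in the commit message.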



[Mesa-dev] [PATCH] mesa: Don't advertise GL_OES_read_format in core profile

2017-01-23 Thread Ian Romanick
From: Ian Romanick 

OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.

The functionality is also included in GL_ARB_ES2_compatibility.  Every
OpenGL core-profile driver currently exposes both extensions.  I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems.  No functionality
is removed.

I have left this extension in place for compatibility profile.  There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.

Three other alternatives considered:

 - Remove the string from compatibility profile drivers but leave the
   functionality in place.

 - Add a flag to expose the extension string, and set it in every OpenGL
   driver that does not expose GL_ARB_ES2_compatibility (and those
   drivers only).  I tried this.  You can't have two instances of an
   extension in the extension table (one dummy_true for ES1 and one with
   a flag for compatibility profile), so the implementation requires a
   bit of effort.

 - Only expose the extension in compatibility if the version is less
   than 2.0.  I didn't see an easy way to do this.

Signed-off-by: Ian Romanick 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/extensions_table.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 91918c2..b12a9af 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -364,7 +364,7 @@ EXT(OES_point_size_array, dummy_true
 EXT(OES_point_sprite            , ARB_point_sprite           ,  x ,  x , ES1,  x , 2004)
 EXT(OES_primitive_bounding_box  , OES_primitive_bounding_box ,  x ,  x ,  x ,  31, 2014)
 EXT(OES_query_matrix            , dummy_true                 ,  x ,  x , ES1,  x , 2003)
-EXT(OES_read_format             , dummy_true                 , GLL, GLC, ES1,  x , 2003)
+EXT(OES_read_format             , dummy_true                 , GLL,  x , ES1,  x , 2003)
 EXT(OES_rgb8_rgba8              , dummy_true                 ,  x ,  x , ES1, ES2, 2005)
 EXT(OES_sample_shading          , OES_sample_variables       ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_sample_variables        , OES_sample_variables       ,  x ,  x ,  x ,  30, 2014)
-- 
2.7.4



Re: [Mesa-dev] [PATCH 4/6] configure.ac: Set and use HAVE_GALLIUM_LLVM define

2017-01-23 Thread Tobias Droste
Am Montag, 23. Januar 2017, 11:53:18 CET schrieb Jose Fonseca:
> On 20/01/17 02:48, Emil Velikov wrote:
> > On 19 January 2017 at 19:26, Tobias Droste  wrote:
> >> Am Mittwoch, 18. Januar 2017, 18:45:04 CET schrieb Emil Velikov:
> >>> On 18 January 2017 at 18:12, Jose Fonseca  wrote:
> >> In order to untangle things we want to have a distinction between the
> >> gallium (gallivm afaict) and other users - RADV presently.
> >> So how about we update the RADV instances and ensure that we set
> >> the HAVE_{RADV,}_LLVM lot appropriately. Latter will be picky but
> >> overall things should work w/o annoyances that HAVE_GALLIUM_LLVM
> >> brings ?
>  
>  I honestly don't even understand why we'd want to build parts of the
>  tree
>  with LLVM while hiding LLVM from other components.  Why can't we just
>  build
>  everything with LLVM and avoid this combinatorial explosion of weird
>  options that are nothing more than yet another way the build can
>  break!!?
> >>> 
> >>> Sadly the combinatoric explosion has been there for a while. Based on
> >>> how well my previous attempts to resolve similar issues (see the
> >>> "platforms" topic) I doubt we'll even get to fix that.
> >>> 
>  But if a separate option is truly necessary, have the newcomer pick a
>  different name, or something.
> >>> 
> >>> That's pretty much what I suggested above. Tobias can you please give it
> >>> a
> >>> try ?
> >> 
> >> I would rather "fix" the other build systems. (As in just define
> >> HAVE_GALLIUM_LLVM if HAVE_LLVM is defined).
> >> 
> >> I think there is still a misunderstanding on Joses side on what this
> >> really
> >> means. No file in gallivm or llvmpipe will be touched. It's really just
> >> auxiliary/draw and there it's exactly 8 lines that will change.
> >> 
> >> That's it.
> >> 
> >> I really fail to see how this will break everything that is being worked
> >> on
> >> and cause merge conflicts everywhere.
> >> 
> >> If you still want the other way, I can do that too, but this will of
> >> course
> >> need the same fix in the other build system or we have the same situation
> >> we have now, but with other drivers.
> > 
> > Afaict one point is that the use of HAVE_GALLIUM_LLVM vs HAVE_LLVM is
> > too subtle. Let's not forget that barring the WIP(?) branches, VMWare
> > has closed source components. Guess how much fun it will be as
> > suddenly things fail to build/work properly as they re-sync the code
> > base. No idea how likely the latter is, but considering Jose (and a
> > few other VMWare guys) wrote a sizeable chunk of that code (and Mesa as a
> > whole) I'd go with his instinct.
> > 
> > Emil
> 
> The HAVE_LLVM->HAVE_GALLIUM_LLVM rename is indeed not as invasive as I
> thought.
> 
> But I still don't understand why HAVE_LLVM->HAVE_GALLIUM_LLVM is
> necessary in draw and not on gallivm/llvmpipe.
> 
> People want to build draw with LLVM support, but without
> gallivm/llvmpipe? That's impossible.
> 
> Or is this because the draw files are the only .c files that are
> compiled even when HAVE_LLVM is undefined, so these are the only ones
> that get to receive the renaming treatment?  That's crazy confusing.
> There's no way I can accept that.
> 

The draw files are used by softpipe (and maybe other gallium drivers, I haven't
checked that), and there HAVE_LLVM should not be defined. If it's not,
everything is fine. But with new non-gallium drivers using LLVM and causing
HAVE_LLVM to be defined, softpipe is broken in some cases. See below.
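
To make the failure mode concrete, here is a minimal sketch. It is not the
actual draw code: the enum and both function names are made up for
illustration, and the defines are hard-coded to mimic a build where some
non-gallium driver enabled LLVM while gallium LLVM support was disabled.

```c
/* Assumed configuration for this sketch: a non-gallium driver (e.g. radv)
 * pulled in LLVM, so HAVE_LLVM is defined, while gallium LLVM support was
 * switched off, so HAVE_GALLIUM_LLVM is not. The value is an arbitrary
 * placeholder standing in for a version number. */
#define HAVE_LLVM 0x0309
/* #define HAVE_GALLIUM_LLVM 1 */

/* Hypothetical result type, not part of Mesa. */
enum sketch_draw_path {
   SKETCH_PATH_SOFTWARE,
   SKETCH_PATH_LLVM
};

/* Hypothetical helper: the path draw would take if guarded by HAVE_LLVM. */
static enum sketch_draw_path
sketch_draw_path_have_llvm(void)
{
#ifdef HAVE_LLVM
   return SKETCH_PATH_LLVM;      /* taken even though gallium LLVM is off */
#else
   return SKETCH_PATH_SOFTWARE;
#endif
}

/* Hypothetical helper: the same decision guarded by HAVE_GALLIUM_LLVM. */
static enum sketch_draw_path
sketch_draw_path_have_gallium_llvm(void)
{
#ifdef HAVE_GALLIUM_LLVM
   return SKETCH_PATH_LLVM;
#else
   return SKETCH_PATH_SOFTWARE;  /* what a softpipe-only build expects */
#endif
}
```

With only HAVE_LLVM defined, the first helper picks the LLVM path and the
second stays on the software path, which is the distinction a separate
define is meant to give draw.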

> 
> Let me make this crystal clear to avoid making this discussion even more
> protracted: I will not accept any HAVE_LLVM change in
> draw/gallivm/llvmpipe .C/.H source code.  Period.
> 

I'm _not_ changing gallivm and llvmpipe. Draw is not only used by llvmpipe and 
I still think I have very good arguments for the change. See again below.

I understand, I'm the unknown new guy and you did a lot of work in this code. 
But I'm not getting paid for this and I don't have to do this. I want to help, 
but I also want to understand why I can't do something. With reasons other 
than "I say so, and I don't want to hear any reasons against it". I hope you 
understand that. 

> 
> HAVE_LLVM used to mean, "whole Mesa being built with LLVM".  Now people
> want to build something (no idea what yet to be honest) with LLVM, but
> not build draw/gallivm/llvmpipe.
> 
> If you want to build some other component with LLVM but not
> draw/gallivm/llvmpipe, then add a new HAVE_LLVM_FOR_FOOBAR define and
> use it where you need it.

The real problem is softpipe. Softpipe uses draw but (obviously) can't use 
LLVM.

Right now one could build radv (uses LLVM) and pass --disable-gallium-llvm to 
the build system to get softpipe built.

But due to radv, "HAVE_LLVM" (which is used as a version check everywhere
else!) is defined and the draw code gets built using gallivm (since HAVE_LLVM
is defined). So the resulting softpipe will actually use LLVM eve

[Mesa-dev] [PATCH] i965/blorp: Use the correct ISL format for combined depth/stencil

2017-01-23 Thread Jason Ekstrand
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format.  In this case, we handle
stencil properly but we give blorp the wrong ISL format.  Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong
size and was causing GPU hangs.

Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage

Cc: "13.0 17.0" 
Cc: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index d79f529..3a7cf84 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -284,8 +284,10 @@ brw_blorp_to_isl_format(struct brw_context *brw, mesa_format format,
    case MESA_FORMAT_S_UINT8:
       return ISL_FORMAT_R8_UINT;
    case MESA_FORMAT_Z24_UNORM_X8_UINT:
+   case MESA_FORMAT_Z24_UNORM_S8_UINT:
       return ISL_FORMAT_R24_UNORM_X8_TYPELESS;
    case MESA_FORMAT_Z_FLOAT32:
+   case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
       return ISL_FORMAT_R32_FLOAT;
    case MESA_FORMAT_Z_UNORM16:
       return ISL_FORMAT_R16_UNORM;
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH] i965/blorp: Use the correct ISL format for combined depth/stencil

2017-01-23 Thread Ilia Mirkin
On Mon, Jan 23, 2017 at 2:42 PM, Jason Ekstrand  wrote:
> In brw_blorp_copyteximage, we use the format from the render buffer.
> This could be a combined depth/stencil format.  In this case, we handle
> stencil properly but we give blorp the wrong ISL format.  Specifically,
> we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong
> size and was causing GPU hangs.
>
> Fixes: 
> GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
>
> Cc: "13.0 17.0" 
> Cc: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index d79f529..3a7cf84 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -284,8 +284,10 @@ brw_blorp_to_isl_format(struct brw_context *brw, 
> mesa_format format,
> case MESA_FORMAT_S_UINT8:
>return ISL_FORMAT_R8_UINT;
> case MESA_FORMAT_Z24_UNORM_X8_UINT:
> +   case MESA_FORMAT_Z24_UNORM_S8_UINT:
>return ISL_FORMAT_R24_UNORM_X8_TYPELESS;
> case MESA_FORMAT_Z_FLOAT32:
> +   case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:

Are you sure you don't want ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS for
this one? (I don't have the larger context here, so just asking...)

>return ISL_FORMAT_R32_FLOAT;
> case MESA_FORMAT_Z_UNORM16:
>return ISL_FORMAT_R16_UNORM;
> --
> 2.5.0.400.gff86faf
>


Re: [Mesa-dev] [PATCH] i965/blorp: Use the correct ISL format for combined depth/stencil

2017-01-23 Thread Jason Ekstrand
On Mon, Jan 23, 2017 at 11:48 AM, Ilia Mirkin  wrote:

> On Mon, Jan 23, 2017 at 2:42 PM, Jason Ekstrand 
> wrote:
> > In brw_blorp_copyteximage, we use the format from the render buffer.
> > This could be a combined depth/stencil format.  In this case, we handle
> > stencil properly but we give blorp the wrong ISL format.  Specifically,
> > we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong
> > size and was causing GPU hangs.
> >
> > Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
> >
> > Cc: "13.0 17.0" 
> > Cc: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> > index d79f529..3a7cf84 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > @@ -284,8 +284,10 @@ brw_blorp_to_isl_format(struct brw_context *brw,
> mesa_format format,
> > case MESA_FORMAT_S_UINT8:
> >return ISL_FORMAT_R8_UINT;
> > case MESA_FORMAT_Z24_UNORM_X8_UINT:
> > +   case MESA_FORMAT_Z24_UNORM_S8_UINT:
> >return ISL_FORMAT_R24_UNORM_X8_TYPELESS;
> > case MESA_FORMAT_Z_FLOAT32:
> > +   case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
>
> Are you sure you don't want ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS for
> this one? (I don't have the larger context here, so just asking...)
>

Yes, I'm sure.  In theory, we could add that format to the list but I don't
think we ever see it.  The only reason why we see the two I'm adding is
because brw_blorp_copytexsubimage pulls the format from the renderbuffer
rather than the miptree so we see the combined format and want to pull the
corresponding depth format.
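
Roughly, the idea looks like the sketch below. It is simplified and uses
made-up enum names rather than the real mesa_format/isl_format values; the
function is a hypothetical stand-in for brw_blorp_to_isl_format, and the
fallback value is only a placeholder for whatever the real code does with
other formats.

```c
/* Hypothetical, trimmed-down stand-ins for mesa_format and isl_format;
 * the real enums are much larger. */
enum sketch_mesa_format {
   SKETCH_MESA_S_UINT8,
   SKETCH_MESA_Z24_UNORM_X8_UINT,
   SKETCH_MESA_Z24_UNORM_S8_UINT,
   SKETCH_MESA_Z_FLOAT32,
   SKETCH_MESA_Z32_FLOAT_S8X24_UINT,
   SKETCH_MESA_Z_UNORM16,
   SKETCH_MESA_OTHER
};

enum sketch_isl_format {
   SKETCH_ISL_R8_UINT,
   SKETCH_ISL_R24_UNORM_X8_TYPELESS,
   SKETCH_ISL_R32_FLOAT,
   SKETCH_ISL_R16_UNORM,
   SKETCH_ISL_UNSUPPORTED   /* placeholder fallback for this sketch only */
};

/* Map a renderbuffer format to the format of its depth part. A combined
 * depth/stencil format shares the case of the matching depth-only format. */
static enum sketch_isl_format
sketch_to_isl_format(enum sketch_mesa_format format)
{
   switch (format) {
   case SKETCH_MESA_S_UINT8:
      return SKETCH_ISL_R8_UINT;
   case SKETCH_MESA_Z24_UNORM_X8_UINT:
   case SKETCH_MESA_Z24_UNORM_S8_UINT:      /* combined -> 24-bit depth */
      return SKETCH_ISL_R24_UNORM_X8_TYPELESS;
   case SKETCH_MESA_Z_FLOAT32:
   case SKETCH_MESA_Z32_FLOAT_S8X24_UINT:   /* combined -> 32-bit float depth */
      return SKETCH_ISL_R32_FLOAT;
   case SKETCH_MESA_Z_UNORM16:
      return SKETCH_ISL_R16_UNORM;
   default:
      return SKETCH_ISL_UNSUPPORTED;
   }
}
```

The combined formats fall through to the same return value as their
depth-only counterparts, since stencil is handled separately.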


> >return ISL_FORMAT_R32_FLOAT;
> > case MESA_FORMAT_Z_UNORM16:
> >return ISL_FORMAT_R16_UNORM;
> > --
> > 2.5.0.400.gff86faf
> >


Re: [Mesa-dev] [PATCH] i965/blorp: Use the correct ISL format for combined depth/stencil

2017-01-23 Thread Ilia Mirkin
On Mon, Jan 23, 2017 at 2:58 PM, Jason Ekstrand  wrote:
> On Mon, Jan 23, 2017 at 11:48 AM, Ilia Mirkin  wrote:
>>
>> On Mon, Jan 23, 2017 at 2:42 PM, Jason Ekstrand 
>> wrote:
>> > In brw_blorp_copyteximage, we use the format from the render buffer.
>> > This could be a combined depth/stencil format.  In this case, we handle
>> > stencil properly but we give blorp the wrong ISL format.  Specifically,
>> > we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT, which is the wrong
>> > size and was causing GPU hangs.
>> >
>> > Fixes:
>> > GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage
>> >
>> > Cc: "13.0 17.0" 
>> > Cc: Kenneth Graunke 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_blorp.c | 2 ++
>> >  1 file changed, 2 insertions(+)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
>> > b/src/mesa/drivers/dri/i965/brw_blorp.c
>> > index d79f529..3a7cf84 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
>> > @@ -284,8 +284,10 @@ brw_blorp_to_isl_format(struct brw_context *brw,
>> > mesa_format format,
>> > case MESA_FORMAT_S_UINT8:
>> >return ISL_FORMAT_R8_UINT;
>> > case MESA_FORMAT_Z24_UNORM_X8_UINT:
>> > +   case MESA_FORMAT_Z24_UNORM_S8_UINT:
>> >return ISL_FORMAT_R24_UNORM_X8_TYPELESS;
>> > case MESA_FORMAT_Z_FLOAT32:
>> > +   case MESA_FORMAT_Z32_FLOAT_S8X24_UINT:
>>
>> Are you sure you don't want ISL_FORMAT_R32_FLOAT_X8X24_TYPELESS for
>> this one? (I don't have the larger context here, so just asking...)
>
>
> Yes, I'm sure.  In theory, we could add that format to the list but I don't
> think we ever see it.  The only reason why we see the two I'm adding is
> because brw_blorp_copytexsubimage pulls the format from the renderbuffer
> rather than the miptree so we see the combined format and want to pull the
> corresponding depth format.

Oh, and your depth/stencil buffers aren't packed (at least in this
case), so the underlying DB format is R32F. Forgot about that. Thanks
for explaining!

  -ilia


[Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers

2017-01-23 Thread Samuel Pitoiset
Useful when debugging applications which map too much VRAM.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/radeon/r600_query.c   | 4 
 src/gallium/drivers/radeon/r600_query.h   | 1 +
 src/gallium/drivers/radeon/radeon_winsys.h| 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 3 +++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 2 ++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 1 +
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 3 +++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 2 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 1 +
 9 files changed, 18 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_query.c b/src/gallium/drivers/radeon/r600_query.c
index 25e7f5bb23..96157cd40e 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -65,6 +65,7 @@ static enum radeon_value_id winsys_id_from_type(unsigned type)
    case R600_QUERY_MAPPED_VRAM: return RADEON_MAPPED_VRAM;
    case R600_QUERY_MAPPED_GTT: return RADEON_MAPPED_GTT;
    case R600_QUERY_BUFFER_WAIT_TIME: return RADEON_BUFFER_WAIT_TIME_NS;
+   case R600_QUERY_NUM_MAPPED_BUFFERS: return RADEON_NUM_MAPPED_BUFFERS;
    case R600_QUERY_NUM_GFX_IBS: return RADEON_NUM_GFX_IBS;
    case R600_QUERY_NUM_SDMA_IBS: return RADEON_NUM_SDMA_IBS;
    case R600_QUERY_NUM_BYTES_MOVED: return RADEON_NUM_BYTES_MOVED;
@@ -133,6 +134,7 @@ static bool r600_query_sw_begin(struct r600_common_context *rctx,
    case R600_QUERY_CURRENT_GPU_SCLK:
    case R600_QUERY_CURRENT_GPU_MCLK:
    case R600_QUERY_BACK_BUFFER_PS_DRAW_RATIO:
+   case R600_QUERY_NUM_MAPPED_BUFFERS:
       query->begin_result = 0;
       break;
    case R600_QUERY_BUFFER_WAIT_TIME:
@@ -241,6 +243,7 @@ static bool r600_query_sw_end(struct r600_common_context *rctx,
    case R600_QUERY_CURRENT_GPU_SCLK:
    case R600_QUERY_CURRENT_GPU_MCLK:
    case R600_QUERY_BUFFER_WAIT_TIME:
+   case R600_QUERY_NUM_MAPPED_BUFFERS:
    case R600_QUERY_NUM_GFX_IBS:
    case R600_QUERY_NUM_SDMA_IBS:
    case R600_QUERY_NUM_BYTES_MOVED:
@@ -1722,6 +1725,7 @@ static struct pipe_driver_query_info r600_driver_query_list[] = {
    X("mapped-VRAM",        MAPPED_VRAM,        BYTES, AVERAGE),
    X("mapped-GTT",         MAPPED_GTT,         BYTES, AVERAGE),
    X("buffer-wait-time",   BUFFER_WAIT_TIME,   MICROSECONDS, CUMULATIVE),
+   X("num-mapped-buffers", NUM_MAPPED_BUFFERS, UINT64, AVERAGE),
    X("num-GFX-IBs",        NUM_GFX_IBS,        UINT64, AVERAGE),
    X("num-SDMA-IBs",       NUM_SDMA_IBS,       UINT64, AVERAGE),
    X("num-bytes-moved",    NUM_BYTES_MOVED,    BYTES, CUMULATIVE),
diff --git a/src/gallium/drivers/radeon/r600_query.h b/src/gallium/drivers/radeon/r600_query.h
index 1e4554d009..20856a5b2e 100644
--- a/src/gallium/drivers/radeon/r600_query.h
+++ b/src/gallium/drivers/radeon/r600_query.h
@@ -60,6 +60,7 @@ enum {
    R600_QUERY_MAPPED_VRAM,
    R600_QUERY_MAPPED_GTT,
    R600_QUERY_BUFFER_WAIT_TIME,
+   R600_QUERY_NUM_MAPPED_BUFFERS,
    R600_QUERY_NUM_GFX_IBS,
    R600_QUERY_NUM_SDMA_IBS,
    R600_QUERY_NUM_BYTES_MOVED,
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h
index e6fb2d560d..476f0647dd 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -81,6 +81,7 @@ enum radeon_value_id {
 RADEON_MAPPED_VRAM,
 RADEON_MAPPED_GTT,
 RADEON_BUFFER_WAIT_TIME_NS,
+RADEON_NUM_MAPPED_BUFFERS,
 RADEON_TIMESTAMP,
 RADEON_NUM_GFX_IBS,
 RADEON_NUM_SDMA_IBS,
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index e8d2c006f3..5ee27b8ede 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -181,6 +181,7 @@ void amdgpu_bo_destroy(struct pb_buffer *_buf)
  bo->ws->mapped_vram -= bo->base.size;
   else if (bo->initial_domain & RADEON_DOMAIN_GTT)
  bo->ws->mapped_gtt -= bo->base.size;
+  bo->ws->num_mapped_buffers--;
}
 
FREE(bo);
@@ -308,6 +309,7 @@ static void *amdgpu_bo_map(struct pb_buffer *buf,
  real->ws->mapped_vram += real->base.size;
   else if (real->initial_domain & RADEON_DOMAIN_GTT)
  real->ws->mapped_gtt += real->base.size;
+  real->ws->num_mapped_buffers++;
}
return (uint8_t*)cpu + offset;
 }
@@ -327,6 +329,7 @@ static void amdgpu_bo_unmap(struct pb_buffer *buf)
  real->ws->mapped_vram -= real->base.size;
   else if (real->initial_domain & RADEON_DOMAIN_GTT)
  real->ws->mapped_gtt -= real->base.size;
+  real->ws->num_mapped_buffers--;
}
 
amdgpu_bo_cpu_unmap(real->bo);
diff --git a/src/gallium/winsys/amdgpu/drm/am

[Mesa-dev] [PATCH] radv: don't resubmit the same cs over and over while tracing

2017-01-23 Thread Grazvydas Ignotas
Fixes: 97dfff54 ("radv: Dump command buffer on hang.")
Signed-off-by: Grazvydas Ignotas 
---
no commit access

 src/amd/vulkan/radv_device.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index d27a66c..337c733 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1004,8 +1004,7 @@ VkResult radv_QueueSubmit(
          if (queue->device->trace_bo)
             *queue->device->trace_id_ptr = 0;
 
-         ret = queue->device->ws->cs_submit(ctx, queue->queue_idx, cs_array,
-                                            pSubmits[i].commandBufferCount,
+         ret = queue->device->ws->cs_submit(ctx, queue->queue_idx, cs_array + j, advance,
                                             (struct radeon_winsys_sem **)pSubmits[i].pWaitSemaphores,
                                             b ? pSubmits[i].waitSemaphoreCount : 0,
                                             (struct radeon_winsys_sem **)pSubmits[i].pSignalSemaphores,
-- 
2.7.4



Re: [Mesa-dev] [PATCH] radv: don't resubmit the same cs over and over while tracing

2017-01-23 Thread Bas Nieuwenhuizen
Pushed, thanks.

On Mon, Jan 23, 2017 at 10:16 PM, Grazvydas Ignotas  wrote:
> Fixes: 97dfff54 ("radv: Dump command buffer on hang.")
> Signed-off-by: Grazvydas Ignotas 
> ---
> no commit access
>
>  src/amd/vulkan/radv_device.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index d27a66c..337c733 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -1004,8 +1004,7 @@ VkResult radv_QueueSubmit(
> if (queue->device->trace_bo)
> *queue->device->trace_id_ptr = 0;
>
> -   ret = queue->device->ws->cs_submit(ctx, 
> queue->queue_idx, cs_array,
> -   
> pSubmits[i].commandBufferCount,
> +   ret = queue->device->ws->cs_submit(ctx, 
> queue->queue_idx, cs_array + j, advance,
> (struct 
> radeon_winsys_sem **)pSubmits[i].pWaitSemaphores,
> b ? 
> pSubmits[i].waitSemaphoreCount : 0,
> (struct 
> radeon_winsys_sem **)pSubmits[i].pSignalSemaphores,
> --
> 2.7.4
>


[Mesa-dev] [PATCH] vulkan: bump vulkan.h to 1.0.39 version

2017-01-23 Thread Dave Airlie
From: Dave Airlie 

This introduces a bunch of new extension defines.

Signed-off-by: Dave Airlie 
---
 include/vulkan/vulkan.h | 367 +++-
 1 file changed, 365 insertions(+), 2 deletions(-)

diff --git a/include/vulkan/vulkan.h b/include/vulkan/vulkan.h
index f24a0a2..81dedf7 100644
--- a/include/vulkan/vulkan.h
+++ b/include/vulkan/vulkan.h
@@ -6,7 +6,7 @@ extern "C" {
 #endif
 
 /*
-** Copyright (c) 2015-2016 The Khronos Group Inc.
+** Copyright (c) 2015-2017 The Khronos Group Inc.
 **
 ** Licensed under the Apache License, Version 2.0 (the "License");
 ** you may not use this file except in compliance with the License.
@@ -43,7 +43,7 @@ extern "C" {
 #define VK_VERSION_MINOR(version) (((uint32_t)(version) >> 12) & 0x3ff)
 #define VK_VERSION_PATCH(version) ((uint32_t)(version) & 0xfff)
 // Version of this file
-#define VK_HEADER_VERSION 38
+#define VK_HEADER_VERSION 39
 
 
 #define VK_NULL_HANDLE 0
@@ -145,6 +145,7 @@ typedef enum VkResult {
 VK_ERROR_INCOMPATIBLE_DISPLAY_KHR = -1000003001,
 VK_ERROR_VALIDATION_FAILED_EXT = -1000011001,
 VK_ERROR_INVALID_SHADER_NV = -1000012000,
+VK_ERROR_OUT_OF_POOL_MEMORY_KHR = -1000069000,
 VK_RESULT_BEGIN_RANGE = VK_ERROR_FRAGMENTED_POOL,
 VK_RESULT_END_RANGE = VK_INCOMPLETE,
 VK_RESULT_RANGE_SIZE = (VK_INCOMPLETE - VK_ERROR_FRAGMENTED_POOL + 1),
@@ -225,13 +226,28 @@ typedef enum VkStructureType {
 VK_STRUCTURE_TYPE_IMPORT_MEMORY_WIN32_HANDLE_INFO_NV = 1000057000,
 VK_STRUCTURE_TYPE_EXPORT_MEMORY_WIN32_HANDLE_INFO_NV = 1000057001,
 VK_STRUCTURE_TYPE_WIN32_KEYED_MUTEX_ACQUIRE_RELEASE_INFO_NV = 1000058000,
+VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR = 1000059000,
+VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR = 1000059001,
+VK_STRUCTURE_TYPE_FORMAT_PROPERTIES_2_KHR = 1000059002,
+VK_STRUCTURE_TYPE_IMAGE_FORMAT_PROPERTIES_2_KHR = 1000059003,
+VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_IMAGE_FORMAT_INFO_2_KHR = 1000059004,
+VK_STRUCTURE_TYPE_QUEUE_FAMILY_PROPERTIES_2_KHR = 1000059005,
+VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2_KHR = 1000059006,
+VK_STRUCTURE_TYPE_SPARSE_IMAGE_FORMAT_PROPERTIES_2_KHR = 1000059007,
+VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SPARSE_IMAGE_FORMAT_INFO_2_KHR = 1000059008,
 VK_STRUCTURE_TYPE_VALIDATION_FLAGS_EXT = 1000061000,
+VK_STRUCTURE_TYPE_VI_SURFACE_CREATE_INFO_NN = 1000062000,
 VK_STRUCTURE_TYPE_OBJECT_TABLE_CREATE_INFO_NVX = 1000086000,
 VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NVX = 1000086001,
 VK_STRUCTURE_TYPE_CMD_PROCESS_COMMANDS_INFO_NVX = 1000086002,
 VK_STRUCTURE_TYPE_CMD_RESERVE_SPACE_FOR_COMMANDS_INFO_NVX = 1000086003,
 VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_LIMITS_NVX = 1000086004,
 VK_STRUCTURE_TYPE_DEVICE_GENERATED_COMMANDS_FEATURES_NVX = 1000086005,
+VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES2_EXT = 1000090000,
+VK_STRUCTURE_TYPE_DISPLAY_POWER_INFO_EXT = 1000091000,
+VK_STRUCTURE_TYPE_DEVICE_EVENT_INFO_EXT = 1000091001,
+VK_STRUCTURE_TYPE_DISPLAY_EVENT_INFO_EXT = 1000091002,
+VK_STRUCTURE_TYPE_SWAPCHAIN_COUNTER_CREATE_INFO_EXT = 1000091003,
 VK_STRUCTURE_TYPE_BEGIN_RANGE = VK_STRUCTURE_TYPE_APPLICATION_INFO,
 VK_STRUCTURE_TYPE_END_RANGE = VK_STRUCTURE_TYPE_LOADER_DEVICE_CREATE_INFO,
 VK_STRUCTURE_TYPE_RANGE_SIZE = (VK_STRUCTURE_TYPE_LOADER_DEVICE_CREATE_INFO - VK_STRUCTURE_TYPE_APPLICATION_INFO + 1),
@@ -840,6 +856,8 @@ typedef enum VkFormatFeatureFlagBits {
 VK_FORMAT_FEATURE_BLIT_DST_BIT = 0x00000800,
 VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT = 0x00001000,
 VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_IMG = 0x00002000,
+VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = 0x00004000,
+VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = 0x00008000,
 VK_FORMAT_FEATURE_FLAG_BITS_MAX_ENUM = 0x7FFFFFFF
 } VkFormatFeatureFlagBits;
 typedef VkFlags VkFormatFeatureFlags;
@@ -863,6 +881,7 @@ typedef enum VkImageCreateFlagBits {
 VK_IMAGE_CREATE_SPARSE_ALIASED_BIT = 0x00000004,
 VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT = 0x00000008,
 VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT = 0x00000010,
+VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT_KHR = 0x00000020,
 VK_IMAGE_CREATE_FLAG_BITS_MAX_ENUM = 0x7FFFFFFF
 } VkImageCreateFlagBits;
 typedef VkFlags VkImageCreateFlags;
@@ -3206,6 +3225,18 @@ VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSurfaceKHR)
 
 typedef enum VkColorSpaceKHR {
 VK_COLOR_SPACE_SRGB_NONLINEAR_KHR = 0,
+VK_COLOR_SPACE_DISPLAY_P3_LINEAR_EXT = 1000104001,
+VK_COLOR_SPACE_DISPLAY_P3_NONLINEAR_EXT = 1000104002,
+VK_COLOR_SPACE_SCRGB_LINEAR_EXT = 1000104003,
+VK_COLOR_SPACE_SCRGB_NONLINEAR_EXT = 1000104004,
+VK_COLOR_SPACE_DCI_P3_LINEAR_EXT = 1000104005,
+VK_COLOR_SPACE_DCI_P3_NONLINEAR_EXT = 1000104006,
+VK_COLOR_SPACE_BT709_LINEAR_EXT = 1000104007,
+VK_COLOR_SPACE_BT709_NONLINEAR_EXT = 1000104008,
+VK_COLOR_SPACE_BT2020_LINEAR_EXT = 1000104009,
+VK_COLOR_SPACE_BT2020_

[Mesa-dev] [PATCH 2/2] radv: implement VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2

2017-01-23 Thread Dave Airlie
From: Dave Airlie 

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c  | 38 +-
 src/amd/vulkan/radv_formats.c | 36 
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 4aa6af2..a1d846f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -119,6 +119,10 @@ static const VkExtensionProperties common_device_extensions[] = {
       .extensionName = VK_AMD_NEGATIVE_VIEWPORT_HEIGHT_EXTENSION_NAME,
       .specVersion = 1,
    },
+   {
+      .extensionName = VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
+      .specVersion = 1,
+   },
 };
 
 static VkResult
@@ -467,6 +471,13 @@ void radv_GetPhysicalDeviceFeatures(
    };
 }
 
+void radv_GetPhysicalDeviceFeatures2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   VkPhysicalDeviceFeatures2KHR               *pFeatures)
+{
+   return radv_GetPhysicalDeviceFeatures(physicalDevice, &pFeatures->features);
+}
+
 void radv_GetPhysicalDeviceProperties(
    VkPhysicalDevice                            physicalDevice,
    VkPhysicalDeviceProperties*                 pProperties)
@@ -600,6 +611,13 @@ void radv_GetPhysicalDeviceProperties(
    memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
 }
 
+void radv_GetPhysicalDeviceProperties2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   VkPhysicalDeviceProperties2KHR             *pProperties)
+{
+   return radv_GetPhysicalDeviceProperties(physicalDevice, &pProperties->properties);
+}
+
 void radv_GetPhysicalDeviceQueueFamilyProperties(
    VkPhysicalDevice                            physicalDevice,
    uint32_t*                                   pCount,
@@ -650,9 +668,19 @@ void radv_GetPhysicalDeviceQueueFamilyProperties(
    *pCount = idx;
 }
 
+void radv_GetPhysicalDeviceQueueFamilyProperties2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   uint32_t*                                   pCount,
+   VkQueueFamilyProperties2KHR*                pQueueFamilyProperties)
+{
+   return radv_GetPhysicalDeviceQueueFamilyProperties(physicalDevice,
+                                                      pCount,
+                                                      &pQueueFamilyProperties->queueFamilyProperties);
+}
+
 void radv_GetPhysicalDeviceMemoryProperties(
    VkPhysicalDevice                            physicalDevice,
-   VkPhysicalDeviceMemoryProperties*           pMemoryProperties)
+   VkPhysicalDeviceMemoryProperties           *pMemoryProperties)
 {
    RADV_FROM_HANDLE(radv_physical_device, physical_device, physicalDevice);
 
@@ -699,6 +727,14 @@ void radv_GetPhysicalDeviceMemoryProperties(
    };
 }
 
+void radv_GetPhysicalDeviceMemoryProperties2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   VkPhysicalDeviceMemoryProperties2KHR       *pMemoryProperties)
+{
+   return radv_GetPhysicalDeviceMemoryProperties(physicalDevice,
+                                                 &pMemoryProperties->memoryProperties);
+}
+
 static int
 radv_queue_init(struct radv_device *device, struct radv_queue *queue,
                 int queue_family_index, int idx)
diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index e276432..87c28f1 100644
--- a/src/amd/vulkan/radv_formats.c
+++ b/src/amd/vulkan/radv_formats.c
@@ -957,6 +957,18 @@ void radv_GetPhysicalDeviceFormatProperties(
                                               pFormatProperties);
 }
 
+void radv_GetPhysicalDeviceFormatProperties2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   VkFormat                                    format,
+   VkFormatProperties2KHR*                     pFormatProperties)
+{
+   RADV_FROM_HANDLE(radv_physical_device, physical_device, physicalDevice);
+
+   radv_physical_device_get_format_properties(physical_device,
+                                              format,
+                                              &pFormatProperties->formatProperties);
+}
+
 VkResult radv_GetPhysicalDeviceImageFormatProperties(
    VkPhysicalDevice                            physicalDevice,
    VkFormat                                    format,
@@ -1071,6 +1083,20 @@ unsupported:
return VK_ERROR_FORMAT_NOT_SUPPORTED;
 }
 
+VkResult radv_GetPhysicalDeviceImageFormatProperties2KHR(
+   VkPhysicalDevice                            physicalDevice,
+   const VkPhysicalDeviceImageFormatInfo2KHR*  pImageFormatInfo,
+   VkImageFormatProperties2KHR*                pImageFormatProperties)
+{
+   return radv_GetPhysicalDeviceImageFormatProperties(physicalDevice,
+

[Mesa-dev] [PATCH 1/2] vulkan: import latest registry for 1.0.39 extensions.

2017-01-23 Thread Dave Airlie
From: Dave Airlie 

Signed-off-by: Dave Airlie 
---
 src/vulkan/registry/vk.xml | 450 -
 1 file changed, 408 insertions(+), 42 deletions(-)

diff --git a/src/vulkan/registry/vk.xml b/src/vulkan/registry/vk.xml
index 4f358c2..779875b 100644
--- a/src/vulkan/registry/vk.xml
+++ b/src/vulkan/registry/vk.xml
@@ -1,7 +1,7 @@
 
 
 
-Copyright (c) 2015-2016 The Khronos Group Inc.
+Copyright (c) 2015-2017 The Khronos Group Inc.
 
 Permission is hereby granted, free of charge, to any person obtaining a
 copy of this software and/or associated documentation files (the
@@ -62,6 +62,7 @@ maintained in the master branch of the Khronos Vulkan GitHub 
project.
 
 
 
+
 
 
 
@@ -70,6 +71,7 @@ maintained in the master branch of the Khronos Vulkan GitHub 
project.
 
 #include "vulkan.h"
 #include 

+#include 

 #include 

 #include 

 #include 

@@ -79,6 +81,7 @@ maintained in the master branch of the Khronos Vulkan GitHub 
project.
 
 
 
+
 
 
 
@@ -104,7 +107,7 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 // Vulkan 1.0 version number
 #define VK_API_VERSION_1_0 VK_MAKE_VERSION(1, 0, 
0)
 // Version of this file
-#define VK_HEADER_VERSION 38
+#define VK_HEADER_VERSION 39
 
 
 #define VK_DEFINE_HANDLE(object) typedef struct object##_T* 
object;
@@ -208,14 +211,17 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 typedef VkFlags 
VkDisplaySurfaceCreateFlagsKHR; 
 typedef VkFlags 
VkAndroidSurfaceCreateFlagsKHR; 
 typedef VkFlags 
VkMirSurfaceCreateFlagsKHR; 
+typedef VkFlags 
VkViSurfaceCreateFlagsNN;  
 typedef VkFlags 
VkWaylandSurfaceCreateFlagsKHR; 
 typedef VkFlags 
VkWin32SurfaceCreateFlagsKHR;   
 typedef VkFlags 
VkXlibSurfaceCreateFlagsKHR;
 typedef VkFlags 
VkXcbSurfaceCreateFlagsKHR; 
 
 typedef VkFlags 
VkDebugReportFlagsEXT;
+typedef VkFlags 
VkCommandPoolTrimFlagsKHR;
 typedef VkFlags 
VkExternalMemoryHandleTypeFlagsNV;
 typedef VkFlags 
VkExternalMemoryFeatureFlagsNV;
+typedef VkFlags 
VkSurfaceCounterFlagsEXT;
 
 
 VK_DEFINE_HANDLE(VkInstance)
@@ -357,6 +363,10 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 
 
 
+
+
+
+
 
 
 typedef void (VKAPI_PTR 
*PFN_vkInternalAllocationNotification)(
@@ -492,7 +502,7 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 
 
 VkStructureType 
sType
-const void* pNext   
   
+const 
void* pNext  

 VkDeviceCreateFlags
flags   
 uint32_t
queueCreateInfoCount
 const 
VkDeviceQueueCreateInfo* pQueueCreateInfos
@@ -1463,6 +1473,12 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 MirConnection*  
 connection
 MirSurface* 
 mirSurface
 
+
+VkStructureType
 sType
+const void*  
pNext
+VkViSurfaceCreateFlagsNN   
flags
+void*
window
+
 
 VkStructureType
 sType
 const void*  
pNext
@@ -1517,7 +1533,7 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 
 
 VkStructureType 
sType
-const void*  
pNext
+const 
void* pNext
 uint32_t 
waitSemaphoreCount   
 const 
VkSemaphore* pWaitSemaphores 
 uint32_t 
swapchainCount   
@@ -1713,6 +1729,7 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 VkObjectEntryTypeNVX 
type
 VkObjectEntryUsageFlagsNVX   
flags
 VkBuffer 
buffer
+VkIndexType  
indexType
 
 
 VkObjectEntryTypeNVX 
type
@@ -1720,6 +1737,94 @@ maintained in the master branch of the Khronos Vulkan 
GitHub project.
 VkPipelineLayout 
pipelineLayout
 VkShaderStageFlags   
stageFlags
 
+
+VkStructureType
 sType
+void*
pNext
+VkPhysicalDeviceFeatures 

[Mesa-dev] [PATCH 2/6] anv: Add trivial support for TrimCommandPoolKHR

2017-01-23 Thread Jason Ekstrand
Our command buffers already efficiently use a global pool so trimming
doesn't really need to do anything.
---
 src/intel/vulkan/anv_cmd_buffer.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index d882c18..3a23048 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -787,6 +787,14 @@ VkResult anv_ResetCommandPool(
return VK_SUCCESS;
 }
 
+void anv_TrimCommandPoolKHR(
+    VkDevice                                    device,
+    VkCommandPool                               commandPool,
+    VkCommandPoolTrimFlagsKHR                   flags)
+{
+   /* Nothing for us to do here.  Our pools stay pretty tidy. */
+}
+
 /**
  * Return NULL if the current subpass has no depthstencil attachment.
  */
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 5/6] anv: Return better errors from AllocateDescriptorSets

2017-01-23 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_descriptor_set.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index a5e65af..2d9734d 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -432,8 +432,12 @@ anv_descriptor_set_create(struct anv_device *device,
   }
}
 
-   if (set == NULL)
-  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+   if (set == NULL) {
+  if (pool->free_list != EMPTY)
+ return vk_error(VK_ERROR_FRAGMENTED_POOL);
+  else
+ return vk_error(VK_ERROR_OUT_OF_POOL_MEMOR_KHR);
+   }
 
set->size = size;
set->layout = layout;
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 4/6] anv: Allow selecting the slice of a 3D image

2017-01-23 Thread Jason Ekstrand
As per VK_KHR_maintenance1, clients can render to a slice of a 3D image
by creating a VK_IMAGE_VIEW_TYPE_2D view of it.
---
 src/intel/vulkan/anv_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 12cca67..1c42821 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -529,7 +529,7 @@ anv_CreateImageView(VkDevice _device,
   .depth  = anv_minify(image->extent.depth , range->baseMipLevel),
};
 
-   if (image->type == VK_IMAGE_TYPE_3D) {
+   if (pCreateInfo->viewType == VK_IMAGE_VIEW_TYPE_3D) {
   iview->isl.base_array_layer = 0;
   iview->isl.array_len = iview->extent.depth;
}
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 3/6] anv: Report FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR

2017-01-23 Thread Jason Ekstrand
As of VK_KHR_maintenance1, these are supposed to be reported for any
formats on which we support transfer operations.  For us, this is
anything that we can texture from.
---
 src/intel/vulkan/anv_formats.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index a5d783e..39001d6 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -348,6 +348,11 @@ get_image_format_properties(const struct gen_device_info 
*devinfo,
if (base == ISL_FORMAT_R32_SINT || base == ISL_FORMAT_R32_UINT)
   flags |= VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT;
 
+   if (flags) {
+  flags |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
+   VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
+   }
+
return flags;
 }
 
@@ -393,7 +398,9 @@ anv_physical_device_get_format_properties(struct 
anv_physical_device *physical_d
  tiled |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
 
   tiled |= VK_FORMAT_FEATURE_BLIT_SRC_BIT |
-   VK_FORMAT_FEATURE_BLIT_DST_BIT;
+   VK_FORMAT_FEATURE_BLIT_DST_BIT |
+   VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
+   VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
} else {
   struct anv_format linear_fmt, tiled_fmt;
   linear_fmt = anv_get_format(&physical_device->info, format,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 1/6] anv: Set viewport extents correctly when height is negative

2017-01-23 Thread Jason Ekstrand
As per VK_KHR_maintenance1, setting a negative height in the viewport
can be used to get flipped coordinates.  This is, apparently, very useful
when porting D3D apps to Vulkan.  All we need to do to support this is
to make sure we actually set the min and max correctly.
---
 src/intel/vulkan/gen8_cmd_buffer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
b/src/intel/vulkan/gen8_cmd_buffer.c
index f22037b..ab68872 100644
--- a/src/intel/vulkan/gen8_cmd_buffer.c
+++ b/src/intel/vulkan/gen8_cmd_buffer.c
@@ -59,8 +59,8 @@ gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer 
*cmd_buffer)
  .YMaxClipGuardband = 1.0f,
  .XMinViewPort = vp->x,
  .XMaxViewPort = vp->x + vp->width - 1,
- .YMinViewPort = vp->y,
- .YMaxViewPort = vp->y + vp->height - 1,
+ .YMinViewPort = MIN2(vp->y, vp->y + vp->height),
+ .YMaxViewPort = MAX2(vp->y, vp->y + vp->height) - 1,
   };
 
   GENX(SF_CLIP_VIEWPORT_pack)(NULL, sf_clip_state.map + i * 64,
-- 
2.5.0.400.gff86faf
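
The effect of the MIN2/MAX2 change above can be checked in isolation. What
follows is only a minimal sketch, not anv code: `struct y_extents` and
`viewport_y_extents` are hypothetical names, and the viewport is reduced to
its y origin and height.

```c
/* Hypothetical stand-alone helper; not part of anv. */
struct y_extents {
   float y_min;
   float y_max;
};

/* Compute the inclusive y range covered by a viewport whose height may be
 * negative (the VK_KHR_maintenance1 "flipped" case).
 */
static struct y_extents
viewport_y_extents(float y, float height)
{
   float a = y;
   float b = y + height;   /* lies below y when height is negative */
   struct y_extents e;

   e.y_min = a < b ? a : b;
   e.y_max = (a > b ? a : b) - 1.0f;
   return e;
}
```

With y = 100 and height = -50 the sketch gives a minimum of 50 and a maximum
of 99, whereas the old expressions (y and y + height - 1) would have given a
minimum of 100 and a maximum of 49.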



[Mesa-dev] [PATCH 6/6] anv: Expose VK_KHR_maintenance1

2017-01-23 Thread Jason Ekstrand
---
 src/intel/vulkan/anv_device.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index f80a36a..b24949c 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -263,7 +263,11 @@ static const VkExtensionProperties device_extensions[] = {
{
   .extensionName = VK_KHR_SAMPLER_MIRROR_CLAMP_TO_EDGE_EXTENSION_NAME,
   .specVersion = 1,
-   }
+   },
+   {
+  .extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static void *
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 0/6] anv: Implement VK_KHR_maintenance1

2017-01-23 Thread Jason Ekstrand
This little series implements the new VK_KHR_maintenance1 extension.  Most
of the patches are pretty trivial by themselves but they all do different
things so I split them out.  I'm happy to squash the whole series if
someone really wants me to.

Jason Ekstrand (6):
  anv: Set viewport extents correctly when height is negative
  anv: Add trivial support for TrimCommandPoolKHR
  anv: Report FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR
  anv: Allow selecting the slice of a 3D image
  anv: Return better errors from AllocateDescriptorSets
  anv: Expose VK_KHR_maintenance1

 src/intel/vulkan/anv_cmd_buffer.c | 8 
 src/intel/vulkan/anv_descriptor_set.c | 8 ++--
 src/intel/vulkan/anv_device.c | 6 +-
 src/intel/vulkan/anv_formats.c| 9 -
 src/intel/vulkan/anv_image.c  | 2 +-
 src/intel/vulkan/gen8_cmd_buffer.c| 4 ++--
 6 files changed, 30 insertions(+), 7 deletions(-)

-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH 1/2] vulkan: import latest registry for 1.0.39 extensions.

2017-01-23 Thread Jason Ekstrand
Acked-by: Jason Ekstrand 

I was just about to send this out...


Re: [Mesa-dev] [PATCH 5/6] anv: Return better errors from AllocateDescriptorSets

2017-01-23 Thread Ilia Mirkin
On Mon, Jan 23, 2017 at 5:12 PM, Jason Ekstrand  wrote:
> ---
>  src/intel/vulkan/anv_descriptor_set.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c 
> b/src/intel/vulkan/anv_descriptor_set.c
> index a5e65af..2d9734d 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -432,8 +432,12 @@ anv_descriptor_set_create(struct anv_device *device,
>}
> }
>
> -   if (set == NULL)
> -  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> +   if (set == NULL) {
> +  if (pool->free_list != EMPTY)
> + return vk_error(VK_ERROR_FRAGMENTED_POOL);
> +  else
> + return vk_error(VK_ERROR_OUT_OF_POOL_MEMOR_KHR);

Shouldn't this be MEMORY?

> +   }
>
> set->size = size;
> set->layout = layout;
> --
> 2.5.0.400.gff86faf
>


[Mesa-dev] [PATCH] anv: Implement VK_KHR_get_physical_device_properties2

2017-01-23 Thread Chad Versace
Implement each vkFoo2KHR() by trivially passing it through to the
original vkFoo().

---

I tested this patch with a little demo app, but I haven't run any CTS tests
with it. If CTS tests do exist (I'm searching for them now), I'll run
them against this patch before pushing.

 src/intel/vulkan/anv_device.c  | 36 
 src/intel/vulkan/anv_formats.c | 39 +++
 2 files changed, 75 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index f80a36a940..7a7ada3bfb 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -253,6 +253,10 @@ static const VkExtensionProperties global_extensions[] = {
   .specVersion = 5,
},
 #endif
+   {
+  .extensionName = VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME,
+  .specVersion = 1,
+   },
 };
 
 static const VkExtensionProperties device_extensions[] = {
@@ -493,6 +497,13 @@ void anv_GetPhysicalDeviceFeatures(
   pdevice->compiler->scalar_stage[MESA_SHADER_GEOMETRY];
 }
 
+void anv_GetPhysicalDeviceFeatures2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    VkPhysicalDeviceFeatures2KHR*               pFeatures)
+{
+   anv_GetPhysicalDeviceFeatures(physicalDevice, &pFeatures->features);
+}
+
 void anv_GetPhysicalDeviceProperties(
     VkPhysicalDevice                            physicalDevice,
     VkPhysicalDeviceProperties*                 pProperties)
@@ -636,6 +647,13 @@ void anv_GetPhysicalDeviceProperties(
memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
 }
 
+void anv_GetPhysicalDeviceProperties2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    VkPhysicalDeviceProperties2KHR*             pProperties)
+{
+   anv_GetPhysicalDeviceProperties(physicalDevice, &pProperties->properties);
+}
+
 void anv_GetPhysicalDeviceQueueFamilyProperties(
     VkPhysicalDevice                            physicalDevice,
     uint32_t*                                   pCount,
@@ -667,6 +685,16 @@ void anv_GetPhysicalDeviceQueueFamilyProperties(
*pCount = 1;
 }
 
+void anv_GetPhysicalDeviceQueueFamilyProperties2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    uint32_t*                                   pQueueFamilyPropertyCount,
+    VkQueueFamilyProperties2KHR*                pQueueFamilyProperties)
+{
+   anv_GetPhysicalDeviceQueueFamilyProperties(physicalDevice,
+         pQueueFamilyPropertyCount,
+         &pQueueFamilyProperties->queueFamilyProperties);
+}
+
 void anv_GetPhysicalDeviceMemoryProperties(
     VkPhysicalDevice                            physicalDevice,
     VkPhysicalDeviceMemoryProperties*           pMemoryProperties)
@@ -719,6 +747,14 @@ void anv_GetPhysicalDeviceMemoryProperties(
};
 }
 
+void anv_GetPhysicalDeviceMemoryProperties2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    VkPhysicalDeviceMemoryProperties2KHR*       pMemoryProperties)
+{
+   anv_GetPhysicalDeviceMemoryProperties(physicalDevice,
+         &pMemoryProperties->memoryProperties);
+}
+
 PFN_vkVoidFunction anv_GetInstanceProcAddr(
     VkInstance                                  instance,
     const char*                                 pName)
diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
index a5d783e689..c4ee14cab4 100644
--- a/src/intel/vulkan/anv_formats.c
+++ b/src/intel/vulkan/anv_formats.c
@@ -450,6 +450,15 @@ void anv_GetPhysicalDeviceFormatProperties(
pFormatProperties);
 }
 
+void anv_GetPhysicalDeviceFormatProperties2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    VkFormat                                    format,
+    VkFormatProperties2KHR*                     pFormatProperties)
+{
+   anv_GetPhysicalDeviceFormatProperties(physicalDevice, format,
+         &pFormatProperties->formatProperties);
+}
+
 VkResult anv_GetPhysicalDeviceImageFormatProperties(
     VkPhysicalDevice                            physicalDevice,
     VkFormat                                    format,
@@ -604,6 +613,20 @@ unsupported:
return VK_ERROR_FORMAT_NOT_SUPPORTED;
 }
 
+VkResult anv_GetPhysicalDeviceImageFormatProperties2KHR(
+    VkPhysicalDevice                            physicalDevice,
+    const VkPhysicalDeviceImageFormatInfo2KHR*  pImageFormatInfo,
+    VkImageFormatProperties2KHR*                pImageFormatProperties)
+{
+   return anv_GetPhysicalDeviceImageFormatProperties(physicalDevice,
+            pImageFormatInfo->format,
+            pImageFormatInfo->type,
+            pImageFormatInfo->tiling,
+            pImageFormatInfo->usage,
+            pImageFormatInfo->flags,
+            &pImageFormatProperties->imageFormatProperties);
+}
+
 void anv_GetPhysicalDeviceSparseImageFormatProperties(
     VkPhysicalDevice                            physicalDevice,
 VkFormat 

[Mesa-dev] [PATCH 5/6] anv: Return better errors from AllocateDescriptorSets

2017-01-23 Thread Jason Ekstrand
v2: I need to learn to compile-test my patches

---
 src/intel/vulkan/anv_descriptor_set.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index a5e65af..a4b7638 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -432,8 +432,13 @@ anv_descriptor_set_create(struct anv_device *device,
   }
}
 
-   if (set == NULL)
-  return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+   if (set == NULL) {
+  if (pool->free_list != EMPTY) {
+ return vk_error(VK_ERROR_FRAGMENTED_POOL);
+  } else {
+ return vk_error(VK_ERROR_OUT_OF_POOL_MEMORY_KHR);
+  }
+   }
 
set->size = size;
set->layout = layout;
-- 
2.5.0.400.gff86faf
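
The decision made in the hunk above boils down to one question: did the
allocation fail while the pool still had free entries? A minimal sketch of
that choice, assuming a hypothetical `enum alloc_error` in place of the real
VkResult codes and a boolean in place of the `pool->free_list != EMPTY`
check:

```c
#include <stdbool.h>

/* Hypothetical result codes standing in for the VkResult values. */
enum alloc_error {
   ALLOC_ERROR_FRAGMENTED_POOL,
   ALLOC_ERROR_OUT_OF_POOL_MEMORY
};

/* Called only after an allocation from the pool has already failed.
 * If free entries remain, none of them was large enough, so the pool
 * is fragmented; otherwise the pool is simply exhausted.
 */
static enum alloc_error
descriptor_alloc_error(bool pool_has_free_entries)
{
   return pool_has_free_entries ? ALLOC_ERROR_FRAGMENTED_POOL
                                : ALLOC_ERROR_OUT_OF_POOL_MEMORY;
}
```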



Re: [Mesa-dev] [PATCH] anv: Implement VK_KHR_get_physical_device_properties2

2017-01-23 Thread Jason Ekstrand
On Mon, Jan 23, 2017 at 2:28 PM, Chad Versace 
wrote:

> Implement each vkFoo2KHR() by trivially passing it through to the
> original vkFoo().
>

As I mentioned to Lionel when he wrote basically this exact same patch, I
think that may be backwards.  I can see two ways of doing this long-term:

1) Implement all of the queries (of a particular type) in a single function
and the legacy query calls the query2 variant and then copies the data over.
2) Implement each query as its own function and the queries2 function loops
over the data structures calling the appropriate function on each one.

TBH, I'm not sure which of those two I prefer but I would like to at least
think about the future so we don't have to come through and rewrite it.  I
think I'm slightly leaning towards option 2 but I'm not sold.  Thoughts?

--Jason
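
Option 2 might look roughly like the following. This is only a sketch under
stated assumptions: the structure types, the `STYPE_*` values and the `get_*`
helpers are hypothetical stand-ins rather than the real Vulkan or anv
definitions, and the chain walk assumes every chained structure begins with
an sType/pNext pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VkStructureType and the query structures. */
typedef enum { STYPE_FEATURES2 = 1, STYPE_EXT_A = 2 } stype;

struct base_out { stype sType; void *pNext; };
struct core_features { uint32_t robust; };
struct features2 { stype sType; void *pNext; struct core_features features; };
struct ext_a_features { stype sType; void *pNext; uint32_t supported; };

/* One small function per query, as in option 2. */
static void get_core_features(struct core_features *f) { f->robust = 1; }
static void get_ext_a_features(struct ext_a_features *f) { f->supported = 1; }

/* The "2" entry point fills the core structure, then walks the pNext
 * chain and dispatches on sType.  Returns how many structures it filled.
 */
static unsigned
get_features2(struct features2 *out)
{
   unsigned filled = 0;

   get_core_features(&out->features);
   filled++;

   for (struct base_out *s = out->pNext; s != NULL; s = s->pNext) {
      switch (s->sType) {
      case STYPE_EXT_A:
         get_ext_a_features((struct ext_a_features *)s);
         filled++;
         break;
      default:
         break;   /* unknown structure: leave it untouched */
      }
   }
   return filled;
}
```

Under this layout the legacy entry point keeps calling `get_core_features`
directly, so no data has to be copied between the old and new structures.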


> ---
>
> I tested this patch with a little demo app, but I haven't run any CTS tests
> with it. If CTS tests do exist (I'm searching for them now), I'll run
> them against this patch before pushing.
>
>  src/intel/vulkan/anv_device.c  | 36 
>  src/intel/vulkan/anv_formats.c | 39 ++
> +
>  2 files changed, 75 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index f80a36a940..7a7ada3bfb 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -253,6 +253,10 @@ static const VkExtensionProperties
> global_extensions[] = {
>.specVersion = 5,
> },
>  #endif
> +   {
> +  .extensionName = VK_KHR_GET_PHYSICAL_DEVICE_
> PROPERTIES_2_EXTENSION_NAME,
> +  .specVersion = 1,
> +   },
>  };
>
>  static const VkExtensionProperties device_extensions[] = {
> @@ -493,6 +497,13 @@ void anv_GetPhysicalDeviceFeatures(
>pdevice->compiler->scalar_stage[MESA_SHADER_GEOMETRY];
>  }
>
> +void anv_GetPhysicalDeviceFeatures2KHR(
> +VkPhysicalDevicephysicalDevice,
> +VkPhysicalDeviceFeatures2KHR*   pFeatures)
> +{
> +   anv_GetPhysicalDeviceFeatures(physicalDevice, &pFeatures->features);
> +}
> +
>  void anv_GetPhysicalDeviceProperties(
>  VkPhysicalDevicephysicalDevice,
>  VkPhysicalDeviceProperties* pProperties)
> @@ -636,6 +647,13 @@ void anv_GetPhysicalDeviceProperties(
> memcpy(pProperties->pipelineCacheUUID, pdevice->uuid, VK_UUID_SIZE);
>  }
>
> +void anv_GetPhysicalDeviceProperties2KHR(
> +VkPhysicalDevicephysicalDevice,
> +VkPhysicalDeviceProperties2KHR* pProperties)
> +{
> +   anv_GetPhysicalDeviceProperties(physicalDevice,
> &pProperties->properties);
> +}
> +
>  void anv_GetPhysicalDeviceQueueFamilyProperties(
>  VkPhysicalDevicephysicalDevice,
>  uint32_t*   pCount,
> @@ -667,6 +685,16 @@ void anv_GetPhysicalDeviceQueueFamilyProperties(
> *pCount = 1;
>  }
>
> +void anv_GetPhysicalDeviceQueueFamilyProperties2KHR(
> +VkPhysicalDevicephysicalDevice,
> +uint32_t*   pQueueFamilyPropertyCount,
> +VkQueueFamilyProperties2KHR*pQueueFamilyProperties)
> +{
> +   anv_GetPhysicalDeviceQueueFamilyProperties(physicalDevice,
> + pQueueFamilyPropertyCount,
> + &pQueueFamilyProperties->queueFamilyProperties);
> +}
> +
>  void anv_GetPhysicalDeviceMemoryProperties(
>  VkPhysicalDevicephysicalDevice,
>  VkPhysicalDeviceMemoryProperties*   pMemoryProperties)
> @@ -719,6 +747,14 @@ void anv_GetPhysicalDeviceMemoryProperties(
> };
>  }
>
> +void anv_GetPhysicalDeviceMemoryProperties2KHR(
> +VkPhysicalDevicephysicalDevice,
> +VkPhysicalDeviceMemoryProperties2KHR*   pMemoryProperties)
> +{
> +   anv_GetPhysicalDeviceMemoryProperties(physicalDevice,
> + &pMemoryProperties->
> memoryProperties);
> +}
> +
>  PFN_vkVoidFunction anv_GetInstanceProcAddr(
>  VkInstance  instance,
>  const char* pName)
> diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_
> formats.c
> index a5d783e689..c4ee14cab4 100644
> --- a/src/intel/vulkan/anv_formats.c
> +++ b/src/intel/vulkan/anv_formats.c
> @@ -450,6 +450,15 @@ void anv_GetPhysicalDeviceFormatProperties(
> pFormatProperties);
>  }
>
> +void anv_GetPhysicalDeviceFormatProperties2KHR(
> +VkPhysicalDevicephysicalDevice,
> +VkFormatformat,
> +VkFormatProperties2KHR* pFormatProperties)
> +{
> +   anv_GetPhysicalDeviceFormatProperties(physicalDevice, format,
> + &pFormatProperties->
> formatProperties);
> +}
> +
>  VkResult anv_GetPhysicalDeviceIma

[Mesa-dev] [Bug 97102] [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr

2017-01-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97102

Jan Ziak <0xe2.0x9a.0...@gmail.com> changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REOPENED|RESOLVED

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 0/3] i965: IVB/BYT fp64 fixes

2017-01-23 Thread Matt Turner
Thanks Sam!


Re: [Mesa-dev] [PATCH 3/3] i965: Use correct VertStride on align16 instructions.

2017-01-23 Thread Matt Turner
On Mon, Jan 23, 2017 at 2:59 AM, Samuel Iglesias Gonsálvez
 wrote:
> On Fri, 2017-01-20 at 14:25 -0800, Francisco Jerez wrote:
>> Matt Turner  writes:
>>
>> > In commit c35fa7a, we changed the "width" of DF source registers to 2,
>> > which is conceptually fine. Unfortunately a VertStride of 2 is not
>> > allowed by align16 instructions on IVB/BYT, and the regular VertStride
>> > of 4 works fine in any case.
>> >
>>
>> I'll try to throw some light onto why this works -- AFAIUI, in Align16
>> mode the vertical stride control doesn't behave as you would expect -- A
>> VertStride=0 does behave as expected and steps zero elements across rows
>> (modulo instruction decompression bugs), but on Gen7 any other value
>> simply behaves as "step by a fixed offset of 16B across rows".  On HSW
>> they explicitly allowed VertStride=2, but I don't think the hardware
>> became any smarter, it's still most likely a 16B fixed offset.  On IVB
>> neither VertStride=2 nor VertStride=4 is supposed to work for our
>> purposes (the former because it's supposedly not supported and the
>> latter because one would expect it to step by 4 DF elements = 32B per
>> 16B row), but they both likely work in practice.  Anyway let's stick to
>> what the docs say is not illegal, a couple more comments below.
>>
>> > See
>> > generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
>> > for example:
>> >
>> > cmp.ge.f0(8)    g18<1>DF    g1<0>.xyxyDF    -g8<2>DF    { align16 1Q };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
>> > cmp.ge.f0(8)    g19<1>DF    g1<0>.xyxyDF    -g9<2>DF    { align16 2N };
>> > ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 18 ++
>> >  1 file changed, 14 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > index 888f95e..a01083f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> > @@ -512,10 +512,15 @@ brw_set_src0(struct brw_codegen *p, brw_inst
>> > *inst, struct brw_reg reg)
>> >  /* This is an oddity of the fact we're using the same
>> >   * descriptions for registers in align_16 as align_1:
>> >   */
>>
>> Maybe move the comment above into the BRW_VERTICAL_STRIDE_8 block so
>> nobody gets confused and thinks that the code you added has to do with
>> our representation of align_16 regions?
>>
>> > -if (reg.vstride == BRW_VERTICAL_STRIDE_8)
>> > + if (reg.vstride == BRW_VERTICAL_STRIDE_8) {
>> >  brw_inst_set_src0_vstride(devinfo, inst,
>> > BRW_VERTICAL_STRIDE_4);
>> > -else
>> > + } else if (devinfo->gen == 7 && !devinfo->is_haswell &&
>> > +reg.type == BRW_REGISTER_TYPE_DF &&
>> > +reg.vstride >= BRW_VERTICAL_STRIDE_1) {
>>
>> I think I'd only do this for BRW_VERTICAL_STRIDE_2, because DF Align16
>> regions with VertStride=4 really behave like VertStride=2.  If the
>> caller expected anything else it's not going to get it.
>>
>> Maybe copy-paste the relevant spec text here and below to explain why we
>> only use BRW_VERTICAL_STRIDE_4?
>>
>
> Matt, I can handle these changes... however, I have not found the
> relevant spec quote. Can you provide it?

I could not find anything useful in the IVB PRM. The HSW PRM has the
second quote below.

In the internal documentation, it says for DevSNB "For Align16 access
mode, only encodings of 0000 and 0011 are allowed. Other codes are
reserved."

and for DevHSW "For Align16 access mode, only encodings of 0000, 0010
and 0011 are allowed. Other codes are reserved."

Presumably the DevSNB behavior applies to IVB as well.

Thanks for handling this.
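
The rule in the two documentation quotes can be written down as a small
predicate. This is a sketch only, taking the allowed encodings to be 0000 and
0011 before Haswell (consistent with the assembler error quoted earlier,
"only VertStride of 0 or 4 is allowed") and 0000, 0010 and 0011 on Haswell.
The enum names and `align16_vstride_allowed` are hypothetical, and the
mapping of 0010 to a stride of 2 and 0011 to a stride of 4 reflects my
understanding of the encoding rather than anything stated in this thread.

```c
#include <stdbool.h>

/* Hypothetical names for the 4-bit VertStride field encodings. */
enum {
   VSTRIDE_ENC_0 = 0x0, /* 0000: stride of 0 */
   VSTRIDE_ENC_2 = 0x2, /* 0010: stride of 2 (assumed mapping) */
   VSTRIDE_ENC_4 = 0x3  /* 0011: stride of 4 (assumed mapping) */
};

/* Return true if the encoding is documented as legal for Align16 access
 * mode.  Pre-Haswell parts accept only strides 0 and 4; Haswell also
 * accepts stride 2.
 */
static bool
align16_vstride_allowed(bool is_haswell, unsigned encoding)
{
   if (encoding == VSTRIDE_ENC_0 || encoding == VSTRIDE_ENC_4)
      return true;
   return is_haswell && encoding == VSTRIDE_ENC_2;
}
```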


[Mesa-dev] [PATCH 02/37] glsl: Switch to disable-by-default for the GLSL shader cache

2017-01-23 Thread Timothy Arceri
From: Carl Worth 

The shader cache is expected to be developed incrementally over a
fairly long series of commits. For that period of instability, we
require users to opt into the shader cache by setting:

MESA_GLSL_CACHE_ENABLE=1

In the future, when the shader cache is complete, we can revert this
commit so that the cache will be on by default.

The user can always disable the cache with
MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this
commit, (nor will it be affected by the future revert).
---
 src/compiler/glsl/tests/cache_test.c | 5 +
 src/util/disk_cache.c| 7 +++
 2 files changed, 12 insertions(+)

diff --git a/src/compiler/glsl/tests/cache_test.c 
b/src/compiler/glsl/tests/cache_test.c
index 0ef05aa..8547141 100644
--- a/src/compiler/glsl/tests/cache_test.c
+++ b/src/compiler/glsl/tests/cache_test.c
@@ -388,6 +388,11 @@ main(void)
 #ifdef ENABLE_SHADER_CACHE
int err;
 
+   /* While the shader cache is still experimental, this variable must
+* be set or the cache does nothing.
+*/
+   setenv("MESA_GLSL_CACHE_ENABLE", "1", 1);
+
test_disk_cache_create();
 
test_put_and_get();
diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 6de608c..dec09e0 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -151,6 +151,13 @@ disk_cache_create(void)
if (getenv("MESA_GLSL_CACHE_DISABLE"))
   goto fail;
 
+   /* As a temporary measure, (while the shader cache is under
+* development, and known to not be fully functional), also require
+* the MESA_GLSL_CACHE_ENABLE variable to be set.
+*/
+   if (! getenv ("MESA_GLSL_CACHE_ENABLE"))
+  goto fail;
+
/* Determine path for cache based on the first defined name as follows:
 *
 *   $MESA_GLSL_CACHE_DIR
-- 
2.9.3
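
Taken together, the two getenv() checks in disk_cache_create() amount to the
following decision. A minimal sketch, assuming hypothetical helper names
(`cache_allowed`, `cache_allowed_from_env`); only the two environment
variable names come from the patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Pure form of the decision.  Each argument is the result of getenv()
 * for the corresponding variable, so NULL means "not set".  As in the
 * patch, only presence is tested: the value itself is never inspected.
 */
static bool
cache_allowed(const char *disable_value, const char *enable_value)
{
   if (disable_value != NULL)
      return false;             /* explicit opt-out always wins */
   return enable_value != NULL; /* otherwise the cache is opt-in */
}

/* Convenience wrapper reading the real environment. */
static bool
cache_allowed_from_env(void)
{
   return cache_allowed(getenv("MESA_GLSL_CACHE_DISABLE"),
                        getenv("MESA_GLSL_CACHE_ENABLE"));
}
```

One consequence of testing presence rather than value is that setting
MESA_GLSL_CACHE_DISABLE=0 still disables the cache.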



[Mesa-dev] [PATCH 04/37] glsl: add cache to ctx and add sha1 string fields

2017-01-23 Thread Timothy Arceri
From: Carl Worth 

We also add a flag for detecting shaders written to shader cache.

V2: don't leak cache

Signed-off-by: Timothy Arceri 
---
 src/mesa/main/context.c | 6 ++
 src/mesa/main/mtypes.h  | 9 +
 2 files changed, 15 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index bd4551e..7ecd241 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -121,6 +121,7 @@
 #include "shared.h"
 #include "shaderobj.h"
 #include "shaderimage.h"
+#include "util/disk_cache.h"
 #include "util/strtod.h"
 #include "stencil.h"
 #include "texcompress_s3tc.h"
@@ -1229,6 +1230,8 @@ _mesa_initialize_context(struct gl_context *ctx,
memset(&ctx->TextureFormatSupported, GL_TRUE,
  sizeof(ctx->TextureFormatSupported));
 
+   ctx->Cache = disk_cache_create();
+
switch (ctx->API) {
case API_OPENGL_COMPAT:
   ctx->BeginEnd = create_beginend_table(ctx);
@@ -1269,6 +1272,7 @@ fail:
free(ctx->BeginEnd);
free(ctx->OutsideBeginEnd);
free(ctx->Save);
+   ralloc_free(ctx->Cache);
return GL_FALSE;
 }
 
@@ -1336,6 +1340,8 @@ _mesa_free_context_data( struct gl_context *ctx )
free(ctx->Save);
free(ctx->ContextLost);
 
+   ralloc_free(ctx->Cache);
+
/* Shared context state (display lists, textures, etc) */
_mesa_reference_shared_state(ctx, &ctx->Shared, NULL);
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 4ac7531..1cc8322 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1927,6 +1927,9 @@ struct gl_program
 
bool is_arb_asm; /** Is this an ARB assembly-style program */
 
+   /** Is this program written to on disk shader cache */
+   bool program_written_to_cache;
+
GLbitfield64 SecondaryOutputsWritten; /**< Subset of OutputsWritten outputs 
written with non-zero index. */
GLbitfield TexturesUsed[MAX_COMBINED_TEXTURE_IMAGE_UNITS];  /**< 
TEXTURE_x_BIT bitmask */
GLbitfield SamplersUsed;   /**< Bitfield of which samplers are used */
@@ -2384,6 +2387,7 @@ struct gl_shader
GLuint Name;  /**< AKA the handle */
GLint RefCount;  /**< Reference count */
GLchar *Label;   /**< GL_KHR_debug */
+   unsigned char sha1[20]; /**< SHA1 hash of pre-processed source */
GLboolean DeletePending;
GLboolean CompileStatus;
bool IsES;  /**< True if this shader uses GLSL ES */
@@ -2649,6 +2653,9 @@ struct gl_shader_program_data
 {
GLint RefCount;  /**< Reference count */
 
+   /** SHA1 hash of linked shader program */
+   unsigned char sha1[20];
+
unsigned NumUniformStorage;
unsigned NumHiddenUniforms;
struct gl_uniform_storage *UniformStorage;
@@ -4645,6 +4652,8 @@ struct gl_context
 * Stores the arguments to glPrimitiveBoundingBox
 */
GLfloat PrimitiveBoundingBox[8];
+
+   struct disk_cache *Cache;
 };
 
 /**
-- 
2.9.3



[Mesa-dev] [PATCH 03/37] glsl: add new uniform fields to be used to restore state from cache

2017-01-23 Thread Timothy Arceri
From: Carl Worth 

Signed-off-by: Timothy Arceri 
---
 src/compiler/glsl/link_uniforms.cpp | 4 
 src/mesa/main/mtypes.h  | 4 
 2 files changed, 8 insertions(+)

diff --git a/src/compiler/glsl/link_uniforms.cpp 
b/src/compiler/glsl/link_uniforms.cpp
index a450aa0..8930d26 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -1265,6 +1265,10 @@ link_assign_uniform_storage(struct gl_context *ctx,
 
link_setup_uniform_remap_tables(ctx, prog);
 
+   /* Set shader cache fields */
+   prog->data->NumUniformDataSlots = num_data_slots;
+   prog->data->UniformDataSlots = data;
+
link_set_uniform_initializers(prog, boolean_true);
 }
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b0a97b3..4ac7531 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2662,6 +2662,10 @@ struct gl_shader_program_data
struct gl_active_atomic_buffer *AtomicBuffers;
unsigned NumAtomicBuffers;
 
+   /* Shader cache variables used during restore */
+   unsigned NumUniformDataSlots;
+   union gl_constant_value *UniformDataSlots;
+
/** List of all active resources after linking. */
struct gl_program_resource *ProgramResourceList;
unsigned NumProgramResourceList;
-- 
2.9.3



[Mesa-dev] Hardware agnostic on-disk shader cache patches

2017-01-23 Thread Timothy Arceri
The refactoring series is now pushed, so here are the hardware agnostic
shader cache patches.

In future I want to change the way we track Mesa versions in patch 24
by creating a new cache directory for each Mesa version. This will
help avoid unnecessary fallback recompiles and also allow 3rd parties
to easily distribute precompiled shaders. However for now patch 24
works fine and I'd rather not hold up this series any longer.



[Mesa-dev] [PATCH 06/37] util: add a disk_cache_remove() function

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This will be used to remove cache items created with old versions
of Mesa or other invalid cache items from the cache.
---
 src/util/disk_cache.c | 22 ++
 src/util/disk_cache.h | 12 
 2 files changed, 34 insertions(+)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index dec09e0..da01be7 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -538,6 +538,28 @@ evict_random_item(struct disk_cache *cache)
 }
 
 void
+disk_cache_remove(struct disk_cache *cache, cache_key key)
+{
+   struct stat sb;
+
+   char *filename = get_cache_file(cache, key);
+   if (filename == NULL) {
+  return;
+   }
+
+   if (stat(filename, &sb) == -1) {
+  ralloc_free(filename);
+  return;
+   }
+
+   unlink(filename);
+   ralloc_free(filename);
+
+   if (sb.st_size)
+  p_atomic_add(cache->size, - sb.st_size);
+}
+
+void
 disk_cache_put(struct disk_cache *cache,
   cache_key key,
   const void *data,
diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 7e9cb80..75d4b5b 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -78,6 +78,12 @@ void
 disk_cache_destroy(struct disk_cache *cache);
 
 /**
+ * Remove the item in the cache under the name \key.
+ */
+void
+disk_cache_remove(struct disk_cache *cache, cache_key key);
+
+/**
  * Store an item in the cache under the name \key.
  *
  * The item can be retrieved later with disk_cache_get(), (unless the item has
@@ -151,6 +157,12 @@ disk_cache_put(struct disk_cache *cache, cache_key key,
return;
 }
 
+static inline void
+disk_cache_remove(struct disk_cache *cache, cache_key key)
+{
+   return;
+}
+
 static inline uint8_t *
 disk_cache_get(struct disk_cache *cache, cache_key key, size_t *size)
 {
-- 
2.9.3



[Mesa-dev] [PATCH 08/37] glsl: add initial implementation of shader cache

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This uses disk_cache.c to write out a serialization of various
state that's required in order to successfully load and use a
binary written out by a driver's backend. This state is referred to as
"metadata" throughout the implementation.

This initial version is intended to work with vertex and fragment
shader stages only.

This patch is based on the initial work done by Carl.
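
One small piece of that metadata is the packed type encoding visible in
encode_type_to_blob() below. The following is a minimal sketch of the
same bit layout for the scalar/vector/matrix case, with a matching
decoder. toy_type and the toy_* functions are made up for the example,
and the 4-bit masks in the decoder are an assumption inferred from the
shift amounts, not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical, simplified type description; not Mesa's glsl_type. */
struct toy_type {
   uint32_t base_type;        /* small enum value, fits in 8 bits */
   uint32_t vector_elements;  /* 1..4 */
   uint32_t matrix_columns;   /* 1..4 */
};

/* Pack the three fields into one 32-bit word:
 * bits 24..31 base type, bits 4..7 vector elements, bits 0..3 columns.
 */
static uint32_t
toy_encode_type(const struct toy_type *t)
{
   return (t->base_type << 24) |
          (t->vector_elements << 4) |
          t->matrix_columns;
}

/* Reverse of toy_encode_type(). */
static struct toy_type
toy_decode_type(uint32_t encoding)
{
   struct toy_type t;

   t.base_type = encoding >> 24;
   t.vector_elements = (encoding >> 4) & 0xf;
   t.matrix_columns = encoding & 0xf;
   return t;
}
```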
---
 src/compiler/Makefile.glsl.am  |   3 +-
 src/compiler/Makefile.sources  |   4 +
 src/compiler/glsl/shader_cache.cpp | 565 +
 src/compiler/glsl/shader_cache.h   |  38 +++
 4 files changed, 609 insertions(+), 1 deletion(-)
 create mode 100644 src/compiler/glsl/shader_cache.cpp
 create mode 100644 src/compiler/glsl/shader_cache.h

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index f673196..41edb3c 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -131,7 +131,8 @@ glsl_libglsl_la_LIBADD = \
 
 glsl_libglsl_la_SOURCES =  \
$(LIBGLSL_GENERATED_FILES)  \
-   $(LIBGLSL_FILES)
+   $(LIBGLSL_FILES)\
+   $(LIBGLSL_SHADER_CACHE_FILES)
 
 glsl_libstandalone_la_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index a8bb4d3..1e8edc0 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -142,6 +142,10 @@ LIBGLSL_FILES = \
glsl/s_expression.cpp \
glsl/s_expression.h
 
+LIBGLSL_SHADER_CACHE_FILES = \
+   glsl/shader_cache.cpp \
+   glsl/shader_cache.h
+
 # glsl_compiler
 
 GLSL_COMPILER_CXX_FILES = \
diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
new file mode 100644
index 0000000..acb089c
--- /dev/null
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -0,0 +1,565 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file shader_cache.cpp
+ *
+ * GLSL shader cache implementation
+ *
+ * This uses the generic cache in disk_cache.c to implement a cache of linked
+ * shader programs.
+ */
+
+#include "blob.h"
+#include "compiler/shader_info.h"
+#include "glsl_symbol_table.h"
+#include "glsl_parser_extras.h"
+#include "ir.h"
+#include "ir_optimization.h"
+#include "ir_rvalue_visitor.h"
+#include "ir_uniform.h"
+#include "linker.h"
+#include "link_varyings.h"
+#include "main/core.h"
+#include "nir.h"
+#include "program.h"
+#include "util/disk_cache.h"
+#include "util/mesa-sha1.h"
+#include "util/string_to_uint_map.h"
+
+extern "C" {
+#include "main/enums.h"
+#include "main/shaderobj.h"
+#include "program/program.h"
+}
+
+static void
+compile_shaders(struct gl_context *ctx, struct gl_shader_program *prog) {
+   for (unsigned i = 0; i < prog->NumShaders; i++) {
+  _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
+   }
+}
+
+static void
+encode_type_to_blob(struct blob *blob, const glsl_type *type)
+{
+   uint32_t encoding;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+   case GLSL_TYPE_DOUBLE:
+  encoding = (type->base_type << 24) |
+ (type->vector_elements << 4) |
+ (type->matrix_columns);
+  break;
+   case GLSL_TYPE_SAMPLER:
+  encoding = (type->base_type) << 24 |
+ (type->sampler_dimensionality << 4) |
+ (type->sampler_shadow << 3) |
+ (type->sampler_array << 2) |
+ (type->sampled_type);
+  break;
+   case GLSL_TYPE_SUBROUTINE:
+  encoding = type->base_type << 24;
+  blob_write_uint32(blob, encoding);
+  blob_write_string(blob, type->name);
+  return;
+   case GLSL_TYPE_IMAGE:
+  encoding = (type->base_type) << 24 |
+ (type->sampler_dimensionality << 3) |
+

[Mesa-dev] [PATCH 01/37] docs: add shader cache environment variables

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

Reviewed-by: Eric Anholt 
---
 docs/envvars.html | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 276cea3..2269f18 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -114,6 +114,17 @@ glGetString(GL_VERSION) for OpenGL ES.
 glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as
 "130".  Mesa will not really implement all the features of the given language version
 if it's higher than what's normally reported. (for developers only)
+MESA_GLSL_CACHE_DISABLE - if set, disables the GLSL shader cache
+MESA_GLSL_CACHE_MAX_SIZE - if set, determines the maximum size of
+the on-disk cache of compiled GLSL programs. Should be set to a number
+optionally followed by 'K', 'M', or 'G' to specify a size in
+kilobytes, megabytes, or gigabytes. By default, gigabytes will be
+assumed. And if unset, a maximum size of 1GB will be used.
+MESA_GLSL_CACHE_DIR - if set, determines the directory to be used
+for the on-disk cache of compiled GLSL programs. If this variable is
+not set, then the cache will be stored in $XDG_CACHE_HOME/mesa (if
+that variable is set), or else within .cache/mesa within the user's
+home directory.
 MESA_GLSL - shading language compiler options
 MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
 
-- 
2.9.3
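
To make the MESA_GLSL_CACHE_MAX_SIZE rules documented by this patch
concrete, here is a minimal sketch of a parser for such a value.
toy_parse_cache_max_size() is a hypothetical helper, not the function
Mesa uses; treating the units as powers of 1024, accepting lower-case
suffixes, and falling back to the default on malformed input are all
assumptions of the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: turn a MESA_GLSL_CACHE_MAX_SIZE-style string into
 * a byte count. NULL (variable unset) or unparsable input yields 1 GB.
 */
static uint64_t
toy_parse_cache_max_size(const char *value)
{
   const uint64_t default_size = 1024ull * 1024 * 1024;
   char *end;
   unsigned long long n;

   if (value == NULL)
      return default_size;

   n = strtoull(value, &end, 10);
   if (end == value)
      return default_size;   /* no digits at all */

   switch (*end) {
   case 'K': case 'k':
      return n * 1024ull;
   case 'M': case 'm':
      return n * 1024ull * 1024;
   case '\0':                /* no suffix: gigabytes are assumed */
   case 'G': case 'g':
      return n * 1024ull * 1024 * 1024;
   default:
      return default_size;   /* unknown suffix */
   }
}
```

A caller would typically pass getenv("MESA_GLSL_CACHE_MAX_SIZE")
straight in, since getenv() returns NULL when the variable is unset.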



[Mesa-dev] [PATCH 10/37] glsl: make use of on disk shader cache

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

The hash key for glsl metadata is a hash of the hashes of each GLSL
source string.

This commit uses the put_key/get_key support in the cache to put the SHA-1
hash of the source string for each successfully compiled shader into the
cache. This allows for early, optimistic returns from glCompileShader
(if the identical source string has been successfully compiled in the past),
in the hope that the final, linked shader will be found in the cache.

This is based on the initial patch by Carl.
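
The "hash of the hashes" key can be sketched roughly as follows. To
keep the example self-contained it uses 64-bit FNV-1a as a stand-in for
SHA-1, so it is not the hash the patch uses; toy_hash(),
toy_hash_update() and toy_program_key() are hypothetical names, and
feeding the per-shader hashes in attach order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_INIT 14695981039346656037ull

/* FNV-1a, used only as a short stand-in for SHA-1 in this sketch. */
static uint64_t
toy_hash_update(uint64_t h, const void *data, size_t len)
{
   const unsigned char *p = (const unsigned char *) data;

   for (size_t i = 0; i < len; i++) {
      h ^= p[i];
      h *= 1099511628211ull;
   }
   return h;
}

static uint64_t
toy_hash(const void *data, size_t len)
{
   return toy_hash_update(TOY_HASH_INIT, data, len);
}

/* Program key: hash each source string, then hash those hashes together. */
static uint64_t
toy_program_key(const char *const *sources, size_t num_sources)
{
   uint64_t key = TOY_HASH_INIT;

   for (size_t i = 0; i < num_sources; i++) {
      uint64_t shader_hash = toy_hash(sources[i], strlen(sources[i]));
      key = toy_hash_update(key, &shader_hash, sizeof(shader_hash));
   }
   return key;
}
```

The per-shader hash is what would be recorded at compile time, so the
program key can be computed at link time without touching the source
strings again.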
---
 src/compiler/glsl/glsl_parser_extras.cpp | 18 ++
 src/compiler/glsl/linker.cpp |  7 +++
 src/mesa/program/ir_to_mesa.cpp  | 14 ++
 3 files changed, 39 insertions(+)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index f75165d..a68b76a 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -32,6 +32,8 @@
 #include "main/shaderobj.h"
 #include "util/u_atomic.h" /* for p_atomic_cmpxchg */
 #include "util/ralloc.h"
+#include "util/disk_cache.h"
+#include "util/mesa-sha1.h"
 #include "ast.h"
 #include "glsl_parser_extras.h"
 #include "glsl_parser.h"
@@ -1923,6 +1925,22 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
state->error = glcpp_preprocess(state, &source, &state->info_log,
  add_builtin_defines, state, ctx);
 
+#ifdef ENABLE_SHADER_CACHE
+   if (!force_recompile) {
+  char buf[41];
+  _mesa_sha1_compute(source, strlen(source), shader->sha1);
+  if (ctx->Cache && disk_cache_has_key(ctx->Cache, shader->sha1)) {
+ /* We've seen this shader before and know it compiles */
+ if (ctx->_Shader->Flags & GLSL_CACHE_INFO) {
+fprintf(stderr, "deferring compile of shader: %s\n",
+_mesa_sha1_format(buf, shader->sha1));
+ }
+ shader->CompileStatus = true;
+ return;
+  }
+   }
+#endif
+
if (!state->error) {
  _mesa_glsl_lexer_ctor(state, source);
  _mesa_glsl_parse(state);
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index dafa39d..f556ca6 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -73,6 +73,7 @@
 #include "program.h"
 #include "program/prog_instruction.h"
 #include "program/program.h"
+#include "util/mesa-sha1.h"
 #include "util/set.h"
 #include "util/string_to_uint_map.h"
 #include "linker.h"
@@ -81,6 +82,7 @@
 #include "ir_rvalue_visitor.h"
 #include "ir_uniform.h"
 #include "builtin_functions.h"
+#include "shader_cache.h"
 
 #include "main/shaderobj.h"
 #include "main/enums.h"
@@ -4633,6 +4635,11 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   return;
}
 
+#ifdef ENABLE_SHADER_CACHE
+   if (shader_cache_read_program_metadata(ctx, prog))
+  return;
+#endif
+
void *mem_ctx = ralloc_context(NULL); // temporary linker context
 
prog->ARB_fragment_coord_conventions_enable = false;
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 0ae797f..350c856 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -46,6 +46,7 @@
 #include "compiler/glsl_types.h"
 #include "compiler/glsl/linker.h"
 #include "compiler/glsl/program.h"
+#include "compiler/glsl/shader_cache.h"
 #include "program/prog_instruction.h"
 #include "program/prog_optimize.h"
 #include "program/prog_print.h"
@@ -3101,6 +3102,14 @@ _mesa_glsl_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
   link_shaders(ctx, prog);
}
 
+   /* FIXME: We look at prog->data->Version to determine whether we actually
+    * linked the program or just loaded the uniform metadata from cache.  We
+    * probably want to turn prog->data->LinkStatus into an enum that captures
+    * the different states.
+    */
+   if (prog->data->LinkStatus && prog->data->Version == 0)
+  return;
+
if (prog->data->LinkStatus) {
   if (!ctx->Driver.LinkShader(ctx, prog)) {
  prog->data->LinkStatus = GL_FALSE;
@@ -3117,6 +3126,11 @@ _mesa_glsl_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
  fprintf(stderr, "%s\n", prog->data->InfoLog);
   }
}
+
+#ifdef ENABLE_SHADER_CACHE
+   if (prog->data->LinkStatus)
+  shader_cache_write_program_metadata(ctx, prog);
+#endif
 }
 
 } /* extern "C" */
-- 
2.9.3



[Mesa-dev] [PATCH 12/37] glsl: fix uniform remap table cache when explicit locations used

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 32 +---
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index a6f8238..ff35f69 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -43,7 +43,7 @@
 #include "main/core.h"
 #include "nir.h"
 #include "program.h"
-#include "util/disk_cache.h"
+#include "shader_cache.h"
 #include "util/mesa-sha1.h"
 #include "util/string_to_uint_map.h"
 
@@ -265,8 +265,20 @@ write_uniform_remap_table(struct blob *metadata,
blob_write_uint32(metadata, prog->NumUniformRemapTable);
 
for (unsigned i = 0; i < prog->NumUniformRemapTable; i++) {
-  blob_write_uint32(metadata, prog->UniformRemapTable[i] -
-   prog->data->UniformStorage);
+  blob_write_uint64(metadata,
+ptr_to_uint64_t(prog->UniformRemapTable[i]));
+
+  if (prog->UniformRemapTable[i] != INACTIVE_UNIFORM_EXPLICIT_LOCATION &&
+  prog->UniformRemapTable[i] != NULL) {
+
+ /* Here we store the offset rather than calculating it on restore
+  * because gl_uniform_storage may have a different size on the
+  * platform we are restoring the cache on.
+  */
+ uint32_t offset =
+prog->UniformRemapTable[i] - prog->data->UniformStorage;
+ blob_write_uint32(metadata, offset);
+  }
}
 }
 
@@ -276,12 +288,18 @@ read_uniform_remap_table(struct blob_reader *metadata,
 {
prog->NumUniformRemapTable = blob_read_uint32(metadata);
 
-   prog->UniformRemapTable =rzalloc_array(prog, struct gl_uniform_storage *,
-  prog->NumUniformRemapTable);
+   prog->UniformRemapTable = rzalloc_array(prog, struct gl_uniform_storage *,
+   prog->NumUniformRemapTable);
 
for (unsigned i = 0; i < prog->NumUniformRemapTable; i++) {
-  prog->UniformRemapTable[i] =
- prog->data->UniformStorage + blob_read_uint32(metadata);
+  uint64_t uni_ptr = blob_read_uint64(metadata);
+  if (uni_ptr == (uint64_t) INACTIVE_UNIFORM_EXPLICIT_LOCATION ||
+  uni_ptr == (uint64_t) NULL) {
+ prog->UniformRemapTable[i] = (gl_uniform_storage *) uni_ptr;
+  } else {
+ uint32_t uni_offset = blob_read_uint32(metadata);
+ prog->UniformRemapTable[i] = prog->data->UniformStorage + uni_offset;
+  }
}
 }
 
-- 
2.9.3
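
The comment in write_uniform_remap_table() explains why an element
offset is stored instead of a pointer. A minimal sketch of that idea
follows. Note that it swaps in a small explicit tag (NULL, inactive, or
offset) rather than writing the raw pointer value as the patch does;
toy_blob, toy_uniform, TOY_INACTIVE and the toy_* helpers are all made
up for the example, and values are written in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_uniform { int location; };

/* Sentinel in the spirit of INACTIVE_UNIFORM_EXPLICIT_LOCATION. */
#define TOY_INACTIVE ((struct toy_uniform *) -1)

enum toy_tag { TOY_TAG_NULL = 0, TOY_TAG_INACTIVE = 1, TOY_TAG_OFFSET = 2 };

/* Hypothetical fixed-size buffer standing in for struct blob. */
struct toy_blob { uint8_t data[256]; size_t size; size_t pos; };

static void
toy_write_u32(struct toy_blob *b, uint32_t v)
{
   assert(b->size + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + b->size, &v, sizeof(v));
   b->size += sizeof(v);
}

static uint32_t
toy_read_u32(struct toy_blob *b)
{
   uint32_t v;
   assert(b->pos + sizeof(v) <= b->size);
   memcpy(&v, b->data + b->pos, sizeof(v));
   b->pos += sizeof(v);
   return v;
}

/* Store each slot as a tag, plus an element offset for live entries. */
static void
toy_write_remap(struct toy_blob *b, struct toy_uniform *const *table,
                uint32_t count, const struct toy_uniform *storage)
{
   toy_write_u32(b, count);
   for (uint32_t i = 0; i < count; i++) {
      if (table[i] == NULL) {
         toy_write_u32(b, TOY_TAG_NULL);
      } else if (table[i] == TOY_INACTIVE) {
         toy_write_u32(b, TOY_TAG_INACTIVE);
      } else {
         toy_write_u32(b, TOY_TAG_OFFSET);
         toy_write_u32(b, (uint32_t) (table[i] - storage));
      }
   }
}

/* Rebuild the table against a (possibly different) storage array. */
static uint32_t
toy_read_remap(struct toy_blob *b, struct toy_uniform **table,
               uint32_t max_count, struct toy_uniform *storage)
{
   uint32_t count = toy_read_u32(b);

   assert(count <= max_count);
   for (uint32_t i = 0; i < count; i++) {
      switch (toy_read_u32(b)) {
      case TOY_TAG_NULL:     table[i] = NULL; break;
      case TOY_TAG_INACTIVE: table[i] = TOY_INACTIVE; break;
      default:               table[i] = storage + toy_read_u32(b); break;
      }
   }
   return count;
}
```

Because only element indices are stored, the restored pointers are
valid even if the storage array lives at a different address or its
element size differs on the restoring side.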



[Mesa-dev] [PATCH 14/37] glsl: add shader cache support for samplers

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 4a144ea..1093726 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -215,6 +215,8 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
 prog->data->UniformStorage[i].top_level_array_size);
   blob_write_uint32(metadata,
 prog->data->UniformStorage[i].top_level_array_stride);
+  blob_write_bytes(metadata, prog->data->UniformStorage[i].opaque,
+   sizeof(prog->data->UniformStorage[i].opaque));
}
 }
 
@@ -254,6 +256,10 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
   uniforms[i].top_level_array_size = blob_read_uint32(metadata);
   uniforms[i].top_level_array_stride = blob_read_uint32(metadata);
   prog->UniformHash->put(i, uniforms[i].name);
+
+  memcpy(uniforms[i].opaque,
+ blob_read_bytes(metadata, sizeof(uniforms[i].opaque)),
+ sizeof(uniforms[i].opaque));
}
 }
 
@@ -548,6 +554,12 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader)
 sizeof(glprog->TexturesUsed));
blob_write_uint64(metadata, glprog->SamplersUsed);
 
+   blob_write_bytes(metadata, glprog->SamplerUnits,
+sizeof(glprog->SamplerUnits));
+   blob_write_bytes(metadata, glprog->sh.SamplerTargets,
+sizeof(glprog->sh.SamplerTargets));
+   blob_write_uint32(metadata, glprog->ShadowSamplers);
+
write_shader_parameters(metadata, glprog->Parameters);
 }
 
@@ -560,6 +572,12 @@ read_shader_metadata(struct blob_reader *metadata,
sizeof(glprog->TexturesUsed));
glprog->SamplersUsed = blob_read_uint64(metadata);
 
+   blob_copy_bytes(metadata, (uint8_t *) glprog->SamplerUnits,
+   sizeof(glprog->SamplerUnits));
+   blob_copy_bytes(metadata, (uint8_t *) glprog->sh.SamplerTargets,
+   sizeof(glprog->sh.SamplerTargets));
+   glprog->ShadowSamplers = blob_read_uint32(metadata);
+
glprog->Parameters = _mesa_new_parameter_list();
read_shader_parameters(metadata, glprog->Parameters);
 }
-- 
2.9.3



[Mesa-dev] [PATCH 07/37] glsl: add param to force shader recompile

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This will be used to skip checking the cache and force a recompile.
---
 src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
 src/compiler/glsl/program.h  | 2 +-
 src/compiler/glsl/standalone.cpp | 3 ++-
 src/mesa/main/shaderapi.c| 2 +-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index e888090..f75165d 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1910,7 +1910,7 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state *state)
 
 void
 _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
-  bool dump_ast, bool dump_hir)
+  bool dump_ast, bool dump_hir, bool force_recompile)
 {
struct _mesa_glsl_parse_state *state =
   new(shader) _mesa_glsl_parse_state(ctx, shader->Stage, shader);
diff --git a/src/compiler/glsl/program.h b/src/compiler/glsl/program.h
index 8f5a31b..58a7069 100644
--- a/src/compiler/glsl/program.h
+++ b/src/compiler/glsl/program.h
@@ -33,7 +33,7 @@ struct gl_shader_program;
 
 extern void
 _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
- bool dump_ast, bool dump_hir);
+ bool dump_ast, bool dump_hir, bool force_recompile);
 
 #ifdef __cplusplus
 } /* extern "C" */
diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 44f2c0f..e7ba780 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -381,7 +381,8 @@ compile_shader(struct gl_context *ctx, struct gl_shader *shader)
struct _mesa_glsl_parse_state *state =
   new(shader) _mesa_glsl_parse_state(ctx, shader->Stage, shader);
 
-   _mesa_glsl_compile_shader(ctx, shader, options->dump_ast, options->dump_hir);
+   _mesa_glsl_compile_shader(ctx, shader, options->dump_ast,
+ options->dump_hir, true);
 
/* Print out the resulting IR */
if (!state->error && options->dump_lir) {
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index c4ea79f..c31c170 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1036,7 +1036,7 @@ _mesa_compile_shader(struct gl_context *ctx, struct gl_shader *sh)
   /* this call will set the shader->CompileStatus field to indicate if
* compilation was successful.
*/
-  _mesa_glsl_compile_shader(ctx, sh, false, false);
+  _mesa_glsl_compile_shader(ctx, sh, false, false, false);
 
   if (ctx->_Shader->Flags & GLSL_LOG) {
  _mesa_write_shader_to_file(sh);
-- 
2.9.3



[Mesa-dev] [PATCH 09/37] glsl: add helper to convert pointers to uint64_t

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This will be used to store all pointers in the cache as 64-bit integers,
allowing us to avoid issues when a 32-bit program reads a cached
shader that was created by a 64-bit application.
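
For comparison, here is a minimal sketch of the same widening done via
uintptr_t, which avoids an architecture-specific #if. toy_ptr_to_u64()
is a hypothetical name, and this is an alternative formulation rather
than what the patch does: a direct pointer-to-uint64_t cast on a 32-bit
build may sign-extend with some compilers, which is presumably what the
mask in the patch guards against.

```c
#include <stdint.h>

/* Widen a pointer to 64 bits without sign extension.
 * uintptr_t is unsigned and pointer-sized, so converting it to uint64_t
 * zero-extends on 32-bit builds and is a no-op on 64-bit ones.
 */
static inline uint64_t
toy_ptr_to_u64(const void *ptr)
{
   return (uint64_t) (uintptr_t) ptr;
}
```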
---
 src/compiler/glsl/shader_cache.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.h b/src/compiler/glsl/shader_cache.h
index 8bd0a3c..1596c33 100644
--- a/src/compiler/glsl/shader_cache.h
+++ b/src/compiler/glsl/shader_cache.h
@@ -27,6 +27,16 @@
 
 #include "util/disk_cache.h"
 
+static uint64_t inline
+ptr_to_uint64_t(void *ptr)
+{
+   uint64_t ptr_int = (uint64_t) ptr;
+#if __i386__
+   ptr_int &= 0xFFFFFFFF;
+#endif
+   return ptr_int;
+}
+
 void
 shader_cache_write_program_metadata(struct gl_context *ctx,
 struct gl_shader_program *prog);
-- 
2.9.3



[Mesa-dev] [PATCH 11/37] glsl: Serialize three additional hash tables with program metadata

2017-01-23 Thread Timothy Arceri
From: Carl Worth 

The three additional tables are AttributeBindings, FragDataBindings,
and FragDataIndexBindings.

The first table (AttributeBindings) was identified as missing by
trying to test the shader cache with a program that called
glGetAttribLocation.

Many thanks to Tapani Pälli: it was review of related work that he had
done previously that pointed me to the need to also save and restore
FragDataBindings and FragDataIndexBindings.
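
The table serialization in this patch writes a placeholder count,
emits the entries while iterating, then patches the count afterwards.
A minimal sketch of that pattern is below. toy_blob, toy_entry and the
toy_* helpers are invented for the example; the buffer is fixed-size and
does no alignment or growth, and the NULL-key "empty slot" convention is
an assumption used only to show why the count is not known up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for struct blob. */
struct toy_blob {
   uint8_t data[256];
   size_t size;
};

static void
toy_write_bytes(struct toy_blob *b, const void *src, size_t n)
{
   assert(b->size + n <= sizeof(b->data));
   memcpy(b->data + b->size, src, n);
   b->size += n;
}

static void
toy_write_u32(struct toy_blob *b, uint32_t v)
{
   toy_write_bytes(b, &v, sizeof(v));
}

/* Rewrite a previously written 32-bit value in place. */
static void
toy_overwrite_u32(struct toy_blob *b, size_t offset, uint32_t v)
{
   assert(offset + sizeof(v) <= b->size);
   memcpy(b->data + offset, &v, sizeof(v));
}

struct toy_entry { const char *key; uint32_t value; };

/* Write "count, then (key, value) pairs"; returns the entries written. */
static uint32_t
toy_write_table(struct toy_blob *b, const struct toy_entry *entries, size_t n)
{
   size_t count_offset = b->size;
   uint32_t written = 0;

   toy_write_u32(b, 0);   /* placeholder for the entry count */

   for (size_t i = 0; i < n; i++) {
      if (entries[i].key == NULL)
         continue;         /* empty slot: nothing to serialize */
      toy_write_bytes(b, entries[i].key, strlen(entries[i].key) + 1);
      toy_write_u32(b, entries[i].value);
      written++;
   }

   toy_overwrite_u32(b, count_offset, written);
   return written;
}
```

The reader can then consume the count first and loop exactly that many
times, as read_hash_table() does in the patch.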
---
 src/compiler/glsl/shader_cache.cpp | 74 ++
 1 file changed, 74 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index acb089c..a6f8238 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -285,6 +285,76 @@ read_uniform_remap_table(struct blob_reader *metadata,
}
 }
 
+struct whte_closure
+{
+   struct blob *blob;
+   size_t num_entries;
+};
+
+static void
+write_hash_table_entry(const char *key, unsigned value, void *closure)
+{
+   struct whte_closure *whte = (struct whte_closure *) closure;
+
+   blob_write_string(whte->blob, key);
+   blob_write_uint32(whte->blob, value);
+
+   whte->num_entries++;
+}
+
+static void
+write_hash_table(struct blob *metadata, struct string_to_uint_map *hash)
+{
+   size_t offset;
+   struct whte_closure whte;
+
+   whte.blob = metadata;
+   whte.num_entries = 0;
+
+   offset = metadata->size;
+
+   /* Write a placeholder for the hashtable size. */
+   blob_write_uint32 (metadata, 0);
+
+   hash->iterate(write_hash_table_entry, &whte);
+
+   /* Overwrite with the computed number of entries written. */
+   blob_overwrite_uint32 (metadata, offset, whte.num_entries);
+}
+
+static void
+read_hash_table(struct blob_reader *metadata, struct string_to_uint_map *hash)
+{
+   size_t i, num_entries;
+   const char *key;
+   uint32_t value;
+
+   num_entries = blob_read_uint32 (metadata);
+
+   for (i = 0; i < num_entries; i++) {
+  key = blob_read_string(metadata);
+  value = blob_read_uint32(metadata);
+
+  hash->put(value, key);
+   }
+}
+
+static void
+write_hash_tables(struct blob *metadata, struct gl_shader_program *prog)
+{
+   write_hash_table(metadata, prog->AttributeBindings);
+   write_hash_table(metadata, prog->FragDataBindings);
+   write_hash_table(metadata, prog->FragDataIndexBindings);
+}
+
+static void
+read_hash_tables(struct blob_reader *metadata, struct gl_shader_program *prog)
+{
+   read_hash_table(metadata, prog->AttributeBindings);
+   read_hash_table(metadata, prog->FragDataBindings);
+   read_hash_table(metadata, prog->FragDataIndexBindings);
+}
+
 static void
 write_shader_parameters(struct blob *metadata,
 struct gl_program_parameter_list *params)
@@ -412,6 +482,8 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
 
write_uniforms(metadata, prog);
 
+   write_hash_tables(metadata, prog);
+
blob_write_uint32(metadata, prog->data->linked_stages);
 
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
@@ -529,6 +601,8 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
 
read_uniforms(&metadata, prog);
 
+   read_hash_tables(&metadata, prog);
+
prog->data->linked_stages = blob_read_uint32(&metadata);
 
unsigned mask = prog->data->linked_stages;
-- 
2.9.3



[Mesa-dev] [PATCH 16/37] glsl: add support for caching shaders with xfb qualifiers

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

For now this disables the shader cache when transform feedback is
enabled via the GL API, as we don't currently account for it when
generating the SHA-1 hash for the shader.
---
 src/compiler/glsl/linker.cpp   |  14 -
 src/compiler/glsl/shader_cache.cpp | 108 +
 2 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f556ca6..fa9e154 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4636,7 +4636,19 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
}
 
 #ifdef ENABLE_SHADER_CACHE
-   if (shader_cache_read_program_metadata(ctx, prog))
+   /* If transform feedback is used on the program, compile all shaders. */
+   bool skip_cache = false;
+   if (prog->TransformFeedback.NumVarying > 0) {
+  for (unsigned i = 0; i < prog->NumShaders; i++) {
+ if (prog->Shaders[i]->ir) {
+continue;
+ }
+ _mesa_glsl_compile_shader(ctx, prog->Shaders[i], false, false, true);
+  }
+  skip_cache = true;
+   }
+
+   if (!skip_cache && shader_cache_read_program_metadata(ctx, prog))
   return;
 #endif
 
diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index cc3eb84..9bbbf1f 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -190,6 +190,84 @@ decode_type_from_blob(struct blob_reader *blob)
 }
 
 static void
+write_xfb(struct blob *metadata, struct gl_shader_program *shProg)
+{
+   struct gl_program *prog = shProg->last_vert_prog;
+
+   if (!prog) {
+  blob_write_uint32(metadata, ~0u);
+  return;
+   }
+
+   struct gl_transform_feedback_info *ltf = prog->sh.LinkedTransformFeedback;
+
+   blob_write_uint32(metadata, prog->info.stage);
+
+   blob_write_uint32(metadata, ltf->NumOutputs);
+   blob_write_uint32(metadata, ltf->ActiveBuffers);
+   blob_write_uint32(metadata, ltf->NumVarying);
+
+   blob_write_bytes(metadata, ltf->Outputs,
+sizeof(struct gl_transform_feedback_output) *
+   ltf->NumOutputs);
+
+   for (int i = 0; i < ltf->NumVarying; i++) {
+  blob_write_string(metadata, ltf->Varyings[i].Name);
+  blob_write_uint32(metadata, ltf->Varyings[i].Type);
+  blob_write_uint32(metadata, ltf->Varyings[i].BufferIndex);
+  blob_write_uint32(metadata, ltf->Varyings[i].Size);
+  blob_write_uint32(metadata, ltf->Varyings[i].Offset);
+   }
+
+   blob_write_bytes(metadata, ltf->Buffers,
+sizeof(struct gl_transform_feedback_buffer) *
+   MAX_FEEDBACK_BUFFERS);
+}
+
+static void
+read_xfb(struct blob_reader *metadata, struct gl_shader_program *shProg)
+{
+   unsigned xfb_stage = blob_read_uint32(metadata);
+
+   if (xfb_stage == ~0u)
+  return;
+
+   struct gl_program *prog = shProg->_LinkedShaders[xfb_stage]->Program;
+   struct gl_transform_feedback_info *ltf =
+  rzalloc(prog, struct gl_transform_feedback_info);
+
+   prog->sh.LinkedTransformFeedback = ltf;
+   shProg->last_vert_prog = prog;
+
+   ltf->NumOutputs = blob_read_uint32(metadata);
+   ltf->ActiveBuffers = blob_read_uint32(metadata);
+   ltf->NumVarying = blob_read_uint32(metadata);
+
+   ltf->Outputs = rzalloc_array(prog, struct gl_transform_feedback_output,
+ltf->NumOutputs);
+
+   blob_copy_bytes(metadata, (uint8_t *) ltf->Outputs,
+   sizeof(struct gl_transform_feedback_output) *
+  ltf->NumOutputs);
+
+   ltf->Varyings = rzalloc_array(prog,
+ struct gl_transform_feedback_varying_info,
+ ltf->NumVarying);
+
+   for (int i = 0; i < ltf->NumVarying; i++) {
+  ltf->Varyings[i].Name = ralloc_strdup(prog, blob_read_string(metadata));
+  ltf->Varyings[i].Type = blob_read_uint32(metadata);
+  ltf->Varyings[i].BufferIndex = blob_read_uint32(metadata);
+  ltf->Varyings[i].Size = blob_read_uint32(metadata);
+  ltf->Varyings[i].Offset = blob_read_uint32(metadata);
+   }
+
+   blob_copy_bytes(metadata, (uint8_t *) ltf->Buffers,
+   sizeof(struct gl_transform_feedback_buffer) *
+  MAX_FEEDBACK_BUFFERS);
+}
+
+static void
 write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
 {
blob_write_uint32(metadata, prog->SamplersValidated);
@@ -416,6 +494,24 @@ write_program_resource_data(struct blob *metadata,
  }
   }
   break;
+   case GL_TRANSFORM_FEEDBACK_BUFFER:
+  for (unsigned i = 0; i < MAX_FEEDBACK_BUFFERS; i++) {
+ if (((gl_transform_feedback_buffer *)res->Data)->Binding ==
+ prog->last_vert_prog->sh.LinkedTransformFeedback->Buffers[i].Binding) {
+blob_write_uint32(metadata, i);
+break;
+ }
+  }
+  break;
+   case GL_TRANSFORM_FEEDBACK_VARYING:
+  for (int i = 0; i < 
prog->

[Mesa-dev] [PATCH 13/37] glsl: add basic support for resource list to shader cache

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This initially adds support for simple uniforms and varyings.
---
 src/compiler/glsl/shader_cache.cpp | 121 +
 1 file changed, 121 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index ff35f69..4a144ea 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -374,6 +374,123 @@ read_hash_tables(struct blob_reader *metadata, struct gl_shader_program *prog)
 }
 
 static void
+write_program_resource_data(struct blob *metadata,
+struct gl_shader_program *prog,
+struct gl_program_resource *res)
+{
+   switch(res->Type) {
+   case GL_PROGRAM_INPUT:
+   case GL_PROGRAM_OUTPUT: {
+  const gl_shader_variable *var = (gl_shader_variable *)res->Data;
+  blob_write_bytes(metadata, var, sizeof(gl_shader_variable));
+  encode_type_to_blob(metadata, var->type);
+
+  if (var->interface_type)
+ encode_type_to_blob(metadata, var->interface_type);
+
+  if (var->outermost_struct_type)
+ encode_type_to_blob(metadata, var->outermost_struct_type);
+
+  blob_write_string(metadata, var->name);
+  break;
+   }
+   case GL_BUFFER_VARIABLE:
+   case GL_VERTEX_SUBROUTINE_UNIFORM:
+   case GL_GEOMETRY_SUBROUTINE_UNIFORM:
+   case GL_FRAGMENT_SUBROUTINE_UNIFORM:
+   case GL_COMPUTE_SUBROUTINE_UNIFORM:
+   case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
+   case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
+   case GL_UNIFORM:
+  for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
+ if (strcmp(((gl_uniform_storage *)res->Data)->name,
+prog->data->UniformStorage[i].name) == 0) {
+blob_write_uint32(metadata, i);
+break;
+ }
+  }
+  break;
+   default:
+  assert(!"Support for writing resource not yet implemented.");
+   }
+}
+
+static void
+read_program_resource_data(struct blob_reader *metadata,
+   struct gl_shader_program *prog,
+   struct gl_program_resource *res)
+{
+   switch(res->Type) {
+   case GL_PROGRAM_INPUT:
+   case GL_PROGRAM_OUTPUT: {
+  gl_shader_variable *var = ralloc(prog, struct gl_shader_variable);
+
+  blob_copy_bytes(metadata, (uint8_t *) var, sizeof(gl_shader_variable));
+  var->type = decode_type_from_blob(metadata);
+
+  if (var->interface_type)
+ var->interface_type = decode_type_from_blob(metadata);
+
+  if (var->outermost_struct_type)
+ var->outermost_struct_type = decode_type_from_blob(metadata);
+
+  var->name = ralloc_strdup(prog, blob_read_string(metadata));
+
+  res->Data = var;
+  break;
+   }
+   case GL_BUFFER_VARIABLE:
+   case GL_VERTEX_SUBROUTINE_UNIFORM:
+   case GL_GEOMETRY_SUBROUTINE_UNIFORM:
+   case GL_FRAGMENT_SUBROUTINE_UNIFORM:
+   case GL_COMPUTE_SUBROUTINE_UNIFORM:
+   case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
+   case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
+   case GL_UNIFORM:
+  res->Data = &prog->data->UniformStorage[blob_read_uint32(metadata)];
+  break;
+   default:
+  assert(!"Support for reading resource not yet implemented.");
+   }
+}
+
+static void
+write_program_resource_list(struct blob *metadata,
+struct gl_shader_program *prog)
+{
+   blob_write_uint32(metadata, prog->data->NumProgramResourceList);
+
+   for (unsigned i = 0; i < prog->data->NumProgramResourceList; i++) {
+  blob_write_uint32(metadata, prog->data->ProgramResourceList[i].Type);
+  write_program_resource_data(metadata, prog,
+  &prog->data->ProgramResourceList[i]);
+  blob_write_bytes(metadata,
+   &prog->data->ProgramResourceList[i].StageReferences,
+   sizeof(prog->data->ProgramResourceList[i].StageReferences));
+   }
+}
+
+static void
+read_program_resource_list(struct blob_reader *metadata,
+   struct gl_shader_program *prog)
+{
+   prog->data->NumProgramResourceList = blob_read_uint32(metadata);
+
+   prog->data->ProgramResourceList =
+  ralloc_array(prog, gl_program_resource,
+   prog->data->NumProgramResourceList);
+
+   for (unsigned i = 0; i < prog->data->NumProgramResourceList; i++) {
+  prog->data->ProgramResourceList[i].Type = blob_read_uint32(metadata);
+  read_program_resource_data(metadata, prog,
+ &prog->data->ProgramResourceList[i]);
+  blob_copy_bytes(metadata,
+  (uint8_t *) &prog->data->ProgramResourceList[i].StageReferences,
+  sizeof(prog->data->ProgramResourceList[i].StageReferences));
+   }
+}
+
+static void
 write_shader_parameters(struct blob *metadata,
 struct gl_program_parameter_list *params)
 {
@@ -522,6 +639,8 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
 
write_uniform_remap_tabl

[Mesa-dev] [PATCH 05/37] mesa: add new MESA_GLSL flag for printing shader cache debug info

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
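Not part of the patch: a minimal standalone sketch of how a
MESA_GLSL-style value could be turned into flag bits with substring
matching, mirroring the strstr() checks in _mesa_get_shader_flags().
The names sketch_shader_flags, SKETCH_LOG and SKETCH_CACHE_INFO are
hypothetical, and only the 0x400 value is taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; only 0x400 matches GLSL_CACHE_INFO in the patch. */
#define SKETCH_LOG        0x2
#define SKETCH_CACHE_INFO 0x400

/* Map an option string such as "log,cache_info" to flag bits. */
unsigned
sketch_shader_flags(const char *env)
{
   unsigned flags = 0;

   if (env == NULL)
      return 0;

   /* Substring matching: any occurrence of the keyword sets the bit. */
   if (strstr(env, "log"))
      flags |= SKETCH_LOG;
   if (strstr(env, "cache_info"))
      flags |= SKETCH_CACHE_INFO;

   return flags;
}
```

Because the match is by substring, a value such as "catalog" would also
set the log bit in this sketch.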
 docs/shading.html | 1 +
 src/mesa/main/mtypes.h| 1 +
 src/mesa/main/shaderapi.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/docs/shading.html b/docs/shading.html
index b0ed249..e44035a 100644
--- a/docs/shading.html
+++ b/docs/shading.html
@@ -49,6 +49,7 @@ execution.  These are generally used for debugging.
 log - log all GLSL shaders to files.
 The filenames will be "shader_X.vert" or "shader_X.frag" where X
 the shader ID.
+cache_info - print debug information about shader cache
 nopt - disable compiler optimizations
 opt - force compiler optimizations
 uniform - print message to stdout when glUniform is called
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 1cc8322..a2280e2 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2837,6 +2837,7 @@ struct gl_shader_program
 #define GLSL_USE_PROG 0x80  /**< Log glUseProgram calls */
 #define GLSL_REPORT_ERRORS 0x100  /**< Print compilation errors */
 #define GLSL_DUMP_ON_ERROR 0x200 /**< Dump shaders to stderr on compile error */
+#define GLSL_CACHE_INFO 0x400 /**< Print debug information about shader cache */
 
 
 /**
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 3313fa2..c4ea79f 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -77,6 +77,8 @@ _mesa_get_shader_flags(void)
  flags |= GLSL_DUMP;
   if (strstr(env, "log"))
  flags |= GLSL_LOG;
+  if (strstr(env, "cache_info"))
+ flags |= GLSL_CACHE_INFO;
   if (strstr(env, "nopvert"))
  flags |= GLSL_NOP_VERT;
   if (strstr(env, "nopfrag"))
-- 
2.9.3



[Mesa-dev] [PATCH 19/37] glsl: add shader cache support for buffer blocks

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 163 +
 1 file changed, 163 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 9ed1c4e..29fc70a 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -252,6 +252,141 @@ read_subroutines(struct blob_reader *metadata, struct gl_shader_program *prog)
 }
 
 static void
+write_buffer_block(struct blob *metadata, struct gl_uniform_block *b)
+{
+   blob_write_string(metadata, b->Name);
+   blob_write_uint32(metadata, b->NumUniforms);
+   blob_write_uint32(metadata, b->Binding);
+   blob_write_uint32(metadata, b->UniformBufferSize);
+   blob_write_uint32(metadata, b->stageref);
+
+   for (unsigned j = 0; j < b->NumUniforms; j++) {
+  blob_write_string(metadata, b->Uniforms[j].Name);
+  blob_write_string(metadata, b->Uniforms[j].IndexName);
+  encode_type_to_blob(metadata, b->Uniforms[j].Type);
+  blob_write_uint32(metadata, b->Uniforms[j].Offset);
+   }
+}
+
+static void
+write_buffer_blocks(struct blob *metadata, struct gl_shader_program *prog)
+{
+   blob_write_uint32(metadata, prog->data->NumUniformBlocks);
+   blob_write_uint32(metadata, prog->data->NumShaderStorageBlocks);
+
+   for (unsigned i = 0; i < prog->data->NumUniformBlocks; i++) {
+  write_buffer_block(metadata, &prog->data->UniformBlocks[i]);
+   }
+
+   for (unsigned i = 0; i < prog->data->NumShaderStorageBlocks; i++) {
+  write_buffer_block(metadata, &prog->data->ShaderStorageBlocks[i]);
+   }
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (!sh)
+ continue;
+
+  struct gl_program *glprog = sh->Program;
+
+  blob_write_uint32(metadata, glprog->info.num_ubos);
+  blob_write_uint32(metadata, glprog->info.num_ssbos);
+
+  for (unsigned j = 0; j < glprog->info.num_ubos; j++) {
+ uint32_t offset =
+glprog->sh.UniformBlocks[j] - prog->data->UniformBlocks;
+ blob_write_uint32(metadata, offset);
+  }
+
+  for (unsigned j = 0; j < glprog->info.num_ssbos; j++) {
+ uint32_t offset = glprog->sh.ShaderStorageBlocks[j] -
+prog->data->ShaderStorageBlocks;
+ blob_write_uint32(metadata, offset);
+  }
+   }
+}
+
+static void
+read_buffer_block(struct blob_reader *metadata, struct gl_uniform_block *b,
+  struct gl_shader_program *prog)
+{
+  b->Name = ralloc_strdup(prog->data, blob_read_string (metadata));
+  b->NumUniforms = blob_read_uint32(metadata);
+  b->Binding = blob_read_uint32(metadata);
+  b->UniformBufferSize = blob_read_uint32(metadata);
+  b->stageref = blob_read_uint32(metadata);
+
+  b->Uniforms =
+ rzalloc_array(prog->data, struct gl_uniform_buffer_variable,
+   b->NumUniforms);
+  for (unsigned j = 0; j < b->NumUniforms; j++) {
+ b->Uniforms[j].Name = ralloc_strdup(prog->data,
+ blob_read_string (metadata));
+
+ char *index_name = blob_read_string(metadata);
+ if (strcmp(b->Uniforms[j].Name, index_name) == 0) {
+b->Uniforms[j].IndexName = b->Uniforms[j].Name;
+ } else {
+b->Uniforms[j].IndexName = ralloc_strdup(prog->data, index_name);
+ }
+
+ b->Uniforms[j].Type = decode_type_from_blob(metadata);
+ b->Uniforms[j].Offset = blob_read_uint32(metadata);
+  }
+}
+
+static void
+read_buffer_blocks(struct blob_reader *metadata,
+   struct gl_shader_program *prog)
+{
+   prog->data->NumUniformBlocks = blob_read_uint32(metadata);
+   prog->data->NumShaderStorageBlocks = blob_read_uint32(metadata);
+
+   prog->data->UniformBlocks =
+  rzalloc_array(prog->data, struct gl_uniform_block,
+prog->data->NumUniformBlocks);
+
+   prog->data->ShaderStorageBlocks =
+  rzalloc_array(prog->data, struct gl_uniform_block,
+prog->data->NumShaderStorageBlocks);
+
+   for (unsigned i = 0; i < prog->data->NumUniformBlocks; i++) {
+  read_buffer_block(metadata, &prog->data->UniformBlocks[i], prog);
+   }
+
+   for (unsigned i = 0; i < prog->data->NumShaderStorageBlocks; i++) {
+  read_buffer_block(metadata, &prog->data->ShaderStorageBlocks[i], prog);
+   }
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (!sh)
+ continue;
+
+  struct gl_program *glprog = sh->Program;
+
+  glprog->info.num_ubos = blob_read_uint32(metadata);
+  glprog->info.num_ssbos = blob_read_uint32(metadata);
+
+  glprog->sh.UniformBlocks =
+ rzalloc_array(glprog, gl_uniform_block *, glprog->info.num_ubos);
+  glprog->sh.ShaderStorageBlocks =
+ rzalloc_array(glprog, gl_uniform_block *, glprog->info.num_ssbos);
+
+  for (unsigned 

[Mesa-dev] [PATCH 17/37] glsl: add support for caching subroutines

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 107 +
 1 file changed, 107 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 9bbbf1f..b5468a9 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -190,6 +190,68 @@ decode_type_from_blob(struct blob_reader *blob)
 }
 
 static void
+write_subroutines(struct blob *metadata, struct gl_shader_program *prog)
+{
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (!sh)
+ continue;
+
+  struct gl_program *glprog = sh->Program;
+
+  blob_write_uint32(metadata, glprog->sh.NumSubroutineUniforms);
+  blob_write_uint32(metadata, glprog->sh.MaxSubroutineFunctionIndex);
+  blob_write_uint32(metadata, glprog->sh.NumSubroutineFunctions);
+  for (unsigned j = 0; j < glprog->sh.NumSubroutineFunctions; j++) {
+ int num_types = glprog->sh.SubroutineFunctions[j].num_compat_types;
+
+ blob_write_string(metadata, glprog->sh.SubroutineFunctions[j].name);
+ blob_write_uint32(metadata, glprog->sh.SubroutineFunctions[j].index);
+ blob_write_uint32(metadata, num_types);
+
+ for (int k = 0; k < num_types; k++) {
+encode_type_to_blob(metadata,
+glprog->sh.SubroutineFunctions[j].types[k]);
+ }
+  }
+   }
+}
+
+static void
+read_subroutines(struct blob_reader *metadata, struct gl_shader_program *prog)
+{
+   struct gl_subroutine_function *subs;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (!sh)
+ continue;
+
+  struct gl_program *glprog = sh->Program;
+
+  glprog->sh.NumSubroutineUniforms = blob_read_uint32(metadata);
+  glprog->sh.MaxSubroutineFunctionIndex = blob_read_uint32(metadata);
+  glprog->sh.NumSubroutineFunctions = blob_read_uint32(metadata);
+
+  subs = rzalloc_array(prog, struct gl_subroutine_function,
+   glprog->sh.NumSubroutineFunctions);
+  glprog->sh.SubroutineFunctions = subs;
+
+  for (unsigned j = 0; j < glprog->sh.NumSubroutineFunctions; j++) {
+ subs[j].name = ralloc_strdup(prog, blob_read_string (metadata));
+ subs[j].index = (int) blob_read_uint32(metadata);
+ subs[j].num_compat_types = (int) blob_read_uint32(metadata);
+
+ subs[j].types = rzalloc_array(prog, const struct glsl_type *,
+   subs[j].num_compat_types);
+ for (int k = 0; k < subs[j].num_compat_types; k++) {
+subs[j].types[k] = decode_type_from_blob(metadata);
+ }
+  }
+   }
+}
+
+static void
 write_xfb(struct blob *metadata, struct gl_shader_program *shProg)
 {
struct gl_program *prog = shProg->last_vert_prog;
@@ -458,10 +520,28 @@ read_hash_tables(struct blob_reader *metadata, struct gl_shader_program *prog)
 }
 
 static void
+write_shader_subroutine_index(struct blob *metadata,
+  struct gl_linked_shader *sh,
+  struct gl_program_resource *res)
+{
+   assert(sh);
+
+   for (unsigned j = 0; j < sh->Program->sh.NumSubroutineFunctions; j++) {
+  if (strcmp(((gl_subroutine_function *)res->Data)->name,
+ sh->Program->sh.SubroutineFunctions[j].name) == 0) {
+ blob_write_uint32(metadata, j);
+ break;
+  }
+   }
+}
+
+static void
 write_program_resource_data(struct blob *metadata,
 struct gl_shader_program *prog,
 struct gl_program_resource *res)
 {
+   struct gl_linked_shader *sh;
+
switch(res->Type) {
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT: {
@@ -512,6 +592,16 @@ write_program_resource_data(struct blob *metadata,
  }
   }
   break;
+   case GL_VERTEX_SUBROUTINE:
+   case GL_TESS_CONTROL_SUBROUTINE:
+   case GL_TESS_EVALUATION_SUBROUTINE:
+   case GL_GEOMETRY_SUBROUTINE:
+   case GL_FRAGMENT_SUBROUTINE:
+   case GL_COMPUTE_SUBROUTINE:
+  sh =
+ prog->_LinkedShaders[_mesa_shader_stage_from_subroutine(res->Type)];
+  write_shader_subroutine_index(metadata, sh, res);
+  break;
default:
   assert(!"Support for writting resource not yet implemented.");
}
@@ -522,6 +612,8 @@ read_program_resource_data(struct blob_reader *metadata,
struct gl_shader_program *prog,
struct gl_program_resource *res)
 {
+   struct gl_linked_shader *sh;
+
switch(res->Type) {
case GL_PROGRAM_INPUT:
case GL_PROGRAM_OUTPUT: {
@@ -559,6 +651,17 @@ read_program_resource_data(struct blob_reader *metadata,
   res->Data = &prog->last_vert_prog->
  sh.LinkedTransformFeedback->Varyings[blob_read_uint32(metadata)];
   break;
+   case GL_VERTEX_SUBROUTINE:
+   case 

[Mesa-dev] [PATCH 18/37] glsl: store subroutine remap table in shader cache

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 57 ++
 1 file changed, 51 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index b5468a9..9ed1c4e 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -405,8 +405,8 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
 
 
 static void
-write_uniform_remap_table(struct blob *metadata,
-  struct gl_shader_program *prog)
+write_uniform_remap_tables(struct blob *metadata,
+   struct gl_shader_program *prog)
 {
blob_write_uint32(metadata, prog->NumUniformRemapTable);
 
@@ -426,11 +426,31 @@ write_uniform_remap_table(struct blob *metadata,
  blob_write_uint32(metadata, offset);
   }
}
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (sh) {
+ struct gl_program *glprog = sh->Program;
+ blob_write_uint32(metadata, glprog->sh.NumSubroutineUniformRemapTable);
+
+ for (unsigned j = 0; j < glprog->sh.NumSubroutineUniformRemapTable; j++) {
+struct gl_uniform_storage *sr =
+   glprog->sh.SubroutineUniformRemapTable[j];
+
+blob_write_uint64(metadata, ptr_to_uint64_t(sr));
+
+if (sr != INACTIVE_UNIFORM_EXPLICIT_LOCATION && sr != NULL) {
+   uint32_t offset = sr - prog->data->UniformStorage;
+   blob_write_uint32(metadata, offset);
+}
+ }
+  }
+   }
 }
 
 static void
-read_uniform_remap_table(struct blob_reader *metadata,
- struct gl_shader_program *prog)
+read_uniform_remap_tables(struct blob_reader *metadata,
+  struct gl_shader_program *prog)
 {
prog->NumUniformRemapTable = blob_read_uint32(metadata);
 
@@ -447,6 +467,31 @@ read_uniform_remap_table(struct blob_reader *metadata,
  prog->UniformRemapTable[i] = prog->data->UniformStorage + uni_offset;
   }
}
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+  if (sh) {
+ struct gl_program *glprog = sh->Program;
+ glprog->sh.NumSubroutineUniformRemapTable = blob_read_uint32(metadata);
+
+ glprog->sh.SubroutineUniformRemapTable =
+rzalloc_array(glprog, struct gl_uniform_storage *,
+  glprog->sh.NumSubroutineUniformRemapTable);
+
+ for (unsigned j = 0; j < glprog->sh.NumSubroutineUniformRemapTable; j++) {
+uint64_t uni_ptr = blob_read_uint64(metadata);
+if (uni_ptr == (uint64_t) INACTIVE_UNIFORM_EXPLICIT_LOCATION ||
+uni_ptr == (uint64_t) NULL) {
+   glprog->sh.SubroutineUniformRemapTable[j] =
+  (gl_uniform_storage *) uni_ptr;
+} else {
+   uint32_t uni_offset = blob_read_uint32(metadata);
+   glprog->sh.SubroutineUniformRemapTable[j] =
+  prog->data->UniformStorage + uni_offset;
+}
+ }
+  }
+   }
 }
 
 struct whte_closure
@@ -864,7 +909,7 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
 
write_xfb(metadata, prog);
 
-   write_uniform_remap_table(metadata, prog);
+   write_uniform_remap_tables(metadata, prog);
 
write_subroutines(metadata, prog);
 
@@ -980,7 +1025,7 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
 
read_xfb(&metadata, prog);
 
-   read_uniform_remap_table(&metadata, prog);
+   read_uniform_remap_tables(&metadata, prog);
 
read_subroutines(&metadata, prog);
 
-- 
2.9.3



[Mesa-dev] [PATCH 15/37] glsl: skip linking when current program has been retrieved from cache

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

The scenario: a program is linked for the first time and we cache its
metadata, then glLinkProgram() is called a second time. Since the
program metadata is now retrieved from the cache, we need to skip
linking.
---
 src/compiler/glsl/shader_cache.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 1093726..cc3eb84 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -788,6 +788,7 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
   return false;
}
 
+   prog->data->Version = 0; /* This is used to flag a shader retrieved from cache */
prog->data->LinkStatus = true;
 
free (buffer);
-- 
2.9.3



[Mesa-dev] [PATCH 20/37] glsl: add support for caching atomic buffers

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 89 ++
 1 file changed, 89 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 29fc70a..9db4f25 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -387,6 +387,79 @@ read_buffer_blocks(struct blob_reader *metadata,
 }
 
 static void
+write_atomic_buffers(struct blob *metadata, struct gl_shader_program *prog)
+{
+   blob_write_uint32(metadata, prog->data->NumAtomicBuffers);
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (prog->_LinkedShaders[i]) {
+ struct gl_program *glprog = prog->_LinkedShaders[i]->Program;
+ blob_write_uint32(metadata, glprog->info.num_abos);
+  }
+   }
+
+   for (unsigned i = 0; i < prog->data->NumAtomicBuffers; i++) {
+  blob_write_uint32(metadata, prog->data->AtomicBuffers[i].Binding);
+  blob_write_uint32(metadata, prog->data->AtomicBuffers[i].MinimumSize);
+  blob_write_uint32(metadata, prog->data->AtomicBuffers[i].NumUniforms);
+
+  blob_write_bytes(metadata, prog->data->AtomicBuffers[i].StageReferences,
+   sizeof(prog->data->AtomicBuffers[i].StageReferences));
+
+  for (unsigned j = 0; j < prog->data->AtomicBuffers[i].NumUniforms; j++) {
+ blob_write_uint32(metadata, prog->data->AtomicBuffers[i].Uniforms[j]);
+  }
+   }
+}
+
+static void
+read_atomic_buffers(struct blob_reader *metadata,
+ struct gl_shader_program *prog)
+{
+   prog->data->NumAtomicBuffers = blob_read_uint32(metadata);
+   prog->data->AtomicBuffers =
+  rzalloc_array(prog, gl_active_atomic_buffer,
+prog->data->NumAtomicBuffers);
+
+   struct gl_active_atomic_buffer **stage_buff_list[MESA_SHADER_STAGES];
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (prog->_LinkedShaders[i]) {
+ struct gl_program *glprog = prog->_LinkedShaders[i]->Program;
+
+ glprog->info.num_abos = blob_read_uint32(metadata);
+ glprog->sh.AtomicBuffers =
+rzalloc_array(glprog, gl_active_atomic_buffer *,
+  glprog->info.num_abos);
+ stage_buff_list[i] = glprog->sh.AtomicBuffers;
+  }
+   }
+
+   for (unsigned i = 0; i < prog->data->NumAtomicBuffers; i++) {
+  prog->data->AtomicBuffers[i].Binding = blob_read_uint32(metadata);
+  prog->data->AtomicBuffers[i].MinimumSize = blob_read_uint32(metadata);
+  prog->data->AtomicBuffers[i].NumUniforms = blob_read_uint32(metadata);
+
+  blob_copy_bytes(metadata,
+  (uint8_t *) &prog->data->AtomicBuffers[i].StageReferences,
+  sizeof(prog->data->AtomicBuffers[i].StageReferences));
+
+  prog->data->AtomicBuffers[i].Uniforms = rzalloc_array(prog, unsigned,
+ prog->data->AtomicBuffers[i].NumUniforms);
+
+  for (unsigned j = 0; j < prog->data->AtomicBuffers[i].NumUniforms; j++) {
+ prog->data->AtomicBuffers[i].Uniforms[j] = blob_read_uint32(metadata);
+  }
+
+  for (unsigned j = 0; j < MESA_SHADER_STAGES; j++) {
+ if (prog->data->AtomicBuffers[i].StageReferences[j]) {
+*stage_buff_list[j] = &prog->data->AtomicBuffers[i];
+stage_buff_list[j]++;
+ }
+  }
+   }
+}
+
+static void
 write_xfb(struct blob *metadata, struct gl_shader_program *shProg)
 {
struct gl_program *prog = shProg->last_vert_prog;
@@ -772,6 +845,15 @@ write_program_resource_data(struct blob *metadata,
  }
   }
   break;
+   case GL_ATOMIC_COUNTER_BUFFER:
+  for (unsigned i = 0; i < prog->data->NumAtomicBuffers; i++) {
+ if (((gl_active_atomic_buffer *)res->Data)->Binding ==
+ prog->data->AtomicBuffers[i].Binding) {
+blob_write_uint32(metadata, i);
+break;
+ }
+  }
+  break;
case GL_TRANSFORM_FEEDBACK_BUFFER:
   for (unsigned i = 0; i < MAX_FEEDBACK_BUFFERS; i++) {
  if (((gl_transform_feedback_buffer *)res->Data)->Binding ==
@@ -847,6 +929,9 @@ read_program_resource_data(struct blob_reader *metadata,
case GL_UNIFORM:
   res->Data = &prog->data->UniformStorage[blob_read_uint32(metadata)];
   break;
+   case GL_ATOMIC_COUNTER_BUFFER:
+  res->Data = &prog->data->AtomicBuffers[blob_read_uint32(metadata)];
+  break;
case GL_TRANSFORM_FEEDBACK_BUFFER:
   res->Data = &prog->last_vert_prog->
  sh.LinkedTransformFeedback->Buffers[blob_read_uint32(metadata)];
@@ -1070,6 +1155,8 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
 
write_uniform_remap_tables(metadata, prog);
 
+   write_atomic_buffers(metadata, prog);
+
write_buffer_blocks(metadata, prog);
 
write_subroutines(metadata, prog);
@@ -1188,6 +1275,8 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
 
read_uniform_remap_tables(&metadata, prog);
 
+   read_atomic_buffers(&metadata

[Mesa-dev] [PATCH 21/37] glsl: cache some more image metadata

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/shader_cache.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 9db4f25..358eb4f 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1056,6 +1056,11 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader)
 sizeof(glprog->sh.SamplerTargets));
blob_write_uint32(metadata, glprog->ShadowSamplers);
 
+   blob_write_bytes(metadata, glprog->sh.ImageAccess,
+sizeof(glprog->sh.ImageAccess));
+   blob_write_bytes(metadata, glprog->sh.ImageUnits,
+sizeof(glprog->sh.ImageUnits));
+
write_shader_parameters(metadata, glprog->Parameters);
 }
 
@@ -1074,6 +1079,11 @@ read_shader_metadata(struct blob_reader *metadata,
sizeof(glprog->sh.SamplerTargets));
glprog->ShadowSamplers = blob_read_uint32(metadata);
 
+   blob_copy_bytes(metadata, (uint8_t *) glprog->sh.ImageAccess,
+   sizeof(glprog->sh.ImageAccess));
+   blob_copy_bytes(metadata, (uint8_t *) glprog->sh.ImageUnits,
+   sizeof(glprog->sh.ImageUnits));
+
glprog->Parameters = _mesa_new_parameter_list();
read_shader_parameters(metadata, glprog->Parameters);
 }
-- 
2.9.3



[Mesa-dev] [PATCH 22/37] glsl: make uniform values helper available for use elsewhere

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/link_uniforms.cpp | 2 +-
 src/compiler/glsl/linker.h  | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 8930d26..41fd79a 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -45,7 +45,7 @@
 /**
  * Count the backing storage requirements for a type
  */
-static unsigned
+unsigned
 values_for_type(const glsl_type *type)
 {
if (type->is_sampler()) {
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index 9841ef0..abcfdb1 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -76,6 +76,9 @@ void
 validate_interstage_uniform_blocks(struct gl_shader_program *prog,
gl_linked_shader **stages);
 
+unsigned
+values_for_type(const glsl_type *type);
+
 extern void
 link_assign_atomic_counter_resources(struct gl_context *ctx,
  struct gl_shader_program *prog);
-- 
2.9.3



[Mesa-dev] [PATCH 27/37] glsl: don't reference shader prog data during cache fallback

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

We already have a reference.
---
 src/compiler/glsl/linker.cpp | 3 ++-
 src/mesa/main/shaderobj.c| 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index fa9e154..d34ec97 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2197,7 +2197,8 @@ link_intrastage_shaders(void *mem_ctx,
   return NULL;
}
 
-   _mesa_reference_shader_program_data(ctx, &gl_prog->sh.data, prog->data);
+   if (!prog->data->cache_fallback)
+  _mesa_reference_shader_program_data(ctx, &gl_prog->sh.data, prog->data);
 
/* Don't use _mesa_reference_program() just take ownership */
linked->Program = gl_prog;
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index 6ddccd2..a8d3f5a 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -433,7 +433,8 @@ _mesa_delete_shader_program(struct gl_context *ctx,
 struct gl_shader_program *shProg)
 {
_mesa_free_shader_program_data(ctx, shProg);
-   _mesa_reference_shader_program_data(ctx, &shProg->data, NULL);
+   if (!shProg->data->cache_fallback)
+  _mesa_reference_shader_program_data(ctx, &shProg->data, NULL);
ralloc_free(shProg);
 }
 
-- 
2.9.3



[Mesa-dev] [PATCH 26/37] glsl: make a copy of the shader source for use with cache fallback

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

A number of things can happen that change the shader source after it is
compiled or linked.

For example:
- Source changed after it is first compiled
- Source changed after linking
- Shader detached after linking

In order to be able to fall back to a full rebuild on a cache miss we
make a copy of the shader source and store it in the new FallbackShaders
field when linking.
---
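Not part of the patch: a minimal sketch of the idea described above,
assuming plain malloc() instead of ralloc. It takes a private copy of
each shader's stage and source so that later edits to the original
strings cannot affect a fallback rebuild. struct shader_snapshot,
snapshot_shaders() and free_snapshots() are hypothetical names and are
far simpler than struct gl_shader.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much smaller than struct gl_shader. */
struct shader_snapshot {
   int   stage;
   char *source; /* private copy, owned by the snapshot */
};

/* Release the first 'count' snapshots and the array itself. */
void
free_snapshots(struct shader_snapshot *snaps, unsigned count)
{
   if (snaps == NULL)
      return;
   for (unsigned i = 0; i < count; i++)
      free(snaps[i].source);
   free(snaps);
}

/* Copy stage and source of each shader; returns NULL on allocation failure. */
struct shader_snapshot *
snapshot_shaders(const int *stages, const char *const *sources,
                 unsigned count)
{
   assert(count > 0 && stages != NULL && sources != NULL);

   struct shader_snapshot *snaps = calloc(count, sizeof(*snaps));
   if (snaps == NULL)
      return NULL;

   for (unsigned i = 0; i < count; i++) {
      size_t len = strlen(sources[i]) + 1; /* include the terminator */

      snaps[i].stage = stages[i];
      snaps[i].source = malloc(len);
      if (snaps[i].source == NULL) {
         free_snapshots(snaps, i); /* only the first i copies exist */
         return NULL;
      }
      memcpy(snaps[i].source, sources[i], len);
   }
   return snaps;
}
```

The copy is deep on purpose: keeping only a pointer to the original
source would not survive the application replacing or freeing it.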
 src/compiler/glsl/shader_cache.cpp | 29 +
 src/mesa/main/mtypes.h |  2 ++
 src/mesa/main/shaderobj.c  |  4 
 src/mesa/program/ir_to_mesa.cpp|  8 +++-
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 1179c12..6bb94e7 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1242,6 +1242,35 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
if (!cache || prog->data->cache_fallback)
   return false;
 
+   /* Free previous fallback information */
+   if (prog->data->FallbackShaders == NULL) {
+  prog->data->NumFallbackShaders = 0;
+  for (unsigned i = 0; i < prog->data->NumFallbackShaders; i++) {
+ ralloc_free(prog->data->FallbackShaders);
+ prog->data->FallbackShaders = NULL;
+  }
+   }
+
+   /* Shaders could be recompiled using different source code after linking,
+* or the shader could be detached from the program so store some
+* information about the shader to be used in case of fallback.
+*/
+   prog->data->NumFallbackShaders = prog->NumShaders;
+   prog->data->FallbackShaders = (struct gl_shader **)
+  reralloc(NULL, prog->data->FallbackShaders, struct gl_shader *,
+   prog->NumShaders);
+   for (unsigned i = 0; i < prog->NumShaders; i++) {
+  prog->data->FallbackShaders[i] = rzalloc(prog->data->FallbackShaders,
+   struct gl_shader);
+  memcpy(prog->data->FallbackShaders[i]->sha1, prog->Shaders[i]->sha1,
+ sizeof(prog->Shaders[i]->sha1));
+  prog->data->FallbackShaders[i]->Stage = prog->Shaders[i]->Stage;
+  prog->data->FallbackShaders[i]->Source =
+ ralloc_strdup(prog->data->FallbackShaders, prog->Shaders[i]->Source);
+  prog->data->FallbackShaders[i]->InfoLog =
+ ralloc_strdup(prog->data->FallbackShaders, "");
+   }
+
for (unsigned i = 0; i < prog->NumShaders; i++) {
   if (prog->Shaders[i]->Stage == MESA_SHADER_COMPUTE) {
  compile_shaders(ctx, prog);
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 2c1e11b..b11b287 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2674,6 +2674,8 @@ struct gl_shader_program_data
union gl_constant_value *UniformDataSlots;
 
bool cache_fallback;
+   GLuint NumFallbackShaders;
+   struct gl_shader **FallbackShaders; /**< Shaders used for cache fallback */
 
/** List of all active resources after linking. */
struct gl_program_resource *ProgramResourceList;
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index b41137f..6ddccd2 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -404,10 +404,14 @@ _mesa_free_shader_program_data(struct gl_context *ctx,
   _mesa_reference_shader(ctx, &shProg->Shaders[i], NULL);
}
shProg->NumShaders = 0;
+   shProg->data->NumFallbackShaders = 0;
 
free(shProg->Shaders);
shProg->Shaders = NULL;
 
+   ralloc_free(shProg->data->FallbackShaders);
+   shProg->data->FallbackShaders = NULL;
+
/* Transform feedback varying vars */
for (i = 0; i < shProg->TransformFeedback.NumVarying; i++) {
   free(shProg->TransformFeedback.VaryingNames[i]);
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 350c856..debd58d 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -3128,8 +3128,14 @@ _mesa_glsl_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
}
 
 #ifdef ENABLE_SHADER_CACHE
-   if (prog->data->LinkStatus)
+   if (prog->data->LinkStatus && !prog->data->cache_fallback) {
+  if (prog->data->FallbackShaders) {
+ prog->data->NumFallbackShaders = 0;
+ ralloc_free(prog->data->FallbackShaders);
+ prog->data->FallbackShaders = NULL;
+  }
   shader_cache_write_program_metadata(ctx, prog);
+   }
 #endif
 }
 
-- 
2.9.3



[Mesa-dev] [PATCH 28/37] glsl: don't lose uniform values when falling back to full compile

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

Here we skip the recreation of uniform storage if we are relinking
after a cache miss. This is important because uniform values may
have already been set by the application and we don't want to reset
them.
---
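Not part of the patch: a minimal sketch of the behaviour described
above, assuming calloc() in place of rzalloc_array(). Storage is
reallocated and zeroed on a normal link, but left untouched when the
relink is a fallback after a cache miss. struct program_state and
prepare_uniform_storage() are hypothetical stand-ins, not Mesa types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, far simpler than struct gl_shader_program_data. */
struct program_state {
   float   *uniform_values;
   unsigned num_values;
   bool     cache_fallback;
};

/* Prepare uniform storage before a (re)link; returns false on OOM. */
bool
prepare_uniform_storage(struct program_state *p, unsigned num_values)
{
   assert(p != NULL);

   /* Fallback relink: keep the values the application already set. */
   if (p->cache_fallback && p->uniform_values != NULL)
      return true;

   /* Normal link: drop the old storage and start from zeroed values. */
   free(p->uniform_values);
   p->uniform_values = NULL;
   p->num_values = 0;

   if (num_values == 0)
      return true;

   p->uniform_values = calloc(num_values, sizeof(float));
   if (p->uniform_values == NULL)
      return false;

   p->num_values = num_values;
   return true;
}
```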
 src/compiler/glsl/link_uniforms.cpp | 31 +++
 src/mesa/main/shaderobj.c   |  4 ++--
 2 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/link_uniforms.cpp b/src/compiler/glsl/link_uniforms.cpp
index 41fd79a..48d9db1 100644
--- a/src/compiler/glsl/link_uniforms.cpp
+++ b/src/compiler/glsl/link_uniforms.cpp
@@ -1213,11 +1213,17 @@ link_assign_uniform_storage(struct gl_context *ctx,
 
unsigned int boolean_true = ctx->Const.UniformBooleanTrue;
 
-   prog->data->UniformStorage = rzalloc_array(prog, struct gl_uniform_storage,
-  prog->data->NumUniformStorage);
-   union gl_constant_value *data = rzalloc_array(prog->data->UniformStorage,
- union gl_constant_value,
- num_data_slots);
+   union gl_constant_value *data;
+   if (prog->data->UniformStorage == NULL) {
+  prog->data->UniformStorage = rzalloc_array(prog,
+ struct gl_uniform_storage,
+ prog->data->NumUniformStorage);
+  data = rzalloc_array(prog->data->UniformStorage,
+   union gl_constant_value, num_data_slots);
+   } else {
+  data = prog->data->UniformDataSlots;
+   }
+
 #ifndef NDEBUG
union gl_constant_value *data_end = &data[num_data_slots];
 #endif
@@ -1252,6 +1258,13 @@ link_assign_uniform_storage(struct gl_context *ctx,
  sizeof(prog->_LinkedShaders[i]->Program->sh.SamplerTargets));
}
 
+   /* If this is a fallback compile for a cache miss we already have the
+* correct uniform mappings and we don't want to reinitialise uniforms so
+* just return now.
+*/
+   if (prog->data->cache_fallback)
+  return;
+
 #ifndef NDEBUG
for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
   assert(prog->data->UniformStorage[i].storage != NULL ||
@@ -1276,9 +1289,11 @@ void
 link_assign_uniform_locations(struct gl_shader_program *prog,
   struct gl_context *ctx)
 {
-   ralloc_free(prog->data->UniformStorage);
-   prog->data->UniformStorage = NULL;
-   prog->data->NumUniformStorage = 0;
+   if (!prog->data->cache_fallback) {
+  ralloc_free(prog->data->UniformStorage);
+  prog->data->UniformStorage = NULL;
+  prog->data->NumUniformStorage = 0;
+   }
 
if (prog->UniformHash != NULL) {
   prog->UniformHash->clear();
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index a8d3f5a..4804041 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -326,7 +326,7 @@ _mesa_clear_shader_program_data(struct gl_context *ctx,
 
shProg->data->linked_stages = 0;
 
-   if (shProg->data->UniformStorage) {
+   if (shProg->data->UniformStorage && !shProg->data->cache_fallback) {
   for (unsigned i = 0; i < shProg->data->NumUniformStorage; ++i)
  _mesa_uniform_detach_all_driver_storage(&shProg->data->
 UniformStorage[i]);
@@ -335,7 +335,7 @@ _mesa_clear_shader_program_data(struct gl_context *ctx,
   shProg->data->UniformStorage = NULL;
}
 
-   if (shProg->UniformRemapTable) {
+   if (shProg->UniformRemapTable && !shProg->data->cache_fallback) {
   ralloc_free(shProg->UniformRemapTable);
   shProg->NumUniformRemapTable = 0;
   shProg->UniformRemapTable = NULL;
-- 
2.9.3
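
The allocate-or-reuse behaviour in the patch above can be sketched in isolation. This is a minimal illustration only, assuming a simplified `struct program_data` with a plain `int` array standing in for `UniformStorage`, and `calloc`/`free` in place of ralloc; the names `clear_for_link` and `get_storage` are hypothetical and are not Mesa functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for gl_shader_program_data. */
struct program_data {
   bool cache_fallback;
   unsigned num_slots;
   int *slots; /* stands in for UniformStorage */
};

/* Drop state left over from a previous link, except during a fallback
 * relink, where the storage restored from the cache must be preserved. */
static void
clear_for_link(struct program_data *d)
{
   if (d->cache_fallback)
      return;

   free(d->slots);
   d->slots = NULL;
   d->num_slots = 0;
}

/* Allocate zeroed storage on first use; otherwise hand back what exists. */
static int *
get_storage(struct program_data *d, unsigned num_slots)
{
   if (d->slots == NULL) {
      d->slots = calloc(num_slots, sizeof(*d->slots));
      if (d->slots != NULL)
         d->num_slots = num_slots;
   }
   return d->slots;
}
```

With the flag set, a clear followed by a second request returns the same pointer with its contents intact, which is the property the fallback path appears to rely on.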



[Mesa-dev] [PATCH 24/37] glsl: track mesa version shader cache items were created with

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

Also remove the cache item and fall back to a full recompile if the current
Mesa version differs.

V2: don't leak buffer
---
 src/compiler/glsl/shader_cache.cpp | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 8b9cabc..5cd134a 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1172,6 +1172,8 @@ shader_cache_write_program_metadata(struct gl_context *ctx,
 
struct blob *metadata = blob_create(NULL);
 
+   blob_write_string(metadata, ctx->VersionString);
+
write_uniforms(metadata, prog);
 
write_hash_tables(metadata, prog);
@@ -1299,6 +1301,17 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
struct blob_reader metadata;
blob_reader_init(&metadata, buffer, size);
 
+   char *version_string = blob_read_string(&metadata);
+   if (strcmp(ctx->VersionString, version_string) != 0) {
+  /* The cached version of the program was created with a different
+   * version of Mesa so remove it and fallback to full recompile.
+   */
+  disk_cache_remove(cache, prog->data->sha1);
+  compile_shaders(ctx, prog);
+  free(buffer);
+  return false;
+   }
+
assert(prog->data->UniformStorage == NULL);
 
read_uniforms(&metadata, prog);
-- 
2.9.3
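
The version check described above can be sketched outside of Mesa as follows. This is an illustration under stated assumptions: `struct cache_entry`, `entry_write` and `entry_read` are hypothetical names, and the layout (a NUL-terminated version string followed by the payload) is a simplification, not Mesa's blob format. Unlike the patch, the sketch also rejects an entry with no string terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry: version string, NUL, then payload bytes. */
struct cache_entry {
   char data[128];
   size_t size;
};

/* Store the producer's version string at the front of the entry. */
static bool
entry_write(struct cache_entry *e, const char *version,
            const void *payload, size_t payload_size)
{
   size_t vlen = strlen(version) + 1; /* include the terminating NUL */

   if (vlen + payload_size > sizeof(e->data))
      return false;

   memcpy(e->data, version, vlen);
   memcpy(e->data + vlen, payload, payload_size);
   e->size = vlen + payload_size;
   return true;
}

/* Return the payload, or NULL when the entry is corrupt or was written by
 * a different version; the caller would then evict it and recompile. */
static const void *
entry_read(const struct cache_entry *e, const char *current_version,
           size_t *payload_size)
{
   const char *nul = memchr(e->data, '\0', e->size);

   if (nul == NULL)
      return NULL; /* no terminator: treat as corrupt */
   if (strcmp(e->data, current_version) != 0)
      return NULL; /* stale entry from another version */

   size_t vlen = (size_t)(nul - e->data) + 1;
   *payload_size = e->size - vlen;
   return e->data + vlen;
}
```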



[Mesa-dev] [PATCH 23/37] glsl: cache uniform values

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

These may be lowered constant arrays or uniform values that we set before
linking, so we need to cache the actual uniform values.
---
 src/compiler/glsl/shader_cache.cpp | 33 +
 1 file changed, 33 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 358eb4f..8b9cabc 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -550,11 +550,13 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
   blob_write_string(metadata, prog->data->UniformStorage[i].name);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].storage -
   prog->data->UniformDataSlots);
+  blob_write_uint32(metadata, prog->data->UniformStorage[i].builtin);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].remap_location);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].block_index);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].atomic_buffer_index);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].offset);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].array_stride);
+  blob_write_uint32(metadata, prog->data->UniformStorage[i].hidden);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].matrix_stride);
   blob_write_uint32(metadata, prog->data->UniformStorage[i].row_major);
   blob_write_uint32(metadata,
@@ -566,6 +568,22 @@ write_uniforms(struct blob *metadata, struct gl_shader_program *prog)
   blob_write_bytes(metadata, prog->data->UniformStorage[i].opaque,
sizeof(prog->data->UniformStorage[i].opaque));
}
+
+   /* Here we cache all uniform values. We do this to retain values for
+* uniforms with initialisers and also hidden uniforms that may be lowered
+* constant arrays. We could possibly just store the values we need but for
+* now we just store everything.
+*/
+   blob_write_uint32(metadata, prog->data->NumHiddenUniforms);
+   for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
+  if (!prog->data->UniformStorage[i].builtin) {
+ unsigned vec_size =
+values_for_type(prog->data->UniformStorage[i].type) *
+MAX2(prog->data->UniformStorage[i].array_elements, 1);
+ blob_write_bytes(metadata, prog->data->UniformStorage[i].storage,
+  sizeof(union gl_constant_value) * vec_size);
+  }
+   }
 }
 
 static void
@@ -593,11 +611,13 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
   uniforms[i].array_elements = blob_read_uint32(metadata);
   uniforms[i].name = ralloc_strdup(prog, blob_read_string (metadata));
   uniforms[i].storage = data + blob_read_uint32(metadata);
+  uniforms[i].builtin = blob_read_uint32(metadata);
   uniforms[i].remap_location = blob_read_uint32(metadata);
   uniforms[i].block_index = blob_read_uint32(metadata);
   uniforms[i].atomic_buffer_index = blob_read_uint32(metadata);
   uniforms[i].offset = blob_read_uint32(metadata);
   uniforms[i].array_stride = blob_read_uint32(metadata);
+  uniforms[i].hidden = blob_read_uint32(metadata);
   uniforms[i].matrix_stride = blob_read_uint32(metadata);
   uniforms[i].row_major = blob_read_uint32(metadata);
   uniforms[i].num_compatible_subroutines = blob_read_uint32(metadata);
@@ -609,6 +629,19 @@ read_uniforms(struct blob_reader *metadata, struct gl_shader_program *prog)
  blob_read_bytes(metadata, sizeof(uniforms[i].opaque)),
  sizeof(uniforms[i].opaque));
}
+
+   /* Restore uniform values. */
+   prog->data->NumHiddenUniforms = blob_read_uint32(metadata);
+   for (unsigned i = 0; i < prog->data->NumUniformStorage; i++) {
+  if (!prog->data->UniformStorage[i].builtin) {
+ unsigned vec_size =
+values_for_type(prog->data->UniformStorage[i].type) *
+MAX2(prog->data->UniformStorage[i].array_elements, 1);
+ blob_copy_bytes(metadata,
+ (uint8_t *) prog->data->UniformStorage[i].storage,
+ sizeof(union gl_constant_value) * vec_size);
+  }
+   }
 }
 
 
-- 
2.9.3
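
The save/restore loops in this patch share one size calculation: values per element times the element count, with non-arrays counted as one element, and builtins skipped on both sides. A minimal sketch of that symmetry follows, assuming a simplified `struct uniform` and a flat `uint32_t` buffer in place of the blob API; `value_count`, `save_values` and `restore_values` are hypothetical helpers, not Mesa functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified uniform record. */
struct uniform {
   bool builtin;
   unsigned components;     /* values per element, e.g. 4 for a vec4 */
   unsigned array_elements; /* 0 for non-arrays */
   uint32_t *storage;
};

/* Number of 32-bit values held by one uniform. */
static unsigned
value_count(const struct uniform *u)
{
   unsigned elems = u->array_elements > 1 ? u->array_elements : 1;
   return u->components * elems;
}

/* Copy every non-builtin uniform's values into out, which the caller must
 * size appropriately. Returns the number of values written. */
static size_t
save_values(const struct uniform *u, size_t n, uint32_t *out)
{
   size_t pos = 0;
   for (size_t i = 0; i < n; i++) {
      if (u[i].builtin)
         continue;
      unsigned count = value_count(&u[i]);
      memcpy(out + pos, u[i].storage, count * sizeof(uint32_t));
      pos += count;
   }
   return pos;
}

/* Mirror of save_values: the same order and the same skip rule, so the
 * reader consumes exactly what the writer produced. */
static size_t
restore_values(struct uniform *u, size_t n, const uint32_t *in)
{
   size_t pos = 0;
   for (size_t i = 0; i < n; i++) {
      if (u[i].builtin)
         continue;
      unsigned count = value_count(&u[i]);
      memcpy(u[i].storage, in + pos, count * sizeof(uint32_t));
      pos += count;
   }
   return pos;
}
```

Keeping the skip rule and the size calculation identical on both sides is what keeps the reader aligned with the writer; any divergence would shift every later value in the stream.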



[Mesa-dev] [PATCH 25/37] mesa/glsl: add cache_fallback flag to gl_shader_program_data

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

This will allow us to skip certain things when falling back to
a full recompile on a cache miss such as avoiding reinitialising
uniforms.

In this change we use it to avoid reading the program metadata from the
cache, which would skip linking, during a fallback.
---
 src/compiler/glsl/shader_cache.cpp | 2 +-
 src/mesa/main/mtypes.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
index 5cd134a..1179c12 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1239,7 +1239,7 @@ shader_cache_read_program_metadata(struct gl_context *ctx,
   return false;
 
struct disk_cache *cache = ctx->Cache;
-   if (!cache)
+   if (!cache || prog->data->cache_fallback)
   return false;
 
for (unsigned i = 0; i < prog->NumShaders; i++) {
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index a2280e2..2c1e11b 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2673,6 +2673,8 @@ struct gl_shader_program_data
unsigned NumUniformDataSlots;
union gl_constant_value *UniformDataSlots;
 
+   bool cache_fallback;
+
/** List of all active resources after linking. */
struct gl_program_resource *ProgramResourceList;
unsigned NumProgramResourceList;
-- 
2.9.3
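
The effect of the flag on the lookup path can be sketched as below. This is an illustration only: `struct program`, `read_from_cache` and `link_program` are hypothetical, the counters exist purely to make the control flow observable, and the `entry_present` parameter stands in for an actual cache lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical program state; counters are for illustration only. */
struct program {
   bool cache_fallback;
   int cache_reads;
   int full_links;
};

/* Returns true when the program was restored from the cache. With no
 * cache, or during a fallback, the cache is not consulted at all. */
static bool
read_from_cache(struct program *p, const void *cache, bool entry_present)
{
   if (cache == NULL || p->cache_fallback)
      return false;

   p->cache_reads++;
   return entry_present;
}

/* A cache hit skips the link; anything else falls through to a full link. */
static void
link_program(struct program *p, const void *cache, bool entry_present)
{
   if (read_from_cache(p, cache, entry_present))
      return;

   p->full_links++;
}
```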



[Mesa-dev] [PATCH 29/37] glsl: skip more uniform initialisation when doing fallback linking

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

We already pull these values from the metadata cache, so there is no need to
recreate them.
---
 src/compiler/glsl/linker.cpp | 20 
 src/mesa/main/shaderobj.c|  8 +---
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index d34ec97..3271a19 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4532,12 +4532,14 @@ link_and_validate_uniforms(struct gl_context *ctx,
update_array_sizes(prog);
link_assign_uniform_locations(prog, ctx);
 
-   link_assign_atomic_counter_resources(ctx, prog);
-   link_calculate_subroutine_compat(prog);
-   check_resources(ctx, prog);
-   check_subroutine_resources(prog);
-   check_image_resources(ctx, prog);
-   link_check_atomic_counter_resources(ctx, prog);
+   if (!prog->data->cache_fallback) {
+  link_assign_atomic_counter_resources(ctx, prog);
+  link_calculate_subroutine_compat(prog);
+  check_resources(ctx, prog);
+  check_subroutine_resources(prog);
+  check_image_resources(ctx, prog);
+  link_check_atomic_counter_resources(ctx, prog);
+   }
 }
 
 static bool
@@ -4818,8 +4820,10 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
   last = i;
}
 
-   check_explicit_uniform_locations(ctx, prog);
-   link_assign_subroutine_types(prog);
+   if (!prog->data->cache_fallback) {
+  check_explicit_uniform_locations(ctx, prog);
+  link_assign_subroutine_types(prog);
+   }
 
if (!prog->data->LinkStatus)
   goto done;
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index 4804041..33b9f63 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -358,9 +358,11 @@ _mesa_clear_shader_program_data(struct gl_context *ctx,
shProg->data->ShaderStorageBlocks = NULL;
shProg->data->NumShaderStorageBlocks = 0;
 
-   ralloc_free(shProg->data->AtomicBuffers);
-   shProg->data->AtomicBuffers = NULL;
-   shProg->data->NumAtomicBuffers = 0;
+   if (shProg->data->AtomicBuffers && !shProg->data->cache_fallback) {
+  ralloc_free(shProg->data->AtomicBuffers);
+  shProg->data->AtomicBuffers = NULL;
+  shProg->data->NumAtomicBuffers = 0;
+   }
 
if (shProg->data->ProgramResourceList) {
   ralloc_free(shProg->data->ProgramResourceList);
-- 
2.9.3



[Mesa-dev] [PATCH 30/37] glsl: don't reprocess or clear UBOs on cache fallback

2017-01-23 Thread Timothy Arceri
From: Timothy Arceri 

---
 src/compiler/glsl/linker.cpp | 62 +++-
 src/mesa/main/shaderobj.c| 16 +++-
 2 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 3271a19..7054783 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2252,32 +2252,34 @@ link_intrastage_shaders(void *mem_ctx,
v.run(linked->ir);
v.fixup_unnamed_interface_types();
 
-   /* Link up uniform blocks defined within this stage. */
-   link_uniform_blocks(mem_ctx, ctx, prog, linked, &ubo_blocks,
-   &num_ubo_blocks, &ssbo_blocks, &num_ssbo_blocks);
-
-   if (!prog->data->LinkStatus) {
-  _mesa_delete_linked_shader(ctx, linked);
-  return NULL;
-   }
+   if (!prog->data->cache_fallback) {
+  /* Link up uniform blocks defined within this stage. */
+  link_uniform_blocks(mem_ctx, ctx, prog, linked, &ubo_blocks,
+  &num_ubo_blocks, &ssbo_blocks, &num_ssbo_blocks);
 
-   /* Copy ubo blocks to linked shader list */
-   linked->Program->sh.UniformBlocks =
-  ralloc_array(linked, gl_uniform_block *, num_ubo_blocks);
-   ralloc_steal(linked, ubo_blocks);
-   for (unsigned i = 0; i < num_ubo_blocks; i++) {
-  linked->Program->sh.UniformBlocks[i] = &ubo_blocks[i];
-   }
-   linked->Program->info.num_ubos = num_ubo_blocks;
+  if (!prog->data->LinkStatus) {
+ _mesa_delete_linked_shader(ctx, linked);
+ return NULL;
+  }
 
-   /* Copy ssbo blocks to linked shader list */
-   linked->Program->sh.ShaderStorageBlocks =
-  ralloc_array(linked, gl_uniform_block *, num_ssbo_blocks);
-   ralloc_steal(linked, ssbo_blocks);
-   for (unsigned i = 0; i < num_ssbo_blocks; i++) {
-  linked->Program->sh.ShaderStorageBlocks[i] = &ssbo_blocks[i];
+  /* Copy ubo blocks to linked shader list */
+  linked->Program->sh.UniformBlocks =
+ ralloc_array(linked, gl_uniform_block *, num_ubo_blocks);
+  ralloc_steal(linked, ubo_blocks);
+  for (unsigned i = 0; i < num_ubo_blocks; i++) {
+ linked->Program->sh.UniformBlocks[i] = &ubo_blocks[i];
+  }
+  linked->Program->info.num_ubos = num_ubo_blocks;
+
+  /* Copy ssbo blocks to linked shader list */
+  linked->Program->sh.ShaderStorageBlocks =
+ ralloc_array(linked, gl_uniform_block *, num_ssbo_blocks);
+  ralloc_steal(linked, ssbo_blocks);
+  for (unsigned i = 0; i < num_ssbo_blocks; i++) {
+ linked->Program->sh.ShaderStorageBlocks[i] = &ssbo_blocks[i];
+  }
+  linked->Program->info.num_ssbos = num_ssbo_blocks;
}
-   linked->Program->info.num_ssbos = num_ssbo_blocks;
 
/* At this point linked should contain all of the linked IR, so
 * validate it to make sure nothing went wrong.
@@ -4878,13 +4880,15 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
if (prog->SeparateShader)
   disable_varying_optimizations_for_sso(prog);
 
-   /* Process UBOs */
-   if (!interstage_cross_validate_uniform_blocks(prog, false))
-  goto done;
+   if (!prog->data->cache_fallback) {
+  /* Process UBOs */
+  if (!interstage_cross_validate_uniform_blocks(prog, false))
+ goto done;
 
-   /* Process SSBOs */
-   if (!interstage_cross_validate_uniform_blocks(prog, true))
-  goto done;
+  /* Process SSBOs */
+  if (!interstage_cross_validate_uniform_blocks(prog, true))
+ goto done;
+   }
 
/* Do common optimization before assigning storage for attributes,
 * uniforms, and varyings.  Later optimization could possibly make
diff --git a/src/mesa/main/shaderobj.c b/src/mesa/main/shaderobj.c
index 33b9f63..ed19a72 100644
--- a/src/mesa/main/shaderobj.c
+++ b/src/mesa/main/shaderobj.c
@@ -350,13 +350,15 @@ _mesa_clear_shader_program_data(struct gl_context *ctx,
ralloc_free(shProg->data->InfoLog);
shProg->data->InfoLog = ralloc_strdup(shProg->data, "");
 
-   ralloc_free(shProg->data->UniformBlocks);
-   shProg->data->UniformBlocks = NULL;
-   shProg->data->NumUniformBlocks = 0;
-
-   ralloc_free(shProg->data->ShaderStorageBlocks);
-   shProg->data->ShaderStorageBlocks = NULL;
-   shProg->data->NumShaderStorageBlocks = 0;
+   if (!shProg->data->cache_fallback) {
+  ralloc_free(shProg->data->UniformBlocks);
+  shProg->data->UniformBlocks = NULL;
+  shProg->data->NumUniformBlocks = 0;
+
+  ralloc_free(shProg->data->ShaderStorageBlocks);
+  shProg->data->ShaderStorageBlocks = NULL;
+  shProg->data->NumShaderStorageBlocks = 0;
+   }
 
if (shProg->data->AtomicBuffers && !shProg->data->cache_fallback) {
   ralloc_free(shProg->data->AtomicBuffers);
-- 
2.9.3



Re: [Mesa-dev] [PATCH 4/6] configure.ac: Set and use HAVE_GALLIUM_LLVM define

2017-01-23 Thread Roland Scheidegger
On 23.01.2017 at 20:39, Tobias Droste wrote:
> On Monday, 23 January 2017, 11:53:18 CET, Jose Fonseca wrote:
>> On 20/01/17 02:48, Emil Velikov wrote:
>>> On 19 January 2017 at 19:26, Tobias Droste  wrote:
 On Wednesday, 18 January 2017, 18:45:04 CET, Emil Velikov wrote:
> On 18 January 2017 at 18:12, Jose Fonseca  wrote:
 In order to untangle things we want to have a distinction between the
 gallium (gallivm afaict) and other users - RADV presently.
 So how about we update the RADV instances and ensure that we set
 the HAVE_{RADV,}_LLVM lot appropriately. Latter will be picky but
 overall things should work w/o annoyances that HAVE_GALLIUM_LLVM
 brings ?
>>
>> I honestly don't even understand why we'd want to build parts of the tree
>> with LLVM while hiding LLVM from other components.  Why can't we just build
>> everything with LLVM and avoid this combinatorial explosion of weird
>> options that are nothing more than yet another way the build can break!!?
>
> Sadly the combinatoric explosion has been there for a while. Based on
> how well my previous attempts to resolve similar issues went (see the
> "platforms" topic), I doubt we'll even get to fix that.
>
>> But if a separate option is truly necessary, have the newcomer pick a
>> different name, or something.
>
> That's pretty much what I suggested above. Tobias, can you please give it
> a try?

 I would rather "fix" the other build systems. (As in just define
 HAVE_GALLIUM_LLVM if HAVE_LLVM is defined).

 I think there is still a misunderstanding on Jose's side on what this
 really means. No file in gallivm or llvmpipe will be touched. It's really
 just auxiliary/draw and there it's exactly 8 lines that will change.

 That's it.

 I really fail to see how this will break everything that is being worked
 on and cause merge conflicts everywhere.

 If you still want the other way, I can do that too, but this will of
 course need the same fix in the other build system or we have the same
 situation we have now, but with other drivers.
>>>
>>> Afaict one point is that the use of HAVE_GALLIUM_LLVM vs HAVE_LLVM is
>>> too subtle. Let's not forget that barring the WIP(?) branches, VMWare
>>> has closed source components. Guess how much fun it will be as
>>> suddenly things fail to build/work properly as they re-sync the code
>>> base. No idea how likely the latter is, but considering Jose (and a
>>> few other VMWare guys) wrote a sizeable chunk of that code (and Mesa as a
>>> whole) I'd go with his instinct.
>>>
>>> Emil
>>
>> The HAVE_LLVM->HAVE_GALLIUM_LLVM rename is indeed not as invasive as I
>> thought.
>>
>> But I still don't understand why HAVE_LLVM->HAVE_GALLIUM_LLVM is
>> necessary in draw and not on gallivm/llvmpipe.
>>
>> People want to build draw with LLVM support, but without
>> gallivm/llvmpipe? That's impossible.
>>
>> Or is this because the draw files are the only .c files that are
>> compiled even when HAVE_LLVM is undefined, so these are the only ones
>> that get to receive the renaming treatment?  That's crazy confusing.
>> There's no way I can accept that.
>>
> 
> The draw files are used by softpipe (and maybe other gallium drivers, haven't
> checked that) and there HAVE_LLVM should not be defined. If it's not,
> everything is fine. But with new non-gallium drivers using LLVM and causing
> HAVE_LLVM to be defined, softpipe is broken in some cases. See below.
> 
>>
>> Let me make this crystal clear to avoid making this discussion even more
>> protracted: I will not accept any HAVE_LLVM change in
>> draw/gallivm/llvmpipe .C/.H source code.  Period.
>>
> 
> I'm _not_ changing gallivm and llvmpipe. Draw is not only used by llvmpipe,
> and I still think I have very good arguments for the change. See again below.
> 
> I understand, I'm the unknown new guy and you did a lot of work in this code.
> But I'm not getting paid for this and I don't have to do this. I want to
> help, but I also want to understand why I can't do something, with reasons
> other than "I say so, and I don't want to hear any reasons against it". I hope
> you understand that.
> 
>>
>> HAVE_LLVM used to mean, "whole Mesa being built with LLVM".  Now people
>> want to build something (no idea what yet to be honest) with LLVM, but
>> not build draw/gallivm/llvmpipe.
>>
>> If you want to build some other component with LLVM but not
>> draw/gallivm/llvmpipe, then add a new HAVE_LLVM_FOR_FOOBAR define and
>> use it where you need it.
> 
> The real problem is softpipe. Softpipe uses draw but (obviously) can't use 
> LLVM.
> 
> Right now one could build radv (uses LLVM) and pass --disable-gallium-llvm to 
> the build system to get softpipe built.
> 
> But due to radv "HAVE_LLVM" (which is used as a version check everywhere 
> else!) is defined and the draw 
