date:20160216

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Iago Toral

Hey Jason,

this is awesome news, congrats to all the people involved!

Did you have a chance to try the new driver with anything other than
conformance tests or small demos? I know it is still in a very early
stage but I'd like to know your impressions about how much of an
improvement it brings over the GL driver in its current form and whether
there is still major optimization work to be done. I figure that since
you are using NIR and the same compiler backend used for GL, things
should be looking pretty good already.

Iago

On Tue, 2016-02-16 at 07:19 -0800, Jason Ekstrand wrote:
> The Intel mesa team is pleased to announce a brand-new open-source Vulkan
> driver for Intel hardware.  We've been working hard on this over the course
> of the past year or so and are excited to finally share it with the
> community.  We will work on up-streaming the driver in the next few weeks
> and hope to have it all in place in time for mesa 11.3 (mesa 12?).  In the
> mean time, the driver can be found in the "vulkan" branch of the mesa git
> repo on freedesktop.org:
> 
> https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
> 
> More information on building the driver and running a few simple apps can
> be found on the 01.org web site:
> 
> https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
> 
> We have talked to people at Red Hat and Canonical and binaries should be
> available for Fedora and Ubuntu soon.  We will update the page on 01.org
> with links as soon as they are available.
> 
> We have also created a small test suite called crucible which contains a
> few hundred tests (mostly for miptrees) that we created when bringing up
> the driver.  This isn't really intended to be the piglit of vulkan.  With
> the CTS being publicly available, most cross-platform tests should go
> there.  We mostly made crucible so that we could write a few tests early on
> to get us going and for tests that were targeted specifically at our
> implementation.  Nonetheless, they may prove useful to someone and we are
> happy to share them.  The crucible source code can be found at
> 
> https://cgit.freedesktop.org/mesa/crucible/
> 
> Frequently Asked Questions:
> 
> What all hardware does it support?
> 
>    The driver currently supports Sky Lake all the way back to Ivy Bridge.
>    The driver is Vulkan 1.0 conformant for 64-bit builds on Sky Lake,
>    Broadwell, and Braswell.  We are still having a couple of 32-bit issues
>    and support for Haswell, Ivy Bridge, and Bay Trail should be considered
>    experimental.
> 
> How much code is shared between the Vulkan and GL drivers?
> 
>    For shaders, we're using a SPIR-V to NIR pass which is new, and a few
>    new NIR lowering passes for things that we previously depended on GLSL
>    IR to handle.  Beyond that, we're using the same core NIR and the same
>    back-end compiler that we have for GL.  We're carrying a few patches
>    against the back-end compiler, but the delta is very small and it's all
>    stuff that we eventually want to do for GL anyway.
> 
>    The main API handling and state setup code is all new and written from
>    the ground-up for Vulkan.  For actually packing hardware packets, we are
>    using a codegen system that Kristian developed early on in the project
>    that's based on an XML description of the hardware packets.  The result
>    is state setup code that's both easier to work with and maybe even a
>    little more efficient than what we have in mesa today.
> 
>    We also have a brand-new surface layout library called ISL that handles
>    all of the surface layout calculations.  ISL should have most of the
>    code required to do surface layout all the way back to gen4.  Once we
>    get aux surface support in ISL (required for HiZ, MSAA compression, and
>    CCMS/fast clears), we hope to start using it in the GL driver as well.
> 
> How much code could be shared with other Vulkan drivers?
> 
>    Not as much as you would think.  The SPIR-V to NIR translator and the
>    rest of the NIR compiler stack could obviously be re-used by anyone
>    willing to tie NIR into their back-end.  The rest of the driver is, and
>    will probably stay, Intel-specific.  Vulkan is a very low-level API,
>    possibly even lower-level than gallium.  A lot of the things that we
>    share between drivers in mesa today (the front-end compiler, state
>    tracking, error-handling, etc.) are pushed off to either the application
>    or third-party layers in the Vulkan world.  That said, anyone wishing to
>    write their own Vulkan driver is more than welcome to use ours as a
>    reference and steal whatever they'd like from it.
> 
> What are your up-streaming plans?
> 
>    Before we can land the SPIR-V to NIR layer, there are a number of core
>    NIR changes that need to land first.  All of that code needs to be
>    reviewed as it interacts with t

Re: [Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Jose Fonseca


On 16/02/16 22:19, Brian Paul wrote:
> On 02/16/2016 02:41 PM, Dave Airlie wrote:
>> On 17 February 2016 at 04:39, Jason Ekstrand wrote:
>>> So, we just pushed a branch containing a Vulkan driver.  Naturally, we
>>> would like to incorporate that driver into the upstream mesa tree.  While
>>> we work on upstreaming the prerequisites in NIR and the i965 back-end
>>> compiler, there is a question that needs answering:  Where do we put it?
>>>
>>> The Vulkan driver challenges the tree-like nature of the way mesa is
>>> currently organized.  We now have two drivers that share a lot of the same
>>> underlying hardware-specific code (compiler and ISL) but target different
>>> APIs and no gallium-like middle layer to hide behind.  Obviously, we don't
>>> want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
>>> src/vulkan directory, we don't really want to put the shared parts into
>>> src/vulkan/intel.  Where should we put the Intel-specific but API-agnostic
>>> bits?  In particular, we need a place to put ISL and the back-end compiler.
>>> We don't want to deal with the headaches of making a public API and keeping
>>> it stable, so they need to live somewhere in the mesa tree.
>>>
>>> In my personal opinion, the best thing to do is probably to add a src/intel
>>> folder with subfolders for vulkan, isl, and the back-end compiler.  The
>>> src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
>>> of the driver.  It does seem a little odd to have "intel" as a top-level
>>> source folder, but I can't come up with anything better.
>>>
>>> Thoughts?  Opinions?  Favorite colors?
>>
>> I don't think we'll get this right the first time, and when we randomly
>> decide to change it we can just make poor Emil handle the fallout. :-P
>>
>> Anyways,
>>
>> src/intel works for me, also src/shed/intel, src/shared/intel,
>> src/drivers/intel
>
> I like src/shared/intel/ FWIW.
>
> -Brian


Me too FWIW.  We could also consider putting all non-GL/Gallium/Vulkan 
specific stuff in src/shared too, like src/util, nir, etc.


But like Dave said, we can always tweak these things later.

Congrats on getting 0-day Vulkan support BTW!

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/gen7: Use predicated rendering for indirect compute

2016-02-16 Thread Kenneth Graunke

On Tuesday, February 16, 2016 10:09:50 AM PST Jordan Justen wrote:
> On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect
> dispatch is used, but one of the dimensions is 0.
> 
> Therefore we use predicated rendering on the GPGPU_WALKER command to
> handle this case.
> 
> Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
> Signed-off-by: Jordan Justen 
> Cc: Kenneth Graunke 
> Cc: Ben Widawsky 
> Cc: Ilia Mirkin 
> ---
>  src/mesa/drivers/dri/i965/brw_compute.c | 104 +++-
>  src/mesa/drivers/dri/i965/brw_defines.h |   1 +
>  2 files changed, 91 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compute.c b/src/mesa/drivers/dri/i965/brw_compute.c
> index d9f181a..bbb8ce3 100644
> --- a/src/mesa/drivers/dri/i965/brw_compute.c
> +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> @@ -35,6 +35,92 @@
>  
>  
>  static void
> +brw_prepare_indirect_gpgpu_walker(struct brw_context *brw)

static functions don't need the "brw_" prefix.  (Similarly, in core
Mesa, we don't use the "_mesa_" prefix for static functions...this
helps identify them.)

> +{
> +   GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> +   drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> +
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 0);
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 4);
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 8);
> +
> +   if (brw->gen > 7)
> +  return;
> +
> +   /* Clear upper 32-bits of SRC0 and all 64-bits of SRC1
> +*/

As Ben mentioned, I do prefer one line comments, but it's up to you...

> +   BEGIN_BATCH(7);
> +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
> +   OUT_BATCH(MI_PREDICATE_SRC0 + 4);
> +   OUT_BATCH(0u);
> +   OUT_BATCH(MI_PREDICATE_SRC1 + 0);
> +   OUT_BATCH(0u);
> +   OUT_BATCH(MI_PREDICATE_SRC1 + 4);
> +   OUT_BATCH(0u);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_x_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 0);
> +
> +   /* predicate = (compute_dispatch_indirect_x_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_SET |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_y_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 4);
> +
> +   /* predicate |= (compute_dispatch_indirect_y_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_z_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 8);
> +
> +   /* predicate |= (compute_dispatch_indirect_z_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* predicate = !predicate;
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOADINV |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_FALSE);
> +   ADVANCE_BATCH();
> +}
> +
> +static void
>  brw_emit_gpgpu_walker(struct brw_context *brw)
>  {
> const struct brw_cs_prog_data *prog_data = brw->cs.prog_data;
> @@ -45,20 +131,10 @@ brw_emit_gpgpu_walker(struct brw_context *brw)
> if (brw->compute.num_work_groups_bo == NULL) {
>indirect_flag = 0;
> } else {
> -  GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> -  drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> -
> -  indirect_flag = GEN7_GPGPU_INDIRECT_PARAMETER_ENABLE;
> -
> -  brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
> -I915_GEM_DOMAIN_VERTEX, 0,
> -indirect_offset + 0);
> -  brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
> -I915_GEM_DOMAIN_VERTEX, 0,
> -
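
For readers following the predicate sequence in the patch, here is a rough CPU-side model of what the four MI_PREDICATE packets compute. It is only a sketch: the function name dispatch_predicate is hypothetical and not part of Mesa, and it assumes (as the patch comments suggest) that SRC1 is zero, so that SRCS_EQUAL amounts to "dimension == 0", and that the walker only runs when the final predicate is true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the MI_PREDICATE sequence; not Mesa code.
 * Returns true when GPGPU_WALKER should execute, i.e. when none of the
 * indirect dispatch dimensions is zero. */
static bool
dispatch_predicate(uint32_t x, uint32_t y, uint32_t z)
{
   bool predicate;

   predicate = (x == 0);    /* LOADOP_LOAD + COMBINEOP_SET + SRCS_EQUAL */
   predicate |= (y == 0);   /* LOADOP_LOAD + COMBINEOP_OR  + SRCS_EQUAL */
   predicate |= (z == 0);   /* LOADOP_LOAD + COMBINEOP_OR  + SRCS_EQUAL */
   predicate = !predicate;  /* LOADOP_LOADINV + COMPAREOP_FALSE */

   return predicate;
}
```

With this model, a dispatch of (4, 0, 2) yields a false predicate, so the walker would be skipped instead of hanging the GPU.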

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Olivier Galibert

I'm actually interested in how one goes about debugging that kind
of problem, if you have pointers.  I would have an idea or two on how
to go about it if it was in userspace only, but once it crosses into
the kernel I'm not sure what strategies are best.

Best,

  OG.


On Wed, Feb 17, 2016 at 2:51 AM, Jason Ekstrand  wrote:
> On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert 
> wrote:
>>
>>   Hi,
>>
>> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
>> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
>> should I go about debugging that?
>
>
> It's a depth-stencil issue and we know about it.   The gen7 code needs some
> love.   I think Kristian and Jordan have been working on it.
> --Jason
>
>>
>>
>>   OG.
>>

Re: [Mesa-dev] [PATCH] mesa: default DepthMode to GL_RED on ES 3.0

2016-02-16 Thread Kenneth Graunke

On Tuesday, February 16, 2016 6:29:39 PM PST Ilia Mirkin wrote:
> See commit 9db2098d which did it internally to the i965 driver. No
> reason not to have this more globally set though.
> 
> This fixes depth in a bunch of dEQP EXT_texture_border_clamp tests. And
> probably other items as well.
> 
> Signed-off-by: Ilia Mirkin 
> Cc: Ian Romanick 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/mesa/main/texobj.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
> index d8407f0..2b9c80a 100644
> --- a/src/mesa/main/texobj.c
> +++ b/src/mesa/main/texobj.c
> @@ -320,7 +320,8 @@ _mesa_initialize_texture_object( struct gl_context *ctx,
> obj->Sampler.MaxAnisotropy = 1.0;
> obj->Sampler.CompareMode = GL_NONE; /* ARB_shadow */
> obj->Sampler.CompareFunc = GL_LEQUAL;   /* ARB_shadow */
> -   obj->DepthMode = ctx->API == API_OPENGL_CORE ? GL_RED : GL_LUMINANCE;
> +   obj->DepthMode = (ctx->API == API_OPENGL_CORE || _mesa_is_gles3(ctx)) ?
> +  GL_RED : GL_LUMINANCE;
> obj->StencilSampling = false;
> obj->Sampler.CubeMapSeamless = GL_FALSE;
> obj->Swizzle[0] = GL_RED;
> 

Now I'm a bit weirded out - three years later I can't recall why I wrote
an i965 specific patch for this.  Fixing it in core Mesa seems way
better.  I wonder why I didn't do that in the first place.

I don't think this is quite right, though...won't this default depth
mode to GL_RED for *all* formats?  The commit you and Ian cited explains
that we should default to GL_RED (X, 0, 0, 1) for *sized* formats, but
leave it as GL_LUMINANCE (X, X, X, 1) for the *unsized* ones.

We were sort of painted into a corner here...GLES 2 can be silently
promoted to GLES 3...and GLES 2 already specified this as GL_LUMINANCE,
but GLES 3 specified things as GL_RED...so there was a compromise.

Incidentally, we should figure out the GL 4.2 interaction I decided to
put off in 2013, since we're finally there :)
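
To make the sized/unsized compromise concrete, a minimal sketch of the selection rule might look like the following. The enums and the function default_depth_mode are hypothetical stand-ins, not the real Mesa or GL definitions, and the sketch assumes the "sized format" distinction only matters on ES 3.0 contexts.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not Mesa's gl_api or GL enums. */
enum api { API_COMPAT, API_CORE, API_ES2, API_ES3 };
enum depth_mode { DEPTH_MODE_LUMINANCE, DEPTH_MODE_RED };

/* Pick the default depth mode:
 *   core profile          -> RED       (X, 0, 0, 1)
 *   ES 3.0, sized format  -> RED       (X, 0, 0, 1)
 *   everything else       -> LUMINANCE (X, X, X, 1)
 */
static enum depth_mode
default_depth_mode(enum api api, bool sized_format)
{
   if (api == API_CORE)
      return DEPTH_MODE_RED;
   if (api == API_ES3 && sized_format)
      return DEPTH_MODE_RED;
   return DEPTH_MODE_LUMINANCE;
}
```

The point of the sketch is that the decision needs the texture's internal format, which is not yet known when the texture object is initialized.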


Re: [Mesa-dev] [PATCH] [v2] intel: Add missing SKL device IDs

2016-02-16 Thread Ben Widawsky

On Tue, Feb 16, 2016 at 03:42:45PM -0800, Ben Widawsky wrote:
> A new list yielded new devices that apparently have shipped, or will ship.
> 
> v2: I can't read. 0x192d is GT3
> 
> Signed-off-by: Ben Widawsky 
> ---
>  intel/intel_chipset.h | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
> index 35148e5..9c24701 100644
> --- a/intel/intel_chipset.h
> +++ b/intel/intel_chipset.h
> @@ -168,6 +168,7 @@
>  #define PCI_CHIP_SKYLAKE_DT_GT1  0x1902
>  #define PCI_CHIP_SKYLAKE_ULT_GT1 0x1906
>  #define PCI_CHIP_SKYLAKE_SRV_GT1 0x190A /* Reserved */
> +#define PCI_CHIP_SKYLAKE_H_GT1   0x190B
>  #define PCI_CHIP_SKYLAKE_ULX_GT1 0x190E /* Reserved */
>  #define PCI_CHIP_SKYLAKE_DT_GT2  0x1912
>  #define PCI_CHIP_SKYLAKE_FUSED0_GT2  0x1913 /* Reserved */
> @@ -182,6 +183,7 @@
>  #define PCI_CHIP_SKYLAKE_GT3 0x1926
>  #define PCI_CHIP_SKYLAKE_HALO_GT3 0x192B /* Reserved */
>  #define PCI_CHIP_SKYLAKE_SRV_GT4 0x192A
> +#define PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3   0x192D
>  #define PCI_CHIP_SKYLAKE_DT_GT4  0x1932
>  #define PCI_CHIP_SKYLAKE_SRV_GT4X 0x193A
>  #define PCI_CHIP_SKYLAKE_H_GT4   0x193B
> @@ -376,7 +378,8 @@
>  #define IS_SKL_GT1(devid)((devid) == PCI_CHIP_SKYLAKE_ULT_GT1|| \
>(devid) == PCI_CHIP_SKYLAKE_ULX_GT1|| \
>(devid) == PCI_CHIP_SKYLAKE_DT_GT1 || \
> -  (devid) == PCI_CHIP_SKYLAKE_SRV_GT1)
> +  (devid) == PCI_CHIP_SKYLAKE_SRV_GT1|| \
> +  (devid) == PCI_CHIP_SKYLAKE_H_GT1)
>  
>  #define IS_SKL_GT2(devid)((devid) == PCI_CHIP_SKYLAKE_DT_GT2 || \
>(devid) == PCI_CHIP_SKYLAKE_FUSED0_GT2 || \
> @@ -390,13 +393,15 @@
>(devid) == PCI_CHIP_SKYLAKE_MOBILE_GT2)
>  
>  #define IS_SKL_GT3(devid)((devid) == PCI_CHIP_SKYLAKE_GT3|| \
> -  (devid) == PCI_CHIP_SKYLAKE_HALO_GT3)
> +  (devid) == PCI_CHIP_SKYLAKE_HALO_GT3   || \
> +  (devid) == PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3)
> +
>  
>  #define IS_SKL_GT4(devid)((devid) == PCI_CHIP_SKYLAKE_SRV_GT4|| \
>(devid) == PCI_CHIP_SKYLAKE_DT_GT4 || \
>(devid) == PCI_CHIP_SKYLAKE_SRV_GT4X   || \
>(devid) == PCI_CHIP_SKYLAKE_H_GT4  || \
> -  (devid) == PCI_CHIP_SKYLAKE_WKS_GT4)
> +  (devid) == PCI_CHIP_SKYLAKE_WKS_GT4|| \

Really sloppy on my part... This hunk is gone from my local patch. I'm not even
sure really how I sent this out...

>  
>  #define IS_KBL_GT1(devid)((devid) == PCI_CHIP_KABYLAKE_ULT_GT1_5 || \
>(devid) == PCI_CHIP_KABYLAKE_ULX_GT1_5 || \
> -- 
> 2.7.1
> 

[Mesa-dev] [Bug 94175] ilo driver fail to load for intel mobile gm45 Express chipset

2016-02-16 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94175

--- Comment #2 from jydc...@21cn.com ---
got it, thanks for reply.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 2/5] android: radeonsi: fix building error in si_shader.c

2016-02-16 Thread Michel Dänzer

On 16.02.2016 20:03, Emil Velikov wrote:
> On 16 February 2016 at 07:02, Michel Dänzer  wrote:
>> On 14.02.2016 23:41, Mauro Rossi wrote:
>>>
>>> From: Mauro Rossi <issor.or...@gmail.com>
>>> Date: Sun, 14 Feb 2016 15:34:16 +0100
>>> Subject: [PATCH 1/2] android: add support for strchrnul
>>>
>>> Android Bionic has no strchrnul in string functions,
>>> radeonsi uses strchrnul, so we need an implementation.
>>>
>>> strchrnul.h is added in top mesa include path.
>>
>> Gallium code (at least outside of src/gallium/state_trackers) is not
>> supposed to include headers from the toplevel include directory. This
>> header should be in src/util/ instead.
>>
> If we consider this a compatibility wrapper then include/ is fine
> (alongside a name like gnu_string.h). Although I'm thinking about a
> shorter fix -> s/strchrnul/util_strchrnul/. Gallium already has (and
> uses) an util function.

Oh. Sorry Mauro I missed util_strchrnul, please just make radeonsi use that.
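
For context, strchrnul behaves like strchr except that it returns a pointer to the terminating NUL instead of NULL when the character is not found. A minimal sketch of such a helper is below; the name fallback_strchrnul is hypothetical and this is not the Gallium util_strchrnul source, just an illustration of the semantics.

```c
#include <stddef.h>

/* Hypothetical fallback with strchrnul semantics; not the Gallium helper.
 * Returns a pointer to the first occurrence of c in s, or to the
 * terminating '\0' if c does not occur. */
static char *
fallback_strchrnul(const char *s, int c)
{
   while (*s != '\0' && *s != (char)c)
      s++;
   return (char *)s;
}
```

Because the result is never NULL, callers can compute the length of a token as the difference between the returned pointer and the start of the string without an extra check.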


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

[Mesa-dev] [PATCH] anv: fix warning about unused width variable.

2016-02-16 Thread Dave Airlie

From: Dave Airlie 

We don't use width outside the debug clause here.
---
 src/vulkan/gen_pack_header.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/vulkan/gen_pack_header.py b/src/vulkan/gen_pack_header.py
index 3cabb58..75c4f26 100755
--- a/src/vulkan/gen_pack_header.py
+++ b/src/vulkan/gen_pack_header.py
@@ -62,11 +62,10 @@ __gen_mbo(uint32_t start, uint32_t end)
 static inline uint64_t
 __gen_uint(uint64_t v, uint32_t start, uint32_t end)
 {
-   const int width = end - start + 1;
-
__gen_validate_value(v);
 
 #if DEBUG
+   const int width = end - start + 1;
if (width < 64) {
   const uint64_t max = (1ull << width) - 1;
   assert(v <= max);
-- 
2.5.0
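
As a standalone illustration of the pattern in this patch, the sketch below keeps the width computation inside the debug-only block so release builds never see an unused variable. The name pack_uint is hypothetical, the NDEBUG guard is swapped in for the generator's DEBUG macro so that it lines up with assert(), and the final shift is an assumption about what the generated helper returns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the generated helper; not the real output of
 * gen_pack_header.py.  Places v into a bitfield starting at bit `start`. */
static inline uint64_t
pack_uint(uint64_t v, uint32_t start, uint32_t end)
{
#ifndef NDEBUG
   /* Only needed for validation, so it lives inside the debug block. */
   const int width = end - start + 1;
   if (width < 64) {
      const uint64_t max = (1ull << width) - 1;
      assert(v <= max);
   }
#endif

   return v << start;
}
```

Note that when assertions are compiled out, `end` is unused in the body, which is harmless for a parameter but would warn for a local like `width`.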


Re: [Mesa-dev] [PATCH] [v2] i965/skl: Add two missing device IDs

2016-02-16 Thread Ben Widawsky

On Wed, Feb 17, 2016 at 04:02:32AM +0200, Grazvydas Ignotas wrote:
> On Wed, Feb 17, 2016 at 1:45 AM, Ben Widawsky
>  wrote:
> > The Iris part is left unbranded because we did not have these with original SKL.
> >
> > v2: 0x192d is gt3, not gt4
> 
> The name and description still don't agree.
> 

Oops. I wasn't paying careful attention since I immediately change it in the
next patch, but since I want this one backported, I should fix it. I guess I
need to bother with a v3 too so that Emil can pick up the proper one.

Sheesh... v3 for this :-)

> >
> > Cc: "11.0 11.1"
> > Signed-off-by: Ben Widawsky
> > ---
> >  include/pci_ids/i965_pci_ids.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> > index 5139e27..77d38fc 100644
> > --- a/include/pci_ids/i965_pci_ids.h
> > +++ b/include/pci_ids/i965_pci_ids.h
> > @@ -112,6 +112,7 @@ CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
> >  CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
> >  CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
> >  CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
> > +CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
> >  CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")
> >  CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
> >  CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")
> > @@ -128,6 +129,7 @@ CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
> >  CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
> >  CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
> >  CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
> > +CHIPSET(0x192D, skl_gt3, "Intel(R) Skylake GT4")
> >  CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
> >  CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
> >  CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
> > --
> > 2.7.1
> >

-- 
Ben Widawsky, Intel Open Source Technology Center

Re: [Mesa-dev] [PATCH] [v2] i965/skl: Add two missing device IDs

2016-02-16 Thread Grazvydas Ignotas

On Wed, Feb 17, 2016 at 1:45 AM, Ben Widawsky
 wrote:
> The Iris part is left unbranded because we did not have these with original SKL.
>
> v2: 0x192d is gt3, not gt4

The name and description still don't agree.

>
> Cc: "11.0 11.1"
> Signed-off-by: Ben Widawsky
> ---
>  include/pci_ids/i965_pci_ids.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index 5139e27..77d38fc 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -112,6 +112,7 @@ CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
>  CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
>  CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
>  CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
> +CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
>  CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")
>  CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
>  CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")
> @@ -128,6 +129,7 @@ CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
>  CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
>  CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
>  CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
> +CHIPSET(0x192D, skl_gt3, "Intel(R) Skylake GT4")
>  CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
>  CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
>  CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
> --
> 2.7.1
>

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Jason Ekstrand

On Tue, Feb 16, 2016 at 1:21 PM, Olivier Galibert 
wrote:

>   Hi,
>
> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
> should I go about debugging that?
>

It's a depth-stencil issue and we know about it.   The gen7 code needs some
love.   I think Kristian and Jordan have been working on it.
--Jason


>
>   OG.
>
>
> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand 
> wrote:
> > The Intel mesa team is pleased to announce a brand-new open-source Vulkan
> > driver for Intel hardware.  We've been working hard on this over the
> course
> > of the past year or so and are excited to finally share it with the
> > community.  We will work on up-streaming the driver in the next few weeks
> > and hope to have it all in place in time for mesa 11.3 (mesa 12?).  In
> the
> > mean time, the driver can be found in the "vulkan" branch of the mesa git
> > repo on freedesktop.org:
> >
> > https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
> >
> > More information on building the driver and running a few simple apps can
> > be found on the 01.org web site:
> >
> >
> https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
> >
> > We have talked to people at Red Hat and Canonical and binaries should be
> > available for Fedora and Ubuntu soon.  We will update the page on 01.org
> > with links as soon as they are available.
> >
> > We have also created a small test suite called crucible which contains a
> > few hundred tests (mostly for miptrees) that we created when bringing up
> > the driver.  This isn't really intended to be the piglit of vulkan.  With
> > the CTS being publicly available, most cross-platform tests should go
> > there.  We mostly made crucible so that we could write a few tests early on
> > to get us going and for tests that were targeted specifically at our
> > implementation.  Nonetheless, they may prove useful to someone and we are
> > happy to share them.  The crucible source code can be found at
> >
> > https://cgit.freedesktop.org/mesa/crucible/
> >
> > Frequently Asked Questions:
> >
> > What all hardware does it support?
> >
> >The driver currently supports Sky Lake all the way back to Ivy Bridge.
> >The driver is Vulkan 1.0 conformant for 64-bit builds on Sky Lake,
> >Broadwell, and Braswell.  We are still having a couple of 32-bit
> issues
> >and support for Haswell, Ivy Bridge, and Bay Trail should be
> considered
> >experimental.
> >
> > How much code is shared between the Vulkan and GL drivers?
> >
> >For shaders, we're using a SPIR-V to NIR pass which is new, and a few
> >new NIR lowering passes for things that we previously depended on GLSL
> >IR to handle.  Beyond that, we're using the same core NIR and the same
> >back-end compiler that we have for GL.  We're carrying a few patches
> >against the back-end compiler, but the delta is very small and it's
> all
> >stuff that we eventually want to do for GL anyway.
> >
> >The main API handling and state setup code is all new and written from
> >the ground-up for Vulkan.  For actually packing hardware packets, we are
> >using a codegen system that Kristian developed early on in the project
> >that's based on an XML description of the hardware packets.  The result
> >is state setup code that's both easier to work with and maybe even a
> >little more efficient than what we have in mesa today.
> >
> >We also have a brand-new surface layout library called ISL that handles
> >all of the surface layout calculations.  ISL should have most of the
> >code required to do surface layout all the way back to gen4.  Once we
> >get aux surface support in ISL (required for HiZ, MSAA compression, and
> >CCMS/fast clears), we hope to start using it in the GL driver as well.
> >
> > How much code could be shared with other Vulkan drivers?
> >
> >Not as much as you would think.  The SPIR-V to NIR translator and the
> >rest of the NIR compiler stack could obviously be re-used by anyone
> >willing to tie NIR into their back-end.  The rest of the driver is, and
> >will probably stay, Intel-specific.  Vulkan is a very low-level API,
> >possibly even lower-level than gallium.  A lot of the things that we
> >share between drivers in mesa today (the front-end compiler, state
> >tracking, error-handling, etc.) are pushed off to either the application
> >or third-party layers in the Vulkan world.  That said, anyone wishing to
> >write their own Vulkan driver is more than welcome to use ours as a
> >reference and steal whatever they'd like from it.
> >
> > What are your up-streaming plans?
> >
> >Before we can land the SPIR-V to NIR layer, there are a number of core
> >NIR changes that need to land first.  All of that code needs to be
> >reviewed as it interacts with the GL driver and we don't w

Re: [Mesa-dev] ANNOUNCE: An open-source Vulkan driver for Intel hardware

2016-02-16 Thread Connor Abbott

The first thing I would do is to try running it under valgrind and see
if there are any errors. Make sure to have the valgrind development
headers installed when building mesa, as anvil has a lot of special
valgrind hooks builtin that it will only use if it finds the header
during the build.
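
As a rough illustration of what such build-time hooks tend to look like, here is a minimal sketch. It assumes the build system defines HAVE_VALGRIND when the headers are found; the VG() wrapper and pool_reset() are made-up names for this example and are not copied from anvil.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef HAVE_VALGRIND
/* Only pulled in when the development headers were found at build time. */
#include <valgrind/memcheck.h>
#define VG(x) x
#else
/* Without the headers, every hook compiles away to nothing. */
#define VG(x) ((void)0)
#endif

/* Hypothetical helper: recycle a block and tell memcheck that its contents
 * must not be relied upon until the next owner writes to it. */
static size_t pool_reset(void *mem, size_t size)
{
   assert(mem != NULL);
   memset(mem, 0, size);
   VG(VALGRIND_MAKE_MEM_UNDEFINED(mem, size));
   return size;
}
```

Built without the headers, the hook is a no-op, which is why a driver built that way gives valgrind much less to work with.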

On Tue, Feb 16, 2016 at 4:21 PM, Olivier Galibert  wrote:
>   Hi,
>
> I'm getting gpu hangs with the lunarg examples (cube and tri) on my
> Haswell (64 bits).  I attach /sys/class/drm/card0/error fwiw.  How
> should I go about debugging that?
>
>   OG.
>
>
> On Tue, Feb 16, 2016 at 4:19 PM, Jason Ekstrand  wrote:
>> The Intel mesa team is pleased to announce a brand-new open-source Vulkan
>> driver for Intel hardware.  We've been working hard on this over the course
>> of the past year or so and are excited to finally share it with the
>> community.  We will work on up-streaming the driver in the next few weeks
>> and hope to have it all in place in time for mesa 11.3 (mesa 12?).  In the
>> mean time, the driver can be found in the "vulkan" branch of the mesa git
>> repo on freedesktop.org:
>>
>> https://cgit.freedesktop.org/mesa/mesa/log/?h=vulkan
>>
>> More information on building the driver and running a few simple apps can
>> be found on the 01.org web site:
>>
>> https://01.org/linuxgraphics/blogs/jekstrand/2016/open-source-vulkan-drivers-intel-hardware
>>
>> We have talked to people at Red Hat and Canonical and binaries should be
>> available for Fedora and Ubuntu soon.  We will update the page on 01.org
>> with links as soon as they are available.
>>
>> We have also created a small test suite called crucible which contains a
>> few hundred tests (mostly for miptrees) that we created when bringing up
>> the driver.  This isn't really intended to be the piglit of vulkan.  With
>> the CTS being publicly available, most cross-platform tests should go
>> there.  We mostly made crucible so that we could write a few tests early on
>> to get us going and for tests that were targeted specifically at our
>> implementation.  Nonetheless, they may prove useful to someone and we are
>> happy to share them.  The crucible source code can be found at
>>
>> https://cgit.freedesktop.org/mesa/crucible/
>>
>> Frequently Asked Questions:
>>
>> What all hardware does it support?
>>
>>The driver currently supports Sky Lake all the way back to Ivy Bridge.
>>The driver is Vulkan 1.0 conformant for 64-bit builds on Sky Lake,
>>Broadwell, and Braswell.  We are still having a couple of 32-bit issues
>>and support for Haswell, Ivy Bridge, and Bay Trail should be considered
>>experimental.
>>
>> How much code is shared between the Vulkan and GL drivers?
>>
>>For shaders, we're using a SPIR-V to NIR pass which is new, and a few
>>new NIR lowering passes for things that we previously depended on GLSL
>>IR to handle.  Beyond that, we're using the same core NIR and the same
>>back-end compiler that we have for GL.  We're carrying a few patches
>>against the back-end compiler, but the delta is very small and it's all
>>stuff that we eventually want to do for GL anyway.
>>
>>The main API handling and state setup code is all new and written from
>>the ground-up for Vulkan.  For actually packing hardware packets, we are
>>using a codegen system that Kristian developed early on in the project
>>that's based on an XML description of the hardware packets.  The result
>>is state setup code that's both easier to work with and maybe even a
>>little more efficient than what we have in mesa today.
>>
>>We also have a brand-new surface layout library called ISL that handles
>>all of the surface layout calculations.  ISL should have most of the
>>code required to do surface layout all the way back to gen4.  Once we
>>get aux surface support in ISL (required for HiZ, MSAA compression, and
>>CCMS/fast clears), we hope to start using it in the GL driver as well.
>>
>> How much code could be shared with other Vulkan drivers?
>>
>>Not as much as you would think.  The SPIR-V to NIR translator and the
>>rest of the NIR compiler stack could obviously be re-used by anyone
>>willing to tie NIR into their back-end.  The rest of the driver is, and
>>will probably stay, Intel-specific.  Vulkan is a very low-level API,
>>possibly even lower-level than gallium.  A lot of the things that we
>>share between drivers in mesa today (the front-end compiler, state
>>tracking, error-handling, etc.) are pushed off to either the application
>>or third-party layers in the Vulkan world.  That said, anyone wishing to
>>write their own Vulkan driver is more than welcome to use ours as a
>>reference and steal whatever they'd like from it.
>>
>> What are your up-streaming plans?
>>
>>Before we can land the SPIR-V to NIR layer, there are a number of core
>>NIR changes that need to land first.  All of that code needs to be
>>reviewed as it interacts

[Mesa-dev] [PATCH] anv: fix Get*MemoryRequirements for !LLC

2016-02-16 Thread Connor Abbott

AFAIK buffers and images can be backed by coherent or non-coherent
memory types. Found by inspection, only compile tested.

Signed-off-by: Connor Abbott 
---
 src/vulkan/anv_device.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/vulkan/anv_device.c b/src/vulkan/anv_device.c
index dfc29e4..453d66f 100644
--- a/src/vulkan/anv_device.c
+++ b/src/vulkan/anv_device.c
@@ -1259,11 +1259,12 @@ VkResult anv_InvalidateMappedMemoryRanges(
 }
 
 void anv_GetBufferMemoryRequirements(
-    VkDevice                                    device,
+    VkDevice                                    _device,
     VkBuffer                                    _buffer,
     VkMemoryRequirements*                       pMemoryRequirements)
 {
ANV_FROM_HANDLE(anv_buffer, buffer, _buffer);
+   ANV_FROM_HANDLE(anv_device, device, _device);
 
/* The Vulkan spec (git aaed022) says:
 *
@@ -1272,20 +1273,21 @@ void anv_GetBufferMemoryRequirements(
 *only if the memory type `i` in the VkPhysicalDeviceMemoryProperties
 *structure for the physical device is supported.
 *
-* We support exactly one memory type.
+* We support exactly one memory type on LLC, two on non-LLC.
 */
-   pMemoryRequirements->memoryTypeBits = 1;
+   pMemoryRequirements->memoryTypeBits = device->info.has_llc ? 1 : 3;
 
pMemoryRequirements->size = buffer->size;
pMemoryRequirements->alignment = 16;
 }
 
 void anv_GetImageMemoryRequirements(
-    VkDevice                                    device,
+    VkDevice                                    _device,
     VkImage                                     _image,
     VkMemoryRequirements*                       pMemoryRequirements)
 {
ANV_FROM_HANDLE(anv_image, image, _image);
+   ANV_FROM_HANDLE(anv_device, device, _device);
 
/* The Vulkan spec (git aaed022) says:
 *
@@ -1294,9 +1296,9 @@ void anv_GetImageMemoryRequirements(
 *only if the memory type `i` in the VkPhysicalDeviceMemoryProperties
 *structure for the physical device is supported.
 *
-* We support exactly one memory type.
+* We support exactly one memory type on LLC, two on non-LLC.
 */
-   pMemoryRequirements->memoryTypeBits = 1;
+   pMemoryRequirements->memoryTypeBits = device->info.has_llc ? 1 : 3;
 
pMemoryRequirements->size = image->size;
pMemoryRequirements->alignment = image->alignment;
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
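
For readers following along: memoryTypeBits is a bitmask in which bit i says that memory type i may back the resource, so the patch above moves from "type 0 only" (1) to "type 0 or 1" (3) on non-LLC parts. Below is a minimal sketch of how an application might consume that mask. It deliberately avoids the Vulkan headers; pick_memory_type() and the flag values in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the lowest memory type index that is allowed by type_bits and has
 * all of the required property flags, or -1 if no such type exists.
 * type_flags[i] holds the (hypothetical) property flags of memory type i. */
static int pick_memory_type(uint32_t type_bits,
                            const uint32_t *type_flags, uint32_t type_count,
                            uint32_t required_flags)
{
   assert(type_count <= 32);
   for (uint32_t i = 0; i < type_count; i++) {
      if ((type_bits & (1u << i)) &&
          (type_flags[i] & required_flags) == required_flags)
         return (int)i;
   }
   return -1;
}
```

With two types whose flags are 0x1 and 0x2, a mask of 1 never lets the caller reach the second type, while a mask of 3 does.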

[Mesa-dev] [PATCH] [v2] i965/skl: Add two missing device IDs

2016-02-16 Thread Ben Widawsky

The Iris part is left unbranded because we did not have these with original SKL.

v2: 0x192d is gt3, not gt4

Cc: "11.0 11.1" 
---
 include/pci_ids/i965_pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 5139e27..77d38fc 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -112,6 +112,7 @@ CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
 CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
+CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")
 CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
 CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")
@@ -128,6 +129,7 @@ CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
 CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
 CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
+CHIPSET(0x192D, skl_gt3, "Intel(R) Skylake GT4")
 CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
-- 
2.7.1
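
The CHIPSET() lines only mean something once a consumer defines the macro before expanding the list. The following is a minimal sketch of that pattern, assuming the usual X-macro approach; SKETCH_PCI_IDS stands in for including the real header, and the sketch_* names are hypothetical. The two entries are copied from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the ID header: one added line and one context line. */
#define SKETCH_PCI_IDS \
   CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)") \
   CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")

struct sketch_chipset {
   int id;
   const char *family;
   const char *name;
};

/* Expand each CHIPSET() entry into one table row. */
#define CHIPSET(id, family, name) { id, #family, name },
static const struct sketch_chipset sketch_chipsets[] = { SKETCH_PCI_IDS };
#undef CHIPSET

/* Linear lookup by PCI device ID; returns NULL for unknown devices. */
static const struct sketch_chipset *sketch_lookup(int id)
{
   size_t count = sizeof(sketch_chipsets) / sizeof(sketch_chipsets[0]);
   for (size_t i = 0; i < count; i++) {
      if (sketch_chipsets[i].id == id)
         return &sketch_chipsets[i];
   }
   return NULL;
}
```

A device whose ID is missing from the list comes back as NULL here, which is the situation this patch addresses for 0x190B and 0x192D.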


[Mesa-dev] [PATCH] [v2] i965/skl: Update Skylake renderer strings

2016-02-16 Thread Ben Widawsky

Also adds some of the Iris/Pro parts which we previously didn't have named.

v2: 0x192d is gt3, not gt4

Signed-off-by: Ben Widawsky 
---
 include/pci_ids/i965_pci_ids.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 77d38fc..e7a54f8 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -123,17 +123,17 @@ CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake GT2")
 CHIPSET(0x191B, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
 CHIPSET(0x191D, skl_gt2, "Intel(R) HD Graphics P530 (Skylake GT2)")
 CHIPSET(0x191E, skl_gt2, "Intel(R) HD Graphics 515 (Skylake GT2)")
-CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake GT2")
-CHIPSET(0x1923, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3e)")
-CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
+CHIPSET(0x1921, skl_gt2, "Intel(R) HD Graphics 520 (Skylake GT2)")
+CHIPSET(0x1923, skl_gt3, "Intel(R) Skylake GT3e")
+CHIPSET(0x1926, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3)")
 CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
 CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
-CHIPSET(0x192D, skl_gt3, "Intel(R) Skylake GT4")
-CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")
+CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics 555 (Skylake GT3e)")
+CHIPSET(0x192D, skl_gt3, "Intel(R) Iris Graphics P555 (Skylake GT3e)")
+CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4)")
+CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4)")
+CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4)")
+CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4)")
 CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
-- 
2.7.1


[Mesa-dev] [PATCH] [v2] intel: Add missing SKL device IDs

2016-02-16 Thread Ben Widawsky

A new list yielded new devices that apparently have shipped, or will ship.

v2: I can't read. 0x192d is GT3

Signed-off-by: Ben Widawsky 
---
 intel/intel_chipset.h | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
index 35148e5..9c24701 100644
--- a/intel/intel_chipset.h
+++ b/intel/intel_chipset.h
@@ -168,6 +168,7 @@
 #define PCI_CHIP_SKYLAKE_DT_GT1    0x1902
 #define PCI_CHIP_SKYLAKE_ULT_GT1   0x1906
 #define PCI_CHIP_SKYLAKE_SRV_GT1   0x190A /* Reserved */
+#define PCI_CHIP_SKYLAKE_H_GT1 0x190B
 #define PCI_CHIP_SKYLAKE_ULX_GT1   0x190E /* Reserved */
 #define PCI_CHIP_SKYLAKE_DT_GT2    0x1912
 #define PCI_CHIP_SKYLAKE_FUSED0_GT2 0x1913 /* Reserved */
@@ -182,6 +183,7 @@
 #define PCI_CHIP_SKYLAKE_GT3   0x1926
 #define PCI_CHIP_SKYLAKE_HALO_GT3  0x192B /* Reserved */
 #define PCI_CHIP_SKYLAKE_SRV_GT4   0x192A
+#define PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3 0x192D
 #define PCI_CHIP_SKYLAKE_DT_GT4    0x1932
 #define PCI_CHIP_SKYLAKE_SRV_GT4X  0x193A
 #define PCI_CHIP_SKYLAKE_H_GT4 0x193B
@@ -376,7 +378,8 @@
 #define IS_SKL_GT1(devid)  ((devid) == PCI_CHIP_SKYLAKE_ULT_GT1|| \
 (devid) == PCI_CHIP_SKYLAKE_ULX_GT1|| \
 (devid) == PCI_CHIP_SKYLAKE_DT_GT1 || \
-(devid) == PCI_CHIP_SKYLAKE_SRV_GT1)
+(devid) == PCI_CHIP_SKYLAKE_SRV_GT1|| \
+(devid) == PCI_CHIP_SKYLAKE_H_GT1)
 
 #define IS_SKL_GT2(devid)  ((devid) == PCI_CHIP_SKYLAKE_DT_GT2 || \
 (devid) == PCI_CHIP_SKYLAKE_FUSED0_GT2 || \
@@ -390,13 +393,15 @@
 (devid) == PCI_CHIP_SKYLAKE_MOBILE_GT2)
 
 #define IS_SKL_GT3(devid)  ((devid) == PCI_CHIP_SKYLAKE_GT3|| \
-(devid) == PCI_CHIP_SKYLAKE_HALO_GT3)
+(devid) == PCI_CHIP_SKYLAKE_HALO_GT3   || \
+(devid) == PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3)
+
 
 #define IS_SKL_GT4(devid)  ((devid) == PCI_CHIP_SKYLAKE_SRV_GT4|| \
 (devid) == PCI_CHIP_SKYLAKE_DT_GT4 || \
 (devid) == PCI_CHIP_SKYLAKE_SRV_GT4X   || \
 (devid) == PCI_CHIP_SKYLAKE_H_GT4  || \
 (devid) == PCI_CHIP_SKYLAKE_WKS_GT4)
 
 #define IS_KBL_GT1(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT1_5 || \
 (devid) == PCI_CHIP_KABYLAKE_ULX_GT1_5 || \
-- 
2.7.1
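
As a quick sanity check of the v2 intent (0x192D classified as GT3, not GT4), here is a minimal standalone sketch. It uses only the device IDs that appear in the hunks above, with every macro closed by a parenthesis; the SKETCH_ prefix marks the macros as stand-ins for the real ones, and sketch_skl_gt_level() is a hypothetical helper.

```c
#include <assert.h>

/* Device IDs as they appear in the diff above. */
#define PCI_CHIP_SKYLAKE_GT3            0x1926
#define PCI_CHIP_SKYLAKE_HALO_GT3       0x192B
#define PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3  0x192D
#define PCI_CHIP_SKYLAKE_SRV_GT4        0x192A
#define PCI_CHIP_SKYLAKE_DT_GT4         0x1932
#define PCI_CHIP_SKYLAKE_SRV_GT4X       0x193A
#define PCI_CHIP_SKYLAKE_H_GT4          0x193B

#define SKETCH_IS_SKL_GT3(devid) ((devid) == PCI_CHIP_SKYLAKE_GT3      || \
                                  (devid) == PCI_CHIP_SKYLAKE_HALO_GT3 || \
                                  (devid) == PCI_CHIP_SKYLAKE_MEDIA_SRV_GT3)

#define SKETCH_IS_SKL_GT4(devid) ((devid) == PCI_CHIP_SKYLAKE_SRV_GT4  || \
                                  (devid) == PCI_CHIP_SKYLAKE_DT_GT4   || \
                                  (devid) == PCI_CHIP_SKYLAKE_SRV_GT4X || \
                                  (devid) == PCI_CHIP_SKYLAKE_H_GT4)

/* Hypothetical helper: 3 for GT3, 4 for GT4, 0 for anything else. */
static int sketch_skl_gt_level(int devid)
{
   if (SKETCH_IS_SKL_GT3(devid))
      return 3;
   if (SKETCH_IS_SKL_GT4(devid))
      return 4;
   return 0;
}
```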


[Mesa-dev] [PATCH] mesa: default DepthMode to GL_RED on ES 3.0

2016-02-16 Thread Ilia Mirkin

See commit 9db2098d which did it internally to the i965 driver. No
reason not to have this more globally set though.

This fixes depth in a bunch of dEQP EXT_texture_border_clamp tests. And
probably other items as well.

Signed-off-by: Ilia Mirkin 
Cc: Ian Romanick 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/main/texobj.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
index d8407f0..2b9c80a 100644
--- a/src/mesa/main/texobj.c
+++ b/src/mesa/main/texobj.c
@@ -320,7 +320,8 @@ _mesa_initialize_texture_object( struct gl_context *ctx,
obj->Sampler.MaxAnisotropy = 1.0;
obj->Sampler.CompareMode = GL_NONE; /* ARB_shadow */
obj->Sampler.CompareFunc = GL_LEQUAL;   /* ARB_shadow */
-   obj->DepthMode = ctx->API == API_OPENGL_CORE ? GL_RED : GL_LUMINANCE;
+   obj->DepthMode = (ctx->API == API_OPENGL_CORE || _mesa_is_gles3(ctx)) ?
+  GL_RED : GL_LUMINANCE;
obj->StencilSampling = false;
obj->Sampler.CubeMapSeamless = GL_FALSE;
obj->Swizzle[0] = GL_RED;
-- 
2.4.10
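
To spell out the selection rule the patch introduces, here is a minimal standalone sketch. The sketch_* types are stand-ins for gl_context and friends, and sketch_is_gles3() is only assumed to mirror what _mesa_is_gles3() checks (an ES2-family context with version 3.0 or later); the GL_RED and GL_LUMINANCE values are the standard enum values.

```c
#include <assert.h>
#include <stdbool.h>

#define GL_RED       0x1903
#define GL_LUMINANCE 0x1909

/* Hypothetical stand-ins for the context's API enum and version field. */
enum sketch_api {
   SKETCH_API_OPENGL_COMPAT,
   SKETCH_API_OPENGLES,
   SKETCH_API_OPENGLES2,
   SKETCH_API_OPENGL_CORE
};

struct sketch_context {
   enum sketch_api api;
   unsigned version; /* 30 means 3.0 */
};

static bool sketch_is_gles3(const struct sketch_context *ctx)
{
   return ctx->api == SKETCH_API_OPENGLES2 && ctx->version >= 30;
}

/* Core profile and ES 3.0+ default to GL_RED, everything else keeps the
 * legacy GL_LUMINANCE default. */
static unsigned sketch_default_depth_mode(const struct sketch_context *ctx)
{
   return (ctx->api == SKETCH_API_OPENGL_CORE || sketch_is_gles3(ctx)) ?
          GL_RED : GL_LUMINANCE;
}
```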


[Mesa-dev] [PATCH] intel: Add missing SKL device IDs

2016-02-16 Thread Ben Widawsky

A new list yielded new devices that apparently have shipped, or will ship.

Signed-off-by: Ben Widawsky 
---
 intel/intel_chipset.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
index 35148e5..392f7ba 100644
--- a/intel/intel_chipset.h
+++ b/intel/intel_chipset.h
@@ -168,6 +168,7 @@
 #define PCI_CHIP_SKYLAKE_DT_GT1    0x1902
 #define PCI_CHIP_SKYLAKE_ULT_GT1   0x1906
 #define PCI_CHIP_SKYLAKE_SRV_GT1   0x190A /* Reserved */
+#define PCI_CHIP_SKYLAKE_H_GT1 0x190B
 #define PCI_CHIP_SKYLAKE_ULX_GT1   0x190E /* Reserved */
 #define PCI_CHIP_SKYLAKE_DT_GT2    0x1912
 #define PCI_CHIP_SKYLAKE_FUSED0_GT2 0x1913 /* Reserved */
@@ -182,6 +183,7 @@
 #define PCI_CHIP_SKYLAKE_GT3   0x1926
 #define PCI_CHIP_SKYLAKE_HALO_GT3  0x192B /* Reserved */
 #define PCI_CHIP_SKYLAKE_SRV_GT4   0x192A
+#define PCI_CHIP_SKYLAKE_MEDIA_SRV_GT4 0x192D
 #define PCI_CHIP_SKYLAKE_DT_GT4    0x1932
 #define PCI_CHIP_SKYLAKE_SRV_GT4X  0x193A
 #define PCI_CHIP_SKYLAKE_H_GT4 0x193B
@@ -376,7 +378,8 @@
 #define IS_SKL_GT1(devid)  ((devid) == PCI_CHIP_SKYLAKE_ULT_GT1|| \
 (devid) == PCI_CHIP_SKYLAKE_ULX_GT1|| \
 (devid) == PCI_CHIP_SKYLAKE_DT_GT1 || \
-(devid) == PCI_CHIP_SKYLAKE_SRV_GT1)
+(devid) == PCI_CHIP_SKYLAKE_SRV_GT1|| \
+(devid) == PCI_CHIP_SKYLAKE_H_GT1)
 
 #define IS_SKL_GT2(devid)  ((devid) == PCI_CHIP_SKYLAKE_DT_GT2 || \
 (devid) == PCI_CHIP_SKYLAKE_FUSED0_GT2 || \
@@ -396,7 +399,8 @@
 (devid) == PCI_CHIP_SKYLAKE_DT_GT4 || \
 (devid) == PCI_CHIP_SKYLAKE_SRV_GT4X   || \
 (devid) == PCI_CHIP_SKYLAKE_H_GT4  || \
-(devid) == PCI_CHIP_SKYLAKE_WKS_GT4)
+(devid) == PCI_CHIP_SKYLAKE_WKS_GT4|| \
+(devid) == PCI_CHIP_SKYLAKE_MEDIA_SRV_GT4)
 
 #define IS_KBL_GT1(devid)  ((devid) == PCI_CHIP_KABYLAKE_ULT_GT1_5 || \
 (devid) == PCI_CHIP_KABYLAKE_ULX_GT1_5 || \
-- 
2.7.1


[Mesa-dev] [PATCH 2/2] i965/skl: Update Skylake renderer strings

2016-02-16 Thread Ben Widawsky

Also adds some of the Iris/Pro parts which we previously didn't have named.

Signed-off-by: Ben Widawsky 
---
 include/pci_ids/i965_pci_ids.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index c049b78..c038f9b 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -123,17 +123,17 @@ CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake GT2")
 CHIPSET(0x191B, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
 CHIPSET(0x191D, skl_gt2, "Intel(R) HD Graphics P530 (Skylake GT2)")
 CHIPSET(0x191E, skl_gt2, "Intel(R) HD Graphics 515 (Skylake GT2)")
-CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake GT2")
-CHIPSET(0x1923, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3e)")
-CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
+CHIPSET(0x1921, skl_gt2, "Intel(R) HD Graphics 520 (Skylake GT2)")
+CHIPSET(0x1923, skl_gt3, "Intel(R) Skylake GT3e")
+CHIPSET(0x1926, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3)")
 CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
 CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
-CHIPSET(0x192D, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
-CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")
+CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics 555 (Skylake GT3e)")
+CHIPSET(0x192D, skl_gt4, "Intel(R) Iris Graphics P555 Skylake GT4")
+CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4)")
+CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4)")
+CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4)")
+CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4)")
 CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")
 CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")
-- 
2.7.1


[Mesa-dev] [PATCH 1/2] i965/skl: Add two missing device IDs

2016-02-16 Thread Ben Widawsky

The Iris part is left unbranded because we did not have these with original SKL.

Cc: "11.0 11.1" 
---
 include/pci_ids/i965_pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index 5139e27..c049b78 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -112,6 +112,7 @@ CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")
 CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")
+CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")
 CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")
 CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")
 CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")
@@ -128,6 +129,7 @@ CHIPSET(0x1926, skl_gt3, "Intel(R) HD Graphics 535 (Skylake GT3)")
 CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")
 CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics (Skylake GT3fe)")
+CHIPSET(0x192D, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
 CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
-- 
2.7.1


Re: [Mesa-dev] [PATCH] anv: pCreateInfo->pApplicationInfo parameter to vkCreateInstance may be NULL

2016-02-16 Thread Jason Ekstrand

On Tue, Feb 16, 2016 at 1:55 PM, Philipp Zabel 
wrote:

> Fix a NULL pointer dereference in anv_CreateInstance in case
> the pApplicationInfo field of the supplied VkInstanceCreateInfo
> structure is NULL [1].
>
> [1]
> https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#VkInstanceCreateInfo
>
> Signed-off-by: Philipp Zabel 
> ---
>  src/vulkan/anv_device.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/vulkan/anv_device.c b/src/vulkan/anv_device.c
> index a6ce176..6863906 100644
> --- a/src/vulkan/anv_device.c
> +++ b/src/vulkan/anv_device.c
> @@ -214,7 +214,9 @@ VkResult anv_CreateInstance(
>
> assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO);
>
> -   uint32_t client_version = pCreateInfo->pApplicationInfo->apiVersion;
> +   uint32_t client_version = pCreateInfo->pApplicationInfo ?
> + pCreateInfo->pApplicationInfo->apiVersion :
> + VK_MAKE_VERSION(1, 0, 0);
>

That seems like a reasonable thing to do.  Kind of silly not to provide a
version though.

Pushed.  thanks!


> if (VK_MAKE_VERSION(1, 0, 0) > client_version ||
> client_version > VK_MAKE_VERSION(1, 0, 3)) {
>return vk_errorf(VK_ERROR_INCOMPATIBLE_DRIVER,
> @@ -249,7 +251,7 @@ VkResult anv_CreateInstance(
> else
>instance->alloc = default_alloc;
>
> -   instance->apiVersion = pCreateInfo->pApplicationInfo->apiVersion;
> +   instance->apiVersion = client_version;
> instance->physicalDeviceCount = -1;
>
> _mesa_locale_init();
> --
> 2.7.0
>
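
For anyone who wants to poke at the fallback outside the driver, here is a minimal standalone sketch of the logic in the quoted patch. The structs are trimmed, hypothetical stand-ins for the Vulkan ones, and SKETCH_MAKE_VERSION is assumed to use the same bit layout as VK_MAKE_VERSION in the 1.0 headers (major << 22, minor << 12, patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAKE_VERSION(major, minor, patch) \
   (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* Hypothetical, trimmed-down stand-ins for the Vulkan structures. */
struct sketch_app_info {
   uint32_t apiVersion;
};

struct sketch_instance_create_info {
   const struct sketch_app_info *pApplicationInfo; /* may be NULL */
};

/* Fall back to 1.0.0 when the application supplied no application info. */
static uint32_t
sketch_client_version(const struct sketch_instance_create_info *info)
{
   assert(info != NULL);
   return info->pApplicationInfo ? info->pApplicationInfo->apiVersion
                                 : SKETCH_MAKE_VERSION(1, 0, 0);
}

/* Same range check as in the quoted hunk: accept 1.0.0 through 1.0.3. */
static int sketch_version_supported(uint32_t version)
{
   return version >= SKETCH_MAKE_VERSION(1, 0, 0) &&
          version <= SKETCH_MAKE_VERSION(1, 0, 3);
}
```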

Re: [Mesa-dev] [PATCH 05/10] postprocess: fix new gcc6 warnings

2016-02-16 Thread Matt Turner

On Tue, Feb 16, 2016 at 10:58 AM, Rob Clark  wrote:
> In file included from src/gallium/state_trackers/dri/dri_screen.h:44:0,
>  from src/gallium/state_trackers/dri/dri_query_renderer.c:7:
> src/gallium/auxiliary/postprocess/filters.h:54:33: warning: ‘pp_filters’
> defined but not used [-Wunused-const-variable]
>  static const struct pp_filter_t pp_filters[PP_FILTERS] = {
>  ^~
>
> Note, this one we may actually want to move into an .c file instead?

I think that would be best. I expect this table is being duplicated
for each compilation unit that uses it.
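
A minimal sketch of the split being suggested, shown in a single block for brevity. The struct layout, the entries and the sketch_* names are hypothetical and are not the real pp_filters table; the point is only that the header keeps declarations while one .c file owns the data.

```c
#include <assert.h>
#include <stddef.h>

/* --- What would stay in the header: the type and declarations only. --- */
struct sketch_filter {
   const char *name;
   unsigned passes;
};

extern const struct sketch_filter sketch_filters[];
extern const size_t sketch_filter_count;

/* --- What would move into a single .c file: the one definition. --- */
const struct sketch_filter sketch_filters[] = {
   { "filter_a", 1 },
   { "filter_b", 2 },
};

const size_t sketch_filter_count =
   sizeof(sketch_filters) / sizeof(sketch_filters[0]);

/* Users go through the extern declarations, so no per-file copy exists. */
static unsigned sketch_total_passes(void)
{
   unsigned total = 0;
   for (size_t i = 0; i < sketch_filter_count; i++)
      total += sketch_filters[i].passes;
   return total;
}
```

Since the array size is no longer visible to includers, the count has to be exported alongside the table.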

Re: [Mesa-dev] [PATCH 01/10] util: fix new gcc6 warnings

2016-02-16 Thread Matt Turner

On Tue, Feb 16, 2016 at 11:37 AM, Ian Romanick  wrote:
> On 02/16/2016 10:57 AM, Rob Clark wrote:
>> src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined but not used [-Wunused-const-variable]
>>  static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
>>^~
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/util/hash_table.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/util/hash_table.h b/src/util/hash_table.h
>> index 85b013c..a0244d7 100644
>> --- a/src/util/hash_table.h
>> +++ b/src/util/hash_table.h
>> @@ -108,7 +108,7 @@ static inline uint32_t _mesa_hash_pointer(const void *pointer)
>> return _mesa_hash_data(&pointer, sizeof(pointer));
>>  }
>>
>> -static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
>> +static const uint32_t _mesa_fnv32_1a_offset_bias UNUSED = 2166136261u;
>
> Looking at how it's used in the code, this seems like it should either
> be a #define or an anonymous enum.  I mean, I had to go look at the
> code to figure out why it should be UNUSED instead of just removed. :)
>
> enum { _mesa_fnv32_1a_offset_bias = 2166136261u };

I agree either of these would be better. Putting static data in a
header file is strange.
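
For reference, a minimal sketch of the #define flavour. One caveat on the enum flavour: 2166136261 does not fit in an int, so before C23 an enumerator with that value relies on a compiler extension, which makes the macro the more portable of the two. The SKETCH_/sketch_ names are hypothetical; the prime 16777619 is the standard 32-bit FNV prime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Macros instead of "static const" objects: nothing is emitted per
 * compilation unit and there is nothing to warn about when unused. */
#define SKETCH_FNV32_1A_OFFSET_BIAS 2166136261u
#define SKETCH_FNV32_1A_PRIME       16777619u

/* 32-bit FNV-1a over a byte buffer: xor each byte in, then multiply. */
static inline uint32_t sketch_fnv32_1a(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = SKETCH_FNV32_1A_OFFSET_BIAS;

   assert(bytes != NULL || size == 0);
   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= SKETCH_FNV32_1A_PRIME;
   }
   return hash;
}
```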

Re: [Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Brian Paul


On 02/16/2016 02:41 PM, Dave Airlie wrote:

On 17 February 2016 at 04:39, Jason Ekstrand  wrote:

So, we just pushed a branch containing a Vulkan driver.  Naturally, we
would like to incorporate that driver into the upstream mesa tree.  While
we work on upstreaming the prerequisites in NIR and the i965 back-end
compiler, there is a question that needs answering:  Where do we put it?

The Vulkan driver challenges the tree-like nature of the way mesa is
currently organized.  We now have two drivers that share a lot of the same
underlying hardware-specific code (compiler and ISL) but target different
APIs and no gallium-like middle layer to hide behind.  Obviously, we don't
want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
src/vulkan directory, we don't really want to put the shared parts into
src/vulkan/intel.  Where should we put the Intel-specific but API-agnostic
bits?  In particular, we need a place to put ISL and the back-end compiler.
We don't want to deal with the headaches of making a public API and keeping
it stable, so they need to live somewhere in the mesa tree.

In my personal opinion, the best thing to do is probably to add a src/intel
folder with subfolders for vulkan, isl, and the back-end compiler.  The
src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
of the driver.  It does seem a little odd to have "intel" as a top-level
source folder, but I can't come up with anything better.

Thoughts?  Opinions?  Favorite colors?


I don't think we'll get this right the first time, and when we
randomly decide to
change it we can just make poor Emil handle the fallout. :-P

Anyways,

src/intel works for me, also src/shed/intel, src/shared/intel, src/drivers/intel


I like src/shared/intel/ FWIW.

-Brian


Re: [Mesa-dev] [PATCH] i965/gen7: Use predicated rendering for indirect compute

2016-02-16 Thread Ben Widawsky

On Tue, Feb 16, 2016 at 12:21:02PM -0800, Jordan Justen wrote:
> On 2016-02-16 12:03:10, Ben Widawsky wrote:
> > On Tue, Feb 16, 2016 at 10:09:50AM -0800, Jordan Justen wrote:
> > > On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect
> > > dispatch is used, but one of the dimensions is 0.
> > > 
> > > Therefore we use predicated rendering on the GPGPU_WALKER command to
> > > handle this case.
> > > 
> > > Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
> > > Signed-off-by: Jordan Justen 
> > > Cc: Kenneth Graunke 
> > > Cc: Ben Widawsky 
> > > Cc: Ilia Mirkin 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_compute.c | 104 
> > > +++-
> > >  src/mesa/drivers/dri/i965/brw_defines.h |   1 +
> > >  2 files changed, 91 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_compute.c b/src/mesa/drivers/dri/i965/brw_compute.c
> > > index d9f181a..bbb8ce3 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_compute.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> > > @@ -35,6 +35,92 @@
> > >  
> > >  
> > >  static void
> > > +brw_prepare_indirect_gpgpu_walker(struct brw_context *brw)
> > > +{
> > 
> > Just FYI:
> > There is a blurb in the predicate text:
> > To ensure the memory sources of the MI_LOAD_REGISTER_MEM commands are coherent
> > with previous 3D_PIPECONTROL store-DWord operations, software can use the new
> > Pipe Control Flush Enable bit in the PIPE_CONTROL command.
> > 
> > I suppose it's never the case that we'll be writing these with PIPE_CONTROL, so
> > it's safe to ignore this.
> > 
> 
> On irc it sounded like you didn't think the flush was required. I'm
> going to stick with that unless you tell me otherwise.
> 
> The LRM is coming from a user BO, and they may have set the values by
> mapping the buffer and writing it from the CPU, or by writing it from
> a shader (for example SSBO).
> 

The note seems to imply that DW writes with a pipe control can be deferred,
so you need to flush any previous pipe controls. Again, yeah, I think we're
safe.
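
For anyone skimming the quoted patch, here is a CPU-side sketch of the predicate that the three MI_PREDICATE packets accumulate, following the comments in the diff (LOAD/SET for x, then OR for y and z). It is only a model: sketch_any_dim_zero() is a hypothetical name, and how the result gates GPGPU_WALKER is not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* dims holds the x, y and z work group counts read from the indirect
 * buffer at offsets +0, +4 and +8. */
static bool sketch_any_dim_zero(const uint32_t dims[3])
{
   bool predicate = (dims[0] == 0);   /* COMBINEOP_SET */
   predicate |= (dims[1] == 0);       /* COMBINEOP_OR  */
   predicate |= (dims[2] == 0);       /* COMBINEOP_OR  */
   return predicate;
}
```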

> > > +   GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> > > +   drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> > > +
> > > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
> > > + I915_GEM_DOMAIN_VERTEX, 0,
> > > + indirect_offset + 0);
> > > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
> > > + I915_GEM_DOMAIN_VERTEX, 0,
> > > + indirect_offset + 4);
> > > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
> > > + I915_GEM_DOMAIN_VERTEX, 0,
> > > + indirect_offset + 8);
> > > +
> > > +   if (brw->gen > 7)
> > > +  return;
> > > +
> > > +   /* Clear upper 32-bits of SRC0 and all 64-bits of SRC1
> > > +*/
> > > +   BEGIN_BATCH(7);
> > > +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
> > > +   OUT_BATCH(MI_PREDICATE_SRC0 + 4);
> > > +   OUT_BATCH(0u);
> > > +   OUT_BATCH(MI_PREDICATE_SRC1 + 0);
> > > +   OUT_BATCH(0u);
> > > +   OUT_BATCH(MI_PREDICATE_SRC1 + 4);
> > > +   OUT_BATCH(0u);
> > > +   ADVANCE_BATCH();
> > > +
> > > +   /* Load compute_dispatch_indirect_x_size into SRC0
> > > +*/
> > > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > > + indirect_offset + 0);
> > > +
> > > +   /* predicate = (compute_dispatch_indirect_x_size == 0);
> > > +*/
> > > +   BEGIN_BATCH(1);
> > > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > > + MI_PREDICATE_LOADOP_LOAD |
> > > + MI_PREDICATE_COMBINEOP_SET |
> > > + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> > > +   ADVANCE_BATCH();
> > > +
> > > +   /* Load compute_dispatch_indirect_y_size into SRC0
> > > +*/
> > > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > > + indirect_offset + 4);
> > > +
> > > +   /* predicate |= (compute_dispatch_indirect_y_size == 0);
> > > +*/
> > > +   BEGIN_BATCH(1);
> > > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > > + MI_PREDICATE_LOADOP_LOAD |
> > > + MI_PREDICATE_COMBINEOP_OR |
> > > + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> > > +   ADVANCE_BATCH();
> > > +
> > > +   /* Load compute_dispatch_indirect_z_size into SRC0
> > > +*/
> > > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > > + indirect_offset + 8);
> > > +
> > > +   /* predicate |= (compute_dispatch_indirect_z_size == 0);
> > > +*/
> > > +   BEGIN_BATCH(1);
> > > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > > + MI_PREDICATE_LOADOP_LOAD |
> > > + MI_P

[Mesa-dev] [PATCH] anv: pCreateInfo->pApplicationInfo parameter to vkCreateInstance may be NULL

2016-02-16 Thread Philipp Zabel

Fix a NULL pointer dereference in anv_CreateInstance in case
the pApplicationInfo field of the supplied VkInstanceCreateInfo
structure is NULL [1].

[1] 
https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#VkInstanceCreateInfo

Signed-off-by: Philipp Zabel 
---
 src/vulkan/anv_device.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/vulkan/anv_device.c b/src/vulkan/anv_device.c
index a6ce176..6863906 100644
--- a/src/vulkan/anv_device.c
+++ b/src/vulkan/anv_device.c
@@ -214,7 +214,9 @@ VkResult anv_CreateInstance(
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO);
 
-   uint32_t client_version = pCreateInfo->pApplicationInfo->apiVersion;
+   uint32_t client_version = pCreateInfo->pApplicationInfo ?
+ pCreateInfo->pApplicationInfo->apiVersion :
+ VK_MAKE_VERSION(1, 0, 0);
if (VK_MAKE_VERSION(1, 0, 0) > client_version ||
client_version > VK_MAKE_VERSION(1, 0, 3)) {
   return vk_errorf(VK_ERROR_INCOMPATIBLE_DRIVER,
@@ -249,7 +251,7 @@ VkResult anv_CreateInstance(
else
   instance->alloc = default_alloc;
 
-   instance->apiVersion = pCreateInfo->pApplicationInfo->apiVersion;
+   instance->apiVersion = client_version;
instance->physicalDeviceCount = -1;
 
_mesa_locale_init();
-- 
2.7.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
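
The NULL handling in the anv_CreateInstance patch above can be sketched in
isolation. This is only an illustration: the struct and function names below
are hypothetical stand-ins for the Vulkan types, and SKETCH_MAKE_VERSION is a
local macro that assumes the same bit layout as VK_MAKE_VERSION.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_MAKE_VERSION (assumed layout: major<<22 | minor<<12 | patch). */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
   (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

/* Hypothetical, trimmed-down versions of the Vulkan structures. */
struct sketch_app_info {
   uint32_t apiVersion;
};

struct sketch_instance_create_info {
   const struct sketch_app_info *pApplicationInfo; /* may be NULL */
};

/* Return the API version requested by the client, defaulting to 1.0.0
 * when no application info was supplied. */
static uint32_t
sketch_client_version(const struct sketch_instance_create_info *info)
{
   return info->pApplicationInfo ? info->pApplicationInfo->apiVersion
                                 : SKETCH_MAKE_VERSION(1, 0, 0);
}

/* Mirror of the range check in the patch: accept 1.0.0 through 1.0.3. */
static int
sketch_version_supported(uint32_t v)
{
   return SKETCH_MAKE_VERSION(1, 0, 0) <= v &&
          v <= SKETCH_MAKE_VERSION(1, 0, 3);
}
```

Computing the version once and reusing it, as the patch does for
instance->apiVersion, keeps the NULL check in a single place.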

Re: [Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Dave Airlie

On 17 February 2016 at 04:39, Jason Ekstrand  wrote:
> So, we just pushed a branch containing a Vulkan driver.  Naturally, we
> would like to incorporate that driver into the upstream mesa tree.  While
> we work on upstreaming the prerequisites in NIR and the i965 back-end
> compiler, there is a question that needs answering:  Where do we put it?
>
> The Vulkan driver challenges the tree-like nature of the way mesa is
> currently organized.  We now have two drivers that share a lot of the same
> underlying hardware-specific code (compiler and ISL) but target different
> APIs and no gallium-like middle layer to hide behind.  Obviously, we don't
> want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
> src/vulkan directory, we don't really want to put the shared parts into
> src/vulkan/intel.  Where should we put the Intel-specific but API-agnostic
> bits?  In particular, we need a place to put ISL and the back-end compiler.
> We don't want to deal with the headaches of making a public API and keeping
> it stable, so they need to live somewhere in the mesa tree.
>
> In my personal opinion, the best thing to do is probably to add a src/intel
> folder with subfolders for vulkan, isl, and the back-end compiler.  The
> src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
> of the driver.  It does seem a little odd to have "intel" as a top-level
> source folder, but I can't come up with anything better.
>
> Thoughts?  Opinions?  Favorite colors?

I don't think we'll get this right the first time, and when we
randomly decide to
change it we can just make poor Emil handle the fallout. :-P

Anyways,

src/intel works for me, also src/shed/intel, src/shared/intel, src/drivers/intel

Dave.

Re: [Mesa-dev] [PATCH] nvc0: add MP performance counters for SM35 (GK110:GM107)

2016-02-16 Thread Samuel Pitoiset




On 02/16/2016 10:04 PM, Ilia Mirkin wrote:

On Tue, Feb 16, 2016 at 3:59 PM, Samuel Pitoiset
 wrote:

  static inline const struct nvc0_hw_sm_query_cfg **
  nvc0_hw_sm_get_queries(struct nvc0_screen *screen)
  {
+   const struct nvc0_hw_sm_query_cfg **queries = NULL;
 struct nouveau_device *dev = screen->base.device;

-   if (dev->chipset == 0xc0 || dev->chipset == 0xc8)
-  return sm20_hw_sm_queries;
-   return sm21_hw_sm_queries;
+   switch (dev->chipset & ~0xf) {
+   case 0xc0:
+   case 0xd0:
+  if (dev->chipset == 0xc0 || dev->chipset == 0xc8)
+ queries = sm20_hw_sm_queries;
+  else
+ queries = sm21_hw_sm_queries;
+  break;
+   case 0xe0:
+  queries = sm30_hw_sm_queries;
+  break;
+   case 0xf0:
+   case 0x100:
+  queries = sm35_hw_sm_queries;
+  break;
+   default:
+  break;
+   }
+   return queries;
  }


This might be wiser to do based on 3d class. For example GK20A (aka
0xea chipset) uses SM35.


Yeah, maybe this could improve readability.
Anyway, once all the performance counters are upstream, I think it would
be good to refactor the code (or try to).




Re: [Mesa-dev] [PATCH] nvc0: add MP performance counters for SM35 (GK110:GM107)

2016-02-16 Thread Ilia Mirkin

On Tue, Feb 16, 2016 at 3:59 PM, Samuel Pitoiset
 wrote:
>  static inline const struct nvc0_hw_sm_query_cfg **
>  nvc0_hw_sm_get_queries(struct nvc0_screen *screen)
>  {
> +   const struct nvc0_hw_sm_query_cfg **queries = NULL;
> struct nouveau_device *dev = screen->base.device;
>
> -   if (dev->chipset == 0xc0 || dev->chipset == 0xc8)
> -  return sm20_hw_sm_queries;
> -   return sm21_hw_sm_queries;
> +   switch (dev->chipset & ~0xf) {
> +   case 0xc0:
> +   case 0xd0:
> +  if (dev->chipset == 0xc0 || dev->chipset == 0xc8)
> + queries = sm20_hw_sm_queries;
> +  else
> + queries = sm21_hw_sm_queries;
> +  break;
> +   case 0xe0:
> +  queries = sm30_hw_sm_queries;
> +  break;
> +   case 0xf0:
> +   case 0x100:
> +  queries = sm35_hw_sm_queries;
> +  break;
> +   default:
> +  break;
> +   }
> +   return queries;
>  }

This might be wiser to do based on 3d class. For example GK20A (aka
0xea chipset) uses SM35.
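
To illustrate the concern, here is a rough sketch of the chipset bucketing
from the patch. The enum and function names are hypothetical, and the GK20A
special case is just one possible stopgap shown for contrast; it is not the
3d-class-based selection suggested above.

```c
#include <stdint.h>

/* Hypothetical labels for the per-generation query tables. */
enum sketch_sm_family {
   SKETCH_SM_UNKNOWN,
   SKETCH_SM20,
   SKETCH_SM21,
   SKETCH_SM30,
   SKETCH_SM35,
};

/* Same bucketing as the patch: switch on the chipset with the low
 * nibble masked off. */
static enum sketch_sm_family
sketch_family_by_chipset(uint32_t chipset)
{
   switch (chipset & ~0xfu) {
   case 0xc0:
   case 0xd0:
      return (chipset == 0xc0 || chipset == 0xc8) ? SKETCH_SM20 : SKETCH_SM21;
   case 0xe0:
      return SKETCH_SM30;
   case 0xf0:
   case 0x100:
      return SKETCH_SM35;
   default:
      return SKETCH_SM_UNKNOWN;
   }
}

/* Assumption for illustration only: special-case GK20A (0xea) before
 * falling back to the masked switch. */
static enum sketch_sm_family
sketch_family_with_gk20a(uint32_t chipset)
{
   if (chipset == 0xea)
      return SKETCH_SM35;
   return sketch_family_by_chipset(chipset);
}
```

With the masked switch alone, 0xea lands in the 0xe0 bucket and would get the
SM30 table, which is the mismatch being pointed out.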

[Mesa-dev] [PATCH] nvc0: add MP performance counters for SM35 (GK110:GM107)

2016-02-16 Thread Samuel Pitoiset

Because compute support is not enabled by default for these chipsets,
NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable
performance counters.

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 755 ++---
 .../drivers/nouveau/nvc0/nvc0_query_hw_sm.h|   2 +
 .../drivers/nouveau/nvc0/nve4_compute.xml.h|   4 +
 3 files changed, 667 insertions(+), 94 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
index 68c8ff5..b584532 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c
@@ -65,6 +65,7 @@ static const char *nve4_hw_sm_query_names[] =
"local_load_transactions",
"local_store",
"local_store_transactions",
+   "not_predicated_off_thread_inst_executed",
"prof_trigger_00",
"prof_trigger_01",
"prof_trigger_02",
@@ -78,6 +79,7 @@ static const char *nve4_hw_sm_query_names[] =
"shared_store",
"shared_store_replay",
"sm_cta_launched",
+   "thread_inst_executed",
"threads_launched",
"uncached_global_load_transaction",
"warps_launched",
@@ -169,6 +171,49 @@ static const uint64_t nve4_read_hw_sm_counters_code[] =
0x80001de7ULL
 };
 
+static const uint64_t nvf0_read_hw_sm_counters_code[] =
+{
+   /* Same kernel as GK104:GK110 */
+   0x0880808080808080ULL,
+   0x8640109c0022ULL,
+   0x8640019c0032ULL,
+   0x8640021c0002ULL,
+   0x8640029c0006ULL,
+   0x8640031c000aULL,
+   0x8640039c000eULL,
+   0x8640041c0012ULL,
+   0x08ac1080108c8080ULL,
+   0x8640049c0016ULL,
+   0x8640051c001aULL,
+   0x8640059c001eULL,
+   0xdb201c007f9c201eULL,
+   0x64c03c1c002aULL,
+   0xc0020a1c3021ULL,
+   0x64c03c9c002eULL,
+   0x0810a0808010b810ULL,
+   0xc001041c3025ULL,
+   0x1820003cULL,
+   0xdb201c007f9c243eULL,
+   0xc1c0301c2021ULL,
+   0xc1c0081c2431ULL,
+   0xc1c0021c2435ULL,
+   0xe080069c2026ULL,
+   0x08b010b010b010a0ULL,
+   0xe080061c2022ULL,
+   0xe4c03c00051c0032ULL,
+   0xe084041c282aULL,
+   0xe4c03c00059c0036ULL,
+   0xe08040007f9c2c2eULL,
+   0xe084049c3032ULL,
+   0xfe80001c2800ULL,
+   0x08b81080b010ULL,
+   0x64c03c00011c0002ULL,
+   0xe08040007f9c3436ULL,
+   0xfe8020043010ULL,
+   0xfc80281c3000ULL,
+   0x181c003cULL,
+};
+
 /* For simplicity, we will allocate as many group slots as we allocate counter
  * slots. This means that a single counter which wants to source from 2 groups
  * will have to be declared as using 2 counter slots. This shouldn't really be
@@ -192,64 +237,539 @@ struct nvc0_hw_sm_query_cfg
uint8_t norm[2]; /* normalization num,denom */
 };
 
-#define _Q1A(n, f, m, g, s, nu, dn) [NVE4_HW_SM_QUERY_##n] = { { { f, 
NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, 0, s }, 
{}, {}, {} }, 1, { nu, dn } }
-#define _Q1B(n, f, m, g, s, nu, dn) [NVE4_HW_SM_QUERY_##n] = { { { f, 
NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 1, NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, 0, s }, 
{}, {}, {} }, 1, { nu, dn } }
+#define _CA(f, m, g, s) { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 0, 
NVE4_COMPUTE_MP_PM_A_SIGSEL_##g, 0, s }
+#define _CB(f, m, g, s) { f, NVE4_COMPUTE_MP_PM_FUNC_MODE_##m, 1, 
NVE4_COMPUTE_MP_PM_B_SIGSEL_##g, 0, s }
+#define _Q(n, c) [NVE4_HW_SM_QUERY_##n] = c
+
+/*  Compute capability 3.0 (GK104:GK110)  */
+static const struct nvc0_hw_sm_query_cfg
+sm30_active_cycles =
+{
+   .ctr[0]   = _CB(0x0001, B6, WARP, 0x),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_active_warps =
+{
+   .ctr[0]   = _CB(0x003f, B6, WARP, 0x31483104),
+   .num_counters = 1,
+   .norm = { 2, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_atom_cas_count =
+{
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x4),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_atom_count =
+{
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_branch =
+{
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x000c),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_divergent_branch =
+{
+   .ctr[0]   = _CA(0x0001, B6, BRANCH, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gld_request =
+{
+   .ctr[0]   = _CA(0x0001, B6, LDST, 0x0010),
+   .num_counters = 1,
+   .norm = { 1, 1 },
+};
+
+static const struct nvc0_hw_sm_query_cfg
+sm30_gld_mem_div_replay =
+{
+   .ctr[0]   = _CB(0x0001, B6, REPLAY, 0x0010),
+   .num_counters = 1

Re: [Mesa-dev] [PATCH] Revert "i965: Restore vbo after color resolve during brw_try_draw_prims()"

2016-02-16 Thread Mark Janes

Ben Widawsky  writes:

> On Mon, Feb 15, 2016 at 11:34:03AM +0200, Topi Pohjolainen wrote:
>> This got pushed accidentally in the first place but wasn't reverted
>> as it didn't regress piglit but instead fixed one newly introduced
>> test exercising a corner case in the i965 driver. However, saving and
>> restoring vertex buffer context is complicated and requires more
>> thought.
>
> So now a revert is going to cause a piglit regression? What a weird world we
> live in. Mark, can you make sure the test mentioned in the bug
> (getteximage-formats init-by-clear-and-render) gets skipped after the revert
> lands?

The revert landed, however the test is not skipped.  It fails.

I wrote up bug 94181 to track that regression before I associated it
with this thread.

>
> Reviewed-by: Ben Widawsky 
>
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94150
>> 
>> Signed-off-by: Topi Pohjolainen 
>> CC: Ben Widawsky 
>> CC: Ian Romanick 
>> Reviewed-by: Tapani Palli 
>> ---
>>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 9 -
>>  1 file changed, 9 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
>> b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
>> index 93f1a85..b2b07e7 100644
>> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
>> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
>> @@ -887,15 +887,6 @@ brw_meta_resolve_color(struct brw_context *brw,
>>  
>> _mesa_meta_end(ctx);
>>  
>> -   /* Restore in case we were called in the middle of brw_try_draw_prims().
>> -* But only in case the just restored context really uses vertex buffer
>> -* objects.
>> -*/
>> -   if (ctx->API != API_OPENGLES) {
>> -  ctx->vbo_context->exec.array.recalculate_inputs = true;
>> -  vbo_bind_arrays(ctx);
>> -   }
>> -
>> /* We're typically called from intel_update_state() and we're supposed to
>>  * return with the state all updated to what it was before
>>  * brw_meta_resolve_color() was called.  The meta rendering will have
>> -- 
>> 2.5.0
>> 
>
> -- 
> Ben Widawsky, Intel Open Source Technology Center

Re: [Mesa-dev] [PATCH] st/mesa: add missing ETC2 entries to format_map

2016-02-16 Thread Ilia Mirkin

Sounds reasonable. The original patch is

Reviewed-by: Ilia Mirkin 

On Tue, Feb 16, 2016 at 3:46 PM, Rob Clark  wrote:
> gave that a quick try.. and, well, I think that may just be the start
> of the rabbit-hole..
>
> 0xb69eb92c in __memcpy_neon () from /lib/libc.so.6
> (gdb) bt
> #0  0xb69eb92c in __memcpy_neon () from /lib/libc.so.6
> #1  0xb5ea63e0 in _mesa_store_compressed_texsubimage (ctx=0x2ff970,
> dims=2, texImage=0xb382ac50, xoffset=0, yoffset=0, zoffset=0,
> width=2048, height=2048, depth=1, format=37492,
> imageSize=2097152, data=0xb340) at
> ../../../src/mesa/main/texstore.c:1364
> #2  0xb5f60320 in st_CompressedTexSubImage (ctx=0x2ff970, dims=2,
> texImage=0xb382ac50, x=0, y=0, z=0, w=2048, h=2048, d=1, format=37492,
> imageSize=2097152, data=0x0)
> at ../../../src/mesa/state_tracker/st_cb_texture.c:2043
> #3  0xb5e8ece0 in _mesa_compressed_texture_sub_image (ctx=0x2ff970,
> dims=2, texObj=0xb382a9b0, texImage=0xb382ac50, target=3553, level=0,
> xoffset=0, yoffset=0, zoffset=0, width=2048,
> height=2048, depth=1, format=37492, imageSize=2097152, data=0x0)
> at ../../../src/mesa/main/teximage.c:4388
> #4  0xb5e8f350 in _mesa_CompressedTexSubImage2D (target=3553, level=0,
> xoffset=0, yoffset=0, width=2048, height=2048, format=37492,
> imageSize=2097152, data=0x0)
> at ../../../src/mesa/main/teximage.c:4509
> #5  0xb67cfadc in shared_dispatch_stub_412 (target=3553, level=0,
> xoffset=0, yoffset=0, width=2048, height=2048, format=37492,
> imageSize=2097152, data=0x0)
> at ./shared-glapi/glapi_mapi_tmp.h:19098
>
> I think I'll skip the fallback format_map entries for now, since a
> debug build assert is less obnoxious than a segfault..
>
> BR,
> -R
>
>
> On Tue, Feb 16, 2016 at 12:14 PM, Ilia Mirkin  wrote:
>> Should be noted that, not at all due to this patch,
>> glTexStorage(ETC1/ETC2) is broken on gallium drivers that don't
>> implement those formats in HW (i.e. use the sw fallback). This patch
>> makes it work for drivers that *do* support it in HW, but more work is
>> needed for the other drivers. Maybe we should just have the
>> PIPE_FORMAT_RGBA8 stuff right in there as fallback formats? [Would
>> need to do that for ETC1 as well.]
>>
>> On Tue, Feb 16, 2016 at 12:04 PM, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Noticed by Ilia when I was trying to figure out why some app was failing
>>> to use ETC2.
>>>
>>> Signed-off-by: Rob Clark 
>>> Reviewed-by: Ilia Mirkin 
>>> ---
>>>  src/mesa/state_tracker/st_format.c | 42 
>>> ++
>>>  1 file changed, 42 insertions(+)
>>>
>>> diff --git a/src/mesa/state_tracker/st_format.c 
>>> b/src/mesa/state_tracker/st_format.c
>>> index 2b92bad..82bf3a1 100644
>>> --- a/src/mesa/state_tracker/st_format.c
>>> +++ b/src/mesa/state_tracker/st_format.c
>>> @@ -1484,6 +1484,48 @@ static const struct format_mapping format_map[] = {
>>>{ PIPE_FORMAT_ETC1_RGB8, 0 }
>>> },
>>>
>>> +   /* ETC2 */
>>> +   {
>>> +  { GL_COMPRESSED_RGB8_ETC2, 0 },
>>> +  { PIPE_FORMAT_ETC2_RGB8, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_SRGB8_ETC2, 0 },
>>> +  { PIPE_FORMAT_ETC2_SRGB8, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
>>> +  { PIPE_FORMAT_ETC2_RGB8A1, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
>>> +  { PIPE_FORMAT_ETC2_SRGB8A1, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_RGBA8_ETC2_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_RGBA8, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_SRGBA8, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_R11_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_R11_UNORM, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_SIGNED_R11_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_R11_SNORM, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_RG11_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_RG11_UNORM, 0 }
>>> +   },
>>> +   {
>>> +  { GL_COMPRESSED_SIGNED_RG11_EAC, 0 },
>>> +  { PIPE_FORMAT_ETC2_RG11_SNORM, 0 }
>>> +   },
>>> +
>>> /* BPTC */
>>> {
>>>{ GL_COMPRESSED_RGBA_BPTC_UNORM, 0 },
>>> --
>>> 2.5.0
>>>

Re: [Mesa-dev] [PATCH] st/mesa: add missing ETC2 entries to format_map

2016-02-16 Thread Rob Clark

gave that a quick try.. and, well, I think that may just be the start
of the rabbit-hole..

0xb69eb92c in __memcpy_neon () from /lib/libc.so.6
(gdb) bt
#0  0xb69eb92c in __memcpy_neon () from /lib/libc.so.6
#1  0xb5ea63e0 in _mesa_store_compressed_texsubimage (ctx=0x2ff970,
dims=2, texImage=0xb382ac50, xoffset=0, yoffset=0, zoffset=0,
width=2048, height=2048, depth=1, format=37492,
imageSize=2097152, data=0xb340) at
../../../src/mesa/main/texstore.c:1364
#2  0xb5f60320 in st_CompressedTexSubImage (ctx=0x2ff970, dims=2,
texImage=0xb382ac50, x=0, y=0, z=0, w=2048, h=2048, d=1, format=37492,
imageSize=2097152, data=0x0)
at ../../../src/mesa/state_tracker/st_cb_texture.c:2043
#3  0xb5e8ece0 in _mesa_compressed_texture_sub_image (ctx=0x2ff970,
dims=2, texObj=0xb382a9b0, texImage=0xb382ac50, target=3553, level=0,
xoffset=0, yoffset=0, zoffset=0, width=2048,
height=2048, depth=1, format=37492, imageSize=2097152, data=0x0)
at ../../../src/mesa/main/teximage.c:4388
#4  0xb5e8f350 in _mesa_CompressedTexSubImage2D (target=3553, level=0,
xoffset=0, yoffset=0, width=2048, height=2048, format=37492,
imageSize=2097152, data=0x0)
at ../../../src/mesa/main/teximage.c:4509
#5  0xb67cfadc in shared_dispatch_stub_412 (target=3553, level=0,
xoffset=0, yoffset=0, width=2048, height=2048, format=37492,
imageSize=2097152, data=0x0)
at ./shared-glapi/glapi_mapi_tmp.h:19098

I think I'll skip the fallback format_map entries for now, since a
debug build assert is less obnoxious than a segfault..

BR,
-R


On Tue, Feb 16, 2016 at 12:14 PM, Ilia Mirkin  wrote:
> Should be noted that, not at all due to this patch,
> glTexStorage(ETC1/ETC2) is broken on gallium drivers that don't
> implement those formats in HW (i.e. use the sw fallback). This patch
> makes it work for drivers that *do* support it in HW, but more work is
> needed for the other drivers. Maybe we should just have the
> PIPE_FORMAT_RGBA8 stuff right in there as fallback formats? [Would
> need to do that for ETC1 as well.]
>
> On Tue, Feb 16, 2016 at 12:04 PM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Noticed by Ilia when I was trying to figure out why some app was failing
>> to use ETC2.
>>
>> Signed-off-by: Rob Clark 
>> Reviewed-by: Ilia Mirkin 
>> ---
>>  src/mesa/state_tracker/st_format.c | 42 
>> ++
>>  1 file changed, 42 insertions(+)
>>
>> diff --git a/src/mesa/state_tracker/st_format.c 
>> b/src/mesa/state_tracker/st_format.c
>> index 2b92bad..82bf3a1 100644
>> --- a/src/mesa/state_tracker/st_format.c
>> +++ b/src/mesa/state_tracker/st_format.c
>> @@ -1484,6 +1484,48 @@ static const struct format_mapping format_map[] = {
>>{ PIPE_FORMAT_ETC1_RGB8, 0 }
>> },
>>
>> +   /* ETC2 */
>> +   {
>> +  { GL_COMPRESSED_RGB8_ETC2, 0 },
>> +  { PIPE_FORMAT_ETC2_RGB8, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_SRGB8_ETC2, 0 },
>> +  { PIPE_FORMAT_ETC2_SRGB8, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
>> +  { PIPE_FORMAT_ETC2_RGB8A1, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
>> +  { PIPE_FORMAT_ETC2_SRGB8A1, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_RGBA8_ETC2_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_RGBA8, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_SRGBA8, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_R11_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_R11_UNORM, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_SIGNED_R11_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_R11_SNORM, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_RG11_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_RG11_UNORM, 0 }
>> +   },
>> +   {
>> +  { GL_COMPRESSED_SIGNED_RG11_EAC, 0 },
>> +  { PIPE_FORMAT_ETC2_RG11_SNORM, 0 }
>> +   },
>> +
>> /* BPTC */
>> {
>>{ GL_COMPRESSED_RGBA_BPTC_UNORM, 0 },
>> --
>> 2.5.0
>>
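
The fallback idea discussed above (listing an uncompressed format after the
ETC2 one) can be sketched roughly as follows. All identifiers are
hypothetical; the only thing taken from the patch is the shape of the table,
with zero-terminated lists of GL formats and preferred pipe formats. As the
backtrace shows, picking a fallback format is only part of the work, since the
upload path also has to cope with it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format identifiers; 0 terminates each list. */
enum { SKETCH_GL_ETC2_RGB8 = 1, SKETCH_GL_ETC2_RGBA8 };
enum { SKETCH_PIPE_ETC2_RGB8 = 1, SKETCH_PIPE_ETC2_RGBA8, SKETCH_PIPE_RGBA8 };

struct sketch_format_mapping {
   int gl_formats[4];   /* zero-terminated */
   int pipe_formats[4]; /* zero-terminated, in order of preference */
};

static const struct sketch_format_mapping sketch_format_map[] = {
   { { SKETCH_GL_ETC2_RGB8, 0 },
     { SKETCH_PIPE_ETC2_RGB8, SKETCH_PIPE_RGBA8, 0 } },
   { { SKETCH_GL_ETC2_RGBA8, 0 },
     { SKETCH_PIPE_ETC2_RGBA8, SKETCH_PIPE_RGBA8, 0 } },
};

typedef bool (*sketch_supported_fn)(int pipe_format);

/* Return the first pipe format the driver supports for gl_format,
 * or 0 if the GL format is unknown or nothing is supported. */
static int
sketch_choose_format(int gl_format, sketch_supported_fn supported)
{
   size_t n = sizeof(sketch_format_map) / sizeof(sketch_format_map[0]);

   for (size_t i = 0; i < n; i++) {
      const struct sketch_format_mapping *m = &sketch_format_map[i];

      for (int j = 0; m->gl_formats[j]; j++) {
         if (m->gl_formats[j] != gl_format)
            continue;

         for (int k = 0; m->pipe_formats[k]; k++) {
            if (supported(m->pipe_formats[k]))
               return m->pipe_formats[k];
         }
         return 0;
      }
   }
   return 0;
}

/* Two pretend drivers: one with ETC2 in hardware, one without. */
static bool sketch_hw_etc2(int f) { (void)f; return true; }
static bool sketch_no_etc2(int f) { return f == SKETCH_PIPE_RGBA8; }
```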

[Mesa-dev] [Bug 94175] ilo driver fail to load for intel mobile gm45 Express chipset

2016-02-16 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94175

Timothy Arceri  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #1 from Timothy Arceri  ---
No, the oldest it supports is Sandy Bridge.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 08/10] glsl: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

On Tue, Feb 16, 2016 at 2:32 PM, Ian Romanick  wrote:
> On 02/16/2016 10:58 AM, Rob Clark wrote:
>> src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status 
>> {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump*)’ defined 
>> but not used [-Wunused-function]
>>  lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
>>  ^~
>>
>> Note, not sure if this was a latent bug?  Could be that was intended to
>> override visit(ir_loop_jump *)?
>
> I'll wager that is correct, and the bug has existed since day 1. :(

I guess this is as good an excuse as any to try to write a piglit test
(shader_test)..

BR,
-R

> To hit the bug, you'd need a loop with both a discard and a continue.  I
> suspect that is a rare combination.  To observe that the bug had been
> hit, you'd have to use a derivative (either via dFdx() and friends or
> texture()) in a particular way.  I'll have to try to think of a test case.
>
> It might be easier to just add a unit test.  We know that
>
>for (int i = 0; i < x; i++) {
>   if (z)
>  continue;
>
>   if (y)
>  discard;
>}
>
> should get transformed to
>
>for (int i = 0; i < x; i++) {
>   if (z) {
>  if (discarded)
> break;
>
>  continue;
>   }
>
>   if (y) {
>  discarded = true;
>  discard;
>   }
>
>   if (discarded)
>  break;
>}
>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/compiler/glsl/lower_discard_flow.cpp | 12 
>>  1 file changed, 12 deletions(-)
>>
>> diff --git a/src/compiler/glsl/lower_discard_flow.cpp 
>> b/src/compiler/glsl/lower_discard_flow.cpp
>> index 9d0a56b..bdb96b4 100644
>> --- a/src/compiler/glsl/lower_discard_flow.cpp
>> +++ b/src/compiler/glsl/lower_discard_flow.cpp
>> @@ -63,7 +63,6 @@ public:
>> }
>>
>> ir_visitor_status visit_enter(ir_discard *ir);
>> -   ir_visitor_status visit_enter(ir_loop_jump *ir);
>> ir_visitor_status visit_enter(ir_loop *ir);
>> ir_visitor_status visit_enter(ir_function_signature *ir);
>>
>> @@ -76,17 +75,6 @@ public:
>>  } /* anonymous namespace */
>>
>>  ir_visitor_status
>> -lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
>> -{
>> -   if (ir->mode != ir_loop_jump::jump_continue)
>> -  return visit_continue;
>> -
>> -   ir->insert_before(generate_discard_break());
>> -
>> -   return visit_continue;
>> -}
>> -
>> -ir_visitor_status
>>  lower_discard_flow_visitor::visit_enter(ir_discard *ir)
>>  {
>> ir_dereference *lhs = new(mem_ctx) ir_dereference_variable(discarded);
>>
>
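
The lowered loop quoted above can be modelled in plain C to see what the
unused visitor was meant to contribute. This is a sketch under assumptions:
"discard" is modelled as merely setting the discarded flag, the flag may
already be set when the loop is entered, and the function names are made up
for illustration.

```c
#include <stdbool.h>

/* Model of the fully lowered loop: a break is taken both after a discard
 * and before a continue once the fragment has been discarded.
 * Returns the number of iterations entered. */
static int
sketch_lowered_loop(int x, const bool *z, const bool *y, bool discarded)
{
   int iterations = 0;

   for (int i = 0; i < x; i++) {
      iterations++;

      if (z[i]) {
         if (discarded)
            break;           /* check inserted before the continue */
         continue;
      }

      if (y[i])
         discarded = true;   /* stands in for "discarded = true; discard;" */

      if (discarded)
         break;
   }
   return iterations;
}

/* Same loop without the check before the continue, i.e. what results
 * when the ir_loop_jump visitor is never called. */
static int
sketch_without_continue_check(int x, const bool *z, const bool *y,
                              bool discarded)
{
   int iterations = 0;

   for (int i = 0; i < x; i++) {
      iterations++;

      if (z[i])
         continue;

      if (y[i])
         discarded = true;

      if (discarded)
         break;
   }
   return iterations;
}
```

In this model, a fragment that is already discarded and always takes the
continue path leaves the fully lowered loop on its first iteration, while the
variant without the check keeps iterating until the loop bound is reached.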

Re: [Mesa-dev] [PATCH] i965/gen7: Use predicated rendering for indirect compute

2016-02-16 Thread Jordan Justen

On 2016-02-16 12:03:10, Ben Widawsky wrote:
> On Tue, Feb 16, 2016 at 10:09:50AM -0800, Jordan Justen wrote:
> > On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect
> > dispatch is used, but one of the dimensions is 0.
> > 
> > Therefore we use predicated rendering on the GPGPU_WALKER command to
> > handle this case.
> > 
> > Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
> > Signed-off-by: Jordan Justen 
> > Cc: Kenneth Graunke 
> > Cc: Ben Widawsky 
> > Cc: Ilia Mirkin 
> > ---
> >  src/mesa/drivers/dri/i965/brw_compute.c | 104 
> > +++-
> >  src/mesa/drivers/dri/i965/brw_defines.h |   1 +
> >  2 files changed, 91 insertions(+), 14 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
> > b/src/mesa/drivers/dri/i965/brw_compute.c
> > index d9f181a..bbb8ce3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_compute.c
> > +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> > @@ -35,6 +35,92 @@
> >  
> >  
> >  static void
> > +brw_prepare_indirect_gpgpu_walker(struct brw_context *brw)
> > +{
> 
> Just FYI:
> There is a blurb in the predicate text:
> To ensure the memory sources of the MI_LOAD_REGISTER_MEM commands are coherent
> with previous 3D_PIPECONTROL store-DWord operations, software can use the new
> Pipe Control Flush Enable bit in the PIPE_CONTROL command.
> 
> I suppose it's never the case that we'll be writing these with PIPE_CONTROL, 
> so
> it's safe to ignore this.
> 

On irc it sounded like you didn't think the flush was required. I'm
going to stick with that unless you tell me otherwise.

The LRM is coming from a user BO, and they may have set the values by
mapping the buffer and writing it from the CPU, or by writing it from
a shader (for example SSBO).

> > +   GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> > +   drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> > +
> > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
> > + I915_GEM_DOMAIN_VERTEX, 0,
> > + indirect_offset + 0);
> > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
> > + I915_GEM_DOMAIN_VERTEX, 0,
> > + indirect_offset + 4);
> > +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
> > + I915_GEM_DOMAIN_VERTEX, 0,
> > + indirect_offset + 8);
> > +
> > +   if (brw->gen > 7)
> > +  return;
> > +
> > +   /* Clear upper 32-bits of SRC0 and all 64-bits of SRC1
> > +*/
> > +   BEGIN_BATCH(7);
> > +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
> > +   OUT_BATCH(MI_PREDICATE_SRC0 + 4);
> > +   OUT_BATCH(0u);
> > +   OUT_BATCH(MI_PREDICATE_SRC1 + 0);
> > +   OUT_BATCH(0u);
> > +   OUT_BATCH(MI_PREDICATE_SRC1 + 4);
> > +   OUT_BATCH(0u);
> > +   ADVANCE_BATCH();
> > +
> > +   /* Load compute_dispatch_indirect_x_size into SRC0
> > +*/
> > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > + indirect_offset + 0);
> > +
> > +   /* predicate = (compute_dispatch_indirect_x_size == 0);
> > +*/
> > +   BEGIN_BATCH(1);
> > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > + MI_PREDICATE_LOADOP_LOAD |
> > + MI_PREDICATE_COMBINEOP_SET |
> > + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> > +   ADVANCE_BATCH();
> > +
> > +   /* Load compute_dispatch_indirect_y_size into SRC0
> > +*/
> > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > + indirect_offset + 4);
> > +
> > +   /* predicate |= (compute_dispatch_indirect_y_size == 0);
> > +*/
> > +   BEGIN_BATCH(1);
> > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > + MI_PREDICATE_LOADOP_LOAD |
> > + MI_PREDICATE_COMBINEOP_OR |
> > + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> > +   ADVANCE_BATCH();
> > +
> > +   /* Load compute_dispatch_indirect_z_size into SRC0
> > +*/
> > +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> > + I915_GEM_DOMAIN_INSTRUCTION, 0,
> > + indirect_offset + 8);
> > +
> > +   /* predicate |= (compute_dispatch_indirect_z_size == 0);
> > +*/
> > +   BEGIN_BATCH(1);
> > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > + MI_PREDICATE_LOADOP_LOAD |
> > + MI_PREDICATE_COMBINEOP_OR |
> > + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> > +   ADVANCE_BATCH();
> > +
> > +   /* predicate = !predicate;
> > +*/
> > +   BEGIN_BATCH(1);
> > +   OUT_BATCH(GEN7_MI_PREDICATE |
> > + MI_PREDICATE_LOADOP_LOADINV |
> > + MI_PREDICATE_COMBINEOP_OR |
> > + MI_PREDICATE_COMPAREOP_FALSE);
> > +   ADVANCE_BATCH();
> > +}
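
The predicate sequence in the function above boils down to "run the walker
only if no dimension is zero". A rough CPU-side model follows, assuming (from
the patch comments) that SRC0 and SRC1 are 64-bit registers compared as a
whole and that each register load only writes the low dword of SRC0. The
struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two 64-bit predicate source registers. */
struct sketch_predicate_regs {
   uint64_t src0;
   uint64_t src1;
};

/* Model of a 32-bit load into the low dword of SRC0: the upper dword
 * keeps whatever it held before. */
static void
sketch_load_src0_low(struct sketch_predicate_regs *r, uint32_t value)
{
   r->src0 = (r->src0 & 0xffffffff00000000ull) | value;
}

/* Returns true when the walker should run, i.e. no dimension is zero.
 * clear_first models the MI_LOAD_REGISTER_IMM block in the patch. */
static bool
sketch_dispatch_enabled(struct sketch_predicate_regs *r,
                        const uint32_t dims[3], bool clear_first)
{
   if (clear_first) {
      r->src0 &= 0xffffffffull;   /* clear upper 32 bits of SRC0 */
      r->src1 = 0;                /* clear all 64 bits of SRC1 */
   }

   bool predicate = false;
   for (int i = 0; i < 3; i++) {
      sketch_load_src0_low(r, dims[i]);
      predicate |= (r->src0 == r->src1);   /* models SRCS_EQUAL */
   }

   return !predicate;   /* models the final "predicate = !predicate" */
}
```

In this model, skipping the clear lets stale upper bits in SRC0 hide a zero
dimension, which would be one reason for the initial register writes.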
> 
> I think all of your comments would fit on one line...
> 
> Just summing up our conve

Re: [Mesa-dev] [PATCH] Revert "i965: Restore vbo after color resolve during brw_try_draw_prims()"

2016-02-16 Thread Ben Widawsky

On Mon, Feb 15, 2016 at 11:34:03AM +0200, Topi Pohjolainen wrote:
> This got pushed accidentally in the first place but wasn't reverted
> as it didn't regress piglit but instead fixed one newly introduced
> test exercising a corner case in the i965 driver. However, saving and
> restoring vertex buffer context is complicated and requires more
> thought.

So now a revert is going to cause a piglit regression? What a weird world we
live in. Mark, can you make sure the test mentioned in the bug
(getteximage-formats init-by-clear-and-render) gets skipped after the revert
lands?

Reviewed-by: Ben Widawsky 

> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94150
> 
> Signed-off-by: Topi Pohjolainen 
> CC: Ben Widawsky 
> CC: Ian Romanick 
> Reviewed-by: Tapani Palli 
> ---
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
> b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> index 93f1a85..b2b07e7 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> @@ -887,15 +887,6 @@ brw_meta_resolve_color(struct brw_context *brw,
>  
> _mesa_meta_end(ctx);
>  
> -   /* Restore in case we were called in the middle of brw_try_draw_prims().
> -* But only in case the just restored context really uses vertex buffer
> -* objects.
> -*/
> -   if (ctx->API != API_OPENGLES) {
> -  ctx->vbo_context->exec.array.recalculate_inputs = true;
> -  vbo_bind_arrays(ctx);
> -   }
> -
> /* We're typically called from intel_update_state() and we're supposed to
>  * return with the state all updated to what it was before
>  * brw_meta_resolve_color() was called.  The meta rendering will have
> -- 
> 2.5.0
> 

-- 
Ben Widawsky, Intel Open Source Technology Center

Re: [Mesa-dev] [PATCH] i965/gen7: Use predicated rendering for indirect compute

2016-02-16 Thread Ben Widawsky

On Tue, Feb 16, 2016 at 10:09:50AM -0800, Jordan Justen wrote:
> On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect
> dispatch is used, but one of the dimensions is 0.
> 
> Therefore we use predicated rendering on the GPGPU_WALKER command to
> handle this case.
> 
> Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
> Signed-off-by: Jordan Justen 
> Cc: Kenneth Graunke 
> Cc: Ben Widawsky 
> Cc: Ilia Mirkin 
> ---
>  src/mesa/drivers/dri/i965/brw_compute.c | 104 
> +++-
>  src/mesa/drivers/dri/i965/brw_defines.h |   1 +
>  2 files changed, 91 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
> b/src/mesa/drivers/dri/i965/brw_compute.c
> index d9f181a..bbb8ce3 100644
> --- a/src/mesa/drivers/dri/i965/brw_compute.c
> +++ b/src/mesa/drivers/dri/i965/brw_compute.c
> @@ -35,6 +35,92 @@
>  
>  
>  static void
> +brw_prepare_indirect_gpgpu_walker(struct brw_context *brw)
> +{

Just FYI:
There is a blurb in the predicate text:
To ensure the memory sources of the MI_LOAD_REGISTER_MEM commands are coherent
with previous 3D_PIPECONTROL store-DWord operations, software can use the new
Pipe Control Flush Enable bit in the PIPE_CONTROL command.

I suppose it's never the case that we'll be writing these with PIPE_CONTROL, so
it's safe to ignore this.

> +   GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> +   drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> +
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 0);
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 4);
> +   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
> + I915_GEM_DOMAIN_VERTEX, 0,
> + indirect_offset + 8);
> +
> +   if (brw->gen > 7)
> +  return;
> +
> +   /* Clear upper 32-bits of SRC0 and all 64-bits of SRC1
> +*/
> +   BEGIN_BATCH(7);
> +   OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
> +   OUT_BATCH(MI_PREDICATE_SRC0 + 4);
> +   OUT_BATCH(0u);
> +   OUT_BATCH(MI_PREDICATE_SRC1 + 0);
> +   OUT_BATCH(0u);
> +   OUT_BATCH(MI_PREDICATE_SRC1 + 4);
> +   OUT_BATCH(0u);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_x_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 0);
> +
> +   /* predicate = (compute_dispatch_indirect_x_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_SET |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_y_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 4);
> +
> +   /* predicate |= (compute_dispatch_indirect_y_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* Load compute_dispatch_indirect_z_size into SRC0
> +*/
> +   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
> + I915_GEM_DOMAIN_INSTRUCTION, 0,
> + indirect_offset + 8);
> +
> +   /* predicate |= (compute_dispatch_indirect_z_size == 0);
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOAD |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
> +   ADVANCE_BATCH();
> +
> +   /* predicate = !predicate;
> +*/
> +   BEGIN_BATCH(1);
> +   OUT_BATCH(GEN7_MI_PREDICATE |
> + MI_PREDICATE_LOADOP_LOADINV |
> + MI_PREDICATE_COMBINEOP_OR |
> + MI_PREDICATE_COMPAREOP_FALSE);
> +   ADVANCE_BATCH();
> +}

I think all of your comments would fit on one line...

Just summing up our conversation, I believe you could have slightly simpler code
using DELTAS_EQUAL, but it's not entirely clear.
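
For readers following the predicate sequence in the patch above, here is a minimal software sketch of what it computes. It is not from the thread and not driver code: the enum and function names are hypothetical, and the combine-then-load ordering is only inferred from the comments in the patch (SET/OR combine the compare result with the current predicate, LOAD/LOADINV store the result or its inverse). Under those assumptions the walker is enabled only when all three indirect dimensions are non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: a simplified model of the predicate state,
 * inferred from the patch comments, not taken from hardware docs. */
enum load_op    { LOADOP_KEEP, LOADOP_LOAD, LOADOP_LOADINV };
enum combine_op { COMBINEOP_SET, COMBINEOP_OR };
enum compare_op { COMPAREOP_FALSE, COMPAREOP_SRCS_EQUAL };

struct predicate_state {
   uint64_t src0;
   uint64_t src1;
   bool predicate;
};

/* Model of one predicate command: compare, combine, then load. */
static void
mi_predicate(struct predicate_state *s, enum load_op load,
             enum combine_op combine, enum compare_op compare)
{
   bool cmp = (compare == COMPAREOP_SRCS_EQUAL) && (s->src0 == s->src1);
   bool combined = (combine == COMBINEOP_OR) ? (s->predicate || cmp) : cmp;

   if (load == LOADOP_LOAD)
      s->predicate = combined;
   else if (load == LOADOP_LOADINV)
      s->predicate = !combined;
}

/* Mirrors the order of operations in the patch for dimensions x, y, z. */
static bool
indirect_dispatch_enabled(uint32_t x, uint32_t y, uint32_t z)
{
   /* Upper half of SRC0 and all of SRC1 are cleared to zero. */
   struct predicate_state s = { .src0 = 0, .src1 = 0, .predicate = false };

   s.src0 = x;   /* predicate = (x == 0) */
   mi_predicate(&s, LOADOP_LOAD, COMBINEOP_SET, COMPAREOP_SRCS_EQUAL);
   s.src0 = y;   /* predicate |= (y == 0) */
   mi_predicate(&s, LOADOP_LOAD, COMBINEOP_OR, COMPAREOP_SRCS_EQUAL);
   s.src0 = z;   /* predicate |= (z == 0) */
   mi_predicate(&s, LOADOP_LOAD, COMBINEOP_OR, COMPAREOP_SRCS_EQUAL);

   /* predicate = !predicate */
   mi_predicate(&s, LOADOP_LOADINV, COMBINEOP_OR, COMPAREOP_FALSE);
   return s.predicate;
}
```

In this model a dispatch of (0, 4, 4) leaves the predicate false, so the predicated walker would be skipped, which is the behaviour the commit message describes.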

> +
> +static void
>  brw_emit_gpgpu_walker(struct brw_context *brw)
>  {
> const struct brw_cs_prog_data *prog_data = brw->cs.prog_data;
> @@ -45,20 +131,10 @@ brw_emit_gpgpu_walker(struct brw_context *brw)
> if (brw->compute.num_work_groups_bo == NULL) {
>indirect_flag = 0;
> } else {
> -  GLintptr indirect_offset = brw->compute.num_work_groups_offset;
> -  drm_intel_bo *bo = brw->compute.num_work_groups_bo;
> -
> -  indirect_flag = GEN7_GPGPU_INDIRECT

Re: [Mesa-dev] [PATCH] glsl: set user defined varyings to smooth by default in ES

2016-02-16 Thread Ian Romanick

On 02/15/2016 11:26 PM, Iago Toral wrote:
> On Tue, 2016-02-16 at 11:03 +1100, Timothy Arceri wrote:
>> This is usually handled by the backends in order to handle the
>> various interactions with the gl_*Color built-ins.
>>
>> The problem is this means linking will fail if one side on the
>> interface adds the smooth qualifier to the varying and the other
>> side just uses the default even though they match.
>>
>> This fixes various deqp tests. The spec is not clear what to do for
>> deskto GL so leave it as is for now.
   desktop

>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
>> ---
>>  src/compiler/glsl/ast_to_hir.cpp | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
>> b/src/compiler/glsl/ast_to_hir.cpp
>> index b639378..4203cd5 100644
>> --- a/src/compiler/glsl/ast_to_hir.cpp
>> +++ b/src/compiler/glsl/ast_to_hir.cpp
>> @@ -2750,6 +2750,17 @@ interpret_interpolation_qualifier(const struct 
>> ast_type_qualifier *qual,
>>"vertex shader inputs or fragment shader outputs",
>>interpolation_string(interpolation));
>>}
>> +   } else if (state->es_shader &&
>> +  ((mode == ir_var_shader_in &&
>> +state->stage != MESA_SHADER_VERTEX) ||
>> +   (mode == ir_var_shader_out &&
>> +state->stage != MESA_SHADER_FRAGMENT))) {
>> +  /* From Section 4.3.9 (Interpolation) of the GLSL ES spec:

 Section 4.3.8 (Interpolation) of the GLSL ES X.Y spec says:

>> +   *
>> +   *" When no interpolation qualifier is present, smooth 
>> interpolation
 ^ Extra space

With those fixed, this patch is

Reviewed-by: Ian Romanick 

>> +   *is used."
>> +   */
>> +  interpolation = INTERP_QUALIFIER_SMOOTH;
>> }
>>  
>> return interpolation;
> 
> Reviewed-by: Iago Toral Quiroga 

Re: [Mesa-dev] [PATCH 2/2] glsl: set user defined varyings to smooth by default

2016-02-16 Thread Ian Romanick

On 02/15/2016 07:12 AM, Iago Toral wrote:
> On Mon, 2016-02-15 at 18:38 +1100, Timothy Arceri wrote:
>> This is usually handled by the backends in order to handle the
>> various interactions with the gl_*Color built-ins.
>>
>> The problem is this means linking will fail if one side on the
>> interface adds the smooth qualifier to the varying and the other
>> side just uses the default even though they match.
>>
>> This fixes various deqp tests and should have no impact on
>> built-ins as they generate GLSL IR directly.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
>> ---
>>  src/compiler/glsl/ast_to_hir.cpp | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
>> b/src/compiler/glsl/ast_to_hir.cpp
>> index b639378..47d52ee 100644
>> --- a/src/compiler/glsl/ast_to_hir.cpp
>> +++ b/src/compiler/glsl/ast_to_hir.cpp
>> @@ -2750,6 +2750,11 @@ interpret_interpolation_qualifier(const struct 
>> ast_type_qualifier *qual,
>>"vertex shader inputs or fragment shader outputs",
>>interpolation_string(interpolation));
>>}
>> +   } else if ((mode == ir_var_shader_in &&
>> +   state->stage != MESA_SHADER_VERTEX) ||
>> +  (mode == ir_var_shader_out &&
>> +   state->stage != MESA_SHADER_FRAGMENT)) {
>> +  interpolation = INTERP_QUALIFIER_SMOOTH;
>> }
> 
> The GLES spec explicitly says that in the absence of an interp qualifier
> smooth is used, but I can't find the same statement in the desktop GLSL
> spec. Should we make this ES specific?

Desktop OpenGL has an API control (via glShadeModel) that OpenGL ES 2.0+
does not have.  If a compatibility profile vertex shader outputs
built-in varyings (or maybe just color... I'd have to look) to
fixed-function, the interpolation mode is determined by the shading
model (GL_FLAT or GL_SMOOTH).

> Iago

Re: [Mesa-dev] [PATCH 01/10] util: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

On Tue, Feb 16, 2016 at 2:37 PM, Ian Romanick  wrote:
> On 02/16/2016 10:57 AM, Rob Clark wrote:
>> src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined 
>> but not used [-Wunused-const-variable]
>>  static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
>>^~
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/util/hash_table.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/util/hash_table.h b/src/util/hash_table.h
>> index 85b013c..a0244d7 100644
>> --- a/src/util/hash_table.h
>> +++ b/src/util/hash_table.h
>> @@ -108,7 +108,7 @@ static inline uint32_t _mesa_hash_pointer(const void 
>> *pointer)
>> return _mesa_hash_data(&pointer, sizeof(pointer));
>>  }
>>
>> -static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
>> +static const uint32_t _mesa_fnv32_1a_offset_bias UNUSED = 2166136261u;
>
> Looking at how it's used in the code, this seems like it should either
> be a #define or an anonymous enum.  I mean, I had to go look at the
> code to figure out why it should be UNUSED instead of just removed. :)
>
> enum { _mesa_fnv32_1a_offset_bias = 2166136261u };

I'm *assuming* the purpose of the current approach is to have type
safety.. although I guess we could do #define FOO ((uint32_t)...)..

BR,
-R

>>
>>  static inline uint32_t
>>  _mesa_fnv32_1a_accumulate_block(uint32_t hash, const void *data, size_t 
>> size)
>>
>

[Mesa-dev] [PATCH] mesa: add GL_OES_sample_shading support

2016-02-16 Thread Ilia Mirkin

Signed-off-by: Ilia Mirkin 
---

The dEQP tests expect this to enable per-sample interpolation, so I had to undo
some of the changes in st/mesa from earlier to get it to pass.

 docs/GL3.txt| 2 +-
 src/mapi/glapi/gen/es_EXT.xml   | 6 ++
 src/mesa/main/enable.c  | 4 ++--
 src/mesa/main/extensions_table.h| 1 +
 src/mesa/main/multisample.c | 3 ++-
 src/mesa/main/tests/dispatch_sanity.cpp | 3 +++
 6 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index ae439f6..2e528d4 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -247,7 +247,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_geometry_shader   started (Marta)
   GL_OES_gpu_shader5   not started (based on 
parts of GL_ARB_gpu_shader5, which is done for some drivers)
   GL_OES_primitive_bounding boxnot started
-  GL_OES_sample_shadingnot started (based on 
parts of GL_ARB_sample_shading, which is done for some drivers)
+  GL_OES_sample_shadingDONE (nvc0, r600, 
radeonsi)
   GL_OES_sample_variables  DONE (nvc0, r600, 
radeonsi)
   GL_OES_shader_image_atomic   not started (based on 
parts of GL_ARB_shader_image_load_store, which is done for some drivers)
   GL_OES_shader_io_blocks  not started (based on 
parts of GLSL 1.50, which is done)
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index 93284be..35c286d 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -798,6 +798,12 @@
 
 
 
+
+
+
+
+
+
 
 
 
diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
index 3985457..566b29b 100644
--- a/src/mesa/main/enable.c
+++ b/src/mesa/main/enable.c
@@ -805,7 +805,7 @@ _mesa_set_enable(struct gl_context *ctx, GLenum cap, 
GLboolean state)
 
   /* GL_ARB_sample_shading */
   case GL_SAMPLE_SHADING:
- if (!_mesa_is_desktop_gl(ctx))
+ if (!_mesa_is_desktop_gl(ctx) && !_mesa_is_gles3(ctx))
 goto invalid_enum_error;
  CHECK_EXTENSION(ARB_sample_shading, cap);
  if (ctx->Multisample.SampleShading == state)
@@ -1604,7 +1604,7 @@ _mesa_IsEnabled( GLenum cap )
 
   /* ARB_sample_shading */
   case GL_SAMPLE_SHADING:
- if (!_mesa_is_desktop_gl(ctx))
+ if (!_mesa_is_desktop_gl(ctx) && !_mesa_is_gles3(ctx))
 goto invalid_enum_error;
  CHECK_EXTENSION(ARB_sample_shading);
  return ctx->Multisample.SampleShading;
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 196a0c6..5cf4ba5 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -326,6 +326,7 @@ EXT(OES_point_sprite, 
ARB_point_sprite
 EXT(OES_query_matrix, dummy_true   
  ,  x ,  x , ES1,  x , 2003)
 EXT(OES_read_format , dummy_true   
  , GLL, GLC, ES1,  x , 2003)
 EXT(OES_rgb8_rgba8  , dummy_true   
  ,  x ,  x , ES1, ES2, 2005)
+EXT(OES_sample_shading  , OES_sample_variables 
  ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_sample_variables, OES_sample_variables 
  ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_single_precision, dummy_true   
  ,  x ,  x , ES1,  x , 2003)
 EXT(OES_standard_derivatives, OES_standard_derivatives 
  ,  x ,  x ,  x , ES2, 2005)
diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
index e7783ea..6b644461 100644
--- a/src/mesa/main/multisample.c
+++ b/src/mesa/main/multisample.c
@@ -127,7 +127,8 @@ _mesa_MinSampleShading(GLclampf value)
 {
GET_CURRENT_CONTEXT(ctx);
 
-   if (!ctx->Extensions.ARB_sample_shading || !_mesa_is_desktop_gl(ctx)) {
+   if (!_mesa_has_ARB_sample_shading(ctx) &&
+   !_mesa_has_OES_sample_shading(ctx)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "glMinSampleShading");
   return;
}
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 3d0fce6..c323766 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -2449,6 +2449,9 @@ const struct function gles3_functions_possible[] = {
/* GL_OES_copy_image */
{ "glCopyImageSubDataOES", 30, -1 },
 
+   /* GL_OES_sample_shading */
+   { "glMinSampleShadingOES", 30, -1 },
+
{ NULL, 0, -1 }
 };
 
-- 
2.4.10


Re: [Mesa-dev] [PATCH 01/10] util: fix new gcc6 warnings

2016-02-16 Thread Ian Romanick

On 02/16/2016 10:57 AM, Rob Clark wrote:
> src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined 
> but not used [-Wunused-const-variable]
>  static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
>^~
> 
> Signed-off-by: Rob Clark 
> ---
>  src/util/hash_table.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/util/hash_table.h b/src/util/hash_table.h
> index 85b013c..a0244d7 100644
> --- a/src/util/hash_table.h
> +++ b/src/util/hash_table.h
> @@ -108,7 +108,7 @@ static inline uint32_t _mesa_hash_pointer(const void 
> *pointer)
> return _mesa_hash_data(&pointer, sizeof(pointer));
>  }
>  
> -static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
> +static const uint32_t _mesa_fnv32_1a_offset_bias UNUSED = 2166136261u;

Looking at how it's used in the code, this seems like it should either
be a #define or an anonymous enum.  I mean, I had to go look at the
code to figure out why it should be UNUSED instead of just removed. :)

enum { _mesa_fnv32_1a_offset_bias = 2166136261u };
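
As an illustration of the alternatives being weighed here (not code from the thread), the sketch below uses a cast #define for the FNV-1a constants, which keeps the uint32_t type and cannot trigger -Wunused-const-variable. The names fnv32_1a, FNV32_1A_OFFSET_BIAS and FNV32_1A_PRIME are hypothetical. One caveat on the enum form, as far as I know: ISO C before C23 requires enumerator values to be representable as int, so a value like 2166136261u relies on a compiler extension there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; a typed macro instead of a static const object. */
#define FNV32_1A_OFFSET_BIAS ((uint32_t)2166136261u)
#define FNV32_1A_PRIME       ((uint32_t)0x01000193u)

/* 32-bit FNV-1a over a byte buffer: XOR each byte in, then multiply. */
static inline uint32_t
fnv32_1a(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = FNV32_1A_OFFSET_BIAS;

   while (size-- > 0) {
      hash ^= *bytes++;
      hash *= FNV32_1A_PRIME;
   }

   return hash;
}
```

Hashing an empty buffer returns the offset bias itself, which is a quick sanity check that the constant kept its unsigned 32-bit value.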

>  
>  static inline uint32_t
>  _mesa_fnv32_1a_accumulate_block(uint32_t hash, const void *data, size_t size)
> 


Re: [Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Jason Ekstrand

On Tue, Feb 16, 2016 at 11:25 AM, Rob Clark  wrote:

> On Tue, Feb 16, 2016 at 1:39 PM, Jason Ekstrand 
> wrote:
> > So, we just pushed a branch containing a Vulkan driver.  Naturally, we
> > would like to incorporate that driver into the upstream mesa tree.  While
> > we work on upstreaming the prerequisites in NIR and the i965 back-end
> > compiler, there is a question that needs answering:  Where do we put it?
> >
> > The Vulkan driver challenges the tree-like nature of the way mesa is
> > currently organized.  We now have two drivers that share a lot of the
> same
> > underlying hardware-specific code (compiler and ISL) but target different
> > APIs and no gallium-like middle layer to hide behind.  Obviously, we
> don't
> > want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
> > src/vulkan directory, we don't really want to put the shared parts into
> > src/vulkan/intel.  Where should we put the Intel-specific but
> API-agnostic
> > bits?  In particular, we need a place to put ISL and the back-end
> compiler.
> > We don't want to deal with the headaches of making a public API and
> keeping
> > it stable, so they need to live somewhere in the mesa tree.
> >
> > In my personal opinion, the best thing to do is probably to add a
> src/intel
> > folder with subfolders for vulkan, isl, and the back-end compiler.  The
> > src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
> > of the driver.  It does seem a little odd to have "intel" as a top-level
> > source folder, but I can't come up with anything better.
>
> Some day (ie. not near-term future) I'd like to share ir3 compiler
> backend between vulkan and gallium.. and I think there it is even less
> clear how to arrange things.  Perhaps in an ideal world, gl-on-vulkan
> could replace gallium / mesa-st.. although I'm not even sure how that
> would work.. even if it didn't give up some features/performance, I
> have a lot of shared code between adreno generations that can support
> vulkan (a4xx, and possibly a5xx) and which can not (a3xx)
>

I think it's best to assume, for the sake of these discussions, that
GL-on-vulkan isn't going to happen, at least not for quite some time.  It's
theoretically possible but a lot of work and it's not clear what the
performance will be.  For now, GL and Vulkan drivers will likely be
separate things (with shared code).

> So not really an answer, more an observation that I'm not really sure
> that there is a right answer..
>

Neither do I/we.  That's why I sent the e-mail. :-)


> > Thoughts?  Opinions?  Favorite colors?
>
> blue, no red... arrrgghhh..
>
> BR,
> -R
>
> > --Jason

Re: [Mesa-dev] [PATCH 08/10] glsl: fix new gcc6 warnings

2016-02-16 Thread Ian Romanick

On 02/16/2016 10:58 AM, Rob Clark wrote:
> src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status 
> {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump*)’ defined 
> but not used [-Wunused-function]
>  lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
>  ^~
> 
> Note, not sure if this was a latent bug?  Could be that was intended to
> override visit(ir_loop_jump *)?

I'll wager that is correct, and the bug has existed since day 1. :(

To hit the bug, you'd need a loop with both a discard and a continue.  I
suspect that is a rare combination.  To observe that the bug had been
hit, you'd have to use a derivative (either via dFdx() and friends or
texture()) in a particular way.  I'll have to try to think of a test case.

It might be easier to just add a unit test.  We know that

   for (int i = 0; i < x; i++) {
  if (z)
 continue;

  if (y)
 discard;
   }

should get transformed to

   for (int i = 0; i < x; i++) {
  if (z) {
 if (discarded)
break;

 continue;
  }

  if (y) {
 discarded = true;
 discard;
  }

  if (discarded)
 break;
   }
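
A possible starting point for such a unit test, sketched in plain C rather than GLSL IR. This is not from the thread. It assumes "discard" is modelled as setting a flag while the invocation keeps running, and all names are hypothetical. The already_discarded parameter exists only so the check in front of "continue" can be exercised.

```c
#include <stdbool.h>

/* Hypothetical model: outcome of running one loop for one invocation. */
struct loop_result {
   bool discarded;
   int iterations;
};

/* Original loop: discard only sets the flag, the loop keeps going. */
static struct loop_result
run_original(int x, const bool *z, const bool *y, bool already_discarded)
{
   struct loop_result r = { already_discarded, 0 };

   for (int i = 0; i < x; i++) {
      r.iterations++;
      if (z[i])
         continue;
      if (y[i])
         r.discarded = true;   /* stands in for "discard;" */
   }
   return r;
}

/* Lowered loop: break before every continue and at the end of the body
 * once the invocation has been discarded. */
static struct loop_result
run_lowered(int x, const bool *z, const bool *y, bool already_discarded)
{
   struct loop_result r = { already_discarded, 0 };

   for (int i = 0; i < x; i++) {
      r.iterations++;
      if (z[i]) {
         if (r.discarded)
            break;
         continue;
      }
      if (y[i])
         r.discarded = true;   /* stands in for "discard;" */
      if (r.discarded)
         break;
   }
   return r;
}
```

Both versions agree on the final discarded flag. The lowered one stops iterating as soon as the flag is set, including on the continue path.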

> Signed-off-by: Rob Clark 
> ---
>  src/compiler/glsl/lower_discard_flow.cpp | 12 
>  1 file changed, 12 deletions(-)
> 
> diff --git a/src/compiler/glsl/lower_discard_flow.cpp 
> b/src/compiler/glsl/lower_discard_flow.cpp
> index 9d0a56b..bdb96b4 100644
> --- a/src/compiler/glsl/lower_discard_flow.cpp
> +++ b/src/compiler/glsl/lower_discard_flow.cpp
> @@ -63,7 +63,6 @@ public:
> }
>  
> ir_visitor_status visit_enter(ir_discard *ir);
> -   ir_visitor_status visit_enter(ir_loop_jump *ir);
> ir_visitor_status visit_enter(ir_loop *ir);
> ir_visitor_status visit_enter(ir_function_signature *ir);
>  
> @@ -76,17 +75,6 @@ public:
>  } /* anonymous namespace */
>  
>  ir_visitor_status
> -lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
> -{
> -   if (ir->mode != ir_loop_jump::jump_continue)
> -  return visit_continue;
> -
> -   ir->insert_before(generate_discard_break());
> -
> -   return visit_continue;
> -}
> -
> -ir_visitor_status
>  lower_discard_flow_visitor::visit_enter(ir_discard *ir)
>  {
> ir_dereference *lhs = new(mem_ctx) ir_dereference_variable(discarded);
> 


Re: [Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Rob Clark

On Tue, Feb 16, 2016 at 1:39 PM, Jason Ekstrand  wrote:
> So, we just pushed a branch containing a Vulkan driver.  Naturally, we
> would like to incorporate that driver into the upstream mesa tree.  While
> we work on upstreaming the prerequisites in NIR and the i965 back-end
> compiler, there is a question that needs answering:  Where do we put it?
>
> The Vulkan driver challenges the tree-like nature of the way mesa is
> currently organized.  We now have two drivers that share a lot of the same
> underlying hardware-specific code (compiler and ISL) but target different
> APIs and no gallium-like middle layer to hide behind.  Obviously, we don't
> want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
> src/vulkan directory, we don't really want to put the shared parts into
> src/vulkan/intel.  Where should we put the Intel-specific but API-agnostic
> bits?  In particular, we need a place to put ISL and the back-end compiler.
> We don't want to deal with the headaches of making a public API and keeping
> it stable, so they need to live somewhere in the mesa tree.
>
> In my personal opinion, the best thing to do is probably to add a src/intel
> folder with subfolders for vulkan, isl, and the back-end compiler.  The
> src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
> of the driver.  It does seem a little odd to have "intel" as a top-level
> source folder, but I can't come up with anything better.

Some day (ie. not near-term future) I'd like to share ir3 compiler
backend between vulkan and gallium.. and I think there it is even less
clear how to arrange things.  Perhaps in an ideal world, gl-on-vulkan
could replace gallium / mesa-st.. although I'm not even sure how that
would work.. even if it didn't give up some features/performance, I
have a lot of shared code between adreno generations that can support
vulkan (a4xx, and possibly a5xx) and which can not (a3xx)

So not really an answer, more an observation that I'm not really sure
that there is a right answer..

> Thoughts?  Opinions?  Favorite colors?

blue, no red... arrrgghhh..

BR,
-R

> --Jason

Re: [Mesa-dev] [PATCH 4/4] st/mesa: add OES_sample_variables support

2016-02-16 Thread Ilia Mirkin

On Tue, Feb 16, 2016 at 1:42 PM, Ian Romanick  wrote:
> On 02/16/2016 09:12 AM, Ilia Mirkin wrote:
>> On Tue, Feb 16, 2016 at 12:02 PM, Ian Romanick  wrote:
>>> On 02/15/2016 10:31 PM, Ilia Mirkin wrote:
>>>> Basically the same thing as ARB_sample_shading except that it also needs
>>>> gl_SampleMaskIn support as well as not enable per-sample interpolation
>>>> whenever doing per-sample shading. This is done explicitly in another
>>>> extension.
>>>>
>>>> Signed-off-by: Ilia Mirkin
>>>> ---
>>>>
>>>> I get 16 failures with dEQP tests, these fall into 2 categories:
>>>>
>>>>  - 1-sample multisample surfaces don't behave the way it likes (it considers
>>>>    them non-multisampled even though they're created through gl*Multisample*
>>>>
>>>>  - gl_SampleMaskIn is reporting the whole pixel's worth of mask rather than
>>>>    just the fragment in question. Looking back, it appears that
>>>>    ARB_gpu_shader5 also wants it for only the current fragment.
>
> I think that's correct.  GL_ARB_sample_shading also says:
>
> If the fragment shader is being evaluated at
> any frequency other than per-fragment, bits of the sample mask not
> corresponding to the current fragment shader invocation are ignored.
>
>>>>  docs/GL3.txt| 2 +-
>>>>  src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
>>>>  src/mesa/state_tracker/st_atom_shader.c | 2 ++
>>>>  src/mesa/state_tracker/st_extensions.c  | 4 
>>>>  src/mesa/state_tracker/st_program.c | 5 -
>>>>  5 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>>>> index 26847b9..ae439f6 100644
>>>> --- a/docs/GL3.txt
>>>> +++ b/docs/GL3.txt
>>>> @@ -248,7 +248,7 @@ GLES3.2, GLSL ES 3.2
>>>>    GL_OES_gpu_shader5   not started (based
>>>> on parts of GL_ARB_gpu_shader5, which is done for some drivers)
>>>>    GL_OES_primitive_bounding boxnot started
>>>>    GL_OES_sample_shadingnot started (based
>>>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>>>> -  GL_OES_sample_variables  not started (based
>>>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>>>> +  GL_OES_sample_variables  DONE (nvc0, r600,
>>>> radeonsi)
>>>>    GL_OES_shader_image_atomic   not started (based
>>>> on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
>>>>    GL_OES_shader_io_blocks  not started (based
>>>> on parts of GLSL 1.50, which is done)
>>>>    GL_OES_shader_multisample_interpolation  not started (based
>>>> on parts of GL_ARB_gpu_shader5, which is done)
>>>> diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c
>>>> b/src/mesa/state_tracker/st_atom_rasterizer.c
>>>> index c20cadf..d42d512 100644
>>>> --- a/src/mesa/state_tracker/st_atom_rasterizer.c
>>>> +++ b/src/mesa/state_tracker/st_atom_rasterizer.c
>>>> @@ -31,6 +31,7 @@
>>>>    */
>>>>
>>>>  #include "main/macros.h"
>>>> +#include "main/context.h"
>>>>  #include "st_context.h"
>>>>  #include "st_atom.h"
>>>>  #include "st_debug.h"
>>>> @@ -239,6 +240,7 @@ static void update_raster_state( struct st_context *st
>>>> )
>>>>
>>>>     /* _NEW_MULTISAMPLE | _NEW_BUFFERS */
>>>>     raster->force_persample_interp =
>>>> +      !_mesa_is_gles(ctx) &&
>>>>        !st->force_persample_in_shader &&
>>>>        ctx->Multisample._Enabled &&
>>>>        ctx->Multisample.SampleShading &&
>>>
>>> Is this change necessary?  I would have thought that
>>> ctx->Multisample.SampleShading couldn't get set without using features
>>> from OES_sample_shading.
>>
>> We don't want per-sample interp when sample shading, as far as I can
>> tell, based on the various OES text.
>
> With just OES_sample_shading, this is true.  Based on the existing
> expression, you can only get per-sample interpolation if
> ctx->Multisample.SampleShading is set.  That can only be set by
> glEnable(GL_SAMPLE_SHADING), and that is gated on 'if
> (!_mesa_is_desktop_gl(ctx)) goto invalid_enum_error;'  We'll need to
> change that when we add support for GL_OES_sample_shading. :)
>
> So... I'm pretty sure that it's already impossible for
> raster->force_persample_interp to get set to true in a GLES context.
> This hunk just adds some code that will have to be removed when we
> (you?) add GL_OES_sample_shading support.
>
>>>> diff --git a/src/mesa/state_tracker/st_atom_shader.c
>>>> b/src/mesa/state_tracker/st_atom_shader.c
>>>> index a88f035..8cfe756 100644
>>>> --- a/src/mesa/state_tracker/st_atom_shader.c
>>>> +++ b/src/mesa/state_tracker/st_atom_shader.c
>>>> @@ -36,6 +36,7 @@
>>>>    */
>>>>
>>>>  #include "main/imports.h"
>>>> +#include "main/context.h"
>>>>  #include "main/mtypes.h"
>>>>  #include "program/program.h"
>>>>
>>>> @@ -76,6 +77,7 @@ update_fp( struct st_context *

Re: [Mesa-dev] [PATCH 07/10] i965: fix new gcc6 warnings

2016-02-16 Thread Ian Romanick

This patch is

Reviewed-by: Ian Romanick 

On 02/16/2016 10:58 AM, Rob Clark wrote:
> src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp:244:1: warning:
> ‘void {anonymous}::fs_copy_prop_dataflow::dump_block_data() const’ defined 
> but not used [-Wunused-function]
>  fs_copy_prop_dataflow::dump_block_data() const
>  ^
> 
> From looking at git history, it looks like this is intended to be unused
> (ie. just for adding on-demand debug prints)
> 
> Signed-off-by: Rob Clark 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> index fd25307..9dbe13d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> @@ -87,7 +87,7 @@ public:
> void setup_initial_values();
> void run();
>  
> -   void dump_block_data() const;
> +   void dump_block_data() const UNUSED;
>  
> void *mem_ctx;
> cfg_t *cfg;
> 


Re: [Mesa-dev] [PATCH 10/10] mesa: fix new gcc6 warnings

2016-02-16 Thread Ian Romanick

This patch is

Reviewed-by: Ian Romanick 

On 02/16/2016 10:58 AM, Rob Clark wrote:
> src/mesa/main/texstore.c:92:22: warning: ‘map_1032’ defined but not used 
> [-Wunused-const-variable]
>  static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE };
>   ^~~~
> src/mesa/main/texstore.c:91:22: warning: ‘map_3210’ defined but not used 
> [-Wunused-const-variable]
>  static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE };
>   ^~~~
> src/mesa/main/texstore.c:90:22: warning: ‘map_identity’ defined but not used 
> [-Wunused-const-variable]
>  static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE };
>   ^~~~
> 
> These appear to be unused since:
> 
> commit 8ec6534b266549cdc2798e2523bf6753924f6cde
> Author: Iago Toral Quiroga 
> AuthorDate: Wed Oct 15 13:42:11 2014 +0200
> 
> mesa: Use _mesa_format_convert to implement texstore_rgba.
> 
> Signed-off-by: Rob Clark 
> ---
>  src/mesa/main/texstore.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
> index d767173..c33b109 100644
> --- a/src/mesa/main/texstore.c
> +++ b/src/mesa/main/texstore.c
> @@ -87,9 +87,6 @@ enum {
>   * Texture image storage function.
>   */
>  typedef GLboolean (*StoreTexImageFunc)(TEXSTORE_PARAMS);
> -static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE };
> -static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE };
> -static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE };
>  
>  
>  /**
> 


Re: [Mesa-dev] [PATCH 09/10] glsl: fix new gcc6 warnings

2016-02-16 Thread Ian Romanick

This patch is

Reviewed-by: Ian Romanick 

On 02/16/2016 10:58 AM, Rob Clark wrote:
> src/compiler/glsl/ast_to_hir.cpp: In function ‘unsigned int 
> ast_process_struct_or_iface_block_members(exec_list*, 
> _mesa_glsl_parse_state*, exec_list*, glsl_struct_field**, bool, 
> glsl_matrix_layout, bool, ir_variable_mode, ast_type_qualifier*,
> unsigned int, unsigned int)’:
> src/compiler/glsl/ast_to_hir.cpp:6339:52: warning: 
> ‘first_member_has_explicit_location’ may be used uninitialized in this 
> function [-Wmaybe-uninitialized]
>  if (!layout->flags.q.explicit_location &&
>  ~~~^~
>  ((first_member_has_explicit_location &&
>  ~~~
>!qual->flags.q.explicit_location) ||
>
>   (!first_member_has_explicit_location &&
>   ~~~
>qual->flags.q.explicit_location))) {
>~
> 
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index b639378..9b08d25 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -6259,7 +6259,7 @@ ast_process_struct_or_iface_block_members(exec_list 
> *instructions,
>decl_count);
>  
> bool first_member = true;
> -   bool first_member_has_explicit_location;
> +   bool first_member_has_explicit_location = false;
>  
> unsigned i = 0;
> foreach_list_typed (ast_declarator_list, decl_list, link, declarations) {
> 


[Mesa-dev] [PATCH 06/10] freedreno/ir3: fix new gcc6 errors

2016-02-16 Thread Rob Clark

src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c: In function ‘emit_tex’:
src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c:1368:26: warning: unused 
variable ‘const_off’ [-Wunused-variable]
  struct ir3_instruction *const_off[4];
  ^
unused since:

commit 8750299a420af76cebd3067f6f603eacde06ae06
Author: Jason Ekstrand 
Date:   Tue Feb 9 14:51:28 2016 -0800

nir: Remove the const_offset from nir_tex_instr

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index ffa7577..7a1812f 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -1365,7 +1365,6 @@ emit_tex(struct ir3_compile *ctx, nir_tex_instr *tex)
struct ir3_block *b = ctx->block;
struct ir3_instruction **dst, *sam, *src0[12], *src1[4];
struct ir3_instruction **coord, *lod, *compare, *proj, **off, **ddx, 
**ddy;
-   struct ir3_instruction *const_off[4];
bool has_bias = false, has_lod = false, has_proj = false, has_off = 
false;
unsigned i, coords, flags;
unsigned nsrc0 = 0, nsrc1 = 0;
-- 
2.5.0


[Mesa-dev] [PATCH 10/10] mesa: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/mesa/main/texstore.c:92:22: warning: ‘map_1032’ defined but not used 
[-Wunused-const-variable]
 static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE };
  ^~~~
src/mesa/main/texstore.c:91:22: warning: ‘map_3210’ defined but not used 
[-Wunused-const-variable]
 static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE };
  ^~~~
src/mesa/main/texstore.c:90:22: warning: ‘map_identity’ defined but not used 
[-Wunused-const-variable]
 static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE };
  ^~~~

These appear to be unused since:

commit 8ec6534b266549cdc2798e2523bf6753924f6cde
Author: Iago Toral Quiroga 
AuthorDate: Wed Oct 15 13:42:11 2014 +0200

mesa: Use _mesa_format_convert to implement texstore_rgba.

Signed-off-by: Rob Clark 
---
 src/mesa/main/texstore.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
index d767173..c33b109 100644
--- a/src/mesa/main/texstore.c
+++ b/src/mesa/main/texstore.c
@@ -87,9 +87,6 @@ enum {
  * Texture image storage function.
  */
 typedef GLboolean (*StoreTexImageFunc)(TEXSTORE_PARAMS);
-static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE };
-static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE };
-static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE };
 
 
 /**
-- 
2.5.0


[Mesa-dev] [PATCH 05/10] postprocess: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

In file included from src/gallium/state_trackers/dri/dri_screen.h:44:0,
 from src/gallium/state_trackers/dri/dri_query_renderer.c:7:
src/gallium/auxiliary/postprocess/filters.h:54:33: warning: ‘pp_filters’
defined but not used [-Wunused-const-variable]
 static const struct pp_filter_t pp_filters[PP_FILTERS] = {
 ^~

Note: we may actually want to move this one into a .c file instead?

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/postprocess/filters.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/postprocess/filters.h 
b/src/gallium/auxiliary/postprocess/filters.h
index 321f333..8028df9 100644
--- a/src/gallium/auxiliary/postprocess/filters.h
+++ b/src/gallium/auxiliary/postprocess/filters.h
@@ -51,7 +51,7 @@ struct pp_filter_t
 
 /* Order matters. Put new filters in a suitable place. */
 
-static const struct pp_filter_t pp_filters[PP_FILTERS] = {
+static const struct pp_filter_t pp_filters[PP_FILTERS] UNUSED = {
 /*name inner   shaders verts   init
run   free   */
{ "pp_noblue",  0,  2,  1,  pp_noblue_init, 
pp_nocolor,   pp_nocolor_free },
{ "pp_nogreen", 0,  2,  1,  pp_nogreen_init,
pp_nocolor,   pp_nocolor_free },
-- 
2.5.0


[Mesa-dev] [PATCH 04/10] trace: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/gallium/drivers/trace/tr_context.c:1713:39: warning: ‘rbug_blocker_flags’ 
defined but not used [-Wunused-const-variable]
 static const struct debug_named_value rbug_blocker_flags[] = {
   ^~

Note that use of rbug_blocker_flags was removed in:

commit 5494332128da0b2826e85df5eeaa878bb5c30a4e
Author: Jakob Bornecrantz 
Date:   Wed May 12 19:26:19 2010 +0100

trace: Remove rbug from trace

Signed-off-by: Rob Clark 
---
 src/gallium/drivers/trace/tr_context.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/src/gallium/drivers/trace/tr_context.c 
b/src/gallium/drivers/trace/tr_context.c
index 46936c1..0028377 100644
--- a/src/gallium/drivers/trace/tr_context.c
+++ b/src/gallium/drivers/trace/tr_context.c
@@ -1709,13 +1709,6 @@ static void trace_context_launch_grid(struct 
pipe_context *_pipe,
trace_dump_call_end();
 }
 
-
-static const struct debug_named_value rbug_blocker_flags[] = {
-   {"before", 1, NULL},
-   {"after", 2, NULL},
-   DEBUG_NAMED_VALUE_END
-};
-
 struct pipe_context *
 trace_context_create(struct trace_screen *tr_scr,
  struct pipe_context *pipe)
-- 
2.5.0


[Mesa-dev] [PATCH 07/10] i965: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp:244:1: warning:
‘void {anonymous}::fs_copy_prop_dataflow::dump_block_data() const’ defined but 
not used [-Wunused-function]
 fs_copy_prop_dataflow::dump_block_data() const
 ^

From the git history, it looks like this is intended to be unused
(i.e. it is kept only for adding on-demand debug prints).

Signed-off-by: Rob Clark 
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index fd25307..9dbe13d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -87,7 +87,7 @@ public:
void setup_initial_values();
void run();
 
-   void dump_block_data() const;
+   void dump_block_data() const UNUSED;
 
void *mem_ctx;
cfg_t *cfg;
-- 
2.5.0


[Mesa-dev] [PATCH 08/10] glsl: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status 
{anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump*)’ defined 
but not used [-Wunused-function]
 lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
 ^~

Note: I am not sure whether this was a latent bug.  It could be that it
was intended to override visit(ir_loop_jump *).

Signed-off-by: Rob Clark 
---
 src/compiler/glsl/lower_discard_flow.cpp | 12 
 1 file changed, 12 deletions(-)

diff --git a/src/compiler/glsl/lower_discard_flow.cpp 
b/src/compiler/glsl/lower_discard_flow.cpp
index 9d0a56b..bdb96b4 100644
--- a/src/compiler/glsl/lower_discard_flow.cpp
+++ b/src/compiler/glsl/lower_discard_flow.cpp
@@ -63,7 +63,6 @@ public:
}
 
ir_visitor_status visit_enter(ir_discard *ir);
-   ir_visitor_status visit_enter(ir_loop_jump *ir);
ir_visitor_status visit_enter(ir_loop *ir);
ir_visitor_status visit_enter(ir_function_signature *ir);
 
@@ -76,17 +75,6 @@ public:
 } /* anonymous namespace */
 
 ir_visitor_status
-lower_discard_flow_visitor::visit_enter(ir_loop_jump *ir)
-{
-   if (ir->mode != ir_loop_jump::jump_continue)
-  return visit_continue;
-
-   ir->insert_before(generate_discard_break());
-
-   return visit_continue;
-}
-
-ir_visitor_status
 lower_discard_flow_visitor::visit_enter(ir_discard *ir)
 {
ir_dereference *lhs = new(mem_ctx) ir_dereference_variable(discarded);
-- 
2.5.0


[Mesa-dev] [PATCH 09/10] glsl: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/compiler/glsl/ast_to_hir.cpp: In function ‘unsigned int 
ast_process_struct_or_iface_block_members(exec_list*, _mesa_glsl_parse_state*, 
exec_list*, glsl_struct_field**, bool, glsl_matrix_layout, bool, 
ir_variable_mode, ast_type_qualifier*,
unsigned int, unsigned int)’:
src/compiler/glsl/ast_to_hir.cpp:6339:52: warning: 
‘first_member_has_explicit_location’ may be used uninitialized in this function 
[-Wmaybe-uninitialized]
 if (!layout->flags.q.explicit_location &&
 ~~~^~
 ((first_member_has_explicit_location &&
 ~~~
   !qual->flags.q.explicit_location) ||
   
  (!first_member_has_explicit_location &&
  ~~~
   qual->flags.q.explicit_location))) {
   ~

Signed-off-by: Rob Clark 
---
 src/compiler/glsl/ast_to_hir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index b639378..9b08d25 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -6259,7 +6259,7 @@ ast_process_struct_or_iface_block_members(exec_list 
*instructions,
   decl_count);
 
bool first_member = true;
-   bool first_member_has_explicit_location;
+   bool first_member_has_explicit_location = false;
 
unsigned i = 0;
foreach_list_typed (ast_declarator_list, decl_list, link, declarations) {
-- 
2.5.0
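
The warning fixed by this patch can be illustrated with a minimal sketch in
C. It assumes that the flag is assigned while the first member is processed
and only compared for later members, as the variable names suggest; the
surrounding Mesa code is not shown in this thread. The function name
example_mixed_locations and the has_location[] array are hypothetical
simplifications, not Mesa code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when some members carry an explicit location and others
 * do not.  has_location[] is a hypothetical simplification of the
 * per-member qualifier flags. */
bool
example_mixed_locations(const bool *has_location, size_t count)
{
   bool first_member = true;
   /* Only assigned on the first iteration.  Every read below is preceded
    * by that assignment, but the compiler cannot always prove it, hence
    * -Wmaybe-uninitialized.  Initializing to false is harmless because
    * the value is overwritten before it is ever compared. */
   bool first_member_has_location = false;

   for (size_t i = 0; i < count; i++) {
      if (first_member) {
         first_member = false;
         first_member_has_location = has_location[i];
      } else if (first_member_has_location != has_location[i]) {
         return true;
      }
   }
   return false;
}
```

Under these assumptions the initializer changes no behavior; it only gives
the compiler a definite value on every path.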


[Mesa-dev] [PATCH 01/10] util: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined but 
not used [-Wunused-const-variable]
 static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
   ^~

Signed-off-by: Rob Clark 
---
 src/util/hash_table.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/hash_table.h b/src/util/hash_table.h
index 85b013c..a0244d7 100644
--- a/src/util/hash_table.h
+++ b/src/util/hash_table.h
@@ -108,7 +108,7 @@ static inline uint32_t _mesa_hash_pointer(const void 
*pointer)
return _mesa_hash_data(&pointer, sizeof(pointer));
 }
 
-static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u;
+static const uint32_t _mesa_fnv32_1a_offset_bias UNUSED = 2166136261u;
 
 static inline uint32_t
 _mesa_fnv32_1a_accumulate_block(uint32_t hash, const void *data, size_t size)
-- 
2.5.0
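
For readers unfamiliar with the annotation used in this patch, here is a
minimal sketch in C. It assumes that UNUSED expands to GCC's
__attribute__((unused)) on GCC-compatible compilers; the EXAMPLE_UNUSED
macro and the example_* names are illustrative stand-ins, not Mesa code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the UNUSED macro used in the patch. */
#if defined(__GNUC__)
#define EXAMPLE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_UNUSED
#endif

/* A file-scope constant defined in a header is duplicated into every
 * translation unit that includes it.  gcc6 warns with
 * -Wunused-const-variable in each unit that never reads it; the
 * attribute tells the compiler that this is expected. */
static const uint32_t example_fnv32_1a_offset_bias EXAMPLE_UNUSED = 2166136261u;

/* One FNV-1a step per byte: xor the byte in, then multiply by the
 * 32-bit FNV prime. */
static inline uint32_t
example_fnv32_1a_accumulate(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = (const uint8_t *)data;

   while (size-- > 0) {
      hash ^= *bytes++;
      hash *= 0x01000193u;
   }
   return hash;
}
```

A translation unit that includes such a header but never hashes anything
should then compile without the warning, while users of the constant are
unaffected.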


[Mesa-dev] [PATCH 02/10] gallium/hud: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/gallium/auxiliary/hud/font.c:234:22: warning: ‘Fixed8x13_Character_159’ 
defined but not used [-Wunused-const-variable]
 static const GLubyte Fixed8x13_Character_159[] = {  9,  0,  0,  0,  0, 0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0, 0,170,  0,  0,  
0,  0,  0};
  ^~~
 many more..

These are simply unused, just #if 0 them out for now, in case someone
wants to use them in the future.

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/hud/font.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/auxiliary/hud/font.c b/src/gallium/auxiliary/hud/font.c
index 60e8ae5..067de9e 100644
--- a/src/gallium/auxiliary/hud/font.c
+++ b/src/gallium/auxiliary/hud/font.c
@@ -199,6 +199,7 @@ static const GLubyte Fixed8x13_Character_123[] = {  8,  0,  
0,  0, 14, 16, 16,
 static const GLubyte Fixed8x13_Character_124[] = {  8,  0,  0,  0, 16, 16, 16, 
16, 16, 16, 16, 16, 16,  0,  0};
 static const GLubyte Fixed8x13_Character_125[] = {  8,  0,  0,  0,112,  8,  8, 
16, 12, 16,  8,  8,112,  0,  0};
 static const GLubyte Fixed8x13_Character_126[] = {  8,  0,  0,  0,  0,  0,  0, 
 0,  0,  0, 72, 84, 36,  0,  0};
+#if 0 /* currently unused */
 static const GLubyte Fixed8x13_Character_127[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
 static const GLubyte Fixed8x13_Character_128[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
 static const GLubyte Fixed8x13_Character_129[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
@@ -232,6 +233,7 @@ static const GLubyte Fixed8x13_Character_156[] = {  9,  0,  
0,  0,  0,  0,  0,17
 static const GLubyte Fixed8x13_Character_157[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
 static const GLubyte Fixed8x13_Character_158[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
 static const GLubyte Fixed8x13_Character_159[] = {  9,  0,  0,  0,  0,  0,  
0,170,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,130,  0,  0,  0,170,  0,  0,  
0,  0,  0};
+#endif
 static const GLubyte Fixed8x13_Character_160[] = {  8,  0,  0,  0,  0,  0,  0, 
 0,  0,  0,  0,  0,  0,  0,  0};
 static const GLubyte Fixed8x13_Character_161[] = {  8,  0,  0,  0, 16, 16, 16, 
16, 16, 16, 16,  0, 16,  0,  0};
 static const GLubyte Fixed8x13_Character_162[] = {  8,  0,  0,  0,  0, 16, 56, 
84, 80, 80, 84, 56, 16,  0,  0};
-- 
2.5.0


[Mesa-dev] [PATCH 00/10] Clean up most of the new gcc6 warnings

2016-02-16 Thread Rob Clark

From: Rob Clark 

gcc6 landed in rawhide, and all of a sudden building mesa got very
noisy.  This patchset cleans up most of the warnings (especially
the first one which shows up everywhere that #includes hash_table.h)

There are two remaining:

src/gallium/auxiliary/util/u_debug_stack.c: In function 
‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:108:18: warning: calling 
‘__builtin_frame_address’ with a nonzero argument is unsafe [-Wframe-address]
frame_pointer = ((const void **)__builtin_frame_address(1));
~~^

Not sure there is anything we *can* do about that one.  And:

  src/mesa/main/get.c:473:18: warning: ‘extra_version_40’ defined but not used 
[-Wunused-const-variable]
   static const int extra_version_40[] = { EXTRA_VERSION_40, EXTRA_END };
^~~~
  src/mesa/main/get.c:232:21: warning: ‘extra_ARB_compute_shader’ defined but 
not used [-Wunused-const-variable]
  static const int extra_##e[] = {  \
   ^
  src/mesa/main/get.c:441:1: note: in expansion of macro ‘EXTRA_EXT’
   EXTRA_EXT(ARB_compute_shader);
   ^

Not really sure what we should do about that one.

Rob Clark (10):
  util: fix new gcc6 warnings
  gallium/hud: fix new gcc6 warnings
  gallium/auxiliary: fix new gcc6 warnings
  trace: fix new gcc6 warnings
  postprocess: fix new gcc6 warnings
  freedreno/ir3: fix new gcc6 errors
  i965: fix new gcc6 warnings
  glsl: fix new gcc6 warnings
  glsl: fix new gcc6 warnings
  mesa: fix new gcc6 warnings

 src/compiler/glsl/ast_to_hir.cpp  |  2 +-
 src/compiler/glsl/lower_discard_flow.cpp  | 12 
 src/gallium/auxiliary/hud/font.c  |  2 ++
 src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c   |  4 ++--
 src/gallium/auxiliary/postprocess/filters.h   |  2 +-
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c  |  1 -
 src/gallium/drivers/trace/tr_context.c|  7 ---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp |  2 +-
 src/mesa/main/texstore.c  |  3 ---
 src/util/hash_table.h |  2 +-
 10 files changed, 8 insertions(+), 29 deletions(-)

-- 
2.5.0


[Mesa-dev] [PATCH 03/10] gallium/auxiliary: fix new gcc6 warnings

2016-02-16 Thread Rob Clark

src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c: In function 
‘mm_bufmgr_create_from_buffer’:
src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:288:4:
warning: statement is indented as if it were guarded by... 
[-Wmisleading-indentation]
if(mm->map)
^~
src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:286:1: note:
...this ‘if’ clause, but it is not
 if(mm->heap)
 ^~

Signed-off-by: Rob Clark 
---
 src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c 
b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c
index 14de61b..023a028 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c
@@ -283,8 +283,8 @@ mm_bufmgr_create_from_buffer(struct pb_buffer *buffer,
return SUPER(mm);

 failure:
-if(mm->heap)
-   u_mmDestroy(mm->heap);
+   if(mm->heap)
+  u_mmDestroy(mm->heap);
if(mm->map)
   pb_unmap(mm->buffer);
FREE(mm);
-- 
2.5.0
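
The warning addressed here is purely about whitespace. A minimal sketch in
C of the cleanup path after the fix follows; struct example_mgr and its
counters are hypothetical stand-ins for the buffer manager, its heap, and
its mapping.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the manager object in the patch. */
struct example_mgr {
   bool has_heap;
   bool has_map;
   int heap_destroyed;
   int buffer_unmapped;
};

/* Cleanup with both guards at the same indentation level.  Before the
 * patch the first `if` sat to the left of the second, so gcc6 reported
 * that the second statement was indented as if it were guarded by the
 * first.  The control flow is identical either way; only the whitespace
 * changed. */
void
example_mgr_cleanup(struct example_mgr *mm)
{
   if (mm->has_heap)
      mm->heap_destroyed++;
   if (mm->has_map)
      mm->buffer_unmapped++;
}
```

Both guards are evaluated independently, which is what the original code
already did despite its indentation.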


Re: [Mesa-dev] [PATCH 4/4] st/mesa: add OES_sample_variables support

2016-02-16 Thread Ian Romanick

On 02/16/2016 09:12 AM, Ilia Mirkin wrote:
> On Tue, Feb 16, 2016 at 12:02 PM, Ian Romanick  wrote:
>> On 02/15/2016 10:31 PM, Ilia Mirkin wrote:
>>> Basically the same thing as ARB_sample_shading except that it also needs
>>> gl_SampleMaskIn support and must not enable per-sample interpolation
>>> whenever doing per-sample shading. This is done explicitly in another
>>> extension.
>>>
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>
>>> I get 16 failures with dEQP tests, these fall into 2 categories:
>>>
>>>  - 1-sample multisample surfaces don't behave the way it likes (it considers
>>>them non-multisampled even though they're created through gl*Multisample*)
>>>
>>>  - gl_SampleMaskIn is reporting the whole pixel's worth of mask rather than
>>>just the fragment in question. Looking back, it appears that
>>>ARB_gpu_shader5 also wants it for only the current fragment.

I think that's correct.  GL_ARB_sample_shading also says:

If the fragment shader is being evaluated at
any frequency other than per-fragment, bits of the sample mask not
corresponding to the current fragment shader invocation are ignored.

>>>  docs/GL3.txt| 2 +-
>>>  src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
>>>  src/mesa/state_tracker/st_atom_shader.c | 2 ++
>>>  src/mesa/state_tracker/st_extensions.c  | 4 
>>>  src/mesa/state_tracker/st_program.c | 5 -
>>>  5 files changed, 13 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>>> index 26847b9..ae439f6 100644
>>> --- a/docs/GL3.txt
>>> +++ b/docs/GL3.txt
>>> @@ -248,7 +248,7 @@ GLES3.2, GLSL ES 3.2
>>>GL_OES_gpu_shader5   not started (based 
>>> on parts of GL_ARB_gpu_shader5, which is done for some drivers)
>>>GL_OES_primitive_bounding boxnot started
>>>GL_OES_sample_shadingnot started (based 
>>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>>> -  GL_OES_sample_variables  not started (based 
>>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>>> +  GL_OES_sample_variables  DONE (nvc0, r600, 
>>> radeonsi)
>>>GL_OES_shader_image_atomic   not started (based 
>>> on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
>>>GL_OES_shader_io_blocks  not started (based 
>>> on parts of GLSL 1.50, which is done)
>>>GL_OES_shader_multisample_interpolation  not started (based 
>>> on parts of GL_ARB_gpu_shader5, which is done)
>>> diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
>>> b/src/mesa/state_tracker/st_atom_rasterizer.c
>>> index c20cadf..d42d512 100644
>>> --- a/src/mesa/state_tracker/st_atom_rasterizer.c
>>> +++ b/src/mesa/state_tracker/st_atom_rasterizer.c
>>> @@ -31,6 +31,7 @@
>>>*/
>>>
>>>  #include "main/macros.h"
>>> +#include "main/context.h"
>>>  #include "st_context.h"
>>>  #include "st_atom.h"
>>>  #include "st_debug.h"
>>> @@ -239,6 +240,7 @@ static void update_raster_state( struct st_context *st )
>>>
>>> /* _NEW_MULTISAMPLE | _NEW_BUFFERS */
>>> raster->force_persample_interp =
>>> + !_mesa_is_gles(ctx) &&
>>>   !st->force_persample_in_shader &&
>>>   ctx->Multisample._Enabled &&
>>>   ctx->Multisample.SampleShading &&
>>
>> Is this change necessary?  I would have thought that
>> ctx->Multisample.SampleShading couldn't get set without using features
>> from OES_sample_shading.
> 
> We don't want per-sample interp when sample shading, as far as I can
> tell, based on the various OES text.

With just OES_sample_shading, this is true.  Based on the existing
expression, you can only get per-sample interpolation if
ctx->Multisample.SampleShading is set.  That can only be set by
glEnable(GL_SAMPLE_SHADING), and that is gated on 'if
(!_mesa_is_desktop_gl(ctx)) goto invalid_enum_error;'  We'll need to
change that when we add support for GL_OES_sample_shading. :)

So... I'm pretty sure that it's already impossible for
raster->force_persample_interp to get set to true in a GLES context.
This hunk just adds some code that will have to be removed when we
(you?) add GL_OES_sample_shading support.
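
The argument above can be sketched as a small model in C. Everything here
is a hypothetical simplification: struct example_ctx stands in for the GL
context, example_enable_sample_shading() for the glEnable(GL_SAMPLE_SHADING)
path with its desktop-GL gate, and the terms of the real expression that
are not quoted in this thread are omitted.

```c
#include <stdbool.h>

enum example_api { EXAMPLE_API_DESKTOP_GL, EXAMPLE_API_GLES };

/* Hypothetical, reduced context state. */
struct example_ctx {
   enum example_api api;
   bool multisample_enabled;
   bool sample_shading;
};

/* Models the enable path: the flag can only be set on desktop GL. */
bool
example_enable_sample_shading(struct example_ctx *ctx)
{
   if (ctx->api != EXAMPLE_API_DESKTOP_GL)
      return false;   /* corresponds to the invalid_enum_error path */
   ctx->sample_shading = true;
   return true;
}

/* Models the existing expression without the extra GLES check. */
bool
example_force_persample_interp(const struct example_ctx *ctx,
                               bool force_persample_in_shader)
{
   return !force_persample_in_shader &&
          ctx->multisample_enabled &&
          ctx->sample_shading;
}
```

In this model a GLES context can never reach sample_shading == true, so the
expression is already false there without an explicit API check.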

>>> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
>>> b/src/mesa/state_tracker/st_atom_shader.c
>>> index a88f035..8cfe756 100644
>>> --- a/src/mesa/state_tracker/st_atom_shader.c
>>> +++ b/src/mesa/state_tracker/st_atom_shader.c
>>> @@ -36,6 +36,7 @@
>>>   */
>>>
>>>  #include "main/imports.h"
>>> +#include "main/context.h"
>>>  #include "main/mtypes.h"
>>>  #include "program/program.h"
>>>
>>> @@ -76,6 +77,7 @@ update_fp( struct st_context *st )
>>>  * Ignore sample qualifier while computing this flag.
>>>  */
>>> key.persample_shading =
>>> +  !_mesa_is_gles(st->ctx) &&
>>>st->force_persample_in_shader &&
>

[Mesa-dev] Where do we put a Vulkan driver?

2016-02-16 Thread Jason Ekstrand

So, we just pushed a branch containing a Vulkan driver.  Naturally, we
would like to incorporate that driver into the upstream mesa tree.  While
we work on upstreaming the prerequisites in NIR and the i965 back-end
compiler, there is a question that needs answering:  Where do we put it?

The Vulkan driver challenges the tree-like nature of the way mesa is
currently organized.  We now have two drivers that share a lot of the same
underlying hardware-specific code (compiler and ISL) but target different
APIs and no gallium-like middle layer to hide behind.  Obviously, we don't
want to put a Vulkan driver in src/mesa/drivers/dri/i965.  If we start a
src/vulkan directory, we don't really want to put the shared parts into
src/vulkan/intel.  Where should we put the Intel-specific but API-agnostic
bits?  In particular, we need a place to put ISL and the back-end compiler.
We don't want to deal with the headaches of making a public API and keeping
it stable, so they need to live somewhere in the mesa tree.

In my personal opinion, the best thing to do is probably to add a src/intel
folder with subfolders for vulkan, isl, and the back-end compiler.  The
src/mesa/drivers/dri/i965 folder would then basically be just the GL bits
of the driver.  It does seem a little odd to have "intel" as a top-level
source folder, but I can't come up with anything better.

Thoughts?  Opinions?  Favorite colors?
--Jason

Re: [Mesa-dev] [PATCH] st/mesa: use cso_set_viewport_dims() in try_pbo_upload_common()

2016-02-16 Thread Roland Scheidegger

Am 16.02.2016 um 18:01 schrieb Brian Paul:
> Note that this results in a different transformation for the viewport's
> Z axis (depth range), but that doesn't matter for this case.
> ---
>  src/mesa/state_tracker/st_cb_texture.c | 13 +
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index a06cc72..d09c360 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -1474,18 +1474,7 @@ try_pbo_upload_common(struct gl_context *ctx,
>pipe_surface_reference(&fb.cbufs[0], NULL);
> }
>  
> -   /* Viewport state */
> -   {
> -  struct pipe_viewport_state vp;
> -  vp.scale[0] = 0.5f * surface->width;
> -  vp.scale[1] = 0.5f * surface->height;
> -  vp.scale[2] = 1.0f;
> -  vp.translate[0] = 0.5f * surface->width;
> -  vp.translate[1] = 0.5f * surface->height;
> -  vp.translate[2] = 0.0f;
> -
> -  cso_set_viewport(cso, &vp);
> -   }
> +   cso_set_viewport_dims(cso, surface->width, surface->height, FALSE);
>  
> /* Blend state */
> cso_set_blend(cso, &st->pbo_upload.blend);
> 

Reviewed-by: Roland Scheidegger 
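
For reference, the explicit viewport setup that this patch removes can be
sketched as a standalone helper in C. The struct and function names are
hypothetical; only the six assignments are taken from the removed hunk.
What cso_set_viewport_dims() does for the Z axis is not shown in this
thread and is not modeled here.

```c
/* Hypothetical stand-in for the viewport state filled in by the hunk. */
struct example_viewport {
   float scale[3];
   float translate[3];
};

/* Same values as the removed code: X and Y map [-1, 1] onto the surface,
 * Z is passed through unchanged. */
void
example_viewport_from_dims(struct example_viewport *vp,
                           float width, float height)
{
   vp->scale[0] = 0.5f * width;
   vp->scale[1] = 0.5f * height;
   vp->scale[2] = 1.0f;
   vp->translate[0] = 0.5f * width;
   vp->translate[1] = 0.5f * height;
   vp->translate[2] = 0.0f;
}

/* Applies the viewport transform to one normalized coordinate. */
float
example_ndc_to_window(const struct example_viewport *vp, int axis, float ndc)
{
   return ndc * vp->scale[axis] + vp->translate[axis];
}
```

With a 64-pixel-wide surface, X coordinates -1 and 1 land on 0 and 64, and
any Z value comes out unchanged.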

[Mesa-dev] [PATCH] i965/gen7: Use predicated rendering for indirect compute

2016-02-16 Thread Jordan Justen

On gen7 (Ivy Bridge, Haswell), we get a GPU hang if an indirect
dispatch is used and one of the dimensions is 0.

Therefore we use predicated rendering on the GPGPU_WALKER command to
handle this case.

Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100
Signed-off-by: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Ben Widawsky 
Cc: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_compute.c | 104 +++-
 src/mesa/drivers/dri/i965/brw_defines.h |   1 +
 2 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
b/src/mesa/drivers/dri/i965/brw_compute.c
index d9f181a..bbb8ce3 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -35,6 +35,92 @@
 
 
 static void
+brw_prepare_indirect_gpgpu_walker(struct brw_context *brw)
+{
+   GLintptr indirect_offset = brw->compute.num_work_groups_offset;
+   drm_intel_bo *bo = brw->compute.num_work_groups_bo;
+
+   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
+ I915_GEM_DOMAIN_VERTEX, 0,
+ indirect_offset + 0);
+   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
+ I915_GEM_DOMAIN_VERTEX, 0,
+ indirect_offset + 4);
+   brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
+ I915_GEM_DOMAIN_VERTEX, 0,
+ indirect_offset + 8);
+
+   if (brw->gen > 7)
+  return;
+
+   /* Clear upper 32-bits of SRC0 and all 64-bits of SRC1
+*/
+   BEGIN_BATCH(7);
+   OUT_BATCH(MI_LOAD_REGISTER_IMM | (7 - 2));
+   OUT_BATCH(MI_PREDICATE_SRC0 + 4);
+   OUT_BATCH(0u);
+   OUT_BATCH(MI_PREDICATE_SRC1 + 0);
+   OUT_BATCH(0u);
+   OUT_BATCH(MI_PREDICATE_SRC1 + 4);
+   OUT_BATCH(0u);
+   ADVANCE_BATCH();
+
+   /* Load compute_dispatch_indirect_x_size into SRC0
+*/
+   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ indirect_offset + 0);
+
+   /* predicate = (compute_dispatch_indirect_x_size == 0);
+*/
+   BEGIN_BATCH(1);
+   OUT_BATCH(GEN7_MI_PREDICATE |
+ MI_PREDICATE_LOADOP_LOAD |
+ MI_PREDICATE_COMBINEOP_SET |
+ MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
+   ADVANCE_BATCH();
+
+   /* Load compute_dispatch_indirect_y_size into SRC0
+*/
+   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ indirect_offset + 4);
+
+   /* predicate |= (compute_dispatch_indirect_y_size == 0);
+*/
+   BEGIN_BATCH(1);
+   OUT_BATCH(GEN7_MI_PREDICATE |
+ MI_PREDICATE_LOADOP_LOAD |
+ MI_PREDICATE_COMBINEOP_OR |
+ MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
+   ADVANCE_BATCH();
+
+   /* Load compute_dispatch_indirect_z_size into SRC0
+*/
+   brw_load_register_mem(brw, MI_PREDICATE_SRC0, bo,
+ I915_GEM_DOMAIN_INSTRUCTION, 0,
+ indirect_offset + 8);
+
+   /* predicate |= (compute_dispatch_indirect_z_size == 0);
+*/
+   BEGIN_BATCH(1);
+   OUT_BATCH(GEN7_MI_PREDICATE |
+ MI_PREDICATE_LOADOP_LOAD |
+ MI_PREDICATE_COMBINEOP_OR |
+ MI_PREDICATE_COMPAREOP_SRCS_EQUAL);
+   ADVANCE_BATCH();
+
+   /* predicate = !predicate;
+*/
+   BEGIN_BATCH(1);
+   OUT_BATCH(GEN7_MI_PREDICATE |
+ MI_PREDICATE_LOADOP_LOADINV |
+ MI_PREDICATE_COMBINEOP_OR |
+ MI_PREDICATE_COMPAREOP_FALSE);
+   ADVANCE_BATCH();
+}
+
+static void
 brw_emit_gpgpu_walker(struct brw_context *brw)
 {
const struct brw_cs_prog_data *prog_data = brw->cs.prog_data;
@@ -45,20 +131,10 @@ brw_emit_gpgpu_walker(struct brw_context *brw)
if (brw->compute.num_work_groups_bo == NULL) {
   indirect_flag = 0;
} else {
-  GLintptr indirect_offset = brw->compute.num_work_groups_offset;
-  drm_intel_bo *bo = brw->compute.num_work_groups_bo;
-
-  indirect_flag = GEN7_GPGPU_INDIRECT_PARAMETER_ENABLE;
-
-  brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMX, bo,
-I915_GEM_DOMAIN_VERTEX, 0,
-indirect_offset + 0);
-  brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMY, bo,
-I915_GEM_DOMAIN_VERTEX, 0,
-indirect_offset + 4);
-  brw_load_register_mem(brw, GEN7_GPGPU_DISPATCHDIMZ, bo,
-I915_GEM_DOMAIN_VERTEX, 0,
-indirect_offset + 8);
+  indirect_flag =
+ GEN7_GPGPU_INDIRECT_PARAMETER_ENABLE |
+ ((brw->gen == 7) ? GEN7_GPGPU_PREDICATE_ENABLE : 0);
+  brw_prepare_indirect_gpgpu_walker(brw);
}
 
const unsigned simd_size = prog_data->simd_size;
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index
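
The predicate that brw_prepare_indirect_gpgpu_walker() builds on the GPU
can be modeled on the CPU with a minimal sketch in C. It follows the
comments in the patch step by step; the function name
example_walker_enabled is hypothetical, and the sizes are plain arguments
here rather than values read from the indirect buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPU-side model of the predicate sequence.  SRC1 is cleared to zero in
 * the patch, so each SRCS_EQUAL compare tests one size against 0. */
bool
example_walker_enabled(uint32_t x_size, uint32_t y_size, uint32_t z_size)
{
   bool predicate;

   predicate = (x_size == 0);    /* LOADOP_LOAD, COMBINEOP_SET */
   predicate |= (y_size == 0);   /* LOADOP_LOAD, COMBINEOP_OR  */
   predicate |= (z_size == 0);   /* LOADOP_LOAD, COMBINEOP_OR  */
   predicate = !predicate;       /* LOADOP_LOADINV             */

   return predicate;
}
```

The walker is therefore meant to run only when all three dimensions are
non-zero, which is the case that avoids the hang described in the commit
message.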

[Mesa-dev] [PATCH] configure.ac: enable_asm=yes when x-compiling across same X86 arch

2016-02-16 Thread Dongwon Kim

Currently, the configure script forces 'enable_asm' to 'no'
whenever cross-compilation is performed on an X86 host. This is
based on the assumption that the target architecture differs
from the host's (e.g. ARM). But we may also cross-compile for a
target that is X86-based just like the host, in which case the
same ASM code is supported. 'enable_asm' should no longer be
forced to "no" in this case.

v2: corrected commit message

Signed-off-by: Dongwon Kim 
---
 configure.ac | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index b05f33d..0ad27c9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -710,8 +710,10 @@ test "x$enable_asm" = xno && AC_MSG_RESULT([no])
 if test "x$enable_asm" = xyes -a "x$cross_compiling" = xyes; then
 case "$host_cpu" in
 i?86 | x86_64 | amd64)
-enable_asm=no
-AC_MSG_RESULT([no, cross compiling])
+if test "x$host_cpu" != "x$target_cpu"; then
+enable_asm=no
+AC_MSG_RESULT([no, cross compiling])
+fi
 ;;
 esac
 fi
-- 
1.9.1


Re: [Mesa-dev] [PATCH] gm107/ir: add ATOM CAS emission

2016-02-16 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Tue, Feb 16, 2016 at 12:53 PM, Samuel Pitoiset
 wrote:
> From: Samuel Pitoiset 
>
> This fixes the following dEQP test and the other compswap variants.
>
> dEQP-GLES31.functional.ssbo.atomic.compswap.highp_int
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 42 
> ++
>  1 file changed, 27 insertions(+), 15 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 5dbdeea..025eb19 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -2332,22 +2332,34 @@ void
>  CodeEmitterGM107::emitATOM()
>  {
> unsigned dType, subOp;
> -   switch (insn->dType) {
> -   case TYPE_U32: dType = 0; break;
> -   case TYPE_S32: dType = 1; break;
> -   case TYPE_U64: dType = 2; break;
> -   case TYPE_F32: dType = 3; break;
> -   case TYPE_B128: dType = 4; break;
> -   case TYPE_S64: dType = 5; break;
> -   default: assert(!"unexpected dType"); dType = 0; break;
> -   }
> -   if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
> -  subOp = 8;
> -   else
> -  subOp = insn->subOp;
> -   assert(insn->subOp != NV50_IR_SUBOP_ATOM_CAS); /* XXX */
>
> -   emitInsn (0xed00);
> +   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS) {
> +  switch (insn->dType) {
> +  case TYPE_U32: dType = 0; break;
> +  case TYPE_U64: dType = 1; break;
> +  default: assert(!"unexpected dType"); dType = 0; break;
> +  }
> +  subOp = 15;
> +
> +  emitInsn (0xee00);
> +   } else {
> +  switch (insn->dType) {
> +  case TYPE_U32: dType = 0; break;
> +  case TYPE_S32: dType = 1; break;
> +  case TYPE_U64: dType = 2; break;
> +  case TYPE_F32: dType = 3; break;
> +  case TYPE_B128: dType = 4; break;
> +  case TYPE_S64: dType = 5; break;
> +  default: assert(!"unexpected dType"); dType = 0; break;
> +  }
> +  if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
> + subOp = 8;
> +  else
> + subOp = insn->subOp;
> +
> +  emitInsn (0xed00);
> +   }
> +
> emitField(0x34, 4, subOp);
> emitField(0x31, 3, dType);
> emitField(0x30, 1, insn->src(0).getIndirect(0)->getSize() == 8);
> --
> 2.7.1
>

[Mesa-dev] [PATCH] gm107/ir: add ATOM CAS emission

2016-02-16 Thread Samuel Pitoiset

From: Samuel Pitoiset 

This fixes the following dEQP test and the other compswap variants.

dEQP-GLES31.functional.ssbo.atomic.compswap.highp_int

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 42 ++
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 5dbdeea..025eb19 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -2332,22 +2332,34 @@ void
 CodeEmitterGM107::emitATOM()
 {
unsigned dType, subOp;
-   switch (insn->dType) {
-   case TYPE_U32: dType = 0; break;
-   case TYPE_S32: dType = 1; break;
-   case TYPE_U64: dType = 2; break;
-   case TYPE_F32: dType = 3; break;
-   case TYPE_B128: dType = 4; break;
-   case TYPE_S64: dType = 5; break;
-   default: assert(!"unexpected dType"); dType = 0; break;
-   }
-   if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
-  subOp = 8;
-   else
-  subOp = insn->subOp;
-   assert(insn->subOp != NV50_IR_SUBOP_ATOM_CAS); /* XXX */
 
-   emitInsn (0xed00);
+   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS) {
+  switch (insn->dType) {
+  case TYPE_U32: dType = 0; break;
+  case TYPE_U64: dType = 1; break;
+  default: assert(!"unexpected dType"); dType = 0; break;
+  }
+  subOp = 15;
+
+  emitInsn (0xee00);
+   } else {
+  switch (insn->dType) {
+  case TYPE_U32: dType = 0; break;
+  case TYPE_S32: dType = 1; break;
+  case TYPE_U64: dType = 2; break;
+  case TYPE_F32: dType = 3; break;
+  case TYPE_B128: dType = 4; break;
+  case TYPE_S64: dType = 5; break;
+  default: assert(!"unexpected dType"); dType = 0; break;
+  }
+  if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH)
+ subOp = 8;
+  else
+ subOp = insn->subOp;
+
+  emitInsn (0xed00);
+   }
+
emitField(0x34, 4, subOp);
emitField(0x31, 3, dType);
emitField(0x30, 1, insn->src(0).getIndirect(0)->getSize() == 8);
-- 
2.7.1
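
For readers unfamiliar with what the compswap tests exercise, here is a minimal host-side sketch of compare-and-swap semantics using C11 atomics. It says nothing about the GM107 instruction encoding in the patch; the function name comp_swap_u32 is made up for the example, and the assumption is that the shader-level operation returns the previous memory contents, as GLSL atomicCompSwap does.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Store `data` into *mem only if *mem currently equals `compare`.
 * Returns the value that was in *mem before the operation. */
static uint32_t comp_swap_u32(_Atomic uint32_t *mem, uint32_t compare,
                              uint32_t data)
{
   uint32_t expected = compare;

   /* On success `expected` still holds `compare`, which is the old value.
    * On failure C11 writes the value actually found into `expected`. */
   atomic_compare_exchange_strong(mem, &expected, data);
   return expected;
}
```

A swap whose compare value matches replaces the contents; a swap whose compare value does not match leaves memory untouched, and in both cases the caller sees the old value.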


Re: [Mesa-dev] [PATCH 16/25] radeonsi: add PS prolog

2016-02-16 Thread Marek Olšák

On Tue, Feb 16, 2016 at 5:31 PM, Nicolai Hähnle  wrote:
> So, patches 12-16 also look good to me except for the comments I've sent on
> 12-14.
>
> I'm a bit worried though that there is a lot of "almost code duplication"
> around the handling of input and output positions etc., and maintaining the
> two different code paths for monolithic and non-monolithic is brittle.
>
> Here's an approach that I think could work to clean this up: keep only the
> non-monolithic code for LLVM IR function generation. Then implement
> monolithic mode with a helper that takes a sequence of LLVM IR functions and
> generates a master function that pipes each function's output into the input
> of the next. Then set the functions as always inline and rely on LLVM's
> inliner to stitch everything together.
>
> This ends up with slightly higher overhead for the monolithic code path
> (although the unconditional inlining should be fast), but it would help
> clean the code up tremendously.

Yes, I had the same idea. However, the incremental approach was the
most bearable way to do it and made fixing regressions and hangs a lot
easier. Doing a complete rewrite from monolithic to non-monolithic
would be a lot more frustrating.

Cleaning this up is definitely a good idea, but we need to make it all
more useful first and add an on-disk shader cache to mesa/main. That
will be quite a project too.

Marek
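
The "master function" idea from the quoted proposal can be sketched in plain C as a runtime chain of parts. This is an analogy only: the proposal is about generating an LLVM IR wrapper and letting the inliner merge the parts, whereas the sketch below calls function pointers at run time. All names (part_io, shader_part_fn, run_parts, demo_*) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical register file handed from one shader part to the next. */
struct part_io {
   int regs[4];
};

typedef struct part_io (*shader_part_fn)(struct part_io in);

/* "Master function": feed each part's output into the next part's input. */
static struct part_io run_parts(const shader_part_fn *parts, size_t count,
                                struct part_io in)
{
   for (size_t i = 0; i < count; i++)
      in = parts[i](in);
   return in;
}

/* Three toy parts standing in for prolog, main part and epilog. */
static struct part_io demo_prolog(struct part_io io)
{
   io.regs[0] += 1;
   return io;
}

static struct part_io demo_main(struct part_io io)
{
   io.regs[0] *= 10;
   return io;
}

static struct part_io demo_epilog(struct part_io io)
{
   io.regs[1] = io.regs[0];
   return io;
}
```

Chaining demo_prolog, demo_main and demo_epilog through run_parts behaves like one monolithic function, which is the property the proposal relies on.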

Re: [Mesa-dev] [PATCH] st/mesa: add missing ETC2 entries to format_map

2016-02-16 Thread Ilia Mirkin

It should be noted that, independently of this patch,
glTexStorage(ETC1/ETC2) is broken on gallium drivers that don't
implement those formats in HW (i.e. use the sw fallback). This patch
makes it work for drivers that *do* support it in HW, but more work is
needed for the other drivers. Maybe we should just have the
PIPE_FORMAT_RGBA8 stuff right in there as fallback formats? [Would
need to do that for ETC1 as well.]
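
A rough sketch of the fallback idea, assuming a mapping table where each GL format lists candidate pipe formats in preference order and the first supported one wins. The enums, the table and demo_choose_format are stand-ins invented for this example; they are not the real GL or Gallium values, and this is not the actual st_format.c lookup.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CANDIDATES 3

/* Stand-in enums; the values are hypothetical. */
enum demo_gl_format { DEMO_GL_RGB8_ETC2, DEMO_GL_RGBA8_ETC2_EAC };
enum demo_pipe_format {
   DEMO_PIPE_NONE,
   DEMO_PIPE_ETC2_RGB8,
   DEMO_PIPE_ETC2_RGBA8,
   DEMO_PIPE_RGBA8
};

struct demo_format_mapping {
   enum demo_gl_format gl_format;
   /* Candidates in preference order, terminated by DEMO_PIPE_NONE. */
   enum demo_pipe_format pipe_formats[DEMO_MAX_CANDIDATES];
};

/* Each ETC2 entry lists the native format first and RGBA8 as fallback. */
static const struct demo_format_mapping demo_format_map[] = {
   { DEMO_GL_RGB8_ETC2,
     { DEMO_PIPE_ETC2_RGB8, DEMO_PIPE_RGBA8, DEMO_PIPE_NONE } },
   { DEMO_GL_RGBA8_ETC2_EAC,
     { DEMO_PIPE_ETC2_RGBA8, DEMO_PIPE_RGBA8, DEMO_PIPE_NONE } },
};

typedef bool (*demo_supported_fn)(enum demo_pipe_format format);

/* Return the first candidate the driver reports as supported. */
static enum demo_pipe_format
demo_choose_format(enum demo_gl_format gl, demo_supported_fn supported)
{
   const size_t n = sizeof(demo_format_map) / sizeof(demo_format_map[0]);

   for (size_t i = 0; i < n; i++) {
      const struct demo_format_mapping *m = &demo_format_map[i];
      if (m->gl_format != gl)
         continue;
      for (size_t j = 0; j < DEMO_MAX_CANDIDATES &&
                         m->pipe_formats[j] != DEMO_PIPE_NONE; j++) {
         if (supported(m->pipe_formats[j]))
            return m->pipe_formats[j];
      }
   }
   return DEMO_PIPE_NONE;
}

/* Two toy drivers: one with ETC2 in hardware, one with RGBA8 only. */
static bool demo_hw_has_etc2(enum demo_pipe_format format)
{
   return format != DEMO_PIPE_NONE;
}

static bool demo_hw_lacks_etc2(enum demo_pipe_format format)
{
   return format == DEMO_PIPE_RGBA8;
}
```

A driver with native ETC2 gets the compressed format, while a driver without it lands on the RGBA8 entry and would decompress in software.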

On Tue, Feb 16, 2016 at 12:04 PM, Rob Clark  wrote:
> From: Rob Clark 
>
> Noticed by Ilia when I was trying to figure out why some app was failing
> to use ETC2.
>
> Signed-off-by: Rob Clark 
> Reviewed-by: Ilia Mirkin 
> ---
>  src/mesa/state_tracker/st_format.c | 42 
> ++
>  1 file changed, 42 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_format.c 
> b/src/mesa/state_tracker/st_format.c
> index 2b92bad..82bf3a1 100644
> --- a/src/mesa/state_tracker/st_format.c
> +++ b/src/mesa/state_tracker/st_format.c
> @@ -1484,6 +1484,48 @@ static const struct format_mapping format_map[] = {
>{ PIPE_FORMAT_ETC1_RGB8, 0 }
> },
>
> +   /* ETC2 */
> +   {
> +  { GL_COMPRESSED_RGB8_ETC2, 0 },
> +  { PIPE_FORMAT_ETC2_RGB8, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_SRGB8_ETC2, 0 },
> +  { PIPE_FORMAT_ETC2_SRGB8, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
> +  { PIPE_FORMAT_ETC2_RGB8A1, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
> +  { PIPE_FORMAT_ETC2_SRGB8A1, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_RGBA8_ETC2_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_RGBA8, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_SRGBA8, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_R11_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_R11_UNORM, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_SIGNED_R11_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_R11_SNORM, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_RG11_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_RG11_UNORM, 0 }
> +   },
> +   {
> +  { GL_COMPRESSED_SIGNED_RG11_EAC, 0 },
> +  { PIPE_FORMAT_ETC2_RG11_SNORM, 0 }
> +   },
> +
> /* BPTC */
> {
>{ GL_COMPRESSED_RGBA_BPTC_UNORM, 0 },
> --
> 2.5.0
>

Re: [Mesa-dev] [PATCH 4/4] st/mesa: add OES_sample_variables support

2016-02-16 Thread Ilia Mirkin

On Tue, Feb 16, 2016 at 12:02 PM, Ian Romanick  wrote:
> On 02/15/2016 10:31 PM, Ilia Mirkin wrote:
>> Basically the same thing as ARB_sample_shading except that it also needs
>> gl_SampleMaskIn support as well as not enable per-sample interpolation
>> whenever doing per-sample shading. This is done explicitly in another
>> extension.
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> I get 16 failures with dEQP tests, these fall into 2 categories:
>>
>>  - 1-sample multisample surfaces don't behave the way it likes (it considers
>>them non-multisampled even though they're created through gl*Multisample*)
>>
>>  - gl_SampleMaskIn is reporting the whole pixel's worth of mask rather than
>>just the fragment in question. Looking back, it appears that
>>ARB_gpu_shader5 also wants it for only the current fragment.
>>
>>  docs/GL3.txt| 2 +-
>>  src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
>>  src/mesa/state_tracker/st_atom_shader.c | 2 ++
>>  src/mesa/state_tracker/st_extensions.c  | 4 
>>  src/mesa/state_tracker/st_program.c | 5 -
>>  5 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>> index 26847b9..ae439f6 100644
>> --- a/docs/GL3.txt
>> +++ b/docs/GL3.txt
>> @@ -248,7 +248,7 @@ GLES3.2, GLSL ES 3.2
>>GL_OES_gpu_shader5   not started (based 
>> on parts of GL_ARB_gpu_shader5, which is done for some drivers)
>>GL_OES_primitive_bounding boxnot started
>>GL_OES_sample_shadingnot started (based 
>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>> -  GL_OES_sample_variables  not started (based 
>> on parts of GL_ARB_sample_shading, which is done for some drivers)
>> +  GL_OES_sample_variables  DONE (nvc0, r600, 
>> radeonsi)
>>GL_OES_shader_image_atomic   not started (based 
>> on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
>>GL_OES_shader_io_blocks  not started (based 
>> on parts of GLSL 1.50, which is done)
>>GL_OES_shader_multisample_interpolation  not started (based 
>> on parts of GL_ARB_gpu_shader5, which is done)
>> diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
>> b/src/mesa/state_tracker/st_atom_rasterizer.c
>> index c20cadf..d42d512 100644
>> --- a/src/mesa/state_tracker/st_atom_rasterizer.c
>> +++ b/src/mesa/state_tracker/st_atom_rasterizer.c
>> @@ -31,6 +31,7 @@
>>*/
>>
>>  #include "main/macros.h"
>> +#include "main/context.h"
>>  #include "st_context.h"
>>  #include "st_atom.h"
>>  #include "st_debug.h"
>> @@ -239,6 +240,7 @@ static void update_raster_state( struct st_context *st )
>>
>> /* _NEW_MULTISAMPLE | _NEW_BUFFERS */
>> raster->force_persample_interp =
>> + !_mesa_is_gles(ctx) &&
>>   !st->force_persample_in_shader &&
>>   ctx->Multisample._Enabled &&
>>   ctx->Multisample.SampleShading &&
>
> Is this change necessary?  I would have thought that
> ctx->Multisample.SampleShading couldn't get set without using features
> from OES_sample_shading.

We don't want per-sample interp when sample shading, as far as I can
tell, based on the various OES text.

>
>> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
>> b/src/mesa/state_tracker/st_atom_shader.c
>> index a88f035..8cfe756 100644
>> --- a/src/mesa/state_tracker/st_atom_shader.c
>> +++ b/src/mesa/state_tracker/st_atom_shader.c
>> @@ -36,6 +36,7 @@
>>   */
>>
>>  #include "main/imports.h"
>> +#include "main/context.h"
>>  #include "main/mtypes.h"
>>  #include "program/program.h"
>>
>> @@ -76,6 +77,7 @@ update_fp( struct st_context *st )
>>  * Ignore sample qualifier while computing this flag.
>>  */
>> key.persample_shading =
>> +  !_mesa_is_gles(st->ctx) &&
>>st->force_persample_in_shader &&
>>!(stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
>>  SYSTEM_BIT_SAMPLE_POS)) &&
>
> Is this correct? While not normative, the overview section of the OES
> extension does say
>
> This means that where these features are used (gl_SampleID and
> gl_SamplePosition), implementations must run the fragment shader
> for each sample.
>
> I haven't dug into the body of the spec to find supporting text there.

'Correct' is a strong word to use. Not sure. But I think so :) Based
on my reading of the various OES texts (and note that I haven't dug
into the EXT or NV/whatever ones), GL ES does away with the whole
auto-interpolate-to-sample when sample shading. Basically the point of
this persample_shading shader key is to force the interpolation
position to be 'sample' for drivers that can't auto-do it via the
rasterizer setting (above).

>
>> diff --git a/src/mesa/state_tracker/st_extensions.c 
>> b/src/mesa/state_trac

Re: [Mesa-dev] [PATCH 11/25] radeonsi: first bits for non-monolithic shaders

2016-02-16 Thread Nicolai Hähnle


On 16.02.2016 11:39, Marek Olšák wrote:

On Tue, Feb 16, 2016 at 5:01 PM, Nicolai Hähnle  wrote:

On 15.02.2016 18:59, Marek Olšák wrote:


From: Marek Olšák 

---
   src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
   src/gallium/drivers/radeonsi/si_pipe.h   |  3 ++
   src/gallium/drivers/radeonsi/si_shader.c | 53

   src/gallium/drivers/radeonsi/si_shader.h |  2 +-
   4 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index fa60732..448fe88 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -600,6 +600,7 @@ struct pipe_screen *radeonsi_screen_create(struct
radeon_winsys *ws)

 sscreen->b.has_cp_dma = true;
 sscreen->b.has_streamout = true;
+   sscreen->use_monolithic_shaders = true;

 if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
 sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS |
DBG_PS | DBG_CS;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
b/src/gallium/drivers/radeonsi/si_pipe.h
index b5790d6..2a2455c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -84,6 +84,9 @@ struct si_compute;
   struct si_screen {
 struct r600_common_screen   b;
 unsignedgs_table_depth;
+
+   /* Whether shaders are monolithic (1-part) or separate (3-part).
*/
+   booluse_monolithic_shaders;
   };

   struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index b058019..b74ed1e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -70,6 +70,12 @@ struct si_shader_context

 unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader.
*/
 bool is_gs_copy_shader;
+
+   /* Whether to generate the optimized shader variant compiled as a
whole
+* (without a prolog and epilog)
+*/
+   bool is_monolithic;
+
 int param_streamout_config;
 int param_streamout_write_index;
 int param_streamout_offset[4];
@@ -3657,8 +3663,10 @@ static void create_function(struct
si_shader_context *ctx)
 struct lp_build_tgsi_context *bld_base =
&ctx->radeon_bld.soa.bld_base;
 struct gallivm_state *gallivm = bld_base->base.gallivm;
 struct si_shader *shader = ctx->shader;
-   LLVMTypeRef params[SI_NUM_PARAMS], v2i32, v3i32;
+   LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v2i32,
v3i32;
+   LLVMTypeRef returns[16+32*4];



This is a bit of a magic number, I guess something like max parameters plus
attributes. Can you replace it by the appropriate defines?


There is not a single definition that would express this clearly.

The prolog has to return up to 16 input SGPRs and 4-20 input VGPRs.
Additionally, the prolog returns other data in VGPRs. That's up to
4+16 VGPRs (16 vertex load addresses) for the VS and 20+8 VGPRs (2
vec4 colors) for the PS. The PS epilog returns one SGPR (but in s10 or
so, so we need to allocate 11) and 9*4 VGPRs at most. This all can
change in the future, who knows.

16+32*4 is much more than we'll ever need, but at least it shouldn't
overflow. Assertions also check that we don't overflow.


Hmm, I see. I guess I can live with it, as well as with the casts in 
patch 14.


Nicolai


Marek
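
The register counts in the explanation above can be added up to see how much headroom the returns[16+32*4] array leaves. The sketch below only restates the numbers given in the thread; the enum and function names are hypothetical and do not exist in si_shader.c.

```c
/* Counts taken from the explanation in this thread. */
enum {
   DEMO_MAX_INPUT_SGPRS  = 16,
   DEMO_VS_PROLOG_VGPRS  = 4 + 16,     /* inputs + 16 vertex load addresses */
   DEMO_PS_PROLOG_VGPRS  = 20 + 8,     /* inputs + 2 vec4 colors */
   DEMO_PS_EPILOG_SGPRS  = 11,         /* one value, but placed around s10 */
   DEMO_PS_EPILOG_VGPRS  = 9 * 4,
   DEMO_RETURNS_CAPACITY = 16 + 32 * 4 /* size of returns[] in the patch */
};

/* Worst-case number of return slots for each shader part. */
static int demo_vs_prolog_returns(void)
{
   return DEMO_MAX_INPUT_SGPRS + DEMO_VS_PROLOG_VGPRS;
}

static int demo_ps_prolog_returns(void)
{
   return DEMO_MAX_INPUT_SGPRS + DEMO_PS_PROLOG_VGPRS;
}

static int demo_ps_epilog_returns(void)
{
   return DEMO_PS_EPILOG_SGPRS + DEMO_PS_EPILOG_VGPRS;
}
```

The three worst cases come to 36, 44 and 47 slots against a capacity of 144, which matches the "much more than we'll ever need" remark.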



[Mesa-dev] [PATCH] st/mesa: add missing ETC2 entries to format_map

2016-02-16 Thread Rob Clark

From: Rob Clark 

Noticed by Ilia when I was trying to figure out why some app was failing
to use ETC2.

Signed-off-by: Rob Clark 
Reviewed-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_format.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/src/mesa/state_tracker/st_format.c 
b/src/mesa/state_tracker/st_format.c
index 2b92bad..82bf3a1 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -1484,6 +1484,48 @@ static const struct format_mapping format_map[] = {
   { PIPE_FORMAT_ETC1_RGB8, 0 }
},
 
+   /* ETC2 */
+   {
+  { GL_COMPRESSED_RGB8_ETC2, 0 },
+  { PIPE_FORMAT_ETC2_RGB8, 0 }
+   },
+   {
+  { GL_COMPRESSED_SRGB8_ETC2, 0 },
+  { PIPE_FORMAT_ETC2_SRGB8, 0 }
+   },
+   {
+  { GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
+  { PIPE_FORMAT_ETC2_RGB8A1, 0 }
+   },
+   {
+  { GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2, 0 },
+  { PIPE_FORMAT_ETC2_SRGB8A1, 0 }
+   },
+   {
+  { GL_COMPRESSED_RGBA8_ETC2_EAC, 0 },
+  { PIPE_FORMAT_ETC2_RGBA8, 0 }
+   },
+   {
+  { GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC, 0 },
+  { PIPE_FORMAT_ETC2_SRGBA8, 0 }
+   },
+   {
+  { GL_COMPRESSED_R11_EAC, 0 },
+  { PIPE_FORMAT_ETC2_R11_UNORM, 0 }
+   },
+   {
+  { GL_COMPRESSED_SIGNED_R11_EAC, 0 },
+  { PIPE_FORMAT_ETC2_R11_SNORM, 0 }
+   },
+   {
+  { GL_COMPRESSED_RG11_EAC, 0 },
+  { PIPE_FORMAT_ETC2_RG11_UNORM, 0 }
+   },
+   {
+  { GL_COMPRESSED_SIGNED_RG11_EAC, 0 },
+  { PIPE_FORMAT_ETC2_RG11_SNORM, 0 }
+   },
+
/* BPTC */
{
   { GL_COMPRESSED_RGBA_BPTC_UNORM, 0 },
-- 
2.5.0


Re: [Mesa-dev] [PATCH 24/25] gallium/radeon: remove unused radeon_shader_binary_free_* functions

2016-02-16 Thread Nicolai Hähnle


Patches 22-24 are also

Reviewed-by: Nicolai Hähnle 

Very nice series overall!

On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeon/radeon_elf_util.c | 19 ---
  src/gallium/drivers/radeon/radeon_elf_util.h | 14 --
  2 files changed, 33 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_elf_util.c 
b/src/gallium/drivers/radeon/radeon_elf_util.c
index 70a2c4d..8aaa85d 100644
--- a/src/gallium/drivers/radeon/radeon_elf_util.c
+++ b/src/gallium/drivers/radeon/radeon_elf_util.c
@@ -195,22 +195,3 @@ const unsigned char *radeon_shader_binary_config_start(
}
return binary->config;
  }
-
-void radeon_shader_binary_free_relocs(struct radeon_shader_reloc *relocs,
-   unsigned reloc_count)
-{
-   FREE(relocs);
-}
-
-void radeon_shader_binary_free_members(struct radeon_shader_binary *binary,
-   unsigned free_relocs)
-{
-   FREE(binary->code);
-   FREE(binary->config);
-   FREE(binary->rodata);
-
-   if (free_relocs) {
-   radeon_shader_binary_free_relocs(binary->relocs,
-   binary->reloc_count);
-   }
-}
diff --git a/src/gallium/drivers/radeon/radeon_elf_util.h 
b/src/gallium/drivers/radeon/radeon_elf_util.h
index ea4ab2f..c2af9e0 100644
--- a/src/gallium/drivers/radeon/radeon_elf_util.h
+++ b/src/gallium/drivers/radeon/radeon_elf_util.h
@@ -47,18 +47,4 @@ const unsigned char *radeon_shader_binary_config_start(
const struct radeon_shader_binary *binary,
uint64_t symbol_offset);

-/**
- * Free all memory allocated for members of \p binary.  This function does
- * not free \p binary.
- *
- * @param free_relocs If false, reolc information will not be freed.
- */
-void radeon_shader_binary_free_members(struct radeon_shader_binary *binary,
-   unsigned free_relocs);
-
-/**
- * Free \p relocs and all member data.
- */
-void radeon_shader_binary_free_relocs(struct radeon_shader_reloc *relocs,
-   unsigned reloc_count);
  #endif /* RADEON_ELF_UTIL_H */



Re: [Mesa-dev] [PATCH 4/4] st/mesa: add OES_sample_variables support

2016-02-16 Thread Ian Romanick

On 02/15/2016 10:31 PM, Ilia Mirkin wrote:
> Basically the same thing as ARB_sample_shading except that it also needs
> gl_SampleMaskIn support as well as not enable per-sample interpolation
> whenever doing per-sample shading. This is done explicitly in another
> extension.
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> I get 16 failures with dEQP tests, these fall into 2 categories:
> 
>  - 1-sample multisample surfaces don't behave the way it likes (it considers
>them non-multisampled even though they're created through gl*Multisample*)
> 
>  - gl_SampleMaskIn is reporting the whole pixel's worth of mask rather than
>just the fragment in question. Looking back, it appears that
>ARB_gpu_shader5 also wants it for only the current fragment.
> 
>  docs/GL3.txt| 2 +-
>  src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
>  src/mesa/state_tracker/st_atom_shader.c | 2 ++
>  src/mesa/state_tracker/st_extensions.c  | 4 
>  src/mesa/state_tracker/st_program.c | 5 -
>  5 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 26847b9..ae439f6 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -248,7 +248,7 @@ GLES3.2, GLSL ES 3.2
>GL_OES_gpu_shader5   not started (based on 
> parts of GL_ARB_gpu_shader5, which is done for some drivers)
>GL_OES_primitive_bounding boxnot started
>GL_OES_sample_shadingnot started (based on 
> parts of GL_ARB_sample_shading, which is done for some drivers)
> -  GL_OES_sample_variables  not started (based on 
> parts of GL_ARB_sample_shading, which is done for some drivers)
> +  GL_OES_sample_variables  DONE (nvc0, r600, 
> radeonsi)
>GL_OES_shader_image_atomic   not started (based on 
> parts of GL_ARB_shader_image_load_store, which is done for some drivers)
>GL_OES_shader_io_blocks  not started (based on 
> parts of GLSL 1.50, which is done)
>GL_OES_shader_multisample_interpolation  not started (based on 
> parts of GL_ARB_gpu_shader5, which is done)
> diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
> b/src/mesa/state_tracker/st_atom_rasterizer.c
> index c20cadf..d42d512 100644
> --- a/src/mesa/state_tracker/st_atom_rasterizer.c
> +++ b/src/mesa/state_tracker/st_atom_rasterizer.c
> @@ -31,6 +31,7 @@
>*/
>   
>  #include "main/macros.h"
> +#include "main/context.h"
>  #include "st_context.h"
>  #include "st_atom.h"
>  #include "st_debug.h"
> @@ -239,6 +240,7 @@ static void update_raster_state( struct st_context *st )
>  
> /* _NEW_MULTISAMPLE | _NEW_BUFFERS */
> raster->force_persample_interp =
> + !_mesa_is_gles(ctx) &&
>   !st->force_persample_in_shader &&
>   ctx->Multisample._Enabled &&
>   ctx->Multisample.SampleShading &&

Is this change necessary?  I would have thought that
ctx->Multisample.SampleShading couldn't get set without using features
from OES_sample_shading.

> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
> b/src/mesa/state_tracker/st_atom_shader.c
> index a88f035..8cfe756 100644
> --- a/src/mesa/state_tracker/st_atom_shader.c
> +++ b/src/mesa/state_tracker/st_atom_shader.c
> @@ -36,6 +36,7 @@
>   */
>  
>  #include "main/imports.h"
> +#include "main/context.h"
>  #include "main/mtypes.h"
>  #include "program/program.h"
>  
> @@ -76,6 +77,7 @@ update_fp( struct st_context *st )
>  * Ignore sample qualifier while computing this flag.
>  */
> key.persample_shading =
> +  !_mesa_is_gles(st->ctx) &&
>st->force_persample_in_shader &&
>!(stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
>  SYSTEM_BIT_SAMPLE_POS)) &&

Is this correct? While not normative, the overview section of the OES
extension does say

This means that where these features are used (gl_SampleID and
gl_SamplePosition), implementations must run the fragment shader
for each sample.

I haven't dug into the body of the spec to find supporting text there.

> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index eff3a2d..49d5a2c 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -861,6 +861,10 @@ void st_init_extensions(struct pipe_screen *screen,
>extensions->OES_copy_image = GL_TRUE;
> }
>  
> +   /* Needs PIPE_CAP_SAMPLE_SHADING + gl_SampleMaskIn.
> +*/
> +   extensions->OES_sample_variables = extensions->ARB_gpu_shader5;
> +

Just based on the comment... shouldn't this be
extensions->ARB_sample_shading && extensions->ARB_gpu_shader5?

Does any Gallium driver support PIPE_CAP_SAMPLE_SHADING and not support
gl_SampleMaskIn?  I wonder if it would be better to extend the meaning
of the cap bit to i

Re: [Mesa-dev] [PATCH 25/25] radeonsi: implement binary shaders & shader cache in memory

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_pipe.c  |   5 +-
  src/gallium/drivers/radeonsi/si_pipe.h  |  16 ++
  src/gallium/drivers/radeonsi/si_shader.h|   4 +-
  src/gallium/drivers/radeonsi/si_state.h |   2 +
  src/gallium/drivers/radeonsi/si_state_shaders.c | 234 +++-
  5 files changed, 254 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 75d4775..a576237 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -563,7 +563,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
}
}
pipe_mutex_destroy(sscreen->shader_parts_mutex);
-
+   si_destroy_shader_cache(sscreen);
r600_destroy_common_screen(&sscreen->b);
  }

@@ -611,7 +611,8 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
sscreen->b.b.resource_create = r600_resource_create_common;

if (!r600_common_screen_init(&sscreen->b, ws) ||
-   !si_init_gs_info(sscreen)) {
+   !si_init_gs_info(sscreen) ||
+   !si_init_shader_cache(sscreen)) {
FREE(sscreen);
return NULL;
}
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 1ac7bc4..ef860a5 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -80,6 +80,7 @@
  #define SI_MAX_BORDER_COLORS  4096

  struct si_compute;
+struct hash_table;

  struct si_screen {
struct r600_common_screen   b;
@@ -94,6 +95,21 @@ struct si_screen {
struct si_shader_part   *tcs_epilogs;
struct si_shader_part   *ps_prologs;
struct si_shader_part   *ps_epilogs;
+
+   /* Shader cache in memory.
+*
+* Design & limitations:
+* - The shader cache is per screen (= per process), never saved to
+*   disk, and skips redundant shader compilations from TGSI to 
bytecode.
+* - It can only be used with one-variant-per-shader support, in which
+*   case only the main (typically middle) part of shaders is cached.
+* - Only VS, TCS, TES, PS are cached, out of which only the hw VS
+*   variants of VS and TES are cached, so LS and ES aren't.
+* - GS and CS aren't cached, but it's certainly possible to cache
+*   those as well.
+*/
+   pipe_mutex  shader_cache_mutex;
+   struct hash_table   *shader_cache;
  };

  struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 48e048d..7e46871 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -362,8 +362,10 @@ struct si_shader {
struct r600_resource*bo;
struct r600_resource*scratch_bo;
union si_shader_key key;
-   struct radeon_shader_binary binary;
boolis_binary_shared;
+
+   /* The following data is all that's needed for binary shaders. */
+   struct radeon_shader_binary binary;
struct si_shader_config config;
struct si_shader_info   info;
  };
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index f64c4d4..40792cb 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -280,6 +280,8 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
  /* si_state_shader.c */
  bool si_update_shaders(struct si_context *sctx);
  void si_init_shader_functions(struct si_context *sctx);
+bool si_init_shader_cache(struct si_screen *sscreen);
+void si_destroy_shader_cache(struct si_screen *sscreen);

  /* si_state_draw.c */
  void si_emit_cache_flush(struct si_context *sctx, struct r600_atom *atom);
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index c62cbb7..bc3e5be 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -32,10 +32,217 @@

  #include "tgsi/tgsi_parse.h"
  #include "tgsi/tgsi_ureg.h"
+#include "util/hash_table.h"
+#include "util/u_hash.h"
  #include "util/u_memory.h"
  #include "util/u_prim.h"
  #include "util/u_simple_shaders.h"

+/* SHADER_CACHE */
+
+/**
+ * Return the TGSI binary in a buffer. The first 4 bytes contain its size as
+ * integer.
+ */
+static void *si_get_tgsi_binary(struct si_shader_selector *sel)
+{
+   unsigned tgsi_size = tgsi_num_tokens(sel->tokens) *
+sizeof(struct tgsi_token);
+   unsigned size = 4 + tgsi_size + sizeof(sel->so);
+   char *result = (char*)MALLOC(size);
+
+   if (!result)
+
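
The quoted diff is cut off inside si_get_tgsi_binary(), so the following is only a sketch of the general technique its comment describes: a buffer whose first 4 bytes hold a size, usable as a cache key. Whether the real size field counts the 4-byte prefix is not visible in the excerpt; this sketch assumes it stores the total size. All function names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build a key laid out as [uint32 total size][token bytes][state bytes].
 * Returns NULL on allocation failure; the caller frees the result. */
static void *demo_make_cache_key(const void *tokens, uint32_t tokens_size,
                                 const void *state, uint32_t state_size)
{
   uint32_t size = 4 + tokens_size + state_size;
   char *key = malloc(size);

   if (!key)
      return NULL;

   memcpy(key, &size, 4);
   memcpy(key + 4, tokens, tokens_size);
   memcpy(key + 4 + tokens_size, state, state_size);
   return key;
}

/* Read the size prefix back out of a key. */
static uint32_t demo_cache_key_size(const void *key)
{
   uint32_t size;

   memcpy(&size, key, 4);
   return size;
}

/* Two keys match when their sizes match and all bytes are identical. */
static bool demo_cache_keys_equal(const void *a, const void *b)
{
   uint32_t size = demo_cache_key_size(a);

   return size == demo_cache_key_size(b) && memcmp(a, b, size) == 0;
}
```

Because the size travels with the data, a hash table can hash and compare keys without knowing anything about the token format.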

[Mesa-dev] [PATCH] st/mesa: use cso_set_viewport_dims() in try_pbo_upload_common()

2016-02-16 Thread Brian Paul

Note that this results in a different transformation for the viewport's
Z axis (depth range), but that doesn't matter for this case.
---
 src/mesa/state_tracker/st_cb_texture.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index a06cc72..d09c360 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -1474,18 +1474,7 @@ try_pbo_upload_common(struct gl_context *ctx,
   pipe_surface_reference(&fb.cbufs[0], NULL);
}
 
-   /* Viewport state */
-   {
-  struct pipe_viewport_state vp;
-  vp.scale[0] = 0.5f * surface->width;
-  vp.scale[1] = 0.5f * surface->height;
-  vp.scale[2] = 1.0f;
-  vp.translate[0] = 0.5f * surface->width;
-  vp.translate[1] = 0.5f * surface->height;
-  vp.translate[2] = 0.0f;
-
-  cso_set_viewport(cso, &vp);
-   }
+   cso_set_viewport_dims(cso, surface->width, surface->height, FALSE);
 
/* Blend state */
cso_set_blend(cso, &st->pbo_upload.blend);
-- 
1.9.1


Re: [Mesa-dev] [PATCH 14/25] radeonsi: add TCS epilog

2016-02-16 Thread Marek Olšák

On Tue, Feb 16, 2016 at 5:14 PM, Nicolai Hähnle  wrote:
> On 15.02.2016 18:59, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>   src/gallium/drivers/radeonsi/si_pipe.c   |   1 +
>>   src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
>>   src/gallium/drivers/radeonsi/si_shader.c | 163
>> ---
>>   src/gallium/drivers/radeonsi/si_shader.h |   3 +
>>   4 files changed, 155 insertions(+), 13 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index 2b5ce3a..645d418 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -540,6 +540,7 @@ static void si_destroy_screen(struct pipe_screen*
>> pscreen)
>> struct si_shader_part *parts[] = {
>> sscreen->vs_prologs,
>> sscreen->vs_epilogs,
>> +   sscreen->tcs_epilogs,
>> };
>> unsigned i;
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
>> b/src/gallium/drivers/radeonsi/si_pipe.h
>> index 8d98779..d9175b9 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.h
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
>> @@ -91,6 +91,7 @@ struct si_screen {
>> pipe_mutex  shader_parts_mutex;
>> struct si_shader_part   *vs_prologs;
>> struct si_shader_part   *vs_epilogs;
>> +   struct si_shader_part   *tcs_epilogs;
>>   };
>>
>>   struct si_blend_color {
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index 0085c43..bc6f8cd 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -109,9 +109,11 @@ struct si_shader_context
>> LLVMTypeRef i1;
>> LLVMTypeRef i8;
>> LLVMTypeRef i32;
>> +   LLVMTypeRef i64;
>> LLVMTypeRef i128;
>> LLVMTypeRef f32;
>> LLVMTypeRef v16i8;
>> +   LLVMTypeRef v2i32;
>> LLVMTypeRef v4i32;
>> LLVMTypeRef v4f32;
>> LLVMTypeRef v8i32;
>> @@ -2078,14 +2080,51 @@ static void si_write_tess_factors(struct
>> lp_build_tgsi_context *bld_base,
>>   static void si_llvm_emit_tcs_epilogue(struct lp_build_tgsi_context
>> *bld_base)
>>   {
>> struct si_shader_context *ctx = si_shader_context(bld_base);
>> -   LLVMValueRef invocation_id;
>> +   LLVMValueRef rel_patch_id, invocation_id, tf_lds_offset;
>>
>> +   rel_patch_id = get_rel_patch_id(ctx);
>> invocation_id = unpack_param(ctx, SI_PARAM_REL_IDS, 8, 5);
>> +   tf_lds_offset = get_tcs_out_current_patch_data_offset(ctx);
>>
>> -   si_write_tess_factors(bld_base,
>> - get_rel_patch_id(ctx),
>> - invocation_id,
>> - get_tcs_out_current_patch_data_offset(ctx));
>> +   if (!ctx->is_monolithic) {
>> +   /* Return epilog parameters from this function. */
>> +   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
>> +   LLVMValueRef ret = ctx->return_value;
>> +   LLVMValueRef rw_buffers, rw0, rw1, tf_soffset;
>> +   unsigned vgpr;
>> +
>> +   /* RW_BUFFERS pointer */
>> +   rw_buffers = LLVMGetParam(ctx->radeon_bld.main_fn,
>> + SI_PARAM_RW_BUFFERS);
>> +   rw_buffers = LLVMBuildPtrToInt(builder, rw_buffers,
>> ctx->i64, "");
>> +   rw_buffers = LLVMBuildBitCast(builder, rw_buffers,
>> ctx->v2i32, "");
>> +   rw0 = LLVMBuildExtractElement(builder, rw_buffers,
>> + bld_base->uint_bld.zero,
>> "");
>> +   rw1 = LLVMBuildExtractElement(builder, rw_buffers,
>> + bld_base->uint_bld.one, "");
>> +   ret = LLVMBuildInsertValue(builder, ret, rw0, 0, "");
>> +   ret = LLVMBuildInsertValue(builder, ret, rw1, 1, "");
>
>
> Ugh, that's a bit ugly even if it ends up being a no-op in the final binary.
> Doesn't LLVM at least support vector return values or maybe even i64?

Yes, it's ugly.

LLVM only supports i32 and f32 return values.

Marek
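
To illustrate the workaround discussed above: since only 32-bit return values are available, the 64-bit RW_BUFFERS pointer is returned as two 32-bit words. A minimal sketch of the same idea in plain C follows; the helper names are made up for illustration, and treating element 0 as the low half assumes a little-endian layout.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit value into two 32-bit words so it can travel through
 * an interface that only carries 32-bit values. 'lo' corresponds to
 * element 0 of the v2i32 on a little-endian target (assumption). */
static void
split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
   assert(lo && hi);
   *lo = (uint32_t)(value & 0xffffffffu);
   *hi = (uint32_t)(value >> 32);
}

/* Reassemble the original 64-bit value on the receiving side. */
static uint64_t
join_u64(uint32_t lo, uint32_t hi)
{
   return ((uint64_t)hi << 32) | (uint64_t)lo;
}
```

The round trip is lossless, which is why the extra extract/insert steps can end up as a no-op in the final binary when both halves already sit in adjacent registers.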

Re: [Mesa-dev] [PATCH 21/25] radeonsi: use smaller types for some si_shader members

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

in order to decrease the shader size for a shader cache.
---
  src/gallium/drivers/radeonsi/si_shader.c | 3 +++
  src/gallium/drivers/radeonsi/si_shader.h | 6 +++---
  2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2789788..3758009 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1889,6 +1889,7 @@ handle_semantic:
case TGSI_SEMANTIC_COLOR:
case TGSI_SEMANTIC_BCOLOR:
target = V_008DFC_SQ_EXP_PARAM + param_count;
+   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[i] = param_count;
param_count++;
break;
@@ -1903,6 +1904,7 @@ handle_semantic:
case TGSI_SEMANTIC_TEXCOORD:
case TGSI_SEMANTIC_GENERIC:
target = V_008DFC_SQ_EXP_PARAM + param_count;
+   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[i] = param_count;
param_count++;
break;
@@ -5268,6 +5270,7 @@ static bool si_get_vs_epilog(struct si_screen *sscreen,
unsigned offset = shader->nr_param_exports++;

epilog_key.vs_epilog.prim_id_param_offset = offset;
+   assert(index < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[index] = offset;
}

diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index ee81621..a77e54a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -359,10 +359,10 @@ struct si_shader {
ubyte   num_input_vgprs;
charface_vgpr_index;

-   unsignedvs_output_param_offset[PIPE_MAX_SHADER_OUTPUTS];
+   ubyte   vs_output_param_offset[40];


Magic number - please replace with an appropriate #define or at least 
explain. Apart from that, patches 17-21:


Reviewed-by: Nicolai Hähnle 


booluses_instanceid;
-   unsignednr_pos_exports;
-   unsignednr_param_exports;
+   ubyte   nr_pos_exports;
+   ubyte   nr_param_exports;
  };

  struct si_shader_part {



Re: [Mesa-dev] [PATCH] mesa: Don't call driver when there is no compute work

2016-02-16 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

(somehow I was sure this was done already... but apparently not.)

On Tue, Feb 16, 2016 at 11:24 AM, Jordan Justen
 wrote:
> The ARB_compute_shader spec says:
>
>   "If the work group count in any dimension is zero, no work groups
>are dispatched."
>
> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/main/compute.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/main/compute.c b/src/mesa/main/compute.c
> index 53e7a50..b71430f 100644
> --- a/src/mesa/main/compute.c
> +++ b/src/mesa/main/compute.c
> @@ -41,6 +41,9 @@ _mesa_DispatchCompute(GLuint num_groups_x,
> if (!_mesa_validate_DispatchCompute(ctx, num_groups))
>return;
>
> +   if (num_groups_x == 0u || num_groups_y == 0u || num_groups_z == 0u)
> +   return;
> +
> ctx->Driver.DispatchCompute(ctx, num_groups);
>  }
>
> --
> 2.7.0
>

Re: [Mesa-dev] [PATCH 11/25] radeonsi: first bits for non-monolithic shaders

2016-02-16 Thread Marek Olšák

On Tue, Feb 16, 2016 at 5:01 PM, Nicolai Hähnle  wrote:
> On 15.02.2016 18:59, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>   src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
>>   src/gallium/drivers/radeonsi/si_pipe.h   |  3 ++
>>   src/gallium/drivers/radeonsi/si_shader.c | 53
>> 
>>   src/gallium/drivers/radeonsi/si_shader.h |  2 +-
>>   4 files changed, 45 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index fa60732..448fe88 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -600,6 +600,7 @@ struct pipe_screen *radeonsi_screen_create(struct
>> radeon_winsys *ws)
>>
>> sscreen->b.has_cp_dma = true;
>> sscreen->b.has_streamout = true;
>> +   sscreen->use_monolithic_shaders = true;
>>
>> if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
>> sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS |
>> DBG_PS | DBG_CS;
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h
>> b/src/gallium/drivers/radeonsi/si_pipe.h
>> index b5790d6..2a2455c 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.h
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
>> @@ -84,6 +84,9 @@ struct si_compute;
>>   struct si_screen {
>> struct r600_common_screen   b;
>> unsignedgs_table_depth;
>> +
>> +   /* Whether shaders are monolithic (1-part) or separate (3-part).
>> */
>> +   booluse_monolithic_shaders;
>>   };
>>
>>   struct si_blend_color {
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index b058019..b74ed1e 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -70,6 +70,12 @@ struct si_shader_context
>>
>> unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader.
>> */
>> bool is_gs_copy_shader;
>> +
>> +   /* Whether to generate the optimized shader variant compiled as a
>> whole
>> +* (without a prolog and epilog)
>> +*/
>> +   bool is_monolithic;
>> +
>> int param_streamout_config;
>> int param_streamout_write_index;
>> int param_streamout_offset[4];
>> @@ -3657,8 +3663,10 @@ static void create_function(struct
>> si_shader_context *ctx)
>> struct lp_build_tgsi_context *bld_base =
>> &ctx->radeon_bld.soa.bld_base;
>> struct gallivm_state *gallivm = bld_base->base.gallivm;
>> struct si_shader *shader = ctx->shader;
>> -   LLVMTypeRef params[SI_NUM_PARAMS], v2i32, v3i32;
>> +   LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v2i32,
>> v3i32;
>> +   LLVMTypeRef returns[16+32*4];
>
>
> This is a bit of a magic number, I guess something like max parameters plus
> attributes. Can you replace it by the appropriate defines?

There is not a single definition that would express this clearly.

The prolog has to return up to 16 input SGPRs and 4-20 input VGPRs.
Additionally, the prolog returns other data in VGPRs. That's up to
4+16 VGPRs (16 vertex load addresses) for the VS and 20+8 VGPRs (2
vec4 colors) for the PS. The PS epilog returns one SGPR (but in s10 or
so, so we need to allocate 11) and 9*4 VGPRs at most. This all can
change in the future, who knows.

16+32*4 is much more than we'll ever need, but it shouldn't overflow
at least. Assertions also check if we don't overflow.

Marek
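
The budget described above can be written down as a small arithmetic check. This is only one reading of the numbers in the message (4+16 and 20+8 are taken as input VGPRs plus extra VGPRs), and all constant names are hypothetical.

```c
#include <assert.h>

/* Register counts taken from the explanation above; the names are
 * illustrative and do not exist in the driver. */
enum {
   MAX_INPUT_SGPRS  = 16,
   VS_PROLOG_VGPRS  = 4 + 16,   /* inputs + 16 vertex load addresses */
   PS_PROLOG_VGPRS  = 20 + 8,   /* inputs + 2 vec4 colors */
   PS_EPILOG_SGPRS  = 11,       /* one SGPR, but it lives around s10 */
   PS_EPILOG_VGPRS  = 9 * 4,
   RETURNS_CAPACITY = 16 + 32 * 4
};

static int
max_int(int a, int b)
{
   return a > b ? a : b;
}

/* Largest number of return slots any of the three parts would need
 * under this reading of the numbers. */
static int
worst_case_returns(void)
{
   int vs_prolog = MAX_INPUT_SGPRS + VS_PROLOG_VGPRS;
   int ps_prolog = MAX_INPUT_SGPRS + PS_PROLOG_VGPRS;
   int ps_epilog = PS_EPILOG_SGPRS + PS_EPILOG_VGPRS;

   return max_int(vs_prolog, max_int(ps_prolog, ps_epilog));
}
```

Under these assumptions the worst case stays well below the array size, which matches the point that the bound is generous rather than exact.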

Re: [Mesa-dev] [PATCH 13/25] radeonsi: add VS epilog

2016-02-16 Thread Marek Olšák

On Tue, Feb 16, 2016 at 5:12 PM, Nicolai Hähnle  wrote:
> On 15.02.2016 18:59, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> It only exports the primitive ID.
>> Also used by TES when it's compiled as VS.
>>
>> The VS input location of the primitive ID input is v2.
>
>
> So the reason for having two unused outputs/return values of the main VS is
> so that primitive ID can get passed through without any moves? Sounds good,
> but may be worth documenting e.g. where VS_EPILOG_PRIMID_LOC is defined.

Yes, I'll add the comment.

Marek

Re: [Mesa-dev] [PATCH] mesa: add GL_OES_texture_border_clamp support

2016-02-16 Thread Ilia Mirkin

On Tue, Feb 16, 2016 at 11:24 AM, Ian Romanick  wrote:
> On 02/15/2016 04:14 PM, Ilia Mirkin wrote:
>> Only minor differences to the existing ARB_texture_border_clamp support.
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> I get 53 failures (and 548 passes) in the dEQP tests, they appear to expect
>> all-red for depth texturing while gallium apparently returns gray. Haven't
>> figured out if it's the fault of the tests or the implementation.
>
> This is a known change from compatibility profile desktop OpenGL to
> GLES3 and core profile.  In compatibility, the default color format for
> depth textures is GL_INTENSITY.  Since GL_INTENSITY is removed or
> deprecated everywhere else, the color format for depth textures is GL_RED.
>
> See commit 9db2098d.

Wow, thanks for that! Should hopefully be an easy fix then.

>
>> (I also had to claim it was the EXT version of the ext, and hack up dEQP to
>> pull the *OES functions instead of the *EXT ones.)
>
> I looked at the two specs.  Doing
>
>diff -wud gles/extensions/EXT/EXT_texture_border_clamp.txt <(sed
> 's/OES/EXT/g' < gles/extensions/OES/OES_texture_border_clamp.txt)
>
> showed basically no differences.  Maybe we should just expose both?  I
> could probably also be convinced that we should expose the NV extension.
>  The textual differences were quite a bit larger, but that appears to be
> because the EXT and OES extensions add interactions with GLES 3.0.

Happy to do whatever. I was under the impression that only OES exts
should be exposed, but like you said, it's identical to the EXT one.
Let me know.

>
> I haven't checked the piglit list yet... are there any piglits for the OES
> version (or just dEQP)?

Just dEQP. I wasn't planning on writing piglit tests.

>
>>  docs/GL3.txt|  2 +-
>>  src/mapi/glapi/gen/es_EXT.xml   | 58 
>> -
>>  src/mesa/main/extensions_table.h|  1 +
>>  src/mesa/main/samplerobj.c  |  6 ++--
>>  src/mesa/main/tests/dispatch_sanity.cpp | 10 ++
>>  src/mesa/main/texparam.c| 11 ---
>>  6 files changed, 80 insertions(+), 8 deletions(-)
>>
>> diff --git a/docs/GL3.txt b/docs/GL3.txt
>> index ea7ceef..0957247 100644
>> --- a/docs/GL3.txt
>> +++ b/docs/GL3.txt
>> @@ -253,7 +253,7 @@ GLES3.2, GLSL ES 3.2
>>GL_OES_shader_io_blocks  not started (based 
>> on parts of GLSL 1.50, which is done)
>>GL_OES_shader_multisample_interpolation  not started (based 
>> on parts of GL_ARB_gpu_shader5, which is done)
>>GL_OES_tessellation_shader   not started (based 
>> on GL_ARB_tessellation_shader, which is done for some drivers)
>> -  GL_OES_texture_border_clamp  not started (based 
>> on GL_ARB_texture_border_clamp, which is done)
>> +  GL_OES_texture_border_clamp  DONE (all drivers)
>>GL_OES_texture_buffernot started (based 
>> on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and 
>> GL_ARB_texture_buffer_object_rgb32 that are all done)
>>GL_OES_texture_cube_map_arraynot started (based 
>> on GL_ARB_texture_cube_map_array, which is done for all drivers)
>>GL_OES_texture_stencil8  not started (based 
>> on GL_ARB_texture_stencil8, which is done for some drivers)
>> diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
>> index 86df980..fb0ef05 100644
>> --- a/src/mapi/glapi/gen/es_EXT.xml
>> +++ b/src/mapi/glapi/gen/es_EXT.xml
>> @@ -982,5 +982,61 @@
>>  
>>  
>>  
>> -  
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +
>> +> alias="GetTexParameterIiv">
>> +
>> +
>> +
>> +
>> +
>> +> alias="GetTexParameterIuiv">
>> +
>> +
>> +
>> +
>> +
>> +> alias="SamplerParameterIiv">
>> +  
>> +  
>> +  
>> +
>> +
>> +> alias="SamplerParameterIuiv">
>> +  
>> +  
>> +  
>> +
>> +
>> +> alias="GetSamplerParameterIiv">
>> +  
>> +  
>> +  
>> +
>> +
>> +> alias="GetSamplerParameterIuiv">
>> +  
>> +  
>> +  
>> +
>> +
>> +
>> +
>>  
>> diff --git a/src/mesa/main/extensions_table.h 
>> b/src/mesa/main/extensions_table.h
>> index d1e3a99..b07d635 100644
>> --- a/src/mesa/main/extensions_table.h
>> +++ b/src/mesa/main/extensions_table.h
>> @@ -333,6 +333,7 @@ EXT(OES_stencil8, dummy_true
>>  EXT(OES_stencil_wrap, dummy_true
>>  ,  x ,  x , ES1,  x , 2002)
>>  EXT(OES_surfaceless_context , dummy_true
>>  ,  x ,  x , ES1, ES2, 2012)
>>  EXT(OES_texture_3D  , dummy_true

Re: [Mesa-dev] [PATCH v2] egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage

2016-02-16 Thread Daniel Stone

Hi,

On 16 February 2016 at 16:34, Derek Foreman  wrote:
> +try_damage_buffer(struct dri2_egl_surface *dri2_surf,
> +  const EGLint *rects,
> +  EGLint n_rects)
> +{
> +/* The WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION macro and
> + * wl_proxy_get_version() were both introduced in wayland 1.10.
> + * Instead of bumping our wayland dependency we just make this
> + * function conditional on the required 1.10 features, falling
> + * back to old (correct but suboptimal) behaviour for older
> + * wayland.
> + */
> +#ifdef WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION

It still bumps the runtime requirement, i.e. once built against >=1.10
it can only ever be run against >= 1.10. Maybe dlsym is overkill, but
OTOH maybe not ...

Cheers,
Daniel

Re: [Mesa-dev] intel skylake gpu support

2016-02-16 Thread Jarkko Korpi

Well, I have a few years of Linux experience, but this system still feels new to
me when it comes to modifying system files (or anything similar). I'm running
Mint 17.3 with the oibaf PPA for Mesa. I used it with an R9 290 Radeon, but I am
upgrading that card. Meanwhile I noticed that I can actually play Dota 2 on the
Intel GPU, and the same PPA works for Intel too. I haven't noticed any tearing
with Intel, which I sometimes had with the ATI card.

Is there a guide I could go through to check that I have everything installed
and set up that I should?

> Date: Tue, 16 Feb 2016 08:09:45 -0800
> From: b...@bwidawsk.net
> To: robdcl...@gmail.com
> CC: jarkko_ko...@hotmail.com; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] intel skylake gpu support
> 
> On Tue, Feb 16, 2016 at 10:39:23AM -0500, Rob Clark wrote:
> > Try xf86-video-modesetting instead of xf86-video-intel..
> 
> Might I inquire the thought behind this? It's my impression that unless one is
> using glamor, modesetting won't ever outperform xf86-video-intel (which
> defaults to the hardware blitter on gen9+). Certainly modesetting might be a
> logical choice if there are corruption issues.
> 
> It sounds more like the person is missing all the required vaapi goop to me,
> but I'm genuinely curious if you know something I don't :-)
> 
> > 
> > BR,
> > -R
> > 
> > On Tue, Feb 16, 2016 at 2:24 AM, Jarkko Korpi  
> > wrote:
> > > I have 3 questions for you.
> > >
> > > I noticed that OpenGL 4.0 support is missing just one extension,
> > > GL_ARB_gpu_shader_fp64. Is there an estimated schedule for finishing it?
> > >
> > > Then the other question. My CPU is a Skylake 6600K with power saving
> > > enabled, so it drops its clock speed when not doing much; cores can drop
> > > to 800 MHz. Most YouTube videos, if not all, are in VP9 format according
> > > to the video stats. But I get lots of dropped frames even with
> > > low-resolution clips. Is this a connection issue, a driver issue, or the
> > > CPU not clocking up enough?
> > >
> > > Does the Intel driver have hardware encode/decode for VP9?
> > >
> > >
> > >
> > >

[Mesa-dev] [PATCH v2] egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage

2016-02-16 Thread Derek Foreman

Since commit d1314de293e9e4a63c35f094c3893aaaed8580b4 we ignore
damage passed to SwapBuffersWithDamage.

Wayland 1.10 now has functionality that allows us to properly
process those damage rectangles, and a way to query if it's
available.

Now we can use wl_surface.damage_buffer and interpret the incoming
damage as being in buffer co-ordinates.

Reviewed-by: Jason Ekstrand 
Reviewed-by: Pekka Paalanen 
Signed-off-by: Derek Foreman 
---
Changes from v1:
Add comment explaining why the call to wl_proxy_get_version() is hidden
by the seemingly unrelated macro.

 src/egl/drivers/dri2/platform_wayland.c | 39 ++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index c2438f7..341acb7 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -653,6 +653,37 @@ create_wl_buffer(struct dri2_egl_surface *dri2_surf)
   &wl_buffer_listener, dri2_surf);
 }
 
+static EGLBoolean
+try_damage_buffer(struct dri2_egl_surface *dri2_surf,
+  const EGLint *rects,
+  EGLint n_rects)
+{
+/* The WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION macro and
+ * wl_proxy_get_version() were both introduced in wayland 1.10.
+ * Instead of bumping our wayland dependency we just make this
+ * function conditional on the required 1.10 features, falling
+ * back to old (correct but suboptimal) behaviour for older
+ * wayland.
+ */
+#ifdef WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION
+   int i;
+
+   if (wl_proxy_get_version((struct wl_proxy *) dri2_surf->wl_win->surface)
+   < WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION)
+  return EGL_FALSE;
+
+   for (i = 0; i < n_rects; i++) {
+  const int *rect = &rects[i * 4];
+
+  wl_surface_damage_buffer(dri2_surf->wl_win->surface,
+   rect[0],
+   dri2_surf->base.Height - rect[1] - rect[3],
+   rect[2], rect[3]);
+   }
+   return EGL_TRUE;
+#endif
+   return EGL_FALSE;
+}
 /**
  * Called via eglSwapBuffers(), drv->API.SwapBuffers().
  */
@@ -703,10 +734,12 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
dri2_surf->dx = 0;
dri2_surf->dy = 0;
 
-   /* We deliberately ignore the damage region and post maximum damage, due to
+   /* If the compositor doesn't support damage_buffer, we deliberately
+* ignore the damage region and post maximum damage, due to
 * https://bugs.freedesktop.org/78190 */
-   wl_surface_damage(dri2_surf->wl_win->surface,
- 0, 0, INT32_MAX, INT32_MAX);
+   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
+  wl_surface_damage(dri2_surf->wl_win->surface,
+0, 0, INT32_MAX, INT32_MAX);
 
if (dri2_dpy->is_different_gpu) {
   _EGLContext *ctx = _eglGetCurrentContext();
-- 
2.7.0
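
As a side note for reviewers, the coordinate conversion inside the loop can be sketched on its own. EGL damage rectangles are given as x, y, width, height with the origin at the bottom-left of the surface, while buffer damage uses a top-left origin, so only the Y coordinate changes. The struct and function names here are hypothetical and not part of the patch.

```c
#include <assert.h>

/* Illustrative rectangle type; not a Wayland or EGL type. */
struct rect_sketch {
   int x, y, width, height;
};

/* Convert one EGL-style damage rect (x, y, w, h with a bottom-left
 * origin) into top-left-origin buffer coordinates. The rect's top
 * edge sits at y + h from the bottom, i.e. at
 * surface_height - y - h from the top. */
static struct rect_sketch
flip_rect_to_top_left(const int *rect, int surface_height)
{
   struct rect_sketch out;

   assert(rect);
   out.x = rect[0];
   out.y = surface_height - rect[1] - rect[3];
   out.width = rect[2];
   out.height = rect[3];
   return out;
}
```

For a surface 100 pixels high, a rect of {10, 20, 30, 40} ends up at x=10, y=40 with the same 30x40 size.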


Re: [Mesa-dev] intel skylake gpu support

2016-02-16 Thread Rob Clark

On Tue, Feb 16, 2016 at 11:09 AM, Ben Widawsky  wrote:
> On Tue, Feb 16, 2016 at 10:39:23AM -0500, Rob Clark wrote:
>> Try xf86-video-modesetting instead of xf86-video-intel..
>
> Might I inquire the thought behind this? It's my impression that unless one is
> using glamor, modesetting won't ever outperform xf86-video-intel (which
> defaults to the hardware blitter on gen9+). Certainly modesetting might be a
> logical choice if there are corruption issues.

yeah, use of glamor was the point, since afaiu sna doesn't support skl
yet (or at least doesn't support xv on skl)

> It sounds more like the person is missing all the required vaapi goop to me,
> but I'm genuinely curious if you know something I don't :-)

could be.. although I'd expect a skl system should be able to play
youtube videos on the cpu without much problem..

BR,
-R

Re: [Mesa-dev] [PATCH 16/25] radeonsi: add PS prolog

2016-02-16 Thread Nicolai Hähnle

So, patches 12-16 also look good to me except for the comments I've sent 
on 12-14.


I'm a bit worried though that there is a lot of "almost code 
duplication" around the handling of input and output positions etc., and 
maintaining the two different code paths for monolithic and 
non-monolithic is brittle.


Here's an approach that I think could work to clean this up: keep only 
the non-monolithic code for LLVM IR function generation. Then implement 
monolithic mode with a helper that takes a sequence of LLVM IR functions 
and generates a master function that pipes each function's output into 
the input of the next. Then set the functions as always inline and rely 
on LLVM's inliner to stitch everything together.


This ends up with slightly higher overhead for the monolithic code path 
(although the unconditional inlining should be fast), but it would help 
clean the code up tremendously.


Cheers,
Nicolai
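
To make the proposal a bit more concrete, here is a toy model in plain C of the "master function" idea: each part is a function, and a driver loop feeds each part's output into the next. This is only an analogy under simplifying assumptions (a single int stands in for the register state, and the stage names are invented); it does not use LLVM and says nothing about how the inliner would behave.

```c
#include <assert.h>
#include <stddef.h>

/* One shader part, modelled as a function from state to state. */
typedef int (*stage_fn)(int);

/* Run the parts in order, piping each output into the next input.
 * In the real proposal this wrapper would be generated as IR and the
 * parts marked always-inline; here it is an ordinary loop. */
static int
run_pipeline(const stage_fn *stages, size_t count, int input)
{
   int value = input;
   size_t i;

   for (i = 0; i < count; i++) {
      assert(stages[i]);
      value = stages[i](value);
   }
   return value;
}

/* Invented example stages standing in for prolog, main part, epilog. */
static int prolog_stage(int v) { return v + 1; }
static int main_stage(int v)   { return v * 2; }
static int epilog_stage(int v) { return v - 3; }
```

The point of the sketch is that the monolithic variant needs no separate code path for the parts themselves; only the wrapper that chains them differs.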

On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_pipe.c  |   1 +
  src/gallium/drivers/radeonsi/si_pipe.h  |   1 +
  src/gallium/drivers/radeonsi/si_shader.c| 324 +++-
  src/gallium/drivers/radeonsi/si_shader.h|  14 +-
  src/gallium/drivers/radeonsi/si_state_shaders.c |   7 +
  5 files changed, 345 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 02c430d..44f6047 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -541,6 +541,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
sscreen->vs_prologs,
sscreen->vs_epilogs,
sscreen->tcs_epilogs,
+   sscreen->ps_prologs,
sscreen->ps_epilogs
};
unsigned i;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 5d204ec..1ac7bc4 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -92,6 +92,7 @@ struct si_screen {
struct si_shader_part   *vs_prologs;
struct si_shader_part   *vs_epilogs;
struct si_shader_part   *tcs_epilogs;
+   struct si_shader_part   *ps_prologs;
struct si_shader_part   *ps_epilogs;
  };

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 915ac1d..c6d4cb5 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -875,7 +875,8 @@ static int lookup_interp_param_index(unsigned interpolate, 
unsigned location)
  static unsigned select_interp_param(struct si_shader_context *ctx,
unsigned param)
  {
-   if (!ctx->shader->key.ps.prolog.force_persample_interp)
+   if (!ctx->shader->key.ps.prolog.force_persample_interp ||
+   !ctx->is_monolithic)
return param;

/* If the shader doesn't use center/centroid, just return the parameter.
@@ -1019,6 +1020,7 @@ static void declare_input_fs(
unsigned input_index,
const struct tgsi_full_declaration *decl)
  {
+   struct lp_build_context *base = &radeon_bld->soa.bld_base.base;
struct si_shader_context *ctx =
si_shader_context(&radeon_bld->soa.bld_base);
struct si_shader *shader = ctx->shader;
@@ -1026,6 +1028,26 @@ static void declare_input_fs(
LLVMValueRef interp_param = NULL;
int interp_param_idx;

+   /* Get colors from input VGPRs (set by the prolog). */
+   if (!ctx->is_monolithic &&
+   decl->Semantic.Name == TGSI_SEMANTIC_COLOR) {
+   unsigned i = decl->Semantic.Index;
+   unsigned colors_read = shader->selector->info.colors_read;
+   unsigned mask = colors_read >> (i * 4);
+   unsigned offset = SI_PARAM_POS_FIXED_PT + 1 +
+ (i ? util_bitcount(colors_read & 0xf) : 0);
+
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 0)] =
+   mask & 0x1 ? LLVMGetParam(main_fn, offset++) : 
base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 1)] =
+   mask & 0x2 ? LLVMGetParam(main_fn, offset++) : 
base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 2)] =
+   mask & 0x4 ? LLVMGetParam(main_fn, offset++) : 
base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 3)] =
+   mask & 0x8 ? LLVMGetParam(main_fn, offset++) : 
base->undef;
+   return;
+   }
+
interp_param_idx = lookup_interp_param_index(decl->Interp.Interpolate,
 decl->Interp.Location);
if (interp_param_idx == -1)
@@ -3966,6 +3988,16 @@ static void cr

Re: [Mesa-dev] intel skylake gpu support

2016-02-16 Thread Eero Tamminen


Hi,

On 16.02.2016 09:24, Jarkko Korpi wrote:

Then the other question. My CPU is a Skylake 6600K with power saving enabled,
so it drops its clock speed when not doing much; cores can drop to 800 MHz.
Most YouTube videos, if not all, are in VP9 format according to the video
stats. But I get lots of dropped frames even with low-resolution clips. Is
this a connection issue, a driver issue, or the CPU not clocking up enough?

Does the Intel driver have hardware encode/decode for VP9?


Video HW acceleration isn't handled by Mesa, but by VAAPI.

Do you have new enough media drivers?
https://01.org/linuxgraphics/downloads/2015q3-intel-graphics-stack-release-0


- Eero



[Mesa-dev] [PATCH] mesa: Don't call driver when there is no compute work

2016-02-16 Thread Jordan Justen

The ARB_compute_shader spec says:

  "If the work group count in any dimension is zero, no work groups
   are dispatched."

Signed-off-by: Jordan Justen 
---
 src/mesa/main/compute.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/compute.c b/src/mesa/main/compute.c
index 53e7a50..b71430f 100644
--- a/src/mesa/main/compute.c
+++ b/src/mesa/main/compute.c
@@ -41,6 +41,9 @@ _mesa_DispatchCompute(GLuint num_groups_x,
if (!_mesa_validate_DispatchCompute(ctx, num_groups))
   return;
 
+   if (num_groups_x == 0u || num_groups_y == 0u || num_groups_z == 0u)
+   return;
+
ctx->Driver.DispatchCompute(ctx, num_groups);
 }
 
-- 
2.7.0
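
For illustration, the early-out can be modelled outside of Mesa like this. Everything below is a stand-alone sketch: the fake driver hook and the call counter are invented so the behaviour can be observed, and validation is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (fake) driver hook is reached. */
static unsigned driver_calls;

/* Per the quoted spec text, a zero count in any dimension means no
 * work groups are dispatched at all. */
static bool
has_compute_work(const unsigned num_groups[3])
{
   return num_groups[0] != 0u &&
          num_groups[1] != 0u &&
          num_groups[2] != 0u;
}

/* Invented stand-in for the driver's dispatch hook. */
static void
fake_driver_dispatch(const unsigned num_groups[3])
{
   (void)num_groups;
   driver_calls++;
}

/* Skip the driver entirely when there is nothing to run. */
static void
dispatch_compute_sketch(unsigned x, unsigned y, unsigned z)
{
   const unsigned num_groups[3] = { x, y, z };

   if (!has_compute_work(num_groups))
      return;

   fake_driver_dispatch(num_groups);
}
```

Calling dispatch_compute_sketch(1, 0, 1) leaves the counter untouched, while dispatch_compute_sketch(1, 1, 1) reaches the hook once.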


Re: [Mesa-dev] [PATCH] mesa: add GL_OES_texture_border_clamp support

2016-02-16 Thread Ian Romanick

On 02/15/2016 04:14 PM, Ilia Mirkin wrote:
> Only minor differences to the existing ARB_texture_border_clamp support.
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> I get 53 failures (and 548 passes) in the dEQP tests, they appear to expect
> all-red for depth texturing while gallium apparently returns gray. Haven't
> figured out if it's the fault of the tests or the implementation.

This is a known change from compatibility profile desktop OpenGL to
GLES3 and core profile.  In compatibility, the default color format for
depth textures is GL_INTENSITY.  Since GL_INTENSITY is removed or
deprecated everywhere else, the color format for depth textures is GL_RED.

See commit 9db2098d.

> (I also had to claim it was the EXT version of the ext, and hack up dEQP to
> pull the *OES functions instead of the *EXT ones.)

I looked at the two specs.  Doing

   diff -wud gles/extensions/EXT/EXT_texture_border_clamp.txt <(sed
's/OES/EXT/g' < gles/extensions/OES/OES_texture_border_clamp.txt)

showed basically no differences.  Maybe we should just expose both?  I
could probably also be convinced that we should expose the NV extension.
 The textual differences were quite a bit larger, but that appears to be
because the EXT and OES extensions add interactions with GLES 3.0.

I haven't checked the piglit list yet... are there any piglits for the OES
version (or just dEQP)?

>  docs/GL3.txt|  2 +-
>  src/mapi/glapi/gen/es_EXT.xml   | 58 
> -
>  src/mesa/main/extensions_table.h|  1 +
>  src/mesa/main/samplerobj.c  |  6 ++--
>  src/mesa/main/tests/dispatch_sanity.cpp | 10 ++
>  src/mesa/main/texparam.c| 11 ---
>  6 files changed, 80 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index ea7ceef..0957247 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -253,7 +253,7 @@ GLES3.2, GLSL ES 3.2
>GL_OES_shader_io_blocks  not started (based on 
> parts of GLSL 1.50, which is done)
>GL_OES_shader_multisample_interpolation  not started (based on 
> parts of GL_ARB_gpu_shader5, which is done)
>GL_OES_tessellation_shader   not started (based on 
> GL_ARB_tessellation_shader, which is done for some drivers)
> -  GL_OES_texture_border_clamp  not started (based on 
> GL_ARB_texture_border_clamp, which is done)
> +  GL_OES_texture_border_clamp  DONE (all drivers)
>GL_OES_texture_buffernot started (based on 
> GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and 
> GL_ARB_texture_buffer_object_rgb32 that are all done)
>GL_OES_texture_cube_map_arraynot started (based on 
> GL_ARB_texture_cube_map_array, which is done for all drivers)
>GL_OES_texture_stencil8  not started (based on 
> GL_ARB_texture_stencil8, which is done for some drivers)
> diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
> index 86df980..fb0ef05 100644
> --- a/src/mapi/glapi/gen/es_EXT.xml
> +++ b/src/mapi/glapi/gen/es_EXT.xml
> @@ -982,5 +982,61 @@
>  
>  
>  
> -  
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> + alias="GetTexParameterIiv">
> +
> +
> +
> +
> +
> + alias="GetTexParameterIuiv">
> +
> +
> +
> +
> +
> + alias="SamplerParameterIiv">
> +  
> +  
> +  
> +
> +
> + alias="SamplerParameterIuiv">
> +  
> +  
> +  
> +
> +
> + alias="GetSamplerParameterIiv">
> +  
> +  
> +  
> +
> +
> + alias="GetSamplerParameterIuiv">
> +  
> +  
> +  
> +
> +
> +
> +
>  
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index d1e3a99..b07d635 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -333,6 +333,7 @@ EXT(OES_stencil8, dummy_true
>  EXT(OES_stencil_wrap, dummy_true 
> ,  x ,  x , ES1,  x , 2002)
>  EXT(OES_surfaceless_context , dummy_true 
> ,  x ,  x , ES1, ES2, 2012)
>  EXT(OES_texture_3D  , dummy_true 
> ,  x ,  x ,  x , ES2, 2005)
> +EXT(OES_texture_border_clamp, ARB_texture_border_clamp   
> ,  x ,  x ,  x , ES2, 2014)
>  EXT(OES_texture_cube_map, ARB_texture_cube_map   
> ,  x ,  x , ES1,  x , 2007)
>  EXT(OES_texture_env_crossbar, ARB_texture_env_crossbar   
> ,  x ,  x , ES1,  x , 2005)
>  EXT(OES_texture_float   , OES_texture_float  
>

Re: [Mesa-dev] [PATCH 14/25] radeonsi: add TCS epilog

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_pipe.c   |   1 +
  src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
  src/gallium/drivers/radeonsi/si_shader.c | 163 ---
  src/gallium/drivers/radeonsi/si_shader.h |   3 +
  4 files changed, 155 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 2b5ce3a..645d418 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -540,6 +540,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
struct si_shader_part *parts[] = {
sscreen->vs_prologs,
sscreen->vs_epilogs,
+   sscreen->tcs_epilogs,
};
unsigned i;

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 8d98779..d9175b9 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -91,6 +91,7 @@ struct si_screen {
pipe_mutex  shader_parts_mutex;
struct si_shader_part   *vs_prologs;
struct si_shader_part   *vs_epilogs;
+   struct si_shader_part   *tcs_epilogs;
  };

  struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0085c43..bc6f8cd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -109,9 +109,11 @@ struct si_shader_context
LLVMTypeRef i1;
LLVMTypeRef i8;
LLVMTypeRef i32;
+   LLVMTypeRef i64;
LLVMTypeRef i128;
LLVMTypeRef f32;
LLVMTypeRef v16i8;
+   LLVMTypeRef v2i32;
LLVMTypeRef v4i32;
LLVMTypeRef v4f32;
LLVMTypeRef v8i32;
@@ -2078,14 +2080,51 @@ static void si_write_tess_factors(struct 
lp_build_tgsi_context *bld_base,
  static void si_llvm_emit_tcs_epilogue(struct lp_build_tgsi_context *bld_base)
  {
struct si_shader_context *ctx = si_shader_context(bld_base);
-   LLVMValueRef invocation_id;
+   LLVMValueRef rel_patch_id, invocation_id, tf_lds_offset;

+   rel_patch_id = get_rel_patch_id(ctx);
invocation_id = unpack_param(ctx, SI_PARAM_REL_IDS, 8, 5);
+   tf_lds_offset = get_tcs_out_current_patch_data_offset(ctx);

-   si_write_tess_factors(bld_base,
- get_rel_patch_id(ctx),
- invocation_id,
- get_tcs_out_current_patch_data_offset(ctx));
+   if (!ctx->is_monolithic) {
+   /* Return epilog parameters from this function. */
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
+   LLVMValueRef ret = ctx->return_value;
+   LLVMValueRef rw_buffers, rw0, rw1, tf_soffset;
+   unsigned vgpr;
+
+   /* RW_BUFFERS pointer */
+   rw_buffers = LLVMGetParam(ctx->radeon_bld.main_fn,
+ SI_PARAM_RW_BUFFERS);
+   rw_buffers = LLVMBuildPtrToInt(builder, rw_buffers, ctx->i64, 
"");
+   rw_buffers = LLVMBuildBitCast(builder, rw_buffers, ctx->v2i32, 
"");
+   rw0 = LLVMBuildExtractElement(builder, rw_buffers,
+ bld_base->uint_bld.zero, "");
+   rw1 = LLVMBuildExtractElement(builder, rw_buffers,
+ bld_base->uint_bld.one, "");
+   ret = LLVMBuildInsertValue(builder, ret, rw0, 0, "");
+   ret = LLVMBuildInsertValue(builder, ret, rw1, 1, "");


Ugh, that's a bit ugly even if it ends up being a no-op in the final 
binary. Doesn't LLVM at least support vector return values or maybe even 
i64?


Nicolai
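
For readers following the question above: the ptrtoint / bitcast /
extractelement sequence in the patch amounts to splitting a 64-bit address
into two 32-bit values, one per return slot. Below is a minimal C sketch of
that arithmetic, assuming a little-endian target (so vector element 0 holds
the low bits). The names addr_halves, split_address and join_address are
illustrative only; they are not Mesa or LLVM functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two 32-bit pieces of a 64-bit address. */
struct addr_halves {
	uint32_t lo; /* would go into return slot 0 (rw0 in the patch) */
	uint32_t hi; /* would go into return slot 1 (rw1 in the patch) */
};

/* Split a 64-bit address into its low and high 32-bit halves. */
static struct addr_halves split_address(uint64_t addr)
{
	struct addr_halves h;

	h.lo = (uint32_t)(addr & 0xffffffffu);
	h.hi = (uint32_t)(addr >> 32);
	return h;
}

/* Rebuild the 64-bit address, as a consumer of the two slots would. */
static uint64_t join_address(struct addr_halves h)
{
	uint64_t addr = ((uint64_t)h.hi << 32) | h.lo;

	/* Splitting the result again must give back the same halves. */
	assert(split_address(addr).lo == h.lo);
	assert(split_address(addr).hi == h.hi);
	return addr;
}
```

No data is changed by the split, which matches the remark that it should end
up as a no-op in the final binary.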


+   /* Tess factor buffer soffset is after user SGPRs. */
+   tf_soffset = LLVMGetParam(ctx->radeon_bld.main_fn,
+ SI_PARAM_TESS_FACTOR_OFFSET);
+   ret = LLVMBuildInsertValue(builder, ret, tf_soffset,
+  SI_TCS_NUM_USER_SGPR, "");
+
+   /* VGPRs */
+   rel_patch_id = bitcast(bld_base, TGSI_TYPE_FLOAT, rel_patch_id);
+   invocation_id = bitcast(bld_base, TGSI_TYPE_FLOAT, 
invocation_id);
+   tf_lds_offset = bitcast(bld_base, TGSI_TYPE_FLOAT, 
tf_lds_offset);
+
+   vgpr = SI_TCS_NUM_USER_SGPR + 1;
+   ret = LLVMBuildInsertValue(builder, ret, rel_patch_id, vgpr++, 
"");
+   ret = LLVMBuildInsertValue(builder, ret, invocation_id, vgpr++, 
"");
+   ret = LLVMBuildInsertValue(builder, ret, tf_lds_offset, vgpr++, 
"");
+   ctx->return_value = ret;
+   return;
+   }
+
+   si_write_tess_factors(bld_base, rel_patch_id, invocation_id, 
tf_lds_offset);
  }

Re: [Mesa-dev] [PATCH 13/25] radeonsi: add VS epilog

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

It only exports the primitive ID.
Also used by TES when it's compiled as VS.

The VS input location of the primitive ID input is v2.


So the reason for having two unused outputs/return values of the main VS 
is so that primitive ID can get passed through without any moves? Sounds 
good, but may be worth documenting e.g. where VS_EPILOG_PRIMID_LOC is 
defined.


Nicolai
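
To make the slot counting concrete: if the reading above is right, the loop
in create_function() reserves VS_EPILOG_PRIMID_LOC + 1 float return values
so that the primitive ID lands in the last one. A small C sketch of that
counting follows; vs_num_return_slots is a hypothetical helper written for
illustration, and only the define and the loop condition come from the
patch.

```c
#include <assert.h>

#define VS_EPILOG_PRIMID_LOC 2 /* value taken from the patch */

/*
 * Hypothetical helper: count the float return slots reserved for the main
 * VS part. Slots 0 and 1 are unused padding; slot 2 carries the primitive
 * ID. Nothing is reserved when the VS is compiled as ES or LS.
 */
static unsigned vs_num_return_slots(int as_es, int as_ls)
{
	unsigned num_returns = 0;
	unsigned i;

	if (!as_es && !as_ls)
		for (i = 0; i <= VS_EPILOG_PRIMID_LOC; i++)
			num_returns++;

	/* Either nothing is returned or the primitive ID slot is the last. */
	assert(num_returns == 0 || num_returns == VS_EPILOG_PRIMID_LOC + 1);
	return num_returns;
}
```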


---
  src/gallium/drivers/radeonsi/si_pipe.c   |   2 +-
  src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
  src/gallium/drivers/radeonsi/si_shader.c | 172 +--
  src/gallium/drivers/radeonsi/si_shader.h |   4 +
  4 files changed, 168 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 7ce9570..2b5ce3a 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -539,7 +539,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
struct si_screen *sscreen = (struct si_screen *)pscreen;
struct si_shader_part *parts[] = {
sscreen->vs_prologs,
-   /* this will be filled with other shader parts */
+   sscreen->vs_epilogs,
};
unsigned i;

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index f4bafc2..8d98779 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -90,6 +90,7 @@ struct si_screen {

pipe_mutex  shader_parts_mutex;
struct si_shader_part   *vs_prologs;
+   struct si_shader_part   *vs_epilogs;
  };

  struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index fbb8394..0085c43 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -129,6 +129,7 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
   LLVMTargetMachineRef tm,
   struct tgsi_shader_info *info);

+#define VS_EPILOG_PRIMID_LOC 2

  #define PERSPECTIVE_BASE 0
  #define LINEAR_BASE 9
@@ -2230,16 +2231,26 @@ static void si_llvm_emit_vs_epilogue(struct 
lp_build_tgsi_context *bld_base)
  "");
}

-   /* Export PrimitiveID when PS needs it. */
-   if (si_vs_exports_prim_id(ctx->shader)) {
-   outputs[i].name = TGSI_SEMANTIC_PRIMID;
-   outputs[i].sid = 0;
-   outputs[i].values[0] = bitcast(bld_base, TGSI_TYPE_FLOAT,
-  get_primitive_id(bld_base, 0));
-   outputs[i].values[1] = bld_base->base.undef;
-   outputs[i].values[2] = bld_base->base.undef;
-   outputs[i].values[3] = bld_base->base.undef;
-   i++;
+   if (ctx->is_monolithic) {
+   /* Export PrimitiveID when PS needs it. */
+   if (si_vs_exports_prim_id(ctx->shader)) {
+   outputs[i].name = TGSI_SEMANTIC_PRIMID;
+   outputs[i].sid = 0;
+   outputs[i].values[0] = bitcast(bld_base, 
TGSI_TYPE_FLOAT,
+  
get_primitive_id(bld_base, 0));
+   outputs[i].values[1] = bld_base->base.undef;
+   outputs[i].values[2] = bld_base->base.undef;
+   outputs[i].values[3] = bld_base->base.undef;
+   i++;
+   }
+   } else {
+   /* Return the primitive ID from the LLVM function. */
+   ctx->return_value =
+   LLVMBuildInsertValue(gallivm->builder,
+ctx->return_value,
+bitcast(bld_base, TGSI_TYPE_FLOAT,
+get_primitive_id(bld_base, 
0)),
+VS_EPILOG_PRIMID_LOC, "");
}

si_llvm_export_vs(bld_base, outputs, i);
@@ -3724,6 +3735,11 @@ static void create_function(struct si_shader_context 
*ctx)

for (i = 0; i < shader->selector->info.num_inputs; i++)
params[num_params++] = ctx->i32;
+
+   /* PrimitiveID output. */
+   if (!shader->key.vs.as_es && !shader->key.vs.as_ls)
+   for (i = 0; i <= VS_EPILOG_PRIMID_LOC; i++)
+   returns[num_returns++] = ctx->f32;
}
break;

@@ -3758,6 +3774,11 @@ static void create_function(struct si_shader_context 
*ctx)
params[ctx->param_tes_v = num_params++] = ctx->f32;
params[ctx->param_tes_rel_patch_id = num_params++] = ctx->i32;

Re: [Mesa-dev] [PATCH 09/25] radeonsi: add code for combining and uploading shaders from 3 shader parts

2016-02-16 Thread Nicolai Hähnle


On 16.02.2016 11:10, Marek Olšák wrote:

On Tue, Feb 16, 2016 at 4:53 PM, Nicolai Hähnle  wrote:

On 15.02.2016 18:59, Marek Olšák wrote:


From: Marek Olšák 

---
   src/gallium/drivers/radeonsi/si_shader.c | 35

   src/gallium/drivers/radeonsi/si_shader.h |  9 
   2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index dbb9217..a6a0984 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4036,26 +4036,45 @@ void si_shader_apply_scratch_relocs(struct
si_context *sctx,

   int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader
*shader)
   {
-   const struct radeon_shader_binary *binary = &shader->binary;
-   unsigned code_size = binary->code_size + binary->rodata_size;
+   const struct radeon_shader_binary *prolog =
+   shader->prolog ? &shader->prolog->binary : NULL;
+   const struct radeon_shader_binary *epilog =
+   shader->epilog ? &shader->epilog->binary : NULL;
+   const struct radeon_shader_binary *mainb = &shader->binary;
+   unsigned bo_size =
+   (prolog ? prolog->code_size : 0) +
+   mainb->code_size +
+   (epilog ? epilog->code_size : mainb->rodata_size);
 unsigned char *ptr;

+   assert(!prolog || !prolog->rodata_size);
+   assert((!prolog && !epilog) || !mainb->rodata_size);
+   assert(!epilog || !epilog->rodata_size);



Strictly speaking it should be possible for main to have rodata if there is
a prolog but no epilog, right? In any case, patches 1-9 are


Yes. The thing is, the epilog is always present and can't be removed.
If it's empty, it must contain s_endpgm at least.

On the other hand, empty prologs aren't even compiled and
shader->prolog is NULL in that case.

We could support rodata for main if the compiler reserved some free
space for the epilog between the code and rodata.


Ah, thanks for the explanation, I forgot about the s_endpgm.



Marek


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 09/25] radeonsi: add code for combining and uploading shaders from 3 shader parts

2016-02-16 Thread Marek Olšák

On Tue, Feb 16, 2016 at 4:53 PM, Nicolai Hähnle  wrote:
> On 15.02.2016 18:59, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>   src/gallium/drivers/radeonsi/si_shader.c | 35
>> 
>>   src/gallium/drivers/radeonsi/si_shader.h |  9 
>>   2 files changed, 36 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index dbb9217..a6a0984 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -4036,26 +4036,45 @@ void si_shader_apply_scratch_relocs(struct
>> si_context *sctx,
>>
>>   int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader
>> *shader)
>>   {
>> -   const struct radeon_shader_binary *binary = &shader->binary;
>> -   unsigned code_size = binary->code_size + binary->rodata_size;
>> +   const struct radeon_shader_binary *prolog =
>> +   shader->prolog ? &shader->prolog->binary : NULL;
>> +   const struct radeon_shader_binary *epilog =
>> +   shader->epilog ? &shader->epilog->binary : NULL;
>> +   const struct radeon_shader_binary *mainb = &shader->binary;
>> +   unsigned bo_size =
>> +   (prolog ? prolog->code_size : 0) +
>> +   mainb->code_size +
>> +   (epilog ? epilog->code_size : mainb->rodata_size);
>> unsigned char *ptr;
>>
>> +   assert(!prolog || !prolog->rodata_size);
>> +   assert((!prolog && !epilog) || !mainb->rodata_size);
>> +   assert(!epilog || !epilog->rodata_size);
>
>
> Strictly speaking it should be possible for main to have rodata if there is
> a prolog but no epilog, right? In any case, patches 1-9 are

Yes. The thing is, the epilog is always present and can't be removed.
If it's empty, it must contain s_endpgm at least.

On the other hand, empty prologs aren't even compiled and
shader->prolog is NULL in that case.

We could support rodata for main if the compiler reserved some free
space for the epilog between the code and rodata.

Marek
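
The layout rule discussed in this message can be sketched in plain C: the
buffer holds prolog code (if any), then main code, then either epilog code
or, when no epilog is present, the main part's rodata. This is a simplified
illustration, not the driver code: struct part_binary is a hypothetical
stand-in for radeon_shader_binary, and memcpy replaces
util_memcpy_cpu_to_le32.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in holding only the fields the patch uses. */
struct part_binary {
	const unsigned char *code;
	size_t code_size;
	const unsigned char *rodata;
	size_t rodata_size;
};

/* Size of the combined buffer, with the same invariants as the patch. */
static size_t combined_size(const struct part_binary *prolog,
			    const struct part_binary *mainb,
			    const struct part_binary *epilog)
{
	assert(!prolog || !prolog->rodata_size);
	assert((!prolog && !epilog) || !mainb->rodata_size);
	assert(!epilog || !epilog->rodata_size);

	return (prolog ? prolog->code_size : 0) +
	       mainb->code_size +
	       (epilog ? epilog->code_size : mainb->rodata_size);
}

/* Copy the parts into dst in upload order; returns the bytes written. */
static size_t combine_parts(unsigned char *dst,
			    const struct part_binary *prolog,
			    const struct part_binary *mainb,
			    const struct part_binary *epilog)
{
	size_t size = combined_size(prolog, mainb, epilog);
	unsigned char *ptr = dst;

	if (prolog) {
		memcpy(ptr, prolog->code, prolog->code_size);
		ptr += prolog->code_size;
	}

	memcpy(ptr, mainb->code, mainb->code_size);
	ptr += mainb->code_size;

	/* The epilog and main rodata are mutually exclusive here. */
	if (epilog)
		memcpy(ptr, epilog->code, epilog->code_size);
	else if (mainb->rodata_size > 0)
		memcpy(ptr, mainb->rodata, mainb->rodata_size);

	return size;
}
```

Because the epilog is appended directly after the main code, there is no
room left for main rodata in the 3-part case, which is the restriction the
second assert expresses.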

Re: [Mesa-dev] intel skylake gpu support

2016-02-16 Thread Ben Widawsky

On Tue, Feb 16, 2016 at 10:39:23AM -0500, Rob Clark wrote:
> Try xf86-video-modesetting instead of xf86-video-intel..

Might I inquire about the thinking behind this? It's my impression that unless one is
using glamor, modesetting won't ever outperform xf86-video-intel (which defaults
to the hardware blitter on gen9+). Certainly modesetting might be a logical
choice if there are corruption issues.

It sounds more like the person is missing all the required vaapi goop to me, but
I'm genuinely curious if you know something I don't :-)

> 
> BR,
> -R
> 
> On Tue, Feb 16, 2016 at 2:24 AM, Jarkko Korpi  
> wrote:
> > I have 3 questions for you.
> >
> > I noticed that opengl 4.0 support is missing just 1 extension
> > GL_ARB_gpu_shader_fp64. Is there any estimated schedule for this to finish?
> >
> > Then the other question. My cpu is skylake 6600k which has powersaving on
> > that it drops its speed when not doing much. It can drop cores at 800mhz.
> > Most of youtube videos if not all are in vp9 format, that's the info I get
> > when i look at the video stats. But I get lots of dropped frames even with
> > low resolution clips. Is this connection issue or driver issue or cpu not
> > using enough speed?
> >
> > Does intel driver somehow have hardware encode/decode for vp9?
> >
> >
> >

Re: [Mesa-dev] [PATCH 12/25] radeonsi: add VS prolog

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

This is disabled with use_monolithic_shaders = true.
---
  src/gallium/drivers/radeonsi/si_pipe.c   |  19 +++
  src/gallium/drivers/radeonsi/si_pipe.h   |   3 +
  src/gallium/drivers/radeonsi/si_shader.c | 236 ++-
  src/gallium/drivers/radeonsi/si_shader.h |   9 ++
  4 files changed, 266 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 448fe88..7ce9570 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -22,6 +22,7 @@
   */

  #include "si_pipe.h"
+#include "si_shader.h"
  #include "si_public.h"
  #include "sid.h"

@@ -536,6 +537,11 @@ static int si_get_shader_param(struct pipe_screen* 
pscreen, unsigned shader, enu
  static void si_destroy_screen(struct pipe_screen* pscreen)
  {
struct si_screen *sscreen = (struct si_screen *)pscreen;
+   struct si_shader_part *parts[] = {
+   sscreen->vs_prologs,
+   /* this will be filled with other shader parts */
+   };
+   unsigned i;

if (!sscreen)
return;
@@ -543,6 +549,18 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
if (!sscreen->b.ws->unref(sscreen->b.ws))
return;

+   /* Free shader parts. */
+   for (i = 0; i < ARRAY_SIZE(parts); i++) {
+   while (parts[i]) {
+   struct si_shader_part *part = parts[i];
+
+   parts[i] = part->next;
+   radeon_shader_binary_clean(&part->binary);
+   FREE(part);
+   }
+   }
+   pipe_mutex_destroy(sscreen->shader_parts_mutex);
+
r600_destroy_common_screen(&sscreen->b);
  }

@@ -600,6 +618,7 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)

sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
+   pipe_mutex_init(sscreen->shader_parts_mutex);
sscreen->use_monolithic_shaders = true;

if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 2a2455c..f4bafc2 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -87,6 +87,9 @@ struct si_screen {

/* Whether shaders are monolithic (1-part) or separate (3-part). */
booluse_monolithic_shaders;
+
+   pipe_mutex  shader_parts_mutex;
+   struct si_shader_part   *vs_prologs;
  };

  struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b74ed1e..fbb8394 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -83,6 +83,7 @@ struct si_shader_context
int param_rel_auto_id;
int param_vs_prim_id;
int param_instance_id;
+   int param_vertex_index0;
int param_tes_u;
int param_tes_v;
int param_tes_rel_patch_id;
@@ -432,7 +433,11 @@ static void declare_input_vs(
/* Build the attribute offset */
attribute_offset = lp_build_const_int32(gallivm, 0);

-   if (divisor) {
+   if (!ctx->is_monolithic) {
+   buffer_index = LLVMGetParam(radeon_bld->main_fn,
+   ctx->param_vertex_index0 +
+   input_index);
+   } else if (divisor) {
/* Build index from instance ID, start instance and divisor */
ctx->shader->uses_instanceid = true;
buffer_index = get_instance_index_for_fetch(&ctx->radeon_bld,
@@ -3711,6 +3716,15 @@ static void create_function(struct si_shader_context 
*ctx)
params[ctx->param_rel_auto_id = num_params++] = ctx->i32;
params[ctx->param_vs_prim_id = num_params++] = ctx->i32;
params[ctx->param_instance_id = num_params++] = ctx->i32;
+
+   if (!ctx->is_monolithic &&
+   !ctx->is_gs_copy_shader) {
+   /* Vertex load indices. */
+   ctx->param_vertex_index0 = num_params;
+
+   for (i = 0; i < shader->selector->info.num_inputs; i++)
+   params[num_params++] = ctx->i32;
+   }
break;

case TGSI_PROCESSOR_TESS_CTRL:
@@ -4678,6 +4692,203 @@ out:
return r;
  }

+/**
+ * Create, compile and return a shader part (prolog or epilog).
+ *
+ * \param sscreen  screen
+ * \param list list of shader parts of the same category
+ * \param key  shader part key
+ * \param tm   LLVM target machine
+ * \param debugdebug callback
+ * \param compile  the callback responsible for compilation
+ * \return

Re: [Mesa-dev] [PATCH 11/25] radeonsi: first bits for non-monolithic shaders

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
  src/gallium/drivers/radeonsi/si_pipe.h   |  3 ++
  src/gallium/drivers/radeonsi/si_shader.c | 53 
  src/gallium/drivers/radeonsi/si_shader.h |  2 +-
  4 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index fa60732..448fe88 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -600,6 +600,7 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)

sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
+   sscreen->use_monolithic_shaders = true;

if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | 
DBG_CS;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index b5790d6..2a2455c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -84,6 +84,9 @@ struct si_compute;
  struct si_screen {
struct r600_common_screen   b;
unsignedgs_table_depth;
+
+   /* Whether shaders are monolithic (1-part) or separate (3-part). */
+   booluse_monolithic_shaders;
  };

  struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b058019..b74ed1e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -70,6 +70,12 @@ struct si_shader_context

unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
bool is_gs_copy_shader;
+
+   /* Whether to generate the optimized shader variant compiled as a whole
+* (without a prolog and epilog)
+*/
+   bool is_monolithic;
+
int param_streamout_config;
int param_streamout_write_index;
int param_streamout_offset[4];
@@ -3657,8 +3663,10 @@ static void create_function(struct si_shader_context 
*ctx)
struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct si_shader *shader = ctx->shader;
-   LLVMTypeRef params[SI_NUM_PARAMS], v2i32, v3i32;
+   LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v2i32, v3i32;
+   LLVMTypeRef returns[16+32*4];


This is a bit of a magic number, I guess something like max parameters 
plus attributes. Can you replace it by the appropriate defines?


Apart from this, patches 10-11 are

Reviewed-by: Nicolai Hähnle 


unsigned i, last_array_pointer, last_sgpr, num_params;
+   unsigned num_returns = 0;

v2i32 = LLVMVectorType(ctx->i32, 2);
v3i32 = LLVMVectorType(ctx->i32, 3);
@@ -3785,7 +3793,7 @@ static void create_function(struct si_shader_context *ctx)

assert(num_params <= Elements(params));

-   si_create_function(ctx, NULL, 0, params,
+   si_create_function(ctx, returns, num_returns, params,
   num_params, last_array_pointer, last_sgpr);

shader->num_input_sgprs = 0;
@@ -4492,9 +4500,11 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
  }

-int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
-struct si_shader *shader,
-struct pipe_debug_callback *debug)
+static int si_compile_tgsi_shader(struct si_screen *sscreen,
+ LLVMTargetMachineRef tm,
+ struct si_shader *shader,
+ bool is_monolithic,
+ struct pipe_debug_callback *debug)
  {
struct si_shader_selector *sel = shader->selector;
struct tgsi_token *tokens = sel->tokens;
@@ -4524,6 +4534,7 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,

si_init_shader_ctx(&ctx, sscreen, shader, tm,
   poly_stipple ? &stipple_shader_info : &sel->info);
+   ctx.is_monolithic = is_monolithic;

shader->uses_instanceid = sel->info.uses_instanceid;

@@ -4604,14 +4615,6 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
goto out;
}

-   si_shader_dump(sscreen, shader, debug, ctx.type);
-
-   r = si_shader_binary_upload(sscreen, shader);
-   if (r) {
-   fprintf(stderr, "LLVM failed to upload shader\n");
-   goto out;
-   }
-
radeon_llvm_dispose(&ctx.radeon_bld);

/* Calculate the number of fragment input VGPRs. */
@@ -4675,6 +4678,30 @@ out:
return r;
  }

+int si_shader_create(struct si_screen

Re: [Mesa-dev] [PATCH 09/25] radeonsi: add code for combining and uploading shaders from 3 shader parts

2016-02-16 Thread Nicolai Hähnle


On 15.02.2016 18:59, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_shader.c | 35 
  src/gallium/drivers/radeonsi/si_shader.h |  9 
  2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index dbb9217..a6a0984 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4036,26 +4036,45 @@ void si_shader_apply_scratch_relocs(struct si_context 
*sctx,

  int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader)
  {
-   const struct radeon_shader_binary *binary = &shader->binary;
-   unsigned code_size = binary->code_size + binary->rodata_size;
+   const struct radeon_shader_binary *prolog =
+   shader->prolog ? &shader->prolog->binary : NULL;
+   const struct radeon_shader_binary *epilog =
+   shader->epilog ? &shader->epilog->binary : NULL;
+   const struct radeon_shader_binary *mainb = &shader->binary;
+   unsigned bo_size =
+   (prolog ? prolog->code_size : 0) +
+   mainb->code_size +
+   (epilog ? epilog->code_size : mainb->rodata_size);
unsigned char *ptr;

+   assert(!prolog || !prolog->rodata_size);
+   assert((!prolog && !epilog) || !mainb->rodata_size);
+   assert(!epilog || !epilog->rodata_size);


Strictly speaking it should be possible for main to have rodata if there 
is a prolog but no epilog, right? In any case, patches 1-9 are


Reviewed-by: Nicolai Hähnle 


+
r600_resource_reference(&shader->bo, NULL);
shader->bo = si_resource_create_custom(&sscreen->b.b,
   PIPE_USAGE_IMMUTABLE,
-  code_size);
+  bo_size);
if (!shader->bo)
return -ENOMEM;

+   /* Upload. */
ptr = sscreen->b.ws->buffer_map(shader->bo->buf, NULL,
PIPE_TRANSFER_READ_WRITE);
-   util_memcpy_cpu_to_le32(ptr, binary->code, binary->code_size);
-   if (binary->rodata_size > 0) {
-   ptr += binary->code_size;
-   util_memcpy_cpu_to_le32(ptr, binary->rodata,
-   binary->rodata_size);
+
+   if (prolog) {
+   util_memcpy_cpu_to_le32(ptr, prolog->code, prolog->code_size);
+   ptr += prolog->code_size;
}

+   util_memcpy_cpu_to_le32(ptr, mainb->code, mainb->code_size);
+   ptr += mainb->code_size;
+
+   if (epilog)
+   util_memcpy_cpu_to_le32(ptr, epilog->code, epilog->code_size);
+   else if (mainb->rodata_size > 0)
+   util_memcpy_cpu_to_le32(ptr, mainb->rodata, mainb->rodata_size);
+
sscreen->b.ws->buffer_unmap(shader->bo->buf);
return 0;
  }
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 9331156..4c3c14a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -304,6 +304,9 @@ struct si_shader {
struct si_shader_selector   *selector;
struct si_shader*next_variant;

+   struct si_shader_part   *prolog;
+   struct si_shader_part   *epilog;
+
struct si_shader*gs_copy_shader;
struct si_pm4_state *pm4;
struct r600_resource*bo;
@@ -322,6 +325,12 @@ struct si_shader {
unsignednr_param_exports;
  };

+struct si_shader_part {
+   struct si_shader_part *next;
+   struct radeon_shader_binary binary;
+   struct si_shader_config config;
+};
+
  static inline struct tgsi_shader_info *si_get_vs_info(struct si_context *sctx)
  {
if (sctx->gs_shader.cso)


