[Mesa-dev] [PATCH] glsl: mark link_uniform_blocks_are_compatible() as static

2016-07-01 Thread Timothy Arceri
Missed this when doing 6d1a59d15b.
---
 src/compiler/glsl/link_uniform_blocks.cpp | 2 +-
 src/compiler/glsl/linker.h| 4 
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/compiler/glsl/link_uniform_blocks.cpp 
b/src/compiler/glsl/link_uniform_blocks.cpp
index 1ccd0df..5b0dff6 100644
--- a/src/compiler/glsl/link_uniform_blocks.cpp
+++ b/src/compiler/glsl/link_uniform_blocks.cpp
@@ -469,7 +469,7 @@ link_uniform_blocks(void *mem_ctx,
_mesa_hash_table_destroy(block_hash, NULL);
 }
 
-bool
+static bool
 link_uniform_blocks_are_compatible(const gl_uniform_block *a,
const gl_uniform_block *b)
 {
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index 6687b14..4e36d0b 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -50,10 +50,6 @@ link_cross_validate_uniform_block(void *mem_ctx,
  unsigned int *num_linked_blocks,
  struct gl_uniform_block *new_block);
 
-extern bool
-link_uniform_blocks_are_compatible(const gl_uniform_block *a,
-  const gl_uniform_block *b);
-
 extern void
 link_uniform_blocks(void *mem_ctx,
 struct gl_context *ctx,
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.

2016-07-01 Thread Jason Ekstrand
On Fri, Jul 1, 2016 at 5:43 PM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > 3 and 4 are
> >
> > Cc: "12.0 11.1 11.2" 
>
> Hmm, I'll cc PATCH 2 to mesa-stable too since technically it also fixes
> a bug.  PATCH 1 shouldn't make much of a difference though.
>
> > Reviewed-by: Jason Ekstrand 
> >
> > I did look over 3 fairly carefully.
> >
> Thanks.
>
> > It's worth noting that I think we have some double-pipe-controls that
> could
> > probably be put together now.  Not sure that we actually want to do that
> > though.
>
> I don't think it matters much, usually it's nearly the same amount of
> lines of code to do it as one call to brw_emit_pipe_control_flush() or
> as two, both approaches are functionally equivalent, and doing it
> explicitly as two brw_emit_pipe_control_flush() calls makes it clear to
> the reader that the read cache invalidations are supposed to happen
> after the write caches are flushed.  In some places though it's really
> convenient to be able to put all cache bits for which coherency is
> required into a single bitfield and have brw_emit_pipe_control_flush()
> figure out whether the command needs to be split (e.g.
> brw_memory_barrier()).
>

That sounds reasonable.  I wasn't even really sure about it myself but
thought it would be worth a few words of discussion.  Sounds like we're on
the same page.


> > --Jason
> >
> > On Thu, Jun 30, 2016 at 10:07 PM, Francisco Jerez  >
> > wrote:
> >
> >> This hardware race condition has caused problems several times already
> >> (see "i965: Fix cache pollution race during L3 partitioning set-up.",
> >> "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
> >> "i965: intel_texture_barrier reimplemented").  The problem is that
> >> whenever we attempt to both flush and invalidate multiple caches with
> >> a single pipe control command the flush and invalidation happen in
> >> reverse order, so the contents flushed from the R/W caches aren't
> >> guaranteed to become visible from the invalidated caches after the
> >> PIPE_CONTROL command completes execution if some concurrent rendering
> >> workload happened to pollute any of the invalidated R/O caches in the
> >> short window of time between the invalidation and flush.
> >>
> >> This makes sure that brw_emit_pipe_control_flush() has the effect
> >> expected by most callers of making the contents flushed from any R/W
> >> caches visible from the invalidated R/O caches.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 18 ++
> >>  src/mesa/drivers/dri/i965/intel_reg.h|  9 +
> >>  2 files changed, 27 insertions(+)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> index 14a8f7c..05e8c05 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> >> @@ -96,6 +96,24 @@ gen7_cs_stall_every_four_pipe_controls(struct
> >> brw_context *brw, uint32_t flags)
> >>  void
> >>  brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
> >>  {
> >> +   if (brw->gen >= 6 &&
> >> +   (flags & PIPE_CONTROL_CACHE_FLUSH_BITS) &&
> >> +   (flags & PIPE_CONTROL_CACHE_INVALIDATE_BITS)) {
> >> +  /* A pipe control command with flush and invalidate bits set
> >> +   * simultaneously is an inherently racy operation on Gen6+ if the
> >> +   * contents of the flushed caches were intended to become visible
> >> from
> >> +   * any of the invalidated caches.  Split it in two PIPE_CONTROLs,
> >> the
> >> +   * first one should stall the pipeline to make sure that the
> >> flushed R/W
> >> +   * caches are coherent with memory once the specified R/O caches
> are
> >> +   * invalidated.  On pre-Gen6 hardware the (implicit) R/O cache
> >> +   * invalidation seems to happen at the bottom of the pipeline
> >> together
> >> +   * with any write cache flush, so this shouldn't be a concern.
> >> +   */
> >> +  brw_emit_pipe_control_flush(brw, (flags &
> >> PIPE_CONTROL_CACHE_FLUSH_BITS) |
> >> +   PIPE_CONTROL_CS_STALL);
> >> +  flags &= ~(PIPE_CONTROL_CACHE_FLUSH_BITS |
> PIPE_CONTROL_CS_STALL);
> >> +   }
> >> +
> >> if (brw->gen >= 8) {
> >>if (brw->gen == 8)
> >>   gen8_add_cs_stall_workaround_bits();
> >> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h
> >> b/src/mesa/drivers/dri/i965/intel_reg.h
> >> index 95365fe..7a82be4 100644
> >> --- a/src/mesa/drivers/dri/i965/intel_reg.h
> >> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> >> @@ -134,6 +134,15 @@
> >>  #define PIPE_CONTROL_PPGTT_WRITE   (0 << 2)
> >>  #define PIPE_CONTROL_GLOBAL_GTT_WRITE  (1 << 2)
> >>
> >> +#define PIPE_CONTROL_CACHE_FLUSH_BITS \
> >> +   (PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_DATA_CACHE_FLUSH | \

Re: [Mesa-dev] [PATCH 2/4] anv/image: Simplify initialization of the isl_tiling

2016-07-01 Thread Jason Ekstrand
On Fri, Jul 1, 2016 at 6:13 PM, Nanley Chery  wrote:

> On Fri, Jul 01, 2016 at 02:24:19PM -0700, Chad Versace wrote:
> > On Mon 27 Jun 2016, Nanley Chery wrote:
> > > Signed-off-by: Nanley Chery 
> > > ---
> > >  src/intel/vulkan/anv_image.c | 10 --
> > >  1 file changed, 4 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_image.c
> b/src/intel/vulkan/anv_image.c
> > > index 77d9931..b3f5f5c 100644
> > > --- a/src/intel/vulkan/anv_image.c
> > > +++ b/src/intel/vulkan/anv_image.c
> > > @@ -120,10 +120,6 @@ make_surface(const struct anv_device *dev,
> > >[VK_IMAGE_TYPE_3D] = ISL_SURF_DIM_3D,
> > > };
> > >
> > > -   isl_tiling_flags_t tiling_flags = anv_info->isl_tiling_flags;
> > > -   if (vk_info->tiling == VK_IMAGE_TILING_LINEAR)
> > > -  tiling_flags = ISL_TILING_LINEAR_BIT;
> > > -
> > > struct anv_surface *anv_surf = get_surface(image, aspect);
> > >
> > > image->extent = anv_sanitize_image_extent(vk_info->imageType,
> > > @@ -142,7 +138,7 @@ make_surface(const struct anv_device *dev,
> > >.min_alignment = 0,
> > >.min_pitch = anv_info->stride,
> > >.usage = choose_isl_surf_usage(image->usage, aspect),
> > > -  .tiling_flags = tiling_flags);
> > > +  .tiling_flags = anv_info->isl_tiling_flags);
> > >
> > > /* isl_surf_init() will fail only if provided invalid input.
> Invalid input
> > >  * is illegal in Vulkan.
> > > @@ -260,7 +256,9 @@ anv_CreateImage(VkDevice device,
> > > return anv_image_create(device,
> > >&(struct anv_image_create_info) {
> > >   .vk_info = pCreateInfo,
> > > - .isl_tiling_flags = ISL_TILING_ANY_MASK,
> > > + .isl_tiling_flags = (pCreateInfo->tiling ==
> VK_IMAGE_TILING_LINEAR) ?
> > > + ISL_TILING_LINEAR_BIT :
> ISL_TILING_ANY_MASK,
> > > +
> > >},
> > >pAllocator,
> > >pImage);
> >
> > I don't agree with this patch.
> >
> > Locally, the patch look correct. But when you consider that
> > anv_image_create() is public within the driver, the patch makes the code
> > fragile. Pre-patch, if the caller of anv_image_create() sets
> > anv_image_create_info::vk_info::tiling and leaves
> > anv_image_create_info::isl_tiling_flags unset (which I believe should be
> > a valid combination), then anv_image_create() correctly converts the
> > VkImageTilingFlags to isl_tiling_flags. Post-patch, that's no longer the
> > case; anv_image_create() ignores its VkImageTiling input.
>
> Thanks for finding that bug.
>
> Your description has actually pointed out an issue in the current code:
> If an internal caller specifies
> anv_image_create_info::vk_info::tiling = VK_IMAGE_TILING_OPTIMAL
> and leaves anv_image_create_info::isl_tiling_flags unset, then
> anv_image_create() ignores the VkImageTiling input and causes ISL to
> fail image creation later.
>
> To solve this problem, I think we should define ::isl_tiling_flags to be a
> opt-in bit-mask which works with the requested ::vk_info::tiling to provide
> more specificity on the actual desired tiling. With this in mind, we can
> drop
> the last two hunks from the above patch and replace the first with the
> following:
> `
>  isl_tiling_flags_t tiling_flags =
> (pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR ?
> ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK);
>  if (anv_info->isl_tiling_flags)
> tiling_flags &= anv_info->isl_tiling_flags;
>  assert (tiling_flags);
>

I think I like that.  The assert ensures that you don't try and combine
VK_IMAGE_TILING_LINEAR with ISL bits that contradict it.  On the other
hand, it does let you do VK_IMAGE_TILING_OPTIMAL with ISL_TILING_LINEAR_BIT
which I think is ok.

--Jason


> `
> What do you think?
>
> - Nanley
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCHv5 2/4] anv: use cache uuid based on the build timestamp.

2016-07-01 Thread Jason Ekstrand
I think this patch finally looks good.  I read the SOURCE_DATE_EPOCH spec
and we seem to be following it correctly.  Thanks for all your hard work (I
know it's been a lot) on this!

Reviewed-by: Jason Ekstrand 

On Fri, Jul 1, 2016 at 9:22 AM, Emil Velikov 
wrote:

> From: Emil Velikov 
>
> Do not rely on the git sha1:
>  - its current truncated form makes it less unique
>  - it does not attribute for local (Vulkand or otherwise) changes
>
> Use a timestamp produced at the time of build. It's perfectly unique,
> unless someone explicitly thinkers with their system clock. Even then
> chances of producing the exact same one are very small, if not zero.
>
> v2: Remove .tmp rule. Its not needed since we want for the header to be
> regenerated on each time we call make (Eric).
>
> v3:
>  - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel)
>  - Replace the generated header with a define, to prevent needless
> builds on consecutive `make' and/or `make install' calls. (Dave)
>
> v4:
>  - Keep the timestamp generation at make time. (Jason)
>
> v5:
>  - Ensure that file is regenerated on incremental builds.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Emil Velikov 
> ---
>  configure.ac  |  2 ++
>  src/intel/vulkan/Makefile.am  | 10 +-
>  src/intel/vulkan/anv_device.c |  4 ++--
>  3 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 8321e8e..3d32c53 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2620,6 +2620,8 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
>  AC_SUBST([XA_TINY], $XA_TINY)
>  AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
>
> +AC_SUBST([TIMESTAMP_CMD], '`test $(SOURCE_DATE_EPOCH) && echo
> $(SOURCE_DATE_EPOCH) || date +%s`')
> +
>  AC_ARG_ENABLE(valgrind,
>[AS_HELP_STRING([--enable-valgrind],
>   [Build mesa with valgrind support (default:
> auto)])],
> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> index 4d9ff90..3f49020 100644
> --- a/src/intel/vulkan/Makefile.am
> +++ b/src/intel/vulkan/Makefile.am
> @@ -131,8 +131,16 @@ anv_entrypoints.c : anv_entrypoints_gen.py
> $(vulkan_include_HEADERS)
> $(AM_V_GEN) cat $(vulkan_include_HEADERS) |\
> $(PYTHON2) $(srcdir)/anv_entrypoints_gen.py code > $@
>
> +.PHONY: anv_timestamp.h
> +
> +anv_timestamp.h:
> +   @echo "Updating anv_timestamp.h"
> +   $(AM_V_GEN) echo "#define ANV_TIMESTAMP \"$(TIMESTAMP_CMD)\"" > $@
> +
> +anv_device.$(OBJEXT): anv_timestamp.h
> +
>  BUILT_SOURCES = $(VULKAN_GENERATED_FILES)
> -CLEANFILES = $(BUILT_SOURCES) dev_icd.json
> +CLEANFILES = $(BUILT_SOURCES) dev_icd.json anv_timestamp.h
>  EXTRA_DIST = \
> $(top_srcdir)/include/vulkan/vk_icd.h \
> anv_entrypoints_gen.py \
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index ea8e875..dd941b6 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -28,7 +28,7 @@
>  #include 
>
>  #include "anv_private.h"
> -#include "git_sha1.h"
> +#include "anv_timestamp.h"
>  #include "util/strtod.h"
>  #include "util/debug.h"
>
> @@ -426,7 +426,7 @@ void
>  anv_device_get_cache_uuid(void *uuid)
>  {
> memset(uuid, 0, VK_UUID_SIZE);
> -   snprintf(uuid, VK_UUID_SIZE, "anv-%s", MESA_GIT_SHA1 + 4);
> +   snprintf(uuid, VK_UUID_SIZE, "anv-%s", ANV_TIMESTAMP);
>  }
>
>  void anv_GetPhysicalDeviceProperties(
> --
> 2.8.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] anv/image: Simplify initialization of the isl_tiling

2016-07-01 Thread Nanley Chery
On Fri, Jul 01, 2016 at 02:24:19PM -0700, Chad Versace wrote:
> On Mon 27 Jun 2016, Nanley Chery wrote:
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/intel/vulkan/anv_image.c | 10 --
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > index 77d9931..b3f5f5c 100644
> > --- a/src/intel/vulkan/anv_image.c
> > +++ b/src/intel/vulkan/anv_image.c
> > @@ -120,10 +120,6 @@ make_surface(const struct anv_device *dev,
> >[VK_IMAGE_TYPE_3D] = ISL_SURF_DIM_3D,
> > };
> >  
> > -   isl_tiling_flags_t tiling_flags = anv_info->isl_tiling_flags;
> > -   if (vk_info->tiling == VK_IMAGE_TILING_LINEAR)
> > -  tiling_flags = ISL_TILING_LINEAR_BIT;
> > -
> > struct anv_surface *anv_surf = get_surface(image, aspect);
> >  
> > image->extent = anv_sanitize_image_extent(vk_info->imageType,
> > @@ -142,7 +138,7 @@ make_surface(const struct anv_device *dev,
> >.min_alignment = 0,
> >.min_pitch = anv_info->stride,
> >.usage = choose_isl_surf_usage(image->usage, aspect),
> > -  .tiling_flags = tiling_flags);
> > +  .tiling_flags = anv_info->isl_tiling_flags);
> >  
> > /* isl_surf_init() will fail only if provided invalid input. Invalid 
> > input
> >  * is illegal in Vulkan.
> > @@ -260,7 +256,9 @@ anv_CreateImage(VkDevice device,
> > return anv_image_create(device,
> >&(struct anv_image_create_info) {
> >   .vk_info = pCreateInfo,
> > - .isl_tiling_flags = ISL_TILING_ANY_MASK,
> > + .isl_tiling_flags = (pCreateInfo->tiling == 
> > VK_IMAGE_TILING_LINEAR) ?
> > + ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK,
> > +
> >},
> >pAllocator,
> >pImage);
> 
> I don't agree with this patch.
> 
> Locally, the patch look correct. But when you consider that
> anv_image_create() is public within the driver, the patch makes the code
> fragile. Pre-patch, if the caller of anv_image_create() sets
> anv_image_create_info::vk_info::tiling and leaves
> anv_image_create_info::isl_tiling_flags unset (which I believe should be
> a valid combination), then anv_image_create() correctly converts the
> VkImageTilingFlags to isl_tiling_flags. Post-patch, that's no longer the
> case; anv_image_create() ignores its VkImageTiling input.

Thanks for finding that bug.

Your description has actually pointed out an issue in the current code:
If an internal caller specifies
anv_image_create_info::vk_info::tiling = VK_IMAGE_TILING_OPTIMAL
and leaves anv_image_create_info::isl_tiling_flags unset, then
anv_image_create() ignores the VkImageTiling input and causes ISL to
fail image creation later.

To solve this problem, I think we should define ::isl_tiling_flags to be a
opt-in bit-mask which works with the requested ::vk_info::tiling to provide
more specificity on the actual desired tiling. With this in mind, we can drop
the last two hunks from the above patch and replace the first with the
following:
`
 isl_tiling_flags_t tiling_flags =
(pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR ? 
ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK);
 if (anv_info->isl_tiling_flags)
tiling_flags &= anv_info->isl_tiling_flags;
 assert (tiling_flags);
`
What do you think?

- Nanley
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.

2016-07-01 Thread Francisco Jerez
Jason Ekstrand  writes:

> 3 and 4 are
>
> Cc: "12.0 11.1 11.2" 

Hmm, I'll cc PATCH 2 to mesa-stable too since technically it also fixes
a bug.  PATCH 1 shouldn't make much of a difference though.

> Reviewed-by: Jason Ekstrand 
>
> I did look over 3 fairly carefully.
>
Thanks.

> It's worth noting that I think we have some double-pipe-controls that could
> probably be put together now.  Not sure that we actually want to do that
> though.

I don't think it matters much, usually it's nearly the same amount of
lines of code to do it as one call to brw_emit_pipe_control_flush() or
as two, both approaches are functionally equivalent, and doing it
explicitly as two brw_emit_pipe_control_flush() calls makes it clear to
the reader that the read cache invalidations are supposed to happen
after the write caches are flushed.  In some places though it's really
convenient to be able to put all cache bits for which coherency is
required into a single bitfield and have brw_emit_pipe_control_flush()
figure out whether the command needs to be split (e.g.
brw_memory_barrier()).

> --Jason
>
> On Thu, Jun 30, 2016 at 10:07 PM, Francisco Jerez 
> wrote:
>
>> This hardware race condition has caused problems several times already
>> (see "i965: Fix cache pollution race during L3 partitioning set-up.",
>> "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
>> "i965: intel_texture_barrier reimplemented").  The problem is that
>> whenever we attempt to both flush and invalidate multiple caches with
>> a single pipe control command the flush and invalidation happen in
>> reverse order, so the contents flushed from the R/W caches aren't
>> guaranteed to become visible from the invalidated caches after the
>> PIPE_CONTROL command completes execution if some concurrent rendering
>> workload happened to pollute any of the invalidated R/O caches in the
>> short window of time between the invalidation and flush.
>>
>> This makes sure that brw_emit_pipe_control_flush() has the effect
>> expected by most callers of making the contents flushed from any R/W
>> caches visible from the invalidated R/O caches.
>> ---
>>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 18 ++
>>  src/mesa/drivers/dri/i965/intel_reg.h|  9 +
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> index 14a8f7c..05e8c05 100644
>> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
>> @@ -96,6 +96,24 @@ gen7_cs_stall_every_four_pipe_controls(struct
>> brw_context *brw, uint32_t flags)
>>  void
>>  brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
>>  {
>> +   if (brw->gen >= 6 &&
>> +   (flags & PIPE_CONTROL_CACHE_FLUSH_BITS) &&
>> +   (flags & PIPE_CONTROL_CACHE_INVALIDATE_BITS)) {
>> +  /* A pipe control command with flush and invalidate bits set
>> +   * simultaneously is an inherently racy operation on Gen6+ if the
>> +   * contents of the flushed caches were intended to become visible
>> from
>> +   * any of the invalidated caches.  Split it in two PIPE_CONTROLs,
>> the
>> +   * first one should stall the pipeline to make sure that the
>> flushed R/W
>> +   * caches are coherent with memory once the specified R/O caches are
>> +   * invalidated.  On pre-Gen6 hardware the (implicit) R/O cache
>> +   * invalidation seems to happen at the bottom of the pipeline
>> together
>> +   * with any write cache flush, so this shouldn't be a concern.
>> +   */
>> +  brw_emit_pipe_control_flush(brw, (flags &
>> PIPE_CONTROL_CACHE_FLUSH_BITS) |
>> +   PIPE_CONTROL_CS_STALL);
>> +  flags &= ~(PIPE_CONTROL_CACHE_FLUSH_BITS | PIPE_CONTROL_CS_STALL);
>> +   }
>> +
>> if (brw->gen >= 8) {
>>if (brw->gen == 8)
>>   gen8_add_cs_stall_workaround_bits();
>> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h
>> b/src/mesa/drivers/dri/i965/intel_reg.h
>> index 95365fe..7a82be4 100644
>> --- a/src/mesa/drivers/dri/i965/intel_reg.h
>> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
>> @@ -134,6 +134,15 @@
>>  #define PIPE_CONTROL_PPGTT_WRITE   (0 << 2)
>>  #define PIPE_CONTROL_GLOBAL_GTT_WRITE  (1 << 2)
>>
>> +#define PIPE_CONTROL_CACHE_FLUSH_BITS \
>> +   (PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_DATA_CACHE_FLUSH | \
>> +PIPE_CONTROL_RENDER_TARGET_FLUSH)
>> +
>> +#define PIPE_CONTROL_CACHE_INVALIDATE_BITS \
>> +   (PIPE_CONTROL_STATE_CACHE_INVALIDATE |
>> PIPE_CONTROL_CONST_CACHE_INVALIDATE | \
>> +PIPE_CONTROL_VF_CACHE_INVALIDATE |
>> PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
>> +PIPE_CONTROL_INSTRUCTION_INVALIDATE)
>> +
>>  /** @} */
>>
>>  #define XY_SETUP_BLT_CMD   (CMD_2D | (0x01 << 22))
>> --
>> 2.9.0
>>
>>


signature.asc

Re: [Mesa-dev] [PATCH 4/4] i965: Fix remaining flush vs invalidate race conditions in brw_emit_pipe_control_flush.

2016-07-01 Thread Jason Ekstrand
3 and 4 are

Cc: "12.0 11.1 11.2" 
Reviewed-by: Jason Ekstrand 

I did look over 3 fairly carefully.

It's worth noting that I think we have some double-pipe-controls that could
probably be put together now.  Not sure that we actually want to do that
though.
--Jason

On Thu, Jun 30, 2016 at 10:07 PM, Francisco Jerez 
wrote:

> This hardware race condition has caused problems several times already
> (see "i965: Fix cache pollution race during L3 partitioning set-up.",
> "i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
> "i965: intel_texture_barrier reimplemented").  The problem is that
> whenever we attempt to both flush and invalidate multiple caches with
> a single pipe control command the flush and invalidation happen in
> reverse order, so the contents flushed from the R/W caches aren't
> guaranteed to become visible from the invalidated caches after the
> PIPE_CONTROL command completes execution if some concurrent rendering
> workload happened to pollute any of the invalidated R/O caches in the
> short window of time between the invalidation and flush.
>
> This makes sure that brw_emit_pipe_control_flush() has the effect
> expected by most callers of making the contents flushed from any R/W
> caches visible from the invalidated R/O caches.
> ---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 18 ++
>  src/mesa/drivers/dri/i965/intel_reg.h|  9 +
>  2 files changed, 27 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 14a8f7c..05e8c05 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -96,6 +96,24 @@ gen7_cs_stall_every_four_pipe_controls(struct
> brw_context *brw, uint32_t flags)
>  void
>  brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags)
>  {
> +   if (brw->gen >= 6 &&
> +   (flags & PIPE_CONTROL_CACHE_FLUSH_BITS) &&
> +   (flags & PIPE_CONTROL_CACHE_INVALIDATE_BITS)) {
> +  /* A pipe control command with flush and invalidate bits set
> +   * simultaneously is an inherently racy operation on Gen6+ if the
> +   * contents of the flushed caches were intended to become visible
> from
> +   * any of the invalidated caches.  Split it in two PIPE_CONTROLs,
> the
> +   * first one should stall the pipeline to make sure that the
> flushed R/W
> +   * caches are coherent with memory once the specified R/O caches are
> +   * invalidated.  On pre-Gen6 hardware the (implicit) R/O cache
> +   * invalidation seems to happen at the bottom of the pipeline
> together
> +   * with any write cache flush, so this shouldn't be a concern.
> +   */
> +  brw_emit_pipe_control_flush(brw, (flags &
> PIPE_CONTROL_CACHE_FLUSH_BITS) |
> +   PIPE_CONTROL_CS_STALL);
> +  flags &= ~(PIPE_CONTROL_CACHE_FLUSH_BITS | PIPE_CONTROL_CS_STALL);
> +   }
> +
> if (brw->gen >= 8) {
>if (brw->gen == 8)
>   gen8_add_cs_stall_workaround_bits();
> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h
> b/src/mesa/drivers/dri/i965/intel_reg.h
> index 95365fe..7a82be4 100644
> --- a/src/mesa/drivers/dri/i965/intel_reg.h
> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> @@ -134,6 +134,15 @@
>  #define PIPE_CONTROL_PPGTT_WRITE   (0 << 2)
>  #define PIPE_CONTROL_GLOBAL_GTT_WRITE  (1 << 2)
>
> +#define PIPE_CONTROL_CACHE_FLUSH_BITS \
> +   (PIPE_CONTROL_DEPTH_CACHE_FLUSH | PIPE_CONTROL_DATA_CACHE_FLUSH | \
> +PIPE_CONTROL_RENDER_TARGET_FLUSH)
> +
> +#define PIPE_CONTROL_CACHE_INVALIDATE_BITS \
> +   (PIPE_CONTROL_STATE_CACHE_INVALIDATE |
> PIPE_CONTROL_CONST_CACHE_INVALIDATE | \
> +PIPE_CONTROL_VF_CACHE_INVALIDATE |
> PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
> +PIPE_CONTROL_INSTRUCTION_INVALIDATE)
> +
>  /** @} */
>
>  #define XY_SETUP_BLT_CMD   (CMD_2D | (0x01 << 22))
> --
> 2.9.0
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: don't try to lower non-gl builtins as if they were gl_FragData

2016-07-01 Thread Ilia Mirkin
If a shader has an output array, it will get treated as though it were
gl_FragData and rewritten into gl_out_FragData instances. We only want
this to happen on the actual gl_FragData and not everything else.

This is a small part of the problem pointed out by the below bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
Signed-off-by: Ilia Mirkin 
---
 src/compiler/glsl/opt_dead_builtin_varyings.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/opt_dead_builtin_varyings.cpp 
b/src/compiler/glsl/opt_dead_builtin_varyings.cpp
index 7feea3b..a28887f 100644
--- a/src/compiler/glsl/opt_dead_builtin_varyings.cpp
+++ b/src/compiler/glsl/opt_dead_builtin_varyings.cpp
@@ -85,7 +85,8 @@ public:
{
   ir_variable *var = ir->variable_referenced();
 
-  if (!var || var->data.mode != this->mode || !var->type->is_array())
+  if (!var || var->data.mode != this->mode || !var->type->is_array() ||
+  !is_gl_identifier(var->name))
  return visit_continue;
 
   /* Only match gl_FragData[], not gl_SecondaryFragDataEXT[] */
-- 
2.7.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 23/36] isl: Remove duplicate px->sa conversions

2016-07-01 Thread Jason Ekstrand
In all three cases, we start with width and height taken from
isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
need to do the conversion and doing so gives us an incorrect value.
---
 src/intel/isl/isl.c | 20 
 1 file changed, 20 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 404cfc1..be3adfc 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -610,18 +610,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
   uint32_t W = isl_minify(W0, l);
   uint32_t H = isl_minify(H0, l);
 
-  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
- /* From the Broadwell PRM >> Volume 5: Memory Views >> Computing Mip 
Level
-  * Sizes (p133):
-  *
-  *If the surface is multisampled and it is a depth or stencil
-  *surface or Multisampled Surface StorageFormat in
-  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
-  *adjusted as follows before proceeding: [...]
-  */
- isl_msaa_interleaved_scale_px_to_sa(info->samples, , );
-  }
-
   uint32_t w = isl_align_npot(W, image_align_sa->w);
   uint32_t h = isl_align_npot(H, image_align_sa->h);
 
@@ -1285,17 +1273,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf *surf,
for (uint32_t l = 0; l < level; ++l) {
   if (l == 1) {
  uint32_t W = isl_minify(W0, l);
-
- if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
-isl_msaa_interleaved_scale_px_to_sa(surf->samples, , NULL);
-
  x += isl_align_npot(W, image_align_sa.w);
   } else {
  uint32_t H = isl_minify(H0, l);
-
- if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
-isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL, );
-
  y += isl_align_npot(H, image_align_sa.h);
   }
}
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 23.5/36] isl: Take the slice0_extent shortcut for interleaved MSAA

2016-07-01 Thread Jason Ekstrand
The shortcut works just fine for MSAA and the comment even says so.
---
 src/intel/isl/isl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index be3adfc..6bdb248 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -575,7 +575,7 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
 
assert(phys_level0_sa->depth == 1);
 
-   if (info->levels == 1 && msaa_layout != ISL_MSAA_LAYOUT_INTERLEAVED) {
+   if (info->levels == 1) {
   /* Do not pad the surface to the image alignment. Instead, pad it only
* to the pixel format's block alignment.
*
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Make single-buffered GLES representation internally consistent

2016-07-01 Thread Chad Versace
On Thu 30 Jun 2016, Stéphane Marchesin wrote:
> On Thu, Jun 30, 2016 at 3:20 PM, Gurchetan Singh
>  wrote:
> > There are a few places in the code where clearing and reading are done on
> > incorrect buffers for GLES contexts.  See comments for details.  This
> > fixes 75 GLES3 dEQP tests on the surfaceless platform with no regressions.
> >
> > v2: Corrected unclear comment
> > v3: Make the change in context.c instead of get.c
> > v4: Removed whitespace
> 
> I looked for a better way than initializing from makecurrent, but
> there doesn't seem to be one, so...
> 
> Reviewed-by: 

This looks like a difficult problem to fix.  I also looked for a better
way to fix the problem, but I gave up after concluding that any
thoroughly correct fix seemed to require endless yak shaving.

If this fixes dEQP regressions, without regressing other things, then
I approve the patch. But, I'm not convinced yet that it doesn't regress
X11 pixmaps. I want to run some tests and step through the code with gdb
before I give a reviewed-by.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] glsl: Separate overlapping sentinel nodes in exec_list.

2016-07-01 Thread Ian Romanick
On 06/27/2016 02:42 PM, Matt Turner wrote:
> I do appreciate the cleverness, but unfortunately it prevents a lot more
> cleverness in the form of additional compiler optimizations brought on
> by -fstrict-aliasing.
> 
> No difference in OglBatch7 (n=20).
> 
> Co-authored-by: Davin McCall 
> ---
>  src/compiler/glsl/ast.h|   4 +-
>  src/compiler/glsl/ast_function.cpp |  22 ++--
>  src/compiler/glsl/ast_to_hir.cpp   |   6 +-
>  src/compiler/glsl/ast_type.cpp |   2 +-
>  src/compiler/glsl/glsl_parser_extras.cpp   |   6 +-
>  src/compiler/glsl/ir.cpp   |   8 +-
>  src/compiler/glsl/ir_clone.cpp |   2 +-
>  src/compiler/glsl/ir_constant_expression.cpp   |   2 +-
>  src/compiler/glsl/ir_function.cpp  |  14 +--
>  src/compiler/glsl/ir_reader.cpp|   4 +-
>  src/compiler/glsl/ir_validate.cpp  |   4 +-
>  src/compiler/glsl/list.h   | 136 
> +
>  src/compiler/glsl/lower_distance.cpp   |   4 +-
>  src/compiler/glsl/lower_jumps.cpp  |   2 +-
>  src/compiler/glsl/lower_packed_varyings.cpp|   8 +-
>  src/compiler/glsl/lower_tess_level.cpp |   4 +-
>  src/compiler/glsl/opt_conditional_discard.cpp  |   6 +-
>  src/compiler/glsl/opt_dead_builtin_varyings.cpp|   2 +-
>  src/compiler/glsl/opt_dead_code.cpp|   2 +-
>  src/compiler/glsl/opt_flatten_nested_if_blocks.cpp |   2 +-
>  src/compiler/nir/nir.h |   4 +-
>  src/compiler/nir/nir_opt_gcm.c |   2 +-
>  src/mesa/drivers/dri/i965/brw_cfg.h|   2 +-
>  src/mesa/drivers/dri/i965/brw_fs_builder.h |   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_builder.h   |   2 +-
>  25 files changed, 116 insertions(+), 136 deletions(-)
> 
> diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
> index 06c7b03..5029ba7 100644
> --- a/src/compiler/glsl/ast.h
> +++ b/src/compiler/glsl/ast.h
> @@ -346,8 +346,8 @@ public:
>  
> bool is_single_dimension() const
> {
> -  return this->array_dimensions.tail_pred->prev != NULL &&
> - this->array_dimensions.tail_pred->prev->is_head_sentinel();
> +  return this->array_dimensions.tail_sentinel.prev->prev != NULL &&
> + 
> this->array_dimensions.tail_sentinel.prev->prev->is_head_sentinel();
> }
>  
> virtual void print(void) const;
> diff --git a/src/compiler/glsl/ast_function.cpp 
> b/src/compiler/glsl/ast_function.cpp
> index f74394f..41b0765 100644
> --- a/src/compiler/glsl/ast_function.cpp
> +++ b/src/compiler/glsl/ast_function.cpp
> @@ -186,8 +186,8 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
>  exec_list _ir_parameters,
>  exec_list _ast_parameters)
>  {
> -   exec_node *actual_ir_node  = actual_ir_parameters.head;
> -   exec_node *actual_ast_node = actual_ast_parameters.head;
> +   exec_node *actual_ir_node  = actual_ir_parameters.head_sentinel.next;
> +   exec_node *actual_ast_node = actual_ast_parameters.head_sentinel.next;

This seems pretty annoying.  I've had this reply sitting open since
Tuesday morning... and I don't have a good solution.

Normally you'd add a function or macro to wrap this, but we already have
exec_list::get_head.  That function has the extra semantic that if the
list is empty, you get NULL.  Most of the places that don't already use
that method don't want to use it.  They either already know that the
list is not empty (as a precondition) or they're going to compare the
node pointer with the sentinel.

We could add another function... but what do you call it, and how do you
keep people from using the wrong one?  get_head_raw()?  get_head_fast()?
_head()?  first()?   Hmm... maybe change the existing functions to
get_first_node() / get_last_node() and call the new functions get_head()
/ get_tail()?

All I know is that I really don't want to type head_sentinel.next or
tail_sentinel.prev... mostly because at least once I'll accidentally
type (or cut-and-paste, or search-and-replace) tail_sentinel.next, and
I'll never find the bug.

Anyway... patches 2/7 are

Reviewed-by: Ian Romanick 

I'd like to have some kind of improvement here, but I think the
mechanics of the change are correct.

> foreach_in_list(const ir_variable, formal, >parameters) {
>/* The lists must be the same length. */
> @@ -318,10 +318,12 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
> const char *func_name = sig->function_name();
> bool is_atomic = is_atomic_function(func_name);
> if (is_atomic) {
> -  const ir_rvalue *const actual = (ir_rvalue *) 
> actual_ir_parameters.head;
> +  const ir_rvalue *const actual =
> + (ir_rvalue *) actual_ir_parameters.head_sentinel.next;
>  
>

Re: [Mesa-dev] [PATCH 3/4] anv/blit2d: Copy with stencil sources when needed

2016-07-01 Thread Jason Ekstrand
seems fine to me

On Fri, Jul 1, 2016 at 2:25 PM, Chad Versace  wrote:

> On Mon 27 Jun 2016, Nanley Chery wrote:
> > In the next patch, ISL will unconditionally perform verification of a
> > surface's tiling and usage. Since it will require that w-tiled images
> > be stencil buffers, create a stencil surface to copy from a
> > w-tiled/stencil surface.
> >
> > Signed-off-by: Nanley Chery 
>
> The patch looks correct to me, but I don't feel qualified to review this
> patch. I'm not familiar with the anv_meta_blit2d.c code. If Jason
> doesn't object, feel free to push.
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] anv/blit2d: Copy with stencil sources when needed

2016-07-01 Thread Chad Versace
On Mon 27 Jun 2016, Nanley Chery wrote:
> In the next patch, ISL will unconditionally perform verification of a
> surface's tiling and usage. Since it will require that w-tiled images
> be stencil buffers, create a stencil surface to copy from a
> w-tiled/stencil surface.
> 
> Signed-off-by: Nanley Chery 

The patch looks correct to me, but I don't feel qualified to review this
patch. I'm not familiar with the anv_meta_blit2d.c code. If Jason
doesn't object, feel free to push.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] anv/image: Simplify initialization of the isl_tiling

2016-07-01 Thread Chad Versace
On Mon 27 Jun 2016, Nanley Chery wrote:
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_image.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> index 77d9931..b3f5f5c 100644
> --- a/src/intel/vulkan/anv_image.c
> +++ b/src/intel/vulkan/anv_image.c
> @@ -120,10 +120,6 @@ make_surface(const struct anv_device *dev,
>[VK_IMAGE_TYPE_3D] = ISL_SURF_DIM_3D,
> };
>  
> -   isl_tiling_flags_t tiling_flags = anv_info->isl_tiling_flags;
> -   if (vk_info->tiling == VK_IMAGE_TILING_LINEAR)
> -  tiling_flags = ISL_TILING_LINEAR_BIT;
> -
> struct anv_surface *anv_surf = get_surface(image, aspect);
>  
> image->extent = anv_sanitize_image_extent(vk_info->imageType,
> @@ -142,7 +138,7 @@ make_surface(const struct anv_device *dev,
>.min_alignment = 0,
>.min_pitch = anv_info->stride,
>.usage = choose_isl_surf_usage(image->usage, aspect),
> -  .tiling_flags = tiling_flags);
> +  .tiling_flags = anv_info->isl_tiling_flags);
>  
> /* isl_surf_init() will fail only if provided invalid input. Invalid input
>  * is illegal in Vulkan.
> @@ -260,7 +256,9 @@ anv_CreateImage(VkDevice device,
> return anv_image_create(device,
>&(struct anv_image_create_info) {
>   .vk_info = pCreateInfo,
> - .isl_tiling_flags = ISL_TILING_ANY_MASK,
> + .isl_tiling_flags = (pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR) 
> ?
> + ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK,
> +
>},
>pAllocator,
>pImage);

I don't agree with this patch.

Locally, the patch look correct. But when you consider that
anv_image_create() is public within the driver, the patch makes the code
fragile. Pre-patch, if the caller of anv_image_create() sets
anv_image_create_info::vk_info::tiling and leaves
anv_image_create_info::isl_tiling_flags unset (which I believe should be
a valid combination), then anv_image_create() correctly converts the
VkImageTilingFlags to isl_tiling_flags. Post-patch, that's no longer the
case; anv_image_create() ignores its VkImageTiling input.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] Revert "isl: Don't filter tiling flags if a specific tiling bit is set"

2016-07-01 Thread Chad Versace
On Mon 27 Jun 2016, Nanley Chery wrote:
> This reverts commit 091f1da902c71ac8d3d27b325a118e2f683f1ae5.
> 
> Although a user may specify a specfic tiling bit, ISL should still
> prevent incompatible tiling/surface combinations.
> 
> Signed-off-by: Nanley Chery 
> ---
> 
> Prior to patch https://patchwork.freedesktop.org/patch/95338/ ,
> this change made crucible tests which attempted to make a linear
> depth image assert-fail.

Patch 4 is
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] isl: Fix isl_tiling_is_any_y()

2016-07-01 Thread Chad Versace
On Mon 27 Jun 2016, Nanley Chery wrote:
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/isl/isl.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Patch 1 is
Reviewed-by: Chad Versace 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/36] i965/blorp: Refactor interleaved multisample destination handling

2016-07-01 Thread Jason Ekstrand
On Thu, Jun 30, 2016 at 11:56 PM, Pohjolainen, Topi <
topi.pohjolai...@intel.com> wrote:

> On Wed, Jun 29, 2016 at 05:37:33PM -0700, Jason Ekstrand wrote:
> > We put all of the code for fake IMS together.  This requires moving a bit
> > of the program key setup code further down so that it gets the right
> values
> > out of the final surface.
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 71
> +---
> >  1 file changed, 34 insertions(+), 37 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > index 9a0b9bb..c253412 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > @@ -1695,28 +1695,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >unreachable("Unrecognized blorp format");
> > }
> >
> > -   if (brw->gen > 6) {
> > -  /* Gen7's rendering hardware only supports the IMS layout for
> depth and
> > -   * stencil render targets.  Blorp always maps its destination
> surface as
> > -   * a color render target (even if it's actually a depth or stencil
> > -   * buffer).  So if the destination is IMS, we'll have to map it
> as a
> > -   * single-sampled texture and interleave the samples ourselves.
> > -   */
> > -  if (dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
>
> This should have been switched to isl enum in patch 12 (Use isl_msaa_layout
> instead of intel_msaa_layout)?
>
> > - params.dst.surf.samples = 1;
> > - params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> > -  }
> > -   }
> > -
> > -   if (params.src.surf.samples > 0 && params.dst.surf.samples > 1) {
> > -  /* We are blitting from a multisample buffer to a multisample
> buffer, so
> > -   * we must preserve samples within a pixel.  This means we have to
> > -   * arrange for the WM program to run once per sample rather than
> once
> > -   * per pixel.
> > -   */
> > -  wm_prog_key.persample_msaa_dispatch = true;
> > -   }
> > -
> > /* Scaled blitting or not. */
> > wm_prog_key.blit_scaled =
> >((dst_x1 - dst_x0) == (src_x1 - src_x0) &&
> > @@ -1756,20 +1734,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> > wm_prog_key.src_samples = src_mt->num_samples;
> > wm_prog_key.dst_samples = dst_mt->num_samples;
> >
> > -   /* tex_samples and rt_samples are the sample counts that are set up
> in
> > -* SURFACE_STATE.
> > -*/
> > -   wm_prog_key.tex_samples = params.src.surf.samples;
> > -   wm_prog_key.rt_samples  = params.dst.surf.samples;
> > -
> > wm_prog_key.tex_aux_usage = params.src.aux_usage;
> >
> > -   /* tex_layout and rt_layout indicate the MSAA layout the GPU
> pipeline will
> > -* use to access the source and destination surfaces.
> > -*/
> > -   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
> > -   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
> > -
> > /* src_layout and dst_layout indicate the true MSAA layout used by
> src and
> >  * dst.
> >  */
> > @@ -1805,7 +1771,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >params.wm_push_consts.src_z = 0;
> > }
> >
> > -   if (params.dst.surf.samples <= 1 && dst_mt->num_samples > 1) {
> > +   if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
>
> And here we should have ISL enum, right?
>

I *think* you are correct that it could be changed.  However,
mt->msaa_layout is the "unmodified" version so I'm a bit paranoid about
changing it.  It will get converted to ISL eventually; If not in this
series then in the next one.
--Jason


>
> >/* We must expand the rectangle we send through the rendering
> pipeline,
> > * to account for the fact that we are mapping the destination
> region as
> > * single-sampled when it is in fact multisampled.  We must also
> align
> > @@ -1818,8 +1784,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> > * If it's UMS, then we have no choice but to set up the rendering
> > * pipeline as multisampled.
> > */
> > -  assert(dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS);
> > -  switch (dst_mt->num_samples) {
> > +  assert(params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED);
> > +  switch (params.dst.surf.samples) {
> >case 2:
> >   params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);
> >   params.y0 = ROUND_DOWN_TO(params.y0, 4);
> > @@ -1847,6 +1813,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >default:
> >   unreachable("Unrecognized sample count in
> brw_blorp_blit_params ctor");
> >}
> > +
> > +  /* Gen7's rendering hardware only supports the IMS layout for
> depth and
> > +   * stencil render targets.  Blorp always maps its destination
> surface as
> > +   * a color render target (even if it's actually a depth or stencil
> > +   * buffer).  So if 

Re: [Mesa-dev] [PATCH 15/36] i965/blorp: Move intratile offset calculations out of surface state setup

2016-07-01 Thread Jason Ekstrand
On Fri, Jul 1, 2016 at 12:24 AM, Pohjolainen, Topi <
topi.pohjolai...@intel.com> wrote:

> On Wed, Jun 29, 2016 at 05:37:34PM -0700, Jason Ekstrand wrote:
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp.c| 24
> 
> >  src/mesa/drivers/dri/i965/brw_blorp.h| 15 ++-
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  8 
> >  3 files changed, 18 insertions(+), 29 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> > index d6581d0..5e433d3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > @@ -66,9 +66,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> > info->width = minify(mt->physical_width0, level - mt->first_level);
> > info->height = minify(mt->physical_height0, level - mt->first_level);
> >
> > -   intel_miptree_get_image_offset(mt, level, layer,
> > -  >x_offset, >y_offset);
> > -
> > info->swizzle = SWIZZLE_XYZW;
> >
> > if (format == MESA_FORMAT_NONE)
> > @@ -110,6 +107,15 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> >break;
> > }
> > }
> > +
> > +   uint32_t x_offset, y_offset;
> > +   intel_miptree_get_image_offset(mt, level, layer, _offset,
> _offset);
> > +
> > +   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bs;
> > +   isl_tiling_get_intratile_offset_el(>isl_dev, info->surf.tiling,
> bs,
> > +  info->surf.row_pitch, x_offset,
> y_offset,
> > +  >bo_offset,
> > +  >tile_x_sa,
> >tile_y_sa);
> >  }
> >
> >
> > @@ -305,13 +311,6 @@ brw_blorp_emit_surface_state(struct brw_context
> *brw,
> >ISL_SURF_USAGE_TEXTURE_BIT,
> > };
> >
> > -   uint32_t offset, tile_x, tile_y;
> > -   isl_tiling_get_intratile_offset_el(>isl_dev, surf.tiling,
> > -
> isl_format_get_layout(view.format)->bs,
> > -  surf.row_pitch,
> > -  surface->x_offset,
> surface->y_offset,
> > -  , _x, _y);
> > -
> > uint32_t surf_offset;
> > uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> >ss_info.num_dwords * 4,
> ss_info.ss_align,
> > @@ -320,11 +319,12 @@ brw_blorp_emit_surface_state(struct brw_context
> *brw,
> > const uint32_t mocs = is_render_target ? ss_info.rb_mocs :
> ss_info.tex_mocs;
> >
> > isl_surf_fill_state(>isl_dev, dw, .surf = , .view = ,
> > -   .address = surface->mt->bo->offset64 + offset,
> > +   .address = surface->mt->bo->offset64 +
> surface->bo_offset,
> > .aux_surf = aux_surf, .aux_usage =
> surface->aux_usage,
> > .aux_address = aux_offset,
> > .mocs = mocs, .clear_color = clear_color,
> > -   .x_offset_sa = tile_x, .y_offset_sa = tile_y);
> > +   .x_offset_sa = surface->tile_x_sa,
> > +   .y_offset_sa = surface->tile_y_sa);
> >
> > /* Emit relocation to surface contents */
> > drm_intel_bo_emit_reloc(brw->batch.bo,
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> > index b8a8d06..fddd007 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> > @@ -104,19 +104,8 @@ struct brw_blorp_surface_info
> >  */
> > uint32_t height;
> >
> > -   /**
> > -* X offset within the surface to texture from (or render to).  For
> > -* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in
> samples, not
> > -* pixels.
> > -*/
> > -   uint32_t x_offset;
> > -
> > -   /**
> > -* Y offset within the surface to texture from (or render to).  For
> > -* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in
> samples, not
> > -* pixels.
> > -*/
> > -   uint32_t y_offset;
> > +   uint32_t bo_offset;
> > +   uint32_t tile_x_sa, tile_y_sa;
> >
> > /**
> >  * Format that should be used when setting up the surface state for
> this
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > index c253412..5696d52 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > @@ -1896,8 +1896,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >params.y1 = ALIGN(params.y1, y_align) / 2;
> >params.dst.width = ALIGN(params.dst.width, x_align) * 2;
> >params.dst.height = ALIGN(params.dst.height, y_align) / 2;
> > -  params.dst.x_offset *= 2;
> > -  params.dst.y_offset /= 2;
> > +  params.dst.tile_x_sa *= 2;
> > +  params.dst.tile_y_sa /= 2;
>
> This I had to think about 

[Mesa-dev] [Bug 96770] include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96770

Bug ID: 96770
   Summary: include/GL/mesa_glinterop.h:62: error: redefinition of
typedef ‘GLXContext’
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: emil.l.veli...@gmail.com, mar...@gmail.com,
tstel...@gmail.com

mesa: 07cc838b105dd3f34526db73064f1f21b452240e (master 12.1.0-devel)

Build error with GCC 4.4.

  Compiling src/glx/dri_common_interop.c ...
In file included from src/glx/dri_common_interop.c:33:
include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’
include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here


commit 8472045b16b3e4621553fe451a20a9ba9f0d44b6
Author: Emil Velikov 
Date:   Tue May 3 12:25:34 2016 +0100

mesa_glinterop: remove inclusion of GLX header

Since we only need partial information about the GLX symbols we can
forward declare them and drop the include. Obviously each user of the
said API will needs more than what's provides, so they'll include the
GLX header.

If they don't, the compiler will give us a nice warning ;-)

Signed-off-by: Emil Velikov 
Reviewed-by: Marek Olšák 
Tested-by: Tom Stellard 

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i965/blorp: Use flat vertex inputs instead of uniforms

2016-07-01 Thread Jason Ekstrand
On Thu, Jun 23, 2016 at 12:16 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> In addition to the actual vertex coordinates blorp will get another
> vertex input buffer providing the constants that are until now
> provided as uniforms. This will remove the need to configure push
> constants and their allocation in the pipeline.
>
> First three patches refactor the vertex data setup for blorp. The
> existing logic in blorp already supports all gens (gen6-gen9). I
> chose to change the core upload logic accordingly and simply use that
> for blorp.
>
> Patches 5-8 pack the constants in blorp programs into vec4s. By
> default compiler puts input variables two full hardware registers
> apart. Having them in vec4s drops the need to repack them.
>
> Last four patches take actual advantage of the change by dropping
> unnecessary pipeline reconfiguration.
>
> CC: Kenneth Graunke 
> CC: Jason Ekstrand 
>
> Topi Pohjolainen (18):
>   i965/draw: Expose vertex buffer state setup
>   i965: Unify vertex buffer setup
>

I like these two cleanups

Reviewed-by: Jason Ekstrand 


>   i965/blorp: Split vertex data and element setup
>   i965/blorp: Use core vertex buffer state setup
>   i965/blorp: Rename push constants to inputs
>

Reviewed-by: Jason Ekstrand 


>   i965/blorp: Share input slot between pixel kill and blend/scaled
>

I'm not sure this patch is needed and it does seem to complicate things.
I'm not 100% set on that; feel free to argue for it. :-)  See the inline
comments for details.


>   i965/blorp: Load tranformation coordinates as vec4
>

Reviewed-by: Jason Ekstrand 


>   i965/blorp: Drop LOAD_UNIFORM macro
>

I'm not sure if we want to drop the macro.  As I mentioned inline on the
next patch, using offsetof is kind-of nice which is why I made it in the
first place.  I'm not that set on offsetof but I think it merits a little
discusison.


>   i965/blorp: Store input read mask
>   i965/blorp: Add support for flat input buffer
>   i965/blorp: Tell vertex fetcher about flat inputs
>   i965/blorp: Prepare for more than two vertex attributes
>   i965/blorp: Use flat inputs instead of uniforms
>   i965/blorp: Remove support for push constants
>

Thanks for working on this!  The above 6 are

Reviewed-by: Jason Ekstrand 


>   i965/urb: Allow blorp to record current settings
>   i965/blorp: Fix the size requirement for vertex elements
>

I'm not sure about these two...  They add a lot of coupling between blorp
and the driver that's going to make it hard to share things between GL and
Vulkan.  What benchmark did they help and by how much?  Also, are they
required for the next two?  If not, let's put them last in the series so
that we can get the rest of it pushed as soon as the comments on patch 6
are sorted.


>   i965/blorp/gen7+: Stop trashing push constant allocation
>   i965/blorp/gen7+: Do not trigger push constant space reconfig
>

Yes please!

Reviewed-by: Jason Ekstrand 


>
>  src/mesa/drivers/dri/i965/brw_blorp.c |  18 +-
>  src/mesa/drivers/dri/i965/brw_blorp.h |  81 ++---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 135 +--
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  15 +-
>  src/mesa/drivers/dri/i965/brw_context.h   |  12 +-
>  src/mesa/drivers/dri/i965/brw_draw.h  |  13 ++
>  src/mesa/drivers/dri/i965/brw_draw_upload.c   |  74 +---
>  src/mesa/drivers/dri/i965/gen6_blorp.c| 240
> +-
>  src/mesa/drivers/dri/i965/gen7_blorp.c| 153 
>  src/mesa/drivers/dri/i965/gen7_urb.c  |  93 +-
>  src/mesa/drivers/dri/i965/gen8_blorp.c|  86 ++---
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c  |  41 ++---
>  12 files changed, 448 insertions(+), 513 deletions(-)
>
> --
> 2.5.5
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/18] i965/blorp: Remove support for push constants

2016-07-01 Thread Jason Ekstrand
Hooray! \o/

On Thu, Jun 23, 2016 at 12:17 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c  | 17 ++-
>  src/mesa/drivers/dri/i965/brw_blorp.h  | 15 -
>  src/mesa/drivers/dri/i965/gen6_blorp.c | 56
> +-
>  src/mesa/drivers/dri/i965/gen7_blorp.c | 41 +
>  src/mesa/drivers/dri/i965/gen8_blorp.c | 28 +
>  5 files changed, 12 insertions(+), 145 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 4d6c0ba..04c10b6 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -187,17 +187,8 @@ brw_blorp_compile_nir_shader(struct brw_context *brw,
> struct nir_shader *nir,
> struct brw_wm_prog_data wm_prog_data;
> memset(_prog_data, 0, sizeof(wm_prog_data));
>
> -   /* We set up the params array but instead of making them point at
> actual
> -* GL constant values, they just store an index.  This is just fine as
> the
> -* backend compiler never looks at the contents of the pointers, it
> just
> -* re-arranges them for us.
> -*/
> -   const union gl_constant_value
> *param[BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS];
> -   for (unsigned i = 0; i < ARRAY_SIZE(param); i++)
> -  param[i] = (const union gl_constant_value *)(intptr_t)i;
> -
> -   wm_prog_data.base.nr_params = BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS;
> -   wm_prog_data.base.param = param;
> +   wm_prog_data.base.nr_params = 0;
> +   wm_prog_data.base.param = NULL;
>
> /* BLORP always just uses the first two binding table entries */
> wm_prog_data.binding_table.render_target_start = 0;
> @@ -235,9 +226,7 @@ brw_blorp_compile_nir_shader(struct brw_context *brw,
> struct nir_shader *nir,
> prog_data->num_varying_inputs = wm_prog_data.num_varying_inputs;
> prog_data->inputs_read = nir->info.inputs_read;
>
> -   prog_data->nr_params = wm_prog_data.base.nr_params;
> -   for (unsigned i = 0; i < ARRAY_SIZE(param); i++)
> -  prog_data->param[i] = (uintptr_t)wm_prog_data.base.param[i];
> +   assert(wm_prog_data.base.nr_params == 0);
>
> return program;
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index 91d111e..d3fc713 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -233,13 +233,6 @@ struct brw_blorp_wm_inputs
> uint32_t pad[7];
>  };
>
> -#define BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS \
> -   (sizeof(struct brw_blorp_wm_inputs) / 4)
> -
> -/* Every 32 bytes of push constant data constitutes one GEN register. */
> -static const unsigned int BRW_BLORP_NUM_PUSH_CONST_REGS =
> -   sizeof(struct brw_blorp_wm_inputs) / 32;
> -
>  struct brw_blorp_prog_data
>  {
> bool dispatch_8;
> @@ -263,14 +256,6 @@ struct brw_blorp_prog_data
> uint32_t flat_inputs;
> unsigned num_varying_inputs;
> GLbitfield64 inputs_read;
> -
> -   /* The compiler will re-arrange push constants and store the upload
> order
> -* here. Given an index 'i' in the final upload buffer, param[i] gives
> the
> -* index in the uniform store. In other words, the value to be
> uploaded can
> -* be found by brw_blorp_params::wm_push_consts[param[i]].
> -*/
> -   uint8_t nr_params;
> -   uint8_t param[BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS];
>  };
>
>  inline unsigned
> diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c
> b/src/mesa/drivers/dri/i965/gen6_blorp.c
> index 8b6b8f9..c38952e 100644
> --- a/src/mesa/drivers/dri/i965/gen6_blorp.c
> +++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
> @@ -350,26 +350,6 @@ gen6_blorp_emit_cc_state_pointers(struct brw_context
> *brw,
> ADVANCE_BATCH();
>  }
>
> -
> -/* WM push constants */
> -uint32_t
> -gen6_blorp_emit_wm_constants(struct brw_context *brw,
> - const struct brw_blorp_params *params)
> -{
> -   uint32_t wm_push_const_offset;
> -
> -   uint32_t *constants = brw_state_batch(brw, AUB_TRACE_WM_CONSTANTS,
> - sizeof(params->wm_inputs),
> - 32, _push_const_offset);
> -
> -   const uint32_t *push_consts = (const uint32_t *)>wm_inputs;
> -   for (unsigned i = 0; i < params->wm_prog_data->nr_params; i++)
> -  constants[i] = push_consts[params->wm_prog_data->param[i]];
> -
> -   return wm_push_const_offset;
> -}
> -
> -
>  /* SURFACE_STATE for renderbuffer or texture surface (see
>   * brw_update_renderbuffer_surface and brw_update_texture_surface)
>   */
> @@ -755,32 +735,6 @@ gen6_blorp_emit_wm_config(struct brw_context *brw,
> ADVANCE_BATCH();
>  }
>
> -
> -static void
> -gen6_blorp_emit_constant_ps(struct brw_context *brw,
> -const struct brw_blorp_params *params,
> -uint32_t wm_push_const_offset)
> -{
> 

Re: [Mesa-dev] [PATCH 13/18] i965/blorp: Use flat inputs instead of uniforms

2016-07-01 Thread Jason Ekstrand
On Thu, Jun 23, 2016 at 12:17 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 43
> ++-
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  9 +++---
>  2 files changed, 27 insertions(+), 25 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 1683f8e..616f87e 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -331,10 +331,10 @@ enum sampler_message_arg
>
>  struct brw_blorp_blit_vars {
> /* Input values from brw_blorp_wm_inputs */
> -   nir_variable *u_discard_rect;
> -   nir_variable *u_rect_grid;
> -   nir_variable *u_coord_transform;
> -   nir_variable *u_src_z;
> +   nir_variable *v_discard_rect;
> +   nir_variable *v_rect_grid;
> +   nir_variable *v_coord_transform;
> +   nir_variable *v_src_z;
>
> /* gl_FragCoord */
> nir_variable *frag_coord;
> @@ -349,34 +349,35 @@ brw_blorp_blit_vars_init(nir_builder *b, struct
> brw_blorp_blit_vars *v,
>  {
> if (key->use_kill) {
>assert(!(key->blend && key->blit_scaled));
> -  v->u_discard_rect = nir_variable_create(b->shader, nir_var_uniform,
> +  v->v_discard_rect = nir_variable_create(b->shader,
> nir_var_shader_in,
>glsl_type::uvec4_type,
>"discard_rect");
> -  v->u_discard_rect->data.location =
> - offsetof(struct brw_blorp_wm_inputs, discard_rect);
> -  v->u_rect_grid = NULL;
> +  v->v_discard_rect->data.location = VARYING_SLOT_VAR0;
>

I think I'd prefer "VARYING_SLOT_VAR0 + offsetof() / 16".  That way, if
someone adds a new varying in the middle, slots automatically get updated.
That's the reason why I made the UNIFORM macro in the first place.


> +  v->v_discard_rect->data.interpolation = INTERP_QUALIFIER_FLAT;
> +  v->v_rect_grid = NULL;
> } else {
>/* Blending grid only has two components but loading it as vec4
> * will keep offsets for the subsequent inputs the same between
> * this and the discard branch.
> */
> -  v->u_rect_grid = nir_variable_create(b->shader, nir_var_uniform,
> +  v->v_rect_grid = nir_variable_create(b->shader, nir_var_shader_in,
> glsl_type::vec2_type,
> "rect_grid");
> -  v->u_rect_grid->data.location =
> - offsetof(struct brw_blorp_wm_inputs, rect_grid);
> -  v->u_discard_rect = NULL;
> +  v->v_rect_grid->data.location = VARYING_SLOT_VAR0;
> +  v->v_rect_grid->data.interpolation = INTERP_QUALIFIER_FLAT;
> +  v->v_discard_rect = NULL;
> }
>
> -   v->u_coord_transform = nir_variable_create(b->shader, nir_var_uniform,
> +   v->v_coord_transform = nir_variable_create(b->shader,
> nir_var_shader_in,
>glsl_type::vec4_type,
>"coord_transform");
> -   v->u_coord_transform->data.location =
> -  offsetof(struct brw_blorp_wm_inputs, x_transform.multiplier);
> +   v->v_coord_transform->data.location = VARYING_SLOT_VAR1;
> +   v->v_coord_transform->data.interpolation = INTERP_QUALIFIER_FLAT;
>
> -   v->u_src_z = nir_variable_create(b->shader, nir_var_uniform,
> +   v->v_src_z = nir_variable_create(b->shader, nir_var_shader_in,
>  glsl_type::uint_type, "src_z");
> -   v->u_src_z->data.location = offsetof(struct brw_blorp_wm_inputs,
> src_z);
> +   v->v_src_z->data.location = VARYING_SLOT_VAR2;
> +   v->v_src_z->data.interpolation = INTERP_QUALIFIER_FLAT;
>
> v->frag_coord = nir_variable_create(b->shader, nir_var_shader_in,
> glsl_vec4_type(), "gl_FragCoord");
> @@ -411,7 +412,7 @@ nir_ssa_def *
>  blorp_blit_apply_transform(nir_builder *b, nir_ssa_def *src_pos,
> struct brw_blorp_blit_vars *v)
>  {
> -   nir_ssa_def *coord_transform = nir_load_var(b, v->u_coord_transform);
> +   nir_ssa_def *coord_transform = nir_load_var(b, v->v_coord_transform);
>
> nir_ssa_def *offset = nir_vec2(b, nir_channel(b, coord_transform, 1),
>   nir_channel(b, coord_transform, 3));
> @@ -426,7 +427,7 @@ blorp_nir_discard_if_outside_rect(nir_builder *b,
> nir_ssa_def *pos,
>struct brw_blorp_blit_vars *v)
>  {
> nir_ssa_def *c0, *c1, *c2, *c3;
> -   nir_ssa_def *discard_rect = nir_load_var(b, v->u_discard_rect);
> +   nir_ssa_def *discard_rect = nir_load_var(b, v->v_discard_rect);
> nir_ssa_def *dst_x0 = nir_channel(b, discard_rect, 0);
> nir_ssa_def *dst_x1 = nir_channel(b, discard_rect, 1);
> nir_ssa_def *dst_y0 = nir_channel(b, discard_rect, 2);
> @@ -514,7 

Re: [Mesa-dev] [PATCH 06/18] i965/blorp: Share input slot between pixel kill and blend/scaled

2016-07-01 Thread Jason Ekstrand
On Thu, Jun 23, 2016 at 12:17 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> These are never used in parallel, lets document this.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.h | 47 ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 82
> ++-
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  6 +-
>  3 files changed, 97 insertions(+), 38 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index a4036c1..6f3581c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -179,15 +179,48 @@ struct brw_blorp_coord_transform
> float offset;
>  };
>
> +/**
> + * Bounding rectangle telling pixel discard which pixels are not to be
> + * touched. This is needed in when surfaces are configured as something
> else
> + * what they really are:
> + *
> + *- writing W-tiled stencil as Y-tiled
> + *- writing interleaved multisampled as single sampled.
> + *
> + * See blorp_nir_discard_if_outside_rect().
> + */
> +struct brw_blorp_discard_rect
> +{
> +   uint32_t x0;
> +   uint32_t x1;
> +   uint32_t y0;
> +   uint32_t y1;
> +};
> +
> +/**
> + * Grid needed for blended and scaled blits of integer formats, see
> + * blorp_nir_manual_blend_bilinear().
> + */
> +struct brw_blorp_rect_grid
> +{
> +   float x1;
> +   float y1;
> +};
> +
>  struct brw_blorp_wm_inputs
>  {
> -   uint32_t dst_x0;
> -   uint32_t dst_x1;
> -   uint32_t dst_y0;
> -   uint32_t dst_y1;
> +   /* Blended and scaled blits never use pixel discard meaning
> +* blorp_nir_discard_if_outside_rect() and
> blorp_nir_manual_blend_bilinear()
> +* can emit code using the same input slot. Clear color in turn is only
> +* used by clear programs.
> +*/
> +   union {
> +  struct brw_blorp_discard_rect discard_rect;
> +  struct brw_blorp_rect_grid rect_grid;
> +  union gl_color_union clear_color;
> +   };
> +
> /* Top right coordinates of the rectangular grid used for scaled
> blitting */
> -   float rect_grid_x1;
> -   float rect_grid_y1;
> struct brw_blorp_coord_transform x_transform;
> struct brw_blorp_coord_transform y_transform;
>
> @@ -197,7 +230,7 @@ struct brw_blorp_wm_inputs
> uint32_t src_z;
>
> /* Pad out to an integral number of registers */
> -   uint32_t pad[5];
> +   uint32_t pad[7];
>

Also, now that we're re-arranging push constants and now inputs, this bit
of padding isn't needed.  I should have deleted it when I moved everything
to NIR.


>  };
>
>  #define BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS \
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 16d2504..a9eac60 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -331,12 +331,8 @@ enum sampler_message_arg
>
>  struct brw_blorp_blit_vars {
> /* Input values from brw_blorp_wm_inputs */
> -   nir_variable *u_dst_x0;
> -   nir_variable *u_dst_x1;
> -   nir_variable *u_dst_y0;
> -   nir_variable *u_dst_y1;
> -   nir_variable *u_rect_grid_x1;
> -   nir_variable *u_rect_grid_y1;
> +   nir_variable *u_discard_rect;
> +   nir_variable *u_rect_grid;
> struct {
>nir_variable *multiplier;
>nir_variable *offset;
> @@ -354,17 +350,32 @@ static void
>  brw_blorp_blit_vars_init(nir_builder *b, struct brw_blorp_blit_vars *v,
>   const struct brw_blorp_blit_prog_key *key)
>  {
> +   if (key->use_kill) {
> +  assert(!(key->blend && key->blit_scaled));
> +  v->u_discard_rect = nir_variable_create(b->shader, nir_var_uniform,
> +  glsl_type::uvec4_type,
> +  "discard_rect");
> +  v->u_discard_rect->data.location =
> + offsetof(struct brw_blorp_wm_inputs, discard_rect);
> +  v->u_rect_grid = NULL;
> +   } else {
> +  /* Blending grid only has two components but loading it as vec4
> +   * will keep offsets for the subsequent inputs the same between
> +   * this and the discard branch.
> +   */
> +  v->u_rect_grid = nir_variable_create(b->shader, nir_var_uniform,
> +   glsl_type::vec2_type,
> +   "rect_grid");
> +  v->u_rect_grid->data.location =
> + offsetof(struct brw_blorp_wm_inputs, rect_grid);
> +  v->u_discard_rect = NULL;
> +   }
> +
>  #define LOAD_UNIFORM(name, type)\
> v->u_##name = nir_variable_create(b->shader, nir_var_uniform, type,
> #name); \
> v->u_##name->data.location = \
>offsetof(struct brw_blorp_wm_inputs, name);
>
> -   LOAD_UNIFORM(dst_x0, glsl_uint_type())
> -   LOAD_UNIFORM(dst_x1, glsl_uint_type())
> -   LOAD_UNIFORM(dst_y0, glsl_uint_type())
> -   LOAD_UNIFORM(dst_y1, glsl_uint_type())
> -   

Re: [Mesa-dev] [PATCH 06/18] i965/blorp: Share input slot between pixel kill and blend/scaled

2016-07-01 Thread Jason Ekstrand
Since we're using inputs_read and the packing from the fs compiler, is this
really needed?  It seems like we could just add two floats worth of padding
after rect_grid_y1 to pad it to a vec4.

On Thu, Jun 23, 2016 at 12:17 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> These are never used in parallel, lets document this.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.h | 47 ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 82
> ++-
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  6 +-
>  3 files changed, 97 insertions(+), 38 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index a4036c1..6f3581c 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -179,15 +179,48 @@ struct brw_blorp_coord_transform
> float offset;
>  };
>
> +/**
> + * Bounding rectangle telling pixel discard which pixels are not to be
> + * touched. This is needed in when surfaces are configured as something
> else
> + * what they really are:
> + *
> + *- writing W-tiled stencil as Y-tiled
> + *- writing interleaved multisampled as single sampled.
> + *
> + * See blorp_nir_discard_if_outside_rect().
> + */
> +struct brw_blorp_discard_rect
> +{
> +   uint32_t x0;
> +   uint32_t x1;
> +   uint32_t y0;
> +   uint32_t y1;
> +};
> +
> +/**
> + * Grid needed for blended and scaled blits of integer formats, see
> + * blorp_nir_manual_blend_bilinear().
> + */
> +struct brw_blorp_rect_grid
> +{
> +   float x1;
> +   float y1;
> +};
> +
>  struct brw_blorp_wm_inputs
>  {
> -   uint32_t dst_x0;
> -   uint32_t dst_x1;
> -   uint32_t dst_y0;
> -   uint32_t dst_y1;
> +   /* Blended and scaled blits never use pixel discard meaning
> +* blorp_nir_discard_if_outside_rect() and
> blorp_nir_manual_blend_bilinear()
> +* can emit code using the same input slot. Clear color in turn is only
> +* used by clear programs.
> +*/
> +   union {
> +  struct brw_blorp_discard_rect discard_rect;
> +  struct brw_blorp_rect_grid rect_grid;
> +  union gl_color_union clear_color;
> +   };
> +
> /* Top right coordinates of the rectangular grid used for scaled
> blitting */
> -   float rect_grid_x1;
> -   float rect_grid_y1;
> struct brw_blorp_coord_transform x_transform;
> struct brw_blorp_coord_transform y_transform;
>
> @@ -197,7 +230,7 @@ struct brw_blorp_wm_inputs
> uint32_t src_z;
>
> /* Pad out to an integral number of registers */
> -   uint32_t pad[5];
> +   uint32_t pad[7];
>  };
>
>  #define BRW_BLORP_NUM_PUSH_CONSTANT_DWORDS \
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 16d2504..a9eac60 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -331,12 +331,8 @@ enum sampler_message_arg
>
>  struct brw_blorp_blit_vars {
> /* Input values from brw_blorp_wm_inputs */
> -   nir_variable *u_dst_x0;
> -   nir_variable *u_dst_x1;
> -   nir_variable *u_dst_y0;
> -   nir_variable *u_dst_y1;
> -   nir_variable *u_rect_grid_x1;
> -   nir_variable *u_rect_grid_y1;
> +   nir_variable *u_discard_rect;
> +   nir_variable *u_rect_grid;
> struct {
>nir_variable *multiplier;
>nir_variable *offset;
> @@ -354,17 +350,32 @@ static void
>  brw_blorp_blit_vars_init(nir_builder *b, struct brw_blorp_blit_vars *v,
>   const struct brw_blorp_blit_prog_key *key)
>  {
> +   if (key->use_kill) {
> +  assert(!(key->blend && key->blit_scaled));
> +  v->u_discard_rect = nir_variable_create(b->shader, nir_var_uniform,
> +  glsl_type::uvec4_type,
> +  "discard_rect");
> +  v->u_discard_rect->data.location =
> + offsetof(struct brw_blorp_wm_inputs, discard_rect);
> +  v->u_rect_grid = NULL;
> +   } else {
> +  /* Blending grid only has two components but loading it as vec4
> +   * will keep offsets for the subsequent inputs the same between
> +   * this and the discard branch.
> +   */
> +  v->u_rect_grid = nir_variable_create(b->shader, nir_var_uniform,
> +   glsl_type::vec2_type,
> +   "rect_grid");
> +  v->u_rect_grid->data.location =
> + offsetof(struct brw_blorp_wm_inputs, rect_grid);
> +  v->u_discard_rect = NULL;
> +   }
> +
>  #define LOAD_UNIFORM(name, type)\
> v->u_##name = nir_variable_create(b->shader, nir_var_uniform, type,
> #name); \
> v->u_##name->data.location = \
>offsetof(struct brw_blorp_wm_inputs, name);
>
> -   LOAD_UNIFORM(dst_x0, glsl_uint_type())
> -   LOAD_UNIFORM(dst_x1, glsl_uint_type())
> -   LOAD_UNIFORM(dst_y0, glsl_uint_type())
> -   

Re: [Mesa-dev] [PATCH 02/18] i965: Unify vertex buffer setup

2016-07-01 Thread Jason Ekstrand
I don't know what I think about re-using the VB setup in blorp, but these
two patches are a very nice cleanup in their own right.  1 and 2 are

Reviewed-by: Jason Ekstrand 

On Thu, Jun 23, 2016 at 12:16 PM, Topi Pohjolainen <
topi.pohjolai...@intel.com> wrote:

> On gen >= 8 one doesn't provide ending address but number of bytes
> available. This is relative to the given offset.
>
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_draw_upload.c  | 34 +++
>  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 41
> 
>  2 files changed, 46 insertions(+), 29 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> index 623e4ab..fdb1b35 100644
> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> @@ -708,7 +708,9 @@ brw_emit_vertex_buffer_state(struct brw_context *brw,
> struct gl_context *ctx = >ctx;
> uint32_t dw0;
>
> -   if (brw->gen >= 6) {
> +   if (brw->gen >= 8) {
> +  dw0 = buffer_nr << GEN6_VB0_INDEX_SHIFT;
> +   } else if (brw->gen >= 6) {
>dw0 = (buffer_nr << GEN6_VB0_INDEX_SHIFT) |
>  (step_rate ? GEN6_VB0_ACCESS_INSTANCEDATA
> : GEN6_VB0_ACCESS_VERTEXDATA);
> @@ -721,15 +723,35 @@ brw_emit_vertex_buffer_state(struct brw_context *brw,
> if (brw->gen >= 7)
>dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
>
> -   if (brw->gen == 7)
> +   switch (brw->gen) {
> +   case 7:
>dw0 |= GEN7_MOCS_L3 << 16;
> +  break;
> +   case 8:
> +  dw0 |= BDW_MOCS_WB << 16;
> +  break;
> +   case 9:
> +  dw0 |= SKL_MOCS_WB << 16;
> +  break;
> +   }
>
> WARN_ONCE(stride >= (brw->gen >= 5 ? 2048 : 2047),
>   "VBO stride %d too large, bad rendering may occur\n",
>   stride);
> OUT_BATCH(dw0 | (stride << BRW_VB0_PITCH_SHIFT));
> -   OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, start_offset);
> -   if (brw->gen >= 5) {
> +   if (brw->gen >= 8) {
> +  OUT_RELOC64(bo, I915_GEM_DOMAIN_VERTEX, 0, start_offset);
> +  /* From the BSpec: 3D Pipeline Stages - 3D Pipeline Geometry -
> +   * Vertex Fetch (VF) Stage - State
> +   *
> +   * Instead of "VBState.StartingBufferAddress + VBState.MaxIndex x
> +   * VBState.BufferPitch", the address of the byte immediately beyond
> the
> +   * last valid byte of the buffer is determined by
> +   * "VBState.StartingBufferAddress + VBState.BufferSize".
> +   */
> +  OUT_BATCH(end_offset - start_offset);
> +   } else if (brw->gen >= 5) {
> +  OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, start_offset);
>/* From the BSpec: 3D Pipeline Stages - 3D Pipeline Geometry -
> * Vertex Fetch (VF) Stage - State
> *
> @@ -739,10 +761,12 @@ brw_emit_vertex_buffer_state(struct brw_context *brw,
> *  "VBState.EndAddress + 1".
> */
>OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, end_offset - 1);
> +  OUT_BATCH(step_rate);
> } else {
> +  OUT_RELOC(bo, I915_GEM_DOMAIN_VERTEX, 0, start_offset);
>OUT_BATCH(0);
> +  OUT_BATCH(step_rate);
> }
> -   OUT_BATCH(step_rate);
>
> return __map;
>  }
> diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> index 5b3f673..d2c7853 100644
> --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> @@ -52,7 +52,6 @@ static void
>  gen8_emit_vertices(struct brw_context *brw)
>  {
> struct gl_context *ctx = >ctx;
> -   uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
> bool uses_edge_flag;
>
> brw_prepare_vertices(brw);
> @@ -141,35 +140,29 @@ gen8_emit_vertices(struct brw_context *brw)
>BEGIN_BATCH(1 + 4 * nr_buffers);
>OUT_BATCH((_3DSTATE_VERTEX_BUFFERS << 16) | (4 * nr_buffers - 1));
>for (unsigned i = 0; i < brw->vb.nr_buffers; i++) {
> - struct brw_vertex_buffer *buffer = >vb.buffers[i];
> - uint32_t dw0 = 0;
> -
> - dw0 |= i << GEN6_VB0_INDEX_SHIFT;
> - dw0 |= GEN7_VB0_ADDRESS_MODIFYENABLE;
> - dw0 |= buffer->stride << BRW_VB0_PITCH_SHIFT;
> - dw0 |= mocs_wb << 16;
> -
> - OUT_BATCH(dw0);
> - OUT_RELOC64(buffer->bo, I915_GEM_DOMAIN_VERTEX, 0,
> buffer->offset);
> - OUT_BATCH(buffer->size);
> + const struct brw_vertex_buffer *buffer = >vb.buffers[i];
> + EMIT_VERTEX_BUFFER_STATE(brw, i, buffer->bo,
> +  buffer->offset,
> +  buffer->offset + buffer->size,
> +  buffer->stride, 0 /* unused */);
>}
>
>if (uses_draw_params) {
> - OUT_BATCH(brw->vb.nr_buffers << GEN6_VB0_INDEX_SHIFT |
> -   GEN7_VB0_ADDRESS_MODIFYENABLE |
> 

Re: [Mesa-dev] [PATCH 2/2] mesa: Add -fno-math-errno -fno-trapping-math to CXXFLAGS.

2016-07-01 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Jul 1, 2016 at 12:59 AM, Matt Turner  wrote:
> Not sure why I forgot to add them to CXXFLAGS in commit f55c408067 or
> commit 875458b778. Cuts about 1k of .text.
>
>text data  bss  dec  hex  filename
> 5806354   28781629384  6123554   5d7022  i965_dri.so before
> 5805497   28774429384  6122625   5d6c81  i965_dri.so after
> ---
>  configure.ac | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 95cfc36..e8cd97f 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -339,6 +339,9 @@ if test "x$GXX" = xyes; then
>
>  # Restore CXXFLAGS; VISIBILITY_CXXFLAGS are added to it where needed.
>  CXXFLAGS=$save_CXXFLAGS
> +
> +# We don't want floating-point math functions to set errno or trap
> +CXXFLAGS="$CXXFLAGS -fno-math-errno -fno-trapping-math"
>  fi
>
>  AC_SUBST([MSVC2013_COMPAT_CFLAGS])
> --
> 2.7.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96765] BindFragDataLocationIndexed on array fragment shader output.

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96765

Bug ID: 96765
   Summary: BindFragDataLocationIndexed on array fragment shader
output.
   Product: Mesa
   Version: 11.2
  Hardware: Other
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: coren...@wallez.net
QA Contact: mesa-dev@lists.freedesktop.org

While trying to make Chromium use the 3.3 core profile of Mesa 11.2, the
following bug showed up.

When trying to use an array fragment varying variable as the secondary fragment
color (index = 0) the driver seems to decide to set it to another location
instead. Causing nothing to be rendered. The same code with non-array variables
provides the expected result.

Repro case at
https://github.com/Kangz/GLDriverBugs/blob/master/blend_func_extended_array/Main.cpp
running it shows that SecondaryFragData[0] is set to location 1 index 0 instead
of location 0 index 1.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96765] BindFragDataLocationIndexed on array fragment shader output.

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96765

Corentin Wallez  changed:

   What|Removed |Added

 CC||marche...@icps.u-strasbg.fr

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa/main: fix error checking logic on CopyImageSubData

2016-07-01 Thread Alejandro Piñeiro
For the case (both src or dst) where we had a texobject, but the
texobject target was not the same that the method target, this spec
paragraph was appplied:

 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  * "An INVALID_VALUE error is generated if either name does not
  * correspond to a valid renderbuffer or texture object according
  * to the corresponding target parameter."
  */

But for that case, the correct spec paragraph should be:
 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  * "An INVALID_ENUM error is generated if either target is
  *  not RENDERBUFFER or a valid non-proxy texture target;
  *  is TEXTURE_BUFFER or one of the cubemap face selectors
  *  described in table 8.18; or if the target does not
  *  match the type of the object."
  */

specifically the last sentence: "or if the target does not match the
type of the object".

This patch fixes the error returned (s/INVALID/ENUM) for that case,
and moves up the INVALID_VALUE spec paragraph, as that case (invalid
texture object) was handled before.

Fixes:
GL44-CTS.copy_image.target_miss_match
---

Note to Mark Janes: this patch causes the piglit test
arb_copy_image-api_errors to start failing. I will send a patch to
piglit in short.

 src/mesa/main/copyimage.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
index 67a795f..4d18bed 100644
--- a/src/mesa/main/copyimage.c
+++ b/src/mesa/main/copyimage.c
@@ -139,6 +139,12 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum 
target,
   struct gl_texture_object *texObj = _mesa_lookup_texture(ctx, name);
 
   if (!texObj) {
+ /*
+  * From GL_ARB_copy_image specification:
+  * "INVALID_VALUE is generated if either  or  does
+  * not correspond to a valid renderbuffer or texture object according
+  * to the corresponding target parameter."
+  */
  _mesa_error(ctx, GL_INVALID_VALUE,
  "glCopyImageSubData(%sName = %u)", dbg_prefix, name);
  return false;
@@ -155,12 +161,11 @@ prepare_target(struct gl_context *ctx, GLuint name, 
GLenum target,
   /* Note that target will not be a cube face name */
   if (texObj->Target != target) {
  /*
-  * From GL_ARB_copy_image specification:
-  * "INVALID_VALUE is generated if either  or  does
-  * not correspond to a valid renderbuffer or texture object according
-  * to the corresponding target parameter."
+  * From GL_ARB_copy_image_specification:
+  * "INVALID_ENUM is generated if the target does not match the type
+  * of the object."
   */
- _mesa_error(ctx, GL_INVALID_VALUE,
+ _mesa_error(ctx, GL_INVALID_ENUM,
  "glCopyImageSubData(%sTarget = %s)", dbg_prefix,
  _mesa_enum_to_string(target));
  return false;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCHv5 2/4] anv: use cache uuid based on the build timestamp.

2016-07-01 Thread Emil Velikov
From: Emil Velikov 

Do not rely on the git sha1:
 - its current truncated form makes it less unique
 - it does not attribute for local (Vulkand or otherwise) changes

Use a timestamp produced at the time of build. It's perfectly unique,
unless someone explicitly thinkers with their system clock. Even then
chances of producing the exact same one are very small, if not zero.

v2: Remove .tmp rule. Its not needed since we want for the header to be
regenerated on each time we call make (Eric).

v3:
 - Honour SOURCE_DATE_EPOCH, to make the build reproducible (Michel)
 - Replace the generated header with a define, to prevent needless
builds on consecutive `make' and/or `make install' calls. (Dave)

v4:
 - Keep the timestamp generation at make time. (Jason)

v5:
 - Ensure that file is regenerated on incremental builds.

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Emil Velikov 
---
 configure.ac  |  2 ++
 src/intel/vulkan/Makefile.am  | 10 +-
 src/intel/vulkan/anv_device.c |  4 ++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index 8321e8e..3d32c53 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2620,6 +2620,8 @@ AC_SUBST([XA_MINOR], $XA_MINOR)
 AC_SUBST([XA_TINY], $XA_TINY)
 AC_SUBST([XA_VERSION], "$XA_MAJOR.$XA_MINOR.$XA_TINY")
 
+AC_SUBST([TIMESTAMP_CMD], '`test $(SOURCE_DATE_EPOCH) && echo 
$(SOURCE_DATE_EPOCH) || date +%s`')
+
 AC_ARG_ENABLE(valgrind,
   [AS_HELP_STRING([--enable-valgrind],
  [Build mesa with valgrind support (default: 
auto)])],
diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
index 4d9ff90..3f49020 100644
--- a/src/intel/vulkan/Makefile.am
+++ b/src/intel/vulkan/Makefile.am
@@ -131,8 +131,16 @@ anv_entrypoints.c : anv_entrypoints_gen.py 
$(vulkan_include_HEADERS)
$(AM_V_GEN) cat $(vulkan_include_HEADERS) |\
$(PYTHON2) $(srcdir)/anv_entrypoints_gen.py code > $@
 
+.PHONY: anv_timestamp.h
+
+anv_timestamp.h:
+   @echo "Updating anv_timestamp.h"
+   $(AM_V_GEN) echo "#define ANV_TIMESTAMP \"$(TIMESTAMP_CMD)\"" > $@
+
+anv_device.$(OBJEXT): anv_timestamp.h
+
 BUILT_SOURCES = $(VULKAN_GENERATED_FILES)
-CLEANFILES = $(BUILT_SOURCES) dev_icd.json
+CLEANFILES = $(BUILT_SOURCES) dev_icd.json anv_timestamp.h
 EXTRA_DIST = \
$(top_srcdir)/include/vulkan/vk_icd.h \
anv_entrypoints_gen.py \
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index ea8e875..dd941b6 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -28,7 +28,7 @@
 #include 
 
 #include "anv_private.h"
-#include "git_sha1.h"
+#include "anv_timestamp.h"
 #include "util/strtod.h"
 #include "util/debug.h"
 
@@ -426,7 +426,7 @@ void
 anv_device_get_cache_uuid(void *uuid)
 {
memset(uuid, 0, VK_UUID_SIZE);
-   snprintf(uuid, VK_UUID_SIZE, "anv-%s", MESA_GIT_SHA1 + 4);
+   snprintf(uuid, VK_UUID_SIZE, "anv-%s", ANV_TIMESTAMP);
 }
 
 void anv_GetPhysicalDeviceProperties(
-- 
2.8.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Andy Furniss

Christian König wrote:

Am 01.07.2016 um 18:02 schrieb Andy Furniss:

Christian König wrote:

Am 01.07.2016 um 16:42 schrieb Andy Furniss:

Boyuan Zhang wrote:

Signed-off-by: Boyuan Zhang 


Is this supposed to be the same functionally as the first version?

I notice on Tonga that I previously got cabac, but don't now.

I see the new options are now in radeon_vce_52.c whereas before
radeon_vce_40_2_2.c was changed - is this why?


We wanted to keep the code for the old firmware versions as it is,
because we otherwise would need to test them again.

Because of this we moved all modifications into radeon_vce_52.c.


Oh OK, I guess it's early days but cabac was one thing that seemed OK.

There are many issued on Tonga using vaapi - but then I suppose
gstreamer and ffmpeg are not really set up for it yet.

eg. worse first = gpu vm fault/lock with gstreamer but not ffmpeg.

to provoke do

gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12, width=1920,
height=1080, framerate=60/1 ! vaapih264enc | fakesink

then flip away/back to another desktop soon will lock.

Maybe I am too early eg. other patches needed? or does what you say
above mean that vaapienc shouldn't even be enabled for older chips?


Tonga should be perfectly supported, you just need recent enough firmware.

What does "dmesg | grep 'Found VCE firmware Version'" give you?


Found VCE firmware Version: 50.17 Binary ID: 3


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Christian König

Am 01.07.2016 um 18:02 schrieb Andy Furniss:

Christian König wrote:

Am 01.07.2016 um 16:42 schrieb Andy Furniss:

Boyuan Zhang wrote:

Signed-off-by: Boyuan Zhang 


Is this supposed to be the same functionally as the first version?

I notice on Tonga that I previously got cabac, but don't now.

I see the new options are now in radeon_vce_52.c whereas before
radeon_vce_40_2_2.c was changed - is this why?


We wanted to keep the code for the old firmware versions as it is,
because we otherwise would need to test them again.

Because of this we moved all modifications into radeon_vce_52.c.


Oh OK, I guess it's early days but cabac was one thing that seemed OK.

There are many issued on Tonga using vaapi - but then I suppose
gstreamer and ffmpeg are not really set up for it yet.

eg. worse first = gpu vm fault/lock with gstreamer but not ffmpeg.

to provoke do

gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12, width=1920, 
height=1080, framerate=60/1 ! vaapih264enc | fakesink


then flip away/back to another desktop soon will lock.

Maybe I am too early eg. other patches needed? or does what you say
above mean that vaapienc shouldn't even be enabled for older chips?


Tonga should be perfectly supported, you just need recent enough firmware.

What does "dmesg | grep 'Found VCE firmware Version'" give you?

Regards,
Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Andy Furniss

Christian König wrote:

Am 01.07.2016 um 16:42 schrieb Andy Furniss:

Boyuan Zhang wrote:

Signed-off-by: Boyuan Zhang 


Is this supposed to be the same functionally as the first version?

I notice on Tonga that I previously got cabac, but don't now.

I see the new options are now in radeon_vce_52.c whereas before
radeon_vce_40_2_2.c was changed - is this why?


We wanted to keep the code for the old firmware versions as it is,
because we otherwise would need to test them again.

Because of this we moved all modifications into radeon_vce_52.c.


Oh OK, I guess it's early days but cabac was one thing that seemed OK.

There are many issued on Tonga using vaapi - but then I suppose
gstreamer and ffmpeg are not really set up for it yet.

eg. worse first = gpu vm fault/lock with gstreamer but not ffmpeg.

to provoke do

gst-launch-1.0 videotestsrc ! video/x-raw, format=NV12, width=1920, 
height=1080, framerate=60/1 ! vaapih264enc | fakesink


then flip away/back to another desktop soon will lock.

Maybe I am too early eg. other patches needed? or does what you say
above mean that vaapienc shouldn't even be enabled for older chips?


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96639] st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96639

Nicolai Hähnle  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96639] st/mesa: transfer_map with too-high level with dEQP-GLES2.functional.texture.completeness.cube.extra_level

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96639

--- Comment #1 from Nicolai Hähnle  ---
Fixed in master commit 07cc838b105dd3f34526db73064f1f21b452240e.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 96629] dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0: Assertion `width >= 1' failed.

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96629

Nicolai Hähnle  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Nicolai Hähnle  ---
Fixed in master commit 0ba053b34c29106817568996ac53b41029cf4e4c

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Christian König

Am 01.07.2016 um 16:42 schrieb Andy Furniss:

Boyuan Zhang wrote:

Signed-off-by: Boyuan Zhang 


Is this supposed to be the same functionally as the first version?

I notice on Tonga that I previously got cabac, but don't now.

I see the new options are now in radeon_vce_52.c whereas before
radeon_vce_40_2_2.c was changed - is this why?


We wanted to keep the code for the old firmware versions as it is, 
because we otherwise would need to test them again.


Because of this we moved all modifications into radeon_vce_52.c.

Regards,
Christian.



TIA

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] swr: clean up c++ feature flag test to not deal with IFS

2016-07-01 Thread Chuck Atkins
Cc: Tim Rowley 
Signed-off-by: Chuck Atkins 
---
 configure.ac | 43 ++-
 1 file changed, 18 insertions(+), 25 deletions(-)

diff --git a/configure.ac b/configure.ac
index 8321e8e..844e91c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2331,37 +2331,30 @@ swr_llvm_check() {
 }
 
 swr_require_cxx_feature_flags() {
-feature_name="$1"
-preprocessor_test="$2"
-option_list="$3"
-output_var="$4"
+local feature_name="$1"
+local output_var="$2"
+local preprocessor_test="$3"
+shift 3
 
 AC_MSG_CHECKING([whether $CXX supports $feature_name])
 AC_LANG_PUSH([C++])
 save_CXXFLAGS="$CXXFLAGS"
-save_IFS="$IFS"
-IFS=","
 found=0
-for opts in $option_list
-do
-unset IFS
-CXXFLAGS="$opts $save_CXXFLAGS"
+while test $# -gt 0; do
+CXXFLAGS="$save_CXXFLAGS $1"
 AC_COMPILE_IFELSE(
 [AC_LANG_PROGRAM(
 [   #if !($preprocessor_test)
 #error
 #endif
 ])],
-[found=1; break],
-[])
-IFS=","
+[eval "$output_var=\"\$1\""; found=1; break;],
+[shift])
 done
-IFS="$save_IFS"
 CXXFLAGS="$save_CXXFLAGS"
 AC_LANG_POP([C++])
 if test $found -eq 1; then
-AC_MSG_RESULT([$opts])
-eval "$output_var=\$opts"
+AC_MSG_RESULT([ifelse([$1],,[yes],[$1])])
 return 0
 fi
 AC_MSG_RESULT([no])
@@ -2438,19 +2431,19 @@ if test -n "$with_gallium_drivers"; then
 xswr)
 swr_llvm_check "swr"
 
-swr_require_cxx_feature_flags "C++11" "__cplusplus >= 201103L" \
-",-std=c++11" \
-SWR_CXX11_CXXFLAGS
+swr_require_cxx_feature_flags "C++11" SWR_CXX11_CXXFLAGS \
+"__cplusplus >= 201103L" \
+"" "-std=c++11"
 AC_SUBST([SWR_CXX11_CXXFLAGS])
 
-swr_require_cxx_feature_flags "AVX" "defined(__AVX__)" \
-",-mavx,-march=core-avx" \
-SWR_AVX_CXXFLAGS
+swr_require_cxx_feature_flags "AVX" SWR_AVX_CXXFLAGS \
+"defined(__AVX__)" \
+"" "-mavx" "-march=core-avx"
 AC_SUBST([SWR_AVX_CXXFLAGS])
 
-swr_require_cxx_feature_flags "AVX2" "defined(__AVX2__)" \
-",-mavx2 -mfma -mbmi2 -mf16c,-march=core-avx2" \
-SWR_AVX2_CXXFLAGS
+swr_require_cxx_feature_flags "AVX2" SWR_AVX2_CXXFLAGS \
+"defined(__AVX2__)" \
+"" "-mavx2 -mfma -mbmi2 -mf16c" "-march=core-avx2"
 AC_SUBST([SWR_AVX2_CXXFLAGS])
 
 HAVE_GALLIUM_SWR=yes
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/36] i965/blorp: Add an isl_view to blorp_surface_info

2016-07-01 Thread Jason Ekstrand
On Jul 1, 2016 1:04 AM, "Pohjolainen, Topi" 
wrote:
>
> On Wed, Jun 29, 2016 at 05:37:35PM -0700, Jason Ekstrand wrote:
> > Eventually, this will be the actual view that gets passed into isl to
> > create the surface state.  For now, we just use it for the format and
the
> > swizzle.
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp.c | 38
+++
> >  src/mesa/drivers/dri/i965/brw_blorp.h | 16 ++-
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 34

> >  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  2 +-
> >  src/mesa/drivers/dri/i965/gen8_blorp.c| 29 
> >  5 files changed, 64 insertions(+), 55 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
b/src/mesa/drivers/dri/i965/brw_blorp.c
> > index 5e433d3..df92822 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> > @@ -43,9 +43,11 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> >  * using INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, then it
had better
> >  * be a multiple of num_samples.
> >  */
> > +   unsigned layer_multiplier = 1;
> > if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
> > mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
> >assert(mt->num_samples <= 1 || layer % mt->num_samples == 0);
> > +  layer_multiplier = MAX2(mt->num_samples, 1);
> > }
> >
> > intel_miptree_check_level_layer(mt, level, layer);
> > @@ -61,13 +63,27 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> >info->aux_usage = ISL_AUX_USAGE_NONE;
> > }
> >
> > +   info->view = (struct isl_view) {
> > +  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> > +  ISL_SURF_USAGE_TEXTURE_BIT,
> > +  .format = ISL_FORMAT_UNSUPPORTED, /* Set later */
> > +  .base_level = level,
> > +  .levels = 1,
> > +  .base_array_layer = layer / layer_multiplier,
> > +  .array_len = 1,
> > +  .channel_select = {
> > + ISL_CHANNEL_SELECT_RED,
> > + ISL_CHANNEL_SELECT_GREEN,
> > + ISL_CHANNEL_SELECT_BLUE,
> > + ISL_CHANNEL_SELECT_ALPHA,
> > +  },
> > +   };
> > +
> > info->level = level;
> > info->layer = layer;
> > info->width = minify(mt->physical_width0, level - mt->first_level);
> > info->height = minify(mt->physical_height0, level -
mt->first_level);
> >
> > -   info->swizzle = SWIZZLE_XYZW;
> > -
> > if (format == MESA_FORMAT_NONE)
> >format = mt->format;
> >
> > @@ -75,8 +91,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> > case MESA_FORMAT_S_UINT8:
> >assert(info->surf.tiling == ISL_TILING_W);
> >/* Prior to Broadwell, we can't render to R8_UINT */
> > -  info->brw_surfaceformat = brw->gen >= 8 ?
BRW_SURFACEFORMAT_R8_UINT :
> > -
BRW_SURFACEFORMAT_R8_UNORM;
> > +  info->view.format = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> > +  BRW_SURFACEFORMAT_R8_UNORM;
>
> isl_view::format is of the type "enum isl_format" but we assigned it with
> BRW_SURFACEFORMAT?

ISL uses the hardware values so the two are interchangeable. I haven't yet
gone through the dri driver and deleted the BRW_SURFACEFORMAT defines.  I
could, however fix this particular case while I'm in the neighborhood.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/36] i965/blorp: Make sample count asserts a bit more lazy

2016-07-01 Thread Jason Ekstrand
On Jun 30, 2016 10:14 PM, "Pohjolainen, Topi" 
wrote:
>
> On Thu, Jun 30, 2016 at 06:57:39AM -0700, Jason Ekstrand wrote:
> >On Jun 29, 2016 11:07 PM, "Pohjolainen, Topi"
> ><[1]topi.pohjolai...@intel.com> wrote:
> >>
> >> On Wed, Jun 29, 2016 at 05:37:28PM -0700, Jason Ekstrand wrote:
> >> > ---
> >> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 10 +-
> >> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >>
> >> Could you add some rational here? In the next patch you still use
> >> MAX2(mt->num_samples, 1) and it looks that it at least should still
> >work
> >> without this.
> >
> >Over time, these will all get replaced with isl_surf.samples which
are
> >1 for single sampled.  On the other hand, mt->num_samples is zero for
> >single-sampled.
>
> Does that mean that value zero becomes illegal once we get the value from
> isl_surf?

Yes it does

> >
> >> >
> >> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> >b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> >> > index 1e15bd5..257db06 100644
> >> > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> >> > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> >> > @@ -1302,7 +1302,7 @@ brw_blorp_build_nir_shader(struct
brw_context
> >*brw,
> >> > nir_ssa_def *src_pos, *dst_pos, *color;
> >> >
> >> > /* Sanity checks */
> >> > -   if (key->dst_tiled_w && key->rt_samples > 0) {
> >> > +   if (key->dst_tiled_w && key->rt_samples > 1) {
> >> >/* If the destination image is W tiled and multisampled,
> >then the thread
> >> > * must be dispatched once per sample, not once per pixel.
> >This is
> >> > * necessary because after conversion between W and Y
> >tiling, there's no
> >> > @@ -1333,13 +1333,13 @@ brw_blorp_build_nir_shader(struct
> >brw_context *brw,
> >> >
> >> > /* Make sure layout is consistent with sample count */
> >> > assert((key->tex_layout == INTEL_MSAA_LAYOUT_NONE) ==
> >> > -  (key->tex_samples == 0));
> >> > +  (key->tex_samples <= 1));
> >> > assert((key->rt_layout == INTEL_MSAA_LAYOUT_NONE) ==
> >> > -  (key->rt_samples == 0));
> >> > +  (key->rt_samples <= 1));
> >> > assert((key->src_layout == INTEL_MSAA_LAYOUT_NONE) ==
> >> > -  (key->src_samples == 0));
> >> > +  (key->src_samples <= 1));
> >> > assert((key->dst_layout == INTEL_MSAA_LAYOUT_NONE) ==
> >> > -  (key->dst_samples == 0));
> >> > +  (key->dst_samples <= 1));
> >> >
> >> > nir_builder b;
> >> > nir_builder_init_simple_shader(, NULL,
MESA_SHADER_FRAGMENT,
> >NULL);
> >> > --
> >> > 2.5.0.400.gff86faf
> >> >
> >> > ___
> >> > mesa-dev mailing list
> >> > [2]mesa-dev@lists.freedesktop.org
> >> > [3]https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> > References
> >
> >1. mailto:topi.pohjolai...@intel.com
> >2. mailto:mesa-dev@lists.freedesktop.org
> >3. https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4] swr: Refactor checks for compiler feature flags

2016-07-01 Thread Emil Velikov
On 1 July 2016 at 16:03, Chuck Atkins  wrote:
>> This part should have been a separate patch. Please try to keep things
>> separate for the future.
>
>
> Indeed, I should have this as two separate commits, one to encapsulate the
> flag test and another to add additional options to test for.  I'll keep them
> more segmented in the future.
>
>
>>
>> Esp with the IFS override this looks rather nasty imho.
>
>
> I hate messing with IFS as it tends to get ugly real fast, I just couldn't
> think of a better way at the time.  After sitting with it for a bit longer
> now, I've updated the swr_require_cxx_feature_flags function now to not use
> IFS.  Would you rather I push it as a new patch or just leave it alone for
> now?
This patch has landed afaict, so if you want to rework it just do so
on top of it.

But at the end of the day It's up-to you and/or Tim really.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4] swr: Refactor checks for compiler feature flags

2016-07-01 Thread Chuck Atkins
>
> This part should have been a separate patch. Please try to keep things
> separate for the future.
>

Indeed, I should have this as two separate commits, one to encapsulate the
flag test and another to add additional options to test for.  I'll keep
them more segmented in the future.



> Esp with the IFS override this looks rather nasty imho.
>

I hate messing with IFS as it tends to get ugly real fast, I just couldn't
think of a better way at the time.  After sitting with it for a bit longer
now, I've updated the swr_require_cxx_feature_flags function now to not use
IFS.  Would you rather I push it as a new patch or just leave it alone for
now?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Andy Furniss

Boyuan Zhang wrote:

Signed-off-by: Boyuan Zhang 


Is this supposed to be the same functionally as the first version?

I notice on Tonga that I previously got cabac, but don't now.

I see the new options are now in radeon_vce_52.c whereas before
radeon_vce_40_2_2.c was changed - is this why?

TIA

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/14] gallium/radeon: add can_sample_z/s flags for textures

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/r600/r600_blit.c  |  6 ++
 src/gallium/drivers/r600/r600_pipe.h  |  8 
 src/gallium/drivers/r600/r600_state.c | 14 +++---
 src/gallium/drivers/radeon/r600_pipe_common.h |  9 +
 src/gallium/drivers/radeon/r600_texture.c | 19 +++
 5 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index edf9726..6e5675b 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -277,8 +277,7 @@ void r600_decompress_depth_textures(struct r600_context 
*rctx,
tex = (struct r600_texture *)view->texture;
assert(tex->is_depth && !tex->is_flushing_texture);
 
-   if (rctx->b.chip_class >= EVERGREEN ||
-   r600_can_read_depth(tex)) {
+   if (r600_can_sample_zs(tex, rview->is_stencil_sampler)) {
r600_blit_decompress_depth_in_place(rctx, tex,
   rview->is_stencil_sampler,
   view->u.tex.first_level, 
view->u.tex.last_level,
@@ -374,8 +373,7 @@ static bool r600_decompress_subresource(struct pipe_context 
*ctx,
struct r600_texture *rtex = (struct r600_texture*)tex;
 
if (rtex->is_depth && !rtex->is_flushing_texture) {
-   if (rctx->b.chip_class >= EVERGREEN ||
-   r600_can_read_depth(rtex)) {
+   if (r600_can_sample_zs(rtex, false)) {
r600_blit_decompress_depth_in_place(rctx, rtex, false,
   level, level,
   first_layer, last_layer);
diff --git a/src/gallium/drivers/r600/r600_pipe.h 
b/src/gallium/drivers/r600/r600_pipe.h
index 0dd538b..e1b2aed 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -926,14 +926,6 @@ static inline unsigned r600_pack_float_12p4(float x)
   x >= 4096 ? 0x : x * 16;
 }
 
-/* Return if the depth format can be read without the DB->CB copy on 
r6xx-r7xx. */
-static inline bool r600_can_read_depth(struct r600_texture *rtex)
-{
-   return rtex->resource.b.b.nr_samples <= 1 &&
-  (rtex->resource.b.b.format == PIPE_FORMAT_Z16_UNORM ||
-   rtex->resource.b.b.format == PIPE_FORMAT_Z32_FLOAT);
-}
-
 static inline unsigned r600_get_flush_flags(enum r600_coherency coher)
 {
switch (coher) {
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 8b1b951..8ae8380 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -706,7 +706,13 @@ r600_create_sampler_view_custom(struct pipe_context *ctx,
return NULL;
}
 
-   if (tmp->is_depth && !tmp->is_flushing_texture && 
!r600_can_read_depth(tmp)) {
+   if (state->format == PIPE_FORMAT_X24S8_UINT ||
+   state->format == PIPE_FORMAT_S8X24_UINT ||
+   state->format == PIPE_FORMAT_X32_S8X24_UINT ||
+   state->format == PIPE_FORMAT_S8_UINT)
+   view->is_stencil_sampler = true;
+
+   if (tmp->is_depth && !r600_can_sample_zs(tmp, 
view->is_stencil_sampler)) {
if (!r600_init_flushed_depth_texture(ctx, texture, NULL)) {
FREE(view);
return NULL;
@@ -744,12 +750,6 @@ r600_create_sampler_view_custom(struct pipe_context *ctx,
break;
}
 
-   if (state->format == PIPE_FORMAT_X24S8_UINT ||
-   state->format == PIPE_FORMAT_S8X24_UINT ||
-   state->format == PIPE_FORMAT_X32_S8X24_UINT ||
-   state->format == PIPE_FORMAT_S8_UINT)
-   view->is_stencil_sampler = true;
-
view->tex_resource = >resource;
view->tex_resource_words[0] = 
(S_038000_DIM(r600_tex_dim(texture->target, texture->nr_samples)) |
   S_038000_TILE_MODE(array_mode) |
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index c145dc3..2d746cb 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -241,6 +241,8 @@ struct r600_texture {
uint64_tsize;
unsignednum_level0_transfers;
boolis_depth;
+   boolcan_sample_z;
+   boolcan_sample_s;
unsigneddirty_level_mask; /* each bit says if 
that mipmap is compressed */
unsignedstencil_dirty_level_mask; /* each bit 
says if that mipmap is compressed */
struct r600_texture 

[Mesa-dev] [PATCH 08/14] gallium/radeon: replace is_flushing_texture with db_compatible

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This is a left-over of when I considered generalizing the separate stencil
support. I do prefer the new name since it emphasizes what flushing vs.
non-flushing means from a functional point-of-view, namely special handling
of the texture format.
---
 src/gallium/drivers/r600/evergreen_state.c| 12 +++-
 src/gallium/drivers/r600/r600_blit.c  |  4 ++--
 src/gallium/drivers/r600/r600_state.c |  4 ++--
 src/gallium/drivers/r600/r600_state_common.c  |  2 +-
 src/gallium/drivers/radeon/r600_pipe_common.h |  2 +-
 src/gallium/drivers/radeon/r600_texture.c |  3 ++-
 src/gallium/drivers/radeonsi/si_blit.c|  4 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c |  4 ++--
 src/gallium/drivers/radeonsi/si_state.c   |  6 --
 9 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c 
b/src/gallium/drivers/r600/evergreen_state.c
index fab0359..fe4f14c 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -702,14 +702,16 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
surflevel = tmp->surface.level;
 
/* Texturing with separate depth and stencil. */
-   if (tmp->is_depth && !tmp->is_flushing_texture) {
+   if (tmp->db_compatible) {
switch (pipe_format) {
case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
pipe_format = PIPE_FORMAT_Z32_FLOAT;
break;
case PIPE_FORMAT_X8Z24_UNORM:
case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-   /* Z24 is always stored like this. */
+   /* Z24 is always stored like this for DB
+* compatibility.
+*/
pipe_format = PIPE_FORMAT_Z24X8_UNORM;
break;
case PIPE_FORMAT_X24S8_UINT:
@@ -724,7 +726,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
}
 
if (R600_BIG_ENDIAN)
-   do_endian_swap = !(tmp->is_depth && !tmp->is_flushing_texture);
+   do_endian_swap = !tmp->db_compatible;
 
format = r600_translate_texformat(ctx->screen, pipe_format,
  swizzle,
@@ -868,7 +870,7 @@ evergreen_create_sampler_view_custom(struct pipe_context 
*ctx,
  S_03001C_BANK_HEIGHT(bankh) |
  S_03001C_MACRO_TILE_ASPECT(macro_aspect) |
  S_03001C_NUM_BANKS(nbanks) |
- S_03001C_DEPTH_SAMPLE_ORDER(tmp->is_depth 
&& !tmp->is_flushing_texture);
+ 
S_03001C_DEPTH_SAMPLE_ORDER(tmp->db_compatible);
return >base;
 }
 
@@ -1088,7 +1090,7 @@ void evergreen_init_color_surface(struct r600_context 
*rctx,
}
 
if (R600_BIG_ENDIAN)
-   do_endian_swap = !(rtex->is_depth && 
!rtex->is_flushing_texture);
+   do_endian_swap = !rtex->db_compatible;
 
format = r600_translate_colorformat(rctx->b.chip_class, 
surf->base.format,
  do_endian_swap);
diff --git a/src/gallium/drivers/r600/r600_blit.c 
b/src/gallium/drivers/r600/r600_blit.c
index 6e5675b..a6c5b44 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -275,7 +275,7 @@ void r600_decompress_depth_textures(struct r600_context 
*rctx,
rview = (struct r600_pipe_sampler_view*)view;
 
tex = (struct r600_texture *)view->texture;
-   assert(tex->is_depth && !tex->is_flushing_texture);
+   assert(tex->db_compatible);
 
if (r600_can_sample_zs(tex, rview->is_stencil_sampler)) {
r600_blit_decompress_depth_in_place(rctx, tex,
@@ -372,7 +372,7 @@ static bool r600_decompress_subresource(struct pipe_context 
*ctx,
struct r600_context *rctx = (struct r600_context *)ctx;
struct r600_texture *rtex = (struct r600_texture*)tex;
 
-   if (rtex->is_depth && !rtex->is_flushing_texture) {
+   if (rtex->db_compatible) {
if (r600_can_sample_zs(rtex, false)) {
r600_blit_decompress_depth_in_place(rctx, rtex, false,
   level, level,
diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index 8ae8380..b2d3d4d 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -695,7 +695,7 @@ r600_create_sampler_view_custom(struct pipe_context *ctx,
swizzle[3] = state->swizzle_a;
 
if (R600_BIG_ENDIAN)
-   do_endian_swap = !(tmp->is_depth && !tmp->is_flushing_texture);
+   do_endian_swap = !tmp->db_compatible;
 

[Mesa-dev] [PATCH 09/14] radeonsi: sample from flushed depth texture when required

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Note that this has no effect yet. A case where can_sample_z/s can be false
in radeonsi will be added in a later patch.
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 35 +--
 src/gallium/drivers/radeonsi/si_state.c   | 19 +++
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 3def237..92875a2 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -295,13 +295,22 @@ static void si_release_sampler_views(struct 
si_sampler_views *views)
 
 static void si_sampler_view_add_buffer(struct si_context *sctx,
   struct pipe_resource *resource,
-  enum radeon_bo_usage usage)
+  enum radeon_bo_usage usage,
+  bool is_stencil_sampler)
 {
-   struct r600_resource *rres = (struct r600_resource*)resource;
+   struct r600_resource *rres;
 
if (!resource)
return;
 
+   if (resource->target != PIPE_BUFFER) {
+   struct r600_texture *tex = (struct r600_texture*)resource;
+
+   if (tex->is_depth && !r600_can_sample_zs(tex, 
is_stencil_sampler))
+   resource = >flushed_depth_texture->resource.b.b;
+   }
+
+   rres = (struct r600_resource*)resource;
radeon_add_to_buffer_list(>b, >b.gfx, rres, usage,
  r600_get_sampler_view_priority(rres));
 
@@ -323,9 +332,11 @@ static void si_sampler_views_begin_new_cs(struct 
si_context *sctx,
/* Add buffers to the CS. */
while (mask) {
int i = u_bit_scan();
+   struct si_sampler_view *sview = (struct si_sampler_view 
*)views->views[i];
 
-   si_sampler_view_add_buffer(sctx, views->views[i]->texture,
-  RADEON_USAGE_READ);
+   si_sampler_view_add_buffer(sctx, sview->base.texture,
+  RADEON_USAGE_READ,
+  sview->is_stencil_sampler);
}
 }
 
@@ -345,9 +356,16 @@ void si_set_mutable_tex_desc_fields(struct r600_texture 
*tex,
unsigned block_width, bool is_stencil,
uint32_t *state)
 {
-   uint64_t va = tex->resource.gpu_address + base_level_info->offset;
+   uint64_t va;
unsigned pitch = base_level_info->nblk_x * block_width;
 
+   if (tex->is_depth && !r600_can_sample_zs(tex, is_stencil)) {
+   tex = tex->flushed_depth_texture;
+   is_stencil = false;
+   }
+
+   va = tex->resource.gpu_address + base_level_info->offset;
+
state[1] &= C_008F14_BASE_ADDRESS_HI;
state[3] &= C_008F1C_TILING_INDEX;
state[4] &= C_008F20_PITCH;
@@ -384,7 +402,8 @@ static void si_set_sampler_view(struct si_context *sctx,
uint32_t *desc = descs->list + slot * 16;
 
si_sampler_view_add_buffer(sctx, view->texture,
-  RADEON_USAGE_READ);
+  RADEON_USAGE_READ,
+  rview->is_stencil_sampler);
 
pipe_sampler_view_reference(>views[slot], view);
memcpy(desc, rview->state, 8*4);
@@ -546,7 +565,7 @@ si_image_views_begin_new_cs(struct si_context *sctx, struct 
si_images_info *imag
assert(view->resource);
 
si_sampler_view_add_buffer(sctx, view->resource,
-  RADEON_USAGE_READWRITE);
+  RADEON_USAGE_READWRITE, false);
}
 }
 
@@ -605,7 +624,7 @@ static void si_set_shader_image(struct si_context *ctx,
util_copy_image_view(>views[slot], view);
 
si_sampler_view_add_buffer(ctx, >b.b,
-  RADEON_USAGE_READWRITE);
+  RADEON_USAGE_READWRITE, false);
 
if (res->b.b.target == PIPE_BUFFER) {
if (view->access & PIPE_IMAGE_ACCESS_WRITE)
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index ad63dab..4182906 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2978,6 +2978,25 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
 
/* Texturing with separate depth and stencil. */
pipe_format = state->format;
+
+   /* Depth/stencil texturing sometimes needs separate texture. */
+   if (tmp->is_depth && !r600_can_sample_zs(tmp, 
view->is_stencil_sampler)) {
+   if (!tmp->flushed_depth_texture &&
+   

[Mesa-dev] [PATCH 05/14] r600g: remove obsolete flushed texture initialization in color surface setup

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Seems to have been unnecessary for quite some time, and seems like an odd
place to do the initialization anyway.
---
 src/gallium/drivers/r600/r600_state.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_state.c 
b/src/gallium/drivers/r600/r600_state.c
index e805d33..8b1b951 100644
--- a/src/gallium/drivers/r600/r600_state.c
+++ b/src/gallium/drivers/r600/r600_state.c
@@ -842,12 +842,6 @@ static void r600_init_color_surface(struct r600_context 
*rctx,
int i;
bool blend_bypass = 0, blend_clamp = 1, do_endian_swap = FALSE;
 
-   if (rtex->is_depth && !rtex->is_flushing_texture && 
!r600_can_read_depth(rtex)) {
-   r600_init_flushed_depth_texture(>b.b, surf->base.texture, 
NULL);
-   rtex = rtex->flushed_depth_texture;
-   assert(rtex);
-   }
-
offset = rtex->surface.level[level].offset;
color_view = S_028080_SLICE_START(surf->base.u.tex.first_layer) |
 S_028080_SLICE_MAX(surf->base.u.tex.last_layer);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/14] gallium/radeon: print StencilLayout only once

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

It is the same for all levels.
---
 src/gallium/drivers/radeon/r600_texture.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 7651e87..0a25dbe 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -961,9 +961,9 @@ void r600_print_texture_info(struct r600_texture *rtex, 
FILE *f)
rtex->surface.level[i].mode);
 
if (rtex->surface.flags & RADEON_SURF_SBUFFER) {
+   fprintf(f, "  StencilLayout: tilesplit=%u\n",
+   rtex->surface.stencil_tile_split);
for (i = 0; i <= rtex->surface.last_level; i++) {
-   fprintf(f, "  StencilLayout: tilesplit=%u\n",
-   rtex->surface.stencil_tile_split);
fprintf(f, "  StencilLevel[%i]: offset=%"PRIu64", "
"slice_size=%"PRIu64", npix_x=%u, "
"npix_y=%u, npix_z=%u, nblk_x=%u, nblk_y=%u, "
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/14] radeonsi: extract DB->CB copy logic into its own function

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Also clean up some of the looping.
---
 src/gallium/drivers/radeonsi/si_blit.c | 97 +-
 1 file changed, 61 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 26ce820..85c8ca3 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -101,61 +101,61 @@ static unsigned u_max_sample(struct pipe_resource *r)
return r->nr_samples ? r->nr_samples - 1 : 0;
 }
 
-static void si_blit_decompress_depth(struct pipe_context *ctx,
-struct r600_texture *texture,
-struct r600_texture *staging,
-unsigned first_level, unsigned last_level,
-unsigned first_layer, unsigned last_layer,
-unsigned first_sample, unsigned 
last_sample)
+static void
+si_blit_dbcb_copy(struct si_context *sctx,
+ struct r600_texture *src,
+ struct r600_texture *dst,
+ unsigned planes, unsigned level_mask,
+ unsigned first_layer, unsigned last_layer,
+ unsigned first_sample, unsigned last_sample)
 {
-   struct si_context *sctx = (struct si_context *)ctx;
-   unsigned layer, level, sample, checked_last_layer, max_layer;
-   float depth = 1.0f;
-   const struct util_format_description *desc;
-
-   assert(staging != NULL && "use si_blit_decompress_zs_in_place instead");
-
-   desc = util_format_description(staging->resource.b.b.format);
+   struct pipe_surface surf_tmpl = {{0}};
+   unsigned layer, sample, checked_last_layer, max_layer;
 
-   if (util_format_has_depth(desc))
+   if (planes & PIPE_MASK_Z)
sctx->dbcb_depth_copy_enabled = true;
-   if (util_format_has_stencil(desc))
+   if (planes & PIPE_MASK_S)
sctx->dbcb_stencil_copy_enabled = true;
+   si_mark_atom_dirty(sctx, >db_render_state);
 
assert(sctx->dbcb_depth_copy_enabled || 
sctx->dbcb_stencil_copy_enabled);
 
-   for (level = first_level; level <= last_level; level++) {
+   while (level_mask) {
+   unsigned level = u_bit_scan(_mask);
+
/* The smaller the mipmap level, the less layers there are
 * as far as 3D textures are concerned. */
-   max_layer = util_max_layer(>resource.b.b, level);
+   max_layer = util_max_layer(>resource.b.b, level);
checked_last_layer = MIN2(last_layer, max_layer);
 
+   surf_tmpl.u.tex.level = level;
+
for (layer = first_layer; layer <= checked_last_layer; layer++) 
{
-   for (sample = first_sample; sample <= last_sample; 
sample++) {
-   struct pipe_surface *zsurf, *cbsurf, surf_tmpl;
+   struct pipe_surface *zsurf, *cbsurf;
 
-   sctx->dbcb_copy_sample = sample;
-   si_mark_atom_dirty(sctx, 
>db_render_state);
+   surf_tmpl.format = src->resource.b.b.format;
+   surf_tmpl.u.tex.first_layer = layer;
+   surf_tmpl.u.tex.last_layer = layer;
 
-   surf_tmpl.format = texture->resource.b.b.format;
-   surf_tmpl.u.tex.level = level;
-   surf_tmpl.u.tex.first_layer = layer;
-   surf_tmpl.u.tex.last_layer = layer;
+   zsurf = sctx->b.b.create_surface(>b.b, 
>resource.b.b, _tmpl);
 
-   zsurf = ctx->create_surface(ctx, 
>resource.b.b, _tmpl);
+   surf_tmpl.format = dst->resource.b.b.format;
+   cbsurf = sctx->b.b.create_surface(>b.b, 
>resource.b.b, _tmpl);
 
-   surf_tmpl.format = staging->resource.b.b.format;
-   cbsurf = ctx->create_surface(ctx,
-   (struct pipe_resource*)staging, 
_tmpl);
+   for (sample = first_sample; sample <= last_sample; 
sample++) {
+   if (sample != sctx->dbcb_copy_sample) {
+   sctx->dbcb_copy_sample = sample;
+   si_mark_atom_dirty(sctx, 
>db_render_state);
+   }
 
-   si_blitter_begin(ctx, SI_DECOMPRESS);
+   si_blitter_begin(>b.b, SI_DECOMPRESS);

util_blitter_custom_depth_stencil(sctx->blitter, zsurf, cbsurf, 1 << sample,
- 
sctx->custom_dsa_flush, depth);
-   

[Mesa-dev] [PATCH 03/14] gallium/radeon: remove redundant null-pointer check

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/r600_texture.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 0a25dbe..b3920d7 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -560,8 +560,7 @@ static void r600_texture_destroy(struct pipe_screen *screen,
struct r600_texture *rtex = (struct r600_texture*)ptex;
struct r600_resource *resource = >resource;
 
-   if (rtex->flushed_depth_texture)
-   r600_texture_reference(>flushed_depth_texture, NULL);
+   pipe_resource_reference((struct pipe_resource 
**)>flushed_depth_texture, NULL);
 
r600_resource_reference(>htile_buffer, NULL);
if (rtex->cmask_buffer != >resource) {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/14] gallium/radeon: add depth/stencil_adjusted output to surface computation

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

This fixes a rare bug with stencil texturing -- seen on Polaris and Tonga,
though it's basically a function of the memory configuration so could affect
other parts as well.

Fixes piglit "unaligned-blit * stencil downsample" and various
"fbo-depth-array *stencil*" tests.
---
 src/gallium/drivers/radeon/r600_texture.c  | 4 ++--
 src/gallium/drivers/radeon/radeon_winsys.h | 8 
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 4 
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index bbe709a..c116926 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1030,8 +1030,8 @@ r600_texture_create_object(struct pipe_screen *screen,
if (base->flags & (R600_RESOURCE_FLAG_TRANSFER |
   R600_RESOURCE_FLAG_FLUSHED_DEPTH) ||
rscreen->chip_class >= EVERGREEN) {
-   rtex->can_sample_z = true;
-   rtex->can_sample_s = true;
+   rtex->can_sample_z = !rtex->surface.depth_adjusted;
+   rtex->can_sample_s = !rtex->surface.stencil_adjusted;
} else {
if (rtex->resource.b.b.nr_samples <= 1 &&
(rtex->resource.b.b.format == PIPE_FORMAT_Z16_UNORM 
||
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 1069193..35ee43f 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -400,6 +400,14 @@ struct radeon_surf {
 uint32_tmacro_tile_index;
 uint32_tmicro_tile_mode; /* displayable, thin, depth, 
rotated */
 
+/* Whether the depth miptree or stencil miptree as used by the DB are
+ * adjusted from their TC compatible form to ensure depth/stencil
+ * compatibility. If either is true, the corresponding plane cannot be
+ * sampled from.
+ */
+booldepth_adjusted;
+boolstencil_adjusted;
+
 uint64_tdcc_size;
 uint64_tdcc_alignment;
 };
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index dd033e0..cafa75d 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -454,6 +454,10 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
  if (r)
 return r;
 
+ /* DB uses the depth pitch for both stencil and depth. */
+ if (surf->stencil_level[level].nblk_x != surf->level[level].nblk_x)
+surf->stencil_adjusted = true;
+
  if (level == 0) {
 /* For 2D modes only. */
 if (AddrSurfInfoOut.tileMode >= ADDR_TM_2D_TILED_THIN1) {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/14] gallium/radeon/winsyses: remove unused stencil_offset

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_winsys.h | 1 -
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 2 --
 src/gallium/winsys/radeon/drm/radeon_drm_surface.c | 2 --
 3 files changed, 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index f32edf0..1069193 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -391,7 +391,6 @@ struct radeon_surf {
 uint32_tmtilea;
 uint32_ttile_split;
 uint32_tstencil_tile_split;
-uint64_tstencil_offset;
 struct radeon_surf_levellevel[RADEON_SURF_MAX_LEVEL];
 struct radeon_surf_levelstencil_level[RADEON_SURF_MAX_LEVEL];
 uint32_ttiling_index[RADEON_SURF_MAX_LEVEL];
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index 081f0e1..dd033e0 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -455,8 +455,6 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
 return r;
 
  if (level == 0) {
-surf->stencil_offset = surf->stencil_level[0].offset;
-
 /* For 2D modes only. */
 if (AddrSurfInfoOut.tileMode >= ADDR_TM_2D_TILED_THIN1) {
surf->stencil_tile_split =
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_surface.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_surface.c
index c6025ff..0399e5a 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_surface.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_surface.c
@@ -123,7 +123,6 @@ static void surf_winsys_to_drm(struct radeon_surface 
*surf_drm,
 surf_drm->mtilea = surf_ws->mtilea;
 surf_drm->tile_split = surf_ws->tile_split;
 surf_drm->stencil_tile_split = surf_ws->stencil_tile_split;
-surf_drm->stencil_offset = surf_ws->stencil_offset;
 
 for (i = 0; i < RADEON_SURF_MAX_LEVEL; i++) {
 surf_level_winsys_to_drm(_drm->level[i], _ws->level[i]);
@@ -163,7 +162,6 @@ static void surf_drm_to_winsys(struct radeon_drm_winsys *ws,
 surf_ws->mtilea = surf_drm->mtilea;
 surf_ws->tile_split = surf_drm->tile_split;
 surf_ws->stencil_tile_split = surf_drm->stencil_tile_split;
-surf_ws->stencil_offset = surf_drm->stencil_offset;
 
 surf_ws->macro_tile_index = cik_get_macro_tile_index(surf_ws);
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 14/14] winsys/amdgpu: avoid flushed depth when possible

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

If a depth/stencil texture has no mipmaps, we can always get a layout that is
compatible with DB and TC.
---
 src/gallium/winsys/amdgpu/drm/amdgpu_surface.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
index cafa75d..66a3534 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_surface.c
@@ -373,10 +373,15 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
AddrSurfInfoIn.flags.noStencil = (surf->flags & RADEON_SURF_SBUFFER) == 0;
AddrSurfInfoIn.flags.compressZ = AddrSurfInfoIn.flags.depth;
 
-   /* TODO: update addrlib to a newer version, remove this, and
-* set flags.matchStencilTileCfg = 1 to fix stencil texturing.
+   /* noStencil = 0 can result in a depth part that is incompatible with
+* texturing. So set noStencil = 1 when mipmaps are requested (in this case,
+* we end up setting stencil_adjusted when required).
+*
+* TODO: update addrlib to a newer version, remove this, and
+* use flags.matchStencilTileCfg = 1 as an alternative fix.
 */
-   AddrSurfInfoIn.flags.noStencil = 1;
+  if (surf->last_level > 0)
+  AddrSurfInfoIn.flags.noStencil = 1;
 
/* Set preferred macrotile parameters. This is usually required
 * for shared resources. This is for 2D tiling only. */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/14] radeonsi: decompress to flushed depth texture when required

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeonsi/si_blit.c | 132 +
 1 file changed, 103 insertions(+), 29 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 85c8ca3..6678b23 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -101,7 +101,7 @@ static unsigned u_max_sample(struct pipe_resource *r)
return r->nr_samples ? r->nr_samples - 1 : 0;
 }
 
-static void
+static unsigned
 si_blit_dbcb_copy(struct si_context *sctx,
  struct r600_texture *src,
  struct r600_texture *dst,
@@ -111,6 +111,7 @@ si_blit_dbcb_copy(struct si_context *sctx,
 {
struct pipe_surface surf_tmpl = {{0}};
unsigned layer, sample, checked_last_layer, max_layer;
+   unsigned fully_copied_levels = 0;
 
if (planes & PIPE_MASK_Z)
sctx->dbcb_depth_copy_enabled = true;
@@ -157,11 +158,17 @@ si_blit_dbcb_copy(struct si_context *sctx,
pipe_surface_reference(, NULL);
pipe_surface_reference(, NULL);
}
+
+   if (first_layer == 0 && last_layer >= max_layer &&
+   first_sample == 0 && last_sample >= 
u_max_sample(>resource.b.b))
+   fully_copied_levels |= 1u << level;
}
 
sctx->dbcb_depth_copy_enabled = false;
sctx->dbcb_stencil_copy_enabled = false;
si_mark_atom_dirty(sctx, >db_render_state);
+
+   return fully_copied_levels;
 }
 
 static void si_blit_decompress_depth(struct pipe_context *ctx,
@@ -254,52 +261,119 @@ si_blit_decompress_zs_planes_in_place(struct si_context 
*sctx,
si_mark_atom_dirty(sctx, >db_render_state);
 }
 
-/* Decompress Z and/or S planes in place, depending on mask.
+/* Helper function of si_flush_depth_texture: decompress the given levels
+ * of Z and/or S planes in place.
  */
 static void
 si_blit_decompress_zs_in_place(struct si_context *sctx,
   struct r600_texture *texture,
-  unsigned planes,
-  unsigned first_level, unsigned last_level,
+  unsigned levels_z, unsigned levels_s,
   unsigned first_layer, unsigned last_layer)
 {
-   unsigned level_mask =
-   u_bit_consecutive(first_level, last_level - first_level + 1);
-   unsigned cur_level_mask;
+   unsigned both = levels_z & levels_s;
 
/* First, do combined Z & S decompresses for levels that need it. */
-   if (planes == (PIPE_MASK_Z | PIPE_MASK_S)) {
-   cur_level_mask =
-   level_mask &
-   texture->dirty_level_mask &
-   texture->stencil_dirty_level_mask;
+   if (both) {
si_blit_decompress_zs_planes_in_place(
sctx, texture, PIPE_MASK_Z | PIPE_MASK_S,
-   cur_level_mask,
+   both,
first_layer, last_layer);
-   level_mask &= ~cur_level_mask;
+   levels_z &= ~both;
+   levels_s &= ~both;
}
 
/* Now do separate Z and S decompresses. */
-   if (planes & PIPE_MASK_Z) {
-   cur_level_mask = level_mask & texture->dirty_level_mask;
+   if (levels_z) {
si_blit_decompress_zs_planes_in_place(
sctx, texture, PIPE_MASK_Z,
-   cur_level_mask,
+   levels_z,
first_layer, last_layer);
-   level_mask &= ~cur_level_mask;
}
 
-   if (planes & PIPE_MASK_S) {
-   cur_level_mask = level_mask & texture->stencil_dirty_level_mask;
+   if (levels_s) {
si_blit_decompress_zs_planes_in_place(
sctx, texture, PIPE_MASK_S,
-   cur_level_mask,
+   levels_s,
first_layer, last_layer);
}
 }
 
 static void
+si_flush_depth_texture(struct si_context *sctx,
+  struct r600_texture *tex,
+  unsigned required_planes,
+  unsigned first_level, unsigned last_level,
+  unsigned first_layer, unsigned last_layer)
+{
+   unsigned inplace_planes = 0;
+   unsigned copy_planes = 0;
+   unsigned level_mask = u_bit_consecutive(first_level, last_level - 
first_level + 1);
+   unsigned levels_z = 0;
+   unsigned levels_s = 0;
+
+   if (required_planes & PIPE_MASK_Z) {
+   levels_z = level_mask & tex->dirty_level_mask;
+
+   if (levels_z) {
+   if (r600_can_sample_zs(tex, false))
+   

[Mesa-dev] [PATCH 12/14] gallium/radeon: allocate only the required plane for flushed depth

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/r600_texture.c | 37 ---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 63a4512..bbe709a 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1275,12 +1275,43 @@ bool r600_init_flushed_depth_texture(struct 
pipe_context *ctx,
struct pipe_resource resource;
struct r600_texture **flushed_depth_texture = staging ?
staging : >flushed_depth_texture;
+   enum pipe_format pipe_format = texture->format;
+
+   if (!staging) {
+   if (rtex->flushed_depth_texture)
+   return true; /* it's ready */
+
+   if (!rtex->can_sample_z && rtex->can_sample_s) {
+   switch (pipe_format) {
+   case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
+   /* Save memory by not allocating the S plane. */
+   pipe_format = PIPE_FORMAT_Z32_FLOAT;
+   break;
+   case PIPE_FORMAT_Z24_UNORM_S8_UINT:
+   case PIPE_FORMAT_S8_UINT_Z24_UNORM:
+   /* Save memory bandwidth by not copying the
+* stencil part during flush.
+*
+* This potentially increases memory bandwidth
+* if an application uses both Z and S texturing
+* simultaneously (a flushed Z24S8 texture
+* would be stored compactly), but how often
+* does that really happen?
+*/
+   pipe_format = PIPE_FORMAT_Z24X8_UNORM;
+   break;
+   default:;
+   }
+   } else if (!rtex->can_sample_s && rtex->can_sample_z) {
+   
assert(util_format_has_stencil(util_format_description(pipe_format)));
 
-   if (!staging && rtex->flushed_depth_texture)
-   return true; /* it's ready */
+   /* DB->CB copies to an 8bpp surface don't work. */
+   pipe_format = PIPE_FORMAT_X24S8_UINT;
+   }
+   }
 
resource.target = texture->target;
-   resource.format = texture->format;
+   resource.format = pipe_format;
resource.width0 = texture->width0;
resource.height0 = texture->height0;
resource.depth0 = texture->depth0;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/14] radeonsi: correctly mark levels of 3D textures as fully decompressed

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

Account for the fact that max_layer is minified for higher levels.
---
 src/gallium/drivers/radeonsi/si_blit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index a48c5bc..1148787 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -214,7 +214,7 @@ si_blit_decompress_zs_planes_in_place(struct si_context 
*sctx,
 
/* The texture will always be dirty if some layers aren't 
flushed.
 * I don't think this case occurs often though. */
-   if (first_layer == 0 && last_layer == max_layer) {
+   if (first_layer == 0 && last_layer >= max_layer) {
fully_decompressed_mask |= 1u << level;
}
}
@@ -361,7 +361,7 @@ static void si_blit_decompress_color(struct pipe_context 
*ctx,
 
/* The texture will always be dirty if some layers aren't 
flushed.
 * I don't think this case occurs often though. */
-   if (first_layer == 0 && last_layer == max_layer) {
+   if (first_layer == 0 && last_layer >= max_layer) {
rtex->dirty_level_mask &= ~(1 << level);
}
}
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/14] gallium/radeon: flushed depth textures for radeonsi

2016-07-01 Thread Nicolai Hähnle
Hi,

this series is motivated by a rare stencil texturing bug. The bug is that
a combined Z/S buffer (with separate stencil) must use the same pitch for
both Z and S when accessed by the DB hardware block, but in some
configurations, the S part wants a larger pitch because its alignment
requirements (in pixels) are higher.

This series fixes the issue by flushing the stencil part to a separate
texture when needed.

Note that we actually have a choice of how to fix it. Alternatively, one
could tell addrlib to take the stencil part into account when computing the
Z layout, by setting noStencil = 0. Then Z and S always have the same pitch,
but the downside is that if Z has mipmaps, it makes the Z layout
incompatible with texture sampling, so then the Z part would have to be
flushed.

Based on the assumption that depth texturing is more frequent than stencil
texturing, I chose the first option. In any case, we only ever need a flushed
texture when mipmapping is used.

Tested on Redwood, Verde, Tonga, Polaris. Please review!

Thanks,
Nicolai
--
 src/gallium/drivers/r600/evergreen_state.c   |  12 +-
 src/gallium/drivers/r600/r600_blit.c |  10 +-
 src/gallium/drivers/r600/r600_pipe.h |   8 -
 src/gallium/drivers/r600/r600_state.c|  24 +-
 src/gallium/drivers/r600/r600_state_common.c |   2 +-
 .../drivers/radeon/r600_pipe_common.h|  11 +-
 src/gallium/drivers/radeon/r600_texture.c|  65 -
 src/gallium/drivers/radeon/radeon_winsys.h   |   9 +-
 src/gallium/drivers/radeonsi/si_blit.c   | 235 -
 .../drivers/radeonsi/si_descriptors.c|  39 ++-
 src/gallium/drivers/radeonsi/si_state.c  |  25 +-
 .../winsys/amdgpu/drm/amdgpu_surface.c   |  17 +-
 .../winsys/radeon/drm/radeon_drm_surface.c   |   2 -
 13 files changed, 324 insertions(+), 135 deletions(-)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/14] gallium/radeon: flush stdout after printing texture information

2016-07-01 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/r600_texture.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 68f1701..7651e87 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1103,6 +1103,7 @@ r600_texture_create_object(struct pipe_screen *screen,
if (rscreen->debug_flags & DBG_TEX) {
puts("Texture:");
r600_print_texture_info(rtex, stdout);
+   fflush(stdout);
}
 
return rtex;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Kai  changed:

   What|Removed |Added

 Depends on||96762


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=96762
[Bug 96762] [radeonsi] Firewatch: nothing rendered in scrollable (text) areas
-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/radeon: add and use radeon_info::max_alloc_size (v2)

2016-07-01 Thread Marek Olšák
From: Marek Olšák 

v2: - squashed the patches
- use INT_MAX
- clamp max_const_buffer_size
- check the DRM version in radeon
---
 src/gallium/drivers/r600/r600_pipe.c  |  4 ++--
 src/gallium/drivers/radeon/r600_pipe_common.c | 11 +--
 src/gallium/drivers/radeon/radeon_winsys.h|  1 +
 src/gallium/drivers/radeonsi/si_pipe.c|  4 ++--
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  2 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |  4 
 6 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 119c76b..f23daf9 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
return 0;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-   return MIN2(rscreen->b.info.vram_size, 0x);
+   return MIN2(rscreen->b.info.max_alloc_size, INT_MAX);
 
 case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
 return R600_MAP_BUFFER_ALIGNMENT;
@@ -509,7 +509,7 @@ static int r600_get_shader_param(struct pipe_screen* 
pscreen, unsigned shader, e
pscreen->get_compute_param(pscreen, PIPE_SHADER_IR_TGSI,
PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE,
_const_buffer_size);
-   return max_const_buffer_size;
+   return MIN2(max_const_buffer_size, INT_MAX);
 
} else {
return R600_MAX_CONST_BUFFER_SIZE;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 74674e7..f75fa6c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -864,8 +864,8 @@ static int r600_get_compute_param(struct pipe_screen 
*screen,
 * 4 * MAX_MEM_ALLOC_SIZE.
 */
*max_global_size = MIN2(4 * max_mem_alloc_size,
-   rscreen->info.gart_size +
-   rscreen->info.vram_size);
+   MAX2(rscreen->info.gart_size,
+rscreen->info.vram_size));
}
return sizeof(uint64_t);
 
@@ -889,10 +889,7 @@ static int r600_get_compute_param(struct pipe_screen 
*screen,
if (ret) {
uint64_t *max_mem_alloc_size = ret;
 
-   /* XXX: The limit in older kernels is 256 MB.  We
-* should add a query here for newer kernels.
-*/
-   *max_mem_alloc_size = 256 * 1024 * 1024;
+   *max_mem_alloc_size = rscreen->info.max_alloc_size;
}
return sizeof(uint64_t);
 
@@ -1098,6 +1095,8 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
printf("chip_class = %i\n", rscreen->info.chip_class);
printf("gart_size = %i MB\n", 
(int)DIV_ROUND_UP(rscreen->info.gart_size, 1024*1024));
printf("vram_size = %i MB\n", 
(int)DIV_ROUND_UP(rscreen->info.vram_size, 1024*1024));
+   printf("max_alloc_size = %i MB\n",
+  (int)DIV_ROUND_UP(rscreen->info.max_alloc_size, 
1024*1024));
printf("has_virtual_memory = %i\n", 
rscreen->info.has_virtual_memory);
printf("gfx_ib_pad_with_type2 = %i\n", 
rscreen->info.gfx_ib_pad_with_type2);
printf("has_sdma = %i\n", rscreen->info.has_sdma);
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index afb970e..1ba288a 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -253,6 +253,7 @@ struct radeon_info {
 uint32_tgart_page_size;
 uint64_tgart_size;
 uint64_tvram_size;
+uint64_tmax_alloc_size;
 boolhas_dedicated_vram;
 boolhas_virtual_memory;
 boolgfx_ib_pad_with_type2;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index ad2a86a..57c3fbd 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -419,7 +419,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
   HAVE_LLVM >= 0x0307 ? 410 : 330;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-   return MIN2(sscreen->b.info.vram_size, 0x);
+   return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
 
   

Re: [Mesa-dev] [RFC] Coding style scripts (Was Re: [PATCH 1/2] gallium: replace [0-9]*.f with [0-9]*.0f)

2016-07-01 Thread Emil Velikov
On 28 June 2016 at 04:00, Michel Dänzer  wrote:
> On 27.06.2016 19:52, Jose Fonseca wrote:
>>
>> BTW, I've been using http://editorconfig.org/ on several projects. It's
>> widely supported by many editors including Emacs.
>>
Very nice suggestion Jose. Thanks !

>> There's even Python based tools to check editorconfig (
>> https://github.com/editorconfig/editorconfig/wiki/FAQ#my-files-are-not-automatically-reformatted-the-editorconfig-plugin-is-not-working
>> ).
Those tools are short on activity over the past 9-20+ months. Will
have to check how "imperfect" they are, one of these days.

>>  We already require Python, downloading a Python package is trivial
>> to do nowadays with pip, so we could easily standardize on .editorconfig
>> files everywhere.
>
> FWIW, +1 for adding .editorconfig files corresponding to the existing
> .dir-locals.el files as a first step, so users of other editors can also
> get the benefit of their code getting formatted correctly by default.
>
Sounds reasonable. Can I volunteer you for testing them in VIM, since
my editor does not have work with .editorconfig(s) yet :-)

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/6] RadeonSI & ddebug: Apitrace interactions and better debugging

2016-07-01 Thread Bas Nieuwenhuizen
Series is

Reviewed-by: Bas Nieuwenhuizen 

On Fri, Jul 1, 2016 at 1:21 AM, Marek Olšák  wrote:
> Hi,
>
> This series adds apitrace call tracking into ddebug and radeonsi and
> other improvements.
>
> It will improve our debugging and profiling possibilities. Just set
> GALLIUM_DDEBUG="apitrace [draw call number]" and you will get
> a complete ddebug log with TGSI, LLVM IR (new!), and asm. Both radeonsi
> and ddebug know the exact draw call number. It can be very powerful
> with apitrace profiling or just debugging bad draw calls.
>
> Also, the LLVM IR is newly printed into ddebug logs for all cases
> (GPU hang, VM fault, apitrace mode, etc).
>
> Please review.
>
> Marek
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/8] nv50/ir: optimize ADD3(d, 0x0, b, c) to ADD(d, b, c)

2016-07-01 Thread Samuel Pitoiset



On 07/01/2016 06:19 AM, Ilia Mirkin wrote:

On Thu, Jun 30, 2016 at 6:54 PM, Samuel Pitoiset
 wrote:



On 07/01/2016 12:44 AM, Ilia Mirkin wrote:


If moveSources doesn't move modifiers, we have a serious problem.
However it looks like it does:

void
Instruction::setSrc(int s, const ValueRef& ref)
{
   setSrc(s, ref.get());
   srcs[s].mod = ref.mod;
}

which is what moveSources calls.



I was not sure about moveSources() because we have two variants and the
other one doesn't move the modifiers.




On Thu, Jun 30, 2016 at 6:26 PM, Samuel Pitoiset
 wrote:


And ADD3(d, a, 0x0, c) to ADD(d, a, c) as well.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12
+++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1cf1fa3..517f779 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1032,7 +1032,17 @@ ConstantFolding::opnd(Instruction *i,
ImmediateValue , int s)
 i->src(0).mod = Modifier(0);
   }
   break;
-
+   case OP_ADD3:
+  if (i->usesFlags())
+ break;



Why? ADD can produce/consume a flag just fine.



Well, this is loosely based on OP_ADD which does exactly the same check.


Right, because you can't get rid of ADD 0, $c. (There is no MOV $c
variant... or we don't support it ... we probably should, I think
that's what CSET/CSETP are.) However you can definitely flip the ADD3
into an ADD2 even if the carry flag is being added in.


Yeah, makes sense.








+  if (imm0.isInteger(0)) {
+ i->op = OP_ADD;
+ for (int k = s; k < 2; k++) {
+i->setSrc(k, i->getSrc(k + 1));
+i->src(k).mod = i->src(k + 1).mod;
+ }



aka

i->moveSources(s + 1, -1) ?



Yes.





+  }
+  break;
case OP_DIV:
   if (s != 1 || (i->dType != TYPE_S32 && i->dType != TYPE_U32))
  break;
--
2.8.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


--
-Samuel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Vedran Miletić

On 07/01/2016 02:27 PM, Marek Olšák wrote:

On Fri, Jul 1, 2016 at 1:45 PM, Vedran Miletić  wrote:

On 07/01/2016 11:11 AM, Marek Olšák wrote:


On Fri, Jul 1, 2016 at 10:54 AM, Marek Olšák  wrote:


On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić 
wrote:


On 07/01/2016 01:29 AM, Marek Olšák wrote:



From: Marek Olšák 

also fix max_global_size to take a maximum of {vram_size, gart_size}
---
 src/gallium/drivers/r600/r600_pipe.c  | 2 +-
 src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
 src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c
b/src/gallium/drivers/r600/r600_pipe.c
index 119c76b..55bbde1 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen*
pscreen,
enum pipe_cap param)
return 0;

case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-   return MIN2(rscreen->b.info.vram_size, 0x);
+   return MIN2(rscreen->b.info.max_alloc_size,
0x);

 case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
 return R600_MAP_BUFFER_ALIGNMENT;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index d7f1d41..f75fa6c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -864,8 +864,8 @@ static int r600_get_compute_param(struct
pipe_screen
*screen,
 * 4 * MAX_MEM_ALLOC_SIZE.
 */
*max_global_size = MIN2(4 * max_mem_alloc_size,
-   rscreen->info.gart_size +
-   rscreen->info.vram_size);
+
MAX2(rscreen->info.gart_size,
+
rscreen->info.vram_size));




Can't you also use info.max_alloc_size here?



I can do *max_global_size = max_alloc_size; Does that sound good?



Even if max_alloc_size can be 256 MB?

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



How about

*max_global_size = MIN2(4 * max_mem_alloc_size,
rscreen->info.max_alloc_size);

Would that work? That avoids calculating the same value in two distinct
places, since rscreen->info.max_alloc_size = MAX2(rscreen->info.gart_size,
rscreen->info.vram_size) for both radeon and amdgpu.


max_mem_alloc_size == max_alloc_size. Thus:

MIN2(4 * max_mem_alloc_size, max_alloc_size) == max_alloc_size

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Right, indeed.

In the case memory allocation is limited to 256 MB, this could would 
give 256 MB, while the code you initially had will give 1 GB of memory 
or VRAM or GTT, so it's definitely a better option.


Regards,
Vedran

--
Vedran Miletić
vedran.miletic.net
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4] swr: Refactor checks for compiler feature flags

2016-07-01 Thread Emil Velikov
Hi Chuck,

On 28 June 2016 at 20:50, Chuck Atkins  wrote:
> Encapsulate the test for which flags are needed to get a compiler to
> support certain features.  Along with this, give various options to try
> for AVX and AVX2 support.  Ideally we want to use specific instruction
> set feature flags, like -mavx2 for instance instead of -march=haswell,
> but the flags required for certain compilers are different.  This
> allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c
> while the Intel compiler which doesn't support those flags can fall

 > back to using -march=core-avx2.
>
This part should have been a separate patch. Please try to keep things
separate for the future.

> This addresses a bug where the Intel compiler will silently ignore the
> AVX2 instruction feature flags and then potentially fail to build.
>
> v2: Pass preprocessor-check argument as true-state instead of
> false-state for clarity.
> v3: Reduce AVX2 define test to just __AVX2__.  Additional defines suchas
> __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined
> w.r.t thier availability.
> v4: Fix C++11 flags being added globally and add more logic to
> swr_require_cxx_feature_flags
>
> Cc: Tim Rowley 
> Signed-off-by: Chuck Atkins 
> ---
>  configure.ac| 73 
> +
>  src/gallium/drivers/swr/Makefile.am |  4 +-
>  2 files changed, 52 insertions(+), 25 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index cc9bc47..8321e8e 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2330,6 +2330,45 @@ swr_llvm_check() {
>  fi
>  }
>
> +swr_require_cxx_feature_flags() {
> +feature_name="$1"
> +preprocessor_test="$2"
> +option_list="$3"
> +output_var="$4"
> +
> +AC_MSG_CHECKING([whether $CXX supports $feature_name])
> +AC_LANG_PUSH([C++])
> +save_CXXFLAGS="$CXXFLAGS"
> +save_IFS="$IFS"
> +IFS=","
> +found=0
> +for opts in $option_list
> +do
> +unset IFS
> +CXXFLAGS="$opts $save_CXXFLAGS"
> +AC_COMPILE_IFELSE(
> +[AC_LANG_PROGRAM(
> +[   #if !($preprocessor_test)
> +#error
> +#endif
> +])],
> +[found=1; break],
> +[])
> +IFS=","
> +done
> +IFS="$save_IFS"
> +CXXFLAGS="$save_CXXFLAGS"
> +AC_LANG_POP([C++])
> +if test $found -eq 1; then
> +AC_MSG_RESULT([$opts])
> +eval "$output_var=\$opts"
> +return 0
> +fi
> +AC_MSG_RESULT([no])
> +AC_MSG_ERROR([swr requires $feature_name support])
> +return 1
Esp with the IFS override this looks rather nasty imho.

Just a suggestion: An approach similar to LIBDRM_CC_TRY_FLAG in [1] is
a lot more readable imho.

-Emil
[1] https://cgit.freedesktop.org/mesa/drm/tree/configure.ac
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/12] st/va: add functions for VAAPI encode

2016-07-01 Thread Christian König

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 
---
  src/gallium/state_trackers/va/buffer.c |   6 +
  src/gallium/state_trackers/va/picture.c| 170 -
  src/gallium/state_trackers/va/va_private.h |   3 +
  3 files changed, 177 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/buffer.c 
b/src/gallium/state_trackers/va/buffer.c
index 7d3167b..dfcebbe 100644
--- a/src/gallium/state_trackers/va/buffer.c
+++ b/src/gallium/state_trackers/va/buffer.c
@@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, 
void **pbuff)
if (!buf->derived_surface.transfer || !*pbuff)
   return VA_STATUS_ERROR_INVALID_BUFFER;
  
+  if (buf->type == VAEncCodedBufferType) {

+ ((VACodedBufferSegment*)buf->data)->buf = *pbuff;
+ ((VACodedBufferSegment*)buf->data)->size = buf->coded_size;
+ ((VACodedBufferSegment*)buf->data)->next = NULL;
+ *pbuff = buf->data;
+  }
 } else {
pipe_mutex_unlock(drv->mutex);
*pbuff = buf->data;
diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 89ac024..26205b1 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -78,7 +78,8 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID 
context_id, VASurfaceID rende
return VA_STATUS_SUCCESS;
 }
  
-   context->decoder->begin_frame(context->decoder, context->target, >desc.base);

+   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
+  context->decoder->begin_frame(context->decoder, context->target, 
>desc.base);
  
 return VA_STATUS_SUCCESS;

  }
@@ -278,6 +279,140 @@ handleVASliceDataBufferType(vlVaContext *context, 
vlVaBuffer *buf)
num_buffers, (const void * const*)buffers, sizes);
  }
  
+static VAStatus

+handleVAEncMiscParameterTypeRateControl(vlVaContext *context, 
VAEncMiscParameterBuffer *misc)
+{
+   VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl 
*)misc->data;
+   if (context->desc.h264enc.rate_ctrl.rate_ctrl_method ==
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
+  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
+   else
+  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * 
rc->target_percentage;
+   context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
+   if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
+  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
+   else
+  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
context->desc.h264enc.rate_ctrl.target_bitrate;
+
+   return VA_STATUS_SUCCESS;
+}
+
+static VAStatus
+handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext *context, 
vlVaBuffer *buf)
+{
+   VAEncSequenceParameterBufferH264 *h264 = (VAEncSequenceParameterBufferH264 
*)buf->data;
+   if (!context->decoder) {
+  context->templat.max_references = h264->max_num_ref_frames;
+  context->templat.level = h264->level_idc;
+  context->decoder = drv->pipe->create_video_codec(drv->pipe, 
>templat);
+  if (!context->decoder)
+ return VA_STATUS_ERROR_ALLOCATION_FAILED;
+   }
+   context->desc.h264enc.gop_size = h264->intra_idr_period;
+   return VA_STATUS_SUCCESS;
+}
+
+static VAStatus
+handleVAEncMiscParameterBufferType(vlVaContext *context, vlVaBuffer *buf)
+{
+   VAStatus vaStatus = VA_STATUS_SUCCESS;
+   VAEncMiscParameterBuffer *misc;
+   misc = buf->data;
+
+   switch (misc->type) {
+   case VAEncMiscParameterTypeRateControl:
+  vaStatus = handleVAEncMiscParameterTypeRateControl(context, misc);
+  break;
+
+   default:
+  break;
+   }
+
+   return vaStatus;
+}
+
+static VAStatus
+handleVAEncPictureParameterBufferType(vlVaDriver *drv, vlVaContext *context, 
vlVaBuffer *buf)
+{
+   VAEncPictureParameterBufferH264 *h264;
+   vlVaBuffer *coded_buf;
+
+   h264 = buf->data;
+   context->desc.h264enc.frame_num = h264->frame_num;
+   context->desc.h264enc.not_referenced = false;
+   context->desc.h264enc.is_idr = (h264->pic_fields.bits.idr_pic_flag == 1);
+   context->desc.h264enc.pic_order_cnt = h264->CurrPic.TopFieldOrderCnt / 2;
+   if (context->desc.h264enc.is_idr)
+  context->desc.h264enc.i_remain = 1;
+   else
+  context->desc.h264enc.i_remain = 0;
+
+   context->desc.h264enc.p_remain = context->desc.h264enc.gop_size - 
context->desc.h264enc.gop_cnt - context->desc.h264enc.i_remain;
+
+   coded_buf = handle_table_get(drv->htab, h264->coded_buf);
+   coded_buf->derived_surface.resource = pipe_buffer_create(drv->pipe->screen, 
PIPE_BIND_VERTEX_BUFFER,
+ PIPE_USAGE_STREAM, coded_buf->size);
+   context->coded_buf = coded_buf;
+
+   context->desc.h264enc.frame_idx[h264->CurrPic.picture_id] = h264->frame_num;
+   if 

Re: [Mesa-dev] [PATCH 06/12] st/va: colorspace conversion when image is yv12 and surface is nv12

2016-07-01 Thread Christian König

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 
---
  src/gallium/state_trackers/va/image.c | 48 +--
  1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/va/image.c 
b/src/gallium/state_trackers/va/image.c
index 3c8cc9c..1f68169 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -499,7 +499,7 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
VAImageID image,
 VAImage *vaimage;
 struct pipe_sampler_view **views;
 enum pipe_format format;
-   void *data[3];
+   uint8_t *data[3];
 unsigned pitches[3], i, j;
  
 if (!ctx)

@@ -539,7 +539,9 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
VAImageID image,
return VA_STATUS_ERROR_OPERATION_FAILED;
 }
  
-   if (format != surf->buffer->buffer_format) {

+   if ((format != surf->buffer->buffer_format) &&
+  ((format != PIPE_FORMAT_YV12) || (surf->buffer->buffer_format != 
PIPE_FORMAT_NV12)) &&
+  ((format != PIPE_FORMAT_IYUV) || (surf->buffer->buffer_format != 
PIPE_FORMAT_NV12))) {
struct pipe_video_buffer *tmp_buf;
struct pipe_video_buffer templat = surf->templat;
  
@@ -581,12 +583,42 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,

unsigned width, height;
if (!views[i]) continue;
vlVaVideoSurfaceSize(surf, i, , );
-  for (j = 0; j < views[i]->texture->array_size; ++j) {
- struct pipe_box dst_box = {0, 0, j, width, height, 1};
- drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
-PIPE_TRANSFER_WRITE, _box,
-data[i] + pitches[i] * j,
-pitches[i] * views[i]->texture->array_size, 0);
+  if ((format == PIPE_FORMAT_YV12) || (format == PIPE_FORMAT_IYUV) &&
+ (surf->buffer->buffer_format == PIPE_FORMAT_NV12) && (i == 1)) {
+ struct pipe_transfer *transfer = NULL;
+ uint8_t *map = NULL;
+ struct pipe_box dst_box_1 = {0, 0, 0, width, height, 1};
+ map = drv->pipe->transfer_map(drv->pipe,
+   views[i]->texture,
+   0,
+   PIPE_TRANSFER_DISCARD_RANGE,
+   _box_1, );
+ if (map == NULL)
+return VA_STATUS_ERROR_OPERATION_FAILED;
+
+ bool odd = false;
+ for (unsigned int k = 0; k < ((vaimage->offsets[1])/2) ; k++){
+if (odd == false) {
+   map[k] = data[i][k/2];
+   odd = true;
+}
+else {
+   map[k] = data[i+1][k/2];
+   odd = false;
+}
+ }
+ pipe_transfer_unmap(drv->pipe, transfer);
+ pipe_mutex_unlock(drv->mutex);
+ return VA_STATUS_SUCCESS;
+  }
+  else {


Style issue, the "}" and the "else {" should be on the same line.

Apart from that please use the u_copy_yv12_to_nv12() functions for the 
conversion instead of coding it manually.


Also the code doesn't looks like it handles IYUV correctly.

Regards,
Christian.


+ for (j = 0; j < views[i]->texture->array_size; ++j) {
+struct pipe_box dst_box = {0, 0, j, width, height, 1};
+drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
+   PIPE_TRANSFER_WRITE, _box,
+   data[i] + pitches[i] * j,
+   pitches[i] * views[i]->texture->array_size, 0);
+ }
}
 }
 pipe_mutex_unlock(drv->mutex);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] radeonsi: enable string markers and record apitrace call numbers

2016-07-01 Thread Bas Nieuwenhuizen
On Fri, Jul 1, 2016 at 2:24 PM, Marek Olšák  wrote:
> On Fri, Jul 1, 2016 at 2:16 PM, Bas Nieuwenhuizen
>  wrote:
>> On Fri, Jul 1, 2016 at 1:21 AM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> ---
>>>  src/gallium/drivers/radeonsi/si_debug.c |  4 
>>>  src/gallium/drivers/radeonsi/si_pipe.c  | 20 +++-
>>>  src/gallium/drivers/radeonsi/si_pipe.h  |  1 +
>>>  3 files changed, 24 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
>>> b/src/gallium/drivers/radeonsi/si_debug.c
>>> index 112e686..220ce13 100644
>>> --- a/src/gallium/drivers/radeonsi/si_debug.c
>>> +++ b/src/gallium/drivers/radeonsi/si_debug.c
>>> @@ -814,6 +814,10 @@ void si_check_vm_faults(struct r600_common_context 
>>> *ctx,
>>> fprintf(f, "Device name: %s\n\n", screen->get_name(screen));
>>> fprintf(f, "Failing VM page: 0x%08x\n\n", addr);
>>>
>>> +   if (sctx->apitrace_call_number)
>>> +   fprintf(f, "Last apitrace call: %u\n\n",
>>> +   sctx->apitrace_call_number);
>>> +
>>> switch (ring) {
>>> case RING_GFX:
>>> si_dump_debug_state(>b.b, f, 0);
>>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
>>> b/src/gallium/drivers/radeonsi/si_pipe.c
>>> index 6c88fe3..f15e589 100644
>>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>>> @@ -31,6 +31,7 @@
>>>  #include "util/u_memory.h"
>>>  #include "util/u_suballoc.h"
>>>  #include "vl/vl_decoder.h"
>>> +#include "../ddebug/dd_util.h"
>>>
>>>  #define SI_LLVM_DEFAULT_FEATURES \
>>> "+DumpCode,+vgpr-spilling,-fp32-denormals,+fp64-denormals"
>>> @@ -106,6 +107,22 @@ si_amdgpu_get_reset_status(struct pipe_context *ctx)
>>> return sctx->b.ws->ctx_query_reset_status(sctx->b.ctx);
>>>  }
>>>
>>> +/* Apitrace profiling:
>>> + *   1) qapitrace : Tools -> Profile: Measure CPU & GPU times
>>> + *   2) In the middle panel, zoom in (mouse wheel) on some bad draw call
>>> + *  and remember its number.
>>> + *   3) In Mesa, enable queries and performance counters around that draw
>>> + *  call and print the results.
>>> + *   4) glretrace --benchmark --markers ..
>>> + */
>>
>> Were there issues that you hit with the existing apitrace performance
>> counters support?
>
> I didn't know apitrace had that. How to use it?

Use it with e.g.

apitrace replay --pdrawcalls="GL_AMD_performance_monitor:
SQ_080,SQ_081" test.trace

The only documentation I found is at
https://github.com/apitrace/apitrace/commit/87a07779ecc48249046d000ec87a57fd15112ff4

You can list available metrics with

apitrace replay --list-metrics test.trace (trace file is required for
some reason)

 - Bas
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/12] st/va: add nv12 i420 yv12 format to deriveimage call

2016-07-01 Thread Christian König

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 


That only works by coincident correctly, the buffer only contains a 
reference to the first texture plane and not all of them.


So vlVaMapBuffer() won't be able to come up with something reasonable 
for other planes.


Additional to that the offsets and pitches can change when backing 
buffers are reallocated.


So the only correct implementation would be to create a persistent 
mapping in the vlVaMapBuffer() call and then come up with the offset and 
keep that mapping until the derived image is destroyed again.


Regards,
Christian.

---
  src/gallium/state_trackers/va/image.c | 21 +
  1 file changed, 21 insertions(+)

diff --git a/src/gallium/state_trackers/va/image.c 
b/src/gallium/state_trackers/va/image.c
index c82b554..3c8cc9c 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -275,6 +275,27 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, 
VAImage *image)
 }
  
 switch (img->format.fourcc) {

+   case VA_FOURCC('N','V','1','2'):
+  img->num_planes = 2;
+  img->pitches[0] = pitch[0];
+  img->offsets[0] = 0;
+  img->pitches[1] = pitch[1];
+  img->offsets[1] = pitch[0] * h;
+  img->data_size  = pitch[0] * h + pitch[1] * h / 2;
+  break;
+
+   case VA_FOURCC('I','4','2','0'):
+   case VA_FOURCC('Y','V','1','2'):
+  img->num_planes = 3;
+  img->pitches[0] = pitch[0];
+  img->offsets[0] = 0;
+  img->pitches[1] = pitch[1];
+  img->offsets[1] = pitch[0] * h;
+  img->pitches[2] = pitch[2];
+  img->offsets[2] = pitch[0] * h + pitch[1] * h / 4;
+  img->data_size  = pitch[0] * h + pitch[1] * h / 4 + pitch[2] * h / 4;
+  break;
+
 case VA_FOURCC('U','Y','V','Y'):
 case VA_FOURCC('Y','U','Y','V'):
img->num_planes = 1;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/12] st/va: use correct pitch value for deriveimage call

2016-07-01 Thread Christian König

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 
---
  src/gallium/state_trackers/va/image.c | 55 ---
  1 file changed, 51 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/va/image.c 
b/src/gallium/state_trackers/va/image.c
index 1b956e3..c82b554 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -185,10 +185,12 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID 
surface, VAImage *image)
 vlVaSurface *surf;
 vlVaBuffer *img_buf;
 VAImage *img;
+   struct pipe_sampler_view **views;
 struct pipe_surface **surfaces;
 int w;
 int h;
 int i;
+   int pitch[3];
  
 if (!ctx)

return VA_STATUS_ERROR_INVALID_CONTEXT;
@@ -220,6 +222,51 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, 
VAImage *image)
 w = align(surf->buffer->width, 2);
 h = align(surf->buffer->height, 2);
  
+   switch (img->format.fourcc) {

+  case VA_FOURCC('N','V','1','2'):
+ img->num_planes = 2;
+ break;
+
+  case VA_FOURCC('I','4','2','0'):
+  case VA_FOURCC('Y','V','1','2'):
+ img->num_planes = 3;
+ break;
+
+  case VA_FOURCC('U','Y','V','Y'):
+  case VA_FOURCC('Y','U','Y','V'):
+  case VA_FOURCC('B','G','R','A'):
+  case VA_FOURCC('R','G','B','A'):
+  case VA_FOURCC('B','G','R','X'):
+  case VA_FOURCC('R','G','B','X'):
+ img->num_planes = 1;
+ break;


This switch is unnecessary for not present planes the views will just be 
empty.


Additional to that images can only handle single plane formats for now.

Regards,
Christian.


+
+  default:
+ /* VaDeriveImage is designed for contiguous planes. */
+ FREE(img);
+ return VA_STATUS_ERROR_INVALID_IMAGE_FORMAT;
+   }
+
+   views = surf->buffer->get_sampler_view_planes(surf->buffer);
+   if (!views)
+  return VA_STATUS_ERROR_OPERATION_FAILED;
+
+   for (i = 0; i < img->num_planes; i++) {
+  unsigned width, height;
+  if (!views[i]) continue;
+  vlVaVideoSurfaceSize(surf, i, , );
+  struct pipe_box box = {0, 0, 0, width, height, 1};
+  struct pipe_transfer *transfer;
+  uint8_t *map;
+  map = drv->pipe->transfer_map(drv->pipe, views[i]->texture, 0,
+PIPE_TRANSFER_READ, , );
+  if (!map)
+ return VA_STATUS_ERROR_OPERATION_FAILED;
+
+  pitch[i] = transfer->stride;
+  pipe_transfer_unmap(drv->pipe, transfer);
+   }
+
 for (i = 0; i < ARRAY_SIZE(formats); ++i) {
if (img->format.fourcc == formats[i].fourcc) {
   img->format = formats[i];
@@ -231,9 +278,9 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, 
VAImage *image)
 case VA_FOURCC('U','Y','V','Y'):
 case VA_FOURCC('Y','U','Y','V'):
img->num_planes = 1;
-  img->pitches[0] = w * 2;
+  img->pitches[0] = pitch[0] * 2;
img->offsets[0] = 0;
-  img->data_size  = w * h * 2;
+  img->data_size  = pitch[0] * h * 2;
break;
  
 case VA_FOURCC('B','G','R','A'):

@@ -241,9 +288,9 @@ vlVaDeriveImage(VADriverContextP ctx, VASurfaceID surface, 
VAImage *image)
 case VA_FOURCC('B','G','R','X'):
 case VA_FOURCC('R','G','B','X'):
img->num_planes = 1;
-  img->pitches[0] = w * 4;
+  img->pitches[0] = pitch[0] * 4;
img->offsets[0] = 0;
-  img->data_size  = w * h * 4;
+  img->data_size  = pitch[0] * h * 4;
break;
  
 default:


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Marek Olšák
On Fri, Jul 1, 2016 at 1:45 PM, Vedran Miletić  wrote:
> On 07/01/2016 11:11 AM, Marek Olšák wrote:
>>
>> On Fri, Jul 1, 2016 at 10:54 AM, Marek Olšák  wrote:
>>>
>>> On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić 
>>> wrote:

 On 07/01/2016 01:29 AM, Marek Olšák wrote:
>
>
> From: Marek Olšák 
>
> also fix max_global_size to take a maximum of {vram_size, gart_size}
> ---
>  src/gallium/drivers/r600/r600_pipe.c  | 2 +-
>  src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
>  src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
>  3 files changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_pipe.c
> b/src/gallium/drivers/r600/r600_pipe.c
> index 119c76b..55bbde1 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen*
> pscreen,
> enum pipe_cap param)
> return 0;
>
> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> -   return MIN2(rscreen->b.info.vram_size, 0x);
> +   return MIN2(rscreen->b.info.max_alloc_size,
> 0x);
>
>  case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
>  return R600_MAP_BUFFER_ALIGNMENT;
> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
> b/src/gallium/drivers/radeon/r600_pipe_common.c
> index d7f1d41..f75fa6c 100644
> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> @@ -864,8 +864,8 @@ static int r600_get_compute_param(struct
> pipe_screen
> *screen,
>  * 4 * MAX_MEM_ALLOC_SIZE.
>  */
> *max_global_size = MIN2(4 * max_mem_alloc_size,
> -   rscreen->info.gart_size +
> -   rscreen->info.vram_size);
> +
> MAX2(rscreen->info.gart_size,
> +
> rscreen->info.vram_size));



 Can't you also use info.max_alloc_size here?
>>>
>>>
>>> I can do *max_global_size = max_alloc_size; Does that sound good?
>>
>>
>> Even if max_alloc_size can be 256 MB?
>>
>> Marek
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
> How about
>
> *max_global_size = MIN2(4 * max_mem_alloc_size,
> rscreen->info.max_alloc_size);
>
> Would that work? That avoids calculating the same value in two distinct
> places, since rscreen->info.max_alloc_size = MAX2(rscreen->info.gart_size,
> rscreen->info.vram_size) for both radeon and amdgpu.

max_mem_alloc_size == max_alloc_size. Thus:

MIN2(4 * max_mem_alloc_size, max_alloc_size) == max_alloc_size

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] radeonsi: enable string markers and record apitrace call numbers

2016-07-01 Thread Marek Olšák
On Fri, Jul 1, 2016 at 2:16 PM, Bas Nieuwenhuizen
 wrote:
> On Fri, Jul 1, 2016 at 1:21 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/radeonsi/si_debug.c |  4 
>>  src/gallium/drivers/radeonsi/si_pipe.c  | 20 +++-
>>  src/gallium/drivers/radeonsi/si_pipe.h  |  1 +
>>  3 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
>> b/src/gallium/drivers/radeonsi/si_debug.c
>> index 112e686..220ce13 100644
>> --- a/src/gallium/drivers/radeonsi/si_debug.c
>> +++ b/src/gallium/drivers/radeonsi/si_debug.c
>> @@ -814,6 +814,10 @@ void si_check_vm_faults(struct r600_common_context *ctx,
>> fprintf(f, "Device name: %s\n\n", screen->get_name(screen));
>> fprintf(f, "Failing VM page: 0x%08x\n\n", addr);
>>
>> +   if (sctx->apitrace_call_number)
>> +   fprintf(f, "Last apitrace call: %u\n\n",
>> +   sctx->apitrace_call_number);
>> +
>> switch (ring) {
>> case RING_GFX:
>> si_dump_debug_state(>b.b, f, 0);
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index 6c88fe3..f15e589 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -31,6 +31,7 @@
>>  #include "util/u_memory.h"
>>  #include "util/u_suballoc.h"
>>  #include "vl/vl_decoder.h"
>> +#include "../ddebug/dd_util.h"
>>
>>  #define SI_LLVM_DEFAULT_FEATURES \
>> "+DumpCode,+vgpr-spilling,-fp32-denormals,+fp64-denormals"
>> @@ -106,6 +107,22 @@ si_amdgpu_get_reset_status(struct pipe_context *ctx)
>> return sctx->b.ws->ctx_query_reset_status(sctx->b.ctx);
>>  }
>>
>> +/* Apitrace profiling:
>> + *   1) qapitrace : Tools -> Profile: Measure CPU & GPU times
>> + *   2) In the middle panel, zoom in (mouse wheel) on some bad draw call
>> + *  and remember its number.
>> + *   3) In Mesa, enable queries and performance counters around that draw
>> + *  call and print the results.
>> + *   4) glretrace --benchmark --markers ..
>> + */
>
> Were there issues that you hit with the existing apitrace performance
> counters support?

I didn't know apitrace had that. How to use it?

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Christian König

Hi Boyuan,

as Emil wrote as well try to add some commit messages to the set. For 
this patch something like the following should do it:


Allow to specify more parameters in the encoding interface which where 
previously just hardcoded in the encoder.


Additional to that we need to reorder the patches a bit. First the 
interface changes, then the OMX changes to fill in the previously 
hardcoded values, then the radeon backend changes and then last the 
VA-API changes to use the new interface.


Additional to that a few notes below.

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:

Signed-off-by: Boyuan Zhang 
---
  src/gallium/include/pipe/p_video_state.h | 36 
  1 file changed, 36 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index d353be6..9cd489b 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -352,9 +352,29 @@ struct pipe_h264_enc_rate_control
 unsigned frame_rate_num;
 unsigned frame_rate_den;
 unsigned vbv_buffer_size;
+   unsigned vbv_buf_lv;
 unsigned target_bits_picture;
 unsigned peak_bits_picture_integer;
 unsigned peak_bits_picture_fraction;
+   unsigned fill_data_enable;
+   unsigned enforce_hrd;
+};
+
+struct pipe_h264_enc_motion_estimation
+{
+   unsigned motion_est_quarter_pixel;
+   unsigned enc_disable_sub_mode;
+   unsigned lsmvert;
+   unsigned enc_en_ime_overw_dis_subm;
+   unsigned enc_ime_overw_dis_subm_no;
+   unsigned enc_ime2_search_range_x;
+   unsigned enc_ime2_search_range_y;
+};
+
+struct pipe_h264_enc_pic_control
+{
+   unsigned enc_cabac_enable;
+   unsigned enc_constraint_set_flags;
  };
  
  struct pipe_h264_enc_picture_desc

@@ -363,17 +383,33 @@ struct pipe_h264_enc_picture_desc
  
 struct pipe_h264_enc_rate_control rate_ctrl;
  
+   struct pipe_h264_enc_motion_estimation motion_est;

+   struct pipe_h264_enc_pic_control pic_ctrl;
+
 unsigned quant_i_frames;
 unsigned quant_p_frames;
 unsigned quant_b_frames;
  
 enum pipe_h264_enc_picture_type picture_type;

 unsigned frame_num;
+   unsigned frame_num_cnt;
+   unsigned p_remain;
+   unsigned i_remain;
+   unsigned idr_pic_id;
+   unsigned gop_cnt;
 unsigned pic_order_cnt;
 unsigned ref_idx_l0;
 unsigned ref_idx_l1;
+   unsigned gop_size;
+   unsigned ref_pic_mode;
  
 bool not_referenced;

+   bool is_idr;


Why can't this be inferred from the encoded picture type?


+   bool has_ref_pic_list;
+   bool enable_vui;
+   unsigned int ref_pic_list_0[32];
+   unsigned int ref_pic_list_1[32];
+   unsigned int frame_idx[32];


I thought we wanted to drop the ref_pic_list handling for now. If that 
is still the case please drop those fields here as well.


Regards,
Christian.


  };
  
  struct pipe_h265_sps


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/6] radeonsi: enable string markers and record apitrace call numbers

2016-07-01 Thread Bas Nieuwenhuizen
On Fri, Jul 1, 2016 at 1:21 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/radeonsi/si_debug.c |  4 
>  src/gallium/drivers/radeonsi/si_pipe.c  | 20 +++-
>  src/gallium/drivers/radeonsi/si_pipe.h  |  1 +
>  3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
> b/src/gallium/drivers/radeonsi/si_debug.c
> index 112e686..220ce13 100644
> --- a/src/gallium/drivers/radeonsi/si_debug.c
> +++ b/src/gallium/drivers/radeonsi/si_debug.c
> @@ -814,6 +814,10 @@ void si_check_vm_faults(struct r600_common_context *ctx,
> fprintf(f, "Device name: %s\n\n", screen->get_name(screen));
> fprintf(f, "Failing VM page: 0x%08x\n\n", addr);
>
> +   if (sctx->apitrace_call_number)
> +   fprintf(f, "Last apitrace call: %u\n\n",
> +   sctx->apitrace_call_number);
> +
> switch (ring) {
> case RING_GFX:
> si_dump_debug_state(>b.b, f, 0);
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index 6c88fe3..f15e589 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -31,6 +31,7 @@
>  #include "util/u_memory.h"
>  #include "util/u_suballoc.h"
>  #include "vl/vl_decoder.h"
> +#include "../ddebug/dd_util.h"
>
>  #define SI_LLVM_DEFAULT_FEATURES \
> "+DumpCode,+vgpr-spilling,-fp32-denormals,+fp64-denormals"
> @@ -106,6 +107,22 @@ si_amdgpu_get_reset_status(struct pipe_context *ctx)
> return sctx->b.ws->ctx_query_reset_status(sctx->b.ctx);
>  }
>
> +/* Apitrace profiling:
> + *   1) qapitrace : Tools -> Profile: Measure CPU & GPU times
> + *   2) In the middle panel, zoom in (mouse wheel) on some bad draw call
> + *  and remember its number.
> + *   3) In Mesa, enable queries and performance counters around that draw
> + *  call and print the results.
> + *   4) glretrace --benchmark --markers ..
> + */

Were there issues that you hit with the existing apitrace performance
counters support?

> +static void si_emit_string_marker(struct pipe_context *ctx,
> + const char *string, int len)
> +{
> +   struct si_context *sctx = (struct si_context *)ctx;
> +
> +   dd_parse_apitrace_marker(string, len, >apitrace_call_number);
> +}
> +
>  static struct pipe_context *si_create_context(struct pipe_screen *screen,
>void *priv, unsigned flags)
>  {
> @@ -125,6 +142,7 @@ static struct pipe_context *si_create_context(struct 
> pipe_screen *screen,
> sctx->b.b.screen = screen; /* this must be set first */
> sctx->b.b.priv = priv;
> sctx->b.b.destroy = si_destroy_context;
> +   sctx->b.b.emit_string_marker = si_emit_string_marker;
> sctx->b.set_atom_dirty = (void *)si_set_atom_dirty;
> sctx->screen = sscreen; /* Easy accessing of screen/winsys. */
> sctx->is_debug = (flags & PIPE_CONTEXT_DEBUG) != 0;
> @@ -361,6 +379,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
> case PIPE_CAP_GENERATE_MIPMAP:
> case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
> +   case PIPE_CAP_STRING_MARKER:
> return 1;
>
> case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
> @@ -413,7 +432,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_DRAW_PARAMETERS:
> case PIPE_CAP_MULTI_DRAW_INDIRECT:
> case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
> -   case PIPE_CAP_STRING_MARKER:
> case PIPE_CAP_QUERY_BUFFER_OBJECT:
> case PIPE_CAP_CULL_DISTANCE:
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
> b/src/gallium/drivers/radeonsi/si_pipe.h
> index 3aff0ac..9d15cbf 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -330,6 +330,7 @@ struct si_context {
> struct r600_resource*trace_buf;
> unsignedtrace_id;
> uint64_tdmesg_timestamp;
> +   unsignedapitrace_call_number;
>
> /* Other state */
> bool need_check_render_feedback;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Jan Vesely
On Fri, 2016-07-01 at 13:45 +0200, Vedran Miletić wrote:
> On 07/01/2016 11:11 AM, Marek Olšák wrote:
> > 
> > On Fri, Jul 1, 2016 at 10:54 AM, Marek Olšák 
> > wrote:
> > > 
> > > On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić  > > t> wrote:
> > > > 
> > > > On 07/01/2016 01:29 AM, Marek Olšák wrote:
> > > > > 
> > > > > 
> > > > > From: Marek Olšák 
> > > > > 
> > > > > also fix max_global_size to take a maximum of {vram_size,
> > > > > gart_size}
> > > > > ---
> > > > >  src/gallium/drivers/r600/r600_pipe.c  | 2 +-
> > > > >  src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
> > > > >  src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
> > > > >  3 files changed, 5 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/src/gallium/drivers/r600/r600_pipe.c
> > > > > b/src/gallium/drivers/r600/r600_pipe.c
> > > > > index 119c76b..55bbde1 100644
> > > > > --- a/src/gallium/drivers/r600/r600_pipe.c
> > > > > +++ b/src/gallium/drivers/r600/r600_pipe.c
> > > > > @@ -301,7 +301,7 @@ static int r600_get_param(struct
> > > > > pipe_screen* pscreen,
> > > > > enum pipe_cap param)
> > > > > return 0;
> > > > > 
> > > > > case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> > > > > -   return MIN2(rscreen->b.info.vram_size,
> > > > > 0x);
> > > > > +   return MIN2(rscreen->b.info.max_alloc_size,
> > > > > 0x);
> > > > > 
> > > > >  case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
> > > > >  return R600_MAP_BUFFER_ALIGNMENT;
> > > > > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
> > > > > b/src/gallium/drivers/radeon/r600_pipe_common.c
> > > > > index d7f1d41..f75fa6c 100644
> > > > > --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> > > > > +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> > > > > @@ -864,8 +864,8 @@ static int r600_get_compute_param(struct
> > > > > pipe_screen
> > > > > *screen,
> > > > >  * 4 * MAX_MEM_ALLOC_SIZE.
> > > > >  */
> > > > > *max_global_size = MIN2(4 *
> > > > > max_mem_alloc_size,
> > > > > -   rscreen->info.gart_size +
> > > > > -   rscreen->info.vram_size);
> > > > > +
> > > > > MAX2(rscreen->info.gart_size,
> > > > > +
> > > > > rscreen->info.vram_size));
> > > > 
> > > > Can't you also use info.max_alloc_size here?
> > > I can do *max_global_size = max_alloc_size; Does that sound good?
> > Even if max_alloc_size can be 256 MB?
> > 
> > Marek
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> How about
> 
> *max_global_size = MIN2(4 * max_mem_alloc_size,
>  rscreen->info.max_alloc_size);
> 
> Would that work? That avoids calculating the same value in two
> distinct 
> places, since rscreen->info.max_alloc_size = 
> MAX2(rscreen->info.gart_size, rscreen->info.vram_size) for both
> radeon 
> and amdgpu.

r600 uses one buffer to back a pool of global AS user buffers,
max_alloc_size should be == max_global_size

Jan

> 
> Regards,
> Vedran
> 


signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon uvd add uvd fw version for amdgpu

2016-07-01 Thread Christian König

Am 01.07.2016 um 13:14 schrieb Emil Velikov:

Hi all,

On 29 June 2016 at 20:20, Christian König  wrote:

Am 29.06.2016 um 18:35 schrieb Alex Deucher:

On Wed, Jun 29, 2016 at 11:38 AM, Leo Liu  wrote:

From: sonjiang 

Signed-off-by: sonjiang 
Cc: "12.0" 

For the series:
Reviewed-by: Alex Deucher 


Reviewed-by: Christian König  as well.


Here we have three patches, suggesting a bug with absolutely no
information what the issue is and/or why this approach is correct.

I'm sorry to say this, but as is, this series is not landing in
stable. Sonjiang, being the author of these please reply with a brief
justification why we want those. Before doing so I would strongly
recommend reading this [1] blog post.


Well to put a carrot on the front of your stick: I asked what the 
firmware version patch is all about internally as well when I've seen 
those patches. So it would have even made our internal review much 
easier if Sonny added a commit message in the first place.


My fault to not requesting that his answer is put as a commit message on 
the patches.


On the other hand this is for Polaris, we had time pressure to get it 
out of the door and today is a public holiday in Canada. So you probably 
won't get updated message before Monday.


Is that soon enough? Otherwise UVD will be broken on Polaris in the 
stable branch.


Regards,
Christian.



Thanks
Emil

[1] http://who-t.blogspot.co.uk/2009/12/on-commit-messages.html


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Vedran Miletić

On 07/01/2016 11:11 AM, Marek Olšák wrote:

On Fri, Jul 1, 2016 at 10:54 AM, Marek Olšák  wrote:

On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić  wrote:

On 07/01/2016 01:29 AM, Marek Olšák wrote:


From: Marek Olšák 

also fix max_global_size to take a maximum of {vram_size, gart_size}
---
 src/gallium/drivers/r600/r600_pipe.c  | 2 +-
 src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
 src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c
b/src/gallium/drivers/r600/r600_pipe.c
index 119c76b..55bbde1 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen* pscreen,
enum pipe_cap param)
return 0;

case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
-   return MIN2(rscreen->b.info.vram_size, 0x);
+   return MIN2(rscreen->b.info.max_alloc_size, 0x);

 case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
 return R600_MAP_BUFFER_ALIGNMENT;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
b/src/gallium/drivers/radeon/r600_pipe_common.c
index d7f1d41..f75fa6c 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -864,8 +864,8 @@ static int r600_get_compute_param(struct pipe_screen
*screen,
 * 4 * MAX_MEM_ALLOC_SIZE.
 */
*max_global_size = MIN2(4 * max_mem_alloc_size,
-   rscreen->info.gart_size +
-   rscreen->info.vram_size);
+
MAX2(rscreen->info.gart_size,
+
rscreen->info.vram_size));



Can't you also use info.max_alloc_size here?


I can do *max_global_size = max_alloc_size; Does that sound good?


Even if max_alloc_size can be 256 MB?

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



How about

*max_global_size = MIN2(4 * max_mem_alloc_size,
rscreen->info.max_alloc_size);

Would that work? That avoids calculating the same value in two distinct 
places, since rscreen->info.max_alloc_size = 
MAX2(rscreen->info.gart_size, rscreen->info.vram_size) for both radeon 
and amdgpu.


Regards,
Vedran

--
Vedran Miletić
vedran.miletic.net
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gallium/radeon: add radeon_info::max_alloc_size into the winsys interface

2016-07-01 Thread Vedran Miletić

On 07/01/2016 10:54 AM, Michel Dänzer wrote:

On 01.07.2016 08:29, Marek Olšák wrote:


diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index c4d28ff..76a125c 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -372,6 +372,7 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
 }
 ws->info.gart_size = gem_info.gart_size;
 ws->info.vram_size = gem_info.vram_size;
+ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size);


The radeon driver in kernels older than 3.17 can't allocate BOs larger
than 256MB.




Indeed. You should ensure DRM >= 2.40.

Regards,
Vedran

--
Vedran Miletić
vedran.miletic.net
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-01 Thread Emil Velikov
Hi Boyuan,

I believe Christian already mentioned this - here (and other patches
really) you want to mention "why we want this". Something like "VAAPI
does not allow for XX and YY. And requires the driver to explicitly
manage/provide ZZ" will be more than enough imho.

In general I would suggest beefing up your commit messages - it's not
that every single commit has to have one, but from your existing 25+
patches (with this series) there's a total of ~5 lines worth. Do take
a look at [1] [2] for more info.

Thanks
Emil

[1] http://who-t.blogspot.co.uk/2009/12/on-commit-messages.html
[2] http://chris.beams.io/posts/git-commit/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: make attribute binding message more useful

2016-07-01 Thread Iago Toral
On Fri, 2016-07-01 at 14:00 +1000, Timothy Arceri wrote:
> ---
>  src/mesa/main/shader_query.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
> index b5e1a44..a2a93b1 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -84,7 +84,8 @@ _mesa_BindAttribLocation(GLuint program, GLuint index,
> }
>  
> if (index >= ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs) {
> -  _mesa_error(ctx, GL_INVALID_VALUE, "glBindAttribLocation(index)");
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glBindAttribLocation(%u >= %u)",
> +  index, ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs);

Feel free to ignore this if you think it does not add anything
significant, but maybe we can make it a bit more clear by saying
something like:

"glBindAttribLocation(%u): %u exceeds the maximum number of attributes
(%u)"

Either way:
Reviewed-by: Iago Toral Quiroga 

>return;
> }
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-stable] [PATCH] radeon uvd add uvd fw version for amdgpu

2016-07-01 Thread Emil Velikov
Hi all,

On 29 June 2016 at 20:20, Christian König  wrote:
> Am 29.06.2016 um 18:35 schrieb Alex Deucher:
>>
>> On Wed, Jun 29, 2016 at 11:38 AM, Leo Liu  wrote:
>>>
>>> From: sonjiang 
>>>
>>> Signed-off-by: sonjiang 
>>> Cc: "12.0" 
>>
>> For the series:
>> Reviewed-by: Alex Deucher 
>
>
> Reviewed-by: Christian König  as well.
>
Here we have three patches, suggesting a bug with absolutely no
information what the issue is and/or why this approach is correct.

I'm sorry to say this, but as is, this series is not landing in
stable. Sonjiang, being the author of these please reply with a brief
justification why we want those. Before doing so I would strongly
recommend reading this [1] blog post.

Thanks
Emil

[1] http://who-t.blogspot.co.uk/2009/12/on-commit-messages.html
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Add gperf as a build dependency?

2016-07-01 Thread Emil Velikov
On 30 June 2016 at 19:50, Jose Fonseca  wrote:
> On 30/06/16 18:13, Ian Romanick wrote:
>>
>> I think I might want to use gperf for something in Mesa, but I'm not
>> 100% sure yet.  Before I proceed, is it even acceptable to add that as a
>> build dependency?
>
>
>
> I presume C code generated by gperf is freely usable and it's not subject to
> GPL, right?  (I couldn't find any explicit mention of that in the docs.)
>
Your assumption is spot on. Here's a slightly more authoritative reference.

https://www.gnu.org/software/gperf/manual/gperf.html#Output-Copyright

>
> It looks like the source code builds for Windows with MSVC.  So while it
> will be a bit of a hassle to get it added to our build toolchain, that's not
> a blocker.
>
Shame the https://sourceforge.net/projects/gnuwin32/ hasn't been
updated in a long time.

On the topic in question - we should be fine with it. All Linux and
most/all BSDs distros ship the tool, plus we'll only use it to
generate sources when working with git checkouts.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Marek Olšák
On Fri, Jul 1, 2016 at 10:54 AM, Marek Olšák  wrote:
> On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić  wrote:
>> On 07/01/2016 01:29 AM, Marek Olšák wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> also fix max_global_size to take a maximum of {vram_size, gart_size}
>>> ---
>>>  src/gallium/drivers/r600/r600_pipe.c  | 2 +-
>>>  src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
>>>  src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
>>>  3 files changed, 5 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/r600/r600_pipe.c
>>> b/src/gallium/drivers/r600/r600_pipe.c
>>> index 119c76b..55bbde1 100644
>>> --- a/src/gallium/drivers/r600/r600_pipe.c
>>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>>> @@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen* pscreen,
>>> enum pipe_cap param)
>>> return 0;
>>>
>>> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
>>> -   return MIN2(rscreen->b.info.vram_size, 0x);
>>> +   return MIN2(rscreen->b.info.max_alloc_size, 0x);
>>>
>>>  case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
>>>  return R600_MAP_BUFFER_ALIGNMENT;
>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
>>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> index d7f1d41..f75fa6c 100644
>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> @@ -864,8 +864,8 @@ static int r600_get_compute_param(struct pipe_screen
>>> *screen,
>>>  * 4 * MAX_MEM_ALLOC_SIZE.
>>>  */
>>> *max_global_size = MIN2(4 * max_mem_alloc_size,
>>> -   rscreen->info.gart_size +
>>> -   rscreen->info.vram_size);
>>> +
>>> MAX2(rscreen->info.gart_size,
>>> +
>>> rscreen->info.vram_size));
>>
>>
>> Can't you also use info.max_alloc_size here?
>
> I can do *max_global_size = max_alloc_size; Does that sound good?

Even if max_alloc_size can be 256 MB?

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: use max_alloc_size

2016-07-01 Thread Marek Olšák
On Fri, Jul 1, 2016 at 2:52 AM, Vedran Miletić  wrote:
> On 07/01/2016 01:29 AM, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> also fix max_global_size to take a maximum of {vram_size, gart_size}
>> ---
>>  src/gallium/drivers/r600/r600_pipe.c  | 2 +-
>>  src/gallium/drivers/radeon/r600_pipe_common.c | 9 +++--
>>  src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
>>  3 files changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_pipe.c
>> b/src/gallium/drivers/r600/r600_pipe.c
>> index 119c76b..55bbde1 100644
>> --- a/src/gallium/drivers/r600/r600_pipe.c
>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>> @@ -301,7 +301,7 @@ static int r600_get_param(struct pipe_screen* pscreen,
>> enum pipe_cap param)
>> return 0;
>>
>> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
>> -   return MIN2(rscreen->b.info.vram_size, 0x);
>> +   return MIN2(rscreen->b.info.max_alloc_size, 0x);
>>
>>  case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
>>  return R600_MAP_BUFFER_ALIGNMENT;
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c
>> b/src/gallium/drivers/radeon/r600_pipe_common.c
>> index d7f1d41..f75fa6c 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>> @@ -864,8 +864,8 @@ static int r600_get_compute_param(struct pipe_screen
>> *screen,
>>  * 4 * MAX_MEM_ALLOC_SIZE.
>>  */
>> *max_global_size = MIN2(4 * max_mem_alloc_size,
>> -   rscreen->info.gart_size +
>> -   rscreen->info.vram_size);
>> +
>> MAX2(rscreen->info.gart_size,
>> +
>> rscreen->info.vram_size));
>
>
> Can't you also use info.max_alloc_size here?

I can do *max_global_size = max_alloc_size; Does that sound good?

>
>> }
>> return sizeof(uint64_t);
>>
>> @@ -889,10 +889,7 @@ static int r600_get_compute_param(struct pipe_screen
>> *screen,
>> if (ret) {
>> uint64_t *max_mem_alloc_size = ret;
>>
>> -   /* XXX: The limit in older kernels is 256 MB.  We
>> -* should add a query here for newer kernels.
>> -*/
>> -   *max_mem_alloc_size = 256 * 1024 * 1024;
>> +   *max_mem_alloc_size =
>> rscreen->info.max_alloc_size;
>> }
>> return sizeof(uint64_t);
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>> b/src/gallium/drivers/radeonsi/si_pipe.c
>> index ad2a86a..30e6253 100644
>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>> @@ -419,7 +419,7 @@ static int si_get_param(struct pipe_screen* pscreen,
>> enum pipe_cap param)
>>HAVE_LLVM >= 0x0307 ? 410 : 330;
>>
>> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
>> -   return MIN2(sscreen->b.info.vram_size, 0x);
>> +   return MIN2(sscreen->b.info.max_alloc_size, 0x);
>>
>> case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
>> return 0;
>>
>
> Had something similar in the works, Bas did as well, but this approach is
> cleaner.
>
> With these changes, in si_pipe.c and r600_pipe.c, you should not return
> max_const_buffer_size anymore, since it can exceed int limits, but instead
> something like
>
> MAX2(0x7ff, max_const_buffer_size)

Good point. I'll fix that.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gallium/radeon: add radeon_info::max_alloc_size into the winsys interface

2016-07-01 Thread Michel Dänzer
On 01.07.2016 08:29, Marek Olšák wrote:
> 
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> index c4d28ff..76a125c 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> @@ -372,6 +372,7 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>  }
>  ws->info.gart_size = gem_info.gart_size;
>  ws->info.vram_size = gem_info.vram_size;
> +ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size);

The radeon driver in kernels older than 3.17 can't allocate BOs larger
than 256MB.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/36] i965/blorp: Add an isl_view to blorp_surface_info

2016-07-01 Thread Pohjolainen, Topi
On Wed, Jun 29, 2016 at 05:37:35PM -0700, Jason Ekstrand wrote:
> Eventually, this will be the actual view that gets passed into isl to
> create the surface state.  For now, we just use it for the format and the
> swizzle.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 38 
> +++
>  src/mesa/drivers/dri/i965/brw_blorp.h | 16 ++-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  | 34 
>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp |  2 +-
>  src/mesa/drivers/dri/i965/gen8_blorp.c| 29 
>  5 files changed, 64 insertions(+), 55 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 5e433d3..df92822 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -43,9 +43,11 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>  * using INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, then it had 
> better
>  * be a multiple of num_samples.
>  */
> +   unsigned layer_multiplier = 1;
> if (mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
> mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) {
>assert(mt->num_samples <= 1 || layer % mt->num_samples == 0);
> +  layer_multiplier = MAX2(mt->num_samples, 1);
> }
>  
> intel_miptree_check_level_layer(mt, level, layer);
> @@ -61,13 +63,27 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>info->aux_usage = ISL_AUX_USAGE_NONE;
> }
>  
> +   info->view = (struct isl_view) {
> +  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> +  ISL_SURF_USAGE_TEXTURE_BIT,
> +  .format = ISL_FORMAT_UNSUPPORTED, /* Set later */
> +  .base_level = level,
> +  .levels = 1,
> +  .base_array_layer = layer / layer_multiplier,
> +  .array_len = 1,
> +  .channel_select = {
> + ISL_CHANNEL_SELECT_RED,
> + ISL_CHANNEL_SELECT_GREEN,
> + ISL_CHANNEL_SELECT_BLUE,
> + ISL_CHANNEL_SELECT_ALPHA,
> +  },
> +   };
> +
> info->level = level;
> info->layer = layer;
> info->width = minify(mt->physical_width0, level - mt->first_level);
> info->height = minify(mt->physical_height0, level - mt->first_level);
>  
> -   info->swizzle = SWIZZLE_XYZW;
> -
> if (format == MESA_FORMAT_NONE)
>format = mt->format;
>  
> @@ -75,8 +91,8 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> case MESA_FORMAT_S_UINT8:
>assert(info->surf.tiling == ISL_TILING_W);
>/* Prior to Broadwell, we can't render to R8_UINT */
> -  info->brw_surfaceformat = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> -BRW_SURFACEFORMAT_R8_UNORM;
> +  info->view.format = brw->gen >= 8 ? BRW_SURFACEFORMAT_R8_UINT :
> +  BRW_SURFACEFORMAT_R8_UNORM;

isl_view::format is of the type "enum isl_format" but we assigned it with
BRW_SURFACEFORMAT?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/36] i965/blorp: Move intratile offset calculations out of surface state setup

2016-07-01 Thread Pohjolainen, Topi
On Wed, Jun 29, 2016 at 05:37:34PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c| 24 
>  src/mesa/drivers/dri/i965/brw_blorp.h| 15 ++-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  8 
>  3 files changed, 18 insertions(+), 29 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index d6581d0..5e433d3 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -66,9 +66,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> info->width = minify(mt->physical_width0, level - mt->first_level);
> info->height = minify(mt->physical_height0, level - mt->first_level);
>  
> -   intel_miptree_get_image_offset(mt, level, layer,
> -  >x_offset, >y_offset);
> -
> info->swizzle = SWIZZLE_XYZW;
>  
> if (format == MESA_FORMAT_NONE)
> @@ -110,6 +107,15 @@ brw_blorp_surface_info_init(struct brw_context *brw,
>break;
> }
> }
> +
> +   uint32_t x_offset, y_offset;
> +   intel_miptree_get_image_offset(mt, level, layer, _offset, _offset);
> +
> +   uint8_t bs = isl_format_get_layout(info->brw_surfaceformat)->bs;
> +   isl_tiling_get_intratile_offset_el(>isl_dev, info->surf.tiling, bs,
> +  info->surf.row_pitch, x_offset, 
> y_offset,
> +  >bo_offset,
> +  >tile_x_sa, >tile_y_sa);
>  }
>  
>  
> @@ -305,13 +311,6 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
>ISL_SURF_USAGE_TEXTURE_BIT,
> };
>  
> -   uint32_t offset, tile_x, tile_y;
> -   isl_tiling_get_intratile_offset_el(>isl_dev, surf.tiling,
> -  isl_format_get_layout(view.format)->bs,
> -  surf.row_pitch,
> -  surface->x_offset, surface->y_offset,
> -  , _x, _y);
> -
> uint32_t surf_offset;
> uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
>ss_info.num_dwords * 4, ss_info.ss_align,
> @@ -320,11 +319,12 @@ brw_blorp_emit_surface_state(struct brw_context *brw,
> const uint32_t mocs = is_render_target ? ss_info.rb_mocs : 
> ss_info.tex_mocs;
>  
> isl_surf_fill_state(>isl_dev, dw, .surf = , .view = ,
> -   .address = surface->mt->bo->offset64 + offset,
> +   .address = surface->mt->bo->offset64 + 
> surface->bo_offset,
> .aux_surf = aux_surf, .aux_usage = surface->aux_usage,
> .aux_address = aux_offset,
> .mocs = mocs, .clear_color = clear_color,
> -   .x_offset_sa = tile_x, .y_offset_sa = tile_y);
> +   .x_offset_sa = surface->tile_x_sa,
> +   .y_offset_sa = surface->tile_y_sa);
>  
> /* Emit relocation to surface contents */
> drm_intel_bo_emit_reloc(brw->batch.bo,
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index b8a8d06..fddd007 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -104,19 +104,8 @@ struct brw_blorp_surface_info
>  */
> uint32_t height;
>  
> -   /**
> -* X offset within the surface to texture from (or render to).  For
> -* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not
> -* pixels.
> -*/
> -   uint32_t x_offset;
> -
> -   /**
> -* Y offset within the surface to texture from (or render to).  For
> -* surfaces using INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not
> -* pixels.
> -*/
> -   uint32_t y_offset;
> +   uint32_t bo_offset;
> +   uint32_t tile_x_sa, tile_y_sa;
>  
> /**
>  * Format that should be used when setting up the surface state for this
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index c253412..5696d52 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1896,8 +1896,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>params.y1 = ALIGN(params.y1, y_align) / 2;
>params.dst.width = ALIGN(params.dst.width, x_align) * 2;
>params.dst.height = ALIGN(params.dst.height, y_align) / 2;
> -  params.dst.x_offset *= 2;
> -  params.dst.y_offset /= 2;
> +  params.dst.tile_x_sa *= 2;
> +  params.dst.tile_y_sa /= 2;

This I had to think about a little. Previously we multiplied full x/y offsets,
resolved tile aligned buffer offset and intra tile offset based on that. Now
we let ISL to take into account the msaa setting and we only multiply the
resolved intra tile offsets. Makes sense and looks a little 

[Mesa-dev] [Bug 96235] st_nir.h:34: error: redefinition of typedef ‘nir_shader’

2016-07-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96235

Vinson Lee  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #2 from Vinson Lee  ---
commit 3fea592c4eb26f6652bef1e5dc430e2296e14bac
Author: Vinson Lee 
Date:   Wed Jun 29 20:15:03 2016 -0700

mesa/st: Use 'struct nir_shader' instead of 'nir_shader'.

Fix this build error with GCC 4.4.

  CC state_tracker/st_nir_lower_builtin.lo
In file included from state_tracker/st_nir_lower_builtin.c:61:
state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’
../../src/compiler/nir/nir.h:1830: note: previous declaration of
‘nir_shader’ was here

Suggested-by: Rob Clark 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235
Signed-off-by: Vinson Lee 
Reviewed-by: Jason Ekstrand 
Reviewed-by: Rob Clark 

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/36] i965/blorp: Refactor interleaved multisample destination handling

2016-07-01 Thread Pohjolainen, Topi
On Fri, Jul 01, 2016 at 09:56:10AM +0300, Pohjolainen, Topi wrote:
> On Wed, Jun 29, 2016 at 05:37:33PM -0700, Jason Ekstrand wrote:
> > We put all of the code for fake IMS together.  This requires moving a bit
> > of the program key setup code further down so that it gets the right values
> > out of the final surface.
> > ---
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 71 
> > +---
> >  1 file changed, 34 insertions(+), 37 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> > b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > index 9a0b9bb..c253412 100644
> > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> > @@ -1695,28 +1695,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >unreachable("Unrecognized blorp format");
> > }
> >  
> > -   if (brw->gen > 6) {
> > -  /* Gen7's rendering hardware only supports the IMS layout for depth 
> > and
> > -   * stencil render targets.  Blorp always maps its destination 
> > surface as
> > -   * a color render target (even if it's actually a depth or stencil
> > -   * buffer).  So if the destination is IMS, we'll have to map it as a
> > -   * single-sampled texture and interleave the samples ourselves.
> > -   */
> > -  if (dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
> 
> This should have been switched to isl enum in patch 12 (Use isl_msaa_layout
> instead of intel_msaa_layout)?
> 
> > - params.dst.surf.samples = 1;
> > - params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> > -  }
> > -   }
> > -
> > -   if (params.src.surf.samples > 0 && params.dst.surf.samples > 1) {
> > -  /* We are blitting from a multisample buffer to a multisample 
> > buffer, so
> > -   * we must preserve samples within a pixel.  This means we have to
> > -   * arrange for the WM program to run once per sample rather than once
> > -   * per pixel.
> > -   */
> > -  wm_prog_key.persample_msaa_dispatch = true;
> > -   }
> > -
> > /* Scaled blitting or not. */
> > wm_prog_key.blit_scaled =
> >((dst_x1 - dst_x0) == (src_x1 - src_x0) &&
> > @@ -1756,20 +1734,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> > wm_prog_key.src_samples = src_mt->num_samples;
> > wm_prog_key.dst_samples = dst_mt->num_samples;
> >  
> > -   /* tex_samples and rt_samples are the sample counts that are set up in
> > -* SURFACE_STATE.
> > -*/
> > -   wm_prog_key.tex_samples = params.src.surf.samples;
> > -   wm_prog_key.rt_samples  = params.dst.surf.samples;
> > -
> > wm_prog_key.tex_aux_usage = params.src.aux_usage;
> >  
> > -   /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline 
> > will
> > -* use to access the source and destination surfaces.
> > -*/
> > -   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
> > -   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
> > -
> > /* src_layout and dst_layout indicate the true MSAA layout used by src 
> > and
> >  * dst.
> >  */
> > @@ -1805,7 +1771,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> >params.wm_push_consts.src_z = 0;
> > }
> >  
> > -   if (params.dst.surf.samples <= 1 && dst_mt->num_samples > 1) {
> > +   if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
> 
> And here we should have ISL enum, right?

Otherwise the patch looks right:

Reviewed-by: Topi Pohjolainen 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] i965: Emit SKL VF cache invalidation W/A from brw_emit_pipe_control_flush.

2016-07-01 Thread Alejandro Piñeiro
Looks good to me:

Reviewed-by: Alejandro Piñeiro 

Note: I think that patches 3-4 should be reviewed by a more seasoned
developer (specially patch 3).

On 01/07/16 07:07, Francisco Jerez wrote:
> There were two places in the driver doing a pipe control VF cache
> flush, one of them was missing this workaround, move it down into
> brw_emit_pipe_control_flush to make sure we don't miss it again.
> ---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 586355d..14a8f7c 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -100,6 +100,16 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
> uint32_t flags)
>if (brw->gen == 8)
>   gen8_add_cs_stall_workaround_bits();
>  
> +  if (brw->gen == 9 &&
> +  (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) {
> + /* Hardware workaround: SKL
> +  *
> +  * Emit Pipe Control with all bits set to zero before emitting
> +  * a Pipe Control with VF Cache Invalidate set.
> +  */
> + brw_emit_pipe_control_flush(brw, 0);
> +  }
> +
>BEGIN_BATCH(6);
>OUT_BATCH(_3DSTATE_PIPE_CONTROL | (6 - 2));
>OUT_BATCH(flags);
> @@ -322,15 +332,6 @@ brw_emit_mi_flush(struct brw_context *brw)
> } else {
>int flags = PIPE_CONTROL_NO_WRITE | PIPE_CONTROL_RENDER_TARGET_FLUSH;
>if (brw->gen >= 6) {
> - if (brw->gen == 9) {
> -/* Hardware workaround: SKL
> - *
> - * Emit Pipe Control with all bits set to zero before emitting
> - * a Pipe Control with VF Cache Invalidate set.
> - */
> -brw_emit_pipe_control_flush(brw, 0);
> - }
> -
>   flags |= PIPE_CONTROL_INSTRUCTION_INVALIDATE |
>PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>PIPE_CONTROL_VF_CACHE_INVALIDATE |

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/36] i965/blorp: Refactor interleaved multisample destination handling

2016-07-01 Thread Pohjolainen, Topi
On Wed, Jun 29, 2016 at 05:37:33PM -0700, Jason Ekstrand wrote:
> We put all of the code for fake IMS together.  This requires moving a bit
> of the program key setup code further down so that it gets the right values
> out of the final surface.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 71 
> +---
>  1 file changed, 34 insertions(+), 37 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 9a0b9bb..c253412 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -1695,28 +1695,6 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>unreachable("Unrecognized blorp format");
> }
>  
> -   if (brw->gen > 6) {
> -  /* Gen7's rendering hardware only supports the IMS layout for depth and
> -   * stencil render targets.  Blorp always maps its destination surface 
> as
> -   * a color render target (even if it's actually a depth or stencil
> -   * buffer).  So if the destination is IMS, we'll have to map it as a
> -   * single-sampled texture and interleave the samples ourselves.
> -   */
> -  if (dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {

This should have been switched to isl enum in patch 12 (Use isl_msaa_layout
instead of intel_msaa_layout)?

> - params.dst.surf.samples = 1;
> - params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> -  }
> -   }
> -
> -   if (params.src.surf.samples > 0 && params.dst.surf.samples > 1) {
> -  /* We are blitting from a multisample buffer to a multisample buffer, 
> so
> -   * we must preserve samples within a pixel.  This means we have to
> -   * arrange for the WM program to run once per sample rather than once
> -   * per pixel.
> -   */
> -  wm_prog_key.persample_msaa_dispatch = true;
> -   }
> -
> /* Scaled blitting or not. */
> wm_prog_key.blit_scaled =
>((dst_x1 - dst_x0) == (src_x1 - src_x0) &&
> @@ -1756,20 +1734,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> wm_prog_key.src_samples = src_mt->num_samples;
> wm_prog_key.dst_samples = dst_mt->num_samples;
>  
> -   /* tex_samples and rt_samples are the sample counts that are set up in
> -* SURFACE_STATE.
> -*/
> -   wm_prog_key.tex_samples = params.src.surf.samples;
> -   wm_prog_key.rt_samples  = params.dst.surf.samples;
> -
> wm_prog_key.tex_aux_usage = params.src.aux_usage;
>  
> -   /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline will
> -* use to access the source and destination surfaces.
> -*/
> -   wm_prog_key.tex_layout = params.src.surf.msaa_layout;
> -   wm_prog_key.rt_layout = params.dst.surf.msaa_layout;
> -
> /* src_layout and dst_layout indicate the true MSAA layout used by src and
>  * dst.
>  */
> @@ -1805,7 +1771,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>params.wm_push_consts.src_z = 0;
> }
>  
> -   if (params.dst.surf.samples <= 1 && dst_mt->num_samples > 1) {
> +   if (brw->gen > 6 && dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {

And here we should have ISL enum, right?

>/* We must expand the rectangle we send through the rendering pipeline,
> * to account for the fact that we are mapping the destination region 
> as
> * single-sampled when it is in fact multisampled.  We must also align
> @@ -1818,8 +1784,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> * If it's UMS, then we have no choice but to set up the rendering
> * pipeline as multisampled.
> */
> -  assert(dst_mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS);
> -  switch (dst_mt->num_samples) {
> +  assert(params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED);
> +  switch (params.dst.surf.samples) {
>case 2:
>   params.x0 = ROUND_DOWN_TO(params.x0 * 2, 4);
>   params.y0 = ROUND_DOWN_TO(params.y0, 4);
> @@ -1847,6 +1813,16 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>default:
>   unreachable("Unrecognized sample count in brw_blorp_blit_params 
> ctor");
>}
> +
> +  /* Gen7's rendering hardware only supports the IMS layout for depth and
> +   * stencil render targets.  Blorp always maps its destination surface 
> as
> +   * a color render target (even if it's actually a depth or stencil
> +   * buffer).  So if the destination is IMS, we'll have to map it as a
> +   * single-sampled texture and interleave the samples ourselves.
> +   */
> +  params.dst.surf.samples = 1;
> +  params.dst.surf.msaa_layout = ISL_MSAA_LAYOUT_NONE;
> +
>wm_prog_key.use_kill = true;
> }
>  
> @@ -1950,6 +1926,27 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
>params.src.y_offset /= 2;
> }
>  
> +   /* tex_samples and rt_samples are the sample counts that are set up in
> +* SURFACE_STATE.
> +*/
> +   

Re: [Mesa-dev] [PATCH 1/4] i965: Emit SNB write cache flush W/A from brw_emit_pipe_control_flush.

2016-07-01 Thread Alejandro Piñeiro
Looks good to me:

Reviewed-by: Alejandro Piñeiro 

On 01/07/16 07:07, Francisco Jerez wrote:
> Shouldn't cause any functional changes at this point, but we have
> forgotten to apply this workaround several times in the past, make
> sure it doesn't happen again.
> ---
>  src/mesa/drivers/dri/i965/brw_misc_state.c   |  9 -
>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 21 +++--
>  src/mesa/drivers/dri/i965/intel_fbo.c|  8 
>  src/mesa/drivers/dri/i965/intel_tex.c|  8 
>  4 files changed, 11 insertions(+), 35 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
> b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index 690c2f6..c3d341f 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -925,15 +925,6 @@ brw_emit_select_pipeline(struct brw_context *brw, enum 
> brw_pipeline pipeline)
>const unsigned dc_flush =
>   brw->gen >= 7 ? PIPE_CONTROL_DATA_CACHE_FLUSH : 0;
>  
> -  if (brw->gen == 6) {
> - /* Hardware workaround: SNB B-Spec says:
> -  *
> -  *   Before a PIPE_CONTROL with Write Cache Flush Enable = 1, a
> -  *   PIPE_CONTROL with any non-zero post-sync-op is required.
> -  */
> - brw_emit_post_sync_nonzero_flush(brw);
> -  }
> -
>brw_emit_pipe_control_flush(brw,
>PIPE_CONTROL_RENDER_TARGET_FLUSH |
>PIPE_CONTROL_DEPTH_CACHE_FLUSH |
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 4672efd..586355d 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -109,6 +109,17 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
> uint32_t flags)
>OUT_BATCH(0);
>ADVANCE_BATCH();
> } else if (brw->gen >= 6) {
> +  if (brw->gen == 6 &&
> +  (flags & PIPE_CONTROL_RENDER_TARGET_FLUSH)) {
> + /* Hardware workaround: SNB B-Spec says:
> +  *
> +  *   [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush
> +  *   Enable = 1, a PIPE_CONTROL with any non-zero post-sync-op is
> +  *   required.
> +  */
> + brw_emit_post_sync_nonzero_flush(brw);
> +  }
> +
>flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags);
>  
>BEGIN_BATCH(5);
> @@ -325,16 +336,6 @@ brw_emit_mi_flush(struct brw_context *brw)
>PIPE_CONTROL_VF_CACHE_INVALIDATE |
>PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
>PIPE_CONTROL_CS_STALL;
> -
> - if (brw->gen == 6) {
> -/* Hardware workaround: SNB B-Spec says:
> - *
> - * [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache
> - * Flush Enable =1, a PIPE_CONTROL with any non-zero
> - * post-sync-op is required.
> - */
> -brw_emit_post_sync_nonzero_flush(brw);
> - }
>}
>brw_emit_pipe_control_flush(brw, flags);
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index 939f9a0..707a9d2 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -1061,14 +1061,6 @@ brw_render_cache_set_check_flush(struct brw_context 
> *brw, drm_intel_bo *bo)
>return;
>  
> if (brw->gen >= 6) {
> -  if (brw->gen == 6) {
> - /* [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache
> -  * Flush Enable = 1, a PIPE_CONTROL with any non-zero
> -  * post-sync-op is required.
> -  */
> - brw_emit_post_sync_nonzero_flush(brw);
> -  }
> -
>brw_emit_pipe_control_flush(brw,
>PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>PIPE_CONTROL_RENDER_TARGET_FLUSH |
> diff --git a/src/mesa/drivers/dri/i965/intel_tex.c 
> b/src/mesa/drivers/dri/i965/intel_tex.c
> index a802d5a..8c32fe3 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex.c
> @@ -364,14 +364,6 @@ intel_texture_barrier(struct gl_context *ctx)
> struct brw_context *brw = brw_context(ctx);
>  
> if (brw->gen >= 6) {
> -  if (brw->gen == 6) {
> - /* [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache
> -  * Flush Enable = 1, a PIPE_CONTROL with any non-zero
> -  * post-sync-op is required.
> -  */
> - brw_emit_post_sync_nonzero_flush(brw);
> -  }
> -
>brw_emit_pipe_control_flush(brw,
>PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>PIPE_CONTROL_RENDER_TARGET_FLUSH |


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org

Re: [Mesa-dev] [PATCH 13/36] i965/blorp: Get rid of brw_blorp_surface_info::array_layout

2016-07-01 Thread Pohjolainen, Topi
On Wed, Jun 29, 2016 at 05:37:32PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 1 -
>  src/mesa/drivers/dri/i965/brw_blorp.h | 9 -
>  2 files changed, 10 deletions(-)

I found only one piece of blorp even considering the layout, and that reads
it from the miptree directly:

src/mesa/drivers/dri/i965/gen6_blorp.c:  if (hiz_mt->array_layout == 
ALL_SLICES_AT_EACH_LOD) {

So:

Reviewed-by: Topi Pohjolainen 

> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 34a16dc..d6581d0 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -69,7 +69,6 @@ brw_blorp_surface_info_init(struct brw_context *brw,
> intel_miptree_get_image_offset(mt, level, layer,
>>x_offset, >y_offset);
>  
> -   info->array_layout = mt->array_layout;
> info->swizzle = SWIZZLE_XYZW;
>  
> if (format == MESA_FORMAT_NONE)
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index dc8632b..b8a8d06 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -119,15 +119,6 @@ struct brw_blorp_surface_info
> uint32_t y_offset;
>  
> /**
> -* Indicates if we use the standard miptree layout 
> (ALL_LOD_IN_EACH_SLICE),
> -* or if we tightly pack array slices at each LOD 
> (ALL_SLICES_AT_EACH_LOD).
> -*
> -* If ALL_SLICES_AT_EACH_LOD is set, then ARYSPC_LOD0 can be used. Ignored
> -* prior to Gen7.
> -*/
> -   enum miptree_array_layout array_layout;
> -
> -   /**
>  * Format that should be used when setting up the surface state for this
>  * surface.  Should correspond to one of the BRW_SURFACEFORMAT_* enums.
>  */
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/36] i965/blorp: Use the ISL aux_layout for deciding whether to do an MCS fetch

2016-07-01 Thread Pohjolainen, Topi
On Wed, Jun 29, 2016 at 05:37:30PM -0700, Jason Ekstrand wrote:
> AUX USAGE

Commit message looks a little incomplete. Otherwise patches 10-12 are

Reviewed-by: Topi Pohjolainen 

> ---
>  src/mesa/drivers/dri/i965/brw_blorp.h|  2 ++
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 16 +---
>  2 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
> b/src/mesa/drivers/dri/i965/brw_blorp.h
> index a85d368..89a799a 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -272,6 +272,8 @@ struct brw_blorp_blit_prog_key
>  */
> enum intel_msaa_layout tex_layout;
>  
> +   enum isl_aux_usage tex_aux_usage;
> +
> /* Actual number of samples per pixel in the source image. */
> unsigned src_samples;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index 963432b..53c6cd8 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -901,7 +901,7 @@ static inline int count_trailing_one_bits(unsigned value)
>  static nir_ssa_def *
>  blorp_nir_manual_blend_average(nir_builder *b, nir_ssa_def *pos,
> unsigned tex_samples,
> -   enum intel_msaa_layout tex_layout,
> +   enum isl_aux_usage tex_aux_usage,
> enum brw_reg_type dst_type)
>  {
> /* If non-null, this is the outer-most if statement */
> @@ -911,7 +911,7 @@ blorp_nir_manual_blend_average(nir_builder *b, 
> nir_ssa_def *pos,
>nir_local_variable_create(b->impl, glsl_vec4_type(), "color");
>  
> nir_ssa_def *mcs = NULL;
> -   if (tex_layout == INTEL_MSAA_LAYOUT_CMS)
> +   if (tex_aux_usage == ISL_AUX_USAGE_MCS)
>mcs = blorp_nir_txf_ms_mcs(b, pos);
>  
> /* We add together samples using a binary tree structure, e.g. for 4x 
> MSAA:
> @@ -956,7 +956,7 @@ blorp_nir_manual_blend_average(nir_builder *b, 
> nir_ssa_def *pos,
>  nir_imm_int(b, i));
>texture_data[stack_depth++] = blorp_nir_txf_ms(b, ms_pos, mcs, 
> dst_type);
>  
> -  if (i == 0 && tex_layout == INTEL_MSAA_LAYOUT_CMS) {
> +  if (i == 0 && tex_aux_usage == ISL_AUX_USAGE_MCS) {
>   /* The Ivy Bridge PRM, Vol4 Part1 p27 (Multisample Control Surface)
>* suggests an optimization:
>*
> @@ -1073,7 +1073,7 @@ blorp_nir_manual_blend_bilinear(nir_builder *b, 
> nir_ssa_def *pos,
> * here inside the loop after computing the pixel coordinates.
> */
>nir_ssa_def *mcs = NULL;
> -  if (key->tex_layout == INTEL_MSAA_LAYOUT_CMS)
> +  if (key->tex_aux_usage == ISL_AUX_USAGE_MCS)
>   mcs = blorp_nir_txf_ms_mcs(b, sample_coords_int);
>  
>/* Compute sample index and map the sample index to a sample number.
> @@ -1430,7 +1430,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
>} else {
>   /* Gen7+ hardware doesn't automaticaly blend. */
>   color = blorp_nir_manual_blend_average(, src_pos, 
> key->src_samples,
> -key->src_layout,
> +key->tex_aux_usage,
>  key->texture_data_type);
>}
> } else if (key->blend && key->blit_scaled) {
> @@ -1482,7 +1482,7 @@ brw_blorp_build_nir_shader(struct brw_context *brw,
>  color = blorp_nir_txf(, , src_pos, key->texture_data_type);
>   } else {
>  nir_ssa_def *mcs = NULL;
> -if (key->tex_layout == INTEL_MSAA_LAYOUT_CMS)
> +if (key->tex_aux_usage == ISL_AUX_USAGE_MCS)
> mcs = blorp_nir_txf_ms_mcs(, src_pos);
>  
>  color = blorp_nir_txf_ms(, src_pos, mcs, 
> key->texture_data_type);
> @@ -1516,7 +1516,7 @@ brw_blorp_get_blit_kernel(struct brw_context *brw,
> struct brw_wm_prog_key wm_key;
> brw_blorp_init_wm_prog_key(_key);
> wm_key.tex.compressed_multisample_layout_mask =
> -  prog_key->tex_layout == INTEL_MSAA_LAYOUT_CMS;
> +  prog_key->tex_aux_usage == ISL_AUX_USAGE_MCS;
> wm_key.tex.msaa_16 = prog_key->tex_samples == 16;
> wm_key.multisample_fbo = prog_key->rt_samples > 1;
>  
> @@ -1788,6 +1788,8 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
> wm_prog_key.tex_samples = params.src.surf.samples;
> wm_prog_key.rt_samples  = params.dst.surf.samples;
>  
> +   wm_prog_key.tex_aux_usage = params.src.aux_usage;
> +
> /* tex_layout and rt_layout indicate the MSAA layout the GPU pipeline will
>  * use to access the source and destination surfaces.
>  */
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> 

Re: [Mesa-dev] [PATCH v2] docs: update MESA_DEBUG envvar documentation.

2016-07-01 Thread Alejandro Piñeiro
On 30/06/16 15:28, Brian Paul wrote:
> On 06/29/2016 09:50 AM, Alejandro Piñeiro wrote:
>> silent, flush, incomplete_tex and incomplete_fbo flags were not
>> documented (see src/mesa/main.debug.c for more info).
>>
>> FP is not checked anymore.
>>
>> v2 (Brian Paul):
>>   * MESA_DEBUG accepts a comma-separated list of parameters.
>>   * Clarify how MESA_DEBUG behaves with mesa debug and release builds.
>>   * Updated wording.
>> ---
>>
>> Tried to answer all the comments from previous review on the
>> documentation itself.
>>
>> Initially I had the following paragraph as an extra explanation:
>>
>> "MESA_DEBUG=1 on release builds is equivalent to the default behaviour
>> of debug builds. MESA_DEBUG=silent on debug builds is equivalent to
>> the default behaviour of release builds."
>>
>> But I think that is not needed, as should be understood from the
>> documentation.
>>
>>   docs/envvars.html | 15 +--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/envvars.html b/docs/envvars.html
>> index ed957bd..7d19fb3 100644
>> --- a/docs/envvars.html
>> +++ b/docs/envvars.html
>> @@ -50,8 +50,19 @@ sometimes be useful for debugging end-user issues.
>>  if the application generates a GL_INVALID_ENUM error, a
>> corresponding error
>>  message indicating where the error occurred, and possibly why,
>> will be
>>  printed to stderr.
>> -   If the value of MESA_DEBUG is 'FP' floating point arithmetic
>> errors will
>> -   generate exceptions.
>> +
>> +   MESA_DEBUG=1 is not needed for mesa debug builds, as it is the
>> +   behaviour by default. It would be needed for mesa release builds,
>> +   that silent debug messages.
>
> The second sentence is a bit off.  How about this:  "For release
> builds, MESA_DEBUG defaults to off (no debug output)."

Ok.

>
>
>> +
>> +   MESA_DEBUG accepts the following comma-separated list of named
>> +   flags, which adds extra behaviour to just set MESA_DEBUG=1:
>> +   
>> + silent - turn off debug messages. Only useful for debug
>> builds.
>> + flush - flush after each drawing command
>> + incomplete_tex - extra debug messages when a texture is
>> incomplete
>> + incomplete_fbo - extra debug messages when a fbo is
>> incomplete
>> +   
>>   MESA_LOG_FILE - specifies a file name for logging all errors,
>> warnings,
>>   etc., rather than stderr
>>   MESA_TEX_PROG - if set, implement conventional texture env
>> modes with
>>
>
> With that, Reviewed-by: Brian Paul 

Thanks. Just pushed.

>
> Longer term, MESA_DEBUG should be cleaned up.  It's a bit of a
> hodgepodge now.
>

Yes, right it is hard to follow the debug logic unless you check the
code. For example getenv(MESA_DEBUG) is called in several places of the
code. Probably it would be better to call it once, at main/debug.c, when
the DEBUG_FLAGS are initialized.

BR

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] i965: intel_texture_barrier reimplemented

2016-07-01 Thread Alejandro Piñeiro
On 30/06/16 23:16, Francisco Jerez wrote:
> Alejandro Piñeiro  writes:
>
>> Fixes:
>> GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass
>>
>> On Haswell, Broadwell and Skylake (note that in order to execute that
>> test, it is needed to override GL and GLSL versions).
>>
>> On gen6 this test was already working without this change. It keeps
>> working after it.
>>
>> This commit replaces the call to brw_emit_mi_flush for gen6+ with two
>> calls to brw_emit_pipe_control_flush:
>>
>>  * The first one with RENDER_TARGET_FLUSH and CS_STALL set to initiate
>>a render cache flush after any concurrent rendering completes and
>>cause the CS to stop parsing commands until the render cache
>>becomes coherent with memory.
>>
>>  * The second one have TEXTURE_CACHE_INVALIDATE set (and no CS stall)
>>to clean up any stale data from the sampler caches before rendering
>>continues.
>>
>> Didn't touch gen4-5, basically because I don't have a way to test
>> them.
>>
>> More info on commits:
>> 0aa4f99f562a05880a779707cbcd46be459863bf
>> 72473658c51d5e074ce219c1e6385a4cce29f467
>>
>> Thanks to Curro to help to tracking this down, as the root case was a
>> hw race condition.
>>
>> v2: use two calls to pipe_control_flush instead of a combination of
>> gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro)
>> v3: no need to const cache invalidation (Curro)
>> ---
>>
>> FWIW: checked with the CTS tests, and the piglit series, and confirmed
>> that the const cache invalidation is not needed.
>>
>>  src/mesa/drivers/dri/i965/intel_tex.c | 21 -
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_tex.c 
>> b/src/mesa/drivers/dri/i965/intel_tex.c
>> index cac33ac..a802d5a 100644
>> --- a/src/mesa/drivers/dri/i965/intel_tex.c
>> +++ b/src/mesa/drivers/dri/i965/intel_tex.c
>> @@ -9,6 +9,7 @@
>>  #include "intel_mipmap_tree.h"
>>  #include "intel_tex.h"
>>  #include "intel_fbo.h"
>> +#include "intel_reg.h"
>>  
>>  #define FILE_DEBUG_FLAG DEBUG_TEXTURE
>>  
>> @@ -362,7 +363,25 @@ intel_texture_barrier(struct gl_context *ctx)
>>  {
>> struct brw_context *brw = brw_context(ctx);
>>  
>> -   brw_emit_mi_flush(brw);
>> +   if (brw->gen >= 6) {
>> +  if (brw->gen == 6) {
>> + /* [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache
>> +  * Flush Enable = 1, a PIPE_CONTROL with any non-zero
>> +  * post-sync-op is required.
>> +  */
>> + brw_emit_post_sync_nonzero_flush(brw);
>> +  }
>> +
>> +  brw_emit_pipe_control_flush(brw,
>> +  PIPE_CONTROL_DEPTH_CACHE_FLUSH |
>> +  PIPE_CONTROL_RENDER_TARGET_FLUSH |
>> +  PIPE_CONTROL_CS_STALL);
>> +
>> +  brw_emit_pipe_control_flush(brw,
>> +  PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
>> +   } else {
>> +  brw_emit_mi_flush(brw);
>> +   }
> Looks reasonable to me, let's get this bug fixed on master, things can
> be refactored later on:
>
> Reviewed-by: Francisco Jerez 

Ok, just pushed. Thanks for all the review and comments.


BR



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev