[Mesa-dev] [AppVeyor] mesa master #3477 completed

2017-02-16 Thread AppVeyor


Build mesa 3477 completed



Commit 172c48cc15 by Timothy Arceri on 2/17/2017 5:27 AM:

glsl: fix scons builds with shader cache\n\nFor now its disabled for scons so wrap glsl cache calls in a\ndefine conditional.


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/32] util/disk_cache: check cache exists before calling munmap()

2017-02-16 Thread Timothy Arceri



On 17/02/17 14:40, Mark Janes wrote:

Timothy Arceri  writes:


On 17/02/17 12:20, Mark Janes wrote:

This series breaks the scons build:

src/compiler/glsl/linker.cpp:4641: undefined reference to
`shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'


To me it looks like its been broken for almost a month already. I'm getting.

ast_to_hir.cpp:263:36: error: ‘ir_unop_i642d’ was not declared in this scope


I get similar errors if I don't `git clean -xfd` first.


That works, thanks. I've pushed a fix.




author  Dave Airlie   2016-06-09 00:01:00
committer   Ian Romanick2017-01-20
commit  78cc44280e3faeded8eea7face614e13d28481f0
tree184345721e2f88812069fcf94801250b6a214b05
parent  85faf5082f06ed5828c6d97bb11dd2292ad0f86a

glsl/ast: Add 64-bit integer support to conversion functions






Timothy Arceri  writes:


---
 src/util/disk_cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 10b9d81..8eccf72 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -383,7 +383,8 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp)
 void
 disk_cache_destroy(struct disk_cache *cache)
 {
-   munmap(cache->index_mmap, cache->index_mmap_size);
+   if (cache)
+  munmap(cache->index_mmap, cache->index_mmap_size);

ralloc_free(cache);
 }
--
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] compiler: simplify building glsl shader cache

2017-02-16 Thread Timothy Arceri
Actually please ignore this one. It's going to be a pain to build these 
with scons so its better to leave them here.


On 17/02/17 16:07, Timothy Arceri wrote:

I think this made sense at one point when you could disable building
of the cache. Now it's always built so just merge it into
LIBGLSL_FILES.

This partially fixes scons builds.
---
 src/compiler/Makefile.glsl.am | 3 +--
 src/compiler/Makefile.sources | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index 41edb3c..f673196 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -131,8 +131,7 @@ glsl_libglsl_la_LIBADD = \

 glsl_libglsl_la_SOURCES =  \
$(LIBGLSL_GENERATED_FILES)  \
-   $(LIBGLSL_FILES)\
-   $(LIBGLSL_SHADER_CACHE_FILES)
+   $(LIBGLSL_FILES)

 glsl_libstandalone_la_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 1e8edc0..04a44cf 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -140,9 +140,7 @@ LIBGLSL_FILES = \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
-   glsl/s_expression.h
-
-LIBGLSL_SHADER_CACHE_FILES = \
+   glsl/s_expression.h \
glsl/shader_cache.cpp \
glsl/shader_cache.h



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] compiler: simplify building glsl shader cache

2017-02-16 Thread Timothy Arceri
I think this made sense at one point when you could disable building
of the cache. Now it's always built so just merge it into
LIBGLSL_FILES.

This partially fixes scons builds.
---
 src/compiler/Makefile.glsl.am | 3 +--
 src/compiler/Makefile.sources | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index 41edb3c..f673196 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -131,8 +131,7 @@ glsl_libglsl_la_LIBADD = \
 
 glsl_libglsl_la_SOURCES =  \
$(LIBGLSL_GENERATED_FILES)  \
-   $(LIBGLSL_FILES)\
-   $(LIBGLSL_SHADER_CACHE_FILES)
+   $(LIBGLSL_FILES)
 
 glsl_libstandalone_la_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 1e8edc0..04a44cf 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -140,9 +140,7 @@ LIBGLSL_FILES = \
glsl/program.h \
glsl/propagate_invariance.cpp \
glsl/s_expression.cpp \
-   glsl/s_expression.h
-
-LIBGLSL_SHADER_CACHE_FILES = \
+   glsl/s_expression.h \
glsl/shader_cache.cpp \
glsl/shader_cache.h
 
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Add an OUT_BATCH64() macro.

2017-02-16 Thread Ben Widawsky

On 17-02-14 13:45:48, Kenneth Graunke wrote:

This is more convenient than OUT_BATCH'ing both halves.



It also is potentially more efficient (probably immeasurable).

Reviewed-by: Ben Widawsky 


Signed-off-by: Kenneth Graunke 
Cc: Ben Widawsky 
---
src/mesa/drivers/dri/i965/gen8_depth_state.c  | 3 +--
src/mesa/drivers/dri/i965/gen8_ds_state.c | 3 +--
src/mesa/drivers/dri/i965/gen8_gs_state.c | 3 +--
src/mesa/drivers/dri/i965/gen8_hs_state.c | 3 +--
src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +--
src/mesa/drivers/dri/i965/gen8_vs_state.c | 3 +--
src/mesa/drivers/dri/i965/intel_batchbuffer.h | 1 +
7 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index a7e61354fd5..c085246bc92 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -72,8 +72,7 @@ emit_depth_packets(struct brw_context *brw,
  OUT_RELOC64(depth_mt->bo,
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
   } else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
   }
   OUT_BATCH(((width - 1) << 4) | ((height - 1) << 18) | lod);
   OUT_BATCH(((depth - 1) << 21) | (min_array_element << 10) | mocs_wb);
diff --git a/src/mesa/drivers/dri/i965/gen8_ds_state.c 
b/src/mesa/drivers/dri/i965/gen8_ds_state.c
index ee2f82e1098..55738fd1ffc 100644
--- a/src/mesa/drivers/dri/i965/gen8_ds_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ds_state.c
@@ -56,8 +56,7 @@ gen8_upload_ds_state(struct brw_context *brw)
 I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
 ffs(stage_state->per_thread_scratch) - 11);
  } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
  }
  OUT_BATCH(SET_FIELD(prog_data->dispatch_grf_start_reg,
  GEN7_DS_DISPATCH_START_GRF) |
diff --git a/src/mesa/drivers/dri/i965/gen8_gs_state.c 
b/src/mesa/drivers/dri/i965/gen8_gs_state.c
index 2b74f1bd575..31c6f89bc13 100644
--- a/src/mesa/drivers/dri/i965/gen8_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_gs_state.c
@@ -63,8 +63,7 @@ gen8_upload_gs_state(struct brw_context *brw)
 I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
 ffs(stage_state->per_thread_scratch) - 11);
  } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
  }

  /* DW6 */
diff --git a/src/mesa/drivers/dri/i965/gen8_hs_state.c 
b/src/mesa/drivers/dri/i965/gen8_hs_state.c
index ee47e5e54a0..dbdd19b1f5c 100644
--- a/src/mesa/drivers/dri/i965/gen8_hs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_hs_state.c
@@ -57,8 +57,7 @@ gen8_upload_hs_state(struct brw_context *brw)
 I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
 ffs(stage_state->per_thread_scratch) - 11);
  } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
  }
  OUT_BATCH(GEN7_HS_INCLUDE_VERTEX_HANDLES |
SET_FIELD(prog_data->dispatch_grf_start_reg,
diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index 03468267ce6..9b1a78c6ee6 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -269,8 +269,7 @@ gen8_upload_ps_state(struct brw_context *brw,
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  ffs(stage_state->per_thread_scratch) - 11);
   } else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
   }
   OUT_BATCH(dw6);
   OUT_BATCH(dw7);
diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c 
b/src/mesa/drivers/dri/i965/gen8_vs_state.c
index 7b66da4b17c..a2b08fe92a0 100644
--- a/src/mesa/drivers/dri/i965/gen8_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c
@@ -62,8 +62,7 @@ upload_vs_state(struct brw_context *brw)
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  ffs(stage_state->per_thread_scratch) - 11);
   } else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
   }

   OUT_BATCH((prog_data->dispatch_grf_start_reg <<
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index bf7cadfc4d6..da8f7e561f4 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -161,6 +161,7 @@ intel_batchbuffer_advance(struct brw_context *brw)

#define OUT_BATCH(d) *__map++ = (d)
#define OUT_BATCH_F(f) OUT_BATCH(float_as_int((f)))
+#define OUT_BATCH64(d) *((uint64_t *) __map) = (d); __map += 2
#define OUT_RELOC(buf, read_domains, write_domain, delta) do {\
   uint32_t __offset = (__map - brw->batch.map) * 4;  \
--
2.11.1



--
Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@li

Re: [Mesa-dev] [PATCH 29/32] util/disk_cache: check cache exists before calling munmap()

2017-02-16 Thread Mark Janes
Timothy Arceri  writes:

> On 17/02/17 12:20, Mark Janes wrote:
>> This series breaks the scons build:
>>
>> src/compiler/glsl/linker.cpp:4641: undefined reference to
>> `shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'
>
> To me it looks like its been broken for almost a month already. I'm getting.
>
> ast_to_hir.cpp:263:36: error: ‘ir_unop_i642d’ was not declared in this scope

I get similar errors if I don't `git clean -xfd` first.

> authorDave Airlie 2016-06-09 00:01:00
> committer Ian Romanick  2017-01-20
> commit78cc44280e3faeded8eea7face614e13d28481f0
> tree  184345721e2f88812069fcf94801250b6a214b05
> parent85faf5082f06ed5828c6d97bb11dd2292ad0f86a
>
> glsl/ast: Add 64-bit integer support to conversion functions
>
>
>>
>>
>>
>> Timothy Arceri  writes:
>>
>>> ---
>>>  src/util/disk_cache.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
>>> index 10b9d81..8eccf72 100644
>>> --- a/src/util/disk_cache.c
>>> +++ b/src/util/disk_cache.c
>>> @@ -383,7 +383,8 @@ disk_cache_create(const char *gpu_name, const char 
>>> *timestamp)
>>>  void
>>>  disk_cache_destroy(struct disk_cache *cache)
>>>  {
>>> -   munmap(cache->index_mmap, cache->index_mmap_size);
>>> +   if (cache)
>>> +  munmap(cache->index_mmap, cache->index_mmap_size);
>>>
>>> ralloc_free(cache);
>>>  }
>>> --
>>> 2.9.3
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/32] util/disk_cache: check cache exists before calling munmap()

2017-02-16 Thread Timothy Arceri



On 17/02/17 12:20, Mark Janes wrote:

This series breaks the scons build:

src/compiler/glsl/linker.cpp:4641: undefined reference to
`shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'


To me it looks like its been broken for almost a month already. I'm getting.

ast_to_hir.cpp:263:36: error: ‘ir_unop_i642d’ was not declared in this scope


author  Dave Airlie   2016-06-09 00:01:00
committer   Ian Romanick2017-01-20
commit  78cc44280e3faeded8eea7face614e13d28481f0
tree184345721e2f88812069fcf94801250b6a214b05
parent  85faf5082f06ed5828c6d97bb11dd2292ad0f86a

glsl/ast: Add 64-bit integer support to conversion functions






Timothy Arceri  writes:


---
 src/util/disk_cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 10b9d81..8eccf72 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -383,7 +383,8 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp)
 void
 disk_cache_destroy(struct disk_cache *cache)
 {
-   munmap(cache->index_mmap, cache->index_mmap_size);
+   if (cache)
+  munmap(cache->index_mmap, cache->index_mmap_size);

ralloc_free(cache);
 }
--
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] android: avoid using libdrm with host modules

2017-02-16 Thread Chih-Wei Huang
2016-11-02 23:42 GMT+08:00 Emil Velikov :
>
> Skimming through the outstanding patches for yours [1] I've tagged
> some [2] as superseded since the functionality has already landed. Let
> me know the status of the rest when you've got the chance.

Sorry I forgot to reply.

> [1] https://patchwork.freedesktop.org/project/mesa/patches/?submitter=15395
>
> [2]
> https://patchwork.freedesktop.org/patch/52321/
> https://patchwork.freedesktop.org/patch/52323/
> https://patchwork.freedesktop.org/patch/52337/

Yes, they are superseded and unnecessary.

> https://patchwork.freedesktop.org/patch/61946/

Not sure about this now.
I guess it's also unnecessary.


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/32] util/disk_cache: check cache exists before calling munmap()

2017-02-16 Thread Mark Janes
This series breaks the scons build:

src/compiler/glsl/linker.cpp:4641: undefined reference to
`shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'



Timothy Arceri  writes:

> ---
>  src/util/disk_cache.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
> index 10b9d81..8eccf72 100644
> --- a/src/util/disk_cache.c
> +++ b/src/util/disk_cache.c
> @@ -383,7 +383,8 @@ disk_cache_create(const char *gpu_name, const char 
> *timestamp)
>  void
>  disk_cache_destroy(struct disk_cache *cache)
>  {
> -   munmap(cache->index_mmap, cache->index_mmap_size);
> +   if (cache)
> +  munmap(cache->index_mmap, cache->index_mmap_size);
>  
> ralloc_free(cache);
>  }
> -- 
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-16 Thread Jonathan Gray
On Thu, Feb 16, 2017 at 04:25:02PM +, Emil Velikov wrote:
> On 16 February 2017 at 14:23, Jonathan Gray  wrote:
> > On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
> >> Provides the ability to read the .note.gnu.build-id section of ELF
> >> binaries, which is inserted by the --build-id=... flag to ld.
> >>
> >> Reviewed-by: Emil Velikov 
> >
> > I don't have time to dig into details right now but this broke the Mesa
> > build on OpenBSD and likely other non-linux platforms:
> >
> > libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
> > -DPACKAGE_VERSION=\"17.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 17.1.0-devel\"" 
> > "-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"";
> >  -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
> > -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
> > -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 
> > -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
> > -DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 -DHAVE___BUILTIN_CLZLL=1 
> > -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 -DHAVE___BUILTIN_FFS=1 
> > -DHAVE___BUILTIN_FFSLL=1 -DHAVE___BUILTIN_POPCOUNT=1 
> > -DHAVE___BUILTIN_POPCOUNTLL=1 -DHAVE_FUNC_ATTRIBUTE_CONST=1 
> > -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 
> > -DHAVE_FUNC_ATTRIBUTE_MALLOC=1 -DHAVE_FUNC_ATTRIBUTE_PACKED=1 
> > -DHAVE_FUNC_ATTRIBUTE_PURE=1 -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 
> > -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 
> > -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 -DHAVE_FUNC_ATTRIBUTE_WEAK=1 
> > -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 -DHAVE_CLOCK_GETTIME=1 
> > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. -D__STDC_CONSTANT_MACROS 
> > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DDEBUG 
> > -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF 
> > -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR -DHAVE_POSIX_MEMALIGN 
> > -DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING 
> > -DENABLE_SHADER_CACHE -DHAVE_MINCORE -I../../include -I../../src 
> > -I../../src/mapi -I../../src/mesa -I../../src/gallium/include 
> > -I../../src/gallium/auxiliary -fvisibility=hidden -Werror=pointer-arith -g 
> > -O2 -Wall -std=gnu99 -Werror=implicit-function-declaration 
> > -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -MT 
> > libmesautil_la-build_id.lo -MD -MP -MF .deps/libmesautil_la-build_id.Tpo -c 
> > build_id.c  -fPIC -DPIC -o .libs/libmesautil_la-build_id.o
> > In file included from /usr/include/elf_abi.h:31,
> >  from /usr/include/link_elf.h:10,
> >  from /usr/include/link.h:39,
> >  from build_id.c:25:
> > /usr/include/sys/exec_elf.h:585: error: expected specifier-qualifier-list 
> > before 'uint32_t'
> > In file included from /usr/include/link.h:39,
> >  from build_id.c:25:
> > /usr/include/link_elf.h:22: error: expected specifier-qualifier-list before 
> > 'caddr_t'
> > /usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
> > '__attribute__' before 'int'
> > In file included from build_id.c:25:
> > /usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or 
> > '__attribute__' before 'struct'
> > /usr/include/link.h:65: error: expected specifier-qualifier-list before 
> > 'caddr_t'
> These look like issue in your platform code/headers. Perhaps some bad
> interaction with the bits that Mesa defines ?
> 
> Quick workaround is to check the function only when needed, roughly
> like this pseudo code:
> 
> if test $building_any_vulkan_driver = yes ;then
> require_dl...=yes
>
> fi
> 
> 
> if test $require_dl... = yes ; then
>AC_CHECK_FUNC([dl_iterate_phdr], [DEFINES="$DEFINES
> -DHAVE_DL_ITERATE_PHDR"], [AC_MSG_ERROR([required  not found])])
> fi
> 
> 
> Please give it a bash and send us a patch that works on your end.

Leaning towards something along the lines of the following.
With Nhdr struct definitions added to system exec_elf.h.

The need for sys/types.h here may go away shortly as well.

diff --git a/src/util/build_id.c b/src/util/build_id.c
index 2993a80cfe..92250a1f5f 100644
--- a/src/util/build_id.c
+++ b/src/util/build_id.c
@@ -22,12 +22,22 @@
  */
 
 #ifdef HAVE_DL_ITERATE_PHDR
+
+#include 
 #include 
 #include 
 #include 
 
 #include "build_id.h"
 
+#ifndef NT_GNU_BUILD_ID
+#define NT_GNU_BUILD_ID 3
+#endif
+
+#ifndef ElfW
+#define ElfW(type) Elf_##type
+#endif
+
 #define ALIGN(val, align)  (((val) + (align) - 1) & ~((align) - 1))
 
 struct build_id_note {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radv: Never try to create more than max_sets descriptor sets.

2017-02-16 Thread Dave Airlie
On 17 February 2017 at 06:26, Bas Nieuwenhuizen  
wrote:
> We only use the freed ones after all free space has been used. If
> the app only allocates small descriptor sets, we might go over
> max_sets before the memory is full.
>
> Signed-off-by: Bas Nieuwenhuizen 
> CC: 
> Fixes: f4e499ec79147f4172f3669ae9dafd941aaeeb65


Both look good to me,

Reviewed-by: Dave Airlie 
Thanks,
Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-16 Thread Timothy Arceri



On 17/02/17 10:44, Ian Romanick wrote:

On 02/15/2017 11:58 PM, Timothy Arceri wrote:



On 16/02/17 17:55, Tapani Pälli wrote:


On 02/16/2017 04:52 AM, Timothy Arceri wrote:

In order add functionality to ARB_get_program_binary we need
binary format enums.


I've understood that this is a driver internal enumeration. When
application gets the binary it also receives enum (integer value) what
format we gave. Then when loading application needs to query what
formats are supported by the implementation and load the correct binary.
We just need to internally make agreement on format list and return
correct one matching the current driver in use?


Not that it's actually likely to happen but if we were to only have a
single MESA enum an application could only distribute a single binary.


Applications really, really, *REALLY* should not distribute binaries
retrieved from the driver.  The intention of this extension is for
applications to implement their own shader cache, for example, at
application installation.  The driver can reject the binary at any time
for any reason.  Driver changes, hardware changes, OS changes, phase of
the moon, etc.

Looking at the GLES extension registry, it appears that the other
vendors have just a single binary for all the hardware they make.  Based
on that, having a single Mesa enum isn't an insane idea.  We would just
need to agree on the format of the header so that the driver receiving
the blob could determine which driver generated the blob.


The only other thing to consider with a single enum is that it will 
require a laptop with an Intel cpu and Nvidia gpu for example to 
recompile the binary if the user were to switch between using the Intel 
and Nvidia gpus. This might happen depending on if the laptop is plugged 
into a power source or not.


If we don't care about this than one enum is fine.




e.g either for AMD, INTEL or NVIDIA but not one for each. That is unless
we were to compile and pack all gpu vendor binarys at the same time
which seems overly complicated and expensive.

I could see an intenal id being used for gpu generations from hardware
vendors.


---

Techland games such as Dead Island and Dying Light make use of
GetProgramBinary(). My current guess is the Dead Island crash
https://bugs.freedesktop.org/show_bug.cgi?id=85564 is caused
due to buggy handling of this feature not being available.

Anyway I'm not sure how we go about getting Khronos to assign
enums for the binary formats but thought I'd send this to the
list for discussion.


There's a two step process:

1. Vendors request a block of values via the Khronos internal bugzilla.

2. When the spec is ready, another bug is submitted requesting the spec
be published.

Mesa might still have some available enums assigned to it.  I'll have to
check...


 docs/specs/MESA_program_binary.txt | 78
++
 1 file changed, 78 insertions(+)
 create mode 100644 docs/specs/MESA_program_binary.txt

diff --git a/docs/specs/MESA_program_binary.txt
b/docs/specs/MESA_program_binary.txt
new file mode 100644
index 000..b34e42e
--- /dev/null
+++ b/docs/specs/MESA_program_binary.txt
@@ -0,0 +1,78 @@
+Name
+
+MESA_program_binary
+
+Name Strings
+
+GL_MESA_program_binary
+
+Contact
+
+Timothy Arceri (tarceri 'at' itsqueeze.com)
+
+Status
+
+Complete.
+
+Version
+
+Last Modified Date: February 16, 2017
+Revision: #1
+
+Number
+
+???
+
+Dependencies
+
+OpenGL ES 2.0 is required.
+
+Written based on the wording of the OpenGL ES 2.0 specification.
+
+This extension interacts with OES_get_program_binary.
+
+Overview
+
+MESA provides drivers for multiple hardware vendors. This extension
+provides binary formats in order to avoid conflicts between
drivers when
+loading precompiled binaries.
+
+New Procedures and Functions
+
+None.
+
+New Tokens
+
+Accepted by the  parameter of ShaderBinary:
+
+MESA_PROGRAM_BINARY_AMD
+MESA_PROGRAM_BINARY_NV 
+MESA_PROGRAM_BINARY_INTEL  
+MESA_PROGRAM_BINARY_BCOM   
+MESA_PROGRAM_BINARY_QCOM   
+
+Additions to Chapter 2 of the OpenGL ES 2.0 Specification (OpenGL
Operation)
+
+Add the following paragraph to the end of section 2.10.2:
+
+"Depending on the hardware in use the apropriate  is
+returned when querying the list of SHADER_BINARY_FORMATS.
+
+Pre-compiled shader binaries in this format may be loaded via
ShaderBinary.
+
+When a binary fails to load, an INVALID_VALUE error is generated
and a
+more detailed error message is appended to the shader's info log."
+
+Errors
+
+INVALID_VALUE is generated if the  parameter to
ShaderBinary was
+produced with an incompatible version of the MESA shader compiler.
+
+New State
+
+None.
+
+Revision History
+
+#0102/16/2010Timothy Arceri   First draft.
+


__

[Mesa-dev] [AppVeyor] mesa master #3473 failed

2017-02-16 Thread AppVeyor



Build mesa 3473 failed


Commit a3ab09f90f by Timothy Arceri on 2/7/2017 1:10 AM:

util/disk_cache: check cache exists before calling munmap()\n\nReviewed-by: Nicolai Hähnle 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/mesa: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Roland Scheidegger
I've just checked, and we don't use it anywhere (even if we did, we
could easily replace it with min/max - we can't really execute it any
different than min/max with sw rasterization in any case).
So no objections...

Am 17.02.2017 um 00:37 schrieb Dave Airlie:
> On 17 February 2017 at 08:00, Marek Olšák  wrote:
>> From: Marek Olšák 
> 
> 1, the new 2, 3, 4 are
> 
> Reviewed-by: Dave Airlie 
> 
> 5 is reviewed-by me, but might want to wait for a vmware person to say
> if it causes unknown fallout (not that I think we should care too
> much).
> 
> Dave.
> 
>>
>> ---
>>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 14 --
>>  1 file changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c 
>> b/src/mesa/state_tracker/st_atifs_to_tgsi.c
>> index 9c4218e..64879f1 100644
>> --- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
>> +++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
>> @@ -612,21 +612,20 @@ st_init_atifs_prog(struct gl_context *ctx, struct 
>> gl_program *prog)
>> prog->arb.NumParameters = MAX_NUM_FRAGMENT_CONSTANTS_ATI + 2; /* 2 state 
>> variables for fog */
>>  }
>>
>>
>>  struct tgsi_atifs_transform {
>> struct tgsi_transform_context base;
>> struct tgsi_shader_info info;
>> const struct st_fp_variant_key *key;
>> bool first_instruction_emitted;
>> unsigned fog_factor_temp;
>> -   unsigned fog_clamp_imm;
>>  };
>>
>>  static inline struct tgsi_atifs_transform *
>>  tgsi_atifs_transform(struct tgsi_transform_context *tctx)
>>  {
>> return (struct tgsi_atifs_transform *)tctx;
>>  }
>>
>>  /* copied from st_cb_drawpixels_shader.c */
>>  static void
>> @@ -669,24 +668,20 @@ transform_instr(struct tgsi_transform_context *tctx,
>>
>> if (ctx->first_instruction_emitted)
>>goto transform_inst;
>>
>> ctx->first_instruction_emitted = true;
>>
>> if (ctx->key->fog) {
>>/* add a new temp for the fog factor */
>>ctx->fog_factor_temp = ctx->info.file_max[TGSI_FILE_TEMPORARY] + 1;
>>tgsi_transform_temp_decl(tctx, ctx->fog_factor_temp);
>> -
>> -  /* add immediates for clamp */
>> -  ctx->fog_clamp_imm = ctx->info.immediate_count;
>> -  tgsi_transform_immediate_decl(tctx, 1.0f, 0.0f, 0.0f, 0.0f);
>> }
>>
>>  transform_inst:
>> if (current_inst->Instruction.Opcode == TGSI_OPCODE_TEX) {
>>/* fix texture target */
>>unsigned newtarget = 
>> ctx->key->texture_targets[current_inst->Src[1].Register.Index];
>>if (newtarget)
>>   current_inst->Texture.Texture = newtarget;
>>
>> } else if (ctx->key->fog && current_inst->Instruction.Opcode == 
>> TGSI_OPCODE_MOV &&
>> @@ -783,31 +778,30 @@ transform_inst:
>>   inst.Instruction.Opcode = TGSI_OPCODE_EX2;
>>   inst.Instruction.NumDstRegs = 1;
>>   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>>   inst.Dst[0].Register.Index = ctx->fog_factor_temp;
>>   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
>>   inst.Instruction.NumSrcRegs = 1;
>>   SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, 
>> Z, W);
>>   inst.Src[0].Register.Negate ^= 1;
>>   tctx->emit_instruction(tctx, &inst);
>>}
>> -  /* f = CLAMP(f, 0.0, 1.0) */
>> +  /* f = saturate(f) */
>>inst = tgsi_default_full_instruction();
>> -  inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
>> +  inst.Instruction.Opcode = TGSI_OPCODE_MOV;
>>inst.Instruction.NumDstRegs = 1;
>> +  inst.Instruction.Saturate = 1;
>>inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>>inst.Dst[0].Register.Index = ctx->fog_factor_temp;
>>inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
>> -  inst.Instruction.NumSrcRegs = 3;
>> +  inst.Instruction.NumSrcRegs = 1;
>>SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, 
>> W);
>> -  SET_SRC(&inst, 1, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, Y, Y, Y, 
>> Y); // 0.0
>> -  SET_SRC(&inst, 2, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, X, X, X, 
>> X); // 1.0
>>tctx->emit_instruction(tctx, &inst);
>>
>>/* REG0 = LRP(f, REG0, fogcolor) */
>>inst = tgsi_default_full_instruction();
>>inst.Instruction.Opcode = TGSI_OPCODE_LRP;
>>inst.Instruction.NumDstRegs = 1;
>>inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>>inst.Dst[0].Register.Index = reg0_index;
>>inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
>>inst.Instruction.NumSrcRegs = 3;
>> --
>> 2.7.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list

Re: [Mesa-dev] [PATCH] mesa: Always expose GREMEDY_string_marker.

2017-02-16 Thread Ian Romanick
Have we contacted the developer?  This doesn't seem like a thing they
really want to do in a release build.

On 02/15/2017 04:59 PM, Kenneth Graunke wrote:
> Equivalent marker functionality is already included in KHR_debug, which
> we already expose unconditionally in all drivers (dummy_true).
> 
> Grim Fandango Remastered apparently calls glStringMarkerGREMEDY()
> without checking for the extension, spewing GL errors.  Assuming the
> existence of the extension is definitely not valid, but it also seems
> kinda mean to spew GL errors when we could simply expose the feature
> and silently ignore the provided string markers.
> 
> This patch enables GREMEDY_string_marker everywhere, and makes the calls
> no-ops if the driver doesn't provide the EmitStringMarker() hook, just
> like we did for the KHR_debug functionality.
> 
> This may impact freedreno, which actually puts markers in its command
> buffers.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/debug_output.c   | 4 +---
>  src/mesa/main/extensions_table.h   | 2 +-
>  src/mesa/state_tracker/st_debug.c  | 1 -
>  src/mesa/state_tracker/st_debug.h  | 3 +--
>  src/mesa/state_tracker/st_extensions.c | 4 
>  5 files changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
> index bc933db93d4..1d2dee128b4 100644
> --- a/src/mesa/main/debug_output.c
> +++ b/src/mesa/main/debug_output.c
> @@ -1308,12 +1308,10 @@ void GLAPIENTRY
>  _mesa_StringMarkerGREMEDY(GLsizei len, const GLvoid *string)
>  {
> GET_CURRENT_CONTEXT(ctx);
> -   if (ctx->Extensions.GREMEDY_string_marker) {
> +   if (ctx->Driver.EmitStringMarker) {
>/* if length not specified, string will be null terminated: */
>if (len <= 0)
>   len = strlen(string);
>ctx->Driver.EmitStringMarker(ctx, string, len);
> -   } else {
> -  _mesa_error(ctx, GL_INVALID_OPERATION, "StringMarkerGREMEDY");
> }
>  }
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 7ea56c8422d..ec48aadde3f 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -285,7 +285,7 @@ EXT(EXT_vertex_array, dummy_true
>  EXT(EXT_vertex_array_bgra   , EXT_vertex_array_bgra  
> , GLL, GLC,  x ,  x , 2008)
>  EXT(EXT_window_rectangles   , EXT_window_rectangles  
> , GLL, GLC,  x ,  30, 2016)
>  
> -EXT(GREMEDY_string_marker   , GREMEDY_string_marker  
> , GLL, GLC,  x ,  x , 2007)
> +EXT(GREMEDY_string_marker   , dummy_true 
> , GLL, GLC,  x ,  x , 2007)
>  
>  EXT(IBM_multimode_draw_arrays   , dummy_true 
> , GLL, GLC,  x ,  x , 1998)
>  EXT(IBM_rasterpos_clip  , dummy_true 
> , GLL,  x ,  x ,  x , 1996)
> diff --git a/src/mesa/state_tracker/st_debug.c 
> b/src/mesa/state_tracker/st_debug.c
> index d6cb5cd57d8..f2e982c8c7a 100644
> --- a/src/mesa/state_tracker/st_debug.c
> +++ b/src/mesa/state_tracker/st_debug.c
> @@ -58,7 +58,6 @@ static const struct debug_named_value st_debug_flags[] = {
> { "buffer",   DEBUG_BUFFER, NULL },
> { "wf",   DEBUG_WIREFRAME, NULL },
> { "precompile",  DEBUG_PRECOMPILE, NULL },
> -   { "gremedy",  DEBUG_GREMEDY, "Enable GREMEDY debug extensions" },
> { "noreadpixcache", DEBUG_NOREADPIXCACHE, NULL },
> DEBUG_NAMED_VALUE_END
>  };
> diff --git a/src/mesa/state_tracker/st_debug.h 
> b/src/mesa/state_tracker/st_debug.h
> index 6c1e915f68c..4b92a669a37 100644
> --- a/src/mesa/state_tracker/st_debug.h
> +++ b/src/mesa/state_tracker/st_debug.h
> @@ -50,8 +50,7 @@ st_print_current(void);
>  #define DEBUG_BUFFER0x200
>  #define DEBUG_WIREFRAME 0x400
>  #define DEBUG_PRECOMPILE   0x800
> -#define DEBUG_GREMEDY   0x1000
> -#define DEBUG_NOREADPIXCACHE 0x2000
> +#define DEBUG_NOREADPIXCACHE 0x1000
>  
>  #ifdef DEBUG
>  extern int ST_DEBUG;
> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index 37fe4469c37..d9057c77657 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -1167,10 +1167,6 @@ void st_init_extensions(struct pipe_screen *screen,
>extensions->ARB_vertex_attrib_64bit = GL_TRUE;
> }
>  
> -   if ((ST_DEBUG & DEBUG_GREMEDY) &&
> -   screen->get_param(screen, PIPE_CAP_STRING_MARKER))
> -  extensions->GREMEDY_string_marker = GL_TRUE;
> -
> if (screen->get_param(screen, PIPE_CAP_COMPUTE)) {
>int compute_supported_irs =
>   screen->get_shader_param(screen, PIPE_SHADER_COMPUTE,
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] gallium: remove TGSI_OPCODE_CLAMP

2017-02-16 Thread Roland Scheidegger
There's some gallium docs which need to go too.
I suppose it's of not much use indeed if glsl doesn't use it (glsl has
clamp but I guess the compiler does away with it).

Roland


Am 16.02.2017 um 23:00 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> Not used and not widely supported. Use MIN+MAX instead.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c   | 16 
> 
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c  |  8 
>  src/gallium/auxiliary/nir/tgsi_to_nir.c  | 11 ---
>  src/gallium/auxiliary/tgsi/tgsi_exec.c   | 16 
> 
>  src/gallium/auxiliary/tgsi/tgsi_info.c   |  2 +-
>  src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |  1 -
>  src/gallium/auxiliary/tgsi/tgsi_util.c   |  1 -
>  .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp| 10 --
>  src/gallium/drivers/r300/r300_tgsi_to_rc.c   |  1 -
>  src/gallium/drivers/r600/r600_shader.c   |  6 +++---
>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c|  3 ---
>  src/gallium/drivers/svga/svga_tgsi_insn.c|  1 -
>  src/gallium/include/pipe/p_shader_tokens.h   |  2 +-
>  13 files changed, 5 insertions(+), 73 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> index e78cdb0..dc6568a 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
> @@ -103,35 +103,20 @@ static void
>  arr_emit(
> const struct lp_build_tgsi_action * action,
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data)
>  {
> LLVMValueRef tmp = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_ROUND, 
> emit_data->args[0]);
> emit_data->output[emit_data->chan] = 
> LLVMBuildFPToSI(bld_base->base.gallivm->builder, tmp,
>   
> bld_base->uint_bld.vec_type, "");
>  }
>  
> -/* TGSI_OPCODE_CLAMP */
> -static void
> -clamp_emit(
> -   const struct lp_build_tgsi_action * action,
> -   struct lp_build_tgsi_context * bld_base,
> -   struct lp_build_emit_data * emit_data)
> -{
> -   LLVMValueRef tmp;
> -   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_MAX,
> -   emit_data->args[0],
> -   emit_data->args[1]);
> -   emit_data->output[emit_data->chan] = lp_build_emit_llvm_binary(bld_base,
> -   TGSI_OPCODE_MIN, tmp, 
> emit_data->args[2]);
> -}
> -
>  /* DP* Helper */
>  
>  static void
>  dp_fetch_args(
> struct lp_build_tgsi_context * bld_base,
> struct lp_build_emit_data * emit_data,
> unsigned dp_components)
>  {
> unsigned chan, src;
> for (src = 0; src < 2; src++) {
> @@ -1323,21 +1308,20 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
> bld_base)
> bld_base->op_actions[TGSI_OPCODE_IF].fetch_args = scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_UIF].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_KILL_IF].fetch_args = kil_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_KILL].fetch_args = kilp_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_RCP].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_SIN].fetch_args = 
> scalar_unary_fetch_args;
> bld_base->op_actions[TGSI_OPCODE_LG2].fetch_args = 
> scalar_unary_fetch_args;
>  
> bld_base->op_actions[TGSI_OPCODE_ADD].emit = add_emit;
> bld_base->op_actions[TGSI_OPCODE_ARR].emit = arr_emit;
> -   bld_base->op_actions[TGSI_OPCODE_CLAMP].emit = clamp_emit;
> bld_base->op_actions[TGSI_OPCODE_END].emit = end_emit;
> bld_base->op_actions[TGSI_OPCODE_FRC].emit = frc_emit;
> bld_base->op_actions[TGSI_OPCODE_LRP].emit = lrp_emit;
> bld_base->op_actions[TGSI_OPCODE_MAD].emit = mad_emit;
> bld_base->op_actions[TGSI_OPCODE_MOV].emit = mov_emit;
> bld_base->op_actions[TGSI_OPCODE_MUL].emit = mul_emit;
> bld_base->op_actions[TGSI_OPCODE_DIV].emit = fdiv_emit;
> bld_base->op_actions[TGSI_OPCODE_RCP].emit = rcp_emit;
>  
> bld_base->op_actions[TGSI_OPCODE_UARL].emit = mov_emit;
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
> index 6c177b0..2bd4291 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
> @@ -602,28 +602,20 @@ lp_emit_instruction_aos(
>  
> case TGSI_OPCODE_DP2A:
>return FALSE;
>  
> case TGSI_OPCODE_FRC:
>src0 = lp_build_emit_fetch(&bld->bld_base, inst, 0, LP_CHAN_ALL);
>tmp0 = lp_build_floor(&bld->bld_base.base, src0);
>dst0 = lp_build_sub(&bld->bld_base.base, src0, tmp0);
>break;
>  
> -   case TGSI_OPCODE_CLAMP:
> -  src0 = lp_build_

Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-16 Thread Ian Romanick
On 02/15/2017 11:58 PM, Timothy Arceri wrote:
> 
> 
> On 16/02/17 17:55, Tapani Pälli wrote:
>>
>> On 02/16/2017 04:52 AM, Timothy Arceri wrote:
>>> In order add functionality to ARB_get_program_binary we need
>>> binary format enums.
>>
>> I've understood that this is a driver internal enumeration. When
>> application gets the binary it also receives enum (integer value) what
>> format we gave. Then when loading application needs to query what
>> formats are supported by the implementation and load the correct binary.
>> We just need to internally make agreement on format list and return
>> correct one matching the current driver in use?
> 
> Not that it's actually likely to happen but if we were to only have a
> single MESA enum an application could only distribute a single binary.

Applications really, really, *REALLY* should not distribute binaries
retrieved from the driver.  The intention of this extension is for
applications to implement their own shader cache, for example, at
application installation.  The driver can reject the binary at any time
for any reason.  Driver changes, hardware changes, OS changes, phase of
the moon, etc.

Looking at the GLES extension registry, it appears that the other
vendors have just a single binary for all the hardware they make.  Based
on that, having a single Mesa enum isn't an insane idea.  We would just
need to agree on the format of the header so that the driver receiving
the blob could determine which driver generated the blob.

> e.g either for AMD, INTEL or NVIDIA but not one for each. That is unless
> we were to compile and pack all gpu vendor binarys at the same time
> which seems overly complicated and expensive.
> 
> I could see an intenal id being used for gpu generations from hardware
> vendors.
> 
>>> ---
>>>
>>> Techland games such as Dead Island and Dying Light make use of
>>> GetProgramBinary(). My current guess is the Dead Island crash
>>> https://bugs.freedesktop.org/show_bug.cgi?id=85564 is caused
>>> due to buggy handling of this feature not being available.
>>>
>>> Anyway I'm not sure how we go about getting Khronos to assign
>>> enums for the binary formats but thought I'd send this to the
>>> list for discussion.

There's a two step process:

1. Vendors request a block of values via the Khronos internal bugzilla.

2. When the spec is ready, another bug is submitted requesting the spec
be published.

Mesa might still have some available enums assigned to it.  I'll have to
check...

>>>  docs/specs/MESA_program_binary.txt | 78
>>> ++
>>>  1 file changed, 78 insertions(+)
>>>  create mode 100644 docs/specs/MESA_program_binary.txt
>>>
>>> diff --git a/docs/specs/MESA_program_binary.txt
>>> b/docs/specs/MESA_program_binary.txt
>>> new file mode 100644
>>> index 000..b34e42e
>>> --- /dev/null
>>> +++ b/docs/specs/MESA_program_binary.txt
>>> @@ -0,0 +1,78 @@
>>> +Name
>>> +
>>> +MESA_program_binary
>>> +
>>> +Name Strings
>>> +
>>> +GL_MESA_program_binary
>>> +
>>> +Contact
>>> +
>>> +Timothy Arceri (tarceri 'at' itsqueeze.com)
>>> +
>>> +Status
>>> +
>>> +Complete.
>>> +
>>> +Version
>>> +
>>> +Last Modified Date: February 16, 2017
>>> +Revision: #1
>>> +
>>> +Number
>>> +
>>> +???
>>> +
>>> +Dependencies
>>> +
>>> +OpenGL ES 2.0 is required.
>>> +
>>> +Written based on the wording of the OpenGL ES 2.0 specification.
>>> +
>>> +This extension interacts with OES_get_program_binary.
>>> +
>>> +Overview
>>> +
>>> +MESA provides drivers for multiple hardware vendors. This extension
>>> +provides binary formats in order to avoid conflicts between
>>> drivers when
>>> +loading precompiled binaries.
>>> +
>>> +New Procedures and Functions
>>> +
>>> +None.
>>> +
>>> +New Tokens
>>> +
>>> +Accepted by the  parameter of ShaderBinary:
>>> +
>>> +MESA_PROGRAM_BINARY_AMD
>>> +MESA_PROGRAM_BINARY_NV 
>>> +MESA_PROGRAM_BINARY_INTEL  
>>> +MESA_PROGRAM_BINARY_BCOM   
>>> +MESA_PROGRAM_BINARY_QCOM   
>>> +
>>> +Additions to Chapter 2 of the OpenGL ES 2.0 Specification (OpenGL
>>> Operation)
>>> +
>>> +Add the following paragraph to the end of section 2.10.2:
>>> +
>>> +"Depending on the hardware in use the apropriate  is
>>> +returned when querying the list of SHADER_BINARY_FORMATS.
>>> +
>>> +Pre-compiled shader binaries in this format may be loaded via
>>> ShaderBinary.
>>> +
>>> +When a binary fails to load, an INVALID_VALUE error is generated
>>> and a
>>> +more detailed error message is appended to the shader's info log."
>>> +
>>> +Errors
>>> +
>>> +INVALID_VALUE is generated if the  parameter to
>>> ShaderBinary was
>>> +produced with an incompatible version of the MESA shader compiler.
>>> +
>>> +New State
>>> +
>>> +None.
>>> +
>>> +Revision History
>>> +
>>> +#0102/16/

[Mesa-dev] [Bug 99305] account creation request

2017-02-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99305

--- Comment #4 from George Kyriazis  ---
I don't know who needs to do this.

Suggested next steps?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] st/mesa: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Dave Airlie
On 17 February 2017 at 08:00, Marek Olšák  wrote:
> From: Marek Olšák 

1, the new 2, 3, 4 are

Reviewed-by: Dave Airlie 

5 is reviewed-by me, but might want to wait for a vmware person to say
if it causes unknown fallout (not that I think we should care too
much).

Dave.

>
> ---
>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c 
> b/src/mesa/state_tracker/st_atifs_to_tgsi.c
> index 9c4218e..64879f1 100644
> --- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
> +++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
> @@ -612,21 +612,20 @@ st_init_atifs_prog(struct gl_context *ctx, struct 
> gl_program *prog)
> prog->arb.NumParameters = MAX_NUM_FRAGMENT_CONSTANTS_ATI + 2; /* 2 state 
> variables for fog */
>  }
>
>
>  struct tgsi_atifs_transform {
> struct tgsi_transform_context base;
> struct tgsi_shader_info info;
> const struct st_fp_variant_key *key;
> bool first_instruction_emitted;
> unsigned fog_factor_temp;
> -   unsigned fog_clamp_imm;
>  };
>
>  static inline struct tgsi_atifs_transform *
>  tgsi_atifs_transform(struct tgsi_transform_context *tctx)
>  {
> return (struct tgsi_atifs_transform *)tctx;
>  }
>
>  /* copied from st_cb_drawpixels_shader.c */
>  static void
> @@ -669,24 +668,20 @@ transform_instr(struct tgsi_transform_context *tctx,
>
> if (ctx->first_instruction_emitted)
>goto transform_inst;
>
> ctx->first_instruction_emitted = true;
>
> if (ctx->key->fog) {
>/* add a new temp for the fog factor */
>ctx->fog_factor_temp = ctx->info.file_max[TGSI_FILE_TEMPORARY] + 1;
>tgsi_transform_temp_decl(tctx, ctx->fog_factor_temp);
> -
> -  /* add immediates for clamp */
> -  ctx->fog_clamp_imm = ctx->info.immediate_count;
> -  tgsi_transform_immediate_decl(tctx, 1.0f, 0.0f, 0.0f, 0.0f);
> }
>
>  transform_inst:
> if (current_inst->Instruction.Opcode == TGSI_OPCODE_TEX) {
>/* fix texture target */
>unsigned newtarget = 
> ctx->key->texture_targets[current_inst->Src[1].Register.Index];
>if (newtarget)
>   current_inst->Texture.Texture = newtarget;
>
> } else if (ctx->key->fog && current_inst->Instruction.Opcode == 
> TGSI_OPCODE_MOV &&
> @@ -783,31 +778,30 @@ transform_inst:
>   inst.Instruction.Opcode = TGSI_OPCODE_EX2;
>   inst.Instruction.NumDstRegs = 1;
>   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>   inst.Dst[0].Register.Index = ctx->fog_factor_temp;
>   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
>   inst.Instruction.NumSrcRegs = 1;
>   SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, 
> Z, W);
>   inst.Src[0].Register.Negate ^= 1;
>   tctx->emit_instruction(tctx, &inst);
>}
> -  /* f = CLAMP(f, 0.0, 1.0) */
> +  /* f = saturate(f) */
>inst = tgsi_default_full_instruction();
> -  inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
> +  inst.Instruction.Opcode = TGSI_OPCODE_MOV;
>inst.Instruction.NumDstRegs = 1;
> +  inst.Instruction.Saturate = 1;
>inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>inst.Dst[0].Register.Index = ctx->fog_factor_temp;
>inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
> -  inst.Instruction.NumSrcRegs = 3;
> +  inst.Instruction.NumSrcRegs = 1;
>SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, 
> W);
> -  SET_SRC(&inst, 1, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, Y, Y, Y, 
> Y); // 0.0
> -  SET_SRC(&inst, 2, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, X, X, X, 
> X); // 1.0
>tctx->emit_instruction(tctx, &inst);
>
>/* REG0 = LRP(f, REG0, fogcolor) */
>inst = tgsi_default_full_instruction();
>inst.Instruction.Opcode = TGSI_OPCODE_LRP;
>inst.Instruction.NumDstRegs = 1;
>inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
>inst.Dst[0].Register.Index = reg0_index;
>inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
>inst.Instruction.NumSrcRegs = 3;
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] tgsi/lowering: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

v2: do it correctly
---
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c 
b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
index bf6cbb3..c26c13b 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_lowering.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -565,30 +565,39 @@ transform_lit(struct tgsi_transform_context *tctx,
   /* MAX tmpA.xy, src.xy, imm{0.0} */
   new_inst = tgsi_default_full_instruction();
   new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_XY);
   new_inst.Instruction.NumSrcRegs = 2;
   reg_src(&new_inst.Src[0], src, SWIZ(X, Y, _, _));
   reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(X, X, _, _));
   tctx->emit_instruction(tctx, &new_inst);
 
-  /* CLAMP tmpA.z, src.w, -imm{128.0}, imm{128.0} */
+  /* MIN tmpA.z, src.w, imm{128.0} */
   new_inst = tgsi_default_full_instruction();
-  new_inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
+  new_inst.Instruction.Opcode = TGSI_OPCODE_MIN;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
-  new_inst.Instruction.NumSrcRegs = 3;
+  new_inst.Instruction.NumSrcRegs = 2;
   reg_src(&new_inst.Src[0], src, SWIZ(_, _, W, _));
   reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
+  tctx->emit_instruction(tctx, &new_inst);
+
+  /* MAX tmpA.z, tmpA.z, -imm{128.0} */
+  new_inst = tgsi_default_full_instruction();
+  new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
+  new_inst.Instruction.NumDstRegs = 1;
+  reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
+  new_inst.Instruction.NumSrcRegs = 2;
+  reg_src(&new_inst.Src[0], &ctx->tmp[A].src, SWIZ(_, _, Z, _));
+  reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
   new_inst.Src[1].Register.Negate = true;
-  reg_src(&new_inst.Src[2], &ctx->imm, SWIZ(_, _, Z, _));
   tctx->emit_instruction(tctx, &new_inst);
 
   /* LG2 tmpA.y, tmpA.y */
   new_inst = tgsi_default_full_instruction();
   new_inst.Instruction.Opcode = TGSI_OPCODE_LG2;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Y);
   new_inst.Instruction.NumSrcRegs = 1;
   reg_src(&new_inst.Src[0], &ctx->tmp[A].src, SWIZ(Y, _, _, _));
   tctx->emit_instruction(tctx, &new_inst);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [AppVeyor] mesa master #3472 completed

2017-02-16 Thread AppVeyor


Build mesa 3472 completed



Commit b0232d98e9 by Dave Airlie on 2/16/2017 3:54 AM:

radeonsi: use shared emit_umsb helper.\n\nReviewed-by: Edward O'Callaghan \nReviewed-by: Marek Olšák \nSigned-off-by: Dave Airlie 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] tgsi/lowering: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Dave Airlie
On 17 February 2017 at 08:00, Marek Olšák  wrote:
> From: Marek Olšák 

This doesn't look right to me, shouldn't it be

MIN tmpA.z, src.w, imm{128.0}
MAX tmpA.z tmpA.z. -imm{128.0}

?

Dave.

>
> ---
>  src/gallium/auxiliary/tgsi/tgsi_lowering.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c 
> b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
> index bf6cbb3..dbe5a71 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_lowering.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
> @@ -565,30 +565,39 @@ transform_lit(struct tgsi_transform_context *tctx,
>/* MAX tmpA.xy, src.xy, imm{0.0} */
>new_inst = tgsi_default_full_instruction();
>new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
>new_inst.Instruction.NumDstRegs = 1;
>reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_XY);
>new_inst.Instruction.NumSrcRegs = 2;
>reg_src(&new_inst.Src[0], src, SWIZ(X, Y, _, _));
>reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(X, X, _, _));
>tctx->emit_instruction(tctx, &new_inst);
>
> -  /* CLAMP tmpA.z, src.w, -imm{128.0}, imm{128.0} */
> +  /* MIN tmpA.z, src.w, imm{128.0} */
>new_inst = tgsi_default_full_instruction();
> -  new_inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
> +  new_inst.Instruction.Opcode = TGSI_OPCODE_MIN;
>new_inst.Instruction.NumDstRegs = 1;
>reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
> -  new_inst.Instruction.NumSrcRegs = 3;
> +  new_inst.Instruction.NumSrcRegs = 2;
> +  reg_src(&new_inst.Src[0], src, SWIZ(_, _, W, _));
> +  reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
> +  tctx->emit_instruction(tctx, &new_inst);
> +
> +  /* MAX tmpA.z, src.w, -imm{128.0} */
> +  new_inst = tgsi_default_full_instruction();
> +  new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
> +  new_inst.Instruction.NumDstRegs = 1;
> +  reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
> +  new_inst.Instruction.NumSrcRegs = 2;
>reg_src(&new_inst.Src[0], src, SWIZ(_, _, W, _));
>reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
>new_inst.Src[1].Register.Negate = true;
> -  reg_src(&new_inst.Src[2], &ctx->imm, SWIZ(_, _, Z, _));
>tctx->emit_instruction(tctx, &new_inst);
>
>/* LG2 tmpA.y, tmpA.y */
>new_inst = tgsi_default_full_instruction();
>new_inst.Instruction.Opcode = TGSI_OPCODE_LG2;
>new_inst.Instruction.NumDstRegs = 1;
>reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Y);
>new_inst.Instruction.NumSrcRegs = 1;
>reg_src(&new_inst.Src[0], &ctx->tmp[A].src, SWIZ(Y, _, _, _));
>tctx->emit_instruction(tctx, &new_inst);
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] V2 GLSL IR & TGSI on-disk shader cache

2017-02-16 Thread Timothy Arceri



On 17/02/17 01:27, Nicolai Hähnle wrote:

Hi Timothy,

thank you for the update. I had a look at all the patches now, and
especially the glsl parts looks basically ready to go. There are only
minor comments for which I don't need a full resend of the series, and
an open question on patch 22 where it would be nice to get a proper answer.


Thanks! It's a relief to finally get this stuff reviewed. I'm not really 
sure about an authoritative source for patch 22, although I'm happy to 
add the group check, I think it makes some sense.





On 14.02.2017 01:52, Timothy Arceri wrote:

Changes in V2:

- no longer mess around storing/restoring any pointers
- implemented support for compute shaders
- dropped some patches only needed by i965 for now
- add fallback support for shader source that is changed after its
compiledi (piglit test on the list)
- simplify cache enable for r600/radeonsi by unconditionally creating
the cache in screen_create.


Remind me how each part of the cache can be disabled?


We can't really enable GLSL IR cache by itself (I guess we could enable 
tgis but that wouldn't make much sense). The code simply checks id 
ctx->Cache != NULL in various locations which means cache is enabled.




Thanks,
Nicolai



- make glsl version (the version reported as supported by the
implemenation at
  compile time) part of the sha1 input rather than adding mesa string
to the cache object itself.
  This avoids fallbacks and should be more reliable.
- add any drirc options as sha1 inputs
- some other tidy ups suggested by Nicolai and Marek

In future we probably want to check what other env vars have been set,
but for now the gl/glsl version and drirc options should cover most
things.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/18] radeonsi: upload constants into VRAM instead of GTT

2017-02-16 Thread Marek Olšák
On Thu, Feb 16, 2017 at 4:21 PM, Nicolai Hähnle  wrote:
> On 16.02.2017 13:53, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> This lowers lgkm wait cycles by 30% on VI and normal conditions.
>> The might be a measurable improvement when CE is disabled (radeon)
>> or under L2 thrashing.
>
>
> Good idea. I'm just wondering if all the users of const upload end up as
> streaming writes? I hope we don't accidentally hit some place where writes
> from the CPU end up extremely slow, e.g. where st/mesa uploads some
> structures.

I think constant buffers always benefit from being in VRAM. If every
CU loads a value from a constant buffer, you'll get at least 16 TC L2
read requests on Fiji (each group of 4 CUs submits one), which can be
misses under thrashing. This is very different from "streaming" where
you expect to get exactly 1 read request for each piece of data.

The small problem with VRAM uploads may be write combining. I don't
know the alignment at which it operates and how exactly it works. E.g.
if we get 2 16-byte uploads aligned to 32, there is an untouched hole
of 16 bytes. Does the hole have any effect on upload performance?
u_upload_mgr could fill all holes if it was a problem.

Also, Feral's games upload directly to VRAM all the time. This patch
is nothing compared to what they're doing.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/18] radeonsi: use a clever alignment for descriptor uploads

2017-02-16 Thread Marek Olšák
On Thu, Feb 16, 2017 at 4:17 PM, Nicolai Hähnle  wrote:
> On 16.02.2017 13:53, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> Non-VBO descriptors won't be smaller than the cache line, so simply use
>> the cache line size.
>
>
> What about SSBOs? Those are just 16 bytes.
>
> Also, shader images are just 32 bytes (though we may have to bump this to 64

We always upload the whole list for non-VBO descriptors, which is
num_slot * slot_size. That's a lot more than a cache line. We could
certainly optimize this for both CE and non-CE paths. The CE path
evicts more cache lines needlessly, while the non-CE path has to
upload more data.

Since only the necessary number of VBO descriptors is uploaded, we can
hang the hardware if the vertex shader is using more inputs than the
vertex element state, which luckily can't happen with st/mesa.

> bytes for multisample image support -- except that it's unclear how to write
> to a multisample shader image while keeping the FMASK).

I wouldn't like to support MSAA image stores.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/18] radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start

2017-02-16 Thread Marek Olšák
On Thu, Feb 16, 2017 at 4:10 PM, Nicolai Hähnle  wrote:
> On 16.02.2017 13:53, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> start can only be non-zero with MultiDrawElements, which is unlikely
>> to occur with UNSIGNED_BYTE indices.
>
>
> Do we have a test case for this?

Sadly we don't.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] st/mesa: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/state_tracker/st_atifs_to_tgsi.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c 
b/src/mesa/state_tracker/st_atifs_to_tgsi.c
index 9c4218e..64879f1 100644
--- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
+++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
@@ -612,21 +612,20 @@ st_init_atifs_prog(struct gl_context *ctx, struct 
gl_program *prog)
prog->arb.NumParameters = MAX_NUM_FRAGMENT_CONSTANTS_ATI + 2; /* 2 state 
variables for fog */
 }
 
 
 struct tgsi_atifs_transform {
struct tgsi_transform_context base;
struct tgsi_shader_info info;
const struct st_fp_variant_key *key;
bool first_instruction_emitted;
unsigned fog_factor_temp;
-   unsigned fog_clamp_imm;
 };
 
 static inline struct tgsi_atifs_transform *
 tgsi_atifs_transform(struct tgsi_transform_context *tctx)
 {
return (struct tgsi_atifs_transform *)tctx;
 }
 
 /* copied from st_cb_drawpixels_shader.c */
 static void
@@ -669,24 +668,20 @@ transform_instr(struct tgsi_transform_context *tctx,
 
if (ctx->first_instruction_emitted)
   goto transform_inst;
 
ctx->first_instruction_emitted = true;
 
if (ctx->key->fog) {
   /* add a new temp for the fog factor */
   ctx->fog_factor_temp = ctx->info.file_max[TGSI_FILE_TEMPORARY] + 1;
   tgsi_transform_temp_decl(tctx, ctx->fog_factor_temp);
-
-  /* add immediates for clamp */
-  ctx->fog_clamp_imm = ctx->info.immediate_count;
-  tgsi_transform_immediate_decl(tctx, 1.0f, 0.0f, 0.0f, 0.0f);
}
 
 transform_inst:
if (current_inst->Instruction.Opcode == TGSI_OPCODE_TEX) {
   /* fix texture target */
   unsigned newtarget = 
ctx->key->texture_targets[current_inst->Src[1].Register.Index];
   if (newtarget)
  current_inst->Texture.Texture = newtarget;
 
} else if (ctx->key->fog && current_inst->Instruction.Opcode == 
TGSI_OPCODE_MOV &&
@@ -783,31 +778,30 @@ transform_inst:
  inst.Instruction.Opcode = TGSI_OPCODE_EX2;
  inst.Instruction.NumDstRegs = 1;
  inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
  inst.Dst[0].Register.Index = ctx->fog_factor_temp;
  inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
  inst.Instruction.NumSrcRegs = 1;
  SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, 
W);
  inst.Src[0].Register.Negate ^= 1;
  tctx->emit_instruction(tctx, &inst);
   }
-  /* f = CLAMP(f, 0.0, 1.0) */
+  /* f = saturate(f) */
   inst = tgsi_default_full_instruction();
-  inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
+  inst.Instruction.Opcode = TGSI_OPCODE_MOV;
   inst.Instruction.NumDstRegs = 1;
+  inst.Instruction.Saturate = 1;
   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
   inst.Dst[0].Register.Index = ctx->fog_factor_temp;
   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
-  inst.Instruction.NumSrcRegs = 3;
+  inst.Instruction.NumSrcRegs = 1;
   SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, W);
-  SET_SRC(&inst, 1, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, Y, Y, Y, Y); 
// 0.0
-  SET_SRC(&inst, 2, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, X, X, X, X); 
// 1.0
   tctx->emit_instruction(tctx, &inst);
 
   /* REG0 = LRP(f, REG0, fogcolor) */
   inst = tgsi_default_full_instruction();
   inst.Instruction.Opcode = TGSI_OPCODE_LRP;
   inst.Instruction.NumDstRegs = 1;
   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
   inst.Dst[0].Register.Index = reg0_index;
   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
   inst.Instruction.NumSrcRegs = 3;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/common

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/amd/common/ac_llvm_build.c  | 14 ++
 src/amd/common/ac_llvm_build.h  |  1 +
 src/gallium/drivers/radeonsi/si_shader.c| 10 +-
 src/gallium/drivers/radeonsi/si_shader_internal.h   |  3 ---
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 17 +
 5 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 20216a7..7e8552b 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -756,10 +756,24 @@ ac_emit_sendmsg(struct ac_llvm_context *ctx,
uint32_t msg,
LLVMValueRef wave_id)
 {
LLVMValueRef args[2];
const char *intr_name = (HAVE_LLVM < 0x0400) ? "llvm.SI.sendmsg" : 
"llvm.amdgcn.s.sendmsg";
args[0] = LLVMConstInt(ctx->i32, msg, false);
args[1] = wave_id;
ac_emit_llvm_intrinsic(ctx, intr_name, ctx->voidt,
   args, 2, 0);
 }
+
+LLVMValueRef ac_emit_clamp(struct ac_llvm_context *ctx, LLVMValueRef value)
+{
+   const char *intr = HAVE_LLVM >= 0x0308 ? "llvm.AMDGPU.clamp." :
+"llvm.AMDIL.clamp.";
+   LLVMValueRef args[3] = {
+   value,
+   LLVMConstReal(ctx->f32, 0),
+   LLVMConstReal(ctx->f32, 1),
+   };
+
+   return ac_emit_llvm_intrinsic(ctx, intr, ctx->f32, args, 3,
+ AC_FUNC_ATTR_READNONE);
+}
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index e88874a..d24c931 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -174,16 +174,17 @@ ac_emit_ddxy(struct ac_llvm_context *ctx,
 #define AC_SENDMSG_GS_DONE 3
 
 #define AC_SENDMSG_GS_OP_NOP  (0 << 4)
 #define AC_SENDMSG_GS_OP_CUT  (1 << 4)
 #define AC_SENDMSG_GS_OP_EMIT (2 << 4)
 #define AC_SENDMSG_GS_OP_EMIT_CUT (3 << 4)
 
 void ac_emit_sendmsg(struct ac_llvm_context *ctx,
 uint32_t msg,
 LLVMValueRef wave_id);
+LLVMValueRef ac_emit_clamp(struct ac_llvm_context *ctx, LLVMValueRef value);
 
 #ifdef __cplusplus
 }
 #endif
 
 #endif
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index d3e3984..a67ac82 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1011,21 +1011,21 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
lp_build_const_int32(gallivm, SI_HS_RING_TESS_OFFCHIP));
 
base = LLVMGetParam(ctx->main_fn, ctx->param_oc_lds);
buf_addr = get_tcs_tes_buffer_address_from_reg(ctx, reg, NULL);
 
 
TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
LLVMValueRef value = dst[chan_index];
 
if (inst->Instruction.Saturate)
-   value = si_llvm_saturate(bld_base, value);
+   value = ac_emit_clamp(&ctx->ac, value);
 
lds_store(bld_base, chan_index, dw_addr, value);
 
value = LLVMBuildBitCast(gallivm->builder, value, ctx->i32, "");
values[chan_index] = value;
 
if (inst->Dst[0].Register.WriteMask != 0xF) {
ac_build_tbuffer_store_dwords(&ctx->ac, buffer, value, 
1,
  buf_addr, base,
  4 * chan_index);
@@ -1803,21 +1803,21 @@ static void si_llvm_init_export_args(struct 
lp_build_tgsi_context *bld_base,
ctx->i32, pack_args, 2,
LP_FUNC_ATTR_READNONE);
args[chan + 5] =
LLVMBuildBitCast(base->gallivm->builder,
 packed, ctx->f32, "");
}
break;
 
case V_028714_SPI_SHADER_UNORM16_ABGR:
for (chan = 0; chan < 4; chan++) {
-   val[chan] = si_llvm_saturate(bld_base, values[chan]);
+   val[chan] = ac_emit_clamp(&ctx->ac, values[chan]);
val[chan] = LLVMBuildFMul(builder, val[chan],
  lp_build_const_float(gallivm, 
65535), "");
val[chan] = LLVMBuildFAdd(builder, val[chan],
  lp_build_const_float(gallivm, 
0.5), "");
val[chan] = LLVMBuildFPToUI(builder, val[chan],
ctx->i32, "");
}
 
args[4] = uint->one; /* COMPR flag */
args[5] = bitcast(bld_base, TGSI_TYPE_FLOAT,
@@ -2681,21 +2681,21 @@ static void si_llvm_emit_vs_epilogue(struct 
lp_bu

[Mesa-dev] [PATCH 5/5] gallium: remove TGSI_OPCODE_CLAMP

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

Not used and not widely supported. Use MIN+MAX instead.
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c   | 16 
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c  |  8 
 src/gallium/auxiliary/nir/tgsi_to_nir.c  | 11 ---
 src/gallium/auxiliary/tgsi/tgsi_exec.c   | 16 
 src/gallium/auxiliary/tgsi/tgsi_info.c   |  2 +-
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h |  1 -
 src/gallium/auxiliary/tgsi/tgsi_util.c   |  1 -
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp| 10 --
 src/gallium/drivers/r300/r300_tgsi_to_rc.c   |  1 -
 src/gallium/drivers/r600/r600_shader.c   |  6 +++---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c|  3 ---
 src/gallium/drivers/svga/svga_tgsi_insn.c|  1 -
 src/gallium/include/pipe/p_shader_tokens.h   |  2 +-
 13 files changed, 5 insertions(+), 73 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index e78cdb0..dc6568a 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -103,35 +103,20 @@ static void
 arr_emit(
const struct lp_build_tgsi_action * action,
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
LLVMValueRef tmp = lp_build_emit_llvm_unary(bld_base, TGSI_OPCODE_ROUND, 
emit_data->args[0]);
emit_data->output[emit_data->chan] = 
LLVMBuildFPToSI(bld_base->base.gallivm->builder, tmp,

bld_base->uint_bld.vec_type, "");
 }
 
-/* TGSI_OPCODE_CLAMP */
-static void
-clamp_emit(
-   const struct lp_build_tgsi_action * action,
-   struct lp_build_tgsi_context * bld_base,
-   struct lp_build_emit_data * emit_data)
-{
-   LLVMValueRef tmp;
-   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_MAX,
-   emit_data->args[0],
-   emit_data->args[1]);
-   emit_data->output[emit_data->chan] = lp_build_emit_llvm_binary(bld_base,
-   TGSI_OPCODE_MIN, tmp, 
emit_data->args[2]);
-}
-
 /* DP* Helper */
 
 static void
 dp_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data,
unsigned dp_components)
 {
unsigned chan, src;
for (src = 0; src < 2; src++) {
@@ -1323,21 +1308,20 @@ lp_set_default_actions(struct lp_build_tgsi_context * 
bld_base)
bld_base->op_actions[TGSI_OPCODE_IF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_UIF].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILL_IF].fetch_args = kil_fetch_args;
bld_base->op_actions[TGSI_OPCODE_KILL].fetch_args = kilp_fetch_args;
bld_base->op_actions[TGSI_OPCODE_RCP].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_SIN].fetch_args = scalar_unary_fetch_args;
bld_base->op_actions[TGSI_OPCODE_LG2].fetch_args = scalar_unary_fetch_args;
 
bld_base->op_actions[TGSI_OPCODE_ADD].emit = add_emit;
bld_base->op_actions[TGSI_OPCODE_ARR].emit = arr_emit;
-   bld_base->op_actions[TGSI_OPCODE_CLAMP].emit = clamp_emit;
bld_base->op_actions[TGSI_OPCODE_END].emit = end_emit;
bld_base->op_actions[TGSI_OPCODE_FRC].emit = frc_emit;
bld_base->op_actions[TGSI_OPCODE_LRP].emit = lrp_emit;
bld_base->op_actions[TGSI_OPCODE_MAD].emit = mad_emit;
bld_base->op_actions[TGSI_OPCODE_MOV].emit = mov_emit;
bld_base->op_actions[TGSI_OPCODE_MUL].emit = mul_emit;
bld_base->op_actions[TGSI_OPCODE_DIV].emit = fdiv_emit;
bld_base->op_actions[TGSI_OPCODE_RCP].emit = rcp_emit;
 
bld_base->op_actions[TGSI_OPCODE_UARL].emit = mov_emit;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
index 6c177b0..2bd4291 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_aos.c
@@ -602,28 +602,20 @@ lp_emit_instruction_aos(
 
case TGSI_OPCODE_DP2A:
   return FALSE;
 
case TGSI_OPCODE_FRC:
   src0 = lp_build_emit_fetch(&bld->bld_base, inst, 0, LP_CHAN_ALL);
   tmp0 = lp_build_floor(&bld->bld_base.base, src0);
   dst0 = lp_build_sub(&bld->bld_base.base, src0, tmp0);
   break;
 
-   case TGSI_OPCODE_CLAMP:
-  src0 = lp_build_emit_fetch(&bld->bld_base, inst, 0, LP_CHAN_ALL);
-  src1 = lp_build_emit_fetch(&bld->bld_base, inst, 1, LP_CHAN_ALL);
-  src2 = lp_build_emit_fetch(&bld->bld_base, inst, 2, LP_CHAN_ALL);
-  tmp0 = lp_build_max(&bld->bld_base.base, src0, src1);
-  dst0 = lp_build_min(&bld->bld_base.base, tmp0, src2);
-  break;
-
case TGSI_OPCODE_FLR:
   src0 = lp_build_emit_fetch(&bld->bld_base, inst, 0, LP_CHAN_ALL);
   dst0 = lp_

[Mesa-dev] [PATCH 4/5] ac/llvm: use min+max instead of AMDGPU.clamp on LLVM 5.0

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

It selects v_med3_f32, which has the same rate & size.
---
 src/amd/common/ac_llvm_build.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 7e8552b..cbc048c 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -759,20 +759,37 @@ ac_emit_sendmsg(struct ac_llvm_context *ctx,
LLVMValueRef args[2];
const char *intr_name = (HAVE_LLVM < 0x0400) ? "llvm.SI.sendmsg" : 
"llvm.amdgcn.s.sendmsg";
args[0] = LLVMConstInt(ctx->i32, msg, false);
args[1] = wave_id;
ac_emit_llvm_intrinsic(ctx, intr_name, ctx->voidt,
   args, 2, 0);
 }
 
 LLVMValueRef ac_emit_clamp(struct ac_llvm_context *ctx, LLVMValueRef value)
 {
+   if (HAVE_LLVM >= 0x0500) {
+   LLVMValueRef max[2] = {
+   value,
+   LLVMConstReal(ctx->f32, 0),
+   };
+   LLVMValueRef min[2] = {
+   LLVMConstReal(ctx->f32, 1),
+   };
+
+   min[1] = ac_emit_llvm_intrinsic(ctx, "llvm.maxnum.f32",
+   ctx->f32, max, 2,
+   AC_FUNC_ATTR_READNONE);
+   return ac_emit_llvm_intrinsic(ctx, "llvm.minnum.f32",
+ ctx->f32, min, 2,
+ AC_FUNC_ATTR_READNONE);
+   }
+
const char *intr = HAVE_LLVM >= 0x0308 ? "llvm.AMDGPU.clamp." :
 "llvm.AMDIL.clamp.";
LLVMValueRef args[3] = {
value,
LLVMConstReal(ctx->f32, 0),
LLVMConstReal(ctx->f32, 1),
};
 
return ac_emit_llvm_intrinsic(ctx, intr, ctx->f32, args, 3,
  AC_FUNC_ATTR_READNONE);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] tgsi/lowering: stop using TGSI_OPCODE_CLAMP

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/tgsi/tgsi_lowering.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_lowering.c 
b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
index bf6cbb3..dbe5a71 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_lowering.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_lowering.c
@@ -565,30 +565,39 @@ transform_lit(struct tgsi_transform_context *tctx,
   /* MAX tmpA.xy, src.xy, imm{0.0} */
   new_inst = tgsi_default_full_instruction();
   new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_XY);
   new_inst.Instruction.NumSrcRegs = 2;
   reg_src(&new_inst.Src[0], src, SWIZ(X, Y, _, _));
   reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(X, X, _, _));
   tctx->emit_instruction(tctx, &new_inst);
 
-  /* CLAMP tmpA.z, src.w, -imm{128.0}, imm{128.0} */
+  /* MIN tmpA.z, src.w, imm{128.0} */
   new_inst = tgsi_default_full_instruction();
-  new_inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
+  new_inst.Instruction.Opcode = TGSI_OPCODE_MIN;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
-  new_inst.Instruction.NumSrcRegs = 3;
+  new_inst.Instruction.NumSrcRegs = 2;
+  reg_src(&new_inst.Src[0], src, SWIZ(_, _, W, _));
+  reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
+  tctx->emit_instruction(tctx, &new_inst);
+
+  /* MAX tmpA.z, src.w, -imm{128.0} */
+  new_inst = tgsi_default_full_instruction();
+  new_inst.Instruction.Opcode = TGSI_OPCODE_MAX;
+  new_inst.Instruction.NumDstRegs = 1;
+  reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Z);
+  new_inst.Instruction.NumSrcRegs = 2;
   reg_src(&new_inst.Src[0], src, SWIZ(_, _, W, _));
   reg_src(&new_inst.Src[1], &ctx->imm, SWIZ(_, _, Z, _));
   new_inst.Src[1].Register.Negate = true;
-  reg_src(&new_inst.Src[2], &ctx->imm, SWIZ(_, _, Z, _));
   tctx->emit_instruction(tctx, &new_inst);
 
   /* LG2 tmpA.y, tmpA.y */
   new_inst = tgsi_default_full_instruction();
   new_inst.Instruction.Opcode = TGSI_OPCODE_LG2;
   new_inst.Instruction.NumDstRegs = 1;
   reg_dst(&new_inst.Dst[0], &ctx->tmp[A].dst, TGSI_WRITEMASK_Y);
   new_inst.Instruction.NumSrcRegs = 1;
   reg_src(&new_inst.Src[0], &ctx->tmp[A].src, SWIZ(Y, _, _, _));
   tctx->emit_instruction(tctx, &new_inst);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Automated fuzzy testing of shader compilers - contacts in Mesa?

2017-02-16 Thread Ilia Mirkin
On Thu, Feb 16, 2017 at 12:11 PM, Hugues Evrard  wrote:
> Hi all,
>
> I'm a researcher at Imperial College London, my group is working on a
> testing framework for graphics drivers, in particular shader compilers,
> and we would like to get in touch with Mesa shader compiler developers.
>
> Our approach has already identified more than 50 bugs, spanning from bad
> image rendering to more severe issues like system hangs or crashes,
> across drivers of all major GPU designers (Intel, AMD...). In a
> nutshell, we use the "metamorphic testing" approach to perform fuzzing
> of GLSL shaders, and we're able to eventually produce a minimal test
> case to trigger bugs. For illustrated examples, see:
> https://medium.com/@afd_icl/crashes-hangs-and-crazy-images-by-adding-zero-689d15ce922b
>
> We're currently conducting experiments on Mesa drivers, and we will
> report our findings soon. I'm contacting this mailing list in advance
> since two of us are going to be in the US for the Game Dev. Conference
> at the end of the month, plus an extra week or more in March to visit
> companies (Apple, Nvidia, Qualcomm, ...). In order to make the most of
> this US journey, we would like to meet people from the Mesa shader
> compiler teams. The mesa website says that the GLSL compiler is
> contributed by Intel, is that still accurate? If so, could anyone help
> us to get in touch with this team (who is maybe reading this mailing
> list)? What about AMD?

While the original GLSL compiler was contributed largely by Intel, as
I understand it, it has since been developed and improved by many
people belonging to many organizations (and in some case, even
independent contributors).

Also to make it clear, there are many different steps to a successful
compilation, and the GLSL compiler is just one of them (the first
one). Some of the steps are shared between drivers, while others are
driver- or even hardware-specific. All of them have their own teams of
developers, along with their own unique bugs.

I hope you'll find the community is pretty responsive to issues that
are identified. The best way to report issues is with reproducible
examples -- apitrace is a great tool for creating these. I believe
some of your examples use WebGL, but that can end up being subject to
browsers' implementations.

I'm definitely looking forward to getting some quality bug reports!

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Automated fuzzy testing of shader compilers - contacts in Mesa?

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 18:11, Hugues Evrard wrote:

Hi all,

I'm a researcher at Imperial College London, my group is working on a
testing framework for graphics drivers, in particular shader compilers,
and we would like to get in touch with Mesa shader compiler developers.

Our approach has already identified more than 50 bugs, spanning from bad
image rendering to more severe issues like system hangs or crashes,
across drivers of all major GPU designers (Intel, AMD...). In a
nutshell, we use the "metamorphic testing" approach to perform fuzzing
of GLSL shaders, and we're able to eventually produce a minimal test
case to trigger bugs. For illustrated examples, see:
https://medium.com/@afd_icl/crashes-hangs-and-crazy-images-by-adding-zero-689d15ce922b

We're currently conducting experiments on Mesa drivers, and we will
report our findings soon. I'm contacting this mailing list in advance
since two of us are going to be in the US for the Game Dev. Conference
at the end of the month, plus an extra week or more in March to visit
companies (Apple, Nvidia, Qualcomm, ...). In order to make the most of
this US journey, we would like to meet people from the Mesa shader
compiler teams. The mesa website says that the GLSL compiler is
contributed by Intel, is that still accurate? If so, could anyone help
us to get in touch with this team (who is maybe reading this mailing
list)? What about AMD?


AMD is around as well :)

I've been following your postings, and I'm happy to hear you're 
interesting in fuzzing Mesa as well. I'll send you some more details.


Cheers,
Nicolai


Note that besides GLSL compilers, we're looking forward to extend our
approach to other APIs like Vulkan. Any contact in companies that deal
with shader manipulation, such as video game engines editors like Valve
or Unity (we're already in touch with Epic Games) would also be much
appreciated!

Thanks in advance for your support,
--
Hugues Evrard
Research Associate
Multicore Programming Group, Imperial College London
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99305] account creation request

2017-02-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99305

--- Comment #3 from Alex Deucher  ---
You generally need someone from the project to ack you and switch the
product/component to freedesktop.org/New Accounts

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99305] account creation request

2017-02-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99305

--- Comment #2 from George Kyriazis  ---
pinging whoever is responsible.  Can I get an idea of what the turnaround time
will be for account creation?

Thanks!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] radv: Use different allocator for descriptor set vram.

2017-02-16 Thread Bas Nieuwenhuizen
This one only keeps allocated memory in the list, and list nodes
in the descriptor sets. Thsi doesn't need messing around with
max_sets, and we get automatic merging of free regions.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_descriptor_set.c | 82 +++-
 src/amd/vulkan/radv_private.h| 18 ++--
 2 files changed, 29 insertions(+), 71 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 81291d10037..e2bd9b92d90 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -275,39 +275,37 @@ radv_descriptor_set_create(struct radv_device *device,
uint32_t layout_size = align_u32(layout->size, 32);
set->size = layout->size;
if (!cmd_buffer) {
-   if (pool->current_offset + layout_size <= pool->size &&
-   pool->allocated_sets < pool->max_sets) {
+   /* try to allocate linearly first, so that we don't 
spend
+* time looking for gaps if the app only allocates &
+* resets via the pool. */
+   if (pool->current_offset + layout_size <= pool->size) {
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr 
+ pool->current_offset);
set->va = device->ws->buffer_get_va(set->bo) + 
pool->current_offset;
pool->current_offset += layout_size;
-   ++pool->allocated_sets;
+   list_addtail(&set->vram_list, &pool->vram_list);
} else {
-   int entry = pool->free_list, prev_entry = -1;
-   uint32_t offset;
-   while (entry >= 0) {
-   if (pool->free_nodes[entry].size >= 
layout_size) {
-   if (prev_entry >= 0)
-   
pool->free_nodes[prev_entry].next = pool->free_nodes[entry].next;
-   else
-   pool->free_list = 
pool->free_nodes[entry].next;
+   uint64_t offset = 0;
+   struct list_head *prev = &pool->vram_list;
+   struct radv_descriptor_set *cur;
+   LIST_FOR_EACH_ENTRY(cur, &pool->vram_list, 
vram_list) {
+   uint64_t start = 
(uint8_t*)cur->mapped_ptr - pool->mapped_ptr;
+   if (start - offset >= layout_size)
break;
-   }
-   prev_entry = entry;
-   entry = pool->free_nodes[entry].next;
+
+   offset = start + cur->size;
+   prev = &cur->vram_list;
}
 
-   if (entry < 0) {
+   if (pool->size - offset < layout_size) {
+   vk_free2(&device->alloc, NULL, 
set->dynamic_descriptors);
vk_free2(&device->alloc, NULL, set);
return 
vk_error(VK_ERROR_OUT_OF_POOL_MEMORY_KHR);
}
-   offset = pool->free_nodes[entry].offset;
-   pool->free_nodes[entry].next = pool->full_list;
-   pool->full_list = entry;
-
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr 
+ offset);
set->va = device->ws->buffer_get_va(set->bo) + 
offset;
+   list_add(&set->vram_list, prev);
}
} else {
unsigned bo_offset;
@@ -324,11 +322,6 @@ radv_descriptor_set_create(struct radv_device *device,
}
}
 
-   if (pool)
-   list_add(&set->descriptor_pool, &pool->descriptor_sets);
-   else
-   list_inithead(&set->descriptor_pool);
-
for (unsigned i = 0; i < layout->binding_count; ++i) {
if (!layout->binding[i].immutable_samplers)
continue;
@@ -355,19 +348,10 @@ radv_descriptor_set_destroy(struct radv_device *device,
struct radv_descriptor_set *set,
bool free_bo)
 {
-   if (free_bo && set->size) {
-   assert(pool->full_list >= 0);
-  

[Mesa-dev] [PATCH 1/2] radv: Never try to create more than max_sets descriptor sets.

2017-02-16 Thread Bas Nieuwenhuizen
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.

Signed-off-by: Bas Nieuwenhuizen 
CC: 
Fixes: f4e499ec79147f4172f3669ae9dafd941aaeeb65
---
 src/amd/vulkan/radv_descriptor_set.c | 7 +--
 src/amd/vulkan/radv_private.h| 1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 6d89d601de0..81291d10037 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -275,12 +275,13 @@ radv_descriptor_set_create(struct radv_device *device,
uint32_t layout_size = align_u32(layout->size, 32);
set->size = layout->size;
if (!cmd_buffer) {
-   if (pool->current_offset + layout_size <= pool->size) {
+   if (pool->current_offset + layout_size <= pool->size &&
+   pool->allocated_sets < pool->max_sets) {
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr 
+ pool->current_offset);
set->va = device->ws->buffer_get_va(set->bo) + 
pool->current_offset;
pool->current_offset += layout_size;
-
+   ++pool->allocated_sets;
} else {
int entry = pool->free_list, prev_entry = -1;
uint32_t offset;
@@ -417,6 +418,7 @@ VkResult radv_CreateDescriptorPool(
pool->full_list = 0;
pool->free_nodes[max_sets - 1].next = -1;
pool->max_sets = max_sets;
+   pool->allocated_sets = 0;
 
for (int i = 0; i  + 1 < max_sets; ++i)
pool->free_nodes[i].next = i + 1;
@@ -494,6 +496,7 @@ VkResult radv_ResetDescriptorPool(
radv_descriptor_set_destroy(device, pool, set, false);
}
 
+   pool->allocated_sets = 0;
pool->current_offset = 0;
pool->free_list = -1;
pool->full_list = 0;
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 7b1d8fb1f45..9c326dcef83 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -564,6 +564,7 @@ struct radv_descriptor_pool {
int free_list;
int full_list;
uint32_t max_sets;
+   uint32_t allocated_sets;
struct radv_descriptor_pool_free_node free_nodes[];
 };
 
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH shader-db 3/4] run: set INTEL_NO_HW together with INTEL_DEVID_OVERRIDE

2017-02-16 Thread Kenneth Graunke
On Thursday, February 16, 2017 4:29:50 AM PST Lionel Landwerlin wrote:
> Since we're already asking the driver to generate code for a different
> hardware than what we're running on, better not even bother with emitting
> any batch.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  run.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/run.c b/run.c
> index 62c19c8..7543b2a 100644
> --- a/run.c
> +++ b/run.c
> @@ -370,6 +370,7 @@ main(int argc, char **argv)
>  
>  printf("### Compiling for %s ###\n", platform->name);
>  setenv("INTEL_DEVID_OVERRIDE", platform->pci_id, 1);
> +setenv("INTEL_NO_HW", "1", 1);
>  break;
>  }
>  case 'j':
> 

I don't think you need this patch - libdrm will already not execute
batches if INTEL_DEVID_OVERRIDE is used to force a PCI ID that doesn't
match the one on the system.

Unless the fake PCI ID happens to match the one you're compiling for...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Clamp GetUniformuiv values to be >= 0.

2017-02-16 Thread Antía Puentes
On lun, 2016-12-12 at 10:43 +0100, Nicolai Hähnle wrote:
> On 12.12.2016 00:25, Kenneth Graunke wrote:
> > 
> > Section 2.2.2 (Data Conversions For State Query Commands) of the
> > OpenGL 4.5 October 24th 2016 specification says:
> > 
> > "If a command returning unsigned integer data is called, such as
> >  GetSamplerParameterIuiv, negative values are clamped to zero."
> > 
> > Fixes GL44-CTS.gpu_shader_fp64.state_query.
> > 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/main/uniform_query.cpp | 48
> > +
> >  1 file changed, 39 insertions(+), 9 deletions(-)
> > 
> > Hey Nicolai,
> > 
> > I wrote a similar patch a while back, but never got around to
> > sending it,
> > since I realized that the gl45release branch expects our current
> > behavior,
> > and the change to make the CTS expect clamping is only on the
> > master branch.
> > 
> > Apparently I made some additional changes, compared to yours.  I
> > figured
> > I'd send this along and let you see if you think any of my extra
> > changes
> > are still necessary.  If so, feel free to fold them into your
> > patch.
> > 
> > I also think we need to fix several other glGet* commands...it's
> > just that
> > this is the only one currently tested.  A bunch work because the
> > values
> > returned can't be negative.
> I think your patch is a strict superset of what mine does and should
> be 
> used instead. I do have one comment below, with that fixed it has my
> R-b.

This patch was never pushed, was it? and GL45-CTS.gpu_shader_fp64.state_query
fails in the new vk-gl-cts repository because it expects these negative
values to be clamped.

> There is the more general question of how to cope with those cases
> where 
> the CTS requires non-standard behavior. I think we should insist on 
> doing the right thing in Mesa, and push for changes to the CTS.
> 
> Until quite recently, I've been occupied by radeonsi- and 
> Gallium-specific bugs, but that's changing and I'm looking into
> using 
> CTS master rather than back-porting fixes to the dead gl45release
> branch 
> (hence this patch).
> 
> > 
> > 
> >  --Ken
> > 
> > diff --git a/src/mesa/main/uniform_query.cpp
> > b/src/mesa/main/uniform_query.cpp
> > index db700df..f05a29f 100644
> > --- a/src/mesa/main/uniform_query.cpp
> > +++ b/src/mesa/main/uniform_query.cpp
> > @@ -347,14 +347,10 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> > * just memcpy the data.  If the types are not compatible,
> > perform a
> > * slower convert-and-copy process.
> > */
> > -  if (returnType == uni->type->base_type
> > -     || ((returnType == GLSL_TYPE_INT
> > -      || returnType == GLSL_TYPE_UINT)
> > -     &&
> > -     (uni->type->base_type == GLSL_TYPE_INT
> > -      || uni->type->base_type == GLSL_TYPE_UINT
> > -   || uni->type->base_type == GLSL_TYPE_SAMPLER
> > -   || uni->type->base_type == GLSL_TYPE_IMAGE))) {
> > +  if (returnType == uni->type->base_type ||
> > +  ((returnType == GLSL_TYPE_INT || returnType ==
> > GLSL_TYPE_UINT) &&
> > +   (uni->type->base_type == GLSL_TYPE_SAMPLER ||
> > +uni->type->base_type == GLSL_TYPE_IMAGE))) {
> >      memcpy(paramsOut, src, bytes);
> >    } else {
> >      union gl_constant_value *const dst =
> > @@ -422,7 +418,6 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >        }
> >        break;
> >     case GLSL_TYPE_INT:
> > -   case GLSL_TYPE_UINT:
> >        switch (uni->type->base_type) {
> >        case GLSL_TYPE_FLOAT:
> >       /* While the GL 3.2 core spec doesn't explicitly
> > @@ -447,6 +442,9 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >        case GLSL_TYPE_BOOL:
> >       dst[didx].i = src[sidx].i ? 1 : 0;
> >       break;
> > +   case GLSL_TYPE_UINT:
> > +  dst[didx].i = src[sidx].i;
> I think this should be
> 
> dst[didx].i = MIN2(src[sidx].u, INT_MAX);
> 
> Cheers,
> Nicolai
> 
> > 
> > +  break;
> > case GLSL_TYPE_DOUBLE: {
> >    double tmp;
> >    memcpy(&tmp, &src[sidx].f, sizeof(tmp));
> > @@ -458,6 +456,38 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >       break;
> >        }
> >        break;
> > +case GLSL_TYPE_UINT:
> > +   switch (uni->type->base_type) {
> > +   case GLSL_TYPE_FLOAT:
> > +  /* The spec isn't terribly clear how to handle
> > negative
> > +   * values with an unsigned return type.
> > +   *
> > +   * GL 4.5 section 2.2.2 ("Data Conversions for
> > State
> > +   * Query Commands") says:
> > +   *
> > +   * "If a value is so large in magnit

Re: [Mesa-dev] [PATCH] removed report to vendor message when dri3 is not detected

2017-02-16 Thread Emil Velikov
Hi Jacob,

On 11 February 2017 at 01:44, Jacob Lifshay  wrote:
> fixes bug 99715
>
While topic is [somewhat] ongoing, I'd suggest skimming through git
log for future patches.
Namely: use he correct prefix for the patch, explain why in the commit
message and use Bugzilla: http... tag.

Thanks !
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: fix uninitialized memory access

2017-02-16 Thread Emil Velikov
On 16 February 2017 at 15:06, Lionel Landwerlin
 wrote:
> Found while running shader-db under valgrind.
>
> Signed-off-by: Lionel Landwerlin 
Cc: mesa-sta...@lists.freedesktop.org

Regardless how likely it is to hit ;-)

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: fix uninitialized memory access

2017-02-16 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2017-02-16 07:06:09, Lionel Landwerlin wrote:
> Found while running shader-db under valgrind.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
> index f56f05b7e9..952276faed 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
> @@ -207,9 +207,8 @@ fs_visitor::register_coalesce()
>  channels_remaining = -1;
>  continue;
>   }
> - dst_reg_offset[offset] = inst->dst.offset / REG_SIZE;
> - if (inst->size_written > REG_SIZE)
> -dst_reg_offset[offset + 1] = inst->dst.offset / REG_SIZE + 1;
> + for (unsigned i = 0; i < MAX2(inst->size_written / REG_SIZE, 1); 
> i++)
> +dst_reg_offset[offset + i] = inst->dst.offset / REG_SIZE + i;
>   mov[offset] = inst;
>   channels_remaining -= regs_written(inst);
>}
> -- 
> 2.11.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Automated fuzzy testing of shader compilers - contacts in Mesa?

2017-02-16 Thread Matt Turner
On Thu, Feb 16, 2017 at 9:11 AM, Hugues Evrard  wrote:
> Hi all,
>
> I'm a researcher at Imperial College London, my group is working on a
> testing framework for graphics drivers, in particular shader compilers,
> and we would like to get in touch with Mesa shader compiler developers.
>
> Our approach has already identified more than 50 bugs, spanning from bad
> image rendering to more severe issues like system hangs or crashes,
> across drivers of all major GPU designers (Intel, AMD...). In a
> nutshell, we use the "metamorphic testing" approach to perform fuzzing
> of GLSL shaders, and we're able to eventually produce a minimal test
> case to trigger bugs. For illustrated examples, see:
> https://medium.com/@afd_icl/crashes-hangs-and-crazy-images-by-adding-zero-689d15ce922b
>
> We're currently conducting experiments on Mesa drivers, and we will
> report our findings soon. I'm contacting this mailing list in advance
> since two of us are going to be in the US for the Game Dev. Conference
> at the end of the month, plus an extra week or more in March to visit
> companies (Apple, Nvidia, Qualcomm, ...). In order to make the most of
> this US journey, we would like to meet people from the Mesa shader
> compiler teams. The mesa website says that the GLSL compiler is
> contributed by Intel, is that still accurate? If so, could anyone help
> us to get in touch with this team (who is maybe reading this mailing
> list)? What about AMD?

Yes, contributed by Intel, and we're still around!

We're very excited to hear about the results of your work. I'll email
you off list.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Automated fuzzy testing of shader compilers - contacts in Mesa?

2017-02-16 Thread Hugues Evrard
Hi all,

I'm a researcher at Imperial College London, my group is working on a
testing framework for graphics drivers, in particular shader compilers,
and we would like to get in touch with Mesa shader compiler developers.

Our approach has already identified more than 50 bugs, spanning from bad
image rendering to more severe issues like system hangs or crashes,
across drivers of all major GPU designers (Intel, AMD...). In a
nutshell, we use the "metamorphic testing" approach to perform fuzzing
of GLSL shaders, and we're able to eventually produce a minimal test
case to trigger bugs. For illustrated examples, see:
https://medium.com/@afd_icl/crashes-hangs-and-crazy-images-by-adding-zero-689d15ce922b

We're currently conducting experiments on Mesa drivers, and we will
report our findings soon. I'm contacting this mailing list in advance
since two of us are going to be in the US for the Game Dev. Conference
at the end of the month, plus an extra week or more in March to visit
companies (Apple, Nvidia, Qualcomm, ...). In order to make the most of
this US journey, we would like to meet people from the Mesa shader
compiler teams. The mesa website says that the GLSL compiler is
contributed by Intel, is that still accurate? If so, could anyone help
us to get in touch with this team (who is maybe reading this mailing
list)? What about AMD?

Note that besides GLSL compilers, we're looking forward to extend our
approach to other APIs like Vulkan. Any contact in companies that deal
with shader manipulation, such as video game engines editors like Valve
or Unity (we're already in touch with Epic Games) would also be much
appreciated!

Thanks in advance for your support,
--
Hugues Evrard
Research Associate
Multicore Programming Group, Imperial College London
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: fix 32-bit data type to int64 conversion on BSW/BXT

2017-02-16 Thread Matt Turner
On Sun, Feb 12, 2017 at 10:06 PM, Samuel Iglesias Gonsálvez
 wrote:
> The 32-bit to 64-bit conversions need to have the 32-bit
> data source elements aligned to 64-bit but only with doubles as
> destination type.

How aggravating that the documentation doesn't say this.

>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
>

No newline here.

> Signed-off-by: Samuel Iglesias Gonsálvez 
> Tested-by: Mark Janes 

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-16 Thread Emil Velikov
On 16 February 2017 at 14:23, Jonathan Gray  wrote:
> On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
>> Provides the ability to read the .note.gnu.build-id section of ELF
>> binaries, which is inserted by the --build-id=... flag to ld.
>>
>> Reviewed-by: Emil Velikov 
>
> I don't have time to dig into details right now but this broke the Mesa
> build on OpenBSD and likely other non-linux platforms:
>
> libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
> -DPACKAGE_VERSION=\"17.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 17.1.0-devel\"" 
> "-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"";
>  -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
> -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
> -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 
> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
> -DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 -DHAVE___BUILTIN_CLZLL=1 
> -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 -DHAVE___BUILTIN_FFS=1 
> -DHAVE___BUILTIN_FFSLL=1 -DHAVE___BUILTIN_POPCOUNT=1 
> -DHAVE___BUILTIN_POPCOUNTLL=1 -DHAVE_FUNC_ATTRIBUTE_CONST=1 
> -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 
> -DHAVE_FUNC_ATTRIBUTE_MALLOC=1 -DHAVE_FUNC_ATTRIBUTE_PACKED=1 
> -DHAVE_FUNC_ATTRIBUTE_PURE=1 -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 
> -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 
> -DHAVE_FUNC_ATTRIBUTE_WEAK=1 -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 
> -DHAVE_CLOCK_GETTIME=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. 
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DDEBUG 
> -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF 
> -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR -DHAVE_POSIX_MEMALIGN 
> -DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING 
> -DENABLE_SHADER_CACHE -DHAVE_MINCORE -I../../include -I../../src 
> -I../../src/mapi -I../../src/mesa -I../../src/gallium/include 
> -I../../src/gallium/auxiliary -fvisibility=hidden -Werror=pointer-arith -g 
> -O2 -Wall -std=gnu99 -Werror=implicit-function-declaration 
> -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -MT 
> libmesautil_la-build_id.lo -MD -MP -MF .deps/libmesautil_la-build_id.Tpo -c 
> build_id.c  -fPIC -DPIC -o .libs/libmesautil_la-build_id.o
> In file included from /usr/include/elf_abi.h:31,
>  from /usr/include/link_elf.h:10,
>  from /usr/include/link.h:39,
>  from build_id.c:25:
> /usr/include/sys/exec_elf.h:585: error: expected specifier-qualifier-list 
> before 'uint32_t'
> In file included from /usr/include/link.h:39,
>  from build_id.c:25:
> /usr/include/link_elf.h:22: error: expected specifier-qualifier-list before 
> 'caddr_t'
> /usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'int'
> In file included from build_id.c:25:
> /usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or 
> '__attribute__' before 'struct'
> /usr/include/link.h:65: error: expected specifier-qualifier-list before 
> 'caddr_t'
These look like issue in your platform code/headers. Perhaps some bad
interaction with the bits that Mesa defines ?

Quick workaround is to check the function only when needed, roughly
like this pseudo code:

if test $building_any_vulkan_driver = yes ;then
require_dl...=yes
   
fi


if test $require_dl... = yes ; then
   AC_CHECK_FUNC([dl_iterate_phdr], [DEFINES="$DEFINES
-DHAVE_DL_ITERATE_PHDR"], [AC_MSG_ERROR([required  not found])])
fi


Please give it a bash and send us a patch that works on your end.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/7] gallium/hud: create files after graphs are created to get final names

2017-02-16 Thread Edmondo Tommasina
Thanks for the patch, it looks good.

Reviewed-by: Edmondo Tommasina 



On Thu, Feb 16, 2017 at 1:52 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/auxiliary/hud/hud_context.c  | 25 +++--
>  src/gallium/auxiliary/hud/hud_cpu.c  |  4 
>  src/gallium/auxiliary/hud/hud_driver_query.c |  2 --
>  src/gallium/auxiliary/hud/hud_fps.c  |  2 --
>  src/gallium/auxiliary/hud/hud_private.h  |  2 --
>  5 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_context.c 
> b/src/gallium/auxiliary/hud/hud_context.c
> index 9de260c..aaa52d5 100644
> --- a/src/gallium/auxiliary/hud/hud_context.c
> +++ b/src/gallium/auxiliary/hud/hud_context.c
> @@ -932,33 +932,46 @@ static void
>  hud_graph_destroy(struct hud_graph *graph)
>  {
> FREE(graph->vertices);
> if (graph->free_query_data)
>graph->free_query_data(graph->query_data);
> if (graph->fd)
>fclose(graph->fd);
> FREE(graph);
>  }
>
> -void
> +static void strcat_without_spaces(char *dst, const char *src)
> +{
> +   dst += strlen(dst);
> +   while (*src) {
> +  if (*src == ' ')
> + *dst++ = '_';
> +  else
> + *dst++ = *src;
> +  src++;
> +   }
> +   *dst = 0;
> +}
> +
> +static void
>  hud_graph_set_dump_file(struct hud_graph *gr)
>  {
>  #ifndef PIPE_OS_WINDOWS
> const char *hud_dump_dir = getenv("GALLIUM_HUD_DUMP_DIR");
> char *dump_file;
>
> if (hud_dump_dir && access(hud_dump_dir, W_OK) == 0) {
>dump_file = malloc(strlen(hud_dump_dir) + sizeof("/") + 
> sizeof(gr->name));
>if (dump_file) {
>   strcpy(dump_file, hud_dump_dir);
>   strcat(dump_file, "/");
> - strcat(dump_file, gr->name);
> + strcat_without_spaces(dump_file, gr->name);
>   gr->fd = fopen(dump_file, "w+");
>   free(dump_file);
>}
> }
>  #endif
>  }
>
>  /**
>   * Read a string from the environment variable.
>   * The separators "+", ",", ":", and ";" terminate the string.
> @@ -1369,20 +1382,28 @@ hud_parse_env_var(struct hud_context *hud, const char 
> *env)
> }
>
> if (pane) {
>if (pane->num_graphs) {
>   LIST_ADDTAIL(&pane->head, &hud->pane_list);
>}
>else {
>   FREE(pane);
>}
> }
> +
> +   LIST_FOR_EACH_ENTRY(pane, &hud->pane_list, head) {
> +  struct hud_graph *gr;
> +
> +  LIST_FOR_EACH_ENTRY(gr, &pane->graph_list, head) {
> + hud_graph_set_dump_file(gr);
> +  }
> +   }
>  }
>
>  static void
>  print_help(struct pipe_screen *screen)
>  {
> int i, num_queries, num_cpus = hud_get_num_cpus();
>
> puts("Syntax: 
> GALLIUM_HUD=name1[+name2][...][:value1][,nameI...][;nameJ...]");
> puts("");
> puts("  Names are identifiers of data sources which will be drawn as 
> graphs");
> diff --git a/src/gallium/auxiliary/hud/hud_cpu.c 
> b/src/gallium/auxiliary/hud/hud_cpu.c
> index a8d97b8..1cba353 100644
> --- a/src/gallium/auxiliary/hud/hud_cpu.c
> +++ b/src/gallium/auxiliary/hud/hud_cpu.c
> @@ -207,22 +207,20 @@ hud_cpu_graph_install(struct hud_pane *pane, unsigned 
> cpu_index)
> gr->query_new_value = query_cpu_load;
>
> /* Don't use free() as our callback as that messes up Gallium's
>  * memory debugger.  Use simple free_query_data() wrapper.
>  */
> gr->free_query_data = free_query_data;
>
> info = gr->query_data;
> info->cpu_index = cpu_index;
>
> -   hud_graph_set_dump_file(gr);
> -
> hud_pane_add_graph(pane, gr);
> hud_pane_set_max_value(pane, 100);
>  }
>
>  int
>  hud_get_num_cpus(void)
>  {
> uint64_t busy, total;
> int i = 0;
>
> @@ -278,15 +276,13 @@ hud_api_thread_busy_install(struct hud_pane *pane)
>return;
> }
>
> gr->query_new_value = query_api_thread_busy_status;
>
> /* Don't use free() as our callback as that messes up Gallium's
>  * memory debugger.  Use simple free_query_data() wrapper.
>  */
> gr->free_query_data = free_query_data;
>
> -   hud_graph_set_dump_file(gr);
> -
> hud_pane_add_graph(pane, gr);
> hud_pane_set_max_value(pane, 100);
>  }
> diff --git a/src/gallium/auxiliary/hud/hud_driver_query.c 
> b/src/gallium/auxiliary/hud/hud_driver_query.c
> index 6a97dbd..76104b5 100644
> --- a/src/gallium/auxiliary/hud/hud_driver_query.c
> +++ b/src/gallium/auxiliary/hud/hud_driver_query.c
> @@ -387,22 +387,20 @@ hud_pipe_query_install(struct hud_batch_query_context 
> **pbq,
> if (flags & PIPE_DRIVER_QUERY_FLAG_BATCH) {
>if (!batch_query_add(pbq, pipe, query_type, &info->result_index))
>   goto fail_info;
>info->batch = *pbq;
> } else {
>gr->begin_query = begin_query;
>info->query_type = query_type;
>info->result_index = result_index;
> }
>
> -   hud_graph_set_dump_file(gr);
> -
> hud_pane_add_graph(pane, gr);
> pane->type = type; /* must be set before updating the max_value */
>
> if (pane->max

Re: [Mesa-dev] [PATCH] glx/glvnd: Fix GLXdispatchIndex sorting

2017-02-16 Thread Emil Velikov
Hi Hans,

On 6 February 2017 at 13:09, Hans de Goede  wrote:
> Commit 8bca8d89ef3b ("glx/glvnd: Fix dispatch function names and indices")
> fixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c
> because FindGLXFunction's binary search needs these to be sorted
> alphabetically.
>
> That commit also mostly fixed the sorting of the DI_foo defines in
> g_glxglvnddispatchindices.h, which is what actually matters as the
> arrays are initialized using "[DI_foo] = glXfoo," but a small error
> crept in which at least causes glXGetVisualFromFBConfigSGIX to not
> resolve, breaking games such as "The Binding of Isaac: Rebirth" and
> "Crypt of the NecroDancer" from Steam not working and possible causes
> other problems too.
>
> This commit fixes the last of the sorting errors, fixing these mentioned
> games not working.
>
> Fixes: 8bca8d89ef3b ("glx/glvnd: Fix dispatch function names and indices")
> Cc: "13.0" 
> Cc: "17.0" 
> Cc: Adam Jackson 
> Signed-off-by: Hans de Goede 
> ---
A while back as Adam did a similar thing, it was suggested that we get
an actual test so that things don't break.

I was stupid^Wkind enough to opt for "we can have such patch as
follow-up", only that it never came.
As you can imagine not cool...

Can we have one now, please ?
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [AppVeyor] mesa 17.0 #3471 failed

2017-02-16 Thread AppVeyor



Build mesa 3471 failed


Commit 73f8abe32a by Hans de Goede on 2/6/2017 11:13 AM:

glx/glvnd: Fix GLXdispatchIndex sorting\n\nCommit 8bca8d89ef3b ("glx/glvnd: Fix dispatch function names and indices")\nfixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c\nbecause FindGLXFunction's binary search needs these to be sorted\nalphabetically.\n\nThat commit also mostly fixed the sorting of the DI_foo defines in\ng_glxglvnddispatchindices.h, which is what actually matters as the\narrays are initialized using "[DI_foo] = glXfoo," but a small error\ncrept in which at least causes glXGetVisualFromFBConfigSGIX to not\nresolve, breaking games such as "The Binding of Isaac: Rebirth" and\n"Crypt of the NecroDancer" from Steam not working and possible causes\nother problems too.\n\nThis commit fixes the last of the sorting errors, fixing these mentioned\ngames not working.\n\nFixes: 8bca8d89ef3b ("glx/glvnd: Fix dispatch function names and indices")\nCc: "13.0" \nCc: "17.0" \nCc: Adam Jackson \nSigned-off-by: Hans de Goede \nReviewed-by: Eric Engestrom 


Configure your notification preferences

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/13] mesa: remove unneeded extern C {} wrapper

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

compiler.h defines a few mesa specific macros which are not C specific.
This allows us to avoid buggy extern C { #include $system_header }
constructs.

Signed-off-by: Emil Velikov 
---
 src/mesa/main/compiler.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
index c5ee7412b6..43a06b4313 100644
--- a/src/mesa/main/compiler.h
+++ b/src/mesa/main/compiler.h
@@ -41,11 +41,6 @@
 #include "c99_compat.h" /* inline, __func__, etc. */
 
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-
 /**
  * Either define MESA_BIG_ENDIAN or MESA_LITTLE_ENDIAN, and CPU_TO_LE32.
  * Do not use these unless absolutely necessary!
@@ -78,9 +73,4 @@ extern "C" {
 #define IEEE_ONE 0x3f80
 
 
-#ifdef __cplusplus
-}
-#endif
-
-
 #endif /* COMPILER_H */
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/13] i915: remove 'virtual' and extern C workarounds

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i915/intel_context.h | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_context.h 
b/src/mesa/drivers/dri/i915/intel_context.h
index 5832169825..139a033777 100644
--- a/src/mesa/drivers/dri/i915/intel_context.h
+++ b/src/mesa/drivers/dri/i915/intel_context.h
@@ -34,18 +34,9 @@
 #include "main/mtypes.h"
 #include "main/mm.h"
 
-#ifdef __cplusplus
-extern "C" {
-   /* Evil hack for using libdrm in a c++ compiler. */
-   #define virtual virt
-#endif
-
 #include 
 #include 
 #include 
-#ifdef __cplusplus
-   #undef virtual
-#endif
 
 #include "intel_screen.h"
 #include "intel_tex_obj.h"
@@ -56,6 +47,10 @@ extern "C" {
 #include "tnl_dd/t_dd_vertex.h"
 #undef TAG
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 #define DV_PF_555  (1<<8)
 #define DV_PF_565  (2<<8)
 #define DV_PF_ (3<<8)
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/13] radv: remove unneeded extern C notation

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Header is never #include(d) by a C++ source.

Signed-off-by: Emil Velikov 
---
 src/amd/vulkan/vk_format.h | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/src/amd/vulkan/vk_format.h b/src/amd/vulkan/vk_format.h
index 58ee3f71f0..bee8e7d9ee 100644
--- a/src/amd/vulkan/vk_format.h
+++ b/src/amd/vulkan/vk_format.h
@@ -26,13 +26,10 @@
 
 #pragma once
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 #include 
 #include 
+
 enum vk_format_layout {
/**
 * Formats with vk_format_block::width == vk_format_block::height == 1
@@ -446,6 +443,3 @@ vk_format_get_component_bits(VkFormat format,
return 0;
}
 }
-#ifdef __cplusplus
-} // extern "C" {
-#endif
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/13] mesa: annotate functions for C linkage

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

i.e. add extern C {} in program/symbol_table.h

It will allow us remove a workaround we have elsewhere in the code.

Signed-off-by: Emil Velikov 
---
 src/mesa/program/symbol_table.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/program/symbol_table.h b/src/mesa/program/symbol_table.h
index cba47143ef..6db2164fc2 100644
--- a/src/mesa/program/symbol_table.h
+++ b/src/mesa/program/symbol_table.h
@@ -23,6 +23,10 @@
 #ifndef MESA_SYMBOL_TABLE_H
 #define MESA_SYMBOL_TABLE_H
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 struct _mesa_symbol_table;
 
 extern void _mesa_symbol_table_push_scope(struct _mesa_symbol_table *table);
@@ -51,4 +55,8 @@ extern struct _mesa_symbol_table 
*_mesa_symbol_table_ctor(void);
 
 extern void _mesa_symbol_table_dtor(struct _mesa_symbol_table *);
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* MESA_SYMBOL_TABLE_H */
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/13] anv: remove unneeded extern C notation

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Signed-off-by: Emil Velikov 
---
 src/intel/vulkan/anv_private.h | 8 
 1 file changed, 8 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index da1ca29f64..f3a267f051 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -68,10 +68,6 @@ struct gen_l3_config;
 
 #include "wsi_common.h"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 /* Allowing different clear colors requires us to perform a depth resolve at
  * the end of certain render passes. This is because while slow clears store
  * the clear color in the HiZ buffer, fast clears (without a resolve) don't.
@@ -1970,8 +1966,4 @@ ANV_DEFINE_NONDISP_HANDLE_CASTS(anv_shader_module, 
VkShaderModule)
 #  undef genX
 #endif
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif /* ANV_PRIVATE_H */
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/13] i965: add extern C notation in headers

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Otherwise symbols wont be annotated with C linkage and we'll fail at
link time.

Currently this is worked around by wrapping the header inclusion itself.
The latter in itself fragile and not recommended.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i965/intel_debug.h   | 7 +++
 src/mesa/drivers/dri/i965/intel_screen.h  | 8 
 src/mesa/drivers/dri/i965/intel_tex_obj.h | 7 +++
 3 files changed, 22 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
index afca36eb33..e8e329bc60 100644
--- a/src/mesa/drivers/dri/i965/intel_debug.h
+++ b/src/mesa/drivers/dri/i965/intel_debug.h
@@ -24,6 +24,9 @@
  */
 #pragma once
 
+#ifdef __cplusplus
+extern "C" {
+#endif
 /**
  * \file intel_debug.h
  *
@@ -122,3 +125,7 @@ extern uint64_t INTEL_DEBUG;
 extern uint64_t intel_debug_flag_for_shader_stage(gl_shader_stage stage);
 
 extern void brw_process_intel_debug_variable(void);
+
+#ifdef __cplusplus
+}
+#endif
diff --git a/src/mesa/drivers/dri/i965/intel_screen.h 
b/src/mesa/drivers/dri/i965/intel_screen.h
index a1e2b31774..147af257be 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.h
+++ b/src/mesa/drivers/dri/i965/intel_screen.h
@@ -37,6 +37,10 @@
 #include "i915_drm.h"
 #include "xmlconfig.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 struct intel_screen
 {
int deviceID;
@@ -154,4 +158,8 @@ can_do_predicate_writes(const struct intel_screen *screen)
return screen->kernel_features & KERNEL_ALLOWS_PREDICATE_WRITES;
 }
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif
diff --git a/src/mesa/drivers/dri/i965/intel_tex_obj.h 
b/src/mesa/drivers/dri/i965/intel_tex_obj.h
index 844aad1ab3..27c18b7c3c 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_obj.h
+++ b/src/mesa/drivers/dri/i965/intel_tex_obj.h
@@ -28,6 +28,9 @@
 
 #include "swrast/s_context.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
 
 struct intel_texture_object
 {
@@ -90,4 +93,8 @@ intel_texture_image(struct gl_texture_image *img)
return (struct intel_texture_image *) img;
 }
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _INTEL_TEX_OBJ_H */
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/13] gallium: do not #include foo.h within extern C {}

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit.

Signed-off-by: Emil Velikov 
---
 src/gallium/auxiliary/tgsi/tgsi_util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h 
b/src/gallium/auxiliary/tgsi/tgsi_util.h
index 83a930b69c..aa4606d0b2 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_util.h
@@ -28,12 +28,12 @@
 #ifndef TGSI_UTIL_H
 #define TGSI_UTIL_H
 
+#include "pipe/p_shader_tokens.h"
+
 #if defined __cplusplus
 extern "C" {
 #endif
 
-#include "pipe/p_shader_tokens.h"
-
 struct tgsi_src_register;
 struct tgsi_full_src_register;
 struct tgsi_full_instruction;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/18] radeonsi: upload constants into VRAM instead of GTT

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 13:53, Marek Olšák wrote:

From: Marek Olšák 

This lowers lgkm wait cycles by 30% on VI and normal conditions.
The might be a measurable improvement when CE is disabled (radeon)
or under L2 thrashing.


Good idea. I'm just wondering if all the users of const upload end up as 
streaming writes? I hope we don't accidentally hit some place where 
writes from the CPU end up extremely slow, e.g. where st/mesa uploads 
some structures.


Nicolai



---
 src/gallium/drivers/radeon/r600_pipe_common.c | 11 ---
 src/gallium/drivers/radeonsi/si_compute.c |  4 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c |  6 +++---
 src/gallium/drivers/radeonsi/si_state.c   |  7 +--
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index d573b39..1781584 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -600,21 +600,25 @@ bool r600_common_context_init(struct r600_common_context 
*rctx,
rctx->allocator_zeroed_memory =
u_suballocator_create(&rctx->b, rscreen->info.gart_page_size,
  0, PIPE_USAGE_DEFAULT, 0, true);
if (!rctx->allocator_zeroed_memory)
return false;

rctx->b.stream_uploader = u_upload_create(&rctx->b, 1024 * 1024,
  0, PIPE_USAGE_STREAM);
if (!rctx->b.stream_uploader)
return false;
-   rctx->b.const_uploader = rctx->b.stream_uploader;
+
+   rctx->b.const_uploader = u_upload_create(&rctx->b, 128 * 1024,
+0, PIPE_USAGE_DEFAULT);
+   if (!rctx->b.const_uploader)
+   return false;

rctx->ctx = rctx->ws->ctx_create(rctx->ws);
if (!rctx->ctx)
return false;

if (rscreen->info.has_sdma && !(rscreen->debug_flags & 
DBG_NO_ASYNC_DMA)) {
rctx->dma.cs = rctx->ws->cs_create(rctx->ctx, RING_DMA,
   r600_flush_dma_ring,
   rctx);
rctx->dma.flush = r600_flush_dma_ring;
@@ -642,23 +646,24 @@ void r600_common_context_cleanup(struct 
r600_common_context *rctx)
if (rctx->query_result_shader)
rctx->b.delete_compute_state(&rctx->b, 
rctx->query_result_shader);

if (rctx->gfx.cs)
rctx->ws->cs_destroy(rctx->gfx.cs);
if (rctx->dma.cs)
rctx->ws->cs_destroy(rctx->dma.cs);
if (rctx->ctx)
rctx->ws->ctx_destroy(rctx->ctx);

-   if (rctx->b.stream_uploader) {
+   if (rctx->b.stream_uploader)
u_upload_destroy(rctx->b.stream_uploader);
-   }
+   if (rctx->b.const_uploader)
+   u_upload_destroy(rctx->b.const_uploader);

slab_destroy_child(&rctx->pool_transfers);

if (rctx->allocator_zeroed_memory) {
u_suballocator_destroy(rctx->allocator_zeroed_memory);
}
rctx->ws->fence_reference(&rctx->last_gfx_fence, NULL);
rctx->ws->fence_reference(&rctx->last_sdma_fence, NULL);
 }

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 381837c..88d72c1 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -496,21 +496,21 @@ static void si_setup_user_sgprs_co_v2(struct si_context 
*sctx,

dispatch.grid_size_x = info->grid[0] * info->block[0];
dispatch.grid_size_y = info->grid[1] * info->block[1];
dispatch.grid_size_z = info->grid[2] * info->block[2];

dispatch.private_segment_size = program->private_size;
dispatch.group_segment_size = program->local_size;

dispatch.kernarg_address = kernel_args_va;

-   u_upload_data(sctx->b.b.stream_uploader, 0, sizeof(dispatch),
+   u_upload_data(sctx->b.b.const_uploader, 0, sizeof(dispatch),
   256, &dispatch, &dispatch_offset,
   (struct pipe_resource**)&dispatch_buf);

if (!dispatch_buf) {
fprintf(stderr, "Error: Failed to allocate dispatch "
"packet.");
}
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, dispatch_buf,
  RADEON_USAGE_READ, RADEON_PRIO_CONST_BUFFER);

@@ -558,21 +558,21 @@ static void si_upload_compute_input(struct si_context 
*sctx,
unsigned num_work_size_bytes = program->use_code_object_v2 ? 0 : 36;
uint32_t kernel_args_offset = 0;
uint32_t *kernel_args;
void *kernel_args_ptr;
uint64_t kernel_args_va;
unsigned i;

/* The extra num_work_size

[Mesa-dev] [PATCH 11/13] i965: remove 'virtual' and extern C workarounds

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

The headers are properly annotated thus we don't need these.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i965/brw_context.h | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 01e651b09f..ce4816fc98 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -43,26 +43,16 @@
 #include "isl/isl.h"
 #include "blorp/blorp.h"
 
-#ifdef __cplusplus
-extern "C" {
-   /* Evil hack for using libdrm in a c++ compiler. */
-#define virtual virt
-#endif
-
 #include 
-#ifdef __cplusplus
-   #undef virtual
-}
-#endif
 
-#ifdef __cplusplus
-extern "C" {
-#endif
 #include "intel_debug.h"
 #include "intel_screen.h"
 #include "intel_tex_obj.h"
 #include "intel_resolve_map.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
 /* Glossary:
  *
  * URB - uniform resource buffer.  A mid-sized buffer which is
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/13] Misc extern C fixes

2017-02-16 Thread Emil Velikov
Just a bunch of extern C issues flagged by [1]. There's a few more 
remaining such as the glsl_types C API living in nir_types.{cpp,h} but 
that can be resolved at a later date.

-Emil

[1] git grep -B2 "#.*\" -- src/ | grep  "\"

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/13] i915: remove extern "C" guards

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

None of this code is used in C++ context.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/i915/intel_batchbuffer.h | 8 
 src/mesa/drivers/dri/i915/intel_context.h | 8 
 src/mesa/drivers/dri/i915/intel_fbo.h | 8 
 src/mesa/drivers/dri/i915/intel_mipmap_tree.h | 8 
 src/mesa/drivers/dri/i915/intel_regions.h | 8 
 5 files changed, 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i915/intel_batchbuffer.h
index c4efa762bc..9ebc61f1c2 100644
--- a/src/mesa/drivers/dri/i915/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i915/intel_batchbuffer.h
@@ -7,10 +7,6 @@
 #include "intel_bufmgr.h"
 #include "intel_reg.h"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 /**
  * Number of bytes to reserve for commands necessary to complete a batch.
  *
@@ -152,8 +148,4 @@ intel_batchbuffer_advance(struct intel_context *intel)
 #define ADVANCE_BATCH() intel_batchbuffer_advance(intel);
 #define CACHED_BATCH() intel_batchbuffer_cached_advance(intel);
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/src/mesa/drivers/dri/i915/intel_context.h 
b/src/mesa/drivers/dri/i915/intel_context.h
index 139a033777..d0f3d367cb 100644
--- a/src/mesa/drivers/dri/i915/intel_context.h
+++ b/src/mesa/drivers/dri/i915/intel_context.h
@@ -47,10 +47,6 @@
 #include "tnl_dd/t_dd_vertex.h"
 #undef TAG
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #define DV_PF_555  (1<<8)
 #define DV_PF_565  (2<<8)
 #define DV_PF_ (3<<8)
@@ -446,8 +442,4 @@ intel_context(struct gl_context * ctx)
return (struct intel_context *) ctx;
 }
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/src/mesa/drivers/dri/i915/intel_fbo.h 
b/src/mesa/drivers/dri/i915/intel_fbo.h
index 769dab8689..a30830b471 100644
--- a/src/mesa/drivers/dri/i915/intel_fbo.h
+++ b/src/mesa/drivers/dri/i915/intel_fbo.h
@@ -36,10 +36,6 @@
 #include "intel_mipmap_tree.h"
 #include "intel_screen.h"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 struct intel_context;
 struct intel_mipmap_tree;
 struct intel_texture_image;
@@ -158,8 +154,4 @@ intel_renderbuffer_get_tile_offsets(struct 
intel_renderbuffer *irb,
 struct intel_region*
 intel_get_rb_region(struct gl_framebuffer *fb, GLuint attIndex);
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif /* INTEL_FBO_H */
diff --git a/src/mesa/drivers/dri/i915/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i915/intel_mipmap_tree.h
index 2520b3035b..853a4a7986 100644
--- a/src/mesa/drivers/dri/i915/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i915/intel_mipmap_tree.h
@@ -34,10 +34,6 @@
 #include "intel_regions.h"
 #include "GL/internal/dri_interface.h"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 /* A layer on top of the intel_regions code which adds:
  *
  * - Code to size and layout a region to hold a set of mipmaps.
@@ -369,8 +365,4 @@ intel_miptree_unmap(struct intel_context *intel,
unsigned int slice);
 
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/src/mesa/drivers/dri/i915/intel_regions.h 
b/src/mesa/drivers/dri/i915/intel_regions.h
index eb1c3f62b3..562f7cd902 100644
--- a/src/mesa/drivers/dri/i915/intel_regions.h
+++ b/src/mesa/drivers/dri/i915/intel_regions.h
@@ -41,10 +41,6 @@
 #include "main/mtypes.h"
 #include "intel_bufmgr.h"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 struct intel_context;
 struct intel_screen;
 struct intel_buffer_object;
@@ -153,8 +149,4 @@ struct __DRIimageRec {
void *data;
 };
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 18/18] radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable

2017-02-16 Thread Nicolai Hähnle
Some cool improvements all around. Some questions on patches 9, 12, 15, 
the rest are


Reviewed-by: Nicolai Hähnle 

On 16.02.2017 13:53, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_texture.c   | 11 +--
 src/gallium/drivers/radeonsi/si_compute.c   |  6 ++--
 src/gallium/drivers/radeonsi/si_cp_dma.c|  6 ++--
 src/gallium/drivers/radeonsi/si_pipe.c  | 12 +---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 41 -
 5 files changed, 50 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 47aa8b1..0865d35 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -756,21 +756,23 @@ static void r600_texture_alloc_cmask_separate(struct 
r600_common_screen *rscreen

assert(rtex->cmask.size == 0);

if (rscreen->chip_class >= SI) {
si_texture_get_cmask_info(rscreen, rtex, &rtex->cmask);
} else {
r600_texture_get_cmask_info(rscreen, rtex, &rtex->cmask);
}

rtex->cmask_buffer = (struct r600_resource *)
-   r600_aligned_buffer_create(&rscreen->b, 0, PIPE_USAGE_DEFAULT,
+   r600_aligned_buffer_create(&rscreen->b,
+  R600_RESOURCE_FLAG_UNMAPPABLE,
+  PIPE_USAGE_DEFAULT,
   rtex->cmask.size,
   rtex->cmask.alignment);
if (rtex->cmask_buffer == NULL) {
rtex->cmask.size = 0;
return;
}

/* update colorbuffer state bits */
rtex->cmask.base_address_reg = rtex->cmask_buffer->gpu_address >> 8;

@@ -867,21 +869,23 @@ static void r600_texture_allocate_htile(struct 
r600_common_screen *rscreen,
clear_value = 0x030F;
} else {
r600_texture_get_htile_size(rscreen, rtex);
clear_value = 0;
}

if (!rtex->surface.htile_size)
return;

rtex->htile_buffer = (struct r600_resource*)
-   r600_aligned_buffer_create(&rscreen->b, 0, PIPE_USAGE_DEFAULT,
+   r600_aligned_buffer_create(&rscreen->b,
+  R600_RESOURCE_FLAG_UNMAPPABLE,
+  PIPE_USAGE_DEFAULT,
   rtex->surface.htile_size,
   rtex->surface.htile_alignment);
if (rtex->htile_buffer == NULL) {
/* this is not a fatal error as we can still keep rendering
 * without htile buffer */
R600_ERR("Failed to create buffer object for htile buffer.\n");
} else {
r600_screen_clear_buffer(rscreen, &rtex->htile_buffer->b.b,
 0, rtex->surface.htile_size,
 clear_value);
@@ -2099,21 +2103,22 @@ static void vi_separate_dcc_try_enable(struct 
r600_common_context *rctx,
r600_texture_discard_cmask(rctx->screen, tex);

/* Get a DCC buffer. */
if (tex->last_dcc_separate_buffer) {
assert(tex->dcc_gather_statistics);
assert(!tex->dcc_separate_buffer);
tex->dcc_separate_buffer = tex->last_dcc_separate_buffer;
tex->last_dcc_separate_buffer = NULL;
} else {
tex->dcc_separate_buffer = (struct r600_resource*)
-   r600_aligned_buffer_create(rctx->b.screen, 0,
+   r600_aligned_buffer_create(rctx->b.screen,
+  
R600_RESOURCE_FLAG_UNMAPPABLE,
   PIPE_USAGE_DEFAULT,
   tex->surface.dcc_size,
   tex->surface.dcc_alignment);
if (!tex->dcc_separate_buffer)
return;
}

/* dcc_offset is the absolute GPUVM address. */
tex->dcc_offset = tex->dcc_separate_buffer->gpu_address;

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 88d72c1..f4efb0d 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -282,22 +282,24 @@ static bool si_setup_compute_scratch_buffer(struct 
si_context *sctx,
uint64_t scratch_bo_size, scratch_needed;
scratch_bo_size = 0;
scratch_needed = config->scratch_bytes_per_wave * sctx->scratch_waves;
if (sctx->compute_scratch_buffer)
scratch_bo_size = sctx->compute_scratch_buffer->b.b.width0;

if (scratch_bo_size < scratch_needed) {
r600_resource_reference(&sctx->compute_scratch_

Re: [Mesa-dev] [PATCH 3/3] r100: use correct libdrm_radeon macro

2017-02-16 Thread Emil Velikov
On 14 February 2017 at 08:28, Nicolai Hähnle  wrote:
> On 14.02.2017 02:15, Emil Velikov wrote:
>>
>> Remove local definition of RADEON_INFO_TILE_CONFIG and use the correct
>> macro provided by libdrm_radeon RADEON_INFO_TILING_CONFIG.
>>
>> Latter was present as of libdrm 2.4.22, sirca 2010.
>>
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/mesa/drivers/dri/radeon/radeon_screen.c | 8 ++--
>>  1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/radeon/radeon_screen.c
>> b/src/mesa/drivers/dri/radeon/radeon_screen.c
>> index 9a07535155..06901348a3 100644
>> --- a/src/mesa/drivers/dri/radeon/radeon_screen.c
>> +++ b/src/mesa/drivers/dri/radeon/radeon_screen.c
>> @@ -128,10 +128,6 @@ DRI_CONF_END
>>  };
>>  #endif
>>
>> -#ifndef RADEON_INFO_TILE_CONFIG
>> -#define RADEON_INFO_TILE_CONFIG 0x6
>> -#endif
>> -
>>  static int
>>  radeonGetParam(__DRIscreen *sPriv, int param, void *value)
>>  {
>> @@ -148,8 +144,8 @@ radeonGetParam(__DRIscreen *sPriv, int param, void
>> *value)
>>case RADEON_PARAM_NUM_Z_PIPES:
>>  info.request = RADEON_INFO_NUM_Z_PIPES;
>>  break;
>> -  case RADEON_INFO_TILE_CONFIG:
>> -info.request = RADEON_INFO_TILE_CONFIG;
>> +  case RADEON_INFO_TILING_CONFIG:
>> +info.request = RADEON_INFO_TILING_CONFIG;
>>  break;
>>default:
>>  return -EINVAL;
>>
>
> Hmm, this doesn't seem to be actually used anywhere. Then again, why not
> leave cleaning this up to somebody who actually still has the hardware...
>
Agreed.

> Series is:
>
> Reviewed-by: Nicolai Hähnle 
>
Thanks !
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/13] st/mesa: move extern C wrappers where applicable

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Namely, after the include directives. The headers are properly annotated
so keeping things as-is is only asking for trouble.

Signed-off-by: Emil Velikov 
---
 src/mesa/state_tracker/st_atifs_to_tgsi.h | 6 +++---
 src/mesa/state_tracker/st_mesa_to_tgsi.h  | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.h 
b/src/mesa/state_tracker/st_atifs_to_tgsi.h
index c1b6758ba0..14227023ba 100644
--- a/src/mesa/state_tracker/st_atifs_to_tgsi.h
+++ b/src/mesa/state_tracker/st_atifs_to_tgsi.h
@@ -23,13 +23,13 @@
 #ifndef ST_ATIFS_TO_TGSI_H
 #define ST_ATIFS_TO_TGSI_H
 
+#include "main/glheader.h"
+#include "pipe/p_defines.h"
+
 #if defined __cplusplus
 extern "C" {
 #endif
 
-#include "main/glheader.h"
-#include "pipe/p_defines.h"
-
 struct gl_context;
 struct gl_program;
 struct ureg_program;
diff --git a/src/mesa/state_tracker/st_mesa_to_tgsi.h 
b/src/mesa/state_tracker/st_mesa_to_tgsi.h
index ed7a3adfe1..3df54ce5b8 100644
--- a/src/mesa/state_tracker/st_mesa_to_tgsi.h
+++ b/src/mesa/state_tracker/st_mesa_to_tgsi.h
@@ -29,15 +29,15 @@
 #ifndef ST_MESA_TO_TGSI_H
 #define ST_MESA_TO_TGSI_H
 
-#if defined __cplusplus
-extern "C" {
-#endif
-
 #include "main/glheader.h"
 
 #include "pipe/p_compiler.h"
 #include "pipe/p_defines.h"
 
+#if defined __cplusplus
+extern "C" {
+#endif
+
 struct gl_context;
 struct gl_program;
 struct tgsi_token;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/13] glsl: resolve extern C workarounds/hacks

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Do not wrap header inclusion in extern C since it can cause issues.

Signed-off-by: Emil Velikov 
---
 src/compiler/glsl/blob.h  | 8 
 src/compiler/glsl/glsl_symbol_table.h | 2 --
 src/compiler/glsl/ir_print_visitor.h  | 2 --
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/src/compiler/glsl/blob.h b/src/compiler/glsl/blob.h
index 0765bf3ef1..81b9917afc 100644
--- a/src/compiler/glsl/blob.h
+++ b/src/compiler/glsl/blob.h
@@ -25,14 +25,14 @@
 #ifndef BLOB_H
 #define BLOB_H
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 #include 
 #include 
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* The blob functions implement a simple, low-level API for serializing and
  * deserializing.
  *
diff --git a/src/compiler/glsl/glsl_symbol_table.h 
b/src/compiler/glsl/glsl_symbol_table.h
index 087cc71f63..be910b4170 100644
--- a/src/compiler/glsl/glsl_symbol_table.h
+++ b/src/compiler/glsl/glsl_symbol_table.h
@@ -28,9 +28,7 @@
 
 #include 
 
-extern "C" {
 #include "program/symbol_table.h"
-}
 #include "ir.h"
 
 class symbol_table_entry;
diff --git a/src/compiler/glsl/ir_print_visitor.h 
b/src/compiler/glsl/ir_print_visitor.h
index 965e63ade8..858fe97b4f 100644
--- a/src/compiler/glsl/ir_print_visitor.h
+++ b/src/compiler/glsl/ir_print_visitor.h
@@ -29,9 +29,7 @@
 #include "ir.h"
 #include "ir_visitor.h"
 
-extern "C" {
 #include "program/symbol_table.h"
-}
 
 /**
  * Abstract base class of visitors of IR instruction trees
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/13] mesa/tests: remove unneeded extern C { #include foo } hack

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

The header itself (enums.h) is already properly annotated.

Signed-off-by: Emil Velikov 
---
 src/mesa/main/tests/enum_strings.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/mesa/main/tests/enum_strings.cpp 
b/src/mesa/main/tests/enum_strings.cpp
index 4d8d12fdf2..1395ac8fb3 100644
--- a/src/mesa/main/tests/enum_strings.cpp
+++ b/src/mesa/main/tests/enum_strings.cpp
@@ -24,9 +24,7 @@
 #include 
 #include 
 
-extern "C" {
 #include "main/enums.h"
-}
 
 struct enum_info {
int value;
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/13] nir: do not #include util/debug.h within extern C {}

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

It's a problem waiting to happen. Individual headers should be annotated
if needed.

Signed-off-by: Emil Velikov 
---
 src/compiler/nir/nir.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index d92e6eb110..a110b816f0 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -40,6 +40,10 @@
 #include "compiler/shader_info.h"
 #include 
 
+#ifdef DEBUG
+#include "util/debug.h"
+#endif /* DEBUG */
+
 #include "nir_opcodes.h"
 
 #ifdef __cplusplus
@@ -2279,7 +2283,6 @@ void nir_validate_shader(nir_shader *shader);
 void nir_metadata_set_validation_flag(nir_shader *shader);
 void nir_metadata_check_validation_flag(nir_shader *shader);
 
-#include "util/debug.h"
 static inline bool
 should_clone_nir(void)
 {
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/18] radeonsi: use a clever alignment for descriptor uploads

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 13:53, Marek Olšák wrote:

From: Marek Olšák 

Non-VBO descriptors won't be smaller than the cache line, so simply use
the cache line size.


What about SSBOs? Those are just 16 bytes.

Also, shader images are just 32 bytes (though we may have to bump this 
to 64 bytes for multisample image support -- except that it's unclear 
how to write to a multisample shader image while keeping the FMASK).


Thanks,
Nicolai


---
 src/gallium/drivers/radeonsi/si_descriptors.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 72b33f3..4f2dbbb 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -227,23 +227,24 @@ static bool si_upload_descriptors(struct si_context *sctx,
radeon_emit(sctx->ce_ib, desc->ce_offset + begin * 4);
radeon_emit_array(sctx->ce_ib, list + begin, count);
}

if (!si_ce_upload(sctx, desc->ce_offset, list_size,
   &desc->buffer_offset, &desc->buffer))
return false;
} else {
void *ptr;

-   u_upload_alloc(sctx->b.b.stream_uploader, 0, list_size, 256,
-   &desc->buffer_offset,
-   (struct pipe_resource**)&desc->buffer, &ptr);
+   u_upload_alloc(sctx->b.b.stream_uploader, 0, list_size,
+  sctx->screen->b.info.tcc_cache_line_size,
+  &desc->buffer_offset,
+  (struct pipe_resource**)&desc->buffer, &ptr);
if (!desc->buffer)
return false; /* skip the draw call */

util_memcpy_cpu_to_le32(ptr, desc->list, list_size);
desc->gpu_list = ptr;

radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, desc->buffer,
RADEON_USAGE_READ, RADEON_PRIO_DESCRIPTORS);
}
desc->dirty_mask = 0;
@@ -941,34 +942,36 @@ static void si_vertex_buffers_begin_new_cs(struct 
si_context *sctx)
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 }

 bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
 {
struct si_vertex_element *velems = sctx->vertex_elements;
struct si_descriptors *desc = &sctx->vertex_buffers;
unsigned i, count = velems->count;
+   unsigned desc_list_byte_size = velems->desc_list_byte_size;
uint64_t va;
uint32_t *ptr;

if (!sctx->vertex_buffers_dirty || !count || !velems)
return true;

unsigned first_vb_use_mask = velems->first_vb_use_mask;

/* Vertex buffer descriptors are the only ones which are uploaded
 * directly through a staging buffer and don't go through
 * the fine-grained upload path.
 */
u_upload_alloc(sctx->b.b.stream_uploader, 0,
-  velems->desc_list_byte_size, 256,
+  desc_list_byte_size,
+  si_optimal_tcc_alignment(sctx, desc_list_byte_size),
   &desc->buffer_offset,
   (struct pipe_resource**)&desc->buffer, (void**)&ptr);
if (!desc->buffer)
return false;

radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);

assert(count <= SI_MAX_ATTRIBS);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 09/18] radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 13:53, Marek Olšák wrote:

From: Marek Olšák 

start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.


Do we have a test case for this?

Cheers,
Nicolai



---
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index d453309..8f5dcbc 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1045,21 +1045,21 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
ib.offset = sctx->index_buffer.offset;

/* Translate or upload, if needed. */
/* 8-bit indices are supported on VI. */
if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
struct pipe_resource *out_buffer = NULL;
unsigned out_offset, start, count, start_offset;
void *ptr;

si_get_draw_start_count(sctx, info, &start, &count);
-   start_offset = start * ib.index_size;
+   start_offset = start * 2;

u_upload_alloc(ctx->stream_uploader, start_offset,
count * 2, 256,
   &out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}

util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: fix uninitialized memory access

2017-02-16 Thread Lionel Landwerlin
Found while running shader-db under valgrind.

Signed-off-by: Lionel Landwerlin 
---
 src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index f56f05b7e9..952276faed 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
@@ -207,9 +207,8 @@ fs_visitor::register_coalesce()
 channels_remaining = -1;
 continue;
  }
- dst_reg_offset[offset] = inst->dst.offset / REG_SIZE;
- if (inst->size_written > REG_SIZE)
-dst_reg_offset[offset + 1] = inst->dst.offset / REG_SIZE + 1;
+ for (unsigned i = 0; i < MAX2(inst->size_written / REG_SIZE, 1); i++)
+dst_reg_offset[offset + i] = inst->dst.offset / REG_SIZE + i;
  mov[offset] = inst;
  channels_remaining -= regs_written(inst);
   }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] [util] add extern "C" guards

2017-02-16 Thread Emil Velikov
On 16 February 2017 at 14:48, Kyriazis, George
 wrote:
>
>> -Original Message-
>> From: ibmir...@gmail.com [mailto:ibmir...@gmail.com] On Behalf Of Ilia
>> Mirkin
>> Sent: Wednesday, February 15, 2017 10:04 PM
>> To: Kyriazis, George 
>> Cc: mesa-dev@lists.freedesktop.org
>> Subject: Re: [Mesa-dev] [PATCH v2 1/2] [util] add extern "C" guards
>>
>> On Wed, Feb 15, 2017 at 10:53 PM, George Kyriazis
>>  wrote:
>> > Added extern "C" __cplusplus guards on headers that did not have them.
>> > ---
>> >  src/gallium/auxiliary/util/u_transfer.h   | 8 
>> >  src/gallium/auxiliary/util/u_upload_mgr.h | 7 +++
>> >  2 files changed, 15 insertions(+)
>> >
>> > diff --git a/src/gallium/auxiliary/util/u_transfer.h
>> > b/src/gallium/auxiliary/util/u_transfer.h
>> > index ab787ab..1408498 100644
>> > --- a/src/gallium/auxiliary/util/u_transfer.h
>> > +++ b/src/gallium/auxiliary/util/u_transfer.h
>> > @@ -10,6 +10,10 @@
>> >  struct pipe_context;
>> >  struct winsys_handle;
>> >
>> > +#ifdef __cplusplus
>> > +extern "C" {
>> > +#endif
>>
>> I'm a little weak on the details, but I wonder if this has to encompass the 
>> type
>> forward decls above. I know that the C extern convention affects function
>> name mangling, but it would stand to reason that it could also affect types.
>> Not sure. e.g. u_blit.h and u_blitter.h include the forward decls inside the
>> extern section.
>>
>
>
> Yes,
>
> I was wondering about that, too, but I followed what was happening in other 
> header files, for example util/u_format.h.
>
> I've seen name mangling affect global vars on MS devenv, but not on gcc, but 
> types seem to work OK on both.
>
Please fix the commit summary - > s|[util]|util/u_format|

Having a forward decl. before the extern C is an exception, so I'd move it.

With the above:
Reviewed-by: Emil Velikov 

Ilia, nouveau has a few extern C { #include foo } cases. Can you give
them a bash - I'm purging through the rest of mesa.

Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] vl: fix a buffer leak in the bicubic filter by using an uploader

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 13:52, Marek Olšák wrote:

From: Marek Olšák 

there's no error checking, because the previous code didn't do it either.
---
 src/gallium/auxiliary/vl/vl_bicubic_filter.c | 27 +++
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_bicubic_filter.c 
b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
index 774702c..d300034 100644
--- a/src/gallium/auxiliary/vl/vl_bicubic_filter.c
+++ b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
@@ -28,20 +28,21 @@
 #include 

 #include "pipe/p_context.h"

 #include "tgsi/tgsi_ureg.h"

 #include "util/u_draw.h"
 #include "util/u_memory.h"
 #include "util/u_math.h"
 #include "util/u_rect.h"
+#include "util/u_upload_mgr.h"

 #include "vl_types.h"
 #include "vl_vertex_buffers.h"
 #include "vl_bicubic_filter.h"

 enum VS_OUTPUT
 {
VS_O_VPOS = 0,
VS_O_VTEX = 0
 };
@@ -377,78 +378,72 @@ void
 vl_bicubic_filter_render(struct vl_bicubic_filter *filter,
 struct pipe_sampler_view *src,
 struct pipe_surface *dst,
 struct u_rect *dst_area,
 struct u_rect *dst_clip)
 {
struct pipe_viewport_state viewport;
struct pipe_framebuffer_state fb_state;
struct pipe_scissor_state scissor;
union pipe_color_union clear_color;
-   struct pipe_transfer *buf_transfer;
-   struct pipe_resource *surface_size;
+
assert(filter && src && dst);

if (dst_clip) {
   scissor.minx = dst_clip->x0;
   scissor.miny = dst_clip->y0;
   scissor.maxx = dst_clip->x1;
   scissor.maxy = dst_clip->y1;
} else {
   scissor.minx = 0;
   scissor.miny = 0;
   scissor.maxx = dst->width;
   scissor.maxy = dst->height;
}

clear_color.f[0] = clear_color.f[1] = 0.0f;
clear_color.f[2] = clear_color.f[3] = 0.0f;
-   surface_size = pipe_buffer_create
-   (
-  filter->pipe->screen,
-  PIPE_BIND_CONSTANT_BUFFER,
-  PIPE_USAGE_DEFAULT,
-  2*sizeof(float)
-   );
-

memset(&viewport, 0, sizeof(viewport));
if(dst_area){
   viewport.scale[0] = dst_area->x1 - dst_area->x0;
   viewport.scale[1] = dst_area->y1 - dst_area->y0;
   viewport.translate[0] = dst_area->x0;
   viewport.translate[1] = dst_area->y0;
} else {
   viewport.scale[0] = dst->width;
   viewport.scale[1] = dst->height;
}
viewport.scale[2] = 1;

-   float *ptr = pipe_buffer_map(filter->pipe, surface_size,
-   PIPE_TRANSFER_WRITE | 
PIPE_TRANSFER_DISCARD_RANGE,
-   &buf_transfer);
+   struct pipe_constant_buffer cb = {};
+   float *ptr;
+
+   u_upload_alloc(filter->pipe->const_uploader, 0, 2 * sizeof(float), 256,
+  &cb.buffer_offset, &cb.buffer, (void**)&ptr);
+   cb.buffer_size = cb.buffer->width0 - cb.buffer_offset;


2 * sizeof(float)

With that, the series is

Reviewed-by: Nicolai Hähnle 



ptr[0] = 0.5f/viewport.scale[0];
ptr[1] = 0.5f/viewport.scale[1];
-
-   pipe_buffer_unmap(filter->pipe, buf_transfer);
+   u_upload_unmap(filter->pipe->const_uploader);

memset(&fb_state, 0, sizeof(fb_state));
fb_state.width = dst->width;
fb_state.height = dst->height;
fb_state.nr_cbufs = 1;
fb_state.cbufs[0] = dst;

filter->pipe->set_scissor_states(filter->pipe, 0, 1, &scissor);
filter->pipe->clear_render_target(filter->pipe, dst, &clear_color,
  0, 0, dst->width, dst->height, false);
-   pipe_set_constant_buffer(filter->pipe, PIPE_SHADER_FRAGMENT, 0, 
surface_size);
+   filter->pipe->set_constant_buffer(filter->pipe, PIPE_SHADER_FRAGMENT,
+ 0, &cb);
filter->pipe->bind_rasterizer_state(filter->pipe, filter->rs_state);
filter->pipe->bind_blend_state(filter->pipe, filter->blend);
filter->pipe->bind_sampler_states(filter->pipe, PIPE_SHADER_FRAGMENT,
  0, 1, &filter->sampler);
filter->pipe->set_sampler_views(filter->pipe, PIPE_SHADER_FRAGMENT,
0, 1, &src);
filter->pipe->bind_vs_state(filter->pipe, filter->vs);
filter->pipe->bind_fs_state(filter->pipe, filter->fs);
filter->pipe->set_framebuffer_state(filter->pipe, &fb_state);
filter->pipe->set_viewport_states(filter->pipe, 0, 1, &viewport);



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] [util] add extern "C" guards

2017-02-16 Thread Kyriazis, George

> -Original Message-
> From: ibmir...@gmail.com [mailto:ibmir...@gmail.com] On Behalf Of Ilia
> Mirkin
> Sent: Wednesday, February 15, 2017 10:04 PM
> To: Kyriazis, George 
> Cc: mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH v2 1/2] [util] add extern "C" guards
> 
> On Wed, Feb 15, 2017 at 10:53 PM, George Kyriazis
>  wrote:
> > Added extern "C" __cplusplus guards on headers that did not have them.
> > ---
> >  src/gallium/auxiliary/util/u_transfer.h   | 8 
> >  src/gallium/auxiliary/util/u_upload_mgr.h | 7 +++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/src/gallium/auxiliary/util/u_transfer.h
> > b/src/gallium/auxiliary/util/u_transfer.h
> > index ab787ab..1408498 100644
> > --- a/src/gallium/auxiliary/util/u_transfer.h
> > +++ b/src/gallium/auxiliary/util/u_transfer.h
> > @@ -10,6 +10,10 @@
> >  struct pipe_context;
> >  struct winsys_handle;
> >
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> 
> I'm a little weak on the details, but I wonder if this has to encompass the 
> type
> forward decls above. I know that the C extern convention affects function
> name mangling, but it would stand to reason that it could also affect types.
> Not sure. e.g. u_blit.h and u_blitter.h include the forward decls inside the
> extern section.
> 


Yes,

I was wondering about that, too, but I followed what was happening in other 
header files, for example util/u_format.h.

I've seen name mangling affect global vars on MS devenv, but not on gcc, but 
types seem to work OK on both.


> With that figured out one way or the other, this is
> 
> Reviewed-by: Ilia Mirkin 
> 
> > +
> >  boolean u_default_resource_get_handle(struct pipe_screen *screen,
> >struct pipe_resource *resource,
> >struct winsys_handle *handle);
> > @@ -95,4 +99,8 @@ void u_transfer_flush_region_vtbl( struct
> > pipe_context *pipe,  void u_transfer_unmap_vtbl( struct pipe_context
> *rm_ctx,
> >  struct pipe_transfer *transfer );
> >
> > +#ifdef __cplusplus
> > +} // extern "C" {
> > +#endif
> > +
> >  #endif
> > diff --git a/src/gallium/auxiliary/util/u_upload_mgr.h
> > b/src/gallium/auxiliary/util/u_upload_mgr.h
> > index 633291e..4538291 100644
> > --- a/src/gallium/auxiliary/util/u_upload_mgr.h
> > +++ b/src/gallium/auxiliary/util/u_upload_mgr.h
> > @@ -38,6 +38,9 @@
> >  struct pipe_context;
> >  struct pipe_resource;
> >
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> >
> >  /**
> >   * Create the upload manager.
> > @@ -109,4 +112,8 @@ void u_upload_data(struct u_upload_mgr *upload,
> > unsigned *out_offset,
> > struct pipe_resource **outbuf);
> >
> > +#ifdef __cplusplus
> > +} // extern "C" {
> > +#endif
> > +
> >  #endif
> > --
> > 2.7.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-16 Thread Nicolai Hähnle

On 16.02.2017 09:21, Tapani Pälli wrote:

On 02/16/2017 10:11 AM, Timothy Arceri wrote:

On 16/02/17 18:58, Timothy Arceri wrote:

On 16/02/17 17:55, Tapani Pälli wrote:

On 02/16/2017 04:52 AM, Timothy Arceri wrote:

In order add functionality to ARB_get_program_binary we need
binary format enums.


I've understood that this is a driver internal enumeration. When
application gets the binary it also receives enum (integer value) what
format we gave. Then when loading application needs to query what
formats are supported by the implementation and load the correct
binary.
We just need to internally make agreement on format list and return
correct one matching the current driver in use?


Not that it's actually likely to happen but if we were to only have a
single MESA enum an application could only distribute a single binary.
e.g either for AMD, INTEL or NVIDIA but not one for each. That is unless
we were to compile and pack all gpu vendor binarys at the same time
which seems overly complicated and expensive.

I could see an intenal id being used for gpu generations from hardware
vendors.


Or are you saying we don't need to define the enums? If so I don't think
that is correct. The ARB_get_program_binary extension says:

"A vendor extension must also be present in order
to define one or more binary formats, thereby populating the list of
PROGRAM_BINARY_FORMATS.  The  returned by
GetProgramBinary is always one of the binary formats in this list.

...

The beauty of this extension, however, is that an application does
not need
to be aware of the vendor extension on any given implementation. It
only
needs to retrieve a program binary with an anonymous
 and
resupply that same  when loading the program binary."



OK, I did not spot this one. At same time we have to supply extension
where values defined (which makes it hard to make changes later) but
then from application POV the values are still considered anonymous and
it will likely not use the extension. This is a bit strange requirement ..

We can still internally put more data in to the blob about exact backend
and version requirements and so on so I guess single enum value per
vendor is enough.


So the question is what the use case of this extension really is. Keep 
in mind that the driver can always decide to fail loading a binary.


If the purpose is to allow games to cache shaders for a second run, then 
I think even a single Mesa enum is sufficient -- the local driver is 
always going to be the same.


If the purpose is to distribute pre-compiled binaries via the internet, 
then assigning enums that need to be registered with Khronos is clearly 
not scalable. We can't have an enum for each build ID, so it's all 
unworkable anyway, and we'd need some way of querying a build ID string.


I think this points towards us registering a single enum for Mesa only.

Still, a bit more information about how this extension is actually used 
in the wild could change my mind.


Cheers,
Nicolai
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] docs/submittingpatches.html: remove version tag for nominations

2017-02-16 Thread Emil Velikov
On 15 February 2017 at 19:39, Eric Anholt  wrote:
> Emil Velikov  writes:
>
>> From: Emil Velikov 
>>
>> The version tag used to nominate has bitten even experienced mesa
>> developers. Not to mention that it deviates from the one used in the
>> kernel leading to further confusion.
>>
>> Simplify things and omit it all together.
>>
>> Signed-off-by: Emil Velcro 
>> ---
>> Another option would be to align it with the kernel one, but that could
>> bring even further confusion.
>
> I like this a lot -- I'm usually just copy-and-pasting someone else's cc
> stable line, so I'll probably occasionally nominate my stuff for an
> inactive stable branch.  I'm not too worried about patches accidentally
> applying to too-old code.
>
Ack, ty.

> FWIW, this is more or less how the kernel's stable branches have been
> working for me -- I write a "Fixes:  short commit subject"
> line in the commit, and Greg cherry-picks it back to all stable branches
> since that sha1 that it applies to.
We have that one as well. It's a recent addition, so not really a silver bullet.

We're getting there ;-)
-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH shader-db 1/4] run: add -j option to select number of threads

2017-02-16 Thread Matt Turner
On Thu, Feb 16, 2017 at 4:29 AM, Lionel Landwerlin
 wrote:
> Signed-off-by: Lionel Landwerlin 
> ---
>  run.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/run.c b/run.c
> index 2654bff..d2ec8c6 100644
> --- a/run.c
> +++ b/run.c
> @@ -307,7 +307,7 @@ const struct platform platforms[] = {
>  void print_usage(const char *prog_name)
>  {
>  fprintf(stderr,
> -"Usage: %s [-d ] [-p ]  *.shader_test files>\n"
> +"Usage: %s [-d ] [-j ] [-p ] 
> \n"
>  "Other options: \n"
>  " -1Disable multi-threading\n",
>  prog_name);
> @@ -335,7 +335,7 @@ main(int argc, char **argv)
>
>  max_threads = omp_get_max_threads();
>
> -while((opt = getopt(argc, argv, "1d:p:")) != -1) {
> +while((opt = getopt(argc, argv, "1d:j:p:")) != -1) {

Would you  mind fixing the lack of a space between while and ( while
you're modifying this line?

>  switch(opt) {
>  case 'd': {
>  char *endptr;
> @@ -368,6 +368,9 @@ main(int argc, char **argv)
>  setenv("INTEL_DEVID_OVERRIDE", platform->pci_id, 1);
>  break;
>  }
> +case 'j':
> +max_threads = atoi(optarg);
> +break;
>  case '1':
>  max_threads = 1;
>  break;

I would not be opposed to deleting the -1 argument at the same time.

> --
> 2.11.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] V2 GLSL IR & TGSI on-disk shader cache

2017-02-16 Thread Nicolai Hähnle

Hi Timothy,

thank you for the update. I had a look at all the patches now, and 
especially the glsl parts looks basically ready to go. There are only 
minor comments for which I don't need a full resend of the series, and 
an open question on patch 22 where it would be nice to get a proper answer.



On 14.02.2017 01:52, Timothy Arceri wrote:

Changes in V2:

- no longer mess around storing/restoring any pointers
- implemented support for compute shaders
- dropped some patches only needed by i965 for now
- add fallback support for shader source that is changed after its compiledi 
(piglit test on the list)
- simplify cache enable for r600/radeonsi by unconditionally creating the cache 
in screen_create.


Remind me how each part of the cache can be disabled?

Thanks,
Nicolai



- make glsl version (the version reported as supported by the implemenation at
  compile time) part of the sha1 input rather than adding mesa string to the 
cache object itself.
  This avoids fallbacks and should be more reliable.
- add any drirc options as sha1 inputs
- some other tidy ups suggested by Nicolai and Marek

In future we probably want to check what other env vars have been set,
but for now the gl/glsl version and drirc options should cover most things.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-16 Thread Jonathan Gray
On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
> Provides the ability to read the .note.gnu.build-id section of ELF
> binaries, which is inserted by the --build-id=... flag to ld.
> 
> Reviewed-by: Emil Velikov 

I don't have time to dig into details right now but this broke the Mesa
build on OpenBSD and likely other non-linux platforms:

libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
-DPACKAGE_VERSION=\"17.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 17.1.0-devel\"" 
"-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"";
 -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
-DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
-DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 
-DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
-DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 -DHAVE___BUILTIN_CLZLL=1 
-DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 -DHAVE___BUILTIN_FFS=1 
-DHAVE___BUILTIN_FFSLL=1 -DHAVE___BUILTIN_POPCOUNT=1 
-DHAVE___BUILTIN_POPCOUNTLL=1 -DHAVE_FUNC_ATTRIBUTE_CONST=1 
-DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 
-DHAVE_FUNC_ATTRIBUTE_MALLOC=1 -DHAVE_FUNC_ATTRIBUTE_PACKED=1 
-DHAVE_FUNC_ATTRIBUTE_PURE=1 -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 
-DHAVE_FUNC_ATTRIBUTE_WEAK=1 -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 
-DHAVE_CLOCK_GETTIME=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. 
-D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DDEBUG 
-DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H -DHAVE_STRTOF 
-DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR -DHAVE_POSIX_MEMALIGN 
-DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING 
-DENABLE_SHADER_CACHE -DHAVE_MINCORE -I../../include -I../../src 
-I../../src/mapi -I../../src/mesa -I../../src/gallium/include 
-I../../src/gallium/auxiliary -fvisibility=hidden -Werror=pointer-arith -g -O2 
-Wall -std=gnu99 -Werror=implicit-function-declaration 
-Werror=missing-prototypes -fno-math-errno -fno-trapping-math -MT 
libmesautil_la-build_id.lo -MD -MP -MF .deps/libmesautil_la-build_id.Tpo -c 
build_id.c  -fPIC -DPIC -o .libs/libmesautil_la-build_id.o
In file included from /usr/include/elf_abi.h:31,
 from /usr/include/link_elf.h:10,
 from /usr/include/link.h:39,
 from build_id.c:25:
/usr/include/sys/exec_elf.h:585: error: expected specifier-qualifier-list 
before 'uint32_t'
In file included from /usr/include/link.h:39,
 from build_id.c:25:
/usr/include/link_elf.h:22: error: expected specifier-qualifier-list before 
'caddr_t'
/usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
'__attribute__' before 'int'
In file included from build_id.c:25:
/usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or '__attribute__' 
before 'struct'
/usr/include/link.h:65: error: expected specifier-qualifier-list before 
'caddr_t'
build_id.c:34: error: expected specifier-qualifier-list before 'ElfW'
build_id.c: In function 'build_id_find_nhdr_callback':
build_id.c:63: error: 'struct build_id_note' has no member named 'nhdr'
build_id.c:63: error: 'NT_GNU_BUILD_ID' undeclared (first use in this function)
build_id.c:63: error: (Each undeclared identifier is reported only once
build_id.c:63: error: for each function it appears in.)
build_id.c:64: error: 'struct build_id_note' has no member named 'nhdr'
build_id.c:65: error: 'struct build_id_note' has no member named 'nhdr'
build_id.c:66: error: 'struct build_id_note' has no member named 'name'
build_id.c:71: warning: implicit declaration of function 'ElfW'
build_id.c:71: error: 'Nhdr' undeclared (first use in this function)
build_id.c:72: error: 'struct build_id_note' has no member named 'nhdr'
build_id.c:73: error: 'struct build_id_note' has no member named 'nhdr'
build_id.c: In function 'build_id_find_nhdr':
build_id.c:90: warning: implicit declaration of function 'dl_iterate_phdr'
build_id.c: In function 'build_id_length':
build_id.c:99: error: 'const struct build_id_note' has no member named 'nhdr'
build_id.c: In function 'build_id_read':
build_id.c:106: error: 'const struct build_id_note' has no member named 
'build_id'
*** Error 1 in target 'libmesautil_la-build_id.lo'
mv -f .deps/libmesautil_la-strndup.Tpo .deps/libmesautil_la-strndup.Plo
mv -f sha1/.deps/libmesautil_la-sha1.Tpo sha1/.deps/libmesautil_la-sha1.Plo
*** Error 1 in src/util (Makefile:730 'libmesautil_la-build_id.lo')
*** Error 1 in src/util (Makefile:919 'all-recursive')
*** Error 2 in src/util (Makefile:596 'all')
*** Error 1 in src (Makefile:819 'all-recursive')
*** Error 2 in src (Makefile:584 'all')

adding a sys/types.h include before link.h gets slightly further

libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
-DPACKAGE_VERSION=\"17.1.0-devel\" "-D

Re: [Mesa-dev] [PATCH 31/32] st/mesa: implement a tgsi on-disk shader cache

2017-02-16 Thread Nicolai Hähnle

On 14.02.2017 01:52, Timothy Arceri wrote:

Implements a tgsi cache for the OpenGL state tracker.

V2: add support for compute shaders


A few high-level points:

I think it would be nice to have the reading and writing functions in 
the same file, as in the GLSL case. It makes the structure of the code 
easier to follow.


The TGSI reading needs real error handling. As far as I can see, if the 
cache happens to lose one of the TGSI blobs for whatever reason), things 
will silently break in weird ways.


I also don't like that the cache SHA is calculated separately in two 
different places. Wouldn't it be possible to take the same approach as 
in GLSL, where the SHA is computed in one place, and then a different 
path is taken depending on whether the object is found in the cache or not?


One minor comment below:



---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 222 +
 src/mesa/state_tracker/st_program.c| 133 -
 2 files changed, 350 insertions(+), 5 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 630f5af..b485776 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -32,6 +32,7 @@

 #include "st_glsl_to_tgsi.h"

+#include "compiler/glsl/blob.h"
 #include "compiler/glsl/glsl_parser_extras.h"
 #include "compiler/glsl/ir_optimization.h"
 #include "compiler/glsl/program.h"
@@ -47,6 +48,8 @@
 #include "pipe/p_screen.h"
 #include "tgsi/tgsi_ureg.h"
 #include "tgsi/tgsi_info.h"
+#include "util/disk_cache.h"
+#include "util/mesa-sha1.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
 #include "st_program.h"
@@ -6999,6 +7002,219 @@ has_unsupported_control_flow(exec_list *ir,
return visitor.unsupported;
 }

+static void
+read_stream_out_from_cache(struct blob_reader *blob_reader,
+   struct pipe_shader_state *tgsi)
+{
+   blob_copy_bytes(blob_reader, (uint8_t *) &tgsi->stream_output,
+sizeof(tgsi->stream_output));
+}
+
+static void
+read_tgsi_from_cache(struct blob_reader *blob_reader,
+ const struct tgsi_token **tokens)
+{
+   uint32_t num_tokens  = blob_read_uint32(blob_reader);
+   unsigned tokens_size = num_tokens * sizeof(struct tgsi_token);
+   *tokens = (const tgsi_token*) MALLOC(tokens_size);
+   blob_copy_bytes(blob_reader, (uint8_t *) *tokens, tokens_size);
+}
+
+static void
+load_tgsi_from_disk_cache(struct gl_context *ctx,
+  struct gl_shader_program *prog)
+{
+   unsigned char sha1[20];
+   char sha1_buf[41];
+   struct st_context *st = st_context(ctx);
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (prog->_LinkedShaders[i] == NULL)
+ continue;
+
+  char *buf = ralloc_strdup(NULL, "tsgi_tokens ");


Typo: tgsi (same below)

Cheers,
Nicolai



+  _mesa_sha1_format(sha1_buf,
+prog->_LinkedShaders[i]->Program->sh.data->sha1);
+  ralloc_strcat(&buf, sha1_buf);
+
+  struct gl_program *glprog = prog->_LinkedShaders[i]->Program;
+  switch (glprog->info.stage) {
+  case MESA_SHADER_VERTEX:
+ ralloc_strcat(&buf, " vs");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+  case MESA_SHADER_TESS_EVAL:
+ ralloc_strcat(&buf, " tes");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+  case MESA_SHADER_TESS_CTRL:
+ ralloc_strcat(&buf, " tcs");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+  case MESA_SHADER_GEOMETRY:
+ ralloc_strcat(&buf, " gs");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+  case MESA_SHADER_FRAGMENT:
+ ralloc_strcat(&buf, " fs");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+  case MESA_SHADER_COMPUTE:
+ ralloc_strcat(&buf, " cs");
+ _mesa_sha1_compute(buf, strlen(buf), sha1);
+ break;
+
+  default:
+ unreachable("Unsupported stage");
+  }
+
+  size_t size;
+  uint8_t *buffer = (uint8_t *) disk_cache_get(ctx->Cache, sha1, &size);
+  if (buffer) {
+ struct blob_reader blob_reader;
+ blob_reader_init(&blob_reader, buffer, size);
+
+ switch (glprog->info.stage) {
+ case MESA_SHADER_VERTEX: {
+struct st_vertex_program *stvp =
+   (struct st_vertex_program *) glprog;
+
+st_release_vp_variants(st, stvp);
+
+stvp->num_inputs = blob_read_uint32(&blob_reader);
+blob_copy_bytes(&blob_reader, (uint8_t *) stvp->index_to_input,
+sizeof(stvp->index_to_input));
+blob_copy_bytes(&blob_reader, (uint8_t *) stvp->result_to_output,
+sizeof(stvp->result_to_output));
+
+read_stream_out_from_cache(&blob_reader, &stvp->tgsi);
+read_tgsi_from_cache(&blob_read

Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries

2017-02-16 Thread Robert Bragg
On Wed, Feb 15, 2017 at 11:04 PM, Ilia Mirkin  wrote:
> On Tue, Jan 24, 2017 at 5:27 PM, Robert Bragg  wrote:
>> Depending on how strictly we consider that the queries should only 
>> measure
>> the commands they bracket then I think some stalling will be necessary to
>> serialize the work associated with a query and defer reading the end 
>> state
>> until after the relevant stages have completed their work.
>>
>> We aren't very precise about this in GL currently, but in Begin maybe we
>> should stall until everything >= the statistic-stage is idle and in End
>> stall until everything <= the statistic-stage is idle before reading
>> (where
>> 'statistic-stage' here is the pipeline stage associated with the pipeline
>> statistic being queried (or respectively the min/max stage for a set)).
>>
>> For reference in my implementation of INTEL_performance_query facing this
>> same question, I'm currently just stalling before and after queries:
>>
>>
>> https://github.com/rib/mesa/blob/wip/rib/oa-next/src/mesa/drivers/dri/i965/brw_performance_query.c#L994
>>
>> https://github.com/rib/mesa/blob/wip/rib/oa-next/src/mesa/drivers/dri/i965/brw_performance_query.c#L1136
>
> So that's essentially what I'm doing here, I think. (And what the GL
> driver does.)
>>
>> Yup, the upshot might just be a comment explaining the need for a
>> stall. I think we probably need a stall in CmdEndQuery too, otherwise
>> the command streamer may read the end counter before the work has
>> finished.
>
> Robert,
>
> Can you give me some examples of how I might implement this? I'm not
> so familiar with the Intel HW to know this offhand. Mostly hoping you
> can point me at a mapping of which bit in what command corresponds to
> which stage.

Heh, actually just after I sent out my series for
GL_INTEL_performance_query yesterday I of course remembered that I
needed to fold back command streamer synchronization from a later
patch to the one for pipeline statistics.

My last reply was just trying to suggest replacing the "TODO: This
might only be necessary for certain stats" comment - so nothing to
really implement. I had thought you might be missing a corresponding
stall in the CmdEndQuery but just checking it looks like you already
have one with the same TODO comment. Sorry I didn't double check that
at the time.

I'm not sure it's worth worrying about trying to apply fine grained
control over flushing, even though I suggested that idea originally.
After looking into that possibility more I don't think the HW actually
supports very detailed control (with one exception maybe being to use
DEPTH_STALL with occlusion queries).

My (limited) understanding is that a PIPE_CONTROL with CS_STALL and
STALL_AT_SCOREBOARD should generally suffice to stall the command
streamer until the pipeline has been drained. Since this is what you
are already doing my last reply was trying to say that it maybe just
needs a better comment to explain why we need:

+  /* TODO: This might only be necessary for certain stats */
+  anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
+ pc.CommandStreamerStallEnable = true;
+ pc.StallAtPixelScoreboard = true;
+  }

instead of "TODO: This might only be necessary for certain stats".


I don't know if it's a clear explanation, but feel free to steal
anything from my latest attempt to comment the need for stalling in
this patch for INTEL_performance_query.

https://lists.freedesktop.org/archives/mesa-dev/2017-February/144670.html

Btw, in case you ask, I've never found a good explanation of what
'stall at scoreboard' really means since I'm not really familiar with
what the scoreboard is. :-/ One impression I've got is that it's just
the least-restrictive way of satisfying the restrictions on using
CS_STALL. I think the scoreboard is something related to dependency
tracking while scheduling threads to execute on EUs so I currently
imagine it to mean "stall until there are no more threads left to
schedule for pixel shading" - maybe someone else knows better.

One other data point here is that the Intel driver on windows uses
PIPE_CONTROL + CS_STALL + STALL_AT_SCOREBOARD in its implementation of
INTEL_performance_query and query objects, so hopefully they've found
that to be enough. So even if this does nothing to explain why, all
things being equal it could be good to be consistent if we're ever
trying to compare metrics between different drivers.

Regards,
- Robert


>
> Thanks,
>
>   -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] travis: bring the scons build on par with AppVeyor

2017-02-16 Thread Emil Velikov
From: Emil Velikov 

Namely, always build with LLVM and run the check target.

Cc: Rhys Kidd 
Cc: Eric Anholt 
Signed-off-by: Emil Velikov 
---
 .travis.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index fb72a5e9b9..a3b094f9a1 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -109,5 +109,5 @@ script:
 ;
   make && make check;
 elif test x$BUILD = xscons; then
-  scons;
+  scons llvm=1 && scons llvm=1 check;
 fi
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 29/32] util/disk_cache: check cache exists before calling munmap()

2017-02-16 Thread Nicolai Hähnle

Patches 26-29:

Reviewed-by: Nicolai Hähnle 

On 14.02.2017 01:52, Timothy Arceri wrote:

---
 src/util/disk_cache.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 10b9d81..8eccf72 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -383,7 +383,8 @@ disk_cache_create(const char *gpu_name, const char 
*timestamp)
 void
 disk_cache_destroy(struct disk_cache *cache)
 {
-   munmap(cache->index_mmap, cache->index_mmap_size);
+   if (cache)
+  munmap(cache->index_mmap, cache->index_mmap_size);

ralloc_free(cache);
 }



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 25/32] st/mesa/glsl: build string of dri options and use as input to building sha for shaders

2017-02-16 Thread Nicolai Hähnle

On 14.02.2017 01:52, Timothy Arceri wrote:

---
 src/compiler/glsl/shader_cache.cpp  |  6 
 src/gallium/include/state_tracker/st_api.h  |  1 +
 src/gallium/state_trackers/dri/dri_screen.c |  2 ++
 src/mesa/drivers/dri/common/xmlconfig.h | 52 +
 src/mesa/main/mtypes.h  |  3 ++
 src/mesa/state_tracker/st_extensions.c  |  2 ++
 6 files changed, 66 insertions(+)

diff --git a/src/compiler/glsl/shader_cache.cpp 
b/src/compiler/glsl/shader_cache.cpp
index 30e652f..8f97564 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1318,7 +1318,13 @@ shader_cache_read_program_metadata(struct gl_context 
*ctx,
   ctx->API, ctx->Const.GLSLVersion,
   ctx->Const.ForceGLSLVersion);

+   /* DRI config options may also change the output from the compiler so
+* include them as an input to sha1 creation.
+*/
char sha1buf[41];
+   _mesa_sha1_format(sha1buf, ctx->Const.dri_config_options_sha1);
+   ralloc_strcat(&buf, sha1buf);
+
for (unsigned i = 0; i < prog->NumShaders; i++) {
   struct gl_shader *sh = prog->Shaders[i];
   ralloc_asprintf_append(&buf, "%s: %s\n",
diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index a2e37d2..872bdb6 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -246,6 +246,7 @@ struct st_config_options
boolean force_s3tc_enable;
boolean allow_glsl_extension_directive_midshader;
boolean glsl_zero_init;
+   unsigned char config_options_sha1[20];
 };

 /**
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index a950f52..9de970c 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -100,6 +100,8 @@ dri_fill_st_options(struct st_config_options *options,
options->allow_glsl_extension_directive_midshader =
   driQueryOptionb(optionCache, "allow_glsl_extension_directive_midshader");
options->glsl_zero_init = driQueryOptionb(optionCache, "glsl_zero_init");
+
+   driComputeOptionsSha1(optionCache, options->config_options_sha1);
 }

 static const __DRIconfig **
diff --git a/src/mesa/drivers/dri/common/xmlconfig.h 
b/src/mesa/drivers/dri/common/xmlconfig.h
index 8969843..4aa09e8 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.h
+++ b/src/mesa/drivers/dri/common/xmlconfig.h
@@ -30,6 +30,9 @@
 #ifndef __XMLCONFIG_H
 #define __XMLCONFIG_H

+#include "util/mesa-sha1.h"
+#include "util/ralloc.h"
+
 #define STRING_CONF_MAXLEN 25

 /** \brief Option data types */
@@ -124,4 +127,53 @@ float driQueryOptionf (const driOptionCache *cache, const 
char *name);
 /** \brief Query a string option value */
 char *driQueryOptionstr (const driOptionCache *cache, const char *name);

+/**
+ * Returns a concatenated string of the options for this application.


This comment seems wrong. With that fixed, patches 23-25 are:

Reviewed-by: Nicolai Hähnle 


+ */
+static inline void
+driComputeOptionsSha1(const driOptionCache *cache, unsigned char *sha1)
+{
+   void *ctx = ralloc_context(NULL);
+   char *dri_options = ralloc_strdup(ctx, "");
+
+   for (unsigned i = 0; i < 1 << cache->tableSize; i++) {
+  if (cache->info[i].name == NULL)
+ continue;
+
+  bool ret = false;
+  switch (cache->info[i].type) {
+  case DRI_BOOL:
+ ret = ralloc_asprintf_append(&dri_options, "%s:%u,",
+  cache->info[i].name,
+  cache->values[i]._bool);
+ break;
+  case DRI_INT:
+  case DRI_ENUM:
+ ret = ralloc_asprintf_append(&dri_options, "%s:%d,",
+  cache->info[i].name,
+  cache->values[i]._int);
+ break;
+  case DRI_FLOAT:
+ ret = ralloc_asprintf_append(&dri_options, "%s:%f,",
+  cache->info[i].name,
+  cache->values[i]._float);
+ break;
+  case DRI_STRING:
+ ret = ralloc_asprintf_append(&dri_options, "%s:%s,",
+  cache->info[i].name,
+  cache->values[i]._string);
+ break;
+  default:
+ unreachable("unsupported dri config type!");
+  }
+
+  if (!ret) {
+ break;
+  }
+   }
+
+   _mesa_sha1_compute(dri_options, strlen(dri_options), sha1);
+   ralloc_free(ctx);
+}
+
 #endif
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index c51e8ec..15df3cc 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3764,6 +3764,9 @@ struct gl_constants

/** GL_OES_primitive_bounding_box */
bool NoPrimitiveBoundingBoxOutput;
+
+   /** Used as an input for sha1 generation in the on-disk shader cache */
+   unsigned ch

[Mesa-dev] [PATCH v2] i965: Implement INTEL_performance_query backend

2017-02-16 Thread Robert Bragg
I forgot to fold back the addition of pipeline stalls around
queries from a later patch (a detailed explanation is included
as a comment in the code below).

--- >8 --- (git am --scissor)

This adds a bare-bones backend for the INTEL_performance_query extension
that exposes pipeline statistics.

Although this could be considered redundant given that the same
statistics are already available via query objects, they are a simple
starting point for this extension and it's expected to be convenient for
tools wanting to have a single go to api to introspect what performance
counters are available, along with names, descriptions and semantic/data
types.

This code is derived from Kenneth Graunke's work, temporarily removed
while the frontend and backend interface were reworked.

Signed-off-by: Robert Bragg 
---
 src/mesa/drivers/dri/i965/Makefile.sources|   2 +
 src/mesa/drivers/dri/i965/brw_context.c   |   3 +
 src/mesa/drivers/dri/i965/brw_context.h   |  23 +
 src/mesa/drivers/dri/i965/brw_performance_query.c | 649 ++
 src/mesa/drivers/dri/i965/brw_performance_query.h |  49 ++
 src/mesa/drivers/dri/i965/intel_extensions.c  |   3 +
 6 files changed, 729 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.c
 create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.h

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index dd546826d1..5278e86339 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -135,6 +135,8 @@ i965_FILES = \
brw_nir_uniforms.cpp \
brw_object_purgeable.c \
brw_pipe_control.c \
+   brw_performance_query.h \
+   brw_performance_query.c \
brw_program.c \
brw_program.h \
brw_program_cache.c \
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index c56a14e3d6..393adede8a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1139,6 +1139,9 @@ brwCreateContext(gl_api api,
_mesa_initialize_dispatch_tables(ctx);
_mesa_initialize_vbo_vtxfmt(ctx);
 
+   if (ctx->Extensions.INTEL_performance_query)
+  brw_init_performance_queries(brw);
+
vbo_use_buffer_objects(ctx);
vbo_always_unmap_buffers(ctx);
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 01e651b09f..a6d91bcce0 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -655,6 +655,19 @@ struct shader_times;
 
 struct gen_l3_config;
 
+enum brw_query_kind {
+   PIPELINE_STATS
+};
+
+struct brw_perf_query_info
+{
+   enum brw_query_kind kind;
+   const char *name;
+   struct brw_perf_query_counter *counters;
+   int n_counters;
+   size_t data_size;
+};
+
 /**
  * brw_context is derived from gl_context.
  */
@@ -1132,6 +1145,13 @@ struct brw_context
   bool supported;
} predicate;
 
+   struct {
+  struct brw_perf_query_info *queries;
+  int n_queries;
+
+  int n_active_pipeline_stats_queries;
+   } perfquery;
+
int num_atoms[BRW_NUM_PIPELINES];
const struct brw_tracked_state render_atoms[76];
const struct brw_tracked_state compute_atoms[11];
@@ -1434,6 +1454,9 @@ bool brw_render_target_supported(struct brw_context *brw,
  struct gl_renderbuffer *rb);
 uint32_t brw_depth_format(struct brw_context *brw, mesa_format format);
 
+/* brw_performance_query.c */
+void brw_init_performance_queries(struct brw_context *brw);
+
 /* intel_buffer_objects.c */
 int brw_bo_map(struct brw_context *brw, drm_intel_bo *bo, int write_enable,
const char *bo_name);
diff --git a/src/mesa/drivers/dri/i965/brw_performance_query.c 
b/src/mesa/drivers/dri/i965/brw_performance_query.c
new file mode 100644
index 00..f1b6f583bf
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/brw_performance_query.c
@@ -0,0 +1,649 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT

Re: [Mesa-dev] [PATCH] [swr] fix windows build

2017-02-16 Thread Emil Velikov
On 15 February 2017 at 22:46, Ilia Mirkin  wrote:
> Yeah, just like all the other headers:
>
> #ifdef __cplusplus
> extern "C" {
> #endif
>
> define api's
>
> #ifdef __cplusplus
> }
> #endif
>
> You can see examples in, e.g., u_bitcast.h (picked one at random).
>
Thanks for spotting this one Ilia.

Anyone wondering why using extern C { #include "foo.h"} is a bad idea, see [1].

-Emil
[1] 
http://developers.redhat.com/blog/2016/02/29/why-cstdlib-is-more-complicated-than-you-might-think/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/18] gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_buffer_common.c | 4 ++--
 src/gallium/drivers/radeon/r600_pipe_common.h   | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 5ccfb09..cc9d3be 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -152,22 +152,22 @@ void r600_init_resource_fields(struct r600_common_screen 
*rscreen,
 * executes a command stream.
 */
if (rscreen->info.drm_major == 2 &&
rscreen->info.drm_minor < 40)
res->domains = RADEON_DOMAIN_GTT;
else if (res->domains & RADEON_DOMAIN_VRAM)
res->flags |= RADEON_FLAG_CPU_ACCESS;
}
 
/* Tiled textures are unmappable. Always put them in VRAM. */
-   if (res->b.b.target != PIPE_BUFFER &&
-   !rtex->surface.is_linear) {
+   if ((res->b.b.target != PIPE_BUFFER && !rtex->surface.is_linear) ||
+   res->flags & R600_RESOURCE_FLAG_UNMAPPABLE) {
res->domains = RADEON_DOMAIN_VRAM;
res->flags &= ~RADEON_FLAG_CPU_ACCESS;
res->flags |= RADEON_FLAG_NO_CPU_ACCESS |
 RADEON_FLAG_GTT_WC;
}
 
/* If VRAM is just stolen system memory, allow both VRAM and
 * GTT, whichever has free space. If a buffer is evicted from
 * VRAM to GTT, it will stay there.
 *
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index b4f0f0b..e8dbf5d 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -42,20 +42,21 @@
 #include "util/slab.h"
 #include "util/u_suballoc.h"
 #include "util/u_transfer.h"
 
 #define ATI_VENDOR_ID 0x1002
 
 #define R600_RESOURCE_FLAG_TRANSFER(PIPE_RESOURCE_FLAG_DRV_PRIV << 
0)
 #define R600_RESOURCE_FLAG_FLUSHED_DEPTH   (PIPE_RESOURCE_FLAG_DRV_PRIV << 
1)
 #define R600_RESOURCE_FLAG_FORCE_TILING
(PIPE_RESOURCE_FLAG_DRV_PRIV << 2)
 #define R600_RESOURCE_FLAG_DISABLE_DCC (PIPE_RESOURCE_FLAG_DRV_PRIV << 
3)
+#define R600_RESOURCE_FLAG_UNMAPPABLE  (PIPE_RESOURCE_FLAG_DRV_PRIV << 
4)
 
 #define R600_CONTEXT_STREAMOUT_FLUSH   (1u << 0)
 /* Pipeline & streamout query controls. */
 #define R600_CONTEXT_START_PIPELINE_STATS  (1u << 1)
 #define R600_CONTEXT_STOP_PIPELINE_STATS   (1u << 2)
 #define R600_CONTEXT_PRIVATE_FLAG  (1u << 3)
 
 /* special primitive types */
 #define R600_PRIM_RECTANGLE_LIST   PIPE_PRIM_MAX
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/18] radeonsi: use a clever alignment for descriptor uploads

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

Non-VBO descriptors won't be smaller than the cache line, so simply use
the cache line size.
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 72b33f3..4f2dbbb 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -227,23 +227,24 @@ static bool si_upload_descriptors(struct si_context *sctx,
radeon_emit(sctx->ce_ib, desc->ce_offset + begin * 4);
radeon_emit_array(sctx->ce_ib, list + begin, count);
}
 
if (!si_ce_upload(sctx, desc->ce_offset, list_size,
   &desc->buffer_offset, &desc->buffer))
return false;
} else {
void *ptr;
 
-   u_upload_alloc(sctx->b.b.stream_uploader, 0, list_size, 256,
-   &desc->buffer_offset,
-   (struct pipe_resource**)&desc->buffer, &ptr);
+   u_upload_alloc(sctx->b.b.stream_uploader, 0, list_size,
+  sctx->screen->b.info.tcc_cache_line_size,
+  &desc->buffer_offset,
+  (struct pipe_resource**)&desc->buffer, &ptr);
if (!desc->buffer)
return false; /* skip the draw call */
 
util_memcpy_cpu_to_le32(ptr, desc->list, list_size);
desc->gpu_list = ptr;
 
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, desc->buffer,
RADEON_USAGE_READ, RADEON_PRIO_DESCRIPTORS);
}
desc->dirty_mask = 0;
@@ -941,34 +942,36 @@ static void si_vertex_buffers_begin_new_cs(struct 
si_context *sctx)
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 }
 
 bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
 {
struct si_vertex_element *velems = sctx->vertex_elements;
struct si_descriptors *desc = &sctx->vertex_buffers;
unsigned i, count = velems->count;
+   unsigned desc_list_byte_size = velems->desc_list_byte_size;
uint64_t va;
uint32_t *ptr;
 
if (!sctx->vertex_buffers_dirty || !count || !velems)
return true;
 
unsigned first_vb_use_mask = velems->first_vb_use_mask;
 
/* Vertex buffer descriptors are the only ones which are uploaded
 * directly through a staging buffer and don't go through
 * the fine-grained upload path.
 */
u_upload_alloc(sctx->b.b.stream_uploader, 0,
-  velems->desc_list_byte_size, 256,
+  desc_list_byte_size,
+  si_optimal_tcc_alignment(sctx, desc_list_byte_size),
   &desc->buffer_offset,
   (struct pipe_resource**)&desc->buffer, (void**)&ptr);
if (!desc->buffer)
return false;
 
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 
assert(count <= SI_MAX_ATTRIBS);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 18/18] radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_texture.c   | 11 +--
 src/gallium/drivers/radeonsi/si_compute.c   |  6 ++--
 src/gallium/drivers/radeonsi/si_cp_dma.c|  6 ++--
 src/gallium/drivers/radeonsi/si_pipe.c  | 12 +---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 41 -
 5 files changed, 50 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 47aa8b1..0865d35 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -756,21 +756,23 @@ static void r600_texture_alloc_cmask_separate(struct 
r600_common_screen *rscreen
 
assert(rtex->cmask.size == 0);
 
if (rscreen->chip_class >= SI) {
si_texture_get_cmask_info(rscreen, rtex, &rtex->cmask);
} else {
r600_texture_get_cmask_info(rscreen, rtex, &rtex->cmask);
}
 
rtex->cmask_buffer = (struct r600_resource *)
-   r600_aligned_buffer_create(&rscreen->b, 0, PIPE_USAGE_DEFAULT,
+   r600_aligned_buffer_create(&rscreen->b,
+  R600_RESOURCE_FLAG_UNMAPPABLE,
+  PIPE_USAGE_DEFAULT,
   rtex->cmask.size,
   rtex->cmask.alignment);
if (rtex->cmask_buffer == NULL) {
rtex->cmask.size = 0;
return;
}
 
/* update colorbuffer state bits */
rtex->cmask.base_address_reg = rtex->cmask_buffer->gpu_address >> 8;
 
@@ -867,21 +869,23 @@ static void r600_texture_allocate_htile(struct 
r600_common_screen *rscreen,
clear_value = 0x030F;
} else {
r600_texture_get_htile_size(rscreen, rtex);
clear_value = 0;
}
 
if (!rtex->surface.htile_size)
return;
 
rtex->htile_buffer = (struct r600_resource*)
-   r600_aligned_buffer_create(&rscreen->b, 0, PIPE_USAGE_DEFAULT,
+   r600_aligned_buffer_create(&rscreen->b,
+  R600_RESOURCE_FLAG_UNMAPPABLE,
+  PIPE_USAGE_DEFAULT,
   rtex->surface.htile_size,
   rtex->surface.htile_alignment);
if (rtex->htile_buffer == NULL) {
/* this is not a fatal error as we can still keep rendering
 * without htile buffer */
R600_ERR("Failed to create buffer object for htile buffer.\n");
} else {
r600_screen_clear_buffer(rscreen, &rtex->htile_buffer->b.b,
 0, rtex->surface.htile_size,
 clear_value);
@@ -2099,21 +2103,22 @@ static void vi_separate_dcc_try_enable(struct 
r600_common_context *rctx,
r600_texture_discard_cmask(rctx->screen, tex);
 
/* Get a DCC buffer. */
if (tex->last_dcc_separate_buffer) {
assert(tex->dcc_gather_statistics);
assert(!tex->dcc_separate_buffer);
tex->dcc_separate_buffer = tex->last_dcc_separate_buffer;
tex->last_dcc_separate_buffer = NULL;
} else {
tex->dcc_separate_buffer = (struct r600_resource*)
-   r600_aligned_buffer_create(rctx->b.screen, 0,
+   r600_aligned_buffer_create(rctx->b.screen,
+  
R600_RESOURCE_FLAG_UNMAPPABLE,
   PIPE_USAGE_DEFAULT,
   tex->surface.dcc_size,
   tex->surface.dcc_alignment);
if (!tex->dcc_separate_buffer)
return;
}
 
/* dcc_offset is the absolute GPUVM address. */
tex->dcc_offset = tex->dcc_separate_buffer->gpu_address;
 
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 88d72c1..f4efb0d 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -282,22 +282,24 @@ static bool si_setup_compute_scratch_buffer(struct 
si_context *sctx,
uint64_t scratch_bo_size, scratch_needed;
scratch_bo_size = 0;
scratch_needed = config->scratch_bytes_per_wave * sctx->scratch_waves;
if (sctx->compute_scratch_buffer)
scratch_bo_size = sctx->compute_scratch_buffer->b.b.width0;
 
if (scratch_bo_size < scratch_needed) {
r600_resource_reference(&sctx->compute_scratch_buffer, NULL);
 
sctx->compute_scratch_buffer = (struct r600_resource*)
-   pipe_buffer_create(&sctx->screen->b.

[Mesa-dev] [PATCH 15/18] radeonsi: upload constants into VRAM instead of GTT

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

This lowers lgkm wait cycles by 30% on VI and normal conditions.
The might be a measurable improvement when CE is disabled (radeon)
or under L2 thrashing.
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 11 ---
 src/gallium/drivers/radeonsi/si_compute.c |  4 ++--
 src/gallium/drivers/radeonsi/si_descriptors.c |  6 +++---
 src/gallium/drivers/radeonsi/si_state.c   |  7 +--
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index d573b39..1781584 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -600,21 +600,25 @@ bool r600_common_context_init(struct r600_common_context 
*rctx,
rctx->allocator_zeroed_memory =
u_suballocator_create(&rctx->b, rscreen->info.gart_page_size,
  0, PIPE_USAGE_DEFAULT, 0, true);
if (!rctx->allocator_zeroed_memory)
return false;
 
rctx->b.stream_uploader = u_upload_create(&rctx->b, 1024 * 1024,
  0, PIPE_USAGE_STREAM);
if (!rctx->b.stream_uploader)
return false;
-   rctx->b.const_uploader = rctx->b.stream_uploader;
+
+   rctx->b.const_uploader = u_upload_create(&rctx->b, 128 * 1024,
+0, PIPE_USAGE_DEFAULT);
+   if (!rctx->b.const_uploader)
+   return false;
 
rctx->ctx = rctx->ws->ctx_create(rctx->ws);
if (!rctx->ctx)
return false;
 
if (rscreen->info.has_sdma && !(rscreen->debug_flags & 
DBG_NO_ASYNC_DMA)) {
rctx->dma.cs = rctx->ws->cs_create(rctx->ctx, RING_DMA,
   r600_flush_dma_ring,
   rctx);
rctx->dma.flush = r600_flush_dma_ring;
@@ -642,23 +646,24 @@ void r600_common_context_cleanup(struct 
r600_common_context *rctx)
if (rctx->query_result_shader)
rctx->b.delete_compute_state(&rctx->b, 
rctx->query_result_shader);
 
if (rctx->gfx.cs)
rctx->ws->cs_destroy(rctx->gfx.cs);
if (rctx->dma.cs)
rctx->ws->cs_destroy(rctx->dma.cs);
if (rctx->ctx)
rctx->ws->ctx_destroy(rctx->ctx);
 
-   if (rctx->b.stream_uploader) {
+   if (rctx->b.stream_uploader)
u_upload_destroy(rctx->b.stream_uploader);
-   }
+   if (rctx->b.const_uploader)
+   u_upload_destroy(rctx->b.const_uploader);
 
slab_destroy_child(&rctx->pool_transfers);
 
if (rctx->allocator_zeroed_memory) {
u_suballocator_destroy(rctx->allocator_zeroed_memory);
}
rctx->ws->fence_reference(&rctx->last_gfx_fence, NULL);
rctx->ws->fence_reference(&rctx->last_sdma_fence, NULL);
 }
 
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 381837c..88d72c1 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -496,21 +496,21 @@ static void si_setup_user_sgprs_co_v2(struct si_context 
*sctx,
 
dispatch.grid_size_x = info->grid[0] * info->block[0];
dispatch.grid_size_y = info->grid[1] * info->block[1];
dispatch.grid_size_z = info->grid[2] * info->block[2];
 
dispatch.private_segment_size = program->private_size;
dispatch.group_segment_size = program->local_size;
 
dispatch.kernarg_address = kernel_args_va;
 
-   u_upload_data(sctx->b.b.stream_uploader, 0, sizeof(dispatch),
+   u_upload_data(sctx->b.b.const_uploader, 0, sizeof(dispatch),
   256, &dispatch, &dispatch_offset,
   (struct pipe_resource**)&dispatch_buf);
 
if (!dispatch_buf) {
fprintf(stderr, "Error: Failed to allocate dispatch "
"packet.");
}
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx, dispatch_buf,
  RADEON_USAGE_READ, RADEON_PRIO_CONST_BUFFER);
 
@@ -558,21 +558,21 @@ static void si_upload_compute_input(struct si_context 
*sctx,
unsigned num_work_size_bytes = program->use_code_object_v2 ? 0 : 36;
uint32_t kernel_args_offset = 0;
uint32_t *kernel_args;
void *kernel_args_ptr;
uint64_t kernel_args_va;
unsigned i;
 
/* The extra num_work_size_bytes are for work group / work item size 
information */
kernel_args_size = program->input_size + num_work_size_bytes;
 
-   u_upload_alloc(sctx->b.b.stream_uploader, 0, kernel_args_size,
+   u_upload_alloc(sctx->b.b.const_uploader, 0, kernel_args_size,
 

[Mesa-dev] [PATCH 14/18] gallium/radeon: use TCC line size as alignment in other places

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_buffer_common.c | 3 ++-
 src/gallium/drivers/radeon/r600_pipe_common.c   | 3 ++-
 src/gallium/drivers/radeonsi/si_compute.c   | 3 ++-
 src/gallium/drivers/radeonsi/si_descriptors.c   | 5 +++--
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 9e5a8a6..e37e36f 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -362,21 +362,22 @@ static void *r600_buffer_transfer_map(struct pipe_context 
*ctx,
 
/* Check if mapping this buffer would cause waiting for the 
GPU. */
if (r600_rings_is_buffer_referenced(rctx, rbuffer->buf, 
RADEON_USAGE_READWRITE) ||
!rctx->ws->buffer_wait(rbuffer->buf, 0, 
RADEON_USAGE_READWRITE)) {
/* Do a wait-free write-only transfer using a temporary 
buffer. */
unsigned offset;
struct r600_resource *staging = NULL;
 
u_upload_alloc(ctx->stream_uploader, 0,
box->width + (box->x % 
R600_MAP_BUFFER_ALIGNMENT),
-  256, &offset, (struct 
pipe_resource**)&staging,
+  rctx->screen->info.tcc_cache_line_size,
+  &offset, (struct 
pipe_resource**)&staging,
(void**)&data);
 
if (staging) {
data += box->x % R600_MAP_BUFFER_ALIGNMENT;
return r600_buffer_get_transfer(ctx, resource, 
usage, box,
ptransfer, 
data, staging, offset);
}
} else {
/* At this point, the buffer is always idle (we checked 
it above). */
usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 8405c5e..d573b39 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -186,21 +186,22 @@ void r600_draw_rectangle(struct blitter_context *blitter,
viewport.scale[1] = 1.0f;
viewport.scale[2] = 1.0f;
viewport.translate[0] = 0.0f;
viewport.translate[1] = 0.0f;
viewport.translate[2] = 0.0f;
rctx->b.set_viewport_states(&rctx->b, 0, 1, &viewport);
 
/* Upload vertices. The hw rectangle has only 3 vertices,
 * I guess the 4th one is derived from the first 3.
 * The vertex specification should match u_blitter's vertex element 
state. */
-   u_upload_alloc(rctx->b.stream_uploader, 0, sizeof(float) * 24, 256,
+   u_upload_alloc(rctx->b.stream_uploader, 0, sizeof(float) * 24,
+  rctx->screen->info.tcc_cache_line_size,
&offset, &buf, (void**)&vb);
if (!buf)
return;
 
vb[0] = x1;
vb[1] = y1;
vb[2] = depth;
vb[3] = 1;
 
vb[8] = x1;
diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index aae651c..381837c 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -558,21 +558,22 @@ static void si_upload_compute_input(struct si_context 
*sctx,
unsigned num_work_size_bytes = program->use_code_object_v2 ? 0 : 36;
uint32_t kernel_args_offset = 0;
uint32_t *kernel_args;
void *kernel_args_ptr;
uint64_t kernel_args_va;
unsigned i;
 
/* The extra num_work_size_bytes are for work group / work item size 
information */
kernel_args_size = program->input_size + num_work_size_bytes;
 
-   u_upload_alloc(sctx->b.b.stream_uploader, 0, kernel_args_size, 256,
+   u_upload_alloc(sctx->b.b.stream_uploader, 0, kernel_args_size,
+  sctx->screen->b.info.tcc_cache_line_size,
   &kernel_args_offset,
   (struct pipe_resource**)&input_buffer, &kernel_args_ptr);
 
kernel_args = (uint32_t*)kernel_args_ptr;
kernel_args_va = input_buffer->gpu_address + kernel_args_offset;
 
if (!code_object) {
for (i = 0; i < 3; i++) {
kernel_args[i] = info->grid[i];
kernel_args[i + 3] = info->grid[i] * info->block[i];
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 4f2dbbb..b4f1fbf 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -130,22 +130,23 @@ static void si_init_descriptors(struct si_descriptors 
*desc,
 static 

[Mesa-dev] [PATCH 16/18] gallium/radeon: change r600_aligned_buffer_create to take flags, not bind

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

All call sites set bind = 0. The next commit will use this.
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 6 +++---
 src/gallium/drivers/radeon/r600_pipe_common.h   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index e37e36f..5ccfb09 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -543,33 +543,33 @@ struct pipe_resource *r600_buffer_create(struct 
pipe_screen *screen,
rbuffer->flags |= RADEON_FLAG_HANDLE;
 
if (!r600_alloc_resource(rscreen, rbuffer)) {
FREE(rbuffer);
return NULL;
}
return &rbuffer->b.b;
 }
 
 struct pipe_resource *r600_aligned_buffer_create(struct pipe_screen *screen,
-unsigned bind,
+unsigned flags,
 unsigned usage,
 unsigned size,
 unsigned alignment)
 {
struct pipe_resource buffer;
 
memset(&buffer, 0, sizeof buffer);
buffer.target = PIPE_BUFFER;
buffer.format = PIPE_FORMAT_R8_UNORM;
-   buffer.bind = bind;
+   buffer.bind = 0;
buffer.usage = usage;
-   buffer.flags = 0;
+   buffer.flags = flags;
buffer.width0 = size;
buffer.height0 = 1;
buffer.depth0 = 1;
buffer.array_size = 1;
return r600_buffer_create(screen, &buffer, alignment);
 }
 
 struct pipe_resource *
 r600_buffer_from_user_memory(struct pipe_screen *screen,
 const struct pipe_resource *templ,
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 1fe44d9..b4f0f0b 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -711,21 +711,21 @@ void r600_buffer_subdata(struct pipe_context *ctx,
 unsigned size, const void *data);
 void r600_init_resource_fields(struct r600_common_screen *rscreen,
   struct r600_resource *res,
   uint64_t size, unsigned alignment);
 bool r600_alloc_resource(struct r600_common_screen *rscreen,
 struct r600_resource *res);
 struct pipe_resource *r600_buffer_create(struct pipe_screen *screen,
 const struct pipe_resource *templ,
 unsigned alignment);
 struct pipe_resource * r600_aligned_buffer_create(struct pipe_screen *screen,
- unsigned bind,
+ unsigned flags,
  unsigned usage,
  unsigned size,
  unsigned alignment);
 struct pipe_resource *
 r600_buffer_from_user_memory(struct pipe_screen *screen,
 const struct pipe_resource *templ,
 void *user_memory);
 void
 r600_invalidate_resource(struct pipe_context *ctx,
 struct pipe_resource *resource);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/18] radeonsi: move index buffer flushing into a non-upload indexed case

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

The other codepaths don't need this.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 8f5dcbc..ca28f50 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1079,30 +1079,29 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
start_offset = start * ib.index_size;
 
u_upload_data(ctx->stream_uploader, start_offset,
   count * ib.index_size,
  256, (char*)ib.user_buffer + start_offset,
  &ib.offset, &ib.buffer);
if (!ib.buffer)
return;
/* info->start will be added by the drawing code */
ib.offset -= start_offset;
+   } else if (sctx->b.chip_class <= CIK &&
+  r600_resource(ib.buffer)->TC_L2_dirty) {
+   /* VI reads index buffers through TC L2, so it doesn't
+* need this. */
+   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
+   r600_resource(ib.buffer)->TC_L2_dirty = false;
}
}
 
-   /* VI reads index buffers through TC L2. */
-   if (info->indexed && sctx->b.chip_class <= CIK &&
-   r600_resource(ib.buffer)->TC_L2_dirty) {
-   sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
-   r600_resource(ib.buffer)->TC_L2_dirty = false;
-   }
-
if (info->indirect) {
/* Add the buffer size for memory checking in need_cs_space. */
r600_context_add_resource_size(ctx, info->indirect);
 
if (r600_resource(info->indirect)->TC_L2_dirty) {
sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
r600_resource(info->indirect)->TC_L2_dirty = false;
}
 
if (info->indirect_params &&
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/18] radeonsi: use a clever alignment for index buffer uploads

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state_draw.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ca28f50..8fed61a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1041,28 +1041,30 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
/* Initialize the index buffer struct. */
pipe_resource_reference(&ib.buffer, sctx->index_buffer.buffer);
ib.user_buffer = sctx->index_buffer.user_buffer;
ib.index_size = sctx->index_buffer.index_size;
ib.offset = sctx->index_buffer.offset;
 
/* Translate or upload, if needed. */
/* 8-bit indices are supported on VI. */
if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
struct pipe_resource *out_buffer = NULL;
-   unsigned out_offset, start, count, start_offset;
+   unsigned out_offset, start, count, start_offset, size;
void *ptr;
 
si_get_draw_start_count(sctx, info, &start, &count);
start_offset = start * 2;
+   size = count * 2;
 
u_upload_alloc(ctx->stream_uploader, start_offset,
-   count * 2, 256,
+  size,
+  si_optimal_tcc_alignment(sctx, size),
   &out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}
 
util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
   ib.offset + 
start_offset,
   count, ptr);
 
@@ -1072,22 +1074,23 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
/* info->start will be added by the drawing code */
ib.offset = out_offset - start_offset;
ib.index_size = 2;
} else if (ib.user_buffer && !ib.buffer) {
unsigned start, count, start_offset;
 
si_get_draw_start_count(sctx, info, &start, &count);
start_offset = start * ib.index_size;
 
u_upload_data(ctx->stream_uploader, start_offset,
-  count * ib.index_size,
- 256, (char*)ib.user_buffer + start_offset,
+ count * ib.index_size,
+ sctx->screen->b.info.tcc_cache_line_size,
+ (char*)ib.user_buffer + start_offset,
  &ib.offset, &ib.buffer);
if (!ib.buffer)
return;
/* info->start will be added by the drawing code */
ib.offset -= start_offset;
} else if (sctx->b.chip_class <= CIK &&
   r600_resource(ib.buffer)->TC_L2_dirty) {
/* VI reads index buffers through TC L2, so it doesn't
 * need this. */
sctx->b.flags |= SI_CONTEXT_WRITEBACK_GLOBAL_L2;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/18] radeonsi: use a clever alignment for constant buffer uploads

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

This results in a very tiny decrease in lgkm wait cycles.
---
 src/gallium/drivers/radeon/radeon_winsys.h|  1 +
 src/gallium/drivers/radeonsi/si_descriptors.c |  4 +++-
 src/gallium/drivers/radeonsi/si_pipe.h| 15 +++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  1 +
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c |  1 +
 5 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 432550d..812c036 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -194,20 +194,21 @@ struct radeon_info {
 boolgfx_ib_pad_with_type2;
 boolhas_sdma;
 boolhas_uvd;
 uint32_tuvd_fw_version;
 uint32_tvce_fw_version;
 uint32_tme_fw_version;
 uint32_tpfp_fw_version;
 uint32_tce_fw_version;
 uint32_tvce_harvest_config;
 uint32_tclock_crystal_freq;
+uint32_ttcc_cache_line_size;
 
 /* Kernel info. */
 uint32_tdrm_major; /* version */
 uint32_tdrm_minor;
 uint32_tdrm_patchlevel;
 boolhas_userptr;
 
 /* Shader cores. */
 uint32_tr600_max_quad_pipes; /* wave size / 16 */
 uint32_tmax_shader_clock;
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 8f636af..72b33f3 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -1040,21 +1040,23 @@ static struct si_descriptors *
 si_const_buffer_descriptors(struct si_context *sctx, unsigned shader)
 {
return &sctx->descriptors[si_const_buffer_descriptors_idx(shader)];
 }
 
 void si_upload_const_buffer(struct si_context *sctx, struct r600_resource 
**rbuffer,
const uint8_t *ptr, unsigned size, uint32_t 
*const_offset)
 {
void *tmp;
 
-   u_upload_alloc(sctx->b.b.stream_uploader, 0, size, 256, const_offset,
+   u_upload_alloc(sctx->b.b.stream_uploader, 0, size,
+  si_optimal_tcc_alignment(sctx, size),
+  const_offset,
   (struct pipe_resource**)rbuffer, &tmp);
if (*rbuffer)
util_memcpy_cpu_to_le32(tmp, ptr, size);
 }
 
 static void si_set_constant_buffer(struct si_context *sctx,
   struct si_buffer_resources *buffers,
   unsigned descriptors_idx,
   uint slot, const struct pipe_constant_buffer 
*input)
 {
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index fb24bab..bee6881 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -505,11 +505,26 @@ static inline struct si_shader* si_get_vs_state(struct 
si_context *sctx)
 static inline bool si_vs_exports_prim_id(struct si_shader *shader)
 {
if (shader->selector->type == PIPE_SHADER_VERTEX)
return shader->key.part.vs.epilog.export_prim_id;
else if (shader->selector->type == PIPE_SHADER_TESS_EVAL)
return shader->key.part.tes.epilog.export_prim_id;
else
return false;
 }
 
+static inline unsigned
+si_optimal_tcc_alignment(struct si_context *sctx, unsigned upload_size)
+{
+   unsigned alignment, tcc_cache_line_size;
+
+   /* If the upload size is less than the cache line size (e.g. 16, 32),
+* the whole thing will fit into a cache line if we align it to its 
size.
+* The idea is that multiple small uploads can share a cache line.
+* If the upload size is greater, align it to the cache line size.
+*/
+   alignment = util_next_power_of_two(upload_size);
+   tcc_cache_line_size = sctx->screen->b.info.tcc_cache_line_size;
+   return MIN2(alignment, tcc_cache_line_size);
+}
+
 #endif
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index db0087c..6511c48 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -338,20 +338,21 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int 
fd)
ws->info.max_se = ws->amdinfo.num_shader_engines;
ws->info.max_sh_per_se = ws->amdinfo.num_shader_arrays_per_engine;
ws->info.has_uvd = uvd.available_rings != 0;
ws->info.uvd_fw_version =
  uvd.available_rings ? uvd_version : 0;
ws->info.vce_fw_version =
  vce.available_rings ? vce_version : 0;
ws->info.has_userptr = true;
 

[Mesa-dev] [PATCH 07/18] radeonsi: sort members of si_shader_key::part

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

and improve some comments
---
 src/gallium/drivers/radeonsi/si_shader.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index d4b57c9..b7cf7ea 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -416,43 +416,43 @@ union si_shader_part_key {
unsignedwrites_z:1;
unsignedwrites_stencil:1;
unsignedwrites_samplemask:1;
} ps_epilog;
 };
 
 struct si_shader_key {
/* Prolog and epilog flags. */
union {
struct {
-   struct si_ps_prolog_bits prolog;
-   struct si_ps_epilog_bits epilog;
-   } ps;
-   struct {
struct si_vs_prolog_bits prolog;
struct si_vs_epilog_bits epilog;
} vs;
struct {
struct si_tcs_epilog_bits epilog;
} tcs; /* tessellation control shader */
struct {
struct si_vs_epilog_bits epilog; /* same as VS */
} tes; /* tessellation evaluation shader */
struct {
struct si_gs_prolog_bits prolog;
} gs;
+   struct {
+   struct si_ps_prolog_bits prolog;
+   struct si_ps_epilog_bits epilog;
+   } ps;
} part;
 
/* These two are initially set according to the NEXT_SHADER property,
 * or guessed if the property doesn't seem correct.
 */
-   unsigned as_es:1; /* export shader */
-   unsigned as_ls:1; /* local shader */
+   unsigned as_es:1; /* export shader, which precedes GS */
+   unsigned as_ls:1; /* local shader, which precedes TCS */
 
/* Flags for monolithic compilation only. */
union {
struct {
/* One byte for every input: SI_FIX_FETCH_* enums. */
uint8_t fix_fetch[SI_MAX_ATTRIBS];
} vs;
struct {
uint64_tinputs_to_copy; /* for fixed-func TCS */
} tcs;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/18] radeonsi: use SI_MAX_ATTRIBS where it should be used

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

for consistency; no change in behavior
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
 src/gallium/drivers/radeonsi/si_pipe.c| 2 +-
 src/gallium/drivers/radeonsi/si_shader.c  | 4 ++--
 src/gallium/drivers/radeonsi/si_shader.h  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 59022ed..8f636af 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -964,21 +964,21 @@ bool si_upload_vertex_buffer_descriptors(struct 
si_context *sctx)
   velems->desc_list_byte_size, 256,
   &desc->buffer_offset,
   (struct pipe_resource**)&desc->buffer, (void**)&ptr);
if (!desc->buffer)
return false;
 
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 
-   assert(count <= SI_NUM_VERTEX_BUFFERS);
+   assert(count <= SI_MAX_ATTRIBS);
 
for (i = 0; i < count; i++) {
struct pipe_vertex_element *ve = &velems->elements[i];
struct pipe_vertex_buffer *vb;
struct r600_resource *rbuffer;
unsigned offset;
unsigned vbo_index = ve->vertex_buffer_index;
uint32_t *desc = &ptr[i*4];
 
vb = &sctx->vertex_buffer[vbo_index];
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 61bcd2c..a947bad 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -615,21 +615,21 @@ static int si_get_shader_param(struct pipe_screen* 
pscreen, unsigned shader, enu
 
switch (param) {
/* Shader limits. */
case PIPE_SHADER_CAP_MAX_INSTRUCTIONS:
case PIPE_SHADER_CAP_MAX_ALU_INSTRUCTIONS:
case PIPE_SHADER_CAP_MAX_TEX_INSTRUCTIONS:
case PIPE_SHADER_CAP_MAX_TEX_INDIRECTIONS:
case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
return 16384;
case PIPE_SHADER_CAP_MAX_INPUTS:
-   return shader == PIPE_SHADER_VERTEX ? SI_NUM_VERTEX_BUFFERS : 
32;
+   return shader == PIPE_SHADER_VERTEX ? SI_MAX_ATTRIBS : 32;
case PIPE_SHADER_CAP_MAX_OUTPUTS:
return shader == PIPE_SHADER_FRAGMENT ? 8 : 32;
case PIPE_SHADER_CAP_MAX_TEMPS:
return 256; /* Max native temporaries. */
case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
return 4096 * sizeof(float[4]); /* actually only memory limits 
this */
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
return SI_NUM_CONST_BUFFERS;
case PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS:
case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index de42778..d3e3984 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5259,37 +5259,37 @@ static unsigned si_get_max_workgroup_size(struct 
si_shader *shader)
max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
}
return max_work_group_size;
 }
 
 static void create_function(struct si_shader_context *ctx)
 {
struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct si_shader *shader = ctx->shader;
-   LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v3i32;
+   LLVMTypeRef params[SI_NUM_PARAMS + SI_MAX_ATTRIBS], v3i32;
LLVMTypeRef returns[16+32*4];
unsigned i, last_sgpr, num_params, num_return_sgprs;
unsigned num_returns = 0;
unsigned num_prolog_vgprs = 0;
 
v3i32 = LLVMVectorType(ctx->i32, 3);
 
params[SI_PARAM_RW_BUFFERS] = const_array(ctx->v16i8, 
SI_NUM_RW_BUFFERS);
params[SI_PARAM_CONST_BUFFERS] = const_array(ctx->v16i8, 
SI_NUM_CONST_BUFFERS);
params[SI_PARAM_SAMPLERS] = const_array(ctx->v8i32, SI_NUM_SAMPLERS);
params[SI_PARAM_IMAGES] = const_array(ctx->v8i32, SI_NUM_IMAGES);
params[SI_PARAM_SHADER_BUFFERS] = const_array(ctx->v4i32, 
SI_NUM_SHADER_BUFFERS);
 
switch (ctx->type) {
case PIPE_SHADER_VERTEX:
-   params[SI_PARAM_VERTEX_BUFFERS] = const_array(ctx->v16i8, 
SI_NUM_VERTEX_BUFFERS);
+   params[SI_PARAM_VERTEX_BUFFERS] = const_array(ctx->v16i8, 
SI_MAX_ATTRIBS);
params[SI_PARAM_BASE_VERTEX] = ctx->i32;
params[SI_PARAM_START_INSTANCE] = ctx->i32;
params[SI_PARAM_DRAWID] = ctx->i32;
num_params = SI_PARAM_DRAWID+1;
 
if (shader->key.as_es) {
params[ctx->param_es2gs_offset = num_params

[Mesa-dev] [PATCH 05/18] radeonsi: don't compile pure monolithic shaders asynchronously

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

there is no point, we have to wait anyway.
---
 src/gallium/drivers/radeonsi/si_state_shaders.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 9570259..3630911 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1229,45 +1229,49 @@ again:
/* Build a new shader. */
shader = CALLOC_STRUCT(si_shader);
if (!shader) {
pipe_mutex_unlock(sel->mutex);
return -ENOMEM;
}
shader->selector = sel;
shader->key = *key;
shader->compiler_ctx_state = *compiler_state;
 
+   bool is_pure_monolithic =
+   memcmp(&key->mono, &zeroed.mono, sizeof(key->mono)) != 0;
+
/* Monolithic-only shaders don't make a distinction between optimized
 * and unoptimized. */
shader->is_monolithic =
!sel->main_shader_part ||
sel->main_shader_part->key.as_ls != key->as_ls ||
sel->main_shader_part->key.as_es != key->as_es ||
-   memcmp(&key->opt, &zeroed.opt, sizeof(key->opt)) != 0 ||
-   memcmp(&key->mono, &zeroed.mono, sizeof(key->mono)) != 0;
+   is_pure_monolithic ||
+   memcmp(&key->opt, &zeroed.opt, sizeof(key->opt)) != 0;
 
shader->is_optimized =
!sscreen->use_monolithic_shaders &&
memcmp(&key->opt, &zeroed.opt, sizeof(key->opt)) != 0;
if (shader->is_optimized)
util_queue_fence_init(&shader->optimized_ready);
 
if (!sel->last_variant) {
sel->first_variant = shader;
sel->last_variant = shader;
} else {
sel->last_variant->next_variant = shader;
sel->last_variant = shader;
}
 
/* If it's an optimized shader, compile it asynchronously. */
if (shader->is_optimized &&
+   !is_pure_monolithic &&
thread_index < 0) {
/* Compile it asynchronously. */
util_queue_add_job(&sscreen->shader_compiler_queue,
   shader, &shader->optimized_ready,
   si_build_shader_variant, NULL);
 
/* Use the default (unoptimized) shader for now. */
memset(&key->opt, 0, sizeof(key->opt));
pipe_mutex_unlock(sel->mutex);
goto again;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/18] radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index d453309..8f5dcbc 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1045,21 +1045,21 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
ib.offset = sctx->index_buffer.offset;
 
/* Translate or upload, if needed. */
/* 8-bit indices are supported on VI. */
if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
struct pipe_resource *out_buffer = NULL;
unsigned out_offset, start, count, start_offset;
void *ptr;
 
si_get_draw_start_count(sctx, info, &start, &count);
-   start_offset = start * ib.index_size;
+   start_offset = start * 2;
 
u_upload_alloc(ctx->stream_uploader, start_offset,
count * 2, 256,
   &out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}
 
util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/18] radeonsi: remove the fix_size3 workaround

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

not needed with the shader fallback
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 22 --
 src/gallium/drivers/radeonsi/si_state.c   |  9 -
 src/gallium/drivers/radeonsi/si_state.h   |  5 -
 3 files changed, 36 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 3c98176..59022ed 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -947,21 +947,20 @@ bool si_upload_vertex_buffer_descriptors(struct 
si_context *sctx)
 {
struct si_vertex_element *velems = sctx->vertex_elements;
struct si_descriptors *desc = &sctx->vertex_buffers;
unsigned i, count = velems->count;
uint64_t va;
uint32_t *ptr;
 
if (!sctx->vertex_buffers_dirty || !count || !velems)
return true;
 
-   unsigned fix_size3 = velems->fix_size3;
unsigned first_vb_use_mask = velems->first_vb_use_mask;
 
/* Vertex buffer descriptors are the only ones which are uploaded
 * directly through a staging buffer and don't go through
 * the fine-grained upload path.
 */
u_upload_alloc(sctx->b.b.stream_uploader, 0,
   velems->desc_list_byte_size, 256,
   &desc->buffer_offset,
   (struct pipe_resource**)&desc->buffer, (void**)&ptr);
@@ -996,42 +995,21 @@ bool si_upload_vertex_buffer_descriptors(struct 
si_context *sctx)
desc[0] = va;
desc[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) |
  S_008F04_STRIDE(vb->stride);
 
if (sctx->b.chip_class <= CIK && vb->stride) {
/* Round up by rounding down and adding 1 */
desc[2] = (vb->buffer->width0 - offset -
   velems->format_size[i]) /
  vb->stride + 1;
} else {
-   uint32_t size3;
-
desc[2] = vb->buffer->width0 - offset;
-
-   /* For attributes of size 3 with byte or short
-* components, we use a 4-component data format.
-*
-* As a consequence, we have to round the buffer size
-* up so that the hardware sees four components as
-* being inside the buffer if and only if the first
-* three components are in the buffer.
-*
-* Since the offset and stride are guaranteed to be
-* 4-byte aligned, this alignment will never cross the
-* winsys buffer boundary.
-*/
-   size3 = (fix_size3 >> (2 * i)) & 3;
-   if (vb->stride && size3) {
-   assert(offset % 4 == 0 && vb->stride % 4 == 0);
-   assert(size3 <= 2);
-   desc[2] = align(desc[2], size3 * 2);
-   }
}
 
desc[3] = velems->rsrc_word3[i];
 
if (first_vb_use_mask & (1 << i)) {
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  (struct r600_resource*)vb->buffer,
  RADEON_USAGE_READ, 
RADEON_PRIO_VERTEX_BUFFER);
}
}
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 024de8b..f53f8dd 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3474,29 +3474,20 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
v->fix_fetch[i] = SI_FIX_FETCH_RGB_16;
}
}
 
v->rsrc_word3[i] = 
S_008F0C_DST_SEL_X(si_map_swizzle(swizzle[0])) |
   
S_008F0C_DST_SEL_Y(si_map_swizzle(swizzle[1])) |
   
S_008F0C_DST_SEL_Z(si_map_swizzle(swizzle[2])) |
   
S_008F0C_DST_SEL_W(si_map_swizzle(swizzle[3])) |
   S_008F0C_NUM_FORMAT(num_format) |
   S_008F0C_DATA_FORMAT(data_format);
-
-   /* We work around the fact that 8_8_8 and 16_16_16 data formats
-* do not exist by using the corresponding 4-component formats.
-* This requires a fixup of the descriptor for bounds checks.
-*/
-   if (desc->block.bits == 3 * 8 ||
-   desc->block.bits == 3 * 16) {
-   v->fix_size3 |= (desc->block.bits / 24) << (2 * i);
-   }
}
mem

[Mesa-dev] [PATCH 01/18] radeonsi: make fix_fetch an array of uint8_t

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

so that we can add 3-component fallbacks.
---
 src/gallium/drivers/radeonsi/si_shader.c|  8 +--
 src/gallium/drivers/radeonsi/si_shader.h|  5 ++---
 src/gallium/drivers/radeonsi/si_state.c | 28 -
 src/gallium/drivers/radeonsi/si_state.h |  2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |  5 ++---
 5 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index cfff54a..8b9fed9 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -359,21 +359,21 @@ static void declare_input_vs(
t_list_ptr = LLVMGetParam(ctx->main_fn, SI_PARAM_VERTEX_BUFFERS);
 
t_offset = lp_build_const_int32(gallivm, input_index);
 
t_list = ac_build_indexed_load_const(&ctx->ac, t_list_ptr, t_offset);
 
vertex_index = LLVMGetParam(ctx->main_fn,
ctx->param_vertex_index0 +
input_index);
 
-   fix_fetch = (ctx->shader->key.mono.vs.fix_fetch >> (4 * input_index)) & 
0xf;
+   fix_fetch = ctx->shader->key.mono.vs.fix_fetch[input_index];
 
/* Do multiple loads for double formats. */
if (fix_fetch == SI_FIX_FETCH_RGB_64_FLOAT) {
num_fetches = 3; /* 3 2-dword loads */
fetch_stride = 8;
} else if (fix_fetch == SI_FIX_FETCH_RGBA_64_FLOAT) {
num_fetches = 2; /* 2 4-dword loads */
fetch_stride = 16;
} else {
num_fetches = 1;
@@ -6263,21 +6263,25 @@ static void si_dump_shader_key(unsigned shader, struct 
si_shader_key *key,
switch (shader) {
case PIPE_SHADER_VERTEX:
fprintf(f, "  part.vs.prolog.instance_divisors = {");
for (i = 0; i < 
ARRAY_SIZE(key->part.vs.prolog.instance_divisors); i++)
fprintf(f, !i ? "%u" : ", %u",
key->part.vs.prolog.instance_divisors[i]);
fprintf(f, "}\n");
fprintf(f, "  part.vs.epilog.export_prim_id = %u\n", 
key->part.vs.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
fprintf(f, "  as_ls = %u\n", key->as_ls);
-   fprintf(f, "  mono.vs.fix_fetch = 0x%"PRIx64"\n", 
key->mono.vs.fix_fetch);
+
+   fprintf(f, "  mono.vs.fix_fetch = {");
+   for (i = 0; i < SI_MAX_ATTRIBS; i++)
+   fprintf(f, !i ? "%u" : ", %u", 
key->mono.vs.fix_fetch[i]);
+   fprintf(f, "}\n");
break;
 
case PIPE_SHADER_TESS_CTRL:
fprintf(f, "  part.tcs.epilog.prim_mode = %u\n", 
key->part.tcs.epilog.prim_mode);
fprintf(f, "  mono.tcs.inputs_to_copy = 0x%"PRIx64"\n", 
key->mono.tcs.inputs_to_copy);
break;
 
case PIPE_SHADER_TESS_EVAL:
fprintf(f, "  part.tes.epilog.export_prim_id = %u\n", 
key->part.tes.epilog.export_prim_id);
fprintf(f, "  as_es = %u\n", key->as_es);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 6398b39..4616190 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -243,21 +243,20 @@ enum {
SI_FIX_FETCH_RGBX_32_UNORM,
SI_FIX_FETCH_RGBA_32_SNORM,
SI_FIX_FETCH_RGBX_32_SNORM,
SI_FIX_FETCH_RGBA_32_USCALED,
SI_FIX_FETCH_RGBA_32_SSCALED,
SI_FIX_FETCH_RGBA_32_FIXED,
SI_FIX_FETCH_RGBX_32_FIXED,
SI_FIX_FETCH_RG_64_FLOAT,
SI_FIX_FETCH_RGB_64_FLOAT,
SI_FIX_FETCH_RGBA_64_FLOAT,
-   SI_FIX_FETCH_RESERVED_15, /* maximum */
 };
 
 struct si_shader;
 
 /* State of the context creating the shader object. */
 struct si_compiler_ctx_state {
/* Should only be used by si_init_shader_selector_async and
 * si_build_shader_variant if thread_index == -1 (non-threaded). */
LLVMTargetMachineReftm;
 
@@ -438,22 +437,22 @@ struct si_shader_key {
 
/* These two are initially set according to the NEXT_SHADER property,
 * or guessed if the property doesn't seem correct.
 */
unsigned as_es:1; /* export shader */
unsigned as_ls:1; /* local shader */
 
/* Flags for monolithic compilation only. */
union {
struct {
-   /* One nibble for every input: SI_FIX_FETCH_* enums. */
-   uint64_tfix_fetch;
+   /* One byte for every input: SI_FIX_FETCH_* enums. */
+   uint8_t fix_fetch[SI_MAX_ATTRIBS];
} vs;
struct {
uint64_tinputs_to_copy; /* for fixed-func TCS */
} tcs;
} mono;
 
/* Optimization flags for 

[Mesa-dev] [PATCH 04/18] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI

2017-02-16 Thread Marek Olšák
From: Marek Olšák 

So that we can disable u_vbuf for GL core profiles.

This is a v2 of the previous VI-only patch.
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.
---
 src/gallium/drivers/radeonsi/si_pipe.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 2dc884a..61bcd2c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -353,23 +353,20 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
case PIPE_CAP_SM3:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_PRIMITIVE_RESTART:
case PIPE_CAP_CONDITIONAL_RENDER:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_INDEP_BLEND_ENABLE:
case PIPE_CAP_INDEP_BLEND_FUNC:
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
-   case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
-   case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
-   case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_USER_INDEX_BUFFERS:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_START_INSTANCE:
case PIPE_CAP_NPOT_TEXTURES:
case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
 case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
case PIPE_CAP_TGSI_INSTANCEID:
@@ -455,20 +452,29 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (si_have_tgsi_compute(sscreen))
return 450;
return HAVE_LLVM >= 0x0309 ? 420 :
   HAVE_LLVM >= 0x0307 ? 410 : 330;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
 
+   case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
+   case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
+   case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
+   /* SI doesn't support unaligned loads.
+* CIK needs DRM 2.50.0 on radeon. */
+   return sscreen->b.chip_class == SI ||
+  (sscreen->b.info.drm_major == 2 &&
+   sscreen->b.info.drm_minor < 50);
+
/* Unsupported features. */
case PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY:
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_USER_VERTEX_BUFFERS:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
case PIPE_CAP_VERTEXID_NOBASE:
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >