date:20160929

Re: [Mesa-dev] [PATCH] glsl: optimize copy_propagation_elements pass

2016-09-29 Thread Tapani Pälli



On 09/28/2016 06:14 PM, Ian Romanick wrote:

On 09/16/2016 06:21 PM, Tapani Pälli wrote:

Changes make copy_propagation_elements pass faster, reducing link
time spent in test case of bug 94477. Does not fix the actual issue
but brings down the total time. No regressions seen in CI.


How does this affect the time of a full shader-db run?


Almost not at all. These are the results for the open-source shaders (100 runs):

Difference at 95.0% confidence
0.0312 +/- 0.00502746
1.72566% +/- 0.278068%
(Student's t, pooled s = 0.0181375)

(testing with DOTA-2 shaders gave very similar result)

My assumption is that this really helps only the most pathological cases,
like the one in the bug, where the list size becomes enormous (thousands of
entries). With just a few entries, the list is 'fast enough' to walk through
anyway (?)


BTW Eric was proposing to just remove this pass. However, when testing
what happens on removal I noticed there are functional failures
(arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform starts to
fail), so it seems we currently depend on this pass.



There are a bunch of bits of this that are confusing to me.  I think
some high-level explanation about which hash tables the acp_ref can be
in, which lists it can be in, and how they relate would help.  I've
pointed out a couple of the confusing bits below.


Signed-off-by: Tapani Pälli 
---

For performance measurements, Martina reported in the bug 8x speedup
to the test case shader link time when using this patch together with
commit 2cd02e30d2e1677762d34f1831b8e609970ef0f3

 .../glsl/opt_copy_propagation_elements.cpp | 187 -
 1 file changed, 145 insertions(+), 42 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp 
b/src/compiler/glsl/opt_copy_propagation_elements.cpp
index e4237cc..1c5060a 100644
--- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
+++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
@@ -46,6 +46,7 @@
 #include "ir_basic_block.h"
 #include "ir_optimization.h"
 #include "compiler/glsl_types.h"
+#include "util/hash_table.h"

 static bool debug = false;

@@ -76,6 +77,18 @@ public:
int swizzle[4];
 };

+/* Class that refers to acp_entry in another exec_list. Used
+ * when making removals based on rhs.
+ */
+class acp_ref : public exec_node


This pattern is called a box, so maybe acp_box would be a better name.
I'm not too hung up on it.

With this change, can the acp_entry itself still be in a list?


The idea here is a class that only refers to an acp_entry but does not
take any ownership, so it's really just a list of pointers. I'm OK
with renaming it.
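
To make the "list of pointers without ownership" idea concrete, here is a minimal sketch in plain C. It does not use Mesa's exec_list or ralloc; the names `entry`, `entry_ref`, `ref_list_push`, `ref_list_count_rhs` and `ref_list_free` are hypothetical stand-ins chosen for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for acp_entry: owned elsewhere (here, by the caller). */
struct entry {
   int lhs;
   int rhs;
};

/* Stand-in for acp_ref: a list node that only points at an entry. */
struct entry_ref {
   struct entry *entry;      /* borrowed, never freed through the ref */
   struct entry_ref *next;
};

/* Prepend a reference to `e` onto the list and return the new head. */
static struct entry_ref *
ref_list_push(struct entry_ref *head, struct entry *e)
{
   struct entry_ref *ref = malloc(sizeof(*ref));
   assert(ref != NULL);
   ref->entry = e;
   ref->next = head;
   return ref;
}

/* Count the references whose entry reads from `rhs`. */
static int
ref_list_count_rhs(const struct entry_ref *head, int rhs)
{
   int n = 0;
   for (; head; head = head->next)
      if (head->entry->rhs == rhs)
         n++;
   return n;
}

/* Free only the reference nodes; the entries themselves stay alive. */
static void
ref_list_free(struct entry_ref *head)
{
   while (head) {
      struct entry_ref *next = head->next;
      free(head);
      head = next;
   }
}
```

Destroying the reference list leaves every entry intact, which is the property that lets the same entry be reachable from more than one container.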



+{
+public:
+   acp_ref(acp_entry *e)
+   {
+  entry = e;
+   }
+   acp_entry *entry;
+};

 class kill_entry : public exec_node
 {
@@ -98,14 +111,42 @@ public:
   this->killed_all = false;
   this->mem_ctx = ralloc_context(NULL);
   this->shader_mem_ctx = NULL;
-  this->acp = new(mem_ctx) exec_list;
   this->kills = new(mem_ctx) exec_list;
+
+  create_acp();
}
~ir_copy_propagation_elements_visitor()
{
   ralloc_free(mem_ctx);
}

+   void create_acp()
+   {
+  lhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
+   _mesa_key_pointer_equal);
+  rhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
+   _mesa_key_pointer_equal);
+   }
+
+   void destroy_acp()
+   {
+  _mesa_hash_table_destroy(lhs_ht, NULL);
+  _mesa_hash_table_destroy(rhs_ht, NULL);
+   }
+
+   void populate_acp(hash_table *lhs, hash_table *rhs)
+   {
+  struct hash_entry *entry;
+  hash_table_foreach(lhs, entry)
+  {


Opening { on the hash_table_foreach line.


+ _mesa_hash_table_insert(lhs_ht, entry->key, entry->data);
+  }


Blank line here.


+  hash_table_foreach(rhs, entry)
+  {


Opening { on the hash_table_foreach line.


+ _mesa_hash_table_insert(rhs_ht, entry->key, entry->data);
+  }
+   }


thanks, will fix these


+
void handle_loop(ir_loop *, bool keep_acp);
virtual ir_visitor_status visit_enter(class ir_loop *);
virtual ir_visitor_status visit_enter(class ir_function_signature *);
@@ -120,8 +161,10 @@ public:
void kill(kill_entry *k);
void handle_if_block(exec_list *instructions);

-   /** List of acp_entry: The available copies to propagate */
-   exec_list *acp;
+   /** Hash of acp_entry: The available copies to propagate */
+   hash_table *lhs_ht;
+   hash_table *rhs_ht;
+
/**
 * List of kill_entry: The variables whose values were killed in this
 * block.
@@ -147,23 +190,29 @@ 
ir_copy_propagation_elements_visitor::visit_enter(ir_function_signature *ir)
 * block.  Any instructions at global scope will be shuffled into
 * main() at link time, so they're irrelevant to us.
 */
-   exec_list *orig_acp = this->acp;
exec_list *orig_kills = this->kills;
bool orig_killed_all =

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-29 Thread haagch+mesadev

On 29.09.2016 01:45, Andy Furniss wrote:
> Mark Thompson wrote:
>> On 28/09/16 11:47, Andy Furniss wrote:
>>> Andy Furniss wrote:
>>>> Andy Furniss wrote:
>>>>
>>>>> I started to get some numbers but found a possible bug = I
>>>>> made a 1000 frame 15mbit 1080p50 mkv with ffmpeg/libx264.
>>>>>
>>>>> Using it as source for transcode (s/w or h/w dec) it came out
>>>>> far too low bitrate.
>>>>>
>>>>> After re-applying debugging patch to mesa it turns out that
>>>>> framerate is being set as 1000 in the encoder, I don't know if
>>>>> the file is duff or if it's the patches.
>>>>>
>>>>> more tomorrow.
>>>>
>>>> Still happens with your libav patch 1 and 2 reversed.
>>>>
>>>> No matter what I try it seems that mkv + hwupload will be seen
>>>> as 1000 fps.
>>>>
>>>> With mp4 it's 25fps.
>>>
>>> Oops ignore the mp4 bit, it is possible to get that to work - no
>>> luck with mkv so far though.
>>>
>>>>
>>>> Also the same with a mkv not made by me, these file all play at
>>>> correct rate and are seen by avconv -i or mediainfo as correct.
>>>>
>>>> I guess there is a difference between reading the fps from the
>>>> container vs reading from the stream.
>>
>> VAAPI CBR does not support VFR input; you end up with the 1000fps
>> because that's the time base inside your input file and we have no
>> way of knowing at init-time what the frame timestamps are going to
>> be.  To make it work, either use CQP (which works sensibly with VFR
>> streams if the container supports them), or make the stream CFR -
>> this is probably easiest by adding "-r 42" (or similar) as an output
>> option.  The mp4 output will not exhibit the problem because the
>> muxer declares in advance that it is CFR-only, so avconv already
>> forces the time base and timestamps into the right state for it.
>
> Ugh, yea, after spending far too much time messing around with input it
> was output format that made the difference.
>
> On benchmarking - your 30 vs 250 does seem low. If you haven't tried
> setting gpu and cpu to perf maybe that will help. It helps with my tonga
> sometimes.
>
> If it's a dual GPU system that may add complications as I recall seeing
> a bug report where someone had issues with prime + powerplay/dpm.  I
> can't remember what the GPU was though.

Perhaps that was my bug at https://bugs.freedesktop.org/show_bug.cgi?id=97075
tl;dr: If you're affected, you get much better VCE performance by just running e.g.

    vblank_mode=0 glxgears -info

in the background.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 97957] Awful screen tearing in a separate X server with DRI3

2016-09-29 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97957

Chris Wilson  changed:

           What    |Removed                             |Added
----------------------------------------------------------------------------
          Component|Driver/intel                        |Mesa core
           Assignee|ch...@chris-wilson.co.uk            |mesa-dev@lists.freedesktop.org
            Product|xorg                                |Mesa
         QA Contact|intel-gfx-bugs@lists.freedesktop.org|mesa-dev@lists.freedesktop.org

--- Comment #5 from Chris Wilson  ---
Fix
https://cgit.freedesktop.org/~ickle/mesa/commit/?h=brw-batch&id=8ffee986f537888921716ab632236ea4c55fb0f1

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 97957] Awful screen tearing in a separate X server with DRI3

2016-09-29 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97957

Chris Wilson  changed:

   What|Removed |Added

 CC||kostas...@gmail.com

--- Comment #6 from Chris Wilson  ---
*** Bug 97173 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] st/mesa: enable GL_KHR_robustness

2016-09-29 Thread Nicolai Hähnle


On 29.09.2016 00:00, Bas Nieuwenhuizen wrote:

On Wed, Sep 28, 2016 at 6:27 PM, Nicolai Hähnle  wrote:

On 28.09.2016 16:20, Ilia Mirkin wrote:


On Wed, Sep 28, 2016 at 6:25 AM, Nicolai Hähnle 
wrote:


From: Nicolai Hähnle 

The difference to the virtually identical ARB_robustness (which is already
enabled unconditionally) is minuscule and handled elsewhere, but this set
of caps seems like the right thing to require for this extension.
---
 docs/features.txt  | 2 +-
 docs/relnotes/12.1.0.html  | 1 +
 src/mesa/state_tracker/st_extensions.c | 4 ++++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/features.txt b/docs/features.txt
index fbb3952..52d194e 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -211,21 +211,21 @@ GL 4.5, GLSL 4.50:
   GL_ARB_ES3_1_compatibilityDONE (i965/hsw+,
nvc0, radeonsi)
   GL_ARB_clip_control   DONE (i965,
nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_invertedDONE (i965,
nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance  DONE (i965,
nv50, nvc0, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_derivative_control DONE (i965,
nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_accessDONE (all
drivers)
   GL_ARB_get_texture_sub_image  DONE (all
drivers)
   GL_ARB_shader_texture_image_samples   DONE (i965,
nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrierDONE (i965,
nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control  DONE (all - but
needs GLX/EGL extension to be useful)
-  GL_KHR_robustness DONE (i965)
+  GL_KHR_robustness DONE (i965,
radeonsi)
   GL_EXT_shader_integer_mix DONE (all
drivers that support GLSL)

 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi

   GL_ARB_arrays_of_arrays   DONE (all
drivers that support GLSL 1.30)
   GL_ARB_compute_shader DONE
(i965/gen7+, softpipe)
   GL_ARB_draw_indirect  DONE
(i965/gen7+, r600, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location  DONE (all
drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments DONE
(i965/gen7+, r600, softpipe)
diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
index cdd8909..0c99f19 100644
--- a/docs/relnotes/12.1.0.html
+++ b/docs/relnotes/12.1.0.html
@@ -52,20 +52,21 @@ Note: some of the new features are only available
with certain drivers.
 GL_ARB_cull_distance on radeonsi
 GL_ARB_enhanced_layouts on i965
 GL_ARB_indirect_parameters on radeonsi
 GL_ARB_shader_draw_parameters on radeonsi
 GL_ARB_shader_group_vote on nvc0
 GL_ARB_shader_viewport_layer_array on i965/gen6+
 GL_ARB_stencil_texturing on i965/hsw
 GL_ARB_texture_stencil8 on i965/hsw
 GL_EXT_window_rectangles on nv50, nvc0
 GL_KHR_blend_equation_advanced on i965
+GL_KHR_robustness on radeonsi
 GL_KHR_texture_compression_astc_sliced_3d on i965
 GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe,
llvmpipe
 GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
 GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi
 GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
 GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi
 GL_OES_viewport_array on nvc0, radeonsi
 GL_ANDROID_extension_pack_es31a on i965/gen9+
 

diff --git a/src/mesa/state_tracker/st_extensions.c
b/src/mesa/state_tracker/st_extensions.c
index 4f42217..4789f2c 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1192,20 +1192,24 @@ void st_init_extensions(struct pipe_screen
*screen,
 consts->MaxComputeWorkGroupCount[i] = grid_size[i];
 consts->MaxComputeWorkGroupSize[i] = block_size[i];
  }

  extensions->ARB_compute_shader =

extensions->ARB_shader_image_load_store &&

extensions->ARB_shader_atomic_counters;
   }
}

+   extensions->KHR_robustness =
+  extensions->ARB_robust_buffer_access_behavior &&
+  screen->get_param(screen, PIPE_CAP_DEVICE_RESET_STATUS_QUERY);



Is that necessary? From what I can tell, a no-op implementation is
sufficient for KHR_robustness.



The extension does talk about buffer robustness. For the reset notification
business, I also don't think the spec really has a lot of teeth. I just
thought that checking this cap is what's most in line with the spirit of the
extension.


I think it has some teeth though:

"After a graphics reset has occurred on a context, subsequent GL commands
on that context (or any context which sha

Re: [Mesa-dev] [PATCH] st/mesa: enable GL_KHR_robustness

2016-09-29 Thread Bas Nieuwenhuizen

On Thu, Sep 29, 2016 at 10:20 AM, Nicolai Hähnle  wrote:
> On 29.09.2016 00:00, Bas Nieuwenhuizen wrote:
>>
>> On Wed, Sep 28, 2016 at 6:27 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> On 28.09.2016 16:20, Ilia Mirkin wrote:


 On Wed, Sep 28, 2016 at 6:25 AM, Nicolai Hähnle 
 wrote:
>
>
> From: Nicolai Hähnle 
>
> The difference to the virtually identical ARB_robustness (which is already
> enabled unconditionally) is minuscule and handled elsewhere, but this set
> of caps seems like the right thing to require for this extension.
> ---
>  docs/features.txt  | 2 +-
>  docs/relnotes/12.1.0.html  | 1 +
>  src/mesa/state_tracker/st_extensions.c | 4 
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/docs/features.txt b/docs/features.txt
> index fbb3952..52d194e 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -211,21 +211,21 @@ GL 4.5, GLSL 4.50:
>GL_ARB_ES3_1_compatibilityDONE
> (i965/hsw+,
> nvc0, radeonsi)
>GL_ARB_clip_control   DONE (i965,
> nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
>GL_ARB_conditional_render_invertedDONE (i965,
> nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
>GL_ARB_cull_distance  DONE (i965,
> nv50, nvc0, radeonsi, llvmpipe, softpipe, swr)
>GL_ARB_derivative_control DONE (i965,
> nv50, nvc0, r600, radeonsi)
>GL_ARB_direct_state_accessDONE (all
> drivers)
>GL_ARB_get_texture_sub_image  DONE (all
> drivers)
>GL_ARB_shader_texture_image_samples   DONE (i965,
> nv50, nvc0, r600, radeonsi)
>GL_ARB_texture_barrierDONE (i965,
> nv50, nvc0, r600, radeonsi)
>GL_KHR_context_flush_control  DONE (all -
> but
> needs GLX/EGL extension to be useful)
> -  GL_KHR_robustness DONE (i965)
> +  GL_KHR_robustness DONE (i965,
> radeonsi)
>GL_EXT_shader_integer_mix DONE (all
> drivers that support GLSL)
>
>  These are the extensions cherry-picked to make GLES 3.1
>  GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
>
>GL_ARB_arrays_of_arrays   DONE (all
> drivers that support GLSL 1.30)
>GL_ARB_compute_shader DONE
> (i965/gen7+, softpipe)
>GL_ARB_draw_indirect  DONE
> (i965/gen7+, r600, llvmpipe, softpipe, swr)
>GL_ARB_explicit_uniform_location  DONE (all
> drivers that support GLSL)
>GL_ARB_framebuffer_no_attachments DONE
> (i965/gen7+, r600, softpipe)
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index cdd8909..0c99f19 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html
> @@ -52,20 +52,21 @@ Note: some of the new features are only available
> with certain drivers.
>  GL_ARB_cull_distance on radeonsi
>  GL_ARB_enhanced_layouts on i965
>  GL_ARB_indirect_parameters on radeonsi
>  GL_ARB_shader_draw_parameters on radeonsi
>  GL_ARB_shader_group_vote on nvc0
>  GL_ARB_shader_viewport_layer_array on i965/gen6+
>  GL_ARB_stencil_texturing on i965/hsw
>  GL_ARB_texture_stencil8 on i965/hsw
>  GL_EXT_window_rectangles on nv50, nvc0
>  GL_KHR_blend_equation_advanced on i965
> +GL_KHR_robustness on radeonsi
>  GL_KHR_texture_compression_astc_sliced_3d on i965
>  GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe,
> llvmpipe
>  GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
>  GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi
>  GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
>  GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi
>  GL_OES_viewport_array on nvc0, radeonsi
>  GL_ANDROID_extension_pack_es31a on i965/gen9+
>  
>
> diff --git a/src/mesa/state_tracker/st_extensions.c
> b/src/mesa/state_tracker/st_extensions.c
> index 4f42217..4789f2c 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -1192,20 +1192,24 @@ void st_init_extensions(struct pipe_screen
> *screen,
>  consts->MaxComputeWorkGroupCount[i] = grid_size[i];
>  consts->MaxComputeWorkGroupSize[i] = block_size[i];
>   }
>
>   extensions->ARB_compute_shader =
>
> extensions->ARB_shader_image_load_store &&
>
>

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Nicolai Hähnle


On 28.09.2016 18:49, Marek Olšák wrote:

From: Marek Olšák 

More info about jemalloc:
   https://github.com/jemalloc/jemalloc/wiki/History

Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
bytecode:
   glibc:17.183s
   jemalloc: 15.558s
   diff: -9.5%

The diff is -10.5% for a full shader-db run.
---

TODO: The jemalloc dependency should be added to configure.ac before this.

We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.

Right now, I just use: export LDFLAGS=-ljemalloc


Sounds good to me. It should probably be a configurable option, 
defaulting to jemalloc and failing if not available unless explicitly 
disabled.


On the Gallium side of things, switching to jemalloc could be pretty 
straightforward via the macros in u_memory.h, once we know that they're 
actually used consistently (which we currently don't -- it would be nice 
to know how jemalloc and glibc malloc react when the calls are mixed).


Cheers,
Nicolai


 src/util/jemalloc_wrapper.h | 95 +
 src/util/ralloc.c   |  7 ++--
 2 files changed, 99 insertions(+), 3 deletions(-)
 create mode 100644 src/util/jemalloc_wrapper.h

diff --git a/src/util/jemalloc_wrapper.h b/src/util/jemalloc_wrapper.h
new file mode 100644
index 0000000..d994576
--- /dev/null
+++ b/src/util/jemalloc_wrapper.h
@@ -0,0 +1,95 @@
+/*
+ * Copyright © 2016 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef JEMALLOC_WRAPPER_H
+#define JEMALLOC_WRAPPER_H
+
+#if defined(_WIN32) || defined(WIN32)
+
+/* Use a fallback on Windows. */
+#include <stdlib.h>
+#define _mesa_jemalloc malloc
+#define _mesa_jezalloc(size) calloc(1, size)
+#define _mesa_jecalloc calloc
+#define _mesa_jerealloc realloc
+#define _mesa_jefree free
+
+#else /* Linux, BSD, etc. */
+
+#include "macros.h"
+#include <jemalloc/jemalloc.h>
+
+/* Standard function names (malloc, etc.) can't be used, because glibc is
+ * loaded first, so libjemalloc can't override them. Instead, alternative
+ * jemalloc functions are used, which do the same thing.
+ *
+ * - mallocx = like malloc
+ * - rallocx = like realloc
+ * - dallocx = like free
+ */
+
+static inline void *
+_mesa_jemalloc(size_t size)
+{
+   if (unlikely(!size))
+  size = 1;
+
+   return mallocx(size, 0);
+}
+
+static inline void *
+_mesa_jezalloc(size_t size)
+{
+   if (unlikely(!size))
+  size = 1;
+
+   return mallocx(size, MALLOCX_ZERO);
+}
+
+static inline void *
+_mesa_jecalloc(size_t num, size_t size)
+{
+   /* TODO:
+*   Check for overflow if needed.
+*   See jemalloc, which seems to use a compatible license.
+*/
+   return _mesa_jezalloc(num * size);
+}
+
+static inline void *
+_mesa_jerealloc(void *ptr, size_t size)
+{
+   if (unlikely(!size))
+  size = 1;
+
+   return rallocx(ptr, size, 0);
+}
+
+static inline void
+_mesa_jefree(void *ptr)
+{
+   dallocx(ptr, 0);
+}
+
+#endif /* OS/Platform check */
+#endif /* JEMALLOC_WRAPPER_H */
diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index 9526011..d3897ed 100644
--- a/src/util/ralloc.c
+++ b/src/util/ralloc.c
@@ -33,20 +33,21 @@
 #include 
 #endif

 /* Some versions of MinGW are missing _vscprintf's declaration, although they
  * still provide the symbol in the import library. */
 #ifdef __MINGW32__
 _CRTIMP int _vscprintf(const char *format, va_list argptr);
 #endif

 #include "ralloc.h"
+#include "jemalloc_wrapper.h"

 #ifndef va_copy
 #ifdef __va_copy
 #define va_copy(dest, src) __va_copy((dest), (src))
 #else
 #define va_copy(dest, src) (dest) = (src)
 #endif
 #endif

 #define CANARY 0x5A1106
@@ -115,21 +116,21 @@ ralloc_size(const void *ctx, size_t size)
 *
 * TODO: Make ralloc_size not zero fill memory, and cleanup any code that
 * should instead be
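
The `_mesa_jecalloc` wrapper in the patch above carries a TODO about checking `num * size` for overflow. One common way to do that check is sketched below. This is only an illustration: `mul_overflows` and `checked_calloc` are hypothetical names, and the standard `calloc` stands in for the patch's `_mesa_jezalloc` so that the example builds without jemalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if num * size does not fit in size_t; otherwise store the
 * product in *total and return 0.
 */
static int
mul_overflows(size_t num, size_t size, size_t *total)
{
   assert(total != NULL);
   if (size != 0 && num > SIZE_MAX / size)
      return 1;
   *total = num * size;
   return 0;
}

/* calloc-like helper that refuses requests whose byte count overflows.
 * Assumption: plain calloc() is used here in place of _mesa_jezalloc().
 */
static void *
checked_calloc(size_t num, size_t size)
{
   size_t total;

   if (mul_overflows(num, size, &total))
      return NULL;
   if (total == 0)
      total = 1;   /* mirror the wrapper's "never allocate 0 bytes" rule */

   return calloc(1, total);
}
```

The division-based test avoids performing the multiplication until it is known to be safe, so it does not depend on compiler builtins.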

Re: [Mesa-dev] [PATCH 3/3] radeonsi: remove cb0_is_integer handling

2016-09-29 Thread Nicolai Hähnle


This patch is

Reviewed-by: Nicolai Hähnle 

though I wonder if it breaks nine. If it does, nine should be fixed in
the same way as st/mesa.


On 28.09.2016 15:44, Marek Olšák wrote:

From: Marek Olšák 

st/mesa does this for us.
---
 src/gallium/drivers/radeonsi/si_pipe.h  | 1 -
 src/gallium/drivers/radeonsi/si_state.c | 9 +--------
 src/gallium/drivers/radeonsi/si_state_shaders.c | 6 ++----
 3 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 1080e72..ef25df8 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -158,21 +158,20 @@ struct si_images_info {
struct pipe_image_view  views[SI_NUM_IMAGES];
uint32_tcompressed_colortex_mask;
unsignedenabled_mask;
 };

 struct si_framebuffer {
struct r600_atomatom;
struct pipe_framebuffer_state   state;
unsignednr_samples;
unsignedlog_samples;
-   unsignedcb0_is_integer;
unsignedcompressed_cb_mask;
unsignedspi_shader_col_format;
unsignedspi_shader_col_format_alpha;
unsignedspi_shader_col_format_blend;
unsignedspi_shader_col_format_blend_alpha;
unsignedcolor_is_int8; /* bitmask */
unsigneddirty_cbufs;
booldirty_zsbuf;
boolany_dst_linear;
 };
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 1703e42..1d6bf06 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1125,22 +1125,21 @@ static void si_emit_db_render_state(struct si_context 
*sctx, struct r600_atom *s
radeon_emit(cs, S_028004_ZPASS_INCREMENT_DISABLE(1));
}
}

/* DB_RENDER_OVERRIDE2 */
radeon_set_context_reg(cs, R_028010_DB_RENDER_OVERRIDE2,

S_028010_DISABLE_ZMASK_EXPCLEAR_OPTIMIZATION(sctx->db_depth_disable_expclear) |

S_028010_DISABLE_SMEM_EXPCLEAR_OPTIMIZATION(sctx->db_stencil_disable_expclear) |
S_028010_DECOMPRESS_Z_ON_FLUSH(sctx->framebuffer.nr_samples >= 
4));

-   db_shader_control = 
S_02880C_ALPHA_TO_MASK_DISABLE(sctx->framebuffer.cb0_is_integer) |
-   sctx->ps_db_shader_control;
+   db_shader_control = sctx->ps_db_shader_control;

/* Bug workaround for smoothing (overrasterization) on SI. */
if (sctx->b.chip_class == SI && sctx->smoothing_enabled) {
db_shader_control &= C_02880C_Z_ORDER;
db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
}

/* Disable the gl_SampleMask fragment shader output if MSAA is 
disabled. */
if (sctx->framebuffer.nr_samples <= 1 || (rs && 
!rs->multisample_enable))
db_shader_control &= C_02880C_MASK_EXPORT_ENABLE;
@@ -2245,21 +2244,20 @@ static void si_dec_framebuffer_counters(const struct 
pipe_framebuffer_state *sta
}
 }

 static void si_set_framebuffer_state(struct pipe_context *ctx,
 const struct pipe_framebuffer_state *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
struct pipe_constant_buffer constbuf = {0};
struct r600_surface *surf = NULL;
struct r600_texture *rtex;
-   bool old_cb0_is_integer = sctx->framebuffer.cb0_is_integer;
bool old_any_dst_linear = sctx->framebuffer.any_dst_linear;
unsigned old_nr_samples = sctx->framebuffer.nr_samples;
int i;

for (i = 0; i < sctx->framebuffer.state.nr_cbufs; i++) {
if (!sctx->framebuffer.state.cbufs[i])
continue;

rtex = (struct 
r600_texture*)sctx->framebuffer.state.cbufs[i]->texture;
if (rtex->dcc_gather_statistics)
@@ -2290,27 +2288,22 @@ static void si_set_framebuffer_state(struct 
pipe_context *ctx,

sctx->framebuffer.spi_shader_col_format = 0;
sctx->framebuffer.spi_shader_col_format_alpha = 0;
sctx->framebuffer.spi_shader_col_format_blend = 0;
sctx->framebuffer.spi_shader_col_format_blend_alpha = 0;
sctx->framebuffer.color_is_int8 = 0;

sctx->framebuffer.compressed_cb_mask = 0;
sctx->framebuffer.nr_samples = util_framebuffer_get_num_samples(state);
sctx->framebuffer.log_samples = 
util_logbase2(sctx->framebuffer.nr_samples);
-   sctx->framebuffer.cb0_is_integer = state->nr_cbufs && state->cbufs[0] &&
- 
util_format_is_pure_i

Re: [Mesa-dev] [PATCH 3/3] radeonsi: remove cb0_is_integer handling

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 11:23 AM, Nicolai Hähnle  wrote:
> This patch is
>
> Reviewed-by: Nicolai Hähnle 
>
> though I wonder if it breaks nine. If it does, it should be up to nine to be
> fixed similarly to st/mesa, though.

Nine (DX9) doesn't use (doesn't support) integer textures.

Marek

Re: [Mesa-dev] [PATCH 25/95] i965/vec4: fix base offset for nir_registers with doubles

2016-09-29 Thread Iago Toral

On Wed, 2016-08-17 at 14:16 -0700, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > 
> > On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:
> > > 
> > > Iago Toral Quiroga  writes:
> > > 
> > > > 
> > > > 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > index cf35f2e..fde7b60 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest
> > > > &dest)
> > > >    nir_ssa_values[dest.ssa.index] = dst;
> > > >    return dst;
> > > > } else {
> > > > -  return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > dest.reg.base_offset,
> > > > +  unsigned base_offset = dest.reg.base_offset *
> > > > dest.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +  return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > base_offset,
> > > >   dest.reg.indirect);
> > > > }
> > > >  }
> > > > @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src
> > > > &src,
> > > > enum brw_reg_type type,
> > > >    reg = nir_ssa_values[src.ssa->index];
> > > > }
> > > > else {
> > > > - reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > src.reg.base_offset,
> > > > -   src.reg.indirect);
> > > > +  unsigned base_offset = src.reg.base_offset *
> > > > src.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +  reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > base_offset,
> > > > +src.reg.indirect);
> > > I think this wouldn't have been necessary if you had fixed the
> > > offset()
> > > helper to take into account the register type (as it does in the
> > > FS
> > > back-end)?
> > Yes, you are right.
> > 
> > I found it convenient that offset() didn't consider the type
> > because
> > that gave us a bit more flexibility to offset into the second half
> > of a
> > SIMD4x2 dvecN simply by doing offset(reg, 1). If offset takes the
> > type
> > into account we lose that ability because for a DF register the
> > minimum
> > we could offset with a delta=1 would be 2 SIMD8 registers (a full
> > SIMD4x2 dvec4). We can still avoid calling offset() in these cases
> > and
> > just increase reg_offset manually by 1 instead, I understand that
> > you
> > prefer this?
> > 
> Right, I think I'd rather have offset() mean the same thing on both
> the
> VEC4 and FS back-ends to avoid confusion (i.e. step by as many scalar
> components as the specified SIMD width).
> > 
> > Alternatively, we could also have typed and untyped versions of the
> > offset function so we can choose the one we need depending on the
> > case,
> > but maybe that would be more confusing?
> > 
> I don't think an untyped (i.e. based on an offset argument in byte or
> 32-byte units) variant of offset() would be a particularly compelling
> way to achieve the above either, in fact this seems to be the cause
> of
> the rather artificial limitation of the SIMD lowering pass that
> prevents
> it from handling cases where each half doesn't write or read exactly
> one
> register of the instruction.  If you had something like the FS
> backend's
> horiz_offset() you would just get the 4-wide vector of the register
> for
> the group index specified as argument, regardless of the type of the
> register.  The current representation of register offsets (as a 32B
> multiple) may still get in the way for non-64 bit register types, but
> that's not really the SIMD lowering pass' business and can probably
> be
> addressed separately.

I have been looking into this. It seems like a lot of use cases for the
offset() helper in the vec4 backend look like the following (this is
from the cse pass, but there are plenty more like this):

for (unsigned i = 0; i < regs_written(entry->generator); ++i) {
   vec4_instruction *copy = MOV(offset(entry->generator->dst, i),
offset(entry->tmp, i));
   ...
}

That is, code that is set up on a register-by-register basis rather than
a per-channel basis. For these cases I think the current offset() helper is
more useful. So how about we add a byte_offset() helper to the vec4 IR
and use it in cases like these, then change the offset() helper to
operate with the channel semantics that we have in the FS IR, as you
suggest, and use that everywhere else?
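
To illustrate the difference between the two semantics, here is a toy model in C. It is not Mesa code: `toy_reg`, `toy_byte_offset` and `toy_offset` are hypothetical, registers are tracked as plain byte offsets, and the width of 8 components is taken from the "2 SIMD8 registers" remark quoted above.

```c
#include <assert.h>

/* Hypothetical toy register, not Mesa's src_reg/dst_reg. */
struct toy_reg {
   unsigned offset_bytes;  /* position inside the virtual register file */
   unsigned type_size;     /* bytes per component: 4 for F, 8 for DF */
};

/* Untyped step: advance by a raw byte count, whatever the type is. */
static struct toy_reg
toy_byte_offset(struct toy_reg reg, unsigned bytes)
{
   reg.offset_bytes += bytes;
   return reg;
}

/* Typed step with channel semantics: advance by `delta` groups of
 * `width` components of the register's own type.
 */
static struct toy_reg
toy_offset(struct toy_reg reg, unsigned width, unsigned delta)
{
   assert(reg.type_size == 4 || reg.type_size == 8);
   return toy_byte_offset(reg, delta * width * reg.type_size);
}
```

In this model, with a width of 8, a float register steps by 32 bytes (one register) and a double register by 64 bytes (two registers). Reaching the second half of a double-precision value then takes a byte-based step of 32, which is the case a byte_offset() style helper would cover.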

> > 
> > Iago
> > 
> > > 
> > > > 
> > > > 
> > > > }
> > > >  
> > > > reg = retype(reg, type);

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle  wrote:
> On 28.09.2016 18:49, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> More info about jemalloc:
>>https://github.com/jemalloc/jemalloc/wiki/History
>>
>> Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
>> bytecode:
>>    glibc:    17.183s
>>    jemalloc: 15.558s
>>    diff:     -9.5%
>>
>> The diff is -10.5% for a full shader-db run.
>> ---
>>
>> TODO: The jemalloc dependency should be added to configure.ac before this.
>>
>> We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
>> jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
>> redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.
>>
>> Right now, I just use: export LDFLAGS=-ljemalloc
>
>
> Sounds good to me. It should probably be a configurable option, defaulting
> to jemalloc and failing if not available unless explicitly disabled.

If it was a configurable option, almost nobody would use it. Let's
make it mandatory.

>
> On the Gallium side of things, switching to jemalloc could be pretty
> straightforward via the macros in u_memory.h, once we know that they're
> actually used consistently (which we currently don't -- it would be nice to
> know how jemalloc and glibc malloc react when the calls are mixed).

Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
would be more robust.
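
A minimal sketch of what such a redirect could look like, assuming wrapper
names in the style of the RFC. _mesa_jemalloc is the name floated in the RFC
text; _mesa_jefree and the toy_* helpers are hypothetical. The wrappers here
forward to the C library so the sketch builds without jemalloc; a real build
would call into jemalloc instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts allocations going through the wrapper (illustration only). */
static size_t toy_alloc_calls;

/* Stand-in wrapper: forwards to the C library allocator. */
static void *_mesa_jemalloc(size_t size)
{
   assert(size > 0);
   toy_alloc_calls++;
   return malloc(size);   /* defined before the macro, so this is libc */
}

/* Hypothetical counterpart for free(). */
static void _mesa_jefree(void *ptr)
{
   free(ptr);
}

/* From here on, every malloc/free in this file goes through the wrappers. */
#define malloc(size) _mesa_jemalloc(size)
#define free(ptr) _mesa_jefree(ptr)

static int *toy_make_counter(void)
{
   int *counter = malloc(sizeof(*counter));   /* expands to _mesa_jemalloc */
   if (counter)
      *counter = 0;
   return counter;
}

static void toy_destroy_counter(int *counter)
{
   free(counter);   /* expands to _mesa_jefree */
}
```

The point of the macro approach is that callers stay unchanged, so no pointer
can be allocated by one allocator and freed by the other within the code that
sees the macros.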

Marek

[Mesa-dev] [Bug 93089] mesa fails to check for gcc atomic primitives before using them

2016-09-29 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=93089

Elmar Hanlhofer  changed:

What    |Removed    |Added
--------+-----------+-----------------
CC      |           |bugzi...@plop.at

--- Comment #10 from Elmar Hanlhofer  ---
Created attachment 126862
  --> https://bugs.freedesktop.org/attachment.cgi?id=126862&action=edit
i486 __sync_val_compare_and_swap_8 disable debug message

I came across the same problem when I compiled Mesa for the new Plop Linux i486
release. As the affected code is only in a debug message routine in
"intel_debug.c", I simply disabled it. Now it compiles fine with "march=i486".
I attached the patch.
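
For reference, a different approach from the attached patch would be to guard
the builtin and fall back to a lock where the target lacks an 8-byte
compare-and-swap. The sketch below is only an illustration of that idea:
toy_cas64 is a hypothetical name, and it relies on the predefined macro
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 that GCC sets when the builtin is
available natively.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)

/* Native path: the compiler provides an 8-byte compare-and-swap. */
static uint64_t toy_cas64(uint64_t *ptr, uint64_t expected, uint64_t desired)
{
   assert(ptr != 0);
   return __sync_val_compare_and_swap(ptr, expected, desired);
}

#else

#include <pthread.h>

/* Fallback path (e.g. i486): serialize the update with a mutex. */
static pthread_mutex_t toy_cas_lock = PTHREAD_MUTEX_INITIALIZER;

static uint64_t toy_cas64(uint64_t *ptr, uint64_t expected, uint64_t desired)
{
   uint64_t old;

   assert(ptr != 0);
   pthread_mutex_lock(&toy_cas_lock);
   old = *ptr;
   if (old == expected)
      *ptr = desired;
   pthread_mutex_unlock(&toy_cas_lock);
   return old;
}

#endif
```

Both paths return the previous value, so callers can tell whether the swap
happened by comparing the result with the expected value.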

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [PATCH 1/7] gallium/radeon: cleanup and fix branch emits

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).

This also fixes a bug where multiple terminators were created in a block.
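
The principle can be illustrated without LLVM by a toy model in plain C. All
toy_* names are hypothetical and exist only for this sketch; the actual
helper in the patch queries the LLVM builder for the block's terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Toy basic block: branch_target stays NULL until a terminator is set. */
struct toy_block {
   const char *name;
   const struct toy_block *branch_target;
};

/* Unconditional branch; a block may only ever get one terminator. */
static void toy_build_br(struct toy_block *cur, const struct toy_block *target)
{
   assert(cur->branch_target == NULL);
   cur->branch_target = target;
}

/* Terminate the current block with a branch to the default target,
 * unless a BRK or CONT has already terminated it.
 */
static void toy_emit_default_branch(struct toy_block *cur,
                                    const struct toy_block *target)
{
   if (!cur->branch_target)
      toy_build_br(cur, target);
}
```

In this model a BRK is a direct toy_build_br() to the loop exit, and a later
ENDIF calling toy_emit_default_branch() on the same block leaves that branch
alone instead of adding a second terminator.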

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: mesa-sta...@lists.freedesktop.org
---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 51 ++
 1 file changed, 14 insertions(+), 37 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index bcb3143..201bed8 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -789,20 +789,30 @@ void radeon_llvm_emit_store(struct lp_build_tgsi_context 
*bld_base,
val2 = LLVMBuildExtractElement(builder, ptr,

bld_base->uint_bld.one, "");
 
LLVMBuildStore(builder, bitcast(bld_base, 
TGSI_TYPE_FLOAT, value), temp_ptr);
LLVMBuildStore(builder, bitcast(bld_base, 
TGSI_TYPE_FLOAT, val2), temp_ptr2);
}
}
}
 }
 
+/* Emit a branch to the given default target for the current block if
+ * applicable -- that is, if the current block does not already contain a
+ * branch from a break or continue.
+ */
+static void emit_default_branch(LLVMBuilderRef builder, LLVMBasicBlockRef 
target)
+{
+   if (!LLVMGetBasicBlockTerminator(LLVMGetInsertBlock(builder)))
+LLVMBuildBr(builder, target);
+}
+
 static void bgnloop_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBasicBlockRef loop_block;
LLVMBasicBlockRef endloop_block;
endloop_block = LLVMAppendBasicBlockInContext(gallivm->context,
ctx->main_fn, "ENDLOOP");
@@ -849,88 +859,55 @@ static void cont_emit(const struct lp_build_tgsi_action 
*action,
LLVMBuildBr(gallivm->builder, current_loop->loop_block);
 }
 
 static void else_emit(const struct lp_build_tgsi_action *action,
  struct lp_build_tgsi_context *bld_base,
  struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
-   LLVMBasicBlockRef current_block = LLVMGetInsertBlock(gallivm->builder);
-
-   /* We need to add a terminator to the current block if the previous
-* instruction was an ENDIF.Example:
-* IF
-*   [code]
-*   IF
-* [code]
-*   ELSE
-*[code]
-*   ENDIF <--
-* ELSE<--
-*   [code]
-* ENDIF
-*/
 
-   if (current_block != current_branch->if_block) {
-   LLVMBuildBr(gallivm->builder, current_branch->endif_block);
-   }
-   if (!LLVMGetBasicBlockTerminator(current_branch->if_block)) {
-   LLVMBuildBr(gallivm->builder, current_branch->endif_block);
-   }
+   emit_default_branch(gallivm->builder, current_branch->endif_block);
current_branch->has_else = 1;
LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->else_block);
 }
 
 static void endif_emit(const struct lp_build_tgsi_action *action,
   struct lp_build_tgsi_context *bld_base,
   struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
-   LLVMBasicBlockRef current_block = LLVMGetInsertBlock(gallivm->builder);
 
-   /* If we have consecutive ENDIF instructions, then the first ENDIF
-* will not have a terminator, so we need to add one. */
-   if (current_block != current_branch->if_block
-   && current_block != current_branch->else_block
-   && !LLVMGetBasicBlockTerminator(current_block)) {
+   emit_default_branch(gallivm->builder, current_branch->endif_block);
 
-LLVMBuildBr(gallivm->builder, current_branch->endif_block);
-   }
+   /* Need to fixup an empty else block if there was no ELSE opcode. */
if (!LLVMGetBasicBlockTerminator(current_branch->else_block)) {

[Mesa-dev] [PATCH 5/7] gallium/radeon: unify the creation of basic blocks

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

This changes the order of basic blocks to be equal to the order of code in the
original TGSI, which is nice for making sense of shader dumps.
---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 34 +++---
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 2f100bd..6a10af3 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -836,41 +836,58 @@ static void set_basicblock_name(LLVMBasicBlockRef bb, 
const char *base, int pc)
 {
char buf[32];
/* Subtract 1 so that the number shown is that of the corresponding
 * opcode in the TGSI dump, e.g. an if block has the same suffix as
 * the instruction number of the corresponding TGSI IF.
 */
snprintf(buf, sizeof(buf), "%s%d", base, pc - 1);
LLVMSetValueName(LLVMBasicBlockAsValue(bb), buf);
 }
 
+/* Append a basic block at the level of the parent flow.
+ */
+static LLVMBasicBlockRef append_basic_block(struct radeon_llvm_context *ctx,
+   const char *name)
+{
+   struct gallivm_state *gallivm = &ctx->gallivm;
+
+   assert(ctx->flow_depth >= 1);
+
+   if (ctx->flow_depth >= 2) {
+   struct radeon_llvm_flow *flow = &ctx->flow[ctx->flow_depth - 2];
+
+   return LLVMInsertBasicBlockInContext(gallivm->context,
+flow->next_block, name);
+   }
+
+   return LLVMAppendBasicBlockInContext(gallivm->context, ctx->main_fn, 
name);
+}
+
 /* Emit a branch to the given default target for the current block if
  * applicable -- that is, if the current block does not already contain a
  * branch from a break or continue.
  */
 static void emit_default_branch(LLVMBuilderRef builder, LLVMBasicBlockRef 
target)
 {
if (!LLVMGetBasicBlockTerminator(LLVMGetInsertBlock(builder)))
 LLVMBuildBr(builder, target);
 }
 
 static void bgnloop_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_flow *flow = push_flow(ctx);
-   flow->next_block = LLVMAppendBasicBlockInContext(gallivm->context,
-   ctx->main_fn, "ENDLOOP");
-   flow->loop_entry_block = LLVMInsertBasicBlockInContext(gallivm->context,
-   flow->next_block, "LOOP");
+   flow->loop_entry_block = append_basic_block(ctx, "LOOP");
+   flow->next_block = append_basic_block(ctx, "ENDLOOP");
set_basicblock_name(flow->loop_entry_block, "loop", bld_base->pc);
LLVMBuildBr(gallivm->builder, flow->loop_entry_block);
LLVMPositionBuilderAtEnd(gallivm->builder, flow->loop_entry_block);
 }
 
 static void brk_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
@@ -895,22 +912,21 @@ static void else_emit(const struct lp_build_tgsi_action 
*action,
  struct lp_build_tgsi_context *bld_base,
  struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_flow *current_branch = get_current_flow(ctx);
LLVMBasicBlockRef endif_block;
 
assert(!current_branch->loop_entry_block);
 
-   endif_block = LLVMAppendBasicBlockInContext(gallivm->context,
-   ctx->main_fn, "ENDIF");
+   endif_block = append_basic_block(ctx, "ENDIF");
emit_default_branch(gallivm->builder, endif_block);
 
LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->next_block);
set_basicblock_name(current_branch->next_block, "else", bld_base->pc);
 
current_branch->next_block = endif_block;
 }
 
 static void endif_emit(const struct lp_build_tgsi_action *action,
   struct lp_build_tgsi_context *bld_base,
@@ -949,24 +965,22 @@ static void endloop_emit(const struct 
lp_build_tgsi_action *action,
 static void if_cond_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data,
 LLVMValueRef cond)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bl

[Mesa-dev] [PATCH 3/7] gallium/radeon: simplify if/else/endif blocks

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

In particular, we no longer emit an else block when there is no ELSE
instruction.
---
 src/gallium/drivers/radeon/radeon_llvm.h   |  4 +--
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 39 ++
 2 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
b/src/gallium/drivers/radeon/radeon_llvm.h
index f508d32..58193db 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -34,23 +34,21 @@
 
 #define RADEON_LLVM_MAX_INPUT_SLOTS 32
 #define RADEON_LLVM_MAX_INPUTS 32 * 4
 #define RADEON_LLVM_MAX_OUTPUTS 32 * 4
 
 #define RADEON_LLVM_INITIAL_CF_DEPTH 4
 
 #define RADEON_LLVM_MAX_SYSTEM_VALUES 4
 
 struct radeon_llvm_branch {
-   LLVMBasicBlockRef endif_block;
-   LLVMBasicBlockRef if_block;
-   LLVMBasicBlockRef else_block;
+   LLVMBasicBlockRef next_block;
unsigned has_else;
 };
 
 struct radeon_llvm_loop {
LLVMBasicBlockRef loop_block;
LLVMBasicBlockRef endloop_block;
 };
 
 struct radeon_llvm_context {
struct lp_build_tgsi_soa_context soa;
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 2f6b7e2..6cae858 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -871,46 +871,45 @@ static void cont_emit(const struct lp_build_tgsi_action 
*action,
LLVMBuildBr(gallivm->builder, current_loop->loop_block);
 }
 
 static void else_emit(const struct lp_build_tgsi_action *action,
  struct lp_build_tgsi_context *bld_base,
  struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
+   LLVMBasicBlockRef endif_block;
+
+   endif_block = LLVMAppendBasicBlockInContext(gallivm->context,
+   ctx->main_fn, "ENDIF");
+   emit_default_branch(gallivm->builder, endif_block);
 
-   emit_default_branch(gallivm->builder, current_branch->endif_block);
current_branch->has_else = 1;
-   LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->else_block);
-   set_basicblock_name(current_branch->else_block, "else", bld_base->pc);
+   LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->next_block);
+   set_basicblock_name(current_branch->next_block, "else", bld_base->pc);
+
+   current_branch->next_block = endif_block;
 }
 
 static void endif_emit(const struct lp_build_tgsi_action *action,
   struct lp_build_tgsi_context *bld_base,
   struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
 
-   emit_default_branch(gallivm->builder, current_branch->endif_block);
+   emit_default_branch(gallivm->builder, current_branch->next_block);
+   LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->next_block);
+   set_basicblock_name(current_branch->next_block, "endif", bld_base->pc);
 
-   /* Need to fixup an empty else block if there was no ELSE opcode. */
-   if (!LLVMGetBasicBlockTerminator(current_branch->else_block)) {
-   LLVMPositionBuilderAtEnd(gallivm->builder, 
current_branch->else_block);
-   LLVMBuildBr(gallivm->builder, current_branch->endif_block);
-   set_basicblock_name(current_branch->else_block, "empty_else", 
bld_base->pc);
-   }
-
-   LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->endif_block);
-   set_basicblock_name(current_branch->endif_block, "endif", bld_base->pc);
ctx->branch_depth--;
 }
 
 static void endloop_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_loop *current_loop = get_current_loop(ctx);
@@ -922,47 +921,43 @@ static void endloop_emit(const struct 
lp_build_tgsi_action *action,
ctx->loop_depth--;
 }
 
 static void if_cond_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data,
 LLVMValueRef cond)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
-   LLVMBasicBlockRef i

[Mesa-dev] [PATCH 6/7] gallium/radeon: fix argument type of llvm.{cttz, ctlz}.i32 intrinsics

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Caught by R600_DEBUG=checkir (next commit).
---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 6a10af3..80e9707 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -1629,40 +1629,40 @@ static void emit_lsb(const struct lp_build_tgsi_action 
*action,
LLVMValueRef args[2] = {
emit_data->args[0],
 
/* The value of 1 means that ffs(x=0) = undef, so LLVM won't
 * add special code to check for x=0. The reason is that
 * the LLVM behavior for x=0 is different from what we
 * need here.
 *
 * The hardware already implements the correct behavior.
 */
-   lp_build_const_int32(gallivm, 1)
+   LLVMConstInt(LLVMInt1TypeInContext(gallivm->context), 1, 0)
};
 
emit_data->output[emit_data->chan] =
lp_build_intrinsic(gallivm->builder, "llvm.cttz.i32",
emit_data->dst_type, args, ARRAY_SIZE(args),
LLVMReadNoneAttribute);
 }
 
 /* Find the last bit set. */
 static void emit_umsb(const struct lp_build_tgsi_action *action,
  struct lp_build_tgsi_context *bld_base,
  struct lp_build_emit_data *emit_data)
 {
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef args[2] = {
emit_data->args[0],
/* Don't generate code for handling zero: */
-   lp_build_const_int32(gallivm, 1)
+   LLVMConstInt(LLVMInt1TypeInContext(gallivm->context), 1, 0)
};
 
LLVMValueRef msb =
lp_build_intrinsic(builder, "llvm.ctlz.i32",
emit_data->dst_type, args, ARRAY_SIZE(args),
LLVMReadNoneAttribute);
 
/* The HW returns the last bit index from MSB, but TGSI wants
 * the index from LSB. Invert it by doing "31 - msb". */
msb = LLVMBuildSub(builder, lp_build_const_int32(gallivm, 31),
-- 
2.7.4


[Mesa-dev] [PATCH 2/7] gallium/radeon: label basic blocks by the corresponding TGSI pc

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 201bed8..2f6b7e2 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -789,20 +789,31 @@ void radeon_llvm_emit_store(struct lp_build_tgsi_context 
*bld_base,
val2 = LLVMBuildExtractElement(builder, ptr,

bld_base->uint_bld.one, "");
 
LLVMBuildStore(builder, bitcast(bld_base, 
TGSI_TYPE_FLOAT, value), temp_ptr);
LLVMBuildStore(builder, bitcast(bld_base, 
TGSI_TYPE_FLOAT, val2), temp_ptr2);
}
}
}
 }
 
+static void set_basicblock_name(LLVMBasicBlockRef bb, const char *base, int pc)
+{
+   char buf[32];
+   /* Subtract 1 so that the number shown is that of the corresponding
+* opcode in the TGSI dump, e.g. an if block has the same suffix as
+* the instruction number of the corresponding TGSI IF.
+*/
+   snprintf(buf, sizeof(buf), "%s%d", base, pc - 1);
+   LLVMSetValueName(LLVMBasicBlockAsValue(bb), buf);
+}
+
 /* Emit a branch to the given default target for the current block if
  * applicable -- that is, if the current block does not already contain a
  * branch from a break or continue.
  */
 static void emit_default_branch(LLVMBuilderRef builder, LLVMBasicBlockRef 
target)
 {
if (!LLVMGetBasicBlockTerminator(LLVMGetInsertBlock(builder)))
 LLVMBuildBr(builder, target);
 }
 
@@ -811,20 +822,21 @@ static void bgnloop_emit(const struct 
lp_build_tgsi_action *action,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBasicBlockRef loop_block;
LLVMBasicBlockRef endloop_block;
endloop_block = LLVMAppendBasicBlockInContext(gallivm->context,
ctx->main_fn, "ENDLOOP");
loop_block = LLVMInsertBasicBlockInContext(gallivm->context,
endloop_block, "LOOP");
+   set_basicblock_name(loop_block, "loop", bld_base->pc);
LLVMBuildBr(gallivm->builder, loop_block);
LLVMPositionBuilderAtEnd(gallivm->builder, loop_block);
 
if (++ctx->loop_depth > ctx->loop_depth_max) {
unsigned new_max = ctx->loop_depth_max << 1;
 
if (!new_max)
new_max = RADEON_LLVM_INITIAL_CF_DEPTH;
 
ctx->loop = REALLOC(ctx->loop, ctx->loop_depth_max *
@@ -863,71 +875,76 @@ static void else_emit(const struct lp_build_tgsi_action 
*action,
  struct lp_build_tgsi_context *bld_base,
  struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
 
emit_default_branch(gallivm->builder, current_branch->endif_block);
current_branch->has_else = 1;
LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->else_block);
+   set_basicblock_name(current_branch->else_block, "else", bld_base->pc);
 }
 
 static void endif_emit(const struct lp_build_tgsi_action *action,
   struct lp_build_tgsi_context *bld_base,
   struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct radeon_llvm_branch *current_branch = get_current_branch(ctx);
 
emit_default_branch(gallivm->builder, current_branch->endif_block);
 
/* Need to fixup an empty else block if there was no ELSE opcode. */
if (!LLVMGetBasicBlockTerminator(current_branch->else_block)) {
LLVMPositionBuilderAtEnd(gallivm->builder, 
current_branch->else_block);
LLVMBuildBr(gallivm->builder, current_branch->endif_block);
+   set_basicblock_name(current_branch->else_block, "empty_else", 
bld_base->pc);
}
 
LLVMPositionBuilderAtEnd(gallivm->builder, current_branch->endif_block);
+   set_basicblock_name(current_branch->endif_block, "endif", bld_base->pc);
ctx->branch_depth--;
 }
 
 static void endloop_emit(const struct lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct radeon_llvm_context *ctx = radeon_llvm_context(b

[Mesa-dev] [PATCH 4/7] gallium/radeon: merge branch and loop flow control stacks

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_llvm.h   |  20 +--
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 140 +++--
 2 files changed, 78 insertions(+), 82 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
b/src/gallium/drivers/radeon/radeon_llvm.h
index 58193db..2f9572a 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -33,29 +33,21 @@
 #include "tgsi/tgsi_parse.h"
 
 #define RADEON_LLVM_MAX_INPUT_SLOTS 32
 #define RADEON_LLVM_MAX_INPUTS 32 * 4
 #define RADEON_LLVM_MAX_OUTPUTS 32 * 4
 
 #define RADEON_LLVM_INITIAL_CF_DEPTH 4
 
 #define RADEON_LLVM_MAX_SYSTEM_VALUES 4
 
-struct radeon_llvm_branch {
-   LLVMBasicBlockRef next_block;
-   unsigned has_else;
-};
-
-struct radeon_llvm_loop {
-   LLVMBasicBlockRef loop_block;
-   LLVMBasicBlockRef endloop_block;
-};
+struct radeon_llvm_flow;
 
 struct radeon_llvm_context {
struct lp_build_tgsi_soa_context soa;
 
/*=== Front end configuration ===*/
 
/* Instructions that are not described by any of the TGSI opcodes. */
 
/** This function is responsible for initilizing the inputs array and 
will be
  * called once for each input declared in the TGSI shader.
@@ -83,27 +75,23 @@ struct radeon_llvm_context {
/** This pointer is used to contain the temporary values.
  * The amount of temporary used in tgsi can't be bound to a max value 
and
  * thus we must allocate this array at runtime.
  */
LLVMValueRef *temps;
unsigned temps_count;
LLVMValueRef system_values[RADEON_LLVM_MAX_SYSTEM_VALUES];
 
/*=== Private Members ===*/
 
-   struct radeon_llvm_branch *branch;
-   struct radeon_llvm_loop *loop;
-
-   unsigned branch_depth;
-   unsigned branch_depth_max;
-   unsigned loop_depth;
-   unsigned loop_depth_max;
+   struct radeon_llvm_flow *flow;
+   unsigned flow_depth;
+   unsigned flow_depth_max;
 
struct tgsi_array_info *temp_arrays;
LLVMValueRef *temp_array_allocas;
 
LLVMValueRef undef_alloca;
 
LLVMValueRef main_fn;
LLVMTypeRef return_type;
 
unsigned fpmath_md_kind;
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 6cae858..2f100bd 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -35,20 +35,28 @@
 #include "tgsi/tgsi_info.h"
 #include "tgsi/tgsi_parse.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
 #include "util/u_debug.h"
 
 #include 
 #include 
 #include 
 
+/* Data for if/else/endif and bgnloop/endloop control flow structures.
+ */
+struct radeon_llvm_flow {
+   /* Loop exit or next part of if/else/endif. */
+   LLVMBasicBlockRef next_block;
+   LLVMBasicBlockRef loop_entry_block;
+};
+
 LLVMTypeRef tgsi2llvmtype(struct lp_build_tgsi_context *bld_base,
  enum tgsi_opcode_type type)
 {
LLVMContextRef ctx = bld_base->base.gallivm->context;
 
switch (type) {
case TGSI_TYPE_UNSIGNED:
case TGSI_TYPE_SIGNED:
return LLVMInt32TypeInContext(ctx);
case TGSI_TYPE_UNSIGNED64:
@@ -98,29 +106,57 @@ LLVMValueRef radeon_llvm_bound_index(struct 
radeon_llvm_context *ctx,
 * In practice, LLVM generates worse code (at the time of
 * writing), because its value tracking is not strong enough.
 */
cc = LLVMBuildICmp(builder, LLVMIntULE, index, c_max, "");
index = LLVMBuildSelect(builder, cc, index, c_max, "");
}
 
return index;
 }
 
-static struct radeon_llvm_loop *get_current_loop(struct radeon_llvm_context 
*ctx)
+static struct radeon_llvm_flow *
+get_current_flow(struct radeon_llvm_context *ctx)
+{
+   if (ctx->flow_depth > 0)
+   return &ctx->flow[ctx->flow_depth - 1];
+   return NULL;
+}
+
+static struct radeon_llvm_flow *
+get_innermost_loop(struct radeon_llvm_context *ctx)
 {
-   return ctx->loop_depth > 0 ? ctx->loop + (ctx->loop_depth - 1) : NULL;
+   for (unsigned i = ctx->flow_depth; i > 0; --i) {
+   if (ctx->flow[i - 1].loop_entry_block)
+   return &ctx->flow[i - 1];
+   }
+   return NULL;
 }
 
-static struct radeon_llvm_branch *get_current_branch(struct 
radeon_llvm_context *ctx)
+static struct radeon_llvm_flow *
+push_flow(struct radeon_llvm_context *ctx)
 {
-   return ctx->branch_depth > 0 ?
-   ctx->branch + (ctx->branch_depth - 1) : NULL;
+   struct radeon_llvm_flow *flow;
+
+   if (ctx->flow_depth >= ctx->flow_depth_max) {
+   unsigned new_max = MAX2(ctx->flow_depth << 1, 
RADEON_LLVM_INITIAL_CF_DEPTH);
+   ctx->flow = REALLOC(ctx->flow,
+

[Mesa-dev] [PATCH 7/7] radeonsi: optionally run the LLVM IR verifier pass

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

This is enabled automatically if shader printing is enabled, or separately
by R600_DEBUG=checkir. Catch malformed IR before it crashes in a later
pass.
---
 src/gallium/drivers/radeon/r600_pipe_common.c  |  7 ++
 src/gallium/drivers/radeon/r600_pipe_common.h  |  3 +++
 src/gallium/drivers/radeon/radeon_llvm.h   |  3 ++-
 .../drivers/radeon/radeon_setup_tgsi_llvm.c|  6 -
 src/gallium/drivers/radeonsi/si_shader.c   | 28 --
 5 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 5b1ce04..e7bf7f2 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -632,20 +632,21 @@ static const struct debug_named_value 
common_debug_options[] = {
{ "vs", DBG_VS, "Print vertex shaders" },
{ "gs", DBG_GS, "Print geometry shaders" },
{ "ps", DBG_PS, "Print pixel shaders" },
{ "cs", DBG_CS, "Print compute shaders" },
{ "tcs", DBG_TCS, "Print tessellation control shaders" },
{ "tes", DBG_TES, "Print tessellation evaluation shaders" },
{ "noir", DBG_NO_IR, "Don't print the LLVM IR"},
{ "notgsi", DBG_NO_TGSI, "Don't print the TGSI"},
{ "noasm", DBG_NO_ASM, "Don't print disassembled shaders"},
{ "preoptir", DBG_PREOPT_IR, "Print the LLVM IR before initial 
optimizations" },
+   { "checkir", DBG_CHECK_IR, "Enable additional sanity checks on shader 
IR" },
 
{ "testdma", DBG_TEST_DMA, "Invoke SDMA tests and exit." },
 
/* features */
{ "nodma", DBG_NO_ASYNC_DMA, "Disable asynchronous DMA" },
{ "nohyperz", DBG_NO_HYPERZ, "Disable Hyper-Z" },
/* GL uses the word INVALIDATE, gallium uses the word DISCARD */
{ "noinvalrange", DBG_NO_DISCARD_RANGE, "Disable handling of 
INVALIDATE_RANGE map flags" },
{ "no2d", DBG_NO_2D_TILING, "Disable 2D tiling" },
{ "notiling", DBG_NO_TILING, "Disable tiling" },
@@ -1276,20 +1277,26 @@ bool r600_can_dump_shader(struct r600_common_screen 
*rscreen,
return (rscreen->debug_flags & DBG_GS) != 0;
case PIPE_SHADER_FRAGMENT:
return (rscreen->debug_flags & DBG_PS) != 0;
case PIPE_SHADER_COMPUTE:
return (rscreen->debug_flags & DBG_CS) != 0;
default:
return false;
}
 }
 
+bool r600_extra_shader_checks(struct r600_common_screen *rscreen, unsigned 
processor)
+{
+   return (rscreen->debug_flags & DBG_CHECK_IR) ||
+  r600_can_dump_shader(rscreen, processor);
+}
+
 void r600_screen_clear_buffer(struct r600_common_screen *rscreen, struct 
pipe_resource *dst,
  uint64_t offset, uint64_t size, unsigned value,
  enum r600_coherency coher)
 {
struct r600_common_context *rctx = (struct 
r600_common_context*)rscreen->aux_context;
 
pipe_mutex_lock(rscreen->aux_context_lock);
rctx->clear_buffer(&rctx->b, dst, offset, size, value, coher);
rscreen->aux_context->flush(rscreen->aux_context, NULL, 0);
pipe_mutex_unlock(rscreen->aux_context_lock);
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index f23f1c4..c836ab1 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -71,20 +71,21 @@
 #define DBG_VS (1 << 6)
 #define DBG_GS (1 << 7)
 #define DBG_PS (1 << 8)
 #define DBG_CS (1 << 9)
 #define DBG_TCS(1 << 10)
 #define DBG_TES(1 << 11)
 #define DBG_NO_IR  (1 << 12)
 #define DBG_NO_TGSI(1 << 13)
 #define DBG_NO_ASM (1 << 14)
 #define DBG_PREOPT_IR  (1 << 15)
+#define DBG_CHECK_IR   (1 << 16)
 /* gaps */
 #define DBG_TEST_DMA   (1 << 20)
 /* Bits 21-31 are reserved for the r600g driver. */
 /* features */
 #define DBG_NO_ASYNC_DMA   (1llu << 32)
 #define DBG_NO_HYPERZ  (1llu << 33)
 #define DBG_NO_DISCARD_RANGE   (1llu << 34)
 #define DBG_NO_2D_TILING   (1llu << 35)
 #define DBG_NO_TILING  (1llu << 36)
 #define DBG_SWITCH_ON_EOP  (1llu << 37)
@@ -714,20 +715,22 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
 void r600_destroy_common_screen(struct r600_common_screen *rscreen);
 void r600_preflush_suspend_features(struct r600_common_context *ctx);
 void r600_postflush_resume_features(struct r600_common_context *ctx);
 bool r600_common_context_init(struct r600_common_context *rctx,
  struct r600_common_screen *rscreen,
  unsigned context_flags);
 void r600_common_context_cleanup(struct r600_common_context *rctx);
 void r600_context_add_resource_s

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Clemens Eisserer

Hi Marek,

> The diff is -10.5% for a full shader-db run.

Interesting finding. Did you also have a look at the memory footprint (RSS)
during the shader-db run (average and spikes)?

Br, Clemens

[Mesa-dev] [PATCH] gallium/hud: Add power sensor support

2016-09-29 Thread Steven Toth

Implement support for power based sensors, reporting units in
milli-watts and watts.
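
As a rough illustration of the unit selection, here is a standalone sketch.
It assumes the reading is already in milliwatts and that the display steps up
one unit per factor of 1000; struct toy_power and toy_scale_power() are
hypothetical and are not the HUD's own formatting function.

```c
#include <assert.h>
#include <stdint.h>

/* Scaled reading plus the unit suffix to print after it. */
struct toy_power {
   double value;
   const char *unit;
};

/* Pick " mW" or " W" for a reading given in milliwatts. */
static struct toy_power toy_scale_power(uint64_t milliwatts)
{
   static const char *watt_units[] = {" mW", " W"};
   const unsigned max_unit = sizeof(watt_units) / sizeof(watt_units[0]) - 1;
   struct toy_power p = { (double)milliwatts, watt_units[0] };
   unsigned unit = 0;

   assert(max_unit == 1);

   /* Step up while the value is large and a bigger unit exists. */
   while (p.value >= 1000.0 && unit < max_unit) {
      p.value /= 1000.0;
      unit++;
   }
   p.unit = watt_units[unit];
   return p;
}
```

A reading of 950 stays in milliwatts, while 12500 is shown as 12.5 with the
watt suffix; values beyond the largest unit simply keep growing in watts.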

Also, minor cleanup - change the related if block to a switch.

Tested with two different power sensors, including the nouveau
'power1' sensors on a GTX950 card.

Signed-off-by: Steven Toth 
---
 src/gallium/auxiliary/hud/hud_context.c  | 10 
 src/gallium/auxiliary/hud/hud_private.h  |  1 +
 src/gallium/auxiliary/hud/hud_sensors_temp.c | 38 
 src/gallium/include/pipe/p_defines.h |  1 +
 4 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index a82cdf2..3445488 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -261,6 +261,7 @@ number_to_human_readable(uint64_t num, uint64_t max_value,
static const char *temperature_units[] = {" C"};
static const char *volt_units[] = {" mV", " V"};
static const char *amp_units[] = {" mA", " A"};
+   static const char *watt_units[] = {" mW", " W"};
 
const char **units;
unsigned max_unit;
@@ -301,6 +302,10 @@ number_to_human_readable(uint64_t num, uint64_t max_value,
   max_unit = ARRAY_SIZE(hz_units)-1;
   units = hz_units;
   break;
+   case PIPE_DRIVER_QUERY_TYPE_WATTS:
+  max_unit = ARRAY_SIZE(watt_units)-1;
+  units = watt_units;
+  break;
default:
   if (max_value == 100) {
  max_unit = ARRAY_SIZE(percent_units)-1;
@@ -1067,6 +1072,11 @@ hud_parse_env_var(struct hud_context *hud, const char 
*env)
 SENSORS_CURRENT_CURRENT);
  pane->type = PIPE_DRIVER_QUERY_TYPE_AMPS;
   }
+  else if (sscanf(name, "sensors_pow_cu-%s", arg_name) == 1) {
+ hud_sensors_temp_graph_install(pane, arg_name,
+SENSORS_POWER_CURRENT);
+ pane->type = PIPE_DRIVER_QUERY_TYPE_WATTS;
+  }
 #endif
   else if (strcmp(name, "samples-passed") == 0 &&
has_occlusion_query(hud->pipe->screen)) {
diff --git a/src/gallium/auxiliary/hud/hud_private.h 
b/src/gallium/auxiliary/hud/hud_private.h
index c825512..51049af 100644
--- a/src/gallium/auxiliary/hud/hud_private.h
+++ b/src/gallium/auxiliary/hud/hud_private.h
@@ -124,6 +124,7 @@ int hud_get_num_sensors(bool displayhelp);
 #define SENSORS_TEMP_CRITICAL2
 #define SENSORS_VOLTAGE_CURRENT  3
 #define SENSORS_CURRENT_CURRENT  4
+#define SENSORS_POWER_CURRENT5
 void hud_sensors_temp_graph_install(struct hud_pane *pane, const char 
*dev_name,
 unsigned int mode);
 #endif
diff --git a/src/gallium/auxiliary/hud/hud_sensors_temp.c 
b/src/gallium/auxiliary/hud/hud_sensors_temp.c
index bceffc4..7d1398a 100644
--- a/src/gallium/auxiliary/hud/hud_sensors_temp.c
+++ b/src/gallium/auxiliary/hud/hud_sensors_temp.c
@@ -119,6 +119,15 @@ get_sensor_values(struct sensors_temp_info *sti)
   if (sf)
  sti->critical = get_value(sti->chip, sf);
   break;
+   case SENSORS_POWER_CURRENT:
+  sf = sensors_get_subfeature(sti->chip, sti->feature,
+  SENSORS_SUBFEATURE_POWER_INPUT);
+  if (sf) {
+ /* Sensors API returns in WATTs, even though driver is reporting mW,
+  * convert back to mW */
+ sti->current = get_value(sti->chip, sf) * 1000;
+  }
+  break;
}
 
sf = sensors_get_subfeature(sti->chip, sti->feature,
@@ -173,6 +182,9 @@ query_sti_load(struct hud_graph *gr)
  case SENSORS_CURRENT_CURRENT:
 hud_graph_add_value(gr, (uint64_t) sti->current);
 break;
+ case SENSORS_POWER_CURRENT:
+hud_graph_add_value(gr, (uint64_t) sti->current);
+break;
  }
 
  sti->last_time = now;
@@ -217,6 +229,7 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
   mode == SENSORS_VOLTAGE_CURRENT ? "VOLTS" :
   mode == SENSORS_CURRENT_CURRENT ? "AMPS" :
   mode == SENSORS_TEMP_CURRENT ? "CU" :
+  mode == SENSORS_POWER_CURRENT ? "POWER" :
   mode == SENSORS_TEMP_CRITICAL ? "CR" : "UNDEFINED");
 #endif
 
@@ -234,6 +247,7 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
sti->mode == SENSORS_VOLTAGE_CURRENT ? "Volts" :
sti->mode == SENSORS_CURRENT_CURRENT ? "Amps" :
sti->mode == SENSORS_TEMP_CURRENT ? "Curr" :
+   sti->mode == SENSORS_POWER_CURRENT ? "Pow" :
sti->mode == SENSORS_TEMP_CRITICAL ? "Crit" : "Unkn");
 
gr->query_data = sti;
@@ -256,6 +270,9 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
case SENSORS_CURRENT_CURRENT:
   hud_pane_set_max_value(pane, 5000);
   break;
+   case SENSORS_POWER_CURRENT:
+  hud_pane_set_max_value(pane, 5000 /* mW */);
+  break;
}
 }
 
@@ -303,19 +320,27 @@ build_sensor_list(void)

Re: [Mesa-dev] [Request for Testing] i965: import prime buffers in the current context, not screen

2016-09-29 Thread Martin Peres


On 02/09/16 10:08, Martin Peres wrote:



On 25/08/16 21:49, Kristian Høgsberg wrote:

On Thu, Aug 25, 2016 at 11:38 AM, Chad Versace
 wrote:

On Thu 25 Aug 2016, Martin Peres wrote:

This mirrors the codepath taken by DRI2 in IntelSetTexBuffer2() and
fixes many applications when using DRI3:
 - Totem with libva on hw-accelerated decoding
 - obs-studio, using Window Capture (Xcomposite) as a Source
 - gstreamer with VAAPI

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71759
Signed-off-by: Martin Peres 
---

This patch supposedly prevents gnome from running for one Arch Linux
maintainer. Kenneth Graunke  and I failed to reproduce it so I am
asking for your help/comments on this.

 src/mesa/drivers/dri/i965/intel_screen.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
b/src/mesa/drivers/dri/i965/intel_screen.c
index 7876652..5c0d300 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -698,8 +698,11 @@ intel_create_image_from_fds(__DRIscreen *screen,
 int *fds, int num_fds, int *strides,
int *offsets,
 void *loaderPrivate)
 {
+   GET_CURRENT_CONTEXT(ctx);
struct intel_screen *intelScreen = screen->driverPrivate;
+   struct brw_context *brw = brw_context(ctx);
struct intel_image_format *f;
+   dri_bufmgr *bufmgr;
__DRIimage *image;
int i, index;

@@ -740,8 +743,25 @@ intel_create_image_from_fds(__DRIscreen *screen,
  size = end;
}

-   image->bo = drm_intel_bo_gem_create_from_prime(intelScreen->bufmgr,
-  fds[0], size);
+   /* Let's import the buffer into the current context instead of the current
+* screen as some applications like gstreamer, totem, or obs create multiple
+* X connections which end up creating multiple screens and thus multiple
+* buffer managers. They then proceed to use a different X connection to call
+* GLXBindTexImageExt() which should import the buffer in the current thread
+* bound and not the current screen. This is done properly upstairs for
+* texture management so we need to mirror this behaviour if we don't want
+* the kernel rejecting our pushbuffers as the buffers have not been imported
+* by the same bufmgr that sent the pushbuffer.
+*
+* If there is no context currently bound, then revert to using the screen's
+* buffer manager and hope for the best...


Nope. This patch breaks EGL_EXT_image_dma_buf_import.

When the user calls eglCreateImageKHR(EGLDisplay, EGLContext, ...) with
image type EGL_LINUX_DMA_BUF_EXT, then the spec requires that context be
NULL. The EGLDisplay parameter determines the namespace of the newly
created
EGLImage. By design, the currently bound context (and its display) DO
NOT affect
eglCreateImage.

  Problem Example:
eglMakeCurrent(dpyA, ..., ctxA);
img = eglCreateImage(dpyB, EGL_NO_CONTEXT, ...);


I see, that may explain the issue Ionut found with gnome. Thanks Chad!



The difference between DRI2 and DRI3 is that for DRI2 we'd get a
DRI2Buffer back from getBuffers, and then open the flink name inside
the driver with the current context's bufmgr. In the DRI3 world, we go
from prime fd to drm_bo in dri3_get_pixmap_buffer() with the screen
that's associated with the current drawable.


Yes, that is the problem indeed.



I think the fix we're looking for is to make
draw->vtable->get_dri_context() also return the __DRIscreen, then use
that in dri3_get_pixmap_buffer() to get the right __DRIscreen to pass
to loader_dri3_create_image().


Seems sensible and wouldn't require changing the world! Thanks Kristian!
I will get to it when I come back from vacation!


Hey,

I am finally trying your approach but then I am not sure I understand 
what you are saying.


My patch was changing the import to always happen in the currently-bound 
screen, and you said it violates the spec of EGL_EXT_image_dma_buf_import.


eglCreateImage's dpy is passed all the way down to 
intel_create_image_from_fds, which is what the spec mandates.


So, what do we do? We cannot use one single buffer manager for everyone 
unless we always force the authentication dance :s


Martin

[Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-09-29 Thread Emil Velikov

On 28 September 2016 at 19:53, Marek Olšák  wrote:
> Hi,
>
> It's been almost 4 months since the 12.0 branch was created, and soon
> it will have been 3 months since Mesa 12.0 was released.
>
> Is there any reason we haven't created the stable branch yet?
>
> Ideally, we would time the release so that it's 1-2 months before fall
> distribution releases.
>

Thanks Marek !

In all honesty, I was secretly hoping that we'd get Dave/Bas' RADV for
12.1, with the topic of which would be 'the default' Vulkan driver for
ATI/AMD hardware to be considered at a later stage.

That said here are the tentative dates:

Oct 7/14 2016 - Feature freeze/Release candidate 1
Oct 14/21 2016 - Release candidate 2
Oct 21/28 2016 - Release candidate 3/final release

Fwiw I'm still in favour of getting RADV in even if it's not
perfect/feature complete. Devs, let me know if there's a "must have"
feature that we want in 12.1.

Thanks
Emil

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Emil Velikov

On 29 September 2016 at 11:48, Marek Olšák  wrote:
> On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle  wrote:
>> On 28.09.2016 18:49, Marek Olšák wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> More info about jemalloc:
>>>https://github.com/jemalloc/jemalloc/wiki/History
>>>
>>> Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
>>> bytecode:
>>>glibc:17.183s
>>>jemalloc: 15.558s
>>>diff: -9.5%
>>>
>>> The diff is -10.5% for a full shader-db run.
>>> ---
>>>
>>> TODO: The jemalloc dependency should be added to configure.ac before this.
>>>
>>> We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
>>> jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
>>> redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.
>>>
>>> Right now, I just use: export LDFLAGS=-ljemalloc
>>
>>
>> Sounds good to me. It should probably be a configurable option, defaulting
>> to jemalloc and failing if not available unless explicitly disabled.
>
> If it was a configurable option, almost nobody would use it. Let's
> make it mandatory.
>
This combined with ...

>>
>> On the Gallium side of things, switching to jemalloc could be pretty
>> straightforward via the macros in u_memory.h, once we know that they're
>> actually used consistently (which we currently don't -- it would be nice to
>> know how jemalloc and glibc malloc react when the calls are mixed).
>
> Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
> would be more robust.
>
... this is not a wise move.

Please don't force jemalloc onto everyone without an explicit ACK from
a wide audience. Distributions have their own preferences/policies
(static vs. shared linking, or no jemalloc altogether), which won't
necessarily align with my/your view.

Thanks
Emil

Re: [Mesa-dev] [PATCH] anv/gen7_pipeline: Use MSDISPMODE_PERSAMPLE for non-multisampled fbo

2016-09-29 Thread Jason Ekstrand

On Sep 28, 2016 12:10 PM, "Anuj Phogat"  wrote:
>
> On Wed, Sep 21, 2016 at 12:49 PM, Anuj Phogat wrote:
> > On Wed, Sep 21, 2016 at 11:49 AM, Jason Ekstrand wrote:
> >> This seems odd... When can it even happen that we have persample_dispatch
> >> set in wm_surface_state and have only one sample?  Does this fix a test
> >> case?
> >>
> > No, It just fixes a simulator warning. It's recommended in graphics spec
> > for gen7. Also look at gen7_wm_state.c.
> >
> Jason, do you still have concerns about this patch?

No, not really. R-b

> >> On Sep 21, 2016 9:14 PM, "Anuj Phogat"  wrote:
> >>>
> >>> Signed-off-by: Anuj Phogat 
> >>> ---
> >>>  src/intel/vulkan/gen7_pipeline.c | 3 ++-
> >>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/src/intel/vulkan/gen7_pipeline.c
> >>> b/src/intel/vulkan/gen7_pipeline.c
> >>> index 878308b..5150ef9 100644
> >>> --- a/src/intel/vulkan/gen7_pipeline.c
> >>> +++ b/src/intel/vulkan/gen7_pipeline.c
> >>> @@ -267,7 +267,8 @@ genX(graphics_pipeline_create)(
> >>>
> >>>   wm.MultisampleRasterizationMode= samples > 1 ?
> >>> MSRASTMODE_ON_PATTERN :
> >>> MSRASTMODE_OFF_PIXEL;
> >>> - wm.MultisampleDispatchMode = wm_prog_data->persample_dispatch ?
> >>> + wm.MultisampleDispatchMode = ((samples == 1) ||
> >>> +   (samples > 1 && wm_prog_data->persample_dispatch)) ?

You could simply do "samples == 1 || wm_prog_data->persample_dispatch", but I
don't really care that much one way or the other.
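
To illustrate why the two forms are interchangeable, here is a small sketch that compares them for sample counts of 1 and above (they differ only for a sample count of 0, which is assumed not to occur here). The enum is a stand-in for the hardware definitions with arbitrary numeric values, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the hardware enum; the numeric values here are arbitrary. */
enum msdispmode {
   MSDISPMODE_PERSAMPLE,
   MSDISPMODE_PERPIXEL
};

/* Condition as written in the patch. */
static enum msdispmode
dispatch_mode_patch(unsigned samples, bool persample_dispatch)
{
   return ((samples == 1) || (samples > 1 && persample_dispatch)) ?
          MSDISPMODE_PERSAMPLE : MSDISPMODE_PERPIXEL;
}

/* Shorter condition suggested in the review. */
static enum msdispmode
dispatch_mode_simple(unsigned samples, bool persample_dispatch)
{
   return (samples == 1 || persample_dispatch) ?
          MSDISPMODE_PERSAMPLE : MSDISPMODE_PERPIXEL;
}

/* Returns true if both forms agree for every sample count >= 1 tried. */
static bool
dispatch_modes_agree(void)
{
   static const unsigned sample_counts[] = {1, 2, 4, 8, 16};
   size_t n = sizeof(sample_counts) / sizeof(sample_counts[0]);

   for (size_t i = 0; i < n; i++) {
      for (int p = 0; p <= 1; p++) {
         if (dispatch_mode_patch(sample_counts[i], p != 0) !=
             dispatch_mode_simple(sample_counts[i], p != 0))
            return false;
      }
   }
   return true;
}
```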

> >>> MSDISPMODE_PERSAMPLE :
> >>> MSDISPMODE_PERPIXEL;
> >>>}
> >>> }
> >>> --
> >>> 2.5.5
> >>>
> >>> ___
> >>> mesa-dev mailing list
> >>> mesa-dev@lists.freedesktop.org
> >>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-09-29 Thread Jason Ekstrand

On Sep 29, 2016 7:56 AM, "Emil Velikov"  wrote:
>
> On 28 September 2016 at 19:53, Marek Olšák  wrote:
> > Hi,
> >
> > It's been almost 4 months since the 12.0 branch was created, and soon
> > it will have been 3 months since Mesa 12.0 was released.
> >
> > Is there any reason we haven't created the stable branch yet?
> >
> > Ideally, we would time the release so that it's 1-2 months before fall
> > distribution releases.
> >
>
> Thanks Marek !
>
> In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
> 12.1. With the topic of which would be 'the default' Vulkan driver for
> ATI/AMD hardware to be considered at a later stage.

If they have even close to the amount of work we had to get it merged, I
don't think that's at all realistic.  Then again, Dave is the one who wants
to have a Vulkan driver for AMD hardware that he can package and ship so
I'll let him decide how badly he wants it in this release.

> That said here are the tentative dates:
>
> Oct 7/14 2016 - Feature freeze/Release candidate 1
> Oct 14/21 2016 - Release candidate 2
> Oct 21/28 2016 - Release candidate 3/final release
>
> Fwiw I'm still in favour of getting RADV in even if it's not
> perfect/feature complete. Devs, let me know if there's a "must have"
> feature that we want in 12.1.
>
> Thanks
> Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Next Mesa release, anyone?

2016-09-29 Thread Jason Ekstrand

On Sep 28, 2016 11:44 PM, "Timo Aaltonen"  wrote:
>
> On 28.09.2016 21:53, Marek Olšák wrote:
> > Hi,
> >
> > It's been almost 4 months since the 12.0 branch was created, and soon
> > it will have been 3 months since Mesa 12.0 was released.
> >
> > Is there any reason we haven't created the stable branch yet?
> >
> > Ideally, we would time the release so that it's 1-2 months before fall
> > distribution releases.
>
> Yep, 12.0 got delayed and 12.1 seems to follow the same path, and it's
> too late for Ubuntu 16.10 already which will ship with 12.0.3.

In order to avoid this problem in the future, when is the latest we can
release such that you can pick it up?

> --
> t
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Next Mesa release, anyone?

2016-09-29 Thread Timo Aaltonen

On 29.09.2016 18:10, Jason Ekstrand wrote:
> On Sep 28, 2016 11:44 PM, "Timo Aaltonen" wrote:
>>
>> On 28.09.2016 21:53, Marek Olšák wrote:
>> > Hi,
>> >
>> > It's been almost 4 months since the 12.0 branch was created, and soon
>> > it will have been 3 months since Mesa 12.0 was released.
>> >
>> > Is there any reason we haven't created the stable branch yet?
>> >
>> > Ideally, we would time the release so that it's 1-2 months before fall
>> > distribution releases.
>>
>> Yep, 12.0 got delayed and 12.1 seems to follow the same path, and it's
>> too late for Ubuntu 16.10 already which will ship with 12.0.3.
> 
> In order to avoid this problem in the future, when is the latest we can
> release such that you can pick it up?

The Ubuntu feature-freeze date tends to be around mid-February/August,
but freeze exceptions are allowed. So having rc1 roughly around FF would
be fine. Then the actual .0 release a few weeks later is not a problem,
and there would still be time to squeeze in a bugfix release or two
before the distro release late April/October.


-- 
t

[Mesa-dev] [Bug 97957] Awful screen tearing in a separate X server with DRI3

2016-09-29 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97957

Dieter Nützel  changed:

   What|Removed |Added

 CC||mic...@daenzer.net

--- Comment #7 from Dieter Nützel  ---
Hello Chris,

did you send this
https://cgit.freedesktop.org/~ickle/mesa/commit/?h=brw-batch&id=8ffee986f537888921716ab632236ea4c55fb0f1
in for review?

E.g. to Michel?
Maybe it solves this
https://bugs.freedesktop.org/show_bug.cgi?id=97260#c56
one, too?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-29 Thread Emil Velikov

On 27 September 2016 at 13:47, Marek Olšák  wrote:
> On Tue, Sep 27, 2016 at 2:34 PM, Emil Velikov wrote:
>> On 26 September 2016 at 08:41, Tapani Pälli  wrote:
>>> This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
>>> and same crash in many dEQP EGL tests.
>>>
>>> I also found that some Qt example did a workaround because of this
>>> crash: https://bugreports.qt.io/browse/QTBUG-47509
>>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>
>>> This is RFC as I'm not sure if we are supposed to support this. I tried
>>> to verify overall pbuffer situation with some mesa-demos using pbuffer
>>> but those are not working for me at all with or without my patch.
>>>
>>>  src/egl/main/eglconfig.c | 5 +
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
>>> index 6161d26..20cf9d4 100644
>>> --- a/src/egl/main/eglconfig.c
>>> +++ b/src/egl/main/eglconfig.c
>>> @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean 
>>> for_matching)
>>>return EGL_FALSE;
>>> }
>>>
>>> +   /* pbuffer with MSAA not supported */
>> Fwiw on my system piglit also crashes + the demos don't render
>> anything. So I'm leaning that we want this as-is (for the time being)
>> + cc stable ?
>>
>> Can you apply a minor polish to the comment - "XXX/TODO: pbuffer +
>> MSAA does not work + QT bugreport" or alike.
>
> Please don't add "XXX/TODO". pbuffers were spec'd in 1997 and were
> meant to be used on GL 1.x hardware that didn't support MSAA
> texturing, thus MSAA pbuffers don't make any sense. Just keep the
> current comment.
>
Can we use your reply instead? It's wise to have the less frequently
visited parts nicely documented.

pbuffers + MSAA seem to be working fine on GLX, so it shouldn't be too
crazy to add support for EGL. Then again, until/if that happens, all we
need is to clear the EGL_PBUFFER_BIT if we have an MSAA config in
dri2_add_config(). Props to Ian for the reminder that SurfaceType is a
bitmask :-)
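
A minimal sketch of that idea, assuming the config's surface type is a plain bitmask and the sample count is available when the config is added. The function name filter_surface_type() is hypothetical and this is not the actual dri2_add_config() code. The bit values match EGL/egl.h and are repeated only to keep the sketch self-contained.

```c
#include <stdint.h>

/* Values as defined in EGL/egl.h, repeated here for self-containment. */
#define EGL_PBUFFER_BIT 0x0001
#define EGL_PIXMAP_BIT  0x0002
#define EGL_WINDOW_BIT  0x0004

/* Hypothetical helper: drop pbuffer support from multisampled configs,
 * leaving every other surface type bit untouched. */
static int32_t
filter_surface_type(int32_t surface_type, int samples)
{
   if (samples > 1)
      surface_type &= ~EGL_PBUFFER_BIT;
   return surface_type;
}
```

Because only one bit is cleared, an MSAA config can still advertise window and pixmap surfaces.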

Thanks
Emil

Re: [Mesa-dev] [PATCH 7/7] egl: Unify the EGLint/EGLAttrib paths in eglCreateSync*

2016-09-29 Thread Emil Velikov

On 28 September 2016 at 07:28, Chad Versace  wrote:

> +   if (sizeof(int_list[0]) == sizeof(attrib_list[0])) {
> +  attrib_list = (EGLAttrib *) int_list;
> +   } else {
> +  err = _eglConvertIntsToAttribs(int_list, &attrib_list);
> +  if (err != EGL_SUCCESS)
> + RETURN_EGL_ERROR(disp, err, EGL_NO_SYNC);
> +   }
> +
> +   sync = _eglCreateSync(disp, type, attrib_list, EGL_FALSE,
>   EGL_BAD_ATTRIBUTE);
> +
> +   if ((void *) int_list != (void *) attrib_list)
Please use the same conditional as above - sizeof(int_list[0]) !=
sizeof(attrib_list[0]).
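
For illustration, a sketch of the pattern being discussed: one size check decides both whether the EGLint list has to be widened into a freshly allocated EGLAttrib list and whether that list has to be freed afterwards. The typedefs are stand-ins for the real EGL ones, and ints_need_conversion() and convert_ints_to_attribs() are hypothetical names, not the Mesa helpers. The list is assumed to consist of attribute/value pairs terminated by EGL_NONE.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the EGL typedefs so the sketch is self-contained. */
typedef int32_t  EGLint;
typedef intptr_t EGLAttrib;
#define EGL_NONE 0x3038

/* Single source of truth for "do the two list types differ in size?".
 * Using it for both the conversion and the later free keeps them in sync. */
static int
ints_need_conversion(void)
{
   return sizeof(EGLint) != sizeof(EGLAttrib);
}

/* Hypothetical helper: return a malloc'd, EGL_NONE-terminated copy of
 * int_list widened to EGLAttrib, or NULL on allocation failure.
 * A NULL int_list yields a list containing only EGL_NONE. */
static EGLAttrib *
convert_ints_to_attribs(const EGLint *int_list)
{
   size_t len = 0;
   EGLAttrib *attribs;

   if (int_list) {
      /* Step over attribute/value pairs until the terminator. */
      while (int_list[len] != EGL_NONE)
         len += 2;
   }

   attribs = malloc((len + 1) * sizeof(*attribs));
   if (!attribs)
      return NULL;

   for (size_t i = 0; i < len; i++)
      attribs[i] = int_list[i];
   attribs[len] = EGL_NONE;

   return attribs;
}
```

A caller would free the converted list only when ints_need_conversion() is true, which is the same condition that triggered the allocation.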

-Emil

[Mesa-dev] [PATCH 4/4] gallium/radeon: update documentation of buffer_get_virtual_address

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_winsys.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index dcbebe0..d248004 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -500,20 +500,23 @@ struct radeon_winsys {
  * \return  true on success.
  */
 bool (*buffer_get_handle)(struct pb_buffer *buf,
   unsigned stride, unsigned offset,
   unsigned slice_size,
   struct winsys_handle *whandle);
 
 /**
  * Return the virtual address of a buffer.
  *
+ * When virtual memory is not in use, this is the offset relative to the
+ * relocation base (non-zero for sub-allocated buffers).
+ *
  * \param buf   A winsys buffer object
  * \return  virtual address
  */
 uint64_t (*buffer_get_virtual_address)(struct pb_buffer *buf);
 
 /**
  * Query the initial placement of the buffer from the kernel driver.
  */
 enum radeon_bo_domain (*buffer_get_initial_domain)(struct pb_buffer *buf);
 
-- 
2.7.4


Re: [Mesa-dev] [PATCH 0/7] egl: Fixes and cleanups for EGLSync

2016-09-29 Thread Emil Velikov

On 28 September 2016 at 07:28, Chad Versace  wrote:
> Fixes a deadlock in
> dEQP-EGL.functional.fence_sync.invalid.get_invalid_value.
>
> With the deadlock fixed, it's now possible to run all of
> 'dEQP-EGL.functional.fence_sync.*'.
>
> The patch series' main goal is to unify the attribute parsing between
> eglCreateSyncKHR(..., EGLint *attrib_list) and
> eglCreateSync(..., EGLAttrib *attrib_list)
> During the unification, some bugs found by inspection are fixed.
>
> Tested with the following. No regressions found. Ratios are pass/fail.
>
> BEFORE AFTER
>   dEQP-EGL.functional.fence_sync.*  deadlock   27/0
>   dEQP-EGL.functional.reusable_sync.* : 8/17   8/17
>   piglit egl_khr_fence_sync pass   pass
>
With the small comment in 7/7 the series is
Reviewed-by: Emil Velikov 

Note that this series will clash with the final patch from the
EGL_KHR_debug implementation by Kyle/Adam, although we might want to
pull this first (considering the bugfixes) and rebase the latter.

Thanks
Emil

[Mesa-dev] [PATCH 2/4] radeon/vce: adjust the buffer offset when relocation is used

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_vce.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index dd4c367..d45ec12 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -542,14 +542,15 @@ void rvce_add_buffer(struct rvce_encoder *enc, struct 
pb_buffer *buf,
 
reloc_idx = enc->ws->cs_add_buffer(enc->cs, buf, usage | 
RADEON_USAGE_SYNCHRONIZED,
   domain, RADEON_PRIO_VCE);
if (enc->use_vm) {
uint64_t addr;
addr = enc->ws->buffer_get_virtual_address(buf);
addr = addr + offset;
RVCE_CS(addr >> 32);
RVCE_CS(addr);
} else {
+   offset += enc->ws->buffer_get_virtual_address(buf);
RVCE_CS(reloc_idx * 4);
RVCE_CS(offset);
}
 }
-- 
2.7.4


[Mesa-dev] [PATCH 0/4] radeon/vce,uvd: regression fixes

2016-09-29 Thread Nicolai Hähnle

Hi all,

this should fix two types of regressions introduced by the buffer
sub-allocation.

1. On radeon, old-style relocations are used, and the offset of
   sub-allocated buffers needs to be taken into account there.
   (I haven't tested this part yet.)

2. Also on amdgpu, the kernel expects feedback buffers to be at
   least 4KB in size, but we only allocated 512 bytes -- which was
   irrelevant before sub-allocations.

Please review!
Nicolai


[Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

The kernel's CS checker requires it. This fixes a regression introduced by
the buffer sub-allocation.

Cc: Christian König 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
---
 src/gallium/drivers/radeon/radeon_vce.c   |  6 +++---
 src/gallium/drivers/radeon/radeon_video.c | 12 
 src/gallium/drivers/radeon/radeon_video.h |  3 +++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index 10c5a78..dd4c367 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -232,21 +232,21 @@ void rvce_frame_offset(struct rvce_encoder *enc, struct 
rvce_cpb_slot *slot,
 }
 
 /**
  * destroy this video encoder
  */
 static void rvce_destroy(struct pipe_video_codec *encoder)
 {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
if (enc->stream_handle) {
struct rvid_buffer fb;
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->feedback(enc);
enc->destroy(enc);
flush(enc);
rvid_destroy_buffer(&fb);
}
rvid_destroy_buffer(&enc->cpb);
enc->ws->cs_destroy(enc->cs);
FREE(enc->cpb_array);
@@ -275,21 +275,21 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
 
if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
reset_cpb(enc);
else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
 pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
sort_cpb(enc);

if (!enc->stream_handle) {
struct rvid_buffer fb;
enc->stream_handle = rvid_alloc_stream_handle();
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->create(enc);
enc->config(enc);
enc->feedback(enc);
flush(enc);
//dump_feedback(enc, &fb);
rvid_destroy_buffer(&fb);
need_rate_control = false;
}
@@ -304,21 +304,21 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
 static void rvce_encode_bitstream(struct pipe_video_codec *encoder,
  struct pipe_video_buffer *source,
  struct pipe_resource *destination,
  void **fb)
 {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
enc->get_buffer(destination, &enc->bs_handle, NULL);
enc->bs_size = destination->width0;
 
*fb = enc->fb = CALLOC_STRUCT(rvid_buffer);
-   if (!rvid_create_buffer(enc->screen, enc->fb, 512, PIPE_USAGE_STAGING)) 
{
+   if (!rvid_create_feedback_buffer(enc->screen, enc->fb)) {
RVID_ERR("Can't create feedback buffer.\n");
return;
}
if (!radeon_emitted(enc->cs, 0))
enc->session(enc);
enc->encode(enc);
enc->feedback(enc);
 }
 
 static void rvce_end_frame(struct pipe_video_codec *encoder,
diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index d7c5a16..f60ae05 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -65,20 +65,32 @@ bool rvid_create_buffer(struct pipe_screen *screen, struct 
rvid_buffer *buffer,
unsigned size, unsigned usage)
 {
memset(buffer, 0, sizeof(*buffer));
buffer->usage = usage;
buffer->res = (struct r600_resource *)
pipe_buffer_create(screen, PIPE_BIND_CUSTOM, usage, size);
 
return buffer->res != NULL;
 }
 
+bool rvid_create_feedback_buffer(struct pipe_screen *screen,
+struct rvid_buffer *buffer)
+{
+   /* The kernel's CS checker asks for at least 4KB space.
+*
+* TODO If we update the kernel checker to be satisfied with less,
+* we could save some memory here (since the sub-allocator could be
+* used).
+*/
+   return rvid_create_buffer(screen, buffer, 4096, PIPE_USAGE_STAGING);
+}
+
 /* destroy a buffer */
 void rvid_destroy_buffer(struct rvid_buffer *buffer)
 {
r600_resource_reference(&buffer->res, NULL);
 }
 
 /* reallocate a buffer, preserving its content */
 bool rvid_resize_buffer(struct pipe_screen *screen, struct radeon_winsys_cs 
*cs,
struct rvid_buffer *new_buf, unsigned new_size)
 {
diff --git a/src/gallium/drivers/radeon/radeon_video.h 
b/src/gallium/drivers/radeon/radeon_video.h
index 39305

[Mesa-dev] [PATCH 3/4] radeon/uvd: adjust the buffer offset when relocation is used

2016-09-29 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
---
 src/gallium/drivers/radeon/radeon_uvd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c 
b/src/gallium/drivers/radeon/radeon_uvd.c
index 3ae0eaa..9c376cb 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -116,20 +116,21 @@ static void send_cmd(struct ruvd_decoder *dec, unsigned 
cmd,
reloc_idx = dec->ws->cs_add_buffer(dec->cs, buf, usage | 
RADEON_USAGE_SYNCHRONIZED,
   domain,
  RADEON_PRIO_UVD);
if (!dec->use_legacy) {
uint64_t addr;
addr = dec->ws->buffer_get_virtual_address(buf);
addr = addr + off;
set_reg(dec, RUVD_GPCOM_VCPU_DATA0, addr);
set_reg(dec, RUVD_GPCOM_VCPU_DATA1, addr >> 32);
} else {
+   off += dec->ws->buffer_get_virtual_address(buf);
set_reg(dec, RUVD_GPCOM_VCPU_DATA0, off);
set_reg(dec, RUVD_GPCOM_VCPU_DATA1, reloc_idx * 4);
}
set_reg(dec, RUVD_GPCOM_VCPU_CMD, cmd << 1);
 }
 
 /* do the codec needs an IT buffer ?*/
 static bool have_it(struct ruvd_decoder *dec)
 {
return dec->stream_type == RUVD_CODEC_H264_PERF ||
-- 
2.7.4


Re: [Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Christian König


NAK to the whole approach.

VCE feedback buffers are completely separate from UVD or other MM 
feedback buffers, so they shouldn't be allocated in radeon_video.c.


In addition to that, older UVD message, feedback and bitstream buffers 
have special memory placement requirements, so you clearly shouldn't 
allocate those from the sub-allocator.


Sorry that I didn't object earlier, but I wasn't aware of this change 
until now.


Regards,
Christian.

On 29.09.2016 at 18:35, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

The kernel's CS checker requires it. This fixes a regression introduced by
the buffer sub-allocation.

Cc: Christian König 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
---
  src/gallium/drivers/radeon/radeon_vce.c   |  6 +++---
  src/gallium/drivers/radeon/radeon_video.c | 12 
  src/gallium/drivers/radeon/radeon_video.h |  3 +++
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index 10c5a78..dd4c367 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -232,21 +232,21 @@ void rvce_frame_offset(struct rvce_encoder *enc, struct 
rvce_cpb_slot *slot,
  }
  
  /**

   * destroy this video encoder
   */
  static void rvce_destroy(struct pipe_video_codec *encoder)
  {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
if (enc->stream_handle) {
struct rvid_buffer fb;
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->feedback(enc);
enc->destroy(enc);
flush(enc);
rvid_destroy_buffer(&fb);
}
rvid_destroy_buffer(&enc->cpb);
enc->ws->cs_destroy(enc->cs);
FREE(enc->cpb_array);
@@ -275,21 +275,21 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
  
  	if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)

reset_cpb(enc);
else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
 pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
sort_cpb(enc);

if (!enc->stream_handle) {
struct rvid_buffer fb;
enc->stream_handle = rvid_alloc_stream_handle();
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->create(enc);
enc->config(enc);
enc->feedback(enc);
flush(enc);
//dump_feedback(enc, &fb);
rvid_destroy_buffer(&fb);
need_rate_control = false;
}
@@ -304,21 +304,21 @@ static void rvce_begin_frame(struct pipe_video_codec 
*encoder,
  static void rvce_encode_bitstream(struct pipe_video_codec *encoder,
  struct pipe_video_buffer *source,
  struct pipe_resource *destination,
  void **fb)
  {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
enc->get_buffer(destination, &enc->bs_handle, NULL);
enc->bs_size = destination->width0;
  
  	*fb = enc->fb = CALLOC_STRUCT(rvid_buffer);

-   if (!rvid_create_buffer(enc->screen, enc->fb, 512, PIPE_USAGE_STAGING)) 
{
+   if (!rvid_create_feedback_buffer(enc->screen, enc->fb)) {
RVID_ERR("Can't create feedback buffer.\n");
return;
}
if (!radeon_emitted(enc->cs, 0))
enc->session(enc);
enc->encode(enc);
enc->feedback(enc);
  }
  
  static void rvce_end_frame(struct pipe_video_codec *encoder,

diff --git a/src/gallium/drivers/radeon/radeon_video.c 
b/src/gallium/drivers/radeon/radeon_video.c
index d7c5a16..f60ae05 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -65,20 +65,32 @@ bool rvid_create_buffer(struct pipe_screen *screen, struct 
rvid_buffer *buffer,
unsigned size, unsigned usage)
  {
memset(buffer, 0, sizeof(*buffer));
buffer->usage = usage;
buffer->res = (struct r600_resource *)
pipe_buffer_create(screen, PIPE_BIND_CUSTOM, usage, size);
  
  	return buffer->res != NULL;

  }
  
+bool rvid_create_feedback_buffer(struct pipe_screen *screen,

+struct rvid_buffer *buffer)
+{
+   /* The kernel's CS checker asks for at least 4KB space.
+*
+* TODO If we update the kernel checker to be satisfied with less,
+* we could save some memory here (since the sub-allocator could be
+* used).
+*/
+   return rvid_create_bu

Re: [Mesa-dev] [PATCH 7/7] egl: Unify the EGLint/EGLAttrib paths in eglCreateSync*

2016-09-29 Thread Emil Velikov

On 28 September 2016 at 07:28, Chad Versace  wrote:
> Pre-patch, there were two code paths for parsing EGLSync attribute
> lists: one path for old-style EGLint lists, used by eglCreateSyncKHR,
> and another for new-style EGLAttrib lists, used by eglCreateSync (1.5)
> and eglCreateSync64 (EGL_KHR_cl_event2).
>
Actually we might want to use the same helper instead of
_eglConvertAttribsToInt for all entry points where the pre-1.5 entry
point was using EGLint while the EGL 1.5 one uses EGLAttrib.

In those cases we currently a) lose the upper bits (admittedly they
aren't used afaics) and b) we'll error out if the user provides an
empty/null list (not the most useful thing to do, but still).

Afaics all the EGL 1.5 entry points either explicitly state that
attrib_list can be null or empty or say that all the unspecified
attribs will be set to their default value.
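
A minimal sketch of the shared-helper idea, under stated assumptions: the names `sketch_convert_int_attribs`, `sketch_int_t` and `sketch_attrib_t` are hypothetical and are not Mesa's. The EGLint-style list is widened into a pointer-sized list, and a NULL list is treated like an empty one instead of being rejected. Only the numeric value of EGL_NONE (0x3038) is taken from the EGL headers.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_EGL_NONE 0x3038     /* same value as EGL_NONE */

typedef int32_t  sketch_int_t;     /* stands in for EGLint */
typedef intptr_t sketch_attrib_t;  /* stands in for EGLAttrib */

/* Widen an EGL_NONE-terminated list of name/value pairs into a
 * freshly allocated pointer-sized list. A NULL input yields a list
 * that holds only the terminator. The input is assumed to be well
 * formed (complete pairs). Returns NULL only if allocation fails;
 * the caller frees the result. */
static sketch_attrib_t *
sketch_convert_int_attribs(const sketch_int_t *in)
{
   size_t n = 0;

   if (in) {
      /* Only names (even positions) are compared to the terminator. */
      while (in[n] != SKETCH_EGL_NONE)
         n += 2;
   }

   sketch_attrib_t *out = malloc((n + 1) * sizeof(*out));
   if (!out)
      return NULL;

   for (size_t i = 0; i < n; i++)
      out[i] = (sketch_attrib_t)in[i];
   out[n] = SKETCH_EGL_NONE;

   return out;
}
```

Widening in this direction cannot lose bits; the loss mentioned above only happens when a pointer-sized list is narrowed to EGLint.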

Regards,
Emil

Re: [Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Christian König

> What special memory placement requirements are those, and how are they 
> expressed?

Uff, where should I start?

The UVD hardware can only access the GPU address space through two 
special apertures. But most of the time you have to deal with more than 
two buffers at the same time.


So to make all buffers accessible for a command submission the kernel 
needs to be able to move them around individually, while still taking 
care to not make buffers from the command submissions which are 
currently in flight inaccessible.


If multiple message/feedback/bitstream/context buffers (or even DPB, but 
I think those are too large) are sub-allocated then the kernel can't move 
them around individually and you will run into a bunch of problems when 
you increase the number of streams handled at the same time.


> If they are captured properly in the alignment requirements and flags 
> then the approach should work.

Not even remotely. Userspace isn't aware of those requirements because 
it isn't aware of the physical placement of the buffers. That is all 
handled inside the kernel.


I suggest that you just add a flag to exclude UVD buffers from being 
sub-allocated.


For VCE the requirements are fortunately not so problematic. I suggest 
that you just increase the buffer size to 4096.
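
The two suggestions above can be sketched roughly as follows. Every name here (`SKETCH_FLAG_NO_SUBALLOC`, `sketch_use_suballocator`, `sketch_vce_feedback_size`) and the sub-allocation threshold are hypothetical and do not correspond to real winsys or driver symbols; only the 4096-byte minimum comes from the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the buffer must get its own kernel BO so the
 * kernel can move it independently of every other buffer. */
#define SKETCH_FLAG_NO_SUBALLOC (1u << 0)

/* Hypothetical threshold below which a sub-allocator would normally
 * carve the buffer out of a larger shared BO. */
#define SKETCH_SUBALLOC_MAX_SIZE 4096u

/* Minimum size named in the thread for VCE feedback buffers. */
#define SKETCH_VCE_FEEDBACK_SIZE 4096u

/* Decide whether a request may be served by the sub-allocator. */
static bool sketch_use_suballocator(size_t size, unsigned flags)
{
   if (flags & SKETCH_FLAG_NO_SUBALLOC)
      return false;
   return size < SKETCH_SUBALLOC_MAX_SIZE;
}

/* Round a VCE feedback buffer request up to the minimum size. */
static size_t sketch_vce_feedback_size(size_t requested)
{
   return requested < SKETCH_VCE_FEEDBACK_SIZE ?
          SKETCH_VCE_FEEDBACK_SIZE : requested;
}
```

With a flag like this, UVD buffers would always bypass the sub-allocator regardless of their size, while the previous 512-byte VCE request would simply grow to 4096 bytes.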


Regards,
Christian.

On 29.09.2016 18:59, Nicolai Hähnle wrote:

On 29.09.2016 18:52, Christian König wrote:

NAK to the whole approach.

VCE feedback buffers are completely separate from UVD or other MM
feedback buffers, so they shouldn't be allocated in radeon_video.c.


I can rename the function to rvce_create_feedback_buffer and move it 
to radeon_vce.c, that shouldn't be a problem.



In addition to that, older UVD message, feedback and bitstream buffers
have special memory placement requirements. So you clearly shouldn't
allocate those from the sub allocator.


What special memory placement requirements are those, and how are they 
expressed?


If they are captured properly in the alignment requirements and flags 
then the approach should work.


Nicolai



Sorry that I didn't object earlier, but I wasn't aware of this change
till now.

Regards,
Christian.

On 29.09.2016 18:35, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

The kernel's CS checker requires it. This fixes a regression
introduced by
the buffer sub-allocation.

Cc: Christian König 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
---
  src/gallium/drivers/radeon/radeon_vce.c   |  6 +++---
  src/gallium/drivers/radeon/radeon_video.c | 12 
  src/gallium/drivers/radeon/radeon_video.h |  3 +++
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c
b/src/gallium/drivers/radeon/radeon_vce.c
index 10c5a78..dd4c367 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -232,21 +232,21 @@ void rvce_frame_offset(struct rvce_encoder *enc,
struct rvce_cpb_slot *slot,
  }
/**
   * destroy this video encoder
   */
  static void rvce_destroy(struct pipe_video_codec *encoder)
  {
  struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
  if (enc->stream_handle) {
  struct rvid_buffer fb;
-rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+rvid_create_feedback_buffer(enc->screen, &fb);
  enc->fb = &fb;
  enc->session(enc);
  enc->feedback(enc);
  enc->destroy(enc);
  flush(enc);
  rvid_destroy_buffer(&fb);
  }
  rvid_destroy_buffer(&enc->cpb);
  enc->ws->cs_destroy(enc->cs);
  FREE(enc->cpb_array);
@@ -275,21 +275,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,
if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
  reset_cpb(enc);
  else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
   pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
  sort_cpb(enc);

  if (!enc->stream_handle) {
  struct rvid_buffer fb;
  enc->stream_handle = rvid_alloc_stream_handle();
-rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+rvid_create_feedback_buffer(enc->screen, &fb);
  enc->fb = &fb;
  enc->session(enc);
  enc->create(enc);
  enc->config(enc);
  enc->feedback(enc);
  flush(enc);
  //dump_feedback(enc, &fb);
  rvid_destroy_buffer(&fb);
  need_rate_control = false;
  }
@@ -304,21 +304,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,
  static void rvce_encode_bitstream(struct pipe_video_codec *encoder,
struct pipe_video_buffer *source,
struct pipe_resource *destination,
void **fb)
  {
  struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
  enc->get_buffer(destination, &enc->bs_handle, NULL);
  enc->bs_size = destination->width0;
*fb = enc->fb = CALLOC_STRUCT(rvid_b

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 4:56 PM, Emil Velikov  wrote:
> On 29 September 2016 at 11:48, Marek Olšák  wrote:
>> On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle  wrote:
>>> On 28.09.2016 18:49, Marek Olšák wrote:

 From: Marek Olšák 

 More info about jemalloc:
https://github.com/jemalloc/jemalloc/wiki/History

 Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
 bytecode:
glibc:17.183s
jemalloc: 15.558s
diff: -9.5%

 The diff is -10.5% for a full shader-db run.
 ---

 TODO: The jemalloc dependency should be added to configure.ac before this.

 We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
 jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
 redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.

 Right now, I just use: export LDFLAGS=-ljemalloc
>>>
>>>
>>> Sounds good to me. It should probably be a configurable option, defaulting
>>> to jemalloc and failing if not available unless explicitly disabled.
>>
>> If it was a configurable option, almost nobody would use it. Let's
>> make it mandatory.
>>
> This combined with ...
>
>>>
>>> On the Gallium side of things, switching to jemalloc could be pretty
>>> straightforward via the macros in u_memory.h, once we know that they're
>>> actually used consistently (which we currently don't -- it would be nice to
>>> know how jemalloc and glibc malloc react when the calls are mixed).
>>
>> Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
>> would be more robust.
>>
> ... this is not a wise move.
>
> Don't force jemalloc onto everyone without having an explicit ACK from
> a wide audience, please ? Considering the static/shared link (or w/o
> jemalloc all together) distributions will have their
> preferences/policies which won't align with my/your view.

I guess we can have an option to disable jemalloc, but only if most
users won't use that option. The real problem is that the GLSL
compiler is alloc-bound and anybody wanting to use the GLSL compiler
should stay away from glibc's allocator.
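
A rough sketch of what an opt-out could look like at the source level, assuming hypothetical hook macros (`SKETCH_MALLOC`, `SKETCH_REALLOC`, `SKETCH_FREE`) that a configure option might point at another allocator. Nothing here calls jemalloc directly; the defaults are the C library functions, so the sketch builds without extra dependencies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical build-time hooks: a configure option could define
 * these to another allocator's entry points. They default to the
 * C library. */
#ifndef SKETCH_MALLOC
#define SKETCH_MALLOC(size)       malloc(size)
#define SKETCH_REALLOC(ptr, size) realloc(ptr, size)
#define SKETCH_FREE(ptr)          free(ptr)
#endif

/* Every allocation of the (hypothetical) subsystem goes through
 * these wrappers, so a block is never allocated by one allocator
 * and released by another. */
static void *sketch_alloc(size_t size)
{
   return SKETCH_MALLOC(size);
}

static void *sketch_resize(void *ptr, size_t size)
{
   return SKETCH_REALLOC(ptr, size);
}

static void sketch_release(void *ptr)
{
   SKETCH_FREE(ptr);
}

/* Example user of the wrappers: duplicate a string. */
static char *sketch_strdup(const char *s)
{
   size_t n = strlen(s) + 1;
   char *copy = sketch_alloc(n);

   if (copy)
      memcpy(copy, s, n);
   return copy;
}
```

Keeping the indirection in one place limits the choice of allocator to a single subsystem, which matches the narrower "ralloc only" scope discussed in this message.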

The GLSL compiler can be slowed down significantly by keeping 5x
LLVMContext in memory between compilations in radeonsi. The fact that
radeonsi can indirectly slow down the GLSL compiler (but not LLVM) is
a strong indication that we have a problem.

Since I happen to be responsible for delivering a well-performing
driver for our hw across all installations, I guess you understand why
I'm inclined to making jemalloc mandatory for ralloc at least. Using
jemalloc for whole Mesa is not important and that idea can be dropped.

One comment on distributions. Some distributions complained years ago
that I made LLVM mandatory for r300g even though it worked fine
without it (but too slow on some GPUs). I made it clear that if they
didn't like the LLVM dependency, they were free to stop shipping that
driver. That taught me that distributions are not always interested in
delivering Mesa in the configuration and level of quality that was
intended by Mesa developers. If there is an option to make something
slower or worse, distributions shouldn't have the option, ever.

Marek

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Clemens Eisserer

Hi,

> The GLSL compiler can be slowed down significantly by keeping 5x
> LLVMContext in memory between compilations in radeonsi. The fact that
> radeonsi can indirectly slow down the GLSL compiler (but not LLVM) is
> a strong indication that we have a problem.

This will mean there will be two heap regions in virtual address space.
As high-performance malloc implementations are quite reluctant to release
memory back to the kernel - wouldn't it be good to have at least some
numbers regarding memory footprint before making jemalloc the default?
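
One possible way to collect such numbers, sketched under assumptions: the process's peak resident set size is read with POSIX `getrusage()`, and `sketch_workload` is a hypothetical stand-in for a real compile job. On Linux `ru_maxrss` is reported in kilobytes; other systems may use a different unit.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Peak resident set size of the calling process as reported by
 * getrusage(). Returns -1 on failure. */
static long sketch_peak_rss(void)
{
   struct rusage ru;

   if (getrusage(RUSAGE_SELF, &ru) != 0)
      return -1;
   return ru.ru_maxrss;
}

/* Hypothetical stand-in for a compile job: many small allocations
 * that are written to and then freed again. */
static void sketch_workload(size_t count, size_t size)
{
   void **blocks = malloc(count * sizeof(*blocks));

   if (!blocks)
      return;

   for (size_t i = 0; i < count; i++) {
      blocks[i] = malloc(size);
      if (blocks[i])
         memset(blocks[i], 0xab, size);
   }

   for (size_t i = 0; i < count; i++)
      free(blocks[i]);
   free(blocks);
}
```

Running the same workload once with each allocator linked in and comparing the reported peak would give a first, coarse footprint comparison.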

Br, Clemens

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Emil Velikov

On 29 September 2016 at 18:48, Marek Olšák  wrote:
> On Thu, Sep 29, 2016 at 4:56 PM, Emil Velikov  
> wrote:
>> On 29 September 2016 at 11:48, Marek Olšák  wrote:
>>> On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle  wrote:
 On 28.09.2016 18:49, Marek Olšák wrote:
>
> From: Marek Olšák 
>
> More info about jemalloc:
>https://github.com/jemalloc/jemalloc/wiki/History
>
> Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
> bytecode:
>glibc:17.183s
>jemalloc: 15.558s
>diff: -9.5%
>
> The diff is -10.5% for a full shader-db run.
> ---
>
> TODO: The jemalloc dependency should be added to configure.ac before this.
>
> We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
> jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
> redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.
>
> Right now, I just use: export LDFLAGS=-ljemalloc


 Sounds good to me. It should probably be a configurable option, defaulting
 to jemalloc and failing if not available unless explicitly disabled.
>>>
>>> If it was a configurable option, almost nobody would use it. Let's
>>> make it mandatory.
>>>
>> This combined with ...
>>

 On the Gallium side of things, switching to jemalloc could be pretty
 straightforward via the macros in u_memory.h, once we know that they're
 actually used consistently (which we currently don't -- it would be nice to
 know how jemalloc and glibc malloc react when the calls are mixed).
>>>
>>> Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
>>> would be more robust.
>>>
>> ... this is not a wise move.
>>
>> Don't force jemalloc onto everyone without having an explicit ACK from
>> a wide audience, please ? Considering the static/shared link (or w/o
>> jemalloc all together) distributions will have their
>> preferences/policies which won't align with my/your view.
>
> I guess we can have an option to disable jemalloc, but only if most
> users won't use that option. The real problem is that the GLSL
> compiler is alloc-bound and anybody wanting to use the GLSL compiler
> should stay away from glibc's allocator.
>
> The GLSL compiler can be slowed down significantly by keeping 5x
> LLVMContext in memory between compilations in radeonsi. The fact that
> radeonsi can indirectly slow down the GLSL compiler (but not LLVM) is
> a strong indication that we have a problem.
>
If the issue is present in only one driver, one might be looking at
the wrong end. Alternatively others will also be in favour of this and
things will flow naturally.

> Since I happen to be responsible for delivering a well-performing
> driver for our hw across all installations, I guess you understand why
> I'm inclined to making jemalloc mandatory for ralloc at least. Using
> jemalloc for whole Mesa is not important and that idea can be dropped.
>
> One comment on distributions. Some distributions complained years ago
> that I made LLVM mandatory for r300g even though it worked fine
> without it (but too slow on some GPUs). I made it clear that if they
> didn't like the LLVM dependency, they were free to stop shipping that
> driver. That taught me that distributions are not always interested in
> delivering Mesa in the configuration and level of quality that was
> intended by Mesa developers. If there is an option to make something
> slower or worse, distributions shouldn't have the option, ever.
>
I see your point even before you stated it, but one should not enforce
their preference onto everyone. That is not how open-source works
afaict.

Yes, I've seen distros that patch out the llvm r300 "dependency",
surely you don't want this happening here as well ?

Afaict you're getting the stick out for distribution (maintainers)
while one ought to put a carrot at the end of it.

In 99% of the cases you don't want reports of people running patched mesa
where you have little to no idea what's happening. By forcing such
decisions you effectively give the finger to maintainers and a "do
apply local patches" card to everyone :-(

Simply add an identifier to the vendor string, and if someone complains
about "sub par" performance, point them to
the wiki.


Thanks
Emil

Re: [Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-29 Thread Rob Clark

On Tue, Sep 27, 2016 at 10:56 PM, Michel Dänzer  wrote:
> On 28/09/16 12:33 AM, Rob Clark wrote:
>> Signed-off-by: Rob Clark 
>> ---
>> I had a scan through the rest of pipe_resource allocations, and I think
>> this is the only remaining one (besides r600_alloc_buffer_struct())
>> which was using MALLOC_STRUCT()..  sorry 'bout that
>
> Note that the MALLOC_STRUCT here isn't relevant:
>
>
>> diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c 
>> b/src/gallium/drivers/r300/r300_screen_buffer.c
>> index 4747058..24dd92f 100644
>> --- a/src/gallium/drivers/r300/r300_screen_buffer.c
>> +++ b/src/gallium/drivers/r300/r300_screen_buffer.c
>> @@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct 
>> pipe_screen *screen,
>>  rbuf = MALLOC_STRUCT(r300_resource);
>>
>>  rbuf->b.b = *templ;
>
> The pipe_resource::next field is copied in from the template here, so
> the question is really whether the next field of the template is
> initialized to NULL by all callers.

bleh.. right, ok, I guess I need to track down which callers aren't
zero-initializing the templ.
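
For illustration, a caller-side sketch of a zero-initialized template. `struct sketch_templ_resource` is a hypothetical, heavily reduced stand-in for pipe_resource and `sketch_make_template` is not a real Gallium helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for pipe_resource. */
struct sketch_templ_resource {
   unsigned width0;
   struct sketch_templ_resource *next;
};

/* Caller side: start from an all-zero template so that fields the
 * caller does not set explicitly (here: next) are well defined. */
static struct sketch_templ_resource sketch_make_template(unsigned width)
{
   struct sketch_templ_resource templ;

   memset(&templ, 0, sizeof(templ));
   templ.width0 = width;
   return templ;
}
```

A template left uninitialized on the stack would instead carry whatever bytes happened to be there, and the struct copy in the driver would propagate them into `next`.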

BR,
-R

Re: [Mesa-dev] [PATCH] i965/gen8+: Enable GL_OES_viewport_array

2016-09-29 Thread Anuj Phogat

On Thu, Sep 22, 2016 at 5:59 PM, Ilia Mirkin  wrote:
> On Wed, Sep 21, 2016 at 2:15 PM, Anuj Phogat  wrote:
>> Signed-off-by: Anuj Phogat 
>>
>> ---
>> This patch requires below series:
>> https://patchwork.freedesktop.org/series/12594/
>
> FYI, this is now pushed. Given that the ext relies on
> OES_geometry_shader, the placement of the enable makes sense. However
> all 3 of these should probably go into the gen7.5 section since HSW
> now has ES 3.1 as well. But that should be done later. As is, this is
>
Testing showed that I either have to enable ARB_viewport_array along
with OES_viewport_array in GLES contexts or expose the enums
for OES_viewport_array. I'm sending out a few patches to fix these issues.

I also tried enabling OES_geometry_shader and OES_viewport_array
on hsw but hit some khronos cts regressions. I'll work on them.

> Reviewed-by: Ilia Mirkin 
>
>> ---
>>  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
>> b/src/mesa/drivers/dri/i965/intel_extensions.c
>> index 93eb966..53bd7cc 100644
>> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
>> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
>> @@ -404,6 +404,7 @@ intelInitExtensions(struct gl_context *ctx)
>>ctx->Extensions.ARB_ES3_2_compatibility = true;
>>ctx->Extensions.OES_geometry_shader = true;
>>ctx->Extensions.OES_texture_cube_map_array = true;
>> +  ctx->Extensions.OES_viewport_array = true;
>> }
>>
>> if (brw->gen >= 9) {
>> --
>> 2.5.5
>>

Re: [Mesa-dev] [PATCH] i965/gen8+: Enable GL_OES_viewport_array

2016-09-29 Thread Ilia Mirkin

On Thu, Sep 29, 2016 at 2:40 PM, Anuj Phogat  wrote:
> On Thu, Sep 22, 2016 at 5:59 PM, Ilia Mirkin  wrote:
>> On Wed, Sep 21, 2016 at 2:15 PM, Anuj Phogat  wrote:
>>> Signed-off-by: Anuj Phogat 
>>>
>>> ---
>>> This patch requires below series:
>>> https://patchwork.freedesktop.org/series/12594/
>>
>> FYI, this is now pushed. Given that the ext relies on
>> OES_geometry_shader, the placement of the enable makes sense. However
>> all 3 of these should probably go into the gen7.5 section since HSW
>> now has ES 3.1 as well. But that should be done later. As is, this is
>>
> Testing showed that I either have to enable ARB_viewport_array along
> with OES_viewport_array in GLES contexts or expose the enums
> for OES_viewport_array. I'm sending out a few patches to fix these issues.

I was aware of this. I didn't think it'd be an issue.

[Mesa-dev] [PATCH 3/3] i965/gen8+: Enable GL_OES_viewport_array

2016-09-29 Thread Anuj Phogat

This patch causes 2 regressions in khronos' gles cts tests
on various intel platforms.
Failing tests:
ES3-CTS.functional.state_query.integers.viewport_getinteger
ES3-CTS.functional.state_query.integers.viewport_getfloat

Here is an explanation of what's causing the failures:

CTS tests are not clamping the x, y location of the viewport's
bottom-left corner as recommended by ARB_viewport_array and
OES_viewport_array:
"The location of the viewport's bottom-left corner, given by (x,y), are
 clamped to be within the implementation-dependent viewport bounds range.
 The viewport bounds range [min, max] tuple may be determined by
 calling GetFloatv with the symbolic constant VIEWPORT_BOUNDS_RANGE_OES"

I will open a khronos CTS bug to get the test fixed.
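
A small sketch of the clamping a test would have to account for. The bounds below are illustrative only; a real test would query VIEWPORT_BOUNDS_RANGE_OES at run time. The names `sketch_clamp` and `sketch_expected_origin` are hypothetical.

```c
/* Illustrative bounds only; a real test would query the
 * implementation-dependent range instead of hard-coding it. */
static const float sketch_bounds_min = -32768.0f;
static const float sketch_bounds_max = 32767.0f;

static float sketch_clamp(float v, float lo, float hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Origin a state query is expected to return after the viewport
 * was set with bottom-left corner (x, y). */
static void sketch_expected_origin(float x, float y, float out[2])
{
   out[0] = sketch_clamp(x, sketch_bounds_min, sketch_bounds_max);
   out[1] = sketch_clamp(y, sketch_bounds_min, sketch_bounds_max);
}
```

A test that compares the queried origin against the unclamped input would fail for any x or y outside the range, which is the behaviour described above.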

V2: Initialize the relevant variables for GL_OES_viewport_array on hsw+

Signed-off-by: Anuj Phogat 
Cc: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_context.c  | 5 +++--
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6efad78..b688612 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -776,8 +776,9 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.MaxViewportHeight = 32768;
}
 
-   /* ARB_viewport_array */
-   if (brw->gen >= 6 && ctx->API == API_OPENGL_CORE) {
+   /* ARB_viewport_array, OES_viewport_array */
+   if ((brw->gen >= 6 && ctx->API == API_OPENGL_CORE) ||
+   (brw->gen >= 8  && ctx->API == API_OPENGLES2)) {
   ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
   ctx->Const.ViewportSubpixelBits = 0;
 
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 93eb966..53bd7cc 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -404,6 +404,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_ES3_2_compatibility = true;
   ctx->Extensions.OES_geometry_shader = true;
   ctx->Extensions.OES_texture_cube_map_array = true;
+  ctx->Extensions.OES_viewport_array = true;
}
 
if (brw->gen >= 9) {
-- 
2.5.5


[Mesa-dev] [PATCH 2/3] mesa: Add a check for OES_viewport_array

2016-09-29 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
Cc: Ilia Mirkin 
---
 src/mesa/main/viewport.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/viewport.c b/src/mesa/main/viewport.c
index f59723f..bd58044 100644
--- a/src/mesa/main/viewport.c
+++ b/src/mesa/main/viewport.c
@@ -52,7 +52,9 @@ set_viewport_no_notify(struct gl_context *ctx, unsigned idx,
 * determined by calling GetFloatv with the symbolic constant
 * VIEWPORT_BOUNDS_RANGE (see section 6.1)."
 */
-   if (ctx->Extensions.ARB_viewport_array) {
+   if (ctx->Extensions.ARB_viewport_array ||
+   (ctx->Extensions.OES_viewport_array &&
+_mesa_is_gles31(ctx))) {
   x = CLAMP(x,
 ctx->Const.ViewportBounds.Min, ctx->Const.ViewportBounds.Max);
   y = CLAMP(y,
-- 
2.5.5


[Mesa-dev] [PATCH 1/3] mesa: Enable enums for OES_viewport_array

2016-09-29 Thread Anuj Phogat

Signed-off-by: Anuj Phogat 
Cc: Ilia Mirkin 
---
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 8 
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index e7ebc7f..64a4b0e 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -405,6 +405,12 @@ static const int 
extra_ARB_viewport_array_or_oes_geometry_shader[] = {
EXTRA_END
 };
 
+static const int extra_ARB_viewport_array_or_oes_viewport_array[] = {
+   EXT(ARB_viewport_array),
+   EXT(OES_viewport_array),
+   EXTRA_END
+};
+
 static const int extra_ARB_gpu_shader5_or_oes_geometry_shader[] = {
EXT(ARB_gpu_shader5),
EXTRA_EXT_ES_GS,
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index f65960a..7520a39 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -615,10 +615,10 @@ descriptor=[
   [ "PRIMITIVE_BOUNDING_BOX_ARB", "CONTEXT_FLOAT8(PrimitiveBoundingBox), 
extra_OES_primitive_bounding_box" ],
 
 # GL_ARB_viewport_array / GL_OES_viewport_array
-  [ "MAX_VIEWPORTS", "CONTEXT_INT(Const.MaxViewports), 
extra_ARB_viewport_array" ],
-  [ "VIEWPORT_SUBPIXEL_BITS", "CONTEXT_INT(Const.ViewportSubpixelBits), 
extra_ARB_viewport_array" ],
-  [ "VIEWPORT_BOUNDS_RANGE", "CONTEXT_FLOAT2(Const.ViewportBounds), 
extra_ARB_viewport_array" ],
-  [ "VIEWPORT_INDEX_PROVOKING_VERTEX", 
"CONTEXT_ENUM(Const.LayerAndVPIndexProvokingVertex), extra_ARB_viewport_array" 
],
+  [ "MAX_VIEWPORTS", "CONTEXT_INT(Const.MaxViewports), 
extra_ARB_viewport_array_or_oes_viewport_array" ],
+  [ "VIEWPORT_SUBPIXEL_BITS", "CONTEXT_INT(Const.ViewportSubpixelBits), 
extra_ARB_viewport_array_or_oes_viewport_array" ],
+  [ "VIEWPORT_BOUNDS_RANGE", "CONTEXT_FLOAT2(Const.ViewportBounds), 
extra_ARB_viewport_array_or_oes_viewport_array" ],
+  [ "VIEWPORT_INDEX_PROVOKING_VERTEX", 
"CONTEXT_ENUM(Const.LayerAndVPIndexProvokingVertex), 
extra_ARB_viewport_array_or_oes_viewport_array" ],
 ]},
 
 { "apis": ["GL_CORE", "GLES32"], "params": [
-- 
2.5.5


Re: [Mesa-dev] [PATCH 1/3] mesa: Enable enums for OES_viewport_array

2016-09-29 Thread Ilia Mirkin

2016-09-29 14:42 GMT-04:00 Anuj Phogat :
> Signed-off-by: Anuj Phogat 
> Cc: Ilia Mirkin 
> ---
>  src/mesa/main/get.c  | 6 ++
>  src/mesa/main/get_hash_params.py | 8 
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index e7ebc7f..64a4b0e 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -405,6 +405,12 @@ static const int 
> extra_ARB_viewport_array_or_oes_geometry_shader[] = {
> EXTRA_END
>  };
>
> +static const int extra_ARB_viewport_array_or_oes_viewport_array[] = {
> +   EXT(ARB_viewport_array),
> +   EXT(OES_viewport_array),
> +   EXTRA_END
> +};

I originally had this patch in my series but took it out - why isn't
it reasonable to just flip on the ARB_viewport_array bit and move on?
(i.e. decree that in order to enable OES_viewport_array you must also
enable ARB_viewport_array)

  -ilia

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 6:23 PM, Emil Velikov  wrote:
> On 27 September 2016 at 13:47, Marek Olšák  wrote:
>> On Tue, Sep 27, 2016 at 2:34 PM, Emil Velikov  
>> wrote:
>>> On 26 September 2016 at 08:41, Tapani Pälli  wrote:
 This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
 and same crash in many dEQP EGL tests.

 I also found that some Qt example did a workaround because of this
 crash: https://bugreports.qt.io/browse/QTBUG-47509

 Signed-off-by: Tapani Pälli 
 ---

 This is RFC as I'm not sure if we are supposed to support this. I tried
 to verify overall pbuffer situation with some mesa-demos using pbuffer
 but those are not working for me at all with or without my patch.

  src/egl/main/eglconfig.c | 5 +
  1 file changed, 5 insertions(+)

 diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
 index 6161d26..20cf9d4 100644
 --- a/src/egl/main/eglconfig.c
 +++ b/src/egl/main/eglconfig.c
 @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean 
 for_matching)
return EGL_FALSE;
 }

 +   /* pbuffer with MSAA not supported */
>>> Fwiw on my system piglit also crashes + the demos don't render
>>> anything. So I'm leaning that we want this as-is (for the time being)
>>> + cc stable ?
>>>
>>> Can you apply a minor polish to the comment - "XXX/TODO: pbuffer +
>>> MSAA does not work + QT bugreport" or alike.
>>
>> Please don't add "XXX/TODO". pbuffers were spec'd in 1997 and were
>> meant to be used on GL 1.x hardware that didn't support MSAA
>> texturing, thus MSAA pbuffers don't make any sense. Just keep the
>> current comment.
>>
> Can we use your reply instead - it's wise to have the not as often
> visited parts nicely documented ?

I don't think my comment is useful if the pbuffer is not expected to
be bound as a texture. Your TODO comment is better. Sorry for the
noise.
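
For readers without the full patch at hand, a sketch of the kind of check under discussion. The body of the actual change is not shown in this thread, so the condition below is an assumption; `struct sketch_config` and `sketch_validate_config` are hypothetical. Only the bit values match the EGL headers.

```c
#include <stdbool.h>

#define SKETCH_EGL_PBUFFER_BIT 0x0001  /* same value as EGL_PBUFFER_BIT */
#define SKETCH_EGL_WINDOW_BIT  0x0004  /* same value as EGL_WINDOW_BIT */

/* Hypothetical, reduced stand-in for an EGL config. */
struct sketch_config {
   int surface_type;   /* bitmask of supported surface types */
   int samples;        /* samples per pixel, 0 means no MSAA */
};

/* Assumed rule: a config may not advertise pbuffer support and
 * multisampling at the same time. */
static bool sketch_validate_config(const struct sketch_config *conf)
{
   if ((conf->surface_type & SKETCH_EGL_PBUFFER_BIT) && conf->samples > 0)
      return false;
   return true;
}
```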

Marek

Re: [Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 8:37 PM, Rob Clark  wrote:
> On Tue, Sep 27, 2016 at 10:56 PM, Michel Dänzer  wrote:
>> On 28/09/16 12:33 AM, Rob Clark wrote:
>>> Signed-off-by: Rob Clark 
>>> ---
>>> I had a scan through the rest of pipe_resource allocations, and I think
>>> this is the only remaining one (besides r600_alloc_buffer_struct())
>>> which was using MALLOC_STRUCT()..  sorry 'bout that
>>
>> Note that the MALLOC_STRUCT here isn't relevant:
>>
>>
>>> diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c 
>>> b/src/gallium/drivers/r300/r300_screen_buffer.c
>>> index 4747058..24dd92f 100644
>>> --- a/src/gallium/drivers/r300/r300_screen_buffer.c
>>> +++ b/src/gallium/drivers/r300/r300_screen_buffer.c
>>> @@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct 
>>> pipe_screen *screen,
>>>  rbuf = MALLOC_STRUCT(r300_resource);
>>>
>>>  rbuf->b.b = *templ;
>>
>> The pipe_resource::next field is copied in from the template here, so
>> the question is really whether the next field of the template is
>> initialized to NULL by all callers.
>
> bleh.. right, ok, I guess I need to track down which callers aren't
> zero-initializing the templ.

or do: next = NULL; in all drivers?
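
A driver-side sketch of that alternative, using a hypothetical reduced struct (`struct sketch_drv_resource`) rather than the real pipe_resource: the template is copied as before, and `next` is then reset unconditionally.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for pipe_resource. */
struct sketch_drv_resource {
   unsigned width0;
   struct sketch_drv_resource *next;
};

/* Driver side: copy the template, then reset next so the result
 * does not depend on how the caller initialized the template.
 * Returns NULL if allocation fails; the caller frees the result. */
static struct sketch_drv_resource *
sketch_resource_create(const struct sketch_drv_resource *templ)
{
   struct sketch_drv_resource *res = malloc(sizeof(*res));

   if (!res)
      return NULL;

   *res = *templ;      /* copies every field, including next */
   res->next = NULL;   /* do not trust the template's value */
   return res;
}
```

The trade-off is one extra assignment in every driver's create path versus auditing every caller that builds a template.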

Marek

Re: [Mesa-dev] [PATCH 1/3] mesa: Enable enums for OES_viewport_array

2016-09-29 Thread Anuj Phogat

On Thu, Sep 29, 2016 at 11:45 AM, Ilia Mirkin  wrote:
> 2016-09-29 14:42 GMT-04:00 Anuj Phogat :
>> Signed-off-by: Anuj Phogat 
>> Cc: Ilia Mirkin 
>> ---
>>  src/mesa/main/get.c  | 6 ++
>>  src/mesa/main/get_hash_params.py | 8 
>>  2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
>> index e7ebc7f..64a4b0e 100644
>> --- a/src/mesa/main/get.c
>> +++ b/src/mesa/main/get.c
>> @@ -405,6 +405,12 @@ static const int 
>> extra_ARB_viewport_array_or_oes_geometry_shader[] = {
>> EXTRA_END
>>  };
>>
>> +static const int extra_ARB_viewport_array_or_oes_viewport_array[] = {
>> +   EXT(ARB_viewport_array),
>> +   EXT(OES_viewport_array),
>> +   EXTRA_END
>> +};
>
> I originally had this patch in my series but took it out - why isn't
> it reasonable to just flip on the ARB_viewport_array bit and move on?
> (i.e. decree that in order to enable OES_viewport_array you must also
> enable ARB_viewport_array)
>
I don't see a big reason to prefer one or the other. I noticed we are
doing it this way for a few other gles extensions and found it slightly
cleaner. Otherwise I don't have a strong preference.
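
To make the "either extension" semantics concrete, here is a simplified model of such a list. It assumes the check passes when any listed extension is enabled; the struct, macros and function (sketch_extensions, SKETCH_EXT, sketch_check_extra) are hypothetical stand-ins, not the real get.c code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extension flags; a stand-in for the real extension struct. */
struct sketch_extensions {
   bool ARB_viewport_array;
   bool OES_viewport_array;
};

/* Each list entry is the byte offset of a flag; the list ends with -1. */
#define SKETCH_EXT(f)    ((int) offsetof(struct sketch_extensions, f))
#define SKETCH_EXTRA_END (-1)

static const int sketch_extra_arb_or_oes_viewport_array[] = {
   SKETCH_EXT(ARB_viewport_array),
   SKETCH_EXT(OES_viewport_array),
   SKETCH_EXTRA_END
};

/* Returns true if at least one extension named in 'extra' is enabled. */
static bool
sketch_check_extra(const struct sketch_extensions *exts, const int *extra)
{
   assert(exts != NULL && extra != NULL);

   for (; *extra != SKETCH_EXTRA_END; ++extra) {
      if (*(const bool *) ((const char *) exts + *extra))
         return true;
   }
   return false;
}
```

With this shape, enabling OES_viewport_array alone is enough for the enum to be accepted, which is the difference from simply flipping on the ARB bit.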

>   -ilia

Re: [Mesa-dev] [RFC] ralloc: use jemalloc for faster GLSL compilation

2016-09-29 Thread Marek Olšák

On Thu, Sep 29, 2016 at 8:19 PM, Emil Velikov  wrote:
> On 29 September 2016 at 18:48, Marek Olšák  wrote:
>> On Thu, Sep 29, 2016 at 4:56 PM, Emil Velikov  
>> wrote:
>>> On 29 September 2016 at 11:48, Marek Olšák  wrote:
>>>> On Thu, Sep 29, 2016 at 11:20 AM, Nicolai Hähnle wrote:
>>>>> On 28.09.2016 18:49, Marek Olšák wrote:
>>>>>>
>>>>>> From: Marek Olšák
>>>>>>
>>>>>> More info about jemalloc:
>>>>>>    https://github.com/jemalloc/jemalloc/wiki/History
>>>>>>
>>>>>> Average from 3 takes compiling Alien Isolation shaders from GLSL to GCN
>>>>>> bytecode:
>>>>>>    glibc:    17.183s
>>>>>>    jemalloc: 15.558s
>>>>>>    diff:     -9.5%
>>>>>>
>>>>>> The diff is -10.5% for a full shader-db run.
>>>>>> ---
>>>>>>
>>>>>> TODO: The jemalloc dependency should be added to configure.ac before this.
>>>>>>
>>>>>> We can probably redirect all malloc/calloc/realloc/free calls in Mesa to
>>>>>> jemalloc. We can either use _mesa_jemalloc, etc. everywhere or we can
>>>>>> redirect calls to jemalloc using #define malloc _mesa_jemalloc, etc.
>>>>>>
>>>>>> Right now, I just use: export LDFLAGS=-ljemalloc
>>>>>
>>>>>
>>>>> Sounds good to me. It should probably be a configurable option, defaulting
>>>>> to jemalloc and failing if not available unless explicitly disabled.
>>>>
>>>> If it was a configurable option, almost nobody would use it. Let's
>>>> make it mandatory.
>>>>
>>> This combined with ...
>>>
>>>>>
>>>>> On the Gallium side of things, switching to jemalloc could be pretty
>>>>> straightforward via the macros in u_memory.h, once we know that they're
>>>>> actually used consistently (which we currently don't -- it would be nice to
>>>>> know how jemalloc and glibc malloc react when the calls are mixed).
>>>>
>>>> Redefining malloc/calloc/realloc/free/posix_memalign for all Mesa code
>>>> would be more robust.

>>> ... this is not a wise move.
>>>
>>> Don't force jemalloc onto everyone without having an explicit ACK from
>>> a wide audience, please ? Considering the static/shared link (or w/o
>>> jemalloc all together) distributions will have their
>>> preferences/policies which won't align with my/your view.
>>
>> I guess we can have an option to disable jemalloc, but only if most
>> users won't use that option. The real problem is that the GLSL
>> compiler is alloc-bound and anybody wanting to use the GLSL compiler
>> should stay away from glibc's allocator.
>>
>> The GLSL compiler can be slowed down significantly by keeping 5x
>> LLVMContext in memory between compilations in radeonsi. The fact that
>> radeonsi can indirectly slow down the GLSL compiler (but not LLVM) is
>> a strong indication that we have a problem.
>>
> If the issue is present in only one driver, one might be looking at
> the wrong end. Alternatively others will also be in favour of this and
> things will flow naturally.

The improvement is even better with the Gallium noop driver. When I
tested noop, the compile time was reduced by 15%. I think radeonsi is
the driver that will benefit the least from it, not the most.
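
For readers following the redirection part of this thread, the macro route could look roughly like the sketch below. Everything named sketch_* or SKETCH_* is hypothetical, and libc malloc/free stand in for whichever allocator the build would actually select:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts allocations routed through the wrapper (for illustration only). */
static size_t sketch_alloc_calls;

/* Hypothetical backend; a real build would forward to the chosen
 * allocator instead of the libc functions used here as a stand-in. */
static void *
sketch_backend_malloc(size_t size)
{
   assert(size > 0);
   ++sketch_alloc_calls;
   return malloc(size);
}

static void
sketch_backend_free(void *ptr)
{
   free(ptr);
}

/* Single switch point: code that uses these macros consistently can be
 * moved to another allocator by changing the two lines below. */
#define SKETCH_MALLOC(size) sketch_backend_malloc(size)
#define SKETCH_FREE(ptr)    sketch_backend_free(ptr)

static int *
sketch_make_int(int value)
{
   int *p = SKETCH_MALLOC(sizeof(*p));

   if (p)
      *p = value;
   return p;
}
```

The sketch also shows why mixing matters: memory obtained through SKETCH_MALLOC has to be released through SKETCH_FREE, otherwise two allocators end up handling the same pointer.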

Marek

Re: [Mesa-dev] [PATCH 1/3] mesa: Enable enums for OES_viewport_array

2016-09-29 Thread Ilia Mirkin

On Thu, Sep 29, 2016 at 3:03 PM, Anuj Phogat  wrote:
> On Thu, Sep 29, 2016 at 11:45 AM, Ilia Mirkin  wrote:
>> 2016-09-29 14:42 GMT-04:00 Anuj Phogat :
>>> Signed-off-by: Anuj Phogat 
>>> Cc: Ilia Mirkin 
>>> ---
>>>  src/mesa/main/get.c  | 6 ++
>>>  src/mesa/main/get_hash_params.py | 8 
>>>  2 files changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
>>> index e7ebc7f..64a4b0e 100644
>>> --- a/src/mesa/main/get.c
>>> +++ b/src/mesa/main/get.c
>>> @@ -405,6 +405,12 @@ static const int 
>>> extra_ARB_viewport_array_or_oes_geometry_shader[] = {
>>> EXTRA_END
>>>  };
>>>
>>> +static const int extra_ARB_viewport_array_or_oes_viewport_array[] = {
>>> +   EXT(ARB_viewport_array),
>>> +   EXT(OES_viewport_array),
>>> +   EXTRA_END
>>> +};
>>
>> I originally had this patch in my series but took it out - why isn't
>> it reasonable to just flip on the ARB_viewport_array bit and move on?
>> (i.e. decree that in order to enable OES_viewport_array you must also
>> enable ARB_viewport_array)
>>
> I don't see a big reason to prefer one or the other. I noticed we are
> doing it this way for a few other gles extensions and found it slightly
> cleaner. Otherwise I don't have a strong preference.

I don't feel too strongly about it either. If you think this is
better, this series is

Reviewed-by: Ilia Mirkin 

Re: [Mesa-dev] [PATCH 3/5] st/omx/dec/h265: decoder size should follow from sps

2016-09-29 Thread Leo Liu




On 09/27/2016 06:24 AM, Emil Velikov wrote:

On 23 September 2016 at 17:32, Leo Liu  wrote:

So that it will pass correct size to width(height)_in_samples in
uvd message buffer.


The st code is device agnostic. s/uvd/hardware/ perhaps ?


From st, the width/height are passed to the vl layer's
pipe_video_codec::width/height by calling create_video_codec, and from
there on to hw/uvd through the vl layer.


Regards,
Leo



-Emil



Re: [Mesa-dev] [PATCH 5/5] st/omx/dec/h265: fix the skip for before and after list

2016-09-29 Thread Leo Liu




On 09/27/2016 06:37 AM, Emil Velikov wrote:

On 23 September 2016 at 17:32, Leo Liu  wrote:

Should not be skipped when rps->used false


Please 'translate' "rps->used false" to English ?


Will do.

Also one might want
to mention if the patch/es fix any known issue - bugzilla, fixes
"jerky" playback, etc.


Issue came from more testing internally.



Afaict 3-5 are bugfixes so please add a bit more context in the commit
message


Sure. will do.


and
Cc: mesa-sta...@lists.freedesktop.org


The omx h265 dec is newly implemented, not in stable yet.


Reviewed-by: Emil Velikov 


Thanks for the reviews.

Regards,
Leo


Thanks
Emil



[Mesa-dev] more unsigned -> enum pipe_shader_type

2016-09-29 Thread Brian Paul



If anyone is looking for something simple to do, this is just a reminder 
that there are more places in gallium where we can replace unsigned with 
enum pipe_shader_type (a bunch was done in August).


Specifically,
pipe_screen::get_shader_param()
pipe_screen::get_compiler_options()
pipe_context::set_constant_buffer()
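
As an illustration of the kind of change meant here, a sketch with stand-in names (sketch_shader_type and sketch_get_max_inputs are hypothetical, and the limits in the table are made up):

```c
#include <assert.h>

/* Stand-in for enum pipe_shader_type; the values are illustrative. */
enum sketch_shader_type {
   SKETCH_SHADER_VERTEX,
   SKETCH_SHADER_FRAGMENT,
   SKETCH_SHADER_TYPES
};

/* Made-up per-stage limits, indexed by shader type. */
static const int sketch_max_inputs[SKETCH_SHADER_TYPES] = { 16, 32 };

/* Before the cleanup this parameter would be 'unsigned shader'.
 * With the enum, the prototype documents what is expected and
 * compilers and debuggers can show the symbolic name. */
static int
sketch_get_max_inputs(enum sketch_shader_type shader)
{
   assert(shader < SKETCH_SHADER_TYPES);
   return sketch_max_inputs[shader];
}
```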

-Brian

Re: [Mesa-dev] [PATCH 1/5] st/omx/dec/h265: add scaling list data

2016-09-29 Thread Leo Liu




On 09/27/2016 06:23 AM, Emil Velikov wrote:

On 23 September 2016 at 17:32, Leo Liu  wrote:

Specified by 7.3.4

There's a word missing in there ^ - table 7.3.4 ?


Yes, a word is missing; it's from subclause 7.3.4.




Signed-off-by: Leo Liu 
---
  src/gallium/state_trackers/omx/vid_dec_h265.c | 126 +-
  1 file changed, 121 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec_h265.c 
b/src/gallium/state_trackers/omx/vid_dec_h265.c
index 0772b4d..3c46505 100644
--- a/src/gallium/state_trackers/omx/vid_dec_h265.c
+++ b/src/gallium/state_trackers/omx/vid_dec_h265.c
@@ -57,6 +57,28 @@ enum {
 NAL_UNIT_TYPE_PPS = 34,
  };

+static const uint8_t Default_8x8_Intra[64] = {
+   16, 16, 16, 16, 17, 18, 21, 24,
+   16, 16, 16, 16, 17, 19, 22, 25,
+   16, 16, 17, 18, 20, 22, 25, 29,
+   16, 16, 18, 21, 24, 27, 31, 36,
+   17, 17, 20, 24, 30, 35, 41, 47,
+   18, 19, 22, 27, 35, 44, 54, 65,
+   21, 22, 25, 31, 41, 54, 70, 88,
+   24, 25, 29, 36, 47, 65, 88, 115
+};
+
+static const uint8_t Default_8x8_Inter[64] = {
+   16, 16, 16, 16, 17, 18, 20, 24,
+   16, 16, 16, 17, 18, 20, 24, 25,
+   16, 16, 17, 18, 20, 24, 25, 28,
+   16, 17, 18, 20, 24, 25, 28, 33,
+   17, 18, 20, 24, 25, 28, 33, 41,
+   18, 20, 24, 25, 28, 33, 41, 54,
+   20, 24, 25, 28, 33, 41, 54, 71,
+   24, 25, 28, 33, 41, 54, 71, 91
+};
+

Style used for the names is a bit iffy - use default_8x8_inter ?


The style is inherited from vid_dec_h264.c, and the same style is used
for all scaling list tables.




Since neither of these is omx specific worth moving these to aux/vl ?

This is specific to omx, and also specific to h265 (h264 has its own).

These tables are only needed for omx, not for vdpau and vaapi, because
only omx has to parse the scaling lists itself.


For vdpau and vaapi, the parsing is done by the framework.




  struct dpb_list {
 struct list_head list;
 struct pipe_video_buffer *buffer;
@@ -188,10 +210,104 @@ static unsigned profile_tier_level(struct vl_rbsp *rbsp,
 return level_idc;
  }

-static void scaling_list_data(void)
+static void scaling_list_data(vid_dec_PrivateType *priv,
+  struct vl_rbsp *rbsp, struct pipe_h265_sps *sps)
  {
-   /* TODO */
-   assert(0);
+   unsigned size_id, matrix_id;
+
+   for (size_id = 0; size_id < 4; ++size_id) {

Why would one loop over size_id, if close to everything in the loop is
special-cased on the size_id?


It loops matrix_id, but could be optimized. will do in v2.
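
For reference, the per-size_id differences in that loop come down to two small expressions that also appear in the v2 patch: the coefficient count and the wrap-around delta. A sketch, with hypothetical helper names (sketch_coef_num, sketch_next_coef) that are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Number of coefficients coded per matrix: 16 for 4x4 (size_id 0),
 * capped at 64 for the larger sizes. */
static unsigned
sketch_coef_num(unsigned size_id)
{
   unsigned n;

   assert(size_id < 4);
   n = 1u << (4 + (size_id << 1));
   return n < 64 ? n : 64;
}

/* Apply one signed delta to the previous coefficient, wrapping
 * modulo 256 so the result always fits in a uint8_t. */
static uint8_t
sketch_next_coef(uint8_t prev, int delta)
{
   assert(delta >= -128 && delta <= 127);
   return (uint8_t) ((prev + delta + 256) % 256);
}
```

Because the count is the same for every size_id above 0, only the DC coefficient handling and the default table selection still need to look at size_id.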

Thanks,
Leo



-Emil



Re: [Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Dieter Nützel


[bisected]

gallium/radeon: add query fences and r600_get_hw_query_params

introduces a regression on r600g/NI/Turks XT with Blender 2.76.
Picking/selecting with the right mouse button results in a segfault:

radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
Writing: /tmp/bh.crash.txt

[1]  Segmentation fault  blender

_This_ patch does _NOT_ solve it.

631c47384c1f45450359fd7d1df2c5f0c79f40bc is the first bad commit
commit 631c47384c1f45450359fd7d1df2c5f0c79f40bc
Author: Nicolai Hähnle 
Date:   Wed Sep 14 10:38:33 2016 +0200

gallium/radeon: add query fences and r600_get_hw_query_params

We will support the waiting option in ARB_query_buffer_object using
WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently
write their results with the highest bit set, and we can just use that;
for others, we have to write a fence explicitly.

ZPASS_DONE for occlusion queries writes its results with the high bit
set, but it writes up to 8 pairs of results (one for each DB). We have
to wait for all of these results, so let's just add an explicit fence.

The new function provides summary information to be used by subsequent
patches.

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Marek Olšák 

:040000 040000 bed7362ecccdebb63b505d50b3777dc10963aef9 fe9ca1f733c7897e1362194240e114482f91bbb3 M src


Regards,
Dieter.

On 29.09.2016 18:35, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

The kernel's CS checker requires it. This fixes a regression introduced 
by

the buffer sub-allocation.

Cc: Christian König 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
---
 src/gallium/drivers/radeon/radeon_vce.c   |  6 +++---
 src/gallium/drivers/radeon/radeon_video.c | 12 
 src/gallium/drivers/radeon/radeon_video.h |  3 +++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c
b/src/gallium/drivers/radeon/radeon_vce.c
index 10c5a78..dd4c367 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -232,21 +232,21 @@ void rvce_frame_offset(struct rvce_encoder *enc,
struct rvce_cpb_slot *slot,
 }

 /**
  * destroy this video encoder
  */
 static void rvce_destroy(struct pipe_video_codec *encoder)
 {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
if (enc->stream_handle) {
struct rvid_buffer fb;
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->feedback(enc);
enc->destroy(enc);
flush(enc);
rvid_destroy_buffer(&fb);
}
rvid_destroy_buffer(&enc->cpb);
enc->ws->cs_destroy(enc->cs);
FREE(enc->cpb_array);
@@ -275,21 +275,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,

if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
reset_cpb(enc);
else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
 pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
sort_cpb(enc);

if (!enc->stream_handle) {
struct rvid_buffer fb;
enc->stream_handle = rvid_alloc_stream_handle();
-   rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+   rvid_create_feedback_buffer(enc->screen, &fb);
enc->fb = &fb;
enc->session(enc);
enc->create(enc);
enc->config(enc);
enc->feedback(enc);
flush(enc);
//dump_feedback(enc, &fb);
rvid_destroy_buffer(&fb);
need_rate_control = false;
}
@@ -304,21 +304,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,
 static void rvce_encode_bitstream(struct pipe_video_codec *encoder,
  struct pipe_video_buffer *source,
  struct pipe_resource *destination,
  void **fb)
 {
struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
enc->get_buffer(destination, &enc->bs_handle, NULL);
enc->bs_size = destination->width0;

*fb = enc->fb = CALLOC_STRUC

Re: [Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Dieter Nützel


On 29.09.2016 22:34, Dieter Nützel wrote:

[bisected]

gallium/radeon: add query fences and r600_get_hw_query_params

introduces a regression on r600g/NI/Turks XT with Blender 2.76.
Picking/selecting with the right mouse button results in a segfault:

radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-22).
Writing: /tmp/bh.crash.txt

[1]  Segmentation fault  blender

_This_ patch does _NOT_ solve it.

631c47384c1f45450359fd7d1df2c5f0c79f40bc is the first bad commit
commit 631c47384c1f45450359fd7d1df2c5f0c79f40bc
Author: Nicolai Hähnle 
Date:   Wed Sep 14 10:38:33 2016 +0200

gallium/radeon: add query fences and r600_get_hw_query_params

We will support the waiting option in ARB_query_buffer_object using
WAIT_REG_MEM on an appropriate fence-like dword. Some queries conveniently
write their results with the highest bit set, and we can just use that;
for others, we have to write a fence explicitly.

ZPASS_DONE for occlusion queries writes its results with the high bit
set, but it writes up to 8 pairs of results (one for each DB). We have
to wait for all of these results, so let's just add an explicit fence.

The new function provides summary information to be used by subsequent
patches.

Reviewed-by: Edward O'Callaghan 
Reviewed-by: Marek Olšák 

:040000 040000 bed7362ecccdebb63b505d50b3777dc10963aef9 fe9ca1f733c7897e1362194240e114482f91bbb3 M src

Regards,
Dieter.


Addendum (read the logs carefully...):

[25614.322361] [drm:radeon_cs_packet_next_reloc [radeon]] *ERROR* No 
packet3 for relocation for packet at 303.

[25614.322363] [drm] ib[303]=0xC0044700
[25614.322364] [drm] ib[304]=0x0528
[25614.322364] [drm] ib[305]=0x0080
[25614.322364] [drm] ib[306]=0x2000
[25614.322365] [drm] ib[307]=0x8000
[25614.322365] [drm] ib[308]=0x
[25614.322384] [drm:evergreen_packet3_check.isra.14 [radeon]] *ERROR* 
bad EVENT_WRITE
[25614.322399] [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command 
stream !
[25614.322631] [drm:radeon_cs_packet_next_reloc [radeon]] *ERROR* No 
packet3 for relocation for packet at 717.

[25614.322636] [drm] ib[717]=0xC0044700
[25614.322636] [drm] ib[718]=0x0528
[25614.322637] [drm] ib[719]=0x0080
[25614.322637] [drm] ib[720]=0x2000
[25614.322638] [drm] ib[721]=0x8000
[25614.322638] [drm] ib[722]=0x
[25614.322656] [drm:evergreen_packet3_check.isra.14 [radeon]] *ERROR* 
bad EVENT_WRITE
[25614.322669] [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command 
stream !
[25614.323004] [drm:radeon_cs_packet_next_reloc [radeon]] *ERROR* No 
packet3 for relocation for packet at 706.

[25614.323005] [drm] ib[706]=0xC0044700
[25614.323006] [drm] ib[707]=0x0528
[25614.323006] [drm] ib[708]=0x0080
[25614.323007] [drm] ib[709]=0x2000
[25614.323007] [drm] ib[710]=0x8000
[25614.323007] [drm] ib[711]=0x
[25614.323025] [drm:evergreen_packet3_check.isra.14 [radeon]] *ERROR* 
bad EVENT_WRITE
[25614.323039] [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command 
stream !


Cheers,
Dieter.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] [Bug 38970] [bisected]piglit glx/glx-pixmap-multi failed

2016-09-29 Thread Anutex

I tried to debug this issue by changing the condition to check only for the
bad-magic and error return value (-1), and the test passed.

I am not sure, though, what the correct behaviour is in this condition.
Maybe we should add some other condition for the case where the hash table
already has the bucket data.
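
To illustrate the distinction, here is a small stand-alone sketch. It assumes an insert helper that reports "already present" with a positive value and failures with -1, as the == -1 test in the patch suggests; sketch_table and sketch_insert are hypothetical and not the real __glxHashInsert:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 8

/* Tiny fixed-size key/value table, just enough to show the return codes. */
struct sketch_table {
   unsigned long keys[SKETCH_SLOTS];
   void *values[SKETCH_SLOTS];
   int used;
};

/* Returns 0 when the key was added, 1 when it was already present,
 * and -1 on error (NULL table or table full).  This tri-state
 * convention is an assumption made for the sketch. */
static int
sketch_insert(struct sketch_table *t, unsigned long key, void *value)
{
   int i;

   if (!t)
      return -1;

   assert(t->used >= 0 && t->used <= SKETCH_SLOTS);

   for (i = 0; i < t->used; ++i) {
      if (t->keys[i] == key)
         return 1;   /* existing entry is left untouched */
   }

   if (t->used == SKETCH_SLOTS)
      return -1;

   t->keys[t->used] = key;
   t->values[t->used] = value;
   t->used++;
   return 0;
}
```

A caller that tests the result with a plain if (...) treats "already present" as a failure, while a caller that tests == -1 silently keeps the old entry. Which of the two is right for a drawable that is already in the table is exactly the open question.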
---
 src/glx/dri2_glx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index af388d9..a1fd9ff 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -411,12 +411,13 @@ dri2CreateDrawable(struct glx_screen *base, XID xDrawable,
   return NULL;
}
 
-   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw)) {
+   if (__glxHashInsert(pdp->dri2Hash, xDrawable, pdraw) == -1) {
   (*psc->core->destroyDrawable) (pdraw->driDrawable);
   DRI2DestroyDrawable(psc->base.dpy, xDrawable);
   free(pdraw);
   return None;
}
+   
 
/*
 * Make sure server has the same swap interval we do for the new
-- 
2.7.4


[Mesa-dev] [PATCH] st/omx/dec/h265: add scaling list data

2016-09-29 Thread Leo Liu

Specified by subclause 7.3.4

v2: get the loop optimized

Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/omx/vid_dec_h265.c | 102 --
 1 file changed, 97 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec_h265.c 
b/src/gallium/state_trackers/omx/vid_dec_h265.c
index 0772b4d..df68711 100644
--- a/src/gallium/state_trackers/omx/vid_dec_h265.c
+++ b/src/gallium/state_trackers/omx/vid_dec_h265.c
@@ -57,6 +57,28 @@ enum {
NAL_UNIT_TYPE_PPS = 34,
 };
 
+static const uint8_t Default_8x8_Intra[64] = {
+   16, 16, 16, 16, 17, 18, 21, 24,
+   16, 16, 16, 16, 17, 19, 22, 25,
+   16, 16, 17, 18, 20, 22, 25, 29,
+   16, 16, 18, 21, 24, 27, 31, 36,
+   17, 17, 20, 24, 30, 35, 41, 47,
+   18, 19, 22, 27, 35, 44, 54, 65,
+   21, 22, 25, 31, 41, 54, 70, 88,
+   24, 25, 29, 36, 47, 65, 88, 115
+};
+
+static const uint8_t Default_8x8_Inter[64] = {
+   16, 16, 16, 16, 17, 18, 20, 24,
+   16, 16, 16, 17, 18, 20, 24, 25,
+   16, 16, 17, 18, 20, 24, 25, 28,
+   16, 17, 18, 20, 24, 25, 28, 33,
+   17, 18, 20, 24, 25, 28, 33, 41,
+   18, 20, 24, 25, 28, 33, 41, 54,
+   20, 24, 25, 28, 33, 41, 54, 71,
+   24, 25, 28, 33, 41, 54, 71, 91
+};
+
 struct dpb_list {
struct list_head list;
struct pipe_video_buffer *buffer;
@@ -188,10 +210,80 @@ static unsigned profile_tier_level(struct vl_rbsp *rbsp,
return level_idc;
 }
 
-static void scaling_list_data(void)
+static void scaling_list_data(vid_dec_PrivateType *priv,
+  struct vl_rbsp *rbsp, struct pipe_h265_sps *sps)
 {
-   /* TODO */
-   assert(0);
+   unsigned size_id, matrix_id;
+   unsigned scaling_list_len[4] = { 16, 64, 64, 64 };
+   uint8_t scaling_list4x4[6][64] = {  };
+   int i;
+
+   uint8_t (*scaling_list_data[4])[6][64] = {
+(uint8_t (*)[6][64])scaling_list4x4,
+(uint8_t (*)[6][64])sps->ScalingList8x8,
+(uint8_t (*)[6][64])sps->ScalingList16x16,
+(uint8_t (*)[6][64])sps->ScalingList32x32
+   };
+   uint8_t (*scaling_list_dc_coeff[2])[6] = {
+  (uint8_t (*)[6])sps->ScalingListDCCoeff16x16,
+  (uint8_t (*)[6])sps->ScalingListDCCoeff32x32
+   };
+
+   for (size_id = 0; size_id < 4; ++size_id) {
+
+  for (matrix_id = 0; matrix_id < ((size_id == 3) ? 2 : 6); ++matrix_id) {
+ bool scaling_list_pred_mode_flag = vl_rbsp_u(rbsp, 1);
+
+ if (!scaling_list_pred_mode_flag) {
+/* scaling_list_pred_matrix_id_delta */;
+unsigned matrix_id_with_delta = matrix_id - vl_rbsp_ue(rbsp);
+
+if (matrix_id != matrix_id_with_delta) {
+   memcpy((*scaling_list_data[size_id])[matrix_id],
+  (*scaling_list_data[size_id])[matrix_id_with_delta],
+  scaling_list_len[size_id]);
+   if (size_id > 1)
+  (*scaling_list_dc_coeff[size_id - 2])[matrix_id] =
+ (*scaling_list_dc_coeff[size_id - 2])[matrix_id_with_delta];
+} else {
+   const uint8_t *d;
+
+   if (size_id == 0)
+  memset((*scaling_list_data[0])[matrix_id], 16, 16);
+   else {
+  if (size_id < 3)
+ d = (matrix_id < 3) ? Default_8x8_Intra : Default_8x8_Inter;
+  else
+ d = (matrix_id < 1) ? Default_8x8_Intra : Default_8x8_Inter;
+  memcpy((*scaling_list_data[size_id])[matrix_id], d,
+ scaling_list_len[size_id]);
+   }
+   if (size_id > 1)
+  (*scaling_list_dc_coeff[size_id - 2])[matrix_id] = 16;
+}
+ } else {
+int next_coef = 8;
+int coef_num = MIN2(64, (1 << (4 + (size_id << 1))));
+
+if (size_id > 1) {
+   /* scaling_list_dc_coef_minus8 */
+   next_coef = vl_rbsp_se(rbsp) + 8;
+   (*scaling_list_dc_coeff[size_id - 2])[matrix_id] = next_coef;
+}
+
+for (i = 0; i < coef_num; ++i) {
+   /* scaling_list_delta_coef */
+   next_coef = (next_coef + vl_rbsp_se(rbsp) + 256) % 256;
+   (*scaling_list_data[size_id])[matrix_id][i] = next_coef;
+}
+ }
+  }
+   }
+
+   for (i = 0; i < 6; ++i)
+  memcpy(sps->ScalingList4x4[i], scaling_list4x4[i], 16);
+
+   return;
 }
 
 static void st_ref_pic_set(vid_dec_PrivateType *priv, struct vl_rbsp *rbsp,
@@ -383,7 +475,7 @@ static void seq_parameter_set(vid_dec_PrivateType *priv, 
struct vl_rbsp *rbsp)
if (sps->scaling_list_enabled_flag)
   /* sps_scaling_list_data_present_flag */
   if (vl_rbsp_u(rbsp, 1))
- scaling_list_data();
+ scaling_list_data(priv, rbsp, sps);
 
sps->amp_enabled_flag = vl_rbsp_u(rbsp, 1);
sps->sample_adaptive_offset_enabled_flag = vl_rbsp_u(rbsp, 1);
@@ -506,7 +598,7 @@ static void picture_parameter_set(vid_dec_PrivateType *priv,
 
/* pps_scaling_list_data_present_flag */

Re: [Mesa-dev] [PATCH 1/4] radeon/vce: allocate at least 4KB of memory for the feedback buffer

2016-09-29 Thread Nicolai Hähnle


On 29.09.2016 18:52, Christian König wrote:

NAK to the whole approach.

VCE feedback buffers are completely separated to UVD or other MM
feedback buffers, so they shouldn't be allocated in radeon_video.c


I can rename the function to rvce_create_feedback_buffer and move it to 
radeon_vce.c, that shouldn't be a problem.



Additional to that older UVD message, feedback and bitstream buffers
have special memory placement requirements. So you clearly shouldn't
allocate those from the sub allocator.


What special memory placement requirements are those, and how are they 
expressed?


If they are captured properly in the alignment requirements and flags 
then the approach should work.


Nicolai



Sorry that I didn't objected earlier, but I wasn't aware of this change
till now.

Regards,
Christian.

On 29.09.2016 at 18:35, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

The kernel's CS checker requires it. This fixes a regression
introduced by
the buffer sub-allocation.

Cc: Christian König 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97976
---
  src/gallium/drivers/radeon/radeon_vce.c   |  6 +++---
  src/gallium/drivers/radeon/radeon_video.c | 12 
  src/gallium/drivers/radeon/radeon_video.h |  3 +++
  3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c
b/src/gallium/drivers/radeon/radeon_vce.c
index 10c5a78..dd4c367 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -232,21 +232,21 @@ void rvce_frame_offset(struct rvce_encoder *enc,
struct rvce_cpb_slot *slot,
  }
/**
   * destroy this video encoder
   */
  static void rvce_destroy(struct pipe_video_codec *encoder)
  {
  struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
  if (enc->stream_handle) {
  struct rvid_buffer fb;
-rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+rvid_create_feedback_buffer(enc->screen, &fb);
  enc->fb = &fb;
  enc->session(enc);
  enc->feedback(enc);
  enc->destroy(enc);
  flush(enc);
  rvid_destroy_buffer(&fb);
  }
  rvid_destroy_buffer(&enc->cpb);
  enc->ws->cs_destroy(enc->cs);
  FREE(enc->cpb_array);
@@ -275,21 +275,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,
if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
  reset_cpb(enc);
  else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
   pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
  sort_cpb(enc);

  if (!enc->stream_handle) {
  struct rvid_buffer fb;
  enc->stream_handle = rvid_alloc_stream_handle();
-rvid_create_buffer(enc->screen, &fb, 512, PIPE_USAGE_STAGING);
+rvid_create_feedback_buffer(enc->screen, &fb);
  enc->fb = &fb;
  enc->session(enc);
  enc->create(enc);
  enc->config(enc);
  enc->feedback(enc);
  flush(enc);
  //dump_feedback(enc, &fb);
  rvid_destroy_buffer(&fb);
  need_rate_control = false;
  }
@@ -304,21 +304,21 @@ static void rvce_begin_frame(struct
pipe_video_codec *encoder,
  static void rvce_encode_bitstream(struct pipe_video_codec *encoder,
struct pipe_video_buffer *source,
struct pipe_resource *destination,
void **fb)
  {
  struct rvce_encoder *enc = (struct rvce_encoder*)encoder;
  enc->get_buffer(destination, &enc->bs_handle, NULL);
  enc->bs_size = destination->width0;
*fb = enc->fb = CALLOC_STRUCT(rvid_buffer);
-if (!rvid_create_buffer(enc->screen, enc->fb, 512,
PIPE_USAGE_STAGING)) {
+if (!rvid_create_feedback_buffer(enc->screen, enc->fb)) {
  RVID_ERR("Can't create feedback buffer.\n");
  return;
  }
  if (!radeon_emitted(enc->cs, 0))
  enc->session(enc);
  enc->encode(enc);
  enc->feedback(enc);
  }
static void rvce_end_frame(struct pipe_video_codec *encoder,
diff --git a/src/gallium/drivers/radeon/radeon_video.c
b/src/gallium/drivers/radeon/radeon_video.c
index d7c5a16..f60ae05 100644
--- a/src/gallium/drivers/radeon/radeon_video.c
+++ b/src/gallium/drivers/radeon/radeon_video.c
@@ -65,20 +65,32 @@ bool rvid_create_buffer(struct pipe_screen
*screen, struct rvid_buffer *buffer,
  unsigned size, unsigned usage)
  {
  memset(buffer, 0, sizeof(*buffer));
  buffer->usage = usage;
  buffer->res = (struct r600_resource *)
  pipe_buffer_create(screen, PIPE_BIND_CUSTOM, usage, size);
return buffer->res != NULL;
  }
  +bool rvid_create_feedback_buffer(struct pipe_screen *screen,
+ struct rvid_buffer *buffer)
+{
+/* The kernel's CS checker asks for at least 4KB space.
+ *
+ * TODO If we update the kernel checker to be satisfied with less,
+ * we could save some memory here (since the sub-allocator could be
+ * used).

[Mesa-dev] mesa/llvmpipe select alpha visual

2016-09-29 Thread Thomas Søndergaard

My application window becomes transparent when I run my application with
llvmpipe by setting LD_LIBRARY_PATH instead of the proprietary NVIDIA
driver that I otherwise use. I've come to the understanding, after digging
into the code for a few hours, that mesa selects this Visual even though my
application hasn't asked for an alpha channel:

Visual ID: 23  depth=32  class=TrueColor, type=(none)
bufferSize=32 level=0 renderType=rgba doubleBuffer=1 stereo=0
rgba: redSize=8 greenSize=8 blueSize=8 alphaSize=8 float=N sRGB=N
auxBuffers=0 depthSize=24 stencilSize=8
accum: redSize=16 greenSize=16 blueSize=16 alphaSize=0
multiSample=0  multiSampleBuffers=0
visualCaveat=None

My application uses Qt-5 and if I modify it to not set

GLX_RED_SIZE=1
GLX_GREEN_SIZE=1
GLX_BLUE_SIZE=1

Then I get the following visual instead and my window is no longer blended
with the background based on the alpha values in the framebuffer.

Visual ID: 21  depth=24  class=TrueColor, type=(none)
bufferSize=24 level=0 renderType=rgba doubleBuffer=1 stereo=0
rgba: redSize=8 greenSize=8 blueSize=8 alphaSize=8 float=N sRGB=N
auxBuffers=0 depthSize=24 stencilSize=8
accum: redSize=16 greenSize=16 blueSize=16 alphaSize=16
multiSample=0  multiSampleBuffers=0
visualCaveat=None
Opaque.

I've dug a little in the mesa code and I can see that if any of
GLX_RED_SIZE, GLX_GREEN_SIZE or GLX_BLUE_SIZE is set, then
choose_x_visual() in mesa/src/gallium/state_trackers/glx/xlib/glx_api.c
starts the search for a visual from the deep end, whereas if none of them
is set it starts from the shallow end.

Why does it do that?

The consequence is that I get a 32-bit visual with an alpha channel,
even though I haven't asked for an alpha channel.

I've found out that I can set XLIB_SKIP_ARGB_VISUALS=1 as a workaround, but
it would be nice with a real fix. Is there anything I can do to help?

Would it make sense to simply always search from the shallow end? Or to
only search from 24-bit and down if GLX_ALPHA_SIZE is 0? I saw a comment in
src/gallium/state_trackers/glx/xlib/glx_api.c that mesa is limited to 8 bits
per channel, so there doesn't seem to be any point in looking for a visual
with higher depth if we don't want alpha.
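
To make the second idea concrete, here is a rough sketch of "search from 24-bit and down unless alpha was requested". The struct and function (sketch_visual, sketch_choose_visual) are hypothetical and much simpler than choose_x_visual():

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an X visual: just an ID and a depth. */
struct sketch_visual {
   int id;
   int depth;
};

/* Pick the deepest visual that does not exceed the depth cap.
 * The cap is 24 when no alpha was requested and 32 otherwise. */
static const struct sketch_visual *
sketch_choose_visual(const struct sketch_visual *visuals, size_t count,
                     int want_alpha)
{
   const int max_depth = want_alpha ? 32 : 24;
   const struct sketch_visual *best = NULL;
   size_t i;

   assert(visuals != NULL || count == 0);

   for (i = 0; i < count; ++i) {
      if (visuals[i].depth > max_depth)
         continue;
      if (!best || visuals[i].depth > best->depth)
         best = &visuals[i];
   }
   return best;
}
```

With the two visuals from my glxinfo output (depth 24 and depth 32), this would return the 24-bit one unless the application asks for alpha.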

That was a long mail - if you got this far, I thank you for your patience
:-)

Thomas

Re: [Mesa-dev] [PATCH] gallium/hud: Add power sensor support

2016-09-29 Thread Brian Paul



Reviewed-by: Brian Paul 

and pushed to master.  Thanks.

-Brian

On 09/29/2016 08:11 AM, Steven Toth wrote:

Implement support for power based sensors, reporting units in
milli-watts and watts.

Also, minor cleanup - change the related if block to a switch.

Tested with two different power sensors, including the nouveau
'power1' sensors on a GTX950 card.
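
A rough sketch of the milliwatt/watt reporting described above, kept separate from the real number_to_human_readable() (the function name and the exact formatting are assumptions made for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Format a power reading given in milliwatts, switching to watts
 * once the value reaches 1000 mW. */
static void
sketch_format_power(unsigned long milliwatts, char *out, size_t size)
{
   static const char *units[] = { " mW", " W" };
   double d = (double) milliwatts;
   unsigned unit = 0;

   assert(out != NULL && size > 0);

   if (d >= 1000.0) {
      d /= 1000.0;
      unit = 1;
   }
   snprintf(out, size, "%.2f%s", d, units[unit]);
}
```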

Signed-off-by: Steven Toth 
---
  src/gallium/auxiliary/hud/hud_context.c  | 10 
  src/gallium/auxiliary/hud/hud_private.h  |  1 +
  src/gallium/auxiliary/hud/hud_sensors_temp.c | 38 
  src/gallium/include/pipe/p_defines.h |  1 +
  4 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index a82cdf2..3445488 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -261,6 +261,7 @@ number_to_human_readable(uint64_t num, uint64_t max_value,
 static const char *temperature_units[] = {" C"};
 static const char *volt_units[] = {" mV", " V"};
 static const char *amp_units[] = {" mA", " A"};
+   static const char *watt_units[] = {" mW", " W"};

 const char **units;
 unsigned max_unit;
@@ -301,6 +302,10 @@ number_to_human_readable(uint64_t num, uint64_t max_value,
max_unit = ARRAY_SIZE(hz_units)-1;
units = hz_units;
break;
+   case PIPE_DRIVER_QUERY_TYPE_WATTS:
+  max_unit = ARRAY_SIZE(watt_units)-1;
+  units = watt_units;
+  break;
 default:
if (max_value == 100) {
   max_unit = ARRAY_SIZE(percent_units)-1;
@@ -1067,6 +1072,11 @@ hud_parse_env_var(struct hud_context *hud, const char 
*env)
  SENSORS_CURRENT_CURRENT);
   pane->type = PIPE_DRIVER_QUERY_TYPE_AMPS;
}
+  else if (sscanf(name, "sensors_pow_cu-%s", arg_name) == 1) {
+ hud_sensors_temp_graph_install(pane, arg_name,
+SENSORS_POWER_CURRENT);
+ pane->type = PIPE_DRIVER_QUERY_TYPE_WATTS;
+  }
  #endif
else if (strcmp(name, "samples-passed") == 0 &&
 has_occlusion_query(hud->pipe->screen)) {
diff --git a/src/gallium/auxiliary/hud/hud_private.h 
b/src/gallium/auxiliary/hud/hud_private.h
index c825512..51049af 100644
--- a/src/gallium/auxiliary/hud/hud_private.h
+++ b/src/gallium/auxiliary/hud/hud_private.h
@@ -124,6 +124,7 @@ int hud_get_num_sensors(bool displayhelp);
  #define SENSORS_TEMP_CRITICAL2
  #define SENSORS_VOLTAGE_CURRENT  3
  #define SENSORS_CURRENT_CURRENT  4
+#define SENSORS_POWER_CURRENT5
  void hud_sensors_temp_graph_install(struct hud_pane *pane, const char 
*dev_name,
  unsigned int mode);
  #endif
diff --git a/src/gallium/auxiliary/hud/hud_sensors_temp.c 
b/src/gallium/auxiliary/hud/hud_sensors_temp.c
index bceffc4..7d1398a 100644
--- a/src/gallium/auxiliary/hud/hud_sensors_temp.c
+++ b/src/gallium/auxiliary/hud/hud_sensors_temp.c
@@ -119,6 +119,15 @@ get_sensor_values(struct sensors_temp_info *sti)
if (sf)
   sti->critical = get_value(sti->chip, sf);
break;
+   case SENSORS_POWER_CURRENT:
+  sf = sensors_get_subfeature(sti->chip, sti->feature,
+  SENSORS_SUBFEATURE_POWER_INPUT);
+  if (sf) {
+ /* Sensors API returns in WATTs, even though driver is reporting mW,
+  * convert back to mW */
+ sti->current = get_value(sti->chip, sf) * 1000;
+  }
+  break;
 }

 sf = sensors_get_subfeature(sti->chip, sti->feature,
@@ -173,6 +182,9 @@ query_sti_load(struct hud_graph *gr)
   case SENSORS_CURRENT_CURRENT:
  hud_graph_add_value(gr, (uint64_t) sti->current);
  break;
+ case SENSORS_POWER_CURRENT:
+hud_graph_add_value(gr, (uint64_t) sti->current);
+break;
   }

   sti->last_time = now;
@@ -217,6 +229,7 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
mode == SENSORS_VOLTAGE_CURRENT ? "VOLTS" :
mode == SENSORS_CURRENT_CURRENT ? "AMPS" :
mode == SENSORS_TEMP_CURRENT ? "CU" :
+  mode == SENSORS_POWER_CURRENT ? "POWER" :
mode == SENSORS_TEMP_CRITICAL ? "CR" : "UNDEFINED");
  #endif

@@ -234,6 +247,7 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
 sti->mode == SENSORS_VOLTAGE_CURRENT ? "Volts" :
 sti->mode == SENSORS_CURRENT_CURRENT ? "Amps" :
 sti->mode == SENSORS_TEMP_CURRENT ? "Curr" :
+   sti->mode == SENSORS_POWER_CURRENT ? "Pow" :
 sti->mode == SENSORS_TEMP_CRITICAL ? "Crit" : "Unkn");

 gr->query_data = sti;
@@ -256,6 +270,9 @@ hud_sensors_temp_graph_install(struct hud_pane *pane, const 
char *dev_name,
 case SENSORS_CURRENT_CURRENT:
hud_pane_set_max_value(pane, 5000);
break

[Mesa-dev] Mesa 13.0.0 release plan (Was Re: Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?))

2016-09-29 Thread Timothy Arceri

On Thu, 2016-09-29 at 15:56 +0100, Emil Velikov wrote:
> On 28 September 2016 at 19:53, Marek Olšák  wrote:
> > 
> > Hi,
> > 
> > It's been almost 4 months since the 12.0 branch was created, and
> > soon
> > it will have been 3 months since Mesa 12.0 was released.
> > 
> > Is there any reason we haven't created the stable branch yet?
> > 
> > Ideally, we would time the release so that it's 1-2 months before
> > fall
> > distribution releases.
> > 
> 
> Thanks Marek !
> 
> In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
> 12.1. 

I believe the release should be 13? Core Mesa and the Intel driver
have reached OpenGL 4.4 this release, and core Mesa is now at 4.5 despite
it not being enabled anywhere.

> With the topic of which would be 'the default' Vulkan driver for
> ATI/AMD hardware to be considered at a later stage.
> 
> That said here are the tentative dates:
> 
> Oct 7/14 2016 - Feature freeze/Release candidate 1
> Oct 14/21 2016 - Release candidate 2
> Oct 21/28 2016 - Release candidate 3/final release
> 
> Fwiw I'm still in favour of getting RADV in even if it's not
> perfect/feature complete. Devs, let me know if there's a "must have"
> feature that we want in 12.1.
> 
> Thanks
> Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: optimize copy_propagation_elements pass

2016-09-29 Thread Ian Romanick

On 09/29/2016 12:17 AM, Tapani Pälli wrote:
> 
> On 09/28/2016 06:14 PM, Ian Romanick wrote:
>> On 09/16/2016 06:21 PM, Tapani Pälli wrote:
>>> Changes make copy_propagation_elements pass faster, reducing link
>>> time spent in test case of bug 94477. Does not fix the actual issue
>>> but brings down the total time. No regressions seen in CI.
>>
>> How does this affect the time of a full shader-db run?
> 
> Almost none at all, this is the open-source shaders (100 runs):
> 
> Difference at 95.0% confidence
> 0.0312 +/- 0.00502746
> 1.72566% +/- 0.278068%
> (Student's t, pooled s = 0.0181375)
> 
> (testing with DOTA-2 shaders gave very similar result)
> 
> My assumption is that this really helps only the most pathological cases
> like in the bug where list size becomes enormous (thousands of entries).
> With just few entries, list is 'fast enough' to walk through anyway (?)
> 
> BTW Eric was proposing to just remove this pass. However when testing
> what happens on removal I noticed there's functional failures
> (arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform starts to
> fail), so it seems we are currently dependent on this pass.
> 
>> There are a bunch of bits of this that are confusing to me.  I think
>> some high-level explanation about which hash tables the acp_ref can be
>> in, which lists it can be in, and how they relate would help.  I've
>> pointed out a couple of the confusing bits below.
>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>
>>> For performance measurements, Martina reported in the bug 8x speedup
>>> to the test case shader link time when using this patch together with
>>> commit 2cd02e30d2e1677762d34f1831b8e609970ef0f3
>>>
>>>  .../glsl/opt_copy_propagation_elements.cpp | 187
>>> -
>>>  1 file changed, 145 insertions(+), 42 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp
>>> b/src/compiler/glsl/opt_copy_propagation_elements.cpp
>>> index e4237cc..1c5060a 100644
>>> --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp
>>> +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp
>>> @@ -46,6 +46,7 @@
>>>  #include "ir_basic_block.h"
>>>  #include "ir_optimization.h"
>>>  #include "compiler/glsl_types.h"
>>> +#include "util/hash_table.h"
>>>
>>>  static bool debug = false;
>>>
>>> @@ -76,6 +77,18 @@ public:
>>> int swizzle[4];
>>>  };
>>>
>>> +/* Class that refers to acp_entry in another exec_list. Used
>>> + * when making removals based on rhs.
>>> + */
>>> +class acp_ref : public exec_node
>>
>> This pattern is called a box, so maybe acp_box would be a better name.
>> I'm not too hung up on it.
>>
>> With this change, can the acp_entry itself still be in a list?
> 
> The idea here is a class that only refers to a acp_entry but does not
> take any ownership .. so it's really just a list of pointers. I'm OK
> with renaming it.

If only a boxed acp_entry can be in a list, then acp_entry doesn't need
to derive from exec_node.  That's why I was asking.  I looked at the
rest of the code again, and I now see that the acp_entry is in the lhs list.

Correct me if I'm wrong, but an entry will effectively be in two lists
at all times: the lhs list and the rhs list.

Assuming that previous assumption is correct, I might suggest a
different structure that makes it all less confusing.  Don't make
acp_entry derive from exec_node.  Instead, embed two acp_ref (or whatever
it ends up being called) nodes in the acp_entry:

acp_ref lhs_node;
acp_ref rhs_node;

When adding an entry to the lhs list, use

lhs_list->push_tail(&entry->lhs_node);

Similar for rhs list:

rhs_list->push_tail(&entry->rhs_node);

Walk the lists like:

   foreach_in_list_safe(acp_ref, ref, rhs_list) {
  acp_entry *entry = ref->entry;

  ...
   }

I think this would be a lot more clear because both lists are handled in
the same way.  It also avoids the overhead of allocating the boxes.
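
A self-contained sketch of that layout, written in plain C with a tiny
stand-in list instead of Mesa's exec_node/exec_list. All names here
(struct node, struct entry_ref, struct entry and the helpers) are
hypothetical and only illustrate the idea of embedding both list nodes in
the entry:

```c
#include <assert.h>
#include <stddef.h>

/* Tiny stand-in for an intrusive list node; illustrative only. */
struct node { struct node *prev, *next; };

struct entry;

/* Link embedded in an entry; 'owner' points back at the entry. */
struct entry_ref {
   struct node link;      /* must stay the first member */
   struct entry *owner;
};

struct entry {
   int value;                 /* placeholder payload */
   struct entry_ref lhs_node; /* membership in the lhs list */
   struct entry_ref rhs_node; /* membership in the rhs list */
};

static void list_init(struct node *head) { head->prev = head->next = head; }

static void list_push_tail(struct node *head, struct node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

static void list_remove(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = n;
}

static void entry_init(struct entry *e, int value)
{
   e->value = value;
   e->lhs_node.owner = e;
   e->rhs_node.owner = e;
}

/* Remove an entry from both lists; there is no separate box to free. */
static void entry_unlink(struct entry *e)
{
   assert(e != NULL);
   list_remove(&e->lhs_node.link);
   list_remove(&e->rhs_node.link);
}

/* Sum payloads by walking one list through the embedded refs. */
static int list_sum(const struct node *head)
{
   int sum = 0;
   for (const struct node *n = head->next; n != head; n = n->next)
      sum += ((const struct entry_ref *)n)->owner->value;
   return sum;
}
```

Both lists are walked the same way, and removing an entry touches only
memory that already belongs to the entry.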

>>> +{
>>> +public:
>>> +   acp_ref(acp_entry *e)
>>> +   {
>>> +  entry = e;
>>> +   }
>>> +   acp_entry *entry;
>>> +};
>>>
>>>  class kill_entry : public exec_node
>>>  {
>>> @@ -98,14 +111,42 @@ public:
>>>this->killed_all = false;
>>>this->mem_ctx = ralloc_context(NULL);
>>>this->shader_mem_ctx = NULL;
>>> -  this->acp = new(mem_ctx) exec_list;
>>>this->kills = new(mem_ctx) exec_list;
>>> +
>>> +  create_acp();
>>> }
>>> ~ir_copy_propagation_elements_visitor()
>>> {
>>>ralloc_free(mem_ctx);
>>> }
>>>
>>> +   void create_acp()
>>> +   {
>>> +  lhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
>>> +   _mesa_key_pointer_equal);
>>> +  rhs_ht = _mesa_hash_table_create(mem_ctx, _mesa_hash_pointer,
>>> +   _mesa_key_pointer_equal);
>>> +   }
>>> +
>>> +   void destroy_acp()
>>> +   {
>>> +  _mesa_hash_table_destroy(lhs_ht, NULL);
>>> +  _mesa_hash_table_destroy(rhs_ht,

Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-09-29 Thread Dave Airlie

On 30 September 2016 at 01:07, Jason Ekstrand  wrote:
> On Sep 29, 2016 7:56 AM, "Emil Velikov"  wrote:
>>
>> On 28 September 2016 at 19:53, Marek Olšák  wrote:
>> > Hi,
>> >
>> > It's been almost 4 months since the 12.0 branch was created, and soon
>> > it will have been 3 months since Mesa 12.0 was released.
>> >
>> > Is there any reason we haven't created the stable branch yet?
>> >
>> > Ideally, we would time the release so that it's 1-2 months before fall
>> > distribution releases.
>> >
>>
>> Thanks Marek !
>>
>> In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
>> 12.1. With the topic of which would be 'the default' Vulkan driver for
>> ATI/AMD hardware to be considered at a later stage.
>
> If they have even close to the amount of work we had to get it merged, I
> don't think that's at all realistic.  Then again, Dave is the one who wants
> to have a Vulkan driver for AMD hardware that he can package and ship so
> I'll let him decide how badly he wants it in this release.
>
>> That said here are the tentative dates:
>>
>> Oct 7/14 2016 - Feature freeze/Release candidate 1
>> Oct 14/21 2016 - Release candidate 2
>> Oct 21/28 2016 - Release candidate 3/final release
>>
>> Fwiw I'm still in favour of getting RADV in even if it's not
>> perfect/feature complete. Devs, let me know if there's a "must have"
>> feature that we want in 12.1.

The main problem I have with merging radv is the whole conformance testing
end of it.

It's probably fine if I just make a big printf on device creation that RADV
isn't a conformant Vulkan implementation yet. Sorta like what Intel does on the
older GPUs.

Otherwise I don't think merging it is a big job, it's 30,000 lines of standalone
code, I've already merged the prereq patches, and I think any code sharing
should happen in tree.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa 13.0.0 release plan (Was Re: Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?))

2016-09-29 Thread Jason Ekstrand

On Sep 29, 2016 5:14 PM, "Timothy Arceri" 
wrote:
>
> On Thu, 2016-09-29 at 15:56 +0100, Emil Velikov wrote:
> > On 28 September 2016 at 19:53, Marek Olšák  wrote:
> > >
> > > Hi,
> > >
> > > It's been almost 4 months since the 12.0 branch was created, and
> > > soon
> > > it will have been 3 months since Mesa 12.0 was released.
> > >
> > > Is there any reason we haven't created the stable branch yet?
> > >
> > > Ideally, we would time the release so that it's 1-2 months before
> > > fall
> > > distribution releases.
> > >
> >
> > Thanks Marek !
> >
> > In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
> > 12.1.
>
> I believe the release should be 13?? Core Mesa and the Intel driver
> have reached 4.4 this release also core Mesa is now at 4.5 despite not
> being enabled anywhere.

My personal preference, for whatever it's worth, would be to call it 12.1.
The 12.0 release was the biggest release we've had in a long time and it
seems odd to me to jump to 13.0 right away when we really haven't done much
at all in terms of new features. (I think it's only 2 or 3 desktop features
in the case of Intel.  A bit more on the ES side I guess).

> > With the topic of which would be 'the default' Vulkan driver for
> > ATI/AMD hardware to be considered at a later stage.
> >
> > That said here are the tentative dates:
> >
> > Oct 7/14 2016 - Feature freeze/Release candidate 1
> > Oct 14/21 2016 - Release candidate 2
> > Oct 21/28 2016 - Release candidate 3/final release
> >
> > Fwiw I'm still in favour of getting RADV in even if it's not
> > perfect/feature complete. Devs, let me know if there's a "must have"
> > feature that we want in 12.1.
> >
> > Thanks
> > Emil
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa 13.0.0 release plan (Was Re: Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?))

2016-09-29 Thread Timothy Arceri

On Thu, 2016-09-29 at 19:17 -0700, Jason Ekstrand wrote:
> On Sep 29, 2016 5:14 PM, "Timothy Arceri"  wrote:
> >
> > On Thu, 2016-09-29 at 15:56 +0100, Emil Velikov wrote:
> > > On 28 September 2016 at 19:53, Marek Olšák  wrote:
> > > >
> > > > Hi,
> > > >
> > > > It's been almost 4 months since the 12.0 branch was created, and
> > > > soon
> > > > it will have been 3 months since Mesa 12.0 was released.
> > > >
> > > > Is there any reason we haven't created the stable branch yet?
> > > >
> > > > Ideally, we would time the release so that it's 1-2 months before
> > > > fall
> > > > distribution releases.
> > > >
> > >
> > > Thanks Marek !
> > >
> > > In all honesty I was secretly hoping that we'll get Dave/Bas RADV for
> > > 12.1.
> >
> > I believe the release should be 13?? Core Mesa and the Intel driver
> > have reached 4.4 this release also core Mesa is now at 4.5 despite not
> > being enabled anywhere.
>
> My personal preference, for whatever it's worth, would be to call it
> 12.1.  The 12.0 release was the biggest release we've had in a long
> time and it seems odd to me to jump to 13.0 right away when we really
> haven't done much at all in terms of new features. (I think it's only
> 2 or 3 desktop features in the case of Intel.  A bit more on the ES
> side I guess).

My understanding is the major version has only ever been bumped when
full support for a new desktop OpenGL version has been reached,
regardless of the number of extensions enabled. We did the same thing
going from 8.0 to 9.0, whereas the 7 release went all the way to 7.11
over a 4 year period. It seems odd to change the way we bump versions
at this point in time, although in future maybe it will need to be
based on Vulkan versions also.

> > > With the topic of which would be 'the default' Vulkan driver for
> > > ATI/AMD hardware to be considered at a later stage.
> > >
> > > That said here are the tentative dates:
> > >
> > > Oct 7/14 2016 - Feature freeze/Release candidate 1
> > > Oct 14/21 2016 - Release candidate 2
> > > Oct 21/28 2016 - Release candidate 3/final release
> > >
> > > Fwiw I'm still in favour of getting RADV in even if it's not
> > > perfect/feature complete. Devs, let me know if there's a "must have"
> > > feature that we want in 12.1.
> > >
> > > Thanks
> > > Emil
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 97957] Awful screen tearing in a separate X server with DRI3

2016-09-29 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97957

Michel Dänzer  changed:

   What|Removed |Added

 CC|mic...@daenzer.net  |

--- Comment #8 from Michel Dänzer  ---
(In reply to Dieter Nützel from comment #7)
> E.g. to Michel?
> Maybe it solve this
> https://bugs.freedesktop.org/show_bug.cgi?id=97260#c56
> one , too?

I'll take a look at Chris' patch once it's submitted for review with a proper
rationale for all the changes. Meanwhile, for bug 97260 it would be more useful
if somebody tested the patch I attached there.

Removing myself from Cc because I get notified of updates to this report via
the mesa-dev mailing list.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 12/88] i965: add initial implementation of on disk shader cache

2016-09-29 Thread Timothy Arceri

On Sun, 2016-09-25 at 20:38 -0700, Kenneth Graunke wrote:
> On Monday, September 26, 2016 1:28:35 PM PDT Timothy Arceri wrote:
> > 
> > On Sun, 2016-09-25 at 19:43 -0700, Kenneth Graunke wrote:
> > > 
> > > On Saturday, September 24, 2016 3:24:53 PM PDT Timothy Arceri
> > > wrote:
> > > > 
> > > > 
> > > > This uses the recently-added cache.c to write out the final
> > > > linked
> > > > binary for vertex and fragment shader programs.
> > > > 
> > > > This is based off the initial implementation done by Carl.
> > > > ---
> > > >  src/mesa/drivers/dri/i965/Makefile.sources   |   1 +
> > > >  src/mesa/drivers/dri/i965/brw_shader_cache.c | 390
> > > > +++
> > > >  src/mesa/drivers/dri/i965/brw_state.h|   7 +
> > > >  3 files changed, 398 insertions(+)
> > > >  create mode 100644
> > > > src/mesa/drivers/dri/i965/brw_shader_cache.c
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/Makefile.sources
> > > > b/src/mesa/drivers/dri/i965/Makefile.sources
> > > > index df90cb4..bd2bd37 100644
> > > > --- a/src/mesa/drivers/dri/i965/Makefile.sources
> > > > +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> > > > @@ -147,6 +147,7 @@ i965_FILES = \
> > > >     brw_sf_emit.c \
> > > >     brw_sf.h \
> > > >     brw_sf_state.c \
> > > > +   brw_shader_cache.cpp \
> > > >     brw_state_batch.c \
> > > >     brw_state_cache.c \
> > > >     brw_state_dump.c \
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_shader_cache.c
> > > > b/src/mesa/drivers/dri/i965/brw_shader_cache.c
> > > > new file mode 100644
> > > > index 000..aba45b6
> > > > --- /dev/null
> > > > +++ b/src/mesa/drivers/dri/i965/brw_shader_cache.c
> > > > @@ -0,0 +1,390 @@
> > > > +/*
> > > > + * Copyright © 2014 Intel Corporation
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person
> > > > obtaining a
> > > > + * copy of this software and associated documentation files
> > > > (the
> > > > "Software"),
> > > > + * to deal in the Software without restriction, including
> > > > without
> > > > limitation
> > > > + * the rights to use, copy, modify, merge, publish,
> > > > distribute,
> > > > sublicense,
> > > > + * and/or sell copies of the Software, and to permit persons
> > > > to
> > > > whom the
> > > > + * Software is furnished to do so, subject to the following
> > > > conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice
> > > > (including the next
> > > > + * paragraph) shall be included in all copies or substantial
> > > > portions of the
> > > > + * Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> > > > KIND,
> > > > EXPRESS OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > > > MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN
> > > > NO
> > > > EVENT SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
> > > > DAMAGES OR OTHER
> > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > > > OTHERWISE,
> > > > ARISING
> > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
> > > > OR
> > > > OTHER DEALINGS
> > > > + * IN THE SOFTWARE.
> > > > + */
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include "brw_state.h"
> > > > +#include "brw_wm.h"
> > > > +#include "brw_vs.h"
> > > > +#include "brw_context.h"
> > > > +
> > > > +static void
> > > > +gen_vs_sha1(struct brw_context *brw, struct gl_shader_program
> > > > *prog,
> > > > +struct brw_vs_prog_key *vs_key, unsigned char
> > > > *vs_sha1)
> > > > +{
> > > > +   char sha1_buf[41];
> > > > +   unsigned char sha1[20];
> > > > +   char manifest[256];
> > > > +   int offset = 0;
> > > > +
> > > > +   offset += snprintf(manifest, sizeof(manifest), "program:
> > > > %s\n",
> > > > +  _mesa_sha1_format(sha1_buf, prog-
> > > > >sha1));
> > > > +
> > > > +   _mesa_sha1_compute(vs_key, sizeof *vs_key, sha1);
> > > > +   offset += snprintf(manifest + offset, sizeof(manifest) -
> > > > offset,
> > > > +  "vs_key: %s\n",
> > > > _mesa_sha1_format(sha1_buf,
> > > > sha1));
> > > > +
> > > > +   _mesa_sha1_compute(manifest, strlen(manifest), vs_sha1);
> > > > +}
> > > 
> > > The VS/TCS/TES/GS code is basically identical...you could avoid a
> > > lot
> > > of
> > > duplication by doing...
> > > 
> > > static void
> > > gen_shader_sha1(struct brw_context *brw, struct gl_shader_program
> > > *prog,
> > > unsigned stage, void *key, unsigned char
> > > *out_sha1)
> > > {
> > >    char sha1_buf[41];
> > >    unsigned char sha1[20];
> > >    char manifest[256];
> > >    int offset = 0;
> > > 
> > >    format_program_sha1(prog, manifest, sizeof(manifest),
> > > &offset);
> > > 
> > >    _mesa_sha1_compute(key, key_size(stage), sha1);
> > >    offset += sn

Re: [Mesa-dev] more unsigned -> enum pipe_shader_type

2016-09-29 Thread Edward O'Callaghan

I'll have a look this weekend Brian, thanks.

On 09/30/2016 05:58 AM, Brian Paul wrote:
> 
> If anyone is looking for something simple to do, this is just a reminder
> that there are more places in gallium where we can replace unsigned with
> enum pipe_shader_type (a bunch was done in August).
> 
> Specifically,
> pipe_screen::get_shader_param()
> pipe_screen::get_compiler_options()
> pipe_context::set_constant_buffer()
> 
> -Brian
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
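
For anyone picking this up, the shape of the change is roughly the
following. This is only a sketch: enum shader_type is a stand-in for
gallium's enum pipe_shader_type, and the function names and the limits
they return are invented for the example.

```c
#include <assert.h>

/* Stand-in for gallium's enum pipe_shader_type; names and values here
 * are illustrative only. */
enum shader_type {
   SHADER_VERTEX,
   SHADER_FRAGMENT,
   SHADER_COMPUTE,
   SHADER_TYPES
};

/* Before: 'unsigned shader' accepts any integer without complaint. */
static int max_const_buffers_old(unsigned shader)
{
   return shader == 2 ? 1 : 16; /* magic number: which stage is 2? */
}

/* After: the parameter type documents the valid values, and compilers
 * can warn about enumerators missing from the switch. */
static int max_const_buffers(enum shader_type shader)
{
   switch (shader) {
   case SHADER_VERTEX:
   case SHADER_FRAGMENT:
      return 16;
   case SHADER_COMPUTE:
      return 1;
   case SHADER_TYPES:
      break;
   }
   assert(!"invalid shader type");
   return 0;
}
```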



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-29 Thread Michel Dänzer

On 30/09/16 03:55 AM, Marek Olšák wrote:
> On Thu, Sep 29, 2016 at 8:37 PM, Rob Clark  wrote:
>> On Tue, Sep 27, 2016 at 10:56 PM, Michel Dänzer  wrote:
>>> On 28/09/16 12:33 AM, Rob Clark wrote:
>>>> Signed-off-by: Rob Clark
>>>> ---
>>>> I had a scan through the rest of pipe_resource allocations, and I think
>>>> this is the only remaining one (besides r600_alloc_buffer_struct())
>>>> which was using MALLOC_STRUCT()..  sorry 'bout that
>>>
>>> Note that the MALLOC_STRUCT here isn't relevant:
>>>
>>>
>>>> diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c
>>>> b/src/gallium/drivers/r300/r300_screen_buffer.c
>>>> index 4747058..24dd92f 100644
>>>> --- a/src/gallium/drivers/r300/r300_screen_buffer.c
>>>> +++ b/src/gallium/drivers/r300/r300_screen_buffer.c
>>>> @@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct
>>>> pipe_screen *screen,
>>>>  rbuf = MALLOC_STRUCT(r300_resource);
>>>>
>>>>  rbuf->b.b = *templ;
>>>
>>> The pipe_resource::next field is copied in from the template here, so
>>> the question is really whether the next field of the template is
>>> initialized to NULL by all callers.
>>
>> bleh.. right, ok, I guess I need to track down which callers aren't
>> zero-initializing the templ.
> 
> or do: next = NULL; in all drivers?

Yeah, I suspect that'll be easier.
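
A minimal sketch of that approach, using simplified stand-ins for
pipe_resource and the driver resource struct (the type and function names
below are hypothetical, not the r300 code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for pipe_resource and a driver resource. */
struct resource {
   unsigned width;
   struct resource *next;
};

struct driver_resource {
   struct resource base;
   int driver_private;
};

/* The struct assignment copies every field of *templ, including a
 * possibly stale 'next', so 'next' is cleared right after the copy
 * instead of relying on every caller to zero the template. */
static struct driver_resource *
driver_resource_create(const struct resource *templ)
{
   struct driver_resource *res;

   assert(templ != NULL);
   res = malloc(sizeof(*res));
   if (!res)
      return NULL;

   res->base = *templ;
   res->base.next = NULL;
   res->driver_private = 0;
   return res;
}
```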


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-29 Thread Tapani Pälli




On 09/29/2016 09:55 PM, Marek Olšák wrote:
> On Thu, Sep 29, 2016 at 6:23 PM, Emil Velikov  wrote:
>> On 27 September 2016 at 13:47, Marek Olšák  wrote:
>>> On Tue, Sep 27, 2016 at 2:34 PM, Emil Velikov  wrote:
>>>> On 26 September 2016 at 08:41, Tapani Pälli  wrote:
>>>>> This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
>>>>> and same crash in many dEQP EGL tests.
>>>>>
>>>>> I also found that some Qt example did a workaround because of this
>>>>> crash: https://bugreports.qt.io/browse/QTBUG-47509
>>>>>
>>>>> Signed-off-by: Tapani Pälli
>>>>> ---
>>>>>
>>>>> This is RFC as I'm not sure if we are supposed to support this. I tried
>>>>> to verify overall pbuffer situation with some mesa-demos using pbuffer
>>>>> but those are not working for me at all with or without my patch.
>>>>>
>>>>>  src/egl/main/eglconfig.c | 5 +
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
>>>>> index 6161d26..20cf9d4 100644
>>>>> --- a/src/egl/main/eglconfig.c
>>>>> +++ b/src/egl/main/eglconfig.c
>>>>> @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean
>>>>> for_matching)
>>>>>    return EGL_FALSE;
>>>>> }
>>>>>
>>>>> +   /* pbuffer with MSAA not supported */
>>>>
>>>> Fwiw on my system piglit also crashes + the demos don't render
>>>> anything. So I'm leaning that we want this as-is (for the time being)
>>>> + cc stable ?
>>>>
>>>> Can you apply a minor polish to the comment - "XXX/TODO: pbuffer +
>>>> MSAA does not work + QT bugreport" or alike.
>>>
>>> Please don't add "XXX/TODO". pbuffers were spec'd in 1997 and were
>>> meant to be used on GL 1.x hardware that didn't support MSAA
>>> texturing, thus MSAA pbuffers don't make any sense. Just keep the
>>> current comment.
>>
>> Can we use your reply instead - it's wise to have the not as often
>> visited parts nicely documented ?
>
> I don't think my comment is useful if the pbuffer is not expected to
> be bound as a texture. Your TODO comment is better. Sorry for the
> noise.

Thanks, will send a new patch in a while with the TODO comment included.

// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] egl: stop claiming support for pbuffer + msaa

2016-09-29 Thread Tapani Pälli

This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
and same crash in many dEQP EGL tests.

I also found that some Qt example did a workaround because of this
crash: https://bugreports.qt.io/browse/QTBUG-47509

v2: Ian pointed out that v1 removed support for all multisample
configs, including window ones. This one removes pbuffer bit
when adding configs, now only pbuffer+msaa gets rejected and
window+msaa continues to work. Fixed also comment (Emil)

Signed-off-by: Tapani Pälli 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/egl/drivers/dri2/egl_dri2.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 8e376e3..803627d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -320,6 +320,15 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
   surface_type &= ~EGL_PIXMAP_BIT;
}
 
+   /* No support for pbuffer + MSAA for now.
+*
+* XXX TODO: pbuffer + MSAA does not work and causes crashes.
+* See QT bugreport: https://bugreports.qt.io/browse/QTBUG-47509
+*/
+   if (base.Samples) {
+  surface_type &= ~EGL_PBUFFER_BIT;
+   }
+
conf->base.SurfaceType |= surface_type;
 
return conf;
-- 
2.7.4
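
The difference between v1 and v2 comes down to masking a single bit
instead of rejecting the whole config. A standalone sketch of that
masking, using local stand-in constants so it builds without the EGL
headers (filter_surface_type is a hypothetical helper, not a function in
egl_dri2.c):

```c
#include <assert.h>

/* Local stand-ins so the sketch builds without EGL headers. */
#define SURFACE_PBUFFER_BIT 0x0001
#define SURFACE_PIXMAP_BIT  0x0002
#define SURFACE_WINDOW_BIT  0x0004

/* Drop only the pbuffer bit from multisampled configs; window and
 * pixmap support are left untouched. */
static unsigned
filter_surface_type(unsigned surface_type, int samples)
{
   assert(samples >= 0);
   if (samples)
      surface_type &= ~(unsigned)SURFACE_PBUFFER_BIT;
   return surface_type;
}
```

A multisampled config that started out with window, pixmap and pbuffer
bits keeps the first two, and a non-multisampled config is returned
unchanged.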

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Mesa (master): gallium/radeon: zero all query buffers

2016-09-29 Thread Michel Dänzer

On 29/09/16 06:24 PM, Nicolai Hähnle wrote:
> Module: Mesa
> Branch: master
> Commit: 2c9d546402a4e3fb55bc3a01a5843dfca82b4a6a
> URL:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=2c9d546402a4e3fb55bc3a01a5843dfca82b4a6a
> 
> Author: Nicolai Hähnle 
> Date:   Thu Sep 15 15:58:36 2016 +0200
> 
> gallium/radeon: zero all query buffers

This change caused the piglit test spec@amd_performance_monitor@measure
to crash for me, see backtrace below. Looks like it may not be the fault
of this change per se but that the r600_perfcounter code needs to be
updated to support the prepare_buffer op?

Thread 1 "amd_performance" received signal SIGSEGV, Segmentation fault.
0x in ?? ()
(gdb) bt
#0  0x in ?? ()
#1  0x70eba13c in r600_new_query_buffer (query=0x86ee00, 
query@entry=0x0, ctx=0x62e760, ctx@entry=0x0) at 
../../../../../src/gallium/drivers/radeon/r600_query.c:341
#2  r600_query_hw_init (rctx=rctx@entry=0x62e760, query=query@entry=0x86ee00) 
at ../../../../../src/gallium/drivers/radeon/r600_query.c:415
#3  0x70eb440c in r600_create_batch_query (ctx=0x62e760, 
num_queries=, query_types=) at 
../../../../../src/gallium/drivers/radeon/r600_perfcounter.c:411
#4  0x708968cb in init_perf_monitor (m=0x86cf40, ctx=0x798fb0) at 
../../../src/mesa/state_tracker/st_cb_perfmon.c:113
#5  st_BeginPerfMonitor (ctx=0x798fb0, m=0x86cf40) at 
../../../src/mesa/state_tracker/st_cb_perfmon.c:180
#6  0x70788078 in _mesa_BeginPerfMonitorAMD (monitor=) 
at ../../../src/mesa/main/performance_monitor.c:547
#7  0x77a984b0 in stub_glBeginPerfMonitorAMD (monitor=1) at 
tests/util/piglit-dispatch-gen.c:799
#8  0x004012b8 in test_basic_measurement (group=0) at 
tests/spec/amd_performance_monitor/measure.c:144
#9  0x00401de2 in piglit_init (argc=1, argv=0x7fffe748) at 
tests/spec/amd_performance_monitor/measure.c:386
#10 0x77b39b18 in run_test (gl_fw=0x615c20, argc=1, 
argv=0x7fffe748) at 
tests/util/piglit-framework-gl/piglit_winsys_framework.c:73
#11 0x77b1e667 in piglit_gl_test_run (argc=1, argv=0x7fffe748, 
config=0x7fffe610) at tests/util/piglit-framework-gl.c:199
#12 0x00400ff0 in main (argc=1, argv=0x7fffe748) at 
tests/spec/amd_performance_monitor/measure.c:39
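
Frame #0 has no symbol and is reached directly from r600_new_query_buffer, which is what a call through an unset function pointer typically looks like. A minimal sketch of guarding an optional per-query hook follows. The names query_ops, hw_query, new_query_buffer and mark_first_byte are hypothetical stand-ins and not the actual r600_query.c definitions. The real fix may instead be to give the perfcounter code its own prepare_buffer implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's query structures. */
struct hw_query;

struct query_ops {
   /* Optional hook: may be NULL if the query type needs no extra setup. */
   bool (*prepare_buffer)(struct hw_query *query, void *map, size_t size);
};

struct hw_query {
   const struct query_ops *ops;
   size_t result_size;
};

/* Zero the buffer, then run the optional hook only if one is installed. */
static bool
new_query_buffer(struct hw_query *query, void *map, size_t size)
{
   assert(query != NULL && query->ops != NULL && map != NULL);

   memset(map, 0, size);

   if (query->ops->prepare_buffer &&
       !query->ops->prepare_buffer(query, map, size))
      return false;

   return true;
}

/* Example hook that writes a marker byte at the start of the buffer. */
static bool
mark_first_byte(struct hw_query *query, void *map, size_t size)
{
   (void) query;
   if (size == 0)
      return false;
   ((unsigned char *) map)[0] = 0xff;
   return true;
}
```

With this shape, a query type that leaves prepare_buffer unset still gets a zeroed buffer instead of a jump through an invalid pointer.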



-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i915/i965: remove commented out warning

2016-09-29 Thread Timothy Arceri

The warning was also in the wrong location; it should have been
in the else branch.
---
 src/mesa/drivers/dri/i915/intel_fbo.h | 4 +---
 src/mesa/drivers/dri/i965/intel_fbo.h | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_fbo.h 
b/src/mesa/drivers/dri/i915/intel_fbo.h
index 114dd68..769dab8 100644
--- a/src/mesa/drivers/dri/i915/intel_fbo.h
+++ b/src/mesa/drivers/dri/i915/intel_fbo.h
@@ -89,10 +89,8 @@ static inline struct intel_renderbuffer *
 intel_renderbuffer(struct gl_renderbuffer *rb)
 {
    struct intel_renderbuffer *irb = (struct intel_renderbuffer *) rb;
-   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS) {
-      /*_mesa_warning(NULL, "Returning non-intel Rb\n");*/
+   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS)
       return irb;
-   }
    else
       return NULL;
 }
diff --git a/src/mesa/drivers/dri/i965/intel_fbo.h 
b/src/mesa/drivers/dri/i965/intel_fbo.h
index 89894cd..e6f6156 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.h
+++ b/src/mesa/drivers/dri/i965/intel_fbo.h
@@ -130,10 +130,8 @@ static inline struct intel_renderbuffer *
 intel_renderbuffer(struct gl_renderbuffer *rb)
 {
    struct intel_renderbuffer *irb = (struct intel_renderbuffer *) rb;
-   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS) {
-      /*_mesa_warning(NULL, "Returning non-intel Rb\n");*/
+   if (irb && irb->Base.Base.ClassID == INTEL_RB_CLASS)
       return irb;
-   }
    else
       return NULL;
 }
-- 
2.7.4


Re: [Mesa-dev] [PATCH 1/3] i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

2016-09-29 Thread Tapani Pälli

I have only 6 failures in that set currently, but this patch fixes all
of them. The reason seems to be that in these cases we never end up calling
intel_miptree_get_tile_offsets and therefore use uninitialized values
for tile_x and tile_y.


Tested-by: Tapani Pälli 

On 09/30/2016 08:56 AM, Xu,Randy wrote:

Add the miptree level/slice x/y_offset when computing the surface offset
in brw_emit_surface_state. The surface offset has two parts: one is
from mt->offset, which should be 32-aligned in width/height for tiled
buffers; the other is from
mt->level[current_level].slice[current_slice].x/y_offset.

This fix resolves 12 dEQP failures:
dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture

Signed-off-by: Xu,Randy 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 61a4b94..3a5c573 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -85,7 +85,8 @@ brw_emit_surface_state(struct brw_context *brw,
unsigned read_domains, unsigned write_domains)
 {
    const struct surface_state_info ss_info = surface_state_infos[brw->gen];
-   uint32_t tile_x = 0, tile_y = 0;
+   uint32_t tile_x = mt->level[0].slice[0].x_offset;
+   uint32_t tile_y = mt->level[0].slice[0].y_offset;
    uint32_t offset = mt->offset;
 
    struct isl_surf surf;
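
The two-part offset described in the commit message can be sketched as follows. This is a simplified illustration, not the i965 code: struct tile_layout and split_slice_offset are hypothetical names, and the sketch assumes power-of-two tile dimensions with each tile stored contiguously in memory. The slice's pixel offset is split into a tile-aligned byte offset, which is folded into the surface base address, and an intra-tile remainder, which plays the role of tile_x and tile_y.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a tiled layout; not a Mesa type. */
struct tile_layout {
   uint32_t tile_w;   /* tile width in pixels (power of two) */
   uint32_t tile_h;   /* tile height in rows (power of two) */
   uint32_t cpp;      /* bytes per pixel */
   uint32_t pitch;    /* row pitch in bytes, a multiple of tile_w * cpp */
};

/*
 * Split a slice's (x, y) pixel offset into a tile-aligned byte offset,
 * which is returned, and an intra-tile remainder stored in
 * *tile_x / *tile_y.
 */
static uint32_t
split_slice_offset(const struct tile_layout *l, uint32_t base_offset,
                   uint32_t x, uint32_t y,
                   uint32_t *tile_x, uint32_t *tile_y)
{
   assert(l->tile_w && l->tile_h && l->cpp);

   const uint32_t tile_bytes = l->tile_w * l->cpp * l->tile_h;

   /* Remainder inside the tile: the low bits of the pixel offset. */
   *tile_x = x & (l->tile_w - 1);
   *tile_y = y & (l->tile_h - 1);

   /* Whole rows of tiles above, plus whole tiles to the left. */
   return base_offset +
          (y / l->tile_h) * (l->pitch * l->tile_h) +
          (x / l->tile_w) * tile_bytes;
}
```

If the remainder is left at zero while the slice actually starts partway into a tile, the surface state points at the start of the tile rather than at the slice, which would match the wrong-face results in the failing tests.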


