Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc

2017-06-10 Thread Jose Fonseca
I know this is an old thread.  I completely missed it the first time,
but recently rediscovered it after reading
http://www.phoronix.com/scan.php?page=news_item=Vulkan-CPU-Repository
and perhaps it's not too late for a couple of comments, FWIW.


On 13/02/17 02:17, Jacob Lifshay wrote:

forgot to add mesa-dev when I sent.
-- Forwarded message --
From: "Jacob Lifshay"
Date: Feb 12, 2017 6:16 PM
Subject: Re: [Mesa-dev] software implementation of vulkan for gsoc/evoc
To: "Dave Airlie"
Cc:

On Feb 12, 2017 5:34 PM, "Dave Airlie" wrote:


 > I'm assuming that control barriers in Vulkan are identical to barriers
 > across a work-group in opencl. I was going to have a work-group be a single
 > OS thread, with the different work-items mapped to SIMD lanes. If we need to
 > have additional scheduling, I have written a javascript compiler that
 > supports generator functions, so I mostly know how to write a llvm pass to
 > implement that. I was planning on writing the shader compiler using llvm,
 > using the whole-function-vectorization pass I will write, and using the
 > pre-existing spir-v to llvm translation layer. I would also write some llvm
 > passes to translate from texture reads and stuff to basic vector ops.

Well, the problem is that the number of work-groups that get launched could
be quite high, and this can cause a large overhead in the number of host
threads that have to be launched. There was some discussion on this in the
mesa-dev archives back when I added softpipe compute shaders.


I would start a thread for each cpu, then have each thread run the 
compute shader a number of times instead of having a thread per shader 
invocation.


At least for llvmpipe, last time I looked into this, using OS green 
threads seemed a simple non-intrusive method of dealing with this --

https://lists.freedesktop.org/archives/mesa-dev/2016-April/114790.html
-- but it sounds like LLVM coroutines can handle this more effectively.
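
The thread-per-CPU scheme proposed above can be sketched roughly as follows. This is only a minimal illustration, assuming a fixed pool of worker threads that pull work-group indices from a shared atomic counter. The names workgroup_fn, dispatch_compute and double_group_id are hypothetical and do not come from Mesa, and the sketch does not address barriers inside a work-group, which is where the SIMD-lane mapping or coroutines discussed above would come in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_WORKERS 64

/* Hypothetical stand-in for a compiled compute shader: runs one work-group. */
typedef void (*workgroup_fn)(uint32_t group_id, void *user_data);

struct dispatch {
    workgroup_fn run;
    void *user_data;
    uint32_t num_groups;
    atomic_uint next_group; /* index of the next unclaimed work-group */
};

/* Each worker claims work-group indices until none are left. */
static void *worker_main(void *arg)
{
    struct dispatch *d = arg;

    for (;;) {
        uint32_t group = atomic_fetch_add(&d->next_group, 1u);
        if (group >= d->num_groups)
            break;
        d->run(group, d->user_data);
    }
    return NULL;
}

/* Runs num_groups work-groups on at most num_workers OS threads.
 * Returns 0 once all work-groups have run, -1 if no worker could start. */
static int dispatch_compute(workgroup_fn run, void *user_data,
                            uint32_t num_groups, unsigned num_workers)
{
    struct dispatch d = { .run = run, .user_data = user_data,
                          .num_groups = num_groups };
    pthread_t threads[MAX_WORKERS];
    unsigned started = 0;

    atomic_init(&d.next_group, 0u);
    if (num_workers > MAX_WORKERS)
        num_workers = MAX_WORKERS;

    while (started < num_workers &&
           pthread_create(&threads[started], NULL, worker_main, &d) == 0)
        started++;

    for (unsigned i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started > 0 ? 0 : -1;
}

/* Toy "shader" for illustration: work-group g writes 2*g to out[g]. */
static void double_group_id(uint32_t group_id, void *user_data)
{
    uint32_t *out = user_data;
    out[group_id] = 2u * group_id;
}
```

With this shape the number of OS threads is bounded by the worker count rather than by the number of work-groups, which is the overhead concern raised earlier in the thread.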




 > I have a prototype rasterizer, however I haven't implemented binning for
 > triangles yet or implemented interpolation. Currently, it can handle
 > triangles in 3D homogeneous and calculate edge equations.
 > https://github.com/programmerjake/tiled-renderer
 >
 > A previous 3d renderer that doesn't implement any vectorization and has
 > opengl 1.x level functionality:
 > https://github.com/programmerjake/lib3d/blob/master/softrender.cpp


Well I think we already have a completely fine rasterizer and binning and
whatever else in the llvmpipe code base. I'd much rather any Mesa based
project doesn't throw all of that away. There is no reason the same swrast
backend couldn't be abstracted to be used for both GL and Vulkan, and
introducing another just because it's interesting isn't a great fit for
long term project maintenance.

If there are improvements to llvmpipe that need to be made, then that is
something to possibly consider, but I'm not sure why a swrast vulkan needs a
from-scratch rasterizer implemented. For a project that is so large in scope,
I'd think reusing that code would be of some use, since most of the fun stuff
is all the texture sampling etc.


I actually think implementing the rasterization algorithm is the best 
part. I wanted the rasterization algorithm to be included in the 
shaders, eg. triangle setup and binning would be tacked on to the end of 
the vertex shader and parameter interpolation and early z tests would be 
tacked on to the beginning of the fragment shader and blending on to the 
end. That way, llvm could do more specialization and instruction 
scheduling than is possible in llvmpipe now.


Parameter interpolation, early z test, and blending *are* tacked on to
llvmpipe's fragment shaders.



I don't see how to effectively tack triangle setup into the vertex
shader: the vertex shader applies to vertices, whereas triangle setup and
binning apply to primitives.  Usually, each vertex gets transformed
only once with llvmpipe, no matter how many triangles refer to that vertex.
The only way to tack triangle setup into vertex shading would be if
you processed vertices a primitive at a time.  Of course one could put
an if-statement to skip reprocessing a vertex that was already
processed, but then you have race conditions, and no benefit from inlining.
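
To make the cost argument concrete, here is a small sketch that counts vertex shader invocations for an indexed quad (4 vertices, 2 triangles, 6 indices) under the two schemes. It is a toy model under stated assumptions, not llvmpipe code: vertex_shader, shade_per_vertex, shade_per_primitive and the quad data are all hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vec4 { float x, y, z, w; };

/* Counts how often the toy vertex shader below has run. */
static unsigned vs_invocations;

/* Hypothetical vertex shader: a trivial scale, for illustration only. */
static struct vec4 vertex_shader(struct vec4 in)
{
    vs_invocations++;
    return (struct vec4){ in.x * 0.5f, in.y * 0.5f, in.z, 1.0f };
}

/* Per-vertex scheme: shade every vertex once, then assemble primitives
 * by looking up the already-shaded vertices through the index buffer. */
static unsigned shade_per_vertex(const struct vec4 *verts, size_t num_verts,
                                 const uint16_t *indices, size_t num_indices,
                                 struct vec4 *shaded, struct vec4 *tris)
{
    vs_invocations = 0;
    for (size_t i = 0; i < num_verts; i++)
        shaded[i] = vertex_shader(verts[i]);
    for (size_t i = 0; i < num_indices; i++)
        tris[i] = shaded[indices[i]];
    return vs_invocations;
}

/* Per-primitive scheme: the shader runs again for every index, so
 * vertices shared between triangles are transformed more than once. */
static unsigned shade_per_primitive(const struct vec4 *verts,
                                    const uint16_t *indices,
                                    size_t num_indices, struct vec4 *tris)
{
    vs_invocations = 0;
    for (size_t i = 0; i < num_indices; i++)
        tris[i] = vertex_shader(verts[indices[i]]);
    return vs_invocations;
}

/* A quad drawn as two triangles that share the edge between vertices 0 and 2. */
static const struct vec4 quad_verts[4] = {
    { -1.0f, -1.0f, 0.0f, 1.0f }, { 1.0f, -1.0f, 0.0f, 1.0f },
    {  1.0f,  1.0f, 0.0f, 1.0f }, { -1.0f, 1.0f, 0.0f, 1.0f },
};
static const uint16_t quad_indices[6] = { 0, 1, 2, 0, 2, 3 };
```

For the quad, the per-vertex path runs the shader 4 times and the per-primitive path 6 times. On typical meshes, where a vertex is shared by several triangles, the gap would be larger.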



And I'm afraid that tacking rasterization on too is one of those things that
sounds great on paper but turns out quite bad in practice.  And I speak from
experience: in fact llvmpipe had the last step of rasterization bolted
on the fragment shaders for some time.  But 

[Mesa-dev] [PATCH 3/3] radv: Rename winsys enums.

2017-06-10 Thread Bas Nieuwenhuizen
Don't rename the enums and constants used for metadata.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c  |  4 +--
 src/amd/vulkan/radv_descriptor_set.c  |  2 +-
 src/amd/vulkan/radv_device.c  | 48 +--
 src/amd/vulkan/radv_image.c   |  2 +-
 src/amd/vulkan/radv_pipeline.c|  2 +-
 src/amd/vulkan/radv_pipeline_cache.c  |  2 +-
 src/amd/vulkan/radv_query.c   |  2 +-
 src/amd/vulkan/radv_winsys.h  | 28 ++--
 src/amd/vulkan/si_cmd_buffer.c|  4 +--
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 22 ++--
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h |  2 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 14 
 12 files changed, 63 insertions(+), 69 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index b57ce9fd1de..0a82bf08ec6 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -258,8 +258,8 @@ radv_cmd_buffer_resize_upload_buf(struct radv_cmd_buffer 
*cmd_buffer,
 
bo = device->ws->buffer_create(device->ws,
   new_size, 4096,
-  RADEON_DOMAIN_GTT,
-  RADEON_FLAG_CPU_ACCESS);
+  RADV_DOMAIN_GTT,
+  RADV_FLAG_CPU_ACCESS);
 
if (!bo) {
cmd_buffer->record_fail = true;
diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index 3ea4936bfae..8d2623acd1b 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -427,7 +427,7 @@ VkResult radv_CreateDescriptorPool(
 
if (bo_size) {
pool->bo = device->ws->buffer_create(device->ws, bo_size,
-   32, RADEON_DOMAIN_VRAM, 
0);
+   32, RADV_DOMAIN_VRAM, 
0);
pool->mapped_ptr = (uint8_t*)device->ws->buffer_map(pool->bo);
}
pool->size = bo_size;
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 52de47f4bdc..63634e0db3d 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1126,7 +1126,7 @@ VkResult radv_CreateDevice(
 
if (getenv("RADV_TRACE_FILE")) {
device->trace_bo = device->ws->buffer_create(device->ws, 4096, 
8,
-
RADEON_DOMAIN_VRAM, RADEON_FLAG_CPU_ACCESS);
+RADV_DOMAIN_VRAM, 
RADV_FLAG_CPU_ACCESS);
if (!device->trace_bo)
goto fail;
 
@@ -1550,8 +1550,8 @@ radv_get_preamble_cs(struct radv_queue *queue,
scratch_bo = queue->device->ws->buffer_create(queue->device->ws,
  scratch_size,
  4096,
- 
RADEON_DOMAIN_VRAM,
- 
RADEON_FLAG_NO_CPU_ACCESS);
+ RADV_DOMAIN_VRAM,
+ 
RADV_FLAG_NO_CPU_ACCESS);
if (!scratch_bo)
goto fail;
} else
@@ -1561,8 +1561,8 @@ radv_get_preamble_cs(struct radv_queue *queue,
compute_scratch_bo = 
queue->device->ws->buffer_create(queue->device->ws,
  
compute_scratch_size,
  4096,
- 
RADEON_DOMAIN_VRAM,
- 
RADEON_FLAG_NO_CPU_ACCESS);
+ 
RADV_DOMAIN_VRAM,
+ 
RADV_FLAG_NO_CPU_ACCESS);
if (!compute_scratch_bo)
goto fail;
 
@@ -1573,8 +1573,8 @@ radv_get_preamble_cs(struct radv_queue *queue,
esgs_ring_bo = 
queue->device->ws->buffer_create(queue->device->ws,
esgs_ring_size,
4096,
-   
RADEON_DOMAIN_VRAM,
-   
RADEON_FLAG_NO_CPU_ACCESS);
+   
RADV_DOMAIN_VRAM,
+

[Mesa-dev] [PATCH 2/3] radv: Rename winsys interface structures from radeon* to radv*.

2017-06-10 Thread Bas Nieuwenhuizen
To prevent confusion with a radeon winsys.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_cmd_buffer.c   |  26 ++---
 src/amd/vulkan/radv_cs.h   |  24 ++---
 src/amd/vulkan/radv_descriptor_set.c   |  18 ++--
 src/amd/vulkan/radv_device.c   |  66 ++--
 src/amd/vulkan/radv_image.c|   4 +-
 src/amd/vulkan/radv_meta_buffer.c  |  12 +--
 src/amd/vulkan/radv_private.h  |  86 +++
 src/amd/vulkan/radv_query.c|  12 +--
 src/amd/vulkan/radv_winsys.h   | 102 +-
 src/amd/vulkan/radv_wsi.c  |   8 +-
 src/amd/vulkan/si_cmd_buffer.c |  24 ++---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c  |  38 +++
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h  |   2 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c  | 116 ++---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h  |   4 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c |   4 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c  |   6 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h  |   4 +-
 .../winsys/amdgpu/radv_amdgpu_winsys_public.h  |   2 +-
 19 files changed, 279 insertions(+), 279 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 0fb7bfa4bab..b57ce9fd1de 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -249,7 +249,7 @@ radv_cmd_buffer_resize_upload_buf(struct radv_cmd_buffer 
*cmd_buffer,
  uint64_t min_needed)
 {
uint64_t new_size;
-   struct radeon_winsys_bo *bo;
+   struct radv_winsys_bo *bo;
struct radv_cmd_buffer_upload *upload;
struct radv_device *device = cmd_buffer->device;
 
@@ -334,7 +334,7 @@ radv_cmd_buffer_upload_data(struct radv_cmd_buffer 
*cmd_buffer,
 void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer *cmd_buffer)
 {
struct radv_device *device = cmd_buffer->device;
-   struct radeon_winsys_cs *cs = cmd_buffer->cs;
+   struct radv_winsys_cs *cs = cmd_buffer->cs;
uint64_t va;
 
if (!device->trace_bo)
@@ -537,7 +537,7 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader,
struct ac_vs_output_info *outinfo)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
unsigned export_count;
 
@@ -587,7 +587,7 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader,
struct ac_es_output_info *outinfo)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
@@ -606,7 +606,7 @@ static void
 radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
uint32_t rsrc2 = shader->rsrc2;
 
@@ -631,7 +631,7 @@ static void
 radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer,
struct radv_shader_variant *shader)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
@@ -734,7 +734,7 @@ static void
 radv_emit_geometry_shader(struct radv_cmd_buffer *cmd_buffer,
  struct radv_pipeline *pipeline)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
struct radv_shader_variant *gs;
uint64_t va;
 
@@ -799,7 +799,7 @@ static void
 radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
  struct radv_pipeline *pipeline)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
struct radv_shader_variant *ps;
uint64_t va;
unsigned spi_baryc_cntl = S_0286E0_FRONT_FACE_ALL_BITS(1);
@@ -2004,7 +2004,7 @@ void radv_bind_descriptor_set(struct radv_cmd_buffer 
*cmd_buffer,
  struct radv_descriptor_set *set,
  unsigned idx)
 {
-   struct radeon_winsys *ws = cmd_buffer->device->ws;
+   struct radv_winsys *ws = cmd_buffer->device->ws;
 
assert(!(set->layout->flags & 
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR));
 
@@ -2201,7 +2201,7 

[Mesa-dev] [PATCH 1/3] radv: Rename radv_radeon_winsys.h to radv_winsys.h.

2017-06-10 Thread Bas Nieuwenhuizen
The amdgpu winsys has radv_amdgpu_winsys.h, and getting another
radv_radeon_winsys.h in there for a radeon winsys would be awkward.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/Makefile.sources| 2 +-
 src/amd/vulkan/radv_cmd_buffer.c   | 2 +-
 src/amd/vulkan/radv_image.c| 2 +-
 src/amd/vulkan/radv_private.h  | 2 +-
 src/amd/vulkan/{radv_radeon_winsys.h => radv_winsys.h} | 6 +++---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c  | 2 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h  | 2 +-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h  | 2 +-
 8 files changed, 10 insertions(+), 10 deletions(-)
 rename src/amd/vulkan/{radv_radeon_winsys.h => radv_winsys.h} (98%)

diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources
index d3e0c81e9a3..584aea58213 100644
--- a/src/amd/vulkan/Makefile.sources
+++ b/src/amd/vulkan/Makefile.sources
@@ -56,7 +56,7 @@ VULKAN_FILES := \
radv_pipeline.c \
radv_pipeline_cache.c \
radv_private.h \
-   radv_radeon_winsys.h \
+   radv_winsys.h \
radv_query.c \
radv_util.c \
radv_util.h \
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 1ac9de139f1..0fb7bfa4bab 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -26,7 +26,7 @@
  */
 
 #include "radv_private.h"
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "radv_cs.h"
 #include "sid.h"
 #include "gfx9d.h"
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 91c7e5ff79f..37610f1c249 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -27,7 +27,7 @@
 
 #include "radv_private.h"
 #include "vk_format.h"
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "sid.h"
 #include "gfx9d.h"
 #include "util/debug.h"
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 87cb0a67fe7..d13e4190510 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -50,7 +50,7 @@
 #include "main/macros.h"
 #include "vk_alloc.h"
 
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "ac_binary.h"
 #include "ac_nir_to_llvm.h"
 #include "ac_gpu_info.h"
diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_winsys.h
similarity index 98%
rename from src/amd/vulkan/radv_radeon_winsys.h
rename to src/amd/vulkan/radv_winsys.h
index cdcaeca46ec..b4a4793a127 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_winsys.h
@@ -26,8 +26,8 @@
  * IN THE SOFTWARE.
  */
 
-#ifndef RADV_RADEON_WINSYS_H
-#define RADV_RADEON_WINSYS_H
+#ifndef RADV_WINSYS_H
+#define RADV_WINSYS_H
 
 #include 
 #include 
@@ -238,4 +238,4 @@ static inline void radeon_emit_array(struct 
radeon_winsys_cs *cs,
cs->cdw += count;
 }
 
-#endif /* RADV_RADEON_WINSYS_H */
+#endif /* RADV_WINSYS_H */
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 7b749700d1c..3655e0ebd3a 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -29,7 +29,7 @@
 
 #include "ac_debug.h"
 #include "amdgpu_id.h"
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "radv_amdgpu_cs.h"
 #include "radv_amdgpu_bo.h"
 #include "sid.h"
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h
index 42d89eee54d..41d0b2f2efb 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h
@@ -35,7 +35,7 @@
 #include "r600d_common.h"
 #include 
 
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "radv_amdgpu_winsys.h"
 
 enum {
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h
index 426cf692ec0..2d9cf248389 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h
@@ -28,7 +28,7 @@
 #ifndef RADV_AMDGPU_WINSYS_H
 #define RADV_AMDGPU_WINSYS_H
 
-#include "radv_radeon_winsys.h"
+#include "radv_winsys.h"
 #include "ac_gpu_info.h"
 #include "addrlib/addrinterface.h"
 #include 
-- 
2.13.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 101374] Worms Clan Wars hangs on loading screen

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101374

--- Comment #1 from Hi-Angel  ---
It's unclear whether the OP at the second link described their own system
(which doesn't have the problem), or the system of the friends with the
problem. Probably the first, because a single GPU is mentioned, whilst the
number of "friends" is 2.

In this case there's a clear pattern that the people with the problem are
only using NVidia binary drivers, which have nothing to do with Mesa.

I think you could test whether it works with Nouveau (given it supports the
required OpenGL), and if it does, the problem is surely somewhere else.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 101374] Worms Clan Wars hangs on loading screen

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101374

Bug ID: 101374
   Summary: Worms Clan Wars hangs on loading screen
   Product: Mesa
   Version: unspecified
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: cosiek...@o2.pl
QA Contact: mesa-dev@lists.freedesktop.org

Many people have this bug. I recall it was working correctly in the past.
Not sure if this is the right place for this bugreport. Not sure if specific to
mesa.
http://steamcommunity.com/app/233840/discussions/1/1290691308584156386/
http://steamcommunity.com/app/233840/discussions/1/1290691308579921472/

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #8 from John  ---
Grazvydas,
I've rebuilt mesa at the faulty commit with your 2 patches and it worked as
well as before that commit.

Thank you for the quick fix!



Now if possible I'd love to look at the freezes I've had since my first test
months ago, and still have.

The behavior is somewhat different from the one fixed by Grazvydas' patches in
that the application starts and runs fine for a bit, from a few seconds to
a few minutes, and then the PC seems to freeze much like in my previously
described issue: I still have SSH access, yet trying to restart never works.

These are rather more annoying to debug, though, as no error gets displayed in
dmesg, and since it has always been a problem for me I have no good commit for
a bisection... I've looked at Xorg logs as well but I saw nothing there either.

A simple test for this is to run SaschaWillems/Vulkan/Raytracing, after moving
around for a few seconds the issue will be triggered.
Mad Max's vulkan benchmark is another, this one always freezes in some sort of
cave, I think in the 3rd scene. Maybe something in there triggers it...

I'd be happy to try a patch or a R600_DEBUG parameter to make it a lot more
verbose, or whatever you think is best of course!


Thank you!

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH] ac: Use mov_dpp for derivatives.

2017-06-10 Thread Bas Nieuwenhuizen
Slightly faster than bpermute, and seems supported since at least
LLVM 3.9.

v2: Since this supersedes bpermute, remove the bpermute code.
Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/common/ac_llvm_build.c   | 47 
 src/amd/common/ac_llvm_build.h   |  2 +-
 src/amd/common/ac_nir_to_llvm.c  |  8 +++---
 src/gallium/drivers/radeonsi/si_pipe.c   |  2 +-
 src/gallium/drivers/radeonsi/si_pipe.h   |  2 +-
 src/gallium/drivers/radeonsi/si_shader.c |  4 +--
 6 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 237e9291d41..99d41bf52d6 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -783,41 +783,52 @@ ac_get_thread_id(struct ac_llvm_context *ctx)
  */
 LLVMValueRef
 ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
+ bool has_mov_dpp,
  uint32_t mask,
  int idx,
  LLVMValueRef lds,
  LLVMValueRef val)
 {
-   LLVMValueRef thread_id, tl, trbl, tl_tid, trbl_tid, args[2];
+   LLVMValueRef thread_id, tl, trbl, args[5];
LLVMValueRef result;
 
-   thread_id = ac_get_thread_id(ctx);
+   if (has_mov_dpp) {
+   uint32_t tl_ctrl = 0, trbl_ctrl = 0;
 
-   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
- LLVMConstInt(ctx->i32, mask, false), "");
-
-   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
-   LLVMConstInt(ctx->i32, idx, false), "");
+   for (unsigned i = 0; i < 4; ++i) {
+   tl_ctrl |= (i & mask) << (2 * i);
+   trbl_ctrl |= ((i & mask) + idx) << (2 * i);
+   }
 
-   if (has_ds_bpermute) {
-   args[0] = LLVMBuildMul(ctx->builder, tl_tid,
-  LLVMConstInt(ctx->i32, 4, false), "");
-   args[1] = val;
+   args[0] = val;
+   args[1] = LLVMConstInt(ctx->i32, tl_ctrl, false);
+   args[2] = LLVMConstInt(ctx->i32, 0xf, false);
+   args[3] = LLVMConstInt(ctx->i32, 0xf, false);
+   args[4] = LLVMConstInt(ctx->i1, 1, false);
tl = ac_build_intrinsic(ctx,
-   "llvm.amdgcn.ds.bpermute", ctx->i32,
-   args, 2,
+   "llvm.amdgcn.mov.dpp.i32", ctx->i32,
+   args, 5,
AC_FUNC_ATTR_READNONE |
AC_FUNC_ATTR_CONVERGENT);
 
-   args[0] = LLVMBuildMul(ctx->builder, trbl_tid,
-  LLVMConstInt(ctx->i32, 4, false), "");
+   args[1] = LLVMConstInt(ctx->i32, trbl_ctrl, false);
trbl = ac_build_intrinsic(ctx,
- "llvm.amdgcn.ds.bpermute", ctx->i32,
- args, 2,
+ "llvm.amdgcn.mov.dpp.i32", ctx->i32,
+ args, 5,
  AC_FUNC_ATTR_READNONE |
  AC_FUNC_ATTR_CONVERGENT);
} else {
+   LLVMValueRef tl_tid, trbl_tid;
+
+   thread_id = ac_get_thread_id(ctx);
+
+   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
+   LLVMConstInt(ctx->i32, mask, false), "");
+
+   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
+   LLVMConstInt(ctx->i32, idx, false), "");
+
+
LLVMValueRef store_ptr, load_ptr0, load_ptr1;
 
store_ptr = ac_build_gep0(ctx, lds, thread_id);
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index ebb78fbd79b..14260b05018 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -161,7 +161,7 @@ ac_get_thread_id(struct ac_llvm_context *ctx);
 
 LLVMValueRef
 ac_build_ddxy(struct ac_llvm_context *ctx,
- bool has_ds_bpermute,
+ bool has_mov_dpp,
  uint32_t mask,
  int idx,
  LLVMValueRef lds,
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 49117d21bd2..2385c60d316 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -164,7 +164,7 @@ struct nir_to_llvm_context {
uint8_t num_output_clips;
uint8_t num_output_culls;
 
-   bool has_ds_bpermute;
+   bool has_mov_dpp;
 
bool is_gs_copy_shader;
LLVMValueRef gs_next_vertex;
@@ -1434,7 +1434,7 @@ static LLVMValueRef emit_ddxy(struct nir_to_llvm_context 
*ctx,
LLVMValueRef result;
ctx->has_ddxy = true;
 
-   if (!ctx->lds && 
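
For readers following the patch, the control-word loop in ac_build_ddxy can be tried on the CPU. The sketch below reuses the patch's two expressions to build the quad permutation words (2 bits per lane, each selecting a source lane within the quad) and then emulates the lane reads for a single quad. The helper names are hypothetical, the final subtraction trbl - tl is an assumption about the part of the function not shown in the diff, and the mask/idx pair used in the usage note (mask = ~1, idx = 1 for a horizontal derivative) is likewise assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Control word for the "top-left" read, using the patch's expression:
 * lane i reads from lane (i & mask) of its quad. */
static uint32_t quad_perm_tl(uint32_t mask)
{
    uint32_t ctrl = 0;
    for (unsigned i = 0; i < 4; ++i)
        ctrl |= (i & mask) << (2 * i);
    return ctrl;
}

/* Control word for the "top-right/bottom-left" read:
 * lane i reads from lane (i & mask) + idx. The caller is assumed to pick
 * mask and idx so that the sum stays within 0..3. */
static uint32_t quad_perm_trbl(uint32_t mask, unsigned idx)
{
    uint32_t ctrl = 0;
    for (unsigned i = 0; i < 4; ++i)
        ctrl |= ((i & mask) + idx) << (2 * i);
    return ctrl;
}

/* Source lane (0..3) that `lane` reads under control word `ctrl`. */
static unsigned quad_perm_source(uint32_t ctrl, unsigned lane)
{
    return (ctrl >> (2 * lane)) & 3u;
}

/* CPU emulation for one quad; assumes the derivative is trbl - tl. */
static void quad_ddxy(const float in[4], float out[4],
                      uint32_t mask, unsigned idx)
{
    uint32_t tl = quad_perm_tl(mask);
    uint32_t trbl = quad_perm_trbl(mask, idx);

    for (unsigned lane = 0; lane < 4; ++lane)
        out[lane] = in[quad_perm_source(trbl, lane)] -
                    in[quad_perm_source(tl, lane)];
}
```

With mask = ~1 and idx = 1, lanes 0 and 1 both compute in[1] - in[0] and lanes 2 and 3 both compute in[3] - in[2], i.e. one horizontal difference per row of the quad.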

[Mesa-dev] [PATCH] ac: Use mov_dpp for derivatives.

2017-06-10 Thread Bas Nieuwenhuizen
Slightly faster than bpermute, and seems supported since at least
LLVM 3.9.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/common/ac_llvm_build.c | 78 +-
 1 file changed, 54 insertions(+), 24 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 237e9291d41..62a00f214de 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -789,44 +789,74 @@ ac_build_ddxy(struct ac_llvm_context *ctx,
  LLVMValueRef lds,
  LLVMValueRef val)
 {
-   LLVMValueRef thread_id, tl, trbl, tl_tid, trbl_tid, args[2];
+   LLVMValueRef thread_id, tl, trbl, args[5];
LLVMValueRef result;
 
-   thread_id = ac_get_thread_id(ctx);
-
-   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
- LLVMConstInt(ctx->i32, mask, false), "");
+   /* bpermute is VI+, mov_dpp is VI+ too */
+   if (has_ds_bpermute) {
+   uint32_t tl_ctrl = 0, trbl_ctrl = 0;
 
-   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
-   LLVMConstInt(ctx->i32, idx, false), "");
+   for (unsigned i = 0; i < 4; ++i) {
+   tl_ctrl |= (i & mask) << (2 * i);
+   trbl_ctrl |= ((i & mask) + idx) << (2 * i);
+   }
 
-   if (has_ds_bpermute) {
-   args[0] = LLVMBuildMul(ctx->builder, tl_tid,
-  LLVMConstInt(ctx->i32, 4, false), "");
-   args[1] = val;
+   args[0] = val;
+   args[1] = LLVMConstInt(ctx->i32, tl_ctrl, false);
+   args[2] = LLVMConstInt(ctx->i32, 0xf, false);
+   args[3] = LLVMConstInt(ctx->i32, 0xf, false);
+   args[4] = LLVMConstInt(ctx->i1, 1, false);
tl = ac_build_intrinsic(ctx,
-   "llvm.amdgcn.ds.bpermute", ctx->i32,
-   args, 2,
+   "llvm.amdgcn.mov.dpp.i32", ctx->i32,
+   args, 5,
AC_FUNC_ATTR_READNONE |
AC_FUNC_ATTR_CONVERGENT);
 
-   args[0] = LLVMBuildMul(ctx->builder, trbl_tid,
-  LLVMConstInt(ctx->i32, 4, false), "");
+   args[1] = LLVMConstInt(ctx->i32, trbl_ctrl, false);
trbl = ac_build_intrinsic(ctx,
- "llvm.amdgcn.ds.bpermute", ctx->i32,
- args, 2,
+ "llvm.amdgcn.mov.dpp.i32", ctx->i32,
+ args, 5,
  AC_FUNC_ATTR_READNONE |
  AC_FUNC_ATTR_CONVERGENT);
} else {
-   LLVMValueRef store_ptr, load_ptr0, load_ptr1;
+   LLVMValueRef tl_tid, trbl_tid;
+
+   thread_id = ac_get_thread_id(ctx);
+
+   tl_tid = LLVMBuildAnd(ctx->builder, thread_id,
+   LLVMConstInt(ctx->i32, mask, false), "");
+
+   trbl_tid = LLVMBuildAdd(ctx->builder, tl_tid,
+   LLVMConstInt(ctx->i32, idx, false), "");
+
+   if (has_ds_bpermute) {
+   args[0] = LLVMBuildMul(ctx->builder, tl_tid,
+   LLVMConstInt(ctx->i32, 4, false), "");
+   args[1] = val;
+   tl = ac_build_intrinsic(ctx,
+   "llvm.amdgcn.ds.bpermute", 
ctx->i32,
+   args, 2,
+   AC_FUNC_ATTR_READNONE |
+   AC_FUNC_ATTR_CONVERGENT);
+
+   args[0] = LLVMBuildMul(ctx->builder, trbl_tid,
+   LLVMConstInt(ctx->i32, 4, false), "");
+   trbl = ac_build_intrinsic(ctx,
+   "llvm.amdgcn.ds.bpermute", 
ctx->i32,
+   args, 2,
+   AC_FUNC_ATTR_READNONE |
+   AC_FUNC_ATTR_CONVERGENT);
+   } else {
+   LLVMValueRef store_ptr, load_ptr0, load_ptr1;
 
-   store_ptr = ac_build_gep0(ctx, lds, thread_id);
-   load_ptr0 = ac_build_gep0(ctx, lds, tl_tid);
-   load_ptr1 = ac_build_gep0(ctx, lds, trbl_tid);
+   store_ptr = ac_build_gep0(ctx, lds, thread_id);
+   load_ptr0 = ac_build_gep0(ctx, lds, tl_tid);
+   load_ptr1 = ac_build_gep0(ctx, lds, trbl_tid);
 
-   

Re: [Mesa-dev] [PATCH 1/2] radv: assert on CP_DMA_USE_L2 for SI

2017-06-10 Thread Bas Nieuwenhuizen
The series is

Reviewed-by: Bas Nieuwenhuizen 

On Sat, Jun 10, 2017 at 5:53 PM, Grazvydas Ignotas  wrote:
> The register header (and radeonsi comment) states V_411_SRC_ADDR_TC_L2
> is for CIK+ only, so let's assert on earlier ASICs.
>
> Signed-off-by: Grazvydas Ignotas 
> ---
>  src/amd/vulkan/si_cmd_buffer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
> index 33414c1..962b76f 100644
> --- a/src/amd/vulkan/si_cmd_buffer.c
> +++ b/src/amd/vulkan/si_cmd_buffer.c
> @@ -1191,10 +1191,11 @@ static void si_emit_cp_dma(struct radv_cmd_buffer 
> *cmd_buffer,
> radeon_emit(cs, src_va >> 32);  /* SRC_ADDR_HI [31:0] 
> */
> radeon_emit(cs, dst_va);/* DST_ADDR_LO [31:0] 
> */
> radeon_emit(cs, dst_va >> 32);  /* DST_ADDR_HI [31:0] 
> */
> radeon_emit(cs, command);
> } else {
> +   assert(!(flags & CP_DMA_USE_L2));
> header |= S_411_SRC_ADDR_HI(src_va >> 32);
> radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0));
> radeon_emit(cs, src_va);/* 
> SRC_ADDR_LO [31:0] */
> radeon_emit(cs, header);/* 
> SRC_ADDR_HI [15:0] + flags. */
> radeon_emit(cs, dst_va);/* 
> DST_ADDR_LO [31:0] */
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 94168] Incorrect rendering when running Populous 3 on wine using DDraw->WineD3D->OpenGL wrapper [apitrace]

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94168

--- Comment #10 from Sven Arvidsson  ---
Gabriel Knight 3 is another Wine direct draw game with a similar problem.

When restoring a saved game the screen stops refreshing, but the game still
runs.

Applying the suggested patch or using LIBGL_ALWAYS_SOFTWARE works around the
problem.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH 10/13] radeonsi: pack si_buffer_resources better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.h | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index b616757..d8bf13e 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -260,26 +260,27 @@ struct si_descriptors {
 
 struct si_sampler_views {
struct pipe_sampler_view*views[SI_NUM_SAMPLERS];
struct si_sampler_state *sampler_states[SI_NUM_SAMPLERS];
 
/* The i-th bit is set if that element is enabled (non-NULL resource). 
*/
unsignedenabled_mask;
 };
 
 struct si_buffer_resources {
-   enum radeon_bo_usageshader_usage; /* READ, WRITE, or 
READWRITE */
-   enum radeon_bo_usageshader_usage_constbuf;
-   enum radeon_bo_priority priority;
-   enum radeon_bo_priority priority_constbuf;
struct pipe_resource**buffers; /* this has num_buffers 
elements */
 
+   enum radeon_bo_usageshader_usage:4; /* READ, WRITE, or 
READWRITE */
+   enum radeon_bo_usageshader_usage_constbuf:4;
+   enum radeon_bo_priority priority:6;
+   enum radeon_bo_priority priority_constbuf:6;
+
/* The i-th bit is set if that element is enabled (non-NULL resource). 
*/
unsignedenabled_mask;
 };
 
 #define si_pm4_block_idx(member) \
(offsetof(union si_state, named.member) / sizeof(struct si_pm4_state *))
 
 #define si_pm4_state_changed(sctx, member) \
((sctx)->queued.named.member != (sctx)->emitted.named.member)
 
-- 
2.7.4
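
The effect of this kind of packing can be checked with a small stand-alone sketch. The structs below only mirror the shape of si_buffer_resources; the enum names and values are hypothetical stand-ins, not the real radeon_bo_usage / radeon_bo_priority definitions. The sketch also uses plain unsigned bit-fields, since bit-fields of enum type are implementation-defined in C (the patch relies on the compiler accepting them, and on the field widths being large enough for every enum value).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the usage and priority enums. */
enum bo_usage { BO_USAGE_READ = 2, BO_USAGE_WRITE = 4, BO_USAGE_READWRITE = 6 };
enum bo_priority { BO_PRIORITY_MIN = 0, BO_PRIORITY_MAX = 63 };

/* Layout before the change: four full-width enums ahead of the pointer. */
struct resources_unpacked {
    enum bo_usage shader_usage;
    enum bo_usage shader_usage_constbuf;
    enum bo_priority priority;
    enum bo_priority priority_constbuf;
    void **buffers;
    unsigned enabled_mask;
};

/* Layout after the change: pointer first, then 4+4+6+6 = 20 bits of
 * bit-fields that can share a single unsigned storage unit. */
struct resources_packed {
    void **buffers;
    unsigned shader_usage:4;
    unsigned shader_usage_constbuf:4;
    unsigned priority:6;
    unsigned priority_constbuf:6;
    unsigned enabled_mask;
};

/* Returns 1 if value can be stored in an unsigned bit-field of `bits` bits. */
static int fits_in_bits(unsigned value, unsigned bits)
{
    return value < (1u << bits);
}
```

On a typical 64-bit ABI the packed layout would be expected to take 16 bytes against 32 for the unpacked one, though the exact figures depend on the compiler and target.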



[Mesa-dev] [PATCH 12/13] radeonsi: pack si_framebuffer better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/r600_pipe_common.h |  2 +-
 src/gallium/drivers/radeon/r600_texture.c |  2 +-
 src/gallium/drivers/radeonsi/si_pipe.h| 12 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 84d38fb..56056fa 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -835,21 +835,21 @@ void vi_separate_dcc_start_query(struct pipe_context *ctx,
 void vi_separate_dcc_stop_query(struct pipe_context *ctx,
struct r600_texture *tex);
 void vi_separate_dcc_process_and_reset_stats(struct pipe_context *ctx,
 struct r600_texture *tex);
 void vi_dcc_clear_level(struct r600_common_context *rctx,
struct r600_texture *rtex,
unsigned level, unsigned clear_value);
 void evergreen_do_fast_color_clear(struct r600_common_context *rctx,
   struct pipe_framebuffer_state *fb,
   struct r600_atom *fb_state,
-  unsigned *buffers, unsigned *dirty_cbufs,
+  unsigned *buffers, ubyte *dirty_cbufs,
   const union pipe_color_union *color);
 bool r600_texture_disable_dcc(struct r600_common_context *rctx,
  struct r600_texture *rtex);
 void r600_init_screen_texture_functions(struct r600_common_screen *rscreen);
 void r600_init_context_texture_functions(struct r600_common_context *rctx);
 
 /* r600_viewport.c */
 void evergreen_apply_scissor_bug_workaround(struct r600_common_context *rctx,
struct pipe_scissor_state *scissor);
 void r600_viewport_set_rast_deps(struct r600_common_context *rctx,
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 32275b1..25abdd6 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -2621,21 +2621,21 @@ static void si_set_optimal_micro_tile_mode(struct r600_common_screen *rscreen,
}
 
rtex->surface.micro_tile_mode = rtex->last_msaa_resolve_target_micro_mode;
 
p_atomic_inc(&rscreen->dirty_tex_counter);
 }
 
 void evergreen_do_fast_color_clear(struct r600_common_context *rctx,
   struct pipe_framebuffer_state *fb,
   struct r600_atom *fb_state,
-  unsigned *buffers, unsigned *dirty_cbufs,
+  unsigned *buffers, ubyte *dirty_cbufs,
   const union pipe_color_union *color)
 {
int i;
 
/* This function is broken in BE, so just disable this path for now */
 #ifdef PIPE_ARCH_BIG_ENDIAN
return;
 #endif
 
if (rctx->render_cond)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 55fda4d..388f6e0 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -159,31 +159,31 @@ struct si_textures_info {
 
 struct si_images_info {
struct pipe_image_view  views[SI_NUM_IMAGES];
uint32_tneeds_color_decompress_mask;
unsignedenabled_mask;
 };
 
 struct si_framebuffer {
struct r600_atomatom;
struct pipe_framebuffer_state   state;
-   unsignednr_samples;
-   unsignedlog_samples;
-   unsignedcompressed_cb_mask;
unsignedcolorbuf_enabled_4bit;
unsignedspi_shader_col_format;
unsignedspi_shader_col_format_alpha;
unsignedspi_shader_col_format_blend;
unsignedspi_shader_col_format_blend_alpha;
-   unsignedcolor_is_int8;
-   unsignedcolor_is_int10;
-   unsigneddirty_cbufs;
+   ubyte   nr_samples:5; /* at most 16xAA */
+   ubyte   log_samples:3; /* at most 4 = 16xAA */
+   ubyte   compressed_cb_mask;
+   ubyte   color_is_int8;
+   ubyte   color_is_int10;
+   ubyte   dirty_cbufs;
booldirty_zsbuf;
boolany_dst_linear;
booldo_update_surf_dirtiness;
 };
 
 struct si_clip_state {
struct r600_atomatom;
struct pipe_clip_state  
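
A small sketch of why the patch picks 5 bits for nr_samples and 3 bits for log_samples: 16 samples is binary 10000 and needs 5 bits, while its logarithm, 4, fits in 3. The `ubyte` typedef is declared locally so the example stands alone, and `fb_samples_from_count` is a hypothetical helper, not a radeonsi function. Bit-fields of type `unsigned char` are implementation-defined in C; GCC and Clang accept them.

```c
#include <assert.h>

typedef unsigned char ubyte; /* declared locally so the sketch stands alone */

struct fb_samples {
    ubyte nr_samples:5;  /* 1..16 needs 5 bits, since 16 is 0b10000 */
    ubyte log_samples:3; /* 0..4 fits in 3 bits */
};

/* Hypothetical helper: derive both fields from a power-of-two sample count. */
static struct fb_samples fb_samples_from_count(unsigned nr)
{
    struct fb_samples s;
    unsigned log = 0;

    /* Only 1, 2, 4, 8 and 16 are representable in this layout. */
    assert(nr >= 1 && nr <= 16 && (nr & (nr - 1)) == 0);

    while ((1u << log) < nr)
        log++;

    s.nr_samples = nr;
    s.log_samples = log;
    return s;
}
```

If 32xAA ever had to be supported, nr_samples would need a sixth bit, which is why the "at most 16xAA" comments in the patch matter.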

[Mesa-dev] [PATCH 07/13] radeonsi: replace si_vertex_elements::elements with separate fields

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_descriptors.c   | 9 -
 src/gallium/drivers/radeonsi/si_state.c | 8 +---
 src/gallium/drivers/radeonsi/si_state.h | 4 +++-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 7 ++-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index af68ac9..22888a6 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -999,21 +999,21 @@ static void si_get_buffer_from_descriptors(struct 
si_buffer_resources *buffers,
 
 /* VERTEX BUFFERS */
 
 static void si_vertex_buffers_begin_new_cs(struct si_context *sctx)
 {
struct si_descriptors *desc = &sctx->vertex_buffers;
int count = sctx->vertex_elements ? sctx->vertex_elements->count : 0;
int i;
 
for (i = 0; i < count; i++) {
-   int vb = sctx->vertex_elements->elements[i].vertex_buffer_index;
+   int vb = sctx->vertex_elements->vertex_buffer_index[i];
 
if (vb >= ARRAY_SIZE(sctx->vertex_buffer))
continue;
if (!sctx->vertex_buffer[vb].buffer.resource)
continue;
 
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  (struct r600_resource*)sctx->vertex_buffer[vb].buffer.resource,
  RADEON_USAGE_READ, RADEON_PRIO_VERTEX_BUFFER);
}
@@ -1058,35 +1058,34 @@ bool si_upload_vertex_buffer_descriptors(struct 
si_context *sctx)
if (!desc->buffer)
return false;
 
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 
assert(count <= SI_MAX_ATTRIBS);
 
for (i = 0; i < count; i++) {
-   struct pipe_vertex_element *ve = &velems->elements[i];
struct pipe_vertex_buffer *vb;
struct r600_resource *rbuffer;
unsigned offset;
-   unsigned vbo_index = ve->vertex_buffer_index;
+   unsigned vbo_index = velems->vertex_buffer_index[i];
uint32_t *desc = &ptr[i*4];
 
vb = &sctx->vertex_buffer[vbo_index];
rbuffer = (struct r600_resource*)vb->buffer.resource;
if (!rbuffer) {
memset(desc, 0, 16);
continue;
}
 
-   offset = vb->buffer_offset + ve->src_offset;
+   offset = vb->buffer_offset + velems->src_offset[i];
va = rbuffer->gpu_address + offset;
 
/* Fill in T# buffer resource description */
desc[0] = va;
desc[1] = S_008F04_BASE_ADDRESS_HI(va >> 32) |
  S_008F04_STRIDE(vb->stride);
 
if (sctx->b.chip_class != VI && vb->stride) {
/* Round up by rounding down and adding 1 */
desc[2] = (vb->buffer.resource->width0 - offset -
@@ -1630,21 +1629,21 @@ static void si_rebind_buffer(struct pipe_context *ctx, 
struct pipe_resource *buf
 
/* We changed the buffer, now we need to bind it where the old one
 * was bound. This consists of 2 things:
 *   1) Updating the resource descriptor and dirtying it.
 *   2) Adding a relocation to the CS, so that it's usable.
 */
 
/* Vertex buffers. */
if (rbuffer->bind_history & PIPE_BIND_VERTEX_BUFFER) {
for (i = 0; i < num_elems; i++) {
-   int vb = 
sctx->vertex_elements->elements[i].vertex_buffer_index;
+   int vb = sctx->vertex_elements->vertex_buffer_index[i];
 
if (vb >= ARRAY_SIZE(sctx->vertex_buffer))
continue;
if (!sctx->vertex_buffer[vb].buffer.resource)
continue;
 
if (sctx->vertex_buffer[vb].buffer.resource == buf) {
sctx->vertex_buffers_dirty = true;
break;
}
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 3b15545..1cd1f91 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3733,36 +3733,40 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
unsigned data_format, num_format;
int first_non_void;
unsigned vbo_index = elements[i].vertex_buffer_index;
unsigned char swizzle[4];
 
if (vbo_index >= SI_NUM_VERTEX_BUFFERS) {
FREE(v);
return NULL;
}
 
-   if 
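
The change in this patch, from `elements[i].vertex_buffer_index` to `vertex_buffer_index[i]`, is an array-of-structures to separate-arrays conversion. A minimal sketch of the idea follows; the struct names, the `MAX_ATTRIBS` value and the `soa_from_aos` helper are hypothetical, and the narrow field types are an assumption borrowed from the later packing patch in this series.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ATTRIBS 16 /* hypothetical; stands in for SI_MAX_ATTRIBS */

/* Old layout: one full record per element, mostly unused by the driver. */
struct elem_aos {
    unsigned src_offset;
    unsigned instance_divisor;
    unsigned vertex_buffer_index;
    unsigned src_format;
};

struct velems_aos {
    unsigned count;
    struct elem_aos elements[MAX_ATTRIBS];
};

/* New layout: only the fields that are read later, each in its own array. */
struct velems_soa {
    unsigned count;
    uint16_t src_offset[MAX_ATTRIBS];
    uint8_t vertex_buffer_index[MAX_ATTRIBS];
};

/* Copy the fields of interest out of the per-element records. */
static void soa_from_aos(struct velems_soa *dst, const struct velems_aos *src)
{
    assert(src->count <= MAX_ATTRIBS);

    dst->count = src->count;
    for (unsigned i = 0; i < src->count; i++) {
        dst->src_offset[i] = (uint16_t)src->elements[i].src_offset;
        dst->vertex_buffer_index[i] =
            (uint8_t)src->elements[i].vertex_buffer_index;
    }
}
```

Besides dropping unused fields, separate arrays let each field use the narrowest type that holds its range, which a shared record type does not allow.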

[Mesa-dev] [PATCH 02/13] radeonsi: use uint32_t to declare si_shader_key.opt.kill_outputs

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

The next patch will benefit from this.
---
 src/gallium/drivers/radeonsi/si_shader.c| 8 +---
 src/gallium/drivers/radeonsi/si_shader.h| 3 ++-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 5 +++--
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a6b7e5e..4ee4a64 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2269,35 +2269,36 @@ static void si_llvm_export_vs(struct 
lp_build_tgsi_context *bld_base,
unsigned semantic_name, semantic_index;
unsigned target;
unsigned param_count = 0;
unsigned pos_idx;
int i;
 
for (i = 0; i < noutput; i++) {
semantic_name = outputs[i].semantic_name;
semantic_index = outputs[i].semantic_index;
bool export_param = true;
+   unsigned id;
 
switch (semantic_name) {
case TGSI_SEMANTIC_POSITION: /* ignore these */
case TGSI_SEMANTIC_PSIZE:
case TGSI_SEMANTIC_CLIPVERTEX:
case TGSI_SEMANTIC_EDGEFLAG:
break;
case TGSI_SEMANTIC_GENERIC:
/* don't process indices the function can't handle */
if (semantic_index >= SI_MAX_IO_GENERIC)
break;
/* fall through */
default:
-   if (shader->key.opt.kill_outputs &
-   (1ull << 
si_shader_io_get_unique_index(semantic_name, semantic_index)))
+   id = si_shader_io_get_unique_index(semantic_name, 
semantic_index);
+   if (shader->key.opt.kill_outputs[id / 32] & (1u << (id 
% 32)))
export_param = false;
}
 
if (outputs[i].vertex_stream[0] != 0 &&
outputs[i].vertex_stream[1] != 0 &&
outputs[i].vertex_stream[2] != 0 &&
outputs[i].vertex_stream[3] != 0)
export_param = false;
 
 handle_semantic:
@@ -5328,21 +5329,22 @@ static void si_dump_shader_key(unsigned processor, 
const struct si_shader *shade
break;
 
default:
assert(0);
}
 
if ((processor == PIPE_SHADER_GEOMETRY ||
 processor == PIPE_SHADER_TESS_EVAL ||
 processor == PIPE_SHADER_VERTEX) &&
!key->as_es && !key->as_ls) {
-   fprintf(f, "  opt.kill_outputs = 0x%"PRIx64"\n", 
key->opt.kill_outputs);
+   fprintf(f, "  opt.kill_outputs[0] = 0x%x\n", 
key->opt.kill_outputs[0]);
+   fprintf(f, "  opt.kill_outputs[1] = 0x%x\n", 
key->opt.kill_outputs[1]);
fprintf(f, "  opt.clip_disable = %u\n", key->opt.clip_disable);
}
 }
 
 static void si_init_shader_ctx(struct si_shader_context *ctx,
   struct si_screen *sscreen,
   LLVMTargetMachineRef tm)
 {
struct lp_build_tgsi_context *bld_base;
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index de520a2..76e09b2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -494,21 +494,22 @@ struct si_shader_key {
union {
uint64_tff_tcs_inputs_to_copy; /* for 
fixed-func TCS */
/* When PS needs PrimID and GS is disabled. */
unsignedvs_export_prim_id:1;
} u;
} mono;
 
/* Optimization flags for asynchronous compilation only. */
struct {
/* For HW VS (it can be VS, TES, GS) */
-   uint64_tkill_outputs; /* "get_unique_index" bits */
+   /* Don't use "uint64_t" in order to get 32-bit alignment. */
+   uint32_tkill_outputs[2]; /* "get_unique_index" bits */
unsignedclip_disable:1;
 
/* For shaders where monolithic variants have better code.
 *
 * This is a flag that has no effect on code generation,
 * but forces monolithic shaders to be used as soon as
 * possible, because it's in the "opt" group.
 */
unsignedprefer_mono:1;
} opt;
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 07e6a42..15e46b5 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1234,23 +1234,24 @@ static void si_shader_selector_key_hw_vs(struct 
si_context *sctx,
uint64_t inputs_read = 0;
 
/* 
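
The `id / 32` and `id % 32` arithmetic in this patch addresses a 64-bit mask stored as two 32-bit words. A minimal sketch, with a hypothetical `opt_key` struct and helper names standing in for the real shader key:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "opt" part of the shader key. */
struct opt_key {
    uint32_t kill_outputs[2]; /* bits 0..31 in word 0, bits 32..63 in word 1 */
};

/* Mark output `id` as killed. */
static void kill_output(struct opt_key *key, unsigned id)
{
    assert(id < 64);
    key->kill_outputs[id / 32] |= 1u << (id % 32);
}

/* Same test as in the patch: pick the word, then the bit inside it. */
static bool output_is_killed(const struct opt_key *key, unsigned id)
{
    assert(id < 64);
    return (key->kill_outputs[id / 32] & (1u << (id % 32))) != 0;
}
```

Note that the shift uses `1u`, not `1ull`: after the split each word is only 32 bits wide, so the shift count is always below 32.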

[Mesa-dev] [PATCH 01/13] radeonsi: remove 8 bytes from si_shader_key by flattening opt.hw_vs

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c| 10 +-
 src/gallium/drivers/radeonsi/si_shader.h|  7 +++
 src/gallium/drivers/radeonsi/si_state.c |  2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 12 ++--
 4 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2c92269..a6b7e5e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2282,21 +2282,21 @@ static void si_llvm_export_vs(struct 
lp_build_tgsi_context *bld_base,
case TGSI_SEMANTIC_PSIZE:
case TGSI_SEMANTIC_CLIPVERTEX:
case TGSI_SEMANTIC_EDGEFLAG:
break;
case TGSI_SEMANTIC_GENERIC:
/* don't process indices the function can't handle */
if (semantic_index >= SI_MAX_IO_GENERIC)
break;
/* fall through */
default:
-   if (shader->key.opt.hw_vs.kill_outputs &
+   if (shader->key.opt.kill_outputs &
(1ull << 
si_shader_io_get_unique_index(semantic_name, semantic_index)))
export_param = false;
}
 
if (outputs[i].vertex_stream[0] != 0 &&
outputs[i].vertex_stream[1] != 0 &&
outputs[i].vertex_stream[2] != 0 &&
outputs[i].vertex_stream[3] != 0)
export_param = false;
 
@@ -2314,28 +2314,28 @@ handle_semantic:
semantic_name = TGSI_SEMANTIC_GENERIC;
goto handle_semantic;
case TGSI_SEMANTIC_VIEWPORT_INDEX:
viewport_index_value = outputs[i].values[0];
semantic_name = TGSI_SEMANTIC_GENERIC;
goto handle_semantic;
case TGSI_SEMANTIC_POSITION:
target = V_008DFC_SQ_EXP_POS;
break;
case TGSI_SEMANTIC_CLIPDIST:
-   if (shader->key.opt.hw_vs.clip_disable) {
+   if (shader->key.opt.clip_disable) {
semantic_name = TGSI_SEMANTIC_GENERIC;
goto handle_semantic;
}
target = V_008DFC_SQ_EXP_POS + 2 + semantic_index;
break;
case TGSI_SEMANTIC_CLIPVERTEX:
-   if (shader->key.opt.hw_vs.clip_disable)
+   if (shader->key.opt.clip_disable)
continue;
si_llvm_emit_clipvertex(bld_base, pos_args, 
outputs[i].values);
continue;
case TGSI_SEMANTIC_COLOR:
case TGSI_SEMANTIC_BCOLOR:
case TGSI_SEMANTIC_PRIMID:
case TGSI_SEMANTIC_FOG:
case TGSI_SEMANTIC_TEXCOORD:
case TGSI_SEMANTIC_GENERIC:
if (!export_param)
@@ -5328,22 +5328,22 @@ static void si_dump_shader_key(unsigned processor, 
const struct si_shader *shade
break;
 
default:
assert(0);
}
 
if ((processor == PIPE_SHADER_GEOMETRY ||
 processor == PIPE_SHADER_TESS_EVAL ||
 processor == PIPE_SHADER_VERTEX) &&
!key->as_es && !key->as_ls) {
-   fprintf(f, "  opt.hw_vs.kill_outputs = 0x%"PRIx64"\n", 
key->opt.hw_vs.kill_outputs);
-   fprintf(f, "  opt.hw_vs.clip_disable = %u\n", 
key->opt.hw_vs.clip_disable);
+   fprintf(f, "  opt.kill_outputs = 0x%"PRIx64"\n", 
key->opt.kill_outputs);
+   fprintf(f, "  opt.clip_disable = %u\n", key->opt.clip_disable);
}
 }
 
 static void si_init_shader_ctx(struct si_shader_context *ctx,
   struct si_screen *sscreen,
   LLVMTargetMachineRef tm)
 {
struct lp_build_tgsi_context *bld_base;
 
si_llvm_context_init(ctx, sscreen, tm);
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 7c04b7e..de520a2 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -493,24 +493,23 @@ struct si_shader_key {
 
union {
uint64_tff_tcs_inputs_to_copy; /* for 
fixed-func TCS */
/* When PS needs PrimID and GS is disabled. */
unsignedvs_export_prim_id:1;
} u;
} mono;
 
/* Optimization flags for asynchronous compilation only. */
struct {
-   struct {
-   uint64_t  

[Mesa-dev] [PATCH 05/13] radeonsi: allocate si_state_rasterizer::pm4_poly_offset only when needed

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

Each element has over 700 bytes.
---
 src/gallium/drivers/radeonsi/si_state.c | 14 +-
 src/gallium/drivers/radeonsi/si_state.h |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 27a88a8..f4d6ae1 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -863,20 +863,29 @@ static void *si_create_rs_state(struct pipe_context *ctx,
S_028814_CULL_BACK((state->cull_face & PIPE_FACE_BACK) ? 1 : 0) 
|
S_028814_FACE(!state->front_ccw) |
S_028814_POLY_OFFSET_FRONT_ENABLE(util_get_offset(state, 
state->fill_front)) |
S_028814_POLY_OFFSET_BACK_ENABLE(util_get_offset(state, 
state->fill_back)) |
S_028814_POLY_OFFSET_PARA_ENABLE(state->offset_point || 
state->offset_line) |
S_028814_POLY_MODE(state->fill_front != PIPE_POLYGON_MODE_FILL 
||
   state->fill_back != PIPE_POLYGON_MODE_FILL) |

S_028814_POLYMODE_FRONT_PTYPE(si_translate_fill(state->fill_front)) |

S_028814_POLYMODE_BACK_PTYPE(si_translate_fill(state->fill_back)));
 
+   if (!rs->uses_poly_offset)
+   return rs;
+
+   rs->pm4_poly_offset = CALLOC(3, sizeof(struct si_pm4_state));
+   if (!rs->pm4_poly_offset) {
+   FREE(rs);
+   return NULL;
+   }
+
/* Precalculate polygon offset states for 16-bit, 24-bit, and 32-bit 
zbuffers. */
for (i = 0; i < 3; i++) {
struct si_pm4_state *pm4 = &rs->pm4_poly_offset[i];
float offset_units = state->offset_units;
float offset_scale = state->offset_scale * 16.0f;
uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
if (!state->offset_units_unscaled) {
switch (i) {
case 0: /* 16-bit zbuffer */
@@ -958,24 +967,27 @@ static void si_bind_rs_state(struct pipe_context *ctx, 
void *state)
old_rs->poly_smooth != rs->poly_smooth ||
old_rs->line_smooth != rs->line_smooth ||
old_rs->clamp_fragment_color != rs->clamp_fragment_color ||
old_rs->force_persample_interp != rs->force_persample_interp)
sctx->do_update_shaders = true;
 }
 
 static void si_delete_rs_state(struct pipe_context *ctx, void *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
+   struct si_state_rasterizer *rs = (struct si_state_rasterizer *)state;
 
if (sctx->queued.named.rasterizer == state)
si_pm4_bind_state(sctx, poly_offset, NULL);
-   si_pm4_delete_state(sctx, rasterizer, (struct si_state_rasterizer 
*)state);
+
+   FREE(rs->pm4_poly_offset);
+   si_pm4_delete_state(sctx, rasterizer, rs);
 }
 
 /*
  * inferred state between dsa and stencil ref
  */
 static void si_emit_stencil_ref(struct si_context *sctx, struct r600_atom 
*atom)
 {
struct radeon_winsys_cs *cs = sctx->b.gfx.cs;
struct pipe_stencil_ref *ref = &sctx->stencil_ref.state;
struct si_dsa_stencil_ref_part *dsa = &sctx->stencil_ref.dsa_part;
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index dabe9b9..8de8675 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -53,21 +53,21 @@ struct si_state_blend {
/* Set 0xf or 0x0 (4 bits) per render target if the following is
 * true. ANDed with spi_shader_col_format.
 */
unsignedblend_enable_4bit;
unsignedneed_src_alpha_4bit;
 };
 
 struct si_state_rasterizer {
struct si_pm4_state pm4;
/* poly offset states for 16-bit, 24-bit, and 32-bit zbuffers */
-   struct si_pm4_state pm4_poly_offset[3];
+   struct si_pm4_state *pm4_poly_offset;
unsignedpa_sc_line_stipple;
unsignedpa_cl_clip_cntl;
unsignedsprite_coord_enable:8;
unsignedclip_plane_enable:8;
unsignedflatshade:1;
unsignedtwo_side:1;
unsignedmultisample_enable:1;
unsignedforce_persample_interp:1;
unsignedline_stipple_enable:1;
unsignedpoly_stipple_enable:1;
-- 
2.7.4
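
The allocation pattern in this patch can be sketched in isolation as follows. This is only an illustration using plain `calloc`/`free` instead of Mesa's `CALLOC`/`FREE` macros; the struct names, the 700-byte payload and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large precomputed state object. */
struct pm4_state {
    unsigned char payload[700];
};

struct rasterizer {
    bool uses_poly_offset;
    struct pm4_state *pm4_poly_offset; /* 3 entries, or NULL if unused */
};

/* Allocate the three poly-offset states only when they will be used. */
static struct rasterizer *rasterizer_create(bool uses_poly_offset)
{
    struct rasterizer *rs = calloc(1, sizeof(*rs));

    if (!rs)
        return NULL;

    rs->uses_poly_offset = uses_poly_offset;
    if (!rs->uses_poly_offset)
        return rs;

    rs->pm4_poly_offset = calloc(3, sizeof(struct pm4_state));
    if (!rs->pm4_poly_offset) {
        free(rs);
        return NULL;
    }
    return rs;
}

/* Return the state for zbuffer format index 0..2, or NULL if not allocated. */
static struct pm4_state *poly_offset_state(struct rasterizer *rs, unsigned i)
{
    assert(i < 3);
    return rs->pm4_poly_offset ? &rs->pm4_poly_offset[i] : NULL;
}

/* free(NULL) is a no-op, so the unused case needs no special handling. */
static void rasterizer_destroy(struct rasterizer *rs)
{
    if (!rs)
        return;
    free(rs->pm4_poly_offset);
    free(rs);
}
```

The cost of the pattern is that every reader of the pointer has to tolerate NULL, which is why the sketch routes access through one helper.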



[Mesa-dev] [PATCH 13/13] radeonsi: pack si_context better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

There isn't much to gain here.
---
 src/gallium/drivers/radeonsi/si_pipe.h | 36 +-
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 388f6e0..eef05cf 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -230,33 +230,33 @@ union si_vgt_param_key {
 
 struct si_context {
struct r600_common_context  b;
struct blitter_context  *blitter;
void*custom_dsa_flush;
void*custom_blend_resolve;
void*custom_blend_fmask_decompress;
void*custom_blend_eliminate_fastclear;
void*custom_blend_dcc_decompress;
struct si_screen*screen;
+   LLVMTargetMachineReftm; /* only non-threaded compilation */
+   struct si_shader_ctx_state  fixed_func_tcs_shader;
 
struct radeon_winsys_cs *ce_ib;
struct radeon_winsys_cs *ce_preamble_ib;
struct r600_resource*ce_ram_saved_buffer;
-   unsignedce_ram_saved_offset;
-   unsignedtotal_ce_ram_allocated;
-   boolce_need_synchronization;
struct u_suballocator   *ce_suballocator;
+   unsignedce_ram_saved_offset;
+   uint16_ttotal_ce_ram_allocated;
+   boolce_need_synchronization:1;
 
-   struct si_shader_ctx_state  fixed_func_tcs_shader;
-   LLVMTargetMachineReftm; /* only non-threaded compilation */
-   boolgfx_flush_in_progress;
-   boolcompute_is_busy;
+   boolgfx_flush_in_progress:1;
+   boolcompute_is_busy:1;
 
/* Atoms (direct states). */
union si_state_atomsatoms;
unsigneddirty_atoms; /* mask */
/* PM4 states (precomputed immutable states) */
unsigneddirty_states;
union si_state  queued;
union si_state  emitted;
 
/* Atom declarations. */
@@ -320,49 +320,49 @@ struct si_context {
/* Vertex and index buffers. */
boolvertex_buffers_dirty;
boolvertex_buffer_pointer_dirty;
struct pipe_vertex_buffer   vertex_buffer[SI_NUM_VERTEX_BUFFERS];
 
/* MSAA config state. */
int ps_iter_samples;
boolsmoothing_enabled;
 
/* DB render state. */
-   booldbcb_depth_copy_enabled;
-   booldbcb_stencil_copy_enabled;
-   unsigneddbcb_copy_sample;
-   booldb_flush_depth_inplace;
-   booldb_flush_stencil_inplace;
-   booldb_depth_clear;
-   booldb_depth_disable_expclear;
-   booldb_stencil_clear;
-   booldb_stencil_disable_expclear;
unsignedps_db_shader_control;
-   boolocclusion_queries_disabled;
+   unsigneddbcb_copy_sample;
+   booldbcb_depth_copy_enabled:1;
+   booldbcb_stencil_copy_enabled:1;
+   booldb_flush_depth_inplace:1;
+   booldb_flush_stencil_inplace:1;
+   booldb_depth_clear:1;
+   booldb_depth_disable_expclear:1;
+   booldb_stencil_clear:1;
+   booldb_stencil_disable_expclear:1;
+   boolocclusion_queries_disabled:1;
 
/* Emitted draw state. */
+   boolgs_tri_strip_adj_fix:1;
int last_index_size;
int last_base_vertex;
int last_start_instance;
int last_drawid;
int last_sh_base_reg;
int last_primitive_restart_en;
int last_restart_index;
int last_gs_out_prim;
int last_prim;
int last_multi_vgt_param;
int last_rast_prim;
unsignedlast_sc_line_stipple;
unsignedcurrent_vs_state;
unsignedlast_vs_state;
enum pipe_prim_type 

[Mesa-dev] [PATCH 06/13] radeonsi: rename si_vertex_element -> si_vertex_elements

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_descriptors.c | 2 +-
 src/gallium/drivers/radeonsi/si_pipe.h| 2 +-
 src/gallium/drivers/radeonsi/si_state.c   | 6 +++---
 src/gallium/drivers/radeonsi/si_state.h   | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index b04d108..af68ac9 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -1020,21 +1020,21 @@ static void si_vertex_buffers_begin_new_cs(struct 
si_context *sctx)
 
if (!desc->buffer)
return;
radeon_add_to_buffer_list(&sctx->b, &sctx->b.gfx,
  desc->buffer, RADEON_USAGE_READ,
  RADEON_PRIO_DESCRIPTORS);
 }
 
 bool si_upload_vertex_buffer_descriptors(struct si_context *sctx)
 {
-   struct si_vertex_element *velems = sctx->vertex_elements;
+   struct si_vertex_elements *velems = sctx->vertex_elements;
struct si_descriptors *desc = &sctx->vertex_buffers;
unsigned i, count;
unsigned desc_list_byte_size;
unsigned first_vb_use_mask;
uint64_t va;
uint32_t *ptr;
 
if (!sctx->vertex_buffers_dirty || !velems)
return true;
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 0aaec79..46b095d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -283,21 +283,21 @@ struct si_context {
 
/* shaders */
struct si_shader_ctx_state  ps_shader;
struct si_shader_ctx_state  gs_shader;
struct si_shader_ctx_state  vs_shader;
struct si_shader_ctx_state  tcs_shader;
struct si_shader_ctx_state  tes_shader;
struct si_cs_shader_state   cs_shader_state;
 
/* shader information */
-   struct si_vertex_element*vertex_elements;
+   struct si_vertex_elements   *vertex_elements;
unsignedsprite_coord_enable;
boolflatshade;
booldo_update_shaders;
 
/* shader descriptors */
struct si_descriptors   vertex_buffers;
struct si_descriptors   descriptors[SI_NUM_DESCS];
unsigneddescriptors_dirty;
unsignedshader_pointers_dirty;
unsignedshader_needs_decompress_mask;
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index f4d6ae1..3b15545 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3709,21 +3709,21 @@ static void si_delete_sampler_state(struct pipe_context 
*ctx, void *state)
 
 /*
  * Vertex elements & buffers
  */
 
 static void *si_create_vertex_elements(struct pipe_context *ctx,
   unsigned count,
   const struct pipe_vertex_element 
*elements)
 {
struct si_screen *sscreen = (struct si_screen*)ctx->screen;
-   struct si_vertex_element *v = CALLOC_STRUCT(si_vertex_element);
+   struct si_vertex_elements *v = CALLOC_STRUCT(si_vertex_elements);
bool used[SI_NUM_VERTEX_BUFFERS] = {};
int i;
 
assert(count <= SI_MAX_ATTRIBS);
if (!v)
return NULL;
 
v->count = count;
v->desc_list_byte_size = align(count * 16, SI_CPDMA_ALIGNMENT);
 
@@ -3849,22 +3849,22 @@ static void *si_create_vertex_elements(struct 
pipe_context *ctx,
   S_008F0C_DATA_FORMAT(data_format);
}
memcpy(v->elements, elements, sizeof(struct pipe_vertex_element) * 
count);
 
return v;
 }
 
 static void si_bind_vertex_elements(struct pipe_context *ctx, void *state)
 {
struct si_context *sctx = (struct si_context *)ctx;
-   struct si_vertex_element *old = sctx->vertex_elements;
-   struct si_vertex_element *v = (struct si_vertex_element*)state;
+   struct si_vertex_elements *old = sctx->vertex_elements;
+   struct si_vertex_elements *v = (struct si_vertex_elements*)state;
 
sctx->vertex_elements = v;
sctx->vertex_buffers_dirty = true;
 
if (v &&
(!old ||
 old->count != v->count ||
 old->uses_instance_divisors != v->uses_instance_divisors ||
 v->uses_instance_divisors || /* we don't check which divisors 
changed */
 memcmp(old->fix_fetch, v->fix_fetch, sizeof(v->fix_fetch[0]) * 
v->count)))
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index 8de8675..63a1713 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -91,21 +91,21 

[Mesa-dev] [PATCH 04/13] radeonsi: pack si_state_rasterizer fields

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.h | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index 390e16f..dabe9b9 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -54,38 +54,38 @@ struct si_state_blend {
 * true. ANDed with spi_shader_col_format.
 */
unsignedblend_enable_4bit;
unsignedneed_src_alpha_4bit;
 };
 
 struct si_state_rasterizer {
struct si_pm4_state pm4;
/* poly offset states for 16-bit, 24-bit, and 32-bit zbuffers */
struct si_pm4_state pm4_poly_offset[3];
-   boolflatshade;
-   booltwo_side;
-   boolmultisample_enable;
-   boolforce_persample_interp;
-   boolline_stipple_enable;
-   unsignedsprite_coord_enable;
unsignedpa_sc_line_stipple;
unsignedpa_cl_clip_cntl;
-   unsignedclip_plane_enable;
-   boolpoly_stipple_enable;
-   boolline_smooth;
-   boolpoly_smooth;
-   booluses_poly_offset;
-   boolclamp_fragment_color;
-   boolclamp_vertex_color;
-   boolrasterizer_discard;
-   boolscissor_enable;
-   boolclip_halfz;
+   unsignedsprite_coord_enable:8;
+   unsignedclip_plane_enable:8;
+   unsignedflatshade:1;
+   unsignedtwo_side:1;
+   unsignedmultisample_enable:1;
+   unsignedforce_persample_interp:1;
+   unsignedline_stipple_enable:1;
+   unsignedpoly_stipple_enable:1;
+   unsignedline_smooth:1;
+   unsignedpoly_smooth:1;
+   unsigneduses_poly_offset:1;
+   unsignedclamp_fragment_color:1;
+   unsignedclamp_vertex_color:1;
+   unsignedrasterizer_discard:1;
+   unsignedscissor_enable:1;
+   unsignedclip_halfz:1;
 };
 
 struct si_dsa_stencil_ref_part {
uint8_t valuemask[2];
uint8_t writemask[2];
 };
 
 struct si_state_dsa {
struct si_pm4_state pm4;
unsignedalpha_func;
-- 
2.7.4
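
One thing to keep in mind when turning `bool` members into `unsigned x:1`, as this patch does: a `bool` normalizes any non-zero value to 1, while a 1-bit unsigned field keeps only the lowest bit. A minimal sketch with hypothetical struct and helper names:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cut-down versions of the before and after layouts. */
struct rs_unpacked {
    bool flatshade;
    bool two_side;
    unsigned sprite_coord_enable;
    unsigned clip_plane_enable;
};

struct rs_packed {
    unsigned sprite_coord_enable:8;
    unsigned clip_plane_enable:8;
    unsigned flatshade:1;
    unsigned two_side:1;
};

/* Storing a raw mask keeps only bit 0, so an even value reads back as 0. */
static unsigned store_raw(unsigned value)
{
    struct rs_packed rs = {0};

    rs.flatshade = value;
    return rs.flatshade;
}

/* Normalizing with !! first gives the same result a bool member would. */
static unsigned store_normalized(unsigned value)
{
    struct rs_packed rs = {0};

    rs.flatshade = !!value;
    return rs.flatshade;
}

/* An 8-bit field holds a full byte mask without loss. */
static unsigned store_mask(unsigned mask)
{
    struct rs_packed rs = {0};

    assert(mask <= 0xff);
    rs.clip_plane_enable = mask;
    return rs.clip_plane_enable;
}
```

This only matters where the assigned expression is not already 0 or 1; assignments from other 1-bit fields or from `bool` values are unaffected.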



[Mesa-dev] [PATCH 08/13] radeonsi: pack struct si_vertex_elements better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.h | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index 99c8ee6..77fa467 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -93,32 +93,33 @@ struct si_state_dsa {
 };
 
 struct si_stencil_ref {
struct r600_atomatom;
struct pipe_stencil_ref state;
struct si_dsa_stencil_ref_part  dsa_part;
 };
 
 struct si_vertex_elements
 {
-   unsignedcount;
-   unsignedfirst_vb_use_mask;
-   /* Vertex buffer descriptor list size aligned for optimal prefetch. */
-   unsigneddesc_list_byte_size;
-
-   uint8_t fix_fetch[SI_MAX_ATTRIBS];
+   uint32_tinstance_divisors[SI_MAX_ATTRIBS];
uint32_trsrc_word3[SI_MAX_ATTRIBS];
-   uint32_tformat_size[SI_MAX_ATTRIBS];
-   uint8_t vertex_buffer_index[SI_MAX_ATTRIBS];
uint16_tsrc_offset[SI_MAX_ATTRIBS];
-   unsignedinstance_divisors[SI_MAX_ATTRIBS];
+   uint8_t fix_fetch[SI_MAX_ATTRIBS];
+   uint8_t format_size[SI_MAX_ATTRIBS];
+   uint8_t vertex_buffer_index[SI_MAX_ATTRIBS];
+
+   uint8_t count;
booluses_instance_divisors;
+
+   uint16_tfirst_vb_use_mask;
+   /* Vertex buffer descriptor list size aligned for optimal prefetch. */
+   uint16_tdesc_list_byte_size;
 };
 
 union si_state {
struct {
struct si_state_blend   *blend;
struct si_state_rasterizer  *rasterizer;
struct si_state_dsa *dsa;
struct si_pm4_state *poly_offset;
struct si_pm4_state *ls;
struct si_pm4_state *hs;
-- 
2.7.4
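
The reordering in this patch follows the usual rule of placing members in descending order of alignment so the compiler inserts less padding. A minimal sketch with hypothetical structs; the exact sizes depend on the ABI, so the sketch only measures padding instead of asserting fixed numbers for the loose layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow and wide members interleaved: padding appears between them. */
struct loose {
    uint8_t count;
    uint32_t divisor;
    uint8_t index;
    uint16_t offset;
};

/* Same members, widest first: they pack back to back. */
struct tight {
    uint32_t divisor;
    uint16_t offset;
    uint8_t count;
    uint8_t index;
};

/* Sum of the member sizes, identical for both layouts. */
#define PAYLOAD_BYTES (sizeof(uint32_t) + sizeof(uint16_t) + 2 * sizeof(uint8_t))

/* Bytes the compiler added to struct loose for alignment. */
static size_t padding_loose(void)
{
    assert(sizeof(struct loose) >= PAYLOAD_BYTES);
    return sizeof(struct loose) - PAYLOAD_BYTES;
}

/* Bytes the compiler added to struct tight for alignment. */
static size_t padding_tight(void)
{
    assert(sizeof(struct tight) >= PAYLOAD_BYTES);
    return sizeof(struct tight) - PAYLOAD_BYTES;
}
```

On common ABIs the tight layout has no padding at all, while the loose one typically carries several bytes of it.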



[Mesa-dev] [PATCH 11/13] radeonsi: pack si_sampler_view better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 46b095d..55fda4d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -121,22 +121,22 @@ struct si_blend_color {
struct pipe_blend_color state;
 };
 
 struct si_sampler_view {
struct pipe_sampler_viewbase;
 /* [0..7] = image descriptor
  * [4..7] = buffer descriptor */
uint32_tstate[8];
uint32_tfmask_state[8];
const struct legacy_surf_level  *base_level_info;
-   unsignedbase_level;
-   unsignedblock_width;
+   ubyte   base_level;
+   ubyte   block_width;
bool is_stencil_sampler;
bool dcc_incompatible;
 };
 
 #define SI_SAMPLER_STATE_MAGIC 0x34f1c35a
 
 struct si_sampler_state {
 #ifdef DEBUG
unsignedmagic;
 #endif
-- 
2.7.4



[Mesa-dev] [PATCH 03/13] radeonsi: remove 8 bytes from si_shader_key with uint32_t ff_tcs_inputs_to_copy

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

The previous patch helps with this.
---
 src/gallium/drivers/radeonsi/si_shader.c| 8 ++--
 src/gallium/drivers/radeonsi/si_shader.h| 3 ++-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 8 ++--
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 4ee4a64..e525a18 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2486,21 +2486,22 @@ static void si_copy_tcs_inputs(struct 
lp_build_tgsi_context *bld_base)
invocation_id = unpack_param(ctx, ctx->param_tcs_rel_ids, 8, 5);
buffer = desc_from_addr_base64k(ctx, 
ctx->param_tcs_offchip_addr_base64k);
buffer_offset = LLVMGetParam(ctx->main_fn, 
ctx->param_tcs_offchip_offset);
 
lds_vertex_stride = unpack_param(ctx, ctx->param_vs_state_bits, 24, 8);
lds_vertex_offset = LLVMBuildMul(gallivm->builder, invocation_id,
 lds_vertex_stride, "");
lds_base = get_tcs_in_current_patch_offset(ctx);
lds_base = LLVMBuildAdd(gallivm->builder, lds_base, lds_vertex_offset, 
"");
 
-   inputs = ctx->shader->key.mono.u.ff_tcs_inputs_to_copy;
+   inputs = ctx->shader->key.mono.u.ff_tcs_inputs_to_copy[0] |
+((uint64_t)ctx->shader->key.mono.u.ff_tcs_inputs_to_copy[1] << 
32);
while (inputs) {
unsigned i = u_bit_scan64(&inputs);
 
LLVMValueRef lds_ptr = LLVMBuildAdd(gallivm->builder, lds_base,
LLVMConstInt(ctx->i32, 4 * i, 0),
 "");
 
LLVMValueRef buffer_addr = get_tcs_tes_buffer_address(ctx,
  get_rel_patch_id(ctx),
  invocation_id,
@@ -5277,21 +5278,24 @@ static void si_dump_shader_key(unsigned processor, const struct si_shader *shade
fprintf(f, "  mono.u.vs_export_prim_id = %u\n",
key->mono.u.vs_export_prim_id);
break;
 
case PIPE_SHADER_TESS_CTRL:
if (shader->selector->screen->b.chip_class >= GFX9) {
si_dump_shader_key_vs(key, &key->part.tcs.ls_prolog,
  "part.tcs.ls_prolog", f);
}
fprintf(f, "  part.tcs.epilog.prim_mode = %u\n", key->part.tcs.epilog.prim_mode);
-   fprintf(f, "  mono.u.ff_tcs_inputs_to_copy = 0x%"PRIx64"\n", key->mono.u.ff_tcs_inputs_to_copy);
+   fprintf(f, "  mono.u.ff_tcs_inputs_to_copy[0] = 0x%x\n",
+   key->mono.u.ff_tcs_inputs_to_copy[0]);
+   fprintf(f, "  mono.u.ff_tcs_inputs_to_copy[1] = 0x%x\n",
+   key->mono.u.ff_tcs_inputs_to_copy[1]);
break;
 
case PIPE_SHADER_TESS_EVAL:
fprintf(f, "  as_es = %u\n", key->as_es);
fprintf(f, "  mono.u.vs_export_prim_id = %u\n",
key->mono.u.vs_export_prim_id);
break;
 
case PIPE_SHADER_GEOMETRY:
if (shader->is_gs_copy_shader)
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 76e09b2..ed1df2b 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -485,21 +485,22 @@ struct si_shader_key {
 */
unsigned as_es:1; /* export shader, which precedes GS */
unsigned as_ls:1; /* local shader, which precedes TCS */
 
/* Flags for monolithic compilation only. */
struct {
/* One byte for every input: SI_FIX_FETCH_* enums. */
uint8_t vs_fix_fetch[SI_MAX_ATTRIBS];
 
union {
-   uint64_t ff_tcs_inputs_to_copy; /* for fixed-func TCS */
+   /* Don't use "uint64_t" in order to get 32-bit alignment. */
+   uint32_t ff_tcs_inputs_to_copy[2]; /* for fixed-func TCS */
/* When PS needs PrimID and GS is disabled. */
unsigned vs_export_prim_id:1;
} u;
} mono;
 
/* Optimization flags for asynchronous compilation only. */
struct {
/* For HW VS (it can be VS, TES, GS) */
/* Don't use "uint64_t" in order to get 32-bit alignment. */
uint32_t kill_outputs[2]; /* "get_unique_index" bits */
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 15e46b5..6247b9c 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1276,22 +1276,26 @@ static inline void si_shader_selector_key(struct 

[Mesa-dev] [PATCH 09/13] radeonsi: pack struct si_descriptors better

2017-06-10 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_state.h | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index 77fa467..b616757 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -213,56 +213,56 @@ enum {
 SI_NUM_SHADERS * SI_NUM_SHADER_DESCS)
 
 /* This represents descriptors in memory, such as buffer resources,
  * image resources, and sampler states.
  */
 struct si_descriptors {
/* The list of descriptors in malloc'd memory. */
uint32_t *list;
/* The list in mapped GPU memory. */
uint32_t *gpu_list;
-   /* The size of one descriptor. */
-   unsigned element_dw_size;
-   /* The maximum number of descriptors. */
-   unsigned num_elements;
+   /* Slots that have been changed and need to be uploaded. */
+   uint64_t dirty_mask;
 
/* The buffer where the descriptors have been uploaded. */
struct r600_resource *buffer;
int buffer_offset; /* can be negative if not using lower slots */
 
+   /* The size of one descriptor. */
+   ubyte element_dw_size;
+   /* The maximum number of descriptors. */
+   ubyte num_elements;
+
/* Offset in CE RAM */
-   unsigned ce_offset;
+   uint16_t ce_offset;
 
/* Slots allocated in CE RAM. If we get active slots outside of this
 * range, direct uploads to memory will be used instead. This basically
 * governs switching between onchip (CE) and offchip (upload) modes.
 */
-   unsigned first_ce_slot;
-   unsigned num_ce_slots;
+   ubyte first_ce_slot;
+   ubyte num_ce_slots;
 
/* Slots that are used by currently-bound shaders.
 * With CE: It determines which slots are dumped to L2.
 *  It doesn't skip uploads to CE RAM.
 * Without CE: It determines which slots are uploaded.
 */
-   unsigned first_active_slot;
-   unsigned num_active_slots;
-
-   /* Slots that have been changed and need to be uploaded. */
-   uint64_t dirty_mask;
+   ubyte first_active_slot;
+   ubyte num_active_slots;
 
/* Whether CE is used to upload this descriptor array. */
bool uses_ce;
 
-   /* The shader userdata offset within a shader where the 64-bit pointer to the descriptor
-* array will be stored. */
-   unsigned shader_userdata_offset;
+   /* The SGPR index where the 64-bit pointer to the descriptor array will
+* be stored. */
+   ubyte shader_userdata_offset;
 };
 
 struct si_sampler_views {
struct pipe_sampler_view *views[SI_NUM_SAMPLERS];
struct si_sampler_state *sampler_states[SI_NUM_SAMPLERS];
 
/* The i-th bit is set if that element is enabled (non-NULL resource). */
unsigned enabled_mask;
 };
 
-- 
2.7.4
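
A rough sketch of the packing idea in this patch, assuming two hypothetical structs (desc_loose, desc_packed) that only loosely mirror si_descriptors. It shows how narrowing small counters and placing the widest members first can shrink a struct. The checked setter is an added assumption, included because narrowing is only safe when the stored values fit the smaller type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: 32-bit counters, with the 64-bit mask
 * placed after them and a lone bool that leaves padding behind. */
struct desc_loose {
    uint32_t *list;
    unsigned element_dw_size;
    unsigned num_elements;
    unsigned first_slot;
    unsigned num_slots;
    uint64_t dirty_mask;
    bool uses_ce;
    unsigned userdata_offset;
};

/* Hypothetical "after" layout: widest members first, counters narrowed
 * to one byte each and grouped so they share a single word. */
struct desc_packed {
    uint32_t *list;
    uint64_t dirty_mask;
    uint8_t element_dw_size;
    uint8_t num_elements;
    uint8_t first_slot;
    uint8_t num_slots;
    uint8_t userdata_offset;
    bool uses_ce;
};

static_assert(sizeof(struct desc_packed) < sizeof(struct desc_loose),
              "packing should shrink the struct on common ABIs");

/* Reject values that no longer fit after narrowing to 8 bits. */
static inline bool desc_set_num_elements(struct desc_packed *d, unsigned n)
{
    if (n > UINT8_MAX)
        return false;
    d->num_elements = (uint8_t)n;
    return true;
}
```

Whether one byte is enough for each field depends on limits in the driver that this sketch does not model.
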



[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

--- Comment #7 from Grazvydas Ignotas  ---
I've sent a different version to the ML; testing that one would be preferred:
https://lists.freedesktop.org/archives/mesa-dev/2017-June/158700.html

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 2/2] radv: don't even attempt to prefetch on SI

2017-06-10 Thread Grazvydas Ignotas
Before bcae327469 this was emitting a CP DMA packet even on SI, which
apparently didn't cause too many problems. Since that commit, the
CP DMA code always sets the CIK+-only bit for prefetch. Follow
radeonsi here and don't try to prefetch on SI at all.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Fixes: bcae327469 "radv: realign cp dma code with radeonsi"
Signed-off-by: Grazvydas Ignotas 
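
The gating pattern described above can be sketched in isolation as follows. The enum, struct, and function names are hypothetical stand-ins (the real code uses radv_cmd_buffer and si_cp_dma_prefetch), and the "packet emission" here only counts calls so the behaviour is observable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generation enum, ordered oldest to newest so that a
 * simple >= comparison selects "this generation or later". */
enum chip_gen { GEN_SI, GEN_CIK, GEN_VI };

struct cmd_stream {
    enum chip_gen gen;
    unsigned prefetch_count; /* number of prefetch packets "emitted" */
};

/* Stand-in for the real packet emission; it only counts calls. */
static void emit_prefetch_packet(struct cmd_stream *cs, uint64_t va,
                                 unsigned size)
{
    (void)va;
    (void)size;
    cs->prefetch_count++;
}

/* Single choke point: callers never need to know which generations
 * support prefetch. */
static inline void emit_prefetch(struct cmd_stream *cs, uint64_t va,
                                 unsigned size)
{
    assert(cs);
    if (cs->gen >= GEN_CIK)
        emit_prefetch_packet(cs, va, size);
}
```

Routing every call site through one wrapper keeps the generation check in a single place, which is the shape the diff below takes.
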
---
 src/amd/vulkan/radv_cmd_buffer.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 1ac9de1..b08f218 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -529,10 +529,18 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
 
radeon_set_context_reg(cmd_buffer->cs, R_028814_PA_SU_SC_MODE_CNTL,
   raster->pa_su_sc_mode_cntl);
 }
 
+static inline void
+radv_emit_prefetch(struct radv_cmd_buffer *cmd_buffer, uint64_t va,
+  unsigned size)
+{
+   if (cmd_buffer->device->physical_device->rad_info.chip_class >= CIK)
+   si_cp_dma_prefetch(cmd_buffer, va, size);
+}
+
 static void
 radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radv_pipeline *pipeline,
struct radv_shader_variant *shader,
struct ac_vs_output_info *outinfo)
@@ -540,11 +548,11 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
unsigned export_count;
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
 
export_count = MAX2(1, outinfo->param_exports);
radeon_set_context_reg(cmd_buffer->cs, R_0286C4_SPI_VS_OUT_CONFIG,
   S_0286C4_VS_EXPORT_COUNT(export_count - 1));
 
@@ -589,11 +597,11 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer,
 {
struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
 
radeon_set_context_reg(cmd_buffer->cs, R_028AAC_VGT_ESGS_RING_ITEMSIZE,
   outinfo->esgs_itemsize / 4);
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B320_SPI_SHADER_PGM_LO_ES, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
@@ -609,11 +617,11 @@ radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer,
struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
uint32_t rsrc2 = shader->rsrc2;
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
 
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B520_SPI_SHADER_PGM_LO_LS, 2);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
 
@@ -633,11 +641,11 @@ radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer,
 {
struct radeon_winsys *ws = cmd_buffer->device->ws;
uint64_t va = ws->buffer_get_va(shader->bo);
 
ws->cs_add_buffer(cmd_buffer->cs, shader->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, shader->code_size);
+   radv_emit_prefetch(cmd_buffer, va, shader->code_size);
 
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B420_SPI_SHADER_PGM_LO_HS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
radeon_emit(cmd_buffer->cs, shader->rsrc1);
@@ -767,11 +775,12 @@ radv_emit_geometry_shader(struct radv_cmd_buffer *cmd_buffer,
   S_028B90_CNT(MIN2(gs_num_invocations, 127)) |
   S_028B90_ENABLE(gs_num_invocations > 0));
 
va = ws->buffer_get_va(gs->bo);
ws->cs_add_buffer(cmd_buffer->cs, gs->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, gs->code_size);
+   radv_emit_prefetch(cmd_buffer, va, gs->code_size);
+
radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B220_SPI_SHADER_PGM_LO_GS, 4);
radeon_emit(cmd_buffer->cs, va >> 8);
radeon_emit(cmd_buffer->cs, va >> 40);
radeon_emit(cmd_buffer->cs, gs->rsrc1);
radeon_emit(cmd_buffer->cs, gs->rsrc2);
@@ -808,11 +817,11 @@ radv_emit_fragment_shader(struct radv_cmd_buffer *cmd_buffer,
 
ps = pipeline->shaders[MESA_SHADER_FRAGMENT];
 
va = ws->buffer_get_va(ps->bo);
ws->cs_add_buffer(cmd_buffer->cs, ps->bo, 8);
-   si_cp_dma_prefetch(cmd_buffer, va, ps->code_size);
+   radv_emit_prefetch(cmd_buffer, va, ps->code_size);
 

[Mesa-dev] [PATCH 1/2] radv: assert on CP_DMA_USE_L2 for SI

2017-06-10 Thread Grazvydas Ignotas
The register header (and a radeonsi comment) state that V_411_SRC_ADDR_TC_L2
is for CIK+ only, so assert that CP_DMA_USE_L2 is not set on earlier ASICs.

Signed-off-by: Grazvydas Ignotas 
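
As an illustration only, the precondition can be written as a small predicate that the emit path asserts on. The names below (gfx_level, DMA_FLAG_USE_L2, emit_dma_header) are hypothetical and do not correspond to the real radv definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; ordered oldest to newest so comparisons work. */
enum gfx_level { GFX_SI, GFX_CIK };

#define DMA_FLAG_SYNC   (1u << 0) /* allowed on every generation */
#define DMA_FLAG_USE_L2 (1u << 1) /* only meaningful on CIK and newer */

/* True when the flag combination is legal for the given generation. */
static inline bool dma_flags_valid(enum gfx_level level, unsigned flags)
{
    return level >= GFX_CIK || !(flags & DMA_FLAG_USE_L2);
}

/* Stand-in for packet emission: returns the flags it would encode. */
static unsigned emit_dma_header(enum gfx_level level, unsigned flags)
{
    /* Catch misuse in debug builds instead of emitting a bad packet. */
    assert(dma_flags_valid(level, flags));
    return flags;
}
```

Because assert() compiles away under NDEBUG, this guards against programming errors during development and is not a runtime fallback.
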
---
 src/amd/vulkan/si_cmd_buffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 33414c1..962b76f 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -1191,10 +1191,11 @@ static void si_emit_cp_dma(struct radv_cmd_buffer *cmd_buffer,
radeon_emit(cs, src_va >> 32);  /* SRC_ADDR_HI [31:0] */
radeon_emit(cs, dst_va);  /* DST_ADDR_LO [31:0] */
radeon_emit(cs, dst_va >> 32);  /* DST_ADDR_HI [31:0] */
radeon_emit(cs, command);
} else {
+   assert(!(flags & CP_DMA_USE_L2));
header |= S_411_SRC_ADDR_HI(src_va >> 32);
radeon_emit(cs, PKT3(PKT3_CP_DMA, 4, 0));
radeon_emit(cs, src_va);  /* SRC_ADDR_LO [31:0] */
radeon_emit(cs, header);  /* SRC_ADDR_HI [15:0] + flags. */
radeon_emit(cs, dst_va);  /* DST_ADDR_LO [31:0] */
-- 
2.7.4



Re: [Mesa-dev] [PATCH] gallium: fixed modulo zero crashes in tgsi interpreter (v2)

2017-06-10 Thread Roland Scheidegger
Pushed, thanks!

Roland

Am 09.06.2017 um 15:39 schrieb Marius Gräfe:
> softpipe throws integer division by zero exceptions on Windows
> when using % with integers in a geometry shader.
> 
> v2: Made error results consistent with existing div/mod zero handling in
> tgsi. 64-bit signed integer division by zero returns zero like in
> micro_idiv, unsigned returns ~0u like in micro_udiv.
> Modulo operations always set all result bits to one (like in
> micro_umod).
> ---
>  src/gallium/auxiliary/tgsi/tgsi_exec.c | 40 +-
>  1 file changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> index c41954c..97c75e9 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> @@ -846,40 +846,40 @@ static void
>  micro_u64div(union tgsi_double_channel *dst,
>   const union tgsi_double_channel *src)
>  {
> -   dst->u64[0] = src[0].u64[0] / src[1].u64[0];
> -   dst->u64[1] = src[0].u64[1] / src[1].u64[1];
> -   dst->u64[2] = src[0].u64[2] / src[1].u64[2];
> -   dst->u64[3] = src[0].u64[3] / src[1].u64[3];
> +   dst->u64[0] = src[1].u64[0] ? src[0].u64[0] / src[1].u64[0] : ~0ull;
> +   dst->u64[1] = src[1].u64[1] ? src[0].u64[1] / src[1].u64[1] : ~0ull;
> +   dst->u64[2] = src[1].u64[2] ? src[0].u64[2] / src[1].u64[2] : ~0ull;
> +   dst->u64[3] = src[1].u64[3] ? src[0].u64[3] / src[1].u64[3] : ~0ull;
>  }
>  
>  static void
>  micro_i64div(union tgsi_double_channel *dst,
>   const union tgsi_double_channel *src)
>  {
> -   dst->i64[0] = src[0].i64[0] / src[1].i64[0];
> -   dst->i64[1] = src[0].i64[1] / src[1].i64[1];
> -   dst->i64[2] = src[0].i64[2] / src[1].i64[2];
> -   dst->i64[3] = src[0].i64[3] / src[1].i64[3];
> +   dst->i64[0] = src[1].i64[0] ? src[0].i64[0] / src[1].i64[0] : 0;
> +   dst->i64[1] = src[1].i64[1] ? src[0].i64[1] / src[1].i64[1] : 0;
> +   dst->i64[2] = src[1].i64[2] ? src[0].i64[2] / src[1].i64[2] : 0;
> +   dst->i64[3] = src[1].i64[3] ? src[0].i64[3] / src[1].i64[3] : 0;
>  }
>  
>  static void
>  micro_u64mod(union tgsi_double_channel *dst,
>   const union tgsi_double_channel *src)
>  {
> -   dst->u64[0] = src[0].u64[0] % src[1].u64[0];
> -   dst->u64[1] = src[0].u64[1] % src[1].u64[1];
> -   dst->u64[2] = src[0].u64[2] % src[1].u64[2];
> -   dst->u64[3] = src[0].u64[3] % src[1].u64[3];
> +   dst->u64[0] = src[1].u64[0] ? src[0].u64[0] % src[1].u64[0] : ~0ull;
> +   dst->u64[1] = src[1].u64[1] ? src[0].u64[1] % src[1].u64[1] : ~0ull;
> +   dst->u64[2] = src[1].u64[2] ? src[0].u64[2] % src[1].u64[2] : ~0ull;
> +   dst->u64[3] = src[1].u64[3] ? src[0].u64[3] % src[1].u64[3] : ~0ull;
>  }
>  
>  static void
>  micro_i64mod(union tgsi_double_channel *dst,
>   const union tgsi_double_channel *src)
>  {
> -   dst->i64[0] = src[0].i64[0] % src[1].i64[0];
> -   dst->i64[1] = src[0].i64[1] % src[1].i64[1];
> -   dst->i64[2] = src[0].i64[2] % src[1].i64[2];
> -   dst->i64[3] = src[0].i64[3] % src[1].i64[3];
> +   dst->i64[0] = src[1].i64[0] ? src[0].i64[0] % src[1].i64[0] : ~0ll;
> +   dst->i64[1] = src[1].i64[1] ? src[0].i64[1] % src[1].i64[1] : ~0ll;
> +   dst->i64[2] = src[1].i64[2] ? src[0].i64[2] % src[1].i64[2] : ~0ll;
> +   dst->i64[3] = src[1].i64[3] ? src[0].i64[3] % src[1].i64[3] : ~0ll;
>  }
>  
>  static void
> @@ -4653,10 +4653,10 @@ micro_mod(union tgsi_exec_channel *dst,
>const union tgsi_exec_channel *src0,
>const union tgsi_exec_channel *src1)
>  {
> -   dst->i[0] = src0->i[0] % src1->i[0];
> -   dst->i[1] = src0->i[1] % src1->i[1];
> -   dst->i[2] = src0->i[2] % src1->i[2];
> -   dst->i[3] = src0->i[3] % src1->i[3];
> +   dst->i[0] = src1->i[0] ? src0->i[0] % src1->i[0] : ~0;
> +   dst->i[1] = src1->i[1] ? src0->i[1] % src1->i[1] : ~0;
> +   dst->i[2] = src1->i[2] ? src0->i[2] % src1->i[2] : ~0;
> +   dst->i[3] = src1->i[3] ? src0->i[3] % src1->i[3] : ~0;
>  }
>  
>  static void
> 
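
For reference, the per-channel rule in the quoted patch can be condensed into scalar helpers. This is a sketch with hypothetical function names, not code from tgsi_exec.c; the fallback values are the ones the v2 notes describe.

```c
#include <assert.h>
#include <stdint.h>

/* The all-ones signed result below is -1 only on two's complement. */
static_assert(~0ll == -1, "assumes two's complement integers");

/* Unsigned division: a zero divisor yields all bits set. */
static inline uint64_t safe_u64div(uint64_t a, uint64_t b)
{
    return b ? a / b : ~0ull;
}

/* Signed division: a zero divisor yields zero. */
static inline int64_t safe_i64div(int64_t a, int64_t b)
{
    return b ? a / b : 0;
}

/* Unsigned modulo: a zero divisor yields all bits set. */
static inline uint64_t safe_u64mod(uint64_t a, uint64_t b)
{
    return b ? a % b : ~0ull;
}

/* Signed modulo: a zero divisor yields all bits set, i.e. -1. */
static inline int64_t safe_i64mod(int64_t a, int64_t b)
{
    return b ? a % b : ~0ll;
}
```

Like the patch, these helpers guard only against a zero divisor; INT64_MIN divided by -1 still overflows in C and is not covered here.
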



[Mesa-dev] [Bug 101334] Any vulkan app seems to freeze the system

2017-06-10 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101334

Grazvydas Ignotas  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nota...@gmail.com

--- Comment #6 from Grazvydas Ignotas  ---
Created attachment 131842
  --> https://bugs.freedesktop.org/attachment.cgi?id=131842&action=edit
no_si_prefetch

Does this patch help?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.