On 16.02.2016 11:39, Marek Olšák wrote:
On Tue, Feb 16, 2016 at 5:01 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
On 15.02.2016 18:59, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>
---
src/gallium/drivers/radeonsi/si_pipe.c | 1 +
src/gallium/drivers/radeonsi/si_pipe.h | 3 ++
src/gallium/drivers/radeonsi/si_shader.c | 53 ++++++++++++++++++++++++--------
src/gallium/drivers/radeonsi/si_shader.h | 2 +-
4 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index fa60732..448fe88 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -600,6 +600,7 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws)
sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
+ sscreen->use_monolithic_shaders = true;
if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS |
DBG_PS | DBG_CS;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index b5790d6..2a2455c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -84,6 +84,9 @@ struct si_compute;
struct si_screen {
struct r600_common_screen b;
unsigned gs_table_depth;
+
+ /* Whether shaders are monolithic (1-part) or separate (3-part). */
+ bool use_monolithic_shaders;
};
struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index b058019..b74ed1e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -70,6 +70,12 @@ struct si_shader_context
unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
bool is_gs_copy_shader;
+
+ /* Whether to generate the optimized shader variant compiled as a whole
+ * (without a prolog and epilog)
+ */
+ bool is_monolithic;
+
int param_streamout_config;
int param_streamout_write_index;
int param_streamout_offset[4];
@@ -3657,8 +3663,10 @@ static void create_function(struct si_shader_context *ctx)
struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct si_shader *shader = ctx->shader;
- LLVMTypeRef params[SI_NUM_PARAMS], v2i32, v3i32;
+ LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v2i32, v3i32;
+ LLVMTypeRef returns[16+32*4];
This is a bit of a magic number; I guess it is something like max parameters
plus attributes. Can you replace it with the appropriate defines?
There is not a single definition that would express this clearly.
The prolog has to return up to 16 input SGPRs and 4-20 input VGPRs.
Additionally, the prolog returns other data in VGPRs. That's up to
4+16 VGPRs (16 vertex load addresses) for the VS and 20+8 VGPRs (2
vec4 colors) for the PS. The PS epilog returns one SGPR (but in s10 or
so, so we need to allocate 11) and 9*4 VGPRs at most. This all can
change in the future, who knows.
16+32*4 is much more than we'll ever need, but at least it shouldn't
overflow. Assertions also check that we don't overflow.
Hmm, I see. I guess I can live with it, as well as with the casts in
patch 14.
Nicolai
Marek
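
The register budget discussed above can be sketched as a small C check. This is only an illustration of the arithmetic in Marek's reply, not code from the patch: all enum and function names below are hypothetical, and the per-stage totals assume that "4+16" and "20+8" already include the 4-20 input VGPRs mentioned first.

```c
#include <assert.h>

/* Capacity of the returns[] array as declared in the patch. */
enum { RETURNS_CAPACITY = 16 + 32 * 4 };

/* Hypothetical names; the counts come from the explanation in the thread. */
enum {
    PROLOG_SGPRS    = 16,     /* up to 16 input SGPRs */
    VS_PROLOG_VGPRS = 4 + 16, /* 4 inputs + 16 vertex load addresses */
    PS_PROLOG_VGPRS = 20 + 8, /* 20 inputs + 2 vec4 colors */
    PS_EPILOG_SGPRS = 11,     /* one SGPR returned in "s10 or so": s0..s10 */
    PS_EPILOG_VGPRS = 9 * 4   /* at most 9 vec4 outputs */
};

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Largest number of return values any of the three shader parts needs. */
static int worst_case_returns(void)
{
    int vs_prolog = PROLOG_SGPRS + VS_PROLOG_VGPRS;
    int ps_prolog = PROLOG_SGPRS + PS_PROLOG_VGPRS;
    int ps_epilog = PS_EPILOG_SGPRS + PS_EPILOG_VGPRS;

    return max_int(vs_prolog, max_int(ps_prolog, ps_epilog));
}

/* Mirrors the idea of the overflow assertions mentioned in the thread. */
static void check_return_budget(void)
{
    assert(worst_case_returns() <= RETURNS_CAPACITY);
}
```

Calling check_return_budget() from a test confirms that, under these assumptions, every part stays well below the array size, which matches the claim that 16+32*4 is much more than needed.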