[Mesa-dev] [PATCH] nv50/ir: fix quadop emission in the presence of predication

2016-02-15 Thread Ilia Mirkin
When there's a predicate, it just goes onto the sources list. If the
quadop only has a single regular source, we will end up thinking that
the predicate is the second source. Check explicitly for the predSrc so
that we don't accidentally emit the wrong thing.

This fixes a bunch of dEQP-GLES3.functional.shaders.derivate.* tests.

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 5 ++++-
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp  | 5 +++--
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index 5e6b436..36b851f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -1248,7 +1248,7 @@ CodeEmitterGK110::emitQUADOP(const Instruction *i, 
uint8_t qOp, uint8_t laneMask
 
defId(i->def(0), 2);
srcId(i->src(0), 10);
-   srcId(i->srcExists(1) ? i->src(1) : i->src(0), 23);
+   srcId((i->srcExists(1) && i->predSrc != 1) ? i->src(1) : i->src(0), 23);
 
if (i->op == OP_QUADOP && progType != Program::TYPE_FRAGMENT)
   code[1] |= 1 << 9; // dall
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index be37792..06e477d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -1535,7 +1535,10 @@ CodeEmitterGM107::emitFSWZADD()
emitRND  (0x27);
emitField(0x26, 1, insn->lanes); /* abused for .ndv */
emitField(0x1c, 8, insn->subOp);
-   emitGPR  (0x14, insn->src(1));
+   if (insn->predSrc != 1)
+  emitGPR  (0x14, insn->src(1));
+   else
+  emitGPR  (0x14);
emitGPR  (0x08, insn->src(0));
emitGPR  (0x00, insn->def(0));
 }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
index bc8354d..682a19d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
@@ -527,7 +527,8 @@ CodeEmitterNV50::emitForm_ADD(const Instruction *i)
 
setSrcFileBits(i, NV50_OP_ENC_LONG_ALT);
setSrc(i, 0, 0);
-   setSrc(i, 1, 2);
+   if (i->predSrc != 1)
+  setSrc(i, 1, 2);
 
if (i->getIndirect(0, 0)) {
   assert(!i->getIndirect(1, 0));
@@ -840,7 +841,7 @@ CodeEmitterNV50::emitQUADOP(const Instruction *i, uint8_t 
lane, uint8_t quOp)
 
emitForm_ADD(i);
 
-   if (!i->srcExists(1))
+   if (!i->srcExists(1) || i->predSrc == 1)
   srcId(i->src(0), 32 + 14);
 }
 
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index 8637db9..650044d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -1334,7 +1334,7 @@ CodeEmitterNVC0::emitQUADOP(const Instruction *i, uint8_t 
qOp, uint8_t laneMask)
 
defId(i->def(0), 14);
srcId(i->src(0), 20);
-   srcId(i->srcExists(1) ? i->src(1) : i->src(0), 26);
+   srcId((i->srcExists(1) && i->predSrc != 1) ? i->src(1) : i->src(0), 26);
 
if (i->op == OP_QUADOP && progType != Program::TYPE_FRAGMENT)
   code[0] |= 1 << 9; // dall
-- 
2.4.10

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
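
For readers following along, the check the patch adds can be sketched outside the driver. This is only an illustration: Insn and secondQuadopSrc are hypothetical, simplified stand-ins rather than the real nv50_ir types, and the only behaviour assumed is what the commit message describes, namely that the predicate sits in the same list as the regular sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of an instruction: the predicate, when
// present, is stored in the same list as the regular sources.
struct Insn {
    std::vector<int> srcs; // register ids, predicate included
    int predSrc = -1;      // index of the predicate in srcs, or -1

    bool srcExists(int s) const {
        return s >= 0 && s < static_cast<int>(srcs.size());
    }
};

// Register id to encode as the second quadop operand. Falls back to
// source 0 when slot 1 is missing or actually holds the predicate.
int secondQuadopSrc(const Insn &i)
{
    assert(i.srcExists(0));
    return (i.srcExists(1) && i.predSrc != 1) ? i.srcs[1] : i.srcs[0];
}
```

With one regular source plus a predicate in slot 1, the function returns source 0 again instead of the predicate register, which is the case the patch is guarding against.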


Re: [Mesa-dev] [PATCH] glsl: set user defined varyings to smooth by default in ES

2016-02-15 Thread Iago Toral
On Tue, 2016-02-16 at 11:03 +1100, Timothy Arceri wrote:
> This is usually handled by the backends in order to handle the
> various interactions with the gl_*Color built-ins.
> 
> The problem is this means linking will fail if one side of the
> interface adds the smooth qualifier to the varying and the other
> side just uses the default, even though they match.
> 
> This fixes various dEQP tests. The spec is not clear what to do for
> desktop GL, so leave it as is for now.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index b639378..4203cd5 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -2750,6 +2750,17 @@ interpret_interpolation_qualifier(const struct 
> ast_type_qualifier *qual,
>"vertex shader inputs or fragment shader outputs",
>interpolation_string(interpolation));
>}
> +   } else if (state->es_shader &&
> +  ((mode == ir_var_shader_in &&
> +state->stage != MESA_SHADER_VERTEX) ||
> +   (mode == ir_var_shader_out &&
> +state->stage != MESA_SHADER_FRAGMENT))) {
> +  /* From Section 4.3.9 (Interpolation) of the GLSL ES spec:
> +   *
> +   *" When no interpolation qualifier is present, smooth interpolation
> +   *is used."
> +   */
> +  interpolation = INTERP_QUALIFIER_SMOOTH;
> }
>  
> return interpolation;

Reviewed-by: Iago Toral Quiroga 
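
A rough sketch of the rule the patch applies, for illustration only. The enums and resolveInterp() below are hypothetical stand-ins for the compiler's own types, and the sketch assumes the new branch is reached only when no qualifier was written.

```cpp
// Hypothetical stand-ins for the compiler's enums.
enum Stage { VERTEX, FRAGMENT };
enum Mode { SHADER_IN, SHADER_OUT };
enum Interp { INTERP_NONE, INTERP_SMOOTH, INTERP_FLAT };

// Interpolation recorded for a variable declared with `declared`.
Interp resolveInterp(Interp declared, bool es_shader, Mode mode, Stage stage)
{
    if (declared != INTERP_NONE)
        return declared; // an explicit qualifier always wins

    // Vertex inputs and fragment outputs are not interpolated.
    bool is_varying = (mode == SHADER_IN && stage != VERTEX) ||
                      (mode == SHADER_OUT && stage != FRAGMENT);

    // In ES, an unqualified varying is treated as smooth.
    return (es_shader && is_varying) ? INTERP_SMOOTH : INTERP_NONE;
}
```

Under these assumptions an ES vertex output without a qualifier and a fragment input declared smooth both end up as INTERP_SMOOTH, so a comparison of the two sides no longer sees a mismatch.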




Re: [Mesa-dev] [PATCH] mesa: Fix test for big-endian architecture in compiler.h

2016-02-15 Thread Jonathan Gray
On Fri, Feb 12, 2016 at 10:01:21AM +0100, Jochen Rollwagen wrote:
> Hi,
> 
> I think I found & fixed a bug in mesa concerning tests for big-endian
> machines. The defines tested don't exist or are wrongly defined so the test
> (probably) never fires. The gcc defines on my machine concerning big-endian
> are
> 
> jochen@mac-mini:~/sources/mesa$ gcc -dM -E - < /dev/null | grep BIG
> #define __BIGGEST_ALIGNMENT__ 16
> #define __BIG_ENDIAN__ 1
> #define __FLOAT_WORD_ORDER__ __ORDER_BIG_ENDIAN__
> #define _BIG_ENDIAN 1
> #define __ORDER_BIG_ENDIAN__ 4321
> #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
> 
> The tested values in current mesa are quite different :-)
> 
> The following patch fixes this.

I think you have this backwards.

On OpenBSD/sparc64
$ gcc -dM -E - < /dev/null | grep BIG
$
$ sysctl hw.byteorder
hw.byteorder=4321

endian.h defines BYTE_ORDER and it should be included to test it.

I was under the impression the headers on Linux had similar defines.

Look at how src/gallium/include/pipe/p_config.h does it.
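
As an aside, here is a sketch of why testing a macro that may not exist is fragile, and one defensive alternative. This is a different technique from the endian.h route suggested above and is not what p_config.h does: it assumes a compiler that may or may not provide __BYTE_ORDER__, and SKETCH_BIG_ENDIAN and runtimeIsBigEndian() are made-up names.

```cpp
#include <cstdint>
#include <cstring>

// Compile-time answer when the compiler provides __BYTE_ORDER__ (recent GCC
// and Clang do); otherwise -1, meaning "unknown at compile time". Checking
// defined() first matters: an undefined macro silently evaluates to 0 in #if.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#  define SKETCH_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : 0)
#else
#  define SKETCH_BIG_ENDIAN (-1)
#endif

// Runtime probe: inspect the first byte in memory of a known 32-bit value.
bool runtimeIsBigEndian()
{
    const std::uint32_t probe = 0x01020304u;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 0x01;
}
```

Where the compile-time value is known it should agree with the runtime probe; where it is -1, the build has to get the byte order from a system header instead.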


Re: [Mesa-dev] [PATCH] Handle removal of LLVMAddTargetData in SVN revision 260919

2016-02-15 Thread Michel Dänzer
On 16.02.2016 15:25, Matthew Dawson wrote:
> LLVM removed LLVMAddTargetData for the 3.9 release in r260919.  For the two
> places in mesa where this is called, only enable the lines when compiling
> against LLVM older than 3.9.
> 
> For the radeon driver, I'm not sure how to check if any other LLVM calls need
> to be adjusted.  I think since the target data used is extracted from the
> LLVMModule, it isn't necessary to pass it back to LLVM again.
> 
> The code does compile, and at least for radeonsi does run OpenGL games.

BTW, I recommend getting familiar with piglit so that you can make sure
your changes don't cause any piglit regressions. I did so for this change.


> Signed-off-by: Matthew Dawson 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 ++
>  src/gallium/drivers/radeon/radeon_llvm_util.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> index 96aba73..8c81170 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
> @@ -112,6 +112,7 @@ create_pass_manager(struct gallivm_state *gallivm)
> gallivm->passmgr = 
> LLVMCreateFunctionPassManagerForModule(gallivm->module);
> if (!gallivm->passmgr)
>return FALSE;
> +#if HAVE_LLVM < 0x0309
> /*
>  * TODO: some per module pass manager with IPO passes might be helpful -
>  * the generated texture functions may benefit from inlining if they are

AFAICT this TODO comment isn't related to LLVMAddTargetData.


> @@ -120,6 +121,7 @@ create_pass_manager(struct gallivm_state *gallivm)
>  
> // Old versions of LLVM get the DataLayout from the pass manager.
> LLVMAddTargetData(gallivm->target, gallivm->passmgr);
> +#endif

I pushed your patch with the #if above moved before this comment.


> diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.c 
> b/src/gallium/drivers/radeon/radeon_llvm_util.c
> index 0dfd9ad..ee21437 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_util.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_util.c
> @@ -77,7 +77,9 @@ static void radeon_llvm_optimize(LLVMModuleRef mod)
>   }
>   }
>  
> +#if HAVE_LLVM < 0x0309
>   LLVMAddTargetData(TD, pass_manager);
> +#endif
>   LLVMAddAlwaysInlinerPass(pass_manager);
>   LLVMPassManagerBuilderPopulateModulePassManager(builder, pass_manager);

I also added HAVE_LLVM < 0x0309 guards around the other occurrences of
TD and data_layout, which are unused with this change.
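
For reference, the version test behind these guards can be sketched as plain functions. This assumes HAVE_LLVM packs the major version into the high byte and the minor into the low byte, as the 0x0309 constant in the patch suggests; llvmVersionCode() and needsAddTargetData() are illustrative names, not Mesa helpers.

```cpp
// Hypothetical helper mirroring the HAVE_LLVM encoding used in the patch:
// major version in the high byte, minor in the low byte (3.9 -> 0x0309).
constexpr unsigned llvmVersionCode(unsigned major, unsigned minor)
{
    return (major << 8) | minor;
}

// True when the build still has to hand the target data to the pass manager,
// i.e. when the LLVMAddTargetData() call must stay compiled in.
constexpr bool needsAddTargetData(unsigned have_llvm)
{
    return have_llvm < llvmVersionCode(3, 9);
}
```

In the tree itself this has to remain a preprocessor test, because the guarded call no longer exists in LLVM 3.9 and would not compile.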


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] st/mesa: fix up result_src.type when doing i2u/u2i conversions

2016-02-15 Thread Ilia Mirkin
Even though it's a no-op, it's important to keep track of the type so
that we can pick the properly-signed op later on.

This fixes dEQP-GLES3.functional.shaders.precision.uint.highp_div_fragment,
which ended up using IDIV instead of UDIV.

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index db00fbd..943582d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1979,6 +1979,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir)
case ir_unop_u2i:
   /* Converting between signed and unsigned integers is a no-op. */
   result_src = op[0];
+  result_src.type = result_dst.type;
   break;
case ir_unop_b2i:
   if (native_integers) {
-- 
2.4.10
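
To illustrate why the type has to follow the conversion even though the bits do not change, here is a small sketch. Operand, convertNoOp() and divide() are hypothetical and much simpler than the visitor's real data structures; the signed path also assumes two's-complement conversion and ignores division by zero.

```cpp
#include <cstdint>

enum IntType { TYPE_INT, TYPE_UINT }; // hypothetical type tag

struct Operand {
    std::uint32_t bits; // raw register contents
    IntType type;       // how later operations should interpret them
};

// i2u/u2i: the bits are unchanged, but the tag must follow the result type.
Operand convertNoOp(Operand src, IntType result_type)
{
    src.type = result_type;
    return src;
}

// Division picks the signed or unsigned form from the operand's tag.
std::uint32_t divide(Operand a, std::uint32_t b)
{
    if (a.type == TYPE_UINT)
        return a.bits / b; // unsigned divide, "UDIV"
    // signed divide, "IDIV"
    return static_cast<std::uint32_t>(
        static_cast<std::int32_t>(a.bits) / static_cast<std::int32_t>(b));
}
```

The same bit pattern 0xFFFFFFF0 divided by 2 gives 0xFFFFFFF8 when treated as signed and 0x7FFFFFF8 when treated as unsigned, so a stale tag after i2u selects the wrong result.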



Re: [Mesa-dev] [PATCH 2/5] android: radeonsi: fix building error in si_shader.c

2016-02-15 Thread Michel Dänzer
On 14.02.2016 23:41, Mauro Rossi wrote:
> 
> From: Mauro Rossi
> Date: Sun, 14 Feb 2016 15:34:16 +0100
> Subject: [PATCH 1/2] android: add support for strchrnul
> 
> Android Bionic has no strchrnul in string functions,
> radeonsi uses strchrnul, so we need an implementation.
> 
> strchrnul.h is added in top mesa include path.

Gallium code (at least outside of src/gallium/state_trackers) is not
supposed to include headers from the toplevel include directory. This
header should be in src/util/ instead.


> +/**
> + *
> + * Copyright (C) 2014 Emil Velikov

Why Emil's copyright?


> +char *
> +strchrnul(const char *s, int c)
> +{
> +char * result = strchr(s, c);

No space after the asterisk:

char *result = strchr(s, c);


> From: Mauro Rossi
> Date: Sun, 14 Feb 2016 15:10:16 +0100
> Subject: [PATCH 2/2] android: radeonsi: fix building error in si_shader.c

With the shortlog changed to something along the lines of

 radeonsi: Fix strchrnul being undefined on Android

this patch is

Reviewed-by: Michel Dänzer 
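
For context, a fallback with strchrnul() semantics can be quite small. The sketch below is not the code from the patch (only its first lines are quoted above); fallbackStrchrnul is a made-up name, chosen so it cannot clash with a libc that already provides strchrnul().

```cpp
// Fallback with the semantics of GNU strchrnul(): like strchr(), but a miss
// returns a pointer to the terminating NUL instead of a null pointer.
char *fallbackStrchrnul(const char *s, int c)
{
    const char *p = s;
    // Stop at the first match or at the end of the string.
    while (*p != '\0' && *p != static_cast<char>(c))
        ++p;
    return const_cast<char *>(p);
}
```

Callers can therefore compute the length of a token as the returned pointer minus the start pointer without a separate null check.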


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] Handle removal of LLVMAddTargetData in SVN revision 260919

2016-02-15 Thread Matthew Dawson
LLVM removed LLVMAddTargetData for the 3.9 release in r260919.  For the two
places in mesa where this is called, only enable the lines when compiling
against LLVM older than 3.9.

For the radeon driver, I'm not sure how to check if any other LLVM calls need
to be adjusted.  I think since the target data used is extracted from the
LLVMModule, it isn't necessary to pass it back to LLVM again.

The code does compile, and at least for radeonsi does run OpenGL games.

Signed-off-by: Matthew Dawson 
---
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 ++
 src/gallium/drivers/radeon/radeon_llvm_util.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 96aba73..8c81170 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -112,6 +112,7 @@ create_pass_manager(struct gallivm_state *gallivm)
gallivm->passmgr = LLVMCreateFunctionPassManagerForModule(gallivm->module);
if (!gallivm->passmgr)
   return FALSE;
+#if HAVE_LLVM < 0x0309
/*
 * TODO: some per module pass manager with IPO passes might be helpful -
 * the generated texture functions may benefit from inlining if they are
@@ -120,6 +121,7 @@ create_pass_manager(struct gallivm_state *gallivm)
 
// Old versions of LLVM get the DataLayout from the pass manager.
LLVMAddTargetData(gallivm->target, gallivm->passmgr);
+#endif
 
/* Setting the module's DataLayout to an empty string will cause the
 * ExecutionEngine to copy to the DataLayout string from its target
diff --git a/src/gallium/drivers/radeon/radeon_llvm_util.c 
b/src/gallium/drivers/radeon/radeon_llvm_util.c
index 0dfd9ad..ee21437 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_util.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_util.c
@@ -77,7 +77,9 @@ static void radeon_llvm_optimize(LLVMModuleRef mod)
}
}
 
+#if HAVE_LLVM < 0x0309
LLVMAddTargetData(TD, pass_manager);
+#endif
LLVMAddAlwaysInlinerPass(pass_manager);
LLVMPassManagerBuilderPopulateModulePassManager(builder, pass_manager);
 
-- 
2.7.0



[Mesa-dev] [PATCH 1/4] mesa: gl_NumSamples should always be at least one

2016-02-15 Thread Ilia Mirkin
From ARB_sample_shading:

"gl_NumSamples is the total number of samples in the framebuffer,
 or one if rendering to a non-multisample framebuffer"

So make sure to always pass in at least 1.

Signed-off-by: Ilia Mirkin 
---
 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c 
b/src/mesa/program/prog_statevars.c
index eed2412..489f75f 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -353,7 +353,7 @@ _mesa_fetch_state(struct gl_context *ctx, const 
gl_state_index state[],
   }
   return;
case STATE_NUM_SAMPLES:
-  ((int *)value)[0] = _mesa_geometric_samples(ctx->DrawBuffer);
+  ((int *)value)[0] = MAX2(1, _mesa_geometric_samples(ctx->DrawBuffer));
   return;
case STATE_DEPTH_RANGE:
   value[0] = ctx->ViewportArray[0].Near;/* near   */
-- 
2.4.10
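
The clamp itself is tiny; a standalone equivalent, for illustration. numSamplesForShader() is a hypothetical name, and the sketch assumes the framebuffer query reports 0 for a non-multisample framebuffer.

```cpp
#include <algorithm>

// Value exposed to the shader as gl_NumSamples. `geometric_samples` stands
// in for whatever the framebuffer reports; 0 means "not multisampled".
int numSamplesForShader(int geometric_samples)
{
    return std::max(1, geometric_samples);
}
```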



[Mesa-dev] [PATCH 3/4] glsl: enable OES_sample_variables features

2016-02-15 Thread Ilia Mirkin
Add gl_MaxSamples, and enable the various other variables when this
extension or ESSL 3.20 are set.

Signed-off-by: Ilia Mirkin 
---
 src/compiler/glsl/builtin_variables.cpp  | 15 ++++++++++++---
 src/compiler/glsl/glsl_parser_extras.cpp |  4 ++++
 src/compiler/glsl/glsl_parser_extras.h   |  5 +++++
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/builtin_variables.cpp 
b/src/compiler/glsl/builtin_variables.cpp
index d20fc4a..540057d 100644
--- a/src/compiler/glsl/builtin_variables.cpp
+++ b/src/compiler/glsl/builtin_variables.cpp
@@ -867,6 +867,9 @@ builtin_variable_generator::generate_constants()
   add_const("gl_MaxTessControlUniformComponents", 
state->Const.MaxTessControlUniformComponents);
   add_const("gl_MaxTessEvaluationUniformComponents", 
state->Const.MaxTessEvaluationUniformComponents);
}
+
+   if (state->is_version(0, 320) || state->OES_sample_variables_enable)
+  add_const("gl_MaxSamples", state->Const.MaxSamples);
 }
 
 
@@ -876,7 +879,9 @@ builtin_variable_generator::generate_constants()
 void
 builtin_variable_generator::generate_uniforms()
 {
-   if (state->is_version(400, 0) || state->ARB_sample_shading_enable)
+   if (state->is_version(400, 320) ||
+   state->ARB_sample_shading_enable ||
+   state->OES_sample_variables_enable)
   add_uniform(int_t, "gl_NumSamples");
add_uniform(type("gl_DepthRangeParameters"), "gl_DepthRange");
add_uniform(array(vec4_t, VERT_ATTRIB_MAX), "gl_CurrentAttribVertMESA");
@@ -1129,7 +1134,9 @@ builtin_variable_generator::generate_fs_special_vars()
  var->enable_extension_warning("GL_AMD_shader_stencil_export");
}
 
-   if (state->is_version(400, 0) || state->ARB_sample_shading_enable) {
+   if (state->is_version(400, 320) ||
+   state->ARB_sample_shading_enable ||
+   state->OES_sample_variables_enable) {
   add_system_value(SYSTEM_VALUE_SAMPLE_ID, int_t, "gl_SampleID");
   add_system_value(SYSTEM_VALUE_SAMPLE_POS, vec2_t, "gl_SamplePosition");
   /* From the ARB_sample_shading specification:
@@ -1142,7 +1149,9 @@ builtin_variable_generator::generate_fs_special_vars()
   add_output(FRAG_RESULT_SAMPLE_MASK, array(int_t, 1), "gl_SampleMask");
}
 
-   if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable) {
+   if (state->is_version(400, 320) ||
+   state->ARB_gpu_shader5_enable ||
+   state->OES_sample_variables_enable) {
   add_system_value(SYSTEM_VALUE_SAMPLE_MASK_IN, array(int_t, 1), 
"gl_SampleMaskIn");
}
 
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index fbac836..0aac060 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -175,6 +175,9 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
this->Const.MaxTessControlUniformComponents = 
ctx->Const.Program[MESA_SHADER_TESS_CTRL].MaxUniformComponents;
this->Const.MaxTessEvaluationUniformComponents = 
ctx->Const.Program[MESA_SHADER_TESS_EVAL].MaxUniformComponents;
 
+   /* OES_sample_variables */
+   this->Const.MaxSamples = ctx->Const.MaxSamples;
+
this->current_function = NULL;
this->toplevel_ir = NULL;
this->found_return = false;
@@ -606,6 +609,7 @@ static const _mesa_glsl_extension 
_mesa_glsl_supported_extensions[] = {
EXT(OES_EGL_image_external, false, true,  
OES_EGL_image_external),
EXT(OES_geometry_point_size,false, true,  OES_geometry_shader),
EXT(OES_geometry_shader,false, true,  OES_geometry_shader),
+   EXT(OES_sample_variables,   false, true,  OES_sample_variables),
EXT(OES_standard_derivatives,   false, true,  
OES_standard_derivatives),
EXT(OES_texture_3D, false, true,  dummy_true),
EXT(OES_texture_storage_multisample_2d_array, false, true, 
ARB_texture_multisample),
diff --git a/src/compiler/glsl/glsl_parser_extras.h 
b/src/compiler/glsl/glsl_parser_extras.h
index 5f6ca6a..5513345 100644
--- a/src/compiler/glsl/glsl_parser_extras.h
+++ b/src/compiler/glsl/glsl_parser_extras.h
@@ -457,6 +457,9 @@ struct _mesa_glsl_parse_state {
   unsigned MaxTessControlTotalOutputComponents;
   unsigned MaxTessControlUniformComponents;
   unsigned MaxTessEvaluationUniformComponents;
+
+  /* OES_sample_variables */
+  unsigned MaxSamples;
} Const;
 
/**
@@ -593,6 +596,8 @@ struct _mesa_glsl_parse_state {
bool OES_geometry_point_size_warn;
bool OES_geometry_shader_enable;
bool OES_geometry_shader_warn;
+   bool OES_sample_variables_enable;
+   bool OES_sample_variables_warn;
bool OES_standard_derivatives_enable;
bool OES_standard_derivatives_warn;
bool OES_texture_3D_enable;
-- 
2.4.10
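
The is_version(400, 320) calls in the hunks read more easily with the two-argument convention spelled out. The sketch below is a simplified model, assuming the first argument is the minimum desktop GLSL version, the second the minimum ESSL version, and 0 means the feature is never core in that flavour; ParseState is a hypothetical stand-in for _mesa_glsl_parse_state.

```cpp
// Simplified, hypothetical model of the parse state's version check.
struct ParseState {
    bool es_shader;
    unsigned language_version; // e.g. 400 for GLSL 4.00, 320 for ESSL 3.20

    // A required version of 0 means "never available in this flavour".
    bool is_version(unsigned required_glsl, unsigned required_glsl_es) const
    {
        unsigned required = es_shader ? required_glsl_es : required_glsl;
        return required != 0 && language_version >= required;
    }
};
```

Under this model is_version(0, 320) is true only for ES shaders at 3.20 or later, which is why gl_MaxSamples additionally needs the OES_sample_variables_enable test for older ES versions.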



[Mesa-dev] [PATCH 4/4] st/mesa: add OES_sample_variables support

2016-02-15 Thread Ilia Mirkin
Basically the same thing as ARB_sample_shading, except that it also needs
gl_SampleMaskIn support and must not enable per-sample interpolation
whenever per-sample shading is in effect. That is enabled explicitly by
another extension.

Signed-off-by: Ilia Mirkin 
---

I get 16 failures with dEQP tests; these fall into 2 categories:

 - 1-sample multisample surfaces don't behave the way it likes (it considers
   them non-multisampled even though they're created through gl*Multisample*)

 - gl_SampleMaskIn is reporting the whole pixel's worth of mask rather than
   just the fragment in question. Looking back, it appears that
   ARB_gpu_shader5 also wants it for only the current fragment.

 docs/GL3.txt                                | 2 +-
 src/mesa/state_tracker/st_atom_rasterizer.c | 2 ++
 src/mesa/state_tracker/st_atom_shader.c     | 2 ++
 src/mesa/state_tracker/st_extensions.c      | 4 ++++
 src/mesa/state_tracker/st_program.c         | 5 ++++-
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 26847b9..ae439f6 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -248,7 +248,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_gpu_shader5   not started (based on 
parts of GL_ARB_gpu_shader5, which is done for some drivers)
   GL_OES_primitive_bounding boxnot started
   GL_OES_sample_shadingnot started (based on 
parts of GL_ARB_sample_shading, which is done for some drivers)
-  GL_OES_sample_variables  not started (based on 
parts of GL_ARB_sample_shading, which is done for some drivers)
+  GL_OES_sample_variables  DONE (nvc0, r600, 
radeonsi)
   GL_OES_shader_image_atomic   not started (based on 
parts of GL_ARB_shader_image_load_store, which is done for some drivers)
   GL_OES_shader_io_blocks  not started (based on 
parts of GLSL 1.50, which is done)
   GL_OES_shader_multisample_interpolation  not started (based on 
parts of GL_ARB_gpu_shader5, which is done)
diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
b/src/mesa/state_tracker/st_atom_rasterizer.c
index c20cadf..d42d512 100644
--- a/src/mesa/state_tracker/st_atom_rasterizer.c
+++ b/src/mesa/state_tracker/st_atom_rasterizer.c
@@ -31,6 +31,7 @@
   */
  
 #include "main/macros.h"
+#include "main/context.h"
 #include "st_context.h"
 #include "st_atom.h"
 #include "st_debug.h"
@@ -239,6 +240,7 @@ static void update_raster_state( struct st_context *st )
 
/* _NEW_MULTISAMPLE | _NEW_BUFFERS */
raster->force_persample_interp =
+ !_mesa_is_gles(ctx) &&
  !st->force_persample_in_shader &&
  ctx->Multisample._Enabled &&
  ctx->Multisample.SampleShading &&
diff --git a/src/mesa/state_tracker/st_atom_shader.c 
b/src/mesa/state_tracker/st_atom_shader.c
index a88f035..8cfe756 100644
--- a/src/mesa/state_tracker/st_atom_shader.c
+++ b/src/mesa/state_tracker/st_atom_shader.c
@@ -36,6 +36,7 @@
  */
 
 #include "main/imports.h"
+#include "main/context.h"
 #include "main/mtypes.h"
 #include "program/program.h"
 
@@ -76,6 +77,7 @@ update_fp( struct st_context *st )
 * Ignore sample qualifier while computing this flag.
 */
key.persample_shading =
+  !_mesa_is_gles(st->ctx) &&
   st->force_persample_in_shader &&
   !(stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
 SYSTEM_BIT_SAMPLE_POS)) &&
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index eff3a2d..49d5a2c 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -861,6 +861,10 @@ void st_init_extensions(struct pipe_screen *screen,
   extensions->OES_copy_image = GL_TRUE;
}
 
+   /* Needs PIPE_CAP_SAMPLE_SHADING + gl_SampleMaskIn.
+*/
+   extensions->OES_sample_variables = extensions->ARB_gpu_shader5;
+
/* Maximum sample count. */
{
   enum pipe_format color_formats[] = {
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index 2e21d02..de628d7 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -32,6 +32,7 @@
 
 
 #include "main/imports.h"
+#include "main/context.h"
 #include "main/hash.h"
 #include "main/mtypes.h"
 #include "program/prog_parameter.h"
@@ -573,7 +574,9 @@ st_translate_fragment_program(struct st_context *st,
  else
 interpLocation[slot] = TGSI_INTERPOLATE_LOC_CENTER;
 
- if (stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
+ /* GLES specifies that only the sample keyword alters interpolation */
+ if (!_mesa_is_gles(st->ctx) &&
+ stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
  SYSTEM_BIT_SAMPLE_POS))
 

[Mesa-dev] [PATCH 2/4] mesa: add OES_sample_variables to extension table, add enable bit

2016-02-15 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/main/extensions_table.h | 1 +
 src/mesa/main/mtypes.h   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index bcd12a2..196a0c6 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -326,6 +326,7 @@ EXT(OES_point_sprite, 
ARB_point_sprite
 EXT(OES_query_matrix, dummy_true   
  ,  x ,  x , ES1,  x , 2003)
 EXT(OES_read_format , dummy_true   
  , GLL, GLC, ES1,  x , 2003)
 EXT(OES_rgb8_rgba8  , dummy_true   
  ,  x ,  x , ES1, ES2, 2005)
+EXT(OES_sample_variables, OES_sample_variables 
  ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_single_precision, dummy_true   
  ,  x ,  x , ES1,  x , 2003)
 EXT(OES_standard_derivatives, OES_standard_derivatives 
  ,  x ,  x ,  x , ES2, 2005)
 EXT(OES_stencil1, dummy_false  
  ,  x ,  x ,  x ,  x , 2005)
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 50cbbd3..14fad39 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3891,6 +3891,7 @@ struct gl_extensions
GLboolean EXT_timer_query;
GLboolean EXT_vertex_array_bgra;
GLboolean OES_copy_image;
+   GLboolean OES_sample_variables;
GLboolean OES_standard_derivatives;
/* vendor extensions */
GLboolean AMD_performance_monitor;
-- 
2.4.10



Re: [Mesa-dev] [v2 18/19] i965: Add helper for lossless compression support

2016-02-15 Thread Pohjolainen, Topi
On Mon, Feb 15, 2016 at 01:27:19PM -0800, Ben Widawsky wrote:
> On Thu, Feb 11, 2016 at 08:34:11PM +0200, Topi Pohjolainen wrote:
> > v2: Check explicitly against the base type GL_FLOAT instead of
> > using _mesa_is_format_integer_color(). Otherwise we miss
> > GL_UNSIGNED_NORMALIZED.
> > 
> > Signed-off-by: Topi Pohjolainen 
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 ++++++++++++++++++++++
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +++
> >  2 files changed, 25 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index 6c233d8..e9fbeeb 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -294,6 +294,28 @@ intel_miptree_is_lossless_compressed(const struct 
> > brw_context *brw,
> > return mt->num_samples <= 1;
> >  }
> >  
> > +bool
> > +intel_miptree_supports_lossless_compressed(mesa_format format)
> > +{
> > +   /* For now compression is only enabled for integer formats even though
> > +* there exist supported floating point formats also. This is a heuristic
> > +* decision based on current public benchmarks. In none of the cases these
> > +* formats provided any improvement but a few cases were seen to regress.
> > +* Hence these are left to be enabled in the future when they are known
> > +* to improve things.
> > +*/
> > +   if (_mesa_get_format_datatype(format) == GL_FLOAT)
> > +  return false;
> > +
> > +   /* In principle, fast clear mechanism and lossless compression go hand in
> > +* hand. However, fast clear can be also used to clear srgb surfaces by
> > +* using equivalent linear format. This trick, however, can't be extended
> > +* to be used with lossless compression and therefore a check is needed to
> > +* see if the format really is linear.
> > +*/
> > +   return _mesa_get_srgb_format_linear(format) == format;
> > +}
> > +
> 
> Hmm. Doesn't this need to use the ccs_e field in surface formats, or did I
> miss something?

It does, but I reused intel_miptree_supports_non_msrt_fast_clear() to check
if a format is supported. There you can see the trick Neil introduced:

if (brw->gen >= 9) {
   mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format);
   const uint32_t brw_format = brw_format_for_mesa_format(linear_format);
   return brw_losslessly_compressible_format(brw, brw_format);
} else ...

None of the sRGB formats are supported even for fast clear, but Neil
found out that we can use fast clear in certain cases even for them.

I need to undo the conversion from sRGB to linear to know for sure that the
format is supported for compression. At the time I wrote this there was no
direct utility for checking whether a format is sRGB, and therefore I chose
to write it that way.
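
To make the "is the format really linear" test concrete, here is a sketch with a made-up three-entry format set. Format, isFloat(), toLinear() and supportsLosslessCompression() are illustrative only; the real helper works on mesa_format and the corresponding Mesa queries.

```cpp
// Hypothetical, tiny format set for illustration.
enum Format { RGBA8_UNORM, RGBA8_SRGB, RGBA16_FLOAT };

bool isFloat(Format f) { return f == RGBA16_FLOAT; }

// Maps an sRGB format to its linear twin; linear formats map to themselves.
Format toLinear(Format f) { return f == RGBA8_SRGB ? RGBA8_UNORM : f; }

bool supportsLosslessCompression(Format f)
{
    if (isFloat(f))
        return false;        // heuristic: float formats stay disabled
    return toLinear(f) == f; // an sRGB format differs from its linear twin
}
```

A format that maps to itself under toLinear() is already linear, so the comparison doubles as an sRGB test without needing a dedicated query.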


Re: [Mesa-dev] [v2 14/19] i965/gen9: Prepare surface state setup for lossless compression

2016-02-15 Thread Pohjolainen, Topi
On Mon, Feb 15, 2016 at 12:55:28PM -0800, Ben Widawsky wrote:
> On Thu, Feb 11, 2016 at 08:34:07PM +0200, Topi Pohjolainen wrote:
> > v2 (Ben): Use combination of msaa_layout and number of samples
> >   instead of introducing explicit type for lossless
> >   compression (intel_miptree_is_lossless_compressed()).
> > 
> > Signed-off-by: Topi Pohjolainen 
> > ---
> >  src/mesa/drivers/dri/i965/brw_defines.h| 1 +
> >  src/mesa/drivers/dri/i965/gen8_surface_state.c | 6 ++++++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index b1fa559..f903335 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -656,6 +656,7 @@
> >  #define GEN8_SURFACE_AUX_MODE_MCS   1
> >  #define GEN8_SURFACE_AUX_MODE_APPEND2
> >  #define GEN8_SURFACE_AUX_MODE_HIZ   3
> > +#define GEN9_SURFACE_AUX_MODE_CCS_E 5
> >  
> >  /* Surface state DW7 */
> >  #define GEN9_SURFACE_RT_COMPRESSION_SHIFT   30
> > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> > b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > index 0a52815..e1a37d8 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> > @@ -216,6 +216,9 @@ gen8_get_aux_mode(const struct brw_context *brw,
> > if (brw->gen >= 9 || mt->num_samples == 1)
> >assert(mt->halign == 16);
> >  
> > +   if (intel_miptree_is_lossless_compressed(brw, mt))
> > +  return GEN9_SURFACE_AUX_MODE_CCS_E;
> > +
> > return GEN8_SURFACE_AUX_MODE_MCS;
> >  }
> >  
> > @@ -484,6 +487,9 @@ gen8_update_renderbuffer_surface(struct brw_context 
> > *brw,
> > struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
> > const uint32_t aux_mode = gen8_get_aux_mode(brw, mt, surf_type);
> >  
> > +   if (aux_mode == GEN9_SURFACE_AUX_MODE_CCS_E)
> > +  mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_UNRESOLVED;
> > +
> 
> I am somewhat undecided about whether or not this should be here. On the one
> hand, this is a gen specific thing (you rendered to a losslessly compressed
> buffer, which is only a gen9+ thing). On the other hand, we handle all similar
> stuff in the common meta code, and modifying fast_clear_state here seems a bit
> unclean.
> 
> I'm not opposed to doing this as long as you've considered my potential
> objection.

I wasn't that happy putting this here either. The main render target loop in
brw_wm_surface_state.c::brw_update_renderbuffer_surfaces() is even more
gen-agnostic, and therefore I didn't want to put this there either.

Now that you brought this up, I think the correct place would be
brw_postdraw_set_buffers_need_resolve(), called at the end of
brw_try_draw_prims(). What do you think?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: add GL_OES_texture_stencil8 support

2016-02-15 Thread Ilia Mirkin
It's basically the same thing as GL_ARB_texture_stencil8 except that
glCopyTexImage isn't supported, so add STENCIL_INDEX to the list of
invalid GLES formats for glCopyTexImage.

Signed-off-by: Ilia Mirkin 
---

Seems to pass the few dEQP tests that are there. The ext is nearly identical to 
the desktop version.

 docs/GL3.txt | 2 +-
 src/mesa/main/extensions_table.h | 1 +
 src/mesa/main/teximage.c | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 3c4db06..26847b9 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -256,7 +256,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 not started (based on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and GL_ARB_texture_buffer_object_rgb32 that are all done)
   GL_OES_texture_cube_map_array                         not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
-  GL_OES_texture_stencil8                               not started (based on GL_ARB_texture_stencil8, which is done for some drivers)
+  GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 
 More info about these features and the work involved can be found at
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 43dc358..bcd12a2 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -343,6 +343,7 @@ EXT(OES_texture_half_float  , 
OES_texture_half_float
 EXT(OES_texture_half_float_linear           , OES_texture_half_float_linear          ,  x ,  x ,  x , ES2, 2005)
 EXT(OES_texture_mirrored_repeat             , dummy_true                             ,  x ,  x , ES1,  x , 2005)
 EXT(OES_texture_npot                        , ARB_texture_non_power_of_two           ,  x ,  x , ES1, ES2, 2005)
+EXT(OES_texture_stencil8                    , ARB_texture_stencil8                   ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_texture_storage_multisample_2d_array, ARB_texture_multisample                ,  x ,  x , ES1,  31, 2014)
 EXT(OES_texture_view                        , ARB_texture_view                       ,  x ,  x ,  x ,  31, 2014)
 EXT(OES_vertex_array_object                 , dummy_true                             ,  x ,  x , ES1, ES2, 2010)
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 57765d7..8a4c628 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2285,8 +2285,10 @@ copytexture_error_check( struct gl_context *ctx, GLuint 
dimensions,
   }
   if (baseFormat == GL_DEPTH_COMPONENT ||
   baseFormat == GL_DEPTH_STENCIL ||
+  baseFormat == GL_STENCIL_INDEX ||
   rb_base_format == GL_DEPTH_COMPONENT ||
   rb_base_format == GL_DEPTH_STENCIL ||
+  rb_base_format == GL_STENCIL_INDEX ||
   ((baseFormat == GL_LUMINANCE_ALPHA ||
 baseFormat == GL_ALPHA) &&
rb_base_format != GL_RGBA) ||
-- 
2.4.10



[Mesa-dev] [Bug 94168] Incorrect rendering when running Populous 3 on wine using DDraw->WineD3D->OpenGL wrapper [apitrace]

2016-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94168

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|GLX                         |Mesa core

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] st/mesa: count shader images in MaxCombinedShaderOutputResources

2016-02-15 Thread Ilia Mirkin
On Mon, Feb 15, 2016 at 10:07 PM, Ilia Mirkin  wrote:
> On Mon, Feb 15, 2016 at 10:00 PM, Nicolai Hähnle  wrote:
>> From: Nicolai Hähnle 
>>
>> ---
>> This is on top of Ilia's Gallium images series. Ilia, I think it makes sense
>> for you to include this in your initial push if you agree.
>
> I don't disagree, but Dave might :) He can figure out how to work this
> out when he gets to it.
>
>>
>>  src/mesa/state_tracker/st_extensions.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/mesa/state_tracker/st_extensions.c 
>> b/src/mesa/state_tracker/st_extensions.c
>> index 5763ba7..e58ff83 100644
>> --- a/src/mesa/state_tracker/st_extensions.c
>> +++ b/src/mesa/state_tracker/st_extensions.c
>> @@ -373,6 +373,7 @@ void st_init_limits(struct pipe_screen *screen,
>>   c->Program[MESA_SHADER_TESS_EVAL].MaxImageUniforms +
>>   c->Program[MESA_SHADER_GEOMETRY].MaxImageUniforms +
>>   c->Program[MESA_SHADER_FRAGMENT].MaxImageUniforms;
>> +   c->MaxCombinedShaderOutputResources += c->MaxCombinedImageUniforms;
>
> I think that should just be c->Program[MESA_SHADER_FRAGMENT].MaxImageUniforms.

Nope, nevermind. Misread.


Re: [Mesa-dev] [PATCH] mesa: add GL_OES_copy_image support

2016-02-15 Thread Ilia Mirkin
On Mon, Feb 15, 2016 at 8:41 PM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>
> I ran this with the dEQP tests, and other than the caveats below, they seem to
> mostly work.
>
> The biggest caveat is that this can't actually be enabled for any drivers that
> don't implement ETC2 in hardware. So really that's just freedreno and the
> super-new desktop hardware. The problem is that you can copy between ETC2 and
> non-compressed images, so you need to have the original data around. At least
> the way that st/mesa implements this, the original data is not kept around.
>
> In order to enable this more generally, st/mesa will have to be taught to keep
> track of the originally-uploaded data. And support copying (and re-decoding)
> of data from another image.
>
> There also appears to be some unrelated problem relating to copying non-0 levels
> but that could well be a nouveau issue, or something unrelated. I don't think
> it's a problem with this patch.

After a bunch of investigation, I'm quite sure this is entirely
unrelated to this impl. Something weird is going on there, but it's
not in nouveau, and I'm pretty sure it's not in copy image.

>
>  docs/GL3.txt|  2 +-
>  src/mapi/glapi/gen/es_EXT.xml   | 22 +
>  src/mesa/main/copyimage.c   | 27 ++-
>  src/mesa/main/extensions_table.h|  1 +
>  src/mesa/main/mtypes.h  |  1 +
>  src/mesa/main/tests/dispatch_sanity.cpp |  3 ++
>  src/mesa/main/textureview.c | 86 
> +
>  src/mesa/state_tracker/st_extensions.c  |  8 +++
>  8 files changed, 148 insertions(+), 2 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 0957247..3c4db06 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -241,7 +241,7 @@ GLES3.2, GLSL ES 3.2
>    GL_KHR_debug                                          DONE (all drivers)
>    GL_KHR_robustness                                     90% done (the ARB variant)
>    GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
> -  GL_OES_copy_image                                     not started (based on GL_ARB_copy_image, which is done for some drivers)
> +  GL_OES_copy_image                                     DONE (core only)
>    GL_OES_draw_buffers_indexed                           not started
>    GL_OES_draw_elements_base_vertex                      DONE (all drivers)
>    GL_OES_geometry_shader                                started (Marta)
> diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
> index fb0ef05..91e118f 100644
> --- a/src/mapi/glapi/gen/es_EXT.xml
> +++ b/src/mapi/glapi/gen/es_EXT.xml
> @@ -941,6 +941,28 @@
>
>  
>
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
> +
>  
>  
>   value="0x8DD9"/>
> diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
> index d571d22..a0f1c69 100644
> --- a/src/mesa/main/copyimage.c
> +++ b/src/mesa/main/copyimage.c
> @@ -25,6 +25,7 @@
>   *Jason Ekstrand 
>   */
>
> +#include "context.h"
>  #include "glheader.h"
>  #include "errors.h"
>  #include "enums.h"
> @@ -360,8 +361,32 @@ compressed_format_compatible(const struct gl_context 
> *ctx,
>case GL_COMPRESSED_SIGNED_RED_RGTC1:
>   compressedClass = BLOCK_CLASS_64_BITS;
>   break;
> +  case GL_COMPRESSED_RGBA8_ETC2_EAC:
> +  case GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC:
> +  case GL_COMPRESSED_RG11_EAC:
> +  case GL_COMPRESSED_SIGNED_RG11_EAC:
> + if (_mesa_is_gles(ctx))
> +compressedClass = BLOCK_CLASS_128_BITS;
> + else
> +return false;
> + break;
> +  case GL_COMPRESSED_RGB8_ETC2:
> +  case GL_COMPRESSED_SRGB8_ETC2:
> +  case GL_COMPRESSED_R11_EAC:
> +  case GL_COMPRESSED_SIGNED_R11_EAC:
> +  case GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> +  case GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2:
> + if (_mesa_is_gles(ctx))
> +compressedClass = BLOCK_CLASS_64_BITS;
> + else
> +return false;
> + break;
>default:
> - return false;
> + if (_mesa_is_gles(ctx) && _mesa_is_astc_format(compressedFormat))
> +compressedClass = BLOCK_CLASS_128_BITS;
> + else
> +return false;
> + break;
> }
>
> switch (otherFormat) {
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index b07d635..d985ff0 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -305,6 +305,7 @@ EXT(OES_blend_subtract  , dummy_true
>  EXT(OES_byte_coordinates, dummy_true 
> ,  x ,  

Re: [Mesa-dev] [PATCH 10/25] radeonsi: add code for dumping all shader parts together

2016-02-15 Thread Michel Dänzer
On 16.02.2016 08:59, Marek Olšák wrote:
> From: Marek Olšák 

[...]

> @@ -4199,13 +4200,27 @@ static void si_shader_dump_stats(struct si_screen 
> *sscreen,
>  void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
>   struct pipe_debug_callback *debug, unsigned processor)
>  {
> - if (r600_can_dump_shader(&sscreen->b, processor))
> - if (!(sscreen->b.debug_flags & DBG_NO_ASM))
> - si_shader_dump_disassembly(&shader->binary, debug);
> + unsigned code_size =
> + (shader->prolog ? shader->prolog->binary.code_size : 0) +
> + shader->binary.code_size +
> + (shader->epilog ? shader->epilog->binary.code_size : 0);

This code is a bit messy and duplicated in at least two places. I'd
suggest factoring this out into a helper function which uses normal if
statements instead of ternary operators.
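
A minimal sketch of what such a helper could look like. The structs below
are simplified stand-ins that only carry the fields visible in the quoted
hunk, and the helper name is hypothetical, not something from the radeonsi
tree:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the radeonsi types (assumption: only the
 * fields used in the quoted hunk are modelled here). */
struct sketch_binary {
   unsigned code_size;
};

struct sketch_shader_part {
   struct sketch_binary binary;
};

struct sketch_shader {
   struct sketch_shader_part *prolog;   /* may be NULL */
   struct sketch_binary binary;         /* main shader part */
   struct sketch_shader_part *epilog;   /* may be NULL */
};

/* Hypothetical helper: total code size of prolog + main part + epilog,
 * written with plain if statements instead of ternaries. */
static unsigned
sketch_shader_total_code_size(const struct sketch_shader *shader)
{
   unsigned size;

   assert(shader != NULL);

   size = shader->binary.code_size;
   if (shader->prolog)
      size += shader->prolog->binary.code_size;
   if (shader->epilog)
      size += shader->epilog->binary.code_size;

   return size;
}
```

Each place that currently repeats the three-term sum would then call the
helper once.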


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH] st/mesa: count shader images in MaxCombinedShaderOutputResources

2016-02-15 Thread Ilia Mirkin
On Mon, Feb 15, 2016 at 10:00 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> ---
> This is on top of Ilia's Gallium images series. Ilia, I think it makes sense
> for you to include this in your initial push if you agree.

I don't disagree, but Dave might :) He can figure out how to work this
out when he gets to it.

>
>  src/mesa/state_tracker/st_extensions.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index 5763ba7..e58ff83 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -373,6 +373,7 @@ void st_init_limits(struct pipe_screen *screen,
>   c->Program[MESA_SHADER_TESS_EVAL].MaxImageUniforms +
>   c->Program[MESA_SHADER_GEOMETRY].MaxImageUniforms +
>   c->Program[MESA_SHADER_FRAGMENT].MaxImageUniforms;
> +   c->MaxCombinedShaderOutputResources += c->MaxCombinedImageUniforms;

I think that should just be c->Program[MESA_SHADER_FRAGMENT].MaxImageUniforms.

> c->MaxImageUnits = MAX_IMAGE_UNITS;
> c->MaxImageSamples = 0; /* XXX */
> if (c->MaxCombinedImageUniforms) {
> --
> 2.5.0
>


[Mesa-dev] [PATCH] st/mesa: count shader images in MaxCombinedShaderOutputResources

2016-02-15 Thread Nicolai Hähnle
From: Nicolai Hähnle 

---
This is on top of Ilia's Gallium images series. Ilia, I think it makes sense
for you to include this in your initial push if you agree.
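
For illustration only, a standalone sketch of the accounting this patch
changes. The enum, struct and function names are simplified stand-ins, not
the real gl_constants layout, and whatever the output-resource limit held
before images are counted is taken as given:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the shader stages that can expose image uniforms. */
enum sketch_stage {
   SKETCH_VERTEX,
   SKETCH_TESS_CTRL,
   SKETCH_TESS_EVAL,
   SKETCH_GEOMETRY,
   SKETCH_FRAGMENT,
   SKETCH_STAGE_COUNT
};

/* Hypothetical, trimmed-down version of the limits touched by the patch. */
struct sketch_limits {
   unsigned max_image_uniforms[SKETCH_STAGE_COUNT];
   unsigned max_combined_image_uniforms;
   unsigned max_combined_shader_output_resources;
};

/* Sum the per-stage image limits, then add that total on top of the
 * output-resource limit computed so far. */
static void
sketch_count_images(struct sketch_limits *c)
{
   unsigned i;

   assert(c != NULL);

   c->max_combined_image_uniforms = 0;
   for (i = 0; i < SKETCH_STAGE_COUNT; i++)
      c->max_combined_image_uniforms += c->max_image_uniforms[i];

   c->max_combined_shader_output_resources += c->max_combined_image_uniforms;
}
```

The point of the one-line change is the last statement: every image that
any stage can bind is counted as a potential output resource.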

 src/mesa/state_tracker/st_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 5763ba7..e58ff83 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -373,6 +373,7 @@ void st_init_limits(struct pipe_screen *screen,
  c->Program[MESA_SHADER_TESS_EVAL].MaxImageUniforms +
  c->Program[MESA_SHADER_GEOMETRY].MaxImageUniforms +
  c->Program[MESA_SHADER_FRAGMENT].MaxImageUniforms;
+   c->MaxCombinedShaderOutputResources += c->MaxCombinedImageUniforms;
c->MaxImageUnits = MAX_IMAGE_UNITS;
c->MaxImageSamples = 0; /* XXX */
if (c->MaxCombinedImageUniforms) {
-- 
2.5.0



Re: [Mesa-dev] [PATCH] compiler/glsl: Fix uniform location counting.

2016-02-15 Thread Michel Dänzer
On 16.02.2016 06:38, Matt Turner wrote:
> On Mon, Feb 15, 2016 at 12:50 PM, Ilia Mirkin  wrote:
>> In a few places your indentation is off -- please look at the 'git
>> diff' output, it should be pretty obvious. You used 2 spaces instead
>> of 3 (in just a handful of places).
> 
> If you use vim, you can put something like this in your ~/.vimrc:

IMHO this sort of thing needs to be in the tree.

I recently learned about EditorConfig, which supports lots of editors
and IDEs including vim (via a plugin):

http://editorconfig.org/

Maybe somebody could look into converting the existing .dir-locals.el
files in the tree to .editorconfig files.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] mesa: add GL_OES_copy_image support

2016-02-15 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---

I ran this with the dEQP tests, and other than the caveats below, they seem to
mostly work.

The biggest caveat is that this can't actually be enabled for any drivers that
don't implement ETC2 in hardware. So really that's just freedreno and the
super-new desktop hardware. The problem is that you can copy between ETC2 and
non-compressed images, so you need to have the original data around. At least
the way that st/mesa implements this, the original data is not kept around.

In order to enable this more generally, st/mesa will have to be taught to keep
track of the originally-uploaded data. And support copying (and re-decoding)
of data from another image.

There also appears to be some unrelated problem relating to copying non-0 levels
but that could well be a nouveau issue, or something unrelated. I don't think
it's a problem with this patch.
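
To make the ETC2 caveat concrete, here is a rough sketch of the
compressed/uncompressed compatibility rule. The format tags are
hypothetical names for this example, not real GL enums; the block sizes
for the two ETC2 formats follow the classes assigned in the patch below,
and the rule shown (a block must be as large as one uncompressed texel) is
my reading of the extension, not code from Mesa:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format tags for this sketch only. */
enum sketch_format {
   SKETCH_ETC2_RGB8,        /* compressed, 64-bit blocks */
   SKETCH_ETC2_RGBA8_EAC,   /* compressed, 128-bit blocks */
   SKETCH_RGBA16UI,         /* uncompressed, 64 bits per texel */
   SKETCH_RGBA32UI          /* uncompressed, 128 bits per texel */
};

static bool
sketch_is_compressed(enum sketch_format f)
{
   return f == SKETCH_ETC2_RGB8 || f == SKETCH_ETC2_RGBA8_EAC;
}

/* Size in bits of one copy unit: a block if compressed, a texel if not. */
static unsigned
sketch_unit_bits(enum sketch_format f)
{
   switch (f) {
   case SKETCH_ETC2_RGB8:
   case SKETCH_RGBA16UI:
      return 64;
   case SKETCH_ETC2_RGBA8_EAC:
   case SKETCH_RGBA32UI:
      return 128;
   }
   assert(!"unknown sketch format");
   return 0;
}

/* A compressed<->uncompressed copy moves raw bits, so it is only allowed
 * when one block is exactly as large as one texel. */
static bool
sketch_mixed_copy_ok(enum sketch_format a, enum sketch_format b)
{
   if (sketch_is_compressed(a) == sketch_is_compressed(b))
      return false; /* not a mixed pair; other rules apply, not modelled */
   return sketch_unit_bits(a) == sketch_unit_bits(b);
}
```

Because such a copy transfers the raw block bits, a driver that only keeps
a decoded copy of an ETC2 texture has nothing valid to copy from or into.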

 docs/GL3.txt|  2 +-
 src/mapi/glapi/gen/es_EXT.xml   | 22 +
 src/mesa/main/copyimage.c   | 27 ++-
 src/mesa/main/extensions_table.h|  1 +
 src/mesa/main/mtypes.h  |  1 +
 src/mesa/main/tests/dispatch_sanity.cpp |  3 ++
 src/mesa/main/textureview.c | 86 +
 src/mesa/state_tracker/st_extensions.c  |  8 +++
 8 files changed, 148 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 0957247..3c4db06 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -241,7 +241,7 @@ GLES3.2, GLSL ES 3.2
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     90% done (the ARB variant)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
-  GL_OES_copy_image                                     not started (based on GL_ARB_copy_image, which is done for some drivers)
+  GL_OES_copy_image                                     DONE (core only)
   GL_OES_draw_buffers_indexed                           not started
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                started (Marta)
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index fb0ef05..91e118f 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -941,6 +941,28 @@
 
 
 
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
 
 
 
diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
index d571d22..a0f1c69 100644
--- a/src/mesa/main/copyimage.c
+++ b/src/mesa/main/copyimage.c
@@ -25,6 +25,7 @@
  *Jason Ekstrand 
  */
 
+#include "context.h"
 #include "glheader.h"
 #include "errors.h"
 #include "enums.h"
@@ -360,8 +361,32 @@ compressed_format_compatible(const struct gl_context *ctx,
   case GL_COMPRESSED_SIGNED_RED_RGTC1:
  compressedClass = BLOCK_CLASS_64_BITS;
  break;
+  case GL_COMPRESSED_RGBA8_ETC2_EAC:
+  case GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC:
+  case GL_COMPRESSED_RG11_EAC:
+  case GL_COMPRESSED_SIGNED_RG11_EAC:
+ if (_mesa_is_gles(ctx))
+compressedClass = BLOCK_CLASS_128_BITS;
+ else
+return false;
+ break;
+  case GL_COMPRESSED_RGB8_ETC2:
+  case GL_COMPRESSED_SRGB8_ETC2:
+  case GL_COMPRESSED_R11_EAC:
+  case GL_COMPRESSED_SIGNED_R11_EAC:
+  case GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2:
+  case GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2:
+ if (_mesa_is_gles(ctx))
+compressedClass = BLOCK_CLASS_64_BITS;
+ else
+return false;
+ break;
   default:
- return false;
+ if (_mesa_is_gles(ctx) && _mesa_is_astc_format(compressedFormat))
+compressedClass = BLOCK_CLASS_128_BITS;
+ else
+return false;
+ break;
}
 
switch (otherFormat) {
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index b07d635..d985ff0 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -305,6 +305,7 @@ EXT(OES_blend_subtract  , dummy_true
 EXT(OES_byte_coordinates                    , dummy_true                             ,  x ,  x , ES1,  x , 2002)
 EXT(OES_compressed_ETC1_RGB8_texture        , OES_compressed_ETC1_RGB8_texture       ,  x ,  x , ES1, ES2, 2005)
 EXT(OES_compressed_paletted_texture         , dummy_true                             ,  x ,  x , ES1,  x , 2003)
+EXT(OES_copy_image                          , OES_copy_image                         ,  x ,  x ,  x ,  30, 2014)
 EXT(OES_depth24                             , dummy_true                             ,  x ,  x , ES1, ES2, 2005)
 EXT(OES_depth32 , dummy_false 

Re: [Mesa-dev] black background on android-x86-5.1

2016-02-15 Thread Ilia Mirkin
On Mon, Feb 15, 2016 at 12:49 AM, qinshao...@phoenixos.com
 wrote:
>
> Hi Mauro,
>
> I've heard a lot about you,  I have a question for you Derm.
> my laptop is lenovo B460, the gpu chipset GT218, family is NV50 .  on kitkat, 
> wallpaper can be displayed. but on  lollipop and marshmallow, wallpaper is 
> not displayed.Here is a screenshot

I note that you're only getting GLES 2.0. Perhaps that's the reason:
later Android versions probably require GLES 3. I suspect you're
hitting this because you're not building mesa with
--enable-texture-float.

  -ilia


Re: [Mesa-dev] [PATCH v2] vc4: Correct typo setting 'handled_qinst_cond'

2016-02-15 Thread Rhys Kidd
On 15 February 2016 at 20:17, Eric Anholt  wrote:

> Rhys Kidd  writes:
>
> > Variable was previously always set to true. Accordingly, the later
> > assert() served no active purpose.
> >
> > Found with GCC warning and code inspection:
> >
> > mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function 'vc4_generate_code':
> > mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable]
> >  bool handled_qinst_cond = true;
> >   ^
> >
> > Separately, the early break for MOV no-ops in the default switch case
> > will trigger the assert() in debug builds.
> >
> > ...
> > /* Skip emitting the MOV if it's a no-op. */
> > if (qir_is_raw_mov(qinst) &&
> > dst.mux == src[0].mux && dst.addr == src[0].addr) {
> > break;
> > }
> > ...
> >
> > This code path is tickled now that this typo is corrected.
>
> Thanks!  I've pushed your 3 patches, but with this last bit about the
> is_raw_mov() problem removed since I just put in a patch to fix the
> problem instead :)
>

Thanks Eric!


Re: [Mesa-dev] [PATCH] mesa: Fix test for big-endian architecture in compiler.h

2016-02-15 Thread Jochen Rollwagen

Am 15.02.2016 um 15:53 schrieb Oded Gabbay:

On Sat, Feb 13, 2016 at 2:39 AM, Roland Scheidegger  wrote:

Am 12.02.2016 um 10:01 schrieb Jochen Rollwagen:

Hi,

I think I found and fixed a bug in Mesa concerning the tests for big-endian
machines. The defines being tested either don't exist or are wrongly defined,
so the test (probably) never fires. The gcc defines on my machine concerning
big-endian are

jochen@mac-mini:~/sources/mesa$ gcc -dM -E - < /dev/null | grep BIG
#define __BIGGEST_ALIGNMENT__ 16
#define __BIG_ENDIAN__ 1
#define __FLOAT_WORD_ORDER__ __ORDER_BIG_ENDIAN__
#define _BIG_ENDIAN 1
#define __ORDER_BIG_ENDIAN__ 4321
#define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__

The tested values in current mesa are quite different :-)

The following patch fixes this.

diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
index c5ee741..99c63cb 100644
--- a/src/mesa/main/compiler.h
+++ b/src/mesa/main/compiler.h
@@ -52,7 +52,7 @@ extern "C" {
   * Try to use a runtime test instead.
   * For now, only used by some DRI hardware drivers for color/texel
packing.
   */
-#if defined(BYTE_ORDER) && defined(BIG_ENDIAN) && BYTE_ORDER == BIG_ENDIAN
+#if defined(__BYTE_ORDER__) && defined(__BIG_ENDIAN__) &&
__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  #if defined(__linux__)
  #include <byteswap.h>
  #define CPU_TO_LE32( x )   bswap_32( x )


Note that on some platforms this file would include endian.h - which
defines those BYTE_ORDER etc. values. Albeit it includes this _after_
these ifdefs...
But don't ask me how this is really supposed to work...

Roland

 includes  which includes 

However, this depends on the c/h files to include  before
including "compiler.h", which doesn't always happen (e.g
dummy_errors.c) and it is a very fragile situation.
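
To illustrate the difference, a small sketch of a check that does not
depend on any header having been included first. It relies only on the
macros GCC predefines (the ones shown in the gcc -dM output above) and
pairs them with the runtime test that the comment in compiler.h recommends.
The SKETCH_ and sketch_ names are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time check using compiler-predefined macros only, so the
 * result does not depend on include order. Assumption: a compiler that
 * does not define these macros is treated as little-endian here. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/* Runtime check: look at which byte of a known 32-bit value is stored
 * first in memory. */
static int
sketch_is_big_endian_runtime(void)
{
   const uint32_t probe = 0x01020304u;
   const unsigned char *bytes = (const unsigned char *)&probe;

   return bytes[0] == 0x01;
}

/* Sanity check that both methods agree on this machine. */
static void
sketch_check_endianness(void)
{
   assert(sketch_is_big_endian_runtime() == SKETCH_BIG_ENDIAN);
}
```

With the old BYTE_ORDER/BIG_ENDIAN test, the compile-time branch would
silently evaluate to "not big-endian" whenever the header defining those
names had not been pulled in yet.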

So I think this is a good fix and this patch is:
Reviewed-by: Oded Gabbay 

Jochen,

Note that I downloaded this patch from pw and it was malformed. I
don't know if it's a pw problem or a problem in how you sent the patch
to the ml.

 Oded




Well, I just copied it from the git diff output in the terminal and pasted it
into my mail client. Maybe a newline problem? Anyway, I attached the patch (and
patched my local mesa with it before, which worked :-) ).


Cheers

Jochen
diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
index c5ee741..99c63cb 100644
--- a/src/mesa/main/compiler.h
+++ b/src/mesa/main/compiler.h
@@ -52,7 +52,7 @@ extern "C" {
  * Try to use a runtime test instead.
  * For now, only used by some DRI hardware drivers for color/texel packing.
  */
-#if defined(BYTE_ORDER) && defined(BIG_ENDIAN) && BYTE_ORDER == BIG_ENDIAN
+#if defined(__BYTE_ORDER__) && defined(__BIG_ENDIAN__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
 #if defined(__linux__)
 #include <byteswap.h>
 #define CPU_TO_LE32( x )   bswap_32( x )


Re: [Mesa-dev] [PATCH v2] vc4: Correct typo setting 'handled_qinst_cond'

2016-02-15 Thread Eric Anholt
Rhys Kidd  writes:

> Variable was previously always set to true. Accordingly, the later
> assert() served no active purpose.
>
> Found with GCC warning and code inspection:
>
> mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function 'vc4_generate_code':
> mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable]
>  bool handled_qinst_cond = true;
>   ^
>
> Separately, the early break for MOV no-ops in the default switch case
> will trigger the assert() in debug builds.
>
> ...
> /* Skip emitting the MOV if it's a no-op. */
> if (qir_is_raw_mov(qinst) &&
> dst.mux == src[0].mux && dst.addr == src[0].addr) {
> break;
> }
> ...
>
> This code path is tickled now that this typo is corrected.

Thanks!  I've pushed your 3 patches, but with this last bit about the
is_raw_mov() problem removed since I just put in a patch to fix the
problem instead :)




Re: [Mesa-dev] [PATCH] st/mesa: empty buffer binding if the buffer's not really there

2016-02-15 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Sun, Feb 14, 2016 at 1:25 AM, Ilia Mirkin  wrote:
> This can happen with 0-sized buffers.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/state_tracker/st_atom_atomicbuf.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_atomicbuf.c 
> b/src/mesa/state_tracker/st_atom_atomicbuf.c
> index d83c396..a27dbe0 100644
> --- a/src/mesa/state_tracker/st_atom_atomicbuf.c
> +++ b/src/mesa/state_tracker/st_atom_atomicbuf.c
> @@ -58,9 +58,11 @@ st_bind_atomics(struct st_context *st,
>   st_buffer_object(binding->BufferObject);
>struct pipe_shader_buffer sb = { 0 };
>
> -  sb.buffer = st_obj->buffer;
> -  sb.buffer_offset = binding->Offset;
> -  sb.buffer_size = st_obj->buffer->width0 - binding->Offset;
> +  if (st_obj && st_obj->buffer) {
> + sb.buffer = st_obj->buffer;
> + sb.buffer_offset = binding->Offset;
> + sb.buffer_size = st_obj->buffer->width0 - binding->Offset;
> +  }
>
>st->pipe->set_shader_buffers(st->pipe, shader_type,
> atomic->Binding, 1, &sb);
> --
> 2.4.10
>


Re: [Mesa-dev] [PATCH 2/2] st/mesa: use new CSO_BITS_ALL_SHADERS

2016-02-15 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Mon, Feb 15, 2016 at 6:06 PM, Brian Paul  wrote:
> ---
>  src/mesa/state_tracker/st_cb_bitmap.c | 9 +++--
>  src/mesa/state_tracker/st_cb_clear.c  | 8 ++--
>  src/mesa/state_tracker/st_cb_drawpixels.c | 8 ++--
>  src/mesa/state_tracker/st_cb_texture.c| 8 ++--
>  4 files changed, 9 insertions(+), 24 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_cb_bitmap.c 
> b/src/mesa/state_tracker/st_cb_bitmap.c
> index e27d487..4fd2dfe 100644
> --- a/src/mesa/state_tracker/st_cb_bitmap.c
> +++ b/src/mesa/state_tracker/st_cb_bitmap.c
> @@ -220,14 +220,11 @@ setup_render_state(struct gl_context *ctx,
>  CSO_BIT_FRAGMENT_SAMPLERS |
>  CSO_BIT_FRAGMENT_SAMPLER_VIEWS |
>  CSO_BIT_VIEWPORT |
> -CSO_BIT_FRAGMENT_SHADER |
>  CSO_BIT_STREAM_OUTPUTS |
> -CSO_BIT_TESSCTRL_SHADER |
> -CSO_BIT_TESSEVAL_SHADER |
> -CSO_BIT_GEOMETRY_SHADER |
>  CSO_BIT_VERTEX_ELEMENTS |
> -CSO_BIT_VERTEX_SHADER |
> -CSO_BIT_AUX_VERTEX_BUFFER_SLOT));
> +CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
> +CSO_BITS_ALL_SHADERS));
> +
>
> /* rasterizer state: just scissor */
> st->bitmap.rasterizer.scissor = ctx->Scissor.EnableFlags & 1;
> diff --git a/src/mesa/state_tracker/st_cb_clear.c 
> b/src/mesa/state_tracker/st_cb_clear.c
> index 01f1c05..5580146 100644
> --- a/src/mesa/state_tracker/st_cb_clear.c
> +++ b/src/mesa/state_tracker/st_cb_clear.c
> @@ -203,14 +203,10 @@ clear_with_quad(struct gl_context *ctx, unsigned 
> clear_buffers)
>  CSO_BIT_SAMPLE_MASK |
>  CSO_BIT_MIN_SAMPLES |
>  CSO_BIT_VIEWPORT |
> -CSO_BIT_FRAGMENT_SHADER |
>  CSO_BIT_STREAM_OUTPUTS |
> -CSO_BIT_VERTEX_SHADER |
> -CSO_BIT_TESSCTRL_SHADER |
> -CSO_BIT_TESSEVAL_SHADER |
> -CSO_BIT_GEOMETRY_SHADER |
>  CSO_BIT_VERTEX_ELEMENTS |
> -CSO_BIT_AUX_VERTEX_BUFFER_SLOT));
> +CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
> +CSO_BITS_ALL_SHADERS));
>
> /* blend state: RGBA masking */
> {
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 9c955a5..d1fe330 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -479,14 +479,10 @@ draw_textured_quad(struct gl_context *ctx, GLint x, 
> GLint y, GLfloat z,
>   CSO_BIT_VIEWPORT |
>   CSO_BIT_FRAGMENT_SAMPLERS |
>   CSO_BIT_FRAGMENT_SAMPLER_VIEWS |
> - CSO_BIT_FRAGMENT_SHADER |
>   CSO_BIT_STREAM_OUTPUTS |
> - CSO_BIT_VERTEX_SHADER |
> - CSO_BIT_TESSCTRL_SHADER |
> - CSO_BIT_TESSEVAL_SHADER |
> - CSO_BIT_GEOMETRY_SHADER |
>   CSO_BIT_VERTEX_ELEMENTS |
> - CSO_BIT_AUX_VERTEX_BUFFER_SLOT);
> + CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
> + CSO_BITS_ALL_SHADERS);
> if (write_stencil) {
>cso_state_mask |= (CSO_BIT_DEPTH_STENCIL_ALPHA |
>   CSO_BIT_BLEND);
> diff --git a/src/mesa/state_tracker/st_cb_texture.c 
> b/src/mesa/state_tracker/st_cb_texture.c
> index 5f76e44..a06cc72 100644
> --- a/src/mesa/state_tracker/st_cb_texture.c
> +++ b/src/mesa/state_tracker/st_cb_texture.c
> @@ -1341,12 +1341,8 @@ try_pbo_upload_common(struct gl_context *ctx,
>  CSO_BIT_VIEWPORT |
>  CSO_BIT_BLEND |
>  CSO_BIT_RASTERIZER |
> -CSO_BIT_VERTEX_SHADER |
> -CSO_BIT_GEOMETRY_SHADER |
> -CSO_BIT_TESSCTRL_SHADER |
> -CSO_BIT_TESSEVAL_SHADER |
> -CSO_BIT_FRAGMENT_SHADER |
> -CSO_BIT_STREAM_OUTPUTS));
> +CSO_BIT_STREAM_OUTPUTS |
> +CSO_BITS_ALL_SHADERS));
> cso_save_constant_buffer_slot0(cso, PIPE_SHADER_FRAGMENT);
>
>
> --
> 1.9.1
>


Re: [Mesa-dev] [PATCH 2/6] st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap

2016-02-15 Thread Marek Olšák
On Sun, Feb 14, 2016 at 3:47 PM, Brian Paul  wrote:
> On 02/13/2016 01:03 PM, Ilia Mirkin wrote:
>>
>> On Fri, Feb 12, 2016 at 10:43 AM, Brian Paul  wrote:
>>>
>>> diff --git a/src/mesa/state_tracker/st_context.c
>>> b/src/mesa/state_tracker/st_context.c
>>> index 9016846..cb2c390 100644
>>> --- a/src/mesa/state_tracker/st_context.c
>>> +++ b/src/mesa/state_tracker/st_context.c
>>> @@ -241,16 +241,23 @@ st_create_context_priv( struct gl_context *ctx,
>>> struct pipe_context *pipe,
>>>  else
>>> st->internal_target = PIPE_TEXTURE_RECT;
>>>
>>> -   /* Vertex element objects used for drawing rectangles for glBitmap,
>>> -* glDrawPixels, glClear, etc.
>>> +   /* Setup vertex element info for 'struct st_util_vertex'.
>>>   */
>>> -   for (i = 0; i < ARRAY_SIZE(st->velems_util_draw); i++) {
>>> -  memset(&st->velems_util_draw[i], 0, sizeof(struct
>>> pipe_vertex_element));
>>> -  st->velems_util_draw[i].src_offset = i * 4 * sizeof(float);
>>> -  st->velems_util_draw[i].instance_divisor = 0;
>>> -  st->velems_util_draw[i].vertex_buffer_index =
>>> -cso_get_aux_vertex_buffer_slot(st->cso_context);
>>> -  st->velems_util_draw[i].src_format =
>>> PIPE_FORMAT_R32G32B32A32_FLOAT;
>>> +   {
>>> +  const unsigned slot =
>>> cso_get_aux_vertex_buffer_slot(st->cso_context);
>>
>>
>> Can the aux vertex buffer slot change over time? If so, you need some
>> logic to update the vertex_buffer_index for these. From what I can
>> tell it's always 0, not sure what the intention behind it is... seems
>> like it'll be a very annoying problem to debug down the line should it
>> ever change. Thoughts?
>
>
> It's hard-wired to zero as you say but I imagine it could be computed by
> examining the current vertex buffer bindings state to find a free slot such
> that we might be able to avoid saving/restoring all the vertex buffer
> bindings.  I believe Marek wrote the code in question.

Initially, I wanted to hardcode the aux vertex buffer slot to 15, but
it broke Draw. Setting it to 0 worked.

I don't think it's important as long as only one vertex buffer is
saved/restored and not all of them.

Marek


[Mesa-dev] [PATCH] mesa: add GL_OES_texture_border_clamp support

2016-02-15 Thread Ilia Mirkin
Only minor differences to the existing ARB_texture_border_clamp support.

Signed-off-by: Ilia Mirkin 
---

I get 53 failures (and 548 passes) in the dEQP tests, they appear to expect
all-red for depth texturing while gallium apparently returns gray. Haven't
figured out if it's the fault of the tests or the implementation.

(I also had to claim it was the EXT version of the ext, and hack up dEQP to
pull the *OES functions instead of the *EXT ones.)

 docs/GL3.txt|  2 +-
 src/mapi/glapi/gen/es_EXT.xml   | 58 -
 src/mesa/main/extensions_table.h|  1 +
 src/mesa/main/samplerobj.c  |  6 ++--
 src/mesa/main/tests/dispatch_sanity.cpp | 10 ++
 src/mesa/main/texparam.c| 11 ---
 6 files changed, 80 insertions(+), 8 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index ea7ceef..0957247 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -253,7 +253,7 @@ GLES3.2, GLSL ES 3.2
   GL_OES_shader_io_blocks  not started (based on 
parts of GLSL 1.50, which is done)
   GL_OES_shader_multisample_interpolation  not started (based on 
parts of GL_ARB_gpu_shader5, which is done)
   GL_OES_tessellation_shader   not started (based on 
GL_ARB_tessellation_shader, which is done for some drivers)
-  GL_OES_texture_border_clamp  not started (based on 
GL_ARB_texture_border_clamp, which is done)
+  GL_OES_texture_border_clamp  DONE (all drivers)
   GL_OES_texture_buffernot started (based on 
GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and 
GL_ARB_texture_buffer_object_rgb32 that are all done)
   GL_OES_texture_cube_map_arraynot started (based on 
GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8  not started (based on 
GL_ARB_texture_stencil8, which is done for some drivers)
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index 86df980..fb0ef05 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -982,5 +982,61 @@
 
 
 
-  
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
+  
+  
+
+
+
+  
+  
+  
+
+
+
+  
+  
+  
+
+
+
+  
+  
+  
+
+
+
+
 
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index d1e3a99..b07d635 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -333,6 +333,7 @@ EXT(OES_stencil8, dummy_true
 EXT(OES_stencil_wrap, dummy_true   
  ,  x ,  x , ES1,  x , 2002)
 EXT(OES_surfaceless_context , dummy_true   
  ,  x ,  x , ES1, ES2, 2012)
 EXT(OES_texture_3D  , dummy_true   
  ,  x ,  x ,  x , ES2, 2005)
+EXT(OES_texture_border_clamp, ARB_texture_border_clamp 
  ,  x ,  x ,  x , ES2, 2014)
 EXT(OES_texture_cube_map, ARB_texture_cube_map 
  ,  x ,  x , ES1,  x , 2007)
 EXT(OES_texture_env_crossbar, ARB_texture_env_crossbar 
  ,  x ,  x , ES1,  x , 2005)
 EXT(OES_texture_float   , OES_texture_float
  ,  x ,  x ,  x , ES2, 2005)
diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c
index fe15508..ca366d9 100644
--- a/src/mesa/main/samplerobj.c
+++ b/src/mesa/main/samplerobj.c
@@ -1518,7 +1518,8 @@ _mesa_GetSamplerParameterIiv(GLuint sampler, GLenum 
pname, GLint *params)
 
sampObj = _mesa_lookup_samplerobj(ctx, sampler);
if (!sampObj) {
-  _mesa_error(ctx, GL_INVALID_VALUE,
+  _mesa_error(ctx, (_mesa_is_gles(ctx) ?
+GL_INVALID_OPERATION : GL_INVALID_VALUE),
   "glGetSamplerParameterIiv(sampler %u)",
   sampler);
   return;
@@ -1593,7 +1594,8 @@ _mesa_GetSamplerParameterIuiv(GLuint sampler, GLenum 
pname, GLuint *params)
 
sampObj = _mesa_lookup_samplerobj(ctx, sampler);
if (!sampObj) {
-  _mesa_error(ctx, GL_INVALID_VALUE,
+  _mesa_error(ctx, (_mesa_is_gles(ctx) ?
+GL_INVALID_OPERATION : GL_INVALID_VALUE),
   "glGetSamplerParameterIuiv(sampler %u)",
   sampler);
   return;
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index e641296..24e3d18 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -2436,6 +2436,16 @@ const struct 

Re: [Mesa-dev] [PATCH 2/2] glsl: set user defined varyings to smooth by default

2016-02-15 Thread Timothy Arceri
On Tue, 2016-02-16 at 09:04 +1100, Timothy Arceri wrote:
> On Mon, 2016-02-15 at 16:12 +0100, Iago Toral wrote:
> > On Mon, 2016-02-15 at 18:38 +1100, Timothy Arceri wrote:
> > > This is usually handled by the backends in order to handle the
> > > various interactions with the gl_*Color built-ins.
> > > 
> > > The problem is this means linking will fail if one side on the
> > > interface adds the smooth qualifier to the varying and the other
> > > side just uses the default even though they match.
> > > 
> > > This fixes various deqp tests and should have no impact on
> > > built-ins as they generate GLSL IR directly.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
> > > ---
> > >  src/compiler/glsl/ast_to_hir.cpp | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > > b/src/compiler/glsl/ast_to_hir.cpp
> > > index b639378..47d52ee 100644
> > > --- a/src/compiler/glsl/ast_to_hir.cpp
> > > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > > @@ -2750,6 +2750,11 @@ interpret_interpolation_qualifier(const
> > > struct ast_type_qualifier *qual,
> > >    "vertex shader inputs or fragment
> > > shader
> > > outputs",
> > >    interpolation_string(interpolation));
> > >    }
> > > +   } else if ((mode == ir_var_shader_in &&
> > > +   state->stage != MESA_SHADER_VERTEX) ||
> > > +  (mode == ir_var_shader_out &&
> > > +   state->stage != MESA_SHADER_FRAGMENT)) {
> > > +  interpolation = INTERP_QUALIFIER_SMOOTH;
> > > }
> > 
> > The GLES spec explicitly says that in the absence of an interp
> > qualifier smooth is used, but I can't find the same statement in the
> > desktop GLSL spec. Should we make this ES specific?
> 
> I couldn't find it in the spec either; that's why I didn't send this
> out last year when I wrote it. However, the OpenGL wiki says that's
> the default, and that's what our implementation does after validation.
> 
> I'll write a piglit test to see what AMD and Nvidia do on the desktop
> just to be sure.

They are inconsistent, and sometimes link successfully even when they
should fail, so I just sent a patch that makes this change for ES only.

> 
> Thanks for taking a look.
> 
> > 
> > Iago
> > 


[Mesa-dev] [PATCH] glsl: set user defined varyings to smooth by default in ES

2016-02-15 Thread Timothy Arceri
This is usually handled by the backends in order to handle the
various interactions with the gl_*Color built-ins.

The problem is this means linking will fail if one side on the
interface adds the smooth qualifier to the varying and the other
side just uses the default even though they match.

This fixes various deqp tests. The spec is not clear on what to do for
desktop GL, so leave it as is for now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
---
 src/compiler/glsl/ast_to_hir.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index b639378..4203cd5 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2750,6 +2750,17 @@ interpret_interpolation_qualifier(const struct 
ast_type_qualifier *qual,
   "vertex shader inputs or fragment shader outputs",
   interpolation_string(interpolation));
   }
+   } else if (state->es_shader &&
+  ((mode == ir_var_shader_in &&
+state->stage != MESA_SHADER_VERTEX) ||
+   (mode == ir_var_shader_out &&
+state->stage != MESA_SHADER_FRAGMENT))) {
+  /* From Section 4.3.9 (Interpolation) of the GLSL ES spec:
+   *
+   *" When no interpolation qualifier is present, smooth interpolation
+   *is used."
+   */
+  interpolation = INTERP_QUALIFIER_SMOOTH;
}
 
return interpolation;
-- 
2.5.0



[Mesa-dev] [PATCH 20/25] radeonsi: enable compiling one variant per shader

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

Shader stats from VERDE:

Default scheduler:

Totals:
SGPRS: 491272 -> 488672 (-0.53 %)
VGPRS: 289980 -> 311093 (7.28 %)
Code Size: 11091656 -> 11219948 (1.16 %) bytes
LDS: 97 -> 97 (0.00 %) blocks
Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave
Max Waves: 78063 -> 77352 (-0.91 %)
Wait states: 0 -> 0 (0.00 %)

Looking at some of the worst regressions, I get:
- The VGPR increase seems to be caused by the fact that if PS has used less
  than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20.
  However, the wave count remains at 10 if VGPRs <= 24, so no harm there.
- The scratch increase seems to be caused by SGPR spilling.
  The unnecessary SGPR spilling has been an ongoing issue with the compiler
  and it's completely fixable by rematerializing s_loads or reordering
  instructions.

SI scheduler:

Totals:
SGPRS: 374848 -> 374576 (-0.07 %)
VGPRS: 284456 -> 307515 (8.11 %)
Code Size: 11433068 -> 11535452 (0.90 %) bytes
LDS: 97 -> 97 (0.00 %) blocks
Scratch: 509952 -> 522240 (2.41 %) bytes per wave
Max Waves: 79456 -> 78217 (-1.56 %)
Wait states: 0 -> 0 (0.00 %)

VGPRs - same story as before. The SI scheduler doesn't spill SGPRs as much
and generally spills far less than the default scheduler
(522240 vs 2246656 bytes of scratch per wave).
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 1 +
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c| 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 324d271..ea02827 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -411,6 +411,7 @@ static const struct debug_named_value 
common_debug_options[] = {
{ "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." },
{ "norbplus", DBG_NO_RB_PLUS, "Disable RB+ on Stoney." },
{ "sisched", DBG_SI_SCHED, "Enable LLVM SI Machine Instruction 
Scheduler." },
+   { "mono", DBG_MONOLITHIC_SHADERS, "Use old-style monolithic shaders 
compiled on demand" },
 
DEBUG_NAMED_VALUE_END /* must be last */
 };
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index e92df87..ee173d3 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -89,6 +89,7 @@
 #define DBG_NO_DCC_CLEAR   (1llu << 44)
 #define DBG_NO_RB_PLUS (1llu << 45)
 #define DBG_SI_SCHED   (1llu << 46)
+#define DBG_MONOLITHIC_SHADERS (1llu << 47)
 
 #define R600_MAP_BUFFER_ALIGNMENT 64
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 44f6047..75d4775 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -622,7 +622,9 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
pipe_mutex_init(sscreen->shader_parts_mutex);
-   sscreen->use_monolithic_shaders = true;
+   sscreen->use_monolithic_shaders =
+   HAVE_LLVM < 0x0308 ||
+   (sscreen->b.debug_flags & DBG_MONOLITHIC_SHADERS) != 0;
 
if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | 
DBG_CS;
-- 
2.5.0



[Mesa-dev] [PATCH 18/25] radeonsi: compile non-GS middle parts of shaders immediately if enabled

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

Still disabled.

Only prologs & epilogs are compiled in draw calls, but each variant of those
is compiled only once per process.

VS is always compiled as hw VS.
TES is always compiled as hw VS.

LS and ES stages are always compiled on demand.
---
 src/gallium/drivers/radeonsi/si_shader.c| 58 -
 src/gallium/drivers/radeonsi/si_shader.h| 11 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 36 ---
 3 files changed, 87 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 07ea231..20069b4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4804,11 +4804,11 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
 }
 
-static int si_compile_tgsi_shader(struct si_screen *sscreen,
- LLVMTargetMachineRef tm,
- struct si_shader *shader,
- bool is_monolithic,
- struct pipe_debug_callback *debug)
+int si_compile_tgsi_shader(struct si_screen *sscreen,
+  LLVMTargetMachineRef tm,
+  struct si_shader *shader,
+  bool is_monolithic,
+  struct pipe_debug_callback *debug)
 {
struct si_shader_selector *sel = shader->selector;
struct si_shader_context ctx;
@@ -5842,15 +5842,48 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
 struct si_shader *shader,
 struct pipe_debug_callback *debug)
 {
+   struct si_shader *mainp = shader->selector->main_shader_part;
int r;
 
-   /* Compile TGSI. */
-   r = si_compile_tgsi_shader(sscreen, tm, shader,
-  sscreen->use_monolithic_shaders, debug);
-   if (r)
-   return r;
+   /* LS and ES are always compiled on demand. */
+   if (!mainp ||
+   (shader->selector->type == PIPE_SHADER_VERTEX &&
+(shader->key.vs.as_es || shader->key.vs.as_ls)) ||
+   (shader->selector->type == PIPE_SHADER_TESS_EVAL &&
+shader->key.tes.as_es)) {
+   /* Monolithic shader (compiled as a whole, has many variants,
+* may take a long time to compile).
+*/
+   r = si_compile_tgsi_shader(sscreen, tm, shader, true, debug);
+   if (r)
+   return r;
+   } else {
+   /* The shader consists of 2-3 parts:
+*
+* - the middle part is the user shader, it has 1 variant only
+*   and it was compiled during the creation of the shader
+*   selector
+* - the prolog part is inserted at the beginning
+* - the epilog part is inserted at the end
+*
+* The prolog and epilog have many (but simple) variants.
+*/
 
-   if (!sscreen->use_monolithic_shaders) {
+   /* Copy the compiled TGSI shader data over. */
+   shader->is_binary_shared = true;
+   shader->binary = mainp->binary;
+   shader->config = mainp->config;
+   shader->num_input_sgprs = mainp->num_input_sgprs;
+   shader->num_input_vgprs = mainp->num_input_vgprs;
+   shader->face_vgpr_index = mainp->face_vgpr_index;
+   memcpy(shader->vs_output_param_offset,
+  mainp->vs_output_param_offset,
+  sizeof(mainp->vs_output_param_offset));
+   shader->uses_instanceid = mainp->uses_instanceid;
+   shader->nr_pos_exports = mainp->nr_pos_exports;
+   shader->nr_param_exports = mainp->nr_param_exports;
+
+   /* Select prologs and/or epilogs. */
switch (shader->selector->type) {
case PIPE_SHADER_VERTEX:
if (!si_shader_select_vs_parts(sscreen, tm, shader, 
debug))
@@ -5915,5 +5948,6 @@ void si_shader_destroy(struct si_shader *shader)
 
r600_resource_reference(&shader->bo, NULL);
 
-   radeon_shader_binary_clean(&shader->binary);
+   if (!shader->is_binary_shared)
+   radeon_shader_binary_clean(&shader->binary);
 }
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 196fa3e..ee81621 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -181,6 +181,11 @@ struct si_shader_selector {
struct si_shader*first_variant; /* immutable after the first 
variant */
struct si_shader*last_variant; /* mutable */
 
+   /* The compiled TGSI shader expecting a 

[Mesa-dev] [PATCH 09/25] radeonsi: add code for combining and uploading shaders from 3 shader parts

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 35 
 src/gallium/drivers/radeonsi/si_shader.h |  9 
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index dbb9217..a6a0984 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4036,26 +4036,45 @@ void si_shader_apply_scratch_relocs(struct si_context 
*sctx,
 
 int si_shader_binary_upload(struct si_screen *sscreen, struct si_shader 
*shader)
 {
-   const struct radeon_shader_binary *binary = &shader->binary;
-   unsigned code_size = binary->code_size + binary->rodata_size;
+   const struct radeon_shader_binary *prolog =
+   shader->prolog ? &shader->prolog->binary : NULL;
+   const struct radeon_shader_binary *epilog =
+   shader->epilog ? &shader->epilog->binary : NULL;
+   const struct radeon_shader_binary *mainb = &shader->binary;
+   unsigned bo_size =
+   (prolog ? prolog->code_size : 0) +
+   mainb->code_size +
+   (epilog ? epilog->code_size : mainb->rodata_size);
unsigned char *ptr;
 
+   assert(!prolog || !prolog->rodata_size);
+   assert((!prolog && !epilog) || !mainb->rodata_size);
+   assert(!epilog || !epilog->rodata_size);
+
r600_resource_reference(&shader->bo, NULL);
shader->bo = si_resource_create_custom(&sscreen->b.b,
   PIPE_USAGE_IMMUTABLE,
-  code_size);
+  bo_size);
if (!shader->bo)
return -ENOMEM;
 
+   /* Upload. */
ptr = sscreen->b.ws->buffer_map(shader->bo->buf, NULL,
PIPE_TRANSFER_READ_WRITE);
-   util_memcpy_cpu_to_le32(ptr, binary->code, binary->code_size);
-   if (binary->rodata_size > 0) {
-   ptr += binary->code_size;
-   util_memcpy_cpu_to_le32(ptr, binary->rodata,
-   binary->rodata_size);
+
+   if (prolog) {
+   util_memcpy_cpu_to_le32(ptr, prolog->code, prolog->code_size);
+   ptr += prolog->code_size;
}
 
+   util_memcpy_cpu_to_le32(ptr, mainb->code, mainb->code_size);
+   ptr += mainb->code_size;
+
+   if (epilog)
+   util_memcpy_cpu_to_le32(ptr, epilog->code, epilog->code_size);
+   else if (mainb->rodata_size > 0)
+   util_memcpy_cpu_to_le32(ptr, mainb->rodata, mainb->rodata_size);
+
sscreen->b.ws->buffer_unmap(shader->bo->buf);
return 0;
 }
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 9331156..4c3c14a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -304,6 +304,9 @@ struct si_shader {
struct si_shader_selector   *selector;
struct si_shader*next_variant;
 
+   struct si_shader_part   *prolog;
+   struct si_shader_part   *epilog;
+
struct si_shader*gs_copy_shader;
struct si_pm4_state *pm4;
struct r600_resource*bo;
@@ -322,6 +325,12 @@ struct si_shader {
unsignednr_param_exports;
 };
 
+struct si_shader_part {
+   struct si_shader_part *next;
+   struct radeon_shader_binary binary;
+   struct si_shader_config config;
+};
+
 static inline struct tgsi_shader_info *si_get_vs_info(struct si_context *sctx)
 {
if (sctx->gs_shader.cso)
-- 
2.5.0



[Mesa-dev] [PATCH 12/25] radeonsi: add VS prolog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

This is disabled with use_monolithic_shaders = true.
---
 src/gallium/drivers/radeonsi/si_pipe.c   |  19 +++
 src/gallium/drivers/radeonsi/si_pipe.h   |   3 +
 src/gallium/drivers/radeonsi/si_shader.c | 236 ++-
 src/gallium/drivers/radeonsi/si_shader.h |   9 ++
 4 files changed, 266 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 448fe88..7ce9570 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -22,6 +22,7 @@
  */
 
 #include "si_pipe.h"
+#include "si_shader.h"
 #include "si_public.h"
 #include "sid.h"
 
@@ -536,6 +537,11 @@ static int si_get_shader_param(struct pipe_screen* 
pscreen, unsigned shader, enu
 static void si_destroy_screen(struct pipe_screen* pscreen)
 {
struct si_screen *sscreen = (struct si_screen *)pscreen;
+   struct si_shader_part *parts[] = {
+   sscreen->vs_prologs,
+   /* this will be filled with other shader parts */
+   };
+   unsigned i;
 
if (!sscreen)
return;
@@ -543,6 +549,18 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
if (!sscreen->b.ws->unref(sscreen->b.ws))
return;
 
+   /* Free shader parts. */
+   for (i = 0; i < ARRAY_SIZE(parts); i++) {
+   while (parts[i]) {
+   struct si_shader_part *part = parts[i];
+
+   parts[i] = part->next;
+   radeon_shader_binary_clean(&part->binary);
+   FREE(part);
+   }
+   }
+   pipe_mutex_destroy(sscreen->shader_parts_mutex);
+
r600_destroy_common_screen(&sscreen->b);
 }
 
@@ -600,6 +618,7 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
 
sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
+   pipe_mutex_init(sscreen->shader_parts_mutex);
sscreen->use_monolithic_shaders = true;
 
if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 2a2455c..f4bafc2 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -87,6 +87,9 @@ struct si_screen {
 
/* Whether shaders are monolithic (1-part) or separate (3-part). */
booluse_monolithic_shaders;
+
+   pipe_mutex  shader_parts_mutex;
+   struct si_shader_part   *vs_prologs;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b74ed1e..fbb8394 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -83,6 +83,7 @@ struct si_shader_context
int param_rel_auto_id;
int param_vs_prim_id;
int param_instance_id;
+   int param_vertex_index0;
int param_tes_u;
int param_tes_v;
int param_tes_rel_patch_id;
@@ -432,7 +433,11 @@ static void declare_input_vs(
/* Build the attribute offset */
attribute_offset = lp_build_const_int32(gallivm, 0);
 
-   if (divisor) {
+   if (!ctx->is_monolithic) {
+   buffer_index = LLVMGetParam(radeon_bld->main_fn,
+   ctx->param_vertex_index0 +
+   input_index);
+   } else if (divisor) {
/* Build index from instance ID, start instance and divisor */
ctx->shader->uses_instanceid = true;
buffer_index = get_instance_index_for_fetch(&ctx->radeon_bld,
@@ -3711,6 +3716,15 @@ static void create_function(struct si_shader_context 
*ctx)
params[ctx->param_rel_auto_id = num_params++] = ctx->i32;
params[ctx->param_vs_prim_id = num_params++] = ctx->i32;
params[ctx->param_instance_id = num_params++] = ctx->i32;
+
+   if (!ctx->is_monolithic &&
+   !ctx->is_gs_copy_shader) {
+   /* Vertex load indices. */
+   ctx->param_vertex_index0 = num_params;
+
+   for (i = 0; i < shader->selector->info.num_inputs; i++)
+   params[num_params++] = ctx->i32;
+   }
break;
 
case TGSI_PROCESSOR_TESS_CTRL:
@@ -4678,6 +4692,203 @@ out:
return r;
 }
 
+/**
+ * Create, compile and return a shader part (prolog or epilog).
+ *
+ * \param sscreen  screen
+ * \param list list of shader parts of the same category
+ * \param key  shader part key
+ * \param tm   LLVM target machine
+ * \param debugdebug callback
+ * \param compile  the callback responsible for compilation
+ * \return non-NULL on success
+ */
+static 

[Mesa-dev] [PATCH 17/25] radeonsi: rework polygon stippling for PS prolog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

Don't use the pstipple module.
---
 src/gallium/drivers/radeonsi/si_shader.c | 149 +++
 1 file changed, 110 insertions(+), 39 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index c6d4cb5..07ea231 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -128,8 +128,7 @@ static struct si_shader_context *si_shader_context(
 static void si_init_shader_ctx(struct si_shader_context *ctx,
   struct si_screen *sscreen,
   struct si_shader *shader,
-  LLVMTargetMachineRef tm,
-  struct tgsi_shader_info *info);
+  LLVMTargetMachineRef tm);
 
 /* ideally pass the sample mask input to the PS epilog as v13, which
  * is its usual location, so that the shader doesn't have to add v_mov.
@@ -211,6 +210,10 @@ static LLVMValueRef unpack_param(struct si_shader_context 
*ctx,
LLVMValueRef value = LLVMGetParam(ctx->radeon_bld.main_fn,
  param);
 
+   if (LLVMGetTypeKind(LLVMTypeOf(value)) == LLVMFloatTypeKind)
+   value = bitcast(&ctx->radeon_bld.soa.bld_base,
+   TGSI_TYPE_UNSIGNED, value);
+
if (rshift)
value = LLVMBuildLShr(gallivm->builder, value,
  lp_build_const_int32(gallivm, rshift), 
"");
@@ -2725,13 +2728,12 @@ static LLVMTypeRef const_array(LLVMTypeRef elem_type, 
int num_elements)
 /**
  * Load an image view, fmask view. or sampler state descriptor.
  */
-static LLVMValueRef get_sampler_desc(struct si_shader_context *ctx,
-LLVMValueRef index, enum desc_type type)
+static LLVMValueRef get_sampler_desc_custom(struct si_shader_context *ctx,
+   LLVMValueRef list, LLVMValueRef 
index,
+   enum desc_type type)
 {
struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm;
LLVMBuilderRef builder = gallivm->builder;
-   LLVMValueRef ptr = LLVMGetParam(ctx->radeon_bld.main_fn,
-   SI_PARAM_SAMPLERS);
 
switch (type) {
case DESC_IMAGE:
@@ -2747,12 +2749,21 @@ static LLVMValueRef get_sampler_desc(struct 
si_shader_context *ctx,
/* The sampler state is at [12:15]. */
index = LLVMBuildMul(builder, index, LLVMConstInt(ctx->i32, 4, 
0), "");
index = LLVMBuildAdd(builder, index, LLVMConstInt(ctx->i32, 3, 
0), "");
-   ptr = LLVMBuildPointerCast(builder, ptr,
-  const_array(ctx->v4i32, 0), "");
+   list = LLVMBuildPointerCast(builder, list,
+   const_array(ctx->v4i32, 0), "");
break;
}
 
-   return build_indexed_load_const(ctx, ptr, index);
+   return build_indexed_load_const(ctx, list, index);
+}
+
+static LLVMValueRef get_sampler_desc(struct si_shader_context *ctx,
+LLVMValueRef index, enum desc_type type)
+{
+   LLVMValueRef list = LLVMGetParam(ctx->radeon_bld.main_fn,
+SI_PARAM_SAMPLERS);
+
+   return get_sampler_desc_custom(ctx, list, index, type);
 }
 
 static void tex_fetch_ptrs(
@@ -3984,7 +3995,7 @@ static void create_function(struct si_shader_context *ctx)
params[SI_PARAM_FRONT_FACE] = ctx->i32;
params[SI_PARAM_ANCILLARY] = ctx->i32;
params[SI_PARAM_SAMPLE_COVERAGE] = ctx->f32;
-   params[SI_PARAM_POS_FIXED_PT] = ctx->f32;
+   params[SI_PARAM_POS_FIXED_PT] = ctx->i32;
num_params = SI_PARAM_POS_FIXED_PT+1;
 
if (!ctx->is_monolithic) {
@@ -4040,7 +4051,8 @@ static void create_function(struct si_shader_context *ctx)
  S_0286D0_LINEAR_SAMPLE_ENA(1) |
  S_0286D0_LINEAR_CENTER_ENA(1) |
  S_0286D0_LINEAR_CENTROID_ENA(1) |
- S_0286D0_FRONT_FACE_ENA(1));
+ S_0286D0_FRONT_FACE_ENA(1) |
+ S_0286D0_POS_FIXED_PT_ENA(1));
}
 
shader->num_input_sgprs = 0;
@@ -4204,6 +4216,49 @@ static void preload_ring_buffers(struct 
si_shader_context *ctx)
}
 }
 
+static void si_llvm_emit_polygon_stipple(struct si_shader_context *ctx,
+LLVMValueRef param_sampler_views,
+unsigned param_pos_fixed_pt)
+{
+   struct lp_build_tgsi_context *bld_base =
+   &ctx->radeon_bld.soa.bld_base;
+   struct 

[Mesa-dev] [PATCH 15/25] radeonsi: add PS epilog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c   |   1 +
 src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
 src/gallium/drivers/radeonsi/si_shader.c | 289 ++-
 src/gallium/drivers/radeonsi/si_shader.h |   7 +
 4 files changed, 296 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 645d418..02c430d 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -541,6 +541,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
sscreen->vs_prologs,
sscreen->vs_epilogs,
sscreen->tcs_epilogs,
+   sscreen->ps_epilogs
};
unsigned i;
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index d9175b9..5d204ec 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -92,6 +92,7 @@ struct si_screen {
struct si_shader_part   *vs_prologs;
struct si_shader_part   *vs_epilogs;
struct si_shader_part   *tcs_epilogs;
+   struct si_shader_part   *ps_epilogs;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index bc6f8cd..915ac1d 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -131,6 +131,10 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
   LLVMTargetMachineRef tm,
   struct tgsi_shader_info *info);
 
+/* ideally pass the sample mask input to the PS epilog as v13, which
+ * is its usual location, so that the shader doesn't have to add v_mov.
+ */
+#define PS_EPILOG_SAMPLEMASK_MIN_LOC 13
 #define VS_EPILOG_PRIMID_LOC 2
 
 #define PERSPECTIVE_BASE 0
@@ -2527,6 +2531,100 @@ static void si_llvm_emit_fs_epilogue(struct 
lp_build_tgsi_context *bld_base)
si_export_mrt_z(bld_base, depth, stencil, samplemask);
 }
 
+/**
+ * Return PS outputs in this order:
+ *
+ * v[0:3] = color0.xyzw
+ * v[4:7] = color1.xyzw
+ * ...
+ * vN+0 = Depth
+ * vN+1 = Stencil
+ * vN+2 = SampleMask
+ * vN+3 = SampleMaskIn (used for OpenGL smoothing)
+ *
+ * The alpha-ref SGPR is returned via its original location.
+ */
+static void si_llvm_return_fs_outputs(struct lp_build_tgsi_context *bld_base)
+{
+   struct si_shader_context *ctx = si_shader_context(bld_base);
+   struct si_shader *shader = ctx->shader;
+   struct lp_build_context *base = &bld_base->base;
+   struct tgsi_shader_info *info = &shader->selector->info;
+   LLVMBuilderRef builder = base->gallivm->builder;
+   unsigned i, j, first_vgpr, vgpr;
+
+   LLVMValueRef color[8][4] = {};
+   LLVMValueRef depth = NULL, stencil = NULL, samplemask = NULL;
+   LLVMValueRef ret;
+
+   /* Read the output values. */
+   for (i = 0; i < info->num_outputs; i++) {
+   unsigned semantic_name = info->output_semantic_name[i];
+   unsigned semantic_index = info->output_semantic_index[i];
+
+   switch (semantic_name) {
+   case TGSI_SEMANTIC_COLOR:
+   assert(semantic_index < 8);
+   for (j = 0; j < 4; j++) {
+   LLVMValueRef ptr = 
ctx->radeon_bld.soa.outputs[i][j];
+   LLVMValueRef result = LLVMBuildLoad(builder, 
ptr, "");
+   color[semantic_index][j] = result;
+   }
+   break;
+   case TGSI_SEMANTIC_POSITION:
+   depth = LLVMBuildLoad(builder,
+ 
ctx->radeon_bld.soa.outputs[i][2], "");
+   break;
+   case TGSI_SEMANTIC_STENCIL:
+   stencil = LLVMBuildLoad(builder,
+   
ctx->radeon_bld.soa.outputs[i][1], "");
+   break;
+   case TGSI_SEMANTIC_SAMPLEMASK:
+   samplemask = LLVMBuildLoad(builder,
+  
ctx->radeon_bld.soa.outputs[i][0], "");
+   break;
+   default:
+   fprintf(stderr, "Warning: SI unhandled fs output 
type:%d\n",
+   semantic_name);
+   }
+   }
+
+   /* Fill the return structure. */
+   ret = ctx->return_value;
+
+   /* Set SGPRs. */
+   ret = LLVMBuildInsertValue(builder, ret,
+  bitcast(bld_base, TGSI_TYPE_SIGNED,
+  LLVMGetParam(ctx->radeon_bld.main_fn,
+   SI_PARAM_ALPHA_REF)),
+  SI_SGPR_ALPHA_REF, 

[Mesa-dev] [PATCH 16/25] radeonsi: add PS prolog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c  |   1 +
 src/gallium/drivers/radeonsi/si_pipe.h  |   1 +
 src/gallium/drivers/radeonsi/si_shader.c| 324 +++-
 src/gallium/drivers/radeonsi/si_shader.h|  14 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c |   7 +
 5 files changed, 345 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 02c430d..44f6047 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -541,6 +541,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
sscreen->vs_prologs,
sscreen->vs_epilogs,
sscreen->tcs_epilogs,
+   sscreen->ps_prologs,
sscreen->ps_epilogs
};
unsigned i;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 5d204ec..1ac7bc4 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -92,6 +92,7 @@ struct si_screen {
struct si_shader_part   *vs_prologs;
struct si_shader_part   *vs_epilogs;
struct si_shader_part   *tcs_epilogs;
+   struct si_shader_part   *ps_prologs;
struct si_shader_part   *ps_epilogs;
 };
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 915ac1d..c6d4cb5 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -875,7 +875,8 @@ static int lookup_interp_param_index(unsigned interpolate, 
unsigned location)
 static unsigned select_interp_param(struct si_shader_context *ctx,
unsigned param)
 {
-   if (!ctx->shader->key.ps.prolog.force_persample_interp)
+   if (!ctx->shader->key.ps.prolog.force_persample_interp ||
+   !ctx->is_monolithic)
return param;
 
/* If the shader doesn't use center/centroid, just return the parameter.
@@ -1019,6 +1020,7 @@ static void declare_input_fs(
unsigned input_index,
const struct tgsi_full_declaration *decl)
 {
+   struct lp_build_context *base = &radeon_bld->soa.bld_base.base;
struct si_shader_context *ctx =
si_shader_context(&radeon_bld->soa.bld_base);
struct si_shader *shader = ctx->shader;
@@ -1026,6 +1028,26 @@ static void declare_input_fs(
LLVMValueRef interp_param = NULL;
int interp_param_idx;
 
+   /* Get colors from input VGPRs (set by the prolog). */
+   if (!ctx->is_monolithic &&
+   decl->Semantic.Name == TGSI_SEMANTIC_COLOR) {
+   unsigned i = decl->Semantic.Index;
+   unsigned colors_read = shader->selector->info.colors_read;
+   unsigned mask = colors_read >> (i * 4);
+   unsigned offset = SI_PARAM_POS_FIXED_PT + 1 +
+ (i ? util_bitcount(colors_read & 0xf) : 0);
+
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 0)] =
+   mask & 0x1 ? LLVMGetParam(main_fn, offset++) : base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 1)] =
+   mask & 0x2 ? LLVMGetParam(main_fn, offset++) : base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 2)] =
+   mask & 0x4 ? LLVMGetParam(main_fn, offset++) : base->undef;
+   radeon_bld->inputs[radeon_llvm_reg_index_soa(input_index, 3)] =
+   mask & 0x8 ? LLVMGetParam(main_fn, offset++) : base->undef;
+   return;
+   }
+
interp_param_idx = lookup_interp_param_index(decl->Interp.Interpolate,
 decl->Interp.Location);
if (interp_param_idx == -1)
@@ -3966,6 +3988,16 @@ static void create_function(struct si_shader_context 
*ctx)
num_params = SI_PARAM_POS_FIXED_PT+1;
 
if (!ctx->is_monolithic) {
+   /* Color inputs from the prolog. */
+   if (shader->selector->info.colors_read) {
+   unsigned num_color_elements =
+   util_bitcount(shader->selector->info.colors_read);
+
+   assert(num_params + num_color_elements <= ARRAY_SIZE(params));
+   for (i = 0; i < num_color_elements; i++)
+   params[num_params++] = ctx->f32;
+   }
+
/* Outputs for the epilog. */
num_return_sgprs = SI_SGPR_ALPHA_REF + 1;
num_returns =
@@ -3997,6 +4029,20 @@ static void create_function(struct si_shader_context 
*ctx)
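
A note on the color input indexing in declare_input_fs above: colors_read carries four bits per color (x, y, z, w), only components that are actually read get an input VGPR, and COLOR1 starts right after the components COLOR0 consumed. The sketch below redoes that arithmetic outside the driver. It is only an illustration under stated assumptions: demo_bitcount, demo_color_param_indices and first_color_param are hypothetical names, -1 stands in for base->undef, and the restriction to two colors is an assumption read off the "(i ? ... : 0)" term.

```c
#include <assert.h>

/* Portable population count; a stand-in for util_bitcount. */
static unsigned demo_bitcount(unsigned v)
{
	unsigned n = 0;

	while (v) {
		n += v & 1;
		v >>= 1;
	}
	return n;
}

/* For color `color_index`, write the parameter index of each of the four
 * components into out[], or -1 for components the shader never reads.
 * `first_color_param` plays the role of SI_PARAM_POS_FIXED_PT + 1. */
static void demo_color_param_indices(unsigned colors_read,
				     unsigned color_index,
				     unsigned first_color_param,
				     int out[4])
{
	unsigned mask, offset, chan;

	/* Assumption: only COLOR0 and COLOR1 exist. */
	assert(color_index < 2);

	mask = colors_read >> (color_index * 4);
	offset = first_color_param +
		 (color_index ? demo_bitcount(colors_read & 0xf) : 0);

	for (chan = 0; chan < 4; chan++)
		out[chan] = (mask & (1u << chan)) ? (int)offset++ : -1;
}
```

With colors_read = 0x35 (COLOR0 reads x and z, COLOR1 reads x and y) and first_color_param = 10, COLOR0 maps to parameters 10 and 11 and COLOR1 to 12 and 13, so the parameter list stays densely packed.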

[Mesa-dev] [PATCH 21/25] radeonsi: use smaller types for some si_shader members

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

in order to decrease the shader size for a shader cache.
---
 src/gallium/drivers/radeonsi/si_shader.c | 3 +++
 src/gallium/drivers/radeonsi/si_shader.h | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 2789788..3758009 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1889,6 +1889,7 @@ handle_semantic:
case TGSI_SEMANTIC_COLOR:
case TGSI_SEMANTIC_BCOLOR:
target = V_008DFC_SQ_EXP_PARAM + param_count;
+   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[i] = param_count;
param_count++;
break;
@@ -1903,6 +1904,7 @@ handle_semantic:
case TGSI_SEMANTIC_TEXCOORD:
case TGSI_SEMANTIC_GENERIC:
target = V_008DFC_SQ_EXP_PARAM + param_count;
+   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[i] = param_count;
param_count++;
break;
@@ -5268,6 +5270,7 @@ static bool si_get_vs_epilog(struct si_screen *sscreen,
unsigned offset = shader->nr_param_exports++;
 
epilog_key.vs_epilog.prim_id_param_offset = offset;
+   assert(index < ARRAY_SIZE(shader->vs_output_param_offset));
shader->vs_output_param_offset[index] = offset;
}
 
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index ee81621..a77e54a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -359,10 +359,10 @@ struct si_shader {
ubyte   num_input_vgprs;
char    face_vgpr_index;
 
-   unsigned    vs_output_param_offset[PIPE_MAX_SHADER_OUTPUTS];
+   ubyte   vs_output_param_offset[40];
bool    uses_instanceid;
-   unsigned    nr_pos_exports;
-   unsigned    nr_param_exports;
+   ubyte   nr_pos_exports;
+   ubyte   nr_param_exports;
 };
 
 struct si_shader_part {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
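
A note on the patch above: shrinking vs_output_param_offset to a 40-entry ubyte array is only safe while every index stays inside the array, which is what the added assertions check. The sketch below shows the same pattern in isolation. It is not driver code: demo_shader, DEMO_ARRAY_SIZE and demo_set_param_offset are hypothetical names, and the extra check that the stored value fits in a byte is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the narrowed si_shader members. */
struct demo_shader {
	uint8_t vs_output_param_offset[40];
	uint8_t nr_param_exports;
};

/* Record the export offset of output `index`. Both the index and the
 * value are checked, since either can overflow the narrowed storage. */
static void demo_set_param_offset(struct demo_shader *shader,
				  unsigned index, unsigned offset)
{
	assert(index < DEMO_ARRAY_SIZE(shader->vs_output_param_offset));
	assert(offset <= UINT8_MAX);
	shader->vs_output_param_offset[index] = (uint8_t)offset;
}
```

The assertions compile away in release builds, so they document the limit without adding cost to the hot path.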


[Mesa-dev] [PATCH 25/25] radeonsi: implement binary shaders & shader cache in memory

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c  |   5 +-
 src/gallium/drivers/radeonsi/si_pipe.h  |  16 ++
 src/gallium/drivers/radeonsi/si_shader.h|   4 +-
 src/gallium/drivers/radeonsi/si_state.h |   2 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 234 +++-
 5 files changed, 254 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 75d4775..a576237 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -563,7 +563,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
}
}
pipe_mutex_destroy(sscreen->shader_parts_mutex);
-
+   si_destroy_shader_cache(sscreen);
r600_destroy_common_screen(&sscreen->b);
 }
 
@@ -611,7 +611,8 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
sscreen->b.b.resource_create = r600_resource_create_common;
 
if (!r600_common_screen_init(&sscreen->b, ws) ||
-   !si_init_gs_info(sscreen)) {
+   !si_init_gs_info(sscreen) ||
+   !si_init_shader_cache(sscreen)) {
FREE(sscreen);
return NULL;
}
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 1ac7bc4..ef860a5 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -80,6 +80,7 @@
 #define SI_MAX_BORDER_COLORS   4096
 
 struct si_compute;
+struct hash_table;
 
 struct si_screen {
struct r600_common_screen   b;
@@ -94,6 +95,21 @@ struct si_screen {
struct si_shader_part   *tcs_epilogs;
struct si_shader_part   *ps_prologs;
struct si_shader_part   *ps_epilogs;
+
+   /* Shader cache in memory.
+*
+* Design & limitations:
+* - The shader cache is per screen (= per process), never saved to
+*   disk, and skips redundant shader compilations from TGSI to bytecode.
+* - It can only be used with one-variant-per-shader support, in which
+*   case only the main (typically middle) part of shaders is cached.
+* - Only VS, TCS, TES, PS are cached, out of which only the hw VS
+*   variants of VS and TES are cached, so LS and ES aren't.
+* - GS and CS aren't cached, but it's certainly possible to cache
+*   those as well.
+*/
+   pipe_mutex  shader_cache_mutex;
+   struct hash_table   *shader_cache;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 48e048d..7e46871 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -362,8 +362,10 @@ struct si_shader {
struct r600_resource*bo;
struct r600_resource*scratch_bo;
union si_shader_key key;
-   struct radeon_shader_binary binary;
bool    is_binary_shared;
+
+   /* The following data is all that's needed for binary shaders. */
+   struct radeon_shader_binary binary;
struct si_shader_config config;
struct si_shader_info   info;
 };
diff --git a/src/gallium/drivers/radeonsi/si_state.h 
b/src/gallium/drivers/radeonsi/si_state.h
index f64c4d4..40792cb 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -280,6 +280,8 @@ si_create_sampler_view_custom(struct pipe_context *ctx,
 /* si_state_shader.c */
 bool si_update_shaders(struct si_context *sctx);
 void si_init_shader_functions(struct si_context *sctx);
+bool si_init_shader_cache(struct si_screen *sscreen);
+void si_destroy_shader_cache(struct si_screen *sscreen);
 
 /* si_state_draw.c */
 void si_emit_cache_flush(struct si_context *sctx, struct r600_atom *atom);
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c 
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index c62cbb7..bc3e5be 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -32,10 +32,217 @@
 
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_ureg.h"
+#include "util/hash_table.h"
+#include "util/u_hash.h"
 #include "util/u_memory.h"
 #include "util/u_prim.h"
 #include "util/u_simple_shaders.h"
 
+/* SHADER_CACHE */
+
+/**
+ * Return the TGSI binary in a buffer. The first 4 bytes contain its size as
+ * integer.
+ */
+static void *si_get_tgsi_binary(struct si_shader_selector *sel)
+{
+   unsigned tgsi_size = tgsi_num_tokens(sel->tokens) *
+sizeof(struct tgsi_token);
+   unsigned size = 4 + tgsi_size + sizeof(sel->so);
+   char *result = (char*)MALLOC(size);
+
+   if (!result)
+   return NULL;
+
+   
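
The diff is cut off here, so the rest of si_get_tgsi_binary is not shown. As a separate illustration of the layout its comment describes (a buffer whose first 4 bytes hold its size), here is a minimal sketch. It is not the remainder of the function: demo_pack_blob and demo_blob_equal are hypothetical helpers, and using the size prefix to compare two blobs is an assumption about how such a buffer could serve as a cache key.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy `payload` into a newly allocated buffer whose first 4 bytes hold
 * the total buffer size. Returns NULL if the allocation fails. */
static void *demo_pack_blob(const void *payload, uint32_t payload_size)
{
	uint32_t size;
	char *result;

	assert(payload_size <= UINT32_MAX - 4);
	size = 4 + payload_size;

	result = malloc(size);
	if (!result)
		return NULL;

	memcpy(result, &size, 4);
	memcpy(result + 4, payload, payload_size);
	return result;
}

/* Two packed blobs match when their size prefixes and bytes agree.
 * The prefix makes each blob self-describing, so no length argument
 * has to be passed alongside the pointer. */
static int demo_blob_equal(const void *a, const void *b)
{
	uint32_t size_a, size_b;

	memcpy(&size_a, a, 4);
	memcpy(&size_b, b, 4);
	return size_a == size_b && memcmp(a, b, size_a) == 0;
}
```

The caller owns the returned buffer and releases it with free().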

[Mesa-dev] [PATCH 24/25] gallium/radeon: remove unused radeon_shader_binary_free_* functions

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeon/radeon_elf_util.c | 19 ---
 src/gallium/drivers/radeon/radeon_elf_util.h | 14 --
 2 files changed, 33 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_elf_util.c 
b/src/gallium/drivers/radeon/radeon_elf_util.c
index 70a2c4d..8aaa85d 100644
--- a/src/gallium/drivers/radeon/radeon_elf_util.c
+++ b/src/gallium/drivers/radeon/radeon_elf_util.c
@@ -195,22 +195,3 @@ const unsigned char *radeon_shader_binary_config_start(
}
return binary->config;
 }
-
-void radeon_shader_binary_free_relocs(struct radeon_shader_reloc *relocs,
-   unsigned reloc_count)
-{
-   FREE(relocs);
-}
-
-void radeon_shader_binary_free_members(struct radeon_shader_binary *binary,
-   unsigned free_relocs)
-{
-   FREE(binary->code);
-   FREE(binary->config);
-   FREE(binary->rodata);
-
-   if (free_relocs) {
-   radeon_shader_binary_free_relocs(binary->relocs,
-   binary->reloc_count);
-   }
-}
diff --git a/src/gallium/drivers/radeon/radeon_elf_util.h 
b/src/gallium/drivers/radeon/radeon_elf_util.h
index ea4ab2f..c2af9e0 100644
--- a/src/gallium/drivers/radeon/radeon_elf_util.h
+++ b/src/gallium/drivers/radeon/radeon_elf_util.h
@@ -47,18 +47,4 @@ const unsigned char *radeon_shader_binary_config_start(
const struct radeon_shader_binary *binary,
uint64_t symbol_offset);
 
-/**
- * Free all memory allocated for members of \p binary.  This function does
- * not free \p binary.
- *
- * @param free_relocs If false, reolc information will not be freed.
- */
-void radeon_shader_binary_free_members(struct radeon_shader_binary *binary,
-   unsigned free_relocs);
-
-/**
- * Free \p relocs and all member data.
- */
-void radeon_shader_binary_free_relocs(struct radeon_shader_reloc *relocs,
-   unsigned reloc_count);
 #endif /* RADEON_ELF_UTIL_H */
-- 
2.5.0



[Mesa-dev] [PATCH 23/25] radeonsi: make radeon_shader_reloc name string fixed-sized

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

This will simplify implementations of binary shaders.
---
 src/gallium/drivers/radeon/r600_pipe_common.h | 2 +-
 src/gallium/drivers/radeon/radeon_elf_util.c  | 7 ++-
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index ee173d3..7df6177 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -97,7 +97,7 @@ struct r600_common_context;
 struct r600_perfcounters;
 
 struct radeon_shader_reloc {
-   char *name;
+   char name[32];
uint64_t offset;
 };
 
diff --git a/src/gallium/drivers/radeon/radeon_elf_util.c 
b/src/gallium/drivers/radeon/radeon_elf_util.c
index 2e45d43..70a2c4d 100644
--- a/src/gallium/drivers/radeon/radeon_elf_util.c
+++ b/src/gallium/drivers/radeon/radeon_elf_util.c
@@ -98,7 +98,8 @@ static void parse_relocs(Elf *elf, Elf_Data *relocs, Elf_Data 
*symbols,
symbol_name = elf_strptr(elf, symbol_sh_link, symbol.st_name);
 
reloc->offset = rel.r_offset;
-   reloc->name = strdup(symbol_name);
+   strncpy(reloc->name, symbol_name, sizeof(reloc->name)-1);
+   reloc->name[sizeof(reloc->name)-1] = 0;
}
 }
 
@@ -198,10 +199,6 @@ const unsigned char *radeon_shader_binary_config_start(
 void radeon_shader_binary_free_relocs(struct radeon_shader_reloc *relocs,
unsigned reloc_count)
 {
-   unsigned i;
-   for (i = 0; i < reloc_count; i++) {
-   FREE(relocs[i].name);
-   }
FREE(relocs);
 }
 
-- 
2.5.0
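
A note on the strncpy change above: strncpy does not write a terminator when the source fills the whole destination, which is why the patch copies at most sizeof(name) - 1 bytes and then stores the final 0 itself. The sketch below isolates that idiom. demo_reloc and demo_set_reloc_name are hypothetical names that mirror the patched struct, and the symbol names in the usage are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the patched radeon_shader_reloc layout. */
struct demo_reloc {
	char name[32];
	uint64_t offset;
};

/* Copy `symbol_name` into the fixed-size field. Names longer than
 * 31 characters are truncated; the result is always terminated. */
static void demo_set_reloc_name(struct demo_reloc *reloc,
				const char *symbol_name)
{
	assert(reloc && symbol_name);
	strncpy(reloc->name, symbol_name, sizeof(reloc->name) - 1);
	reloc->name[sizeof(reloc->name) - 1] = 0;
}
```

Because the name now lives inside the struct, an array of relocations is one flat allocation with no per-entry pointers to free, which matches the removal of the FREE loop in the same patch. The cost is silent truncation of long symbol names.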



[Mesa-dev] [PATCH 14/25] radeonsi: add TCS epilog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c   |   1 +
 src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
 src/gallium/drivers/radeonsi/si_shader.c | 163 ---
 src/gallium/drivers/radeonsi/si_shader.h |   3 +
 4 files changed, 155 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 2b5ce3a..645d418 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -540,6 +540,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
struct si_shader_part *parts[] = {
sscreen->vs_prologs,
sscreen->vs_epilogs,
+   sscreen->tcs_epilogs,
};
unsigned i;
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 8d98779..d9175b9 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -91,6 +91,7 @@ struct si_screen {
pipe_mutex  shader_parts_mutex;
struct si_shader_part   *vs_prologs;
struct si_shader_part   *vs_epilogs;
+   struct si_shader_part   *tcs_epilogs;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0085c43..bc6f8cd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -109,9 +109,11 @@ struct si_shader_context
LLVMTypeRef i1;
LLVMTypeRef i8;
LLVMTypeRef i32;
+   LLVMTypeRef i64;
LLVMTypeRef i128;
LLVMTypeRef f32;
LLVMTypeRef v16i8;
+   LLVMTypeRef v2i32;
LLVMTypeRef v4i32;
LLVMTypeRef v4f32;
LLVMTypeRef v8i32;
@@ -2078,14 +2080,51 @@ static void si_write_tess_factors(struct 
lp_build_tgsi_context *bld_base,
 static void si_llvm_emit_tcs_epilogue(struct lp_build_tgsi_context *bld_base)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
-   LLVMValueRef invocation_id;
+   LLVMValueRef rel_patch_id, invocation_id, tf_lds_offset;
 
+   rel_patch_id = get_rel_patch_id(ctx);
invocation_id = unpack_param(ctx, SI_PARAM_REL_IDS, 8, 5);
+   tf_lds_offset = get_tcs_out_current_patch_data_offset(ctx);
 
-   si_write_tess_factors(bld_base,
- get_rel_patch_id(ctx),
- invocation_id,
- get_tcs_out_current_patch_data_offset(ctx));
+   if (!ctx->is_monolithic) {
+   /* Return epilog parameters from this function. */
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
+   LLVMValueRef ret = ctx->return_value;
+   LLVMValueRef rw_buffers, rw0, rw1, tf_soffset;
+   unsigned vgpr;
+
+   /* RW_BUFFERS pointer */
+   rw_buffers = LLVMGetParam(ctx->radeon_bld.main_fn,
+ SI_PARAM_RW_BUFFERS);
+   rw_buffers = LLVMBuildPtrToInt(builder, rw_buffers, ctx->i64, "");
+   rw_buffers = LLVMBuildBitCast(builder, rw_buffers, ctx->v2i32, "");
+   rw0 = LLVMBuildExtractElement(builder, rw_buffers,
+ bld_base->uint_bld.zero, "");
+   rw1 = LLVMBuildExtractElement(builder, rw_buffers,
+ bld_base->uint_bld.one, "");
+   ret = LLVMBuildInsertValue(builder, ret, rw0, 0, "");
+   ret = LLVMBuildInsertValue(builder, ret, rw1, 1, "");
+
+   /* Tess factor buffer soffset is after user SGPRs. */
+   tf_soffset = LLVMGetParam(ctx->radeon_bld.main_fn,
+ SI_PARAM_TESS_FACTOR_OFFSET);
+   ret = LLVMBuildInsertValue(builder, ret, tf_soffset,
+  SI_TCS_NUM_USER_SGPR, "");
+
+   /* VGPRs */
+   rel_patch_id = bitcast(bld_base, TGSI_TYPE_FLOAT, rel_patch_id);
+   invocation_id = bitcast(bld_base, TGSI_TYPE_FLOAT, invocation_id);
+   tf_lds_offset = bitcast(bld_base, TGSI_TYPE_FLOAT, tf_lds_offset);
+
+   vgpr = SI_TCS_NUM_USER_SGPR + 1;
+   ret = LLVMBuildInsertValue(builder, ret, rel_patch_id, vgpr++, "");
+   ret = LLVMBuildInsertValue(builder, ret, invocation_id, vgpr++, "");
+   ret = LLVMBuildInsertValue(builder, ret, tf_lds_offset, vgpr++, "");
+   ctx->return_value = ret;
+   return;
+   }
+
+   si_write_tess_factors(bld_base, rel_patch_id, invocation_id, tf_lds_offset);
 }
 
 static void si_llvm_emit_ls_epilogue(struct lp_build_tgsi_context *bld_base)
@@ -3679,12 +3718,11 @@ static void create_function(struct si_shader_context 
*ctx)
struct 
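
The diff is cut off here. On the RW_BUFFERS handling in si_llvm_emit_tcs_epilogue above: the pointer is converted to a 64-bit integer, reinterpreted as two 32-bit elements, and the elements are returned in slots 0 and 1 so the epilog receives them as two SGPRs. The sketch below shows the equivalent split in plain C. demo_split_u64 and demo_join_u64 are hypothetical names, and the claim that element 0 is the low half assumes a little-endian target.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit value into its low and high 32-bit halves, the scalar
 * equivalent of extracting elements 0 and 1 after the v2i32 bitcast
 * (assuming little-endian element order). */
static void demo_split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
	assert(lo && hi);
	*lo = (uint32_t)value;
	*hi = (uint32_t)(value >> 32);
}

/* Rebuild the 64-bit value on the receiving side. */
static uint64_t demo_join_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Passing the two halves separately keeps every return slot 32 bits wide, which fits a return structure made of per-register values.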

[Mesa-dev] [PATCH 22/25] radeonsi: move some struct si_shader members to new struct si_shader_info

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

This will be part of shader binaries.
---
 src/gallium/drivers/radeonsi/si_shader.c| 100 
 src/gallium/drivers/radeonsi/si_shader.h|  21 ++---
 src/gallium/drivers/radeonsi/si_state_shaders.c |  18 ++---
 3 files changed, 71 insertions(+), 68 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 3758009..f9bb9ec 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -449,7 +449,7 @@ static void declare_input_vs(
input_index);
} else if (divisor) {
/* Build index from instance ID, start instance and divisor */
-   ctx->shader->uses_instanceid = true;
+   ctx->shader->info.uses_instanceid = true;
buffer_index = get_instance_index_for_fetch(&ctx->radeon_bld,

SI_PARAM_START_INSTANCE,
divisor);
@@ -1889,8 +1889,8 @@ handle_semantic:
case TGSI_SEMANTIC_COLOR:
case TGSI_SEMANTIC_BCOLOR:
target = V_008DFC_SQ_EXP_PARAM + param_count;
-   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
-   shader->vs_output_param_offset[i] = param_count;
+   assert(i < ARRAY_SIZE(shader->info.vs_output_param_offset));
+   shader->info.vs_output_param_offset[i] = param_count;
param_count++;
break;
case TGSI_SEMANTIC_CLIPDIST:
@@ -1904,8 +1904,8 @@ handle_semantic:
case TGSI_SEMANTIC_TEXCOORD:
case TGSI_SEMANTIC_GENERIC:
target = V_008DFC_SQ_EXP_PARAM + param_count;
-   assert(i < ARRAY_SIZE(shader->vs_output_param_offset));
-   shader->vs_output_param_offset[i] = param_count;
+   assert(i < ARRAY_SIZE(shader->info.vs_output_param_offset));
+   shader->info.vs_output_param_offset[i] = param_count;
param_count++;
break;
default:
@@ -1933,7 +1933,7 @@ handle_semantic:
}
}
 
-   shader->nr_param_exports = param_count;
+   shader->info.nr_param_exports = param_count;
 
/* We need to add the position output manually if it's missing. */
if (!pos_args[0][0]) {
@@ -1995,7 +1995,7 @@ handle_semantic:
 
for (i = 0; i < 4; i++)
if (pos_args[i][0])
-   shader->nr_pos_exports++;
+   shader->info.nr_pos_exports++;
 
pos_idx = 0;
for (i = 0; i < 4; i++) {
@@ -2005,7 +2005,7 @@ handle_semantic:
/* Specify the target we are exporting */
pos_args[i][3] = lp_build_const_int32(base->gallivm, 
V_008DFC_SQ_EXP_POS + pos_idx++);
 
-   if (pos_idx == shader->nr_pos_exports)
+   if (pos_idx == shader->info.nr_pos_exports)
/* Specify that this is the last export */
pos_args[i][2] = uint->one;
 
@@ -4057,18 +4057,18 @@ static void create_function(struct si_shader_context 
*ctx)
  S_0286D0_POS_FIXED_PT_ENA(1));
}
 
-   shader->num_input_sgprs = 0;
-   shader->num_input_vgprs = 0;
+   shader->info.num_input_sgprs = 0;
+   shader->info.num_input_vgprs = 0;
 
for (i = 0; i <= last_sgpr; ++i)
-   shader->num_input_sgprs += llvm_get_type_size(params[i]) / 4;
+   shader->info.num_input_sgprs += llvm_get_type_size(params[i]) / 4;
 
/* Unused fragment shader inputs are eliminated by the compiler,
 * so we don't know yet how many there will be.
 */
if (ctx->type != TGSI_PROCESSOR_FRAGMENT)
for (; i < num_params; ++i)
-   shader->num_input_vgprs += llvm_get_type_size(params[i]) / 4;
+   shader->info.num_input_vgprs += llvm_get_type_size(params[i]) / 4;
 
if (bld_base->info &&
(bld_base->info->opcode_count[TGSI_OPCODE_DDX] > 0 ||
@@ -4862,7 +4862,7 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
si_init_shader_ctx(&ctx, sscreen, shader, tm);
ctx.is_monolithic = is_monolithic;
 
-   shader->uses_instanceid = sel->info.uses_instanceid;
+   shader->info.uses_instanceid = sel->info.uses_instanceid;
 
bld_base = &ctx.radeon_bld.soa.bld_base;
ctx.radeon_bld.load_system_value = declare_system_value;
@@ -4956,43 +4956,43 @@ int si_compile_tgsi_shader(struct si_screen *sscreen,
 
/* Calculate the number of fragment input VGPRs. */
if (ctx.type == TGSI_PROCESSOR_FRAGMENT) {
-   

[Mesa-dev] [PATCH 11/25] radeonsi: first bits for non-monolithic shaders

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
 src/gallium/drivers/radeonsi/si_pipe.h   |  3 ++
 src/gallium/drivers/radeonsi/si_shader.c | 53 
 src/gallium/drivers/radeonsi/si_shader.h |  2 +-
 4 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index fa60732..448fe88 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -600,6 +600,7 @@ struct pipe_screen *radeonsi_screen_create(struct 
radeon_winsys *ws)
 
sscreen->b.has_cp_dma = true;
sscreen->b.has_streamout = true;
+   sscreen->use_monolithic_shaders = true;
 
if (debug_get_bool_option("RADEON_DUMP_SHADERS", FALSE))
sscreen->b.debug_flags |= DBG_FS | DBG_VS | DBG_GS | DBG_PS | 
DBG_CS;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index b5790d6..2a2455c 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -84,6 +84,9 @@ struct si_compute;
 struct si_screen {
struct r600_common_screen   b;
unsigned    gs_table_depth;
+
+   /* Whether shaders are monolithic (1-part) or separate (3-part). */
+   bool    use_monolithic_shaders;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b058019..b74ed1e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -70,6 +70,12 @@ struct si_shader_context
 
unsigned type; /* TGSI_PROCESSOR_* specifies the type of shader. */
bool is_gs_copy_shader;
+
+   /* Whether to generate the optimized shader variant compiled as a whole
+* (without a prolog and epilog)
+*/
+   bool is_monolithic;
+
int param_streamout_config;
int param_streamout_write_index;
int param_streamout_offset[4];
@@ -3657,8 +3663,10 @@ static void create_function(struct si_shader_context 
*ctx)
struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
struct gallivm_state *gallivm = bld_base->base.gallivm;
struct si_shader *shader = ctx->shader;
-   LLVMTypeRef params[SI_NUM_PARAMS], v2i32, v3i32;
+   LLVMTypeRef params[SI_NUM_PARAMS + SI_NUM_VERTEX_BUFFERS], v2i32, v3i32;
+   LLVMTypeRef returns[16+32*4];
unsigned i, last_array_pointer, last_sgpr, num_params;
+   unsigned num_returns = 0;
 
v2i32 = LLVMVectorType(ctx->i32, 2);
v3i32 = LLVMVectorType(ctx->i32, 3);
@@ -3785,7 +3793,7 @@ static void create_function(struct si_shader_context *ctx)
 
assert(num_params <= Elements(params));
 
-   si_create_function(ctx, NULL, 0, params,
+   si_create_function(ctx, returns, num_returns, params,
   num_params, last_array_pointer, last_sgpr);
 
shader->num_input_sgprs = 0;
@@ -4492,9 +4500,11 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
 }
 
-int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
-struct si_shader *shader,
-struct pipe_debug_callback *debug)
+static int si_compile_tgsi_shader(struct si_screen *sscreen,
+ LLVMTargetMachineRef tm,
+ struct si_shader *shader,
+ bool is_monolithic,
+ struct pipe_debug_callback *debug)
 {
struct si_shader_selector *sel = shader->selector;
struct tgsi_token *tokens = sel->tokens;
@@ -4524,6 +4534,7 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
 
si_init_shader_ctx(&ctx, sscreen, shader, tm,
   poly_stipple ? &stipple_shader_info : &sel->info);
+   ctx.is_monolithic = is_monolithic;
 
shader->uses_instanceid = sel->info.uses_instanceid;
 
@@ -4604,14 +4615,6 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
goto out;
}
 
-   si_shader_dump(sscreen, shader, debug, ctx.type);
-
-   r = si_shader_binary_upload(sscreen, shader);
-   if (r) {
-   fprintf(stderr, "LLVM failed to upload shader\n");
-   goto out;
-   }
-
radeon_llvm_dispose(&ctx.radeon_bld);
 
/* Calculate the number of fragment input VGPRs. */
@@ -4675,6 +4678,30 @@ out:
return r;
 }
 
+int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
+struct si_shader *shader,
+struct pipe_debug_callback *debug)
+{
+   int r;
+
+   /* Compile TGSI. */
+   r = si_compile_tgsi_shader(sscreen, tm, shader,
+  

[Mesa-dev] [PATCH 13/25] radeonsi: add VS epilog

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

It only exports the primitive ID.
Also used by TES when it's compiled as VS.

The VS input location of the primitive ID input is v2.
---
 src/gallium/drivers/radeonsi/si_pipe.c   |   2 +-
 src/gallium/drivers/radeonsi/si_pipe.h   |   1 +
 src/gallium/drivers/radeonsi/si_shader.c | 172 +--
 src/gallium/drivers/radeonsi/si_shader.h |   4 +
 4 files changed, 168 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index 7ce9570..2b5ce3a 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -539,7 +539,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
struct si_screen *sscreen = (struct si_screen *)pscreen;
struct si_shader_part *parts[] = {
sscreen->vs_prologs,
-   /* this will be filled with other shader parts */
+   sscreen->vs_epilogs,
};
unsigned i;
 
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index f4bafc2..8d98779 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -90,6 +90,7 @@ struct si_screen {
 
pipe_mutex  shader_parts_mutex;
struct si_shader_part   *vs_prologs;
+   struct si_shader_part   *vs_epilogs;
 };
 
 struct si_blend_color {
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index fbb8394..0085c43 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -129,6 +129,7 @@ static void si_init_shader_ctx(struct si_shader_context 
*ctx,
   LLVMTargetMachineRef tm,
   struct tgsi_shader_info *info);
 
+#define VS_EPILOG_PRIMID_LOC 2
 
 #define PERSPECTIVE_BASE 0
 #define LINEAR_BASE 9
@@ -2230,16 +2231,26 @@ static void si_llvm_emit_vs_epilogue(struct 
lp_build_tgsi_context *bld_base)
  "");
}
 
-   /* Export PrimitiveID when PS needs it. */
-   if (si_vs_exports_prim_id(ctx->shader)) {
-   outputs[i].name = TGSI_SEMANTIC_PRIMID;
-   outputs[i].sid = 0;
-   outputs[i].values[0] = bitcast(bld_base, TGSI_TYPE_FLOAT,
-  get_primitive_id(bld_base, 0));
-   outputs[i].values[1] = bld_base->base.undef;
-   outputs[i].values[2] = bld_base->base.undef;
-   outputs[i].values[3] = bld_base->base.undef;
-   i++;
+   if (ctx->is_monolithic) {
+   /* Export PrimitiveID when PS needs it. */
+   if (si_vs_exports_prim_id(ctx->shader)) {
+   outputs[i].name = TGSI_SEMANTIC_PRIMID;
+   outputs[i].sid = 0;
+   outputs[i].values[0] = bitcast(bld_base, TGSI_TYPE_FLOAT,
+   get_primitive_id(bld_base, 0));
+   outputs[i].values[1] = bld_base->base.undef;
+   outputs[i].values[2] = bld_base->base.undef;
+   outputs[i].values[3] = bld_base->base.undef;
+   i++;
+   }
+   } else {
+   /* Return the primitive ID from the LLVM function. */
+   ctx->return_value =
+   LLVMBuildInsertValue(gallivm->builder,
+ctx->return_value,
+bitcast(bld_base, TGSI_TYPE_FLOAT,
+get_primitive_id(bld_base, 0)),
+VS_EPILOG_PRIMID_LOC, "");
}
 
si_llvm_export_vs(bld_base, outputs, i);
@@ -3724,6 +3735,11 @@ static void create_function(struct si_shader_context 
*ctx)
 
for (i = 0; i < shader->selector->info.num_inputs; i++)
params[num_params++] = ctx->i32;
+
+   /* PrimitiveID output. */
+   if (!shader->key.vs.as_es && !shader->key.vs.as_ls)
+   for (i = 0; i <= VS_EPILOG_PRIMID_LOC; i++)
+   returns[num_returns++] = ctx->f32;
}
break;
 
@@ -3758,6 +3774,11 @@ static void create_function(struct si_shader_context 
*ctx)
params[ctx->param_tes_v = num_params++] = ctx->f32;
params[ctx->param_tes_rel_patch_id = num_params++] = ctx->i32;
params[ctx->param_tes_patch_id = num_params++] = ctx->i32;
+
+   /* PrimitiveID output. */
+   if (!ctx->is_monolithic && !shader->key.tes.as_es)
+   for (i = 0; i <= VS_EPILOG_PRIMID_LOC; i++)
+

[Mesa-dev] [PATCH 19/25] radeonsi: print full shader name before disassembly

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 34 +++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 20069b4..2789788 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4507,6 +4507,38 @@ static void si_shader_dump_stats(struct si_screen 
*sscreen,
   max_simd_waves);
 }
 
+static const char *si_get_shader_name(struct si_shader *shader,
+ unsigned processor)
+{
+   switch (processor) {
+   case TGSI_PROCESSOR_VERTEX:
+   if (shader->key.vs.as_es)
+   return "Vertex Shader as ES";
+   else if (shader->key.vs.as_ls)
+   return "Vertex Shader as LS";
+   else
+   return "Vertex Shader as VS";
+   case TGSI_PROCESSOR_TESS_CTRL:
+   return "Tessellation Control Shader";
+   case TGSI_PROCESSOR_TESS_EVAL:
+   if (shader->key.tes.as_es)
+   return "Tessellation Evaluation Shader as ES";
+   else
+   return "Tessellation Evaluation Shader as VS";
+   case TGSI_PROCESSOR_GEOMETRY:
+   if (shader->gs_copy_shader == NULL)
+   return "GS Copy Shader as VS";
+   else
+   return "Geometry Shader";
+   case TGSI_PROCESSOR_FRAGMENT:
+   return "Pixel Shader";
+   case TGSI_PROCESSOR_COMPUTE:
+   return "Compute Shader";
+   default:
+   return "Unknown Shader";
+   }
+}
+
 void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
struct pipe_debug_callback *debug, unsigned processor)
 {
@@ -4517,7 +4549,7 @@ void si_shader_dump(struct si_screen *sscreen, struct 
si_shader *shader,
 
if (r600_can_dump_shader(&sscreen->b, processor) &&
!(sscreen->b.debug_flags & DBG_NO_ASM)) {
-   fprintf(stderr, "\n");
+   fprintf(stderr, "\n%s:\n", si_get_shader_name(shader, processor));
if (shader->prolog)
si_shader_dump_disassembly(&shader->prolog->binary,
   debug, "prolog");
-- 
2.5.0



[Mesa-dev] [PATCH 10/25] radeonsi: add code for dumping all shader parts together

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 31 +++
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a6a0984..b058019 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4080,14 +4080,15 @@ int si_shader_binary_upload(struct si_screen *sscreen, 
struct si_shader *shader)
 }
 
 static void si_shader_dump_disassembly(const struct radeon_shader_binary 
*binary,
-  struct pipe_debug_callback *debug)
+  struct pipe_debug_callback *debug,
+  const char *name)
 {
char *line, *p;
unsigned i, count;
 
if (binary->disasm_string) {
-   fprintf(stderr, "\nShader Disassembly:\n\n");
-   fprintf(stderr, "%s\n", binary->disasm_string);
+   fprintf(stderr, "Shader %s disassembly:\n", name);
+   fprintf(stderr, "%s", binary->disasm_string);
 
if (debug && debug->debug_message) {
/* Very long debug messages are cut off, so send the
@@ -4117,7 +4118,7 @@ static void si_shader_dump_disassembly(const struct 
radeon_shader_binary *binary
   "Shader Disassembly End");
}
} else {
-   fprintf(stderr, "SI CODE:\n");
+   fprintf(stderr, "Shader %s binary:\n", name);
for (i = 0; i < binary->code_size; i += 4) {
fprintf(stderr, "@0x%x: %02x%02x%02x%02x\n", i,
binary->code[i + 3], binary->code[i + 2],
@@ -4199,13 +4200,27 @@ static void si_shader_dump_stats(struct si_screen 
*sscreen,
 void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
struct pipe_debug_callback *debug, unsigned processor)
 {
-   if (r600_can_dump_shader(&sscreen->b, processor))
-   if (!(sscreen->b.debug_flags & DBG_NO_ASM))
-   si_shader_dump_disassembly(&shader->binary, debug);
+   unsigned code_size =
+   (shader->prolog ? shader->prolog->binary.code_size : 0) +
+   shader->binary.code_size +
+   (shader->epilog ? shader->epilog->binary.code_size : 0);
+
+   if (r600_can_dump_shader(&sscreen->b, processor) &&
+   !(sscreen->b.debug_flags & DBG_NO_ASM)) {
+   fprintf(stderr, "\n");
+   if (shader->prolog)
+   si_shader_dump_disassembly(&shader->prolog->binary,
+  debug, "prolog");
+   si_shader_dump_disassembly(&shader->binary, debug, "main");
+   if (shader->epilog)
+   si_shader_dump_disassembly(&shader->epilog->binary,
+  debug, "epilog");
+   fprintf(stderr, "\n");
+   }
 
si_shader_dump_stats(sscreen, &shader->config,
 shader->selector ? shader->selector->info.num_inputs : 0,
-shader->binary.code_size, debug, processor);
+code_size, debug, processor);
 }
 
 int si_compile_llvm(struct si_screen *sscreen,
-- 
2.5.0
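
For readers skimming the patch, the size reported to si_shader_dump_stats is just the sum of the parts that are present. A minimal standalone sketch of that sum follows; the struct and function names here are made up for illustration and are not the driver's types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; only the
 * fields needed for the size computation are modeled. */
struct shader_part {
	unsigned code_size; /* size of this part's bytecode in bytes */
};

struct combined_shader {
	const struct shader_part *prolog; /* NULL if no prolog is attached */
	struct shader_part main_part;     /* always present */
	const struct shader_part *epilog; /* NULL if no epilog is attached */
};

/* Total bytecode size: optional prolog + main part + optional epilog. */
static unsigned total_code_size(const struct combined_shader *s)
{
	return (s->prolog ? s->prolog->code_size : 0) +
	       s->main_part.code_size +
	       (s->epilog ? s->epilog->code_size : 0);
}
```

A shader with a 16-byte prolog, a 100-byte main part and no epilog would report 116 bytes.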



[Mesa-dev] [PATCH 02/25] radeonsi: compute how many input SGPRs and VGPRs shaders have

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

Prologs (shader binaries inserted before the API shader binary) need to
know this, so that they won't change the input registers unintentionally.
---
 src/gallium/drivers/radeonsi/si_shader.c | 32 
 src/gallium/drivers/radeonsi/si_shader.h |  2 ++
 2 files changed, 34 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b61278e..ef4f376 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3580,6 +3580,26 @@ static void declare_streamout_params(struct 
si_shader_context *ctx,
}
 }
 
+static unsigned llvm_get_type_size(LLVMTypeRef type)
+{
+   LLVMTypeKind kind = LLVMGetTypeKind(type);
+
+   switch (kind) {
+   case LLVMIntegerTypeKind:
+   return LLVMGetIntTypeWidth(type) / 8;
+   case LLVMFloatTypeKind:
+   return 4;
+   case LLVMPointerTypeKind:
+   return 8;
+   case LLVMVectorTypeKind:
+   return LLVMGetVectorSize(type) *
+  llvm_get_type_size(LLVMGetElementType(type));
+   default:
+   assert(0);
+   return 0;
+   }
+}
+
 static void create_function(struct si_shader_context *ctx)
 {
struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
@@ -3717,6 +3737,9 @@ static void create_function(struct si_shader_context *ctx)
radeon_llvm_shader_type(ctx->radeon_bld.main_fn, ctx->type);
ctx->return_value = LLVMGetUndef(ctx->radeon_bld.return_type);
 
+   shader->num_input_sgprs = 0;
+   shader->num_input_vgprs = 0;
+
for (i = 0; i <= last_sgpr; ++i) {
LLVMValueRef P = LLVMGetParam(ctx->radeon_bld.main_fn, i);
 
@@ -3726,8 +3749,17 @@ static void create_function(struct si_shader_context 
*ctx)
LLVMAddAttribute(P, LLVMByValAttribute);
else
LLVMAddAttribute(P, LLVMInRegAttribute);
+
+   shader->num_input_sgprs += llvm_get_type_size(params[i]) / 4;
}
 
+   /* Unused fragment shader inputs are eliminated by the compiler,
+* so we don't know yet how many there will be.
+*/
+   if (ctx->type != TGSI_PROCESSOR_FRAGMENT)
+   for (; i < num_params; ++i)
+   shader->num_input_vgprs += llvm_get_type_size(params[i]) / 4;
+
if (bld_base->info &&
(bld_base->info->opcode_count[TGSI_OPCODE_DDX] > 0 ||
 bld_base->info->opcode_count[TGSI_OPCODE_DDY] > 0 ||
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index dc75e03..131455b 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -279,6 +279,8 @@ struct si_shader {
struct radeon_shader_binary binary;
struct si_shader_config config;
 
+   ubyte   num_input_sgprs;
+   ubyte   num_input_vgprs;
unsigned    vs_output_param_offset[PIPE_MAX_SHADER_OUTPUTS];
bool    uses_instanceid;
unsigned    nr_pos_exports;
-- 
2.5.0
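
To illustrate how the register counts fall out of the type sizes, here is a small model that does not depend on LLVM. It is only a sketch: the enum, struct and function names are hypothetical stand-ins for the LLVM type kinds handled in llvm_get_type_size, and one register is assumed to hold one dword (4 bytes), as in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the LLVM type kinds the patch handles. */
enum type_kind { KIND_INT, KIND_FLOAT, KIND_POINTER, KIND_VECTOR };

struct type {
	enum type_kind kind;
	unsigned int_bits;       /* KIND_INT: width in bits */
	unsigned num_elems;      /* KIND_VECTOR: number of elements */
	const struct type *elem; /* KIND_VECTOR: element type */
};

/* Size of a type in bytes, mirroring the switch in the patch. */
static unsigned type_size_bytes(const struct type *t)
{
	switch (t->kind) {
	case KIND_INT:
		return t->int_bits / 8;
	case KIND_FLOAT:
		return 4;
	case KIND_POINTER:
		return 8;
	case KIND_VECTOR:
		return t->num_elems * type_size_bytes(t->elem);
	}
	return 0;
}

/* Number of 32-bit registers a parameter of this type occupies. */
static unsigned type_num_regs(const struct type *t)
{
	return type_size_bytes(t) / 4;
}
```

With this model an i32 takes 1 register, a pointer takes 2, and a vector of four i32 takes 4.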



[Mesa-dev] [PATCH 08/25] radeonsi: fail compilation if non-GS non-CS shaders have rodata

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 4bb7ece..dbb9217 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4239,6 +4239,19 @@ int si_compile_llvm(struct si_screen *sscreen,
FREE(binary->global_symbol_offsets);
binary->config = NULL;
binary->global_symbol_offsets = NULL;
+
+   /* Some shaders can't have rodata because their binaries can be
+* concatenated.
+*/
+   if (binary->rodata_size &&
+   (processor == TGSI_PROCESSOR_VERTEX ||
+processor == TGSI_PROCESSOR_TESS_CTRL ||
+processor == TGSI_PROCESSOR_TESS_EVAL ||
+processor == TGSI_PROCESSOR_FRAGMENT)) {
+   fprintf(stderr, "radeonsi: The shader can't have rodata.");
+   return -EINVAL;
+   }
+
return r;
 }
 
-- 
2.5.0
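
The check added above amounts to a small predicate on the shader stage. A standalone sketch, with a made-up enum standing in for the TGSI_PROCESSOR_* values and a hypothetical function name:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the TGSI_PROCESSOR_* constants. */
enum shader_stage {
	STAGE_VERTEX,
	STAGE_TESS_CTRL,
	STAGE_TESS_EVAL,
	STAGE_GEOMETRY,
	STAGE_FRAGMENT,
	STAGE_COMPUTE
};

/* Returns false when a binary has rodata but belongs to a stage whose
 * binaries can be concatenated with prolog/epilog parts. */
static bool binary_is_acceptable(enum shader_stage stage, size_t rodata_size)
{
	bool concatenable = stage == STAGE_VERTEX ||
			    stage == STAGE_TESS_CTRL ||
			    stage == STAGE_TESS_EVAL ||
			    stage == STAGE_FRAGMENT;

	return !(rodata_size && concatenable);
}
```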



[Mesa-dev] [PATCH 05/25] radeonsi: add start_instance parameter to get_instance_index_for_fetch

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 48996b4..858f8cf 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -376,7 +376,7 @@ static LLVMValueRef build_indexed_load_const(
 
 static LLVMValueRef get_instance_index_for_fetch(
struct radeon_llvm_context *radeon_bld,
-   unsigned divisor)
+   unsigned param_start_instance, unsigned divisor)
 {
struct si_shader_context *ctx =
si_shader_context(&radeon_bld->soa.bld_base);
@@ -390,8 +390,8 @@ static LLVMValueRef get_instance_index_for_fetch(
result = LLVMBuildUDiv(gallivm->builder, result,
lp_build_const_int32(gallivm, divisor), "");
 
-   return LLVMBuildAdd(gallivm->builder, result, LLVMGetParam(
-   radeon_bld->main_fn, SI_PARAM_START_INSTANCE), "");
+   return LLVMBuildAdd(gallivm->builder, result,
+   LLVMGetParam(radeon_bld->main_fn, param_start_instance), "");
 }
 
 static void declare_input_vs(
@@ -429,7 +429,9 @@ static void declare_input_vs(
if (divisor) {
/* Build index from instance ID, start instance and divisor */
ctx->shader->uses_instanceid = true;
-   buffer_index = get_instance_index_for_fetch(&ctx->radeon_bld, divisor);
+   buffer_index = get_instance_index_for_fetch(&ctx->radeon_bld,
+   SI_PARAM_START_INSTANCE,
+   divisor);
} else {
/* Load the buffer index for vertices. */
LLVMValueRef vertex_id = LLVMGetParam(ctx->radeon_bld.main_fn,
-- 
2.5.0
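
The value that get_instance_index_for_fetch builds as LLVM IR can be modeled on the CPU as below. This is a sketch for illustration: the function name and the plain-integer parameters are hypothetical, and the "divisor > 1" guard is an assumption that only skips a no-op division.

```c
#include <stdint.h>

/* Hypothetical CPU-side model of the fetch index for instanced
 * vertex attributes. The instance ID is divided by the divisor
 * (unsigned division, like LLVMBuildUDiv) and the start instance is
 * added afterwards. Callers only use this path when divisor != 0. */
static uint32_t instance_fetch_index(uint32_t instance_id,
				     uint32_t start_instance,
				     uint32_t divisor)
{
	uint32_t result = instance_id;

	/* Dividing by 1 would not change the value, so skip it. */
	if (divisor > 1)
		result /= divisor;

	return result + start_instance;
}
```

For instance ID 10, start instance 5 and divisor 3 this gives 10 / 3 + 5 = 8.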



[Mesa-dev] [PATCH 07/25] radeonsi: separate 2 pieces of code from create_function

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 82 
 1 file changed, 51 insertions(+), 31 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 02bfeea..4bb7ece 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3554,6 +3554,30 @@ static const struct lp_build_tgsi_action interp_action = 
{
.emit = build_interp_intrinsic,
 };
 
+static void si_create_function(struct si_shader_context *ctx,
+  LLVMTypeRef *returns, unsigned num_returns,
+  LLVMTypeRef *params, unsigned num_params,
+  int last_array_pointer, int last_sgpr)
+{
+   int i;
+
+   radeon_llvm_create_func(&ctx->radeon_bld, returns, num_returns,
+   params, num_params);
+   radeon_llvm_shader_type(ctx->radeon_bld.main_fn, ctx->type);
+   ctx->return_value = LLVMGetUndef(ctx->radeon_bld.return_type);
+
+   for (i = 0; i <= last_sgpr; ++i) {
+   LLVMValueRef P = LLVMGetParam(ctx->radeon_bld.main_fn, i);
+
+   /* We tell llvm that array inputs are passed by value to allow Sinking pass
+* to move load. Inputs are constant so this is fine. */
+   if (i <= last_array_pointer)
+   LLVMAddAttribute(P, LLVMByValAttribute);
+   else
+   LLVMAddAttribute(P, LLVMInRegAttribute);
+   }
+}
+
 static void create_meta_data(struct si_shader_context *ctx)
 {
struct gallivm_state *gallivm = 
ctx->radeon_bld.soa.bld_base.base.gallivm;
@@ -3607,6 +3631,27 @@ static unsigned llvm_get_type_size(LLVMTypeRef type)
}
 }
 
+static void declare_tess_lds(struct si_shader_context *ctx)
+{
+   struct gallivm_state *gallivm = &ctx->radeon_bld.gallivm;
+   LLVMTypeRef i32 = ctx->radeon_bld.soa.bld_base.uint_bld.elem_type;
+
+   /* This is the upper bound, maximum is 32 inputs times 32 vertices */
+   unsigned vertex_data_dw_size = 32*32*4;
+   unsigned patch_data_dw_size = 32*4;
+   /* The formula is: TCS inputs + TCS outputs + TCS patch outputs. */
+   unsigned patch_dw_size = vertex_data_dw_size*2 + patch_data_dw_size;
+   unsigned lds_dwords = patch_dw_size;
+
+   /* The actual size is computed outside of the shader to reduce
+* the number of shader variants. */
+   ctx->lds =
+   LLVMAddGlobalInAddressSpace(gallivm->module,
+   LLVMArrayType(i32, lds_dwords),
+   "tess_lds",
+   LOCAL_ADDR_SPACE);
+}
+
 static void create_function(struct si_shader_context *ctx)
 {
struct lp_build_tgsi_context *bld_base = &ctx->radeon_bld.soa.bld_base;
@@ -3739,26 +3784,15 @@ static void create_function(struct si_shader_context 
*ctx)
}
 
assert(num_params <= Elements(params));
-   radeon_llvm_create_func(&ctx->radeon_bld, NULL, 0,
-   params, num_params);
-   radeon_llvm_shader_type(ctx->radeon_bld.main_fn, ctx->type);
-   ctx->return_value = LLVMGetUndef(ctx->radeon_bld.return_type);
+
+   si_create_function(ctx, NULL, 0, params,
+  num_params, last_array_pointer, last_sgpr);
 
shader->num_input_sgprs = 0;
shader->num_input_vgprs = 0;
 
-   for (i = 0; i <= last_sgpr; ++i) {
-   LLVMValueRef P = LLVMGetParam(ctx->radeon_bld.main_fn, i);
-
-   /* We tell llvm that array inputs are passed by value to allow Sinking pass
-* to move load. Inputs are constant so this is fine. */
-   if (i <= last_array_pointer)
-   LLVMAddAttribute(P, LLVMByValAttribute);
-   else
-   LLVMAddAttribute(P, LLVMInRegAttribute);
-
+   for (i = 0; i <= last_sgpr; ++i)
shader->num_input_sgprs += llvm_get_type_size(params[i]) / 4;
-   }
 
/* Unused fragment shader inputs are eliminated by the compiler,
 * so we don't know yet how many there will be.
@@ -3782,22 +3816,8 @@ static void create_function(struct si_shader_context 
*ctx)
 
if ((ctx->type == TGSI_PROCESSOR_VERTEX && shader->key.vs.as_ls) ||
ctx->type == TGSI_PROCESSOR_TESS_CTRL ||
-   ctx->type == TGSI_PROCESSOR_TESS_EVAL) {
-   /* This is the upper bound, maximum is 32 inputs times 32 vertices */
-   unsigned vertex_data_dw_size = 32*32*4;
-   unsigned patch_data_dw_size = 32*4;
-   /* The formula is: TCS inputs + TCS outputs + TCS patch outputs. */
-   unsigned patch_dw_size = vertex_data_dw_size*2 + patch_data_dw_size;
-   unsigned lds_dwords = patch_dw_size;
-
-   /* The actual size is 

[Mesa-dev] [PATCH 06/25] radeonsi: add samplemask parameter to si_export_mrt_color

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 858f8cf..02bfeea 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1515,7 +1515,8 @@ static void si_alpha_test(struct lp_build_tgsi_context 
*bld_base,
 }
 
 static LLVMValueRef si_scale_alpha_by_sample_mask(struct lp_build_tgsi_context 
*bld_base,
- LLVMValueRef alpha)
+ LLVMValueRef alpha,
+ unsigned samplemask_param)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
@@ -1523,7 +1524,7 @@ static LLVMValueRef si_scale_alpha_by_sample_mask(struct 
lp_build_tgsi_context *
 
/* alpha = alpha * popcount(coverage) / SI_NUM_SMOOTH_AA_SAMPLES */
coverage = LLVMGetParam(ctx->radeon_bld.main_fn,
-   SI_PARAM_SAMPLE_COVERAGE);
+   samplemask_param);
coverage = bitcast(bld_base, TGSI_TYPE_SIGNED, coverage);
 
coverage = lp_build_intrinsic(gallivm->builder, "llvm.ctpop.i32",
@@ -2288,6 +2289,7 @@ static void si_export_mrt_z(struct lp_build_tgsi_context 
*bld_base,
 
 static void si_export_mrt_color(struct lp_build_tgsi_context *bld_base,
LLVMValueRef *color, unsigned index,
+   unsigned samplemask_param,
bool is_last)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
@@ -2310,7 +2312,8 @@ static void si_export_mrt_color(struct 
lp_build_tgsi_context *bld_base,
 
/* Line & polygon smoothing */
if (ctx->shader->key.ps.epilog.poly_line_smoothing)
-   color[3] = si_scale_alpha_by_sample_mask(bld_base, color[3]);
+   color[3] = si_scale_alpha_by_sample_mask(bld_base, color[3],
+samplemask_param);
 
/* If last_cbuf > 0, FS_COLOR0_WRITES_ALL_CBUFS is true. */
if (ctx->shader->key.ps.epilog.last_cbuf > 0) {
@@ -2449,6 +2452,7 @@ static void si_llvm_emit_fs_epilogue(struct 
lp_build_tgsi_context *bld_base)
 
ctx->radeon_bld.soa.outputs[i][j], "");
 
si_export_mrt_color(bld_base, color, semantic_index,
+   SI_PARAM_SAMPLE_COVERAGE,
last_color_export == i);
break;
default:
-- 
2.5.0
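
The smoothing formula in the comment above (alpha = alpha * popcount(coverage) / SI_NUM_SMOOTH_AA_SAMPLES) can be tried out on the CPU with the sketch below. The function names are hypothetical, and the sample count of 8 is an assumption; the patch only shows the name of the constant, not its value.

```c
#include <stdint.h>

/* ASSUMPTION: stand-in value for SI_NUM_SMOOTH_AA_SAMPLES. */
#define NUM_SMOOTH_AA_SAMPLES 8

/* Portable population count: clears the lowest set bit each pass. */
static unsigned popcount32(uint32_t v)
{
	unsigned n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Scale alpha by the fraction of samples covered by the primitive. */
static float scale_alpha_by_sample_mask(float alpha, uint32_t coverage)
{
	return alpha * (float)popcount32(coverage) / NUM_SMOOTH_AA_SAMPLES;
}
```

Under the assumed sample count, a coverage mask of 0x0F (4 of 8 samples) halves the alpha value.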



[Mesa-dev] [PATCH 01/25] gallium/radeon: add basic code for setting shader return values

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

LLVMBuildInsertValue will be used on return_value.
---
 src/gallium/drivers/r600/r600_llvm.c|  2 +-
 src/gallium/drivers/radeon/radeon_llvm.h|  4 +++-
 src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 14 +++---
 src/gallium/drivers/radeonsi/si_shader.c|  9 ++---
 4 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_llvm.c 
b/src/gallium/drivers/r600/r600_llvm.c
index 0fe7c74..434a2c1 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -789,7 +789,7 @@ LLVMModuleRef r600_tgsi_llvm(
unsigned ArgumentsCount = 0;
for (unsigned i = 0; i < ctx->inputs_count; i++)
Arguments[ArgumentsCount++] = 
LLVMVectorType(bld_base->base.elem_type, 4);
-   radeon_llvm_create_func(ctx, Arguments, ArgumentsCount);
+   radeon_llvm_create_func(ctx, NULL, 0, Arguments, ArgumentsCount);
for (unsigned i = 0; i < ctx->inputs_count; i++) {
LLVMValueRef P = LLVMGetParam(ctx->main_fn, i);
LLVMAddAttribute(P, LLVMInRegAttribute);
diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
b/src/gallium/drivers/radeon/radeon_llvm.h
index e967ad2..e40ee6c 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -113,6 +113,7 @@ struct radeon_llvm_context {
struct tgsi_declaration_range *arrays;
 
LLVMValueRef main_fn;
+   LLVMTypeRef return_type;
 
struct gallivm_state gallivm;
 };
@@ -161,7 +162,8 @@ void radeon_llvm_emit_prepare_cube_coords(struct 
lp_build_tgsi_context * bld_bas
 void radeon_llvm_context_init(struct radeon_llvm_context * ctx);
 
 void radeon_llvm_create_func(struct radeon_llvm_context * ctx,
- LLVMTypeRef *ParamTypes, unsigned ParamCount);
+LLVMTypeRef *return_types, unsigned 
num_return_elems,
+LLVMTypeRef *ParamTypes, unsigned ParamCount);
 
 void radeon_llvm_dispose(struct radeon_llvm_context * ctx);
 
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index f5e3f6a..0ec14a7 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -1693,14 +1693,22 @@ void radeon_llvm_context_init(struct 
radeon_llvm_context * ctx)
 }
 
 void radeon_llvm_create_func(struct radeon_llvm_context * ctx,
+LLVMTypeRef *return_types, unsigned 
num_return_elems,
 LLVMTypeRef *ParamTypes, unsigned ParamCount)
 {
-   LLVMTypeRef main_fn_type;
+   LLVMTypeRef main_fn_type, ret_type;
LLVMBasicBlockRef main_fn_body;
 
+   if (num_return_elems)
+   ret_type = LLVMStructTypeInContext(ctx->gallivm.context,
+  return_types,
+  num_return_elems, true);
+   else
+   ret_type = LLVMVoidTypeInContext(ctx->gallivm.context);
+
/* Setup the function */
-   main_fn_type = 
LLVMFunctionType(LLVMVoidTypeInContext(ctx->gallivm.context),
-   ParamTypes, ParamCount, 0);
+   ctx->return_type = ret_type;
+   main_fn_type = LLVMFunctionType(ret_type, ParamTypes, ParamCount, 0);
ctx->main_fn = LLVMAddFunction(ctx->gallivm.module, "main", 
main_fn_type);
main_fn_body = LLVMAppendBasicBlockInContext(ctx->gallivm.context,
ctx->main_fn, "main_body");
diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 19c427a..b61278e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -96,6 +96,7 @@ struct si_shader_context
LLVMValueRef esgs_ring;
LLVMValueRef gsvs_ring[4];
LLVMValueRef gs_next_vertex[4];
+   LLVMValueRef return_value;
 
LLVMTypeRef voidt;
LLVMTypeRef i1;
@@ -3711,8 +3712,10 @@ static void create_function(struct si_shader_context 
*ctx)
}
 
assert(num_params <= Elements(params));
-   radeon_llvm_create_func(&ctx->radeon_bld, params, num_params);
+   radeon_llvm_create_func(&ctx->radeon_bld, NULL, 0,
+   params, num_params);
radeon_llvm_shader_type(ctx->radeon_bld.main_fn, ctx->type);
+   ctx->return_value = LLVMGetUndef(ctx->radeon_bld.return_type);
 
for (i = 0; i <= last_sgpr; ++i) {
LLVMValueRef P = LLVMGetParam(ctx->radeon_bld.main_fn, i);
@@ -4241,7 +4244,7 @@ static int si_generate_gs_copy_shader(struct si_screen 
*sscreen,
 
si_llvm_export_vs(bld_base, outputs, gsinfo->num_outputs);
 
-   LLVMBuildRetVoid(bld_base->base.gallivm->builder);
+   LLVMBuildRet(gallivm->builder, 

[Mesa-dev] [PATCH 04/25] radeonsi: separate out shader key bits for prologs & epilogs

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c| 70 ++-
 src/gallium/drivers/radeonsi/si_shader.h| 77 +++--
 src/gallium/drivers/radeonsi/si_state.c |  2 +-
 src/gallium/drivers/radeonsi/si_state_shaders.c | 91 +
 4 files changed, 140 insertions(+), 100 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index ebd7379..48996b4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -403,7 +403,8 @@ static void declare_input_vs(
struct gallivm_state *gallivm = base->gallivm;
struct si_shader_context *ctx =
si_shader_context(&radeon_bld->soa.bld_base);
-   unsigned divisor = ctx->shader->key.vs.instance_divisors[input_index];
+   unsigned divisor =
+   ctx->shader->key.vs.prolog.instance_divisors[input_index];
 
unsigned chan;
 
@@ -854,7 +855,7 @@ static int lookup_interp_param_index(unsigned interpolate, 
unsigned location)
 static unsigned select_interp_param(struct si_shader_context *ctx,
unsigned param)
 {
-   if (!ctx->shader->key.ps.force_persample_interp)
+   if (!ctx->shader->key.ps.prolog.force_persample_interp)
return param;
 
/* If the shader doesn't use center/centroid, just return the parameter.
@@ -924,7 +925,7 @@ static void interp_fs_input(struct si_shader_context *ctx,
intr_name = interp_param ? "llvm.SI.fs.interp" : "llvm.SI.fs.constant";
 
if (semantic_name == TGSI_SEMANTIC_COLOR &&
-   ctx->shader->key.ps.color_two_side) {
+   ctx->shader->key.ps.prolog.color_two_side) {
LLVMValueRef args[4];
LLVMValueRef is_face_positive;
LLVMValueRef back_attr_number;
@@ -1331,12 +1332,12 @@ static void si_llvm_init_export_args(struct 
lp_build_tgsi_context *bld_base,
 
if (ctx->type == TGSI_PROCESSOR_FRAGMENT) {
const union si_shader_key *key = &ctx->shader->key;
-   unsigned col_formats = key->ps.spi_shader_col_format;
+   unsigned col_formats = key->ps.epilog.spi_shader_col_format;
int cbuf = target - V_008DFC_SQ_EXP_MRT;
 
assert(cbuf >= 0 && cbuf < 8);
spi_shader_col_format = (col_formats >> (cbuf * 4)) & 0xf;
-   is_int8 = (key->ps.color_is_int8 >> cbuf) & 0x1;
+   is_int8 = (key->ps.epilog.color_is_int8 >> cbuf) & 0x1;
}
 
args[4] = uint->zero; /* COMPR flag */
@@ -1489,13 +1490,13 @@ static void si_alpha_test(struct lp_build_tgsi_context 
*bld_base,
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
 
-   if (ctx->shader->key.ps.alpha_func != PIPE_FUNC_NEVER) {
+   if (ctx->shader->key.ps.epilog.alpha_func != PIPE_FUNC_NEVER) {
LLVMValueRef alpha_ref = LLVMGetParam(ctx->radeon_bld.main_fn,
SI_PARAM_ALPHA_REF);
 
LLVMValueRef alpha_pass =
lp_build_cmp(&bld_base->base,
-ctx->shader->key.ps.alpha_func,
+ctx->shader->key.ps.epilog.alpha_func,
 alpha, alpha_ref);
LLVMValueRef arg =
lp_build_select(&bld_base->base,
@@ -1990,7 +1991,7 @@ static void si_write_tess_factors(struct 
lp_build_tgsi_context *bld_base,
  invocation_id, bld_base->uint_bld.zero, ""));
 
/* Determine the layout of one tess factor element in the buffer. */
-   switch (shader->key.tcs.prim_mode) {
+   switch (shader->key.tcs.epilog.prim_mode) {
case PIPE_PRIM_LINES:
stride = 2; /* 2 dwords, 1 vec2 store */
outer_comps = 2;
@@ -2292,30 +2293,30 @@ static void si_export_mrt_color(struct 
lp_build_tgsi_context *bld_base,
int i;
 
/* Clamp color */
-   if (ctx->shader->key.ps.clamp_color)
+   if (ctx->shader->key.ps.epilog.clamp_color)
for (i = 0; i < 4; i++)
color[i] = radeon_llvm_saturate(bld_base, color[i]);
 
/* Alpha to one */
-   if (ctx->shader->key.ps.alpha_to_one)
+   if (ctx->shader->key.ps.epilog.alpha_to_one)
color[3] = base->one;
 
/* Alpha test */
if (index == 0 &&
-   ctx->shader->key.ps.alpha_func != PIPE_FUNC_ALWAYS)
+   ctx->shader->key.ps.epilog.alpha_func != PIPE_FUNC_ALWAYS)
si_alpha_test(bld_base, color[3]);
 
/* Line & polygon smoothing */
-   if (ctx->shader->key.ps.poly_line_smoothing)
+   if (ctx->shader->key.ps.epilog.poly_line_smoothing)
color[3] = si_scale_alpha_by_sample_mask(bld_base, 

[Mesa-dev] [PATCH 03/25] radeonsi: compute how many input VGPRs fragment shaders have

2016-02-15 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 41 
 src/gallium/drivers/radeonsi/si_shader.h |  2 ++
 2 files changed, 43 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index ef4f376..ebd7379 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4537,6 +4537,47 @@ int si_shader_create(struct si_screen *sscreen, 
LLVMTargetMachineRef tm,
 
radeon_llvm_dispose(&ctx.radeon_bld);
 
+   /* Calculate the number of fragment input VGPRs. */
+   if (ctx.type == TGSI_PROCESSOR_FRAGMENT) {
+   shader->num_input_vgprs = 0;
+   shader->face_vgpr_index = -1;
+
+   if (G_0286CC_PERSP_SAMPLE_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_PERSP_CENTER_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_PERSP_CENTROID_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_PERSP_PULL_MODEL_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 3;
+   if (G_0286CC_LINEAR_SAMPLE_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_LINEAR_CENTER_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_LINEAR_CENTROID_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 2;
+   if (G_0286CC_LINE_STIPPLE_TEX_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_POS_X_FLOAT_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_POS_Y_FLOAT_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_POS_Z_FLOAT_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_POS_W_FLOAT_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_FRONT_FACE_ENA(shader->config.spi_ps_input_addr)) {
+   shader->face_vgpr_index = shader->num_input_vgprs;
+   shader->num_input_vgprs += 1;
+   }
+   if (G_0286CC_ANCILLARY_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if 
(G_0286CC_SAMPLE_COVERAGE_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   if (G_0286CC_POS_FIXED_PT_ENA(shader->config.spi_ps_input_addr))
+   shader->num_input_vgprs += 1;
+   }
+
if (ctx.type == TGSI_PROCESSOR_GEOMETRY) {
shader->gs_copy_shader = CALLOC_STRUCT(si_shader);
shader->gs_copy_shader->selector = shader->selector;
diff --git a/src/gallium/drivers/radeonsi/si_shader.h 
b/src/gallium/drivers/radeonsi/si_shader.h
index 131455b..3be24f3 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -281,6 +281,8 @@ struct si_shader {
 
ubyte   num_input_sgprs;
ubyte   num_input_vgprs;
+   char    face_vgpr_index;
+
unsigned    vs_output_param_offset[PIPE_MAX_SHADER_OUTPUTS];
bool    uses_instanceid;
unsigned    nr_pos_exports;
-- 
2.5.0
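
The chain of checks in this patch can also be read as a table walk. The sketch below is a standalone model, not the driver code: the function and table names are hypothetical, and it ASSUMES that bit N of spi_ps_input_addr corresponds to the Nth field in the order the patch tests them. Check the register headers before relying on that.

```c
#include <stdint.h>

/* VGPRs consumed by each input-enable field, in the order the patch
 * tests them. ASSUMPTION: entry N corresponds to bit N. */
static const uint8_t vgprs_per_input[] = {
	2, /* PERSP_SAMPLE */
	2, /* PERSP_CENTER */
	2, /* PERSP_CENTROID */
	3, /* PERSP_PULL_MODEL */
	2, /* LINEAR_SAMPLE */
	2, /* LINEAR_CENTER */
	2, /* LINEAR_CENTROID */
	1, /* LINE_STIPPLE_TEX */
	1, /* POS_X_FLOAT */
	1, /* POS_Y_FLOAT */
	1, /* POS_Z_FLOAT */
	1, /* POS_W_FLOAT */
	1, /* FRONT_FACE */
	1, /* ANCILLARY */
	1, /* SAMPLE_COVERAGE */
	1, /* POS_FIXED_PT */
};

#define FRONT_FACE_ENTRY 12

/* Count the input VGPRs and record where the front-face VGPR lands
 * (-1 if the front-face input is not enabled). */
static unsigned count_ps_input_vgprs(uint32_t input_addr, int *face_vgpr_index)
{
	unsigned i, num = 0;

	*face_vgpr_index = -1;
	for (i = 0; i < sizeof(vgprs_per_input) / sizeof(vgprs_per_input[0]); i++) {
		if (!(input_addr & (1u << i)))
			continue;
		if (i == FRONT_FACE_ENTRY)
			*face_vgpr_index = (int)num;
		num += vgprs_per_input[i];
	}
	return num;
}
```

With only PERSP_CENTER and FRONT_FACE enabled, the count is 3 and the face VGPR index is 2, matching what the if-chain in the patch would compute.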



[Mesa-dev] [PATCH 00/25] RadeonSI: 1 variant per shader & shader cache in memory

2016-02-15 Thread Marek Olšák
Hi,

This patch series implements a new compilation mode that compiles each shader
to hw bytecode only once. Any state-dependent code is attached at the beginning
or end of that bytecode to implement emulated features such as vertex buffer
addressing, two-side color selection and interpolation, colorbuffer format
conversions, alpha-test, etc. The attachable pieces of bytecode are called the
"prolog" and "epilog" shader parts, and the compiled TGSI shader is called the
"main" part.

At the end, it adds a simple TGSI->bytecode shader cache that lives in memory.


1) Design points and differences from my XDC talk

Support for the old-style shaders compiled on demand (called "monolithic",
because each one is a single piece of bytecode) is kept. It can be enabled with
an environment variable, and it is enabled automatically if LLVM is older than
3.8.

Shaders keep their shader key, but now the shader key is used to generate the 
prolog and epilog parts.

The main part is compiled first. At draw time, the prolog and epilog, if they 
are needed, are compiled and all pieces of bytecode are combined. Ideally, we 
would only be doing the combining at draw time, because everything should be 
compiled already.

Prologs and epilogs don't use the LLVM assembler as was planned initially. They 
share most of the code with monolithic shaders, meaning that each is compiled 
as an LLVM IR module.

The driver keeps a global per-screen list of all compiled prologs and epilogs, 
because they are all reusable.

If prolog and epilog compilation turns out to be too slow, we can precompile 
some of them with llc at Mesa compile time. I don't think this will be needed 
though.

VS and TES main parts are always compiled as hardware VS at shader creation.
Hardware LS and ES stages are always compiled as monolithic shaders on demand
later, because few games use them.


2) Shader parts

VS prolog:
- vertex buffer address calculations based on instance divisors

VS epilog (hw VS only: VS & TES):
- primitive ID export if PS needs it
- in the future: ignore ClipVertex and ClipDistance outputs if clipping is 
disabled

TCS epilog:
- pack tessellation factors based on the TES primitive type

PS prolog:
- two-side color selection and interpolation
- forcing per-sample interpolation
- polygon stippling
- in the future: support BC_OPTIMIZE better, use interp_mov for flatshaded 
colors

PS epilog:
- alpha-test, alpha-to-one, smoothing, clamping, gl_FragColor broadcast
- color format conversions


3) Performance implications

There is increased VGPR usage because pixel shaders that used to use 4-12 VGPRs 
now always use 16 or even 20. This is not enough to affect the wave count 
though.

There is slightly higher register usage because some SGPRs and VGPRs have to be 
passed from the prolog through the main part to the epilog, so the main part 
has fewer of them. This results in higher SGPR spilling, although that should 
be entirely fixable in the LLVM backend.

Relevant shader-db stats for the default scheduler:

Code Size: 11091656 -> 11219948 (1.16 %) bytes
Scratch: 1732608 -> 2246656 (29.67 %) (SGPR spilling)
Max Waves: 78063 -> 77352 (-0.91 %)

Relevant shader-db stats for the SI scheduler:

Code Size: 11433068 -> 11535452 (0.90 %) bytes
Scratch: 509952 -> 522240 (2.41 %) (SGPR spilling)
Max Waves: 79456 -> 78217 (-1.56 %)

Neither the code size nor the wave count changed much. It looks like compiling
optimized monolithic shaders in another thread wouldn't make much difference.

No benchmarks have been run.


4) RadeonSI shader cache in memory

The goal is to skip compilation of TGSI shaders that the same process has
already compiled. This is not a real shader cache of the kind proprietary
drivers implement: the binaries are not stored on disk. The motivations are:
- Apps mix and match their vertex and pixel shaders to produce many 
combinations of linked GLSL shader programs. E.g. if one VS is matched with 20 
pixel shaders, we don't want to compile that VS 20 times. This does appear to 
happen a lot with UE3.
- If apps unload and reload shaders, this effectively makes the reload free for 
the radeonsi driver. (not so much for st/mesa)
- Gallium likes to use the same blit & pass-through shaders in several places.

This only caches the main shader parts (VS as VS, TCS, TES as VS, PS). 
Monolithic shaders including LS & ES and also GS are not cached.
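
To make the idea concrete, here is a minimal sketch of such an in-memory cache. It is not the code from the series: the struct and function names are hypothetical, the key is assumed to be the raw token stream of the shader, and a plain linked list is used where a real implementation would more likely use a hash table.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry: key = shader token stream,
 * value = compiled binary. Both are owned copies. */
struct cache_entry {
	void *key;
	size_t key_size;
	void *binary;
	size_t binary_size;
	struct cache_entry *next;
};

struct shader_cache {
	struct cache_entry *head;
};

/* Return a heap copy of n bytes, or NULL on allocation failure. */
static void *dup_bytes(const void *src, size_t n)
{
	void *copy = malloc(n ? n : 1);

	if (copy && n)
		memcpy(copy, src, n);
	return copy;
}

/* Look up a binary by token stream; NULL means "not cached yet". */
static const struct cache_entry *
cache_lookup(const struct shader_cache *cache, const void *key, size_t key_size)
{
	const struct cache_entry *e;

	for (e = cache->head; e; e = e->next)
		if (e->key_size == key_size &&
		    memcmp(e->key, key, key_size) == 0)
			return e;
	return NULL;
}

/* Store copies of key and binary. Returns 0 on success, -1 on OOM. */
static int cache_insert(struct shader_cache *cache,
			const void *key, size_t key_size,
			const void *binary, size_t binary_size)
{
	struct cache_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return -1;
	e->key = dup_bytes(key, key_size);
	e->binary = dup_bytes(binary, binary_size);
	if (!e->key || !e->binary) {
		free(e->key);
		free(e->binary);
		free(e);
		return -1;
	}
	e->key_size = key_size;
	e->binary_size = binary_size;
	e->next = cache->head;
	cache->head = e;
	return 0;
}
```

A caller would try cache_lookup first and only compile (then cache_insert) on a miss, which is how one VS paired with 20 pixel shaders ends up compiled once.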


5) Performance of the shader cache

The test is a short apitrace of Borderlands 2.

Without the cache:
GLSL link time = 18361 ms
Driver compile time = 14510 ms

With the cache:
GLSL link time = 12576 ms
Driver compile time = 8552 ms

This leaves a lot to be desired, but it was expected. The TGSI compilation 
takes 41% less time, which means 41% of all TGSI shaders are duplicates. On 
average, linking GLSL shader programs (including the TGSI compilation) takes 
31.5% less time.
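
For reference, the two percentages follow directly from the timings above. A
minimal sketch of the arithmetic (percent_reduction is a name made up for this
example):

```c
/* Percentage by which `after` is smaller than `before`. */
static double
percent_reduction(double before, double after)
{
   return (before - after) / before * 100.0;
}
```

percent_reduction(14510, 8552) gives roughly 41.1 for the driver compile time,
and percent_reduction(18361, 12576) gives roughly 31.5 for the GLSL link time.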

The compile times are still unacceptable and caching 

[Mesa-dev] [Bug 94168] Incorrect rendering when running Populous 3 on wine using DDraw->WineD3D->OpenGL wrapper [apitrace]

2016-02-15 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=94168

            Bug ID: 94168
           Summary: Incorrect rendering when running Populous 3 on wine
                    using DDraw->WineD3D->OpenGL wrapper [apitrace]
           Product: Mesa
           Version: git
          Hardware: x86 (IA32)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: GLX
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: diegoand...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

There is a description of this problem on Wine's Bugzilla; I'll refer to that
first:

https://bugs.winehq.org/show_bug.cgi?id=40126

I did some tests and I am now inclined to believe this is a Mesa bug:

After making an apitrace and replaying it, I only see incorrect rendering when
using hardware rendering (Mesa, Gallium3D, radeonsi). When the same apitrace is
replayed in software mode, the rendering is correct and looks like it is
supposed to.

Another user reported the exact same problem with nouveau, but it works fine
with the nvidia binary driver.

Apitrace:
https://drive.google.com/file/d/0B6CofbZiVpM1b1RKSDNLZXdteW8/view?usp=sharing

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: reject explicit location on atomic counter uniforms

2016-02-15 Thread Timothy Arceri
On Mon, 2016-02-15 at 10:49 -0500, Nicolai Hähnle wrote:
> 
> On 12.02.2016 17:53, Timothy Arceri wrote:
> > On Thu, 2016-02-11 at 20:10 -0500, Ilia Mirkin wrote:
> > > This fixes
> > > 
> > > dEQP-GLES31.functional.uniform_location.negative.atomic_fragment
> > > dEQP-GLES31.functional.uniform_location.negative.atomic_vertex
> > > 
> > > Both of which have lines like
> > > 
> > > layout(location = 3, binding = 0, offset = 0) uniform atomic_uint
> > > uni0;
> > > 
> > > The ARB_explicit_uniform_location spec makes a very tangential
> > > mention
> > > regarding atomic counters, but location isn't something that
> > > makes
> > > sense
> > > with them.
> > > 
> > > Signed-off-by: Ilia Mirkin 
> > > ---
> > > 
> > > Had no clue where to stick this check... this seemed like as good
> > > a
> > > place as any.
> > > 
> > >   src/compiler/glsl/ast_to_hir.cpp | 5 +
> > >   1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > > b/src/compiler/glsl/ast_to_hir.cpp
> > > index dbeb5c0..9fce06b 100644
> > > --- a/src/compiler/glsl/ast_to_hir.cpp
> > > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > > @@ -4179,6 +4179,11 @@ ast_declarator_list::hir(exec_list
> > > *instructions,
> > >   state->atomic_counter_offsets[qual_binding] =
> > > qual_offset;
> > >    }
> > > }
> > 
> > Maybe we should just make this:
> > else {
> >    _mesa_glsl_error(&loc, state, "invalid atomic counter layout
> > qualifier");
> > }
> > 
> > ??
> 
> FWIW, I like Ilia's original message better because it gives the
> user 
> more information about why exactly their layout qualifier is
> invalid. 
> Helpful error messages are a good thing.

Sure I agree. The complete solution would be to use a mask for
validation.

    /* Valid atomic layout qualifiers */
    ast_type_qualifier atomic_layout_mask;
    atomic_layout_mask.flags.i = 0;
    atomic_layout_mask.flags.q.explicit_binding = 1;
    atomic_layout_mask.flags.q.explicit_offset = 1;
    atomic_layout_mask.flags.q.uniform = 1;
    ...

    if ((qual->flags.i & ~atomic_layout_mask.flags.i) != 0)
       _mesa_glsl_error(&loc, state,
                        "invalid atomic counter layout qualifier");

This would allow catching all invalid qualifiers rather than relying on
one-off checks, which are going to get messy fast. It's likely we can
remove some existing code for various types by implementing something
like this.

We could then have a generic helper that turns the check into a more
useful message based on the flags. This would be useful for existing
checks, as this is how we validate global input qualifiers, and I've just
sent a patch to do it on varying inputs.

https://patchwork.freedesktop.org/patch/73710/
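
To make the mask idea and the message helper concrete, here is a standalone
sketch. It does not use the real ast_type_qualifier; the QUAL_* bits, the
ATOMIC_LAYOUT_MASK constant and both function names are hypothetical and only
model the technique of masking off the allowed bits and naming what is left.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical qualifier bits; the real qualifier struct has many more. */
enum {
   QUAL_UNIFORM           = 1u << 0,
   QUAL_EXPLICIT_BINDING  = 1u << 1,
   QUAL_EXPLICIT_OFFSET   = 1u << 2,
   QUAL_EXPLICIT_LOCATION = 1u << 3,
};

/* Qualifiers that are allowed on an atomic counter in this sketch. */
static const uint32_t ATOMIC_LAYOUT_MASK =
   QUAL_UNIFORM | QUAL_EXPLICIT_BINDING | QUAL_EXPLICIT_OFFSET;

/* Returns the set of qualifier bits that are not allowed (0 if all valid). */
static uint32_t
invalid_atomic_qualifiers(uint32_t flags)
{
   return flags & ~ATOMIC_LAYOUT_MASK;
}

/* Generic helper: name the first offending bit so the error message can say
 * which qualifier was rejected. Returns NULL if no known bit is set.
 */
static const char *
first_invalid_qualifier_name(uint32_t invalid)
{
   static const struct {
      uint32_t bit;
      const char *name;
   } names[] = {
      { QUAL_UNIFORM,           "uniform"  },
      { QUAL_EXPLICIT_BINDING,  "binding"  },
      { QUAL_EXPLICIT_OFFSET,   "offset"   },
      { QUAL_EXPLICIT_LOCATION, "location" },
   };

   for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
      if (invalid & names[i].bit)
         return names[i].name;
   }
   return NULL;
}
```

A declaration carrying binding, offset and location would leave only the
location bit after masking, so the caller could report "location" by name
instead of a generic "invalid layout qualifier".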

Personally, I'd rather this dEQP test keep failing than add a one-off
case to catch location.

> 
> That said, I won't complain too loudly if making the code simpler or 
> easier to follow ends up making the error messages slightly less
> helpful.
> 
> Cheers,
> Nicolai
> 
> > 
> > > +
> > > +  if (type->qualifier.flags.q.explicit_location) {
> > > + _mesa_glsl_error(&loc, state,
> > > +  "atomic counters cannot have an
> > > explicit
> > > location");
> > > +  }
> > >  }
> > > 
> > >  if (this->declarations.is_empty()) {
> > 


Re: [Mesa-dev] [PATCH 2/2] glsl: set user defined varyings to smooth by default

2016-02-15 Thread Timothy Arceri
On Mon, 2016-02-15 at 16:12 +0100, Iago Toral wrote:
> On Mon, 2016-02-15 at 18:38 +1100, Timothy Arceri wrote:
> > This is usually handled by the backends in order to handle the
> > various interactions with the gl_*Color built-ins.
> > 
> > The problem is this means linking will fail if one side on the
> > interface adds the smooth qualifier to the varying and the other
> > side just uses the default even though they match.
> > 
> > This fixes various deqp tests and should have no impact on
> > built-ins as they generate GLSL IR directly.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
> > ---
> >  src/compiler/glsl/ast_to_hir.cpp | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp
> > b/src/compiler/glsl/ast_to_hir.cpp
> > index b639378..47d52ee 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -2750,6 +2750,11 @@ interpret_interpolation_qualifier(const
> > struct ast_type_qualifier *qual,
> >    "vertex shader inputs or fragment shader
> > outputs",
> >    interpolation_string(interpolation));
> >    }
> > +   } else if ((mode == ir_var_shader_in &&
> > +   state->stage != MESA_SHADER_VERTEX) ||
> > +  (mode == ir_var_shader_out &&
> > +   state->stage != MESA_SHADER_FRAGMENT)) {
> > +  interpolation = INTERP_QUALIFIER_SMOOTH;
> > }
> 
> The GLES spec explicitly says that in the absence of an interp
> qualifier
> smooth is used, but I can't find the same statement in the desktop
> GLSL
> spec. Should we make this ES specific?

I couldn't find it in the spec either; that's why I didn't send this out
last year when I wrote it. However, the OpenGL wiki says that's what it
is by default, and that's what our implementation does after validation.

I'll write a piglit test to see what AMD and Nvidia do on the desktop
just to be sure.

Thanks for taking a look.

> 
> Iago
> 


Re: [Mesa-dev] [PATCH] compiler/glsl: Fix uniform location counting.

2016-02-15 Thread Connor Abbott
On Mon, Feb 15, 2016 at 4:38 PM, Matt Turner  wrote:
> On Mon, Feb 15, 2016 at 12:50 PM, Ilia Mirkin  wrote:
>> In a few places your indentation is off -- please look at the 'git
>> diff' output, it should be pretty obvious. You used 2 spaces instead
>> of 3 (in just a handful of places).
>
> If you use vim, you can put something like this in your ~/.vimrc:
>
> autocmd BufNewFile,BufRead /home/mattst88/projects/mesa/* set
> expandtab tabstop=8 softtabstop=3 shiftwidth=3
> autocmd BufNewFile,BufRead
> /home/mattst88/projects/mesa/src/glsl/glcpp/* set noexpandtab
> tabstop=8 softtabstop=8 shiftwidth=8
> autocmd BufNewFile,BufRead
> /home/mattst88/projects/mesa/src/glsl/glsl_parser.yy set noexpandtab
> tabstop=8 shiftwidth=8
> autocmd BufNewFile,BufRead /home/mattst88/projects/piglit/* set
> noexpandtab tabstop=8 softtabstop=8 shiftwidth=8
> autocmd BufNewFile,BufRead Makefile* set noexpandtab tabstop=8
> softtabstop=8 shiftwidth=8
> autocmd BufNewFile,BufRead *.mk set noexpandtab tabstop=8
> softtabstop=8 shiftwidth=8
>
> (it'll probably get line wrapped by gmail)

or just use https://github.com/chadversary/vim-mesa that also has the
right cinoptions.

>
>> On Fri, Feb 12, 2016 at 7:38 AM, Plamena Manolova
>>  wrote:
>>> This patch moves the calculation of current uniforms to
>>> link_uniforms, which makes use of UniformRemapTable which
>>> stores all the reserved uniform locations.
>>>
>>> Location assignment for implicit uniforms now tries to use
>>> any gaps left in the table after the location assignment
>>> for explicit uniforms. This gives us more space to store more
>>> uniforms.
>>>
>>> Patch is based on earlier patch with following changes/additions:
>>>
>>>1: Move the counting of explicit locations to
>>>   check_explicit_uniform_locations and then pass
>>>   the number to link_assign_uniform_locations.
>>>2: Count the number of empty slots in UniformRemapTable
>>>   and store them in a list_head.
>>>3: Try to find an empty slot for implicit locations from
>>>   the list, if that fails resize UniformRemapTable.
>>>
>>> Fixes following CTS tests:
>>>ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
>>>
>>> ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array
>>>
>>> Signed-off-by: Tapani Pälli 
>>> Signed-off-by: Plamena Manolova 
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696
>>> ---
>>>  src/compiler/glsl/link_uniforms.cpp | 85 
>>> -
>>>  src/compiler/glsl/linker.cpp| 73 ---
>>>  src/compiler/glsl/linker.h  | 17 +++-
>>>  src/mesa/main/mtypes.h  |  8 
>>>  4 files changed, 148 insertions(+), 35 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/link_uniforms.cpp 
>>> b/src/compiler/glsl/link_uniforms.cpp
>>> index 7072c16..aa07de3 100644
>>> --- a/src/compiler/glsl/link_uniforms.cpp
>>> +++ b/src/compiler/glsl/link_uniforms.cpp
>>> @@ -1038,9 +1038,43 @@ assign_hidden_uniform_slot_id(const char *name, 
>>> unsigned hidden_id,
>>> uniform_size->map->put(hidden_uniform_start + hidden_id, name);
>>>  }
>>>
>>> +/**
>>> + * Search through the list of empty blocks to find one that fits the 
>>> current
>>> + * uniform.
>>> + */
>>> +static int
>>> +find_empty_block(struct gl_shader_program *prog,
>>> + struct gl_uniform_storage *uniform)
>>> +{
>>> +   const unsigned entries = MAX2(1, uniform->array_elements);
>>> +
>>> +foreach_list_typed(struct empty_uniform_block, block, link,
>>> +   &prog->EmptyUniformLocations) {
>>> +  /* Found a block with enough slots to fit the uniform */
>>> +  if (block->slots == entries) {
>>> + unsigned start = block->start;
>>> + exec_node_remove(&block->link);
>>> + ralloc_free(block);
>>> +
>>> + return start;
>>> + /* Found a block with more slots than needed. It can still be used. */
>>> + } else if (block->slots > entries) {
>>> + unsigned start = block->start;
>>> + block->start += entries;
>>> + block->slots -= entries;
>>> +
>>> + return start;
>>> + }
>>> +   }
>>> +
>>> +   return -1;
>>> +}
>>> +
>>>  void
>>>  link_assign_uniform_locations(struct gl_shader_program *prog,
>>> -  unsigned int boolean_true)
>>> +  unsigned int boolean_true,
>>> +  unsigned int num_explicit_uniform_locs,
>>> +  unsigned int max_uniform_locs)
>>>  {
>>> ralloc_free(prog->UniformStorage);
>>> prog->UniformStorage = NULL;
>>> @@ -1131,6 +1165,9 @@ link_assign_uniform_locations(struct 
>>> gl_shader_program *prog,
>>>
>>> parcel_out_uniform_storage parcel(prog, prog->UniformHash, uniforms, 
>>> data);
>>>
>>> +   unsigned total_entries 

Re: [Mesa-dev] [PATCH v2] st/mesa: do not init limits when compute shaders are not supported

2016-02-15 Thread Tobias Klausmann


On 15.02.2016 22:44, Samuel Pitoiset wrote:

When the number of uniform blocks is less than 12,
ARB_uniform_buffer_object can't be enabled and the maximum GL version
is not even 3.1...

This fixes a regression introduced in 7c79c1e (st/mesa: add compute
shader state) if the maximum number of uniform blocks allowed for
compute shaders is less than 12. This happens on Kepler but this might
also affect other Gallium drivers.

Signed-off-by: Samuel Pitoiset 
Reported-by: Tobias Klausmann 
Tested-by: Tobias Klausmann 
---
  src/mesa/state_tracker/st_extensions.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index bdfbded..2f5d3f7 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -75,6 +75,7 @@ static int _clamp(int a, int min, int max)
  void st_init_limits(struct pipe_screen *screen,
  struct gl_constants *c, struct gl_extensions *extensions)
  {
+   int supported_irs;
 unsigned sh;
 boolean can_ubo = TRUE;
  
@@ -177,6 +178,13 @@ void st_init_limits(struct pipe_screen *screen,

case PIPE_SHADER_COMPUTE:
   pc = &c->Program[MESA_SHADER_COMPUTE];
   options = &c->ShaderCompilerOptions[MESA_SHADER_COMPUTE];
+
+ if (!screen->get_param(screen, PIPE_CAP_COMPUTE))
+continue;
+ supported_irs =
+screen->get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUPPORTED_IRS);
+ if (!(supported_irs & (1 << PIPE_SHADER_IR_TGSI)))
+continue;
   break;
default:
   assert(0);


Reviewed-by: Tobias Klausmann



Re: [Mesa-dev] [PATCH v2] st/mesa: do not init limits when compute shaders are not supported

2016-02-15 Thread Ilia Mirkin
On Mon, Feb 15, 2016 at 4:44 PM, Samuel Pitoiset
 wrote:
> When the number of uniform blocks is less than 12,
> ARB_uniform_buffer_object can't be enabled and the maximum GL version
> is not even 3.1...
>
> This fixes a regression introduced in 7c79c1e (st/mesa: add compute
> shader state) if the maximum number of uniform blocks allowed for
> compute shaders is less than 12. This happens on Kepler but this might
> also affect other Gallium drivers.
>
> Signed-off-by: Samuel Pitoiset 
> Reported-by: Tobias Klausmann 
> Tested-by: Tobias Klausmann 

Reviewed-by: Ilia Mirkin 

> ---
>  src/mesa/state_tracker/st_extensions.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_extensions.c 
> b/src/mesa/state_tracker/st_extensions.c
> index bdfbded..2f5d3f7 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -75,6 +75,7 @@ static int _clamp(int a, int min, int max)
>  void st_init_limits(struct pipe_screen *screen,
>  struct gl_constants *c, struct gl_extensions *extensions)
>  {
> +   int supported_irs;
> unsigned sh;
> boolean can_ubo = TRUE;
>
> @@ -177,6 +178,13 @@ void st_init_limits(struct pipe_screen *screen,
>case PIPE_SHADER_COMPUTE:
>   pc = &c->Program[MESA_SHADER_COMPUTE];
>   options = &c->ShaderCompilerOptions[MESA_SHADER_COMPUTE];
> +
> + if (!screen->get_param(screen, PIPE_CAP_COMPUTE))
> +continue;
> + supported_irs =
> +screen->get_shader_param(screen, sh, 
> PIPE_SHADER_CAP_SUPPORTED_IRS);
> + if (!(supported_irs & (1 << PIPE_SHADER_IR_TGSI)))
> +continue;
>   break;
>default:
>   assert(0);
> --
> 2.6.4
>


[Mesa-dev] [PATCH v2] st/mesa: do not init limits when compute shaders are not supported

2016-02-15 Thread Samuel Pitoiset
When the number of uniform blocks is less than 12,
ARB_uniform_buffer_object can't be enabled and the maximum GL version
is not even 3.1...

This fixes a regression introduced in 7c79c1e (st/mesa: add compute
shader state) if the maximum number of uniform blocks allowed for
compute shaders is less than 12. This happens on Kepler but this might
also affect other Gallium drivers.

Signed-off-by: Samuel Pitoiset 
Reported-by: Tobias Klausmann 
Tested-by: Tobias Klausmann 
---
 src/mesa/state_tracker/st_extensions.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index bdfbded..2f5d3f7 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -75,6 +75,7 @@ static int _clamp(int a, int min, int max)
 void st_init_limits(struct pipe_screen *screen,
 struct gl_constants *c, struct gl_extensions *extensions)
 {
+   int supported_irs;
unsigned sh;
boolean can_ubo = TRUE;
 
@@ -177,6 +178,13 @@ void st_init_limits(struct pipe_screen *screen,
   case PIPE_SHADER_COMPUTE:
  pc = &c->Program[MESA_SHADER_COMPUTE];
  options = &c->ShaderCompilerOptions[MESA_SHADER_COMPUTE];
+
+ if (!screen->get_param(screen, PIPE_CAP_COMPUTE))
+continue;
+ supported_irs =
+screen->get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUPPORTED_IRS);
+ if (!(supported_irs & (1 << PIPE_SHADER_IR_TGSI)))
+continue;
  break;
   default:
  assert(0);
-- 
2.6.4
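
Stripped of the Gallium types, the guard added above is a two-step capability
test: skip the compute stage if the screen has no compute support at all, and
also skip it if the set of supported IRs does not include TGSI. The sketch
below models that with stand-in types. fake_screen, fake_ir and
compute_limits_should_be_initialized are hypothetical names invented for this
example; the real code queries pipe_screen through get_param() and
get_shader_param() as shown in the diff.

```c
#include <stdbool.h>

/* Stand-in for the IR enumeration; values are bit positions in a mask. */
enum fake_ir {
   FAKE_IR_TGSI = 0,
   FAKE_IR_NATIVE = 1,
};

/* Stand-in for the two screen queries used by the patch. */
struct fake_screen {
   bool has_compute;        /* models PIPE_CAP_COMPUTE */
   unsigned supported_irs;  /* models PIPE_SHADER_CAP_SUPPORTED_IRS bitmask */
};

/* Returns true only if compute is exposed and TGSI is among the IRs. */
static bool
compute_limits_should_be_initialized(const struct fake_screen *screen)
{
   if (!screen->has_compute)
      return false;

   return (screen->supported_irs & (1u << FAKE_IR_TGSI)) != 0;
}
```

When the function returns false, the caller would skip the compute stage, which
is what the two `continue` statements in the patch do.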



Re: [Mesa-dev] [PATCH] st/mesa: do not init limits when compute shaders are not supported

2016-02-15 Thread Samuel Pitoiset



On 02/15/2016 10:40 PM, Samuel Pitoiset wrote:

When the number of uniform blocks is less than 12,
ARB_uniform_buffer_object can't be enabled and the maximum GL version
is not even 3.1...

This fixes a regression introduced in 7c79c1e (st/mesa: add compute
shader state) if the maximum number of uniform blocks allowed for
compute shaders is less than 12. This happens on Kepler but this might
also affect other Gallium drivers.

Signed-off-by: Samuel Pitoiset 
Reported-by: Tobias Klausmann 
Tested-by: Tobias Klausmann 
---
  src/mesa/state_tracker/st_extensions.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index bdfbded..8347fec 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -75,6 +75,7 @@ static int _clamp(int a, int min, int max)
  void st_init_limits(struct pipe_screen *screen,
  struct gl_constants *c, struct gl_extensions *extensions)
  {
+   int supported_irs;
 unsigned sh;
 boolean can_ubo = TRUE;

@@ -177,6 +178,13 @@ void st_init_limits(struct pipe_screen *screen,
case PIPE_SHADER_COMPUTE:
   pc = &c->Program[MESA_SHADER_COMPUTE];
   options = &c->ShaderCompilerOptions[MESA_SHADER_COMPUTE];
+
+ if (!screen->get_param(screen, PIPE_CAP_COMPUTE))
+continue;
+ supported_irs =
+screen->get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUPPORTED_IRS);
+ if (!(compute_supported_irs & (1 << PIPE_SHADER_IR_TGSI)))


Wrong version, this should be supported_irs.


+continue;
   break;
default:
   assert(0);




[Mesa-dev] [PATCH] st/mesa: do not init limits when compute shaders are not supported

2016-02-15 Thread Samuel Pitoiset
When the number of uniform blocks is less than 12,
ARB_uniform_buffer_object can't be enabled and the maximum GL version
is not even 3.1...

This fixes a regression introduced in 7c79c1e (st/mesa: add compute
shader state) if the maximum number of uniform blocks allowed for
compute shaders is less than 12. This happens on Kepler but this might
also affect other Gallium drivers.

Signed-off-by: Samuel Pitoiset 
Reported-by: Tobias Klausmann 
Tested-by: Tobias Klausmann 
---
 src/mesa/state_tracker/st_extensions.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index bdfbded..8347fec 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -75,6 +75,7 @@ static int _clamp(int a, int min, int max)
 void st_init_limits(struct pipe_screen *screen,
 struct gl_constants *c, struct gl_extensions *extensions)
 {
+   int supported_irs;
unsigned sh;
boolean can_ubo = TRUE;
 
@@ -177,6 +178,13 @@ void st_init_limits(struct pipe_screen *screen,
   case PIPE_SHADER_COMPUTE:
  pc = &c->Program[MESA_SHADER_COMPUTE];
  options = &c->ShaderCompilerOptions[MESA_SHADER_COMPUTE];
+
+ if (!screen->get_param(screen, PIPE_CAP_COMPUTE))
+continue;
+ supported_irs =
+screen->get_shader_param(screen, sh, 
PIPE_SHADER_CAP_SUPPORTED_IRS);
+ if (!(compute_supported_irs & (1 << PIPE_SHADER_IR_TGSI)))
+continue;
  break;
   default:
  assert(0);
-- 
2.6.4



Re: [Mesa-dev] [PATCH] compiler/glsl: Fix uniform location counting.

2016-02-15 Thread Matt Turner
On Mon, Feb 15, 2016 at 12:50 PM, Ilia Mirkin  wrote:
> In a few places your indentation is off -- please look at the 'git
> diff' output, it should be pretty obvious. You used 2 spaces instead
> of 3 (in just a handful of places).

If you use vim, you can put something like this in your ~/.vimrc:

autocmd BufNewFile,BufRead /home/mattst88/projects/mesa/* set
expandtab tabstop=8 softtabstop=3 shiftwidth=3
autocmd BufNewFile,BufRead
/home/mattst88/projects/mesa/src/glsl/glcpp/* set noexpandtab
tabstop=8 softtabstop=8 shiftwidth=8
autocmd BufNewFile,BufRead
/home/mattst88/projects/mesa/src/glsl/glsl_parser.yy set noexpandtab
tabstop=8 shiftwidth=8
autocmd BufNewFile,BufRead /home/mattst88/projects/piglit/* set
noexpandtab tabstop=8 softtabstop=8 shiftwidth=8
autocmd BufNewFile,BufRead Makefile* set noexpandtab tabstop=8
softtabstop=8 shiftwidth=8
autocmd BufNewFile,BufRead *.mk set noexpandtab tabstop=8
softtabstop=8 shiftwidth=8

(it'll probably get line wrapped by gmail)

> On Fri, Feb 12, 2016 at 7:38 AM, Plamena Manolova
>  wrote:
>> This patch moves the calculation of current uniforms to
>> link_uniforms, which makes use of UniformRemapTable which
>> stores all the reserved uniform locations.
>>
>> Location assignment for implicit uniforms now tries to use
>> any gaps left in the table after the location assignment
>> for explicit uniforms. This gives us more space to store more
>> uniforms.
>>
>> Patch is based on earlier patch with following changes/additions:
>>
>>1: Move the counting of explicit locations to
>>   check_explicit_uniform_locations and then pass
>>   the number to link_assign_uniform_locations.
>>2: Count the number of empty slots in UniformRemapTable
>>   and store them in a list_head.
>>3: Try to find an empty slot for implicit locations from
>>   the list, if that fails resize UniformRemapTable.
>>
>> Fixes following CTS tests:
>>ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
>>ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array
>>
>> Signed-off-by: Tapani Pälli 
>> Signed-off-by: Plamena Manolova 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696
>> ---
>>  src/compiler/glsl/link_uniforms.cpp | 85 
>> -
>>  src/compiler/glsl/linker.cpp| 73 ---
>>  src/compiler/glsl/linker.h  | 17 +++-
>>  src/mesa/main/mtypes.h  |  8 
>>  4 files changed, 148 insertions(+), 35 deletions(-)
>>
>> diff --git a/src/compiler/glsl/link_uniforms.cpp 
>> b/src/compiler/glsl/link_uniforms.cpp
>> index 7072c16..aa07de3 100644
>> --- a/src/compiler/glsl/link_uniforms.cpp
>> +++ b/src/compiler/glsl/link_uniforms.cpp
>> @@ -1038,9 +1038,43 @@ assign_hidden_uniform_slot_id(const char *name, 
>> unsigned hidden_id,
>> uniform_size->map->put(hidden_uniform_start + hidden_id, name);
>>  }
>>
>> +/**
>> + * Search through the list of empty blocks to find one that fits the current
>> + * uniform.
>> + */
>> +static int
>> +find_empty_block(struct gl_shader_program *prog,
>> + struct gl_uniform_storage *uniform)
>> +{
>> +   const unsigned entries = MAX2(1, uniform->array_elements);
>> +
>> +foreach_list_typed(struct empty_uniform_block, block, link,
>> +   &prog->EmptyUniformLocations) {
>> +  /* Found a block with enough slots to fit the uniform */
>> +  if (block->slots == entries) {
>> + unsigned start = block->start;
>> + exec_node_remove(&block->link);
>> + ralloc_free(block);
>> +
>> + return start;
>> + /* Found a block with more slots than needed. It can still be used. */
>> + } else if (block->slots > entries) {
>> + unsigned start = block->start;
>> + block->start += entries;
>> + block->slots -= entries;
>> +
>> + return start;
>> + }
>> +   }
>> +
>> +   return -1;
>> +}
>> +
>>  void
>>  link_assign_uniform_locations(struct gl_shader_program *prog,
>> -  unsigned int boolean_true)
>> +  unsigned int boolean_true,
>> +  unsigned int num_explicit_uniform_locs,
>> +  unsigned int max_uniform_locs)
>>  {
>> ralloc_free(prog->UniformStorage);
>> prog->UniformStorage = NULL;
>> @@ -1131,6 +1165,9 @@ link_assign_uniform_locations(struct gl_shader_program 
>> *prog,
>>
>> parcel_out_uniform_storage parcel(prog, prog->UniformHash, uniforms, 
>> data);
>>
>> +   unsigned total_entries = num_explicit_uniform_locs;
>> +   unsigned empty_locs = prog->NumUniformRemapTable - 
>> num_explicit_uniform_locs;
>> +
>> for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
>>if (prog->_LinkedShaders[i] == NULL)
>>  continue;
>> @@ -1194,21 +1231,43 @@ 

Re: [Mesa-dev] [v2] Compression support for single-sampled

2016-02-15 Thread Ben Widawsky
On Thu, Feb 11, 2016 at 08:33:53PM +0200, Topi Pohjolainen wrote:
> This series enables compression for single sampled color surfaces,
> also referred to as "lossless compression". This is yet only for
> driver internal use easing pressure on memory bandwidth and caches
> when writing, blending and sampling surfaces using gpu.
> 
> As a side effect the need for color buffer resolves after fast
> clears is also decreased. Current understanding is that sampling
> engine doesn't understand meta data (auxiliary buffer) for single
> sampled fast cleared surfaces. However, if the meta data is written
> with lossless compression enabled, even sampling engine is capable
> of reading both the color buffer and the auxiliary, and resolves
> can be omitted in those cases.
> 
> The final enabling patch is dependent on earlier two-patch series
> fixing state restore mechanism in i965-meta operations.
> 
> v2 (Ben):  Use combination of msaa_layout and number of samples
>instead of introducing explicit type for lossless
>compression.
> 

I'm skipping 19 for now until we get through the rest.

In addition to the comments I left, 11,12,15,16,17 are:
Reviewed-by: Ben Widawsky 

11: I'm not quite sold on the need for general flags yet, but it doesn't
really hurt afaict.
16: I'd just squash this into 19, but I don't really care much.


Re: [Mesa-dev] [v2 18/19] i965: Add helper for lossless compression support

2016-02-15 Thread Ben Widawsky
On Thu, Feb 11, 2016 at 08:34:11PM +0200, Topi Pohjolainen wrote:
> v2: Use explicitly against base type of GL_FLOAT instead of
> using _mesa_is_format_integer_color(). Otherwise we miss
> GL_UNSIGNED_NORMALIZED.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 22 ++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  3 +++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 6c233d8..e9fbeeb 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -294,6 +294,28 @@ intel_miptree_is_lossless_compressed(const struct 
> brw_context *brw,
> return mt->num_samples <= 1;
>  }
>  
> +bool
> +intel_miptree_supports_lossless_compressed(mesa_format format)
> +{
> +   /* For now compression is only enabled for integer formats even though
> +* there exist supported floating point formats also. This is a heuristic
> +* decision based on current public benchmarks. In none of the cases these
> +* formats provided any improvement but a few cases were seen to regress.
> +* Hence these are left to to be enabled in the future when they are known
> +* to improve things.
> +*/
> +   if (_mesa_get_format_datatype(format) == GL_FLOAT)
> +  return false;
> +
> +   /* In principle, fast clear mechanism and lossless compression go hand in
> +* hand. However, fast clear can be also used to clear srgb surfaces by
> +* using equivalent linear format. This trick, however, can't be extended
> +* to be used with lossless compression and therefore a check is needed to
> +* see if the format really is linear.
> +*/
> +   return _mesa_get_srgb_format_linear(format) == format;
> +}
> +

Hmm. Doesn't this need to use the ccs_e field in surface formats, or did I miss
something?

>  /**
>   * Determine depth format corresponding to a depth+stencil format,
>   * for separate stencil.
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index 7cdfb37..6ce9f75 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -679,6 +679,9 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
> brw_context *brw,
> const struct intel_mipmap_tree 
> *mt);
>  
>  bool
> +intel_miptree_supports_lossless_compressed(mesa_format format);
> +
> +bool
>  intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
>   struct intel_mipmap_tree *mt);
>  
> -- 
> 2.5.0
> 


Re: [Mesa-dev] [v2 16/19] i965: Expose logic telling if non-msrt mcs is supported

2016-02-15 Thread Ben Widawsky
On Thu, Feb 11, 2016 at 08:34:09PM +0200, Topi Pohjolainen wrote:
> Alos use the opportunity to mark inputs constant. (Context has to be

Also

> given as read-write to intel_miptree_supports_non_msrt_fast_clear()
> to support debug output).
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 +
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 8 
>  2 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index a072268..6c233d8 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -161,8 +161,9 @@ intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree 
> *mt,
> }
>  }
>  
> -static bool
> -intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling)
> +bool
> +intel_tiling_supports_non_msrt_mcs(const struct brw_context *brw,
> +   unsigned tiling)
>  {
> /* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render
>  * Target(s)", beneath the "Fast Color Clear" bullet (p326):
> @@ -200,9 +201,9 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context 
> *brw, unsigned tiling)
>   * - MCS and Lossless compression is supported for TiledY/TileYs/TileYf
>   * non-MSRTs only.
>   */
> -static bool
> +bool
>  intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw,
> -   struct intel_mipmap_tree *mt)
> +   const struct intel_mipmap_tree 
> *mt)
>  {
> /* MCS support does not exist prior to Gen7 */
> if (brw->gen < 7)
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index a21f33f..7cdfb37 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -671,6 +671,14 @@ intel_miptree_is_lossless_compressed(const struct 
> brw_context *brw,
>   const struct intel_mipmap_tree *mt);
>  
>  bool
> +intel_tiling_supports_non_msrt_mcs(const struct brw_context *brw,
> +   unsigned tiling);
> +
> +bool
> +intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw,
> +   const struct intel_mipmap_tree 
> *mt);
> +
> +bool
>  intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
>   struct intel_mipmap_tree *mt);
>  
[snip]

With the "Also" fix in the commit message:
Reviewed-by: Ben Widawsky 

-- 
Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [v2 14/19] i965/gen9: Prepare surface state setup for lossless compression

2016-02-15 Thread Ben Widawsky
On Thu, Feb 11, 2016 at 08:34:07PM +0200, Topi Pohjolainen wrote:
> v2 (Ben): Use combination of msaa_layout and number of samples
>   instead of introducing explicit type for lossless
>   compression (intel_miptree_is_lossless_compressed()).
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h| 1 +
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 6 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index b1fa559..f903335 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -656,6 +656,7 @@
>  #define GEN8_SURFACE_AUX_MODE_MCS   1
>  #define GEN8_SURFACE_AUX_MODE_APPEND2
>  #define GEN8_SURFACE_AUX_MODE_HIZ   3
> +#define GEN9_SURFACE_AUX_MODE_CCS_E 5
>  
>  /* Surface state DW7 */
>  #define GEN9_SURFACE_RT_COMPRESSION_SHIFT   30
> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> index 0a52815..e1a37d8 100644
> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
> @@ -216,6 +216,9 @@ gen8_get_aux_mode(const struct brw_context *brw,
> if (brw->gen >= 9 || mt->num_samples == 1)
>assert(mt->halign == 16);
>  
> +   if (intel_miptree_is_lossless_compressed(brw, mt))
> +  return GEN9_SURFACE_AUX_MODE_CCS_E;
> +
> return GEN8_SURFACE_AUX_MODE_MCS;
>  }
>  
> @@ -484,6 +487,9 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
> struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
> const uint32_t aux_mode = gen8_get_aux_mode(brw, mt, surf_type);
>  
> +   if (aux_mode == GEN9_SURFACE_AUX_MODE_CCS_E)
> +  mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_UNRESOLVED;
> +

I am somewhat undecided about whether or not this should be here. On the one
hand, this is a gen-specific thing (you rendered to a losslessly compressed
buffer, which is only a gen9+ thing). On the other hand, we handle all similar
stuff in the common meta code, and modifying fast_clear_state here seems a bit
unclean.

I'm not opposed to doing this as long as you've considered my potential
objection.

> uint32_t *surf = allocate_surface_state(brw, , surf_index);
>  
> surf[0] = (surf_type << BRW_SURFACE_TYPE_SHIFT) |
> -- 
> 2.5.0
> 

-- 
Ben Widawsky, Intel Open Source Technology Center


Re: [Mesa-dev] [PATCH] compiler/glsl: Fix uniform location counting.

2016-02-15 Thread Ilia Mirkin
In a few places your indentation is off -- please look at the 'git
diff' output, it should be pretty obvious. You used 2 spaces instead
of 3 (in just a handful of places).

On Fri, Feb 12, 2016 at 7:38 AM, Plamena Manolova
 wrote:
> This patch moves the calculation of current uniforms to
> link_uniforms, which makes use of UniformRemapTable which
> stores all the reserved uniform locations.
>
> Location assignment for implicit uniforms now tries to use
> any gaps left in the table after the location assignment
> for explicit uniforms. This gives us more space to store more
> uniforms.
>
> Patch is based on earlier patch with following changes/additions:
>
>1: Move the counting of explicit locations to
>   check_explicit_uniform_locations and then pass
>   the number to link_assign_uniform_locations.
>2: Count the number of empty slots in UniformRemapTable
>   and store them in a list_head.
>3: Try to find an empty slot for implicit locations from
>   the list, if that fails resize UniformRemapTable.
>
> Fixes following CTS tests:
>ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
>ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array
>
> Signed-off-by: Tapani Pälli 
> Signed-off-by: Plamena Manolova 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696
> ---
>  src/compiler/glsl/link_uniforms.cpp | 85 
> -
>  src/compiler/glsl/linker.cpp| 73 ---
>  src/compiler/glsl/linker.h  | 17 +++-
>  src/mesa/main/mtypes.h  |  8 
>  4 files changed, 148 insertions(+), 35 deletions(-)
>
> diff --git a/src/compiler/glsl/link_uniforms.cpp 
> b/src/compiler/glsl/link_uniforms.cpp
> index 7072c16..aa07de3 100644
> --- a/src/compiler/glsl/link_uniforms.cpp
> +++ b/src/compiler/glsl/link_uniforms.cpp
> @@ -1038,9 +1038,43 @@ assign_hidden_uniform_slot_id(const char *name, 
> unsigned hidden_id,
> uniform_size->map->put(hidden_uniform_start + hidden_id, name);
>  }
>
> +/**
> + * Search through the list of empty blocks to find one that fits the current
> + * uniform.
> + */
> +static int
> +find_empty_block(struct gl_shader_program *prog,
> + struct gl_uniform_storage *uniform)
> +{
> +   const unsigned entries = MAX2(1, uniform->array_elements);
> +
> +foreach_list_typed(struct empty_uniform_block, block, link,
> +   >EmptyUniformLocations) {
> +  /* Found a block with enough slots to fit the uniform */
> +  if (block->slots == entries) {
> + unsigned start = block->start;
> + exec_node_remove(>link);
> + ralloc_free(block);
> +
> + return start;
> + /* Found a block with more slots than needed. It can still be used. */
> + } else if (block->slots > entries) {
> + unsigned start = block->start;
> + block->start += entries;
> + block->slots -= entries;
> +
> + return start;
> + }
> +   }
> +
> +   return -1;
> +}
> +
>  void
>  link_assign_uniform_locations(struct gl_shader_program *prog,
> -  unsigned int boolean_true)
> +  unsigned int boolean_true,
> +  unsigned int num_explicit_uniform_locs,
> +  unsigned int max_uniform_locs)
>  {
> ralloc_free(prog->UniformStorage);
> prog->UniformStorage = NULL;
> @@ -1131,6 +1165,9 @@ link_assign_uniform_locations(struct gl_shader_program 
> *prog,
>
> parcel_out_uniform_storage parcel(prog, prog->UniformHash, uniforms, 
> data);
>
> +   unsigned total_entries = num_explicit_uniform_locs;
> +   unsigned empty_locs = prog->NumUniformRemapTable - 
> num_explicit_uniform_locs;
> +
> for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
>if (prog->_LinkedShaders[i] == NULL)
>  continue;
> @@ -1194,21 +1231,43 @@ link_assign_uniform_locations(struct 
> gl_shader_program *prog,
>/* how many new entries for this uniform? */
>const unsigned entries = MAX2(1, uniforms[i].array_elements);
>
> -  /* resize remap table to fit new entries */
> -  prog->UniformRemapTable =
> - reralloc(prog,
> -  prog->UniformRemapTable,
> -  gl_uniform_storage *,
> -  prog->NumUniformRemapTable + entries);
> +  /* Find UniformRemapTable for empty blocks where we can fit this 
> uniform. */
> +  int chosen_location = -1;
> +
> +  if (empty_locs)
> + chosen_location = find_empty_block(prog, [i]);
> +
> +  if (chosen_location != -1) {
> + empty_locs -= entries;
> +  } else {
> + chosen_location = prog->NumUniformRemapTable;
> +
> + /* Add new entries to the total amount of entries. */
> + total_entries += entries;
> +
> + /* resize remap table 
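
The gap-reuse scheme described in the quoted commit message (try to fit an
implicit uniform into a hole left by the explicit locations, otherwise grow
the table) can be sketched roughly as below. This is only an illustration
under assumptions: struct empty_block and find_gap() are hypothetical names,
a plain array stands in for the exec_list used by the patch, and a consumed
block is marked with slots == 0 instead of being unlinked and freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's empty_uniform_block; a plain
 * array replaces the linked list used in the real code. */
struct empty_block {
   unsigned start;   /* first free location in the gap */
   unsigned slots;   /* number of free locations; 0 means consumed */
};

/* Return the first location of a gap that can hold `entries` consecutive
 * locations, or -1 if no gap is large enough (first-fit search). */
static int
find_gap(struct empty_block *blocks, size_t num_blocks, unsigned entries)
{
   assert(entries > 0);

   for (size_t i = 0; i < num_blocks; i++) {
      if (blocks[i].slots >= entries) {
         const unsigned start = blocks[i].start;

         /* Shrink the gap from the front. An exact fit leaves
          * slots == 0, which marks the block as consumed. */
         blocks[i].start += entries;
         blocks[i].slots -= entries;
         return (int) start;
      }
   }

   return -1;
}
```

A caller would fall back to appending at the end of the remap table when
find_gap() returns -1, which mirrors the else branch in the quoted hunk.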

Re: [Mesa-dev] [v2 13/19] i965: Set buffer cleared after actually clearing it

2016-02-15 Thread Ben Widawsky
On Thu, Feb 11, 2016 at 08:34:06PM +0200, Topi Pohjolainen wrote:
> Subsequent patch will modify the surface state to set state to
> unresolved whenever the surface is used as render target. Color
> resolve itself will use the same surface setup path and marking
> the buffer as cleared after the draw call ensures that the state
> is correct after the resolve.
> 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
> b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> index e92ae6c..6af6985 100644
> --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
> @@ -878,11 +878,12 @@ brw_meta_resolve_color(struct brw_context *brw,
> else
>set_fast_clear_op(brw, GEN7_PS_RENDER_TARGET_RESOLVE_ENABLE);
>  
> -   mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
> get_resolve_rect(brw, mt, );
>  
> brw_draw_rectlist(brw, , 1);
>  
> +   mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
> +
> set_fast_clear_op(brw, 0);
> use_rectlist(brw, false);
>  

This seems correct, as it accurately reflects the state. Given my understanding
of meta, though, I'm not at all sure that this doesn't change more than it
seems to. Furthermore, if the existing ordering is important, a comment
explaining why is sorely lacking.

Kristian, you wrote this originally; would you mind looking at this one? With
Kristian's ack, this is:
Reviewed-by: Ben Widawsky 


-- 
Ben Widawsky, Intel Open Source Technology Center


Re: [Mesa-dev] [PATCH] mesa: add always-false-for-now enables for GL 4.3, 4.4, 4.5.

2016-02-15 Thread Ilia Mirkin
ping? Ian, did you have a chance to look over my answers? Were you
hoping something more thorough would get done for these checks?

On Fri, Jan 8, 2016 at 6:24 PM, Ilia Mirkin  wrote:
> On Fri, Jan 8, 2016 at 6:19 PM, Ian Romanick  wrote:
>> On 01/08/2016 12:50 PM, Ilia Mirkin wrote:
>>> As the relevant extensions get implemented, the lines should be
>>> uncommented. I believe this is (almost) everything needed for those GL
>>> versions though.
>>
>> It looks like this matches the list in GL3.txt for 4.3 and 4.4 anyway.
>> There appear to be some missing bits for 4.5.  Did you check to see if
>> any minimum maximums changed?
>
> The SSBO min changed to 1<<27 ... somewhere. But I didn't do anything 
> thorough.
>
>>
>>>
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>  src/mesa/main/version.c | 51 
>>> +++--
>>>  1 file changed, 49 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
>>> index 112a73d..6a90661 100644
>>> --- a/src/mesa/main/version.c
>>> +++ b/src/mesa/main/version.c
>>> @@ -351,8 +351,55 @@ compute_version(const struct gl_extensions *extensions,
>>>   extensions->ARB_shading_language_packing &&
>>>   extensions->ARB_texture_compression_bptc &&
>>>   extensions->ARB_transform_feedback_instanced);
>>> -
>>> -   if (ver_4_2) {
>>> +   const bool ver_4_3 = (ver_4_2 &&
>>> + consts->GLSLVersion >= 430 &&
>>> + extensions->ARB_ES3_compatibility &&
>>> + extensions->ARB_arrays_of_arrays &&
>>> + extensions->ARB_compute_shader &&
>>> + extensions->ARB_copy_image &&
>>> + extensions->ARB_explicit_uniform_location &&
>>> + extensions->ARB_fragment_layer_viewport &&
>>> + extensions->ARB_framebuffer_no_attachments &&
>>> + /* extensions->ARB_internalformat_query2 */ 0 &&
>>> + /* extensions->ARB_robust_buffer_access_behavior 
>>> */ 0 &&
>>> + extensions->ARB_shader_image_size &&
>>> + extensions->ARB_shader_storage_buffer_object &&
>>> + extensions->ARB_stencil_texturing &&
>>> + extensions->ARB_texture_buffer_range &&
>>> + extensions->ARB_texture_query_levels &&
>>> + extensions->ARB_texture_view);
>>> +   const bool ver_4_4 = (ver_4_3 &&
>>> + consts->GLSLVersion >= 440 &&
>>> + extensions->ARB_buffer_storage &&
>>> + extensions->ARB_clear_texture &&
>>> + extensions->ARB_enhanced_layouts &&
>>> + /* extensions->ARB_query_buffer_object */ 0 &&
>>> + extensions->ARB_texture_mirror_clamp_to_edge &&
>>> + extensions->ARB_texture_stencil8 &&
>>> + extensions->ARB_vertex_type_10f_11f_11f_rev);
>>> +   const bool ver_4_5 = (ver_4_4 &&
>>> + consts->GLSLVersion >= 450 &&
>>> + /* extensions->ARB_ES3_1_compatibility */ 0 &&
>>> + extensions->ARB_clip_control &&
>>> + extensions->ARB_conditional_render_inverted &&
>>> + /* extensions->ARB_cull_distance */ 0 &&
>>> + extensions->ARB_derivative_control &&
>>> + extensions->ARB_shader_texture_image_samples &&
>>> + extensions->NV_texture_barrier);
>>
>> GL3.txt also lists:
>>
>> GL_KHR_robust_buffer_access_behavior
>> GL_KHR_robustness
>
> These two are identical to the ARB versions from what I can tell
> (except that there are minor EGL interactions, etc). Didn't seem like
> they'd be getting their own enables. I'd actually be in favor of just
> removing them from GL3.txt, unless I've misunderstood what was going
> on.
>
>> GL_EXT_shader_integer_mix
>
> Well, you'd also get this via GL_ARB_ES3_1_compatibility right? From that 
> spec:
>
> - new GLSL built-in functions which extend mix() to select between int,
> uint, and bool components.
>
>
>>
>> Though I guess you could argue that GL_EXT_shader_integer_mix is really
>> just GLSL 4.50.
>>
>>> +
>>> +   if (ver_4_5) {
>>> +  major = 4;
>>> +  minor = 5;
>>> +   }
>>> +   else if (ver_4_4) {
>>> +  major = 4;
>>> +  minor = 4;
>>> +   }
>>> +   else if (ver_4_3) {
>>> +  major = 4;
>>> +  minor = 3;
>>> +   }
>>> +   else if (ver_4_2) {
>>>major = 4;
>>>minor = 2;
>>> }
>>>
>>
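
For readers skimming the quoted hunk: each ver_4_x flag requires the previous
one, so a single missing prerequisite caps the reported version, and the
commented-out "0 &&" terms keep a version disabled until its extension lands.
A reduced sketch of that chaining follows. The struct and its fields are
hypothetical placeholders, with one flag standing in for each version's whole
extension list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced feature set; the real check tests many
 * more extension flags per version. */
struct features {
   int glsl_version;      /* e.g. 430 for GLSL 4.30 */
   bool has_4_2;          /* everything required for GL 4.2 */
   bool compute_shader;   /* stands in for the GL 4.3 extension list */
   bool buffer_storage;   /* stands in for the GL 4.4 extension list */
   bool clip_control;     /* stands in for the GL 4.5 extension list */
};

/* Return the highest supported GL 4.x minor version, or -1 if the
 * feature set does not even reach 4.2. */
static int
highest_gl4_minor(const struct features *f)
{
   const bool ver_4_2 = f->has_4_2;
   const bool ver_4_3 = ver_4_2 && f->glsl_version >= 430 &&
                        f->compute_shader;
   const bool ver_4_4 = ver_4_3 && f->glsl_version >= 440 &&
                        f->buffer_storage;
   const bool ver_4_5 = ver_4_4 && f->glsl_version >= 450 &&
                        f->clip_control;

   /* Test from newest to oldest so the first match wins. */
   if (ver_4_5)
      return 5;
   if (ver_4_4)
      return 4;
   if (ver_4_3)
      return 3;
   if (ver_4_2)
      return 2;
   return -1;
}
```

Note that a driver with every 4.5 extension but a missing 4.4 one still
reports 4.3 here, which matches the structure of the patch.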

Re: [Mesa-dev] [v2 05/19] i965: Add helper for detecting lossless compression

2016-02-15 Thread Ben Widawsky
On Sat, Feb 13, 2016 at 09:36:01AM +0200, Pohjolainen, Topi wrote:
> On Thu, Feb 11, 2016 at 01:48:18PM -0800, Ben Widawsky wrote:
> > On Thu, Feb 11, 2016 at 08:33:58PM +0200, Topi Pohjolainen wrote:
> > > Signed-off-by: Topi Pohjolainen 
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 26 
> > > ++
> > >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 
> > >  2 files changed, 30 insertions(+)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > index 5f739d9..31de1ff 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > > @@ -266,6 +266,32 @@ intel_miptree_supports_non_msrt_fast_clear(struct 
> > > brw_context *brw,
> > >return true;
> > >  }
> > >  
> > > +/* On Gen9 support for color buffer compression was extended to single
> > > + * sampled surfaces. This is a helper considering both auxiliary buffer
> > > + * type and number of samples telling if the given miptree represents
> > > + * the new single sampled case - also called lossless compression.
> > > + */
> > > +bool
> > > +intel_miptree_is_lossless_compressed(const struct brw_context *brw,
> > > + const struct intel_mipmap_tree *mt)
> > > +{
> > > +   /* Only available from Gen9 onwards. */
> > > +   if (brw->gen < 9)
> > > +  return false;
> > > +
> > > +   /* Compression always requires auxiliary buffer. */
> > > +   if (!mt->mcs_mt)
> > > +  return false;
> > > +
> > > +   /* Single sample compression is represented re-using msaa compression
> > > +* layout type: "Compressed Multisampled Surfaces".
> > > +*/
> > > +   if (mt->msaa_layout != INTEL_MSAA_LAYOUT_CMS)
> > > +  return false;
> > > +
> > > +   /* And finally distinguish between msaa and single sample case. */
> > > +   return mt->num_samples <= 1;
> > > +}
> > >  
> > >  /**
> > >   * Determine depth format corresponding to a depth+stencil format,
> > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > index 64f73ea..13d4d7e 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> > > @@ -667,6 +667,10 @@ intel_get_non_msrt_mcs_alignment(struct 
> > > intel_mipmap_tree *mt,
> > >   unsigned *width_px, unsigned *height);
> > >  
> > >  bool
> > > +intel_miptree_is_lossless_compressed(const struct brw_context *brw,
> > > + const struct intel_mipmap_tree *mt);
> > > +
> > > +bool
> > >  intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
> > >   struct intel_mipmap_tree *mt);
> > >  
> > 
> > So I take it this was one of the big changes from my feedback earlier. Do you
> > prefer this result, or what? I can live with the old thing if you want it.
> 
> I think I like this better, it avoids the CSS/CCS confusion. And in principle,
> both MCS and CCS_E are about compression, and the fact that hardware
> implements them separately only needs to be reflected in surface setup and
> color resolve logic. No need to go further.

I'm glad you like it. I hope it didn't cost too much time, and more importantly,
I hope we don't decide the other way was better at some later date :-)

Reviewed-by: Ben Widawsky 
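
For reference, the decision chain in the quoted helper can be exercised in
isolation with a sketch like the one below. The types here are hypothetical,
reduced stand-ins for brw_context and intel_mipmap_tree (only the fields the
helper reads), so this shows the logic and is not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced stand-ins for the driver types. */
enum msaa_layout { LAYOUT_NONE, LAYOUT_IMS, LAYOUT_UMS, LAYOUT_CMS };

struct ctx {
   int gen;                  /* hardware generation */
};

struct miptree {
   const void *aux;          /* auxiliary (MCS) buffer, NULL if absent */
   enum msaa_layout layout;
   unsigned num_samples;
};

static bool
is_lossless_compressed(const struct ctx *c, const struct miptree *mt)
{
   /* Only available from Gen9 onwards. */
   if (c->gen < 9)
      return false;

   /* Compression always requires an auxiliary buffer. */
   if (mt->aux == NULL)
      return false;

   /* The single-sample case reuses the compressed MSAA layout type. */
   if (mt->layout != LAYOUT_CMS)
      return false;

   /* Finally, distinguish the single-sample case from real MSAA. */
   return mt->num_samples <= 1;
}
```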

-- 
Ben Widawsky, Intel Open Source Technology Center


Re: [Mesa-dev] [PATCH] mesa: Fix test for big-endian architecture in compiler.h

2016-02-15 Thread Oded Gabbay
On Mon, Feb 15, 2016 at 8:33 PM, Jochen Rollwagen  wrote:
> Am 15.02.2016 um 15:53 schrieb Oded Gabbay:
>>
>> On Sat, Feb 13, 2016 at 2:39 AM, Roland Scheidegger 
>> wrote:
>>>
>>> Am 12.02.2016 um 10:01 schrieb Jochen Rollwagen:

 Hi,

 i think i found & fixed a bug in mesa concerning tests for big-endian
 machines. The defines tested don't exist or are wrongly defined so the
 test (probably) never fires. The gcc defines on my machine concerning
 big-endian are

 jochen@mac-mini:~/sources/mesa$ gcc -dM -E - < /dev/null | grep BIG
 #define __BIGGEST_ALIGNMENT__ 16
 #define __BIG_ENDIAN__ 1
 #define __FLOAT_WORD_ORDER__ __ORDER_BIG_ENDIAN__
 #define _BIG_ENDIAN 1
 #define __ORDER_BIG_ENDIAN__ 4321
 #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__

 The tested values in current mesa are quite different :-)

 The following patch fixes this.

 diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
 index c5ee741..99c63cb 100644
 --- a/src/mesa/main/compiler.h
 +++ b/src/mesa/main/compiler.h
 @@ -52,7 +52,7 @@ extern "C" {
* Try to use a runtime test instead.
* For now, only used by some DRI hardware drivers for color/texel
 packing.
*/
 -#if defined(BYTE_ORDER) && defined(BIG_ENDIAN) && BYTE_ORDER ==
 BIG_ENDIAN
 +#if defined(__BYTE_ORDER__) && defined(__BIG_ENDIAN__) &&
 __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
   #if defined(__linux__)
   #include 
   #define CPU_TO_LE32( x )   bswap_32( x )

>>> Note that on some platforms this file would include endian.h - which
>>> defines those BYTE_ORDER etc. values. Albeit it includes this _after_
>>> these ifdefs...
>>> But don't ask me how this is really supposed to work...
>>>
>>> Roland
>>
>>  includes  which includes 
>>
>> However, this depends on the c/h files to include  before
>> including "compiler.h", which doesn't always happen (e.g
>> dummy_errors.c) and it is a very fragile situation.
>>
>> So I think this is a good fix and this patch is:
>> Reviewed-by: Oded Gabbay 
>>
>> Jochen,
>>
>> Note that I downloaded this patch from pw and it was malformed. I
>> don't know if it's a pw problem or a problem in how you sent the patch
>> to the ml.
>>
>>  Oded
>>
>>
>>>
>
> Well, i just copied it from the git-diff-terminal and pasted it into my
> mail-client. Maybe a newline problem ? Anyway, i attached the patch (and
> patched my local mesa with it before, which worked :-) ).
>
> Cheers
>
> Jochen

Hi Jochen,
This is just a plain patch and not a commit created by git.
Do you want to create a proper commit and send it (using git send-email), or
do you want me to do it for you? Of course I'll keep you as author.

Thanks,

 Oded
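
Since the comment in compiler.h already suggests a runtime test, here is a
small sketch comparing the compile-time answer from the predefined macros
quoted earlier in the thread with a runtime probe. The function names are
made up for illustration, and the sketch assumes a GCC- or Clang-style
compiler for the macro path; elsewhere the compile-time function reports
"unknown".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Compile-time answer from the compiler's predefined macros.
 * Returns 1 for big-endian, 0 otherwise, or -1 if the macros are
 * not defined by this compiler. */
static int
compile_time_big_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
   return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : 0;
#else
   return -1;
#endif
}

/* Runtime answer: look at the first byte in memory of a known 32-bit
 * value. On a big-endian machine the most significant byte comes first. */
static bool
runtime_big_endian(void)
{
   const uint32_t probe = 0x01020304u;
   unsigned char first;

   memcpy(&first, &probe, 1);
   return first == 0x01;
}
```

The runtime probe does not depend on which headers happened to be included
before compiler.h, which is the fragility discussed above.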


Re: [Mesa-dev] [PATCH 2/6] st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap

2016-02-15 Thread Brian Paul

On 02/14/2016 10:44 AM, Ilia Mirkin wrote:

On Sun, Feb 14, 2016 at 9:47 AM, Brian Paul  wrote:

On 02/13/2016 01:03 PM, Ilia Mirkin wrote:


On Fri, Feb 12, 2016 at 10:43 AM, Brian Paul  wrote:


diff --git a/src/mesa/state_tracker/st_context.c
b/src/mesa/state_tracker/st_context.c
index 9016846..cb2c390 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -241,16 +241,23 @@ st_create_context_priv( struct gl_context *ctx,
struct pipe_context *pipe,
  else
 st->internal_target = PIPE_TEXTURE_RECT;

-   /* Vertex element objects used for drawing rectangles for glBitmap,
-* glDrawPixels, glClear, etc.
+   /* Setup vertex element info for 'struct st_util_vertex'.
   */
-   for (i = 0; i < ARRAY_SIZE(st->velems_util_draw); i++) {
-  memset(>velems_util_draw[i], 0, sizeof(struct
pipe_vertex_element));
-  st->velems_util_draw[i].src_offset = i * 4 * sizeof(float);
-  st->velems_util_draw[i].instance_divisor = 0;
-  st->velems_util_draw[i].vertex_buffer_index =
-cso_get_aux_vertex_buffer_slot(st->cso_context);
-  st->velems_util_draw[i].src_format =
PIPE_FORMAT_R32G32B32A32_FLOAT;
+   {
+  const unsigned slot =
cso_get_aux_vertex_buffer_slot(st->cso_context);



Can the aux vertex buffer slot change over time? If so, you need some
logic to update the vertex_buffer_index for these. From what I can
tell it's always 0, not sure what the intention behind it is... seems
like it'll be a very annoying problem to debug down the line should it
ever change. Thoughts?



It's hard-wired to zero as you say but I imagine it could be computed by
examining the current vertex buffer bindings state to find a free slot such
that we might be able to avoid saving/restoring all the vertex buffer
bindings.  I believe Marek wrote the code in question.

In any case, my patch doesn't change how/where the
cso_get_aux_vertex_buffer_slot() function is used.  For now, I could add an
assertion that slot==0 to help catch future issues.

Sound OK?


My concern is the following sequence:

aux vertex buffer slot == 0;
initialize vertex elements assuming that aux vertex buffer slot == 0;
set aux vertex buffer slot = 1;
bind vbo's based on current aux vertex buffer slot ( == 1);

So I'd rather that assert go where the vertex buffers get bound,
rather than here. Maybe just assert that vertex_buffer_index ==
aux_buffer_slot when you bind the vertex elements?


OK.
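
The bind-time check agreed on above could look roughly like this. It is a
sketch under assumptions: struct util_state and both functions are
hypothetical and reduce the state tracker to the two values that must agree,
namely the buffer index baked into the vertex elements at context creation
and the aux slot in effect when the vertex buffer is bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced state: one cached vertex element and the current
 * auxiliary vertex buffer slot. */
struct util_state {
   unsigned velem_buffer_index;  /* slot baked in at context creation */
   unsigned aux_slot;            /* slot used when the buffer is bound */
};

/* True if the cached vertex element still refers to the slot the vertex
 * buffer is about to be bound to. */
static bool
velems_match_aux_slot(const struct util_state *s)
{
   return s->velem_buffer_index == s->aux_slot;
}

/* Bind-time check: catches the "slot changed after setup" sequence in
 * debug builds. Returns the slot that would be used for binding. */
static unsigned
bind_util_vertex_buffer(const struct util_state *s)
{
   assert(velems_match_aux_slot(s));
   return s->aux_slot;
}
```

Putting the assert at bind time rather than at setup time means it fires in
exactly the sequence Ilia describes, where the slot changes after the vertex
elements were initialized.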






Does the series look OK to you overall?


I'm looking through it... but I'm slow at this stuff, since I'm not
super familiar with all the concepts. Perhaps convince one of your
coworkers to also have a look? :)


Yes, Jose already did, but I wasn't sure if I should wait for more from 
you.  Thanks.


-Brian




[Mesa-dev] [PATCH 1/2] cso: add CSO_BITS_ALL_SHADERS

2016-02-15 Thread Brian Paul
For saving/restoring all shader stages.
---
 src/gallium/auxiliary/cso_cache/cso_context.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.h 
b/src/gallium/auxiliary/cso_cache/cso_context.h
index 0305451..a3563d8 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.h
+++ b/src/gallium/auxiliary/cso_cache/cso_context.h
@@ -171,6 +171,12 @@ void cso_set_render_condition(struct cso_context *cso,
 #define CSO_BIT_VERTEX_SHADER 0x2
 #define CSO_BIT_VIEWPORT  0x4
 
+#define CSO_BITS_ALL_SHADERS (CSO_BIT_VERTEX_SHADER | \
+  CSO_BIT_FRAGMENT_SHADER | \
+  CSO_BIT_GEOMETRY_SHADER | \
+  CSO_BIT_TESSCTRL_SHADER | \
+  CSO_BIT_TESSEVAL_SHADER)
+
 void cso_save_state(struct cso_context *cso, unsigned state_mask);
 void cso_restore_state(struct cso_context *cso);
 
-- 
1.9.1
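
As a quick illustration of what the combined mask buys callers, the sketch
below builds an "all shaders" mask and counts how many shader stages a given
state mask covers. The BIT_* values are hypothetical and chosen only for this
example; the real bit assignments are the ones in cso_context.h.

```c
#include <assert.h>

/* Hypothetical bit assignments, for illustration only. */
#define BIT_VERTEX_SHADER    (1u << 0)
#define BIT_FRAGMENT_SHADER  (1u << 1)
#define BIT_GEOMETRY_SHADER  (1u << 2)
#define BIT_TESSCTRL_SHADER  (1u << 3)
#define BIT_TESSEVAL_SHADER  (1u << 4)
#define BIT_VIEWPORT         (1u << 5)

/* One name for "every shader stage", mirroring CSO_BITS_ALL_SHADERS. */
#define BITS_ALL_SHADERS (BIT_VERTEX_SHADER | \
                          BIT_FRAGMENT_SHADER | \
                          BIT_GEOMETRY_SHADER | \
                          BIT_TESSCTRL_SHADER | \
                          BIT_TESSEVAL_SHADER)

/* Count how many shader stages are selected in state_mask. */
static unsigned
count_shader_stages(unsigned state_mask)
{
   unsigned n = 0;

   /* m &= m - 1 clears the lowest set bit on each iteration. */
   for (unsigned m = state_mask & BITS_ALL_SHADERS; m != 0; m &= m - 1)
      n++;

   return n;
}
```

With a single combined macro, a new shader stage only needs to be added in
one place instead of in every save-state call site.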



[Mesa-dev] [PATCH 2/2] st/mesa: use new CSO_BITS_ALL_SHADERS

2016-02-15 Thread Brian Paul
---
 src/mesa/state_tracker/st_cb_bitmap.c | 9 +++--
 src/mesa/state_tracker/st_cb_clear.c  | 8 ++--
 src/mesa/state_tracker/st_cb_drawpixels.c | 8 ++--
 src/mesa/state_tracker/st_cb_texture.c| 8 ++--
 4 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bitmap.c 
b/src/mesa/state_tracker/st_cb_bitmap.c
index e27d487..4fd2dfe 100644
--- a/src/mesa/state_tracker/st_cb_bitmap.c
+++ b/src/mesa/state_tracker/st_cb_bitmap.c
@@ -220,14 +220,11 @@ setup_render_state(struct gl_context *ctx,
 CSO_BIT_FRAGMENT_SAMPLERS |
 CSO_BIT_FRAGMENT_SAMPLER_VIEWS |
 CSO_BIT_VIEWPORT |
-CSO_BIT_FRAGMENT_SHADER |
 CSO_BIT_STREAM_OUTPUTS |
-CSO_BIT_TESSCTRL_SHADER |
-CSO_BIT_TESSEVAL_SHADER |
-CSO_BIT_GEOMETRY_SHADER |
 CSO_BIT_VERTEX_ELEMENTS |
-CSO_BIT_VERTEX_SHADER |
-CSO_BIT_AUX_VERTEX_BUFFER_SLOT));
+CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
+CSO_BITS_ALL_SHADERS));
+
 
/* rasterizer state: just scissor */
st->bitmap.rasterizer.scissor = ctx->Scissor.EnableFlags & 1;
diff --git a/src/mesa/state_tracker/st_cb_clear.c 
b/src/mesa/state_tracker/st_cb_clear.c
index 01f1c05..5580146 100644
--- a/src/mesa/state_tracker/st_cb_clear.c
+++ b/src/mesa/state_tracker/st_cb_clear.c
@@ -203,14 +203,10 @@ clear_with_quad(struct gl_context *ctx, unsigned 
clear_buffers)
 CSO_BIT_SAMPLE_MASK |
 CSO_BIT_MIN_SAMPLES |
 CSO_BIT_VIEWPORT |
-CSO_BIT_FRAGMENT_SHADER |
 CSO_BIT_STREAM_OUTPUTS |
-CSO_BIT_VERTEX_SHADER |
-CSO_BIT_TESSCTRL_SHADER |
-CSO_BIT_TESSEVAL_SHADER |
-CSO_BIT_GEOMETRY_SHADER |
 CSO_BIT_VERTEX_ELEMENTS |
-CSO_BIT_AUX_VERTEX_BUFFER_SLOT));
+CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
+CSO_BITS_ALL_SHADERS));
 
/* blend state: RGBA masking */
{
diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 9c955a5..d1fe330 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -479,14 +479,10 @@ draw_textured_quad(struct gl_context *ctx, GLint x, GLint 
y, GLfloat z,
  CSO_BIT_VIEWPORT |
  CSO_BIT_FRAGMENT_SAMPLERS |
  CSO_BIT_FRAGMENT_SAMPLER_VIEWS |
- CSO_BIT_FRAGMENT_SHADER |
  CSO_BIT_STREAM_OUTPUTS |
- CSO_BIT_VERTEX_SHADER |
- CSO_BIT_TESSCTRL_SHADER |
- CSO_BIT_TESSEVAL_SHADER |
- CSO_BIT_GEOMETRY_SHADER |
  CSO_BIT_VERTEX_ELEMENTS |
- CSO_BIT_AUX_VERTEX_BUFFER_SLOT);
+ CSO_BIT_AUX_VERTEX_BUFFER_SLOT |
+ CSO_BITS_ALL_SHADERS);
if (write_stencil) {
   cso_state_mask |= (CSO_BIT_DEPTH_STENCIL_ALPHA |
  CSO_BIT_BLEND);
diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index 5f76e44..a06cc72 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -1341,12 +1341,8 @@ try_pbo_upload_common(struct gl_context *ctx,
 CSO_BIT_VIEWPORT |
 CSO_BIT_BLEND |
 CSO_BIT_RASTERIZER |
-CSO_BIT_VERTEX_SHADER |
-CSO_BIT_GEOMETRY_SHADER |
-CSO_BIT_TESSCTRL_SHADER |
-CSO_BIT_TESSEVAL_SHADER |
-CSO_BIT_FRAGMENT_SHADER |
-CSO_BIT_STREAM_OUTPUTS));
+CSO_BIT_STREAM_OUTPUTS |
+CSO_BITS_ALL_SHADERS));
cso_save_constant_buffer_slot0(cso, PIPE_SHADER_FRAGMENT);
 
 
-- 
1.9.1



[Mesa-dev] [PATCH] mesa: implement a display list / glBitmap texture atlas

2016-02-15 Thread Brian Paul
This improves the performance of applications which use glXUseXFont()
or wglUseFontBitmaps() and glCallLists() to draw bitmap text.

Basically, we collect all the glBitmap images from the display lists
and put them into a texture atlas.  To render the bitmaps for a
glCallLists() command, we render a set of textured quads where each
quad is textured with one bitmap image.  Actually, the rendering part
has to be done by the Mesa driver or Mesa/gallium state tracker.

Note that GLUT demos that use glutBitmapCharacter() don't benefit
from this.

v2, per Nicolai Hähnle:
- check the max tex rect size is at least 1024.
- add comment in dd.h that texture_rectangle is required.
- in _mesa_DeleteLists(), try to delete the atlas before the list(s)
---
 src/mesa/main/dd.h |   9 ++
 src/mesa/main/dlist.c  | 385 +
 src/mesa/main/dlist.h  |  38 +
 src/mesa/main/mtypes.h |   1 +
 src/mesa/main/shared.c |  15 ++
 5 files changed, 448 insertions(+)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 19ef304..3f5aa5d 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -35,6 +35,7 @@
 
 #include "glheader.h"
 
+struct gl_bitmap_atlas;
 struct gl_buffer_object;
 struct gl_context;
 struct gl_display_list;
@@ -154,6 +155,14 @@ struct dd_function_table {
   GLint x, GLint y, GLsizei width, GLsizei height,
   const struct gl_pixelstore_attrib *unpack,
   const GLubyte *bitmap );
+
+   /**
+* Called by display list code for optimized glCallLists/glBitmap rendering
+* The driver must support texture rectangles of width 1024 or more.
+*/
+   void (*DrawAtlasBitmaps)(struct gl_context *ctx,
+const struct gl_bitmap_atlas *atlas,
+GLuint count, const GLubyte *ids);
/*@}*/
 

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 0e25efb..afd2d83 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -72,6 +72,9 @@
 #include "vbo/vbo.h"
 
 
+#define USE_BITMAP_ATLAS 1
+
+
 
 /**
  * Other parts of Mesa (such as the VBO module) can plug into the display
@@ -606,6 +609,261 @@ void mesa_print_display_list(GLuint list);
 
 
 /**
+ * Does the given display list only contain a single glBitmap call?
+ */
+static bool
+is_bitmap_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   if (n[0].opcode == OPCODE_BITMAP) {
+  n += InstSize[OPCODE_BITMAP];
+  if (n[0].opcode == OPCODE_END_OF_LIST)
+ return true;
+   }
+   return false;
+}
+
+
+/**
+ * Is the given display list an empty list?
+ */
+static bool
+is_empty_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   return n[0].opcode == OPCODE_END_OF_LIST;
+}
+
+
+/**
+ * Delete/free a gl_bitmap_atlas.  Called during context tear-down.
+ */
+void
+_mesa_delete_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas 
*atlas)
+{
+   if (atlas->texObj) {
+  ctx->Driver.DeleteTexture(ctx, atlas->texObj);
+   }
+   free(atlas->glyphs);
+}
+
+
+/**
+ * Lookup a gl_bitmap_atlas by listBase ID.
+ */
+static struct gl_bitmap_atlas *
+lookup_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   atlas = _mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase);
+   return atlas;
+}
+
+
+/**
+ * Create new bitmap atlas and insert into hash table.
+ */
+static struct gl_bitmap_atlas *
+alloc_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   assert(_mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase) == NULL);
+
+   atlas = calloc(1, sizeof(*atlas));
+   if (atlas) {
+  _mesa_HashInsert(ctx->Shared->BitmapAtlas, listBase, atlas);
+   }
+
+   return atlas;
+}
+
+
+/**
+ * Try to build a bitmap atlas.  This involves examining a sequence of
+ * display lists which contain glBitmap commands and putting the bitmap
+ * images into a texture map (the atlas).
+ * If we succeed, gl_bitmap_atlas::complete will be set to true.
+ * If we fail, gl_bitmap_atlas::incomplete will be set to true.
+ */
+static void
+build_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas *atlas,
+   GLuint listBase)
+{
+   unsigned i, row_height = 0, xpos = 0, ypos = 0;
+   GLubyte *map;
+   GLint map_stride;
+
+   assert(atlas);
+   assert(!atlas->complete);
+   assert(atlas->numBitmaps > 0);
+
+   /* We use a rectangle texture (non-normalized coords) for the atlas */
+   assert(ctx->Extensions.NV_texture_rectangle);
+   assert(ctx->Const.MaxTextureRectSize >= 1024);
+
+   atlas->texWidth = 1024;
+   atlas->texHeight = 0;  /* determined below */
+
+   atlas->glyphs = malloc(atlas->numBitmaps * sizeof(atlas->glyphs[0]));
+   if (!atlas->glyphs) {
+  /* give up */
+  atlas->incomplete = true;
+  return;
+   }
+
+   /* Loop over the display lists.  They should all contain a 

Re: [Mesa-dev] [PATCH 1/2] mesa: implement a display list / glBitmap texture atlas

2016-02-15 Thread Nicolai Hähnle

On 15.02.2016 11:31, Brian Paul wrote:

On 02/15/2016 08:45 AM, Nicolai Hähnle wrote:



On 12.02.2016 20:07, Brian Paul wrote:

This improves the performance of applications which use glXUseXFont()
or wglUseFontBitmaps() and glCallLists() to draw bitmap text.

Basically, we collect all the glBitmap images from the display lists
and put them into a texture atlas.  To render the bitmaps for a
glCallLists() command, we render a set of textured quads where each
quad is textured with one bitmap image.  Actually, the rendering part
has to be done by the Mesa driver or Mesa/gallium state tracker.

Note that GLUT demos that use glutBitmapCharacter() don't benefit
from this.
---
  src/mesa/main/dd.h |   8 ++
  src/mesa/main/dlist.c  | 383
+
  src/mesa/main/dlist.h  |  38 +
  src/mesa/main/mtypes.h |   1 +
  src/mesa/main/shared.c |  15 ++
  5 files changed, 445 insertions(+)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 19ef304..5d1370c 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -35,6 +35,7 @@

  #include "glheader.h"

+struct gl_bitmap_atlas;
  struct gl_buffer_object;
  struct gl_context;
  struct gl_display_list;
@@ -154,6 +155,13 @@ struct dd_function_table {
 GLint x, GLint y, GLsizei width, GLsizei height,
 const struct gl_pixelstore_attrib *unpack,
 const GLubyte *bitmap );
+
+   /**
+* Called by display list code for optimized glCallLists/glBitmap
rendering
+*/
+   void (*DrawAtlasBitmaps)(struct gl_context *ctx,
+const struct gl_bitmap_atlas *atlas,
+GLuint count, const GLubyte *ids);
 /*@}*/


diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 0e25efb..1927068 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -72,6 +72,9 @@
  #include "vbo/vbo.h"


+#define USE_BITMAP_ATLAS 1
+
+

  /**
   * Other parts of Mesa (such as the VBO module) can plug into the
display
@@ -606,6 +609,259 @@ void mesa_print_display_list(GLuint list);


  /**
+ * Does the given display list only contain a single glBitmap call?
+ */
+static bool
+is_bitmap_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   if (n[0].opcode == OPCODE_BITMAP) {
+  n += InstSize[OPCODE_BITMAP];
+  if (n[0].opcode == OPCODE_END_OF_LIST)
+ return true;
+   }
+   return false;
+}
+
+
+/**
+ * Is the given display list an empty list?
+ */
+static bool
+is_empty_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   return n[0].opcode == OPCODE_END_OF_LIST;
+}
+
+
+/**
+ * Delete/free a gl_bitmap_atlas.  Called during context tear-down.
+ */
+void
+_mesa_delete_bitmap_atlas(struct gl_context *ctx, struct
gl_bitmap_atlas *atlas)
+{
+   if (atlas->texObj) {
+  ctx->Driver.DeleteTexture(ctx, atlas->texObj);
+   }
+   free(atlas->glyphs);
+}
+
+
+/**
+ * Lookup a gl_bitmap_atlas by listBase ID.
+ */
+static struct gl_bitmap_atlas *
+lookup_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   atlas = _mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase);
+   return atlas;
+}
+
+
+/**
+ * Create new bitmap atlas and insert into hash table.
+ */
+static struct gl_bitmap_atlas *
+alloc_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   assert(_mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase) ==
NULL);
+
+   atlas = calloc(1, sizeof(*atlas));
+   if (atlas) {
+  _mesa_HashInsert(ctx->Shared->BitmapAtlas, listBase, atlas);
+   }
+
+   return atlas;
+}
+
+
+/**
+ * Try to build a bitmap atlas.  This involves examining a sequence of
+ * display lists which contain glBitmap commands and putting the bitmap
+ * images into a texture map (the atlas).
+ * If we succeed, gl_bitmap_atlas::complete will be set to true.
+ * If we fail, gl_bitmap_atlas::incomplete will be set to true.
+ */
+static void
+build_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas
*atlas,
+   GLuint listBase)
+{
+   unsigned i, row_height = 0, xpos = 0, ypos = 0;
+   GLubyte *map;
+   GLint map_stride;
+
+   assert(atlas);
+   assert(!atlas->complete);
+   assert(atlas->numBitmaps > 0);
+
+   /* We use a rectangle texture (non-normalized coords) for the
atlas */
+   assert(ctx->Extensions.NV_texture_rectangle);
+
+   atlas->texWidth = 1024;
+   atlas->texHeight = 0;  /* determined below */


I don't see explicit checks for either NV_texture_rectangle or max
texture size >= 1024 anywhere.

I see two alternative ways of handling this: either add an explicit
check in render_bitmap_atlas, or expect drivers to only install
DrawAtlasBitmaps when those preconditions are satisfied (in which case
this should probably be documented in dd.h, and the st/mesa patch
adjusted).


I'll add a comment in dd.h that DrawAtlasBitmaps() requires texture

Re: [Mesa-dev] [PATCH 1/2] mesa: implement a display list / glBitmap texture atlas

2016-02-15 Thread Brian Paul

On 02/15/2016 08:45 AM, Nicolai Hähnle wrote:



On 12.02.2016 20:07, Brian Paul wrote:

This improves the performance of applications which use glXUseXFont()
or wglUseFontBitmaps() and glCallLists() to draw bitmap text.

Basically, we collect all the glBitmap images from the display lists
and put them into a texture atlas.  To render the bitmaps for a
glCallLists() command, we render a set of textured quads where each
quad is textured with one bitmap image.  Actually, the rendering part
has to be done by the Mesa driver or Mesa/gallium state tracker.

Note that GLUT demos that use glutBitmapCharacter() don't benefit
from this.
---
  src/mesa/main/dd.h |   8 ++
  src/mesa/main/dlist.c  | 383
+
  src/mesa/main/dlist.h  |  38 +
  src/mesa/main/mtypes.h |   1 +
  src/mesa/main/shared.c |  15 ++
  5 files changed, 445 insertions(+)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 19ef304..5d1370c 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -35,6 +35,7 @@

  #include "glheader.h"

+struct gl_bitmap_atlas;
  struct gl_buffer_object;
  struct gl_context;
  struct gl_display_list;
@@ -154,6 +155,13 @@ struct dd_function_table {
 GLint x, GLint y, GLsizei width, GLsizei height,
 const struct gl_pixelstore_attrib *unpack,
 const GLubyte *bitmap );
+
+   /**
+* Called by display list code for optimized glCallLists/glBitmap
rendering
+*/
+   void (*DrawAtlasBitmaps)(struct gl_context *ctx,
+const struct gl_bitmap_atlas *atlas,
+GLuint count, const GLubyte *ids);
 /*@}*/


diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 0e25efb..1927068 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -72,6 +72,9 @@
  #include "vbo/vbo.h"


+#define USE_BITMAP_ATLAS 1
+
+

  /**
   * Other parts of Mesa (such as the VBO module) can plug into the
display
@@ -606,6 +609,259 @@ void mesa_print_display_list(GLuint list);


  /**
+ * Does the given display list only contain a single glBitmap call?
+ */
+static bool
+is_bitmap_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   if (n[0].opcode == OPCODE_BITMAP) {
+  n += InstSize[OPCODE_BITMAP];
+  if (n[0].opcode == OPCODE_END_OF_LIST)
+ return true;
+   }
+   return false;
+}
+
+
+/**
+ * Is the given display list an empty list?
+ */
+static bool
+is_empty_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   return n[0].opcode == OPCODE_END_OF_LIST;
+}
+
+
+/**
+ * Delete/free a gl_bitmap_atlas.  Called during context tear-down.
+ */
+void
+_mesa_delete_bitmap_atlas(struct gl_context *ctx, struct
gl_bitmap_atlas *atlas)
+{
+   if (atlas->texObj) {
+  ctx->Driver.DeleteTexture(ctx, atlas->texObj);
+   }
+   free(atlas->glyphs);
+}
+
+
+/**
+ * Lookup a gl_bitmap_atlas by listBase ID.
+ */
+static struct gl_bitmap_atlas *
+lookup_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   atlas = _mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase);
+   return atlas;
+}
+
+
+/**
+ * Create new bitmap atlas and insert into hash table.
+ */
+static struct gl_bitmap_atlas *
+alloc_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   assert(_mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase) == NULL);
+
+   atlas = calloc(1, sizeof(*atlas));
+   if (atlas) {
+  _mesa_HashInsert(ctx->Shared->BitmapAtlas, listBase, atlas);
+   }
+
+   return atlas;
+}
+
+
+/**
+ * Try to build a bitmap atlas.  This involves examining a sequence of
+ * display lists which contain glBitmap commands and putting the bitmap
+ * images into a texture map (the atlas).
+ * If we succeed, gl_bitmap_atlas::complete will be set to true.
+ * If we fail, gl_bitmap_atlas::incomplete will be set to true.
+ */
+static void
+build_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas
*atlas,
+   GLuint listBase)
+{
+   unsigned i, row_height = 0, xpos = 0, ypos = 0;
+   GLubyte *map;
+   GLint map_stride;
+
+   assert(atlas);
+   assert(!atlas->complete);
+   assert(atlas->numBitmaps > 0);
+
+   /* We use a rectangle texture (non-normalized coords) for the
atlas */
+   assert(ctx->Extensions.NV_texture_rectangle);
+
+   atlas->texWidth = 1024;
+   atlas->texHeight = 0;  /* determined below */


I don't see explicit checks for either NV_texture_rectangle or max
texture size >= 1024 anywhere.

I see two alternative ways of handling this: either add an explicit
check in render_bitmap_atlas, or expect drivers to only install
DrawAtlasBitmaps when those preconditions are satisfied (in which case
this should probably be documented in dd.h, and the st/mesa patch
adjusted).


I'll add a comment in dd.h that DrawAtlasBitmaps() requires texture 
rectangles of at least width 1024.  
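
For illustration, the second option Nicolai describes (drivers only install DrawAtlasBitmaps when the preconditions hold) could be sketched roughly as follows. This is only a sketch under stated assumptions: struct sketch_context, can_use_bitmap_atlas(), install_atlas_hook() and noop_draw_atlas_bitmaps() are hypothetical stand-ins, not Mesa code, and the hook signature is simplified. The two fields mirror ctx->Extensions.NV_texture_rectangle and ctx->Const.MaxTextureRectSize from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for the parts of struct gl_context
 * discussed in this thread.  The real structure is much larger. */
struct sketch_context {
   bool has_texture_rectangle;        /* cf. ctx->Extensions.NV_texture_rectangle */
   int max_texture_rect_size;         /* cf. ctx->Const.MaxTextureRectSize */
   void (*draw_atlas_bitmaps)(void);  /* cf. ctx->Driver.DrawAtlasBitmaps (simplified) */
};

/* The atlas code in the patch hard-codes a texture width of 1024. */
#define ATLAS_TEX_WIDTH 1024

/* True if both preconditions asserted by build_bitmap_atlas() hold. */
static bool
can_use_bitmap_atlas(const struct sketch_context *ctx)
{
   return ctx->has_texture_rectangle &&
          ctx->max_texture_rect_size >= ATLAS_TEX_WIDTH;
}

/* Placeholder hook used only to demonstrate installation. */
static void
noop_draw_atlas_bitmaps(void)
{
}

/* Install the hook only when the preconditions are met; otherwise leave
 * it NULL so the caller falls back to ordinary glBitmap rendering. */
static void
install_atlas_hook(struct sketch_context *ctx, void (*hook)(void))
{
   ctx->draw_atlas_bitmaps = can_use_bitmap_atlas(ctx) ? hook : NULL;
}
```

With this shape, the display list code would only need to test whether the hook is non-NULL before attempting to build an atlas.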

Re: [Mesa-dev] [PATCH] egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage

2016-02-15 Thread Jason Ekstrand
On Feb 15, 2016 3:24 AM, "Pekka Paalanen"  wrote:
>
> On Thu, 11 Feb 2016 10:34:10 -0600
> Derek Foreman  wrote:
>
> > Since commit d1314de293e9e4a63c35f094c3893aaaed8580b4 we ignore
> > damage passed to SwapBuffersWithDamage.
> >
> > Wayland 1.10 now has functionality that allows us to properly
> > process those damage rectangles, and a way to query if it's
> > available.
> >
> > Now we can use wl_surface.damage_buffer and interpret the incoming
> > damage as being in buffer co-ordinates.
> >
> > Signed-off-by: Derek Foreman 
> > ---
> >  src/egl/drivers/dri2/platform_wayland.c | 32 +---
> >  1 file changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c
> > index c2438f7..b5a5b59 100644
> > --- a/src/egl/drivers/dri2/platform_wayland.c
> > +++ b/src/egl/drivers/dri2/platform_wayland.c
> > @@ -653,6 +653,30 @@ create_wl_buffer(struct dri2_egl_surface *dri2_surf)
> >_buffer_listener, dri2_surf);
> >  }
> >
> > +static EGLBoolean
> > +try_damage_buffer(struct dri2_egl_surface *dri2_surf,
> > +  const EGLint *rects,
> > +  EGLint n_rects)
> > +{
> > +#ifdef WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION
> > +   int i;
> > +
> > +   if (wl_proxy_get_version((struct wl_proxy *) dri2_surf->wl_win->surface)
> > +   < WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION)
> > +  return EGL_FALSE;
> > +
> > +   for (i = 0; i < n_rects; i++) {
> > +  const int *rect = &rects[i * 4];
> > +
> > +  wl_surface_damage_buffer(dri2_surf->wl_win->surface,
> > +   rect[0],
> > +   dri2_surf->base.Height - rect[1] - rect[3],
> > +   rect[2], rect[3]);
> > +   }
> > +   return EGL_TRUE;
> > +#endif
> > +   return EGL_FALSE;
> > +}
> >  /**
> >   * Called via eglSwapBuffers(), drv->API.SwapBuffers().
> >   */
> > @@ -703,10 +727,12 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
> > dri2_surf->dx = 0;
> > dri2_surf->dy = 0;
> >
> > -   /* We deliberately ignore the damage region and post maximum damage, due to
> > +   /* If the compositor doesn't support damage_buffer, we deliberately
> > +* ignore the damage region and post maximum damage, due to
> >  * https://bugs.freedesktop.org/78190 */
> > -   wl_surface_damage(dri2_surf->wl_win->surface,
> > - 0, 0, INT32_MAX, INT32_MAX);
> > +   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
> > +  wl_surface_damage(dri2_surf->wl_win->surface,
> > +0, 0, INT32_MAX, INT32_MAX);
> >
> > if (dri2_dpy->is_different_gpu) {
> >_EGLContext *ctx = _eglGetCurrentContext();
>
> Reviewed-by: Pekka Paalanen 
>
> But I also agree with Emil that a comment explaining the #ifdef
> WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION usage would be good to add.
>
> Bumping the wayland-client requirement to >= 1.10 would be nice,
> but currently the requirement seems to be 1.2 so I wonder if there are
> other things to be cleaned up too.
>
> OTOH, with the #ifdef this patch could go to stable branches, couldn't
> it?
>
> How about landing this as is, tagged for stable, and a follow-up if
> wanted to bump the wayland-client dependency on master? Would that be
> appropriate?

That's a very good idea. I would love to see this backported a release or
two.  But we need to move quickly.  11.0 is about to be EOL.

>
> Thanks,
> pq
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
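
The coordinate conversion inside try_damage_buffer() can be shown in isolation. The following is a minimal sketch, assuming the damage rectangles arrive as (x, y, width, height) with the origin at the bottom-left of the surface, as eglSwapBuffersWithDamage specifies, while wl_surface.damage_buffer expects a top-left origin. struct damage_rect and flip_damage_rect() are hypothetical names introduced only for this example.

```c
#include <stdint.h>

/* One damage rectangle: x, y, width, height. */
struct damage_rect {
   int32_t x, y, width, height;
};

/* Convert a rectangle between bottom-left and top-left origin for a
 * surface of the given height.  Only y changes; this is the same
 * expression as "Height - rect[1] - rect[3]" in the patch. */
static struct damage_rect
flip_damage_rect(struct damage_rect r, int32_t surface_height)
{
   struct damage_rect out = r;
   out.y = surface_height - r.y - r.height;
   return out;
}
```

The conversion is its own inverse, so applying it twice with the same surface height returns the original rectangle.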


Re: [Mesa-dev] [PATCH] mesa/glformats: Bypass resolving effective internal format for GL_BGRA

2016-02-15 Thread Eduardo Lima Mitev
Anyone willing to take a look?

Thanks!

Eduardo

On 02/10/2016 02:57 PM, Eduardo Lima Mitev wrote:
> Currently, when validating format and type on ES3, we treat GL_BGRA as a
> special case when obtaining the effective internal format from the format
> and type. This is because _mesa_base_tex_format() returns GL_RGBA as base
> format for GL_BGRA (and quite a few code paths depend on this behavior).
> 
> However, this lets calls to glTexImage with
> internalFormat=GL_RGBA, format=GL_BGRA_EXT and type=GL_UNSIGNED_BYTE
> succeed when they should generate GL_INVALID_OPERATION (since format and
> internalformat mismatch; see
> https://bugs.freedesktop.org/show_bug.cgi?id=92265#c17).
> 
> This patch bypasses the calculation of the effective internal format
> altogether if either format or internalformat is GL_BGRA. It is simpler
> to check that format and internalformat match in the presence of
> GL_BGRA than to trick the calculation of the effective internal format
> into handling this case correctly.
> 
> So effectively, it moves the handling of the special case to
> a higher level, leaving the effective internal format calculation clearer
> and more concise.
> 
> It also fixes piglit test:
> 
> spec@ext_texture_format_bgra@api-errors
> ---
>  src/mesa/main/glformats.c | 27 +--
>  1 file changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index f528444..9983ab9 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2663,27 +2663,26 @@ _mesa_es3_error_check_format_and_type(const struct gl_context *ctx,
>  * CopyTex* commands. In these cases, the GL is required to operate
>  * as if the effective internal format was used as the internalformat
>  * when specifying the texture data."
> +*
> +* However, there is a special case when either format or internalformat is
> +* GL_BGRA (from EXT_texture_format_BGRA). It comes from the fact that
> +* _mesa_base_tex_format() returns a base format of GL_RGBA for GL_BGRA.
> +* This makes perfect sense if you're asking the question, "what channels
> +* does this format have?" However, the code below also checks if two
> +* internal formats match in the ES3 sense, so it doesn't work well for
> +* GL_BGRA. Hence, we bypass the resolution of the effective internal format
> +* altogether, if we have GL_BGRA in either format or internalformat.
>  */
> -   if (_mesa_is_enum_format_unsized(internalFormat)) {
> +   if (_mesa_is_enum_format_unsized(internalFormat) &&
> +   internalFormat != GL_BGRA && format != GL_BGRA) {
>GLenum effectiveInternalFormat =
>   _mesa_es3_effective_internal_format_for_format_and_type(format, type);
>  
>if (effectiveInternalFormat == GL_NONE)
>   return GL_INVALID_OPERATION;
>  
> -  GLenum baseInternalFormat;
> -  if (internalFormat == GL_BGRA_EXT) {
> - /* Unfortunately, _mesa_base_tex_format returns a base format of
> -  * GL_RGBA for GL_BGRA_EXT.  This makes perfect sense if you're
> -  * asking the question, "what channels does this format have?"
> -  * However, if we're trying to determine if two internal formats
> -  * match in the ES3 sense, we actually want GL_BGRA.
> -  */
> - baseInternalFormat = GL_BGRA_EXT;
> -  } else {
> - baseInternalFormat =
> -_mesa_base_tex_format(ctx, effectiveInternalFormat);
> -  }
> +  GLenum baseInternalFormat =
> + _mesa_base_tex_format(ctx, effectiveInternalFormat);
>  
>if (internalFormat != baseInternalFormat)
>   return GL_INVALID_OPERATION;
> 
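
The rule the commit message describes (when GL_BGRA is involved, format and internalformat must simply match) can be sketched on its own. This is an illustrative simplification, not the Mesa implementation: check_bgra_pair() is a hypothetical helper, and the SK_-prefixed constants are local copies of the enum values from the GL headers.

```c
#include <stdbool.h>

/* Local copies of the relevant GL enum values. */
#define SK_GL_NO_ERROR          0x0000
#define SK_GL_INVALID_OPERATION 0x0502
#define SK_GL_RGBA              0x1908
#define SK_GL_BGRA              0x80E1

/* Hypothetical helper.  Returns true if GL_BGRA is involved and stores the
 * resulting error code in *error.  Returns false (leaving *error untouched)
 * when neither argument is GL_BGRA, so the caller can continue with the
 * usual effective-internal-format path. */
static bool
check_bgra_pair(unsigned internal_format, unsigned format, unsigned *error)
{
   if (internal_format != SK_GL_BGRA && format != SK_GL_BGRA)
      return false;

   *error = (internal_format == format) ? SK_GL_NO_ERROR
                                        : SK_GL_INVALID_OPERATION;
   return true;
}
```

Under this sketch, the case from the bug report (internalFormat=GL_RGBA with format=GL_BGRA) is rejected with GL_INVALID_OPERATION, while a matching GL_BGRA pair is accepted.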


Re: [Mesa-dev] [PATCH] glsl: reject explicit location on atomic counter uniforms

2016-02-15 Thread Nicolai Hähnle



On 12.02.2016 17:53, Timothy Arceri wrote:

On Thu, 2016-02-11 at 20:10 -0500, Ilia Mirkin wrote:

This fixes

dEQP-GLES31.functional.uniform_location.negative.atomic_fragment
dEQP-GLES31.functional.uniform_location.negative.atomic_vertex

Both of which have lines like

layout(location = 3, binding = 0, offset = 0) uniform atomic_uint uni0;

The ARB_explicit_uniform_location spec makes a very tangential mention
regarding atomic counters, but location isn't something that makes sense
with them.

Signed-off-by: Ilia Mirkin 
---

Had no clue where to stick this check... this seemed like as good a
place as any.

  src/compiler/glsl/ast_to_hir.cpp | 5 +
  1 file changed, 5 insertions(+)

diff --git a/src/compiler/glsl/ast_to_hir.cpp
b/src/compiler/glsl/ast_to_hir.cpp
index dbeb5c0..9fce06b 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -4179,6 +4179,11 @@ ast_declarator_list::hir(exec_list
*instructions,
  state->atomic_counter_offsets[qual_binding] =
qual_offset;
   }
}


Maybe we should just make this:
else {
   _mesa_glsl_error(, state, "invalid atomic counter layout
qualifier");
}

??


FWIW, I like Ilia's original message better because it gives the user 
more information about why exactly their layout qualifier is invalid. 
Helpful error messages are a good thing.


That said, I won't complain too loudly if making the code simpler or 
easier to follow ends up making the error messages slightly less helpful.


Cheers,
Nicolai




+
+  if (type->qualifier.flags.q.explicit_location) {
+ _mesa_glsl_error(, state,
+  "atomic counters cannot have an explicit
location");
+  }
 }

 if (this->declarations.is_empty()) {


Re: [Mesa-dev] [PATCH 1/2] mesa: implement a display list / glBitmap texture atlas

2016-02-15 Thread Nicolai Hähnle



On 12.02.2016 20:07, Brian Paul wrote:

This improves the performance of applications which use glXUseXFont()
or wglUseFontBitmaps() and glCallLists() to draw bitmap text.

Basically, we collect all the glBitmap images from the display lists
and put them into a texture atlas.  To render the bitmaps for a
glCallLists() command, we render a set of textured quads where each
quad is textured with one bitmap image.  Actually, the rendering part
has to be done by the Mesa driver or Mesa/gallium state tracker.

Note that GLUT demos that use glutBitmapCharacter() don't benefit
from this.
---
  src/mesa/main/dd.h |   8 ++
  src/mesa/main/dlist.c  | 383 +
  src/mesa/main/dlist.h  |  38 +
  src/mesa/main/mtypes.h |   1 +
  src/mesa/main/shared.c |  15 ++
  5 files changed, 445 insertions(+)

diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index 19ef304..5d1370c 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -35,6 +35,7 @@

  #include "glheader.h"

+struct gl_bitmap_atlas;
  struct gl_buffer_object;
  struct gl_context;
  struct gl_display_list;
@@ -154,6 +155,13 @@ struct dd_function_table {
   GLint x, GLint y, GLsizei width, GLsizei height,
   const struct gl_pixelstore_attrib *unpack,
   const GLubyte *bitmap );
+
+   /**
+* Called by display list code for optimized glCallLists/glBitmap rendering
+*/
+   void (*DrawAtlasBitmaps)(struct gl_context *ctx,
+const struct gl_bitmap_atlas *atlas,
+GLuint count, const GLubyte *ids);
 /*@}*/


diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 0e25efb..1927068 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -72,6 +72,9 @@
  #include "vbo/vbo.h"


+#define USE_BITMAP_ATLAS 1
+
+

  /**
   * Other parts of Mesa (such as the VBO module) can plug into the display
@@ -606,6 +609,259 @@ void mesa_print_display_list(GLuint list);


  /**
+ * Does the given display list only contain a single glBitmap call?
+ */
+static bool
+is_bitmap_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   if (n[0].opcode == OPCODE_BITMAP) {
+  n += InstSize[OPCODE_BITMAP];
+  if (n[0].opcode == OPCODE_END_OF_LIST)
+ return true;
+   }
+   return false;
+}
+
+
+/**
+ * Is the given display list an empty list?
+ */
+static bool
+is_empty_list(const struct gl_display_list *dlist)
+{
+   const Node *n = dlist->Head;
+   return n[0].opcode == OPCODE_END_OF_LIST;
+}
+
+
+/**
+ * Delete/free a gl_bitmap_atlas.  Called during context tear-down.
+ */
+void
+_mesa_delete_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas *atlas)
+{
+   if (atlas->texObj) {
+  ctx->Driver.DeleteTexture(ctx, atlas->texObj);
+   }
+   free(atlas->glyphs);
+}
+
+
+/**
+ * Lookup a gl_bitmap_atlas by listBase ID.
+ */
+static struct gl_bitmap_atlas *
+lookup_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   atlas = _mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase);
+   return atlas;
+}
+
+
+/**
+ * Create new bitmap atlas and insert into hash table.
+ */
+static struct gl_bitmap_atlas *
+alloc_bitmap_atlas(struct gl_context *ctx, GLuint listBase)
+{
+   struct gl_bitmap_atlas *atlas;
+
+   assert(listBase > 0);
+   assert(_mesa_HashLookup(ctx->Shared->BitmapAtlas, listBase) == NULL);
+
+   atlas = calloc(1, sizeof(*atlas));
+   if (atlas) {
+  _mesa_HashInsert(ctx->Shared->BitmapAtlas, listBase, atlas);
+   }
+
+   return atlas;
+}
+
+
+/**
+ * Try to build a bitmap atlas.  This involves examining a sequence of
+ * display lists which contain glBitmap commands and putting the bitmap
+ * images into a texture map (the atlas).
+ * If we succeed, gl_bitmap_atlas::complete will be set to true.
+ * If we fail, gl_bitmap_atlas::incomplete will be set to true.
+ */
+static void
+build_bitmap_atlas(struct gl_context *ctx, struct gl_bitmap_atlas *atlas,
+   GLuint listBase)
+{
+   unsigned i, row_height = 0, xpos = 0, ypos = 0;
+   GLubyte *map;
+   GLint map_stride;
+
+   assert(atlas);
+   assert(!atlas->complete);
+   assert(atlas->numBitmaps > 0);
+
+   /* We use a rectangle texture (non-normalized coords) for the atlas */
+   assert(ctx->Extensions.NV_texture_rectangle);
+
+   atlas->texWidth = 1024;
+   atlas->texHeight = 0;  /* determined below */


I don't see explicit checks for either NV_texture_rectangle or max 
texture size >= 1024 anywhere.


I see two alternative ways of handling this: either add an explicit 
check in render_bitmap_atlas, or expect drivers to only install 
DrawAtlasBitmaps when those preconditions are satisfied (in which case 
this should probably be documented in dd.h, and the st/mesa patch adjusted).



+
+   atlas->glyphs = malloc(atlas->numBitmaps * sizeof(atlas->glyphs[0]));
+   if (!atlas->glyphs) {
+  /* give up */
+  

Re: [Mesa-dev] [PATCH 2/2] st/mesa: new st_DrawAtlasBitmaps() function for drawing bitmap text

2016-02-15 Thread Nicolai Hähnle

On 12.02.2016 20:07, Brian Paul wrote:

This basically saves the current pipeline state, sets up state for
rendering, constructs a set of textured quads, renders, then restores
the previous pipeline state.

It shouldn't be hard to implement a similar function for non-gallium
drivers.  With some code refactoring, the vertex definition code could
probably be shared.


Except for the potential (trivial) interaction with my comments on the 
first patch, this is


Reviewed-by: Nicolai Hähnle 


---
  src/mesa/state_tracker/st_cb_bitmap.c | 143 +-
  src/mesa/state_tracker/st_context.h   |   1 +
  2 files changed, 141 insertions(+), 3 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_bitmap.c b/src/mesa/state_tracker/st_cb_bitmap.c
index d84bfef..461159b 100644
--- a/src/mesa/state_tracker/st_cb_bitmap.c
+++ b/src/mesa/state_tracker/st_cb_bitmap.c
@@ -33,6 +33,7 @@
  #include "main/imports.h"
  #include "main/image.h"
  #include "main/bufferobj.h"
+#include "main/dlist.h"
  #include "main/macros.h"
  #include "main/pbo.h"
  #include "program/program.h"
@@ -51,6 +52,7 @@
  #include "pipe/p_shader_tokens.h"
  #include "util/u_inlines.h"
  #include "util/u_simple_shaders.h"
+#include "util/u_upload_mgr.h"
  #include "program/prog_instruction.h"
  #include "cso_cache/cso_context.h"

@@ -182,7 +184,8 @@ make_bitmap_texture(struct gl_context *ctx, GLsizei width, GLsizei height,
  static void
  setup_render_state(struct gl_context *ctx,
 struct pipe_sampler_view *sv,
-   const GLfloat *color)
+   const GLfloat *color,
+   bool atlas)
  {
 struct st_context *st = st_context(ctx);
 struct cso_context *cso = st->cso_context;
@@ -249,7 +252,10 @@ setup_render_state(struct gl_context *ctx,
for (i = 0; i < st->state.num_samplers[PIPE_SHADER_FRAGMENT]; i++) {
   samplers[i] = &st->state.samplers[PIPE_SHADER_FRAGMENT][i];
}
-  samplers[fpv->bitmap_sampler] = &st->bitmap.sampler;
+  if (atlas)
+ samplers[fpv->bitmap_sampler] = &st->bitmap.atlas_sampler;
+  else
+ samplers[fpv->bitmap_sampler] = &st->bitmap.sampler;
cso_set_samplers(cso, PIPE_SHADER_FRAGMENT, num,
 (const struct pipe_sampler_state **) samplers);
 }
@@ -324,7 +330,7 @@ draw_bitmap_quad(struct gl_context *ctx, GLint x, GLint y, GLfloat z,
assert(height <= (GLsizei) maxSize);
 }

-   setup_render_state(ctx, sv, color);
+   setup_render_state(ctx, sv, color, false);

 /* convert Z from [0,1] to [-1,-1] to match viewport Z scale/bias */
 z = z * 2.0f - 1.0f;
@@ -571,6 +577,9 @@ init_bitmap_state(struct st_context *st)
 st->bitmap.sampler.mag_img_filter = PIPE_TEX_FILTER_NEAREST;
 st->bitmap.sampler.normalized_coords = st->internal_target == PIPE_TEXTURE_2D;

+   st->bitmap.atlas_sampler = st->bitmap.sampler;
+   st->bitmap.atlas_sampler.normalized_coords = 0;
+
 /* init baseline rasterizer state once */
 memset(&st->bitmap.rasterizer, 0, sizeof(st->bitmap.rasterizer));
 st->bitmap.rasterizer.half_pixel_center = 1;
@@ -665,11 +674,139 @@ st_Bitmap(struct gl_context *ctx, GLint x, GLint y,
  }


+/**
+ * Called via ctx->Driver.DrawAtlasBitmaps()
+ */
+static void
+st_DrawAtlasBitmaps(struct gl_context *ctx,
+const struct gl_bitmap_atlas *atlas,
+GLuint count, const GLubyte *ids)
+{
+   struct st_context *st = st_context(ctx);
+   struct pipe_context *pipe = st->pipe;
+   struct st_texture_object *stObj = st_texture_object(atlas->texObj);
+   struct pipe_sampler_view *sv;
+   /* convert Z from [0,1] to [-1,1] to match viewport Z scale/bias */
+   const float z = ctx->Current.RasterPos[2] * 2.0f - 1.0f;
+   const float *color = ctx->Current.RasterColor;
+   const float clip_x_scale = 2.0f / st->state.framebuffer.width;
+   const float clip_y_scale = 2.0f / st->state.framebuffer.height;
+   const unsigned num_verts = count * 4;
+   const unsigned num_vert_bytes = num_verts * sizeof(struct st_util_vertex);
+   struct st_util_vertex *verts;
+   struct pipe_vertex_buffer vb = {0};
+   unsigned i;
+
+   if (!st->bitmap.cache) {
+  init_bitmap_state(st);
+   }
+
+   st_flush_bitmap_cache(st);
+
+   st_validate_state(st);
+
+   sv = st_create_texture_sampler_view(pipe, stObj->pt);
+
+   setup_render_state(ctx, sv, color, true);
+
+   vb.stride = sizeof(struct st_util_vertex);
+
+   u_upload_alloc(st->uploader, 0, num_vert_bytes, 4,
+  &vb.buffer_offset, &vb.buffer, (void **) &verts);
+
+   /* build quads vertex data */
+   for (i = 0; i < count; i++) {
+  const GLfloat epsilon = 0.0001F;
+  const struct gl_bitmap_glyph *g = &atlas->glyphs[ids[i]];
+  const float xmove = g->xmove, ymove = g->ymove;
+  const float xorig = g->xorig, yorig = g->yorig;
+  const float s0 = g->x, t0 = g->y;
+  const float s1 = s0 + g->w, t1 = t0 + g->h;
+  const float x0 = 

Re: [Mesa-dev] [PATCH] glsl: warn in GL as well as ES when varying not written

2016-02-15 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Mon, 2016-02-15 at 14:06 +1100, Timothy Arceri wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93339
> ---
>  src/compiler/glsl/link_varyings.cpp | 23 +++
>  1 file changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
> index 2e84972..e3c2fa6 100644
> --- a/src/compiler/glsl/link_varyings.cpp
> +++ b/src/compiler/glsl/link_varyings.cpp
> @@ -1925,22 +1925,7 @@ assign_varying_locations(struct gl_context *ctx,
>  
>   if (var && var->data.mode == ir_var_shader_in &&
>   var->data.is_unmatched_generic_inout) {
> -if (prog->IsES) {
> -   /*
> -* On Page 91 (Page 97 of the PDF) of the GLSL ES 1.0 spec:
> -*
> -* If the vertex shader declares but doesn't write to a
> -* varying and the fragment shader declares and reads it,
> -* is this an error?
> -*
> -* RESOLUTION: No.
> -*/
> -   linker_warning(prog, "%s shader varying %s not written "
> -  "by %s shader\n.",
> -  _mesa_shader_stage_to_string(consumer->Stage),
> -  var->name,
> -  _mesa_shader_stage_to_string(producer->Stage));
> -} else if (prog->Version <= 120) {
> +if (!prog->IsES && prog->Version <= 120) {
> /* On page 25 (page 31 of the PDF) of the GLSL 1.20 spec:
>  *
>  * Only those varying variables used (i.e. read) in
> @@ -1958,6 +1943,12 @@ assign_varying_locations(struct gl_context *ctx,
>  _mesa_shader_stage_to_string(consumer->Stage),
>   var->name,
>  _mesa_shader_stage_to_string(producer->Stage));
> +} else {
> +   linker_warning(prog, "%s shader varying %s not written "
> +  "by %s shader\n.",
> +  _mesa_shader_stage_to_string(consumer->Stage),
> +  var->name,
> +  _mesa_shader_stage_to_string(producer->Stage));
>  }
>   }
>}



Re: [Mesa-dev] [PATCH 1/1] configure: Bail out on llvm-config component error

2016-02-15 Thread Nicolai Hähnle

On 12.02.2016 19:41, Jan Vesely wrote:

Signed-off-by: Jan Vesely 
---
  configure.ac | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/configure.ac b/configure.ac
index 2750d4d..57330cb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2295,6 +2295,9 @@ dnl in LLVM_LIBS.

  if test "x$MESA_LLVM" != x0; then

+if ! $LLVM_CONFIG --libs ${LLVM_COMPONENTS} >/dev/null; then
+   AC_MSG_ERROR([Calling ${LLVM_CONFIG} failed])
+fi
  LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"

  dnl llvm-config may not give the right answer when llvm is a built as a



+1 for making the build process more user-friendly.

Reviewed-by: Nicolai Hähnle 


Re: [Mesa-dev] [PATCH 2/2] glsl: set user defined varyings to smooth by default

2016-02-15 Thread Iago Toral
On Mon, 2016-02-15 at 18:38 +1100, Timothy Arceri wrote:
> This is usually handled by the backends in order to handle the
> various interactions with the gl_*Color built-ins.
> 
> The problem is this means linking will fail if one side on the
> interface adds the smooth qualifier to the varying and the other
> side just uses the default even though they match.
> 
> This fixes various deqp tests and should have no impact on
> built-ins as they generate GLSL IR directly.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> b/src/compiler/glsl/ast_to_hir.cpp
> index b639378..47d52ee 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -2750,6 +2750,11 @@ interpret_interpolation_qualifier(const struct 
> ast_type_qualifier *qual,
>"vertex shader inputs or fragment shader outputs",
>interpolation_string(interpolation));
>}
> +   } else if ((mode == ir_var_shader_in &&
> +   state->stage != MESA_SHADER_VERTEX) ||
> +  (mode == ir_var_shader_out &&
> +   state->stage != MESA_SHADER_FRAGMENT)) {
> +  interpolation = INTERP_QUALIFIER_SMOOTH;
> }

The GLES spec explicitly says that in the absence of an interp qualifier
smooth is used, but I can't find the same statement in the desktop GLSL
spec. Should we make this ES specific?

Iago
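
For reference, the condition added by the quoted hunk can be sketched as a stand-alone function. This is only an illustration, not Mesa code: the sketch_* enums and the function name are hypothetical stand-ins for the variable mode, the shader stage and the INTERP_QUALIFIER_* values, and the sketch covers only the case where the shader gave no interpolation qualifier.

```c
/* Hypothetical stand-ins for the IR variable mode, shader stage and
 * interpolation qualifier used in the quoted patch. */
enum sketch_mode { SKETCH_MODE_SHADER_IN, SKETCH_MODE_SHADER_OUT, SKETCH_MODE_OTHER };
enum sketch_stage { SKETCH_STAGE_VERTEX, SKETCH_STAGE_GEOMETRY, SKETCH_STAGE_FRAGMENT };
enum sketch_interp { SKETCH_INTERP_NONE, SKETCH_INTERP_SMOOTH };

/* Default applied when no qualifier is written: inputs of any stage except
 * the vertex stage, and outputs of any stage except the fragment stage,
 * become smooth. Everything else is left without a qualifier. */
enum sketch_interp
sketch_default_interpolation(enum sketch_mode mode, enum sketch_stage stage)
{
   int is_varying_input = mode == SKETCH_MODE_SHADER_IN &&
                          stage != SKETCH_STAGE_VERTEX;
   int is_varying_output = mode == SKETCH_MODE_SHADER_OUT &&
                           stage != SKETCH_STAGE_FRAGMENT;

   return (is_varying_input || is_varying_output) ? SKETCH_INTERP_SMOOTH
                                                  : SKETCH_INTERP_NONE;
}
```

Making the default ES-only, as asked above, would presumably mean adding one more condition on the shader's API to this test.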



Re: [Mesa-dev] [PATCH v2 2/4] Android: Fix building secondary arch in mixed 32/64-bit builds

2016-02-15 Thread Chih-Wei Huang
2016-02-03 4:45 GMT+08:00 Rob Herring :
> TARGET_CC is not defined for the secondary arch on combined 32/64-bit
> builds. The build system uses 2ND_TARGET_CC instead and it is not meant
> to be used in module makefiles. LOCAL_CC was used to provide C only
> flags as -std=c99 is not valid for C++ files. Since Android 4.4,
> LOCAL_CONLYFLAGS was added to set compiler flags on C files only, so it
> can be used now instead of LOCAL_CC.
>
> This will break on pre-4.4 versions of Android, but it is unlikely anyone
> is using current Mesa with such an old version of Android.
>
> Cc: Emil Velikov 
> Cc: Chih-Wei Huang 
> Signed-off-by: Rob Herring 
> ---
> v2:
> - move c99 comment
> - Reword the commit msg to better describe the problem and about pre-4.4
>   breakage
>
>  Android.common.mk | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/Android.common.mk b/Android.common.mk
> index 948561c..72fa5d9 100644
> --- a/Android.common.mk
> +++ b/Android.common.mk
> @@ -21,13 +21,8 @@
>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>  # DEALINGS IN THE SOFTWARE.
>
> -# use c99 compiler by default
> -ifeq ($(LOCAL_CC),)
>  ifeq ($(LOCAL_IS_HOST_MODULE),true)
> -LOCAL_CC := $(HOST_CC) -std=c99 -D_GNU_SOURCE
> -else
> -LOCAL_CC := $(TARGET_CC) -std=c99
> -endif
> +LOCAL_CFLAGS += -D_GNU_SOURCE
>  endif
>
>  LOCAL_C_INCLUDES += \
> @@ -60,6 +55,10 @@ LOCAL_CFLAGS += \
> -fvisibility=hidden \
> -Wno-sign-compare
>
> +# mesa requires at least c99 compiler
> +LOCAL_CONLYFLAGS += \
> +   -std=c99
> +
>  ifeq ($(strip $(MESA_ENABLE_ASM)),true)
>  ifeq ($(TARGET_ARCH),x86)
>  LOCAL_CFLAGS += \
> --

Looks good to me.


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org


Re: [Mesa-dev] [PATCH v2 3/4] Android: enable building on arm64

2016-02-15 Thread Chih-Wei Huang
2016-02-03 4:45 GMT+08:00 Rob Herring :
> Use the LOCAL_CFLAGS_{32/64} instead of arch specific variants to define
> the DEFAULT_DRIVER_DIR. This enables building for arm64.
>
> Cc: Emil Velikov 
> Cc: Chih-Wei Huang 
> Signed-off-by: Rob Herring 
> ---
> v2:
> - Use LOCAL_CFLAGS_(32|64) instead of arch flags
>
>  src/egl/Android.mk | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/src/egl/Android.mk b/src/egl/Android.mk
> index ebd67af..cf71251 100644
> --- a/src/egl/Android.mk
> +++ b/src/egl/Android.mk
> @@ -44,9 +44,8 @@ LOCAL_CFLAGS := \
> -DHAVE_ANDROID_PLATFORM
>
>  ifeq ($(MESA_LOLLIPOP_BUILD),true)
> -LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
> -LOCAL_CFLAGS_x86 := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
> -LOCAL_CFLAGS_x86_64 := -DDEFAULT_DRIVER_DIR=\"/system/lib64/dri\"
> +LOCAL_CFLAGS_32 := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
> +LOCAL_CFLAGS_64 := -DDEFAULT_DRIVER_DIR=\"/system/lib64/dri\"
>  else
>  LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
>  endif
> --

Looks good to me.
Thank you!


-- 
Chih-Wei
Android-x86 project
http://www.android-x86.org


Re: [Mesa-dev] [PATCH] mesa: Fix test for big-endian architecture in compiler.h

2016-02-15 Thread Oded Gabbay

On Sat, Feb 13, 2016 at 2:39 AM, Roland Scheidegger  wrote:
>
> Am 12.02.2016 um 10:01 schrieb Jochen Rollwagen:
> > Hi,
> >
> > i think i found & fixed a bug in mesa concerning tests for big-endian
> > machines. The defines tested don't exist or are wrongly defined so the
> > test (probably) never fires. The gcc defines on my machine concerning
> > big-endian are
> >
> > jochen@mac-mini:~/sources/mesa$ gcc -dM -E - < /dev/null | grep BIG
> > #define __BIGGEST_ALIGNMENT__ 16
> > #define __BIG_ENDIAN__ 1
> > #define __FLOAT_WORD_ORDER__ __ORDER_BIG_ENDIAN__
> > #define _BIG_ENDIAN 1
> > #define __ORDER_BIG_ENDIAN__ 4321
> > #define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__
> >
> > The tested values in current mesa are quite different :-)
> >
> > The following patch fixes this.
> >
> > diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
> > index c5ee741..99c63cb 100644
> > --- a/src/mesa/main/compiler.h
> > +++ b/src/mesa/main/compiler.h
> > @@ -52,7 +52,7 @@ extern "C" {
> >   * Try to use a runtime test instead.
> >   * For now, only used by some DRI hardware drivers for color/texel
> > packing.
> >   */
> > -#if defined(BYTE_ORDER) && defined(BIG_ENDIAN) && BYTE_ORDER == BIG_ENDIAN
> > +#if defined(__BYTE_ORDER__) && defined(__BIG_ENDIAN__) &&
> > __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> >  #if defined(__linux__)
> >  #include 
> >  #define CPU_TO_LE32( x )   bswap_32( x )
> >
>
> Note that on some platforms this file would include endian.h - which
> defines those BYTE_ORDER etc. values. Albeit it includes this _after_
> these ifdefs...
> But don't ask me how this is really supposed to work...
>
> Roland

 includes  which includes 

However, this depends on the c/h files to include  before
including "compiler.h", which doesn't always happen (e.g
dummy_errors.c) and it is a very fragile situation.

So I think this is a good fix and this patch is:
Reviewed-by: Oded Gabbay 

Jochen,

Note that I downloaded this patch from pw and it was malformed. I
don't know if it's a pw problem or a problem in how you sent the patch
to the ml.

Oded
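
The include-order problem discussed above goes away when the test relies only on macros the compiler predefines. A minimal sketch, assuming GCC or Clang (which predefine __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__, as the gcc -dM output quoted earlier shows); the SKETCH_ and sketch_ names are hypothetical and not part of Mesa:

```c
#include <stdint.h>

/* Compile-time check using only compiler-predefined macros, so it does not
 * depend on any header having been included first. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/* Run-time check: look at which byte of a known value is stored first. */
static int
sketch_is_big_endian_runtime(void)
{
   const uint32_t probe = 0x01020304u;
   const unsigned char *bytes = (const unsigned char *) &probe;

   return bytes[0] == 0x01;
}

/* Portable 32-bit byte swap, used here instead of a platform header. */
static uint32_t
sketch_bswap32(uint32_t x)
{
   return ((x & 0x000000ffu) << 24) |
          ((x & 0x0000ff00u) << 8) |
          ((x & 0x00ff0000u) >> 8) |
          ((x & 0xff000000u) >> 24);
}

/* Convert a CPU-order value to little-endian byte order. */
static uint32_t
sketch_cpu_to_le32(uint32_t x)
{
   return SKETCH_BIG_ENDIAN ? sketch_bswap32(x) : x;
}
```

The run-time helper is there only as a cross-check: on a compiler that provides the predefined macros, it should agree with SKETCH_BIG_ENDIAN.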




Re: [Mesa-dev] [PATCH 1/2] glsl: remove unused helper

2016-02-15 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Mon, 2016-02-15 at 18:38 +1100, Timothy Arceri wrote:
> Seems to have become unused when i965 moved to NIR.
> ---
>  src/compiler/glsl/ir.cpp | 15 ---
>  src/compiler/glsl/ir.h   | 11 ---
>  2 files changed, 26 deletions(-)
> 
> diff --git a/src/compiler/glsl/ir.cpp b/src/compiler/glsl/ir.cpp
> index c7a2496..750f617 100644
> --- a/src/compiler/glsl/ir.cpp
> +++ b/src/compiler/glsl/ir.cpp
> @@ -1696,21 +1696,6 @@ interpolation_string(unsigned interpolation)
> return "";
>  }
>  
> -
> -glsl_interp_qualifier
> -ir_variable::determine_interpolation_mode(bool flat_shade)
> -{
> -   if (this->data.interpolation != INTERP_QUALIFIER_NONE)
> -  return (glsl_interp_qualifier) this->data.interpolation;
> -   int location = this->data.location;
> -   bool is_gl_Color =
> -  location == VARYING_SLOT_COL0 || location == VARYING_SLOT_COL1;
> -   if (flat_shade && is_gl_Color)
> -  return INTERP_QUALIFIER_FLAT;
> -   else
> -  return INTERP_QUALIFIER_SMOOTH;
> -}
> -
>  const char *const ir_variable::warn_extension_table[] = {
> "",
> "GL_ARB_shader_stencil_export",
> diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
> index bf9b7ca..93c893d 100644
> --- a/src/compiler/glsl/ir.h
> +++ b/src/compiler/glsl/ir.h
> @@ -432,17 +432,6 @@ public:
>  
> 
> /**
> -* Determine how this variable should be interpolated based on its
> -* interpolation qualifier (if present), whether it is gl_Color or
> -* gl_SecondaryColor, and whether flatshading is enabled in the current GL
> -* state.
> -*
> -* The return value will always be either INTERP_QUALIFIER_SMOOTH,
> -* INTERP_QUALIFIER_NOPERSPECTIVE, or INTERP_QUALIFIER_FLAT.
> -*/
> -   glsl_interp_qualifier determine_interpolation_mode(bool flat_shade);
> -
> -   /**
>  * Determine whether or not a variable is part of a uniform or
>  * shader storage block.
>  */




Re: [Mesa-dev] [PATCH v2] Android: fix build break in libmesa_program

2016-02-15 Thread Neil Roberts
Looks good to me. Sorry about breaking it.

Reviewed-by: Neil Roberts 

- Neil

Rob Herring  writes:

> Commit 5fd848f6c9ee ("program: Use _mesa_geometric_samples to calculate
> gl_NumSamples") broke Android builds. Add the missing include path "main"
> to framebuffer.h like other includes in prog_statevars.c.
>
> Cc: Neil Roberts 
> Cc: Ilia Mirkin 
> Signed-off-by: Rob Herring 
> ---
> v2: Add main to #include instead of adding the path to Android.mk
>
>  src/mesa/program/prog_statevars.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/prog_statevars.c 
> b/src/mesa/program/prog_statevars.c
> index eed2412..af5fefe 100644
> --- a/src/mesa/program/prog_statevars.c
> +++ b/src/mesa/program/prog_statevars.c
> @@ -40,7 +40,7 @@
>  #include "prog_statevars.h"
>  #include "prog_parameter.h"
>  #include "main/samplerobj.h"
> -#include "framebuffer.h"
> +#include "main/framebuffer.h"
>  
>  
>  #define ONE_DIV_SQRT_LN2 (1.201122408786449815)
> -- 
> 2.5.0


[Mesa-dev] [PATCH v2] Android: fix build break in libmesa_program

2016-02-15 Thread Rob Herring
Commit 5fd848f6c9ee ("program: Use _mesa_geometric_samples to calculate
gl_NumSamples") broke Android builds. Add the missing include path "main"
to framebuffer.h like other includes in prog_statevars.c.

Cc: Neil Roberts 
Cc: Ilia Mirkin 
Signed-off-by: Rob Herring 
---
v2: Add main to #include instead of adding the path to Android.mk

 src/mesa/program/prog_statevars.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/program/prog_statevars.c 
b/src/mesa/program/prog_statevars.c
index eed2412..af5fefe 100644
--- a/src/mesa/program/prog_statevars.c
+++ b/src/mesa/program/prog_statevars.c
@@ -40,7 +40,7 @@
 #include "prog_statevars.h"
 #include "prog_parameter.h"
 #include "main/samplerobj.h"
-#include "framebuffer.h"
+#include "main/framebuffer.h"
 
 
 #define ONE_DIV_SQRT_LN2 (1.201122408786449815)
-- 
2.5.0



Re: [Mesa-dev] [PATCH] egl/wayland: Try to use wl_surface.damage_buffer for SwapBuffersWithDamage

2016-02-15 Thread Pekka Paalanen
On Thu, 11 Feb 2016 10:34:10 -0600
Derek Foreman  wrote:

> Since commit d1314de293e9e4a63c35f094c3893aaaed8580b4 we ignore
> damage passed to SwapBuffersWithDamage.
> 
> Wayland 1.10 now has functionality that allows us to properly
> process those damage rectangles, and a way to query if it's
> available.
> 
> Now we can use wl_surface.damage_buffer and interpret the incoming
> damage as being in buffer co-ordinates.
> 
> Signed-off-by: Derek Foreman 
> ---
>  src/egl/drivers/dri2/platform_wayland.c | 32 +---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/src/egl/drivers/dri2/platform_wayland.c 
> b/src/egl/drivers/dri2/platform_wayland.c
> index c2438f7..b5a5b59 100644
> --- a/src/egl/drivers/dri2/platform_wayland.c
> +++ b/src/egl/drivers/dri2/platform_wayland.c
> @@ -653,6 +653,30 @@ create_wl_buffer(struct dri2_egl_surface *dri2_surf)
>_buffer_listener, dri2_surf);
>  }
>  
> +static EGLBoolean
> +try_damage_buffer(struct dri2_egl_surface *dri2_surf,
> +  const EGLint *rects,
> +  EGLint n_rects)
> +{
> +#ifdef WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION
> +   int i;
> +
> +   if (wl_proxy_get_version((struct wl_proxy *) dri2_surf->wl_win->surface)
> +   < WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION)
> +  return EGL_FALSE;
> +
> +   for (i = 0; i < n_rects; i++) {
> +      const int *rect = &rects[i * 4];
> +
> +  wl_surface_damage_buffer(dri2_surf->wl_win->surface,
> +   rect[0],
> +   dri2_surf->base.Height - rect[1] - rect[3],
> +   rect[2], rect[3]);
> +   }
> +   return EGL_TRUE;
> +#endif
> +   return EGL_FALSE;
> +}
>  /**
>   * Called via eglSwapBuffers(), drv->API.SwapBuffers().
>   */
> @@ -703,10 +727,12 @@ dri2_wl_swap_buffers_with_damage(_EGLDriver *drv,
> dri2_surf->dx = 0;
> dri2_surf->dy = 0;
>  
> -   /* We deliberately ignore the damage region and post maximum damage, due 
> to
> +   /* If the compositor doesn't support damage_buffer, we deliberately
> +* ignore the damage region and post maximum damage, due to
>  * https://bugs.freedesktop.org/78190 */
> -   wl_surface_damage(dri2_surf->wl_win->surface,
> - 0, 0, INT32_MAX, INT32_MAX);
> +   if (!n_rects || !try_damage_buffer(dri2_surf, rects, n_rects))
> +  wl_surface_damage(dri2_surf->wl_win->surface,
> +0, 0, INT32_MAX, INT32_MAX);
>  
> if (dri2_dpy->is_different_gpu) {
>_EGLContext *ctx = _eglGetCurrentContext();

Reviewed-by: Pekka Paalanen 

But I also agree with Emil that it would be good to add a comment on the
#ifdef WL_SURFACE_DAMAGE_BUFFER_SINCE_VERSION usage.

Bumping the wayland-client requirement to >= 1.10 would be nice,
but currently the requirement seems to be 1.2 so I wonder if there are
other things to be cleaned up too.

OTOH, with the #ifdef this patch could go to stable branches, couldn't
it?

How about landing this as is, tagged for stable, and a follow-up if
wanted to bump the wayland-client dependency on master? Would that be
appropriate?


Thanks,
pq
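
The coordinate conversion inside try_damage_buffer() can be looked at in isolation. EGL damage rectangles are given as {x, y, width, height} relative to the bottom-left corner of the surface, while wl_surface.damage_buffer expects buffer coordinates with the origin at the top-left, so only y changes. A minimal sketch; struct sketch_rect and sketch_flip_damage_rect() are hypothetical names used for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangle type; the Wayland request itself takes four
 * separate integers. */
struct sketch_rect {
   int x, y, width, height;
};

/* Convert one EGL damage rectangle (bottom-left origin) into buffer
 * coordinates (top-left origin) for a surface of the given height. */
struct sketch_rect
sketch_flip_damage_rect(const int *egl_rect, int surface_height)
{
   struct sketch_rect out;

   assert(egl_rect != NULL);

   out.x = egl_rect[0];
   /* The top edge of the rectangle sits y + height above the bottom. */
   out.y = surface_height - egl_rect[1] - egl_rect[3];
   out.width = egl_rect[2];
   out.height = egl_rect[3];
   return out;
}
```

For a 100-pixel-high surface, a rectangle starting 20 pixels above the bottom edge with height 40 ends up with y = 100 - 20 - 40 = 40 in buffer coordinates.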




[Mesa-dev] [PATCH] egl: support EGL_LARGEST_PBUFFER in eglCreatePbufferSurface(...)

2016-02-15 Thread Tapani Pälli
From: Daniel Czarnowski 

Patch provides a 'sane default' for a set pbuffer surface size when
EGL_LARGEST_PBUFFER is used by the client. MIN2 macro is moved to
egldefines so that it can be shared.

Fixes following Piglit test:
   egl-create-largest-pbuffer-surface

Signed-off-by: Matt Roper 
Cc: "11.0 11.1" LargestPbuffer) {
+  surf->Width = MIN2(surf->Width, _EGL_MAX_PBUFFER_WIDTH);
+  surf->Height = MIN2(surf->Height, _EGL_MAX_PBUFFER_HEIGHT);
+   }
+
return EGL_TRUE;
 }
 
-- 
2.5.0



[Mesa-dev] [PATCH] Revert "i965: Restore vbo after color resolve during brw_try_draw_prims()"

2016-02-15 Thread Topi Pohjolainen
This got pushed accidentally in the first place but wasn't reverted
as it didn't regress piglit but instead fixed one newly introduced
test exercising a corner case in the i965 driver. However, saving and
restoring vertex buffer context is complicated and requires more
thought.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94150

Signed-off-by: Topi Pohjolainen 
CC: Ben Widawsky 
CC: Ian Romanick 
Reviewed-by: Tapani Palli 
---
 src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c 
b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
index 93f1a85..b2b07e7 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c
@@ -887,15 +887,6 @@ brw_meta_resolve_color(struct brw_context *brw,
 
_mesa_meta_end(ctx);
 
-   /* Restore in case we were called in the middle of brw_try_draw_prims().
-* But only in case the just restored context really uses vertex buffer
-* objects.
-*/
-   if (ctx->API != API_OPENGLES) {
-  ctx->vbo_context->exec.array.recalculate_inputs = true;
-  vbo_bind_arrays(ctx);
-   }
-
/* We're typically called from intel_update_state() and we're supposed to
 * return with the state all updated to what it was before
 * brw_meta_resolve_color() was called.  The meta rendering will have
-- 
2.5.0


