[Mesa-dev] [PATCH] tgsi: properly parse indirect dimension references (e.g. for UBOs)
Signed-off-by: Ilia Mirkin --- src/gallium/auxiliary/tgsi/tgsi_text.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c b/src/gallium/auxiliary/tgsi/tgsi_text.c index c6134c5..6403344 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_text.c +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c @@ -804,6 +804,13 @@ parse_src_operand( src->Dimension.Indirect = 0; src->Dimension.Dimension = 0; src->Dimension.Index = bracket[0].index; + if (bracket[0].ind_file != TGSI_FILE_NULL) { + src->Dimension.Indirect = 1; + src->DimIndirect.File = bracket[0].ind_file; + src->DimIndirect.Index = bracket[0].ind_index; + src->DimIndirect.Swizzle = bracket[0].ind_comp; + src->DimIndirect.ArrayID = bracket[0].ind_array; + } bracket[0] = bracket[1]; } src->Register.Index = bracket[0].index; -- 1.8.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.
On Tue, Jul 8, 2014 at 9:51 PM, Chris Forbes wrote: > I think you want to move the memset after the !annotation->ann bail > out. Currently, if that allocation were to fail (and we care enough to > check...) , you'll segfault. Yeah... of course. Thanks. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.
I think you want to move the memset after the !annotation->ann bail out. Currently, if that allocation were to fail (and we care enough to check...) , you'll segfault. On Wed, Jul 9, 2014 at 3:44 PM, Matt Turner wrote: > Otherwise we'd compare uninitialized pointers with NULL and dereference, > leading to crashes. > --- > src/mesa/drivers/dri/i965/intel_asm_annotation.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c > b/src/mesa/drivers/dri/i965/intel_asm_annotation.c > index 4717baf..d524725 100644 > --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c > +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c > @@ -96,9 +96,13 @@ void annotate(struct brw_context *brw, >struct backend_instruction *inst, unsigned offset) > { > if (annotation->ann_size <= annotation->ann_count) { > + int old_size = annotation->ann_size; >annotation->ann_size = MAX2(1024, annotation->ann_size * 2); >annotation->ann = reralloc(annotation->mem_ctx, annotation->ann, > struct annotation, annotation->ann_size); > + memset(annotation->ann + old_size, 0, > + (annotation->ann_size - old_size) * sizeof(struct annotation)); > + >if (!annotation->ann) > return; > } > -- > 1.8.5.5 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.
Otherwise we'd compare uninitialized pointers with NULL and dereference, leading to crashes. --- src/mesa/drivers/dri/i965/intel_asm_annotation.c | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c b/src/mesa/drivers/dri/i965/intel_asm_annotation.c index 4717baf..d524725 100644 --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c @@ -96,9 +96,13 @@ void annotate(struct brw_context *brw, struct backend_instruction *inst, unsigned offset) { if (annotation->ann_size <= annotation->ann_count) { + int old_size = annotation->ann_size; annotation->ann_size = MAX2(1024, annotation->ann_size * 2); annotation->ann = reralloc(annotation->mem_ctx, annotation->ann, struct annotation, annotation->ann_size); + memset(annotation->ann + old_size, 0, + (annotation->ann_size - old_size) * sizeof(struct annotation)); + if (!annotation->ann) return; } -- 1.8.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 80933] Fullscreen OpenGL programs (e.g. games) crash if focus lost then regained, something to do with automatic compositing suspension
https://bugs.freedesktop.org/show_bug.cgi?id=80933 Michel Dänzer changed: What|Removed |Added Status|NEW |NEEDINFO --- Comment #2 from Michel Dänzer --- We'll need to see a backtrace of an application crash, with symbols for as many entries of the backtrace as possible. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] mesa/st: add support for dynamic ubo selection
Signed-off-by: Ilia Mirkin --- With ChrisF's patches to add support for this in core mesa, this generates code like: FRAG DCL OUT[0], COLOR DCL CONST[0] DCL CONST[1][0] DCL CONST[2][0] DCL CONST[3][0] DCL CONST[4][0] DCL TEMP[0], LOCAL DCL ADDR[0..1] IMM[0] UINT32 {0, 0, 0, 0} IMM[1] INT32 {1, 0, 0, 0} 0: UADD TEMP[0].x, CONST[0]., IMM[1]. 1: UARL ADDR[1].x, TEMP[0]. 2: UARL ADDR[1].x, TEMP[0]. 3: MOV TEMP[0], CONST[ADDR[1].x][0] 4: MOV OUT[0], TEMP[0] 5: END Not sure what the deal is with the two UARL's, but nouveau's backend removes one of them pretty easily. I assume others handle this too. Unfortunately the core patches aren't quite ready yet, but this patch doesn't regress anything. src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index 9bc7500..3202c56 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -1947,16 +1947,16 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) break; case ir_binop_ubo_load: { - ir_constant *uniform_block = ir->operands[0]->as_constant(); + ir_constant *const_uniform_block = ir->operands[0]->as_constant(); ir_constant *const_offset_ir = ir->operands[1]->as_constant(); unsigned const_offset = const_offset_ir ? const_offset_ir->value.u[0] : 0; + unsigned const_block = const_uniform_block ? const_uniform_block->value.u[0] + 1 : 0; st_src_reg index_reg = get_temp(glsl_type::uint_type); st_src_reg cbuf; cbuf.type = glsl_type::vec4_type->base_type; cbuf.file = PROGRAM_CONSTANT; cbuf.index = 0; - cbuf.index2D = uniform_block->value.u[0] + 1; cbuf.reladdr = NULL; cbuf.negate = 0; @@ -1966,7 +1966,6 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) /* Constant index into constant buffer */ cbuf.reladdr = NULL; cbuf.index = const_offset / 16; - cbuf.has_index2 = true; } else { /* Relative/variable index into constant buffer */ @@ -1976,6 +1975,21 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) memcpy(cbuf.reladdr, &index_reg, sizeof(index_reg)); } + if (const_uniform_block) { + /* Constant constant buffer */ + cbuf.reladdr2 = NULL; + cbuf.index2D = const_block; + cbuf.has_index2 = true; + } + else { + /* Relative/variable constant buffer */ + emit(ir, TGSI_OPCODE_UADD, st_dst_reg(index_reg), op[0], + st_src_reg_for_int(1)); + cbuf.reladdr2 = ralloc(mem_ctx, st_src_reg); + memcpy(cbuf.reladdr2, &index_reg, sizeof(index_reg)); + cbuf.has_index2 = true; + } + cbuf.swizzle = swizzle_for_size(ir->type->vector_elements); cbuf.swizzle += MAKE_SWIZZLE4(const_offset % 16 / 4, const_offset % 16 / 4, -- 1.8.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] nvc0/ir: support 2d constbuf indexing
Signed-off-by: Ilia Mirkin --- .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 13 + 1 file changed, 13 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp index 37e2f7e..95423d4 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp @@ -1606,6 +1606,19 @@ NVC0LoweringPass::visit(Instruction *i) i->op = OP_VFETCH; assert(prog->getType() != Program::TYPE_FRAGMENT); // INTERP } + } else if (i->src(0).getFile() == FILE_MEMORY_CONST) { + if (i->src(0).isIndirect(1)) { +Value *ptr; +if (i->src(0).isIndirect(0)) + ptr = bld.mkOp3v(OP_INSBF, TYPE_U32, bld.getSSA(), +i->getIndirect(0, 1), bld.mkImm(0x1010), +i->getIndirect(0, 0)); +else + ptr = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), +i->getIndirect(0, 1), bld.mkImm(16)); +i->setIndirect(0, 1, NULL); +i->setIndirect(0, 0, ptr); + } } break; case OP_ATOM: -- 1.8.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/11] nvc0: allocate more space before a counter is configured
On Sat, Jul 5, 2014 at 7:41 PM, Martin Peres wrote: > On 05/07/2014 20:49, Samuel Pitoiset wrote: >> >> On nvc0, a counter can up to 6 sources instead of only one >> for nve4+. This fixes a crash when a counter uses more than >> one source. > > The verb is missing in the first sentence :) Yeah, he accidentally the whole sentence[1]. But I fixed it up before pushing it. -ilia [1] http://ultimate-photos.com/share/i-just-accidentally-a-coca-cola-bottle.jpg ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/11] nvc0: allocate more space before a counter is configured
On 05/07/2014 20:49, Samuel Pitoiset wrote: On nvc0, a counter can up to 6 sources instead of only one for nve4+. This fixes a crash when a counter uses more than one source. The verb is missing in the first sentence :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: add fix geometry shader memory leaks
For the series: Reviewed-by: Marek Olšák Marek On Wed, Jul 9, 2014 at 12:32 AM, Brian Paul wrote: > Spotted by Charmaine Lee. > Cc: "10.2" > --- > src/mesa/main/context.c |3 +++ > src/mesa/main/shared.c |1 + > 2 files changed, 4 insertions(+) > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c > index b082159..50aae8b 100644 > --- a/src/mesa/main/context.c > +++ b/src/mesa/main/context.c > @@ -1215,6 +1215,9 @@ _mesa_free_context_data( struct gl_context *ctx ) > _mesa_reference_vertprog(ctx, &ctx->VertexProgram._Current, NULL); > _mesa_reference_vertprog(ctx, &ctx->VertexProgram._TnlProgram, NULL); > > + _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, NULL); > + _mesa_reference_geomprog(ctx, &ctx->GeometryProgram._Current, NULL); > + > _mesa_reference_fragprog(ctx, &ctx->FragmentProgram.Current, NULL); > _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._Current, NULL); > _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._TexEnvProgram, NULL); > diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c > index dc22025..5ae7014 100644 > --- a/src/mesa/main/shared.c > +++ b/src/mesa/main/shared.c > @@ -312,6 +312,7 @@ free_shared_state(struct gl_context *ctx, struct > gl_shared_state *shared) > _mesa_DeleteHashTable(shared->Programs); > > _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL); > + _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL); > _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL); > > _mesa_HashDeleteAll(shared->ATIShaders, delete_fragshader_cb, ctx); > -- > 1.7.10.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak
Changes looks good. Tested with svga driver. Reviewed-by: Charmaine Lee From: mesa-dev on behalf of Brian Paul Sent: Tuesday, July 8, 2014 3:32 PM To: mesa-dev@lists.freedesktop.org Cc: 10.2 Subject: [Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak Spotted by Charmaine Lee. Cc: "10.2" --- src/mesa/state_tracker/st_context.c |1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index c7f3ec6..c805a09 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -307,6 +307,7 @@ void st_destroy_context( struct st_context *st ) cso_release_all(st->cso_context); st_reference_fragprog(st, &st->fp, NULL); + st_reference_geomprog(st, &st->gp, NULL); st_reference_vertprog(st, &st->vp, NULL); /* release framebuffer surfaces */ -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=iVNYIcCaC9TDvyNBQU%2F5q5NVsC01tSgJb3oX27T14ck%3D%0A&m=PGTeI%2BoujnkFXc9mu71bomZEroIqI0JU%2FWsluOF9vs4%3D%0A&s=f6d160459d8d239933fc9e34ab65f560b5b06adc65b8c4ad0e1a4acacedda314 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] mesa/st: add per sample shading state to fp key and set interpolation
Alright. For the patch: Reviewed-by: Marek Olšák Marek On Wed, Jul 9, 2014 at 2:48 AM, Ilia Mirkin wrote: > So... I don't think we're going to figure this out here. At least I > have nothing enlightening to say. FWIW this is doing the same thing as > what i965 does wrt the persample_shading computation. It should be > pretty easy to change should we decide on a different interpretation > of the spec. > > The only question is whether this is necessary at all. i.e. what > happens if you have a ARB_gs5-enabled shader with a "regular" varying > and sample varying. If the regular varying should just be interpolated > at the sample, perhaps all this stuff is stupid and I should just drop > it and keep the "if per-sample shading, just interpolate at the sample > no matter what" semantics that ARB_sample_shading introduced. I tried > to start a thread about this but no one bit :) > > Hopefully people who have more experience than I will be able to > suggest how to proceed. As always, happy to do it whichever way. > > -ilia > > On Sat, Jul 5, 2014 at 6:48 PM, Marek Olšák wrote: >> What you say makes sense. I'm just asking what that sentence in the >> spec means if it isn't about interpolation. :) >> >> Marek >> >> On Sun, Jul 6, 2014 at 12:40 AM, Ilia Mirkin wrote: >>> On Sat, Jul 5, 2014 at 6:22 PM, Marek Olšák wrote: There is this vague statement in the sample shading spec: When the sample shading fraction is 1.0, a separate set of colors and other associated data are evaluated for each sample, each set of values are evaluated at the sample location. I thought it was about interpolation, meaning that the fraction must 1.0 for the per-sample interpolation to be enabled, right? >>> >>> I guess so. I'm pretty much just doing what the intel driver is >>> doing... see brw_wm.c. >>> >>> /* Gets the minimum number of shader invocations per fragment. >>> * This function is useful to determine if we need to do per >>> * sample shading or per fragment shading. >>> */ >>> GLint >>> _mesa_get_min_invocations_per_fragment(struct gl_context *ctx, >>>const struct gl_fragment_program >>> *prog, >>>bool ignore_sample_qualifier) >>> >>> So I guess instead of > 1 this should be checking for == >>> ctx->DrawBuffer->Visual.samples? (and > 1) >>> >>> Although in that case, I _really_ don't get the point of having >>> non-1.0/0.0 fractions ever existing. Why would you want it to be >>> shaded for e.g. 4/8 samples if all the inputs are going to get >>> interpolated the same way? >>> >>> -ilia >>> Marek On Sat, Jul 5, 2014 at 6:07 AM, Ilia Mirkin wrote: > This enables a gallium driver not to care about the semantics of > ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When > ARB_sample_shading-style sample shading is enabled, all of the fp inputs > are marked for interpolation at the sample location. > > Signed-off-by: Ilia Mirkin > --- > > I _think_ I got this right... at least the tests still pass. But my > understanding of all the various update atoms, etc is really weak... > please > read carefully :) > > Now that I understand all this interpolation stuff better, I see why it > was > suggested I add proper per-sample interpolation first and the > ARB_sample_shading stuff second. Oh well... > > src/mesa/state_tracker/st_atom_shader.c | 6 +- > src/mesa/state_tracker/st_program.c | 3 +++ > src/mesa/state_tracker/st_program.h | 3 +++ > 3 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/state_tracker/st_atom_shader.c > b/src/mesa/state_tracker/st_atom_shader.c > index 67c6157..6515a98 100644 > --- a/src/mesa/state_tracker/st_atom_shader.c > +++ b/src/mesa/state_tracker/st_atom_shader.c > @@ -89,6 +89,10 @@ update_fp( struct st_context *st ) > key.clamp_color = st->clamp_frag_color_in_shader && > st->ctx->Color._ClampFragmentColor; > > + /* Ignore sample qualifier while computing this flag. */ > + key.persample_shading = > + _mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > > 1; > + > st->fp_variant = st_get_fp_variant(st, stfp, &key); > > st_reference_fragprog(st, &st->fp, stfp); > @@ -108,7 +112,7 @@ update_fp( struct st_context *st ) > const struct st_tracked_state st_update_fp = { > "st_update_fp", /* name */ > { /* dirty */ > - _NEW_BUFFERS,/* mesa */ > + _NEW_BUFFERS | _NEW_MULTISAMPLE, /* mesa */ >ST_NEW_FRAGMENT_PROGRAM /* st */ > }, > update_fp
Re: [Mesa-dev] [PATCH 2/3] mesa/st: add per sample shading state to fp key and set interpolation
So... I don't think we're going to figure this out here. At least I have nothing enlightening to say. FWIW this is doing the same thing as what i965 does wrt the persample_shading computation. It should be pretty easy to change should we decide on a different interpretation of the spec. The only question is whether this is necessary at all. i.e. what happens if you have a ARB_gs5-enabled shader with a "regular" varying and sample varying. If the regular varying should just be interpolated at the sample, perhaps all this stuff is stupid and I should just drop it and keep the "if per-sample shading, just interpolate at the sample no matter what" semantics that ARB_sample_shading introduced. I tried to start a thread about this but no one bit :) Hopefully people who have more experience than I will be able to suggest how to proceed. As always, happy to do it whichever way. -ilia On Sat, Jul 5, 2014 at 6:48 PM, Marek Olšák wrote: > What you say makes sense. I'm just asking what that sentence in the > spec means if it isn't about interpolation. :) > > Marek > > On Sun, Jul 6, 2014 at 12:40 AM, Ilia Mirkin wrote: >> On Sat, Jul 5, 2014 at 6:22 PM, Marek Olšák wrote: >>> There is this vague statement in the sample shading spec: >>> >>> When the sample shading fraction is 1.0, a separate set of colors and >>> other associated data are evaluated for each sample, each set of values >>> are evaluated at the sample location. >>> >>> I thought it was about interpolation, meaning that the fraction must >>> 1.0 for the per-sample interpolation to be enabled, right? >> >> I guess so. I'm pretty much just doing what the intel driver is >> doing... see brw_wm.c. >> >> /* Gets the minimum number of shader invocations per fragment. >> * This function is useful to determine if we need to do per >> * sample shading or per fragment shading. >> */ >> GLint >> _mesa_get_min_invocations_per_fragment(struct gl_context *ctx, >>const struct gl_fragment_program >> *prog, >>bool ignore_sample_qualifier) >> >> So I guess instead of > 1 this should be checking for == >> ctx->DrawBuffer->Visual.samples? (and > 1) >> >> Although in that case, I _really_ don't get the point of having >> non-1.0/0.0 fractions ever existing. Why would you want it to be >> shaded for e.g. 4/8 samples if all the inputs are going to get >> interpolated the same way? >> >> -ilia >> >>> >>> Marek >>> >>> >>> On Sat, Jul 5, 2014 at 6:07 AM, Ilia Mirkin wrote: This enables a gallium driver not to care about the semantics of ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When ARB_sample_shading-style sample shading is enabled, all of the fp inputs are marked for interpolation at the sample location. Signed-off-by: Ilia Mirkin --- I _think_ I got this right... at least the tests still pass. But my understanding of all the various update atoms, etc is really weak... please read carefully :) Now that I understand all this interpolation stuff better, I see why it was suggested I add proper per-sample interpolation first and the ARB_sample_shading stuff second. Oh well... src/mesa/state_tracker/st_atom_shader.c | 6 +- src/mesa/state_tracker/st_program.c | 3 +++ src/mesa/state_tracker/st_program.h | 3 +++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_atom_shader.c b/src/mesa/state_tracker/st_atom_shader.c index 67c6157..6515a98 100644 --- a/src/mesa/state_tracker/st_atom_shader.c +++ b/src/mesa/state_tracker/st_atom_shader.c @@ -89,6 +89,10 @@ update_fp( struct st_context *st ) key.clamp_color = st->clamp_frag_color_in_shader && st->ctx->Color._ClampFragmentColor; + /* Ignore sample qualifier while computing this flag. */ + key.persample_shading = + _mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > 1; + st->fp_variant = st_get_fp_variant(st, stfp, &key); st_reference_fragprog(st, &st->fp, stfp); @@ -108,7 +112,7 @@ update_fp( struct st_context *st ) const struct st_tracked_state st_update_fp = { "st_update_fp", /* name */ { /* dirty */ - _NEW_BUFFERS,/* mesa */ + _NEW_BUFFERS | _NEW_MULTISAMPLE, /* mesa */ ST_NEW_FRAGMENT_PROGRAM /* st */ }, update_fp /* update */ diff --git a/src/mesa/state_tracker/st_program.c b/src/mesa/state_tracker/st_program.c index b603759..9d7b7c4 100644 --- a/src/mesa/state_tracker/st_program.c +++ b/src/mesa/state_tracker/st_pr
Re: [Mesa-dev] [PATCH 1/3] gallium: switch dedicated centroid field to interpolation location
On Sat, Jul 5, 2014 at 5:20 PM, Roland Scheidegger wrote: > Am 05.07.2014 06:07, schrieb Ilia Mirkin: >> The new location field can be either center, centroid, or sample, which >> indicates the location that the shader should interpolate at. > Looks good though maybe it should be mentioned in the interpolator > section in tgsi.rst (yeah centroid wasn't neither). I added this bit: """ +The Location field specifies the location inside the pixel that the +interpolation should be done at, one of ``TGSI_INTERPOLATE_LOC_*``. """ Let me know if you want me to resend. > In any case, > Reviewed-by: Roland Scheidegger > >> >> Signed-off-by: Ilia Mirkin >> --- >> >> I tried to make sure I hit all the uses, but I guess a bunch of drivers don't >> look at the Centroid field at all? > Well drivers not doing msaa or just fake one like llvmpipe aren't > interested in it. > And, this wasn't exposed via d3d9, though r300 reportedly can do it (the > docs even mention it for r5xx), I suspect using some of the crazy > backdoor mechanisms of d3d9. Some very quick look at the docs though > don't indicate how this could be done (if it's even possible per > interpolator) for these chips, which is I guess why centroid isn't handled. > >> >> src/gallium/auxiliary/tgsi/tgsi_build.c | 8 >> src/gallium/auxiliary/tgsi/tgsi_dump.c| 5 +++-- >> src/gallium/auxiliary/tgsi/tgsi_scan.c| 2 +- >> src/gallium/auxiliary/tgsi/tgsi_scan.h| 2 +- >> src/gallium/auxiliary/tgsi/tgsi_strings.c | 7 +++ >> src/gallium/auxiliary/tgsi/tgsi_strings.h | 2 ++ >> src/gallium/auxiliary/tgsi/tgsi_ureg.c| 12 ++-- >> src/gallium/auxiliary/tgsi/tgsi_ureg.h| 2 +- >> src/gallium/drivers/ilo/shader/toy_tgsi.c | 2 +- >> src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 +- >> src/gallium/drivers/r600/r600_shader.c| 4 ++-- >> src/gallium/drivers/radeonsi/si_shader.c | 6 +++--- >> src/gallium/include/pipe/p_shader_tokens.h| 9 +++-- >> src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 5 +++-- >> src/mesa/state_tracker/st_glsl_to_tgsi.h | 2 +- >> src/mesa/state_tracker/st_program.c | 13 + >> 16 files changed, 52 insertions(+), 31 deletions(-) >> >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c >> b/src/gallium/auxiliary/tgsi/tgsi_build.c >> index 7621b6a..bef5c75 100644 >> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c >> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c >> @@ -200,7 +200,7 @@ tgsi_default_declaration_interp( void ) >> struct tgsi_declaration_interp di; >> >> di.Interpolate = TGSI_INTERPOLATE_CONSTANT; >> - di.Centroid = 0; >> + di.Location = TGSI_INTERPOLATE_LOC_CENTER; >> di.CylindricalWrap = 0; >> di.Padding = 0; >> >> @@ -209,7 +209,7 @@ tgsi_default_declaration_interp( void ) >> >> static struct tgsi_declaration_interp >> tgsi_build_declaration_interp(unsigned interpolate, >> - unsigned centroid, >> + unsigned interpolate_location, >>unsigned cylindrical_wrap, >>struct tgsi_declaration *declaration, >>struct tgsi_header *header) >> @@ -217,7 +217,7 @@ tgsi_build_declaration_interp(unsigned interpolate, >> struct tgsi_declaration_interp di; >> >> di.Interpolate = interpolate; >> - di.Centroid = centroid; >> + di.Location = interpolate_location; >> di.CylindricalWrap = cylindrical_wrap; >> di.Padding = 0; >> >> @@ -433,7 +433,7 @@ tgsi_build_full_declaration( >>size++; >> >>*di = tgsi_build_declaration_interp(full_decl->Interp.Interpolate, >> - full_decl->Interp.Centroid, >> + full_decl->Interp.Location, >>full_decl->Interp.CylindricalWrap, >>declaration, >>header); >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c >> b/src/gallium/auxiliary/tgsi/tgsi_dump.c >> index 8e09bac..884d8cf 100644 >> --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c >> +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c >> @@ -349,8 +349,9 @@ iter_declaration( >> ENM( decl->Interp.Interpolate, tgsi_interpolate_names ); >>} >> >> - if (decl->Interp.Centroid) { >> - TXT( ", CENTROID" ); >> + if (decl->Interp.Location != TGSI_INTERPOLATE_LOC_CENTER) { >> + TXT( ", " ); >> + ENM( decl->Interp.Location, tgsi_interpolate_locations ); >>} >> >>if (decl->Interp.CylindricalWrap) { >> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c >> b/src/gallium/auxil
[Mesa-dev] A proposal for new testing requirements for stable releases
I've been doing stable-branch release of mesa for close to a year now. In all of that time, there's one thing that I've never been very comfortable with. Namely, release candidates are not tested very thoroughly prior to release.[*] I'm glad that we haven't had any major problems yet with broken releases. But I'd like to improve our release testing rather than just trust to future luck. Here are the changes that I'm proposing now for comments: 1. The release schedule needs to be predictable. The current ideal is "every 2 weeks" for stable releases. I believe that is healthy and feasible. I propose that releases occur consistently every 2 weeks on Friday, (in the release manager's timezone---currently America/Los_Angeles). 2. The release candidate needs to be made available in advance to allow for testing. I propose that the release manager push a candidate branch to the upstream repository by the end of the Monday prior to each scheduled Friday release. 3. I'd like to receive a testing report from each driver team This is the meat of my proposal. I'm requesting that each driver team designate one (or more) people that will be responsible for running (or collecting) tests for each release-candidate and reporting the results to me. With a new release-candidate pushed by the end of the day on Monday, and me starting the actual release work on Friday, that gives 72 hours for each team to perform testing. I'm happy to let each driver team decide its own testing strategy. I would hope it would be based on running piglit across a set of "interesting" hardware configurations, (and augmenting piglit as necessary as bug fixes are performed). But I do not plan to be involved in the specification of what those configurations are. All I need is something along the lines of: "The radeon team is OK with releasing commit " sent to me before the scheduled day of the release. Obviously, any negative report from any team can trigger changes to the patches to be included in the release. And in order to put some teeth into this scheme: I propose that on the day of the release, the release manager drop all driver-specific patches for any driver for which the driver team has not provided an affirmative testing report. Does the above sound like a reasonable scheme? If so,who from each driver team is willing to volunteer to send me testing reports? I'll be happy to send personal reminders to such volunteers as releases are getting close and testing reports are missing, (before yanking out any driver-specific patches). Thanks in advance for any thoughts or comments, -Carl PS. This same testing scheme can obviously be used for major releases as well, presumably with the same volunteers, (but different details on release timing). [*] For background, here's the testing scheme I've been using so far: 1. Running each release candidate through piglit on my laptop 2. Trying to leave the stable branch pushed out for at least a day or so to allow testing prior to release. The testing I do on my laptop at least gives some sanity checking for core, but very little driver-specific testing. Meanwhile, the majority of stable-branch fixes seem to be in driver-specific code. And I don't know if there is any serious testing of the stable branch by anyone else between when I push it and when I release, (I haven't heard much either way). I know it doesn't help that the timing of my releases has not often been predictable. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] mesa: add fix geometry shader memory leaks
Spotted by Charmaine Lee. Cc: "10.2" --- src/mesa/main/context.c |3 +++ src/mesa/main/shared.c |1 + 2 files changed, 4 insertions(+) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index b082159..50aae8b 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1215,6 +1215,9 @@ _mesa_free_context_data( struct gl_context *ctx ) _mesa_reference_vertprog(ctx, &ctx->VertexProgram._Current, NULL); _mesa_reference_vertprog(ctx, &ctx->VertexProgram._TnlProgram, NULL); + _mesa_reference_geomprog(ctx, &ctx->GeometryProgram.Current, NULL); + _mesa_reference_geomprog(ctx, &ctx->GeometryProgram._Current, NULL); + _mesa_reference_fragprog(ctx, &ctx->FragmentProgram.Current, NULL); _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._Current, NULL); _mesa_reference_fragprog(ctx, &ctx->FragmentProgram._TexEnvProgram, NULL); diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c index dc22025..5ae7014 100644 --- a/src/mesa/main/shared.c +++ b/src/mesa/main/shared.c @@ -312,6 +312,7 @@ free_shared_state(struct gl_context *ctx, struct gl_shared_state *shared) _mesa_DeleteHashTable(shared->Programs); _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL); + _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL); _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL); _mesa_HashDeleteAll(shared->ATIShaders, delete_fragshader_cb, ctx); -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa: fix geometry shader memory leak
Spotted by Charmaine Lee. Cc: "10.2" --- src/mesa/state_tracker/st_context.c |1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index c7f3ec6..c805a09 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -307,6 +307,7 @@ void st_destroy_context( struct st_context *st ) cso_release_all(st->cso_context); st_reference_fragprog(st, &st->fp, NULL); + st_reference_geomprog(st, &st->gp, NULL); st_reference_vertprog(st, &st->vp, NULL); /* release framebuffer surfaces */ -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] various exec_list things
Well, if that's how things work then, for the series, : Reviewed-by: Thomas Helland 2014-07-08 23:07 GMT+02:00 Matt Turner : > On Tue, Jul 8, 2014 at 2:04 PM, Thomas Helland > wrote: > > I don't feel comfortable reviewing things > > on here quite yet, so you should > > probably get a reviewed-by by someone else. > > > > That being said this series LGTM. > > Feel free to give a Reviewed-by tag if the code looks correct to you. > It's up to other people to decide how much that tag means. :) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders
On 07/08/2014 12:29 PM, Marek Olšák wrote: From: Marek Olšák This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work with sample shading. Also, if this is disabled, the error message at least makes sense now. --- src/glsl/glsl_parser.yy | 8 src/glsl/glsl_parser_extras.cpp | 2 ++ src/glsl/glsl_parser_extras.h | 2 ++ src/mesa/main/mtypes.h | 5 + 4 files changed, 17 insertions(+) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index b989749..4c87163 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -376,6 +376,14 @@ external_declaration_list: if ($2 != NULL) state->translation_unit.push_tail(& $2->link); } + | external_declaration_list extension_statement { + if (!state->allow_extension_directive_midshader) { + _mesa_glsl_error(& @2, state, + "#extension directive is not allowed " + "in the middle of a shader"); + YYERROR; + } + } ; variable_identifier: diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index b327c2b..890123a 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx, this->early_fragment_tests = false; memset(this->atomic_counter_offsets, 0, sizeof(this->atomic_counter_offsets)); + this->allow_extension_directive_midshader = + ctx->Const.AllowGLSLExtensionDirectiveMidShader; } /** diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index 1791816..f720d6a 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state { /** Atomic counter offsets by binding */ unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS]; + + bool allow_extension_directive_midshader; }; # define YYLLOC_DEFAULT(Current, Rhs, N) \ diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index a7126fd..b42a175 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3349,6 +3349,11 @@ struct gl_constants GLuint ForceGLSLVersion; /** +* Allow GLSL #extension directives in the middle of shaders. +*/ + GLboolean AllowGLSLExtensionDirectiveMidShader; + + /** * Does the driver support real 32-bit integers? (Otherwise, integers are * simulated via floats.) */ For both, Reviewed-by: Brian Paul There was a thread about this a few months ago. Photoshop is another app that has this issue. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/3] various exec_list things
On Tue, Jul 8, 2014 at 2:04 PM, Thomas Helland wrote: > I don't feel comfortable reviewing things > on here quite yet, so you should > probably get a reviewed-by by someone else. > > That being said this series LGTM. Feel free to give a Reviewed-by tag if the code looks correct to you. It's up to other people to decide how much that tag means. :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] various exec_list things
I don't feel comfortable reviewing things on here quite yet, so you should probably get a reviewed-by by someone else. That being said this series LGTM. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] exec_list: make various places use the new get_size() method
Instead of hand-rolling it. Signed-off-by: Connor Abbott --- src/glsl/ast_to_hir.cpp | 4 +--- src/glsl/ir_reader.cpp| 7 +++ src/glsl/opt_function_inlining.cpp| 7 ++- src/mesa/drivers/dri/i965/brw_fs.cpp | 5 + src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +--- src/mesa/program/ir_to_mesa.cpp | 5 + 6 files changed, 9 insertions(+), 23 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 885bee5..a2ab26b 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -5007,9 +5007,7 @@ ast_process_structure_or_interface_block(exec_list *instructions, * 'declarations' list in each of the elements. */ foreach_list_typed (ast_declarator_list, decl_list, link, declarations) { - foreach_list_typed (ast_declaration, decl, link, &decl_list->declarations) { - decl_count++; - } + decl_count += decl_list->declarations.get_size(); } /* Allocate storage for the fields and process the field diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp index 4017bdd..4f141de 100644 --- a/src/glsl/ir_reader.cpp +++ b/src/glsl/ir_reader.cpp @@ -723,10 +723,9 @@ ir_reader::read_expression(s_expression *expr) ir_read_error(expr, "invalid operator: %s", s_op->value()); return NULL; } - - int num_operands = -3; /* skip "expression" */ - foreach_in_list(s_expression, e, &((s_list *) expr)->subexpressions) - ++num_operands; + + /* skip "expression" */ + int num_operands = (int) ((s_list *) expr)->subexpressions.get_size() - 3; int expected_operands = ir_expression::get_num_operands(op); if (num_operands != expected_operands) { diff --git a/src/glsl/opt_function_inlining.cpp b/src/glsl/opt_function_inlining.cpp index b84bb8e..9626639 100644 --- a/src/glsl/opt_function_inlining.cpp +++ b/src/glsl/opt_function_inlining.cpp @@ -100,16 +100,13 @@ ir_call::generate_inline(ir_instruction *next_ir) { void *ctx = ralloc_parent(this); ir_variable **parameters; - int num_parameters; + unsigned num_parameters; int i; struct hash_table *ht; ht = hash_table_ctor(0, hash_table_pointer_hash, hash_table_pointer_compare); - num_parameters = 0; - foreach_in_list(ir_rvalue, param, &this->callee->parameters) - num_parameters++; - + num_parameters = this->callee->parameters.get_size(); parameters = new ir_variable *[num_parameters]; /* Generate the declarations for the parameters to our inlined code, diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a3ad375..6ab82b9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -2957,10 +2957,7 @@ fs_visitor::calculate_register_pressure() invalidate_live_intervals(); calculate_live_intervals(); - int num_instructions = 0; - foreach_in_list(fs_inst, inst, &instructions) { - ++num_instructions; - } + unsigned num_instructions = instructions.get_size(); regs_live_at_ip = rzalloc_array(mem_ctx, int, num_instructions); diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 28e59c6..ce8dc36 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -106,9 +106,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg, num_acp = 0; for (int b = 0; b < cfg->num_blocks; b++) { for (int i = 0; i < ACP_HASH_SIZE; i++) { - foreach_in_list(acp_entry, entry, &out_acp[b][i]) { -num_acp++; - } +num_acp += out_acp[b][i].get_size(); } } diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 1109051..cc9c8ae 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -2814,10 +2814,7 @@ get_mesa_program(struct gl_context *ctx, prog->NumTemporaries = v.next_temp; - int num_instructions = 0; - foreach_in_list(ir_instruction, node, &v.instructions) { - num_instructions++; - } + unsigned num_instructions = v.instructions.get_size(); mesa_instructions = (struct prog_instruction *)calloc(num_instructions, -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] exec_list: add a prepend function
This complements the existing append function. It's implemented in a rather simple way right now; it could be changed if performance is a concern. Signed-off-by: Connor Abbott --- src/glsl/list.h | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/src/glsl/list.h b/src/glsl/list.h index 922bd68..ca6ee9d 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -345,9 +345,15 @@ struct exec_list { void move_nodes_to(exec_list *target); /** -* Append all nodes from the source list to the target list +* Append all nodes from the source list to the end of the target list */ void append_list(exec_list *source); + + /** +* Prepend all nodes from the source list to the beginning of the target +* list +*/ + void prepend_list(exec_list *source); #endif }; @@ -479,6 +485,13 @@ exec_list_append(struct exec_list *list, struct exec_list *source) } static inline void +exec_list_prepend(struct exec_list *list, struct exec_list *source) +{ + exec_list_append(source, list); + exec_list_move_nodes_to(source, list); +} + +static inline void exec_node_insert_list_before(struct exec_node *n, struct exec_list *before) { if (exec_list_is_empty(before)) @@ -554,6 +567,11 @@ inline void exec_list::append_list(exec_list *source) exec_list_append(this, source); } +inline void exec_list::prepend_list(exec_list *source) +{ + exec_list_prepend(this, source); +} + inline void exec_node::insert_before(exec_list *before) { exec_node_insert_list_before(this, before); -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] exec_list: add a function to count the size of a list
Signed-off-by: Connor Abbott --- src/glsl/list.h | 20 1 file changed, 20 insertions(+) diff --git a/src/glsl/list.h b/src/glsl/list.h index ca6ee9d..68ab3fd 100644 --- a/src/glsl/list.h +++ b/src/glsl/list.h @@ -324,6 +324,8 @@ struct exec_list { const exec_node *get_tail() const; exec_node *get_tail(); + + unsigned get_size(); void push_head(exec_node *n); void push_tail(exec_node *n); @@ -405,6 +407,19 @@ exec_list_get_tail(struct exec_list *list) return !exec_list_is_empty(list) ? list->tail_pred : NULL; } +static inline unsigned +exec_list_get_size(struct exec_list *list) +{ + unsigned size = 0; + + for (struct exec_node *node = list->head; node->next != NULL; + node = node->next) { + size++; + } + + return size; +} + static inline void exec_list_push_head(struct exec_list *list, struct exec_node *n) { @@ -537,6 +552,11 @@ inline exec_node *exec_list::get_tail() return exec_list_get_tail(this); } +inline unsigned exec_list::get_size() +{ + return exec_list_get_size(this); +} + inline void exec_list::push_head(exec_node *n) { exec_list_push_head(this, n); -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] various exec_list things
This series adds a couple things I need to exec_list for my work, and does some cleanups made possible. Only compile tested on i965. Connor Abbott (3): exec_list: add a prepend function exec_list: add a function to count the size of a list exec_list: make various places use the new get_size() method src/glsl/ast_to_hir.cpp| 4 +-- src/glsl/ir_reader.cpp | 7 ++-- src/glsl/list.h| 40 +- src/glsl/opt_function_inlining.cpp | 7 ++-- src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +-- .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +-- src/mesa/program/ir_to_mesa.cpp| 5 +-- 7 files changed, 48 insertions(+), 24 deletions(-) -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
Am 08.07.2014 18:03, schrieb Erik Faye-Lund: > On Tue, Jul 8, 2014 at 5:44 PM, Roland Scheidegger wrote: >> Am 08.07.2014 08:22, schrieb Matt Turner: >>> On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund wrote: On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner wrote: > This comment tripped me up for a second. This really means that you've > found either > > - min(max(x, 0.0), 1.0); or > - max(min(x, 1.0), 0.0) Hmm, but are optimizing both of these to saturate OK? Shouldn't min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give 1.0? >>> >>> Under standard IEEE rules, wouldn't the NaN propagate through all of >>> these expressions? >> not under ieee754-2008 rules, no. min/max return the other number if one >> of them is a NaN. > > I'm not entirely sure what IEEE754-2008 has to say about min/max, as > GLSL already defines those operations (and states that IEEE754 > compliance isn't required). I was referring to the definition of the > less-than comparison-operator, which GLSL seems to lean on. And that > seems to always evaluate to 'false'. Oh right I didn't notice min/max are defined that way. I guess that's due to older specs - pre glsl 4 NaN support wasn't really required and just about anything involving NaNs or Infs was ok to give undefined results anyway (and some hw indeed didn't have NaN support). Even current glsl version is not very strict there, still saying things like "NaNs are not required to be generated" and "Operations and built-in functions that operate on a NaN are not required to return a NaN as the result". I am not entirely sure why the spec wasn't tightened up a bit, but maybe some glsl 4+ capable hw exists which doesn't implement d3d10 rules. > >> This is enforced by d3d10 so at least d3d10 capable >> hardware is going to honor that. >> https://urldefense.proofpoint.com/v1/url?u=http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v%3Dvs.85%29.aspx&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0A&m=9dVgmTYumlj5DwQp2o0tjv4fTQpBhA%2FCDiMURzjOadM%3D%0A&s=f153cbe35b73f7fdb09fc50f8cfab56651bf346257838bc32ad35a721110fe67 >> The order of min/max though indeed matters. > > Right, this is very useful information, thanks. This makes me think > that the perhaps the dependence on the less-than definition could be > considered a spec-bug. A quick test reveals that NVIDIA's Windows > driver implement both min(0.5, NaN) and min(NaN, 0.5) as returning > 0.5. So I'd say: raise an issue with Khronos, and implement DX10 > rules > I agree that this looks like a bug, at the very least (given the very lax NaN requirements in general) I think it should allow the d3d10 result if it doesn't do already (it is not obvious to me if the glsl less-than operator really needs to adhere to ieee754 spec wrt NaNs). Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/mesa, gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0
From: Marek Olšák Most (all?) Unigine shaders fail to compile without this if sample shading is advertised. This is, of course, Unigine developers' fault. --- src/gallium/include/state_tracker/st_api.h | 1 + src/gallium/state_trackers/dri/common/dri_context.c | 2 ++ src/gallium/state_trackers/dri/common/dri_screen.c | 1 + src/mesa/drivers/dri/common/drirc | 20 +--- src/mesa/drivers/dri/common/xmlpool/t_options.h | 5 + src/mesa/state_tracker/st_extensions.c | 3 +++ 6 files changed, 29 insertions(+), 3 deletions(-) diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h index 9dcb76f..46f0436 100644 --- a/src/gallium/include/state_tracker/st_api.h +++ b/src/gallium/include/state_tracker/st_api.h @@ -245,6 +245,7 @@ struct st_config_options boolean force_glsl_extensions_warn; unsigned force_glsl_version; boolean force_s3tc_enable; + boolean allow_glsl_extension_directive_midshader; }; /** diff --git a/src/gallium/state_trackers/dri/common/dri_context.c b/src/gallium/state_trackers/dri/common/dri_context.c index f6979a7..827f847 100644 --- a/src/gallium/state_trackers/dri/common/dri_context.c +++ b/src/gallium/state_trackers/dri/common/dri_context.c @@ -55,6 +55,8 @@ dri_fill_st_options(struct st_config_options *options, driQueryOptioni(optionCache, "force_glsl_version"); options->force_s3tc_enable = driQueryOptionb(optionCache, "force_s3tc_enable"); + options->allow_glsl_extension_directive_midshader = + driQueryOptionb(optionCache, "allow_glsl_extension_directive_midshader"); } GLboolean diff --git a/src/gallium/state_trackers/dri/common/dri_screen.c b/src/gallium/state_trackers/dri/common/dri_screen.c index dceb628..f894e5d 100644 --- a/src/gallium/state_trackers/dri/common/dri_screen.c +++ b/src/gallium/state_trackers/dri/common/dri_screen.c @@ -69,6 +69,7 @@ const __DRIconfigOptionsExtension gallium_config_options = { DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED("false") DRI_CONF_DISABLE_SHADER_BIT_ENCODING("false") DRI_CONF_FORCE_GLSL_VERSION(0) + DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false") DRI_CONF_SECTION_END DRI_CONF_SECTION_MISCELLANEOUS diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc index ebc04cd..4b9841b 100644 --- a/src/mesa/drivers/dri/common/drirc +++ b/src/mesa/drivers/dri/common/drirc @@ -11,17 +11,21 @@ Application bugs worked around in this file: is still 1.10. * Unigine Heaven 3.0 with ARB_texture_multisample uses a "ivec4 * vec4" - expression, which fails to compile with GLSL 1.10. + expression, which is illegal in GLSL 1.10. Adding "#version 130" fixes this. * Unigine Heaven 3.0 with ARB_shader_bit_encoding uses the uint keyword, which - fails to compile with GLSL 1.10. + is illegal in GLSL 1.10. Adding "#version 130" fixes this. * Unigine Heaven 3.0 with ARB_shader_bit_encoding uses a "uint & int" - expression, which fails (and should fail) to compile with any GLSL version. + expression, which is illegal in any GLSL version. Disabling ARB_shader_bit_encoding fixes this. +* If ARB_sample_shading is supported, Unigine Heaven 4.0 and Valley 1.0 uses + an #extension directive in the middle of its shaders, which is illegal + in GLSL. + TODO: document the other workarounds. --> @@ -45,6 +49,7 @@ TODO: document the other workarounds. + @@ -52,6 +57,15 @@ TODO: document the other workarounds. + + + + + + + + + diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h b/src/mesa/drivers/dri/common/xmlpool/t_options.h index fc9e104..b73a662 100644 --- a/src/mesa/drivers/dri/common/xmlpool/t_options.h +++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h @@ -105,6 +105,11 @@ DRI_CONF_OPT_BEGIN_V(force_glsl_version, int, def, "0:999") \ DRI_CONF_DESC(en,gettext("Force a default GLSL version for shaders that lack an explicit #version line")) \ DRI_CONF_OPT_END +#define DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER(def) \ +DRI_CONF_OPT_BEGIN_B(allow_glsl_extension_directive_midshader, def) \ +DRI_CONF_DESC(en,gettext("Allow GLSL #extension directives in the middle of shaders")) \ +DRI_CONF_OPT_END + /** diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 4207cb6..aa59fbf 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -772,6 +772,9 @@ void st_init_extensions(struct st_context *st) if (st->options.disable_glsl_line_continuations) ctx->Const.DisableGLSLLineContinuations = 1; + if (st->options.allow_glsl_extension_directive_midshader) + ctx->Const.A
[Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders
From: Marek Olšák This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work with sample shading. Also, if this is disabled, the error message at least makes sense now. --- src/glsl/glsl_parser.yy | 8 src/glsl/glsl_parser_extras.cpp | 2 ++ src/glsl/glsl_parser_extras.h | 2 ++ src/mesa/main/mtypes.h | 5 + 4 files changed, 17 insertions(+) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index b989749..4c87163 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -376,6 +376,14 @@ external_declaration_list: if ($2 != NULL) state->translation_unit.push_tail(& $2->link); } + | external_declaration_list extension_statement { + if (!state->allow_extension_directive_midshader) { + _mesa_glsl_error(& @2, state, + "#extension directive is not allowed " + "in the middle of a shader"); + YYERROR; + } + } ; variable_identifier: diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp index b327c2b..890123a 100644 --- a/src/glsl/glsl_parser_extras.cpp +++ b/src/glsl/glsl_parser_extras.cpp @@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx, this->early_fragment_tests = false; memset(this->atomic_counter_offsets, 0, sizeof(this->atomic_counter_offsets)); + this->allow_extension_directive_midshader = + ctx->Const.AllowGLSLExtensionDirectiveMidShader; } /** diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index 1791816..f720d6a 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state { /** Atomic counter offsets by binding */ unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS]; + + bool allow_extension_directive_midshader; }; # define YYLLOC_DEFAULT(Current, Rhs, N) \ diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index a7126fd..b42a175 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3349,6 +3349,11 @@ struct gl_constants GLuint ForceGLSLVersion; /** +* Allow GLSL #extension directives in the middle of shaders. +*/ + GLboolean AllowGLSLExtensionDirectiveMidShader; + + /** * Does the driver support real 32-bit integers? (Otherwise, integers are * simulated via floats.) */ -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/16] gallium: add is_percentage and is_float fields to pipe_driver_query_info
It would be nicer to define all the types as enums, e.g. enum pipe_driver_query_type { PIPE_DRIVER_QUERY_TYPE_BYTES, PIPE_DRIVER_QUERY_TYPE_PERCENTAGE, PIPE_DRIVER_QUERY_TYPE_FLOAT, PIPE_DRIVER_QUERY_TYPE_UINT }; That would be nicer than a bunch of booleans. If that's not possible, please document how the booleans interact with each other. Marek On Mon, Jul 7, 2014 at 5:47 PM, Samuel Pitoiset wrote: > This will be used by GL_AMD_performance_monitor. > > Signed-off-by: Samuel Pitoiset > --- > src/gallium/include/pipe/p_defines.h | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/gallium/include/pipe/p_defines.h > b/src/gallium/include/pipe/p_defines.h > index 501c1e2..db5c9bf 100644 > --- a/src/gallium/include/pipe/p_defines.h > +++ b/src/gallium/include/pipe/p_defines.h > @@ -740,6 +740,8 @@ struct pipe_driver_query_info > unsigned query_type; /* PIPE_QUERY_DRIVER_SPECIFIC + i */ > union pipe_numeric_type_union max_value; /* max value that can be > returned */ > boolean uses_byte_units; /* whether the result is in bytes */ > + boolean is_percentage; > + boolean is_float; > unsigned group_id; > }; > > -- > 2.0.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
On Tue, Jul 8, 2014 at 5:44 PM, Roland Scheidegger wrote: > Am 08.07.2014 08:22, schrieb Matt Turner: >> On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund wrote: >>> On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner wrote: This comment tripped me up for a second. This really means that you've found either - min(max(x, 0.0), 1.0); or - max(min(x, 1.0), 0.0) >>> >>> Hmm, but are optimizing both of these to saturate OK? Shouldn't >>> min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give >>> 1.0? >> >> Under standard IEEE rules, wouldn't the NaN propagate through all of >> these expressions? > not under ieee754-2008 rules, no. min/max return the other number if one > of them is a NaN. I'm not entirely sure what IEEE754-2008 has to say about min/max, as GLSL already defines those operations (and states that IEEE754 compliance isn't required). I was referring to the definition of the less-than comparison-operator, which GLSL seems to lean on. And that seems to always evaluate to 'false'. > This is enforced by d3d10 so at least d3d10 capable > hardware is going to honor that. > http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v=vs.85%29.aspx > The order of min/max though indeed matters. Right, this is very useful information, thanks. This makes me think that the perhaps the dependence on the less-than definition could be considered a spec-bug. A quick test reveals that NVIDIA's Windows driver implement both min(0.5, NaN) and min(NaN, 0.5) as returning 0.5. So I'd say: raise an issue with Khronos, and implement DX10 rules. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v3 PATCH 13/16] i965/vec4: Allow propagation of instructions with saturate flag to sel
When sel conditon is bounded within 0 and 1.0. This allows code as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: - Syntax clarification in inst->saturate assignment - Remove extra parenthesis when assigning src_reg value from copy_entry (Matt Turner) Signed-off-by: Abdiel Janulgue --- .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 75 ++ 1 file changed, 48 insertions(+), 27 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp index 2c41d02..3251551 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp @@ -36,13 +36,17 @@ extern "C" { namespace brw { +struct copy_entry { + src_reg *value[4]; + bool saturate; +}; + static bool is_direct_copy(vec4_instruction *inst) { return (inst->opcode == BRW_OPCODE_MOV && !inst->predicate && inst->dst.file == GRF && - !inst->saturate && !inst->dst.reladdr && !inst->src[0].reladdr && inst->dst.type == inst->src[0].type); @@ -74,16 +78,16 @@ is_channel_updated(vec4_instruction *inst, src_reg *values[4], int ch) static bool try_constant_propagate(struct brw_context *brw, vec4_instruction *inst, - int arg, src_reg *values[4]) + int arg, struct copy_entry *entry) { /* For constant propagation, we only handle the same constant * across all 4 channels. Some day, we should handle the 8-bit * float vector format, which would let us constant propagate * vectors better. */ - src_reg value = *values[0]; + src_reg value = *entry->value[0]; for (int i = 1; i < 4; i++) { - if (!value.equals(*values[i])) + if (!value.equals(*entry->value[i])) return false; } @@ -213,22 +217,22 @@ is_logic_op(enum opcode opcode) static bool try_copy_propagate(struct brw_context *brw, vec4_instruction *inst, - int arg, src_reg *values[4]) + int arg, struct copy_entry *entry) { /* For constant propagation, we only handle the same constant * across all 4 channels. Some day, we should handle the 8-bit * float vector format, which would let us constant propagate * vectors better. */ - src_reg value = *values[0]; + src_reg value = *entry->value[0]; for (int i = 1; i < 4; i++) { /* This is equals() except we don't care about the swizzle. */ - if (value.file != values[i]->file || - value.reg != values[i]->reg || - value.reg_offset != values[i]->reg_offset || - value.type != values[i]->type || - value.negate != values[i]->negate || - value.abs != values[i]->abs) { + if (value.file != entry->value[i]->file || + value.reg != entry->value[i]->reg || + value.reg_offset != entry->value[i]->reg_offset || + value.type != entry->value[i]->type || + value.negate != entry->value[i]->negate || + value.abs != entry->value[i]->abs) { return false; } } @@ -239,7 +243,7 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst, */ int s[4]; for (int i = 0; i < 4; i++) { - s[i] = BRW_GET_SWZ(values[i]->swizzle, + s[i] = BRW_GET_SWZ(entry->value[i]->swizzle, BRW_GET_SWZ(inst->src[arg].swizzle, i)); } value.swizzle = BRW_SWIZZLE4(s[0], s[1], s[2], s[3]); @@ -299,8 +303,23 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst, if (value.equals(inst->src[arg])) return false; + if (entry->saturate) { + switch(inst->opcode) { + case BRW_OPCODE_SEL: + if (inst->src[1].file != IMM || + inst->src[1].fixed_hw_reg.dw1.f < 0.0 || + inst->src[1].fixed_hw_reg.dw1.f > 1.0) { +return false; + } + break; + default: + return false; + } + } + value.type = inst->src[arg].type; inst->src[arg] = value; + inst->saturate = inst->saturate || entry->saturate; return true; } @@ -308,9 +327,9 @@ bool vec4_visitor::opt_copy_propagation() { bool progress = false; - src_reg *cur_value[virtual_grf_reg_count][4]; + struct copy_entry entries[virtual_grf_reg_count]; - memset(&cur_value, 0, sizeof(cur_value)); + memset(&entries, 0, sizeof(entries)); foreach_in_list(vec4_instruction, inst, &instructions) { /* This pass only works on basic blocks. If there's flow @@ -321,7 +340,7 @@ vec4_visitor::opt_copy_propagation() * src/glsl/opt_copy_propagation.cpp to track available copies. */ if (!is_dominated_by_previous_instruction(inst)) { -memset(cur_value, 0, sizeof(cur_value)); +memset(&entries, 0, sizeof(entries)); continue; } @@ -342,33 +361,34 @
[Mesa-dev] [v3 PATCH 12/16] i965/fs: Allow propagation of instructions with saturate flag to sel
When sel conditon is bounded within 0 and 1.0. This allows code as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: Syntax clarification in inst->saturate assignment (Matt Turner) Signed-off-by: Abdiel Janulgue --- src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 28e59c6..9d1aeab 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -43,6 +43,7 @@ struct acp_entry : public exec_node { fs_reg dst; fs_reg src; enum opcode opcode; + bool saturate; }; struct block_data { @@ -348,11 +349,26 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) } } + if (entry->saturate) { + switch(inst->opcode) { + case BRW_OPCODE_SEL: + if (inst->src[1].file != IMM || + inst->src[1].fixed_hw_reg.dw1.f < 0.0 || + inst->src[1].fixed_hw_reg.dw1.f > 1.0) { +return false; + } + break; + default: + return false; + } + } + inst->src[arg].file = entry->src.file; inst->src[arg].reg = entry->src.reg; inst->src[arg].reg_offset = entry->src.reg_offset; inst->src[arg].subreg_offset = entry->src.subreg_offset; inst->src[arg].stride *= entry->src.stride; + inst->saturate = inst->saturate || entry->saturate; if (!inst->src[arg].abs) { inst->src[arg].abs = entry->src.abs; @@ -515,7 +531,6 @@ can_propagate_from(fs_inst *inst) inst->src[0].file == UNIFORM || inst->src[0].file == IMM) && inst->src[0].type == inst->dst.type && - !inst->saturate && !inst->is_partial_write()); } @@ -570,6 +585,7 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block, entry->dst = inst->dst; entry->src = inst->src[0]; entry->opcode = inst->opcode; + entry->saturate = inst->saturate; acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry); } else if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD && inst->dst.file == GRF) { -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v3 PATCH 11/16] glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x), b)
v2: - Output max(saturate(x),b) instead of saturate(max(x,b)) - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is > 0.0 and inner constant is 1.0. - Fix comments to show that the optimization is a commutative operation (Matt Turner) Signed-off-by: Abdiel Janulgue --- src/glsl/opt_algebraic.cpp | 23 +++ 1 file changed, 23 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 6dfb681..447618f 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -137,6 +137,21 @@ is_less_than_one(ir_constant *ir) return (component == ir->type->vector_elements); } +static inline bool +is_greater_than_zero(ir_constant *ir) +{ + if (!is_valid_vec_const(ir)) + return false; + + unsigned component = 0; + for (int c = 0; c < ir->type->vector_elements; c++) { + if (ir->get_float_component(c) > 0.0f) + component++; + } + + return (component == ir->type->vector_elements); +} + static void update_type(ir_expression *ir) { @@ -684,6 +699,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero()) return expr(ir_binop_min, saturate(inner_val_a), inner_val_b); + +/* Found a {min|max} ({max|min} (x, b), 1.0), where b > 0.0 + * and its variations + */ +if (outer_const->is_one() && is_greater_than_zero(inner_val_b->as_constant())) + return expr(ir_binop_max, saturate(inner_val_a), inner_val_b); +if (inner_val_b->as_constant()->is_one() && is_greater_than_zero(outer_const)) + return expr(ir_binop_max, saturate(inner_val_a), outer_const); } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v3 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x), b)
v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is zero and inner constant is < 1 - Fix comments to reflect we are doing a commutative operation (Matt Turner) Signed-off-by: Abdiel Janulgue --- src/glsl/opt_algebraic.cpp | 39 +++ 1 file changed, 39 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index 4b052933..6dfb681 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -110,6 +110,33 @@ is_vec_basis(ir_constant *ir) return (ir == NULL) ? false : ir->is_basis(); } +static inline bool +is_valid_vec_const(ir_constant *ir) +{ + if (ir == NULL) + return false; + + if (!ir->type->is_scalar() && !ir->type->is_vector()) + return false; + + return true; +} + +static inline bool +is_less_than_one(ir_constant *ir) +{ + if (!is_valid_vec_const(ir)) + return false; + + unsigned component = 0; + for (int c = 0; c < ir->type->vector_elements; c++) { + if (ir->get_float_component(c) < 1.0f) + component++; + } + + return (component == ir->type->vector_elements); +} + static void update_type(ir_expression *ir) { @@ -645,6 +672,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) if ((outer_const->is_one() && inner_val_a->is_zero()) || (inner_val_a->is_one() && outer_const->is_zero())) return saturate(inner_val_b); + +/* Found a {min|max} ({max|min} (x, 0.0), b) where b < 1.0 + * and its variations + */ +if (is_less_than_one(outer_const) && inner_val_b->is_zero()) + return expr(ir_binop_min, saturate(inner_val_a), outer_const); + +if (!inner_val_b->as_constant()) + continue; + +if (is_less_than_one(inner_val_b->as_constant()) && outer_const->is_zero()) + return expr(ir_binop_min, saturate(inner_val_a), inner_val_b); } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [v3 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
v2: - Check that the base type is float (Ian Romanick) v3: - Make sure comments reflect that we are doing a commutative operation - Add missing condition where the inner constant is 1.0 and outer constant is 0.0 - Make indexing of operands easier to read (Matt Turner) Signed-off-by: Abdiel Janulgue --- src/glsl/opt_algebraic.cpp | 36 1 file changed, 36 insertions(+) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index ac7514a..4b052933 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -614,6 +614,42 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) break; + case ir_binop_min: + case ir_binop_max: + if (ir->type->base_type != GLSL_TYPE_FLOAT) + break; + + /* Replace min(max) operations and its commutative combinations with + * a saturate operation + */ + for (int op = 0; op < 2; op++) { + ir_expression *minmax = op_expr[op]; + ir_constant *outer_const = op_const[1 - op]; + ir_expression_operation op_cond = (ir->operation == ir_binop_max) ? +ir_binop_min : ir_binop_max; + + if (!minmax || !outer_const || (minmax->operation != op_cond)) +continue; + + /* Found a min(max) combination. Now try to see if its operands + * meet our conditions that we can do just a single saturate operation + */ + for (int minmax_op = 0; minmax_op < 2; minmax_op++) { +ir_rvalue *inner_val_a = minmax->operands[minmax_op]; +ir_rvalue *inner_val_b = minmax->operands[1 - minmax_op]; + +if (!inner_val_a || !inner_val_b) + continue; + +/* Found a {min|max} ({max|min} (x, 0.0), 1.0) operation and its variations */ +if ((outer_const->is_one() && inner_val_a->is_zero()) || +(inner_val_a->is_one() && outer_const->is_zero())) + return saturate(inner_val_b); + } + } + + break; + case ir_unop_rcp: if (op_expr[0] && op_expr[0]->operation == ir_unop_rcp) return op_expr[0]->operands[0]; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
Am 08.07.2014 08:22, schrieb Matt Turner: > On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund wrote: >> On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner wrote: >>> This comment tripped me up for a second. This really means that you've >>> found either >>> >>> - min(max(x, 0.0), 1.0); or >>> - max(min(x, 1.0), 0.0) >> >> Hmm, but are optimizing both of these to saturate OK? Shouldn't >> min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give >> 1.0? > > Under standard IEEE rules, wouldn't the NaN propagate through all of > these expressions? not under ieee754-2008 rules, no. min/max return the other number if one of them is a NaN. This is enforced by d3d10 so at least d3d10 capable hardware is going to honor that. http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050%28v=vs.85%29.aspx The order of min/max though indeed matters. > > The GLSL 4.40 spec says > > "Operations and built-in functions that operate on a NaN are not required to > return a NaN as the result." > > So it seems like we have a lot of flexibility here. Is there some text > I'm missing? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x), b)
On 08.07.2014 12:37, Abdiel Janulgue wrote: > > On 07.07.2014 20:25, Matt Turner wrote: >> On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue >> wrote: >>> v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by >>> Ilia Mirkin >>> - Make sure we do component-wise comparison for vectors (Ian Romanick) >>> >>> Signed-off-by: Abdiel Janulgue >>> --- >>> src/glsl/opt_algebraic.cpp | 10 ++ >>> 1 file changed, 10 insertions(+) >>> >>> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp >>> index 2ad561c..bcda7a9 100644 >>> --- a/src/glsl/opt_algebraic.cpp >>> +++ b/src/glsl/opt_algebraic.cpp >>> @@ -643,6 +643,16 @@ ir_algebraic_visitor::handle_expression(ir_expression >>> *ir) >>> /* Found a min (max(x, 0), 1.0) */ >>> if (outer_const->is_one() && inner_val_a->is_zero()) >>> return saturate(inner_val_b); >>> + >>> +unsigned component = 0; >>> +for (int c = 0; c < outer_const->type->vector_elements; c++) { >>> + if (outer_const->get_float_component(c) < 1.0f) >>> + component++; >>> +} >>> +/* Found a min (max(x, 0.0) b), where b < 1.0 */ >> >> Is this case the only thing it finds? >> >> Does it find max(min(x, b), 0.0)? > > Unfortunately, you are right. The detection only works if the inner > constant is zero. I'll fix this in the next revision > >> >>> +if ((component == outer_const->type->vector_elements) && >>> +inner_val_b->is_zero()) >>> + return expr(ir_binop_min, saturate(inner_val_a), >>> outer_const); >> >> It seems like it would incorrectly recognize max(min(x, 0.0), b). >> > > This one it would correctly recognize. Actually no, this one actually does not work either. Fix coming up! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 10/16] glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x), b)
On 07.07.2014 20:25, Matt Turner wrote: > On Mon, Jul 7, 2014 at 6:57 AM, Abdiel Janulgue > wrote: >> v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by >> Ilia Mirkin >> - Make sure we do component-wise comparison for vectors (Ian Romanick) >> >> Signed-off-by: Abdiel Janulgue >> --- >> src/glsl/opt_algebraic.cpp | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp >> index 2ad561c..bcda7a9 100644 >> --- a/src/glsl/opt_algebraic.cpp >> +++ b/src/glsl/opt_algebraic.cpp >> @@ -643,6 +643,16 @@ ir_algebraic_visitor::handle_expression(ir_expression >> *ir) >> /* Found a min (max(x, 0), 1.0) */ >> if (outer_const->is_one() && inner_val_a->is_zero()) >> return saturate(inner_val_b); >> + >> +unsigned component = 0; >> +for (int c = 0; c < outer_const->type->vector_elements; c++) { >> + if (outer_const->get_float_component(c) < 1.0f) >> + component++; >> +} >> +/* Found a min (max(x, 0.0) b), where b < 1.0 */ > > Is this case the only thing it finds? > > Does it find max(min(x, b), 0.0)? Unfortunately, you are right. The detection only works if the inner constant is zero. I'll fix this in the next revision > >> +if ((component == outer_const->type->vector_elements) && >> +inner_val_b->is_zero()) >> + return expr(ir_binop_min, saturate(inner_val_a), >> outer_const); > > It seems like it would incorrectly recognize max(min(x, 0.0), b). > This one it would correctly recognize. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] i965: overwriting parts of a register and optimization passes
Hi, I have some code that first initializes a register and then overwrites a specific subregister. However, after the optimization passes in brw_vec4.cpp the initialization instruction goes away. I see that live intervals are computed for the destination register without considering if we are writing only to a specific subregister, so I guess that is the reason for this behaviour, but this strikes me as odd, as it defeats the purpose of overwriting only specific subregs, so I guess there is something else that I am missing. This an example of what is happening: mov(8) g6<1>.xUD g0<4,4,1>UD { align16 WE_all 1Q }; mov(1) g6.2<1>UD 0x0007UD{ align1 WE_all compacted }; The first MOV is removed from the instruction set when I put the second MOV, however the second MOV should only be writing to subreg 6.2 and keep the rest of register 6 intact, or at least that is what I am trying to do... Can someone explain what is wrong with those two instructions? how should I overwrite only a specific subregister without causing this behavior? Thanks, Iago ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [v2 PATCH 09/16] glsl: Optimize clamp(x, 0, 1) as saturate(x)
On 08.07.2014 09:47, Erik Faye-Lund wrote: > On Tue, Jul 8, 2014 at 8:22 AM, Matt Turner wrote: >> On Mon, Jul 7, 2014 at 11:00 PM, Erik Faye-Lund wrote: >>> On Mon, Jul 7, 2014 at 7:18 PM, Matt Turner wrote: This comment tripped me up for a second. This really means that you've found either - min(max(x, 0.0), 1.0); or - max(min(x, 1.0), 0.0) >>> >>> Hmm, but are optimizing both of these to saturate OK? Shouldn't >>> min(max(NaN, 0.0), 1.0) give 0.0, whereas max(min(NaN, 1.0), 0.0) give >>> 1.0? >> >> Under standard IEEE rules, wouldn't the NaN propagate through all of >> these expressions? >> >> The GLSL 4.40 spec says >> >> "Operations and built-in functions that operate on a NaN are not required to >> return a NaN as the result." >> >> So it seems like we have a lot of flexibility here. Is there some text >> I'm missing? > > I think the point about flexibility is a bit weak or even > misunderstood: surely this applies to the built-ins used, not what we > generate after optimizing. So if we chose to do that, we'd need to > prevent min and max from propagating NaN even when issued stand-alone, > which might negatively impact performance in some ISAs. > > As to why I reacted, I was just remembering that I read somewhere (one > of Humus' low-level shader optimization paper, IIRC) that the HLSL > compiler refused to do some similar optimizations for NaN-reasons. > > Checking the spec a bit closer, though: > - min(x, y) is defined as "Returns y if y < x, otherwise it returns x" > - max(x, y) is defined as "Returns y if x < y, otherwise it returns x". > > All comparisons with NaN should AFAIK fail, making both the first and > second comparison return NaN, if NaN were to be "properly" supported. > So my initial analysis was wrong. However, all of the following > permutations of the same logic would still be inconsistent. > > min(max(0.0, x), 1.0) > max(min(1.0, x), 0.0) > min(1.0, max(0.0, x)) > max(0.0, min(1.0, x)) > min(1.0, max(x, 0.0)) > max(0.0, min(x, 1.0)) > > I don't understand the code well enough to figure out if the patch > optimizes these, though. The code actually goes through all 8 commutative variations and tries to optimize if it spots one of these conditions. > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev