On Tue, Apr 26, 2016 at 11:25 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote: > From: Nicolai Hähnle <nicolai.haeh...@amd.com> > > In optimized builds, visit(ir_expression *) experiences inlining with gcc that > leads the function to have a roughly 32KB stack frame. This is a problem given > that the function is called recursively. In non-optimized builds, the stack > frame is much smaller, hence one gets crashes that happen only in optimized > builds. > > Arguably there is a compiler bug or at least severe misfeature here. In any > case, the easy thing to do for now seems to be moving the bulk of the > non-recursive code into a separate function. This is sufficient to convince my > version of gcc not to blow up the stack frame of the recursive part. Just to > be > sure, add the gcc-specific noinline attribute to prevent this bug from > reoccuring if inliner heuristics change. > > Cc: "11.1 11.2" <mesa-sta...@lists.freedesktop.org> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95133 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95026 > --- > src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 24 ++++++++++++++++++++---- > 1 file changed, 20 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > index ad818a8..958b2c0 100644 > --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > @@ -450,6 +450,12 @@ public: > virtual void visit(ir_barrier *); > /*@}*/ > > + void visit_expression(ir_expression *, st_src_reg *) > +#if defined(__GNUC__) > + __attribute__((noinline)) > +#endif
As mentioned on IRC, there's something of a precedent for sticking these things into src/util/macros.h and src/gallium/include/pipe/p_compiler.h (although the src one appears more appropriate. There's even a src/mapi/glapi/gen/gl_XML.py:# define NOINLINE __attribute__((noinline)) However you choose to resolve this, Reviewed-by: Ilia Mirkin <imir...@alum.mit.edu> > + ; > + > void visit_atomic_counter_intrinsic(ir_call *); > void visit_ssbo_intrinsic(ir_call *); > void visit_membar_intrinsic(ir_call *); > @@ -1535,10 +1541,7 @@ glsl_to_tgsi_visitor::reladdr_to_temp(ir_instruction > *ir, > void > glsl_to_tgsi_visitor::visit(ir_expression *ir) > { > - unsigned int operand; > st_src_reg op[ARRAY_SIZE(ir->operands)]; > - st_src_reg result_src; > - st_dst_reg result_dst; > > /* Quick peephole: Emit MAD(a, b, c) instead of ADD(MUL(a, b), c) > */ > @@ -1561,7 +1564,7 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) > if (ir->operation == ir_quadop_vector) > assert(!"ir_quadop_vector should have been lowered"); > > - for (operand = 0; operand < ir->get_num_operands(); operand++) { > + for (unsigned int operand = 0; operand < ir->get_num_operands(); > operand++) { > this->result.file = PROGRAM_UNDEFINED; > ir->operands[operand]->accept(this); > if (this->result.file == PROGRAM_UNDEFINED) { > @@ -1578,6 +1581,19 @@ glsl_to_tgsi_visitor::visit(ir_expression *ir) > assert(!ir->operands[operand]->type->is_matrix()); > } > > + visit_expression(ir, op); > +} > + > +/* The non-recursive part of the expression visitor lives in a separate > + * function and should be prevented from being inlined, to avoid a stack > + * explosion when deeply nested expressions are visited. > + */ > +void > +glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op) > +{ > + st_src_reg result_src; > + st_dst_reg result_dst; > + > int vector_elements = ir->operands[0]->type->vector_elements; > if (ir->operands[1]) { > vector_elements = MAX2(vector_elements, > -- > 2.5.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev