On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 05:27 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> > 
> >> On 07/27/2011 02:12 PM, Richard Guenther wrote:
> >>> On Wed, 27 Jul 2011, Tom de Vries wrote:
> >>>
> >>>> On 07/27/2011 01:50 PM, Tom de Vries wrote:
> >>>>> Hi Richard,
> >>>>>
> >>>>> I have a patch set for bug 43513 - The stack pointer is adjusted twice.
> >>>>>
> >>>>> 01_pr43513.3.patch
> >>>>> 02_pr43513.3.test.patch
> >>>>> 03_pr43513.3.mudflap.patch
> >>>>>
> >>>>> The patch set has been bootstrapped and reg-tested on x86_64.
> >>>>>
> >>>>> I will sent out the patches individually.
> >>>>>
> >>>>
> >>>> The patch replaces a vla __builtin_alloca that has a constant argument 
> >>>> with an
> >>>> array declaration.
> >>>>
> >>>> OK for trunk?
> >>>
> >>> I don't think it is safe to try to get at the VLA type the way you do.
> >>
> >> I don't understand in what way it's not safe. Do you mean I don't manage 
> >> to find
> >> the type always, or that I find the wrong type, or something else?
> > 
> > I think you might get the wrong type,
> 
> Ok, I'll review that code one more time.
> 
> > you also do not transform code
> > like
> > 
> >   int *p = alloca(4);
> >   *p = 3;
> > 
> > as there is no array type involved here.
> > 
> 
> I was trying to stay away from non-vla allocas.  A source declared alloca has
> function livetime, so we could have a single alloca in a loop, called 10 
> times,
> with all 10 instances live at the same time. This patch does not detect such
> cases, and thus stays away from non-vla allocas. A vla decl does not have such
> problems, the lifetime ends when it goes out of scope.

Yes indeed - that probably would require more detailed analysis.

> >>> In fact I would simply do sth like
> >>>
> >>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
> >>>   n_elem = size * 8 / BITS_PER_UNIT;
> >>>   array_type = build_array_type_nelts (elem_type, n_elem);
> >>>   var = create_tmp_var (array_type, NULL);
> >>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
> >>>
> >>
> >> I tried this code on the example, and it works, but the newly declared 
> >> type has
> >> an 8-bit alignment, while the vla base type has a 32 bit alignment.  This 
> >> make
> >> the memory access in the example potentially unaligned, which prohibits an
> >> ivopts optimization, so the resulting text size is 68 instead of the 64 
> >> achieved
> >> with my current patch.
> > 
> > Ok, so then set DECL_ALIGN of the variable to something reasonable
> > like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> > alignment that the targets alloca function would guarantee.
> > 
> 
> I tried that, but that doesn't help. It's the alignment of the type that
> matters, not of the decl.

It shouldn't.  All accesses are performed with the original types and
alignment comes from that (plus the underlying decl).

> So should we try to find the base type of the vla, and use that, or use the
> nonstandard char type?

I don't think we can reliably find the base type of the vla - well,
in practice we may because we control how we lower VLAs during
gimplification, but nothing in the IL constraints say that the
resulting pointer type should be special.

Using a char[] decl shouldn't be a problem IMHO.

> >>> And obviously you lose the optimization we arrange with inserting
> >>> __builtin_stack_save/restore pairs that way - stack space will no
> >>> longer be shared for subsequent VLAs.  Which means that you'd
> >>> better limit the size you allow this promotion.
> >>>
> >>
> >> Right, I could introduce a parameter for this.
> > 
> > I would think you could use PARAM_LARGE_STACK_FRAME for now and say,
> > allow a size of PARAM_LARGE_STACK_FRAME / 10?
> > 
> 
> That unfortunately is too small for the example from bug report. The default
> value of the param is 250, so that would be a threshold of 25, and the alloca
> size of the example is 40.  Perhaps we can try a threshold of
> PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such?

Hm.  estimated_stack_size is not O(1), so no.  I think we need to
find a sensible way of allowing stack sharing.  Eventually Michas
patch for introducing points-of-death would help here, if we'd
go for folding this during stack-save/restore optimization.

Richard.

Reply via email to