On Wed, 27 Jul 2011, Tom de Vries wrote: > On 07/27/2011 05:27 PM, Richard Guenther wrote: > > On Wed, 27 Jul 2011, Tom de Vries wrote: > > > >> On 07/27/2011 02:12 PM, Richard Guenther wrote: > >>> On Wed, 27 Jul 2011, Tom de Vries wrote: > >>> > >>>> On 07/27/2011 01:50 PM, Tom de Vries wrote: > >>>>> Hi Richard, > >>>>> > >>>>> I have a patch set for bug 43513 - The stack pointer is adjusted twice. > >>>>> > >>>>> 01_pr43513.3.patch > >>>>> 02_pr43513.3.test.patch > >>>>> 03_pr43513.3.mudflap.patch > >>>>> > >>>>> The patch set has been bootstrapped and reg-tested on x86_64. > >>>>> > >>>>> I will sent out the patches individually. > >>>>> > >>>> > >>>> The patch replaces a vla __builtin_alloca that has a constant argument > >>>> with an > >>>> array declaration. > >>>> > >>>> OK for trunk? > >>> > >>> I don't think it is safe to try to get at the VLA type the way you do. > >> > >> I don't understand in what way it's not safe. Do you mean I don't manage > >> to find > >> the type always, or that I find the wrong type, or something else? > > > > I think you might get the wrong type, > > Ok, I'll review that code one more time. > > > you also do not transform code > > like > > > > int *p = alloca(4); > > *p = 3; > > > > as there is no array type involved here. > > > > I was trying to stay away from non-vla allocas. A source declared alloca has > function livetime, so we could have a single alloca in a loop, called 10 > times, > with all 10 instances live at the same time. This patch does not detect such > cases, and thus stays away from non-vla allocas. A vla decl does not have such > problems, the lifetime ends when it goes out of scope.
Yes indeed - that probably would require more detailed analysis. > >>> In fact I would simply do sth like > >>> > >>> elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1); > >>> n_elem = size * 8 / BITS_PER_UNIT; > >>> array_type = build_array_type_nelts (elem_type, n_elem); > >>> var = create_tmp_var (array_type, NULL); > >>> return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var)); > >>> > >> > >> I tried this code on the example, and it works, but the newly declared > >> type has > >> an 8-bit alignment, while the vla base type has a 32 bit alignment. This > >> make > >> the memory access in the example potentially unaligned, which prohibits an > >> ivopts optimization, so the resulting text size is 68 instead of the 64 > >> achieved > >> with my current patch. > > > > Ok, so then set DECL_ALIGN of the variable to something reasonable > > like MIN (size * 8, GET_MODE_PRECISION (word_mode)). Basically the > > alignment that the targets alloca function would guarantee. > > > > I tried that, but that doesn't help. It's the alignment of the type that > matters, not of the decl. It shouldn't. All accesses are performed with the original types and alignment comes from that (plus the underlying decl). > So should we try to find the base type of the vla, and use that, or use the > nonstandard char type? I don't think we can reliably find the base type of the vla - well, in practice we may because we control how we lower VLAs during gimplification, but nothing in the IL constraints say that the resulting pointer type should be special. Using a char[] decl shouldn't be a problem IMHO. > >>> And obviously you lose the optimization we arrange with inserting > >>> __builtin_stack_save/restore pairs that way - stack space will no > >>> longer be shared for subsequent VLAs. Which means that you'd > >>> better limit the size you allow this promotion. > >>> > >> > >> Right, I could introduce a parameter for this. > > > > I would think you could use PARAM_LARGE_STACK_FRAME for now and say, > > allow a size of PARAM_LARGE_STACK_FRAME / 10? > > > > That unfortunately is too small for the example from bug report. The default > value of the param is 250, so that would be a threshold of 25, and the alloca > size of the example is 40. Perhaps we can try a threshold of > PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such? Hm. estimated_stack_size is not O(1), so no. I think we need to find a sensible way of allowing stack sharing. Eventually Michas patch for introducing points-of-death would help here, if we'd go for folding this during stack-save/restore optimization. Richard.