On e500v2 (SPE) hardware, a vector (double-word) load or store traps if
its address is not double-word aligned.
So this alignment is not optional.
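
To illustrate the constraint, here is a minimal host-side sketch in C. It
is not e500v2 code: the type spe_vec64 and both helper functions are
hypothetical names invented for this example, and it only checks that an
address is a multiple of 8 bytes, which is the condition the hardware
requires of a double-word access.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit SPE vector; not a real SPE type.
   The alignas(8) on the member forces double-word alignment of the struct. */
typedef struct {
    alignas(8) uint32_t half[2];
} spe_vec64;

/* Return true if P is aligned to ALIGN bytes.  ALIGN must be a power of two. */
static bool is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Return true if a double-word (8-byte) access at P meets the alignment
   requirement, i.e. would not be a misaligned access.  */
static bool double_word_access_ok(const void *p)
{
    return is_aligned(p, 8);
}
```

Any object of type spe_vec64 passes double_word_access_ok, while an
address 4 bytes into an 8-byte-aligned buffer does not.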
Edmar
On Fri, Jun 7, 2013 at 3:43 PM, Richard Henderson wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
>> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
>> for optimization purposes beyond ABI mandated levels. It is fine to emit
>> the vars aligned as much as we want for optimization purposes, but if we
>> can't be sure that references to that decl bind to the definition we
>> increased the alignment on (e.g. common variables, or -fpic code without
>> hidden visibility, weak vars etc.), we can't assume that alignment.
>
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment. Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.
>
> It's only in shared objects with non-hidden common blocks that we have a
> problem, since in that case we may resolve the common block via copy reloc to
> a memory block in another module.
>
> So while decl_binds_to_current_def_p is a good starting point, I think we can
> do a little better with common blocks. Which ought to take care of those
> vectorization regressions you mention.
>
>> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>> align = MAX_OFILE_ALIGNMENT;
>> }
>>
>> - /* On some machines, it is good to increase alignment sometimes. */
>> - if (! DECL_USER_ALIGN (decl))
>> + /* On some machines, it is good to increase alignment sometimes.
>> + But as DECL_ALIGN is used both for actually emitting the variable
>> + and for code accessing the variable as guaranteed alignment, we
>> + can only increase the alignment if it is a performance optimization
>> + if the references to it must bind to the current definition. */
>> + if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>> {
>> #ifdef DATA_ALIGNMENT
>> unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
>> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>> }
>> #endif
>> }
>> +#ifdef DATA_ABI_ALIGNMENT
>> + else if (! DECL_USER_ALIGN (decl))
>> +{
>> + unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
>> + /* For backwards compatibility, don't assume the ABI alignment for
>> + TLS variables. */
>> + if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
>> + align = data_align;
>> +}
>> +#endif
>
> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't. And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
> All that means is that in complex cases D_A will have to call D_A_A itself.
>
> I would think that it would be better to rearrange as
>
> if (!D_U_A)
> {
> #ifdef D_A_A
> align = ...
> #endif
> #ifdef D_A
> if (d_b_t_c_d_p)
> align = ...
> #endif
> }
>
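
The rearrangement sketched above can be modelled as a small standalone C
function. This is only an illustration under stated assumptions, not the
GCC code: struct decl_info, abi_alignment, opt_alignment and
choose_alignment are hypothetical names, and the 64-bit and 128-bit
minimums are made-up values standing in for whatever a target's
DATA_ABI_ALIGNMENT and DATA_ALIGNMENT would return.

```c
#include <stdbool.h>

/* Hypothetical model of the two decl properties the logic consults. */
struct decl_info {
    bool user_align;           /* models DECL_USER_ALIGN */
    bool binds_to_current_def; /* models decl_binds_to_current_def_p */
};

/* Hypothetical ABI-mandated minimum (in bits); stands in for D_A_A. */
static unsigned abi_alignment(unsigned align)
{
    return align < 64 ? 64 : align;
}

/* Hypothetical optimization-only increase (in bits); stands in for D_A. */
static unsigned opt_alignment(unsigned align)
{
    return align < 128 ? 128 : align;
}

/* ABI alignment is applied whenever the user did not set an alignment;
   the optimization increase is applied on top only when every reference
   must bind to this definition.  */
static unsigned choose_alignment(const struct decl_info *d, unsigned align)
{
    if (!d->user_align) {
        align = abi_alignment(align);
        if (d->binds_to_current_def)
            align = opt_alignment(align);
    }
    return align;
}
```

With this shape the optimization hook never has to return more than the
ABI hook, because it always starts from the ABI-adjusted value.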
> Why the special case for TLS? If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.
>
>> Bootstrapped/regtested on x86_64-linux and i686-linux. No idea about other
>> targets, I've kept them all using DATA_ALIGNMENT, which is considered
>> optimization increase only now, if there is some ABI mandated alignment
>> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
>> well as DATA_ALIGNMENT.
>
> I've had a brief look over the instances of D_A within the tree atm. Most of
> them carry the cut-n-paste comment "for the same reasons". These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
>
> c6x     comment explicitly mentions abi
> cris    compiler options for alignment -- systemwide or local?
> mmix    comment mentions GETA instruction
> s390    comment mentions LARL instruction
> rs6000  SPE and E500 portion of the alignment non-optional?
>
> Relevant port maintainers CCed.
>
>
> r~