Re: Vectorizer/alignment

2013-11-11 Thread Hendrik Greving
I've filed bug 59084. I think it actually might affect the same x86 backend stuff as bug 41464.

Hendrik

On Mon, Nov 11, 2013 at 4:00 PM, H.J. Lu wrote:
> On Mon, Nov 11, 2013 at 2:48 PM, Hendrik Greving
> wrote:
>> Ok, thanks, that explains it... Apparently x86 splits the vector movs
>> into 2 i

Re: Vectorizer/alignment

2013-11-11 Thread H.J. Lu
On Mon, Nov 11, 2013 at 2:48 PM, Hendrik Greving wrote:
> Ok, thanks, that explains it... Apparently x86 splits the vector movs
> into 2 in
> ix86_expand_vector_move_misalign->ix86_avx256_split_vector_move_misalign.
> But I wanted to mention that e.g. icc, despite also putting g_a, g_b,
> g_c int

Re: Vectorizer/alignment

2013-11-11 Thread Hendrik Greving
Ok, thanks, that explains it... Apparently x86 splits the vector movs into 2 in ix86_expand_vector_move_misalign->ix86_avx256_split_vector_move_misalign. But I wanted to mention that e.g. icc, despite also putting g_a, g_b, g_c into .comm, actually generates AVX2 vmovdqu using ymm... Examples: f

Re: Vectorizer/alignment

2013-11-11 Thread David Edelsohn
On Mon, Nov 11, 2013 at 3:56 PM, Richard Henderson wrote:
>> I suppose targets without .bss section support should not switch
>> (that is, targets not defining BSS_SECTION_ASM_OP or
>> ASM_OUTPUT_ALIGNED_BSS).
>
> Good point. I don't expect that we have many of those left, but
> if any do still

Re: Vectorizer/alignment

2013-11-11 Thread Richard Henderson
On 11/11/2013 11:57 PM, Richard Biener wrote:
> On Mon, Nov 11, 2013 at 2:39 PM, Jakub Jelinek wrote:
>> On Mon, Nov 11, 2013 at 02:13:24PM +0100, Richard Biener wrote:
>>> On Mon, Nov 11, 2013 at 12:39 PM, Jakub Jelinek wrote:
>>>> On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:

Re: Vectorizer/alignment

2013-11-11 Thread Richard Biener
On Mon, Nov 11, 2013 at 2:39 PM, Jakub Jelinek wrote:
> On Mon, Nov 11, 2013 at 02:13:24PM +0100, Richard Biener wrote:
>> On Mon, Nov 11, 2013 at 12:39 PM, Jakub Jelinek wrote:
>> > On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:
>> >> On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Grev

Re: Vectorizer/alignment

2013-11-11 Thread Jakub Jelinek
On Mon, Nov 11, 2013 at 02:13:24PM +0100, Richard Biener wrote:
> On Mon, Nov 11, 2013 at 12:39 PM, Jakub Jelinek wrote:
> > On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:
> >> On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving
> >> wrote:
> >> > That didn't do it. What was the rati

Re: Vectorizer/alignment

2013-11-11 Thread Richard Biener
On Mon, Nov 11, 2013 at 12:39 PM, Jakub Jelinek wrote:
> On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:
>> On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving
>> wrote:
>> > That didn't do it. What was the rationale w.r.t. the relation
>> > between the vectorized sequence and/or

Re: Vectorizer/alignment

2013-11-11 Thread Jakub Jelinek
On Mon, Nov 11, 2013 at 12:29:29PM +0100, Richard Biener wrote:
> On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving
> wrote:
> > That didn't do it. What was the rationale w.r.t. the relation
> > between the vectorized sequence and/or the alignment (I think these
> > things are actually 2 separat

Re: Vectorizer/alignment

2013-11-11 Thread Richard Biener
On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving wrote:
> That didn't do it. What was the rationale w.r.t. the relation
> between the vectorized sequence and/or the alignment (I think these
> things are actually 2 separate things..) and the common block?!

We cannot adjust the alignment of a co

Re: Vectorizer/alignment

2013-11-08 Thread Richard Biener
Hendrik Greving wrote:
>The code for a simple loop like
>
>for (i = 0; i < LENGTH-1; i++) {
>g_c[i] = g_a[i] + g_b[i];
>}
>
>looks good for g++ (4.9.0 20131028 (experimental)) (-O3 core-avx2)
>
>.L2:
>vmovdqa g_a(%rax), %ymm0 # 26 *movv8si_internal/2 [length = 8]
>vpaddd g_b(%rax), %ymm0,

Vectorizer/alignment

2013-11-08 Thread Hendrik Greving
The code for a simple loop like

for (i = 0; i < LENGTH-1; i++) {
    g_c[i] = g_a[i] + g_b[i];
}

looks good for g++ (4.9.0 20131028 (experimental)) (-O3 core-avx2)

.L2:
    vmovdqa g_a(%rax), %ymm0 # 26 *movv8si_internal/2 [length = 8]
    vpaddd g_b(%rax), %ymm0, %ymm0 # 27 *addv8si3/2 [length = 8]
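
For anyone wanting to reproduce the loop from the original post, here is a minimal sketch in C. Only the loop body comes from the post. The value of LENGTH, the array declarations, and the helper fill_inputs are assumptions made for illustration, and whether the arrays actually land in a common block depends on the language, the compiler flags, and the target.

```c
#define LENGTH 1024  /* assumed size; the original post does not state it */

/* Uninitialised file-scope arrays (tentative definitions in C).
   Assumption: declared like this so that a compiler using common
   symbols may place them in .comm, as discussed in the thread. */
int g_a[LENGTH];
int g_b[LENGTH];
int g_c[LENGTH];

/* Hypothetical helper, not from the thread: gives the inputs known values. */
void fill_inputs(void)
{
    int i;
    for (i = 0; i < LENGTH; i++) {
        g_a[i] = i;
        g_b[i] = 2 * i;
    }
}

/* The loop from the original post, wrapped in a function.
   Note the LENGTH-1 bound: the last element of g_c is left untouched. */
void add_arrays(void)
{
    int i;
    for (i = 0; i < LENGTH - 1; i++) {
        g_c[i] = g_a[i] + g_b[i];
    }
}
```

One way to experiment is to compile this to assembly with -O3 and an AVX2-capable -march setting, once as-is and once with -fno-common, and compare the vector load and store instructions. The exact output will vary with the GCC version and target.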