Re: Vectorization: Loop peeling with misaligned support.

2013-11-17 Thread Toon Moene
On 11/16/2013 04:25 AM, Tim Prince wrote: Many decisions on compiler defaults still are based on an unscientific choice of benchmarks, with gcc evidently more responsive to input from the community. I'm also quite convinced that we are hampered by the fact that there is no IPA on alignment

Re: Vectorization: Loop peeling with misaligned support.

2013-11-17 Thread Richard Biener
Ondřej Bílka nel...@seznam.cz wrote: On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote: Ondřej Bílka nel...@seznam.cz wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: IIRC what can still be seen is store-buffer related slowdowns when you have a big

Re: Vectorization: Loop peeling with misaligned support.

2013-11-17 Thread Ondřej Bílka
On Sun, Nov 17, 2013 at 04:42:18PM +0100, Richard Biener wrote: Ondřej Bílka nel...@seznam.cz wrote: On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote: Ondřej Bílka nel...@seznam.cz wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: IIRC what can still

Re: Vectorization: Loop peeling with misaligned support.

2013-11-16 Thread Richard Biener
Ondřej Bílka nel...@seznam.cz wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: Also keep in mind that usually costs go up significantly if misalignment causes cache line splits (processor will fetch 2 lines). There are non-linear costs of filling up the store queue in

Re: Vectorization: Loop peeling with misaligned support.

2013-11-16 Thread Ondřej Bílka
On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote: Ondřej Bílka nel...@seznam.cz wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: IIRC what can still be seen is store-buffer related slowdowns when you have a big unaligned store load in your loop. Thus

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Richard Biener
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote: Hi, In loop vectorization, I found that the vectorizer insists on loop peeling even though our target supports misaligned memory access. This results in much bigger code size for a very simple loop. I defined

RE: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Bingfeng Mei
Subject: Re: Vectorization: Loop peeling with misaligned support. On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote: Hi, In loop vectorization, I found that the vectorizer insists on loop peeling even though our target supports misaligned memory access. This results in much bigger code size for a very

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Hendrik Greving
Subject: Re: Vectorization: Loop peeling with misaligned support. On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote: Hi, In loop vectorization, I found that the vectorizer insists on loop peeling even though our target supports misaligned memory access. This results in much bigger code size

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Xinliang David Li
Sent: 15 November 2013 14:02 To: Bingfeng Mei Cc: gcc@gcc.gnu.org Subject: Re: Vectorization: Loop peeling with misaligned support. On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote: Hi, In loop vectorization, I found that the vectorizer insists on loop peeling even though our target

RE: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Bingfeng Mei
to guarantee to generate loop peeling. Bingfeng -----Original Message----- From: Xinliang David Li [mailto:davi...@google.com] Sent: 15 November 2013 17:30 To: Bingfeng Mei Cc: Richard Biener; gcc@gcc.gnu.org Subject: Re: Vectorization: Loop peeling with misaligned support. The right longer

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Xinliang David Li
. Bingfeng -----Original Message----- From: Xinliang David Li [mailto:davi...@google.com] Sent: 15 November 2013 17:30 To: Bingfeng Mei Cc: Richard Biener; gcc@gcc.gnu.org Subject: Re: Vectorization: Loop peeling with misaligned support. The right longer term fix is suggested by Richard

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Ondřej Bílka
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: Also keep in mind that usually costs go up significantly if misalignment causes cache line splits (processor will fetch 2 lines). There are non-linear costs of filling up the store queue in modern out-of-order processors (x86).
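To make the cache-line-split point concrete, here is a minimal C sketch. It is not taken from the thread: the 64-byte line size, the 16-byte vector width, and the names crosses_line, count_splits and peel_bytes are all assumptions chosen for illustration. It counts how many vector accesses in a stream touch two cache lines, and how many bytes a peeled prologue would need to cover to reach vector alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters, not from the thread: typical x86 line size
   and an SSE-sized vector. */
#define CACHE_LINE 64u
#define VEC_BYTES  16u

/* Returns 1 if an access of `size` bytes starting at `addr` touches
   two cache lines, 0 otherwise. */
int crosses_line(uintptr_t addr, size_t size)
{
    return (addr / CACHE_LINE) != ((addr + size - 1) / CACHE_LINE);
}

/* Counts how many of `n` consecutive vector accesses starting at
   `base` split a cache line. */
size_t count_splits(uintptr_t base, size_t n)
{
    size_t splits = 0;
    for (size_t i = 0; i < n; i++)
        splits += (size_t)crosses_line(base + i * VEC_BYTES, VEC_BYTES);
    return splits;
}

/* Bytes a peeled scalar prologue must cover before `base` becomes
   aligned to the vector width (0 if it already is). */
size_t peel_bytes(uintptr_t base)
{
    return (VEC_BYTES - base % VEC_BYTES) % VEC_BYTES;
}

/* Small self-check: with a base that is 4 bytes past a line start,
   every fourth 16-byte access splits a line; after peeling to vector
   alignment, none do. */
void example_checks(void)
{
    uintptr_t base = 0x1004;
    assert(count_splits(base, 16) == 4);
    assert(peel_bytes(base) == 12);
    assert(count_splits(base + peel_bytes(base), 16) == 0);
}
```

Under these assumptions only one access in four pays the split penalty without peeling, which is the kind of trade-off against code size that the thread is weighing. The real cost depends on the target and is not modeled here.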

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Ondřej Bílka
On Fri, Nov 15, 2013 at 11:26:06PM +0100, Ondřej Bílka wrote: Minor correction: a mutt read replaced the set1.s file with one that I later used for the avx2 variant. The correct file follows:
	.file	"set1.c"
	.text
	.p2align 4,,15
	.globl	set
	.type	set, @function

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Tim Prince
On 11/15/2013 2:26 PM, Ondřej Bílka wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: Also keep in mind that usually costs go up significantly if misalignment causes cache line splits (processor will fetch 2 lines). There are non-linear costs of filling up the store queue