On 11/16/2013 04:25 AM, Tim Prince wrote:
Many decisions on compiler defaults still are based on an unscientific
choice of benchmarks, with gcc evidently more responsive to input from
the community.
I'm also quite convinced that we are hampered by the fact that there is
no IPA on alignment
Ondřej Bílka nel...@seznam.cz wrote:
On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote:
Ondřej Bílka nel...@seznam.cz wrote:
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
IIRC what can still be seen is store-buffer related slowdowns when
you have a big
On Sun, Nov 17, 2013 at 04:42:18PM +0100, Richard Biener wrote:
Ondřej Bílka nel...@seznam.cz wrote:
On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote:
Ondřej Bílka nel...@seznam.cz wrote:
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
IIRC what can still
Ondřej Bílka nel...@seznam.cz wrote:
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
Also keep in mind that usually costs go up significantly if
misalignment causes cache line splits (processor will fetch 2 lines).
There are non-linear costs of filling up the store queue in
On Sat, Nov 16, 2013 at 11:37:36AM +0100, Richard Biener wrote:
Ondřej Bílka nel...@seznam.cz wrote:
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
IIRC what can still be seen is store-buffer related slowdowns when you have a
big unaligned store load in your loop. Thus
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote:
Hi,
In loop vectorization, I found that the vectorizer insists on loop peeling even
though our target supports misaligned memory access. This results in much bigger
code size for a very simple loop. I defined
: Vectorization: Loop peeling with misaligned support.
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote:
Hi,
In loop vectorization, I found that the vectorizer insists on loop peeling even
though our target supports misaligned memory access. This results in much bigger
code size for a very
: Vectorization: Loop peeling with misaligned support.
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote:
Hi,
In loop vectorization, I found that the vectorizer insists on loop peeling even
though our target supports misaligned memory access. This results in much bigger
code size
: 15 November 2013 14:02
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Vectorization: Loop peeling with misaligned support.
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei b...@broadcom.com wrote:
Hi,
In loop vectorization, I found that the vectorizer insists on loop peeling even
though our target
to guarantee to generate
loop peeling.
Bingfeng
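For readers unfamiliar with the term, the loop peeling discussed in this thread can be sketched in plain C. This is only an illustration of the shape of the generated code, not GCC's actual output: the function name `add_one_peeled` and the 16-byte `VEC_BYTES` width are assumptions, and the inner scalar loop merely stands in for a single vector instruction. The three separate loops show why peeling inflates code size for a very simple loop.

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16u /* assumed vector width in bytes */

/* Hypothetical example: add 1 to every element, laid out the way a
 * peeled, vectorized loop is structured. */
static void add_one_peeled(int32_t *a, size_t n)
{
    size_t i = 0;

    /* Prologue (the peeled iterations): scalar code until a + i is
     * aligned to the vector width. */
    while (i < n && ((uintptr_t)(a + i) % VEC_BYTES) != 0) {
        a[i] += 1;
        i++;
    }

    /* Main loop: stands in for the vector body, processing one aligned
     * block of `lanes` elements per iteration. */
    const size_t lanes = VEC_BYTES / sizeof(int32_t);
    for (; i + lanes <= n; i += lanes)
        for (size_t k = 0; k < lanes; k++)
            a[i + k] += 1;

    /* Epilogue: leftover elements that do not fill a whole vector. */
    for (; i < n; i++)
        a[i] += 1;
}
```

On a target with cheap misaligned accesses, the prologue could in principle be dropped and the main loop run from `i = 0`, which is the trade-off the thread is about.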
-----Original Message-----
From: Xinliang David Li [mailto:davi...@google.com]
Sent: 15 November 2013 17:30
To: Bingfeng Mei
Cc: Richard Biener; gcc@gcc.gnu.org
Subject: Re: Vectorization: Loop peeling with misaligned support.
The right longer
.
Bingfeng
-----Original Message-----
From: Xinliang David Li [mailto:davi...@google.com]
Sent: 15 November 2013 17:30
To: Bingfeng Mei
Cc: Richard Biener; gcc@gcc.gnu.org
Subject: Re: Vectorization: Loop peeling with misaligned support.
The right longer term fix is suggested by Richard
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
Also keep in mind that usually costs go up significantly if
misalignment causes cache line splits (processor will fetch 2 lines).
There are non-linear costs of filling up the store queue in modern
out-of-order processors (x86).
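As a rough illustration of the cache line split mentioned above, the check below reports whether an access of a given size starting at a given address touches two lines. It is a sketch under stated assumptions: the helper name `splits_cache_line` is hypothetical, and the 64-byte `CACHE_LINE` size is an assumption that is typical of current x86 processors but not taken from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; typical for current x86 parts. */
#define CACHE_LINE 64u

/* Hypothetical helper: true if the byte range [addr, addr + size)
 * touches two different cache lines, i.e. the access is split. */
static bool splits_cache_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return false;
    /* Compare the line index of the first and the last byte accessed. */
    return (addr / CACHE_LINE) != ((addr + size - 1) / CACHE_LINE);
}
```

For example, a 16-byte access at offset 48 within a line stays inside that line, while the same access at offset 56 straddles the boundary and needs both lines.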
On Fri, Nov 15, 2013 at 11:26:06PM +0100, Ondřej Bílka wrote:
Minor correction: a mutt read replaced the set1.s file with one that I later
used for the avx2 variant. The correct file follows:
.file "set1.c"
.text
.p2align 4,,15
.globl set
.type set, @function
On 11/15/2013 2:26 PM, Ondřej Bílka wrote:
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote:
Also keep in mind that usually costs go up significantly if
misalignment causes cache line splits (processor will fetch 2 lines).
There are non-linear costs of filling up the store queue