Re: Thank you!

2013-11-15 Thread Jeff Law
On 11/15/13 18:33, Mark Mitchell wrote: I'd very much like to thank all who are, have been, or will be developers and maintainers of GCC. Of course, I'm particularly grateful to those who reviewed my patches, fixed the bugs I introduced, endured my nit-picking reviews of their patches, and so f

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Tim Prince
On 11/15/2013 2:26 PM, Ondřej Bílka wrote: On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: Also keep in mind that usually costs go up significantly if misalignment causes cache line splits (processor will fetch 2 lines). There are non-linear costs of filling up the store queue i

Thank you!

2013-11-15 Thread Mark Mitchell
Folks -- It's been a long time since I've posted to the GCC mailing list because (as is rather obvious) I haven't been directly involved in GCC development for quite some time. As of today, I'm no longer at Mentor Graphics (the company that acquired CodeSourcery), so I no longer even have a ma

Re: suspect code in fold-const.c

2013-11-15 Thread Kenneth Zadeck
This patch fixes a number of places where the mode bitsize had been used but the mode precision should have been used. The tree level is somewhat sloppy about this - some places use the mode precision and some use the mode bitsize. It seems that the mode precision is the proper choice sinc

Re: proposal to make SIZE_TYPE more flexible

2013-11-15 Thread DJ Delorie
> Everything handling __int128 would be updated to work with a > target-determined set of types instead. > > Preferably, the number of such keywords would be arbitrary (so I suppose > there would be a single RID_INTN for them) - that seems cleaner than the > system for address space keywords w

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Ondřej Bílka
On Fri, Nov 15, 2013 at 11:26:06PM +0100, Ondřej Bílka wrote: Minor correction: a mutt read replaced the set1.s file with one that I later used for the AVX2 variant. The correct file follows:
	.file "set1.c"
	.text
	.p2align 4,,15
	.globl set
	.type set, @function

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Ondřej Bílka
On Fri, Nov 15, 2013 at 09:17:14AM -0800, Hendrik Greving wrote: > Also keep in mind that usually costs go up significantly if > misalignment causes cache line splits (processor will fetch 2 lines). > There are non-linear costs of filling up the store queue in modern > out-of-order processors (x86)

Re: RFC: FLT_ROUNDS and fesetround

2013-11-15 Thread Joseph S. Myers
On Fri, 15 Nov 2013, H.J. Lu wrote: > Hi, > > float.h has > > /* Addition rounds to 0: zero, 1: nearest, 2: +inf, 3: -inf, -1: unknown. */ > /* ??? This is supposed to change with calls to fesetround in <fenv.h>. */ > #undef FLT_ROUNDS > #define FLT_ROUNDS 1 > > Clang introduces __builtin_flt_rounds

RFC: FLT_ROUNDS and fesetround

2013-11-15 Thread H.J. Lu
Hi, float.h has /* Addition rounds to 0: zero, 1: nearest, 2: +inf, 3: -inf, -1: unknown. */ /* ??? This is supposed to change with calls to fesetround in <fenv.h>. */ #undef FLT_ROUNDS #define FLT_ROUNDS 1 Clang introduces __builtin_flt_rounds and #define FLT_ROUNDS (__builtin_flt_rounds()) I am no
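For reference, the FLT_ROUNDS encoding quoted in the comment above can be related to the <fenv.h> rounding-mode macros as in the following minimal sketch. The helper name flt_rounds_from_fenv is hypothetical and is not part of GCC, Clang, or any libc; the sketch also assumes a target that defines all four FE_* rounding macros (x86 does).

```c
#include <fenv.h>

/* Hypothetical helper (not a GCC or libc function): translate an
   <fenv.h> rounding-mode macro into the FLT_ROUNDS encoding
   0: toward zero, 1: to nearest, 2: toward +inf, 3: toward -inf,
   -1: unknown.  Assumes the target defines all four FE_* macros.  */
int flt_rounds_from_fenv(int mode)
{
    switch (mode) {
    case FE_TOWARDZERO: return 0;
    case FE_TONEAREST:  return 1;
    case FE_UPWARD:     return 2;
    case FE_DOWNWARD:   return 3;
    default:            return -1;
    }
}
```

A dynamic FLT_ROUNDS could in principle be approximated as flt_rounds_from_fenv (fegetround ()); on glibc systems that call typically requires linking with -lm. This only illustrates the encoding and is not how any compiler implements __builtin_flt_rounds.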

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Xinliang David Li
I agree it is hard to tune the cost model to make it precise. The trunk compiler now supports better command line control for cost model selection. It seems to me that you can backport that change (as well as changes to control loop and slp vectorizer with different options) to your branch. With those, yo

Re: Frame pointer, bug or feature? (x86)

2013-11-15 Thread Andrew Pinski
On Fri, Nov 15, 2013 at 9:31 AM, Hendrik Greving wrote: > In the below test case, "CASE_A" actually uses a frame pointer, while > !CASE_A doesn't. I can't imagine this is a feature, this is a bug, > isn't it? Is there any reason the compiler couldn't know that > loop_blocks never needs a dynamic s

RE: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Bingfeng Mei
Thanks for the suggestion. It seems that parameter is only available in HEAD, not in 4.8. I will backport to 4.8. However, implementing a good cost model seems quite tricky to me. There are conflicting requirements for different processors. For us or many embedded processors, 4-time size increa

Frame pointer, bug or feature? (x86)

2013-11-15 Thread Hendrik Greving
In the below test case, "CASE_A" actually uses a frame pointer, while !CASE_A doesn't. I can't imagine this is a feature, this is a bug, isn't it? Is there any reason the compiler couldn't know that loop_blocks never needs a dynamic stack size? #include #include #define MY_DEFINE 100 #define CA

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Xinliang David Li
The right longer term fix is suggested by Richard. For now you can probably override the peel parameter for your target (in the target option_override function):
    maybe_set_param_value (PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT, 0, opts->x_param_values, opts_set->x_param_values);
David

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Hendrik Greving
Also keep in mind that usually costs go up significantly if misalignment causes cache line splits (processor will fetch 2 lines). There are non-linear costs of filling up the store queue in modern out-of-order processors (x86). Bottom line is that it's much better to peel e.g. for AVX2/AVX3 if the

Re: [RFC] Target compilation for offloading

2013-11-15 Thread Andrey Turetskiy
Let's suppose we are going to run the target gcc driver from lto-wrapper. How could a list of offload targets be passed there from the option parser? In my opinion, the simplest way to do it is to use an environment variable. Would you agree with such an approach? On Fri, Nov 8, 2013 at 6:34 PM, Jakub Jelinek

RE: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Bingfeng Mei
Hi, Richard, The speed difference is 154 cycles (with workaround) vs. 198 cycles. So loop peeling is also slower for our processors. By vectorization_cost, do you mean the TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST hook? In our case, it is easy to make the decision. But generally, if peeling loop is fas

Re: Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Richard Biener
On Fri, Nov 15, 2013 at 2:16 PM, Bingfeng Mei wrote: > Hi, > In loop vectorization, I found that vectorizer insists on loop peeling even > our target supports misaligned memory access. This results in much bigger > code size for a very simple loop. I defined > TARGET_VECTORIZE_SUPPORT_VECTOR_MI

Re: suspect code in fold-const.c

2013-11-15 Thread Kenneth Zadeck
On 11/15/2013 04:07 AM, Eric Botcazou wrote: this code from fold-const.c starts on line 13811. else if (TREE_INT_CST_HIGH (arg1) == signed_max_hi && TREE_INT_CST_LOW (arg1) == signed_max_lo && TYPE_UNSIGNED (arg1_type) /* We will flip the si

Vectorization: Loop peeling with misaligned support.

2013-11-15 Thread Bingfeng Mei
Hi, In loop vectorization, I found that the vectorizer insists on loop peeling even though our target supports misaligned memory access. This results in much bigger code size for a very simple loop. I defined TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT and also TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
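For readers unfamiliar with the transformation discussed in this thread, the following is a rough hand-written sketch of what peeling for alignment does to a simple store loop: a scalar prologue runs until the address is aligned, a main loop handles whole aligned chunks, and a scalar epilogue handles the remainder. The function name set_peeled and the 16-byte VEC_BYTES width are assumptions chosen for illustration; this is not GCC's generated code, and the inner loop merely stands in for a single aligned vector store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed vector width in bytes, for illustration only.  */
#define VEC_BYTES 16

/* Fill dst[0..n) with value.  Scalar iterations are peeled off until
   dst + i is VEC_BYTES-aligned.  Returns the number of peeled
   (prologue) iterations.  */
size_t set_peeled(int32_t *dst, size_t n, int32_t value)
{
    /* The chunk size must be a whole number of elements.  */
    assert(VEC_BYTES % sizeof(int32_t) == 0);

    const size_t lanes = VEC_BYTES / sizeof(int32_t);
    size_t i = 0;

    /* Prologue: scalar stores until the address is aligned
       (or the data runs out).  */
    while (i < n && ((uintptr_t)(dst + i) % VEC_BYTES) != 0)
        dst[i++] = value;
    size_t peeled = i;

    /* Main loop: whole aligned chunks.  A vectorizer would emit one
       aligned vector store per chunk here.  */
    for (; i + lanes <= n; i += lanes)
        for (size_t j = 0; j < lanes; j++)
            dst[i + j] = value;

    /* Epilogue: leftover elements that do not fill a chunk.  */
    for (; i < n; i++)
        dst[i] = value;

    return peeled;
}
```

The prologue and epilogue are the source of the code-size growth described above: one simple loop becomes three. On a target where misaligned vector stores are cheap, the prologue can be dropped and the main loop started at i = 0.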

Re: suspect code in fold-const.c

2013-11-15 Thread Eric Botcazou
> this code from fold-const.c starts on line 13811. > > else if (TREE_INT_CST_HIGH (arg1) == signed_max_hi > && TREE_INT_CST_LOW (arg1) == signed_max_lo > && TYPE_UNSIGNED (arg1_type) > /* We will flip the signedness of the comparison operator >